A Complete Genome for the Common Marmoset
Hebbar, P.; Potapova, T. A.; Loucks, H.; Ray, K.; Rodrigues, M. F.; Ryabov, F.; Malukiewicz, J.; Yoo, D.; de Lima, L. G.; Haber, A.; Kumar, S.; Banerjee, S.; Borchers, M.; Garcia, G. H.; Gardner, J.; Hachem, S.; Heath, H. D.; Ha, S.-K.; Mastoras, M.; McNulty, B.; Munson, K. M.; Pal, K.; Park, J. E.; Plosch, S.; Roos, C.; Seligmann, W. E.; Shepelev, V.; Spruce, C.; Violich, I.; Walter, L.; Makova, K. D.; Thathiah, A.; Sukoff Rizzo, S. J.; Silva, A. C.; Carter, G. W.; Miga, K. H.; Eichler, E. E.; Conrad, D. F.; Gerton, J. L.; Alexandrov, I.; Paten, B.
Abstract
The common marmoset is a New World monkey (NWM) commonly used as a model organism to investigate questions in primate evolution and human disease, including Alzheimer's disease and other neurodegenerative diseases, as well as neuropsychiatric disorders. Here we present the first telomere-to-telomere (T2T) reference genome for the common marmoset, adding over 88 Mb of sequence and resolving challenging genomic regions. An additional near-T2T assembly from a second unrelated individual yields a total of four high-quality haplotypes for analysis. The improved contiguity and accuracy of these assemblies enable unprecedented insights into complex and rapidly evolving genomic regions such as centromeres, sex chromosomes, ribosomal DNA (rDNA) structure, and the major histocompatibility complex (MHC). We fully resolved all marmoset centromeres, uncovering dimeric alpha satellites with chromosomal specificity and stratified inactive layers documenting ancestral centromere turnover. We assembled six acrocentric autosomes with gene-poor, satellite-rich short arms and provide evidence that most of them can harbor rDNA and all of them share large pseudo-homologous regions (PHRs). The Y chromosome, but not the X chromosome, carries active rDNA and PHRs, and the rDNA copy number is sexually dimorphic. Chromosomes that share PHRs also share closely related centromeric satellite DNA, supporting a model of ongoing recombinational exchange between heterologous chromosomes facilitated by rDNA. We discovered multiple novel, marmoset-specific MHC genes that are predicted to protect against pathogens encountered in its environment. Leveraging this complete reference, we further identified over 500 transcribed genes with transcript models or expansions specific to the marmoset lineage. Together with additional long-read marmoset assemblies, these genomes were used to construct a marmoset pangenome, providing a robust reference framework for short-read mapping across diverse individuals.
This resource will improve the utility of the common marmoset as a biomedical model organism and fill key gaps in our understanding of primate evolution.