A reference panel for linkage disequilibrium and genotype imputation using whole-genome sequencing data from 2,680 participants across India

This paper is a preprint and has not been certified by peer review.

Authors

Li, Z.; Zhao, W.; Zhou, X.; Leung, Y. Y.; Schellenberg, G. D.; Wang, L.-S.; Dey, S.; Lee, J.; Smith, J. A.; Dey, A. B.; Kardia, S.

Abstract

India is the most populous country globally, yet genetic studies involving Indian individuals remain limited. The Indian population is composed of many founder groups and has a mixed genetic ancestry, including an ancestral component not observed anywhere outside of India. This presents a unique opportunity to uncover novel disease variants and develop more tailored medical interventions. To facilitate genetic research in India, a crucial first step is to create a foundational resource that serves as a benchmark for future population studies and methods development. To this end, we have constructed the largest and most nationally representative linkage disequilibrium and genotype imputation reference panels in India to date, using high-coverage whole-genome sequencing data of 2,680 Indian participants from the Longitudinal Aging Study in India-Harmonized Diagnostic Assessment of Dementia (LASI-DAD). As an LD reference panel, LASI-DAD includes 69.5 million variants, representing 170% and 213% increases relative to the 1000 Genomes Project (1000G) and TOP-LD panels, respectively. Besides serving as an LD lookup panel, LASI-DAD facilitates various statistical analyses that rely on precise LD estimates. In a polygenic risk score (PRS) analysis, LASI-DAD improved the predictive performance of PRS by 2.1% to 35.1% across traits and studies. As an imputation reference panel, LASI-DAD improved the imputation accuracy by 3% to 101% (mean = 38%) compared to the TOPMed panel (Version R3) and by 3% to 73% (mean = 27%) compared to the Genome Asia Pilot (GAsP) panel across different minor allele frequencies. The LASI-DAD reference panel is publicly available to benefit future studies.
