Optimizing Strain Selection for Association Studies Under Hard Cost Constraints

This paper is a preprint and has not been certified by peer review.

Authors

Rau, C. D.; Bradley, P. H.

Abstract

Quantitative genetics methods can be particularly powerful in model organisms and other non-human populations, for which strain collections, such as recombinant inbred lines, are available for phenotyping. Natural diversity is also valuable in non-model systems that do not yet have reverse genetic tools. However, purchasing and phenotyping large collections can be cost-prohibitive, and acquisition costs may vary dramatically among strains or isolates. Investigators therefore need efficient strategies to maximize experimental power within a limited budget. In this study, we evaluate several approaches for selecting subsets of the total cohort that best maintain power in genome-wide association studies (GWAS). Some approaches consider only cost, others only genetic diversity, and some both simultaneously. Through simulation studies across a range of minor allele frequencies (MAF) and SNP effect sizes, we demonstrate that selecting for cost is most beneficial at low-to-moderate budget thresholds, while selecting for diversity is optimal in scenarios involving rare (MAF 5-10%) variants or higher total costs, or when accounting for additional costs per strain studied. We also evaluate these approaches on data from the Hybrid Mouse Diversity Panel (HMDP) and find that an approach that considers both cost and diversity is superior at recovering significant loci and maintaining statistical power under real-world conditions. For a given budget, this approach picks the strains that minimize the total genetic distance to the strains that were not selected. We term it "ThriftyMD" (for "Thrifty Minimum Distance"); it extends previous distance-based methods for picking a representative panel by explicitly adding a cost constraint.
Overall, our results highlight the trade-offs between cost, diversity, and power in GWAS cohort design, and present the ThriftyMD algorithm as a versatile and robust approach for optimizing study design in resource-limited settings.
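
To make the selection objective described in the abstract concrete, the following is a minimal sketch of budget-constrained, distance-based selection. It is not the authors' ThriftyMD implementation. It rests on two assumptions that the abstract does not confirm: that "total genetic distance to the strains that were not selected" means the sum, over unselected strains, of the distance to the nearest selected strain, and that a simple greedy search is an acceptable stand-in for whatever optimizer the paper uses. The names `coverage_cost` and `greedy_budgeted_selection`, the distance matrix, and the costs are all hypothetical.

```python
from typing import List, Sequence


def coverage_cost(selected: Sequence[int], dist: Sequence[Sequence[float]]) -> float:
    """Sum, over unselected strains, of the distance to the nearest selected strain.

    Assumption: this is one possible reading of the abstract's objective.
    """
    if not selected:
        return float("inf")  # nothing selected yet, so no strain is represented
    chosen = set(selected)
    return sum(
        min(dist[i][j] for j in chosen)
        for i in range(len(dist))
        if i not in chosen
    )


def greedy_budgeted_selection(
    dist: Sequence[Sequence[float]],
    costs: Sequence[float],
    budget: float,
) -> List[int]:
    """Greedy heuristic (an assumption, not the paper's method).

    Repeatedly adds the affordable strain that most reduces coverage_cost,
    and stops when no affordable strain improves the objective.
    """
    selected: List[int] = []
    spent = 0.0
    current = coverage_cost(selected, dist)
    while True:
        best, best_score = None, current
        for candidate in range(len(costs)):
            if candidate in selected or spent + costs[candidate] > budget:
                continue  # already chosen, or would exceed the hard budget
            score = coverage_cost(selected + [candidate], dist)
            if score < best_score:
                best, best_score = candidate, score
        if best is None:
            break
        selected.append(best)
        spent += costs[best]
        current = best_score
    return selected


# Toy data (hypothetical): two tight clusters, {0, 1} and {2, 3}, far apart.
toy_dist = [
    [0.0, 1.0, 10.0, 10.0],
    [1.0, 0.0, 10.0, 10.0],
    [10.0, 10.0, 0.0, 1.0],
    [10.0, 10.0, 1.0, 0.0],
]
toy_costs = [5.0, 1.0, 1.0, 5.0]  # strains 0 and 3 are expensive
panel = greedy_budgeted_selection(toy_dist, toy_costs, budget=2.0)
print(sorted(panel))
```

On this toy input the sketch picks one inexpensive strain from each cluster, which illustrates the trade-off the abstract describes: the budget rules out the costly strains, while the distance term keeps both clusters represented. A greedy search is not guaranteed to find the optimal panel.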
