Adding iPSC donor lines does not adequately control for genetic heterogeneity

This paper is a preprint and has not been certified by peer review.

Authors

Shvetcov, A.; Thomson, S.; Finney, C. A.

Abstract

Human induced pluripotent stem cell (iPSC)-based disease modelling studies are widely expected to include three to five independent donor lines to control for the contribution of donor genetic background to phenotypic variance. This convention has been formalized into major guidelines, yet no power analysis has evaluated whether these sample sizes can detect, estimate, or control for donor-level genetic effects. Here, we provide that evaluation. Using Monte Carlo simulation, closed-form confidence intervals, population genetics, and empirical resampling of transcriptomic data from iPSC lines, we show that studies with three to five donors cannot reliably detect donor-level variance, cannot estimate its magnitude with useful precision, and cannot determine whether a treatment effect generalizes across genetic backgrounds. The sample sizes required to reliably detect, estimate, or control for donor-level variance exceed 20 donors and, for many phenotypes, exceed 50, well beyond what any standard disease modelling experiment can deliver. Adding two or three donor lines to a study does not meaningfully increase statistical power, narrow confidence intervals, or establish whether a treatment effect generalizes across genetic backgrounds. The inability to control for genetic background is not a limitation of individual study design but a structural property of iPSC-based modelling. We propose that the field adopt isogenic controls for variant-specific questions and orthogonal validation against clinical datasets for generalizability, rather than treating donor number as a proxy for rigour.
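The precision argument can be illustrated with a minimal Monte Carlo sketch. This is not the authors' simulation. It assumes an idealized setting in which each donor's effect is drawn from a normal distribution and observed without any within-donor or technical noise, which is more favourable than a real experiment. The function name `variance_estimate_spread` and its parameters (`sigma_donor`, `n_sims`, `seed`) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def variance_estimate_spread(n_donors, sigma_donor=1.0, n_sims=4000, seed=0):
    """Simulate how widely the estimated donor-level variance scatters.

    Assumption (illustrative only): donor effects are i.i.d. normal with
    standard deviation `sigma_donor` and are measured without noise.
    Returns the 2.5th and 97.5th percentiles of the simulated variance
    estimates and the ratio between them.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        # Draw one hypothetical study: n_donors independent donor effects.
        donor_effects = [rng.gauss(0.0, sigma_donor) for _ in range(n_donors)]
        # Unbiased sample variance across donors (n - 1 denominator).
        estimates.append(statistics.variance(donor_effects))
    estimates.sort()
    lo = estimates[int(0.025 * n_sims)]
    hi = estimates[int(0.975 * n_sims) - 1]
    return lo, hi, hi / lo


if __name__ == "__main__":
    for n in (3, 5, 20, 50):
        lo, hi, ratio = variance_estimate_spread(n)
        print(f"donors={n:>2}  central 95% of estimates: "
              f"{lo:.3f} to {hi:.3f}  (upper/lower ratio {ratio:.1f})")
```

Even in this noise-free setting, the range of plausible variance estimates from three or five donors is far wider than from 20 or 50, consistent with the abstract's claim that small donor numbers cannot estimate donor-level variance with useful precision. Adding within-donor noise to the sketch would only widen the ranges further.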
