CANCAN: high-resolution copy number and mutation heterogeneity analysis of DNA sequence data for clinical applications
Pladsen, A. V.; Vodak, D.; Zhao, S.; Nakken, S.; Nebdal, D.; Lien, T.; Danielsen, B. K.; Wang, C.; Kildal, W.; Hjortland, G. O.; Hovig, E.; Russnes, H. G.; Lingjaerde, O. C.
Abstract
High-throughput DNA sequencing is central to precision oncology, yet robust and interpretable methods for integrated analysis of copy number alterations and somatic variants across sequencing platforms remain limited. We present CANCAN (Copy number integrative ANalysis in CANcer), a platform-agnostic computational framework for high-resolution analysis of allele-specific copy number and variant data. CANCAN integrates novel normalization and segmentation strategies and enables inference of tumor purity, ploidy, subclonality and mutation multiplicity, while providing statistical confidence estimates and transparent evaluation of alternative solutions. Benchmarking across whole-genome, whole-exome and targeted sequencing datasets from TCGA and the IMPRESS-Norway study demonstrates high concordance with established methods, with particularly strong performance on targeted sequencing data. CANCAN accurately estimates global genomic features, including purity and ploidy, even at reduced sequencing coverage, and shows comparable or improved agreement relative to existing tools. In addition, it provides detailed visualization of the genomic context of clinically relevant biomarkers, supporting diagnostic interpretation. CANCAN constitutes a reproducible and interpretable approach for integrated genomic analysis, addressing key methodological and practical challenges in clinical cancer genomics.