A complete human pancreatic cancer genome

This paper is a preprint and has not been certified by peer review.

Authors

Wagner, J.; Keskus, A. G.; Oshima, K. K.; Ranallo-Benavidez, T. R.; McDaniel, J.; Sikic, M.; Lin, D.; Paulin, L. F.; English, A. C.; Sedlazeck, F. J.; Munding, E. M.; Sanborn, J. Z.; Carroll, A.; Chang, P.-C.; Cook, D. E.; Shafin, K.; de Ligt, J.; Hassaine, R.; Cameron, D.; Catreux, S.; Lee, Y.; Murray, L.; Truong, S.; Brueffer, C.; Zimin, A. V.; Cross, E.; McGowan, M.; Vernich, M.; Liss, A. S.; Kocher, J.-P.; Stephens, Z.; Ahmad, T.; Bryant, A.; Dwarshuis, N.; He, H.-J.; He, Z.; Olson, N. D.; Thibaud-Nissen, F.; Antipov, D.; Koren, S.; Phillippy, A.; Musunuri, R. L.; Narzisi, G.; Jain, M.; We

Abstract

In cancer genome sequencing, reference gaps and germline variants obscure detection of small and large somatic variants and methylation in repetitive regions, impeding both tumor evolutionary inference and precision medicine. To identify somatic variants comprehensively, including complex rearrangements, we construct and curate near-complete, donor-specific, haplotype-resolved assemblies of the most recent common ancestor of a broadly-consented hypodiploid pancreatic cancer cell line and matched normal tissues. The tumor assembly completely recapitulates all 35 tumor chromosomes observed with karyotyping, with multiple translocation-induced hybrid chromosomes. The hybrid chromosomes contain four types of centromeres, including a putative functional dicentric chromosome and a fused centromere with a putative kinetochore derived from both normal kinetochores. We precisely resolve breakpoints of all copy number variants caused by translocations and inversions, most of which are complex, including telomeric, centromeric, and acrocentric translocations. By directly comparing near-complete tumor and normal assembled haplotypes, we discover many variants missed by typical methods, creating curated truncal somatic small variant, structural variant, and copy number variant benchmarks. We discover that most somatic LINE insertions originate from two rare hypomethylated non-reference germline LINE insertions. We resolve remarkably complex somatic events, including a translocation between two different haplotypes of chromosome 19 and the acrocentric short arm of chromosome 22 involving nested foldback inversions and 14 breakpoints. In regions without GRCh38 coordinates, we confirm 1,460 truncal somatic small variants, 46 insertions and 57 deletions >50 bp, four exceptionally large centromeric satellite tandem duplications of 61 kbp to 136 kbp, and 14 translocation and inversion breakpoints, mostly in centromeric satellite regions. 
Additionally, the polished assemblies reveal 5,824 somatic variants obscured by germline variants, mostly in homopolymers and tandem repeats, including a stop-gain SNV inside a 63 bp germline coding VNTR expansion in PRB4. Overall, this paired tumor and normal assembly uncovers >7,000 variants altering >1 Mbp of sequence in repetitive regions that have been hidden by reference gaps and germline variants.
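
The closing ">7,000 variants" figure can be checked against the individual counts quoted above. The sketch below simply adds them up. It assumes the listed categories do not overlap and that breakpoints count as variants, neither of which the abstract states, so the result is a plausibility check and not the authors' own tally. The dictionary keys and the function name are illustrative, not taken from the paper.

```python
# Counts quoted in the abstract for regions without GRCh38 coordinates.
# Key names are illustrative labels, not identifiers from the paper.
reference_gap_counts = {
    "truncal_somatic_small_variants": 1460,
    "insertions_over_50bp": 46,
    "deletions_over_50bp": 57,
    "centromeric_satellite_tandem_duplications": 4,
    "translocation_and_inversion_breakpoints": 14,
}

# Somatic variants the abstract describes as obscured by germline variants.
germline_obscured_count = 5824


def total_hidden_variants(gap_counts, germline_obscured):
    """Sum the per-category counts, assuming the categories are disjoint."""
    return sum(gap_counts.values()) + germline_obscured


gap_total = sum(reference_gap_counts.values())
overall = total_hidden_variants(reference_gap_counts, germline_obscured_count)

print(f"Variants in regions without GRCh38 coordinates: {gap_total}")
print(f"Variants obscured by germline variants: {germline_obscured_count}")
print(f"Combined total under the disjointness assumption: {overall}")
```

Under these assumptions the combined count exceeds 7,000, which is consistent with the summary sentence.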
