MethylBench: A comprehensive benchmark of DNA methylation profiling methods across diverse sequencing platforms
Laufer, L.; Gasparoni, G.; Hentrich, T.; Sofan, L.; Admard, J.; Buena-Atienza, E.; Pogoda, M.; Ossowski, S.; Casadei, N.; Riess, O.; Haack, T.; Buchert, R.; Schulze-Hentrich, J.
Abstract
Background: DNA methylation can be profiled using multiple technologies that vary in resolution, coverage, and cost, yet systematic benchmarks across these methods remain scarce.
Methods: We compared six widely used technologies: Illumina EPIC array, TWIST, Whole-Genome Enzymatic Conversion (WGEC), Reduced Representation Bisulfite Sequencing (RRBS), and long-read genome sequencing (LR-GS) with Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). We used Genome in a Bottle (GIAB) reference samples and ten samples derived from blood and fibroblast cultures of five individuals. We assessed CpG coverage, consistency of differentially methylated cytosine (DMC) detection, and genomic annotation, with particular attention to overlapping signals across assays.
Results: Despite major differences in assay design, all technologies consistently identified DMCs enriched in promoter and intronic regions, highlighting these loci as robust hotspots of epigenetic variability. Annotation redundancy strongly influenced initial interpretations: CpG island-related categories largely disappeared once annotations were collapsed to unique features. Sequencing-based methods (WGEC, TWIST, ONT) achieved the most comprehensive coverage, whereas EPIC arrays reproducibly captured promoter-associated differences despite their limited scope. ONT sequencing enabled direct, long-read-based methylation profiling with phasing capability and showed strong concordance with short-read sequencing methods after coverage filtering, but required higher and more uniform coverage to achieve reproducible CpG-level agreement. PacBio methylation profiles showed a coverage-dependent discrepancy: cross-platform concordance plateaued in GIAB samples despite high mean coverage, indicating residual technology-specific biases beyond simple coverage effects.
Conclusions: Cross-platform benchmarking yields coherent biological insights when coverage and annotation redundancies are carefully addressed. Practically, EPIC arrays remain valuable for promoter-focused cohort studies, WGEC and TWIST enable genome-wide discovery, and ONT provides unique phasing and multimodal potential. This comparative framework can guide method selection and support more robust interpretation of DNA methylation data across diverse platforms.
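As an illustration of what CpG-level agreement "after coverage filtering" can mean in practice, the following minimal sketch filters two platforms' per-CpG calls by read depth and correlates methylation levels at the shared sites. It is not the pipeline used in this study: the function names, the input layout (site mapped to a methylation fraction and a read depth), the depth threshold of 10, and the choice of Pearson correlation as the concordance measure are all assumptions made for the example.

```python
from math import sqrt


def filter_by_coverage(calls, min_cov):
    """Keep CpG sites with read depth >= min_cov.

    `calls` maps (chrom, pos) -> (methylation_fraction, read_depth).
    Returns a dict mapping (chrom, pos) -> methylation_fraction.
    """
    return {site: beta for site, (beta, cov) in calls.items() if cov >= min_cov}


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def concordance(calls_a, calls_b, min_cov=10):
    """Return (number of shared sites, Pearson r) after coverage filtering.

    Only sites passing the depth threshold on BOTH platforms are compared.
    The default threshold is an illustrative value, not one from the study.
    """
    a = filter_by_coverage(calls_a, min_cov)
    b = filter_by_coverage(calls_b, min_cov)
    shared = sorted(a.keys() & b.keys())
    if len(shared) < 2:
        raise ValueError("need at least two shared CpG sites")
    r = pearson([a[s] for s in shared], [b[s] for s in shared])
    return len(shared), r


if __name__ == "__main__":
    # Toy values for demonstration only; not data from the benchmark.
    platform_a = {
        ("chr1", 100): (0.9, 30),
        ("chr1", 200): (0.1, 25),
        ("chr1", 300): (0.5, 3),   # dropped: depth below threshold
        ("chr2", 50): (0.6, 40),
    }
    platform_b = {
        ("chr1", 100): (0.8, 20),
        ("chr1", 200): (0.2, 15),
        ("chr1", 300): (0.4, 50),
        ("chr2", 50): (0.5, 12),
    }
    n_sites, r = concordance(platform_a, platform_b, min_cov=10)
    print(f"shared sites: {n_sites}, Pearson r: {r:.3f}")
```

Raising `min_cov` shrinks the set of shared sites while typically tightening agreement, which is the trade-off behind the coverage dependence described for ONT and PacBio above.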