A Comprehensive Benchmarking of Spatial Deconvolution and Domain Detection Methods across Diverse Tissues and Spatial Transcriptomic Technologies
Shree, A.; V, A.; Kumar, T.; Zafar, H.
Abstract
Spatial transcriptomic technologies enable high-resolution characterization of gene expression patterns and reconstruction of cellular architecture within tissue contexts. Two key computational problems have emerged for analyzing these datasets: spatial deconvolution, for disentangling cell-type compositions at spatial locations, and spatial domain detection, for identifying spatially coherent regions within a tissue section. Although numerous methods have been developed for each task, a comprehensive and unified benchmarking study spanning diverse tissue types, spatial resolutions, and technological platforms remains lacking, hindering informed method selection by end users and impeding future methodological advancements. Here, we present spDDB (https://github.com/Zafar-Lab/spDDB), a comprehensive benchmarking framework for spatial deconvolution and domain detection methods across a large and diverse collection of datasets spanning multiple tissues, technologies, and biological conditions. We evaluated 21 deconvolution methods, including seven recently developed approaches, across 37 datasets curated from brain, cancer, and organ tissues encompassing four distinct technologies. To enable rigorous evaluation, we introduced SynthST, a deep graph attention autoencoder-based simulator that generates realistic spatial cell-type distributions from spatial transcriptomic data, and employed a suite of spatial bivariate metrics, including a novel bivariate Geary's C metric, alongside rare cell-type and cell-shape characterization metrics, for multidimensional performance assessment. While Cell2location, RCTD, and SONAR emerged as top-performing methods for spatial deconvolution across tissue types, deconvolution performance varied substantially based on tissue architecture, spatial technology, dataset scale, and cell-type diversity.
For domain detection, we benchmarked 18 methods across 36 datasets spanning six spatial technologies, identifying PROST, BASS, and SpaceFlow as the leading approaches, while revealing notable limitations of existing methods in handling large-scale datasets. Finally, we provide practical guidelines to assist end users in selecting optimal methods for both tasks across diverse experimental settings.