Benchmarking Short-Read ITS2 and Full-Length ITS Sequencing Reveals Pipeline-Dependent Biases in Indoor Fungal Community Profiling
Dong, M.; Blackwood, D.; Lott, M. E. J.; Castro, S. P.; Larkin, X.; Clerkin, T.; Hemric, H.; Nash, J.; Kim, Y. J.; Arnold, J.; David, L. A.; Vilgalys, R.; Fodor, A. A.; Noble, R. T.
Abstract

Short-read amplicon sequencing is widely used for fungal surveys but can limit taxonomic resolution. Long-read sequencing enables recovery of the full internal transcribed spacer (ITS) region and may improve ecological and taxonomic inference. Here, we conducted a paired comparison of Illumina ITS2 and PacBio HiFi full-length ITS sequencing using identical DNA extracts from built-environment air and surface samples (n = 68) collected across homes, a dormitory, and laboratories. Both datasets were taxonomically assigned using the same algorithm and reference database. We performed paired statistics, in-silico ITS2 trimming of long-read sequences, and cross-platform mapping at multiple identity thresholds. Full-length ITS provided higher taxonomic resolution than ITS2, assigning a greater fraction of ASVs at the family (98% vs. 88%) and species (42% vs. 32%) ranks (paired Wilcoxon q = 0.002). Alpha-diversity comparisons showed similar Shannon diversity across pipelines, whereas richness metrics were consistently higher for full-length ITS. Beta-diversity analyses indicated broadly comparable community-level patterns, although full-length ITS revealed stronger sample-type- and location-associated structure (PERMANOVA R² = 0.06, p = 0.0001). In-silico ITS2 trimming reduced these differences, indicating that amplicon length is a major contributor to enhanced taxonomic resolution and ecological inference. Cross-platform mapping further showed extensive one-to-many relationships between ITS2 and full-length ITS ASVs, consistent with increased sequence resolution in long-read data.

Together, these results show that ITS2 sequencing provides robust community-level profiling, while full-length ITS enables improved richness estimates and finer ecological and taxonomic resolution. This paired, bias-aware framework provides a practical template for selecting fungal amplicon sequencing strategies in built-environment mycobiome studies.
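The abstract reports similar Shannon diversity but higher richness for full-length ITS. As a minimal sketch of how these two alpha-diversity metrics can diverge, the Python below computes observed richness and the Shannon index for two count vectors. The counts are invented toy values, not data from the study, and the function names `observed_richness` and `shannon_index` are illustrative rather than taken from the authors' pipeline. The example assumes one possible scenario, in which the extra ASVs are rare.

```python
import math


def observed_richness(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over ASVs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts per ASV for one sample (toy values, not study data).
# The second vector adds two rare ASVs, as might happen when a longer amplicon
# separates closely related sequences.
its2_counts = [500, 300, 200]
full_its_counts = [500, 300, 190, 5, 5]

for label, counts in (("ITS2", its2_counts), ("full-length ITS", full_its_counts)):
    print(f"{label}: richness={observed_richness(counts)}, "
          f"Shannon={shannon_index(counts):.3f}")
```

Because the Shannon index weights each ASV by its relative abundance, rare ASVs add to richness one-for-one but contribute little to H'. This is one way richness can rise while Shannon diversity stays close; the abstract itself does not attribute the pattern to a specific mechanism.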