NanoLabel: A fast and accurate real-time nanopore signal classifier

This paper is a preprint and has not been certified by peer review.

Authors

Mahajan, D.; Jain, C.; Kashyap, N.

Abstract

Oxford Nanopore Technologies' adaptive sampling capability promises to reduce sequencing cost and turnaround time. At its core, adaptive sampling is a real-time classification problem: distinguishing reads that originate from regions of interest from those that do not. Direct signal-based classification approaches bypass the computational bottleneck of basecalling and can eliminate the need for powerful GPUs. However, operating directly on noisy raw signals remains challenging in real-time settings, where classification decisions must be made quickly. In this work, we propose NanoLabel, a new method for real-time classification of nanopore signals. We build NanoLabel on top of the signal-based read-mapping tool RawHash2. We accelerate the classification workflow by mapping reads using only the target regions as the reference. To further improve accuracy, we train a lightweight classifier on mapping-derived features. We also introduce a data augmentation strategy to construct sufficiently large and class-balanced training datasets. We evaluate NanoLabel using publicly available real sequencing datasets from three human genomes (HG001, HG002, and HG005), with a cancer gene panel as the assumed target. Compared to directly mapping reads with RawHash2, NanoLabel achieves an 80x improvement in classification time and an improvement of 0.10-0.25 in F1 score.
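
For readers unfamiliar with the reported metric, the F1 score is the harmonic mean of precision and recall on the positive class, so it ranges from 0 to 1 and a gain of 0.10-0.25 is an absolute change on that scale. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function name and the "target" / "off" labels are illustrative assumptions.

```python
def f1_score(true_labels, predicted_labels, positive="target"):
    """Standard F1 score for a binary read-classification task.

    The label names ("target", "off") are hypothetical placeholders,
    not identifiers taken from NanoLabel or RawHash2.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive and truth == positive:
            tp += 1  # read correctly classified as on-target
        elif pred == positive:
            fp += 1  # off-target read wrongly classified as on-target
        elif truth == positive:
            fn += 1  # on-target read that was missed
    if tp == 0:
        # No true positives: precision and/or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 3 on-target reads and 3 off-target reads.
truth = ["target", "target", "target", "off", "off", "off"]
calls = ["target", "target", "off", "target", "off", "off"]
score = f1_score(truth, calls)  # precision = recall = 2/3, so F1 = 2/3
```

In an adaptive sampling setting, false positives waste pore time on off-target reads while false negatives eject reads of interest, which is why a metric that balances both is a natural choice.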
