EnhancerDetector: Enhancer Discovery from Human to Fly via Interpretable Deep Learning
Solis, L. M.; Sterling-Lentsch, G.; Halfon, M. S.; Girgis, H. Z.
Abstract
Enhancers are essential non-coding DNA elements that regulate gene expression, yet their accurate identification remains a major challenge. We introduce EnhancerDetector, a convolutional neural network-based framework for cross-species enhancer prediction that combines high accuracy with biological interpretability. Trained on human data, EnhancerDetector achieves strong performance across human, mouse, and fly datasets, consistently outperforming existing methods in precision and F1 score. It generalizes effectively to datasets generated using diverse experimental assays. An ensemble strategy further improves prediction reliability by reducing false positives (critical for genome-wide applications). EnhancerDetector supports fine-tuning on new species and retains strong performance even when adapted with as few as 20,000 enhancer sequences, making it ideal for newly sequenced genomes with limited experimental data. For interpretability and visualization, we apply class activation maps to identify sequence regions predictive of enhancer activity. Experimental validation in transgenic flies confirms the predictive power of EnhancerDetector: five of six tested candidates drove reporter expression, and four exhibited expression patterns supported by prior literature. These analyses highlight distinct sequence and contextual features that confer what we term enhancerness: enhancer sequences possess a characteristic, identifiable signature. Together, these findings position EnhancerDetector as a practical, accurate, and interpretable framework for enhancer discovery across diverse species.
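To illustrate how a class activation map can point to sequence regions that drive a prediction, the following is a minimal sketch of the classic formulation for a one-dimensional convolutional network. It assumes the last convolutional layer is followed by global average pooling and a single dense output, which the abstract does not state; the function name `class_activation_map` and the toy numbers are hypothetical and are not taken from EnhancerDetector.

```python
def class_activation_map(feature_maps, class_weights):
    """Compute a 1D class activation map.

    feature_maps: list of channels, each a list of activations per sequence
        position (output of the last convolutional layer; assumed layout).
    class_weights: one weight per channel, taken from the dense layer that
        follows global average pooling (assumed architecture).
    Returns one score per sequence position.
    """
    if len(feature_maps) != len(class_weights):
        raise ValueError("need exactly one class weight per channel")
    n_positions = len(feature_maps[0])
    cam = [0.0] * n_positions
    # Weighted sum over channels: cam[pos] = sum_k w_k * A_k[pos]
    for channel, weight in zip(feature_maps, class_weights):
        if len(channel) != n_positions:
            raise ValueError("all channels must have the same length")
        for pos, activation in enumerate(channel):
            cam[pos] += weight * activation
    return cam


# Toy example: 2 channels, 4 sequence positions (made-up values).
feature_maps = [
    [0.0, 1.0, 2.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
]
class_weights = [0.5, -1.0]

cam = class_activation_map(feature_maps, class_weights)
# Position with the highest contribution to the positive class.
top_position = max(range(len(cam)), key=lambda i: cam[i])
print(cam, top_position)
```

Under these assumptions, the mean of the map over positions equals the network's pre-bias output score, so positions with high values are those that push the prediction toward the enhancer class.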