Multimodal gene embeddings for drug-target prediction and lineage reconstruction
Kidder, B. L.
Abstract
Understanding how gene function emerges across molecular, cellular, and pharmacologic contexts remains a central challenge in systems biology and drug discovery. Conventional computational models typically operate within a single modality, such as expression, ontology, or interaction networks, limiting their ability to capture the multidimensional nature of gene function. Here, we present NEWT (Neural Embeddings for Wide-spectrum Targeting), a multimodal deep learning framework that integrates heterogeneous biological knowledge into a unified and interpretable representation space. By combining functional annotations, large-scale co-expression data, pathway information, lineage programs, transcriptional regulons, and protein-protein interaction features through an attention-guided fusion architecture, NEWT learns cross-modal dependencies that reflect both global functional hierarchies and context-specific regulatory relationships. Applied to L1000 perturbational transcriptomes, NEWT achieves higher compound-target prediction accuracy than prior embedding models and reconstructs pharmacological networks that reveal mechanistic and repurposing opportunities. When extended to single-cell RNA-seq data, NEWT preserves developmental trajectories and enhances the resolution of lineage hierarchies. Together, these results demonstrate that multimodal gene embeddings can bridge pharmacogenomic and single-cell transcriptomic analyses within a common functional geometry, establishing a scalable foundation for integrative target discovery and systems-level modeling of cellular identity.