Science Cast

MSACLR: Contrastive Learning of Protein Conformations from MSAs

Xiaolin Cheng, October 4, 2025 8:21pm


This paper is a preprint and has not been certified by peer review.

bioRxiv, October 3, 2025 12:00am

Authors

Zhang, J.; Xing, E.; Cheng, X.

Abstract

We propose MSACLR (Multiple Sequence Alignment Contrastive Learning Representation), a two-stage contrastive learning framework that maps MSA space to conformational space. In Stage 1, embeddings are trained to discriminate structural folds across diverse proteins using only MSA information. In Stage 2, the embeddings are fine-tuned on subMSAs labeled by their associated predicted structural clusters, enabling discrimination of alternative conformations within the same protein. To enrich the training data, we introduce BLOSUM62-guided [1] augmentation, which expands the pool of subMSAs associated with each structural cluster label by introducing sequence-level diversity. Our experiments show that MSACLR embeddings achieve clearer fold-level separation than single-sequence baselines, while the fine-tuned embeddings capture conformational variation across scales, from local loop motions to domain motions and fold switching. MSACLR provides a foundation for efficient exploration of MSA space and enables sampling of conformational ensembles, bridging the gap between static structure prediction and dynamic protein behavior.
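
The abstract does not state which contrastive objective MSACLR uses. As a rough illustration of how cluster-labeled subMSA embeddings could be pulled together or pushed apart in Stage 2, the sketch below implements a generic supervised contrastive loss in plain Python. The function name `supervised_contrastive_loss`, the `temperature` parameter, and the toy two-dimensional embeddings are assumptions made for illustration; they are not taken from the paper.

```python
import math


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's code).

    embeddings: list of equal-length, nonzero float vectors (one per subMSA).
    labels: list of cluster labels, one per embedding.
    Returns the mean loss over anchors that have at least one positive.
    """
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, counted = 0.0, 0
    for i in range(n):
        # Positives share the anchor's cluster label; the anchor itself is excluded.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy example: two tight groups of embeddings (hypothetical values).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
print(loss_matching < loss_mismatched)
```

In this toy setup the loss is low when the labels agree with the geometric grouping of the embeddings and high when they do not, which is the behavior a Stage 2 style fine-tuning objective would need in order to separate alternative conformations of the same protein.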
