Structure tokens sharpen the feature vocabulary of protein language models

This paper is a preprint and has not been certified by peer review.

Authors

Steenwyk, J. L.

Abstract

Protein language models predict structure and function from amino acid sequences, but the internal computations that produce these predictions remain opaque. We applied sparse autoencoders to ESM-2 (650M parameters, sequence-only) and ESM-3 (1.4B parameters, multimodal) and found that 78% of learned features converge between the two architectures (permutation null: 14.2%, p < 0.001). These convergent features account for nearly all functional knowledge encoded by the models (functional site AUROC 0.925 versus 0.661 for architecture-unique features). Structure tokens in ESM-3 do not create a new feature vocabulary. Instead, the 15.2% of features most activated by structure tokens are more convergent with sequence-only ESM-2 than structure-invariant features are (r = 0.54 versus 0.45) and carry richer biological annotation (134 versus 29 enriched GO terms). Attention analysis identified a single geometric head (L0H7) as the bottleneck through which structural information enters the network; ablating this head alone changed secondary structure predictions at 40% of residues, while ablating random layer-0 heads altered fewer than 17%. Steering vectors, attribution patching, and sparse feature circuits confirmed that these features sit within the model's causal pathway. Two architecturally distinct models, trained on different objectives and input modalities, converge on a shared biological vocabulary, and explicit structure tokens sharpen that vocabulary rather than rewriting it.
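To make the idea of a permutation null for feature convergence concrete, the toy sketch below scores convergence between two sets of per-residue feature activations and compares it with a shuffled baseline. Everything in it is an assumption for illustration and not the preprint's procedure: the matching rule (best Pearson correlation at or above a threshold), the threshold of 0.5, the choice to shuffle residue positions, the function names, and the synthetic data.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def convergent_fraction(feats_a, feats_b, threshold=0.5):
    """Fraction of features in A whose best match in B has r >= threshold.

    The threshold and matching rule are hypothetical choices for this sketch.
    """
    hits = sum(
        1 for fa in feats_a
        if max(pearson(fa, fb) for fb in feats_b) >= threshold
    )
    return hits / len(feats_a)

def permutation_null(feats_a, feats_b, threshold=0.5, n_perm=100, seed=0):
    """Compare observed convergence with a null built by shuffling residue order in B."""
    rng = random.Random(seed)
    observed = convergent_fraction(feats_a, feats_b, threshold)
    n_pos = len(feats_b[0])
    null = []
    for _ in range(n_perm):
        perm = list(range(n_pos))
        rng.shuffle(perm)  # breaks the residue-level alignment between A and B
        shuffled = [[fb[i] for i in perm] for fb in feats_b]
        null.append(convergent_fraction(feats_a, shuffled, threshold))
    # Add-one correction keeps the empirical p-value strictly above zero.
    p = (1 + sum(v >= observed for v in null)) / (1 + n_perm)
    return observed, sum(null) / n_perm, p

# Synthetic activations: 8 features per "model" over 80 residue positions.
rng = random.Random(42)
feats_a = [[rng.gauss(0, 1) for _ in range(80)] for _ in range(8)]
# Six features of B are noisy copies of A's features; two are independent.
feats_b = [[v + 0.1 * rng.gauss(0, 1) for v in fa] for fa in feats_a[:6]]
feats_b += [[rng.gauss(0, 1) for _ in range(80)] for _ in range(2)]

observed, null_mean, p_value = permutation_null(feats_a, feats_b)
print(f"observed={observed:.3f} null_mean={null_mean:.3f} p={p_value:.4f}")
```

With these synthetic inputs the observed fraction should sit well above the shuffled baseline, which is the same kind of contrast the abstract reports (78% versus a 14.2% null), although the numbers here carry no biological meaning.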
