Adapting ProteinMPNN for antibody design without retraining
del Alamo, D.; Frick, R.; Truan, D.; Karpiak, J. D.
Abstract
The neural network ProteinMPNN designs protein sequences capable of folding into predefined tertiary structures and quaternary assemblies. It has become widely used owing to its high success rates on synthetic topologies rich in secondary structure. Here, we show that its performance degrades on the complementarity-determining regions (CDRs) of antibodies, with designs frequently failing to resemble native antibodies or to refold into the designed conformations. We also show that this underperformance can be rescued by ensembling its predictions with those of the antibody-specific protein language model AbLang, which designs using only sequence information learned from large databases of antibody sequences. Finally, we tested 96 trastuzumab variants with CDRH3 loops redesigned by the ensembled ProteinMPNN+AbLang method and found that 36 of them bound HER2, compared with 3 of 96 designs generated by ProteinMPNN alone. These data highlight the value of incorporating additional restraints derived from language models during structure-based sequence design of antibodies.
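
The abstract does not say how the two models' predictions are combined, so the following is only a minimal sketch of one plausible form of ensembling: a weighted sum of per-position log-probabilities over the 20 amino acids, renormalized and decoded greedily. The function names, the `weight` parameter, and the toy logits are all hypothetical; no ProteinMPNN or AbLang API is called, and the real method may differ.

```python
import math

# Assumed amino-acid ordering; real models define their own alphabets.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Convert raw scores to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def ensemble_position(structure_logits, language_logits, weight=0.5):
    """Blend two models' log-probabilities for a single residue position.

    weight=0.0 uses only the structure-based scores, weight=1.0 only the
    language-model scores (hypothetical parameter, not from the source).
    """
    s = log_softmax(structure_logits)
    l = log_softmax(language_logits)
    combined = [(1.0 - weight) * a + weight * b for a, b in zip(s, l)]
    return log_softmax(combined)  # renormalize so probabilities sum to 1


def design_sequence(structure_logits_per_pos, language_logits_per_pos, weight=0.5):
    """Greedy decoding: pick the highest-scoring residue at every position."""
    residues = []
    for s, l in zip(structure_logits_per_pos, language_logits_per_pos):
        logp = ensemble_position(s, l, weight)
        best = max(range(len(logp)), key=logp.__getitem__)
        residues.append(AMINO_ACIDS[best])
    return "".join(residues)


if __name__ == "__main__":
    # Toy scores for one position: the two models disagree on the residue.
    structure = [0.0] * 20
    structure[0] = 5.0   # structure-based scores favor A
    language = [0.0] * 20
    language[1] = 5.0    # language-model scores favor C
    for w in (0.0, 1.0):
        print(w, design_sequence([structure], [language], weight=w))
```

In practice, designs are often sampled from the blended distribution at some temperature instead of decoded greedily, and the relative weight of the two models would need to be tuned.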