Prediction of peptide cleavage sites using protein language models and graph neural networks

October 30, 2025

Paula Cifuentes, Ramon Adàlia, Ismael Zamora

Abstract

Interest in peptides as therapeutic agents is growing across the pharmaceutical industry, driven by their high selectivity and efficacy. However, oral administration remains challenging because of their low bioavailability and their vulnerability to proteases, which cleave peptide bonds. To support peptide drug development, in silico tools based on machine learning algorithms have been developed for cleavage site prediction. These tools rely on manual feature extraction, which limits their ability to capture complex peptide structures, especially cyclic peptides and peptides containing non-natural amino acids. This study presents two novel in silico approaches for cleavage site prediction. The first approach uses a protein language model, ESM-2, fine-tuned to leverage its learned peptide structure embeddings for accurate cleavage site prediction, eliminating the need for manual feature engineering. The second approach employs graph neural networks and represents each peptide as a hierarchical graph at the atom and amino acid levels, which allows it to handle cyclic peptides, including those containing non-natural amino acids. The applicability of this second approach is shown through a case study on a set of four cyclic peptides containing non-natural amino acids, in which in silico predictions are compared with experimental data.
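To make the hierarchical graph idea concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a residue-level graph whose edges are peptide bonds, with each residue node optionally owning a list of atom labels; the names `PeptideGraph`, `build_peptide_graph`, and `candidate_cleavage_sites` are hypothetical, and the example residues and atom labels are illustrative only. The point it shows is that a head-to-tail cyclic peptide needs one extra ring-closing bond, and that each bond is a candidate site a model would score.

```python
from dataclasses import dataclass, field


@dataclass
class PeptideGraph:
    """Two-level graph: residue nodes, each optionally owning atom labels."""
    residues: list                      # residue-level node labels
    bonds: list                         # residue-level edges (peptide bonds)
    atoms: dict = field(default_factory=dict)  # residue index -> atom labels


def build_peptide_graph(residues, cyclic=False, atoms_by_residue=None):
    """Connect consecutive residues; close the ring for cyclic peptides.

    Assumption: cyclization is head-to-tail, so the ring-closing bond joins
    the last residue to the first. Other cyclization types are not modeled.
    """
    n = len(residues)
    bonds = [(i, i + 1) for i in range(n - 1)]
    if cyclic and n > 2:
        bonds.append((n - 1, 0))
    atoms = {}
    if atoms_by_residue:
        # Atom-level nodes are grouped under the residue node they belong to.
        atoms = {i: list(atoms_by_residue.get(name, []))
                 for i, name in enumerate(residues)}
    return PeptideGraph(list(residues), bonds, atoms)


def candidate_cleavage_sites(graph):
    """Return each peptide bond as a (residue, residue) pair to be scored."""
    return [(graph.residues[i], graph.residues[j]) for i, j in graph.bonds]


# Illustrative cyclic peptide with non-natural residues (Aib, D-Phe, Sar).
example = build_peptide_graph(
    ["Ala", "Aib", "D-Phe", "Sar"],
    cyclic=True,
    atoms_by_residue={"Aib": ["N", "CA", "C", "O", "CB1", "CB2"]},
)
sites = candidate_cleavage_sites(example)
```

In this sketch a non-natural residue is simply another node label with its own atom list, so no hand-crafted per-residue feature table is needed to represent it. A graph neural network would then assign a cleavage score to each entry of `sites`; that scoring step is outside the scope of this example.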
