Predicting enzymatic cleavage sites in cyclic peptides with non-canonical amino acids using a Graphormer model trained on MetID user data
April 25, 2026
Abstract
Peptides are promising therapeutic agents because of their high selectivity and efficacy. However, their development is often limited by rapid enzymatic degradation, which results in short half-lives. Chemical modifications such as cyclization, incorporation of D- or non-natural amino acids, and terminal modifications can improve peptide stability, but applying them effectively requires first identifying the likely cleavage sites. Experimental determination of these sites is time-consuming and expensive, and may not fully capture the complexity of physiological environments. Computational approaches for cleavage site prediction exist, but most are limited: they apply only to linear peptides composed of standard amino acids, have been tested only in single-enzyme systems, and cannot incorporate user-generated metabolite identification (MetID) data, which restricts their utility for customized peptide design. To overcome these limitations, we present a workflow that integrates liquid chromatography–mass spectrometry (LC–MS) data from peptide metabolism studies with a Graphormer-based machine learning model to predict potential cleavage sites in peptides, including cyclic peptides and peptides containing modified amino acids. The approach was evaluated using publicly available MEROPS datasets and MetID datasets from a leading pharmaceutical company, which included cyclic peptides with both natural and modified amino acids incubated in complex enzymatic matrices. The results show that the model achieves high precision in its top-ranked cleavage site predictions, providing scientists with a practical tool to help guide peptide drug design.
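
The abstract does not state how precision over top-ranked predictions is computed. One common reading is precision at k: rank the candidate bonds of a peptide by predicted cleavage score and count what fraction of the top k were observed as cleaved in the MetID data. The sketch below illustrates that reading only; the function name `precision_at_k`, the bond indexing, and the example scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def precision_at_k(scores: Sequence[float], true_sites: Set[int], k: int) -> float:
    """Fraction of the k highest-scoring bonds that are observed cleavage sites.

    scores[i] is the predicted cleavage score for bond i (hypothetical indexing);
    true_sites holds the bond indices observed as cleaved experimentally.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank bond indices by descending score; sorted() is stable, so ties
    # keep their original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for i in top if i in true_sites)
    return hits / len(top)


# Made-up scores for a six-bond peptide, with bonds 1 and 4 observed as cleaved.
bond_scores = [0.05, 0.80, 0.10, 0.65, 0.30, 0.02]
observed = {1, 4}

for k in (1, 2, 3):
    print(k, precision_at_k(bond_scores, observed, k))
```

Under this definition a model is rewarded for placing true cleavage sites at the very top of the ranking, which matches the practical use case of deciding which few positions to modify first.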
