Recent advancements in MARS for computer driven-analysis in stable isotope labeling studies in untargeted metabolomics

73rd ASMS Conference on Mass Spectrometry. June 2025

Stefano Bonciarelli1; Paolo Tiberi2; Ismael Zamora1; Marta Piroddi3; Giovanna Ilaria Passeri2; Gabriele Cruciani4; Laura Goracci4

1Mass Spec Analytica, Sant Cugat del Vallés, Spain; 2Molecular Discovery Ltd, Borehamwood, United Kingdom; 3Molecular Horizon Srl, Bettona, Italy; 4University of Perugia, Perugia, Italy

Abstract

Introduction

High-resolution mass spectrometry, especially when coupled with liquid chromatography, has become routine in untargeted metabolomics. However, the huge amount of data generated by these approaches hampers interpretation. Complexity increases further when stable isotope labelling (SIL) studies are performed: unlabelled and labelled species coelute during chromatography, producing overlapping peaks that must be distinguished during the analysis in order to annotate labelled and unlabelled species within the same dataset. In addition, projecting the identified species onto metabolic maps is crucial for the biological interpretation of results. Several software solutions have been developed to assist data processing in untargeted metabolomics studies. However, only a few are equipped with tailored tools for SIL analysis.

Methods

Here, we describe improvements to the computational workflow for SIL analysis that was already in place in the previous version of MARS (https://mass-analytica.com/products/mars/), a software solution for untargeted LC-MS metabolomics analysis.

Preliminary data

In particular, MARS allows the generation of a database of partially and/or uniformly labelled species starting from a database of native metabolites. Native species can be easily edited and labelled with common isotopes (e.g. D, 13C, 15N, 34S, 37Cl). MS and MS/MS data of labelled species are automatically computed from those of native metabolites and stored in the database for identification purposes. In addition, 9 ready-to-use databases, including native and labelled metabolites reported in SIL studies using specific isotope tracers, are available for download. The isotopic pattern clustering algorithm was also recently optimized to take into consideration instrument resolution and isotopes of any element. Native and labelled compounds identified as different ionization adducts and at different levels (MS or MS/MS) can be inspected together or separately for easier investigation and for projection onto the metabolic maps included in the package.
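The idea of deriving labelled-species masses from a native metabolite can be illustrated with a minimal sketch. This is not MARS code: the function names, the dictionary, and the glucose example are assumptions made for illustration, and only the monoisotopic mass shift is modelled (no isotopic pattern or MS/MS fragments).

```python
# Hypothetical sketch: monoisotopic mass of a labelled species from its native mass.
# Mass differences (heavy isotope minus light isotope) in daltons.
ISOTOPE_SHIFT = {
    "D": 1.0062767,     # 2H minus 1H
    "13C": 1.0033548,   # 13C minus 12C
    "15N": 0.9970349,   # 15N minus 14N
    "34S": 1.9957959,   # 34S minus 32S
    "37Cl": 1.9970499,  # 37Cl minus 35Cl
}


def labelled_mass(native_mass, labels):
    """Return the monoisotopic mass after replacing atoms with heavy isotopes.

    labels maps an isotope name (a key of ISOTOPE_SHIFT) to the number of
    atoms replaced, e.g. {"13C": 6} for a uniformly 13C-labelled hexose.
    """
    return native_mass + sum(ISOTOPE_SHIFT[iso] * n for iso, n in labels.items())


def labelled_series(native_mass, isotope, max_atoms):
    """Masses of the partially labelled species with 0..max_atoms heavy atoms."""
    return [labelled_mass(native_mass, {isotope: n}) for n in range(max_atoms + 1)]


glucose = 180.06339  # monoisotopic mass of C6H12O6
u13c_glucose = labelled_mass(glucose, {"13C": 6})
isotopologues = labelled_series(glucose, "13C", 6)  # M+0 ... M+6
```

Each entry of `isotopologues` is one partially labelled species; the last entry is the uniformly labelled form.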

 


Automation in Metabolite Identification Workflows with Software-Assisted Processing of Mass Spectrometry Data

73rd ASMS Conference on Mass Spectrometry. June 2025

Savannah M Mason1; Ismael Zamora1; Luca Morettoni1; Paula Cifuentes1; Ramon Adalia1

1Mass Analytica, S.L., Sant Cugat del Vallés, Spain

Abstract

Introduction

The identification of metabolites using mass spectrometry is a crucial component of drug discovery and development. In recent years, the development of software-assisted approaches for metabolite identification has expedited the analysis of LCMS data. Despite these advances, challenges remain, particularly in the submission of data for processing, which can be tedious for the user. In this work, we develop automation to facilitate metabolite identification workflows. We demonstrate how automation may be used to parse information from a sample list and process data into a database. Following software-assisted peak finding and structural elucidation, we further automate filters to sort the peaks and generate a report, resulting in an expedited workflow for metabolite identification.

Methods

This work used automation to generate experiments and process mass spectrometry data in a database for application in metabolite identification workflows. A script was executed, which directed the system to monitor a designated folder for an instrument sample list and LCMS data files. The presence of these files triggered both the creation of experiments in a database and the processing of the data for metabolite identification. Filters were applied to simplify analysis of the peaks before user intervention. Data processing and analysis were performed using MassMetaSite 4.7 in the ONIRO 1.6.2 server with LCMS data from Agilent, Bruker, Sciex, Thermo, and Waters.
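The folder-monitoring step can be pictured as a simple polling loop. The sketch below is an assumption-laden illustration, not the script used in this work: the function names, the `.csv` sample-list suffix, the vendor data suffixes, and the `on_batch` callback are all hypothetical, and the real system hands the files to ONIRO and MassMetaSite rather than to a Python callback.

```python
import time
from pathlib import Path


def find_ready_batch(folder, list_suffix=".csv", data_suffixes=(".d", ".raw", ".wiff")):
    """Return (sample_list, data_files) once both kinds of file are present, else None.

    The suffixes are illustrative guesses at common sample-list and vendor formats.
    """
    entries = sorted(Path(folder).iterdir())
    sample_lists = [p for p in entries if p.suffix.lower() == list_suffix]
    data_files = [p for p in entries if p.suffix.lower() in data_suffixes]
    if sample_lists and data_files:
        return sample_lists[0], data_files
    return None


def watch(folder, on_batch, poll_seconds=30, max_polls=None):
    """Poll the folder and call on_batch(sample_list, data_files) when a batch appears.

    Returns True if a batch was handed over, False if max_polls ran out first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        batch = find_ready_batch(folder)
        if batch is not None:
            on_batch(*batch)  # hypothetical hook: create experiments, start processing
            return True
        time.sleep(poll_seconds)
        polls += 1
    return False
```

A production watcher would also need to confirm that acquisition has finished writing each file before triggering processing; that check is omitted here.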

Preliminary data

Following incubation and LCMS data acquisition, the automation described herein significantly reduced user involvement in the laborious tasks of this metabolite identification workflow. The sample list, which was automatically generated from a mass spectrometer, was parsed to obtain the information necessary to define the protocol and create experiments within the ONIRO database. Using MassMetaSite, the LCMS data was analyzed to find metabolite peaks and elucidate the structures.

Several processing tasks were also automated during data analysis to consolidate steps that are otherwise tedious for the user during data review. Peaks were automatically filtered and discarded based on various parameters, such as mass error, isotope similarity score, and negative control area ratio. In experiments with multiple timepoints, calculations were performed to provide kinetic analysis, such as AUC and the area comparison between a given incubation sample and the 0-minute sample. An AI-based peak selection model was applied, which provided suggestions for peak selection and removal. Peaks that the model flagged for removal with high probability were automatically hidden. Finally, the system generated metabolite identification reports after approval of the experiments.
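As a rough illustration of the filtering and kinetic calculations described above, the following sketch applies three threshold filters and computes a trapezoidal AUC. The `Peak` fields, the threshold values, and the function names are assumptions chosen for the example; the abstract does not state the actual cut-offs or formulas used.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    name: str
    mass_error_ppm: float   # signed mass error in ppm
    isotope_score: float    # isotope pattern similarity, assumed to lie in 0..1
    area: float             # peak area in the incubation sample
    control_area: float     # peak area in the negative control


def keep_peak(peak, max_ppm=5.0, min_isotope_score=0.8, min_control_ratio=3.0):
    """Return True if the peak passes all three illustrative filters."""
    if abs(peak.mass_error_ppm) > max_ppm:
        return False
    if peak.isotope_score < min_isotope_score:
        return False
    # A peak that is nearly as large in the negative control is likely background.
    if peak.control_area > 0 and peak.area / peak.control_area < min_control_ratio:
        return False
    return True


def auc_trapezoid(times, areas):
    """Area under the area-versus-time curve by the trapezoidal rule."""
    points = list(zip(times, areas))
    return sum((t1 - t0) * (a0 + a1) / 2.0 for (t0, a0), (t1, a1) in zip(points, points[1:]))


def ratio_to_t0(areas):
    """Area at each timepoint relative to the 0-minute sample (areas[0] must be > 0)."""
    return [a / areas[0] for a in areas]


peaks = [
    Peak("M1", 1.2, 0.95, 5.0e5, 1.0e3),
    Peak("M2", 12.0, 0.90, 4.0e5, 0.0),    # fails the mass error filter
    Peak("M3", 0.8, 0.92, 2.0e4, 1.5e4),   # fails the control ratio filter
]
kept = [p.name for p in peaks if keep_peak(p)]
```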

This workflow reduced the time required of a user by automating the parsing of a sample list, creation of experiments, processing of mass spectrometry data, and filtering of metabolite peaks. Upon review of the data by the user, the system automatically generated a metID report. This automation has the potential to significantly decrease the time between LCMS data acquisition and report generation, providing faster access to information to better understand the metabolism and design compounds with improved properties.

 


Software-aided approach designed to analyze and predict cleavage sites for peptides

73rd ASMS Conference on Mass Spectrometry. June 2025

Paula Cifuentes1,2; Ramon Adalia1,2; Ismael Zamora2; Lisa O’Callaghan3; Richard Gundersdorf3

1Lead Molecular Design, S.L., Sant Cugat Del Valles, Spain. 2Mass Analytica, S.L., Sant Cugat Del Valles, Spain. 3Merck & Co., Inc., West Point, PA, USA

Abstract

Introduction

The growing interest in using peptide molecules as therapeutic agents, driven by their high selectivity and efficacy, has become a significant trend in the pharmaceutical industry. However, oral administration remains a key challenge, as peptide drugs have low bioavailability and are highly susceptible to proteases that cleave peptide bonds. Identifying the sites of cleavage and characterizing the resulting metabolites (MetID) is essential to understanding how peptides are metabolized. In-silico tools have been developed to predict peptide cleavage sites. However, these tools face limitations, such as limited applicability to unnatural amino acids, inability to process cyclic peptides, and lack of customization to user-specific data. These challenges highlight the need for further advancements in this area.

Methods

The methodology defines a new workflow that uses LC-MS data from peptide metabolic experiments, as well as data from external sources, to predict potential cleavage sites in new candidate peptide drugs by employing a machine-learning model. The models use a transformer architecture with added mechanisms to encode graph structural data. Notably, these models eliminate the need for manual feature extraction, as they can predict peptide properties such as secondary structure and solvent accessibility. The methodology is designed to operate without structural constraints, allowing for linear and cyclic peptides, and including natural and unnatural amino acids. Users can train the models with their own experimental data. The methodology was validated using experimental MetID data from over 100 individual peptides.

Preliminary data

Our machine learning model demonstrated strong performance on an experimental dataset of 114 peptides incubated with a complex matrix of proteases, including cyclic structures with non-canonical amino acids. The model achieved a Hits@4 score of 2.74, indicating that, on average, 2.74 correct cleavage sites were identified within the top four predictions per peptide. Furthermore, the model achieved a precision of 91.30% for the top-ranked prediction, signifying that the predicted cleavage site was correct in 91.30% of cases. Additionally, the model achieved a mean average precision (MAP) of 84.56, highlighting its effectiveness in ranking cleavage sites accurately across the dataset. Moreover, this model can be updated with new experimental MetID user data to further improve its performance through a self-learning approach in which new expert-curated information is added to the model-building process without human intervention.
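The ranking metrics quoted above can be computed as in the following sketch. It follows the definitions given in the text for Hits@4 and top-ranked precision, and uses the standard information-retrieval definition of average precision for MAP; whether the authors used exactly this MAP formula is an assumption. The function names and the toy data are hypothetical and are not the study's results.

```python
def hits_at_k(ranked_sites, true_sites, k=4):
    """Number of correct cleavage sites among the top k predictions for one peptide."""
    return sum(1 for site in ranked_sites[:k] if site in true_sites)


def average_precision(ranked_sites, true_sites):
    """Mean of precision@rank over the ranks at which a correct site appears."""
    hits = 0
    total = 0.0
    for rank, site in enumerate(ranked_sites, start=1):
        if site in true_sites:
            hits += 1
            total += hits / rank
    return total / len(true_sites) if true_sites else 0.0


def evaluate(predictions, truths, k=4):
    """Average the per-peptide metrics over a dataset.

    predictions: one ranked list of predicted bond positions per peptide.
    truths: one set of experimentally observed cleavage positions per peptide.
    """
    n = len(predictions)
    pairs = list(zip(predictions, truths))
    return {
        "hits_at_k": sum(hits_at_k(r, t, k) for r, t in pairs) / n,
        "precision_at_1": sum(1 for r, t in pairs if r and r[0] in t) / n,
        "map": sum(average_precision(r, t) for r, t in pairs) / n,
    }


# Toy data for two peptides (illustrative only).
predictions = [[3, 7, 1, 5, 2], [4, 2, 6, 1]]
truths = [{3, 1, 2}, {2}]
scores = evaluate(predictions, truths)
```

Note that Hits@k defined this way is a count rather than a fraction, which is why a value such as 2.74 can exceed 1.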

In addition, models were developed and trained on publicly available data for a selected number of proteases involved in peptide drug degradation. These models were optimized using 5-fold cross-validation and hyperparameter tuning, achieving F1 scores exceeding 95% and precisions of 98%, demonstrating their high accuracy and reliability. When compared to existing cleavage site prediction models from the literature, our approach outperformed by achieving an F1 score 60% higher, without the need for feature extraction or dataset balancing techniques.
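For readers less familiar with the evaluation terms used above, the sketch below shows how an F1 score follows from confusion-matrix counts and how a 5-fold split partitions a dataset. It is a generic illustration with hypothetical function names and counts; it does not reproduce the authors' training pipeline, data, or hyperparameter search.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of k folds.

    Samples are assigned to folds round-robin; every sample is tested exactly once.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# Illustrative counts: 98 true positives, 2 false positives, 8 false negatives.
example_f1 = f1_score(tp=98, fp=2, fn=8)
splits = list(k_fold_indices(10, k=5))
```

In practice the data would usually be shuffled (and often stratified by class) before being split into folds; that step is left out to keep the sketch short.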

This tool has the potential to significantly accelerate the development of peptide-based drugs by efficiently identifying cleavage sites, enabling more effective modifications to compound structures that enhance their stability, while reducing the time and cost associated with experimental validation.

 
