Tags: Ataxia telangiectasia, machine learning, mass spectrometry, metabolic pathways, untargeted metabolomics
Abstract:
Metabolomics has emerged as a promising discipline in pharmaceuticals and preventive healthcare, holding great potential for disease detection and drug testing. However, analysing large metabolic datasets remains challenging, with available methods generally relying on limited and incompletely annotated biological pathways. In this study, we introduce a novel approach that leverages machine learning classifiers trained on molecular fingerprints of metabolites to predict their responses under specific experimental conditions. We evaluate this approach using a cellular model for the genetic disease Ataxia Telangiectasia. Our analysis reveals several challenges associated with training machine learning models on metabolite fingerprints, including data imbalance, high dimensionality, and duplicate fingerprints. After addressing these issues, our models achieve satisfactory performance, providing evidence that the structural properties of metabolites hold predictive power over their response to experimental conditions. Additionally, we perform feature importance analysis to identify chemical configurations contributing to the classification process, shedding light on impacted biological pathways. Remarkably, this analysis not only identifies groups of metabolites known to participate in affected pathways but also discovers metabolites not previously associated with the disease, opening up novel opportunities for further exploration. These findings offer new avenues for research and underscore the potential of machine learning in elucidating complex metabolic responses in genetic disorders, ultimately contributing to improved disease understanding and therapeutic development.
Machine Learning-Enabled Prediction of Metabolite Response in Genetic Disorders
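
Two of the preprocessing challenges named in the abstract, duplicate fingerprints and data imbalance, can be illustrated with a minimal sketch. The abstract does not say how the authors resolved either issue, so everything below is an assumption made for illustration: the toy bit vectors, the hypothetical class labels `changed` and `unchanged`, the majority-vote rule for duplicates, and the inverse-frequency class weighting (the common "balanced" heuristic, n_samples / (n_classes * class_count)).

```python
from collections import Counter, defaultdict


def deduplicate(fingerprints, labels):
    """Collapse identical fingerprints into one sample with the majority label.

    Groups whose labels tie are dropped as ambiguous. This rule is an
    assumption of the sketch, not the procedure described in the paper.
    """
    groups = defaultdict(list)
    for fp, label in zip(fingerprints, labels):
        groups[tuple(fp)].append(label)

    unique_fps, unique_labels = [], []
    for fp, group in groups.items():
        counts = Counter(group).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            continue  # conflicting labels with no majority: skip
        unique_fps.append(fp)
        unique_labels.append(counts[0][0])
    return unique_fps, unique_labels


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical 4-bit fingerprints; real molecular fingerprints are far longer.
fingerprints = [
    (1, 0, 1, 0),
    (1, 0, 1, 0),  # duplicate with the same label
    (0, 1, 1, 0),
    (0, 1, 1, 0),  # duplicate with a conflicting label
    (0, 0, 0, 1),
    (1, 1, 0, 0),
]
labels = ["changed", "changed", "changed", "unchanged", "unchanged", "unchanged"]

fps, ys = deduplicate(fingerprints, labels)
weights = balanced_class_weights(ys)
print(fps, ys, weights)
```

The resulting weights could be passed to any classifier that accepts per-class weights, so that the rarer response class is not swamped by the majority class during training.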