Machine learning-based classification to improve Gas Chromatography-Mass spectrometry data processing.

Abstract :

Introduction

Lack of reliable peak detection impedes automated analysis of large-scale gas chromatography-mass spectrometry (GC-MS) metabolomics datasets. The performance and outcome of individual peak-picking algorithms can differ widely depending on the algorithmic approach, the parameters, and the data acquisition method, which makes it difficult to compare algorithms.

Technological and methodological innovation

We present part of the work published in [1] and implemented in our Workflow for Improved Peak Picking (WiPP), focusing on the use of machine learning-based classification to optimize and improve different steps of the common GC-MS metabolomics data processing workflow. Our approach evaluates the quality of detected peaks with a machine learning-based classification scheme comprising seven peak classes. The quality information returned by the classifier for each individual peak is merged with the results from different peak detection algorithms to create one final high-quality peak set for immediate downstream analysis.

Results and impact

We benchmarked our workflow against standard compound mixes and a complex biological dataset, demonstrating that peak detection is improved. Furthermore, the approach can provide an impartial performance comparison of different peak-picking algorithms. We also discuss the applicability of the approach to liquid chromatography-mass spectrometry data.

References

[1] Gloaguen, Y.; Borgsmüller, N. et al. WiPP: Workflow for Improved Peak Picking for Gas Chromatography-Mass Spectrometry (GC-MS) Data. Metabolites 2019, 9, 171.
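
The merging step described in the abstract (classifier output combined with the peaks reported by several peak-picking algorithms) can be illustrated with a minimal sketch. It is not the WiPP implementation: the `Peak` fields, the placeholder class names ("good", "noise"), the picker names, the retention-time tolerance `rt_tol`, and the "keep the highest-scoring peak per cluster" rule are all assumptions made for illustration, and the seven peak classes of the actual scheme are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    rt: float        # retention time (seconds); hypothetical field
    algorithm: str   # name of the peak picker that reported the peak
    label: str       # class assigned by the classifier (placeholder names)
    score: float     # classifier confidence in [0, 1]; hypothetical field


def merge_peaks(peaks, rt_tol=1.0, rejected=("noise",)):
    """Return one peak per retention-time cluster.

    Peaks whose predicted class is in `rejected` are dropped first. The
    remaining peaks are sorted by retention time and chained into clusters
    whenever consecutive peaks are at most `rt_tol` apart. From each cluster
    the peak with the highest classifier score is kept.
    """
    kept = sorted((p for p in peaks if p.label not in rejected),
                  key=lambda p: p.rt)
    merged, cluster = [], []
    for p in kept:
        # A gap larger than rt_tol closes the current cluster.
        if cluster and p.rt - cluster[-1].rt > rt_tol:
            merged.append(max(cluster, key=lambda q: q.score))
            cluster = []
        cluster.append(p)
    if cluster:
        merged.append(max(cluster, key=lambda q: q.score))
    return merged


# Toy input: two hypothetical pickers report overlapping peaks.
peaks = [
    Peak(100.2, "picker_a", "good", 0.91),
    Peak(100.6, "picker_b", "good", 0.78),
    Peak(150.0, "picker_a", "noise", 0.95),
    Peak(200.1, "picker_b", "good", 0.66),
]
final = merge_peaks(peaks)
```

In this toy input the two peaks near 100 s collapse into the higher-scoring one, and the peak classified as noise is discarded regardless of its confidence. Counting how many peaks from each picker survive the merge is one simple way such a scheme could support the algorithm comparison mentioned in the abstract.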

https://hal.archives-ouvertes.fr/hal-02505901
Contributor : Archive Ouverte Prodinra
Submitted on : Wednesday, March 11, 2020 - 8:10:04 PM
Last modification on : Tuesday, March 17, 2020 - 3:36:19 AM

Identifiers

  • HAL Id : hal-02505901, version 1
  • PRODINRA : 496662

Citation

Yoann Gloaguen, Nico Borgsmüller, Tobias Opialla, Eric Blanc, Emilie Sicard, et al. Machine learning-based classification to improve Gas Chromatography-Mass spectrometry data processing. European RFMF Metabomeeting 2020, Jan 2020, Toulouse, France. 263 p., 2020. ⟨hal-02505901⟩