Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment

Bibliographic Details
Main Author: Marcos-Zambrano, LJ
Publication Date: 2021
Other Authors: Karaduzovic-Hadziabdic, K; Loncar-Turukalo, T; Przymus, P; Trajkovik, V; Aasmets, O; Berland, M; Gruca, A; Hasic, J; Hron, K; Klammsteiner, T; Kolev, M; Lahti, L; Lopes, MB; Moreno, V; Naskinova, I; Org, E; Paciência, I; Papoutsoglou, G; Shigdel, R; Stres, B; Vilne, B; Yousef, M; Zdravevski, E; Tsamardinos, I; Carrillo de Santa Pau, E; Claesson, MJ; Moreno-Indias, I; Truu, J
Format: Article
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: https://hdl.handle.net/10216/149451
Summary: The number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to deeply explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all information from these biological datasets, taking into account the peculiarities of microbiome data, i.e., the compositional, heterogeneous, and sparse nature of these datasets. The possibility of predicting host phenotypes based on taxonomy-informed feature selection, in order to establish associations between the microbiome and disease states and to predict those states, is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, to infer host phenotypes to predict diseases, and to use microbial communities to stratify patients by characterizing state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here are more related to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review covering this broad topic is aligned with the scoping review methodology.
The manual identification of data sources has been complemented with: (1) an automated publication search through the digital libraries of the three major publishers using a natural language processing (NLP) Toolkit, and (2) an automated identification of relevant software repositories on GitHub and a ranking of the related research papers relying on a learning-to-rank approach.
id RCAP_ab932b371bd2a049912030a29aed72b8
oai_identifier_str oai:repositorio-aberto.up.pt:10216/149451
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.subject.por.fl_str_mv biomarker identification
disease prediction
feature selection
machine learning
microbiome
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.relation.none.fl_str_mv 1664-302X
10.3389/fmicb.2021.634511
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Frontiers Media
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.mail.fl_str_mv info@rcaap.pt