Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning
Main Author: | Asif, Muhammad |
---|---|
Publication Date: | 2020 |
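Other Authors: | Martiniano, Hugo F.M.C.; Marques, Ana Rita; Santos, João Xavier; Vilela, Joana; Rasga, Celia; Oliveira, Guiomar; Couto, Francisco M.; Vicente, Astrid M. |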
Format: | Article |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10400.18/7319 |
Summary: | The complex genetic architecture of Autism Spectrum Disorder (ASD) and its heterogeneous phenotype make molecular diagnosis and patient prognosis challenging tasks. To establish more precise genotype-phenotype correlations in ASD, we developed a novel machine-learning integrative approach, which seeks to delineate associations between patients' clinical profiles and disrupted biological processes, inferred from their copy number variants (CNVs) that span brain genes. Clustering analysis of the relevant clinical measures from 2446 ASD cases in the Autism Genome Project identified two distinct phenotypic subgroups. Patients in these clusters differed significantly in ADOS-defined severity, adaptive behavior profiles, intellectual ability, and verbal status, the latter contributing the most to cluster stability and cohesion. Functional enrichment analysis of brain genes disrupted by CNVs in these ASD cases identified 15 statistically significant biological processes, including cell adhesion, neural development, cognition, and polyubiquitination, in line with previous ASD findings. A Naive Bayes classifier, generated to predict the ASD phenotypic clusters from disrupted biological processes, achieved high precision (0.82) but low recall (0.39) for a subset of patients with higher biological Information Content scores. This study shows that milder and more severe clinical presentations can have distinct underlying biological mechanisms. It further highlights how machine-learning approaches can reduce clinical heterogeneity by using multidimensional clinical measures, and establishes genotype-phenotype correlations in ASD. However, predictions are strongly dependent on patients' information content. Findings are therefore a first step toward the translation of genetic information into clinically useful applications, and emphasize the need for larger datasets with very complete clinical and biological information. |
id |
RCAP_d3b562cecdcc6bd5c975c9b9c70ffab6 |
---|---|
oai_identifier_str |
oai:repositorio.insa.pt:10400.18/7319 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
author |
Asif, Muhammad |
author_facet |
Asif, Muhammad; Martiniano, Hugo F.M.C.; Marques, Ana Rita; Santos, João Xavier; Vilela, Joana; Rasga, Celia; Oliveira, Guiomar; Couto, Francisco M.; Vicente, Astrid M. |
author_role |
author |
author2 |
Martiniano, Hugo F.M.C.; Marques, Ana Rita; Santos, João Xavier; Vilela, Joana; Rasga, Celia; Oliveira, Guiomar; Couto, Francisco M.; Vicente, Astrid M. |
dc.contributor.none.fl_str_mv |
Repositório Científico do Instituto Nacional de Saúde |
dc.subject.por.fl_str_mv |
Autism; Autism Spectrum Disorder (ASD); Neurodevelopmental Disorder; ASD Phenotype; Child Development Disorders and Mental Health (Perturbações do Desenvolvimento Infantil e Saúde Mental) |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-01-28 2020-01-28T00:00:00Z 2021-03-04T18:58:25Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.18/7319 |
url |
http://hdl.handle.net/10400.18/7319 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
ISSN 2158-3188; DOI 10.1038/s41398-020-0721-1 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Springer Nature |
publisher.none.fl_str_mv |
Springer Nature |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |