Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning

Bibliographic Details
Main Author: Asif, Muhammad
Publication Date: 2020
Other Authors: Martiniano, Hugo F.M.C., Marques, Ana Rita, Santos, João Xavier, Vilela, Joana, Rasga, Celia, Oliveira, Guiomar, Couto, Francisco M., Vicente, Astrid M.
Format: Article
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10400.18/7319
Summary: The complex genetic architecture of Autism Spectrum Disorder (ASD) and its heterogeneous phenotype make molecular diagnosis and patient prognosis challenging tasks. To establish more precise genotype-phenotype correlations in ASD, we developed a novel machine-learning integrative approach, which seeks to delineate associations between patients' clinical profiles and disrupted biological processes, inferred from their copy number variants (CNVs) that span brain genes. Clustering analysis of the relevant clinical measures from 2446 ASD cases in the Autism Genome Project identified two distinct phenotypic subgroups. Patients in these clusters differed significantly in ADOS-defined severity, adaptive behavior profiles, intellectual ability, and verbal status, the latter contributing the most to cluster stability and cohesion. Functional enrichment analysis of brain genes disrupted by CNVs in these ASD cases identified 15 statistically significant biological processes, including cell adhesion, neural development, cognition, and polyubiquitination, in line with previous ASD findings. A Naive Bayes classifier, generated to predict the ASD phenotypic clusters from disrupted biological processes, achieved high precision (0.82) but low recall (0.39) for a subset of patients with higher biological Information Content scores. This study shows that milder and more severe clinical presentations can have distinct underlying biological mechanisms. It further highlights how machine-learning approaches can reduce clinical heterogeneity by using multidimensional clinical measures, and establishes genotype-phenotype correlations in ASD. However, predictions are strongly dependent on each patient's information content. These findings are therefore a first step toward the translation of genetic information into clinically useful applications, and emphasize the need for larger datasets with very complete clinical and biological information.
id RCAP_d3b562cecdcc6bd5c975c9b9c70ffab6
oai_identifier_str oai:repositorio.insa.pt:10400.18/7319
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.contributor.none.fl_str_mv Repositório Científico do Instituto Nacional de Saúde
dc.contributor.author.fl_str_mv Asif, Muhammad
Martiniano, Hugo F.M.C.
Marques, Ana Rita
Santos, João Xavier
Vilela, Joana
Rasga, Celia
Oliveira, Guiomar
Couto, Francisco M.
Vicente, Astrid M.
dc.subject.por.fl_str_mv Autism
Autism Spectrum Disorder (ASD)
Neurodevelopmental Disorder
ASD Phenotype
Child Developmental Disorders and Mental Health
publishDate 2020
dc.date.none.fl_str_mv 2020-01-28
2020-01-28T00:00:00Z
2021-03-04T18:58:25Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.18/7319
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv 2158-3188
10.1038/s41398-020-0721-1
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Springer Nature
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt