ROSIE

Bibliographic details
Main author: Jensch, Antje
Publication date: 2022
Other authors: Lopes, Marta B., Vinga, Susana, Radde, Nicole
Document type: Article
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/10362/143575
Abstract: We thank Peter Segaert for providing his adapted code of the enetLTS method. The results presented here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga . Funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2075 390740016 (AJ and NR). Publisher Copyright: © The Author(s) 2022.
id RCAP_42b8654080fe0d5b9dc6f883180d7b17
oai_identifier_str oai:run.unl.pt:10362/143575
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling ROSIE
RObust Sparse ensemble for outlIEr detection and gene selection in cancer omics data
biomarker
classification
Ensemble
feature selection
outlier
robust
sparse
triple-Negative Breast Cancer
Epidemiology
Statistics and Probability
Health Information Management
SDG 3 - Good Health and Well-being
We thank Peter Segaert for providing his adapted code of the enetLTS method. The results presented here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga . Funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2075 390740016 (AJ and NR). Publisher Copyright: © The Author(s) 2022.
The extraction of novel information from omics data is a challenging task, in particular, since the number of features (e.g. genes) often far exceeds the number of samples. In such a setting, conventional parameter estimation leads to ill-posed optimization problems, and regularization may be required. In addition, outliers can largely impact classification accuracy. Here we introduce ROSIE, an ensemble classification approach, which combines three sparse and robust classification methods for outlier detection and feature selection and further performs a bootstrap-based validity check. Outliers of ROSIE are determined by the rank product test using outlier rankings of all three methods, and important features are selected as features commonly selected by all methods. We apply ROSIE to RNA-Seq data from The Cancer Genome Atlas (TCGA) to classify observations into Triple-Negative Breast Cancer (TNBC) and non-TNBC tissue samples. The pre-processed dataset consists of (Formula presented.) genes and more than (Formula presented.) samples. We demonstrate that ROSIE selects important features and outliers in a robust way. Identified outliers are concordant with the distribution of the commonly selected genes by the three methods, and results are in line with other independent studies. Furthermore, we discuss the association of some of the selected genes with the TNBC subtype in other investigations. In summary, ROSIE constitutes a robust and sparse procedure to identify outliers and important genes through binary classification. Our approach is ad hoc applicable to other datasets, fulfilling the overall goal of simultaneously identifying outliers and candidate disease biomarkers to be targeted in therapy research and personalized medicine frameworks.
NOVALincs
CMA - Centro de Matemática e Aplicações
RUN
Jensch, Antje
Lopes, Marta B.
Vinga, Susana
Radde, Nicole
2022-09-07T22:28:22Z
2022-05
2022-05-01T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
12
application/pdf
http://hdl.handle.net/10362/143575
eng
0962-2802
PURE: 42456000
https://doi.org/10.1177/09622802211072456
info:eu-repo/semantics/openAccess
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
2024-05-22T18:04:57Z
oai:run.unl.pt:10362/143575
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
info@rcaap.pt
opendoar:https://opendoar.ac.uk/repository/7160
2025-05-28T17:35:37.110798
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
false
dc.title.none.fl_str_mv ROSIE
RObust Sparse ensemble for outlIEr detection and gene selection in cancer omics data
title ROSIE
spellingShingle ROSIE
Jensch, Antje
biomarker
classification
Ensemble
feature selection
outlier
robust
sparse
triple-Negative Breast Cancer
Epidemiology
Statistics and Probability
Health Information Management
SDG 3 - Good Health and Well-being
title_short ROSIE
title_full ROSIE
title_fullStr ROSIE
title_full_unstemmed ROSIE
title_sort ROSIE
author Jensch, Antje
author_facet Jensch, Antje
Lopes, Marta B.
Vinga, Susana
Radde, Nicole
author_role author
author2 Lopes, Marta B.
Vinga, Susana
Radde, Nicole
author2_role author
author
author
dc.contributor.none.fl_str_mv NOVALincs
CMA - Centro de Matemática e Aplicações
RUN
dc.contributor.author.fl_str_mv Jensch, Antje
Lopes, Marta B.
Vinga, Susana
Radde, Nicole
dc.subject.por.fl_str_mv biomarker
classification
Ensemble
feature selection
outlier
robust
sparse
triple-Negative Breast Cancer
Epidemiology
Statistics and Probability
Health Information Management
SDG 3 - Good Health and Well-being
topic biomarker
classification
Ensemble
feature selection
outlier
robust
sparse
triple-Negative Breast Cancer
Epidemiology
Statistics and Probability
Health Information Management
SDG 3 - Good Health and Well-being
description We thank Peter Segaert for providing his adapted code of the enetLTS method. The results presented here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga . Funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2075 390740016 (AJ and NR). Publisher Copyright: © The Author(s) 2022.
publishDate 2022
dc.date.none.fl_str_mv 2022-09-07T22:28:22Z
2022-05
2022-05-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/143575
url http://hdl.handle.net/10362/143575
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 0962-2802
PURE: 42456000
https://doi.org/10.1177/09622802211072456
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 12
application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833596817529372672