ROSIE
| Main author: | Jensch, Antje |
|---|---|
| Publication date: | 2022 |
| Other authors: | Lopes, Marta B.; Vinga, Susana; Radde, Nicole |
| Document type: | Article |
| Language: | eng |
| Source title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Full text: | http://hdl.handle.net/10362/143575 |
| Abstract: | We thank Peter Segaert for providing his adapted code of the enetLTS method. The results presented here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga . Funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2075 390740016 (AJ and NR). Publisher Copyright: © The Author(s) 2022. |
| id |
RCAP_42b8654080fe0d5b9dc6f883180d7b17 |
|---|---|
| oai_identifier_str |
oai:run.unl.pt:10362/143575 |
| network_acronym_str |
RCAP |
| network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str |
https://opendoar.ac.uk/repository/7160 |
| spelling |
ROSIE; RObust Sparse ensemble for outlIEr detection and gene selection in cancer omics data
biomarker; classification; Ensemble; feature selection; outlier; robust; sparse; triple-Negative Breast Cancer; Epidemiology; Statistics and Probability; Health Information Management; SDG 3 - Good Health and Well-being
We thank Peter Segaert for providing his adapted code of the enetLTS method. The results presented here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga . Funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2075 390740016 (AJ and NR). Publisher Copyright: © The Author(s) 2022.
The extraction of novel information from omics data is a challenging task, in particular since the number of features (e.g. genes) often far exceeds the number of samples. In such a setting, conventional parameter estimation leads to ill-posed optimization problems, and regularization may be required. In addition, outliers can largely impact classification accuracy. Here we introduce ROSIE, an ensemble classification approach, which combines three sparse and robust classification methods for outlier detection and feature selection and further performs a bootstrap-based validity check. Outliers of ROSIE are determined by the rank product test using outlier rankings of all three methods, and important features are selected as features commonly selected by all methods. We apply ROSIE to RNA-Seq data from The Cancer Genome Atlas (TCGA) to classify observations into Triple-Negative Breast Cancer (TNBC) and non-TNBC tissue samples. The pre-processed dataset consists of (Formula presented.) genes and more than (Formula presented.) samples. We demonstrate that ROSIE selects important features and outliers in a robust way. Identified outliers are concordant with the distribution of the commonly selected genes by the three methods, and results are in line with other independent studies. Furthermore, we discuss the association of some of the selected genes with the TNBC subtype in other investigations. In summary, ROSIE constitutes a robust and sparse procedure to identify outliers and important genes through binary classification. Our approach is ad hoc applicable to other datasets, fulfilling the overall goal of simultaneously identifying outliers and candidate disease biomarkers to be targeted in therapy research and personalized medicine frameworks.
NOVALincs; CMA - Centro de Matemática e Aplicações; RUN
Jensch, Antje; Lopes, Marta B.; Vinga, Susana; Radde, Nicole
2022-09-07T22:28:22Z; 2022-05; 2022-05-01T00:00:00Z
info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article
12; application/pdf
http://hdl.handle.net/10362/143575; eng
0962-2802; PURE: 42456000; https://doi.org/10.1177/09622802211072456
info:eu-repo/semantics/openAccess
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP
2024-05-22T18:04:57Z; oai:run.unl.pt:10362/143575; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T17:35:37.110798; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
| dc.title.none.fl_str_mv |
ROSIE: RObust Sparse ensemble for outlIEr detection and gene selection in cancer omics data |
| title |
ROSIE |
| spellingShingle |
ROSIE; Jensch, Antje; biomarker; classification; Ensemble; feature selection; outlier; robust; sparse; triple-Negative Breast Cancer; Epidemiology; Statistics and Probability; Health Information Management; SDG 3 - Good Health and Well-being |
| author |
Jensch, Antje |
| author_facet |
Jensch, Antje; Lopes, Marta B.; Vinga, Susana; Radde, Nicole |
| author_role |
author |
| author2 |
Lopes, Marta B.; Vinga, Susana; Radde, Nicole |
| author2_role |
author; author; author |
| dc.contributor.none.fl_str_mv |
NOVALincs; CMA - Centro de Matemática e Aplicações; RUN |
| dc.contributor.author.fl_str_mv |
Jensch, Antje; Lopes, Marta B.; Vinga, Susana; Radde, Nicole |
| dc.subject.por.fl_str_mv |
biomarker; classification; Ensemble; feature selection; outlier; robust; sparse; triple-Negative Breast Cancer; Epidemiology; Statistics and Probability; Health Information Management; SDG 3 - Good Health and Well-being |
| topic |
biomarker; classification; Ensemble; feature selection; outlier; robust; sparse; triple-Negative Breast Cancer; Epidemiology; Statistics and Probability; Health Information Management; SDG 3 - Good Health and Well-being |
| description |
We thank Peter Segaert for providing his adapted code of the enetLTS method. The results presented here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga . Funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2075 390740016 (AJ and NR). Publisher Copyright: © The Author(s) 2022. |
| publishDate |
2022 |
| dc.date.none.fl_str_mv |
2022-09-07T22:28:22Z; 2022-05; 2022-05-01T00:00:00Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
| format |
article |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/143575 |
| url |
http://hdl.handle.net/10362/143575 |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
0962-2802; PURE: 42456000; https://doi.org/10.1177/09622802211072456 |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
12; application/pdf |
| dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
| instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str |
RCAAP |
| institution |
RCAAP |
| reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv |
info@rcaap.pt |
| _version_ |
1833596817529372672 |
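
The abstract in this record states that ROSIE determines outliers by a rank product test over the outlier rankings of its three methods, and keeps the features selected by all methods. A minimal Python sketch of that aggregation step is given below, under stated assumptions: the function names (`rank_product`, `rank_product_pvalues`, `common_features`) are hypothetical, the p-values come from a simple permutation null that may differ from the test variant the authors used, and the three underlying classifiers and the bootstrap-based validity check are not reproduced.

```python
import numpy as np


def rank_product(ranks):
    """Geometric mean of each sample's ranks across methods.

    ranks: array of shape (n_samples, n_methods); rank 1 = most outlying.
    """
    ranks = np.asarray(ranks, dtype=float)
    return np.exp(np.log(ranks).mean(axis=1))


def rank_product_pvalues(ranks, n_perm=1000, seed=0):
    """Permutation p-values for small rank products (assumed null model:
    each method ranks the samples independently and uniformly at random)."""
    ranks = np.asarray(ranks, dtype=float)
    n_samples, n_methods = ranks.shape
    rng = np.random.default_rng(seed)
    observed = rank_product(ranks)
    hits = np.zeros(n_samples)
    for _ in range(n_perm):
        # Draw one random ranking per method and compute null rank products.
        null_ranks = np.column_stack(
            [rng.permutation(n_samples) + 1 for _ in range(n_methods)]
        )
        null_rp = np.sort(rank_product(null_ranks))
        # Count null rank products that are <= each observed rank product.
        hits += np.searchsorted(null_rp, observed, side="right")
    # Add-one correction keeps p-values strictly positive.
    return (hits + 1) / (n_perm * n_samples + 1)


def common_features(*selections):
    """Features selected by every method (set intersection), sorted."""
    return sorted(set.intersection(*(set(s) for s in selections)))


if __name__ == "__main__":
    # Toy input: 5 samples ranked by 3 hypothetical methods.
    toy_ranks = [[1, 1, 2], [2, 3, 1], [3, 2, 4], [4, 5, 3], [5, 4, 5]]
    print(rank_product(toy_ranks))
    print(rank_product_pvalues(toy_ranks, n_perm=500))
    print(common_features(["G1", "G2", "G3"], ["G2", "G3"], ["G3", "G2", "G4"]))
```

Samples with small p-values are ranked as outlying by all methods at once, which is the consensus the rank product is meant to capture. A significance threshold and any multiple-testing correction would have to be chosen separately.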