id RCAP_4379769c2de136ef5b6f31ff66463dcc
oai_identifier_str oai:run.unl.pt:10362/154852
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling Colorectal cancer (CRC) is the third most common cancer and the second deadliest worldwide. It is a very heterogeneous disease that can develop via distinct pathways, and metastasis is the primary cause of death. It is therefore crucial to understand the molecular mechanisms underlying metastasis. RNA sequencing is an essential tool for studying the transcriptional landscape, but the high dimensionality of gene expression data makes selecting novel metastatic biomarkers difficult. To distinguish early-stage CRC patients at risk of developing metastasis from those who are not, three types of binary classification approaches were used: (1) classification methods (decision trees, linear and radial kernel support vector machines, logistic regression, and random forest) using differentially expressed genes (DEGs) as input features; (2) regularized logistic regression based on the Elastic Net penalty and on the proposed iTwiner, a network-based regularizer accounting for gene correlation information; and (3) classification methods based on the genes pre-selected using regularized logistic regression. Classifiers using the DEGs as features showed similar results, with random forest showing the highest accuracy. Using regularized logistic regression on the full dataset yielded no improvement in accuracy. Further classification using the genes pre-selected under the different penalties, instead of the DEGs, significantly improved the accuracy of the binary classifiers. Moreover, the use of network-based correlation information (iTwiner) for gene selection produced the best classification results and identified more stable and robust gene sets. Some of the selected genes are known to be tumor suppressors (OPCML-IT2), to be related to resistance to cancer therapies (RAC1P3), or to be involved in cancer processes such as genome stability (XRCC6P2), tumor growth and metastasis (MIR602), and regulation of gene transcription (NME2P2). We show that classifying CRC patients based on features pre-selected by regularized logistic regression is a valuable alternative to using DEGs, significantly increasing the models' predictive performance. Moreover, correlation-based penalization for biomarker selection stands as a promising strategy for predicting patient groups from RNA-seq data.
dc.title.none.fl_str_mv Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
title Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
spellingShingle Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
Peixoto, Carolina
Biomarker selection
Classification
Colorectal cancer
iTwiner
Regularization
Structural Biology
Biochemistry
Molecular Biology
Computer Science Applications
Applied Mathematics
SDG 3 - Good Health and Well-being
title_short Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
title_full Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
title_fullStr Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
title_full_unstemmed Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
title_sort Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
author Peixoto, Carolina
author_facet Peixoto, Carolina
Lopes, Marta B.
Martins, Marta
Casimiro, Sandra
Sobral, Daniel
Grosso, Ana Rita
Abreu, Catarina
Macedo, Daniela
Costa, Ana Lúcia
Pais, Helena
Alvim, Cecília
Mansinho, André
Filipe, Pedro
Costa, Pedro Marques da
Fernandes, Afonso
Borralho, Paula
Ferreira, Cristina
Malaquias, João
Quintela, António
Kaplan, Shannon
Golkaram, Mahdi
Salmans, Michael
Khan, Nafeesa
Vijayaraghavan, Raakhee
Zhang, Shile
Pawlowski, Traci
Godsey, Jim
So, Alex
Liu, Li
Costa, Luís
Vinga, Susana
author_role author
author2 Lopes, Marta B.
Martins, Marta
Casimiro, Sandra
Sobral, Daniel
Grosso, Ana Rita
Abreu, Catarina
Macedo, Daniela
Costa, Ana Lúcia
Pais, Helena
Alvim, Cecília
Mansinho, André
Filipe, Pedro
Costa, Pedro Marques da
Fernandes, Afonso
Borralho, Paula
Ferreira, Cristina
Malaquias, João
Quintela, António
Kaplan, Shannon
Golkaram, Mahdi
Salmans, Michael
Khan, Nafeesa
Vijayaraghavan, Raakhee
Zhang, Shile
Pawlowski, Traci
Godsey, Jim
So, Alex
Liu, Li
Costa, Luís
Vinga, Susana
author2_role author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
author
dc.contributor.none.fl_str_mv NOVALincs
CMA - Centro de Matemática e Aplicações
UCIBIO - Applied Molecular Biosciences Unit
DCV - Departamento de Ciências da Vida
RUN
dc.contributor.author.fl_str_mv Peixoto, Carolina
Lopes, Marta B.
Martins, Marta
Casimiro, Sandra
Sobral, Daniel
Grosso, Ana Rita
Abreu, Catarina
Macedo, Daniela
Costa, Ana Lúcia
Pais, Helena
Alvim, Cecília
Mansinho, André
Filipe, Pedro
Costa, Pedro Marques da
Fernandes, Afonso
Borralho, Paula
Ferreira, Cristina
Malaquias, João
Quintela, António
Kaplan, Shannon
Golkaram, Mahdi
Salmans, Michael
Khan, Nafeesa
Vijayaraghavan, Raakhee
Zhang, Shile
Pawlowski, Traci
Godsey, Jim
So, Alex
Liu, Li
Costa, Luís
Vinga, Susana
dc.subject.por.fl_str_mv Biomarker selection
Classification
Colorectal cancer
iTwiner
Regularization
Structural Biology
Biochemistry
Molecular Biology
Computer Science Applications
Applied Mathematics
SDG 3 - Good Health and Well-being
topic Biomarker selection
Classification
Colorectal cancer
iTwiner
Regularization
Structural Biology
Biochemistry
Molecular Biology
Computer Science Applications
Applied Mathematics
SDG 3 - Good Health and Well-being
description Publisher Copyright: © 2023, The Author(s).
publishDate 2023
dc.date.none.fl_str_mv 2023-07-04T22:18:45Z
2023-12
2023-12-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/154852
url http://hdl.handle.net/10362/154852
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 1471-2105
PURE: 65214208
https://doi.org/10.1186/s12859-022-05104-z
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 23
application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833596915305938944