Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
Main Author: | Peixoto, Carolina |
---|---|
Publication Date: | 2023 |
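Other Authors: | Lopes, Marta B.; Martins, Marta; Casimiro, Sandra; Sobral, Daniel; Grosso, Ana Rita; Abreu, Catarina; Macedo, Daniela; Costa, Ana Lúcia; Pais, Helena; Alvim, Cecília; Mansinho, André; Filipe, Pedro; Costa, Pedro Marques da; Fernandes, Afonso; Borralho, Paula; Ferreira, Cristina; Malaquias, João; Quintela, António; Kaplan, Shannon; Golkaram, Mahdi; Salmans, Michael; Khan, Nafeesa; Vijayaraghavan, Raakhee; Zhang, Shile; Pawlowski, Traci; Godsey, Jim; So, Alex; Liu, Li; Costa, Luís; Vinga, Susana |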
Format: | Article |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10362/154852 |
Summary: | Publisher Copyright: © 2023, The Author(s). |
id |
RCAP_4379769c2de136ef5b6f31ff66463dcc |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/154852 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization
Biomarker selection; Classification; Colorectal cancer; iTwiner; Regularization; Structural Biology; Biochemistry; Molecular Biology; Computer Science Applications; Applied Mathematics; SDG 3 - Good Health and Well-being
Publisher Copyright: © 2023, The Author(s).
Colorectal cancer (CRC) is the third most common cancer and the second deadliest worldwide. It is a very heterogeneous disease that can develop via distinct pathways, and metastasis is the primary cause of death. Therefore, it is crucial to understand the molecular mechanisms underlying metastasis. RNA-sequencing is an essential tool for studying the transcriptional landscape. However, the high dimensionality of gene expression data makes selecting novel metastatic biomarkers problematic. To distinguish early-stage CRC patients at risk of developing metastasis from those who are not, three types of binary classification approaches were used: (1) classification methods (decision trees, linear and radial kernel support vector machines, logistic regression, and random forest) using differentially expressed genes (DEGs) as input features; (2) regularized logistic regression based on the Elastic Net penalty and the proposed iTwiner, a network-based regularizer accounting for gene correlation information; and (3) classification methods based on the genes pre-selected using regularized logistic regression. Classifiers using the DEGs as features showed similar results, with random forest showing the highest accuracy. Using regularized logistic regression on the full dataset yielded no improvement in the methods' accuracy. Further classification using the pre-selected genes found by different penalty factors, instead of the DEGs, significantly improved the accuracy of the binary classifiers. Moreover, the use of network-based correlation information (iTwiner) for gene selection produced the best classification results and the identification of more stable and robust gene sets. Some of the selected genes are known to be tumor suppressor genes (OPCML-IT2), to be related to resistance to cancer therapies (RAC1P3), or to be involved in several cancer processes such as genome stability (XRCC6P2), tumor growth and metastasis (MIR602), and regulation of gene transcription (NME2P2). We show that the classification of CRC patients based on features pre-selected by regularized logistic regression is a valuable alternative to using DEGs, significantly increasing the models' predictive performance. Moreover, the use of correlation-based penalization for biomarker selection stands as a promising strategy for predicting patient groups based on RNA-seq data.
NOVALincs; CMA - Centro de Matemática e Aplicações; UCIBIO - Applied Molecular Biosciences Unit; DCV - Departamento de Ciências da Vida; RUN
Peixoto, Carolina; Lopes, Marta B.; Martins, Marta; Casimiro, Sandra; Sobral, Daniel; Grosso, Ana Rita; Abreu, Catarina; Macedo, Daniela; Costa, Ana Lúcia; Pais, Helena; Alvim, Cecília; Mansinho, André; Filipe, Pedro; Costa, Pedro Marques da; Fernandes, Afonso; Borralho, Paula; Ferreira, Cristina; Malaquias, João; Quintela, António; Kaplan, Shannon; Golkaram, Mahdi; Salmans, Michael; Khan, Nafeesa; Vijayaraghavan, Raakhee; Zhang, Shile; Pawlowski, Traci; Godsey, Jim; So, Alex; Liu, Li; Costa, Luís; Vinga, Susana
2023-07-04T22:18:45Z 2023-12 2023-12-01T00:00:00Z
info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article
23 application/pdf
http://hdl.handle.net/10362/154852
eng
1471-2105 PURE: 65214208 https://doi.org/10.1186/s12859-022-05104-z
info:eu-repo/semantics/openAccess
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP
2024-05-22T18:12:38Z
oai:run.unl.pt:10362/154852
Portal AgregadorONG https://www.rcaap.pt/oai/openaire info@rcaap.pt opendoar:https://opendoar.ac.uk/repository/7160 2025-05-28T17:42:54.794855 Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia false |
dc.title.none.fl_str_mv |
Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization |
title |
Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization |
spellingShingle |
Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization Peixoto, Carolina Biomarker selection Classification Colorectal cancer iTwiner Regularization Structural Biology Biochemistry Molecular Biology Computer Science Applications Applied Mathematics SDG 3 - Good Health and Well-being |
author |
Peixoto, Carolina |
author_facet |
Peixoto, Carolina Lopes, Marta B. Martins, Marta Casimiro, Sandra Sobral, Daniel Grosso, Ana Rita Abreu, Catarina Macedo, Daniela Costa, Ana Lúcia Pais, Helena Alvim, Cecília Mansinho, André Filipe, Pedro Costa, Pedro Marques da Fernandes, Afonso Borralho, Paula Ferreira, Cristina Malaquias, João Quintela, António Kaplan, Shannon Golkaram, Mahdi Salmans, Michael Khan, Nafeesa Vijayaraghavan, Raakhee Zhang, Shile Pawlowski, Traci Godsey, Jim So, Alex Liu, Li Costa, Luís Vinga, Susana |
author_role |
author |
author2 |
Lopes, Marta B. Martins, Marta Casimiro, Sandra Sobral, Daniel Grosso, Ana Rita Abreu, Catarina Macedo, Daniela Costa, Ana Lúcia Pais, Helena Alvim, Cecília Mansinho, André Filipe, Pedro Costa, Pedro Marques da Fernandes, Afonso Borralho, Paula Ferreira, Cristina Malaquias, João Quintela, António Kaplan, Shannon Golkaram, Mahdi Salmans, Michael Khan, Nafeesa Vijayaraghavan, Raakhee Zhang, Shile Pawlowski, Traci Godsey, Jim So, Alex Liu, Li Costa, Luís Vinga, Susana |
author2_role |
author author author author author author author author author author author author author author author author author author author author author author author author author author author author author author |
dc.contributor.none.fl_str_mv |
NOVALincs CMA - Centro de Matemática e Aplicações UCIBIO - Applied Molecular Biosciences Unit DCV - Departamento de Ciências da Vida RUN |
dc.contributor.author.fl_str_mv |
Peixoto, Carolina Lopes, Marta B. Martins, Marta Casimiro, Sandra Sobral, Daniel Grosso, Ana Rita Abreu, Catarina Macedo, Daniela Costa, Ana Lúcia Pais, Helena Alvim, Cecília Mansinho, André Filipe, Pedro Costa, Pedro Marques da Fernandes, Afonso Borralho, Paula Ferreira, Cristina Malaquias, João Quintela, António Kaplan, Shannon Golkaram, Mahdi Salmans, Michael Khan, Nafeesa Vijayaraghavan, Raakhee Zhang, Shile Pawlowski, Traci Godsey, Jim So, Alex Liu, Li Costa, Luís Vinga, Susana |
dc.subject.por.fl_str_mv |
Biomarker selection Classification Colorectal cancer iTwiner Regularization Structural Biology Biochemistry Molecular Biology Computer Science Applications Applied Mathematics SDG 3 - Good Health and Well-being |
topic |
Biomarker selection Classification Colorectal cancer iTwiner Regularization Structural Biology Biochemistry Molecular Biology Computer Science Applications Applied Mathematics SDG 3 - Good Health and Well-being |
description |
Publisher Copyright: © 2023, The Author(s). |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-07-04T22:18:45Z 2023-12 2023-12-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/154852 |
url |
http://hdl.handle.net/10362/154852 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
1471-2105 PURE: 65214208 https://doi.org/10.1186/s12859-022-05104-z |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
23 application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833596915305938944 |