A data-driven approach to predict hospital length of stay: A Portuguese case study

Bibliographic Details
Main Author: Caetano, N.
Publication Date: 2014
Other Authors: Laureano, R., Cortez, P.
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10071/25857
Summary: Data Mining (DM) aims to extract useful knowledge from raw data. In recent decades, hospitals have collected large amounts of data through new methods of electronic data storage, increasing the potential value of DM in this domain, known as medical data mining. This work presents a case study of a Portuguese hospital, based on a recent and large dataset collected from 2000 to 2013. A data-driven predictive model was obtained for the length of stay (LOS), using as inputs indicators commonly available during the hospitalization process. Several state-of-the-art DM models were compared under a regression approach. The best result was obtained by a Random Forest (RF), which achieved a high coefficient of determination (0.81). Moreover, a sensitivity analysis was used to extract human-understandable knowledge from the RF model, revealing the three most influential input attributes: the hospital episode type, the physical service where the patient is hospitalized, and the associated medical specialty. Such predictive and explanatory knowledge is valuable for supporting the decisions of hospital managers.
id RCAP_d0c2a80dca59356a7f77c9194f153d1e
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/25857
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv A data-driven approach to predict hospital length of stay: A Portuguese case study
title A data-driven approach to predict hospital length of stay: A Portuguese case study
spellingShingle A data-driven approach to predict hospital length of stay: A Portuguese case study
Caetano, N.
Medical data mining
Length of stay
CRISP-DM
Regression
Random forest
title_short A data-driven approach to predict hospital length of stay: A Portuguese case study
title_full A data-driven approach to predict hospital length of stay: A Portuguese case study
title_fullStr A data-driven approach to predict hospital length of stay: A Portuguese case study
title_full_unstemmed A data-driven approach to predict hospital length of stay: A Portuguese case study
title_sort A data-driven approach to predict hospital length of stay: A Portuguese case study
author Caetano, N.
author_facet Caetano, N.
Laureano, R.
Cortez, P.
author_role author
author2 Laureano, R.
Cortez, P.
author2_role author
author
dc.contributor.author.fl_str_mv Caetano, N.
Laureano, R.
Cortez, P.
dc.subject.por.fl_str_mv Medical data mining
Length of stay
CRISP-DM
Regression
Random forest
topic Medical data mining
Length of stay
CRISP-DM
Regression
Random forest
description Data Mining (DM) aims to extract useful knowledge from raw data. In recent decades, hospitals have collected large amounts of data through new methods of electronic data storage, increasing the potential value of DM in this domain, known as medical data mining. This work presents a case study of a Portuguese hospital, based on a recent and large dataset collected from 2000 to 2013. A data-driven predictive model was obtained for the length of stay (LOS), using as inputs indicators commonly available during the hospitalization process. Several state-of-the-art DM models were compared under a regression approach. The best result was obtained by a Random Forest (RF), which achieved a high coefficient of determination (0.81). Moreover, a sensitivity analysis was used to extract human-understandable knowledge from the RF model, revealing the three most influential input attributes: the hospital episode type, the physical service where the patient is hospitalized, and the associated medical specialty. Such predictive and explanatory knowledge is valuable for supporting the decisions of hospital managers.
publishDate 2014
dc.date.none.fl_str_mv 2014-01-01T00:00:00Z
2014
2022-07-18T08:32:51Z
2022-07-07T15:21:54Z
dc.type.driver.fl_str_mv conference object
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/25857
url http://hdl.handle.net/10071/25857
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 978-989-758-027-7
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv SCITEPRESS Digital Library
publisher.none.fl_str_mv SCITEPRESS Digital Library
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833597150961860608