LIMITATIONS OF LOGISTIC REGRESSIONS TO ESTIMATE MEASURES OF ASSOCIATION FOR BINARY HEALTH OUTCOMES. A HANDS-ON STUDY WITH TWO EPIDEMIOLOGICAL EXAMPLES
Main author: | Guedes, Lara Pinheiro |
---|---|
Publication date: | 2023 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Full text: | http://hdl.handle.net/10362/162997 |
Abstract: | Introduction: In the medical literature, logistic regression is frequently used to estimate measures of association between an exposure, a health determinant or an intervention, and a binary outcome. However, when the outcome is frequent (>10%), model estimates for relative risks and prevalence ratios might be biased, with potential impact on medical decision-making and policymaking. Despite the availability of several alternatives, many studies still rely on logistic regression models, and a consensus on this matter is yet to be reached. We aimed to compare the estimation and goodness-of-fit of logistic, log-binomial and robust Poisson regression models in cross-sectional studies involving frequent binary outcomes. Methodology: Two cross-sectional studies with distinct characteristics and on different topics were conducted to estimate measures of association between an exposure and a frequent binary outcome. Study 1 was a nationally representative study on the impact of air pollution on mental health. Study 2 was a local study on immigrants' access to urgent health care services. Odds ratios (OR) were obtained through logistic regression models, while prevalence ratios (PR) were obtained through log-binomial and robust Poisson regression models. Confidence intervals (CI), their ranges, and standard errors (SE) were also computed, along with the models' relative goodness-of-fit through the Akaike Information Criterion (AIC), when applicable. Results: In Study 1, the OR (95% CI) was 1.015 (0.970-1.063), while the PR (95% CI) obtained through the robust Poisson model was 1.012 (0.979-1.045). The log-binomial regression model did not converge in this study. In Study 2, the OR (95% CI) was 1.584 (1.026-2.446), the PR (95% CI) for the log-binomial model was 1.217 (0.978-1.515), and the PR (95% CI) for the robust Poisson model was 1.130 (1.013-1.261). The 95% CIs, their ranges, and the SEs of the OR were higher than those of the PR in both studies. However, in Study 2, the AIC value was lowest for the logistic regression model, followed by the log-binomial regression and the robust Poisson regression (1.345, 1.350, and 1.656, respectively). Discussion and conclusions: In the two examples presented, the OR overestimated the PR, with wider 95% CIs and higher SEs. The extent of overestimation was greater as the outcome under study became more prevalent, in line with previous studies. Employing logistic regression models by default might lead to misinterpretations, especially by less experienced researchers. Robust Poisson models are viable alternatives to logistic regression models in cross-sectional studies with frequent binary outcomes, and they avoid the non-convergence issues of log-binomial models. Nonetheless, in Study 2, logistic regression was the model with the best fit, which illustrates the need to consider multiple criteria, rather than just one, when selecting the most appropriate statistical model for each study. This dissertation highlights the need for statistical guidelines to support the selection of the most appropriate models and to facilitate the correct reporting, interpretation, and communication of scientific results. |
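The overestimation described in the abstract follows from the definitions of the two measures: for the same pair of prevalences, OR = PR × (1 − p0) / (1 − p1), where p0 and p1 are the outcome prevalences among the unexposed and the exposed. The sketch below is only an illustration under stated assumptions: the prevalences are hypothetical (they are not data from either study), the PR is fixed at 1.2 for convenience, and the function names are invented for this example. It suggests how the gap between OR and PR widens as the outcome becomes more frequent.

```python
def prevalence_ratio(p_exposed, p_unexposed):
    """Ratio of outcome prevalences (exposed / unexposed)."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of outcome odds (exposed / unexposed)."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical setup: hold the true PR at 1.2 and vary the baseline
# prevalence from rare (1%) to very frequent (50%).
TRUE_PR = 1.2
rows = []
for p0 in (0.01, 0.10, 0.30, 0.50):
    p1 = TRUE_PR * p0  # prevalence among the exposed
    rows.append((p0, prevalence_ratio(p1, p0), odds_ratio(p1, p0)))

for p0, pr, or_ in rows:
    print(f"baseline prevalence {p0:.2f}: PR = {pr:.3f}, OR = {or_:.3f}")
```

With a rare outcome the two measures are nearly interchangeable, whereas at a baseline prevalence of 50% the OR is clearly larger than the PR it is often reported in place of.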
id |
RCAP_3d9199a180fabac10ed5af47acd85c90 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/162997 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
author |
Guedes, Lara Pinheiro |
dc.contributor.none.fl_str_mv |
Martins, Maria do Rosário RUN |
dc.subject.por.fl_str_mv |
logistic models; odds ratio; risk ratio; log-binomial models; robust Poisson models; prevalence ratio; Domínio/Área Científica::Ciências Naturais::Matemáticas |
dc.date.none.fl_str_mv |
2023-11; 2023-11-01T00:00:00Z; 2024-02-01T16:22:39Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/162997 |
dc.language.iso.fl_str_mv |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname: FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron: RCAAP |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833596979227131904 |