Method to aid the diagnosis of prostate cancer using machine learning and clinical data

Bibliographic details
Year of defense: 2024
Main author: ARAUJO, Wesley Batista Dominices de
Advisor: SANTANA, Ewaldo Eder Carvalho
Defense committee: SANTANA, Ewaldo Eder Carvalho; LOBATO, Fábio Manoel França; ROSA, Claudia Regina de Andrade Arrais; SILVA, Luís Cláudio Nascimento da; BARROS FILHO, Allan Kardec Duailibe
Document type: Thesis
Access type: Open access
Language: Portuguese
Institution of defense: Universidade Federal do Maranhão
Graduate program: PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA DE ELETRICIDADE/CCET
Department: DEPARTAMENTO DE ENGENHARIA DA ELETRICIDADE/CCET
Country: Brazil
Access link: https://tedebc.ufma.br/jspui/handle/tede/5792
Abstract: After non-melanoma skin cancer, prostate cancer is the most common type of cancer among men and the one that causes the most deaths. Diagnosis begins with a physical examination (digital rectal exam) and a laboratory test (prostate-specific antigen, PSA). If these tests show changes, further tests may be requested, such as magnetic resonance imaging and biopsy. Currently, biopsy is the only procedure capable of confirming cancer, but it has a high financial cost and is very invasive. This thesis proposes a new method to aid in the screening of patients at risk for prostate cancer. The method was developed from clinical variables (age, race, systemic arterial hypertension, diabetes mellitus, smoking, alcoholism, digital rectal examination, and total PSA) of 274 patients, 137 with cancer and 137 without, obtained from medical records. The data were analyzed with several machine learning algorithms (Artificial Neural Networks, Support Vector Machine, Naive Bayes, K-Nearest Neighbors, and Decision Tree) to classify the samples according to the presence or absence of prostate cancer. The method was evaluated using performance metrics including accuracy, sensitivity, specificity, and area under the ROC curve. To increase the reliability of the results and the generalization capacity of the classifier, 10-fold cross-validation was used. The best performance was obtained with the Naive Bayes model, which achieved an accuracy of 89.09%, a sensitivity of 92%, a specificity of 86.67%, and an area under the ROC curve of 0.9187.
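The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`screening_metrics`, `roc_auc`, `k_fold_indices`) are hypothetical, the labels and scores are toy values rather than patient data, and the fold assignment is a simple interleaved split without shuffling or stratification, since the record does not state how the folds were built. Label 1 stands for "cancer" and 0 for "no cancer".

```python
from typing import Dict, List, Sequence


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = cancer)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of cancer cases detected
        "specificity": tn / (tn + fp),  # share of non-cancer cases cleared
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a random positive
    sample scores higher than a random negative one (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int = 10) -> List[List[int]]:
    """Assign sample indices to k folds in interleaved order (assumption:
    no shuffling or stratification). Each fold serves once as the test set."""
    return [list(range(i, n_samples, k)) for i in range(k)]


# Toy labels and classifier scores (illustrative only, not thesis data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]

metrics = screening_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
folds = k_fold_indices(274, k=10)  # 274 is the sample size given in the abstract

print(metrics)
print(auc)
print([len(f) for f in folds])
```

In a 10-fold run, each fold is held out in turn while a classifier is trained on the other nine, and the metrics are then computed on the held-out predictions. With 274 samples the folds cannot be equal in size, so this sketch yields four folds of 28 samples and six of 27.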