Machine Learning applied to credit risk assessment: Prediction of loan defaults
| Main author: | Simão, Sofia Beatriz Santos |
|---|---|
| Publication date: | 2023 |
| Document type: | Dissertation |
| Language: | eng |
| Source title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Full text: | http://hdl.handle.net/10362/149818 |
| Summary: | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science |
| id | RCAP_e0ffc47e83258a0caca6e055e6edc721 |
|---|---|
| oai_identifier_str | oai:run.unl.pt:10362/149818 |
| network_acronym_str | RCAP |
| network_name_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str | https://opendoar.ac.uk/repository/7160 |
| spelling | Machine Learning applied to credit risk assessment: Prediction of loan defaults; Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults; Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science; Due to the recent financial crisis and the regulatory concerns of Basel II, credit risk assessment is becoming a very important topic in the field of financial risk management. Financial institutions need to take great care when dealing with consumer loans in order to avoid losses and opportunity costs. To this end, credit scoring systems have been used to make informed decisions on whether or not to grant credit to the clients who apply for it. To date, several credit scoring models have been proposed, ranging from statistical models to more complex artificial intelligence techniques. However, most previous work has focused on employing single classifiers. Ensemble learning is a powerful machine learning paradigm which has proven to be of great value in solving a variety of problems. This study compares the performance of the industry standard, logistic regression, to four ensemble methods, namely AdaBoost, Gradient Boosting, Random Forest and Stacking, in identifying potential loan defaults. All the models were built with a real-world dataset with over one million customers from Lending Club, a financial institution based in the United States. The performance of the models was compared using the hold-out method as the evaluation design and accuracy, AUC, type I error and type II error as evaluation metrics. Experimental results reveal that the ensemble classifiers were able to outperform logistic regression on three key indicators, namely accuracy, type I error and type II error. AdaBoost performed better than the remaining classifiers when considering a trade-off between all the metrics evaluated. The main contribution of this thesis is an experimental addition to the literature on the preferred models for predicting potential loan defaulters.; Castelli, Mauro; RUN; Simão, Sofia Beatriz Santos; 2023-02-28T18:49:41Z; 2023-01-26; 2023-01-26T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/149818; TID:203239067; eng; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2024-05-22T18:09:34Z; oai:run.unl.pt:10362/149818; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T17:39:57.972694; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
| dc.title.none.fl_str_mv | Machine Learning applied to credit risk assessment: Prediction of loan defaults |
| title | Machine Learning applied to credit risk assessment: Prediction of loan defaults |
| spellingShingle | Machine Learning applied to credit risk assessment: Prediction of loan defaults; Simão, Sofia Beatriz Santos; Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults |
| title_short | Machine Learning applied to credit risk assessment: Prediction of loan defaults |
| title_full | Machine Learning applied to credit risk assessment: Prediction of loan defaults |
| title_fullStr | Machine Learning applied to credit risk assessment: Prediction of loan defaults |
| title_full_unstemmed | Machine Learning applied to credit risk assessment: Prediction of loan defaults |
| title_sort | Machine Learning applied to credit risk assessment: Prediction of loan defaults |
| author | Simão, Sofia Beatriz Santos |
| author_facet | Simão, Sofia Beatriz Santos |
| author_role | author |
| dc.contributor.none.fl_str_mv | Castelli, Mauro; RUN |
| dc.contributor.author.fl_str_mv | Simão, Sofia Beatriz Santos |
| dc.subject.por.fl_str_mv | Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults |
| topic | Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults |
| description | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science |
| publishDate | 2023 |
| dc.date.none.fl_str_mv | 2023-02-28T18:49:41Z; 2023-01-26; 2023-01-26T00:00:00Z |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
| format | masterThesis |
| status_str | publishedVersion |
| dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10362/149818; TID:203239067 |
| url | http://hdl.handle.net/10362/149818 |
| identifier_str_mv | TID:203239067 |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.source.none.fl_str_mv | reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
| instname_str | FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str | RCAAP |
| institution | RCAAP |
| reponame_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv | info@rcaap.pt |
| _version_ | 1833596873749823488 |
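
The evaluation design named in the abstract of this record (a hold-out split scored with accuracy, AUC, type I error and type II error) can be sketched in plain Python. This is only an illustration and not the thesis's code: the function names `holdout_split`, `auc` and `evaluate`, the 30% test fraction, the 0.5 decision threshold and the toy labels and scores are all assumptions. The sketch also assumes that a default is the positive class (label 1), that type I error is the share of non-defaulters wrongly flagged as defaults, and that type II error is the share of defaulters the model misses; the thesis may define the two errors the other way round.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle once, then cut into (train, test): a single hold-out split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auc(y_true, scores):
    """Chance that a random defaulter outscores a random non-defaulter.

    Ties count as half a win (the rank-based definition of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, scores, threshold=0.5):
    """Return the four metrics named in the abstract for one classifier."""
    area = auc(y_true, scores)  # also validates that both classes are present
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Assumed convention: type I = non-defaulter flagged as default.
        "type_i_error": fp / (fp + tn),
        # Assumed convention: type II = defaulter predicted as non-default.
        "type_ii_error": fn / (fn + tp),
        "auc": area,
    }


if __name__ == "__main__":
    # Toy held-out labels (1 = default) and hypothetical model scores.
    y_test = [0, 0, 0, 0, 1, 1, 1, 0]
    model_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.3, 0.2]
    print(evaluate(y_test, model_scores))
```

Running `evaluate` once per classifier on the same held-out rows gives one row of metrics per model, which is the kind of side-by-side comparison the abstract describes. AUC does not depend on the threshold, while accuracy and both error rates do, so the threshold should be held fixed across the models being compared.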