Text Mining Research Project: Internship at Ageas Portugal
Main Author: | Teixeira, Daniel Rocha |
---|---|
Publication Date: | 2021 |
Format: | Master thesis |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10362/128809 |
Summary: | Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
id |
RCAP_a33fa14096336c20b4a2dd88cc6a1093 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/128809 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
Title: Text Mining Research Project: Internship at Ageas Portugal
Keywords: Text mining; Text analytics; Natural language processing; Sentiment analysis; Topic classification
Description: Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
Abstract: As an insurance company, Ageas Portugal holds a large amount of data about its customers. Most of the data used by companies (apart from the few that already apply advanced machine learning and artificial intelligence techniques) is structured data, that is, formatted datasets and tables with customer information. With the advance of technology, however, more companies are starting to use their unstructured data, which can help them find insights and achieve their goals. Among the company's sources of data in human language form, such as emails, customer surveys, and medical transcriptions, we agreed that an email database would be the best option for developing the project. This type of data requires very thorough preparation, because emails contain irrelevant parts, such as signatures and disclaimers, that should be excluded. By analyzing customers' interactions with the company, we could find insights into how to increase sales and reduce the churn rate. We applied two Text Mining techniques (Sentiment Analysis and Topic Classification) and conducted a proof of concept. It showed that clients who send or are mentioned in emails tend to cancel their policies at a higher rate than those without emails, even if the email's topic is not related to cancellation. It also showed that the effect of sentiment on cancellation behavior appears to be mixed, requiring further analysis. The full project was developed in Python, but it was also compared with other market solutions, such as Amazon Web Services, SAS, Google Cloud, and Microsoft Azure, in order to find the Text Mining tool that best fits the company. As expected, Python was chosen as the best option.
Contributors: Pinheiro, Flávio Luís Portas; RUN
Author: Teixeira, Daniel Rocha
Record data: 2021-12-07T16:58:16Z; 2021-11-26; 2021-11-26T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/128809; TID:202809668; eng; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2024-05-22T17:57:26Z; oai:run.unl.pt:10362/128809; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T17:28:42.870435; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
dc.title.none.fl_str_mv |
Text Mining Research Project: Internship at Ageas Portugal |
title |
Text Mining Research Project: Internship at Ageas Portugal |
spellingShingle |
Text Mining Research Project: Internship at Ageas Portugal; Teixeira, Daniel Rocha; Text mining; Text analytics; Natural language processing; Sentiment analysis; Topic classification |
author |
Teixeira, Daniel Rocha |
author_facet |
Teixeira, Daniel Rocha |
author_role |
author |
dc.contributor.none.fl_str_mv |
Pinheiro, Flávio Luís Portas; RUN |
dc.contributor.author.fl_str_mv |
Teixeira, Daniel Rocha |
dc.subject.por.fl_str_mv |
Text mining; Text analytics; Natural language processing; Sentiment analysis; Topic classification |
topic |
Text mining; Text analytics; Natural language processing; Sentiment analysis; Topic classification |
description |
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-12-07T16:58:16Z 2021-11-26 2021-11-26T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/128809 TID:202809668 |
url |
http://hdl.handle.net/10362/128809 |
identifier_str_mv |
TID:202809668 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833596721352933376 |