Algoritmos incrementais para previsão de variáveis quantitativas usando dados de chamadas móveis

Bibliographic details
Main author: Marta Carolina Madeira Bebiano
Publication date: 2015
Document type: Dissertation
Language: por
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: https://repositorio-aberto.up.pt/handle/10216/83512
Abstract: The flow of information circulating today in both local and transnational data networks is huge. That information originates, for example, in the media or as the result of users' everyday activities. The mass storage of information in massive databases, at an increasing rate, creates growing difficulties for organizations in how this information should be handled; at the same time, it contains a hidden potential, often misunderstood and poorly acknowledged. With the growing accumulation of data, new problems and challenges have also arisen. How can one identify significant data, useful information and patterns of value amongst seemingly irrelevant information? In most areas information is constantly being stored, and in this context a new area of investigation, Data Mining, has evolved over the last three decades. Telecommunication enterprises in particular have at their disposal millions of records of precious information which they could use to develop new services for their clients, if they could find a clear way to use it properly. With that information they could perform several tasks, such as predicting the length of a call from the moment it begins, which is the goal of this study. This work is intended to contribute to the knowledge of how to transform data coming from a big database into relevant information for businesses. Ways to add more value and knowledge to the available information were sought in order to boost businesses' profits. Any study in this area is rapidly confronted with a great difficulty: the analysis of an enormous amount of data, which is a problem of computing capacity in data processing. The difficulty lies not only in identifying useful hidden information but also in the need to process that information in a reasonable amount of time.
Therefore, the main goal of this project is to study and compare incremental algorithms for predicting the length of a call from the moment it begins, and to identify the best algorithms for this regression problem and the preprocessing tasks involved. It is a supervised learning problem in which regression techniques are used. The following methods are used: distance-based methods (the k-Nearest Neighbor method), search-based methods (decision trees, namely VFDT, the Very Fast Decision Tree), and methods for heterogeneous and homogeneous ensembles, in which several models are combined to make the best decisions. At the end of the study, evaluation methods are used to compare the algorithms' efficiency. The results are expected to show which method is the most efficient in predicting the length of a call, the expected precision of the prediction, and the confidence interval within which the results fall.
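As a rough illustration of what "incremental" means for one of the methods named in the abstract, the sketch below shows a k-Nearest Neighbor regressor that learns one example at a time. It is not taken from the dissertation: the sliding window, the test-then-train (prequential) evaluation with mean absolute error, the class and function names, and the synthetic call features are all assumptions made for this example.

```python
from collections import deque
import math
import random


class SlidingWindowKNNRegressor:
    """k-NN regressor that learns one example at a time (hypothetical helper)."""

    def __init__(self, k=3, window_size=200):
        self.k = k
        # Bounded memory: the oldest examples fall out of the window.
        self.window = deque(maxlen=window_size)

    def predict_one(self, x):
        if not self.window:
            return 0.0  # nothing seen yet; arbitrary default prediction
        # Euclidean distance to every stored example; keep the k closest.
        nearest = sorted(self.window, key=lambda item: math.dist(item[0], x))[: self.k]
        return sum(y for _, y in nearest) / len(nearest)

    def learn_one(self, x, y):
        self.window.append((tuple(x), float(y)))


def prequential_mae(model, stream):
    """Test-then-train: predict each example before learning from it."""
    total_error, n = 0.0, 0
    for x, y in stream:
        total_error += abs(model.predict_one(x) - y)
        model.learn_one(x, y)
        n += 1
    return total_error / n if n else 0.0


def synthetic_calls(n, seed=0):
    """Synthetic stand-in for call records (assumed features, not real data):
    (hour_of_day, is_weekend) -> call duration in seconds."""
    rng = random.Random(seed)
    for _ in range(n):
        hour = rng.uniform(0, 24)
        weekend = float(rng.random() < 0.3)
        duration = 60 + 5 * hour + 90 * weekend + rng.gauss(0, 10)
        yield (hour, weekend), duration


if __name__ == "__main__":
    mae = prequential_mae(SlidingWindowKNNRegressor(k=5), synthetic_calls(1000))
    print(f"prequential MAE: {mae:.1f} s")
```

Because each record is scored before the model learns from it, the error reflects performance on unseen data without a separate test set, and the bounded window keeps memory use constant as the stream grows.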
id RCAP_24d029ec8f9d2a73a8b6cb2998c5af16
oai_identifier_str oai:repositorio-aberto.up.pt:10216/83512
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Algoritmos incrementais para previsão de variáveis quantitativas usando dados de chamadas móveis
author Marta Carolina Madeira Bebiano
author_facet Marta Carolina Madeira Bebiano
author_role author
dc.contributor.author.fl_str_mv Marta Carolina Madeira Bebiano
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
topic Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2015
dc.date.none.fl_str_mv 2015-07-21
2015-07-21T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio-aberto.up.pt/handle/10216/83512
TID:201313774
url https://repositorio-aberto.up.pt/handle/10216/83512
identifier_str_mv TID:201313774
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833599812994334721