Data Integration Solution in an Heterogeneous Environment

Bibliographic details
Main author: Pedro Manuel dos Santos Rocha
Publication date: 2017
Document type: Dissertation
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: https://hdl.handle.net/10216/106487
Abstract: Over the last few years, increasing attention has been given to both data collection and knowledge extraction. Recent developments in data storage, distributed systems and parallelization have made the analysis of vast amounts of data more straightforward. However, while processing large quantities of information has become simpler, some problems still need to be addressed. One of them is the clean-up of the collected data, that is, its transformation into a more useful format from which knowledge can be extracted. This problem is usually addressed by developing a solution on a case-by-case basis, which does not generalize. As expected, this type of solution works well in an environment where the data is well known and has a fixed structure, but any change to the initial or final structure of the information requires an adjustment to the solution. This adds complexity that can make an application increasingly difficult to maintain and to extend with new features. The solution analyzed throughout this dissertation is an application in which a user can combine and transform information that originates from different sources. This is achieved through user-defined configuration documents, so that when a change is made in the system the impact on the end user is minimized. To better test the suitability of the solution, it is developed using a real-world scenario. This scenario is based on an existing application that collects information from a variety of sources and needs to transform the collected information into a more useful structure.
id RCAP_b89fd5ac5347f5cc3a381e71a203f05d
oai_identifier_str oai:repositorio-aberto.up.pt:10216/106487
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Data Integration Solution in an Heterogeneous Environment
title Data Integration Solution in an Heterogeneous Environment
spellingShingle Data Integration Solution in an Heterogeneous Environment
Pedro Manuel dos Santos Rocha
Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
title_short Data Integration Solution in an Heterogeneous Environment
title_full Data Integration Solution in an Heterogeneous Environment
title_fullStr Data Integration Solution in an Heterogeneous Environment
title_full_unstemmed Data Integration Solution in an Heterogeneous Environment
title_sort Data Integration Solution in an Heterogeneous Environment
author Pedro Manuel dos Santos Rocha
author_facet Pedro Manuel dos Santos Rocha
author_role author
dc.contributor.author.fl_str_mv Pedro Manuel dos Santos Rocha
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
topic Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2017
dc.date.none.fl_str_mv 2017-07-17
2017-07-17T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/106487
TID:201802252
url https://hdl.handle.net/10216/106487
identifier_str_mv TID:201802252
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833599634073714688