A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments.

Bibliographic details
Main author: VILELA, F. DE A.
Publication date: 2023
Other authors: TIMES, V. C., BERNARDI, A. C. de C., FREITAS, A. DE P., CIFERRI, R. R.
Document type: Article
Language: eng
Source title: Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
Full text: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1153634
https://doi.org/10.1016/j.heliyon.2023.e15728
Abstract: Organizations are increasingly interested in gathering data for strategic decision-making. These data are available in operational sources, which are distributed, heterogeneous, and autonomous, and are gathered through ETL processes that traditionally run at predefined times, such as once a day, once a week, once a month, or at some other specific interval. However, some applications, such as health systems and digital agriculture, need the data sooner, sometimes immediately after they are generated in the operational data sources. The conventional ETL process and the available techniques cannot deliver operational data in real time while providing low latency, high availability, and scalability. We propose an architecture, named Data Magnet, to support real-time ETL processes. Experimental tests in the digital agriculture domain, using real and synthetic data, showed that Data Magnet handled the ETL process in real time. Its elapsed time remained almost constant as data volumes grew, and it provided significant performance gains over the traditional trigger technique.
id EMBR_a5c4cc0fea60fa2ac153f30d6b57fd4e
oai_identifier_str oai:www.alice.cnptia.embrapa.br:doc/1153634
network_acronym_str EMBR
network_name_str Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
repository_id_str 2154
dc.title.none.fl_str_mv A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments.
title A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments.
spellingShingle A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments.
VILELA, F. DE A.
Data warehouse
Real time
ETL
Data extraction
Data loading
author VILELA, F. DE A.
author_facet VILELA, F. DE A.
TIMES, V. C.
BERNARDI, A. C. de C.
FREITAS, A. DE P.
CIFERRI, R. R.
author_role author
author2 TIMES, V. C.
BERNARDI, A. C. de C.
FREITAS, A. DE P.
CIFERRI, R. R.
author2_role author
author
author
author
dc.contributor.none.fl_str_mv FLÁVIO DE ASSIS VILELA, Federal Institute of Goiás; VALÉRIA CESÁRIO TIMES, Federal University of Pernambuco; ALBERTO CARLOS DE CAMPOS BERNARDI, CPPSE; AUGUSTO DE PAULA FREITAS, Federal University of São Carlos; RICARDO RODRIGUES CIFERRI, Federal University of São Carlos.
dc.contributor.author.fl_str_mv VILELA, F. DE A.
TIMES, V. C.
BERNARDI, A. C. de C.
FREITAS, A. DE P.
CIFERRI, R. R.
dc.subject.por.fl_str_mv Data warehouse
Real time
ETL
Data extraction
Data loading
topic Data warehouse
Real time
ETL
Data extraction
Data loading
publishDate 2023
dc.date.none.fl_str_mv 2023-05-10T13:47:24Z
2023-05-10T13:47:24Z
2023-05-10
2023
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv Heliyon, v. 9, n. 5, e15728, may 2023.
http://www.alice.cnptia.embrapa.br/alice/handle/doc/1153634
https://doi.org/10.1016/j.heliyon.2023.e15728
identifier_str_mv Heliyon, v. 9, n. 5, e15728, may 2023.
url http://www.alice.cnptia.embrapa.br/alice/handle/doc/1153634
https://doi.org/10.1016/j.heliyon.2023.e15728
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 25 p.
dc.source.none.fl_str_mv reponame:Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
instname:Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
instacron:EMBRAPA
instname_str Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
instacron_str EMBRAPA
institution EMBRAPA
reponame_str Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
collection Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
repository.name.fl_str_mv Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) - Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
repository.mail.fl_str_mv cg-riaa@embrapa.br
_version_ 1830224738910208000