A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments.
Main Author: | VILELA, F. DE A. |
---|---|
Publication Date: | 2023 |
Other Authors: | TIMES, V. C.; BERNARDI, A. C. de C.; FREITAS, A. DE P.; CIFERRI, R. R. |
Format: | Article |
Language: | eng |
Source: | Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
Download full: | http://www.alice.cnptia.embrapa.br/alice/handle/doc/1153634 https://doi.org/10.1016/j.heliyon.2023.e15728 |
Summary: | Organizations increasingly gather data to support strategic decision-making. These data are available in operational sources, which are distributed, heterogeneous, and autonomous, and are gathered through ETL processes that traditionally run at predefined times: once a day, once a week, once a month, or at some other fixed interval. However, some applications, such as health systems and digital agriculture, need data faster, sometimes immediately after the data are generated in the operational sources. Conventional ETL processes and the available techniques cannot deliver operational data in real time while providing low latency, high availability, and scalability. We present an architecture, named Data Magnet, to support real-time ETL processes. Experimental tests performed in the digital agriculture domain using real and synthetic data showed that the proposal handled the ETL process in real time. Data Magnet showed an almost constant elapsed time for growing data volumes and provided significant performance gains over the traditional trigger technique. |
id |
EMBR_a5c4cc0fea60fa2ac153f30d6b57fd4e |
---|---|
oai_identifier_str |
oai:www.alice.cnptia.embrapa.br:doc/1153634 |
network_acronym_str |
EMBR |
network_name_str |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
repository_id_str |
2154 |
dc.title.none.fl_str_mv |
A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments. |
author |
VILELA, F. DE A. |
author_facet |
VILELA, F. DE A.; TIMES, V. C.; BERNARDI, A. C. de C.; FREITAS, A. DE P.; CIFERRI, R. R. |
author_role |
author |
author2 |
TIMES, V. C.; BERNARDI, A. C. de C.; FREITAS, A. DE P.; CIFERRI, R. R. |
author2_role |
author author author author |
dc.contributor.none.fl_str_mv |
FLÁVIO DE ASSIS VILELA, Federal Institute of Goiás; VALÉRIA CESÁRIO TIMES, Federal University of Pernambuco; ALBERTO CARLOS DE CAMPOS BERNARDI, CPPSE; AUGUSTO DE PAULA FREITAS, Federal University of São Carlos; RICARDO RODRIGUES CIFERRI, Federal University of São Carlos. |
dc.contributor.author.fl_str_mv |
VILELA, F. DE A.; TIMES, V. C.; BERNARDI, A. C. de C.; FREITAS, A. DE P.; CIFERRI, R. R. |
dc.subject.por.fl_str_mv |
Data warehouse; Real time; ETL; Data extraction; Data loading |
topic |
Data warehouse; Real time; ETL; Data extraction; Data loading |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-05-10T13:47:24Z 2023-05-10T13:47:24Z 2023-05-10 2023 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
Heliyon, v. 9, n. 5, e15728, May 2023. http://www.alice.cnptia.embrapa.br/alice/handle/doc/1153634 https://doi.org/10.1016/j.heliyon.2023.e15728 |
identifier_str_mv |
Heliyon, v. 9, n. 5, e15728, May 2023. |
url |
http://www.alice.cnptia.embrapa.br/alice/handle/doc/1153634 https://doi.org/10.1016/j.heliyon.2023.e15728 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
25 p. |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) instname:Empresa Brasileira de Pesquisa Agropecuária (Embrapa) instacron:EMBRAPA |
instname_str |
Empresa Brasileira de Pesquisa Agropecuária (Embrapa) |
instacron_str |
EMBRAPA |
institution |
EMBRAPA |
reponame_str |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
collection |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
repository.name.fl_str_mv |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) - Empresa Brasileira de Pesquisa Agropecuária (Embrapa) |
repository.mail.fl_str_mv |
cg-riaa@embrapa.br |