Challenging SQL-on-Hadoop performance with Apache Druid
| Main Author: | Correia, José |
|---|---|
| Publication Date: | 2019 |
| Other Authors: | Costa, Carlos A. P.; Santos, Maribel Yasmina |
| Language: | eng |
| Source Title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Full Text: | http://hdl.handle.net/1822/66785 |
| Abstract: | In Big Data, SQL-on-Hadoop tools usually provide satisfactory performance for processing vast amounts of data, although new emerging tools may be an alternative. This paper evaluates whether Apache Druid, an innovative column-oriented data store suited for online analytical processing workloads, is an alternative to some of the well-known SQL-on-Hadoop technologies, and assesses its potential in this role. In this evaluation, Druid, Hive and Presto are benchmarked with increasing data volumes. The results point to Druid as a strong alternative, achieving better performance than Hive and Presto, and show the potential of integrating Hive and Druid, enhancing the capabilities of both tools. |
| id |
RCAP_8e1e4e0c963026a6ea2bf3bccfedb941 |
|---|---|
| oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/66785 |
| network_acronym_str |
RCAP |
| network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str |
https://opendoar.ac.uk/repository/7160 |
| spelling |
Challenging SQL-on-Hadoop performance with Apache Druid; Big Data; Big Data Warehouse; SQL-on-Hadoop; Druid; OLAP; Science & Technology; In Big Data, SQL-on-Hadoop tools usually provide satisfactory performance for processing vast amounts of data, although new emerging tools may be an alternative. This paper evaluates if Apache Druid, an innovative column-oriented data store suited for online analytical processing workloads, is an alternative to some of the well-known SQL-on-Hadoop technologies and its potential in this role. In this evaluation, Druid, Hive and Presto are benchmarked with increasing data volumes. The results point Druid as a strong alternative, achieving better performance than Hive and Presto, and show the potential of integrating Hive and Druid, enhancing the potentialities of both tools. This work is supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT - Fundacao para a Ciencia e Tecnologia within Project UID/CEC/00319/2013 and by European Structural and Investment Funds in the FEDER component, COMPETE 2020 (Funding Reference: POCI-01-0247-FEDER-002814).; Springer Verlag; Universidade do Minho; Correia, José; Costa, Carlos A. P.; Santos, Maribel Yasmina; 2019; 2019-01-01T00:00:00Z; conference paper; info:eu-repo/semantics/publishedVersion; application/pdf; http://hdl.handle.net/1822/66785; eng; 9783030204846; 1865-1348; 10.1007/978-3-030-20485-3_12; https://link.springer.com/chapter/10.1007%2F978-3-030-20485-3_12; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2024-05-11T05:02:55Z; oai:repositorium.sdum.uminho.pt:1822/66785; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T15:06:27.612239; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
| dc.title.none.fl_str_mv |
Challenging SQL-on-Hadoop performance with Apache Druid |
| title |
Challenging SQL-on-Hadoop performance with Apache Druid |
| spellingShingle |
Challenging SQL-on-Hadoop performance with Apache Druid; Correia, José; Big Data; Big Data Warehouse; SQL-on-Hadoop; Druid; OLAP; Science & Technology |
| title_short |
Challenging SQL-on-Hadoop performance with Apache Druid |
| title_full |
Challenging SQL-on-Hadoop performance with Apache Druid |
| title_fullStr |
Challenging SQL-on-Hadoop performance with Apache Druid |
| title_full_unstemmed |
Challenging SQL-on-Hadoop performance with Apache Druid |
| title_sort |
Challenging SQL-on-Hadoop performance with Apache Druid |
| author |
Correia, José |
| author_facet |
Correia, José; Costa, Carlos A. P.; Santos, Maribel Yasmina |
| author_role |
author |
| author2 |
Costa, Carlos A. P.; Santos, Maribel Yasmina |
| author2_role |
author; author |
| dc.contributor.none.fl_str_mv |
Universidade do Minho |
| dc.contributor.author.fl_str_mv |
Correia, José; Costa, Carlos A. P.; Santos, Maribel Yasmina |
| dc.subject.por.fl_str_mv |
Big Data; Big Data Warehouse; SQL-on-Hadoop; Druid; OLAP; Science & Technology |
| topic |
Big Data; Big Data Warehouse; SQL-on-Hadoop; Druid; OLAP; Science & Technology |
| description |
In Big Data, SQL-on-Hadoop tools usually provide satisfactory performance for processing vast amounts of data, although new emerging tools may be an alternative. This paper evaluates whether Apache Druid, an innovative column-oriented data store suited for online analytical processing workloads, is an alternative to some of the well-known SQL-on-Hadoop technologies, and assesses its potential in this role. In this evaluation, Druid, Hive and Presto are benchmarked with increasing data volumes. The results point to Druid as a strong alternative, achieving better performance than Hive and Presto, and show the potential of integrating Hive and Druid, enhancing the capabilities of both tools. |
| publishDate |
2019 |
| dc.date.none.fl_str_mv |
2019; 2019-01-01T00:00:00Z |
| dc.type.driver.fl_str_mv |
conference paper |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1822/66785 |
| url |
http://hdl.handle.net/1822/66785 |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
9783030204846; 1865-1348; 10.1007/978-3-030-20485-3_12; https://link.springer.com/chapter/10.1007%2F978-3-030-20485-3_12 |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Springer Verlag |
| publisher.none.fl_str_mv |
Springer Verlag |
| dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
| instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str |
RCAAP |
| institution |
RCAAP |
| reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv |
info@rcaap.pt |
| _version_ |
1833595104363806720 |