SAVIME: enabling declarative array processing in memory

Lustosa, Hermano Lourenço Souza

SAVIME: enabling declarative array processing in memory

Detalhes bibliográficos
Ano de defesa:	2020
Autor(a) principal:	Lustosa, Hermano Lourenço Souza
Orientador(a):	Não Informado pela instituição
Banca de defesa:	Não Informado pela instituição
Tipo de documento:	Tese
Tipo de acesso:	Acesso aberto
Idioma:	eng
Instituição de defesa:	Laboratório Nacional de Computação Científica Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA) Brasil LNCC Programa de Pós-Graduação em Modelagem Computacional
Programa de Pós-Graduação:	Não Informado pela instituição
Departamento:	Não Informado pela instituição
País:	Não Informado pela instituição
Palavras-chave em Português:	Sistema de gerência de bancos de dados Modelo de dados baseado em matrizes multidimensionais Análise de dados Visualização de dados CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
Link de acesso:	https://tede.lncc.br/handle/tede/328
Resumo:	Current limitations in array database management systems prevent their adoption in scientific applications, even though arrays are vastly present in scientific datasets. Most traditional DBMSs impose a huge performance penalty during data ingestion, since they require data to be converted to the DBMS’s internal format. In addition, when the data is kept in the DBMS’s format, it gets concealed from users, unless they know in details how the DBMS stores it. This is fine for many applications that access data only through the DBMS’s query language, but is inconvenient for domain specific applications whose complex analytical code is unlikely to be performed efficiently by such languages, requiring a more involved approach in which files are accessed directly. As a consequence, users adopt in-situ analysis libraries, in-transit I/O interfaces and scientific data format files to manage their data. However, these alternatives might not offer the same benefits a DBMS does, such as: richer data model semantics, declarative analytical query languages and isolation between data and applications. Therefore, in this work, we propose a novel array data model named TARS and a database management system, named Savime, which implements such data model. We show how Savime can foster declarative scientific data analysis and visualization without imposing costly data rearrangements and format conversions by using a flexible data model. We also compare Savime with a state-of-the-art array DBMS. The results show that Savime is up to 20 times faster than another array DBMS for data ingestion, while providing a performance similar for the execution of basic array operations. We believe that Savime can substantially empower scientists in developing scientific data analysis in-silico.

SAVIME: enabling declarative array processing in memory

Registros relacionados