The Similarity-aware Relational Division Database Operator

Detalhes bibliográficos
Ano de defesa: 2017
Autor(a) principal: Gonzaga, André dos Santos
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: eng
Instituição de defesa: Biblioteca Digitais de Teses e Dissertações da USP
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://www.teses.usp.br/teses/disponiveis/55/55134/tde-17112017-135006/
Resumo: In Relational Algebra, the operator Division (÷) is an intuitive tool used to write queries with the concept of for all, and thus, it is constantly required in real applications. However, as we demonstrate in this MSc work, the division does not support many of the needs common to modern applications, particularly those that involve complex data analysis, such as processing images, audio, genetic data, large graphs, fingerprints, and many other non-traditional data types. The main issue is the existence of intrinsic comparisons of attribute values in the operator, which, by definition, are always performed by identity (=), despite the fact that complex data must be compared by similarity. Recent works focus on supporting similarity comparison in relational operators, but no one treats the division. MSc work proposes the new Similarity-aware Division (÷) operator. Our novel operator is naturally well suited to answer queries with an idea of candidate elements and exigencies to be performed on complex data from real applications of high-impact. For example, it is potentially useful to support agriculture, genetic analyses, digital library search, and even to help controlling the quality of manufactured products and identifying new clients in industry. We validate our proposal by studying the first two of these applications.