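A new indexing method for similarity queries using one-dimensional mappings based on global foci (original title: Um novo método de indexação para consultas por similaridade utilizando mapeamentos unidimensionais baseados em focos globais)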

Lima, Rafael Lucas Bernardes

Bibliographic details
Year of defense:	2016
Main author:	Lima, Rafael Lucas Bernardes
Advisor:	Not provided by the institution
Examination committee:	Not provided by the institution
Document type:	Dissertation
Access type:	Open access
Language:	Portuguese (por)
Institution of defense:	Universidade Federal de Uberlândia, Brasil, Programa de Pós-graduação em Ciência da Computação
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords:	B+-Tree; Access methods; GroupSim; Computing; Data retrieval (Computing); Information retrieval; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
Access link:	https://repositorio.ufu.br/handle/123456789/24057 http://dx.doi.org/10.14393/ufu.di.2019.314
Abstract:	Similarity queries are the most appropriate way to retrieve complex data, and access methods are used to speed up query responses. When a set of objects is defined together with a distance function (metric), the objects are said to belong to a metric space, which allows Metric Access Methods (MAMs) to be built. MAMs are generally represented by hierarchical structures. There are several variations of metric trees, and an interesting structure to work with is the B+-Tree: a useful feature of this structure is that its leaf nodes are stored in a doubly linked list, which facilitates navigation between nodes. The GroupSim method presents an approach based on mapping, indexing, and retrieval of objects. First, objects are mapped to one-dimensional spaces based on previously chosen representative objects; the mapping generates one-dimensional vectors, which are indexed in a single B+-Tree structure so that later queries can be answered more efficiently. The experiments carried out indicate that the proposed method outperforms other methods found in the literature. The method was assessed with kNN queries, with k varying between 10 and 100, over different data sets. Compared with iDistance, using the Euclidean distance function and the Sierpinski data set, the proposed method achieves on average 3.400% better performance. Compared with OmniB-Forest, the best result is obtained on the Covertype data set with the Euclidean distance function, where the proposed method's average query performance is 1000% better; compared with sequential access, the improvement also reaches 1000%, on the Sierpinski data set with the Euclidean distance function.
Based on the experimental results, the proposed method outperforms some methods from the literature, such as iDistance and OmniB-Forest.
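The mapping-and-indexing idea summarized in the abstract can be illustrated with a minimal sketch. This is not the dissertation's GroupSim implementation, whose details are not given in this record. It only assumes a generic focus-based scheme: each object is mapped to its distance from each chosen focus, all resulting one-dimensional keys are kept in a single ordered structure, and the triangle inequality is used to discard objects before computing real distances. A sorted Python list stands in for the B+-Tree, and the names `FocusIndex`, `range_query`, `knn`, and the key layout are hypothetical.

```python
import bisect
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class FocusIndex:
    """Objects mapped to one-dimensional keys, one per focus, in one sorted list."""

    def __init__(self, objects, foci, dist=euclidean):
        self.objects = list(objects)
        self.foci = list(foci)
        self.dist = dist
        # Assumed key layout: (focus number, distance to that focus, object id).
        # The sorted list plays the role of the single B+-Tree.
        self.keys = sorted(
            (j, dist(obj, focus), i)
            for i, obj in enumerate(self.objects)
            for j, focus in enumerate(self.foci)
        )

    def range_query(self, q, radius):
        """Return sorted (distance, object id) pairs with distance <= radius."""
        candidates = set(range(len(self.objects)))
        for j, focus in enumerate(self.foci):
            dq = self.dist(q, focus)
            # Triangle inequality: any hit o satisfies |d(o, f) - d(q, f)| <= radius,
            # so only keys inside this interval need to be scanned.
            lo = bisect.bisect_left(self.keys, (j, dq - radius, -1))
            hi = bisect.bisect_right(self.keys, (j, dq + radius, len(self.objects)))
            candidates &= {i for _, _, i in self.keys[lo:hi]}
        # Verify the surviving candidates with the real distance function.
        hits = [(self.dist(q, self.objects[i]), i) for i in candidates]
        return sorted(h for h in hits if h[0] <= radius)

    def knn(self, q, k, radius=1.0):
        """k nearest neighbours via range queries with a doubling radius (radius > 0)."""
        k = min(k, len(self.objects))
        hits = self.range_query(q, radius)
        while len(hits) < k:
            radius *= 2
            hits = self.range_query(q, radius)
        return hits[:k]


points = [(0, 0), (1, 0), (0, 2), (5, 5), (6, 5), (10, 10)]
index = FocusIndex(points, foci=[(0, 0), (10, 10)])
print([i for _, i in index.knn((0.2, 0.1), k=3)])  # ids of the 3 nearest points
```

The doubling radius is a simplification chosen for brevity: once a range query returns at least k objects, the k nearest neighbours are guaranteed to be among them. A real B+-Tree would instead walk the linked leaf nodes outward from the query key.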
