Acelerando a construção de tabelas hash para dados textuais com aplicações (Accelerating the construction of hash tables for textual data, with applications)

Barros, Chayner Cordeiro

Bibliographic details
Year of defense:	2020
Main author:	Barros, Chayner Cordeiro
Advisor:	Martins, Wellington Santos
Examination committee:	Martins, Wellington Santos; Rosa, Thierson Couto; Sousa, Daniel Xavier de
Document type:	Dissertation
Access type:	Open access
Language:	Portuguese
Defending institution:	Universidade Federal de Goiás
Graduate program:	Programa de Pós-graduação em Ciência da Computação (INF)
Department:	Instituto de Informática - INF (RG)
Country:	Brazil
Keywords in Portuguese:	Computação de alto desempenho; Aprendizado de máquina; Mineração de texto; Matriz de coocorrência; Tabelas hash; Embeddings; CUDA; HPC
Keywords in English:	Machine learning; High performance computing; Text mining; Co-occurrence matrix; Hash tables; Embeddings
CNPq knowledge area:	Exact and Earth Sciences::Computer Science::Theory of Computation
Access link:	http://repositorio.bc.ufg.br/tede/handle/tede/11006
Abstract:	Text mining extracts information from textual data in a wide variety of formats, with goals such as knowledge production, classification, clustering, and translation. For text mining to be efficient, the data is first processed so that it contains only content relevant to the intended analysis and is structured in a format that is easier to manipulate computationally. Several preprocessing tasks must be performed on the data to reach the desired quality and representation. To speed up these preprocessing tasks, this work proposes a hash table implementation that efficiently exploits the high parallelism available in GPUs. Beyond presenting more efficient algorithms, it also demonstrates the feasibility of using them in applications such as generating the co-occurrence matrix and representing text with embeddings.
