Bibliographic details
Year of defense: |
2024 |
Main author: |
Maciel, Noberto Pires
 |
Advisor: |
Calumby, Rodrigo Tripodi
 |
Defense committee: |
Not provided by the institution |
Document type: |
Master's dissertation
|
Access type: |
Open access |
Language: |
Portuguese |
Defending institution: |
Universidade Estadual de Feira de Santana
|
Graduate program: |
Programa de Pós-Graduação em Ciência da Computação
|
Department: |
DEPARTAMENTO DE CIÊNCIAS EXATAS
|
Country: |
Brazil
|
Keywords in Portuguese: |
|
Keywords in English: |
|
CNPq knowledge area: |
|
Access link: |
http://tede2.uefs.br:8080/handle/tede/1847
|
Abstract: |
Searching for images by content in a data collection, whether through social media mechanisms or free web search tools, is a complex task in which results based on similarity alone often suffer from relevance problems such as unrepresentative items and near-duplicates. Search engines commonly try to cover the implicit subtopics of a query broadly in order to serve the user as completely as possible. For this purpose, content diversification based on data clustering algorithms has been widely used. In this approach, each group that the algorithm identifies in the search results is treated as a subtopic, and representative images are extracted from these groups so that together they bring diversity to the result presented to the user. However, the effectiveness of the approach depends on choosing a good clustering scheme, which is directly tied to the number of groups the algorithm generates, and choosing that number remains a major challenge. This work aims to evaluate the possible efficiency gains in diverse image retrieval obtained by selecting the best clustering schemes generated by clustering algorithms, dynamically searching for the ideal number of groups for each query. In addition, we aim to extend the literature with an experimental evaluation of the DTRS method for estimating cluster quality, and to develop an efficient auxiliary method for determining the stopping criterion of clustering algorithms, thereby reducing the computational cost of the result diversification procedure. To this end, we conducted experiments with the K-Medoids and Hierarchical Agglomerative algorithms, employing different validation methods, varying the number of clusters, and adopting different auxiliary approaches for selecting the best clustering schemes, such as the Elbow method.
The results showed gains in efficiency in retrieving diverse images and a significant reduction in the running time of the content-based image retrieval (CBIR) system used in this work. |
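
The clustering-based diversification described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names (`k_medoids`, `choose_k_elbow`, `diversify`), the farthest-first initialisation, the largest-second-difference reading of the Elbow method, and the round-robin selection of one item per cluster are all assumptions made for illustration, and the DTRS validation method is not modelled. Feature vectors are assumed to be plain tuples of numbers, already ordered by relevance to the query.

```python
import math


def k_medoids(points, k, max_iter=50):
    """Simple alternating K-Medoids. Returns (medoid indices, labels, total cost)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]

    # Assumed initialisation: farthest-first traversal starting from item 0.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(d[i][m] for m in medoids)))

    def assign(meds):
        # Label each point with the position of its nearest medoid.
        return [min(range(k), key=lambda c: d[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(min(members, key=lambda i: sum(d[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids

    labels = assign(medoids)
    cost = sum(d[i][medoids[labels[i]]] for i in range(n))
    return medoids, labels, cost


def choose_k_elbow(points, k_max):
    """Pick k in [2, k_max - 1] where the cost curve bends most (requires k_max >= 3)."""
    costs = [k_medoids(points, k)[2] for k in range(1, k_max + 1)]
    # costs[k - 1] is the cost for k clusters; use the discrete second difference.
    return max(range(2, k_max),
               key=lambda k: costs[k - 2] - 2 * costs[k - 1] + costs[k])


def diversify(ranked_points, k):
    """Re-rank by taking the best remaining item of each cluster in turn."""
    _, labels, _ = k_medoids(ranked_points, k)
    queues = {}
    for idx, lab in enumerate(labels):
        queues.setdefault(lab, []).append(idx)  # preserves the relevance order
    result = []
    while any(queues.values()):
        for q in queues.values():
            if q:
                result.append(q.pop(0))
    return result
```

Calling `choose_k_elbow(features, k_max)` and passing the result to `diversify(features, k)` gives a per-query number of clusters followed by a re-ranked list of indices. Recomputing the clustering for every candidate `k` is the expensive part, which is the cost the abstract's stopping-criterion method is meant to reduce.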