Bibliographic details
Year of defense: 2024
Main author: Vinces, Braulio Valentin Sánchez
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digitais de Teses e Dissertações da USP
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/55/55134/tde-11022025-113850/
Abstract:
This Ph.D. work addresses the critical challenge of outlier detection in large and complex data sets. We focus on developing efficient and scalable methods to accurately identify anomalies in various data types and scenarios. The first part of the dissertation explores the use of similarity join operations for distance-based outlier detection. We propose two novel methods: MCCATCH, which effectively identifies microclusters in dimensional and nondimensional data sets, and GOOST, which efficiently detects outliers in massive data streams. Both methods leverage similarity joins to achieve superior accuracy, efficiency, and scalability performance. The second part of the dissertation rigorously investigates the effectiveness of clustering-based outlier detection approaches. Through a meticulous and comprehensive comparative evaluation, we demonstrate that clustering-based methods can be competitive with state-of-the-art non-clustering-based algorithms, offering advantages in terms of robustness and scalability. Our research significantly contributes to the field of outlier detection by providing novel methodologies and insights into the effectiveness of different approaches. The methods we propose have profound practical implications for a wide range of applications, including fraud detection, network intrusion detection, and medical diagnosis, making our work highly relevant and applicable.