Dimension reduction in projective clustering

Bibliographic details
Year of defense: 2022
Main author: Lima, Rafael Zuolo Coppini
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Master's thesis (Dissertação)
Access type: Open access
Language: eng
Defense institution: Biblioteca Digital de Teses e Dissertações da USP
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/45/45134/tde-02092022-150552/
Abstract: The high dimensionality of data may be a barrier to algorithmic efficiency (Nelson, 2020), mainly because of the well-known curse of dimensionality, which imposes exponential time and/or memory complexity on algorithms for problems such as nearest neighbour search (Har-Peled, Indyk, and Motwani, 2012). It is natural, then, to search for ways to break the curse by relaxing the problem into approximate versions and by finding good ways to reduce the dimension of the data. Our objective is to write a dissertation about a dimension reduction scheme for clustering under the squared Euclidean (ℓ₂²) metric, with a focus on an approximation scheme for a particular case of this problem, called projective clustering. The dimension reduction is achieved by combining randomized techniques, such as the Johnson–Lindenstrauss Lemma, with deterministic techniques, such as the singular value decomposition. The result is a (1 + ε)-approximation for projective clustering that runs in time polynomial in the number of data points and the dimension of the space. This dissertation has four main references: Sarlós, 2006; Feldman, Schmidt, and Sohler, 2020; Pratap and Sen, 2018; and Deshpande, Rademacher, Vempala, and Wang, 2006. The results presented in the dissertation will be either the originals or modified versions that incorporate recent improvements.
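The randomized half of the scheme described in the abstract, the Johnson–Lindenstrauss Lemma, can be sketched with a Gaussian random projection. The following is an illustrative sketch only, not code from the dissertation; the function name `jl_project` and its parameters are my own. With target dimension k = O(log n / ε²), pairwise ℓ₂² distances are preserved up to a (1 ± ε) factor with high probability.

```python
import numpy as np

def jl_project(points, k, rng=None):
    """Project n points in R^d down to R^k with a scaled Gaussian matrix.

    Each entry of the projection matrix is N(0, 1) scaled by 1/sqrt(k),
    so that E[||Px||^2] = ||x||^2 for every point x.
    """
    rng = np.random.default_rng(rng)
    n, d = points.shape
    proj = rng.standard_normal((d, k)) / np.sqrt(k)
    return points @ proj

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 1000))   # 50 points in R^1000
    Y = jl_project(X, k=300, rng=1)       # projected to R^300
    # one pairwise squared (l2^2) distance before and after projection;
    # the two values agree up to a modest relative error
    d_before = np.sum((X[0] - X[1]) ** 2)
    d_after = np.sum((Y[0] - Y[1]) ** 2)
    print(d_before, d_after)
```

The deterministic half of the scheme (the singular value decomposition) then refines this randomized reduction; the combination is what yields the polynomial-time (1 + ε)-approximation discussed in the abstract.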