Fuzzy approach for classification and novelty detection in data streams

Cristiani, André Luis

Bibliographic details
Year of defense:	2022
Main author:	Cristiani, André Luis
Advisor:	Camargo, Heloisa de Arruda
Defense committee:	Not provided by the institution
Document type:	Master's dissertation
Access type:	Open access
Language:	eng
Defending institution:	Universidade Federal de São Carlos Câmpus São Carlos
Graduate program:	Programa de Pós-Graduação em Ciência da Computação - PPGCC
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords in Portuguese:	Fluxo contínuo de dados; Detecção de novidades; Latência intermediária; Teoria dos conjuntos fuzzy
Keywords in English:	Data streams; Novelty detection; Intermediate latency; Fuzzy set theory
CNPq knowledge area:	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
Access link:	https://repositorio.ufscar.br/handle/20.500.14289/20010
Abstract:	Learning in data streams (DS) is a research area that seeks to extract knowledge, in a short period of time, from large amounts of continuously generated data. Novelty detection (ND) is responsible for identifying the emergence of new concepts and changes in known concepts. The true labels of the instances can be used so that algorithms adapt to concept evolution and concept drift. The time between the classification of an instance and the arrival of its true label is called latency. Most applications assume that the true labels will never be available (extreme latency). Others are more optimistic and assume that the true label will be available shortly after the instance has been classified. A third option is to assume that the true labels will become available after a certain time (intermediate latency), which applies to most real-world scenarios. Concepts from fuzzy set theory make learning adaptable to possible imprecision in the data. However, few approaches both use fuzzy set theory and consider intermediate latency for obtaining the labels. Therefore, this work proposes a method for classification and multiclass ND in DS for intermediate and extreme latency scenarios, based on the ECSMiner and PFuzzND algorithms. The results show that the proposed algorithm achieved good accuracy in classification and in the detection of multiclass novelties, and that it classified outliers that approaches based on crisp clustering were unable to classify. In addition, improvements to the algorithm's initialization parameters are presented, which reduce the complexity of its use while maintaining good results.
