Distributed Machine Learning System Using VCube (Sistema de Aprendizagem de Máquina Distribuído utilizando o VCube)

Bibliographic details
Year of defense: 2024
Main author: Salles, Charles Giovane de
Advisor: Brun, André Luiz
Examination committee: Rodrigues, Luiz Antonio; Silva, Ronan Assumpção
Document type: Master's thesis
Access type: Open access
Language: Portuguese (por)
Defending institution: Universidade Estadual do Oeste do Paraná
Cascavel
Graduate program: Programa de Pós-Graduação em Ciência da Computação
Department: Centro de Ciências Exatas e Tecnológicas
Country: Brazil
Access link: https://tede.unioeste.br/handle/tede/7354
Abstract: As the volume of generated data grows, it is no longer stored in a single location, resulting in distributed data scenarios. Performing classification, i.e. predicting the category of new entries from training data, would then normally require consolidating the information at a central point in the network before learning. In some situations, however, moving the data across the network is impractical, because the connections are congested or because the information would be exposed to attacks. To overcome these difficulties, this work proposes a distributed classification method that combines a peer-to-peer strategy with VCube. VCube is a distributed diagnosis algorithm that organizes the network nodes in a virtual hypercube topology, enabling efficient detection of node failures. In the proposed solution, models are trained locally and then shared, so that no raw data needs to be transmitted or exposed. The experiments used a network of eight nodes, each of which trained locally using the multilayer perceptron algorithm. Different data distribution scenarios were tested, varying the number of instances and the class distribution across nodes. Cases in which one of the network nodes was unavailable were also simulated. The results show that local training is faster than training centralized on a single node. Accuracy was higher when each node received models trained on other nodes, i.e. the distributed system achieved higher accuracy than the individual solution. The results support the applicability of VCube as a topology for sharing trained models. When one of the nodes was unavailable, the strategy still allowed the distributed learning system to function properly and to outperform the models generated on each individual node.
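
Illustrative sketch (not taken from the thesis): the abstract describes eight nodes arranged in a virtual hypercube that share locally trained models and tolerate one unavailable node. The Python below shows one way this could look, assuming the cluster definition commonly given in the VCube literature, c(i, s) = (i XOR 2^(s-1), c(i XOR 2^(s-1), 1), ..., c(i XOR 2^(s-1), s-1)). The function names `cluster`, `first_correct_neighbors`, and `majority_vote` are hypothetical, and combining models by majority vote is an assumption, since the abstract does not state how received models are combined or exactly which nodes exchange them.

```python
from collections import Counter


def cluster(i: int, s: int) -> list[int]:
    """Ordered members of cluster s of node i (assumed VCube definition)."""
    j = i ^ (1 << (s - 1))          # flip bit s-1 to reach the opposite sub-cube
    members = [j]
    for k in range(1, s):           # then append j's own smaller clusters
        members.extend(cluster(j, k))
    return members


def first_correct_neighbors(i: int, dims: int, faulty: set[int]) -> list[int]:
    """For each cluster of node i, pick the first node not known to be faulty."""
    picks = []
    for s in range(1, dims + 1):
        for j in cluster(i, s):
            if j not in faulty:
                picks.append(j)
                break               # one representative per cluster
    return picks


def majority_vote(predictions: list[str]) -> str:
    """Combine labels predicted by several models (assumed combination rule).

    Ties are resolved in favor of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # 8 nodes -> 3-dimensional virtual hypercube.
    print(cluster(0, 3))                          # [4, 5, 6, 7]
    # Node 4 is unavailable, so node 0 falls back to node 5 in cluster 3.
    print(first_correct_neighbors(0, 3, {4}))     # [1, 2, 5]
    # Labels predicted for one instance by the local model and two received models.
    print(majority_vote(["cat", "dog", "dog"]))   # dog
```

In this sketch a failed node only changes which member of a cluster is contacted, which matches the abstract's observation that the system keeps working when one node is unavailable.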