Sistema de Aprendizagem de Máquina Distribuído utilizando o VCube (Distributed Machine Learning System Using VCube)
Year of defense: | 2024 |
---|---|
Lead author: | |
Advisor: | |
Examination committee: | |
Document type: | Dissertation |
Access type: | Open access |
Language: | Portuguese |
Defending institution: | Universidade Estadual do Oeste do Paraná, Cascavel |
Graduate program: | Programa de Pós-Graduação em Ciência da Computação |
Department: | Centro de Ciências Exatas e Tecnológicas |
Country: | Brazil |
Keywords in Portuguese: | |
Keywords in English: | |
CNPq knowledge area: | |
Access link: | https://tede.unioeste.br/handle/tede/7354 |
Abstract: | As the volume of generated data grows, data is no longer stored in a single location, which leads to distributed data scenarios. Performing classification, i.e., predicting the category of new entries from training data, would then require consolidating the data at a central point in the network for training. However, in some situations it is not practical to move the data across the network, because links are congested or the data would be exposed to attacks. To overcome these difficulties, this work proposes a distributed classification method that combines a peer-to-peer strategy with VCube. VCube is a distributed diagnosis algorithm that organizes the network nodes in a virtual hypercube topology, enabling efficient detection of node failures. In the proposed solution, models are trained locally and then shared, so the underlying data never needs to be transmitted or exposed. The experiments used a network of eight nodes, each of which trained locally with a multilayer perceptron. Different data distribution scenarios were tested, varying the number of instances and the distribution of classes across nodes. Cases in which one of the network nodes was unavailable were also simulated. The results show that local training is faster than training centralized on a single node. Accuracy was higher when each node received models trained on other nodes; that is, the distributed system was more accurate than the individual solution. The results support the applicability of VCube as a topology for sharing trained models. When one of the nodes was unavailable, the strategy still allowed the distributed learning system to function properly and to outperform the models generated on each individual node. |
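
The virtual hypercube mentioned in the abstract can be illustrated with a short Python sketch. It assumes the recursive cluster function commonly given in the VCube literature, c(i, s) = (i XOR 2^(s-1), c(i XOR 2^(s-1), 1), ..., c(i XOR 2^(s-1), s-1)); the abstract itself does not spell this out. The function names `cluster` and `neighbors` are hypothetical, and picking the first available node of each cluster as a peer is only one plausible way to choose where a node sends its trained model, not necessarily the scheme used in the dissertation.

```python
def cluster(i: int, s: int) -> list[int]:
    """Ordered members of cluster s of node i, for s = 1..log2(n).

    Assumption: this follows the recursive c(i, s) definition usually
    given for VCube; it is not taken from the abstract.
    """
    head = i ^ (1 << (s - 1))          # flip bit s-1 of the node id
    members = [head]
    for k in range(1, s):              # append the head's smaller clusters
        members.extend(cluster(head, k))
    return members


def neighbors(i: int, dim: int, alive: set[int]) -> list[int]:
    """First available node in each of the dim clusters of node i.

    Hypothetical helper: a peer list a node could use to share its model.
    """
    peers = []
    for s in range(1, dim + 1):
        for j in cluster(i, s):
            if j in alive:
                peers.append(j)
                break                  # only the first available node per cluster
    return peers


if __name__ == "__main__":
    all_nodes = set(range(8))          # eight nodes, as in the experiments
    print(cluster(0, 3))
    print(neighbors(0, 3, all_nodes))
    print(neighbors(0, 3, all_nodes - {4}))   # node 4 unavailable
```

With eight nodes there are three clusters per node. When node 4 is marked unavailable, node 0 falls back to the next member of the same cluster, which is the kind of behavior that lets model sharing continue despite a failed node.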