D-VisionDraughts: a draughts-playing neural network that learns by reinforcement in a distributed computing environment

Bibliographic details
Year of defense: 2011
Main author: Barcelos, Ayres Roberto Araújo
Advisor: Not provided by the institution
Examining committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Uberlândia
BR
Graduate Program in Computer Science
Exact and Earth Sciences
UFU
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufu.br/handle/123456789/12509
Abstract: The objective of this work is to propose a draughts learning system, D-VisionDraughts (Distributed VisionDraughts): a distributed draughts-playing agent based on neural networks that learns by reinforcement. D-VisionDraughts is trained in a distributed processing environment in order to achieve a high level of play without expert game analysis and with as little human intervention as possible (unlike the world draughts champion Chinook). D-VisionDraughts is a distributed version of the efficient VisionDraughts player, which is an MLP (multilayer perceptron) neural network that learns by means of temporal differences. The role of the neural network is to evaluate how favorable a board state is to the agent (prediction). This value guides the search module in determining the best action (in this case, the best move) from the current board state. Another factor with an important impact on search efficiency, also analyzed in this work, is the degree of ordering of the game tree. Thus, the main contributions of this work are: the replacement of the serial algorithm used in VisionDraughts, minimax with alpha-beta pruning, by the distributed algorithm Young Brothers Wait Concept (YBWC); the use of heuristics for game tree ordering, which is essential for the proper performance of YBWC and of alpha-beta pruning in general; and an analysis of the impact of the high-performance processing environment on the unsupervised learning skills of the player. This work shows that, with the techniques used, the time required to perform a game tree search was significantly reduced and that, in tournaments played against VisionDraughts, the overall performance of the distributed agent improved.
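
To illustrate why the abstract stresses game tree ordering, the following is a minimal sketch of serial minimax with alpha-beta pruning on a toy tree. It is not the dissertation's implementation: the nested-list tree, the function names `alphabeta` and `search`, and the leaf values are all hypothetical, and the leaves are plain numbers standing in for the neural network's board evaluations. The same three moves are searched in two different orders to show how ordering changes the number of leaves evaluated.

```python
import math


def alphabeta(node, alpha, beta, maximizing, stats):
    """Minimax with alpha-beta pruning over a nested-list game tree.

    A node is either a number (leaf evaluation, assumed to come from some
    evaluator) or a list of child nodes. `stats` counts evaluated leaves.
    """
    if not isinstance(node, list):  # leaf: return its static evaluation
        stats["leaves"] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, stats))
            alpha = max(alpha, best)
            if alpha >= beta:  # remaining siblings cannot change the result
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, stats))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best


def search(tree):
    """Search from a maximizing root; return (value, leaves evaluated)."""
    stats = {"leaves": 0}
    value = alphabeta(tree, -math.inf, math.inf, True, stats)
    return value, stats["leaves"]


# Hypothetical toy trees: the same three moves, in different orders.
well_ordered = [[3, 5], [2, 9], [1, 8]]    # best move searched first
poorly_ordered = [[1, 8], [2, 9], [3, 5]]  # best move searched last

print(search(well_ordered))    # (3, 4)
print(search(poorly_ordered))  # (3, 6)
```

Both orderings return the same minimax value, but searching the best move first allows cutoffs in the remaining subtrees, so fewer leaves are evaluated. In a distributed scheme such as YBWC, where sibling subtrees are handed to other processors only after the first sibling has been searched, a good first move matters for the same reason.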