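BTT-Go: um agente jogador de Go com busca Monte-Carlo aprimorada com tabela de transposição e modelo Bradley-Terry [BTT-Go: a Go-playing agent with Monte-Carlo search enhanced with a transposition table and the Bradley-Terry model]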

Bibliographic details
Year of defense: 2014
Main author: Vieira Júnior, Eldane
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: Portuguese (por)
Defending institution: Universidade Federal de Uberlândia
BR
Graduate Program in Computer Science
Exact and Earth Sciences
UFU
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Go
Access link: https://repositorio.ufu.br/handle/123456789/12556
https://doi.org/10.14393/ufu.di.2014.165
Abstract: The game of Go is currently one of the greatest challenges in Artificial Intelligence, since it has a set of characteristics that prevents the successful application of techniques that have worked very well in other games. Among these characteristics is its high level of complexity, which rules out techniques that require exhaustive exploration of the search state space. This thesis describes the development of a Go-playing agent named BTT-Go. The agent was built from another one, Fuego, which uses one of the few techniques that have improved automatic Go players: the Monte-Carlo Tree Search algorithm. Fuego uses a form of supervised learning, since its search method is based exclusively on Monte-Carlo simulations, heuristic board evaluations and a database of opening moves (opening book). The objective of this thesis is therefore to produce an agent that remains competitive while using much less supervision than Fuego. To achieve this objective, BTT-Go was developed in three versions. In the first, the agent uses a Transposition Table, which is a repository of previously processed data. This reduces the supervision involved in the simulations, and in some situations the agent uses data from the table instead of Fuego's prior-knowledge evaluation. The second version of BTT-Go applies, in the final stage of the Monte-Carlo search algorithm, a Bayesian technique inspired by the Bradley-Terry model. This technique predicts the best move through a board evaluation that considers features describing how good a move is. At this stage, Fuego instead uses policies to indicate which move should be played.
The third version of BTT-Go combines the first and second versions so that the two techniques work together without any loss. Once the three versions were complete, experiments were performed on different board sizes (9x9, 13x13 and 19x19). These experiments showed that the Transposition Table reduced the agent's supervision. Although its winning rate dropped slightly on large boards (13x13 and 19x19) compared with Fuego, BTT-Go remains a competitive player. The experiments also showed that the technique inspired by the Bradley-Terry model increased the agent's competitiveness on large boards (13x13 and 19x19), and in some situations it outperformed Fuego. In summary, the development of BTT-Go reduced supervision through the Transposition Table and the Bayesian technique inspired by the Bradley-Terry model, and also increased the accuracy of move generation during the search process.
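
The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the code of BTT-Go or Fuego. The feature names, the gamma values, the `TranspositionTable` class and its methods are all hypothetical, chosen only for illustration. The move-selection part assumes the generalized Bradley-Terry formulation, in which a move's strength is the product of the strengths of its features, and the table assumes Zobrist hashing, a common way to recognise the same position reached through different move orders.

```python
import random
from math import prod

# Hypothetical feature strengths (gammas); values are illustrative only.
GAMMAS = {"capture": 30.0, "atari": 5.0, "near_last_move": 2.0, "edge": 0.5}


def move_probabilities(candidates):
    """Generalized Bradley-Terry: P(move) = strength / sum of all strengths,
    where a move's strength is the product of the gammas of its features."""
    strengths = {m: prod(GAMMAS[f] for f in feats) for m, feats in candidates.items()}
    total = sum(strengths.values())
    return {m: s / total for m, s in strengths.items()}


def best_move(candidates):
    """Return the candidate with the highest Bradley-Terry probability."""
    probs = move_probabilities(candidates)
    return max(probs, key=probs.get)


class TranspositionTable:
    """Maps a Zobrist hash of a position to stored search statistics."""

    def __init__(self, board_size, seed=0):
        rng = random.Random(seed)
        # One random 64-bit key per (point, colour); colour 0 = black, 1 = white.
        self.keys = [[rng.getrandbits(64) for _ in range(2)]
                     for _ in range(board_size * board_size)]
        self.entries = {}

    def position_hash(self, stones):
        """XOR the keys of all stones; the result ignores move order.
        Simplification: side to move and ko state are not hashed."""
        h = 0
        for point, colour in stones:
            h ^= self.keys[point][colour]
        return h

    def store(self, stones, visits, wins):
        self.entries[self.position_hash(stones)] = (visits, wins)

    def lookup(self, stones):
        """Return (visits, wins) for a known position, else None."""
        return self.entries.get(self.position_hash(stones))


candidates = {"D4": ["capture"], "E5": ["atari", "near_last_move"], "A1": ["edge"]}
probs = move_probabilities(candidates)
choice = best_move(candidates)

table = TranspositionTable(board_size=9)
table.store([(10, 0), (20, 1)], visits=100, wins=55)
# The same stones listed in a different order hit the same entry.
cached = table.lookup([(20, 1), (10, 0)])
```

In this sketch, a search could consult `lookup` before evaluating a position and reuse the stored statistics when an entry exists. How BTT-Go actually decides when to prefer table data over Fuego's prior knowledge is described in the dissertation, not here.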