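Go-Ahead: melhorando heurísticas prior-knowledge através de informações extraídas das simulações play-out [Go-Ahead: improving prior-knowledge heuristics with information extracted from play-out simulations]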

Bibliographic details
Year of defense: 2015
Main author: Santos, Gabriel Machado
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Uberlândia
Brazil
Programa de Pós-graduação em Ciência da Computação
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Go
Access link: https://repositorio.ufu.br/handle/123456789/38758
http://doi.org/10.14393/ufu.di.2023.7078
Abstract: Despite being a very ancient game, probably originating in China around 2000 BCE, the game of Go remains one of the greatest challenges in the field of Artificial Intelligence. This thesis describes the agent Go-Ahead: an automatic Go player that uses a new technique to improve the accuracy of the pre-estimated values of the candidate moves to be introduced into the classical Monte Carlo Tree Search (MCTS) algorithm used by many current top Go agents. Go-Ahead is built upon the framework of one of these agents: the well-known open-source automatic player Fuego, in which these pre-estimated values are obtained by means of a heuristic called prior-knowledge. Go-Ahead refines the calculation of these values through a new technique that performs a balanced combination of the prior-knowledge heuristic and relevant information retrieved from the numerous play-out simulation phases that are repeatedly executed throughout the Monte Carlo search. With this strategy, Go-Ahead provides two distinct contributions. First, it enables the agent to enhance the process of choosing appropriate moves. Second, the balance between the prior-knowledge and the play-out information, which is controlled by an adjustable parameter, represents an interesting alternative for attenuating the supervised character of the node evaluation calculations in MCTS-based agents, since it allows the impact of the prior-knowledge heuristic to be reduced by strengthening the impact of the play-out information. The results obtained in tournaments against Fuego confirm the benefits and contributions provided by this approach.
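The abstract does not state the formula Go-Ahead uses, so the following is only a minimal sketch of what a "balanced combination" governed by an adjustable parameter could look like. It assumes a plain linear interpolation between a prior-knowledge estimate and a win rate gathered from play-outs. The names `MoveStats`, `playout_value`, `blended_value`, and `weight` are hypothetical and are not taken from the thesis or from Fuego.

```python
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Hypothetical per-move record; field names are illustrative only."""
    prior_value: float       # estimate from a prior-knowledge heuristic, in [0, 1]
    playout_wins: int = 0    # play-outs containing this move that ended in a win
    playout_visits: int = 0  # play-outs containing this move


def playout_value(stats: MoveStats, default: float = 0.5) -> float:
    """Win rate observed in play-outs, or a neutral default if none were seen."""
    if stats.playout_visits == 0:
        return default
    return stats.playout_wins / stats.playout_visits


def blended_value(stats: MoveStats, weight: float) -> float:
    """Linear blend (an assumption, not the thesis formula).

    weight = 0.0 uses only the prior-knowledge estimate;
    weight = 1.0 uses only the play-out information.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return (1.0 - weight) * stats.prior_value + weight * playout_value(stats)


if __name__ == "__main__":
    stats = MoveStats(prior_value=0.8, playout_wins=3, playout_visits=10)
    for w in (0.0, 0.5, 1.0):
        print(w, blended_value(stats, w))
```

Under this assumption, raising `weight` shifts a move's initial value away from the hand-crafted heuristic and toward what the simulations observed, which is the trade-off the abstract attributes to the adjustable parameter.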