Meta-level reasoning in reinforcement learning

Bibliographic details
Year of defense: 2014
Main author: Maissiat, Jiéverson
Advisor: Meneguzzi, Felipe Rech
Defense committee: Not informed by the institution
Document type: Master's thesis (Dissertação)
Access type: Open access
Language: Portuguese (por)
Institution: Pontifícia Universidade Católica do Rio Grande do Sul
Graduate program: Programa de Pós-Graduação em Ciência da Computação
Department: Faculdade de Informática
Country: Brazil (BR)
Keywords in Portuguese:
CNPq knowledge area:
Access link: http://tede2.pucrs.br/tede2/handle/tede/5253
Abstract: Reinforcement learning (RL) is a technique for computing an optimal policy in stochastic settings, where actions from an initial policy are simulated (or directly executed) and the value of a state is updated based on the immediate rewards obtained as the policy is executed. Existing efforts model opponents in competitive games as elements of a stochastic environment and use RL to learn policies against such opponents. In this setting, the rate of change of the state values decreases monotonically over time as learning converges. Although this modeling assumes that the opponent's strategy is static over time, such an assumption is too strong for human opponents. Consequently, in this work we develop a meta-level RL mechanism that detects when an opponent changes strategy and allows the state values to deconverge, so that the agent can learn how to play against the new strategy. We validate this approach empirically for high-level strategy selection in the StarCraft: Brood War game.
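The abstract's core idea can be illustrated with a minimal sketch, which is not the dissertation's actual algorithm: a toy two-action game in which the "opponent" flips its exploitable weakness at the midpoint. An object-level learner decays its learning rate as usual, while a meta-level monitor watches a sliding window of rewards and, when the average collapses relative to the best average seen so far, resets the learning rate so the value estimates can deconverge and re-converge. All function names, thresholds, and reward values below are illustrative assumptions.

```python
import random

def meta_level_bandit(steps=4000, switch=2000, seed=1):
    """Toy sketch of meta-level RL: reset the learning rate when a
    sudden drop in average reward suggests the opponent changed strategy."""
    rng = random.Random(seed)
    q = [0.0, 0.0]                            # value estimate per action
    alpha, alpha_min, alpha_max, decay = 1.0, 0.01, 1.0, 0.99
    eps = 0.1                                 # epsilon-greedy exploration
    window, drop = 100, 0.3                   # meta-level monitor settings
    recent = []                               # sliding window of rewards
    baseline = None                           # best window average so far
    for t in range(steps):
        # Epsilon-greedy action selection over current value estimates.
        if rng.random() < eps:
            a = rng.randrange(2)
        else:
            a = max(range(2), key=lambda i: q[i])
        # The "opponent" is beaten by action 0, then by action 1 after the switch.
        best = 0 if t < switch else 1
        r = 1.0 if a == best else 0.0
        # Object-level update: standard TD rule with a decaying learning rate.
        q[a] += alpha * (r - q[a])
        alpha = max(alpha_min, alpha * decay)
        # Meta-level monitor: compare recent average reward to the baseline.
        recent.append(r)
        if len(recent) > window:
            recent.pop(0)
            avg = sum(recent) / window
            baseline = avg if baseline is None else max(baseline, avg)
            if baseline - avg > drop:
                # Performance collapsed: let the values deconverge.
                alpha, baseline, recent = alpha_max, None, []
    return q
```

Under this sketch, the learning rate has nearly decayed to its floor by the time the opponent switches, so without the meta-level reset the agent would adapt very slowly; after the reset, the value of the newly effective action is relearned quickly.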