
Abalearn: a risk-sensitive approach to self-play learning in Abalone

Bibliographic Details
Main Author: Campos, Pedro
Publication Date: 2003
Other Authors: Langlois, Thibault
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10400.13/4603
Summary: This paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play without needing expert-labeled training examples, deep searches or exposure to competent play. Our approach is based on a reinforcement learning algorithm that is risk seeking, since defensive players in Abalone tend to never end a game. We show that it is the risk-sensitivity that allows a successful self-play training. We also propose a set of features that seem relevant for achieving a good level of play. We evaluate our approach using a fixed heuristic opponent as a benchmark, pitting our agents against human players online and comparing samples of our agents at different times of training.
id RCAP_8da716a10f0301d18c1a003f2c3592ad
oai_identifier_str oai:digituma.uma.pt:10400.13/4603
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Abalearn: a risk-sensitive approach to self-play learning in Abalone
title Abalearn: a risk-sensitive approach to self-play learning in Abalone
spellingShingle Abalearn: a risk-sensitive approach to self-play learning in Abalone
Campos, Pedro
Abalearn
Self-play learning
Abalone
Faculdade de Ciências Exatas e da Engenharia
title_short Abalearn: a risk-sensitive approach to self-play learning in Abalone
title_full Abalearn: a risk-sensitive approach to self-play learning in Abalone
title_fullStr Abalearn: a risk-sensitive approach to self-play learning in Abalone
title_full_unstemmed Abalearn: a risk-sensitive approach to self-play learning in Abalone
title_sort Abalearn: a risk-sensitive approach to self-play learning in Abalone
author Campos, Pedro
author_facet Campos, Pedro
Langlois, Thibault
author_role author
author2 Langlois, Thibault
author2_role author
dc.contributor.none.fl_str_mv DigitUMa
dc.contributor.author.fl_str_mv Campos, Pedro
Langlois, Thibault
dc.subject.por.fl_str_mv Abalearn
Self-play learning
Abalone
Faculdade de Ciências Exatas e da Engenharia
topic Abalearn
Self-play learning
Abalone
Faculdade de Ciências Exatas e da Engenharia
description This paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play without needing expert-labeled training examples, deep searches or exposure to competent play. Our approach is based on a reinforcement learning algorithm that is risk seeking, since defensive players in Abalone tend to never end a game. We show that it is the risk-sensitivity that allows a successful self-play training. We also propose a set of features that seem relevant for achieving a good level of play. We evaluate our approach using a fixed heuristic opponent as a benchmark, pitting our agents against human players online and comparing samples of our agents at different times of training.
publishDate 2003
dc.date.none.fl_str_mv 2003
2003-01-01T00:00:00Z
2022-09-13T11:05:59Z
dc.type.driver.fl_str_mv conference object
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.13/4603
url http://hdl.handle.net/10400.13/4603
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.1007/978-3-540-39857-8_6
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Springer
publisher.none.fl_str_mv Springer
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833598799118860288