Abalearn: a risk-sensitive approach to self-play learning in Abalone
Main Author: | Campos, Pedro |
---|---|
Publication Date: | 2003 |
Other Authors: | Langlois, Thibault |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10400.13/4603 |
Summary: | This paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play without needing expert-labeled training examples, deep searches or exposure to competent play. Our approach is based on a reinforcement learning algorithm that is risk seeking, since defensive players in Abalone tend to never end a game. We show that it is the risk-sensitivity that allows a successful self-play training. We also propose a set of features that seem relevant for achieving a good level of play. We evaluate our approach using a fixed heuristic opponent as a benchmark, pitting our agents against human players online and comparing samples of our agents at different times of training. |
id |
RCAP_8da716a10f0301d18c1a003f2c3592ad |
---|---|
oai_identifier_str |
oai:digituma.uma.pt:10400.13/4603 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
Abalearn: a risk-sensitive approach to self-play learning in AbaloneAbalearnSelf-play learningAbalone.Faculdade de Ciências Exatas e da EngenhariaThis paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play without needing expert-labeled training examples, deep searches or exposure to competent play. Our approach is based on a reinforcement learning algorithm that is risk seeking, since defensive players in Abalone tend to never end a game. We show that it is the risk-sensitivity that allows a successful self-play training. We also propose a set of features that seem relevant for achieving a good level of play. We evaluate our approach using a fixed heuristic opponent as a benchmark, pitting our agents against human players online and comparing samples of our agents at different times of training.SpringerDigitUMaCampos, PedroLanglois, Thibault2022-09-13T11:05:59Z20032003-01-01T00:00:00Zconference objectinfo:eu-repo/semantics/publishedVersionapplication/pdfhttp://hdl.handle.net/10400.13/4603eng10.1007/978-3-540-39857-8_6info:eu-repo/semantics/openAccessreponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiainstacron:RCAAP2025-02-24T16:52:34Zoai:digituma.uma.pt:10400.13/4603Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireinfo@rcaap.ptopendoar:https://opendoar.ac.uk/repository/71602025-05-28T20:41:28.584012Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiafalse |
dc.title.none.fl_str_mv |
Abalearn: a risk-sensitive approach to self-play learning in Abalone |
title |
Abalearn: a risk-sensitive approach to self-play learning in Abalone |
spellingShingle |
Abalearn: a risk-sensitive approach to self-play learning in Abalone Campos, Pedro Abalearn Self-play learning Abalone . Faculdade de Ciências Exatas e da Engenharia |
title_short |
Abalearn: a risk-sensitive approach to self-play learning in Abalone |
title_full |
Abalearn: a risk-sensitive approach to self-play learning in Abalone |
title_fullStr |
Abalearn: a risk-sensitive approach to self-play learning in Abalone |
title_full_unstemmed |
Abalearn: a risk-sensitive approach to self-play learning in Abalone |
title_sort |
Abalearn: a risk-sensitive approach to self-play learning in Abalone |
author |
Campos, Pedro |
author_facet |
Campos, Pedro Langlois, Thibault |
author_role |
author |
author2 |
Langlois, Thibault |
author2_role |
author |
dc.contributor.none.fl_str_mv |
DigitUMa |
dc.contributor.author.fl_str_mv |
Campos, Pedro Langlois, Thibault |
dc.subject.por.fl_str_mv |
Abalearn Self-play learning Abalone . Faculdade de Ciências Exatas e da Engenharia |
topic |
Abalearn Self-play learning Abalone . Faculdade de Ciências Exatas e da Engenharia |
description |
This paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play without needing expert-labeled training examples, deep searches or exposure to competent play. Our approach is based on a reinforcement learning algorithm that is risk seeking, since defensive players in Abalone tend to never end a game. We show that it is the risk-sensitivity that allows a successful self-play training. We also propose a set of features that seem relevant for achieving a good level of play. We evaluate our approach using a fixed heuristic opponent as a benchmark, pitting our agents against human players online and comparing samples of our agents at different times of training. |
publishDate |
2003 |
dc.date.none.fl_str_mv |
2003 2003-01-01T00:00:00Z 2022-09-13T11:05:59Z |
dc.type.driver.fl_str_mv |
conference object |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.13/4603 |
url |
http://hdl.handle.net/10400.13/4603 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
10.1007/978-3-540-39857-8_6 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Springer |
publisher.none.fl_str_mv |
Springer |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833598799118860288 |