Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça

Bibliographic details
Main author: Santana, Marlesson Rodrigues Oliveira de
Publication date: 2024
Document type: Thesis
Language: por
Source title: Repositório Institucional da UFG
dARK ID: ark:/38995/001300000fprk
Full text: http://repositorio.bc.ufg.br/tede/handle/tede/13642
Abstract: The advent of digital businesses such as marketplaces, in which a company mediates a commercial transaction between different actors, presents challenges to recommendation systems because it is a multi-stakeholder scenario. In this scenario, the recommendation must meet conflicting objectives between the parties, such as relevance versus exposure. State-of-the-art models that address the problem in a supervised way not only assume that recommendation is a stationary problem, but are also user-centered, which leads to long-term system degradation. This thesis focuses on modeling the recommendation system as a reinforcement learning problem, through a Markovian decision-making process with uncertainty in which the different interests of stakeholders can be modeled in an environment with fairness constraints. The main challenges lie in the need for real interactions between stakeholders and the recommendation system in a continuous cycle of events that enables online learning. For the development of this work, we present a model proposal based on Neural Contextual Bandits with fairness constraints for multi-stakeholder scenarios. As results, we present the construction of MARS-Gym, a framework for modeling, training and evaluating recommendation systems based on reinforcement learning, and the development of different recommendation policies with fairness control adaptable to Neural Contextual Bandit models, which led to an increase in fairness metrics for all scenarios presented while controlling the reduction in relevance metrics.
id UFG-2_df80e7e3bb4e664d8d88723d933fda9f
oai_identifier_str oai:repositorio.bc.ufg.br:tede/13642
network_acronym_str UFG-2
network_name_str Repositório Institucional da UFG
repository_id_str oai:repositorio.bc.ufg.br:tede/1234
spelling The advent of digital businesses such as marketplaces, in which a company mediates a commercial transaction between different actors, poses challenges for recommender systems because it is a multi-stakeholder scenario. In this scenario, the recommendation must serve conflicting objectives among the parties, for example relevance versus exposure. State-of-the-art models that treat the problem in a supervised way not only assume that recommendation is a stationary problem but are also user-centered, which leads to long-term degradation of the system. This thesis focuses on modeling the recommender system as a reinforcement learning problem, through a Markov decision process under uncertainty in which the different interests of the stakeholders can be modeled in an environment with fairness constraints. The main challenges lie in the need for real interactions between the stakeholders and the recommender system in a continuous cycle of events that enables online learning. For the development of this work, we present a model proposal based on Neural Contextual Bandits with a fairness constraint for multi-stakeholder scenarios. As results, we present an open-source framework (MARS-Gym) for modeling, training, and evaluating RL agents for recommender systems in multi-stakeholder environments, and the Neural Contextual Bandit architecture 'Fair-Feature-Policy', with multi-objective optimization and a fairness constraint, which increased the ideal-exposure metrics of suppliers in all evaluated scenarios, with little or no reduction in the relevance of the recommendations given by the model.
Tese - Marlesson Rodrigues Oliveira de Santana - 2024.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title.none.fl_str_mv Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
dc.title.alternative.eng.fl_str_mv Framework for Recommender Systems based on Neural Contextual bandits with Fairness-Constrained
title Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
spellingShingle Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
Santana, Marlesson Rodrigues Oliveira de
recomendação multistakeholder
justiça na recomendação
aprendizado por reforço
multistakeholder recommendation
fairness
reinforcement learning
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO
title_short Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
title_full Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
title_fullStr Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
title_full_unstemmed Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
title_sort Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
author Santana, Marlesson Rodrigues Oliveira de
author_facet Santana, Marlesson Rodrigues Oliveira de
author_role author
dc.contributor.advisor1.fl_str_mv Soares, Anderson da Silva
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/1096941114079527
dc.contributor.referee1.fl_str_mv Soares, Anderson da Silva
dc.contributor.referee2.fl_str_mv Rosa, Thierson Couto
dc.contributor.referee3.fl_str_mv Carvalho, Cedric Luiz De
dc.contributor.referee4.fl_str_mv Araújo, Aluizio Fausto Ribeiro
dc.contributor.referee5.fl_str_mv Veloso, Adriano
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/4493140717229623
dc.contributor.author.fl_str_mv Santana, Marlesson Rodrigues Oliveira de
contributor_str_mv Soares, Anderson da Silva
Soares, Anderson da Silva
Rosa, Thierson Couto
Carvalho, Cedric Luiz De
Araújo, Aluizio Fausto Ribeiro
Veloso, Adriano
dc.subject.por.fl_str_mv recomendação multistakeholder
justiça na recomendação
aprendizado por reforço
topic recomendação multistakeholder
justiça na recomendação
aprendizado por reforço
multistakeholder recommendation
fairness
reinforcement learning
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO
dc.subject.eng.fl_str_mv multistakeholder recommendation
fairness
reinforcement learning
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO
description The advent of digital businesses such as marketplaces, in which a company mediates a commercial transaction between different actors, presents challenges to recommendation systems because it is a multi-stakeholder scenario. In this scenario, the recommendation must meet conflicting objectives between the parties, such as relevance versus exposure. State-of-the-art models that address the problem in a supervised way not only assume that recommendation is a stationary problem, but are also user-centered, which leads to long-term system degradation. This thesis focuses on modeling the recommendation system as a reinforcement learning problem, through a Markovian decision-making process with uncertainty in which the different interests of stakeholders can be modeled in an environment with fairness constraints. The main challenges lie in the need for real interactions between stakeholders and the recommendation system in a continuous cycle of events that enables online learning. For the development of this work, we present a model proposal based on Neural Contextual Bandits with fairness constraints for multi-stakeholder scenarios. As results, we present the construction of MARS-Gym, a framework for modeling, training and evaluating recommendation systems based on reinforcement learning, and the development of different recommendation policies with fairness control adaptable to Neural Contextual Bandit models, which led to an increase in fairness metrics for all scenarios presented while controlling the reduction in relevance metrics.
publishDate 2024
dc.date.accessioned.fl_str_mv 2024-11-13T15:16:51Z
dc.date.available.fl_str_mv 2024-11-13T15:16:51Z
dc.date.issued.fl_str_mv 2024-06-03
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv SANTANA, M. R. O. Framework para Sistemas de Recomendação Baseados em Neural Contextual Bandits com Restrição de Justiça. Goiânia. 2024. 105p. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.
dc.identifier.uri.fl_str_mv http://repositorio.bc.ufg.br/tede/handle/tede/13642
dc.identifier.dark.fl_str_mv ark:/38995/001300000fprk
identifier_str_mv SANTANA, M. R. O. Framework para Sistemas de Recomendação Baseados em Neural Contextual Bandits com Restrição de Justiça. Goiânia. 2024. 105p. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.
ark:/38995/001300000fprk
url http://repositorio.bc.ufg.br/tede/handle/tede/13642
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivatives 4.0 International
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivatives 4.0 International
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Goiás
dc.publisher.program.fl_str_mv Programa de Pós-graduação em Ciência da Computação (INF)
dc.publisher.initials.fl_str_mv UFG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Instituto de Informática - INF (RMG)
publisher.none.fl_str_mv Universidade Federal de Goiás
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFG
instname:Universidade Federal de Goiás (UFG)
instacron:UFG
instname_str Universidade Federal de Goiás (UFG)
instacron_str UFG
institution UFG
reponame_str Repositório Institucional da UFG
collection Repositório Institucional da UFG
bitstream.url.fl_str_mv http://repositorio.bc.ufg.br/tede/bitstreams/d3d161c4-ec23-439b-a380-2ebfb3d3a34b/download
http://repositorio.bc.ufg.br/tede/bitstreams/70eae756-8316-4a52-84d5-eaadace5045b/download
http://repositorio.bc.ufg.br/tede/bitstreams/57f3ccf6-eefc-4ee4-9fc1-f2052e83f468/download
bitstream.checksum.fl_str_mv 57f091413bd4f9e4cc57811c25805bf0
8a4605be74aa9ea9d79846c1fba20a33
4460e5956bc1d1639be9ae6146a50347
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFG - Universidade Federal de Goiás (UFG)
repository.mail.fl_str_mv grt.bc@ufg.br
_version_ 1846536536634425344