Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça
| Main author: | Santana, Marlesson Rodrigues Oliveira de |
|---|---|
| Publication date: | 2024 |
| Document type: | Thesis |
| Language: | por |
| Source title: | Repositório Institucional da UFG |
| dARK ID: | ark:/38995/001300000fprk |
| Full text: | http://repositorio.bc.ufg.br/tede/handle/tede/13642 |
| Abstract: | The advent of digital businesses such as marketplaces, in which a company mediates commercial transactions between different actors, poses challenges for recommendation systems because it is a multi-stakeholder scenario. In this scenario, the recommendation must balance conflicting objectives between the parties, such as relevance versus exposure. State-of-the-art models that address the problem in a supervised way not only assume that recommendation is a stationary problem but are also user-centered, which leads to long-term system degradation. This thesis models the recommendation system as a reinforcement learning problem, through a Markov decision process under uncertainty in which the different interests of stakeholders can be modeled in an environment with fairness constraints. The main challenge is the need for real interactions between stakeholders and the recommendation system in a continuous cycle of events that makes online learning possible. We propose a model based on Neural Contextual Bandits with fairness constraints for multi-stakeholder scenarios. As results, we present MARS-Gym, a framework for modeling, training, and evaluating recommendation systems based on reinforcement learning, and the development of different recommendation policies with fairness control adaptable to Neural Contextual Bandit models, which led to an increase in fairness metrics in all scenarios presented while controlling the reduction in relevance metrics. |
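
The relevance-versus-exposure trade-off described in the abstract can be illustrated with a small sketch. This is not the thesis's model and not the MARS-Gym API: it is a tabular epsilon-greedy stand-in for a neural contextual bandit, and the class name `FairEpsilonGreedy`, the `fairness_weight` parameter, the item-to-supplier mapping, and the equal-exposure target are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class FairEpsilonGreedy:
    """Toy contextual bandit: mean reward per (context, item) plus an exposure bonus.

    Hypothetical illustration only; a neural bandit would replace the
    per-pair averages with a learned reward model.
    """

    def __init__(self, item_supplier, epsilon=0.1, fairness_weight=0.5, seed=0):
        self.item_supplier = dict(item_supplier)  # item -> supplier (assumed mapping)
        self.epsilon = epsilon                    # probability of random exploration
        self.fairness_weight = fairness_weight    # strength of the exposure correction
        self.rng = random.Random(seed)
        self.reward_sum = defaultdict(float)      # (context, item) -> summed reward
        self.pulls = defaultdict(int)             # (context, item) -> times recommended
        self.exposure = defaultdict(int)          # supplier -> times exposed
        self.total = 0                            # total recommendations made

    def _score(self, context, item):
        key = (context, item)
        # Estimated relevance: average observed reward, 0.0 if never tried.
        relevance = self.reward_sum[key] / self.pulls[key] if self.pulls[key] else 0.0
        # Assumed fairness target: every supplier gets an equal share of exposure.
        target = 1.0 / len(set(self.item_supplier.values()))
        supplier = self.item_supplier[item]
        share = self.exposure[supplier] / self.total if self.total else target
        # Under-exposed suppliers get a bonus, over-exposed ones a penalty.
        return relevance + self.fairness_weight * (target - share)

    def select(self, context):
        items = list(self.item_supplier)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(items)
        return max(items, key=lambda item: self._score(context, item))

    def update(self, context, item, reward):
        key = (context, item)
        self.reward_sum[key] += reward
        self.pulls[key] += 1
        self.exposure[self.item_supplier[item]] += 1
        self.total += 1
```

With `fairness_weight=0.0` the policy reduces to plain epsilon-greedy on estimated relevance; raising it shifts recommendations toward suppliers whose exposure share is below the target, at some cost in relevance.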
| id |
UFG-2_df80e7e3bb4e664d8d88723d933fda9f |
|---|---|
| oai_identifier_str |
oai:repositorio.bc.ufg.br:tede/13642 |
| network_acronym_str |
UFG-2 |
| network_name_str |
Repositório Institucional da UFG |
| repository_id_str |
oai:repositorio.bc.ufg.br:tede/1234 |
| spelling |
The advent of digital businesses such as marketplaces, in which a company mediates a commercial transaction between different actors, poses challenges for recommender systems because it is a multi-stakeholder scenario. In this scenario, the recommendation must satisfy conflicting objectives between the parties, for example relevance versus exposure. State-of-the-art models that treat the problem in a supervised way not only assume that recommendation is a stationary problem but are also user-centered, which leads to long-term degradation of the system. This thesis focuses on modeling the recommender system as a reinforcement learning problem, through a Markov decision process under uncertainty in which the different interests of the stakeholders can be modeled in an environment with fairness constraints. The main challenges lie in the need for real interactions between the stakeholders and the recommender system in a continuous cycle of events that enables online learning. For the development of this work, we present a model proposal based on Neural Contextual Bandits with fairness constraints for multi-stakeholder scenarios. As results, we present an open-source framework (MARS-Gym) for modeling, training, and evaluating RL agents for recommender systems in multi-stakeholder environments, and the Neural Contextual Bandit architecture 'Fair-Feature-Policy' with multi-objective optimization and fairness constraints, which led to an increase in the ideal-exposure metrics of suppliers in all evaluated scenarios, with little or no reduction in the relevance of the recommendations given by the model. |
| dc.title.none.fl_str_mv |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça |
| dc.title.alternative.eng.fl_str_mv |
Framework for Recommender Systems based on Neural Contextual bandits with Fairness-Constrained |
| title |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça |
| spellingShingle |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça; Santana, Marlesson Rodrigues Oliveira de; recomendação multistakeholder; justiça na recomendação; aprendizado por reforço; multistakeholder recommendation; fairness; reinforcement learning; CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO |
| title_short |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça |
| title_full |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça |
| title_fullStr |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça |
| title_full_unstemmed |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça |
| title_sort |
Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça |
| author |
Santana, Marlesson Rodrigues Oliveira de |
| author_facet |
Santana, Marlesson Rodrigues Oliveira de |
| author_role |
author |
| dc.contributor.advisor1.fl_str_mv |
Soares, Anderson da Silva |
| dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/1096941114079527 |
| dc.contributor.referee1.fl_str_mv |
Soares, Anderson da Silva |
| dc.contributor.referee2.fl_str_mv |
Rosa, Thierson Couto |
| dc.contributor.referee3.fl_str_mv |
Carvalho, Cedric Luiz De |
| dc.contributor.referee4.fl_str_mv |
Araújo, Aluizio Fausto Ribeiro |
| dc.contributor.referee5.fl_str_mv |
Veloso, Adriano |
| dc.contributor.authorLattes.fl_str_mv |
http://lattes.cnpq.br/4493140717229623 |
| dc.contributor.author.fl_str_mv |
Santana, Marlesson Rodrigues Oliveira de |
| contributor_str_mv |
Soares, Anderson da Silva; Soares, Anderson da Silva; Rosa, Thierson Couto; Carvalho, Cedric Luiz De; Araújo, Aluizio Fausto Ribeiro; Veloso, Adriano |
| dc.subject.por.fl_str_mv |
recomendação multistakeholder; justiça na recomendação; aprendizado por reforço |
| topic |
recomendação multistakeholder; justiça na recomendação; aprendizado por reforço; multistakeholder recommendation; fairness; reinforcement learning; CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO |
| dc.subject.eng.fl_str_mv |
multistakeholder recommendation; fairness; reinforcement learning |
| dc.subject.cnpq.fl_str_mv |
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO |
| description |
The advent of digital businesses such as marketplaces, in which a company mediates commercial transactions between different actors, poses challenges for recommendation systems because it is a multi-stakeholder scenario. In this scenario, the recommendation must balance conflicting objectives between the parties, such as relevance versus exposure. State-of-the-art models that address the problem in a supervised way not only assume that recommendation is a stationary problem but are also user-centered, which leads to long-term system degradation. This thesis models the recommendation system as a reinforcement learning problem, through a Markov decision process under uncertainty in which the different interests of stakeholders can be modeled in an environment with fairness constraints. The main challenge is the need for real interactions between stakeholders and the recommendation system in a continuous cycle of events that makes online learning possible. We propose a model based on Neural Contextual Bandits with fairness constraints for multi-stakeholder scenarios. As results, we present MARS-Gym, a framework for modeling, training, and evaluating recommendation systems based on reinforcement learning, and the development of different recommendation policies with fairness control adaptable to Neural Contextual Bandit models, which led to an increase in fairness metrics in all scenarios presented while controlling the reduction in relevance metrics. |
| publishDate |
2024 |
| dc.date.accessioned.fl_str_mv |
2024-11-13T15:16:51Z |
| dc.date.available.fl_str_mv |
2024-11-13T15:16:51Z |
| dc.date.issued.fl_str_mv |
2024-06-03 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
| format |
doctoralThesis |
| status_str |
publishedVersion |
| dc.identifier.citation.fl_str_mv |
SANTANA, M. R. O. Framework para Sistemas de Recomendação Baseados em Neural Contextual Bandits com Restrição de Justiça. Goiânia. 2024. 105p. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024. |
| dc.identifier.uri.fl_str_mv |
http://repositorio.bc.ufg.br/tede/handle/tede/13642 |
| dc.identifier.dark.fl_str_mv |
ark:/38995/001300000fprk |
| identifier_str_mv |
SANTANA, M. R. O. Framework para Sistemas de Recomendação Baseados em Neural Contextual Bandits com Restrição de Justiça. Goiânia. 2024. 105p. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024. ark:/38995/001300000fprk |
| url |
http://repositorio.bc.ufg.br/tede/handle/tede/13642 |
| dc.language.iso.fl_str_mv |
por |
| language |
por |
| dc.rights.driver.fl_str_mv |
Attribution-NonCommercial-NoDerivatives 4.0 International; info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
Attribution-NonCommercial-NoDerivatives 4.0 International |
| eu_rights_str_mv |
openAccess |
| dc.publisher.none.fl_str_mv |
Universidade Federal de Goiás |
| dc.publisher.program.fl_str_mv |
Programa de Pós-graduação em Ciência da Computação (INF) |
| dc.publisher.initials.fl_str_mv |
UFG |
| dc.publisher.country.fl_str_mv |
Brasil |
| dc.publisher.department.fl_str_mv |
Instituto de Informática - INF (RMG) |
| publisher.none.fl_str_mv |
Universidade Federal de Goiás |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFG; instname:Universidade Federal de Goiás (UFG); instacron:UFG |
| instname_str |
Universidade Federal de Goiás (UFG) |
| instacron_str |
UFG |
| institution |
UFG |
| reponame_str |
Repositório Institucional da UFG |
| collection |
Repositório Institucional da UFG |
| bitstream.url.fl_str_mv |
http://repositorio.bc.ufg.br/tede/bitstreams/d3d161c4-ec23-439b-a380-2ebfb3d3a34b/download http://repositorio.bc.ufg.br/tede/bitstreams/70eae756-8316-4a52-84d5-eaadace5045b/download http://repositorio.bc.ufg.br/tede/bitstreams/57f3ccf6-eefc-4ee4-9fc1-f2052e83f468/download |
| bitstream.checksum.fl_str_mv |
57f091413bd4f9e4cc57811c25805bf0 8a4605be74aa9ea9d79846c1fba20a33 4460e5956bc1d1639be9ae6146a50347 |
| bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
| repository.name.fl_str_mv |
Repositório Institucional da UFG - Universidade Federal de Goiás (UFG) |
| repository.mail.fl_str_mv |
grt.bc@ufg.br |
| _version_ |
1846536536634425344 |
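
The `bitstream.checksum.fl_str_mv` and `bitstream.checksumAlgorithm.fl_str_mv` fields above list an MD5 digest for each downloadable file. A downloaded copy could be checked against the recorded value along the following lines; this is a minimal sketch, and the helper names `md5_of_file` and `matches_record` as well as the local file path are assumptions, not part of the repository's tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_hex):
    """Compare a local file's MD5 digest with the value listed in the record."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For instance, `matches_record("thesis.pdf", "57f091413bd4f9e4cc57811c25805bf0")` would compare a local copy saved under the hypothetical name `thesis.pdf` with the first listed checksum. MD5 detects accidental corruption but is not a safeguard against deliberate tampering.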