Bibliographic details
Year of defense: 2022
Main author: Nishimoto, Bruno Eidi
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Dissertation
Access type: Open access
Language: eng
Defending institution: Biblioteca Digitais de Teses e Dissertações da USP
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/3/3141/tde-31032023-082212/
Abstract:
Dialogue systems (DS) are an old idea dating back to 1966, when the first such system was created. DS can be classified into three categories: question answering, task-oriented, and socialbot. Task-oriented dialogue systems are a very relevant field due to the diversity of applications they can address; for example, they can solve tasks such as buying a movie ticket, booking a restaurant, and providing customer service. They have received increasing attention in recent years, one reason being the advances in natural language processing. Although the literature presents several studies focusing on DS, many issues remain open. Most of them are related to dialogue management, the central component of a DS. Reinforcement learning (RL) is one approach that has achieved great success recently. However, things become more complex when a DS is extended to multi-domain settings, i.e., when it needs to complete multiple tasks in different domains for the user. Problems such as policy adaptation and transfer learning arise in this new scenario. The purpose of this research is to improve recent techniques that use RL for dialogue management. We present efficient learning by balancing exploration and exploitation and by enhancing the use of expert knowledge to guide the agent. We propose a method to handle noise and errors in the input of the dialogue manager, and we also provide a basic comparison between RL and supervised learning on both toy and real datasets. Finally, we present a new proposal to deal with multi-domain settings: the use of the divide-and-conquer technique and transfer learning across different domains.
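As a rough illustration of the exploration-exploitation balance mentioned in the abstract (not the dissertation's actual method), the sketch below shows a generic epsilon-greedy Q-learning loop over a toy task-oriented dialogue; every state, action, reward, and hyperparameter here is a hypothetical placeholder.

    # Illustrative sketch only: a generic epsilon-greedy Q-learning loop for a
    # toy dialogue manager. States, actions, rewards, and hyperparameters are
    # hypothetical and are NOT taken from the dissertation.
    import random
    from collections import defaultdict

    STATES = ["greet", "ask_slot", "confirm", "done"]                      # toy dialogue states
    ACTIONS = ["request_info", "offer_ticket", "confirm_booking", "end"]   # toy system actions

    def step(state, action):
        """Hypothetical environment: returns (next_state, reward, done)."""
        if state == "greet":
            return ("ask_slot", 0.0, False) if action == "request_info" else ("greet", -1.0, False)
        if state == "ask_slot":
            return ("confirm", 0.0, False) if action == "offer_ticket" else ("ask_slot", -1.0, False)
        if state == "confirm":
            return ("done", 5.0, True) if action == "confirm_booking" else ("confirm", -1.0, False)
        return ("done", 0.0, True)

    q = defaultdict(float)               # Q[(state, action)] -> estimated return
    alpha, gamma, epsilon = 0.1, 0.95, 0.2

    def choose_action(state):
        # Exploration vs. exploitation: with probability epsilon pick a random
        # action, otherwise pick the action with the highest current Q-value.
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: q[(state, a)])

    for episode in range(500):
        state, done = "greet", False
        while not done:
            action = choose_action(state)
            next_state, reward, done = step(state, action)
            # Standard Q-learning update toward the bootstrapped target.
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state

    # Greedy action in the initial state; typically "request_info" after training.
    print(max(ACTIONS, key=lambda a: q[("greet", a)]))

The epsilon term keeps the agent trying suboptimal-looking actions often enough to discover the rewarded dialogue path, while the greedy branch exploits what it has already learned; the dissertation's contribution concerns making that trade-off, plus the use of expert knowledge, more efficient than this plain baseline.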