Aprendizado por reforço profundo aplicado ao problema de alocação de veículos dinâmico e estocástico

Detalhes bibliográficos
Ano de defesa: 2023
Autor(a) principal: Cordeiro, Francisco Edyvalberty Alenquer
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: por
Instituição de defesa: Não Informado pela instituição
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://repositorio.ufc.br/handle/riufc/75618
Resumo: The dynamic and stochastic vehicle allocation problem involves deciding which vehicles to assign to requests that arise randomly in time and space. This challenge includes various practical scenarios, such as the transportation of goods by trucks, emergency response systems, and app- based transportation services. In this study, the problem was modeled as a semi-Markov decision process, allowing the treatment of time as a continuous variable. In this approach, decision moments coincide with discrete events with random durations. The use of this event-based strategy results in a significant reduction in decision space, thereby reducing the complexity of the allocation problems involved. Furthermore, it proves to be more suitable for practical situations when compared to discrete-time models often used in the literature. To validate the proposed approach, a discrete event simulator was developed, and two decision-making agents were trained using the reinforcement learning algorithm called Double Deep Q-Learning. Numerical experiments were conducted in realistic scenarios in New York, and the results of the proposed approach were compared with commonly employed heuristics, demonstrating substantial improvements, including up to a 50% reduction in average waiting times compared to other tested policies.