Bibliographic details
Year of defense: 2024
Main author: FERREIRA, Frederic Menezes
Advisor: OLIVEIRA, Alexandre César Muniz de
Defense committee: OLIVEIRA, Alexandre César Muniz de; SOUZA, Bruno Feres de; ALMEIDA NETO, Areolino de; CHAVES, Antônio Augusto
Document type: Dissertation
Access type: Open access
Language: Portuguese (por)
Defense institution: Universidade Federal do Maranhão
Graduate program: PROGRAMA DE PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO/CCET
Department: DEPARTAMENTO DE INFORMÁTICA/CCET
Country: Brazil
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://tedebc.ufma.br/jspui/handle/tede/5666
Abstract:
Efficient manufacturing control plays a fundamental role in the manufacturing industry’s ability to meet the growing demands for personalised production, characterised by rapid changes in customer preferences. To optimise flexible manufacturing settings marked by high automation, it is essential to incorporate autonomous decisions during production planning and execution. However, the challenge lies in developing a manufacturing control system that is resilient and, ideally, proactive, anticipating sudden changes efficiently in industrial practice. Achieving such a goal requires the use of intelligent production scheduling tools. Many data-driven technologies have been adopted in production scheduling research, with Reinforcement Learning (RL) being a promising candidate capable of establishing a direct mapping from environment observation to performance-enhancing actions. This dissertation presents an RL framework to solve the dynamic scheduling problem within a local unit of a self-optimised manufacturing network, aiming to find an optimal production schedule. The RL algorithm trains a scheduling agent, capturing the relationship between factory floor information and scheduling criteria to make real-time decisions for a manufacturing system subject to frequent unexpected events. We propose an initial validation scenario where the agent must accept production demands considering priorities related to three performance criteria (economic, sustainability, and variability), given the current system load, which can impact delays and, consequently, financial losses. A Reinforcement Learning environment is introduced using state-of-the-art open-source software. The environment is designed as a single-agent problem, where the RL agent decides which demands to dispatch for production on the available machines at each time step. To guide the agent towards an optimal schedule, a reward function balances each criterion’s influence using a prioritisation factor. Additionally, a state-of-the-art RL algorithm is implemented. The RL approach is evaluated by comparing its solutions to a simulated dataset. The results show that the approach can generate more profitable and personalised, or more sustainable, schedules, depending on the adopted criterion. The agent’s performance is influenced by the prioritisation factor in the reward function.
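The abstract describes a single-agent environment in which the agent dispatches pending demands to available machines at each time step, with a reward that blends economic, sustainability, and variability criteria through a prioritisation factor — in symbols, a scalarisation of the form R_t = w_econ·r_econ + w_sust·r_sust + w_var·r_var, where the weight vector plays the role of the prioritisation factor. The sketch below is a minimal, hypothetical Gymnasium-style rendering of that design; the class name, observation features, reward terms, and all numeric values are assumptions for illustration, not the dissertation's actual implementation.

```python
# Illustrative sketch only: the dissertation's real environment, state
# encoding, and reward terms are not reproduced here; every name and
# number below is an assumption.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class DispatchEnv(gym.Env):
    """Hypothetical single-agent dispatching environment.

    At each step the agent picks one pending demand (or waits) to send
    to the machine pool; the reward scalarises three criteria
    (economic, sustainability, variability) with a prioritisation factor.
    """

    def __init__(self, n_slots: int = 8, n_machines: int = 3,
                 weights=(0.5, 0.3, 0.2)):
        self.n_slots = n_slots
        self.n_machines = n_machines
        self.weights = np.asarray(weights, dtype=np.float32)  # prioritisation factor
        # Observation: per-slot features (profit, energy use, novelty,
        # due-date slack) plus the current machine load, as one vector.
        self.observation_space = spaces.Box(
            low=0.0, high=1.0, shape=(n_slots * 4 + 1,), dtype=np.float32)
        # Action: index of the demand to dispatch, or n_slots for "wait".
        self.action_space = spaces.Discrete(n_slots + 1)

    def _obs(self):
        return np.concatenate(
            [self.demands.ravel(), [self.load / self.n_machines]]
        ).astype(np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.demands = self.np_random.random((self.n_slots, 4), dtype=np.float32)
        self.load = 0
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        self.t += 1
        reward = 0.0
        if action < self.n_slots and self.load < self.n_machines:
            profit, energy, novelty, _ = self.demands[action]
            # Weighted scalarisation of the three criteria: higher profit,
            # lower energy use, and higher product novelty all pay off.
            criteria = np.array([profit, 1.0 - energy, novelty])
            reward = float(self.weights @ criteria)
            self.load += 1
            self.demands[action] = self.np_random.random(4)  # a new demand arrives
        # Machines stochastically finish jobs, freeing capacity.
        if self.load > 0 and self.np_random.random() < 0.3:
            self.load -= 1
        return self._obs(), reward, False, self.t >= 100, {}
```

The dissertation says only that a state-of-the-art RL algorithm is implemented; as one plausible stand-in, the environment above could be trained with an off-the-shelf learner such as Stable-Baselines3's PPO:

```python
# Hypothetical training loop; PPO is an assumption, not the
# dissertation's named algorithm.
from stable_baselines3 import PPO

model = PPO("MlpPolicy", DispatchEnv(), verbose=0)
model.learn(total_timesteps=50_000)
```

Varying the `weights` tuple reproduces the abstract's central observation: shifting the prioritisation factor toward the economic or variability terms steers the learned schedule toward more profitable and personalised dispatching, while weighting the sustainability term favours lower-energy schedules.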