Bibliographic details
Year of defense: |
2021 |
Main author: |
SILVA, Fábio Nogueira da
 |
Advisor: |
FONSECA NETO, João Viana da
 |
Defense committee: |
FONSECA NETO, João Viana da
,
SERRA, Ginalber Luiz de Oliveira
,
SOUZA, Francisco das Chagas de
,
BARRA JUNIOR, Walter
,
SILVEIRA, Antônio da Silva
 |
Document type: |
Thesis
|
Access type: |
Open access |
Language: |
Portuguese |
Defending institution: |
Universidade Federal do Maranhão
|
Graduate program: |
PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA DE ELETRICIDADE/CCET
|
Department: |
DEPARTAMENTO DE ENGENHARIA DA ELETRICIDADE/CCET
|
Country: |
Brazil
|
Keywords in Portuguese: |
|
Keywords in English: |
|
CNPq knowledge area: |
|
Access link: |
https://tedebc.ufma.br/jspui/handle/tede/3695
|
Abstract: |
Formulations of state observers for dynamical systems, based on the fundamentals of approximate dynamic programming (ADP), optimal control, and reinforcement learning, are proposed, developed, applied, and analyzed in this thesis. Algorithm proposals, performance evaluation metrics, and analyses of robustness, convergence, and solvability are also presented, together with studies of the parametric sensitivity of the algorithms with respect to noise signals, initial parameter values, and initial states of the dynamical system. The proposed observers are based on approximate dynamic programming, with the value function approximated by a reinforcement learning (RL) algorithm that uses temporal-difference errors. The observers are intended for online applications but can also be implemented offline. The observer formulation is based on the discrete optimal control problem associated with the discrete linear quadratic regulator (DLQR) with output feedback, and requires only the measured input and output signals. State estimation with the ADP-based structure requires two matrices to be available, and a formulation is proposed that leads to a system of nonlinear algebraic equations for recovering them. To solve this problem, a feedforward multilayer neural network was initially applied, but its high computational complexity throughout the iterative process made this solution infeasible. An alternative based on an approximation is proposed, which avoids solving the system of equations and thus reduces the computational complexity. Because the algorithms have several tunable parameters, error metrics are proposed to evaluate their performance.
To facilitate observer tuning and analysis, error surfaces are constructed over parameter variations, making it possible to observe the parametric sensitivity of the algorithms with respect to the error metrics and to evaluate solvability and convergence. The proposed methodologies have advantages such as not requiring modeling or identification of the dynamical system and incorporating dynamic changes through reinforcement-learning-based approaches, in addition to supporting the tuning and analysis process. |