Reinforcement Learning and Approximate Dynamic Programming for Optimal Control: An Approach to the Online Design of the Discrete Linear Quadratic Regulator with State and Action Dependent Heuristic Dynamic Programming.

RÊGO, Patrícia Helena Moraes

Bibliographic details
Year of defense:	2014
Main author:	RÊGO, Patrícia Helena Moraes
Advisor:	FONSECA NETO, João Viana da
Examination committee:	FONSECA NETO, João Viana da; FREIRE, Raimundo Carlos Silvério; OLIVEIRA, Roberto Célio Limão de; SERRA, Ginalber Luiz de Oliveira; SOUZA, Francisco das Chagas de
Document type:	Thesis
Access type:	Open access
Language:	por (Portuguese)
Degree-granting institution:	Universidade Federal do Maranhão
Graduate program:	PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA DE ELETRICIDADE/CCET
Department:	DEPARTAMENTO DE ENGENHARIA DA ELETRICIDADE/CCET
Country:	Brazil
Keywords in Portuguese:	Programação Dinâmica; Aprendizagem por Reforço; Programação Dinâmica Heurística; Controle Multivariável; Controle Ótimo; Regulador Linear Quadrático Discreto; Mínimos Quadrados Recursivos
Keywords in English:	Dynamic Programming; Reinforcement Learning; Heuristic Dynamic Programming; Multivariable Control; Optimal Control; Discrete Linear Quadratic Regulator; Recursive Least-Squares
CNPq knowledge area:	Analysis of Algorithms and Computational Complexity
Access link:	http://tedebc.ufma.br:8080/jspui/handle/tede/1879
Abstract:	This thesis proposes a unified approach that combines dynamic programming, reinforcement learning, and function approximation theory to develop methods and algorithms for the design of optimal control systems. The approach is set in the context of approximate dynamic programming, which approximates the optimal feedback solution in order to reduce the computational complexity of conventional dynamic programming methods for optimal control of multivariable systems. Specifically, within the state and action dependent heuristic dynamic programming framework, the proposal targets online, numerically stable approximate solutions of the Riccati-type Hamilton-Jacobi-Bellman equation associated with the discrete linear quadratic regulator problem. The formulation combines value function estimates obtained with a recursive least-squares (RLS) structure, temporal differences, and policy improvements. The methodological development focuses mainly on the UDU^T factorization, which is inserted into this framework to improve the RLS estimation of optimal decision policies for the discrete linear quadratic regulator by circumventing the convergence and numerical stability problems caused by ill-conditioning of the RLS covariance matrix.
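
The loop described in the abstract (value function estimation by RLS, a temporal-difference target, and a policy improvement step) can be illustrated with a minimal sketch. It is not the thesis's algorithm: it uses the conventional covariance-form RLS update rather than the UDU^T-factorized one, it draws state and input samples at random instead of following a single closed-loop trajectory, and every name in it (`adhdp_lqr`, `quad_basis`, `theta_to_H`, `riccati_gain`, the sample counts, the initial covariance `1e4`) is an assumption made for illustration. The system matrices A and B appear only to simulate the plant; the learner sees states, inputs, and costs.

```python
import numpy as np

def quad_basis(z):
    """Monomials z_i*z_j for i <= j, so that z' H z == quad_basis(z) @ theta."""
    i, j = np.triu_indices(len(z))
    return z[i] * z[j]

def theta_to_H(theta, dim):
    """Rebuild the symmetric kernel H from the parameter vector theta."""
    H = np.zeros((dim, dim))
    i, j = np.triu_indices(dim)
    H[i, j] = theta
    # Off-diagonal entries of theta hold 2*H_ij; symmetrizing halves them.
    return (H + H.T) / 2.0

def adhdp_lqr(A, B, Q, R, iters=100, samples=40, seed=0):
    """Value-iteration Q-learning for the discrete LQR (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    dim = n + m
    p = dim * (dim + 1) // 2
    theta = np.zeros(p)          # parameters of Q(x, u) = z' H z, z = [x; u]
    K = np.zeros((m, n))         # current policy u = -K x
    for _ in range(iters):
        H = theta_to_H(theta, dim)   # kernel frozen for the TD target
        P = 1e4 * np.eye(p)          # RLS covariance, reset every iteration
        for _ in range(samples):
            x = rng.standard_normal(n)
            u = rng.standard_normal(m)
            x_next = A @ x + B @ u   # plant response (simulated here)
            z_next = np.concatenate([x_next, -K @ x_next])
            # Temporal-difference target: stage cost plus next-step Q value.
            target = x @ Q @ x + u @ R @ u + z_next @ H @ z_next
            phi = quad_basis(np.concatenate([x, u]))
            # Conventional RLS update (not the UDU^T-factorized form).
            g = P @ phi / (1.0 + phi @ P @ phi)
            theta = theta + g * (target - phi @ theta)
            P = P - np.outer(g, phi @ P)
        # Policy improvement: minimize z' H z over u, giving K = H_uu^{-1} H_ux.
        H_new = theta_to_H(theta, dim)
        K = np.linalg.solve(H_new[n:, n:], H_new[n:, :n])
    return K, theta_to_H(theta, dim)

def riccati_gain(A, B, Q, R, iters=500):
    """Model-based reference gain from the Riccati difference recursion."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K
```

For a small stable test system, for example A = [[0.9, 0.1], [0, 0.8]], B = [[0], [1]], Q = I, and R = 1 (values chosen here only for illustration), the gain returned by `adhdp_lqr` can be compared against `riccati_gain`. The covariance matrix `P` in the inner loop is the quantity whose ill-conditioning the abstract refers to; the sketch leaves it unfactorized.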
