Improvements in numerical stability and computational cost of state-value function approximators based on RLS estimators for the online design of HDP-DLQR control systems

Ferreira, Ernesto Franklin Marçal

Bibliographic details
Year of defense:	2016
Main author:	Ferreira, Ernesto Franklin Marçal
Advisor:	FONSECA NETO, João Viana da
Examination committee:	Not provided by the institution
Document type:	Master's thesis
Access type:	Open access
Language:	por
Defending institution:	Universidade Federal do Maranhão
Graduate program:	PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA DE ELETRICIDADE/CCET
Department:	DEPARTAMENTO DE ENGENHARIA DA ELETRICIDADE/CCET
Country:	Brazil
Keywords in Portuguese:	Programação Dinâmica; Aprendizagem por Reforço; Programação Dinâmica Heurística; Controle Multivariável; Controle Ótimo; Regulador Linear Quadrático Discreto; Mínimos Quadrados Recursivos; Decomposição QR
Keywords in English:	Dynamic Programming; Reinforcement Learning; Heuristic Dynamic Programming; Multivariable Control; Optimal Control; Discrete Linear Quadratic Regulator; Recursive Least-Squares
CNPq knowledge area:	Software Engineering
Access link:	http://tedebc.ufma.br:8080/jspui/handle/tede/1687
Abstract:	This work presents the development and numerical stability analysis of a new adaptive critic algorithm that approximates the state-value function for the online design of discrete linear quadratic regulator (DLQR) optimal control systems based on heuristic dynamic programming (HDP). The proposed algorithm uses unitary transformations and QR decomposition methods to improve online learning efficiency in the critic network through the recursive least-squares (RLS) approach. The resulting learning strategy improves numerical stability and reduces computational cost, with the aim of making real-time implementation of optimal control design methods based on actor-critic reinforcement learning feasible. The convergence behavior and numerical stability of the proposed online algorithm, called RLSµ-QR-HDP-DLQR, are evaluated through computational simulations on three multiple-input multiple-output (MIMO) models: a third-order model of the autopilot of an F-16 aircraft, a fourth-order RLC circuit with two input voltages and two controllable voltage levels, and a doubly-fed induction generator with six inputs and six outputs for wind energy conversion systems.
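As a rough illustration of the idea the abstract describes, the sketch below shows a generic QR-based recursive least-squares update in Python with NumPy. It is not the RLSµ-QR-HDP-DLQR algorithm from the thesis. The function name `qr_rls_update`, the forgetting factor `lam`, the regularization `delta`, and the synthetic data are all assumptions made for this example. In an HDP critic, `phi` would presumably be a quadratic basis of the state and `y` the one-step cost plus the current value estimate at the next state.

```python
import numpy as np

def qr_rls_update(R, z, phi, y, lam=1.0):
    """One generic QR-RLS step (illustrative sketch, not the thesis algorithm).

    R (n x n, upper triangular) and z (n,) encode the least-squares problem
    R @ theta = z. The new sample (phi, y) is folded in by re-triangularizing
    the augmented matrix with an orthogonal transformation, so the covariance
    matrix is never formed or inverted explicitly.
    """
    n = phi.size
    # Weighted previous factor stacked on top of the new regressor row.
    A = np.vstack([np.sqrt(lam) * R, phi.reshape(1, n)])
    b = np.concatenate([np.sqrt(lam) * z, [y]])
    # QR of [A | b]: the leading n x n block is the new R, and the first n
    # entries of the last column are the new z.
    _, T = np.linalg.qr(np.column_stack([A, b]))
    return T[:n, :n], T[:n, n]

# Hypothetical usage on synthetic, noise-free data.
rng = np.random.default_rng(0)
theta_true = np.array([1.5, -0.7, 0.3])
n = theta_true.size
delta = 1e-6                      # assumed small regularization
R = np.sqrt(delta) * np.eye(n)    # initial triangular factor
z = np.zeros(n)

for _ in range(50):
    phi = rng.standard_normal(n)
    y = phi @ theta_true
    R, z = qr_rls_update(R, z, phi, y)

# A general solver is used here for brevity; a dedicated back-substitution
# routine would exploit the triangular structure of R.
theta_hat = np.linalg.solve(R, z)
```

Working on the triangular factor rather than on the covariance matrix is the usual motivation for QR-based RLS variants, since orthogonal transformations do not amplify rounding errors.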
