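APRENDIZAGEM POR REFORÇO E PROGRAMAÇÃO DINÂMICA ADAPTATIVA PARA PROJETO E AVALIAÇÃO DO DESEMPENHO DE ALGORITMOS DLQR EM SISTEMAS MIMO (Reinforcement Learning and Adaptive Dynamic Programming for the Design and Performance Evaluation of DLQR Algorithms in MIMO Systems)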

Lopes, Leandro Rocha

Bibliographic details
Year of defense:	2011
Main author:	Lopes, Leandro Rocha
Advisor:	FONSECA NETO, João Viana da
Examination committee:	Not provided by the institution
Document type:	Master's dissertation
Access type:	Open access
Language:	por
Defending institution:	Universidade Federal do Maranhão
Graduate program:	PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA DE ELETRICIDADE/CCET
Department:	Engineering
Country:	BR
Keywords in Portuguese:	Programação Dinâmica; Controle ótimo; HDP; Q-Function; ADHDP; Sistemas Multivariáveis; Convergência; DLQR
Keywords in English:	Dynamic Programming; Optimal Control; HDP; Q-Function; ADHDP; Multivariable Systems; Convergence; DLQR
CNPq knowledge area:	CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::ANALISE DE ALGORITMOS E COMPLEXIDADE DE COMPUTACAO
Access link:	http://tedebc.ufma.br:8080/jspui/handle/tede/462
Abstract:	Driven by technological development and its industrial applications, control design methods that meet high-performance requirements, together with reinforcement learning methods, are being developed not only to solve new problems but also to improve the performance of controllers already deployed in real systems. The reinforcement learning (RL) and discrete linear quadratic regulator (DLQR) approaches are connected through adaptive dynamic programming (ADP), and this connection is directed at the design of optimal controllers for multivariable (MIMO) systems. The proposed method for tuning DLQR controllers can serve as heuristic guidance for biased variations in the weighting matrices of the instantaneous reward. The performance of the heuristics is evaluated in terms of the convergence of the heuristic dynamic programming (HDP) and action-dependent HDP (AD-HDP) algorithms. The algorithms and the tuning are evaluated by their capability to map the Z-plane in a third-order MIMO dynamic system.
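
As an illustration of the RL/DLQR connection named in the abstract, the sketch below shows HDP-style value iteration for a DLQR problem in its model-based form, where the value-function update reduces to the discrete Riccati recursion started from P = 0. It is a minimal sketch, not the dissertation's implementation: the record does not give the algorithm details, and the third-order, two-input plant (`A`, `B`), the weighting matrices (`Q`, `R`) and all function names are hypothetical choices made for this example.

```python
# Minimal model-based HDP (value iteration) sketch for DLQR, pure Python.
# The plant and the weights below are hypothetical, not taken from the dissertation.

def matmul(X, Y):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def add(X, Y, sign=1.0):
    """Elementwise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(M):
    """Inverse of a 2x2 matrix (enough for a two-input plant)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dlqr_gain(P, A, B, R):
    """Greedy policy for value matrix P: K = (R + B'PB)^-1 B'PA, with u = -K x."""
    Bt = transpose(B)
    S = add(R, matmul(Bt, matmul(P, B)))
    return matmul(inv2(S), matmul(Bt, matmul(P, A)))

def hdp_step(P, A, B, Q, R):
    """One value update: P+ = Q + A'PA - A'PB (R + B'PB)^-1 B'PA."""
    At = transpose(A)
    K = dlqr_gain(P, A, B, R)
    APA = matmul(At, matmul(P, A))
    APBK = matmul(matmul(At, matmul(P, B)), K)
    return add(Q, add(APA, APBK, sign=-1.0))

def hdp_dlqr(A, B, Q, R, tol=1e-10, max_iter=10000):
    """Iterate the value update from P = 0 until successive iterates agree."""
    n = len(A)
    P = [[0.0] * n for _ in range(n)]
    iters = 0
    for iters in range(1, max_iter + 1):
        P_next = hdp_step(P, A, B, Q, R)
        gap = max(abs(p - q) for rp, rq in zip(P_next, P) for p, q in zip(rp, rq))
        P = P_next
        if gap < tol:
            break
    return P, dlqr_gain(P, A, B, R), iters

# Hypothetical third-order plant with two inputs.
A = [[0.9, 0.1, 0.0], [0.0, 0.8, 0.1], [0.1, 0.0, 0.7]]
B = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # state weighting
R = [[1.0, 0.0], [0.0, 1.0]]                             # input weighting

P, K, iters = hdp_dlqr(A, B, Q, R)
print("iterations until convergence:", iters)
```

Re-running `hdp_dlqr` with different `Q` and `R` and comparing the iteration counts is one simple way to observe how the weighting matrices affect convergence, which is the kind of evaluation the abstract describes.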
