
Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge

Bibliographic Details
Main Author: Banhudo, Guilherme Sousa Falcão Duarte
Publication Date: 2019
Format: Master thesis
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/10071/19197
Summary: In 1995, the Basel Committee on Banking Supervision issued an amendment to the first Basel Accord, allowing financial institutions to develop internal risk models based on value-at-risk (VaR) instead of using the regulator's predefined model. From that point onwards, the scientific community focused its efforts on improving the accuracy of VaR models in order to reduce the capital requirements stipulated by the regulatory framework. In contrast, some authors proposed that the key to optimizing the disclosure lies not in improving the existing models but in manipulating the estimated value. The most recent progress in this field employed dynamic programming (DP), based on Markov decision processes (MDPs), to create a daily reporting policy. However, dynamic programming carries heavy costs: not only does the algorithm require an explicit transition probability matrix, but its high computational storage requirements and inability to operate on continuous MDPs also force the problem to be simplified. The purpose of this work is to introduce deep reinforcement learning as an alternative for solving problems characterized by a complex or continuous MDP. To this end, the author benchmarks the DP-generated policy against one generated via proximal policy optimization. In conclusion, and despite the small number of learning iterations employed, the algorithm showed strong convergence to the optimal policy, allowing the methodology to be applied to the unrestricted problem without resorting to simplifications such as action and state discretization.
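For reference, the capital charge named in the title is, under the internal-models amendment, commonly stated as the larger of the most recently reported VaR and a multiple of the trailing 60-day average (a standard formulation, not reproduced in this record):

    \mathrm{MRC}_t = \max\left( \mathrm{VaR}_{t-1},\; k \cdot \frac{1}{60} \sum_{i=1}^{60} \mathrm{VaR}_{t-i} \right), \qquad k \ge 3,

where VaR is the 99%, 10-day value-at-risk and the multiplier k starts at 3 and rises with the number of backtesting exceptions. A reporting policy therefore trades off the two arguments of the max, which is what makes daily disclosure a sequential decision problem. Likewise, the proximal policy optimization algorithm benchmarked in the thesis maximizes the clipped surrogate objective of Schulman et al. (2017):

    L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[ \min\left( r_t(\theta)\,\hat{A}_t,\; \operatorname{clip}\!\left(r_t(\theta),\, 1-\varepsilon,\, 1+\varepsilon\right) \hat{A}_t \right) \right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},

where \hat{A}_t is an advantage estimate and \varepsilon a clipping parameter. Because the policy \pi_\theta is a neural network, states and actions need not be discretized, which is the advantage over tabular dynamic programming that the abstract highlights.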
id RCAP_1fae90ec8ca3b2c25081d969182d2faf
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/19197
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge; Value at risk; Basel accord; Artificial intelligence; Deep learning; Deep reinforcement learning; Proximal policy optimization
In 1995, the Basel Committee on Banking Supervision issued an amendment to the first Basel Accord, allowing financial institutions to develop internal risk models based on value-at-risk (VaR) instead of using the regulator's predefined model. From that point onwards, the scientific community focused its efforts on improving the accuracy of VaR models in order to reduce the capital requirements stipulated by the regulatory framework. In contrast, some authors proposed that the key to optimizing the disclosure lies not in improving the existing models but in manipulating the estimated value. The most recent progress in this field employed dynamic programming (DP), based on Markov decision processes (MDPs), to create a daily reporting policy. However, dynamic programming carries heavy costs: not only does the algorithm require an explicit transition probability matrix, but its high computational storage requirements and inability to operate on continuous MDPs also force the problem to be simplified. The purpose of this work is to introduce deep reinforcement learning as an alternative for solving problems characterized by a complex or continuous MDP. To this end, the author benchmarks the DP-generated policy against one generated via proximal policy optimization. In conclusion, and despite the small number of learning iterations employed, the algorithm showed strong convergence to the optimal policy, allowing the methodology to be applied to the unrestricted problem without resorting to simplifications such as action and state discretization.
(Translated from the Portuguese abstract:) In 1995, an amendment was issued to the Basel Accord then in force, Basel I, allowing financial institutions to opt to develop internal risk-measurement models based on value-at-risk (VaR) instead of relying on the model stipulated by the regulator. Since then, the scientific community has focused its efforts on improving the precision of VaR models, thereby seeking to reduce the capital requirements defined in the regulations. However, some authors proposed that the key to optimizing the report would lie not in improving the existing models but in manipulating the estimated value. The most recent progress resorted to dynamic programming (DP), based on Markov decision processes (MDPs), to achieve this end, creating a daily reporting rule. However, the use of DP carries costs for the solution since, on the one hand, the algorithm requires a defined transition probability matrix and, on the other, the high computational storage requirements and the inability to handle continuous MDPs demand that the problem at hand be simplified. This work aims to introduce deep reinforcement learning as an alternative for problems characterized by a continuous or complex MDP. To this end, the policy created by dynamic programming is benchmarked against the proximal policy optimization algorithm. In short, and despite the small number of iterations employed, the algorithm demonstrated strong convergence to the optimal solution and can be employed to estimate the problem without resorting to simplifications.
2019-12-17T12:22:46Z; 2019-11-12T00:00:00Z; 2019-11-12; 2019-10; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10071/19197; TID:202322521; eng; Banhudo, Guilherme Sousa Falcão Duarte; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP
dc.title.none.fl_str_mv Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
title Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
spellingShingle Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
Banhudo, Guilherme Sousa Falcão Duarte
Value at risk
Basel accord
Artificial intelligence
Deep learning
Deep reinforcement learning
Proximal policy optimization
title_short Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
title_full Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
title_fullStr Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
title_full_unstemmed Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
title_sort Adaptive value-at-risk policy optimization: a deep reinforcement learning approach for minimizing the capital charge
author Banhudo, Guilherme Sousa Falcão Duarte
author_facet Banhudo, Guilherme Sousa Falcão Duarte
author_role author
dc.contributor.author.fl_str_mv Banhudo, Guilherme Sousa Falcão Duarte
dc.subject.por.fl_str_mv Value at risk
Basel accord
Artificial intelligence
Deep learning
Deep reinforcement learning
Proximal policy optimization
topic Value at risk
Basel accord
Artificial intelligence
Deep learning
Deep reinforcement learning
Proximal policy optimization
description In 1995, the Basel Committee on Banking Supervision issued an amendment to the first Basel Accord, allowing financial institutions to develop internal risk models based on value-at-risk (VaR) instead of using the regulator's predefined model. From that point onwards, the scientific community focused its efforts on improving the accuracy of VaR models in order to reduce the capital requirements stipulated by the regulatory framework. In contrast, some authors proposed that the key to optimizing the disclosure lies not in improving the existing models but in manipulating the estimated value. The most recent progress in this field employed dynamic programming (DP), based on Markov decision processes (MDPs), to create a daily reporting policy. However, dynamic programming carries heavy costs: not only does the algorithm require an explicit transition probability matrix, but its high computational storage requirements and inability to operate on continuous MDPs also force the problem to be simplified. The purpose of this work is to introduce deep reinforcement learning as an alternative for solving problems characterized by a complex or continuous MDP. To this end, the author benchmarks the DP-generated policy against one generated via proximal policy optimization. In conclusion, and despite the small number of learning iterations employed, the algorithm showed strong convergence to the optimal policy, allowing the methodology to be applied to the unrestricted problem without resorting to simplifications such as action and state discretization.
publishDate 2019
dc.date.none.fl_str_mv 2019-12-17T12:22:46Z
2019-11-12T00:00:00Z
2019-11-12
2019-10
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/19197
TID:202322521
url http://hdl.handle.net/10071/19197
identifier_str_mv TID:202322521
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833597136907796480