Use of machine learning for improvements in performance and energy consumption in HPC systems

Bibliographic details
Year of defense: 2020
Main author: Klôh, Vinícius Prata
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Defense institution: Laboratório Nacional de Computação Científica
Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA)
Brazil
LNCC
Programa de Pós-Graduação em Modelagem Computacional
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://tede.lncc.br/handle/tede/324
Abstract: Scientific Computing has been indispensable for advances in several research domains because it offers a large capacity of computational resources that allow the execution of increasingly complex simulations. However, some areas still require greater computational capability to perform increasingly accurate simulations. To meet this demand and make such simulations feasible, High Performance Computing (HPC) systems are constantly evolving. In this sense, the development of new supercomputers and of strategies that enable their development, such as the orchestration of computational resources in a more efficient way in terms of performance and energy efficiency, has been sought. Among the challenges to achieving these advances are the analysis and prediction of runtime and energy consumption for different classes of scientific applications executed on different computational architectures. Therefore, this work proposes a methodology for monitoring and analyzing scientific applications using performance counters, and the use of these counters as real data for Machine Learning (ML) techniques to understand the scientific applications and their performance characteristics regarding the use of computational resources. In addition, ML is used to develop models for the characterization of the scientific applications and predictive models for the runtime and energy consumption of an application executed on different architectures.