Transparent optimization of OpenMP applications via thread throttling and boosting techniques

Bibliographic Details
Main Author: Marques, Sandro Matheus Vila Nova
Publication Date: 2021
Format: Bachelor thesis
Language: eng
Source: Repositório Institucional da UNIPAMPA
Download full: http://dspace.unipampa.edu.br:8080/jspui/handle/riu/5612
Summary: The growing number of cores in modern multicore architectures has brought with it the need for better use of hardware resources. Consequently, two techniques have become widely used to optimize the performance and energy consumption of these environments: Dynamic Concurrency Throttling (DCT) and Boosting. DCT adjusts the number of threads in parallel regions to minimize the effects of intrinsic application characteristics that affect performance and energy consumption (e.g., data synchronization and communication). Boosting techniques, in turn, aim to keep performance at its maximum level during all phases of the application by increasing processor frequencies while respecting the Thermal Design Power (TDP). One of the main challenges is that each region of a parallel application can behave differently (e.g., in memory access behavior and degree of parallelism), which makes combining the two techniques far from straightforward. Choosing the wrong number of threads, or enabling or disabling boosting frequencies in the wrong phases, can increase energy consumption and degrade performance. To solve this problem, this work presents two strategies that apply DCT and Boosting to improve the trade-off between performance and energy consumption, represented by the energy-delay product (EDP): PFG, which optimizes each region of a given application individually; and PCG, which considers the combination of parallel and sequential regions during optimization. Both strategies are transparent, automatic, and deeply integrated into the OpenMP parallel programming interface, so no code modification or recompilation is necessary. Across twelve well-known benchmarks executed on three multicore systems, PFG and PCG improve EDP by up to 95.3% and 95.5%, respectively, compared to standard OpenMP execution, by 90.9% and 94.8% on Varuna-PM, and by 80.5% and 83.7% against the Core Packing technique. We also show that PFG is more suitable for applications with high variability in the CPU workload, while PCG is better when workload variability is low.
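The summary's optimization target, the energy-delay product, is energy consumed multiplied by execution time, so a lower value is better. The sketch below is only an illustration of that metric, not the thesis's PFG or PCG implementation: the `Sample` type, the helper names, and every measurement value are hypothetical and made up for the example. It assumes that time and energy have already been measured for each candidate combination of thread count and boosting state in one region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One hypothetical measurement of a region under a given configuration."""
    threads: int
    boost: bool
    seconds: float  # execution time (delay)
    joules: float   # energy consumed

    @property
    def edp(self) -> float:
        # Energy-delay product: energy multiplied by execution time.
        return self.joules * self.seconds


def best_config(samples):
    """Return the sample with the lowest EDP."""
    return min(samples, key=lambda s: s.edp)


def edp_improvement(baseline, candidate):
    """Percentage reduction in EDP of candidate relative to baseline."""
    return 100.0 * (1.0 - candidate.edp / baseline.edp)


# Made-up numbers, for illustration only (not results from the thesis).
samples = [
    Sample(threads=8, boost=True, seconds=10.0, joules=1000.0),
    Sample(threads=4, boost=True, seconds=8.0, joules=700.0),
    Sample(threads=4, boost=False, seconds=9.0, joules=600.0),
]

baseline = samples[0]  # stands in for a default run with all threads
best = best_config(samples)
print(best.threads, best.boost, f"{edp_improvement(baseline, best):.1f}%")
```

With these made-up values the lowest-EDP configuration is neither the fastest nor the one with the most threads, which is the kind of trade-off the summary describes when it warns against picking the wrong thread count or boosting state.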
id UNIP_42d3db9db669b9a9d480b3868bb4e711
oai_identifier_str oai:repositorio.unipampa.edu.br:riu/5612
network_acronym_str UNIP
network_name_str Repositório Institucional da UNIPAMPA
repository_id_str
dc.title.none.fl_str_mv Transparent optimization of OpenMP applications via thread throttling and boosting techniques
author Marques, Sandro Matheus Vila Nova
author_facet Marques, Sandro Matheus Vila Nova
author_role author
dc.contributor.none.fl_str_mv Lorenzon, Arthur Francisco
dc.contributor.author.fl_str_mv Marques, Sandro Matheus Vila Nova
dc.subject.por.fl_str_mv Engenharia de software
Computação de alto desempenho
Programação paralela (Computação)
Software engineering
High performance computing
Parallel programming (Computer science)
CNPQ::CIENCIAS EXATAS E DA TERRA
publishDate 2021
dc.date.none.fl_str_mv 2021-05-31T15:18:00Z
2021-05-28
2021-05-31T15:18:00Z
2021-05-07
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/bachelorThesis
format bachelorThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv MARQUES, Sandro Matheus Vila Nova. Transparent optimization of OpenMP applications via thread throttling and boosting techniques. Advisor: Arthur Francisco Lorenzon. 2021. 64 p. Undergraduate thesis (Bachelor of Software Engineering) - Universidade Federal do Pampa, Software Engineering program, Alegrete, 2021.
http://dspace.unipampa.edu.br:8080/jspui/handle/riu/5612
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal do Pampa
UNIPAMPA
Brasil
Campus Alegrete
dc.source.none.fl_str_mv reponame:Repositório Institucional da UNIPAMPA
instname:Universidade Federal do Pampa (UNIPAMPA)
instacron:UNIPAMPA
instname_str Universidade Federal do Pampa (UNIPAMPA)
instacron_str UNIPAMPA
institution UNIPAMPA
reponame_str Repositório Institucional da UNIPAMPA
collection Repositório Institucional da UNIPAMPA
repository.name.fl_str_mv Repositório Institucional da UNIPAMPA - Universidade Federal do Pampa (UNIPAMPA)
repository.mail.fl_str_mv sisbi@unipampa.edu.br
_version_ 1842255704374640640