Performance Comparison of Speculative Taskloop and OpenMP-for-Loop Thread-Level Speculation on Hardware Transactional Memory
Main Author: | Salamanca, Juan [UNESP] |
---|---|
Publication Date: | 2022 |
Format: | Conference object |
Language: | eng |
Source: | Repositório Institucional da UNESP |
Download full: | http://dx.doi.org/10.1109/ISPDC55340.2022.00021 http://hdl.handle.net/11449/247961 |
Summary: | Speculative Taskloop (STL) is a loop parallelization technique that combines Task-based Parallelism and Thread-Level Speculation (TLS) to speed up loops with may (possible) loop-carried dependencies, which were previously difficult for compilers to parallelize. Previous studies show that STL is efficient when implemented on Hardware Transactional Memory (HTM) and that it has advantages over a typical DOACROSS technique such as OpenMP ordered. This paper compares the performance of STL with FOR-TLS, a previously proposed technique that implements TLS in the OpenMP for worksharing construct, over a set of loops from the cbench and SPEC2006 benchmarks. The results give insight into when each technique is more appropriate, depending on the characteristics of the evaluated loop. With both techniques implemented on top of HTM, speed-ups of up to 2.41× are obtained for STL and up to 2× for FOR-TLS. |
id |
UNSP_f939fefb824b6b7e659714b356b9d20c |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/247961 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
dc.title.none.fl_str_mv |
Performance Comparison of Speculative Taskloop and OpenMP-for-Loop Thread-Level Speculation on Hardware Transactional Memory |
author |
Salamanca, Juan [UNESP] |
author_role |
author |
dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (UNESP) |
dc.contributor.author.fl_str_mv |
Salamanca, Juan [UNESP] |
dc.subject.por.fl_str_mv |
Hardware Transactional Memory; OpenMP; parallel for; taskloop; Thread-Level Speculation |
dc.date.none.fl_str_mv |
2022-01-01 2023-07-29T13:30:38Z 2023-07-29T13:30:38Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1109/ISPDC55340.2022.00021; Proceedings - 2022 21st International Symposium on Parallel and Distributed Computing, ISPDC 2022, p. 83-90; http://hdl.handle.net/11449/247961; 10.1109/ISPDC55340.2022.00021; 2-s2.0-85142857267 |
dc.language.iso.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
Proceedings - 2022 21st International Symposium on Parallel and Distributed Computing, ISPDC 2022 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
dc.format.none.fl_str_mv |
83-90 |
dc.source.none.fl_str_mv |
Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
repositoriounesp@unesp.br |
_version_ |
1834483512189648896 |