HEP-Frame: an efficient tool for big data applications at the LHC

Bibliographic Details
Main Author: Pereira, André
Publication Date: 2023
Other Authors: Onofre, A., Proença, Alberto José
Format: Article
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: https://hdl.handle.net/1822/87722
Summary: HEP-Frame is a new C++ package designed to efficiently analyse datasets containing a very large number of events, such as those available at the Large Hadron Collider (LHC) at CERN, Geneva. It mainly targets high-performance servers and mini-clusters, and it was designed for natural science researchers, with a user-friendly interface to access structured databases. HEP-Frame automatically evaluates the underlying computing resources and builds an adequate code skeleton when creating a data analysis application. At run-time, HEP-Frame analyses a sequence of datasets, exploiting the available parallelism in the code and hardware resources: it concurrently reads inputs from a user-defined data structure and processes them, following the user-specified sequence of requirements to select relevant data; it manages the efficient execution of that sequence; and it outputs results in user-defined objects (e.g., ROOT structures), stored together with the input dataset used. This paper shows how software development by domain experts can benefit from HEP-Frame, and how it significantly improved the performance of analyses of large datasets produced in proton-proton collisions at the LHC. Two case studies are discussed: the associated production of top quarks together with a Higgs boson ($t\bar{t}H$) at the LHC, and double- and single-top quark production at the high-luminosity phase of the LHC (HL-LHC). Results show that HEP-Frame's awareness of the behaviour and structure of the analysis code, and of the underlying hardware system, provides powerful and transparent parallelization mechanisms that substantially reduce the execution time of data analysis applications.
id RCAP_090fc1476c2a05ff8984088a1a14d778
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/87722
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling Open access funding provided by FCT|FCCN (b-on). The research leading to these results was partially funded by Fundação para a Ciência e a Tecnologia under Grant Agreement No. UIDB/00319/2020.
dc.title.none.fl_str_mv HEP-Frame: an efficient tool for big data applications at the LHC
title HEP-Frame: an efficient tool for big data applications at the LHC
spellingShingle HEP-Frame: an efficient tool for big data applications at the LHC
Pereira, André
Ciências Naturais::Ciências Físicas
Science & Technology
author Pereira, André
author_facet Pereira, André
Onofre, A.
Proença, Alberto José
author_role author
author2 Onofre, A.
Proença, Alberto José
author2_role author
author
dc.contributor.none.fl_str_mv Universidade do Minho
dc.contributor.author.fl_str_mv Pereira, André
Onofre, A.
Proença, Alberto José
dc.subject.por.fl_str_mv Ciências Naturais::Ciências Físicas
Science & Technology
topic Ciências Naturais::Ciências Físicas
Science & Technology
description HEP-Frame is a new C++ package designed to efficiently analyse datasets containing a very large number of events, such as those available at the Large Hadron Collider (LHC) at CERN, Geneva. It mainly targets high-performance servers and mini-clusters, and it was designed for natural science researchers, with a user-friendly interface to access structured databases. HEP-Frame automatically evaluates the underlying computing resources and builds an adequate code skeleton when creating a data analysis application. At run-time, HEP-Frame analyses a sequence of datasets, exploiting the available parallelism in the code and hardware resources: it concurrently reads inputs from a user-defined data structure and processes them, following the user-specified sequence of requirements to select relevant data; it manages the efficient execution of that sequence; and it outputs results in user-defined objects (e.g., ROOT structures), stored together with the input dataset used. This paper shows how software development by domain experts can benefit from HEP-Frame, and how it significantly improved the performance of analyses of large datasets produced in proton-proton collisions at the LHC. Two case studies are discussed: the associated production of top quarks together with a Higgs boson ($t\bar{t}H$) at the LHC, and double- and single-top quark production at the high-luminosity phase of the LHC (HL-LHC). Results show that HEP-Frame's awareness of the behaviour and structure of the analysis code, and of the underlying hardware system, provides powerful and transparent parallelization mechanisms that substantially reduce the execution time of data analysis applications.
publishDate 2023
dc.date.none.fl_str_mv 2023
2023-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/1822/87722
url https://hdl.handle.net/1822/87722
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Pereira, A., Onofre, A. & Proença, A. HEP-Frame: an efficient tool for big data applications at the LHC. Eur. Phys. J. Plus 138, 278 (2023). https://doi.org/10.1140/epjp/s13360-023-03861-1
2190-5444
2190-5444
10.1140/epjp/s13360-023-03861-1
https://link.springer.com/article/10.1140/epjp/s13360-023-03861-1
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Springer Heidelberg
publisher.none.fl_str_mv Springer Heidelberg
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833595085282869248