HEP-Frame: an efficient tool for big data applications at the LHC
| Main Author: | Pereira, André |
|---|---|
| Publication Date: | 2023 |
| Other Authors: | Onofre, A.; Proença, Alberto José |
| Format: | Article |
| Language: | eng |
| Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Download full: | https://hdl.handle.net/1822/87722 |
| Summary: | HEP-Frame is a new C++ package designed to efficiently analyse datasets with a very large number of events, like those available at the Large Hadron Collider (LHC) at CERN, Geneva. It mainly targets high-performance servers and mini-clusters, and it was designed for natural science researchers, with a user-friendly interface to access structured databases. HEP-Frame automatically evaluates the underlying computing resources and builds an adequate code skeleton when creating a data analysis application. At run-time, HEP-Frame analyses a sequence of datasets, exploring the available parallelism in the code and hardware resources: it concurrently reads inputs from a user-defined data structure and processes them, following the user-specific sequence of requirements to select relevant data; it manages the efficient execution of that sequence; and it outputs results in user-defined objects (e.g., ROOT structures), stored together with the used input dataset. This paper shows how software development by domain experts can benefit from HEP-Frame, and how it significantly improved the performance of analyses of large datasets produced in proton-proton collisions at the LHC. Two case studies are discussed: the associated production of top quarks together with a Higgs boson ($t\bar{t}H$) at the LHC, and double- and single-top quark production at the high-luminosity phase of the LHC (HL-LHC). Results show that HEP-Frame's awareness of the analysis code behaviour and structure, and of the underlying hardware system, provides powerful and transparent parallelization mechanisms that largely improve the execution time of data analysis applications. |
| id |
RCAP_090fc1476c2a05ff8984088a1a14d778 |
|---|---|
| oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/87722 |
| network_acronym_str |
RCAP |
| network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str |
https://opendoar.ac.uk/repository/7160 |
| spelling |
HEP-Frame: an efficient tool for big data applications at the LHC; Natural Sciences::Physical Sciences; Science & Technology; Open access funding provided by FCT|FCCN (b-on). The research leading to these results was partially funded by Fundacao para a Ciencia e Tecnologia under Grant Agreement No. UIDB/00319/2020; info:eu-repo/semantics/publishedVersion; Springer Heidelberg; Universidade do Minho; Pereira, André; Onofre, A.; Proença, Alberto José; 2023; 2023-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; https://hdl.handle.net/1822/87722; eng; Pereira, A., Onofre, A. & Proença, A. HEP-Frame: an efficient tool for big data applications at the LHC. Eur. Phys. J. Plus 138, 278 (2023). https://doi.org/10.1140/epjp/s13360-023-03861-1; 2190-5444; 10.1140/epjp/s13360-023-03861-1; https://link.springer.com/article/10.1140/epjp/s13360-023-03861-1; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2024-12-21T01:17:36Z; oai:repositorium.sdum.uminho.pt:1822/87722; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T15:04:49.403847; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
| author |
Pereira, André |
| author_facet |
Pereira, André; Onofre, A.; Proença, Alberto José |
| author_role |
author |
| author2 |
Onofre, A.; Proença, Alberto José |
| author2_role |
author author |
| dc.contributor.none.fl_str_mv |
Universidade do Minho |
| dc.contributor.author.fl_str_mv |
Pereira, André; Onofre, A.; Proença, Alberto José |
| dc.subject.por.fl_str_mv |
Natural Sciences::Physical Sciences; Science & Technology |
| publishDate |
2023 |
| dc.date.none.fl_str_mv |
2023 2023-01-01T00:00:00Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
| format |
article |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1822/87722 |
| dc.language.iso.fl_str_mv |
eng |
| dc.relation.none.fl_str_mv |
Pereira, A., Onofre, A. & Proença, A. HEP-Frame: an efficient tool for big data applications at the LHC. Eur. Phys. J. Plus 138, 278 (2023). https://doi.org/10.1140/epjp/s13360-023-03861-1; ISSN 2190-5444; DOI 10.1140/epjp/s13360-023-03861-1; https://link.springer.com/article/10.1140/epjp/s13360-023-03861-1 |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Springer Heidelberg |
| dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
| instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str |
RCAAP |
| institution |
RCAAP |
| reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv |
info@rcaap.pt |
| _version_ |
1833595085282869248 |