Bibliographic details
Year of defense: 2021
Main author: Miletto, Marcelo Cogo
Advisor: Schnorr, Lucas Mello
Defense committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Keywords in English:
Access link: http://hdl.handle.net/10183/221594
Abstract:
Parallel application performance analysis is an essential and continuous step towards understanding and optimizing any high-performance program. Nowadays, ubiquitous and complex heterogeneous architectures make this job even more burdensome. While paradigms such as task-based programming ease development through their abstractions and runtime systems, the analysis of such applications demands attention because of the paradigm's specific view of the application. Likewise, the analysis of irregular applications built upon specific data structures needs to consider their abstractions and behavior to improve and facilitate an analyst’s work. Thus, the current work proposes strategies to enhance the performance analysis of irregular task-based applications and proposes application-centric visualization panels that represent performance according to the elimination tree structure, the foundation of many direct sparse factorization methods. The strategies rely on tracing information to collect task performance data. Since task-based applications can create many tasks and huge trace files, the proposed automatic mechanism for anomalous task classification, based on regression models, highlights specific groups of problematic tasks and guides the analysis process. The visualization techniques represent the tree structure and describe application-specific concepts such as tree and node parallelism, child and parent dependencies, and communications. These strategies are applied to the qr_mumps sparse task-based solver in an extensive set of experiments. The anomaly detection mechanism exposed four different sources of task anomalies, guiding a solution that improved performance by up to 24% by reducing task interference. The elimination tree visualization panels allowed detailed comparisons between different application and runtime configurations, revealing other sources of inefficiency. The experiments also involved testing qr_mumps in a real computational simulation application, where it performed better than other parallel solvers. The results demonstrate the usefulness of the proposed strategies in guiding the performance analysis of irregular task-based applications and enhancing the performance representation of elimination-tree-based applications.