A task-based and data-flow implementation of the Lattice-Boltzmann method
| Year of defense: | 2018 |
|---|---|
| Main author: | |
| Advisor: | |
| Defense committee: | |
| Document type: | Master's dissertation (Dissertação) |
| Access type: | Open access |
| Language: | Portuguese (por) |
| Defending institution: | Universidade Federal de Santa Maria, Centro de Tecnologia, Programa de Pós-Graduação em Ciência da Computação (UFSM), Brasil |
| Graduate program: | Not informed by the institution |
| Department: | Not informed by the institution |
| Country: | Not informed by the institution |
| Keywords in Portuguese: | |
| Access link: | http://repositorio.ufsm.br/handle/1/14872 |
Abstract: Lattice-Boltzmann is an iterative numerical method for mesoscopic modeling and simulation of fluid flow dynamics. The method simulates the discrete properties of physical systems and, for simulations to complete in an acceptable time, demands substantial computational power. Several studies in the literature are dedicated to parallelizing and evaluating the performance of the Lattice-Boltzmann method applied to a variety of problems, using high-performance computing architectures that range from shared-, distributed-, and hybrid-memory systems to accelerators such as GPUs and the Xeon Phi.

In today's scenario, where processor manufacturers practically double the number of transistors on a single chip with each new generation and dedicate them to core replication, efficiently exploiting the growing parallelism offered especially by shared-memory high-performance architectures depends on adopting parallelization techniques that optimize the execution of applications on them. One of the most popular techniques on this type of architecture is the parallelization of loop iterations using the OpenMP API. Despite providing significant performance gains, the parallelism offered by these architectures is not always fully exploited, and techniques such as task parallelism can be used to improve that exploitation. Although task parallelism has been supported since version 3.0 of the OpenMP API, the concept of data dependencies between tasks was only introduced in version 4.0. By specifying data dependencies, it is possible to add constraints to task scheduling so that the execution order is determined by the read and write operations each task performs on memory addresses, thus avoiding inconsistencies in the parallel execution of tasks.
Because Lattice-Boltzmann is an iterative method, its task-based parallelization requires that the tasks be synchronized at each new iteration, which can be achieved by specifying the data dependencies of each task. The objective of this work is therefore to present and evaluate the performance of a task-based, data-flow implementation of the Lattice-Boltzmann method using OpenMP tasks with dependencies. Experiments performed on a NUMA shared-memory architecture with 48 processing cores show that the task-based implementation with dependencies performed up to 22.49% better than an implementation based on loop-level parallelism.