Bibliographic details
Year of defense: |
2014 |
Main author: |
Raeder, Mateus
 |
Advisor: |
Fernandes, Luiz Gustavo Leão
 |
Defense committee: |
Not provided by the institution |
Document type: |
Thesis
|
Access type: |
Open access |
Language: |
por |
Defending institution: |
Pontifícia Universidade Católica do Rio Grande do Sul
|
Graduate program: |
Programa de Pós-Graduação em Ciência da Computação
|
Department: |
Faculdade de Informática
|
Country: |
Brazil
|
Keywords in Portuguese: |
|
CNPq knowledge area: |
|
Access link: |
http://tede2.pucrs.br/tede2/handle/tede/7390
|
Abstract: |
In recent years, technological advances have provided machines with different levels of parallelism, with a great impact on high-performance computing. These advances have allowed developers to further improve the performance of large-scale applications. In this context, clusters of multiprocessor machines with Non-Uniform Memory Access (NUMA) are a trend in parallel processing. In NUMA architectures, the access time to data depends on where the data is placed in memory. For this reason, managing data location is essential on this type of machine. In this scenario, software developed for a cluster of NUMA machines must exploit both the internode part (multicomputer, with distributed memory) and the intranode part (multiprocessor, with shared memory) of the architecture. This type of hybrid programming takes advantage of all the features provided by NUMA architectures. However, rewriting a sequential application so that it correctly exploits the parallelism of the environment is not a trivial task, although it can be facilitated by an automated process. To this end, our work presents an automatic parallel code generation process for hybrid architectures. With the proposed approach, users do not need to know the low-level routines of parallel programming libraries. We developed a graphical tool in which users can dynamically and intuitively create their parallel models. It is thus possible to create parallel programs without being familiar with the libraries commonly used by high-performance computing professionals (such as MPI). Using the tool, the user draws a directed graph to indicate the number of processes (the nodes of the graph) and the communication between them (the edges). From this drawing, the user inserts the sequential code of each process defined in the graphical interface, and the tool automatically generates the corresponding parallel code.
Moreover, weight-based process mappings and memory mappings were defined and tested on a cluster of NUMA machines, as well as a hybrid mapping. The tool was developed in Java and generates parallel C++ code with MPI; it also applies memory affinity policies for NUMA machines through the Memory Affinity Interface (MAI) library. Some applications were developed with and without our model. The results indicate that the proposed mapping is valid, providing performance gains over the sequential versions and behaving very similarly to traditional parallel implementations. |