Parallelization of the Meshless Local Petrov-Galerkin (MLPG) method using graphics processing units (GPUs) and CUDA

Bibliographic details
Year of defense: 2014
Main author: Bruno Carvalho Correa
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Degree-granting institution: Universidade Federal de Minas Gerais
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/RAOA-BEKMG6
Abstract: In this work, a new strategy to parallelize the Meshless Local Petrov-Galerkin (MLPG) method is developed. It is executed on a highly parallel architecture, the well-known graphics processing unit (GPU). Meshless methods are extensively applied nowadays to solve a wide range of problems governed by partial differential equations. Compared with traditional finite element methods, meshless methods are an attractive alternative because they do not require a mesh to solve a physical problem: only a node distribution, a proper description of the boundary of the problem (in practice, a node distribution on the boundary), and the boundary conditions are needed. In this work, the algorithm is adapted to run on the GPU. Many applications are being developed for this architecture to take advantage of its highly parallel nature. Among the available programming models, CUDA (Compute Unified Device Architecture) stands out. CUDA is a scalable parallel architecture developed by NVIDIA that can be programmed in C or via graphics API, so that the GPU can be used as a coprocessor assisting the central processing unit (CPU) as well as a low-cost supercomputer for numerical applications, with surprising ease. The MLPG method is parallelized to execute entirely on the GPU. It was chosen because of its simplicity and because it requires neither a complex geometric representation of the domain nor any synchronization scheme to obtain the global system of equations. To test this approach, it is applied to an electromagnetic problem whose analytical solution exists. The execution times of the GPU and CPU versions are compared. The results obtained in this work with an NVIDIA GeForce GTX 680 show that it is possible to achieve an execution time 20 times smaller than that of the counterpart algorithm on the CPU, with the same precision of results.
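The abstract notes that the MLPG method needs no synchronization scheme to obtain the global system of equations. A minimal sketch of that idea follows, assuming a toy 1D node distribution in which each node builds its own row of the global matrix from the nodes inside its local subdomain. This is not the thesis's implementation: the function names, the support radius, and the linear weight are hypothetical stand-ins (a real MLPG row would come from integrating the local weak form with meshless shape functions), and Python threads stand in for GPU threads.

```python
from concurrent.futures import ThreadPoolExecutor


def assemble_row(i, nodes, radius):
    """Build row i of a toy global matrix using only node i's neighborhood."""
    row = [0.0] * len(nodes)
    weights = {}
    for j, x in enumerate(nodes):
        d = abs(x - nodes[i])
        if d < radius:
            # Hypothetical linear weight, used here only as a placeholder
            # for the integrated local weak form.
            weights[j] = 1.0 - d / radius
    total = sum(weights.values())  # > 0, since node i is its own neighbor
    for j, w in weights.items():
        row[j] = w / total
    return row


def assemble_serial(nodes, radius):
    """Assemble all rows one after another."""
    return [assemble_row(i, nodes, radius) for i in range(len(nodes))]


def assemble_parallel(nodes, radius, workers=4):
    """Assemble rows concurrently; each task writes only its own row,
    so no locks or atomic updates are needed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: assemble_row(i, nodes, radius),
                             range(len(nodes))))


# Assumed example data: 11 equally spaced nodes on [0, 1].
nodes = [k / 10 for k in range(11)]
A_serial = assemble_serial(nodes, radius=0.25)
A_parallel = assemble_parallel(nodes, radius=0.25)
```

Because every row depends only on read-only node data, the serial and concurrent assemblies produce the same matrix. In a CUDA version of this sketch, one thread per node would play the role of each task.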