Escalonador adaptativo para laços paralelos em processadores multinúcleo assimétricos

Detalhes bibliográficos
Ano de defesa: 2020
Autor(a) principal: Trindade, Rafael Gauna
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: por
Instituição de defesa: Universidade Federal de Santa Maria
Brasil
Ciência da Computação
UFSM
Programa de Pós-Graduação em Ciência da Computação
Centro de Tecnologia
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://repositorio.ufsm.br/handle/1/22265
Resumo: The growing demand for computing power and energy efficiency in mobile computing has triggered the development of heterogeneous processors with specialized cores for different types of computational tasks, such as ARM big.LITTLE processors, which have different cores that combine performance with low energy consumption. Such difference in the composition of the cores in this kind of processors ends up inducing an asymmetry in the computational performance of these systems, making complicated the task of predicting the behavior of parallel applications in relation to performance when using all their cores. This asymmetry can be detected in applications that use parallel loops, a parallel programming feature that allows to divide the workload of an iterative routine between the cores present in a processor. Parallel loop schedulers that are not designed to prevent loss of performance in asymmetric multi-core processors (AMPs) can compromise the implementation of software solutions designed for this type of architecture. This dissertation presents the implementation proposal of a scheduler for parallel loops that uses an adaptive algorithm to distribute the workload among the threads, aiming at a better extraction of performance in AMPs. The scheduler uses of parallel work-stealing and lock-free as possible sequential extraction of work to face other existing solutions. Its implementation was carried out in the C ++ language, with the possibility of portability to the C language. In order to evaluate the performance of the solution, an analysis was performed on the set of NAS benchmarks and four distinct well-established scientific applications in related literature, over two real asymmetric embedded environments, against two existing solutions (OpenMP and Intel TBB). The analysis shows that the scheduler is able to extract more performance in certain cases and is very close to the best solutions in most of the remaining cases, with greater scalability potential in theory for cases where the scheduling overhead becomes an obstacle in the other solutions.