Thread synchronization in SIMD hardware

Bibliographic details
Year of defense: 2013
Main author: Teo Milanez Brandao
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Institution of defense: Universidade Federal de Minas Gerais
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/ESBF-9GXNJJ
Abstract: Performance is constrained by power consumption in modern computer architectures. One way to reduce power consumption, and hence increase performance, is to eliminate redundant operations between assembly instructions. This redundancy elimination, however, is difficult because it involves solving a costly online problem: the shortest common supersequence. Previous work has proposed many different heuristics to solve this problem at either the architecture level or the compiler level. The sheer number of algorithms and the vast search space make comparing them a herculean task. In this dissertation, we take on this task, providing the most extensive comparative analysis of these heuristics in the literature to date. We compare the heuristics along several dimensions, including the amount of thread-level or data-level parallelism that they deliver. Our results show that relatively simple heuristics, such as the so-called MinPcSp, can outperform far more elaborate algorithms. From this comparison we draw insights to design, implement, and test new heuristics for sharing redundant work between parallel threads. Our new algorithms improve on previous work in non-trivial ways. When testing these algorithms on industrial-strength benchmarks, we observed that some of them reduce the number of instructions to be processed by a factor of three.
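
Illustrative note: the shortest common supersequence problem named in the abstract can be sketched on a toy case. The Python snippet below is not taken from the dissertation. It assumes two hypothetical instruction traces (`thread_a`, `thread_b`) that are fully known in advance and computes the length of their shortest common supersequence with a textbook dynamic program. The dissertation concerns online heuristics, where the traces are not known ahead of time, so this offline version only shows what an ideal merge would look like.

```python
def scs_length(a, b):
    """Length of the shortest common supersequence of sequences a and b."""
    m, n = len(a), len(b)
    # dp[i][j] holds the SCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0:
                dp[i][j] = j          # only b's prefix remains
            elif j == 0:
                dp[i][j] = i          # only a's prefix remains
            elif a[i - 1] == b[j - 1]:
                # Same instruction in both traces: issue it once for both.
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Different instructions: issue one of them and keep going.
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


# Hypothetical traces of two threads that diverge on one instruction.
thread_a = ["load", "add", "mul", "store"]
thread_b = ["load", "sub", "mul", "store"]

separate = len(thread_a) + len(thread_b)   # issuing each trace on its own
merged = scs_length(thread_a, thread_b)    # issuing shared instructions once
print(separate, merged)
```

In this toy case the two traces share `load`, `mul`, and `store`, so a merged schedule needs fewer instruction issues than running the traces separately. This is the kind of redundancy the abstract refers to.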