Design exploration of machine learning data-flows onto heterogeneous reconfigurable hardware

Detalhes bibliográficos
Ano de defesa: 2023
Autor(a) principal: Oliveira, Westerley Carvalho
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: eng
Instituição de defesa: Universidade Federal de Viçosa
Ciência da Computação
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: https://locus.ufv.br//handle/123456789/31709
https://doi.org/10.47328/ufvbbt.2023.570
Resumo: This work explores the placement and routing of Machine Learning applications data- flow graphs on different heterogeneous Coarse-Grained Reconfigurable Architectures (CGRA). We analyze three different types of processing element (PE) heterogeneity, the first concerning the interconnection pattern, the second being on the kind of ope- rations a single PE can execute, and the last concerning the PE buffer resources. This analysis aims to propose a fair reduction to the overall cost in comparison to the ho- mogeneous CGRA architecture. We compare our results with the homogeneous case and one of the state-of-the-art tools for placement and routing (P&R). Our algorithm executed, on average, 52% faster than VPR 8.1 (Versatile Place and Route), which is an open-source academic tool designed for the FPGA placement and routing pha- ses, reaching better mapping in 66% of cases and achieving the same results in 26% of cases. Furthermore, a heterogeneous architecture reduces the cost without losing performance in 76% of the cases considering multiplier heterogeneity. We propose a novel heterogeneous buffer architecture that minimizes the buffer resources by 56.3% for K-means dataflow patterns. We also show that a heterogeneous border chess archi- tecture outperforms a homogeneous one. In addition, our mapping reaches optimal instances of single tree dataflows compared to classical Lee/Choi and H-Trees. Keywords: Reconfigurable architecture. CGRAs. Placement. Routing.