Partitioning of deep neural networks with early exits

Bibliographic details
Year of defense: 2020
Main author: Pacheco, Roberto Gonçalves
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: Portuguese
Defense institution: Universidade Federal do Rio de Janeiro
Brazil
Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia
Programa de Pós-Graduação em Engenharia Elétrica
UFRJ
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/11422/23205
Abstract: Deep Neural Networks (DNNs) require high computational power. This power may not be available on end devices, requiring the use of a cloud computing infrastructure. However, sending raw data to the cloud can increase the inference time because of the added communication time. To reduce this time, the first layers of a DNN can be executed on an edge device and the remaining layers in the cloud. Depending on which layers are processed at the edge, this can reduce the amount of data sent, but it can also increase the processing time. As the inference time is composed of the communication and processing times, it is necessary to deal with this trade-off. Partitioning problems address this trade-off by choosing the set of layers to be executed on the edge device so as to minimize the inference time. This dissertation addresses the partitioning of DNNs with early exits. In this kind of DNN, inference can finish at an intermediate layer, depending on the level of uncertainty in the classification of an input sample. Therefore, besides network conditions and cloud and edge hardware, input data characteristics can also influence the partitioning decision. To account for these characteristics, this dissertation models the partitioning problem as a shortest path problem in a graph, which can therefore be solved in polynomial time. This model is used as the basis for proposing the POPEX (Partitioning OPtimization for deep neural networks with Early eXits) system. Moreover, this dissertation evaluates how the DNN model and the input data can affect DNN partitioning with early exits. Regarding the former, it considers the process of DNN calibration; regarding the latter, it considers the effect of image distortion on the partitioning.
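
The shortest-path formulation mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's actual graph model or the POPEX implementation, whose details are not given in this record. It assumes a chain-shaped DNN, early exits evaluated only on the edge device, and per-layer timings and exit probabilities that are known in advance. The function name `best_partition`, its parameters, and all numbers in the usage note are hypothetical.

```python
import heapq


def best_partition(edge_ms, cloud_ms, upload_ms, exit_prob):
    """Pick the split point that minimizes expected inference time.

    Assumed (hypothetical) inputs, one entry per layer i = 0..n-1:
      edge_ms[i]   time to run layer i on the edge device
      cloud_ms[i]  time to run layer i in the cloud
      upload_ms[i] time to upload the input of layer i (i = 0 is the raw input)
      exit_prob[i] probability that a sample reaching the early exit after
                   layer i stops there (0.0 if the layer has no exit)
    Returns (s, cost): the first s layers run on the edge, the rest in the cloud.
    """
    n = len(edge_ms)
    sink = n + 1

    # reach[s] = probability that a sample is still being processed
    # after the first s layers have run on the edge.
    reach = [1.0]
    for p in exit_prob:
        reach.append(reach[-1] * (1.0 - p))

    # Node s means "the first s layers ran on the edge".
    graph = {s: [] for s in range(n + 2)}
    for s in range(n):
        # Keep processing on the edge: run layer s there.
        graph[s].append((s + 1, reach[s] * edge_ms[s]))
        # Split here: upload the input of layer s, finish in the cloud.
        graph[s].append((sink, reach[s] * (upload_ms[s] + sum(cloud_ms[s:]))))
    # Everything ran on the edge, so nothing is uploaded.
    graph[n].append((sink, 0.0))

    # Dijkstra's algorithm from node 0 to the sink (all weights are >= 0).
    dist = {0: 0.0}
    prev = {}
    heap = [(0.0, 0)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        if u == sink:
            break
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    # The node visited just before the sink is the chosen split point.
    return prev[sink], dist[sink]
```

With the made-up inputs `best_partition([4.0, 6.0, 8.0], [1.0, 1.0, 1.0], [30.0, 10.0, 5.0], [0.0, 0.5, 0.0])`, the sketch returns `(2, 13.0)`: two layers run on the edge, and the early exit after the second layer halves the expected upload and cloud cost. Weighting each edge by the probability of reaching it is what lets input data characteristics, through the exit probabilities, shift the chosen split point.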