BatchKeeper: batch processing with execution workflow control
Year of defense: | 2020 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Dissertation |
Access type: | Open access |
Language: | por |
Defense institution: | Universidade Federal de Minas Gerais; Brazil; ICEX - INSTITUTO DE CIÊNCIAS EXATAS; Programa de Pós-Graduação em Ciência da Computação; UFMG |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/1843/43124 |
Abstract: | Batch processing has been used for several decades for scientific computing by companies and governments. A batch job is often just one component of a larger business process, and in that context visibility into and control over the computation are important. Both can be achieved by integrating batch processing systems with monitoring frameworks and a programmable execution workflow, which can reduce costs, human intervention, and delays, all of which are key for businesses. Furthermore, batches can have different priorities or deadlines, which requires advanced prediction and resource scheduling. In this work, we design and evaluate BatchKeeper, a batch processing system that supports automated control of processing through a workflow module, scheduling with support for deadlines, and collection of monitoring information from multiple sources. We used historical batch execution data from a financial company to evaluate the batch execution time estimator. The estimator uses regression to find a statistical distribution that approximates the historical task runtimes of each batch, then uses a configurable percentile of that runtime distribution to estimate the finish time of executing batches. The estimator correctly predicts whether an execution will finish before its deadline for 66% of the batches, even when the machines are shared with other concurrent tasks, which leads to long, highly variable runtimes. BatchKeeper is a generic solution applicable in several scenarios. We applied it in a company that executes hundreds of batches each month and had a high rate of human intervention on batches with long runtimes. We show that BatchKeeper was able to automate several tasks and reduce human intervention. |
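
The percentile-based estimate described in the abstract can be illustrated with a minimal sketch. The abstract does not say which distribution family or regression procedure BatchKeeper uses, so this example makes its own assumptions: it fits a log-normal distribution by taking the mean and standard deviation of the log runtimes (a simple moment fit, not the regression used in the dissertation), and it treats each batch as a single runtime rather than a set of tasks. All function names and the sample data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(runtimes_s):
    """Fit a normal distribution to the log of the historical runtimes.

    Assumption: runtimes are roughly log-normal. Needs at least two
    distinct positive values so the standard deviation is non-zero.
    """
    logs = [math.log(r) for r in runtimes_s]
    return NormalDist(mean(logs), stdev(logs))


def runtime_at_percentile(runtimes_s, percentile):
    """Runtime in seconds that the fitted model does not exceed with
    probability `percentile` (strictly between 0 and 1)."""
    dist = fit_lognormal(runtimes_s)
    return math.exp(dist.inv_cdf(percentile))


def will_meet_deadline(start_s, deadline_s, runtimes_s, percentile=0.9):
    """Predict whether a batch started at `start_s` finishes by `deadline_s`.

    A higher percentile gives a more conservative (later) finish estimate.
    """
    estimated_finish = start_s + runtime_at_percentile(runtimes_s, percentile)
    return estimated_finish <= deadline_s


if __name__ == "__main__":
    history = [100.0, 120.0, 90.0, 150.0, 110.0]  # hypothetical runtimes (s)
    print(runtime_at_percentile(history, 0.5))
    print(runtime_at_percentile(history, 0.9))
    print(will_meet_deadline(0.0, 10_000.0, history))
```

Making the percentile configurable is the design choice the abstract highlights: a value near 0.5 tracks the typical runtime, while a value such as 0.9 or 0.95 trades earlier warnings for more false alarms on batches that would have finished in time.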