Bibliographic details

Year of defense: 2021
Main author: Henrique Rennó de Azeredo Freitas
Advisor: Celso Luiz Mendes
Defense committee: Stephan Stephany, João Ricardo de Freitas Oliveira, Walter Collischonn, Daniel Andrés Rodriguez
Document type: Thesis
Access type: Open access
Language: English (eng)
Defense institution: Instituto Nacional de Pesquisas Espaciais (INPE)
Graduate program: Programa de Pós-Graduação do INPE em Computação Aplicada
Department: Not informed by the institution
Country: BR
Access link: http://urlib.net/sid.inpe.br/mtc-m21c/2021/02.17.20.54
Abstract:
Large-scale hydrological models are extensively used for understanding watershed processes, with applications in water resources, climate change, land use, and forecast systems. The quality of the hydrological results mainly depends on calibrating the optimal sets of watershed parameters, a time-consuming task that requires repeated hydrological model simulations. The ever-growing availability of hydrometeorological data from extensive regions also increases the demand for computational resources. The performance of optimization methods in hydrological applications has been continuously addressed. However, improving the performance of an application on a modern computer requires a detailed investigation of the interaction between the application and the underlying system, in order to find the techniques that provide the best performance improvements. This thesis aims at performance optimizations of the well-established MGB hydrological model (simulation) and the MOCOM-UA method (calibration) for real-world input datasets, the Purus (Brazil) and Niger (Africa) watersheds. The optimization strategies investigated in this thesis target state-of-the-art CPU and GPU systems by exploiting techniques that include AVX-512 vectorization and multi-core (CPU) and many-core (GPU) parallelism, to increase the usefulness of both simulation and calibration with the MGB model. Significant speedups of up to 20× were achieved on the CPU with the proposed optimizations, while the roofline analysis confirmed that the CPU and GPU optimizations exploited the hardware resources more effectively and improved the overall performance of the MGB model. An additional scalability analysis using a mini-app of the MGB model indicated that speedups of up to 24× (CPU) and 65× (GPU) can be achieved for larger problem sizes.
Moreover, the accuracy of the simulated results between the non-optimized and optimized implementations was quantitatively evaluated, reaching maximum relative errors of approximately 6% for discharges and objective functions. The investigated techniques applied to the MGB model are also valid for other scientific applications where a few key parts dominate the execution time when processing a large amount of data. Carefully employing these techniques to optimize such parts may significantly enhance the overall application performance on current CPUs and GPUs.