Water demand modeling using machine learning techniques

Bibliographic details
Year of defense: 2019
Main author: Carvalho, Taís Maria Nunes
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://www.repositorio.ufc.br/handle/riufc/40462
Abstract: Water demand forecasting is fundamental to decisions related to long-term water resources management. However, the spatial variability of water consumption can make prediction difficult. The main purpose of the current study is to investigate how socioeconomic aspects of households affect future urban water consumption. Prior to designing the prediction model, a significant subset of explanatory variables had to be chosen to improve performance and accuracy. Therefore, several filter and wrapper variable selection methods in Partial Least Squares Regression (PLSR) were tested, along with a classification based on Random Forests (RF). The feature subsets were used as input for a predictive model. Two machine learning techniques were tested: RF and Artificial Neural Network (ANN). Model performance was evaluated through the Nash-Sutcliffe coefficient, Root Mean Square Error (RMSE) and Pearson correlation. The dataset consisted of 2010 water consumption and Census data associated with 182 Human Development Units (HDU) in Fortaleza, Ceará. Variable importance in projection (VIP), regularized elimination procedure (REP-PLS) and RF provided the variable subsets that led to the best prediction performance among the seven selection methods. Life expectancy at birth, per capita income and residents with primary and secondary education were considered important variables in most of the feature subsets. According to the performance assessment, ANN and PLSR provided similar performances and better estimates than RF in predicting water demand. RMSE was 25.779 liters/person/day (Lpd-1) for the best PLSR model and 24.776 Lpd-1 for ANN, while for RF it was 31.820 Lpd-1.
Socioeconomic variables had a strong influence on water consumption, especially per capita income and education. Although frequently used for short-term forecasting, ANN proved to be a good approach for long-term water demand prediction. The proposed approach of designing a spatial water consumption model can be extended to other metropolitan regions and different datasets.
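
The three performance measures named in the abstract (Nash-Sutcliffe coefficient, RMSE and Pearson correlation) can be illustrated with a minimal sketch. This is not the author's code: the function names and the sample values below are hypothetical, chosen only to show how each metric is computed from observed and predicted per capita consumption using the standard textbook definitions.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error, in the same units as the data."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_obs * var_pred)

# Hypothetical values (liters/person/day), for illustration only.
observed = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 210.0]

print("RMSE:", rmse(observed, predicted))
print("Nash-Sutcliffe:", nash_sutcliffe(observed, predicted))
print("Pearson r:", pearson_r(observed, predicted))
```

RMSE is expressed in the units of the target variable, which is why the abstract reports it in liters/person/day, while the Nash-Sutcliffe coefficient and Pearson correlation are dimensionless.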