Pakistan Journal of Geology (PJG)

PERFORMANCE OF MACHINE LEARNING MODELS FOR PREDICTING VOLUME OF WATER CONSUMED BY POOR URBAN HOUSEHOLDS WHERE THERE IS NO WATER DISTRIBUTION NETWORK

April 9, 2025

ABSTRACT

Journal: Pakistan Journal of Geology (PJG)
Authors: Taiwo, Tolu A.; Olusina, J.O.; Hamid-Mosaku, A.I.; Abiodun, O.E.

This is an open access journal distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

DOI: 10.26480/pjg.01.2025.26.33

Several studies have applied various techniques to model and predict water consumption in urban areas served by a water distribution network (WDN). This study examines the performance of machine learning models for predicting the volume of water consumed by poor urban households where there is no WDN. Historical data on daily water consumption were gathered through questionnaires and integrated with socioeconomic, weather, property and geospatial data. Pearson correlation was applied to the combined dataset to select a small set of features that correlate with the target variable. The selected features were fed into four predictive models: Multilinear Regression (MLR), Random Forest (RF), Support Vector Regression (SVR) and Artificial Neural Networks (ANN). Three metrics, Mean Absolute Error (MAE), Root Mean Squared Error (RMSE) and the R-squared (R²) score, were used to measure model performance. The models were validated with a dataset collected where there is a WDN. All four models performed very well during training: MLR, RF, SVR and ANN produced RMSE values of 110, 83, 98 and 97 litres and R² scores of 53%, 73%, 52% and 63%, respectively. A significance test on the results at the 95% confidence level shows no significant difference between model performance where there is a WDN and where there is none, which also confirms the validity of the dataset collected where there is no WDN.
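
For readers unfamiliar with the feature-selection step and the three metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' code: the abstract does not state the correlation cutoff used, so the `threshold` parameter, the feature names and all numbers below are hypothetical and serve only to illustrate how Pearson correlation filtering, MAE, RMSE and R² are computed.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def select_features(features, target, threshold=0.5):
    """Keep feature names whose |r| with the target meets the threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    return [name for name, column in features.items()
            if abs(pearson_r(column, target)) >= threshold]

def mae(y_true, y_pred):
    """Mean Absolute Error, in the units of the target (here litres)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error, in the units of the target (here litres)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: daily litres consumed and three candidate features.
litres = [100, 150, 200, 250, 300]
candidate_features = {
    "household_size": [2, 3, 4, 5, 6],
    "constant_flag": [1, 1, 1, 1, 1],
    "unrelated_value": [3, 1, 4, 1, 5],
}
selected = select_features(candidate_features, litres)

# Hypothetical predictions from some fitted model, scored with the three metrics.
predicted = [110, 140, 200, 260, 290]
scores = {
    "MAE": mae(litres, predicted),
    "RMSE": rmse(litres, predicted),
    "R2": r2_score(litres, predicted),
}
```

MAE and RMSE are expressed in litres, which is why the abstract reports RMSE in litres, while R² is unitless and is reported there as a percentage.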

Pages 26-33
Year 2025
Issue 1
Volume 9