Tags: Classification, Decision Tree, Deterioration, Machine Learning, Predictive Modelling, Random Forest, Water Main
Abstract:
Many water utilities are struggling to manage their aging infrastructure. Water mains are a key component of water systems, as they convey drinking water to billions of end-users worldwide. However, because they are usually buried underground, their visual inspection and condition assessment can be cumbersome. Furthermore, water main failure can create significant challenges for utilities and end-users, such as service interruption, capacity reduction, and high replacement and rehabilitation costs. Accordingly, various researchers have sought to develop statistical methods to predict water main condition. Previous studies have developed models for single systems, applying a range of statistical and machine learning methods, from linear regression to artificial neural networks. The objective of the present study is to compare the applicability and accuracy of several machine learning algorithms: Linear Regression to predict the number of years to the first failure, and Random Forest, Logistic Regression, and Decision Tree classifiers to predict whether a pipe will fail. Data were collected from two Canadian municipalities (Saskatoon, Saskatchewan, and Waterloo, Ontario). The features considered include diameter, age, material, and the number of previous failures. The results show moderate to high accuracy for the classification models, although in some cases model performance is relatively low. Thus, deeper data mining approaches with a greater focus on the most influential attributes would increase the reliability of the models.
Comparison of Machine Learning Classifiers for Predicting Water Main Failure
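
The classifier comparison described in the abstract can be illustrated with a minimal sketch. It is not the study's code: the data are synthetic and generated by an invented rule, and the function names (`make_synthetic_pipes`, `compare_classifiers`), the material encoding, the hyperparameters, and the use of scikit-learn are all assumptions made for illustration. Only the feature names (diameter, age, material, number of previous failures) and the three classifier types come from the abstract. The Linear Regression model for years to first failure is left out.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_pipes(n=1000, seed=0):
    """Generate a hypothetical pipe inventory (NOT the Saskatoon/Waterloo data)."""
    rng = np.random.default_rng(seed)
    diameter_mm = rng.choice([100, 150, 200, 300], size=n)
    age_years = rng.uniform(1, 100, size=n)
    material = rng.integers(0, 3, size=n)  # three made-up material classes
    previous_failures = rng.poisson(1.0, size=n)

    # One-hot encode the categorical material feature.
    material_onehot = np.eye(3)[material]
    X = np.column_stack(
        [diameter_mm, age_years, previous_failures, material_onehot]
    )

    # Invented failure rule plus noise, used only to produce labels.
    score = 0.04 * age_years + 0.8 * previous_failures - 0.005 * diameter_mm
    y = (score + rng.normal(0.0, 1.0, size=n) > 2.5).astype(int)
    return X, y


def compare_classifiers(X, y, seed=0):
    """Fit the three classifier types and return hold-out accuracy for each."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    models = {
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
        # Scaling helps the logistic regression solver converge.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = accuracy_score(y_test, model.predict(X_test))
    return results


if __name__ == "__main__":
    X, y = make_synthetic_pipes()
    for name, acc in compare_classifiers(X, y).items():
        print(f"{name}: {acc:.3f}")
```

Accuracy on a single hold-out split is the simplest comparison. With imbalanced failure records, metrics such as precision, recall, or cross-validated scores would likely be more informative, and the numbers printed here say nothing about the results reported in the study.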