Machine Learning-based Anomaly Detection for Hydroelectric Power Plants

Purpose - This paper analyzes how anomaly detection can be applied effectively to predictive maintenance in hydroelectric power plants and evaluates how well SHAP values help the user understand the causes of a reported anomaly. Methods - We compared the performance of RRCF, KDE, SOD, ECOD, isolation forest, auto-encoder, DeepSVDD, KitNet, and RSHash on data from a real-world hydroelectric power plant. Because no labels were available and labeling was too expensive, the comparison relied on human evaluation. We then checked whether SHAP, applied to the best-performing model, gives informative and correct indications of the cause of each anomaly. Results - Auto-encoders caught all of the organization’s recorded anomalies and proposed additional ones, which the domain expert later confirmed. SHAP guided the user toward the features related to the anomaly well enough, but it is somewhat too slow for use on streaming data. Implications - From a practical perspective, more efficient maintenance leads to lower operating costs and higher plant reliability. In addition, we used models and algorithms from well-known Python packages, so this work is ready to be applied as a production tool. From a social perspective, reducing costs makes hydroelectric technology more appealing to investors, supporting the transition toward renewable energy.
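To make the idea of label-free, reconstruction-based anomaly scoring concrete, the following is a minimal sketch and not the pipeline used in this work. It assumes NumPy is available, uses synthetic data in place of plant measurements, and substitutes a linear auto-encoder (fitted through SVD, which is equivalent to PCA) for a neural auto-encoder. The per-feature reconstruction error is only a crude stand-in for SHAP, not SHAP itself. All function names, the three "sensors", and the 99% threshold are hypothetical choices made for illustration.

```python
import numpy as np

def fit_linear_autoencoder(X_train, n_components):
    """Fit a linear encoder/decoder via SVD (equivalent to PCA)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + 1e-12  # avoid division by zero
    Z = (X_train - mean) / std
    # Rows of Vt are principal directions; keep the leading ones.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components]

def reconstruction_errors(X, model):
    """Return the total and the per-feature squared reconstruction error."""
    mean, std, W = model
    Z = (X - mean) / std
    Z_hat = (Z @ W.T) @ W  # encode to the bottleneck, then decode
    per_feature = (Z - Z_hat) ** 2
    return per_feature.sum(axis=1), per_feature

rng = np.random.default_rng(0)

# Synthetic stand-in for plant data: three signals driven by one latent load.
load = rng.normal(size=(500, 1))
mixing = np.array([[1.0, 0.8, -0.5]])
X_train = load @ mixing + 0.05 * rng.normal(size=(500, 3))

model = fit_linear_autoencoder(X_train, n_components=1)

# No labels: take the threshold from the training-score distribution.
train_scores, _ = reconstruction_errors(X_train, model)
threshold = np.quantile(train_scores, 0.99)

normal = np.array([[1.0, 0.8, -0.5]])  # consistent with the learned relation
faulty = np.array([[1.0, 0.8, 2.0]])   # third signal breaks the relation
scores, per_feature = reconstruction_errors(np.vstack([normal, faulty]), model)

is_anomaly = scores > threshold
suspect_feature = int(np.argmax(per_feature[1]))  # largest error contribution
print(is_anomaly, suspect_feature)
```

In this toy setting the faulty sample is flagged and the third signal carries the largest share of the error. Part of the error also spills onto the other two signals, which is one reason a dedicated attribution method such as SHAP can be preferable to raw residuals.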
