Machine learning and regression techniques for performance prediction and anomaly detection of heat pumps
TUFA, FRAOL TULU
2024/2025
Abstract
The building sector accounts for a significant share of energy consumption and energy-related emissions, making the adoption of efficient technologies such as heat pumps essential. However, faults within these systems can increase energy consumption and reduce efficiency. The coefficient of performance (COP) is a critical indicator of the efficiency of air-source heat pump systems. This thesis presents two methods for improving the operational efficiency and maintenance strategies of heat pumps: performance prediction models and anomaly detection. Its aim is to evaluate how data processing, machine learning, and regression approaches can be used to predict the COP of heat pumps and to detect anomalies in their operation. The research develops several predictive modeling approaches, including linear regression, exponential regression, and machine learning algorithms such as Random Forest and XGBoost. Performance predictions were based on operational data collected from one hundred air-source heat pumps installed for various end users. The data were retrieved from a PostgreSQL database and subsequently preprocessed to remove unsatisfactory values, and relevant input features were identified during feature selection. This process enhanced the quality and reliability of the datasets for predictive modeling. Spearman and Pearson correlations were used to quantify the relationship between each feature and COP. The accuracy of the models was then evaluated using three key performance indicators. After the predictive modeling phase, residuals were calculated to identify deviations between actual and predicted COP values. The Isolation Forest algorithm and a threshold-based anomaly detection method were developed using the residuals, along with other operational parameters, as features. The performance of these models was evaluated using accuracy, precision, recall, F1-score, and the percentage of values within a ±20% error margin.
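As a rough illustration of the evaluation and residual steps described above, the following sketch computes MAPE, RMSE, and R² for a set of COP predictions and flags points whose residual exceeds a ±20% margin. It is not the thesis implementation: the function names, the sample values, and the choice to express the margin relative to the predicted COP are all assumptions made for this example, and the Isolation Forest step is not shown.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.05 means 5%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same units as COP."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def flag_anomalies(actual, predicted, margin=0.20):
    """Flag points whose residual exceeds the margin.

    Assumption: the margin is taken relative to the predicted COP.
    """
    return [abs(a - p) > margin * abs(p) for a, p in zip(actual, predicted)]


# Hypothetical COP values, invented for illustration only.
actual_cop = [3.0, 4.0, 2.0, 5.0]
predicted_cop = [3.0, 4.0, 3.0, 5.0]

flags = flag_anomalies(actual_cop, predicted_cop)
share_within_margin = flags.count(False) / len(flags)

print("MAPE:", mape(actual_cop, predicted_cop))
print("RMSE:", rmse(actual_cop, predicted_cop))
print("R2:", r_squared(actual_cop, predicted_cop))
print("Anomaly flags:", flags)
print("Share within margin:", share_within_margin)
```

In this toy data only the third point is flagged, because its measured COP of 2.0 lies more than 20% below the predicted 3.0. Expressing the margin relative to the measured value instead would be an equally plausible reading and can change which borderline points are flagged.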
The results indicate that, among all input variables, source entering temperature (SET) is the most significant factor positively affecting COP, while load entering temperature (LET) has the least influence, based on correlation strength. Error analysis and performance evaluation indicate that the combined feature set, which incorporates the part load ratio as both linear and quadratic terms, performs best. The regression models achieved mean absolute percentage error (MAPE), coefficient of determination (R²), and root mean square error (RMSE) values ranging from 0.026 to 0.14, 0.62 to 0.94, and 0.21 to 0.65, respectively, while the machine learning models achieved values ranging from 0.023 to 0.11, 0.65 to 0.95, and 0.18 to 0.56, respectively, across all selected device type codes. Overall, the machine learning models outperformed the empirical regression models. The Isolation Forest detected 44,241 anomalies (5%), while the ±20% threshold-based method identified 72,206 anomalies (8.2%) out of 883,104 total data points across all selected device type codes. In general, the findings highlight the effectiveness of machine learning and the importance of the feature set for heat pump performance prediction, as well as the potential of these models to enhance predictive maintenance and operational efficiency in heat pumps.

| File | Size | Format |
|---|---|---|
| Tufa_Fraol Tulu.pdf (restricted access) | 33.69 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/88937