Unsupervised Anomaly Detection on Air Quality Sensor Data
LOPEZ, FRANCESCA
2024/2025
Abstract
Indoor air quality in public buildings is increasingly monitored through multi-parameter sensors, but their measurements are often affected by drift, signal transmission errors and spurious artefacts. These distortions compromise data reliability and limit the effectiveness of large-scale environmental monitoring programmes. This thesis addresses the problem of detecting such anomalies in an entirely unsupervised setting, using a real deployment of eight air quality devices installed in school buildings across the Campania region (Italy), with a particular focus on Radon, a critical pollutant for indoor health risk assessment. We design a complete spatio–temporal anomaly detection framework based on deep autoencoders. The methodological contributions include: (i) a unified preprocessing pipeline for alignment, interpolation and scaling of heterogeneous sensor data; (ii) window-based reconstruction using four families of autoencoders; (iii) a quantile-based anomaly scoring scheme whose thresholds are derived from training data; and (iv) a physically informed Radon constraint implemented through a Controlled Mechanical Ventilation (CMV) gating mechanism that stabilises Radon reconstructions according to ventilation conditions. Particular emphasis is placed on two architectures: a 2D Convolutional Autoencoder (Conv2D-AE) modelling joint time–sensor patterns, and a Graph Convolutional Autoencoder (GCN-AE) exploiting school-level topology. Experimental results show that the Conv2D-AE achieves the best reconstruction accuracy, reducing the global RMSE to 0.057 and halving per-parameter errors for particulate matter, TVOC, temperature and humidity compared to dense baselines. The GCN-AE, while slightly less accurate in reconstruction, produces the most selective anomaly maps: its window-level thresholds are 1.5–2.5 times higher than those of the other models, suppressing false positives and isolating true deviations such as spikes, abrupt drops and sensor drifts.
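As an illustration of contributions (ii) and (iii), the following minimal sketch shows how window-level reconstruction errors can be turned into anomaly flags with a quantile threshold estimated on training data. It is not the thesis implementation: the function names, the window size, the 0.99 quantile level and the use of the window mean as a stand-in for an autoencoder reconstruction are all assumptions made for the example.

```python
from typing import List, Sequence


def sliding_windows(series: Sequence[float], size: int, step: int) -> List[List[float]]:
    """Cut a 1-D series into fixed-length windows (hypothetical helper)."""
    return [list(series[i:i + size]) for i in range(0, len(series) - size + 1, step)]


def mean_reconstruction(window: Sequence[float]) -> List[float]:
    """Stand-in for an autoencoder: 'reconstructs' a window as its own mean.

    Assumption for illustration only; a trained model would go here.
    """
    m = sum(window) / len(window)
    return [m] * len(window)


def window_error(window: Sequence[float], reconstruction: Sequence[float]) -> float:
    """Mean squared reconstruction error of one window."""
    return sum((x - r) ** 2 for x, r in zip(window, reconstruction)) / len(window)


def quantile(values: Sequence[float], q: float) -> float:
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def flag_anomalies(train_errors: Sequence[float],
                   test_errors: Sequence[float],
                   q: float = 0.99) -> List[bool]:
    """Flag test windows whose error exceeds the q-quantile of training errors."""
    threshold = quantile(train_errors, q)
    return [e > threshold for e in test_errors]


# Toy data: a regular training signal and a test signal with one spike.
train_series = [0.0, 1.0] * 10
test_series = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 10.0, 1.0]

train_errors = [window_error(w, mean_reconstruction(w))
                for w in sliding_windows(train_series, size=4, step=4)]
test_errors = [window_error(w, mean_reconstruction(w))
               for w in sliding_windows(test_series, size=4, step=4)]

flags = flag_anomalies(train_errors, test_errors)
```

In this toy run only the second test window, which contains the spike, exceeds the threshold learned from the training errors. Because the threshold is a quantile of each model's own training errors, a model with a wider training error distribution gets a higher threshold and flags fewer windows.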
Overall, the proposed framework demonstrates that unsupervised deep learning, enhanced by graph priors and domain-aware CMV gating, provides a robust foundation for automatic quality assessment in real indoor monitoring networks. The results highlight the operational potential of the Conv2D-AE and GCN-AE architectures for scalable, interpretable and deployment-ready anomaly detection in environmental sensing systems.

| File | Size | Format |
|---|---|---|
| Lopez_Francesca.pdf (restricted access) | 19.13 MB | Adobe PDF |
https://hdl.handle.net/20.500.12608/102122