Fast anomaly detection and localization in video sequences based on a ConvLSTM-UNet neural network
GUIZZARO, CHIARA
2021/2022
Abstract
Anomaly detection is the process of identifying unusual observations that rarely occur. This task is relevant in many computer vision domains, such as medical imaging, industrial manufacturing, video surveillance and robotic applications. In all these cases, it is usually difficult to collect a large amount of abnormal data, since anomalies rarely occur in the real world and vary greatly in type and definition. We propose a computationally efficient ConvLSTM-UNet network for detecting anomalies in video sequences by performing either frame prediction or frame interpolation. The model combines the ability of ConvLSTM layers to process temporal information with the ability of UNets to propagate spatial features, without using any pre-trained module. In this way we detect anomalous frames at high speed while achieving competitive Area Under the Curve (AUC) results on three benchmarks: UCSD Ped2, Avenue and ShanghaiTech. Specifically, when performing frame interpolation, our model processes 258 frames per second (FPS) and surpasses many state-of-the-art works on the Ped2 and ShanghaiTech datasets, obtaining AUC values of 97.15% and 72.97%, respectively. Furthermore, we present a possible application of our model to anomaly detection in a human-robot collaboration task, in which a human operator and a robotic arm collaborate to assemble a small table. Unlike other works in the literature, we investigate the challenging scenario of detecting human-robot anomalies using only information from a camera, without relying on other sensors. Initial experiments show that anomalous events are hard to detect in this application. We therefore conduct an ablation study on how much the distance between the camera and the moving subjects influences the final results. Finally, we investigate the impact of the training set size when a neural network must learn complex human gestures and robot movements in order to detect anomalies.

File | Size | Format
---|---|---
Guizzaro_Chiara.pdf (restricted access) | 6.74 MB | Adobe PDF
https://hdl.handle.net/20.500.12608/40262