Fast anomaly detection and localization in video sequences based on a ConvLSTM-UNet neural network

Anomaly detection is the process of identifying unusual observations that rarely happen. This task is relevant in many computer vision domains, such as medical imaging, industrial manufacturing, video surveillance and robotic applications. In all these cases, it is usually difficult to collect enough abnormal data, since anomalies rarely occur in the real world and vary greatly in type and definition. We propose a computationally efficient ConvLSTM-UNet network for detecting anomalies in video sequences by performing either frame prediction or frame interpolation. The model combines the ability of ConvLSTM layers to process temporal information with the ability of UNets to propagate spatial features, without using any pre-trained module. In this way we detect anomalous frames at high speed while achieving competitive results in terms of Area Under the Curve (AUC) on three benchmarks: UCSD Ped2, Avenue and ShanghaiTech. Specifically, when performing frame interpolation, our model processes 258 frames per second (FPS) and surpasses many state-of-the-art works on the Ped2 and ShanghaiTech datasets, obtaining AUC values of 97.15% and 72.97%, respectively. Furthermore, we present a possible application of our model to anomaly detection in a human-robot collaboration task, in which a human operator and a robotic arm collaborate to assemble a small table. Unlike other works in the literature, we investigate the challenging scenario of detecting human-robot anomalies using only information from a camera, without relying on other sensors. First experiments show that anomalous events are hard to detect in this application. We therefore carry out an ablation study on how much the distance between the camera and the moving subjects influences the final results.
Finally, we investigate the impact of the training set size when a neural network must learn complex human gestures and robot movements in order to detect anomalies.
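
As an illustration of how frame-level detection is typically scored and evaluated in this setting, the following is a minimal sketch and not the thesis implementation. It assumes that the anomaly score of a frame is derived from the PSNR between the predicted (or interpolated) frame and the real one, which is a common choice in prediction-based anomaly detection; the abstract does not state which score is actually used. The function names `psnr`, `anomaly_scores` and `frame_auc` are hypothetical, and the data are synthetic.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio between a generated frame and the real frame."""
    mse = np.mean((np.asarray(pred, float) - np.asarray(target, float)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / max(mse, 1e-12))


def anomaly_scores(psnr_values):
    """Map per-frame PSNR to [0, 1]; a low PSNR (poor reconstruction) gives a high score.

    Assumption: min-max normalisation over one video, as often done in the literature.
    """
    p = np.asarray(psnr_values, dtype=float)
    span = p.max() - p.min()
    if span == 0:
        return np.zeros_like(p)
    return 1.0 - (p - p.min()) / span


def frame_auc(scores, labels):
    """Frame-level AUC: fraction of (anomalous, normal) pairs ranked correctly.

    Ties count as half, which matches the area under the ROC curve.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


# Synthetic example: four 4x4 frames, the last two are poorly reconstructed.
target = np.zeros((4, 4))
errors = [0.01, 0.02, 0.3, 0.4]
labels = [0, 0, 1, 1]  # 1 = anomalous frame (hypothetical ground truth)
psnrs = [psnr(np.full((4, 4), e), target) for e in errors]
scores = anomaly_scores(psnrs)
auc = frame_auc(scores, labels)
```

In this toy case the two badly reconstructed frames receive the highest scores, so every anomalous frame is ranked above every normal one.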

GUIZZARO, CHIARA

2021/2022

	Faculty/Department
	
				Dipartimento di Ingegneria dell'Informazione - DEI
			
	Degree programme
	
				COMPUTER ENGINEERING, Master's degree (Laurea Magistrale, D.M. 270/2004)
			
	Academic year
	
				2021
			
	Keywords
	
				anomaly detection
deep learning
video anomaly
			
	Supervisor
	
				GHIDONI, STEFANO
			
	Appears in collections:
	
				Master's theses

Files in this item:

File	Size	Format
Guizzaro_Chiara.pdf (restricted access)	6.74 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 License.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/40262