Online human action recognition based on skeletal data for human-robot collaboration scenarios
MARCHESINI, ANNA
2023/2024
Abstract
Human Action Recognition (HAR) is an active research area of computer vision concerned with recognizing and classifying human actions from data captured by sensors such as cameras or wearable devices. With the growing interest in human-robot interaction and collaboration scenarios, it is important for the robot to understand and recognize the action performed by the human in real time (online recognition), so that it can behave accordingly. Robustness of the data representation is another key requirement in these contexts. Skeletal data are a simple and informative way of representing human actions, and their robustness makes them particularly well suited to online action recognition. The purpose of this thesis is the study and development of models that perform human action recognition in real time, to be applied in human-robot collaboration scenarios. An in-depth study was conducted on two existing state-of-the-art models, namely InfoGCN++ and STGCN-SWVM, which perform online HAR on skeletal data by exploiting Graph Convolutional Networks (GCNs). The generalization ability of the models is studied by splitting the dataset according to the subjects performing the actions, following the Leave-One-Out Cross-Validation approach. Furthermore, motivated by the prevalence of complex models in the literature, two small, simple models, namely a Convolutional model and an LSTM model, are developed to speed up action recognition with more straightforward architectures, exploiting the so-called ensemble learning strategy. This is achieved by creating a set of classifiers working in parallel, each responsible for recognizing a single class of the dataset, following the One-vs-All approach to multi-class classification. The outputs of all the classifiers are then combined to select the final predictions.
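The One-vs-All combination step described above can be illustrated with a minimal sketch. The thesis does not specify the exact combination rule, so this assumes the standard One-vs-All convention of taking, for each input, the class whose dedicated binary classifier reports the highest confidence; the function name and the example scores are hypothetical.

```python
import numpy as np

def one_vs_all_predict(scores: np.ndarray) -> np.ndarray:
    """Combine per-class binary classifier outputs into final class labels.

    scores: array of shape (num_samples, num_classes), where scores[t, c]
    is the confidence of the c-th binary classifier that sample t belongs
    to class c.
    """
    # Each sample is assigned to the class whose dedicated classifier
    # produced the highest confidence score.
    return np.argmax(scores, axis=1)

# Hypothetical example: 3 samples scored by 4 per-class classifiers.
scores = np.array([
    [0.1, 0.8, 0.05, 0.05],
    [0.6, 0.2, 0.10, 0.10],
    [0.2, 0.1, 0.10, 0.60],
])
print(one_vs_all_predict(scores))  # → [1 0 3]
```

In an online setting, the same rule would be applied frame by frame as the per-class classifiers produce their confidences for the incoming skeletal data.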
The Convolutional models demonstrated strong online recognition ability, outperforming the two state-of-the-art architectures and generalizing better when actions performed by different subjects are considered. The LSTM models, on the other hand, showed some difficulties in online action recognition. In general, models recognize actions with some delay and occasional inaccuracies, due to the time needed to identify the action type and to potential noise in the data.
The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License
https://hdl.handle.net/20.500.12608/80168