Human action detection in temporal data

BABAKHANI, SARVENAZ
2023/2024

Abstract

This thesis addresses the problem of estimating caloric expenditure from videos of people performing activities that range from mild to intense exercise. The estimate depends not only on the category of physical activity (e.g., running, walking) but also on the intensity of the muscle and body movements visible in the video. The model is evaluated on two distinct test sets: a "known test" set, containing activities that are represented in the training dataset, and an "unknown test" set, containing activities that do not appear during training. This setup measures performance on familiar activities as well as generalization to new, unseen actions, and shows how much the activity categories seen in training affect the result. Because of computational constraints, the study uses neural networks built on foundation models: a pretrained DINO model extracts features from the video data, and these features are fed into an evaluator model that estimates calories burned. A significant part of the work consisted of experimenting with different loss functions and tuning parameters to improve predictive accuracy. The aim of these experiments was to make the calorie expenditure estimates more accurate and thereby contribute to the field of human action detection in temporal data.
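
The "known test" / "unknown test" protocol described in the abstract can be illustrated with a short sketch. This is not the thesis code: the function name `split_by_category`, the `(clip_id, category)` representation, the 20% test fraction, and the example activities are all assumptions made for illustration. The only idea taken from the abstract is that some activity categories are withheld from training entirely, while the remaining categories are shared between training and the known test.

```python
import random
from collections import defaultdict


def split_by_category(clips, held_out, known_test_fraction=0.2, seed=0):
    """Split (clip_id, category) pairs into train, known-test and unknown-test lists.

    Hypothetical helper: `held_out` is the set of categories that must never
    appear in training; their clips form the unknown test.
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for clip_id, category in clips:
        by_category[category].append(clip_id)

    train, known_test, unknown_test = [], [], []
    for category in sorted(by_category):
        ids = by_category[category]
        if category in held_out:
            # Category withheld from training: every clip goes to the unknown test.
            unknown_test.extend((c, category) for c in ids)
            continue
        rng.shuffle(ids)
        # Reserve a fraction for the known test, keeping at least one clip for training.
        n_test = max(1, int(len(ids) * known_test_fraction)) if len(ids) > 1 else 0
        known_test.extend((c, category) for c in ids[:n_test])
        train.extend((c, category) for c in ids[n_test:])
    return train, known_test, unknown_test


# Illustrative data only; the activities and counts are made up.
clips = (
    [(f"run_{i}", "running") for i in range(5)]
    + [(f"walk_{i}", "walking") for i in range(5)]
    + [(f"row_{i}", "rowing") for i in range(3)]
)
train, known_test, unknown_test = split_by_category(clips, held_out={"rowing"})
```

With a split of this kind, an error measured on `known_test` reflects performance on categories seen during training, while an error measured on `unknown_test` reflects generalization to categories the model has never seen.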
2023
Neural Networks
Deep learning
Loss function
Distributions
Files in this item:
File: Babakhani_Sarvenaz.pdf (open access)
Size: 9.49 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/66547