Human action detection in temporal data
BABAKHANI, SARVENAZ
2023/2024
Abstract
This thesis addresses the estimation of caloric expenditure from videos of individuals performing a variety of activities, ranging from mild to intense exercise. The estimate of calories burned depends not only on the category of physical activity (e.g., running, walking) but also on the intensity of the muscle and body movements shown in the videos. The model was evaluated on two distinct test sets: a "known test" set, which contains activities represented in the training dataset, and an "unknown test" set, which contains activities not seen during training. This setup measures both the model's performance on familiar activities and its generalization to new, unseen actions, and it shows how the activity categories used in training affect the results. Because of computational constraints, the study uses neural networks built on foundation models: a pretrained DINO model extracts features from the video data, and these features are fed into an evaluator model that estimates the calories burned. A significant portion of the research involved experimenting with different loss functions and tuning parameters to improve the model's predictive accuracy. The aim of these experiments was to make the calorie estimates more accurate and thereby contribute to the domain of human action detection in temporal data.

File | Access | Size | Format
---|---|---|---
Babakhani_Sarvenaz.pdf | Open access | 9.49 MB | Adobe PDF
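The "known test" and "unknown test" protocol described in the abstract can be illustrated with a small sketch. This is not the thesis code: the function name `split_known_unknown`, the `(clip_id, category)` record format, the 20% hold-out fraction, and the example categories are all assumptions made for illustration. The idea is only that clips from held-out categories never appear in training, while the known test set holds back some clips from categories the model does train on.

```python
import random
from collections import defaultdict


def split_known_unknown(clips, unknown_categories, known_test_fraction=0.2, seed=0):
    """Split (clip_id, category) pairs into train, known-test and unknown-test.

    Hypothetical helper: categories listed in `unknown_categories` are kept
    entirely out of training; every other category contributes clips to both
    the training set and the known-test set.
    """
    unknown = set(unknown_categories)
    by_category = defaultdict(list)
    for clip_id, category in clips:
        by_category[category].append(clip_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, known_test, unknown_test = [], [], []
    for category in sorted(by_category):
        ids = sorted(by_category[category])
        if category in unknown:
            # Unseen activity: all of its clips go to the unknown test set.
            unknown_test.extend((i, category) for i in ids)
            continue
        rng.shuffle(ids)
        # Hold back at least one clip per seen category when possible.
        n_test = max(1, int(len(ids) * known_test_fraction)) if len(ids) > 1 else 0
        known_test.extend((i, category) for i in ids[:n_test])
        train.extend((i, category) for i in ids[n_test:])
    return train, known_test, unknown_test


# Toy data: four assumed activity categories with five clips each.
clips = [
    (f"{category}_{k}", category)
    for category in ("running", "walking", "cycling", "yoga")
    for k in range(5)
]
train, known_test, unknown_test = split_known_unknown(clips, {"yoga"})
```

With this toy input, "yoga" appears only in `unknown_test`, while the other three categories are shared between `train` and `known_test`. Comparing the error on the two test sets then separates performance on familiar activities from generalization to unseen ones.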
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/66547