Reconstructing Human Hands from Egocentric RGB Data

CHIMENTI, ALBERTO
2021/2022

Abstract

Reconstructing 3D poses from a single RGB image is a challenging task. This computer vision problem hides an inherent ambiguity in determining the depth coordinate of the keypoints. In the following work I start by exploring the state-of-the-art approaches used to solve it, focusing specifically on human hand pose estimation. I consider the most natural settings, including self-interaction and interaction with objects. Expressing the ground-truth hand keypoint coordinates in a reference frame centered at a standard point, or at a point internal to the hand, plays an important role in the success of the training process. After evaluating the benefits of choosing a specific one, I propose a multi-stage approach that separately regresses the root-relative pose and the root coordinates in the camera reference frame. The model is then trained and tested on the novel H2O dataset (2 Hands and Objects).
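The final composition step described above can be sketched as follows. This is a minimal illustration, not the thesis code: the function name, the 21-keypoint layout, and the choice of the wrist as root joint are assumptions for the example.

```python
import numpy as np

def compose_camera_frame_pose(root_relative, root_camera):
    """Combine the two regressed quantities of the multi-stage approach.

    root_relative: (21, 3) keypoints expressed with the root joint at the origin
    root_camera:   (3,) position of the root joint in the camera reference frame
    Returns the (21, 3) keypoints in the camera reference frame.
    """
    # Broadcasting adds the root offset to every keypoint.
    return root_relative + root_camera

# Toy example (hypothetical values): root-relative pose with the wrist (index 0)
# at the origin, and the root regressed 40 cm in front of the camera.
rel = np.zeros((21, 3))
rel[1] = [0.02, 0.0, 0.01]            # one joint 2 cm along x, 1 cm along z
root = np.array([0.1, -0.05, 0.4])    # root position in the camera frame (m)

abs_pose = compose_camera_frame_pose(rel, root)
```

Regressing the two parts separately lets the root-relative branch learn scale-normalized hand articulation, while the depth ambiguity is isolated in the root-localization branch.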
Keywords: Pose Estimation, Computer Vision, Deep Learning

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/36020