Self-Supervised Learning: Video Clip Order Prediction with diffusion models
REPETTO, SARA
2022/2023
Abstract
In recent years, neural networks have achieved remarkable performance in computer vision applications such as image classification and video identification, thanks to the availability of powerful computational resources and vast amounts of data. Although large amounts of data exist, annotating them, as standard supervised learning requires, takes great effort. Self-supervised approaches reduce the need for annotation: representations are learned through surrogate tasks that do not require labeled data. Such approaches can extract meaningful information from unlabeled data, which can then be exploited in different downstream tasks. In this work, we propose to use clip order prediction as a self-supervised task and test the learned representation on an action classification task.

File | Size | Format
---|---|---
Thesis_SARAREPETTO.pdf (open access) | 4.13 MB | Adobe PDF
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/61392