Affordance-Preserving 3D Object Editing for Immersive VR

Massimiliano Viola
Academic year 2023/2024

Abstract

Many virtual reality (VR) scenes render convincingly but offer little interaction with the real world: objects appear realistic yet remain disconnected from physical touch. Achieving truly immersive VR experiences requires virtual objects both to correspond in 3D to their physical counterparts and to preserve their exact affordances, meaning the functional regions that enable meaningful actions and interactions with our hands. This work envisions a different kind of immersion, in a mixed state between the virtual and the physical, where users interact with newly generated VR objects while maintaining the functional coherence of the real-world counterparts they are physically touching. In this setting, 3D geometry can be freely altered while the underlying affordances must remain intact, ensuring that virtual edits preserve physical meaning and stay naturally grounded in real-world interaction. With this goal in mind, we propose an affordance-preserving 3D editing pipeline for objects in a scene, providing a building block for fully immersive scene experiences while allowing ample creative flexibility. Our two-stage pipeline, built on recent advances in the field, combines an open-vocabulary affordance estimator with a conditional 3D generative model. In the first stage, we estimate functional regions through open-vocabulary affordance prediction driven by a user-supplied text prompt or automatic captioning. In the second stage, we perform 3D editing with a text-conditioned latent generative model, ensuring that affordance-relevant regions remain intact or largely unaltered while the rest of the object is modified according to the user’s prompt. This scheme preserves geometry and functionality where it matters, yet allows coherent and original changes elsewhere. We report quantitative affordance localization results on standard benchmarks and qualitative editing results, demonstrating strong potential for generative 3D object manipulation that is both physically grounded and visually flexible in immersive virtual environments.
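To make the second stage more concrete, the following is a minimal, self-contained sketch of how a masked edit in a latent flow-matching generator could preserve affordance regions while regenerating the rest of an object. It is an illustration under stated assumptions, not the thesis implementation: the velocity field is a placeholder for the actual text-conditioned network, the per-point latent layout is invented, and the affordance mask is assumed to come from the first-stage estimator.

import torch

def masked_flow_edit(z_data, keep_mask, velocity_fn, num_steps=50):
    # Sketch of an affordance-preserving edit in a latent flow-matching model.
    # We integrate the learned velocity field from noise (t = 0) to data (t = 1)
    # with Euler steps and, at every step, overwrite the preserved latents with
    # the straight-line interpolation between their own noise sample and the
    # original latents, so at t = 1 those regions exactly match the original.
    #
    # z_data:      (N, D) latent codes of the original object
    # keep_mask:   (N,) bool, True where affordances must be preserved
    # velocity_fn: callable (z, t) -> (N, D) velocity, standing in for the
    #              real text-conditioned network (assumption)
    noise = torch.randn_like(z_data)
    z = noise.clone()                                   # start from pure noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        # Re-impose the known trajectory on preserved (affordance) regions.
        z[keep_mask] = (1 - t) * noise[keep_mask] + t * z_data[keep_mask]
        z = z + velocity_fn(z, torch.tensor(t)) * dt    # Euler step for the rest
    z[keep_mask] = z_data[keep_mask]                    # exact at t = 1
    return z

# Toy usage with placeholder data and a dummy velocity field:
if __name__ == "__main__":
    z0 = torch.randn(1024, 8)                           # fake per-point latents
    mask = torch.rand(1024) > 0.7                       # fake affordance mask
    dummy_velocity = lambda z, t: torch.zeros_like(z)   # placeholder network
    z_edit = masked_flow_edit(z0, mask, dummy_velocity)
    assert torch.allclose(z_edit[mask], z0[mask])       # preserved regions intact

In the actual pipeline, the hard mask re-imposition shown here could be softened (e.g. a score-weighted blend) so that regions marked "largely unaltered" are only loosely constrained rather than frozen.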
Keywords: 3D Generation, Virtual Reality, Flow Matching
Files in this item:
Tesi_Galileiana_Viola.pdf (Adobe PDF, 10.99 MB), under embargo until 19/11/2026


Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12608/98355