Application of Normalizing Flows in Deep Reinforcement Learning

BOSCOLO MENEGUOLO, FRANCESCO
2024/2025

Abstract

Normalizing Flow (NF) models have recently emerged as a powerful class of generative models capable of learning expressive probability distributions through invertible transformations. In Reinforcement Learning (RL), where efficient exploration, robust policy learning, and uncertainty estimation remain major challenges, Normalizing Flows offer a promising avenue by enabling flexible density estimation and tractable likelihood evaluation. This thesis investigates the application of Normalizing Flow architectures to RL tasks, with a focus on policy representation, value approximation, and exploration strategies. To demonstrate their versatility, Normalizing Flows are first applied to conditional image generation as a preliminary study. Subsequently, NF-based policies are employed in a custom environment to showcase their ability to represent multimodal action distributions. Building on these insights, NF-based policy networks are evaluated on continuous control benchmarks and compared against standard policy gradient and actor-critic methods. Furthermore, the potential of Normalizing Flows in Multi-Agent Reinforcement Learning (MARL) is explored through a two-player zero-sum game. The experimental results show that NF policies substantially improve policy expressiveness without sacrificing training stability, yielding more efficient exploration and higher sample efficiency in challenging environments, although these advantages are task-dependent.
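To make the mechanism the abstract refers to concrete: a flow policy samples base noise z from a simple distribution, pushes it through an invertible, state-conditioned transform f to obtain an action a = f(z), and evaluates the action's exact log-density with the change-of-variables formula, log p(a) = log p(z) - log|det J_f(z)|. The sketch below is a minimal illustration in PyTorch, not the architecture used in the thesis; all class and parameter names are hypothetical, and a single affine coupling layer stands in for the deeper flows typically stacked in practice.

import torch
import torch.nn as nn

class AffineCouplingPolicy(nn.Module):
    # Illustrative state-conditioned flow policy (names hypothetical).
    # One affine coupling layer maps base noise z ~ N(0, I) to an action,
    # with an exact log-density via the change-of-variables formula.
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        assert action_dim % 2 == 0, "coupling splits the action in half"
        self.half = action_dim // 2
        # Conditioner: predicts log-scale and shift for the second half
        # of the noise vector from the first half and the state.
        self.net = nn.Sequential(
            nn.Linear(self.half + state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 2 * self.half),
        )

    def forward(self, state):
        # Sample base noise and split it for the coupling transform.
        z = torch.randn(state.shape[0], 2 * self.half)
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(torch.cat([z1, state], dim=-1)).chunk(2, dim=-1)
        a2 = z2 * torch.exp(log_s) + t            # invertible affine map
        action = torch.cat([z1, a2], dim=-1)
        # Change of variables: log p(a) = log p(z) - sum(log_s).
        base_logp = torch.distributions.Normal(0.0, 1.0).log_prob(z).sum(-1)
        log_prob = base_logp - log_s.sum(-1)
        return action, log_prob

# Hypothetical usage with a policy-gradient objective:
# policy = AffineCouplingPolicy(state_dim=8, action_dim=2)
# action, log_prob = policy(states)               # states: [batch, 8]
# loss = -(log_prob * advantages).mean()

In a policy-gradient setting, the returned log_prob enters a loss such as -(log_prob * advantage).mean(); this exact, tractable likelihood is what makes flow-based policies directly usable with standard policy gradient and actor-critic methods.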
Keywords: Reinforcement Learning, Deep, Normalizing Flows, Algorithms
Files in this item:
File: Boscolo_Meneguolo_Francesco.pdf (open access)
Size: 3.79 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/94118