Reinforcement Learning for Active Network Management in Micro Smart Grids
PASE, EMANUELE
2025/2026
Abstract
The growing penetration of renewable energy sources and distributed energy systems is transforming traditional power systems into highly dynamic and decentralized networks. In this context, Active Network Management (ANM) has become central to the safe and efficient operation of modern distribution grids. However, the uncertainty, variability, and operational constraints of such systems challenge conventional control and optimization methods. This thesis investigates the application of Reinforcement Learning (RL) to active network management in micro smart grids. The task is formulated as a sequential decision-making problem in which an agent learns policies to efficiently coordinate distributed generation and storage systems under various types of uncertainty. Safety is a major concern when applying RL to active distribution grids: such approaches do not inherently guarantee the satisfaction of physical and operational constraints such as voltage limits and thermal ratings. To enforce constraints during learning, the problem is modeled within the framework of Constrained Markov Decision Processes (CMDPs), and both standard penalty-based methods and Lagrangian-based safe RL approaches are explored. Two deep RL algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), are evaluated in simulated environments based on the Gym-ANM framework, including a custom scenario and the ANM6Easy benchmark. Results show that fixed-penalty methods achieve safety at the expense of operational efficiency: the agent often learns to curtail nearly all renewable generation to avoid violations, wasting large amounts of clean energy. The Lagrangian formulation balances safety and efficiency better, reducing energy waste through active battery management and minimal curtailment while maintaining acceptable constraint satisfaction. The Lagrangian method also exhibits more stable learning dynamics, although it remains sensitive to hyperparameter selection.
These results suggest that safe reinforcement learning can be a practical approach for real-time control of micro smart grids, balancing performance and safety without resorting to excessive conservatism. At the same time, careful tuning remains necessary, and further work is needed to ensure robustness in real-world power systems.
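The Lagrangian-based safe RL approach described above can be illustrated with a minimal dual-ascent sketch. The function names, step size, and cost budget below are illustrative assumptions, not taken from the thesis; the idea is simply that the Lagrange multiplier of the CMDP rises while the observed constraint cost (e.g. voltage or thermal violations) exceeds its budget, and decays back toward zero once the policy is safe enough:

```python
def lagrangian_step(lmbda, episode_cost, cost_limit, lr=0.05):
    """Dual ascent on the Lagrange multiplier of a CMDP.

    The multiplier grows while the observed constraint cost exceeds
    the allowed budget, and shrinks (never below zero) once the
    policy satisfies the constraint.
    """
    return max(0.0, lmbda + lr * (episode_cost - cost_limit))


def shaped_reward(reward, cost, lmbda):
    # The agent maximizes the reward minus the lambda-weighted
    # constraint cost, so a large lambda makes violations expensive.
    return reward - lmbda * cost


# Toy illustration: early unsafe episodes (cost above the limit of 2.0)
# drive lambda up; later safe episodes let it decay again.
lmbda = 0.0
for episode_cost in [5.0, 5.0, 1.0, 0.0, 0.0]:
    lmbda = lagrangian_step(lmbda, episode_cost, cost_limit=2.0)
print(round(lmbda, 3))
```

Unlike a fixed penalty, the multiplier here adapts to the policy's actual behavior, which is what lets the Lagrangian formulation avoid the over-conservative curtailment observed with fixed-penalty methods.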
| File | Size | Format |
|---|---|---|
| Pase_Emanuele.pdf (restricted access) | 2.05 MB | Adobe PDF |
https://hdl.handle.net/20.500.12608/106859