US-12627150-B2 - System and method for controlling power distribution systems using graph-based reinforcement learning

Abstract

A method for controlling a power distribution system having a number of nodes and controllable grid assets associated with at least some of the nodes includes acquiring observations via measurement signals associated with respective nodes and generating a graph representation of a system state based on the observations and topological information of the power distribution system. The topological information is used to determine edges defining connections between nodes. The observations are used to determine nodal features of respective nodes, which are indicative of a measured electrical quantity and a status of controllable grid assets associated with the respective node. The graph representation is processed using a reinforcement learned control policy to output a control action for effecting a change of status of one or more of the controllable grid assets, to regulate voltage and reactive power flow in the power distribution system based on a volt-var optimization objective.
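
As an illustration only, the graph representation described in the abstract could be assembled along the following lines. This is a minimal sketch, not the patented implementation: the `GraphState` container, the `build_graph_state` function, the three-element feature layout (voltage, a "measured" flag, asset status) and the example feeder are all assumptions introduced here for clarity.

```python
from dataclasses import dataclass


@dataclass
class GraphState:
    node_ids: list   # ordered node names
    edges: list      # (i, j) index pairs, one entry per direction
    features: list   # one feature vector per node


def build_graph_state(lines, voltages, asset_status):
    """Build a graph state from topology and observations (hypothetical layout).

    lines:        (from_node, to_node) pairs taken from the topological information
    voltages:     node -> measured per-unit voltage; unmeasured nodes are absent
    asset_status: node -> status of the asset at that node (e.g. capacitor on=1, off=0)
    """
    node_ids = sorted({n for line in lines for n in line})
    index = {n: i for i, n in enumerate(node_ids)}

    # Topology determines the edges; store both directions for message passing.
    edges = []
    for a, b in lines:
        edges.append((index[a], index[b]))
        edges.append((index[b], index[a]))

    # Observations determine the nodal features.
    # Assumed layout: [voltage, measured flag, asset status].
    features = []
    for n in node_ids:
        features.append([
            voltages.get(n, 0.0),
            1.0 if n in voltages else 0.0,
            float(asset_status.get(n, 0)),
        ])
    return GraphState(node_ids, edges, features)


# Hypothetical four-node radial feeder; node "n2" has a capacitor but no sensor.
lines = [("sub", "n1"), ("n1", "n2"), ("n1", "n3")]
state = build_graph_state(lines, {"sub": 1.0, "n1": 0.98, "n3": 0.94}, {"n2": 1})
```

In this sketch the resulting `state.edges` and `state.features` would be the inputs handed to a graph neural network policy; the measured flag is one possible way to mark nodes whose measurement signals are missing.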

Inventors

Xian Yeow Lee
Yubo Wang
Ulrich MUENZ

Assignees

SIEMENS AKTIENGESELLSCHAFT

Dates

Publication Date: 20260512
Application Date: 20220708

Claims (13)

1. A method for controlling a power distribution system comprising a number of nodes and controllable grid assets associated with at least some of the nodes, the method comprising: acquiring observations via measurement signals associated with respective nodes, generating a graph representation of a system state of the power distribution system based on the observations and topological information of the power distribution system, wherein the topological information is used to determine edges defining connections between nodes and the observations are used to determine nodal features of respective nodes, the nodal features indicative of a measured electrical quantity and a status of controllable grid assets associated with the respective node, and processing the graph representation of the system state using a control policy trained by reinforcement learning to output a control action for effecting a change of status of one or more of the controllable grid assets, to regulate voltage and reactive power flow in the power distribution system based on a volt-var optimization objective, wherein the control policy comprises a graph neural network for generating nodal embeddings of respective nodes based on the observations and the topological information using a mechanism of message-passing between neighboring nodes, wherein the output control action is predicted based on the nodal embeddings.
2. The method according to claim 1, wherein the volt-var optimization objective is defined by a combination of costs including voltage violation at nodes, power losses and control error pertaining to frequency of change of status of the controllable grid assets.
3. The method according to claim 1, wherein the reinforcement learning comprises, over a number of episodes of trial, optimizing trainable parameters of the control policy to maximize a cumulative reward resulting from a sequence of control actions for each episode, based on a reward function defined by the volt-var optimization objective.
4. The method according to claim 3, wherein the sequence of control actions is generated based on respective graph representations of simulated system states of the power distribution system using observations from a simulation environment to determine respective nodal features and the topological information to determine edges defining connection between nodes.
5. The method according to claim 1, wherein the output control action is predicted from an action space defined by switchable states of the controllable grid assets.
6. The method according to claim 5, wherein the controllable grid assets have discrete switchable states or a combination of discrete and continuous switchable states.
7. The method according to claim 1, wherein the controllable grid assets comprise one or more voltage regulators, one or more capacitors, and one or more batteries.
8. The method according to claim 1, wherein the measured electrical quantity comprises a nodal voltage and/or power.
9. The method according to claim 1, wherein the power distribution system comprises one or more unmeasured nodes from which measurement signals are missing, wherein the nodal embeddings of the one or more unmeasured nodes are generated based on observations from neighboring measured nodes using the message passing mechanism.
10. The method according to claim 1, wherein the nodal embeddings are processed via a readout function to form logits for an output layer defining an action space, wherein the readout function includes a stacking of nodal embeddings of nodes associated with controllable grid assets.
11. The method according to claim 1, comprising augmenting the graph representation of the system by: identifying one or more nodes associated with controllable grid assets having a global effect on the power distribution system, and for each identified node, adding an edge between the identified node and all downstream nodes.
12. A non-transitory computer-readable storage medium including instructions that, when processed by a computing system, configure the computing system to perform the method according to claim 11.
13. A system for controlling a power distribution system comprising a number of nodes and controllable grid assets associated with at least some of the nodes, the system comprising: measurement devices for communicating measurement signals associated with respective nodes, and a computing system, comprising: one or more processors, and a memory storing algorithmic modules executable by the one or more processors, the algorithmic modules comprising: a state graph generator configured to generate a graph representation of a system state of the power distribution system based on observations acquired via the measurement signals and topological information of the power distribution system, wherein the topological information is used to determine edges defining connections between nodes and the observations are used to determine nodal features of respective nodes, the nodal features indicative of a measured electrical quantity and a status of controllable grid assets associated with the respective node, and a volt-var controller configured to process the graph representation of the system state using a control policy trained by reinforcement learning to output a control action for effecting a change of status of one or more of the controllable grid assets, to regulate voltage and reactive power flow in the power distribution system based on a volt-var optimization objective, wherein the control policy comprises a graph neural network for generating nodal embeddings of respective nodes based on the observations and the topological information using a mechanism of message-passing between neighboring nodes, wherein the output control action is predicted based on the nodal embeddings.
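
For illustration, the graph augmentation recited in claim 11 (adding an edge between a node whose asset has a global effect and every node downstream of it) might be sketched as below. This is a hedged example under stated assumptions, not the claimed implementation: it assumes a radial feeder stored as a hypothetical parent-to-children mapping, and the names `downstream_nodes`, `augment_edges` and the example feeder are invented here.

```python
def downstream_nodes(children, node):
    """Return all descendants of `node` in a radial (tree-shaped) feeder."""
    found, stack = [], list(children.get(node, []))
    while stack:
        n = stack.pop()
        found.append(n)
        stack.extend(children.get(n, []))
    return found


def augment_edges(children, global_asset_nodes):
    """Return the undirected edge set of the augmented graph.

    Starts from the physical lines, then adds a shortcut edge from each
    node in `global_asset_nodes` to every node downstream of it.
    """
    edges = {frozenset((p, c)) for p, kids in children.items() for c in kids}
    for node in global_asset_nodes:
        for d in downstream_nodes(children, node):
            edges.add(frozenset((node, d)))
    return edges


# Hypothetical feeder: substation -> voltage regulator -> n1 -> {n2, n3}.
children = {"sub": ["reg"], "reg": ["n1"], "n1": ["n2", "n3"]}

physical = augment_edges(children, [])        # physical lines only
augmented = augment_edges(children, ["reg"])  # regulator treated as a global-effect asset
```

With the shortcut edges in place, the regulator node becomes a direct neighbor of `n2` and `n3`, so a single round of message passing can carry its status to those nodes. Edges are stored as `frozenset` pairs so that an edge already present as a physical line (here `reg`-`n1`) is not duplicated.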

Description

TECHNICAL FIELD

The present disclosure relates, in general, to control of power distribution systems to achieve voltage regulation. Disclosed embodiments relate to systems, methods and computer program products for controlling voltage and reactive power flow in power distribution systems using graph-based reinforcement learning.

BACKGROUND

Circuits in power distribution systems usually follow a radial topology, which may cause nodes that are far away from the substation (root node) to experience undervoltage. For example, voltage at the end of a feeder may drop below the acceptable range of ±5% of the nominal voltage. Active control of voltage and reactive power flow may be desirable for maintaining healthy operation of power distribution systems.

Volt-var control refers to the control of voltage (Volt) and reactive power (Var) in power distribution systems. Volt-var control usually involves optimally dispatching controllable grid assets or actuators of a power distribution system to maintain voltage profile at the nodes as well as reduce power losses across the power distribution system.

SUMMARY

Briefly, aspects of the present disclosure provide a technique for volt-var control in power distribution systems using graph-based reinforcement learning.

A first aspect of the disclosure provides a method for controlling a power distribution system comprising a number of nodes and controllable grid assets associated with at least some of the nodes. The method comprises acquiring observations via measurement signals associated with respective nodes. The method further comprises generating a graph representation of a system state of the power distribution system based on the observations and topological information of the power distribution system. The topological information is used to determine edges defining connections between nodes and the observations are used to determine nodal features of respective nodes.
The nodal features are indicative of a measured electrical quantity and a status of controllable grid assets associated with the respective node. The method further comprises processing the graph representation of the system state using a control policy trained by reinforcement learning to output a control action for effecting a change of status of one or more of the controllable grid assets, to regulate voltage and reactive power flow in the power distribution system based on a volt-var optimization objective.

A further aspect of the disclosure provides a computer-implemented method for training a control policy using reinforcement learning for volt-var control in a power distribution system according to the above-described method.

Other aspects of the disclosure implement features of the above-described method in systems and computer program products for volt-var control in a power distribution system.

Additional technical features and benefits may be realized through the techniques of the present disclosure. Embodiments and aspects of the disclosure are described in detail herein and are considered a part of the claimed subject matter. For a better understanding, refer to the detailed description and to the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other aspects of the present disclosure are best understood from the following detailed description when read in connection with the accompanying drawings. To easily identify the discussion of any element or act, the most significant digit or digits in a reference number refer to the figure number in which the element or act is first introduced.

FIG. 1 illustrates an example topology of a power distribution system having controllable grid assets for controlling voltage and reactive power flow.
FIG. 2 is a schematic illustration of a system for controlling voltage and reactive power flow in a power distribution system using graph-based reinforcement learning, according to an example embodiment.
FIG. 3 is a flowchart of an example logic for training a control policy for volt-var control via graph-based reinforcement learning.
FIG. 4 is a schematic illustration of a control policy including a graph neural network, according to an example embodiment.
FIG. 5 is a schematic illustration of an augmented graph representation to take into account global effects of a controllable grid asset.
FIG. 6 illustrates a visualization of the effects of adding graph augmentation and/or localized readout function based only on nodes with controllable grid assets.
FIG. 7 illustrates a computing system that can support volt-var control of a power distribution system according to disclosed embodiments.

DETAILED DESCRIPTION

Volt-var control involves operating a power distribution system via controlling voltage and reactive power flow to maintain healthy operation of the power distribution system. In particular, volt-var control may be implemented to optimize the operation of controllable grid assets to minimize power delivery losses as well as maintai