CN-122021815-A - Cartesian genetic programming energy scheduling method and related equipment

CN122021815ACN 122021815 ACN122021815 ACN 122021815ACN-122021815-A

Abstract

The embodiment of the application provides a Cartesian genetic programming energy scheduling method and related equipment, and belongs to the field of energy scheduling and evolution calculation. The method comprises the steps of obtaining an environment state and a discrete action set of an energy system, initializing a CGP population, splicing the state and each candidate action into an augmented input vector by adopting an explicit action selection strategy, inputting CGP individuals to obtain scalar scores, preferentially executing the scalar scores, calculating fitness, and generating offspring by applying an activity perception sampling mutation operator in the evolution process, namely identifying active nodes in the genotype by reverse traversal, dynamically distributing mutation budget according to the active nodes, enabling resources to incline towards the active nodes, and simultaneously reserving a small amount of non-active node mutation. The application obviously improves the searching efficiency and solving quality through the directional mutation mechanism, and simultaneously generates the scheduling strategy which is a completely symbolized mathematical expression, has high interpretability and easy deployment, and is suitable for real-time optimized scheduling of the multi-functional micro-grid and the active power distribution network.

Inventors

QI RUI
JIA YAHUI
JIA BING
CHEN WEINENG
LIN ZHENHONG

Assignees

华南理工大学

Dates

Publication Date: 20260512
Application Date: 20251215

Claims (10)

1. A method of cartesian genetic programming energy scheduling, the method comprising the steps of: S1, acquiring environment state data, discrete action sets and training scene data of an energy management system; s2, initializing a Cartesian genetic programming CGP population, wherein the population comprises a plurality of CGP individuals, and the genotype of each CGP individual is represented as a directed acyclic graph formed by two-dimensional grid arranged computing nodes; s3, for the current decision time, acquiring an environment state vector Using an explicit action selection strategy to vector the environmental state Splicing with each candidate action in the discrete action set to construct a corresponding augmented input vector ; S4, carrying out fitness evaluation on individuals in the population, namely, carrying out each augmentation input vector The CGP individual to be evaluated is input, scalar scores of the individual to each candidate action are obtained through forward propagation calculation, the action corresponding to the lowest score is selected as a control instruction at the current moment, and the fitness value of the individual is calculated based on the accumulated rewards under the training scene; S5, judging whether the evolution termination condition is met, if so, executing a step S9, otherwise, executing a step S6; S6, screening parent individuals from the current population by adopting an elite selection strategy according to the fitness evaluation result; s7, applying an active perception sampling mutation operator to the selected parent individuals to generate child individuals, wherein the method comprises the following steps: s71, identifying an active node set actually contributing to output by reversely traversing the genotype map of the CGP individual from the output node of the CGP individual; S72, according to the number of nodes in the active node set Number of inactive nodes Dynamic calculation of variant budgets allocated to active nodes And variant budgets allocated to inactive nodes , wherein, Calculated value of (2) is greater than ; S73, randomly selecting corresponding number of nodes from the active node set and the inactive node set respectively as nodes to be mutated according to the mutation budget determined in the step S72; s8, executing mutation operation on each node to be mutated selected in the step S73, wherein the mutation operation comprises at least one of randomly replacing a functional function of the node, randomly reconnecting an input index of the node and adding random disturbance to a weight parameter of the node; And S9, outputting the CGP individuals with the highest fitness in the population after the evolution is finished, and taking the CGP individuals as an optimal energy scheduling strategy.
2. The method according to claim 1, wherein in step S1, the energy management system environment status data For a D-dimensional vector, data of at least one of the following scenes: Multi-energy micro-grid MEMG scene, electric load demand Demand for thermal load Real-time electricity price Available power of photovoltaic State of charge of energy storage unit And Current time step ; Active distribution network ADN-EES scene network node payload State of charge of energy storage unit Real-time electricity price Current time step Node voltage amplitude Photovoltaic availability power 。
3. The method according to claim 1, wherein in step S1, the discrete set of actions And the discharging, idling and charging behaviors of the energy storage unit are respectively corresponding to the turn-off, partial load operation and full load operation states of the controllable generator set.
4. The method according to claim 1, wherein in step S2, the set of function functions of the CGP individual is The genotype of each functional node is encoded into a tuple , wherein, As a function of the node, An index is connected for the input of the node, Is a weight parameter of the node.
5. The method according to claim 1, wherein in step S4, the action with the lowest score is selected The calculation formula of (2) is as follows: wherein Augmenting input vectors for CGP individual pairs Scalar score calculated, fitness value of individual Jackpot for all training scenarios Average value of (2): , Is the number of training scenarios.
6. The method according to claim 1, wherein in step S72, the active node variation budget And inactive node variation budget Dynamically calculated by the following formula: Wherein, the For the preset parameter of the activity intensity, And Respectively a lower limit and an upper limit of the total number of the preset variation nodes.
7. The method according to claim 1 or 6, wherein in step S8, the mutation operation is performed on the selected node to be mutated, specifically, with a predetermined mutation probability At least one of the following operations is performed: (1) From a set of functional functions A new function is selected uniformly and randomly to replace the original function of the node ; (2) Selecting new connection index from feasible input connection index set uniformly and randomly to replace original input connection of the node ; (3) For the original weight parameter of the node Adding a gaussian distribution And truncating the result to a preset weight interval And (3) inner part.
8. An electronic device comprising a memory storing a computer program and a processor implementing the method of any of claims 1 to 7 when the computer program is executed by the processor.
9. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.

Description

Cartesian genetic programming energy scheduling method and related equipment Technical Field The application relates to the field of energy scheduling and evolution calculation, in particular to a Cartesian Genetic Programming (CGP) energy scheduling method and related equipment. Background Global energy systems are being transformed towards integrating high proportions of renewable energy, which puts higher demands on real-time scheduling and control of the grid. Traditional model-based optimization methods (e.g., mixed integer linear programming) rely on accurate physical models and a priori knowledge of uncertainty, suffer from inadequate adaptation in the face of strongly nonlinear, highly fluctuating environments, and have a heavy computational burden. Deep Reinforcement Learning (DRL) is used as a model-free method, and a single strategy is optimized through trial and error learning, but the searching process is easy to fall into local optimum, and the strategy is a 'black box' model which is difficult to explain, so that trust barriers exist in safety-critical energy systems. Evolutionary algorithms, in particular Cartesian Genetic Programming (CGP), have a natural balancing capability between exploration and utilization by maintaining a population parallel search, and can generate transparent strategies consisting of simple arithmetic logic primitives. However, the conventional CGP generally adopts blind purpose and uniform random variation, so that the searching efficiency is low in the complex energy scheduling problem, the convergence speed is low, and a high-quality strategy is difficult to find in limited resources. Therefore, a new method for not only maintaining the robustness and interpretability advantages of the evolutionary algorithm, but also remarkably improving the searching efficiency and performance of the evolutionary algorithm in the complex energy scheduling problem is needed. Disclosure of Invention The application mainly aims to overcome the defects that the existing deep reinforcement learning method is easy to fall into local optimum, the strategy cannot be interpreted, and the traditional Cartesian genetic programming algorithm is low in exploration efficiency and slow in convergence, and provides a Cartesian genetic programming energy scheduling method, electronic equipment, storage media and program products based on an activity sensing sampling strategy. To achieve the above object, an aspect of an embodiment of the present application provides a cartesian genetic programming energy scheduling method, the method including: S1, acquiring environment state data, discrete action sets and training scene data of an energy management system; s2, initializing a Cartesian genetic programming CGP population, wherein the population comprises a plurality of CGP individuals, and the genotype of each CGP individual is represented as a directed acyclic graph formed by two-dimensional grid arranged computing nodes; s3, for the current decision time, acquiring an environment state vector Using an explicit action selection strategy to vector the environmental stateSplicing with each candidate action in the discrete action set to construct a corresponding augmented input vector; S4, carrying out fitness evaluation on individuals in the population, namely, carrying out each augmentation input vectorThe CGP individual to be evaluated is input, scalar scores of the individual to each candidate action are obtained through forward propagation calculation, the action corresponding to the lowest score is selected as a control instruction at the current moment, and the fitness value of the individual is calculated based on the accumulated rewards under the training scene; S5, judging whether the evolution termination condition is met, if so, executing a step S9, otherwise, executing a step S6; S6, screening parent individuals from the current population by adopting an elite selection strategy according to the fitness evaluation result; s7, applying an active perception sampling mutation operator to the selected parent individuals to generate child individuals, wherein the method comprises the following steps: s71, identifying an active node set actually contributing to output by reversely traversing the genotype map of the CGP individual from the output node of the CGP individual; S72, according to the number of nodes in the active node set Number of inactive nodesDynamic calculation of variant budgets allocated to active nodesAnd variant budgets allocated to inactive nodes, wherein,Calculated value of (2) is greater than; S73, randomly selecting corresponding number of nodes from the active node set and the inactive node set respectively as nodes to be mutated according to the mutation budget determined in the step S72; s8, executing mutation operation on each node to be mutated selected in the step S73, wherein the mutation operation comprises at least one of randomly replacing a functional funct