CN-121581338-B - Competitive evolution multitasking optimization method, system and equipment
Abstract
The invention discloses a competitive evolution multitask optimization method, system and device, aiming to solve the difficulty conventional methods have in jointly optimizing fuel cost and gas emission while satisfying power balance and unit capacity constraints. The method constructs a main population and an auxiliary population that search in parallel; computes, in real time, state indices characterizing population convergence, diversity and feasibility; uses a deep reinforcement learning agent to adaptively select, based on the current state, a combined action of cooperation mode and evolution operator; computes a reward from the improvement in cost, emission and constraint satisfaction after the action is executed; trains the agent with the states, actions and rewards; and iteratively refines the decision policy until a Pareto-optimal scheduling scheme satisfying the constraints is output. The invention realizes adaptive intelligent guidance of the evolutionary process and markedly improves the optimization quality, convergence speed and algorithmic robustness of economic emission scheduling for power systems.
Inventors
- Luo Wenguan
- Tan Suoyi
- Yu Xiaobing
- Lv Xin
Assignees
- National University of Defense Technology (中国人民解放军国防科技大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-29
Claims (8)
- 1. A competitive evolution multitask optimization method for solving the joint economic emission dispatch problem of a power system, the dispatch problem being a constrained multi-objective optimization problem that jointly optimizes fuel cost and gas emission subject to a power balance constraint and generator capacity constraints, characterized by comprising the following steps: S1, constructing a main population and an auxiliary population that solve the constrained multi-objective optimization problem in parallel, wherein each individual represents a set of candidate generator output allocations satisfying the engineering constraints; S2, computing, from the objective function values and constraint violation degrees of the candidate solutions in the main and auxiliary populations, indices characterizing the optimization states of the two populations, the indices comprising a convergence index, a diversity index and a feasibility index; S3, based on the current optimization state, having a deep reinforcement learning agent select and execute an action from a preset set of combinations of cooperation mode and propagation operator to update the candidate solutions of the main and auxiliary populations, wherein the set consists of a plurality of predefined combined actions, each combined action consisting of one cooperation mode and one propagation operator, the cooperation mode prescribing the information-exchange rule between the main and auxiliary populations in the current iteration, and the propagation operator prescribing how candidate scheduling solutions are generated and updated under that cooperation mode; S4, computing a reward reflecting the degree of improvement in scheduling performance from the change in overall population performance after the action is executed, the overall performance being computed jointly from the convergence index, the diversity index and the feasibility index; S5, training the deep reinforcement learning agent with the optimization states, the executed actions and the rewards to update the decision policy guiding subsequent action selection; iteratively executing steps S2 to S5 until a preset termination condition is met, and outputting the Pareto-optimal solution set of the dispatch problem as the generator output allocation scheme; wherein the cooperation modes comprise tightly coupled cooperation, which exchanges solutions in every iteration, and loosely coupled cooperation, which exchanges solutions only in the environmental selection phase; and the propagation operators in the combination set comprise a genetic algorithm operator, a first differential evolution operator and a second differential evolution operator.
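As an illustration, the agent's discrete action space in step S3 can be sketched as the Cartesian product of cooperation modes and propagation operators. The operator labels ("DE/rand/1", "DE/current-to-best/1") and the epsilon-greedy selection rule below are assumptions for the sketch, not prescribed by the claim; the random branch also mirrors the untrained-agent behaviour of claim 5:

```python
import itertools
import random

# "tight": exchange solutions every iteration; "loose": only at environmental selection.
COOPERATION_MODES = ["tight", "loose"]
# Assumed names for the GA operator and the two differential evolution operators.
OPERATORS = ["GA", "DE/rand/1", "DE/current-to-best/1"]

# The agent's discrete action space: every (cooperation mode, operator) pair.
ACTIONS = list(itertools.product(COOPERATION_MODES, OPERATORS))


def select_action(q_values=None, epsilon=0.1, rng=random):
    # Before the agent is trained (q_values is None), sample uniformly at
    # random; afterwards use an epsilon-greedy rule over the Q-values.
    if q_values is None or rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda i: q_values[i])
```

With two modes and three operators the agent chooses among six combined actions per iteration.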
- 2. The method of competitive evolutionary multitasking optimization of claim 1, wherein the convergence index is calculated based on variations in ideal points and worst points between adjacent iterations; the diversity index is calculated based on the inverse of the average Euclidean distance of the solution to its neighboring solutions; the feasibility index is calculated based on the average of constraint violation values of all solutions in the population.
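A minimal sketch of how the three state indices of claim 2 might be computed, assuming objective vectors are stored as a NumPy array; the neighbourhood size `k` and the absence of normalization are illustrative assumptions:

```python
import numpy as np


def convergence_index(ideal_prev, ideal_curr, worst_prev, worst_curr):
    # Movement of the ideal and worst points between adjacent iterations;
    # zero movement indicates the population has stopped converging.
    return float(np.linalg.norm(ideal_prev - ideal_curr)
                 + np.linalg.norm(worst_prev - worst_curr))


def diversity_index(objs, k=2):
    # Inverse of the mean Euclidean distance from each solution to its
    # k nearest neighbours in objective space.
    dists = np.linalg.norm(objs[:, None, :] - objs[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)          # ignore self-distances
    nearest = np.sort(dists, axis=1)[:, :k]  # k closest neighbours per solution
    return float(1.0 / nearest.mean())


def feasibility_index(violations):
    # Mean constraint violation over all solutions (0 means fully feasible).
    return float(np.mean(violations))
```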
- 3. The method of claim 1, wherein computing the reward reflecting the improvement in scheduling performance from the change in overall population performance after executing the action is specifically: if the overall population performance in the current iteration improves over the previous iteration, a first, positive reward value is given; if it is unchanged from the previous iteration, a zero reward or a second, small reward value is given; and if it decreases from the previous iteration, a negative reward value is given.
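The three-case reward of claim 3 can be sketched directly; the concrete reward magnitudes and the comparison tolerance are assumptions:

```python
def step_reward(perf_prev, perf_curr, pos=1.0, small=0.1, neg=-1.0, tol=1e-9):
    # Positive reward when overall population performance improves,
    # a small reward when it is unchanged, a negative reward when it drops.
    if perf_curr > perf_prev + tol:
        return pos
    if abs(perf_curr - perf_prev) <= tol:
        return small
    return neg
```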
- 4. The method of claim 1, wherein training the deep reinforcement learning agent using the optimization states, the executed actions and the rewards is specifically: training a deep Q network, the deep reinforcement learning agent being specifically the deep Q network, with an experience replay mechanism by minimizing a loss function; the loss function is: L(θ) = E_{(s_i, a_i) ~ D}[(y_i − Q(s_i, a_i; θ))²]; wherein L(θ) is the value of the loss function, D is the sampled training data, s_i and a_i are respectively the state and the action in the i-th iteration, Q(s_i, a_i; θ) is the output of the action-value function for the inputs s_i and a_i, i.e. the Q value of taking action a_i in state s_i; y_i is obtained by: y_i = r_i + γ · max_{a'} Q(s_{i+1}, a'; θ); wherein r_i is the reward obtained by taking action a_i in state s_i in the i-th iteration, γ is the discount factor, and max_{a'} Q(s_{i+1}, a'; θ) is the maximum Q value obtainable by taking the next action a' in state s_{i+1}.
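The bootstrapped target and mini-batch loss of claim 4 can be checked numerically with a small sketch; a separate target network and the network itself are omitted here for brevity, so this only illustrates the arithmetic of the update, not a full DQN:

```python
import numpy as np


def td_target(reward, gamma, q_next):
    # y_i = r_i + gamma * max_a' Q(s_{i+1}, a'): the bootstrapped target.
    return reward + gamma * np.max(q_next)


def dqn_loss(q_pred, targets):
    # Mean squared error between predicted Q(s_i, a_i) and targets y_i,
    # averaged over a replay-buffer mini-batch.
    q_pred = np.asarray(q_pred, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.mean((targets - q_pred) ** 2))
```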
- 5. The method of claim 1, further comprising: in each iteration, before the deep reinforcement learning agent has been trained, selecting the action at random from the preset set of cooperation mode and propagation operator combinations and storing the result in an experience replay pool; and after the deep reinforcement learning agent has been trained, selecting the action based on the current optimization state using the trained agent.
- 6. The method of claim 1, wherein the fuel cost value is obtained by: C = Σ_{i=1}^{N} (a_i + b_i·P_i + c_i·P_i²); wherein C is the total fuel cost, N is the total number of generators, i is the generator index, P_i is the output power of the i-th generator, and a_i, b_i, c_i are respectively the cost coefficients of the i-th generator; the gas emission value is obtained by: E = Σ_{i=1}^{N} (α_i + β_i·P_i + γ_i·P_i²); wherein E is the total gas emission and α_i, β_i, γ_i are respectively the emission coefficients of the i-th generator, P_i being the output power of the i-th generator; the optimization objective of the dispatch problem is: min (C, E), subject to Σ_{i=1}^{N} P_i = P_D and P_i^min ≤ P_i ≤ P_i^max; wherein C and E are the optimization objectives, P_D is the total power load demand to be met, Σ_{i=1}^{N} P_i is the actual total power generation, and P_i^min and P_i^max are respectively the minimum and maximum generated power allowed for the i-th generator.
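The quadratic cost and emission objectives and the constraint-violation measure of claim 6 can be sketched as follows; the coefficient values used in any call are hypothetical, and the aggregation of balance and bound violations into one scalar is an assumption:

```python
import numpy as np


def fuel_cost(P, a, b, c):
    # Total fuel cost: sum_i (a_i + b_i*P_i + c_i*P_i^2).
    P = np.asarray(P, dtype=float)
    return float(np.sum(a + b * P + c * P ** 2))


def gas_emission(P, alpha, beta, gamma):
    # Total gas emission: sum_i (alpha_i + beta_i*P_i + gamma_i*P_i^2).
    P = np.asarray(P, dtype=float)
    return float(np.sum(alpha + beta * P + gamma * P ** 2))


def constraint_violation(P, demand, p_min, p_max):
    # Power-balance violation plus generator capacity-bound violations.
    P = np.asarray(P, dtype=float)
    balance = abs(P.sum() - demand)
    bounds = np.sum(np.maximum(p_min - P, 0.0) + np.maximum(P - p_max, 0.0))
    return float(balance + bounds)
```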
- 7. A competitive evolution multitask optimization system for performing the method of any one of claims 1 to 6 to solve the joint economic emission dispatch problem of a power system, the dispatch problem being a constrained multi-objective optimization problem that jointly optimizes fuel cost and gas emission subject to a power balance constraint and generator capacity constraints, the system comprising: a population construction module for constructing a main population and an auxiliary population that solve the constrained multi-objective optimization problem in parallel, wherein each individual represents a set of candidate generator output allocations satisfying the engineering constraints; a state generation module for computing, from the objective function values and constraint violation degrees of the candidate solutions in the main and auxiliary populations, indices characterizing the optimization states of the two populations, the indices comprising a convergence index, a diversity index and a feasibility index; an action selection module for having the deep reinforcement learning agent, based on the current optimization state, select and execute an action from a preset set of cooperation mode and propagation operator combinations so as to update the candidate solutions of the main and auxiliary populations; a reward calculation module for computing a reward reflecting the improvement in scheduling performance from the change in overall population performance after the action is executed, the overall performance being computed jointly from the convergence index, the diversity index and the feasibility index; and a model training module for training the deep reinforcement learning agent with the optimization states, the executed actions and the rewards to update the decision policy guiding subsequent action selection; the population construction module, state generation module, action selection module, reward calculation module and model training module being run iteratively until a preset termination condition is met, whereupon the Pareto-optimal solution set of the dispatch problem is output as the generator output allocation scheme.
- 8. A competitive evolution multitask optimization device, comprising: one or more processors; and a memory storing a computer program; wherein the computer program, when executed by the one or more processors, implements the competitive evolution multitask optimization method of any one of claims 1 to 6.
Description
Competitive evolution multitask optimization method, system and equipment

Technical Field

The invention relates to the technical field of power system operation scheduling and intelligent optimization control, and in particular to a competitive evolution multitask optimization method, system and device.

Background

In practical applications such as power system operation, energy scheduling and engineering system design, multiple conflicting performance indices often must be optimized simultaneously while satisfying various engineering constraints; for example, joint economic emission dispatch of a power system must simultaneously reduce operating cost and pollutant emission while satisfying the power balance constraint and generator capacity constraints. Such problems can generally be abstracted as Constrained Multi-objective Optimization Problems (CMOPs), whose feasible solution space is limited by physical constraints and engineering conditions and often exhibits narrow, discontinuously distributed and morphologically complex feasible regions, so that searching for and generating a scheduling scheme faces considerable engineering difficulty. Existing evolutionary multi-objective optimization methods can, to some extent, generate candidate solution sets satisfying the constraints, but they have obvious shortcomings in practical engineering scheduling applications.
On the one hand, traditional methods generally rely on a single evolution operator or a fixed population cooperation mechanism and find it difficult to dynamically balance convergence speed, solution-set distribution uniformity and constraint satisfaction as the system's operating state changes; the quality of the generated scheduling schemes is therefore unstable, and many infeasible schemes may even appear, affecting the operational safety of the engineering system. To improve the solving efficiency of complex scheduling problems, the Evolutionary Multi-tasking (EMT) framework offers a feasible approach for accelerating the search by transferring knowledge among several related optimization tasks. However, existing multi-task evolution methods still face two technical bottlenecks in engineering applications: first, the knowledge transfer process lacks an effective mechanism for perceiving the system's operating state and making intelligent decisions, so ineffective or even negative transfer is easily introduced, wasting computing resources and reducing scheduling efficiency; second, the selection of evolution operators and cooperation strategies usually depends on manual experience or preset rules, making it difficult to adapt to dynamic operating scenarios of the power system such as load changes and changing constraint conditions.
Although the competitive evolution multitasking framework improves the diversity and search capability of scheduling schemes to some extent by introducing an auxiliary-population cooperation mechanism and multiple evolution operators, in the absence of a unified intelligent decision mechanism it still suffers from unreasonable cooperation-mode selection and inefficient operator use, causing uneven allocation of computing resources and unstable satisfaction of engineering constraints during scheduling, and making it difficult to meet the reliability and stability requirements of practical engineering scenarios such as joint economic emission dispatch of power systems. How to dynamically perceive the state of the scheduling process and intelligently select cooperation modes and evolution strategies under complex engineering constraints, so as to generate scheduling schemes that satisfy the constraints with excellent and stable performance, is therefore a technical problem to be solved by those skilled in the art.

Disclosure of Invention

To solve the difficulty, in existing engineering scheduling scenarios, of achieving stable and efficient joint optimization of multiple performance indices while satisfying various engineering constraints, the invention aims to provide a competitive evolution multitask optimization method, system and device for constrained multi-objective scheduling problems. The technical scheme provided by the invention is as follows: a competitive evolution multitask optimization method is used for solving the joint economic emission dispatch problem of an electric power sy