CN-121996532-A - Power load management system detection case arranging and combining method and system
Abstract
The invention provides a method and a system for arranging and combining detection cases of a power load management system. The method comprises: initializing an environment and an actor-critic model according to the detection cases of the power load management system; using the actor-critic model to select a detection case from the unselected detection cases according to the current policy and state; and, after each selection, iteratively updating the actor-critic model according to the selected detection case until every detection case has been selected, so that the optimal arrangement and combination result of the detection cases is obtained as a test path. Selecting the arrangement of the detection cases with the actor-critic model improves the efficiency of test-path optimization. At the same time, the iterative optimization of the actor-critic model gradually refines the test path, so that the relevance among the test cases is stronger, the test effect is better, fewer redundant test cases are executed, and the time required by the whole test flow is shortened.
Inventors
- CHEN KE
- WANG SHUYANG
- FENG MEILING
- CHEN SONGSONG
- REN MINGYUAN
- MA GUOHAN
- GUO QIANG
- TANG CONG
- LI BIN
Assignees
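- 中国电力科学研究院有限公司 (China Electric Power Research Institute Co., Ltd.)
- 国网甘肃省电力公司 (State Grid Gansu Electric Power Company)
- 国网山东省电力公司 (State Grid Shandong Electric Power Company)
- 国家电网有限公司 (State Grid Corporation of China)
- 华北电力大学 (North China Electric Power University)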
Dates
- Publication Date
- 20260508
- Application Date
- 20241107
Claims (16)
- 1. A method for arranging and combining detection cases of a power load management system, comprising: initializing an environment and an actor-critic model according to the detection cases of the power load management system; selecting, with the actor-critic model, a detection case from the unselected detection cases according to the current policy and state, and iteratively updating the actor-critic model according to the selected detection case after each selection until every detection case has been selected, so as to obtain an optimal arrangement and combination result of the detection cases as a test path; wherein the actor network of the actor-critic model is used for selecting detection cases according to the current policy and state, and the critic network of the actor-critic model is used for evaluating the reward according to the selected detection cases, the reward corresponding to the effectiveness or quality of the test.
- 2. The method for arranging and combining detection cases of a power load management system according to claim 1, wherein the initializing of the environment and the actor-critic model according to the detection cases of the power load management system comprises: mapping each detection case of the power load management system to a vector in a multidimensional space by means of one-hot encoding, wherein each dimension of the vector represents an attribute of the detection case, and the attributes comprise one or more of the function, the state or the detection purpose of the equipment; establishing and initializing a first list and a second list as the environment initialization, wherein the first list is initially empty and is used for storing the selected vectors, and the second list initially contains the vectors corresponding to all detection cases and is used for storing the vectors that have not yet been selected; and randomly initializing the actor network and the critic network as the initialization of the actor-critic model.
- 3. The method for arranging and combining detection cases of a power load management system according to claim 2, wherein the selecting, with the actor-critic model, a detection case from the unselected detection cases according to the current policy and state, and iteratively updating the actor-critic model according to the selected detection case after each selection until every detection case has been selected, so as to obtain an optimal arrangement and combination result of the detection cases as a test path, comprises: selecting one detection case according to the current policy and state by using the actor network of the actor-critic model, wherein the policy is the probability distribution over the detection cases given the current state and the policy parameters, and the state comprises the first list and the second list; storing the selected detection case in the first list in order of selection, deleting the selected detection case from the second list, and calculating the value of the reward after the detection case is selected; using the critic network of the actor-critic model to evaluate the state value before the detection case is selected and the state value after the detection case is selected, and calculating the value of the advantage function based on the state value before selection, the state value after selection and the value of the reward; updating the actor network and the critic network according to the value of the advantage function; and judging whether all selectable detection cases have been selected; if so, taking the first list as the optimal arrangement and combination result of the detection cases and as the test path, and ending; otherwise, returning to the step of selecting one detection case according to the current policy and state by using the actor network of the actor-critic model.
- 4. The method for arranging and combining detection cases of a power load management system according to claim 3, wherein the value of the reward is calculated by a formula (not reproduced in this text) in which r represents the value of the reward, m represents the number of detection cases in the test path, X_{i+1} represents the (i+1)-th detection case, and X_i represents the i-th detection case.
- 5. The method for arranging and combining detection cases of a power load management system according to claim 3, wherein the value of the advantage function is calculated as follows: A(s, a) = r + γV(s′) − V(s), where A(s, a) represents the value of the advantage function after action a is performed in the current state s, γ represents the discount factor, r represents the value of the reward, V(s′) represents the value of the state s′ reached after action a is performed, V(s) represents the value of the current state s, and performing action a means selecting one detection case.
- 6. The method for arranging and combining detection cases of a power load management system according to claim 3, wherein the actor network is updated by a formula (not reproduced in this text) in which θ_new represents the updated policy parameters of the actor network, α represents the learning rate of the actor network, θ represents the current policy parameters of the actor network, γ represents the discount factor, r represents the value of the reward, V(s′) represents the value of the state s′ reached after the action is performed, V(s) represents the value of the current state s, the policy π(a|s, θ) represents the probability distribution of performing action a given the current state s and the policy parameters θ, ∇_θ π(a|s, θ) represents the gradient of the policy π(a|s, θ) with respect to θ, and performing action a means selecting one detection case.
- 7. The method for arranging and combining detection cases of a power load management system according to claim 3, wherein the critic network is updated as follows: V_new(s) ← V(s) + β[r + γV(s′) − V(s)], where V_new(s) represents the updated value that the critic network assigns to the state s, β represents the learning rate of the critic network, γ represents the discount factor, r represents the value of the reward, V(s′) represents the value of the state s′ reached after the action is performed, V(s) represents the value of the current state s, and performing the action means selecting one detection case.
- 8. A system for arranging and combining detection cases of a power load management system, comprising an initialization module and an iterative update module; wherein the initialization module is used for initializing an environment and an actor-critic model according to the detection cases of the power load management system; the iterative update module is used for selecting, with the actor-critic model, a detection case from the unselected detection cases according to the current policy and state, and iteratively updating the actor-critic model according to the selected detection case after each selection until every detection case has been selected, so as to obtain an optimal arrangement and combination result of the detection cases as a test path; and the actor network of the actor-critic model is used for selecting detection cases according to the current policy and state, and the critic network of the actor-critic model is used for evaluating the reward according to the selected detection cases, the reward corresponding to the effectiveness or quality of the test.
- 9. The system for arranging and combining detection cases of a power load management system according to claim 8, wherein the initialization module is specifically configured for: mapping each detection case of the power load management system to a vector in a multidimensional space by means of one-hot encoding, wherein each dimension of the vector represents an attribute of the detection case, and the attributes comprise one or more of the function, the state or the detection purpose of the equipment; establishing and initializing a first list and a second list as the environment initialization, wherein the first list is initially empty and is used for storing the selected vectors, and the second list initially contains the vectors corresponding to all detection cases and is used for storing the vectors that have not yet been selected; and randomly initializing the actor network and the critic network as the initialization of the actor-critic model.
- 10. The system for arranging and combining detection cases of a power load management system according to claim 9, wherein the iterative update module is specifically configured for: selecting one detection case according to the current policy and state by using the actor network of the actor-critic model, wherein the policy is the probability distribution over the detection cases given the current state and the policy parameters, and the state comprises the first list and the second list; storing the selected detection case in the first list in order of selection, deleting the selected detection case from the second list, and calculating the value of the reward after the detection case is selected; using the critic network of the actor-critic model to evaluate the state value before the detection case is selected and the state value after the detection case is selected, and calculating the value of the advantage function based on the state value before selection, the state value after selection and the value of the reward; updating the actor network and the critic network according to the value of the advantage function; and judging whether all selectable detection cases have been selected; if so, taking the first list as the optimal arrangement and combination result of the detection cases and as the test path, and ending; otherwise, returning to the step of selecting one detection case according to the current policy and state by using the actor network of the actor-critic model.
- 11. The system for arranging and combining detection cases of a power load management system according to claim 10, wherein the value of the reward in the iterative update module is calculated by a formula (not reproduced in this text) in which r represents the value of the reward, m represents the number of detection cases in the test path, X_{i+1} represents the (i+1)-th detection case, and X_i represents the i-th detection case.
- 12. The system for arranging and combining detection cases of a power load management system according to claim 10, wherein the value of the advantage function in the iterative update module is calculated as follows: A(s, a) = r + γV(s′) − V(s), where A(s, a) represents the value of the advantage function after action a is performed in the current state s, γ represents the discount factor, r represents the value of the reward, V(s′) represents the value of the state s′ reached after action a is performed, V(s) represents the value of the current state s, and performing action a means selecting one detection case.
- 13. The system for arranging and combining detection cases of a power load management system according to claim 10, wherein the iterative update module updates the actor network by a formula (not reproduced in this text) in which θ_new represents the updated policy parameters of the actor network, α represents the learning rate of the actor network, θ represents the current policy parameters of the actor network, γ represents the discount factor, r represents the value of the reward, V(s′) represents the value of the state s′ reached after the action is performed, V(s) represents the value of the current state s, the policy π(a|s, θ) represents the probability distribution of performing action a given the current state s and the policy parameters θ, ∇_θ π(a|s, θ) represents the gradient of the policy π(a|s, θ) with respect to θ, and performing action a means selecting one detection case.
- 14. The system for arranging and combining detection cases of a power load management system according to claim 10, wherein the iterative update module updates the critic network as follows: V_new(s) ← V(s) + β[r + γV(s′) − V(s)], where V_new(s) represents the updated value that the critic network assigns to the state s, β represents the learning rate of the critic network, γ represents the discount factor, r represents the value of the reward, V(s′) represents the value of the state s′ reached after the action is performed, V(s) represents the value of the current state s, and performing the action means selecting one detection case.
- 15. A computer device, comprising at least one processor and a memory, wherein the memory and the processor are connected through a bus; the memory is used for storing one or more programs; and when the one or more programs are executed by the at least one processor, the method for arranging and combining detection cases of a power load management system according to any one of claims 1 to 7 is implemented.
- 16. A computer-readable storage medium having stored thereon an executable program which, when executed, implements the method for arranging and combining detection cases of a power load management system according to any one of claims 1 to 7.
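The select-and-update loop recited in claims 2, 3, 5 and 7 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several parts are assumptions made only so the example runs: the reward and actor-update formulas do not survive in the claim text, so `step_reward` uses a hypothetical attribute-overlap score and the actor step uses the common log-policy gradient form θ ← θ + α·A·∇_θ log π(a|s, θ); the actor is a softmax table conditioned only on the last selected case rather than a neural network; the critic is a table initialized to zero rather than randomly; the lists hold case indices instead of the vectors themselves; and the selection pass is repeated for `episodes` passes so the policy has a chance to improve. All function and parameter names are hypothetical.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step_reward(prev_vec, next_vec):
    """Hypothetical reward: attribute overlap of two consecutive cases."""
    if prev_vec is None:  # the first selection has no predecessor
        return 0.0
    return float(sum(a * b for a, b in zip(prev_vec, next_vec)))


def arrange_cases(case_vectors, episodes=200, alpha=0.1, beta=0.1,
                  gamma=0.9, seed=0):
    """Return one ordering (test path) of case indices."""
    rng = random.Random(seed)
    n = len(case_vectors)
    # Actor, randomly initialized: theta[last + 1][j] is the preference for
    # picking case j right after case `last`; row 0 means nothing selected yet.
    theta = [[rng.uniform(-0.01, 0.01) for _ in range(n)]
             for _ in range(n + 1)]
    # Critic: tabular V(s); unseen states default to 0 (an assumption).
    value = {}
    first_list = []
    for _ in range(episodes):
        # Environment initialization: empty first list, full second list.
        first_list, second_list = [], list(range(n))
        while second_list:
            last = first_list[-1] if first_list else -1
            state = (tuple(first_list), tuple(second_list))
            candidates = list(second_list)
            # Policy: probability distribution over the unselected cases.
            probs = softmax([theta[last + 1][j] for j in candidates])
            action = rng.choices(candidates, weights=probs)[0]
            prev_vec = case_vectors[last] if first_list else None
            reward = step_reward(prev_vec, case_vectors[action])
            # Move the selected case from the second list to the first list.
            first_list.append(action)
            second_list.remove(action)
            next_state = (tuple(first_list), tuple(second_list))
            v_s = value.get(state, 0.0)
            v_next = value.get(next_state, 0.0) if second_list else 0.0
            # Advantage: A(s, a) = r + gamma * V(s') - V(s).
            advantage = reward + gamma * v_next - v_s
            # Critic update: V(s) <- V(s) + beta * [r + gamma*V(s') - V(s)].
            value[state] = v_s + beta * advantage
            # Actor update with the gradient of log softmax (assumed form).
            for j, p in zip(candidates, probs):
                grad_log = (1.0 if j == action else 0.0) - p
                theta[last + 1][j] += alpha * advantage * grad_log
    return first_list


if __name__ == "__main__":
    # Four hypothetical detection cases described by three binary attributes.
    vectors = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
    print(arrange_cases(vectors))
```

The returned list is the first list of the final pass, that is, a permutation of all case indices. With the overlap reward assumed here, orderings that place cases with shared attributes next to each other tend to be favoured, but the sketch makes no claim about optimality.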
Description
Power load management system detection case arranging and combining method and system
Technical Field
The invention relates to the technical field of power load management, in particular to a method and a system for arranging and combining detection cases of a power load management system.
Background
With the continuous evolution of the global energy structure and the deepening reform of the electric power system, the construction of new-type power load management systems is advancing steadily. Such a management system must cope not only with increasingly complex power networks, but also with users' stringent requirements on the efficiency, stability and reliability of the power supply. Nevertheless, for such a highly intelligent management scheme, there is still significant room for improvement in the on-site operation and detection of the user side and the control loop. Complex and changeable detection items and operation flows directly affect the efficiency and accuracy of field detection, and in turn challenge the safety and stability of overall operation. On-site detection of the user side and the control loop is critical to ensuring the normal operation of the power system. The energy demand of users may fluctuate with seasons, time of day or specific events, so monitoring changes in user-side power demand in real time is a complex and critical task. As energy consumption patterns change and the energy supply continues to evolve, the state and demands of user-side equipment also change continuously, which further raises the requirements on the real-time performance and accuracy of on-site detection. In addition, the stability and performance of the control loop, which is a core component of the power system, directly affect the operating efficiency and safety of the overall system.
However, the complexity and variability of the control loop make field detection work exceptionally cumbersome, and the work may be affected by various factors such as the external environment, equipment failure or human operation. Detecting and resolving these influencing factors quickly and effectively is critical to ensuring system stability and safe operation. Although fault detection theory and experience for the user-side terminal and the control loop are rich, existing detection technology focuses mainly on the system output data of fault detection. Detection cases nevertheless play a vital role in evaluating and maintaining the power system: they can help identify potential system problems, guide maintenance decisions and optimize system performance. Therefore, developing an effective method for arranging and combining detection cases, so as to improve detection efficiency and accuracy, is key to improving the stability and safe operation of the system.
Disclosure of Invention
In order to solve the problem of insufficient efficiency and accuracy of the power load management detection flow in the prior art, the invention provides a method for arranging and combining detection cases of a power load management system, which comprises the following steps: initializing an environment and an actor-critic model according to the detection cases of the power load management system; selecting, with the actor-critic model, a detection case from the unselected detection cases according to the current policy and state, and iteratively updating the actor-critic model according to the selected detection case after each selection until every detection case has been selected, so as to obtain an optimal arrangement and combination result of the detection cases as a test path; wherein the actor network of the actor-critic model is used for selecting detection cases according to the current policy and state,
and the critic network of the actor-critic model is used for evaluating the reward according to the selected detection cases, wherein the reward corresponds to the effectiveness or quality of the test.
Preferably, the initializing of the environment and the actor-critic model according to the detection cases of the power load management system comprises: mapping each detection case of the power load management system to a vector in a multidimensional space by means of one-hot encoding, wherein each dimension of the vector represents an attribute of the detection case, and the attributes comprise one or more of the function, the state or the detection purpose of the equipment; establishing and initializing a first list and a second list as the environment initialization, wherein the first list is initially empty and is used for storing the selected vectors, and the second list initially contains the vectors corresponding to all detection cases and is used for storing the vectors that have not yet been selected; the actor network and the critic network are randomly init