CN-122021941-A - Group game strategy generation method and group game method


Abstract

The invention provides a group game strategy generation method and a group game method in the technical field of artificial intelligence. The method explicitly defines a game strategy as clearly structured, executable strategy code, so that a human can directly read an agent's decision logic. The resulting strategies are interpretable, are easy for users to debug and to correct for logic flaws, and can be applied in fields with strict safety and compliance requirements. Moreover, because the method draws on the programming knowledge of a large language model, the generated strategy population is naturally diverse in algorithmic structure, and highly competitive strategy code can be written directly at the initial stage of the game. This achieves a zero-shot cold start and greatly reduces computing power consumption and time cost. The method suits application scenarios with demanding requirements on strategy interpretability, logical complexity, and diversity.

Inventors

ZHANG JUNGE
SONG JIALU

Assignees

Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)

Dates

Publication Date: 20260512
Application Date: 20260410

Claims (10)

1. A group game strategy generation method, characterized by comprising: acquiring an initial prompt library, wherein the initial prompt library comprises a plurality of initial prompts, and each initial prompt comprises game strategy description text; applying a large language model to generate a preset number of initial strategy codes based on the initial prompt library, and constructing an initial strategy population from the initial strategy codes; performing multiple iterations on the initial strategy population, wherein in each iteration combined games are played based on the strategy codes in the current strategy population and a mixed-strategy Nash equilibrium distribution of the current strategy population is solved; and applying the large language model to generate a counter-strategy code corresponding to an optimal strategy code, based on the optimal strategy code corresponding to the mixed-strategy Nash equilibrium distribution and on game trajectory records of games won by the optimal strategy code, and updating the current strategy population based on the counter-strategy code.
2. The group game strategy generation method according to claim 1, wherein, before updating the current strategy population based on the counter-strategy code, the method further comprises: checking the counter-strategy code for syntactic validity; and, if the counter-strategy code contains a syntax error, applying the large language model to generate a syntactically correct counter-strategy code based on the syntax error information of the counter-strategy code, the optimal strategy code, and the game trajectory records.
3. The group game strategy generation method according to claim 1, wherein, before updating the current strategy population based on the counter-strategy code, the method further comprises: if the counter-strategy code contains adjustable hyperparameters, performing self-play based on the counter-strategy code and determining an optimal parameter configuration of the adjustable hyperparameters during the self-play; and determining a counter-strategy code with optimal parameters based on the optimal parameter configuration.
4. The group game strategy generation method according to any one of claims 1-3, wherein solving the mixed-strategy Nash equilibrium distribution of the current strategy population comprises: calculating a meta-game payoff matrix from the combined games; and solving the mixed-strategy Nash equilibrium distribution of the current strategy population based on the meta-game payoff matrix.
5. The group game strategy generation method according to claim 4, wherein the termination condition of the multiple iterations comprises any one of: the iteration round reaches a maximum iteration round; the meta-game payoff matrix converges.
6. The group game strategy generation method according to any one of claims 1-3, wherein each initial prompt further comprises tactical style information, and the initial prompt library comprises diversified tactical style information.
7. A group game method, comprising: determining a target mixed-strategy Nash equilibrium distribution of an optimal strategy population based on the group game strategy generation method according to any one of claims 1-6; and selecting a target strategy code from the optimal strategy population based on the target mixed-strategy Nash equilibrium distribution, executing the target strategy code, and playing the game.
8. A group game strategy generation device, characterized by comprising: an acquisition module, configured to acquire an initial prompt library, wherein the initial prompt library comprises a plurality of initial prompts, and each initial prompt comprises game strategy description text; a population construction module, configured to apply a large language model to generate a preset number of initial strategy codes based on the initial prompt library, and to construct an initial strategy population from the initial strategy codes; a population iteration module, configured to perform multiple iterations on the initial strategy population, wherein in each iteration combined games are played based on the strategy codes in the current strategy population and a mixed-strategy Nash equilibrium distribution of the current strategy population is solved; and a population updating module, configured to apply the large language model to generate a counter-strategy code corresponding to an optimal strategy code, based on the optimal strategy code corresponding to the mixed-strategy Nash equilibrium distribution and on game trajectory records of games won by the optimal strategy code, and to update the current strategy population based on the counter-strategy code.
9. A group game device, comprising: an equilibrium distribution determining module, configured to determine a target mixed-strategy Nash equilibrium distribution of an optimal strategy population based on the group game strategy generation method according to any one of claims 1-6; and a game module, configured to select a target strategy code from the optimal strategy population based on the target mixed-strategy Nash equilibrium distribution, execute the target strategy code, and play the game.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the group game strategy generation method according to any one of claims 1-6, or the group game method according to claim 7.
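
Illustrative sketch (not part of the claims). The iteration recited in claims 1, 2, 4, 5 and 7 can be pictured with a toy rock-paper-scissors population. Everything named here is an assumption made for illustration: `make_source` and `generate_counter` are hypothetical stand-ins for the large language model, the game is assumed to be symmetric and zero-sum, the equilibrium is approximated by fictitious play (the source does not name a solver), the "optimal strategy code" is taken to be the one with the highest equilibrium probability, and game trajectory records are omitted.

```python
import ast
import random

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value

def make_source(move):
    """Hypothetical stand-in for LLM output: a tiny strategy program."""
    return f'def act():\n    return "{move}"\n'

def is_valid_syntax(source):
    """Syntax validity check before a strategy code joins the population."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

def run_strategy(source):
    """Execute a strategy code and return its move (no sandboxing: demo only)."""
    namespace = {}
    exec(source, namespace)
    return namespace["act"]()

def play(source_a, source_b):
    """Payoff of strategy a against strategy b: +1 win, 0 draw, -1 loss."""
    a, b = run_strategy(source_a), run_strategy(source_b)
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

def meta_payoffs(population):
    """Meta-game payoff matrix: entry [i][j] is i's payoff when playing j."""
    return [[play(a, b) for b in population] for a in population]

def solve_nash(payoffs, rounds=20000):
    """Approximate a mixed-strategy Nash equilibrium of a symmetric zero-sum
    game by fictitious play: repeatedly best-respond to the empirical mix."""
    n = len(payoffs)
    counts = [0] * n
    counts[0] = 1
    scores = [payoffs[i][0] for i in range(n)]  # payoff vs. empirical history
    for _ in range(rounds):
        best = max(range(n), key=lambda i: scores[i])
        counts[best] += 1
        for i in range(n):
            scores[i] += payoffs[i][best]
    total = sum(counts)
    return [c / total for c in counts]

def generate_counter(best_source):
    """Hypothetical stub for the LLM call that writes a counter-strategy."""
    target = run_strategy(best_source)
    winner = next(m for m, loser in BEATS.items() if loser == target)
    return make_source(winner)

def evolve(population, max_rounds=10):
    """Iterate: solve the meta-game, counter the top strategy, grow the population."""
    population = list(population)
    for _ in range(max_rounds):
        dist = solve_nash(meta_payoffs(population))
        best = population[max(range(len(population)), key=lambda i: dist[i])]
        counter = generate_counter(best)
        if not is_valid_syntax(counter):
            continue  # a real system would re-prompt the LLM with the error
        if counter in population:
            break  # population (and payoff matrix) unchanged: treat as converged
        population.append(counter)
    return population, solve_nash(meta_payoffs(population))

def sample_strategy(population, dist, rng=random):
    """Pick the strategy code to execute according to the equilibrium mix."""
    return rng.choices(population, weights=dist, k=1)[0]
```

Starting from `evolve([make_source("rock")])`, the toy population grows to three strategy codes and the final distribution approaches the uniform mix. A linear-programming solver could replace fictitious play where an exact equilibrium is needed.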

Description

Group game strategy generation method and group game method

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a group game strategy generation method and a group game method.

Background

In the fields of artificial intelligence and complex system control, group games have long been a central research problem. A group game involves a large number of agents participating jointly, where each individual's payoff depends not only on its own strategy but also on its interaction with the strategies of the other participants. Group games are therefore widely applied in scenarios such as group confrontation in large-scale real-time strategy games and attack-defense exercises in network security.

Currently, in the fields of group games and multi-agent systems, strategy generation based on neural network parameter optimization models an agent's strategy as a deep neural network, in which knowledge of the strategy is stored implicitly. The workflow usually relies on large-scale self-play: the agent continuously plays against itself to collect large amounts of state and action data, and fine-tunes the network parameters by gradient descent or evolutionary algorithms. Although such methods excel in reaction speed and accuracy, the generated strategies have poor interpretability and controllability and lack an explicit logical structure.

With the rise of generative artificial intelligence, researchers have begun trying to make decisions directly using the semantic understanding capabilities of large language models.
This approach does not train specific network parameters. Instead, natural language prompts are designed to describe the game strategy, the current situation, and historical information to a large language model, which then directly outputs a suggested next action or a tactical analysis. The approach exploits the general reasoning capability of large language models and has some zero-shot adaptability. However, because it relies on natural-language chain-of-thought reasoning, its decision process is often accompanied by a risk of hallucination, and its inference is slow, making it difficult to meet the demands of high-frequency or real-time games. Moreover, its strategy logic remains unstructured natural language text rather than strict executable logic.

In addition, reinforcement learning typically starts from randomly initialized weights, and the agent must go through extensive action exploration to learn even a basic strategy, which makes cold start difficult and consumes enormous computing power and time.

In summary, conventional strategy generation techniques face bottlenecks in strategy interpretability and agent training efficiency, so a group game strategy generation method is needed.

Disclosure of Invention

The invention provides a group game strategy generation method and a group game method to overcome the above defects in the related art.
The invention provides a group game strategy generation method, comprising: acquiring an initial prompt library, wherein the initial prompt library comprises a plurality of initial prompts, and each initial prompt comprises game strategy description text; applying a large language model to generate a preset number of initial strategy codes based on the initial prompt library, and constructing an initial strategy population from the initial strategy codes; performing multiple iterations on the initial strategy population, wherein in each iteration combined games are played based on the strategy codes in the current strategy population and a mixed-strategy Nash equilibrium distribution of the current strategy population is solved; and applying the large language model to generate a counter-strategy code corresponding to an optimal strategy code, based on the optimal strategy code corresponding to the mixed-strategy Nash equilibrium distribution and on game trajectory records of games won by the optimal strategy code, and updating the current strategy population based on the counter-strategy code.

According to the group game strategy generation method provided by the invention, before updating the current strategy population based on the counter-strategy code, the method further comprises: checking the counter-strategy code for syntactic validity; and, if the counter-strategy code contains a syntax error, applying the large language model to generate a syntactically correct counter-strategy code based on the syntax error information of the counter-strategy code, the optimal strategy code, and the game trajectory records.

According to the method for generating the group game strategy provi