CN-122021360-A - Full-flow virtual simulation training method and device based on multi-agent cooperation


Abstract

The application discloses a full-flow virtual simulation training method and device based on multi-agent cooperation, and relates to the technical field of virtual simulation. The method comprises: obtaining a training target, from which a scenario agent dynamically generates a task sequence and event triggering conditions; constructing, by a world agent, a unified world state comprising an environment state, an object state and a task state; during training, receiving student operations and triggering corresponding events by an event engine in combination with the world state; generating a decision package by each agent based on the updated state; obtaining a target decision through conflict detection and weighted arbitration; updating the world state under a transaction consistency mechanism, while feeding the update result back to the simulation client and driving the next cycle of multi-agent collaborative decision-making; and constructing a capability vector from student behavior data to realize dynamic evaluation and adaptive adjustment of the training content.

Inventors

SHEN GONGJUN
LI KUNXU

Assignees

重庆恒琪信息科技有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2026-04-14

Claims (10)

1. A full-flow virtual simulation training method based on multi-agent cooperation, characterized by comprising the following steps: acquiring a training target, and generating, by a scenario agent, an initial task sequence and event triggering conditions based on the training target; initializing a virtual simulation environment by a world agent and constructing a world state, wherein the world state comprises an environment state variable, an object state variable, a task state variable, a role relation state and a system control state, and writing the initial task sequence and the event triggering conditions into the task state variable; receiving operation behaviors or interaction information of a student through a simulation client, transmitting them to an event engine, and determining, by the event engine, whether to trigger a corresponding event and update the task state by combining the current world state with the event triggering conditions; broadcasting the operation behavior, the event triggering result and the updated world state to a plurality of agents, each agent generating a corresponding decision package based on the world state and the task state, wherein the decision package comprises a target state, an execution action, a risk level and a confidence level; performing conflict detection on the plurality of decision packages, and performing weighted arbitration based on the confidence levels and preset weights to obtain a target decision result; updating the world state according to the target decision result under a transaction consistency management mechanism, which comprises state locking, state updating, and transaction commit or rollback, so as to obtain an updated world state; feeding the updated world state back to the simulation client for visual presentation and synchronizing it to each agent as the input of the next round of decisions, so that a loop-iterated collaborative training process is formed; and recording the behaviors of the student during the loop, and assessing the capability of the student based on changes in the world state and the task state, so as to realize dynamic assessment of the training process.
2. The multi-agent cooperation-based full-flow virtual simulation training method of claim 1, wherein the plurality of agents includes a scenario agent, a knowledge agent, a role agent, an evaluation agent, and a world agent, wherein the scenario agent generates or adjusts the task sequence and the event triggering conditions based on the training target and the task state, and dynamically updates subsequent tasks according to world state changes in each cycle so as to keep the training process advancing.
3. The multi-agent cooperation-based full-flow virtual simulation training method of claim 2, wherein generating, by each agent, a corresponding decision package based on the world state and the task state comprises: determining a target state based on the world state and the task state; generating an execution action according to the target state; determining a risk level based on the current environmental risk and the task context; calculating a confidence level by combining historical decision outcomes or knowledge reasoning results; and encapsulating the target state, the execution action, the risk level and the confidence level into the decision package.
4. The multi-agent cooperation-based full-flow virtual simulation training method of claim 3, wherein performing conflict detection on the plurality of decision packages and performing weighted arbitration based on the confidence levels and preset weights to obtain a target decision result comprises: comparing the execution actions and the target states in the decision packages for consistency so as to identify conflicting decisions; computing a weighted score for each decision package based on the preset weight of the corresponding agent and the confidence level of the package; and selecting the target decision result based on the weighted scores and using it to drive the subsequent world state update.
5. The multi-agent cooperation-based full-flow virtual simulation training method of claim 4, wherein updating the world state according to the target decision result under the transaction consistency management mechanism, which comprises state locking, state updating, and transaction commit or rollback, to obtain the updated world state comprises: generating a transaction identifier and recording the current world state version; locking the world state to be updated; executing the state updating operation corresponding to the target decision result; and committing the transaction and generating a new state version when the update succeeds, or rolling back to the recorded state version when the update fails, so that state consistency is ensured during multi-agent collaboration.
6. The multi-agent cooperation-based full-flow virtual simulation training method as claimed in any one of claims 1-5, wherein assessing the capability of the student based on changes in the world state and the task state to realize dynamic assessment of the training process comprises: constructing a capability vector comprising a plurality of capability dimensions; calculating an evaluation value for each capability dimension based on the operation behaviors of the student, the progress of the task state and the resulting world state changes; and dynamically updating the capability vector according to a preset updating rule, and adaptively adjusting the subsequent task sequence or event triggering conditions based on the updated capability vector.
7. A full-flow virtual simulation training device based on multi-agent cooperation, characterized by comprising: a target acquisition module, configured to acquire a training target, a scenario agent generating an initial task sequence and event triggering conditions based on the training target; a state construction module, configured to initialize a virtual simulation environment by a world agent and construct a world state, wherein the world state comprises an environment state variable, an object state variable, a task state variable, a role relation state and a system control state, and to write the initial task sequence and the event triggering conditions into the task state variable; a trigger judging module, configured to receive operation behaviors or interaction information of a student through a simulation client, transmit them to an event engine, and determine, by the event engine, whether to trigger a corresponding event and update the task state by combining the current world state with the event triggering conditions; a decision generation module, configured to broadcast the operation behavior, the event triggering result and the updated world state to a plurality of agents, each agent generating a corresponding decision package based on the world state and the task state, wherein the decision package comprises a target state, an execution action, a risk level and a confidence level; a result acquisition module, configured to perform conflict detection on the plurality of decision packages and perform weighted arbitration based on the confidence levels and preset weights to obtain a target decision result; a state updating module, configured to update the world state according to the target decision result under a transaction consistency management mechanism, which comprises state locking, state updating, and transaction commit or rollback, so as to obtain an updated world state; a state feedback module, configured to feed the updated world state back to the simulation client for visual presentation and synchronize it to each agent as the input of the next round of decisions, so as to form a loop-iterated collaborative training process; and a dynamic evaluation module, configured to record the behaviors of the student during the loop and assess the capability of the student based on changes in the world state and the task state, so as to realize dynamic assessment of the training process.
8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the multi-agent cooperation-based full-flow virtual simulation training method of any one of claims 1-6.
9. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the multi-agent cooperation-based full-flow virtual simulation training method of any one of claims 1-6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the multi-agent cooperation-based full-flow virtual simulation training method of any one of claims 1-6.
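
For illustration only, the decision package, conflict detection, weighted arbitration and transactional state update recited in claims 3 to 5 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `DecisionPackage`, `detect_conflicts`, `arbitrate` and `WorldStore`, the use of a flat dictionary as the world state, the score defined as preset weight multiplied by confidence, and the `validate` callback that decides whether an update succeeds are all hypothetical. Conflict detection here compares target states only, which simplifies the comparison of both execution actions and target states described in claim 4.

```python
import copy
import threading
import uuid
from dataclasses import dataclass


@dataclass
class DecisionPackage:
    """Hypothetical decision package: the four fields named in the claims."""
    agent: str
    target_state: dict   # world-state keys the agent wants to set
    action: str
    risk_level: int
    confidence: float


def detect_conflicts(packages):
    """Return pairs of agents whose target states disagree on a shared key."""
    conflicts = []
    for i, a in enumerate(packages):
        for b in packages[i + 1:]:
            shared = a.target_state.keys() & b.target_state.keys()
            if any(a.target_state[k] != b.target_state[k] for k in shared):
                conflicts.append((a.agent, b.agent))
    return conflicts


def arbitrate(packages, weights):
    """Pick the package with the highest (preset weight * confidence) score.

    Agents missing from `weights` default to a weight of 1.0 (an assumption).
    """
    return max(packages, key=lambda p: weights.get(p.agent, 1.0) * p.confidence)


class WorldStore:
    """Hypothetical versioned world state with lock, commit and rollback."""

    def __init__(self, state):
        self.state = state
        self.version = 0
        self.log = []                      # (transaction id, version, outcome)
        self._lock = threading.Lock()

    def apply(self, decision, validate):
        txn_id = uuid.uuid4().hex          # transaction identifier
        with self._lock:                   # state locking
            snapshot = copy.deepcopy(self.state)   # copy of the current version
            self.state.update(decision.target_state)   # state updating
            if not validate(self.state):
                self.state = snapshot      # rollback to the recorded version
                self.log.append((txn_id, self.version, "rollback"))
                return False
            self.version += 1              # commit produces a new version
            self.log.append((txn_id, self.version, "commit"))
            return True
```

In this sketch a failed update leaves both the state and the version number unchanged, and the log keeps one entry per transaction so that each update can be traced. A single lock over the whole state is the simplest choice; finer-grained locking of only the variables to be updated would follow the same pattern.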

Description

Full-flow virtual simulation training method and device based on multi-agent cooperation

Technical Field

The invention relates to the technical field of virtual simulation, in particular to a full-flow virtual simulation training method and device based on multi-agent cooperation.

Background

With the continuous development of computer technology, virtual simulation technology and artificial intelligence technology, virtual simulation training systems are widely applied in complex scenarios such as public safety, emergency management, industrial production, medical care and military exercises, where they are used to improve the operational capability and decision-making level of personnel. Existing virtual simulation training systems generally construct the training process in a script-driven or rule-engine-driven manner, and advance it through preset scenarios, triggering conditions and fixed interaction paths.

However, these technical solutions still have many drawbacks in practical application. First, the training content depends on manual presetting, so the development period is long and the cost is high, the content is difficult to adjust dynamically according to student performance, and flexibility and extensibility are lacking. Second, most interaction takes the form of fixed paths or option selection, which makes it difficult to support open-ended decisions and natural language input and limits the realism and complexity of training. Third, existing systems are generally based on a single agent or a simple auxiliary tool that only provides prompts or question answering; they cannot realize collaborative decision-making and interaction among multiple roles, and can hardly simulate the multi-party collaboration found in real complex scenarios.
In addition, under multi-user concurrency or in complex scenarios, existing systems generally lack a unified consistency control mechanism for managing the state of the simulation environment, so that problems such as unsynchronized state and logic conflicts easily occur and the reliability of training results is affected. Meanwhile, most existing evaluation mechanisms use a single result-oriented score and lack dynamic analysis of the training process and modeling of capability growth, so they cannot fully reflect changes in the overall capability of students. Therefore, a virtual simulation training method is needed that supports multi-agent collaborative decision-making, realizes consistency control of the simulation environment state, and provides dynamic content generation and capability modeling, so as to improve the intelligence level and application effect of the training system.

Disclosure of Invention

The invention aims to provide a full-flow virtual simulation training method and device based on multi-agent cooperation, which realize dynamic driving and closed-loop control of the virtual simulation training process by introducing a multi-agent cooperation mechanism and constructing a unified world state management system. Specifically, the scenario agent generates a task sequence and event triggering conditions based on the training target and writes them into the task state, so that the training process changes from a static preset into a dynamically evolving process, which effectively reduces the content development cost and improves training flexibility. Meanwhile, the event engine combines the world state with the event triggering conditions to analyze and respond to the trainee's operations in real time, which ensures that the training process can respond to unexpected behaviors.
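
As an illustrative sketch of the interplay described above, the following code shows a scenario agent writing a task sequence and event triggering conditions into the task state, and an event engine checking each trainee operation against the current world state. Everything concrete in it is an assumption made for the example and does not come from the application: the fire-drill scenario, the class names `WorldState` and `EventTrigger`, the functions `scenario_agent`, `init_world` and `event_engine`, and the representation of a trainee operation as a dictionary with `action` and `target` keys.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class WorldState:
    """Hypothetical unified world state (only three of its parts are shown)."""
    environment: dict = field(default_factory=dict)
    objects: dict = field(default_factory=dict)
    task: dict = field(default_factory=dict)


@dataclass
class EventTrigger:
    """An event that may fire while `task` is current and `condition` holds."""
    task: str
    event: str
    condition: Callable[[WorldState, dict], bool]
    next_task: str


def scenario_agent(training_target: str):
    """Stand-in scenario agent: returns a fixed plan regardless of the target.

    A real scenario agent would derive the plan from `training_target`.
    """
    tasks = ["locate_fire", "extinguish_fire", "report"]
    triggers = [
        EventTrigger(
            "locate_fire", "fire_located",
            lambda w, op: op.get("action") == "inspect"
            and op.get("target") == w.environment["fire_location"],
            "extinguish_fire",
        ),
        EventTrigger(
            "extinguish_fire", "fire_extinguished",
            lambda w, op: op.get("action") == "use_extinguisher"
            and op.get("target") == w.environment["fire_location"],
            "report",
        ),
    ]
    return tasks, triggers


def init_world(training_target: str) -> WorldState:
    """Build the world state and write the plan into the task state."""
    tasks, triggers = scenario_agent(training_target)
    return WorldState(
        environment={"fire_location": "kitchen"},
        objects={"extinguisher": "available"},
        task={"sequence": tasks, "triggers": triggers,
              "current": tasks[0], "events": []},
    )


def event_engine(world: WorldState, operation: dict) -> Optional[str]:
    """Fire at most one event for this operation and advance the task state."""
    for trig in world.task["triggers"]:
        if trig.task == world.task["current"] and trig.condition(world, operation):
            world.task["events"].append(trig.event)
            world.task["current"] = trig.next_task
            return trig.event
    return None  # the operation is valid input but triggers nothing
```

Because the triggers are stored as data in the task state rather than hard-coded in the engine, the scenario agent can replace or extend them between cycles, and an operation that matches no trigger simply leaves the task state unchanged instead of breaking the flow.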
Further, each agent generates a decision package based on the unified world state, and a consistent decision result is obtained through the conflict detection and weighted arbitration mechanism, which effectively resolves decision conflicts among multiple agents and improves the stability and coordination of the system. By introducing a transaction consistency management mechanism, lock, commit and rollback operations are executed during state updating, so that the consistency and traceability of the world state are ensured under concurrent multi-agent operation and logic confusion is avoided. Finally, by continuously recording the behaviors of students and the resulting state changes, dynamic evaluation and feedback of the training process are realized, so that evaluation shifts from being purely result-oriented to giving equal weight to process and capability, which improves the comprehensiveness and scientific soundness of the assessment of training effect.

In order to achieve the above purpose, the present invention provides the following technical solutions:

In a first aspect, the present invention provides a full-flow virtual simulation training method based on multi-agent cooperation, the method comprising: acquiring