CN-122019515-A - Large model data management method based on expert agent
Abstract
The application relates to the technical field of data processing and artificial intelligence, and discloses a large model data governance method based on expert agents. The method comprises querying a dynamic expert agent capability map and evaluating the performance, cost, and quality indexes of candidate agents with a multi-objective optimization strategy to determine an execution unit. The execution unit processes the task and generates a structured result that includes a confidence score and an attribution path. The system then performs conflict detection on the multi-source results; for any detected conflict it triggers a hierarchical intelligent arbitration mechanism comprising confidence comparison, heuristic rule reasoning, and secondary inquiry to determine a valid conclusion, and applies a decision fusion algorithm to generate the final result. The application achieves dynamic scheduling of governance resources, ensures decision transparency through the attribution path, enables adaptive evolution of the system via a feedback closed loop, and significantly improves the accuracy and efficiency of data governance.
Inventors
- LIU CHUANG
- WANG JING
- MENG LINGTAO
Assignees
- 北京中微盛鼎科技有限公司
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2025-12-22
Claims (10)
- 1. A large model data governance method based on expert agents, characterized by comprising the following steps: S1, a central coordinator queries a dynamic expert agent capability map according to the characteristics of a received data governance subtask to obtain candidate expert agents and their dynamic attributes; S2, executing an agent selection strategy based on multi-objective optimization, evaluating the dynamic attributes of the candidate expert agents to determine one or more execution units; S3, the execution unit processes the data governance subtask using an internal decision core module and generates a structured result, wherein the structured result comprises a core processing conclusion, a confidence score, and an attribution path; S4, performing conflict detection on the plurality of structured results for the same data governance subtask, triggering a hierarchical intelligent arbitration mechanism to determine a valid conclusion if a conflict is detected, and executing a decision fusion algorithm to generate a final governance result if no conflict is detected or once arbitration is completed; S5, generating a governance process log based on the comparison between the final governance result and the structured result of each execution unit, and updating the dynamic attributes in the dynamic expert agent capability map using the governance process log.
- 2. The method of claim 1, wherein in S1, the dynamic attributes comprise at least a performance index, a cost index, and a quality index, and updating the dynamic attributes in the dynamic expert agent capability map comprises: updating the dynamic attributes based on an exponentially weighted moving average algorithm; the quality index is updated from the quality score of the previous time step, the reward value obtained in the current time step, and a preset attenuation factor; the reward value is determined by the standing of the execution unit's structured result in the hierarchical intelligent arbitration mechanism, or by its consistency with the final governance result.
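The exponentially weighted moving average update recited in claim 2 can be illustrated with a minimal Python sketch; the function name, parameter names, and the default attenuation factor are illustrative assumptions, not part of the claim:

```python
def update_quality_index(prev_quality: float, reward: float, decay: float = 0.9) -> float:
    """EWMA update of an agent's quality index.

    prev_quality: quality score from the previous time step
    reward:       reward value obtained in the current time step
    decay:        preset attenuation factor in (0, 1)
    """
    return decay * prev_quality + (1.0 - decay) * reward
```

With `decay = 0.9`, a single high reward nudges the stored quality index only slightly, so the index reflects long-run behavior rather than one lucky task.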
- 3. The method of claim 1, wherein in S2, the multi-objective optimization-based agent selection strategy specifically comprises: normalizing the dynamic attributes of each candidate expert agent, applying min-max normalization to benefit-type attributes and reverse normalization or reciprocal transformation to cost-type attributes; calculating a composite score for each candidate expert agent according to preset preference weights; and, based on the composite scores, selecting either a single optimal expert agent as the sole execution unit or a plurality of expert agents whose composite scores exceed a preset threshold as joint execution units, forming a redundant execution environment.
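A minimal sketch of the selection strategy in claim 3, assuming three attributes (performance and quality as benefit-type, cost as cost-type) and illustrative dictionary keys; the claim does not fix these names:

```python
def min_max(value: float, lo: float, hi: float) -> float:
    """Min-max normalization to [0, 1]; degenerate ranges map to 0."""
    return (value - lo) / (hi - lo) if hi > lo else 0.0

def composite_scores(agents: dict, weights: tuple) -> dict:
    """agents: {name: {'performance': ..., 'quality': ..., 'cost': ...}}
    weights: preset preference weights (w_perf, w_qual, w_cost)."""
    perf = [a["performance"] for a in agents.values()]
    qual = [a["quality"] for a in agents.values()]
    cost = [a["cost"] for a in agents.values()]
    scores = {}
    for name, a in agents.items():
        p = min_max(a["performance"], min(perf), max(perf))   # benefit-type
        q = min_max(a["quality"], min(qual), max(qual))       # benefit-type
        c = 1.0 - min_max(a["cost"], min(cost), max(cost))    # reverse-normalized cost
        scores[name] = weights[0] * p + weights[1] * q + weights[2] * c
    return scores
```

The coordinator would then either take `max(scores, key=scores.get)` as the sole execution unit, or keep every agent whose score exceeds a preset threshold to form the redundant execution environment.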
- 4. The method of claim 1, wherein in S3, the confidence score calculation logic is determined by the technology type of the decision core module: if the decision core module is a probability-based machine learning model, the confidence score is taken from the class probability of the model output layer or the statistical mean of per-token probabilities in sequence labeling; and if the decision core module is a script program executing specific logic, the confidence score is dynamically calculated from the number of verification checkpoints passed during script execution.
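The two confidence modes of claim 4 can be sketched as follows; the exact aggregation (a plain mean, a checkpoint ratio) is an assumption for illustration, since the claim only fixes the inputs:

```python
def sequence_confidence(token_probs: list) -> float:
    """Confidence for a sequence-labeling conclusion: statistical mean
    of the per-token probabilities emitted by the model output layer."""
    return sum(token_probs) / len(token_probs)

def script_confidence(checkpoints_passed: int, checkpoints_total: int) -> float:
    """Confidence for a script-program core: fraction of verification
    checkpoints passed during script execution."""
    return checkpoints_passed / checkpoints_total
```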
- 5. The method of claim 1, wherein in S3, the attribution path is a structured data object recording the decision basis of the core processing conclusion; the attribution path comprises a unique identifier of the expert agent executing the task, the version number of the decision core module, the original data segment serving as the decision basis, and a decision logic identifier; wherein the decision logic identifier corresponds to the feature dimension contributing most in a machine learning model, the specific rule ID triggered in a rule inference engine, or the logical branch executed in a script program.
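One possible shape for the attribution-path object of claim 5, as a plain dataclass; the field names are hypothetical labels for the four components the claim enumerates:

```python
from dataclasses import dataclass

@dataclass
class AttributionPath:
    agent_id: str         # unique identifier of the executing expert agent
    core_version: str     # version number of the decision core module
    source_segment: str   # original data segment serving as the decision basis
    logic_id: str         # top feature dimension, triggered rule ID, or script branch
```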
- 6. The method of claim 1, wherein in S4, the conflict detection specifically comprises: grouping the structured results by the unique identifier of the processed data item; performing value-consistency comparison within each group, checking whether the values or labels of the core processing conclusions are equal; performing boundary-consistency comparison within each group, checking whether the position ranges of the core processing conclusions in the original data overlap without fully coinciding; and if any such inconsistency is detected, determining that a conflict exists.
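A minimal sketch of the grouping plus value/boundary checks in claim 6, assuming each structured result carries an `item_id`, a `value`, and a half-open `(start, end)` span (these keys are illustrative):

```python
from collections import defaultdict

def spans_partially_overlap(a: tuple, b: tuple) -> bool:
    """True if two (start, end) spans overlap but do not fully coincide."""
    if a == b:
        return False
    return a[0] < b[1] and b[0] < a[1]

def detect_conflicts(results: list) -> set:
    """Return the set of data-item IDs whose grouped results conflict."""
    groups = defaultdict(list)
    for r in results:                      # group by unique data-item identifier
        groups[r["item_id"]].append(r)
    conflicted = set()
    for item_id, group in groups.items():
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                x, y = group[i], group[j]
                if (x["value"] != y["value"]                       # value check
                        or spans_partially_overlap(x["span"], y["span"])):  # boundary check
                    conflicted.add(item_id)
    return conflicted
```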
- 7. The method of claim 6, wherein in S4, the hierarchical intelligent arbitration mechanism comprises the following levels in order: first-level arbitration, calculating the differences between the confidence scores of the conflicting parties; if the difference between the highest and second-highest confidence scores exceeds a preset significance threshold, the conclusion with the highest confidence is deemed valid; second-level arbitration, if the first level is inconclusive, parsing the attribution paths of the conflicting parties and applying predefined heuristic rules, which include preferring conclusions generated by a deterministic rule engine over those generated by a probabilistic model; and third-level arbitration, if the second level is inconclusive, constructing a secondary inquiry task that combines the original data, the conflicting conclusions, and their attribution paths into a prompt, and sending the prompt to a preset advanced arbitration agent to obtain the final arbitration result.
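The three-level fallthrough of claim 7 can be sketched as one function; the significance threshold, the `source_type` tag, and the `escalate` callback (standing in for the secondary inquiry to the advanced arbitration agent) are all illustrative assumptions:

```python
def arbitrate(candidates: list, sig_threshold: float = 0.15, escalate=None):
    """candidates: dicts with 'conclusion', 'confidence', 'source_type'
    ('rule' for a deterministic rule engine, 'model' for a probabilistic model)."""
    ranked = sorted(candidates, key=lambda c: c["confidence"], reverse=True)
    # Level 1: confidence-gap comparison.
    if len(ranked) >= 2 and ranked[0]["confidence"] - ranked[1]["confidence"] > sig_threshold:
        return ranked[0]["conclusion"]
    # Level 2: heuristic rule — prefer the deterministic rule engine's conclusion
    # (only decisive here when exactly one rule-based party is in conflict).
    rule_based = [c for c in candidates if c["source_type"] == "rule"]
    if len(rule_based) == 1:
        return rule_based[0]["conclusion"]
    # Level 3: secondary inquiry to the advanced arbitration agent.
    if escalate is not None:
        return escalate(candidates)
    return None
```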
- 8. The method of claim 1, wherein in S4, the decision fusion algorithm specifically comprises: when the core processing conclusion is a numeric value or a probability vector, computing the final governance result with a weighted model averaging algorithm; the fusion weight of each execution unit is determined by a weighted sum of the unit's historical quality index in the dynamic expert agent capability map and the confidence score output for the current task; and when the final governance result is spliced together from a plurality of sub-parts, computing a global confidence index as the harmonic mean of the confidences of all sub-part conclusions.
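A sketch of the two fusion computations in claim 8; the mixing factor `alpha` between historical quality and current confidence is an assumed parameter, since the claim only states that the weight is a weighted sum of the two:

```python
def weighted_fusion(values: list, quality_indices: list, confidences: list,
                    alpha: float = 0.5) -> float:
    """Weighted model averaging for numeric conclusions; each unit's weight
    mixes its historical quality index with its current confidence score."""
    weights = [alpha * q + (1.0 - alpha) * c
               for q, c in zip(quality_indices, confidences)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def global_confidence(sub_confidences: list) -> float:
    """Harmonic mean of sub-part confidences; dominated by the weakest part."""
    n = len(sub_confidences)
    return n / sum(1.0 / c for c in sub_confidences)
```

The harmonic mean is a natural choice here because one low-confidence sub-part drags the global index down far more than an arithmetic mean would.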
- 9. The method of claim 1, wherein in S5, updating the dynamic expert agent capability map with the governance process log further comprises performing context-based meta-learning: establishing a mapping between data characteristics and expert agent performance using a contextual multi-armed bandit algorithm; the input data feature fingerprint of the data governance subtask serves as the context vector, the candidate expert agents serve as the selectable arms, and the reward value earned by a structured result serves as the reward; and updating the context-specific preference weights in the dynamic expert agent capability map according to this mapping.
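A minimal contextual bandit consistent with claim 9, using epsilon-greedy selection over per-(context, arm) mean rewards; the claim does not fix a specific bandit algorithm, so this exploration policy and the hashable-context simplification are assumptions:

```python
import random
from collections import defaultdict

class ContextualBandit:
    """Context fingerprint -> per-agent preference weights, updated from rewards."""

    def __init__(self, arms: list, epsilon: float = 0.1):
        self.arms = list(arms)            # candidate expert agents
        self.epsilon = epsilon            # exploration probability
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.values = defaultdict(float)  # (context, arm) -> mean reward

    def select(self, context) -> str:
        """Pick an agent for this data-feature fingerprint."""
        if random.random() < self.epsilon:
            return random.choice(self.arms)
        return max(self.arms, key=lambda a: self.values[(context, a)])

    def update(self, context, arm: str, reward: float) -> None:
        """Incremental-mean update of the context-specific preference weight."""
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]
```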
- 10. The method of claim 1, further comprising constructing a data lineage topology network based on the governance process log: reconstructing a directed acyclic graph of task execution from the task decomposition structure and tracking identifiers in the governance process log; attaching the attribution paths and generated arbitration metadata to the corresponding nodes of the directed acyclic graph; in response to a query request, traversing the directed acyclic graph backward from the node corresponding to the final governance result to generate an evidence object containing the complete governance logic chain; and traversing the directed acyclic graph forward from a particular expert agent node to identify the downstream results affected by that expert agent.
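The two lineage traversals of claim 10 can be sketched over a small adjacency-list DAG; the class and method names are illustrative, and metadata attachment is reduced to a plain dictionary:

```python
from collections import defaultdict

class LineageGraph:
    """DAG of task execution; nodes can carry attribution/arbitration metadata."""

    def __init__(self):
        self.parents = defaultdict(list)   # node -> upstream nodes
        self.children = defaultdict(list)  # node -> downstream nodes
        self.metadata = {}                 # node -> attached attribution path etc.

    def add_edge(self, upstream: str, downstream: str) -> None:
        self.parents[downstream].append(upstream)
        self.children[upstream].append(downstream)

    def evidence_chain(self, result_node: str) -> list:
        """Backward traversal from a final result: the full governance logic chain."""
        chain, stack, seen = [], [result_node], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            chain.append(node)
            stack.extend(self.parents[node])
        return chain

    def downstream_of(self, agent_node: str) -> set:
        """Forward traversal: downstream results affected by a given agent node."""
        affected, stack = set(), list(self.children[agent_node])
        while stack:
            node = stack.pop()
            if node in affected:
                continue
            affected.add(node)
            stack.extend(self.children[node])
        return affected
```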
Description
Large model data management method based on expert agent
Technical Field
The invention relates to the technical field of data processing and artificial intelligence, and in particular to a large model data governance method based on expert agents.
Background
With the spread of digital transformation, data has become a core asset for enterprises, and data quality directly determines the effectiveness of downstream business analysis and decision making. Traditional data governance approaches have mostly relied on predefined rule bases, regular expressions, or fixed ETL scripts. While deterministic when processing structured data, such methods tend to exhibit inadequate flexibility, high rule-maintenance costs, and poor generalization when faced with unstructured text, multimodal data, or semantically complex dirty data. In recent years, the advent of large language models has opened new technical paths for data governance. Large models have strong semantic understanding and generation capabilities and can handle the fuzzy matching and logical inference tasks that traditional rules struggle to cover. However, in practical engineering applications, directly relying on a single general-purpose language model to execute the full data governance process still faces many challenges. A general large model has broad knowledge coverage, but in a specific vertical domain it is often less specialized than a dedicated model or a targeted, optimized rule engine, making it prone to hallucinated content or logical inconsistency and hard-pressed to meet the strict accuracy requirements of industrial data governance. In addition, existing large model application architectures usually adopt static task orchestration and lack a dynamic evaluation mechanism for computing resources and governance cost.
Regardless of task difficulty, a model of the same scale is invoked for processing, causing wasted computing resources or response delays. Meanwhile, when an existing governance system faces uncertainty in model output, it lacks an effective multi-source verification and conflict arbitration mechanism, making the reliability of that output hard to judge. More importantly, most model-based governance systems lack interpretable attribution capabilities and adaptive feedback evolution mechanisms: users can hardly trace the decision basis behind erroneous data, and the system cannot automatically optimize subsequent task-allocation strategies using experience accumulated over historical governance, so that the long-term performance of the system hits a bottleneck.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a large model data governance method based on expert agents, which addresses the insufficient accuracy of a single model on complex heterogeneous data, the lack of dynamically optimized resource scheduling, and the lack of interpretability and adaptive evolution in the governance process of existing data governance technology.
To achieve this purpose, the invention is realized by the following technical scheme. The large model data governance method based on expert agents comprises the following steps: S1, a central coordinator queries a dynamic expert agent capability map according to the characteristics of a received data governance subtask to obtain candidate expert agents and their dynamic attributes; S2, executing an agent selection strategy based on multi-objective optimization, evaluating the dynamic attributes of the candidate expert agents to determine one or more execution units; S3, the execution unit processes the data governance subtask using an internal decision core module and generates a structured result, wherein the structured result comprises a core processing conclusion, a confidence score, and an attribution path; S4, performing conflict detection on the plurality of structured results for the same data governance subtask, triggering a hierarchical intelligent arbitration mechanism to determine a valid conclusion if a conflict is detected, and executing a decision fusion algorithm to generate a final governance result if no conflict is detected or once arbitration is completed; S5, generating a governance process log based on the comparison between the final governance result and the structured result of each execution unit, and updating the dynamic attributes in the dynamic expert agent capability map using the governance process log. Preferably, in S1, the dynamic attributes include at least a performance index, a cost index, and a quality index, and updating the dynamic attributes in the dynamic expert agent capability map s