CN-121980293-A - Strategy generation method and device based on multi-agent cooperation


Abstract

The application provides a strategy generation method and device based on multi-agent cooperation. It can be applied in the technical fields of artificial intelligence and big data and relates to the application of large models in financial technology scenarios. The method comprises: collecting user behavior data with a first agent, subject to user authorization or consent; processing the user behavior data with the first agent and grouping the users to obtain user grouping features; acquiring market dynamics data with a second agent; processing the market dynamics data with the second agent to obtain market dynamics features; performing conflict detection to determine whether the user grouping features conflict with the market dynamics features; and, in response to the user grouping features conflicting with the market dynamics features, generating a compensation strategy that is used to reduce the conflict between the user grouping features and the market dynamics features.

Inventors

WANG JISHENG
MAO YACHEN
SONG RUI

Assignees

Industrial and Commercial Bank of China Limited (中国工商银行股份有限公司)

Dates

Publication Date: 2026-05-05
Application Date: 2025-07-24

Claims (15)

1. A strategy generation method based on multi-agent collaboration, the method comprising: collecting user behavior data with a first agent, subject to user authorization or consent, wherein the user behavior data comprises behavior data of the user on an e-commerce platform; processing the user behavior data with the first agent and grouping the users to obtain user grouping features; acquiring market dynamics data with a second agent, wherein the market dynamics data includes data representing at least one of a commodity price trend, a consumption trend, and a market competition trend; processing the market dynamics data with the second agent to obtain market dynamics features; performing conflict detection on the user grouping features and the market dynamics features, wherein the conflict detection determines whether the user grouping features conflict with the market dynamics features; and generating a compensation strategy in response to the user grouping features conflicting with the market dynamics features, wherein the compensation strategy is used to reduce the conflict between the user grouping features and the market dynamics features.
2. The method of claim 1, wherein processing the user behavior data with the first agent and grouping the users to obtain user grouping features comprises: processing the user behavior data with a clustering algorithm to group the users, and adding a user grouping label and a weight to each user group, wherein the user grouping label represents the behavior characteristics of the user group, and the weight represents the proportion of the number of users represented by the user grouping label to the total number of users.
3. The method of claim 2, wherein the clustering algorithm comprises a streaming K-means algorithm.
4. The method of claim 3, wherein processing the user behavior data with a clustering algorithm to group the users comprises: processing periodic user behavior data with a central cluster to determine a plurality of clusters and an initial cluster center of each cluster, wherein the periodic user behavior data comprises historical user behavior data and/or initial batch user behavior data; after acquiring real-time user behavior data, processing the real-time user behavior data with an edge computing node to calculate the distance between the real-time user behavior data and the initial cluster center of each cluster; and assigning the real-time user behavior data to the cluster whose initial cluster center is closest, according to the calculated distances.
5. The method of any one of claims 1-4, wherein processing the market dynamics data with the second agent to obtain market dynamics features comprises: constructing a knowledge graph from the market dynamics data, wherein the knowledge graph comprises a plurality of nodes and a plurality of edges, the nodes each represent at least one of a market entity, a market strategy, and a market event, and the edges represent the relationships among the market entities, market strategies, and market events; and obtaining the association relationships among the market entities, market strategies, and market events from the knowledge graph.
6. The method of claim 5, wherein processing the market dynamics data with the second agent to obtain market dynamics features comprises: processing the market dynamics data with a time series model to obtain a short-term market trend; processing the market dynamics data with a large language model to obtain a long-term market trend; and obtaining the market dynamics features based on the association relationships, the short-term market trend, and the long-term market trend.
7. The method of any one of claims 1-4 and 6, wherein performing conflict detection on the user grouping features and the market dynamics features comprises: obtaining quantized values of the user grouping features and the market dynamics features; normalizing the quantized values of the user grouping features and the market dynamics features; and processing the normalized quantized values of the user grouping features and the market dynamics features with a weighted dynamic loss function to obtain a deviation degree.
8. The method of claim 7, wherein the weighted dynamic loss function includes dynamic weights, the dynamic weights being determined by a time decay factor and traffic priority.
9. The method of claim 7, wherein generating a compensation strategy in response to the user grouping features conflicting with the market dynamics features comprises: generating a first compensation strategy in response to the deviation degree being less than a first deviation threshold, wherein in the first compensation strategy the user grouping labels and the weights remain unchanged and only a first compensation scheme is generated; generating a second compensation strategy in response to the deviation degree being between the first deviation threshold and a second deviation threshold, wherein in the second compensation strategy the weights are adjusted; and generating a third compensation strategy in response to the deviation degree being greater than the second deviation threshold, wherein in the third compensation strategy the user grouping labels and the weights are adjusted and a second compensation scheme is generated; wherein the first deviation threshold is less than the second deviation threshold.
10. The method of claim 9, wherein adjusting the weights comprises: extracting the user grouping label with the highest degree of association with the market dynamics features as a first target adjustment label; extracting the user grouping labels that conflict with the market dynamics features as second target adjustment labels; and increasing the weight of the first target adjustment label and reducing the weights of the second target adjustment labels.
11. The method of claim 9 or 10, wherein adjusting the user grouping labels comprises: extracting a target market entity and a target association relationship from the knowledge graph based on the conflicting market dynamics features, wherein the target market entity and the target association relationship are associated with the conflicting market dynamics features; converting the target market entity and the target association relationship into structural features; and injecting the structural features, after processing by a privacy computing technology, into a feature library of the first agent.
12. A strategy generation device based on multi-agent collaboration, the device comprising: a behavior data acquisition module for collecting user behavior data with a first agent, subject to user authorization or consent, wherein the user behavior data comprises behavior data of the user on an e-commerce platform; a first agent module for processing the user behavior data with the first agent and grouping the users to obtain user grouping features; a dynamic data acquisition module for acquiring market dynamics data with a second agent, wherein the market dynamics data includes data representing at least one of a commodity price trend, a consumption trend, and a market competition trend; a second agent module for processing the market dynamics data with the second agent to obtain market dynamics features; a conflict detection module for performing conflict detection on the user grouping features and the market dynamics features, wherein the conflict detection determines whether the user grouping features conflict with the market dynamics features; and a strategy generation module for generating a compensation strategy in response to the user grouping features conflicting with the market dynamics features, wherein the compensation strategy is used to reduce the conflict between the user grouping features and the market dynamics features.
13. An electronic device, comprising: one or more processors; and a memory for storing one or more computer programs, characterized in that the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1-11.
14. A computer-readable storage medium storing a computer program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 11.
15. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 11.
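
By way of non-limiting illustration, the deviation computation of claims 7 and 8 and the threshold tiers of claim 9 might be sketched as follows. The claims do not fix a formula, so everything concrete here is an assumption made for the sketch: min-max normalization, a weight of the form priority times an exponential time decay, a squared-difference loss, the handling of values exactly equal to a threshold, and all names (`FeaturePair`, `deviation_degree`, `select_compensation`, `decay_rate`).

```python
import math
from dataclasses import dataclass


@dataclass
class FeaturePair:
    """One quantized user grouping value paired with a market dynamics value."""
    user_value: float
    market_value: float
    age_hours: float   # time since observation (hypothetical unit)
    priority: float    # priority term of the dynamic weight (hypothetical scale)


def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def deviation_degree(pairs, decay_rate=0.1):
    """Weighted squared difference of normalized values (assumed loss form)."""
    users = min_max([p.user_value for p in pairs])
    markets = min_max([p.market_value for p in pairs])
    # Assumed dynamic weight: priority scaled by an exponential time decay.
    raw = [p.priority * math.exp(-decay_rate * p.age_hours) for p in pairs]
    total = sum(raw)
    if total == 0:
        raise ValueError("at least one pair needs a non-zero weight")
    weights = [w / total for w in raw]
    return sum(w * (u - m) ** 2 for w, u, m in zip(weights, users, markets))


def select_compensation(deviation, first_threshold, second_threshold):
    """Map a deviation degree to one of the three tiers of claim 9."""
    if not first_threshold < second_threshold:
        raise ValueError("first threshold must be less than second threshold")
    if deviation < first_threshold:
        return "first"    # labels and weights unchanged
    if deviation <= second_threshold:
        return "second"   # weights adjusted (boundary handling is assumed)
    return "third"        # labels and weights adjusted


if __name__ == "__main__":
    pairs = [FeaturePair(0.0, 10.0, 0.0, 1.0), FeaturePair(10.0, 0.0, 0.0, 1.0)]
    print(select_compensation(deviation_degree(pairs), 0.2, 0.6))
```

Because the weights are normalized to sum to one and both inputs lie in [0, 1], the deviation degree in this sketch also lies in [0, 1], which keeps the two thresholds comparable across batches.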

Description

Strategy generation method and device based on multi-agent cooperation

Technical Field

The application relates to the technical fields of artificial intelligence and big data and to the application of large models in financial technology scenarios, and in particular to a strategy generation method, device, electronic equipment, storage medium, and program product based on multi-agent cooperation.

Background

In fields such as e-commerce and personalized recommendation, with the deep integration of big data and Internet of Things technologies, user behavior data and market dynamics data are growing exponentially. In the prior art, user behavior data is generally collected in real time by a log collection system deployed at the front end of an e-commerce platform using message queue technology, covering full-link behavior information such as clickstream data, search keyword sequences, and purchases and payments made by users on web or mobile clients. Market dynamics data is obtained from public e-commerce platforms and industry information websites using web crawler technology, and data such as commodity price fluctuations and commodity sales trends is stored in a structured database. At present, data processing systems mostly build their data storage and analysis platforms on a distributed file system and a computing framework. In practice, however, user behavior data and market dynamics data belong to different business systems that use different interface protocols and data formats and lack a unified data management standard, so problems such as data loss and format incompatibility arise during data extraction, transformation, and loading, and data silos form.

Existing processing approaches rely mainly on a single data processing module or a serialized processing flow. They do not use a multi-agent cooperation mechanism and cannot fully exploit the correlations among data for parallel, in-depth analysis, which makes real-time fusion analysis based on a stream processing engine difficult to achieve. Meanwhile, conventional user clustering mostly uses traditional clustering algorithms that build a static clustering model from a preset distance metric and preset clustering parameters. When sudden changes in market demand or promotional activities cause abnormal fluctuations in user behavior, the clustering strategy cannot be adjusted in real time according to the dynamic data, because the adaptive decision capability of multi-agent cooperation is lacking. Specifically, a single model can hardly handle data feature extraction, model parameter optimization, and strategy generation at the same time, so user portraits lag behind and enterprises struggle to respond quickly to market changes with accurate marketing strategies. In addition, in the traditional strategy generation process the business stages (such as data cleaning, feature engineering, and strategy formulation) are independent of one another, no cooperative interaction among agents is formed, and dynamic gaming and strategy optimization cannot be carried out for a complex and changeable market environment, so enterprises find it difficult to adjust their operating strategies flexibly in competition.

Disclosure of Invention

In view of at least one aspect of the above problems, embodiments of the present application provide a strategy generation method, apparatus, electronic device, storage medium, and program product based on multi-agent collaboration.
According to a first aspect of the application, a strategy generation method based on multi-agent collaboration is provided, the method comprising: collecting user behavior data with a first agent, subject to user authorization or consent, wherein the user behavior data comprises behavior data of the user on an e-commerce platform; processing the user behavior data with the first agent and grouping the users to obtain user grouping features; acquiring market dynamics data with a second agent, wherein the market dynamics data comprises data representing at least one of a commodity price trend, a consumption trend, and a market competition trend; processing the market dynamics data with the second agent to obtain market dynamics features; performing conflict detection on the user grouping features and the market dynamics features, wherein the conflict detection determines whether the user grouping features conflict with the market dynamics features; and generating a compensation strategy in response to the user grouping features conflicting with the market dynamics features, wherein the compensation strategy is used for reducing the conflict between