Search

CN-121996763-A - Index-based content perception influence maximization query response dynamic method and device

CN121996763ACN 121996763 ACN121996763 ACN 121996763ACN-121996763-A

Abstract

The invention discloses a content perception influence maximization query response method and device based on indexes, wherein the method comprises the steps of 1) establishing an index node selection mechanism based on minimum vertex coverage approximation, 2) constructing a space efficient index table to store content perception reachability information, 3) calculating a node pair reachability probability lower bound by a reachability table estimation method, and 4) providing pruning reachability table estimation and an inertia forward greedy algorithm to optimize query efficiency. Verification on real network data sets such as Wiki, twitter and the like shows that the method has a similar propagation effect as the existing optimal influence maximization method, and the query response speed is improved by 1-2 orders of magnitude.

Inventors

  • SHI QIHAO
  • WANG CAN
  • SONG MINGLI

Assignees

  • 浙江大学

Dates

Publication Date
20260508
Application Date
20260116

Claims (7)

  1. 1. The content perception influence maximization query response method based on the index is characterized by comprising the following steps of: S1, constructing a propagation model, namely constructing a directed probability graph model representing a social network, wherein nodes represent users, edges represent the association relation among the users, and setting the dynamic activation probability of each edge based on content topics; S2, constructing an index, namely selecting an index node set through a vertex coverage approximation algorithm to ensure that all edges in a coverage graph are controllable in scale, constructing an index table, and storing a reverse set of 1-jump-in edge neighbors and the reachable probability under each theme for each index node, and For non-index nodes, recording the forward set and the reverse set of the index node to which the non-index node belongs and the corresponding reachable probability; S3, inquiring response, namely receiving CAIM inquiry comprising focal content (a topic set F) and seed node budget k, adopting a reachability table estimation RTE method, calculating the lower limit of the reachability probability of any node under the topic content F based on an index table, and further estimating the influence propagation range of the node; S4, adopting a pruning reachability table to estimate PRTE (partial order of magnitude) method, optimizing index storage and calculation efficiency through a non-overlapping partition strategy, only keeping the optimal index node association information corresponding to each node, iteratively selecting the node with the largest marginal influence gain based on an inertia forward greedy algorithm to add a seed subset until the seed subset reaches a budget k or meets a transmission target, and outputting a final seed node set to finish query response.
  2. 2. The method of claim 1, wherein the DE vertex coverage approximation algorithm of step S2 comprises the steps of: S21, randomly selecting an edge (u, v) from the graph; S22, randomly selecting one endpoint (u or v) of the edge to add into the index node set; S23, removing the endpoint and all the related edges, and repeating the steps until all the edges in the graph are covered.
  3. 3. The method according to claim 1, wherein the construction of the index table in step S2 satisfies the following condition: reverse set storage 1 hops in edge neighbors and edge activation probability under each theme, and the basic probability is that (When no topic is present); Forward set storage Jumping to the best The method comprises the steps that 1) the edge-out neighbors are not less than 2), the reachable probability of the 2-hop neighbors is directly calculated through the edge activation probability, and the reachable probability exceeds 2 hops through Monte Carlo simulation calculation; The index table has a spatial complexity of Wherein For the number of inodes to be indexed, For the number of topics, d is the network average, The average number of nodes stored for each inode.
  4. 4. The method of claim 1, wherein the RTE method of step S3 calculates the lower bound of probability of reachability comprises: for node pairs connected by a single path, the lower limit of the reachable probability is the basic reachable probability which is obtained by subtracting (the number of the topics is-1) times from the sum of the reachable probabilities of the paths under each topic; for node pairs connected by double independent paths, the lower limit of the reachable probability is deduced through quadratic programming, and the cooperative effect among paths is considered; For non-index nodes, the maximum value of the reachable probability products of the associated index nodes is used as the lower bound of the reachable probability of the non-index nodes; for seed sets, all seed nodes are associated through the virtual nodes, and the reachable probability from the virtual nodes to each node is calculated to estimate the set influence.
  5. 5. The method of claim 1, wherein the pruning strategy of the PRTE method of step S3 is: For the node which can be covered by a plurality of index nodes, only the index node associated information with the maximum reachable probability is reserved; when the reachable probabilities corresponding to the index nodes are equal, randomly selecting one for association; The space complexity of the index table after pruning is optimized as follows Wherein Is the total number of nodes in the network.
  6. 6. The method of claim 1, wherein the lazy forward greedy algorithm of step S3 comprises: generating a node influence ordering list OISR based on basic edge probability in advance; Selecting the front in the ordered list The individual nodes are used as candidate sets, and marginal influence gain of each candidate node is calculated; When the candidate set is ordered continuously When the current ranking is unchanged, selecting the node with the highest ranking to add into the seed subset; The algorithm time complexity is Wherein The topic set is scaled for the query.
  7. 7. An apparatus for implementing the method of any one of claims 1 to 6, comprising: The network modeling module is used for constructing a directed probability map of the social network, distributing node theme preference and setting edge activation probability and a propagation model; The index construction module is used for executing a minimum vertex coverage approximation algorithm to select index nodes, generating an index table and storing reachability information; The reachability estimation module is used for realizing RTE and PRTE methods and calculating the lower bound of the node reachability probability and the influence estimation; The seed selection module is used for iteratively selecting optimal seed nodes based on an inert forward greedy algorithm; and the query interface module is used for receiving the CAIM query request and outputting a seed node set.

Description

Index-based content perception influence maximization query response dynamic method and device Technical Field The invention belongs to the technical field of social network information propagation optimization, and particularly relates to a high-efficiency implementation method and device for content perception influence maximization (CAIM) query response, which are suitable for scenes such as virus marketing, information diffusion, public opinion guiding and the like in a multi-topic social network. Background Social networks have become the core carrier for information dissemination, with the Impact Maximization (IM) problem as its key application, aimed at selecting a small number of seed nodes to achieve the maximum range of dissemination of information. The traditional IM method is mainly based on a network topology structure, and influences on the transmission effect by matching degree of the content theme and the user interest are ignored. With the deep research, the problem of content perception influence maximization (CAIM) is proposed, and the key feature is that the edge activation probability dynamically changes along with the transmission content theme, so that the method is more suitable for the transmission rule of a real social network. The existing CAIM method is mainly designed aiming at single query scenes, such as a theme perception model based on deep reinforcement learning, an online learning framework and the like. However, in practical application, there are often a large number of concurrent CAIM queries (different topics and different seed budgets), and directly repeatedly applying the existing method to each query results in extremely high computational overhead, and it is difficult to meet the real-time response requirement. In addition, the existing indexing technology is mainly used for map query such as shortest path and reachability, is not adapted to probability propagation characteristics of the CAIM problem, and cannot be directly used for solving the problem of efficient response in a multi-query scene. Therefore, how to construct an index framework suitable for the CAIM problem, and to realize quick response of multiple queries through one-time pre-calculation, so as to reduce the calculation cost while guaranteeing the propagation effect becomes a technical problem to be solved currently. Disclosure of Invention Aiming at the problems of high calculation cost and low response speed of the existing CAIM method in a multi-query scene, the invention provides a content perception influence maximization query response method and device based on indexes, and the quick response in the multi-query scene is realized by pre-constructing a content perception index structure and combining an efficient reachability estimation and seed selection algorithm, and meanwhile, the influence propagation effect is guaranteed to be equivalent to that of the existing optimal method. The first aspect of the invention relates to a content perception influence maximization query response method based on index, which comprises the following steps: S1, constructing a propagation model, namely constructing a directed probability graph model representing a social network, wherein nodes represent users, edges represent the association relation among the users, and setting the dynamic activation probability of each edge based on content topics; S2, constructing an index, namely selecting an index node set through a vertex coverage approximation algorithm to ensure that all edges in a coverage graph are controllable in scale, constructing an index table, and storing a reverse set of 1-jump-in edge neighbors and the reachable probability under each theme for each index node, and For non-index nodes, recording the forward set and the reverse set of the index node to which the non-index node belongs and the corresponding reachable probability; S3, inquiring response, namely receiving CAIM inquiry comprising focal content (a theme set F) and seed node budget k, adopting an reachability table estimation RTE method, calculating the lower limit of the reachability probability of any node under the theme content F based on an index table, further estimating the influence propagation range of the node, optionally adopting a pruning reachability table estimation PRTE method, optimizing index storage and calculation efficiency through a non-overlapping partition strategy, only preserving the optimal index node association information corresponding to each node, iteratively selecting the node with the largest marginal influence gain based on an inertia forward greedy algorithm, adding the node into a seed subset until the seed subset scale reaches the budget k or meets the propagation target, and outputting the final seed node set to finish inquiry response. Further, the DE vertex coverage approximation algorithm in step S2 includes the following steps: S21, randomly selecting an edge (u, v) from the graph; S22, r