CN-121996818-A - Flow graph processing method and processing system based on sub-graph mixed calculation
Abstract
The invention belongs to the technical field of information analysis and discloses a flow graph processing method and a processing system based on sub graph mixed calculation, wherein the processing method comprises the steps of dividing an original flow graph in a graph database into a plurality of sub graphs with balanced distribution of total weight of vertexes; the method comprises the steps of executing graph updating operation on sub-graphs, matching a corresponding graph computing mode for each updated sub-graph, matching the corresponding updated sub-graph with a full computing mode when the vertex proportion influenced by the updating operation in the updated sub-graph exceeds a preset proportion, otherwise, matching an increment computing mode, extracting dependency relations among the sub-graphs, executing graph computing on each updated sub-graph in parallel by adopting the matched graph computing mode to respectively obtain intermediate results of each sub-graph, transmitting and updating the intermediate results of each sub-graph based on the dependency relations, and outputting global processing results of a flow graph. The processing mode can fully utilize the locality characteristic of the graph, reduce the calculation cost and ensure the accuracy of the calculation result.
Inventors
- WANG QINGGANG
- SI LIWEI
- ZHAO XU
- LIAO XIAOFEI
Assignees
- 华中科技大学
Dates
- Publication Date
- 20260508
- Application Date
- 20260114
Claims (10)
- 1. A flowsheet processing method based on subgraph hybrid computation, comprising: The method comprises the steps of responding to a graph processing request, dividing an original flow graph in a graph database into a plurality of subgraphs with balanced total weight distribution of vertexes, wherein the weight of any vertex in the subgraph is proportional to the sum of the weights of edges of the vertex connected with other vertexes in the same subgraph, the original flow graph is an information relation graph and is provided with vertexes and edges connected between the vertexes, the vertexes represent entities participating in analysis, the edges between the two vertexes represent the existence of a relation or a traffic path between the two entities, and the tighter the relation or the longer the traffic path is, the greater the weight of the edges is; executing a graph update operation on the sub-graph, wherein the graph update operation comprises deleting and adding edges; Matching the corresponding graph calculation modes for each updated sub-graph, wherein the graph calculation modes comprise a full-scale calculation mode and an increment calculation mode, when the vertex proportion influenced by the updating operation in the updated sub-graph exceeds a preset proportion, the corresponding updated sub-graph is matched with the full-scale calculation mode, and when the vertex proportion influenced by the updating operation in the updated sub-graph does not exceed the preset proportion, the corresponding updated sub-graph is matched with the increment calculation mode; Extracting the dependency relationship between the sub-images after the image update, if the current sub-image has a vertex influenced by another sub-image, indicating that the current sub-image depends on the other sub-image, and taking the other sub-image as a front sub-image of the current sub-image; respectively executing the internal graph calculation of each updated sub-graph in parallel by adopting the matched graph calculation mode to respectively obtain the internal intermediate result of each sub-graph; Transmitting and updating the intermediate result of each sub-graph based on the dependency relationship between sub-graphs, and transmitting the updated result of the pre-graph to the current sub-graph and updating the intermediate result of the current sub-graph after the pre-graph of the current sub-graph is transmitted and updated; and sequentially executing the transmission and updating of the intermediate results of the subgraph, and outputting the global processing result of the flow graph.
- 2. The flow graph processing method based on sub-graph hybrid computation of claim 1, wherein dividing the original flow graph in the graph database into a plurality of sub-graphs of the vertex total weight balance distribution comprises: firstly, primarily dividing an original flow diagram into a plurality of subgraphs; And fine-tuning sub-graph division according to a weight balancing mechanism, wherein the weight balancing mechanism comprises the steps of calculating the weight of each vertex in the sub-graph, identifying a high-weight sub-graph and a low-weight sub-graph according to the total weight of the vertices of the sub-graph, dividing part of the vertices in the high-weight sub-graph into other sub-graphs with connection relations, and merging the low-weight sub-graphs.
- 3. The flow graph processing method based on sub-graph hybrid computation of claim 1, wherein weights of arbitrary vertices v in the sub-graph The calculation formula of (2) is as follows: ; in the formula, Representing the edge of vertex u connected to vertex v, Representing edges The vertex u and the vertex v both belong to the same subgraph.
- 4. The flow graph processing method based on sub-graph mixed calculation according to claim 2 or 3, wherein dividing part of the vertices in the high-weight sub-graph into other sub-graphs having connection relations comprises calculating the degree ratio of each vertex in the high-weight sub-graph Dividing partial vertexes v with the minimum degree ratio into other subgraphs with the minimum total weight of vertexes with a connection relation with the partial vertexes v; Wherein the ratio of degrees of any vertex v The calculation formula of (2) is as follows: ; in the formula, For the degree to which vertex v is connected to other vertices within the sub-graph to which it belongs, The degree to which vertex v is connected to other vertices outside the sub-graph to which it belongs.
- 5. The flow graph processing method based on sub-graph blending computation of claim 1, wherein vertices having interdependencies are partitioned into the same sub-graph when the sub-graph is partitioned.
- 6. The flow graph processing method based on sub-graph mixed computation of claim 1, wherein the preset proportion is 50%, when the vertex proportion affected by the update operation in the updated sub-graph exceeds 50%, the corresponding updated sub-graph matches the full computation mode, otherwise, the corresponding updated sub-graph matches the increment computation mode.
- 7. The flow graph processing method based on sub-graph mixed computation according to claim 1, wherein when executing graph update operation on the sub-graph, each edge to be updated is recorded as a triplet [ operator, source, destination ], the operator represents an update operation type, and is divided into two types of adding and deleting, source represents a source vertex of an update edge, destination represents a target vertex of an update edge, the source vertex points to the target vertex, and the corresponding update operation is attributed to updating the sub-graph to which the source belongs according to the source.
- 8. The method for processing a flow graph based on mixed computation of sub-graphs according to claim 1, wherein the intermediate result inside each sub-graph is a best path of other vertices inside the sub-graph relative to reference vertices inside the sub-graph, and the reference vertices inside the sub-graph are vertices that affect and are not affected by the other vertices of the sub-graph.
- 9. The flow graph processing method based on sub-graph mixed computation of claim 1, wherein the original flow graph is a social network flow graph or a traffic network flow graph; in a traffic network flow graph, vertexes represent stations, edges between the vertexes represent traffic paths between two stations, the weight of the edges represents the length of the traffic paths, and the shorter the traffic paths are, the smaller the weight of the edges is; In a social network flow graph, vertexes represent users, edges between the vertexes represent social relations among the users, the weights of the edges are used for describing the tightness degree of the social relations among the users, and the closer the social relations are, the larger the weights of the corresponding edges are.
- 10. A flowsheet processing system based on subgraph hybrid computing, comprising: The image partitioning module is used for responding to an image processing request, dividing an original flow image in an image database into a plurality of sub-images with balanced total weight distribution of vertexes, wherein the weight of any vertex in the sub-image is proportional to the sum of the weights of edges of the vertex connected with other vertexes in the same sub-image; The diagram updating module is used for executing diagram updating operation on the subgraph, wherein the diagram updating operation comprises deleting and adding edges; The computing mode selection module is used for matching the corresponding graph computing mode for each updated sub-graph, wherein the graph computing mode comprises a full-scale computing mode and an increment computing mode, when the vertex proportion influenced by the updating operation in the updated sub-graph exceeds a preset proportion, the corresponding updated sub-graph is matched with the full-scale computing mode, and when the vertex proportion influenced by the updating operation in the updated sub-graph does not exceed the preset proportion, the corresponding updated sub-graph is matched with the increment computing mode; the dependency relation construction module is used for extracting the dependency relation between the sub-images after the image update, if the current sub-image has a vertex influenced by another sub-image, the current sub-image is dependent on the other sub-image, and the other sub-image is used as a front sub-image of the current sub-image; the in-graph calculation module is used for respectively executing the graph calculation in each updated sub-graph in parallel by adopting the matched graph calculation mode to respectively obtain the intermediate result in each sub-graph; The inter-graph scheduling module is used for transmitting and updating the intermediate results of all the subgraphs based on the dependency relationship between the subgraphs, transmitting the updated results of the pre-arranged subgraphs to the current subgraphs after the pre-arranged subgraphs of the current subgraphs are transmitted and updated, sequentially executing the transmission and updating of the intermediate results of the current subgraphs, and outputting the global processing results of the flow graph.
Description
Flow graph processing method and processing system based on sub-graph mixed calculation Technical Field The invention belongs to the technical field of information analysis, and particularly relates to a flow graph processing method and a flow graph processing system based on sub-graph mixed calculation. Background The information relationship graph can be used for representing the relationship between the entities, and has vertexes and edges connected between the vertexes, wherein the vertexes represent the entities participating in analysis, the edges between the two vertexes represent relationship paths between the two entities, such as common social network information graphs, traffic network information graphs and the like, and the traditional information processing mode is usually based on static graphs, namely the structure of the graph is kept unchanged in the calculation process, so that the information relationship graph is suitable for one-time analysis or offline batch processing tasks. However, in real world applications, the relationships in the graph tend to be dynamically changing. For example, users in social networks continually add friends or focus account numbers, road conditions in traffic networks change in real time, and such a graph that changes over time is referred to as a flowsheet (STREAMING GRAPH). In a flowsheet scenario, the graph structure may change frequently, including edge insertions, deletions, and the like. For such dynamics, flowsheet computation typically takes two forms, incremental computation or full computation. Full computation refers to re-performing complete computation on all vertices and edges of the entire graph after each graph update. Full-scale computation is generally simple to implement and accurate in computation results, but the computation overhead is high when the graph size is large or the updating is frequent. The incremental computation performs local computation only on vertices or edges affected by the update and multiplexes the historical computation results, thereby reducing redundant computation overhead. Incremental computation is more efficient in scenes with smaller update scales or stronger locality, but incremental computation is complex to implement and requires accurate identification of affected areas, which may otherwise lead to result bias. That is, the full-scale calculation is simple to realize, but the calculation cost is large due to the existence of a large number of redundant calculations, and the incremental calculation can effectively reduce the redundant calculations, but the realization is complex and calculation deviation can exist. Therefore, it is needed to propose a flow graph processing scheme that can achieve both computational overhead and computational accuracy. Disclosure of Invention Aiming at the defects or improvement demands of the prior art, the invention provides a flow graph processing method and a processing system based on sub-graph mixed computation, and aims to give consideration to the computing expense and the computing accuracy of flow graph processing. In order to achieve the above purpose, the following technical solutions are proposed. According to a first aspect of the present invention, there is provided a flowsheet processing method based on subgraph hybrid computation, including: The method comprises the steps of responding to a graph processing request, dividing an original flow graph in a graph database into a plurality of subgraphs with balanced total weight distribution of vertexes, wherein the weight of any vertex in the subgraph is proportional to the sum of the weights of edges of the vertex connected with other vertexes in the same subgraph, the original flow graph is an information relation graph and is provided with vertexes and edges connected between the vertexes, the vertexes represent entities participating in analysis, the edges between the two vertexes represent the existence of a relation or a traffic path between the two entities, and the tighter the relation or the longer the traffic path is, the greater the weight of the edges is; executing a graph update operation on the sub-graph, wherein the graph update operation comprises deleting and adding edges; Matching the corresponding graph calculation modes for each updated sub-graph, wherein the graph calculation modes comprise a full-scale calculation mode and an increment calculation mode, when the vertex proportion influenced by the updating operation in the updated sub-graph exceeds a preset proportion, the corresponding updated sub-graph is matched with the full-scale calculation mode, and when the vertex proportion influenced by the updating operation in the updated sub-graph does not exceed the preset proportion, the corresponding updated sub-graph is matched with the increment calculation mode; Extracting the dependency relationship between the sub-images after the image update, if the current sub-image has a vertex i