CN-119516554-B - Identification method of hand-drawn flow chart, electronic equipment, medium and computer product
Abstract
A method for identifying a hand-drawn flowchart, an electronic device, a medium, and a computer product. The method comprises: obtaining stroke features corresponding to the strokes in the hand-drawn flowchart; inputting the stroke features into a graph neural network model, and obtaining the predicted categories of the strokes in the hand-drawn flowchart output by the graph neural network model, together with symbol features obtained by clustering the stroke features, wherein the graph neural network model is obtained by training on sample stroke features; and determining the categories of the symbol features based on the predicted categories of the strokes in the hand-drawn flowchart. The technical solution provided by the invention improves the recognition accuracy of symbols in hand-drawn flowcharts and addresses the difficulty of recognizing hand-drawn flowcharts in the prior art.
Inventors
- ZHANG YANMING
- WANG HAOZHE
- YIN FEI
- ZHANG HENG
Assignees
- Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Dates
- Publication Date
- 20260505
- Application Date
- 20240919
Claims (7)
- 1. A method for identifying a hand-drawn flowchart, comprising:
  acquiring stroke features corresponding to strokes in the hand-drawn flowchart;
  inputting the stroke features into a graph neural network model, and obtaining predicted categories of the strokes in the hand-drawn flowchart output by the graph neural network model and symbol features obtained by clustering the stroke features, wherein the graph neural network model is obtained by training on sample stroke features; and
  determining a category of the symbol features based on the predicted categories of the strokes in the hand-drawn flowchart;
  wherein acquiring the stroke features corresponding to the strokes in the hand-drawn flowchart comprises:
  acquiring the strokes in the hand-drawn flowchart;
  acquiring geometric features of the strokes, wherein the geometric features represent position information of the strokes and are a numerical representation of the spatial attributes and shape characteristics of a single stroke in the hand-drawn flowchart; each stroke is preprocessed and converted into a sequence of 3-dimensional vectors by calculating first-order differences of the trajectory point coordinates, the converted sequence being expressed as: $s = [(\Delta x_1, \Delta y_1, p_1), \ldots, (\Delta x_m, \Delta y_m, p_m)]$, where $\Delta x_i$ and $\Delta y_i$ represent the change between adjacent trajectory points in the x and y coordinate directions, respectively, $m$ is the length of stroke $s$, and $p_i$ indicates the pen state, taking one value when the stroke falls at that point and another value otherwise; a feature extraction network based on a recurrent neural network (RNN) extracts the geometric features, the network being formed by stacking a plurality of bidirectional RNN layers, each layer comprising both a forward RNN unit and a reverse RNN unit, wherein the forward RNN unit processes the forward information of the time sequence and the reverse RNN unit captures the reverse information, so that the model can more comprehensively understand the context information of the stroke trajectory sequence data;
  acquiring edge features between the strokes, wherein the edge features represent the spatio-temporal relations between the strokes; the differences between the coordinates of sampling points of two strokes are calculated and spliced to obtain a position embedding PE, expressed as: [formula missing in source]; the relative order of the strokes is described by a time embedding TE: [formula missing in source], where the formula involves a time factor that controls the importance of the temporal neighbors, the number of such factors, and a splice (concatenation) operator; the edge feature is the splice of the position embedding and the time embedding, $\mathrm{PE} \,\Vert\, \mathrm{TE}$; and
  taking the geometric features and the edge features corresponding to the strokes as the stroke features;
  wherein the graph neural network model comprises a node classification branch and a node clustering branch, the node clustering branch comprises a message passing neural network and a mean shift clustering layer, and the training process of the graph neural network model comprises:
  inputting sample stroke features into the node classification branch, and obtaining a sample prediction category corresponding to the sample stroke features output by the node classification branch;
  inputting the sample stroke features into the message passing neural network, and obtaining sample high-dimensional features output by the message passing neural network, wherein the sample high-dimensional features represent the shape and position relations among the sample stroke features;
  inputting the sample high-dimensional features into the mean shift clustering layer to obtain sample symbol features output by the mean shift clustering layer, wherein the mean shift clustering layer is used for clustering the sample high-dimensional features;
  acquiring a first difference value of the sample prediction category, and acquiring a second difference value of the sample symbol features; and
  adjusting parameters of the graph neural network model based on the first difference value and the second difference value, wherein the second difference value is the sum of a third difference value and a fourth difference value, and the second difference value is determined as follows:
  the center embedding of each symbol is $\mu_c = \frac{1}{N_c} \sum_{i \in S_c} x_i$, where $\mu_c$ represents the center embedding vector of symbol $c$, i.e. the average of all stroke embeddings within symbol $c$, $x_i$ represents the embedding vector of stroke $i$, $S_c$ represents the set of strokes in symbol $c$, including all strokes $i$ belonging to symbol $c$, and $N_c$ represents the number of strokes in symbol $c$;
  the intra-class compactness loss: [formula missing in source], which measures the distance between the stroke embeddings and the symbol center embedding within the same symbol, using the square of the L2 norm of the distance between the center embedding of the symbol and the stroke embedding, where C represents the number of hand-drawn flowcharts and $N_c$ represents the number of strokes in a symbol;
  the inter-class separation loss: [formula missing in source], which measures the distance between the center embeddings of different symbols, using a predefined inter-class distance threshold that ensures sufficient separation between the center embeddings of different symbols, where $[x]_+$ denotes the hinge function, i.e. $\max(0, x)$;
  the regularization function: [formula missing in source], where the formula involves the fourth difference value, a weight hyperparameter of the regularization term that controls the strength of the regularization term, and the center embedding $\mu_c$ of symbol $c$, the regularization limiting the magnitude of the embedding vectors; and
  the total loss: [formula missing in source], where the formula involves the total loss, the first difference value, the third difference value, and the total number of iterations of the mean shift clustering layer.
- 2. The method of claim 1, wherein the determining the category of the symbol feature based on the predicted category of the strokes in the hand-drawn flowchart comprises: merging the strokes into a cluster of strokes based on the Euclidean distance between the strokes; and acquiring the category of the symbol characteristic based on the predicted category of the strokes in the stroke cluster.
- 3. The method of claim 1, wherein the obtaining the second difference value of the sample symbol feature comprises: calculating a third difference value of the message passing neural network using a cross-entropy loss function based on the sample high-dimensional features; calculating a fourth difference value of the mean shift clustering layer using a node embedding function based on the sample symbol features, wherein the node embedding function is used for optimizing the clustering result of the mean shift clustering layer so as to enhance the degree of aggregation between stroke features within the same sample symbol feature and the degree of separation between stroke features in different sample symbol features; and taking the sum of the third difference value and the fourth difference value as the second difference value.
- 4. The method of claim 1, wherein the obtaining a first difference value of the sample prediction category comprises: if the sample set of the sample stroke features is smaller than or equal to a sample set threshold, calculating the first difference value of the sample prediction category using a weighted cross-entropy loss function; and if the sample set of the sample stroke features is greater than the sample set threshold, calculating the first difference value of the sample prediction category using a balanced Softmax loss function.
- 5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the stroke recognition method of the hand-drawn flowchart of any one of claims 1 to 4.
- 6. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a stroke recognition method of a hand-drawn flowchart according to any of claims 1 to 4.
- 7. A computer program product comprising a computer program which, when executed by a processor, implements a stroke recognition method of a hand-drawn flowchart as claimed in any one of claims 1 to 4.
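The clustering-related difference values described in claims 1 and 3 (intra-class compactness, inter-class separation, and regularization of the symbol center embeddings) can be illustrated with a minimal NumPy sketch. It is not the claimed formulation: the function name `embedding_loss`, the per-symbol averaging, the squared hinge, and the default values of `delta` and `gamma` are all assumptions made for illustration.

```python
import numpy as np


def embedding_loss(embeddings, symbol_ids, delta=1.5, gamma=0.001):
    """Illustrative clustering loss on stroke embeddings.

    Assumptions (not taken from the patent text): per-symbol averaging,
    a squared hinge on center distances, and the defaults for the
    distance threshold `delta` and regularization weight `gamma`.
    Returns (compactness, separation, regularization).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    symbol_ids = np.asarray(symbol_ids)
    symbols = np.unique(symbol_ids)

    # Center embedding of each symbol: mean of its stroke embeddings.
    centers = np.stack(
        [embeddings[symbol_ids == c].mean(axis=0) for c in symbols]
    )

    # Intra-class compactness: mean squared L2 distance between each
    # stroke embedding and the center of its own symbol.
    compact = 0.0
    for k, c in enumerate(symbols):
        diffs = embeddings[symbol_ids == c] - centers[k]
        compact += np.mean(np.sum(diffs ** 2, axis=1))
    compact /= len(symbols)

    # Inter-class separation: hinge [delta - distance]_+ (squared here)
    # over all pairs of distinct symbol centers.
    separate = 0.0
    n_pairs = 0
    for a in range(len(symbols)):
        for b in range(a + 1, len(symbols)):
            dist = np.linalg.norm(centers[a] - centers[b])
            separate += max(0.0, delta - dist) ** 2
            n_pairs += 1
    if n_pairs:
        separate /= n_pairs

    # Regularization: penalizes large center embeddings.
    reg = gamma * np.mean(np.linalg.norm(centers, axis=1))

    return compact, separate, reg


# Two well-separated symbols with two strokes each (made-up data).
example_embeddings = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
example_symbols = [0, 0, 1, 1]
compact, separate, reg = embedding_loss(example_embeddings, example_symbols)
```

In this toy input the two symbol centers are far apart, so the separation term vanishes and only the compactness and regularization terms contribute. How the terms are weighted and combined with the classification loss is left open here.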
Description
Identification method of hand-drawn flow chart, electronic equipment, medium and computer product
Technical Field
The present invention relates to the field of image recognition technologies, and in particular to a method for recognizing a hand-drawn flowchart, an electronic device, a medium, and a computer product.
Background
With the rapid development of information technology, flowcharts are increasingly used to describe business processes, algorithm processes, and the like. Flowcharts are currently drawn mainly with computer graphics tools. However, in some application scenarios, such as conference discussions and brainstorming sessions, people are more accustomed to drawing flowcharts by hand. Compared with computer drawing tools, hand-drawn flowcharts are more flexible and intuitive, and make complex ideas and concepts easier to express. Nevertheless, the recognition and understanding of hand-drawn flowcharts still face many challenges. Symbols in a hand-drawn flowchart, such as boxes and arrows, have irregular shapes, varying sizes, and arbitrary positions, and drawing styles differ greatly from person to person, which makes automatic recognition of the flowchart very difficult. Current common symbol recognition methods mainly target printed symbols, and their recognition accuracy on hand-drawn symbols is lower.
Disclosure of Invention
The invention provides a method for identifying a hand-drawn flowchart, an electronic device, a medium, and a computer product, which can improve the recognition accuracy of symbols in the hand-drawn flowchart and effectively address the difficulty of recognizing hand-drawn flowcharts in the prior art.
In a first aspect of the present invention, there is provided a method for identifying a hand-drawn flowchart, including:
acquiring stroke features corresponding to strokes in the hand-drawn flowchart;
inputting the stroke features into a graph neural network model, and obtaining predicted categories of the strokes in the hand-drawn flowchart output by the graph neural network model and symbol features obtained by clustering the stroke features, wherein the graph neural network model is obtained by training on sample stroke features; and
determining a category of the symbol features based on the predicted categories of the strokes in the hand-drawn flowchart.
Optionally, the acquiring the stroke features corresponding to the strokes in the hand-drawn flowchart includes:
acquiring the strokes in the hand-drawn flowchart;
acquiring geometric features of the strokes, wherein the geometric features are used for representing position information of the strokes;
acquiring edge features between the strokes, wherein the edge features are used for representing spatio-temporal relations between the strokes; and
taking the geometric features and the edge features corresponding to the strokes as the stroke features.
Optionally, the determining the category of the symbol features based on the predicted categories of the strokes in the hand-drawn flowchart includes:
merging the strokes into stroke clusters based on the Euclidean distance between the strokes; and
acquiring the category of the symbol features based on the predicted categories of the strokes in the stroke cluster.
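The optional step of merging strokes into stroke clusters by Euclidean distance and then deriving a symbol category from the strokes in each cluster can be sketched as follows. This is only one possible reading: the distance threshold, the single-linkage merging rule, the majority vote, and the names `merge_strokes` and `symbol_categories` are assumptions for illustration, and the example embeddings and category labels are made up.

```python
from collections import Counter

import numpy as np


def merge_strokes(embeddings, threshold):
    """Group strokes whose embeddings lie within `threshold` of each other.

    Single-linkage merging via union-find (an assumed merging rule).
    Returns one cluster label per stroke, numbered in order of appearance.
    """
    points = np.asarray(embeddings, dtype=float)
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(points[i] - points[j]) <= threshold:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    relabel = {root: k for k, root in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


def symbol_categories(cluster_labels, stroke_categories):
    """Give each cluster the most common predicted category of its strokes
    (majority voting is an assumption, not stated in the text)."""
    votes = {}
    for label, category in zip(cluster_labels, stroke_categories):
        votes.setdefault(label, Counter())[category] += 1
    return {label: c.most_common(1)[0][0] for label, c in votes.items()}


# Five strokes: three close together, two close together elsewhere.
embeddings = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [5.1, 4.9]]
predicted = ["process", "process", "arrow", "arrow", "arrow"]

labels = merge_strokes(embeddings, threshold=1.0)
categories = symbol_categories(labels, predicted)
```

With a vote of this kind, a single misclassified stroke inside a cluster (the third stroke above) does not change the category assigned to the whole symbol.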
Optionally, the graph neural network model includes a node classification branch and a node clustering branch, and the training process of the graph neural network model includes:
inputting sample stroke features into the node classification branch, and obtaining a sample prediction category corresponding to the sample stroke features output by the node classification branch;
inputting the sample stroke features into the node clustering branch, and obtaining sample symbol features output by the node clustering branch, the sample symbol features being obtained by clustering the sample stroke features;
acquiring a first difference value of the sample prediction category, and acquiring a second difference value of the sample symbol features; and
adjusting parameters of the graph neural network model based on the first difference value and the second difference value.
Optionally, the node clustering branch includes a message passing neural network and a mean shift clustering layer, and the inputting the sample stroke features into the node clustering branch and obtaining the sample symbol features output by the node clustering branch includes:
inputting the sample stroke features into the message passing neural network, and obtaining sample high-dimensional features output by the message passing neural network, wherein the sample high-dimensional features are used for representing the shape and position relations among the sample stroke features;
and inputting the sample high-dimensional features to the mean shift clustering layer to obtain