
CN-122019346-A - Artificial intelligence driven program debugging method and system


Abstract

The invention relates to the technical field of program debugging and discloses an artificial intelligence driven program debugging method and system. The method comprises: performing lexical analysis and syntax analysis on a source code file, generating an abstract syntax tree, and performing format symbol matching; establishing a control flow debugging relation and a data flow debugging relation between debugging information and syntax nodes, calculating the debugging value weight of each syntax node, and screening the syntax nodes with high debugging value weights; generating debug statements with an AI-driven code generation model; and performing format security verification and semantic consistency verification on the debug statements. By jointly analyzing the syntactic structure and semantic features of the source code, the invention intelligently screens syntax nodes of high debugging value and generates safe, semantically consistent debug statements, thereby improving debugging localization accuracy and debugging efficiency.
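
The format symbol matching and debug statement generation described above can be pictured with a small sketch. The Python below is only an illustration under stated assumptions, not the patented implementation: the table `TYPE_FORMAT_MAP`, the `VarNode` record, and the helpers `match_format` and `make_debug_statement` are hypothetical names, the `[DEBUG]` tag is an arbitrary choice, and the entries are ordinary C `printf` conversion specifiers picked for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical type-to-format-symbol mapping table for a few C types.
# The specifiers are standard C printf conversions; the table is illustrative.
TYPE_FORMAT_MAP = {
    "int": "%d",
    "unsigned int": "%u",
    "long": "%ld",
    "double": "%f",
    "char": "%c",
    "char *": "%s",
    "void *": "%p",
}


@dataclass
class VarNode:
    """Simplified stand-in for a variable declaration/reference syntax node."""
    name: str
    c_type: str
    line: int
    fmt: Optional[str] = None  # filled in by match_format()


def match_format(node: VarNode) -> VarNode:
    """Attach the format symbol for the node's type, or None if unmapped."""
    node.fmt = TYPE_FORMAT_MAP.get(node.c_type)
    return node


def make_debug_statement(node: VarNode, filename: str) -> str:
    """Build a C printf debug statement tagged with file name and line number."""
    if node.fmt is None:
        raise ValueError(f"no format symbol for type {node.c_type!r}")
    return (
        f'printf("[DEBUG] {filename}:{node.line} '
        f'{node.name} = {node.fmt}\\n", {node.name});'
    )


node = match_format(VarNode(name="count", c_type="int", line=42))
statement = make_debug_statement(node, "main.c")
print(statement)
```

In this sketch the format symbol is stored on the node itself, mirroring the idea of adding the matched symbol to the node's attributes so that later steps can use it as a constraint.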

Inventors

  • TIAN CHUNHUI

Assignees

  • 北京鲲鹏凌昊智能技术有限公司

Dates

Publication Date
20260512
Application Date
20260104

Claims (10)

  1. An artificial intelligence driven program debugging method, comprising: S1, acquiring a source code file of a program to be debugged, performing lexical analysis and syntax analysis on the source code file, generating an abstract syntax tree corresponding to the source code file, extracting syntax labels of the syntax nodes in the abstract syntax tree, constructing a type-format symbol mapping table based on the types in the syntax labels, and performing automatic format symbol matching on the syntax nodes to obtain a format-symbol-matched abstract syntax tree; S2, according to the format-symbol-matched abstract syntax tree and the syntax labels of the syntax nodes, extracting semantic features from the format-symbol-matched abstract syntax tree by using an artificial intelligence based code analysis model to obtain the semantic features of the syntax nodes; S3, establishing a control flow debugging relation and a data flow debugging relation between debugging information and the syntax nodes according to the semantic features of the syntax nodes, calculating the debugging value weights of the syntax nodes, and screening the syntax nodes with high debugging value weights; S4, extracting context information of the syntax nodes with high debugging value weights, and generating the corresponding debug statements by using an AI-driven code generation model; and S5, performing format security verification and semantic consistency verification on the debug statements, inserting the debug statements that pass verification at the corresponding debugging positions in the source code file, and compiling and debugging the source code file.
  2. The artificial intelligence driven program debugging method according to claim 1, wherein in step S1, acquiring the source code file of the program to be debugged, performing lexical analysis and syntax analysis on the source code file, and generating the abstract syntax tree corresponding to the source code file comprises: S11, performing an integrity check and standardization on the source code file to obtain a standardized source code file, marking any source code file that fails the integrity check as abnormal input, and stopping the subsequent analysis flow; S12, performing lexical analysis on the standardized source code file to obtain a lexical unit sequence ordered by the character stream, and recording the structural attributes of each lexical unit in the sequence, wherein the structural attributes comprise the code value, the lexical type, and the position of the lexical unit, the position being the line number and column number of the lexical unit in the standardized source code file; S13, performing syntax analysis on the lexical unit sequence in a bottom-up manner, extracting syntax structure units from the lexical unit sequence, creating the corresponding syntax nodes, and marking the hierarchical structure of the syntax nodes to form the abstract syntax tree corresponding to the source code file.
  3. The artificial intelligence driven program debugging method of claim 2, wherein extracting the syntax labels of the syntax nodes in the abstract syntax tree comprises: performing label extraction on each syntax node in a depth-first traversal starting from the root node of the abstract syntax tree, to obtain the syntax label of each syntax node in the abstract syntax tree, wherein the syntax label of a syntax node comprises the syntax node type and the semantic attributes corresponding to that type, and the syntax node types comprise root node, function node, variable declaration and reference node, control structure node, and expression node.
  4. The artificial intelligence driven program debugging method of claim 3, wherein constructing the type-format symbol mapping table and performing automatic format symbol matching on the syntax nodes to obtain the format-symbol-matched abstract syntax tree comprises: extracting the syntax nodes whose type is variable declaration and reference node, constructing the type-format symbol mapping table, performing automatic format symbol matching on the extracted syntax nodes by using the mapping table to obtain the format symbols corresponding to the different variable data types, and adding the matched format symbols to the semantic attributes of the variable declaration and reference nodes to form the format-symbol-matched abstract syntax tree.
  5. The artificial intelligence driven program debugging method according to claim 1, wherein in step S2, extracting semantic features from the format-symbol-matched abstract syntax tree by using the artificial intelligence based code analysis model comprises: the artificial intelligence based code analysis model comprises a syntax structure coding layer, a context and scope fusion layer, a data dependency perception layer, and a semantic feature output layer; S21, the syntax structure coding layer extracts the syntax labels of the syntax nodes and performs label-based word vector coding on the semantic attribute sequence of each syntax node to obtain the coding vector of the syntax node; S22, the context and scope fusion layer extracts the context node set of the syntax node in the abstract syntax tree, extracts the coding vectors of the syntax nodes in the context node set, and combines those coding vectors by attention weighting to obtain the fusion feature of the syntax node whose semantic features are to be calculated; S23, the data dependency perception layer selects, based on the dependency and reference relations of the variable data in the syntax node, a dependency reference node set having a dependency or reference relation with the syntax node, calculates a dependency weight coefficient between each syntax node in the dependency reference node set and the syntax node whose semantic features are to be calculated, and performs dependency-weighted propagation on the coding vectors of the syntax nodes in the dependency reference node set to obtain the semantic propagation feature of the syntax node whose semantic features are to be calculated; and S24, the semantic feature output layer weights the fusion feature and the semantic propagation feature of the syntax node and outputs the semantic features of the syntax node.
  6. The artificial intelligence driven program debugging method according to claim 1, wherein establishing the control flow debugging relation and the data flow debugging relation between the debugging information and the syntax nodes in step S3 comprises: S31, extracting, from the abstract syntax tree, the number of child nodes of the syntax node, and converting the number of child nodes into the control flow debugging relation between the debugging information and the syntax node; S32, extracting the dependency reference node set of the syntax node according to the semantic feature calculation flow of the syntax node, counting the syntax nodes in the dependency reference node set as the number of data flow dependencies of the syntax node, and converting the number of data flow dependencies into the data flow debugging relation between the debugging information and the syntax node.
  7. The artificial intelligence driven program debugging method according to claim 6, wherein in step S3, calculating the debugging value weights of the syntax nodes according to the control flow debugging relation and the data flow debugging relation between the debugging information and the syntax nodes, and screening the syntax nodes with high debugging value weights, further comprises: S33, calculating, based on the semantic features of the syntax nodes, attention semantic weights between the syntax nodes in the dependency reference node set and the syntax node whose debugging value weight is to be calculated; S34, according to the attention semantic weights, calculating a node anomaly risk score of the syntax node by using a graph neural network model that incorporates an attention mechanism, wherein the node anomaly risk score is used to evaluate the potential error-triggering probability of the syntax node; S35, calculating the debugging value weight of the syntax node according to its node anomaly risk score, control flow debugging relation, and data flow debugging relation; S36, screening the syntax nodes whose debugging value weight is higher than a preset value threshold as the syntax nodes with high debugging value weights.
  8. The artificial intelligence driven program debugging method according to claim 1, wherein in step S4, extracting the context information of the syntax nodes with high debugging value weights and generating the corresponding debug statements by using the AI-driven code generation model comprises: S41, extracting the context information of the syntax nodes with high debugging value weights, wherein the context information comprises the function name, the parameter list, the variable data set, the variable data types, the format symbols of the variable data, and the variable positions in the syntax node; and S42, with the format symbols of the variable data as constraints, feeding the context information to the AI-driven code generation model to generate a semantically complete debug statement.
  9. The artificial intelligence driven program debugging method according to claim 1, wherein performing the format security verification and the semantic consistency verification on the debug statement in step S5 comprises: S51, extracting the format symbols of the variable data in the debug statement, and calculating the matching relation between the format symbols of the variable data in the debug statement and the variable data in the context information as the format security verification result; the calculation formula for the matching relation between the format symbols of the variable data in the debug statement R and the variable data in the context information is as follows: [formula not preserved in the source text]; wherein the symbols of the formula denote, respectively: the number of variable data items in the debug statement R; the format symbol of the k-th variable data item in the debug statement R; the variable data in the context information corresponding to the k-th variable data item in the debug statement R; and an indicator, based on the type-format mapping table, over the variable data type of that variable data and the format symbol, which is given as 0 if the type-format mapping table contains the variable data type of the variable data together with the format symbol, and as 1 otherwise; S52, constructing a semantic consistency check function to detect the consistency of the variable data and the function name in the debug statement with the context information, obtaining the semantic consistency verification result; the calculation formula for the consistency of the variable data and the function name in the debug statement with the context information is as follows: [formula not preserved in the source text]; wherein the symbols of the formula denote, respectively: the set of variable data in the debug statement; the set of variable data in the context information; the number of variable data items in a variable data set; the set intersection operation; the number of variable data items in the intersection; and the consistency of the function name in the debug statement, which is 1 if the function name in the debug statement is consistent with the function name in the context information and 0 otherwise; and S53, weighting the format security verification result and the semantic consistency verification result to obtain a weighted verification result, wherein the debug statement passes verification if the weighted verification result is higher than a preset verification threshold and fails verification otherwise.
  10. An artificial intelligence driven program debugging system, characterized in that the program debugging system comprises a compilation preprocessing module, a compile-time type-safety inference module, a source code context awareness module, a debug statement generation module, a syntax highlighting enhancement module, and a dynamic switch control module, so as to implement the artificial intelligence driven program debugging method according to any one of claims 1-9.
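
The verification in step S5 (claim 9) can be sketched as follows. The calculation formulas themselves are not legible in this text, so everything numeric here is an assumption made for illustration: the format security score is taken as the fraction of variables whose (type, format symbol) pair appears in the mapping table, with a match scored as 1; the consistency score is taken as the mean of the variable-overlap ratio (relative to the debug statement's variables) and the function-name indicator; and the weights and the threshold are arbitrary. All names (`TYPE_FORMAT_MAP`, `DebugStatement`, `Context`, `passes_check`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical type-to-format-symbol table (standard C printf specifiers).
TYPE_FORMAT_MAP = {"int": "%d", "double": "%f", "char *": "%s"}


@dataclass
class DebugStatement:
    function_name: str
    var_formats: dict  # variable name -> format symbol used in the statement


@dataclass
class Context:
    function_name: str
    var_types: dict  # variable name -> declared C type


def format_safety(stmt: DebugStatement, ctx: Context) -> float:
    """Fraction of statement variables whose (type, format) pair is in the table.

    Assumption: a matching pair scores 1 and a mismatch scores 0.
    """
    if not stmt.var_formats:
        return 1.0
    hits = sum(
        1
        for name, fmt in stmt.var_formats.items()
        if TYPE_FORMAT_MAP.get(ctx.var_types.get(name)) == fmt
    )
    return hits / len(stmt.var_formats)


def semantic_consistency(stmt: DebugStatement, ctx: Context) -> float:
    """Assumed combination: mean of variable overlap and function-name match."""
    stmt_vars = set(stmt.var_formats)
    if stmt_vars:
        overlap = len(stmt_vars & set(ctx.var_types)) / len(stmt_vars)
    else:
        overlap = 1.0
    name_ok = 1.0 if stmt.function_name == ctx.function_name else 0.0
    return 0.5 * overlap + 0.5 * name_ok


def passes_check(stmt, ctx, w_format=0.5, w_semantic=0.5, threshold=0.8) -> bool:
    """Weighted check (S53): pass only if the score exceeds the threshold."""
    score = (w_format * format_safety(stmt, ctx)
             + w_semantic * semantic_consistency(stmt, ctx))
    return score > threshold


ctx = Context("parse_header", {"length": "int", "ratio": "double"})
good = DebugStatement("parse_header", {"length": "%d", "ratio": "%f"})
bad = DebugStatement("parse_header", {"length": "%s", "ratio": "%f"})
print(passes_check(good, ctx), passes_check(bad, ctx))
```

With these assumed weights, a statement that prints an `int` with `%s` drops to a weighted score of 0.75 and is rejected, which is the kind of type and format mismatch the check is meant to catch before the statement is inserted into the source file.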

Description

Artificial intelligence driven program debugging method and system

Technical Field

The invention relates to the field of program debugging, and in particular to an artificial intelligence driven program debugging method and system.

Background

With the rapid development of information technology and embedded systems, C/C++ still dominates key fields such as system software, low-level drivers, and embedded development. However, C/C++ debugging has long relied on two traditional approaches: breakpoint tracing with GDB (the GNU Debugger), and printing key information by inserting printf() statements into the code. Although GDB offers finer-grained debugging control, its barrier to entry is high: developers need extensive knowledge of debugging commands, and the -g debug option must be enabled at compile time, so the operation is complex and inefficient.

printf debugging is general and direct. Although it is simple and easy to apply, it has clear drawbacks. For debug information that includes expressions, a different format string must be written by hand for each C data type. The information to print is often more than the variable value: the variable's position in the source code is also key to locating a problem, so location information such as the file name and line number must be added by hand. Ordinary print output is not distinctive enough and cannot be clearly told apart from the program's normal output. When the debugging stage ends, debug output must be turned off for the formal release, and manually added debug code usually has to be removed by hand, which easily leads to omissions. While debug code remains, a compile-time warning is needed as a reminder so that the release program does not print internal information. Finally, debug information lacks systematic management, and it is difficult to form a unified debugging framework.

These problems make C/C++ debugging complex and inefficient. In large-scale engineering or system-level development in particular, debugging time and cost are high, which greatly constrains product development and delivery efficiency.

Existing research includes schemes that try to improve debugging efficiency through automated mechanisms. For example, patent document CN115599671A discloses an application program debugging method, apparatus, device, and storage medium, which obtains a target application program identifier and a debug class identifier in response to an application program debugging request, then obtains the debug code corresponding to the debug class, redefines the target debug class, generates a debug agent program, and debugs the target application program in combination with a virtual machine process identifier.

To address these problems, the invention provides an artificial intelligence driven program debugging method and system that automatically identifies program structure and code semantic information through artificial intelligence, generates suitably adapted debug output statements, automatically labels location information, and improves the efficiency and reliability of program debugging.
Disclosure of Invention

The invention provides an artificial intelligence driven program debugging method and system, aimed at the technical problems that, in existing program debugging, debugging points depend on human experience, debug statements and debugging scope are chosen arbitrarily, and new semantic or format errors are easily introduced. Step S1 constructs an abstract syntax tree through lexical and syntax analysis of the source code and, in combination with the syntax labels, automatically matches variable types to format symbols, which solves the problem in traditional debug statements that format symbols and variable types are inconsistent and easily cause abnormal operation. Step S2 introduces an artificial intelligence based code analysis model to jointly model the structural features, context information, and data dependency relations of the syntax nodes, which solves the problem that existing debugging methods attend only to local statements and have difficulty identifying key debugging positions accurately. Step S3 constructs the control flow and data flow debugging relations and calculates debugging value weights to screen high-value debugging nodes automatically, which solves the problem that debugging points are selected blindly and debugging efficiency is low. Steps S4 and S5 automatically generate sentences c