Search

US-12619405-B2 - Method and system for incremental functional approach-based dataflow analysis

US12619405B2US 12619405 B2US12619405 B2US 12619405B2US-12619405-B2

Abstract

This disclosure relates generally to method and system for incremental functional approach-based dataflow analysis. Static dataflow analysis can take hours to days depending on size and complexity of the code. In today's agile development environment faster analysis is required which can handle incremental changes to the code in an efficient manner. The method includes by performing a static dataflow analysis over a set of functions of a source code. Further, obtains a set of impacted functions from the source code and executes a dataflow analysis over the set of impacted functions of the source code. The method performs an incremental functional approach-based dataflow analysis over the set of impacted functions including an incremental bottom-up analysis and an incremental top-down analysis. The method efficiently updates results of dataflow analysis in response to incremental changes which is fast and scalable and minimizes the number of procedures by comparing summaries across the versions.

Inventors

  • Anushri Jana
  • Bharti CHIMDYALWAR
  • Ramanathan Venkatesh
  • Shrawan Kumar

Assignees

  • TATA CONSULTANCY SERVICES LIMITED

Dates

Publication Date
20260505
Application Date
20230908
Priority Date
20221028

Claims (12)

  1. 1 . A processor implemented method of performing incremental functional approach-based dataflow analysis in a source code, the method comprising: performing via one or more hardware processors a static dataflow analysis over a set of functions of a source code; identifying via the one or more hardware processors a set of edited functions from the set of functions based on at least one change observed in the source code, wherein each function represents a node comprising a set of statements being called in the source code; obtaining via the one or more hardware processors a set of impacted functions based on at least one change observed between a current version summary of the source code and a previous version summary of the source code; executing via the one or more hardware processors a dataflow analysis over the set of impacted functions of the source code which recomputes the set of impacted functions by leveraging one or more cached previous analysis results, instead of re-running an analysis for every changed source code by performing an incremental bottom-up analysis on one or more selected functions from the set of functions being traversed in a bottom-up order of a call graph originated from each edited function, wherein performing the incremental bottom-up analysis further comprises: loading a bottom-up worklist of a set of edited functions universally available in a worklist; listing the set of functions to be traversed in the bottom-up order from the bottom-up worklist; obtaining a previous version summary of the edited function to compute a current version summary of the edited function; and performing when the current version summary of the edited function is not identical to the previous version summary of the edited function, identifying a sequence of a set of caller functions from the set of edited functions and recomputing the current version summary of the set of caller functions, wherein recomputing the current version summary is initially started by identifying a chain of callers of the edited function, recomputing and comparing a respective set of summaries recomputed from the callers of the edited function, and stopping re-computation of the current version summary when a procedure with the current version summary matching the previous version summary is found; updating a summary of each edited function with the current version summary; and identifying a set of called functions from the set of edited functions and adding a set of called functions universally available to the set of impacted functions; performing an incremental top-down analysis over the set of impacted functions by traversing the call graph in a top-down order; and performing via the one or more hardware processors, an incremental functional approach-based dataflow analysis over the set of impacted functions based on the current version summary of the source code, the incremental bottom-up analysis, and the incremental top-down analysis, wherein the incremental functional approach-based dataflow analysis forms a summary of each procedure using a bottom-up traversal over the call graph, analyzes every procedure only once even if the procedure includes multiple calls and resulting summaries computed using the bottom-up traversal replace calls in a set of caller functions to construct a summary of the set of caller functions and the call graph is traversed in bottom-up order, implementing via the one or more hardware processors, the incremental functional approach-based dataflow analysis as an incremental static analysis tool that upon execution provides analysis and additionally stores program analysis information.
  2. 2 . The processor implemented method as claimed in claim 1 , wherein the incremental bottom-up analysis computes a set of edited function summaries and a set of function summaries marked as the set of impacted functions, wherein the summaries are computed based on at least one change observed when the current version summary and the previous version summary are compared.
  3. 3 . The processor implemented method as claimed in claim 1 , wherein the incremental top-down analysis updates one or more dataflow values of the set of impacted functions by traversing the call graph in the top-down order.
  4. 4 . The processor implemented method as claimed in claim 1 , wherein performing the incremental top-down analysis comprises: loading a top-down worklist with a set of procedures and listing the set of impacted functions to be traversed in the top-down order from the worklist; obtaining the set of impacted functions universally available in the set of functions and in the set of impacted functions; fetching the set of called functions from the set of impacted functions; and computing an entry of each called function based on a summary of the set of impacted functions and updating the entry of each called function.
  5. 5 . A system for performing incremental functional approach-based dataflow analysis in a source code comprising: a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to: perform a static dataflow analysis over a set of functions of a source code; identify a set of edited functions from the set of functions based on at least one change observed in the source code, wherein each function represents a node comprising a set of statements being called in the source code; obtain a set of impacted functions based on at least one change observed between a current version summary of the source code and a previous version summary of the source code; execute a dataflow analysis over the set of impacted functions of the source code which recomputes the set of impacted functions by leveraging one or more cached previous analysis results, instead of re-running an analysis for every changed source code by performing an incremental bottom-up analysis on one or more selected functions from the set of functions being traversed in a bottom-up order of a call graph originated from each edited function, wherein performing the incremental bottom-up analysis further comprises: loading a bottom-up worklist of a set of edited functions universally available in a worklist; listing the set of functions to be traversed in the bottom-up order from the bottom-up worklist; obtaining a previous version summary of the edited function to compute a current version summary of the edited function; and performing when the current version summary of the edited function is not identical to the previous version summary of the edited function, identifying a sequence of a set of caller functions from the set of edited functions and recomputing the current version summary of the set of caller functions, wherein recomputing the current version summary is initially started by identifying a chain of callers of the edited function, recomputing and comparing a respective set of summaries recomputed from the callers of the edited function, and stopping re-computation of the current version summary when a procedure with the current version summary matching the previous version summary is found; updating a summary of each edited function with the current version summary; and identifying a set of called functions from the set of edited functions and adding a set of called functions universally available to the set of impacted functions; performing an incremental top-down analysis over the set of impacted functions by traversing the call graph in a top-down order and perform an incremental functional approach-based dataflow analysis over the set of impacted functions based on the current version summary of the source code, the incremental bottom-up analysis, and the incremental top-down analysis, wherein the incremental functional approach-based dataflow analysis forms a summary of each procedure using a bottom-up traversal over the call graph, analyzes every procedure only once even if the procedure includes multiple calls and resulting summaries computed using the bottom-up traversal replace calls in a set of caller functions to construct a summary of the set of caller functions and the call graph is traversed in bottom-up order, implementing the incremental functional approach-based dataflow analysis as an incremental static analysis tool that upon execution provides analysis and additionally stores program analysis information.
  6. 6 . The system as claimed in claim 5 , wherein the incremental bottom-up analysis computes a set of edited functions summaries and a set of function summaries marked as the set of impacted functions, wherein the summaries are computed based on at least one change observed when the current version summary and the previous version summary are compared.
  7. 7 . The system as claimed in claim 5 , wherein the incremental top-down analysis updates one or more dataflow values of the set of impacted functions by traversing the call graph in the top-down order.
  8. 8 . The system as claimed in claim 5 , wherein performing the incremental top-down analysis comprises: loading a top-down worklist with a set of procedures and listing the set of impacted functions to be traversed in the top-down order from the worklist; obtaining the set of impacted functions universally available in the set of functions and in the set of impacted functions; fetching the set of called functions from the set of impacted functions; and computing an entry of each called function based on a summary of the set of impacted functions and updating the entry of each called function.
  9. 9 . One or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors cause: performing a static dataflow analysis over a set of functions of a source code; identifying via the one or more hardware processors, a set of edited functions from the set of functions based on at least one change observed in the source code, wherein each function represents a node comprising a set of statements being called in the source code; obtaining a set of impacted functions based on at least one change observed between a current version summary of the source code and a previous version summary of the source code; executing a dataflow analysis over the set of impacted functions of the source code which recomputes the set of impacted functions by leveraging one or more cached previous analysis results, instead of re-running an analysis for every changed source code by performing an incremental bottom-up analysis on one or more selected functions from the set of functions being traversed in a bottom-up order of a call graph originated from each edited function, wherein performing the incremental bottom-up analysis further comprises: loading a bottom-up worklist of a set of edited functions universally available in a worklist; listing the set of functions to be traversed in the bottom-up order from the bottom-up worklist; obtaining a previous version summary of the edited function to compute a current version summary of the edited function; and performing when the current version summary of the edited function is not identical to the previous version summary of the edited function, identifying a sequence of a set of caller functions from the set of edited functions and recomputing the current version summary of the set of caller functions, wherein recomputing the current version summary is initially started by identifying a chain of callers of the edited function, recomputing and comparing a respective set of summaries recomputed from the callers of the edited function, and stopping re-computation of the current version summary when a procedure with the current version summary matching the previous version summary is found; updating a summary of each edited function with the current version summary; and identifying a set of called functions from the set of edited functions and adding a set of called functions universally available to the set of impacted functions; performing an incremental top-down analysis over the set of impacted functions by traversing the call graph in a top-down order; and performing an incremental functional approach-based dataflow analysis over the set of impacted functions based on the current version summary of the source code, the incremental bottom-up analysis, and the incremental top-down analysis, wherein the incremental functional approach-based dataflow analysis forms a summary of each procedure using a bottom-up traversal over the call graph, analyzes every procedure only once even if the procedure includes multiple calls and resulting summaries computed using the bottom-up traversal replace calls in a set of caller functions to construct a summary of the set of caller functions and the call graph is traversed in bottom-up order, implementing the incremental functional approach-based dataflow analysis as an incremental static analysis tool that upon execution provides exhaustive analysis and additionally stores store program analysis information.
  10. 10 . The one or more non-transitory machine-readable information storage mediums of claim 9 , wherein the incremental bottom-up analysis computes a set of edited function summaries and a set of function summaries marked as the set of impacted functions, wherein the summaries are computed based on at least one change observed when the current version summary and the previous version summary are compared.
  11. 11 . The one or more non-transitory machine-readable information storage mediums of claim 9 , wherein the incremental top-down analysis updates one or more dataflow values of the set of impacted functions by traversing the call graph in the top-down order.
  12. 12 . The one or more non-transitory machine-readable information storage mediums of claim 9 , wherein performing the incremental top-down analysis comprises: loading a top-down worklist with a set of procedures and listing the set of impacted functions to be traversed in the top-down order from the worklist; obtaining the set of impacted functions universally available in the set of functions and in the set of impacted functions; fetching the set of called functions from the set of impacted functions; and computing an entry of each called function based on a summary of the set of impacted functions and updating the entry of each called function.

Description

PRIORITY CLAIM This U.S. patent application claims priority under 35 U.S.C. § 119 to: Indian Patent Application No. 202221061597, filed on Oct. 28, 2022. The entire contents of the aforementioned application are incorporated herein by reference. TECHNICAL FIELD The disclosure herein generally relates to verifying software program, and, more particularly, to method and system for incremental functional approach-based dataflow analysis. BACKGROUND In software systems, software evolves more in continuous integration/continuous deployment (CI/CD) environment. Static analysis refers to techniques for analyzing computer software source code without executing the source code as a computer software program. Static analysis tools are widely used to detect runtime programming errors that includes division by zero, or use of uninitialized variables, in industry strength software. The usefulness is due to their scalability, which comes at the cost of precision. In case of evolving software, when release cycle time is comparable to the static analysis tool running time, the tool's deployment at system testing time is not an option. Therefore, for the static analysis tools to be useful in the CI/CD environment, there is a need for dataflow analysis to be faster for subsequent versions of software. Dataflow analysis is a static analysis technique that is applied over the source code. It models the flow of data throughout a program, for example from one variable to another and across branches and loops. Static inter-procedural dataflow analysis is commonly used in industry to automatically detect potential bugs in large software systems due to corresponding scalability. However, this scalable analysis can take hours to days depending on size and complexity of the code. In today's agile development environment faster analysis is required which can handle incremental changes to the code in an efficient manner. Whole-program analysis on successive versions is only time consuming. With incremental changes, a developer naturally expects that the tool must report only many false alarms that are impacted by the change in an efficient manner and end up taking a lot of review efforts of the developers. SUMMARY Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a system for incremental functional approach-based dataflow analysis is provided. The system includes performing a static dataflow analysis over a set of functions of a source code. A set of edited functions are identified from the set of functions based on at least one change observed in the source code, wherein each function represents a node comprising a set of statements being called in the source code. Then, a set of impacted functions are obtained based on at least one change observed between a current version summary of the source code and a previous version summary of the source code. Further, a dataflow analysis is executed over the set of impacted functions of the source code by performing an incremental bottom-up analysis on one or more selected functions from the set of functions being traversed in the bottom-up order of a call graph originated from each edited function and performing an incremental top-down analysis over the set of impacted functions by traversing the call graph in the top-down order. The incremental functional approach-based dataflow analysis is performed over the set of impacted functions based on the current version summary of the source code, the incremental bottom-up analysis, and the incremental top-down analysis. In another aspect, a method for incremental functional approach-based dataflow analysis is provided. The method includes for incremental functional approach-based dataflow analysis is provided. The system includes performing a static dataflow analysis over a set of functions of a source code. A set of edited functions are identified from the set of functions based on at least one change observed in the source code, wherein each function represents a node comprising a set of statements being called in the source code. Then, a set of impacted functions are obtained based on at least one change observed between a current version summary of the source code and a previous version summary of the source code. Further, a dataflow analysis is executed over the set of impacted functions of the source code by performing an incremental bottom-up analysis on one or more selected functions from the set of functions being traversed in the bottom-up order of a call graph originated from each edited function and performing an incremental top-down analysis over the set of impacted functions by traversing the call graph in the top-down order. The incremental functional approach-based dataflow analysis is performed over the set of impacted functions based on the current versio