US-12619522-B2 - Information processing device, information processing method, and information processing system

Abstract

Provided is an information processing device that performs processing regarding a code analysis of software that is in an intermediate state. The information processing device includes a difference information acquisition section and a code analysis section. The difference information acquisition section acquires difference information regarding the difference between a code stored in a first repository and a code stored in a second repository in correspondence with the code stored in the first repository. The code analysis section reproduces the code stored in the first repository from the code stored in the second repository, in reference to the difference information, and performs code analysis on the reproduced code. The information processing device reports the result of analysis by the code analysis section to the developer of the reproduced code.
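The flow in the abstract (acquire difference information, reproduce the first repository's code from the second repository's code, analyze the reproduced code) can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the `Commit` and `Snapshot` types, the function names, and the toy `analyze` check are hypothetical, and a real system would more likely exchange version-control patches than whole file contents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Snapshot = Dict[str, str]  # file path -> file content (hypothetical model)


@dataclass
class Commit:
    commit_id: str
    # file path -> new content; None marks a deleted file
    changes: Dict[str, Optional[str]]


def acquire_difference(local_history: List[Commit],
                       remote_history: List[Commit]) -> List[Commit]:
    """Return commits present in the first (local) repository but not yet in
    the second (remote) repository, in their original order."""
    pushed = {c.commit_id for c in remote_history}
    return [c for c in local_history if c.commit_id not in pushed]


def reproduce(remote_snapshot: Snapshot, difference: List[Commit]) -> Snapshot:
    """Rebuild the local code by replaying the difference on the remote code."""
    files = dict(remote_snapshot)  # copy so the remote snapshot is untouched
    for commit in difference:
        for path, content in commit.changes.items():
            if content is None:
                files.pop(path, None)
            else:
                files[path] = content
    return files


def analyze(files: Snapshot) -> List[str]:
    """Toy stand-in for code analysis: list files that still contain TODO."""
    return sorted(path for path, text in files.items() if "TODO" in text)


# Example: c1 is already pushed; c2 and c3 exist only locally.
c1 = Commit("c1", {"app.py": "print('v1')\n"})
c2 = Commit("c2", {"app.py": "print('v2')  # TODO\n", "util.py": "X = 1\n"})
c3 = Commit("c3", {"util.py": None})

remote_history = [c1]
local_history = [c1, c2, c3]

remote_snapshot = reproduce({}, remote_history)
difference = acquire_difference(local_history, remote_history)
reproduced = reproduce(remote_snapshot, difference)
report = analyze(reproduced)
```

In this sketch the analysis side never needs access to the local repository itself; it only needs the remote snapshot and the list of unpushed commits, which is the property the abstract relies on.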

Inventors

Kishin Yagami
Tatsuo Nagamatsu

Assignees

Sony Group Corporation

Dates

Publication Date: 20260505
Application Date: 20211228
Priority Date: 20210224

Claims (13)

1. An information processing device, comprising: a central processing unit (CPU) configured to: acquire difference information that is associated with a difference between a first code and a second code, wherein a first repository stores the first code, and a second repository stores the second code in correspondence with the first code; reproduce, based on the difference information, the first code from the second code; generate a feature vector file based on metadata of the reproduced first code and a test history, wherein the test history is associated with the first code; generate, based on the feature vector file, a machine learning model; generate, based on the machine learning model, a plurality of failure prediction scores for each test code of a plurality of test codes; select, based on the plurality of failure prediction scores, a test code of the plurality of test codes; execute the selected test code with respect to the reproduced first code; and generate a result of analysis based on the executed selected test code.
2. The information processing device according to claim 1, wherein the CPU is further configured to transmit the result of the analysis to a developer of the reproduced first code.
3. The information processing device according to claim 1, wherein the CPU is further configured to acquire the difference information that includes a first list of commits, and the first list of commits is not pushed to the second repository.
4. The information processing device according to claim 3, wherein based on a branch in a code history, the CPU is further configured to acquire the difference information that includes a second list of commits, the second list of commits is associated with the reproduction of the first code in a latest form, and the reproduced first code is stored in the first repository from the branch.
5. The information processing device according to claim 4, wherein the CPU is further configured to: reproduce, based on specific information associated with each commit of the second list of commits, the first code that is stored in the first repository, wherein the first code is in the latest form; and perform a code analysis process on the reproduced first code.
6. The information processing device according to claim 1, wherein the selected test code is associated with the first code.
7. The information processing device according to claim 6, wherein the CPU is further configured to select the test code based on the machine learning model, and the machine learning model is associated with the test history of a plurality of changed files, the test history of the plurality of changed files is derived from each of a code modification and a plurality of tests, and the plurality of tests is based on the plurality of test codes.
8. The information processing device according to claim 7, wherein the CPU is further configured to: apply a plurality of rules to a result of prediction by the machine learning model; and select, based on the applied plurality of rules, the test code of the plurality of test codes.
9. The information processing device according to claim 8, wherein the plurality of rules includes at least one of: a rule to conduct a previous failed test; a rule to conduct the plurality of tests based on a first branch name; a rule to adjust a test selection rate for continuous integration with respect to a second branch name; or a rule to give preference to a specific test of the plurality of tests.
10. The information processing device according to claim 8, wherein a plurality of conditions is set for each rule of the plurality of rules, and the CPU is further configured to: apply a set of rules of the plurality of rules to the result of prediction, wherein the set of rules matches with the plurality of conditions; and select, based on the applied set of rules, the test code of the plurality of test codes.
11. An information processing method, comprising: in an information processing device: acquiring difference information that is associated with a difference between a first code and a second code, wherein a first repository stores the first code, and a second repository stores the second code in correspondence with the first code; reproducing, based on the difference information, the first code from the second code; generating a feature vector file based on metadata of the reproduced first code and a test history, wherein the test history is associated with the first code; generating, based on the feature vector file, a machine learning model; generating, based on the machine learning model, a plurality of failure prediction scores for each of test code of a plurality of test codes; selecting, based on the plurality of failure prediction scores, a test code of the plurality of test codes; executing the selected test code with respect to the reproduced first code; and generating a result of analysis based on the executed selected test code.
12. A non-transitory computer-readable medium having storage thereon, computer-executable instructions which, when executed by a computer, cause the computer to execute operations, the operations comprising: acquiring difference information that is associated with a difference between a first code and a second code, wherein a first repository stores the first code, and a second repository stores the second code in correspondence with the first code; and reproducing, based on the difference information, the first code from the second code; generating a feature vector file based on metadata of the reproduced first code and test history, wherein the test history is associated with the first code; generating, based on the feature vector file, a machine learning model; generating, based on the machine learning model, a plurality of failure prediction scores for each test code of a plurality of test codes; selecting, based on the plurality of failure prediction scores, a test code of the plurality of test codes; executing the selected test code with respect to the reproduced first code; and generating a result of analysis based on the executed selected test code.
13. An information processing system, comprising: a commit information acquisition device that includes a first central processing unit (CPU) configured to acquire commit information that is associated with a commit, wherein the commit represents a difference between a latest first code developed in a local environment and a second code that is pushed to a remote repository; and a code analysis device that includes a second CPU configured to: reproduce, based on the commit information, the first code from the second code; generate a feature vector file based on metadata of the reproduced first code and test history, wherein the test history is associated with the first code; generate, based on the feature vector file, machine learning model; generate, based on the machine learning model, a plurality of failure prediction scores for each test code of a plurality of test codes; select, based on the plurality of failure prediction scores, a test code of the plurality of test codes; execute the selected test code with respect to the reproduced first code; and generate a result of analysis based on the executed selected test code.
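The test selection steps recited in the claims (learn from test history, score each test code for likely failure, then apply rules to the scores) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the smoothed per-file failure rate below is a deliberately simple stand-in for the machine learning model, the history records stand in for the feature vector file, and all names (`train`, `failure_scores`, `select_tests`, `rate_by_branch`) are hypothetical. Two rule types from the claims are shown: re-running previously failed tests and adjusting the selection rate by branch name.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# One history record: (files changed by the commit, test name, test failed?)
Record = Tuple[Set[str], str, bool]
Model = Dict[Tuple[str, str], Tuple[int, int]]  # (file, test) -> (failures, runs)


def train(history: Iterable[Record]) -> Model:
    """Count failures and runs for every (changed file, test) pair."""
    counts = defaultdict(lambda: [0, 0])
    for changed_files, test, failed in history:
        for path in changed_files:
            counts[(path, test)][0] += int(failed)
            counts[(path, test)][1] += 1
    return {key: (f, n) for key, (f, n) in counts.items()}


def failure_scores(model: Model, changed_files: Set[str],
                   tests: List[str]) -> Dict[str, float]:
    """Score each test by its highest smoothed failure rate over changed files."""
    scores = {}
    for test in tests:
        best = 0.0
        for path in changed_files:
            failures, runs = model.get((path, test), (0, 0))
            # Laplace smoothing: an unseen pair scores 0.5 rather than 0
            best = max(best, (failures + 1) / (runs + 2))
        scores[test] = best
    return scores


def select_tests(scores: Dict[str, float], branch: str,
                 previously_failed: Set[str],
                 rate_by_branch: Dict[str, float],
                 default_rate: float = 0.5) -> List[str]:
    """Keep the top-scoring share of tests, then apply the re-run rule."""
    rate = rate_by_branch.get(branch, default_rate)  # rule: rate per branch
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    selected = ranked[:math.ceil(rate * len(ranked))]
    for test in sorted(previously_failed):  # rule: re-run previous failures
        if test in scores and test not in selected:
            selected.append(test)
    return selected


history = [
    ({"a.py"}, "test_a", True), ({"a.py"}, "test_a", True),
    ({"a.py"}, "test_b", False), ({"a.py"}, "test_b", False),
    ({"b.py"}, "test_b", True), ({"a.py"}, "test_c", False),
]
model = train(history)
scores = failure_scores(model, {"a.py"}, ["test_a", "test_b", "test_c"])
rates = {"main": 1.0}  # hypothetical: run every test on the main branch
on_feature = select_tests(scores, "feature/x", set(), rates)
on_feature_rerun = select_tests(scores, "feature/x", {"test_b"}, rates)
on_main = select_tests(scores, "main", set(), rates)
```

Keeping the rules outside the scoring function means a rule set can be changed per branch or per project without retraining the model, which matches the separation between prediction and rule application in claims 8 to 10.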

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a U.S. National Phase of International Patent Application No. PCT/JP2021/048935 filed on Dec. 28, 2021, which claims priority benefit of Japanese Patent Application No. JP 2021-027165 filed in the Japan Patent Office on Feb. 24, 2021. Each of the above-referenced applications is hereby incorporated herein by reference in its entirety.

TECHNICAL FIELD

The technology disclosed in this document (hereinafter referred to as the “present disclosure”) relates to an information processing device, an information processing method, a computer program, and an information processing system that perform processing regarding a software code analysis.

BACKGROUND ART

With the development of ICT (Information and Communication Technology), the need for software continues to grow, and a large amount of software is developed every day. In this document, the terms “software” and “computer program” are used almost synonymously. Unless otherwise stated, both refer to the form written by a developer, that is, source code.

Code analysis is typically performed on developed software. The code analysis is performed to analyze a program control flow and structure, class inheritance, and relations between functions and variables, and to verify function vulnerability (refer, for example, to PTL 1). Code analysis methods can roughly be classified into two types, that is, a method of performing code analysis with a local PC (Personal Computer) used by a software developer and a method of performing code analysis in a cloud environment. The former method, performing code analysis with a local PC, is able to perform code analysis within a short period of time each time a commit is made in a local environment (that is, each time the contents of edited or updated files are recorded), and to report the result of the code analysis to a developer at an early stage.
Recently, local PCs have come to provide analysis functions similar to those provided by the cloud environment, but they suffer from problems such as many functional limitations and limited computational resources. Meanwhile, the latter method of performing code analysis in the cloud environment is able to increase computation speed by performing, for example, parallel processing through the effective use of abundant computational resources. For example, CircleCI not only provides the service of performing code analysis in a cloud, but also provides a tool for performing code analysis with a local PC. Further, GitHub, Inc. provides the “GitHub Actions” service of executing CI and code analysis in the cloud. Moreover, a tool named “Act” for executing “GitHub Actions” in the local environment is provided by a third-party developer.

CITATION LIST

Patent Literature

PTL 1: Japanese Patent Laid-open No. 2019-168753

Non Patent Literature

NPL 1: Mateusz Machalica, Alex Samylkin, Meredith Porth, Satish Chandra, “Predictive Test Selection”
NPL 2: Claire Leong, Abhayendra Singh, Mike Papadakis, Yves Le Traon, John Micco, “Assessing Transition-based Test Selection Algorithms at Google”

SUMMARY

Technical Problem

An object of the present disclosure is to provide an information processing device, an information processing method, a computer program, and an information processing system that perform processing regarding the code analysis of software that is in an intermediate state before being uploaded to the cloud environment.

Solution to Problem

In order to address the above-described problem, according to a first aspect of the present disclosure, there is provided an information processing device including a difference information acquisition section and a code analysis section.
The difference information acquisition section acquires difference information regarding the difference between a code stored in a first repository and a code stored in a second repository in correspondence with the code stored in the first repository. The code analysis section reproduces the code stored in the first repository from the code stored in the second repository, with reference to the difference information, and performs code analysis on the reproduced code. The difference information acquisition section acquires the difference information with reference to a list of commits that are not yet pushed to the second repository. More specifically, in a case where there is a branch in a code history, the difference information acquisition section acquires the difference information with reference to a list of commits necessary for reproducing, from the branch, the latest form of the code stored in the first repository. Subsequently, the code analysis section reproduces the latest code stored in the first repository, with reference to information regarding individual commits included in the list of commits, and performs code analysis on the reproduced latest code. Further, the code analysis section conducts a test by using