CN-116243962-B - Version identification method and device of open source software, electronic equipment and storage medium
Abstract
The invention provides a version identification method for open source software, an electronic device, and a storage medium. The method obtains a target binary file of the open source software to be identified, extracts target version features of the open source software from the target binary file, performs coarse version screening according to the target version features to obtain a candidate version set, and compares the candidate versions in the candidate version set with the target binary file to obtain the target version of the open source software to be identified. By comparing version difference features, the invention avoids the influence of redundant noise features on version identification and can identify fine-grained differences between adjacent versions. Performing coarse screening first and fine comparison afterwards balances identification accuracy and efficiency.
Inventors
- LI HONG
- DONG CHAOPENG
- WANG YONGPAN
- ZHU HONGSONG
- SUN LIMIN
Assignees
- Institute of Information Engineering, Chinese Academy of Sciences (中国科学院信息工程研究所)
Dates
- Publication Date
- 20260508
- Application Date
- 20230117
Claims (8)
- 1. A version identification method for open source software, comprising: acquiring a target binary file of open source software to be identified; extracting target version features of the open source software to be identified from the target binary file, wherein the target version features comprise software-level features and version-level features, and the software-level features comprise character strings and function names; performing coarse version screening according to the target version features to obtain a candidate version set; and comparing the candidate versions in the candidate version set with the target binary file to obtain a target version of the open source software to be identified; wherein the performing coarse version screening according to the target version features to obtain a candidate version set comprises: matching the target version features against an open source software code library to determine a version set containing the target version features; screening the version set according to the target version features to obtain a first candidate version set; obtaining, according to the open source software code library, a version interval table among all versions of the open source software; obtaining a target feature distance of the open source software to be identified according to the target version features; obtaining a second candidate version set according to the target feature distance; and obtaining the candidate version set according to the intersection of the first candidate version set and the second candidate version set; and wherein the matching the target version features against the open source software code library to determine a version set containing the target version features comprises: analyzing version-sensitive functions in the open source software code library; and determining the version set containing the target version features according to the version-sensitive functions contained in the target version features.
- 2. The version identification method for open source software according to claim 1, wherein the screening the version set according to the target version features to obtain a first candidate version set comprises: constructing a mapping relation between the target version features and the version set; and screening the version set according to the target version features and the mapping relation to obtain the first candidate version set.
- 3. The version identification method for open source software according to claim 2, wherein the obtaining a version interval table among all versions of the open source software according to the open source software code library comprises: acquiring feature information of each version from the open source software code library; obtaining a feature distance between any two versions according to the feature information; and obtaining the version interval table according to the feature distances.
- 4. The version identification method for open source software according to claim 3, wherein the feature distance includes a first feature distance and a second feature distance, and the obtaining the target feature distance of the open source software to be identified according to the target version features comprises: calculating, according to the target version features, the distance from the target binary file to a first preset version to obtain the first feature distance; calculating, according to the target version features, the distance from the target binary file to a second preset version to obtain the second feature distance; and obtaining the target feature distance of the open source software to be identified according to the first feature distance and the second feature distance.
- 5. The version identification method for open source software according to claim 1, wherein the comparing the candidate versions in the candidate version set with the target binary file to obtain the target version of the open source software to be identified comprises: comparing the candidate versions with the target binary file to obtain a feature distance from each candidate version to the target binary file; sorting the candidate versions by their feature distance to the target binary file; and taking the candidate version with the minimum feature distance as the target version of the open source software to be identified.
- 6. A version identification device for open source software, comprising: a file acquisition unit, used for acquiring a target binary file of open source software to be identified; a feature extraction unit, used for extracting target version features of the open source software to be identified from the target binary file, wherein the target version features comprise software-level features and version-level features, and the software-level features comprise character strings and function names; a version coarse screening unit, used for performing coarse version screening according to the target version features to obtain a candidate version set; and a version selection unit, used for comparing the candidate versions in the candidate version set with the target binary file to obtain the target version of the open source software to be identified; wherein the performing coarse version screening according to the target version features to obtain a candidate version set comprises: matching the target version features against an open source software code library to determine a version set containing the target version features; screening the version set according to the target version features to obtain a first candidate version set; obtaining, according to the open source software code library, a version interval table among all versions of the open source software; obtaining a target feature distance of the open source software to be identified according to the target version features; obtaining a second candidate version set according to the target feature distance; and obtaining the candidate version set according to the intersection of the first candidate version set and the second candidate version set; and wherein the matching the target version features against the open source software code library to determine a version set containing the target version features comprises: analyzing version-sensitive functions in the open source software code library; and determining the version set containing the target version features according to the version-sensitive functions contained in the target version features.
- 7. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the version identification method for open source software according to any one of claims 1 to 5.
- 8. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the version identification method for open source software according to any one of claims 1 to 5.
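The coarse screening recited in claims 1 to 4 can be illustrated with a minimal Python sketch. It is not the patented implementation. The code library contents, the symmetric-difference distance metric, the use of the oldest and newest versions as the two preset versions, and the tolerance parameter are all assumptions made for illustration; the claims do not specify them.

```python
from itertools import combinations

# Hypothetical code library: version -> features (strings and function names).
CODE_LIBRARY = {
    "1.0": {"init", "parse", "v1_only"},
    "1.1": {"init", "parse", "parse_ex"},
    "1.2": {"init", "parse", "parse_ex", "flush"},
    "2.0": {"init", "parse_ex", "flush", "stream_open"},
}


def feature_distance(a, b):
    """Assumed metric: number of features present in exactly one of the sets."""
    return len(a ^ b)


def feature_to_versions(library):
    """Map each version-sensitive feature to the versions that contain it.

    A feature found in every version cannot separate versions, so it is dropped.
    """
    mapping = {}
    for version, features in library.items():
        for feature in features:
            mapping.setdefault(feature, set()).add(version)
    return {f: vs for f, vs in mapping.items() if len(vs) < len(library)}


def first_candidate_set(target, library):
    """Keep versions containing every version-sensitive feature of the target."""
    candidates = set(library)
    mapping = feature_to_versions(library)
    for feature in target:
        if feature in mapping:
            candidates &= mapping[feature]
    return candidates


def version_interval_table(library):
    """Pairwise feature distance between all versions in the library."""
    table = {(v, v): 0 for v in library}
    for a, b in combinations(library, 2):
        table[(a, b)] = table[(b, a)] = feature_distance(library[a], library[b])
    return table


def second_candidate_set(target, library, preset_a, preset_b, tolerance):
    """Keep versions whose table distances to both preset versions are within
    `tolerance` of the target's distances to those presets (assumed rule)."""
    table = version_interval_table(library)
    d_a = feature_distance(target, library[preset_a])
    d_b = feature_distance(target, library[preset_b])
    return {
        v for v in library
        if abs(table[(v, preset_a)] - d_a) <= tolerance
        and abs(table[(v, preset_b)] - d_b) <= tolerance
    }


def coarse_screen(target, library, preset_a, preset_b, tolerance=1):
    """Candidate set = intersection of the first and second candidate sets."""
    first = first_candidate_set(target, library)
    second = second_candidate_set(target, library, preset_a, preset_b, tolerance)
    return first & second
```

Calling `coarse_screen({"init", "parse", "parse_ex"}, CODE_LIBRARY, "1.0", "2.0")` narrows four versions down to a small candidate set; a tighter `tolerance` narrows it further at the risk of excluding the true version when the binary carries noisy features.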
Description
Version identification method and device of open source software, electronic equipment and storage medium
Technical Field
The present invention relates to the field of software development technologies, and in particular to a version identification method for open source software, an electronic device, and a storage medium.
Background
To reduce development cost and speed up software iteration, developers of commercial software commonly use existing open source software (OSS) to implement related functions. While OSS makes development more convenient, it also introduces its own vulnerabilities into the commercial software, which indirectly exposes users to attackers. Because users lack the source code of the commercial software, they cannot confirm whether the software they use is affected by indirectly introduced OSS vulnerabilities. Detecting the version of the OSS in software therefore helps users confirm whether they are affected by known vulnerabilities, so that the vulnerabilities can be repaired as soon as possible and economic losses from exploitation by attackers are avoided.
In the prior art, tools such as VES, OSSPolice and LibDB are generally used to detect OSS. VES does not need to run the software dynamically and supports large-scale static version analysis, but it relies heavily on version strings to identify versions. In actual development the OSS may contain no version string or version printing function, in which case the method fails.
OSSPolice, on the one hand, relies on version-related strings to distinguish versions, so it cannot handle OSS that lacks version strings. On the other hand, because OSS components reuse one another, the extracted strings may contain other version strings, which confuses identification of the true OSS version and reduces the accuracy of version identification. LibDB introduces too many redundant noise features when identifying versions, so fine-grained version changes are masked and cannot be distinguished. In summary, the prior art cannot accurately and effectively identify the version of open source software.
Disclosure of Invention
The invention provides a version identification method for open source software, an electronic device, and a storage medium, which address the defect that the prior art cannot accurately identify the version of open source software.
The invention provides a version identification method for open source software, comprising: acquiring a target binary file of open source software to be identified; extracting target version features of the open source software to be identified from the target binary file; performing coarse version screening according to the target version features to obtain a candidate version set; and comparing the candidate versions in the candidate version set with the target binary file to obtain the target version of the open source software to be identified.
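The extraction and comparison steps of the method might look like the following Python sketch. It rests on several assumptions that the text does not state: printable ASCII runs stand in for the string features, the distance is the size of the symmetric difference between feature sets, and the `candidates` dictionary is a hypothetical output of the coarse screening step. Recovering function names would require parsing the binary's symbol tables and is omitted here.

```python
import re


def extract_strings(blob: bytes, min_len: int = 4) -> set[str]:
    """Collect runs of printable ASCII at least `min_len` bytes long,
    similar in spirit to the Unix `strings` utility."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return {match.decode("ascii") for match in re.findall(pattern, blob)}


def rank_candidates(target: set[str], candidates: dict[str, set[str]]):
    """Sort candidate versions by assumed feature distance to the target.

    Ties are broken by version name so the ordering is deterministic.
    """
    distances = {v: len(target ^ feats) for v, feats in candidates.items()}
    return sorted(distances.items(), key=lambda item: (item[1], item[0]))


def identify_version(blob: bytes, candidates: dict[str, set[str]]) -> str:
    """Return the candidate version with the minimum feature distance."""
    target = extract_strings(blob)
    ranked = rank_candidates(target, candidates)
    return ranked[0][0]


# Hypothetical target binary content and coarse-screened candidate set.
SAMPLE_BLOB = b"\x00\x01parse_ex\x00init\x00\x7fparse\x00ab\x00flush\x00"
SAMPLE_CANDIDATES = {
    "1.1": {"init", "parse", "parse_ex"},
    "1.2": {"init", "parse", "parse_ex", "flush"},
}
```

Because only the coarse-screened candidates are compared in full, the more expensive distance computation runs over a handful of versions rather than the whole code library.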
According to the version identification method for open source software provided by the invention, performing coarse version screening according to the target version features to obtain a candidate version set comprises: matching the target version features against an open source software code library to determine a version set containing the target version features; screening the version set according to the target version features to obtain a first candidate version set; obtaining, according to the open source software code library, a version interval table among all versions of the open source software; obtaining a target feature distance of the open source software to be identified according to the target version features; obtaining a second candidate version set according to the target feature distance; and obtaining the candidate version set according to the first candidate version set and the second candidate version set.
According to the version identification method for open source software provided by the invention, matching the target version features against the open source software code library to determine a version set containing the target version features comprises: analyzing the version-sensitive functions in the open source software code library; and determining the version set containing the target version features according to the version-sensitive functions contained in the target version features.
According to the version identification method for open source software provided by the invention, screening the version set according to the target version features to obtain a first candidate version set comprises: Constructin