CN-122020645-A - API defect detection method, device, electronic equipment and storage medium


Abstract

The application relates to an API defect detection method, device, electronic equipment and storage medium using outlier deviation. The method comprises: periodically performing cluster analysis on the feature vectors corresponding to the HTTP request records (RR) under a target API path, and identifying the mainstream normal response model and the outlier feature vectors of the API in the current period; for each outlier feature vector, calculating the deviation amount D between that vector and the center feature vector of the mainstream normal response model; and, if the deviation amount D corresponding to any outlier feature vector is not greater than a preset defect judgment threshold, determining that the original RR corresponding to that outlier feature vector is a request that successfully triggered the API defect, and generating defect alarm information. Because a normal response behavior model of the API is established through cluster analysis and the degree to which an abnormal response deviates from that model is evaluated quantitatively, malicious requests that actually trigger a security defect can be identified accurately, and the false alarm rate is significantly reduced.
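The decision step summarized above can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that clustering has already produced one label per feature vector, that the label -1 marks an outlier, that the center feature vector is the arithmetic mean of the largest cluster, and that the deviation amount D is the cosine similarity named in claim 5 (so a smaller D means a larger deviation). The function names `cosine_similarity` and `detect_defects` are hypothetical.

```python
import math
from collections import Counter

OUTLIER = -1  # assumed label convention for outlier feature vectors


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as maximally dissimilar
    return dot / (norm_u * norm_v)


def detect_defects(vectors, labels, threshold):
    """Return (index, D) for every outlier whose D is not greater than threshold.

    vectors: list of numeric feature vectors, one per HTTP request record
    labels:  cluster label per vector, OUTLIER for outlier feature vectors
    """
    sizes = Counter(label for label in labels if label != OUTLIER)
    if not sizes:
        return []  # no cluster at all, so no mainstream model in this period
    main_label = sizes.most_common(1)[0][0]  # largest cluster = mainstream model
    members = [v for v, label in zip(vectors, labels) if label == main_label]
    # Assumption: the center feature vector is the per-dimension mean.
    center = [sum(column) / len(members) for column in zip(*members)]

    alerts = []
    for index, (vector, label) in enumerate(zip(vectors, labels)):
        if label != OUTLIER:
            continue
        d = cosine_similarity(vector, center)
        if d <= threshold:  # "not greater than" the defect judgment threshold
            alerts.append((index, d))
    return alerts


if __name__ == "__main__":
    sample_vectors = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [1.0, 0.05]]
    sample_labels = [0, 0, 0, OUTLIER, OUTLIER]
    print(detect_defects(sample_vectors, sample_labels, threshold=0.5))
```

In this toy input, the outlier that points in nearly the same direction as the mainstream center is not reported, while the one that is almost orthogonal to it is. Real feature vectors would need the scaling and encoding described in the specification, which this sketch does not attempt.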

Inventors

YANG SHENGHUA
CAO LANG
SUN HAOXIANG

Assignees

杭州迪普科技股份有限公司

Dates

Publication Date: 20260512
Application Date: 20260121

Claims (10)

1. An API defect detection method using outlier deviation, the method comprising: periodically performing cluster analysis on feature vectors corresponding to HTTP request records (RR) under a target API path, and identifying a mainstream normal response model and outlier feature vectors of the API in the current period; for each outlier feature vector, calculating the deviation amount between that outlier feature vector and the center feature vector of the mainstream normal response model; and if the deviation amount D corresponding to any outlier feature vector is not greater than a preset defect judgment threshold, determining that the original RR corresponding to that outlier feature vector is a request that successfully triggered the API defect, and generating defect alarm information.
2. The method according to claim 1, characterized in that the method further comprises the preliminary steps of: collecting mirrored network traffic and restoring it into HTTP request records RR, wherein each RR comprises a request part and a response part; determining whether the request part of the RR contains preset API security defect feature keywords; and if so, extracting the feature vector from the response part of the RR.
3. The method of claim 1, wherein the mainstream normal response model is determined by: Among all the clusters generated by the cluster analysis, the cluster with the largest number of feature vectors is defined as a mainstream normal response model.
4. The method of claim 1, wherein the feature vector is a five-dimensional vector whose dimensions comprise a content length CL, a content type CT, a response header count HC, a response header fingerprint HF, and a response body fingerprint BF.
5. The method of claim 4, wherein the deviation is a cosine similarity value obtained by calculating a cosine similarity between the outlier feature vector and the center feature vector.
6. The method of claim 1, wherein the cluster analysis is performed using an optimized DBSCAN algorithm, wherein optimizing the DBSCAN algorithm comprises: during clustering, retaining as noise points the feature vectors that do not meet the density requirement of a core cluster, and identifying the noise points as the outlier feature vectors.
7. The method of claim 1, wherein the periodic cluster analysis is performed based on a rolling time window; the mainstream normal response model and the outlier feature vectors are obtained by analyzing the feature vectors corresponding to the RRs collected in the current time window; and the length of the time window and the clustering period are independently configurable parameters.
8. An API defect detection apparatus using outlier deviation, the apparatus comprising: a cluster analysis module configured to periodically perform cluster analysis on feature vectors corresponding to HTTP request records (RR) under a target API path, and to identify a mainstream normal response model and outlier feature vectors of the API in the current period; a deviation calculation module configured to calculate, for each of the outlier feature vectors, a deviation amount between the outlier feature vector and the center feature vector of the mainstream normal response model; and a defect alarm module configured to determine, if the deviation amount D corresponding to any outlier feature vector is not greater than a preset defect judgment threshold, that the original RR corresponding to that outlier feature vector is a request that successfully triggered the API defect, and to generate defect alarm information.
9. An electronic device, comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the electronic device to implement the method of any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor of a computer, cause the computer to perform the method of any of claims 1 to 7.
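
The clustering described in claims 3 and 6 can be sketched as follows. This is a plain, textbook-style DBSCAN written in pure Python for illustration only; the claims do not detail the "optimized" variant beyond retaining noise points, so nothing here should be read as the claimed algorithm itself. The Euclidean distance metric, the parameter names `eps` and `min_pts`, and the helper `split_mainstream_and_outliers` are assumptions introduced for this example.

```python
import math
from collections import Counter

NOISE = -1  # label for points that do not meet the density requirement


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # kept as a noise point, never discarded
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster_id += 1
    return labels


def split_mainstream_and_outliers(points, eps, min_pts):
    """Largest cluster is the mainstream model; noise points are the outliers."""
    labels = dbscan(points, eps, min_pts)
    sizes = Counter(label for label in labels if label != NOISE)
    main_label = sizes.most_common(1)[0][0] if sizes else None
    mainstream = [p for p, label in zip(points, labels) if label == main_label]
    outliers = [p for p, label in zip(points, labels) if label == NOISE]
    return mainstream, outliers


if __name__ == "__main__":
    window = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1), (5, 5), (5, 5.1), (9, 9)]
    mainstream, outliers = split_mainstream_and_outliers(window, eps=0.3, min_pts=3)
    print(len(mainstream), outliers)
```

For the rolling time window of claim 7, a caller would rerun `split_mainstream_and_outliers` on the feature vectors collected in the current window each time the clustering period elapses. Points in smaller, non-mainstream clusters are neither mainstream nor outliers in this sketch.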

Description

API defect detection method, device, electronic equipment and storage medium

Technical Field

The disclosure relates to the technical field of communication, and in particular to an API defect detection method and device using outlier deviation, an electronic device and a storage medium.

Background

With the rapid development of internet technology, web applications are commonly developed and deployed with an architecture that separates the front end from the back end. In this mode, data interaction between the front end and the back end is accomplished primarily through application programming interface (API) calls. Because the API serves as the data transmission channel, the security and robustness of its design directly affect the data security of the whole application system. However, owing to a lack of security awareness or to oversights during development, APIs often carry security defects from the design stage, such as broken object level authorization, flawed authentication mechanisms, and excessive data exposure. Once exploited by an attacker, these defects readily lead to serious security incidents such as unauthorized data access and leakage of sensitive information.

Currently, the mainstream approach to API defect detection in the industry follows a "traffic restoration and feature matching" technical route. Specifically, it captures network packets through traffic mirroring and then restores them into complete HTTP transactions by protocol analysis, thereby extracting how the API is being called. On this basis, the detection system matches the request messages sent by the API caller against a pre-written feature rule base. These feature rules include both generic vulnerability pattern features (e.g., SQL injection, path traversal) and feature fingerprints for specific business systems that have been publicly disclosed as having particular defects. When the content of a request in the traffic matches a rule in the feature library, the system determines that the API may have the corresponding defect and issues an alarm.

However, the above prior art has the following significant limitations:

First, the false alarm rate is high and the accuracy is insufficient. The core logic of the existing method is static feature matching on the request message. Such matching can only establish that a request "looks like" an attack; it cannot verify whether the request actually triggers the defective logic of the back-end API and causes an abnormal server response. Many requests that contain common attack features are handled normally because the back end has a robust verification mechanism, and their responses are no different from those of ordinary requests. Judging such requests to be defects produces a large number of false alarms, which seriously undermines the reliability of the detection results and the efficiency of operations and maintenance.

Second, maintainability is poor and extension is difficult. Writing and maintaining defect feature rules is a highly specialized and complex task that requires security experts with an in-depth understanding of various vulnerabilities and of the specific business systems. As attack techniques keep evolving and business systems keep iterating, the feature rule base must be updated frequently, and the maintenance cost is high. In addition, feature rules written for a particular system are often not general and cannot be applied directly to other business systems.

Third, the method lags behind threats and cannot detect defects in unknown systems. It relies heavily on known, publicly disclosed vulnerability features and system fingerprints. For API security defects that have not been disclosed or widely recognized, and for emerging business systems that employ custom protocols or logic, the method essentially loses its detection capability because the corresponding feature rules are missing. This "patch after the fact" mode of defense leaves the system passive and unprotected in the face of new attacks or undisclosed zero-day vulnerabilities.

In summary, the existing API defect detection method based on traffic restoration and feature matching has significant shortcomings in accuracy, maintainability and forward-looking capability. A new technical scheme is therefore urgently needed that reduces false alarms, reduces dependence on manually written features, and can detect unknown or undisclosed API defects, so that API security is ensured more flexibly and accurately.

Disclosure of Invention

The embodiment of the disclosure provides an API defect detection method, device, electronic equipment and storage medium using outlier deviation, w