CN-122027259-A - Multi-source log-based anomaly scoring method, device, equipment and storage medium
Abstract
Embodiments of the application provide an anomaly scoring method, apparatus, device, and storage medium based on multi-source logs. The method comprises: obtaining a plurality of alarm logs generated by a plurality of security devices and the internet protocol address associated with each alarm log, and determining a plurality of objects to be evaluated; for each object to be evaluated, determining a plurality of fields to be evaluated and the field feature information of each field; for each object to be evaluated, matching an anomaly score to the field feature information of each field and constructing an initial score sequence; for each object to be evaluated, calculating a feature distance between the initial score sequence and an object cluster center, and concatenating the feature distance with the initial score sequence to obtain a target score sequence; and inputting the target score sequence of each object to be evaluated into a target scoring model to obtain an anomaly scoring result. This can improve the comprehensiveness and accuracy of anomaly detection based on multi-source logs.
Inventors
- FENG WENYING
- Zhao Angxiao
- LI RUONAN
- Yao Aiting
- LUO CUI
- GU ZHAOQUAN
- WANG XINGANG
Assignees
- Peng Cheng Laboratory (鹏城实验室)
Dates
- Publication Date
- 20260512
- Application Date
- 20260129
Claims (10)
- 1. An anomaly scoring method based on a multi-source log, the method comprising: acquiring a plurality of alarm logs generated by a plurality of security devices and an internet protocol address associated with each alarm log, and determining a plurality of objects to be evaluated based on the alarm logs and the internet protocol addresses; for each object to be evaluated, determining a plurality of corresponding fields to be evaluated, and extracting field feature information corresponding to each field to be evaluated from at least one associated alarm log; for each object to be evaluated, determining a corresponding score evaluation rule table, and matching, according to the score evaluation rule table, an anomaly score to the field feature information corresponding to each field to be evaluated; for each object to be evaluated, combining the anomaly scores corresponding to the fields to be evaluated to construct an initial score sequence corresponding to the object to be evaluated; for each object to be evaluated, calculating a feature distance between the initial score sequence and a preset object cluster center, and concatenating the feature distance with the initial score sequence to obtain a target score sequence; and inputting the target score sequence corresponding to each object to be evaluated into a pre-trained target scoring model to obtain a corresponding anomaly scoring result.
- 2. The multi-source log-based anomaly scoring method of claim 1, wherein the plurality of fields to be evaluated comprises at least one of a device distribution field, a behavior alarm field, an attack type count field, a behavior type field, and a geographic location field, and wherein the field feature information corresponding to the plurality of fields to be evaluated comprises at least one of device distribution information, behavior alarm total count information, attack type count information, behavior type information, and geographic location information; the determining, for each object to be evaluated, a plurality of corresponding fields to be evaluated, and extracting field feature information corresponding to each field to be evaluated from at least one associated alarm log, comprises: when the objects to be evaluated are at least one internet protocol address corresponding to the plurality of alarm logs, determining, for each internet protocol address, at least one associated target alarm log from the plurality of alarm logs; and for each internet protocol address, based on the at least one target alarm log: determining the distribution of the internet protocol address across the plurality of security devices to obtain the device distribution information corresponding to the device distribution field; determining the total number of behavior alarms in the at least one target alarm log to obtain the behavior alarm total count information corresponding to the behavior alarm field; determining the number of attack types associated with the internet protocol address to obtain the attack type count information corresponding to the attack type count field; determining the behavior type information corresponding to the behavior type field; and determining the geographic location corresponding to the internet protocol address to obtain the geographic location information corresponding to the geographic location field.
- 3. The multi-source log-based anomaly scoring method of claim 1, further comprising, prior to calculating, for each object to be evaluated, the feature distance between the initial score sequence and the preset object cluster center: when the object to be evaluated is an alarm log corresponding to a security device, acquiring, for the corresponding security device, a plurality of sample alarm logs and a sample first initial score sequence corresponding to each sample alarm log, and obtaining an object cluster center corresponding to the security device based on the mean of the plurality of sample first initial score sequences, wherein the object cluster center corresponding to each security device is associated with any alarm log generated by that security device; and when the object to be evaluated is an internet protocol address, acquiring a plurality of sample second initial score sequences corresponding to the internet protocol address, and obtaining an object cluster center corresponding to the internet protocol address based on the mean of the plurality of sample second initial score sequences.
- 4. The multi-source log-based anomaly scoring method of claim 1, further comprising, after inputting the target score sequence corresponding to each object to be evaluated into the pre-trained target scoring model to obtain the corresponding anomaly scoring result: when the object to be evaluated is an internet protocol address, acquiring a preset access configuration file, a vulnerability database, and a compromise state of the internet protocol address; traversing the access configuration file based on the internet protocol address to obtain a first traversal result, and traversing the vulnerability database to obtain a second traversal result; calculating a false alarm removal coefficient of the internet protocol address based on the first traversal result, the second traversal result, and the compromise state of the internet protocol address; and correcting the anomaly scoring result of the internet protocol address based on the false alarm removal coefficient to obtain a target anomaly scoring result.
- 5. The multi-source log-based anomaly scoring method of claim 1, wherein the target scoring model is trained by: determining a plurality of preset evaluation objects of a plurality of different scene types, acquiring at least one corresponding sample alarm log for each preset evaluation object, and determining a sample initial score sequence corresponding to each preset evaluation object based on the at least one sample alarm log; calculating a sample feature distance between the sample initial score sequence of each preset evaluation object and a preset object cluster center, and concatenating the sample feature distance with the sample initial score sequence to obtain a sample target score sequence; inputting the sample target score sequence into a preset scoring model to obtain a predicted scoring result corresponding to the preset evaluation object; and acquiring a sample scoring result corresponding to each preset evaluation object, and adjusting parameters of the preset scoring model based on the difference between the predicted scoring result and the sample scoring result of each preset evaluation object to obtain the target scoring model.
- 6. The multi-source log-based anomaly scoring method of claim 5, further comprising, after adjusting the parameters of the preset scoring model to obtain the target scoring model: determining, for each preset evaluation object, a corresponding scene type and a plurality of fields to be evaluated associated with each scene type; performing regression analysis on the target scoring model under each scene type to obtain a corrected weight parameter for each field to be evaluated associated with each scene type; and adjusting the parameters of the target scoring model based on the corrected weight parameter of each field to be evaluated associated with each scene type to obtain an adjusted target scoring model.
- 7. The multi-source log-based anomaly scoring method of claim 6, wherein performing regression analysis on the target scoring model under each scene type to obtain a corrected weight parameter for each field to be evaluated associated with each scene type comprises: for each scene type corresponding to the target scoring model, acquiring a significance probability value and a variance inflation factor of each field to be evaluated; determining a corresponding first weight adjustment coefficient according to the magnitude relation between the significance probability value of each field to be evaluated and a preset significance threshold; determining a corresponding second weight adjustment coefficient according to the magnitude relation between the variance inflation factor of each field to be evaluated and a preset collinearity threshold; calculating a comprehensive weight adjustment coefficient of each field to be evaluated based on the product of the first weight adjustment coefficient and the second weight adjustment coefficient of that field; and acquiring the original weight parameter corresponding to each field to be evaluated of each scene type in the target scoring model, and adjusting the original weight parameter of each field to be evaluated using its comprehensive weight adjustment coefficient to obtain the corrected weight parameter of that field.
- 8. An anomaly scoring device based on a multi-source log, the device comprising: an acquisition module, configured to acquire a plurality of alarm logs generated by a plurality of security devices and an internet protocol address associated with each alarm log, and to determine a plurality of objects to be evaluated based on the alarm logs and the internet protocol addresses; a determining module, configured to determine, for each object to be evaluated, a plurality of corresponding fields to be evaluated, and to extract field feature information corresponding to each field to be evaluated from at least one associated alarm log; a matching module, configured to determine, for each object to be evaluated, a corresponding score evaluation rule table, and to match, according to the score evaluation rule table, an anomaly score to the field feature information corresponding to each field to be evaluated; a construction module, configured to combine, for each object to be evaluated, the anomaly scores corresponding to the fields to be evaluated to construct an initial score sequence corresponding to the object to be evaluated; a splicing module, configured to calculate, for each object to be evaluated, a feature distance between the initial score sequence and a preset object cluster center, and to concatenate the feature distance with the initial score sequence to obtain a target score sequence; and an input module, configured to input the target score sequence corresponding to each object to be evaluated into a pre-trained target scoring model to obtain a corresponding anomaly scoring result.
- 9. A computer device, comprising a memory storing a computer program and a processor, wherein the processor implements the multi-source log-based anomaly scoring method according to any one of claims 1 to 7 when executing the computer program.
- 10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the multi-source log based anomaly scoring method of any one of claims 1 to 7.
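The weight correction recited in claim 7 can be illustrated with a minimal Python sketch. The claim specifies only that the first coefficient depends on how the significance probability value compares with a preset significance threshold, that the second depends on how the variance inflation factor (VIF) compares with a preset collinearity threshold, and that the two are multiplied. Everything else here is an assumption made for illustration: the thresholds 0.05 and 10.0, the damping value 0.5, the field names, and the choice to apply the combined coefficient by multiplication.

```python
from typing import Dict, Tuple

# All numeric values below are illustrative assumptions, not values from the application.
P_THRESHOLD = 0.05     # assumed preset significance threshold
VIF_THRESHOLD = 10.0   # assumed preset collinearity threshold


def first_coefficient(p_value: float) -> float:
    """Keep the weight of a significant field; damp a non-significant one (assumed 0.5)."""
    return 1.0 if p_value < P_THRESHOLD else 0.5


def second_coefficient(vif: float) -> float:
    """Keep the weight of a non-collinear field; damp a collinear one (assumed 0.5)."""
    return 1.0 if vif < VIF_THRESHOLD else 0.5


def corrected_weights(
    original: Dict[str, float],
    stats: Dict[str, Tuple[float, float]],
) -> Dict[str, float]:
    """Scale each original weight by the product of its two coefficients.

    `stats` maps a field name to (significance probability value, VIF).
    Applying the combined coefficient by multiplication is an assumption.
    """
    result = {}
    for field, weight in original.items():
        p_value, vif = stats[field]
        combined = first_coefficient(p_value) * second_coefficient(vif)
        result[field] = weight * combined
    return result


# Hypothetical fields for one scene type.
weights = corrected_weights(
    {"alarm_total": 0.4, "geo_location": 0.2, "behavior_type": 0.4},
    {
        "alarm_total": (0.01, 2.0),     # significant, not collinear
        "geo_location": (0.20, 2.0),    # not significant
        "behavior_type": (0.30, 12.0),  # not significant and collinear
    },
)
print(weights)
```

In this sketch a field that fails both checks is damped twice, which is one way to read the product in the claim; a real implementation would take its thresholds and coefficient values from the embodiment.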
Description
Multi-source log-based anomaly scoring method, device, equipment and storage medium
Technical Field
The present application relates to the field of network security technologies, and in particular to an anomaly scoring method, apparatus, device, and storage medium based on a multi-source log.
Background
As network attack techniques grow more complex, and attacks on the Web application layer in particular have surged, enterprises commonly deploy multiple security devices such as firewalls, web application firewalls (WAFs), and honeypots to strengthen protection. During operation, these security devices continuously generate large volumes of alarm logs, which can reveal potential attack behavior in real time. However, the quality of alarm logs is uneven, and they contain many false alarms and redundant entries, so the alarm logs need to undergo anomaly evaluation to improve the efficiency of security operations.
In the related art, a rule matching mechanism is generally used to perform anomaly detection on the alarm logs generated by a security device. Specifically, each alarm log is compared with known attack features in a predefined rule base to judge whether the corresponding alarm is anomalous. This approach has two drawbacks. First, it analyzes the logs of a single source in isolation and easily misses potential anomalies that span devices, so the analysis results are not comprehensive. Second, its detection capability depends heavily on the completeness of the pre-extracted known attack features and of the rule base, so it is difficult to accurately detect unknown attacks, or variants of existing attacks, that the rule base does not cover.
Disclosure of Invention
The application provides an anomaly scoring method, apparatus, device, and storage medium based on a multi-source log, which can improve the comprehensiveness and accuracy of anomaly detection based on the multi-source log.
To achieve the above object, a first aspect of an embodiment of the present application provides an anomaly scoring method based on a multi-source log, including:
acquiring a plurality of alarm logs generated by a plurality of security devices and an internet protocol address associated with each alarm log, and determining a plurality of objects to be evaluated based on the alarm logs and the internet protocol addresses;
for each object to be evaluated, determining a plurality of corresponding fields to be evaluated, and extracting field feature information corresponding to each field to be evaluated from at least one associated alarm log;
for each object to be evaluated, determining a corresponding score evaluation rule table, and matching, according to the score evaluation rule table, an anomaly score to the field feature information corresponding to each field to be evaluated;
for each object to be evaluated, combining the anomaly scores corresponding to the fields to be evaluated to construct an initial score sequence corresponding to the object to be evaluated;
for each object to be evaluated, calculating a feature distance between the initial score sequence and a preset object cluster center, and concatenating the feature distance with the initial score sequence to obtain a target score sequence;
and inputting the target score sequence corresponding to each object to be evaluated into a pre-trained target scoring model to obtain a corresponding anomaly scoring result.
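The steps of the first aspect, up to the point where the sequence is handed to the scoring model, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the embodiment itself: the field names, the threshold rules in the rule table, and the use of Euclidean distance are all hypothetical, and the cluster center is taken as the element-wise mean of sample score sequences, as claim 3 describes. The pre-trained target scoring model is not shown.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical rule table: field name -> rule mapping a raw feature value to an anomaly score.
RuleTable = Dict[str, Callable[[float], float]]


def initial_score_sequence(
    features: Dict[str, float], rules: RuleTable, field_order: Sequence[str]
) -> List[float]:
    """Match each field's feature value to an anomaly score and combine them in a fixed order."""
    return [rules[name](features[name]) for name in field_order]


def cluster_center(samples: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of sample initial score sequences (per claim 3)."""
    return [sum(column) / len(samples) for column in zip(*samples)]


def target_score_sequence(scores: Sequence[float], center: Sequence[float]) -> List[float]:
    """Append the feature distance to the initial score sequence.

    Euclidean distance is an assumption; the text does not name a metric here.
    """
    return list(scores) + [math.dist(scores, center)]


# Illustrative fields and thresholds for an IP-address object (all values assumed).
FIELDS = ["device_distribution", "alarm_total", "attack_type_count"]
RULES: RuleTable = {
    "device_distribution": lambda v: 4.0 if v >= 2 else 1.0,
    "alarm_total": lambda v: 6.0 if v >= 100 else 2.0,
    "attack_type_count": lambda v: 3.0 if v >= 5 else 1.0,
}

features = {"device_distribution": 3, "alarm_total": 120, "attack_type_count": 2}
scores = initial_score_sequence(features, RULES, FIELDS)
center = cluster_center([[0.0, 1.0, 1.0], [2.0, 3.0, 1.0]])
target = target_score_sequence(scores, center)
print(target)  # this list would then be passed to the pre-trained target scoring model
```

Appending the distance as one extra element keeps the per-field scores intact while giving the model a single summary of how far the object sits from typical behavior.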
Accordingly, a second aspect of an embodiment of the present application provides an anomaly scoring device based on a multi-source log, the device including:
the acquisition module, used for acquiring a plurality of alarm logs generated by a plurality of security devices and an internet protocol address associated with each alarm log, and determining a plurality of objects to be evaluated based on the alarm logs and the internet protocol addresses;
the determining module, used for determining a plurality of corresponding fields to be evaluated for each object to be evaluated, and extracting field feature information corresponding to each field to be evaluated from at least one associated alarm log;
the matching module, used for determining a corresponding score evaluation rule table for each object to be evaluated, and matching, according to the score evaluation rule table, an anomaly score to the field feature information corresponding to each field to be evaluated;
the construction module, used for combining, for each object to be evaluated, the anomaly scores corresponding to the fields to be evaluated and constructing an initial score sequence corresponding to the object to be evaluated;
the splicing module, used for calculating the feature distance between the initial score sequence and a preset object cluster center for each