CN-122020499-A - Abnormal behavior identification method, medium and equipment based on multidimensional data
Abstract
The invention relates to the technical field of data processing, and in particular to a method, medium and equipment for identifying abnormal behavior based on multidimensional data. Each data dimension is classified as high, medium or low timeliness according to how closely it is associated with the occurrence time of historical abnormal behavior, so that the processing strategy of each data dimension matches the early-warning requirement. A mutation index reflecting data fluctuation, an abnormal aura reflecting early risk, and a sparseness degree reflecting data distribution are extracted, so that abnormal behavior recognition and risk signal capture are more comprehensive. A target aggregation time period is determined dynamically from the dimension type, mutation index, abnormal aura and sparseness degree; the data are aggregated over that period and a feature data set is extracted, so that the aggregation period adapts to the data and balances data integrity against early-warning timeliness, improving the quality of the data features. The multidimensional features are then integrated to generate recognition information, improving the accuracy of abnormal behavior recognition.
Inventors
- WU MENGXUE
- YU GENGXING
- YANG MINGZHU
Assignees
- 杭州云深科技有限公司
Dates
- Publication Date: 20260512
- Application Date: 20260416
Claims (10)
- 1. An abnormal behavior identification method based on multidimensional data, characterized by comprising the following steps: S1, acquiring a dimension type corresponding to each data dimension according to the degree of association tightness between each data dimension and the occurrence time of historical abnormal behavior, wherein the data dimensions at least comprise communication behaviors, transaction behaviors and APP behaviors, and the dimension types comprise high timeliness, medium timeliness and low timeliness; S2, according to the dimension type corresponding to each data dimension, respectively acquiring a mutation index, an abnormal aura and a sparseness degree corresponding to the behavior data of the target equipment for each data dimension, wherein the mutation index reflects the fluctuation degree of the behavior data, the abnormal aura provides early warning of the abnormal behavior, and the sparseness degree reflects the distribution density of the behavior data per unit time; S3, respectively determining a target aggregation time period corresponding to each data dimension according to the dimension type, mutation index, abnormal aura and sparseness degree corresponding to each data dimension; S4, according to the target aggregation time period corresponding to each data dimension, respectively performing aggregation and feature extraction on the behavior data corresponding to each data dimension to obtain a feature data set corresponding to each data dimension; S5, acquiring abnormal behavior identification information corresponding to the target equipment according to all the feature data sets.
- 2. The abnormal behavior recognition method based on multidimensional data according to claim 1, wherein S1 comprises the steps of: S11, acquiring historical data and abnormal behavior record data of reference equipment for each data dimension, wherein the reference equipment and the target equipment are of the same type, the historical data at least comprise a plurality of specific behavior events and the behavior data occurrence time corresponding to each specific behavior event, the abnormal behavior record data comprise a plurality of historical abnormal behaviors and the historical abnormal behavior occurrence time corresponding to each historical abnormal behavior, and the historical abnormal behaviors at least comprise abnormal communication behaviors, abnormal transaction behaviors and abnormal APP behaviors; S12, acquiring, according to the historical data of each data dimension and the abnormal behavior record data, the average interval duration between the behavior data occurrence times corresponding to each data dimension and the corresponding nearest historical abnormal behavior occurrence time; S13, comparing the average interval duration corresponding to each data dimension with a preset degree threshold, and determining the dimension type corresponding to each data dimension according to the comparison result.
- 3. The abnormal behavior recognition method based on multidimensional data according to claim 1, wherein S2 comprises the steps of: S21, for each data dimension, selecting a corresponding basic observation window according to the dimension type corresponding to the current data dimension, wherein the lengths of the basic observation windows corresponding to high timeliness, medium timeliness and low timeliness increase in that order; S22, cutting the behavior data corresponding to the current data dimension into a plurality of data segments according to the basic observation window; S23, respectively calculating the mutation index corresponding to each data segment, and acquiring the mutation index corresponding to the current data dimension according to the mutation indexes of all the data segments; S24, identifying whether the behavior data corresponding to the current data dimension contain preset behavior pattern features corresponding to historical abnormal behaviors, and acquiring the abnormal aura corresponding to the current data dimension; S25, respectively counting the number of specific behavior events corresponding to each data segment, and acquiring the sparseness degree corresponding to the current data dimension according to the numbers of specific behavior events corresponding to all the data segments.
- 4. The abnormal behavior recognition method based on multidimensional data according to claim 1, wherein S3 comprises the steps of: S31, respectively determining target weights of a data updating frequency factor, an abnormal behavior latency factor and a device liveness factor according to the mutation index corresponding to each data dimension; S32, respectively determining time period reference values of the data updating frequency factor, the abnormal behavior latency factor and the device liveness factor according to the dimension type, the abnormal aura and the sparseness degree corresponding to each data dimension; S33, performing a weighted average calculation on all the target weights corresponding to each data dimension and the corresponding time period reference values to obtain an initial aggregation time period corresponding to each data dimension; S34, performing conflict detection on the initial aggregation time periods corresponding to all the data dimensions, and obtaining the target aggregation time period corresponding to each data dimension.
- 5. The abnormal behavior recognition method based on multidimensional data according to claim 4, wherein S31 comprises the steps of: S311, for any data dimension, acquiring the basic weights of the data updating frequency factor, the abnormal behavior latency factor and the device liveness factor; S312, if the mutation index corresponding to the current data dimension is greater than or equal to a preset mutation threshold, increasing the basic weight of the data updating frequency factor according to a first preset proportion and reducing the basic weights of the abnormal behavior latency factor and the device liveness factor according to a second preset proportion, to obtain the target weights of the data updating frequency factor, the abnormal behavior latency factor and the device liveness factor; S313, if the mutation index corresponding to the current data dimension is smaller than the preset mutation threshold, determining the basic weights of the data updating frequency factor, the abnormal behavior latency factor and the device liveness factor as the corresponding target weights.
- 6. The abnormal behavior recognition method based on multidimensional data according to claim 4, wherein S32 comprises the steps of: S321, for any data dimension, acquiring a typical latency range corresponding to the current data dimension according to the abnormal aura corresponding to the current data dimension and a first preset lookup table, wherein the first preset lookup table comprises a mapping relation between abnormal behaviors and latencies; S322, determining, within the typical latency range, a time period reference value corresponding to the abnormal behavior latency factor according to the sparseness degree corresponding to the current data dimension; S323, acquiring a time period reference value corresponding to the data updating frequency factor from a second preset query relation according to the dimension type of the current data dimension; S324, acquiring a time period reference value corresponding to the device liveness factor from a third preset query relation according to the sparseness degree of the current data dimension.
- 7. The abnormal behavior recognition method based on multidimensional data according to claim 4, wherein S34 comprises the steps of: S341, for any two data dimensions, calculating the ratio of the initial aggregation time periods corresponding to the current two data dimensions; S342, if the ratio is greater than a preset synergy coefficient, determining that a period conflict exists between the current two data dimensions, and executing step S343; S343, comparing the dimension types corresponding to the current two data dimensions, taking the initial aggregation time period corresponding to the data dimension with higher timeliness as its target aggregation time period, adjusting the initial aggregation time period corresponding to the data dimension with lower timeliness until the ratio of the adjusted aggregation time period to the target aggregation time period of the data dimension with higher timeliness is smaller than or equal to the preset synergy coefficient, and taking the adjusted aggregation time period as the target aggregation time period corresponding to the data dimension with lower timeliness; S344, if the ratio is smaller than or equal to the preset synergy coefficient, determining that no period conflict exists between the current two data dimensions, and executing step S345; S345, taking the initial aggregation time periods corresponding to the current two data dimensions as the corresponding target aggregation time periods.
- 8. The abnormal behavior recognition method based on multidimensional data according to claim 1, wherein S5 comprises the steps of: S51, inputting the feature data set corresponding to the current data dimension into a preset risk score model for the abnormal behavior corresponding to the current data dimension, and acquiring a risk score of the target equipment for the current data dimension; S52, if the risk score is greater than or equal to a preset score threshold, determining that the target equipment exhibits the abnormal behavior corresponding to the current data dimension; S53, traversing all data dimensions, determining all abnormal behaviors of the target equipment, and acquiring the abnormal behavior identification information corresponding to the target equipment.
- 9. A non-transitory computer readable storage medium having at least one instruction or at least one program stored therein, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the multidimensional data based abnormal behavior identification method of any one of claims 1-8.
- 10. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 9.
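As a non-authoritative illustration of claims 4, 5 and 7, the following minimal Python sketch shows one way the weight adjustment (S311-S313), the weighted average (S33) and the pairwise conflict detection (S341-S345) could fit together. The base weights, the mutation threshold, the two preset proportions, the synergy coefficient and all names (`Dimension`, `target_weights`, `initial_period`, `resolve_conflicts`) are hypothetical, since the claims only state that such values are preset. The sketch also assumes that the ratio is taken as the lower-timeliness period divided by the higher-timeliness period, that "adjusting" means capping the lower-timeliness period at the coefficient times the higher-timeliness period, and that pairs of the same dimension type are left unchanged.

```python
from dataclasses import dataclass

# All numeric values below are hypothetical placeholders, not taken from the claims.
BASE_WEIGHTS = {"update_frequency": 0.4, "latency": 0.3, "liveness": 0.3}
TIMELINESS_RANK = {"high": 0, "medium": 1, "low": 2}  # lower rank = higher timeliness


@dataclass
class Dimension:
    name: str
    dim_type: str            # "high", "medium" or "low" (result of S1)
    mutation_index: float    # result of S2
    reference_periods: dict  # per-factor time period reference values in days (result of S32)


def target_weights(mutation_index, threshold=0.5, increase=0.5, decrease=0.5):
    """S311-S313: adjust the base weights only when the mutation index reaches the threshold."""
    weights = dict(BASE_WEIGHTS)
    if mutation_index >= threshold:
        weights["update_frequency"] *= 1 + increase
        weights["latency"] *= 1 - decrease
        weights["liveness"] *= 1 - decrease
    return weights


def initial_period(dim):
    """S33: weighted average of the per-factor reference periods."""
    weights = target_weights(dim.mutation_index)
    weighted_sum = sum(weights[k] * dim.reference_periods[k] for k in weights)
    return weighted_sum / sum(weights.values())


def resolve_conflicts(dims, synergy_coefficient=3.0):
    """S341-S345: cap the lower-timeliness period of each conflicting pair."""
    ordered = sorted(dims, key=lambda d: TIMELINESS_RANK[d.dim_type])
    periods = {d.name: initial_period(d) for d in ordered}
    for i, higher in enumerate(ordered):
        for lower in ordered[i + 1:]:
            if higher.dim_type == lower.dim_type:
                continue  # assumption: same-type pairs are left unchanged
            ratio = periods[lower.name] / periods[higher.name]
            if ratio > synergy_coefficient:
                periods[lower.name] = synergy_coefficient * periods[higher.name]
    return periods


dims = [
    Dimension("communication", "high", 0.8,
              {"update_frequency": 1, "latency": 7, "liveness": 3}),
    Dimension("transaction", "medium", 0.2,
              {"update_frequency": 7, "latency": 14, "liveness": 7}),
    Dimension("app", "low", 0.1,
              {"update_frequency": 30, "latency": 60, "liveness": 30}),
]
targets = resolve_conflicts(dims)
```

Processing the dimensions from highest to lowest timeliness means a period is only ever shortened against a period that is already final, so a single pass over the pairs is enough in this sketch.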
Description
Abnormal behavior identification method, medium and equipment based on multidimensional data
Technical Field
The present invention relates to the field of data processing technologies, and in particular to a method, medium and equipment for identifying abnormal behavior based on multidimensional data.
Background
With the popularization of the mobile internet and digital payment, abnormal behaviors in communication, transactions and APP use increasingly span multiple scenarios, are associated across multiple dimensions, and are better concealed, which places higher demands on the accuracy of abnormal behavior recognition. Existing methods generally combine feature engineering with a machine learning model over multidimensional behavior data. However, they usually preset one fixed time window for all data dimensions or all devices when aggregating data and extracting features, for example uniformly using the most recent 30 days. This ignores the intrinsic timeliness differences between data dimensions and cannot adapt to dynamic changes in a device's behavior pattern. When device behavior changes sharply, or when the abnormal behavior has a long latency period, features extracted from a static window either fail to capture short-term mutation signals or are diluted by a large amount of historical normal data. The features therefore discriminate abnormal behavior less well, leading to untimely early warning or a higher false alarm rate. How to improve the quality of data features, and thereby the accuracy of abnormal behavior recognition, is thus an urgent problem to be solved.
Disclosure of Invention
To address the above technical problems, the invention adopts the following technical scheme: an abnormal behavior identification method based on multidimensional data, comprising the following steps:
S1, acquiring a dimension type corresponding to each data dimension according to the degree of association tightness between each data dimension and the occurrence time of historical abnormal behavior, wherein the data dimensions at least comprise communication behaviors, transaction behaviors and APP behaviors, and the dimension types comprise high timeliness, medium timeliness and low timeliness.
S2, according to the dimension type corresponding to each data dimension, respectively acquiring a mutation index, an abnormal aura and a sparseness degree corresponding to the behavior data of the target equipment for each data dimension, wherein the mutation index reflects the fluctuation degree of the behavior data, the abnormal aura provides early warning of the abnormal behavior, and the sparseness degree reflects the distribution density of the behavior data per unit time.
S3, respectively determining a target aggregation time period corresponding to each data dimension according to the dimension type, mutation index, abnormal aura and sparseness degree corresponding to each data dimension.
S4, according to the target aggregation time period corresponding to each data dimension, respectively performing aggregation and feature extraction on the behavior data corresponding to each data dimension to obtain a feature data set corresponding to each data dimension.
S5, acquiring abnormal behavior identification information corresponding to the target equipment according to all the feature data sets.
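Steps S1 and S2 above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method as disclosed. The text does not give formulas for the mutation index or the sparseness degree, so the definitions below (largest relative change between adjacent windows, and fraction of empty windows) are assumptions. The thresholds, the window lengths and every function name are likewise hypothetical, and the sketch assumes that a shorter average interval to the nearest historical abnormal behavior means higher timeliness. Times are plain numbers in days, and both input lists are assumed to be non-empty.

```python
import math
from bisect import bisect_left
from statistics import mean

# Hypothetical thresholds and window lengths (in days); the source only says they are preset.
HIGH_THRESHOLD_DAYS = 1.0
MEDIUM_THRESHOLD_DAYS = 7.0
BASE_WINDOW_DAYS = {"high": 1.0, "medium": 7.0, "low": 30.0}


def average_gap_to_nearest_anomaly(event_times, anomaly_times):
    """Mean distance from each behavior event to its nearest historical anomaly."""
    anomalies = sorted(anomaly_times)
    gaps = []
    for t in event_times:
        i = bisect_left(anomalies, t)
        neighbours = anomalies[max(i - 1, 0): i + 1]  # closest anomaly on each side
        gaps.append(min(abs(t - a) for a in neighbours))
    return mean(gaps)


def classify_dimension(avg_gap_days):
    """Assumption: a shorter average gap means higher timeliness."""
    if avg_gap_days <= HIGH_THRESHOLD_DAYS:
        return "high"
    if avg_gap_days <= MEDIUM_THRESHOLD_DAYS:
        return "medium"
    return "low"


def segment_counts(event_times, window, start, end):
    """Count events in consecutive windows of length `window` covering [start, end)."""
    n = math.ceil((end - start) / window)
    counts = [0] * n
    for t in event_times:
        idx = int((t - start) // window)
        if 0 <= idx < n:
            counts[idx] += 1
    return counts


def sparseness(counts):
    """Assumed definition: fraction of windows that contain no events."""
    return counts.count(0) / len(counts)


def mutation_index(counts):
    """Assumed definition: largest relative change between adjacent windows."""
    changes = [abs(b - a) / max(a, 1) for a, b in zip(counts, counts[1:])]
    return max(changes, default=0.0)


events = [0.5, 1.5, 1.6, 4.2]   # behavior event times of one data dimension
anomalies = [2.0, 5.0]          # historical abnormal behavior times
dim_type = classify_dimension(average_gap_to_nearest_anomaly(events, anomalies))
counts = segment_counts(events, BASE_WINDOW_DAYS[dim_type], start=0.0, end=5.0)
```

The per-window counts then feed both `sparseness` and `mutation_index`, mirroring how the dimension type selects the observation window before the other indicators are computed.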
The invention also provides a non-transitory computer readable storage medium in which at least one instruction or at least one program is stored, the at least one instruction or the at least one program being loaded and executed by a processor to implement the abnormal behavior identification method based on multidimensional data. The invention also provides an electronic device comprising a processor and the non-transitory computer readable storage medium described above. The method has the following advantages. Dimension types of high, medium and low timeliness are assigned according to the degree of association tightness between each data dimension and the occurrence time of historical abnormal behavior. This ensures that high-timeliness data are preferentially analyzed with a short-term strategy to capture immediate risks, while medium- and low-timeliness data are analyzed over an adapted time period to avoid feature distortion, so that the processing strategy of each data dimension accurately matches the early-warning requirement of the abnormal behavior and lays a foundation for subsequent accurate feature extraction. Early recognition and risk signal capture of the abnormal behavior are made more comprehensive through synchronous extraction of mutation indexes reflecti