CN-121980528-A - Action rule mining method based on Apriori algorithm
Abstract
The invention belongs to the technical field of behavior rule mining and relates in particular to an action rule mining method based on the Apriori algorithm. The method comprises two steps. In step one, situation data are cleaned, usable data are identified, and the action rule data are classified into conditions and actions. In step two, the action rules of each category are treated as item sets; if the number of times the rules in an item set appear together exceeds a minimum support threshold, the item set is added to the frequent item sets. The evaluation criteria for frequent item set mining are support and confidence. For data such as situation data, the method builds a library of frequent item sets of action rules based on the Apriori algorithm and mines the action rules from those frequent item sets.
Inventors
- JI SIYUAN
- CHEN XIAODONG
- MA XIAOLE
- GUO XIAOLIN
- WEN YUE
Assignees
- 航天科工智能运筹与信息安全研究院(武汉)有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20251230
Claims (10)
- 1. An action rule mining method based on an Apriori algorithm, characterized by comprising two steps: step one, data cleaning is carried out on situation data, usable data are identified, and action rule data are classified into conditions and actions; step two, the action rules of each category are treated as item sets, and if the number of times the rules in an item set appear together is greater than a minimum support threshold, the item set is added to the frequent item sets, wherein the evaluation criteria for frequent item set mining comprise support and confidence.
- 2. The action rule mining method based on the Apriori algorithm according to claim 1, wherein the support represents the joint probability that different items co-occur in an item set; with X representing conditions and Y representing actions, for a large amount of rule data the support of the two associated items X and Y to be analyzed is: $\mathrm{Support}(X, Y) = P(XY) = \frac{\mathrm{num}(XY)}{\mathrm{num}(\mathrm{All})}$, where $P(XY)$ is the probability that the X condition occurs together with execution of the Y action in the frequent item set, $\mathrm{num}(XY)$ is the number of times X and Y occur together in the frequent item set, and $\mathrm{num}(\mathrm{All})$ is the total number of frequent item sets.
- 3. The action rule mining method based on the Apriori algorithm according to claim 2, wherein the confidence represents the conditional probability of occurrence of different items; for the pair X and Y, the confidence is: $\mathrm{Confidence}(X \Leftarrow Y) = P(X \mid Y) = \frac{P(XY)}{P(Y)}$, where $P(X \mid Y)$ is the probability that the X condition occurs given that the Y action is performed, and $P(Y)$ is the probability that the Y action is performed in the frequent item set.
- 4. The action rule mining method based on the Apriori algorithm according to claim 3, wherein the method applies the Apriori algorithm, whose principle is as follows: input: a data set and a minimum support threshold; output: the largest frequent $k$-item sets; the operation is as follows: 1) traverse the whole data set and take every item that appears as a candidate frequent 1-item set, with $k = 1$; 2) mine the frequent $k$-item sets: a) scan the data to compute the support of each candidate $k$-item set; b) prune the candidate $k$-item sets whose support is below the minimum support threshold to obtain the frequent $k$-item sets; if the frequent $k$-item sets are empty, return the collection of frequent $(k-1)$-item sets as the result and end the algorithm; if only one frequent $k$-item set is obtained, return the collection of frequent $k$-item sets as the result and end the algorithm; c) join the frequent $k$-item sets to generate the candidate $(k+1)$-item sets; 3) set $k = k + 1$ and repeat step 2) until a result is returned.
- 5. The action rule mining method based on the Apriori algorithm according to claim 4, wherein the Apriori algorithm uses prior knowledge to analyze the problem: if an item set is frequent, every subset of it is also frequent, and by using this prior property the Apriori algorithm can quickly discard candidate sets that do not meet the requirement; a certain relation exists among the events described by the data, and typical association analysis emphasizes only relations that occur at the same time; for log data with few item sets, the Apriori algorithm does not generate too many frequent item sets and does not need to scan too much data, so its advantage shows when item sets are few and transactions are many, and improving the algorithm on this basis gives a better effect.
- 6. The action rule mining method based on the Apriori algorithm according to claim 5, wherein in target action rules the behavior of the aircraft reflects its real state and its decision intention, and an air combat decision is made from the aircraft flight parameters and the environment situation information, the environment information comprising the relative attitude of the friendly and enemy sides, the flight parameters, the flight state and the safety state information; on the basis of the three decision bases of relative attitude, flight parameters and flight state, the condition variables are preprocessed: data quality is improved by data cleaning, data conversion and similar methods, the data are made clean by outlier detection and handling, missing value imputation and similar methods, and an abstract transformation is then applied to obtain the rule condition variables; rule discovery requires the occurrence probability of the X event and of the XY event, which covers many complex situations and creates a search requirement for multi-condition, multi-event action rules, so the original formula is improved and the confidence formula is changed to: $\mathrm{Confidence}((X+Y) \Leftarrow Y) = P(X+Y \mid Y) = \frac{P(X+Y)}{P(Y)}$, where $P(X+Y \mid Y)$ is the probability that the X condition and the Y action occur together given that the Y action is performed, and $P(X+Y)$ is the probability that the X condition and the Y action occur together in the frequent item set.
- 7. The action rule mining method based on the Apriori algorithm according to claim 6, wherein in the above conversion the actions Y of the action rules are known and limited in number; starting from the action, an action is fixed first and the rule conditions are found as (X+Y) - Y, so that the converted expression generates only rules for the fixed action, which avoids generating irrelevant rules and improves the efficiency of rule discovery.
- 8. The action rule mining method based on Apriori algorithm according to claim 1, wherein the method belongs to the technical field of behavior rule mining.
- 9. The action rule mining method based on the Apriori algorithm according to claim 1, wherein, for situation data, the method builds a library of frequent item sets of action rules based on the Apriori algorithm and mines the action rules from the frequent item sets.
- 10. The action rule mining method based on Apriori algorithm according to claim 1, wherein the method implements action rule mining by Apriori algorithm.
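The action-first search of claims 6 and 7 can be illustrated with a minimal Python sketch. It assumes that transactions and frequent item sets are represented as frozensets of items and that rules are kept when their confidence reaches a threshold; the function name `rules_for_action` and the parameter `min_confidence` are hypothetical and do not appear in the claims.

```python
def rules_for_action(action, frequent_itemsets, transactions, min_confidence):
    """Return (conditions, confidence) pairs for one fixed action Y.

    Hypothetical helper: the confidence is taken as P(X+Y) / P(Y),
    estimated from raw co-occurrence counts in `transactions`.
    """
    y = frozenset([action])
    count_y = sum(1 for t in transactions if y <= t)
    if count_y == 0:
        return []  # the action never occurs, so no rule can be formed
    rules = []
    for itemset in frequent_itemsets:
        # Only item sets that contain the action plus at least one condition.
        if action not in itemset or len(itemset) < 2:
            continue
        conditions = itemset - y  # the (X+Y) - Y step of claim 7
        count_xy = sum(1 for t in transactions if itemset <= t)
        conf = count_xy / count_y
        if conf >= min_confidence:  # assumed threshold, not from the claims
            rules.append((conditions, conf))
    return rules
```

Fixing the action first means item sets that contain no action, or a different action, are skipped before any confidence is computed.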
Description
Action rule mining method based on Apriori algorithm
Technical Field
The invention belongs to the technical field of behavior rule mining and relates in particular to an action rule mining method based on the Apriori algorithm.
Background
In action tasks across various scenarios, a large amount of data of many kinds, such as situation data, is accumulated. Situation data contain rich scenes together with effective coping experience and knowledge. Association rules of the form "if scene, then coping activity" are a kind of empirical knowledge, so situation data can be mined for target action rules and coping strategies. In past practice, taking association rules of the form "if scene, then coping activity" as the basis of action rules, such rules could only be mined from data such as situation data based on the Apriori algorithm.
Disclosure of Invention
(I) Technical problem to be solved
The invention aims to provide an action rule mining method based on the Apriori algorithm.
(II) Technical scheme
To solve the above technical problem, the invention provides an action rule mining method based on the Apriori algorithm, which comprises two steps. Step one: data cleaning is carried out on situation data, usable data are identified, and action rule data are classified into conditions and actions. Step two: the action rules of each category are treated as item sets; if the number of times the rules in an item set appear together is greater than a minimum support threshold, the item set is added to the frequent item sets. The evaluation criteria for frequent item set mining comprise support and confidence.
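As a rough illustration of step one, the sketch below drops unusable records and splits each remaining record into condition items and an action item. The record schema is not given in the source, so the field names `relative_attitude`, `flight_state` and `action`, and the function name `clean_and_classify`, are assumptions made only for this example.

```python
# Hypothetical field names; the source does not specify a record schema.
CONDITION_FIELDS = ("relative_attitude", "flight_state")
ACTION_FIELD = "action"


def clean_and_classify(records):
    """Drop unusable records and turn the rest into transactions.

    Each transaction is a frozenset of tagged items:
    ("cond", field, value) for conditions and ("act", value) for the action.
    """
    transactions = []
    required = CONDITION_FIELDS + (ACTION_FIELD,)
    for record in records:
        # Assumed cleaning rule: a record is unusable if any field is missing or empty.
        if any(record.get(field) in (None, "") for field in required):
            continue
        items = {("cond", field, record[field]) for field in CONDITION_FIELDS}
        items.add(("act", record[ACTION_FIELD]))
        transactions.append(frozenset(items))
    return transactions
```

Tagging each item as a condition or an action keeps the two classes distinguishable once they are mixed in the same item set.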
The support represents the joint probability that different items co-occur in an item set. With X representing conditions and Y representing actions, for a large amount of rule data the support of the two associated items X and Y to be analyzed is: $\mathrm{Support}(X, Y) = P(XY) = \frac{\mathrm{num}(XY)}{\mathrm{num}(\mathrm{All})}$, where $P(XY)$ is the probability that the X condition occurs together with execution of the Y action in the frequent item set, $\mathrm{num}(XY)$ is the number of times X and Y occur together in the frequent item set, and $\mathrm{num}(\mathrm{All})$ is the total number of frequent item sets.
The confidence represents the conditional probability of occurrence of different items. For the pair X and Y, the confidence is: $\mathrm{Confidence}(X \Leftarrow Y) = P(X \mid Y) = \frac{P(XY)}{P(Y)}$, where $P(X \mid Y)$ is the probability that the X condition occurs given that the Y action is performed, and $P(Y)$ is the probability that the Y action is performed in the frequent item set.
The method applies the Apriori algorithm, whose principle is as follows.
Input: a data set and a minimum support threshold.
Output: the largest frequent $k$-item sets.
The operation is as follows:
1) Traverse the whole data set and take every item that appears as a candidate frequent 1-item set, with $k = 1$.
2) Mine the frequent $k$-item sets:
a) Scan the data to compute the support of each candidate $k$-item set.
b) Prune the candidate $k$-item sets whose support is below the minimum support threshold to obtain the frequent $k$-item sets. If the frequent $k$-item sets are empty, return the collection of frequent $(k-1)$-item sets as the result and end the algorithm. If only one frequent $k$-item set is obtained, return the collection of frequent $k$-item sets as the result and end the algorithm.
c) Join the frequent $k$-item sets to generate the candidate $(k+1)$-item sets.
3) Set $k = k + 1$ and repeat step 2) until a result is returned.
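The support, the confidence and the level-wise search described above can be sketched in Python as follows. This is a minimal illustration rather than the claimed implementation: it assumes transactions are frozensets of items and that support is measured as a fraction of transactions, and the function names are chosen only for this example.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """P(X | Y) = P(XY) / P(Y), following the definition in the text."""
    return support(x | y, transactions) / support(y, transactions)


def apriori(transactions, min_support):
    """Return the largest frequent k-item sets, following steps 1) to 3)."""
    # Step 1: every item that appears is a candidate 1-item set.
    candidates = {frozenset([item]) for t in transactions for item in t}
    previous = set()
    while True:
        # Steps 2a and 2b: compute supports and prune below the threshold.
        frequent = {c for c in candidates
                    if support(c, transactions) >= min_support}
        if not frequent:
            return previous  # no frequent k-item sets: return the (k-1) level
        if len(frequent) == 1:
            return frequent  # a single frequent k-item set cannot be joined
        # Step 2c: join frequent k-item sets into candidate (k+1)-item sets.
        k = len(next(iter(frequent))) + 1
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        # Prior property: every k-item subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        previous = frequent  # Step 3: continue with the next k
```

The subset check before the next scan is where the prior property saves work, since candidates with an infrequent subset are discarded without counting their support.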
The Apriori algorithm uses prior knowledge to analyze the problem: if an item set is frequent, every subset of it is also frequent. By using this prior property, the Apriori algorithm can quickly discard candidate sets that do not meet the requirement. A certain relation exists among the events described by the data, and typical association analysis emphasizes only relations that occur at the same time. For log data with few item sets, the Apriori algorithm does not generate too many frequent item sets and does not need to scan too much data, so its advantage shows when item sets are few and transactions are many, and improving the algorithm on this basis gives a better effect. In target action rules, the behavior of the aircraft reflects its real state and its decision intention, and an air combat decision is made from the aircraft flight parameters and the environment situation information, wherein the environment information comprises the relative attitude of the friendly and enemy sides, the flight parameters, the flight state and the safety state information; on the basis of the three decision bases of relative attitude, flight parameters and flight state, the condition variables are preprocessed, and data quality is improved by means of data cleaning, data conversion and the like, enabling the data to