CN-121980449-A - Data detection method and device, electronic equipment and storage medium

Abstract

The application relates to a data detection method, a device, an electronic device, and a storage medium. The method comprises: dividing the current detection period into a first processing window and a second processing window when the current detection period is reached; in the first processing window, determining a training data set corresponding to the next detection period based on the acquired service data and the historical service data set; and in the second processing window, training a target prediction model based on the training data set, so that the trained target prediction model outputs a second data threshold corresponding to the next detection period, and data detection in the next detection period is completed based on the second data threshold. The method eliminates the prediction gap (the interval after a period starts during which no detection result can be output) that the traditional real-time detection mode incurs by performing model training and threshold calculation inside the detection period, avoids system resource contention in high-concurrency scenarios, and improves the real-time performance of data detection.

Inventors

  • ZHOU JIEYUN

Assignees

  • 北京奇艺世纪科技有限公司

Dates

Publication Date
20260505
Application Date
20251230

Claims (10)

  1. A data detection method, comprising: dividing the current detection period into a first processing window and a second processing window when the current detection period is reached; detecting the acquired service data based on a first data threshold corresponding to the current detection period in the first processing window and the second processing window; determining, in the first processing window, a training data set corresponding to a first detection period based on the acquired service data and a historical service data set acquired before the current detection period, and training, in the second processing window, a target prediction model based on the training data set so that the trained target prediction model outputs a second data threshold corresponding to the first detection period, wherein the first detection period is the next detection period after the current detection period; and completing detection of the service data acquired in the first detection period based on the second data threshold when the first detection period is reached.
  2. The method according to claim 1, wherein detecting the acquired service data based on the first data threshold corresponding to the current detection period comprises: detecting the acquired service data through a first channel based on the first data threshold corresponding to the current detection period; determining the training data set corresponding to the first detection period based on the acquired service data and the historical service data set acquired before the current detection period comprises: determining the training data set corresponding to the first detection period through a second channel based on the acquired service data and the historical service data set acquired before the current detection period; and training the target prediction model based on the training data set, so that the trained target prediction model outputs the second data threshold corresponding to the first detection period, comprises: training the target prediction model based on the training data set through the second channel, so that the trained target prediction model outputs the second data threshold corresponding to the first detection period.
  3. The method of claim 2, wherein, within the first processing window, the proportion of system resources allocated to the first channel is greater than the proportion of system resources allocated to the second channel; and within the second processing window, the proportion of system resources allocated to the first channel is smaller than the proportion of system resources allocated to the second channel.
  4. The method of claim 2, wherein determining, through the second channel, the training data set corresponding to the first detection period based on the acquired service data and the historical service data set acquired before the current detection period comprises: encapsulating the acquired service data into a first message through the first channel, and sending the first message to a message queue; acquiring the first message from the message queue through the second channel; and determining the training data set corresponding to the first detection period through the second channel based on the service data in the first message and the historical service data set acquired before the current detection period.
  5. The method of claim 2, wherein training, through the second channel, the target prediction model based on the training data set so that the trained target prediction model outputs the second data threshold corresponding to the first detection period comprises: encapsulating the training data set into a second message through the second channel, and sending the second message to a message queue; acquiring the second message from the message queue through the second channel; and training the target prediction model through the second channel based on the training data set in the second message, so that the trained target prediction model outputs the second data threshold corresponding to the first detection period.
  6. The method according to claim 1, further comprising: after the trained target prediction model outputs the second data threshold corresponding to the first detection period, storing the correspondence between the first detection period and the second data threshold in a target database; wherein completing detection of the service data acquired in the first detection period based on the second data threshold when the first detection period is reached comprises: querying the target database, based on the first detection period, for the second data threshold corresponding to the first detection period when the first detection period is reached; and completing detection of the service data acquired in the first detection period based on the second data threshold.
  7. The method of claim 1, wherein training the target prediction model based on the training data set so that the trained target prediction model outputs the second data threshold corresponding to the first detection period comprises: acquiring a historical prediction model corresponding to a second detection period from a model storage pool, wherein the model storage pool stores historical prediction models corresponding to different detection periods, the second detection period is the detection period immediately preceding the current detection period, and the historical prediction model corresponding to the second detection period is a model whose training was completed in the second detection period and which is used to output the first data threshold; determining the historical prediction model corresponding to the second detection period as the target prediction model; and performing incremental training on the target prediction model based on the training data set, so that the trained target prediction model outputs the second data threshold corresponding to the first detection period.
  8. A data detection device, comprising: a dividing module configured to divide the current detection period into a first processing window and a second processing window when the current detection period is reached; a detection module configured to detect the acquired service data based on a first data threshold corresponding to the current detection period in the first processing window and the second processing window; and a prediction module configured to determine, in the first processing window, a training data set corresponding to a first detection period based on the acquired service data and a historical service data set acquired before the current detection period, and to train, in the second processing window, a target prediction model based on the training data set so that the trained target prediction model outputs a second data threshold corresponding to the first detection period, wherein the first detection period is the next detection period after the current detection period; wherein the detection module is further configured to complete the detection of the service data acquired in the first detection period based on the second data threshold when the first detection period is reached.
  9. An electronic device comprising a processor and a memory, wherein the processor is configured to execute a data detection program stored in the memory, so as to implement the data detection method according to any one of claims 1 to 7.
  10. A storage medium storing one or more programs executable by one or more processors to implement the data detection method of any one of claims 1 to 7.
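
As an illustration of claims 6 and 7 only, the following minimal Python sketch keeps a model storage pool and a period-to-threshold store in plain dictionaries. Everything named here is hypothetical and not taken from the application: `RunningModel` is a toy stand-in for the target prediction model (a running mean and variance updated with Welford's method), the threshold is assumed to be a mean ± k·std band, periods are assumed to be consecutive integers, and the dictionaries stand in for the model storage pool and the target database.

```python
from dataclasses import dataclass


@dataclass
class RunningModel:
    """Toy incremental model (assumption): running mean/variance via Welford's method."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def partial_fit(self, values):
        # Incremental training: fold new values into the existing statistics.
        for x in values:
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)

    def threshold(self, k=3.0):
        # Assumed threshold form: (low, high) band of mean +/- k * std.
        std = (self.m2 / self.n) ** 0.5 if self.n else 0.0
        return (self.mean - k * std, self.mean + k * std)


model_pool = {}       # stand-in for the model storage pool: period -> model
threshold_store = {}  # stand-in for the target database: period -> threshold


def train_for_next(current_period, training_set):
    """Run in the current period; stores the threshold for the next period."""
    # Claim 7: reuse the model trained in the preceding period, if any.
    previous = model_pool.get(current_period - 1, RunningModel())
    # Copy so the pooled historical model is not mutated.
    model = RunningModel(previous.n, previous.mean, previous.m2)
    model.partial_fit(training_set)
    model_pool[current_period] = model
    # Claim 6: persist the correspondence between period and threshold.
    threshold_store[current_period + 1] = model.threshold()


def threshold_for(period):
    """Look up the precomputed threshold when the period is reached."""
    return threshold_store[period]
```

In this sketch the caller decides what the training set contains; because the toy model already carries earlier statistics, passing only newly acquired data avoids counting old data twice. When a period starts, `threshold_for(period)` is a plain lookup, so no training sits on the detection path.
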

Description

Data detection method and device, electronic equipment and storage medium

Technical Field

The present application relates to the field of big data technologies, and in particular to a data detection method, a device, an electronic device, and a storage medium.

Background

In data quality monitoring scenarios, real-time anomaly detection must be performed on periodically arriving service data (such as the number of active users per hour) to ensure system stability and service health. At present, a detection mode relying on real-time calculation is generally adopted: when each detection period arrives, a prediction model is trained on historical service data, the trained model is used to calculate the normal data threshold for the current detection period, and newly arrived service data is judged to be abnormal or not against that threshold.
However, because model training and threshold calculation are both performed within the current detection period, no detection result can be output in time during the gap after the period starts. In high-concurrency scenarios this mode also easily causes contention for system resources, which delays system response, so real-time requirements cannot be met.

Disclosure of Invention

In view of the foregoing, in order to solve all or some of the foregoing technical problems, embodiments of the present application provide a data detection method, apparatus, electronic device, and storage medium.
In a first aspect, the present application provides a data detection method, including: dividing the current detection period into a first processing window and a second processing window when the current detection period is reached; detecting the acquired service data based on a first data threshold corresponding to the current detection period in the first processing window and the second processing window; determining, in the first processing window, a training data set corresponding to a first detection period based on the acquired service data and a historical service data set acquired before the current detection period, and training, in the second processing window, a target prediction model based on the training data set so that the trained target prediction model outputs a second data threshold corresponding to the first detection period, wherein the first detection period is the next detection period after the current detection period; and completing detection of the service data acquired in the first detection period based on the second data threshold when the first detection period is reached.
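
The flow of the first aspect can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the application's implementation: the names `Threshold`, `split_period`, `train_threshold`, and `run_period` are hypothetical, the 50/50 window split is arbitrary, the "target prediction model" is replaced by a simple mean ± k·std band, and the training set is assumed to consist of the historical values plus the values acquired during the first window.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Threshold:
    """Assumed threshold form: a normal range [low, high]."""
    low: float
    high: float

    def is_anomalous(self, value: float) -> bool:
        return value < self.low or value > self.high


def split_period(start, length, ratio=0.5):
    """Split [start, start + length) into a first and a second window."""
    boundary = start + length * ratio
    return (start, boundary), (boundary, start + length)


def train_threshold(training_set, k=3.0):
    """Stand-in for the target prediction model (assumption)."""
    mu, sigma = mean(training_set), pstdev(training_set)
    return Threshold(mu - k * sigma, mu + k * sigma)


def run_period(samples, start, length, current, history):
    """Process one period; return (anomaly flags, threshold for next period).

    samples: list of (timestamp, value) pairs acquired in this period.
    current: the first data threshold, computed during the previous period.
    """
    first, _second = split_period(start, length)
    flags = []
    acquired_in_first = []
    for t, value in samples:
        # Detection uses the precomputed threshold in both windows.
        flags.append(current.is_anomalous(value))
        if first[0] <= t < first[1]:
            acquired_in_first.append(value)
    # First window: determine the training set for the next period.
    training_set = list(history) + acquired_in_first
    # Second window: train and output the second data threshold.
    return flags, train_threshold(training_set)
```

Because `current` was produced one period earlier, detection can begin at the first sample of the period; the threshold returned by `run_period` plays the same role for the following period.
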
In an optional embodiment, detecting the acquired service data based on the first data threshold corresponding to the current detection period includes: detecting the acquired service data through a first channel based on the first data threshold corresponding to the current detection period; determining the training data set corresponding to the first detection period based on the acquired service data and the historical service data set acquired before the current detection period includes: determining the training data set corresponding to the first detection period through a second channel based on the acquired service data and the historical service data set acquired before the current detection period; and training the target prediction model based on the training data set, so that the trained target prediction model outputs the second data threshold corresponding to the first detection period, includes: training the target prediction model based on the training data set through the second channel, so that the trained target prediction model outputs the second data threshold corresponding to the first detection period.
In an optional embodiment, within the first processing window, the proportion of system resources allocated to the first channel is greater than the proportion of system resources allocated to the second channel; within the second processing window, the proportion of system resources allocated to the first channel is smaller than the proportion of system resources allocated to the second channel.
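
The window-dependent resource split between the two channels can be illustrated with a small Python sketch. The source only states which channel receives the larger proportion in each window; the 70/30 figures, the worker-count model of "system resources", and the names `SHARES`, `detection_channel`, `prediction_channel`, and `allocate_workers` are all assumptions made for illustration.

```python
FIRST_WINDOW = "first"
SECOND_WINDOW = "second"

# Hypothetical proportions: only the ordering comes from the source.
SHARES = {
    FIRST_WINDOW: {"detection_channel": 0.7, "prediction_channel": 0.3},
    SECOND_WINDOW: {"detection_channel": 0.3, "prediction_channel": 0.7},
}


def allocate_workers(window, total_workers):
    """Split a worker budget between the two channels for a given window."""
    if total_workers < 2:
        raise ValueError("need at least one worker per channel")
    share = SHARES[window]["detection_channel"]
    detection = round(total_workers * share)
    # Keep at least one worker on each channel.
    detection = max(1, min(detection, total_workers - 1))
    return {
        "detection_channel": detection,
        "prediction_channel": total_workers - detection,
    }
```

For example, with a budget of 10 workers this sketch gives the first (detection) channel 7 workers in the first window and 3 in the second, so training in the second window does not compete with detection for the bulk of the resources.
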
In an optional embodiment, determining, through the second channel, the training data set corresponding to the first detection period based on the acquired service data and the historical service data set acquired before the current detection period includes: encapsulating the acquired service data into a first message through the first channel, and sending the first message to a message queue; acquiring the first message from the message queue through the second channel; and determining the training data set corresponding to the first detection period through the second channel based on the service data in the first message and the historical service data set acquired before the current