CN-121979721-A - Data restoration method, device, electronic equipment, storage medium and program product

Abstract

The embodiments of the application disclose a data restoration method, a device, electronic equipment, a storage medium and a program product. The method comprises: receiving to-be-repaired data of a Server-Sent Events (SSE) interface sent by a server, the to-be-repaired data comprising a plurality of sub-data; comparing the plurality of sub-data to determine difference data and the position identifier of each piece of difference data; constructing a difference data set for each position identifier; and executing a repair judgment processing operation on each difference data set. The repair judgment processing operation judges, based on a preset discrete judgment threshold, whether the difference data is discrete data. When the difference data is non-discrete data, it divides the difference data into first-type difference data and second-type difference data according to the vector distances among the difference data, determines to-be-repaired-type difference data and correct-type difference data based on the quantity relationship between the first-type and second-type difference data and on the abnormal data proportion of the to-be-repaired data, and repairs the to-be-repaired-type difference data based on the correct-type difference data to obtain repaired data.
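
The comparison and grouping steps summarised above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the whitespace tokenisation in `split_elements`, the strictly position-by-position comparison, and all function names are assumptions made for illustration. Only the reference-selection rule (shortest sequence when lengths differ, otherwise the first sub-data) is taken from claim 2.

```python
from collections import defaultdict


def split_elements(sub_data: str) -> list[str]:
    """Split one sub-data item into comparison elements.

    Assumption: elements are whitespace-separated tokens; the source does not
    say how comparison elements are delimited.
    """
    return sub_data.split()


def pick_reference(sequences: list[list[str]]) -> int:
    """Return the index of the reference element sequence (rule from claim 2)."""
    lengths = [len(seq) for seq in sequences]
    if len(set(lengths)) > 1:
        return lengths.index(min(lengths))  # lengths differ: shortest sequence
    return 0  # lengths agree: sequence of the first sub-data


def build_difference_sets(sub_data_items: list[str]) -> dict[int, list[str]]:
    """Group elements that mismatch the reference by position identifier."""
    sequences = [split_elements(item) for item in sub_data_items]
    ref_index = pick_reference(sequences)
    reference = sequences[ref_index]
    difference_sets: dict[int, list[str]] = defaultdict(list)
    for index, sequence in enumerate(sequences):
        if index == ref_index:
            continue
        # zip stops at the reference length; the reference is never longer
        # than any other sequence under the selection rule above.
        for position, (ref_elem, elem) in enumerate(zip(reference, sequence)):
            if elem != ref_elem:
                difference_sets[position].append(elem)
    return dict(difference_sets)


if __name__ == "__main__":
    items = ["temp 21.5 ok", "temp 21.7 ok", "temp 2l.6 ok", "temp 21.5 ok"]
    print(build_difference_sets(items))
```

Here the position identifier is simply the element index within the sequence. A real client would need an alignment strategy for inserted or dropped elements, which this sketch does not attempt.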

Inventors

  • CHEN WEI
  • QIU MINGLIN
  • WANG CHUNJIANG
  • ZHOU JULIANG
  • HU XINGDA
  • WANG ZHENCHAO

Assignees

  • 中移物联网有限公司
  • 中国移动通信集团有限公司

Dates

Publication Date
20260505
Application Date
20251224

Claims (11)

  1. A data repair method, comprising: receiving to-be-repaired data of a Server-Sent Events (SSE) interface sent by a server, wherein the to-be-repaired data comprises a plurality of sub-data split based on a mark splitting point; comparing the plurality of sub-data, and determining difference data in each sub-data and a position identifier of the difference data in the sub-data; constructing a corresponding difference data set for each position identifier, wherein the difference data set comprises all difference data having that position identifier; and executing a repair judgment processing operation for each difference data set, wherein the repair judgment processing operation comprises: judging, based on a preset discrete judgment threshold, whether the difference data in the difference data set is discrete data; when the difference data is non-discrete data, dividing the difference data in the difference data set into first-type difference data and second-type difference data according to vector distances between the difference data in the difference data set; determining to-be-repaired-type difference data and correct-type difference data based on the quantity relationship between the first-type difference data and the second-type difference data and on the abnormal data proportion of the to-be-repaired data; and repairing the to-be-repaired-type difference data based on the correct-type difference data to obtain repaired data.
  2. The method of claim 1, wherein comparing the plurality of sub-data, and determining difference data in each sub-data and a position identifier of the difference data in the sub-data, comprises: splitting each of the plurality of sub-data into a comparison element sequence formed by a plurality of comparison elements, while recording the position identifier of each comparison element in the sub-data; selecting a reference element sequence from the plurality of comparison element sequences corresponding to the plurality of sub-data according to a preset selection rule; comparing the reference element sequence with the other comparison element sequences in the plurality of comparison element sequences, and determining the comparison elements in the other comparison element sequences that do not match the reference element sequence as the difference data; and determining the position identifier of the difference data in the corresponding comparison element sequence as the position identifier of the difference data in the sub-data; wherein the preset selection rule comprises: when the sequence lengths of the plurality of comparison element sequences are inconsistent, selecting the comparison element sequence with the minimum sequence length as the reference element sequence; or, when the sequence lengths of the plurality of comparison element sequences are consistent, selecting the comparison element sequence corresponding to the first sub-data as the reference element sequence.
  3. The method of claim 1, wherein judging whether the difference data in the difference data set is discrete data based on a preset discrete judgment threshold comprises: determining a feature vector of each difference data in the difference data set; calculating the vector distances between the difference data in the difference data set based on the feature vector of each difference data; determining an average vector distance of the difference data set based on the vector distances between the difference data; and comparing the average vector distance with the preset discrete judgment threshold, and if the average vector distance is greater than the preset discrete judgment threshold, determining that the difference data in the difference data set is discrete data.
  4. The method of claim 1 or 3, wherein, when the difference data is non-discrete data, dividing the difference data in the difference data set into first-type difference data and second-type difference data according to vector distances between the difference data in the difference data set comprises: when the difference data is non-discrete data, performing cluster analysis on the difference data in the difference data set according to the vector distances between the difference data in the difference data set and the average vector distance of the difference data set, to obtain two clusters; and determining the two clusters as the first-type difference data and the second-type difference data, respectively.
  5. The method of claim 1, wherein determining to-be-repaired-type difference data and correct-type difference data based on the quantity relationship between the first-type difference data and the second-type difference data and on the abnormal data proportion of the to-be-repaired data comprises: determining the quantity of the first-type difference data, the quantity of the second-type difference data, and the abnormal data proportion of the to-be-repaired data; when the abnormal data proportion of the to-be-repaired data is greater than a preset proportion threshold, determining whichever of the first-type difference data and the second-type difference data is larger in quantity as the to-be-repaired-type difference data; or, when the abnormal data proportion of the to-be-repaired data is less than or equal to the preset proportion threshold, determining whichever of the first-type difference data and the second-type difference data is smaller in quantity as the to-be-repaired-type difference data.
  6. The method of any one of claims 1 to 5, further comprising: judging, based on the repaired data, whether the repaired data meets a preset repair requirement; if the repaired data does not meet the preset repair requirement, adjusting the preset discrete judgment threshold; and cyclically executing the repair judgment processing operation based on the adjusted preset discrete judgment threshold until the repaired data meets the preset repair requirement, or stopping the loop when the number of executions of the repair judgment processing operation reaches a preset count threshold.
  7. The method of claim 6, wherein judging, based on the repaired data, whether the repaired data meets a preset repair requirement comprises: calculating an actual repair proportion from the data volume of the repaired data and the data volume of the to-be-repaired data; comparing the actual repair proportion with the abnormal data proportion of the to-be-repaired data; and if the actual repair proportion is not equal to the abnormal data proportion of the to-be-repaired data, determining that the repaired data does not meet the preset repair requirement.
  8. A data repair device, comprising a receiving module, a comparison module, a construction module and a processing module, wherein: the receiving module is configured to receive to-be-repaired data of a Server-Sent Events (SSE) interface sent by a server, wherein the to-be-repaired data comprises a plurality of sub-data split based on a mark splitting point; the comparison module is configured to compare the plurality of sub-data and determine difference data in each sub-data and a position identifier of the difference data in the sub-data; the construction module is configured to construct a corresponding difference data set for each position identifier, wherein the difference data set comprises all difference data having that position identifier; and the processing module is configured to execute a repair judgment processing operation for each difference data set, wherein the repair judgment processing operation comprises: judging, based on a preset discrete judgment threshold, whether the difference data in the difference data set is discrete data; when the difference data is non-discrete data, dividing the difference data in the difference data set into first-type difference data and second-type difference data according to vector distances between the difference data in the difference data set; determining to-be-repaired-type difference data and correct-type difference data based on the quantity relationship between the first-type difference data and the second-type difference data and on the abnormal data proportion of the to-be-repaired data; and repairing the to-be-repaired-type difference data based on the correct-type difference data to obtain repaired data.
  9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program implementing the steps of the data restoration method according to any one of claims 1 to 8 when executed by the processor.
  10. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps of the data restoration method according to any one of claims 1 to 8.
  11. A computer program product comprising a computer program which, when executed by a processor, implements the data repair detection method of any one of claims 1 to 8.
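
The repair judgment steps in claims 3 to 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation. The claims do not define the feature vector, the distance metric, or the clustering algorithm, so `feature_vector` (length, digit count, letter count), Euclidean distance, and the farthest-pair seeding in `split_two_clusters` are hypothetical stand-ins. Only the average-distance threshold test and the larger/smaller selection rule follow the claim text. How equal cluster sizes are handled is also not stated; the sketch treats the first cluster as the larger one in that case.

```python
import math
from itertools import combinations

Vector = tuple[float, ...]


def feature_vector(text: str) -> Vector:
    """Hypothetical features: length, digit count, letter count."""
    digits = sum(ch.isdigit() for ch in text)
    letters = sum(ch.isalpha() for ch in text)
    return (float(len(text)), float(digits), float(letters))


def average_distance(vectors: list[Vector]) -> float:
    """Mean Euclidean distance over all pairs (0.0 if fewer than two vectors)."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def is_discrete(vectors: list[Vector], threshold: float) -> bool:
    """Claim 3: discrete when the average distance exceeds the threshold."""
    return average_distance(vectors) > threshold


def split_two_clusters(vectors: list[Vector]) -> tuple[list[int], list[int]]:
    """Assumed two-cluster split: seed with the farthest pair, then assign each
    vector to the nearer seed. Returns index lists for the two clusters."""
    if len(vectors) < 2:
        return list(range(len(vectors))), []
    seed_a, seed_b = max(
        combinations(range(len(vectors)), 2),
        key=lambda pair: math.dist(vectors[pair[0]], vectors[pair[1]]),
    )
    first: list[int] = []
    second: list[int] = []
    for index, vector in enumerate(vectors):
        near_a = math.dist(vector, vectors[seed_a]) <= math.dist(vector, vectors[seed_b])
        (first if near_a else second).append(index)
    return first, second


def choose_repair_class(
    first: list[int], second: list[int], abnormal_ratio: float, ratio_threshold: float
) -> tuple[list[int], list[int]]:
    """Claim 5: return (to_be_repaired, correct) cluster index lists."""
    larger, smaller = (first, second) if len(first) >= len(second) else (second, first)
    if abnormal_ratio > ratio_threshold:
        return larger, smaller  # mostly abnormal data: the majority needs repair
    return smaller, larger  # mostly normal data: the minority needs repair


if __name__ == "__main__":
    values = ["21.5", "21.6", "21.7", "2l.6", "21.4"]
    vectors = [feature_vector(v) for v in values]
    if not is_discrete(vectors, threshold=1.0):
        first, second = split_two_clusters(vectors)
        to_repair, correct = choose_repair_class(first, second, 0.2, 0.5)
        print([values[i] for i in to_repair], [values[i] for i in correct])
```

In the usage example, the value containing a letter in place of a digit lands in its own cluster and, because the assumed abnormal data proportion (0.2) is below the assumed proportion threshold (0.5), is selected as the to-be-repaired class. The thresholds shown are arbitrary illustration values.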

Description

Data restoration method, device, electronic equipment, storage medium and program product

Technical Field

The present application relates to the field of computer technologies, and in particular to a data repair method, apparatus, electronic device, storage medium, and program product.

Background

Server-Sent Events (SSE) is a unidirectional communication protocol based on HTTP that allows the server to push streaming data to the client. During SSE communication, the client establishes a long-lived connection, listens for the SSE streaming data sent by the server, and parses and processes it in real time.

In practical applications, network fluctuations, link jitter, or other transmission problems may corrupt the SSE streaming data received by the client. For example, special characters such as line feeds and separators may be mixed into the SSE streaming data, or non-critical content may be missing. Existing processing methods for such anomalies typically include: (1) directly discarding the abnormal SSE streaming data; (2) re-issuing a request to the server to retrieve the SSE streaming data; (3) recovering the SSE streaming data from a backup data source; and (4) invoking a large model, such as DeepSeek, to repair the abnormal SSE streaming data.

However, these approaches have limitations. Directly discarding the abnormal SSE streaming data may make the streaming output discontinuous or even lose key information. Re-requesting is not suitable for scenarios in which the data changes continuously. Recovering from a backup data source amounts to acquiring the data once more and cannot avoid similar anomalies during retransmission. Finally, large-model-based repair generally has high response latency and a complex flow, and may introduce irrelevant content.

Based on this, how to repair SSE streaming data in a real-time and lightweight manner without re-initiating the SSE data request is a technical problem to be solved by those skilled in the art.

Disclosure of Invention

The embodiments of the application provide a data repair method for solving the prior-art problem that SSE data cannot be repaired in a lightweight, real-time and accurate manner without re-initiating an SSE data request. Embodiments of the present application also provide a data repair apparatus, an electronic device, a computer-readable storage medium, and a computer program product.

The embodiments of the application adopt the following technical solutions.

In a first aspect, an embodiment of the present application provides a data repair method, including: receiving to-be-repaired data of a Server-Sent Events (SSE) interface sent by a server, wherein the to-be-repaired data comprises a plurality of sub-data split based on a mark splitting point; comparing the plurality of sub-data, and determining difference data in each sub-data and a position identifier of the difference data in the sub-data; constructing a corresponding difference data set for each position identifier, wherein the difference data set comprises all difference data having that position identifier; and executing a repair judgment processing operation for each difference data set, wherein the repair judgment processing operation comprises: judging, based on a preset discrete judgment threshold, whether the difference data in the difference data set is discrete data; when the difference data is non-discrete data, dividing the difference data in the difference data set into first-type difference data and second-type difference data according to the vector distances between the difference data in the difference data set; determining to-be-repaired-type difference data and correct-type difference data based on the quantity relationship between the first-type difference data and the second-type difference data and on the abnormal data proportion of the to-be-repaired data; and repairing the to-be-repaired-type difference data based on the correct-type difference data to obtain repaired data.

In a second aspect, an embodiment of the present application provides a data repair apparatus, including a receiving module, a comparison module, a construction module, and a processing module, where: the receiving module is used for receiving to-be-repaired data of a Server-Sent Events (SSE) interface sent by a server, wherein the to-be-repaired data comprises a plurality of sub-data split based on a mark splitting point; the comparison module is used for comparing the plurality of sub-data and determining difference data in each sub-data and a position identifier of the difference data in the sub-data; the construction module is used for constructing a corresponding difference data set for each position mar