CN-115510113-B - Data processing method, system, electronic device and readable storage medium

Abstract

The application discloses a data processing method, a data processing system, an electronic device and a computer-readable storage medium. The method comprises: receiving a data query instruction for a virtual lookup table, wherein the data query instruction carries a field value screening interval of a specified time field of the data to be queried; splitting the field value screening interval into a first time subinterval and a second time subinterval; reading, from the offline data subsystem and the real-time data subsystem respectively, data whose field value of the specified time field falls within the first time subinterval and the second time subinterval; and outputting a data query result according to the read data. With the data processing method provided by the application, a user developing a data task only needs to develop and maintain one set of code based on the virtual lookup table, so that stream and batch processing are integrated without the user being aware that the computations are carried out separately, and better processing performance and throughput can be ensured.

Inventors

  • YU SHENGNAN
  • ZHANG YUANG
  • XIAO WENHAO
  • RUAN LIANG
  • WU JIANFEI
  • LIU BAI

Assignees

  • NetEase (Hangzhou) Network Co., Ltd. (网易(杭州)网络有限公司)

Dates

Publication Date
20260512
Application Date
20220824

Claims (15)

  1. A data processing method, characterized in that the method is applied to a back-end service subsystem in a data processing system, wherein the system further comprises an offline data subsystem and a real-time data subsystem; the back-end service subsystem, the offline data subsystem and the real-time data subsystem are communicatively connected to each other; at least one virtual lookup table is configured in the back-end service subsystem; and a mapping relation exists between the at least one virtual lookup table and data stored in the offline data subsystem and the real-time data subsystem; the method comprising: receiving a data query instruction for the virtual lookup table, wherein the data query instruction at least carries a field value screening interval of a specified time field of data to be queried; in response to the data query instruction, splitting the field value screening interval into a first time subinterval before a preset moment and a second time subinterval after the preset moment; reading, from the offline data subsystem, first data whose field value of the specified time field falls within the first time subinterval, and reading, from the real-time data subsystem, second data whose field value of the specified time field falls within the second time subinterval; and outputting a data query result according to the first data and the second data; wherein the offline data subsystem is configured with at least one offline data lookup table, and the at least one offline data lookup table has a mapping relation with data stored in the offline data subsystem; the real-time data subsystem is configured with at least one real-time data lookup table, and the at least one real-time data lookup table has a mapping relation with data stored in the real-time data subsystem; and the virtual lookup table has a mapping relation with the offline data lookup table and the real-time data lookup table having the same table name.
  2. The method of claim 1, wherein the data query instruction further carries a table name of the virtual lookup table to be queried, and wherein reading, from the offline data subsystem, the first data whose field value of the specified time field falls within the first time subinterval comprises: querying, from an offline data lookup table having the same table name as the virtual lookup table, storage information of the first data whose field value of the specified time field falls within the first time subinterval; and reading the first data from the offline data subsystem according to the storage information of the first data.
  3. The method of claim 1, wherein the data query instruction further carries a table name of the virtual lookup table to be queried, and wherein reading, from the real-time data subsystem, the second data whose field value of the specified time field falls within the second time subinterval comprises: querying, from a real-time data lookup table having the same table name as the virtual lookup table, storage information of the second data whose field value of the specified time field falls within the second time subinterval; and reading the second data from the real-time data subsystem according to the storage information of the second data.
  4. The method of claim 1, wherein before receiving the data query instruction for the virtual lookup table, the method further comprises: receiving a virtual lookup table creation instruction, wherein the virtual lookup table creation instruction at least carries a table name of a virtual lookup table to be created; and in response to the virtual lookup table creation instruction, creating a virtual lookup table with the table name, creating an offline data lookup table with the table name in the offline data subsystem, and creating a real-time data lookup table with the table name in the real-time data subsystem.
  5. The method of claim 4, wherein after creating, in response to the virtual lookup table creation instruction, the virtual lookup table with the table name, the offline data lookup table with the table name in the offline data subsystem, and the real-time data lookup table with the table name in the real-time data subsystem, the method further comprises: receiving a data writing instruction for the virtual lookup table, wherein the data writing instruction at least carries the table name of the virtual lookup table and a field value of the specified time field of target data to be written; in response to the data writing instruction, when the field value is before the preset moment, storing the target data into the offline data subsystem and writing storage information of the target data into the offline data lookup table with the table name; in response to the data writing instruction, when the field value is after the preset moment, storing the target data into the real-time data subsystem and writing the storage information of the target data into the real-time data lookup table with the table name; and synchronizing the storage information of the target data into the virtual lookup table with the table name, so as to update metadata of the virtual lookup table.
  6. The method of claim 4, wherein after creating, in response to the virtual lookup table creation instruction, the virtual lookup table with the table name, the offline data lookup table with the table name in the offline data subsystem, and the real-time data lookup table with the table name in the real-time data subsystem, the method further comprises: receiving a deletion instruction for a target virtual lookup table, wherein the deletion instruction at least carries a table name of the target virtual lookup table; and in response to the deletion instruction, deleting the target virtual lookup table, deleting the offline data lookup table having the same table name as the target virtual lookup table from the offline data subsystem, and deleting the real-time data lookup table having the same table name as the target virtual lookup table from the real-time data subsystem.
  7. The method of claim 3, wherein, in a case where the data reading mode is configured as batch reading and the data query instruction does not carry an end time node of the field value screening interval, querying, from the real-time data lookup table having the same table name as the virtual lookup table, the storage information of the second data whose field value of the specified time field falls within the second time subinterval comprises: querying, from the real-time data lookup table having the same table name as the virtual lookup table, second data that was stored in the real-time data subsystem before the moment at which the data query instruction was received and whose field value of the specified time field falls within the second time subinterval.
  8. The method of claim 7, wherein outputting the data query result according to the first data and the second data comprises: integrating the first data and the second data to obtain the data query result; and displaying a data query result interface comprising the data query result.
  9. The method of claim 8, wherein, in a case where the data reading mode is configured as streaming reading and the data query instruction does not carry an end time node of the field value screening interval, displaying the data query result interface comprising the data query result further comprises: monitoring, in real time, latest data stored in the real-time data subsystem after the moment at which the data query instruction was received; and reading the latest data from the real-time data subsystem when the field value of the specified time field of the latest data falls within the second time subinterval.
  10. The method of claim 9, wherein after reading the latest data from the real-time data subsystem when the field value of the specified time field of the latest data falls within the second time subinterval, the method further comprises: integrating the latest data with the data query result to obtain an updated data query result; and refreshing the data query result interface according to the updated data query result.
  11. The method of claim 1, wherein receiving the data query instruction for the virtual lookup table comprises: displaying a data query interface, wherein the data query interface comprises the queryable fields of each virtual lookup table; and receiving a data query statement input in the data query interface, wherein the data query statement comprises a table name of the virtual lookup table to be queried and a field value screening interval of a specified time attribute of the data to be queried.
  12. The method of claim 11, wherein splitting, in response to the data query instruction, the field value screening interval into the first time subinterval before the preset moment and the second time subinterval after the preset moment comprises: in response to the input of the data query statement, parsing the data query statement to obtain the table name and the field value screening interval; and splitting the field value screening interval into the first time subinterval before the preset moment and the second time subinterval after the preset moment.
  13. A data processing system, characterized by comprising a back-end service subsystem, an offline data subsystem and a real-time data subsystem which are communicatively connected to each other, wherein at least one virtual lookup table is configured in the back-end service subsystem, and the at least one virtual lookup table has a mapping relation with data stored in the offline data subsystem and the real-time data subsystem; the offline data subsystem is used for storing data whose field value of a specified time field is before a preset moment; the real-time data subsystem is used for storing data whose field value of the specified time field is after the preset moment; the back-end service subsystem is used for receiving a data query instruction for the virtual lookup table, wherein the data query instruction at least carries a field value screening interval of the specified time field of data to be queried; in response to the data query instruction, splitting the field value screening interval into a first time subinterval before the preset moment and a second time subinterval after the preset moment; reading, from the offline data subsystem, first data whose field value of the specified time field falls within the first time subinterval, and reading, from the real-time data subsystem, second data whose field value of the specified time field falls within the second time subinterval; and outputting a data query result according to the first data and the second data; the offline data subsystem is configured with at least one offline data lookup table, and the at least one offline data lookup table has a mapping relation with data stored in the offline data subsystem; the real-time data subsystem is configured with at least one real-time data lookup table, and the at least one real-time data lookup table has a mapping relation with data stored in the real-time data subsystem; and the virtual lookup table has a mapping relation with the offline data lookup table and the real-time data lookup table having the same table name.
  14. An electronic device, comprising: a processor; and a memory for storing a data processing program, wherein, after the electronic device is powered on and the program is executed by the processor, the method of any one of claims 1-12 is performed.
  15. A computer-readable storage medium, characterized in that a data processing program is stored thereon, and the program, when run by a processor, performs the method of any one of claims 1-12.
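
The table lifecycle and write routing recited in claims 4 to 6 can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the names `Subsystem`, `BackEnd`, `create_table`, `write` and `drop_table` are hypothetical, the "storage information" is modelled as a plain list index, and a field value exactly equal to the preset moment is assumed to go to the real-time side, a boundary case the claims do not specify.

```python
class Subsystem:
    """Hypothetical stand-in for the offline or real-time data subsystem."""

    def __init__(self):
        self.storage = []  # stored records
        self.tables = {}   # table name -> list of storage positions

    def store(self, table, record):
        # Store the record, then note its position in the lookup table.
        self.storage.append(record)
        location = len(self.storage) - 1
        self.tables[table].append(location)
        return location


class BackEnd:
    """Hypothetical back-end service holding the virtual lookup tables."""

    def __init__(self, cutoff):
        self.cutoff = cutoff      # the "preset moment"
        self.offline = Subsystem()
        self.realtime = Subsystem()
        self.virtual_tables = {}  # table name -> list of (side, location)

    def create_table(self, name):
        # One creation instruction creates three tables with the same name.
        self.virtual_tables[name] = []
        self.offline.tables[name] = []
        self.realtime.tables[name] = []

    def write(self, name, record, time_value):
        # Route by comparing the time field value with the preset moment.
        # Assumption: a value equal to the cutoff counts as real-time.
        side = "offline" if time_value < self.cutoff else "realtime"
        subsystem = self.offline if side == "offline" else self.realtime
        location = subsystem.store(name, record)
        # Synchronize the storage information into the virtual table metadata.
        self.virtual_tables[name].append((side, location))
        return side

    def drop_table(self, name):
        # One deletion instruction removes all three same-named tables.
        del self.virtual_tables[name]
        del self.offline.tables[name]
        del self.realtime.tables[name]
```

In this sketch the caller only ever names the virtual table; which subsystem receives a record is decided entirely by the time field value.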

Description

Data processing method, system, electronic device and readable storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular to a data processing method, a system, an electronic device, and a computer-readable storage medium.

Background

With the rapid development of computer technology, huge data volumes bring great challenges to data analysis and processing. Currently, large amounts of data are subject to two kinds of processing demand: batch processing (also called batch computing or offline computing) and stream processing (also called stream computing or real-time computing). In the related art, the data processing link mostly adopts a pseudo stream-batch integrated architecture, in which batch computing tasks and stream computing tasks must be processed separately by two links. As a result, when developing data-related requirements, developers need to maintain two separate sets of code, one for batch processing and one for stream processing, which reduces the development efficiency of data requirements.

Disclosure of Invention

The application provides a data processing method, a system, an electronic device and a computer-readable storage medium, which allow batch processing and stream processing to be maintained through a single set of code when data-related requirements are developed, thereby improving the development efficiency of data requirements. The details are as follows.

In a first aspect, the present application provides a data processing method applied to a back-end service subsystem in a data processing system. The system further includes an offline data subsystem and a real-time data subsystem; the back-end service subsystem, the offline data subsystem and the real-time data subsystem are communicatively connected to each other; at least one virtual lookup table is configured in the back-end service subsystem; and the at least one virtual lookup table has a mapping relationship with data stored in the offline data subsystem and the real-time data subsystem. The method includes: receiving a data query instruction for the virtual lookup table, wherein the data query instruction at least carries a field value screening interval of a specified time field of data to be queried; in response to the data query instruction, splitting the field value screening interval into a first time subinterval before a preset moment and a second time subinterval after the preset moment; reading, from the offline data subsystem, first data whose field value of the specified time field falls within the first time subinterval, and reading, from the real-time data subsystem, second data whose field value of the specified time field falls within the second time subinterval; and outputting a data query result according to the first data and the second data.
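The interval splitting at the core of this method can be sketched as follows. This is an illustrative reading only, under several assumptions not stated in the application: time values are plain integers, the screening interval is half-open `[start, end)`, a value equal to the preset moment belongs to the second (real-time) subinterval, and the two subsystems are modelled as in-memory lists of rows with a hypothetical `ts` field. The names `Interval`, `split_interval`, `read` and `query_virtual_table` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Interval:
    start: int  # inclusive
    end: int    # exclusive


def split_interval(query: Interval, cutoff: int) -> Tuple[Optional[Interval], Optional[Interval]]:
    """Split [start, end) at the preset moment into (offline part, real-time part)."""
    # Part of the query strictly before the cutoff, if any.
    first = Interval(query.start, min(query.end, cutoff)) if query.start < cutoff else None
    # Part of the query at or after the cutoff, if any.
    second = Interval(max(query.start, cutoff), query.end) if query.end > cutoff else None
    return first, second


def read(rows, interval):
    """Return the rows whose time field falls within the interval."""
    if interval is None:
        return []
    return [r for r in rows if interval.start <= r["ts"] < interval.end]


def query_virtual_table(offline_rows, realtime_rows, query, cutoff):
    """Read each subinterval from its own subsystem and merge the results."""
    first, second = split_interval(query, cutoff)
    return read(offline_rows, first) + read(realtime_rows, second)
```

A query that lies entirely on one side of the preset moment yields `None` for the other subinterval, so only one subsystem is read in that case.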
In a second aspect, an embodiment of the present application further provides a data processing system. The system includes a back-end service subsystem, an offline data subsystem, and a real-time data subsystem that are communicatively connected to each other, where at least one virtual lookup table is configured in the back-end service subsystem, and the at least one virtual lookup table has a mapping relationship with data stored in the offline data subsystem and the real-time data subsystem. The offline data subsystem is used for storing data whose field value of a specified time field is before a preset moment. The real-time data subsystem is used for storing data whose field value of the specified time field is after the preset moment. The back-end service subsystem is used for: receiving a data query instruction for the virtual lookup table, wherein the data query instruction at least carries a field value screening interval of the specified time field of data to be queried; in response to the data query instruction, splitting the field value screening interval into a first time subinterval before the preset moment and a second time subinterval after the preset moment; reading, from the offline data subsystem, first data whose field value of the specified time field falls within the first time subinterval, and reading, from the real-time data subsystem, second data whose field value of the specified time field falls within the second time subinterval; and outputting a data query result according to the first data and the second data.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor; and a memory for storing a data processing program, wherein, after the electronic device is powered on and the program is executed by the processor, the method according to any implementation of the first aspect is performed.
In a fourth aspect, an embodiment of the present application further provides a computer readable storage medium storing a data processing program, the program being executed by a processor to