CN-122019606-A - Data processing method, apparatus, device, medium, and program product


Abstract

The disclosure provides a data processing method applicable to the technical field of big data. The method comprises: in response to receiving a data processing request, querying data to be processed from a first database, wherein the data in the first database is loaded from a second database, and a first key value of the data to be processed in the first database has a correspondence with a second key value of the data to be processed in the second database; and, in response to the data processing request comprising a calculation operation, determining the data to be processed in the second database according to the correspondence and executing the calculation operation in the second database. The response time of a query operation on data in the first database is shorter than in the second database, and the response time of a calculation operation on data in the second database is shorter than in the first database. The present disclosure also provides a data processing apparatus, device, storage medium, and program product.

Inventors

ZHAO JINGTENG

Assignees

中国建设银行股份有限公司
建信金融科技有限责任公司

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-30

Claims (11)

1. A data processing method, the method comprising: in response to receiving a data processing request, querying data to be processed from a first database, wherein the data in the first database is loaded from a second database, a first key value of the data to be processed in the first database has a correspondence with a second key value of the data to be processed in the second database, the first key value is generated based on a pre-configured key value generation rule, and a storage location of the data in the first database is determined according to partition range information corresponding to the first key value and an access frequency of the data; and in response to the data processing request comprising a calculation operation, determining the data to be processed in the second database according to the correspondence, and executing the calculation operation in the second database, wherein the response time of executing a query operation on data in the first database is shorter than that in the second database, and the response time of executing a calculation operation on data in the second database is shorter than that in the first database.
2. The method according to claim 1, wherein the method further comprises: for each sub-data in the data to be loaded in the second database, generating a first key value of the sub-data from the second key value of the sub-data in the second database, according to a key value generation rule pre-configured in the first database; loading the data to be loaded into the first database according to the first key value; and determining a key value correspondence between the first key value and the second key value based on the second key value of each sub-data in the second database.
3. The method of claim 2, wherein loading the data to be loaded into the first database according to the first key value comprises: obtaining partition range information for the first key value in the first database; determining, based on the first key value of each sub-data and the partition range information, a partition to which each sub-data belongs and a target file corresponding to the partition; and storing each sub-data in the corresponding target file and loading the target file into the first database.
4. The method of claim 3, wherein storing each sub-data in the corresponding target file comprises: determining an access frequency of each field in the sub-data based on a historical access record in the first database and/or a historical access record in the second database; for each sub-data in the data to be loaded, in response to the access frequency of a field being greater than a preset threshold, storing the field in a first area of the target file, wherein fields in the first area are preloaded into memory for direct access when an access process is started; and in response to the access frequency of the field being less than or equal to the preset threshold, compressing the field and storing it in a second area of the target file, wherein fields in the second area are kept in an unloaded state when the access process is started.
5. The method according to claim 4, wherein the method further comprises: pre-allocating a first storage capacity to the first area, the first storage capacity being determined based on the storage capacity of storage units in a file storage system, the file storage system being the storage system corresponding to the second database; and pre-allocating a second storage capacity to the second area, the second storage capacity being greater than the first storage capacity.
6. The method of claim 2, wherein loading the data to be loaded into the first database comprises: performing conflict verification between the first key value of each sub-data in the data to be loaded and the first key values in the first database; in response to the verification result indicating no conflict, loading the target file into the corresponding partition in the first database; and in response to the verification result indicating a conflict, recording the conflicting first key value and generating alarm information.
7. The method of claim 2, wherein loading the data to be loaded into the first database comprises: performing an integrity check and a format check on the data to be loaded; in response to the check result being a pass, loading the data to be loaded into the second database; and in response to the check result being a failure, loading the sub-data that passed the check into the second database, and generating alarm information based on the sub-data that failed the check; wherein the data to be loaded is newly added data acquired from the second database; the integrity check comprises at least one of: checking the data volume of the newly added data against the data volume of the data to be loaded, and checking hash values of key fields in the newly added data against those in the data to be loaded; and the format check comprises checking the table structure to which the newly added data belongs and the number of fields of the data to be loaded.
8. A data processing apparatus, the apparatus comprising: a query module configured to, in response to receiving a data processing request, query data to be processed from a first database, wherein the data in the first database is loaded from a second database, a first key value of the data to be processed in the first database has a correspondence with a second key value of the data to be processed in the second database, the first key value is generated based on a pre-configured key value generation rule, and a storage location of the data in the first database is determined according to partition range information corresponding to the first key value and an access frequency of the data; and a first processing module configured to, in response to the data processing request comprising a calculation operation, determine the data to be processed in the second database according to the correspondence, and execute the calculation operation in the second database, wherein the response time of executing a query operation on data in the first database is shorter than that in the second database, and the response time of executing a calculation operation on data in the second database is shorter than that in the first database.
9. An electronic device, comprising: one or more processors; and a memory for storing one or more computer programs, characterized in that the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program or instructions is stored, which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
11. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 7.
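
The conflict verification of claim 6 and the integrity and format checks of claim 7 can be illustrated with a short sketch. It assumes in-memory dictionaries stand in for the databases; the function names, the SHA-256 digest, the alarm message strings, and the per-row handling of conflicting keys are all hypothetical choices of this sketch and are not specified by the claims.

```python
import hashlib


def key_field_digest(rows, key_fields):
    """Hash the key fields of all rows, independent of row order."""
    digest = hashlib.sha256()
    lines = sorted("|".join(str(row[f]) for f in key_fields) for row in rows)
    for line in lines:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def integrity_ok(new_rows, staged_rows, key_fields):
    """Compare data volume and key-field hashes of source and staged data."""
    return (len(new_rows) == len(staged_rows)
            and key_field_digest(new_rows, key_fields)
            == key_field_digest(staged_rows, key_fields))


def format_ok(row, expected_fields):
    """Assumed format check: the row has exactly the expected fields."""
    return set(row) == set(expected_fields)


def find_conflicts(incoming_keys, existing_keys):
    """Return incoming first key values already present in the target."""
    existing = set(existing_keys)
    return sorted(k for k in incoming_keys if k in existing)


def load_batch(staged, target_db, expected_fields):
    """Load rows that pass the checks; return alarm messages for the rest."""
    alarms = []
    conflicts = set(find_conflicts(staged, target_db))
    for key, row in staged.items():
        if key in conflicts:
            alarms.append(f"key conflict: {key}")
        elif not format_ok(row, expected_fields):
            alarms.append(f"format check failed: {key}")
        else:
            target_db[key] = row
    return alarms


target_db = {"k1": {"id": 1, "amt": 5}}
staged = {
    "k1": {"id": 1, "amt": 9},   # key already present in the target
    "k2": {"id": 2, "amt": 7},   # valid row
    "k3": {"id": 3},             # missing a field
}
print(load_batch(staged, target_db, ["id", "amt"]))
```

In this sketch only the valid row is loaded, and the two rejected rows each produce one alarm message, so a failed batch is still partially usable.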

Description

Data processing method, apparatus, device, medium, and program product

Technical Field

The present disclosure relates to the field of big data, and in particular to a data processing method, apparatus, device, medium, and program product.

Background

Business systems generate massive amounts of data every day; historical data requires complex computation, and data must also be queried quickly. Current schemes typically rely on a single type of database, and each type has its own advantages and disadvantages. One type has parallel computing power but lacks precise indexing, so conditional queries take a long time and pages cannot respond in time; another type has low query latency but is not suited to large-scale data processing. As a result, fast response cannot be achieved for both data queries and data processing.

Disclosure of Invention

In view of the foregoing, the present disclosure provides a data processing method, apparatus, device, medium, and program product that increase the response speed of data query and data processing.
According to a first aspect of the present disclosure, there is provided a data processing method including: in response to receiving a data processing request, querying data to be processed from a first database, wherein the data in the first database is loaded from a second database, a first key value of the data to be processed in the first database has a correspondence with a second key value of the data to be processed in the second database, the first key value is generated based on a pre-configured key value generation rule, and a storage location of the data in the first database is determined according to partition range information corresponding to the first key value and an access frequency of the data; and, in response to the data processing request including a calculation operation, determining the data to be processed in the second database according to the correspondence and performing the calculation operation in the second database, wherein the response time of a query operation on data in the first database is shorter than in the second database, and the response time of a calculation operation on data in the second database is shorter than in the first database. The method further includes: for each sub-data in the data to be loaded in the second database, generating a first key value of the sub-data from the second key value of the sub-data in the second database, according to a key value generation rule pre-configured in the first database; loading the data to be loaded into the first database according to the first key value; and determining a key value correspondence between the first key value and the second key value based on the second key value of each sub-data in the second database.
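The routing between the two databases described above can be sketched as follows. This is a minimal illustration, assuming in-memory dictionaries stand in for both databases; the names `make_first_key` and `HybridStore`, the `acct:` key prefix, and the summation used as the calculation are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field


def make_first_key(second_key: str) -> str:
    """Hypothetical key value generation rule: prefix the source key."""
    return f"acct:{second_key}"


@dataclass
class HybridStore:
    # Stand-in for the second database (suited to calculation).
    second_db: dict = field(default_factory=dict)
    # Stand-in for the first database (suited to fast queries).
    first_db: dict = field(default_factory=dict)
    # Correspondence: first key value -> second key value.
    key_map: dict = field(default_factory=dict)

    def load(self) -> None:
        """Load every record of the second database into the first."""
        for second_key, record in self.second_db.items():
            first_key = make_first_key(second_key)
            self.first_db[first_key] = record
            self.key_map[first_key] = second_key

    def handle(self, first_keys, compute=None):
        """Query the first database; run any calculation on the second."""
        found = {k: self.first_db[k] for k in first_keys if k in self.first_db}
        if compute is None:
            return found
        # Locate the same data in the second database via the correspondence.
        rows = [self.second_db[self.key_map[k]] for k in found]
        return compute(rows)


store = HybridStore(second_db={"001": {"amount": 10}, "002": {"amount": 32}})
store.load()
print(store.handle(["acct:001"]))
print(store.handle(["acct:001", "acct:002"],
                   compute=lambda rows: sum(r["amount"] for r in rows)))
```

A request without a calculation is answered entirely from the first database, while a request with one uses the first database only to identify the records and then computes over their counterparts in the second database.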
According to an embodiment of the present disclosure, loading the data to be loaded into the first database according to the first key value includes: obtaining partition range information for the first key value in the first database; determining, based on the first key value of each sub-data and the partition range information, a partition to which each sub-data belongs and a target file corresponding to the partition; and storing each sub-data in the corresponding target file and loading the target file into the first database.
According to an embodiment of the present disclosure, storing each sub-data in the corresponding target file includes: determining the access frequency of each field in the sub-data based on a historical access record in the first database and/or a historical access record in the second database; for each sub-data in the data to be loaded, in response to the access frequency of a field being greater than a preset threshold, storing the field in a first area of the target file, wherein fields in the first area are preloaded into memory for direct access when an access process is started; and, in response to the access frequency of the field being less than or equal to the preset threshold, compressing the field and storing it in a second area of the target file, wherein fields in the second area are kept in an unloaded state when the access process is started.
According to an embodiment of the present disclosure, the method further includes: pre-allocating a first storage capacity to the first area, the first storage capacity being determined based on the storage capacity of storage units in a file storage system, the file storage system being the storage system corresponding to the second database; and pre-allocating a second storage capacity to the second area, the second storage capacity being greater than the first storage capacity.
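The partition assignment and the two-area file layout described above can be sketched as follows. This is an illustration under stated assumptions: the partition bounds, the threshold value, the use of `zlib` and JSON for the compressed area, and all function names are hypothetical, and a nested dictionary stands in for each target file.

```python
import bisect
import json
import zlib

# Hypothetical partition range information: sorted boundaries on the first
# key value. Partition 0 holds keys below "g", partition 1 holds keys from
# "g" up to (but excluding) "n", and so on; the last partition is open-ended.
PARTITION_BOUNDS = ["g", "n", "t"]
ACCESS_THRESHOLD = 100  # hypothetical preset threshold


def partition_of(first_key: str) -> int:
    """Return the index of the partition whose range contains first_key."""
    return bisect.bisect_right(PARTITION_BOUNDS, first_key)


def split_fields(record: dict, access_freq: dict):
    """Place frequently accessed fields in the first area, compress the rest."""
    first_area, second_area = {}, {}
    for name, value in record.items():
        if access_freq.get(name, 0) > ACCESS_THRESHOLD:
            first_area[name] = value
        else:
            raw = json.dumps(value).encode("utf-8")
            second_area[name] = zlib.compress(raw)
    return first_area, second_area


def build_target_files(records: dict, access_freq: dict) -> dict:
    """Group records by partition; one target file (a dict) per partition."""
    files = {}
    for first_key, record in records.items():
        first_area, second_area = split_fields(record, access_freq)
        target = files.setdefault(partition_of(first_key), {})
        target[first_key] = {"first_area": first_area,
                             "second_area": second_area}
    return files


def read_field(entry: dict, name: str):
    """Read from the first area directly; decompress the second on demand."""
    if name in entry["first_area"]:
        return entry["first_area"][name]
    return json.loads(zlib.decompress(entry["second_area"][name]).decode("utf-8"))


records = {
    "apple": {"balance": 5, "note": "rarely read"},
    "kiwi": {"balance": 7, "note": "archived"},
}
access_freq = {"balance": 500, "note": 3}
files = build_target_files(records, access_freq)
print(sorted(files))
print(read_field(files[0]["apple"], "note"))
```

Keeping the compressed fields out of the preloaded area trades a decompression step on rare reads for a smaller memory footprint at process start, which is the motivation the disclosure gives for the two areas.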
According to an embodiment of the present disclosure, loading the data to be loaded into the first database includes performing conflict verification on the first key value of each sub-data in the data to be loaded