CN-114238381-B - Data quality checking method, device and computer readable storage medium

Abstract

The invention relates to the technical field of financial technology (Fintech) and discloses a data quality checking method, a device and a computer-readable storage medium. The check logic for the check rule expressions required by each field in a data table is encapsulated at the bottom layer, and a rule information definition table adapted to the description information of the data table fields is configured on the basis of that check logic. When a data check is required, only the parameters describing the data table need to be supplied: the system automatically loads the corresponding rule configuration and the data table according to those parameters, generates the rule expression required for checking each field through rule matching, and finally checks each field according to its rule expression.

Inventors

  • YANG DONGFANG
  • HAN HAIYAN
  • LI JUN
  • LI YUAN
  • XIAO HEBING
  • LI CHAOYANG

Assignees

  • Shenzhen Qianhai WeBank Co., Ltd. (深圳前海微众银行股份有限公司)

Dates

Publication Date
20260512
Application Date
20211221

Claims (7)

  1. A data quality verification method, characterized in that the data quality verification method comprises: acquiring input parameters, and querying rule configuration information related to the input parameters from a pre-configured rule information definition table, wherein the rule information definition table is matched with the description information of each data table, and check logic for a plurality of check rule expressions is encapsulated at the bottom layer; the rule information definition table comprises a basic information table and a check rule definition table, wherein the step of acquiring input parameters and querying rule configuration information related to the input parameters from the pre-configured rule information definition table comprises: when a data check instruction is received, acquiring a database name and a data table name from the data check instruction as the input parameters; converting the rule configuration information into a first key value pair set, acquiring a data table to be checked according to the input parameters, and converting the data table to be checked into a second key value pair set, wherein the step of converting the rule configuration information into the first key value pair set comprises: converting the rule configuration information into a plurality of first key value pairs, wherein the keyword in each first key value pair is a field name defined in the check rule definition table, and the value in each first key value pair is the basic information and the check rule information that correspond to that field name in the basic information table and the check rule definition table, respectively; and summarizing the plurality of first key value pairs into the first key value pair set; and wherein the step of acquiring the data table to be checked according to the input parameters and converting the data table to be checked into the second key value pair set comprises the steps of 
generating a data query statement, using the SQL statement to query the data table, and converting the values in the first key value pair into the second key value pair into the first key value pair set by using the key engine statement, and the key value pair data to be distributed into the second key value pair key value set by the first key value pair key value set, wherein the key value pair is the second key value pair key value set is the first key value pair set and the key value set is the key value set of all the key value set to be mapped in a real-domain; matching the first key value pair set with the second key value pair set to obtain target verification expressions corresponding to all fields in the data table to be verified; and carrying out data verification on each field in the data table to be verified according to the target verification expression to obtain a verification result.
  2. The data quality verification method according to claim 1, wherein the step of matching the first key value pair set with the second key value pair set to obtain the target verification expression corresponding to each field in the data table to be verified comprises: matching the keyword of each second key value pair in the second key value pair set against the keyword of each first key value pair in the first key value pair set; and generating a corresponding check rule expression based on the basic information and the check rule information in the successfully matched first key value pair, and taking that check rule expression as the target verification expression of the field corresponding to the successfully matched second key value pair.
  3. The data quality verification method according to claim 1, wherein the step of carrying out data verification on each field in the data table to be verified according to the target verification expression to obtain a verification result comprises: determining the dependency relationship between the target verification expressions matched with the fields in the data table to be verified; and executing the matched target verification expression on each field according to the dependency relationship, so as to carry out data verification on each field and obtain the verification result corresponding to each field.
  4. The data quality verification method according to claim 1, wherein after the step of carrying out data verification on each field in the data table to be verified according to the target verification expression to obtain a verification result, the method further comprises: if the verification result is an abnormal verification result, generating data abnormality prompt information according to the abnormal verification result, and determining the alarm level of the prompt information; and summarizing the data abnormality prompt information and the alarm level of each field in the data table to be verified into an abnormal data summary table, and pushing the abnormal data summary table to the relevant processing personnel.
  5. The data quality verification method according to any one of claims 1 to 4, wherein, prior to the step of acquiring the input parameters, the method further comprises: configuring the basic information and check rules of the data table fields to generate an initial rule information definition table; and when a class and/or function of a custom check rule is obtained from the front end, integrating the class and/or function of the custom check rule into the initial rule information definition table to obtain the rule information definition table.
  6. A data quality checking device comprising a memory, a processor and a data quality checking program stored on the memory and executable on the processor, the data quality checking program, when executed by the processor, implementing the steps of the data quality verification method according to any one of claims 1 to 5.
  7. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a data quality checking program which, when executed by a processor, implements the steps of the data quality verification method according to any one of claims 1 to 5.
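
Claims 3 and 4 describe running the matched expressions in dependency order and then summarizing abnormal results with an alarm level. The following Python sketch illustrates one way this could look; it is not the patented implementation. The field names, the `DEPENDS_ON` mapping, the `ALARM_LEVEL` table and both helper functions are hypothetical, and the use of a topological sort for the dependency ordering is an assumption.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency relationship: each field maps to the fields whose
# verification expressions must be executed before its own.
DEPENDS_ON = {
    "amount": set(),
    "currency": set(),
    "amount_in_cny": {"amount", "currency"},
}

# Hypothetical alarm level assigned to each check rule.
ALARM_LEVEL = {"not_null": "critical", "non_negative": "warning"}


def execution_order(depends_on):
    """Return the fields in an order that respects the dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def summarize(results):
    """Build an abnormal data summary from (field, rule, failed_rows) tuples.

    Fields with no failed rows are treated as normal and left out.
    """
    summary = []
    for field, rule, failed_rows in results:
        if failed_rows == 0:
            continue
        summary.append({
            "field": field,
            "message": f"{failed_rows} row(s) failed rule '{rule}'",
            "level": ALARM_LEVEL.get(rule, "info"),
        })
    return summary


order = execution_order(DEPENDS_ON)
summary = summarize([
    ("amount", "non_negative", 2),
    ("user_id", "not_null", 0),
    ("currency", "not_null", 1),
])
print(order)
print(summary)
```

In this sketch the summary is a plain list of dictionaries; pushing it to the relevant processing personnel (for example by e-mail or a message queue) is left out because the claims do not specify a channel.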

Description

Data quality checking method, device and computer readable storage medium

Technical Field

The present invention relates to the technical field of financial technology (Fintech), and in particular to a data quality verification method, device and computer-readable storage medium.

Background

With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). Because of the security and real-time requirements of the financial industry, higher demands are also placed on data processing technology.

In day-to-day data processing, the quality of collected data usually has to be checked. Because the volume of data to be checked is usually large, the data is stored in Hive (a data warehouse tool built on Hadoop, a distributed system infrastructure capable of high-speed computation and storage of massive data). The check rules for the data are then converted into SQL statements based on Hive functions, and finally those check rule statements are executed on the data in the warehouse to screen out abnormal data.

However, in this conventional verification method the check rules are configured manually according to the logic of each field, and the SQL statements also have to be written by hand, so data verification is very inefficient when a large amount of data has to be verified.

Disclosure of Invention

The main purpose of the invention is to provide a data quality verification method, a device and a computer-readable storage medium, aiming to solve the technical problem that the existing manual approach to data quality verification is inefficient.
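
To make the manual workflow concrete: each check rule for each field has to be turned into its own SQL statement. The Python sketch below shows what such a statement might look like when built from a rule name. The database, table and field names, the two rule names and the `render_check_sql` helper are all hypothetical and are not taken from the patent.

```python
def render_check_sql(database: str, table: str, field: str, rule: str) -> str:
    """Build a query that selects the rows violating one check rule.

    Identifiers are interpolated directly for illustration only; a real
    system would validate them before building the statement.
    """
    # Hypothetical rule names mapped to the condition that marks a row abnormal.
    conditions = {
        "not_null": f"{field} IS NULL",
        "non_negative": f"{field} < 0",
    }
    if rule not in conditions:
        raise ValueError(f"unsupported rule: {rule}")
    return f"SELECT * FROM {database}.{table} WHERE {conditions[rule]}"


sql = render_check_sql("demo_db", "loan_record", "amount", "non_negative")
print(sql)  # SELECT * FROM demo_db.loan_record WHERE amount < 0
```

Writing and maintaining one such statement per field and per rule is the effort that grows with the number of tables to be checked.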
In order to achieve the above object, the present invention provides a data quality verification method, including:

  • acquiring input parameters, and querying rule configuration information related to the input parameters from a pre-configured rule information definition table, wherein the rule information definition table is matched with the description information of each data table, and check logic for a plurality of check rule expressions is encapsulated at the bottom layer;
  • converting the rule configuration information into a first key value pair set, and acquiring a data table to be checked according to the input parameters so as to convert the data table to be checked into a second key value pair set;
  • matching the first key value pair set with the second key value pair set to obtain target verification expressions corresponding to all fields in the data table to be verified; and
  • carrying out data verification on each field in the data table to be verified according to the target verification expression to obtain a verification result.

Optionally, the rule information definition table includes a basic information table and a check rule definition table, and the step of acquiring the input parameters and querying rule configuration information related to the input parameters from a pre-configured rule information definition table includes:

  • when a data verification instruction is received, acquiring a database name and a data table name from the data verification instruction as the input parameters; and
  • querying rule configuration information related to the database name and the data table name from the pre-configured basic information table and check rule definition table, wherein database name parameters and data table name parameters are defined in both the basic information table and the check rule definition table.
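
The method summarized above can be sketched end to end in Python. This is a minimal illustration under stated assumptions rather than the patented implementation: the two definition tables are modelled as in-memory lists, the rule names and predicates in `CHECK_LOGIC` are hypothetical, and the shape of the second key value pair set (field name mapped to all values of that field) is an assumption, since the source does not state it clearly.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the basic information table and the check rule
# definition table; both carry database name and data table name parameters.
BASIC_INFO = [
    {"db": "demo_db", "table": "loan_record", "field": "amount", "type": "decimal"},
    {"db": "demo_db", "table": "loan_record", "field": "user_id", "type": "string"},
]
CHECK_RULES = [
    {"db": "demo_db", "table": "loan_record", "field": "amount", "rule": "non_negative"},
    {"db": "demo_db", "table": "loan_record", "field": "user_id", "rule": "not_null"},
]

# Check logic encapsulated once at the bottom layer: rule name -> predicate.
CHECK_LOGIC: Dict[str, Callable[[object], bool]] = {
    "not_null": lambda v: v is not None,
    "non_negative": lambda v: v is not None and v >= 0,
}


def first_kv_set(db, table):
    """Key: field name. Value: (basic info, check rule info) for that field."""
    basic = {r["field"]: r for r in BASIC_INFO if (r["db"], r["table"]) == (db, table)}
    rules = {r["field"]: r for r in CHECK_RULES if (r["db"], r["table"]) == (db, table)}
    return {field: (basic.get(field), rule) for field, rule in rules.items()}


def second_kv_set(rows):
    """Key: field name. Value: all values of that field (assumed shape)."""
    result = {}
    for row in rows:
        for field, value in row.items():
            result.setdefault(field, []).append(value)
    return result


def verify(db, table, rows):
    """Match the two sets on field name and return failing row indexes per field."""
    first = first_kv_set(db, table)
    second = second_kv_set(rows)
    results = {}
    for field, values in second.items():
        if field not in first:
            continue  # no rule configured for this field
        _, rule_info = first[field]
        check = CHECK_LOGIC[rule_info["rule"]]
        results[field] = [i for i, v in enumerate(values) if not check(v)]
    return results


rows = [
    {"user_id": "u1", "amount": 100, "note": "ok"},
    {"user_id": None, "amount": -5, "note": "bad"},
]
report = verify("demo_db", "loan_record", rows)
print(report)  # {'user_id': [1], 'amount': [1]}
```

Only the database name and table name are passed in; everything else is looked up from the configured tables, which is the point of the approach. In a real deployment the rows would come from a query against the data warehouse rather than an in-memory list.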
Optionally, the step of converting the rule configuration information into the first key value pair set includes:

  • converting the rule configuration information into a plurality of first key value pairs, wherein the keyword in each first key value pair is a field name defined in the check rule definition table, and the value in each first key value pair is the basic information and the check rule information that correspond to that field name in the basic information table and the check rule definition table, respectively; and
  • summarizing the plurality of first key value pairs into the first key value pair set.

Optionally, the step of matching the first key value pair set with the second key value pair set to obtain the target verification expression corresponding to each field in the data table to be verified includes:

  • matching the keyword of each second key value pair in the second key value pair set against the keyword of each first key value pair in the first key value pair set; and
  • generating a corresponding check rule expression based on the basic information and the check rule information in the successfully matched first key value pair, and taking the corres