CN-116303704-B - Automatic configuration processing method for batch runs based on an MPP database
Abstract
The invention relates to the field of database processing and provides an automatic configuration processing method for batch runs based on an MPP database. It addresses a problem in the prior art: in the batch-run scenario of a risk-control data mart, data problems in upstream systems cause batch tasks to report errors and stop and force batch monitoring and adjustment, so that error handling is laborious and untimely. The main scheme comprises: dividing the batch run into transactions, logging them, and writing the logs into a function_log table; designing a planned error locating and handling system table that records error codes with their corresponding error descriptions and error-handling suggestions; querying the function_log table for exceptions according to the error codes in that system table, and giving an exception prompt, or an error description and error-handling suggestion, for the exception information obtained; and writing the queried exception data into an exception data table by means of the exception query and error-handling suggestion so given.
Inventors
- WANG DALEI
- XU HAO
- TIAN YU
- LAN XIANG
- FANG SHIRU
Assignees
- 武汉众邦银行股份有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20221230
Claims (4)
- 1. An automatic configuration processing method for batch runs based on an MPP database, characterized by comprising the following steps: Step 1, metadata management, in which table structure definitions, table fields and function definitions are managed, and the mapping relation between upstream tables and downstream tables is managed, wherein a function definition refers to a creation statement containing data; Step 2, dividing the batch run into transactions, logging them, and writing the logs into a function_log log table, wherein the batch run comprises a plurality of transactions, each transaction is a database operation statement comprising INSERT, UPDATE and DELETE instructions, logging refers to capturing the prompt information generated by the operation instructions during running, the prompt information comprises time consumption, execution result and abnormal information output, and the prompt information is then written into the function_log table; Step 2 comprises the following steps: Step 2.1, dividing the batch-run transactions to obtain divided transactions: the execution statement of the batch run is divided into independent insert, update and delete statements according to the insert, update and delete operations to obtain the divided transactions; an incrementing variable v_step is defined to represent the step sequence number of a function transaction, and another variable v_execution_content is defined to record the specific processing logic rules in the insert, update and delete statements; Step 2.2, marking each divided transaction with its step sequence number, and writing into the function_log log table the main mapping relation between source table and target table and between source field and target field in the transaction, the processing logic, and the batch-run metadata corresponding to the transaction, wherein the batch-run metadata comprises the batch-run function, batch-run start time, batch-run end time, batch-run step, batch-run logic, batch-run result and batch-run error message; Step 3, designing a planned error locating and handling system table, and recording error codes with their corresponding error descriptions and error-handling suggestions; Step 4, querying the function_log table for exceptions according to the error codes in the planned error locating and handling system table of Step 3, and giving an exception prompt, or an error description and error-handling suggestion, for the exception information obtained; and Step 5, writing the queried abnormal data into an abnormal data table by means of the exception query and error-handling suggestion given in Step 4.
- 2. The automatic configuration processing method for batch runs based on an MPP database according to claim 1, wherein Step 1 comprises the following steps: Step 1.1, creating a table definition metadata table, which comprises a table name field and a table creation statement field corresponding to the table name, and which is used to rebuild the data table corresponding to the table name from the table creation statement; Step 1.2, creating a table field structure metadata table, which comprises a table name field, the field names corresponding to the table name, the field type definitions and the table creation statement field corresponding to the table name, and which is used to query and reconstruct field data problems; and Step 1.3, creating an upstream/downstream table and field mapping table, which comprises a source table, a source field, a target table, a target field and a logic field, wherein the source table and target table, and the source field and target field, form a mapping relation through the processing logic rules stored in the logic field, and the mapping relation is used to query field relationships and analyze anomalies.
- 3. The automatic configuration processing method for batch runs based on an MPP database according to claim 1, wherein Step 3 specifically comprises the following steps: Step 3.1, collecting the abnormal information and listing it in a template format, wherein the error messages include division by zero (a denominator of 0) and over-long field values; Step 3.2, analyzing the data problem behind each item of abnormal information, and writing the most common cause of the problem as prompt information; and Step 3.3, obtaining the most probable cause of the problem, and, based on the abnormal information, writing the abnormal data query procedure and the abnormal data handling procedure into the exception query and handling information.
- 4. The automatic configuration processing method for batch runs based on an MPP database according to claim 1, wherein Step 4 specifically comprises the following steps: Step 4.1, a Java application reads the current day's data from the function_log log table; and Step 4.2, if an anomaly has occurred, locating it, and matching the error code of its error message against the planned error locating and handling system table to obtain an exception prompt, or an error description and error-handling suggestion.
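For illustration only, the flow of Steps 2 to 5 might be sketched as follows in Python, with an in-memory SQLite database standing in for the MPP database. Only function_log, v_step and v_execution_content come from the claims; the tables src_loan, risk_report, error_handling and exception_data, their column layouts, the error codes E001, E002 and E999, and all helper names are assumptions made for this sketch. SQLite returns NULL for division by zero instead of raising an error, so the sketch surfaces a zero denominator through a NOT NULL constraint.

```python
import sqlite3
from datetime import datetime

# Hypothetical schema: a source table, a report table, and the three tables
# the method relies on (log, error-handling lookup, exception data).
SCHEMA = """
CREATE TABLE src_loan (id INTEGER, name TEXT, overdue REAL, total REAL);
CREATE TABLE risk_report (id INTEGER, name TEXT CHECK (length(name) <= 8),
                          overdue_ratio REAL NOT NULL);
CREATE TABLE function_log (function_name TEXT, start_time TEXT, end_time TEXT,
                           step INTEGER, logic TEXT, result TEXT,
                           error_code TEXT, error_message TEXT);
CREATE TABLE error_handling (error_code TEXT PRIMARY KEY, description TEXT,
                             suggestion TEXT);
CREATE TABLE exception_data (log_date TEXT, function_name TEXT, step INTEGER,
                             error_code TEXT, description TEXT, suggestion TEXT);
"""

# Assumed error codes, keyed by the prefix of SQLite's error message.
ERROR_PREFIXES = {"NOT NULL constraint failed": "E001",
                  "CHECK constraint failed": "E002"}


def now():
    return datetime.now().isoformat(sep=" ", timespec="seconds")


def classify(exc):
    """Map a database error to one of the assumed error codes."""
    for prefix, code in ERROR_PREFIXES.items():
        if str(exc).startswith(prefix):
            return code
    return "E999"


def run_batch(conn, function_name, steps):
    """Run each statement as its own transaction and log it (Step 2).

    Returns the failing step number, or None if every step succeeded.
    """
    v_step = 0  # incrementing step sequence number
    for v_execution_content, sql in steps:
        v_step += 1
        start = now()
        result, code, message = "success", None, None
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(sql)
        except sqlite3.Error as exc:
            result, code, message = "failed", classify(exc), str(exc)
        with conn:
            conn.execute(
                "INSERT INTO function_log VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                (function_name, start, now(), v_step, v_execution_content,
                 result, code, message))
        if result == "failed":
            return v_step  # the batch stops at the first failing step
    return None


def diagnose_today(conn):
    """Match today's failed log rows against the lookup table (Steps 4-5)."""
    today = datetime.now().strftime("%Y-%m-%d")
    with conn:
        conn.execute(
            """INSERT INTO exception_data
               SELECT substr(l.start_time, 1, 10), l.function_name, l.step,
                      l.error_code, e.description, e.suggestion
               FROM function_log l
               LEFT JOIN error_handling e ON e.error_code = l.error_code
               WHERE l.result = 'failed'
                 AND substr(l.start_time, 1, 10) = ?""", (today,))
    return conn.execute(
        "SELECT step, error_code, suggestion FROM exception_data").fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO error_handling VALUES (?, ?, ?)", [
        ("E001", "computed value is NULL (denominator may be 0)",
         "check upstream rows whose denominator is 0"),
        ("E002", "field value too long",
         "compare the source field length with the target definition"),
    ])
    conn.execute("INSERT INTO src_loan VALUES (1, 'acct-1', 5.0, 0.0)")
    conn.commit()
    steps = [
        ("clear risk_report", "DELETE FROM risk_report"),
        ("risk_report.overdue_ratio = src_loan.overdue / src_loan.total",
         "INSERT INTO risk_report SELECT id, name, overdue / total FROM src_loan"),
        ("upper-case risk_report.name",
         "UPDATE risk_report SET name = upper(name)"),
    ]
    failed_step = run_batch(conn, "f_risk_report", steps)
    return failed_step, diagnose_today(conn)
```

Calling `demo()` runs a three-step batch whose second step divides by a zero total; the run stops there, the failure is logged with its step number and logic description, and the lookup attaches the stored suggestion to the exception record. Keeping the suggestions in a table rather than in code means new error codes can be added without changing the batch functions.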
Description
Automatic configuration processing method for batch runs based on an MPP database
Technical Field
The invention relates to the field of database processing and provides an automatic configuration processing method for batch runs based on an MPP database.
Background
MPP (massively parallel processing) is a fully shared-nothing architecture: each node (segment) is independent and self-contained, with its own CPU, memory and storage resources and its own SQL engine. The architecture scales well and supports complex structured queries. However, shared-nothing means that data must be partitioned across the cluster: each node is assigned a portion of the data shards, performs SQL parsing and execution without interfering with the others, and returns its result to a central control node (master) for aggregation. Processing a transaction therefore requires all nodes operating in parallel to return their results before the transaction can be committed. In the MPP architecture, distributed operations should be spread evenly across the nodes, and such an even division of work depends on a reasonable data distribution. In the data warehouse and data mart fields, the MPP architecture has become a cost-effective engine for parallel processing and storage of structured data, thanks to its scalability and its support for distributed transactions. In MPP processing logic, data analysis and processing requests are distributed to the individual nodes, and the final computing work is determined by the data on each node, so the data distribution policy ultimately affects the scalability of the cluster and the performance of the distributed architecture. MPP is now widely used in the big data field, particularly for batch computation.
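To make the role of data distribution concrete, the following minimal Python sketch hashes rows onto segments by a distribution key, lets each segment compute a partial sum, and has the master combine the partial results. It is an illustration under stated assumptions, not the behavior of any particular MPP product: the function names, the CRC32 hash, and the row layout are all hypothetical.

```python
import zlib


def distribute(rows, key, n_segments):
    """Assign each row to a segment by hashing its distribution key."""
    segments = [[] for _ in range(n_segments)]
    for row in rows:
        index = zlib.crc32(str(row[key]).encode()) % n_segments
        segments[index].append(row)
    return segments


def segment_sum(segment, field):
    """Work done independently on one segment."""
    return sum(row[field] for row in segment)


def master_sum(segments, field):
    """The master aggregates the partial results from all segments."""
    return sum(segment_sum(segment, field) for segment in segments)


def skew(segments):
    """Largest segment size divided by the mean size (1.0 is perfectly even)."""
    sizes = [len(segment) for segment in segments]
    return max(sizes) / (sum(sizes) / len(sizes))


def demo():
    rows = [{"id": i, "region": "north", "amount": i} for i in range(1000)]
    by_id = distribute(rows, "id", 4)          # high-cardinality key
    by_region = distribute(rows, "region", 4)  # single-valued key
    return master_sum(by_id, "amount"), skew(by_id), skew(by_region)
```

With a single-valued key such as `region` in this toy data, every row lands on one segment, so that segment does all the work while the others sit idle; a high-cardinality key such as `id` spreads the rows and the work across segments. The total is the same either way, which is why a poor distribution key shows up as slowness rather than as a wrong result.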
In MPP application maintenance, however, the following usage scenarios are inconvenient: An important report that has run stably for some time suddenly fails. The cause may be an upstream problem, a detail that was not verified during development, a business scenario that was not considered during development, an upstream data problem, or an upstream data structure change that was not communicated in time; and the failure often occurs outside working hours. After a report task function has been developed, and particularly after it has run stably for some time, it may suddenly report an error in a later batch run. There are many possible causes, and investigation shows that the common ones include a missed batch dependency configuration that leads to a table lock, dirty data generated upstream, and a change to, or even the deletion of, a source table's permissions or table structure definition without downstream being notified. Each error has to be investigated promptly, yet errors are often discovered late or are inconvenient to handle, which delays subsequent batch jobs and slows the whole batch run.
Disclosure of Invention
The invention aims to solve the prior-art problem that, in the batch-run scenario of a risk-control data mart, data problems in upstream systems cause batch tasks to report errors and stop and force batch monitoring and adjustment, resulting in a heavy error-handling workload.
To achieve the above purpose, the invention adopts the following technical scheme: an automatic configuration processing method for batch runs based on an MPP database, comprising the following steps: Step 1, metadata management, in which table structure definitions, table fields and function definitions are managed, and the mapping relation between upstream tables and downstream tables is managed, wherein a function definition refers to a creation statement containing data; Step 2, dividing the batch run into transactions, logging them, and writing the logs into a function_log log table, wherein the batch run comprises a plurality of transactions, each transaction is a database operation statement comprising INSERT, UPDATE and DELETE instructions, logging refers to capturing the prompt information generated by the operation instructions during running, such as time consumption, execution results and abnormal information output, and the prompt information is written into the function_log table; Step 3, designing a planned error locating and handling system table, and recording error codes with their corresponding error descriptions and error-handling suggestions: s