CN-121996711-A - Data analysis device and data analysis method
Abstract
A data analysis device and a data analysis method are provided. The device parses a first total data amount SQL and a first abnormal data amount SQL to obtain a data table identifier, a dimension identifier and an abnormal condition, wherein the data table identifier indicates a target data table, the dimension identifier indicates a target dimension, and the target dimension is used for grouping the target data table into a plurality of groups. The first total data amount SQL and the first abnormal data amount SQL are executed based on the data table identifier, the dimension identifier and the abnormal condition, and a data statistics result is returned. The data statistics result comprises the total data amount and the abnormal data amount of the target data table, and the total data amount and the abnormal data amount corresponding to each of the plurality of groups.
Inventors
- GAO YUE
- SHI XIUTAO
- ZHANG PENG
- SUN CHUNXIAO
- ZHANG JIKUAN
Assignees
- 聚好看科技股份有限公司
Dates
- Publication Date: 20260508
- Application Date: 20251210
Claims (10)
- 1. A data analysis device, comprising: a controller configured to: in response to a save operation on a first total data amount SQL and a first abnormal data amount SQL of a data analysis page, parse the first total data amount SQL and the first abnormal data amount SQL to obtain a data table identifier, a dimension identifier and an abnormal condition, wherein the data table identifier is used for indicating a target data table, the dimension identifier is used for indicating a target dimension, and the target dimension is used for grouping the target data table to obtain a plurality of groups; and execute the first total data amount SQL and the first abnormal data amount SQL based on the data table identifier, the dimension identifier and the abnormal condition, and return a data statistics result, wherein the data statistics result comprises the total data amount and the abnormal data amount of the target data table, and the total data amount and the abnormal data amount respectively corresponding to the plurality of groups.
- 2. The data analysis device of claim 1, wherein the controller is specifically configured to: generate a total group number SQL based on the data table identifier and the dimension identifier; execute the total group number SQL and return the number of groups in the plurality of groups; and in a case where the number of groups is less than or equal to a group number threshold, execute the first total data amount SQL and the first abnormal data amount SQL based on the data table identifier, the dimension identifier and the abnormal condition, and return the data statistics result.
- 3. The data analysis device of claim 1, wherein the controller is further configured to: in response to the save operation, display target prompt information in a case where a parsed target SQL does not include the dimension identifier, wherein the target SQL comprises the first total data amount SQL and/or the first abnormal data amount SQL, and the target prompt information is used for prompting input of the dimension identifier.
- 4. The data analysis device of claim 1, wherein the data analysis page includes a group analysis option, and the controller is further configured to: in response to a triggering operation of the group analysis option, enable a group analysis function; and receive the save operation in a case where the group analysis function is enabled.
- 5. The data analysis device of claim 1, wherein the data analysis page comprises a first input area and a second input area, and the controller is further configured to: in response to a trigger operation of inputting the dimension identifier in the first input area, display the first total data amount SQL in the first input area; and in response to a trigger operation of inputting the dimension identifier in the second input area, display the first abnormal data amount SQL in the second input area.
- 6. The data analysis device of claim 5, wherein: the first input area is provided with a first option, and the controller is further configured to display first prompt information in response to a triggering operation of the first option, wherein the first prompt information is used for indicating a writing example and/or writing notes for the total data amount SQL; and/or the second input area is provided with a second option, and the controller is further configured to display second prompt information in response to a triggering operation of the second option, wherein the second prompt information is used for indicating a writing example and/or writing notes for the abnormal data amount SQL.
- 7. The data analysis device of any one of claims 1 to 6, wherein the data analysis page further comprises a third input area, the third input area displays a first abnormal data SQL for obtaining abnormal data in the target data table, and the controller is further configured to: determine a plurality of target groups based on the abnormal data amounts respectively corresponding to the plurality of groups, wherein the plurality of target groups are groups whose abnormal data amount is greater than 0; generate a plurality of second abnormal data SQL based on grouping information of the plurality of target groups and the first abnormal data SQL, wherein the plurality of second abnormal data SQL are respectively used for obtaining abnormal data in the plurality of target groups; and execute the plurality of second abnormal data SQL and return the abnormal data respectively corresponding to the plurality of target groups.
- 8. The data analysis device of claim 7, wherein the controller is further configured to: display an anomaly analysis list, wherein the anomaly analysis list comprises a plurality of data entries respectively corresponding to the plurality of groups, each data entry comprises the total data amount of the corresponding group, the abnormal data amount of the corresponding group and a first operation item, and the first operation item is used for displaying the abnormal data of the corresponding group; wherein the data entry further comprises at least one of a data accuracy rate of the corresponding group, an abnormal data proportion of the corresponding group and a second operation item of the corresponding group, and the second operation item is used for displaying the abnormal data SQL of the corresponding group.
- 9. The data analysis device of claim 7, wherein the controller is specifically configured to: determine the plurality of target groups based on the abnormal data amounts respectively corresponding to the plurality of groups in a case where the total abnormal data amount in the target data table is greater than or equal to an abnormal quantity threshold.
- 10. A data analysis method, applied to a data analysis device, comprising: in response to a save operation on a first total data amount SQL and a first abnormal data amount SQL of a data analysis page, parsing the first total data amount SQL and the first abnormal data amount SQL to obtain a data table identifier, a dimension identifier and an abnormal condition, wherein the data table identifier is used for indicating a target data table, the dimension identifier is used for indicating a target dimension, and the target dimension is used for grouping the target data table to obtain a plurality of groups; and executing the first total data amount SQL and the first abnormal data amount SQL based on the data table identifier, the dimension identifier and the abnormal condition, and returning a data statistics result, wherein the data statistics result comprises the total data amount and the abnormal data amount of the target data table, and the total data amount and the abnormal data amount respectively corresponding to the plurality of groups.
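The per-group drill-down of claim 7 can be illustrated with a minimal Python sketch. It assumes the first abnormal data SQL is a plain SELECT over the target table whose result still contains the dimension column. The function name `build_group_abnormal_queries`, the `orders` table and its columns are hypothetical, and wrapping the statement in a subquery is only one possible way to derive the second abnormal data SQL, not necessarily the one used by the applicant.

```python
import sqlite3


def build_group_abnormal_queries(first_abnormal_sql, dimension, abnormal_counts):
    """Derive one parameterized query per group whose abnormal amount is > 0.

    Assumption: the dimension column is present in the result of
    first_abnormal_sql, so it can be filtered from an outer query.
    """
    base = first_abnormal_sql.strip().rstrip(";")
    queries = {}
    for group, count in abnormal_counts.items():
        if count <= 0:
            continue  # only target groups (abnormal amount > 0) are kept
        if group is None:
            # NULL group values cannot be matched with "=", use IS NULL
            queries[group] = (f"SELECT * FROM ({base}) WHERE {dimension} IS NULL", ())
        else:
            queries[group] = (f"SELECT * FROM ({base}) WHERE {dimension} = ?", (group,))
    return queries


def make_demo_connection():
    """Hypothetical in-memory table used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("north", 10), ("north", -5), ("south", 7), ("east", -1)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_connection()
    queries = build_group_abnormal_queries(
        "SELECT * FROM orders WHERE amount < 0",
        "region",
        {"north": 1, "south": 0, "east": 1},
    )
    for group, (sql, params) in queries.items():
        print(group, conn.execute(sql, params).fetchall())
```

Passing the group value as a bound parameter avoids quoting problems when group values contain special characters. The dimension name is interpolated directly and would need validation in real use.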
Description
Data analysis device and data analysis method
Technical Field
Embodiments of the present disclosure relate to data analysis techniques, and more particularly, to a data analysis device and a data analysis method.
Background
In the field of big data platforms, data quality rules are a key part of data management and analysis for verifying the accuracy and usability of data. Data quality checking covers steps such as users submitting quality rules and scheduled production tasks periodically scoring the target database tables. Quality rules are divided into template rules and user-defined rules. User-defined rules require the user to write custom Structured Query Language (SQL) statements. In general, user-defined SQL supports only global abnormal statistics (such as the total abnormal amount). When the abnormal amount is large, it may not be possible to obtain all of the abnormal data, and the user may need to manually group and count the abnormal data for further analysis, which makes operation inefficient.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, embodiments of the present disclosure provide a data analysis device and a data analysis method.
In a first aspect, an embodiment of the present disclosure provides a data analysis device, including a controller configured to: in response to a save operation on a first total data amount SQL and a first abnormal data amount SQL of a data analysis page, parse the first total data amount SQL and the first abnormal data amount SQL to obtain a data table identifier, a dimension identifier, and an abnormal condition, where the data table identifier is used to indicate a target data table, the dimension identifier is used to indicate a target dimension, and the target dimension is used to group the target data table to obtain a plurality of groups; and execute the first total data amount SQL and the first abnormal data amount SQL based on the data table identifier, the dimension identifier, and the abnormal condition, and return a data statistics result, where the data statistics result includes the total data amount and the abnormal data amount of the target data table, and the total data amount and the abnormal data amount respectively corresponding to the plurality of groups.
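The parse-then-execute flow described above can be sketched in Python with the standard `sqlite3` module. This is a minimal illustration under stated assumptions, not the applicant's implementation: the text does not specify the exact form of the user-written SQL, so the sketch assumes the shape `SELECT <dim>, COUNT(*) FROM <table> [WHERE <condition>] GROUP BY <dim>`, and the function names, the `orders` table and its columns are hypothetical. In this sketch the parsed identifiers are simply returned; how they are used during execution is not detailed in this passage.

```python
import re
import sqlite3

# Assumed shape of the user-written SQL (not specified in the source text):
#   SELECT <dim>, COUNT(*) FROM <table> [WHERE <condition>] GROUP BY <dim>
_SQL_SHAPE = re.compile(
    r"FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<cond>.+?))?"
    r"\s+GROUP\s+BY\s+(?P<dim>\w+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_custom_sql(total_sql, abnormal_sql):
    """Return (table identifier, dimension identifier, abnormal condition)."""
    total = _SQL_SHAPE.search(total_sql)
    abnormal = _SQL_SHAPE.search(abnormal_sql)
    if total is None or abnormal is None:
        raise ValueError("SQL does not match the assumed shape")
    if abnormal["cond"] is None:
        raise ValueError("abnormal data amount SQL has no WHERE condition")
    return total["table"], total["dim"], abnormal["cond"]


def run_statistics(conn, total_sql, abnormal_sql):
    """Execute both statements and merge overall and per-group counts."""
    totals = dict(conn.execute(total_sql).fetchall())
    abnormal = dict(conn.execute(abnormal_sql).fetchall())
    return {
        "total": sum(totals.values()),
        "abnormal": sum(abnormal.values()),
        # group value -> (total data amount, abnormal data amount)
        "groups": {g: (n, abnormal.get(g, 0)) for g, n in totals.items()},
    }


def make_demo_connection():
    """Hypothetical in-memory table used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("north", 10), ("north", -5), ("south", 7), ("south", 3), ("east", -1)],
    )
    return conn


TOTAL_SQL = "SELECT region, COUNT(*) FROM orders GROUP BY region"
ABNORMAL_SQL = "SELECT region, COUNT(*) FROM orders WHERE amount < 0 GROUP BY region"

if __name__ == "__main__":
    print(parse_custom_sql(TOTAL_SQL, ABNORMAL_SQL))
    print(run_statistics(make_demo_connection(), TOTAL_SQL, ABNORMAL_SQL))
```

A regular expression is enough for this fixed shape. Arbitrary user SQL (joins, subqueries, quoted identifiers) would need a real SQL parser.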
In the embodiment of the disclosure, the custom SQL (the first total data amount SQL and the first abnormal data amount SQL) is parsed to obtain the dimension identifier, the data table identifier and the abnormal condition it contains, and the custom SQL is then executed based on the dimension identifier, the data table identifier and the abnormal condition to obtain the data statistics result (including the total data amount and the abnormal data amount of the target data table, and the total data amount and the abnormal data amount respectively corresponding to the plurality of groups obtained by grouping on the dimension). In this way, the user only needs to write custom SQL that includes the dimension identifier, and the device parses and executes it, so the user does not need to write grouping SQL manually and data analysis efficiency can be improved. When the abnormal data amount is large, the abnormal data can be displayed separately by group, which covers diverse abnormal situations, and the user does not need to download the full data for local analysis, which further improves analysis efficiency.
In some embodiments of the present disclosure, the controller is specifically configured to: generate a total group number SQL based on the data table identifier and the dimension identifier; execute the total group number SQL and return the number of groups in the plurality of groups; and, if the number of groups is less than or equal to a group number threshold, execute the first total data amount SQL and the first abnormal data amount SQL based on the data table identifier, the dimension identifier and the abnormal condition, and return the data statistics result. In this way, the number of groups can be controlled through the total group number SQL, which prevents an excessive amount of grouped data from increasing the pressure on local memory and the database.
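The group-number check described above can be sketched as follows. The sketch is an assumption-laden illustration: the generated statement `SELECT COUNT(*) FROM (SELECT <dim> FROM <table> GROUP BY <dim>)` is one plausible form of the total group number SQL, and the threshold value, function names and the `orders` table are hypothetical.

```python
import re
import sqlite3

GROUP_NUMBER_THRESHOLD = 3  # hypothetical value; the source gives no number

_IDENTIFIER = re.compile(r"^[A-Za-z_]\w*$")


def build_group_number_sql(table, dimension):
    """Generate a statement that counts the groups produced by the dimension."""
    for name in (table, dimension):
        # Identifiers are interpolated into SQL, so reject anything unusual.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"SELECT COUNT(*) FROM "
        f"(SELECT {dimension} FROM {table} GROUP BY {dimension})"
    )


def check_group_number(conn, table, dimension, threshold=GROUP_NUMBER_THRESHOLD):
    """Return (number of groups, whether grouped statistics may be executed)."""
    (count,) = conn.execute(build_group_number_sql(table, dimension)).fetchone()
    return count, count <= threshold


def make_demo_connection():
    """Hypothetical in-memory table used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("north", 10), ("north", -5), ("south", 7), ("east", -1)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_connection()
    print(check_group_number(conn, "orders", "region"))
    print(check_group_number(conn, "orders", "region", threshold=2))
```

Counting over a grouped subquery, rather than using `COUNT(DISTINCT ...)`, also counts a NULL dimension value as its own group, which matches how `GROUP BY` would later split the table.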
In some embodiments of the present disclosure, the controller is further configured to, in response to the save operation, display target prompt information if the parsed target SQL does not include the dimension identifier, where the target SQL includes the first total data amount SQL and/or the first abnormal data amount SQL, and the target prompt information is used to prompt input of the dimension identifier. In the embodiment of the disclosure, by checking, in response to the save operation, whether the dimension identifier exists in the first total data amount SQL and the first abnormal data amount SQL, the user can be prompted to input the dimension ident