CN-121997010-A - Data analysis method, device, related equipment and computer program product
Abstract
The application discloses a data analysis method, a data analysis device, related equipment and a computer program product, and relates to the field of data processing. The method comprises the steps of extracting aggregation granularity of data analysis from a data analysis request, converting a time field corresponding to each piece of data to be analyzed in original data into an aggregation key value corresponding to the aggregation granularity, executing aggregation operation on all pieces of data to be analyzed in the original data according to the aggregation key value corresponding to each piece of data to be analyzed to obtain an aggregation result corresponding to each aggregation key value, and carrying out data analysis on the aggregation results corresponding to all aggregation key values according to the data analysis request to obtain a data analysis result. The application can execute data analysis operation on the basis of data with complete semantics, consistent logic and accurate grouping, and remarkably improves the accuracy and reliability of data analysis results.
Inventors
- LUO QIYUAN
- PENG JUE
- LIU MENGMENG
- LIANG FENG
Assignees
- 帆软软件有限公司
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-28
Claims (10)
- 1. A data analysis method, comprising: acquiring a data analysis request for original data, wherein the original data comprises a plurality of pieces of data to be analyzed and a time field corresponding to each piece of data to be analyzed; extracting an aggregation granularity of the data analysis from the data analysis request, the aggregation granularity comprising at least one time unit; converting the time field corresponding to each piece of data to be analyzed in the original data into an aggregation key value corresponding to the aggregation granularity, wherein the aggregation key value comprises at least one compound field, and each compound field is a combination of a time unit and a time value corresponding to that time unit; performing, according to the aggregation key value corresponding to each piece of data to be analyzed, an aggregation operation at the aggregation granularity on all pieces of data to be analyzed in the original data, to obtain an aggregation result corresponding to each aggregation key value; and performing data analysis on the aggregation results corresponding to all the aggregation key values according to the data analysis request, to obtain a data analysis result.
- 2. The data analysis method according to claim 1, wherein performing data analysis on the aggregation results corresponding to all the aggregation key values according to the data analysis request to obtain a data analysis result comprises: determining the priority of each time unit according to the unit hierarchy among the time units contained in the aggregation key value, wherein the higher the level of a time unit in the unit hierarchy, the higher its priority; comparing, in descending order of time-unit priority and starting from the time unit with the highest priority, the time values corresponding to the same time unit across all the aggregation key values level by level, to obtain a chronological ordering of the aggregation key values; and performing data analysis on the aggregation results corresponding to all the aggregation key values according to the chronological ordering, to obtain the data analysis result.
- 3. The data analysis method according to claim 2, wherein the data analysis includes contemporaneous (same-period) data analysis, and performing data analysis on the aggregation results corresponding to all the aggregation key values according to the chronological ordering to obtain a data analysis result comprises: extracting an analysis step of the contemporaneous data analysis from the data analysis request, wherein the time unit of the analysis step is the same as at least one time unit in the aggregation key values; shifting, by the analysis step, the time value of the time unit that matches the analysis step in the aggregation key value corresponding to each aggregation result, to obtain a contemporaneous aggregation result corresponding to each aggregation result; and performing contemporaneous data analysis according to each aggregation result and its corresponding contemporaneous aggregation result, to obtain the data analysis result.
- 4. The data analysis method according to any one of claims 1 to 3, wherein converting the time field corresponding to each piece of data to be analyzed in the original data into an aggregation key value corresponding to the aggregation granularity comprises: calling a conversion function that matches both the type of the time field corresponding to the data to be analyzed and the aggregation granularity, and mapping the original value of the time field corresponding to each piece of data to be analyzed into the structured aggregation key value.
- 5. The data analysis method according to any one of claims 1 to 3, further comprising: storing the aggregation key value in association with the corresponding data to be analyzed.
- 6. A data analysis device, comprising: a request acquisition unit, configured to acquire a data analysis request for original data, wherein the original data comprises a plurality of pieces of data to be analyzed and a time field corresponding to each piece of data to be analyzed; a granularity extraction unit, configured to extract an aggregation granularity of the data analysis from the data analysis request, the aggregation granularity comprising at least one time unit; a field conversion unit, configured to convert the time field corresponding to each piece of data to be analyzed in the original data into an aggregation key value corresponding to the aggregation granularity, wherein the aggregation key value comprises at least one compound field, and each compound field is a combination of a time unit and a time value corresponding to that time unit; a data aggregation unit, configured to perform, according to the aggregation key value corresponding to each piece of data to be analyzed, an aggregation operation at the aggregation granularity on all pieces of data to be analyzed in the original data, to obtain an aggregation result corresponding to each aggregation key value; and a data analysis unit, configured to perform data analysis on the aggregation results corresponding to all the aggregation key values according to the data analysis request, to obtain a data analysis result.
- 7. The data analysis device according to claim 6, wherein the data analysis unit includes: a priority marking subunit, configured to determine the priority of each time unit according to the unit hierarchy among the time units contained in the aggregation key value, wherein the higher the level of a time unit in the unit hierarchy, the higher its priority; a key value sorting subunit, configured to compare, in descending order of time-unit priority and starting from the time unit with the highest priority, the time values corresponding to the same time unit across all the aggregation key values level by level, to obtain a chronological ordering of the aggregation key values; and a data analysis subunit, configured to perform data analysis on the aggregation results corresponding to all the aggregation key values according to the chronological ordering, to obtain a data analysis result.
- 8. An electronic device, comprising a memory and a processor, wherein the memory is configured to store a program, and the processor is configured to execute the program to implement the steps of the data analysis method according to any one of claims 1 to 7.
- 9. A readable storage medium having stored thereon a computer program, which, when executed by a processor, implements the steps of the data analysis method according to any of claims 1-7.
- 10. A computer program product comprising a computer program which, when executed by a processor, implements the steps of the data analysis method as claimed in any one of claims 1 to 7.
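The contemporaneous analysis of claim 3 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the applicant's implementation: the names shift_key and period_over_period are hypothetical, the shift is assumed to point backwards in time, the comparison is assumed to be a growth rate, and no carry between units is handled (shifting month 1 back by one month would need extra logic that the claim does not spell out).

```python
def shift_key(agg_key, step_unit, step):
    """Return a copy of agg_key with the value of step_unit moved back by step.

    agg_key is a tuple of (time unit, time value) compound fields.
    Hypothetical helper; carry between units (e.g. month 1 -> month 12 of
    the previous year) is deliberately not handled in this sketch.
    """
    return tuple((u, v - step) if u == step_unit else (u, v) for u, v in agg_key)


def period_over_period(results, step_unit="year", step=1):
    """Pair each aggregation result with its contemporaneous result.

    Returns the growth rate per key, or None when the earlier period is
    missing or zero (which also avoids division by zero).
    """
    growth = {}
    for key, value in results.items():
        previous = results.get(shift_key(key, step_unit, step))
        growth[key] = (value - previous) / previous if previous else None
    return growth


# Aggregation results keyed by structured year-month keys (made-up numbers).
results = {
    (("year", 2023), ("month", 5)): 100.0,
    (("year", 2024), ("month", 5)): 120.0,
}
growth = period_over_period(results, step_unit="year", step=1)
```

Because the key keeps the year and the month as separate compound fields, "same month of the previous year" is a change to a single field rather than date arithmetic on a string or an integer.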
Description
Data analysis method, device, related equipment and computer program product

Technical Field

The present application relates to the field of data processing technology, and more particularly to a data analysis method, device, related equipment, and computer program product.

Background

In the field of data analysis, multi-level, multi-granularity analysis along the time dimension has become a core requirement. For example, business analysis frequently involves "year-month" values (e.g., May 2024) and "year-week" values (e.g., week 20 of 2024), which are essentially composite values jointly defined by several interrelated values.

However, the underlying data models of current mainstream data storage and computing systems are generally built on data types that hold a single numeric (or text) value, such as integers, floating-point numbers, and strings; a date, for example, is represented as "20251206" or "2025-12-16". Such a representation cannot directly carry multi-dimensional, composite time semantics such as "year-month" or "year-week number". This inevitably leads to a mismatch between the time representation of the underlying data and the time logic required by business analysis, which in turn makes it difficult to perform time-dimension analysis that is accurate, coherent, and consistent with business requirements.

Disclosure of Invention

In view of the foregoing, the present application provides a data analysis method, device, related equipment, and computer program product, so as to improve the accuracy of multi-dimensional data analysis.
The specific scheme is as follows. In a first aspect, the present application provides a data analysis method, comprising:
acquiring a data analysis request for original data, wherein the original data comprises a plurality of pieces of data to be analyzed and a time field corresponding to each piece of data to be analyzed;
extracting an aggregation granularity of the data analysis from the data analysis request, the aggregation granularity comprising at least one time unit;
converting the time field corresponding to each piece of data to be analyzed in the original data into an aggregation key value corresponding to the aggregation granularity, wherein the aggregation key value comprises at least one compound field, and each compound field is a combination of a time unit and a time value corresponding to that time unit;
performing, according to the aggregation key value corresponding to each piece of data to be analyzed, an aggregation operation at the aggregation granularity on all pieces of data to be analyzed in the original data, to obtain an aggregation result corresponding to each aggregation key value; and
performing data analysis on the aggregation results corresponding to all the aggregation key values according to the data analysis request, to obtain a data analysis result.
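The conversion and aggregation steps of the first aspect can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the method as implemented by the applicant: the function names, the CONVERTERS table, the field names "time" and "amount", the use of summation as the aggregation operation, and the ISO 8601 week convention are all hypothetical choices that the source does not specify.

```python
from collections import defaultdict
from datetime import date


def to_year_month(d: date):
    """Map a date to a structured key of (time unit, time value) compound fields."""
    return (("year", d.year), ("month", d.month))


def to_year_week(d: date):
    """Same idea for year-week; ISO 8601 week numbering is an assumption."""
    iso = d.isocalendar()  # index 0 is the ISO year, index 1 the ISO week
    return (("year", iso[0]), ("week", iso[1]))


# Hypothetical lookup from an aggregation granularity to a conversion function.
CONVERTERS = {
    ("year", "month"): to_year_month,
    ("year", "week"): to_year_week,
}


def aggregate(rows, granularity, time_field="time", value_field="amount"):
    """Group rows by their aggregation key value and sum value_field per key."""
    convert = CONVERTERS[tuple(granularity)]
    totals = defaultdict(float)
    for row in rows:
        key = convert(row[time_field])
        totals[key] += row[value_field]
    return dict(totals)


# Made-up rows: two in May 2024, one in June 2024.
rows = [
    {"time": date(2024, 5, 3), "amount": 10.0},
    {"time": date(2024, 5, 20), "amount": 5.0},
    {"time": date(2024, 6, 1), "amount": 7.0},
]
result = aggregate(rows, ["year", "month"])
```

Here each aggregation key value keeps the unit next to its value, so a key such as (("year", 2024), ("month", 5)) carries its own semantics instead of relying on the layout of a string or an integer.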
In another implementation of the first aspect of the embodiments of the present application, performing data analysis on the aggregation results corresponding to all the aggregation key values according to the data analysis request to obtain a data analysis result comprises:
determining the priority of each time unit according to the unit hierarchy among the time units contained in the aggregation key value, wherein the higher the level of a time unit in the unit hierarchy, the higher its priority;
comparing, in descending order of time-unit priority and starting from the time unit with the highest priority, the time values corresponding to the same time unit across all the aggregation key values level by level, to obtain a chronological ordering of the aggregation key values; and
performing data analysis on the aggregation results corresponding to all the aggregation key values according to the chronological ordering, to obtain the data analysis result.
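The priority-based ordering described above can be sketched in Python as follows. The UNIT_LEVEL table and the function names are hypothetical, and the numeric levels are illustrative; the source only states that a time unit higher in the hierarchy receives a higher priority.

```python
# Assumed unit hierarchy: a larger number means a higher level and priority.
UNIT_LEVEL = {"year": 5, "quarter": 4, "month": 3, "week": 2, "day": 1}


def sort_key(agg_key):
    """Order the compound fields from highest to lowest priority and return
    their time values, so that tuple comparison checks the highest-priority
    unit first and only falls through to lower units on ties."""
    fields = sorted(agg_key, key=lambda field: UNIT_LEVEL[field[0]], reverse=True)
    return tuple(value for _, value in fields)


def chronological(keys):
    """Return the aggregation key values in chronological order."""
    return sorted(keys, key=sort_key)


keys = [
    (("year", 2025), ("month", 1)),
    (("month", 12), ("year", 2024)),  # field order inside a key does not matter
    (("year", 2024), ("month", 2)),
]
ordered = chronological(keys)
```

In this sketch February 2024 sorts before December 2024, whereas a plain text comparison of "2024-12" and "2024-2" would place December first.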
In another implementation of the first aspect of the embodiments of the present application, the data analysis includes contemporaneous (same-period) data analysis, and performing data analysis on the aggregation results corresponding to all the aggregation key values according to the chronological ordering to obtain a data analysis result comprises: extracting an analysis step of the contemporaneous data analysis from the data analysis request, wherein the time unit of the analysis step is the same as at least one time unit in the aggregation key values; shifting, by the analysis step, the time value of the time unit that matches the analysis step in the aggregation key value corresponding to each aggregation result, to obtain a contemporaneous aggregation result corresponding to each aggregation result; and performing contemporaneous data analysis according to each aggregation result and its corresponding contemporaneous aggregation result to obtai