US-12625878-B2 - Complex dimension functions in pivot aggregations
Abstract
Systems and methods described herein relate to techniques for performing a pivot aggregation on a set of data using intermediate maps and intermediate pages. The intermediate maps and intermediate pages decompose the pivot aggregation into a series of less computationally complex operations that may be performed in parallel. The techniques described herein improve efficiency in database technology by reducing the computational complexity of pivot aggregations.
Inventors
- Christian Bensberg
- Kai Stammerjohann
- Frederik Transier
Assignees
- SAP SE
Dates
- Publication Date
- 20260512
- Application Date
- 20241023
Claims (20)
- 1. A system comprising: at least one hardware processor; and a computer-readable medium storing instructions that, when executed by the at least one hardware processor, cause the at least one hardware processor to perform operations comprising: receiving a first pivot aggregation command to generate a first output table based on an input table; generating a first intermediate hash map based on the first pivot aggregation command and the input table; generating a first intermediate lookup map based on the first pivot aggregation command and query conditions; generating a first intermediate page based on the first intermediate hash map and the first intermediate lookup map; generating the first output table based on the first intermediate page; receiving a second pivot aggregation command to generate a second output table based on the input table, wherein the second pivot aggregation command comprises a different filter condition from the first pivot aggregation command or a different grouping from the first pivot aggregation command; and generating the second output table by reusing the first intermediate hash map and generating a second intermediate lookup map to apply the different filter condition or by reusing the first intermediate lookup map and generating a second intermediate hash map to apply the different grouping.
- 2. The system of claim 1, wherein generating the first intermediate hash map comprises: identifying a group for data entries of the input table based on the first pivot aggregation command; identifying a filter condition value for the data entries of the input table based on the first pivot aggregation command; and aggregating the data entries with a same group and a same filter condition value, the first intermediate hash map comprising entries based on the aggregated data entries.
- 3. The system of claim 2, wherein aggregating the data entries comprises summing numeric measures of the data entries.
- 4. The system of claim 1, wherein the first intermediate hash map comprises a column for each group in the first pivot aggregation command.
- 5. The system of claim 1, wherein generating the first intermediate lookup map comprises: identifying filter conditions based on the first pivot aggregation command; identifying filter condition values of the query conditions; and listing, for each filter condition value, the filter conditions satisfied by the filter condition value.
- 6. The system of claim 1, wherein generating the first intermediate page comprises: for each entry in the first intermediate hash map: identifying a filter condition that satisfies a filter condition value for the entry based on the first intermediate lookup map; and generating a corresponding entry in the first intermediate page based on the entry and the filter condition.
- 7. The system of claim 1, wherein the first intermediate page and the first output table have a same number of columns.
- 8. The system of claim 1, wherein the first intermediate page includes a column for each filter condition in the first pivot aggregation command.
- 9. The system of claim 1, wherein generating the first output table comprises: identifying entries in the first intermediate page that belong to a group based on group by values of the entries; and aggregating the entries that belong to the group.
- 10. The system of claim 1, wherein the first intermediate hash map is generated based on a first iteration through the input table, the first intermediate page is generated based on a second iteration through the first intermediate hash map, and the first output table is generated based on a third iteration through the first intermediate page.
- 11. The system of claim 1, the operations further comprising: generating a second intermediate hash map based on the first pivot aggregation command and the input table, the first intermediate hash map based on a first portion of the input table, the second intermediate hash map based on a second portion of the input table; and generating a second intermediate page, the first intermediate page based on the first intermediate hash map, the second intermediate page based on the second intermediate hash map.
- 12. The system of claim 11, wherein the second intermediate page is generated in parallel with the first intermediate hash map.
- 13. The system of claim 1, wherein the first intermediate hash map and the first intermediate lookup map are generated in parallel.
- 14. The system of claim 1, the operations further comprising: generating the second intermediate hash map based on the second pivot aggregation command and the input table; reusing the first intermediate lookup map based on a compatibility between the first pivot aggregation command and the second pivot aggregation command; and generating a second intermediate page based on the second intermediate hash map and the first intermediate lookup map.
- 15. A method comprising: receiving a first pivot aggregation command to generate a first output table based on an input table; generating a first intermediate hash map based on the first pivot aggregation command and the input table; generating a first intermediate lookup map based on the first pivot aggregation command and query conditions; generating a first intermediate page based on the first intermediate hash map and the first intermediate lookup map; generating the first output table based on the first intermediate page; receiving a second pivot aggregation command to generate a second output table based on the input table, wherein the second pivot aggregation command comprises a different filter condition from the first pivot aggregation command or a different grouping from the first pivot aggregation command; and generating the second output table by reusing the first intermediate hash map and generating a second intermediate lookup map to apply the different filter condition or by reusing the first intermediate lookup map and generating a second intermediate hash map to apply the different grouping.
- 16. The method of claim 15, wherein generating the first intermediate hash map comprises: identifying a group for data entries of the input table based on the first pivot aggregation command; identifying a filter condition value for the data entries of the input table based on the first pivot aggregation command; and aggregating the data entries with a same group and a same filter condition value, the first intermediate hash map comprising entries based on the aggregated data entries.
- 17. The method of claim 15, wherein generating the first intermediate lookup map comprises: identifying filter conditions based on the first pivot aggregation command; identifying filter condition values of the query conditions; and listing, for each filter condition value, the filter conditions satisfied by the filter condition value.
- 18. One or more non-transitory computer-readable media storing computer-executable instructions that, when executed by a computing system, cause the computing system to perform operations comprising: receiving a first pivot aggregation command to generate a first output table based on an input table; generating a first intermediate hash map based on the first pivot aggregation command and the input table; generating a first intermediate lookup map based on the first pivot aggregation command and query conditions; generating a first intermediate page based on the first intermediate hash map and the first intermediate lookup map; generating the first output table based on the first intermediate page; receiving a second pivot aggregation command to generate a second output table based on the input table, wherein the second pivot aggregation command comprises a different filter condition from the first pivot aggregation command or a different grouping from the first pivot aggregation command; and generating the second output table by reusing the first intermediate hash map and generating a second intermediate lookup map to apply the different filter condition or by reusing the first intermediate lookup map and generating a second intermediate hash map to apply the different grouping.
- 19. The one or more non-transitory computer-readable media of claim 18, wherein generating the first intermediate hash map comprises: identifying a group for data entries of the input table based on the first pivot aggregation command; identifying a filter condition value for the data entries of the input table based on the first pivot aggregation command; and aggregating the data entries with a same group and a same filter condition value, the first intermediate hash map comprising entries based on the aggregated data entries.
- 20. The one or more non-transitory computer-readable media of claim 18, wherein generating the first intermediate lookup map comprises: identifying filter conditions based on the first pivot aggregation command; identifying filter condition values of the query conditions; and listing, for each filter condition value, the filter conditions satisfied by the filter condition value.
Description
TECHNICAL FIELD
The subject matter disclosed herein generally relates to database technology. More specifically, but not exclusively, the subject matter described herein relates to data transformation techniques for dimensional data.
BACKGROUND
Various industries use database technology for organizing, storing, and analyzing data. As these industries become more technically advanced, the need for advancements in database technology increases. For example, one way in which database technology has advanced is by providing new functions for analyzing data. However, as various industries become more technically advanced, the data on which they use database technology also grows increasingly complex. This creates technical challenges in the field of database technology, because functions that cannot operate efficiently on data quickly become practically inoperative as the data grows more complex.
BRIEF DESCRIPTION OF THE DRAWINGS
Some examples are shown for purposes of illustration and not limitation in the figures of the accompanying drawings. In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views or examples. It should be understood that additional and alternative examples are possible without departing from the principles of the subject matter described herein.
FIG. 1 diagrammatically illustrates a network environment that includes a dimension function system, according to some examples.
FIG. 2 is a block diagram of components of a dimension function system, according to some examples.
FIG. 3 is a block diagram of an example of performing pivot aggregation operations, according to some examples.
FIG. 4 is a block diagram of an example of performing pivot aggregation operations, according to some examples.
FIG. 5 is a block diagram of an example of performing pivot aggregation operations, according to some examples.
FIG. 6 is a block diagram of an example of performing pivot aggregation operations, according to some examples.
FIG. 7 is a flowchart showing an example method, according to some examples.
FIG. 8 is a block diagram showing a software architecture for a computing device, according to some examples.
FIG. 9 is a block diagram of a machine in the form of a computer system, according to some examples, within which instructions may be executed for causing the machine to perform any one or more of the methodologies discussed herein.
DETAILED DESCRIPTION
In general, pivot aggregations are data transformation techniques that convert data organized in rows of a first table into data organized in columns of a second table. Converting data in this way supports dynamic aggregation, sorting, and filtering of data. For example, pivot aggregations are used to summarize large sets of data, create cross-tabular reports, and facilitate various data analysis functions. In performing a pivot aggregation, a set of conditions, expressed, for example, in case statements, sum statements, count statements, average statements, and the like, is applied to each row of a first table to convert the data in each row into columns of a second table. Because every condition is applied to every row of data in the first table, the pivot aggregation is computationally complex (e.g., N×M complexity, the number of rows times the number of conditions). Thus, pivot aggregations are inefficient and resource-intensive operations, especially for large sets of data. These technical challenges are exacerbated where dimensional functions are involved in pivot aggregations. For example, a set of data with time-based dimensions can include rows with timestamps, or other condition values, that satisfy multiple time-based conditions for a pivot aggregation.
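The conventional row-by-condition evaluation described above can be sketched in a few lines. The table, condition names, and values below are illustrative assumptions, not data from the patent; the point is only that every condition is tested against every row, giving the N×M cost:

```python
# Naive pivot aggregation: each of M conditions is evaluated against each
# of N rows, i.e., N x M condition evaluations. All names are hypothetical.

rows = [
    {"region": "EMEA", "month": "2024-01", "amount": 100},
    {"region": "EMEA", "month": "2024-02", "amount": 150},
    {"region": "APJ",  "month": "2024-01", "amount": 200},
]

# One output column per condition, akin to CASE WHEN month = ... THEN amount.
conditions = {
    "jan": lambda r: r["month"] == "2024-01",
    "feb": lambda r: r["month"] == "2024-02",
}

pivot = {}
for row in rows:                               # N rows
    out = pivot.setdefault(row["region"], {c: 0 for c in conditions})
    for name, cond in conditions.items():      # x M conditions per row
        if cond(row):
            out[name] += row["amount"]

print(pivot)
# {'EMEA': {'jan': 100, 'feb': 150}, 'APJ': {'jan': 200, 'feb': 0}}
```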
Performing the pivot aggregation on this set of data would add further complexity on top of the already computationally complex operation of applying each condition to each row. Thus, database technologies that use dimensional functions in pivot aggregations face technical challenges arising from computational complexity and inefficiency. The subject matter described herein addresses these and other technical challenges in the field of database technologies. As an overview, the subject matter described herein provides techniques for performing a pivot aggregation on a set of data using intermediate maps (e.g., hash maps, lookup maps) and intermediate pages (e.g., local pages). The intermediate maps and the intermediate pages allow the pivot aggregation to be performed with reduced computational complexity by decomposing the computationally complex pivot aggregation operation into less computationally complex operations (e.g., lookup operations). Furthermore, the intermediate maps and the intermediate pages are generated in parallel for improved efficiency, and in some examples, the inter
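The intermediate-map pipeline summarized above (and recited stepwise in claims 2, 5, 6, 9, and 10) can be sketched as follows. This is a minimal illustration under assumed data; the names `hash_map`, `lookup_map`, and `page` are hypothetical, and the sketch is not the claimed implementation:

```python
from collections import defaultdict

# Input rows: (group, filter-condition value, numeric measure).
# Data and condition definitions are illustrative assumptions.
rows = [
    ("EMEA", "2024-01", 100),
    ("EMEA", "2024-01", 50),
    ("EMEA", "2024-02", 150),
    ("APJ",  "2024-01", 200),
]

# A value may satisfy several overlapping (e.g., time-based) conditions.
conditions = {
    "jan": lambda v: v == "2024-01",
    "q1":  lambda v: v in ("2024-01", "2024-02", "2024-03"),
}

# 1) First iteration, through the input table: aggregate (sum) rows that
#    share the same (group, filter-condition value) into an intermediate
#    hash map, collapsing duplicate values before any condition is tested.
hash_map = defaultdict(int)
for group, value, measure in rows:
    hash_map[(group, value)] += measure

# 2) Intermediate lookup map: for each DISTINCT condition value, list the
#    conditions it satisfies. Conditions are evaluated over distinct
#    values only, not over every input row.
lookup_map = {
    value: [name for name, cond in conditions.items() if cond(value)]
    for value in {v for _, v, _ in rows}
}

# 3) Second iteration, through the (smaller) hash map: route each
#    aggregate into the columns of an intermediate page via cheap lookups.
page = defaultdict(lambda: {name: 0 for name in conditions})
for (group, value), total in hash_map.items():
    for name in lookup_map[value]:
        page[group][name] += total

# 4) Third iteration, through the intermediate page: aggregate page
#    entries by their group-by values into the output table.
output = {group: dict(cols) for group, cols in page.items()}
print(output)
# {'EMEA': {'jan': 150, 'q1': 300}, 'APJ': {'jan': 200, 'q1': 200}}
```

Consistent with the reuse path in claim 1, a second pivot command with different filter conditions could keep `hash_map` unchanged and rebuild only `lookup_map` (and the page), while a command with a different grouping could reuse `lookup_map` and rebuild `hash_map`.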