US-12625868-B2 - Updating system configuration data to include objects for machine learning models in a database system

Abstract

A database system is operable to execute a first query to generate a machine learning model from a training set of rows based on accessing and processing the training set of rows via a plurality of operators. System configuration data tracking a plurality of objects of a database system is updated to further track the machine learning model as a corresponding first object tracked via the system configuration data. Query output for a second query indicating applying of the machine learning model is generated via execution of the second query based on applying the machine learning model to a set of rows in accordance with at least one property of the corresponding first object based on accessing the system configuration data.
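
The flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `SystemConfig`, `ModelObject`, `execute_create_model`, and `execute_apply_model` are hypothetical, the model type is fixed to simple linear regression for brevity, and the `available` flag is only one assumed example of a property of the tracked object.

```python
from dataclasses import dataclass


@dataclass
class ModelObject:
    """Hypothetical catalog entry for one trained model."""
    name: str
    model_type: str
    slope: float
    intercept: float
    available: bool = True  # assumed example of a property of the tracked object


class SystemConfig:
    """Hypothetical stand-in for the system configuration data."""

    def __init__(self):
        self.objects = {}  # object name -> tracked object

    def track(self, obj):
        self.objects[obj.name] = obj

    def lookup(self, name):
        return self.objects[name]


def execute_create_model(config, name, training_rows):
    """First query: fit y = slope * x + intercept by least squares, then track the model."""
    n = len(training_rows)
    mean_x = sum(x for x, _ in training_rows) / n
    mean_y = sum(y for _, y in training_rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in training_rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in training_rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    config.track(ModelObject(name, "simple linear regression", slope, intercept))


def execute_apply_model(config, name, xs):
    """Second query: look the model up in the configuration data, then apply it to rows."""
    obj = config.lookup(name)
    if not obj.available:
        raise RuntimeError(f"model {name!r} is not available for query execution")
    return [obj.slope * x + obj.intercept for x in xs]


config = SystemConfig()
execute_create_model(config, "price_model", [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
predictions = execute_apply_model(config, "price_model", [3.0, 4.0])
print(predictions)
```

The point of the sketch is that the second query never receives the model directly; it resolves the model by name through the configuration data and honors a property stored there.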

Inventors

  • Andrei Lougovtsov
  • Jason Arnold
  • Kevin GARNER

Assignees

  • Ocient Holdings LLC

Dates

Publication Date
2026-05-12
Application Date
2024-03-28

Claims (20)

  1. A method for execution within a database system by one or more computing devices, the method comprising: determining a first query for execution that indicates creation of a first machine learning model; executing the first query to generate the first machine learning model from a training set of rows based on accessing and processing the training set of rows via a plurality of operators; updating system configuration data tracking a plurality of objects of a database system to further track the first machine learning model as a corresponding first object tracked via the system configuration data; determining a second query for execution that indicates applying of the first machine learning model; and generating query output for the second query via execution of the second query based on applying the first machine learning model to a set of rows in accordance with at least one property of the corresponding first object based on accessing the system configuration data.
  2. The method of claim 1, wherein the system configuration data indicates a plurality of object groups that collectively includes the plurality of objects, wherein the first corresponding object is mapped to a first object group of the plurality of object groups, and wherein the at least one property of the corresponding first object is based on at least one object group property of the first object group.
  3. The method of claim 1, wherein the corresponding first object is mapped to a first corresponding set of permissions, and wherein the first corresponding set of permissions is enforced to restrict usage of the first machine learning model based on the corresponding first object being mapped to the first corresponding set of permissions, and wherein the at least one property of the corresponding first object indicates the first corresponding set of permissions.
  4. The method of claim 1, wherein the system configuration data indicates state data for a corresponding database system based on indicating ones of the plurality of objects that are available for access in query executions at a corresponding time, further comprising: determining to execute the second query based on determining the first corresponding object is available for access in query executions at the corresponding time, wherein the at least one property of the first corresponding object includes the first corresponding object being available for access in query executions at the corresponding time.
  5. The method of claim 1, wherein the first machine learning model is configured in accordance with a first selected model type of a plurality of machine learning function types, wherein the plurality of machine learning function types includes at least two of: a simple linear regression type; a multiple linear regression type; a polynomial regression type; a linear combination regression type; a K means type; a K Nearest Neighbors type; a logistic regression type; a naive bayes type; a nonlinear regression type; a feedforward network type; a principal component analysis type; a support vector machine type; a decision tree type; a linear discriminant analysis type; a Gaussian mixture model type; or a Sammon mapping type; wherein the first selected model type corresponds to one of: the simple linear regression type; the multiple linear regression type; the polynomial regression type; the linear combination regression type; the K means type; the K Nearest Neighbors type; the logistic regression type; the naive bayes type; the nonlinear regression type; the feedforward network type; the principal component analysis type; the support vector machine type; the decision tree type; the linear discriminant analysis type; the Gaussian mixture model type; or the Sammon mapping type.
  6. A database system comprising: at least one processor, and at least one memory that stores operational instructions that, when executed by the at least one processor, causes the database system to perform operations that include: determining a first query for execution that indicates creation of a first machine learning model; executing the first query to generate the first machine learning model from a training set of rows based on accessing and processing the training set of rows via a plurality of operators; updating system configuration data tracking a plurality of objects of a database system to further track the first machine learning model as a corresponding first object tracked via the system configuration data; determining a second query for execution that indicates applying of the first machine learning model; and generating query output for the second query via execution of the second query based on applying the first machine learning model to a set of rows in accordance with at least one property of the corresponding first object based on accessing the system configuration data.
  7. The database system of claim 6, wherein the system configuration data indicates a plurality of object groups that collectively includes the plurality of objects, wherein the first corresponding object is mapped to a first object group of the plurality of object groups, and wherein the at least one property of the corresponding first object is based on at least one object group property of the first object group.
  8. The database system of claim 6, wherein the corresponding first object is mapped to a first corresponding set of permissions, and wherein the first corresponding set of permissions is enforced to restrict usage of the first machine learning model based on the corresponding first object being mapped to the first corresponding set of permissions, and wherein the at least one property of the corresponding first object indicates the first corresponding set of permissions.
  9. The database system of claim 8, wherein the database system is further operable to: determine to execute the second query based on determining a corresponding second query expression adheres to the first corresponding set of permissions based on the corresponding first object being mapped to the first corresponding set of permissions, wherein the query output is generated for the second query based on determining to execute the second query; determine a third query for execution that indicates applying of the first machine learning model; and determine to not execute the third query based on determining a corresponding third query expression does not adhere to the first corresponding set of permissions based on the corresponding first object being mapped to the first corresponding set of permissions, wherein corresponding query output is not generated for the third query based on determining to not execute the third query.
  10. The database system of claim 8, wherein the first corresponding set of permissions indicates, for each of a set of one or more authorized user entities, whether each of a set of operations performed upon the corresponding first object, wherein the set of operations includes at least one of: executing the corresponding first object in executing a corresponding query, reading the corresponding first object in executing a corresponding query, modifying the corresponding first object in executing a corresponding query, or deleting the corresponding first object in executing a corresponding query.
  11. The database system of claim 8, wherein the database system is further operable to: receive a permission-setting instruction indicating the first corresponding set of permissions and further indicating the corresponding first object; set the first corresponding set of permissions for the corresponding first object in the system configuration data based on processing the permission-setting instruction.
  12. The database system of claim 6, wherein the system configuration data indicates state data for a corresponding database system based on indicating ones of the plurality of objects that are available for access in query executions at a corresponding time, further comprising: determining to execute the second query based on determining the first corresponding object is available for access in query executions at the corresponding time, wherein the at least one property of the first corresponding object includes the first corresponding object being available for access in query executions at the corresponding time.
  13. The database system of claim 6, wherein the first machine learning model is configured in accordance with a first selected model type of a plurality of machine learning function types, wherein the plurality of machine learning function types includes at least two of: a simple linear regression type; a multiple linear regression type; a polynomial regression type; a linear combination regression type; a K means type; a K Nearest Neighbors type; a logistic regression type; a naive bayes type; a nonlinear regression type; a feedforward network type; a principal component analysis type; a support vector machine type; a decision tree type; a linear discriminant analysis type; a Gaussian mixture model type; or a Sammon mapping type; wherein the first selected model type corresponds to one of: the simple linear regression type; the multiple linear regression type; the polynomial regression type; the linear combination regression type; the K means type; the K Nearest Neighbors type; the logistic regression type; the naive bayes type; the nonlinear regression type; the feedforward network type; the principal component analysis type; the support vector machine type; the decision tree type; the linear discriminant analysis type; the Gaussian mixture model type; or the Sammon mapping type.
  14. A non-transitory computer readable storage medium comprises: at least one memory section that stores operational instructions that, when executed by at least one processing module, causes the at least one processing module to perform operations that include: determining a first query for execution that indicates creation of a first machine learning model; executing the first query to generate the first machine learning model from a training set of rows based on accessing and processing the training set of rows via a plurality of operators; updating system configuration data tracking a plurality of objects of a database system to further track the first machine learning model as a corresponding first object tracked via the system configuration data; determining a second query for execution that indicates applying of the first machine learning model; and generating query output for the second query via execution of the second query based on applying the first machine learning model to a set of rows in accordance with at least one property of the corresponding first object based on accessing the system configuration data.
  15. The non-transitory computer readable storage medium of claim 14, wherein the system configuration data indicates a plurality of object groups that collectively includes the plurality of objects, wherein the first corresponding object is mapped to a first object group of the plurality of object groups, and wherein the at least one property of the corresponding first object is based on at least one object group property of the first object group.
  16. The non-transitory computer readable storage medium of claim 14, wherein the corresponding first object is mapped to a first corresponding set of permissions, and wherein the first corresponding set of permissions is enforced to restrict usage of the first machine learning model based on the corresponding first object being mapped to the first corresponding set of permissions, and wherein the at least one property of the corresponding first object indicates the first corresponding set of permissions.
  17. The non-transitory computer readable storage medium of claim 16, wherein the at least one memory section further stores operational instructions that, when executed by the at least one processing module, causes the at least one processing module to: determine to execute the second query based on determining a corresponding second query expression adheres to the first corresponding set of permissions based on the corresponding first object being mapped to the first corresponding set of permissions, wherein the query output is generated for the second query based on determining to execute the second query; determine a third query for execution that indicates applying of the first machine learning model; and determine to not execute the third query based on determining a corresponding third query expression does not adhere to the first corresponding set of permissions based on the corresponding first object being mapped to the first corresponding set of permissions, wherein corresponding query output is not generated for the third query based on determining to not execute the third query.
  18. The non-transitory computer readable storage medium of claim 16, wherein the first corresponding set of permissions indicates, for each of a set of one or more authorized user entities, whether each of a set of operations performed upon the corresponding first object, wherein the set of operations includes at least one of: executing the corresponding first object in executing a corresponding query, reading the corresponding first object in executing a corresponding query, modifying the corresponding first object in executing a corresponding query, or deleting the corresponding first object in executing a corresponding query.
  19. The non-transitory computer readable storage medium of claim 16, wherein the at least one memory section further stores operational instructions that, when executed by the at least one processing module, causes the at least one processing module to: receive a permission-setting instruction indicating the first corresponding set of permissions and further indicating the corresponding first object; set the first corresponding set of permissions for the corresponding first object in the system configuration data based on processing the permission-setting instruction.
  20. The non-transitory computer readable storage medium of claim 14, wherein the system configuration data indicates state data for a corresponding database system based on indicating ones of the plurality of objects that are available for access in query executions at a corresponding time, further comprising: determining to execute the second query based on determining the first corresponding object is available for access in query executions at the corresponding time, wherein the at least one property of the first corresponding object includes the first corresponding object being available for access in query executions at the corresponding time.
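
The permission-related claims (setting permissions on a tracked model object, then executing or declining a query that applies the model) can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation: `TrackedModel`, `SystemConfig`, `set_permissions`, `is_permitted`, and `run_apply_query` are hypothetical names, users are plain strings, and the model is a trivial callable.

```python
from dataclasses import dataclass, field

# Operation names taken from the claims: execute, read, modify, delete.
OPERATIONS = {"execute", "read", "modify", "delete"}


@dataclass
class TrackedModel:
    """Hypothetical tracked object for a model, with its mapped permissions."""
    name: str
    predict: object  # callable: row -> prediction
    permissions: dict = field(default_factory=dict)  # user -> set of allowed operations


class SystemConfig:
    """Hypothetical stand-in for the system configuration data."""

    def __init__(self):
        self.objects = {}

    def track(self, obj):
        self.objects[obj.name] = obj

    def set_permissions(self, object_name, user, operations):
        """Process a permission-setting instruction for one user and one object."""
        unknown = set(operations) - OPERATIONS
        if unknown:
            raise ValueError(f"unknown operations: {sorted(unknown)}")
        self.objects[object_name].permissions[user] = set(operations)

    def is_permitted(self, object_name, user, operation):
        obj = self.objects.get(object_name)
        return obj is not None and operation in obj.permissions.get(user, set())


def run_apply_query(config, user, model_name, rows):
    """Return query output, or None when the query is not executed."""
    if not config.is_permitted(model_name, user, "execute"):
        return None  # query is not executed, so no output is generated
    model = config.objects[model_name]
    return [model.predict(row) for row in rows]


config = SystemConfig()
config.track(TrackedModel("doubler", predict=lambda row: 2 * row))
config.set_permissions("doubler", "alice", {"execute", "read"})
config.set_permissions("doubler", "bob", {"read"})

allowed = run_apply_query(config, "alice", "doubler", [1, 2, 3])
denied = run_apply_query(config, "bob", "doubler", [1, 2, 3])
print(allowed, denied)
```

Here the query issued by `alice` plays the role of the second query that adheres to the permissions, and the query issued by `bob` plays the role of the third query that does not.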

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present U.S. Utility Patent Application claims priority pursuant to 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/498,893, entitled “UPDATING SYSTEM CONFIGURATION DATA TO INCLUDE OBJECTS FOR MACHINE LEARNING MODELS IN A DATABASE SYSTEM”, filed Apr. 28, 2023, which is hereby incorporated herein by reference in its entirety and made part of the present U.S. Utility Patent Application for all purposes.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

Not Applicable.

INCORPORATION-BY-REFERENCE OF MATERIAL SUBMITTED ON A COMPACT DISC

Not Applicable.

BACKGROUND OF THE INVENTION

Technical Field of the Invention

This invention relates generally to computer networking and more particularly to database system and operation.

Description of Related Art

Computing devices are known to communicate data, process data, and/or store data. Such computing devices range from wireless smart phones, laptops, tablets, personal computers (PC), work stations, and video game devices, to data centers that support millions of web searches, stock trades, or on-line purchases every day. In general, a computing device includes a central processing unit (CPU), a memory system, user input/output interfaces, peripheral device interfaces, and an interconnecting bus structure. As is further known, a computer may effectively extend its CPU by using “cloud computing” to perform one or more computing functions (e.g., a service, an application, an algorithm, an arithmetic logic function, etc.) on behalf of the computer. Further, for large services, applications, and/or functions, cloud computing may be performed by multiple cloud computing resources in a distributed manner to improve the response time for completion of the service, application, and/or function. Of the many applications a computer can perform, a database system is one of the largest and most complex applications.
In general, a database system stores a large amount of data in a particular way for subsequent processing. In some situations, the hardware of the computer is a limiting factor regarding the speed at which a database system can process a particular function. In some other instances, the way in which the data is stored is a limiting factor regarding the speed of execution. In yet some other instances, restricted co-process options are a limiting factor regarding the speed of execution.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING(S)

FIG. 1 is a schematic block diagram of an embodiment of a large scale data processing network that includes a database system in accordance with various embodiments;
FIG. 1A is a schematic block diagram of an embodiment of a database system in accordance with various embodiments;
FIG. 2 is a schematic block diagram of an embodiment of an administrative sub-system in accordance with various embodiments;
FIG. 3 is a schematic block diagram of an embodiment of a configuration sub-system in accordance with various embodiments;
FIG. 4 is a schematic block diagram of an embodiment of a parallelized data input sub-system in accordance with various embodiments;
FIG. 5 is a schematic block diagram of an embodiment of a parallelized query and response (Q&R) sub-system in accordance with various embodiments;
FIG. 6 is a schematic block diagram of an embodiment of a parallelized data store, retrieve, and/or process (IO&P) sub-system in accordance with various embodiments;
FIG. 7 is a schematic block diagram of an embodiment of a computing device in accordance with various embodiments;
FIG. 8 is a schematic block diagram of another embodiment of a computing device in accordance with various embodiments;
FIG. 9 is a schematic block diagram of another embodiment of a computing device in accordance with various embodiments;
FIG. 10 is a schematic block diagram of an embodiment of a node of a computing device in accordance with various embodiments;
FIG. 11 is a schematic block diagram of an embodiment of a node of a computing device in accordance with various embodiments;
FIG. 12 is a schematic block diagram of an embodiment of a node of a computing device in accordance with various embodiments;
FIG. 13 is a schematic block diagram of an embodiment of a node of a computing device in accordance with various embodiments;
FIG. 14 is a schematic block diagram of an embodiment of operating systems of a computing device in accordance with various embodiments;
FIGS. 15-23 are schematic block diagrams of an example of processing a table or data set for storage in the database system in accordance with various embodiments;
FIG. 24A is a schematic block diagram of a query execution plan implemented via a plurality of nodes in accordance with various embodiments of the present invention;
FIGS. 24B-24D are schematic block diagrams of embodiments of a node that implements a query processing module in accordance with various embodiments of the present invention;
FIG. 24E is a schematic block diag