US-12625863-B1 - Database-centric operating system for durable workflow

Abstract

An operating system platform built on a transactional datastore is disclosed. The platform comprises a transactional datastore configured to store workflow state information, wherein services of the operating system are included in the transactional datastore. A workflow engine executes a workflow, wherein a portion of the workflow is annotated with decorators specifying infrastructure requirements. The platform includes a provenance database for storing historical versions of data and a scheduler for managing execution of periodic and event-driven tasks.

Inventors

  • Qian Li
  • Michael Coden
  • Peter Kraft
  • Henri Maxime Demoulin
  • Charles Bear
  • Harold Pierson, III
  • Manoj Khangaonkar
  • Alexandre Poliakov
  • Leander Neiss
  • Mike Stonebraker

Assignees

  • DBOS, Inc.

Dates

Publication Date
2026-05-12
Application Date
2024-12-16

Claims (20)

  1. An operating system platform comprising:
     one or more computer processors; and
     a non-transitory computer readable storage medium storing instructions that when executed by the one or more computer processors, cause the one or more computer processors to:
       store, by a transactional datastore, workflow state information, wherein services of an operating system are included in the transactional datastore, the services comprising process scheduling, memory management, and file system operations;
       execute, by a workflow engine, a workflow, wherein a portion of the workflow is annotated with decorators specifying infrastructure requirements, wherein the workflow engine assigns a unique identifier to each instance of executing code to ensure exactly-once execution;
       check, by the workflow engine, for existing execution records before starting the workflow to prevent duplicate executions;
       automatically provision, by the workflow engine, infrastructure resources based on the decorators specifying infrastructure requirements;
       execute, by an execution manager, functions with configurable isolation levels, wherein the functions are database transactions implemented with atomicity, consistency, isolation, and durability properties;
       include, by the execution manager, execution records in a transaction as workflow operations to ensure atomicity;
       generate, by a compiler, transactional datastore stored procedures from high-level procedural language code defining the workflow, wherein the stored procedures are generated in a SQL-based language compatible with the transactional datastore;
       facilitate read/write interaction between application logic and the transactional datastore through an object-relational mapping layer that maps between object-oriented programming constructs and transactional datastore structures;
       store historical versions of data from the transactional datastore in a provenance database;
       capture changes to the transactional datastore using logical replication to populate the provenance database with versioned records;
       communicate, by a messaging system, between workflows, wherein the messaging system provides exactly-once delivery semantics for messages between workflows; and
       generate and utilize idempotency keys, by the workflow engine, when interacting with external systems to prevent duplicate actions in case of retries.
  2. The platform of claim 1, wherein the executional datastore is configured to accept user-defined functions that automatically execute when a specific event occurs in a table, view, or foreign table.
  3. The platform of claim 1, wherein the execution manager is configured to support at least one of read uncommitted, read committed, repeatable read, and serializable isolation levels.
  4. The platform of claim 1, wherein the compiler is configured to generate the stored procedures in a dialect of SQL compatible with the transactional datastore.
  5. The platform of claim 1, wherein at least one of the decorators defines access roles assigned to the portion of the workflow.
  6. The platform of claim 1, further comprising a scheduler configured to manage execution of periodic and event-driven tasks.
  7. The platform of claim 1, further comprising a provenance database configured for storing historical versions of data from the transactional datastore.
  8. The platform of claim 7, wherein the provenance database is configured to store multiple versions of each data record with corresponding transaction identifiers indicating when each version was created or superseded.
  9. The platform of claim 7, wherein the provenance database is configured to capture changes to the transactional datastore using logical replication.
  10. The platform of claim 7, further comprising a time travel proxy configured to transform user queries to retrieve data from the provenance database as of a specified point in time.
  11. The platform of claim 7, wherein the provenance database is configured to store snapshot information mapping transaction identifiers to timestamps.
  12. The platform of claim 7, wherein the provenance database is configured to enable auditing of database interactions by storing information about which transactions modified specific data records.
  13. The platform of claim 7, wherein the provenance database is configured to recover data in the transactional datastore.
  14. The platform of claim 7, wherein the instructions that when executed by the one or more computer processors, cause the one or more computer processors to recall, from the provenance database, historical functions and corresponding transactions for debugging the high-level procedural language code.
  15. The platform of claim 1, further comprising an object-relational mapping (ORM) layer configured to map between object-oriented programming constructs and transactional datastore structures, wherein the ORM layer provides an abstraction for interacting with the transactional datastore using object-oriented programming languages.
  16. The platform of claim 15, wherein the ORM layer comprises at least one of SQLAlchemy, Drizzle, Knex, Prisma, and TypeORM.
  17. The platform of claim 1, wherein at least one of the decorators specifies a maximum number of times the workflow may be automatically recovered.
  18. The platform of claim 1, wherein at least one of the decorators specifies a backoff rate between attempts to retry a step of the workflow.
  19. The platform of claim 1, wherein the functions are database transactions.
  20. The platform of claim 1, wherein the operating system runs on bare metal.
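
Illustrative note (not part of the patent text): claim 1 recites checking for an existing execution record before starting a workflow and including the execution record in the same transaction as the workflow's operations. The sketch below shows one way that general pattern can be expressed. It is a minimal illustration under stated assumptions, not the patented implementation: SQLite stands in for the transactional datastore, and the names `execution_records`, `accounts`, `init_db`, and `run_once` are hypothetical.

```python
import sqlite3
import uuid


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical application table and an execution-record table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts ("
        "name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS execution_records ("
        "workflow_id TEXT PRIMARY KEY, output TEXT NOT NULL)"
    )
    conn.commit()


def run_once(conn: sqlite3.Connection, workflow_id: str,
             name: str, amount: int) -> str:
    """Credit `amount` to `name` at most once for a given workflow_id."""
    # Step 1: look for an existing execution record before doing any work.
    row = conn.execute(
        "SELECT output FROM execution_records WHERE workflow_id = ?",
        (workflow_id,),
    ).fetchone()
    if row is not None:
        return row[0]  # already executed: return the recorded output

    # Step 2: apply the update and write the execution record in ONE
    # transaction. `with conn` commits on success and rolls back on error,
    # so the effect and its record are stored together or not at all.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, name),
        )
        balance = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (name,)
        ).fetchone()[0]
        output = f"balance={balance}"
        conn.execute(
            "INSERT INTO execution_records (workflow_id, output) VALUES (?, ?)",
            (workflow_id, output),
        )
    return output


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
    conn.commit()

    wf_id = str(uuid.uuid4())  # unique identifier for this workflow instance
    print(run_once(conn, wf_id, "alice", 25))  # first attempt applies the credit
    print(run_once(conn, wf_id, "alice", 25))  # retry returns the stored output
```

In this sketch, a retry with the same identifier finds the stored record and returns its output without applying the credit again. The primary key on `workflow_id` is a further guard: a second insert for the same identifier would fail and roll back the accompanying update.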

Description

TECHNICAL FIELD

The present disclosure relates generally to database systems and methods for managing event processing in a database system.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and form a part of the specification, illustrate the embodiments of the invention and together with the written description serve to explain the principles, characteristics, and features of the invention. Various aspects of at least one example are discussed below with reference to the accompanying drawings, which are not intended to be drawn to scale. In the drawings:

FIG. 1 depicts an exemplary block diagram of an illustrative operating system platform in accordance with an embodiment.

FIG. 2 depicts an example flowchart illustrating a method for managing transactions in a versioned database associated with a database-backed application in accordance with an embodiment.

FIG. 3 depicts an illustrative flowchart showing a method for managing data through a primary and provenance database system in accordance with an embodiment.

FIG. 4 depicts an illustrative flowchart of a method for transactional execution of workflows in accordance with an embodiment.

FIG. 5 illustrates a block diagram of a data processing system in which embodiments are implemented.

FIG. 6 depicts example code for migrating provenance schema in accordance with an embodiment.

FIG. 7 depicts example code for time travel query transformation in accordance with an embodiment.

FIG. 8 depicts example code for executing a workflow in accordance with an embodiment.

FIG. 9 depicts example code for executing a transaction in accordance with an embodiment.

DETAILED DESCRIPTION

This disclosure is not limited to the particular systems, devices and methods described, as these may vary. The terminology used in the description is for the purpose of describing the particular versions or embodiments only and is not intended to limit the scope.
As used in this document, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Those having skill in the art can also translate from the plural form to the singular as is appropriate to the context and/or application. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art. Nothing in this disclosure is to be construed as an admission that the embodiments described in this disclosure are not entitled to antedate such disclosure by virtue of prior invention. As used in this document, the term “comprising” means “including, but not limited to.” It will be understood by those within the art that, in general, terms used herein are generally intended as “open” terms (for example, the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” et cetera). While various compositions, methods, and devices are described in terms of “comprising” various components or steps (interpreted as meaning “including, but not limited to”), the compositions, methods, and devices also can “consist essentially of” or “consist of” the various components and steps, and such terminology should be interpreted as defining essentially closed-member groups. In addition, even if a specific number is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number (for example, the bare recitation of “two recitations,” without other modifiers, means at least two recitations, or two or more recitations). 
Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, et cetera” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (for example, “a system having at least one of A, B, and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, et cetera). In those instances where a convention analogous to “at least one of A, B, or C, et cetera” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (for example, “a system having at least one of A, B, or C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, et cetera). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, sample embodiments, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the ph