EP-4740140-A2 - DOUBLY-EXPONENTIALLY ACCELERATED PARTICLE METHODS AND SYSTEMS FOR NONLINEAR CONTROL
Abstract
Aspects herein describe new methods of determining optimal actions to achieve high-level objectives based on an optimized chosen statistic of a distribution of future cost. At least one high-level objective, along with various observational data about the world, is identified by a computational unit. The computational unit determines, through a particle method, an optimal course of action. The particle method is doubly-exponentially accelerated based on one or more acceleration methods. The doubly-exponentially accelerated particle method comprises alternating backward and forward sweeps of a coupled induction loop that optimize a selection policy and test for convergence, thereby determining said optimal course of action. In one embodiment, a user inputs a high-level objective into a cell phone, which senses observational data. The cell phone communicates with a server that provides instructions. The server determines an optimal course of action via the doubly-exponentially accelerated particle method, and the cell phone then displays the instructions to the user.
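As a rough illustration of the alternating-sweep structure summarized in the abstract, the following is a minimal sketch assuming hypothetical backward_sweep and forward_sweep callables and an array-valued selection policy (all names are illustrative assumptions, not the disclosed acceleration):

```python
import numpy as np

def optimize_policy(policy, belief, backward_sweep, forward_sweep,
                    max_iters=100, tol=1e-6):
    """Alternate backward and forward sweeps of a coupled induction loop until
    the selection policy converges (hypothetical sketch of the loop shape
    described in the abstract, not the patented method itself)."""
    for _ in range(max_iters):
        # Backward sweep: re-optimize the selection policy against the chosen
        # statistic of the distribution of future cost, given the current belief.
        new_policy = backward_sweep(policy, belief)
        # Forward sweep: propagate the uncertainty about the unknown world
        # state forward in time under the updated policy.
        belief = forward_sweep(new_policy, belief)
        # Convergence test on the policy parameters.
        if np.max(np.abs(new_policy - policy)) < tol:
            return new_policy, belief
        policy = new_policy
    return policy, belief
```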
Inventors
- BURCHARD, Paul
Assignees
- Artificial Genius Inc.
Dates
- Publication Date
- 2026-05-13
- Application Date
- 2024-07-03
Claims (20)
What is claimed is:
- 1. A method comprising: receiving, by a sensor, observational information; maintaining, by a computational unit, an objective corresponding to the objective information; maintaining, by the computational unit, a current uncertainty about an unknown state, wherein the current uncertainty is updated using the observational information; generating a selection policy, wherein the selection policy comprises one or more parameters for determining optimal actions to achieve the objective with an optimized chosen statistic of a distribution of future cost; and determining, by the computational unit and based on the selection policy, one or more optimal actions to achieve the objective as an optimal value of the optimized chosen statistic, wherein said determining comprises performing both backward induction on the optimized chosen statistic and forward induction on the uncertainty about the unknown state.
- 2. The method of claim 1, further comprising: determining a first number t, which represents a future time; determining a first vector x, which represents an unknown state of a real or simulated world at time t; determining a second vector y, which represents an observable state at time t; determining a first function M(x), which is a measurement function corresponding to the second vector y; and determining a cost function corresponding to the observable state, wherein maintaining the objective comprises defining the objective based on the first number t, the first vector x, the second vector y, the first function M(x), and the cost function.
- 3. The method of claim 2, further comprising: determining a sequence of vectors i, wherein the sequence of vectors i represents one or more historical observable states; determining, based on the first vector x and the sequence of vectors i, lifted dynamics of the selection policy; determining, based on the lifted dynamics of the selection policy, an uninformed probability distribution p(t), wherein the uninformed probability distribution p(t) corresponds to one or more initial probability distributions; and deriving, based on the uninformed probability distribution, the forward induction.
- 4. The method of claim 3, further comprising: determining a value function v(t)(x,i), wherein an expectation of the value function v(t)(x,i) is defined by: E[v(t)] = E_{p(t)}[ x → v(t)(x, i(t)) ] = E_{x(t)}[ v(t) | i(t) ]; inputting the value function v(t)(x,i) into a global backward induction yielding: E_{x(t)}[ v(t)(x, i) | i(t) ] = min_{a(t)(i(t))} { cost(x(t), a(t)(i(t))) + E_{y(t+1)}[ E_{x(t+1)}[ v(t+1) | i(t) appended with y(t+1) ] | i(t) ] }; and deriving, based on inputting the value function v(t)(x,i) into the global backward induction, the backward induction.
- 5. The method of claim 4, wherein determining the one or more actions comprises optimizing, based on the uninformed probability distribution p(t) and the value function v(t)(x,i), the selection policy.
- 6. The method of claim 1, wherein the optimized chosen statistic comprises: a percentile distribution of expected future costs for achieving the objective, a maximum total future cost, an expectation of the total future cost, or an average of a subset of expected future costs for achieving the objective.
- 7. The method of claim 1, further comprising: receiving, by the sensor, a current observation corresponding to a real or simulated world state; updating, based on the current observation, historical observable state information; determining, based on the historical observable state information, a relevance score for the current observation, wherein the relevance score comprises statistics of a current cost and a statistic of a distribution of future cost of performing one or more actions; generating, based on the relevance score, one or more mathematical representations of emotions; and compressing, based on the one or more mathematical representations of emotions, the historical observable state information, wherein performing the forward induction comprises determining, based on the compressed historical observable state information, informed state distributions for the real or simulated world state.
- 8. The method of claim 1, further comprising: determining, based on the one or more parameters, one or more optimal dimensions for computing the one or more optimal actions; generating, based on the one or more optimal dimensions, one or more updated probability distributions corresponding to a state of the world; and updating, during the determining the one or more optimal actions and based on the one or more updated probability distributions, the forward induction and the backward induction.
- 9. The method of claim 1, further comprising: determining an initial particle distribution, wherein the initial particle distribution corresponds to: information of an unknown state of a real or simulated world at a time t; information of an uninformed probability distribution p(t), information of one or more historical observable states, an indication of the selection policy, and a value function corresponding to the objective; performing a multi-scaling method, wherein the multi-scaling method comprises: scaling up interaction distances and speeds of motion in world mechanics corresponding to the one or more initial probability distributions; identifying a subset of particles of the initial particle distribution; interpolating the subset of particles; and repeating the scaling, identifying subsets of particles, and interpolating until an optimal number of scales is achieved; and updating, based on completion of the multi-scaling method, the backward induction and the forward induction.
- 10. The method of claim 1, further comprising: generating, based on optimal actions for achieving historical objectives, a historical record of sub-problems for historical objectives; determining an intermediate goal for the objective; comparing, based on the historical record of sub-problems, the intermediate goal to one or more historical sub-problems; determining, based on the comparing, a set of historical sub-problems corresponding to the intermediate goal; and updating, based on the set of historical sub-problems, the selection policy.
- 11. A method for optimizing acquisition of data, in furtherance of an objective, comprising: maintaining, by a computational unit, the objective; representing the objective using an incremental cost of a plurality of potential actions, wherein the plurality of potential actions comprises one or more actions associated with an optimal contingent strategy for achieving the objective as an optimal value of an optimized chosen statistic of a distribution of future cost that, when performed, produce observational information; acquiring, via one or more sensors, based on performing, during execution of the optimal contingent strategy, the one or more actions and based on prior observation information acquired by the one or more sensors, the observational information; providing, to a model that is selecting the optimal contingent strategy, the observational information, wherein providing the observational information configures the model to determine one or more optimal future actions for achieving the objective; and determining, by the computational unit and using the model, one or more optimal future actions to achieve the objective, wherein the determining the one or more optimal future actions comprises repeating a backward induction and a forward induction until convergence is identified.
- 12. The method of claim 11, further comprising: receiving, by the one or more sensors, a current observation corresponding to a real or simulated world state; updating, based on the current observation, historical observable state information; determining, based on the historical observable state information, a relevance score for the current observation, wherein the relevance score comprises statistics of a current cost and a statistic of a distribution of future cost of performing one or more actions; generating, based on the relevance score, one or more mathematical representations of emotions; and compressing, based on the one or more mathematical representations of emotions, the historical observable state information, wherein performing the forward induction comprises determining, based on the compressed historical observable state information, informed state distributions for the real or simulated world state.
- 13. The method of claim 11, further comprising: determining, based on one or more parameters for determining optimal actions, one or more optimal dimensions for computing the one or more optimal actions; generating, based on the one or more optimal dimensions, one or more updated probability distributions corresponding to a state of the world; and updating, during the determining the one or more optimal actions and based on the one or more updated probability distributions, the forward induction and the backward induction.
- 14. The method of claim 11, further comprising: determining an initial particle distribution, wherein the initial particle distribution corresponds to: information of an unknown state of a real or simulated world at a time t; information of an uninformed probability distribution p(t), information of one or more historical observable states, an indication of a selection policy, and a value function corresponding to the objective; performing a multi-scaling method, wherein the multi-scaling method comprises: scaling up interaction distances and speeds of motion in world mechanics corresponding to the one or more initial probability distributions; identifying a subset of particles of the initial particle distribution; interpolating the subset of particles; and repeating the scaling, identifying subsets of particles, and interpolating until an optimal number of scales is achieved; and updating, based on completion of the multi-scaling method, the forward induction and the backward induction.
- 15. The method of claim 11, further comprising: generating, based on optimal actions for achieving historical objectives, a historical record of sub-problems for historical objectives; determining an intermediate goal for the objective; comparing, based on the historical record of sub-problems, the intermediate goal to one or more historical sub-problems; determining, based on the comparing, a set of historical sub-problems corresponding to the intermediate goal; and updating, based on the set of historical sub-problems, a selection policy for achieving the objective.
- 16. A method for constructing an efficient memory, in furtherance of an objective, comprising: maintaining, by a computational unit, the objective; representing the objective using an incremental cost of a plurality of potential actions; acquiring observational data, directly or indirectly, as a result of performing the plurality of potential actions; selecting a subset of the observational data to include in a memory unit based on one or more statistics of a distribution of total current and future cost at the time that the data is acquired; and determining, by the computational unit and based on the subset of the observational data, one or more optimal actions to achieve the objective as an optimal value of an optimized chosen statistic of a distribution of future cost, wherein determining the one or more optimal actions comprises repeating a backward induction and a forward induction until convergence is identified.
- 17. The method of claim 16, further comprising: receiving, by one or more sensors, a current observation corresponding to a real or simulated world state; updating, based on the current observation, historical observable state information; determining, based on the historical observable state information, a relevance score for the current observation, wherein the relevance score comprises statistics of a current cost and a statistic of a distribution of future cost of performing one or more actions; generating, based on the relevance score, one or more mathematical representations of emotions; and compressing, based on the one or more mathematical representations of emotions, the historical observable state information, wherein performing the forward induction comprises determining, based on the compressed historical observable state information, informed state distributions for the real or simulated world state.
- 18. The method of claim 16, further comprising: determining, based on one or more parameters for determining optimal actions, one or more optimal dimensions for computing the one or more optimal actions; generating, based on the one or more optimal dimensions, one or more updated probability distributions corresponding to a state of the world; and updating, during the determining the one or more optimal actions and based on the one or more updated probability distributions, the forward induction and the backward induction.
- 19. The method of claim 16, further comprising: determining an initial particle distribution, wherein the initial particle distribution corresponds to: information of an unknown state of a real or simulated world at a time t; information of an uninformed probability distribution p(t), information of one or more historical observable states, an indication of a selection policy, and a value function corresponding to the objective; performing a multi-scaling method, wherein the multi-scaling method comprises: scaling up interaction distances and speeds of motion in world mechanics corresponding to the one or more initial probability distributions; identifying a subset of particles of the initial particle distribution; interpolating the subset of particles; and repeating the scaling, identifying subsets of particles, and interpolating until an optimal number of scales is achieved; and updating, based on completion of the multi-scaling method, the forward induction and the backward induction.
- 20. The method of claim 16, further comprising: generating, based on optimal actions for achieving historical objectives, a historical record of sub-problems for historical objectives; determining an intermediate goal for the objective; comparing, based on the historical record of sub-problems, the intermediate goal to one or more historical sub-problems; determining, based on the comparing, a set of historical sub-problems corresponding to the intermediate goal; and updating, based on the set of historical sub-problems, a selection policy for achieving the objective.
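The forward induction over uncertainty in the unknown state recited in claims 1, 7, 12, and 17 can be realized with a standard particle-filter update; the sketch below, with assumed transition and likelihood callables, is offered only as an illustrative baseline and not as the claimed construction.

```python
import numpy as np

def forward_induction_step(particles, weights, action, observation,
                           transition, likelihood, rng=None):
    """One particle-filter step standing in for the forward induction: propagate
    particles representing the uncertain world state under the chosen action,
    then reweight them by how well they explain the new observation.
    `transition(particles, action, rng)` and `likelihood(observation, particles)`
    are assumed callables, not part of the specification."""
    rng = np.random.default_rng() if rng is None else rng
    # Propagate each particle through the (possibly stochastic) world dynamics.
    particles = transition(particles, action, rng)
    # Reweight by the measurement likelihood of the new observation.
    weights = weights * likelihood(observation, particles)
    weights = weights / np.sum(weights)
    # Resample when the effective sample size collapses, so the informed state
    # distribution stays well represented by the particle cloud.
    ess = 1.0 / np.sum(weights ** 2)
    if ess < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```

Repeating such an update as new observations arrive yields the informed state distributions that the backward sweep then consumes.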
Description
DOUBLY-EXPONENTIALLY ACCELERATED PARTICLE METHODS AND SYSTEMS FOR NONLINEAR CONTROL
[01] A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
CROSS REFERENCE TO RELATED APPLICATIONS
[02] This application claims priority to provisional U.S. Application Ser. No. 63/512,510, filed July 7, 2023, entitled “ACCELERATED PARTICLE METHODS AND SYSTEMS FOR NONLINEAR CONTROL”; provisional U.S. Application Ser. No. 63/603,149, filed November 28, 2023, entitled “ACCELERATED PARTICLE METHODS AND SYSTEMS FOR NONLINEAR CONTROL”; provisional U.S. Application Ser. No. 63/573,967, filed April 3, 2024, entitled “DOUBLY-EXPONENTIALLY ACCELERATED PARTICLE METHODS AND SYSTEMS FOR NONLINEAR CONTROL”; and U.S. Application Ser. No. 18/750,304, filed June 21, 2024, entitled “DOUBLY-EXPONENTIALLY ACCELERATED PARTICLE METHODS AND SYSTEMS FOR NONLINEAR CONTROL,” each of which is hereby incorporated by reference in its entirety for all purposes.
FIELD
[03] Aspects described herein relate to computers, software, and artificial intelligence. More specifically, aspects relate to goal-oriented optimization for artificial intelligence systems.
BACKGROUND
[04] The field of Artificial Intelligence has attempted many approaches to the problem of replicating the capabilities of biological intelligence, but none of them have been successful to date. Additionally, the field of Neuroscience has applied many empirical approaches to dissect the operation of biological intelligence and elucidate its functioning, but to date none of them have succeeded in assembling the various empirical phenomena so discovered into a coherent understanding of how biological intelligence works.
[05] The problem of biological intelligence cannot be solved without asking the right questions and setting the right conceptual framework, one that can guide the solution using both Computer Science and Neuroscience in the right combination. From a philosophical viewpoint, the starting point for inquiry was already identified in the 1890s by William James, with his functionalist philosophy that intelligence is an evolutionary imperative essential for the survival of the species that have this capability. His philosophy was not an effective one, in the sense that it did not define what it is that intelligence does for that survival. Nevertheless, asking the question in this way provides a guide for the inquiry, by focusing it on the elucidation of what those survival characteristics are.
[06] This philosophy can be made effective by recognizing that its power lies in clarifying that in biology there is a unique and well-defined highest-level goal, i.e., survival of the species. The problem that intelligence solves, then, is that this highest-level goal, which is far removed from the low-level knowledge of the organism about the operation of the world, must be achieved, with limited capabilities for action and observation, in environments that are complex, uncertain, and novel.
This clarifies that the survival advantage of intelligence is to bridge that huge gap between low-level knowledge and observations, and high-level goals, under those severe constraints, by devising complex and contingent strategies of action that optimally achieve those goals. William James’ original insight is, through this thought experiment, thereby converted into an effective and fully quantitative definition of intelligence, since it translates into the known mathematical concept of a Markov Decision Process Under Uncertainty, which is studied in the field of Control Theory.
[07] So does this precise mathematical formulation now solve the problem of intelligence? No. This mathematical problem was one of the many approaches to Artificial Intelligence proposed in the 1950s, and it was studied and theoretically solved by Richard Bellman. However, because this theoretical solution is infeasible for any practical problem, the approach was long ago abandoned. To understand why, consider that the standard Markov Decision Process (without uncertainty) assumes that world states are explicitly known. Bellman’s equation then finds the globally optimal action strategy by backward induction over the space of all world states, an approach that is already infeasible for many real-world problems because of the enormous number of world states. Yet on top of that, the real world also includes uncertainty, which is handled in Bellman’s approach by taking the state space to be the space of all probability distributions over world states, and then performing the backward induction on that larger state space. This
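To make the scaling issue concrete, a generic finite-horizon backward induction over explicitly enumerated world states can be sketched as follows; this is a textbook construction for illustration only, not the accelerated method of this disclosure, and the array layout and function name are assumptions.

```python
import numpy as np

def backward_induction(cost, transition, horizon):
    """Finite-horizon Bellman backward induction over explicitly known states.
    cost[s, a] is the immediate cost of action a in state s, and
    transition[s, a, s2] is the probability of moving from s to s2 under a.
    Returns the optimal values at time 0 and the optimal policy per time step."""
    n_states, n_actions = cost.shape
    value = np.zeros(n_states)          # terminal values
    policies = []
    for _ in range(horizon):
        # Q[s, a] = immediate cost + expected future cost under optimal play.
        q = cost + transition @ value   # shape (n_states, n_actions)
        policies.append(np.argmin(q, axis=1))
        value = np.min(q, axis=1)
    policies.reverse()
    return value, policies
```

Even in this explicit-state form the work grows with the number of world states and actions; handling uncertainty in Bellman’s way replaces each state with a probability distribution over states, so the sweep above would have to range over a vastly larger belief-state space, which is the infeasibility the passage describes.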