EP-4505252-B1 - RECURRENT PREDICTION AND ITERATIVE CORRECTION METHOD FOR FAST SOLUTION OF MIXED-INTEGER OPTIMAL CONTROL

EP4505252B1EP 4505252 B1EP4505252 B1EP 4505252B1EP-4505252-B1

Inventors

QUIRYNEN, Rien
CHAKRABARTY, Ankush
DI CAIRANO, STEFANO
CAULIGI, Abhishek

Dates

Publication Date: 20260513
Application Date: 20230210

Claims (15)

A controller (110) for controlling a motion of a device to perform a task changing a state of the device subject to constraints (104), the controller comprising: a processor (151); and a memory (152) having instructions stored thereon that, when executed by the processor, cause the controller to: collect parameters (205) of the task, including the state (121) of the device; input the parameters of the task to evaluate a parametric function trained to output predicted values for a set of discrete variables (275) in a mixed-integer convex programming (MICP) problem (270) for performing the task defined by the parameters; fix a first subset of discrete variables in the MICP to the predicted values outputted by the trained parametric function; perform a presolve-based correction (312) configured to update at least some of the predicted values of a remaining subset of discrete variables in the MICP to values that are uniquely defined by the fixed values for the first subset of discrete variables and the constraints; transform the MICP into a convex programming (CP) problem (280), based on the fixed values of the first subset of discrete variables and the updated values of the remaining subset of discrete variables; solve the CP problem subject to the constraints to produce a feasible motion trajectory (255) for performing the task defined by the parameters; and command the device to change the state according to the motion trajectory (111).
The controller of claim 1, wherein the parametric function is trained to output the predicted values for the set of discrete variables at a single time step or at multiple consecutive time steps in a prediction time horizon of the MICP.
The controller of claim 1 or 2, wherein the parametric function is evaluated once or multiple times to produce multiple predicted values for each discrete variable in the set of discrete variables in the MICP, resulting in multiple candidate discrete solutions; wherein the controller performs the presolve-based correction for each candidate discrete solution to transform the MICP into multiple CP problems and determine multiple motion trajectories, and wherein the controller is further configured to select the feasible motion trajectory for performing the task from the multiple motion trajectories.
The controller of claim 3, wherein the feasible motion trajectory is selected from the multiple motion trajectories to be an optimal motion trajectory with the lowest objective value in a minimization problem, or the largest objective value in a maximization problem.
The controller of claim 3, wherein the parametric function is trained, based on a multi-class classification approach, to output a probability or confidence level for each of the predicted values of each discrete variable in the set of discrete variables, wherein the first subset of discrete variables is selected based on the probability or confidence level of the predicted values, optionally wherein multiple values of each of the discrete variables in the MICP are selected to maximize the probability or confidence level that is outputted by the trained parametric function for each set of predicted values of the discrete variables, or optionally wherein multiple values of each of the discrete variables in the MICP are produced by sampling a probability distribution at a single time step or at multiple consecutive time steps in the prediction time horizon, and the probability for each set of fixed values of the discrete variables in a discrete optimizer set at a single time step or at multiple consecutive time steps in the prediction time horizon is equal to the probability that is outputted by the trained parametric function for performing the task defined by the parameters.
The controller of claim 3, wherein the parametric function is trained, based on a regression approach, to directly output a continuous approximation for feasible values of one or multiple discrete variables at a single time step or at multiple consecutive time steps in the prediction time horizon of the MICP for performing the task defined by the problem parameters, optionally wherein the predicted values of the set of discrete variables are produced by rounding the continuous approximation that is outputted by the trained parametric function to a nearest integer-feasible candidate solution.
The controller of claim 3, wherein the processor is at least one parallel processor and the solution to the MICP problem is found by testing the multiple candidate solutions in parallel to find the feasible motion trajectory.
The controller of claim 1, wherein to perform the presolve-based correction, the controller iteratively selects discrete variables to the first subset and updates the predicted values in the remaining subset of discrete variables until either an infeasibility is detected or until values for all discrete variables are fixed allowing to transforming the MICP into the CP problem; if the infeasibility is detected, execute a backup control solution to change the state of the device; and otherwise, if the values for all discrete variables are fixed, solves the CP problem subject to the constraints to produce the feasible motion trajectory.
The controller of claim 8, wherein an iteration of the presolve-based correction consists of one or multiple single presolve steps, and each single presolve step is an iterative procedure that performs one or multiple iterations until a termination condition is met, and each iteration of the single presolve step comprises one or a combination of presolve operations including domain propagation for bound strengthening of continuous and/or discrete optimization variables, based on equality and inequality constraints in the MICP for performing the task defined by the problem parameters; redundant inequality constraint detection and removal, and dual fixing of continuous and/or discrete optimization variables, based on the cost function and based on equality and inequality constraints in the MICP; inequality constraint coefficient strengthening to replace one or multiple inequality constraints by one or multiple dominating inequality constraints, and the dominating inequality constraints can reduce the feasible search space of the relaxed problem without removing an integer-feasible solution of the MICP; and binary variable probing that aims to fix values of one or multiple discrete variables by probing feasibility of fixed values of one or multiple discrete variables in the MICP for performing the task defined by the problem parameters.
The controller of claim 9, wherein each iteration of the single presolve step comprises one or multiple tailored block-sparse presolve operations that exploit the block-structured sparsity of a mixed-integer optimal control problem (MIOCP) for performing the task defined by the parameters, optionally wherein the tailored block-sparse presolve operations can include one or a combination of block-sparse forward-backward operations for domain propagation and bound strengthening of continuous and/or discrete optimization variables, based on equality and inequality constraints in the block-structured MIOCP for performing the task defined by the problem parameters; block-sparse redundant inequality constraint detection and dual fixing of continuous and/or discrete optimization variables, based on the cost function and based on equality and inequality constraints in the block-structured MIOCP; block-sparse inequality constraint coefficient strengthening to replace one or multiple inequality constraints by one or multiple dominating inequality constraints in the block-structured MIOCP; and block-sparse binary variable probing that aims to fix values of one or multiple discrete variables by probing feasibility of fixed values of one or multiple discrete variables in the block-structured MIOCP.
The controller of any one of claims 2 to 10, wherein the parametric function is a recurrent neural network (RNN) and each step of the RNN corresponds to a prediction for values of a set of discrete variables at a single or multiple consecutive time steps in the prediction time horizon of the MICP, such that a sequential evaluation of the RNN results in predicted values for each of the discrete variables of the MICP, or wherein the parametric function approximation architecture can consist of one or multiple layers of a feedforward deep neural network, one or multiple layers of a long short-term memory (LSTM) network, one or multiple layers of a gated recurrent unit (GRU) and/or one or multiple of a convolutional neural network (CNN).
The controller of any one of claims 1 to 11, wherein a first subset of discrete variables in the MICP are fixed to the predicted values outputted by the trained parametric function, a second subset of discrete variables in the MICP are fixed by a presolve-based correction to values that are uniquely defined by the fixed values for the first subset of discrete variables and the constraints, and a third subset of discrete variables in the MICP are selected to be free optimization variables, such that an approximation of the MICP with a reduced number of discrete optimization variables is solved subject to the constraints to produce a feasible motion trajectory, for example, using a branch-and-bound (B&B) optimization method in combination with one or multiple tailored block-sparse presolve operations that exploit the block-structured sparsity of a mixed-integer optimal control problem (MIOCP) for performing the task defined by the parameters.
The controller of any one of claims 1 to 12 , wherein the controller is a mixed-integer model predictive controller (MIMPC), wherein the MIMPC computes a control signal based on parameters of the task, including a current state of the system and control command, and wherein the MIMPC computes a control solution that comprises a sequence of future optimal discrete and continuous control inputs over a prediction time horizon of the hybrid dynamical system, by solving a constrained mixed-integer optimization problem at each control time step, or wherein the device is a vehicle, and wherein the controller determines an input to the vehicle based on the MICP solution, wherein the input to the vehicle includes one or a combination of an acceleration of the vehicle, an engine torque of the vehicle, brake torques, and a steering angle; the discrete optimization variables are used to model one or a combination of discrete control decisions, switching in the system dynamics, gear shifting, and obstacle avoidance constraints; and the problem parameters of the task can include a current state of the vehicle, a goal or target state, a position and/or orientation of one or multiple obstacles, one or multiple actuation limits, one or multiple weight values in objective function and/or one or multiple bound values in mixed-integer inequality constraints of the mixed-integer optimal control problem (MIOCP).
The controller of any one of claims 1 to 12 , wherein the device is a spacecraft, and wherein the controller determines an input to the spacecraft based on the MICP solution, wherein the input to the spacecraft actuates one or a combination of thrusters and momentum exchange devices; the discrete optimization variables are used to model one or a combination of discrete control decisions, switching in the system dynamics, integer values for the thruster commands, and obstacle avoidance constraints; and the problem parameters of the task can include a current state of the spacecraft, a goal or target state, a position and/or orientation of one or multiple obstacles, one or multiple actuation limits, one or multiple weight values in objective function and/or one or multiple bound values in mixed-integer inequality constraints of the MIOCP, or wherein the device is a vapor compression system, and wherein the controller determines an input to the vapor compression system based on the MICP solution, wherein the input to the vapor compression system includes one or a combination of an indoor unit fan speed, an outdoor unit fan speed, a compressor rotational speed, an expansion valve position, and a flow reversing valve position; the discrete optimization variables are used to model one or a combination of discrete control decisions, switching in the system dynamics, and integer values for the commands that are sent to the valves and/or to the fans; and the problem parameters of the task can include a current state of the vapor compression system, a goal or target state, one or multiple actuation limits, one or multiple weight values in objective function and/or one or multiple bound values in mixed-integer inequality constraints of the MIOCP.
A hybrid dynamical system including the controller of any one of claims 1 to 12, wherein the control command generated by the controller specifies a target state of the hybrid dynamical system, the hybrid dynamical system comprises: a tracking controller configured to generate one or multiple control inputs to the actuators of the hybrid dynamical system to reduce an error between the current state and a target state of the hybrid dynamical system, wherein the tracking controller is a PID controller or a model predictive controller (MPC).

Description

[Technical Field] The present disclosure relates generally to mixed-integer convex optimization-based control, and more particularly to a recurrent prediction and iterative correction method and apparatus for model predictive control of systems that are described by dynamics with continuous and discrete elements of operations. [Background Art] Optimization based decision making, planning and control techniques, such as model predictive control (MPC), allow a model-based design framework in which the system dynamics, the system requirements and constraints can directly be taken into account. This framework has been extended to hybrid systems, including both continuous and discrete decision variables, which provides a powerful technique to model a large range of problems, e.g., including dynamical systems with mode switchings or systems with quantized actuation, problems with logic rules, temporal logic specifications or obstacle avoidance constraints. However, the resulting optimization problems are highly non-convex, and therefore difficult to solve in practice, because they contain variables which only take integer values. When using a linear or linear-quadratic objective in combination with linear system dynamics and linear inequality constraints, the resulting optimal control problem (OCP) can be formulated as a mixed-integer linear program (MILP) or mixed-integer quadratic program (MIQP). More general convex inequality constraints can be included such as quadratic inequality constraints, resulting in a mixed-integer quadratically constrained quadratic program (MIQCQP), or second-order cone constraints, resulting in a mixed-integer second-order cone program (MISOCP). Decision making, planning or control for hybrid systems aims to solve such a mixed-integer program (MIP) at every sampling time instant. This is a difficult task, given that mixed-integer programming is NP-hard in general, and several methods for solving such a sequence of MIPs have been explored in the literature. These approaches can be divided in heuristic techniques, which seek to efficiently find sub-optimal solutions to the problem, and optimization algorithms which attempt to solve the MIPs to optimality. Most mixed-integer optimization algorithms are based on a variant of the branch-and-bound (B&B) technique in order to solve the MIPs to optimality. Variants of the branch-and-bound strategy have been combined with various methods for solving the relaxed convex subproblems, e.g., with dual active-set solvers, interior point algorithms, dual projected gradient methods, nonnegative least squares solvers, and the alternating direction method of multipliers (ADMM). However, the combinatorial complexity of MIPs generally leads to an exponential increase of computation time for B&B methods to solve MIPs with an increasing number of discrete decision variables, limiting the applicability of MIP-based optimal control design in practice. Examples of heuristic techniques can be based on rounding and pumping schemes, using approximate optimization algorithms, approximate dynamic programming, or using data-based machine learning techniques, e.g., supervised learning. Using supervised learning to replicate optimal and feasible MIP solutions from B&B methods procured offline and inferring these solutions online at high speed has resulted in dramatic improvement of solution times for mixed-integer optimal control problems (MIOCPs). Alternatively, reinforcement learning techniques have been used to learn tree-search policies to accelerate B&B methods, but these approaches have a limited applicability to real-time embedded systems in practice, because they require at least one forward pass of a predictor, e.g., a neural network, at each node of the B&B tree and, more importantly, these approaches may still require enumerating the full B&B tree in the worst case. Existing supervised learning techniques may directly approximate the mapping from the problem parameters, i.e., the current state of the controlled system and of the system's environment, to the discrete variable solution of an MIOCP. Such an approach is based on the realization that a mixed-integer convex programming (MICP) problem can be solved very efficiently as a convex program (CP) after fixing all discrete variables to a fixed set of values that is provided by the learned predictor, i.e., after fixing all binary variables to either 0 or 1 and fixing all integer variables to an integer value. Examples of such supervised learning techniques include the use of kernel and non-parametric classification methods, deep neural networks for regression, and multi-class classification. Some existing approaches are based on a supervised learner that only predicts the discrete variables associated with one or multiple of the first time steps within the control horizon of the MIOCP; however, this is likely to incur significant suboptimality in a receding horizon implementation. Alternatively, the poli