US-12626589-B2 - Systems and methods for simulating traffic scenes
Abstract
Example aspects of the present disclosure describe a scene generator for simulating scenes in an environment. For example, snapshots of simulated traffic scenes can be generated by sampling a joint probability distribution trained on real-world traffic scenes. In some implementations, samples of the joint probability distribution can be obtained by sampling a plurality of factorized probability distributions for a plurality of objects for sequential insertion into the scene.
Inventors
- Shuhan Tan
- Kelvin Ka Wing Wong
- Shenlong Wang
- Sivabalan Manivasagam
- Mengye Ren
- Raquel Urtasun
Assignees
- AURORA OPERATIONS, INC.
Dates
- Publication Date: 2026-05-12
- Application Date: 2023-02-13
Claims (20)
- 1 . A computer-implemented method for traffic scene generation, comprising: (a) obtaining environmental data descriptive of an environment and a subject vehicle within the environment; (b) generating a first parameter of a new object for insertion into a synthesized traffic scene in the environment, the first parameter generated using a first parameter prediction machine-learned model of a machine-learned traffic scene generation framework and based at least in part on the environmental data; (c) generating a second parameter of the new object, the second parameter generated using a second parameter prediction machine-learned model of the machine-learned traffic scene generation framework and based at least in part on the environmental data and the first parameter; and (d) outputting data descriptive of the synthesized traffic scene.
- 2 . The computer-implemented method of claim 1 , further comprising: generating feature data using a machine-learned feature extraction model based on a current state of the synthesized traffic scene.
- 3 . The computer-implemented method of claim 2 , wherein the machine-learned feature extraction model processes multiple prior states of the synthesized traffic scene.
- 4 . The computer-implemented method of claim 2 , wherein the first parameter and the feature data are provided as inputs to the second parameter prediction machine-learned model.
- 5 . The computer-implemented method of claim 4 , further comprising: (e) inputting the first parameter, the second parameter, and the feature data to a third parameter prediction machine-learned model of the machine-learned traffic scene generation framework.
- 6 . The computer-implemented method of claim 1 , wherein at least one of the first parameter or the second parameter describes at least one of: an object class, an object position, an object orientation, an object bounding box, or an object velocity.
- 7 . The computer-implemented method of claim 1 , further comprising: (f) generating simulated sensor data for the environment based on the data descriptive of the synthesized traffic scene; (g) obtaining labels for the simulated sensor data based on the first parameter and the second parameter; and (h) training a machine-learned model of an autonomous vehicle control system using the labels and the simulated sensor data.
- 8 . A computing system for traffic scene generation, the computing system comprising: one or more processors; and one or more non-transitory computer-readable media that store instructions for execution by the one or more processors to cause the computing system to perform operations, the operations comprising: (a) obtaining environmental data descriptive of an environment and a subject vehicle within the environment; (b) generating a first parameter of a new object for insertion into a synthesized traffic scene in the environment, the first parameter generated using a first parameter prediction machine-learned model of a machine-learned traffic scene generation framework and based at least in part on the environmental data; (c) generating a second parameter of the new object, the second parameter generated using a second parameter prediction machine-learned model of the machine-learned traffic scene generation framework and based at least in part on the environmental data and the first parameter; and (d) outputting data descriptive of the synthesized traffic scene.
- 9 . The computing system of claim 8 , further comprising: generating feature data using a machine-learned feature extraction model based on a current state of the synthesized traffic scene.
- 10 . The computing system of claim 9 , wherein the machine-learned feature extraction model processes multiple prior states of the synthesized traffic scene.
- 11 . The computing system of claim 9 , wherein the first parameter and the feature data are provided as inputs to the second parameter prediction machine-learned model.
- 12 . The computing system of claim 11 , further comprising: (e) inputting the first parameter, the second parameter, and the feature data to a third parameter prediction machine-learned model of the machine-learned traffic scene generation framework.
- 13 . The computing system of claim 8 , wherein at least one of the first parameter or the second parameter describes at least one of: an object class, an object position, an object orientation, an object bounding box, or an object velocity.
- 14 . The computing system of claim 8 , wherein the operations comprise: (f) generating simulated sensor data for the environment based on the data descriptive of the synthesized traffic scene; (g) obtaining labels for the simulated sensor data based on the first parameter and the second parameter; and (h) training a machine-learned model of an autonomous vehicle control system using the labels and the simulated sensor data.
- 15 . An autonomous vehicle control system comprising: one or more machine-learned models that have been trained using simulated sensor data representing at least a portion of a synthesized traffic scene, the simulated sensor data having been generated by performance of operations, the operations comprising: (a) obtaining environmental data descriptive of an environment and a subject vehicle within the environment; (b) generating a first parameter of a new object for insertion into a synthesized traffic scene in the environment, the first parameter generated using a first parameter prediction machine-learned model of a machine-learned traffic scene generation framework and based at least in part on the environmental data; (c) generating a second parameter of the new object, the second parameter generated using a second parameter prediction machine-learned model of the machine-learned traffic scene generation framework and based at least in part on the environmental data and the first parameter; and (d) outputting data descriptive of the synthesized traffic scene.
- 16 . The autonomous vehicle control system of claim 15 , the operations further comprising: generating feature data using a machine-learned feature extraction model based on a current state of the synthesized traffic scene.
- 17 . The autonomous vehicle control system of claim 16 , wherein the machine-learned feature extraction model processes multiple prior states of the synthesized traffic scene.
- 18 . The autonomous vehicle control system of claim 16 , wherein the first parameter and the feature data are provided as inputs to the second parameter prediction machine-learned model.
- 19 . The autonomous vehicle control system of claim 18 , the operations further comprising: (e) inputting the first parameter, the second parameter, and the feature data to a third parameter prediction machine-learned model of the machine-learned traffic scene generation framework.
- 20 . The autonomous vehicle control system of claim 15 , the operations further comprising: (f) generating simulated sensor data for the environment based on the data descriptive of the synthesized traffic scene; (g) obtaining labels for the simulated sensor data based on the first parameter and the second parameter; and (h) training at least one of the one or more machine-learned models of the autonomous vehicle control system using the labels and the simulated sensor data.
Description
PRIORITY CLAIM
The present application is a continuation of U.S. application Ser. No. 17/528,277, having a filing date of Nov. 17, 2021, which is based on and claims the benefit of U.S. Provisional Patent Application No. 63/114,848, filed Nov. 17, 2020. Applicant claims priority to and the benefit of each of such applications and incorporates all such applications herein by reference in their entirety.
BACKGROUND
An autonomous platform can process data to perceive an environment through which the platform can travel. For example, an autonomous vehicle can perceive its environment using a variety of sensors and identify objects around the autonomous vehicle. The autonomous vehicle can identify an appropriate path through the perceived surrounding environment and navigate along the path with minimal or no human input.
SUMMARY
Aspects and advantages of embodiments of the present disclosure are set forth in the following description. The present disclosure is directed to improved techniques for generating realistic simulated environmental scenes (e.g., simulated traffic scenes in a travel way environment). For instance, some implementations of environmental scene generators according to the present disclosure provide for more complex and diverse collections of simulated environmental scenes by sampling simulated scenes from probabilistic distributions of scenes. In some implementations, the environment can include a travel way, and the scene of interest can be a traffic scene. The traffic scene can be a snapshot (e.g., at a moment in time). Some example traffic scene generators of the present disclosure automatically select and insert objects into a traffic scene by sampling object characteristics from corresponding probabilistic distributions.
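As a minimal sketch of the sequential, conditioned sampling described above, the toy code below draws an object class from a categorical prior and then draws a speed from a range conditioned on that class. The distributions and parameter names here are hypothetical stand-ins; the disclosure's implementations use machine-learned distributions conditioned on environmental data rather than the fixed tables shown.

```python
import random

# Hypothetical toy distributions; a real implementation would use
# machine-learned models conditioned on map and scene features.
CLASS_PRIORS = {"vehicle": 0.7, "pedestrian": 0.2, "cyclist": 0.1}
SPEED_RANGES = {"vehicle": (0.0, 20.0), "pedestrian": (0.0, 2.0), "cyclist": (0.0, 8.0)}

def sample_object(rng: random.Random) -> dict:
    """Sample one object's parameters sequentially, each conditioned on the prior ones."""
    # First parameter: object class, drawn from a categorical prior.
    classes, weights = zip(*CLASS_PRIORS.items())
    obj_class = rng.choices(classes, weights=weights)[0]
    # Second parameter: speed, conditioned on the sampled class.
    lo, hi = SPEED_RANGES[obj_class]
    speed = rng.uniform(lo, hi)
    return {"class": obj_class, "speed": speed}

rng = random.Random(0)
obj = sample_object(rng)
```

The key structural point is the chaining: the second sampling step receives the output of the first, mirroring the factorized per-parameter distributions described in the summary.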
For example, a traffic scene generator can be provided with state information about a subject vehicle (e.g., a self-driving vehicle) and a high-definition map of an environment around the vehicle, and can generate actors or other objects of various classes for insertion into the scene. For instance, in some implementations, the traffic scene generator automatically obtains a size, orientation, velocity, and/or other parameter(s) of each object that is inserted into the scene by sampling the parameters from probabilistic distributions. In some implementations, multiple characteristics of an object are each respectively obtained from multiple machine-learned distributions (e.g., sampled from a probabilistic distribution of a respective parameter). In some implementations, some distributions for an object are generated in view of one or more other previously-sampled parameters for that object. In some implementations, traffic scene generators of the present disclosure generate joint probability distributions for objects in a traffic scene. In some implementations, a joint probability distribution for the traffic scene (e.g., for multiple objects in the scene) can be decomposed (e.g., autoregressively) into a product of probabilities for the objects in the scene. For example, in some implementations, multiple objects are obtained sequentially, with parameters of later-inserted objects being sampled from their respective distributions in view of (e.g., conditioned on) objects previously inserted into the scene. In this manner, a joint probability distribution may be sampled to obtain a simulated traffic scene. The joint probability distributions can also be used, for example, to determine the probability of an input traffic scene (e.g., existing reference scenes, such as pre-recorded scenes). In this manner, for instance, example implementations of a traffic scene generator are trained by optimizing (e.g., maximizing) a determined probability of real-world traffic scenes.
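The autoregressive decomposition above can be illustrated with a toy model in which each insertion decision depends on the objects already in the scene. The same factorization that generates a scene (sample objects one by one) also scores a scene (sum the log-probabilities of each insertion and of stopping), which is what enables likelihood-based training on real scenes. The decaying insertion probability here is purely illustrative, not the patent's learned model.

```python
import math
import random

def insertion_prob(num_existing: int) -> float:
    """Illustrative probability of inserting another object, decaying with scene density."""
    return 0.8 ** (num_existing + 1)

def sample_scene(rng: random.Random, max_objects: int = 10) -> list:
    """Sample a scene by sequentially deciding whether to insert each next object."""
    scene = []
    while len(scene) < max_objects and rng.random() < insertion_prob(len(scene)):
        scene.append({"id": len(scene)})
    return scene

def scene_log_prob(scene: list, max_objects: int = 10) -> float:
    """Log joint probability, decomposed autoregressively over insertion decisions."""
    logp = 0.0
    for i in range(len(scene)):
        logp += math.log(insertion_prob(i))  # decision to insert object i
    if len(scene) < max_objects:
        logp += math.log(1.0 - insertion_prob(len(scene)))  # decision to stop
    return logp
```

Training as described in the summary would then amount to adjusting the model so that `scene_log_prob` is maximized over a dataset of real-world scenes; in this toy, an empty scene has log-probability log(0.2), the probability of stopping immediately.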
Example systems and methods according to aspects of the present disclosure provide various technical effects and benefits. Realistic simulated environmental scenes can be used, for example, as substitutes for recordings of real environmental scenes. For example, snapshots containing simulated scenes can be used to initialize other simulations (e.g., traffic simulations, such as simulations over time, etc.). Snapshots containing simulated scenes can also be used to obtain labeled training data for machine-learned systems that interface with the scene (e.g., perception systems, etc.). Simulated environmental data can be generated far faster and at lower cost than obtaining equivalent amounts of real-world recorded environmental data. For example, obtaining real-world recorded traffic scene data can require traveling along roadways and recording traffic events no faster than in real time, while generating simulated traffic scene data can be accomplished virtually, without wear and tear on physical vehicles (and the emissions thereof), and without any speed restriction of real-time synthesis. A broad spectrum of diverse simulated traffic scenes can be generated more quickly than the time necessary to obtain the same amount of real-world recorded data.
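One reason synthesized scenes yield labeled training data so cheaply is that the ground-truth labels follow directly from the sampled object parameters, with no manual annotation step. The sketch below derives axis-aligned bounding-box labels from hypothetical sampled parameters; the field names and label format are illustrative assumptions, not the disclosure's actual data schema.

```python
def labels_from_scene(scene_objects: list) -> list:
    """Derive ground-truth labels directly from generated object parameters."""
    labels = []
    for obj in scene_objects:
        x, y = obj["position"]  # sampled object center
        w, l = obj["size"]      # sampled width and length
        labels.append({
            "class": obj["class"],
            # Axis-aligned bounding box computed from center and size.
            "bbox": (x - w / 2, y - l / 2, x + w / 2, y + l / 2),
        })
    return labels

# Example: one sampled vehicle produces one perfectly accurate label.
scene = [{"class": "vehicle", "position": (10.0, 4.0), "size": (2.0, 4.5)}]
labels = labels_from_scene(scene)
```

Paired with simulated sensor data rendered from the same scene, such labels can supervise a perception model of an autonomous vehicle control system, as recited in claims 7, 14, and 20.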