US-12623671-B2 - Trailer angle estimation using machine learning
Abstract
In various examples, a trailer angle may be estimated using one or more machine learning models to predict one or more keypoints on the center axis of the trailer drawbar (e.g., a keypoint representing the drawbar junction around which the drawbar pivots, one or more other keypoints along the center axis), back-projecting the predicted keypoint(s) onto a three-dimensional (3D) representation of the ground, and calculating the angle between the longitudinal axis of the towing vehicle and a line or ray formed by or fitted to the projected keypoints. The trailer angle may be estimated at any frame rate. For each frame, keypoints may be predicted from that frame and/or optical flow or some other type of feature tracking may be used to propagate predicted keypoint(s) from a preceding frame in lieu of predicting keypoint(s), and the resulting keypoint(s) may be used to estimate the trailer angle for that frame.
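The back-projection and angle-calculation steps summarized above can be sketched as follows. This is a minimal illustration only, not the patent's implementation: it assumes a flat ground plane, a simplified pinhole camera model (a real system would use the calibrated extrinsics of a rear-facing, often fisheye, camera), and all function names are hypothetical.

```python
import numpy as np

def backproject_to_ground(pixels, K, cam_height):
    """Intersect camera rays with a flat ground plane (z = 0 in the
    vehicle frame). Assumes a pinhole camera with intrinsics K whose
    optical center sits at height `cam_height`, looking along the
    vehicle's longitudinal axis; a production system would use the
    full calibrated camera model instead."""
    K_inv = np.linalg.inv(K)
    points_3d = []
    for u, v in pixels:
        ray = K_inv @ np.array([u, v, 1.0])       # ray in camera coordinates
        # camera axes (z forward, x right, y down) -> vehicle axes
        # (x forward, y left, z up)
        d = np.array([ray[2], -ray[0], -ray[1]])
        t = cam_height / -d[2]                    # scale to reach the ground plane
        points_3d.append(np.array([0.0, 0.0, cam_height]) + t * d)
    return np.array(points_3d)

def trailer_angle(points_3d):
    """Fit a line to the ground-projected keypoints and measure its
    angle against the ego-vehicle's longitudinal (x) axis."""
    xy = points_3d[:, :2]
    centered = xy - xy.mean(axis=0)
    # principal direction of the projected keypoints (least-squares line fit)
    _, _, vt = np.linalg.svd(centered)
    direction = vt[0]
    ang = np.degrees(np.arctan2(direction[1], direction[0]))
    # fold into (-90, 90]: a kink angle is a deviation from straight-ahead,
    # and the fitted direction has an arbitrary sign
    if ang > 90:
        ang -= 180
    if ang <= -90:
        ang += 180
    return ang
```

A straight-behind trailer yields keypoints on the vehicle's longitudinal axis and therefore an angle near zero; as the drawbar swings, the fitted line rotates and the returned angle grows accordingly.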
Inventors
- Ayon Sen
Assignees
- NVIDIA CORPORATION
Dates
- Publication Date: 2026-05-12
- Application Date: 2023-08-01
Claims (20)
- 1. A processor comprising: one or more processing units to: generate, based at least on one or more machine learning models processing a representation of sensor data corresponding to a view of a drawbar connecting a trailer to an ego-machine, a representation of three or more points substantially along a center axis of the drawbar; generate, based at least on an estimated ground projection of the center axis of the drawbar corresponding to the three or more points, an estimated trailer angle between a first axis of the ego-machine and a second axis of the trailer; and control one or more operations of the ego-machine based at least on the estimated trailer angle.
- 2. The processor of claim 1, wherein the one or more processing units are further to generate the estimated trailer angle based at least on back-projecting the three or more points into a three-dimensional (3D) coordinate system to identify 3D locations corresponding to the estimated ground projection of the center axis of the drawbar and calculating the estimated trailer angle as an angle between a longitudinal axis of the ego-machine and the estimated ground projection of the center axis of the drawbar.
- 3. The processor of claim 1, wherein the three or more points substantially along the center axis of the drawbar comprise a first keypoint representing a junction around which the drawbar pivots and one or more other keypoints.
- 4. The processor of claim 1, wherein the three or more points substantially along the center axis of the drawbar comprise a first fixed keypoint representing a junction around which the drawbar pivots and one or more predicted keypoints.
- 5. The processor of claim 1, wherein the one or more processing units are further to generate the estimated trailer angle based at least on a three-dimensional (3D) representation of the estimated ground projection of the center axis of the drawbar corresponding to a line or ray fitted to a ground projection of the three or more points.
- 6. The processor of claim 1, wherein the sensor data represents a two-dimensional (2D) view of the drawbar, and the three or more points along the center axis of the drawbar comprise two or more 2D points detected in the 2D view by the one or more machine learning models.
- 7. The processor of claim 1, wherein the one or more machine learning models comprise a neural network with an output channel for each individual point of the three or more points substantially along the center axis of the drawbar, and the output channel for each point is configured to predict classification data representing a likelihood that each individual pixel, of one or more pixels of the representation of the sensor data, depicts the individual point associated with the center axis of the drawbar.
- 8. The processor of claim 1, wherein the sensor data comprises image data generated using a rear-facing camera of the ego-machine.
- 9. The processor of claim 1, wherein the one or more processing units are further to repetitively generate the estimated trailer angle at a designated frame rate based at least on updated detections of the three or more points substantially along the center axis of the drawbar represented in corresponding frames of the sensor data.
- 10. The processor of claim 1, wherein the one or more processing units are further to propagate at least one detected point of the three or more points substantially along the center axis of the drawbar to a corresponding location in a subsequent frame using optical flow.
- 11. The processor of claim 1, wherein the operations of the ego-machine that are based at least on the estimated trailer angle comprise one or more of path planning, obstacle avoidance, presenting a recommended path on a display visible to an operator of the ego-machine, or self-parking.
- 12. The processor of claim 1, wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for the autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing remote operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system implementing one or more language models; a system implementing one or more large language models (LLMs); a system for generating synthetic data; a system for generating synthetic data using AI; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
- 13. A system comprising one or more processing units to control one or more operations of an ego-machine using a trailer kink angle estimated based at least on an estimated ground projection of a center axis of a drawbar connecting a trailer to the ego-machine, the estimated ground projection generated using three or more points detected substantially along the center axis of the drawbar using one or more machine learning models.
- 14. The system of claim 13, wherein the one or more processing units are further to estimate the trailer kink angle based at least on back-projecting the three or more points into a three-dimensional (3D) coordinate system to identify 3D locations corresponding to the estimated ground projection of the center axis of the drawbar and calculating the estimated trailer kink angle as an angle between a longitudinal axis of the ego-machine and the estimated ground projection of the center axis of the drawbar.
- 15. The system of claim 13, wherein the three or more points substantially along the center axis of the drawbar comprise a first keypoint representing a junction around which the drawbar pivots and one or more other keypoints.
- 16. The system of claim 13, wherein the using the one or more machine learning models to detect the three or more points comprises applying sensor data representing a two-dimensional (2D) view of the drawbar to the one or more machine learning models to detect the three or more points in the 2D view.
- 17. The system of claim 13, wherein the operations of the ego-machine controlled using the trailer kink angle comprise one or more of path planning, obstacle avoidance, presenting a recommended path on a display visible to an operator of the ego-machine, or self-parking.
- 18. The system of claim 13, wherein the system is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for the autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system implementing one or more language models; a system implementing one or more large language models (LLMs); a system for generating synthetic data; a system for generating synthetic data using AI; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
- 19. A method comprising: generating, using one or more machine learning models and based at least on a two-dimensional (2D) view of a drawbar connecting a trailer to an ego-machine, a representation of three or more 2D points in the 2D view along the drawbar; generating, based at least on an estimated ground projection of a center axis of the drawbar corresponding to the three or more 2D points, an estimated trailer angle between a first axis of the ego-machine and a second axis of the trailer; and controlling one or more operations of the ego-machine based at least on the estimated trailer angle.
- 20. The method of claim 19, wherein the method is performed by at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for the autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system implementing one or more language models; a system implementing one or more large language models (LLMs); a system for generating synthetic data; a system for generating synthetic data using AI; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
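Claim 7 describes a network with one output channel per keypoint, each channel predicting a per-pixel likelihood (a heatmap) that the pixel depicts that keypoint. A minimal sketch of decoding such output into pixel keypoints is shown below; the network producing the heatmaps is assumed and not shown, and the function name is illustrative, not taken from the patent.

```python
import numpy as np

def decode_keypoints(heatmaps):
    """Decode one (u, v) keypoint per channel from a stack of per-pixel
    likelihood maps of shape (num_keypoints, height, width), taking the
    most likely pixel in each channel along with its confidence."""
    keypoints = []
    for hm in heatmaps:
        v, u = np.unravel_index(np.argmax(hm), hm.shape)
        keypoints.append((u, v, hm[v, u]))  # pixel location plus confidence
    return keypoints
```

In practice a decoder like this is often refined with sub-pixel interpolation around the peak and a confidence threshold to reject channels where no keypoint is visible.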
Description
BACKGROUND

A vehicle trailer, commonly known as a trailer, may be coupled to a powered vehicle such as a car, truck, or sport utility vehicle via a drawbar and towed. Trailers may be used for various purposes, such as transporting goods, carrying recreational vehicles such as boats or motorcycles, hauling construction equipment, or moving personal belongings.

Knowing the angle a trailer makes with the towing vehicle is crucial for safe and efficient towing. This angle is known as the “trailer angle” or “trailer kink angle” and typically refers to the angular difference between the longitudinal axis of the towing vehicle and the longitudinal axis of the trailer (or the angle between the directions the towing vehicle and the trailer are pointing). The trailer angle is an important input to vehicle control algorithms for several reasons. For example, the trailer angle can impact the stability of the towing setup: if the trailer angle becomes too large, it can lead to instability, sway, or even jackknifing. As such, some vehicle control algorithms seek to maintain an appropriate trailer angle to ensure stability during acceleration, deceleration, turns, lane changes, reversing, and other maneuvers. Knowledge of the trailer angle thus enables vehicle control algorithms to provide better control and smoother handling and to prevent potential accidents or damage to the trailer and other vehicles. The trailer angle also impacts the clearance required to safely navigate corners and obstacles, so knowledge of the trailer angle allows vehicle control algorithms to determine whether there is enough clearance to make a turn without hitting curbs, vehicles, and/or other objects on the road. The trailer angle is particularly important when a towing vehicle is in reverse.
By understanding the trailer angle, vehicle control algorithms can predict the trailer's path, assist the driver in manipulating the trailer direction while reversing, and/or adjust steering controls accordingly, making it easier to back up or park the trailer accurately.

Conventional techniques for estimating the trailer angle have a variety of drawbacks. For example, a typical vehicle configuration may include a single rear-facing camera (e.g., a fisheye camera) that views an attached trailer. Classical approaches to trailer angle estimation typically attempt to identify trailer or trailer drawbar features that are visible in a particular frame (e.g., an image generated using the rear-facing camera) and track those features from frame to frame. Another class of algorithms extracts edges in the frame, determines which edges are part of the drawbar, and uses those edges to fit a line and calculate the angle between that line and the vehicle. These conventional techniques rely on classical feature extraction, which evaluates neighboring pixels to identify regions of high contrast from frame to frame. However, because these techniques rely on contrast, they are highly susceptible to lighting or environmental changes, which occur frequently and can result in inaccurate or even failed trailer angle estimates. In one example, driving under or past something like a tree or a building that casts a shadow on the object being tracked will typically result in a much darker scene in which relative contrast is no longer detectable. As a result, object tracking (and therefore trailer angle estimation) often fails to produce a valid estimate in situations like these. Furthermore, conventional techniques may not be as suitable for real-time deployment, as the computational resources required to distinguish which tracked keypoints or extracted edges are part of the trailer or the trailer drawbar may increase the latency of the system.
As such, there is a need for improved techniques for estimating trailer angle.

SUMMARY

Embodiments of the present disclosure relate to trailer angle estimation using machine learning. More specifically, systems and methods are disclosed that estimate the trailer angle based on an image of a trailer being towed by a towing vehicle. In contrast to conventional systems, such as those described above, a trailer angle may be estimated using one or more machine learning models to predict one or more keypoints on the center axis of the trailer drawbar (e.g., a keypoint representing the drawbar junction around which the drawbar pivots, one or more other keypoints along the center axis), back-projecting the predicted keypoint(s) onto a three-dimensional (3D) representation of the ground, and calculating the angle between the longitudinal axis of the towing vehicle and a line or ray formed by or fitted to the projected keypoints. The trailer angle may be estimated at any frame rate. For each frame, keypoints may be predicted from that frame and/or optical flow or some other type of feature tracking may be used to propagate predicted keypoint(s) from a preceding frame in lieu of predicting keypoint(s), and the resulting keypoint(s) may be used to estimate the trailer angle for that frame.
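The keypoint-propagation alternative mentioned in the summary (predicting keypoints in one frame and carrying them forward by optical flow rather than re-running the network) can be illustrated with a single-step Lucas-Kanade update. This is a deliberately minimal stand-in, not the patent's method: a production system would use a pyramidal, iterative tracker (e.g., OpenCV's calcOpticalFlowPyrLK), and the function name and window size below are hypothetical.

```python
import numpy as np

def propagate_keypoint(prev, curr, pt, win=7):
    """Shift a keypoint from the previous grayscale frame into the current
    one with one Lucas-Kanade step: solve [Ix Iy] @ flow = -It in a small
    window around the keypoint by least squares."""
    u, v = int(round(pt[0])), int(round(pt[1]))
    r = win // 2
    patch = slice(v - r, v + r + 1), slice(u - r, u + r + 1)
    # spatial gradients of the previous frame and the temporal difference
    Iy, Ix = np.gradient(prev.astype(float))
    It = curr.astype(float) - prev.astype(float)
    A = np.stack([Ix[patch].ravel(), Iy[patch].ravel()], axis=1)
    b = -It[patch].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pt[0] + flow[0], pt[1] + flow[1]
```

On a frame where the scene content translates by a small amount, the recovered flow approximates that translation, so the propagated keypoint lands near the feature's new location without invoking the detection network for that frame.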