EP-4494109-B1 - LANE MARKER RECOGNITION

Inventors

YOO, SEUNGWOO
MYEONG, HEESOO
LEE, HEE-SEOK

Dates

Publication Date: 20260506
Application Date: 20230221

Claims (15)

A computer-implemented method (500, 700) for lane marker detection, comprising: generating (510, 705) a set of feature tensors by processing an input image (105, 205) received (505) from a sensor in a vehicle (107) using a convolutional neural network (210), the input image (105, 205) including one or more lane markers (110); generating (515, 710) a set of lane marker localizations by processing the set of feature tensors using a localization network, each lane marker localization of the set of lane marker localizations indicating whether a patch of the input image corresponds to a portion of a lane marker; generating (520, 715) a set of lane marker positions in an x-direction of the input image by processing the set of feature tensors using row-wise regression; generating (525, 720) a set of lane marker end positions in a y-direction of the input image by processing the set of feature tensors using y-end regression; and determining (530, 725) a set of lane marker positions by aggregating the set of lane marker localizations, the set of lane marker positions in the x-direction, and the set of lane marker end positions in the y-direction.
The method (500, 700) of Claim 1, wherein determining (530, 725) the set of lane marker positions comprises: identifying a first subset of lane marker localization positions, from the set of lane marker localizations, that exceeds a defined threshold; and for each respective lane marker localization of the first subset of localization positions, selecting a corresponding lane marker position in the x-direction of the set of lane marker positions in the x-direction based on a respective lane marker end position of the set of lane marker end positions in the y-direction.
The method (500, 700) of Claim 2, wherein determining (530, 725) the set of lane marker positions comprises: selecting a second subset from the first subset of lane marker localization positions using non-maximum suppression based on distance between respective lane marker localizations in the first subset of lane marker localization positions, wherein the distance is defined based on horizontal distance between overlapping regions.
The method (500, 700) of Claim 1, further comprising: applying a random sample consensus, RANSAC (235), technique to the set of lane marker positions; and fitting the set of lane marker positions to one or more polynomial curves.
The method (500, 700) of Claim 1, wherein: the convolutional neural network (210) is a feature pyramid network comprising a bottom-up pathway and a top-down pathway connected by one or more lateral connections; and prior to generating the set of lane marker localizations, the set of lane marker positions in the x-direction, and the set of lane marker end positions in the y-direction, the set of feature tensors are aggregated using a set of learned weights.
The method (500, 700) of Claim 5, wherein the set of learned weights comprises: a first weight corresponding to the localization network; a second weight corresponding to the row-wise regression; and a third weight corresponding to the y-end regression.
The method (500, 700) of Claim 1, wherein training the localization network comprises computing a cross-entropy loss based on output of the localization network and a ground-truth location of one or more lane markers, wherein the ground-truth location comprises a center point and at least one end point.
The method (500, 700) of Claim 1, wherein the row-wise regression comprises a softmax function, and wherein training the row-wise regression comprises computing a smooth L1 loss based on output of the softmax function of the row-wise regression.
The method (500, 700) of Claim 8, wherein training the row-wise regression further comprises computing a confidence loss using a dice loss function.
The method (500, 700) of Claim 1, wherein the y-end regression comprises an exponential function, and wherein training the y-end regression comprises computing a smooth L1 loss based on output of the exponential function of the y-end regression.
A processing system (115, 800) for lane marker detection, comprising: a memory (824) comprising computer-executable instructions; and one or more processors (802, 804, 806, 808) configured to execute the computer-executable instructions and cause the processing system to perform an operation comprising: generating a set of feature tensors by processing an input image (105, 205) received from a sensor in a vehicle (107) using a convolutional neural network (210), the input image (105, 205) including one or more lane markers (110); generating a set of lane marker localizations by processing the set of feature tensors using a localization network, each lane marker localization of the set of lane marker localizations indicating whether a patch of the input image corresponds to a portion of a lane marker; generating a set of lane marker positions in an x-direction of the input image by processing the set of feature tensors using row-wise regression; generating a set of lane marker end positions in a y-direction of the input image by processing the set of feature tensors using y-end regression; and determining a set of lane marker positions by aggregating the set of lane marker localizations, the set of lane marker positions in the x-direction, and the set of lane marker end positions in the y-direction.
The processing system (115, 800) of Claim 11, wherein determining the set of lane marker positions comprises: identifying a first subset of lane marker localization positions, from the set of lane marker localizations, that exceeds a defined threshold; and for each respective lane marker localization of the first subset of localization positions, selecting a corresponding lane marker position of the set of lane marker positions in the x-direction based on a respective lane marker end position of the set of lane marker end positions in the y-direction.
The processing system (115, 800) of Claim 12, wherein determining the set of lane marker positions comprises: selecting a second subset from the first subset of lane marker localization positions using non-maximum suppression based on distance between respective lane marker localizations in the first subset of lane marker localization positions, wherein the distance is defined based on horizontal distance between overlapping regions.
A non-transitory computer-readable medium comprising computer-executable instructions that, when executed by one or more processors (802, 804, 806, 808) of a processing system (115, 800) for lane marker detection, cause the processing system to perform an operation comprising: generating a set of feature tensors by processing an input image received from a sensor in a vehicle using a convolutional neural network, the input image including one or more lane markers; generating a set of lane marker localizations by processing the set of feature tensors using a localization network, each lane marker localization of the set of lane marker localizations indicating whether a patch of the input image corresponds to a portion of a lane marker; generating a set of lane marker positions in an x-direction of the input image by processing the set of feature tensors using row-wise regression; generating a set of lane marker end positions in a y-direction of the input image by processing the set of feature tensors using y-end regression; and determining a set of lane marker positions by aggregating the set of lane marker localizations, the set of lane marker positions in the x-direction, and the set of lane marker end positions in the y-direction.
The non-transitory computer-readable medium of Claim 14, comprising computer-executable instructions that, when executed by one or more processors of a processing system for lane marker detection, cause the processing system to perform operations according to the method (500, 700) of any one of Claims 2 to 10.
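
For illustration only, the aggregation recited in the claims above (thresholding the localizations, selecting x-direction positions based on the y-direction end positions, and non-maximum suppression by horizontal distance over overlapping regions) might be sketched as follows. This is a minimal NumPy sketch and not the claimed implementation. The array layouts, the two-ended (top, bottom) form of the end positions, the function names `lane_distance` and `aggregate`, and the threshold values are all assumptions made for the example.

```python
import numpy as np


def lane_distance(a, b):
    """Mean horizontal distance over the rows where both candidates are defined.

    Rows outside a candidate's vertical extent are NaN, so the "overlapping
    region" is the set of rows where neither array is NaN (an assumption).
    """
    overlap = ~np.isnan(a) & ~np.isnan(b)
    if not overlap.any():
        return np.inf  # no shared rows: treat as different lane markers
    return float(np.mean(np.abs(a[overlap] - b[overlap])))


def aggregate(conf, x_rows, y_end, conf_thresh=0.5, dist_thresh=10.0):
    """Combine the three head outputs into a list of (score, x-per-row) lanes.

    conf   : (N,)   localization confidence per candidate patch
    x_rows : (N, R) float x position per image row, row 0 at the top
    y_end  : (N, 2) hypothetical [top_row, bottom_row] extents, inclusive
    """
    rows = np.arange(x_rows.shape[1])
    candidates = []
    # First subset: localizations that exceed the threshold.
    for i in np.flatnonzero(conf > conf_thresh):
        top, bottom = y_end[i]
        # Keep x positions only inside the predicted vertical extent.
        xs = np.where((rows >= top) & (rows <= bottom), x_rows[i], np.nan)
        candidates.append((float(conf[i]), xs))

    # Second subset: greedy non-maximum suppression, highest score first.
    candidates.sort(key=lambda c: -c[0])
    selected = []
    for score, xs in candidates:
        if all(lane_distance(xs, kept) > dist_thresh for _, kept in selected):
            selected.append((score, xs))
    return selected
```

Under these assumptions, a candidate whose in-extent x positions lie within `dist_thresh` pixels (on average) of a higher-scoring candidate is suppressed, while a candidate with no overlapping rows is always retained.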
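
The claims above also name a softmax function for the row-wise regression, an exponential function for the y-end regression, a smooth L1 loss, and a dice loss. The sketch below shows one common way such pieces are written in NumPy. It is an illustration under stated assumptions, not the patented training procedure: reading the softmax output as an expected column index, applying the exponential directly to a raw head output, the `beta` and `eps` parameters, and all function names are assumptions that the claims do not specify.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def row_wise_x(logits):
    """Map (R, W) per-row logits to (R,) x positions.

    Assumption: the x position is the softmax-weighted mean column index.
    """
    p = softmax(logits, axis=-1)
    cols = np.arange(logits.shape[-1])
    return (p * cols).sum(axis=-1)


def y_end_from_raw(raw):
    """Assumption: the exponential maps a raw output to a positive extent."""
    return np.exp(raw)


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss: quadratic below beta, linear above it."""
    d = np.abs(pred - target)
    return float(np.mean(np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)))


def dice_loss(prob, target, eps=1e-6):
    """Dice loss: one minus the (smoothed) dice overlap coefficient."""
    inter = np.sum(prob * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(prob) + np.sum(target) + eps))
```

With these definitions, a row whose logits are uniform over four columns decodes to x = 1.5, and a row with one dominant logit decodes to approximately that column's index.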

Description

INTRODUCTION

Aspects of the present disclosure relate to lane marker detection.

Modern vehicles are increasingly equipped with advanced driver assistance systems, which, among other things, may include lane marker detection for assisted and autonomous driving functions.

Existing lane marker detection techniques are slow, require significant manual configuration, and lack robustness across many driving scenarios. Thus, many existing systems for lane detection are not suitable for modern vehicles.

Accordingly, techniques are needed for improved lane marker detection.

US 2021/287018 A1 relates to a method for lane marker detection, including: receiving an input image; providing the input image to a lane marker detection model; processing the input image with a shared lane marker portion of the lane marker detection model; processing output of the shared lane marker portion of the lane marker detection model with a plurality of lane marker-specific representation layers of the lane marker detection model to generate a plurality of lane marker representations; and outputting a plurality of lane markers based on the plurality of lane marker representations.

The paper "Lane Marking Regression From Confidence Area Detection to Field Inference" by Hui Lv et al., IEEE Transactions on Intelligent Vehicles, Vol. 6, No. 1, March 2021, XP011839577, describes lane marking detection using a regression network which can simultaneously consider confidence area detection and field inference.

CN 111 460 984 A relates to a lane line detection method based on key points and gradient equalization loss.

BRIEF SUMMARY

The scope of protection is defined by the scope of the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The appended figures depict certain aspects of the one or more aspects and are therefore not to be considered limiting of the scope of the appended claims.

FIG. 1 depicts an example environment and system for lane marker detection using machine learning.
FIG. 2 depicts an example workflow for lane marker detection using machine learning.
FIG. 3 depicts an example model architecture for lane marker detection using machine learning.
FIG. 4 depicts an example flow diagram illustrating a method for training machine learning models to detect lane markers.
FIG. 5 depicts an example flow diagram illustrating a method for generating lane marker instances using machine learning.
FIG. 6 depicts an example flow diagram illustrating a method for aggregating machine learning data to generate lane marker instances.
FIG. 7 depicts an example flow diagram illustrating a method for detecting lane markers.
FIG. 8 depicts an example processing system configured to perform various aspects of the present disclosure.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the drawings. It is contemplated that elements and features of one aspect may be beneficially incorporated in other aspects without further recitation.

DETAILED DESCRIPTION

Aspects of the present disclosure provide techniques for improved lane marker detection. In some aspects, a multi-headed machine learning architecture (e.g., a model having multiple output heads) is used to separately generate lane marker localizations, horizontal (row-wise) positions, and end positions. By aggregating this data, the system is able to generate accurate lane marker instances quickly and efficiently.

Generally, a lane marker is a device or material on a road surface that conveys information, such as where lanes exist on a roadway. Examples of lane markers include painted traffic lines, painted cross-walks, painted parking spaces, reflective markers, curbs, gutters, Botts' dots, and rumble strips, to name a few. Lane markers may be used by an assisted and/or autonomous vehicle navigation system.
For example, a vehicle may include an advanced driver assistance system (ADAS) or a high-level self-driving system (SDS). Such systems are being widely adopted based in large part on concurrent improvements in computer vision technologies.

Although there are a number of components related to ADAS and SDS, such as lane marker detection, vehicle detection tracking, obstacle detection, scene understanding, and semantic segmentation, lane detection is a key component for camera perception and positioning. For example, lane detection is necessary for keeping a vehicle within the ego-lane (the lane in which the vehicle is positioned, which may also be referred to as a host lane), and for assisting the vehicle in changing lanes to the left or the right of the ego-lane.

Many conventional lane marker detection methods are based on semantic segmentation approaches. In the first stage of such approaches, a network is designed to perform a pixel-level classification that assigns each pixel in an image a binary label: lane (or lane marker) or not lane (or not lane marker). However, in each pixel classifica