EP-4293634-B1 - METHOD, PROCESSOR CIRCUIT AND COMPUTER-READABLE STORAGE MEDIUM FOR CARRYING OUT TRAFFIC OBJECT DETECTION IN A MOTOR VEHICLE

Inventors

BAYZIDI, YASIN
HÜGER, Fabian
Schneider, Jan David

Dates

Publication Date: 20260506
Application Date: 20230516

Claims (10)

A method for operating traffic object detection (16) in a processor circuit of a motor vehicle (10), wherein at least one image data set (19) describing relevant imaging (30) of an environment (15) of the motor vehicle (10) is received from at least one environment sensor (17), and by means of at least one machine learning model (ML model), on the basis of the relevant image data set (19),
• determining bounding boxes (23) containing potential images of traffic objects (20, 21) in the imaging (30) and
• extracting feature data of image features (24) from image data of the image data set (19) by means of a feature extraction unit (22) of the at least one ML model;
• detecting a traffic object (20, 21) that is completely depicted or is depicted by more than a predetermined minimum fraction, the minimum fraction being in a range of 65 to 90 percent, within the respective bounding box (23) on the basis of the image features (24) contained therein by means of a classifier unit (25) of the at least one ML model, and the relevant bounding box (23) depicting a traffic object (20, 21) is identified by a detection signal as a result of the detection, the detection signal indicating an ID of the bounding box (23) and/or coordinates,
characterized in that
• for further bounding boxes (23) formed by subtraction, the image features (24) contained therein are combined to form a feature vector (31) in each case, and, during the subtraction, that area portion is subtracted or removed from a bounding box (23) which belongs to a bounding box (23) containing a traffic object (20, 21) identified by means of the classifier unit (25), and
• a relevant distance value (36) from the feature vector (31) to each of a plurality of statistical distribution models (35) is determined, each of which models a statistical distribution of such image features (24) of only one relevant portion (62) of a traffic object (20, 21) and/or a body of a person, and in this case, "portion" (62) means that the distribution models (35) are based on those feature vectors which represent nonpredominant imaging of the traffic object (20, 21) or a pedestrian, i.e. represent only a single body part or obscuring of the traffic object (20, 21) or pedestrian by more than such an "occlusion fraction", which may be in a range of 25 percent to 80 percent, and
• comparing the distance value (36) with a predetermined threshold value (38) and,
• if, according to the comparison, the distance value (36) for one of the statistical distribution models (35) is less than the threshold value (38), it is indicated that a traffic object (20, 21) has gone undetected by the classifier unit (25); namely, if the distance value is less than the threshold value, there is an accordingly high level of similarity or affiliation of the feature vector with the distribution model (35), i.e. the feature vector represents, with an accordingly high probability, a portion (62) of a traffic object (20, 21) or bodily region of a person and/or a human body,
• the feature vector (31) being formed by the image features (24) being combined to form a temporary vector and the temporary vector being reduced to form the feature vector (31) by means of dimension-reducing imaging (30).
The method according to claim 1, wherein, before determining the distance value (36), those bounding boxes (23) are excluded for which it is identified that they overlap in area with a bounding box (23) for which it is indicated by the classifier unit (25) that it depicts a traffic object (20, 21) by more than a predetermined minimum fraction, so as not to calculate a feature vector (31) for all the bounding boxes (23) and not to have to compare said feature vector with the statistical distribution models, and/or wherein the subtraction comprises at least one of the further bounding boxes (23) being formed in that, for one of the bounding boxes (23) which has an overlap with a bounding box (23) for which it is indicated by the classifier unit (25) that it depicts a traffic object (20, 21), the nonoverlapping part is described as at least one further bounding box (23), in order to prevent the classifier unit (25) indicating a single pedestrian while a second, partially obscured pedestrian behind them is overlooked.
The method according to any one of the preceding claims, wherein the dimension-reducing imaging (30) involves a transformation of the temporary vector by means of a principal component analysis, and only a predetermined partial number of the vector components from the transformed vector are used for the feature vector (31).
The method according to any one of the preceding claims, wherein each of the distribution models models a portion (62) that is below a detection threshold of the classifier unit (25).
The method according to any one of the preceding claims, wherein a convolutional network (CNN) is used as the feature extraction unit (22) and/or a deep artificial neural network (DNN) is used as the classifier unit (25).
The method according to any one of the preceding claims, wherein activation values of artificial neurons of at least one network layer of the feature extraction unit (22) are determined as feature data from the feature extraction unit (22).
The method according to any one of the preceding claims, wherein if an undetected traffic object (20, 21) is indicated, a predetermined safety measure (40) is triggered in the motor vehicle (10).
The method according to any one of the preceding claims, wherein for generating the distribution models (35) from training data sets (60), bounding boxes (23) of completely depicted traffic objects (20, 21) are decomposed into the portions (62) and the image features (24) contained in the relevant portion (62) are combined to form respective training feature vectors (63), and the determined training feature vectors (63) are divided into clusters (66) by means of a cluster algorithm, wherein each cluster (66) constitutes one of the statistical distribution models (35).
A processor circuit for a motor vehicle (10), wherein the processor circuit is configured to carry out a method according to any one of claims 1 to 7.
A computer-readable storage medium containing program instructions which, when executed by a processor circuit, cause the processor circuit to carry out a method according to any one of claims 1 to 7 or, when executed by a computer, cause the computer to carry out a method according to claim 8.
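
As a rough illustration of the subtraction described in claims 1 and 2, the following sketch removes from a candidate bounding box the area that belongs to a box already identified by the classifier unit and returns the remainder as further boxes. The box format `(x_min, y_min, x_max, y_max)`, the decomposition into at most four rectangles, and the names `area`, `overlap_fraction` and `subtract_box` are assumptions made for this example; the claims do not prescribe them.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def overlap_fraction(box: Box, detected: Box) -> float:
    """Fraction of `box` that is covered by `detected`."""
    inter = (max(box[0], detected[0]), max(box[1], detected[1]),
             min(box[2], detected[2]), min(box[3], detected[3]))
    total = area(box)
    return area(inter) / total if total > 0 else 0.0


def subtract_box(box: Box, detected: Box) -> List[Box]:
    """Return the part of `box` outside `detected` as up to four rectangles."""
    x0, y0, x1, y1 = box
    ix0, iy0 = max(x0, detected[0]), max(y0, detected[1])
    ix1, iy1 = min(x1, detected[2]), min(y1, detected[3])
    if ix0 >= ix1 or iy0 >= iy1:
        return [box]  # no overlap: the box is kept unchanged
    parts: List[Box] = []
    if x0 < ix0:
        parts.append((x0, y0, ix0, y1))    # strip left of the overlap
    if ix1 < x1:
        parts.append((ix1, y0, x1, y1))    # strip right of the overlap
    if y0 < iy0:
        parts.append((ix0, y0, ix1, iy0))  # strip with smaller y than the overlap
    if iy1 < y1:
        parts.append((ix0, iy1, ix1, y1))  # strip with larger y than the overlap
    return parts
```

For a candidate box `(0, 0, 10, 10)` that is half covered by a detected pedestrian box `(5, -5, 15, 20)`, the remainder is the single strip `(0, 0, 5, 10)`. In the situation named in claim 2, such a strip is where a second, partially obscured pedestrian could still be depicted.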

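The distance test of claim 1, the dimension reduction of claim 3 and the clustering of claim 8 can be pictured with the following minimal sketch. It rests on several assumptions that the claims leave open: each distribution model is represented as a per-dimension mean and variance, the distance value is a variance-normalised Euclidean distance, the cluster algorithm is a plain k-means initialised with the first `k` training vectors, and the principal axes used by `project` are taken as precomputed offline. All function names are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]
Model = Dict[str, Vector]  # assumed model form: {"mean": [...], "var": [...]}


def project(temp_vector: Vector, mean: Vector, axes: List[Vector]) -> Vector:
    """Centre the temporary vector and keep only the given principal axes."""
    centred = [t - m for t, m in zip(temp_vector, mean)]
    return [sum(a * c for a, c in zip(axis, centred)) for axis in axes]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means; the first k vectors serve as initial centres (assumption)."""
    centres = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(v, centres[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def fit_models(vectors: List[Vector], labels: List[int], k: int,
               eps: float = 1e-6) -> List[Model]:
    """Turn each non-empty cluster into one distribution model."""
    models: List[Model] = []
    for j in range(k):
        members = [v for v, lab in zip(vectors, labels) if lab == j]
        if not members:
            continue
        cols = list(zip(*members))
        mean = [sum(col) / len(members) for col in cols]
        var = [sum((x - m) ** 2 for x in col) / len(members) + eps
               for col, m in zip(cols, mean)]
        models.append({"mean": mean, "var": var})
    return models


def distance(feature_vector: Vector, model: Model) -> float:
    """Variance-normalised distance of a feature vector to one model."""
    return math.sqrt(sum((x - m) ** 2 / v for x, m, v in
                         zip(feature_vector, model["mean"], model["var"])))


def undetected_object_indicated(feature_vector: Vector, models: List[Model],
                                threshold: float) -> bool:
    """True if the distance to at least one model is below the threshold."""
    return any(distance(feature_vector, m) < threshold for m in models)
```

In this sketch the expensive steps (clustering and model fitting) run once on training data, while the per-image work is limited to one projection and one distance computation per model, which matches the efficiency aim stated in the description.
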
Description

The invention relates to a method for operating traffic object detection in a control unit of a motor vehicle. The invention also relates to a processor circuit for carrying out the method and to a computer-readable storage medium for enabling a processor circuit to carry out the method.
On the basis of image data of a respective camera image or image data set, i.e. a respective image of the environment, the traffic object detection uses at least one machine learning model (ML model) to determine whether and where a passer-by, e.g. a pedestrian, is depicted in the respective camera image of the environment. From this, by converting a sensor coordinate system of the environment sensor into an absolute coordinate system of the motor vehicle, the position of the passer-by relative to the motor vehicle can be determined. This can be signalled to an automated driving function of the motor vehicle, which can then calculate a driving trajectory of the motor vehicle for passing the passer-by without collision. The automated driving function can be, for example, a driver assistance function (such as lane keeping assistance and/or parking assistance) and/or an autonomous driving function (autopilot) that can plan a driving trajectory for automated, collision-free guidance of the motor vehicle if it is known where, for example, passers-by are located in the environment.
Corresponding prior art is known, for example, from the following scientific publications:
• Shifeng Zhang, Longyin Wen, Xiao Bian, Zhen Lei, and Stan Z. Li, "Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd", European Conference on Computer Vision - ECCV 2018;
• Wanli Ouyang and Xiaogang Wang, "A Discriminative Deep Model for Pedestrian Detection with Occlusion Handling", IEEE, 2012;
• Chen Ning, Li Menglu, Yuan Hao, Su Xueping, Li Yunhong, "Survey of pedestrian detection with occlusion", Complex & Intelligent Systems, 2021;
• Alonso I.P., Llorca D.F., Sotelo M.Á., Bergasa L.M., de Toro P.R., Nuevo J., Ocaña M., Garrido M.A.G., "Combination of Feature Extraction Methods for SVM Pedestrian Detection", Transactions on Intelligent Transportation Systems - IEEE 2007;
• He Y., Zhu C., Yin X.-C., "Mutual-Supervised Feature Modulation Network for Occluded Pedestrian Detection", 25th International Conference on Pattern Recognition (ICPR), 2021;
• Zhou J., Hoang J., "Real Time Robust Human Detection and Tracking System", Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, 2005;
• CAO GONG et al., "Solving Occlusion Problem in Pedestrian Detection by Constructing Discriminative Part Layers", 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, 24 March 2017, pages 91-99;
• Yanqiu Xiao et al., "Deep learning for occluded and multi-scale pedestrian detection: A review", IET Image Processing, IET, UK, vol. 15, no. 2, 14 December 2020, pages 286-301.
An important circumstance when detecting passers-by on the basis of images of the environment is that passers-by are not always visible with their entire body. Occlusion can occur, for example when several passers-by stand next to one another and/or a passer-by stands behind an object, such as the post of a traffic sign. Passer-by detection based on at least one machine learning model can fail here in that such a passer-by is not detected, as is also described in the cited publications (occlusion problem).
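
The conversion from the sensor coordinate system to the coordinate system of the motor vehicle, mentioned at the start of this description, can be pictured as a rigid transform. The sketch below assumes a planar (2D) case in which the mounting offset and mounting yaw angle of the environment sensor are known from calibration; the function name and parameters are hypothetical and not taken from the patent.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def sensor_to_vehicle(point_sensor: Point, mount_offset: Point,
                      mount_yaw_rad: float) -> Point:
    """Rotate a sensor-frame point by the mounting yaw, then shift it by the
    mounting offset, giving its position in the vehicle frame."""
    x, y = point_sensor
    c, s = math.cos(mount_yaw_rad), math.sin(mount_yaw_rad)
    return (mount_offset[0] + c * x - s * y,
            mount_offset[1] + s * x + c * y)
```

For a forward-facing sensor mounted 2 m ahead of the vehicle origin (`mount_offset=(2.0, 0.0)`, `mount_yaw_rad=0.0`), a passer-by located 5 m in front of the sensor is placed 7 m ahead of the vehicle origin.
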
The additional operation of further ML models that are trained to detect partially occluded passers-by, i.e. passers-by of whom only a partial region is visible, is avoided in connection with passer-by detection in a motor vehicle, because it would require correspondingly more computing power, which is generally not available in a processor circuit of a control unit of a motor vehicle.
The object of the invention is to identify efficiently, for an image data set, whether it also depicts a traffic object that is only partially visible (such as, in particular, a passer-by who is only partially visible). The object is achieved by the subject matter of the independent claims. Advantageous developments of the invention are described by the dependent claims, the following description and the figures.
As one solution, the invention comprises a method for operating or carrying out passer-by detection in a processor circuit of a motor vehicle. Such a processor circuit can be formed by a control unit or a group of several control units of the motor vehicle. In a manner known per se, the passer-by detection is based on the fact that at least one image data set (i.e. a camera image or a corresponding image se