
CN-116129396-B - Visual perception method and system for assisting driving


Abstract

The application discloses a visual perception method for driving assistance, comprising the following steps: S100, collecting driving-assistance image data, preprocessing the data set, and forming an input X; S200, constructing a multi-task neural network model F and substituting the input X into it to obtain a plurality of branch-task-specific outputs f_i(X); S300, calculating the final loss function L_all of the multi-task neural network model F; and S400, optimizing with minimization of the loss function L_all as the target to finally obtain the optimal multi-task neural network model F*. The system comprises a data input module, a shared-layer feature extraction module, a task-layer structure search module, and a model output module. Aimed at the real-time requirement of perceiving complex environments in driving assistance, the application constructs a multi-task model that effectively reduces the number of network parameters and the amount of computation, shortens online inference time, and is highly beneficial to practical application.

Inventors

  • LI KE

Assignees

  • 宁波弗浪科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2022-12-20

Claims (7)

  1. A visual perception method for driving assistance, comprising the steps of: S100, collecting driving-assistance image data, preprocessing the data set, and forming an input X; S200, constructing a multi-task neural network model F and substituting the input X into it to obtain a plurality of branch-task-specific outputs f_i(X); S300, calculating the final loss function L_all of the multi-task neural network model F; S400, optimizing with minimization of the loss function L_all as the target to finally obtain the optimal multi-task neural network model F*. In step S300, the output loss of the multi-task neural network model F is obtained by weighting the output losses of the N branch tasks, the structural loss of the multi-task neural network model F is the accumulated sum of the loss differences among the N branch tasks, and the loss function L_all is the sum of the output loss and the structural loss; wherein α is a branch structure parameter, w is the neural network weight, the size of w is obtained by iteration from α, and the value of α in the first iteration is α_i0. Step S400 includes the following specific procedure: S410, performing structure search on the task layer T to obtain branch-task network structure parameters that keep balance among the plurality of tasks and optimal neural network weight parameters, wherein the structure search on the task layer T requires changing the branch structure of the task layer T from a discrete structure into a continuous search space; S420, taking the minimized loss function as the optimization target and training the parameters of step S410 through an algorithm; S430, fitting the trained parameters into the multi-task neural network model F to obtain the optimal multi-task neural network model F* determined by automatic search.
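The loss terms in claim 1 are only described verbally here (the weighted forms themselves are not reproduced in this text). A minimal sketch of one consistent reading, assuming equal branch weights and absolute pairwise differences for the structural loss:

```python
# Sketch of the final loss in steps S300-S400 (assumed forms: the output
# loss is a weighted sum of the N branch losses; the structural loss is
# the accumulated pairwise difference between branch losses).

def output_loss(branch_losses, weights):
    """Weighted sum of the N branch-task output losses."""
    return sum(w * l for w, l in zip(weights, branch_losses))

def structure_loss(branch_losses):
    """Accumulated sum of loss differences among the N branch tasks."""
    n = len(branch_losses)
    return sum(abs(branch_losses[i] - branch_losses[j])
               for i in range(n) for j in range(i + 1, n))

def total_loss(branch_losses, weights):
    """Final loss minimized in step S400: output loss plus structural loss."""
    return output_loss(branch_losses, weights) + structure_loss(branch_losses)
```

The structural term penalizes imbalance between branches, matching the claim's description of "loss differences among N branch tasks".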
  2. The visual perception method for driving assistance as claimed in claim 1, wherein the multi-task neural network model F comprises a parameter sharing layer B and a task layer T, and step S200 comprises the following procedure: S210, substituting the input X into the parameter sharing layer B; S220, the parameter sharing layer B extracting image features of the input X and obtaining a feature map M; S230, substituting the feature map M into the task layer T to obtain the plurality of branch-task-specific outputs f_i(X).
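The layout in claim 2 (a shared layer B feeding parallel branches of the task layer T) can be sketched with plain functions standing in for the networks; the stand-in layers below are hypothetical:

```python
# Sketch of claim 2: shared layer B maps input X to a feature map M (S220),
# and each branch of task layer T maps M to a task-specific output f_i(X) (S230).

def multi_task_forward(x, shared_layer, task_branches):
    m = shared_layer(x)                              # S220: M = B(X)
    return [branch(m) for branch in task_branches]   # S230: [f_1(X), ..., f_N(X)]

# Hypothetical stand-ins for B and the branches of T:
shared = lambda x: [v * 2 for v in x]   # "feature extraction"
branches = [sum, max, min]              # three toy "task heads"
outputs = multi_task_forward([1, 2, 3], shared, branches)
```

In the patent's setting, B would be a convolutional backbone and each branch a task head (detection, segmentation, lane lines); only the data flow is shown here.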
  3. The visual perception method for driving assistance according to claim 2, wherein the initial structure parameters of the task layer T are determined as follows: each branch structure of the task layer T is initialized to obtain a parameter α_i for each layer of the branch structure; the parameters α_i0 of each layer of the branch structure are then obtained by random combination; and the output in step S230 can then be expressed as f_i(X, w(α)).
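One possible reading of the initialization in claim 3; the uniform random draws and the form of the "random combination" step are both assumptions for illustration:

```python
import random

# Sketch of claim 3 (assumed semantics): each layer of each branch of the
# task layer T gets a structure parameter alpha_i, and the first-iteration
# vector alpha_i0 is a random combination of those per-branch parameters.

def init_structure_params(num_branches, layers_per_branch, seed=0):
    rng = random.Random(seed)
    # alpha_i: one parameter per layer of each branch, drawn uniformly in [0, 1)
    alpha = [[rng.random() for _ in range(layers_per_branch)]
             for _ in range(num_branches)]
    # alpha_i0: random combination used as the value of alpha in iteration 1
    alpha0 = [rng.choice(branch) for branch in alpha]
    return alpha, alpha0

alpha, alpha0 = init_structure_params(3, 4)
```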
  4. The visual perception method for driving assistance as set forth in claim 1, wherein the converted continuous search space is represented by the following formula:
  ō^(i,j)(x) = Σ_{o∈O} [ exp(α_o^(i,j)) / Σ_{o'∈O} exp(α_{o'}^(i,j)) ] · o(x)
  wherein the node x is a hidden representation, the function o^(i,j) is a candidate operation from node i to node j, O represents the set of candidate operations, and α_o^(i,j) is the weight of candidate operation o between nodes i and j.
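The relaxation described in claim 4 matches the softmax-weighted mixed operation used in differentiable architecture search; a numeric sketch with illustrative candidate operations (the operation set is an assumption, not the patent's):

```python
import math

# Sketch of the continuous search space in claim 4: each discrete choice
# of candidate operation o in O between two nodes is replaced by a
# softmax-weighted mixture, with alpha_o as the structure weight for o.

def mixed_operation(x, candidate_ops, alphas):
    """Softmax-weighted sum over the candidate operation set O."""
    exps = [math.exp(a) for a in alphas]
    z = sum(exps)
    return sum((e / z) * op(x) for e, op in zip(exps, candidate_ops))

# Illustrative candidates: identity, doubling, and a "zero" (no connection) op.
ops = [lambda x: x, lambda x: 2 * x, lambda x: 0.0]
y = mixed_operation(3.0, ops, [0.0, 0.0, 0.0])   # equal weights
```

Because the mixture is differentiable in α, the branch structure can be optimized by gradient descent together with the network weights.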
  5. The visual perception method for driving assistance according to claim 1, wherein in step S420, bilevel optimization is performed on the output loss and the structural loss to realize minimization of the loss function L_all, in particular by gradient descent to minimize the following objectives:
  min_α L_all(w*(α), α)
  s.t. w*(α) = argmin_w L_out(w, α)
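Claim 5's bilevel objective is commonly approximated by alternating first-order updates on w and α; the quadratic toy losses below are stand-ins for illustration, not the patent's actual losses:

```python
# Sketch of step S420 as alternating gradient descent: update the network
# weight w against the (toy) output loss (w - alpha)^2, then update the
# structure parameter alpha against the (toy) structure loss (alpha - 1)^2.

def alternate_descent(w, alpha, steps=100, lr=0.1):
    for _ in range(steps):
        grad_w = 2 * (w - alpha)       # d/dw of toy output loss
        w -= lr * grad_w
        grad_a = 2 * (alpha - 1.0)     # d/dalpha of toy structure loss
        alpha -= lr * grad_a
    return w, alpha

w_star, alpha_star = alternate_descent(0.0, 5.0)   # both converge near 1.0
```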
  6. The visual perception method for driving assistance as set forth in claim 5, wherein the objectives are solved through an iterative optimization algorithm to obtain the optimal branch structure parameter α and network weight w, and the continuous structure parameter α is restored to a discrete network node connection operation o, obtaining the multi-task neural network model F* with a determined task branch structure and optimal neural network weight performance; wherein the formula for discretizing the continuously represented neural network structure is as follows:
  o^(i,j) = argmax_{o∈O} α_o^(i,j)
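The discretization in claim 6 reduces, per node connection, to an argmax over the structure weights; a minimal sketch with hypothetical operation names:

```python
# Sketch of claim 6: after the search converges, each continuously mixed
# edge is collapsed to the single candidate operation in O whose structure
# weight alpha_o is largest.

def discretize(candidate_names, alphas):
    """Pick the candidate operation with the largest structure weight."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return candidate_names[best]

# Hypothetical candidate set and learned weights:
chosen = discretize(["identity", "conv3x3", "skip"], [0.1, 1.4, -0.3])
```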
  7. A system for the visual perception method for driving assistance according to any one of claims 1-6, comprising: a data input module for receiving images of a plurality of data sets and preprocessing the images; a shared-layer feature extraction module for extracting low-level semantic features of the images; a task-layer structure search module for automatically searching the network structure parameters of each task; and a model output module for outputting the optimal model.

Description

Visual perception method and system for assisting driving

Technical Field

The application relates to the technical field of driving assistance, and in particular to a visual perception method and a visual perception system for driving assistance.

Background

With the rapid development of deep learning technology, the field of driving assistance has become one of its main application scenarios. Visual perception is an important module of a driving-assistance system, and its key tasks, including lane line detection, drivable-area segmentation, and target detection, are challenging. A conventional deep learning algorithm solves only one of these tasks and cannot meet the requirement of perceiving multiple environmental factors during assisted driving. Existing advanced algorithms often improve perception efficiency through multi-task learning, and existing multi-task driving-assistance visual perception systems are usually improved in the training stage of the deep learning model, mainly from three angles: the network structure, the optimization method, and the learning of task relations. Specifically:

(1) Multi-task models for driving-assistance visual perception focus on the design of the network structure and effectively improve performance on single tasks. For example, an FCN branch may be added on the basis of Faster R-CNN to generate masks of the corresponding categories, so that semantic segmentation tasks can be completed accurately. Alternatively, using an encoder-decoder architecture, a multi-task model with a shared encoder and three task decoders may be designed for classification, object detection, and semantic segmentation. Or a lightweight CNN may be used as the encoder to extract image features, with the feature maps then input into each decoder to complete the visual perception tasks, achieving optimal performance in terms of both accuracy and speed.
(2) Optimization methods for multi-task models mainly concern the optimization of the loss function and of the gradients. The former weights and sums the losses of the different tasks, the key being how to set the weights; methods based on task uncertainty, learning rate, model performance, return amplitude, and geometric mean are currently used. The latter adjusts the gradients, mainly correcting them to balance the training rates between tasks, as in the GradNorm method.

(3) Task relations also require attention, since tasks with weak relevance may cause negative-transfer effects. It is therefore desirable to have the model learn certain task representations or associations between tasks, e.g., clustering tasks according to similarity, and to use the learned feature representations to further improve performance.

Although the above methods and systems have achieved a certain success, they still have the following two problems: (1) expert experience is required to adjust the task weights or gradient coefficients for a particular scene or a particular task; and (2) a significant amount of time is required for manual experiments to determine the structural parameters of the model in the system.

Disclosure of Invention

One of the objectives of the present application is to provide a visual perception method and system for driving assistance that can solve at least one of the above problems in the prior art.
To achieve this purpose, the technical scheme adopted by the application is a visual perception method for driving assistance comprising the following steps: S100, collecting driving-assistance image data, preprocessing the data set, and forming an input X; S200, constructing a multi-task neural network model F and substituting the input X into the model F to obtain a plurality of branch-task-specific outputs f_i(X); S300, calculating the final loss function L_all of the multi-task neural network model F; and S400, optimizing with minimization of the loss function L_all as the target to finally obtain the optimal multi-task neural network model F*.

Preferably, the multi-task neural network model F comprises a parameter sharing layer B and a task layer T, and step S200 comprises the following procedure: S210, substituting the input X into the parameter sharing layer B; S220, the parameter sharing layer B extracting image features of the input X and obtaining a feature map M; S230, substituting the feature map M into the task layer T to obtain the plurality of branch-task-specific outputs f_i(X).

Preferably, the initial structure parameters of the task layer T are determined by initializing each branch structure of the task layer T to obtain a parameter α_i for each branch structure, and then obtaining a parameter α_i0 for each branch structure by random combination; in step S230, the output f_i(X) can then be expressed as f_i(X, w(α)), where α is a branch structure parameter, w is the neural network weight, the size of w is obtained by iterating α, and the value of α in the first iteration is α_i0. Preferably