CN-122021743-A - Structural feature-based YOLO-pose network loss function design method

Abstract

The application provides a method for designing a YOLO-pose network loss function based on structural features. The method comprises: defining a pixel coordinate system in an image; annotating key points in the image to obtain an annotated json file of the image; extracting the pixel coordinates of the key points from the json file; selecting a base network model, inputting the image, loading pre-trained weights, and performing model inference; parsing the network model output to obtain the pixel coordinates of the key points predicted by the YOLO-pose network model; screening the predicted key point pixel coordinates, numbering them, and recording the number ids; aligning the screened key point pixels with the key point pixels in the json file by number id; and applying the structural-feature-based YOLO-pose loss function LOSS. To address the poor precision of the key points output by the YOLO-pose network model, the application designs a reasonable loss function so that the network model achieves higher precision and recall.

Inventors

WANG RUI
TAO CHENGGANG
NI JING
LI TIANXU
WANG XIAOHONG
YIN TIAN
PU SAIHU

Assignees

AVIC Chengdu Aircraft Design and Research Institute (中国航空工业集团公司成都飞机设计研究所)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-27

Claims (8)

1. A method for designing a YOLO-pose network loss function based on structural features, the method comprising: step 1, defining a pixel coordinate system in an image; step 2, annotating key points in the image to obtain an annotated json file of the image; step 3, extracting the pixel coordinates and number id of each key point in the json file; step 4, selecting a base network model, inputting an image, loading pre-trained weights, and performing model inference; step 5, parsing the network model output to obtain the pixel coordinates and number id of each key point predicted by the YOLO-pose network model; step 6, screening the predicted key point pixel coordinates and recording their number ids; step 7, aligning the key point pixels screened in step 6 with the key point pixels in the json file of step 3 by number id; and step 8, applying a structural-feature-based YOLO-pose loss function LOSS.
2. The method according to claim 1, wherein step 1 comprises: taking the upper left corner of the image as the origin of the pixel coordinate system, with the positive direction of the U axis pointing to the right and the positive direction of the V axis pointing straight down, so as to establish the pixel coordinate system.
3. The method according to claim 1, wherein step 2 comprises: annotating the key points in the image with the annotation software labelme to obtain the annotated json file of the image.
4. The method according to claim 1, wherein the pixel coordinates in step 3 are ground-truth values.
5. The method according to claim 1, wherein the structural-feature-based YOLO-pose loss function LOSS in step 8 is specifically expressed as follows: [formula not reproduced in this text], wherein the total loss is composed of a similarity loss, a structural loss and a distance loss; the formula uses the pixel coordinates of the key point numbered id+1 and the pixel coordinates of the key point numbered id, a region weight value for the similarity loss portion, and a region weight value for the structural loss portion; σ is the convergence difficulty of the key points, where a larger value means convergence is more difficult and a smaller value means convergence is easier; s is the area of the category detection box in which the key points are located; and n is the total number of key points in the json file of step 3.
6. The method according to claim 1, wherein step 4 comprises: selecting YOLO-pose as the base network model, inputting an image, loading pre-trained weights, and performing model inference.
7. The method according to claim 1, wherein step 6 comprises: screening the pixel coordinates of the key points whose key point confidence c is greater than a preset value, and recording their number ids.
8. The method of claim 7, wherein the preset value is 0.

Description

Technical Field

The application belongs to the technical field of deep learning, and particularly relates to a method for designing a YOLO-pose network loss function based on structural features.

Background

With the rapid development of neural network models and the hardware platforms that run them, neural network models are increasingly widely applied in engineering. The loss function of a neural network model is key to the precision and recall of the network's predicted output, so designing a reasonable loss function is necessary.

Disclosure of Invention

The application aims to solve the problem of poor precision of the key points output by a YOLO-pose network model, and designs a reasonable loss function so that the network model achieves higher precision and recall.

The application provides a method for designing a YOLO-pose network loss function based on structural features, which comprises the following steps:

Step 1, defining a pixel coordinate system in an image;
Step 2, annotating key points in the image to obtain an annotated json file of the image;
Step 3, extracting the pixel coordinates and number id of each key point in the json file;
Step 4, selecting a base network model, inputting an image, loading pre-trained weights, and performing model inference;
Step 5, parsing the network model output to obtain the pixel coordinates and number id of each key point predicted by the YOLO-pose network model;
Step 6, screening the predicted key point pixel coordinates and recording their number ids;
Step 7, aligning the key point pixels screened in step 6 with the key point pixels in the json file of step 3 by number id;
Step 8, applying a structural-feature-based YOLO-pose loss function LOSS.
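Steps 3, 6 and 7 can be illustrated with a minimal Python sketch. It is not taken from the application. It assumes that each key point is stored in the labelme json file as a `point` shape whose `label` is the integer number id, and that the parsed model output is already available as `(id, u, v, c)` tuples; the function names `load_labelme_keypoints`, `screen_predictions` and `align_by_id` are hypothetical.

```python
import json


def load_labelme_keypoints(json_text):
    """Step 3 (sketch): return {id: (u, v)} from a labelme annotation.

    Assumption: every key point is a 'point' shape and its label is the
    integer number id. Other shape types (e.g. rectangles) are skipped.
    """
    data = json.loads(json_text)
    keypoints = {}
    for shape in data.get("shapes", []):
        if shape.get("shape_type") != "point":
            continue
        u, v = shape["points"][0]
        keypoints[int(shape["label"])] = (float(u), float(v))
    return keypoints


def screen_predictions(predictions, threshold=0.0):
    """Step 6 (sketch): keep predictions whose confidence c > threshold.

    predictions is an iterable of (id, u, v, c) tuples (assumed layout).
    """
    return {kid: (u, v) for kid, u, v, c in predictions if c > threshold}


def align_by_id(predicted, labelled):
    """Step 7 (sketch): pair predicted and labelled key points by id.

    Ids present on only one side are dropped; output is sorted by id.
    """
    common = sorted(predicted.keys() & labelled.keys())
    return [(kid, predicted[kid], labelled[kid]) for kid in common]
```

The aligned list of `(id, predicted, labelled)` triples is then the natural input for the loss in step 8. Dropping ids that appear on only one side is a design choice of this sketch; the application does not state how unmatched key points are handled.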
Preferably, step 1 includes: taking the upper left corner of the image as the origin of the pixel coordinate system, with the positive direction of the U axis pointing to the right and the positive direction of the V axis pointing straight down, so as to establish the pixel coordinate system.

Preferably, step 2 includes: annotating the key points in the image with the annotation software labelme to obtain the annotated json file of the image.

Preferably, the pixel coordinates in step 3 are ground-truth values.

Preferably, the structural-feature-based YOLO-pose loss function LOSS in step 8 takes the following specific form: [formula not reproduced in this text], wherein the total loss is composed of a similarity loss, a structural loss and a distance loss; the formula uses the pixel coordinates of the key point numbered id+1 and the pixel coordinates of the key point numbered id, a region weight value for the similarity loss portion, and a region weight value for the structural loss portion; σ is the convergence difficulty of the key points, where a larger value means convergence is more difficult and a smaller value means convergence is easier; s is the area of the category detection box in which the key points are located; and n is the total number of key points in the json file of step 3.

Preferably, step 4 includes: selecting YOLO-pose as the base network model, inputting an image, loading pre-trained weights, and performing model inference.

Preferably, step 6 includes: screening the pixel coordinates of the key points whose key point confidence c is greater than a preset value, and recording their number ids.

Preferably, the preset value is 0.
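The exact formula of the application is not reproduced in this text, so the following Python sketch only illustrates how a loss combining the three named terms could be assembled. Every formula in it is an assumption: the distance term is taken as the mean Euclidean error, the similarity term as one minus a mean OKS-style score `exp(-d^2 / (2 * s * sigma^2))`, and the structural term as the mean difference between predicted and labelled vectors joining key point id to key point id+1. The function name `structural_pose_loss` and the weights `w_sim` and `w_struct` are hypothetical stand-ins for the two region weight values.

```python
import math


def structural_pose_loss(pred, true, sigma, s, w_sim=1.0, w_struct=1.0):
    """Illustrative three-term key point loss (assumed form, not the patent's).

    pred, true: lists of (u, v) pixel coordinates, already aligned by id.
    sigma: per-key-point convergence-difficulty values.
    s: area of the detection box containing the key points.
    Returns (total, distance, similarity, structural).
    """
    n = len(true)
    d = [math.dist(p, g) for p, g in zip(pred, true)]

    # Distance loss: mean Euclidean error in pixels (assumption).
    loss_dist = sum(d) / n

    # Similarity loss: 1 - mean OKS-style score (assumption).
    oks = [math.exp(-di ** 2 / (2.0 * s * sg ** 2)) for di, sg in zip(d, sigma)]
    loss_sim = 1.0 - sum(oks) / n

    # Structural loss: compare the vector from key point id to id+1 in the
    # prediction with the same vector in the labels (assumption).
    edge_errors = []
    for i in range(n - 1):
        pe = (pred[i + 1][0] - pred[i][0], pred[i + 1][1] - pred[i][1])
        ge = (true[i + 1][0] - true[i][0], true[i + 1][1] - true[i][1])
        edge_errors.append(math.dist(pe, ge))
    loss_struct = sum(edge_errors) / len(edge_errors) if edge_errors else 0.0

    total = loss_dist + w_sim * loss_sim + w_struct * loss_struct
    return total, loss_dist, loss_sim, loss_struct
```

Under these assumptions, a prediction that is a pure translation of the labelled key points has a nonzero distance loss but zero structural loss, whereas a prediction that distorts the shape is penalized by both terms. This is one way a structural term can further constrain the convergence direction of the key points.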
The beneficial technical effects of the application are as follows:

a) A structural constraint is added to the loss function of the conventional YOLO-pose network model, further constraining the convergence direction of the feature points;
b) The proportions of the distance loss, similarity loss and structural loss are allocated so that each constraint is used to full effect, which greatly improves the completeness of the loss function.

Drawings

In order to illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a diagram showing a conventional loss function of a YOLO-pose network model according to an embodiment of the present application;
FIG. 2 is a schematic diagram of runway key points according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of