CN-116188726-B - Human body 3D grid model construction method and system based on millimeter wave and image fusion

Abstract

The invention discloses a method and system for constructing a human body 3D mesh model by fusing millimeter wave and image data. A commercial millimeter wave radar (AWR1443BOOST with DCA1000EVM) serves as the millimeter wave signal transceiver and a camera collects images. The two signals are fused to extract the positions of human body key points and to estimate the body contour, and a three-dimensional human body mesh model is finally regressed.

Inventors

DING HAN
CHEN ZHENBIN
HUANG TENG
ZHAO SHUAI
WANG GE
ZHAO KUN
HUI WEI
ZHAO JIZHONG

Assignees

Xi'an Jiaotong University (西安交通大学)

Dates

Publication Date: 20260512
Application Date: 20230306

Claims (8)

1. A human body 3D mesh model construction method based on millimeter wave and image fusion, characterized by comprising the following steps: S1, collecting the reflected signal of an FMCW (frequency-modulated continuous wave) signal, converting the reflected signal into an intermediate frequency signal, and collecting an RGB image; S2, performing static clutter elimination and phase calibration on the intermediate frequency signal obtained in step S1 to obtain a phase-calibrated signal $S'$; S3, performing three fast Fourier transforms on the signal $S'$ processed in step S2 to obtain a human body 3D point cloud corresponding to the intermediate frequency signal, and performing the MaskIMG algorithm operations, including graying and color inversion, on the RGB image obtained in step S1 to obtain image information; S4, inputting the 3D point cloud and the image information obtained in step S3 into a deep neural network to generate a human body 3D mesh, and rendering the human body 3D mesh onto the original image to obtain an image containing the projection of the human body 3D mesh, thereby completing 3D modeling of the human body mesh; wherein the deep neural network comprises: a feature extraction module, which processes the millimeter wave signal and the image signal separately and fuses their features; a joint point identification and contour estimation module, which takes the features fused by the feature extraction module as input and outputs a heat map of the human body joint points and a black-and-white contour map of the figure; and a mesh generation module, which inputs the 2D joint points from the joint point identification and contour estimation module, together with the fused features from the feature extraction module, into PoseNet to output 72 parameters representing joint rotation, inputs the contour estimation map from the joint point identification and contour estimation module, together with the fused features from the feature extraction module, into ShapeNet to obtain 10 parameters representing the shape, and inputs these parameters into the Skinned Multi-Person Linear (SMPL) model to generate the 3D model of the human body mesh; the millimeter wave signal is processed by a local attention method: for a millimeter wave signal $X$ with feature dimension $D$, $[Q, K, V] = XW$, where $Q$, $K$, and $V$ are the Query, Key, and Value vectors, respectively, and $W$ is a feature transformation matrix; the attention weight is obtained from the inner product of Query and Key divided by $\sqrt{D}$; k SA (self-attention) operations are expanded in parallel and concatenated to give the corresponding output; position embeddings are used for the different heads, each head is associated with certain parts of the human body, and regional features are obtained through training.
2. The human body 3D mesh model construction method based on millimeter wave and image fusion according to claim 1, wherein in step S1 a commercial millimeter wave radar is used to transmit the FMCW signal, and a monocular RGB/IR camera is used to collect images.
3. The human body 3D mesh model construction method based on millimeter wave and image fusion according to claim 1, wherein in step S2 the phase-calibrated signal $S'$ is: $S' = AS$, where $S$ is the original signal and $A$ is the phase shift vector.
4. The human body 3D mesh model construction method based on millimeter wave and image fusion according to claim 3, wherein the phase shift vector $A$ is: $A = [1, \ldots, e^{j\varphi_m}, \ldots]$, where $\varphi_m$ is the phase shift introduced for the m-th virtual antenna, $e$ is the base of the natural logarithm, and $j$ is the imaginary unit.
5. The human body 3D mesh model construction method based on millimeter wave and image fusion according to claim 1, wherein in step S3 the phase-calibrated signal $S'$ is processed sequentially by a Range-FFT and a Doppler-FFT, the mean of the Doppler-FFT heat map is subtracted from all signals, the 128 points with the highest intensity are selected, and a 3D-FFT is applied to obtain the 3D point cloud.
6. The human body 3D mesh model construction method based on millimeter wave and image fusion according to claim 1, wherein the output $\mathrm{SA}(X)$ is: $\mathrm{SA}(X) = \mathrm{softmax}\left(QK^{T}/\sqrt{D}\right)V$; the k SA operations are expanded in parallel and concatenated to give the corresponding output: $\mathrm{MSA}(X) = [\mathrm{SA}_1(X); \ldots; \mathrm{SA}_k(X)]\,W_o$, where $W_o$ is a weight matrix, $Q$, $K$, and $V$ are the Query, Key, and Value vectors, respectively, and $D$ is the feature dimension of the millimeter wave signal.
7. The human body 3D mesh model construction method based on millimeter wave and image fusion according to claim 1, wherein the joint point identification and contour estimation module is trained using an L2 loss and a cross-entropy loss.
8. A human body 3D mesh model construction system based on millimeter wave and image fusion, characterized by comprising: an acquisition module, which collects the reflected signal of an FMCW (frequency-modulated continuous wave) signal, converts the reflected signal into an intermediate frequency signal, and collects an RGB image; a preprocessing module, which performs static clutter elimination and phase calibration on the intermediate frequency signal obtained by the acquisition module to obtain a phase-calibrated signal $S'$; a training module, which performs three fast Fourier transforms on the signal $S'$ processed by the preprocessing module to obtain a human body 3D point cloud corresponding to the intermediate frequency signal, and performs the MaskIMG algorithm operations, including graying and color inversion, on the RGB image obtained by the acquisition module to obtain image information; and an output module, which inputs the 3D point cloud and the image information obtained by the training module into a deep neural network to generate a human body 3D mesh, and renders the human body 3D mesh onto the original image to obtain an image containing the projection of the human body 3D mesh, thereby completing 3D modeling of the human body mesh; wherein the deep neural network comprises: a feature extraction module, which processes the millimeter wave signal and the image signal separately and fuses their features; a joint point identification and contour estimation module, which takes the features fused by the feature extraction module as input and outputs a heat map of the human body joint points and a black-and-white contour map of the figure; and a mesh generation module, which inputs the 2D joint points from the joint point identification and contour estimation module, together with the fused features from the feature extraction module, into PoseNet to output 72 parameters representing joint rotation, inputs the contour estimation map from the joint point identification and contour estimation module, together with the fused features from the feature extraction module, into ShapeNet to obtain 10 parameters representing the shape, and inputs these parameters into the Skinned Multi-Person Linear (SMPL) model to generate the 3D model of the human body mesh; the millimeter wave signal is processed by a local attention method: for a millimeter wave signal $X$ with feature dimension $D$, $[Q, K, V] = XW$, where $Q$, $K$, and $V$ are the Query, Key, and Value vectors, respectively, and $W$ is a feature transformation matrix; the attention weight is obtained from the inner product of Query and Key divided by $\sqrt{D}$; k SA (self-attention) operations are expanded in parallel and concatenated to give the corresponding output; position embeddings are used for the different heads, each head is associated with certain parts of the human body, and regional features are obtained through training.
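
The attention computation recited in claims 1, 6, and 8 can be illustrated with a minimal sketch. It is not the patented network: the function names (`matmul`, `softmax`, `self_attention`, `multi_head_attention`), the separate per-head projection matrices, and the use of softmax normalization are assumptions made for illustration, and the per-head position embeddings are omitted. The scores are scaled by the square root of the query/key dimension, which equals $D$ when the projections preserve the feature dimension.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """One SA operation: project x to Q, K, V, then weight V by scaled Q.K scores."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    dim = len(q[0])                           # query/key dimension used for scaling
    k_t = [list(col) for col in zip(*k)]      # transpose of K
    scores = matmul(q, k_t)                   # inner products of Query and Key
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, v), weights

def multi_head_attention(x, heads, w_out):
    """Run k SA operations in parallel, concatenate per point, apply a weight matrix."""
    outputs = [self_attention(x, *head)[0] for head in heads]
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    return matmul(concat, w_out)

# Hypothetical toy input: two points with feature dimension 2.
x = [[1.0, 0.0], [0.0, 1.0]]
eye = [[1.0, 0.0], [0.0, 1.0]]
single, attn = self_attention(x, eye, eye, eye)
w_out = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.0], [0.0, 0.5]]
fused = multi_head_attention(x, [(eye, eye, eye), (eye, eye, eye)], w_out)
```

In this toy case both heads are identical and `w_out` averages them, so `fused` reproduces the single-head output; in a trained network each head would have its own projections and position embedding.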

Description

Human body 3D grid model construction method and system based on millimeter wave and image fusion

Technical Field

The invention belongs to the technical field of intelligent perception, and particularly relates to a human body 3D mesh model construction method and system based on millimeter wave and image fusion.

Background

Estimating and constructing three-dimensional meshes of the human body has many practical uses. For example, capturing a user's different poses enables more realistic AR services in fields such as entertainment and medical care; an identity-based personnel management system can authorize and authenticate users by acquiring body-shape characteristics (such as height, build, and gait); and by recognizing continuous human movements, factory managers and supervisors can easily determine whether workers' operations on a production line meet the standard.
A considerable amount of work already addresses human body 3D mesh generation, but most of it is image-based, that is, it relies on computer vision. When lighting is dim or excessively strong, the clarity of the captured images degrades severely and mesh generation fails. Attention has therefore turned to methods based on wireless signals, and many studies reconstruct the human body mesh from radio frequency signals. Radio frequency signals are unaffected by factors such as weather and lighting and produce strong echoes when reflected by the human body, so they can capture human body characteristics while avoiding the problems that cameras face. However, the complexity of the wireless signal acquisition hardware and the stability of the method remain challenges that prevent the technology from being widely used.

Disclosure of Invention

The technical problem to be solved by the invention is to address the above defects in the prior art by providing a human body 3D mesh model construction method and system that fuses millimeter waves and images. By fusing wireless radio frequency signals with image information, a three-dimensional human body mesh model is reconstructed rapidly, which solves the technical problems that human body actions cannot be modeled in 3D, or are modeled poorly, when image quality is poor, and that modeling with a purely wireless-signal method is unstable.
The invention adopts the following technical scheme:
A human body 3D mesh model construction method based on millimeter wave and image fusion comprises the following steps:
S1, converting the reflected signal of an FMCW (frequency-modulated continuous wave) signal into an intermediate frequency signal, and collecting an RGB image;
S2, performing static clutter elimination and phase calibration on the intermediate frequency signal obtained in step S1;
S3, performing three fast Fourier transforms on the signal processed in step S2 to obtain a human body 3D point cloud corresponding to the intermediate frequency signal, and performing the MaskIMG operation on the RGB image obtained in step S1 to obtain image information;
S4, inputting the 3D point cloud and the image information obtained in step S3 into a deep neural network to generate a human body 3D mesh, and rendering the human body 3D mesh onto the original image to obtain an image containing the projection of the human body 3D mesh, thereby completing 3D modeling of the human body mesh.
Specifically, in step S1, a commercial millimeter wave radar is used to transmit the FMCW signal, and a monocular RGB/IR camera is used to collect images.
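
The preprocessing in steps S2 and S3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation. All function names are hypothetical, and three choices are assumed: static clutter is modeled as the per-sample mean over chirps, phase calibration multiplies each virtual antenna by a complex exponential (following the relation between the calibrated and original signals in claim 3), with the sign of the exponent assumed, and the image step uses BT.601 luma weights for graying before inversion. A naive DFT stands in for each FFT stage because it yields the same values.

```python
import cmath

def remove_static_clutter(chirps):
    """Subtract, per range sample, the mean over all chirps (assumed clutter model)."""
    count = len(chirps)
    means = [sum(column) / count for column in zip(*chirps)]
    return [[x - m for x, m in zip(chirp, means)] for chirp in chirps]

def phase_calibrate(antenna_signals, phase_shifts):
    """Multiply each virtual antenna's samples by exp(j * phase); the sign is assumed."""
    return [[cmath.exp(1j * phi) * x for x in samples]
            for samples, phi in zip(antenna_signals, phase_shifts)]

def dft(x):
    """Naive discrete Fourier transform; an FFT computes the same values faster."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def gray_and_invert(rgb_image):
    """Convert 8-bit RGB pixels to grayscale (BT.601 weights), then invert."""
    return [[255 - round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]

# Hypothetical toy data: two chirps with two samples each.
frame = [[1 + 0j, 2 + 0j], [3 + 0j, 2 + 0j]]
moving = remove_static_clutter(frame)          # the constant second sample cancels

# A single complex tone at bin 1 of a 4-point transform.
tone = [cmath.exp(2j * cmath.pi * t / 4) for t in range(4)]
spectrum = dft(tone)

# One white and one black pixel.
mask = gray_and_invert([[(255, 255, 255), (0, 0, 0)]])
```

Applying `dft` along the sample, chirp, and antenna axes in turn corresponds to the three transforms of step S3; a production pipeline would use an optimized FFT library instead.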
Specifically, in step S2, the phase-calibrated signal $S'$ is: $S' = AS$, where $S$ is the original signal and $A$ is the phase shift vector.
Further, the phase shift vector $A$ is: $A = [1, \ldots, e^{j\varphi_m}, \ldots]$, where $\varphi_m$ is the phase shift introduced for the m-th virtual antenna, $e$ is the base of the natural logarithm, and $j$ is the imaginary unit.
Specifically, in step S3, the phase-calibrated signal $S'$ is processed sequentially by a Range-FFT and a Doppler-FFT, the mean of the Doppler-FFT heat map is subtracted from all signals, the 128 points with the highest intensity are selected, and a 3D-FFT is applied to obtain the 3D point cloud.
Specifically, in step S4, the deep neural network includes: a feature extraction module, which processes the millimeter wave signal and the image signal separately and fuses their features; a joint point identification and contour estimation module, which takes the features fused by the feature extraction module as input and outputs a heat map of the human body joint points and a black-and-white contour map of the figure;