CN-122015821-A - Multi-IMU end-to-end navigation method based on mask fusion mechanism
Abstract
The invention provides a multi-IMU end-to-end navigation method based on a mask fusion mechanism. The method constructs a deep neural network architecture named MultiImuNet, comprising a plurality of independent bidirectional LSTM feature extraction branches, a feature fusion module based on a learnable mask, and a position regression module. By designing a dedicated feature encoder for each IMU, the method takes multi-IMU time-series data as input in a sliding-window manner and outputs relative displacement estimates end to end. It requires no external auxiliary signals, offers high precision, strong robustness, and good scalability, and is suitable for autonomous navigation systems in GNSS-denied environments.
Inventors
- QIN JIE
- YIN YUHAN
- LI HAIJUN
- LIU CHONG
- LIU CHANG
Assignees
- Beijing Institute of Automatic Control Equipment (北京自动化控制设备研究所)
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-12-29
Claims (8)
- 1. A multi-IMU end-to-end navigation method based on a mask fusion mechanism, characterized by comprising the following steps: step one, constructing a deep neural network architecture MultiImuNet, wherein the architecture jointly learns high-dimensional time-series features from data acquired by a plurality of IMU sensors and outputs a relative displacement estimate; the architecture comprises an IMU feature extraction module, a feature fusion module, and a position regression module; the IMU feature extraction module assigns a dedicated bidirectional long short-term memory network (Bi-LSTM) to each independent IMU as a feature encoder, so as to fully capture context dependencies in the time series and obtain a feature vector of fixed dimension; the feature fusion module adaptively assigns a weight to each feature channel through a lightweight sub-network operating on the feature vectors, realizing dynamic weighted fusion; step two, designing a loss function, wherein the network optimization target is to minimize the mean square error between the predicted displacement and the actual displacement; and step three, designing the training sample input, training on the basis of step one and step two, and estimating relative displacement using the trained model.
- 2. The multi-IMU end-to-end navigation method based on a mask fusion mechanism of claim 1, wherein obtaining the feature vector of fixed dimension with the IMU feature extraction module specifically comprises: letting the input sequence of the i-th IMU over the time window T be X_i = {x_{i,1}, x_{i,2}, ..., x_{i,T}}, where x_{i,t} = [a_x, a_y, a_z, ω_x, ω_y, ω_z]^T is the measurement of the i-th IMU at time t; inputting the sequence into a stacked Bi-LSTM network comprising L layers, each layer having hidden dimension h_dim; the Bi-LSTM outputs a forward hidden state h^f_{i,t} and a backward hidden state h^b_{i,t} at each time step t, and the bidirectional output is obtained by concatenation: h_{i,t} = [h^f_{i,t}; h^b_{i,t}]; taking the output h_{i,T} of the last time step t = T as the time-series feature representation of that IMU, and mapping it into a feature vector of fixed dimension through a fully connected layer: f_i = FC_i(h_{i,T}) ∈ R^{d_f}, where FC_i is the fully connected layer corresponding to the i-th IMU, d_f is the feature dimension, and R^{d_f} denotes the space of d_f-dimensional vectors.
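The per-IMU encoder in claim 2 can be sketched in plain NumPy. This is a minimal single-layer Bi-LSTM with random stand-in weights; the function names, window length, hidden dimension, and d_f below are illustrative assumptions, not values from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates are stacked as [input, forget, cell, output]."""
    hdim = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:hdim])
    f = sigmoid(z[hdim:2 * hdim])
    g = np.tanh(z[2 * hdim:3 * hdim])
    o = sigmoid(z[3 * hdim:4 * hdim])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def bilstm_encode(X, params_fw, params_bw, Wfc, bfc):
    """Encode one IMU window X (T x 6) into a fixed-dimension feature f_i.

    Runs a forward and a backward LSTM pass over the window, concatenates
    the two final hidden states (the bidirectional output at t = T), and
    maps the result through a fully connected layer (FC_i in claim 2)."""
    T = X.shape[0]
    hdim = params_fw[2].shape[0] // 4
    h, c = np.zeros(hdim), np.zeros(hdim)
    for t in range(T):                       # forward pass
        h, c = lstm_step(X[t], h, c, *params_fw)
    h_fw = h
    h, c = np.zeros(hdim), np.zeros(hdim)
    for t in reversed(range(T)):             # backward pass
        h, c = lstm_step(X[t], h, c, *params_bw)
    h_bw = h
    h_T = np.concatenate([h_fw, h_bw])       # bidirectional output at t = T
    return Wfc @ h_T + bfc                   # feature vector f_i in R^{d_f}

# Illustrative shapes: T = 10 samples, 6-axis IMU, hidden dim 8, d_f = 16.
rng = np.random.default_rng(0)
hdim, d_f = 8, 16
make = lambda: (rng.normal(0, 0.1, (4 * hdim, 6)),
                rng.normal(0, 0.1, (4 * hdim, hdim)),
                np.zeros(4 * hdim))
X = rng.normal(size=(10, 6))
f_i = bilstm_encode(X, make(), make(),
                    rng.normal(0, 0.1, (d_f, 2 * hdim)), np.zeros(d_f))
print(f_i.shape)  # (16,)
```

In a real implementation the weights would of course be learned; the point of the sketch is the data flow: forward pass, backward pass, concatenation at t = T, then FC_i.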
- 3. The multi-IMU end-to-end navigation method based on a mask fusion mechanism according to claim 1 or 2, wherein adaptively assigning weights to each feature channel through a lightweight sub-network to realize dynamic weighted fusion specifically comprises: concatenating the feature vectors {f_1, f_2, ..., f_N} extracted from the N IMUs along the feature dimension to form a joint feature vector F = [f_1; f_2; ...; f_N] ∈ R^{N·d_f}; inputting F into a linear transformation layer and outputting a weight vector m of the same dimension as F: m = σ(W_m F + b_m), where W_m is a learnable weight matrix, b_m is a bias term, and σ(·) is the Sigmoid activation function, ensuring each weight value falls in the [0, 1] interval; the weight vector m is a mask used to weight the original joint features element by element: F̃ = m ⊙ F, where ⊙ denotes the Hadamard (element-wise) product; the weighted feature F̃ is flattened and fed into the position regression module as the fused high-order representation.
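The mask fusion step of claim 3 reduces to a sigmoid-gated Hadamard product. A minimal NumPy sketch, with random stand-in parameters where the patent would have learned ones:

```python
import numpy as np

def mask_fusion(features, W_m, b_m):
    """Fuse per-IMU feature vectors with a learnable sigmoid mask (claim 3).

    F = [f_1; ...; f_N] is the joint feature vector; m = sigmoid(W_m F + b_m)
    is a mask in [0, 1]^(N*d_f); the fused representation is m ⊙ F."""
    F = np.concatenate(features)                  # joint feature vector
    m = 1.0 / (1.0 + np.exp(-(W_m @ F + b_m)))    # per-channel weights in [0, 1]
    return m * F                                  # element-wise (Hadamard) product

# Illustrative: N = 3 IMUs, d_f = 4.
rng = np.random.default_rng(1)
feats = [rng.normal(size=4) for _ in range(3)]
dim = 3 * 4
F_tilde = mask_fusion(feats, rng.normal(0, 0.1, (dim, dim)), np.zeros(dim))
print(F_tilde.shape)  # (12,)
```

Because every mask value lies in [0, 1], each fused channel can only be attenuated relative to the raw joint feature — this is what lets the network down-weight an untrustworthy IMU channel rather than averaging it in blindly.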
- 4. A multi-IMU end-to-end navigation method based on a mask fusion mechanism according to any one of claims 1 to 3, wherein outputting a relative displacement estimate in three-dimensional space from the weighted fusion result of the feature fusion module specifically comprises: inputting the fused feature F̃ into a linear regression head, which directly outputs the relative displacement estimate in three-dimensional space: Δp̂ = W_r F̃ + b_r, where Δp̂ = [Δp_N, Δp_U, Δp_E]^T contains the north, up, and east displacement components respectively, W_r ∈ R^{3×(N·d_f)} is a trainable weight matrix, and b_r ∈ R^3 is a trainable bias.
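The regression head of claim 4 is a single affine map. A sketch with illustrative dimensions (the weights here are random placeholders for trained parameters):

```python
import numpy as np

def regress_displacement(F_tilde, W_r, b_r):
    """Linear regression head (claim 4): maps the fused feature vector to a
    relative displacement estimate [north, up, east]."""
    return W_r @ F_tilde + b_r   # W_r is 3 x (N*d_f), b_r is 3-dimensional

rng = np.random.default_rng(2)
F_tilde = rng.normal(size=12)    # fused feature with N*d_f = 12
dp = regress_displacement(F_tilde, rng.normal(0, 0.1, (3, 12)), np.zeros(3))
print(dp.shape)  # (3,)
```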
- 5. The multi-IMU end-to-end navigation method based on a mask fusion mechanism of claim 4, wherein in step three the whole network is trained in an end-to-end manner; training samples are extracted from the original trajectory data through sliding windows, and each sample contains N IMU sequences of length T together with the corresponding true relative displacement Δp_gt.
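The sliding-window sample extraction of claim 5 can be sketched as follows; the array layout and the stride parameter are assumptions for illustration (the patent does not specify them).

```python
import numpy as np

def make_training_samples(imu_streams, positions, T, stride):
    """Cut synchronized IMU streams and a ground-truth position trace into
    sliding-window training samples (claim 5).

    imu_streams: array of shape (N, L, 6) for N IMUs over L time steps.
    positions:   array of shape (L, 3) with ground-truth positions.
    Returns (windows, targets): windows has shape (num_samples, N, T, 6) and
    each target is the true relative displacement Δp_gt over its window."""
    N, L, _ = imu_streams.shape
    windows, targets = [], []
    for start in range(0, L - T + 1, stride):
        windows.append(imu_streams[:, start:start + T, :])
        targets.append(positions[start + T - 1] - positions[start])
    return np.stack(windows), np.stack(targets)

# Illustrative: 2 IMUs, 100 time steps, window T = 20, stride 10.
rng = np.random.default_rng(3)
streams = rng.normal(size=(2, 100, 6))
pos = np.cumsum(rng.normal(size=(100, 3)), axis=0)
X, y = make_training_samples(streams, pos, T=20, stride=10)
print(X.shape, y.shape)  # (9, 2, 20, 6) (9, 3)
```

An overlapping stride (stride < T) yields more samples per trajectory; a stride of T gives non-overlapping windows matching the inference mode of claim 7.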
- 6. The multi-IMU end-to-end navigation method based on a mask fusion mechanism of claim 5, wherein the loss function is designed as the batch-mean squared error between predicted and true displacements: L = (1/B) Σ_{b=1}^{B} ||Δp̂_b − Δp_gt,b||², where B is the batch size.
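The loss of claim 6, written out as a small worked example (the batch values are made up to check the arithmetic):

```python
import numpy as np

def displacement_mse(pred, gt):
    """Batch MSE over displacement estimates (claim 6):
    L = (1/B) * sum_b || Δp̂_b − Δp_gt,b ||^2."""
    diff = pred - gt
    return float(np.mean(np.sum(diff * diff, axis=1)))

pred = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
gt = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
# ||(1,0,0)||^2 = 1, ||(0,2,0)||^2 = 4, mean over B = 2 is 2.5.
print(displacement_mse(pred, gt))  # 2.5
```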
- 7. The multi-IMU end-to-end navigation method based on a mask fusion mechanism according to any one of claims 1 to 6, wherein in step three the trained model is used to perform relative displacement estimation, specifically comprising: the system receives multiple IMU data streams in real time in a sliding-window manner; every time T new time steps of data are received, the trained MultiImuNet model is invoked once to predict the relative displacement; accumulating all prediction results rebuilds the complete motion trajectory: p_K = p_0 + Σ_{k=1}^{K} Δp̂_k, where K is the number of completed windows and p_0 is the initial position.
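The accumulation step of claim 7 is a running sum of per-window predictions. A sketch with made-up displacement values:

```python
import numpy as np

def accumulate_trajectory(p0, displacements):
    """Rebuild a trajectory from per-window displacement predictions
    (claim 7): p_K = p_0 + sum_{k=1}^{K} Δp̂_k."""
    return p0 + np.cumsum(np.asarray(displacements), axis=0)

p0 = np.zeros(3)
dps = [[1.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
traj = accumulate_trajectory(p0, dps)
print(traj[-1])  # final position after K = 3 windows: [2. 0. 2.]
```

Note that, as in any dead-reckoning scheme, per-window prediction errors still accumulate in p_K; the patent's claim is that the learned per-window estimates drift far more slowly than raw double integration.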
- 8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1-7 when executing the computer program.
Description
Multi-IMU end-to-end navigation method based on a mask fusion mechanism
Technical Field
The invention belongs to the technical field of inertial navigation and artificial intelligence fusion, and particularly relates to a multi-IMU end-to-end navigation method based on a mask fusion mechanism. The method is particularly suitable for autonomous navigation systems operating in indoor or occluded environments without external auxiliary signals, such as mobile robots, intelligent vehicles, and wearable devices.
Background
The IMU is widely applied in pedestrian navigation, unmanned aerial vehicle control, augmented reality, automatic driving, and other fields owing to its high-frequency response, strong autonomy, and independence from external infrastructure. A conventional inertial navigation system (INS) calculates the attitude, velocity, and position of the carrier by integrating gyroscope angular velocity and accelerometer data. However, due to inherent sensor noise, zero-offset drift, and similar effects, the integration process causes errors to accumulate over time; this is especially pronounced in low-cost MEMS-grade IMUs, leading to significant deviation of the position estimate from the true trajectory during long periods of operation. To alleviate this problem, the prior art mainly adopts two approaches: multi-sensor fusion strategies, such as GNSS/INS integrated navigation and visual/inertial odometry (VIO), which correct the inertial system by introducing external observations; and error modeling and compensation methods based on the Kalman filter and its variants (such as the EKF and UKF). However, these methods rely on accurate system noise models and parameter tuning, and struggle to operate continuously in GNSS-denied or vision-failure scenarios.
In recent years, with the development of deep learning, researchers have proposed treating IMU data as a time series and performing end-to-end position estimation with models such as recurrent neural networks (RNN) and long short-term memory networks (LSTM). A representative example is IONet, which learns the mapping between IMU sequences and relative displacement through a bidirectional LSTM network, effectively suppressing integration drift during short-duration tasks. Subsequent work further explored the AI-IMU framework, which replaces the traditional solution flow entirely with neural networks and exhibits performance superior to traditional algorithms in certain scenarios. However, existing deep learning methods still have obvious limitations: first, most models only process a single IMU input and cannot fully exploit the spatial-distribution redundancy and dynamic complementarity among multiple IMUs; second, multi-IMU fusion is mostly simple concatenation or averaging, lacking a mechanism for dynamically assessing and selecting each channel according to the credibility of its information; third, model robustness over long sequences is insufficient, slow drift still occurs, and high-precision continuous navigation requirements are difficult to meet. Therefore, there is a need for an end-to-end learning method that can effectively integrate information from multiple IMUs, has adaptive fusion capability, and achieves stable, high-precision navigation without external correction.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. Therefore, the invention provides a multi-IMU end-to-end navigation method based on a mask fusion mechanism.
The method realizes deep fusion of data from a plurality of inertial measurement units (IMU) and high-precision relative displacement estimation by constructing a deep neural network with multi-branch time-series modeling capability combined with a learnable mask weighted fusion mechanism. The invention addresses the serious error accumulation of traditional inertial navigation systems during long-duration operation, the insufficient multi-source IMU fusion capability of existing deep learning models, and their poor robustness. The technical scheme of the invention is as follows: according to one aspect, a multi-IMU end-to-end navigation method based on a mask fusion mechanism is provided, the navigation method comprising: step one, constructing a deep neural network architecture MultiImuNet, wherein the architecture jointly learns high-dimensional time-series features from data acquired by a plurality of IMU sensors and outputs a relative displacement estimate; the architecture comprises an IMU feature extraction module, a feature fusion module, and a position regression module; the IMU feature extraction module assigns a dedicated bidirectional long short-term memory network (Bi-LSTM) to each independent IMU