CN-116168319-B - Target identification realization method based on MPSOC


Abstract

The application provides a target recognition implementation method based on MPSOC, which belongs to the technical field of artificial intelligence. The method specifically comprises: calling DPUCZDX8G IP cores at the FPGA end of a multiprocessor system-on-chip (MPSOC) platform and designing a multi-core deep learning processor (DPU) parallel data calculation module that accelerates, in parallel, the computation of a convolutional neural network target recognition algorithm model; compiling, based on the target recognition algorithm yolox, a target recognition network model compatible with the instruction execution of the multi-core deep learning processor; and deploying the compiled network model instruction file yolox.xmodel. This processing scheme improves the real-time performance and reliability of the target recognition system.

Inventors

WANG DONGJING
Qiu Jiayue
ZHENG XINYUE
ZHANG FENG
GAO WEILIN

Assignees

苏州长风航空电子有限公司

Dates

Publication Date: 20260508
Application Date: 20221228

Claims (7)

1. A target identification realization method based on MPSOC, characterized by comprising the following steps: step one, calling DPUCZDX8G IP cores at the FPGA end of a multiprocessor system-on-chip (MPSOC) platform, designing a multi-core deep learning processor (DPU) parallel data calculation module, and performing parallel accelerated computation of a convolutional neural network target recognition algorithm model; step two, compiling, based on the target recognition algorithm yolox, a target recognition algorithm network model compatible with the instruction execution of the multi-core deep learning processor; step three, deploying the compiled network model instruction file yolox.xmodel into the DPU module at the FPGA end of the MPSOC platform for execution; wherein step one comprises: designing a multi-core DPU data calculation module at the FPGA end by adding two DPUCZDX8G IP cores and a Zynq UltraScale+ MPSoC IP core and connecting the DPUCZDX8G IP cores to the Zynq UltraScale+ MPSoC IP core, wherein each DPUCZDX8G IP core comprises three AXI bus master interfaces and one AXI bus slave interface, the three AXI bus master interfaces being two data interfaces and one instruction interface, used for fetching data and instructions respectively; the AXI bus data master interfaces of the two DPUCZDX8G IP cores are respectively connected to the high-speed AXI bus interfaces S_AXI_HP0_FPD, S_AXI_HP1_FPD, S_AXI_HP2_FPD and S_AXI_HP3_FPD of the Zynq UltraScale+ MPSoC IP core; the AXI bus instruction master interfaces of the two DPUCZDX8G IP cores are jointly connected, through an AXI bus interface, to the S_AXI_LPD interface of the Zynq UltraScale+ MPSoC IP core; the AXI buses of all DPUCZDX8G IP cores are connected, through an AXI interface, to the M_AXI_HPM0_LPD interface of the Zynq UltraScale+ MPSoC IP core, and the CPU end accesses the deep learning processor register space through the M_AXI_HPM0_LPD interface; the DPUCZDX8G IP cores and the Zynq UltraScale+ MPSoC core, interacting over the AXI bus, form the deep learning processor computing module of the target recognition system.
2. The MPSOC-based target recognition implementation method according to claim 1, wherein a VDMA core is added at the FPGA end and mounted on the AXI bus, the VDMA core being connected to both a serial digital interface (SDI) acquisition module and the Zynq UltraScale+ MPSoC core; the acquired SDI video stream is cached in the double data rate synchronous dynamic random access memory (DDR SDRAM) of the FPGA, and when the CPU end reads image data, the VDMA core accelerates the reading of the data in the DDR SDRAM of the FPGA and pushes it onto the AXI bus for transmission to the CPU end.
3. The MPSOC-based target recognition implementation method according to claim 1, wherein step two includes modifying the target recognition algorithm yolox so that it is compatible with the instruction execution of the deep learning processor, forming a new yolox network model, with the following specific steps: removing the focus structure at the input end of the yolox network and instead applying a 3 × 3 convolution kernel to the three channel values of the input image, with the convolution stride set to 2, so that the feature map obtained by the convolution operation is the same as that obtained by the focus structure; changing the SiLU activation function in the yolox network structure to the ReLU activation function, and adjusting the training method to limit the effect of the activation function change on overall network performance; and removing the sigmoid function from the detection head at the yolox network output and implementing it in the post-processing stage of the network output results, that is, outside the target recognition network itself.
4. The MPSOC-based target recognition implementation method according to claim 3, wherein the new yolox network is built with the PyTorch deep learning framework and trained on a data set to generate a network model weight file yolox.pth, the model weights generated by training being 32-bit floating-point values, and the PyTorch quantization tool in the Xilinx Vitis AI tool chain is called to quantize the model weights to 8-bit integers.
5. The MPSOC-based target recognition implementation method according to claim 4, wherein, during quantization, the quantized weights are calibrated by forward network inference over a calibration data set.
6. The MPSOC-based target recognition implementation method according to claim 4, wherein step three specifically comprises: compiling the quantized model with the Vitis AI compiler vai_c_xir to generate a computation graph of the yolox network model, and mapping the computations in the computation graph to a deep learning processor instruction sequence to generate the deep learning processor instruction file yolox.xmodel; and deploying the instruction file yolox.xmodel into the FPGA-end deep learning processor module of the MPSOC platform for execution.
7. The MPSOC-based target recognition implementation method according to claim 6, wherein step three further comprises designing a multithread task scheduling mechanism for the algorithm and deploying it to run at the CPU end: six thread tasks are first created in the main program; the first thread is used for receiving and preprocessing image data, and calls the direct memory access module at the CPU end to receive the serial digital interface (SDI) image data on the AXI bus and cache it in the CPU-end double data rate synchronous dynamic random access memory (DDR SDRAM) for image preprocessing; the second to fifth threads, four in total, are used for calling DPU engines to run the algorithm network model; before the deep learning processor engines are called, the yolox.xmodel file is loaded once in the main program and a serialization operation is performed on the instruction file to form a computation sequence graph; four deep learning processor engines are then created from the computation sequence graph and called in the four threads respectively; the deep learning processor engine in each thread transmits the image data and network model parameters to the FPGA-end deep learning processor module through the AXI bus; and after the computation is finished, the result data are transmitted back to the CPU end, and the recognition result information is overlaid on the image and displayed in real time.
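
The thread layout described in claim 7 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function `fake_dpu_engine` is a hypothetical stand-in for a real DPU engine call, the queue sizes are arbitrary, and the role given to the sixth thread (collecting results) is an assumption, since the claim text does not state it clearly.

```python
import queue
import threading

NUM_DPU_THREADS = 4   # the claim creates four DPU engines, one per thread
_STOP = object()      # sentinel that tells a consumer to shut down


def preprocess(frame):
    """Stand-in for image preprocessing: scale pixel values to [0, 1]."""
    return [float(p) / 255.0 for p in frame]


def fake_dpu_engine(tensor):
    """Hypothetical placeholder for a DPU engine call; it only sums the input."""
    return sum(tensor)


def receiver(frames, in_q):
    # Thread 1: receive frames, preprocess them, hand them to the DPU threads.
    for idx, frame in enumerate(frames):
        in_q.put((idx, preprocess(frame)))
    for _ in range(NUM_DPU_THREADS):
        in_q.put(_STOP)  # one stop marker per DPU thread


def dpu_worker(in_q, out_q):
    # Threads 2-5: each runs the model on whatever frame it dequeues next.
    while True:
        item = in_q.get()
        if item is _STOP:
            out_q.put(_STOP)
            return
        idx, tensor = item
        out_q.put((idx, fake_dpu_engine(tensor)))


def collector(out_q, results):
    # Thread 6 (assumed role): gather results until all DPU threads finish.
    finished = 0
    while finished < NUM_DPU_THREADS:
        item = out_q.get()
        if item is _STOP:
            finished += 1
        else:
            idx, value = item
            results[idx] = value


def run_pipeline(frames):
    """Run all six threads over `frames` and return results in frame order."""
    in_q, out_q, results = queue.Queue(maxsize=8), queue.Queue(), {}
    threads = [threading.Thread(target=receiver, args=(frames, in_q))]
    threads += [threading.Thread(target=dpu_worker, args=(in_q, out_q))
                for _ in range(NUM_DPU_THREADS)]
    threads.append(threading.Thread(target=collector, args=(out_q, results)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(frames))]
```

Calling `run_pipeline([[0, 255], [255, 255]])` returns `[1.0, 2.0]`. Results are keyed by frame index because the four worker threads may finish out of order.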

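Two of the network changes in claim 3 can be checked with simple arithmetic, sketched below in plain Python. The sketch assumes a padding of 1 for the 3 × 3 convolution (the claim gives only the kernel size and the stride of 2) and uses a hypothetical score threshold of 0.5. It shows that the stride-2 convolution yields the same spatial size as the focus structure, and how the sigmoid removed from the detection head could be applied to raw outputs in post-processing.

```python
import math


def focus_output_size(h, w):
    """Focus slicing keeps every second pixel, halving height and width."""
    return h // 2, w // 2


def conv_output_size(h, w, kernel=3, stride=2, padding=1):
    """Standard convolution output-size formula (padding=1 is an assumption)."""
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return out_h, out_w


def sigmoid(x):
    """Sigmoid evaluated on the CPU, outside the deployed network."""
    return 1.0 / (1.0 + math.exp(-x))


def postprocess_scores(raw_logits, threshold=0.5):
    """Convert raw head outputs to (index, score) pairs at or above threshold."""
    scores = [sigmoid(v) for v in raw_logits]
    return [(i, s) for i, s in enumerate(scores) if s >= threshold]
```

For a 640 × 640 input, both `focus_output_size` and `conv_output_size` give `(320, 320)`, so the layers after the input stage see the same feature map size.
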
Description

Target identification realization method based on MPSOC

Technical Field

The application relates to the field of artificial intelligence, and in particular to a target identification realization method based on MPSOC.

Background

At present, target recognition technology based on optical images falls into two main types: the traditional template matching method, and the mainstream target recognition algorithms based on neural networks. The traditional template matching algorithm depends heavily on the features of the template samples, so it has poor robustness and low recognition accuracy; the mainstream research direction is therefore target recognition based on neural network processing. A neural-network-based target recognition method can achieve generalized target detection by training on a large amount of sample data, and has higher recognition accuracy. However, the application of AI target recognition technology in the aviation airborne field is not yet mature, mainly because existing target recognition algorithms are difficult to deploy on airborne terminal equipment. The high-performance GPU computing platforms that generally support AI algorithms suffer from high operating power consumption, low reliability and a hardware architecture that is difficult to extend, so GPU-based high-performance computing platforms are difficult to apply in the harsh aviation airborne environment.

Disclosure of Invention

In view of the above, the application provides a target recognition implementation method based on MPSOC, which solves the problems in the prior art and improves the real-time performance and reliability of a target recognition system.
The application provides a target identification realization method based on MPSOC, which adopts the following technical scheme:
A target identification realization method based on MPSOC comprises the following steps: step one, calling DPUCZDX8G IP cores at the FPGA end of a multiprocessor system-on-chip (MPSOC) platform, designing a multi-core deep learning processor (DPU) parallel data calculation module, and performing parallel accelerated computation of a convolutional neural network target recognition algorithm model; step two, compiling, based on the target recognition algorithm yolox, a target recognition algorithm network model compatible with the instruction execution of the multi-core deep learning processor; and step three, deploying the compiled network model instruction file yolox.xmodel to the DPU module at the FPGA end of the MPSOC platform for execution.
Optionally, step one includes: designing a multi-core DPU data calculation module at the FPGA end by adding two DPUCZDX8G IP cores and a Zynq UltraScale+ MPSoC IP core and connecting the DPUCZDX8G IP cores to the Zynq UltraScale+ MPSoC IP core, wherein each DPUCZDX8G IP core comprises three AXI bus master interfaces and one AXI bus slave interface, the three AXI bus master interfaces being two data interfaces and one instruction interface, used for fetching data and instructions respectively; the AXI bus data master interfaces of the two DPUCZDX8G IP cores are respectively connected to the high-speed AXI bus interfaces S_AXI_HP0_FPD, S_AXI_HP1_FPD, S_AXI_HP2_FPD and S_AXI_HP3_FPD of the Zynq UltraScale+ MPSoC IP core; the AXI bus instruction master interfaces of the two DPUCZDX8G IP cores are jointly connected, through an AXI bus interface, to the S_AXI_LPD interface of the Zynq UltraScale+ MPSoC IP core; the AXI buses of all DPUCZDX8G IP cores are connected, through an AXI interface, to the M_AXI_HPM0_LPD interface of the Zynq UltraScale+ MPSoC IP core, and the CPU end accesses the deep learning processor register space through the M_AXI_HPM0_LPD interface; the DPUCZDX8G IP cores and the Zynq UltraScale+ MPSoC core, interacting over the AXI bus, form the deep learning processor computing module of the target recognition system.
Optionally, a VDMA core is added at the FPGA end and mounted on the AXI bus, the VDMA core being connected to both a serial digital interface (SDI) acquisition module and the Zynq UltraScale+ MPSoC core; the acquired SDI video stream is cached in the double data rate synchronous dynamic random access memory (DDR SDRAM) of the FPGA, and when the CPU end reads image data, the VDMA core accelerates the reading of the data in the DDR SDRAM of the FPGA and pushes it onto the AXI bus for transmission to the CPU end.
Optionally, step two includes modifying the target recognition algorithm yolox so that it is compatible with the instruction execution of the deep learning processor, forming a new yolox network model, with the following specific steps: removing the focus structure at the input end of the yolox network, using a 3 × 3 convolution kernel to perform a convolution operation on the three channel values of the input image, set