CN-122008226-A - Automatic sorting method and system for aviation parts based on simulation and real teaching combined training
Abstract
The invention discloses an automatic sorting method and system for aviation parts based on combined simulation and real teaching training. It builds a system framework consisting of simulation data construction and generation, real data acquisition, data preprocessing, VLA model joint training, and task execution optimization. A simulation scene is built on the MuJoCo platform to generate large-scale multimodal training data, and a PIKA terminal is used to acquire real teaching data. After unified standardized preprocessing, a VLA model that fuses DinoV, SigLIP and Llama 2 7B is trained by simulation pre-training followed by real-data fine-tuning. Finally, robot action execution is optimized through ACT technology and closed-loop control. The invention achieves end-to-end learning of aviation part recognition, grasp planning and sorting action generation, improves sorting precision, stability and generalization capability, and can be widely applied to scenarios such as aviation assembly, aviation material management and automated aviation warehousing.
Inventors
- FANG ZHIJUN
- XUE HAN
- WANG JIANNAN
- QIN JINLONG
Assignees
- Donghua University (东华大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-03-20
Claims (8)
- 1. An automatic sorting method for aviation parts based on combined simulation and real teaching training, characterized by comprising the following steps: S1, simulation data construction and generation: building an interactive simulation environment for aviation part sorting actions on the MuJoCo physics simulation platform, loading three-dimensional models of the aviation parts, mechanical arm, clamping jaw and working area together with a virtual RGB camera, and configuring physical parameters; S2, real data acquisition: adopting a multi-sensor integrated acquisition scheme based on a PIKA terminal, in which an operator holds the PIKA terminal to perform aviation part sorting operations while the visual stream, depth stream, six-degree-of-freedom pose, clamping jaw state, IMU inertial information and task semantic labels are recorded in real time, the data being automatically time-synchronized to form a structured real teaching data set; S3, data preprocessing: applying unified standardization to the simulation and real data sets, including size standardization, quality filtering and data augmentation of images; denoising, hole filling and pixel-level alignment of depth data; coordinate system alignment, filter smoothing and fixed-length interpolation of trajectory data; and normalization and binary encoding of clamping jaw information; then time-synchronizing the multimodal data on a unified timestamp and packaging it into a structured training data set containing multimodal input samples and the corresponding action sequences; S4, VLA model joint training: constructing a VLA model consisting of a visual encoder, a text encoder, a multimodal fusion network and an action decoder, pre-training the model on the simulation data set, and then fine-tuning it on the real teaching data set, thereby realizing end-to-end joint learning of visual information, task text and action sequences; S5, task execution optimization: generating robot action instructions with the trained VLA model, splitting complex action sequences into action blocks using the Action Chunking Transformer (ACT) technique, processing the action blocks with a Transformer architecture, and, in combination with closed-loop control that receives real-time sensor feedback, dynamically adjusting the actions of the mechanical arm to complete high-precision automatic sorting of the aviation parts.
- 2. The method of claim 1, wherein the scene randomization strategy in step S1 includes randomly changing the position, posture, number and stacking manner of the aviation parts, perturbing the view angle and spatial position of the virtual camera, randomly adjusting the illumination intensity and illumination angle of the scene, and replacing the background texture.
- 3. The method of claim 1, wherein the visual encoder in step S4 consists of DinoV, SigLIP and an MLP Projector, DinoV extracting low-level spatial features of the image, SigLIP extracting high-level semantic features of the image, and the MLP Projector mapping both types of features to a feature space compatible with the language model.
- 4. The method of claim 1, wherein the text encoder in step S4 is the Llama Tokenizer, which converts the natural language task instruction into tokens that the model can process, and the multimodal fusion network is a Llama 2 7B model, which jointly processes the visual features and the text tokens to generate the multimodal fusion feature representation.
- 5. The method according to claim 1, wherein the action decoder in step S4 comprises an Action De-Tokenizer for converting the multimodal fusion feature representation into 7D robot action instructions, the 7D robot action instructions comprising a displacement Δx, a pose Δθ and a gripping force Δgrip.
- 6. The method of claim 1, wherein the optimization by the Action Chunking Transformer (ACT) technique in step S5 consists of dividing a long action sequence into action blocks corresponding to independent steps such as grasping, moving and placing, capturing the dependencies between the action blocks through a self-attention mechanism, generating control instructions, and dynamically fine-tuning each action block based on real-time sensor feedback.
- 7. The method of claim 1, wherein the aviation parts in step S1 include bolts, connectors and pins, and the actions of the mechanical arm in step S5 are executed under closed-loop control, receiving visual and force sensor feedback in real time and correcting action deviations.
- 8. An automatic sorting system for aviation parts implementing the method according to any of claims 1 to 7, characterized in that it comprises: a simulation data construction and generation module for generating a simulated multimodal data set for aviation part sorting on the MuJoCo platform; a real data acquisition module for multimodal data acquisition and structuring of sorting operations in a real environment based on the PIKA terminal; a data preprocessing module for cleaning, standardizing, time-synchronizing and packaging the simulation and real data to generate a unified structured training data set; a VLA model joint training module for joint training by simulation-data pre-training and real-data fine-tuning, outputting a VLA model capable of generating robot action instructions; and a task execution optimization module for converting the instructions generated by the VLA model into precise sorting actions of the mechanical arm and dynamically optimizing those actions based on ACT technology and closed-loop control.
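As an illustration of the trajectory and clamping jaw preprocessing in step S3 and the action blocks in step S5, the following is a minimal sketch and not the patented implementation. All function names, the 0.5 open/closed threshold, the fixed chunk size, and the layout of a 7D action as (dx, dy, dz, droll, dpitch, dyaw, grip) are assumptions made for the example; the claims only state that trajectories are interpolated to a fixed length, jaw information is normalized and binary-coded, and long action sequences are split into blocks.

```python
from bisect import bisect_right
from typing import List, Sequence


def resample_fixed_length(times: Sequence[float],
                          values: Sequence[float],
                          n_out: int) -> List[float]:
    """Linearly interpolate one trajectory channel onto n_out evenly spaced timestamps.

    `times` must be strictly increasing.
    """
    if len(times) != len(values) or len(times) < 2 or n_out < 2:
        raise ValueError("need matching inputs with >= 2 samples and n_out >= 2")
    t_start, t_end = times[0], times[-1]
    resampled = []
    for i in range(n_out):
        t = t_start + (t_end - t_start) * i / (n_out - 1)
        # Locate the source segment [times[j], times[j + 1]] that contains t.
        j = min(max(bisect_right(times, t) - 1, 0), len(times) - 2)
        weight = (t - times[j]) / (times[j + 1] - times[j])
        resampled.append(values[j] + weight * (values[j + 1] - values[j]))
    return resampled


def encode_gripper(widths: Sequence[float],
                   max_width: float,
                   threshold: float = 0.5) -> List[int]:
    """Normalize jaw openings to [0, 1], then binarize (1 = open, 0 = closed).

    The threshold is a hypothetical choice, not a value from the patent.
    """
    if max_width <= 0:
        raise ValueError("max_width must be positive")
    normalized = [min(max(w / max_width, 0.0), 1.0) for w in widths]
    return [1 if n >= threshold else 0 for n in normalized]


def chunk_actions(actions: Sequence[Sequence[float]],
                  chunk_size: int) -> List[List[Sequence[float]]]:
    """Split an action sequence into consecutive blocks of at most chunk_size steps.

    Fixed-size blocks are an assumption; the claims describe blocks that
    correspond to steps such as grasping, moving and placing.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    return [list(actions[i:i + chunk_size])
            for i in range(0, len(actions), chunk_size)]


if __name__ == "__main__":
    # One position channel sampled at irregular-free timestamps, resampled to 5 points.
    print(resample_fixed_length([0.0, 1.0, 2.0], [0.0, 10.0, 20.0], 5))
    # Jaw openings in metres with an assumed 0.08 m maximum opening.
    print(encode_gripper([0.0, 0.02, 0.08], max_width=0.08))
    # Five assumed 7D actions (dx, dy, dz, droll, dpitch, dyaw, grip) in blocks of 2.
    demo = [(0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)] * 5
    print([len(block) for block in chunk_actions(demo, 2)])
```

In a full pipeline each pose channel would be resampled with the same target length so that simulated and real trajectories share one tensor shape before packaging; the final block is left shorter here rather than padded, which is a design choice of the sketch.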
Description
Automatic sorting method and system for aviation parts based on simulation and real teaching combined training
Technical Field
The invention belongs to the technical field of intelligent aviation manufacturing and robot operation, and in particular relates to an automatic aviation part sorting method and system based on combined simulation and real teaching training, which can be widely applied to aviation manufacturing scenarios such as aviation assembly, aviation material management and automated aviation warehousing.
Background
Manual sorting in aviation manufacturing suffers from high cost, low efficiency and large error, so automatic sorting has become an industry trend. Existing automatic sorting technology for aviation parts relies on traditional image recognition or simple feature matching. It can only perform basic part classification, cannot reason jointly over task semantics, visual information and operation history, lacks an end-to-end learning model, and cannot execute continuous actions such as grasping, moving and aligning. Intelligent operation technology based on vision-language-action (VLA) models does attempt to fuse multimodal information, but it cannot effectively combine simulation and real teaching data for model training, adapts poorly to illumination changes, partial occlusion and randomly placed parts, and has difficulty transferring simulated action strategies to a real robot system. As a result, sorting precision is insufficient, the process is unstable, and complex sorting tasks in aviation manufacturing cannot be supported.
Existing patents related to automatic sorting or VLA-based intelligent operation include a fusion perception and world model causal control VLA end-to-end system (patent number: CN 202511171514.8) of Guangzhou intelligent science and technology Co., ltd, and an intelligent sorting device before detection of aviation parts (application number: CN 202511156191.5) applied for by Shenyang aircraft industry Co., ltd.
The method disclosed by patent number CN 202511171514.8 comprises the following steps. S1, acquiring images, semantic information and sensor data of the environment through multimodal sensors and inputting them into an environment perception module to form a unified environment representation. S2, constructing an environment dynamics model with a world model training module and predicting future states under different conditions. S3, reasoning about the environment state with a causal modeling module to obtain possible behavior decisions. S4, generating a plurality of candidate trajectories through a trajectory generation module. S5, performing safety verification on the candidate trajectories with a formal verification module and selecting the optimal trajectory as the control output.
The method disclosed by application number CN 202511156191.5 comprises the following steps. S1, the parts are conveyed to a sorting station by a conveyor belt, and a camera collects part images. S2, the data acquisition module transmits the images to the identification unit, which classifies the parts using a conventional image processing and identification algorithm. S3, the control module sends an instruction to the sorting mechanism according to the identification result, controlling a push rod or diverting mechanism to push the parts to the corresponding positions. S4, the sorted parts enter a subsequent detection or assembly process, realizing basic automatic operation.
Among these prior patents, the fusion perception and world model causal control VLA end-to-end system only realizes environment perception and trajectory planning; it is not optimized for the scene characteristics of aviation part sorting and does not combine simulation and real data for training. The intelligent sorting device before detection of aviation parts adopts a traditional image processing algorithm, so it can only perform simple part diversion and lacks continuous action execution and the ability to adapt to complex scenes. Therefore, an automatic sorting method is needed that integrates simulation and real data, has multimodal end-to-end learning capability, and adapts to the complex sorting scenes of aviation parts.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an automatic sorting method and system for aviation parts based on combined simulation and real teaching training, which solve the problems of scarce teaching data, insufficient sorting precision, difficulty in executing complex tasks and difficulty in transferring simulation strategies to real environments, and realize automatic sorting of aviation parts with high precision, strong robustness and broad applicability. The invention provides an automatic sort