
CN-121000759-B - Tower crane remote control method based on multi-mode fusion and dynamic twinning

CN121000759B

Abstract

The invention provides a tower crane remote control method based on multi-modal fusion and dynamic twinning, belonging to the field of building construction. The method generates a virtual environment model through multi-modal sensor modeling, constructs a personalized interaction parameter set and a communication baseline, generates an execution-and-feedback package on that basis, and feeds it back to a digital twin. Combined with a multi-source real-time data set formed by condition-monitoring sensors, a physical model and federated learning are fused to perform dynamic risk prediction and generate a hierarchical intervention strategy, executing handling such as speed limiting, hovering, or stopping, and completing tamper-proof evidence preservation of data and instructions. The beneficial effects are that, through personalized multi-modal interaction, differentiated communication links, intelligent trajectory planning, forward-looking risk prediction, and tamper-proof evidence preservation, remote tower crane operation transitions from "passive protection" to "active prediction and intelligent intervention", improving overall operating efficiency and the level of safety assurance under complex working conditions.

Inventors

  • QIAO YUANLIANG
  • CHEN MINGCHAO
  • CHEN QIANZHONG
  • WEI SHUCHEN
  • LI KAI
  • LIU QIANG
  • LIANG BIN
  • LV QINGMIAO

Assignees

  • China Construction Eighth Engineering Bureau First Digital Technology Co., Ltd. (中建八局第一数字科技有限公司)

Dates

Publication Date
2026-05-12
Application Date
2025-09-12

Claims (3)

  1. A tower crane remote control method based on multi-modal fusion and dynamic twinning, characterized by comprising the following steps:
     S1, generating a virtual environment model through modeling with multi-modal sensors, and constructing a personalized multi-modal interaction parameter set and a communication baseline. Raw data are acquired through the multi-modal sensors; time synchronization and spatial calibration are performed on the raw data, followed by preliminary denoising and spatio-temporal alignment, to obtain a multi-modal data set. Multi-source fusion is performed on the multi-modal data set to generate a high-precision point cloud map and a semantic operation scene, which are synchronized into a digital twin to form the virtual environment model corresponding to the actual construction site. Calibration of the sensitivity, thresholds, and mapping relations of gestures and voice is completed in the virtual environment model and, combined with the operator's electroencephalogram (EEG) characteristics, forms the personalized multi-modal interaction parameter set. Based on the virtual environment model and the personalized multi-modal interaction parameter set, 5G, Wi-Fi, a data transmission radio, and a low-rate link are aggregated; combined with packet replication, forward error correction, and QoS prediction, mapping is completed per data-stream class to obtain the communication baseline supporting differentiated data streams, with local fast-loop control performed at the edge without depending on the public network.
     S2, generating an execution-and-feedback package based on the virtual environment model, the personalized multi-modal interaction parameter set, and the communication baseline. Based on the personalized multi-modal interaction parameter set, the following intent fusion and authority policy linkage are executed:
     (1) The raw gesture intensity, raw voice intent, and raw EEG attention/stability are normalized according to the gesture, voice, and EEG thresholds to obtain a normalized gesture intensity, voice confidence, and EEG attention/stability. Specifically, the raw gesture intensity is obtained by scalarizing hand keypoint displacement, grip strength, or arm acceleration; the raw voice intent is obtained from the recognition confidence or the acoustic energy ratio triggered by keywords; the raw EEG attention/stability is obtained by indexing power-spectrum/time-frequency features. During calibration, the lower and upper individual threshold bounds of each channel are recorded for the gesture, voice, and EEG channels respectively, and a normalization index is obtained accordingly [formula omitted], yielding the normalized gesture intensity, voice confidence, and EEG attention/stability.
     (2) Based on individual weights, an amplification coefficient for EEG enhancement of intent, and an environmental risk index, a fusion score J, a control amplitude M, and a trigger threshold are constructed for gating: when J falls below the trigger threshold, no control superposition is generated; otherwise the output M is superposed onto the reference trajectory to form a superposed control instruction set. Specifically, the individual weights represent the relative weights of gesture and voice for the operator; the amplification coefficient governs EEG enhancement of intent; the environmental risk index is obtained by integrating the constraint distances formed by the poses and trajectories of obstacles, forbidden zones, and height-limited zones in the semantic scene of S12. A compact fusion score J and a control amplitude M are constructed, where M is a saturation mapping involving a risk suppression coefficient and an output shape factor that determine response speed and saturation degree [formulas omitted]. When J falls below the trigger threshold, no control superposition is generated, achieving false-trigger rejection; otherwise the output M is superposed onto the reference trajectory to form the superposed control instruction set.
     (3) An authority score A is generated and a personalized threshold ζ is introduced to map the continuous value of A to discrete authority levels. Specifically, an operational fatigue/load index δ is defined, derived from physiological signals or operation duration, with a value range of [0, 1]. The authority score A is defined using policy weights [formula omitted]. Since A is a continuous score, usually ranging between 0 and 1, its continuous value alone is not sufficient to directly determine the operation authority level, so the personalized threshold ζ maps it to discrete levels: when A is at or above ζ, the system judges the operator's state to be good and allows high-authority operation, which comprises raising the upper limit of M, accelerating response, and executing obstacle-approach operations; when A is below ζ, the system automatically degrades to a low-authority mode, which comprises limiting the upper limit of M and executing speed limiting, secondary confirmation, or suspension. The above parameters are interactively calibrated in the virtual environment model and persisted in the personalized multi-modal interaction parameter set, wherein ε, δ, and γ can be updated in the digital twin via a sliding window over time during task execution, forming online self-adaptation.
     S3, based on the execution-and-feedback package and a multi-source real-time data set formed by acquisition and preprocessing from condition-monitoring sensors, performing dynamic risk prediction by combining a physical model and a federated learning model, generating a hierarchical intervention strategy, executing controlled handling, and simultaneously performing tamper-proof preservation of data and instructions. Specifically, real-time state data related to the tower crane are acquired by the condition-monitoring sensors, time-aligned, and preprocessed to form the multi-source real-time data set for dynamic risk prediction. Based on this data set, when a structural abnormality or load risk is predicted, a virtual preview is performed in the digital twin and a corresponding hierarchical intervention strategy is derived; controlled handling of the tower crane is performed according to the hierarchical intervention strategy. Meanwhile, hashes are periodically generated over the multi-source sensing data and control instructions and stored in a tamper-proof storage module, and index binding with video and raw sensor data is established to support post-event evidence retrieval.
  2. The tower crane remote control method based on multi-modal fusion and dynamic twinning according to claim 1, wherein step S2 comprises the following steps: S21, based on the virtual environment model and combined with the personalized multi-modal interaction parameter set, completing instruction structuring to obtain a structured task description package; S22, during operation, continuously collecting and updating the real-time working-condition data of the multi-modal sensors of step S11 to the digital twin and aligning them to the virtual environment model, forming a current working-condition situation comprising dynamic obstacles and a safety constraint set; S23, after the digital twin receives the structured task description package and the current working-condition situation, executing constrained path planning and obstacle-avoidance computation based on the point cloud map and the semantic operation scene, and applying dynamic constraint correction combined with a load-swing model and wind-load influence to generate a reference trajectory package; S24, performing high-speed closed-loop tracking of the reference trajectory package through the tower crane edge control node, with swing suppression and emergency-stop protection; parsing the operator's gestures in combination with the personalized multi-modal interaction parameter set to generate a superposed control instruction set; completing multi-link transmission through the communication baseline; rendering a three-dimensional visual scene at the operator-end interface; and generating an execution-and-feedback package fed back to the digital twin.
  3. The tower crane remote control method based on multi-modal fusion and dynamic twinning according to claim 2, wherein step S23 comprises the following steps: S231, receiving the reference trajectory package at the tower crane edge control node, tracking the time-parameterized reference trajectory in the reference trajectory package in local high-speed closed-loop control to obtain trajectory tracking state information, while executing swing suppression and emergency-stop protection; S232, during execution, parsing the operator's gesture input at the tower crane edge control node based on the trajectory tracking state information combined with the personalized multi-modal interaction parameter set, and generating corresponding fine-adjustment quantities; S233, based on the communication baseline, mapping the control instruction set, path instructions, real-time video stream, and telemetry data to their corresponding link channels respectively, completing multi-link aggregation, packet replication, and error-correction processing to form a real-time data stream set after multi-link transmission; S234, rendering the three-dimensional visual scene at the operator-end interface based on the real-time data stream set, providing safety prompts related to the running state, and finally generating the execution-and-feedback package.
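The claims describe min-max normalization against calibrated per-channel thresholds, a gated fusion score with a saturating control amplitude, and a threshold mapping of the continuous authority score, but the printed formulas are omitted from this text. A minimal Python sketch of one plausible instantiation follows; the min-max normalization form, the weighted fusion expression, the tanh saturation, and all parameter names and default values (w_g, w_v, alpha, lam, k, theta, beta, zeta) are assumptions for illustration, not the patent's actual formulas:

```python
import math

def normalize(x, lo, hi):
    """Min-max normalize a raw channel value against its calibrated
    lower/upper threshold bounds, clipped to [0, 1] (assumed form)."""
    if hi <= lo:
        return 0.0
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def fusion_and_gate(g, v, e, risk, w_g=0.6, w_v=0.4, alpha=0.5,
                    lam=1.0, m_max=1.0, k=2.0, theta=0.3):
    """One plausible fusion score J, saturating amplitude M, and gate.
    g, v, e: normalized gesture, voice, and EEG values in [0, 1];
    risk: environmental risk index; lam: risk-suppression coefficient;
    k: output shape factor; theta: trigger threshold (all assumed)."""
    j = (w_g * g + w_v * v) * (1.0 + alpha * e) * math.exp(-lam * risk)
    m = m_max * math.tanh(k * j)          # saturation mapping
    return (m if j >= theta else 0.0), j  # below threshold: no superposition

def authority_level(g, v, e, fatigue, beta=(0.3, 0.2, 0.3, 0.2), zeta=0.5):
    """Continuous authority score A in [0, 1] mapped to a discrete level
    via the personalized threshold zeta (policy weights beta assumed)."""
    a = beta[0] * g + beta[1] * v + beta[2] * e + beta[3] * (1.0 - fatigue)
    return ("high", a) if a >= zeta else ("low", a)

g = normalize(0.8, 0.2, 1.0)   # raw gesture intensity vs calibrated bounds
m, j = fusion_and_gate(g, 0.7, 0.6, risk=0.1)
level, a = authority_level(g, 0.7, 0.6, fatigue=0.2)
```

The gate realizes the claimed false-trigger rejection: a weak or high-risk intent yields J below theta and no superposed control output.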

Description

Tower crane remote control method based on multi-mode fusion and dynamic twinning

Technical Field

The invention relates to the field of building construction, in particular to a tower crane remote control method based on multi-modal fusion and dynamic twinning.

Background

In modern building construction, tower crane remote control technology has developed but still faces several bottlenecks. On the communication side, construction sites are complex: large numbers of metal structures and frequent electromagnetic activity interfere with communication signals, so that with a single communication link (such as 5G) the interruption rate can reach 12% or more, seriously affecting reliable transmission of remote control instructions to the tower crane. At the operator-interaction layer, traditional control based on a 2D interface and video feedback cannot give operators intuitive spatial perception, response delay exceeds 500 ms in emergencies, and the requirements of efficient, safe operation are difficult to meet. On the safety side, existing threshold-alarm mechanisms can only respond passively after a risk has occurred and cannot anticipate dynamic risks such as load swing and structural fatigue of the tower crane.

Summary of the Invention

The invention aims to provide a tower crane remote control method based on multi-modal fusion and dynamic twinning that improves operating efficiency and the level of safety assurance under complex working conditions.
The invention is realized by the following measures. A tower crane remote control method based on multi-modal fusion and dynamic twinning comprises the following steps: S1, modeling with multi-modal sensors to generate a virtual environment model and constructing a personalized multi-modal interaction parameter set and a communication baseline, providing a stable, low-latency guarantee for subsequent tower crane task planning and remote control; the modeling is carried out before tower crane operation, and the multi-modal sensors comprise a fixed LiDAR, a vision sensor, a millimeter-wave radar, an inertial measurement unit at the hook end, and a short-range ranging sensor. S2, generating an execution-and-feedback package based on the virtual environment model, the personalized multi-modal interaction parameter set, and the communication baseline. S3, based on the execution-and-feedback package, acquiring and preprocessing a multi-source real-time data set through condition-monitoring sensors, performing dynamic risk prediction by combining a physical model and a federated learning model, generating a hierarchical intervention strategy, executing controlled handling, and simultaneously performing tamper-proof evidence preservation of data and instructions.
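Step S3's periodic hashing of multi-source sensing data and control instructions for tamper-proof evidence preservation can be sketched as a simple hash chain; the following minimal Python sketch assumes SHA-256 and a JSON record layout (both assumptions; the patent specifies neither the hash algorithm nor the record format):

```python
import hashlib
import json
import time

def make_record(payload, prev_hash):
    """Build one evidence record whose hash covers the payload, a
    timestamp, and the previous record's hash, so altering any stored
    datum or instruction invalidates every later hash in the chain."""
    body = {"ts": time.time(), "payload": payload, "prev": prev_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify_chain(chain):
    """Recompute each record's hash and check the prev-links; returns
    True only if no record has been tampered with."""
    prev = "0" * 64
    for rec in chain:
        body = {"ts": rec["ts"], "payload": rec["payload"], "prev": rec["prev"]}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev"] != prev or rec["hash"] != recomputed:
            return False
        prev = rec["hash"]
    return True

# Periodically append sensing data and control instructions as records.
chain = []
prev = "0" * 64
for payload in ({"cmd": "hover"}, {"sensor": "wind", "v": 9.8}):
    rec = make_record(payload, prev)
    chain.append(rec)
    prev = rec["hash"]
```

In practice each record would additionally carry the index bindings to video and raw sensor data that the claim describes for post-event evidence retrieval.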
Step S1 comprises the following steps: S11, acquiring raw data through the multi-modal sensors, performing time synchronization and spatial calibration on the raw data, and performing preliminary denoising and spatio-temporal alignment to obtain a multi-modal data set comprising a unified spatio-temporal sequence of laser points, image features, radar targets, IMU poses, and short-range ranging data. S12, performing multi-source fusion on the multi-modal data set to generate a high-precision point cloud map and a semantic operation scene, and synchronizing the point cloud map and the semantic operation scene into the digital twin to form a virtual environment model corresponding to the actual construction site; the semantic operation scene adds semantic labels such as obstacles, forbidden zones, pedestrian passages, height-limited zones, the tower crane body, and storage areas onto the point cloud map, and these labels support subsequent path planning, safety-envelope generation, and risk prediction. S13, completing the calibration of the sensitivity, thresholds, and mapping relations of gestures and voice in the virtual environment model and, combined with the operator's EEG characteristics, forming the personalized multi-modal interaction parameter set, which comprises personalized parameters and authority policies for subsequent task input and control superposition. S14, based on the virtual environment model and the personalized multi-modal interaction parameter set, aggregating 5G, Wi-Fi, a data transmission radio, and a low-rate link; combining packet replication, forward error correction, and QoS prediction; and completing mapping per data-stream class to obtain a communication baseline supporting differentiated data streams. Specifically, local fast-loop control stays at the edge and does not depend on the public network, while a remote supervision instruction travels over a reinforced
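Step S14's per-class mapping of data streams onto the aggregated links, with packet replication for critical control traffic, can be sketched as follows. The link names, the class-to-link table, and the replicate-only-control policy are illustrative assumptions; the patent names the four link types but does not publish a mapping table:

```python
# Assumed stream classes and link pools; the patent names 5G, Wi-Fi,
# a data transmission radio, and a low-rate link but gives no table.
POLICY = {
    "control":   {"links": ["5g", "wifi"], "replicate": True},   # duplicated on two links
    "path":      {"links": ["5g"],         "replicate": False},
    "video":     {"links": ["wifi"],       "replicate": False},
    "telemetry": {"links": ["radio", "low_rate"], "replicate": False},
}

def map_stream(stream_class, packet):
    """Return the (link, packet) sends for one packet of the given
    class: replicated classes are copied onto every listed link; the
    others take the first listed link (link-health checks, forward
    error correction, and QoS prediction are omitted in this sketch)."""
    rule = POLICY[stream_class]
    if rule["replicate"]:
        return [(link, packet) for link in rule["links"]]
    return [(rule["links"][0], packet)]

sends = map_stream("control", b"\x01hover")
```

Replicating only the control class reflects the claimed differentiation: instruction loss is costly, while video and telemetry tolerate single-link delivery.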