
CN-122027691-A - End-cloud cooperation method based on a semantic communication protocol, and intelligent operation command system for an underwater autonomous agent

CN 122027691 A

Abstract

The invention discloses an end-cloud cooperation method based on a semantic communication protocol, and an underwater intelligent operation command system. A vision-language-action cloud command center infers a task instruction, converts it into an action primitive sequence, and sends the sequence to an underwater intelligent terminal. The terminal receives the sequence and maps and invokes the corresponding control strategies from a local atomic skill library to drive action execution. During execution, perception data are acquired in real time, and different streaming logics are triggered according to whether the data confidence meets an effective threshold. Risk indices are monitored throughout; when a risk index exceeds a safety-critical threshold, an interrupt mechanism is triggered, the current task is suspended, and emergency skills are autonomously invoked for evasion. The invention offers high system generality and an efficient layered architecture, works around the physical bandwidth limit of the underwater channel, and achieves low-bandwidth, high-frequency intelligent interaction.

Inventors

  • HE ZHIGUO
  • TIAN JIACHENG
  • LIU YUZHOU
  • SONG MEIYU
  • MA HONGKUAN
  • ZHANG JUNBO
  • HAN DONGRUI

Assignees

  • Zhejiang University (浙江大学)

Dates

Publication Date
2026-05-12
Application Date
2026-04-13

Claims (10)

  1. An end-cloud cooperation method based on a semantic communication protocol, characterized by comprising the following steps: step S1, a vision-language-action cloud command center infers an input task instruction, converts it into an action primitive sequence, and transmits the sequence to an underwater intelligent terminal through a communication link; step S2, the underwater intelligent terminal receives the action primitive sequence and maps and invokes the corresponding control strategy from a local atomic skill library to drive action execution; step S3, perception data are collected in real time during action execution, and different streaming logics are triggered according to whether the data confidence meets an effective threshold: if the confidence is below the effective threshold, auxiliary actions are invoked to actively optimize observation; otherwise, environment semantic features are extracted and reported to the cloud to close the loop; and step S4, risk indices are monitored at all times during steps S1 to S3; when a risk index exceeds a safety-critical threshold, an interrupt mechanism is triggered, the current task is suspended, and emergency skills are autonomously invoked for evasion.
  2. The end-cloud cooperation method based on the semantic communication protocol according to claim 1, wherein step S1 comprises: step S1.1, the vision-language-action cloud command center receives an unstructured natural-language instruction input by an operator; step S1.2, a multi-modal intention reasoning module analyzes the instruction against a digital-twin environment and a large-model knowledge base to determine the task target, objects of interest, and operation constraints; step S1.3, an action primitive generation module maps the macroscopic operation intention, using a prompt-engineering strategy, into a standardized action primitive sequence executable by the underwater intelligent terminal, the sequence consisting of instruction items that each contain a primitive code, a target index, and skill parameters; and step S1.4, the generated sequence is encoded in the downlink primitive-command frame format and transmitted to the underwater intelligent terminal over the bidirectional semantic communication link.
  3. The end-cloud cooperation method based on the semantic communication protocol according to claim 1, wherein step S2 comprises: step S2.1, the underwater intelligent terminal receives and parses the downlink instruction frame and extracts the action primitive code and skill parameters; step S2.2, the agent decision module indexes the corresponding control strategy function in the local atomic skill library module by the action primitive code; and step S2.3, the bottom-layer motion control component loads the selected control strategy function and its parameters and drives the actuators to perform the corresponding physical action, so as to track or operate on the task target.
  4. The end-cloud cooperation method based on the semantic communication protocol according to claim 1, wherein step S3 comprises: step S3.1, during task execution, the active perception module collects environment data in real time and performs feature extraction; step S3.2, the system evaluates the confidence of the current perception result in real time; if the confidence is below a preset effective threshold, the environment information is judged insufficient and the method proceeds to step S3.3; otherwise it proceeds to step S3.4; step S3.3, the agent decision module autonomously suspends the current data-reporting flow and triggers a gain action, i.e. invokes an auxiliary atomic skill that executes an information-gain action to improve observation conditions, then returns to step S3.2; and step S3.4, the active perception module extracts the target's structured semantic features, comprising the target ID, category label, confidence, state label, and relative coordinates, encodes them into an uplink environment semantic report frame, and sends the frame to the cloud, completing one perception closed loop.
  5. The end-cloud cooperation method based on the semantic communication protocol according to claim 1, wherein step S4 comprises: at any moment during steps S1 to S3, if the active perception module or the internal state monitoring module detects a sudden high-risk event and the risk index exceeds the safety-critical threshold, the agent decision module triggers high-priority interrupt logic, immediately terminates or suspends the conventional atomic skill currently being executed, and directly invokes a high-dynamic motion skill or emergency-handling skill preset in the atomic skill library; after the risk is relieved or the state stabilizes, the underwater intelligent operation command system resumes conventional task execution according to a preset strategy, or generates an event state report and uploads it to the cloud to await new instructions.
  6. An end-cloud cooperative underwater intelligent operation command system based on a semantic communication protocol, for implementing the end-cloud cooperation method according to any one of claims 1-5, characterized by comprising an underwater intelligent terminal, a vision-language-action cloud command center, and a bidirectional semantic communication protocol link.
  7. The end-cloud cooperative underwater intelligent operation command system based on the semantic communication protocol according to claim 6, wherein the underwater intelligent terminal is integrated in the underwater robot body and comprises an atomic skill library module, an active perception module, and an agent decision module; the atomic skill library module stores atomic skills that are locally preset or learned online; after the underwater intelligent terminal receives action primitives issued by the vision-language-action cloud command center, a task scheduler dynamically maps them to combinations of local atomic skills, which are invoked and executed by the bottom-layer motion controller; the active perception module evaluates the semantic confidence of perceived data in real time; when target features are ambiguous, i.e. the confidence is below a preset threshold, the system temporarily withholds the current report, autonomously selects information-gain actions from the atomic skill library module until features meeting the confidence threshold are acquired, and only then encodes the features into semantic packets for reporting, preventing invalid data from occupying bandwidth; the agent decision module runs a local decision model, receives cloud action primitives, and invokes the corresponding atomic skills; it is internally provided with high-priority obstacle-avoidance logic, and when a collision risk is perceived it immediately interrupts the conventional task and autonomously invokes the emergency obstacle-avoidance atomic skills, achieving fully autonomous emergency evasion.
  8. The end-cloud cooperative underwater intelligent operation command system based on the semantic communication protocol according to claim 6, wherein the vision-language-action cloud command center comprises a vision-language-action large model, on which a task intention understanding and macroscopic planning module, a multi-modal perception semantic extraction module, and a collaborative decision and semantic instruction generation module are built; the task intention understanding and macroscopic planning module parses natural-language or graphical task instructions issued by an operator and generates a global semantic task sequence by combining geographic information and historical data; the multi-modal perception semantic extraction module receives and parses the compressed perception features uploaded by the underwater intelligent terminal, performs fusion analysis through a large-scale vision-language model, and extracts a high-level semantic description of the scene; the collaborative decision and semantic instruction generation module fuses task planning with scene semantics, makes decisions, packages the decision results into semantic instruction data packets conforming to a preset format, and issues them to the underwater intelligent terminal.
  9. The end-cloud cooperative underwater intelligent operation command system based on the semantic communication protocol according to claim 6, wherein the bidirectional semantic communication protocol link adopts a compact object-attribute-state coding format to suit the low-bandwidth, high-latency, high-error-rate physical characteristics of underwater acoustic communication; the protocol uses a variable-length binary frame structure in which the basic frame is formed by sequentially concatenating a fixed-length frame header, a frame type, a data length, a variable-length semantic payload, and check bits, wherein the frame header identifies the start of a data packet within the continuous bit stream to achieve frame synchronization, the frame type identifies the functional attribute of the current packet, the data length indicates the byte count of the following semantic payload, resolving the parsing-boundary problem of variable-length data in streaming, and the check bits detect bit errors that may occur during transmission to ensure data integrity.
  10. The end-cloud cooperative underwater intelligent operation command system based on the semantic communication protocol according to claim 9, wherein, for the uplink, the semantic payload carries the environment semantics refined by the active perception module in the underwater intelligent terminal, its internal structure being defined as the sequential concatenation of a target ID, a category label, a confidence, a state label, and relative coordinates, wherein the target ID is a unique tracking identifier assigned by the edge to the currently observed target, the category label corresponds to a predefined object-category enumeration table, the confidence quantifies the perception model's certainty in the recognition result, the state label distinguishes the fine-grained state of the target, and the relative coordinates represent the target's spatial distance from the robot body along the X, Y, and Z axes; for the downlink, the semantic payload carries the control instruction produced by inference of the vision-language-action large model in the cloud command center, its internal structure being strictly defined as the sequential concatenation of a global task sequence number, an action primitive code, a target ID, skill parameters, and an execution priority, wherein the global task sequence number identifies the temporal-logic ID of the instruction and prevents out-of-order execution caused by underwater acoustic multipath effects, the action primitive code corresponds to an index into the edge atomic skill library, the target ID designates the object the action acts on, the skill parameters are a variable-length field whose length depends on the specific atomic skill being invoked, and the execution priority indicates the preemption level of the instruction so as to trigger the high-priority interrupt logic of the underwater intelligent terminal.
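The method of claims 1-5 can be pictured as a supervised execution loop on the terminal side. The following Python sketch is purely illustrative: the thresholds, primitive names, and callback signatures are assumptions, not identifiers from the patent.

```python
# Illustrative sketch of the S1-S4 end-cloud cooperation loop.
# All thresholds, skill names, and callbacks are hypothetical.

EFFECTIVE_THRESHOLD = 0.8   # S3 perception confidence gate (assumed value)
SAFETY_THRESHOLD = 0.9      # S4 risk-index gate (assumed value)

def cooperate(primitive_sequence, skills, sense, risk, max_gain=5):
    """Execute a cloud-issued primitive sequence under S3/S4 supervision."""
    reports = []
    for code, params in primitive_sequence:      # S1: cloud-issued primitives
        if risk() > SAFETY_THRESHOLD:            # S4: interrupt mechanism
            skills["EMERGENCY_EVADE"](None)      # suspend task, evade
            break
        skills[code](params)                     # S2: mapped atomic skill
        confidence, features = sense()
        for _ in range(max_gain):                # S3: gain-action loop
            if confidence >= EFFECTIVE_THRESHOLD:
                break
            skills["ADJUST_VIEW"](None)          # optimize observation
            confidence, features = sense()
        reports.append(features)                 # S3: semantic report uplink
    return reports

# Example: confident perception, low risk
skills = {"APPROACH": lambda p: None, "ADJUST_VIEW": lambda p: None,
          "EMERGENCY_EVADE": lambda p: None}
out = cooperate([("APPROACH", 7)], skills,
                sense=lambda: (0.95, {"target": 7}), risk=lambda: 0.1)
print(out)   # [{'target': 7}]
```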
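Step S2 of claim 3 is essentially a table-driven dispatch from primitive code to control strategy. A minimal sketch, with primitive codes and strategy functions invented for illustration:

```python
# Hypothetical primitive codes mapped to local control strategies (S2.2).
ATOMIC_SKILLS = {
    0x01: lambda target, param: f"tracking target {target} at speed {param}",
    0x02: lambda target, param: f"grasping target {target} with force {param}",
}

def execute_primitive(code, target_id, skill_param):
    """S2.1: fields already parsed from the downlink frame.
    S2.2: index the strategy by primitive code; S2.3: drive the actuator."""
    try:
        strategy = ATOMIC_SKILLS[code]
    except KeyError:
        raise ValueError(f"unknown primitive code {code:#04x}")
    return strategy(target_id, skill_param)

print(execute_primitive(0x01, 7, 0.5))  # tracking target 7 at speed 0.5
```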
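The confidence-gated streaming logic of claim 4 (steps S3.1-S3.4) can be sketched as a bounded retry loop; the threshold value and feature field names are hypothetical:

```python
# Sketch of the S3 loop: withhold the report and take information-gain
# actions until perception is confident. Threshold/fields are assumptions.

EFFECTIVE_THRESHOLD = 0.8

def perception_cycle(sense, gain_action, max_attempts=5):
    """Return a structured semantic report, or None if never confident."""
    for _ in range(max_attempts):
        confidence, features = sense()          # S3.1 / S3.2
        if confidence >= EFFECTIVE_THRESHOLD:   # S3.4: encode and report
            return {"target_id": features["id"],
                    "class": features["class"],
                    "confidence": confidence,
                    "state": features["state"],
                    "coords": features["coords"]}
        gain_action()                           # S3.3: optimize observation
    return None                                 # nothing worth reporting

# Example: the second observation clears the threshold
readings = iter([(0.55, None),
                 (0.92, {"id": 3, "class": "pipeline", "state": "corroded",
                         "coords": (1.2, 0.4, -3.0)})])
report = perception_cycle(lambda: next(readings), gain_action=lambda: None)
print(report["confidence"])   # 0.92
```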
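The interrupt mechanism of claim 5 can be sketched as a small supervisor that preempts the running skill and later resumes it or awaits new instructions; the threshold and skill names are assumptions:

```python
# Sketch of S4 high-priority interrupt logic. Threshold/skills hypothetical.

SAFETY_THRESHOLD = 0.7

class TaskSupervisor:
    def __init__(self):
        self.suspended_task = None

    def step(self, current_task, risk_index):
        """Preempt the regular task when a sudden high-risk event appears."""
        if risk_index > SAFETY_THRESHOLD:       # risk exceeds critical gate
            self.suspended_task = current_task  # suspend the regular skill
            return "EMERGENCY_EVADE"            # call emergency atomic skill
        return current_task

    def resume(self):
        """After the risk clears: resume, or report and await instructions."""
        task, self.suspended_task = self.suspended_task, None
        return task or "AWAIT_NEW_INSTRUCTION"

sup = TaskSupervisor()
print(sup.step("TRACK_PIPELINE", risk_index=0.9))  # EMERGENCY_EVADE
print(sup.resume())                                # TRACK_PIPELINE
```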
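The frame layout of claim 9 (header | type | length | payload | check bits) can be exercised with a toy codec. The sync word, field widths, and XOR check byte below are illustrative choices; the patent fixes only the field order:

```python
import struct

# Toy codec for a claim-9-style frame: fixed header, frame type,
# data length, variable-length payload, check byte. All widths assumed.

FRAME_HEADER = b"\xAA\x55"          # sync word for frame synchronization

def checksum(data: bytes) -> int:
    c = 0
    for b in data:
        c ^= b                       # simple XOR check byte (illustrative)
    return c

def encode_frame(frame_type: int, payload: bytes) -> bytes:
    body = struct.pack(">BH", frame_type, len(payload)) + payload
    return FRAME_HEADER + body + bytes([checksum(body)])

def decode_frame(raw: bytes):
    sync = raw.index(FRAME_HEADER)               # locate frame start
    frame_type, length = struct.unpack_from(">BH", raw, sync + 2)
    start = sync + 5                             # header(2) + type(1) + len(2)
    payload = raw[start:start + length]          # length field resolves the
    check = raw[start + length]                  # variable-length boundary
    if checksum(raw[sync + 2:start + length]) != check:
        raise ValueError("checksum mismatch: corrupted frame")
    return frame_type, payload

ft, pl = decode_frame(encode_frame(0x01, b"\x07\x02"))
print(ft, pl)   # 1 b'\x07\x02'
```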
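The uplink and downlink payload layouts of claim 10 can likewise be packed with fixed field widths. The widths and the confidence scaling below are assumptions; the patent specifies only the field order and that the skill-parameter field is variable-length:

```python
import struct

# Illustrative packing of claim-10 semantic payloads. Field widths and
# the 8-bit confidence scaling are assumed, not taken from the patent.

def pack_uplink(target_id, class_id, confidence, state, x, y, z):
    # target ID | category label | confidence | state label | X, Y, Z coords
    return struct.pack(">HBBBfff", target_id, class_id,
                       int(confidence * 255), state, x, y, z)

def pack_downlink(seq, primitive, target_id, params: bytes, priority):
    # task sequence number | primitive code | target ID |
    # variable-length skill parameters | execution priority
    return (struct.pack(">HBH", seq, primitive, target_id)
            + params + bytes([priority]))

up = pack_uplink(7, 2, 0.92, 1, 1.2, 0.4, -3.0)
print(len(up))   # 17 bytes: compact enough for kbps-class acoustic links
```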

Description

End-cloud cooperation method based on a semantic communication protocol, and intelligent operation command system for an underwater autonomous agent

Technical Field

The invention relates to the field of underwater intelligent equipment, and in particular to an end-cloud cooperation method based on a semantic communication protocol and an underwater intelligent operation command system.

Background

Demand is growing for tasks such as inspection of deep-sea oil and gas pipelines, inspection of cross-sea bridge foundations, and monitoring of the submarine ecological environment. The underwater robot is the core equipment for executing these tasks, and its level of intelligence directly determines working efficiency and success rate. However, the current intelligent development of underwater robots faces severe asymmetric resource constraints and technical bottlenecks.

First, the physical characteristics of the underwater environment make electromagnetic-wave propagation difficult, so communication relies mainly on underwater acoustics. The bandwidth of existing commercial underwater acoustic communication is extremely low, usually only a few kbps, accompanied by high latency, severe multipath effects, and high packet-loss rates. The "cloud brain plus broadband video feedback" mode commonly used by land robots is completely disabled underwater: an operator cannot obtain real-time visual feedback, making accurate remote operation difficult.

Second, owing to the size of the underwater pressure-resistant cabin, the energy constraints of battery power, and the difficulty of heat dissipation in a sealed space, the underwater terminal can hardly carry high-performance computing hardware such as high-power GPUs. This means that the most advanced vision-language-action (VLA) multimodal large models currently cannot run directly on an underwater terminal.
Most existing underwater robots can only execute simple pre-programmed instructions and lack both the ability to understand unstructured environments and the ability to respond autonomously to sudden events. In summary, the prior art cannot achieve a high level of intelligence under limited communication bandwidth and limited terminal computing power. A general intelligent command architecture is therefore needed, one that scientifically allocates cloud computing power, decouples perception from decision-making, and adapts to the very-low-bandwidth underwater acoustic environment.

Disclosure of Invention

The invention aims to provide an end-cloud cooperation method based on a semantic communication protocol and an underwater intelligent operation command system, so as to solve the problems in the background art. To achieve the above purpose, the present invention provides the following technical solutions.

An end-cloud cooperation method based on a semantic communication protocol comprises: step S1, a vision-language-action cloud command center infers an input task instruction, converts it into an action primitive sequence, and transmits the sequence to an underwater intelligent terminal through a communication link; step S2, the underwater intelligent terminal receives the action primitive sequence and maps and invokes the corresponding control strategy from a local atomic skill library to drive action execution; step S3, perception data are collected in real time during action execution, and different streaming logics are triggered according to whether the data confidence meets an effective threshold: if the confidence is below the effective threshold, auxiliary actions are invoked to actively optimize observation; otherwise, environment semantic features are extracted and reported to the cloud to close the loop; and step S4, risk indices are monitored at all times during steps S1 to S3; when a risk index exceeds a safety-critical threshold, an interrupt mechanism is triggered, the current task is suspended, and emergency skills are autonomously invoked for evasion.

Further, step S1 comprises: step S1.1, the vision-language-action cloud command center receives an unstructured natural-language instruction input by an operator; step S1.2, a multi-modal intention reasoning module analyzes the instruction against a digital-twin environment and a large-model knowledge base to determine the task target, objects of interest, and operation constraints; step S1.3, an action primitive generation module maps the macroscopic operation intention, using a prompt-engineering strategy, into a standardized action primitive sequence executable by the underwater intelligent terminal, wherein the standardi