CN-122018675-A - Dialogue communication method and system based on non-invasive brain-computer interface and multimodal AI large model

CN 122018675 A

Abstract

The invention provides a dialogue communication method and system based on a non-invasive brain-computer interface and a multimodal AI large model. The method comprises at least the following steps: S1, the user wears and activates a non-invasive brain-computer interface device, which continuously collects the user's EEG signals and establishes and maintains a wireless communication connection with a terminal device; S2, the terminal device launches an application program and displays to the user a function selection interface containing at least one function option; S3, the user's intention to select the "dialogue communication scene" function option in the function selection interface is identified by decoding the EEG signals; S4, a dialogue communication scene sub-module is entered, and the terminal device presents to the user an interaction-mode selection interface offering at least two modes; S5, the processing flow of the corresponding interaction mode is entered according to the user selection identified by decoding the EEG signals. The invention thereby provides an efficient, reliable, flexible, and user-friendly dialogue communication method and system.

Inventors

  • HE YONGZHENG
  • WANG DONGDONG
  • WU SHUTING
  • CAO JIAYI

Assignees

  • 陕西捷创睿智能科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-12-24

Claims (10)

  1. A dialogue communication method based on a non-invasive brain-computer interface and a multimodal AI large model, comprising at least the following steps: S1, a user wears and activates a non-invasive brain-computer interface device (1), which continuously collects the user's EEG signals and establishes and maintains a wireless communication connection with a terminal device (2); S2, the terminal device (2) launches an application program and displays to the user a function selection interface containing at least one function option; S3, the user's intention to select the "dialogue communication scene" function option in the function selection interface is identified by decoding the EEG signals; S4, a dialogue communication scene sub-module is entered, and the terminal device (2) presents to the user an interaction-mode selection interface comprising at least the two modes of voice interaction and EEG text input; S5, the processing flow of the corresponding interaction mode is entered according to the user selection identified by decoding the EEG signals.
  2. The method of claim 1, wherein when the user selects the "voice interaction" mode, the following steps are performed: S6, external voice information is collected through a sound pickup device; S7, the external voice information, or text converted from it, is sent to an AI large model processing unit, which performs semantic understanding and generates at least one candidate answer text; S8, the terminal device (2) displays a visual stimulus interface containing the at least one candidate answer text; S9, the user gazes at the visual stimulus area corresponding to the target candidate answer text, and the brain-computer interface device (1) collects the corresponding EEG signals; S10, the terminal device (2) identifies the user's selection of the target candidate answer text from the EEG signals; S11, the selected candidate answer text is sent to a voice generation module; S12, the voice generation module converts the text into a voice signal and plays it through an audio output device.
  3. The method of claim 1, wherein when the user selects the "EEG text input" mode, the following steps are performed: S13, the terminal device (2) displays a character input interface comprising a plurality of character selection areas encoded by specific visual stimuli; S14, the user inputs by sequentially gazing at the areas corresponding to the target characters, and the system recognizes the character sequence input by the user by continuously decoding the EEG signals; S15, the text sequence input by the user is sent to the AI large model processing unit for semantic optimization to generate an optimized text; S16, the optimized text is sent to a voice generation module; S17, the voice generation module converts the text into a voice signal and plays it through an audio output device.
  4. The method of any one of claims 1 to 3, wherein the visual stimulus encoding in each interface is based on the steady-state visual evoked potential paradigm, each selectable option or character area is assigned a unique visual flicker frequency, and the visual stimulus of an option or area is switched from the periodic flicker mode to a static visual confirmation feedback display once the system's decoding confirms the user's gaze selection of that area.
  5. The method according to claim 3, wherein the EEG recognition process is implemented based on an event-related potential paradigm: the interaction-mode selection interface and the character input interface are configured as a stimulation matrix, the system highlights rows or columns of the matrix as rare stimuli in a random sequence, and the user's selection intent is decoded by detecting event-related potential components in the EEG that are phase-locked to the rare stimuli.
  6. The method of claim 3, wherein the character input interface is a nine-key virtual keyboard interface, each number-key area corresponds to a group of letters or characters and is encoded by an independent visual stimulus, and the input process comprises confirming that the user is gazing at the target number-key area and then performing a secondary selection within the character group corresponding to that number key to determine the specific input character.
  7. The dialogue communication method based on a non-invasive brain-computer interface and a multimodal AI large model of claim 6, wherein the nine-key virtual keyboard interface can be replaced with a twenty-six-key virtual keyboard interface or a handwriting-input virtual keyboard interface.
  8. A dialogue communication system based on a non-invasive brain-computer interface and a multimodal AI large model, for implementing the dialogue communication method based on a non-invasive brain-computer interface and a multimodal AI large model according to any one of claims 1 to 7, characterized in that the system at least comprises: a portable non-invasive brain-computer interface device (1) for acquiring and preprocessing the user's EEG signals; a terminal device (2), wirelessly connected to the brain-computer interface device (1), for presenting the visual stimulus interface, performing real-time intention decoding of the EEG signals, managing the interaction flow, and processing audio input and output; and an AI large model processing unit, communicatively connected to the terminal device (2), for executing speech recognition, semantic understanding, natural language generation, and text optimization tasks.
  9. The dialogue communication system based on a non-invasive brain-computer interface and a multimodal AI large model according to claim 8, wherein the system integrates a signal preprocessing module, an EEG recognition control module, an interaction module, an AI large model processing module, and a voice generation module; wherein the signal preprocessing module is integrated in the brain-computer interface device (1) and is used for filtering and noise reduction of the acquired raw EEG signals; the EEG recognition control module and the interaction module are integrated in the terminal device (2); the EEG recognition control module is used for feature extraction and intention decoding of the preprocessed EEG signals to generate control instructions; the interaction module is used for responding to the control instructions and for providing and managing a visual interaction interface based on a preset stimulus-response paradigm, the visual interaction interface being used at least to realize a two-way dialogue communication function; the AI large model processing module, built around the AI large model processing unit, is deployed in the cloud or locally on the terminal device (2), is communicatively connected to the interaction module and the voice generation module, and is used for executing semantic understanding and natural language generation tasks; and the voice generation module is used for converting text information into voice signals and outputting them.
  10. The dialogue communication system based on a non-invasive brain-computer interface and a multimodal AI large model of claim 9, wherein the interaction module includes a bi-directional dialogue communication sub-module configured to support at least two interaction channels: a first interaction channel configured to receive external voice input, call the AI large model processing module to generate at least one candidate reply text, and output the candidate reply text through the voice generation module after it is selected by the user through the visual interaction interface; and/or a second interaction channel configured to provide a character input function through the visual interaction interface, recognize the character sequence input by the user, call the AI large model processing module to perform semantic optimization on the character sequence, and output the optimized text through the voice generation module.

Description

Dialogue communication method and system based on non-invasive brain-computer interface and multimodal AI large model

Technical Field

The invention relates to the intersection of brain-computer interface technology and intelligent care, and in particular to a dialogue communication method and system based on a non-invasive brain-computer interface and a multimodal AI large model.

Background

For assisted communication by people with severe motor and language dysfunction (such as severely hemiplegic patients), prior-art schemes have significant limitations in communication efficiency, autonomy, and user experience. The first class of conventional solutions relies on a caregiver observing the patient's limited limb movements or simple vocalizations. This communication mode essentially reduces the user's rich internal intentions (high-dimensional information) to extremely limited physical signals for transmission; it has very low information entropy and a high error rate, leading to low communication efficiency and frequent misjudgment of the user's needs. The second class, auxiliary devices based on specific biological signals such as eye trackers, adds one dimension of control channel (gaze-point coordinates) but has inherent drawbacks. From a communication-system perspective, the eye-movement signal is easily disturbed by "channel noise" such as eye fatigue, changes in ambient illumination, and unintentional glances, so the communication link is insufficiently stable and unreliable. In addition, its functionality is often limited to simple interface navigation and control; it lacks the ability to understand and generate the user's deep semantic intent, cannot support complete natural conversations, and operates at too low a level of the communication protocol.
A third class, existing non-invasive brain-computer interface techniques, particularly those based on the steady-state visual evoked potential (SSVEP) or P300 event-related potential paradigms, has focused on medical rehabilitation training or basic spelling applications. These systems have the following problems: 1) Low integration: most are isolated "stimulus-decoding" modules that are not deeply integrated with an upstream natural language processing (NLP) engine, forming only "half" of a communication system; the user must transmit information through cumbersome encoding (such as character-by-character spelling), so the communication rate (measured by the information transfer rate, ITR) and the user experience are poor. 2) Noise such as power-line interference and myoelectric artifacts present in real usage environments severely degrades the signal-to-noise ratio (SNR) of the EEG signal, and existing schemes often lack adaptive front-end signal enhancement and robust feature extraction algorithms, so decoding accuracy drops sharply in non-ideal environments. 3) Rigid equipment and protocols: the devices are bulky and complicated to operate, the communication protocol is fixed, the interaction strategy cannot be dynamically adjusted according to the user's state or environment, and personalized adaptation is lacking. In summary, the common defects of prior-art schemes can be summarized as insufficient communication-link bandwidth, poor signal transmission reliability, and a low degree of intelligence in the system protocol stack, resulting in poor user autonomy, low interaction efficiency, high learning cost, and a strained communication experience.
Disclosure of Invention

The invention aims to provide a dialogue communication method and system based on a non-invasive brain-computer interface and a multimodal AI large model that can solve the existing problems. The technical problems solved by the invention are as follows: 1. The interaction link in the prior art suffers from bottleneck and dimension-reduction problems: the traditional mode relies on residual limb function or simple vocalization, and essentially reduces the brain's complex intentions (high-dimensional information) to physical actions or simple syllables of limited dimension (low-dimensional information); it has low information entropy, is easily distorted, and is inefficient. Existing auxiliary equipment such as eye trackers increases the information dimension, but the signal (gaze-point coordinates) does not directly carry semantics and must be converted into interface instructions; the process is long, and the signal is unstable, being susceptible to fatigue and to interference from ambient light. 2. The prior art mostly consists of isolated functional modules and lacks organic integration: the brain-computer interface is mostly used for rehabilitation training, the eye tracker is used f