CN-121985038-A - AI chat breakpoint continuous transmission method and device based on double storage architecture
Abstract
The invention provides an AI chat breakpoint resume method and device based on a dual-storage architecture, relating to the technical field of AI dialogue. After a user inputs text at a client, an AI large model sequentially generates content blocks, and a server transmits the content blocks to the client. The server writes the content blocks into a cache storage layer and, after the AI large model finishes answering, writes all the content blocks into a persistent storage layer as complete content. After the client is disconnected from the server, the server stops transmitting content blocks to the client, and the client generates a stop position index of the content blocks. After the client reconnects with the server, if the complete content exists in the persistent storage layer, the server divides the complete content into a plurality of content blocks and transmits them to the client in sequence; if the complete content does not exist in the persistent storage layer, the server transmits the content blocks in the cache storage layer after the stop position index to the client in sequence.
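The flow described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed by the invention: the `DualStore` class, the in-memory dictionaries standing in for the two storage layers, the function names and the fixed block size are all hypothetical. Following the abstract, the persistent path re-splits and sends the complete content, while the cache path sends only the blocks after the stop position index.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DualStore:
    """Hypothetical in-memory stand-ins for the two storage layers."""
    # Cache storage layer: session id -> {sequence number: block text}.
    cache: dict[str, dict[int, str]] = field(default_factory=dict)
    # Persistent storage layer: session id -> complete answer.
    persistent: dict[str, str] = field(default_factory=dict)

    def write_block(self, session_id: str, seq: int, text: str) -> None:
        """Write one generated content block into the cache layer."""
        self.cache.setdefault(session_id, {})[seq] = text

    def finish_answer(self, session_id: str) -> None:
        """After the model finishes, persist all blocks as complete content."""
        blocks = self.cache.get(session_id, {})
        self.persistent[session_id] = "".join(blocks[s] for s in sorted(blocks))


def split_blocks(content: str, size: int) -> list[str]:
    """Divide complete content into fixed-size blocks (size is an assumption)."""
    return [content[i:i + size] for i in range(0, len(content), size)]


def resume(store: DualStore, session_id: str, stop_index: int, size: int = 4) -> list[str]:
    """Return the blocks to send after the client reconnects."""
    complete = store.persistent.get(session_id)
    if complete is not None:
        # Complete content exists: re-split it and send the blocks in order.
        return split_blocks(complete, size)
    # Otherwise send the cached blocks after the stop position index.
    blocks = store.cache.get(session_id, {})
    return [blocks[s] for s in sorted(blocks) if s > stop_index]
```

In this sketch the client would report `stop_index` on reconnection, and the server would stream whatever list `resume` returns; how the index is actually carried over the wire is not specified here.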
Inventors
- LI PENG
- WANG FEI
- WANG QIMING
Assignees
- 睿瀛科技(武汉)有限责任公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260121
- Priority Date
- 20251010
Claims (10)
- 1. An AI chat breakpoint resume method based on a dual storage architecture, characterized by comprising the following steps: after a user inputs text at a client, an AI large model sequentially generates content blocks, and a server transmits the content blocks to the client; the server writes the content blocks into a cache storage layer and, after the AI large model finishes answering, writes all the content blocks into a persistent storage layer as complete content; when the client is disconnected from the server, the server stops transmitting content blocks to the client, and the client generates a stop position index of the content blocks; after the client reconnects with the server, detecting whether the complete content exists in the persistent storage layer; if the complete content exists in the persistent storage layer, dividing the complete content into a plurality of content blocks and transmitting the content blocks to the client in sequence; and if the complete content does not exist in the persistent storage layer, transmitting the content blocks in the cache storage layer after the stop position index to the client in sequence.
- 2. The AI chat breakpoint resume method based on the dual storage architecture of claim 1, wherein the step in which, after the user inputs text at the client, the AI large model sequentially generates content blocks and the server transmits the content blocks to the client comprises: after receiving the text input at the client, the server sequentially generates response data through the AI large model and converts the response data into corresponding content blocks; the server pushes the content blocks to the client one by one, in sequence, through the SSE protocol, wherein each content block comprises text content and metadata, and the metadata comprises a sequence identifier, a completion judgment identifier and a session identifier of the content block.
- 3. The AI chat breakpoint resume method based on the dual storage architecture of claim 2, wherein the server writing the content blocks into the cache storage layer comprises: after the AI large model generates a content block, transmitting the content block to the server; after the server detects that the content block has been received, placing the content block into a variable character sequence, and writing the text content of the content block from the variable character sequence into the cache storage layer according to the sequence identifier of the content block.
- 4. The AI chat breakpoint resume method based on the dual storage architecture of claim 3, wherein writing all the content blocks into the persistent storage layer as complete content comprises: S11, the server checks the sequence identifiers, completion judgment identifiers and session identifiers of all content blocks of the complete content; if the sequence identifiers of all the content blocks are in the correct order, the completion judgment identifiers of all the content blocks indicate that the content blocks are complete, and the session identifiers of all the content blocks belong to the same answer, proceeding to step S12; otherwise, proceeding to step S13; S12, performing an integrity check on the complete content, wherein the integrity check comprises a hash value check, a content length check, a checksum check and a logical consistency check; if the integrity check passes, converting the complete content in the variable character sequence into a character string and writing the character string into the persistent storage layer; otherwise, proceeding to step S13; S13, the server sends an instruction to the AI large model to regenerate the complete content.
- 5. The AI chat breakpoint resume method based on the dual storage architecture of claim 2, wherein the stop position index is the sequence number corresponding to the sequence identifier of the last content block successfully received by the client.
- 6. The AI chat breakpoint resume method based on the dual storage architecture of claim 2, wherein dividing the complete content of the persistent storage layer into a plurality of content blocks and transmitting the content blocks to the client in sequence comprises: the server sends an answer completion instruction to the client, and divides the complete content in the persistent storage layer into a plurality of content blocks according to the sequence identifiers, completion judgment identifiers and session identifiers recorded at storage time; the server sends each content block to the client in order of sequence identifier, and transmits the next content block after the client returns confirmation information; when detecting that content blocks are lost, duplicated or out of order, the server re-extracts the corresponding content blocks from the persistent storage layer according to the sequence identifier and retransmits them.
- 7. The AI chat breakpoint resume method based on the dual storage architecture of claim 2, wherein transmitting the content blocks in the cache storage layer after the stop position index to the client in sequence comprises: identifying the content blocks in the cache storage layer after the stop position index according to the sequence identifier of each content block; sorting the identified content blocks by sequence identifier, and the server transmitting the identified content blocks to the client in the sorted order; and when detecting that the identified content blocks are lost, duplicated or out of order, the server re-extracts the corresponding content blocks from the cache storage layer according to the sequence identifier and retransmits them.
- 8. An AI chat breakpoint resume device based on a dual storage architecture, configured to implement the AI chat breakpoint resume method based on the dual storage architecture of any of claims 1 to 7, wherein the device comprises: a transmission module, configured such that after a user inputs text at the client, the AI large model sequentially generates content blocks and the server transmits the content blocks to the client; a storage module, configured to write the content blocks into the cache storage layer and, after the AI large model finishes answering, write all the content blocks into the persistent storage layer as complete content; a disconnection detection module, configured such that after the client is disconnected from the server, the server stops transmitting content blocks to the client and the client generates a stop position index of the content blocks; a reconnection detection module, configured to detect whether the complete content exists in the persistent storage layer after the client reconnects with the server; a persistent storage layer resume module, configured to, if the complete content exists in the persistent storage layer, divide the complete content into a plurality of content blocks and transmit the content blocks to the client in sequence; and a cache storage layer resume module, configured to, if the complete content does not exist in the persistent storage layer, transmit the content blocks in the cache storage layer after the stop position index to the client in sequence.
- 9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the AI chat breakpoint resume method based on the dual storage architecture of any of claims 1 to 7.
- 10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the AI chat breakpoint resume method based on the dual storage architecture of any of claims 1 to 7.
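The content block metadata of claim 2 and the pre-persistence checks of claim 4 (steps S11 to S13) can be illustrated with a short Python sketch. It is an illustration under assumptions, not the claimed implementation: the `ContentBlock` fields, the function names, the reading of "correct order" as consecutive sequence numbers, and the idea that an expected SHA-256 digest and length are available for comparison are all hypothetical. Only the hash value and content length checks of S12 are shown; the summation (checksum) and logical consistency checks are omitted.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentBlock:
    text: str
    seq: int          # sequence identifier
    done: bool        # completion judgment identifier
    session_id: str   # session identifier


def metadata_ok(blocks: list[ContentBlock]) -> bool:
    """S11: blocks are in order, each is marked complete, and all share a session."""
    if not blocks:
        return False
    # Assumption: "correct order" means consecutive sequence numbers.
    in_order = all(b.seq == blocks[0].seq + i for i, b in enumerate(blocks))
    all_done = all(b.done for b in blocks)
    same_session = len({b.session_id for b in blocks}) == 1
    return in_order and all_done and same_session


def integrity_ok(content: str, expected_sha256: str, expected_length: int) -> bool:
    """S12 (partial): hash value check and content length check only."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return digest == expected_sha256 and len(content) == expected_length


def persist_or_regenerate(
    blocks: list[ContentBlock],
    expected_sha256: str,
    expected_length: int,
    persistent: dict[str, str],
) -> str:
    """Write the complete content to the persistent layer, or request regeneration."""
    content = "".join(b.text for b in blocks)
    if metadata_ok(blocks) and integrity_ok(content, expected_sha256, expected_length):
        persistent[blocks[0].session_id] = content
        return "persisted"
    # S13: the server would instruct the AI large model to regenerate the answer.
    return "regenerate"
```

Here the `persistent` dictionary stands in for the persistent storage layer, and the returned string stands in for the decision between writing the answer and sending the regeneration instruction.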
Description
Technical Field
The invention relates to the technical field of AI dialogue, and in particular to an AI chat breakpoint resume method and device based on a dual-storage architecture.
Background
In the prior art, an AI dialogue system, such as an AI chat application that calls a large language model from within a mini program, generally conducts the chat over a real-time network connection between the front end and the back end: after the user enters a request, the front end sends it to the back end, and the dialogue content generated by the large model is returned in real time through the back end and displayed on the front-end interface. However, when the user switches away from the mini program during a question-and-answer exchange, or the connection between the front end and the back end is interrupted by network fluctuations, the system often cannot effectively present the generated AI answer to the user. Specifically, although communication between the back end and the large model continues and a complete answer can still be generated, the content cannot be delivered to the user interface in real time because the connection is broken; and after the user re-enters the AI page, most systems cannot automatically reissue the content that the front end did not receive. As a result, the conversation context is lost and continuity is broken, which seriously degrades the user experience.
In conventional technology, breakpoint resume mechanisms are mainly applied to interruption recovery for large files or media data, such as video, audio or firmware upgrades. For example, an automatic network repair technology in a video monitoring system can temporarily buffer video data in a front-end device when the network is interrupted and reissue it after the network is restored.
However, such schemes mostly aim to guarantee continuity in the transmission of static data, and differ significantly from the application scenario of an AI dialogue system. The data transmitted in an AI dialogue scenario is dynamic: the content generated by the large model is usually returned gradually in segments and rendered by the front end in real time. Once the connection is broken, the system cannot achieve seamless recovery through simple cache reissue, as conventional media transmission does. The prior art therefore lacks a complete solution for AI dialogue applications that combines front-end disconnection detection with a back-end cache reissue mechanism. Existing AI dialogue systems still cannot ensure continuous delivery of the AI answer content or the integrity of the dialogue flow when the user switches away from the application or the network is interrupted.
Disclosure of Invention
To solve the above problems, the invention provides an AI chat breakpoint resume method based on a dual-storage architecture, comprising the following steps: after a user inputs text at a client, an AI large model sequentially generates content blocks, and a server transmits the content blocks to the client; the server writes the content blocks into a cache storage layer and, after the AI large model finishes answering, writes all the content blocks into a persistent storage layer as complete content; when the client is disconnected from the server, the server stops transmitting content blocks to the client, and the client generates a stop position index of the content blocks; after the client reconnects with the server, detecting whether the complete content exists in the persistent storage layer; if the complete content exists in the persistent storage layer, dividing the complete content into a plurality of content blocks and transmitting the content blocks to the client in sequence; and if the complete content does not exist in the persistent storage layer, transmitting the content blocks in the cache storage layer after the stop position index to the client in sequence.
Optionally, the step in which, after the user inputs text at the client, the AI large model sequentially generates content blocks and the server transmits the content blocks to the client comprises: after receiving the text input at the client, the server sequentially generates response data through the AI large model and converts the response data into corresponding content blocks; the server pushes the content blocks to the client one by one, in sequence, through the SSE protocol, wherein each content block comprises text content and metadata, and the metadata comprises a sequence identifier, a completion judgment identifier and a session identifier of the content block.
Optionally, the writing, by the server, the content block into the cache