CN-122018953-A - Method for developing AI application based on responsive streaming interaction
Abstract
The invention discloses a method for developing an AI application based on responsive streaming interaction. The method comprises a responsive streaming interaction architecture with a layered design, the architecture comprising a streaming data pipeline layer, an asynchronous event processing layer, a model reasoning layer, and an interface response layer. Incremental transmission and decoupled processing of data are realized by constructing an end-to-end streaming data processing link; non-blocking task scheduling is carried out on the basis of an event-driven architecture supporting lifecycle events, data stream events, and tool call events; a streaming reasoning mechanism enables the AI model to receive data and output results simultaneously; and real-time updating of the interface is realized with a responsive UI framework and a state snapshot-plus-increment mode. The invention solves problems of traditional AI application development such as high interaction response latency, low data stream processing efficiency, and a fragmented user experience; it improves the real-time interaction performance and system throughput of the AI application, and at the same time enhances the fault tolerance and stability of the system through a breakpoint resume mechanism and a state verification mechanism.
Inventors
- WANG LONG
Assignees
- 语联网(武汉)信息技术有限公司
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2025-12-23
Claims (8)
- 1. A method for developing an AI application based on responsive streaming interaction, comprising a responsive streaming interaction architecture with a layered design, characterized in that the responsive streaming interaction architecture comprises a streaming data pipeline layer, an asynchronous event processing layer, a model reasoning layer, and an interface response layer; the method comprises the following steps: constructing a stream data processing link and performing incremental transmission and decoupled processing of input data through the streaming data pipeline layer; performing non-blocking task scheduling based on an event-driven architecture through the asynchronous event processing layer; performing incremental reasoning on the received stream data through the model reasoning layer and triggering corresponding model output events; and subscribing to state change events through the interface response layer, driving the user interface to perform real-time responsive updates according to a state snapshot and incremental update information.
- 2. The method of claim 1, wherein the streaming data pipeline layer comprises a responsive streaming library, a backpressure control mechanism, and multi-modal data partitioning, and the method comprises the following steps: the streaming data pipeline layer builds the data flow using the responsive streaming library and supports the backpressure control mechanism; the input data comprises one or more modalities among text, image, and audio, and is streamed as multi-modal data blocks.
- 3. The method of claim 1, wherein a unified event model is defined in the asynchronous event processing layer, the event model comprising event type, timestamp, and payload attributes.
- 4. The method of claim 3, wherein the event types include: a lifecycle event for monitoring the running state of the AI agent; a data stream event for carrying incremental data or marking the end of a stream; and a tool call event for managing calls to external APIs and the return of their results.
- 5. The method of claim 1, wherein the model reasoning layer comprises a streaming inference adapter, a dynamic batching engine, and attention mechanism optimization; the model reasoning layer integrates the AI model and is made compatible with different model frameworks through the streaming inference adapter; the model reasoning layer supports streaming inference, triggering a model output event each time an output unit is generated during inference, and supports early termination of the inference process through a termination event.
- 6. The method of claim 1, wherein the interface response layer comprises a responsive UI framework, a state snapshot-increment mode, and automatic state subscription; the interface response layer is implemented on the responsive UI framework and performs state management in a snapshot-plus-increment mode, specifically comprising: transmitting a complete state snapshot at initial loading; transmitting only incremental update information for the changed fields upon a state change; and realizing automatic updating in the interface response layer by subscribing to the state stream.
- 7. The method according to claim 1, further comprising a fault tolerance and recovery mechanism, specifically comprising: a breakpoint resume mechanism that records the identifier of the last processed data block so that transmission can continue from the breakpoint after the network is restored; and a state verification mechanism in which the front end periodically sends a state verification event, its hash value is compared with the hash value of the back-end state, and full synchronization is triggered when the two are inconsistent.
- 8. The method of claim 1, wherein the method employs an event stream communication protocol including, but not limited to, the definition of a standard event schema and an event stream schema.
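The unified event model of claims 3 and 4 (event type, timestamp, payload) and the non-blocking dispatch of the asynchronous event processing layer can be illustrated with a minimal sketch. The Python below is illustrative only, not the claimed implementation; all class, enum, and event names are assumptions chosen for the example.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, List

class EventType(Enum):
    # Lifecycle events: monitor the running state of the AI agent
    AGENT_STARTED = "agent.started"
    AGENT_FINISHED = "agent.finished"
    # Data stream events: carry incremental data or mark end-of-stream
    DATA_CHUNK = "stream.chunk"
    STREAM_END = "stream.end"
    # Tool call events: manage external API calls and their results
    TOOL_CALL = "tool.call"
    TOOL_RESULT = "tool.result"

@dataclass
class Event:
    """Unified event model: type, timestamp, and payload attributes."""
    type: EventType
    payload: Any = None
    timestamp: float = field(default_factory=time.time)

class EventBus:
    """Minimal publish/subscribe dispatcher: producers emit events and
    return immediately; interested layers react via their handlers."""
    def __init__(self) -> None:
        self._subscribers: Dict[EventType, List[Callable[[Event], None]]] = {}

    def subscribe(self, etype: EventType, handler: Callable[[Event], None]) -> None:
        self._subscribers.setdefault(etype, []).append(handler)

    def publish(self, event: Event) -> None:
        for handler in self._subscribers.get(event.type, []):
            handler(event)
```

In use, the interface response layer would subscribe to `DATA_CHUNK` events and append each payload to the rendered output, while `STREAM_END` marks the end of the stream; the model reasoning layer publishes one `DATA_CHUNK` event per generated output unit.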
Description
Method for developing AI application based on responsive streaming interaction

Technical Field

The invention relates to a development method for AI applications, in particular to a method for developing an AI application based on responsive streaming interaction, and belongs to the technical field of artificial intelligence application development.

Background

In the field of artificial intelligence application development, the traditional interaction mode mostly follows a request-response pattern: after the user inputs complete data, the system must execute full data processing and generate a final result before returning anything to the user. This pattern leads to significant interaction delays when dealing with large-scale data or complex model reasoning. In the prior art: 1) The interactive response method, related device, electronic device, and storage medium disclosed in publication CN116954461A judge, through an artificial intelligence model, whether feedback data can answer a user request, and introduce a networked query mechanism when it cannot. Although response accuracy is improved, the method still depends on a complete request-response cycle and cannot solve real-time block transmission and incremental reasoning over a data stream; in particular, when multi-modal data or long text conversations are processed, the user still has to wait for full processing to complete, and the experience is poor. 2) The large-model-based streaming dialogue interfacing disclosed in publication CN118351831A supports streaming of voice and text, but its architecture still centers on unidirectional data flow and back-end batch processing, and lacks a hierarchically decoupled, event-driven asynchronous scheduling mechanism for the data stream, so system throughput is limited in high-concurrency scenarios and interface updating lags. In summary, the common defects of the prior art are: first, the data processing links are tightly coupled, with synchronous blocking at every stage from input through reasoning to interface updating, so truly low-latency pipelined operation cannot be achieved; second, a unified stream event protocol for coordinating incremental data transmission, model reasoning, and interface state synchronization is lacking, so interaction response is not smooth enough; third, architectural extensibility is insufficient, making it difficult to adapt to complex AI application scenarios such as dynamic switching among multiple models and tool call integration. Therefore, a development method that deeply fuses a streaming data pipeline, asynchronous event processing, streaming model reasoning, and responsive interface updating is needed to systematically solve problems such as high interaction latency, low data processing efficiency, and a fragmented user experience.

Disclosure of Invention

The present invention aims to solve at least one of the above-mentioned problems and to provide a method for developing AI applications based on responsive streaming interaction.
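To make the decoupled, pipelined operation described above concrete, the following sketch shows an end-to-end streaming link in which a bounded queue separates producer and consumer and provides backpressure: when the buffer is full, the producer is suspended until the consumer catches up, so no stage blocks the others synchronously. This is an illustrative sketch under stated assumptions, not the patented implementation; the chunk size, queue capacity, and function names are all invented for the example.

```python
import asyncio

async def chunk_source(text: str, queue: asyncio.Queue, chunk_size: int = 4) -> None:
    """Producer: split the input into data blocks and stream them downstream.
    `queue.put` suspends when the bounded queue is full, so a slow consumer
    naturally throttles the producer (backpressure)."""
    for i in range(0, len(text), chunk_size):
        await queue.put(text[i:i + chunk_size])
    await queue.put(None)  # end-of-stream marker

async def incremental_consumer(queue: asyncio.Queue) -> str:
    """Consumer: process each data block as soon as it arrives, rather than
    waiting for the full input (incremental processing)."""
    parts = []
    while (chunk := await queue.get()) is not None:
        parts.append(chunk)
    return "".join(parts)

async def run_pipeline(text: str) -> str:
    """Wire producer and consumer through a bounded buffer of 2 blocks."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)
    producer = asyncio.create_task(chunk_source(text, queue))
    result = await incremental_consumer(queue)
    await producer
    return result
```

A real system would replace the string chunks with multi-modal data blocks and the consumer with the model reasoning layer, but the decoupling and backpressure behavior is the same: the queue capacity, not any stage's synchronous completion, governs the flow.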
The method for developing an AI application based on responsive streaming interaction comprises a responsive streaming interaction architecture with a layered design, wherein the architecture comprises a streaming data pipeline layer, an asynchronous event processing layer, a model reasoning layer, and an interface response layer; the method comprises the following steps: constructing a stream data processing link and performing incremental transmission and decoupled processing of input data through the streaming data pipeline layer; performing non-blocking task scheduling based on an event-driven architecture through the asynchronous event processing layer; performing incremental reasoning on the received stream data through the model reasoning layer and triggering corresponding model output events; and subscribing to state change events through the interface response layer, driving the user interface to perform real-time responsive updates according to a state snapshot and incremental update information. As a still further aspect of the present invention, the streaming data pipeline layer includes a responsive streaming library, a backpressure control mechanism, and multi-modal data partitioning, and specifically: the streaming data pipeline layer builds the data stream using the responsive streaming library and supports the backpressure control mechanism; the input data comprises one or more modalities among text, image, and audio, and is streamed as multi-modal data blocks. As a still further aspect of the present invention, a unified event model is defined in the asynchronous event processing layer, the event model comprising event type, timestamp, and payload attributes. As a