CN-122019599-A - Method, device, equipment, medium and product for processing stream data

Abstract

The application relates to the technical field of data processing and discloses a method, an apparatus, a device, a medium and a product for processing streaming data. The method comprises: acquiring target streaming data to be processed; performing tag identification on the target streaming data to determine a target tag in the target streaming data, wherein the target tag has a predefined tag level; performing level verification on the target tag according to its tag level; and, when the target tag passes the level verification, performing corresponding parsing on the target tag. The application can verify the hierarchical nesting relationship and the validity of the target tag, which ensures that tags are correctly nested, supports the accuracy of subsequent data processing, and helps avoid problems such as parsing errors.

Inventors

  • LIU YIQI

Assignees

  • 杭州网易智企科技有限公司

Dates

Publication Date
20260512
Application Date
20251215

Claims (17)

  1. A method for processing streaming data, the method comprising: acquiring target streaming data to be processed; performing tag identification on the target streaming data to determine a target tag in the target streaming data, wherein the target tag has a predefined tag level; performing level verification on the target tag according to the tag level of the target tag; and, when the target tag passes the level verification, performing corresponding parsing on the target tag.
  2. The method of claim 1, wherein performing tag identification on the target streaming data to determine the target tag in the target streaming data comprises: performing integrity detection on the target streaming data to determine the complete data in the target streaming data; and performing tag identification on the complete data to determine the target tag in the complete data.
  3. The method of claim 2, wherein performing integrity detection on the target streaming data to determine the complete data in the target streaming data comprises: determining whether an incomplete tag exists in the target streaming data; and, when an incomplete tag exists, taking the data in the target streaming data other than the incomplete tag as the complete data.
  4. The method of claim 3, wherein determining whether an incomplete tag exists in the target streaming data comprises: searching the target streaming data for a first start character and a first end character, and/or a last start character and a last end character; when the first start character is located after the first end character, determining that an incomplete tag exists in the target streaming data and taking the portion of the target streaming data before the first end character as the incomplete tag; and, when the last start character is located after the last end character, determining that an incomplete tag exists in the target streaming data and taking the portion of the target streaming data after the last start character as the incomplete tag.
  5. The method of claim 3, further comprising: when an incomplete tag exists, storing the incomplete tag in a cache, or combining the incomplete tag with data recorded in the cache to form a complete tag so that the complete tag can be parsed.
  6. The method of claim 1, wherein performing level verification on the target tag according to the tag level of the target tag comprises: when the target tag is a start tag, determining whether the tag level of the start tag matches the current stack depth of a state stack; and, when the tag level of the start tag matches the current stack depth of the state stack, determining that the start tag passes the level verification; the method further comprising: when the start tag passes the level verification, pushing the start tag onto the state stack.
  7. The method of claim 6, wherein determining that the start tag passes the level verification when the tag level of the start tag matches the current stack depth of the state stack comprises: when the tag level of the start tag matches the current stack depth of the state stack and the start tag is a secondary tag, determining whether the tag at the top of the state stack is allowed to serve as a parent tag of the start tag; and, when the tag at the top of the state stack is allowed to serve as a parent tag of the start tag, determining that the start tag passes the level verification.
  8. The method of claim 6, wherein performing level verification on the target tag according to the tag level of the target tag further comprises: when the target tag is an end tag, determining whether the end tag is consistent with the tag at the top of the state stack; and, when the end tag is consistent with the tag at the top of the state stack, determining that the end tag passes the level verification; the method further comprising: when the end tag passes the level verification, removing the tag at the top of the state stack.
  9. The method of claim 1, further comprising: when the target tag does not pass the level verification, performing text processing on the text content corresponding to the target tag.
  10. The method of claim 1, wherein performing corresponding parsing on the target tag comprises: when the target tag is a start tag, triggering a preset start callback function; when the target tag is an end tag, triggering a preset completion callback function; and, when a content block exists between a start tag and the corresponding end tag in the target streaming data, triggering a preset progress callback function for the content block.
  11. The method according to any one of claims 1 to 10, further comprising: determining whether a tool call request exists in the target streaming data; when a tool call request for a first tool exists in the target streaming data and a second tool in an active state currently exists, determining whether the first tool and the second tool are the same tool; and, if they are not the same tool, ending the second tool and starting the call to the first tool.
  12. The method of claim 11, wherein determining whether the first tool and the second tool are the same tool, when a tool call request for the first tool exists in the target streaming data and a second tool in an active state currently exists, comprises: when a tool call request exists in the target streaming data, extracting the corresponding tool information; and, when the tool information contains the tool name of the first tool and a second tool in an active state currently exists, determining whether the tool name of the first tool is the same as the tool name of the second tool; the method further comprising: when the tool information does not contain a tool name, determining whether the tool information contains a tool parameter; and, when the tool information contains a tool parameter and a tool in an active state currently exists, accumulating the tool parameter until the tool call is completed.
  13. The method of claim 11, further comprising: when no tool call request exists in the target streaming data, determining whether a tool in an active state currently exists; and, when a third tool in an active state currently exists, ending the third tool.
  14. A streaming data processing apparatus, the apparatus comprising: an acquisition module for acquiring target streaming data to be processed; an identification module for performing tag identification on the target streaming data to determine a target tag in the target streaming data, wherein the target tag has a predefined tag level; a processing module for performing level verification on the target tag according to the tag level of the target tag; and a parsing module for performing corresponding parsing on the target tag when the target tag passes the level verification.
  15. An electronic device, comprising: a memory and a processor communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the method for processing streaming data according to any one of claims 1 to 13.
  16. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method for processing streaming data according to any one of claims 1 to 13.
  17. A computer program product comprising computer instructions for causing a computer to perform the method for processing streaming data according to any one of claims 1 to 13.
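
The integrity detection and caching of claims 3 to 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the delimiter characters `<` and `>`, the names `split_incomplete` and `TagAssembler`, and the decision to treat an end character with no preceding start character as closing an incomplete tag are all assumptions made for illustration.

```python
START, END = "<", ">"  # assumed start/end characters; the source does not name them


def split_incomplete(chunk: str) -> tuple[str, str, str]:
    """Split one chunk into (head_fragment, complete_data, tail_fragment).

    head_fragment: closing part of a tag that began in an earlier chunk.
    tail_fragment: opening part of a tag that continues in a later chunk.
    """
    head, tail = "", ""
    first_start, first_end = chunk.find(START), chunk.find(END)
    # First start character after the first end character (or, by assumption,
    # no start character at all): the chunk opens in the middle of a tag.
    # The end character itself is kept with the fragment (an assumption).
    if first_end != -1 and (first_start == -1 or first_start > first_end):
        head, chunk = chunk[: first_end + 1], chunk[first_end + 1 :]
    last_start, last_end = chunk.rfind(START), chunk.rfind(END)
    # Last start character after the last end character: the chunk ends mid-tag.
    if last_start != -1 and last_start > last_end:
        tail, chunk = chunk[last_start:], chunk[:last_start]
    return head, chunk, tail


class TagAssembler:
    """Caches a trailing incomplete tag and joins it with the next chunk."""

    def __init__(self) -> None:
        self.cache = ""

    def feed(self, chunk: str) -> str:
        # Joining the cached fragment with the new chunk forms the complete tag.
        head, complete, tail = split_incomplete(self.cache + chunk)
        self.cache = tail
        # A head fragment with nothing cached before it is passed through as text.
        return head + complete


assembler = TagAssembler()
outputs = [assembler.feed(c) for c in ["<th", "ink>hi", " there<", "/think>"]]
```

Each call to `feed` returns only data that is safe to hand to tag identification; a tag split across chunk boundaries is withheld until its remainder arrives.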
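
The tool-call handling of claims 11 to 13 can be sketched in Python as follows. This is a hedged illustration only: the class `ToolCallTracker`, the dictionary keys `"name"` and `"arguments"`, and the `finished` list are hypothetical, since the source does not specify how tool information is represented.

```python
from typing import Optional


class ToolCallTracker:
    """Tracks at most one active tool call across streamed chunks."""

    def __init__(self) -> None:
        self.active: Optional[str] = None  # name of the tool currently active
        self.args = ""                     # parameters accumulated so far
        self.finished: list = []           # (tool name, parameters) of ended calls

    def _end_active(self) -> None:
        if self.active is not None:
            self.finished.append((self.active, self.args))
            self.active, self.args = None, ""

    def feed(self, tool_info: Optional[dict]) -> None:
        # No tool call request in this chunk: end any tool that is still active.
        if tool_info is None:
            self._end_active()
            return
        name = tool_info.get("name")
        if name:
            # A different tool is requested: end the active one, start the new one.
            if self.active is not None and self.active != name:
                self._end_active()
            if self.active is None:
                self.active = name
        else:
            # No tool name: accumulate parameters for the active tool, if any.
            params = tool_info.get("arguments")
            if params and self.active is not None:
                self.args += params


tracker = ToolCallTracker()
for info in [{"name": "search"}, {"arguments": '{"q":'}, {"arguments": '"x"}'},
             {"name": "calc"}, None]:
    tracker.feed(info)
```

In this sketch a call ends either when a differently named tool is requested or when a chunk carries no tool call request at all.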

Description

Method, device, equipment, medium and product for processing stream data

Technical Field

The present application relates to the field of data processing technologies, and in particular to a method, an apparatus, a device, a medium, and a product for processing streaming data.

Background

Stream processing performs real-time computation on a continuous data stream. For example, the continuous data stream that a large language model (LLM) outputs in real time can be processed and parsed as it arrives; the technology is mainly applied in LLM streaming-generation scenarios.

At present, related schemes for streaming data processing have difficulty accurately identifying tag structures in streaming data, particularly tags with nested structures, which affects the accuracy of data parsing.

Disclosure of Invention

In view of the above, the present application provides a method, an apparatus, a device, a medium and a product for processing streaming data, so as to solve the problem that multi-layer tags in streaming data are difficult to process accurately.

In a first aspect, the present application provides a method for processing streaming data, the method comprising: acquiring target streaming data to be processed; performing tag identification on the target streaming data to determine a target tag in the target streaming data, wherein the target tag has a predefined tag level; performing level verification on the target tag according to the tag level of the target tag; and, when the target tag passes the level verification, performing corresponding parsing on the target tag.
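
The level verification and parsing steps of the first aspect, as elaborated in claims 6 to 10, can be sketched in Python with a state stack. This is an illustrative sketch under stated assumptions, not the claimed implementation: the tag names, the `TAG_LEVELS` and `ALLOWED_PARENTS` tables, the `<name>` / `</name>` tag syntax, and the rule that a tag level "matches" the stack depth when level equals depth plus one are all hypothetical.

```python
import re

# Hypothetical tag vocabulary: tag name -> predefined level (1 = top level).
TAG_LEVELS = {"plan": 1, "answer": 1, "step": 2}
# Hypothetical parent rules for secondary (level-2) tags.
ALLOWED_PARENTS = {"step": {"plan"}}

TOKEN = re.compile(r"<(/?)(\w+)>")  # assumed tag syntax


class LevelValidator:
    def __init__(self, on_start, on_progress, on_complete) -> None:
        self.stack = []  # state stack of currently open tags
        self.on_start, self.on_progress, self.on_complete = (
            on_start, on_progress, on_complete)

    def start_ok(self, name: str) -> bool:
        level = TAG_LEVELS.get(name)
        # Assumed matching rule: a level-n tag is valid at stack depth n - 1.
        if level is None or level != len(self.stack) + 1:
            return False
        if level >= 2:  # nested tag: the stack top must be an allowed parent
            return self.stack[-1] in ALLOWED_PARENTS.get(name, set())
        return True

    def end_ok(self, name: str) -> bool:
        return bool(self.stack) and self.stack[-1] == name

    def feed(self, data: str) -> None:
        pos = 0
        for m in TOKEN.finditer(data):
            self._text(data[pos:m.start()])
            closing, name = m.group(1) == "/", m.group(2)
            if closing and self.end_ok(name):
                self.stack.pop()
                self.on_complete(name)
            elif not closing and self.start_ok(name):
                self.stack.append(name)
                self.on_start(name)
            else:
                self._text(m.group(0))  # failed verification: handle as plain text
            pos = m.end()
        self._text(data[pos:])

    def _text(self, text: str) -> None:
        if text:
            self.on_progress(self.stack[-1] if self.stack else None, text)


events = []
validator = LevelValidator(
    on_start=lambda n: events.append(("start", n)),
    on_progress=lambda tag, text: events.append(("text", tag, text)),
    on_complete=lambda n: events.append(("end", n)),
)
validator.feed("<plan><step>a</step></plan><step>b</step>")
```

In the usage above, the second `<step>` appears outside any `<plan>`, so it fails level verification and is reported through the progress callback as ordinary text rather than being pushed onto the stack.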

In a second aspect, the present application provides a streaming data processing apparatus, the apparatus comprising: an acquisition module for acquiring target streaming data to be processed; an identification module for performing tag identification on the target streaming data to determine a target tag in the target streaming data, wherein the target tag has a predefined tag level; a processing module for performing level verification on the target tag according to the tag level of the target tag; and a parsing module for performing corresponding parsing on the target tag when the target tag passes the level verification.

In a third aspect, the present application provides an electronic device comprising a memory and a processor communicatively connected to each other, wherein the memory stores computer instructions and the processor executes the computer instructions to perform the method for processing streaming data according to the first aspect or any embodiment thereof.

In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method for processing streaming data according to the first aspect or any embodiment thereof.

In a fifth aspect, the present application provides a computer program product comprising computer instructions for causing a computer to perform the method for processing streaming data according to the first aspect or any embodiment thereof.

The method for processing streaming data provided by the application predefines multiple levels of tags for the streaming data, forming a nested multi-layer tag system. Level verification can be performed on the target tags in the target streaming data to be processed according to their tag levels, so that the hierarchical nesting relationship and the validity of the target tags can be verified. This ensures that tags are correctly nested, supports the accuracy of subsequent data processing, and helps avoid problems such as parsing errors.

Drawings

To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly described below. The drawings described below show some embodiments of the present application; a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic illustration of an application scenario according to an embodiment of the present application;
FIG. 2 is a first flow chart of a method for processing streaming data according to an embodiment of the present application;
FIG. 3 is a second flow chart of a method for processing streaming data according to an embodiment of the present application;
FIG. 4 is a flow diagram of a tag splitting and caching process according to an embodiment of the present application;
FIG. 5 is a schematic diagram of tag integrity detection accord