CN-121982491-A - Data processing method and device for image understanding and visual question-answering task

Abstract

The invention discloses a data processing method and device for image understanding and visual question-answering tasks. It addresses the technical problems in the related art that tasks are easily lost or pile up for long periods, and that resources become congested at peak times or images are analyzed repeatedly. The method comprises: when the writing of a new file is detected, screening out completely written picture files through stability identification; parsing the image metadata of the picture file to determine the event source type and the image path; distributing the picture file to multi-channel AI queues for consumption according to the event source type and the image path, in combination with a distribution strategy; and, when rate limiting or a QPS overrun occurs during AI inference, executing a rate-limiting and persistent retry mechanism based on the queue consumption container until AI inference is completed and an inference result is returned.

Inventors

  • ZOU YU

Assignees

  • 熵云脑机(杭州)科技有限公司

Dates

Publication Date
20260505
Application Date
20260122

Claims (10)

  1. A data processing method for image understanding and visual question-answering tasks, characterized by comprising the following steps: when the writing of a new file is detected, screening out completely written picture files through stability identification; parsing the image metadata of the picture file to determine the event source type and the image path; distributing the picture file to multi-channel AI queues for consumption according to the event source type and the image path, in combination with a distribution strategy; and, when rate limiting or a QPS overrun occurs during AI inference, executing a rate-limiting and persistent retry mechanism based on the queue consumption container until the AI inference is completed and an inference result is returned.
  2. The data processing method for image understanding and visual question-answering tasks according to claim 1, wherein distributing the picture file to multi-channel AI queues for consumption according to the event source type and the image path, in combination with a distribution strategy, comprises: determining, according to the event source type, the multi-channel AI queues that the picture file enters; acquiring an input control state, a weekly time plan, and a system switch; and dispatching the picture file from the image path to at least one AI channel queue among the multi-channel AI queues for consumption according to the control state, the weekly time plan, and the system switch, and cleaning up the queue task in the AI channel queue after consumption is completed.
  3. The data processing method for image understanding and visual question-answering tasks according to claim 2, wherein, when the event source type is an internal point source, the multi-channel AI queues include a visual question-answering AI channel queue and an image understanding AI channel queue, the visual question-answering AI channel queue including a first priority queue and a second priority queue; and, when the event source type is a third-party source, the multi-channel AI queues include an image understanding AI channel queue.
  4. The data processing method for image understanding and visual question-answering tasks according to claim 1, wherein each AI channel queue of the multi-channel AI queues is provided with a corresponding queue consumption container, and wherein executing, when rate limiting or a QPS overrun occurs during AI inference, a rate-limiting and persistent retry mechanism based on the queue consumption container until AI inference is completed and an inference result is returned comprises: when rate limiting or a QPS overrun occurs during AI inference, determining the target AI channel queue in which it occurred, and suspending consumption by the queue consumption container corresponding to the target AI channel queue; acquiring the task parameters of the queue task currently being processed in the target AI channel queue, serializing the task parameters and writing them to a ZSet so that retries are performed in order, and at the same time starting a lightweight unlocking mechanism to avoid concurrent repeated consumption; scanning the expired tasks in the ZSet in timed batches; if the retry of the current expired task succeeds, deleting the retry record of that task, cleaning up the temporary file, and restoring the rate-limiting state; if the retry of the current expired task fails and the error is of a type other than rate limiting, cleaning up the file of that task and terminating the retry; when all expired tasks of the current scan have been executed, resuming normal processing in the queue consumption container corresponding to the target AI channel queue, and automatically continuing historical compensation and the consumption of new queue messages; and, when every AI channel queue of the multi-channel AI queues has successfully completed consumption, completing AI inference and returning an inference result.
  5. The data processing method for image understanding and visual question-answering tasks according to claim 1, further comprising: during AI inference, when a thread pool reaches its maximum load, starting a rejection protection mechanism for the current processing task entering the thread pool, and extracting the task parameters of the current processing task; and persisting the complete task parameters of the current processing task to the ZSet, and automatically deleting the corresponding task parameters of the current processing task from the original queue.
  6. The data processing method for image understanding and visual question-answering tasks according to any one of claims 1 to 5, wherein, when the event source type of the picture file is an internal point source, the method further comprises: after AI inference is completed and an inference result is successfully returned, sequentially executing event triggering actions to form a complete closed-loop event processing chain, wherein the event triggering actions comprise event rule matching, deduplication judgment and event storage, hit count increment, linkage outbound, multi-level picture archiving, fixed-size thumbnail generation, and temporary file cleanup.
  7. The data processing method for image understanding and visual question-answering tasks according to claim 6, further comprising: periodically detecting the information read-write delay, the running state of the queue consumption containers, and the queue backlog of the multi-channel AI queues; and, when an abnormality is detected, sequentially executing automatic abnormality recovery actions on the abnormal queue consumption container, wherein the automatic abnormality recovery actions comprise stopping, rebuilding, starting, and resuming historical compensation.
  8. A data processing apparatus for image understanding and visual question-answering tasks, comprising: a picture file detection unit, configured to screen out completely written picture files through stability identification when the writing of a new file is detected; a metadata analysis unit, configured to parse the image metadata of the picture file and determine the event source type and the image path; a queue distribution unit, configured to distribute the picture file to multi-channel AI queues for consumption according to the event source type and the image path, in combination with a distribution strategy; and a consumption retry unit, configured to execute, when rate limiting or a QPS overrun occurs during AI inference, a rate-limiting and persistent retry mechanism based on the queue consumption container until the AI inference is completed and an inference result is returned.
  9. An electronic device, comprising a processor and a memory, wherein: the memory is configured to store program code and transmit the program code to the processor; and the processor is configured to execute, according to instructions in the program code, the data processing method for image understanding and visual question-answering tasks according to any one of claims 1 to 7.
  10. A computer-readable storage medium for storing program code, the program code being used to perform the data processing method for image understanding and visual question-answering tasks according to any one of claims 1 to 7.
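
The retry flow recited in claim 4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `RateLimited`, `RetryStore`, and `Consumer`, the fixed retry delay, and the batch size are all hypothetical, and an in-memory dictionary stands in for the ZSet (which the claims do not tie to a specific product; a Redis sorted set would be one plausible choice). Temporary-file cleanup is omitted.

```python
import json
from typing import Callable, Dict, List, Set


class RateLimited(Exception):
    """Hypothetical error raised when the AI service throttles a request."""


class RetryStore:
    """In-memory stand-in for a ZSet: serialized task -> score (due time)."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}
        self._locks: Set[str] = set()

    def add(self, task: dict, due_at: float) -> None:
        # Serialize with sorted keys so the same task always maps to one member.
        self._scores[json.dumps(task, sort_keys=True)] = due_at

    def due(self, now: float, limit: int) -> List[str]:
        # Members whose score has expired, oldest first, at most `limit`.
        ready = sorted((s, m) for m, s in self._scores.items() if s <= now)
        return [m for _, m in ready[:limit]]

    def remove(self, member: str) -> None:
        self._scores.pop(member, None)

    def try_lock(self, member: str) -> bool:
        # Lightweight per-task lock to avoid concurrent repeated consumption.
        if member in self._locks:
            return False
        self._locks.add(member)
        return True

    def unlock(self, member: str) -> None:
        self._locks.discard(member)

    def __len__(self) -> int:
        return len(self._scores)


class Consumer:
    """One queue consumption container bound to one AI channel queue."""

    def __init__(self, infer: Callable[[dict], str], store: RetryStore,
                 retry_delay: float = 5.0) -> None:
        self.infer = infer
        self.store = store
        self.retry_delay = retry_delay  # assumed fixed delay before a retry
        self.paused = False
        self.results: List[str] = []

    def handle(self, task: dict, now: float) -> None:
        try:
            self.results.append(self.infer(task))
        except RateLimited:
            # Suspend consumption and persist the task for an ordered retry.
            self.paused = True
            self.store.add(task, now + self.retry_delay)

    def scan(self, now: float, batch: int = 10) -> None:
        processed, still_limited = 0, False
        for member in self.store.due(now, batch):
            if not self.store.try_lock(member):
                continue
            processed += 1
            try:
                self.results.append(self.infer(json.loads(member)))
                self.store.remove(member)      # success: delete retry record
            except RateLimited:
                still_limited = True           # reschedule, stay paused
                self.store.add(json.loads(member), now + self.retry_delay)
            except Exception:
                self.store.remove(member)      # other error: stop retrying
            finally:
                self.store.unlock(member)
        if processed and not still_limited:
            self.paused = False                # resume normal consumption
```

In this sketch a timer would call `scan` periodically; the container resumes only after a scan in which at least one expired task ran and none of them was throttled again, which is one possible reading of the resume condition in claim 4.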

Description

Data processing method and device for image understanding and visual question-answering task

Technical Field

The present invention relates to the field of data processing technologies, and in particular to a data processing method for image understanding and visual question-answering tasks, a corresponding data processing device, an electronic device, and a storage medium.

Background

In engineering implementations of computer vision tasks, it is common practice to write pictures captured by an image capture device (such as a camera) into a file directory or object storage, and then to implement concurrent consumption and back-end analysis through a message queue or stream processing framework (such as Kafka, RabbitMQ, or Redis Streams). The consumer usually uses multi-threaded processing; the recognition result is stored and triggers downstream linkage (alarms, notifications, work orders, etc.).

In the related art, calls to external AI services (e.g., visual question answering, VQA; image understanding/embedding, IRAG (Image Retrieval and Generation)) commonly use simple retry strategies (e.g., exponential backoff, delay queues, or dead-letter queues). In high-concurrency or rate-limited scenarios, this approach can easily lead to task loss or long-term backlog. Multi-priority and time-plan control is generally based on a uniform rate or coarse-grained priorities, which easily causes resource congestion at peak times or repeated analysis.
Disclosure of Invention

The invention provides a data processing method for image understanding and visual question-answering tasks, a corresponding data processing device, an electronic device, and a storage medium, which solve or partially solve the technical problems in the related art that tasks are easily lost or pile up for long periods, and that resources become congested at peak times or images are analyzed repeatedly.

The invention provides a data processing method for image understanding and visual question-answering tasks, comprising the following steps: when the writing of a new file is detected, screening out completely written picture files through stability identification; parsing the image metadata of the picture file to determine the event source type and the image path; distributing the picture file to multi-channel AI queues for consumption according to the event source type and the image path, in combination with a distribution strategy; and, when rate limiting or a QPS overrun occurs during AI inference, executing a rate-limiting and persistent retry mechanism based on the queue consumption container until the AI inference is completed and an inference result is returned.

Optionally, distributing the picture file to multi-channel AI queues for consumption according to the event source type and the image path, in combination with a distribution strategy, comprises: determining, according to the event source type, the multi-channel AI queues that the picture file enters; acquiring an input control state, a weekly time plan, and a system switch; and dispatching the picture file from the image path to at least one AI channel queue among the multi-channel AI queues for consumption according to the control state, the weekly time plan, and the system switch, and cleaning up the queue task in the AI channel queue after consumption is completed.
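
As a rough illustration of the stability identification step and the distribution strategy described above, the following Python sketch shows one way they might look. Everything in it is an assumption for illustration: the text does not define how stability is measured (here, unchanged size and modification time over several polls), nor the shape of the weekly time plan, the queue names, or how a priority queue is chosen. The mapping from event source type to queues follows the optional queue layout in this disclosure, and sending an internal point source picture to both a visual question-answering queue and the image understanding queue is only one possible reading of "at least one AI channel queue".

```python
import os
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def is_write_complete(path: str, checks: int = 3, interval: float = 0.2,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Assumed stability test: the file exists, is non-empty, and its size
    and mtime stay unchanged across `checks` consecutive polls."""
    last = None
    for i in range(checks):
        if i:
            sleep(interval)  # wait between polls, not before the first one
        try:
            st = os.stat(path)
        except FileNotFoundError:
            return False
        current = (st.st_size, st.st_mtime_ns)
        if st.st_size == 0 or (last is not None and current != last):
            return False
        last = current
    return True


@dataclass
class Policy:
    """Hypothetical container for the three inputs named in the text."""
    system_switch: bool       # global on/off switch
    control_enabled: bool     # input control state
    # weekday (0 = Monday) -> list of (start_hour, end_hour) windows
    weekly_plan: Dict[int, List[Tuple[int, int]]] = field(default_factory=dict)


def select_queues(source_type: str, high_priority: bool, policy: Policy,
                  weekday: int, hour: int) -> List[str]:
    """Return the (hypothetical) queue names a picture should be sent to."""
    if not (policy.system_switch and policy.control_enabled):
        return []
    windows = policy.weekly_plan.get(weekday, [])
    if not any(start <= hour < end for start, end in windows):
        return []  # outside the weekly time plan
    if source_type == "internal":
        # Internal point source: a VQA priority queue plus image understanding.
        vqa = "vqa:p1" if high_priority else "vqa:p2"
        return [vqa, "image_understanding"]
    if source_type == "third_party":
        return ["image_understanding"]
    return []
```

A file watcher would call `is_write_complete` on each newly detected path and pass only stable files on to `select_queues`.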
Optionally, when the event source type is an internal point source, the multi-channel AI queues include a visual question-answering AI channel queue and an image understanding AI channel queue, the visual question-answering AI channel queue including a first priority queue and a second priority queue; when the event source type is a third-party source, the multi-channel AI queues include an image understanding AI channel queue.

Optionally, each AI channel queue of the multi-channel AI queues is provided with a corresponding queue consumption container, and executing, when rate limiting or a QPS overrun occurs during AI inference, a rate-limiting and persistent retry mechanism based on the queue consumption container until AI inference is completed and an inference result is returned comprises: when rate limiting or a QPS overrun occurs during AI inference, determining the target AI channel queue in which it occurred, and suspending consumption by the queue consumption container corresponding to the target AI channel queue; acquiring the task parameters of the queue task currently being processed in the target AI channel queue, serializing the task parameters and writing them to a ZSet so that retries are performed in order, and at the same time starting a lightweight unlocking mechanism to avoid concurrent repeated consumption; scanning the expired tasks in the ZSet in timed batches; If the retry executio