CN-121985175-A - Key event extraction and positioning processing feedback system and method


Abstract

The application belongs to the technical field of intelligent video analysis, and particularly relates to a key event extraction and positioning processing feedback system and method. The system matches suitable video analysis algorithm models and software tools to the application scene type and target key event type of each video file to be processed in a task data packet. Using real-time computing power information, it then configures, for each matched video file, the software tools and computing platform that waste the least computing power. When faced with diverse scenes, a task planning model can therefore generate a task execution strategy automatically, realizing autonomous intelligent scheduling of the algorithm models, software tools and computing platforms. The structured task processing result and the target key video segments are obtained accurately from each video file to be processed, which improves the accuracy of key event extraction as well as the overall operating efficiency of the system.

Inventors

LI HENG
CHENG PENG
LIU JIANYUN

Assignees

中国船舶集团有限公司第七〇九研究所

Dates

Publication Date: 20260505
Application Date: 20260107

Claims (10)

1. A key event extraction and positioning processing feedback method, characterized by comprising the following steps: S1, acquiring an input instruction from a user and determining the type of the input instruction, wherein the type of the input instruction comprises key event extraction and key event retrieval; S2, when the type of the input instruction is key event extraction, packaging the user interaction information corresponding to the input instruction into a task data packet and acquiring a task available information set, wherein the task data packet comprises a target key event type, the address of at least one video file to be processed, and the application scene type corresponding to each video file to be processed; S3, performing key event extraction processing on each video file to be processed based on the task data packet and the task available information set to obtain the target key video segments in each video file to be processed that conform to the target key event type and the structured task processing result corresponding to each target key video segment, storing each target key video segment on disk, and storing each structured task processing result in a database; and S4, when the type of the input instruction is key event retrieval, acquiring the user interaction information corresponding to the input instruction, and performing key event retrieval processing on the database based on the user interaction information.
2. The key event extraction and positioning processing feedback method according to claim 1, wherein said performing key event extraction processing on each video file to be processed based on the task data packet and the task available information set comprises: S31, taking as a target task the extraction of all key events in the video file to be processed that conform to the target key event type, together with the target key video segment corresponding to each such key event; S32, planning the target task through a task planning model based on the task data packet, the task available information set and a preset prompt word template to obtain a task execution strategy, and executing the target task based on the task execution strategy.
3. The key event extraction and positioning processing feedback method according to claim 2, wherein the task available information set includes platform information to be configured and real-time computing power information; the platform information to be configured includes the number of each computing platform, computing power reference information, the available software tools and video analysis algorithm models, and the computing power required to run each video analysis algorithm model and each software tool; and the real-time computing power information includes the real-time standby computing power of each computing platform.
4. The key event extraction and positioning processing feedback method according to claim 3, wherein said planning the target task through a task planning model based on the task data packet, the task available information set and a preset prompt word template to obtain a task execution strategy comprises: inputting the task data packet and the task available information set into the task planning model, and constraining the output of the task planning model based on the preset prompt word template to obtain the task execution strategy corresponding to the target task.
5. The key event extraction and positioning processing feedback method according to claim 4, wherein said constraining the output of the task planning model based on the preset prompt word template comprises: setting constraint conditions based on the preset prompt word template, and constraining the task planning model based on the constraint conditions, wherein the constraint conditions comprise the configuration requirements of each computing platform, each software tool and each video analysis algorithm model.
6. The key event extraction and positioning processing feedback method according to claim 5, wherein the task execution strategy is a subtask sequence comprising a plurality of sub-steps, each sub-step being a subtask, and the attributes of each subtask include the ID of the task to which the subtask belongs, a subtask step number, a subtask state, the type of the selected software tool, the type of the selected video analysis algorithm model, and the number of the selected computing platform.
7. The key event extraction and positioning processing feedback method according to claim 6, wherein said performing key event extraction processing on each video file to be processed to obtain the target key video segments in each video file to be processed that conform to the target key event type and the structured task processing result corresponding to each target key video segment comprises: acquiring the subtask execution sequence of the task execution strategy, and packaging each subtask in the subtask execution sequence into a subtask packet, wherein the fields of the subtask packet comprise the attributes of the subtask; performing key event extraction processing on each video file to be processed based on the computing platform, the software tool and the video analysis algorithm model corresponding to each subtask packet, to obtain the subtask processing result and the subtask key video segment corresponding to that subtask packet; and integrating the subtask processing results and the subtask key video segments corresponding to the subtask packets to form the structured task processing result and the target key video segments of each video file to be processed.
8. The key event extraction and positioning processing feedback method according to claim 7, wherein the structured task processing result, which corresponds to the task data packet, includes an event type, an event time, an event subject, an event ID, the ID of the video to be processed that contains the target key event, a video address, timing information of the target key event within the video to be processed, an event key frame address, and the address of an event key frame sequence, and wherein the timing information of the target key event includes the duration, the start timestamp and the stop timestamp of the event within the video to be processed.
9. The key event extraction and positioning processing feedback method according to claim 8, wherein said performing key event retrieval processing on the database based on the user interaction information comprises: S41, parsing the user interaction information into a key event retrieval instruction, wherein the key event retrieval instruction comprises the event type, the event subject and the event time period of the key event to be retrieved; S42, searching the structured task processing results in the database based on the key event retrieval instruction to obtain the structured task processing result corresponding to the key event retrieval instruction; S43, obtaining the target key video segment based on the event key frame address and the address of the event key frame sequence in the structured task processing result.
10. A key event extraction and positioning processing feedback system for implementing the method according to any of claims 1-9, comprising: an instruction acquisition module, configured to acquire an input instruction of a user; a task data acquisition module, configured to package the user interaction information corresponding to the input instruction into a task data packet when the type of the input instruction is key event extraction; an execution strategy acquisition module, configured to plan the target task through a task planning model based on the task data packet, the acquired task available information set and a preset prompt word template to obtain a task execution strategy; a task scheduling module, configured to acquire the subtask execution sequence of the task execution strategy and package each subtask in the subtask execution sequence into a subtask packet; a key event extraction module, configured to call, on the computing platform corresponding to each subtask packet, the software tool and the video analysis algorithm model corresponding to that subtask, and to perform key event extraction processing on each video file to be processed to obtain the subtask processing result and the subtask key video segment corresponding to each subtask packet; a key event integration module, configured to integrate the subtask processing results and the subtask key video segments corresponding to the subtask packets to form the structured task processing result and the target key video segments of each video file to be processed; and an event retrieval module, configured to acquire the user interaction information corresponding to the input instruction when the type of the input instruction is key event retrieval, and to perform key event retrieval processing on the database based on the user interaction information.
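
For illustration only, the data structures named in claims 3, 6, 8 and 9 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the class and function names (Platform, Subtask, EventRecord, pick_platform, search), the field names, the sample values and paths, and the reading of "minimum computing power waste" as "least standby computing power left unused after the job is placed" are all hypothetical and do not appear in the application. In the application the planning is performed by a task planning model constrained by a prompt word template; the simple greedy choice below only stands in for that step.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    number: int             # number of the computing platform (claim 3)
    standby_compute: float  # real-time standby computing power, arbitrary units


@dataclass
class Subtask:
    task_id: str   # ID of the task the subtask belongs to (claim 6)
    step: int      # subtask step number
    state: str     # subtask state, e.g. "pending" (value is an assumption)
    tool: str      # type of the selected software tool
    model: str     # type of the selected video analysis algorithm model
    platform: int  # number of the selected computing platform


@dataclass
class EventRecord:
    event_id: str
    event_type: str
    subject: str         # event subject
    start_s: float       # start timestamp within the source video, in seconds
    stop_s: float        # stop timestamp within the source video, in seconds
    keyframes_addr: str  # address of the event key frame sequence


def pick_platform(platforms, required):
    """Return the platform that fits the job with the least compute left over.

    Returns None when no platform has enough standby computing power.
    """
    fits = [p for p in platforms if p.standby_compute >= required]
    if not fits:
        return None
    return min(fits, key=lambda p: p.standby_compute - required)


def search(records, event_type=None, subject=None, window=None):
    """Filter records by type, subject, and an optional (start, stop) window.

    A record matches the window when its time span overlaps it.
    """
    hits = []
    for r in records:
        if event_type is not None and r.event_type != event_type:
            continue
        if subject is not None and r.subject != subject:
            continue
        if window is not None and (r.stop_s < window[0] or r.start_s > window[1]):
            continue
        hits.append(r)
    return hits


# Hypothetical usage: place one subtask, then retrieve stored events.
platforms = [Platform(1, 8.0), Platform(2, 3.5), Platform(3, 1.0)]
chosen = pick_platform(platforms, required=3.0)
plan = Subtask("task-001", 1, "pending", "frame_sampler",
               "intrusion_detector", chosen.number)

records = [
    EventRecord("e1", "intrusion", "person", 12.0, 20.5, "/data/events/e1/frames"),
    EventRecord("e2", "intrusion", "vehicle", 40.0, 55.0, "/data/events/e2/frames"),
    EventRecord("e3", "fire", "warehouse", 15.0, 30.0, "/data/events/e3/frames"),
]
hits = search(records, event_type="intrusion", window=(0.0, 30.0))
```

In this sketch the in-memory list stands in for the database of claim 1, and the key frame sequence address is the field a retrieval step would follow to locate the stored target key video segment.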

Description

Key event extraction and positioning processing feedback system and method

Technical Field

The application belongs to the technical field of intelligent video analysis, and particularly relates to a key event extraction and positioning processing feedback system and method.

Background

The rapid development of sensor technology and the falling cost of hardware have led to explosive growth in the volume of surveillance video data. Video surveillance systems at large facilities such as smart agriculture sites, smart factories and emergency command centers are shifting from simple monitoring to the storage, management and analysis of video data, and are therefore required to provide information extraction, alarm and traceability support for key-event-related data within massive video data.

However, most existing key event extraction and positioning methods use a feature matching algorithm to match a preset key event reference video against the video to be processed in order to locate the video data corresponding to the key event. Most of these methods focus on calling video processing and analysis algorithms within a single field. They adapt and extend poorly to different application scenes, lack intelligent autonomy in scheduling algorithms across diverse scenes, and require considerable manual intervention.
In addition, because each large-scale video surveillance system holds a huge volume of video data, the data must be processed in a multithreaded manner using multiple groups of computing platforms, and different video processing and analysis algorithms consume different amounts of computing power. Existing key event extraction and positioning methods lack intelligent scheduling of task allocation across computing platforms, which leads to problems such as wasted computing power or untimely responses to users.

Disclosure of Invention

Aiming at the defects in the prior art, the application provides a key event extraction and positioning processing feedback system and method, which aim to solve the problems of poor scene adaptability, insufficient computing power utilization, and low retrieval and positioning efficiency in key event extraction and positioning for video surveillance systems.

In a first aspect, the present application provides a key event extraction and positioning processing feedback method, comprising: S1, acquiring an input instruction from a user and determining the type of the input instruction, wherein the type of the input instruction comprises key event extraction and key event retrieval; S2, when the type of the input instruction is key event extraction, packaging the user interaction information corresponding to the input instruction into a task data packet and acquiring a task available information set, wherein the task data packet comprises a target key event type, the address of at least one video file to be processed, and the application scene type corresponding to each video file to be processed; S3, performing key event extraction processing on each video file to be processed based on the task data packet and the task available information set to obtain the target key video segments in each video file to be processed that conform to the target key event type and the structured task processing result corresponding to each target key video segment, storing each target key video segment on disk, and storing each structured task processing result in a database; and S4, when the type of the input instruction is key event retrieval, acquiring the user interaction information corresponding to the input instruction, and performing key event retrieval processing on the database based on the user interaction information.

Further, performing key event extraction processing on each video file to be processed based on the task data packet and the task available information set comprises: S31, taking as a target task the extraction of all key events in the video files to be processed that conform to the target key event type, together with the target key video segment corresponding to each such key event; S32, planning the target task through a task planning model based on the task data packet, the task available information set and a preset prompt word template to obtain a task execution strategy, and executing the target task based on the task execution strategy.

Planning the target task through the task planning model based on the task data packet, the task available information set and the preset prompt word template to obtain the task execution strategy comprises: inputting the task data packet and the task available information set into the task planning model, and constraining the output of the task planning model based on the preset prompt word template to obtain a task execution strateg