
KR-102964103-B1 - DEVICE AND METHOD FOR GENERATING USER-SELECTABLE SPORTS IMAGES BASED ON MULTI-ANGLE SYNCHRONIZATION

KR 102964103 B1

Abstract

The embodiments provide a device and method for generating user-selectable sports videos based on multi-angle synchronization. The device according to an embodiment may include: a video preprocessing unit that receives, from a shooting system, first original video data generated by a first camera, second original video data generated by a second camera, and third original video data generated by a third camera, and generates an original video by synchronizing the first to third original video data on a time axis; a personalization unit that collects the user's gaze data from a user terminal; a sensing processing unit that extracts a region of interest from each of the first to third original video data based on the gaze data and calculates a gaze-based weight for each region of interest; and a multimedia management unit that selects at least one of the first to third original video data based on the gaze-based weight and generates a highlight video by editing the selected original video data. The gaze data may include at least one of the user's pupil position, gaze direction vector, gaze duration, and gaze movement speed, and the gaze-based weight may be calculated from gaze density, gaze fixation time, and saccade frequency. The user terminal may include a touch display that receives the original video from the video preprocessing unit and displays it.
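As a rough, non-authoritative illustration of the time-axis synchronization the video preprocessing unit performs, the sketch below groups frames from several cameras by nearest timestamp on a shared clock. All names (`Frame`, `synchronize`, the tolerance value) are assumptions for illustration, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float  # seconds on the shared time axis
    camera_id: int
    data: bytes

def synchronize(streams, tolerance=1 / 60):
    """Group frames from several camera streams whose timestamps fall
    within `tolerance` of each other, one multi-angle group per tick."""
    synced = []
    # Walk the first stream; for each of its frames, pick the frame from
    # every other stream whose timestamp is closest to it.
    for ref in streams[0]:
        group = [ref]
        for other in streams[1:]:
            best = min(other, key=lambda f: abs(f.timestamp - ref.timestamp))
            if abs(best.timestamp - ref.timestamp) <= tolerance:
                group.append(best)
        # Keep only ticks where every camera contributed a frame.
        if len(group) == len(streams):
            synced.append(group)
    return synced
```

In practice the tolerance would be tied to the capture frame rate; 1/60 s is used here only as a placeholder.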

Inventors

  • 박기용
  • 박민규
  • 남광우

Assignees

  • 주식회사 스포클립에이아이

Dates

Publication Date
2026-05-13
Application Date
2025-10-02

Claims (5)

  1. A multi-angle synchronization-based user-selectable sports video generation device, comprising: a video preprocessing unit that receives, from a shooting system, first original video data generated by a first camera, second original video data generated by a second camera, and third original video data generated by a third camera, and generates an original video by synchronizing the first to third original video data on a time axis; a personalization unit that collects the user's gaze data from a user terminal; a sensing processing unit that extracts a region of interest from each of the first to third original video data based on the gaze data and calculates a gaze-based weight for each region of interest; and a multimedia management unit that selects at least one of the first to third original video data based on the gaze-based weight and edits the selected original video data to generate a highlight video, wherein the gaze data includes at least one of the user's pupil position, gaze direction vector, gaze duration, and gaze movement speed; the gaze-based weight is calculated based on gaze density, gaze fixation time, and saccade frequency; the user terminal includes a touch display that receives the original video from the video preprocessing unit and displays it, a front camera that captures the user's face to generate facial image data, and a gaze tracking module that analyzes the facial image data to generate the gaze data; the gaze tracking module includes a face detection unit that detects a face region in the facial image data, a feature point extraction unit that extracts feature points around the eyes in the face region, a pupil tracking unit that detects the pupil position based on the feature points, and a gaze estimation unit that estimates the gaze direction by combining the pupil position with head pose information; 
the multimedia management unit preferentially selects the original video data of the camera with the highest gaze-based weight to generate the highlight video, extracts sections in which the gaze-based weight is greater than or equal to a threshold value as highlight sections, and generates the highlight video by applying a fade effect between each extracted highlight section and the adjacent highlight section; and the sensing processing unit calculates the gaze-based weight through the following <mathematical formula>, where W_gaze(i,t) denotes the gaze-based weight for the i-th camera at time t, D_gaze(i,t) denotes the gaze density (the frequency of gazing within the corresponding camera's video), T_fixation(i,t) denotes the fixation duration, F_saccade(i,t) denotes the saccade frequency, and w1, w2, and w3 denote weight coefficients specified such that w1 + w2 + w3 = 1.
  2. delete
  3. delete
  4. delete
  5. delete
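The weight computation and threshold-based highlight extraction recited in claim 1 can be sketched as follows. The patent's actual formula is not reproduced in the source text, so this is only an assumed weighted combination in which the saccade term reduces the weight (frequent saccades read as scattered attention); the function names, default coefficients, and the sign of the saccade term are all assumptions.

```python
def gaze_weight(d_gaze, t_fixation, f_saccade, w1=0.4, w2=0.4, w3=0.2):
    """Assumed form of W_gaze(i,t): a weighted combination of gaze
    density, fixation duration, and saccade frequency."""
    # Claim 1 specifies w1 + w2 + w3 = 1.
    assert abs(w1 + w2 + w3 - 1.0) < 1e-9
    # Sign convention is an assumption: saccades are penalized.
    return w1 * d_gaze + w2 * t_fixation - w3 * f_saccade

def highlight_sections(weights, threshold):
    """Return (start, end) index pairs of contiguous runs where the
    per-frame gaze-based weight meets or exceeds the threshold."""
    sections, start = [], None
    for i, w in enumerate(weights):
        if w >= threshold and start is None:
            start = i                      # a highlight run begins
        elif w < threshold and start is not None:
            sections.append((start, i - 1))  # the run just ended
            start = None
    if start is not None:                   # run extends to the last frame
        sections.append((start, len(weights) - 1))
    return sections
```

Per the claim, the sections returned here would then be joined with a fade effect between adjacent sections when rendering the highlight video.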

Description

Device and Method for Generating User-Selectable Sports Videos Based on Multi-Angle Synchronization

Embodiments of the present invention relate to a device and method for generating user-selectable sports videos based on multi-angle synchronization.

Sports matches are filmed simultaneously from various angles using multiple cameras, and effectively editing and delivering this multi-angle footage to viewers is a core technology of sports broadcasting. Traditional sports video editing relied on professional editors manually reviewing footage from each camera to select and edit key scenes. This approach required significant time and manpower, and it also had limitations: it depended on the editor's subjective judgment and failed to adequately reflect the interests of individual viewers. While technologies using artificial intelligence to automatically extract highlights have recently been developed, most rely on objective events such as game rules or score changes, and still fail to reflect individual preferences or interests. Furthermore, beyond the technical challenges of synchronizing and seamlessly transitioning between multiple camera feeds, it has been difficult to accurately identify the scenes viewers actually focus on and incorporate them into the editing. Consequently, there is a growing need for technology capable of automatically generating personalized sports highlight videos by leveraging user gaze data.

The accompanying drawings, included as part of the detailed description to aid in understanding the embodiments, illustrate various embodiments and, together with the detailed description, explain their technical features. FIG. 1 is a schematic diagram of a sports game video recording system (10) according to one embodiment of the present invention. FIG. 2 is a block diagram of the electronic device (100) of FIG. 1. FIG. 3 is a diagram illustrating the multilayer neural network (122) of FIG. 2. FIG. 4a is a block diagram of the user terminal (200) of FIG. 1. FIG. 4b is a block diagram of the gaze tracking module (270) of FIG. 4a. FIG. 5 is a block diagram of the sensor device (400) of FIG. 1. FIG. 6 is a flowchart of the original video data preprocessing process according to one embodiment of the present invention. FIG. 7 schematically illustrates the process of dividing multiple original video data into frames. FIG. 8 is a flowchart of a method for generating a highlight video according to an embodiment of the present invention. FIG. 9 is a flowchart of a process for generating a highlight video according to an embodiment of the present invention.

The following embodiments combine the components and features of the embodiments in predetermined forms. Each component or feature may be considered optional unless explicitly stated otherwise, and may be implemented without being combined with other components or features. Various embodiments may also be constructed by combining some components and/or features, and the order of operations described in the various embodiments may be changed. Some components or features of one embodiment may be included in another embodiment, or replaced with corresponding components or features of another embodiment. Procedures or steps that could obscure the essence of the various embodiments, or that would be readily understood by a person of ordinary skill in the relevant technical field, are not described.
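The highlight-generation flow outlined in FIGS. 8 and 9 (and claim 1) selects the highest-weight camera at each point in time and fades between adjacent highlight sections. A minimal sketch of those two steps is below; the function names and the linear-fade shape are illustrative assumptions, not taken from the patent.

```python
def select_camera(weights_by_camera):
    """Pick the index of the highest-weight camera at each time step.
    `weights_by_camera[i]` is the weight series for the i-th camera."""
    return [max(range(len(step)), key=step.__getitem__)
            for step in zip(*weights_by_camera)]

def crossfade_alphas(steps):
    """Linear blend factors (alpha_out, alpha_in) for a fade between the
    tail of one highlight section and the head of the next."""
    return [((steps - i) / steps, i / steps) for i in range(steps + 1)]
```

For example, with three cameras whose weights over two time steps are `[[0.2, 0.9], [0.5, 0.1], [0.3, 0.3]]`, `select_camera` picks camera 1 first and then camera 0; `crossfade_alphas` then supplies per-frame blend factors for the transition between the resulting clips.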
Throughout the specification, when a part is described as "comprising" or "including" a component, this means that, unless specifically stated otherwise, other components are not excluded and may additionally be included. Furthermore, terms such as "...part," "...unit," and "module" as used in the specification refer to a unit that performs at least one function or operation, which may be implemented in hardware, software, or a combination of the two. Additionally, "a," "an," "the," and similar terms may be used in both singular and plural senses in the context describing the various embodiments (particularly in the context of the following claims), unless otherwise indicated in the specification or clearly contradicted by the context. Hereinafter, embodiments according to various examples will be described in detail with reference to the accompanying drawings. The detailed description below, together with the accompanying drawings, is intended to describe exemplary embodiments of various examples and is not intended to represent the only embodiments. In addition, specific terms used in the various embodiments are provided to aid understanding, and the use of such specific terms may be modified into other forms.