EP-4740453-A1 - VIDEO FENCING SYSTEM AND METHOD

Abstract

Systems and methods are provided for receiving, from at least one microphone (102), boundary information defining one or more boundaries (118) for an audio pick-up region (108); receiving, from the at least one microphone, sound location information indicating a detected sound position of an audio source (106) located within the audio pick-up region; identifying, based on the sound location information and the boundary information, a first boundary of the one or more boundaries as being located near the detected sound position; calculating a first distance between the detected sound position and the first boundary; determining, based on the first distance, a depth of field parameter for at least one camera (104); and providing the depth of field parameter and the sound location information to the at least one camera.

Inventors

  • JOSHI, Bijal

Assignees

  • Shure Acquisition Holdings, Inc.

Dates

Publication Date
2026-05-13
Application Date
2024-07-05

Claims (20)

  1. A method performed by one or more processors in communication with each of at least one microphone and at least one camera, the method comprising: receiving, from the at least one microphone, boundary information defining one or more boundaries for an audio pick-up region; receiving, from the at least one microphone, sound location information indicating a detected sound position of an audio source located within the audio pick-up region; identifying, based on the sound location information and the boundary information, a first boundary of the one or more boundaries as being located near the detected sound position; calculating a first distance between the detected sound position and the first boundary; determining, based on the first distance, a depth of field parameter for the at least one camera; and providing the depth of field parameter and the sound location information to the at least one camera.
  2. The method of claim 1, wherein the depth of field parameter adjusts a zone of focus of the at least one camera so that the zone of focus includes the audio source and a first area between the audio source and the first boundary, and excludes a second area outside the audio pick-up region.
  3. The method of claim 1, further comprising: determining, based on the boundary information, an image field parameter for the at least one camera; and providing the image field parameter to the at least one camera, wherein the image field parameter is configured to define an image field of the at least one camera such that the image field comprises the audio pick-up region.
  4. The method of claim 3, further comprising: causing the at least one camera to apply an image enhancement to a portion of the image field that extends beyond the first boundary to outside the audio pick-up region.
  5. The method of claim 4, wherein the image enhancement is a select image displayed over the portion of the image field.
  6. The method of claim 4, wherein the image enhancement is a blurring effect applied to the portion of the image field.
  7. The method of claim 3, further comprising: receiving, from the at least one camera, camera location information indicating a position of the at least one camera, wherein determining the image field parameter comprises determining the image field parameter based further on the camera location information, and identifying the first boundary comprises identifying the first boundary based further on the camera location information.
  8. A system comprising: at least one microphone configured to provide: boundary information defining one or more boundaries for an audio pick-up region, and sound location information indicating a detected sound position of an audio source located within the audio pick-up region; at least one camera configured to capture images or video of the audio pick-up region; and one or more processors communicatively coupled to each of the at least one microphone and the at least one camera, the one or more processors configured to: receive the boundary information and the sound location information from the at least one microphone; identify, based on the sound location information and the boundary information, a first boundary, of the one or more boundaries, that is located near the detected sound position; calculate a first distance between the detected sound position and the first boundary; determine, based on the first distance, a depth of field parameter for the at least one camera; and provide the depth of field parameter and the sound location information to the at least one camera.
  9. The system of claim 8, wherein the at least one camera is further configured to, based on the depth of field parameter, adjust a zone of focus of the at least one camera such that the zone of focus includes the audio source and a first area between the audio source and the first boundary, and excludes a second area outside the audio pick-up region.
  10. The system of claim 8, wherein the at least one camera is configured to capture the images or video of the audio pick-up region based on an image field parameter configured to define an image field of the at least one camera to include the audio pick-up region, and wherein the one or more processors are further configured to determine the image field parameter based on the boundary information, and provide the image field parameter to the at least one camera.
  11. The system of claim 10, wherein the at least one camera is configured to apply an image enhancement to a portion of the image field that extends beyond the first boundary to outside the audio pick-up region.
  12. The system of claim 11, wherein the image enhancement is a select image displayed over the portion of the image field.
  13. The system of claim 11, wherein the image enhancement is a blurring effect applied to the portion of the image field.
  14. The system of claim 10, wherein the at least one camera is further configured to provide, to the one or more processors, camera location information indicating a position of the at least one camera, and the one or more processors are configured to: determine the image field parameter based further on the camera location information, and identify the first boundary based further on the camera location information.
  15. The system of claim 8, further comprising a second camera, wherein the boundary information further defines one or more second boundaries for a second audio pick-up region, the sound location information further indicates a second detected sound position of a second audio source located within the second audio pick-up region, and the one or more processors are further configured to: identify, based on the boundary information, the second camera as being near the second audio pick-up region; configure the second camera to capture images or video of the second audio pick-up region; identify, based on the sound location information and the boundary information, a second boundary of the one or more second boundaries as being located near the second detected sound position; calculate a second distance between the second detected sound position and the second boundary; determine, based on the second distance, a second depth of field parameter for the second camera; and provide the second detected sound position and the second depth of field parameter to the second camera.
  16. A method performed by one or more processors in communication with: a first camera, a second camera, and at least one microphone, the method comprising: receiving, from the at least one microphone, boundary information defining one or more first boundaries for a first audio pick-up region, and one or more second boundaries for a second audio pick-up region; receiving, from the at least one microphone, sound location information indicating: a first detected sound position of a first audio source located within the first audio pick-up region, and a second detected sound position of a second audio source located within the second audio pick-up region; identifying, based on the boundary information, the first camera as being near the first audio pick-up region and the second camera as being near the second audio pick-up region; configuring the first camera to capture images or video of the first audio pick-up region, and the second camera to capture images or video of the second audio pick-up region; identifying, based on the sound location information and the boundary information, a first boundary of the one or more first boundaries as being located near the first detected sound position, and a second boundary of the one or more second boundaries as being located near the second detected sound position; calculating a first distance between the first detected sound position and the first boundary, and a second distance between the second detected sound position and the second boundary; determining, based on the first distance, a first depth of field parameter for the first camera; determining, based on the second distance, a second depth of field parameter for the second camera; providing the first detected sound position and the first depth of field parameter to the first camera; and providing the second detected sound position and the second depth of field parameter to the second camera.
  17. The method of claim 16, wherein the first depth of field parameter adjusts a first zone of focus of the first camera so that the first zone of focus includes the first audio source and a first area between the first audio source and the first boundary, and excludes a second area outside the first audio pick-up region, and wherein the second depth of field parameter adjusts a second zone of focus of the second camera so that the second zone of focus includes the second audio source and a third area between the second audio source and the second boundary, and excludes a fourth area outside the second audio pick-up region.
  18. The method of claim 16, further comprising: determining, based on the boundary information, a first image field parameter for the first camera; providing the first image field parameter to the first camera; determining, based on the boundary information, a second image field parameter for the second camera; and providing the second image field parameter to the second camera, wherein the first image field parameter is configured to define a first image field of the first camera such that the first image field comprises the first audio pick-up region, and wherein the second image field parameter is configured to define a second image field of the second camera such that the second image field comprises the second audio pick-up region.
  19. The method of claim 18, further comprising: causing the first camera to apply a first image enhancement to a first portion of the first image field that extends beyond the first boundary to outside the first audio pick-up region, and causing the second camera to apply a second image enhancement to a second portion of the second image field that extends beyond the second boundary to outside the second audio pick-up region.
  20. The method of claim 18, further comprising: receiving, from the first camera, first camera location information indicating a first position of the first camera; and receiving, from the second camera, second camera location information indicating a second position of the second camera, wherein determining the first image field parameter comprises determining the first image field parameter based further on the first camera location information, and identifying the first boundary comprises identifying the first boundary based further on the first camera location information, and wherein determining the second image field parameter comprises determining the second image field parameter based further on the second camera location information, and identifying the second boundary comprises identifying the second boundary based further on the second camera location information.
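The core method of claim 1 (identify the boundary nearest the detected sound position, calculate the distance to it, and derive a depth of field parameter from that distance) can be sketched in a simple 2-D model. Everything below is illustrative: the `Boundary` segment representation, the example room geometry, and the distance-to-depth-of-field mapping are assumptions, as the claims do not prescribe a coordinate system or a particular formula.

```python
from dataclasses import dataclass
import math

@dataclass
class Boundary:
    """A straight boundary segment of the audio pick-up region (2-D sketch)."""
    x1: float; y1: float; x2: float; y2: float

    def distance_to(self, px: float, py: float) -> float:
        # Distance from point (px, py) to the nearest point on this segment.
        dx, dy = self.x2 - self.x1, self.y2 - self.y1
        t = ((px - self.x1) * dx + (py - self.y1) * dy) / (dx * dx + dy * dy)
        t = max(0.0, min(1.0, t))  # clamp projection onto the segment
        return math.hypot(px - (self.x1 + t * dx), py - (self.y1 + t * dy))

def nearest_boundary(boundaries, talker):
    """Identify the first boundary: the one nearest the detected sound position."""
    return min(boundaries, key=lambda b: b.distance_to(*talker))

def depth_of_field(distance, margin=0.5):
    """Map the talker-to-boundary distance to a depth of field parameter.
    Hypothetical mapping: the zone of focus spans the talker plus the area
    up to the boundary, with a small margin. The patent leaves the exact
    formula to the implementation."""
    return distance + margin

# Example: a 4 m x 3 m pick-up region with four wall boundaries,
# and a talker detected at (2.0, 1.0).
walls = [
    Boundary(0, 0, 4, 0), Boundary(4, 0, 4, 3),
    Boundary(4, 3, 0, 3), Boundary(0, 3, 0, 0),
]
talker = (2.0, 1.0)
first = nearest_boundary(walls, talker)
dist = first.distance_to(*talker)
dof = depth_of_field(dist)
```

In this sketch the resulting `dof` and the talker position would then be sent to the camera, which adjusts its zone of focus accordingly (claim 2).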

Description

VIDEO FENCING SYSTEM AND METHOD

CROSS-REFERENCE

[0001] This application claims priority to U.S. Provisional Pat. App. No. 63/512,389, filed on July 7, 2023, the contents of which are incorporated herein in their entirety.

TECHNICAL FIELD

[0002] This disclosure generally relates to focusing a camera on an active talker and, more specifically, to systems and methods for defining a visual boundary around the active talker based on talker location information and corresponding audio coverage area information provided by one or more microphones.

BACKGROUND

[0003] Various audio-visual environments, such as conference rooms, boardrooms, classrooms, video conferencing settings, performance venues, and more, typically involve the use of microphones (including microphone arrays) for capturing sound from one or more audio sources (e.g., human speakers) in the environment and one or more image capture devices (e.g., cameras) for capturing images and/or videos of the one or more audio sources or other persons and/or objects in the environment. The captured audio and video may be disseminated to a local audience in the environment through loudspeakers (for sound reinforcement) and display screens (for visual reinforcement), and/or transmitted to a remote location for listening and viewing by a remote audience (such as via a telecast, webcast, or the like). For example, the transmitted audio and video may be used by persons in a conference room to conduct a conference call with other persons at the remote location.

[0004] One or more microphones may be used in order to optimally capture the speech and sound produced by the persons in the environment. Some existing audio systems ensure optimal audio coverage of a given environment by delineating "audio coverage areas," which represent the regions in the environment that are designated for capturing audio signals, such as, e.g., speech produced by human speakers.
The audio coverage areas define the spaces where beamformed audio pick-up lobes can be deployed by the microphones, for example. A given environment or room can include one or more audio coverage areas, depending on the size, shape, and type of environment. For example, the audio coverage area for a typical conference room may include the seating areas around a conference table, while a typical classroom may include one coverage area around the blackboard and/or podium at the front of the room and another coverage area around the tables and chairs, or other audience area, facing the front of the room. Some audio systems have fixed audio coverage areas, while other audio systems are configured to dynamically create audio coverage areas for a given environment.

[0005] Some existing camera systems are configured to point a camera in the direction of an active talker, such as a human in the environment that is speaking, singing, or otherwise making sounds, so that viewers, locally or remotely, can see who is talking. Some cameras use motion sensors and/or facial recognition software in order to guess which person is talking for camera tracking purposes. Some camera systems use multiple cameras to optimally capture persons located at different parts of the environment or otherwise capture video of the whole environment.

SUMMARY

[0006] The techniques of this disclosure provide systems and methods designed to, among other things: (1) use a microphone's audio coverage area to define one or more visual boundaries for video captured by a camera; (2) adjust one or more parameters of the camera based on talker location information and audio coverage area information provided by the microphone, so that the captured video focuses on an active talker and the surrounding audio coverage area; and (3) exclude, from the captured video, unwanted imagery from beyond the one or more visual boundaries.
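The third technique, excluding unwanted imagery from beyond the visual boundaries, corresponds to the claimed image enhancements: a blurring effect or a select image displayed over the out-of-region portion of the image field. A minimal sketch follows, assuming a grayscale frame held as a NumPy array and a precomputed boolean mask marking pixels inside the audio pick-up region; the crude box blur and flat fill are placeholders for whatever filter or overlay image a real implementation would use.

```python
import numpy as np

def apply_boundary_enhancement(frame: np.ndarray, inside_mask: np.ndarray,
                               mode: str = "blur") -> np.ndarray:
    """Enhance pixels that fall outside the audio pick-up region.
    `inside_mask` is True for pixels inside the region; 'blur' applies a
    crude 3x3 box average, 'overlay' fills with a flat value standing in
    for a select image."""
    out = frame.astype(float).copy()
    if mode == "blur":
        # 3x3 box blur via nine shifted copies of an edge-padded frame.
        padded = np.pad(frame.astype(float), 1, mode="edge")
        blurred = sum(padded[i:i + frame.shape[0], j:j + frame.shape[1]]
                      for i in range(3) for j in range(3)) / 9.0
        out[~inside_mask] = blurred[~inside_mask]
    elif mode == "overlay":
        out[~inside_mask] = 128.0  # placeholder for the select image
    return out

# Example: 5x5 frame; only the central 3x3 block is inside the region.
frame = np.arange(25, dtype=float).reshape(5, 5)
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True
overlaid = apply_boundary_enhancement(frame, mask, mode="overlay")
softened = apply_boundary_enhancement(frame, mask, mode="blur")
```

Either mode leaves the in-region pixels untouched, so the active talker and the surrounding coverage area remain sharp while everything beyond the boundary is obscured.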
[0007] One exemplary embodiment includes a method performed by one or more processors in communication with each of at least one microphone and at least one camera, the method comprising: receiving, from the at least one microphone, boundary information defining one or more boundaries for an audio pick-up region; receiving, from the at least one microphone, sound location information indicating a detected sound position of an audio source located within the audio pick-up region; identifying, based on the sound location information and the boundary information, a first boundary of the one or more boundaries as being located near the detected sound position; calculating a first distance between the detected sound position and the first boundary; determining, based on the first distance, a depth of field parameter for the at least one camera; and providing the depth of field parameter and the sound location information to the at least one camera.

[0008] Another exemplary embodiment includes a system comprising: at least one microphone configured to provide: boundary information defining one or more boundaries for an audio pick-up region, and so