CN-122002027-A - Background layer update during encoding

CN 122002027 A

Abstract

Background layer updates during encoding are disclosed. In particular, a method, performed by an image processing apparatus, for updating a background layer during encoding of a scene is provided. The scene is encoded based on classifying objects depicted in the scene as foreground or background. The background is divided into ordered background layers, wherein each background layer is associated with a respective depth model. The method comprises detecting (S102) a change in an image portion of one background layer. The method comprises calculating (S104) a difference between the image portion and a corresponding image portion of a background layer ordered after the one background layer. The method comprises selecting (S106a) the background layer ordered after the one background layer to represent the image portion when the difference is smaller than a threshold.
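The decision step described in the abstract can be sketched in code. This is a minimal, illustrative interpretation: the `BackgroundLayer` class, the mean-absolute-difference measure, and the value of `DIFF_THRESHOLD` are assumptions for the sketch, not details taken from the patent.

```python
import numpy as np

DIFF_THRESHOLD = 12.0  # assumed: mean absolute pixel difference threshold


class BackgroundLayer:
    """Illustrative stand-in for one ordered background layer."""

    def __init__(self, image, depth_model=None):
        self.image = image          # pixel data of the layer
        self.depth_model = depth_model

    def portion(self, region):
        """Return the image portion for region = (y0, y1, x0, x1)."""
        y0, y1, x0, x1 = region
        return self.image[y0:y1, x0:x1]


def update_background(layers, layer_idx, region, changed_portion):
    """Handle a detected change in an image portion of layers[layer_idx].

    `layers` is ordered front-to-back, so layers[layer_idx + 1] is the
    background layer ordered after the one in which the change occurred.
    """
    behind = layers[layer_idx + 1]
    corresponding = behind.portion(region)
    # Difference measure (an assumption): mean absolute pixel difference.
    diff = float(np.mean(np.abs(changed_portion.astype(float)
                                - corresponding.astype(float))))
    if diff < DIFF_THRESHOLD:
        # Select the layer ordered after to represent the portion.
        return behind
    # Otherwise, create a new background layer for the portion.
    return BackgroundLayer(changed_portion.copy())
```

Selecting an existing layer when the difference is small avoids spawning a new layer (and a new depth model) for changes that the layer behind already represents well.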

Inventors

  • JOHN STERN
  • YUAN SONG

Assignees

  • AXIS AB

Dates

Publication Date
2026-05-08
Application Date
2025-10-31
Priority Date
2024-11-04

Claims (12)

  1. A method for updating background layers during encoding of a scene, performed by an image processing apparatus, wherein the scene is encoded based on classifying objects depicted in the scene as foreground or background, wherein the background is divided into ordered background layers, and wherein each background layer is associated with a respective depth model dm, the method comprising: detecting a change in an image portion of one background layer; calculating a difference between the image portion and a corresponding image portion of a background layer ordered after the one background layer; selecting the background layer ordered after the one background layer to represent the image portion when the difference is smaller than a threshold; and, when the difference is not smaller than the threshold, creating a new background layer to represent the image portion.
  2. The method of claim 1, further comprising: upon expiry of a fusion time mt associated with the background layer ordered after the new background layer, fusing the new background layer with that background layer.
  3. The method of claim 1, wherein each background layer is associated with a respective fusion time mt.
  4. The method of claim 3, wherein, when an object remains stationary in the scene for longer than the fusion time mt of a given background layer, a representation of the object in the scene is fused into the given background layer.
  5. The method of claims 2 and 3 in combination, wherein, after a given background layer has existed for longer than the fusion time mt of the background layer ordered after the given background layer, the given background layer is fused into the background layer ordered after it.
  6. The method of claim 1, wherein the background layers are ordered according to their fusion times mt, with the background layer having the shortest fusion time mt closest to the foreground.
  7. The method of claim 1, wherein the image portion has depth values given by the depth model dm of the background layer representing the image portion.
  8. The method of claim 1, further comprising: encoding the foreground and the background layers into an encoded video stream of the scene.
  9. An image processing apparatus for updating background layers during encoding of a scene, wherein the scene is encoded based on classifying objects depicted in the scene as foreground or background, wherein the background is divided into ordered background layers, and wherein each background layer is associated with a respective depth model dm, the image processing apparatus comprising processing circuitry configured to cause the image processing apparatus to: detect a change in an image portion of one background layer; calculate a difference between the image portion and a corresponding image portion of a background layer ordered after the one background layer; select the background layer ordered after the one background layer to represent the image portion when the difference is smaller than a threshold; and, when the difference is not smaller than the threshold, create a new background layer to represent the image portion.
  10. The image processing apparatus of claim 9, further configured to perform the method of any one of claims 2 to 8.
  11. A computer program for updating background layers during encoding of a scene, wherein the scene is encoded based on classifying objects depicted in the scene as foreground or background, wherein the background is divided into ordered background layers, and wherein each background layer is associated with a respective depth model dm, the computer program comprising computer code which, when run on processing circuitry of an image processing apparatus, causes the image processing apparatus to: detect a change in an image portion of one background layer; calculate a difference between the image portion and a corresponding image portion of a background layer ordered after the one background layer; select the background layer ordered after the one background layer to represent the image portion when the difference is smaller than a threshold; and, when the difference is not smaller than the threshold, create a new background layer to represent the image portion.
  12. A computer program product comprising a computer program according to claim 11 and a computer-readable storage medium on which the computer program is stored.
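The per-layer fusion times of claims 2 to 6 can be sketched as follows. The `Layer` dataclass, the simulated clock, and the simplification of "fusing" to dropping the front layer are illustrative assumptions; a real implementation would merge pixel data and depth models.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Illustrative stand-in for one ordered background layer."""
    name: str
    fusion_time: float   # mt: how long content must persist before fusing
    created_at: float    # timestamp when this layer was created


def merge_expired_layers(layers, now):
    """Fuse every layer that has existed longer than the fusion time mt
    of the layer ordered after it (claim 5).

    `layers` is ordered front-to-back, shortest fusion time first
    (claim 6), so stable content gradually migrates toward the
    longer-lived layers at the back.
    """
    i = 0
    while i < len(layers) - 1:
        current, behind = layers[i], layers[i + 1]
        if now - current.created_at > behind.fusion_time:
            # Fuse `current` into `behind`: here we simply drop
            # `current`, standing in for a pixel-level merge.
            layers.pop(i)
        else:
            i += 1
    return layers
```

Ordering the layers by fusion time means short-lived changes (a parked bicycle) live near the foreground and are absorbed quickly, while long-lived structure (a building wall) accumulates in layers that change rarely.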

Description

Background layer update during encoding

Technical Field

Embodiments presented herein relate to a method, an image processing apparatus, a computer program, and a computer program product for updating a background layer during encoding of a scene.

Background

Depth perception is essential to understanding and interpreting the surrounding environment in a number of fields, especially in applications requiring three-dimensional (3D) spatial information. The ability to accurately determine the distance, shape, and size of objects within a scene enables more accurate analysis, improved object detection, and enhanced decision making. Depth information is particularly useful in environments where distinguishing between objects based on their distance or size is critical, such as security systems, robots, autonomous vehicles, and other systems that rely on visual data.

Traditionally, depth perception is achieved by 3D cameras or other dedicated sensors that provide a detailed understanding of the environment. The ability to perceive depth provides several benefits, including more accurate object detection and the ability to reduce false positives by filtering out objects that appear larger or closer than they actually are. For example, depth perception can help in situations where objects might be misidentified from two-dimensional (2D) images alone, since depth information gives a more comprehensive view of the actual spatial relationships within a scene.

There are several methods by which depth information can be extracted. In some cases, monocular camera systems use advanced computational models to estimate depth from a single viewpoint. In other cases, depth information is derived from parallax measurements obtained from overlapping images captured by multiple sensors, such as those used in multi-sensor panoramic systems.
Such systems calculate the difference in an object's position between the images so that its relative distance can be determined. Another approach samples data from a laser point, such as that of a laser-equipped pan-tilt-zoom (PTZ) camera, to measure the distance to an object. Furthermore, self-learning techniques based on object tracking can provide depth information by analyzing how objects move and change position over time.

While these approaches can be effective in relatively static environments, they face challenges in more dynamic scenarios. Each method typically requires a certain amount of time to process data and compute accurate depth information. For example, monocular models often involve intensive computation, while systems relying on PTZ cameras may need time for the camera to physically pan or tilt across the scene to collect sufficient data. This delay can prevent real-time or near real-time depth updates, especially when large objects move rapidly and their depth changes quickly.

In a dynamic environment, objects may move unpredictably or at varying speeds, making it increasingly difficult for a depth perception system to update in real time. This challenge is particularly pronounced when large objects undergo significant transitions in depth, as the system may not be able to adjust quickly enough to provide accurate and up-to-date information. Thus, there is a need for more efficient methods to maintain accurate depth perception, especially when both static and dynamic elements are present within the scene.

Disclosure of Invention

An object of embodiments herein is to solve the above problems. A particular object is to provide a computationally efficient technique to maintain accurate depth perception in a scene containing both static and dynamic elements.
According to a first aspect, a method for updating a background layer during encoding of a scene, performed by an image processing apparatus, is presented. The scene is encoded based on classifying objects depicted in the scene as foreground or background. The background is divided into ordered background layers, wherein each background layer is associated with a respective depth model. The method comprises detecting a change in an image portion of one background layer. The method comprises calculating a difference between the image portion and a corresponding image portion of a background layer ordered after the one background layer. The method comprises selecting the background layer ordered after the one background layer to represent the image portion when the difference is smaller than a threshold.

According to a second aspect, an image processing apparatus for updating a background layer during encoding of a scene is presented. The scene is encoded based on classifying objects depicted in the scene as foreground or background. The background is divided into ordered background layers, wherein each background layer is associated with a respective depth model