JP-7855695-B2 - Image encoding device, image decoding device, image encoding method, and image decoding method


Inventors

ガオジンイン
テオハンブン
リムチョンスン
ヤーダブプラビーンクマール
安倍清史
西孝啓
遠間正真

Assignees

Panasonic Intellectual Property Corporation of America

Dates

Publication Date: 20260508
Application Date: 20230601
Priority Date: 20220713

Claims (20)

1. An image decoding device comprising: a circuit; and a memory connected to the circuit, wherein, in operation, the circuit acquires an image and designation information from a bitstream, the designation information indicates that processing has been applied to a plurality of specific regions in the image that contain privacy information, each of the plurality of specific regions has a rectangular shape, and the circuit, in operation, generates a packing image in which the plurality of specific regions are rearranged.
2. The image decoding device according to claim 1, wherein the processing includes at least one of blurring, mosaic processing, or silhouette processing.
3. The image decoding device according to claim 1, wherein the privacy information includes at least one of a person's face, a vehicle's license plate number, or information indicating an address.
4. The image decoding device according to claim 1, wherein the circuit generates a generated image based on the image and the designation information.
5. The image decoding device according to claim 1, wherein the designation information includes information indicating the position and size of the plurality of specific regions within the image.
6. The image decoding device according to claim 1, wherein the circuit acquires the designation information from a header area of the bitstream.
7. The image decoding device according to claim 6, wherein the header area includes SEI.
8. The image decoding device according to claim 1, wherein the plurality of specific regions in the packing image are contained within one or more coding unit blocks that constitute a picture of the bitstream.
9. The image decoding device according to claim 1, wherein the plurality of specific regions in the packing image are contained within subpictures or tiles that constitute a picture of the bitstream.
10. The image decoding device according to claim 1, wherein the designation information includes information indicating the correspondence between the plurality of specific regions in the image and the plurality of specific regions in the packing image.
11. The image decoding device according to claim 1, wherein, in the packing image, areas other than the plurality of specific regions are padded.
12. The image decoding device according to claim 1, wherein some of the plurality of specific regions have areas that overlap one another.
13. An image decoding device comprising: a circuit; and a memory connected to the circuit, wherein, in operation, the circuit acquires a first image from a first bitstream, acquires, from a second bitstream, designation information designating a plurality of specific regions within the first image and a second image containing image data of the plurality of specific regions, and generates a third image based on the first image, the designation information, and the second image, each of the plurality of specific regions has a rectangular shape, and the circuit, in operation, generates a packing image in which the plurality of specific regions are rearranged.
14. The image decoding device according to claim 13, wherein the first image is transmitted from an image encoding device to the image decoding device using a base layer of a multi-layer scheme, and the second image is transmitted from the image encoding device to the image decoding device using an enhancement layer of the multi-layer scheme.
15. An image encoding device comprising: a circuit; and a memory connected to the circuit, wherein, in operation, the circuit identifies, in an input image, a plurality of specific regions containing privacy information, generates an image in which processing has been applied to the plurality of specific regions, generates designation information indicating that the processing has been applied to the plurality of specific regions in the image, and encodes the image and the designation information into a bitstream, each of the plurality of specific regions has a rectangular shape, and the circuit, in operation, generates a packing image in which the plurality of specific regions are rearranged.
16. The image encoding device according to claim 15, wherein the processing includes at least one of blurring, mosaic processing, or silhouette processing.
17. The image encoding device according to claim 15, wherein the privacy information includes at least one of a person's face, a vehicle's license plate number, or information indicating an address.
18. The image encoding device according to claim 15, wherein the designation information includes information indicating the position and size of the specific regions within the image.
19. The image encoding device according to claim 15, wherein the circuit writes the designation information to a header area of the bitstream.
20. The image encoding device according to claim 19, wherein the header area includes SEI.
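The encoder-side steps recited in the claims (applying processing such as mosaic to rectangular regions, rearranging those regions into a packing image, padding the remaining area, and recording the correspondence between source and packed positions) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the disclosure: the names `Region`, `crop`, `mosaic`, and `pack` are hypothetical, images are plain row-major lists of grayscale integers, and the side-by-side packing layout is only one of many possible arrangements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Rectangular specific region: top-left corner plus size, in pixels."""
    x: int
    y: int
    w: int
    h: int


def crop(image, r):
    """Copy the pixels of region r out of a row-major 2-D image."""
    return [row[r.x:r.x + r.w] for row in image[r.y:r.y + r.h]]


def mosaic(image, r, block=2):
    """Return a copy of image with region r replaced by block-averaged pixels."""
    out = [row[:] for row in image]
    for by in range(r.y, r.y + r.h, block):
        for bx in range(r.x, r.x + r.w, block):
            # Clip each block to the region so pixels outside it stay untouched.
            ys = range(by, min(by + block, r.y + r.h))
            xs = range(bx, min(bx + block, r.x + r.w))
            mean = sum(image[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def pack(image, regions, pad=0):
    """Place the regions side by side in one packing image (assumed layout).

    Returns the packing image and a correspondence list of
    (region in the source image, region in the packing image) pairs.
    Areas not covered by any region keep the padding value.
    """
    height = max(r.h for r in regions)
    width = sum(r.w for r in regions)
    packed = [[pad] * width for _ in range(height)]
    mapping, x0 = [], 0
    for r in regions:
        for dy, row in enumerate(crop(image, r)):
            packed[dy][x0:x0 + r.w] = row
        mapping.append((r, Region(x0, 0, r.w, r.h)))
        x0 += r.w
    return packed, mapping


# Toy 6x4 input image and two hypothetical privacy regions.
image = [[10 * y + x for x in range(6)] for y in range(4)]
regions = [Region(0, 0, 2, 2), Region(3, 1, 2, 3)]
processed = mosaic(mosaic(image, regions[0]), regions[1])  # image with processing applied
packed, mapping = pack(image, regions)                     # original pixels, rearranged
```

In this sketch, `processed` plays the role of the image in which the specific regions have been obscured, while `packed` and `mapping` stand in for the packing image and the correspondence part of the designation information.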

Description

This disclosure relates to an image encoding device, an image decoding device, an image encoding method, and an image decoding method.
Patent Document 1 discloses a video encoding and decoding method using adaptive coupled pre-filters and post-filters. Patent Document 2 discloses a method for encoding image data for loading into an artificial intelligence (AI) integrated circuit.
Patent Document 1: U.S. Patent No. 9883207
Patent Document 2: U.S. Patent No. 10452955
This disclosure aims, when an image is transmitted from an image encoding device to an image decoding device, both to protect personal privacy information and to allow the image decoding side to execute machine tasks or human vision using that privacy information.
An image decoding device according to one aspect of the present disclosure comprises a circuit and a memory connected to the circuit, wherein the circuit, in operation, acquires a first image by decoding a first bitstream, acquires designation information specifying a particular region within the first image and a second image including image data of the particular region by decoding a second bitstream, and generates a third image based on the first image, the designation information, and the second image.
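The generation of the third image from the first image, the designation information, and the second image can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: it assumes the third image is formed by pasting the region pixels carried in the second image back over the first image, that the designation information is available as pairs of `(x, y, w, h)` rectangles, and that images are row-major lists of grayscale integers. The function name `compose_third_image` and the data layout are hypothetical.

```python
def compose_third_image(first_image, designation, second_image):
    """Overwrite each designated region of first_image with pixels from second_image.

    designation is a list of (src, dst) pairs of (x, y, w, h) rectangles
    (an assumed representation): src locates the region in the first image,
    dst locates the same region's pixels inside the second image.
    """
    third = [row[:] for row in first_image]  # leave the first image unmodified
    for (sx, sy, w, h), (dx, dy, dw, dh) in designation:
        if (w, h) != (dw, dh):
            raise ValueError("source and packed regions must have equal size")
        for j in range(h):
            third[sy + j][sx:sx + w] = second_image[dy + j][dx:dx + w]
    return third


# Toy data: a 4x4 first image whose private regions were blanked to 0,
# and a second image that carries the original pixels of two regions.
first_image = [[0] * 4 for _ in range(4)]
second_image = [[1, 2, 5],
                [3, 4, 6]]
designation = [((0, 0, 2, 2), (0, 0, 2, 2)),
               ((3, 2, 1, 2), (2, 0, 1, 2))]
third_image = compose_third_image(first_image, designation, second_image)
```

A decoder that has access only to the first bitstream would, under this assumption, stop at `first_image`, while a decoder that can also read the second bitstream could reconstruct `third_image`.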
(Brief description of the drawings)
This figure shows a simplified configuration of the image processing system according to the embodiment of this disclosure.
This is a flowchart showing the processing flow performed by the image encoding device.
This figure shows an example of an input image.
This figure shows a bounding box as an example of a specific region.
This figure shows an example of the designation information.
This figure shows an example of the designation information.
This figure shows an example of the designation information.
This figure shows an example of the designation information.
This figure shows an example of the designation information.
This figure shows an example of the designation information.
This figure shows an example of the first image.
This figure shows an example of the second image.
This figure shows an example of the second image.
This figure shows an example of the second image.
This figure shows an example of the second image.
This figure shows an example of the second image.
This figure shows an example of the second image.
This figure shows an example of the second image.
This figure shows an example of the second image.
This figure shows a first example of the second image when parts of multiple bounding boxes overlap.
This figure shows a second example of the second image when parts of multiple bounding boxes overlap.
This figure shows a second example of the second image when parts of multiple bounding boxes overlap.
This figure shows a first example of a bitstream data structure.
This figure shows a first example of a bitstream data structure.
This is a flowchart showing the processing flow performed by the image decoding device.
This figure shows a simplified implementation example of the image encoding device.
This figure shows a simplified implementation example of the image decoding device.
(Knowledge that forms the basis of this disclosure)
Conventional encoding methods aimed to provide the best possible video for human vision under bitrate constraints.
The advancement of machine learning and neural network-based applications, coupled with an abundance of sensors, has enabled numerous intelligent platforms that handle massive amounts of data, including connected cars, video surveillance, and smart cities. Because such vast amounts of data are generated constantly, traditional methods that keep humans in the pipeline have become inefficient and impractical in terms of latency and scalability.
Furthermore, transmission and archiving systems need more compact data representations and lower-latency solutions, which led to the introduction of VCM (Video Coding for Machines).
In some cases, machines communicate with each other and perform tasks without human intervention; in other cases, a human may need to perform additional processing on a specific decompressed stream. For example, with surveillance camera footage, a human "supervisor" might search for a specific person or scene within the video.
In still other cases, the same bitstream may be used by both humans and machines. In connected cars, for example, features can be used for image correction for humans and for object detection and segmentation for machines.
A typical system architecture includes a pair consisting of an image encoder and an image decoder. The system's input can be video, still images, or feature data. Examples of machine tasks include object detection, object segmentation, object tracking, action recognition, pose estimation, or any combination thereof. Human vision is another use case that can be served alongside machine tasks.
According to conventional technology, if a captured image contains private information such as an individual's face or a vehicle's license plat