EP-4738832-A1 - DECODING DEVICE, ENCODING DEVICE, DECODING METHOD, AND ENCODING METHOD
Abstract
A decoder includes circuitry, and a memory connected to the circuitry. The circuitry decodes, from a bitstream having a multi-layer structure including at least one image layer, at least one parameter associated with the image layer, and the parameter indicates whether or not an image decoded from an image layer associated with the parameter is suitable for a specific task processing.
Inventors
- GAO, Jingying
- TEO, HAN BOON
- LIM, CHONG SOON
- YADAV, PRAVEEN KUMAR
- ABE, KIYOFUMI
- NISHI, TAKAHIRO
- SUGIO, TOSHIYASU
Assignees
- Panasonic Intellectual Property Corporation of America
Dates
- Publication Date
- 20260506
- Application Date
- 20240731
Claims (20)
- A decoder comprising: circuitry; and a memory coupled to the circuitry, wherein the circuitry decodes, from a bitstream having a multi-layer structure including at least one image layer, at least one parameter associated with the image layer, and the parameter indicates whether or not an image decoded from an image layer associated with the parameter is suitable for a specific task processing.
- The decoder according to claim 1, wherein the circuitry further: decodes an image from an image layer selected based on the parameter from the at least one image layer; and executes the task processing by using the image decoded from the image layer.
- The decoder according to claim 1, wherein the task processing includes machine vision.
- The decoder according to claim 1, wherein the task processing includes human vision.
- The decoder according to claim 1, wherein the task processing includes machine vision and human vision, and the at least one parameter includes a first parameter indicating whether or not an image decoded from the image layer is suitable for the machine vision, and a second parameter indicating whether or not an image decoded from the image layer is suitable for the human vision.
- The decoder according to claim 1, wherein the parameter includes a first value and a second value, the first value indicates that an image decoded from the image layer is suitable for the task processing, and the second value indicates that an image decoded from the image layer is not suitable for the task processing.
- The decoder according to claim 1, wherein the parameter includes a first value and a second value, the first value indicates that an image decoded from the image layer is suitable for the task processing, and the second value indicates that whether or not an image decoded from the image layer is suitable for the task processing is unspecified.
- The decoder according to claim 6 or 7, wherein the circuitry further: decodes an image from only an image layer associated with the parameter indicating the first value among the at least one image layer; and executes the task processing by using the image decoded from the image layer.
- The decoder according to claim 1, wherein the at least one image layer includes an image layer with which the parameter is not associated, and that the parameter is not associated with the image layer indicates that whether or not an image decoded from the image layer is suitable for the task processing is unspecified.
- The decoder according to claim 1, wherein the circuitry decodes the at least one parameter from a predetermined header region of the bitstream, and the predetermined header region includes SEI.
- The decoder according to claim 10, wherein the at least one image layer includes a base layer that is a lowermost layer of the multi-layer structure, and the at least one parameter associated with the at least one image layer is stored in the header region of the base layer.
- The decoder according to claim 10, wherein the at least one parameter associated with the at least one image layer is stored in the header region of each of the at least one image layer.
- An encoder comprising: circuitry; and a memory connected to the circuitry, wherein the circuitry encodes, into a bitstream having a multi-layer structure including at least one image layer, at least one parameter associated with the image layer, and the parameter indicates whether or not an image in an image layer associated with the parameter is suitable for a specific task processing.
- The encoder according to claim 13, wherein the task processing includes machine vision.
- The encoder according to claim 13, wherein the task processing includes human vision.
- The encoder according to claim 13, wherein the task processing includes machine vision and human vision, and the at least one parameter includes a first parameter indicating whether or not an image in the image layer is suitable for the machine vision, and a second parameter indicating whether or not an image in the image layer is suitable for the human vision.
- The encoder according to claim 13, wherein the parameter includes a first value and a second value, the first value indicates that an image in the image layer is suitable for the task processing, and the second value indicates that an image in the image layer is not suitable for the task processing.
- The encoder according to claim 13, wherein the parameter includes a first value and a second value, the first value indicates that an image in the image layer is suitable for the task processing, and the second value indicates that whether or not an image in the image layer is suitable for the task processing is unspecified.
- The encoder according to claim 13, wherein the at least one image layer includes an image layer with which the parameter is not associated, and that the parameter is not associated with the image layer indicates that whether or not an image in the image layer is suitable for the task processing is unspecified.
- The encoder according to claim 13, wherein the circuitry encodes the at least one parameter into a predetermined header region of the bitstream, and the predetermined header region includes SEI.
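The layer selection recited in the decoder claims (a per-layer parameter for each task, with "unspecified" signalled either by a value or by the absence of the parameter) can be illustrated with a minimal Python sketch. All names here (`Suitability`, `ImageLayer`, `select_layers`) are hypothetical and are not taken from the disclosure. The sketch assumes the parameters have already been parsed from the header region, and it does not model any actual SEI syntax.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Suitability(Enum):
    """Hypothetical values for the per-layer parameter."""
    SUITABLE = 1
    NOT_SUITABLE = 0


@dataclass
class ImageLayer:
    layer_id: int
    # None models a layer with which no parameter is associated,
    # i.e. suitability for that task is unspecified.
    machine_vision: Optional[Suitability] = None
    human_vision: Optional[Suitability] = None


def select_layers(layers: List[ImageLayer], task: str) -> List[int]:
    """Return the IDs of layers whose parameter for `task` signals 'suitable'.

    Layers marked not suitable, or left unspecified, are skipped, so the
    decoder would not need to decode an image from them for this task.
    """
    if task not in ("machine_vision", "human_vision"):
        raise ValueError(f"unknown task: {task}")
    return [
        layer.layer_id
        for layer in layers
        if getattr(layer, task) is Suitability.SUITABLE
    ]


# Illustrative three-layer stream (values are made up for the example).
layers = [
    ImageLayer(0, machine_vision=Suitability.SUITABLE,
               human_vision=Suitability.NOT_SUITABLE),
    ImageLayer(1, human_vision=Suitability.SUITABLE),
    ImageLayer(2),  # no parameter associated: unspecified for both tasks
]

machine_layers = select_layers(layers, "machine_vision")  # [0]
human_layers = select_layers(layers, "human_vision")      # [1]
```

Skipping unspecified layers is one possible policy. A decoder could instead fall back to decoding them when no layer is marked suitable.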
Description
Technical Field
The present disclosure relates to a decoder, an encoder, a decoding method, and an encoding method.
Background Art
Patent Literature 1 discloses a video encoding method and a decoding method using an adaptive coupled prefilter and an adaptive coupled postfilter.
Patent Literature 2 discloses a method of encoding image data for loading into an artificial intelligence (AI) integrated circuit.
However, in Patent Literatures 1 and 2, in an image processing system that transmits a bitstream having a multi-layer structure from an encoder to a decoder, reducing a processing load on the decoder is not sufficiently studied.
Citation List
Patent Literature
- Patent Literature 1: US 9,883,207
- Patent Literature 2: US 10,452,955
Summary of Invention
An object of the present disclosure is to reduce a processing load on a decoder in an image processing system that transmits a bitstream having a multi-layer structure from an encoder to the decoder.
A decoder according to one aspect of the present disclosure includes circuitry, and a memory coupled to the circuitry. The circuitry decodes, from a bitstream having a multi-layer structure including at least one image layer, at least one parameter associated with the image layer, and the parameter indicates whether or not an image decoded from an image layer associated with the parameter is suitable for a specific task processing.
Brief Description of Drawings
- FIG. 1 is a diagram illustrating, in a simplified manner, a configuration of an image processing system according to an embodiment of the present disclosure.
- FIG. 2 is a diagram illustrating, in a simplified manner, a configuration of circuitry included in an encoder.
- FIG. 3 is a flowchart showing processing executed by the circuitry included in the encoder.
- FIG. 4 is a diagram illustrating, in a simplified manner, a part of a bitstream having a multi-layer structure.
- FIG. 5 is a diagram illustrating a setting example of a parameter by a setting unit.
- FIG. 6A is a diagram illustrating a first example of syntax related to setting of a parameter.
- FIG. 6B is a diagram illustrating a second example of syntax related to setting of a parameter.
- FIG. 7 is a diagram illustrating, in a simplified manner, a configuration of circuitry included in a decoder.
- FIG. 8 is a flowchart showing processing executed by the circuitry included in the decoder.
- FIG. 9 is a diagram illustrating, in a simplified manner, a part of a bitstream having a multi-layer structure.
- FIG. 10 is a diagram illustrating a setting example of a parameter by the setting unit.
- FIG. 11 is a diagram illustrating an example of syntax related to setting of a parameter.
- FIG. 12 is a diagram illustrating a first setting example of a parameter by the setting unit.
- FIG. 13 is a diagram illustrating a second setting example of a parameter by the setting unit.
- FIG. 14A is a diagram illustrating a processing example in the decoder.
- FIG. 14B is a diagram illustrating a processing example in the decoder.
- FIG. 14C is a diagram illustrating a processing example in the decoder.
- FIG. 14D is a diagram illustrating a processing example in the decoder.
- FIG. 14E is a diagram illustrating a processing example in the decoder.
- FIG. 15 is a diagram illustrating, in a simplified manner, a part of a bitstream having a multi-layer structure.
- FIG. 16 is a diagram illustrating an example of syntax related to setting of a parameter.
- FIG. 17 is a block diagram illustrating an example of a functional configuration of an encoding unit.
- FIG. 18 is a block diagram illustrating an example of a functional configuration of a decoding unit.
- FIG. 19 is a diagram illustrating an example of a hierarchical structure of data in a stream.
- FIG. 20 is a diagram illustrating a configuration example of a bitstream.
Description of Embodiments
(Knowledge underlying present disclosure)
An image processing system according to the background art includes an encoder and a decoder.
The encoder encodes an image into a bitstream and transmits the bitstream storing the encoded image to the decoder. The decoder decodes an image from the received bitstream and executes task processing by using the decoded image. The task processing includes machine vision and human vision. Machine vision includes object detection, object tracking, object segmentation, action recognition, pose estimation, or the like using a machine-learned estimation model. Human vision includes visual recognition of an image, or viewing and listening to a moving image, by a human such as an operator or a user. In a case where a bitstream has a multi-layer structure including a plurality of image layers, different images are stored in the respective image layers. Which image layer holds an image suitable for task processing therefore depends on the content of the task processing. However, in the background art, there is no information indicating the correspondence between the content of task processing and a suitable image layer. Therefore, a decod