CN-121985144-A - Image coding method, device, equipment and storage medium

Abstract

The disclosure provides an image coding method, an image coding device and a storage medium, and relates to the technical field of computers. In the method, during wavelet transform encoding, a CPU analyzes an image to be encoded to obtain original pixel data and transmits the original pixel data to a GPU; the GPU preprocesses the original pixel data in parallel using a plurality of threads to obtain standard pixel data that meets the wavelet transform requirements, with each thread processing one pixel column of the original pixel data; the GPU performs a forward wavelet transform on the standard pixel data to obtain sub-band coefficients; and the GPU performs entropy encoding on the sub-band coefficients to obtain encoded data. Because image analysis is completed on the CPU while preprocessing, the wavelet transform and entropy encoding are executed in parallel on the GPU, the scheme can improve encoding throughput and resource utilization during image encoding.

Inventors

  • Request for anonymity

Assignees

  • 摩尔线程智能科技(北京)股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2025-12-23

Claims (16)

  1. An image encoding method, applied to an image encoding system including a central processing unit (CPU) and a graphics processing unit (GPU), comprising: the CPU analyzing an image to be encoded in a wavelet transform encoding process to obtain original pixel data, and transmitting the original pixel data to the GPU; the GPU performing a preprocessing operation on the original pixel data in parallel using a plurality of threads to obtain standard pixel data meeting wavelet transform requirements, wherein one thread processes one pixel column in the original pixel data; the GPU performing a forward wavelet transform on the standard pixel data to obtain sub-band coefficients; and the GPU performing entropy encoding on the sub-band coefficients to obtain encoded data.
  2. The image encoding method according to claim 1, wherein performing the preprocessing operation on the original pixel data in parallel using a plurality of threads comprises: determining the number of thread blocks required by the preprocessing operation based on the image width and the number of threads in the row direction of a single thread block in the GPU; determining a plurality of thread blocks corresponding to the preprocessing operation based on the number of thread blocks; and calling a preprocessing kernel function required by the preprocessing operation, and executing the preprocessing kernel function in parallel on threads in the plurality of thread blocks to perform the preprocessing operation on the original pixel data.
  3. The image encoding method according to claim 2, wherein determining the number of thread blocks required for the preprocessing operation based on the image width and the number of threads in the row direction of a single thread block in the GPU comprises: determining a total number of pixel columns to be preprocessed based on the image width; determining the number of columns covered by a single thread block based on the number of threads in the row direction of the single thread block in the GPU; and determining the number of thread blocks required for the preprocessing operation based on the result of dividing the total number of pixel columns by the number of covered columns.
  4. The image encoding method according to claim 2, wherein the thread blocks are two-dimensional thread blocks, each two-dimensional thread block comprises row-direction threads and column-direction threads, and the number of column-direction threads of the two-dimensional thread block is equal to the number of pixel components of the original pixel data; and executing the preprocessing kernel function in parallel on threads in the plurality of thread blocks comprises: executing the preprocessing kernel function in parallel on the row-direction threads and the column-direction threads of the plurality of two-dimensional thread blocks, wherein one row-direction thread processes one pixel column in the original pixel data and one column-direction thread processes one pixel component within the pixel column.
  5. The image encoding method according to claim 2, wherein the original pixel data is in an interleaved storage format and the component representation of the original pixel data requires color space conversion; and executing the preprocessing kernel function in parallel on threads in the plurality of thread blocks comprises: if the preprocessing kernel function is a first kernel function, executing the first kernel function in parallel on threads in the plurality of thread blocks, wherein the first kernel function is a kernel function capable of completing storage format conversion, direct current (DC) component shift processing and color space conversion in a single kernel function execution.
  6. The image encoding method according to claim 5, wherein executing the preprocessing kernel function in parallel on threads in the plurality of thread blocks to perform the preprocessing operation on the original pixel data further comprises: if the preprocessing kernel function comprises a second kernel function capable of completing storage format conversion and direct current (DC) component shift processing in a single kernel function execution and a third kernel function for color space conversion, executing the second kernel function in parallel on threads in the plurality of thread blocks to perform the storage format conversion and the DC component shift processing on the original pixel data; and after the storage format conversion and the DC component shift processing are completed, executing the third kernel function in parallel on threads in the plurality of thread blocks to perform the color space conversion on the original pixel data.
  7. The image encoding method according to claim 2, wherein the original pixel data is in an independent storage format and the component representation of the original pixel data requires color space conversion; and executing the preprocessing kernel function in parallel on threads in the plurality of thread blocks comprises: if the preprocessing kernel function is a fourth kernel function, executing the fourth kernel function in parallel on threads in the plurality of thread blocks, wherein the fourth kernel function is a kernel function capable of completing direct current (DC) component shift processing and color space conversion in a single kernel function execution.
  8. The image encoding method according to claim 1, wherein transmitting the original pixel data to the GPU comprises: allocating corresponding video memory in the GPU according to the size of the original pixel data; and transmitting the original pixel data to the video memory by asynchronous data transfer.
  9. The image encoding method according to claim 8, wherein performing the forward wavelet transform on the standard pixel data to obtain sub-band coefficients comprises: dividing the standard pixel data into a plurality of sub-blocks; loading pixel data corresponding to the sub-blocks from the video memory into a shared memory; and performing the forward wavelet transform on the pixel data corresponding to the sub-blocks in the shared memory to obtain the sub-band coefficients, and writing the sub-band coefficients back from the shared memory to the video memory.
  10. The image encoding method according to claim 9, wherein dividing the standard pixel data into a plurality of sub-blocks comprises: determining a target size and a critical supporting area for the sub-blocks based on a preset wavelet kernel, wherein the wavelet kernel is the filter operator used in the forward wavelet transform, and the critical supporting area is an area located at the edge of a sub-block and used for boundary calculation in the forward wavelet transform; and dividing the standard pixel data based on the target size and the critical supporting area to obtain the sub-blocks.
  11. The image encoding method according to claim 9, wherein performing the forward wavelet transform on the pixel data corresponding to the sub-blocks in the shared memory comprises: performing boundary extension on the pixel data corresponding to the sub-blocks in the shared memory to obtain corresponding extended sub-blocks; and if the wavelet transform encoding is lossy encoding, calling a fifth kernel function and using the fifth kernel function to perform the forward wavelet transform and quantization on the extended sub-blocks, wherein the fifth kernel function is a kernel function that performs the forward wavelet transform and quantization jointly.
  12. The image encoding method according to claim 1, wherein performing entropy encoding on the sub-band coefficients comprises: dividing the sub-band coefficients based on a preset code block size to obtain a plurality of code blocks; and calling an entropy encoding kernel function required by the entropy encoding, and executing the entropy encoding kernel function in parallel on a plurality of threads to entropy-encode the plurality of code blocks respectively, wherein the entropy encoding kernel function is related to the encoding mode of the wavelet transform encoding.
  13. The image encoding method according to claim 1, wherein the method further comprises: transmitting the encoded data from the GPU back to the CPU; and the CPU encapsulating the encoded data based on a target output format.
  14. An image encoding apparatus applied to wavelet transform encoding, the apparatus comprising: an image analysis module configured to analyze an image to be encoded in a wavelet transform encoding process to obtain original pixel data, and to transmit the original pixel data to a preprocessing module; the preprocessing module, configured to perform a preprocessing operation on the original pixel data in parallel using a plurality of threads to obtain standard pixel data meeting wavelet transform requirements, wherein one thread processes one pixel column in the original pixel data; a wavelet transform module configured to perform a forward wavelet transform on the standard pixel data to obtain sub-band coefficients; and an entropy encoding module configured to perform entropy encoding on the sub-band coefficients to obtain encoded data.
  15. An image encoding device, comprising: a processor including a CPU and a GPU; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the image encoding method of any one of claims 1-13 by executing the executable instructions.
  16. A computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the image encoding method of any of claims 1-13.
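
Claims 9 to 11 describe dividing the data into sub-blocks, loading each sub-block together with an edge region (the critical supporting area) into shared memory, extending the boundary, and transforming there. A minimal one-dimensional sketch of that idea follows. It assumes the reversible 5/3 lifting filter with whole-sample symmetric extension known from JPEG 2000 lossless coding; the claims do not name a wavelet kernel. Plain Python lists stand in for video memory and shared memory, and `HALO`, `mirror`, `lift_53`, `forward_53` and `forward_53_tiled` are hypothetical names chosen for illustration.

```python
HALO = 2  # assumed edge width: samples of context the 5/3 lifting steps need per side


def mirror(i, n):
    """Reflect index i into [0, n) (whole-sample symmetric boundary extension)."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i = abs(i) % period
    return period - i if i >= n else i


def lift_53(get, k_start, k_low_end, k_high_end):
    """Run the 5/3 predict and update steps for the given output index ranges."""
    def d(k):  # predict step: high-pass (detail) coefficient
        return get(2 * k + 1) - (get(2 * k) + get(2 * k + 2)) // 2

    # update step: low-pass (approximation) coefficient
    low = [get(2 * k) + (d(k - 1) + d(k) + 2) // 4 for k in range(k_start, k_low_end)]
    high = [d(k) for k in range(k_start, k_high_end)]
    return low, high


def forward_53(x):
    """Transform the whole signal at once (reference result)."""
    n = len(x)
    return lift_53(lambda i: x[mirror(i, n)], 0, (n + 1) // 2, n // 2)


def forward_53_tiled(x, tile):
    """Transform sub-block by sub-block; tile must be even."""
    n = len(x)
    low, high = [], []
    for start in range(0, n, tile):
        end = min(start + tile, n)
        # Stand-in for the shared-memory load: the sub-block plus its edge region,
        # with out-of-range samples filled by boundary extension.
        buf = [x[mirror(i, n)] for i in range(start - HALO, end + HALO)]
        get = lambda i, s=start, b=buf: b[i - (s - HALO)]
        lo, hi = lift_53(get, start // 2, (end + 1) // 2, end // 2)
        low += lo
        high += hi
    return low, high
```

Because each sub-block carries two extra samples on each side, the tiled result matches the whole-signal transform in this sketch, which is the property that lets separate thread blocks work on sub-blocks independently.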

Description

Image coding method, device, equipment and storage medium

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to an image encoding method, apparatus, device, and storage medium.

Background

In a wavelet-transform-based encoding process, a plurality of data processing steps must be performed in sequence on the image to be encoded. Under existing computing architectures, related wavelet transform coding schemes, such as High-Throughput JPEG 2000 (HTJ2K) coding, typically perform every coding step serially on the CPU. Completing the encoding entirely on the CPU is simple to implement and highly general, but because the number of CPU cores is limited, it is difficult to achieve sufficient parallelism when processing high-resolution images, which limits the overall encoding speed. When very large images or batches of images are encoded, each processing step must be completed serially, so the time cost increases significantly and real-time encoding requirements are difficult to meet.

Disclosure of Invention

An object of embodiments of the present disclosure is to provide an image encoding method, an image encoding apparatus, an image encoding device, and a computer readable storage medium that can improve encoding throughput and resource utilization during image encoding by completing image analysis on the CPU and performing preprocessing on the GPU in a multithreaded, parallel manner with pixel columns as the processing objects.

Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by practice of the disclosure.

According to a first aspect of the embodiments of the present disclosure, an image encoding method is provided. The method is applied to an image encoding system comprising a central processing unit (CPU) and a graphics processing unit (GPU), and comprises: the CPU analyzing an image to be encoded in a wavelet transform encoding process to obtain original pixel data, and transmitting the original pixel data to the GPU; the GPU performing a preprocessing operation on the original pixel data in parallel using a plurality of threads to obtain standard pixel data meeting wavelet transform requirements, wherein one thread processes one pixel column in the original pixel data; the GPU performing a forward wavelet transform on the standard pixel data to obtain sub-band coefficients; and the GPU performing entropy encoding on the sub-band coefficients to obtain encoded data.

In some example embodiments of the disclosure, based on the foregoing scheme, performing the preprocessing operation on the original pixel data in parallel using a plurality of threads includes: determining the number of thread blocks required for the preprocessing operation based on the image width and the number of threads in the row direction of a single thread block in the GPU; determining a plurality of thread blocks corresponding to the preprocessing operation based on the number of thread blocks; and calling a preprocessing kernel function required for the preprocessing operation and executing the preprocessing kernel function in parallel on threads in the plurality of thread blocks to perform the preprocessing operation on the original pixel data.
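
The launch sizing and the one-thread-per-pixel-column mapping described above can be emulated sequentially. The sketch below is illustrative only, not the patented kernel: it loops over block and thread indices instead of launching GPU threads, it follows claims 4 and 5 in using one column-direction thread per pixel component and in converting interleaved input to per-component planes, and it assumes that the block count is rounded up and that the direct current component shift subtracts 2^(bit_depth-1) from unsigned samples (the JPEG 2000 convention). The names `blocks_needed`, `column_thread` and `preprocess`, and the defaults `bit_depth=8` and `threads_x=4`, are hypothetical.

```python
def blocks_needed(image_width, threads_x):
    """Thread blocks needed so that every pixel column is covered by one thread."""
    # Rounding up is an assumption; the text only says the count is based on
    # dividing the number of pixel columns by the columns covered per block.
    return (image_width + threads_x - 1) // threads_x


def column_thread(interleaved, planes, width, height, ncomp, col, comp, shift):
    """Work of one logical thread: one pixel column, one pixel component."""
    for row in range(height):
        pixel = row * width + col
        # Interleaved input (C0 C1 C2 C0 C1 C2 ...) to per-component planes,
        # with the shift applied in the same pass.
        planes[comp][pixel] = interleaved[pixel * ncomp + comp] - shift


def preprocess(interleaved, width, height, ncomp, bit_depth=8, threads_x=4):
    """Sequential stand-in for launching 2D blocks of (threads_x, ncomp) threads."""
    shift = 1 << (bit_depth - 1)  # assumed shift for unsigned samples
    planes = [[0] * (width * height) for _ in range(ncomp)]
    for block in range(blocks_needed(width, threads_x)):
        for tx in range(threads_x):      # row-direction thread index
            col = block * threads_x + tx
            if col >= width:             # idle threads in a partly filled last block
                continue
            for ty in range(ncomp):      # column-direction thread index
                column_thread(interleaved, planes, width, height,
                              ncomp, col, ty, shift)
    return planes
```

For a 5-pixel-wide image and 4 row-direction threads per block, `blocks_needed(5, 4)` gives 2 blocks, and the bounds check keeps the three surplus threads of the second block idle.
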
In some example embodiments of the disclosure, based on the foregoing scheme, determining the number of thread blocks required for the preprocessing operation based on the image width and the number of threads in the row direction of a single thread block in the GPU includes: determining a total number of pixel columns to be preprocessed based on the image width; determining the number of columns covered by a single thread block based on the number of threads in the row direction of the single thread block in the GPU; and determining the number of thread blocks required for the preprocessing operation based on the result of dividing the total number of pixel columns by the number of covered columns.

In some example embodiments of the present disclosure, based on the foregoing scheme, the thread block is a two-dimensional thread block that includes row-direction threads and column-direction threads, and the number of column-direction threads of the two-dimensional thread block is equal to the number of pixel components of the original pixel data; executing the preprocessing kernel function in parallel on threads in the plurality of thread blocks includes executing the preprocessing kernel function in parallel on the row-direction threads and the column-direction threads of the plurality of two-dimensional thread blocks, wherein one row-direction thread processes one pixel column in the original pixel data, and one thread