CN-113223138-B - Hidden surface elimination in a graphics processing system

Abstract

The present disclosure relates to hidden surface elimination in a graphics processing system. More particularly, a graphics processor performs early depth tests of primitives against blocks of a render output and depth tests for sampling locations of the render output, and maintains both a per-sample depth buffer and a per-block depth buffer, the latter storing depth value information for blocks of the render output for use in the block early depth test. When processing of a render output is stopped before the render output is complete, the per-sample depth values in the per-sample depth buffer are written out to storage so that they can be restored, but the per-block depth value information in the per-block depth buffer is discarded. Then, when processing of the render output is resumed, the per-sample depth values are loaded back into the per-sample depth buffer, and the loaded values are also used to restore the per-block depth buffer for use in the resumed processing.
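
By way of illustration only, the stop and resume behaviour summarised above may be sketched as follows. This is a minimal software model, not the hardware arrangement of the disclosure. The names `DepthState`, `suspend`, `resume` and `rebuild_block_ranges`, the 2x2 block size, and the use of a dictionary to stand in for external storage are all assumptions made for the example. It also assumes that the per-block information is a (minimum, maximum) depth pair, as in the dependent claims.

```python
from dataclasses import dataclass, field

BLOCK = 2  # hypothetical block edge length, in samples


def rebuild_block_ranges(samples, block=BLOCK):
    """Derive a (min, max) depth pair per block from the per-sample depths."""
    h, w = len(samples), len(samples[0])
    ranges = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            vals = [samples[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            ranges[(bx // block, by // block)] = (min(vals), max(vals))
    return ranges


@dataclass
class DepthState:
    samples: list                                     # per-sample depth buffer (rows of floats)
    block_ranges: dict = field(default_factory=dict)  # per-block (min, max) information


def suspend(state, storage):
    """Write the per-sample depths out; discard the per-block information."""
    storage["depth"] = [row[:] for row in state.samples]
    state.samples = None
    state.block_ranges = {}


def resume(storage):
    """Reload the per-sample depths and regenerate the per-block ranges from them."""
    samples = [row[:] for row in storage["depth"]]
    return DepthState(samples, rebuild_block_ranges(samples))


storage = {}  # stands in for external memory (an assumption of this sketch)
state = DepthState([[0.2, 0.4, 1.0, 1.0],
                    [0.3, 0.5, 1.0, 1.0]])
state.block_ranges = rebuild_block_ranges(state.samples)
suspend(state, storage)
restored = resume(storage)
```

In this model only the per-sample values cross the stop and resume boundary. The per-block ranges held in `restored` are recomputed from them and never saved.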

Inventors

  • ANDREAS DUE ENGH-HALSTVEDT
  • A.E. Chaffin
  • F. Hergland

Assignees

  • ARM Limited

Dates

Publication Date
20260508
Application Date
20210120
Priority Date
20200121

Claims (20)

  1. A method of operating a graphics processor, the graphics processor comprising:
     a rasterizer that rasterizes input primitives to generate graphics fragments to be processed, each graphics fragment having one or more sampling points associated with it; and
     a renderer that processes the fragments generated by the rasterizer to generate output fragment data;
     wherein the rasterizer, upon receiving a primitive to be rasterized, tests, for each of one or more blocks representing respective different regions of a render output to be generated, the block with respect to the primitive to determine whether the primitive at least partially covers the block;
     the graphics processor further comprising:
     a block early depth test circuit configured to perform an early depth test of a primitive with respect to a block of the render output that the rasterizer finds the primitive to at least partially cover; and
     a sample depth test circuit configured to perform depth tests for sampling locations found to be covered by a primitive;
     the method comprising, in processing primitives to generate a render output:
     storing a per-block depth buffer for the render output, the per-block depth buffer storing, for each of one or more blocks representing respective different regions of the render output being generated, depth value information for that block for use by the block early depth test circuit in performing a block early depth test of a primitive with respect to that block; and
     storing a per-sample depth buffer for the render output, the per-sample depth buffer storing depth values for respective ones of one or more sampling locations of the render output being generated, for use by the sample depth test circuit in performing a depth test of a primitive with respect to those sampling locations;
     the method further comprising:
     the graphics processor stopping processing of the render output and, when it does so, writing the per-sample depth values in the per-sample depth buffer to a memory so that they can be restored when processing of the render output continues, but discarding the per-block depth value information in the per-block depth buffer; and
     the graphics processor resuming processing of the render output and, when it does so:
     loading the per-sample depth values written out to the memory into a per-sample depth buffer for use in continuing processing of the render output; and
     using the loaded per-sample depth values to store a set of per-block depth value information in a per-block depth buffer for use by the block early depth test circuit in performing block early depth tests of primitives while continuing processing of the render output.
  2. The method of claim 1, wherein the rasterizer is a hierarchical rasterizer operable to iteratively test primitives against blocks of the render output of progressively decreasing size, down to a minimum block size, and the blocks of the render output on which the early depth test is performed correspond to the blocks of the render output that the rasterizer tests during rasterization.
  3. The method of claim 1, wherein the per-block depth buffer stores minimum and maximum depth values for each block for which depth value information is stored.
  4. The method of claim 3, wherein using the loaded per-sample depth values to store a set of per-block depth value information in a per-block depth buffer, for use by the block early depth test circuit in performing block early depth tests of primitives when processing the render output, comprises: setting the minimum depth value of a block in the restored per-block depth buffer to the minimum of the per-sample depth values loaded into the per-sample depth buffer for the sampling locations that fall within the block, and setting the maximum depth value of the block to the maximum of the per-sample depth values loaded into the per-sample depth buffer for the sampling locations that fall within the block.
  5. The method of claim 1, wherein loading the per-sample depth values into a per-sample depth buffer for use in processing the render output comprises loading the per-sample depth values using a direct memory access process.
  6. The method of claim 1, wherein loading the per-sample depth values into a per-sample depth buffer for use in processing the render output comprises loading the per-sample depth values in a block-by-block order.
  7. The method of claim 1, wherein the per-block depth buffer is configured to store depth values for a hierarchical arrangement of blocks, and wherein using the loaded per-sample depth values to store a set of per-block depth value information in a per-block depth buffer, for use by the block early depth test circuit in performing block early depth tests of primitives when processing the render output, comprises: using the loaded per-sample depth values to set the per-block depth values of the smallest blocks in the block subdivision hierarchy; and setting the per-block depth values of each larger block in the hierarchy based on the per-block depth values of the smaller blocks that the larger block contains.
  8. The method of claim 1, wherein: the graphics processor is triggered to stop processing the render output because a current data structure containing data to be processed for the render output has been exhausted; and the graphics processor is triggered to resume processing the render output because a new data structure containing new data to be processed for the render output is ready.
  9. The method of claim 1, wherein the render output to be generated comprises tiles of an overall output generated by the graphics processor.
  10. A method of operating a graphics processor, the graphics processor comprising:
      a rasterizer that rasterizes input primitives to generate graphics fragments to be processed, each graphics fragment having one or more sampling points associated with it; and
      a renderer that processes the fragments generated by the rasterizer to generate output fragment data;
      wherein the rasterizer, upon receiving a primitive to be rasterized, tests, for each of one or more blocks representing respective different regions of a render output to be generated, the block with respect to the primitive to determine whether the primitive at least partially covers the block;
      the graphics processor further comprising:
      a block early depth test circuit configured to perform an early depth test of a primitive with respect to a block of the render output that the rasterizer finds the primitive to at least partially cover; and
      a sample depth test circuit configured to perform depth tests for sampling locations found to be covered by a primitive;
      the method comprising the graphics processor:
      loading per-sample depth values into a per-sample depth buffer for use in processing a render output, the per-sample depth buffer storing depth values for respective ones of one or more sampling locations of the render output being generated, for use by the sample depth test circuit in performing depth tests of primitives with respect to those sampling locations; and
      using the loaded per-sample depth values to store a set of per-block depth value information in a per-block depth buffer for use by the block early depth test circuit in performing block early depth tests when processing the render output, the per-block depth buffer storing, for each of one or more blocks representing respective different regions of the render output being generated, depth value information for that block for use by the block early depth test circuit in performing a block early depth test of a primitive with respect to that block.
  11. The method of claim 10, wherein the per-block depth buffer stores minimum and maximum depth values for each block for which depth value information is stored.
  12. The method of claim 11, wherein using the loaded per-sample depth values to store a set of per-block depth value information in a per-block depth buffer, for use by the block early depth test circuit in performing block early depth tests of primitives when processing the render output, comprises: setting the minimum depth value of a block in the restored per-block depth buffer to the minimum of the per-sample depth values loaded into the per-sample depth buffer for the sampling locations that fall within the block, and setting the maximum depth value of the block to the maximum of the per-sample depth values loaded into the per-sample depth buffer for the sampling locations that fall within the block.
  13. The method of claim 12, wherein loading the per-sample depth values into a per-sample depth buffer for use in processing the render output comprises loading the per-sample depth values using a direct memory access process.
  14. The method of claim 12, wherein loading the per-sample depth values into a per-sample depth buffer for use in processing the render output comprises loading the per-sample depth values in a block-by-block order.
  15. The method of claim 10, wherein the per-block depth buffer is configured to store depth values for a hierarchical arrangement of blocks, and wherein using the loaded per-sample depth values to store a set of per-block depth value information in a per-block depth buffer, for use by the block early depth test circuit in performing block early depth tests of primitives when processing the render output, comprises: using the loaded per-sample depth values to set the per-block depth values of the smallest blocks in the block subdivision hierarchy; and setting the per-block depth values of each larger block in the hierarchy based on the per-block depth values of the smaller blocks that the larger block contains.
  16. The method of claim 10, wherein: the graphics processor is triggered to stop processing the render output because a current data structure containing data to be processed for the render output has been exhausted; and the graphics processor is triggered to resume processing the render output because a new data structure containing new data to be processed for the render output is ready.
  17. The method of claim 10, wherein the render output to be generated comprises tiles of an overall output generated by the graphics processor.
  18. A graphics processor, the graphics processor comprising:
      a rasterizer that rasterizes input primitives to generate graphics fragments to be processed, each graphics fragment having one or more sampling points associated with it; and
      a renderer that processes the fragments generated by the rasterizer to generate output fragment data;
      wherein the rasterizer is configured, when it receives a primitive to be rasterized, to test, for each of one or more blocks representing respective different regions of a render output to be generated, the block with respect to the primitive to determine whether the primitive at least partially covers the block;
      the graphics processor further comprising:
      a block early depth test circuit configured to perform an early depth test of a primitive with respect to a block of the render output that the rasterizer finds the primitive to at least partially cover; and
      a sample depth test circuit configured to perform depth tests for sampling locations found to be covered by a primitive;
      the graphics processor being further configured, in processing primitives to generate a render output, to:
      store a per-block depth buffer for the render output, the per-block depth buffer storing, for each of one or more blocks representing respective different regions of the render output being generated, depth value information for that block for use by the block early depth test circuit in performing a block early depth test of a primitive with respect to that block; and
      store a per-sample depth buffer for the render output, the per-sample depth buffer storing depth values for respective ones of one or more sampling locations of the render output being generated, for use by the sample depth test circuit in performing a depth test of a primitive with respect to those sampling locations;
      the graphics processor being further configured, when it stops processing a render output before completing the render output, to write the per-sample depth values in the per-sample depth buffer to storage so that they can be restored when processing of the render output continues, but to discard the per-block depth value information in the per-block depth buffer; and
      the graphics processor being further configured, when it resumes processing of a previously stopped render output, to:
      load the per-sample depth values of the render output written out to storage into a per-sample depth buffer for use in continuing processing of the render output; and
      use the loaded per-sample depth values to store a set of per-block depth value information in a per-block depth buffer for use by the block early depth test circuit in performing block early depth tests of primitives while continuing processing of the render output.
  19. The graphics processor of claim 18, wherein the rasterizer is a hierarchical rasterizer operable to iteratively test primitives against blocks of the render output of progressively decreasing size, down to a minimum block size, and the blocks of the render output on which the early depth test is performed correspond to the blocks of the render output that the rasterizer tests during rasterization.
  20. The graphics processor of claim 18, wherein the per-block depth buffer stores minimum and maximum depth values for each block for which depth value information is stored.
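
By way of illustration only, the restoration of a hierarchical per-block depth buffer recited in the claims above (minimum and maximum values per block, smallest blocks set from the loaded per-sample values, larger blocks set from the smaller blocks they contain) may be sketched as follows. The function names, the 2x2 grouping of child blocks into a parent, and the "less than" depth comparison used in `early_block_test` are assumptions of this example. The claims do not fix a subdivision factor or a comparison function.

```python
def leaf_ranges(samples, block):
    """(min, max) depth pairs for the smallest blocks, taken from the loaded per-sample depths."""
    h, w = len(samples), len(samples[0])
    grid = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            vals = [samples[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            row.append((min(vals), max(vals)))
        grid.append(row)
    return grid


def parent_ranges(children):
    """Merge each 2x2 group of child blocks into one larger block (assumed grouping)."""
    h, w = len(children), len(children[0])
    grid = []
    for cy in range(0, h, 2):
        row = []
        for cx in range(0, w, 2):
            group = [children[y][x]
                     for y in range(cy, min(cy + 2, h))
                     for x in range(cx, min(cx + 2, w))]
            row.append((min(lo for lo, _ in group), max(hi for _, hi in group)))
        grid.append(row)
    return grid


def build_hierarchy(samples, block=2):
    """Return the levels of the hierarchy, smallest blocks first, ending at a single block."""
    levels = [leaf_ranges(samples, block)]
    while len(levels[-1]) > 1 or len(levels[-1][0]) > 1:
        levels.append(parent_ranges(levels[-1]))
    return levels


def early_block_test(prim_min, prim_max, block_range):
    """Classify a primitive's depth range against a block, assuming a 'less than' depth test."""
    lo, hi = block_range
    if prim_min >= hi:
        return "cull"        # cannot pass at any sample in the block
    if prim_max < lo:
        return "pass"        # passes at every sample in the block
    return "per-sample"      # inconclusive: defer to the per-sample depth test


samples = [[0.1, 0.2, 0.9, 0.9],
           [0.3, 0.4, 0.9, 0.8],
           [0.5, 0.5, 0.6, 0.7],
           [0.5, 0.5, 0.6, 0.6]]
levels = build_hierarchy(samples)
root = levels[-1][0][0]  # (0.1, 0.9) for the data above
```

The larger blocks are computed from the ranges of the blocks below them rather than by rescanning the samples, so each sample value is read once.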

Description

Hidden surface elimination in a graphics processing system

Technical Field

The technology described herein relates to the processing of computer graphics, and in particular to hidden surface elimination (hidden surface removal) in graphics processing.

Background

Graphics processing is normally carried out by first dividing the graphics processing (render) output, such as a frame to be displayed, into a number of similar basic components (so-called "primitives") so that the graphics processing operations are easier to carry out. These primitives are typically simple polygons, such as triangles.

The graphics primitives for an output (such as a frame to be displayed) are typically generated by the driver for the graphics processor, based on graphics drawing instructions (requests) received from an application (e.g., a game) that requires the graphics processing.

Each primitive is, at this stage, typically defined by and represented as a set of vertices. Each vertex of a primitive has associated with it a set of data (such as position, color, texture, and other attribute data) representing that vertex. This data is then used, for example, when rasterizing and rendering the vertex (the primitives to which the vertex relates), e.g. for display.

Once the primitives and their vertices have been generated and defined, they can be processed by the graphics processor, for example to display the frame.

This processing basically involves determining which sampling points in an array of sampling points covering the output area to be processed are covered by a primitive, and then determining the appearance (e.g. in terms of its color, etc.) that each sampling point should have to represent the primitive at that sampling point. These processes are known as rasterization and rendering, respectively.
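
By way of illustration only, the coverage determination described above (finding which sampling points of an array are covered by a primitive) may be sketched using edge functions. This is one common technique and is not stated in this document to be the method used. The function names, the placement of one sampling point at each pixel centre, and the inclusion of points lying exactly on an edge (no fill rule is applied) are assumptions of the example.

```python
def edge(ax, ay, bx, by, px, py):
    """Cross product of (B - A) and (P - A); its sign tells which side of edge A->B point P is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covered_samples(tri, width, height):
    """Return the (x, y) indices of sampling points (pixel centres) covered by triangle `tri`."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    area = edge(x0, y0, x1, y1, x2, y2)
    if area == 0:
        return []  # a degenerate triangle covers nothing
    hits = []
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sampling point at the pixel centre (assumed)
            w0 = edge(x1, y1, x2, y2, px, py)
            w1 = edge(x2, y2, x0, y0, px, py)
            w2 = edge(x0, y0, x1, y1, px, py)
            # Accept either winding order: every edge value must share the sign of the area.
            if all(w * area >= 0 for w in (w0, w1, w2)):
                hits.append((x, y))
    return hits


print(covered_samples([(0, 0), (4, 0), (0, 3)], 4, 4))
# prints [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (0, 2)]
```

Each covered sampling point found in this way would then give rise to a fragment to be rendered.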
The rasterization process determines the sampling points that should be used for a primitive (i.e., the (x, y) positions of the sampling points to be used to represent the primitive in the render output, e.g. the frame to be displayed). This is typically done using the positions of the vertices of the primitive.

The rendering process then derives the data, such as red, green, and blue (RGB) color values and an "alpha" (transparency) value, necessary to represent the primitive at the sampling points (i.e., "shades" each sampling point). This may involve applying textures, blending sampling point data values, etc.

(In graphics literature, the term "rasterization" is sometimes used to refer to both the conversion of primitives into sampling locations and rendering. However, "rasterization" is used herein to refer only to converting primitive data into sampling point addresses.)

These processes are typically carried out by testing sets of one or more sampling points and then generating, for each set of sampling points found to include a sampling point that is inside (covered by) the primitive in question (being tested), a discrete graphical entity commonly referred to as a "fragment", on which the graphics processing operations (such as rendering) are carried out. The covered sampling points are thus, in effect, processed as fragments that will be used to render the primitive at the sampling points in question. The "fragments" are the graphical entities that pass through the rendering process (the rendering pipeline). Depending on how the graphics processing is configured, each fragment that is generated and processed may represent, for example, a single sampling point or a set of multiple sampling points.

(A "fragment" is therefore effectively (has associated with it) a set of primitive data interpolated to a given output-space sampling point or points of a primitive. The fragment may also include per-primitive and other state data needed to shade the primitive at the sampling point (fragment position) in question. Each graphics fragment may typically be the same size and location as a "pixel" of the output (e.g., output frame) (since the pixels are the singularities in the final display, there may be a one-to-one mapping between the "fragments" the graphics processor operates on (renders) and the pixels of the display). However, there may be cases where there is no one-to-one correspondence between fragments and display pixels, for example where a particular form of post-processing, such as downsampling, is performed on the rendered image prior to displaying the final image.)

(It is also the case that the final pixel output may depend on multiple or all fragments at a given pixel location, because the fragments at that location, e.g. from different overlapping primitives, may affect each other (e.g. due to transparency and/or blending).)

(Accordingly, there may be a one-to-one correspondence between sampling points and pixels of the display, but more typically, there may not be a one-to-one correspondence between sampling points and display pixels, as the rendered sample values may be downsampled to generate output pixel values for displaying the final ima