CN-121982157-A - Global neural rendering method and system oriented to hybrid representation optimization

Abstract

The invention discloses a global neural rendering method and system oriented to hybrid representation optimization, belonging to the technical field of computer graphics. The method comprises: dividing a screen into a plurality of tiles; acquiring, for each tile, a first workload index of a first rendering stage and a second workload index of a second rendering stage; generating a unified cross-tile, cross-stage scheduling queue based on the first workload index and the second workload index; executing rendering tasks on different tiles in parallel according to the unified scheduling queue; and, when the inter-stage dependency condition is met, triggering the subsequent feature fusion and neural global illumination rendering stages so as to generate an output image with a global illumination effect. On modern AI accelerators and AI-oriented GPUs, the method can significantly improve the end-to-end frame rate and reduce tail latency while preserving visual quality and temporal stability, and is suitable for application scenarios such as real-time global illumination, large-scale complex scene rendering, and interactive neural rendering.

Inventors

BAO HUJUN
YAN XINKAI
HUO YUCHI

Assignees

Zhejiang University (浙江大学)

Dates

Publication Date: 20260505
Application Date: 20260408

Claims (10)

1. A global neural rendering method oriented to hybrid representation optimization, comprising the steps of: dividing a screen into a plurality of tiles, and acquiring a first workload index of a first rendering stage on each tile and a second workload index of a second rendering stage on each tile; generating a unified cross-tile, cross-stage scheduling queue based on the first workload index and the second workload index, and executing rendering tasks on different tiles in parallel according to the unified scheduling queue; and triggering the subsequent feature fusion and neural global illumination rendering stages when the inter-stage dependency condition is met, so as to generate an output image with a global illumination effect.
2. The global neural rendering method oriented to hybrid representation optimization according to claim 1, wherein the first rendering stage soft-rasterizes mesh geometry data to generate a geometry buffer, and the first workload index comprises the number of mesh elements or the number of pixels covered by the tile; the second rendering stage projects and splats a set of 3D Gaussians to generate a Gaussian buffer, and the second workload index comprises the number of Gaussians covering the tile.
3. The global neural rendering method oriented to hybrid representation optimization according to claim 1, wherein generating the unified cross-tile, cross-stage scheduling queue comprises: sorting or bucketing the tiles based on the first workload index and the second workload index of each tile, so as to construct a unified scheduling queue that accounts for both inter-tile load distribution and inter-stage task association; and, according to the unified scheduling queue, preferentially dispatching heavy-load tiles whose workload exceeds a preset threshold, or splitting the heavy-load tiles into a plurality of subtasks for parallel processing.
4. The global neural rendering method oriented to hybrid representation optimization according to claim 1, wherein the inter-stage dependency condition is implemented by: asynchronously tracking, per tile, the completion states of the first rendering stage and the second rendering stage, and triggering the subsequent feature fusion and neural global illumination rendering stages for a tile when both stages on that tile are detected to be complete, so that the subsequent stages of different tiles can execute asynchronously and in parallel.
5. The global neural rendering method oriented to hybrid representation optimization according to claim 1, further comprising employing a precision-aware scheduling policy: different computation precisions are allocated to different rendering stages or different data types, wherein a first precision is used for geometry-related computation, a second precision lower than the first precision is used for Gaussian blending, feature fusion and neural global illumination rendering, and a third precision lower than the second precision is used for intermediate activations or final output.
6. The global neural rendering method oriented to hybrid representation optimization according to claim 1, further comprising employing an early culling strategy: before the rendering tasks of the second rendering stage are dispatched, the visibility information generated by the first rendering stage is used to skip, within the corresponding tile, second-stage rendering contributions that are occluded or whose contribution is below a threshold.
7. The global neural rendering method oriented to hybrid representation optimization according to claim 1, further comprising employing a kernel organization and asynchronous optimization strategy: kernel fusion is applied to cross-operator or cross-stage computation, and the data layout is alignment-optimized; data within a tile is prefetched to on-chip high-speed memory through asynchronous instructions; and the first rendering stage, the second rendering stage, and the subsequent feature fusion and neural global illumination rendering stages are executed with overlap in different streams, with dependencies guaranteed by events/fences.
8. A global neural rendering system oriented to hybrid representation optimization, implemented by the global neural rendering method oriented to hybrid representation optimization according to any one of claims 1-7, characterized by comprising a tile load acquisition module, a unified scheduling execution module and a fusion inference output module; the tile load acquisition module is used for dividing a screen into a plurality of tiles, and acquiring a first workload index of a first rendering stage on each tile and a second workload index of a second rendering stage on each tile; the unified scheduling execution module is used for generating a unified cross-tile, cross-stage scheduling queue based on the first workload index and the second workload index, and executing rendering tasks on different tiles in parallel according to the unified scheduling queue; the fusion inference output module is used for triggering the subsequent feature fusion and neural global illumination rendering stages when the inter-stage dependency condition is met, so as to generate an output image with a global illumination effect.
9. An electronic device comprising a memory for storing a computer program and one or more processors, wherein the processors, when executing the computer program, implement the global neural rendering method oriented to hybrid representation optimization according to any one of claims 1-7.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the global neural rendering method oriented to hybrid representation optimization according to any one of claims 1-7 is implemented when the computer program is executed by a computer.
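
The scheduling idea in claims 1, 3 and 4 can be illustrated with a minimal CPU-side sketch. It is not the patented implementation: the names `Task`, `build_queue`, `run`, `heavy_threshold` and `max_parts` are hypothetical, the workload indices are used directly as costs, and parallel execution on GPU streams is replaced by a sequential simulation. The sketch builds one heaviest-first queue across tiles and stages, splits tasks above a threshold into subtasks, and triggers fusion for a tile as soon as all of its first-stage and second-stage work has finished.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    neg_cost: float                      # heap key: the heaviest task pops first
    tile: int = field(compare=False)
    stage: str = field(compare=False)    # "mesh" (first stage) or "gauss" (second stage)
    part: int = field(compare=False, default=0)


def build_queue(mesh_load, gauss_load, heavy_threshold, max_parts=4):
    """Build one cross-tile, cross-stage queue ordered by descending cost.

    A task whose cost exceeds heavy_threshold is split into equal subtasks.
    Returns the heap and, per tile, the number of tasks that must finish
    before that tile's fusion stage may start.
    """
    heap, pending = [], {}
    for tile, (m, g) in enumerate(zip(mesh_load, gauss_load)):
        pending[tile] = 0
        for stage, cost in (("mesh", m), ("gauss", g)):
            parts = 1
            if cost > heavy_threshold:
                # Ceiling division, capped at max_parts subtasks.
                parts = min(max_parts, int(-(-cost // heavy_threshold)))
            for p in range(parts):
                heapq.heappush(heap, Task(-cost / parts, tile, stage, p))
                pending[tile] += 1
    return heap, pending


def run(heap, pending):
    """Pop tasks heaviest-first; record a tile once both of its stages are done."""
    fused_order = []
    while heap:
        task = heapq.heappop(heap)
        pending[task.tile] -= 1          # per-tile completion tracking
        if pending[task.tile] == 0:      # inter-stage dependency satisfied
            fused_order.append(task.tile)
    return fused_order


# Three tiles; tile 1 carries a heavy Gaussian load that gets split in three.
heap, pending = build_queue([10, 200, 30], [5, 900, 50], heavy_threshold=300)
print(run(heap, pending))
```

In a real pipeline the pops would be dispatched to concurrent workers and the per-tile counter would be replaced by an event or fence, but the ordering and trigger logic would follow the same pattern.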

Description

Global neural rendering method and system oriented to hybrid representation optimization

Technical Field

The invention belongs to the technical field of computer graphics, and particularly relates to a global neural rendering method and system oriented to hybrid representation optimization.

Background

In recent years, neural rendering has gradually shifted from implicit representations built around continuous fields (such as NeRF) to a paradigm of explicit-implicit hybrids and multi-representation fusion: triangle meshes provide stable primary visibility and geometric priors; explicit primitives such as 3D Gaussians compactly capture translucent and micro-structured detail such as hair, fabric and vegetation; and a neural network handles the approximation and compensation of complex illumination effects, achieving a trade-off between quality and efficiency. Practice shows that the mesh is responsible for the determinism of the main structure and occlusion, while 3D Gaussians fill in high-frequency detail and translucent phenomena through analytic splatting; the combination of the two maintains multi-view consistency and temporal stability better than either route alone, and provides more expressive conditioning and feature input for the subsequent neural global illumination (GI). To bring the hybrid representation to real-time rendering, the pipeline typically adopts a staged rendering strategy: upstream, a geometry buffer (G-buffer) is first generated from mesh geometry by rasterization; next, differentiable splatting is performed based on Gaussian geometry and rasterization, outputting tile-level contributions; finally, the two kinds of results are fused into a unified intermediate representation and passed to the neural shading/relighting network to predict the final pixels.
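
As a rough illustration of the tile-level Gaussian stage, the per-tile workload index named in claim 2 (the number of Gaussians covering each tile) could be gathered by binning each projected Gaussian's screen-space bounding square into tiles. This is a sketch under stated assumptions: the function name `gaussians_per_tile`, the 16-pixel tile size and the use of a single radius per Gaussian are illustrative choices, not details taken from the patent.

```python
def gaussians_per_tile(centers, radii, width, height, tile=16):
    """Count, for each tile, how many projected Gaussians overlap it.

    centers: list of (x, y) pixel coordinates of projected Gaussian centers.
    radii:   screen-space extent of each Gaussian in pixels.
    Returns a row-major list with one count per tile.
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    counts = [0] * (tiles_x * tiles_y)
    for (x, y), r in zip(centers, radii):
        # Convert the bounding square to tile indices and clamp to the screen.
        x0 = max(0, int((x - r) // tile))
        x1 = min(tiles_x - 1, int((x + r) // tile))
        y0 = max(0, int((y - r) // tile))
        y1 = min(tiles_y - 1, int((y + r) // tile))
        # An off-screen Gaussian yields an empty range and is skipped.
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                counts[ty * tiles_x + tx] += 1
    return counts


# A 32x32 screen has 2x2 tiles; the second Gaussian straddles all four.
print(gaussians_per_tile([(8, 8), (16, 16)], [4, 2], 32, 32))  # [2, 1, 1, 1]
```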
However, the stages differ markedly in their demands on hardware resources: geometry processing and rasterization tend to be bandwidth-limited, while neural shading and global illumination tend to be limited by vector/tensor compute. Because of this divergence, a static resource mapping strategy makes it difficult for the stages to jointly reach peak hardware performance. The trend toward low-precision computation on modern GPUs further affects the system design of neural rendering. The peak performance of the tensor/matrix units of new-generation GPUs in low-precision formats such as BF16/FP16/FP8 is far higher than in FP32; if the workload can tolerate mixed precision, compute and bandwidth pressure can be significantly reduced without sacrificing quality. In an actual hybrid pipeline, however, if FP32 features are used across multiple stages, the tensor units sit idle for long periods. Besides precision, spatial load imbalance is another key factor behind tail latency and reduced occupancy: when statistics are gathered per tile, the number of Gaussians per tile follows a heavy-tailed distribution, which causes fence waits and scheduling imbalance and ultimately degrades frame stability.
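
The effect of a skewed per-tile load on frame time can be seen in a toy model. The sketch below is illustrative only: it compares a static round-robin assignment of tiles to workers with a greedy heaviest-first assignment (the classic longest-processing-time heuristic, used here as a stand-in and not claimed to be the patent's scheduler). The function names and the example costs are assumptions.

```python
import heapq


def makespan_static(costs, workers):
    """Assign tiles to workers round-robin in screen order.

    The frame finishes when the slowest worker finishes.
    """
    totals = [0.0] * workers
    for i, c in enumerate(costs):
        totals[i % workers] += c
    return max(totals)


def makespan_heaviest_first(costs, workers):
    """Greedy heuristic: give the heaviest remaining tile to the least-loaded worker."""
    totals = [0.0] * workers
    heapq.heapify(totals)
    for c in sorted(costs, reverse=True):
        heapq.heappush(totals, heapq.heappop(totals) + c)
    return max(totals)


# Two heavy tiles among six light ones, processed by four workers.
costs = [9, 1, 1, 1, 9, 1, 1, 1]
print(makespan_static(costs, 4))          # 18.0: both heavy tiles land on worker 0
print(makespan_heaviest_first(costs, 4))  # 9.0: heavy tiles go to separate workers
```

In this example the static mapping leaves three workers waiting at the fence while one processes both heavy tiles, which is the tail-latency pattern described above.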
In summary, when mesh rasterization, 3D Gaussian splatting and neural global illumination are mapped onto current GPUs in a hybrid-representation pipeline, three common bottlenecks arise. First, overuse of full precision leaves the tensor units idle: FP32 features are used across multiple stages, so the Tensor cores idle for long periods and their throughput cannot be unlocked. Second, the spatial load is highly skewed: the number of Gaussians per tile follows a heavy-tailed distribution, producing long tails and fence waits and reducing parallelism. Third, resource demands diverge across stages: geometry/rasterization is more bandwidth-limited, whereas neural shading/global illumination is more limited by vector/tensor compute, so static mapping easily leads to resource misallocation and idling. In response to these challenges, current industry solutions on the one hand rely only on fixed-function RT cores combined with denoising, which requires a complex sampling-denoising trade-off under dynamic high-frequency content and carries a high hardware cost; on the other hand, they merely apply mixed precision, kernel fusion and simple scheduling in the software stack, so that the tensor path is only intermittently active owing to narrow GEMMs, type conversions, concatenation/copy overhead, tile load imbalance and the like, and peak performance is difficult to approach stably. Therefore, the global neural rendering method in the hybrid-representation pipeline needs to be optimized to systematically resolve the above bottlenecks and achieve high-quality, high-throughput real-time neural rendering.

Disclosure of Invention

In view of the above, the present invention aims