US-20260127709-A1 - FEED-FORWARD GAUSSIAN SPLATTING
Abstract
Certain aspects of the present disclosure provide techniques and apparatus for machine learning. In an example method, Gaussian kernels parameterized by a plurality of parameters corresponding to a set of attributes are accessed. A set of norm values for the set of attributes is determined based on the plurality of parameters, and a noise measure is generated based on the set of norm values. A set of rendered images is generated based on the Gaussian kernels and the noise measure. A set of losses is generated based on the set of rendered images, and the plurality of parameters is updated based on the set of losses. An output rendered image is generated based on the updated plurality of parameters for the Gaussian kernels.
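The update loop summarized above can be illustrated with a minimal sketch. It assumes the two-sided perturbation recited in claims 5 and 16 (one render with the noise added, one with it subtracted) and a sign-based gradient estimate of the form g = |v| * sign((l+ - l-) / v) as recited in claim 9. Everything else is a hypothetical choice made for illustration and is not taken from the disclosure: the attribute layout in `ATTR_SLICES`, the use of a mean L2 norm per attribute, random +/-1 noise directions, the `rel_scale` factor, and `toy_render`, which merely stands in for a real splatting renderer.

```python
import numpy as np

# Hypothetical attribute layout: position (3), scale (3), opacity (1) per kernel.
ATTR_SLICES = {"position": slice(0, 3), "scale": slice(3, 6), "opacity": slice(6, 7)}


def attribute_norms(params):
    """One norm value per attribute: mean L2 norm over kernels (assumed choice)."""
    return {name: float(np.linalg.norm(params[:, sl], axis=1).mean())
            for name, sl in ATTR_SLICES.items()}


def sample_noise(params, norms, rng, rel_scale=1e-2):
    """Random +/-1 directions, scaled per attribute by that attribute's norm."""
    noise = rng.choice([-1.0, 1.0], size=params.shape)
    for name, sl in ATTR_SLICES.items():
        noise[:, sl] *= rel_scale * max(norms[name], 1e-8)
    return noise


def toy_render(params):
    """Stand-in for a renderer: any forward-only function of the parameters."""
    return np.tanh(params).sum(axis=0)


def loss_fn(image, target):
    """Mean squared error against a ground truth image."""
    return float(np.mean((image - target) ** 2))


def train_step(params, target, norms, rng, lr=1.0):
    """One forward-only update using two perturbed renders and no backpropagation."""
    v = sample_noise(params, norms, rng)
    loss_plus = loss_fn(toy_render(params + v), target)    # render with noise added
    loss_minus = loss_fn(toy_render(params - v), target)   # render with noise subtracted
    g = np.abs(v) * np.sign((loss_plus - loss_minus) / v)  # sign-based gradient estimate
    return params - lr * g, loss_plus, loss_minus


rng = np.random.default_rng(0)
params = rng.normal(size=(16, 7))
target = toy_render(rng.normal(size=(16, 7)))
norms = attribute_norms(params)
new_params, loss_plus, loss_minus = train_step(params, target, norms, rng)
```

With `lr=1.0`, the update moves the parameters to whichever of the two perturbed points produced the lower loss, which makes the behavior of the sign-based estimate easy to check. Only forward renders are needed, so no gradient of the renderer is ever computed.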
Inventors
- Junmin WU
- Khalid TAHBOUB
- Jamie Menjay Lin
- Scott Benjamin LEASK
- Qiqi Hou
- Chen Feng
- Kai Wang
Assignees
- QUALCOMM INCORPORATED
Dates
- Publication Date
- 20260507
- Application Date
- 20250718
Claims (20)
- 1 . A processing system comprising: one or more memories comprising processor-executable instructions; and one or more processors coupled to the one or more memories and configured to execute the processor-executable instructions and cause the processing system to: access a plurality of Gaussian kernels parameterized by a plurality of parameters, wherein the plurality of parameters comprises, for each respective Gaussian kernel of the plurality of Gaussian kernels, a respective set of parameters for a set of attributes; determine a first set of norm values for the set of attributes based on the set of parameters; generate a first noise measure based at least in part on the first set of norm values; generate a first set of rendered images based on the plurality of Gaussian kernels and the first noise measure; generate a first set of losses based at least in part on the first set of rendered images; update the plurality of parameters based on the first set of losses; and generate an output rendered image based on the updated plurality of parameters for the plurality of Gaussian kernels.
- 2 . The processing system of claim 1 , wherein the one or more processors are configured to execute the processor-executable instructions and cause the processing system to determine the first set of norm values in response to determining that a current iteration of updating the plurality of parameters corresponds to a norm update iteration.
- 3 . The processing system of claim 2 , wherein the one or more processors are configured to execute the processor-executable instructions and cause the processing system to, in response to determining that a subsequent iteration of updating the plurality of parameters does not correspond to a norm update iteration, use the first set of norm values during the subsequent iteration.
- 4 . The processing system of claim 2 , wherein norm update iterations are defined as a function of parameter iterations for which the plurality of parameters is updated, such that updated sets of norm values are generated more frequently during earlier parameter iterations, as compared to subsequent parameter iterations.
- 5 . The processing system of claim 1 , wherein, to generate the first set of rendered images, the one or more processors are configured to execute the processor-executable instructions and cause the processing system to: render a first image subsequent to adding the first noise measure to each set of parameters of the plurality of parameters; and render a second image subsequent to subtracting the first noise measure from each set of parameters of the plurality of parameters.
- 6 . The processing system of claim 1 , wherein, to generate the first set of losses, the one or more processors are configured to execute the processor-executable instructions and cause the processing system to compare the first set of rendered images to one or more ground truth images.
- 7 . The processing system of claim 6 , wherein the one or more ground truth images were captured using one or more imaging devices.
- 8 . The processing system of claim 1 , wherein, to update the plurality of parameters based on the first set of losses, the one or more processors are configured to execute the processor-executable instructions and cause the processing system to: compute a set of gradients based on the first set of losses; and update the plurality of parameters based on the set of gradients using gradient descent.
- 9 . The processing system of claim 8 , wherein the one or more processors are configured to execute the processor-executable instructions and cause the processing system to compute the set of gradients according to $g = |v| \cdot \operatorname{sign}\left(\frac{l^{+} - l^{-}}{v}\right)$, wherein: $g$ is the set of gradients, $v$ is the first noise measure, and $l^{+}$ and $l^{-}$ are first and second losses, respectively, of the first set of losses.
- 10 . The processing system of claim 1 , wherein the one or more processors are configured to execute the processor-executable instructions and cause the processing system to refine the plurality of Gaussian kernels, wherein, to refine the plurality of Gaussian kernels, the one or more processors are configured to execute the processor-executable instructions and cause the processing system to: generate a respective error value for each respective Gaussian kernel of the plurality of Gaussian kernels; and for each respective Gaussian kernel having a respective error value that satisfies one or more criteria, either: split the respective Gaussian kernel to form two new Gaussian kernels, or clone the respective Gaussian kernel to form the two new Gaussian kernels.
- 11 . The processing system of claim 10 , wherein, to generate the respective error value for each respective Gaussian kernel, the one or more processors are configured to execute the processor-executable instructions and cause the processing system to: generate a respective pixel error measure for each respective pixel of a set of pixels of at least a first rendered image of the first set of rendered images based on comparing the first rendered image to a ground truth image; determine a respective blending ratio for each respective pixel of the set of pixels based on a respective subset of the plurality of Gaussian kernels that are depicted by the respective pixel; and generate the respective error value for each respective Gaussian kernel based on the pixel error measures and the blending ratios.
- 12 . The processing system of claim 10 , wherein, to generate the respective error value for each respective Gaussian kernel, the one or more processors are configured to execute the processor-executable instructions and cause the processing system to, for each respective Gaussian kernel of the plurality of Gaussian kernels: determine a respective first position of the respective Gaussian kernel during a prior iteration of updating the plurality of parameters; determine a respective second position of the respective Gaussian kernel during a current iteration of updating the plurality of parameters; and generate the respective error value based on a difference between the respective first and second positions.
- 13 . The processing system of claim 1 , wherein the processing system corresponds to at least one of (i) an edge device or (ii) user equipment (UE).
- 14 . A processor-implemented method for view synthesis, comprising: accessing a plurality of Gaussian kernels parameterized by a plurality of parameters, wherein the plurality of parameters comprises, for each respective Gaussian kernel of the plurality of Gaussian kernels, a respective set of parameters for a set of attributes; determining a first set of norm values for the set of attributes based on the set of parameters; generating a first noise measure based at least in part on the first set of norm values; generating a first set of rendered images based on the plurality of Gaussian kernels and the first noise measure; generating a first set of losses based at least in part on the first set of rendered images; updating the plurality of parameters based on the first set of losses; and generating an output rendered image based on the updated plurality of parameters for the plurality of Gaussian kernels.
- 15 . The processor-implemented method of claim 14 , wherein determining the first set of norm values is performed in response to determining that a current iteration of updating the plurality of parameters corresponds to a norm update iteration, the processor-implemented method further comprising, in response to determining that a subsequent iteration of updating the plurality of parameters does not correspond to a norm update iteration, using the first set of norm values during the subsequent iteration.
- 16 . The processor-implemented method of claim 14 , wherein generating the first set of rendered images comprises: rendering a first image subsequent to adding the first noise measure to each set of parameters of the plurality of parameters; and rendering a second image subsequent to subtracting the first noise measure from each set of parameters of the plurality of parameters.
- 17 . The processor-implemented method of claim 14 , further comprising refining the plurality of Gaussian kernels, comprising: generating a respective error value for each respective Gaussian kernel of the plurality of Gaussian kernels; and for each respective Gaussian kernel having a respective error value that satisfies one or more criteria, either: splitting the respective Gaussian kernel to form two new Gaussian kernels, or cloning the respective Gaussian kernel to form the two new Gaussian kernels.
- 18 . The processor-implemented method of claim 17 , wherein generating the respective error value for each respective Gaussian kernel comprises: generating a respective pixel error measure for each respective pixel of a set of pixels of at least a first rendered image of the first set of rendered images based on comparing the first rendered image to a ground truth image; determining a respective blending ratio for each respective pixel of the set of pixels based on a respective subset of the plurality of Gaussian kernels that are depicted by the respective pixel; and generating the respective error value for each respective Gaussian kernel based on the pixel error measures and the blending ratios.
- 19 . The processor-implemented method of claim 17 , wherein generating the respective error value for each respective Gaussian kernel comprises, for each respective Gaussian kernel of the plurality of Gaussian kernels: determining a respective first position of the respective Gaussian kernel during a prior iteration of updating the plurality of parameters; determining a respective second position of the respective Gaussian kernel during a current iteration of updating the plurality of parameters; and generating the respective error value based on a difference between the respective first and second positions.
- 20 . A processing system, comprising: means for accessing a plurality of Gaussian kernels parameterized by a plurality of parameters, wherein the plurality of parameters comprises, for each respective Gaussian kernel of the plurality of Gaussian kernels, a respective set of parameters for a set of attributes; means for determining a set of norm values for the set of attributes based on the set of parameters; means for generating a noise measure based at least in part on the set of norm values; means for generating a set of rendered images based on the plurality of Gaussian kernels and the noise measure; means for generating a set of losses based at least in part on the set of rendered images; means for updating the plurality of parameters based on the set of losses; and means for generating an output rendered image based on the updated plurality of parameters for the plurality of Gaussian kernels.
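The refinement recited in claims 10, 11, 17, and 18 can be sketched as follows: a per-pixel error is computed against a ground truth image, and each kernel's error value accumulates the pixel errors weighted by that kernel's blending ratio at each pixel. This is a sketch under stated assumptions rather than the claimed implementation. The function names, the mean absolute difference as the pixel error, the weighted-sum aggregation, and the thresholds are all hypothetical. The rule of splitting large kernels and cloning small ones is likewise an assumption borrowed from common Gaussian splatting practice, since the claims say only that a flagged kernel is either split or cloned.

```python
import numpy as np


def per_kernel_error(rendered, ground_truth, blend):
    """Aggregate pixel errors into one error value per kernel.

    rendered, ground_truth: (H, W, 3) images.
    blend: (H, W, K) blending ratio of each of K kernels at each pixel.
    """
    pixel_err = np.abs(rendered - ground_truth).mean(axis=-1)  # (H, W)
    return np.einsum("hw,hwk->k", pixel_err, blend)            # (K,)


def select_for_refinement(errors, scales, err_thresh, scale_thresh):
    """Flag kernels whose error exceeds a threshold; split large ones, clone small ones."""
    flagged = errors > err_thresh
    split = np.flatnonzero(flagged & (scales > scale_thresh))
    clone = np.flatnonzero(flagged & (scales <= scale_thresh))
    return split, clone


# Tiny 2x2 example with three kernels.
rendered = np.zeros((2, 2, 3))
ground_truth = np.zeros((2, 2, 3))
ground_truth[0, 0] = 1.0                 # only pixel (0, 0) is wrong
blend = np.zeros((2, 2, 3))
blend[0, 0, 0], blend[0, 0, 1] = 0.75, 0.25
blend[1, 1, 2] = 1.0

errors = per_kernel_error(rendered, ground_truth, blend)
split, clone = select_for_refinement(errors, np.array([2.0, 0.1, 2.0]), 0.2, 1.0)
```

In this example only the two kernels that contribute to the erroneous pixel are flagged. The third kernel covers a correctly rendered pixel and is left alone.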
Description
CROSS-REFERENCE TO RELATED APPLICATION(S)
The present application for patent claims the benefit of and priority to U.S. Provisional Patent Application No. 63/717,228, filed Nov. 6, 2024, which is hereby incorporated by reference herein in its entirety for all applicable purposes.
INTRODUCTION
Aspects of the present disclosure relate to machine learning.
A wide variety of machine learning model architectures have been trained to perform an assortment of diverse tasks, including computer vision tasks, language tasks, classification and regression tasks, generative tasks, and the like.
Recently, Gaussian splatting has been used for view synthesis, which involves learning to generate imagery of a subject or scene from one or more points of view based on images captured from different points of view. For example, given a set of images depicting a scene, Gaussian splatting may be used to generate images depicting the scene from other points of view not reflected in the training set of images (e.g., to generate a video that would be captured by a camera moving around the space, such as between two of the training images).
BRIEF SUMMARY
Certain aspects of the present disclosure provide a processor-implemented method, comprising: accessing a plurality of Gaussian kernels parameterized by a plurality of parameters, wherein the plurality of parameters comprises, for each respective Gaussian kernel of the plurality of Gaussian kernels, a respective set of parameters for a set of attributes; determining a first set of norm values for the set of attributes based on the set of parameters; generating a first noise measure based at least in part on the first set of norm values; generating a first set of rendered images based on the plurality of Gaussian kernels and the first noise measure; generating a first set of losses based at least in part on the first set of rendered images; updating the plurality of parameters based on the first set of losses; and generating an output rendered image based on the updated plurality of parameters for the plurality of Gaussian kernels.
Other aspects provide processing systems configured to perform the aforementioned methods as well as those described herein; non-transitory, computer-readable media comprising instructions that, when executed by one or more processors of a processing system, cause the processing system to perform the aforementioned methods as well as those described herein; a computer program product embodied on a computer-readable storage medium comprising code for performing the aforementioned methods as well as those further described herein; and a processing system comprising means for performing the aforementioned methods as well as those further described herein.
The following description and the related drawings set forth in detail certain illustrative features of one or more aspects.
BRIEF DESCRIPTION OF THE DRAWINGS
The appended figures depict example features of certain aspects of the present disclosure and are therefore not to be considered limiting of the scope of this disclosure.
FIG. 1 depicts an example workflow for feed-forward Gaussian splatting, according to some aspects of the present disclosure.
FIG. 2 depicts an example workflow for Gaussian perturbation scaling for feed-forward Gaussian splatting, according to some aspects of the present disclosure.
FIG. 3 is a flow diagram depicting an example method for feed-forward Gaussian splatting with Gaussian perturbation scaling, according to some aspects of the present disclosure.
FIG. 4 is a flow diagram depicting an example method for feed-forward Gaussian splatting via norm buffering, according to some aspects of the present disclosure.
FIG. 5 is a flow diagram depicting an example method for feed-forward Gaussian splatting with gradient replacement, according to some aspects of the present disclosure.
FIG. 6 is a flow diagram depicting an example method for feed-forward Gaussian splatting, according to some aspects of the present disclosure.
FIG. 7 depicts an example processing system configured to perform various aspects of the present disclosure.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the drawings. It is contemplated that elements and features of one aspect may be beneficially incorporated in other aspects without further recitation.
DETAILED DESCRIPTION
Aspects of the present disclosure provide apparatuses, methods, processing systems, and non-transitory computer-readable mediums for providing improved machine learning. Specifically, in some aspects of the present disclosure, techniques for efficient feed-forward Gaussian splatting are provided.
Gaussian splatting involves techniques for view synthesis that can learn, from a relatively small set of static images depicting a scene or object, to generate new images of the scene or object from virtually any position and orientation in three-dimensional s