US-12625920-B2 - Rendering a scene using a reduced memory representation of a polynomial function to determine an output value approximating a mathematical function

US12625920B2US 12625920 B2US12625920 B2US 12625920B2US-12625920-B2

Abstract

An aspect includes an apparatus for evaluating a mathematical function at an input value. The apparatus includes a selector for selecting a mathematical function, an input for a value at which to evaluate the function, an identifier for identifying an interval containing the input value. The interval is described by at least one polynomial function. At least one control point representing the polynomial function is retrieved from at least one look up table, and the polynomial function can be derived from the control points. The function is evaluated at the input value and an output of the evaluation is used as a value of the function at that input value.

Inventors

Simon Fenney

Assignees

IMAGINATION TECHNOLOGIES LIMITED

Dates

Publication Date: 20260512
Application Date: 20220127
Priority Date: 20040527

Claims (18)

1 . A computer graphics system for rendering a scene using representations of a plurality of polynomial functions to determine an output value approximating one of a plurality of mathematical functions, the computer graphics system comprising: a plurality of lookup tables stored in one or more non-transitory memories, each of the plurality of lookup tables being configured to store a data value determined for a respective interval over which a respective one of the plurality of mathematical functions may be evaluated; and fixed function logic coupled to the plurality of lookup tables, the fixed function logic comprising: function selection logic configured to select a particular set of the plurality of lookup tables according to a particular mathematical function to be evaluated; interval selection logic configured to identify an interval containing a point at which the particular mathematical function is to be evaluated; polynomial derivation logic configured to: obtain, from the particular set of the plurality of lookup tables, the data value for the identified interval and data values for at least one interval adjacent each end of the identified interval, and derive a polynomial function from the data values, the polynomial function being for approximating the particular mathematical function over the identified interval, wherein the computer graphics system is configured to store spline control points as the data values used to derive the polynomial function; and evaluation logic configured to compute the output value of the polynomial function at the said point, and use that output value as an evaluation of the particular mathematical function thereby enabling the computer graphics system to render the scene.
2 . The computer graphics system of claim 1 , wherein the stored spline control points have a cumulative data size that is less than a cumulative data size of polynomial coefficients from which the said polynomial function is derivable.
3 . The computer graphics system of claim 2 , wherein the stored spline control points enable the computer graphics system to render the scene with a reduced memory requirement.
4 . The computer graphics system of claim 1 , wherein the polynomial function is a polynomial function of cubic order.
5 . The computer graphics system of claim 1 , wherein the polynomial function is a polynomial function of quadratic order.
6 . The computer graphics system of claim 1 , wherein the number of lookup tables in the particular set of the plurality of lookup tables is greater than or equal to an order of the polynomial function.
7 . The computer graphics system of claim 6 , the computer graphics system being configured to access one spline control point from each lookup table in the particular set of the plurality of lookup tables so as to retrieve the full number of spline control points required to evaluate the polynomial function.
8 . The computer graphics system of claim 7 , the computer graphics system being configured to simultaneously retrieve the full number of spline control points required to evaluate the polynomial function.
9 . The computer graphics system of claim 7 , the computer graphics system being configured to retrieve, in parallel, the full number of spline control points required to evaluate the polynomial function.
10 . The computer graphics system of claim 1 , wherein the polynomial function is a polynomial function of cubic order, and the particular set of the plurality of lookup tables consists of three look-up tables having 2 N−2 +1 entries.
11 . The computer graphics system of claim 10 , wherein N is 6.
12 . The computer graphics system of claim 1 , wherein the polynomial function is a polynomial function of quadratic order, and the particular set of the plurality of lookup tables consists of two look-up tables having 2 N−2 entries and two look-up tables having 2 N−2 +1 entries.
13 . The computer graphics system of claim 12 , wherein N is 6.
14 . The computer graphics system of claim 1 , the computer graphics system being configured to model at least one of a smooth surface and a curve using the output value approximating the particular mathematical function.
15 . The computer graphics system of claim 1 , wherein the computer graphics system is configured to: retrieve spline control points representing the polynomial function; and evaluate the polynomial function at the point using the retrieved spline control points.
16 . The computer graphics system of claim 15 , wherein the retrieved spline control points are B-spline control points, and wherein the computer graphics system is configured to: convert the B-spline control points to Bezier control points defining a Bezier representation of the polynomial function; and perform a de Casteljau algorithm to evaluate the Bezier representation of the polynomial function from the Bezier control points.
17 . The computer graphics system of claim 15 , wherein the positions in the lookup tables from which spline control points are retrieved are defined by lookup index values, and the computer graphics system is configured to: retrieve one spline control point from each lookup table of the particular set of the plurality of look-up tables at a look-up index value determined from the identified interval; and rotate the retrieved spline control points to a required order for use in evaluating the polynomial function.
18 . A method of rendering a scene using representations of a plurality of polynomial functions to determine an output value approximating one of a plurality of mathematical functions, the method comprising: selecting, using function selection logic implemented in fixed function logic, a particular set of a plurality of lookup tables stored in one or more non-transitory memories according to a particular mathematical function to be evaluated, each of the plurality of lookup tables being configured to store a data value determined for a respective interval over which a respective one of the plurality of mathematical functions may be evaluated; identifying, using interval selection logic implemented in said fixed function logic, an interval containing a point at which the particular mathematical function is to be evaluated; obtaining, using polynomial derivation logic implemented in said fixed function logic, the data value for the identified interval and data values for at least one interval adjacent each end of the identified interval from the particular set of the plurality of lookup tables; deriving, using the polynomial derivation logic, a polynomial function from the data values, the polynomial function being for approximating the particular mathematical function over the identified interval, wherein the data values used to derive the polynomial function are spline control points; computing, using evaluation logic implemented in said fixed function logic, the output value of the polynomial function at the said point; and using that output value as an evaluation of the particular mathematical function thereby enabling the computer graphics system to render the scene.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS AND CLAIM OF PRIORITY This application is a continuation under 35 U.S.C. 120 of copending application Ser. No. 13/673,872 filed Nov. 9, 2012, which is a continuation of prior application Ser. No. 13/343,071 filed Jan. 4, 2012, now abandoned, which is a divisional of prior U.S. application Ser. No. 11/140,258, filed May 27, 2005, now U.S. Pat. No. 8,285,768, which claims foreign priority under 35 U.S.C. 119 from United Kingdom Application No. 0411880.8 filed May 27, 2004. BACKGROUND OF THE INVENTION In floating point units of some CPUs or, more particularly, contemporary graphics chips, it is necessary to compute mathematical functions of certain floating point numbers. These are typically continuous and may include functions such as ‘reciprocal’, i.e. ƒ(x)=1/x, ‘reciprocal square root’, ƒ(x)=1/√{square root over (x)}, ‘logarithm base 2’, ƒ(x)=1nx/ln 2, power base 2, ƒ(x)=2x, sine and cosine. For example, such mathematical functions form part of the instruction set of the Microsoft Direct X graphics specification, and some have appeared, in various forms, as instructions in various CPUs. With many of these functions, there is a repeating pattern of behaviour that can be exploited to simplify the implementation—cosine and sine are obvious examples, but, this behaviour can also be seen with reciprocal which ‘repeats’ for each for every interval of the form [2n,2n+1) and for ‘reciprocal square root’ which repeats on intervals of the form [2n,2n+2). Such regions can usually be extracted through simple integer manipulation of the input floating point exponent, and so the difficult part of the evaluation reduces to evaluating a function on a much-reduced range of numbers. Without loss of generality, it can be assumed that the input value to such a reduced function can be mapped to a (typically fixed point) value, x′ in the range [0,1]. The output is also assumed to be of the same form. There are several methods of computing such reduced functions known in the art. The simplest system is that of a direct table look-up. For a given input precision (N bits) and a particular output precision (M bits) the table would contain 2N entries each of M-bits. The input value is mapped to the nearest entry in the table and the result is obtained. However, for anything except small tables, this is an expensive and generally inaccurate option. One approach that can be used to alleviate the expense is to compress the data in the table—such a scheme was used for reciprocal calculations in U.S. Pat. No. 6,330,000. This scheme, however, is difficult to apply generally and, furthermore, does not offer large levels of compression. An alternative approach is to use a small look-up table, which only gives moderately precise results, and then apply Newton-Raphson iteration to improve the numerical accuracy. With such a scheme, the precision, in terms of numbers of bits, typically doubles with each such iteration. Turkowski describes such a method in ‘Graphics Gems V’ [ISBN 0-12-543455-3] for computing the reciprocal square root. A look-up table with, say, 26 entries, produces an initial guess, R0, which is then improved by use of two Newton-Raphson iteration steps that evaluate: Ri+1=(3−Ri*Ri*x)*x/2 Similarly, the Texas Instrument's ‘TMS320C3x User's Guide’ gives examples for the computation of ‘reciprocal’ and ‘reciprocal square root’. It, in fact, effectively dispenses with the look-up table and starts with a very approximate initial guess and simply relies on the iteration to improve the accuracy. One problem with methods that use iterative improvements such as Newton-Raphson is that they require the additional mathematical operations (either floating point or integer) in order to converge to the result. This may not be an issue when the iterations steps are performed in a CPU with multipliers or adders that may otherwise be idle, but including the extra circuits in a dedicated hardware unit may be expensive. Another alternative is to use a power series to evaluate the function, say, using three or four terms. In this approach, weighted versions of the k derivatives are stored as constants and the function is evaluated as . . . ƒ(x+x′)=C0C1x′+C2x′2+ . . . +Ckx′k Eqn. 1 or, equivalently, with fewer multiplies, as . . . ƒ(x+x′)=C0+(C1+(C2+ . . . )x′)x′ . . . Eqn 1a Obvious candidates for such an implementation include Sine and Cosine, and, in fact, the Microsoft DirectX 9 specification does give an example using such. One problem with only having a single function is that it may be difficult to maintain accuracy through the entire domain of the function without having a very large number of terms. A simple remedy is to again use a table approach and use a piecewise approximation to the function. This method is well known in the art and several improvements have been suggested: e.g by Detrey and de Dinechin in “Table-based polynomials for fast hardware function evaluation” or Walters and Sch