US-12626129-B2 - Cost function engineering for estimating uncertainty correlated with prediction errors

US 12626129 B2

Abstract

A method and a non-transitory computer readable medium for performing a calculation in a neural network comprise: accepting a data set into the neural network; performing calculations with the neural network using the data set, wherein the calculations use a loss function and provide an aleatoric and an epistemic uncertainty that are correlated to a value; and displaying results of the calculations performed.

Inventors

  • Ravinath Kausik Kadayam Viswanathan
  • Lalitha Venkataramanan
  • Andras Szep
  • Lucas Pelegrin

Assignees

  • Schlumberger Technology Corporation

Dates

Publication Date
2026-05-12
Application Date
2021-10-28

Claims (20)

  1. A method for performing a calculation, comprising: providing a neural network configured to generate a prediction; accepting a data set into the neural network, wherein the data set comprises a measured value; performing calculations with the neural network using the data set, wherein the calculations use a loss function and provide an aleatoric uncertainty and an epistemic uncertainty that are correlated to an error of the prediction and the measured value, wherein the loss function is of a form of: L(x, y, \theta) = \frac{1}{N} \sum_{i=1}^{N} \left[ \frac{1}{\sigma_i^2} \lVert f_\theta(x_i) - y_i \rVert^2 + \log(\sigma_i^2) + \alpha \left( \sigma_i - \lvert f_\theta(x_i) - y_i \rvert \right)^2 \right]; and displaying results of the calculations performed.
  2. The method according to claim 1, wherein the neural network incorporates a machine learning model.
  3. The method according to claim 1, wherein the error of the prediction and the measured value is a prediction bias.
  4. The method according to claim 1, wherein the neural network has three hidden layers.
  5. The method according to claim 1, wherein a first term of the loss function relates to an accuracy.
  6. The method according to claim 1, wherein a second term of the loss function prevents an assignment of uncertainty values for all data points.
  7. The method according to claim 1, wherein a third term of the loss function represents the aleatoric uncertainty.
  8. The method of claim 1, wherein the data set comprises a geological scan of a geological formation, the method further comprising: analyzing, with the neural network, the data set to predict a highest concentration of hydrocarbons within the geological formation; and performing calculations with the neural network using the data set, wherein the calculations use the loss function and provide the aleatoric uncertainty and the epistemic uncertainty that are correlated to the error of the predicted highest concentration of hydrocarbons within the geological formation and an actual highest concentration of hydrocarbons within the geological formation.
  9. The method of claim 8, further comprising training the neural network based on a result of the calculations using the loss function.
  10. The method of claim 1, wherein the data set comprises a wireline scan of a geological formation, the method further comprising: analyzing, with the neural network, the data set to predict a geological parameter of the geological formation; and performing calculations with the neural network using the data set, wherein the calculations use the loss function and provide the aleatoric uncertainty and the epistemic uncertainty that are correlated to the error of the predicted geological parameter of the geological formation and an actual geological parameter of the geological formation.
  11. The method of claim 10, further comprising training the neural network based on a result of the calculations using the loss function.
  12. A non-transitory computer-readable medium, comprising computer-executable instructions that, when executed by processing circuitry, cause the processing circuitry to perform operations comprising: generating a prediction using a neural network; accepting a data set into the neural network, wherein the data set comprises a measured value; performing calculations with the neural network using the data set, wherein the calculations use a loss function and provide an aleatoric uncertainty and an epistemic uncertainty that are correlated to an error of the prediction and the measured value, wherein the loss function is of a form of: L(x, y, \theta) = \frac{1}{N} \sum_{i=1}^{N} \left[ \frac{1}{\sigma_i^2} \lVert f_\theta(x_i) - y_i \rVert^2 + \log(\sigma_i^2) + \alpha \left( \sigma_i - \lvert f_\theta(x_i) - y_i \rvert \right)^2 \right]; and displaying results of the calculations performed.
  13. The non-transitory computer-readable medium of claim 12, wherein the computer-executable instructions, when executed by the processing circuitry, cause the processing circuitry to perform operations comprising: generating the prediction using a machine learning model incorporated by the neural network.
  14. The non-transitory computer-readable medium of claim 12, wherein the data set comprises a geological scan of a geological formation, wherein the computer-executable instructions, when executed by the processing circuitry, cause the processing circuitry to perform operations comprising: analyzing, with the neural network, the data set to predict a highest concentration of hydrocarbons within the geological formation; and performing calculations with the neural network using the data set, wherein the calculations use the loss function and provide the aleatoric uncertainty and the epistemic uncertainty that are correlated to the error of the predicted highest concentration of hydrocarbons within the geological formation and an actual highest concentration of hydrocarbons within the geological formation.
  15. The non-transitory computer-readable medium of claim 14, wherein the computer-executable instructions, when executed by the processing circuitry, cause the processing circuitry to perform operations further comprising training the neural network based on a result of the calculations using the loss function.
  16. The non-transitory computer-readable medium of claim 12, wherein the data set comprises a wireline scan of a geological formation, wherein the computer-executable instructions, when executed by the processing circuitry, cause the processing circuitry to perform operations comprising: analyzing, with the neural network, the data set to predict a geological parameter of the geological formation; and performing calculations with the neural network using the data set, wherein the calculations use the loss function and provide the aleatoric uncertainty and the epistemic uncertainty that are correlated to the error of the predicted geological parameter of the geological formation and an actual geological parameter of the geological formation.
  17. The non-transitory computer-readable medium of claim 16, wherein the computer-executable instructions, when executed by the processing circuitry, cause the processing circuitry to perform operations further comprising training the neural network based on a result of the calculations using the loss function.
  18. The non-transitory computer-readable medium of claim 12, wherein a first term of the loss function relates to an accuracy.
  19. The non-transitory computer-readable medium of claim 12, wherein a second term of the loss function prevents an assignment of uncertainty values for all data points.
  20. The non-transitory computer-readable medium of claim 12, wherein a third term of the loss function represents the aleatoric uncertainty.
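
The loss function recited in claims 1 and 12 can be sketched in code. The following is not part of the patent; it is a minimal NumPy illustration under the assumption that the network emits both a point prediction and a per-sample sigma, and the function and argument names (uncertainty_loss, alpha) are chosen for illustration only.

```python
import numpy as np

def uncertainty_loss(pred, sigma, target, alpha=1.0):
    """Loss of the form in claim 1, averaged over N samples:
    an accuracy term scaled by the predicted variance, a log-variance
    term that discourages assigning large sigma everywhere, and an
    alpha-weighted term tying sigma to the absolute prediction error."""
    err = pred - target
    accuracy = err ** 2 / sigma ** 2                  # first term: accuracy
    regularizer = np.log(sigma ** 2)                  # second term: log-variance penalty
    calibration = alpha * (sigma - np.abs(err)) ** 2  # third term: aleatoric calibration
    return np.mean(accuracy + regularizer + calibration)
```

Note that the third term vanishes exactly when sigma equals the absolute error, which is how the formulation encourages the predicted uncertainty to track the prediction error.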

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a National Stage Entry of International Application No. PCT/US2021/057066, filed Oct. 28, 2021, which claims priority to U.S. Provisional Patent Application No. 63/107,294, filed Oct. 29, 2020, the entirety of which is incorporated by reference.

FIELD OF THE DISCLOSURE

Aspects of the disclosure relate to computer programming. More specifically, aspects of the disclosure relate to cost function engineering using neural networks.

BACKGROUND

Deep learning techniques have been shown to provide uncertainties together with point estimates for various classification and regression problems. Enabling these uncertainties to be correlated and calibrated to the error, however, remains challenging. Different methods may be used. Methods such as Bayesian neural networks (BNN) and those using dropout for Bayesian inference have become very popular for providing the epistemic uncertainty, while the use of a log-likelihood loss function in artificial neural networks (ANN) has incorporated the aleatoric uncertainty. In practice, the uncertainties are often found to be poorly correlated with the absolute error, calling their value into question.

BNNs have traditionally been used for epistemic uncertainty estimation by determining the posterior weight distributions of the neural networks, providing both a mean and a variance of the predicted outputs using variational inference. Recently, dropout as a technique for approximating Bayesian inference has also been successfully introduced and has seen widespread uptake in the community. Because these methods are sensitive to the epistemic uncertainty, they have also been shown to be useful for differentiating in-distribution and out-of-distribution predictions. The aleatoric uncertainty, which originates from the data and is inherent to the measurement, can be either homoscedastic or heteroscedastic.
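
The dropout-based approximation to Bayesian inference referred to above can be sketched as follows. This example is not part of the patent; the toy one-hidden-layer network and all names in it are hypothetical, and the code only illustrates the general technique of keeping dropout active at prediction time and reading the sample variance of repeated forward passes as an epistemic uncertainty estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(net, x, n_samples=100, p_drop=0.1):
    """Monte Carlo dropout: run the stochastic network many times on the
    same input; the mean of the samples is the prediction and their
    variance approximates the epistemic uncertainty."""
    preds = np.array([net(x, p_drop) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.var(axis=0)

# Hypothetical one-hidden-layer regression network, for illustration only.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 1))

def forward(x, p_drop):
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop  # dropout stays active at test time
    h = h * mask / (1.0 - p_drop)        # inverted-dropout scaling
    return (h @ W2).ravel()
```

The spread of the sampled outputs grows for inputs the network has not seen, which is why this estimator is useful for flagging out-of-distribution predictions.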
Recently, conventional analysis has addressed the issue of sensitivity of the neural network model to the aleatoric uncertainty with a log-likelihood loss function. It has been shown that when such an approach is combined with dropout-based variational inference, sensitivity to both epistemic and aleatoric uncertainty can be achieved. However, the total uncertainty, which is the sum of the aleatoric and epistemic uncertainties, is not necessarily well correlated with the absolute error in the prediction, which makes reliance on these uncertainties questionable.

There is a need to provide an analysis technique that will provide aleatoric and epistemic uncertainty in analysis. There is a further need to take the aleatoric and epistemic uncertainty and correlate this uncertainty to the absolute error in predictions. There is a further need to provide analysis techniques that are more accurate than conventional analysis techniques.

SUMMARY

So that the manner in which the above recited features of the present disclosure can be understood in detail, a more particular description of the disclosure, briefly summarized below, may be had by reference to embodiments, some of which are illustrated in the drawings. It is to be noted that the drawings illustrate only typical embodiments of this disclosure and are therefore not to be considered limiting of its scope, for the disclosure may admit to other equally effective embodiments without specific recitation. Accordingly, the following summary provides just a few aspects of the description and should not be used to limit the described embodiments to a single concept.

In one example embodiment, a method for performing a calculation is disclosed. The method may comprise providing a neural network and accepting a data set into the neural network. The method may also comprise performing calculations with the neural network using the data set, wherein the calculations use a loss function and provide an aleatoric and an epistemic uncertainty correlated to a value, and displaying results of the calculations performed.

In another embodiment, a non-transitory computer readable medium storing a program causing a computer to execute a process is disclosed. The process performed may comprise accepting a data set into a neural network. The process may further comprise performing calculations with the neural network using the data set, wherein the calculations use a loss function and provide an aleatoric and an epistemic uncertainty correlated to a value, and displaying results of the calculations performed.

BRIEF DESCRIPTION OF THE DRAWINGS

So that the manner in which the above recited features of the present disclosure can be understood in detail, a more particular description of the disclosure, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this disclosure and are therefore not to be considered limiting of its scope, for th