
Distance-based stochastic modeling: theory and applications

Jef Caers

Energy Resources Engineering Department, Stanford University, California, USA

Abstract

Central to traditional geostatistics is the quantification of spatial variability through a variogram or covariance model, often within a multi-Gaussian random field description. Since most practical applications work on a grid, the input of many algorithms is essentially a covariance table, in the theoretical limit a large N × N matrix, where N is the number of cells in the model grid. In most practical applications, N is larger than the number of realizations, L, generated. Covariance-based models are, however, limited in representing realistic spatial variability; hence, recently, multi-point geostatistical methods have been developed to better represent actual spatial variation. The multi-point approach has mostly relied on the construction of efficient algorithms whose application has been successful, but whose further progress may be hampered by the lack of a theoretical framework. In this regard, Markov random fields have yet to be proven practical and robust models in 3D. In this paper, a new type of random field model is proposed that is parameterized by means of the distance between any two outcome realizations of this random field model. The main idea is based on a simple duality between the covariance table calculated from a set of L realizations and the Euclidean distances between these realizations. Hence, instead of defining a random field in a very high-dimensional Cartesian space (dim = N), we define the random field in a much lower-dimensional and mathematically/computationally tractable metric space (dim ≤ L), since the number of realizations is expected to be much smaller than the number of grid cells. The classical Karhunen-Loeve expansion of a Gaussian random field, based on the eigenvalue decomposition of an N × N covariance table, can then be formulated as a function of the L × L Euclidean distance table. To achieve this, we construct an L × L kernel matrix using the classical radial basis function, which is a function of the Euclidean distance, and perform an eigenvalue decomposition of this kernel matrix. The generalization to non-Gaussian random fields is easily achieved by using distances other than the Euclidean distance. In fact, the distance chosen can be tailored to the particular application of the random field model at hand. It is shown how this modeling approach creates new avenues in spatial modeling, including the generation of a differentiable random field, a random field model for multiple-point simulations, the ability to simultaneously update multiple existing realizations with new data, a more realistic modeling of spatial uncertainty, and the fast and effective construction of prior model spaces for solving inverse problems.


Introduction

During the last decade, considerable progress has been made in generating geostatistical realizations (e.g. of a reservoir model) that are constrained to static data (e.g. well-logs and 3D seismic) and dynamic data (4D seismic and production) and that, at the same time, exhibit realistic spatial heterogeneity. The same algorithm/workflow is repeated several times to generate a desired number of realizations, and this set is then considered a discrete representation of uncertainty. Considerably less progress has been made on how to handle this potentially large set of realizations in terms of a transfer function (e.g. flow simulation). It is still (and probably will remain) infeasible to process all these realizations through a CPU-demanding transfer function, or to generate several hundred history-matched realizations. Moreover, data conditioning should not be considered a one-time deal: often, new data becomes available and models need to be updated, which in terms of the above modeling framework means that the entire process often needs to be repeated from scratch. In other words, while considerable advances have been made in geostatistical methods for reservoir modeling, much less progress has been made in using these geostatistical models for response uncertainty evaluation. Most importantly, the use of model uncertainty, model updating and uncertainty propagation, whether through a transfer function or within an optimization framework (e.g. closed-loop reservoir modeling, well placement problems), is still far from practical. The primary reason for this is that the traditional Monte Carlo framework works well within the geostatistical methodology; for reasons of CPU-time, however, it applies much less to modeling response uncertainty.

One should first notice that in reservoir modeling and simulation a disconnect exists between the geostatistical modeling step and the response evaluation step (flow simulation). Geostatistical models are simply considered as input to a flow simulator, and they are built possibly with little regard to the type of response or decision variable considered. The traditional approach to geostatistics calls for the formulation of a random function (RF), whether analytically defined (e.g. multi-Gaussian) or algorithmically defined (e.g. MPS), to generate such realizations, see Figure 1. Once all realizations are produced, they are evaluated through a response function, e.g. a flow simulator. The drawback is that the formulation of the RF is independent of the response; more precisely, it is not purpose-driven (see Figure 1). In addition, it would be difficult to update the model when new data becomes available; one would have to start almost from scratch. Also, a single RF with a fixed training image is not likely to adequately capture reservoir uncertainty.

In this paper we propose a new approach to random function modeling, one that is purpose-driven, allows the inclusion of many sources of uncertainty, and allows easy updating when new data becomes available. Figure 1 provides an intuitive overview. The distance-based approach starts by generating multiple realizations using any type of method, be it variogram-based, MPS or Boolean, or a mix of them. One may, for example, use multiple alternative training images. Next, a pair-wise distance between these realizations is defined. Most important is that one can (and should) define a distance that is related to


the difference in the target response function or decision variable. The better that relationship, the more effective the technique will be. In this paper, we establish a new methodology for defining a random function based on the existing realizations and their corresponding input distance table. It is termed a distance-based random function and allows generating any new realization from the existing set. Once that is possible, we show that tasks such as model updating and uncertainty quantification become much more efficient and effective. The aim of this paper is to establish the basic theory; the other papers in this report describe the various practical applications.

Figure 1: traditional formulation and use of a random function (RF) versus the proposed distance-based random functions. TF=transfer function (e.g. flow simulation).


Notation

The following notation will be used throughout the paper:

xi : a geostatistical realization on a grid, represented as a vector
L : total number of realizations generated
N : total number of grid cells; N = nx × ny × nz
X = (x1, x2,…,xL) : the realization matrix, or the set of all realizations
C : N × N spatial covariance table
y : a vector of independent random Gaussian deviates
D : Cartesian space
M : metric space
B : dot-product matrix
H : centering matrix
d : dimension of the mapping with MDS
1 : vector containing ones
I : identity matrix
Φ : matrix of realizations in feature space
F : feature space

More generally, matrices are capital italic, vectors are bold lowercase, and spaces are capital letters.


Distances and metric spaces

One of the most studied distances is the Euclidean distance, which, applied to a pair of realizations, is defined as

$$d_{ij} = \sqrt{(\mathbf{x}_i - \mathbf{x}_j)^T(\mathbf{x}_i - \mathbf{x}_j)}$$

In this paper, much of the theory will first be presented for Euclidean distances and then extended to other distances. Geostatistical realizations exist within a Cartesian space D of high dimension (namely the number of grid cells), where each axis represents one grid-cell value (whether discrete or continuous). A distance, such as the Euclidean distance, defines a metric space M, which is a space equipped only with a distance; it has no axes, origin or direction. This means that we cannot uniquely define the location of any x in this space, only how far each xi is from any other xj. Even though we cannot uniquely define locations for x in M, we can nevertheless present some mapping of these points. Indeed, knowing the distance table between a set of cities, we can always produce some form of 2D map of these cities, up to rotation, reflection and translation of the mapped city locations.

Figure 2: geometry explaining the relationship between Euclidean distance and dot-product

To construct such maps, we employ a traditional statistical technique termed multi-dimensional scaling (MDS). In MDS we rely on a duality between the dot-products of X and the Euclidean distances between pairs of realizations in X. Namely, see Figure 2, the Euclidean distance between xi and xj is

$$d_{ij}^2 = d_{ki}^2 + d_{kj}^2 - 2\,d_{ki}\,d_{kj}\cos\alpha$$

while the dot-product is defined as

$$b_{ij} = d_{ki}\,d_{kj}\cos\alpha$$

hence, the dot-product can be derived from Euclidean distances as

$$b_{ij} = \frac{1}{2}\left(d_{ki}^2 + d_{kj}^2 - d_{ij}^2\right)$$

If we now have L points instead of three, and we center the origin at the centroid of these points, then one can easily compute that

$$b_{ij} = -\frac{1}{2}\left(d_{ij}^2 - \frac{1}{L}\sum_{k=1}^{L} d_{ik}^2 - \frac{1}{L}\sum_{l=1}^{L} d_{lj}^2 + \frac{1}{L^2}\sum_{l=1}^{L}\sum_{k=1}^{L} d_{lk}^2\right) \qquad (1)$$

We can write the same equations in terms of matrices as follows. Construct a matrix A containing the elements

$$a_{ij} = -\frac{1}{2}\,d_{ij}^2$$

then center this matrix as follows

$$B = HAH \quad \text{with} \quad H = I - \frac{1}{L}\,\mathbf{1}^T\mathbf{1}$$

with 1 = [1 1 1 … 1] a row of L ones and I the identity matrix of dimension L. The centering reflects exactly the operation in equation (1), which consists of subtracting the row, column and matrix averages of A. One can now see that

$$B = (HX)(HX)^T \qquad (2)$$

Consider now the eigenvalue decomposition of B as

$$B = V\,\Lambda\,V^T$$

If N < L, then there are L − N zero eigenvalues; all other eigenvalues are positive since B is a real, symmetric, positive semi-definite matrix. In our case L << N; hence, apart from the single zero eigenvalue introduced by the centering, all eigenvalues are positive. We can now reconstruct X in any dimension, from a minimum of one dimension up to a maximum of L dimensions, by considering that

$$B = (HX)(HX)^T = V\Lambda V^T \;\Rightarrow\; \widehat{X} = V\,\Lambda^{1/2} \qquad (3a)$$

or, by retaining only the d largest eigenvalues,

$$\widehat{X}_d = V_d\,\Lambda_d^{1/2} \qquad (3b)$$

with Vd containing the eigenvectors belonging to the d largest eigenvalues contained in the diagonal matrix Λd. It is also clear from equation (3) that we can only recover the coordinates of X up to rotation, reflection and translation of the axis system. Indeed,


rotating or moving a map of points will not change their distances. In other words, multiplying X by any matrix S such that $S^TS = I$ will not change the solution. The solution retained by MDS is such that the mapped points have their centroid as origin, and the axes are chosen as the principal axes of X. Classical MDS was developed for Euclidean distances, but as an evident extension the same operations can be performed on any distance matrix. One cannot, however, guarantee that such a distance matrix will be positive definite; in that case, one needs to retain only the positive eigenvalues. In summary, MDS requires the following operations (a sketch follows the list):

1. Choose a dimension d for the map
2. Calculate a distance δij between any pair (xi, xj)
3. Calculate $a_{ij} = -\frac{1}{2}\,\delta_{ij}^2$
4. Center the matrix A into B = HAH (subtracting row, column and matrix averages)
5. Calculate the eigenvalue decomposition of B
6. Map X using $\widehat{X}_d = V_d\,\Lambda_d^{1/2}$
7. Plot $\widehat{X}_d$ if d = 2 or 3

To assess the accuracy of the mapping, one can scatter-plot the Euclidean distances between the coordinates in $\widehat{X}_d$ against the original distances δij.
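To make the recipe concrete, the following is a minimal Python/NumPy sketch of steps 3–6 (the function name mds_map and all variable names are illustrative, not part of any published implementation):

```python
import numpy as np

def mds_map(D, d=2):
    """Classical MDS, following steps 3-6 above.

    D : L x L matrix of pairwise distances.
    Returns L x d coordinates (columns = principal axes of the map).
    """
    L = D.shape[0]
    A = -0.5 * D**2                                 # step 3
    H = np.eye(L) - np.ones((L, L)) / L             # centering matrix
    B = H @ A @ H                                   # step 4: B = HAH
    eigval, eigvec = np.linalg.eigh(B)              # step 5 (ascending order)
    order = np.argsort(eigval)[::-1][:d]            # d largest eigenvalues
    keep = order[eigval[order] > 0]                 # retain positive eigenvalues only
    return eigvec[:, keep] * np.sqrt(eigval[keep])  # step 6: X_d = V_d Lambda_d^(1/2)

# four points on a line: their mutual distances map back to 1D-like coordinates
pts = np.array([0.0, 1.0, 3.0, 6.0])
D = np.abs(pts[:, None] - pts[None, :])
print(mds_map(D, d=2))
```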

Consider the following example, Figure 3. 1000 realizations are generated with sgsim using a certain anisotropic covariance model. The Euclidean distance is calculated between any two realizations, resulting in a 1000 × 1000 distance matrix. A 2D mapping of the realizations is retained, shown in Figure 3. As one notices, the scatter of realizations follows a nice 2D Gaussian shape. Consider now an alternative distance definition between the realizations. On the same realizations two wells are located, one in the SE corner and one in the NW corner. A measure of connectivity, as measured by streamlines, is calculated for each realization (see Park, 2008 on how exactly this is done). The distance is then simply the difference in connectivity between two realizations. Using this distance, we produce an equivalent 2D map of the realizations, see Figure 4. Note how different this map is from the map based on the Euclidean distance in Figure 3. Notably, two distinct groups are present. If we investigate the maps belonging to each group, see Figure 5, we note that the realizations of the left-most group are disconnected, while the realizations on the right are connected.


Figure 3: L = 1000 sgsim realizations. Each point in the 2D map represents a single realization.


Figure 4: 2D mapping based on connectivity distance

Figure 5: two groups of realizations: disconnected and connected ones.


A link between Euclidean distance and spatial covariance

In the previous section, we saw how a dot-product between realizations can be derived from a Euclidean distance matrix. In this section, we establish a relationship between the ensemble covariance matrix of a set of realizations and the Euclidean distances between these realizations. The (centered) ensemble covariance matrix of a set of L realizations is defined as

$$C = \frac{1}{L}\,X^T H X$$

where H is the centering matrix. Since $H^2 = H$, we can also write

$$L\,C = (HX)^T(HX)$$

Note how this is similar to the dot-product in (2), except for the placement of the transpose. Consider now the eigenvalue decomposition of B in equation (2); pre-multiplying with $(HX)^T$, we get

$$BV = V\Lambda \;\Rightarrow\; (HX)^T(HX)(HX)^T V = (HX)^T V\,\Lambda$$

If we set

$$W' = (HX)^T V$$

then

$$(HX)^T(HX)\,W' = W'\Lambda \;\Rightarrow\; L\,C\,W' = W'\Lambda \;\Rightarrow\; C\,W' = W'\,\frac{\Lambda}{L}$$

Finally, we take

$$W = W'\,\Lambda^{-1/2} \;\Rightarrow\; C\,W = W\,\frac{\Lambda}{L}$$

to ensure that $W^TW = I$. In other words, given the eigenvalue decomposition of B, with eigenvectors V and eigenvalue matrix Λ, we can obtain the eigenvalue decomposition of the ensemble covariance C, with eigenvectors and eigenvalues


$$W = (HX)^T V\,\Lambda^{-1/2}, \qquad \Lambda_C = \frac{\Lambda}{L} \qquad (4)$$

which means that we can reconstruct the matrix C from the eigen-decomposition of B. Note that since L << N, the cost of calculating the eigenvalue decomposition of B is much less than the cost of calculating the eigen-decomposition of C. To illustrate this property, consider the following simple example. We generate L = 100 realizations of N = 2 variables; they are standard Gaussian and correlated with correlation coefficient ρ = 0.6. The experimental covariance matrix is calculated as

$$C = \begin{pmatrix} 0.897 & 0.618 \\ 0.618 & 1.089 \end{pmatrix}$$

The 2D scatter of points is shown in Figure 6. Next, we calculate the Euclidean distance matrix based on these 100 realizations and perform MDS into a 2D map. The 2D map of points is shown in Figure 6. If we rotate this map by 45 degrees counterclockwise (to ensure that $W^TW = I$) and calculate the experimental covariance matrix, we obtain

$$\widehat{C}_{MDS} = \begin{pmatrix} 0.893 & 0.621 \\ 0.621 & 1.086 \end{pmatrix}$$

which is very close to the experimental covariance calculated directly from the realizations. Hence, we can use a Euclidean distance to calculate a spatial covariance.
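This duality is easy to verify numerically. The following minimal sketch (standard NumPy, illustrative names) rebuilds the eigen-decomposition of C from that of B via equation (4) and checks the reconstruction:

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 100, 2
X = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=L)  # rows = realizations

H = np.eye(L) - np.ones((L, L)) / L
B = (H @ X) @ (H @ X).T                    # L x L dot-product matrix, eq. (2)
C = X.T @ H @ X / L                        # N x N ensemble covariance

lam, V = np.linalg.eigh(B)
lam, V = lam[::-1][:N], V[:, ::-1][:, :N]  # the N non-zero eigenvalues of B

W = (H @ X).T @ V / np.sqrt(lam)           # eigenvectors of C, eq. (4)
lam_C = lam / L                            # eigenvalues of C, eq. (4)

print(np.allclose(W @ np.diag(lam_C) @ W.T, C))  # True: C is fully reconstructed
```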



Figure 6: (top) 100 correlated realizations in 2D; (bottom) 2D MDS map calculated with the Euclidean distances between the points of the top scatter-plot.


Karhunen-Loeve (KL) expansions of Gaussian random functions

In geostatistics, the spatial covariance is used to generate multiple realizations reflecting the spatial heterogeneity expressed in that covariance function. The most straightforward technique is the LU decomposition, where the model covariance matrix is decomposed into an upper (U) and a lower (L) triangular matrix

$$C = LU$$

Then, given a random vector of Gaussian deviates y, one can generate a correlated Gaussian vector x as

$$\mathbf{x} = \mathbf{m} + L\,\mathbf{y}$$

where m is the mean of the Gaussian field. The resulting realizations follow a multi-Gaussian distribution. The LU decomposition has fallen out of favor because the covariance matrix C is very large, namely of dimension N × N. An alternative to the LU decomposition is the so-called Karhunen-Loeve (KL) expansion, which relies on the eigen-decomposition of C instead. This expansion is used extensively in stochastic finite element modeling, particularly in the fields of civil and mechanical engineering. Since this expansion is less known in geostatistics, we shortly review its theory. Consider a stochastic process in nD, namely X(s), where s is the spatial coordinate. We can write any stochastic process using the following decomposition

$$X(s) = \sum_{k=1}^{\infty} Y_k\,w_k(s), \qquad s \in S \;(\text{domain in } n\text{D}) \qquad (5)$$

with $Y_k$ random coefficients and $w_k(s)$ deterministic functions.

For example, if the coefficients are random Gaussian variables and the functions are orthogonal, then X(s) is a Gaussian process, fully specified by its mean and covariance function. If we truncate the series in (5), then we approximate the process by the expansion

$$\widehat{X}(s) = \sum_{k=1}^{K} Y_k\,w_k(s), \qquad s \in S$$

In order to find the best approximation, we can minimize a least-squares criterion of the following type


$$\text{Find } Y_k,\,w_k(s) \text{ such that } \int_S \left(\widehat{X}(s) - X(s)\right)^2 ds \;\text{ is minimal} \qquad (6)$$

One can show mathematically that (6) is equivalent to stating that

$$\int_S Cov\!\left(X(s), X(s')\right)\,w_k(s')\,ds' = \lambda_k\,w_k(s) \qquad (7)$$

with

1) $\int_S w_j(s)\,w_k(s)\,ds = \delta_{jk}$
2) $E[\,Y_j\,Y_k\,] = \lambda_k\,\delta_{jk}$

In other words, the best approximation is found by taking the eigenvalue decomposition of the covariance function, with eigen-functions wk(s) (note that they are orthogonal) and eigenvalues equal to the variances of the random coefficients Yk. Moreover, one can verify that the following expression for X(s), namely

$$X(s) = \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,Y_k\,w_k(s)$$

satisfies the relations in (7), hence is a solution of the least-squares problem in (6). Hence, we can take the following truncated expansion, where the Yk are now standard Gaussian, as an approximation

$$\widehat{X}(s) = \sum_{k=1}^{K} \sqrt{\lambda_k}\,Y_k\,w_k(s) \qquad (7a)$$

If we discretize the stochastic process X on a grid and write realizations as vectors x, then the following steps are required to calculate a KL-expansion of a Gaussian random function (a sketch follows below):

1. Calculate the eigenvalue decomposition of C: $C = W\,\Lambda_C\,W^T$
2. Draw a random vector of standard Gaussian deviates y
3. Generate: $\mathbf{x} = W\,\Lambda_C^{1/2}\,\mathbf{y}$ (7b)

Note that for large covariance matrices C, the KL-expansion is as CPU-prohibitive as the LU decomposition. However, given the fact that we can reconstruct the eigenvalue decomposition of C from a Euclidean distance matrix, we can now apply the KL-expansion to generate Gaussian and non-Gaussian random fields, as elaborated in the next section.
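As an illustration of these three steps, the following minimal sketch generates one realization from a small covariance matrix; the exponential covariance used in the example and all names are illustrative assumptions, not taken from this paper:

```python
import numpy as np

def kl_realization(C, m, rng):
    """One realization x = m + W Lambda_C^(1/2) y, following steps 1-3 / eq. (7b)."""
    lam, W = np.linalg.eigh(C)            # step 1: eigen-decomposition of C
    lam = np.clip(lam, 0.0, None)         # guard against tiny negative eigenvalues
    y = rng.standard_normal(len(m))       # step 2: standard Gaussian deviates
    return m + W @ (np.sqrt(lam) * y)     # step 3: KL expansion

# illustrative example: exponential covariance on a 1D grid of 50 cells
s = np.arange(50)
C = np.exp(-np.abs(s[:, None] - s[None, :]) / 10.0)
x = kl_realization(C, m=np.zeros(50), rng=np.random.default_rng(1))
```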


Distance-based KL-expansions

Consider now the case where we have L realizations of a Gaussian random function, hence we have the matrix X. Then we can calculate the N × N centered ensemble covariance matrix C:

$$C = \frac{1}{L}\,X^T H X$$

The relation between the eigen-decomposition of the dot-product matrix B and the eigenvalue decomposition of C then allows applying the KL-expansion in terms of Euclidean distances instead of covariances; simply using (4) and (7b) we get

$$\mathbf{x} = \frac{1}{\sqrt{L}}\,(HX)^T V\,\mathbf{y}$$

where V and Λ are the eigenvector and eigenvalue matrices of B. One can simplify this equation to

$$\mathbf{x} = X^T\boldsymbol{\alpha} \quad \text{with} \quad \boldsymbol{\alpha} = \frac{1}{\sqrt{L}}\,H\,V\,\mathbf{y} \qquad (8)$$

and observe that a new Gaussian realization can be written as a linear combination of the existing Gaussian realizations contained in X. Figure 7 shows an example of how realizations can be generated using a Euclidean distance table. First, 1000 sgsim realizations are created as was done in Figure 3. The Euclidean distance table is then calculated from these realizations. Next, 1000 new realizations are generated, of which three are shown in Figure 7. The variogram model, as well as the ensemble variograms of both the 1000 original and the 1000 new models, is shown in Figure 7. As can be observed, we obtain a perfect reproduction of the variogram. To check the variability of the new realizations, we create a 2D MDS map and plot them on top of the original realizations. Figure 8 shows that the point distributions in these 2D maps are similar; hence the new realizations have a similar variability between them as the original ones. The KL-expansion remains faithful to the multi-Gaussian properties of the original realizations.
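Equation (8) translates directly into a few lines of code: a new realization is drawn without ever forming the N × N covariance matrix. A minimal sketch under the conventions above (X stores one realization per row; names are illustrative):

```python
import numpy as np

def distance_kl(X, rng):
    """New Gaussian realization as a linear combination of the existing ones:
    x_new = X^T alpha with alpha = H V y / sqrt(L), eq. (8)."""
    L = X.shape[0]
    H = np.eye(L) - np.ones((L, L)) / L
    B = (H @ X) @ (H @ X).T               # L x L dot-product matrix, eq. (2)
    _, V = np.linalg.eigh(B)              # eigenvectors of B
    y = rng.standard_normal(L)            # standard Gaussian deviates
    alpha = H @ V @ y / np.sqrt(L)        # weights on the existing realizations
    return X.T @ alpha                    # new (zero-mean) realization

# X would hold, e.g., 1000 sgsim realizations flattened to rows of length N
```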


Figure 7: (top) three example realizations generated with the KL-expansion of the Euclidean distance matrix; (bottom) ensemble variograms based on 1000 realizations (variogram model, experimental variogram of the initial realizations, experimental variogram of the new realizations).


Figure 8: blue dots: new realizations, red dots: original realizations

Next, we consider the extension of this framework to non-Euclidean distances and non-Gaussian realizations. This can be done by mapping the MDS points (of Figure 8, for example) into a new space termed feature space.

Feature spaces

Assume now that X contains a set of non-Gaussian realizations, for example a set of Boolean or multiple-point models. Moreover, a non-Euclidean distance has been defined between these realizations. How can we generate a new realization from this set of existing realizations, in a similar way as was done for Gaussian realizations with a given Euclidean distance between them? To achieve this, we rely on transforming the set of realizations X into a space in which Gaussian-type modeling becomes more appropriate. This transformation is likely non-trivial, involving a complex, multivariate (multi-point) function φ

$$\mathbf{x}_i \mapsto \varphi(\mathbf{x}_i)\,, \qquad X \mapsto \Phi$$

where the set of realizations X is transformed into a new set Φ. In fact, we can directly create a mapping from the Cartesian space created by MDS, namely (see (3b))

$$\hat{\mathbf{x}}_{d,i} \mapsto \varphi(\hat{\mathbf{x}}_{d,i})\,, \qquad \widehat{X}_d \mapsto \Phi$$


The latter mapping starts from a much lower-dimensional space, with dim = d, than the former (the space of realizations), with dim = N. Φ is the matrix containing the mapped realizations. An example of this transformation is shown in Figure 9. The space created by such a mapping is better known in machine learning as the feature space F. If such a φ can be found, then we can apply the Euclidean-based KL-expansion of (8) to the newly transformed set. Similar to needing the eigenvalue decomposition of the dot-product B, we now need the eigenvalue decomposition of the dot-product ΦΦ^T; hence, we do not need to calculate, know or specify Φ (or φ) itself. Note that the result of a dot-product is a single scalar value, hence choosing a dot-product is much easier than the difficult choice of φ. So, in traditional Gaussian terms we have the following sequence:

input Gaussian realizations xi → Euclidean distance / covariance model (δij, C) → dot-product between realizations bij (Eqn (1)) → new Gaussian realization x (KL-expansion)

which can be generalized as follows:

any input realizations xi → distance matrix (δij / higher-order statistics) → dot-product between realizations Kij = (φ(xi), φ(xj)) (kernels) → new Gaussian realization φ(x) (KL-expansion) → new non-Gaussian realization x (φ⁻¹)

In this generalization, two arrows remain unspecified: (1) how to specify a dot-product ΦΦ^T from a given non-Euclidean distance, and (2) how to calculate the back-transformation φ⁻¹. The latter problem is termed the pre-image problem. We first tackle the dot-product specification.



Figure 9: (left) 2D map obtained by MDS with the connectivity distance; each point is a realization, the color scale is the connectivity of each realization (red = high connectivity). (right) an example of realizations after mapping into feature space. Shown is the 2D projection (onto the first two eigenvectors of the Gram matrix) of the realizations mapped using a radial basis function.

A relationship between distances, dot-products and kernels

A dot-product defined on a vector space is a simple symmetric bilinear form

$$\langle \mathbf{x}_i, \mathbf{x}_j \rangle = \sum_{n=1}^{N} x_i[n]\;x_j[n]$$

that is strictly positive definite. Note that, contrary to a distance, the scalar or dot-product provides a measure of similarity. If we now consider the mapping of x into any feature space F, then we can define a similarity measure K from the dot-product,

$$K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) = \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle$$

Hence, choosing the function φ is similar to choosing the dot-product on φ, which is similar to choosing the similarity measure K. Note that the matrix K constructed from the elements Kij is also termed the Gram matrix. A function k(·,·) that constitutes a dot-product in the feature space of φ is better known in machine learning as a kernel function. Hence, the remaining issue is the choice of the kernel function. The literature offers many choices of kernel functions, but it would make sense in terms of the KL-theory presented here to make the kernel function a function of the input non-Euclidean distance. In that case, the Gaussian radial basis function (RBF) comes to mind, which is given by


$$K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) = \exp\!\left(-\,\frac{\delta_{ij}^2}{\sigma}\right) \qquad (9)$$

with δij the traditional Euclidean distance and σ a bandwidth parameter. Using MDS, the input non-Euclidean distance can be transformed into an approximating Euclidean distance, as explained above. An example is shown in Figure 9. 1000 sgsim realizations were mapped into 2D Cartesian space (same as Figure 4). Then the Gram matrix K (with radial basis kernel) was calculated using the Euclidean distances between these points. The eigenvalue decomposition of K was calculated:

$$K = V_\varphi\,\Lambda_\varphi\,V_\varphi^T$$

Shown on the right of Figure 9 are the 2D projections $\widehat{\Phi}_d = V_{\varphi,d}\,\Lambda_{\varphi,d}^{1/2}$ (d = 2) of the points in feature space. Note how the disorganized cloud of points in Cartesian space D has become a set of nicely co-linear points in feature space, allowing easier modeling of the variability of the points in feature space. In actual applications, the dimension of the feature space will be taken much higher, in fact much higher than that of the Cartesian space of the points. More importantly, it is this increase in dimension that makes the data look more linear/Gaussian. While our research has, so far, successfully relied on this kernel function, it should be further investigated why this kernel is successful within this framework. Several properties of the RBF kernel are worth mentioning:

$$k(\mathbf{x}_i, \mathbf{x}_j) > 0 \ \text{ and } \ k(\mathbf{x}_i, \mathbf{x}_i) = 1; \qquad \text{isotropy: } k(\mathbf{x}_i, \mathbf{x}_j) = k(\delta_{ij})$$

as is the relationship between the Euclidean distance between φ(xi) and φ(xj) and the Euclidean distance between xi and xj:

$$\delta^2\!\left(\varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j)\right) = K_{ii} + K_{jj} - 2\,K_{ij} \qquad (9a)$$

Also of interest is that if all xi are distinct realizations, then the Gram matrix K has full rank; hence none of its eigenvalues will be zero, in fact they will all be positive. This also means that the vectors φ(x1), φ(x2),…,φ(xL) are linearly independent. The Gaussian RBF does not place restrictions on the number of training examples and produces a feature space of potentially infinite dimension.
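A minimal sketch of building the Gram matrix of equation (9) from a set of mapped points and checking the properties just listed (the bandwidth σ must be supplied by the user; all names are illustrative):

```python
import numpy as np

def rbf_gram(Xd, sigma):
    """Gram matrix K_ij = exp(-delta_ij^2 / sigma) from mapped points Xd (L x d)."""
    sq = np.sum(Xd**2, axis=1)
    delta2 = sq[:, None] + sq[None, :] - 2 * Xd @ Xd.T   # squared Euclidean distances
    return np.exp(-np.clip(delta2, 0.0, None) / sigma)

Xd = np.random.default_rng(2).standard_normal((5, 2))
K = rbf_gram(Xd, sigma=1.0)
print(np.allclose(np.diag(K), 1.0))          # k(x_i, x_i) = 1
print(np.all(np.linalg.eigvalsh(K) > 0))     # full rank: all eigenvalues positive
```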


Non-Euclidean KL-expansions

The KL-expansion in the non-Euclidean, non-Gaussian case is based on the eigenvalue decomposition of the Gram matrix K instead of the dot-product matrix B. If the RBF kernel is used, then all eigenvalues are strictly positive, which is a requirement for calculating the KL-expansion. We start by calculating the eigen-decomposition of K

$$K = V_\varphi\,\Lambda_\varphi\,V_\varphi^T$$

In order to perform a KL-expansion in feature space, we need the eigenvalue decomposition of the covariance of the mapped realizations φ(xi)

$$C_\varphi = \frac{1}{L}\,\Phi^T\Phi = V_{C,\varphi}\,\Lambda_{C,\varphi}\,V_{C,\varphi}^T \;\Rightarrow\; \varphi(\mathbf{x}) = V_{C,\varphi}\,\Lambda_{C,\varphi}^{1/2}\,\mathbf{y} \qquad (10)$$

where y is a vector of standard Gaussian deviates. Using the duality between covariance and distance (in this case the dot-product K), we know there exists a relationship between the eigen-decompositions of C_φ and K, namely

$$\Lambda_{C,\varphi} = \frac{1}{L}\,\Lambda_\varphi\,, \qquad V_{C,\varphi} = \Phi^T V_\varphi\,\Lambda_\varphi^{-1/2} \qquad (11)$$

Writing (10) for the full set of mapped initial realizations, with Y the matrix of the corresponding standard Gaussian vectors, substituting (11) and multiplying with Φ gives

$$\Phi^T = \frac{1}{\sqrt{L}}\,\Phi^T V_\varphi\,Y \;\Rightarrow\; K = \frac{1}{\sqrt{L}}\,K\,V_\varphi\,Y \qquad (12)$$

Inverting K, we get

$$\frac{1}{\sqrt{L}}\,V_\varphi\,Y = I \;\Rightarrow\; Y = \sqrt{L}\,V_\varphi^T \qquad (13)$$

This allows calculating the standard Gaussian vectors contained in Y that correspond to the initial realizations contained in X. In summary, we have the following workflow for generating a stochastic realization from a non-Euclidean KL-expansion of non-Gaussian realizations (a sketch of steps 4–7 follows the list):

1. Generate a set of realizations xi with any stochastic model or algorithm.
2. Define a (non-Euclidean) distance between a pair of realizations.
3. Calculate the distance matrix of the set of realizations.
4. Use MDS to map the realizations xi into $\hat{\mathbf{x}}_{d,i}$. Calculate the Euclidean distances between the $\hat{\mathbf{x}}_{d,i}$.
5. From these Euclidean distances, calculate the Gram matrix K using the radial basis function.
6. Calculate the eigenvalue decomposition $K = V_\varphi\,\Lambda_\varphi\,V_\varphi^T$.
7. Generate a new realization $\varphi(\hat{\mathbf{x}}_{d,new})$ (in feature space) using the KL-expansion based on K.
8. Calculate the pre-image $\hat{\mathbf{x}}_{d,new}$ in Cartesian space D.
9. Calculate the realization $\mathbf{x}_{new}$ corresponding to the pre-image $\hat{\mathbf{x}}_{d,new}$.
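A minimal sketch of steps 4–7, reusing the illustrative mds_map and rbf_gram helpers sketched earlier (steps 8–9, the pre-image problem, are treated next; all names are assumptions):

```python
import numpy as np

def feature_space_sample(D, d, sigma, rng):
    """Steps 4-7: MDS map, RBF Gram matrix, and the weight vector alpha that
    defines a new feature-space realization phi(x_new) = Phi^T alpha."""
    L = D.shape[0]
    Xd = mds_map(D, d)                    # step 4: L x d MDS coordinates
    K = rbf_gram(Xd, sigma)               # step 5: Gram matrix, eq. (9)
    lam, V = np.linalg.eigh(K)            # step 6: RBF kernel -> all lam > 0
    y = rng.standard_normal(L)            # standard Gaussian deviates
    alpha = V @ y / np.sqrt(L)            # step 7: KL weights on Phi
    return Xd, K, alpha
```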

What remains is the problem of calculating $\hat{\mathbf{x}}_{d,new}$ and $\mathbf{x}_{new}$, in other words calculating φ⁻¹: the pre-image or back-transformation problem.

The pre-image problem

The inverse φ⁻¹ is not known explicitly. Moreover, the inverse of this complex multivariate function does not provide a unique solution, i.e. there will be multiple $\hat{\mathbf{x}}_{d,new}$ that map into the same $\varphi(\hat{\mathbf{x}}_{d,new})$. As a consequence, we treat the pre-image problem as an inverse problem. Any ill-posed inverse problem has many solutions. In such a case, one can attempt to find a single best solution (e.g. a least-squares solution), or one can attempt to find one that also shares other properties, such as a certain type of geological continuity (e.g. as quantified in a variogram or training image). To solve the inverse problem, we start by formulating it as an optimization problem, namely

$$\hat{\mathbf{x}}_{d,new} = \arg\min_{\hat{\mathbf{x}}_d}\,\left\| \varphi(\hat{\mathbf{x}}_d) - \Phi^T\boldsymbol{\alpha} \right\|^2 \qquad (14)$$

Recall that $\Phi^T = [\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2),\ldots,\varphi(\mathbf{x}_L)]$ is the matrix of mapped realizations (one per column) and $\Phi^T\boldsymbol{\alpha}$ is the KL-expansion in feature space, with

$$\boldsymbol{\alpha} = \frac{1}{\sqrt{L}}\,V_\varphi\,\mathbf{y}$$

with y a vector of standard Gaussian deviates. Using equation (9a), which expresses the relationship between Euclidean distances in feature space and Cartesian space, (14) is equivalent to


$$\hat{\mathbf{x}}_{d,new} = \arg\min_{\hat{\mathbf{x}}_d}\left( \varphi(\hat{\mathbf{x}}_d)^T\varphi(\hat{\mathbf{x}}_d) \;-\; 2\,\varphi(\hat{\mathbf{x}}_d)^T\,\Phi^T\boldsymbol{\alpha} \;+\; \boldsymbol{\alpha}^T K\,\boldsymbol{\alpha} \right) \qquad (15)$$

Note that this minimization problem contains only dot-products of φ; it no longer contains single terms in the unknown φ. As an example, Figure 10 shows four realizations out of a total of 300 generated with snesim using the same training image. Figure 11 shows the same realizations mapped in 2D using a connectivity distance. A sample y is generated in feature space and the objective function (15) is calculated for an exhaustive set of $\hat{\mathbf{x}}_d$ vectors (in 2D).

Figure 10: four realizations of snesim with the training image below

In order to find a minimum, we set the gradient to zero, which is equivalent to solving a so-called fixed-point algorithm


$$\hat{\mathbf{x}}_{d,new} = \frac{\displaystyle\sum_{i=1}^{L} \alpha_i\,k'\!\left(\hat{\mathbf{x}}_{d,new}, \hat{\mathbf{x}}_{d,i}\right)\hat{\mathbf{x}}_{d,i}}{\displaystyle\sum_{i=1}^{L} \alpha_i\,k'\!\left(\hat{\mathbf{x}}_{d,new}, \hat{\mathbf{x}}_{d,i}\right)} = \sum_{i=1}^{L} \beta_i\,\hat{\mathbf{x}}_{d,i}$$

with k′ the derivative of k. Note that the new point in Cartesian space is a non-linear combination (with weights βi) of the existing points. In order to obtain an actual reservoir model, we have two options. The first option is to apply the same weights to the existing realizations:

$$\mathbf{x}_{new} = \sum_{i=1}^{L} \beta_i\,\mathbf{x}_i$$

This results in the model shown in Figure 11. It makes sense because the distance between the points in the lower-dimensional Cartesian space is the same as the distance between the realizations: realizations "close" in similarity to the new realization get more weight. However, the resulting realization lacks geological realism, although a vague channel structure is visible. Again, this makes sense, because a non-linear combination of binary images is no longer binary. We can solve the same optimization problem under geological constraints. To do so, we use the probability perturbation technique, which guarantees that the solution has similar geological properties as those of the original models. Figure 11 provides the result. Clearly, the channel structure is now visible.
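For the RBF kernel, whose derivative is again an exponential in the squared distance, the fixed-point iteration and the first (unconstrained) option can be sketched as follows (starting point and convergence handling are deliberately simplified; all names are illustrative):

```python
import numpy as np

def preimage_fixed_point(Xd, alpha, sigma, n_iter=100):
    """Fixed-point iteration for the pre-image of Phi^T alpha under the
    RBF kernel: beta_i is proportional to alpha_i * k'(x_hat, x_d,i)."""
    x_hat = Xd.mean(axis=0)                   # simple starting point
    for _ in range(n_iter):
        delta2 = np.sum((Xd - x_hat)**2, axis=1)
        w = alpha * np.exp(-delta2 / sigma)   # alpha_i k'(...) up to a constant factor
        beta = w / w.sum()                    # normalized weights
        x_hat = beta @ Xd                     # x_hat = sum_i beta_i x_d,i
    return x_hat, beta

# First (unconstrained) option: apply the same weights to the realizations,
# x_new = beta @ X, with X holding the L realizations as rows.
```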



Figure 11: (top) mapping of realizations in 2D space using MDS; (bottom) objective function in 2D, with two realizations xnew, one geologically constrained and one not.


Applications

Four papers in this report make extensive use of distance-based stochastic modeling, one of which involves a large 3D real field application. Their diversity of application testifies to the potentially wide applicability of this new modeling approach. The details of these works can be found in those papers; here, we simply summarize the results.

Example 1: Uncertainty quantification

The paper of Celine Scheidt tackles the problem of quantifying the uncertainty in dynamic (production) response from a given geostatistical model. In geostatistical modeling, various sources of uncertainty are addressed: first and foremost, the uncertainty in geological scenario, as expressed by different training images in mp-geostatistics; next, individual global parameters such as NTG, object geometries, etc., which may be uncertain as well; finally, for a given training image and fixed global parameters, one may generate several realizations (spatial uncertainty). 72 realizations of a West-Africa turbidite reservoir are generated by varying the training image and NTG, see Figure 12. For each training image, two realizations were generated. In this reservoir there are 28 wells, of which 8 are injectors and 20 producers. As distance between the reservoir models we use a streamline-based distance that requires only minutes to evaluate; a full flow simulation takes several hours. K-means clustering is applied in the feature space and 8 cluster centers are retained. After solving the pre-image problem, these 8 cluster centers coincide with 8 realizations out of the total of 72. The P10, P50 and P90 field and individual well production responses are calculated from the 8 realizations, as well as from the total set, in order to assess the accuracy of the quantiles obtained. As shown for an example of the water-cut response of a single production well in Figure 12, we obtain an accurate assessment of uncertainty with only 8 flow simulations. The corresponding 2D map of the 72 realizations, as well as of the 8 selected realizations, is shown in Figure 12.

Example 2: Model updating

The paper of Kwangwon Park tackles the problem of model updating, i.e. how can we jointly update a set of realizations with new production data. This problem is quite relevant in the arena of "Smartfields" or "e-fields", where the goal is to update models in real time as well as to optimize production (from new wells or existing ones) based on the newly updated reservoir models. The traditional approach to history matching, namely to update one model at a time, would be too CPU-intensive; hence, a different approach is called for. Recently, the so-called Ensemble Kalman Filter (EnKF) has gained popularity for updating multiple realizations jointly. The EnKF is essentially a prediction-error filter. It takes any new production data at each time step in a sequential fashion and updates the model per time step. This update is based on the error calculated between the flow simulation evaluated on all realizations at a certain time step and the field data. Since multiple errors from


multiple realizations are calculated, one can derive a so-called gain function that quantifies how much updating each realization should receive. This type of filtering is known to work well under Gaussian model assumptions (multi-Gaussian assumptions for permeability) as well as for linear or pseudo-linear processes. In this paper we provide several solutions to the limitations of the EnKF. First of all, we define a distance between the realizations that is correlated with the target difference in dynamic data. This allows us to map the realizations in a lower-dimensional space. Using kernels, the KL-expansion of these mapped realizations can be formulated in feature space. Note that the KL-expansion relies on standard Gaussian random variables; it is to these standard Gaussian variables that we apply the EnKF, while, through solving the pre-image problem, the actual realizations need not be standard Gaussian. With a simple example we show how, with this error-prediction filter, we can match L realizations to production data with only L flow simulations; no iteration is required. The distance also allows us to reduce the ensemble size. The MDS procedure allows analyzing and monitoring the effect the error-prediction filter has on the realizations, moving them from prior models to posterior models. Figure 13 shows a successful history match up to a pre-defined error.

Example 3: Pattern classification and generation

In the paper of Mehrdad Honarkhah, the distance-based modeling approach is applied to a completely different problem, illustrating its versatility. Namely, the approach is used to model the variability of patterns extracted from the training image and provides a model to generate new patterns from the existing ones. Traditional MPS algorithms such as snesim generate geologically consistent realizations by using training images to obtain the distributional properties and probabilities needed in a stochastic simulation framework. In an attempt to improve on pattern reproduction accuracy, pattern-based geostatistical algorithms such as simpat and filtersim came along, in which patterns, instead of covariances or probabilities, are used in an image construction process in order to generate geologically realistic models. In these approaches, the training image, which represents the spatial continuity of the subsurface structure, is used to construct a pattern database; subsequently, the sequential simulation is carried out by selecting a pattern from that database and pasting it onto the simulation grid. One of the shortcomings of the present algorithms is the lack of randomness, in other words, the strong similarity between the generated realizations and the input training image patterns. Hence, the main issue for both simpat and filtersim is that they draw from the same limited pattern database; no interpolation or extrapolation is done. In this study, a distance-based approach is taken towards pattern analysis and generation. The main purpose is to introduce a better pattern classification algorithm by using distance methods. Additionally, the study also deals with the shortcomings of the previous algorithms, i.e., it generates new patterns from the same training image using the existing pattern database. This can be done in a way similar to generating new realizations from existing ones.



Figure 12: (top) several alternative training images; (left) 2D map of the 72 realizations mapped using a streamline-based distance, with the 8 selected realizations highlighted; (right) uncertainty quantification: P10, P50 and P90 of cumulative oil production (MSTB) versus time (days) from the exhaustive set of 72 realizations and from the 8 realizations selected with the proposed method.


Figure 13: overview of the EnKF in distance-kernel space. (top) The EnKF is a prediction-error filter that takes the error evaluated from several realizations and uses it to update these realizations one datum (production response at each time-step) at a time. In terms of mapping (middle figure), this means that realizations in 2D space are moved from one location to another. If this process is repeated for all time-steps, then a successful history match is achieved.


Acknowledgements

I would like to acknowledge the contributions of Kwangwon and Celine in helping to construct the examples for this paper.

References

Borg, I. and Groenen, P., 1997. Modern Multidimensional Scaling: Theory and Applications. Springer, New York.

Caers, J., 2003. History matching under a training image-based geological model constraint. SPE Journal, 218-226.

Davis, M.W., 1987. Production of conditional simulations via the LU triangular decomposition of the covariance matrix. Mathematical Geology 19, 91-98.

Evensen, G., 2003. The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dynamics 53, 343.

Evensen, G., 2004. Sampling strategies and square root analysis schemes for the EnKF. Ocean Dynamics 54, 539.

Hoffman, B.T., 2005. Geologically consistent history matching while perturbing facies. PhD thesis, Stanford University.

Houtekamer, P.L. and Mitchell, H.L., 1998. Data assimilation using an ensemble Kalman filter technique. Monthly Weather Review 126, 796.

Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82, 35.

MacKay, D.J.C., 2003. Information Theory, Inference, and Learning Algorithms. Cambridge University Press.

Park, K. and Caers, J., 2007. History matching in low-dimensional connectivity-vector space. SCRF report 20, Stanford University.

Sarma, P., 2006. Efficient closed-loop optimal control of petroleum reservoirs under uncertainty. PhD dissertation, Stanford University.

Sarma, P., Durlofsky, L.J., Aziz, K. and Chen, W.H., 2007. A new approach to automatic history matching using kernel PCA. SPE Reservoir Simulation Symposium, Houston, Texas, USA, SPE 106176.

Scheidt, C. and Caers, J., 2007. A workflow for spatial uncertainty quantification using distances and kernels. SCRF report 20, Stanford University.

Schölkopf, B. and Smola, A., 2002. Learning with Kernels. MIT Press, Cambridge, MA.


Shawe-Taylor, J. and Cristianini, N., 2004. Kernel Methods for Pattern Analysis. Cambridge University Press, 462 p.

Strebelle, S., 2002. Conditional simulation of complex geological structures using multiple-point statistics. Mathematical Geology 34, 1-22.

Suzuki, S. and Caers, J., 2006. History matching with an uncertain geological scenario. SPE Annual Technical Conference and Exhibition, SPE 102154.