Kriging by Example: Regression of oceanographic data
Paris Perdikaris, Brown University, Division of Applied Mathematics
January 23, 2015
Sea Grant College Program, Massachusetts Institute of Technology, Cambridge, MA
Outline
• Overview of Kriging
• An academic example in 1D
• Pirate’s Cove data-set:
  – Bathymetry
  – Plant height
Kriging
Model observed data Y(x) as a realization of a Gaussian process Z(x) up to measurement error E(x):
Y(x) = Z(x) + E(x)
Construction steps:
1. Assume covariance models Σ(θ), Σ_E(θ_E) parametrized by (θ, θ_E).
2. Explore spatial correlations in the observed data y and estimate the optimal kriging hyper-parameters (θ, θ_E) through optimization.
3. The conditional expectation E[Z|Y] provides a predictive scheme for estimating y at new locations x.
4. The conditional covariance Σ_{Z|Y} quantifies the uncertainty of the kriging predictor.
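For reference, the conditional expectation and conditional covariance in the construction steps can be written out explicitly. The form below assumes a zero-mean process, which the slides do not state. The symbols k(x) (the vector of covariances between Z(x) and Z at the observed locations) and k(x, x') (the prior covariance between two new locations) are notation introduced here for illustration.

```latex
\begin{aligned}
\mathbb{E}[Z(x) \mid Y = y] &= k(x)^{\top} \left(\Sigma + \Sigma_E\right)^{-1} y, \\
\Sigma_{Z \mid Y}(x, x') &= k(x, x') - k(x)^{\top} \left(\Sigma + \Sigma_E\right)^{-1} k(x').
\end{aligned}
```

Both expressions require solving a linear system with the N-by-N matrix Σ + Σ_E, where N is the number of observations.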
Some remarks:
• The choice of covariance model should reflect our prior belief about the nature of the data (smoothness, stationarity, anisotropy).
• Error models can capture the statistical behavior of the measurement noise.
• If E(x) is neglected then the kriging predictor interpolates the data.
• The computational cost is dominated by the optimization for learning the kriging hyper-parameters (typically scales as O(N^3) in the number of observations N).
• Matlab code is available.
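The remarks on interpolation and cost can be illustrated with a minimal NumPy sketch. It is not the Matlab code mentioned above. It assumes a zero-mean process, a squared-exponential covariance, i.i.d. noise, and a crude grid search over a single length scale in place of a proper optimizer; the function names, the data, and the small jitter term are all hypothetical choices made here.

```python
import numpy as np

def sq_exp_cov(xa, xb, variance, length_scale):
    """Squared-exponential covariance between 1D point sets (assumed model)."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def factorize(x, variance, length_scale, noise_var, jitter=1e-10):
    """Cholesky factor of Sigma + Sigma_E; this is the O(N^3) step."""
    K = sq_exp_cov(x, x, variance, length_scale)
    K += (noise_var + jitter) * np.eye(len(x))  # i.i.d. noise plus tiny jitter
    return np.linalg.cholesky(K)

def neg_log_likelihood(x, y, variance, length_scale, noise_var):
    """Negative log marginal likelihood of a zero-mean Gaussian process."""
    L = factorize(x, variance, length_scale, noise_var)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha + np.sum(np.log(np.diag(L)))
            + 0.5 * len(x) * np.log(2.0 * np.pi))

def krige(x, y, x_new, variance, length_scale, noise_var):
    """Conditional mean and variance of Z at x_new given observations y."""
    L = factorize(x, variance, length_scale, noise_var)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    k_new = sq_exp_cov(x, x_new, variance, length_scale)
    v = np.linalg.solve(L, k_new)
    return k_new.T @ alpha, variance - np.sum(v ** 2, axis=0)

# Hypothetical data: five samples of sin(2*pi*x) on [0, 1].
x = np.linspace(0.0, 1.0, 5)
y = np.sin(2.0 * np.pi * x)

# Crude hyper-parameter search: grid over the length scale only.
length_scales = np.linspace(0.05, 0.5, 10)
nll = [neg_log_likelihood(x, y, 1.0, ls, 0.0) for ls in length_scales]
best_ls = length_scales[int(np.argmin(nll))]

# With the noise term neglected, the predictor reproduces the data.
mean_exact, var_exact = krige(x, y, x, 1.0, best_ls, 0.0)
# With a noise variance of 0.1, it smooths instead of interpolating.
mean_noisy, var_noisy = krige(x, y, x, 1.0, best_ls, 0.1)
```

Every likelihood evaluation refactorizes the covariance matrix, which is why the hyper-parameter search dominates the cost as N grows.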
1D Example
f(x) = (6x − 2)^2 sin(12x − 4)
[Figure: plot of y against x for x from 0 to 1, with y ranging from −10 to 15]
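The test function of the 1D example is easy to reproduce. The short sketch below evaluates it on a uniform grid over [0, 1], the range shown on the x axis; the grid size is an arbitrary choice made here.

```python
import math

def f(x):
    """Test function from the slide: f(x) = (6x - 2)^2 sin(12x - 4)."""
    return (6.0 * x - 2.0) ** 2 * math.sin(12.0 * x - 4.0)

# Uniform grid over [0, 1]; 101 points is an arbitrary resolution.
n_grid = 101
xs = [i / (n_grid - 1) for i in range(n_grid)]
ys = [f(x) for x in xs]

# Range of the function on the grid, for comparison with the plot's y axis.
print(min(ys), max(ys))
```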
1D Example
f(x) = sin(x) + ϵ
[Figure: f(x), observations, and kriging mean plotted against x; axis ticks not recoverable]
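A noisy example of this kind can be sketched as follows. The slide gives only the model f(x) = sin(x) + ϵ, so everything else here is an assumption: the interval [0, 2π], 30 equispaced observations, a noise standard deviation of 0.2, a squared-exponential covariance, and fixed rather than optimized hyper-parameters.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Assumed setup: 30 equispaced noisy observations of sin(x) on [0, 2*pi].
n_obs, noise_std = 30, 0.2
x_obs = np.linspace(0.0, 2.0 * np.pi, n_obs)
y_obs = np.sin(x_obs) + noise_std * rng.standard_normal(n_obs)

def sq_exp_cov(xa, xb, variance=1.0, length_scale=1.0):
    """Squared-exponential covariance (assumed model, fixed hyper-parameters)."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

# Covariance of the observations: process covariance plus i.i.d. noise.
K = sq_exp_cov(x_obs, x_obs) + noise_std ** 2 * np.eye(n_obs)

x_new = np.linspace(0.0, 2.0 * np.pi, 50)
k_new = sq_exp_cov(x_obs, x_new)

# Conditional mean and variance of Z at the new locations.
kriging_mean = k_new.T @ np.linalg.solve(K, y_obs)
kriging_var = 1.0 - np.sum(k_new * np.linalg.solve(K, k_new), axis=0)

rmse = float(np.sqrt(np.mean((kriging_mean - np.sin(x_new)) ** 2)))
print(f"RMSE of kriging mean against sin(x): {rmse:.3f}")
```

Because the noise variance sits on the diagonal of K, the kriging mean smooths through the observations instead of passing through each one.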
Pirate’s Cove
[count illegible] depth measurements
[count illegible] plant height measurements
Mike Sacarny:
Bathymetry
[Figure: two panels over latitude 42.418 to 42.4205 and longitude −70.921 to −70.916. Left: Depth (m), color scale 0 to 7. Right: Predictor Uncertainty, Depth (m), color scale 0.5 to 2.5.]
Matern [order illegible] covariance, [count illegible] observations
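The order of the Matern covariance did not survive extraction, so the sketch below shows the two most common half-integer forms, 3/2 and 5/2, without claiming which one was used. The sample coordinates, the length scale, and the use of plain Euclidean distance in degrees (which ignores that a degree of longitude is shorter than a degree of latitude at this location) are hypothetical simplifications.

```python
import math

def matern32(r, variance=1.0, length_scale=1.0):
    """Matern covariance with smoothness 3/2 at distance r >= 0."""
    s = math.sqrt(3.0) * r / length_scale
    return variance * (1.0 + s) * math.exp(-s)

def matern52(r, variance=1.0, length_scale=1.0):
    """Matern covariance with smoothness 5/2 at distance r >= 0."""
    s = math.sqrt(5.0) * r / length_scale
    return variance * (1.0 + s + s * s / 3.0) * math.exp(-s)

# Two hypothetical sample locations as (latitude, longitude) in degrees.
p = (42.4185, -70.9190)
q = (42.4190, -70.9180)
r = math.hypot(p[0] - q[0], p[1] - q[1])

length_scale = 0.001  # degrees; hypothetical value
print(matern32(r, 1.0, length_scale), matern52(r, 1.0, length_scale))
```

Both functions equal the variance at zero distance and decay monotonically; the 5/2 form yields smoother sample paths than the 3/2 form.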
Bathymetry
10% measurement noise
[Figure: two panels over latitude 42.418 to 42.4205 and longitude −70.921 to −70.916. Left: Depth (m), color scale 1 to 7. Right: Predictor Uncertainty, Depth (m), color scale 0.1 to 0.7.]
Matern [order illegible] covariance, [count illegible] observations
Bathymetry
50% measurement noise
[Figure: two panels over latitude 42.418 to 42.4205 and longitude −70.921 to −70.916. Left: Depth (m), color scale 1 to 7. Right: Predictor Uncertainty, Depth (m), color scale 0.5 to 3.]
Matern [order illegible] covariance, [count illegible] observations
Bathymetry
Predictor accuracy
[Figure: three scatter panels, for 0% noise, 10% noise, and 50% noise; both axes are labelled y and span 1 to 7]
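Scatter plots of this kind are often summarized by a single number per panel. The slides show only the plots, so the metrics below (root-mean-square error and the coefficient of determination) are a suggestion made here, and the depth values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical held-out depths (m) and the corresponding kriging predictions.
depth_measured = [1.0, 2.0, 3.0, 4.0]
depth_predicted = [1.5, 2.0, 3.0, 3.5]

print(rmse(depth_measured, depth_predicted))
print(r_squared(depth_measured, depth_predicted))
```

A perfect predictor places every point on the diagonal of the scatter plot, giving an RMSE of 0 and an R^2 of 1.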
Bathymetry
Kriging prediction based on [count illegible] observations
Plant Height
0% measurement noise
[Figure: two panels over latitude 42.418 to 42.4205 and longitude −70.921 to −70.916, each with contour labels from 0 to 7 in steps of 0.5. Left panel, labelled Depth (m): color scale 0 to 1.4. Right panel, labelled Predictor Uncertainty, Depth (m): color scale 0.05 to 0.25.]
Matern [order illegible] covariance, [count illegible] observations
Plant Height
Kriging prediction based on [count illegible] observations (overlay with predicted bathymetry)
Summary
Kriging as a statistical regression tool for oceanographic data:
• Is built on exploring spatial correlations in the data.
• Provides a scheme for making predictions at new spatial locations.
• Quantifies the uncertainty of the predicted values.
Next steps:
• Co-kriging for exploring cross-correlations between different variables.
• Choose new sampling locations using the maximum expected improvement criterion.
• Extend formulation to spatio-temporal data-sets.
• Scale algorithms to very large data-sets.
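The expected improvement criterion named in the next steps has a standard closed form when the predictor is Gaussian. The sketch below uses the minimization convention, which is an assumption since the slides do not say whether the quantity of interest is minimized or maximized; the candidate locations, means, and standard deviations are hypothetical.

```python
import math

def expected_improvement(mean, std, y_best):
    """Expected improvement below y_best for a Gaussian predictor (minimization)."""
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(y_best - mean, 0.0)
    u = (y_best - mean) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return (y_best - mean) * cdf + std * pdf

# Hypothetical candidates: (location, kriging mean, kriging standard deviation).
candidates = [
    (0.1, 2.0, 0.1),
    (0.5, 1.2, 0.8),
    (0.9, 1.0, 0.0),
]
y_best = 1.0  # best (lowest) value observed so far; hypothetical

scores = [expected_improvement(m, s, y_best) for _, m, s in candidates]
next_x = candidates[scores.index(max(scores))][0]
print(next_x)
```

The criterion trades off a low predicted mean against a high predictor uncertainty, so the next sample tends to go where the kriging model is either promising or poorly informed.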