Deep Gaussian Processes: Theory and Applications
Petar M. Djuric
Bellairs Workshop, Barbados
February 13, 2019
Outline
- Introduction
- Gaussian processes
- Deep Gaussian processes
- Applications
- Conclusions
Introduction
- Probabilistic modeling allows for representing and modifying uncertainty about models and predictions.
- This is done according to well-defined rules.
- Probabilistic modeling has a central role in machine learning, cognitive science, and artificial intelligence.
The Concept of Uncertainty
- Learning and intelligence depend on the amount of uncertainty in the information extracted from data.
- Probability theory is the main framework for handling uncertainty.
- Interestingly, in the recent progress of deep learning with deep neural networks, which are based on learning from huge amounts of data, the concept of uncertainty is somewhat bypassed.
- In the years to come, we will see further advances in artificial intelligence and machine learning within the probabilistic framework.
The Role of a Model
- To make inferences from data, one needs models.
- Models can be simple (like linear models) or highly complex (like large and deep neural networks).
- In most settings, the models must be able to make predictions.
- Uncertainty plays a fundamental role in modeling observed data and in interpreting model parameters, the results of models, and the correctness of models.
Learning

- Probability distributions are used to represent uncertainty.
- Learning from data occurs by transforming prior distributions (defined before seeing the data) into posterior distributions (after seeing the data).
- The optimal transformation, from an information-theoretic point of view, is Bayes' rule.
- The beauty of the approach is the simplicity of the Bayes mechanism.
Gaussian Process Regression

- Essentially, a GP can be seen as the distribution of a real-valued function $f(\mathbf{x})$,
$$f(\mathbf{x}) \sim \mathcal{GP}\big(m(\mathbf{x}),\, k_f(\mathbf{x}_i, \mathbf{x}_j)\big)$$
- Some assumptions are often made when using GP regression:
  1. the mean function $m(\mathbf{x}) = 0$, for simplicity, and
  2. the observation noise is additive white Gaussian noise, for tractability.
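To make the "distribution over functions" view concrete, here is a minimal numpy sketch that draws sample paths from a zero-mean GP prior with an SE covariance. The kernel choice, hyper-parameter values, and input grid are illustrative assumptions.

```python
import numpy as np

def se_kernel(xa, xb, sigma_f=1.0, ell=1.0):
    """Squared-exponential covariance k(x_i, x_j) for 1-D inputs."""
    d = xa[:, None] - xb[None, :]
    return sigma_f**2 * np.exp(-0.5 * (d / ell) ** 2)

x_star = np.linspace(0.0, 2.0 * np.pi, 100)          # test inputs
K = se_kernel(x_star, x_star) + 1e-10 * np.eye(100)  # jitter for numerical stability
rng = np.random.default_rng(0)
# Each row of f_samples is one function drawn from the zero-mean GP prior.
f_samples = rng.multivariate_normal(np.zeros(100), K, size=3)
```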
Gaussian Process Regression (contd.)

Let $\mathbf{X} = \{\mathbf{x}_i\}_{i=1}^N$ and $\mathbf{y}$ denote the collection of all input vectors and all observations, respectively. With the above assumptions,
$$\mathbf{y} = \mathbf{f}(\mathbf{X}) + \boldsymbol{\varepsilon},$$
where $\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \sigma_\varepsilon^2 \mathbf{I})$. We also have

- Likelihood: $p(\mathbf{y}|\mathbf{f}) = \mathcal{N}(\mathbf{y}|\mathbf{f}, \sigma_\varepsilon^2 \mathbf{I})$, and
- Prior: $p(\mathbf{f}|\mathbf{X}, \boldsymbol{\theta}) = \mathcal{N}(\mathbf{f}|\mathbf{0}, \mathbf{K}_{ff})$, where $\mathbf{K}_{ff} = k_f(\mathbf{X}, \mathbf{X})$ and $\boldsymbol{\theta}$ denotes the hyper-parameters of the covariance function.
Gaussian Process Regression (contd.)

The hyper-parameters $\boldsymbol{\theta}$ can be learned from the training data $\{\mathbf{X}, \mathbf{y}\}$ by maximizing the log-marginal-likelihood.

- Log-marginal-likelihood:
$$\log p(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta}) = \log \mathcal{N}(\mathbf{y}|\mathbf{0}, \mathbf{K}_{ff} + \sigma_\varepsilon^2 \mathbf{I}) = \log \mathcal{N}(\mathbf{y}|\mathbf{0}, \mathbf{K}) = -\frac{1}{2}\mathbf{y}^T\mathbf{K}^{-1}\mathbf{y} - \frac{1}{2}\log|\mathbf{K}| - \frac{N}{2}\log 2\pi$$
- Occam's razor is embedded in the model.
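This expression translates directly to code. A minimal numpy sketch, assuming the kernel matrix $\mathbf{K}_{ff}$ has already been evaluated at the training inputs (the Cholesky-based solves are a standard numerical choice, not something prescribed here):

```python
import numpy as np

def log_marginal_likelihood(y, Kff, sigma_eps2):
    """log N(y | 0, K) with K = Kff + sigma_eps^2 * I."""
    N = y.shape[0]
    L = np.linalg.cholesky(Kff + sigma_eps2 * np.eye(N))  # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # alpha = K^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # log |K|
    return -0.5 * y @ alpha - 0.5 * log_det - 0.5 * N * np.log(2.0 * np.pi)
```

In practice, $\boldsymbol{\theta}$ would be learned by maximizing this quantity, for example by running a gradient-based optimizer on its negative.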
Gaussian Process Regression (contd.)

Let $\mathbf{X}_*$ and $\mathbf{f}_*$ denote the collection of test inputs and the corresponding latent function values, respectively. Then we can express the predictive posterior as

- Predictive posterior: $p(\mathbf{f}_*|\mathbf{X}_*, \mathbf{X}, \mathbf{y}, \boldsymbol{\theta}) = \mathcal{N}\big(\mathbf{f}_*|\mathbb{E}(\mathbf{f}_*), \mathrm{cov}(\mathbf{f}_*)\big)$
$$\mathbb{E}(\mathbf{f}_*) = \mathbf{K}_f(\mathbf{X}_*, \mathbf{X})\,\mathbf{K}^{-1}\mathbf{y}$$
$$\mathrm{cov}(\mathbf{f}_*) = \mathbf{K}_f(\mathbf{X}_*, \mathbf{X}_*) - \mathbf{K}_f(\mathbf{X}_*, \mathbf{X})\,\mathbf{K}^{-1}\,\mathbf{K}_f(\mathbf{X}_*, \mathbf{X})^T$$
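These two formulas transcribe directly into numpy; a minimal sketch, with the explicit inverse replaced by Cholesky solves (an implementation choice):

```python
import numpy as np

def gp_predict(K_xx, K_sx, K_ss, y, sigma_eps2):
    """Predictive mean E(f*) and covariance cov(f*).

    K_xx = k_f(X, X), K_sx = k_f(X*, X), K_ss = k_f(X*, X*).
    """
    K = K_xx + sigma_eps2 * np.eye(K_xx.shape[0])
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    mean = K_sx @ alpha                                  # E(f*)
    V = np.linalg.solve(L, K_sx.T)                       # L^{-1} k_f(X, X*)
    cov = K_ss - V.T @ V                                 # K_ss - K_sx K^{-1} K_sx^T
    return mean, cov
```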
Covariance Function
- For example: the radial basis function (RBF), or squared exponential (SE), covariance. One-dimensional form:
$$k_{\mathrm{rbf}}(x_i, x_j) = \sigma_f^2 \exp\!\left(-\frac{(x_i - x_j)^2}{2\ell^2}\right)$$
- $\sigma_f^2$ measures the strength of the signal; the ratio $\sigma_f^2/\sigma_\varepsilon^2$ is equivalent to a signal-to-noise ratio (SNR).
- The characteristic length scale $\ell$ encodes the model complexity in that dimension.
- $r = 1/\ell$ measures the relevance of that dimension.
- Automatic relevance determination (ARD)
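In $d$ dimensions, giving each dimension its own length scale yields ARD. A sketch of such a kernel (the vectorized broadcasting form is an implementation choice):

```python
import numpy as np

def rbf_ard(XA, XB, sigma_f=1.0, ell=None):
    """ARD squared-exponential kernel for inputs of shape (n, d).

    ell[k] is the length scale of dimension k; its inverse r = 1/ell[k]
    acts as a relevance weight, so a very large ell[k] effectively
    switches dimension k off.
    """
    d = XA.shape[1]
    ell = np.ones(d) if ell is None else np.asarray(ell, dtype=float)
    diff = (XA[:, None, :] - XB[None, :, :]) / ell  # scaled pairwise differences
    return sigma_f**2 * np.exp(-0.5 * np.sum(diff**2, axis=-1))
```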
Toy Example
- Goal: learn $f(x)$ from 5 noisy observations $\{x_i, y_i\}_{i=1}^5$.
- Ground truth: $y = \sin(x) + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2)$.
- Test inputs: $\mathbf{x}_* \in \mathbb{R}^{300 \times 1}$, equally spaced from $x = 0$ to $2\pi$.
- Test outputs: $\mathbf{f}_* = f(\mathbf{x}_*)$
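Putting the pieces together for this toy problem, a minimal end-to-end sketch (the noise variance and the unit kernel hyper-parameters are assumed values; in practice $\boldsymbol{\theta}$ would be fitted by maximizing the log-marginal-likelihood above):

```python
import numpy as np

def se_kernel(xa, xb, sigma_f=1.0, ell=1.0):
    d = xa[:, None] - xb[None, :]
    return sigma_f**2 * np.exp(-0.5 * (d / ell) ** 2)

rng = np.random.default_rng(1)
sigma_eps2 = 0.01                            # assumed noise variance
x = rng.uniform(0.0, 2.0 * np.pi, 5)         # 5 training inputs
y = np.sin(x) + np.sqrt(sigma_eps2) * rng.standard_normal(5)
x_star = np.linspace(0.0, 2.0 * np.pi, 300)  # 300 test inputs

L = np.linalg.cholesky(se_kernel(x, x) + sigma_eps2 * np.eye(5))
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
K_sx = se_kernel(x_star, x)
mean = K_sx @ alpha                          # E(f*), the predictive mean
V = np.linalg.solve(L, K_sx.T)
var = np.diag(se_kernel(x_star, x_star)) - np.sum(V**2, axis=0)  # pointwise variance
```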
Prior Distribution
[Figure: the GP prior at the test inputs $x_*$: the prior mean with a confidence interval.]
Predictive (Posterior) Distribution
[Figure: the GP predictive posterior at the test inputs $x_*$: the predictive mean with a confidence interval, together with the noisy observations and the ground truth.]
Another Toy Example: The Function $\sin(x)/x$
Example: Recovery of Missing Samples in FHR [1]

- Goal: recover missing samples in FHR (fetal heart rate), using not only the observed FHR samples but also the UA (uterine activity) samples.
- Model: $y_i = y(x_i) = f(x_i) + \varepsilon_i$
  - $y_i$: $i$-th sample in an FHR segment
  - $x_i = [i, u_i]'$, where $u_i$ is the $i$-th UA sample
  - $\varepsilon_i$: Gaussian white noise
  - $f(x_i)$: $i$-th latent noise-free FHR sample

[1] Guanchao Feng, J. Gerald Quirk, and Petar M. Djuric, "Recovery of missing samples in fetal heart rate recordings with Gaussian processes," in 25th European Signal Processing Conference (EUSIPCO), IEEE, 2017, pp. 261–265.
CTG Segment for Experiments
[Figure: the CTG segment used in the experiments: FHR [BPM] (top) and UA (bottom) versus time [samples].]
Experiment I
- 120 missing samples were randomly selected, and we tried to recover their true values.

[Figure: FHR [BPM] versus time [samples]: ground truth, cubic spline interpolation, and the GP-based method.]
Experiment II
- The percentage of missing samples was increased from 1% to 85% with a step size of 1%.

[Figure: the log of the MSE (left) and the SNR in dB (right) versus the percentage of missing samples, for cubic spline interpolation and the GP-based method.]
Experiment III

- To demonstrate the contribution of UA, we repeated Experiment I but excluded $u_i$ from the input vector $x_i$.
Experiment IV (An Extreme Case)

- 10 seconds of consecutive missing samples.

[Figure: recovery of the gap: ground truth; GP-based method, MSE = 1.4337; cubic spline, MSE = 62.1504; linear interpolation, MSE = 2.0730.]
Limitations
- The general framework is computationally expensive, $\mathcal{O}(N^3)$, due to the term $\mathbf{K}^{-1}_{N \times N}$, i.e., the inversion of the $N \times N$ covariance matrix.
- Another limitation is the joint Gaussianity that is required by the definition of GPs.
Deep Gaussian Processes
$$\mathbf{Z} \rightarrow \mathbf{X}_{H-1} \rightarrow \cdots \rightarrow \mathbf{X}_2 \rightarrow \mathbf{X}_1 \rightarrow \mathbf{Y}$$

- $\mathbf{Y} \in \mathbb{R}^{N \times d_y}$: the observations, the output of the network
  - $N$ is the number of observation vectors.
  - $d_y$ is the dimension of the vectors $\mathbf{y}_n$.
- $\{\mathbf{X}_h\}_{h=1}^{H-1}$: the intermediate latent states
  - Their dimensions $\{d_h\}_{h=1}^{H-1}$ are potentially different.
- $\mathbf{Z} \in \mathbb{R}^{N \times d_z}$: the input to the network
  - $\mathbf{Z}$ is observed for supervised learning.
  - $\mathbf{Z}$ is unobserved for unsupervised learning.
Deep Gaussian Processes (contd.)
- The joint Gaussianity limitation is overcome because nonlinear mappings generally will not preserve Gaussianity.
- DGPs immediately introduce intractabilities.
- One way of handling the difficulties is by introducing a set of inducing points; within the variational framework, this yields sparsity and a tractable lower bound on the marginal likelihood.
Example: Functions Sampled From DGP
- The Gaussianity limitation is overcome by nonlinear function composition.

[Figure: a sample from a four-layer DGP: the input signal and the outputs of Layers 1 through 4.]
Example: Learning a Step Function [2]

- Standard GP (top), two- and four-layer DGP (middle, bottom).
- The DGPs achieved much better performance.

[2] James Hensman and Neil D. Lawrence, "Nested variational compression in deep Gaussian processes," arXiv preprint arXiv:1412.1370, 2014.
Deep GPs and Deep Neural Networks (a comparison)
- A single layer of a fully connected neural network with an independent and identically distributed (iid) prior over its parameters and with infinite width is equivalent to a GP.
- Therefore, deep GPs are equivalent to neural networks with multiple, infinitely wide hidden layers.
- The mappings of a DGP are governed by its GPs instead of by activation functions.
- A DGP allows for propagation and quantification of uncertainty through each layer, as a fully Bayesian probabilistic model.
- There is ARD at each layer.
Generative Process
$$\mathbf{Z} \xrightarrow{\,f^X\,} \mathbf{X} \xrightarrow{\,f^Y\,} \mathbf{Y}$$

Figure: A two-layer DGP.

- The generative process takes the form:
$$x_{nl} = f_l^X(\mathbf{z}_n) + \varepsilon_{nl}^X, \quad l = 1, \ldots, d_x, \quad \mathbf{z}_n \in \mathbb{R}^{d_z}$$
$$y_{ni} = f_i^Y(\mathbf{x}_n) + \varepsilon_{ni}^Y, \quad i = 1, \ldots, d_y, \quad \mathbf{x}_n \in \mathbb{R}^{d_x}$$
- $\varepsilon_{nl}^X$ and $\varepsilon_{ni}^Y$ are additive white Gaussian noise processes.
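A generative sketch of this two-layer process, drawing each layer's functions from a GP prior at a finite set of inputs (the dimensions, the SE kernel, and the noise variances are illustrative assumptions):

```python
import numpy as np

def se_kernel(A, B, sigma_f=1.0, ell=1.0):
    sq = np.sum(((A[:, None, :] - B[None, :, :]) / ell) ** 2, axis=-1)
    return sigma_f**2 * np.exp(-0.5 * sq)

rng = np.random.default_rng(2)
N, d_z, d_x, d_y = 200, 1, 2, 3              # assumed dimensions
Z = rng.standard_normal((N, d_z))            # prior p(Z) = N(0, I)

def gp_layer(inputs, d_out, noise_var):
    """Draw one GP function per output dimension and add white noise."""
    K = se_kernel(inputs, inputs) + 1e-8 * np.eye(len(inputs))
    F = rng.multivariate_normal(np.zeros(len(inputs)), K, size=d_out).T
    return F + np.sqrt(noise_var) * rng.standard_normal(F.shape)

X = gp_layer(Z, d_x, noise_var=0.01)         # x_nl = f_l^X(z_n) + eps_nl^X
Y = gp_layer(X, d_y, noise_var=0.01)         # y_ni = f_i^Y(x_n) + eps_ni^Y
```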
Generative Process (contd.)
$$\mathbf{Z} \xrightarrow{\,f^X\,} \mathbf{X} \xrightarrow{\,f^Y\,} \mathbf{Y}$$

- We assume $\mathbf{Z}$ is unobserved, with the prior $p(\mathbf{Z}) = \mathcal{N}(\mathbf{Z}|\mathbf{0}, \mathbf{I})$.
- If we have specific prior knowledge about $\mathbf{Z}$, we should quantify this knowledge into a prior accordingly.
Inference
$$\mathbf{Z} \xrightarrow{\,f^X\,} \mathbf{X} \xrightarrow{\,f^Y\,} \mathbf{Y}$$

- The inference takes the reverse route, i.e., we observe the high-dimensional data $\mathbf{Y}$, and we learn the low-dimensional manifold $\mathbf{Z}$ (of dimension $d_z$, where $d_z < d_x < d_y$) that is responsible for generating $\mathbf{Y}$.
Inference Challenges
The learning requires maximization of the log-marginal-likelihood,
$$\log p(\mathbf{Y}) = \log \int_{\mathbf{X},\mathbf{Z}} p(\mathbf{Y}|\mathbf{X})\, p(\mathbf{X}|\mathbf{Z})\, p(\mathbf{Z})\, d\mathbf{X}\, d\mathbf{Z},$$
which is intractable.
Augmentation of Probability Space
$$\mathbf{Z} \xrightarrow{\,f^X\,} \mathbf{X} \xrightarrow{\,f^Y\,} \mathbf{Y}$$

- Original probability space:
$$p(\mathbf{Y}, \mathbf{F}^Y, \mathbf{F}^X, \mathbf{X}, \mathbf{Z}) = p(\mathbf{Y}|\mathbf{F}^Y)\, p(\mathbf{F}^Y|\mathbf{X})\, p(\mathbf{X}|\mathbf{F}^X)\, p(\mathbf{F}^X|\mathbf{Z})\, p(\mathbf{Z})$$
- Augmentation using inducing points:
  - $\mathbf{U}^X = f^X(\tilde{\mathbf{Z}})$, with $\tilde{\mathbf{Z}} \in \mathbb{R}^{N_p \times d_z}$ and $\mathbf{U}^X \in \mathbb{R}^{N_p \times d_x}$
  - $\mathbf{U}^Y = f^Y(\tilde{\mathbf{X}})$, with $\tilde{\mathbf{X}} \in \mathbb{R}^{N_p \times d_x}$ and $\mathbf{U}^Y \in \mathbb{R}^{N_p \times d_y}$
  - $N_p \leq N$
Augmentation of Probability Space
- Augmented probability space:
$$p(\mathbf{Y}, \mathbf{F}^Y, \mathbf{F}^X, \mathbf{X}, \mathbf{Z}, \mathbf{U}^Y, \mathbf{U}^X, \tilde{\mathbf{X}}, \tilde{\mathbf{Z}}) = p(\mathbf{Y}|\mathbf{F}^Y)\, p(\mathbf{F}^Y|\mathbf{U}^Y, \mathbf{X})\, p(\mathbf{U}^Y|\tilde{\mathbf{X}})\, p(\mathbf{X}|\mathbf{F}^X)\, p(\mathbf{F}^X|\mathbf{U}^X, \mathbf{Z})\, p(\mathbf{U}^X|\tilde{\mathbf{Z}})\, p(\mathbf{Z})$$
- Problematic terms:
  - $\mathcal{A} = p(\mathbf{F}^Y|\mathbf{U}^Y, \mathbf{X})$
  - $\mathcal{B} = p(\mathbf{F}^X|\mathbf{U}^X, \mathbf{Z})$
Variational Inference

- A variational distribution: $\mathcal{Q} = q(\mathbf{U}^Y)\, q(\mathbf{X})\, q(\mathbf{U}^X)\, q(\mathbf{Z})$
- By Jensen's inequality:
$$\log p(\mathbf{Y}) \geq \mathcal{F}_v = \int \mathcal{Q} \cdot \mathcal{A} \cdot \mathcal{B}\, \log \mathcal{G}\; d\mathbf{F}^Y d\mathbf{X}\, d\mathbf{F}^X d\mathbf{Z}\, d\mathbf{U}^X d\mathbf{U}^Y$$
- The function $\mathcal{G}$ is defined as:
$$\mathcal{G}(\mathbf{Y}, \mathbf{F}^Y, \mathbf{X}, \mathbf{F}^X, \mathbf{Z}, \mathbf{U}^X, \mathbf{U}^Y) = \frac{p(\mathbf{Y}|\mathbf{F}^Y)\, p(\mathbf{U}^Y)\, p(\mathbf{X}|\mathbf{F}^X)\, p(\mathbf{U}^X)\, p(\mathbf{Z})}{\mathcal{Q}}$$
- $\mathcal{F}_v$ is tractable for a collection of covariance functions, since $\mathcal{A}$ and $\mathcal{B}$ are canceled out in $\mathcal{G}$.
Studying Complex Systems
Principles used:

- algorithmic compressibility,
- locality, and
- deep probabilistic modeling.
Applications
[Diagram: a deep network in which latent inputs $z_1, \ldots, z_Q$ drive layered latent states $x_{1,1}, \ldots, x_{L_1,1}$ and $x_{1,2}, \ldots, x_{L_2,2}$ that generate the observations $y_1$ and $y_2$ over time $t$.]
Applications (contd.)

[Diagram: interacting agents $P_i$, $P_j$, and $P_k$, connected by links $L_{ij}$, $L_{ji}$, $L_{jk}$, and $L_{kj}$, with observations $y_k[t_1]$, $y_j[t_2]$, and $y_i[t_3]$.]
Applications (contd.) [3]

[3] Figures obtained by Sima Mofakham and Chuck Mikell.
Example: Binary pH-based Classification [4]

- Goal: to have the DGP classify CTG recordings into healthy and unhealthy classes.
- Features:
  - 14 FHR features
  - 6 (categorical) UA features
- Labeling:
  - Positive (unhealthy): pH < 7.1
  - Negative (healthy): pH > 7.2

[4] Guanchao Feng, J. Gerald Quirk, and Petar M. Djuric, "Supervised and Unsupervised Learning of Fetal Heart Rate Tracings with Deep Gaussian Processes," in 2018 14th Symposium on Neural Networks and Applications (NEUREL), IEEE, 2018, pp. 1–6.
- Structure of DGP: our DGP network had two layers, and in each layer, we set the initial latent dimension to five.
- Performance metrics:
  1. Sensitivity (true positive rate)
  2. Specificity (true negative rate)
  3. Geometric mean of specificity and sensitivity
Features
Table: Features for FHR

Category            Features
Time domain         Mean, Standard deviation, STV, STI, LTV, LTI
Non-linear          Poincaré SD1, Poincaré SD2, CCM
Frequency domain    VLF, LF, MF, HF, ratio

Table: Features for UA

Feature                                          Normal (0)          Abnormal (1)
Frequency                                        ≤ 8 contractions    > 8 UC (tachysystole)
Duration                                         < 90 s              > 90 s
Increased tonus                                  With toco           Prolonged, > 120 s
Interval A (peak to peak)                        ≥ 2 min             < 2 min
Interval B (offset of UC to onset of next UC)    ≥ 1 min             < 1 min
Rest time                                        > 50%               < 50%
Classification Results
- A support vector machine (SVM) was used as a benchmarking model.

Table: Classification results

Classifier    Feature    Specificity    Sensitivity    Geometric Mean
SVM           FHR        0.82           0.73           0.77
SVM           FHR+UA     0.82           0.82           0.82
Deep GP       FHR        0.91           0.73           0.82
Deep GP       FHR+UA     0.82           0.91           0.86
Unsupervised Learning for FHR Recordings
- Goal: to have the DGP learn informative low-dimensional latent spaces that can generate the recordings.
- Labeling:
  - pH-based labeling combined with an obstetrician's evaluation.
  - Labels are only used for evaluation of the learning results.
- Data:
  - The last 30 minutes of 10 FHR recordings, $\mathbf{Y} \in \mathbb{R}^{10 \times 7200}$.
  - Three of them are abnormal and seven are normal.
Performance Metric and Network Structure
- Performance metric: the number of errors in the latent space for one nearest neighbor.
- Structure of DGP: a five-layer DGP; the initial dimensions of the latent spaces in the layers were $d_{x_{1:5}} = [6, 5, 5, 4, 3]^T$.
Automatic Structure Learning
[Figure: the ARD weights of the latent dimensions in each of Layers 1 through 5.]
Visualization of the Latent Spaces with 2-D Projection.
- Red: the normal recordings
- Blue: the abnormal recordings
- Pixel intensity: proportional to precision
- The total numbers of errors in Layers 1 to 5 are 2, 2, 1, 1, and 0, respectively.

[Figure: 2-D projections of the latent spaces in Layers 1 through 5.]
Example: Deep Gaussian Processes with Convolutional Kernels [5]

- Goal: multi-class image classification
- Database: MNIST (handwritten digits)
- Methods:
  1. SGP: sparse Gaussian processes
  2. DGP: deep Gaussian processes
  3. CGP: convolutional Gaussian processes
  4. CDGP: convolutional deep Gaussian processes

[5] Vinayak Kumar et al., "Deep Gaussian Processes with Convolutional Kernels," arXiv preprint arXiv:1806.01655, 2018.
MNIST
Example: Identification of Atmospheric Variable Using Deep Gaussian Processes [6]

- Goal: modeling temperature using meteorological variables (features).
- Domain of interest: 25 km × 25 km around the nuclear power plant in Krško, Slovenia.
- Features: relative humidity, atmospheric stability, air pressure, global solar radiation, and wind speed.

[6] Mitja Jancic, Jus Kocijan, and Bostjan Grasic, "Identification of Atmospheric Variable Using Deep Gaussian Processes," IFAC-PapersOnLine 51.5 (2018), pp. 43–48.
The Geographical Features of the Surrounding Terrain
- The plant and its measurement station (marked as STOLP – Postaja) are situated in a basin surrounded by hills and valleys, which influence the micro-climate conditions.
One-Step-Ahead Prediction
- Prediction results:

[Figure: one-step-ahead temperature prediction results.]
Example: Deep Gaussian Process for Crop Yield Prediction Based on Remote Sensing Data [7]

- Goal: predicting crop yields before harvest
- Model: CNN and LSTM combined with a GP

[7] Jiaxuan You et al., "Deep Gaussian Process for Crop Yield Prediction Based on Remote Sensing Data," in AAAI, 2017, pp. 4559–4566.
Comparing County-Level Error Maps
- The color represents the prediction error in bushels per acre.
Conclusions
- A case was made for using probability theory in treating uncertainties in inference from data.
- Deep probabilistic modeling based on deep Gaussian processes was addressed.
- The use of DGPs in studying complex interacting systems was described.
- Applications of DGPs in various fields were presented.
- Although the development of DGPs is still in its relatively early stages, DGPs have shown great potential in many challenging machine learning tasks.