2020.01.30 Data Assimilation Workshop
Application of machine learning methods to model bias correction:
Lorenz-96 model experiments

Arata Amemiya*, Takemasa Miyoshi
Data Assimilation Research Team, RIKEN Center for Computational Science
Shlok Mohta
Graduate School of Science, The University of Tokyo
Motivation: bias in weather and climate models
Weather and climate models have model biases from various sources:
• Truncation error
• Approximation of unresolved physical processes
  - Convection
  - Small-scale topography
  - Turbulence
  - Cloud microphysics

(Figure from the JMA website)
Treatment of forecast error in data assimilation
Kalman filter:

Forecast step (update the state and the forecast error covariance):
𝒙ᶠ_{t+1} = 𝓜(𝒙ᵃ_t)
𝑷ᶠ_{t+1} = 𝑴 𝑷ᵃ_t 𝑴ᵀ,  where 𝑴 = ∂𝓜/∂𝒙

Analysis step:
1. Calculate the Kalman gain:  𝑲 = 𝑷ᶠ𝑯ᵀ (𝑯𝑷ᶠ𝑯ᵀ + 𝑹)⁻¹
2. Calculate the analysis state and error covariance:
   𝒙ᵃ = 𝒙ᶠ + 𝑲 (𝒚 − 𝐻(𝒙ᶠ))
   𝑷ᵃ = (𝑰 − 𝑲𝑯) 𝑷ᶠ

(Schematic: previous analysis → forecast mean → analysis mean, drawn toward the observation)
http://www.data-assimilation.riken.jp/jp/research/index.html

Model bias leads to the underestimation of the forecast (background) error.
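The analysis step above can be sketched in a few lines of NumPy (a minimal illustration for a linear observation operator 𝑯; function and variable names are mine, not from the slides):

```python
import numpy as np

def kalman_analysis(xf, Pf, y, H, R):
    """One Kalman filter analysis step (linear observation operator H).

    xf : forecast state (n,)        Pf : forecast error covariance (n, n)
    y  : observation (p,)           H  : observation operator (p, n)
    R  : observation error covariance (p, p)
    """
    # Kalman gain: K = Pf H^T (H Pf H^T + R)^-1
    S = H @ Pf @ H.T + R
    K = Pf @ H.T @ np.linalg.inv(S)
    # Analysis state and analysis error covariance
    xa = xf + K @ (y - H @ xf)
    Pa = (np.eye(len(xf)) - K @ H) @ Pf
    return xa, Pa
```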
Treatment of imperfect model
Insufficiently represented model error degrades the performance of the Kalman filter. Two common remedies:

1. Covariance inflation
   - Multiplicative inflation:  𝑷ᵃ → α 𝑷ᵃ
   - Additive inflation:  𝑷ᵃ → 𝑷ᵃ + 𝑸
   - Relaxation-to-prior:  𝑷ᵃ → (1 − α) 𝑷ᵃ + α 𝑷ᶠ

2. Correction of the systematic bias component:  𝒙ᶠ_{t+1} → 𝒙ᶠ_{t+1} + 𝒃
   followed by the usual analysis update
   𝒙ᵃ_{t+1} = 𝒙ᶠ_{t+1} + 𝑲 (𝒚_{t+1} − 𝐻(𝒙ᶠ_{t+1}))
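For concreteness, the three inflation variants can be written as one-line covariance transforms (a sketch; `alpha` and `Q` are the α and 𝑸 above):

```python
import numpy as np

def multiplicative_inflation(Pa, alpha):
    return alpha * Pa                        # Pa -> alpha * Pa

def additive_inflation(Pa, Q):
    return Pa + Q                            # Pa -> Pa + Q

def relaxation_to_prior(Pa, Pf, alpha):
    return (1.0 - alpha) * Pa + alpha * Pf   # Pa -> (1-alpha) Pa + alpha Pf
```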
Bias correction with simple functional form
✓ "Offline" bias correction

True system:     d𝒙/dt = 𝒇_true(𝒙)
Forecast model:  d𝒙/dt = 𝒇_model(𝒙)

Starting both from the true state 𝒙ᵗ(t), the one-step forecast error is
δ𝒙 = 𝒙ᵗ(t + Δt) − 𝒙ᶠ(t + Δt)

From a set of training data {δ𝒙, 𝒙ᶠ}, estimate a bias correction term 𝑫(𝒙) and integrate the corrected model
d𝒙/dt = 𝒇_model(𝒙) + 𝑫(𝒙)

Simplest form: linear dependency
𝑫(𝒙) = 𝑫₀ + 𝑳𝒙′,  with 𝒙′ = 𝒙ᶠ − mean(𝒙ᶠ)
• Steady component:  𝑫₀ = mean(δ𝒙) ∕ Δt
• Linearly-dependent component (Leith, 1978):  𝑳𝒙′ = 𝑪_{δ𝒙,𝒙} 𝑪_{𝒙,𝒙}⁻¹ 𝒙′ ∕ Δt, where 𝑪 denotes a covariance matrix

Dimensionality reduction can be applied using Singular Value Decomposition (SVD) (Danforth et al. 2007).
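The steady and linearly-dependent components can be estimated from training pairs {δ𝒙, 𝒙ᶠ} by a covariance regression; a minimal NumPy sketch (function and variable names are mine):

```python
import numpy as np

def estimate_bias_correction(dx, xf, dt):
    """Estimate D(x) = D0 + L x' from training pairs (dx, xf).

    dx : one-step forecast errors delta-x, shape (nsamples, n)
    xf : the corresponding forecasts,      shape (nsamples, n)
    """
    D0 = dx.mean(axis=0) / dt                  # steady component
    xp = xf - xf.mean(axis=0)                  # forecast anomalies x'
    dxp = dx - dx.mean(axis=0)
    # Leith (1978)-type regression: L = C_{dx,x} C_{x,x}^{-1} / dt
    C_dx_x = dxp.T @ xp / len(xf)
    C_x_x = xp.T @ xp / len(xf)
    return D0, C_dx_x @ np.linalg.inv(C_x_x) / dt
```

On synthetic data generated exactly as δ𝒙 = Δt (𝒂 + 𝑩𝒙′), this recovers 𝑫₀ = 𝒂 and 𝑳 = 𝑩.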
Bias correction with nonlinear basis functions
Higher-order polynomials:
• Coupled Lorenz96 system (Wilks et al. 2005; Arnold et al. 2013)
• Real case: all-sky satellite infrared brightness temperature (Otkin et al. 2018)

Neural networks:
• Coupled Lorenz96 system (Watson et al. 2019)

(Figure: probability of (obs − fcst) versus obs; Fig. 2 of Otkin et al. 2018)
Online bias correction
✓ "Online" bias correction = simultaneous estimation of the state variables and the bias correction terms

• Kalman filter: sequential treatment / augmented state
  - Steady component (Dee and Da Silva 1998; Baek et al. 2006)
  - Polynomials (Pulido et al. 2018)
• Variational data assimilation ("VarBC"): penalty function for the bias-corrected observation error plus a penalty function for regularization
  - Legendre polynomials (Cameron and Bell, 2016; for satellite sounding in the UK Met Office operational model)
Localization
In a high-dimensional spatiotemporal system, geographically (and temporally) local interactions are usually dominant.

Localization is built into the LETKF:
✓ Reduced matrix size → low cost
✓ Highly effective parallelization (Miyoshi and Yamane, 2007)
It is also used in simultaneous parameter estimation (Aksoy et al. 2006).

Localization is also used in ML-based data-driven modelling (Pathak et al. 2018; Watson et al. 2019).

(Figure from Pathak et al. 2018)
The goal of this study
✓ ML-based online bias correction using Long Short-Term Memory (LSTM)
✓ Combined with the LETKF, using a similar localization

Test experiments with the coupled Lorenz96 model:
• Experimental setup
• Online bias correction with simple linear regression as a reference
• (Online bias correction with LSTM)
Bias correction in the LETKF system

One assimilation cycle:
1. Model forecast:  𝒙ᶠ_{t+1} = 𝓜(𝒙ᵃ_t)
2. Bias correction:  𝒙ᶠ_{t+1} ← 𝒃(𝒙ᶠ_{t+1})
3. LETKF analysis with the observation 𝒚_{t+1}:
   𝒙ᵃ_{t+1} = 𝒙ᶠ_{t+1} + 𝑲 (𝒚_{t+1} − 𝐻(𝒙ᶠ_{t+1}))
4. Update the bias correction function 𝒃(𝒙ᶠ)
Lorenz96 model (Lorenz 1996)

d𝑥_k/dt = 𝑥_{k−1}(𝑥_{k+1} − 𝑥_{k−2}) − 𝑥_k + F,  k = 1, 2, …, K

• 1-D cyclic domain (𝑥_{k+K} = 𝑥_k)
• Steady solution 𝑥_k = F; chaotic behavior for sufficiently large F
• Standard setting: K = 40, F = 8

(Figures: analogy with the midlatitude flow — wind at 250 hPa, NCEP GFS, https://earth.nullschool.net/ — and plots of the largest Lyapunov exponent and the mean 𝑥(t).)
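The tendency equation is easy to implement with cyclic index shifts; a minimal NumPy sketch (the RK4 integrator is an assumption, as the slides do not state the time-stepping scheme):

```python
import numpy as np

def lorenz96_tendency(x, F=8.0):
    """dx_k/dt = x_{k-1} (x_{k+1} - x_{k-2}) - x_k + F on a cyclic domain."""
    return np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - x + F

def rk4_step(f, x, dt):
    """One 4th-order Runge-Kutta step."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
```

Note that the steady solution 𝑥_k = F gives a vanishing tendency, as expected.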
Coupled Lorenz96 model
"Nature run" (Wilks, 2005): multi-scale interaction between large-scale (slow) variables 𝑥_k and small-scale (fast) variables 𝑦_j through coupling terms:

d𝑥_k/dt = 𝑥_{k−1}(𝑥_{k+1} − 𝑥_{k−2}) − 𝑥_k + F − (hc∕b) Σ_{j=J(k−1)+1}^{kJ} 𝑦_j,  k = 1, 2, …, K
d𝑦_j/dt = −cb 𝑦_{j+1}(𝑦_{j+2} − 𝑦_{j−1}) − c 𝑦_j + (hc∕b) 𝑥_{int[(j−1)∕J]+1},  j = 1, 2, …, KJ

Each 𝑥_k is coupled to the J fast variables 𝑦_{J(k−1)+1}, …, 𝑦_{kJ}. (Schematic drawn with K = 8.)

Forecast model (Danforth and Kalnay, 2008): only the large-scale variables, with an additional sinusoidal forcing:

d𝑥_k/dt = 𝑥_{k−1}(𝑥_{k+1} − 𝑥_{k−2}) − 𝑥_k + F + A sin(2πk∕K)

Parameters used in this study: K = 16, J = 16, h = 1, b = 20, c = 50, F = 16, A = 1
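A NumPy sketch of the coupled tendencies using the parameters above (the 0-based array layout, with y[kJ:(k+1)J] coupled to x[k], is an assumption):

```python
import numpy as np

def coupled_lorenz96_tendency(x, y, F=16.0, h=1.0, b=20.0, c=50.0):
    """Tendencies of the coupled two-scale Lorenz96 system.

    x : slow variables, shape (K,)
    y : fast variables, shape (K*J,); y[k*J:(k+1)*J] couple to x[k].
    """
    K = len(x)
    J = len(y) // K
    # Coupling: each x_k feels the sum of its J fast variables
    coupling = (h * c / b) * y.reshape(K, J).sum(axis=1)
    dx = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - x + F - coupling
    # Fast variables: note the reversed direction of the advection-like term
    dy = (-c * b * np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1))
          - c * y + (h * c / b) * np.repeat(x, J))
    return dx, dy
```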
Data assimilation experiment
"Observation" = "Nature run" + random error
• Observation operator: identity (obs = model grid)
• Error standard deviation: 0.1
• Interval: 0.025 / 0.05 / 0.1 (cf. error doubling time ≃ 0.2)

LETKF configuration:
• Members: 20
• Localization: Gaussian weighting (length scale = 3 grids)
• Covariance inflation: multiplicative (factor α)

"Control run" (data assimilation without bias correction): a large inflation factor α is needed to stabilize the DA cycle.

Obs interval   Minimum RMSE   Inflation α
0.025          0.0760          2.9
0.05           0.0851          5.8
0.1            0.0902         15.0
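The Gaussian localization weighting on the cyclic grid might look as follows (a sketch; the slides give only "Gaussian weighting" with length scale 3 grids, so the exact functional form is assumed):

```python
import numpy as np

def gaussian_localization_weights(i, K=16, scale=3.0):
    """Gaussian weights around analysis grid point i on the cyclic domain."""
    k = np.arange(K)
    d = np.minimum(np.abs(k - i), K - np.abs(k - i))  # cyclic grid distance
    return np.exp(-0.5 * (d / scale) ** 2)
```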
Example: simple linear regression
One assimilation cycle with online linear-regression bias correction:
1. Model forecast:  𝒙ᶠ_{t+1} = 𝓜(𝒙ᵃ_t)
2. Bias correction:  𝒙ᶠ_{t+1} ← 𝑾ᶠ_{t+1} 𝒙ᶠ_{t+1} + 𝒃ᶠ_{t+1}
3. LETKF analysis with the observation 𝒚_{t+1} = 𝐻(𝒙ᵗʳᵘᵉ_{t+1}) + 𝝐_obs:
   𝒙ᵃ_{t+1} = 𝒙ᶠ_{t+1} + 𝑲 (𝒚_{t+1} − 𝐻(𝒙ᶠ_{t+1}))
Online bias correction by linear regression
Analysis update of the bias-correction coefficients, using the same innovation as the state update:
𝒃ᵃ_{t+1} = 𝒃ᶠ_{t+1} + β 𝑲 (𝒚_{t+1} − 𝐻(𝒙ᶠ_{t+1}))
𝑾ᵃ_{t+1} = 𝑾ᶠ_{t+1} + β 𝑲 (𝒚_{t+1} − 𝐻(𝒙ᶠ_{t+1})) ⋅ (𝒙ᶠ_{t+1})ᵀ ∕ |𝒙ᶠ_{t+1}|²

Forecast step with a forgetting factor γ:
𝒃ᶠ_{t+1} = γ 𝒃ᵃ_t
𝑾ᶠ_{t+1} = γ 𝑾ᵃ_t
Results (β = 0.02, γ = 0.999; half-life: ~37):

• The analysis increment 𝒙ᵃ − 𝒙ᶠ has a large negative correlation with 𝒙ᶠ, and is mostly offset by the bias correction term (the corrected minus the raw forecast).
• The DA cycle is stable with a smaller inflation factor α:

Control run:
Interval (day)   Min. infl.   RMSE
0.025             2.9         0.0760
0.1              15.0         0.0902

With bias correction:
Interval (day)   Min. infl.   RMSE
0.025             2.0         0.0625
0.1               8.2         0.0849

(Figure: the correction term over the time windows 0-30, 120-150, 240-270, and 360-390, compared with the bias, i.e. forecast (before correction) minus analysis.)
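A sketch of one online update of (𝑾, 𝒃). The analysis nudge follows the equations above; relaxing 𝑾 toward the identity in the forgetting step is my assumption, since the extracted slides leave the exact decay form ambiguous:

```python
import numpy as np

def update_bias_correction(W, b, xf, increment, beta=0.02, gamma=0.999):
    """One online update of the linear bias correction x -> W x + b.

    increment : K (y - H(x^f)), the analysis increment from the LETKF.
    """
    # Analysis update: nudge b and W toward explaining the increment
    b_a = b + beta * increment
    W_a = W + beta * np.outer(increment, xf) / np.dot(xf, xf)
    # Forecast step: forgetting factor gamma (relaxation of W toward
    # the identity is an assumption of this sketch)
    return gamma * W_a + (1.0 - gamma) * np.eye(len(xf)), gamma * b_a
```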
Nonlinear bias correction with ML
One assimilation cycle with an RNN-based correction:
1. Model forecast:  𝒙ᶠ_{t+1} = 𝓜(𝒙ᵃ_t)
2. Bias correction by the RNN:  𝒙ᶠ_{t+1} ← 𝒃(𝒙ᶠ_{t+1}, …)
3. LETKF analysis with the observation 𝒚_{t+1}:
   𝒙ᵃ_{t+1} = 𝒙ᶠ_{t+1} + 𝑲 (𝒚_{t+1} − 𝐻(𝒙ᶠ_{t+1}))
4. Train the network 𝒃 with the pairs (𝒙ᶠ_{t+1}, 𝒙ᵃ_{t+1})
Plan: LSTM implementation

Network architecture:
• 1 LSTM + 3 Dense layers: LSTM (nx = 15) → Dense (nx = 10) → Dense (nx = 5) → output (nx = 1)
• Activation: tanh / sigmoid (recurrent)
• No regularization / dropout
• LETKF-like localization:
  - Input: a localized area of forecasts 𝒙ᶠ_{t−Δt}, …, 𝒙ᶠ_t (nx = 9 grid points, nt = 5 time steps)
  - Output: the analysis 𝒙ᵃ_t at one grid point
• Training data: pairs (𝒙ᶠ_t, 𝒙ᵃ_t) from the Lorenz96-LETKF cycle

The LSTM is implemented in Python with TensorFlow and integrated with the LETKF codes.
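The described architecture could be written in Keras roughly as follows (a sketch under the stated assumptions; the Dense-layer activations are not specified on the slide, so tanh is assumed):

```python
import numpy as np
from tensorflow.keras import layers, models

def build_bias_correction_lstm():
    """LSTM bias-correction network as described on the slide.

    Input : local patch of nx = 9 grid points over nt = 5 forecast times.
    Output: corrected (analysis-like) value at the central grid point.
    """
    return models.Sequential([
        layers.LSTM(15, activation="tanh", recurrent_activation="sigmoid"),
        layers.Dense(10, activation="tanh"),  # activation assumed
        layers.Dense(5, activation="tanh"),   # activation assumed
        layers.Dense(1),                      # one output grid point
    ])

model = build_bias_correction_lstm()
model.compile(optimizer="adam", loss="mse")
# The first call builds the network for input shape (batch, nt=5, nx=9)
out = model(np.zeros((1, 5, 9), dtype=np.float32))
```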
Summary
• Systematic model bias degrades forecasts and analyses.
• "Offline" bias correction can be performed by ML as nonlinear regression.
• "Online" bias correction with data assimilation has been studied using a fixed basis function set.
• LSTM-based online bias correction is implemented and is to be tested.

Open questions:
• The efficiency of localization?
• Online learning?