Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Observing Network Design Applied to Antarctic Weather Monitoring and
Forecasting
Natalia Hryniw⇤, Gregory J. Hakim
Department of Atmospheric Sciences, University of Washington, Seattle, Washington1
Guillaume S. Mauger2
Climate Impacts Group, University of Washington, Seattle, Washington3
Karin A. Bumbaco4
Joint Institute for the Study of Atmosphere and Ocean, University of Washington, Seattle, WA5
Jordan G. Powers6
National Center for Atmospheric Research, Boulder, CO7
⇤Corresponding author address: Natalia Hryniw, Department of Atmospheric Sciences, Box
351640, University of Washington, Seattle, Washington 98195-1640
8
9
E-mail: [email protected]
Generated using v4.3.2 of the AMS LATEX template 1
ABSTRACT
2
The Antarctic surface weather observing network is crucial for support-
ing scientific and logistical operations over the continent. The harshness of
weather conditions make it difficult to support a dense network, and so it is
critical to place new weather stations in locations that are optimally and ob-
jectively chosen for a given purpose. Here a network design algorithm is
employed that uses ensemble sensitivity to identify optimal locations for new
Automated Weather Station (AWS) locations. Here we define the optimal
location as the one that reduces the total variance of the measurement field
by the largest amount. This algorithm is used with data from the Antarctic
Mesoscale Prediction System (AMPS) to identify the best locations for two
metrics: (1) measuring the 2-m temperature field and, (2) reducing forecast
errors at lead times of 12, 24, and 36 hours, both as a “blank slate” network,
and an augmented network that is conditional on currently existing stations
that report at least 90% of the time during the data period. These stations are
also ranked in terms of their importance in capturing the 2-m air temperature
field and reducing forecast errors. The most important locations for moni-
toring 2-m air temperature are found to be near Vostok in East Antarctica,
in Marie Byrd Land, and in Queen Maud Land. Among existing stations,
the locations with highest impact are Vostok, Siple Dome, and South Pole.
Results for forecast error networks are fairly similar for all lead times, with
prime locations in Oates Land, Marie Byrd Land, Cape Adare, and central
East Antarctica near Vostok. The three most important currently existing sta-
tions for reducing forecast errors are Vostok, McMurdo, and Halley.
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
3
1. Introduction
Antarctica has some of the harshest and most extreme weather on Earth, and so travel to the
continent and logistical operations for explorations and support of scientific campaigns can be
challenging. An important part of maintaining safety and supporting operations is having a robust
observing network to identify potentially hazardous weather and provide information for forecasts.
Because of the harsh conditions, maintenance of stations is costly and difficult, so there is a strong
motivation to have measurements that provide good coverage and do well with regard to perfor-
mance measures.
Optimal network design is a methodology for finding the best location for weather monitoring and
forecasting. It is a type of experimental design methodology that provides a framework for finding
the best measurements for a given experiment. The optimal measurement is usually found through
finding the observation that produces largest change in a measure, such as total variance or a par-
ticular eigenvalue of a covariance matrix (for a general overview from a statistics perspective, see
Chaloner and Verdinelli (1995) andFedorov (1972)). Many different approaches to optimal net-
work design have been extensively explored within the geophysical literature, including adjoint,
singular vector, variational, and ensemble approaches for both fixed and targeted observations.
Adjoint and singular value techniques, such as Dimet and Talagrand (1986), Baker and Daley
(2000), Buizza and Montani (1999), Buizza et al. (2007), find optimal locations by examining the
sensitivity of a particular location to forecast singular vectors or to produce a sensitivity field to
locations using an adjoint model. A large variety of ensemble Kalman filter (EnKF) approaches
have also been used for both fixed and targeted optimal network design. The general approach in
Kalman filter methods is to used archived forecasts to calculate model covariances and estimate
the impact of new observations (either fixed or targeted) on these model covariances without using
4
an adjoint model. One method, the Retrospective Design Algorithm (RDA) described in Khare
and Anderson (2006), tests the impact of sets of fixed networks by maximizing an objective func-
tion of covariance matrices that have been updated to reflect the potential new networks. Bishop
et al. (2001) and Majumdar et al. (2002b) developed an ensemble transform Kalman filter (ETKF)
approach to find optimal observations. The ETKF approach has been used to design both adap-
tive and fixed observational networks in both simple model and fieldwork implementations. The
ETKF approach also uses archived forecasts to find optimal observations for any desired forecast
lead time, but does so by transforming the forecast error covariance matrix within an EnKF frame-
work into an orthonormal vector form to quickly solve for optimal locations. Many subsequent
studies have successfully used the ETKF for optimizing and developing a wide variety of networks
(e.g., Majumdar et al. (2002a), Aberson (2003), Xue et al. (2006)). Similar to the ETKF approach
is the ensemble sensitivity approach developed in Ancell and Hakim (2007), which we used for our
experiments. The ensemble sensitivity approach does not transform covariance matrices, and uses
the sensitivity of state variables to a generic metric function (which can be non-state variables)
to find optimal measurements. The ensemble sensitivity approach has been tested in theoretical
modeling for time-averaged observations (Huntley and Hakim 2010), used to identify an optimal
climate observing network for the Pacific Northwest (Mauger et al. 2013), and to find salinity and
sea surface paleoclimate estimates from corals (Comboul et al. 2005).
The optimal network design approach here uses existing deterministic model analyses and fore-
casts (not archived ensembles) to find locations that best measure some field (such as temperature
or 500hPa heights) by finding the locations that most reduce the total variance in a chosen metric,
for an ensemble sample. The statistics of the field are then updated to reflect this new hypothetical
measurement, and then a second favorable location conditional on the first may be found by op-
timizing over the new statistics. This allows for finding an optimal network sequentially without
5
needing to run a model and is a flexible methodology that can take into account other concerns,
such as masking out locations that may not feasibly support a weather station from the optimal
design calculation.
The outline of the remainder of the paper is as follows. In section 2 we describe the data used
for this work and provide a brief assessment of the current Antarctic surface observing network
as performed in previous work. We also describe the theoretical underpinnings of the optimal
network design approach and algorithm used, as well as the statistical approach for implementing
the optimal network design algorithm. In section 3 we discuss the results of three optimal network
calculations. The first is a set of “blank slate” calculations, in which no other stations are assumed
to exist. The second is a set of augmented networks conditional on existing stations. The third
experiment ranks a subset of the existing surface stations. Each set of experiments optimizes for
two objective measures: surface temperature monitoring at 0000 UTC, and for reducing forecast
errors in surface temperature, which allows a comparison between the geographic locations and
variance reduction for both metrics. Finally, section 4 provides a discussion of the results and
suggestions for further work in this area.
2. Data and Methods
a. Data
The model temperature data for this study was taken from the Antarctic Mesoscale Prediction
System (AMPS) (Powers et al. 2012). AMPS is a real-time numerical weather prediction system
run by the National Center for Atmospheric Research (Skamarock and Klemp 2008) in support
of the weather forecasting needs of the United States Antarctic Program (USAP) and its Antarctic
scientific and logistical operations. AMPS runs a polar-modified version of the Weather Research
6
and Forecasting Model (WRF), and the system uses the WRFDA (Barker et al. 2004) (Huang et al.
2009) data assimilation package employing a 3-dimensional variational (3DVAR) approach for the
ingest of Antarctic observations. The system has run WRF on multiple grids covering Antarctica
and the Southern Ocean (see, e.g., Powers et al. 2012), and the archived gridded model output
used in this study is from a domain covering the continent at 15-km resolution. The area shown in
the plots of Fig. 1 reflects this domain.
Forecasts from AMPS have been verified and evaluated, including when AMPS used MM5 as a
predecessor to WRF (see, e.g., Powers et al. (2003), Bromwich et al. (2005), Monaghan et al.
(2005), Lazzara et al. (2012), Bromwich et al. (2013)). The statistics of AMPS output have been
shown to be in good agreement with the statistics of observations. A recent example is that of
Bumbaco et al. (2014), who found good agreement between station observations and AMPS out-
put, both for 2-m temperature and surface pressure, as well as good agreement between forecast
fields and observations, lending confidence that AMPS output can be used for optimal network
design. To date, the AMPS Schlosser et al. (2010)). Here, the data used are WRF 2-m temperature
(from the 15-km continental grid) over the period 31 September 31 2008 to 1 October 1 2012.
Data beyond Oct 2012 were not used due to a change in the AMPS resolution. All data reflect the
0000 UTC analyses and forecasts initialized at 0000 UTC with lead times of 12, 24, and 36 hours.
b. Antarctic Surface Observing Network
The current Antarctic surface network consists of Automatic Weather Station (AWS) sites main-
tained by different countries and efforts, as well as various manned surface stations and bases.
The AWS sites provide crucial observational data over the entire continent and provide informa-
tion that is assimilated into AMPS forecasts. In addition to real time observation, AWS stations
provide historical data for many Antarctic weather and climate studies (Lazzara et al. 2012). The
7
network has also primarily grown to serve the purposes of the United States Antarctic Program
(USAP) and the research efforts of the National Science Foundation (NSF). While the network is
important for furthering these goals, both existing and potential new stations may also be quite
valuable for the analysis of specific phenomena or the forecasting for a specific area of operations.
c. Network Design Theory
We take an ensemble sensitivity approach in this study to find optimal station locations. The
general idea behind the ensemble sensitivity approach is to use existing data to calculate perturba-
tions about some mean state and then use those perturbations to calculate the impact of a potential
new observation on some value describing the system. The theoretical framework for finding op-
timal observations for a scalar using ensemble sensitivity theory was derived in Ancell and Hakim
(2007). This scalar ensemble sensitivity approach has been used for Pacific Northwest climate
monitoring (Mauger et al. 2013), and salinity and sea surface temperature paleoclimate estimates
from corals (Comboul et al. 2005).
In most situations the observations must be optimized to accomplish more than one goal, which
is not possible with the scalar approach. Here we use a multivariate generalization approach,
which allows for simultaneous optimization with respect to multiple metrics, similar to the ETKF
method described in Bishop et al. (2001) and Majumdar et al. (2002b). However, unlike the ETKF
approach, this approach can be used optimize for any general function of state and/or non-state
variables, not simply covariances of state variables. We choose our general metric function to be
the trace of a covariance matrix for the metrics, and define the best measurement as the one that
maximizes the change in the trace (the total variance) when that measurement is incorporated.
First, a vector metric to optimize for is chosen,
8
J =
2
6666666664
J0
J1...
Jn
3
7777777775
(1)
where J0 . . .Jn are a set of scalar values or components of a multivariate quantity. A leading-order
Taylor approximation to the metric, J, dependent on the vector state of the system, x, gives
dJ ⇡
∂J∂x
�Tdx (2)
where superscript T represents the transpose operation. The total variance of J is then approxi-
mated by
d⌃2J = {dJdJT}⇡
∂J∂x
�TA
∂J∂x
�(3)
where
A = {dxdxT}
is the state covariance matrix and the curly braces denote an expectation.
When a new observation is incorporated, the state covariance A changes to A0, and the variance
of the metric also changes. Therefore, from (3), we have that
d⌃2J =
∂J∂x
�T(A
0 �A)
∂J∂x
�. (4)
The optimal location is the one that maximizes d⌃2J. As shown in Ancell and Hakim (2007),
this relationship may be written in terms of a covariance,
d⌃J =�1E⇥DJTAHT
⇤⇥DJTAHT
⇤T (5)
DJTAHT = {dJ(Hdx)T} (6)
9
where D is the Jacobian operator and H is the observation operator that maps from state to met-
ric space. The state in the context of this study is the field of temperature values where there could
be a potential new observation. E is the scalar total error variance of a potential new observa-
tion, given by E = HAHT +R, with R being the measurement error covariance matrix. Eq. 6 is
used to calculate the change in the trace of the covariance matrix without explicitly calculating
the covariance matrix itself (which can be computationally expensive). Instead, only the metric
perturbations, observation operator, and state perturbations are needed to calculate the change in
covariance trace.
The general procedure for finding optimal locations uses the following algorithm.
1. Choose a vector metric that quantifies an aspect of the system of interest (such the covariance
matrix for temperature across a region)
2. Calculate the total variance for the metric (trace of the metric covariance matrix)
3. Calculate change in the total variance for all possible locations
4. Choose as the optimal measurement location the point that maximizes the change in total
variance
5. Update the metric and state to reflect the chosen measurement
6. Repeat the procedure for the next measurement, using the updated state and metric to find the
next location conditional on the previous measurement (i.e., repeat steps 4–6)
Our metrics in this work are 0000 UTC 2-m temperature and 12-, 24-, and 36-hr 2-m forecast
temperature errors at every 20th grid point on the 15-km AMPS continental grid.
The change in total variance is given in Eq. 5. To find an optimal station, Eq. 5 is evaluated
for every potential observation location (which is every 5th model gridpoint), and the location of
10
Guillaume
Guillaume
Guillaume
I still think a plain english definition of the Jacobian would be helpful here
suggest moving this sentence up to where you first define x
maximum change is chosen as the optimal location. The state and metric perturbations are updated
using the square root form of the ensemble Kalman filter (Whitaker and Hamill 2002),
dxa = dxp � K̃Hdxp (7)
where the superscript a denotes the value after assimilating a new observation, and p the value
prior to assimilation. The modified Kalman gain K̃ is given by
K̃ = aK = aApHTE�1
where
a =
1+r
RHApHT +R
!�1
Once the state and metric are updated to reflect the new observation, the process repeats until the
desired number of stations is reached. This procedure ensures that each station, with the exception
of the first, is conditional on previous optimal stations.
d. Sampling and Bootstrap Error Estimates
Since the exact covariance matrix and probability density function of the surface temperature
field are unknown, we estimate sample covariances from the available AMPS data for the optimal
design calculation. Each iteration of the optimal calculation is done by randomly choosing
250 ensemble members each time from the AMPS temperature output, which are then used to
calculate the state and metric perturbations. 250 ensemble members were chosen for several
reasons. A variety of ensembles sizes were tested, from 30 members to 1000 members, and
250 member ensembles gave results that were qualitatively and quantitatively consistent with
the largest ensembles. Smaller ensembles did not produce results that were geographically
localized, instead resembling a random selection of observations. 250 was also close to the
11
Guillaume
Guillaume
out of a total sample size of ???
in this case, “ensemble” refers to a collection of different observation times. This may not be clear to the non-statisticians
computational rank of the temperature covariance matrix (when every grid point across the entire
data period was used to calculate the covariance matrix). Hence, a smaller ensemble size might
underestimate covariances, and using an ensemble larger than 250 would not provide enough
additional information to justify the computational cost of using a very large ensemble.
The number of degrees of freedom of the surface temperature field are also large, and because
sampling error must be addressed, we use this random sampling with a Monte Carlo bootstrap
approach to estimate the uncertainty in the network calculation. This bootstrap approach also
allows us to examine the error in the change in total variance for each station. Each iteration
of the optimal calculation is done by randomly choosing a new ensemble of 250 members
each time from the AMPS temperature output, which samples a large variety of different states
of the temperature field. The network identification results are determined for this ensemble,
as described in the algorithmic approach above, until 20 locations are found for the idealized
network. The process is repeated 10000 times, providing 10000 sets of 20 locations, allowing for
statistics on station location.
3. Results for Constructing Networks
We constructed two types of networks for weather and climate monitoring, one optimized for
0000 UTC 2-m temperature (monitoring), and one optimized for reducing 2-m temperature fore-
cast errors. The analysis network is constructed by using the covariance matrix of 2-m temperature
at 0000 UTC and calculated using every 20th grid point over the continent, and every 5th grid point
is considered as a possible observation (and is also used to calculate the state vector). The forecast
error network is constructed by using 2-m temperature forecast errors at a lead times of 12, 24,
and 36 hours also at every 20th grid point over the entire continent, and the state is the 0000 UTC
12
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
time step? (instead of grid point)
isn’t it not the computational cost but the ability to have meaningful differences among samples in your bootstrap procedure?
Why 20? This should be justified somewhere too.
I still think there needs to be some stated rationale for the 20th / 5th grid point subsamples
I find this notation more distracting than the simpler “00Z” notation
first use of “analysis” — needs to be defined
temperature at every 5th grid point 12, 24, and 36 hours prior.
For both networks, we consider three experiments. The first experiment (“blank slate”) finds the
best observation locations in Antarctica assuming no other stations currently exist. The second
experiment (“CD90”) is an “augmented” network, where the influence of existing stations are
removed before new stations are found. In this second case, the influence of Antarctic surface sta-
tions that report at least 90% of the time (subsequently denoted as CD90), as described in Bumbaco
et al. (2014), are first removed through multiple linear regression, and then the optimal calculation
algorithm is applied on the residual state and metric perturbation values. This method assumes
that the observations are perfect and does not take into account their error, which probably results
in too much variance being removed. This second experiment produces an optimal network that is
conditional on the current network, and shows the optimal locations to add new stations. The third
experiment is one where the CD90 stations are ranked based on how well the stations capture the
variance of the temperature field or reduce forecast errors. The metric remains the same as in the
other networks, but only grid points which are closest to CD90 locations are considered for the
observations.
a. Monitoring Networks
1) BLANK SLATE NETWORK
The first network chosen by the optimal design algorithm is one where no currently existing
stations are taken into account – the network is designed “from scratch”. The results for the
monitoring networks are given in Figs. 3 and 4. In the blank slate calculation in Fig. 3, the most
frequently chosen first station is in the Megadunes region in East Antarctica. This location is
chosen about 22% of the time, and explains nearly half of the total variance of the temperature
field. The closest CD90 station is Vostok, but the point chosen most often is within a large
13
Guillaume
Guillaume
Guillaume
as before: is this regardless of rank or do you mean that it’s chosen as the top location?
an overestimate of the variance explained
which 2 networks?
Your description of the 3rd experiment doesn’t seem consistent with what you have in your results section
gap in the existing network in East Antarctica. This region corresponds to some of the longest
correlation length scales in Antarctica (Bumbaco et al. 2014). Locations with long correlation
length scales allow information from measurements to affect a large area, which yields a relatively
larger change in the total variance compared to locations with shorter correlation length scales. A
physical explanation for why relatively long correlation length scales exist in East Antarctica is
that the strong winds, katabatic flows, and strong stability in this region dominate the variability
in the two-meter temperature field, which can have a large geographic space (e.g., Dadic et al.
(2013)). Another possible reason for the interior being chosen is that given that only land/ice
shelf locations were used for the metric and potential new observations. Coastal variability is not
a large portion of the total temperature variance over land/ice, and so locations in the interior of
the polar vortex dominate the variability over Antarctica. The second station, chosen about 22%
of the time, is very close to Siple Dome in West Antarctica, indicating its potential importance
in monitoring continental surface temperature. The third station (chosen about 4% of the time)
is located on the polar plateau poleward of in Queen Maud Land in another gap in the surface
network.
2) AUGMENTED NETWORK
The results for the augmented network are presented in Fig. 4. The results are significantly
different from the blank slate network, reflecting the contribution from existing AWS stations.
The change in variance of the stations chosen is an order of magnitude smaller than the change
in variance in the blank slate network, since much of the variance has been regressed out. The
first station (about 35%) is on the coast of East Antarctica in Coats Land and near the Ronne Ice
Shelf. The second location (about 58%) is close to the second location chosen in the blank slate
14
Guillaume
Guillaume
odd coincidence that both are chosen 22% of the time
This is an important point — would be worth exploring in another paper, and also worth repeating in your conclusions.
calculation, but further from Siple Dome. Perhaps counterintuitively, the second station is chosen
more frequently than the first, but some of this is due to spatial uncertainty. A second location is
chosen fairly frequently for the first station, close to the most frequent point. The second station
only has one frequent location, with less spatial uncertainty. This does indicate that some of
the variance after regressing the CD90 stations maybe be at the level of noise, since the linear
regression overestimates the impact of existing stations by not taking into account observational
error.
The third location (about 15%) is in Queen Maud Land in East Antarctica. Since much of the
total variance has already been removed, it is difficult to connect these results to large-scale
meteorological influences, and these results are likely due to local topological/meteorological
effects that cannot be explained by the CD90 network. The multiple linear regression used to
account for existing stations has likely overestimated the impact of the CD90 stations, and so
further work would be needed to determine a more realistic impact of these stations.
3) CD90 RANKINGS
Although the results of the blank slate calculation are suggestive of which CD90 stations are
important for surface monitoring, an additional calculation to gauge the relative importance of the
CD90 stations compares the variance reduction for the points nearest the CD90 stations to that for
the blank-slate calculation. The results are summarized in Table 1. The first station, Vostok, is
chosen about 99% of the time and is the CD90 location closest to the blank slate top station in
the Megadunes vicinity as shown in Fig. 3. Vostok reduces a similar amount of variance as the
first station of the blank slate calculation - the median variance explained by the first station in
the CD90 calculation is 43%, which is within a one standard deviation envelope of the median
15
variance explained by the first station in the blank slate calculation (45%). The second and third
stations, Siple Dome and South Pole, are the two closest CD90 stations to the most frequently
chosen locations for the second and third stations in the blank slate calculation. These results
show that Vostok, Siple Dome, and South Pole are the most important stations for monitoring
surface temperature, albeit suboptimally (meaning they explain less of the total variance than the
optimal locations). These stations are important for logistical and research purposes as well, and
putting in stations in the optimal locations may provide excellent redundancy in the network for
situations where these CD90 stations cannot report or have data interruptions.
b. Forecast Error Reduction Networks
1) BLANK SLATE NETWORK
We constructed another “from scratch” network, but this time finding optimal locations to reduce
forecast errors for temperature at various lead times. The results for reducing 2-m forecast errors
over Antarctica for forecast lead times of 12, 24, and 36 hours are shown in Figs. 5, 6, and 7,
respectively. For all the lead times, there is more spatial uncertainty in terms of the most optimal
location, and there is a smaller percentage change in the total variance compared to the analysis
network. The first station for reducing 12-hr 2-m temperature forecast errors is located near Terra
Nova Bay, the second is in Marie Byrd Land in West Antarctica, and the third station is close to
the first, but closer to Cape Adare. The Terra Nova Bay/Marie Byrd Land/Ross Shelf locations
are places of some of the most significant mesoscale activity concurring with katabatic flows in
Antarctica, especially compared to the Antarctic Peninsula (Carrasco et al. 2003). The first three
locations are likely statistically sensitive to large-scale conditions that can give rise to significant
mesoscale activity, which dominates the surface temperature variability on short time scales. If
the right conditions for such activity are captured by surface observations, then forecasts can be
16
Guillaume
Guillaume
somewhat
It would be great to include this comparison for all 3 top stations in Table 1, by adding an additional column that lists the corresponding variance reduction for the nearby blank slate location
associated
improved and thus forecast errors reduced for lead times of 12 hours.
The results for the 24- and 36-hr lead time networks are similar to those for 12-hr lead times, with
the first most frequent location in East Antarctica near Vostok, the second in Marie Byrd Land
(close to the 12-hr second location), and the third near Halley in East Antarctica (Coats Land). For
the 36-hr network, a location near Cape Adare is chosen relatively frequently as well (although it
is not the most frequently chosen location). These results are consistent with the analysis network
and the 12-hr network, which we suspect is due to the fact that East Antarctica is a location with
long correlation length scales for surface temperature, and so more optimally observing surface
temperature in this region (and assimilating it into a forecast model) will likely reduce 24- to 36-hr
forecast errors. The locations of significant mesoscale and katabatic activity are also highlighted
as regions to measure to help reduce forecast errors.
2) AUGMENTED NETWORKS
Once again an augmented network conditional on the existing CD90 stations is constructed for
forecast lead times of 12, 24, and 36 hours. The results are presented in Figs. 8, 9, and 10 respec-
tively. For all three lead times, similar locations are chosen as in the unaugmented forecast error
networks. This calculation is different from the monitoring network results, where both the “blank
slate” and augmented networks give fairly similar results. The first location is in Marie Byrd Land,
the second in East Antarctica in Coats Land, and the third near Cape Adare. A significant portion
of the surface temperature forecast error variance is removed through the linear regression used
to remove the influence of the CD90 stations, so the locations chosen only explain a very small
amount of variance ( 1%-2%). As in the monitoring experiments, the results are consistent with
each other and are still somewhat geographically localized. This suggests that the current CD90
17
Guillaume
Does it make sense that the 12-, 24-, and 36-hour networks highlight similar locations w.r.t. the mesoscale/katabatic activity? I would think there would be a lagrangian effect that would be important — i.e., longer lead times point further upstream..
Too vague. What exactly do you mean? This doesn’t logically support the subsequent sentence.
network may not be capturing some variance in the temperature field that could reduce forecast
errors for 12, 24, and 36 hours.
3) CD90 RANKINGS
The CD90 stations were also ranked in terms of their importance in reducing 12-, 24-, and 36-hr
forecast errors. The results are given in Tables 2 - 4. The station chosen most frequently for the
first location for 12- and 24-hr forecast lead times was Vostok (which was also the most chosen
location for the analysis network), and McMurdo was chosen for the top location for 36-hr forecast
errors. The second and third stations are in different locations from the second and third stations
chosen in the forecast error analysis network. Siple Dome and South Pole were the second and
third stations chosen for the analysis network. For reducing forecast errors, McMurdo and Halley
are chosen as the best locations (with the exception of 36 hours, where Vostok is second and
Halley is third). These results are consistent with the blank slate forecast error networks – the
region near Vostok is highlighted significantly for all three lead times. McMurdo is not far from
the second location chosen in Queen Maud Land, although it is not the closest (Siple Dome is), but
it probably covaries more with the location in Queen Maud Land than Siple Dome does. Halley
is on the East Antarctic coast, and is the closest CD90 station to the location chosen for the third
observation in the blank slate calculations. However, as in the network chosen for monitoring at
0000 UTC, these stations are suboptimal compared to the optimally chosen stations for forecast
errors. If one is interested in reducing forecast errors for surface temperature, assimilating Vostok,
McMurdo, and Halley more often may be worth investigating. If stations were installed in the
locations chosen in the blank slate calculations, the already-existing stations could provide useful
redundancy for the optimally sited stations.
18
Guillaume
Guillaume
have you noted previously that these are not consistently assimilated, and why? An unacquainted reader might
assume that these are always assimilated, by default.
c. Statistical Significance and Error in Results
All of the optimal networks constructed here are for 20 stations. This number was chosen to
examine the variance reduction predicted by the optimal design calculation beyond just a few
stations, but not so many as to be computationally too intensive. In real world situations the
number of new stations place will likely be affected by practical concerns, such as budget or time
to service only a certain number of stations during a field season in Antarctica. However, it is
important to know how many stations can be chosen by this optimal design approach before the
variance reduction of a new station is no longer statistically significant and so is not providing any
new information above a randomly chosen station.
To test whether the variance reduction of each station was distinct, we performed a Kolmogorov-
Smirnov (KS) test between the variance reduction distribution of a station obtained from Monte
Carlo sampling and the next one chosen (for instance, between the distribution of the 3rd and 4th
station). For all of the experiments discussed above, all 20 stations pass the KS test at a 99%
confidence level, but all of the stations beyond the 2nd or 3rd only remove a small percentage of
the total variance ( 1-2%). It is not clear whether these small percentage changes are choosing
meaningful locations. A large fraction of this variance is removed by the first station. For
example, the first station for the 0000 UTC temperature monitoring network removes about 45%
of the total variance (as seen in Fig. 3). This is possibly an overestimate of the impact the first
station has on the total variance, and so subsequent stations may be optimizing over residual
variances that could be on par with noise. A similar concern is also at play for the augmented
networks. The linear regression used to remove the influence of the CD90 stations removes
about 85-95% of the total variance, leaving little variance with which to find optimal location
conditional on the CD90 stations. The variance reduction of the augmenting stations are very
19
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
Guillaume
Somehow, this section needs to be incorporated into the methods section instead of here. I see the issue: you discuss the results below, but that problem can be solved by separating out the pieces that should go under Summary/Discussion. Otherwise readers have no basis for understanding your choice of 20 stations in earlier parts of the manuscript.
too
suggest replacing with something simpler, like “such as accessibility, maintenance requirements, and available budget”.
Suggest deleting this paragraph entirely, and adapting the first sentence to be an introduction to the second paragraph below.
e.g.: “For simplicity, all of the optimal network calculations are truncated after 20 stations have been identified.”
One possibility is that the variance explained by the first station is overestimated, leaving artificially small residuals for
subsequent stations, which may not be distinct from random noise. This could also be the issue for the augmented networks.
statistically robust
latter sites
the
small, and so the variance reduction on new stations may not be significant. This is suggested in
geographic distribution of locations chosen in some of the networks. The first location chosen is
geographically localized (a few points being chosen frequently, and most other points chosen are
clustered around the most frequent points). The second and third stations are more geographically
distributed, with a few locations chosen most frequently but with a lot of geographic variability
over all 10000 iterations. A purely random selection would choose all points over Antarctica
with the same frequency, and so a large geographic distribution of points chosen approaches the
random result.
One technique that can be employed to adjust variance reduction calculation is covariance
localization. Spurious correlations may occur due to sampling error or other effects between
geographically distant locations that in reality are uncorrelated. For instance, the top of the
West Antarctic peninsula may have a spurious correlation with a point in the interior of East
Antarctica on daily timescales in our analysis. However these two points are distant, separated
by significant topography, have very different weather on short timescales (coastal versus high
plateau), and the correlation length scales in the peninsula are much shorter than the distance
to East Antarctica (Bumbaco et al 2014). Some part of this correlation is likely a statistical
anomaly and not physically real. Covariance localization seeks to either remove or minimize these
spurious correlations. This is usually done by weighting covariances using a function of distance
– covariances between points that are far apart are weighted less, and covariances between points
closer together are weighted more.
As a sample calculation, we applied covariance location as described in Gaspari and Cohn (1999)
to the variance calculation by multiplying the Kalman gain in equation 7 for the blank slate
monitoring network case. A localization radius of 2000km, roughly the average correlation length
scale over Antarctica, was used for the calculation.
20
Guillaume
Guillaume
the
In particular because in a model, which will tends to under-represents local-scale influences on weather variability
or an artifact of the AMPS model
The results are shown in Fig. 11. The top three stations are in similar locations are the top the
stations for the unlocalized calculation, but the variance reduction of the stations are an order of
magnitude smaller than the variance reduction in the unlocalized calculation. As such it is likely
many more stations could be found before the results are not statistically different from that of
randomly chosen stations. The qualitative similarities between both localized and unlocalized
results that an unlocalized approach can be used to find a few stations, but for a large amount of
stations a localization approach may be necessary. However, how localization should be applied
(what is the appropriate radius, should it be adaptive, how to apply localization in time for forecast
errors etc.) is very dependent on the particular covariance structures of a system of interest. In
order to make decisions concerning where to place new stations, localization should be explored
and understood in the specific context of a desired network in future work.
Another valuable experiment for further work, especially for determining where to place new
stations in Antarctica, would be an Observing System Experiment (OSE). An OSE is an
experiment that is used to evaluate the impact of observations on a model analysis or forecast
in a context that mimics a real-world forecast system. Once an OSE is performed, it can be
used to adjust the optimal design calculation (by adjusting how much variance is removed, how
much localization is applied), and so can provide a more realistic variance reduction spectrum,
especially for finding optimal locations to reduce forecast errors. An OSE runs an actual forecast
model, and so more accurately captures the nonlinear relationship between a current time and a
future forecast hour. In our approach, the relationship between an observation and forecast errors
12, 24, and 36 hours in the future is treated as linear. Although this relationship may be accurate
to first order on short timescales and short geographic distances, it will not hold well for longer
lead times where more nonlinear dynamics and large changes in synoptic conditions come into
play, and so our ensemble sensitivity approach overestimates the impact of an observation on re-
21
Guillaume
Guillaume
suggests?
calibrate?
ducing forecast errors by overestimating how much that observation correlates with forecast errors.
4. Summary and Discussion
Good observations are crucial for supporting science and operations over the Antarctic conti-
nent. Due to the unique and harsh conditions in Antarctica, it is expensive to install and maintain
a dense observing network. This means that it is critical to weigh the potential costs and benefits
when deciding where to place new stations and whether or not to retain current stations. Optimal
design provides a means of maximizing the information and coverage of the observing network,
and provide an objective framework to help make such decisions.
Two types of networks have been considered – one aimed at accurately monitoring 2-m tempera-
ture over the continent, and one aimed at reducing forecast errors at lead times of 12, 24, and 36
hours. The data used to perform these calculations come from AMPS over a period of 4 years. A
multivariate ensemble sensitivity approach with Monte Carlo bootstrapping is used to find optimal
locations, with optimal locations being those which maximally reduce the total variance of the
metric (which in this case was surface temperature). Three calculations were performed for three
types of networks. First, a blank slate calculation is done to find optimal locations if no stations
currently existed in Antarctica. The second calculation involved regressing out the influence of
the CD90 stations, and then optimizing over the residuals, providing an optimal network that is
conditional on already existing networks. The third calculation is a ranking of the CD90 stations,
where only the grid points closest to those stations were considered as potential observations to
allow for gauging their relative importance with respect to the metrics.
For the blank slate calculation where no other stations exist for monitoring daily surface tem-
perature, the top three locations are central East Antarctica near Vostok, in Marie Byrd Land on
22
the Siple Coast, and again East Antarctica but near South Pole. The most variance is explained
by the first station, and all the results give clear, geographically localized areas that are most
sensitive to the surface temperature field over 10000 Monte Carlo iterations. The augmented
network provides slightly different results, with the top locations being near Halley, a different
location along the Siple Coast, and a location in East Antarctica in Queen Maud Land. A lot of the
variance in the metric is removed, and so these conditional stations explain very little of the total
variance, suggesting that the current CD90 network already provides adequate coverage, although
some gaps do exist, especially in East Antarctica (see Bumbaco et al. (2014)). The CD90 ranked
network chooses stations that are geographically close to the optimal blank slate locations, with
the top 3 stations being Vostok, Siple Dome, and South Pole. East Antarctica is an area of long
correlation length scales on daily timescales, and so a station in East Antarctica will explain much
of the total temperature variance since it will covary with much of the region.
The networks for forecast errors are fairly consistent with each other for all lead times. The
blank slate network chooses Oates Land, Marie Byrd Land, and Cape Adare for the first three
stations for a lead time of 12 hours. These locations are places of largest mesocyclone activity in
Antarctica, and hence may be a large source of forecast errors if large-scale conditions suitable for
mesocyclone development are not captured adequately by the observing network. For lead times
of 24 and 36 hours, East Antarctica near Vostok, Marie Byrd Land, and a place on the coast in
Queen Maud land near the Weddell Sea are chosen as the top three stations. The calculation with
the CD90 stations regressed out highlights the same areas, suggesting there may be additional
information that could be measured to reduce forecast errors that the current network is not
capturing. Finally, the CD90 rankings choose the same 3 stations for all lead times as the best -
Vostok, McMurdo, and Halley. These locations are geographically close and statistically similar
to the locations identified in the blank slate calculation, but do not explain as much total variance
23
as the optimally chosen locations. As with the analysis case, if new stations are installed, these
three stations would provide important redundancy in the network for the optimal stations.
The uses of the Antarctic surface network are manyfold, and so a more complete evaluation is
necessary to . However, the framework presented here provides a flexible methodology to do
so without the need to run forecasts. Because we used the total variance as metric, the results
are consistent with a statistical analysis of correlation length scales. However, this methodology
is not constrained to simply using total variance – other metrics, such as leading eigenvalue or
determinant of the covariance matrix (which measure information content), may also be used
within the same general algorithm. This methodology can also take into account already existing
stations and find new locations conditional on those stations, providing an objective means to do
so.
More work needs to performed, however, to address the statistical significance of stations chosen
and what adjustments need to be made to the algorithm for specific applications of this optimal
design approach. Our work likely overestimates the impact a station has on a metric, and so the
variance removed by the first few stations is too large. The residual variance used to find more
stations is noisy and does not provide statistically significant results. These issues need to be
investigated before the optimal design algorithm is used to make decisions on where to place new
stations in Antarctica or elsewhere. Further work should focus on adjusting the variance reduction
in the optimal design calculation. This may be done through applying covariance localization and
performing OSEs.
24
Guillaume
Guillaume
Guillaume
is needed
variance explained by a particular station
5. Acknowledgements
This work was funded by NSF grant number 1043090. The authors would like to thank Kevin
Manning for providing the AMPS data. The authors would also like to thank Matthew Lazzara for
all of this helpful discussions concerning the Antarctic surface observing network, and Eric Steig
for his discussions concerning Antarctic weather and climate.
25
References
Aberson, S. D. (2003). Targeted observations to improve operational tropical cyclone track fore-
cast guidance. Monthly weather review, 131(8).
Ancell, B. and Hakim, G. J. (2007). Comparing adjoint-and ensemble-sensitivity analysis with
applications to observation targeting. Monthly Weather Review, 135(12):4117–4134.
Baker, N. L. and Daley, R. (2000). Observation and background adjoint sensitivity in the adap-
tive observation-targeting problem. Quarterly Journal of the Royal Meteorological Society,
126(565):1431–1454.
Barker, D. M., Huang, W., Guo, Y.-R., Bourgeois, A., and Xiao, Q. (2004). A three-dimensional
variational data assimilation system for mm5: Implementation and initial results. Monthly
Weather Review, 132(4):897–914.
Bishop, C. H., Etherton, B. J., and Majumdar, S. J. (2001). Adaptive sampling with the ensemble
transform kalman filter. part i: Theoretical aspects. Monthly weather review, 129(3):420–436.
Bromwich, D. H., Monaghan, A. J., Manning, K. W., and Powers, J. G. (2005). Real-time fore-
casting for the antarctic: An evaluation of the antarctic mesoscale prediction system (amps)*.
Monthly Weather Review, 133(3):579–603.
Bromwich, D. H., Otieno, F. O., Hines, K. M., Manning, K. W., and Shilo, E. (2013). Comprehen-
sive evaluation of polar weather research and forecasting model performance in the antarctic.
Journal of Geophysical Research: Atmospheres, 118(2):274–292.
Buizza, R., Cardinali, C., Kelly, G., and Thépaut, J.-N. (2007). The value of observations. ii: The
value of observations located in singular-vector-based target areas. Quarterly Journal of the
Royal Meteorological Society, 133(628):1817–1832.
26
Buizza, R. and Montani, A. (1999). Targeting observations using singular vectors. Journal of the
Atmospheric Sciences, 56(17):2965–2985.
Bumbaco, K. A., Hakim, G. J., Mauger, G. S., Hryniw, N., and Steig, E. J. (2014). Evaluating the
antarctic observational network with the antarctic mesoscale prediction system (amps). Monthly
Weather Review, 142(10):3847–3859.
Carrasco, J. F., Bromwich, D. H., and Monaghan, A. J. (2003). Distribution and characteristics of
mesoscale cyclones in the antarctic: Ross sea eastward to the weddell sea*. Monthly Weather
Review, 131(2):289–301.
Chaloner, K. and Verdinelli, I. (1995). Bayesian experimental design: A review. Statistical Sci-
ence, pages 273–304.
Comboul, M., Emile-Geay, J., Hakim, G. J., and Evans, M. (2005). Paleoclimate sampling as a
sensor placement problem. Journal of Climate, 28:in review.
Dadic, R., Mott, R., Horgan, H. J., and Lehning, M. (2013). Observations, theory, and modeling of
the differential accumulation of antarctic megadunes. Journal of Geophysical Research: Earth
Surface, 118(4):2343–2353.
Dimet, F.-X. L. and Talagrand, O. (1986). Variational algorithms for analysis and assimilation of
meteorological observations: theoretical aspects. Tellus A, 38(2):97–110.
Fedorov, V. V. (1972). Theory of optimal experiments. Elsevier.
Gaspari, G. and Cohn, S. E. (1999). Construction of correlation functions in two and three dimen-
sions. Quarterly Journal of the Royal Meteorological Society, 125(554):723–757.
27
Huang, X.-Y., Xiao, Q., Barker, D. M., Zhang, X., Michalakes, J., Huang, W., Henderson, T.,
Bray, J., Chen, Y., Ma, Z., et al. (2009). Four-dimensional variational data assimilation for wrf:
formulation and preliminary results. Monthly Weather Review, 137(1):299–314.
Huntley, H. S. and Hakim, G. J. (2010). Assimilation of time-averaged observations in a quasi-
geostrophic atmospheric jet model. Climate dynamics, 35(6):995–1009.
Khare, S. and Anderson, J. (2006). A methodology for fixed observational network design: theory
and application to a simulated global prediction system. Tellus A, 58(5):523–537.
Lazzara, M. A., Weidner, G. A., Keller, L. M., Thom, J. E., and Cassano, J. J. (2012). Antarctic
automatic weather station program: 30 years of polar observation. Bulletin of the American
Meteorological Society, 93(10):1519–1537.
Majumdar, S., Bishop, C., Buizza, R., and Gelaro, R. (2002a). A comparison of ensemble-
transform kalman-filter targeting guidance with ecmwf and nrl total-energy singular-vector
guidance. Quarterly Journal of the Royal Meteorological Society, 128(585):2527–2549.
Majumdar, S., Bishop, C. H., Etherton, B., and Toth, Z. (2002b). Adaptive sampling with the
ensemble transform kalman filter. part ii: Field program implementation. Monthly Weather
Review, 130(5):1356–1369.
Mauger, G., Bumbaco, K., Hakim, G., and Mote, P. (2013). Optimal design of a climatological
network: beyond practical considerations. Geoscientific Instrumentation, Methods and Data
Systems, 2(2):199–212.
Monaghan, A. J., Bromwich, D. H., Powers, J. G., and Manning, K. W. (2005). The climate
of the mcmurdo, antarctica, region as represented by one year of forecasts from the antarctic
mesoscale prediction system*. Journal of Climate, 18(8):1174–1189.
28
Powers, J. G., Manning, K. W., Bromwich, D. H., Cassano, J. J., and Cayette, A. M. (2012). A
decade of antarctic science support through amps. Bulletin of the American Meteorological
Society, 93(11):1699–1712.
Powers, J. G., Monaghan, A. J., Cayette, A. M., Bromwich, D. H., et al. (2003). Real-time
mesoscale modeling over antarctica: The antarctic mesoscale prediction system*. Bulletin of
the American Meteorological Society, 84(11):1533.
Schlosser, E., Manning, K., Powers, J., Duda, M., Birnbaum, G., and Fujita, K. (2010). Charac-
teristics of high-precipitation events in dronning maud land, antarctica. Journal of Geophysical
Research: Atmospheres, 115(D14).
Skamarock, W. C. and Klemp, J. B. (2008). A time-split nonhydrostatic atmospheric model for
weather research and forecasting applications. Journal of Computational Physics, 227(7):3465–
3485.
Whitaker, J. S. and Hamill, T. M. (2002). Ensemble data assimilation without perturbed observa-
tions. Monthly Weather Review, 130(7):1913–1924.
Xue, M., Tong, M., and Droegemeier, K. K. (2006). An osse framework based on the ensemble
square root kalman filter for evaluating the impact of data from radar networks on thunderstorm
analysis and forecasting. Journal of Atmospheric and Oceanic Technology, 23(1):46–66.
29
LIST OF TABLESTable 1. Ranked results of top three CD90 stations chosen to optimize for continent-
wide two-meter temperature. The frequency chosen for the stations 2 and 3indicate how often that location is chosen, but not conditional that the previousstation be optimal (i.e., every time the top location for station two is chosen thetop first station chosen may not have been the most frequently chosen stationover all trials). The variance reduction gives the median conditional variancereduction over all trials. . . . . . . . . . . . . . . . . . 30
Table 2. Ranked results of top three CD90 stations chosen to optimize for continent-wide two-meter temperature forecast errors for a lead time of 12 hours. Thefrequency chosen for the stations 2 and 3 indicate how often that location ischosen, but not conditional that the previous station be optimal (i.e., every timethe top location for station two is chosen the top first station chosen may nothave been the most frequently chosen station over all trials). The variancereduction gives the median conditional variance reduction over all trials. . . . . 31
Table 3. Ranked results of top three CD90 stations chosen to optimize for continent-wide two-meter temperature forecast errors for a lead time of 24 hours. Thefrequency chosen for the stations 2 and 3 indicate how often that location ischosen, but not conditional that the previous station be optimal (i.e., every timethe top location for station two is chosen the top first station chosen may nothave been the most frequently chosen station over all trials). The variancereduction gives the median conditional variance reduction over all trials. . . . . 32
Table 4. Ranked results of top three CD90 stations chosen to optimize for continent-wide two-meter temperature forecast errors for a lead time of 36 hours. Thefrequency chosen for the stations 2 and 3 indicate how often that location ischosen, but not conditional that the previous station be optimal (i.e., every timethe top location for station two is chosen the top first station chosen may nothave been the most frequently chosen station over all trials). The variancereduction gives the median conditional variance reduction over all trials. . . . . 33
30
TABLE 1. Ranked results of top three CD90 stations chosen to optimize for continent-wide two-meter tem-
perature. The frequency chosen for the stations 2 and 3 indicate how often that location is chosen, but not
conditional that the previous station be optimal (i.e., every time the top location for station two is chosen the top
first station chosen may not have been the most frequently chosen station over all trials). The variance reduction
gives the median conditional variance reduction over all trials.
34
35
36
37
38
Station Number Name Frequency Chosen Median Variance Reduction in Metric
Station 1 Vostok 98.54% 43.08%
Station 2 Siple Dome 74.81% 5.71%
Station 3 South Pole 62.61% 5.29%
31
Suggest adding a column with the variance reduction corresponding to the nearby optimal location in each case
The “Frequency Chosen” numbers are unclear: Does this refer to the frequency that this site is chosen in any of the top X stations (e.g., any that fall above the threshold for statistical significance)?
I assume you are instead listing the % of the time it’s selected for the specific ranking that’s most common — e.g., Vostok is selected 1st 98.54% of the time. But this might be misleading, since one location may be selected frequently but not consistently placed in one specific ranking.
TABLE 2. Ranked results of top three CD90 stations chosen to optimize for continent-wide two-meter temper-
ature forecast errors for a lead time of 12 hours. The frequency chosen for the stations 2 and 3 indicate how often
that location is chosen, but not conditional that the previous station be optimal (i.e., every time the top location
for station two is chosen the top first station chosen may not have been the most frequently chosen station over
all trials). The variance reduction gives the median conditional variance reduction over all trials.
39
40
41
42
43
Station Number Name Frequency Chosen Median Variance Reduction in Metric
Station 1 Vostok 70.73% 24.44%
Station 2 McMurdo 70.60% 4.84%
Station 3 Halley 63.68% 2.62%
32
TABLE 3. Ranked results of top three CD90 stations chosen to optimize for continent-wide two-meter temper-
ature forecast errors for a lead time of 24 hours. The frequency chosen for the stations 2 and 3 indicate how often
that location is chosen, but not conditional that the previous station be optimal (i.e., every time the top location
for station two is chosen the top first station chosen may not have been the most frequently chosen station over
all trials). The variance reduction gives the median conditional variance reduction over all trials.
44
45
46
47
48
Station Number Name Frequency Chosen Median Variance Reduction in Metric
Station 1 Vostok 78.56% 21.66%
Station 2 McMurdo 62.69% 4.52%
Station 3 Halley 72.03% 2.8%
33
TABLE 4. Ranked results of top three CD90 stations chosen to optimize for continent-wide two-meter temper-
ature forecast errors for a lead time of 36 hours. The frequency chosen for the stations 2 and 3 indicate how often
that location is chosen, but not conditional that the previous station be optimal (i.e., every time the top location
for station two is chosen the top first station chosen may not have been the most frequently chosen station over
all trials). The variance reduction gives the median conditional variance reduction over all trials.
49
50
51
52
53
Station Number Name Frequency Chosen Median Variance Reduction in Metric
Station 1 McMurdo 49.75% 24.29%
Station 2 Vostok 51.24% 4.77%
Station 3 Halley 82.14% 2.89%
34
LIST OF FIGURESFig. 1. Plot of the AMPS grids, including the 15km grid. Taken from Powers et al. (2012). . . . . 35
Fig. 2. Antarctica with referenced regions and locations. . . . . . . . . . . . . . 36
Fig. 3. First three optimal stations for monitoring 0000 UTC 2-m temperature over the entire con-tinent. Cells that are colored indicate that location was chosen at least once, and colorindicates the frequency with which that location is chosen over 10000 iterations. The greencircles represent the CD90 stations. The green boxes indicate the median percentage of thetotal variance that is explained by top three stations. The boxes in the spectrum plot indicatethe lower to upper quartile range for the optimal change in total variance over all iterations,and whiskers extend to 99.3% (two standard deviations) of the distribution. . . . . . . 37
Fig. 4. Same as Fig. 3, but with the influence of the CD90 stations regressed out before optimalstations are chosen. . . . . . . . . . . . . . . . . . . . . . 38
Fig. 5. Same as Fig. 3, but using 2-m temperature 12-hr forecast errors over the entire continent asthe metric. . . . . . . . . . . . . . . . . . . . . . . . 39
Fig. 6. Same as Fig. 3, but using 2-m temperature 24-hr forecast errors over the entire continent asthe metric. . . . . . . . . . . . . . . . . . . . . . . . 40
Fig. 7. Same as Fig. 3, but using 2-m temperature 36-hr forecast errors over the entire continent asthe metric. . . . . . . . . . . . . . . . . . . . . . . . 41
Fig. 8. Same as Fig. 5, but using 2-m temperature 12-hr forecast errors over the entire continent asthe metric, with the influence of the CD90 stations regressed out. . . . . . . . . . 42
Fig. 9. Same as Fig. 5, but using 2-m temperature 24-hr forecast errors over the entire continent asthe metric, with the influence of the CD90 stations regressed out. . . . . . . . . . 43
Fig. 10. Same as Fig. 5, but using 2-m temperature 36-hr forecast errors over the entire continent asthe metric, with the influence of the CD90 stations regressed out. . . . . . . . . . 44
Fig. 11. Same as Fig. 3, but with covariance localization applied. The variance reduction of thefirst 3 stations are about an order of magnitude smaller than the variance reduction of theunlocalized network. . . . . . . . . . . . . . . . . . . . . 45
35
FIG. 1. Plot of the AMPS grids, including the 15km grid. Taken from Powers et al. (2012).
36
I assume you’ll generate a higher resolution version of this image? I like the format, but this is too grainy
FIG. 2. Antarctica with referenced regions and locations.
37
This should either be integrated into Figure 1 or shown alongside it as Fig 1a and 1b — same domain/etc.; one shows elevation (and maybe ice shelves), the other shows the referenced regions/locations
FIG. 3. First three optimal stations for monitoring 0000 UTC 2-m temperature over the entire continent. Cells
that are colored indicate that location was chosen at least once, and color indicates the frequency with which that
location is chosen over 10000 iterations. The green circles represent the CD90 stations. The green boxes indicate
the median percentage of the total variance that is explained by top three stations. The boxes in the spectrum
plot indicate the lower to upper quartile range for the optimal change in total variance over all iterations, and
whiskers extend to 99.3% (two standard deviations) of the distribution.
54
55
56
57
58
59
38
Clarify that this is the blank slate calculation, for comparison with Fig 4
FIG. 4. Same as Fig. 3, but with the influence of the CD90 stations regressed out before optimal stations are
chosen.
60
61
39
FIG. 5. Same as Fig. 3, but using 2-m temperature 12-hr forecast errors over the entire continent as the metric.
40
FIG. 6. Same as Fig. 3, but using 2-m temperature 24-hr forecast errors over the entire continent as the metric.
41
FIG. 7. Same as Fig. 3, but using 2-m temperature 36-hr forecast errors over the entire continent as the metric.
42
FIG. 8. Same as Fig. 5, but using 2-m temperature 12-hr forecast errors over the entire continent as the metric,
with the influence of the CD90 stations regressed out.
62
63
43
FIG. 9. Same as Fig. 5, but using 2-m temperature 24-hr forecast errors over the entire continent as the metric,
with the influence of the CD90 stations regressed out.
64
65
44
FIG. 10. Same as Fig. 5, but using 2-m temperature 36-hr forecast errors over the entire continent as the
metric, with the influence of the CD90 stations regressed out.
66
67
45
FIG. 11. Same as Fig. 3, but with covariance localization applied. The variance reduction of the first 3 stations
are about an order of magnitude smaller than the variance reduction of the unlocalized network.
68
69
46