\chapter{Part I: Observing system simulation experiments of a developing convective system}
\label{osse_chapter}
An OSSE was designed to explore the capability of EnKF data assimilation to produce analyses of a more complex convective event, such as that associated with a developing convective system that grows upscale through cell and cold pool mergers. An important aspect of the EnKF is analysis sensitivity to choices for covariance localization, which will be the focus of this chapter. The combination of the type of convective system and systematic study of localization sensitivities make the present OSSEs unique.

A summary of localization choices in the convective-scale DA literature is provided in the next section. Section 3.2 describes the experimental design, and Sections 3.3 and 3.4 examine the impacts of the localization cutoff, \( r \), on analysis error and the analysis increments, respectively. Section 3.5 examines the correlation length scales between the observations and state, while Section 3.6 examines short-term ensemble forecasts spawned from the final EnKF analyses. Section 3.7 contains conclusions and a synthesis of the results. Much of this chapter is derived from \citet{sobashstensrud13}\footnote{\copyright 2013, American Meteorological Society}.

\section{Localization for radar data assimilation}
Insufficient ensemble size leads to degraded background error covariance statistics due to sampling error, resulting in a sub-optimal data assimilation update that is propagated forward by the ensemble advance. For convective-scale EnKF applications, having a large enough ensemble of model simulations (\textgreater\textgreater 100 members) to mitigate sampling errors is computationally prohibitive, thus these errors are inevitable for such experiments. A ubiquitous treatment to reduce these errors is to confine the impact of an observation to nearby model grid points. Errors in the sample correlation are greatest where the true correlations are small \citep{fisher15}, thus the true correlation structure is more robustly measured at grid points near the observation, where the correlations should be largest. Thus, using localization implies that distant estimates of non-zero correlations are more vulnerable to sampling error and should be ignored.

Localization is commonly implemented by weighting the observation-state covariance estimates from the ensemble using a function that becomes zero beyond a specified cutoff distance. This was first applied using a quasigeostrophic model by \citet{houtekamermitchell98}, hereafter HM98, and further refined to utilize a Schur (elementwise) product for filtering noisy covariances. This has several benefits, including reducing computational burden by only updating a subset of model state points, and increasing the rank of the background error covariance matrix \citep{hamilletal01}.
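
As a point of reference, Schur-product localization is often expressed through the Kalman gain. The expression below is a sketch of that general form and is not necessarily the formulation used in any of the studies cited here. The symbols are introduced only for illustration: \(\mathbf{P}^f\) is the ensemble-estimated background error covariance, \(\mathbf{H}\) is a linearized observation operator, \(\mathbf{R}\) is the observation error covariance, and \(\mathbf{C}_{xy}\) and \(\mathbf{C}_{yy}\) are matrices of distance-dependent localization weights between state points and observations, and among observations, respectively.
\[
   \mathbf{K} = \left[ \mathbf{C}_{xy} \circ \left( \mathbf{P}^f \mathbf{H}^{T} \right) \right] \left[ \mathbf{C}_{yy} \circ \left( \mathbf{H} \mathbf{P}^f \mathbf{H}^{T} \right) + \mathbf{R} \right]^{-1}
\]
Here \(\circ\) denotes the elementwise product. Because the weights are zero beyond the cutoff distance, the corresponding entries of the gain vanish, and only state points within that distance of an observation are updated.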

Improper localization can introduce imbalance into the analysis, thus optimally choosing localization is essential for numerical weather prediction \citep{greybushetal11}. Flow-dependent error covariance structures vary in time and space; adaptive localization techniques that permit these structures (e.g. \citealt{bishophodyss09a,bishophodyss09b}) could provide more benefit than the fixed weighting schemes described above, but have not yet been applied to convective-scale EnKF.

All previous convective-scale EnKF studies have implemented some manner of covariance localization (Table \ref{covlocsummary}). When implementing localization, the two adjustable parameters are the weighting function and the cutoff distance\footnote{Here, the cutoff distance, \(r\), is the distance beyond which the localization weights are zero. Some literature defines the term “cutoff” to mean a half-width, that is, \(0.5r\).}, \( r \). Early studies used a boxcar weighting function (e.g. HM98), while later studies (e.g. \citealt{houtekamermitchell01}) have implemented a weighting function that approximates a Gaussian function (\citealt{gasparicohn99}; hereafter GC99). Often, tuning is used to select an optimal \( r \) for the weighting function, although the specific sensitivities often go undocumented or a cutoff is chosen following recommendations in previous work. Ultimately, the optimal localization cutoff depends on the ensemble size, state variable, observation type, density, location, and the dynamics of the observed system. Many of these characteristics differ among the previous studies, preventing any robust conclusions about the effects of localization on the analyses, especially those that utilize WSR-88D radar datasets.
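
For reference, the GC99 weighting function is usually taken to be the compactly supported, fifth-order piecewise rational function from that paper. A sketch of its form is given below and should be checked against GC99 before use. The notation is introduced here for illustration: \(d\) is the separation between an observation and a state point, \(c = 0.5r\) is the half-width, and \(z = d/c\).
\[
   \rho(z) = \left\{
   \begin{array}{ll}
      -\frac{1}{4}z^5 + \frac{1}{2}z^4 + \frac{5}{8}z^3 - \frac{5}{3}z^2 + 1, & 0 \le z \le 1 \\
      \frac{1}{12}z^5 - \frac{1}{2}z^4 + \frac{5}{8}z^3 + \frac{5}{3}z^2 - 5z + 4 - \frac{2}{3z}, & 1 < z \le 2 \\
      0, & z > 2
   \end{array}
   \right.
\]
Under this form the weight is 1 at zero separation, \(5/24 \approx 0.21\) at the half-width (\(d = 0.5r\)), and exactly zero at and beyond the cutoff (\(d = r\)). A boxcar function, by contrast, applies a weight of 1 everywhere inside the cutoff.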

\begin{table}
\centering
\begin{tabular}{ r || m{2cm} | m{1.5cm} | m{2.5cm} | m{2.5cm} }
Reference & Ensemble Size & Analysis grid (km) & Radar obs spacing & Localization cutoff (\(r\), km) \\
\hline \hline
\citet{snyderzhang03} & 50 & 2 & Model grid & H: 4; V: 4 \\
\citet{dowelletal04} & 50 & 2 & Radar scans & H: 6; V: 6 \\
\citet{tongxue05} & 100 & 2 & Model grid & H: 8; V: 8 \\
\citet{cayaetal05} & 100 & 2 & Radar scans & H: 7.3; V: 7.3 \\
\citet{aksoyetal09} & 50 & 2 & Radar scans & H: 5; V: 4 \\
\citet{dowellwicker09} & 50 & 1 & Radar scans & H: 6; V: 6 \\
\citet{dowelletal11} & 50 & 1 & Radar scans & H: 6; V: 6 \\
\citet{dongetal11} & 50 & 2 & Radar scans & H: 6; V: 6 \\
\citet{dawsonetal12} & 30 & 1 & Radar scans & H: 12; V: 6 \\
\end{tabular}
\caption{Summary of covariance localization choices in a selection of previous convective-scale EnKF studies. Radar observation spacing of ``radar scans'' indicates observations are assimilated at model grid points in the horizontal, but kept on the radar sweep surfaces in the vertical. ``Model grid'' indicates observations are located at model grid points in both the horizontal (H) and vertical (V) directions.}
\label{covlocsummary}
\end{table}

Early studies using OSSEs chose \( r \) between 4 and 8 km, with equivalent values for the horizontal and vertical cutoffs (i.e., the region of influence around an observation is a sphere). For example, \citet{snyderzhang03}, in the first documented convective-scale perfect-model OSSE of a supercell, used \( r \) = 4 km for a 50-member ensemble with a boxcar weighting function. While not the focus of their work, their results showed some sensitivity to the cutoff, with large \( r \) values (\textgreater 20 km) being detrimental to the analyses and little sensitivity for \( r \) values between 2 km and 6 km. Using the GC99 weighting function improved their results over the boxcar function, and later studies have almost exclusively used the GC99 function for localization.

\citet{cayaetal05} further examined the analysis error of differing localization cutoffs. The smallest errors occurred using a value of 7.3 km for \( r \) in their 100-member OSSEs, but they noted that a smaller \( r \) (approximately 5.5 km) should be used for a 50-member ensemble. \citet{tongxue05} also used a 100-member ensemble, but with \( r \) = 8 km. In a real-data study, \citet{dowelletal04} used \( r \) = 6 km, and concluded that cutoff radii between 4 and 10 km produced the best results, while \( r \) = 2 km was detrimental to the analysis. \citet{aksoyetal09} chose \( r \) = 5 km (4 km) horizontal (vertical) cutoff, and found little sensitivity to a larger horizontal cutoff radius, but noted larger sensitivity to the vertical localization cutoff. Later studies generally settled on \( r \) = 6 km \citep{dowellwicker09,dowelletal11,dongetal11}, although \citet{dawsonetal12} used a larger horizontal cutoff (\( r \) = 12 km) than all previous studies. In other real-data studies, attempts were made to extract information about multiple scales of motion from radar observations by using varying localization cutoffs for different groups of observations \citep{zhangetal09}. While this was employed to produce analyses of a hurricane using dense radar observations, the technique is also potentially useful for convective storm analyses.

In an attempt to isolate the impacts of changes to covariance localization on convective-scale EnKF analyses, perfect-model OSSEs of a developing mesoscale convective system (MCS) were conducted using a variety of cutoff distances encompassing the range of values used in previous work. MCSs are understudied in the convective-scale EnKF literature, with supercells garnering most of the attention (all the referenced studies in Table \ref{covlocsummary} have focused on supercells, with the exception of \citet{aksoyetal09} which included a supercell and squall-line case). In addition to testing the sensitivity of the EnKF analyses to the localization cutoff, the results herein could provide evidence of differences in localization between convective modes (e.g., isolated supercells vs. squall lines).

To capture such differences, the present OSSEs simulate the process of upscale growth from initially isolated cells into a linear squall line during a 150-min period (much longer than any previous OSSE or real-data convective-scale EnKF experiment). The main aim of this work is to understand the impacts of changes in \(r\) on the error of the state fields and, by doing so, to gain an understanding of the role of localization in producing accurate convective-scale analyses across a spectrum of convective modes. Further, the proper choice of localization is related to the covariance structures present within a simulated ensemble of convective storms. Little attention has been paid in the convective storms literature to understanding these structures; the analyses within this chapter relate the covariance structures to the localization choices.

\section{Methods}
\subsection{Truth simulation}

The Advanced Research Weather Research and Forecasting (WRF-ARW) model \citep{skamarocketal08} V3.2.1 was used to create a truth simulation of a developing MCS. The model was initialized using a thermodynamic profile derived from 28 proximity soundings from strong, weakly-forced MCS events (\citealt{coniglioetal06}; Fig. \ref{osse_fig1}). \citet{coniglioetal06} tested a variety of wind shears in the environmental wind profile; the wind shear used herein produced a strong, long-lived, upright-tilted squall line by the end of the simulation. This shear profile (Fig. \ref{osse_fig1}) contains 20 m s\textsuperscript{-1} of 0-5 km westerly shear, sufficient for long-lived squall lines in model simulations \citep{weismanrotunno04}. Convection was triggered with five +3 K bubbles, with random perturbations added along the bubble edges to initiate three-dimensional motions. The domain has a horizontal grid spacing of 3 km and spans 396 km in both horizontal directions (133 grid points) and 16 km in the vertical (40 grid points). These dimensions, larger than those of most convective-scale EnKF experiments, were required to retain the system within the domain during the 3-hour period. The WRF-ARW was integrated as a cloud model with WRF single-moment 6-class (WSM6) microphysics \citep{honglim06}, open boundary conditions, and no terrain, surface physics, or boundary-layer physics. The Coriolis force was permitted to act on the perturbation wind, as is customary in convective-scale cloud modeling. A prognostic 1.5-order turbulence kinetic energy closure was used to compute the horizontal and vertical eddy viscosities necessary to represent the effects of sub-grid scale turbulence.

\begin{figure}
\centering
\includegraphics[scale=0.9]{osse_figure1}
\caption{Skew-T profile used to initialize truth simulation. Taken from \citet{coniglioetal06}. Sounding represents the median environmental thermodynamic profile of 28 weakly-forced MCSs. The unidirectional environmental wind profile is constructed to contain values of environmental shear that produce intense simulated squall lines.}
\label{osse_fig1}
\end{figure}

\subsection{Simulated radar observations}
Synthetic radial velocity and reflectivity observations were extracted from the truth simulation every five minutes assuming a radar placement at the center of the domain. A severe weather WSR-88D scanning strategy composed of 14 scan elevations (between \(0.5^{\circ}\) and \(19.5^{\circ}\)) was used. The observations were extracted from each grid column within the range of the radar (230 km), at the intersection of the grid columns and the 14 scan surfaces. This technique for extracting simulated radar observations was also applied in \citet{tongxue05} and serves to reduce the number of observations and produce a gridded observation dataset. The observations were computed by tri-linearly interpolating the required model state fields to the location of each simulated radar bin. Observation errors were added to each synthetic observation by performing a random draw from a Gaussian with zero mean and standard deviation of 2 m s\textsuperscript{-1} (2 dBZ) for radial velocity (reflectivity). Reflectivity observations of less than 10 dBZ were considered “clear-air” observations for the purposes of this study and the corresponding radial velocity observation at these locations was removed. All other observations are considered “precipitation” observations.
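
The generation of a synthetic observation pair can be sketched as follows. This form is illustrative only. The azimuth \(\theta\) (measured clockwise from north), the elevation angle \(\phi\), and the neglect of hydrometeor fall speed and beam curvature are assumptions made here and are not details stated in the description above.
\[
   v_r^o = u \sin\theta \cos\phi + v \cos\theta \cos\phi + w \sin\phi + \epsilon_{v}, \qquad \epsilon_{v} \sim \mathcal{N}(0, \sigma_{v}^2), \quad \sigma_{v} = 2 \mbox{ m s}^{-1}
\]
\[
   Z^o = Z^t + \epsilon_{Z}, \qquad \epsilon_{Z} \sim \mathcal{N}(0, \sigma_{Z}^2), \quad \sigma_{Z} = 2 \mbox{ dBZ}
\]
Here \(u\), \(v\), and \(w\) are the truth wind components and \(Z^t\) is the truth reflectivity, each interpolated to the location of the simulated radar bin.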

\subsection{Data assimilation and OSSEs}
The assimilation experiments were conducted with the EnKF implementation within the Data Assimilation Research Testbed (DART) software\footnote{DART software available at http://www.image.ucar.edu/DAReS/DART}. Specifically, the sequential parallel version \citep{andersoncollins07} of the deterministic ensemble adjustment Kalman filter \citep{anderson01} was used for the data assimilation update. The 50-member ensemble was initialized with different horizontally homogeneous wind profiles (identical thermodynamic profiles), each produced by adding random, uncorrelated Gaussian noise to the wind profile (both U and V) of the truth sounding below 11 km AGL. The ensemble generation technique used herein is consistent with other OSSEs and real-data experiments that use homogeneous environments for initial conditions (e.g. \citealt{dowelletal04}). The initial ensemble was the same for each OSSE. The OSSEs were initialized 30 minutes after the beginning of the truth simulation, when appreciable amounts of hydrometeors had developed within the simulated storm. The synthetic observations were assimilated every 5 minutes for 150 minutes. To increase ensemble spread, smoothed additive perturbations were added to the wind, temperature, and dew point temperature fields where the observed (i.e. simulated) reflectivity was greater than 20 dBZ, immediately prior to each ensemble advance \citep{dowellwicker09}, throughout the assimilation period. The additive noise smoothing procedure prevents any detrimental effects due to discontinuities on the edges of the convective system. The magnitudes of additive noise (Table \ref{exptable}) were chosen using values from previous work (e.g. \citealt{dowellwicker09}). The additive noise magnitudes were not tuned for each experiment, yet the results herein appear to be robust. An analysis of additive noise sensitivities is found in section \ref{addnoisesection}.
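
To make the sequential update concrete, the assimilation of a single scalar observation by an ensemble adjustment filter can be sketched as below. This is a simplified illustration in notation introduced here, not a description of the DART source code. \(y_n^p\) is the prior estimate of the observation from member \(n\), \(\overline{y^p}\) and \(\sigma_p^2\) are its ensemble mean and variance, \(y^o\) and \(\sigma_o^2\) are the observation and its error variance, \(\sigma_{x_i y}\) is the ensemble covariance between state variable \(x_i\) and the observation estimate, and \(\rho_i\) is the localization weight at state point \(i\).
\[
   \sigma_u^2 = \left( \frac{1}{\sigma_p^2} + \frac{1}{\sigma_o^2} \right)^{-1}, \qquad
   \overline{y^u} = \sigma_u^2 \left( \frac{\overline{y^p}}{\sigma_p^2} + \frac{y^o}{\sigma_o^2} \right)
\]
\[
   y_n^u = \sqrt{\frac{\sigma_u^2}{\sigma_p^2}} \left( y_n^p - \overline{y^p} \right) + \overline{y^u}, \qquad
   \Delta x_{i,n} = \rho_i \, \frac{\sigma_{x_i y}}{\sigma_p^2} \left( y_n^u - y_n^p \right)
\]
The observation-space ensemble is shifted and contracted deterministically to match the updated mean and variance, and the resulting member increments are regressed onto each state variable. The localization weight scales the regression, so a state point beyond the cutoff (\(\rho_i = 0\)) receives no increment from that observation.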

\begin{table}
\centering
\begin{tabular}{ r || m{2.5cm} | m{2.5cm} | m{2.5cm} | m{2.5cm} }
Experiment  & \(r_{h, precip}\) (km) & \(r_{v, precip}\) (km) & \(r_{h, noprecip}\) and \(r_{v, noprecip}\) (km) & Noise (K or m s\textsuperscript{-1}) \\
\hline \hline
LOC6V6 & 6 & 6 & 12 & 1.0 \\
LOC12V6 & 12 & 6 & 12 & 1.0 \\
LOC18V6 & 18 & 6 & 12 & 1.0 \\
LOC12V3 & 12 & 3 & 12 & 1.0 \\
LOC12V12 & 12 & 12 & 12 & 1.0 \\
LOC12V12N0.5 & 12 & 12 & 12 & 0.5 \\
LOC12V12N0.25 & 12 & 12 & 12 & 0.25 \\
\end{tabular}
\caption{Summary of covariance localization experiments conducted in Chapter 3. The first number in each experiment name indicates the horizontal localization cutoff (km) and the second number indicates the vertical localization cutoff (km) for precipitation observations. Hence, LOC12V6 uses a horizontal and vertical localization cutoff of 12 km and 6 km, respectively. Where present, the number following N indicates the additive noise magnitude. All experiments use the same horizontal and vertical cutoffs for clear-air (no precipitation) observations.}
\label{exptable}
\end{table}

The OSSEs use the same model configuration as the truth simulation (i.e., they are perfect-model experiments). The experiments are identical except for the localization cutoff, \(r\).  \(r_h\) varies between 6, 12, and 18 km in three experiments to assess the impacts of the horizontal localization (\(r_v\)  = 6 km in all three experiments), while in another set of experiments, \(r_v\) is varied between 3, 6, and 12 km while \(r_h\) is held constant at 12 km (Table \ref{exptable}). The localization choices for the clear-air observations remain constant across all experiments; those observations are thinned to 6 km in the horizontal with \(r_h\) = \(r_v\) = 12 km. These clear-air observations have been shown to improve convective-scale EnKF analyses by suppressing spurious convection \citep{aksoyetal09}.
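
One way to picture separate horizontal and vertical cutoffs is as an ellipsoidal region of influence around each observation. The normalized distance below is an assumption made for illustration (the symbols \(d_h\), \(d_v\), and \(\tilde{d}\) are introduced here) and may differ in detail from how the cutoffs are combined in the assimilation software.
\[
   \tilde{d} = \sqrt{ \left( \frac{d_h}{r_h} \right)^2 + \left( \frac{d_v}{r_v} \right)^2 }
\]
Here \(d_h\) and \(d_v\) are the horizontal and vertical separations between the observation and a state point, and the localization weight is zero wherever \(\tilde{d} \ge 1\). As a worked example under this assumption, in LOC12V6 a state point 6 km away horizontally and 3 km away vertically has \(\tilde{d} = \sqrt{0.5^2 + 0.5^2} \approx 0.71\) and is therefore updated, whereas the same point in LOC6V6 has \(\tilde{d} = \sqrt{1^2 + 0.5^2} \approx 1.12\) and is not.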

\subsection{State-space diagnostics and analysis techniques}
\label{statespace}
The fidelity of the analyses to the truth simulation is gauged by domain-averaged grid point based verification metrics, as is common in EnKF analysis verification (e.g. \citealt{dowelletal04}). In an OSSE, the state estimate can be directly compared to the true state at all state points. Herein, these “state-space” diagnostics (as opposed to “observation-space” diagnostics that are computed at the observation locations) are the primary metrics used in this study to assess the overall quality of an OSSE. Root-mean-squared error (RMSE) of the analyses and 5-min forecasts is computed, defined as,
\begin{equation}
   RMSE^{a,f} = \sqrt{\frac{1}{M}\sum_{i=1}^{M} (x_i^t - \overline{x_i^{a,f}})^2}
\label{rmse}
\end{equation}
where the subscript \(i\) is an index over all \(M\) grid points where the true state reflectivity is greater than 20 dBZ, and \(t\), \(a\), and \(f\) represent the true, analysis (i.e., posterior), and forecast (i.e., prior) state, respectively. The use of clear-air observations prevents the development of spurious convection outside the truth state convection (this was confirmed by visually inspecting the ensemble for areas of convection outside the truth state), justifying the use of the truth state for defining the points over which RMSE is computed. State-space RMSE is computed for the three velocity components, temperature, and the WSM6 microphysical species (water vapor, cloud water, cloud ice, rain, snow, and graupel). In addition, the average bias is computed, defined as,
\begin{equation}
   BIAS^{a,f} = \frac{1}{M}\sum_{i=1}^{M} (x_i^t - \overline{x_i^{a,f}})
\label{bias}
\end{equation}
This is analogous to the mean innovation in observation-space. The consistency ratio (CR), a comparison of the ensemble mean RMSE to the ensemble spread, is also computed to ensure the ensemble contains an appropriate amount of spread. This ratio is defined as,
\begin{equation}
   CR^{a,f} = \frac{\frac{1}{M}\sum_{i=1}^{M} [\frac{1}{N-1}\sum_{n=1}^{N}(x_{i,n}^{a,f} - \overline{x_i^{a,f}})^2]}{(RMSE^{a,f})^2}
\label{cr}
\end{equation}
where the numerator is the squared ensemble spread (i.e., ensemble variance) and the denominator is the mean squared error (i.e., equation \ref{rmse} squared). \(N\) represents the number of ensemble members. As in equation \ref{rmse} and \ref{bias}, the spread computation in eqn. \ref{cr} is averaged over \(M\) grid points where the true state reflectivity is greater than 20 dBZ.

The prior analyses were compared to both the posterior and truth analyses to analyze the spatial structures of the prior analysis errors and the data assimilation increments. While posterior-prior (PO-PR) increments are useful to understand how the assimilation of observations impacts the state, the prior-truth (PR-TR) increments provide a sense of the accuracy of the state estimate throughout the model domain. Both sets of increments were averaged in time and space (along the convective line) over select periods within the OSSE to observe the systematic behavior of each set of increments. The line-averages are computed on a domain 200 km across that is translated each output time with the leading edge of the storm’s cold pool.
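
The averaged increments described above can be written compactly as follows. The notation is introduced here for illustration: \(s\) is the across-line distance relative to the leading edge of the cold pool, \(y_j\) indexes the \(N_y\) grid points along the line, \(t_k\) indexes the \(N_t\) analysis times in the averaging period, and overbars denote ensemble means as in equation \ref{rmse}.
\[
   \left\langle \mbox{PO-PR} \right\rangle (s,z) = \frac{1}{N_t N_y} \sum_{k=1}^{N_t} \sum_{j=1}^{N_y} \left[ \overline{x^{a}}(s, y_j, z, t_k) - \overline{x^{f}}(s, y_j, z, t_k) \right]
\]
\[
   \left\langle \mbox{PR-TR} \right\rangle (s,z) = \frac{1}{N_t N_y} \sum_{k=1}^{N_t} \sum_{j=1}^{N_y} \left[ \overline{x^{f}}(s, y_j, z, t_k) - x^{t}(s, y_j, z, t_k) \right]
\]
With these sign conventions, a PO-PR increment that opposes the PR-TR increment at the same location indicates that the assimilation is, on average, moving the prior toward the truth.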

\section{State-space RMSE}
\subsection{Truth simulation}
The 3-hour truth simulation encompasses the development of five initially isolated cells into a quasi-linear convective system. The homogeneous environment used to initialize the model contains a vertical wind profile that produces splitting cells after approximately 30 minutes (Fig. \ref{osse_fig2}a). By 90 minutes, the four pairs of adjacent left and right cell splits have merged to produce 4 cells, while the left (right) split of the northernmost (southernmost) cell remains isolated (Fig. \ref{osse_fig2}b). These 6 cells proceed to merge into a linear convective system by the end of the simulation as their cold pools merge, forcing convection along the upshear edge (Fig. \ref{osse_fig2}c). The initially isolated cells are located approximately 50-80 km west of the radar at 30 minutes. By 90 minutes, the line of cells is co-located with the longitude of the radar, and by the end of the simulation, the squall line is 30-60 km east of the radar.

\begin{figure}
\centering
\includegraphics[scale=0.75]{osse_figure2}
\caption{Column maximum reflectivity (color) and 0C perturbation theta (contour) from the truth simulation at a) 30 minutes, b) 90 minutes, and c) 150 minutes after initialization.}
\label{osse_fig2}
\end{figure}

The process simulated by the model is representative of upscale growth observed in nature, when isolated multicells or supercells in close proximity merge into a convective line after several hours (this is a distinctly different process compared to convection forced along linear boundaries). By 3 hours, a well-developed cold pool is established with a vertically upright line of convection along the leading edge of the cold pool. While not used in this study, extending the simulation further (1-2 hours) results in a mature bow echo system.

\subsection{LOC6V6 experiment}
The LOC6V6 experiment is conducted using \(r\) = 6 km, a common choice in previous convective-scale data assimilation studies (e.g., \citealt{dowelletal04,dongetal11}). The horizontal grid spacing of the observations is 3 km, matching the model grid spacing. Observations extracted from the truth simulation at T+30 min are assimilated at the initial time in LOC6V6. No temperature bubbles are used to promote convective circulations in the assimilation experiments. Thus, approximately 20-30 minutes are required for the increments to stabilize as convection initially develops in the model state. Statistics are shown from this point forward (Fig. \ref{osse_fig3}). Both the prior and posterior RMSE are plotted, hence the appearance of the “sawtooth” pattern. The RMSE decreases gradually for all state fields through the first 60 minutes of assimilation. This decrease is purely due to the assimilation of simulated radar data; the RMSE is not reduced by the choice of boundary conditions, since open boundary conditions were used, or by the choice of initial conditions, which are devoid of convection.

\begin{figure}
\centering
\includegraphics[scale=0.55]{osse_figure3}
\caption{Domain-averaged state-space RMSE for LOC6V6, LOC12V6, LOC18V6 for the ten state fields updated by the data assimilation. Average includes grid points where the truth simulation reflectivity \textgreater 20 dBZ. The spin-up period between 0-20 minutes is not shown. The two periods of cell mergers mentioned in the text are shaded for the U RMSE panel only.}
\label{osse_fig3}
\end{figure}

After 60 minutes, the RMSE increases until approximately 90 minutes; this 30-minute period coincides with the first set of cell mergers (shaded in Fig. \ref{osse_fig3}). Afterward, the RMSE again stabilizes, although an increase in RMSE is evident in some state fields after 110 minutes (especially V, QVAPOR, and QRAIN). The increase in RMSE between 110 and 150 minutes (shaded in Fig. \ref{osse_fig3}) is likely related to the second set of mergers that occur as the cells and associated cold pools merge near the end of the assimilation period. As the storm passes the radar location between 60 and 120 minutes, portions of the storm become unobserved due to the “cone of silence” above the radar. The effect of this on the RMSE statistics (and other analyses) is minimal since the unobserved portions of the storm are small compared to the size of the convective system. In addition, the increase in RMSE between 60 and 90 minutes occurs when the main convective cores are well-observed and only a segment of the anvil is unobserved. Thus, the increase in RMSE is most likely due to the cell merger process and not to gaps in observations.

Comparing the velocity fields, the V field has the smallest RMSE compared to U and W throughout the assimilation period. This may simply be a reflection of a smaller overall V component of the horizontal wind compared to the U component, since the system is dominated by momentum in the west-east direction. While the horizontal wind is directly related to radial velocity, several of the other state fields (e.g. T and QVAPOR) are only changed through cross-correlations and the evolution of the dynamics during the model advance. For T and QVAPOR, most of the RMSE change occurs during the model integration. Among the other microphysical fields, QGRAUP has the largest RMSE, likely due to larger overall mixing ratios compared to QRAIN and QSNOW. In general, QGRAUP has a larger degree of variability in these experiments, which is evident in the slightly larger RMSE values.

\subsection{Larger horizontal localization experiments}
LOC12V6 and LOC18V6 both produce analyses with less error than LOC6V6 during the entire assimilation period (Fig. \ref{osse_fig3}). During the first 60 minutes, the LOC12V6 and LOC18V6 states stabilize to lower RMSE values more quickly than LOC6V6, and the LOC12V6 and LOC18V6 analyses are generally of similar quality during this period. Between 60 and 90 minutes, an increase in RMSE is also evident in the LOC12V6 and LOC18V6 analyses. Yet during this period, the RMSE of the LOC12V6 analyses increases more rapidly, thus LOC18V6 produces more accurate analyses. For most of the state fields, this result persists throughout the remainder of the assimilation period, although for some (e.g. QCLOUD, T, QRAIN), LOC12V6 and LOC18V6 produce comparable results near the end of the period. These results suggest that the underlying convective mode contributes to the ability of the EnKF system to produce accurate analyses. Specifically, in these experiments, a larger horizontal localization appears to be especially beneficial during the first period of cell mergers. The vertical localization shows a similar dependence on the merger period, as will be discussed in a following section.

\begin{figure}
\centering
\includegraphics[scale=0.55]{osse_figure4}
\caption{As in Fig. \ref{osse_fig3}, but for prior analysis domain-average state-space consistency ratio. The dashed horizontal line indicates the optimal consistency ratio of 0.98.}
\label{osse_fig4}
\end{figure}

The differences in \(r\) produce differences in posterior ensemble spread in the three experiments. As the localization cutoff is increased, state points are impacted by more observations, decreasing the posterior ensemble spread. The average prior state-space consistency ratio was computed to gauge the suitability of the spread within each experiment, similar to previous work (e.g., \citealt{snyderzhang03}). This consistency ratio will be \(\frac{N}{N + 1}\), where \(N\) is the ensemble size, if the ensemble variance is equivalent to the error of the ensemble mean in these experiments (Murphy 1988). This optimal ratio is approximately 0.98 for the 50-member ensemble used herein. The consistency ratio within the storm (at Z \textgreater 20 dBZ) increases during the first 50 minutes for each experiment and state field (Fig. \ref{osse_fig4}) as the RMSE decreases. Afterward, the consistency ratio gradually decreases or remains steady through the end of the assimilation period as the RMSE remains steady or increases.
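
As a quick check of the optimal value quoted above, substituting the ensemble size used in these experiments gives
\[
   \frac{N}{N+1} = \frac{50}{51} \approx 0.980 .
\]
Because the consistency ratio in equation \ref{cr} is the ensemble variance divided by the mean squared error, values above this threshold indicate more spread than the error of the ensemble mean warrants, and values below it indicate too little spread.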

As expected, the state fields altered by additive noise before the model advance (U, V, T, QVAPOR) have larger consistency ratios than the other state fields. For example, the U, V, and T consistency ratios are greater than the optimal ratio for the first half of the assimilation period in all three experiments, with values closer to the optimal ratio during the second half. The spread is likely too large for U, V, and T between 20 and 80 minutes, although this may be desirable to give more weight to the observations as the covariances develop in the ensemble. It is clear that efforts to implement adaptive spread adjustment techniques (e.g., \citealt{anderson09}) in convective-scale EnKF studies will be necessary in the future.

\subsection{Prior analysis vertical profiles}
Vertical profiles of time-averaged RMSE and bias (equations \ref{rmse} and \ref{bias}) are shown in Figure \ref{osse_fig5}. The biggest differences between the three experiments are found in the velocity fields. The LOC18V6 RMSE for U, V, and W is approximately 1 m s\textsuperscript{-1} smaller than the LOC6V6 RMSE between 2 km and 12 km. The differences are similar between the experiments for U, but for V the differences increase with height. For all three experiments, the largest U, V, and W errors occur between 3 and 7 km, coincident with the strongest vertical velocities (not shown). Only small differences exist between the three T RMSE profiles. LOC18V6 has consistently smaller RMSE for QRAIN, QSNOW, and QGRAUP throughout the profile. The behavior of the biases between the three experiments is not consistent among the state fields. The existence of some of these prior analysis biases, particularly the large negative bias that exists in the surface cold pool for T and QVAPOR, will be discussed further in Section 3.4.

\begin{figure}
\centering
\includegraphics[scale=0.7]{osse_figure5}
\caption{Average state-space prior analysis RMSE (solid) and bias (dotted) profiles for LOC6V6, LOC12V6, LOC18V6 for 10 state fields. Computed at prior state points where the true state reflectivity \textgreater 20 dBZ from T+20 min through T+150 min.}
\label{osse_fig5}
\end{figure}

\subsection{Sensitivity to vertical localization}
The LOC12V6 vertical localization cutoff length, \(r_v\), was changed in two additional experiments to assess the analysis sensitivity to changes in only the vertical localization.  In these experiments, \(r_v\) was changed to 3 km and 12 km, while \(r_h\) remained at 12 km (LOC12V3 and LOC12V12). LOC12V6 from the previous section serves as a control experiment in this set of experiments.

During the first 60 minutes of assimilation, smaller values of \(r_v\) have either no appreciable impact or are slightly detrimental to the analyses (Fig. \ref{osse_fig6}). The larger vertical localization appears to improve the spinup time for some of the fields during the first 40 minutes (e.g. W, QCLOUD). By 40 minutes, the differences in RMSE between the three experiments are minimal. Between 60-100 minutes, a noticeable difference in the RMSE between the three experiments emerges, with LOC12V3 having the smallest RMSE for most state fields (Fig. \ref{osse_fig6}). As described earlier, an initial set of cell mergers occurs during this time period. The smaller vertical localization length scale in LOC12V3 appears to improve the state and covariance estimate during the time of cell mergers and interactions, and this benefit continues through the end of the assimilation period in most of the state fields.

\begin{figure}
\centering
\includegraphics[scale=0.55]{osse_figure6}
\caption{Same as Fig. \ref{osse_fig3}, but for LOC12V3, LOC12V6, and LOC12V12.}
\label{osse_fig6}
\end{figure}

In LOC12V3, many of the fields have assimilation increments that improve the state estimate. For example, assimilation decreases both the T and V analysis RMSE during the final 6 assimilation cycles (mean increments of approximately \(-0.1\) K and \(-0.2\) m s\textsuperscript{-1}, respectively). In LOC12V6 and LOC12V12, the T and V analysis RMSE increase or remain unchanged after assimilation during these 6 cycles (mean increments of approximately \(+0.2\) K and \(+0.3\) m s\textsuperscript{-1}, respectively). This behavior is evident in other fields (e.g., QCLOUD, QVAPOR, QRAIN) as well, particularly during the second half of the assimilation period. Overall, the LOC12V3 experiment produces the smallest mean analysis RMSE.

\section{Line-averaged, time-averaged increments}
\subsection{W and T increment patterns}
Line-averaged, time-averaged (90-150 minutes) posterior mean minus prior mean (PO-PR) and prior mean minus truth (PR-TR) updraft increments can elucidate systematic adjustments to the prior mean model state that occur within the convective system through the assimilation period. Line averages were taken in the y-direction from the northernmost to southernmost point along the convective line (from grid point 20 through grid point 110).

A negative-positive increment pair straddles the prior mean updraft (W) location in all of the experiments (Fig. \ref{osse_fig7}). These adjustments indicate the location of the maximum updraft is, on average, displaced ahead of the truth updraft maximum. This is also evident in the PO-PR increments in the temperature field (Fig. \ref{osse_fig8}). The negative updraft increments are collocated with negative temperature increments. At the surface below the location of the PO-PR updraft increments, positive PO-PR temperature increments exist. The assimilation, through the covariances between the radar observations and temperature field, is attempting to adjust the location of the cold pool by warming the leading edge by 0.5-1 K. These W and T PO-PR increments indicate the surface cold pool location is displaced eastward of the cold pool placement in the truth simulation, resulting in a low and mid-level updraft that is also biased eastward.

\begin{figure}
\centering
\includegraphics[scale=0.7]{osse_figure7}
\caption{Line-time-averaged prior-truth (a-c) and posterior-prior (d-f) vertical velocity increments (shaded) from LOC6V6, LOC12V6, LOC18V6. The line averages are taken between grid point 20 and 110 in the y-direction and the time averages include analyses between 90 and 150 minutes. Also shown is the region where the combined cloud water and ice is greater than (gray contour). The average position of the leading edge of the surface cold pool is also shown (gray dashed line). }
\label{osse_fig7}
\end{figure}

\begin{figure}
\centering
\includegraphics[scale=0.7]{osse_figure8}
\caption{As in Fig. \ref{osse_fig7}, but for temperature increments.}
\label{osse_fig8}
\end{figure}

Not only is the assimilated convective system displaced eastward compared to the truth simulation, but the prior mean state surface cold pool is also 1-3 K colder in all assimilation experiments. If the simulated cold pool propagates as a density current, then a colder cold pool will produce a simulated convective system with a faster forward speed \citep{wakimoto82} and will lead to the convective system in the assimilation experiments being slightly ahead of the system in the truth simulation. This relationship suggests that inaccuracies in cold pool temperature are producing the displacement error in the assimilated convective systems. The assimilation acts to correct this displacement error by adjusting the placement of the updraft and cold pool edge. Unfortunately, due to the lack of observations below 1 km (\textless 10\% of the total radar observations are located below 1 km), any adjustments to the cold pool must be accomplished exclusively through covariances between observations above the cold pool and state locations within the cold pool. Results suggest that the assimilation is unable to adequately adjust these fields, as the cold pool temperature and position errors remain throughout the assimilation period, although, as will be discussed in the following section, changes in localization lessen the impact of this bias. While more observations below 1 km would improve the perfect-model results herein, in real cases with parameterized boundary layers, observations may still be unable to produce accurate analyses due to model error.
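
To make the density-current argument concrete, the following is a minimal sketch using a commonly cited idealized expression for the propagation speed of a density current. The symbols \(c\), \(k\), \(H\), \(\theta_v\), and \(\Delta\theta_v\) are introduced here only for illustration and are not quantities diagnosed from the experiments:
\begin{equation}
c = k \sqrt{ g H \frac{\Delta\theta_v}{\theta_v} },
\end{equation}
where \(g\) is the gravitational acceleration, \(H\) is the cold pool depth, \(\Delta\theta_v\) is the virtual potential temperature deficit of the cold pool relative to its environment, \(\theta_v\) is the environmental value, and \(k\) is an internal Froude number of order one. For fixed \(H\) and \(k\), \(c \propto \sqrt{\Delta\theta_v}\), so a colder cold pool of the same depth propagates faster. As a purely hypothetical example, increasing the deficit from 6 K to 8 K would increase \(c\) by a factor of \(\sqrt{8/6} \approx 1.15\), assuming the depth is unchanged.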

In addition to the errors in the mid-level updraft region, positive temperature errors exist in a band near the model tropopause (approximately 12 km) and negative temperature errors exist above the tropopause in the overshooting top. The assimilation attempts to correct the latter errors by adjusting the temperature upward in the mean posterior state (i.e., positive PO-PR T increments exist). Data assimilation is unable to correct the former temperature errors, as evidenced by negligible PO-PR T increments. In both locations, observations are extremely sparse (e.g., \textless 40 radar observations exist above 12 km over the entire domain at each assimilation time), so the assimilation increments in the overshooting top region are likely dominated by cross-correlations that extend between observations in the updraft and the temperature field above the updraft. These correlations are unable to correct the errors near the tropopause, thus the errors persist and grow through the assimilation period. Errors in the horizontal wind field also exist at this level (not shown). The wind errors support the notion that these errors are primarily due to observational limitations, since radial velocity observations, if sufficiently dense, should be able to correct the wind errors in the upper-level outflow.

\subsection{Differences between localization experiments}
The patterns noted above are present in varying degrees in all of the assimilation experiments (Figs. \ref{osse_fig7} and \ref{osse_fig8}). In general, the spatial extent and magnitude of the PO-PR and PR-TR increments are the smallest in LOC18V6. For the T field, the PR-TR increments within the surface cold pool are smallest in LOC18V6. This is likely a contributing factor to the smaller W increments in the same experiment. In all three experiments, covariances develop between the assimilated radar observations and the state temperature field within the cold pool. The cold pool is a region that is inadequately sampled, so it is promising that the covariances are partly able to make appropriate corrections to the temperature errors near the surface, especially along the leading edge of the cold pool. The PO-PR increments are somewhat larger near the surface in LOC6V6, yet the magnitude of the positive increments in this region is similar between the experiments (since the vertical localization is not changing between these three experiments, the regions of the storm that rely on vertical covariances for corrections should be similar). The T errors near the tropopause are smallest in LOC18V6, although the PO-PR increments along the tropopause are similar between the three experiments.

\subsection{Cold pool development sensitivity to additive noise}
\label{addnoisesection}
The convective system speed bias that develops in all of the experiments is a result of the prior state cold pool being colder than the truth simulation. In real-data experiments, cold pool biases often are attributed to errors in the model microphysics scheme (e.g., \citealt{dowelletal11}). In a perfect-model OSSE, the same microphysics scheme is used in the truth and assimilation experiment, thus it is hypothesized that the errors are due to some component of the data assimilation process (e.g., spurious covariances between reflectivity and surface temperature) or additive noise. \citet{dowelletal11} demonstrated substantial sensitivity of the surface cold pool to the magnitude of additive noise (compare their figure 9b and 9c). Specifically, their experiment that used additive noise values similar to those used herein produced a substantially larger cold pool, although they were unable to verify which experiment was closer to reality. To assess the sensitivity of the present cold pool results to the additive noise, two additional simulations were conducted where the standard deviation of noise was reduced (compared to LOC12V12) to 0.25 (m s\textsuperscript{-1} and K; experiment LOC12V12N0.25) and 0.5 (m s\textsuperscript{-1} and K; experiment LOC12V12N0.5) beginning 20 minutes into the assimilation period. Retaining the larger values of noise during the initial 4 assimilation cycles is beneficial for storm spin-up \citep{dowellwicker09}.

\begin{figure}
\centering
\includegraphics[scale=0.75]{crosssect_noise}
\caption{As in Fig. \ref{osse_fig8}, but for PR-TR temperature increments below 2 km for (a) LOC12V12N0.25, (b) LOC12V12N0.5, and (c) LOC12V12N1.0.}
\label{osse_addnoise}
\end{figure}

\begin{figure}
\centering
\includegraphics[scale=0.75]{rmse_prof_noise}
\caption{As in Fig. \ref{osse_fig5}, but for LOC12V12N1.0, LOC12V12N0.5, and LOC12V12N0.25.}
\label{osse_addnoise2}
\end{figure}

By the end of the assimilation period, PR-TR increments in LOC12V12N0.25 and LOC12V12N0.5 are smaller within the cold pool (Fig. \ref{osse_addnoise}a) and near the tropopause (not shown). Although the overall increments were smaller, PR-TR temperature increments of 1-2 K remained along the leading edge of the cold pool gust front in both experiments (Fig. \ref{osse_addnoise}a-b). At several locations, the prior state gust front in LOC12V12N0.25 was placed 10-15 km to the east of the gust front in the truth simulation, while other segments of the prior state gust front closely matched the truth. This explains the placement of the PR-TR increments near and ahead of the gust front compared to LOC12V12 in Fig. \ref{osse_addnoise}a. Similar displacements occurred in LOC12V12N0.5, although they were generally restricted to the northern half of the cold pool gust front; the southern half of the prior mean gust front matched well with the truth simulation in LOC12V12N0.5. The slightly better positioning of the gust front produced slightly smaller W PO-PR increments in LOC12V12N0.5 compared to LOC12V12, yet the PO-PR increment pair remains in the time-line-averaged cross-sections (not shown).

These results suggest that using smaller values of noise produces improvements in the T state estimates, especially within the cold pool and anvil regions of thunderstorms. On the other hand, other state fields are slightly negatively impacted by smaller values of additive noise. For example, the average prior RMSE/bias profiles reveal that the microphysical fields have smaller errors using the original values for the standard deviation of the noise distribution (Fig. \ref{osse_addnoise2}). This is also true for U, V, W, and T above the cold pool. Within the cold pool, U, V, T, and QVAPOR RMSE are all reduced in LOC12V12N0.25 and LOC12V12N0.5 compared to LOC12V12, in addition to decreases in the large surface T bias. Further comment on the impact of additive noise on these results is provided in the discussion section.

\section{Observation-state correlations}
The results in the previous sections can be further interpreted by examining the covariance structures present within the OSSEs. Several previous studies have analyzed the covariance structures within convective storms using various techniques (e.g., \citealt{micheletal11}), primarily to develop covariance models for 3D-VAR applications. Herein, we apply a brute-force method by manually computing correlations and analyzing the spatial scales over which substantial correlations exist. Correlations are computed between the assimilated observations (radial velocity and reflectivity within the storm) and prior state fields (at the state locations where reflectivity is greater than 20 dBZ). For the present analyses, correlations are computed between radial velocity and the U, V, and W state fields, as well as reflectivity and the QRAIN, QGRAUP, and QSNOW state fields. These observation-state pairs were chosen due to their direct relationship to each other, although meaningful correlations exist between other observation-state pairs (e.g., reflectivity and temperature). The prior state ensemble correlations drive the data assimilation update, and should provide insight into the characteristic spatial scales of the correlation field between the radar observations and state variables.
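
For reference, the correlations are assumed here to be standard Pearson sample correlations across the prior ensemble; the notation below is introduced only for illustration. For an observation \(y\) and a state variable \(x\), with \(y_n\) the prior estimate of the observation from ensemble member \(n\) and \(x_n\) the corresponding state value,
\begin{equation}
\rho_{yx} = \frac{\sum_{n=1}^{N} (y_n - \overline{y})(x_n - \overline{x})}{\sqrt{\sum_{n=1}^{N} (y_n - \overline{y})^2 \, \sum_{n=1}^{N} (x_n - \overline{x})^2}},
\end{equation}
where \(N\) is the ensemble size and overbars denote ensemble means. Under this definition \(\rho_{yx}\) is bounded by \(\pm 1\) and is independent of the units of \(y\) and \(x\), which allows pairs involving different observation types and state fields to be compared against common thresholds.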

Observation-state distances and correlations are used to compute reverse cumulative density functions (RCDFs) that show the number of observation-state pairs separated by more than a given distance. These give an aggregate sense of the percentage of observations that fall at various distances away from the state points (or, alternatively, the percentage of state points that fall at a given distance away from the observation location). RCDFs were produced for the observation-state pairs with correlations greater than 0.4, 0.6, and 0.8. These thresholds were chosen to represent the length scales for mid- and high-end correlations that are not significantly affected by sampling noise.
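
One way to formalize the RCDF, under the assumption that it is normalized by the total number of pairs exceeding the correlation threshold, is sketched below; the symbols \(F\), \(d\), \(\rho_0\), \(\rho_i\), and \(d_i\) are introduced here only for illustration. For observation-state pairs \(i = 1, \ldots, M\) with correlation \(\rho_i\) and separation distance \(d_i\),
\begin{equation}
F(d; \rho_0) = \frac{\#\{ i : \rho_i > \rho_0 \ \mathrm{and} \ d_i > d \}}{\#\{ i : \rho_i > \rho_0 \}},
\end{equation}
where \(\#\{\cdot\}\) denotes the number of pairs satisfying the condition. \(F\) is non-increasing in \(d\), and \(1 - F(d; \rho_0)\) is the fraction of retained pairs separated by no more than \(d\); a statement that half of the pairs lie within a given distance therefore corresponds to \(F = 0.5\) at that distance. As context for the sampling-noise argument, the standard deviation of the sample correlation between two uncorrelated, normally distributed variables is \(1/\sqrt{N-1}\) for a sample of size \(N\). Taking \(N = 50\) as an assumed ensemble size gives approximately 0.14, so a threshold of 0.4 sits slightly below three such standard deviations.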

\subsection{3-D correlation structure}
Figure \ref{osse_fig9}a-c shows the RCDFs from the LOC12V12 60-minute prior analysis for all 6 observation-state pairs. Correlations greater than 0.4 exist for all observation-state pairs at distances up to 20 km (Fig. \ref{osse_fig9}a), while those greater than 0.6 and 0.8 are generally restricted closer to the observation location (Fig. \ref{osse_fig9}b-c). In the latter case, the number decreases quickly as the observation-state distance increases. For the microphysical fields, roughly half of the observation-state pairs with correlations greater than 0.4 are within \(\sim\)6.5 km (Fig. \ref{osse_fig9}a). For the dynamic fields, this occurs at distances between 8 and 10 km. Thus, the radial velocity correlations with the dynamic fields (U, V, W) extend to greater distances than the reflectivity correlations with the microphysical fields. This is also true for the larger observation-state correlations, although the RV,U correlations tend to behave similarly to the microphysical observation-state pairs.

\begin{figure}
\centering
\includegraphics[scale=0.7]{osse_figure9}
\caption{RCDFs for observation-state correlations greater than (a,d) 0.4, (b,e) 0.6, and (c,f) 0.8 using the (a-c) three-dimensional and (d-f) vertical observation-state distance from the 60-min prior analysis for LOC12V12.}
\label{osse_fig9}
\end{figure}

Some insight into the performance of the various localization experiments can potentially be gleaned from these results. In LOC6V6, observations are not permitted to update the state at distances beyond 6 km. Yet, only 20\%-25\% (40\%) of the state points that are correlated at \textgreater 0.4 with the radial velocity (reflectivity) observations lie within this radius (Fig. \ref{osse_fig9}a; these results are consistent across LOC6V6 and LOC12V6). This may have proven to be detrimental to the LOC6V6 analyses, especially in data-sparse regions. Increasing \(r_h\) to 12 km, as in LOC12V6, captures 70\%-75\% of the state points that are correlated at \textgreater 0.4 with the radial velocity observations (while capturing all of the state points with correlations \textgreater 0.8). For reflectivity observations, roughly 90\% (100\%) of the state points with correlations greater than 0.4 (0.8) are within 12 km. While the localization choices in LOC18V6 permit the inclusion of an even greater fraction (nearly 100\%) of the observation-state pairs, a larger fraction of noisier covariances less than 0.4 are also included (not shown). Thus, the improvement of the LOC12V6 analysis RMSE over LOC6V6 and LOC18V6 may be due to a closer match between the localization choice and the spatial scales present within the observation-state correlation field.

By the end of the assimilation period (150 minutes) the length scales of most observation-state pairs have increased compared to the correlations present at 90 minutes across all three experiments (not shown). This is hypothesized to be the result of upscale growth during the second half of the assimilation period. This will be examined further in section 3.5.3.

\subsection{Horizontal vs. vertical correlation structure}
To separate the horizontal and vertical correlation spatial scales, the RCDFs in Fig. \ref{osse_fig9}a-c were reconstructed based on the observation-state pair horizontal and vertical distance. The horizontal scales have similar structures to those shown in Fig. \ref{osse_fig9}a-c, thus they are not shown. The vertical correlation scales are much smaller, especially at correlation thresholds of 0.6 and 0.8 (Fig. \ref{osse_fig9}e-f). The length scales are similar for 5 of the 6 observation-state pairs, where approximately 90\% (98\%) of the pairs have vertical separation distances of less than 2 km for correlations greater than 0.6 (0.8). For the radial velocity-W pair, the length scales are longer at all correlation thresholds. Only 50\% (70\%) of these pairs have vertical separation distances less than 2 km for correlations greater than 0.6 (0.8). These findings suggest why LOC12V3 had the smallest RMSE among the three vertical localization experiments. A 3 km vertical cutoff contains a large majority of the observation-state pairs with correlations greater than 0.6, while reducing the number of observation-state pairs that are dominated by sampling noise.


\subsection{Comparison with computed correlation length scales}
Correlation length scales of the forecast errors from the OSSEs were computed using a technique described by \citet{pereiraberre06}, hereafter PB06. While the RCDFs provide some insight into the spatial structure of the observation-state correlations, the PB06 technique computes a local horizontal correlation length-scale for each model grid point for each state field. A domain median length-scale was computed for all model grid points where reflectivity was greater than 20 dBZ for each assimilation period beginning with the 30-minute prior ensemble analyses. The PB06 technique tends to produce large length-scales along the edges of the simulated convection. These length-scales were judged to be an artifact of the large horizontal derivatives of the state fields in these areas. The domain median, instead of average, length-scale was computed to reduce the overall influence of the length-scales at these grid points.
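
A minimal sketch of how a local length scale of this kind can be obtained is given below. It assumes a Daley-type definition based on the curvature of the correlation function at zero separation and is not claimed to reproduce the exact PB06 implementation; the symbols \(\varepsilon\), \(\tilde{\varepsilon}\), \(\sigma\), and \(L_x\) are introduced here only for illustration. Writing the zero-mean forecast error as \(\varepsilon = \sigma \tilde{\varepsilon}\), where \(\sigma(x)\) is the local error standard deviation and \(\tilde{\varepsilon}\) has unit variance, the length scale in the \(x\)-direction is defined through
\begin{equation}
\frac{1}{L_x^2} = \mathrm{Var}\left( \frac{\partial \tilde{\varepsilon}}{\partial x} \right).
\end{equation}
Since \(\partial \varepsilon / \partial x = \sigma \, \partial \tilde{\varepsilon} / \partial x + \tilde{\varepsilon} \, \partial \sigma / \partial x\), and \(\tilde{\varepsilon}\) is uncorrelated with its own derivative when its variance is constant,
\begin{equation}
\mathrm{Var}\left( \frac{\partial \varepsilon}{\partial x} \right) = \frac{\sigma^2}{L_x^2} + \left( \frac{\partial \sigma}{\partial x} \right)^2
\quad \Longrightarrow \quad
L_x = \sqrt{ \frac{\sigma^2}{\mathrm{Var}\left( \partial \varepsilon / \partial x \right) - \left( \partial \sigma / \partial x \right)^2} },
\end{equation}
with the variances estimated from the ensemble perturbations. In this form, a strong horizontal gradient in \(\sigma\) reduces the denominator and inflates \(L_x\), which may help explain why large values appear where the fields change sharply, such as along the edges of the convection.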

Some agreement exists between the RCDF and PB06 length-scales. First, the length scales associated with the microphysical fields from both methods are generally smaller than the dynamic fields (Fig. \ref{osse_fig10}a) over the entire assimilation period. In addition, the changes in length scale during the assimilation period as the convection grows upscale (discussed in section 3.5.1) are also observed using the PB06 length scales. For instance, between 30 and 90 minutes, the PB06 length scales for U, V, and T increase by 300-500 meters. During this period, a more rapid increase occurs with QGRAUP and QSNOW between 30 and 60 minutes, followed by a period of little change for 60 minutes.

\begin{figure}
\centering
\includegraphics[scale=0.6]{osse_figure10}
\caption{(a) Domain-averaged PB06 length scale during the assimilation period for 6 state fields from the prior analysis ensemble. Average includes grid points where the truth simulation reflectivity \textgreater 20 dBZ. (b) Height-averaged PB06 length scale for 6 state
 fields from the final prior analysis ensemble.}
\label{osse_fig10}
\end{figure}

Differences are also evident between the two techniques. For example, the RCDFs produced using correlations larger than 0.4 indicate that the RV-W correlation length scale is larger than the RV-U and RV-V (Fig. \ref{osse_fig9}a), while the W field PB06 length scale is smaller than the U and V length scales. In addition, separation is noted among the PB06 microphysical length scales while the RCDFs show no evidence of any difference between the three fields. Specifically, the QRAIN PB06 length scale is the smallest of the 7 fields for almost the entire assimilation period. This length scale evolves differently than the other 6 fields; instead of increasing or remaining steady during the 30-90 minute period, it gradually decreases during the entire 120-minute period. During the final 30 minutes of the assimilation period, the PB06 length scales begin to decrease for all state fields except for QRAIN. There is no indication of this behavior using the RCDFs.

The median vertical profiles of PB06 length scale were created for the same 7 state fields for the final assimilation period (Fig. \ref{osse_fig10}b). The U and V length scales are maximized near the surface and aloft (above 8 km) with a minimum near 3 km. The W length scale is smallest near the surface and is largest above 8 km. The QGRAUP and QSNOW length scales gradually increase from the mid-levels (they are not defined below this level) to the top of the storm. These length scales are smaller than those of U, V, and W in the mid-levels, but larger than those of these 3 fields above 11 km. The QRAIN length scale is fairly constant below 5 km and is 1-2 km smaller than the U and V length scales near the surface.

The PB06 length scales appear to have some substantial variability within the convective storms simulated in these experiments. For example, the W length scale at 6.3 km from the final prior analysis is plotted in Figure \ref{osse_fig11} (only points where the reflectivity is greater than 20 dBZ are shown). The length scale varies in the across-line direction by as much as 5 km, with larger length scales toward the front of the line and smaller length scales toward the rear. The PB06 length scales are more consistent in the along-line direction. While potentially revealing, further analysis of the spatial variability of the forecast error length scales is left to future work.

\begin{figure}
\centering
\includegraphics[scale=0.8]{osse_figure11}
\caption{PB06 length scale from the final prior analysis for W at 7.4 km at points where the truth simulation reflectivity \textgreater 20 dBZ.}
\label{osse_fig11}
\end{figure}

\section{Ensemble forecasts}
As another measure of the proper localization cutoff length, 60-minute ensemble forecasts are initialized from the final 50 model states at the end of the 150-minute assimilation interval. While analyses with smaller initial error should produce better forecasts, imbalance in the analysis also impacts the development of forecast error. Imbalance can be introduced due to suboptimal localization cutoffs, which could potentially hamper the ability to initialize forecasts with these analyses \citep{greybushetal11}. Sixty-minute ensemble forecasts were produced for LOC6V6, LOC12V6, LOC12V3, and LOC12V12. Ensemble mean state-space RMSE was used as a measure of the skill of the forecasts; the truth simulation was run for an additional 60 minutes to use as verification for the ensemble mean forecasts.

For most of the experiments and state fields, the most rapid error growth occurs during the first 15-30 minutes (Fig. \ref{osse_fig12}). For some fields, RMSE continues to increase through the end of the 60-minute forecast period (e.g., U, QRAIN), while for others the RMSE is steady (e.g., QVAPOR) or even decreases in the 30-60 minute period (e.g., W, QCLOUD). The model states become progressively more similar among the experiments during the forecast period. For example, the range of RMSE values from the ensemble mean U analysis at the initialization time is roughly 2 m s\textsuperscript{-1}, while the error differences in the 60-minute forecast of U are less than 1 m s\textsuperscript{-1}.

\begin{figure}
\centering
\includegraphics[scale=0.7]{osse_figure12}
\caption{RMSE for ensemble mean forecasts initialized with the final posterior ensemble from LOC6V6, LOC12V12, LOC12V3 and LOC12V6 for 10 state fields. The last 12 analysis cycles are shown (unshaded), in addition to the 60-minute ensemble forecasts (shaded).}
\label{osse_fig12}
\end{figure}

For most of the forecast period, the RMSE evolves similarly in time between the experiments, with the forecast error a function of the initial analysis error (i.e., the experiments with smaller 60-minute forecast RMSE have smaller initial RMSE). For example, LOC12V3 generally produces the most accurate analyses, and the 60-minute forecasts benefit from this more accurate initial state.

While the sensitivity of the forecasts to localization is unique to this study, previous studies (e.g. \citealt{aksoyetal10,dawsonetal12}) that examined convective-scale ensemble forecasts using the EnKF have found similar results in terms of error growth, although their statistics were computed in observation-space. A significant difference in this work is the apparent decrease in error evident in several state fields as described above. This may be due to the simplistic view of convective-scale forecast quality used herein; other more advanced verification metrics (e.g. \citealt{schwartzetal10}) could provide a different view of these findings.

\section{Summary and Discussion}
Part I describes the results of several OSSEs to assess the impact of covariance localization for radar data on EnKF analyses of a developing convective system. Sensitivity to both horizontal and vertical localization was assessed via RMSE statistics, assimilation increments, and prior-truth increment structures. To interpret the results further, the prior correlations between the assimilated radar observations and state variables within the convective storm were computed and analyzed both in aggregate (RCDFs) and on a local scale (PB06 length scales). Further, ensemble forecasts were produced from the final EnKF analyses to provide another measure of the impact of different localization options on the analyses.

The results herein suggest that when assimilating radar data with the temporal and spatial resolution typically provided by a WSR-88D, a larger horizontal (\(r_h\) = 12-18 km), but smaller vertical (\(r_v\) = 3 km) localization cutoff (using a typical Gaspari-Cohn localization function) than previous studies could be beneficial, especially in cases where the convective evolution is dominated by thunderstorm interactions, mergers, and/or upscale growth into a linear convective system, as was the case in the present simulations. These findings run counter to the localization choices used in all previous convective-scale EnKF literature, especially the vertical localization cutoff, where the vertical length scales of the covariance structures for radar observations were seemingly limited. When considering the choice of localization, factors such as ensemble size and inflation should be considered. For example, a larger localization length scale would likely be beneficial when employing a larger ensemble \citep{houtekamermitchell98}.

Some discrepancies between previous work and the present work exist that are worth mentioning. For example, \citet{tongxue05} noted that \(r\) = 6 km produced the best results after decreasing the ensemble size from 100 to 40 members, which is similar to the ensemble size used herein. One difference between \citet{tongxue05} and this work is the model grid and observation density (2 km vs. 3 km). Another difference is the mode of the simulated convective system. Previous convective-scale EnKF studies, employing both OSSEs and real-data experiments, have primarily focused on producing analyses of an individual thunderstorm cell (e.g., tornadic supercells). While these studies have laid the foundation for convective-scale EnKF and proved its viability, the scenarios depicted in these studies are only a subset of the large variety of convective modes observed in nature. Thus, more work is needed to demonstrate how those results generalize to other types of convective systems and forms of severe weather.

The results of the current experiments were obtained with the assumption of a perfect model. In reality, model error (e.g. due to microphysical parameterizations) plays a significant role in producing accurate analyses and will likely modulate the exact results shown herein. Specifically, model errors can produce erroneous covariance structures, which could change the results reported herein. In Part III, real-data experiments are performed to examine the sensitivity of the analyses to radar data localization in a setting where model error is not neglected.

The results herein reiterate the difficulties with analyzing the surface cold pool, even in perfect-model OSSEs. Producing accurate analyses of the cold pool structure near the surface is essential in order to produce accurate forecasts of the speed and structure of cold-pool driven convective storms whose behavior and severity are largely controlled by the thermodynamic properties of the surface cold pool, even more so than other convective modes (e.g., supercells). Yet, a paucity of observations near the surface, sensitivities to assimilation parameters such as additive noise, and microphysical scheme errors (in real-data experiments) will make this a challenging task.

Herein, covariance localization was achieved by applying an isotropic weighting function to all ensemble covariances, based on the observation distance from the state location being updated. While this type of localization is easy to implement and has proven to be useful, the present results indicate a need for more flexible, adaptive localization options that vary based on location, state variable type, time, and dynamical considerations. For example, during a time when cell mergers and interactions were occurring, the OSSE using the largest localization radius produced the most accurate analyses. In addition, the covariance length scales increase as the overall system grows upscale, and the covariances exhibit substantial spatial variability. Furthermore, in the present experiments, radial velocity observations are related to the dynamic state variables of U, V, and W on slightly larger length scales than reflectivity observations are related to the microphysical variables. This result is consistent with several studies that have developed heterogeneous background error covariance models for 3D-VAR applications (e.g., \citealt{micheletal11}). While the present results may provide guidance for Gaspari-Cohn localization choices, they also underscore the need for techniques that determine the meaningful prior ensemble covariances dynamically during the data assimilation step (e.g., \citealt{bishophodyss09a,bishophodyss09b,chenoliver09}).
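
For concreteness, a sketch of this type of distance-based localization is given below, assuming the fifth-order piecewise rational function of Gaspari and Cohn and a serial ensemble filter; the symbols \(w\), \(z\), \(c\), \(a\), \(\Delta x_j\), and \(\Delta y\) are introduced here only for illustration. With \(z\) the distance between the observation and the state point and \(c\) the localization half-width,
\begin{equation}
w(z) = \left\{
\begin{array}{ll}
-\frac{1}{4} a^5 + \frac{1}{2} a^4 + \frac{5}{8} a^3 - \frac{5}{3} a^2 + 1, & 0 \le z \le c, \\[4pt]
\frac{1}{12} a^5 - \frac{1}{2} a^4 + \frac{5}{8} a^3 + \frac{5}{3} a^2 - 5 a + 4 - \frac{2}{3} a^{-1}, & c < z \le 2c, \\[4pt]
0, & z > 2c,
\end{array}
\right.
\qquad a = z/c .
\end{equation}
The weight decreases smoothly from 1 at \(z = 0\) to 0 at \(z = 2c\). Whether a quoted cutoff \(r\) corresponds to \(c\) or to \(2c\) depends on the convention adopted, which is not restated here. In a serial filter, the increment to state variable \(x_j\) from a single observation \(y\) can then be written as
\begin{equation}
\Delta x_j = w(z_j) \, \frac{\mathrm{cov}(x_j, y)}{\mathrm{var}(y)} \, \Delta y,
\end{equation}
where \(\mathrm{cov}\) and \(\mathrm{var}\) are prior ensemble sample statistics and \(\Delta y\) is the observation-space increment. Because \(w\) depends only on distance, sample covariances beyond the cutoff are discarded whether they are physically meaningful or spurious, which is the limitation that adaptive approaches aim to address.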