Evaluation of Multiplicative Weight of Covariance Matrix on Hybrid Data Assimilation Schemes

Abstract

This research develops a comparative study between different multiplicative weights that are assigned to the covariance matrix that represents the background error in two hybrid assimilation schemes: 3DEnVAR and 4DEnVAR. These weights are distributed between the static and time-invariant matrix and the matrix generated from the perturbations of a previous ensemble. The assigned values are 25%, 50%, and 75%, always having as a reference the ensemble matrix. The experiments are applied to the short-range Prediction System (SisPI) that works operationally at the Institute of Meteorology. The impact of Tropical Storm Eta on November 7 and 8, 2020 was selected as a study case. The results suggest that by giving the main weight to the ensemble matrix more realistic solutions are achieved because it shows a better representation of the synoptic flow. On the other hand, it is observed that 3DEnVAR method is more sensitive to multiplicative weight change of the first guess. More realistic results are obtained with 50% and 75% relations with 4DEnVAR method, whereas with 3DEnVAR a weight of 75% for the ensemble matrix is required.

Share and Cite:

González-Jardines, P. , Sierra-Lorenzo, M. and Ferrer-Hernández, A. (2023) Evaluation of Multiplicative Weight of Covariance Matrix on Hybrid Data Assimilation Schemes. Atmospheric and Climate Sciences, 13, 255-281. doi: 10.4236/acs.2023.132015.

1. Introduction

Data assimilation incorporates the information from meteorological observations into the Numerical Weather Prediction models (NWP), applying consistency restrictions and dynamic balance between all the meteorological variables, to produce an analysis field that constitutes the initialization of the model [1] [2] .

The objective of this process is to obtain initial conditions that produce the best possible numerical forecast. For this reason, most of the main world forecasting centers that have Numerical Weather Modeling systems dedicate great efforts to making this activity count with the highest data availability and the most robust and accurate assimilation system (https://www.nhc.noaa.gov/modelsummary.shtml, consulted: June 23, 2020).

Taking into account the current needs of the national meteorological service, the SisPI project (Short-range Forecast System, acronym in Spanish) evaluated the implementation of a robust and efficient data assimilation design that would improve the ability of short-term forecasts while adjusting to current technological capabilities. In this way, it is obtained that the hybrid assimilation schemes constitute the option that allows for obtaining more realistic results [3] [4] .

However, a data assimilation key concept is that the weight given to background and observations is inversely proportional to their error variance: information with a smaller error variance (implies a smaller expected error) receives more weight [5] . A critically important component for data assimilation is the background error (BEC, Background Error Covariance), which is represented by a covariance matrix at different points. This means that, if the model error in one place is usually related to an error in another place, applying a correction in the first place implies the need to correct the second place as well.

The largest volume of national research in this direction has been carried out mainly using the 3DVAR scheme (3-Dimensional Variational) [6] [7] . Sensitivity studies using different covariance matrices showed that the generation of domain-dependent matrices leads to more realistic results [8] . Following this same line [9] , evaluated the impact of the number of previous days necessary to build the background error, performing experiments with 7, 15, 30, and 45 days before the initialization instant, obtaining that matrices between 15 and 30 days are those with the greatest contribution to the assimilation process.

Finally, in 2021 [3] [4] hybrid assimilation methods are evaluated in SisPI, for the first time in Cuba. The results of this study suggest that the 4DEnVAR (4-Dimensional Ensemble-Variational) scheme is the most robust but at the same time the most computationally expensive, while 3DEnVAR (3-Dimensional Ensemble-Variational) tends to be unstable in performance since it can lead to very realistic solutions or others similar to 3DVAR, showing a discrete modification of the background field and therefore, very close to the version of the model without data assimilation.

Various foreign authors agree that giving a higher weight to the ensemble BEC, because theoretically its better represents the error of the day, leads to more realistic solutions [10] [11] . A similar result was obtained by Zhu et al. [12] , who estimated that a much higher weight for the ensemble covariance in a hybrid covariance, helps the proposed assimilation system to produce a more reasonable analysis field. Other authors, however, limit themselves to stating the advantages of combining the static BEC of the variational system with the ensemble BEC [12] [13] due to if the set does not adequately represent the flow of the day, it could be negative for the assimilation system.

Taking into account the previous approaches, this research aims to evaluate the impact of the multiplicative weight of the covariance matrix in hybrid schemes, which can lead to a more adequate calibration of the assimilation method used in SisPI.

2. Materials and Methods

2.1. Short-Range Forecast System (SisPI)

To carry out the experiments, the WRF (Weather Research and Forecasting) mesoscale model is used, with its ARW (Advanced Research WRF) dynamic core in version 3.8.1. This constitutes the SisPI’s core [6] [14] . The SisPI design has two-way nested domains of 27 and 9 km (kilometers) of spatial resolution, respectively (d01: 140 × 78; d02: 199 × 112). It also has a high spatial resolution domain of 3 km (d03: 421 × 184) one-way nested through the down tool (Figure 1).

The 27 and 9 km domains output is every three hours, the high resolution domain prints the solutions hourly basis. The model is initialized with data from the GFS (Global Forecast System) with 0.5˚ spatial resolution. These characteristics related to the execution and initialization of the operational version of SisPI were respected in the experiments described in this document to evaluate the operating system as homogeneous as possible.

The configuration proposed by SisPI includes 28 vertical levels, the Mellor-Yamada-Nakanishi and Niino2.5 PBL scheme, and the RRTM longwave

Figure 1. SisPI’s domains. (27 km) Parent domain, (9 km) blue box, (3 km) red box.

parameterization for all domains. In the case of low-resolution domains, it contains the microphysics of WSM5, the Grell-Freitas cumulus parameterization, and the shortwave Dudhia scheme. For the high resolution domain, the Morrison double-moment microphysics is used, the cumulus parameterization is deactivated and the Goddard shortwave radiation scheme is used. These differences have been supported by sensitivity studies made in the project development [6] [14] .

2.2. Assimilation Desing

To develop the assimilation experiments, the WRFDA (WRF Data Assimilation) was used in its version 3.9, available free of charge at https://github.com/wrf-model/WRF/releases?page=3. The selection of this slightly higher version than the WRF version used in SisPI responds to the fact that it incorporates the 4DEnVAR method for the first time, in addition to updating and improving the assimilation of satellite data compared to previous versions.

Assimilation was executed only on the high resolution domain, a decision that responds to several factors. In the first place, the fact that data assimilation can be carried out only in one domain at a time would force us to modify the execution philosophy of SisPI with the consequent increase in computational cost by repeating the process in each of the domains. On the other hand, one-way nesting gives the high-resolution grid the possibility of having initial and boundary conditions adjusted to it, which allows assimilation to be carried out in this domain by adding only the computational cost of the assimilation process itself.

The main disadvantage of this design is that the spatial coverage is limited and this can produce fluctuations concerning the availability of data to assimilate, especially with radiances, which come from polar-orbiting satellites.

Data is assimilated in prepbufr format, which contains information from weather stations in FM12 and METAR format, as data from ships, buoys, and soundings. The radiances are assimilated from information from polar-orbiting satellites with information from microwave channels in bufr format, the sensors AMSU-A (NOAA-15/16/18/19), MHS (NOAA-18/19), SSMIS (DMSP-16) and ATMS (Suomi-NPP) were used, in all cases downloaded free of charge from the site: https://rda.ucar.edu/datasets. The set of assimilable data to make the experiments was the same in all cases.

Assimilation Experiments Design

The hybrid methods, as well as the purely variational case, can be summarized as the iterative solution to find the state x that minimizes the cost function J(x) [5] [15] . This solution represents the maximum probability (least variance) estimate of the true state of the atmosphere.

Hybrid schemes combine two ways of obtaining the BEC. On the one hand, they use a BEC generated for a variational process. In this case, the differences or perturbations between valid forecasts for the same time but with different initialization are calculated, generally following the scheme (T1 + 24) minus (T2 + 12), where T is the instant of initialization in each run [15] [16] . These differences are usually constructed using different times of the day to remove the diurnal cycle, typically using 00:00 and 12:00 UTC (United Time Coordinated).

The differences obtained are averaged over a period that must range from 15 to 30 days before the initialization of interest at least [17] , which provides the system with a mean background error. The BECs obtained in this way tend to be synoptically dependent and time invariant, which acts as an encapsulated climatology, which can provide poor results when the flow of the day differs from the one fixed in the covariance matrix [2] [18] . For these experiments, static BECs built with 15 days are used, taking into account that previous results obtained with SisPI [8] , do not show statistically significant differences between the results obtained with BECs with 15 and 30 days.

The other way to obtain the BEC is by calculating the perturbations from a previous ensemble, where all the members contain the initialization time of the experiment to be carried out. For this, the SisPI solutions corresponding to two days before the initialization time are used. This feature responds to the fact that SisPI is executed 4 times a day and has a forecast extension of 36 hours. This allows to generation a small ensemble of 6 members (Figure 2).

The ensemble size can be a point of discussion because several authors work with ensembles larger than 20 members ( [19] Kong et al. 2017), others authors like Tong et al. [20] suggest that with small ensemble a satisfactory analyses can be achieved. By the other hand, the adjustment of the ensemble size responds to the technological limitations because its work with a design that can be used operationally.

This way of calculating the covariance matrix generates a flow-dependent BEC similar to that used in sequential assimilation schemes such as EnKF (Ensemble Kalman Filter). It is based on the fact that the covariances that weight the errors of the model extracted from the set, being dependent on the flow, can better represent the error of the day, as opposed to the isotropic and static characteristics in purely variational schemes [21] .

The same authors suggest that because the purely variational BEC is flow-independent may not adequately represent the structure of a cyclonic vortex, since the not geostrophic and vortex motion of tropical cyclones implies that this

Figure 2. Illustrative scheme that represents the construction of the ensemble used in the hybrid methods used in this research.

background error may vary with the flow of the system, making it dependent on the characteristics of the flow where it is embedded.

Therefore, the hybrid methods proposed in this research use a combination of both ways to calculate the BEC. The control variables associated with the BECs obtained for this research are detailed in Table 1.

The main advantage of using a hybrid scheme is the fact that it uses small ensembles to obtain flow-dependent perturbations, in these cases the use of static BEC can be beneficial to obtain a high covariance index [18] [20] [21] . On the other hand, the main disadvantage is precise that the ensemble used must well represent the flow of the day, otherwise, the result of the assimilation may not be the most appropriate.

In 3DEnVAR formulation (Equation (1)) the ensemble is valid only for the analysis time [10] [11] :

J ( x ; α ) = B s 1 2 ( x 1 x b ) T B 1 ( x 1 x b ) + B e 1 2 i = 1 n ( a | i T C 1 | a i ) + 1 2 [ y H ( x 1 + x e ) ] T R 1 [ y H ( x 1 + x e ) ] (1)

Here J(x,α) is the cost function, x is the analysis field, xb the background field, C is the diagonal matrix that controls the spatial correlation of α, which in turn represents the control variable used, Bs covariance matrix static, Be the flow-dependent covariance matrix, B the weight that is established between the matrices, H the observational operator contained within the WRFDA module, which interpolates the grid point values at the location of the observations and transforms the variables predicted by the model in observational quantities, finally R is the observational covariance error also contained within the assimilation package. Bs, Be and R matrices assuming a zero-mean Gaussian error distribution and are not correlated.

According to [21] and [22] the calculation of the increments is rewritten taking into account the contribution of both matrices (Equation (2)).

x x b = x 1 + 1 N 1 i = 1 N α i x i (2)

In this case, x1 represents the increase associated with the static BEC, while the expression on the right represents the increase associated with the ensemble perturbation. On the other hand, xi represents the average perturbation of the

Table 1. BECs variables control.

ensemble members, whose size is indicated by N (for this research N = 6).

The weight between the ensemble and static BEC must follow the relationship (Equation (2a)):

1 B s + 1 B e = 1 (2a)

4DEnVAR is very similar, the only difference is that it requires flow-dependent perturbations and observations for multiple time steps (Equation (3)). The weight assigned to the covariance matrices does not vary over time and in this case, a time window of 6 hours is used, 3 before and 3 after the initialization time.

J ( x ; α ) = B s 1 2 ( x 1 x b ) T B 1 ( x 1 x b ) + B e 1 2 i = 1 n ( a | i T C 1 | a i ) + 1 2 [ y H ( x 1 + x e ; k ) ] T R 1 [ y H ( x 1 + x e ; k ) ] (3)

where k represents the multiple perturbations used in the algorithm.

For this research, it was decided to study 3 possible combinations of relationships between the static BEC and the flow-dependent BEC. These combinations were established arbitrarily, being, in terms of percentage, 75/25, 50/50, and 25/75, always placing first the value of the variational BEC and in second place the one assigned to the ensemble BEC. The purpose of this design is to give greater importance to one or the other matrix over the background error, as well as equalize the impact of the matrices. From now on, for synthesizing the explanations, the weight value relative to the flow-dependent BEC will be taken as a reference. The nomenclature assigned to the different experiments is reflected in Table 2.

2.3. Experiments Evaluation

Because the evaluation is made on landfall of tropical storm Eta, it focused on 3 fundamental parameters of the storm, trajectory, intensity, and precipitation.

For this evaluation, continuous verification statistics are used, such as the mean absolute error (MAE), (Equation (4)), the root mean square error (RSME)

Table 2. Experiments nomenclature and their relationship with the weight attributed to the different matrices expressed in %.

(Equation (5)), and Pearson’s correlation (Pearson) (Equation (6)) [23] . Where, in all cases, O represents the observations and P the experiment’s solutions.

MAE = i 1 n O i P i n (4)

RSME = 1 n i 1 n ( O i P i ) 2 (5)

Pearson = i = 1 n [ ( O i O ¯ ) ( P i P ¯ ) ] i = 1 n ( O i O ¯ ) 2 i = 1 n ( P i P ¯ ) 2 (6)

The evaluation also includes a dichotomous analysis with the use of statisticians calculated through a contingency table (Table 3), which was applied mainly in the evaluation of precipitation. This type of analysis allowed a more detailed description of the behavior of the error of the precipitation forecasted by the different experiments for different thresholds.

The calculated statistics allow for obtaining an estimate of the occurrences that are correctly predicted (Equation (7)) and the proportion of non-occurrences that were incorrectly predicted (Equation (8)) [23] .

H = a a + c (7)

F = b b + d (8)

The critical success index (CSI) is calculated too. However, despite this statistician is used for low-frequency events since it does not take into account the correct negatives, the calculation of the extreme dependency index (EDI) is also introduced (Equation (10)), which can offers a better characterization than the CSI, given that it tends to give very low values when there are significant differences between the frequencies of manifestation of an event, which can lead to difficulties when making an adequate interpretation. This often happens with precipitation because large accumulations tend to occur in small areas and are less frequent.

CSI = a a + b + c (9)

EDI = log ( F ) log ( H ) log ( F ) + log ( H ) (10)

Table 3. Contingency table.

In the particular case of precipitation, the accumulated every 6 hours measurements by conventional meteorological stations distributed throughout the country was used for evaluation purposes (Figure 3).

Incremental analysis is carried out to measure the impact of each of the first guess proposed designs. The value of the increases or corrections is obtained by subtracting the values of the variables of the analysis field (assimilation result) and the background field (initial condition without assimilation). This step allows understanding of the reason for the model solutions, providing exact values of how the observations and the background error modify the first guess. Higher values of increments suggest a greater impact on the assimilation process.

Since the objective of SisPI is short and very short-term forecasting, the evaluation is made within the first 24 forecasting hours after the initialization.

3. Results

3.1. Brief Description of Studies Case

After reacquiring tropical depression status on November 6, located east of Belize, the system turned toward the east-northeast, a movement favored by the flow resulting from the combination of a mid- and upper-level trough with the circulation of the subtropical anticyclone (Figure 4) [24] .

At the beginning of day 7th, Eta once again reaches the category of a tropical storm, maintaining course, but temporarily accelerating its forward speed during that day to estimated speeds of 28 km/h. With this movement, it made landfall on the southern coast of the central region of Cuba, specifically through the province of Ciego de Ávila, at approximately 09:00 UTC on day 8th, with maximum sustained winds around 100 km/h, according to data from the reconnaissance plane.

Figure 3. Spatial distribution of conventional weather stations in Cuba.

Figure 4. Wind speed maps produced by the National Forecasted Center of INSMET, corresponding to November 7th, 2020 at 12:00 UTC. (a) 200 hPa level; (b) 500 hPa level. Courtesy of the National Forecast Center, INSMET.

3.2. Track Forecast Evaluation

The erratic nature of Eta’s track after emerging into the Caribbean Sea and its subsequent interaction with an upper-low pressure system made it very difficult to forecast the track of the tropical system. On the other hand, over the national territory, Eta experienced a reduction in its movement speed and an inflection in its trajectory, changing its course towards the northwest, aspects that tend to complicate this forecast in practice.

The runs initialized on November 7th at 12:00 UTC show great uncertainty during the first 6 to 9 hours (Figure 5(a)). This fact is related to the low ability of the model to locate the center of circulation, a usual difficulty in very weak systems such as the case of Eta at that time. Subsequently, the errors of all methods

Figure 5. Comparison between Eta best track (black), SisPI (grey), and the different assimilation experiments. (a) Initialization corresponding to November 7th; (b) initialization corresponding to November 8th.

are similar, although, in the intermediate terms, the designs with a greater influence of the static BEC tend to lead to less effective solutions than SisPI.

On the other hand, in the experiments started on day 8 at 00:00 UTC, a decrease in the trajectory error and a lower dispersion of the solutions can be seen (Figure 5(b)).

Common points are observed in both initializations. At the first place, the impact of data assimilation is evident, since the initial location of the cyclonic circulation center is more consistent with reality than in the case of SisPI, decreasing the model’s location error. On the other hand, the 4DEnVAR_75 scheme showed the best performance, generating the lowest forecast errors about the other designs.

These results are directly related to the forecast environment surrounding the storm. In this sense, the direction and speed of mean flow in the layer of isobaric surfaces that are between 700 and 400 hPa are decisive for the movement of the storm.

Concerning November 7th initialization, a tendency of the experiments to displace the center of circulation of the low-pressure zone in the height towards the southeast, where the latitudinal error is more significant, is observed (Figure 6).

Figure 6. Mean flow behavior in 700 to 400 hPa layers (lines), mean wind speed in that layer (shading), and forecasted position of the Eta circulation center (L). (a) Reanalysis of the RAP (Rapid Refresh Model); (b) 4DEnVAR_25; (c) 4DEnVAR_75. Maps corresponding to November 8th at 06:00 UTC forecast. Experiments b and c was initialized on November 7th.

The schemes that give greater weight to the flow-dependent BEC tend to generate an average circulation associated with this incipient low, a result that is more consistent with reality, which indicates that the set better represents the flow conditions where the storm is embedded than the static BEC.

It can be seen that a further eastward localization of the upper low core leads to solutions that turn more northward to Eta’s best track, which increases the trajectory error. About November 8th initialization, it can be observed that the solutions of the 4DEnVAR_50 and 4DEnVAR_75 experiments better represent the mean circulation of the layer selected, although generating a dipole with a northeast-southeast orientation as a result of a deepening of Eta, indicating an overestimation of storm intensity. In the case of the 3DEnVAR method, the 3DEnVAR_75 experiment better represents these characteristics, coinciding with a better representation of those experiments where the flow-dependent BEC has greater weight (Figure 7).

3.3. Eta Intensity Forecast Evaluation

The intensity forecast presented several difficulties. SisPI generated an overestimation of this indicator, forecasting minimum pressures within category 1 hurricane thresholds, something that never happened. A probable cause of these results lies precisely in the trajectory error since advancing the turn northward and therefore moving the system further westward, kept it longer over the sea, a factor that usually favors the intensification processes.

In both initializations, it is observed how the error is considerably reduced during the first 9 forecasting hours and, after this time, the experiments solutions converge to SisPI (Figure 8). This indicates that the impact of assimilation reaches its maximum effectiveness precisely at this time threshold.

The experiments related to 4DEnVAR method show the lowest errors, as a result of a more realistic reproduction of the synoptic environment unlike 3DEnVAR. In this sense, the 4DEnVAR_50 and 4DEnVAR_75 designs exhibit

Figure 7. Mean flow behavior in 700 to 400 hPa layer (lines), mean wind speed in that layer (shading), and forecasted position of Eta circulation center (L). (a) 3DEnVAR_25; (b) 3DEnVAR_75. Maps corresponding to November 8th at 06:00 UTC forecast. Experiments b and c was initialized on November 7th.

Figure 8. MAE comparison for the first 24 forecasting hours for both initializations. (a) November 7th; (b) November 8th.

the best results, coinciding with the most accurate track forecasts.

These differences between both methods are also related to the volume of assimilated observations, which is much greater in the case of 4DEnVAR since it includes data contained in a 6 hours window time, in contrast to 3DEnVAR, which only assimilates the observations present in the environment of the initialization time. This allows 4DEnVAR to have information, not only about the real state of the atmosphere over the high-resolution domain at the initialization instant but also about the behavior of the real gradients in the assimilation window used.

The way of different experiments solves the interaction of the storm with the upper low significantly influences the intensity forecasts. Eta was subjected to dry and cooler air intrusion from the third and fourth quadrants, this affected the storm’s core, which is usually warm and humid, limiting the intensification processes and leading Eta to remain in a kind of transition between a classic tropical system and a subtropical storm.

The cold, dry air intrusion mechanism works by generating undersaturated convective downdrafts that reduce the wet static energy in the boundary layer, thus limiting the energy available to the storm. In situations like this, it takes several hours for evaporation to recover moisture from the surface boundary layer before the intensification process can resume.

Therefore, this interaction affects the inertial stability of the storm, which is nothing more than its resistance to external forcing [2] . This is due to the inertial stability tends to decrease with height, due to the weakening of the rotational wind and the absolute vorticity of the upper anticyclonic outflow, which makes Eta susceptible to fluctuations of the upper low, a process that is reflected in the experiments, with different accuracy degrees.

The designs that make a better forecast of this extremely complex process were the most successful in predicting intensity. Figure 9 illustrates this by comparing temperature and relative humidity advection forecasts between the 3DEnVAR_25 (low performance) and 4DEnVAR_50 (high performance) designs.

Figure 9. Temperature advection forecast comparison; (a)-(b); and relative humidity; (c)-(d); at 950 hPa for the 3DEnVAR_25 (a)-(c) and 4DEnVAR_50 (b)-(d) experiments. In both cases initialized on November 8th. The time shown corresponds to November 8th at 06:00 UTC, near the storm’s landfall.

In this example, the 3DEnVAR_25 solution shows cold advection around the core but without interrupting the warm advection generated by deep convection growing around the center of the cyclonic vortex (Figure 9(a)), which is consistent with the limited dry advection around the center of Eta (Figure 9(c)).

On the opposite, 4DEnVAR_50 shows an interruption of the warm advection generated by the growth of deep convection clouds and a rearrangement of the greater convection, expressed in a greater warm advection, to the left side of the circulation center, one of the characteristics that they usually show subtropical cyclones or in transition to extratropicalization. This also coincides with a greater entry of dry air (Figure 9(b) and Figure 9(d)), which limits the intensification of the system and, by affecting the growth of deep convection cells due to interruptions in the flow of humidity, also limits the ability of Eta to produce significant precipitation over the central region.

In general, it can be summarized that all the proposed experiments lead to a reduction in the intensity forecast error, mainly during the first 9 forecasting hours. However, both hybrid schemes lead to an underestimation of the intensity of the storm, as opposed to SisPI which overestimates this feature, mainly at November 8th initialization (Figure 8).

Once again, it is observed that when weighting the background errors fundamentally using the flow-dependent BEC, more realistic solutions are achieved, although, in the case of 4DEnVAR, the variant that gives equal weight to both BECs also generates satisfactory results.

3.4. Eta Precipitation Forecast Evaluation

As Eta is a relatively weak system and moves very close to the Guamuhaya mountains, crossing extensive areas of agricultural importance in the country, precipitation is precisely the most important associated phenomena for forecasting purposes, whose forecast, as expected, is more or less accurate in correspondence with the ability to forecast the track and intensity of the storm.

An increase in the ability to forecast precipitation is observed in all the assimilation schemes, fundamentally in the initialization corresponding to November 7th, being somewhat more discreet in those experiments initialized on November 8th (Figure 10).

In the experiments initialized on November 7th, 4DEnVAR_50 and 4DEnVAR_ 75 designs showed the best performance indices for accumulates exceeding 10 mm/6 hours (Figure 10(a)). On the opposite, the experiments carried out with the 3DEnVAR method led to an underestimation of the accumulated precipitation, where the 3DEnVAR_50 design showed slightly superior results to the rest. In this sense, 3DEnVAR_75, with an increase in false alarms, and 3DEnVAR_25, with failures, exhibit a marked underestimation of the precipitation forecast.

The experiments initialized on November 8th show more modest results. In this sense, the highlight is a substantial increase in false alarms, as a consequence

Figure 10. Performance diagrams evaluating the accumulated precipitation forecasts that exceed the threshold of 10 mm/6 hours. (a) initialized at November 7th; (b) initialized at November 8th.

of an overestimation of the humid advection towards the west of the country (Figure 11), together with a greater volume of faults towards the central region due to an increase in dry advection associated with a greater intrusion of dry air from the upper low.

This dry advection overestimation is directly related to the forecast of the approximation of the center of the upper low to Eta core, since those solutions where both circulations are closer, the more depressed the precipitation will be in the center of the storm, which responds, as stated above, to the fact that dry

Figure 11. Moisture advection outputs at 500 hPa level (shaded); combined with the flow in the layer between 700 and 400 hPa (lines) and the relative position of cyclonic circulation center (L), initialized on November 7th, forecast corresponding to November 8th at 09:00 UTC when Eta was over Cuban territory. (a) 3DEnVAR_75; (b) 4DEnVAR_50.

air weakens the convection, creating undersaturated downdrafts that subsequently cool the surface boundary layer and reduce its moisture.

Analyzing the results taking into account different precipitation thresholds, it is observed that these differ to a certain extent from those shown up to now (Figure 12). This behavior is directly related to the fact that the Eta forecast does not manage to generate enough deep convection around the core and extensive areas of stratiform precipitation predominate, particularly towards the western and central portions of the country.

The experiments reflect this environment, however, it overestimates the spatial distribution of stratiform precipitation while underestimating deep convection zones. These differences are related to the dry advection around the storm core, something that, as has been seen, has a significant weight in the increase of errors in the designs where the influence of the static BEC predominates.

In general, it can be seen that 4DEnVAR method loses effectiveness as the value of the precipitation threshold increases. This means that the method predicts spatially limited deep convection zones, mainly towards the storm center, where the largest accumulated were generated. In this case, the 4DEnVAR_25 experiments, in both initializations, are the ones with the best performance with this method. This result indicates that when these experiments overestimate the convection around the center of Eta, not disrupting the center of the vortex, they improve the correct detections towards the central region and compensate, to some extent, the failures towards other regions of the country.

Related to 3DEnVAR method, the 3DEnVAR_75 experiments show the best results in both initializations. The main difficulties are presented towards the weakest precipitation thresholds, approximately from 0 to 20 mm/6hours. However, for heavy rain (≥50 mm/6hours) it exhibited EDI values above 0.5 in both initializations.

Figure 12. EDI behavior for different precipitation thresholds. (a) November 7th initialization; (b) November 8th initialization.

3.5. Complementary Evaluation

3.5.1. Cost Function

The cost function, by definition, represents the minimum deviation from the true state of the atmosphere [11] [15] . This breaks down into a linear combination of three functions: Jo which is associated with the observations, Jb which is related to the static BEC, and Je which represents the contribution of the flux-dependent BEC (Equation (11)).

J ( x ) = J o + J b + J e (11)

The general behavior is that as the minimization process advances, the variance of the observations decreases, and that of the BECs increases, in such a way that the dynamic consistency of the meteorological fields is maintained. The results show that the minimization requires a greater number of iterations in the case of the 4DEnVAR method, which is not surprising since the volume of assimilated information is greater than in the case of 3DEnVAR. However, it can be noted that in both initializations the number of additional iterations required by 4DEnVAR is usually not more than two, which indicates a high efficiency of the method (Figure 13).

The cost functions resulting from giving a higher weight to the flow-dependent BEC, generate higher values of J(x). This means that, since the contribution of the BECs is the same, these experiments (3DEnVAR_75 and 4DEnVAR_75) allow assimilating a greater number of observations.

This response is to the fact that, in 4DEnVAR case, a smaller number of observations are rejected when passing the pre-established quality control inside the assimilation process. For 3DEnVAR method, this detail influences the speed with which the method reaches the maximum reduction of the cost function, requiring some additional iterations in the 3DEnVAR_25 and 3DEnVAR_50 experiments. These results suggest that 3DEnVAR could be more sensitive to the BEC contribution than 4DEnVAR, therefore, when the adjusted error in the covariance matrices does not adequately represent the flow of the day, their solutions could be more affected than in the case of 4DEnVAR.

Figure 13. Cost function for both initializations and all assimilation experiments.

3.5.2. Incremental Analysis

A brief evaluation of the behavior of these corrections or increases corroborates the explanation above. For example, concerning the temperature fields (Figure 14), it is observed that 4DEnVAR generates a more realistic initial condition, by representing the growth of convective cells that took place in the Eta circulation. This process is characterized by a slight cooling at low levels and heating at high levels, in response to the flow of sensible heat (heat transfer), which favors the conversion of wet static energy into kinetic energy, allowing the intensification of the system.

The evaluation also shows that the contributions tend to be discrete and more notable between the experiments in the case of 4DEnVAR, since with the 3DEnVAR method the differences between the experiments are barely perceptible and, on the other hand, in the low and medium model levels, generates a modification of the first guess that tends to zero, which indicates a limited effect with the propagation of the correction of the observations, suggesting some isotropic characteristics that are not appreciated with the 4DEnVAR method.

On the other hand, the 4DEnVAR_25 experiments show a tendency to cool around the intermediate model levels. This behavior translates into an acceleration of the vertical movements in their ascent through the wet adiabatic gradient,

Figure 14. Temperature (˚K). Mean of gridded analysis increments.

advancing the appearance of the convective cells for the other experiments, a fact that explains the behavior of their solutions with the precipitation forecast.

The differences, expressed by RSME between the analysis field and first guess, in the case of temperature are therefore discrete, in thresholds of 0.2 to 0.4 ˚K, reaching values close to 1.0 ˚K in the upper levels, as a result of radiances ingestion. The 4DEnVAR method shows to be less susceptible to BEC fluctuations compared to 3DEnVAR, which exhibits the greatest differences with the 3DEnVAR_ 75 experiments (Figure 15).

Pressure field increments showed very similar results (Figure 16). In general, the assimilation schemes lead to a slight increase in the atmospheric pressure values, as a result of predicting a weaker cyclonic vortex in contrast to the background field represented by SisPI, which overestimates the intensity of Eta in both initializations.

In the experiments initialized on November 7th, again, little difference is observed between the mean increments per grid point obtained through the different experiments with the 4DEnVAR scheme. However, the 4DEnVAR_50 and 4DEnVAR_25 generate greater increments from surface to middle levels, excessively weakening the low-pressure region associated with Eta, which is related to a greater underestimation of the intensity of the cyclone forecasted by these

Figure 15. Temperature (˚K). RSME between analysis field and first guess.

Figure 16. Pressure (Pa). Mean of gridded analysis increments.

experiments.

In 3DEnVAR method, the differences between the weight given to the BECs are more significant. In this case, 3DEnVAR_25 and 3DEnVAR_50 generate very discrete increments, which do not sufficiently correct the pressure field of the first guess. On the other hand, 3DEnVAR_75 generates a volume of corrections comparable to those obtained with 4DEnVAR, however, the result of the increases is a weaker initial vortex.

The differences between the analysis and the background field indicate precisely that the 3DEnVAR method and, to a lesser extent, the 4DEnVAR_25 experiment, show the largest positive differences, as a result of an average increase in atmospheric pressures over the high-resolution domain. On the other hand, the 4DEnVAR_50 and 4DEnVAR_75 experiments generate a more discrete average increase, as a result of a more realistic representation of the intensity of the cyclonic vortex at initialization (Figure 17).

With the initialization corresponding to November 8th, it can be observed that the experiments with a greater influence of the flow-dependent BEC lead to more discrete contributions compared to those more dependent on the static BEC.

This indicates that the experiments with a greater influence of the static BEC

Figure 17. Pressure (Pa). RSME between analysis field and first guess.

tend to create zones of higher atmospheric pressure around Eta, affecting the static stability of the system and therefore incurring greater trajectory and intensity errors, as described in the previous sections. This is, to some extent understandable, since the static BEC was obtained by averaging the model disturbances 15 days before the initialization instant, where the synoptic flow was not disturbed by the presence of any synoptic-scale low-pressure system, so it does not adequately reflect the background error of the model under the conditions where the experiments are carried out. This reality negatively affects the propagation of the contribution of the assimilated observations. In this case, the flow-dependent BEC better reflects the error of the day and generally leads to obtaining a new initial condition that is closer to reality.

4. Conclusions

This research shows the impact of data assimilation with the hybrid 3DEnVAR and 4DEnVAR methods, by empirically modifying the weight of the covariance matrices on the assimilation process.

It suggests that the impact of data assimilation, for both methods, has a maximum influence that extends for up to 6 or 9 hours after the initialization time, after which the results converge to the solution without assimilation, although a decrease in errors in a general sense is appreciated for the first 24 forecasting hours.

Track, intensity, and precipitation forecasts were strongly influenced by how the experiments reflected the interaction between Eta and the upper-low system. The main errors were associated with a displacement toward the southeast of the center of the upper low and an increase in the baric gradients in the 700 - 400 hPa layer, which increased the speeds of guiding currents.

Experiments that give more weight to the static BEC tend to generate higher pressures in the domain and, as a result, predict a weaker storm. They do not adequately reflect the interaction with the upper low and consequently, their solutions are far from reality by failing to predict the vortex disruption due to dry air intrusion, this generates greater errors of trajectory and intensity mainly, compared to the designs that give more weight to the flow-dependent BEC.

The 3DEnVAR scheme seems to be more sensitive to BEC changes, presenting the greatest differences between experiments. On the other hand, 4DEnVAR is usually more robust and efficient, presenting the best results except for the precipitation forecast. Precipitation turned out to be the most problematic variable with low skill indices in the experiments. 4DEnVAR does not show homogeneous results in the evaluation carried out, while the 3DEnVAR_75 experiments exhibited the best ability for all the thresholds analyzed with an EDI ≥ 0.5.

These results suggest that giving a weight of 75% to the flow-dependent BEC leads to the most realistic solutions with both methods since it reflects the flow conditions more accurately than the static BEC. In the case of 4DEnVAR assigning a 50% weight between both BECs can also lead to satisfactory results.

Acknowledgements

The authors of this research are grateful for the support provided through the project “Building resilience to drought in Cuba”.

This research was carried out thanks to a grant from the International Development Research Center (IDRC), Ottawa, Canada.

The opinions expressed do not necessarily represent those of IDRC or its Board of Governors.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Kalnay, E. (2003) Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, Cambridge.
https://doi.org/10.1017/CBO9780511802270
[2] Laing, A. and Evans, J. (2010) Introduction to Tropical Meteorology: A Comprehensive Online and Print Textbook. Version 2a. COMET Program, University Corporation for Atmospheric Research.
https://www.meted.ucar.edu/tropical/textbook_2nd_edition_es/
[3] González, P.M., Sierra, M. and Ferrer, A.L. (2022) Data Assimilation System Applied to Short-Range Forecast System. Environmental Sciences Proceedings, 19, Article 20.
https://doi.org/10.3390/ecas2022-12806
[4] Ferrer Hernández, A.L., González Jardines, P.M., Sierra Lorenzo, M. and de la Caridad Aguiar Figueroa, D. (2022) Impact of the Assimilation of Non-Precipitating Echoes Reflectivity Data on the Short-Term Numerical Forecast of SisPI. Environmental Sciences Proceedings, 19, Article 13.
https://doi.org/10.3390/ecas2022-12845
[5] Barker, D., Huang, W., Guo, Y. and Xiao, Q. (2004) A Three-Dimensional (3DVAR) Data Assimilation System for MM5: Implementation and Initial Results. Monthly Weather Review, 132, 897-914.
https://doi.org/10.1175/1520-0493(2004)132<0897:ATVDAS>2.0.CO;2
[6] Sierra, M., Ferrer, H., Valdés, R., Mayor, G., Cruz, R., Borrajero, I., Rodríguez, C., Rodríguez, N. and Roque, A. (2015) Sistema Automático de Predicción a Mesoescala de Cuatro Ciclos Diarios, Instituto de Meteorología, La Habana.
[7] Armas, E. (2015) Asimilación de Observaciones de Satélites Meteorológicos En El Modelo WRF. Trabajo Diploma, Instituto Superior de Tecnologías y Ciencias Aplicadas, La Habana.
[8] Fernández, D. (2019) Corrección de Errores Sistemáticos En La Inicialización Del Modelo WRF Utilizando Asimilación de Datos de Radar. Trabajo Diploma, Instituto Superior de Tecnologías y Ciencias Aplicadas, La Habana.
[9] Ferrer, A. and Borrajero, I. (2019) Estudio Preliminar Para La Implementación De La Asimilación De Datos De Radares Y Satélites Meteorológicos En Pronósticos Numéricos A Muy Alta Resolución Sobre La Región Occidental De Cuba. Informe de Resultado Científico Presentado en el Consejo Científico del Instituto de Meteorología de Cuba. Instituto de Geografía Tropica, La Habana.
[10] Schwartz, C.S., Liu, Z., Huang, X.-Y., Kuo, Y.-H. and Fong, C.-T. (2013) Comparing Limited-Area 3DVAR and Hybrid Variational-Ensemble Data Assimilation Methods for Typhoon Track Forecasts: Sensitivity to Outer Loops and Vortex Relocation. Monthly Weather Review, 141, 4350-4372.
https://doi.org/10.1175/MWR-D-13-00028.1
[11] Malakar, P., Kesarkar, A.P., Bhate, J. and Deshamukhya, A. (2020) Appraisal of Data Assimilation Techniques for Dynamical Downscaling of the Structure and Intensity of Tropical Cyclones. Earth and Space Science, 7, e2019EA000945.
https://doi.org/10.1029/2019EA000945
[12] Zhu, S., Wang, B., Zhang, L., Liu, J., Liu, Y., Gong, J., Xu, S., Wang, Y., Huang, W. and Liu, L. (2022) A 4DEnVar-Based Ensemble Four-Dimensional Variational (En4DVar) Hybrid Data Assimilation System for Global NWP: System Description and Primary Tests. Journal of Advances in Modeling Earth Systems, 14, e2022MS003023.
https://doi.org/10.1029/2022MS003023
[13] Lee, J.C.K., Amezcua, J. and Bannister, R.N. (2022) Hybrid Ensemble-Variational Data Assimilation in ABC-DA within a Tropical Framework. Geoscientific Model Development, 15, 6197-6219.
https://doi.org/10.5194/gmd-2022-3
[14] Sierra-Lorenzo, M., Borrajero-Montejo, I., Ferrer-Hernández, A., Morfá-ávalos, Y., Morejón-Loyola, Y. and Hinojosa-Fernández, M. (2017) Estudios de Sensibilidad Del SisPI a Cambios de La PBL. La Cantidad de Niveles Verticales y, Las Parametrizaciones de Microfísica y Cúmulos, a Muy Alta Resolución.
https://www.researchgate.net/publication/325050959_Estudios_de_sensibilidad_del_SisPI_a_cambios_de_la_PBL_la_cantidad_de_niveles_verticales_y_las_parametrizaciones_de_microfisica_y_cumulos_a_muy_alta_resolucion
[15] Barker, D., Huang, X.-Y., Liu, Z., Auligné, T., Zhang, X., Rugg, S., Ajjaji, R., Bourgeois, A., Bray, J. and Chen, Y. (2012) The Weather Research and Forecasting Model’s Community Variational/Ensemble Data Assimilation System: WRFDA. Bulletin of the American Meteorological Society, 93, 831-843.
https://doi.org/10.1175/BAMS-D-11-00167.1
[16] Parrish, D.F. and Derber, J.C. (1992) The National Meteorological Center’s Spectral Statistical-Interpolation Analysis System. Monthly Weather Review, 120, 1747-1763.
https://doi.org/10.1175/1520-0493(1992)120<1747:TNMCSS>2.0.CO;2
[17] Skamarock, W.C., Klemp, J.B., Dudhia, J., Gill, D.O., Barker, D.M., Wang, W. and Powers, J.G. (2008) A Description of the Advanced Research WRF Version 3. NCAR Technical Note-475+STR, UCAR.
[18] Lu, X., Wang, X., Li, Y., Tong, M. and Ma, X. (2017) GSI-Based Ensemble-Variational Hybrid Data Assimilation for HWRF for Hurricane Initialization and Prediction: Impact of Various Error Covariances for Airborne Radar Observation Assimilation. Quarterly Journal of the Royal Meteorological Society, 143, 223-239.
https://doi.org/10.1002/qj.2914
[19] Tong, M., Sippel, J.A., Tallapragada, V., Liu, E., Kieu, C., Kwon, I.-H. Wang, W., Liu, Q., Ling, Y. and Zhang, B. (2018) Impact of Assimilating Aircraft Reconnaisse Observations on Tropical Cyclone Initialization and Prediction Using Operational HWRF and GSI Ensemble-Variational Hybrid Data Assimilation. Monthly Weather Review, 146, 4155-4173.
https://doi.org/10.1175/MWR-D-17-0380.1
[20] Statterfield, E., Hodyss, D., Kuhl, D. and Bishop, C. (2018) Observation-Informed Generalized Hybrid Error Covariance Models. Monthly Weather Review, 146, 30605-3622.
https://doi.org/10.1175/MWR-D-18-0016.1
[21] Wang, X., Barker, D.M., Snyder, C. and Hamill, T.M. (2008) A Hybrid ETKF-3DVAR Data Assimilation Scheme for the WRF Model. Part II: Real Observation Experiments. Monthly Weather Review, 136, 5132-5147.
https://doi.org/10.1175/2008MWR2445.1
[22] Wang, X., Snyder, C. and Hamill, T.M. (2007) On the Theoretical Equivalence of Differently Proposed Ensemble-3DVAR Hybrid Analysis Schemes. Monthly Weather Review, 135, 222-227.
https://doi.org/10.1175/MWR3282.1
[23] Jolliffe, I.T. and Stephenson, D.B., Eds. (2003) Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley, Hoboken.
[24] Pasch, R.J., Reinhart, B.J., Berg, R. and Roberts, D.P. (2021) Hurricane ETA. National Hurrican Center Tropical Cyclone Report No. AL292020, NOAA, 70.
https://www.nhc.noaa.gov/data/tcr/AL292020_Eta.pdf

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.