Extreme values of wind speed were studied using the high-resolution ERA5 dataset covering the central part of the Kara Sea. Cases in which the ice coverage of a cell exceeded 15% were filtered out. Our study shows that the wind speed extrema obtained from station observations, as well as from mesoscale model simulations, can be divided into two groups according to their probability distribution laws. One group is designated as black swans, and the other as dragons (or dragon-kings). In this study we determined that the ERA5 data accurately described the swans but did not fully reproduce the extrema related to the dragons; these extrema were identified in only half of the ERA5 grid points. For the dragons, Weibull probability distribution function (PDF) parameters were identified in only a quarter of the pixels. The parameters were connected almost deterministically, which effectively converted the Weibull function into a one-parameter dependence. It was not clear whether this uniqueness was a consequence of the calculation algorithm used in ERA5 or of the relatively small area considered, which had a uniform wind regime. Extremes of wind speed arise as mesoscale features and are associated with hydrodynamic features of the wind flow. If the flow was non-geostrophic and its trajectory had substantial curvature, the extreme velocities were distributed according to a rule similar to the Weibull law.

A large part of the Kara Sea (

The Arctic region is characterized by sparse in-situ observational coverage (conventional coastal weather stations, buoys, ships). Exceedingly few studies (e.g., [

Reanalysis data provide a useful alternative for filling these gaps in wind speed data over the Arctic, as they have global coverage and combine weather forecast models and assimilation of observations from a wide variety of sources. Modelling data (for example, within the framework of the historical CMIP5 experiment) have also been used to assess the pattern of surface wind climatology.

The climatology of winds across the oceans is detailed in multiple works [

This study focuses on the Kara Sea, a small part of the Pan-Arctic domain, to delineate its regional characteristics more clearly. For this purpose, the horizontally detailed ERA5 reanalysis was used. This product (see below) was developed by the European Centre for Medium-Range Weather Forecasts (ECMWF). Relatively little research has been published on the climate of this region. Additionally, we consider the hydrodynamic substantiation of the statistical laws that govern extreme values.

The Weibull distribution has traditionally been used for statistical approximations of wind extremes [

Apart from the statistical approach, an explanation of the observed wind speed probability distribution should rest on the hydrodynamics of atmospheric motion. This justification can be obtained by studying the products of numerical simulations or by studying equations that are sufficiently simplified to admit analytical solutions. In a previous paper, we concluded that the wind extremes modelled by a general circulation model involved only samples conforming to the base distribution (swans). The same conclusion was reached after analysing the ERA-Interim dataset. Thus, the coarse-resolution numerical products did not contain the observed exceptional outliers.

The next step of our analysis was to investigate how accurately a mesoscale atmospheric model (with a fine spatial resolution) simulated the aforementioned peculiarities of wind extremes [

In this study, we continue the investigation of the ability of numerical simulations to reproduce wind speed extremes based on the ERA5 dataset [

Regarding analytical models, several studies (e.g., [

In the next section, we describe the data and study area, and briefly summarize the methods. Section 3 describes the evidence for a Weibull distribution in the near surface wind speed. Section 4 is devoted to explaining how the Weibull distribution arises from simplified equations of hydrodynamics. Section 5 concludes the paper.

In this study, we used the new global reanalysis ERA5 developed by the ECMWF. The ERA5 reanalysis was improved compared to a previous successful ERA-Interim reanalysis [

For our purposes, we used zonal and meridional components of wind speed at 10 meters, the geopotential at 700 hPa and 850 hPa, and the sea ice concentration.

To apply statistical approaches, we composed our data according to the independence condition. Practically, this means that the data sample had to include only independent extreme values. We selected the maximum wind speeds from 3-day intervals in wind speed data for each grid cell. This interval was obtained via autocorrelation function analysis as a period for the disappearance of the correlation between fluctuations (correlation coefficient becomes insignificant). The same time intervals for the same aims were used in several previous studies [
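The selection of independent extremes described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names `wind_speed` and `block_maxima`, the sample values, and the choice to drop an incomplete trailing block are all assumptions made here.

```python
import math
from typing import List, Sequence


def wind_speed(u: float, v: float) -> float:
    """Wind speed modulus from zonal (u) and meridional (v) components."""
    return math.hypot(u, v)


def block_maxima(speeds: Sequence[float], samples_per_block: int) -> List[float]:
    """Return the maximum of each consecutive, non-overlapping block.

    A trailing incomplete block is dropped so that every maximum covers
    the same time span (an assumption of this sketch).
    """
    if samples_per_block <= 0:
        raise ValueError("samples_per_block must be positive")
    n_blocks = len(speeds) // samples_per_block
    return [
        max(speeds[i * samples_per_block:(i + 1) * samples_per_block])
        for i in range(n_blocks)
    ]


# Hourly data would give 72 samples per 3-day block; a tiny series with
# 3 samples per block keeps the illustration readable.
series = [4.0, 7.5, 6.1, 3.2, 9.8, 5.0, 8.8, 2.1]
maxima = block_maxima(series, samples_per_block=3)
print(maxima)  # [7.5, 9.8]
```

The block length is a parameter because it follows from the autocorrelation analysis rather than from the method itself.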

During the summer, the Kara Sea may either be open water or covered with ice of various concentrations. This causes different roughness conditions, as the roughness of open water is usually lower than that of sea ice. Drag coefficients for open water are approximately 1.5 to 2 times lower than those for the sea ice surface [

As mentioned, the statistics for extreme wind speed were described by the Weibull distribution. The following equations represent the cumulative distribution function (W, CDF) and the PDF (w):

$$W(u) = 1 - \exp\left[-\left(\frac{u}{V}\right)^{k}\right] \tag{1a}$$

$$w(u) = \frac{k\,u^{k-1}}{V^{k}} \exp\left[-\left(\frac{u}{V}\right)^{k}\right] \tag{1b}$$

The value of V sets the speed scale. A value of $u = V$ corresponds to $W(u) = 1 - e^{-1} \approx 0.63$. This means that V is slightly larger than the median ($u_{med}$), with $V = u_{med}\,(\ln 2)^{-1/k}$. The dependence of the moments of the distribution on the Weibull parameters is illustrated in Monahan (2006a). Note that for k = 3.6 the Weibull distribution approximates a normal distribution over a range of several standard deviations.
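The two properties just stated, W(V) ≈ 0.63 and a median slightly below V, can be checked numerically. The sketch below is only an illustration: the function names and the values k = 3.6, V = 10 m/s are assumptions chosen here.

```python
import math


def weibull_cdf(u: float, k: float, V: float) -> float:
    """Equation (1a): W(u) = 1 - exp[-(u/V)^k] for u >= 0."""
    return 1.0 - math.exp(-((u / V) ** k))


def weibull_pdf(u: float, k: float, V: float) -> float:
    """Equation (1b): w(u) = (k u^(k-1) / V^k) exp[-(u/V)^k]."""
    return (k * u ** (k - 1) / V ** k) * math.exp(-((u / V) ** k))


def weibull_median(k: float, V: float) -> float:
    """Median from W(u_med) = 1/2, i.e. u_med = V (ln 2)^(1/k)."""
    return V * math.log(2.0) ** (1.0 / k)


k, V = 3.6, 10.0  # illustrative values only
print(round(weibull_cdf(V, k, V), 3))  # 0.632
print(weibull_median(k, V) < V)        # True
```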

The Weibull parameters (k, V) are estimated using the maximum likelihood method. One variant of this method is discussed in [^{2}) providing a measure of the success of approximation. At all sites of the Kara Sea, we observed that practically all points of the CDF (besides several points depicting rare and high speeds) showed a close approximation to a Weibull distribution. In a mathematical sense, the use of R^{2} is related to the application of the Cramer-Mises-Smirnov statistical criterion. The application of the Kolmogorov-Smirnov test also showed that there was no reason not to trust the Weibull distribution (see, for example, [
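A maximum likelihood fit of (k, V) can be sketched in a few lines. The text does not specify which variant of the method was used, so the bisection solver below is an assumption, as is the way R^2 is computed here (empirical CDF at plotting positions (i + 0.5)/n against the fitted CDF). The synthetic sample stands in for real extremes.

```python
import math
import random
from typing import Sequence, Tuple


def fit_weibull_mle(x: Sequence[float]) -> Tuple[float, float]:
    """Maximum-likelihood estimates (k, V) for strictly positive data."""
    scale = max(x)
    s = [v / scale for v in x]      # rescale so powers cannot overflow
    logs = [math.log(v) for v in s]
    mean_log = sum(logs) / len(s)

    def score(k: float) -> float:
        # Likelihood equation for k; it increases with k and has one root.
        sk = [v ** k for v in s]
        weighted = sum(a * b for a, b in zip(sk, logs)) / sum(sk)
        return weighted - 1.0 / k - mean_log

    lo, hi = 0.01, 100.0            # assumed search bracket for k
    for _ in range(100):            # plain bisection
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    V = scale * (sum(v ** k for v in s) / len(s)) ** (1.0 / k)
    return k, V


def r_squared_cdf(x: Sequence[float], k: float, V: float) -> float:
    """R^2 between the empirical CDF and the fitted Weibull CDF (1a)."""
    xs = sorted(x)
    n = len(xs)
    empirical = [(i + 0.5) / n for i in range(n)]
    fitted = [1.0 - math.exp(-((v / V) ** k)) for v in xs]
    mean_emp = sum(empirical) / n
    ss_res = sum((e - f) ** 2 for e, f in zip(empirical, fitted))
    ss_tot = sum((e - mean_emp) ** 2 for e in empirical)
    return 1.0 - ss_res / ss_tot


# Synthetic sample; random.weibullvariate takes (scale, shape).
rng = random.Random(0)
sample = [rng.weibullvariate(10.0, 3.5) for _ in range(5000)]
k_hat, V_hat = fit_weibull_mle(sample)
r2 = r_squared_cdf(sample, k_hat, V_hat)
print(round(k_hat, 2), round(V_hat, 2), round(r2, 4))
```

Applied separately to the bulk of the sample and to the detached high-speed points, the same routine would give one (k, V) pair per population.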

Thus, most events fit into the basic distribution, and some of the most powerful ones did not fall into it. This result falls under the classification introduced in the Introduction, i.e., when the sample data of the same item refers to different distribution functions. In ERA5, the swans (and black swans) are always represented. Unlike station data, dragons are completely absent in some pixels (

from such a small volume of samples (

As a rule, with respect to a certain group of points, it is impossible to determine the population that they belong to, as the trend lines on the graphs practically coincide (

In the basic distribution, the values of V and k were unique in different pixels, but V varied minimally, i.e., from 9.5 to 10.5 m/s. Changes in the exponent were much more substantial (from 3 to 5). The parameter k increased in the south and east directions of the region, adjacent to Novaya Zemlya (

For dragons, the exponent in the Weibull distribution was substantially less than for swans (the value of k varied from 1 to 3). This meant that the distribution differed from the normal distribution because of the presence of a heavier tail, and that the likelihood of strong winds increased.

In

The expression for the basic range, i.e., the swans, was V = 0.36k + 8.7, with a low coefficient of determination. Because V varied minimally compared with k, there was no reason to expect a correlation. For dragons, V = 5.62 ln(k) + 3.29 with R^{2} = 0.99. Such a close relationship between the parameters means that the Weibull distribution effectively depends on a single parameter.
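To make the one-parameter reading concrete, the dragon regression can be substituted into the Weibull CDF so that only k remains free. This is a sketch: the function names are hypothetical, and the regression coefficients are the ones quoted above, which should only be trusted within the reported range of k (roughly 1 to 3).

```python
import math


def dragon_scale_from_shape(k: float) -> float:
    """Regression quoted in the text for dragons: V = 5.62 ln(k) + 3.29 (m/s)."""
    return 5.62 * math.log(k) + 3.29


def dragon_cdf(u: float, k: float) -> float:
    """Weibull CDF with V tied to k, leaving k as the only free parameter."""
    V = dragon_scale_from_shape(k)
    return 1.0 - math.exp(-((u / V) ** k))


for k in (1.0, 2.0, 3.0):
    print(k, round(dragon_scale_from_shape(k), 2))
```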

The reason for this unambiguity was unclear; it may have been a consequence of the algorithm for calculating the wind speed near the sea surface in the ERA5 reanalysis. Alternatively, it may have arisen because we examined a relatively small area with a uniform wind regime. A comparison of the parameters (k, V) according to station measurements in the Arctic does not demonstrate such a close relationship. There was an increase in V with increasing k. A strong connection between V and k was not noted according to the scatterometer [

Although anomalies related to dragons are rare events and their presence or absence has little effect on the parameters of the basic distribution, neglecting them can lead to an incorrect interpretation of the results. To illustrate, consider the situation at pixel 71.5˚N, 60˚E. In this cell, there were 26 events related to “dragons” (for 20 events, it was impossible to conclude whether they belonged to the dragons or to the black swans) (

_{0.99} was larger for dragons. Station data were characterized by large differences between representatives of different populations. This result, together with the already noted situation that the statistical properties of dragons were evaluated only in a quarter of cases, suggested that the ERA5 did not fully provide information on the largest extremes. We encountered the same phenomenon when analysing COSMO-CLM data with a horizontal step of 13.2 km: The model reproduced dragons, but they were not as powerful as those obtained from the measurement data [

Consider what happens if the data are not filtered for the absence of ice cover. The calculations showed, first, that over ice the wind speed distribution was described by the Weibull distribution with high accuracy (the determination coefficients never fell below 0.95). Second, dragons were almost completely absent in the sample. Third, the exponent over an open surface was always greater by approximately 15%, and the magnitude of V was greater by 10%. As a result, the average value of the set of extrema was lower by 1 m/s above the ice, and the variance was greater by 1 m^{2}/s^{2}. It is possible that the surface roughness was responsible for this effect.

The purpose of this section is to understand why the probabilities of extreme velocities were described by the Weibull distribution.

From a probabilistic point of view, the applicability of the Weibull distribution for extreme value analysis is generally based on the following concept. Starting with a parent distribution whose CDF is $Q(U)$, the distribution is sampled m times, and the maximum value of the m samples is obtained. This maximum value has a CDF of simply $Q^{m}$. Next, knowing the shape of the initial distribution, we can proceed to the law for extreme values. This allows them to be fit to one of three limiting distributions [
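The statement that the maximum of m independent samples has the CDF Q^m can be checked with a short simulation. The sketch below assumes a uniform parent on [0, 1], for which Q(x) = x. The parent distribution, the function name and the sample sizes are choices made here for illustration only.

```python
import random


def empirical_cdf_of_max(m: int, x: float, n_trials: int, seed: int = 0) -> float:
    """Fraction of trials in which the maximum of m uniform draws is <= x."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        if max(rng.random() for _ in range(m)) <= x:
            hits += 1
    return hits / n_trials


m, x = 5, 0.8
simulated = empirical_cdf_of_max(m, x, n_trials=20000)
theoretical = x ** m  # Q(x)^m with Q(x) = x for the uniform parent
print(round(simulated, 3), round(theoretical, 3))
```

The agreement relies on the draws being independent, which is why the extremes were first thinned to one value per 3-day interval.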

On the other hand, it is clear that the probability distribution of anomalies should be determined by the flow hydrodynamics. Accordingly, research has demonstrated that detailed numerical products such as the ERA5 or those derived from mesoscale models are capable of reproducing the observed statistical features of the wind regime within the main features. Conversely, coarse-resolution models only reproduce anomalies related to the base distribution.

To obtain a physical understanding of the observed and simulated PDFs of surface wind speeds, we consider the simple hydrodynamic model. This model should reflect the behaviour of the velocity modulus, as this value is used in statistical studies. The determination of an analytical justification for the Weibull type distribution law was attempted based on the characteristics of the hydrodynamic flow.

For this aim, following the classical book on the subject [

$$\frac{dU}{dt} = -g\,\frac{\partial H}{\partial s} \tag{2}$$

$$\frac{U^{2}}{R} + fU = -g\,\frac{\partial H}{\partial n} \tag{3}$$

These equations are well suited to our task because U denotes the horizontal speed as a nonnegative scalar. R is the radius of curvature of the trajectory, and f is the Coriolis parameter. In the stationary case, when the motion is parallel to the geopotential contours, $g\,\partial H/\partial s = 0$, and the dynamics are determined by Equation (3). We consider cyclonic motion (which corresponds to the conditions $R > 0$, $\partial H/\partial n < 0$), as the greatest wind speed anomalies are achieved under these synoptic conditions. We also take the curvature and the Coriolis parameter to be constant over a certain segment of the trajectory.

Viscosity and the effect of friction are not included in Equation (3). However, this does not preclude analysis, as it can be assumed that the flow is considered outside the atmospheric boundary layer. For the task of studying near-surface wind, this is not a limitation, because maximum velocities are associated with the transfer of large momentum values from the lower troposphere to the surface [

Because the geostrophic wind is defined as $U_g = -(g/f)\,\partial H/\partial n$, dividing Equation (3) by f gives:

$$\frac{U^{2}}{Rf} + U = U_g \tag{4}$$
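Equation (4) is a quadratic in U, so the wind speed that corresponds to a given geostrophic wind follows in closed form. The sketch below takes the nonnegative root for cyclonic flow (Rf > 0, U_g ≥ 0). The function name and the sample values are assumptions made for illustration.

```python
import math


def gradient_wind_speed(u_g: float, rf: float) -> float:
    """Nonnegative root U of U^2/(R f) + U = U_g, for R f > 0 and U_g >= 0.

    rf is the product R * f in m/s.
    """
    if rf <= 0.0 or u_g < 0.0:
        raise ValueError("expected rf > 0 and u_g >= 0 (cyclonic case)")
    return 0.5 * rf * (math.sqrt(1.0 + 4.0 * u_g / rf) - 1.0)


u = gradient_wind_speed(u_g=20.0, rf=10.0)
print(u)                   # 10.0
print(u * u / 10.0 + u)    # 20.0, recovering U_g
```

In this example the wind is weaker than the geostrophic wind, as expected for curved cyclonic flow, and the gap widens as Rf decreases.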

Equation (4) can be used to calculate the PDF of the wind speed through knowing the PDF of the geostrophic wind. The latter can be estimated by considering the PDF of the geopotential height.

For this purpose, we calculated the PDFs of the variations of the geopotential height at 850 and 700 hPa pressure levels for individual grid cells of the study area based on ERA5 data. These levels above the boundary layer were chosen because we analysed motion without friction (see above). The PDFs had the characteristic shape of Gaussian curves (bell curve) (

We then considered the difference in geopotential heights at points “1” and “2”. The difference $H_1 - H_2 \equiv \delta H$ determines the geostrophic wind. Determining the density function of the sum (difference) of two normally distributed quantities is a classical problem of probability theory. As a result, the PDF is given simply by the Gaussian curve

$$r(\delta H) = \frac{1}{2\sigma\sqrt{\pi(1-\rho)}} \exp\left[-\frac{\delta H^{2}}{4\sigma^{2}(1-\rho)}\right] \tag{5}$$

Here, $\sigma$ is the standard deviation of the height, and $\rho$ is the autocorrelation coefficient between height fluctuations at points “1” and “2”.
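The step from two correlated normal heights to Equation (5) can be written out explicitly. This is a sketch which assumes, as the form of (5) implies, that both heights have the same standard deviation σ and that their difference has zero mean.

```latex
\begin{aligned}
\operatorname{Var}(\delta H)
  &= \operatorname{Var}(H_1) + \operatorname{Var}(H_2)
     - 2\,\operatorname{Cov}(H_1, H_2) \\
  &= \sigma^{2} + \sigma^{2} - 2\rho\sigma^{2}
   = 2\sigma^{2}(1-\rho), \\[4pt]
r(\delta H)
  &= \frac{1}{\sqrt{2\pi\,\operatorname{Var}(\delta H)}}
     \exp\left[-\frac{\delta H^{2}}{2\,\operatorname{Var}(\delta H)}\right]
   = \frac{1}{2\sigma\sqrt{\pi(1-\rho)}}
     \exp\left[-\frac{\delta H^{2}}{4\sigma^{2}(1-\rho)}\right].
\end{aligned}
```

The stronger the correlation between the two points, the narrower the distribution of the height difference.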

Equivalently, this expression can be rewritten as the PDF of the geostrophic wind:

$$q(U_g) = \frac{1}{2\sigma_g\sqrt{\pi(1-\rho)}} \exp\left[-\frac{U_g^{2}}{4\sigma_g^{2}(1-\rho)}\right] \tag{6}$$

Here, $\sigma_g$ is the standard deviation of the geostrophic wind.

Taking into account Equation (4) and the function $q(U_g)$ from Equation (6), the CDF of the wind velocity is given by:

$$G(U) = \int_{0}^{U_g} q(x)\,dx = \int_{0}^{U^{2}/(Rf)+U} q(x)\,dx \tag{7}$$

The PDF is obtained by differentiation:

$$G'(U) \equiv g(U) = q\left(\frac{U^{2}}{Rf}+U\right)\cdot\left[\frac{2U}{Rf}+1\right] \tag{8}$$

$$g(U) = \left(\frac{2U}{Rf}+1\right)\cdot\frac{1}{2\sigma_g\sqrt{\pi(1-\rho)}}\cdot\exp[\Psi] \tag{9}$$

$$\Psi = -\frac{\left(U^{2}/(Rf)+U\right)^{2}}{4\sigma_g^{2}(1-\rho)} \tag{10}$$

This combination of functions resembles the Weibull distribution. The effective exponent depends on the curvature of the trajectory. When Rf is approximately 10 m/s, the effective degree is 3.5; at Rf ≈ 1 m/s, the degree is already 3.9, and at Rf ≈ 0.1 m/s, the effective degree is practically 4. These results are in accordance with those obtained according to ERA5 (see
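One way to quantify an effective degree is the local logarithmic slope of the exponent in Equation (10), d ln(−Ψ)/d ln U = 2(2U + Rf)/(U + Rf). It tends to 2 when Rf is much larger than U and to 4 when Rf is much smaller than U. The text does not state how its values were computed, so this definition and the reference speed of 30 m/s used below are assumptions of this sketch; with them, the result comes out close to the quoted figures.

```python
def effective_exponent(u: float, rf: float) -> float:
    """Local log-slope d ln(-Psi) / d ln U of the exponent in Equation (10).

    u  : wind speed in m/s (reference speed, chosen by the user)
    rf : product R * f in m/s
    """
    return 2.0 * (2.0 * u + rf) / (u + rf)


U_REF = 30.0  # assumed reference speed, not taken from the text
for rf in (10.0, 1.0, 0.1):
    print(rf, round(effective_exponent(U_REF, rf), 1))
# 10.0 3.5
# 1.0 3.9
# 0.1 4.0
```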

Comparing Expressions (9) and (1b), the similarity of their overall structures is apparent, although the factor in front of the exponential is not the same as that in (1b). The theory is suitable only for the basic distribution and can serve as an explanation of the probability of the appearance of black swans. The transition from black swans to dragons is not reproduced within this approach; for this purpose, apparently, we must consider the factors that were left out here (see above). The plausibility of this thesis is supported by our finding that, as already noted, dragons along with black swans were found in the wind fields reproduced by the mesoscale model. Despite these shortcomings, the result given by Equation (9) can be considered successful, because it confirms that in a stationary flow the distribution of velocity anomalies is of the Weibull type.

The data were analysed for ice-free cells of the Kara Sea (ice coverage less than 15% of the cell area). In many pixels, the ERA5 extreme wind speed sample split into swans (and black swans) and dragons. In a quarter of the grid nodes examined, the parameters of the Weibull probability distribution function could be estimated not only for the swan sample, but also for the dragon population. The practical importance of identifying dragons is that the largest anomalies are missed if they are not treated separately. Such errors arise easily in automatic data processing without special controls.

For swans and dragons, the relationship found between the parameters of the Weibull distribution was so close that the distribution effectively became a one-parameter distribution. It remains unclear whether this uniqueness of the connection was a consequence of the calculation algorithm used in ERA5 or of the relatively small water area considered, which has similar conditions for the formation of anomalies.

The manifestation of the general laws of extreme velocity statistics is predetermined by the general hydrodynamic peculiarities of flow. The curvature of the flow played a key role in distinguishing these peculiarities from the normal distribution of wind speed anomalies. As expected, ruggedness of the trajectory associated with non-geostrophic movements in mesoscale systems was reflected in extreme velocities. We were able to show that the distribution function of the anomalies had a shape close to that of the Weibull distribution. This demonstrates the bridge between the hydrodynamics and statistics of extreme events.

This work has received funding from the Russian Foundation for Basic Research (RFBR) (project number 18-05-60147) and Lomonosov Moscow State University (grant AAAA-A16-116032810086-4).

General data used in this study is archived in the repository (Datasets Generated: “Mendeley Data”, https://data.mendeley.com/datasets/vdy2nksk4h/1).

The authors declare no conflicts of interest regarding the publication of this paper.

Kislov, A. and Matveeva, T. (2021) Extreme Values of Wind Speed over the Kara Sea Based on the ERA5 Dataset. Atmospheric and Climate Sciences, 11, 98-113. https://doi.org/10.4236/acs.2021.111007