An Analog Method for Seasonal Forecasting in Northern High Latitudes

An analog forecast method designed for monthly and seasonal outlooks is applied to the Arctic. The analog selection process uses pattern matches based on agreement with historical data to identify past years with similar distributions of sea level pressure, upper-air geopotential height, surface and upper-air temperatures, precipitation, and sea surface temperatures. The evolution of the atmosphere in the analog years is then the basis of a prediction for the target year. Users can choose the predictor domain, the predictand domain, the variable to be predicted, and the number of antecedent months on which the analog selection is based. We provide an example of a monthly forecast generated by the analog forecast tool. In comparisons with operational dynamical model forecasts over the period 2012-2019, the analog system underperforms the dynamical models in middle latitudes but generally outperforms the dynamical models in monthly forecasts of surface air temperatures in the Arctic. The improvement over the dynamical models is especially apparent in the late summer and early autumn (August-October).


Introduction
While the concept of analog forecasting has been known since the advent of weather forecasting, serious scholarly research on the use of analogs for weather and climate forecasting began in the late 1960s and continued into the 1980s [1] [2] [3] [4] [5]. These studies, and many others, sought to identify large-scale patterns in the atmosphere and ocean as a guide to forecasting weather over timescales ranging from days to seasons. On the seasonal time scale, perhaps the [...] usually assume no a priori knowledge of the climate system. A second approach is to utilize analogs in conjunction with dynamical (numerical weather prediction, NWP) models [9] [10]. More recently, machine learning techniques have been brought to bear on the use of analogs in weather forecasting, particularly in the targeting of surface temperature extremes over multiday time scales [11] [12]. Despite this upswing of research interest, analog-based methods have not been prominent in weather and climate forecasting strategies in the past few decades, especially with the advances in dynamical models for weather forecasting and climate simulations. For purposes of deterministic weather forecasts, the conventional wisdom is that analogs diverge sufficiently rapidly that a forecast based on analogs cannot compete with forecasts from dynamical models. The divergence arises from nonlinear error growth triggered by differences between any two analog states.
Figure 1. Winter anomalies of temperature (red = warmer than normal, blue = cooler than normal) and precipitation (green = wetter than normal, brown = drier than normal) in winters with (a) El Niño and (b) La Niña conditions in the equatorial Pacific. Characteristic jet stream patterns are shown by thick arrows. From Climate Prediction Center/NCEP/NWS/NOAA, https://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensocycle/enso_cycle.shtml.
However, dynamical models are also sensitive to errors in the initial conditions, and several additional considerations suggest that a fresh look at an analog approach may be in order, especially at monthly-to-seasonal lead times. These considerations include the recognition that slowly evolving surface boundary conditions (sea surface temperature, sea ice, soil moisture) contribute to departures from normal atmospheric states. Potential applications are especially deserving of exploration for regions such as the Arctic, where the analog method has not previously been evaluated, where observational data to initialize dynamical models are relatively sparse, and where dynamical model skill is lower than in middle latitudes (https://www.ecmwf.int/en/about/media-centre/science-blog/2018/improving-prediction-and-climate-monitoring-polar-regions).
Additional motivation for the present study includes the following. First, beyond the range of several weeks, the skill of dynamical models in predicting departures from monthly and seasonal means is small. Second, statistical approaches such as those used by the Climate Prediction Center for its seasonal outlooks (e.g., Canonical Correlation Analysis (CCA), Regression Analysis (SMT) tool) implicitly assume that there is some longer-term forcing that can provide forecast skill. This forcing may arise from persistence in ocean temperatures or other parts of the land-ocean system, or from evolution of this forcing inherent in the analog approach. In this regard it should be noted that an analog ap- [...] forecast system is applied only to shorter forecast ranges (days), it 1) demonstrates the use of analog forecast targeting impacts, which in the CIPS case are tied to severe weather threats and 2) is publicly available.
The targeting of applications and the real-time availability highlight one of the main advantages of analog forecasts: stakeholders and decision-makers can be given historical analogs for use in assessing impacts of greatest relevance to their particular decision-making needs (e.g., related to coastal flooding or, more generally, the likelihood of severe weather) by knowing which past events most closely resemble their present situation. On the seasonal timescale, analog years provide potentially useful information for anticipating wildfire season severity, droughts or, in the Arctic, sea ice conditions affecting navigation and other offshore activities. The fact that the analog forecast approach has not been rigorously explored for the Arctic represents an opportunity for enhancing monthly-to-seasonal outlooks in a region that is arguably under-served by the suite of existing forecast products.
The considerations summarized above led to the present study, for which the objectives were to: 1) develop an analog system to forecast atmospheric conditions in the Arctic over timescales of a month to several seasons, and 2) assess the skill of the analog forecasts relative to dynamical model forecasts. Because our analog system will target forecasts on the monthly to seasonal timescale, monthly fields will form the basis of the analog selection. The study is carried out with the aid of software that gives the user the choice of region, forecast lead time, and predictor variables, all at monthly temporal resolution.

Data Source
The analog forecast procedure presented here uses current atmospheric and sea surface temperature data from the NCEP/NCAR R1 reanalysis [13] [14] to identify a set of n best historical matches (analogs). For illustrative purposes, we choose n = 5. A previous hindcast study [15] has shown that values of n in the range of 5 to 10 represent an optimal compromise between excessive damping of anomalies as the number of analogs increases above about 5-10 and the increased volatility as the number of analogs is reduced below five. The R1 reanalysis is used because it offers a longer period of record (since 1949) than most other atmospheric reanalyses and is routinely updated (to within a day or two of real time). Its resolution of 2.5˚ × 2.5˚ in latitude and longitude enables the running of the required matching algorithm in less than a minute on a laptop or desktop computer.

Generation of the Analog Forecast
The analog identification is based on six variables from an antecedent period: surface air temperature, sea level pressure, precipitation, upper-air pressure (geopotential height), upper-air temperature, and sea surface temperature. As described below, there are several options for the pressure level of the upper-air geopotential height and temperature. The antecedent period, which is also user-selected, can be as current as the most recent calendar month. The main attributes of the analog system are presented in Figure 2. In order to capture the [...] First, a search area and the calendar month(s) on which the analog-year selection is based are specified by the user (upper left panel in Figure 2). Monthly data are available, so the time window can be specified as a single month (e.g., January 2020) or a set of 2, 3, …, 12 consecutive months (e.g., June through August 2020). The geographical area should be one that the user believes is climatologically predictive of the intended forecast area. One natural choice is a region in the Pacific Ocean near the equator, which is part of teleconnection patterns (El Niño, La Niña) known to be associated with monthly-to-seasonal variations over a large area (Figure 1). The user also selects the pressure levels (925, 500 or 250 hPa) for the upper-air geopotential height and air temperature fields to be used in the analog-year selection. The upper-air fields used in the analog selection are in addition to the near-surface fields of sea level pressure, 2-meter air temperature, precipitation and sea surface temperature.
There are varying degrees of relationship among the user options. First, the geographical area of the predictors can be the same as the area for which the prediction is desired, although it need not be. In the left portion of Figure 3, the two areas are different. With regard to the variables, the six predictor variables are pre-determined or "hard-wired" into the tool, while the variable being predicted is left to the choice of the user. The predicted variable can be any one of the six variables in the predictor suite. Both the predictor (analog selection) timeframe and the prediction timeframe can range in length from 1 to 12 months. However, the month(s) used for the analog selection (and the predictor variables) are different from the month(s) of the prediction. The former must precede the latter, and there must be no overlap; otherwise the predictor information would extend into the timeframe for which the prediction is being made, in which case there would not be a true "prediction".
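The timing constraint described above (1 to 12 months for each window, with the selection window ending before the forecast window begins) can be expressed as a small check. The following is a minimal Python sketch, not the tool's actual code; the function names `month_index` and `lead_months` and the `(year, month)` tuple convention are assumptions introduced here for illustration.

```python
def month_index(year, month):
    """Map a calendar (year, month) to a running month count."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return year * 12 + (month - 1)


def lead_months(sel_start, sel_len, fc_start, fc_len):
    """Check the selection and forecast windows and return the lead in months.

    sel_start, fc_start: (year, month) of the first month of each window.
    sel_len, fc_len: window lengths in months (1 to 12).
    A lead of 1 means the forecast window starts in the month right after
    the last month of the selection window.
    """
    for length in (sel_len, fc_len):
        if not 1 <= length <= 12:
            raise ValueError("window length must be 1 to 12 months")
    sel_last = month_index(*sel_start) + sel_len - 1
    fc_first = month_index(*fc_start)
    if fc_first <= sel_last:
        # Predictor months would extend into the forecast period.
        raise ValueError("selection window must end before the forecast window")
    return fc_first - sel_last
```

For example, `lead_months((2020, 3), 1, (2020, 4), 1)` corresponds to using March 2020 to forecast April 2020, while a two-month selection window starting in March 2020 paired with an April 2020 forecast is rejected as overlapping.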
Second, a forecast area and time window must be defined. Similar to the search area, data are available for monthly and multi-month periods. The area for which the forecast is made should be user-relevant. [...] Figure 2). The conditions returned during the top five match years are also averaged to produce a composite forecast for the forecast period (lower right panel of Figure 2). The composite forecast may be regarded as a "constructed analog" forecast [16] [17]. This is a prediction of the user-requested field in the forecast area during the forecast period based on conditions during the five best analog years in the search area.
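The averaging step that produces the composite forecast can be sketched in a few lines of Python with NumPy. This is an illustrative sketch only: the function name `composite_forecast` and the dictionary layout (year mapped to a latitude-by-longitude anomaly array for the forecast period) are assumptions, not part of the tool described here.

```python
import numpy as np


def composite_forecast(fields_by_year, analog_years):
    """Average the forecast-period fields of the selected analog years.

    fields_by_year: dict mapping year -> 2-D array (lat, lon) of the
        predicted variable's anomaly during the forecast period.
    analog_years: iterable of the best-match years (e.g., the top five).
    """
    stack = np.stack([np.asarray(fields_by_year[y], dtype=float)
                      for y in analog_years])
    # Simple unweighted mean across the analog years.
    return stack.mean(axis=0)
```

With five analog years the composite tends to be smoother than any single analog, which is the damping effect noted in the discussion of the choice of n.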

Pattern Match Criteria
What constitutes an analog match is a fundamental question in this type of study. While an analog pattern match need not be perfect to be useful, the closeness of the match should align with the utility of an analog. The closeness of a match may be measured qualitatively, or quantitatively by using various statistical metrics. Reference [1] describes a simple procedure using squared differences between atmospheric states and also a root mean square error (RMSE) metric. Other techniques involve pattern correlation, linear regression [16] or more sophisticated statistical algorithms [18].
Analog matches in the current study use a match technique similar to that of [1] [19] [20] and others, namely a ranked RMSE score. The target month/year is eva- [...] overall deviation from climatology, not the best pattern matches. Metrics based on simple departures from the mean are not appropriate for the atmospheric circulation, which is determined by spatial gradients rather than by departures from the mean. The root mean squared error (RMSE) between each past year and the search period is computed over the grid cells for each variable, then weighted and summed across variables using an auto-weighting calculation based on a standardized closeness of the match of each variable. The sum of the weighted root mean squared errors across all variables is the "Match Score". Low numerical values of the match score, i.e., small values of the summed RMSEs, correspond to high levels of similarity between that year and the search year.
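A ranked, weighted RMSE score of this kind can be sketched as follows in Python with NumPy. The auto-weighting calculation is not specified in the text, so this sketch substitutes a simple stand-in: each variable's RMSE is divided by its mean across the candidate years to put the variables on a common scale, and optional per-variable weights (equal by default) are then applied. The function name `match_scores`, the data layout, and this normalization are all assumptions made for illustration.

```python
import numpy as np


def match_scores(target, candidates, weights=None):
    """Rank candidate years by a weighted sum of per-variable RMSEs.

    target: dict mapping variable name -> 2-D array for the search period.
    candidates: dict mapping year -> dict of variable name -> 2-D array.
    weights: optional dict mapping variable name -> weight (assumed input).
    Returns a list of (year, score) pairs, best (lowest) score first.
    """
    years = sorted(candidates)
    variables = sorted(target)
    # RMSE over all grid cells, one value per (year, variable).
    rmse = np.array([[np.sqrt(np.mean((np.asarray(candidates[y][v], dtype=float)
                                       - np.asarray(target[v], dtype=float)) ** 2))
                      for v in variables] for y in years])
    # Stand-in standardization: scale each variable by its mean RMSE.
    scale = rmse.mean(axis=0)
    scale[scale == 0] = 1.0
    normalized = rmse / scale
    if weights is None:
        w = np.ones(len(variables))
    else:
        w = np.array([weights[v] for v in variables], dtype=float)
    w = w / w.sum()
    scores = normalized @ w
    order = np.argsort(scores)
    return [(years[i], float(scores[i])) for i in order]
```

Taking the first n entries of the returned list (n = 5 in the illustration used in this paper) gives the analog years; lower scores indicate closer matches, in line with the definition of the match score above.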

Sample Forecast
In order to illustrate the performance of the analog forecast software, we present the following example. In this case, the domains for the analog selection and the actual forecast are identical, 60˚N-90˚N. The variable being forecast is the 2-meter air temperature. The month being forecast is April 2020, while March 2020 is the period used to identify the best analog match years. This period was chosen because April 2020 was characterized by extreme temperature anomalies, particularly the anomalous warmth that was observed over northern Eurasia. Figure 4 shows that temperatures during March 2020 were generally 1˚C-4˚C warmer than normal over most of northern Eurasia and the seas to the north. During April, however, temperature anomalies of +5˚C to +8˚C developed over central Siberia, with anomalies exceeding 5˚C extending across much of the Arctic Ocean, including the Chukchi Sea. This highly unusual springtime warmth contributed to an early sea ice retreat north of Siberia, leading to record-low sea ice in the Kara and Laptev Seas during much of the summer [21]. The anomalous heat persisted into June, leading to large areas of wildfire and degraded air quality over much of northern Asia [21]. The change from March to April is sufficiently large that a simple forecast of anomaly persistence from March to April would have severely underestimated the magnitude of the warmth over Siberia. Figure 5 shows the temperature anomaly fields during March of the five years identified as the best analogs for March of 2020. These years, determined by application of the analog selection software to the NCEP reanalysis data for 1948-2019, were (in order of decreasing match) 2002, 2015, 2011, 1975 and 2014. The preponderance of recent years in the set of five best matches is consistent with the recent Arctic warming and the fact that the heaviest weights were assigned by this search to sea surface temperature and the 2 m air temperature.
While the different sample members have varying locations and magnitudes of the high-latitude warm anomalies, all show positive departures from normal somewhere in the vicinity of northern Asia. Figure 6 shows the temperature anomaly fields for April of the five best analog years. The anomalies are defined here as the departures from the climatological [...] of discrepancy is from northeastern Canada to Greenland, where the negative anomalies forecast by the analogs did not verify. Over Baffin Bay, for example, the observationally-based reanalysis shows positive departures from normal where the analog system had forecast large negative anomalies.
While the preceding example illustrates the analog system's ability to capture some, but not all, features of month-to-month evolution of Arctic temperatures, it represents a single case. Moreover, the analog forecast was not compared to other forecasts. In the following section, we draw upon a larger sample of forecasts and include comparisons with operational models used for monthly to seasonal climate outlooks.

Comparison with Dynamical Forecast Models
The analog tool produces forecasts for periods of a month to several seasons. These forecast ranges are the same as those spanned by climate prediction centers' dynamical forecast models, which are generally coupled atmosphere-ocean-ice-land models. The atmospheric components of these models are essentially the same models used in numerical weather prediction, for which there is demonstrable skill in day-to-day weather predictions out to one or two weeks. When these models are run beyond a week or two in order to generate monthly and seasonal outlooks, the forecasts target monthly or seasonally-averaged departures from climatological means, as do the analog forecasts in this study.
The availability of dynamical model forecasts for recent decades enables comparisons of the skill of the analog and dynamical model forecasts. For the comparison here, we use the North American Multimodel Ensemble (NMME), a set of models used to generate forecasts to lead times of seven to eleven months. The NMME ensemble of models has evolved over time (cf. Table 1 in [22]). The forecasts of the six primary ensemble members of the current NMME are accessible at https://www.cpc.ncep.noaa.gov/products/NMME/. One of the NMME models is the National Centers for Environmental Prediction's CFSv2 [23], which we include in single-model comparisons described below.
At the global scale, the dynamical model forecasts of temperatures and heights were found to outperform the analog method for most forecast months at all time lags. For example, over the 2012-2017 period, the analogs' forecasts of 2-meter temperature for the contiguous U.S. using matches at one-month lead time beat the NMME only 40% of the time, with the percentage ranging from 49% in July-September to 25% in October-December (Figure 8). (The seasons in Figure 8 correspond to those of the annual cycle of temperature over 70˚N-90˚N, where the months of minimum and maximum temperature are February and August, respectively.) This under-performance of the analogs relative to the dynamical models diminishes as one moves into high latitudes and, for the Arctic, the skill of the analog forecasts slightly exceeds that of the dynamical models. Figure 9 shows the percentages of cases in which the analog and dynamical model forecasts had the lower RMSE of Arctic temperature forecasts. Each panel in Figure 9 shows the results for forecasts targeting a particular calendar month (January, February, …, December) over the period 2012-2017.
Each pair of bars summarizes results for cases in which the number (m) of months used for the analog selection ranged from 1 to 12. Each bar also includes a range of lead times from one month to six or seven months. Because of occasional corruptions of the archived NMME files, the sample sizes for each bar range from 34 to 37 cases (generally six lead times for six different years). Figure 9 shows that each type of forecast (analog, NMME) outperformed the other in some calendar months. For a given calendar month, the relative performance of the analog system shows some sensitivity to the number of antecedent months, m, used in the analog selection. However, the choice of m does not have a major impact on which of the two methods has the smaller error in a particular calendar month. While there is substantial month-to-month scatter in the relative performance of the two methods, an aggregation of the results by season (Figure 8) leads to the conclusion that the analog method slightly outperforms the NMME in this particular application (forecasts of 2-meter temperature over the domain 70˚N-90˚N, which is essentially the Arctic Ocean). The 60˚N-90˚N domain was used in the analog selection for these forecasts. The aggregations in Figure 8 include all values of m for each of the three months in the season. Figure 8 and Figure 9 show that, except for the autumn season, the analogs outperform the NMME in forecasts of 2 m air temperature in the Arctic. This suggests that the analogs are capturing some features of the Arctic system that have sufficient memory to impact subsequent air temperatures. The fact that the NMME outperforms the analogs in the autumn is consistent with the large autumn sea ice anomalies of 2012-2017, which are incorporated into the dynamical models' initializations with subsequent large impacts on the surface air temperatures during the freeze-up period.
In this respect, the dynamical models' forecasts of autumn air temperatures over the Arctic during these years should be "hard to beat". As a further illustration of the relative performance of the analogs and the dynamical models, Figure 10 shows the lowest RMSE among the NMME, CFSv2 and analog forecasts for each month of the 2012-2019 period. All forecasts in this case are one-month forecasts, and the number of antecedent months used to identify the best analogs is m = 1. The errors show a strong seasonal cycle, with the largest errors in winter and the smallest errors in summer, consistent with the seasonal cycle of variance of Arctic air temperatures (Przybylak, 2000). The color-coded symbols in Figure 10 indicate which of the three forecasts was the "winner" in the sense that it had the lowest RMSE: analog (blue), NMME (orange) and CFSv2 (green). In over half (50/87, or 57%) of the cases, the analog forecast had the smallest error. In the remaining 37 cases, the lowest errors were about evenly split between the NMME (19 cases) and the CFSv2 (18 cases). The latter two forecasts are not independent, as the CFSv2 is one of the NMME models. Consistent with Table 1, the analog forecasts in Figure 9 generally fare most poorly during the autumn season (the periods of increasing errors in the seasonal cycles of Figure 10).
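The "winner" tally described above amounts to finding, for each forecast case, the method with the lowest RMSE and counting the results. The following Python sketch with NumPy illustrates that bookkeeping; the function name `tally_winners`, the input layout, and the treatment of missing cases (dropping any case for which one method has no value) are assumptions made here, not a description of the authors' procedure.

```python
import numpy as np


def tally_winners(rmse_by_method):
    """Count how often each method has the lowest RMSE.

    rmse_by_method: dict mapping method name -> sequence of RMSE values,
        one per forecast case, in the same case order for every method.
        Missing cases are marked with NaN.
    Returns (counts, total), where counts maps method name -> number of
    wins and total is the number of cases with data for all methods.
    """
    names = list(rmse_by_method)
    errors = np.array([rmse_by_method[n] for n in names], dtype=float)
    # Keep only cases where every method has a valid RMSE.
    valid = ~np.isnan(errors).any(axis=0)
    winners = errors[:, valid].argmin(axis=0)
    counts = {n: int((winners == i).sum()) for i, n in enumerate(names)}
    return counts, int(valid.sum())
```

Applied to the counts quoted above, 50 wins out of 87 cases corresponds to `round(100 * 50 / 87)`, i.e., 57%, with the remaining 37 cases divided between the NMME (19) and the CFSv2 (18).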

Conclusions
The analog forecast system presented here has been shown to be competitive with state-of-the-art dynamical forecast models in the Arctic despite its underperformance relative to dynamical models in middle latitudes. This regional dependence of the relative accuracies of the different approaches may be surprising at first glance, but it is consistent with the sparser network of in situ observations in the Arctic. Especially over the Arctic Ocean, which is the domain of the comparison presented here, there are no rawinsondes or surface observing stations. Aircraft and ship reports are also much less frequent in the Arctic, where surface buoys and satellite products are the main sources for the initialization of dynamical models. While one may argue that an analog forecast should also be sensitive to the initial state, the impact of uncertainties in the initial state is evidently less in an approach that inherently assumes some correspondence in the evolution of the atmosphere in years with similar initial states. Given the limits of deterministic predictability (one to two weeks at present), surface boundary forcing or some other "memory" in the system is being utilized by the analog system.
Figure 10. Root-Mean-Square Errors (˚C) of one-month temperature forecasts produced by the analog system (blue), the NMME (orange) and the CFSv2 model (green).
There are several notable limitations of the analog forecast system. First, the pool of candidate years for the analog matches is clearly limited, in this case to approximately 70 years. An analog-based forecast would clearly benefit from a larger sample of candidate analog years, as the closeness of the best match(es) will generally increase as the size of the candidate pool increases. Stated differently, the divergence from the "best" analogs will be more rapid as the closeness of the match worsens. Another major limitation is the effect of climate change or even low-frequency internal climate variability. If climate change or variability takes the climate system to a new extreme that is outside the range spanned by the available analogs, then not only is a perfect match impossible but a close match becomes increasingly unlikely. The fact that the Arctic climate is changing rapidly and reaching new extremes (e.g., of temperature) makes this