
This study investigates different sources of uncertainty in the assessment of climate change impacts on total monthly precipitation in the Campbell River basin, British Columbia, Canada. Four global climate models (GCMs), three greenhouse gas emission scenarios (RCPs) and six downscaling methods (DSMs) are used in the assessment. These sources of uncertainty are analyzed separately for two future time periods (2036 to 2065 and 2066 to 2095). An uncertainty metric is calculated based on the variation in simulated precipitation due to the choice of GCM, emission scenario and downscaling method. The results show that the selection of a downscaling method contributes the largest amount of uncertainty, compared to the choice of GCM and/or emission scenario. However, the choice of GCM contributes a significant amount of uncertainty if downscaling methods are not considered. The assessment is conducted at ten different locations in the Campbell River basin.

Climate change due to greenhouse gas (GHG) emissions is impacting the global hydrological cycle as well as regional hydrology across the world, and it will continue to do so in the future [

A large number of climate change assessment studies on hydrology have been conducted so far at different temporal and spatial scales [. However, GCMs operate at coarse spatial resolutions (grid cells spanning tens of thousands of km^{2}) and often fail to capture non-smooth fields such as precipitation [

Spatial downscaling translates large scale climate variables simulated by GCMs to a regional scale. A generalized climate change impact assessment process framework is outlined in

Minville et al. (2008) [

changes in precipitation extremes in various climatic zones in British Columbia with six GCMs from the Coupled Model Intercomparison Project (CMIP3) under three emission scenarios (B1, A1B and A2). Eight downscaling methods were used to compare downscaling uncertainty. That investigation concluded that the results are most sensitive to the choice of downscaling method, followed by the choice of GCM, while the emission scenarios have only a minor influence. Although the study addressed different sources of uncertainty, GCM data is now available from CMIP5, and its conclusion conflicts with other regional climate impact studies [

In this study we investigate the three primary sources of uncertainty attributed to the selection of GCM, emission scenario, and downscaling model in the assessment of climate change impacts on total monthly precipitation in the Campbell River basin, BC, Canada (

GCM model | Centre Name | GCM resolution (Lon. × Lat., ˚) |
---|---|---|
CanESM2 | Canadian Centre for Climate Modelling and Analysis | 2.8 × 2.8 |
CCSM4 | National Center for Atmospheric Research, USA | 1.25 × 0.94 |
CSIRO-Mk3-6-0 | Australian Commonwealth Scientific and Industrial Research Organisation in collaboration with the Queensland Climate Change Centre of Excellence | 1.8 × 1.8 |
GFDL-ESM2G | National Oceanic and Atmospheric Administration's Geophysical Fluid Dynamics Laboratory, USA | 2.5 × 2.0 |

uncertainties. The steps we followed for this study are shown in

The following section gives information about the study area and data used. The paper proceeds with a brief description of downscaling methods in Section 3. Comparisons of the downscaling results and uncertainty quantification are presented in Section 4. Summary and conclusions are then given in Section 5.

The Campbell River is situated on the west coast of Canada. The total drainage area of this coastal watershed is approximately 1856 km^{2} (

For this assessment, historical daily precipitation data for a 30-year span (1976 to 2005) was extracted from the ANUSPLIN data set on a 0.1˚ × 0.1˚ grid [

For the regression based statistical downscaling models, a predictor data set is needed. Predictor variables need to be 1) easily available from GCM outputs, 2) reliably simulated by GCMs and 3) strongly correlated with the predictand, the variable of interest (precipitation in the present case) [

The ANUSPLIN and GCM data sets used in this study have different spatial resolutions. For climate change impact assessment at the catchment scale, all the data sets are spatially interpolated to the ten locations of interest (

Two gridded statistical downscaling methods from the Pacific Climate Impacts Consortium (PCIC) [

Bias corrected spatial disaggregation (BCSD) [

Station | Elevation (m) | Latitude (˚N) | Longitude (˚W) | Station abbreviation |
---|---|---|---|---|
Elk R ab Campbell Lk | 270 | 49.85 | 125.8 | ELK |
Eric Creek | 280 | 49.6 | 125.3 | ERC |
Gold R below Ucona R | 10 | 49.7 | 126.1 | GLD |
Heber River near Gold River | 215 | 49.82 | 125.98 | HEB |
John Hart Substation | 15 | 50.05 | 125.31 | JHT |
Quinsam R at Argonaut Br | 280 | 49.93 | 125.51 | QIN |
Quinsam R nr Campbell R | 15 | 50.03 | 125.3 | QSM |
Salmon R ab Campbell Div | 215 | 50.09 | 125.67 | SAM |
Strathcona Dam | 249 | 49.98 | 125.58 | SCA |
Wolf River Upper | 1490 | 49.68 | 125.74 | WOL |

from the historical time period at each station. This step is called “local scaling” because simulated coarse gridded monthly precipitation data is multiplied by a monthly scaled factor at each local station. This step helps to remove long term bias between large-scale simulated precipitation and observed precipitation at a regional scale. The mathematical description of the “local scaling” process is as follows:

P_{ds}(x, t) = P_{mod}(x, t) × ⟨P_{obs}(x)⟩_{mon} / ⟨P_{mod}(x)⟩_{mon} (1)

where P_{mod}(x, t) is the simulated large-scale mean monthly precipitation at station x and time t in month "mon"; P_{ds}(x, t) is the monthly downscaled mean precipitation; and ⟨⋯⟩_{mon} denotes the mean precipitation for calendar month "mon", calculated from the gridded observed (P_{obs}) and historical GCM (P_{mod}) datasets respectively.
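The "local scaling" step can be sketched as follows; a minimal numpy illustration of Equation (1), with array names and toy values of our own (not from the paper):

```python
import numpy as np

def local_scaling(p_mod, p_obs_clim, p_mod_clim, months):
    """Scale simulated monthly precipitation by the ratio of observed to
    modeled monthly climatology (the "local scaling" of Equation (1)).

    p_mod      : simulated monthly precipitation series (mm), shape (n,)
    p_obs_clim : observed mean precipitation per calendar month, shape (12,)
    p_mod_clim : GCM historical mean precipitation per calendar month, shape (12,)
    months     : calendar month index (0-11) of each entry in p_mod, shape (n,)
    """
    factor = p_obs_clim[months] / p_mod_clim[months]
    return p_mod * factor

# Toy example: the GCM is uniformly 20% too dry, so every scaled value
# is inflated by the monthly factor 120/100 = 1.2.
p_obs_clim = np.full(12, 120.0)
p_mod_clim = np.full(12, 100.0)
months = np.arange(24) % 12
p_mod = np.full(24, 100.0)
p_ds = local_scaling(p_mod, p_obs_clim, p_mod_clim, months)
```

Because the factor is computed per calendar month, the long-term monthly bias is removed while the month-to-month variability of the GCM series is retained.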

Finally, the daily time series is generated by temporal downscaling of monthly mean precipitation to daily using a stochastic resampling technique following Wood et al. (2002) [

Development of future precipitation projections using a weather generator is divided into two steps: 1) scaling of the climate variables to reflect future conditions and 2) generation of a synthetic future climate time series [

After scaling the climate data, weather generators (WGs) are used to generate a synthetic time series. WGs can preserve the statistical characteristics of the input data as well as capture temporal and spatial correlations between climate variables at multiple sites. Two different WGs, 1) K-nearest neighbor (KnnCAD V4) and 2) maximum entropy bootstrap (MEBWG), are used in this investigation.

A non-parametric multisite weather generator named KnnCAD V4 [

y_{ppt, t+i}^{j} = λ_{ppt} x_{ppt, t+i}^{j} + (1 − λ_{ppt}) z_{t+i}; i = 1, 2, ⋯, n (2)

where z_{t+i} is drawn from a two-parameter log-normal distribution; x_{ppt, t+i}^{j} is the reshuffled non-zero precipitation value for the (t+i)-th day at the j-th location; y_{ppt, t+i}^{j} is the perturbed precipitation value for the (t+i)-th day at the j-th location; and t is the current day. The value of λ_{ppt} varies between 0 and 1 (0 means the data series is totally perturbed and 1 means no perturbation in the results) [
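A minimal sketch of the perturbation step in Equation (2); the log-normal parameters and the way we center the noise on the sample mean are our own illustrative assumptions, not taken from the paper:

```python
import numpy as np

def perturb_precipitation(x, lam, sigma=0.3, rng=None):
    """Blend reshuffled non-zero precipitation x with log-normal noise z
    (Equation (2)): lam = 1 reproduces x exactly, lam = 0 returns pure noise.

    The log-normal median is matched to the mean of x so the perturbation
    does not introduce a large systematic bias (an assumption on our part).
    """
    rng = rng or np.random.default_rng(42)
    # two-parameter log-normal noise, always positive like precipitation
    z = rng.lognormal(mean=np.log(x.mean()), sigma=sigma, size=x.size)
    return lam * x + (1.0 - lam) * z

x = np.array([2.0, 5.5, 1.2, 8.3])     # reshuffled wet-day amounts (mm)
y_same = perturb_precipitation(x, lam=1.0)   # no perturbation
y_full = perturb_precipitation(x, lam=0.0)   # fully perturbed, still positive
```

The positivity of the log-normal draw guarantees that perturbed wet-day amounts stay physically valid for any λ in (0, 1).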

Srivastava and Simonovic (2014) [

m_1 = 0.75 O_1 + 0.25 O_2 (3)

m_k = 0.25 O_{k−1} + 0.5 O_k + 0.25 O_{k+1}; ∀ k = 2, 3, ⋯, t − 1 (4)

m_t = 0.25 O_{t−1} + 0.75 O_t (5)

where O_t is the rank matrix derived from the first principal component and t is the time step.

This method is able to capture temporal and spatial dependence structures along with other historical statistics (e.g. mean, standard deviation) in the downscaled climate variables. The MEBWG is free of modeling parameters and computationally inexpensive.
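The smoothing in Equations (3)-(5) is a simple weighted moving average with special endpoint weights; a sketch in numpy (array and function names are ours):

```python
import numpy as np

def smooth_ordered_values(O):
    """Apply the 0.25/0.5/0.25 interior smoothing of Equation (4) to an
    ordered series O, with the 0.75/0.25 endpoint weights of
    Equations (3) and (5)."""
    O = np.asarray(O, dtype=float)
    m = np.empty_like(O)
    m[0] = 0.75 * O[0] + 0.25 * O[1]                      # Equation (3)
    m[1:-1] = 0.25 * O[:-2] + 0.5 * O[1:-1] + 0.25 * O[2:]  # Equation (4)
    m[-1] = 0.25 * O[-2] + 0.75 * O[-1]                   # Equation (5)
    return m

m = smooth_ordered_values([1.0, 2.0, 3.0, 4.0])
```

On an evenly spaced series the interior points are unchanged while the two endpoints are pulled slightly inward, which is the intended mild smoothing of the ordered values.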

Regression based methods are the most commonly used for statistical downscaling. A statistical relationship (linear or non-linear) is established between large scale climate variables simulated by GCMs (predictors) and observed local surface variables (predictands), and is then applied to the future climate. Two multivariate regression methods, kernel regression and beta regression, are used in this study.

A multisite multivariate non-parametric kernel regression (KR) based statistical downscaling method was proposed by Kannan and Ghosh (2013) [

E(Y|X) = m(X) = ∫ y f(y|x) dy = ∫ y f(x, y) / f_x(x) dy (6)

where Y is the predictand; X is the principal component of the predictor variables; f(y|x) is the conditional probability density function (pdf) of Y given X = x; and f_x(x) is the marginal pdf of X.

The multivariate pdf in Equation (6) is replaced by a kernel density estimator and formulated as follows:

m_h(x) = Σ_{i=1}^{n} K_h(x − X_i) Y_i / Σ_{i=1}^{n} K_h(x − X_i) (7)

where m_h(x) is the expected value of Y given X_i = x, and K_h is the kernel with bandwidth h. The method can efficiently capture extreme precipitation events as well as autocorrelations and spatial cross-correlations among downscaling sites.
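The estimator in Equation (7) is the classical Nadaraya-Watson form; a minimal 1-D sketch in numpy, assuming a Gaussian kernel (the paper does not state which kernel is used):

```python
import numpy as np

def nadaraya_watson(x_query, X, Y, h):
    """Kernel regression estimate of E(Y | X = x_query) (Equation (7))
    using a Gaussian kernel with bandwidth h."""
    w = np.exp(-0.5 * ((x_query - X) / h) ** 2)  # K_h(x - X_i), up to a constant
    return np.sum(w * Y) / np.sum(w)             # normalization cancels

# Toy data: Y is a noiseless linear function of X, so the estimate at the
# center of a symmetric sample should recover the true value exactly.
X = np.linspace(0.0, 1.0, 11)
Y = 2.0 * X
y_hat = nadaraya_watson(0.5, X, Y, h=0.1)
```

Note that the kernel's normalizing constant cancels between numerator and denominator, which is why the unnormalized Gaussian weights suffice.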

Mandal et al. (2015) [

Step-I: Precipitation states are discretized using CART coupled with an unsupervised clustering technique (K-means clustering). Using the K-means clustering algorithm, daily observed precipitation (1976 to 2005) is clustered into three distinct precipitation states [

Step-II: Historical GCM predictor variables are standardized by subtracting the mean and dividing by the standard deviation. To reduce dimensionality and remove multicollinearity, principal component analysis (PCA) is applied to transform the standardized historical GCM predictor variables into orthogonal components. The first five principal components are used, as they were shown to explain 97% of the total variability in the historical data, as shown in

Step-III: The CART is built using precipitation states obtained from K-means and first five principal components derived from the historical GCM data.

Step-IV: The trained CART model is applied to determine future precipitation states through prediction based on the principal components of the future GCM predictor data.

Step-V: For multisite precipitation generation the following relationship between predictor and predictand is considered:

P_t = F_R(X_t | S_t) (8)

where S_{t} is precipitation state of the river basin at time t, X_{t} is predictor variable at time t and P_{t} is the precipitation at a certain station at time t. Beta regression is used to model the above-mentioned relationship.
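Step-I's K-means clustering of daily precipitation into three states can be sketched as follows (a minimal 1-D K-means in numpy; the synthetic data and quantile-based seeding are illustrative choices of ours, not the basin's data or the paper's implementation):

```python
import numpy as np

def kmeans_1d(values, k=3, iters=50):
    """Cluster a 1-D series (e.g. daily precipitation) into k states.
    Centers are seeded with quantiles for a deterministic result."""
    centers = np.quantile(values, np.linspace(0.1, 0.9, k))
    for _ in range(iters):
        # assign each day to its nearest center, then update the centers
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers

rng = np.random.default_rng(1)
# synthetic dry / moderate / wet days (mm), clearly separated for illustration
precip = np.concatenate([rng.uniform(0, 1, 200),
                         rng.uniform(5, 10, 200),
                         rng.uniform(25, 40, 200)])
labels, centers = kmeans_1d(precip, k=3)
```

The resulting state labels (dry/moderate/wet) would then serve as the CART target in Step-III.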

The regression model builds a relationship between the predictor variables (x) and the predictand (y) using the following generalized relationship:

y i = f ( x i ) + ε i ; i = 1 , 2 , ⋯ , n (9)

where ε_{i} is a normally distributed error term and i is the temporal index. If the relationship is linear, then Equation (9) can be modified as:

y = x^T β + ε_i = β_0 + x_1 β_1 + x_2 β_2 + ⋯ + x_d β_d + ε_i (10)

where x is a vector of predictor variables with dimension d and β is a coefficient vector.

The beta regression model assumes that the predictand is beta distributed. The beta distribution is flexible and efficient for modeling dependent/predictand variables because the beta density function can assume a number of different shapes depending on its parameters. The beta distribution can successfully represent asymmetric data and capture non-linear relationships [

f(y; μ, ϕ) = [Γ(ϕ) / (Γ(μϕ) Γ((1 − μ)ϕ))] y^{μϕ−1} (1 − y)^{(1−μ)ϕ−1}; 0 < y < 1, 0 < μ < 1, ϕ > 0 (11)

where μ is the mean of the predictand, ϕ is the precision parameter, y is the dependent variable and Γ(·) is the gamma function. If μ ≠ 1/2 the distribution is asymmetric and if μ = 1/2 it is symmetric.

Principal components | Percentage variance explained |
---|---|
1 | 61.7873 |
2 | 36.3651 |
3 | 19.3867 |
4 | 8.7302 |
5 | 2.1188 |
6 | 0.3506 |
7 | 0.2260 |
8 | 0.1582 |
9 | 0.1028 |
10 | 0.0605 |

The beta regression model assumes that the dependent variable (precipitation) is constrained to the unit interval (0, 1). To fulfill this condition, the precipitation data needs to be scaled into the (0, 1) interval. Precipitation data is bounded in an interval (a, b), where a and b are the minimum and maximum daily precipitation values respectively. The following equations are used to scale the precipitation data into the (0, 1) interval:

y ′ = ( y − a ) / ( b − a ) (12)

Pr scaled = ( y ′ ( n − 1 ) + 0.5 ) / n (13)

where y is the precipitation data, n is the sample size and Pr_{scaled} is the precipitation data scaled into (0, 1).
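Equations (12)-(13) can be sketched as a two-line transformation; a minimal numpy version with toy values of our own:

```python
import numpy as np

def scale_to_unit_interval(y):
    """Map precipitation values into the open interval (0, 1)
    using Equations (12) and (13)."""
    a, b = y.min(), y.max()
    n = y.size
    y_prime = (y - a) / (b - a)           # Equation (12): closed [0, 1]
    return (y_prime * (n - 1) + 0.5) / n  # Equation (13): strictly (0, 1)

y = np.array([0.0, 3.2, 7.5, 12.0])
pr_scaled = scale_to_unit_interval(y)
```

The second step matters because Equation (12) alone maps the minimum and maximum to exactly 0 and 1, where the beta density of Equation (11) is undefined; Equation (13) pulls both endpoints strictly inside the interval.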

To formulate the conditional expectation function E(y|x) for multivariate predictors, the beta regression model is formulated as follows:

g(μ_t) = Σ_{i=1}^{k} x_{ti} β_i (14)

x_t = (x_{t1}, ⋯, x_{tk}); t = 1, ⋯, n (15)

β = (β_1, ⋯, β_k)^T; β ∈ ℝ^{k} (16)

where β is a vector of unknown beta regression parameters and x_{ti} is the t-th day observation of the i-th of k covariates (k < n). g(·) is a strictly monotonic and twice differentiable link function which maps (0, 1) into ℝ. The logit transformation is used as the link function in this work. β is estimated by maximum likelihood estimation (MLE). For generation of extreme precipitation outside of the observed range, a perturbation technique is used following King et al. (2015) [
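A minimal sketch of fitting Equations (11) and (14) by MLE with a logit link, using numpy/scipy on synthetic data; the coefficient values and the Nelder-Mead optimizer are our own illustrative choices, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

def beta_reg_negloglik(params, X, y):
    """Negative log-likelihood of beta regression with a logit link.
    params = (beta_1..beta_k, log_phi); mu_t = logit^{-1}(x_t . beta)."""
    beta, log_phi = params[:-1], params[-1]
    phi = np.exp(log_phi)        # precision parameter, kept positive
    mu = expit(X @ beta)         # inverse logit of Equation (14)
    a, b = mu * phi, (1.0 - mu) * phi
    ll = (gammaln(phi) - gammaln(a) - gammaln(b)
          + (a - 1.0) * np.log(y) + (b - 1.0) * np.log(1.0 - y))
    return -ll.sum()

# Synthetic (0, 1)-valued data with known coefficients (illustrative only)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
true_beta, true_phi = np.array([0.3, 0.8]), 30.0
mu = expit(X @ true_beta)
y = rng.beta(mu * true_phi, (1.0 - mu) * true_phi)

res = minimize(beta_reg_negloglik, x0=np.zeros(3), args=(X, y),
               method="Nelder-Mead", options={"maxiter": 5000, "xatol": 1e-8})
beta_hat, phi_hat = res.x[:-1], np.exp(res.x[-1])
```

Parameterizing the precision as exp(log_phi) keeps ϕ > 0 without a constrained optimizer; with 500 samples the estimates land close to the generating coefficients.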

The main objective of this study is to quantify the sources of uncertainty and assess which one has the major influence on precipitation projections. Daily precipitation is projected using the different downscaling models (BCSD, BCCAQ, KnnCAD V4, MEBWG, KR and BR) at different locations over the river basin, and the results are compared at different temporal and spatial scales.

The annual average total monthly precipitation is used to compare the different sources of uncertainty amongst the selection of GCM, DSM, and RCP scenario for the near (2036-2065) and the far future (2066-2095) time slices (

To identify and quantify the sources of uncertainty, an uncertainty metric is calculated. The uncertainty metric is used to gauge the amount of uncertainty associated with each step of the statistical downscaling process (i.e. choice of GCMs, RCP scenario and downscaling model). The calculation for each weather station and calendar month can be summarized by the following steps:

Step-I: Calculate the total monthly precipitation by summing the precipitation into monthly bins, and taking the average for each calendar month, across all years for the future downscaled precipitation (F_{i,j,k,l,m}) for each GCM i, DSM j, RCP scenario k, and weather station l.

Step-II: Follow the same procedure as described in the previous step for observed historical precipitation to calculate the monthly total historical precipitation (H_{l,m}), where m is the month and l is the weather station.

Step-III: Take the ratio of the future downscaled to observed total monthly precipitation values, A_{i,j,k,l,m} = F_{i,j,k,l,m} / H_{l,m} (17).

Step-IV: Calculate the range across the dimensions representing a selection step in the downscaling process:

GCM_{uncertainty} = max_i(A_{i,j,k,l,m}) − min_i(A_{i,j,k,l,m}) (18)

DSM_{uncertainty} = max_j(A_{i,j,k,l,m}) − min_j(A_{i,j,k,l,m}) (19)

RCP_{uncertainty} = max_k(A_{i,j,k,l,m}) − min_k(A_{i,j,k,l,m}) (20)

The resulting ranges in total monthly precipitation represent the uncertainty in results associated with the downscaling process due to the choice of a particular GCM, DSM, or RCP scenario. This method uses the range in total monthly precipitation as a metric for the amount of uncertainty and does not consider the distribution of total monthly precipitation attributed to the selection made in a level of the downscaling process.
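The range calculation in Step-IV reduces the ratio array A along one axis at a time; a sketch in numpy, with the study's dimensions (4 GCMs, 6 DSMs, 3 RCPs, 10 stations, 12 months) but random placeholder values in place of the actual downscaled ratios:

```python
import numpy as np

# Toy ratio array A with axes (GCM i, DSM j, RCP k, station l, month m)
rng = np.random.default_rng(0)
A = rng.uniform(0.7, 1.4, size=(4, 6, 3, 10, 12))

# Range across each selection step of the downscaling process
# (Equations (18)-(20)); each reduction drops the corresponding axis.
gcm_uncertainty = A.max(axis=0) - A.min(axis=0)  # range over GCMs
dsm_uncertainty = A.max(axis=1) - A.min(axis=1)  # range over DSMs
rcp_uncertainty = A.max(axis=2) - A.min(axis=2)  # range over RCPs
```

Each resulting array still varies over the remaining choices and over station and month, so the ranges can be compared per station and per season as done in the analysis.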

In

The combined spatial and seasonal variation of uncertainty in the precipitation projections across the ten stations in the river basin is analyzed (Figures 7-9). GCMs were shown to be associated with larger amounts of uncertainty in summer precipitation for both time periods (Figures 7(e)-(f)) along with spring precipitation for the near future (

In this paper, different sources of uncertainty in the projection of total monthly precipitation were assessed and compared for two future time periods in the Campbell River basin. Previous studies found that the choice of GCM is the largest source of uncertainty in the downscaling process [

constraints imposed by different sets of predictor variables, and they all assume a stationary relationship between predictor and predictand. This can be the reason why DSMs show the largest source of uncertainty.

The uncertainty metric for the different sources of uncertainty is very simple to calculate and computationally inexpensive, and it can be used at any temporal and spatial scale. This study presents the analysis at a regional scale; however, if applied at continental or global scales, the spatial component of uncertainty in downscaled precipitation projections can be studied in more depth.

This assessment work is funded by a Discovery Grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada awarded to the third author.