Uncertainty in Precipitation Projection under Changing Climate Conditions : A Regional Case Study

This study investigates different sources of uncertainty in the assessment of the climate change impacts on total monthly precipitation in the Campbell River basin, British Columbia, Canada. Four global climate models (GCMs), three greenhouse gas emission scenarios (RCPs) and six downscaling methods (DSMs) are used in the assessment. These sources of uncertainty are analyzed separately for two future time periods (2036 to 2065 and 2066 to 2095). An uncertainty metric is calculated based on the variation in simulated precipitation due to choice of GCMs, emission scenarios and downscaling models. The results show that the selection of a downscaling method provides the largest amount of uncertainty when compared to the choice of GCM and/or emission scenario. However, the choice of GCM provides a significant amount of uncertainty if downscaling methods are not considered. This assessment work is conducted at ten different locations in the Campbell River basin.


Introduction
Climate change due to greenhouse gas (GHGs) emissions is impacting the global hydrological cycle as well as regional hydrology across the world, and it will continue in the future [1].Precipitation is directly impacted due to an increase in global average temperature driving evapotranspiration rates thereby increasing the concentra-tion of water vapor in the atmosphere.Variation in precipitation is expected to differ in magnitude and frequency from region to region.Changes in precipitation will affect water resources activities including use of reservoir storage, flood control, water supply, irrigation, energy production, navigation, and recreation.These impacts may require that water resources planners and managers adopt alternative water management strategies in the future.Before making any adoption strategy, an assessment of climate change impacts on future precipitation is essential.
A large number of climate change assessment studies on hydrology have been conducted so far on different temporal and spatial scales [2]- [4].Generally, the outputs of global climate models (GCMs) are used for regional climate change impact assessment.GCMs simulate time series of global climate variables (e.g.sea level pressure, temperature, specific humidity) considering different emission scenarios of GHGs.GCM outputs are coarsely gridded (>100 km 2 ) and often fail to capture non-smooth fields such as precipitation [5].Spatial downscaling is required for better understanding and assessment of future hydrologic conditions at watershed scales under different climate change scenarios.
Spatial downscaling translates large scale climate variables simulated by GCMs to a regional scale.A generalized climate change impact assessment process framework is outlined in Figure 1(a).At the regional scale, projection of hydro-climatic variables under changing climatic conditions is burdened with a considerable amount of uncertainty originating from several sources.Uncertainty may arise from [6] [7]: 1) inter-model variability due to different model structures between GCMs; 2) inter-scenario variability due to different types of emission scenarios; 3) intra-model variability due to the model parameter selection; and 4) the choice of downscaling model (Figure 1(a)).Minville et al. (2008) [8] observed that projection of precipitation is most sensitive to the choice of GCM where Wilby and Harris (2006) [7] found out that GHG emission scenarios also caused uncertainty in precipitation projections under changing climatic conditions.However, according to Prudhomme and Davies (2009) [9] [10] downscaling is a significant source of uncertainty along with the uncertainty due to the choice of GCM.There are several climate impact studies conducted for the west coast of Canada [11]- [13] but future precipitation projections considering the propagation of uncertainty (GCMs uncertainty, GHG emission scenarios uncertainty and downscaling uncertainty) are rarely performed.Werner (2011) [12] conducted a study to project future monthly precipitation in three BC (British Columbia) watersheds with eight GCMs under three emission scenarios (B1, A1B and A2).This study found that the uncertainty in precipitation projection due to the choice of GCM is larger than that due to the choice of emission scenarios for different temporal scales.However, this study did not assess the uncertainty due to the choice of downscaling method.Bürger et al. (2012) [13] looked at changes in precipitation extremes in various climatic zones in British Columbia with six GCMs from the Coupled Model Inter-comparison Project (CMIP3) under three emission scenarios (B1, A1B and A2).Eight downscaling methods were used to compare downscaling uncertainty.This investigation concludes that the results are more sensitive to the choice of downscaling methods followed by the choice of GCM where the emission scenarios have a minor influence.Although this study addressed different sources of uncertainty, GCM data is now available from the CMIP5 and the conclusion is conflicting with other regional climate impact studies [7] [14].From the past studies, we found 1) inconsistency in the conclusions and 2) that sometimes all sources of uncertainty are not included in the climate change impact analyses.Ensuring that all sources of uncertainty are included during quantification of climate change impacts on the regional hydrology is essential [6].
It this study we are going to investigate the three primary sources of uncertainty attributed to the selection of GCM, emission scenario, and downscaling model for the assessment of the climate change impacts on total monthly precipitation in the Campbell River basin, BC, Canada (Figure 2).This investigation includes four GCMs, three emission scenarios, and six downscaling methods.GCM simulations from Coupled Model Inter-Comparison Phase 5 (CMIP5) are used in this study [1].The list of GCMs is shown in Table 1.These four GCMs are selected based on data availability for the six downscaling methods (described below).Four Representative Concentration Pathway (RCP) emission scenarios are recommended by the Fifth Assessment Report (AR5) of Intergovernmental Panel on Climate Change (IPCC) [1].Three of these are used in this research (RCP 2.6, RCP 4.5 and RCP 8.5) that cover the range of emission scenarios.RCP 2.6 represents lower carbon emission scenario, RCP 4.5 and RCP 6.0 represent intermediate carbon emission scenarios and RCP 8.5 assumes high and unabated carbon emission by the end of 2100.Six Downscaling methods applied in this study are as follows: 1) bias corrected spatial disaggregation (BCSD) [11] [15], 2) bias correction constructed analogues with quantile mapping reordering (BCCAQ) [16], 3) delta change method coupled with a non-parametric K-nearest neighbor weather generator [17], 4) delta change method coupled with maximum entropy based weather generator [18], 5) non-parametric statistical downscaling model based on the kernel regression [19], and 6) beta regression based statistical downscaling model [20].BCSD and BCAAQ were successfully applied across Canada in the past, however these methods cannot explicitly capture changes in daily extremes [16] where other four downscaling methods can capture changes in daily extremes and can produce extreme values outside of the historical boundaries [18]- [21].The above mentioned six downscaling methods are used to quantify the amount of uncertainty arising from different types of statistical downscaling methods and compare it with other sources of  uncertainties.The steps we followed for this study are shown in Figure 1(b).
The following section gives information about the study area and data used.The paper proceeds with a brief description of downscaling methods in Section 3. Comparisons of the downscaling results and uncertainty quantification are presented in Section 4. Summary and conclusions are then given in Section 5.

Study Area and Data Used
The Campbell River is situated on the west coast of Canada.The total drainage area of this coastal watershed is approximately 1856 Km 2 (Figure 2).The river basin is both snow and rain-fed, however mountain snowpack will likely decreases due to higher temperatures as a result of climate change [21].As a result, there will be a shift towards the river being predominantly rain-fed, causing stream flow to be lower during the spring and summer months and higher in the fall and winter.
For this assessment, historical daily precipitation (prep) data for a 25 year span (1976 to 2005) was extracted from the ANUSPLIN data set on a 0.1˚ × 0.1˚ grid [22].The ANUSPLIN data set is developed by applying a "thin plate smoothing spline" algorithm to observed data from Environment Canada.Historical precipitation data is extracted for ten locations covering the entire Campbell River basin (Table 2).
For the regression based statistical downscaling models, a predictor data set is needed.Predictor variables need to be 1) easily available from GCM outputs, 2) reliably simulated by GCMs and 3) strongly correlated with the predict and or variable of interest (precipitation in the present case) [23].Considering the above mentioned condition, daily maximum and minimum air surface temperature (Tmax and Tmin), mean sea level pressure (mslp), specific humidity (hus) at 500 hPa, zonal (u-wind) and meridional (v-wind) wind are considered as predictor variables in this study following Kannan and Ghosh (2013) [19].These climate variables are extracted from four GCMs (Table 1) for a period of 25 years spanning 1976-2005, as well as for a near future period (2036 to 2065) and a far future period (2066-2095) under RCP 2.6, RCP 4.5 and RCP 8.5 emission scenarios.Details with regards to the use of these climate variables for the regression based statistical downscaling models are given in next section.
The ANUSPLIN and GCM data sets used in this study have different spatial resolutions.For climate change impact assessment at the catchment scale, all the data sets are spatially interpolated to the ten locations of interest (Table 2) using the inverse distance square method [24].

Precipitation Projections
Two gridded statistical downscaling methods from the Pacific Climate Impacts Consortium (PCIC) [25], two weather generators (based on K-nearest neighbor and maximum entropy bootstrap) and two regression based statistical downscaling methods (kernel regression and beta regression) are used for future precipitation projection.The details of these methods are given below.

Gridded Downscaling Methods
Bias corrected spatial disaggregation (BCSD) [11] [15] and bias correction constructed analogues with quantile mapping reordering (BCCAQ) [16] are gridded statistical downscaling methods that can effectively produce plausible hydro-climate variables from the GCM output with computational efficiency.The BCSD downscaling method is performed in three steps.First, monthly GCM simulated precipitation data is corrected for bias using quantile mapping.Next, bias corrected monthly precipitation is downscaled by interpolating "monthly anomalies" is the monthly mean precipitation calculated from gridded observed and historical GCM datasets.
Finally, the daily time series is generated by temporal downscaling of monthly mean precipitation to daily using a stochastic resampling technique following Wood et al. (2002) [26].BCCAQ is a hybrid method that combines bias correction constructed analogues (BCCA) and bias corrected climate imprint (BCCI) where BCCI is referred as "inverse BCSD".BCCA follows the same spatial aggregation and bias correction (quantile mapping) steps as BCSD but it includes spatial information from daily GCM anomalies [16].Simulated daily future precipitation datasets using BCSD and BCCAQ downscaling techniques are extracted from the PCIC database [27].

Weather Generators
Development of future precipitation projections using a weather generator is divided into two steps: 1) scaling of future scaled climate variables and 2) generation of synthetic future climate time series [28].The delta change, or change factor methodology is used in this study for scaling climate variables to account for GCM simulated climate change.In the delta change method, change factors are calculated from historical and future GCM data.This change is then applied to the observed historical climate data to scale the historically observed climate variables to account for the projected changed between the historical and future GCM condition.Several types of change factors (CF) can be applied at different temporal scales (monthly, seasonal or annual).They can use different mathematical formulations (additive or multiplicative) or can be applied based on number of change factors (single or multiple).Using only a single CF will not capture changes in event frequency calculation and antecedent conditions in the case of hydrologic modelling due to the importance of temporal sequencing of dry and wet days which remains unchanged [29].In present study, we used 25 evenly spaced additive CFs across the precipitation distribution for scaling the precipitation data following Anandhi et al. (2011) [29].
After scaling the climate data, weather generators (WGs) are used for generating a synthetic time series.WGs can preserve statistical characteristics of input data as well as capture temporal and spatial correlation between climate variables at multiple sites.The two different WGs: 1) K-nearest neighbor (Knn CAD V4) and 2) maxi-mum entropy bootstrap (MBE), are used in this investigation.

KnnCAD V4
A non-parametric multisite weather generator named KnnCAD V4 [21] based on K-nearest neighbors (K-NN) is used in this study.The KnnCAD V4 is the updated version of KnnCAD V3 [4] which includes block resampling and perturbation.A detailed description of block resampling can be found in King et al. (2015) [21].For perturbation the following equation is used: where t i z + comes from two parameter log-normal distribution; , x + is reshuffled non-zero precipitation value for t + i th day in j th location; , j ppt t i y + is the perturbed precipitation value for t + i th day in j th location and t is current day.ppt λ value varies in between 0 to 1 (0 means data series are totally perturbed and 1 means no perturbation in the results) [21].This model can adequately reproduce statistical characteristic of historical climate variables as well as extrapolate historical extremes.

Maximum Entropy Based Weather Generator (MEBWG)
Srivastava and Simonovic (2014) [18] developed a non-parametric multisite, multivariate maximum entropy based weather generator (MEBWG) for generating daily precipitation and minimum and maximum temperature.The three main steps involved in MBE are: 1) orthogonal transformation of daily climate variables at multiple sites to remove spatial correlation; 2) use of maximum entropy bootstrap (MEB) to generate synthetic replicates of climate variables and 3) inverse orthogonal transformation of synthetic climate variables to re-established spatial correlation.Principal component analysis is used for orthogonal transformation.The maximum entropy density is constructed using the following equations to satisfy ergodic theorem (mean preserving): 0.25 0.5 0.25 ; 2,3, , 1 1 0.25 0.75 where t O is a rank matrix derived from first principal component and t is time step.This method is able to capture temporal and spatial dependency structures along with other historical statistics (e.g.mean, standard deviation) in downscaled climate variables.The performance of MBEWG is free of modeling parameters and it is computationally inexpensive.

Regression Based Downscaling Methods
Regression based methods are most commonly used for statistical downscaling.In this method a statistical relationship (linear or non-linear) is established between large scale climate variables simulated by GCMs (predictors) with observed local surface variable (predictand) which is then applied to future climate.For this assesment, two multivariate regression methods (kernel regression and beta regression) are used in this study.

Multivariate Kernel Regression Model
A multisite multivariate non-parametric kernel regression (KR) based statistical downscaling method was proposed by Kannan and Ghosh (2013) [19].This model projects precipitation conditioned on precipitation states.A non-parametric regression is a smoothing technique that projects the predictand using a set of predictor variables.Multiple sites can be included by applying weights to the other neighboring region predictand of the one desired.Multivariate kernel regression is used for calculating the conditional expectation of a random variable.In this study, kernel regression is used to capture a non-linear relationship between daily precipitation and other predictor variables.The conditional expectation of the kernel regression can be expressed as follows: where Y is the predictand; X is principal component of the predictor variable; ( ) x is marginal pdf of X.The multivariate pdf in Equation ( 6) is replaced by kernel density estimator and formulated as follows: where ( ) h m x the expected is value Y for a condition of i X x = ; and h K is the kernel with bandwidth h.The method can efficiently capture extreme precipitation events as well as autocorrelations and spatial cross-correlation among downscaling sites.Mandal et al. (2015) [20] proposed a multisite and multivariate downscaling method based on beta regression (BR).The proposed model generates future daily precipitation conditioned on precipitation states.The logical framework of this model is shown in Figure 3.The model is divided into two phases.In the first phase, the model predicts precipitation states using classification and regression trees (CART) (Figure 3(a)) where in the next phase, daily precipitation is simulated at a particular station using multivariate beta regression (Figure 3(b)).The detailed procedure is described below.

Multisite and Multivariate Beta Regression Model
Step-I: Precipitation states are discretized using CART coupled with an unsupervised clustering technique (K-means clustering).Using the K-means clustering algorithm, daily observed precipitation (1976 to 2005) is clustered into three distinct precipitation states [19].Step-II: Historical GCM predictor variables are standardized by subtracting the mean and dividing the data by standard deviation.To reduce dimensionality and remove multicollinearity, principal component analysis (PCA) is applied to transform the standardized historical GCM predictor variables into five orthogonal components.The first five principal components are used, as they were shown to explain 97% of total variability in the historical data as shown in Table 3.
Step-III: The CART is built using precipitation states obtained from K-means and first five principal components derived from the historical GCM data.
Step-IV: The trained CART model is applied to determine future precipitation states through prediction based on the principal components of future GCMs predictor data.
Step-V: For multisite precipitation generation the following relationship between predictor and predictand is considered: where S t is precipitation state of the river basin at time t, X t is predictor variable at time t and P t is the precipitation at a certain station at time t.Beta regression is used to model the above-mentioned relationship.
The regression model builds a relationship between predictor variables (x) and the predcitand (y), using the generaized relationship is as follows: where ε i is a normally distributed non-zero error term and i is temporal scale.If the relationship is linear then the Equation ( 9) can be modified as: where x is a vector of predictor variables with dimension d and β is a coefficient vector.The beta regression model assumes that the predictand is beta distributed.The beta distribution is flexible and efficient to model dependent/predictand variables because the beta density function can assume a number of different shapes based on its parameter.Beta distribution can successfully represents asymmetric data and it can capture non-linear relationships [30] [31].The beta density function can be represented as follows: where µ is the mean of predictand, ϕ is the precipitation parameter, y is the dependent variable and Γ(.) is a gamma function.If 1 2 µ ≠ then model becomes asymmetric and if 1 2 µ = then the model is symmetric.The beta regression model assumes that the dependent variable (precipitation) is constrained to the unit interval of (0, 1).To fulfill this condition precipitation data needs to be scaled into (0, 1) interval.Precipitation data is bounded in an interval (a, b) where a and b are the minimum and maximum daily precipitation values respectively.The following equations are used for scaling precipitation data into (0, 1) interval: where y is precipitation data, n is sample size and Pr scaled is scaled precipitation data into (0, 1).
To formulate the conditional expectation function E(y/x) for multivariate predictors, the beta regression model is formulated as follows: ( ) , , where β i is a vector of unknown beta regression parameters and x ti are t th day observation of k covariates (k < n).g(.) strictly monotonic and twice differentiable link function which maps (0, 1) into  .Log it transformation is used as a link function for this work.β is estimated based on maximum likelihood estimation (MLE).For generation of extreme precipitation outside of the observed range, a perturbation technique is used following King et al. (2015) [21].

Results and Discussion
The main objective of this study is to quantify sources of uncertainty and assess which one has major influence on precipitation projections.Daily precipitation is projected using different downscaling models (BCSD, BCCAQ, KnnCad V4, MEBWG, KR and BR) at different locations over the river basin and results are compared at different temporal and spatial scales.

Comparison of Uncertainty
The annual average total monthly precipitation is used to compare the different sources of uncertainty amongst the selection of GCM, DSM, and RCP scenario for the near (2036-2065) and the far future (2066-2095) time slices (Figure 4).This metric was chosen as it will allow for uncertainty to be disaggregated across the seasons.Results for the three stations; JHT, SCA and WOL are shown in Figures 4(a)-(f) (for shorten the manuscript length three stations are considered).WOL are located upstream of the river where SCA and JHT are located downstream near Strathcona dam and John Hart reservoir respectively.On the contrary, these three stations have different elevation levels (Table 2) which may have influence on the result.Figure 4 shows that the summer months (June, July and August) are typically drier in comparison to the other seasons for all three stations.However, there is a potential for more extreme events in the spring (March, April and May) for all three stations.Although the median total monthly precipitation is higher for the winter months, there is still a potential for larger amounts of precipitation in the early spring, as indicated by the outliers in Figure 4. Figure 4 shows a significant variation in precipitation projections without clear identification of the sources of uncertainty.

Quantification of Uncertainty
To identify and quantify the sources of uncertainty, an uncertainty metric is calculated.The uncertainty metric is used to gauge the amount of uncertainty associated with each step of the statistical downscaling process (i.e.choice of GCMs, RCP scenario and downscaling model).The calculation for each weather station and calendar month can be summarized by the following steps: Step-I: Calculate the total monthly precipitation by summing the precipitation into monthly bins, and taking the average for each calendar month,  across all years for the future downscaled precipitation ( , , , , ) for each GCM i, DSM j, RCP scenario k, and weather station l.
Step-II: Follow the same procedure as described in previous step for observed historical precipitation to cal-culate monthly total historical precipitation ( , l m H ) where m is month and l is weather station.
Step-III: Take the ratio of the future downscaling to observed total monthly precipitation values.
Step-IV: Calculate the range across the dimensions representing a selection step in the downscaling process: The resulting ranges in total monthly precipitation represent the uncertainty in results associated with the downscaling process due to the choice of a particular GCM, DSM, or RCP scenario.This method uses the range in total monthly precipitation as a metric for the amount of uncertainty and does not consider the distribution of total monthly precipitation attributed to the selection made in a level of the downscaling process.
In Figure 5 uncertainty is aggregated for each step of the downscaling process for each month in different future time periods.It can be observed that uncertainty in precipitation projections can mainly be attributed to the choice of DSMs compared to GCMs and RCPs throughout the year.A larger amount of uncertainty has been found in the late spring (May) and summer months (June, July and August) using different DSMs.Further disaggregation can show the level of uncertainty associated with a single choice of GCMs and DSMs different RCPs and future time periods (Figure 6).From this, it is shown that the two regression based statistical downscaling methods (KR and BR) are attributed a larger portion of uncertainty in precipitation projections than the other methods.KR and BR model used six predictor climate variables which may influenced the uncertain precipitation projection.
The combined spatial and seasonal variation of uncertainty in the precipitation projections across the ten stations in the river basin are analyzed (Figures 7-9).GCMs were shown to be associated with larger amounts of uncertainty in summer precipitation for both time periods (Figures 7(e)-(f)) along with spring precipitation for the near future (Figure 7(c)).The choice of RCP was only associated with a small amount of uncertainty in the far future summer months (Figure 7(f)).Another important observation is that the uncertainty in downstream precipitation is higher than that of the stations upstream except for the winter period (Figure 7).This may be caused because of basin topography because stations located in the upstream have higher elevation compare to downstream stations and three reservoirs (Strathcona, Ladore and John Hart) are located in the downstream of the Campbell River.Compared to GCMs and RCPs, the choice of DSM shows maximum uncertainty in precipitation projections across all seasons in the basin (Figure 9).

Summary and Conclusion
In this paper, different sources of uncertainty in the projection of total monthly precipitation were assessed and  compared for two future time periods in the Campbell River basin.Previous studies found that the choice of GCM is the largest source of uncertainty in the downscaling process [8] [10].However, this study concludes that the choice of DSMs dominates other sources of uncertainty, particularly in the case of the regression based models.Downscaling methods used in this study have significant difference in formulation.For instance, climate variables are not bias corrected for two weather generators (KnnCad V4 and MBEWG) but other four methods (BCSD, BCCAQ, BR and KR) used bias corrected data.Every statistical downscaling model is subject to   constraints imposed by different sets of predictor variables, and they all assume a stationary relationship between predictor and predictand.This can be the reason why DSMs show the largest source of uncertainty.
Uncertainty metric for different sources of uncertainty is very simple to calculate and it is computationally inexpensive.It can be used at any temporal and spatial scale.This study represents the analyses on a regional scale; however if applied to continental or global scales the spatial component of uncertainty in downscaled precipitation projections can be studied more in depth.

Figure 1 .
Figure 1.(a) Generalized framework of climate change impact assessment process; (b) Flow chart presenting the assesment process followed in this study.

Figure 3 .
Figure 3. Framework of beta regression model.(a) Generation of precipitation states using CART; (b) Future precipitation projection using multivariate beta regression.

Figure 4 .
Figure 4. Boxplots showing projected annual average total monthly precipitation at three different stations in the Campbell River basin with historical (1976-2005) observed precipitation-comparison between two future time periods.

Figure 5 .
Figure 5. Heat maps showing comparison of different sources of uncertainty metrics for two future time periods.

Figure 6 .
Figure 6.Heat maps showing GCMs and DSMs uncertainty metrics for different emission scenarios-comparison between two future time periods.

Figure 7 .
Figure 7. Seasonal variation of GCMs uncertainty metric in the Campbell River basin for two future time periods (a)-(h).(i) Location of the downscaling stations.

Figure 8 .
Figure 8. Seasonal variation of RCPs uncertainty metric in the Campbell River basin for two future time periods (a)-(h).(i) Location of the downscaling stations.

Figure 9 .
Figure 9. Seasonal variation of DSMs uncertainty metric in the Campbell River basin for two future time periods (a-h).(i) Location of the downscaling stations.

Table 1 .
List of GCMs.

Table 2 .
Detailed features of downscaling locations in the Campbell River basin.
from the historical time period at each station.This step is called "local scaling" because simulated coarse gridded monthly precipitation data is multiplied by a monthly scaled factor at each local station.This step helps to remove long term bias between large-scale simulated precipitation and observed precipitation at a regional scale.The mathematical description of the "local scaling" process is as follows: mod , Px t is simulated large scaled mean monthly precipitation from station x at time t in months "mon";( ) , ds P x t is the monthly downscaled mean precipitation and mon 

Table 3 .
Percentage of variance explained by principal components.