Examining the Impact of Bias Correction on the Prediction Skill of Regional Climate Projections

Rainfall is crucial for many applications, e.g. agriculture, health, water resources and energy, among many others. However, quantitative rainfall estimation is normally a challenge, especially in areas with a sparse rain gauge network. This has introduced uncertainties in rainfall projections by climate models. This study evaluates the performance of three representative concentration pathways (RCPs), i.e. RCP4.5, RCP6.0 and RCP8.5, over Uganda using the Weather Research and Forecasting (WRF) model. It evaluates the model output against observed daily rain gauge data over the period 2006-2018 using the Pearson correlation, relative root mean square error, relative mean error and skill scores (accuracy). It also evaluates the potential improvement in the performance of the WRF model under the respective RCPs when bias correction is applied. The bias correction is carried out using the quantile mapping method. A poor correlation with observed rainfall is generally found (−0.4 to +0.4), and error magnitudes in the range of 1 to 3.5 times the long-term mean are observed. The RCPs presented different performances over different areas, suggesting that no one RCP is universally valid. Application of bias correction did not produce realistic improvement in performance. Largely, the RCPs underestimated rainfall over the study area, suggesting that the projected rainfall cases under these RCPs could be seriously underestimated. However, the study found RCP8.5 to have slightly better performance, and it is thus recommended. Due to the generally weak performance of the RCPs, the study recommends re-evaluating the assumptions under the RCPs for different regions or attempting to improve them using data assimilation.


Introduction
Developing countries, e.g. Uganda, normally suffer from the adverse impacts of extreme climate. Studies on future climates, e.g. Tiyo et al. [1], Okonya et al. [2], Ongoma et al. [3], among others, have generally projected increasing magnitudes and frequency of extreme weather events. Unfortunately, developing countries have lower adaptive capabilities [4] [5] [6] and less developed early warning mechanisms [3] [6], which make them vulnerable to the negative impacts associated with these extreme events. The changes in climate have been attributed to increasing pollution levels and changes in environment due to changes in land cover and land use. Consequently, the concentration of atmospheric pollutants has been conceptualized into Representative Concentration Pathways (RCP) [7].
Future climate studies using General Circulation Models (GCMs), e.g. Giannini et al. [8], Kisembe et al. [9], among others, have projected enhanced wet conditions over East Africa. This is due to the weakening of moisture convergence over the Congo basin [8] [10]. The GCMs are normally used for simulating the climate on different spatial scales, i.e. mesoscale, regional and global scale [9], and on different time-scales, i.e. days, weeks, months, years and decades. However, these GCMs have been found to have coarse horizontal resolutions which are not useful for regional high impact studies [5] [9] [11]. An evaluation of 10 selected models within CORDEX by Kisembe et al. [9] revealed that most regional climate models (RCMs) reproduce the inter-annual rainfall variability but present a poor skill in reproducing the rainy seasons, especially the March-May rainfall season. Additionally, dry days are normally overestimated and presented as drizzles in these numerical models [10] [12] [13].
An approach normally proposed to address the limitations of the GCM is using bias correction. It has been used in many studies e.g. Sharma et al. [14], Ghimire et al. [15], Noor et al. [7], Cannon et al. [16], Monaghan et al. [17], Piani et al. [13] among others. By carrying out statistical bias correction on daily rainfall, Piani et al. [13] found an improvement in the mean and representation of extreme events like droughts. Ghimire et al. [15] argue that bias correction results in reduced biases and improves accuracy of simulations. For this reason, Noor et al. [7] evaluated the bias correction methods i.e. linear scaling, gamma quantile mapping, generalized quantile mapping, and power transformation and noted that the power transformation method was the most suitable for bias correction of the GCM. However, Myo et al. [18] and Ghimire et al. [15] found the linear scaling method to produce the best performance and recommended it for hydrological studies at river basins. On the other hand, Mahmood & Mukand [19], and Sharma & Kumar [14] recommended the quantile mapping bias correction method. This could suggest that no one method is universally valid.
Additional efforts to improve the projections of GCMs use dynamical downscaling with RCMs [5] [9] [20], among others. The RCMs are useful in down-scaling the coarse resolution of the GCMs to a higher resolution which is potentially useful for high impact studies [9] [21]. This is because the RCMs have a better representation of local features, e.g. mountains [5] [9], land-cover and water bodies, than the GCM [9] [21].
However, these RCMs normally inherit biases from the parent GCM [5] [21] which predisposes them to require robust validation over areas of interest before they can be reliably used.
It is therefore necessary to have a realistic representation of climate fields, especially rainfall, in climate models for high impact studies [11]. This is because understanding the physical basis of the climate models will help us to advance better prediction, as argued by Giannini et al. [8]. Equally important is a deeper understanding of the biases of these climate models at different spatial and temporal scales. Piani et al. [13] have recommended this to enable high impact studies for improved vulnerability assessment. Additionally, the changing frequency of extreme weather events requires a detailed assessment to build realistic future occurrences [4] [5] [22].
In order to enhance our understanding of the future climatic evaluations, a couple of studies using RCPs/GCMs and experiments e.g. CORDEX have been proposed and widely carried out. For example, Ongoma et al. [3]

Study Area
This study was carried out over Uganda and used 28 study locations, as shown in Figure 1.

Study Design
This study contends that the study carried out by Ongoma et al. [3] used GCMs in CMIP5 which were coarse, i.e. largely coarser than 1.5° (about 150 km × 150 km) in horizontal resolution, compared to the horizontal resolution used in this study, i.e. 30 km (about 0.3°). Therefore, this study designs and runs a comparatively higher resolution validation experiment of the RCPs over Uganda and uses 28 study locations as presented in Figure 1. Additionally, the study carried out by Fotso-Nguemo et al. [27] over a comparatively similar region used gridded data-sets, whereas this study uses observed station rainfall data-sets and a comparatively longer validation period, i.e. 13 years (2006-2018). The experiment uses the WRF model [28] with boundary conditions of the three RCPs, i.e. RCP4.5, RCP6.0 and RCP8.5 [17]. The direct model outputs are then bias corrected using quantile mapping (Equation (6)) to investigate any possible improvement in the performance, as recommended by Sharma et al. [14], Ghimire et al. [15], Noor et al. [7], Cannon et al. [16], Monaghan et al. [17], Piani et al. [13] among others.
In running the WRF model, the parameterization schemes used are adopted from Mugume et al. [29] and are presented in Table 1. These parameterization schemes are also used in Ingula et al. [30]. The study domain is shown in Figure 2. This study used a single domain at 30 km × 30 km covering equatorial Africa, but analyses are carried out over Uganda, shown with a red box in Figure 2.

Study Methods
The evaluation of the WRF model performance based on the three RCPs uses the Pearson correlation coefficient (Equation (1)) for assessing the relationship between observed and simulated rainfall; and the relative root mean square error, RMSE (Equation (2)), and relative mean error, ME (Equation (3)), for examining the error magnitudes. The categorical skill score, namely the accuracy (Equation (4)), is obtained from the contingency table (Table 2).
where $p_i$, $O_i$, LTM and $n$ are the $i$th model-predicted value, the $i$th observed value, the long-term mean and the number of observations respectively. The relative root mean square error, RMSE (Equation (2)), and the relative mean error, ME (Equation (3)), presented in this study are expressed as a percentage of the LTM, an approach also used and recommended by Ongoma et al. [3]. This study used the relative root mean square error and relative mean error in order to compare the performance against the long-term mean.
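Since Equations (1)-(3) are not reproduced in the text above, the following Python sketch illustrates how the three scores could be computed from their standard definitions. It is a minimal illustration under stated assumptions, not the study's own code: the function names are hypothetical, a single LTM value is assumed for the whole series, and the relative scores are returned as fractions of the LTM (multiply by 100 for a percentage).

```python
import math


def pearson_correlation(pred, obs):
    """Pearson correlation between model-predicted and observed values."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def relative_rmse(pred, obs, ltm):
    """Root mean square error divided by the long-term mean (assumed form)."""
    n = len(pred)
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / n)
    return rmse / ltm


def relative_mean_error(pred, obs, ltm):
    """Mean of (predicted - observed) divided by the long-term mean (assumed form)."""
    n = len(pred)
    return sum(p - o for p, o in zip(pred, obs)) / n / ltm


# Illustrative values only (not data from the study), in mm.
predicted = [60.0, 80.0, 100.0]
observed = [90.0, 110.0, 130.0]
long_term_mean = 100.0

r = pearson_correlation(predicted, observed)
rel_rmse = relative_rmse(predicted, observed, long_term_mean)
rel_me = relative_mean_error(predicted, observed, long_term_mean)
```

In this toy series the model tracks the observations perfectly in phase but is 30 mm too dry every month, so the correlation is high while the relative mean error is negative. This shows why the study reports correlation and error magnitudes separately.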
Additionally, in this paper, the accuracy (i.e. hit rate) is defined as the proportion of hits (i.e. A11, A22 and A33; Table 2) to total observations. For a given location i, the accuracy (Equation (4)) is therefore the sum of A11, A22 and A33 divided by the total number of observations at that location.

Table 2. The contingency table as used in the study. "Below normal" is total rainfall less than 75% of the long-term mean; "Normal" is total rainfall within 75% to 125% of the long-term mean; and "Above normal" is total rainfall greater than 125% of the long-term mean.

A hit is defined in this paper as a case where the model prediction and the observation fall in the same category: the model prediction is "below normal" and the observation is also "below normal", i.e. A11; the model prediction is "normal" and the observation is "normal", i.e. A22; or the model prediction is "above normal" and the observation is "above normal", i.e. A33, as illustrated in Table 2.
The contingency table used, as presented in Table 2, is based on three cases, namely "below normal", "normal" and "above normal". These terms are in operational use by UNMA and are defined as captioned in Table 2. The long-term mean monthly rainfall used in the study is presented in Table 3. This study also used graphical analysis of line graphs and maps obtained using inverse distance weighting interpolation [38], given by Equation (5).
where $p^*$ is the interpolated precipitation amount from the values $p_i$ at neighboring stations, weighted with $w_i$, and $n$ is the total number of stations used to derive $p^*$.
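The hit-rate calculation described above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and placing the model prediction on the rows and the observation on the columns is an assumption, since the layout of Table 2 is not reproduced here. The 75% and 125% thresholds are those given in the Table 2 caption.

```python
def classify(total, ltm):
    """Return 0 (below normal), 1 (normal) or 2 (above normal) relative to the LTM."""
    ratio = total / ltm
    if ratio < 0.75:
        return 0
    if ratio <= 1.25:
        return 1
    return 2


def contingency_table(pred, obs, ltm):
    """Build a 3 x 3 table; rows = predicted category, columns = observed category."""
    table = [[0, 0, 0] for _ in range(3)]
    for p, o, m in zip(pred, obs, ltm):
        table[classify(p, m)][classify(o, m)] += 1
    return table


def accuracy(table):
    """Proportion of hits (diagonal entries A11, A22, A33) to total observations."""
    hits = sum(table[k][k] for k in range(3))
    total = sum(sum(row) for row in table)
    return hits / total


# Illustrative monthly totals in mm (not data from the study).
predicted = [50.0, 100.0, 160.0, 70.0]
observed = [60.0, 130.0, 150.0, 100.0]
ltm = [100.0, 100.0, 100.0, 100.0]

table = contingency_table(predicted, observed, ltm)
hit_rate = accuracy(table)
```

Here two of the four months are hits (one "below normal" pair and one "above normal" pair), giving a hit rate of 0.5.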
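As an illustration of inverse distance weighting, the Python sketch below computes the weighted average of station values, with weights that decrease with distance from the target point. Several details are assumptions not stated in the text: the weights are taken as the inverse of distance raised to a power of 2, distances are computed on a flat plane (geographic coordinates would call for a great-circle distance), and the function name is hypothetical.

```python
import math


def idw_interpolate(target, stations, power=2.0):
    """Inverse distance weighted estimate at `target` = (x, y).

    `stations` is a list of ((x, y), value) pairs. The weight of each
    station is 1 / distance**power (the power of 2 is an assumed default).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            # The target coincides with a station: return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Illustrative station values in mm (not data from the study).
stations = [((0.0, 0.0), 100.0), ((2.0, 0.0), 200.0)]
midpoint_estimate = idw_interpolate((1.0, 0.0), stations)
at_station_estimate = idw_interpolate((0.0, 0.0), stations)
```

A point midway between two stations receives equal weights and therefore the simple average of the two values, while a point on a station returns that station's value.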

Bias Correction Methods
Several methods for bias correction have been proposed, which include: linear scaling [18]; gamma quantile mapping [7]; generalized quantile mapping [7] [16] [19]; and power transformation [7], among others. This study has adopted the generalized quantile mapping method to examine any potential improvement in the skill as simulated by the WRF model with different initial conditions based on the different RCPs (i.e. RCP4.5, RCP6.0 & RCP8.5). Quantile mapping has been proposed by [19] and is presented in Equation (6). This method is also used and promoted by [14] in assessing the changes in precipitation and temperature over the Teesta River basin in the Indian Himalayan region under climate change.

Table 3. The long-term mean monthly rainfall amount in millimeters (mm) used in the study. It is derived from different publications, e.g. dekadal reports issued by UNMA.

where $p^*$ is the bias corrected precipitation estimate from the model; $p_{rcp}$ is the direct model output without bias correction; $p_{obs}$ is the observed precipitation; and $\bar{p}_{obs}$ and $\bar{p}_{rcp}$ are the mean values of $p_{obs}$ and $p_{rcp}$ respectively.
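Because Equation (6) is not reproduced in the text above, its exact form cannot be confirmed here. For orientation, the following Python sketch shows a common empirical form of quantile mapping, which may differ from the formulation used in this study. Each model value is located on the cumulative distribution of the model training sample and replaced by the observed value at the same quantile. The function names, the linear interpolation between sorted samples and the clamping of values outside the training range are all assumptions of this sketch.

```python
from bisect import bisect_right


def cdf_position(sorted_sample, value):
    """Quantile (0 to 1) of `value` within a sorted sample, linearly interpolated."""
    n = len(sorted_sample)
    if value <= sorted_sample[0]:
        return 0.0
    if value >= sorted_sample[-1]:
        return 1.0
    hi = bisect_right(sorted_sample, value)  # first index with sample > value
    lo = hi - 1
    frac = (value - sorted_sample[lo]) / (sorted_sample[hi] - sorted_sample[lo])
    return (lo + frac) / (n - 1)


def quantile(sorted_sample, q):
    """Value at quantile `q` (0 to 1) of a sorted sample, linearly interpolated."""
    pos = q * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1.0 - frac) + sorted_sample[hi] * frac


def quantile_map(model_train, obs_train, model_values):
    """Map model values onto the observed distribution, quantile by quantile."""
    model_sorted = sorted(model_train)
    obs_sorted = sorted(obs_train)
    return [quantile(obs_sorted, cdf_position(model_sorted, v)) for v in model_values]


# Illustrative training samples in mm (not data from the study): the model
# rainfall is systematically half of the observed rainfall.
model_train = [0.0, 10.0, 20.0, 30.0, 40.0]
obs_train = [0.0, 20.0, 40.0, 60.0, 80.0]
corrected = quantile_map(model_train, obs_train, [10.0, 25.0, 50.0])
```

In this toy case the correction doubles the in-range values, while the out-of-range value 50.0 is clamped to the largest observed training value. That clamping is one reason quantile mapping can behave poorly for extremes beyond the training period.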

The Temporal Performance of the WRF Model for the Different RCPs
The performance of the RCPs on monthly, seasonal and annual time scales is presented in Figure 3 and Figure 4. Figure 3 shows that 36.7% of observed monthly rainfall totals were below the long-term mean, compared with 63.3% for RCP4.5; 58.3% for RCP6.0; and 57.5% for RCP8.5. This suggests that the RCPs largely underestimated monthly rainfall over the study period. It is this underestimation that resulted in a smaller overall relative anomaly of −3.956% for RCP4.5; 1.265% for RCP6.0; and 4.870% for RCP8.5. A detailed performance for each of the stations used is presented in Figure 5.
Additional analysis of annual rainfall patterns revealed that the RCPs largely underestimate annual rainfall totals (Figure 4(a)). This performance however improves slightly with bias correction (Figure 4(b)). Further analysis of the re-

The results in this study are comparable to the findings of Ongoma et al. [3] over East Africa. They evaluated 22 models under CMIP5 and found the models to have a comparatively lower skill over East Africa, while Fotso-Nguemo et al. [27] found RCP8.5 to present rainfall magnitudes comparatively lower than those of the Global Precipitation Climatology Center (GPCC) and the Tropical Rainfall Measuring Mission (TRMM). Using RCP4.5 and RCP8.5 and 10 GCMs, along with linear scaling as the bias correction method, Myo et al. [18] found that these RCPs projected fluctuating average monthly precipitation but that annual precipitation is likely to increase. These results are consistent with our findings, which lead us to conclude that 21st century precipitation is going to be highly variable at the monthly scale.

Spatial Performance of the RCPs
The spatial performance of the WRF model driven by the three RCPs (RCP4.5, RCP6.0 & RCP8.5) considered in this study, along with the bias corrected output using quantile mapping, is presented in Figures 6-8. These figures are for correlation analysis (Figure 6); relative mean error (Figure 7); and relative root mean square error (Figure 8). Additional analysis is presented in Tables 4-6, which give results per study location for correlation (Table 4); relative mean error (Table 5); and relative root mean square error (Table 6).

Atmospheric and Climate Sciences

The relative mean error magnitudes for both bias corrected and direct model output remain comparatively small, largely within −0.9 to 1.0 of the long-term mean. This is also observed from the performance at specific study locations presented in Table 5. A detailed analysis of the results presented in Table 5 reveals that bias correction tended to turn underestimation into overestimation over most locations, i.e. 23 out of 28; 19 out of 28; and 21 out of 28 for RCP4.5, RCP6.0 and RCP8.5 respectively. In a related study, Ghimire et al. [15] noted that bias correction reduces the overall error magnitudes.
This is why the relative mean error appears positive on average in Table 5. This could also be the reason why the bias corrected results presented comparatively better performance in Figure 3 and Figure 4. A further investigation of the error magnitudes relative to the long-term mean, carried out using the relative root mean square error (Equation (2)) and presented in Figure 8, confirms the results presented in Figure 7. Generally, the errors are approximately 1.0 to 2.5 times the long-term mean, with the exception of southwestern Uganda where they largely appear greater than 1.5 of the long-term mean, especially over the Mt. Rwenzori region. Additionally, RCP6.0 and the bias corrected RCP4.5 and RCP6.0 appear to present comparatively larger error magnitudes over the northeastern region. A slight improvement in the magnitudes of relative root mean square errors over the southwestern region, especially the Mt. Rwenzori area, is noted in all bias corrected RCP products. This is also confirmed by the results at specific locations in Table 6, which show a slight improvement in the error magnitudes. A detailed analysis of the absolute root mean square errors in comparison to the long-term mean is presented in Table 6. A poor performance is noted over Kasese, Lira, Moroto and Mbarara (Table 6), with errors generally greater than 1.5 of the long-term mean, especially for the bias corrected results. Overall the results show that RCP4.5 presented a slightly better performance.
The foregoing results probably indicate that these RCPs may not be realistically valid in low latitudes. However, a study over Central Africa with the domain including Uganda, by Fotso-Nguemo et al. [27], noted that the ensemble mean of the 20 GCMs was able to reproduce the rainfall patterns exhibited by the Global Precipitation Climatology Center better than those presented by the Tropical Rainfall Measuring Mission. This suggests different performance scores of RCPs with different data-sets and thus underscores the importance of using station observations as ground truth in validation studies. Nonetheless, the RCPs fairly reproduce the temporal patterns (Figure 3 and Figure 4), albeit with an underestimation. This could suggest that future extreme events being projected by different studies under these RCPs could be underestimated and could actually be more severe. To improve the performance of the RCPs, this study proposes data assimilation or a review of these RCPs.

Table 7 presents the hit rate levels per study location. The results generally show RCP8.5 presenting a slightly better skill than the rest. Isolated cases of above average skill are observed over the northwestern and northeastern regions. However, the skill is generally within 20%-50%. There was no noticeable improvement in skill with bias correction of the RCPs. In some areas the skill actually degraded, e.g. over northeastern Uganda. This is in contrast to the findings of Ghimire et al. [15], who noted that bias correction improves the accuracy of numerical simulations. This study argues that the weak or absent improvement in performance after bias correction could be because the initial conditions used to initialize this study are already bias corrected, as described by Monaghan et al. [17], and so additional bias correction is not necessary.

Skill Scores
Further to the skill scores (Figure 9 and Table 7), the results are in agreement with Kisembe et al. [9], who noted that although climate simulations reproduce the climate variability, they present a poor skill regarding the rainfall seasons, especially the March-May rainfall season over Uganda. In general, this study finds RCP8.5 to present a slightly better performance in terms of the hit rate, and it is thus proposed for future simulation over low latitudes including Uganda. However, it is surprising to note that bias correction did not necessarily improve performance. This study considers that this could be because the lateral boundary conditions (LBCs) used in this experiment are already bias corrected, as explained earlier.

Summary and Conclusion
This study was about validating the RCPs and was carried out over Uganda us- was for training the bias correction algorithm and the study used the quantile mapping method to correct the biases of the three RCPs, i.e. RCP4.5, RCP6.0 and RCP8.5. In assessing the performance of the RCPs, the study used the Pearson correlation coefficient; the relative root mean square error; the relative mean error; and the accuracy (i.e. hit rate) computed from a 3 × 3 contingency table for the cases of "Below normal", "Normal", and "Above normal". Below normal is when the monthly rainfall is less than 75% of the long-term mean; Normal is when the monthly rainfall is within 75%-125% of the long-term mean; and Above normal is when the monthly rainfall is greater than 125% of the long-term mean. Trends are presented using line graphs, while the spatial patterns are presented using maps derived using the inverse distance weighted spatialization method.
This study summarises the performance of the RCPs using Table 8. The summary shows that RCP8.5 presented comparatively better ranking on correlation score; relative mean error; relative root mean square error and hit rate. This was followed by RCP4.5 and then RCP6.0. However this study observed that there was no significant difference in the performance of all these RCPs and considers that they remain poor in informing us about future changes in climates especially over low latitudes.
The study further noted a largely negligible improvement due to bias correction. It noted that bias correction tended to increase rainfall in underestimated cases and decrease it in overestimated cases. This did not necessarily improve the skill or the error magnitudes, and we attribute this to the possibility that the LBCs are already bias-corrected. Whereas this study recommends RCP8.5, it also recommends re-evaluating the assumptions in these RCPs. The other option recommended is to use data assimilation to improve the analysis of these RCPs for future climate scenarios.

Table 8. Summary of the performance of the RCPs on different monthly time-scales. "RMSE" is the root mean square error. The values in parentheses are the ranking for the given score on a scale of 1-3 for the respective RCP. The lower the rank, the better the performance of the RCP. The average ranking is obtained by a simple arithmetic average column-wise across the different performance scores.