
A regional analysis of design storms, defined as the expected rainfall intensity for a given storm duration and return period, was conducted to determine rainfall Intensity-Duration-Frequency (IDF) relationships. The ultimate purpose was to determine IDF curves for homogeneous regions identified in Botswana. Three homogeneous regions were identified from topographic and rainfall characteristics using the K-Means Clustering algorithm. Using the mean annual rainfall and the 24 hr annual maximum rainfall as indicators of rainfall intensity for each homogeneous region, IDF curves and maps of rainfall intensities for durations from 1 hr to 24 hr and above were produced. The Gamma and Lognormal probability distribution functions provided estimates of rainfall depths for low and medium return periods (up to 100 years) at any location in each homogeneous region of Botswana.

The rainfall Intensity-Duration-Frequency (IDF) relationship is one of the most commonly used tools in water resources engineering. It can be used as an input in the planning, design and operation of water resources projects. It can also be used in models for flood protection and flood risk management of engineering projects such as dams, roads and urban infrastructure, among others. A typical problem in many developing countries is the non-existent or very sparse network of recording stations, whose data are the natural basis for the calculation of IDF relationships. As a solution to this problem, additional information from the denser network of non-recording stations can be utilized. There is also a need to develop an appropriate methodology for incorporating data from non-recording stations.

Design storms, usually defined as the expected rainfall intensity for a given storm duration and return period, are needed in many hydrological studies, especially for providing an indirect estimate of the design flood. To this end, depth-duration-frequency curves are often employed. These curves allow one to estimate the design storm, provided that historical rainfall extremes are available. When observed rainfall data are lacking, the design storm may be estimated using regional frequency analyses [

In this study, regional depth-duration-frequency relations for the estimation of rainfall extremes in Botswana are derived by combining simple rainfall depth-duration and depth-frequency relationships. The proposed formulation allows one to estimate the expected rainfall depth for durations ranging from 1 to 24 hours and for low and medium return periods (up to 100 years) at any location in the study area. A simulation experiment is developed to assess the reliability of the proposed formulation, whose performance is also tested against other regionalization approaches recently proposed in the scientific literature.

Design storm values are needed in many hydrological studies, especially for providing an indirect estimate of the design flood. To this end, intensity-duration-frequency (IDF) curves are often employed. These curves allow one to estimate the design storm, provided that historical rainfall extremes are available, using scaling and stochastic models [

Botswana has few continuous rainfall recording stations and many more non-recording stations with daily records, so stochastic modeling can be applied to extreme daily rainfall series only. Botswana receives most of its rainfall from convective processes such as instability showers and thunderstorms. A detailed account of the occurrence and distribution of rainfall in Botswana is available in [

In this study we recommend the development of a regional parametric Intensity-Duration-Frequency (IDF) model for Botswana. The regional IDF model can be used for the estimation of design storm depths. The proposed formulation used in the model allows one to estimate the expected rainfall depth for a duration ranging from 5 minutes to 2 hours and beyond, and for low and medium return periods (up to 50 or more years) in any location in the country.

The IDF model is developed from a few optimized parameters, with recorded annual rainfall data as the only input. Annual rainfall, which is a measure of aridity in Botswana, is fairly reliable and readily available information, and its long-term spatial distribution over most regions of Botswana is fairly well known, as documented in [

Botswana is one of the countries in Southern Africa, with an average rainfall of 300 - 600 mm per annum over most of the nation. The distribution of the mean annual rainfall and the stations used in the study, including their respective record lengths, are shown in

Precipitation data at daily time scales were collected within Botswana. Most of the station records span from 1950 to the early 2010s, which is adequate for rainfall intensity-duration-frequency (IDF) studies. A total of 76 stations were considered in this study. The raw daily rainfall data were processed, and the 24 hr annual maximum rainfall (AMR) time series and the mean annual rainfall (MAR) were extracted for all stations over their years of record. The characteristics of the raingauge sites are summarized in
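The extraction of the AMR series and the MAR from daily records can be sketched as below. This is only a minimal illustration: the function name `annual_series` and the input layout (pairs of date and daily depth in mm) are assumptions, years are grouped by calendar year rather than hydrological year, and no treatment of missing days is included.

```python
from collections import defaultdict
from datetime import date


def annual_series(daily):
    """Return (amr, mar) from an iterable of (date, rainfall_mm) pairs.

    amr: dict mapping year -> largest daily (24 hr) rainfall in that year.
    mar: mean of the annual rainfall totals over all years present.
    Assumption: calendar years are used; a hydrological year would need
    the grouping key to be shifted accordingly.
    """
    totals = defaultdict(float)   # annual rainfall total per year
    maxima = defaultdict(float)   # largest daily depth per year
    for day, depth in daily:
        totals[day.year] += depth
        maxima[day.year] = max(maxima[day.year], depth)
    if not totals:
        raise ValueError("no daily records supplied")
    amr = dict(sorted(maxima.items()))
    mar = sum(totals.values()) / len(totals)
    return amr, mar


# Hypothetical four-day record spanning two years.
records = [
    (date(2000, 1, 1), 10.0),
    (date(2000, 2, 1), 30.0),
    (date(2001, 1, 5), 5.0),
    (date(2001, 3, 1), 15.0),
]
amr, mar = annual_series(records)
print(amr, mar)
```

In practice each station would be processed separately, and years with substantial gaps would normally be screened out before the frequency analysis.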

The homogeneous IDF regions were determined through clustering, for which Fuzzy C-Means (FCM) clustering was employed. The FCM algorithm is a modification of the K-means algorithm and minimizes intra-cluster variance [

The algorithm assumes that the attributes form a vector space and seeks to minimize the total intra-cluster variance function, D_{v}, given as [

where c_{k} is the centroid point of all the points in cluster k; N the total number of clusters; S_{k} the set of points in the k^{th} cluster; x_{j} the standardized vector for site j.
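For reference, the standard intra-cluster variance objective that is consistent with these symbol definitions can be written as follows. This is the textbook form and is given here as an assumption about the expression being described, not as a copy of the original equation.

```latex
D_{v} = \sum_{k=1}^{N} \; \sum_{x_{j} \in S_{k}} \left\lVert x_{j} - c_{k} \right\rVert^{2}
```

Each term is the squared Euclidean distance between a standardized site vector x_{j} and the centroid c_{k} of the cluster to which it is assigned.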

The FCM algorithm starts from an initial set of k groups and calculates the centroid of each group. The next step is to construct a new partition by associating each point with the closest centroid. The centroids are then recalculated for the new clusters, and these two steps are applied alternately until convergence [

Station ID | Station Name | Record start | Record end | Longitude (˚E) | Latitude (˚S) | Elevation (m a.m.s.l.) | MAR (mm) | 24 hr AMR (mm) |
---|---|---|---|---|---|---|---|---|
1 | Baines Drift | 1960 | 2011 | 28.73 | 22.48 | 750 | 325.4 | 61.4 |
2 | Bobonong | 1959 | 2011 | 28.42 | 21.97 | 675 | 321.2 | 63.0 |
3 | Bokspits | 1975 | 2011 | 20.70 | 26.70 | 850 | 186.6 | 37.1 |
4 | Dibete | 1958 | 2010 | 26.42 | 23.83 | 1005 | 377.4 | 63.9 |
5 | Digawana | 1981 | 2003 | 25.53 | 25.30 | 1275 | 344.9 | 62.8 |
72 | Tshane | 1958 | 2011 | 21.88 | 24.02 | 1118 | 329.6 | 51.0 |
73 | Tshesebe | 1980 | 2000 | 27.58 | 20.68 | 1170 | 422.8 | 62.0 |
74 | Tutume | 1959 | 2000 | 27.02 | 20.50 | 1080 | 456.1 | 56.2 |
75 | Werda | 1960 | 2002 | 23.27 | 25.27 | 1000 | 265.9 | 53.2 |
76 | Zwenshambe | 1981 | 1999 | 27.43 | 20.50 | 1230 | 388.8 | 59.4 |

Convergence of the intra-cluster variance is reached when the points no longer switch clusters. In contrast to the K-means algorithm, which assigns each site to only one cluster, FCM permits partial membership, so that each point has a degree of membership in each of the clusters. Thus, points on the edge of a cluster may belong to that cluster to a lesser degree than points at its center. The degree of belonging of site i to the k^{th} cluster is equal to the inverse of the distance of site i to the centroid of that cluster, as defined in [

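The alternating centroid and membership updates of FCM can be sketched as follows. This is a minimal sketch of the textbook algorithm, not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random initial memberships and the stopping tolerance are all assumptions, and the site attributes (for example standardized longitude, latitude, elevation and MAR) would have to be prepared beforehand.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook Fuzzy C-Means on a list of equal-length numeric tuples.

    Returns (centroids, memberships), where memberships[i][k] is the degree
    of membership of point i in cluster k and each row sums to 1.
    The fuzzifier m must be greater than 1 (assumed default: 2).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centroids = []
    for _ in range(n_iter):
        # Step 1: each centroid is the mean of all points weighted by u^m.
        centroids = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centroids.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Step 2: memberships are proportional to an inverse power of the
        # distance from each point to each centroid.
        new_u = []
        for i in range(n):
            dists = [math.dist(points[i], c) for c in centroids]
            if min(dists) == 0.0:
                # Point coincides with a centroid: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            new_u.append([v / total for v in row])

        # Stop when no membership changes by more than the tolerance.
        shift = max(abs(new_u[i][k] - u[i][k])
                    for i in range(n) for k in range(n_clusters))
        u = new_u
        if shift < tol:
            break
    return centroids, u


# Hypothetical standardized site vectors forming two obvious groups.
sites = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, memberships = fuzzy_c_means(sites, n_clusters=2)
for site, row in zip(sites, memberships):
    print(site, [round(v, 3) for v in row])
```

A hard assignment, when one is needed for mapping, can be obtained by taking the cluster with the largest membership for each site.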

The most important contribution of this study was to determine the IDF curves with a parsimonious modelling approach, in which a robust frequency model with few parameters describes maximum rainfall intensity, duration and frequency and is used to construct the IDF curves. The IDF curves were constructed based on the method proposed in [

This model of IDF curves is developed in a manner that is applicable to any location within Botswana. In this study two types of empirical IDF equations that relate intensity and duration were considered. The first one was a simple relationship, which has the following form:

where R is the mean annual rainfall (mm/a); I is the average rainfall intensity (mm/hr); t_{d} is the storm duration, taken as the time of concentration (minutes); n is the return period (years); and a, b and c are constants that depend on the units employed. In Equation (1) the constants a, b and c do not depend on the return period; however, they vary significantly with location and are estimated for each specific region. Given a rainfall intensity I_{o}, the constants are obtained by minimizing the sum of squared deviations (SSE) between I_{o} and the intensity I computed from Equation (1).

Equations (1) and (2) are used to compute the required intensities for the respective stations and durations. Formulas of the form given in Equation (1) are used in the region [

The Intensity-Duration-Frequency (IDF) curves were developed for each region. Because storm intensities for durations shorter than 24 hr are of great importance for drainage design and water resources management, durations such as 0.5 hr, 1 hr and 3 hr were selected as the time scales used to construct the IDF curves for each homogeneous region in this study.
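The least-squares calibration of intensity-duration constants described above can be illustrated with a small sketch. The functional form of Equation (1) is not reproduced in this copy, so the code uses a Talbot-type relation, I = a / (t_{d} + c), purely as a stand-in; it omits the mean annual rainfall R and the constant b, and the function name `fit_talbot` and the grid search over c are hypothetical choices.

```python
def fit_talbot(durations_min, intensities, c_grid):
    """Fit the stand-in relation I = a / (t_d + c) by minimising the SSE.

    For each trial value of c the model is linear in a, so the a that
    minimises the sum of squared errors has a closed form. The (a, c)
    pair with the smallest SSE over the grid is returned with its SSE.
    NOTE: this is NOT the paper's Equation (1); it only illustrates the
    calibration idea on a simpler, well-known empirical form.
    """
    best = None
    for c in c_grid:
        basis = [1.0 / (t + c) for t in durations_min]
        a = (sum(b * i for b, i in zip(basis, intensities))
             / sum(b * b for b in basis))
        sse = sum((i - a * b) ** 2 for b, i in zip(basis, intensities))
        if best is None or sse < best[2]:
            best = (a, c, sse)
    return best


# Synthetic check: intensities generated from a = 2000, c = 20 (made-up values).
durations = [5, 10, 15, 30, 60, 120]
observed = [2000.0 / (t + 20.0) for t in durations]
grid = [10.0 + 0.5 * k for k in range(41)]   # trial c values from 10 to 30
a_hat, c_hat, sse = fit_talbot(durations, observed, grid)
print(a_hat, c_hat, sse)
```

With real station data the observed intensities would replace the synthetic series, and a general-purpose optimizer could take the place of the grid search when all constants of the actual equation are fitted together.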

In a previous study, three mass curves of storm profiles, shown in

In studies of related phenomena such as drought, spatial interpolation of historical precipitation time series has been used to derive Standardized Precipitation Indices (SPI) and to construct drought Severity-Area-Frequency (SAF) curves. Several of those studies used GIS and other multi-temporal spatial interpolation techniques such as the Inverse Distance Weighting (IDW) approach (e.g. [

The thrust of this study is that frequency modeling of rainfall intensity for various durations and recurrence intervals was used to derive the spatial distribution of IDF for any homogeneous region. A robust solution was achieved by avoiding data-intensive, multi-temporal interpolation of all the at-site (station-wide) time series of 24 hr annual maximum rainfall and the corresponding IDF.

Spatial interpolation was undertaken for the few parameters of the frequency models that control the precipitation over a homogeneous region, and for the consequent storm intensities corresponding to durations below and above 24 hr. For the respective periods, the hydrological annual maximum daily (24 hr) rainfall values were used for frequency analysis in this study. These rainfall intensities were fitted to the candidate probability distributions, namely Lognormal (LN), General Extreme Value (GEV), General Pareto (GP), Exponential (EXP) and 2-Parameter Gamma (G2). After model fitting and performance evaluation, the best IDF model was selected as the regional IDF model, based on the comparison between the observed and model-estimated quantiles of 24 hr rainfall, as discussed in the next section.

The most important contribution of this study was to determine the IDF curves with a parsimonious modeling approach that relies on a few parameters of a robust frequency model to describe storm frequency and construct the IDF curves.

The relationship between the return period (T) and the non-exceedance probability (F) of a rainfall quantile is expressed as:

F = 1 - 1/T

The value of F is the probability of an event having a magnitude of P_{T} or less, where P_{T} is the T-year magnitude. P_{T} is determined from the parent probability distribution using the method of moments, the method of maximum likelihood or the method of L-Moments [

P_{T} = μ + K_{T} σ

where K_{T} is the frequency factor, which is a function of the return period and the parameters of the distribution, and μ and σ are the location and scale parameters of the probability distribution. Equivalently, the quantile can be estimated from the Extreme Value Type I (EV1) probability distribution function on a linear plot against the EV1 reduced variate (y_{T}) as follows:

P_{T} = u + a y_{T}

where u and a are the parameters of the EV1 distribution. This plot can be used as a simple curve to derive P_{T} from y_{T} once the parameters are estimated. Typical plots for three stations are presented in
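The EV1 relation between return period, reduced variate and quantile can be sketched as below, using the standard Gumbel reduced variate y_{T} = -ln(-ln(1 - 1/T)). The function names and the method-of-moments estimator are illustrative assumptions; the study may have estimated u and a by another method.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def ev1_reduced_variate(T):
    """EV1 (Gumbel) reduced variate for a return period T > 1 (years)."""
    return -math.log(-math.log(1.0 - 1.0 / T))


def ev1_quantile(u, a, T):
    """T-year quantile P_T = u + a * y_T for location u and scale a."""
    return u + a * ev1_reduced_variate(T)


def ev1_moment_fit(sample):
    """Method-of-moments estimates (u, a) from a sample of annual maxima."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    a = math.sqrt(6.0 * var) / math.pi   # scale from the sample variance
    u = mean - EULER_GAMMA * a           # location from the sample mean
    return u, a


# Hypothetical parameters (mm), not taken from the study.
for T in (2, 10, 50, 100):
    print(T, round(ev1_quantile(50.0, 20.0, T), 2))
```

Because P_{T} is linear in y_{T}, plotting the ranked annual maxima against their reduced variates gives a quick visual check of how well the EV1 model fits a station.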

The goodness-of-fit between the parent population probability distribution and the sample frequencies was investigated to test the descriptive ability of the candidate frequency models. For this purpose, the Kolmogorov-Smirnov (K-S) test was applied. Furthermore, the predictive ability of a candidate probability distribution needs to be established. The estimates of the 24 hr quantiles of rainfall intensity can be evaluated using the standard error of estimate, SE, of the quantile corresponding to a given return period, given by:

The standard error of estimate accounts for error due to small sample size, but not for error due to an inappropriate choice of distribution. The most efficient method of parameter estimation is the one that gives the smallest standard error of estimate.
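The K-S check of descriptive ability mentioned above can be sketched as follows. This is a minimal illustration with hypothetical function names: the Lognormal candidate is fitted by moments of the log-transformed sample, and the critical value 1.22/sqrt(n) is the usual large-sample approximation at the 10% significance level, which is only indicative when the parameters are estimated from the same sample.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of sample and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def lognormal_cdf_from_sample(sample):
    """Lognormal CDF with parameters taken from the logs of the sample."""
    logs = [math.log(x) for x in sample]
    dist = NormalDist(mean(logs), stdev(logs))
    return lambda x: dist.cdf(math.log(x))


def ks_critical_10pct(n):
    """Large-sample K-S critical value at the 10% significance level."""
    return 1.22 / math.sqrt(n)


# Made-up annual maximum 24 hr rainfall values (mm), for illustration only.
amr = [37.1, 42.0, 48.5, 51.0, 53.2, 56.2, 59.4, 61.4, 62.8, 63.9, 70.3, 88.0]
d_stat = ks_statistic(amr, lognormal_cdf_from_sample(amr))
print(d_stat, ks_critical_10pct(len(amr)), d_stat < ks_critical_10pct(len(amr)))
```

A candidate distribution is not rejected when the statistic falls below the critical value; the same routine can be reused for the other candidate distributions by passing their fitted CDFs.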

There was marked variability in the 24 hr annual maximum rainfall among the stations used in the study. Because the annual rainfall differs between heterogeneous regions, the homogeneous regions in this study were formed by identifying the Fuzzy C-Means (FCM) clusters in the space of site characteristics, namely mean annual rainfall, latitude, longitude and elevation. A few stations were initially classified into a cluster region that is geographically far away. These stations were scrutinized further and associated with their respective nearby cluster regions. It was found that the rainfall of Botswana is divided into three homogeneous regions (

The homogeneous regions formed through the identification of FCM clusters, and the stations contained in each cluster region, are presented

The time series of 24 hr annual maximum events was extracted for all 76 stations, and the IDF parameters for each station in each region were then used to derive regional IDF curves. Storm intensities corresponding to the short-, medium- and long-term are of great importance for different flood drainage and water resources management strategies. Once this was completed, the IDF curves for each homogeneous region were constructed for durations at steps below and above 24 hr. All of the distributions adequately represented the annual maximum 24 hr rainfall samples at all

Region | Stations contained (station ID) |
---|---|
1 | 1 5 6 7 13 14 17 23 26 27 28 30 31 34 35 36 38 41 42 45 47 48 51 53 54 60 61 64 67 70 |
2 | 2 4 8 9 10 11 15 16 20 21 22 24 29 32 33 37 39 40 43 44 49 50 52 55 56 57 58 59 62 63 65 68 69 73 74 76 |
3 | 3 18 19 25 42 46 66 71 72 75 |

Attribute | Cluster 1 | Cluster 2 | Cluster 3 |
---|---|---|---|
Longitude (˚E) | 25.56 | 25.64 | 20.87 |
Latitude (˚S) | 22.32 | 23.08 | 25.55 |
Elevation (m a.m.s.l.) | 977 | 1060 | 960 |
Mean annual rainfall (mm) | 326 | 444 | 184 |
No. of stations in each cluster | 30 | 36 | 10 |

stations. The goodness-of-fit between the parent population probability distribution and the sample frequencies was investigated as a test of the descriptive ability of the candidate frequency models. This was done using the Kolmogorov-Smirnov (K-S) test at the 90% confidence level, and the fit was found to be adequate for each homogeneous region.

The parameters of the IDF curve that gave the lowest SSE in the estimated 24 hr rainfall depths for each region are summarized in

Further, the predictive ability was investigated using the standard error (Equation (7)), and the results of the four frequency models in each region were determined. Among the four frequency models, the 2-Parameter Gamma distribution, followed by the Lognormal distribution, gave comparatively high accuracy in predicting rainfall intensities in all regions, and these models could be adopted.

The quantile estimates were in good agreement with the observed 24 hr rainfall intensity in terms of low standard error (SE) for the various return periods. For instance, a comparison among the four probability distributions in terms of quantiles of

Parameter | Range | Region 1 | Region 2 | Region 3 |
---|---|---|---|---|
a | 1960 - 1980 | 1961 | 1970 | 1978 |
b | 2900 - 3200 | 2950 | 3100 | 3120 |
c | 18 - 21 | 18.2 | 19 | 20.2 |

rainfall intensity (P_{T}) and return periods for a station at Maun in Region 1 is shown in

Return period T (years) | LN P_{T} | LN SE | GEV P_{T} | GEV SE | EXP P_{T} | EXP SE | G2 P_{T} | G2 SE |
---|---|---|---|---|---|---|---|---|
2 | 55.20 | 3.28 | 52.95 | 3.35 | 51.49 | 3.13 | 55.88 | 3.42 |
5 | 78.91 | 5.98 | 76.24 | 5.71 | 78.72 | 6.69 | 83.09 | 7.06 |
10 | 95.12 | 8.63 | 94.99 | 8.37 | 99.32 | 10.26 | 100.30 | 10.43 |
20 | 110.96 | 11.62 | 115.96 | 12.66 | 119.92 | 13.96 | 116.16 | 13.88 |
25 | 113.82 | 12.20 | 123.30 | 14.50 | 126.55 | 15.17 | 118.94 | 14.50 |
50 | 131.77 | 16.03 | 148.24 | 22.04 | 147.15 | 18.94 | 135.74 | 18.43 |
100 | 148.15 | 19.81 | 176.89 | 32.96 | 167.75 | 22.73 | 150.27 | 21.98 |

Return period T (years) | LN P_{T} | LN SE | GEV P_{T} | GEV SE | EXP P_{T} | EXP SE | G2 P_{T} | G2 SE |
---|---|---|---|---|---|---|---|---|
2 | 50.66 | 3.02 | 49.54 | 3.36 | 47.77 | 2.95 | 51.81 | 3.20 |
5 | 73.34 | 5.12 | 72.45 | 5.28 | 71.89 | 4.90 | 75.88 | 5.19 |
10 | 88.98 | 7.00 | 89.66 | 6.83 | 90.13 | 6.73 | 91.00 | 6.83 |
20 | 104.36 | 9.09 | 107.88 | 8.94 | 108.38 | 8.66 | 104.88 | 8.50 |
25 | 107.15 | 9.49 | 114.04 | 9.80 | 114.25 | 9.29 | 107.30 | 8.80 |
50 | 124.68 | 12.15 | 134.24 | 13.20 | 132.50 | 11.27 | 121.96 | 10.71 |
100 | 140.76 | 14.76 | 156.28 | 17.92 | 150.74 | 13.27 | 134.60 | 12.42 |

Return period T (years) | LN P_{T} | LN SE | GEV P_{T} | GEV SE | EXP P_{T} | EXP SE | G2 P_{T} | G2 SE |
---|---|---|---|---|---|---|---|---|
2 | 31.36 | 3.23 | 30.79 | 3.95 | 30.00 | 3.31 | 32.49 | 3.60 |
5 | 51.20 | 6.28 | 50.34 | 6.59 | 51.18 | 6.03 | 53.77 | 6.26 |
10 | 66.15 | 9.25 | 65.95 | 8.66 | 67.21 | 8.51 | 67.89 | 8.58 |
20 | 81.71 | 12.76 | 83.31 | 11.40 | 83.23 | 11.10 | 81.23 | 11.02 |
25 | 84.62 | 13.45 | 89.35 | 12.51 | 88.39 | 11.95 | 83.59 | 11.47 |
50 | 103.44 | 18.21 | 109.83 | 16.91 | 104.42 | 14.59 | 98.07 | 14.32 |
100 | 121.46 | 23.13 | 133.21 | 23.18 | 120.45 | 17.26 | 110.78 | 16.94 |

Three homogeneous regions were identified, representing the northern region (Region 1), a second region (Region 2) and the south-western region (Region 3) of Botswana. Each cluster region portrays a distinct climatic region of Botswana. Region 1 represents a relatively humid region of Botswana with a mean annual rainfall (MAR) above 450 mm, whereas Region 2 represents the semi-arid part of Botswana with a MAR of 300 - 450 mm. Region 3 represents the arid sub-region of Botswana with a MAR of 300 mm and below.

The 30 raingauge stations identified in Region 1 include Baines Drift, Digawana, Kasane, Maun and others. The 36 stations identified in Region 2 include Gaborone, Francistown and Palapye, among others. Region 3 consists of 10 stations, including Bokspits, Ghanzi and others.

The quantile estimates of the 24 hr rainfall intensities were in good agreement with the observed IDF in terms of low standard error (SE) for the various return periods, with a model efficiency (R^{2}) that is well above 90% and a correlation coefficient (r) close to unity, as illustrated in

The regional IDF curves developed for each homogeneous region are illustrated for three typical stations, Maun, Gaborone and Bokspits, representing Regions 1, 2 and 3, respectively. These IDF curves are shown in

We have developed homogenized and regionalized Intensity-Duration-Frequency (IDF) curves for Botswana using a regional storm IDF modeling approach. The result can be used as a generalized IDF model to produce IDF relationships in Botswana. Three homogeneous regions were identified, representing the northern region (Region 1), a second region (Region 2) and the south-western region (Region 3) of Botswana. Each cluster region portrays a distinct climatic region of Botswana. Region 1 represents a relatively humid region of Botswana with a mean annual rainfall (MAR) above 450 mm, whereas Region 2 represents the semi-arid part of Botswana with a MAR of 300 - 450 mm. Region 3 represents the arid sub-region of Botswana with a MAR of 300 mm and below.

The performance of the model was judged against a number of model performance indices, which gave very high model efficiency (R^{2}) [

The consistent agreement in quality and magnitude of rainfall intensity curves shows the acceptability of the proposed design storm Intensity-Duration-Frequency (IDF) model for the calculation of design flood estimates for various applications, including the design of culverts, bridges, spillways and associated urban drainage and infrastructure.

The authors wish to acknowledge the support of the University of Botswana and the Department of Meteorological Services of the Botswana Government, which encouraged this research directly or indirectly by providing the necessary information and support.

Alemaw, B.F. and Chaoka, R.T. (2016) Regionalization of Rainfall Intensity-Duration-Frequency (IDF) Curves in Botswana. Journal of Water Resource and Protection, 8, 1128-1144. http://dx.doi.org/10.4236/jwarp.2016.812088