Accounting for the Effects of Climate Variability in Regional Flood Frequency Estimates in Western Nigeria

Extreme flood events are becoming more frequent and intense, owing to climate change and other anthropogenic factors. Nigeria, the case study for this research, experiences recurrent flooding, the most disastrous being the 2012 flood event, which resulted in unprecedented damage to infrastructure, displacement of people, socio-economic disruption, and loss of lives. To mitigate the impact of such floods now and in the future, effective planning is required, underpinned by analytics based on reliable data and information. Such data are seldom available in many developing regions, owing to financial, technical, and organizational drawbacks that result in short and inadequate historical records, which are prone to uncertainty if applied directly to flood frequency estimation. This study applies regional Flood Frequency Analysis (FFA) to offset deficiencies in historical data by pooling data from sites that have similar hydro-geomorphological characteristics and are governed by a similar probability distribution, differing only by an “index-flood”; it also accounts for the effect of climate variability. Data from 17 gauging stations within the Ogun-Osun River Basin in Western Nigeria were analysed, resulting in the delineation of 3 sub-regions, of which 2 were homogeneous and 1 heterogeneous. The Generalized Logistic distribution was fitted to the annual maximum flood series for the 2 homogeneous regions to estimate flood magnitudes and the probability of occurrence while accounting for climate variability. The influence of climate variability on flood estimates in the region was linked to the Madden-Julian Oscillation (MJO) climate index and resulted in increases in flood magnitude of 0% to 35% for regional and direct flood frequency estimates, demonstrating that multi-decadal changes in atmospheric conditions influence both small and large floods. The results reveal the value of considering climate variability for flood frequency analysis, especially when non-stationarity is established by homogeneity analysis.

How to cite this paper: Ekeu-Wei, I.T., Blackburn, G.A. and Giovannettone, J. (2020) Accounting for the Effects of Climate Variability in Regional Flood Frequency Estimates in Western Nigeria. Journal of Water Resource and Protection, 12, 692-713. https://doi.org/10.4236/jwarp.2020.128042
Received: June 18, 2020. Accepted: August 16, 2020. Published: August 19, 2020.
Copyright © 2020 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/


Introduction
Floods are natural hazards aggravated by both climatic factors (i.e. climate variability and climate change) and non-climatic factors (e.g. changes in land cover, land use, vegetation, etc.) [1], and result in the destruction and disruption of socio-economic activities, damage to property and infrastructure, loss of lives, and financial loss [2]. In Nigeria, the case study for this research, frequent and unprecedented levels of flooding and the resulting impacts have increased concern among the public, government, and other stakeholders about the probability of flood recurrence, reinforcing the need to establish appropriate mitigation measures to minimize flood impacts [3].
Knowledge of flood frequency estimates is crucial to ensure that socio-economic activities and infrastructural development are planned appropriately to improve resilience [4]. Accurate estimates of flood intensities and frequencies are also important for the design of critical infrastructure required for flood risk reduction (dykes, levees, dams, etc.), the construction of hydraulic structures (bridges, culverts, drainages), the development of floodplain and urban land-use regulations, emergency management, and disaster risk insurance [5]. Under-estimating the design flood could lead to increased flood risk with potentially catastrophic consequences, while over-estimation could waste resources and aggravate upstream and downstream flooding [6].
To accurately estimate expected flood magnitudes and return periods, networks of gauging stations are typically established to collect hydrological data over a long period. However, in many developing regions, establishing the optimal number of hydrological stations is usually hampered by challenges such as the high cost of gauging equipment. Therefore, several locations are usually left ungauged or have only short records if the gauging stations are newly established, discontinued, or damaged. In many low- and middle-income countries, catchments are sparsely gauged due to factors that include lack of commitment by station operators, deteriorating observation equipment, insecurity and theft, and the inaccessibility of remote locations [7] [8]. The absence of sufficient high-quality data results in poor flood predictions in these areas and, consequently, flawed flood risk management interventions [9].
Therefore, it is essential to explore techniques capable of extracting maximum
Generally, the choice of a flood frequency estimation approach depends on the availability of historical flood records at or around the site of interest. When sufficient historical flood data are available, direct flood frequency analysis is performed by fitting a pre-defined probability distribution to the annual maximum flood or partial flood time series [10]. Where data are insufficient, indirect flood estimation procedures are used, such as the adoption of hydro-meteorological data from other locations similar in characteristics to the site of interest [11] [12] or the incorporation of data from other sources, including remote sensing [13] [14]. The present study adopts the former approach.
A major factor that affects future flood regimes and must be considered when estimating flood magnitudes is the changing climate, whether characterized by

Study Area, Datasets and Sources
The climate of the Ogun-Osun River Basin (OORB) is influenced by tropical continental and maritime air masses [21]. The basin experiences an annual rainfall of 1400 mm to 1500 mm; mean annual air temperature ranges from 25.7˚C to 30˚C, and relative humidity varies from 37% to 85% [22]. The OORB experiences recurring flooding caused by the increased frequency of intense precipitation events, poor urban planning and waste management practices, and failure of upstream hydraulic systems, which have led to significant socio-economic, infrastructural, ecological and environmental impacts [23]. Also, recent evidence from studies in West Africa [24] [25] [26] and Nigeria [27] suggests the presence of strong correlations between climatic variability and hydro-meteorological events in the region.
Hydrological data (discharge, water levels and rating curves) used for this study were provided by the Ogun-Osun River Basin Development Authority (OORBDA), the agency responsible for collecting and managing data within the Basin. Additional data sets for two hydrological stations, Yewa Mata and Ona River/Sala village, were extracted from published research by Olukanni and Alatise [28] and Ewemoje and Ewemooje [29], respectively, using the WebPlotDigitizer tool [30]. The catchment area for each station was delineated from 30-m resolution Shuttle Radar Topography Mission (SRTM) digital elevation model data [31] using the Arc Hydro tool in ArcMap 10.2. The properties of the gauging stations for the OORB are presented in Table 1, while the spatial distribution of gauging stations is presented in Figure 1, showing the spread and sparsity of the hydrological monitoring network.
Climate indices data used in this study originate from the data repository of the United States National Oceanic and Atmospheric Administration (NOAA) [32] and are embedded within the open-source International Centre for Integrated Water Resources Management Regional Analysis of Frequency Tool (ICI-RAFT). These data include multi-decadal hydro-climate indices such as the Pacific Decadal Oscillation, El Niño/Southern Oscillation, Madden-Julian Oscillation, North Atlantic Oscillation and others. One data limitation is that the period of data availability varies between sites, which restricts the direct comparison of sites for analytical purposes.

Data Preparation and Preliminary Analysis
Prerequisite data preparation included data formatting and conversion, infilling of missing data, and statistical testing. First, historical river water level data were converted to discharge using up-to-date rating curves provided by the OORBDA. Gaps in the peak annual discharge (flood) series were then filled by multiple imputation, based on coupled Markov Chain Monte Carlo and ordinary least squares regression, using the XLSTAT tool in Microsoft Excel [33]. FFA typically assumes that the available data satisfy the conditions of randomness, serial non-correlation, absence of outliers, and homogeneity, which reduces the inherent data uncertainty [10]. The randomness of the hydrologic data at each station is assessed using the Mann-Kendall (M-K) test [34], which detects increasing and decreasing trends in the time series [35]. The presence of serial correlation within hydrological records at a particular site results in discrepancies in regional variance and increased data skewness [36], thus contributing to uncertainty in regional flood frequency estimates [4]. To assess the magnitude of the serial correlation, lag-1 correlation coefficients [17] are computed; these range from −1 (perfect inverse correlation) through 0 (no correlation) to 1 (perfect correlation). Outliers also affect data quality and can be attributed to gauge failure, sampling inconsistencies, typographical errors, or gauge disruptions; they are not considered part of the real flood population [37]. Outliers were identified using the Grubbs and Beck test [38]. Finally, breakpoint analysis [16] is applied to assess the homogeneity of the hydrological time series.
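As an illustration of two of the screening statistics named above, the following minimal sketch computes a lag-1 serial correlation coefficient and the Mann-Kendall trend statistic for a single annual maximum series. It is not the implementation used in the study: the function names and the discharge values are hypothetical, and the Mann-Kendall variance is written without a correction for tied values, which is an assumption that holds only when ties are rare.

```python
import math
import numpy as np

def lag1_autocorrelation(x):
    """Lag-1 serial correlation coefficient of a series (range -1 to 1)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

def mann_kendall(x):
    """Mann-Kendall S statistic, standardised Z and two-sided p-value.

    Assumption: no correction for tied values is applied.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S counts later-minus-earlier sign agreements over all pairs.
    s = sum(float(np.sum(np.sign(x[i + 1:] - x[i]))) for i in range(n - 1))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p

# Hypothetical annual maximum discharges (m^3/s), for illustration only.
peaks = [310.0, 295.0, 402.0, 350.0, 388.0, 420.0, 365.0, 455.0, 430.0, 498.0]
r1 = lag1_autocorrelation(peaks)
s, z, p = mann_kendall(peaks)
print(f"lag-1 r = {r1:.3f}, M-K S = {s:.0f}, Z = {z:.2f}, p = {p:.4f}")
```

A small p-value from the Mann-Kendall test indicates a monotonic trend, while a lag-1 coefficient close to zero supports the assumption of serial independence.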

Climate Indices-Assessing Climate Variability Effect on Flood Frequency Estimates
Climate variability describes the pattern of climate dynamics on both temporal and spatial scales, identified as fluctuation above or below the average climate pattern for a short period, and thus influences the magnitude and frequency of extreme flood events [39] [40]. While past hydrologic models have assumed stationarity, current climatic conditions suggest that the future is expected to differ from what is known of the past and present [41]. Ocean-atmosphere processes that influence precipitation, atmospheric pressure and temperature are described by climatic indices, which are useful in tracking long-term decadal hydrological changes [42] [43]. Some key climate indices that characterize the frequency, intensity and duration of extreme climatic events include the Arctic Oscillation (AO), North Pacific Oscillation (NPO), North Atlantic Oscillation (NAO), Pacific Decadal Oscillation (PDO), Pacific/North American Index (PNA), El Niño/Southern Oscillation (ENSO), and Madden-Julian Oscillation (MJO) [24]. In this study, the correlation between the annual maximum discharge time series and climatic indices is evaluated as an indicator of the influence of climate variability on flood magnitudes and frequencies [19] [43]. ICI-RAFT, developed by Giovannettone and Wright [44], contains a database of 30 hydro-climate indices (HCI), including those previously mentioned, to facilitate correlation analysis, taking into account a stipulated lag, as well as lower and upper index limits. To consider the influence of climate variability on flood frequency analysis, the ICI-RAFT program recomputes the flood magnitude for each return period using only non-zero data that satisfy the index limits set for the HCI with the highest correlation.
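The correlation-and-subsetting idea described above can be sketched as follows. This is a simplified illustration, not the ICI-RAFT algorithm: it assumes one index value per year, a lag expressed in whole years, and Pearson correlation; the function names and all numbers are hypothetical.

```python
import numpy as np

def lagged_correlation(peaks, index, lag=0):
    """Pearson r between peaks[t] and index[t - lag] (lag in years, >= 0)."""
    peaks = np.asarray(peaks, dtype=float)
    index = np.asarray(index, dtype=float)
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag > 0:
        peaks, index = peaks[lag:], index[:-lag]
    return float(np.corrcoef(peaks, index)[0, 1])

def subset_by_index_limits(peaks, index, lower, upper, lag=0):
    """Keep non-zero peaks whose lagged index value lies in [lower, upper]."""
    peaks = np.asarray(peaks, dtype=float)
    index = np.asarray(index, dtype=float)
    if lag > 0:
        peaks, index = peaks[lag:], index[:-lag]
    mask = (index >= lower) & (index <= upper) & (peaks != 0)
    return peaks[mask]

# Hypothetical peak discharges (m^3/s) and annual index values.
peaks = [120.0, 340.0, 210.0, 0.0, 415.0, 380.0, 150.0, 460.0]
index = [-0.8, 0.6, 0.1, 0.3, 1.1, 0.9, -0.5, 1.4]

r = lagged_correlation(peaks, index, lag=0)
selected = subset_by_index_limits(peaks, index, lower=0.5, upper=2.0)
print(f"r = {r:.2f}; peaks retained for the conditioned analysis: {selected}")
```

The retained peaks would then be passed to the frequency analysis in place of the full series, so that the fitted quantiles reflect only years in which the index fell within the chosen limits.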

L-Moment-Index Flood Regional Flood Frequency Analysis (RFFA)
Regional flood frequency analysis is based on the agglomeration of hydrological data from homogeneous regions characterised by similar physiographical parameters (e.g. catchment area, catchment slope, stream length, precipitation, and elevation). Hydrological data available at sites within a defined region are used to estimate the regional flood quantile based on the assumption that they are defined by the same probability distribution and differ only by the index flood [4]. This approach helps reduce inconsistencies associated with data shortage [6].
The index-flood technique developed by Dalrymple [45] has been widely applied in determining flood estimates for gauged and ungauged catchments of varying sizes at the global, regional and local scales [13] [46] [47]. The general assumption of this method is that the probability distributions of the annual maximum floods across sites in the region are similar and differ only by a site-specific scaling factor termed the index flood (mean or median flood) [4] [45]. The flood quantile ($Q_T$) for a T-year return period at a site of interest ($i$), given a common regional probability distribution factor ($X_T$), can be mathematically expressed as:

$$Q_T^{i} = Q_{index}^{i} \cdot X_T \qquad (1)$$

The index flood ($Q_{index}$) for an ungauged site of interest is usually derived by establishing a relationship between available catchment characteristics (such as catchment area, elevation, annual precipitation, etc.) and the index flood of gauged sites within a homogeneous region [48]. The regional probability distribution is a dimensionless parameter determined using a best-fit statistical approach discussed later in Section 3.3.3.
L-moment based flood frequency analysis was undertaken using ICI-RAFT [44], and the procedure includes 1) data screening and site clustering to derive the discordancy measure (D) based on Ward's hierarchical clustering approach, 2) regional homogeneity testing using the heterogeneity measure (H), and 3) selection of the appropriate distribution using the goodness-of-fit measure (Z) [4]. The L-moment method is widely preferred for regional flood frequency analysis because linear moments are more robust than ordinary moments in handling extreme values over a wide range of probability distributions and are less susceptible to bias. The components of L-moment analysis are detailed in Hosking and Wallis [4] and other studies [49] and are summarized below.
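The index-flood scaling described above amounts to multiplying a site-specific index flood by a dimensionless regional growth factor. A minimal sketch follows, under the assumption that the growth factors are already known for the region; the function names, the growth factors and the discharge values are all hypothetical.

```python
import numpy as np

def index_flood(amax, statistic="mean"):
    """Index flood of a site: mean or median of its annual maximum series."""
    amax = np.asarray(amax, dtype=float)
    if statistic == "mean":
        return float(amax.mean())
    if statistic == "median":
        return float(np.median(amax))
    raise ValueError("statistic must be 'mean' or 'median'")

def site_quantiles(q_index, growth_factors):
    """Scale regional growth factors X_T by the site index flood."""
    return {T: q_index * x_t for T, x_t in growth_factors.items()}

# Hypothetical annual maxima (m^3/s) and regional growth factors.
amax = [180.0, 240.0, 205.0, 310.0, 265.0]
growth = {10: 1.6, 50: 2.3, 100: 2.7}  # return period T -> X_T

q_idx = index_flood(amax)
for T, q in site_quantiles(q_idx, growth).items():
    print(f"T = {T:>3} years: Q_T = {q:.1f} m^3/s")
```

For an ungauged site, `q_index` would instead come from a regression on catchment characteristics, which is outside the scope of this sketch.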

Data Screening
The discordancy measure (D) based on L-Moments (L-Mean, L-CV, L-Skewness and L-Kurtosis) is applied to identify sites whose L-Moment ratios are discordant from those of the group as a whole, flagged by a critical value of D ≥ 3. Here, the L-Mean is equivalent to the conventional mean, a measure of central tendency. L-CV (the coefficient of L-variation) is a dimensionless measure of variability, with values near 0 indicating low variability and values around 0.4 indicating very high variability. L-Skewness (t3) is a measure of the degree of symmetry of a sample; its value lies between −1 and +1, and t3 = 0 suggests a symmetric distribution. L-Kurtosis (t4) is a measure of the peakedness or flatness of the frequency distribution curve near its centre. Formulas that define the L-moment statistics are presented in Hosking and Wallis [4].
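For readers who want to see the statistics above in computable form, the sketch below estimates sample L-moment ratios from probability-weighted moments and then evaluates a discordancy measure for a set of sites, following the general formulation in Hosking and Wallis [4]. It is an illustrative sketch, not the ICI-RAFT code; the function names and the site values are hypothetical, and each series is assumed to contain at least four observations.

```python
import numpy as np

def sample_lmoments(x):
    """Sample L-moments via unbiased probability-weighted moments (n >= 4)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    j = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((j - 1) / (n - 1) * x) / n
    b2 = np.sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((j - 1) * (j - 2) * (j - 3)
                / ((n - 1) * (n - 2) * (n - 3)) * x) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    # t = L-CV, t3 = L-skewness, t4 = L-kurtosis
    return {"l1": l1, "l2": l2, "t": l2 / l1, "t3": l3 / l2, "t4": l4 / l2}

def discordancy(ratios):
    """Discordancy D_i for an (N, 3) array of site (t, t3, t4) values."""
    u = np.asarray(ratios, dtype=float)
    n_sites = u.shape[0]
    dev = u - u.mean(axis=0)
    a_inv = np.linalg.inv(dev.T @ dev)  # inverse of sum-of-squares matrix
    return n_sites / 3.0 * np.einsum("ij,jk,ik->i", dev, a_inv, dev)

# Hypothetical (t, t3, t4) values for six sites, for illustration only.
sites = [
    [0.21, 0.18, 0.15],
    [0.25, 0.22, 0.19],
    [0.19, 0.12, 0.17],
    [0.23, 0.25, 0.14],
    [0.27, 0.20, 0.22],
    [0.40, 0.45, 0.30],
]
for i, d in enumerate(discordancy(sites), start=1):
    print(f"site {i}: D = {d:.2f}")
```

By construction the D values average to 1 across the group, so a site is flagged only when its L-moment ratios lie far from the group centre relative to the spread of the other sites.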

Homogeneity Testing
Heterogeneity measure (H) compares the variation between L-moments for a group of sites and what is expected of a homogeneous region to justify that a

Probability Distribution Selection
The Z-Statistic is a goodness-of-fit measure that assesses the probability distribution that best fits the weighted-average regional L-moment parameters of each site in a homogeneous region (L-Skewness and L-Kurtosis). A preliminary probability distribution can also be approximated and visualized using an L-moment diagram (L-Kurtosis vs. L-Skewness), with the best distribution defined as the distribution curve closest to the majority of the sample data points [50].
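The L-moment ratio diagram comparison can be approximated numerically by measuring the vertical distance between the regional average point (L-Skewness, L-Kurtosis) and each candidate distribution curve. The sketch below uses the closed-form L-Kurtosis relationships for the Generalized Logistic (GLO) and Generalized Pareto (GPA) distributions only; other three-parameter distributions require polynomial approximations that are omitted here. The function names and the regional values are hypothetical, and this pre-screen is not a substitute for the simulation-based Z-statistic.

```python
def glo_tau4(tau3):
    """L-kurtosis of the Generalized Logistic distribution."""
    return (1.0 + 5.0 * tau3 ** 2) / 6.0

def gpa_tau4(tau3):
    """L-kurtosis of the Generalized Pareto distribution."""
    return tau3 * (1.0 + 5.0 * tau3) / (5.0 + tau3)

CURVES = {"GLO": glo_tau4, "GPA": gpa_tau4}

def closest_distribution(t3_regional, t4_regional):
    """Candidate whose curve is vertically nearest to the regional point."""
    distances = {
        name: abs(t4_regional - curve(t3_regional))
        for name, curve in CURVES.items()
    }
    best = min(distances, key=distances.get)
    return best, distances

# Hypothetical regional average L-moment ratios, for illustration only.
best, distances = closest_distribution(0.24, 0.21)
for name, dist in distances.items():
    print(f"{name}: |t4 - tau4(t3)| = {dist:.4f}")
print("closest curve:", best)
```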

Data Characteristics and Preliminary Analysis
Data preparation results are presented in Table 2. The lag-1 correlation results show that the serial correlation at each site varied from −0.002 to 0.516 (−1 = perfect inverse correlation; 1 = perfect correlation; and 0 = no correlation), suggesting the absence of a strong relationship among peak annual discharges at each site. The Grubbs and Beck test detected no low outliers, but high

Identification of Homogeneous Regions and Determination of Discordancy Measure
Regional discordancy (D) and heterogeneity (H) statistics are presented in Table 3, while site-specific results of the same statistics for sites within each sub-region are presented in Table 4. An H-statistic of 8.89 (i.e. H > 1) was realized for the entire catchment area, suggesting heterogeneity [52]. Consequently, the region was divided into three sub-regions and tested for homogeneity (Table 3); the resulting L-moment statistics and discordancy of sites constituting each sub-region are presented in Table 4. The H-Statistics for sub-regions 2 and 3 showed homogeneity (H < 1), while sub-region 1 exhibited heterogeneity (H >> 1). In terms of discordancy, only Idogo was discordant (D = 4.2232) and, therefore,
Table 3. L-moments and homogeneity statistics per sub-region.

Regional Distribution and Goodness of Fit Measures
The L-Moment ratio diagram (Figure 3) displays the relationship between regional average L-skewness and L-kurtosis fitted to varying probability distributions for all three regions. The 3-parameter distribution line/curve closest to L-moment ratio points of sub-regional sites portrays an initial deduction concerning an optimal distribution [10] [54]; in this case, the Generalized Logistic (GLO) curve satisfies this approximation. A 3-parameter distribution is selected instead of its 2-parameter counterpart due to its robustness and ability to optimally represent the probability distribution parameters [55].
In addition to the L-Moment ratio diagram, the Z-statistic provides a statistical approach to identify the distribution that best fits the data for each sub-region. Table 5 shows the Z-statistics for all distributions for each sub-region and reveals that GLO is the best-fitting distribution and satisfies the 90% confidence criterion (|Z| ≤ 1.64), as prescribed by Hosking and Wallis [4], for regions 2 and 3. The result is consistent with deductions from the L-Moment ratio diagram.
The identified optimal probability distribution corresponds with those applied in previous single-site and regional studies undertaken for catchment areas close to our study area [49] [50]. The lack of an acceptable fit for all combined sites and for region 1 (|Z| > 1.64) suggests that the individual sites within these groupings are not all defined by the same distribution, owing to their apparent heterogeneity.

Regional Flood Frequency and Parameter Estimation
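After identifying GLO as the optimal probability distribution for regions 2 and 3, a flood frequency relationship was established to derive flood magnitudes. The GLO probability distribution function is given by:

$$F(x) = \frac{1}{1 + e^{-y}}, \qquad y = \begin{cases} -k^{-1}\ln\left[1 - \dfrac{k(x - \xi)}{\alpha}\right], & k \neq 0 \\ \dfrac{x - \xi}{\alpha}, & k = 0 \end{cases} \qquad (2)$$

where ξ, α and k depict the location, scale and shape parameters, respectively [4].
The range of x is defined as $-\infty < x \le \xi + \alpha/k$ if $k > 0$; $-\infty < x < \infty$ if $k = 0$; and $\xi + \alpha/k \le x < \infty$ if $k < 0$. The corresponding growth curve is:

$$Z_T = 1 + \frac{\beta}{k}\left[1 - (T - 1)^{-k}\right] \qquad (3)$$

where $\beta = \alpha/\xi$, T is the return period and $Z_T$ is the growth curve of T.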
GLO distribution parameters estimated for each sub-region using L-moments were substituted into Equation (3) to estimate the sub-regional growth factors for ungauged and sparsely gauged basins and are presented in Table 6.
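The parameter estimation and quantile step can be sketched as follows, using the standard L-moment estimators for the GLO distribution given by Hosking and Wallis [4]. The regional L-moments used here are hypothetical, and the sketch assumes that the regional series has been scaled by the index flood so that the resulting quantiles can be read as growth factors; it is not the ICI-RAFT implementation.

```python
import math

def glo_parameters(l1, l2, t3):
    """GLO location, scale and shape from L-moments (requires |t3| < 1)."""
    k = -t3
    if abs(k) < 1e-9:
        return l1, l2, 0.0  # logistic limit as the shape tends to zero
    alpha = l2 * math.sin(k * math.pi) / (k * math.pi)
    xi = l1 - alpha * (1.0 / k - math.pi / math.sin(k * math.pi))
    return xi, alpha, k

def glo_quantile(xi, alpha, k, T):
    """GLO quantile with non-exceedance probability F = 1 - 1/T (T > 1)."""
    if T <= 1:
        raise ValueError("return period T must exceed 1 year")
    if k == 0.0:
        return xi + alpha * math.log(T - 1.0)
    return xi + alpha / k * (1.0 - (T - 1.0) ** (-k))

# Hypothetical regional L-moments of index-scaled data (l1 = 1).
xi, alpha, k = glo_parameters(l1=1.0, l2=0.20, t3=0.25)
print(f"xi = {xi:.3f}, alpha = {alpha:.3f}, k = {k:.3f}")
for T in (2, 5, 10, 25, 50, 100):
    print(f"T = {T:>3} years: growth factor = {glo_quantile(xi, alpha, k, T):.3f}")
```

At T = 2 the quantile equals the location parameter ξ, which is the median of the GLO distribution; multiplying each growth factor by a site's index flood gives the site flood estimate.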

Climate Indices Correlation with Peak Annual Discharge
Ijaka-Oke, Oba/Oyo-Ogbomosho, Ofiki/Igangan-Ilere road and Ofiki-Igangan were identified by breakpoint and trend analysis to be heterogeneous and were further investigated to ascertain the influence of climate variability by correlating peak annual discharge with global climate indices. Regional and direct flood frequency estimates were then determined in ICI-RAFT using the highest correlation ... [20]. The remaining variability in peak annual discharge can be linked to factors such as local catchment properties, land use/cover changes and hydrodynamics [56] [57], which are beyond the scope of this study.
MJO is known to be a strong driver of rainfall variability in tropical regions [58], governing atmospheric pressure and temperature around the equator. The MJO significantly influences regional rainfall [59] [60] and was reported to have influenced the rainfall dynamics that triggered the unprecedented 2012 flood event in Nigeria [61]. Arnold et al. [62] and Caballero and Huber [63] further suggested that, due to the dependence of MJO on Sea Surface Temperature (SST) and Outgoing Longwave Radiation (OLR), MJO activity may increase in response to global warming, resulting in more frequent MJO-influenced flood events.

Climate Variability Effect and Flood Quantile Estimation
Results presented in Table 7 and Figure 5 show flood frequency estimates derived when all available data points in the hydrological time series are used for regional and direct flood frequency estimation, as well as when only data points that correlate with and satisfy the limits set for the MJO climate variability index are used, thus accounting for the influence of climate variability. These results suggest that climate variability accounted for increases in flood magnitude of 0% to 35% for regional and direct flood estimates, and demonstrate that multi-decadal changes in ocean-atmosphere conditions can influence both small and large floods [63]. Also, the influence of climate variability was most evident at sites that exhibited a higher correlation with HCI (i.e. Ofiki-Igangan, Ofiki/Igangan-Ilere road and Oba/Oyo-Ogbomosho). These results are generally consistent with those of other studies in which flood estimates that accounted for climate variability were higher than those estimated under the assumption of stationarity [65].
Table 7. Flood frequency estimates per return period (regional and direct, and considering climate variability).
The criss-cross plot pattern observed at Ijaka-Oke when climate variability is included in regional flood frequency estimation suggests that caution must be taken when integrating climate variability into FFA [65], especially when the relationship with climate indices is weak (R² = 0.28). Additionally, the significance of homogeneity, rather than of trends, is identified as the key indicator of non-stationarity [66], as evident at the Ijaka-Oke gauging station (p-values: trend = 0.001, homogeneity = 0.081).
Furthermore, Figure 4 reveals that the historical maximum flood experienced at each site in the OORB is less than the 1-in-100 year flood guideline stipulated for flood management planning in Nigeria [53], thereby reinforcing the need to implement flood management measures (both structural and non-structural) based on a 1-in-100 year flood to curtail recurring flood impacts.

Conclusions
The impact of flooding in Nigeria has increased over the last two decades, resulting in the displacement of persons, disruption of socio-economic activities, damage to infrastructure and loss of lives. Therefore, efficient flood risk management is urgently needed to reduce the vulnerability and exposure of the local population and assets. Flood frequency analysis, which aims to determine flood magnitudes for varying return periods, is usually the first step towards flood risk management. This is, however, challenging in many developing and remote regions, including the Ogun-Osun River Basin of Nigeria, where financial, technical and organizational drawbacks have left insufficient historical hydrological data.
We have presented a robust flood estimation approach based on L-moment regional flood frequency analysis that combines multiple short historical hydrological records to curb aleatoric uncertainty. Building on the evidence of climate influence on the changing hydrological regime in the region, this study accounts for the effect of climate variability on flood frequency estimates through climate indices.
Two homogeneous regions are identified based on a clustering algorithm and statistical tests, while the three-parameter GLO distribution is identified as the best-fit distribution for flood frequency analysis in the 2 sub-regions based on L-moment ratio diagrams and the goodness-of-fit test (Z-statistic). Also, the Madden-Julian Oscillation (MJO) is identified as the most influential climate index for the region, resulting in increased flood magnitude for direct and regional flood frequency estimates. This further reinforces the need to integrate climate variability into flood frequency analysis, as more climate-driven events are expected due to global warming.
In conclusion, integrating climate variability into regional and direct flood frequency estimation results in more robust outcomes, which are a useful input into the hydraulic modelling and flood risk mapping needed to inform resilient structural and non-structural flood risk management interventions. However, for the results of this study to be transferable to ungauged areas within the homogeneous regions, further analysis is needed to determine the physiographical parameters required to establish the best relationship with the index flood of sites in homogeneous sub-regions; for example, using approaches such as artificial neural networks (ANN).