Flood Generation Mechanisms and Potential Drivers of Flood in Wabi-Shebele River Basin, Ethiopia

Floods are a natural process generated by the interaction of various driving factors. This study analyzes flood peak flows, flood frequency at different return periods, and potential driving forces. Six gauging stations with sufficient observed streamflow data and catchment areas ranging from 169 to 124,108 km² were selected, and their peak flows were used to develop the peaks-over-threshold (3rd quartile) magnitude and frequency (POTF) series over ten years of record. Sixteen potential climatic, watershed, and human driving factors of floods in the study area were identified and analyzed with GIS, Pearson's correlation, and Principal Component Analysis (PCA) to select the most influential factors. Eight of them (MAR, DA, BE, VS, sand, forest, AGR, PD) are identified as the most significant variables in the flood formation of the basin. Moreover, mean annual rainfall (MAR), drainage area (DA), and lack of forest cover are found to be the principal driving factors for flood peak discharge in the Wabi-Shebele River Basin. Finally, the study resulted in regression equations that can support the planning and design of infrastructure works in the basin, serving as empirical equations for ungauged catchments to compute Q_MPF, Q_5, Q_10, Q_50, and Q_100 from influential climate, watershed, and human driving factors. The results of these empirical equations are also statistically acceptable, with a highly significant correlation (R² > 0.9).


Introduction
Flooding is the natural hazard that is most widespread around the globe, both in terms of its occurrence and the resulting damage to human lives, environments, and properties [1]. Based on a combination of sources, causes, and impacts, floods are categorized into river (or fluvial) floods, pluvial (or overland) floods, coastal floods, groundwater floods, and floods from the failure of artificial water systems [2]. The major causes of floods include the intensity, duration, and spatial distribution of rainfall on catchments; steep slopes, deforestation, and low soil infiltration capacity; failure of hydrologic structures and sudden release of water from dams; and landslides [3]. Nied et al. [4] also describe the physical controlling factors of floods as including hydrological pre-conditions (e.g., soil saturation, snow cover), meteorological conditions (e.g., amount, intensity, and spatial and temporal distribution of precipitation), runoff generation processes, and river routing (e.g., superposition of flood waves in the main river and its tributaries). These multidimensional causes make floods less predictable and aggravate their impacts worldwide [5].
Floods are mainly driven by climate, catchment, and river characteristics that determine the terrestrial conditions of water or runoff [6]. Climate is a critical driver of fluvial flood hazard. It is in turn highly affected by various features of atmospheric systems, including the water content of the atmosphere, different precipitation characteristics (intensity, duration, total amount, timing, or phase), the antecedent precipitation index (API), and large-scale circulation patterns [7]. This also holds in Ethiopia, where climate and weather characteristics, including torrential rainfall and summer thunderstorms, are strongly linked with flooding [8] [9]. Similarly, catchment characteristics such as variability in drainage area, topography that changes over very short distances, and low infiltration capacity of the ground surface expose catchments to high floods [3]. Although flooding is a natural process, human land-based activities often involve clearing the natural vegetation (for either construction or agriculture), and altering the characteristics of the ground cover can substantially increase runoff and the potential threat from flash floods and river floods [10].
However, the impact levels of flood drivers, the relative significance of the different flood factors, and the relationship between peak discharges and potential drivers remain a critical knowledge gap in tropical river basins.
Moreover, understanding the hydrological process of flooding in different regions and estimating flood quantiles are important limitations in the basin, since most rivers are ungauged. Therefore, this study aims to address these knowledge gaps and hindrances to development by identifying the influential flood generation drivers and establishing relationships between the drivers and peak flood indicators.

Flooding in Wabi-Shebele Basin
Floods that cause the most damage in the Wabi-Shebele River Basin are generated by a few days of heavy rainfall with an average intensity of 10-200 mm/hr and a total precipitation of a hundred millimeters [11]. In the basin, flood events

Flood Discharge Characteristics in Wabi-Shebele River Basin
The peak flows over threshold (3rd quartile) magnitude and frequency (POTF) are analyzed. The analysis is undertaken with a fixed time interval approach to ensure the independence of the extreme values in the time series. Time intervals of 5 to 14 days between successive peaks are used in this study [14] [15]. A total of 89 events are considered in this POTF analysis. The mean peak flow (Q_MPF) is expressed as the arithmetic mean of the peak-over-threshold (3rd quartile) flows for the period of record. The sampled watersheds exhibit low variability in flood-peak discharges. From Table 2, the standard deviation of Q_MPF is less than 35% of the mean at every station except Jijiga, where it is 37% of the mean.
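The peak-over-threshold sampling described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: it assumes a daily flow series, takes the third quartile of that series as the threshold, and uses a hypothetical `min_gap_days` parameter (set here to the lower bound of the 5 to 14 day interval) to keep successive peaks independent. The function names and the sample flows are made up for illustration.

```python
from statistics import quantiles


def peaks_over_threshold(flows, min_gap_days=5):
    """Return (threshold, peaks); peaks is a list of (day_index, flow).

    Assumption: `flows` is a daily series and the threshold is its
    third quartile. Exceedances closer than `min_gap_days` to the last
    retained peak are treated as the same event, and only the largest
    value of that event is kept.
    """
    threshold = quantiles(flows, n=4, method="inclusive")[2]
    peaks = []
    for day, q in enumerate(flows):
        if q <= threshold:
            continue
        if peaks and day - peaks[-1][0] < min_gap_days:
            # Same event: keep the larger of the two exceedances.
            if q > peaks[-1][1]:
                peaks[-1] = (day, q)
        else:
            peaks.append((day, q))
    return threshold, peaks


def mean_peak_flow(peaks):
    """Arithmetic mean of the retained peaks (the Q_MPF of the text)."""
    return sum(q for _, q in peaks) / len(peaks)


# Made-up daily flows in m3/s, for illustration only.
flows = [1, 2, 10, 12, 3, 2, 1, 1, 1, 1, 1, 9, 2, 1, 1, 1]
threshold, peaks = peaks_over_threshold(flows)
q_mpf = mean_peak_flow(peaks)
```

In this toy series the exceedances on days 2 to 4 collapse into a single event, while the exceedance on day 11 is far enough away to count as a separate, independent peak.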
Studies [16] [17] have indicated that a higher standard deviation of flood discharges points to a potential for flash floods. Accordingly, only the northeastern part of the basin, in the Jijiga watershed, is identified as a potential flash flood area.
Floods in the other catchments fall under riverine floods.
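The comparison of standard deviation against the mean is a coefficient of variation. A minimal sketch, assuming the sample standard deviation is intended (the text does not say which form was used); the function name and the input values are illustrative only:

```python
from statistics import mean, stdev


def coefficient_of_variation(peaks):
    """Sample standard deviation of the peak flows divided by their mean."""
    return stdev(peaks) / mean(peaks)


# Made-up peak flows in m3/s, for illustration only.
cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

A station whose ratio stands out from the others, as Jijiga does in Table 2, would be the candidate for flash-flood behaviour under the reasoning cited above.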
The Mann-Kendall test [18] [19], a common non-parametric trend detection method, is used to detect trends in flood discharge. The flood discharges show a significant trend (p < 0.05) at the gauging stations located in the northwestern part of the basin (i.e., Wabi at Dodola and Maribo) and in the downstream part of the basin (i.e., Gode). However, flood discharge at Robe and Erer shows no significant trend, as shown in Table 2.
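For readers unfamiliar with the test, a minimal sketch of the standard Mann-Kendall statistic is given below. It assumes the usual normal approximation with a tie correction and a two-sided p-value; it does not account for serial correlation, and it is not necessarily the exact variant used in the study. The function name is illustrative.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for a sequence of numbers."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    # Variance of S, corrected for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


# Made-up, steadily increasing series, for illustration only.
s_up, z_up, p_up = mann_kendall(list(range(1, 11)))
s_flat, z_flat, p_flat = mann_kendall([3, 3, 3, 3, 3])
```

A trend is reported as significant when the returned p-value is below the chosen level, 0.05 in the text.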

Climate Factors
Precipitation with its different characteristics; intensity, duration, total amount,   Moreover, the rain-bearing cloud coverage over the Wabi-Shebele River Basin is less intense than over other basins in Ethiopia, such as the Abay basin [13]. Therefore, flood events in the Wabi-Shebele basin are highly associated with the frequency

Watershed Factors
The multiple catchment characteristics identified as vital variables affecting flows Similarly, the study conducted by Huang [28] showed that the drainage area affects not only the flow collection but also the time to peak flow. Moreover, soil properties, mainly the soil infiltration rate, are sensitive variables for surface runoff generation. Coarse-textured soils have large, well-connected pore spaces and allow water to infiltrate quite rapidly, while fine-grained soils dominated by clay have low infiltration rates because of their smaller pore spaces [29]. Soils that contain a large amount of sand and silt tend to form a crust and become more compacted, which significantly reduces the infiltration rate. The mean peak flow (Q_MPF) in the Wabi-Shebele River Basin is positively correlated with the sand and loam variables.

Human Activities Factor
Land use, population density, and population growth in the basin are considered the human activity drivers of flooding [13] [22] [30]. The flood magnitude has a high positive correlation with cultivated land and population density and a strong negative correlation with forest cover.
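The sign and strength of such relationships are measured with Pearson's correlation coefficient. The sketch below is a plain implementation of that coefficient; the function name and the numbers are made up purely to show a positive and a negative case and are not values from the basin.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: peak flow rising with cultivated land share
# and falling with forest share.
peak_flow = [10.0, 20.0, 30.0, 40.0]
cultivated = [2.0, 4.0, 6.0, 8.0]
forest = [8.0, 6.0, 4.0, 2.0]

r_cultivated = pearson_r(peak_flow, cultivated)
r_forest = pearson_r(peak_flow, forest)
```

With only six gauging stations, a coefficient computed this way rests on very few points, so its significance should be checked rather than assumed.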

Selection of Potential Flood Drivers in Wabi-Shebele
Using the Variables Correlation Matrix
The magnitude and type of correlation among the potential flood drivers from the climate, watershed, and human variables (i.e., MAR, DA, BS, VL, SF, DD, VS, ER, clay, sand, loam, forest, AGR, and PD) are estimated using the correlation matrix (Table 3) and the scatter plot matrix (Figure 2). To identify significant predictors among the watershed variables, those whose absolute correlation value (R²) exceeds 0.8 are selected.
The eigenvalues represent the quantity of variability in the data, and they are presented in Table 4. The first three PCs explain the largest share of the variability of the data set, with proportions of 45%, 26%, and 19%, respectively. Together they indicate that about 90% of the influence of the flood-inducing factors can be handled with the variables in these three PCs. Therefore, the variables in the three PCs are used to develop the multiple linear regression equations between the flood drivers and the flood indices.
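The choice of three components follows from the cumulative proportion of variance. A minimal sketch of that rule, using the three proportions quoted in the text; the 90% target and the lumping of the remaining variance into one trailing entry are assumptions made for illustration, and the function name is hypothetical.

```python
def components_to_keep(proportions, target=0.90):
    """Return (k, cumulative): the smallest number of leading components
    whose cumulative variance proportion reaches `target`."""
    cumulative = 0.0
    for k, share in enumerate(proportions, start=1):
        cumulative += share
        # Small tolerance guards against floating-point rounding.
        if cumulative >= target - 1e-9:
            return k, cumulative
    return len(proportions), cumulative


# 45%, 26%, and 19% are from the text; the final 0.10 lumps all
# remaining components together (an assumption, not a reported value).
k, explained = components_to_keep([0.45, 0.26, 0.19, 0.10])
```

Here the first two components reach only 71%, so the third is needed to reach about 90%.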
The coefficients in Table 5 show the linear combinations of variables that make up each principal component. Absolute values near zero indicate that a variable contributes little to a PC, whereas larger absolute values indicate variables that contribute more to that component. In the analysis, the first principal component has high negative associations with BE, VS, MAR, and forest and high positive associations with DA and sand, so this component primarily measures the basin altitude difference and land cover. The second component has high positive associations with BS, SF, and loam, so this component primarily measures the slope and shape of the catchment. The third component has high positive associations with sand, AGR, and PD, so this component primarily measures the basin farmland and population density. The loading plot in Figure 3 shows the results for the first two components. In the graph, DA and sand make a small angle (<90°) with the Q_MPF line, meaning that these variables are positively correlated with Q_MPF. The variables forest, PD, BE, and VS make angles close to 180°, meaning that they are negatively correlated with Q_MPF. However, the variables BS, SF, and loam have no significant correlation with Q_MPF in the Wabi-Shebele River Basin.
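Reading a loading plot in this way amounts to looking at the angle between two loading vectors: the cosine of the angle approximates the correlation between the variables when the plotted components capture most of the variance. The sketch below computes that angle for two-dimensional loadings; the function name and the vectors are hypothetical and are not the values behind Figure 3.

```python
import math


def angle_between(u, v):
    """Angle in degrees between two 2-D loading vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp rounding noise
    return math.degrees(math.acos(cosine))


# Hypothetical loadings on (PC1, PC2), for illustration only.
q_mpf = (1.0, 0.0)
same_direction = angle_between(q_mpf, (2.0, 0.0))   # positive correlation
perpendicular = angle_between(q_mpf, (0.0, 1.0))    # little or no correlation
opposite = angle_between(q_mpf, (-1.0, 0.0))        # negative correlation
```

Angles well below 90° suggest a positive relationship, angles near 90° suggest little linear relationship, and angles near 180° suggest a negative one.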

Relationship Development among Drivers and Flood Magnitude
The significance level (p-value) of all drivers is examined (Table 6). The selection criterion is set to p ≤ 0.1 in the regression analysis. Based on this criterion, DA, sand, MAR, and forest are found to be the significant drivers for developing regression equations to estimate Q_MPF. Therefore, Q_MPF can be well estimated from Model 3 in Table 6, where the adjusted R² has the highest value and the p-value is significant (<0.05   Table 6 that watershed characteristics are the most influential factors of flood-peak frequency at the 5- and 10-year return periods. On the other hand, climate and human factors are the most powerful in representing Q_MPF and flood-peak frequency at the 20-, 50-, and 100-year return periods. The regression equations that describe the relationship between the influential driving factors and the flood-peak flows at different return periods are:
Table 6 summarizes the evaluation statistics of the regression models in terms of MAE, NSE, RMSE, and R², based on observed and predicted flood values for all six flood quantiles. A value close to zero is preferable for MAE, as zero indicates no error in prediction. The MAE values for all quantiles lie between 1 and 44. The smallest MAE values are found for the Q_MPF and Q_20 estimations. Except for Q_5 and Q_100, most flood quantile estimations are evaluated as good. Figure 4 shows plots of predicted against observed flood quantiles. These plots generally show good agreement between the predicted and observed flood quantiles. In a few cases, however, the flood magnitude is underestimated. For instance, the observed flows for Q_5 range from 5.08 to 387.67 m³/s, while the predicted values range from 2.04 to 352 m³/s.
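The four evaluation statistics can be computed from paired observed and predicted values as sketched below. These are the standard definitions of MAE, RMSE, and the Nash-Sutcliffe efficiency (NSE); R² is taken here as the squared Pearson correlation, which is an assumption because the text does not state which definition was used. The function name and the sample values are illustrative only.

```python
import math


def evaluate(observed, predicted):
    """Return MAE, RMSE, NSE, and R2 for paired observed/predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    sq_err = sum(e * e for e in errors)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    return {
        "MAE": sum(abs(e) for e in errors) / n,   # 0 means no error
        "RMSE": math.sqrt(sq_err / n),            # 0 means no error
        "NSE": 1 - sq_err / ss_obs,               # 1 means a perfect fit
        "R2": cov * cov / (ss_obs * ss_pred),     # squared Pearson r (assumed)
    }


# Made-up values: predictions are biased high by exactly 1 unit.
stats = evaluate([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

The example shows why several statistics are reported together: a constant bias leaves R² at 1 while MAE, RMSE, and NSE all reveal the error.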

Conclusion
The major flood drivers and flood generation mechanisms in the Wabi-Shebele River Basin were assessed using mean peak streamflow observed at six hydrological gauging stations in the basin. The six gauging stations have catchment areas ranging from 169 to 124,108 km². The peaks-over-threshold (3rd quartile) magnitude and frequency (POTF) series over ten years of record is used to build the flood dataset. Sixteen climatic, watershed, and human factors were extracted and computed using GIS, Pearson's correlation analysis, and Principal Component Analysis (PCA). Eight of them (MAR, DA, BE, VS, sand, forest, AGR, PD) are identified as the most influential variables in the flood formation of the basin. Moreover, mean annual rainfall (MAR), drainage area (DA), and lack of forest cover are found to be the principal driving factors for flood peak discharge in the Wabi-Shebele River Basin. On the other hand, watershed slope (BS), catchment shape factor (SF), and the fractions of loam and clay soil coverage are identified as less influential factors, and the possibility of substituting them with the most influential factors during quantification modeling is ascertained. Moreover, larger watersheds with higher elevation and more agricultural land lead to larger flood-peak flows at all investigated return periods. Finally, regression equations are developed to estimate flood quantiles from the identified driving factors, and these can be used for the planning and design of infrastructure in the basin.