Hydrological Modeling: A Better Alternative to Empirical Methods for Monthly Flow Estimation in Ungauged Basins
1. Introduction
Water resources are required for agricultural, industrial, and domestic activities and for environmental preservation [1]. With population growth and accelerating urbanization, industrialization, and commercial development, demand for water of sufficient quantity and quality will continue to increase [2] [3] [4]. The design of all water-related structures, such as dams, highway bridges, and embankments, consists of three basic components: hydrologic design, hydraulic design, and structural design. Hydrologic design deals with estimating the quantities of water to be handled at the site of the structure in terms of time distribution, time of occurrence, and frequency of occurrence [5]. A streamflow time series is, therefore, one of the most important datasets required for effective water resource planning and management at both local and national scales [6]. However, measured flow data are in many cases either inadequate or not available at all [7] [8]. Such situations create challenges not only for the optimal use of water resources in ungauged river basins for development works such as domestic water supply and sanitation, irrigation, and hydropower, but also for flood control works [9] [10]. Underestimation of flows could lead to the rejection of attractive projects, whereas overestimation could have serious implications for the physical infrastructure and overall economic feasibility of projects [4] [11]. Accurate flow estimates are, therefore, necessary at ungauged basins where water resources projects are developed.
Although the global scientific community has put substantial effort into resolving the issue of flow estimation in ungauged basins/sites, no universal solution is available to date [12]. Various methods are in use in different parts of the world to deal with this issue. One of the oldest methods of generating flow data is the use of regression equations developed at the regional level [7] [13] [14] [15]. Razavi and Coulibaly [6] reviewed regional methods and highlighted that methods using different combinations of physiographic information and meteorological attributes, among others, predicted streamflow in ungauged basins/sites better. They listed catchment area, elevation, basin slope, rainfall, and temperature as the main parameters used in those methods. Another popular approach is the transposition of gauged streamflow data to ungauged sites, for example by the Drainage Area Ratio (DAR) method [16] [17]. It is based on the assumption that the streamflow at the ungauged site can be estimated by multiplying the streamflow at the gauging site by the ratio of the drainage area of the ungauged site to the drainage area of the gauging site [17]. As it needs only the catchment areas and the observed streamflow at the gauged station, it is considered one of the easiest methods of flow prediction and has therefore been widely used [16]. One variant of the DAR method is MDAR (Multiple gauging stations Drainage Area Ratio). In the MDAR method, a weighted sum of the flows at more than one streamflow gauging station is used to estimate the flow at the site of interest [18]. Incorporating the ratio of the basin rainfall of the ungauged basin to that of the gauged one as a multiplier in the DAR method has been considered an improved version of the DAR method [17] [19]. This method can be called the General Transposition (GT) method.
The hydrological simulation method is a numerical method in which a hydrologic model, a simplified software representation of the natural rainfall-runoff process within a catchment boundary, is used to generate streamflow data at the site of interest from known meteorological data. The hydrological model is first calibrated and validated at a gauged basin, and the calibrated parameters are then applied appropriately at ungauged sites within the modeling domain to simulate the flows [8] [20]. Usually, several statistical indicators as well as visual inspection of the results (hydrographs and the water balance distribution in particular) are relied upon to determine the performance and robustness of the model.
Since the Stanford Watershed Model, reported by Crawford and Linsley in 1966, made simulation of the entire hydrologic cycle a reality, modeling at large spatial scales and at small temporal scales [21] has become possible with the exponential growth in hardware and software capabilities over the last few decades [22]. The ability to use precise satellite data such as precipitation in hydrological models has further improved their performance, and thus their overall applicability, considerably around the globe. In recent years, hydrological models have become popular in Nepal for assessing water availability, for planning purposes, and for examining the impact of climate change on river hydrology [10] [23] - [28]. However, they are confined mainly to academic research. While the world is using artificial intelligence as part of a data-driven approach to assist watershed modeling for streamflow generation [29], most project-level studies in Nepal still use coarse conventional methods at ungauged sites, especially in studies of hydropower projects of different scales [19] [30] [31] [32]. As a step toward changing this practice, SWAT (Soil and Water Assessment Tool), a public domain hydrological model capable of modeling Nepalese catchments [23] [26] [27] [33], was used in this study to estimate the flow at ungauged sites, and the results were compared with those of other commonly used methods, viz. WECS/DHM 1990, NEA 1997, DHM 2004, DAR, and its variant, the GT method.
The Budhigandaki River Basin (BRB) of Nepal was chosen for the study. Six popular performance evaluation parameters, viz. coefficient of determination (R2), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Percentage Volume Bias (PBIAS), Nash-Sutcliffe Efficiency (NSE), and Kling-Gupta Efficiency (KGE) [16] [34], were used to evaluate the flow estimation methods considered in this study. A Global Performance Index (GPI) was introduced for the overall evaluation of the methods, and the methods were ranked based on their GPI values. Hydrological simulation was found to rank best among the methods considered.
2. Study Area
The Budhigandaki River Basin (BRB) is situated in the central part of Nepal, between 27˚50' and 29˚00'N latitudes and 84˚30' and 85˚10'E longitudes (Figure 1). It has an elongated shape with its main axis oriented north-south. Its length is about 113 km, while its width ranges between 15 and 30 km. The basin elevation ranges from 315 meters above sea level (masl) at the Budhigandaki-Trishuli confluence to 8163 masl at Mount Manaslu, the 8th highest peak in the world [35], with a mean basin elevation of 3723 m. The basin area thus falls in two physiographic regions: the Middle Mountains and the Himalaya [36]. It is a part of the Narayani drainage system, bordered in the north by the vast Tibetan Plateau, in the south and east by the Trishuli River basin, and in the west by the Marsyangdi River basin.
The reference flow gauging station is at Arughat (Department of Hydrology and Meteorology, DHM station #445), at an elevation of 485 masl. The catchment area of the BRB at this station is 3863 km2, while it is 4985 km2 at the Budhigandaki dam site (Figure 1).
3. Theoretical Background
When a water resources development project is planned and implemented in an ungauged catchment, different methods are generally used to estimate the flow at the project site. Among the resulting estimates, one set of values is chosen for design purposes based on the prevailing site conditions and the judgment of the hydrologist. The most popular methods used to estimate mean monthly flow at ungauged sites are given below.
Figure 1. Location map of the Budhigandaki River Basin.
3.1. Hydrological Simulation Method
Hydrological models are broadly categorized by their spatial discretization (lumped, semi-distributed, fully distributed), period of simulation (event-based or long-term), and other complexities associated with data requirements, governing equations, and licensing. They have been gaining popularity in recent times. HEC-HMS, SWAT, MIKE SHE, MIKE NAM, and VIC, among others, are some popular hydrological models used globally for assessing flows [29]. The SWAT (Soil and Water Assessment Tool) model, which has been shown to simulate hydrological processes satisfactorily in Nepalese catchments [23] [26] [27] [33], was used for simulating the flows of the BRB in this study.
SWAT is a process-based semi-distributed hydrological model that is capable of simulating the impact of land management practices on flow, sediment and agricultural chemical yields in basins with varying soils, land use and management conditions [37]. Conceptually, SWAT divides a basin into sub-basins and further into Hydrological Response Units (HRUs). An HRU is a unique combination of land use, topographical and soil characteristics in a sub-watershed. SWAT simulates hydrology, vegetation growth and management practices at the HRU level [26]. SWAT simulates the hydrologic cycle based on the water balance equation as expressed in Equation (1).
$$SW_t = SW_0 + \sum_{i=1}^{t}\left(R_{day} - Q_{sur} - E_a - w_{seep} - Q_{gw}\right) \tag{1}$$
where:
SWt is final soil water content (mm); SW0 is initial soil water content on day i (mm); t is time (day); Rday is amount of precipitation on day i (mm); Qsur is amount of surface runoff on day i (mm); Ea is amount of evapotranspiration on day i (mm); wseep is amount of water entering into the vadose zone from the soil profile on day i (mm) and Qgw is amount of return flow (from groundwater) on day i (mm).
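The water balance can be illustrated with a minimal Python sketch that accumulates the daily terms defined above: precipitation adds to the soil water, while surface runoff, evapotranspiration, seepage to the vadose zone, and return flow subtract from it. The function name and the three-day input values are hypothetical and serve only to show the bookkeeping; this is not SWAT's actual implementation.

```python
def final_soil_water(sw0, r_day, q_sur, e_a, w_seep, q_gw):
    """Return the final soil water content SW_t (mm).

    sw0 is the initial soil water content (mm); the other arguments are
    equal-length daily series (mm) of precipitation, surface runoff,
    evapotranspiration, seepage to the vadose zone, and return flow.
    """
    series = (r_day, q_sur, e_a, w_seep, q_gw)
    if len({len(s) for s in series}) != 1:
        raise ValueError("all daily series must have the same length")
    sw = sw0
    for r, qs, ea, ws, qg in zip(*series):
        # Daily balance: input minus the four loss terms.
        sw += r - qs - ea - ws - qg
    return sw


# Hypothetical three-day example, all values in mm.
sw_t = final_soil_water(
    sw0=100.0,
    r_day=[10.0, 0.0, 5.0],
    q_sur=[2.0, 0.0, 1.0],
    e_a=[3.0, 3.0, 3.0],
    w_seep=[1.0, 0.5, 0.5],
    q_gw=[0.5, 0.5, 0.5],
)
print(f"SW_t = {sw_t:.1f} mm")
```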
3.2. WECS/DHM 1990 Method
The Water and Energy Commission Secretariat (WECS) and Department of Hydrology and Meteorology (DHM), Government of Nepal (GoN) [13] proposed a regression equation to estimate the long term mean monthly flow at an un-gauged site given in Equation (2).
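$$Q_{mean} = C \times \left(A_{total}\right)^{\alpha} \times \left(A_{<5k} + 1\right)^{\beta} \times \left(MWI\right)^{\gamma} \tag{2}$$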
where, Qmean is the mean monthly flow (m3/s); Atotal is the total catchment area (km2); A<5k is catchment area below 5000 masl elevation (km2); MWI is monsoon wetness index (total rainfall of the catchment from June to September in mm); C is a regression constant; and α, β and γ are constants derived from the regression analysis for each month (supplementary, S-1).
3.3. NEA 1997 Method
The Nepal Electricity Authority (NEA), GoN, proposed another regression-based method to estimate the mean monthly flow at an ungauged site [30]. It is given in Equation (3).
$$Q_{mean} = C \times \left(A_{total}\right)^{\alpha} \times \left(MWI\right)^{\gamma} \tag{3}$$
where the constants C, α and γ for this method are given in S-2.
3.4. DHM 2004 Method
The DHM developed Equations (4a and 4b) for the estimation of monthly flow [14]. For some months logarithmic transformation gave a better estimate while in the other months square root transformation performed better. Monthly flow estimation equation with logarithmic transformation takes the following form:
(4a)
Monthly flow estimation equation with square root transformation takes the following form:
(4b)
where, AvgElev is average elevation of the catchment (masl); AWI is annual wetness index (mm); A<3k is catchment area below 3000 masl elevation (km2); and ε, ρ, μ and δ are the constants derived from regression analysis; their values are given in S-3. For March, April and May, square root transformation is better while for the other months, logarithmic transformation gives better estimates [14].
3.5. Drainage Area Ratio (DAR) Method
The DAR method is a simple method based on the assumption that the specific discharge calculated using the data from a flow gauging station remains constant within the basin [16] [17]. It is expressed as in Equation (5).
$$Q_{e\text{-}site} = Q_{gs} \times \frac{A_{site}}{A_{gs}} \tag{5}$$
where Qe-site is the estimated flow at the site of interest (m3/s); Qgs is the observed flow at gauging station (m3/s); Ags and Asite are the catchment areas (km2) at the gauging station and site of interest respectively.
3.6. General Transposition (GT) Method
The GT method can be considered an improved version of the DAR method, as it accounts for rainfall in addition to the drainage area. Although different variations of this method are found in application [19] [31], the simple form given in Equation (6) has been used in this study.
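$$Q_{e\text{-}site} = Q_{gs} \times \frac{A_{site}}{A_{gs}} \times \frac{P_{avg\text{-}site}}{P_{avg\text{-}gs}} \tag{6}$$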
where Pavg-site and Pavg-gs are the annual average precipitation values (mm) of the basin up to the site of interest and the gauging station respectively.
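The two transposition methods can be sketched in a few lines of Python. DAR scales the gauged flow by the ratio of the site area to the gauging station area, and GT additionally multiplies by the ratio of annual average precipitation at the site to that at the gauging station. The catchment areas below are those reported for Arughat and the BGHEP dam site in Section 2; the flow and precipitation values are hypothetical placeholders, not data from this study.

```python
def dar_flow(q_gs, a_gs, a_site):
    """Drainage Area Ratio estimate of flow at the site (m3/s)."""
    return q_gs * a_site / a_gs


def gt_flow(q_gs, a_gs, a_site, p_gs, p_site):
    """General Transposition estimate: DAR scaled by the rainfall ratio."""
    return dar_flow(q_gs, a_gs, a_site) * p_site / p_gs


A_ARUGHAT = 3863.0  # km2, gauging station (Section 2)
A_DAM = 4985.0      # km2, BGHEP dam site (Section 2)

q_arughat = 100.0   # m3/s, hypothetical monthly flow at Arughat
p_arughat = 1500.0  # mm, hypothetical annual average basin precipitation
p_dam = 1650.0      # mm, hypothetical annual average basin precipitation

q_dar = dar_flow(q_arughat, A_ARUGHAT, A_DAM)
q_gt = gt_flow(q_arughat, A_ARUGHAT, A_DAM, p_arughat, p_dam)
print(f"DAR: {q_dar:.1f} m3/s, GT: {q_gt:.1f} m3/s")
```

Swapping the roles of the two sites gives the reverse transposition, from the dam site to Arughat.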
4. Data Collection and Analysis
Spatial data (digital elevation model, land use land cover map and soil map) and hydro-meteorological time series data (temperature, rainfall and discharge) were required for this study. The collected data types and their uses are given in Table 1. The Digital Elevation Model (DEM) and soil map were downloaded from the Shuttle Radar Topography Mission (SRTM) and the SOTER soil map site respectively, while the Land Use and Land Cover (LULC) map was obtained from the International Center for Integrated Mountain Development (ICIMOD), Nepal, the Department of Water Resources and Irrigation (DoWRI) and the district soil map of the Nepal Agriculture Research Council (NARC). Precipitation, maximum and minimum temperature, and discharge of the Budhigandaki River at Arughat (#445) were collected from the Department of Hydrology and Meteorology (DHM), GoN, while discharge at the Budhigandaki Hydroelectric Project (BGHEP) dam site was collected from the BGHEP project office. Only two years of data, collected by the BGHEP during the feasibility study, were available at this site.
The total catchment area and the areas below 3000 masl and 5000 masl for the Arughat gauging station and the BGHEP dam site were calculated using GIS. The annual wetness index and monsoon wetness index for the basin area were calculated from the available daily rainfall data. Mean monthly values of the observed flows at the Arughat gauging station and the BGHEP dam site were calculated from the available daily flow data.
The hydrological model was set up and calibrated using ArcSWAT and used to simulate the flow in the BRB. Model development was carried out by generating the river networks and sub-basins using the 30 m × 30 m SRTM DEM. Hydrological response units were generated from the land use land cover and soil maps and by providing slope ranges. The model was calibrated (1983-2002) and validated (2003-2012) at Arughat using 30 years of flow data. However, the simulation was run up to 2014 to see how the model performed at the BGHEP dam site, which lies downstream of the Arughat station (Figure 2). The model-simulated flows were extracted at Arughat (1983-2012) and the BGHEP dam site (2013-2014) and compared with the respective observed data.
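The aggregation of daily records described above can be sketched as follows. The monsoon wetness index is taken here as the June to September rainfall total averaged over the available years, and the mean monthly flow as the average of all daily flows falling in each calendar month; both choices, the function names, and the synthetic one-year series are assumptions made for illustration, not the study's actual processing chain.

```python
from collections import defaultdict
from datetime import date, timedelta


def monsoon_wetness_index(daily_rain):
    """Mean June-September rainfall total (mm) over the years in daily_rain.

    daily_rain maps datetime.date -> basin rainfall (mm).
    """
    totals = defaultdict(float)
    for day, mm in daily_rain.items():
        if 6 <= day.month <= 9:
            totals[day.year] += mm
    if not totals:
        raise ValueError("no June-September data available")
    return sum(totals.values()) / len(totals)


def mean_monthly_flow(daily_flow):
    """Mean flow (m3/s) for each calendar month from daily flows."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, q in daily_flow.items():
        sums[day.month] += q
        counts[day.month] += 1
    return {m: sums[m] / counts[m] for m in sorted(sums)}


# Synthetic one-year series: 10 mm/day of rain in June-September only,
# and a flow equal to 10 times the month number.
days = [date(2001, 1, 1) + timedelta(days=k) for k in range(365)]
rain = {d: (10.0 if 6 <= d.month <= 9 else 0.0) for d in days}
flow = {d: 10.0 * d.month for d in days}

mwi = monsoon_wetness_index(rain)
monthly = mean_monthly_flow(flow)
print(f"MWI = {mwi:.0f} mm, mean July flow = {monthly[7]:.1f} m3/s")
```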
Figure 2. Hydrological simulation results at BGHEP dam site.
Mean monthly flows at Arughat and the BGHEP dam site from the WECS/DHM 1990, NEA 1997 and DHM 2004 methods were calculated using the equations given in Sections 3.2 to 3.4. Flows were transposed by the DAR and GT methods to the BGHEP dam site using the observed monthly average flow data of the Arughat station, and vice versa.
5. Performance Evaluation
The performance of the flow estimation methods explained above was evaluated objectively using goodness-of-fit measures that compare the estimated and observed monthly flows. The methods considered in the study were evaluated at Arughat and the BGHEP dam site using the following statistical parameters:
5.1. Coefficient of Determination (R2)
The coefficient of determination measures the strength of the linear relationship between the observed and estimated values. It is calculated by Equation (8).
$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(Q_{o,i}-\bar{Q}_o\right)\left(Q_{e,i}-\bar{Q}_e\right)\right]^2}{\sum_{i=1}^{n}\left(Q_{o,i}-\bar{Q}_o\right)^2\,\sum_{i=1}^{n}\left(Q_{e,i}-\bar{Q}_e\right)^2} \tag{8}$$
where,
$\bar{Q}_o$ = Observed annual average flow (m3/s)
$\bar{Q}_e$ = Estimated annual average flow (m3/s)
$Q_{o,i}$ = Observed monthly average flow of month i
$Q_{e,i}$ = Estimated monthly average flow of month i
n = number of data points. As there are 12 months, n = 12 in this case.
The coefficient of determination (R2) is the square of the coefficient of correlation.
Criterion: The larger the value of R2, the better the performance.
5.2. Mean Absolute Error (MAE)
The MAE measures the average magnitude of the deviation of the estimated values from the observed ones. It is calculated using Equation (9).
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|Q_{e,i}-Q_{o,i}\right| \tag{9}$$
Criterion: The smaller the value of MAE, the better the performance.
5.3. Root Mean Square Error (RMSE)
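The RMSE is the square root of the mean squared difference between the estimated and observed values. It is given as Equation (10).
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_{e,i}-Q_{o,i}\right)^2} \tag{10}$$
Criterion: The smaller the value of RMSE, the better the performance.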
5.4. Percentage Volume Bias (PBIAS)
The PBIAS measures the degree of volume bias between the observed and estimated values. It is given by Equation (11).
$$PBIAS = \frac{V_o - V_e}{V_o} \times 100 \tag{11}$$
where Vo = Observed total volume
Ve = Estimated total volume
Criterion: The smaller the absolute value of PBIAS, the better the performance. The sign of the PBIAS value shows the direction in which the estimated result is biased: a positive value indicates underestimation, while a negative value indicates overestimation.
5.5. Nash-Sutcliffe Efficiency (NSE)
The NSE is a normalized statistic that determines the relative magnitude of the residual variance compared to the variance of the observed values. It is calculated with Equation (12).
$$NSE = 1 - \frac{\sum_{i=1}^{n}\left(Q_{o,i}-Q_{e,i}\right)^2}{\sum_{i=1}^{n}\left(Q_{o,i}-\bar{Q}_o\right)^2} \tag{12}$$
Criterion: The larger the value of NSE, the better the performance.
5.6. Kling-Gupta Efficiency (KGE)
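The KGE is considered an improvement over the widely used NSE, as it considers different types of model/estimation errors, namely the error in the mean, the variability, and the dynamics. The KGE is calculated using Equation (13).
$$KGE = 1 - \sqrt{\left(r-1\right)^2 + \left(\frac{Q_{e\text{-}sd}}{Q_{o\text{-}sd}}-1\right)^2 + \left(\frac{\bar{Q}_e}{\bar{Q}_o}-1\right)^2} \tag{13}$$
where, r = correlation coefficient between the observed and estimated flows
$Q_{o\text{-}sd}$ = Standard deviation of observed flow
$Q_{e\text{-}sd}$ = Standard deviation of estimated flow
$\bar{Q}_o$ and $\bar{Q}_e$ = Mean observed and mean estimated flow
Criterion: The larger the value of KGE, the better the performance.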
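The six goodness-of-fit measures of Sections 5.1 to 5.6 can be computed with a short, dependency-free Python sketch. It follows the standard definitions of these statistics. The function names and the four-value flow series are hypothetical, and the sum of the flow values is used as a stand-in for total volume in PBIAS, which is an assumption.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _corr(obs, est):
    """Pearson correlation coefficient between obs and est."""
    mo, me = _mean(obs), _mean(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    so = sum((o - mo) ** 2 for o in obs)
    se = sum((e - me) ** 2 for e in est)
    return cov / math.sqrt(so * se)


def r_squared(obs, est):
    return _corr(obs, est) ** 2


def mae(obs, est):
    return sum(abs(e - o) for o, e in zip(obs, est)) / len(obs)


def rmse(obs, est):
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def pbias(obs, est):
    """Positive values indicate underestimation."""
    return 100.0 * (sum(obs) - sum(est)) / sum(obs)


def nse(obs, est):
    mo = _mean(obs)
    num = sum((o - e) ** 2 for o, e in zip(obs, est))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den


def kge(obs, est):
    mo, me = _mean(obs), _mean(est)
    sd_o = math.sqrt(sum((o - mo) ** 2 for o in obs) / len(obs))
    sd_e = math.sqrt(sum((e - me) ** 2 for e in est) / len(est))
    r = _corr(obs, est)
    return 1.0 - math.sqrt((r - 1) ** 2 + (sd_e / sd_o - 1) ** 2 + (me / mo - 1) ** 2)


# Hypothetical flows (m3/s): the estimate is 10% below every observation.
obs = [10.0, 20.0, 30.0, 40.0]
est = [9.0, 18.0, 27.0, 36.0]
for name, fn in [("R2", r_squared), ("MAE", mae), ("RMSE", rmse),
                 ("PBIAS", pbias), ("NSE", nse), ("KGE", kge)]:
    print(f"{name}: {fn(obs, est):.3f}")
```

In this example the estimate is perfectly correlated with the observations, so R2 equals 1 even though every value is underestimated. This is why the other measures are needed alongside it.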
5.7. Global Performance Index
Six performance evaluation criteria for the flow estimation methods are discussed above. However, assessing and comparing the individual criteria, and thus establishing a preference for one criterion over another, is beyond the scope of this paper. Therefore, all the criteria are given equal weight when evaluating the overall performance of the flow estimation methods. To find the best method, the methods were assigned values from 1 (lowest performing) to 6 (highest performing), in increments of one, with respect to each parameter. If two or three methods have equal values of a parameter, the average of the corresponding positions is assigned to each of them. For example, if the NSE values calculated for the Hydro-Sim, WECS1990, NEA1997, DHM2004, DAR and GT methods are 0.97, 0.67, 0.95, 0.67, 0.74 and 0.92 respectively, then the values assigned to these methods are 6, 1.5, 5, 1.5, 3 and 4. The WECS1990 and DHM2004 methods have equal NSE values of 0.67, and therefore both are assigned 1.5 (the average of 1 and 2). The mean of the values assigned to each method was then calculated as its Global Performance Index (GPI), as given by Equation (14).
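$$GPI_j = \frac{1}{6}\sum_{k=1}^{6} S_{j,k} \tag{14}$$
where j represents the estimating method (for example, j = 1 for the hydrological simulation method and j = 5 for the DAR method), and $S_{j,k}$ is the value (1 to 6) assigned to method j with respect to performance parameter k.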
Based on the GPI value, the six considered methods are ranked from first to sixth, such that the higher the GPI value, the better the method for monthly flow estimation.
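The scoring and averaging steps can be sketched in Python as follows. The `scores` function assigns 1 to the lowest performing method and n to the highest, averaging the positions of tied methods, and `gpi` takes the mean score of each method over all parameters. The NSE values are those of the worked example in the text; the RMSE values and the function names are hypothetical. For parameters where smaller is better, `higher_is_better=False` is passed, and absolute values should be supplied for PBIAS.

```python
def scores(values, higher_is_better=True):
    """Score methods from 1 (worst) to n (best), averaging tied positions."""
    sign = 1.0 if higher_is_better else -1.0
    order = sorted(range(len(values)), key=lambda i: sign * values[i])
    out = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next method has the same value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for k in range(pos, end + 1):
            out[order[k]] = avg
        pos = end + 1
    return out


def gpi(score_table):
    """Mean score per method; score_table holds one score list per parameter."""
    n_methods = len(score_table[0])
    return [sum(row[j] for row in score_table) / len(score_table)
            for j in range(n_methods)]


methods = ["Hydro-Sim", "WECS1990", "NEA1997", "DHM2004", "DAR", "GT"]
nse_values = [0.97, 0.67, 0.95, 0.67, 0.74, 0.92]   # example from the text
rmse_values = [10.0, 40.0, 15.0, 35.0, 30.0, 20.0]  # hypothetical

nse_scores = scores(nse_values)
rmse_scores = scores(rmse_values, higher_is_better=False)
gpi_values = gpi([nse_scores, rmse_scores])
for name, value in zip(methods, gpi_values):
    print(f"{name}: GPI = {value:.2f}")
```

Only two parameters are used here to keep the example short. With all six parameters, the table passed to `gpi` would hold six score lists.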
6. Results and Discussion
6.1. Comparison of Average Flows at Arughat
The observed and estimated average monthly flows at the Arughat gauging station are given in the supplementary material (S-5) and depicted in Figure 3. The goodness-of-fit values of the different flow estimation methods are presented in Table 2. It should be noted that the simulation results and the WECS/DHM 1990, NEA 1997 and DHM 2004 estimates are evaluated with respect to the long-term averages of observed flow at Arughat (1983-2012: Obs-A). However, the DAR and GT estimates are compared with the average of two years of data (2013-2014: Obs-B) at the BGHEP dam site. This limitation arises because measured data at the BGHEP dam site are not available for other years. It is assumed that this difference has minimal impact on the performance parameters.
Considering the monthly values, R2 is almost the same for all the methods. All the other calculated performance parameters except MAE show that the flows simulated through SWAT hydrological modeling are closest to the observed values. Even for MAE, the calculated value is very close to that of the NEA 1997 method. Thus, the overall performance ranking in the table indicates that hydrological simulation is the best among the methods considered in the study for estimating flows at Arughat at monthly time steps. Further, the NEA 1997 and GT methods ranked second and third respectively.
Figure 3. Comparison of monthly flows at Arughat.
Table 2. Performance parameters of the flow estimation methods at Arughat.
Obs-A: Observed flow data from DHM; Hydro Sim: Simulated flow using SWAT; WECS1990: Flow calculated using the WECS/DHM-1990 method; NEA1997: Flow calculated using the NEA-1997 method; DHM2004: Flow calculated using the DHM-2004 method; DAR: Flow calculated using drainage area ratio method; GT: Flow calculated using general transposition method; Obs-B: Observed flow data from BGHEP at the dam site.
From the viewpoint of the availability of flow for electricity production and the demand for electrical energy, three distinct seasons can be seen in Nepal [19]: Dry (December to May), Monsoon (June to September) and Post Monsoon (October and November). A seasonal evaluation at the Arughat gauging site was also carried out following the methods discussed above to see whether the performance of each method differed from that at monthly time steps. The GPI-based rankings for the dry, monsoon and post-monsoon seasons are presented in Table 3. For the dry and post-monsoon seasons, the GT and NEA 1997 methods respectively showed the best performance, while hydrological simulation came next in both cases. However, hydrological simulation performed better than the other methods in the monsoon season. This is particularly important in most Nepalese catchments, where the runoff is largely rainfall driven. Based on the weighted average GPI, the GT method ranks first overall. The hydrological simulation and NEA 1997 methods rank second and third respectively. The remaining three methods were not found satisfactory in terms of seasonal performance.
6.2. Comparison of Average Flows at Budhigandaki Dam Site
The monthly observed and estimated flows from the different methods at the Budhigandaki dam site are given in S-6 and shown in Figure 4. The values of the performance parameters of those methods are presented in Table 4. They clearly indicate that the hydrological simulation method is the best by all criteria. The GT method ranked second, as shown by the respective values, while the DAR method ranked last.
The seasonal performance of these methods at the Budhigandaki dam site was also analyzed to see if it is consistent with the monthly and annual performance. The calculated seasonal GPI of the six performance parameters and its weighted average are given in Table 5. Although the GT method performs better in the dry season, hydrological simulation performs better in the other two seasons. The weighted GPI is the highest for hydrological modeling, which is similar to the results for the whole year at this site (Table 4). Based on the overall GPI value, hydrological simulation ranked first, GT second and NEA 1997 last.
Table 3. Seasonal performance of flow estimation methods at Arughat.
Table 4. Performance parameters of the flow estimation methods at the Budhigandaki dam site.
Table 5. Seasonal performance of flow estimation methods at the Budhigandaki dam site.
Figure 4. Comparison of monthly flows at the Budhigandaki dam site.
Based on the results presented above, it can be inferred that the hydrological simulation method is the best among the considered methods of flow estimation in the BRB. It should be noted that WECS/DHM 1990, NEA 1997 and DHM 2004 are regional methods whose coefficients are average values established by regression analysis. Thus, these methods may perform better in some catchments and worse in others, depending on how well the coefficients represent the catchment characteristics. Since the DAR method does not account for rainfall variation, it might be better suited to regions where rainfall variation is small. The GT method takes rainfall into account and therefore performs better than the regional and DAR methods. However, it does not consider the spatial variation in soil type and land use/land cover. Hydrological modeling takes all these factors into account, and the flow estimated by this method is better than that from all the other considered methods for ungauged basins. Another advantage of the hydrological simulation method is that it provides continuous data at the site of interest, which could be extremely useful for the hydrological analysis required in any water resources development project. However, good-quality input data (in terms of length, accuracy and reliability) for model setup as well as for calibration and validation are essential for the hydrological model to perform at its best.
7. Conclusion
This study was carried out to evaluate the performance of different flow generation methods, namely DAR, GT, WECS/DHM 1990, NEA 1997, DHM 2004 and hydrological modeling using SWAT. The estimated flows from each method were compared with the observed flows at Arughat and the BGHEP dam site of the Budhigandaki River Basin. Six performance parameters, viz. R2, MAE, RMSE, PBIAS, NSE and KGE, were used to evaluate the considered flow estimation methods. For the overall evaluation of these methods, a Global Performance Index (GPI) was introduced. The results show that hydrological modeling is the best among all the considered methods for estimating flows at monthly timescales. Carrying out hydrological analyses using suitable hydrological model(s) for Nepalese river basins is recommended as a policy prescription to the Government of Nepal, so that the flow at a site of interest can be obtained when required for any water resources development project.