Sensitivity Analysis and Calibration of Hydrological Modeling of the Watershed Northeast Brazil
1. Introduction
The degradation of hydrological resources has made it essential to encourage management practices based on knowledge of spatial and temporal changes in the quantity and quality of water, in order to ensure the suitability of water supplied for different uses. This can be assisted by using hydrological and water quality models to simulate a wide range of processes in hydrographic basins, such as the production of water and sediments and the dynamics of point and nonpoint sources of pollution.
Advances in computational capacity have meant that these models have become increasingly complex and require greater numbers of input parameters. This, in turn, can lead to increased uncertainties in the models. In principle, the parameters of a physically based model do not need to be calibrated, because the input data are obtained from field measurements. Nonetheless, the effects of spatial variability, measurement errors, incomplete description of the elements and processes of a system, and the extrapolation of information for one point to locations for which measurements are not available, amongst other factors, mean that the values of many parameters cannot be known with precision. Uncertainties may also be related to the model inputs, since for economic reasons the input data are often only measured at a limited number of locations. It is therefore necessary to calibrate models of hydrographic basins [1] - [4] .
Using simulations, uncertainties in the management of a watershed can be reduced by evaluating different scenarios before they occur [5] . Water quality models employ a large quantity of parameters, resulting in a range of data sets that can be used to compare with the predictions of the model. Sensitivity analyses are then needed in order to accommodate the large number of parameters and multiple output variables [6] .
The objective of this work was to perform a sensitivity analysis and calibration of flows in the watershed of the Poxim River, using the SWAT distributed hydrological model, in order to provide a tool for evaluation of management practices that could potentially help to reduce diffuse sources of pollution derived from agricultural practices in the hydrographic subbasins.
2. Materials and Methods
2.1. Study Region
The study was performed in the watershed of the Poxim River, an area of 116.11 km² that forms part of the watershed of the Sergipe River (one of the most important rivers in Sergipe State). Located in the east of the State, the Sergipe River basin includes parts of the municipalities of Itaporanga d'Ajuda, Areia Branca, Laranjeiras, Nossa Senhora do Socorro, São Cristóvão, and Aracaju (Figure 1). It is situated between latitudes 10˚55'S and 10˚45'S, and longitudes 37˚05'W and 37˚22'W, and has an overall area of 397.87 km². The main tributaries are the Poxim-Mirim, Poxim-Açu, and Pitanga rivers [7] [8] .
The region is characterized by a humid tropical climate, with a dry season between September and March and a rainy season between April and August. Average annual precipitation varies between 1600 and 1900 mm. Average temperatures are around 23˚C during the colder months (June-August), and around 31˚C during the warmer months (December-February) [9] [10] . Precipitation is greatest near the mouth of the river and can be sparse in some of the headwater regions [11] .
The main surface features in this watershed are degraded areas (1.60%), water bodies (0.15%), sugar cane (18.37%), forest (23.80%), riparian forest (2.21%), pasture (50.23%), residential areas (0.54%), restinga vegetation (3.03%), and saline pools (0.09%) [8] . The soils (Figure 2(a)) consist of eutrophic Litholic Neosol (7.02%), Quartzarenic Neosol (11.78%), Litholic Neosol (16.67%), Gleysol (10.11%), and Red-Yellow Argisol (54.40%) [8] . The slope of the terrain (Figure 2(b)) was described according to the five categories established by [12] . These are flat (0% - 3%), gentle slope (3% - 8%), slope (8% - 20%), steep slope (20% - 45%), and mountainous (>45%).
Figure 1. Location of the watershed of the Poxim River.
2.2. Description of the Model
The SWAT (Soil and Water Assessment Tool) distributed model was developed by the Agricultural Research Service (ARS) of the United States Department of Agriculture (USDA). It enables prediction of the long-term impacts of soil management practices on the water, sediments, and pesticide levels in large hydrographic basins with different types of soils, land use, and management practices [13] . It was employed by the USDA’s Natural Resources Conservation Service during the Conservation Effects Assessment Project (CEAP), created in 2003 to measure the environmental effects of conservation measures implemented at different spatial scales [14] . SWAT has been widely used to simulate processes that occur in the environment, identify the origins of contaminants, predict the likely outcomes of different scenarios, and establish the causes and effects of diffuse sources of pollution [15] .
Based on the topographical characteristics of the terrain, obtained from a digital elevation model (DEM), the model delimits watersheds, defines the drainage network, and describes the spatial variability of the hydrographic basin, by splitting it into smaller units. The basin is first split into subbasins and the network of channels is calculated. Each subbasin is then divided into hydrological response units (HRUs), which are areas with homogeneous characteristics defined after establishing categories in terms of soil use, soil type, and slope. Based on the options available in SWAT, the HRUs can describe different parts of the subbasin in terms of the main types of soil and land use, as well as management characteristics, and can reflect differences in evapotranspiration between different cultivations and soils [13] [16] - [18] .
Surface runoff is predicted separately for each HRU, which then enables the determination of runoff in the hydrographic basin as a whole. This procedure improves accuracy and provides a better physical description of the hydric balance [13] [19] [20] . Hence, all the processes that occur in the landscape are modeled for each HRU within the hydrographic basin, independent of its position in the subbasin [21] . The complete cycles of the nutrients nitrogen and phosphorus in the HRUs can also be modeled using SWAT [13] .
Figure 2. Soil classes (a) and slope classes (b) in the watershed of the Poxim River.
For each HRU, the volume of surface runoff is simulated using the curve number (CN) method of the Soil Conservation Service [22] , which calculates surface runoff as a function of the type and use of the soil, slope, initial soil moisture, and type of management practice [13] . The potential evapotranspiration is determined using the Penman-Monteith equation and is corrected for the type of soil cover and simulated plant growth in order to obtain the actual evapotranspiration rate [13] [23] .
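The curve number calculation can be illustrated with a minimal Python sketch. It assumes the standard metric form of the SCS relations (retention S = 25400/CN − 254 in mm, and an initial abstraction of 0.2 S); the adjustments that SWAT applies to the curve number for soil moisture and slope are not reproduced, and the function name and sample values are hypothetical.

```python
def scs_runoff_mm(precip_mm: float, cn: float, ia_ratio: float = 0.2) -> float:
    """Daily surface runoff depth (mm) from the SCS curve number equation."""
    if not 0 < cn <= 100:
        raise ValueError("cn must lie in (0, 100]")
    # Potential maximum retention S (mm), metric form of the SCS relation.
    retention = 25400.0 / cn - 254.0
    # Initial abstraction Ia: rainfall lost before runoff begins
    # (assumed here to be 0.2 * S, the conventional default).
    initial_abstraction = ia_ratio * retention
    if precip_mm <= initial_abstraction:
        return 0.0  # all rainfall is abstracted, no runoff
    excess = precip_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


if __name__ == "__main__":
    # Hypothetical 50 mm storm on two hypothetical curve numbers.
    for cn in (70.0, 73.5):
        print(f"CN={cn}: runoff = {scs_runoff_mm(50.0, cn):.2f} mm")
```

For a given rainfall depth, a higher curve number gives a smaller retention and therefore more runoff, which is the behavior exploited when the curve number is adjusted during calibration.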
2.3. Input Data
The data required for simulation in SWAT concern topography, soil type, land use, and meteorology. A digital elevation model (DEM) was used to provide the topographical data at a resolution of 90 × 90 m (Figure 3(a)).
Although there has been debate concerning the influence of the DEM on the processes simulated by SWAT [24] - [26] , the same resolution was also used by [27] - [29] , while [30] used a resolution of 100 m. In all cases, SWAT was able to achieve the desired objectives. The DEM utilized was generated from radar data obtained during the Shuttle Radar Topography Mission (SRTM) project [31] . Demarcation of the hydrographic basin, considering the drainage network and the size of the subbasins, employed a minimum channel area of 150 ha [17] . The convergence point of the subbasins was at −10.92 (latitude) and −37.19 (longitude), for which the average daily discharge for the period July 2011 to January 2012 was 1.20 m³∙s−1, with minimum and maximum values of 0.02 and 9.17 m³∙s−1, respectively. Twenty-five subbasins were identified in an overall area of 113.12 km² (Figure 3(b)).
In order to simulate the area and hydrological parameters within each subbasin, SWAT requires data concerning the soil type and land use [32] . Maps of the watershed of the Poxim River containing this information (vegetation type, land use, and soil class), at a scale of 1:400,000, were obtained from the Atlas of Hydric Resources of Sergipe [8] .
Since the hydric balance is one of the physical processes considered by SWAT, the model requires parameterization of the soil [33] . Samples were therefore collected and analyzed in order to determine the physical characteristics of the soil. The parameters that were not measured were estimated using pedotransfer functions (Table 1).
For definition of the HRUs, limits of 10%, 20%, and 10% were established for soil use, soil type, and slope, respectively. These values have been used previously by [34] [35] . The final number of HRUs was 209. After definition of the HRUs, the soil uses and slope classes in the area studied were reclassified by the model, as shown in Table 2.
Figure 3. Digital elevation model (a) and subbasins (b) of the watershed of the Poxim River.
Table 1. Characteristics of the soil in the watershed of the Poxim River.
a EMBRAPA (1997); b FIORIN (2008): pedotransfer function.
Table 2. Soil uses and slope classes in the watershed of the Poxim River, after definition of the HRUs.
The climatic data (daily measurements of precipitation, maximum and minimum temperature, solar radiation, relative humidity of the air, and wind speed) were obtained from the Aracaju meteorological station, operated by the National Meteorological Institute (INMET) (latitude −10.95, longitude −37.04, altitude 4.72 m). The data used were for the period 01/01/1991 to 30/06/2012. Precipitation data were obtained from measurement stations at Itabaiana (latitude −10.70, longitude −37.42, altitude 200 m), São Cristóvão (latitude −10.92, longitude −37.20, altitude 30 m), and Laranjeiras (latitude −10.81, longitude −37.17, altitude 13 m), operated by the Center for Weather Forecasting and Climate Studies of the National Space Research Institute (CPTEC/INPE), for the period 01 January 2000 to 30 June 2012.
2.4. Performance Evaluation of the Model
The performance of the model was evaluated by visual and statistical comparison of the measured and simulated data. The graphical technique provided an initial general overview [36] . Interpretation of the hydrographs first focused on the peak flows and then on the baseflow.
Amongst the various statistical parameters that can be used to evaluate hydrological models, the American Society of Civil Engineers [36] has highlighted the Nash-Sutcliffe efficiency coefficient (NSE) [37] . The statistical criteria used here to evaluate the performance of the model are indicated in Table 3.
The NSE describes the deviation from unity of the ratio of the square of the difference between the observed and simulated values and the variance of the observations [38] . The value of the coefficient can vary from minus infinity to one, with the latter value indicating perfect agreement between the simulated and observed data [36] . A smaller NSE value indicates a poorer fit between the simulated and observed time series data. It is possible to obtain negative values of the NSE, indicating that the average of the observational data provides a better fit to the data, compared to the simulated values; in other words, use of the simulated model values is worse than simply using the observed average [39] - [41] . For NSE values that are negative or very close to zero, the prediction of the model is considered to be poor or unacceptable [42] .
The percent bias (PBIAS) describes the tendency of the simulated data to be greater or smaller than the observed data, expressed as a percentage [41] . The optimum PBIAS value is 0.0, and low absolute values indicate that the model simulation is satisfactory. Positive values indicate a tendency of the model to underestimate, while negative values are indicative of overestimation [43] . This test is recommended due to its ability to reveal any poor performance of the model [44] .
Table 3. Criteria for evaluating the performance of the hydrological model, and their corresponding classifications. i: index of the measured and simulated pairs in the time series; n: number of pairs of the measured and simulated variables; Oi: observational data; Si: simulated data; Ō: mean of the observational data.
The root mean square error (RMSE) provides a measure of the average magnitude of the difference between the measured and simulated values; the individual differences can be positive or negative, but the RMSE itself is never negative [38] . A value of 0.0 indicates a perfect fit, with values smaller than half the standard deviation (SD) of the observed values being considered low [45] .
The RSR value, which is the ratio of the RMSE to the standard deviation of the observations, can provide additional information, as recommended by [46] , and can be applied to a variety of different constituents [41] .
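The four statistical criteria described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the commonly used definitions of NSE, PBIAS, RMSE, and RSR, chosen to agree with the verbal descriptions in the text (in particular, positive PBIAS means underestimation); the function names and the sample flows are hypothetical and are not data from this study.

```python
import math
from statistics import mean


def _squared_errors(obs, sim):
    """Sum of squared differences between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    o_mean = mean(obs)
    return 1.0 - _squared_errors(obs, sim) / sum((o - o_mean) ** 2 for o in obs)


def pbias(obs, sim):
    """Percent bias; positive values mean the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def rmse(obs, sim):
    """Root mean square error (always >= 0)."""
    return math.sqrt(_squared_errors(obs, sim) / len(obs))


def rsr(obs, sim):
    """Ratio of the RMSE to the standard deviation of the observations."""
    o_mean = mean(obs)
    return math.sqrt(_squared_errors(obs, sim)) / math.sqrt(
        sum((o - o_mean) ** 2 for o in obs)
    )


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]   # hypothetical daily flows (m3/s)
    simulated = [2.0, 3.0, 4.0, 5.0]  # hypothetical model output
    print(nse(observed, simulated), pbias(observed, simulated),
          rmse(observed, simulated), rsr(observed, simulated))
```

Under these definitions the square of the RSR equals 1 − NSE, so the two criteria rank simulations identically and differ only in how their thresholds are expressed.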
There are no existing standards describing the range of values of the statistical parameters that would indicate acceptable performance of the model [40] . The criteria adopted were therefore based on a review of the literature (Table 3).
3. Results and Discussion
3.1. Sensitivity Analysis of the Model Parameters
The parameters used for the flow were selected based on the literature and the SWAT documentation. The initial simulation to determine the sensitivity of the model to different parameters was performed using default parameter values. The values were then varied within upper and lower limits established according to the characteristics of each parameter, using three methods. In the first procedure, the initial value of the parameter is modified by adding an increment. The second method consists of multiplying the initial value by a set amount. In the third method, the initial value is substituted by a different value [47] .
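The three ways of varying a parameter (adding an increment, multiplying by a factor, and substituting a new value) can be sketched as follows. This is an illustrative helper only: the function name, the clamping of the result to the lower and upper limits, and the bounds used in the example are assumptions, not part of the SWAT tools.

```python
def apply_change(initial: float, method: str, amount: float,
                 lower: float, upper: float) -> float:
    """Return the modified parameter value, kept within [lower, upper]."""
    if method == "add":          # method 1: initial value plus an increment
        value = initial + amount
    elif method == "multiply":   # method 2: initial value times a factor
        value = initial * amount
    elif method == "replace":    # method 3: initial value substituted
        value = amount
    else:
        raise ValueError(f"unknown method: {method!r}")
    # Assumption: values outside the allowed range are clamped to the limits.
    return min(max(value, lower), upper)


if __name__ == "__main__":
    # Hypothetical initial values and bounds, for illustration only.
    print(apply_change(70.0, "multiply", 1.05, 1.0, 100.0))  # +5% change
    print(apply_change(4.0, "replace", 1.0, 0.5, 24.0))      # new value
```

Multiplicative changes preserve the relative differences between spatially varying values (for example, a parameter that differs by land use), whereas substitution assigns the same value everywhere.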
The sensitivity analysis procedure employed measurement data for the period 1 January 2012 to 30 June 2012 to evaluate the fit between the measured and modeled time series data. This enabled identification of the parameters that were influenced by the characteristics of the hydrographic basin, and those to which the model was most sensitive. Evaluation was then made of the way in which adjusting the value of a parameter affected the model output, in order to identify parameters that might improve the characteristics of the model [48] . Figure 4 illustrates the sensitivities for the parameters affecting the hydrological processes, calculated using the sum of squared errors (SSE) between the measured and simulated daily flow data.
The sensitivity of the model to a parameter is determined using the percentage difference between the output values of the objective function for simulations performed immediately before and immediately after changing the value of a parameter [48] . A higher value indicates greater sensitivity for a given parameter.
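A sensitivity index of this kind can be sketched as below, using the SSE as the objective function. The sketch assumes that the absolute percentage change is used and that the index for a parameter is the mean over successive changes of its value; both choices, and all names, are illustrative assumptions rather than the exact procedure of [48].

```python
def sse(obs, sim):
    """Sum of squared errors between observed and simulated flows."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim))


def percent_change(before: float, after: float) -> float:
    """Absolute percentage change of the objective function after one change."""
    return 100.0 * abs(after - before) / before


def mean_sensitivity(objective_values):
    """Mean percentage change between consecutive objective-function values.

    `objective_values` holds the objective function (e.g. SSE) for a sequence
    of simulations in which only one parameter is changed step by step.
    """
    changes = [percent_change(b, a)
               for b, a in zip(objective_values, objective_values[1:])]
    return sum(changes) / len(changes)


if __name__ == "__main__":
    # Hypothetical SSE values from three successive runs of one parameter.
    print(mean_sensitivity([100.0, 110.0, 99.0]))
```

Ranking the parameters by this index gives an ordering from most to least influential, which can then be cut at a chosen threshold to select the parameters for calibration.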
Thirteen of the twenty parameters submitted to sensitivity analysis showed significant effects on simulation of the flow, and were therefore those to which the model was most sensitive. Assignment of the degree of sensitivity is subjective [39] . Here, most of the parameters to which the model was considered to be sensitive were those with average percentage variation of the value of the objective function greater than 0.05 (Figure 4).
3.2. Calibration of the Model Parameters
The parameters that influenced the surface runoff and base flow were optimized in the manual calibration. As a way of reducing the number of parameters to be calibrated, ordering of the parameters (Figure 4) was used to select only those for which the sensitivity value was ≥0.05. The calibration was performed using the period 01 January 2012 to 30 June 2012. The changes to the parameter values obtained from the calibration process are listed in Table 4.
The curve number (Cn2) was adjusted for the different land uses, including sugar cane, forest, riparian forest, and pasture. This soil water balance parameter enables the model to adjust the soil moisture in order to estimate the surface runoff [49] . It is determined by characteristics of the watershed including the type of soil, hydrological group, soil use and management, and initial moisture, amongst others, and its value varies from 1 to 100. A completely permeable soil would have a Cn2 value of 1, while a totally impermeable soil would have a Cn2 value of 100 [50] .
Figure 4. Results of the sensitivity analysis for the model parameters in terms of the average percentage variation in the value of the objective function.
Figure 5. Hydrograph of the daily flow for the calibration period.
Table 4. Alterations and final parameter values after manual calibration.
High values of Cn2 reflect increased surface runoff and reduced baseflow [51] . The standard Cn2 values defined in the manual of the Soil Conservation Service (SCS) were increased by 5%, which increased the surface runoff and provided a better estimate of soil drainage [52] [53] . The alteration was similar to those reported previously [54] [55] .
The available water capacity of the soil (Sol_Awc) is the volume of water available to plants when the soil is at field capacity [56] . It can be estimated by determining the quantity of water released between field capacity (water content at a soil water matric potential of −0.033 MPa) and the point of permanent wilting (water content at a soil water matric potential of −1.5 MPa) [50] , and has an inverse relation with the components of the water balance. High values of Sol_Awc signify a high capacity of the soil to retain moisture, which reduces the amount of water available for surface runoff and percolation, and therefore affects the production of water [51] [57] . The value of this parameter was increased by 10%, relative to the initial value, in order to obtain convergence between the observational and simulated data.
The soil evaporation compensation factor (Esco) adjusts the depth considered for evaporation from the soil, involving capillary action, crusts, and cracks. A low Esco value reflects an ability of the deeper soil layers to compensate the hydric deficit in the upper layers, resulting in greater evapotranspiration, reduced surface runoff and baseflow, and reduced soil water content [51] [56] [58] . The Esco value lies in the range 0.01 - 1.0. When a high Esco value is used in the model, there is less extraction of the evaporative demand from lower levels, so that evaporation is reduced. Here, the value of this parameter was adjusted to 0.95.
The three parameters described above (Cn2, Sol_Awc, and Esco) govern the behavior of surface waters [59] , and favor the direct contribution of surface runoff to the flow [60] .
The baseflow recession constant (Alpha_Bf) reflects the response of the subterranean flow to changes in recharge. Calibration of this parameter enables better fitting of the hydrograph [34] , and here an increment of +0.95 was added to the standard Alpha_Bf value. According to [13] , Alpha_Bf values of between 0.9 and 1.0 are used for soils that show rapid recharge responses, while values between 0.1 and 0.3 are used for soils with slow responses. The Alpha_Bf parameter is important because in dry periods the flow depends on the contribution of subterranean water, which in turn has a strong relationship with Alpha_Bf [47] .
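The effect of the recession constant can be illustrated with a short sketch. It assumes the simple exponential recession that applies when there is no recharge, Q(t) = Q0·exp(−Alpha_Bf·t) with t in days; the function names and the flow values are hypothetical.

```python
import math


def baseflow_after(q0: float, alpha_bf: float, days: float) -> float:
    """Baseflow after `days` without recharge, from an initial baseflow q0."""
    return q0 * math.exp(-alpha_bf * days)


def half_life_days(alpha_bf: float) -> float:
    """Time (days) for the baseflow to fall to half its initial value."""
    return math.log(2.0) / alpha_bf


if __name__ == "__main__":
    # Compare a slow and a rapid response for a hypothetical 1.0 m3/s baseflow.
    for alpha in (0.1, 0.95):
        print(f"alpha={alpha}: Q after 5 days = "
              f"{baseflow_after(1.0, alpha, 5.0):.3f} m3/s, "
              f"half-life = {half_life_days(alpha):.1f} days")
```

A value close to 1.0 makes the baseflow recede within a few days of the last recharge event, whereas a value near 0.1 sustains the flow for much longer.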
The Gw_Revap coefficient describes the quantity of water that moves from the superficial aquifer to the root zone due to the depletion of soil humidity and the direct capture of underground water by the deep roots of trees and bushes [56] . A value near 0.0 reflects restricted movement of water from the superficial aquifer to the root zone, while a value near unity indicates that the transfer rate is close to the rate of evapotranspiration [50] . An increase of the water return coefficient reduces the baseflow due to increased transfer of water from the shallow aquifer to the root zone. The value of this parameter was changed to 0.03.
The Gw_Delay parameter is the delay time for recharge of the aquifer, representing the time required for water to move from the deepest soil layer (the root zone) and reach the shallow or superficial aquifer [13] . The value of this parameter was changed from 32 to 75. The parameters Alpha_Bf, Gw_Delay, and Gw_Revap govern the response of the subsurface water [59] .
The Surlag parameter determines the response of the watershed [56] and provides a storage factor for basins where there is a delay of more than one day before the surface runoff reaches its final convergence point [55] . The value of this parameter was changed from 4 to 1, corresponding to greater storage of water [49] . Delay in the release of surface runoff acts to smooth the simulated flow hydrograph. However, values smaller than 0.5 are not suitable for the release of surface runoff in a subbasin of the main channel [56] .
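The smoothing effect of Surlag can be sketched as follows. The sketch assumes the lag relation given in the SWAT theoretical documentation, in which the fraction of the available surface runoff released to the channel on a given day is 1 − exp(−Surlag/t_conc), with t_conc the time of concentration in hours; the time of concentration and the runoff series used here are hypothetical.

```python
import math


def released_fraction(surlag: float, t_conc_hours: float) -> float:
    """Fraction of available surface runoff reaching the channel in one day."""
    return 1.0 - math.exp(-surlag / t_conc_hours)


def lag_surface_runoff(generated, surlag: float, t_conc_hours: float):
    """Daily runoff released to the channel for a series of generated runoff."""
    fraction = released_fraction(surlag, t_conc_hours)
    stored = 0.0  # runoff held back from previous days
    released = []
    for q in generated:
        available = q + stored
        out = available * fraction
        stored = available - out  # remainder is carried to the next day
        released.append(out)
    return released


if __name__ == "__main__":
    storm = [10.0, 0.0, 0.0, 0.0]  # hypothetical runoff depths (mm)
    for surlag in (4.0, 1.0):
        series = lag_surface_runoff(storm, surlag, t_conc_hours=8.0)
        print(surlag, [round(v, 2) for v in series])
```

With the smaller Surlag value a smaller share of the runoff is released on the day it is generated, so the peak is lower and the recession is spread over more days.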
Increase of the Slope parameter increases the lateral flow [52] . The hydraulic conductivity of the channel (Ch_K2) governs the movement of water from the riverbed to the subsoil, or vice-versa, in the case of ephemeral or transitory rivers [59] [60] . The Sol_Z parameter defines the thickness of the soil layer, influencing the movement of water in the soil during the processes of redistribution and evaporation of soil water [61] . A 5% increase in the value of this variable served to increase the surface runoff.
The means and standard deviations of the observational and simulated data obtained after the calibration process are provided in Table 5.
The results demonstrated that the performance of the calibrated model was satisfactory, with NSE > 0.75, R² > 0.5, PBIAS < ±10%, RMSE ≤ (SD/2), and 0.00 ≤ RSR ≤ 0.50. The values of the statistical parameters were poorer for the validation step, but still remained within acceptable limits (Table 6). The negative PBIAS value indicates that in the validation step the flow was overestimated, as shown in Figure 6.
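The limits quoted above can be checked mechanically, as in the following sketch. It interprets "PBIAS < ±10%" as an absolute value below 10%, which is an assumption, and the function name and the first set of input values are hypothetical; only the thresholds and the validation PBIAS of −17.70% are taken from the text.

```python
def check_calibration_criteria(nse: float, r2: float, pbias: float,
                               rmse: float, sd_obs: float, rsr: float) -> dict:
    """Return a pass/fail flag for each criterion quoted in the text."""
    return {
        "NSE > 0.75": nse > 0.75,
        "R2 > 0.5": r2 > 0.5,
        "|PBIAS| < 10%": abs(pbias) < 10.0,   # assumed reading of "< ±10%"
        "RMSE <= SD/2": rmse <= sd_obs / 2.0,
        "0 <= RSR <= 0.5": 0.0 <= rsr <= 0.5,
    }


if __name__ == "__main__":
    # Hypothetical statistics, not the values of Table 6.
    flags = check_calibration_criteria(nse=0.80, r2=0.82, pbias=4.0,
                                       rmse=0.40, sd_obs=1.00, rsr=0.45)
    for name, ok in flags.items():
        print(f"{name}: {'met' if ok else 'not met'}")
```

Reporting each criterion separately makes it easy to see which statistic is responsible when a simulation falls outside the limits, as happens for the PBIAS in the validation step.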
A range of NSE values has been reported in the literature. [61] obtained values of 0.66 (calibration step) and 0.87 (validation step) for simulations of the watershed of the Ribeirão Jaguara River in Minas Gerais State, and the performance of the model was classified as very good. [62] found values of between 0.78 and 0.94 for the calibration step in simulations of different scenarios for the watershed of the Araranguá River, and concluded that the fit of the model was satisfactory. [63] obtained a value of 0.76 for the calibration of daily flow data in a simulation of the production of water in the Concórdia watershed in Santa Catarina State, and considered the model performance to be satisfactory. [64] obtained values of 0.997 and 0.85 for calibration and validation steps, respectively, and considered the results highly satisfactory, despite the smaller NSE value for the validation step. The values obtained by [27] for calibration and validation steps were 0.58 and 0.51, respectively, so a lower NSE was again found for the validation step.
In simulations of the Eastern Nile Basin, [65] obtained PBIAS values ranging between −10.9% and 38% (calibration step), and between −25% and 25% (validation step). These values were considered to be good, with the exception of the Abbay subbasins, where the values were unsatisfactory (±20% < PBIAS ≤ ±40%).
Table 5. Results of calibration of the daily flow for the period January-June 2012.
Table 6. Performance criteria values for the calibration and validation procedures.
[66] obtained values smaller than ±15%, indicating that the model provided a good simulation of the components of surface runoff.
[45] reported RMSE values of 0.60 and 0.69 for the calibration and validation, respectively, of daily flow data, indicating an excellent fit between the observational and simulated data. [67] obtained a value of 3.52 for the calibration step, and values of 3.45 (year 2000) and 2.24 (year 2001) for the validation step. These values were considered indicative of acceptable simulation of the flow.
A wide range of RSR values has been reported in the literature. [68] obtained values of 0.47 and 0.60 for calibration and validation steps, respectively, while [69] obtained values of 0.33 (calibration step) and 0.45 (validation step).
It can be seen from Figure 5 that the simulations slightly underestimated the peak flow values for the February to May period, while in June (in the rainy season) the peak flows were slightly overestimated. Nevertheless, the model provided a good simulation of the overall trend in water production. During low water periods, SWAT provided a satisfactory fit, indicating that the model can effectively simulate low flows, as also reported by [61] .
The period between 08/08/2011 and 31/10/2011 was used to validate the model (Figure 6). It can be seen that the peak flows obtained from the observational and simulated data did not coincide, and that the model provided better simulation of the minimum flows (as also found for the calibration period).
The results obtained for the statistical criteria demonstrated that the model was able to describe the hydrological processes in the watershed of the Poxim River. The PBIAS value of −17.70% indicated that, in general terms, the simulation overestimated the flow [43] .
4. Conclusions
The use of sensitivity analysis enabled identification of the most important parameters required to model the hydrological processes in the watershed of the Poxim River, hence reducing the quantity of model parameters that needed to be calibrated. Although attention should be focused on these parameters in future studies, it should be emphasized that the results of sensitivity analysis can be dependent on the period considered for the simulation, especially in the case of short periods [47] .
In the calibration procedure, the model provided a good fit between observational and simulated data for the hydrographic basin, with values of the statistical performance parameters ranging between very good and satisfactory, indicating that the values of the hydrological parameters could be used for this hydrographic basin.
The performance of the model for the validation step indicated that the set of parameters identified during the calibration step could satisfactorily represent the hydrological processes in the river basin, despite the fact that the validation statistics were worse than for the calibration step. This could have been due to the small sample size of the observational flow data. It is clear that the provision of suitable input data is essential in order to obtain a good fit between observational and simulated data. The set of parameters identified here could be used for the simulation and evaluation of other scenarios.
Figure 6. Hydrograph of the daily flow for the validation period.
Acknowledgements
Financial support for this research was provided by the CNPq, IFS and SEMARH/SE.