Sensitivity Analysis and Calibration of Hydrological Modeling of the Watershed Northeast Brazil

Mathematical models of the quantity and quality of water in hydrographic basins enable simulation of a wide variety of processes, including the production of water and sediments, and the dynamics of point and nonpoint sources of pollution. These models have become increasingly complex, requiring large amounts of input data, which can increase the uncertainty of the results of simulations. For this reason, it is essential to perform calibration and validation procedures. The objective of this work was to conduct sensitivity analysis and calibration of a distributed hydrological model (SWAT) applied to the flows of water in the watershed of the Poxim River. Satisfactory performance of the model was indicated by the values obtained for the Nash-Sutcliffe efficiency coefficient (0.77), the percent bias (5.05), the root mean square error (0.48), and the ratio of the RMSE to the standard deviation of the observations (RSR) (0.49). The set of parameters identified here could be used for the simulation and evaluation of other scenarios.


Introduction
The degradation of hydrological resources has made it essential to encourage management practices based on knowledge of spatial and temporal changes in the quantity and quality of water, in order to ensure the suitability of water supplied for different uses. This can be assisted by using hydrological and water quality models to si-

Study Region
The study was performed in the watershed of the Poxim River, an area of 116.11 km² that forms part of the watershed of the Sergipe River (one of the most important rivers in Sergipe State). Located in the east of the State, the Sergipe River basin includes parts of the municipalities of Itaporanga d'Ajuda, Areia Branca, Laranjeiras, Nossa Senhora do Socorro, São Cristóvão, and Aracaju (Figure 1). It is situated between latitudes 10˚55'S and 10˚45'S, and longitudes 37˚05'W and 37˚22'W, and has an overall area of 397.87 km². The main tributaries are the Poxim-Mirim, Poxim-Açu, and Pitanga rivers [7] [8].
The region is characterized by a humid tropical climate, with a dry season between September and March and a rainy season between April and August. Average annual precipitation varies between 1600 and 1900 mm. Average temperatures are around 23˚C during the colder months (June-August), and around 31˚C during the warmer months (December-February) [9] [10]. Precipitation is greatest near the mouth of the river and can be sparse in some of the headwater regions [11].

Description of the Model
The SWAT (Soil and Water Assessment Tool) distributed model was developed by the Agricultural Research Service (ARS) of the United States Department of Agriculture (USDA). It enables prediction of the long-term impacts of soil management practices on the water, sediments, and pesticide levels in large hydrographic basins with different types of soils, land use, and management practices [13]. It was employed by the USDA's Natural Resources Conservation Service during the Conservation Effects Assessment Project (CEAP), created in 2003 to measure the environmental effects of conservation measures implemented at different spatial scales [14]. SWAT has been widely used to simulate processes that occur in the environment, identify the origins of contaminants, predict the likely outcomes of different scenarios, and establish the causes and effects of diffuse sources of pollution [15].
Based on the topographical characteristics of the terrain, obtained from a digital elevation model (DEM), the model delimits watersheds, defines the drainage network, and describes the spatial variability of the hydrographic basin by splitting it into smaller units. The basin is first split into subbasins and the network of channels is calculated. Each subbasin is then divided into hydrological response units (HRUs), which are areas with homogeneous characteristics defined after establishing categories in terms of land use, soil type, and slope. Based on the options available in SWAT, the HRUs can describe different parts of the subbasin in terms of the main types of soil and land use, as well as management characteristics, and can reflect differences in evapotranspiration between different crops and soils [13] [16]-[18].
Surface runoff is predicted separately for each HRU, which then enables the determination of runoff in the hydrographic basin as a whole. This procedure improves accuracy and provides a better physical description of the water balance [13] [19] [20]. Hence, all the processes that occur in the landscape are modeled for each HRU within the hydrographic basin, independent of its position in the subbasin [21]. The complete cycles of the nutrients nitrogen and phosphorus in the HRUs can also be modeled using SWAT [13]. For each HRU, the volume of surface runoff is simulated using the curve number (CN) method of the Soil Conservation Service [22], which calculates surface runoff as a function of the soil type, land use, slope, initial soil moisture, and type of management practice [13]. Potential evapotranspiration is determined using the Penman-Monteith equation and is corrected for the type of soil cover and simulated plant growth in order to obtain the actual evapotranspiration rate [13] [23].
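As an illustration of the curve number relation mentioned above, the following minimal Python sketch uses the standard SCS form, with the retention parameter S expressed in millimetres and an initial abstraction of 0.2 S. The function name and the fixed initial abstraction ratio are assumptions made for illustration. SWAT additionally adjusts the retention parameter with soil moisture, which is not reproduced here.

```python
def scs_runoff_mm(rain_mm: float, cn: float, ia_ratio: float = 0.2) -> float:
    """Daily surface runoff depth (mm) from the SCS curve number relation.

    Illustrative sketch only: `ia_ratio` (initial abstraction as a fraction
    of the retention parameter) is assumed to be the conventional 0.2.
    """
    if not 0 < cn <= 100:
        raise ValueError("curve number must be in (0, 100]")
    # Retention parameter S in mm: S = 25.4 * (1000 / CN - 10)
    retention = 25.4 * (1000.0 / cn - 10.0)
    initial_abstraction = ia_ratio * retention
    # No runoff is generated until rainfall exceeds the initial abstraction.
    if rain_mm <= initial_abstraction:
        return 0.0
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Hypothetical 50 mm storm: a higher curve number yields more runoff.
runoff_cn80 = scs_runoff_mm(50.0, 80.0)
runoff_cn84 = scs_runoff_mm(50.0, 84.0)
```

In this form, raising the curve number lowers the retention parameter, so the same rainfall produces more surface runoff.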

Input Data
The data required for simulation in SWAT concern topography, soil type, land use, and meteorology. A digital elevation model (DEM) was used to provide the topographical data at a resolution of 90 × 90 m (Figure 3(a)).
Although there has been debate concerning the influence of the DEM on the processes simulated by SWAT [24]-[26], the same resolution was also used by [27]-[29], while [30] used a resolution of 100 m. In all cases, SWAT was able to achieve the desired objectives. The DEM utilized was generated from radar data obtained during the Shuttle Radar Topography Mission (SRTM) project [31]. Demarcation of the hydrographic basin, considering the drainage network and the size of the subbasins, employed a minimum channel area of 150 ha [17]. The convergence point of the subbasins was at −10.92 (latitude) and −37.19 (longitude), for which the average daily discharge for the period July 2011 to January 2012 was 1.20 m³·s⁻¹, with minimum and maximum values of 0.02 and 9.17 m³·s⁻¹, respectively. Twenty-five subbasins were identified in an overall area of 113.12 km² (Figure 3(b)).
In order to simulate the area and hydrological parameters within each subbasin, SWAT requires data concerning the soil type and land use [32]. Maps of the watershed of the Poxim River containing this information (vegetation type, land use, and soil class), at a scale of 1:400,000, were obtained from the Atlas of Hydric Resources of Sergipe [8].
Since the water balance is one of the physical processes considered by SWAT, the model requires parameterization of the soil [33]. Samples were therefore collected and analyzed in order to determine the physical characteristics of the soil. The parameters that were not measured were estimated using pedotransfer functions (Table 1).
For definition of the HRUs, thresholds of 10%, 20%, and 10% were established for land use, soil type, and slope, respectively. These values have been used previously by [34] [35]. The final number of HRUs was 209. After definition of the HRUs, the land uses and slope classes in the area studied were reclassified by the model, as shown in Table 2.

The climatic data (daily measurements of precipitation, maximum and minimum temperature, solar radiation, relative humidity of the air, and wind speed) were obtained from the Aracaju meteorological station, operated by

Performance Evaluation of the Model
The performance of the model was evaluated by visual and statistical comparison of the measured and simulated data. The graphical technique provided an initial general overview [36]. Interpretation of the hydrographs first focused on the peak flows and then on the baseflow.
Amongst the various statistical parameters that can be used to evaluate hydrological models, the American Society of Civil Engineers [36] has highlighted the Nash-Sutcliffe efficiency coefficient (NSE) [37]. The statistical criteria used here to evaluate the performance of the model are indicated in Table 3.
The NSE is calculated as one minus the ratio of the sum of squared differences between the observed and simulated values to the sum of squared deviations of the observations from their mean [38]. The value of the coefficient can vary from minus infinity to one, with the latter value indicating perfect agreement between the simulated and observed data [36]. A smaller NSE value indicates a poorer fit between the simulated and observed time series data. Negative values of the NSE indicate that the mean of the observations is a better predictor than the simulated values [39]-[41]. For NSE values that are negative or very close to zero, the prediction of the model is considered to be poor or unacceptable [42].
The percent bias (PBIAS) describes the tendency of the simulated data to be greater or smaller than the observed data, expressed as a percentage [41]. The optimum PBIAS value is 0.0, and low values indicate that the model simulation is satisfactory. Positive values indicate a tendency of the model to underestimate, while negative values are indicative of overestimation [43]. This test is recommended due to its ability to reveal any poor performance of the model [44].
The root mean square error (RMSE) provides a measure of the average magnitude of the difference between the measured and simulated values, and is always non-negative [38]. A value of 0.0 indicates a perfect fit, with values smaller than half the standard deviation (SD) of the observed values being considered low [45].
The RSR value, which is the ratio of the RMSE to the standard deviation of the observations, can provide additional information, as recommended by [46], and can be applied to a variety of different constituents [41].
There are no existing standards describing the range of values of the statistical parameters that would indicate acceptable performance of the model [40]. The criteria adopted were therefore based on a review of the literature (Table 3).
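The four statistics described in this section can be written compactly. The following Python sketch uses the usual textbook definitions, with the PBIAS sign convention stated above (positive values indicate underestimation). The function names and the sample flows are hypothetical and serve only to illustrate the calculations.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations from the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / ss_obs


def pbias(obs, sim):
    """Percent bias: positive when the simulation underestimates the observations."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def rmse(obs, sim):
    """Root mean square error, in the units of the data (never negative)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def rsr(obs, sim):
    """Ratio of the RMSE to the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(sse) / math.sqrt(ss_obs)


# Hypothetical daily flows (m3/s), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [0.5, 1.0, 1.5, 2.0]
stats = {
    "NSE": nse(observed, simulated),
    "PBIAS": pbias(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "RSR": rsr(observed, simulated),
}
```

With these definitions NSE = 1 − RSR², so the two statistics carry the same information on different scales.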

Sensitivity Analysis of the Model Parameters
The parameters used for the flow were selected based on the literature and the SWAT documentation. The initial simulation to determine the sensitivity of the model to different parameters was performed using default parameter values. The values were then varied within upper and lower limits established according to the characteristics of each parameter, using three methods. In the first procedure, the initial value of the parameter is modified by adding an increment. The second method consists of multiplying the initial value by a set amount. In the third method, the initial value is substituted by a different value [47].
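The three ways of varying a parameter can be sketched as follows. This is a minimal Python illustration: the function name, the clamping of the result to the parameter limits, and the initial values in the examples are assumptions, and it is not the interface of any particular calibration tool.

```python
def apply_change(initial: float, method: str, amount: float,
                 lower: float, upper: float) -> float:
    """Return a new parameter value using one of three variation methods.

    method = "add":      initial value plus an increment
    method = "multiply": initial value times a factor
    method = "replace":  initial value substituted by a new value
    The result is clamped to [lower, upper] (an assumption of this sketch).
    """
    if method == "add":
        value = initial + amount
    elif method == "multiply":
        value = initial * amount
    elif method == "replace":
        value = amount
    else:
        raise ValueError(f"unknown method: {method}")
    return min(max(value, lower), upper)


# Hypothetical initial values; limits follow the parameter ranges in the text.
cn2_new = apply_change(70.0, "multiply", 1.05, 1.0, 100.0)  # +5% change
alpha_bf_new = apply_change(0.03, "add", 0.95, 0.0, 1.0)    # additive increment
esco_new = apply_change(0.50, "replace", 0.95, 0.01, 1.0)   # substitution
```

A multiplicative change preserves relative differences between spatially varying values (for example, curve numbers for different land uses), whereas substitution assigns the same value everywhere.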
The sensitivity analysis procedure employed measurement data for the period 1 January 2012 to 30 June 2012 to evaluate the fit between the measured and modeled time series data. This enabled identification of the parameters that were influenced by the characteristics of the hydrographic basin, and those to which the model was most sensitive. Evaluation was then made of the way in which adjusting the value of a parameter affected the model output, in order to identify parameters that might improve the characteristics of the model [48]. Figure 4 illustrates the sensitivities for the parameters affecting the hydrological processes, calculated using the sum of square errors (SSE) between the measured and simulated daily flow data.
The sensitivity of the model to a parameter is determined using the percentage difference between the output values of the objective function for simulations performed immediately before and immediately after changing the value of a parameter [48]. A higher value indicates greater sensitivity for a given parameter.
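A minimal Python sketch of this calculation is given below, using the SSE between measured and simulated daily flows as the objective function. Normalizing the change by the value of the objective function before the parameter change is an assumption of this sketch, and the flow values are hypothetical.

```python
def sse(observed, simulated):
    """Sum of squared errors between measured and simulated flows."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))


def sensitivity_percent(objective_before: float, objective_after: float) -> float:
    """Percentage change of the objective function caused by one parameter change.

    Assumption: the change is expressed relative to the value obtained
    immediately before the parameter was modified.
    """
    if objective_before == 0:
        raise ValueError("objective function before the change must be non-zero")
    return 100.0 * abs(objective_after - objective_before) / objective_before


# Hypothetical daily flows (m3/s) before and after changing one parameter.
observed = [1.0, 2.0, 3.0]
simulated_before = [1.5, 2.5, 3.5]
simulated_after = [1.0, 2.5, 3.5]

change = sensitivity_percent(sse(observed, simulated_before),
                             sse(observed, simulated_after))
```

Repeating the calculation for each parameter and sorting the results gives a sensitivity ranking.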
Thirteen of the twenty parameters submitted to sensitivity analysis showed significant effects on simulation of the flow, and were therefore those to which the model was most sensitive. Assignment of the degree of sensitivity is subjective [39]. Here, most of the parameters to which the model was considered to be sensitive were those with average percentage variation of the value of the objective function greater than 0.05 (Figure 5).

Calibration of the Model Parameters
The parameters that influenced the surface runoff and baseflow were optimized in the manual calibration. To reduce the number of parameters to be calibrated, the ranking of the parameters (Figure 4) was used to select only those for which the sensitivity value was ≥0.05. The calibration was performed using the period 1 January 2012 to 30 June 2012. The changes to the parameter values obtained from the calibration process are listed in Table 4.
The curve number (Cn2) was adjusted for the different land uses, including sugar cane, forest, riparian forest, and pasture. This soil water balance parameter enables the model to adjust the soil moisture in order to estimate the surface runoff [49]. It is determined by characteristics of the watershed including the type of soil, hydrological group, land use and management, and initial moisture, amongst others, and its value varies from 1 to 100. A completely permeable soil would have a Cn2 value of 1, while a totally impermeable soil would have a Cn2 value of 100 [50]. High values of Cn2 reflect increased surface runoff and reduced baseflow [51]. The standard Cn2 values defined in the manual of the Soil Conservation Service (SCS) were increased by 5%, which increased the surface runoff and provided a better estimate of soil drainage [52] [53]. The alteration was similar to those reported previously [54] [55].
The available water capacity of the soil (Sol_Awc) is the volume of water available to plants when the soil is at field capacity [56]. It can be estimated by determining the quantity of water released between field capacity (water content at a soil water matric potential of −0.033 MPa) and the permanent wilting point (water content at a soil water matric potential of −1.5 MPa) [50], and has an inverse relation with the components of the water balance. High values of Sol_Awc signify a high capacity of the soil to retain moisture, which reduces the amount of water available for surface runoff and percolation, and therefore affects the production of water [51] [57]. The value of this parameter was increased by 10%, relative to the initial value, in order to obtain convergence between the observational and simulated data.
The soil evaporation compensation factor (Esco) adjusts the depth considered for evaporation from the soil, accounting for the effects of capillary action, crusts, and cracks. A low Esco value reflects an ability of the deeper soil layers to compensate for the water deficit in the upper layers, resulting in greater evapotranspiration, reduced surface runoff and baseflow, and reduced soil water content [51] [56] [58]. The Esco value lies in the range 0.01-1.0. When a high Esco value is used in the model, less of the evaporative demand is extracted from the lower soil layers, so that evaporation is reduced. Here, the value of this parameter was adjusted to 0.95.
The three parameters described above (Cn2, Sol_Awc, and Esco) govern the behavior of surface waters [59], and favor the direct contribution of surface runoff to the flow [60].
The baseflow recession constant (Alpha_Bf) reflects the response of the groundwater flow to changes in recharge. Calibration of this parameter enables better fitting of the hydrograph [34], and here an increment of +0.95 was added to the standard Alpha_Bf value. According to [13], Alpha_Bf values of between 0.9 and 1.0 are used for soils that show rapid recharge responses, while values between 0.1 and 0.3 are used for soils with slow responses. The Alpha_Bf parameter is important because, in dry periods, the flow depends on the contribution of groundwater, which in turn has a strong relationship with Alpha_Bf [47].
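The effect of the recession constant can be illustrated with a simple exponential recession. The Python sketch below assumes no recharge during the recession, so that the baseflow decays as Q(t) = Q(0)·exp(−Alpha_Bf·t) with t in days. The function name, the initial flow, and the two Alpha_Bf values (chosen from the rapid and slow ranges quoted above) are illustrative assumptions.

```python
import math


def baseflow_recession(q0: float, alpha_bf: float, days: int) -> list:
    """Baseflow on days 0..days for an exponential recession without recharge."""
    return [q0 * math.exp(-alpha_bf * t) for t in range(days + 1)]


# Hypothetical initial baseflow of 1.0 m3/s.
rapid = baseflow_recession(1.0, 0.95, 5)  # rapid response
slow = baseflow_recession(1.0, 0.20, 5)   # slow response
```

With the larger constant, the baseflow falls to a small fraction of its initial value within a few days, whereas the smaller constant sustains the flow for longer during dry periods.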
The Gw_Revap coefficient describes the quantity of water that moves from the shallow aquifer to the root zone due to the depletion of soil moisture and the direct uptake of groundwater by the deep roots of trees and shrubs [56]. A value near 0.0 reflects restricted movement of water from the shallow aquifer to the root zone, while a value near unity indicates that the transfer rate is close to the rate of evapotranspiration [50]. An increase in this coefficient reduces the baseflow due to increased transfer of water from the shallow aquifer to the root zone. The value of this parameter was changed to 0.03.
The Gw_Delay parameter is the delay time for recharge of the aquifer, representing the time required for water to move from the deepest soil layer (the root zone) to the shallow aquifer [13]. The value of this parameter was changed from 32 to 75. The parameters Alpha_Bf, Gw_Delay, and Gw_Revap govern the response of the subsurface water [59].
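The influence of the delay time can be sketched with an exponential weighting of the water leaving the soil profile. The Python example below assumes the form recharge(i) = (1 − exp(−1/Gw_Delay))·seepage(i) + exp(−1/Gw_Delay)·recharge(i−1), as described in the SWAT theoretical documentation. The function name and the seepage series are hypothetical.

```python
import math


def recharge_series(seepage_mm, gw_delay_days: float, initial_recharge: float = 0.0):
    """Daily aquifer recharge (mm) from daily seepage out of the soil profile.

    Assumed exponential delay: a longer delay time gives yesterday's recharge
    more weight and today's seepage less weight.
    """
    weight = math.exp(-1.0 / gw_delay_days)
    recharge = []
    previous = initial_recharge
    for seep in seepage_mm:
        previous = (1.0 - weight) * seep + weight * previous
        recharge.append(previous)
    return recharge


# Hypothetical single 10 mm seepage pulse followed by dry days.
pulse = [10.0, 0.0, 0.0, 0.0]
recharge_delay_32 = recharge_series(pulse, 32.0)
recharge_delay_75 = recharge_series(pulse, 75.0)
```

Under this assumption, increasing the delay from 32 to 75 spreads the same seepage pulse over a longer period, which lowers and flattens the recharge response.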
The Surlag parameter determines the response of the watershed [56] and provides a storage factor for basins where there is a delay of more than one day before the surface runoff reaches its final convergence point [55]. The value of this parameter was changed from 4 to 1, corresponding to greater storage of water [49]. Delay in the release of surface runoff acts to smooth the simulated flow hydrograph. However, values smaller than 0.5 are not suitable for the release of surface runoff in a subbasin of the main channel [56].
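The relation between Surlag and storage can be sketched as follows. The Python example assumes the lag relation given in the SWAT theoretical documentation, in which the fraction of available surface runoff released to the main channel on a given day is 1 − exp(−Surlag / t_conc), where t_conc is the time of concentration in hours. The function names, the 6 h time of concentration, and the runoff series are hypothetical.

```python
import math


def released_fraction(surlag: float, t_conc_hours: float) -> float:
    """Fraction of available surface runoff reaching the channel in one day."""
    return 1.0 - math.exp(-surlag / t_conc_hours)


def lag_runoff(generated_mm, surlag: float, t_conc_hours: float):
    """Route daily generated runoff through the lag storage.

    Returns (released series, runoff still stored at the end).
    """
    fraction = released_fraction(surlag, t_conc_hours)
    stored = 0.0
    released = []
    for runoff in generated_mm:
        available = runoff + stored
        out = available * fraction
        stored = available - out  # water held back for the following days
        released.append(out)
    return released, stored


# Hypothetical 20 mm runoff event in a subbasin with t_conc = 6 h.
event = [20.0, 0.0, 0.0]
released_surlag_4, stored_4 = lag_runoff(event, 4.0, 6.0)
released_surlag_1, stored_1 = lag_runoff(event, 1.0, 6.0)
```

Lowering Surlag from 4 to 1 releases a smaller share of the runoff on the day it is generated and holds more in storage, which smooths the simulated peaks.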
Increase of the Slope parameter increases the lateral flow [52]. The hydraulic conductivity of the channel (Ch_K2) governs the movement of water from the riverbed to the subsoil, or vice-versa, in the case of ephemeral or transitory rivers [59] [60]. The Sol_Z parameter defines the thickness of the soil layer, influencing the movement of water in the soil during the processes of redistribution and evaporation of soil water [61]. A 5% increase in the value of this variable served to increase the surface runoff.
The means and standard deviations of the observational and simulated data obtained after the calibration process are provided in Table 5.
The results demonstrated that the performance of the calibrated model was satisfactory, with NSE > 0.75, R² > 0.5, PBIAS < ±10, RMSE ≤ (SD/2), and 0.00 ≤ RSR ≤ 0.50. The values of the statistical parameters were poorer for the validation step, but still remained within acceptable limits (Table 6). The negative PBIAS value indicates that in the validation step the flow was overestimated, as shown in Figure 5.
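The criteria listed above can be checked against the calibration statistics reported in the abstract (NSE = 0.77, PBIAS = 5.05, RMSE = 0.48, RSR = 0.49). In the Python sketch below, the function name is hypothetical, R² is omitted because its value is not quoted in the text, and the standard deviation of the observations is back-calculated from RSR = RMSE/SD, which is an assumption made only so that the example is self-contained.

```python
def meets_criteria(nse: float, pbias_pct: float, rmse: float,
                   rsr: float, sd_obs: float) -> dict:
    """Evaluate each performance criterion separately and report pass/fail."""
    return {
        "NSE > 0.75": nse > 0.75,
        "|PBIAS| < 10": abs(pbias_pct) < 10.0,
        "RMSE <= SD/2": rmse <= sd_obs / 2.0,
        "0 <= RSR <= 0.50": 0.0 <= rsr <= 0.50,
    }


# ASSUMPTION: SD of the observations inferred from the reported RMSE and RSR.
sd_calibration = 0.48 / 0.49

# Calibration statistics as reported in the abstract.
calibration_check = meets_criteria(0.77, 5.05, 0.48, 0.49, sd_calibration)
```

All four checks pass for the calibration values. Applied to the validation PBIAS of −17.70%, the PBIAS criterion would not be met, which is consistent with the poorer validation statistics reported.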
A range of NSE values has been reported in the literature. [61] obtained values of 0.66 (calibration step) and 0.87 (validation step) for simulations of the watershed of the Ribeirão Jaguara River in Minas Gerais State, and the performance of the model was classified as very good. [62] found values of between 0.78 and 0.94 for the calibration step in simulations of different scenarios for the watershed of the Araranguá River, and concluded that the fit of the model was satisfactory. [63] obtained a value of 0.76 for the calibration of daily flow data in a simulation of the production of water in the Concórdia watershed in Santa Catarina State, and considered the model performance to be satisfactory. [64] obtained values of 0.997 and 0.85 for calibration and validation steps, respectively, and considered the results highly satisfactory, despite the smaller NSE value for the validation step. The values obtained by [27] for calibration and validation steps were 0.58 and 0.51, respectively, so a lower NSE was again found for the validation step.
In simulations of the Eastern Nile Basin, [65] obtained PBIAS values ranging between −10.9% and 38% (calibration step), and between −25% and 25% (validation step). These values were considered to be good, with the exception of the Abbay subbasins, where the values were unsatisfactory (±20% < PBIAS ≤ ±40%). [66] obtained values smaller than ±15%, indicating that the model provided a good simulation of the components of surface runoff. [45] reported RMSE values of 0.60 and 0.69 for the calibration and validation, respectively, of daily flow data, indicating an excellent fit between the observational and simulated data. [67]

It can be seen from Figure 5 that the simulations slightly underestimated the peak flow values for the February to May period, while in June (in the rainy season) the peak flows were slightly overestimated. Nevertheless, the model provided a good simulation of the overall trend in water production. During low water periods, SWAT provided a satisfactory fit, indicating that the model can effectively simulate low flows, as also reported by [61].
The period between 08/08/2011 and 31/10/2011 was used to validate the model (Figure 6). It can be seen that the peak flows obtained from the observational and simulated data did not coincide, and that the model provided better simulation of the minimum flows (as also found for the calibration period).
The results obtained for the statistical criteria demonstrated that the model was able to describe the hydrological processes in the watershed of the Poxim River. The PBIAS value of −17.70% indicated that, in general terms, the simulation overestimated the flow [43].

Conclusions
The use of sensitivity analysis enabled identification of the most important parameters required to model the hydrological processes in the watershed of the Poxim River, hence reducing the quantity of model parameters that needed to be calibrated. Although attention should be focused on these parameters in future studies, it should be emphasized that the results of sensitivity analysis can be dependent on the period considered for the simulation, especially in the case of short periods [47].
In the calibration procedure, the model provided a good fit between observational and simulated data for the hydrographic basin, with values of the statistical performance parameters ranging between very good and satisfactory, indicating that the values of the hydrological parameters could be used for this hydrographic basin.
The performance of the model for the validation step indicated that the set of parameters identified during the calibration step could satisfactorily represent the hydrological processes in the river basin, despite the fact that the validation statistics were worse than for the calibration step. This could have been due to the small sample size of the observational flow data. It is clear that the provision of suitable input data is essential in order to obtain a good fit between observational and simulated data. The set of parameters identified here could be used for the simulation and evaluation of other scenarios.