Parameter Uncertainty Estimation by Using the Concept of Ideal Data in GLUE Approach

The uncertainty of the NASH model parameters is investigated through the concept of “ideal data” by using the Generalized Likelihood Uncertainty Estimation (GLUE) methodology in an application to the small Yanduhe research catchment in the Yangtze River basin, China. A suitable likelihood measure is also identified to reduce the uncertainty arising from the relationship between the parameters. “Ideal data” are data for which the input-output data and the model structure are assumed to be free of error. The relationship between parameters k and n of the NASH model is demonstrated quantitatively based on the real data, and it shows the existence of uncertainty factors other than the parameters themselves. The ideal data results show that accurate data and an accurate model structure are the two important preconditions for parameter estimation, and that with a suitable likelihood measure the parameter uncertainty can be decreased or even eliminated. Moreover, it is shown that the distributions of predicted discharge errors are non-Gaussian and vary in shape with time and discharge, whether parameter uncertainty exists alone or together with all other uncertainties.


Introduction
It is well known that hydrological processes are very complicated and are influenced by climate, weather, geographic and geomorphic conditions, and underlying surface conditions, and that it is very difficult to obtain precisely the hydrographic features (precipitation, evaporation, discharge, etc.) and the spatial and temporal distributions of hydrologic cycle features. For all of these reasons, the accuracy of hydrological modeling is affected by uncertainty.
The randomness and fuzziness of hydrological phenomena are the primary causes of modeling uncertainty. Hydrologists [1] [2] [3] [4] [5] have discussed the uncertainties arising from the input-output data, the hydrological model structure and the model parameters. In particular, the uncertainty of the hydrological data can be summarized as follows: 1) representativeness of the distribution character and mathematical expectation of hydrological features. Taking rainfall as an example, because of the heterogeneity and variability of the spatial distribution of precipitation, the information obtained from a fixed rainfall station network may be too inaccurate to be used as the areal mean value; 2) measurement error. Because of instrument errors or faults and observers' operation or evaluation errors in the flood monitoring system, there must be input-output error in modeling; 3) lack of reliable information for some hydrologic features.
Analogously, the model uncertainty can be summarized as follows: 1) because knowledge of hydrological processes is limited, the descriptions of such processes may be approximate or unreasonable; 2) most mathematical and physical functions used to calculate complex processes are simplified; 3) many models cannot reflect the influence of environmental factors, such as global climate change and land cover change due to human activities, on the runoff process; 4) effective computing methods are needed. Rainfall-runoff is a continuous process that the model simulates in a discrete way, which causes inevitable errors. Besides, different discretization schemes can have different influences; 5) model parameter values are difficult to obtain by either measurement or prior estimation.
Research on parameter uncertainty is fundamental and meaningful. Once the hydrological model is fixed, the parameters are the key to the validity of the modeling: the modeling will stand or fall according to the parameters.
Previous research on modeling uncertainty has mainly addressed model parameters while other uncertainties were still present, for example the Generalized Likelihood Uncertainty Estimation (GLUE) methodology [4] [6], the Shuffled Complex Evolution Metropolis algorithm (SCEM-UA) [7], and the Markov Chain Monte Carlo (MCMC) method [8]. All these methods aimed to represent the parameter uncertainty but ignored the influence of other uncertainty factors. Therefore, the key point now is how to avoid or decrease the influences of input-output data uncertainty and model structure uncertainty so that the parameters can be estimated accurately. For this purpose, "ideal data" are proposed in the paper to estimate the parameter uncertainty and interaction by using the GLUE methodology. The likelihood measure is also studied.
The proposed approach is applied to the well-known NASH runoff model, with the Yanduhe basin of the Yangtze River, China, as a case study.

Theoretical Background
For the sake of completeness, the main theoretical background of the models and methods involved is briefly described in what follows.

The GLUE Methodology
The GLUE procedure recognizes the equivalence of different sets of parameters in the calibration of models. It is based upon running a model with different sets of parameter values chosen randomly from specified parameter distributions.
Many papers have applied this methodology and emphasized the effect of the likelihood measure on the whole application process [9] [10]. The term "likelihood" is used in a general sense, as a fuzzy, belief, or possibilistic measure of how well the model conforms to the observed behavior of the system [4], and not in the restricted sense of maximum likelihood theory, which is developed under specific assumptions of zero-mean, normally distributed errors [11] [12]. Moreover, the choice of a threshold on the likelihood measure for identifying behavioral models is subjective. In past studies, the threshold was usually chosen subjectively on the scale of some summary goodness-of-fit index [13] [14] [15].
The GLUE method has been used successfully in uncertainty research [16] [17].
In the application of the GLUE methodology here, whether the model is behavioral for a given set of parameters is determined by the likelihood value, which is computed by comparing predicted and observed responses.
The requirements of the GLUE procedure are as follows:
1) A formal definition of a likelihood measure. It is worth noting that a formal definition is required, but the choice of a likelihood measure is inherently subjective.
2) An appropriate definition of the initial range and the distributions of the parameters to be considered for a particular model.
3) Definition of a feasible threshold value.
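The three requirements above can be sketched as a short Monte Carlo loop. This is only a minimal illustration, not the implementation used in the paper: the function names, the toy linear model, the threshold of 0.9 and the sample size are all assumptions made for the example, and uniform priors are assumed for every parameter.

```python
import random


def glue(model, observed, bounds, likelihood, threshold, n_samples=2000, seed=0):
    """Return behavioral parameter sets with normalized likelihood weights."""
    rng = random.Random(seed)
    behavioral = []
    for _ in range(n_samples):
        # Requirement 2: draw each parameter from its prior range (uniform here).
        theta = [rng.uniform(lo, hi) for lo, hi in bounds]
        # Requirement 1: score the simulation with a formal likelihood measure.
        score = likelihood(model(theta), observed)
        # Requirement 3: keep only simulations above the threshold.
        if score > threshold:
            behavioral.append((theta, score))
    total = sum(score for _, score in behavioral)
    if total == 0:
        return []
    return [(theta, score / total) for theta, score in behavioral]


def efficiency(sim, obs):
    """Efficiency-type measure: 1 - error variance / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    var_err = sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)
    var_obs = sum((o - mean_obs) ** 2 for o in obs) / len(obs)
    return 1.0 - var_err / var_obs


# Hypothetical toy model y = a * x + b, standing in for a hydrological model.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]


def toy_model(theta):
    return [theta[0] * x + theta[1] for x in xs]


toy_obs = toy_model([2.0, 1.0])
behavioral = glue(toy_model, toy_obs, [(0.0, 4.0), (0.0, 2.0)], efficiency, threshold=0.9)
```

The returned weights sum to one, so they can be used directly to weight the behavioral simulations when prediction bounds are computed.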

NASH Model
The NASH model [18] [19] [20] [21] is a conceptual hydrological concentration model developed by J. E. Nash, and it is widely used in watershed concentration simulation [22] [23]. In the model, the research basin is represented as a series of identical reservoirs, and the redistribution of the net rainfall in the catchment is treated as regulation by the reservoirs. The instantaneous unit hydrograph (IUH) can then be derived as Eq. (1):
$$u(t) = \frac{1}{k\,\Gamma(n)} \left(\frac{t}{k}\right)^{n-1} e^{-t/k} \qquad (1)$$
where $u(t)$ is the ordinate of the instantaneous unit hydrograph; $\Gamma(n)$ is the Gamma function; n reflects the regulation and storage capacity of the basin and can be interpreted as the number of reservoirs, termed the shape parameter; k is the storage-discharge parameter of the reservoirs, termed the scale parameter. With its clear concept and simple structure, the Nash model has been used extensively in flood forecasting [24] [25].
According to Eq. (1), when the net rainfall $I(t)$ is given, the flow hydrograph at the basin outlet can be derived by convolution as
$$Q(t) = \int_0^t I(\tau)\, u(t-\tau)\, d\tau$$
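A numerical sketch of the Nash IUH and of the routing of net rainfall through it is given below. It is an illustration under stated assumptions rather than the scheme used in the paper: the IUH is sampled at the midpoint of each time step (which also avoids the singularity at t = 0 when n < 1), the convolution is discrete, and the output stays in the units of the net rainfall without conversion to discharge by catchment area. The function names are hypothetical.

```python
import math


def nash_iuh(t, n, k):
    """Nash IUH ordinate: u(t) = (t/k)**(n-1) * exp(-t/k) / (k * Gamma(n))."""
    if t <= 0.0:
        return 0.0
    return (t / k) ** (n - 1.0) * math.exp(-t / k) / (k * math.gamma(n))


def nash_runoff(net_rain, n, k, dt=1.0):
    """Discrete convolution of net rainfall with the IUH.

    Assumption: the IUH is sampled at interval midpoints and multiplied by dt,
    so each weight approximates the fraction of rainfall released in that step.
    """
    steps = len(net_rain)
    weights = [nash_iuh((j + 0.5) * dt, n, k) * dt for j in range(steps)]
    return [
        sum(net_rain[i] * weights[t - i] for i in range(t + 1))
        for t in range(steps)
    ]


# Hypothetical event: a short burst of net rainfall followed by a dry period.
rain = [0.0, 5.0, 10.0, 3.0] + [0.0] * 196
flow = nash_runoff(rain, n=2.0, k=1.5, dt=0.1)
```

Because the IUH integrates to one, the routed volume should match the rainfall volume up to discretization error, which gives a quick sanity check on the chosen time step.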

Generation of the Ideal Data
To avoid the input-output uncertainty and the model structure uncertainty, the concept of "ideal data" is proposed for studying model parameter uncertainty.
On the basis of the physical interpretation of the hydrological model and the characteristics of the research catchment, the model parameter spaces can be determined from prior information. One set of parameters is chosen randomly in the parameter spaces as the "true parameter values". The input data are first assumed to have no error, so the input is called the "ideal input"; the "ideal output" is then calculated by running the model with the "true parameter values".
For the NASH model, the input is net rainfall and the output is flow at the basin outlet.
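The generation steps described above can be sketched as follows. This is a hedged illustration: the compact Nash routing function, the net rainfall series, the random seed and the function names are assumptions made for the example, and the parameter ranges are passed in as arguments rather than taken from any fitted result.

```python
import math
import random


def nash_model(net_rain, k, n, dt=1.0):
    """Route net rainfall through the Nash IUH by discrete convolution."""
    def iuh(t):
        return (t / k) ** (n - 1.0) * math.exp(-t / k) / (k * math.gamma(n))

    # Assumption: IUH sampled at interval midpoints, scaled by the time step.
    weights = [iuh((j + 0.5) * dt) * dt for j in range(len(net_rain))]
    return [
        sum(net_rain[i] * weights[t - i] for i in range(t + 1))
        for t in range(len(net_rain))
    ]


def generate_ideal_data(model, ideal_input, k_range, n_range, seed=42):
    """Pick 'true' parameters at random and compute the error-free output."""
    rng = random.Random(seed)
    true_params = {"k": rng.uniform(*k_range), "n": rng.uniform(*n_range)}
    ideal_output = model(ideal_input, **true_params)
    return true_params, ideal_output


# Hypothetical net rainfall series treated as the error-free "ideal input".
ideal_input = [0.0, 4.0, 12.0, 6.0, 1.0] + [0.0] * 25
true_params, ideal_output = generate_ideal_data(
    nash_model, ideal_input, k_range=(0.5, 4.0), n_range=(0.5, 5.0)
)
```

By construction, running the model again with `true_params` reproduces `ideal_output` exactly, so any spread found in a later uncertainty analysis can only come from the parameters.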

Case Study and Results
The Yanduhe catchment in the upper reaches of the Yangtze River basin covers a drainage area of 601 km², where the annual average rainfall is about 1337 mm and the runoff comes mainly from rainfall. Four flood events in this research catchment are used in the case study (Table 1).
For the NASH model, based on the physical interpretation and the catchment data, the ranges of the two parameters are k ∈ [0.5, 4] and n ∈ [0.5, 5]. The input is the net rainfall and the output is the surface runoff. Here the net rainfall is calculated by the XAJ rainfall-runoff model [26]. In the real data research, the error coming from the XAJ model is treated as input error of the NASH model. The GLUE methodology is applied by Monte Carlo simulation, using uniform sampling within the specified parameter ranges.
In rainfall-runoff modeling we often evaluate the errors in simulating a time series of discharge or other observed data. A classical statistical measure for evaluating goodness of fit, based on the sum of squared errors or error variance, is the efficiency suggested by Nash and Sutcliffe (1970), in the form
$$L_1 = 1 - \frac{\sigma_\varepsilon^2}{\sigma_o^2}$$
where $\sigma_\varepsilon^2$ is the variance of the errors and $\sigma_o^2$ is the variance of the observations.
Another measure based on the sum of squared errors is the inverse error variance measure suggested by Box and Tiao (1992), in the form
$$L_2 = \left(\sigma_\varepsilon^2\right)^{-N}$$
where N is a shaping parameter.
These two likelihood measures were chosen for the model and parameter uncertainty research by the GLUE methodology.
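The two likelihood measures can be written as small functions, as in the sketch below. The function names and the sample series are hypothetical, and the error variance is taken here as the mean squared difference between simulation and observation, which is an assumption about how the variance of the errors is computed.

```python
def error_variance(sim, obs):
    """Mean squared difference between simulated and observed values."""
    return sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)


def likelihood_l1(sim, obs):
    """Efficiency-type measure: 1 - error variance / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    var_obs = sum((o - mean_obs) ** 2 for o in obs) / len(obs)
    return 1.0 - error_variance(sim, obs) / var_obs


def likelihood_l2(sim, obs, shaping_n=1.0):
    """Inverse error variance raised to the shaping parameter N."""
    var_err = error_variance(sim, obs)
    if var_err == 0.0:
        # A perfect fit has unbounded likelihood under this measure.
        return float("inf")
    return var_err ** (-shaping_n)


# Hypothetical observed series and two simulations of different quality.
obs = [1.0, 2.0, 3.0, 4.0]
sim_good = [1.1, 1.9, 3.2, 3.9]
sim_poor = [1.2, 1.8, 3.4, 3.8]

l1_good = likelihood_l1(sim_good, obs)
ratio_n1 = likelihood_l2(sim_good, obs, 1.0) / likelihood_l2(sim_poor, obs, 1.0)
ratio_n20 = likelihood_l2(sim_good, obs, 20.0) / likelihood_l2(sim_poor, obs, 20.0)
```

The ratio of the two simulations under the second measure grows as a power of N, so a large shaping parameter concentrates almost all of the weight on the best-fitting parameter sets.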

GLUE Research under Real Data
The GLUE methodology was first applied to the real data (observed discharge) with likelihood measure L1. In this study, simulations with a model efficiency below zero are rejected, which ensures that all plausible modeling results are retained.
Scatter plots of parameters k and n based on the likelihood measure L1 are reported in Figure 1 and Figure 2, which show the effects of using different flood events with real input-output data. Good and poor simulations are found virtually throughout the parameter ranges, and the effects of different flood events on parameters k and n are clearly visible. It can be inferred that uncertainty in the input-output data, or in other factors, simultaneously influences the distributions of the scatter plots.
The results shown in Figure 1 and Figure 2 suggest that the parameter response surfaces are very complex and that they vary between data sets, that is, between events, which reveals the uncertainty of the data.
It is difficult in rainfall-runoff modeling to bracket all the discharge observations (or other predicted variables) within the calculated 90% confidence bounds. As seen in Figure 3, the observed discharge is often outside the 90% confidence bounds, which further indicates the limitations of the data as well as of the model structure. Figures 4-6 show the distributions of errors at $t_p$, $t_p + 10$ and $t_p - 10$ in the four real flood events.
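One common way to obtain such bounds in a GLUE analysis is to take likelihood-weighted quantiles of the behavioral simulations at each time step. The sketch below assumes that the 90% bounds correspond to the 5% and 95% weighted quantiles, which the text does not state explicitly. The function names and the small data set are hypothetical.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalized weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q:
            return value
    return pairs[-1][0]


def glue_bounds(simulations, weights, lower=0.05, upper=0.95):
    """Per-time-step (lower, upper) bounds from weighted simulations."""
    bounds = []
    for t in range(len(simulations[0])):
        column = [sim[t] for sim in simulations]
        bounds.append(
            (weighted_quantile(column, weights, lower),
             weighted_quantile(column, weights, upper))
        )
    return bounds


def coverage(observed, bounds):
    """Fraction of observations that fall inside the bounds."""
    inside = sum(1 for o, (lo, hi) in zip(observed, bounds) if lo <= o <= hi)
    return inside / len(observed)


# Hypothetical behavioral simulations (rows) over two time steps (columns).
sims = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
equal_bounds = glue_bounds(sims, [1.0, 1.0, 1.0, 1.0])
sharp_bounds = glue_bounds(sims, [0.97, 0.01, 0.01, 0.01])
frac_inside = coverage([2.5, 50.0], equal_bounds)
```

With equal weights the bounds span all four simulations, while a weight vector dominated by one simulation collapses them onto it, which is how a sharper likelihood measure narrows the uncertainty bounds.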

GLUE Research under Ideal Data
The GLUE methodology was then applied to parameter uncertainty research using the "ideal data" to avoid the influence of other uncertainty factors. Following the generation steps for the "ideal data" described above, one set of parameters (k, n) is chosen as the "true parameter values", which determines the model. With the ideal input of net rainfall data and the fixed NASH model, the model output is taken as the ideal output data. So there is no uncertainty in the input data, output data and model structure. Figure 7 and Figure 8 are scatter plots of efficiency results for parameters k and n under ideal data. It is easy to see that each parameter has very similar efficiency under different flood events. By comparing Figure 7 and Figure 8 with Figure 1 and Figure 2, it can also be inferred that no other uncertainty factors affect the parameter uncertainty research.
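Comparing Figure 9 with Figure 3, it is easy to see that all of the ideal discharges in NASH modeling are bracketed within the calculated 90% uncertainty bounds. Figure 10 shows the response surface of parameters k and n with goodness of fit represented as contours and suggests the obvious correlation and uncertainty of the two parameters. Figure 11 gives the scatter plots of parameters k and n with likelihood value greater than 0.98 in order to find the ridge line, from which we can assume that the distribution of the scatter plots can be matched by a function of k and n with a constant C equal to 3.82.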
For the ideal data analysis, the uncertainty assessment refers only to the parameters, so the error distributions of the discharges can be attributed accurately to the parameter errors. As in the real data research, Figures 12-14 show the error distributions in the four ideal flood events.

Conclusions
The paper proposes the concept of "ideal data" for parameter uncertainty assessment. The analysis is carried out for the NASH model parameters k and n in the Yanduhe catchment of the Yangtze River, China, by the GLUE methodology with both "real" and "ideal" data.
In the ideal data research, the results from the different flood events are consistent. The distributions of errors arising from model structure uncertainty, data uncertainty and model parameter uncertainty together, or from model parameter uncertainty alone, are non-Gaussian and, furthermore, change within each rainfall-runoff event.
They also differ between flood events. So it is not reasonable to assume that the errors follow any one fixed kind of distribution. Compared with the results from the ideal data estimation, the mixed uncertainties make the error distributions more complicated.

Figures

Figure 1. Scatter plots of efficiency results for parameter k in the four real flood events.

Figure 2. Scatter plots of efficiency results for parameter n in the four real flood events.

Figure 3. Ninety percent uncertainty bounds for simulations of the four real flood events in the Yanduhe catchment.

Figure 4. Error distributions at the point $t_p$ in the four real flood events.

Figure 5. Error distributions at the point $t_p + 10$ in the four real flood events.

Figure 6. Error distributions at the point $t_p - 10$ in the four real flood events.

Figure 7. Scatter plots of efficiency results for parameter k in the four ideal flood events.

Figure 8. Scatter plots of efficiency results for parameter n in the four ideal flood events.

Figure 9. Ninety percent uncertainty bounds for simulations of the four ideal flood events in the Yanduhe catchment.

Figure 10. Ideal data analysis: Response surface for parameters k and n with goodness of fit represented as contours.

Figure 15 shows the result under ideal data based on the likelihood L2 with parameter N = 20. For the four flood events, all of the "true parameter values" of the model are recovered in the figure.

Figure 11. Response surface for k and n in the four ideal flood events.

Figure 12. Error distributions at the point $t_p$ in the four ideal flood events.

Figure 13. Error distributions at the point $t_p + 10$ in the four ideal flood events.

Figure 14. Error distributions at the point $t_p - 10$ in the four ideal flood events.

Figure 15. Response surface for parameters k and n under ideal data with likelihood L2.
A suitable likelihood measure is very important for uncertainty estimation and parameter determination, while accurate data and a perfect model structure are the two important factors for C. They are the preconditions for the estimation of model parameter uncertainty and interaction.

Table 1. The four flood events.