Kriging Geostatistical Methods for Travel Mode Choice : A Spatial Data Analysis to Travel Demand Forecasting

This paper aims to compare the results of two techniques of Kriging (Ordinary Kriging and Indicator Kriging) that are applied to estimate the Private Motorized (PM) travel mode use (car or motorcycle) in several geographical coordinates of non-sampled values of the concerning variable. The data used was from the Origin/Destination and Public Transportation Opinion Survey, carried out in 2007/2008 at São Carlos (SP, Brazil). The techniques were applied in the region with 110 sample points (households). Initially, Decision Tree was applied to estimate the probability of mode choice in surveyed households, thus determining the numeric variable to be used in Ordinary Kriging. For application of Indicator Kriging it was used the variable “main travel mode” in a discrete manner, where “1” represented the use of PM travel mode and “0” characterized others travel modes. The results obtained by the two spatial estimation techniques were similar (Kriging maps and cross-validation procedure). However, the Indicator Kriging (KI) obtained the highest number of hit rates. In addition, with the KI it was possible to use the variable in its original form, avoiding error propagation. Finally, it was concluded that spatial statistics was thriving in travel demand forecasting issues, giving rise, for the both Kriging methods, to a travel mode choice surface on a confirmatory way.


Introduction
Travel mode choice modeling is the process where the means of traveling is estimated.The means of travel is referred to the travel mode, which may be by private automobile, public transportation, walking, etc. Travel mode choice is an important part of travel demand forecasting models that are used for analyses, for instance: 1) Major transportation investment projects since they may attract travelers not only from competing facilities but also competing travel modes; 2) Transit service changes; 3) Pricing policy analyses; 4) Land use planning analyses.
Individuals choose their travel mode considering diverse factors, classified into three groups: 1) Characteristics of the trip maker (car ownership, income, age, etc.); 2) Characteristics of the journey (trip purpose, time of the day, etc.); 3) Characteristics of the transport facility (travel time, monetary costs, comfort, convenience, etc.).Thus, travel behavior involves household and personal characteristics, travel variables and spatially correlated factors, as well as location of households and others destinations, activities distribution in the urban environment [1].
Several studies corroborate the affirmation that travel mode choice issues are strongly related to locations [2] [3].With the advances of technology, geo-referenced information has become more widely available.Thus, travel demand spatial analysis has been identified as an emerging research area [4].
Bhat and Zhao [5] identify the spatial issues that need to be recognized in travel demand modeling, proposing a multi-level, mixed logit, formulation to address these spatial issues in the context of activity stop generation in the Boston Metropolitan area.
The use of kernel estimation has been already proposed by [6] in association with a logit model in order to assign route choice.This study demonstrates efficient results and the potential of using spatial considerations into travel demand models.Miyamoto et al. [7] combine concepts to specify the spatially auto-correlated deterministic and error components within a mixed logit framework.
Páez et al. [4] introduce a new indicator of spatial fit that can be applied to the results of discrete choice models.Peer et al. [8] apply geographically weighted regression for the approximation of door-to-door travel times in departure time choice models.
Among the techniques of spatial statistics, geostatistics is to be highlighted.Geostatistics enables the development of studies involving spatial autocorrelation, allowing mainly estimating the value of a variable in locations where values are unobserved.
Usually, geostatistics is applied to cases in which spatial continuity is apparent.Despite this limitation, geostatistical modeling has been applied to spatially discrete data for many years [9].Generally, travel data are spatially discrete.Dealing with this limitation, it is necessary to adapt travel demand data, considering that they are generally discrete variables and have no spatial continuity.
In the literature, however, studies on the application of geostatistics for transportation problems are derived from traffic engineering.Miura [10] presents an approach for predicting car travel time by Kriging.This prediction method is shown to be effective for urban districts with links having changeable travel times owing to congestion.Zou et al. [11] propose an improved distance metric called approximate road network distance (ARND) for solving the problem of the invalid spatial covariance function in Kriging caused by the non-Euclidean distance metric.
Following this approach, recent studies have shown that geostatistics is able to estimate transportation demand variables and explain the spatial distribution by maps of Kriging predicted values [12]- [14].
The main aim of this paper is to compare the results of two techniques of Kriging (Ordinary Kriging and Indicator Kriging) that were applied to estimate the Private Motorized (PM) travel mode use (car or motorcycle) in several geographical coordinates of non-sampled values of the concerning variable.The data used was from the Origin/Destination and Public Transportation Opinion Survey, carried out in 2007/2008 at São Carlos (SP, Brazil).
This article is organized in four sections besides this Introduction.Section 2 describes the techniques; Section 3 presents the materials and method.Section 4 exhibits the results, and finally, Section 5 describes the main conclusions.

Spatial Analysis Approaches
Geostatistics was developed as an alternative method for exploiting events in which variable values are associated with geographical coordinates.This approach takes account of general spatial statistics because it estimates a continuous surface through a dataset that may be regular or irregularly spatially distributed.
The main point of applying geostatistics follows the idea of characterizing spatial (and/or spatial/temporal) dispersion of an event, assessing uncertainty parameters, concerning its spatial variability, and obtaining a continuous surface estimation.Geostatistics modeling is better defined as the following steps: 1) variography analysis, 2) cross validation, and 3) Kriging.
The primary tool in geostatistical modeling is the semivariogram, which graphically represents a regionalized variable.The semivariogram function is given by Equation (1), where N(h) is the set of all pairwise, and z(x i ) and z(x i + h) are data values at spatial locations i and i + h, respectively.
Moreover, the representation of an experimental semivariogram requires a further understanding of graphical aspects.Some measures include lag distance and tolerance, cut distance, and angle direction.Another step of variography analysis is to model a theoretical semivariogram based off the experimental one.Of the various theoretical models for adjustments of variogram, the most frequently used are Spherical, Gaussian and Exponential.In this step, the experimental variogram is replaced by a theoretical variogram function, from which is possible to obtain the main parameters for spatial modelling: nugget effect (Co), Range (a) and Sill (C+ Co), see Figure 1.
The second step of geostatistics modelling is Cross Validation (CV).CV is a simple way to compare various assumptions either about the model (e.g.type of function to be adjusted, parameters of variograms) or about the data.In the cross validation procedure, each sample value Z(x 1 ) is removed in turn from the data set and a value Z * (x 1 ) at the location is estimated using the remaining n-1 samples.The difference between a data value and the estimated value (Z(x 1 ) − Z * (x 1 )) gives an indication of how well the data value fits into the neighborhood of the surrounding data values [15].
Regarding the CV, the geostatistical modeling general method follows the Kriging estimation, which is a linear prediction represented by matricial calculus.The aim of Kriging is to conduct estimates with minimum error and variance (optimizing the model) through the parameters defined in the theoretical semivariogram.The most common univariate Kriging methods are Simple Kriging, Universal Kriging and Ordinary Kriging.In this paper, we compared Indicator Kriging and Ordinary Kriging to estimate PM travel mode use.
Ordinary Kriging is the most widely used Kriging method.Its main goal is to estimate a value at a point of a region, for which the correspondent variogram is known, using data in neighborhood [15].
OK is a method that is often associated with the acronym B.L.U.E. for "best linear unbiased estimator".OK is linear because its estimates are weighted linear combinations of the available data.It is unbiased since it tries to have the mean residual equal to zero.It is best because it aims at minimizing the variance of errors [16].
For the prediction of the variable Z at a location x 0 , {Z(x 0 )}, the estimator Z * (x 0 ) is defined as [9]: where the λ i are weights that were calculated considering two constrains: (A) the estimator is not biased; (B) The variance of the estimate is minimal.The variance of the estimate is minimal.These conditions are reached when error is equal zero, in average: Developing the error mathematical expectancy expression, we come to the condition of no bias shown in Equation ( 4): The variance of estimate error is given by Equation ( 5): The minimization of the estimate error is carried out developing Equation ( 5), as represented by Equation ( 6): Expanding each term on the right side of this equation, one comes to the Equation ( 7) in terms of the covariance function (C): To minimize the error, it uses the technique of Lagrange multipliers, as shown in Equation ( 8): Finally, we obtained the equation system of Ordinary Kriging ( ) The Indicator Kriging is an estimation technique with the same basis of the linear Kriging, but it is applied to attributes with non-Gaussian distribution, which are transformed according to a non-linear mapping, codification by means of indication.Appliedon a set of numeric sample values, Z(u = u a ), codification by means of indication generates, for a cutting value z k , asample set by means of indication I(u = u a ; z k ), of the kind [16]: (10) Codification by means of indication is applied on the whole sample set, creating, for each cutting value, a sample set by means of indication for which sample values are transformed either in 0 or 1.The K cutting values z k , 1, 2, , k K =  are defined according to the number of samples.Definition of a variation computing model depends on the existence of a minimum distribution of zeros (0) and ones (1) within the set of samples codified by means of indication.It can be shown that the best variation computing is obtained for a cutting value equal to the median computed from the sample set [16].Thus, it is possible to use a single cutting value, equal to or close to the median, in order to generate a single codification by means of indication, known as codification by means of indication by median.Use of a single cutting value reduces computation time, however, this may be restricting if the use of the variation recording model, close to the median, is too different from those obtained by other cutting values.
The set of sample data by means of indication is used to infer values for the random variables by means of indication I(u;z k ) , with u ≠ u a .The conditional expectancy of the numeric random variable obtained by means of indication I(u;z k ) is calculated by Equation (11).
Equation (11) where ( ) ; ; ; The weights ( ) are obtained by solving the following system of equations of Kriging by means of ordinary indication: ; 1, 2 ; , , where φ(u; z k ) is Lagrange's multiplier, h αβ is the separation vector (which is defined by the positions u α and u β ), h α is the vector defined between the positions u α and u β , C 1 (h αβ ; z k ) is the self-covariance defined by h αβ , and C 1 (h α ; z k ) is the self-covariance defined by h α in z = z k .The self-covariances are determined by the theoretical variation computing model defined by set I for z = z k [17].In this paper, the original variable of interest is dummy (0-no use of car/motorcycle; 1-use of car/motorcycle).Thus, in this case there is no cutting value.

Study Area
The study area of this research is the city of São Carlos (São Paulo/Brazil) (Figure 2), with 221,936 inhabitants.96% of the population lives in urban areas, which covers, approximately, 105 km 2 of 1137.30km 2 of total area [18].
The data used for the development of this study is from the Household interview and Urban Transportation Evaluation Survey applied together in Origin-Destination Survey of 2007/2008 in the city [19].A set of 5% of households was randomly selected from the municipality water supply database (from which, the respective geographical coordinates were obtained).

Software Packages: Auxiliary Tools
The computing applications used in this paper were the IBM-Statistical Package for the Social Sciences (SPSS) version 22 for DT modeling and the software GeoMS 1.0 for the geostatistical calculation and definition processes of experimental and theoretical semivariograms and Kriging.The software ArcGIS 10.1 was used in order to obtain graphical representations of the results.

Method
To reach the proposal of this paper, the stages represented in Figure 3, summarized in the subsequent subsections, have been fulfilled.

Data Base Treatment
The database preliminary analysis process led to the elimination of some samples when one, or more, of the following situations occurred: 1) inconsistent or missing data, 2) people who did not travel and 3) households with in meters) relative to the households.The main variables in the sample and its description can be found in Table 1.In the same table, data used in each technique are presented.

Decision Tree (DT) Application
Decision Tree (DT) analysis, especially the CART algorithm, in this research article, was used to achieve the continuous variable for application of Ordinary Kriging-estimated probabilities of PM travel mode.OK method, used in the next step of this paper, consists of using numeric variables, avoiding the use of the original categorical variable "Main travel mode" for the spatial interpolation of mode choice.
The CART algorithm consists of a sequential binary partitioning of the dataset considering the values of the variables.Tree models are fitted by successively splitting the data to form homogeneous subsets, being the result a hierarchical tree of decision rules useful for prediction or classification [20].
The DT analysis was generated with a sample of 1216 cases, using the CART algorithm and adopting the parameter of a minimum of 25 observations per terminal node.The dependent variable was "Main travel mode", consisting of three categories (1 transit, 2 car/motorcycle and 3 non-motorized).The independent variables were socioeconomic, travel characteristics and the qualitative measures of transport system, as shown in Table 1.

Region Selection
Following the steps of the proposed method, exploratory maps of probabilities of travel mode choice, obtained by DT model, were generated in order to observe spatial patterns.Good estimates for spatial estimation depend, mainly, on spatial structure of the variable to be kriged.From the exploratory spatial analysis, was verified that the variables to be kriged did not present apparent spatial pattern.Thus, it was decided to segregate the city of São Carlos (São Paulo-Brazil) into six regions, considering the criteria of income.Thus, using the Cluster Analysis technique, considering the variable income (categorical, by Minimum Wages) and geographic coordinates of households (latitude and longitude in meters), it were obtained six groups/regions.
A detailed analysis of the probability maps for each of the six regions separately, revealed that Region 2 was the one that presented a more diffusive spatial pattern, ranging from the center to the periphery of the area, as shown in Figure 4.This region consists of both low-income households (center of the region) and higher income (peripheral neighborhoods).In addition, Region 2, is a particular one in the city hence it includes the University of São Paulo second campus, an important factor that affects the city dynamics.
Therefore, in order to estimate all the area of Region 2 (Santa Felicia), including the locations where mode choice (PM travel mode choice) was not observed, a database composed by 110 points was submitted to Ordinary Kriging and Indicator Kriging, respectively.

Variograms Modelling: Continuous and Dummy Variables
In modeling the experimental variograms, we used the following two regionalized variables: 1) continuous and 2) dummy.The continuous variable was the probability of choosing PM travel mode, which was then applied in Ordinary Kriging.Following, the variable "Main travel mode" was discretized in a dummy variable.The value "1" represents the use of PM travel mode and value "0" is for cases of non-use.Subsequently, this variable was used in the Indicator Kriging.
In this study, experimental variograms (constructed based on observed points) for the two variables, were constructed and theoretical models were adjusted with spherical and Gaussian functions, respectively.Table 2 summarizes the parameters of the theoretical variograms for each regionalized variable (continuous or dummy).

Cross Validation: Continuous and Dummy Variables
In order to assess the accuracy of the model determined in theoretical variograms, parameters such as Mean Error (ME), Mean Normalized Error (MNE) and Variance of Errors (VE), were calculated taking into account the observed and estimated values.In addition, the percentage of correct estimation of PM travel mode was obtained.The performance measures were calculated considering the following equations.( )

Kriging: Indicator and Ordinary
For the estimation by OK of the continuous variable and by IK of the dummy variable, a grid of 100 × 100 meters was established, being its dimensions based on the average distance between households in the study area.Kriging maps were generated by estimation of the two regionalized variables.

Results and Discussions
In order to access quality of obtained theoretical variograms, some statistical parameters (ME; MNE and VAR) were calculated and can be observed in Table 3.In addition, the percentage of correct estimation of PM travel mode, considering the estimated and observed values of continuous and dummy variables, was also obtained.Upon analysis of cross-validation results, one can see that, despite the low hit rate between observed and estimated values, there were also low values of errors for the both case of Kriging methods.We can also note that the two Kriging methods are very similar.The Indicator Kriging, however, has a higher hit rate, low value of Mean of errors and normalized error.
Important information regarding the accuracy of the theoretical variogram models, as well as the proposed method, that consists in applying two Kriging geostatistical methods to PM travel mode use is listed following: 1) Is to be highlighted that the considered regionalized variables are not natural (variables derive from a previous data treatment and not from direct measures).They were produced by a non-parametric model, the DT model (OK) and from an individual questionnaire (IK); 2) Probably, there are errors associated to survey and data type; 3) The Regionalized Variables are spatially discrete.
Despite the limitations listed for the application of the proposed method, the PM travel mode use was estimated for a 100 × 100 meters grid, for the referred region.Maps generated by Ordinary Kriging estimation and Indicator Kriging are illustrated in Figure 6(a) and Figure 6(b).
Observing Figure 6, we can note that the Kriging maps from the both Kriging methods are very similar.The results show that this travel mode is more likely to be used in the periphery of the region.Thus, the predisposition for car usage decreases considering the center proximity.Is to be highlighted that the main direction of this variogram (N30E) is clearly translated to the respective map.
The results of both spatial estimation methods are consistent with the reality of the region.Locations with higher probability of car/motorcycle use are exactly those corresponding to neighborhoods of higher income population.Conversely, the center, which is the least probable to use the car, corresponds to low-income households.
The map generated by Indicator Kriging is a surface of probabilities to use PM travel mode (car or motorcycle).Discretizing such values at zero and 1, considering the median of the probabilities, it was obtained a discrete map.The blue color indicates the use of PM travel mode (value 1), while the red color represents non-use.The discrete Kriging map (IK) is shown in Figure 7.

Conclusions
The results obtained in this study allowed determining the probabilities of PM travel mode (car/motorcycle) use in household locations where travel mode choices were unobserved through two different procedures: 1) joint application of Decision Tree analysis and Ordinary Kriging; and 2) Indicator Kriging.
The resulting maps, obtained from OK and IK, allowed determining that there was a trend in the use of Private  Motorized travel mode, which increased from the center to periphery of the selected region.Cross-validation showed good results, for the both approaches, considering mean of errors, normalized mean error and variance of errors.
There was better accuracy in approach by Indicator Kriging, which in addition to higher hit rate and lower mean of errors value, was a univariate procedure.
The application of the Ordinary Kriging approach consisted in a two-step methodology, through the application of DT, which included several covariates.Furthermore, regionalized variable, in this case, was from the previous step.This fact could contribute to error propagation.
Nevertheless, the innovative characteristic of this study should be taken into account.The Kriging geostatistical methods presented are based in unusual techniques in the analysis of travel mode choice.Besides, the results show the success in this combination, which allows a preliminary assessment of spatial particularities of the study area, and, it also emphasizes the necessity of robust/solid information basis, special when dealing with questionnaires.
For future research, we should propose the use of non-spatial traditional techniques for travel mode choice problems (discrete choice models, as instance) for comparison with the performance of spatial analysis based on Kriging approaches.
the average of the random function of the stationary region, and the weights to minimize the variance of the estimative error.Considering the sum of the pondering weights equal to 1, a simpler version of Kriging by means of indication is obtained.It is the Kriging by means of ordinary indication, which estimative expression (Equation (13)):

Figure 2 .
Figure 2. Study Area-São Carlos city.Fonte [19].repeated geographic coordinates.Thus, the final sample contains 1216 individuals, 22 categorical variables and 4 numeric variables.Such information was then associated with geographic coordinates (latitude and longitude

Figure 5 (
a) and Figure 5(b) illustrates the theoretical variograms in the main direction for the two regionalized variable.

Figure 4 .
Figure 4. Spatial Distribution of probabilities of motorized private travel mode choice in Region 2.
var = Variance of error; x − y = errors; µ = mean of errors.

Figure 7 .
Figure 7. Map of estimation for dummy variable.
presents an extremely important result concerning the inference of the probability distribution of a random variable.It means that the conditional expectancy of I(u;z k ) yields, for the cutting value z = z k , an estimate of the function of conditioned accumulated distribution, F * (u; z k |(n)), for numeric attributes.Conditional expectancy E{I(u; z k |(n))} may be estimated by algorithms of Kriging, which employ numeric random variables, codified by meansof indication.Kriging by simple indicator is simple linear Kriging applied to a sample set codified by means of indication at the cutting values z = z k , and it has the formulation presented in Equation (12):

Table 1 .
Main variables and techniques.

Table 2 .
Parameters of theoretical variograms for continuous and dummy variables.