Developing a Novel Method for Road Hazardous Segment Identification Based on Fuzzy Reasoning and GIS

Roads are among the most important infrastructure in any country. One problem of road-based transportation networks is accidents. Current methods for identifying road segments with high accident potential are based on statistical approaches that need accident data collected over an extended period of time, so they cannot be applied to newly built roads. In this research, a new approach for road hazardous segment identification (RHSI) is introduced using Geospatial Information System (GIS) and fuzzy reasoning. Among all factors that usually play critical roles in the occurrence of traffic accidents, environmental factors and roadway design are considered. Because the available data are incomplete, uncertainty is handled using fuzzy reasoning. The method is applied to part of Iran's transit roads (Kohin-Loshan) as a less expensive means of analyzing risks and road safety in Iran. Comparing the results of this approach with existing statistical methods shows advantages when data are uncertain and incomplete, especially for recently built roadways where statistical data are limited. The results show that in some instances accident locations are somewhat displaced from the segments of highest risk, and in a few sites hazardous segments are not determined by traditional statistical methods.


Introduction
Road-based transportation networks have become the most important part of the infrastructure in all countries. Roads are important not only as the physical structure of society, but also as the foundation for social and economic development. An increased demand for suburban mobility also increases the problems caused by transportation networks. One of these problems is accident occurrence. Several factors, such as human factors, vehicle factors, environmental factors and roadway design, usually play a role in traffic accident occurrence [1]. At present in Iran, accident data obtained from the "Analysis Form for Traffic Accidents" are used to identify the road segments with high potential for accidents. This form is filled out by a police officer for each traffic accident with casualties on a public road in Iran. Based on the information in these forms, this method picks some segments with high potential for accidents, and the danger related to these segments is then estimated using statistical approaches. Since no statistical information is available for newly built routes, this method cannot be used for transportation networks that have been recently built.

This research introduces a new and general method for identifying road segments with high potential for accidents in transportation networks. Although driver mistakes often contribute greatly to the occurrence of any particular accident, spatial analysis of road hazardous segments helps to explain why accidents are more frequent in some segments than in others. The study area is the Kohin-Loshan transit road, which connects Tehran to the north of Iran and is located in a mountainous region that has most of the factors for accident occurrence. Since the study area is an old one and adequate spatial data were not available, only environmental factors and roadway design are considered in this research, among the several factors that usually play a role in traffic accident occurrence. Moreover, GIS and fuzzy reasoning are used together for the identification of road hazardous segments.

Geospatial Information System (GIS) is a technology which, when incorporated in the analysis of road hazards, facilitates quick data retrieval and supports precise remedial engineering designs to improve road sections that are prone to road traffic accidents [2]. The most straightforward use of GIS for accident analysis is the examination of spatial characteristics of accident locations [3]. Road hazardous segment identification can benefit from the data management, representation and spatial analytical functions offered by a GIS. This research shows how the integrated use of GIS and fuzzy reasoning can be properly applied in modeling the uncertainty of road hazardous segment identification. The terminology of fuzzy logic for spatial information management and modeling localities is introduced in Section 2.1. Related research and the proposed method are introduced next. Section 3 presents the implementation process and the first successful application of this new approach. The evaluation of the results is presented in Section 4.

Background
In [4], Jha and McCall explored the applications of GIS-based computer visualization techniques in highway projects. They found that GIS serves as a repository of geographic information and enables spatial manipulations and database management. Implementation in a real highway project in Maryland indicated that integration of GIS and computer visualization greatly enhances the highway development process. Another research project, conducted by Carreker and Bachman, demonstrated that by applying GIS the accuracy and efficiency of locating crashes could be improved [5]. In [6], Fuller et al. used GIS and remote sensing data and analyzed several geometric road risk factors in the U.S. Southwest. This research used four road geometry factors and geology-based criteria, and did not consider the effect of weather conditions and of land use in proximity to the road on road hazardous location identification. They also did not use expert knowledge for the determination of fuzzy membership functions.

In 2003, a novel adaptive neuro-fuzzy logic model was developed by Adeli and Jiang to estimate freeway work zone capacity. The model combined fuzzy logic with neuro-computing concepts and was used for the nonlinear mapping of 17 different factors impacting the freeway work zone capacity. This method provides two advantages over the existing methods. First, it incorporates a large number of factors impacting the work zone capacity. Second, unlike the empirical equations, this model does not require selection of various adjustment factors or values by the work zone engineers based on prior experience [7].

In [3], Steenberghen et al. showed the usefulness of GIS and point pattern techniques for defining road-accident black zones within urban agglomerations. In their research, one-dimensional (line) and two-dimensional (area) clustering techniques for road accidents were compared. Their method needs previous accident data in the study area for spatial clustering, so it cannot be used on newly built roads. In [8], Cheng and Washington used experimentally derived simulated data to evaluate three hotspot identification methods observed in practice: simple ranking, confidence interval, and Empirical Bayes. The results showed that the Empirical Bayes technique significantly outperforms the ranking and confidence interval techniques. In 2008, Erdogan et al. used GIS as a management system for accident analysis and determined the hotspots on highways with two different methods, kernel density analysis and repeatability analysis. They found that the hotspots determined with the two methods reflect truly problematic places such as crossroads and junction points [9].

The performance of various methods in hotspot identification was compared in [10] by Montella. In this research, seven commonly applied hotspot identification methods (crash frequency, equivalent property damage only crash frequency, crash rate, proportion method, empirical Bayes estimate of total-crash frequency, empirical Bayes estimate of severe-crash frequency, and potential for improvement) were compared against four robust and informative quantitative evaluation criteria. In [11], Polat and Durduran used four classifier algorithms (ANN, ANFIS, SVM, and C4.5 decision tree) to classify traffic accident cases with the help of GIS, after a data preprocessing method called SCAW was applied to the traffic accident database.

Since no statistical information is available for newly built routes, these methods cannot be used for transportation networks that have been recently built. These methods have also used GIS only as a visualization tool to show their results. The method proposed in this research uses GIS functions to analyze raw data and extract useful information from them, and integrates GIS and fuzzy reasoning through expert knowledge to help road departments in suburban jurisdictions improve the safety of the roads under their management.

The Proposed Method
The research methodology is based on the integration of GIS and fuzzy reasoning, which helps decision makers to determine which risks are the most important ones and, ultimately, to decide where hazard mitigation strategies should be employed. Figure 1 shows the steps of the research methodology for creating a composite risk map for the identification of road hazardous segments.

Fuzzy Logic in Spatial Information
Fuzzy set theory, introduced by Zadeh in the 1960s, resembles human reasoning in its use of approximate information and uncertainty to generate decisions [12]. Fuzzy logic allows objects to take partial membership in vague concepts. The main idea of fuzzy logic is that items in the real world are better described by having partial membership in complementary sets than by having complete membership in exclusive sets [12]. In classic logic, the membership of an element in a set is represented by 0 if it does not belong and 1 if it does, giving the set {0, 1}. In fuzzy logic, this set extends to the interval [0, 1]. Therefore, it could be said that fuzzy logic is an extension of the classic systems [13]. A fuzzy set A over a universe of discourse X (a finite or infinite interval within which the fuzzy set can take a value) is a set of pairs (Equation (1)):

$$A = \{(x, \mu_A(x)) \mid x \in X\} \tag{1}$$

In Equation (1), $\mu_A(x)$ is called the membership degree of the element x to the fuzzy set A. This degree ranges between the extremes 0 and 1 of the domain of the real numbers. Depending on the type of membership function, different types of fuzzy sets are obtained. Zadeh proposed a series of membership functions that can be classified into two groups: "linear" ones, made up of straight lines, and "curved" ones, such as the Gaussian forms. A linguistic label is the word, in natural language, that expresses or identifies a fuzzy set that may or may not be formally defined. Thus, the membership function $\mu_A(x)$ of a fuzzy set A expresses the degree to which x satisfies the category specified by A.

Membership functions are at the core of fuzzy logic, so proper use of fuzzy reasoning depends on proper construction of membership functions. A number of methods are available to construct membership functions using expert knowledge. Because expert knowledge is used in this research to construct membership functions and fuzzy rules, experts must be selected carefully to ensure the use of appropriate expert knowledge. Selected experts should be familiar with an analysis of the public concern in terms of multiple issues and be able to judge measurements of corresponding indicators in linguistic terms. Because road geometry and environmental factors have been considered in this research for road hazardous segment identification, a heterogeneous group of experts (both scientific and practical) from the Ministry of Road and Urban Development Transportation Research Institute and the Meteorological Organization was selected.

Another issue that should be considered is the method of expert knowledge elicitation. Different methods (point estimation, interval estimation, direct rating, and transition interval estimation) are available to elicit expert knowledge for the construction of membership functions. Since the research variables are of different types and the point estimation method can be applied to nominal, discrete and continuous variables, a point estimation based method has been used in this research. The main advantage of this method is the simple processing of elicited expert knowledge. In point estimation, an expert j (j = 1, ..., J) determines unambiguously whether each x does or does not have property $A_i$. An overall assessment is computed as Equation (2):

$$\mu_{A_i}(x) = \frac{1}{J} \sum_{j=1}^{J} a_{ij}(x) \tag{2}$$

where $a_{ij}(x)$ is 1 if expert j judges that x has property $A_i$ and 0 otherwise. In this method, more than one expert is needed to obtain a proper membership function [14].
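The point estimation step can be illustrated with a minimal sketch. It assumes that every expert is weighted equally, so that the membership degree of a value is simply the fraction of experts who judge that the value has the property. The function name `point_estimate_membership` and the example votes are hypothetical and are not taken from the study data.

```python
from typing import List, Sequence


def point_estimate_membership(judgments: Sequence[Sequence[int]]) -> List[float]:
    """Average binary expert judgments into membership degrees.

    judgments[j][k] is 1 if expert j says value x_k has the property, else 0.
    Equal weighting of experts is an assumption of this sketch.
    """
    n_experts = len(judgments)
    if n_experts == 0:
        raise ValueError("at least one expert is required")
    n_values = len(judgments[0])
    if any(len(row) != n_values for row in judgments):
        raise ValueError("every expert must judge the same values")
    return [sum(row[k] for row in judgments) / n_experts for k in range(n_values)]


# Hypothetical example: three experts judge whether four slope values are "High".
votes = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
membership = point_estimate_membership(votes)
```

With a single expert the result can only be 0 or 1, which is one way to see why more than one expert is needed to obtain graded membership values.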
A fuzzy logic based methodology in spatial information can provide a conservative representation tool for individual differences in perception. A basic difference between perceptions and measurements is that, in general, measurements are crisp whereas perceptions are fuzzy. In a fundamental way, this is the reason why dealing with perceptions requires a logical system that is fuzzy rather than crisp [15]. From this simple concept, a complete mathematical and computing theory has been developed that facilitates the solution of certain problems in spatial information. The types of uncertainty that appear in geospatial information systems are not just simple randomness of observation (as in the weather data used as an environmental factor in this research) but are manifested in many other forms, including imprecision, incompleteness and granularization. The multiplicity of uncertainty appearing in GIS data and analysis requires a variety of formalisms to model these uncertainties. In light of this, it is natural that fuzzy set theory has become a topic of intensive interest in many areas of geospatial research and applications [16].

Road Hazardous Segment Identification Based on Fuzzy Inference System
This research shows how fuzzy reasoning can be properly applied in modeling localities. Identification of road hazardous locations by fuzzy reasoning has a definite advantage over a crisp set. A fuzzy logic based methodology is used in this research for the following reasons:
• It makes the best possible use of sparse information to reconstitute details.
• Fuzzy logic is well suited for modeling continuous, real-world systems.
• A fuzzy logic based methodology for modeling localities provides a conservative representation tool for individual differences in perception and constitutes a closer depiction of reality.
• Research variables are continuous, imprecise, or ambiguous.
• Fuzzy set modeling over the reference data can minimize the problems caused by the imperfection of source data.
• Fuzzy sets are an extension of crisp (two-valued) sets to handle the concept of partial truth, which enables the modeling of uncertainties of natural language [17] in this research.

Traffic crashes are caused by the interaction of vehicle, driver, roadway and environmental factors. All these factors interact with each other and influence the occurrence and severity of crashes simultaneously. Although driver error often contributes greatly to the occurrence of any particular crash event, analysis of roadway and environmental factors helps to explain why crashes are more frequent in some locations than in others. In this research, several road hazard criteria have been taken into consideration, depending on the accessibility of data. Table 1 illustrates these criteria and their descriptions. This research is an attempt to implement the road and environment related factors for road hazardous segment identification and thus help in identifying the required remedial measures.
Each of these variables is treated as a risk factor in the analysis of risks associated with the roads. Fuzzy processing of the hazard descriptors requires a specification of the linguistic labels which represent fuzzy sets. The linguistic variables and linguistic labels used for the investigation of each geometry and environmental factor are listed in Table 2.
The type of fuzzy membership function chosen for each risk factor is very important, so in this research various functions were tested and an appropriate function was determined for each risk factor. Widely applied membership functions are bell-shaped and trapezoidal functions.

Table 1. Description of roads hazard criteria.

Radius: The shorter the radius, the higher the hazard potential.
Slope: Sections with higher slope have higher hazard potential.
Visibility: Sections with less visibility have higher hazard potential.
Distance from Intersection: Sections closer to intersections have higher hazard potential.
Road Width: The narrower the road width, the higher the hazard potential.
Distance from the Starting Point of Roads (cities): Sections closer to cities have higher hazard potential.
Distance from Population Centers: Sections closer to the population centers have higher hazard potential.
Rain Value: The higher the rain value, the higher the hazard potential.

Figure 2 shows the trapezoidal function.
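As an illustration of the trapezoidal shape, the following minimal sketch evaluates a trapezoidal membership function. The parameter names `a`, `b`, `c`, `d` (the two feet and the two shoulders) and the numeric breakpoints in the example are assumptions made here for illustration; they are not the membership functions elicited from the experts in this study.

```python
def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership degree of x.

    The degree is 0 outside [a, d], 1 on the plateau [b, c], and linear in between.
    Setting a == b or c == d gives a shoulder; setting b == c gives a triangle.
    """
    if not (a <= b <= c <= d):
        raise ValueError("parameters must satisfy a <= b <= c <= d")
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


# Hypothetical label "Small" for a curve radius in metres (breakpoints invented).
degree_small = trapezoid(50.0, 0.0, 0.0, 40.0, 80.0)
```

Here a radius of 50 m lies on the falling edge between 40 m and 80 m, so it belongs to the hypothetical label only partially.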
In special cases like symmetrical trapezoids and triangles, the number of parameters reduces to three. As the values of these parameters change, the membership functions vary accordingly, thus exhibiting various forms of membership functions [18]. This research uses trapezoidal membership functions because of their simplicity, their learning capability, and the short amount of time required for designing the system. The main steps of this fuzzy inference system are: input, fuzzification, implication, aggregation and defuzzification.

After implementing the criteria, complete sentences have to be formulated in order to create useful statements. Conditional statements, or IF-THEN rules, are the statements that make fuzzy logic useful. A single fuzzy IF-THEN rule can be formulated according to Equation (4):

$$\text{IF } x \text{ is } A \text{ THEN } y \text{ is } B \tag{4}$$

where A and B are linguistic labels defined by fuzzy sets on the range of all possible values of x and y, respectively. The IF part of the rule, "x is A", is called the antecedent or premise; the THEN part of the rule, "y is B", is called the consequent. The antecedent is an interpretation that returns a single number between 0 and 1, whereas the consequent is an assignment that assigns the entire fuzzy set B to the output variable y. The antecedent may integrate several inputs using logical AND and OR. Fuzzy reasoning with fuzzy IF-THEN rules enables linguistic statements to be treated mathematically.

In a fuzzy system, the level of qualitative complexity increases with the number of rules. In this research, complexity was reduced by limiting the number of fuzzy rules, which was done by reducing the number of linguistic values that the input variables of the fuzzy inference system can take. The formulation of the fuzzy rules demands a careful assessment of the importance of the descriptors for a mostly unique characterization of hazard classes. According to Table 2, this study restricts the output of the fuzzy process to five classes, namely absolutely safe, safe, danger prone, dangerous and very dangerous. Some samples of the IF-THEN fuzzy rules for the determination of road hazardous segments are given in Table 3. For instance, rule 9 of Table 3 reads: IF distance from population centers is Far AND radius is High AND slope is Appropriate AND visibility is Appropriate AND distance from intersection is Far AND road width is Wide AND rain value is Very Low AND distance from cities is Far THEN point is Absolutely Safe.

This research uses the fuzzy Takagi and Sugeno (TSK) concept for fuzzy-based road hazardous segment identification, because it offers some advantages with regard to computational efficiency and adaptive optimization [19]. In the TSK approach, membership values in the premise part are combined by product inference to get the firing strength of each rule, and the consequent part of each rule is modeled by a linear combination of the input variables plus a constant term (Equation (5)). The TSK rules can be expressed as follows (Takagi and Sugeno, 1983):

$$R_j:\ \text{If } x_1 \text{ is } A_1^j \text{ and } x_2 \text{ is } A_2^j \text{ and } \ldots \text{ and } x_n \text{ is } A_n^j, \text{ then } f_j = a_0^j + a_1^j x_1 + \cdots + a_n^j x_n \tag{5}$$

where $R_j$ is the jth rule ($j = 1, 2, \ldots, m$), $A_i^j$ are linguistic terms of the premise part (e.g. Very Small, Small, Appropriate, High), $f_j$ is the output variable (i.e. the fuzzy indicator for the amount of danger), and $a_i^j$ are coefficients of the linear equations. The process of shaping the consequent (implication) is carried out, and the output fuzzy sets are then aggregated over all rules. The final output y (centroid defuzzifier) of the hazardous segment identification fuzzy inference system is calculated using Equation (6):

$$y = \frac{\sum_{j=1}^{m} w_j f_j}{\sum_{j=1}^{m} w_j} \tag{6}$$

where $w_j = \prod_{i=1}^{n} \mu_{A_i^j}(x_i)$ is the firing strength of rule $R_j$. The final output is the weighted average of the consequent equations of the rules. Figure 3 shows the overall process.
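The TSK inference described above (product of premise memberships as firing strength, linear consequents, weighted-average output) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `tsk_output`, the two rules, their coefficients and the membership degrees are all invented for the example.

```python
from math import prod
from typing import Sequence


def tsk_output(
    memberships: Sequence[Sequence[float]],
    coefficients: Sequence[Sequence[float]],
    inputs: Sequence[float],
) -> float:
    """Weighted average of linear rule consequents.

    memberships[j]  : premise membership degrees of rule j, one per input used
    coefficients[j] : [a0, a1, ..., an] of rule j, so f_j = a0 + sum(a_i * x_i)
    inputs          : crisp input values x_1 .. x_n
    """
    # Firing strength of each rule: product of its premise memberships.
    weights = [prod(degrees) for degrees in memberships]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    # Linear consequent of each rule evaluated at the crisp inputs.
    consequents = [
        coef[0] + sum(a * x for a, x in zip(coef[1:], inputs))
        for coef in coefficients
    ]
    return sum(w * f for w, f in zip(weights, consequents)) / total


# Hypothetical two-rule system with two inputs (all numbers invented).
example_inputs = [2.0, 1.0]
example_memberships = [[0.8, 0.75], [0.2, 1.0]]
example_coefficients = [[200.0, 0.0, 0.0], [50.0, 10.0, 5.0]]
danger = tsk_output(example_memberships, example_coefficients, example_inputs)
```

In this example the first rule fires more strongly than the second, so the output lies closer to the first rule's consequent value than to the second's.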

Implementation and Result
This section briefly explains the implementation steps of the proposed method, including data acquisition and preparation, modeling road hazardous segments based on fuzzy reasoning, and finally the evaluation of results.

Data and Study Area
Figure 4 shows the study area. The study area is the Kohin-Loshan transit road that connects Tehran to the north of Iran (Gilan). The research area is located in a mountainous region. This study uses several primary and digital data. The databases are obtained from numerous sources and in various formats (Table 4).
The geometric specification data of the study area were paper-based text attributes that had to be converted to digital format for input to the database. [...] black segments of this transit road are used for validation of the proposed method.

Experimental Investigations
To test the proposed method, [...] of the study area is the extraction of [...] from the data source. After converting the data format and reference coordinate system, geometric and topological corrections are performed on the data. In the next step, the route is divided into smaller segments such that each segment contains at least one of the hazard potentials. A special code is assigned to each of these segments. One point symbol is then considered as a candidate for each line segment. Meanwhile, by considering experts' opinions and existing information on the study area, other segments that are prone to accidents and hazards are selected. After preparing the data and selecting the criteria and layers, a database of all data and layers is generated, and road geometry and environmental attributes are assigned to the considered points. Each of the variables in Table 2 is treated as a risk factor in the analysis of road hazardous segment identification, and critical standard boundaries for each criterion (observed indicators) are determined.
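The step of choosing one candidate point per line segment can be sketched as follows. The text does not state where on a segment the point is placed, so this sketch assumes the point halfway along the segment, measured by path length; the function name `midpoint_along` and the coordinates are hypothetical.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def midpoint_along(line: List[Point]) -> Point:
    """Return the point halfway along a polyline, measured by path length."""
    if len(line) < 2:
        raise ValueError("a segment needs at least two vertices")
    pieces = list(zip(line, line[1:]))
    lengths = [hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in pieces]
    remaining = sum(lengths) / 2.0
    for ((x1, y1), (x2, y2)), length in zip(pieces, lengths):
        if length > 0 and remaining <= length:
            t = remaining / length  # fraction of this piece to walk
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= length
    # Degenerate case (all vertices coincide) or rounding at the last vertex.
    return line[-1]


# Hypothetical road segment given in projected coordinates (metres).
candidate = midpoint_along([(0.0, 0.0), (4.0, 0.0), (4.0, 2.0)])
```

In a GIS workflow the same result would normally be obtained with the software's own linear-referencing tools; the sketch only makes the geometric idea explicit.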
Since the boundaries of classes or groups of data are not sharply defined, their indicators and relationships have uncertain definitions. Therefore, some uncertainty is inherent in this method. Fuzzy set theory is a useful tool for handling this uncertainty with linguistic variables. It also facilitates the subsequent integration of data layers in the generation of composite risk maps. Prior to the fuzzy process, the membership functions of each factor have to be specified using expert knowledge. The membership functions are depicted in Figure 5.
The formulation of the rules takes into consideration occurrence and hazard values, the complexity of each factor and the experience of experts. Therefore, the selection of appropriate rules for road hazardous segment identification is a sensitive and important subject.

Result and Evaluation
After the inputs and output of the fuzzy inference system and its membership functions and rules have been defined, the danger value of each point is determined. Danger values of the proposed fuzzy inference system lie in the range of 0 (safe) to 250 (very dangerous). Each hazard point is then assigned to one of these classes: absolutely safe, safe, danger prone, dangerous or very dangerous. Figure 6 shows the final results of the fuzzy reasoning process for the identification of segments with high potential for accidents in the study area. The x and y axes show the x and y coordinates of the road points, and each danger class is shown with a special symbol and color. Figure 7 shows the final results of the proposed approach for the identification of hazardous segments in the GIS environment.
In this figure, red and blue points indicate very dangerous and dangerous segments, respectively. This composite risk map depicts a good correlation between existing accident segments (yellow dots, taken from the statistical analysis of accident records) and segments with high potential for accidents (red and blue dots). However, in some instances accident locations are somewhat displaced from the segments of highest risk, and in a few sites hazardous segments are not determined by traditional statistical methods. Several factors may explain this displacement:
• Error associated with the accident data;
• Approximate determination of existing accident points by police officers;
• Error in geometry and environmental data;
• Temporary obstructions in the roadway;
• Other parameters and factors that are unaccounted for in this analysis because of the lack of proper data.
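The assignment of a danger value in the range 0 to 250 to one of the five output classes can be sketched as follows. The class boundaries used in the study are not given in the text, so the equal-width cut points below (`UPPER_BOUNDS`) are an assumption made purely for illustration, as is the function name `danger_class`.

```python
from bisect import bisect_right

CLASSES = ["absolutely safe", "safe", "danger prone", "dangerous", "very dangerous"]
# Hypothetical equal-width cut points on the 0-250 scale (not from the study).
UPPER_BOUNDS = [50.0, 100.0, 150.0, 200.0]


def danger_class(value: float) -> str:
    """Map a danger value in [0, 250] to one of the five output classes."""
    if not 0.0 <= value <= 250.0:
        raise ValueError("danger value must lie in [0, 250]")
    # bisect_right counts how many cut points are <= value.
    return CLASSES[bisect_right(UPPER_BOUNDS, value)]


labels = [danger_class(v) for v in (0.0, 75.0, 168.75, 250.0)]
```

Points labelled "dangerous" or "very dangerous" would then correspond to the blue and red symbols on the composite risk map.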

Conclusion and Future Work
This research introduced a novel method for the determination of hazardous segments in transportation networks under uncertainty, especially for recently built transit roads for which no statistical data on accident occurrences exist. The analysis in this research has shown that although driver error often contributes greatly to the occurrence of any particular accident on suburban roads, consideration of environmental and road geometry factors helps to explain why crashes are more frequent in some segments than in others. Consequently, GIS was employed in this research to obtain a new approach for creating maps of the hazardous segments of roads based on the theory of fuzzy logic. The study supports the proper application of fuzzy set theory to spatial concepts, such as road hazardous locations, and provides a mechanism to address various kinds of uncertainty by preserving the detail that would have been truncated in a crisp set. Consideration of more criteria for road hazardous segment identification is another issue that can be addressed in future research.

Acknowledgements
This research is done using the data provided by National

Figure 1. Research methodology for identification of road hazardous segments.

Figure 4. Study area. The region's elevation ranges from approximately 300 to 2394 m, and the length of the route is approximately 73 km. This route has most of the factors for accident occurrence.

Figure 6. Results of fuzzy reasoning process for identification of hazardous segments.

According to Table 2, these factors can be divided into two classes: road geometry design factors and environmental factors.

Table 2. Linguistic variables and labels for the fuzzy-based road hazardous segment identification process.

Table 3. Some fuzzy rules.
1-IF radius is Very Small AND slope is High AND visibility is Inappropriate AND distance from intersection is Very Near AND road width is Very Narrow AND rain value is Very High AND distance from cities is Very Near THEN point is Very Dangerous.
3-IF distance from population centers is Near AND radius is Very Small AND slope is Appropriate AND visibility is Inappropriate AND distance from intersection is Very Near AND road width is Very Narrow AND rain value is Low AND distance from cities is Near THEN point is Very Dangerous.
4-IF slope is High AND visibility is Inappropriate AND distance from intersection is Far AND road width is Very Narrow AND distance from cities is Near THEN point is Dangerous.
5-IF radius is Very Small AND visibility is Inappropriate AND distance from intersection is Far AND distance from cities is Far THEN point is Dangerous.
6-IF distance from population centers is Near AND radius is Appropriate AND slope is Low AND visibility is Appropriate AND distance from intersection is Far AND road width is Appropriate AND rain value is Low AND distance from cities is Far THEN point is Danger Prone.
7-IF distance from population centers is Moderate AND radius is Appropriate AND slope is Appropriate AND visibility is Appropriate AND distance from intersection is Near AND road width is Appropriate AND rain value is Low THEN point is Danger Prone.
8-IF distance from population centers is Far AND radius is High AND slope is Appropriate AND visibility is Appropriate AND distance from intersection is Far AND road width is Wide AND rain value is High AND distance from cities is Far THEN point is Safe.
9-IF distance from population centers is Far AND radius is High AND slope is Appropriate AND visibility is Appropriate AND distance from intersection is Far AND road width is Wide AND rain value is Very Low AND distance from cities is Far THEN point is Absolutely Safe.

According to the samples of rules in Table 3, more rules are considered for the very dangerous and dangerous output classes.