Remote Sensing Applied to the Extraction of Road Geometric Features Based on Optimum Path Forest Classifiers, Northeastern Brazil

One of the principal difficulties related to road safety management in Brazil is the lack of data on road projects, especially those on rural roads, which makes it difficult to use road safety studies and models from other countries as a reference. Updating road networks through the use of hyperspectral remote sensing images can be a good alternative. However, accurately recognizing and extracting roads from hyperspectral images has been recognized as a challenging task in the processing of hyperspectral data. To address this challenge, Hyperion hyperspectral images were combined with the Optimum-Path Forest (OPF) algorithm for supervised classification of rural roads, and the effectiveness of the OPF and SVM classifiers when applied to these areas was compared. Both classifiers produced reasonable results; however, the OPF algorithm outperformed SVM. The higher classification accuracy obtained by the OPF was mainly attributed to its ability to better distinguish between regions of exposed soil and unpaved roads.


Introduction
Traffic accidents in Brazil stand out in terms of magnitude, both in number of deaths and injuries as well as in their financial consequences for users and for society. According to the World Health Organization's world ranking [1], 37,345 traffic deaths were recorded in 2016 by the Ministry of Health's Mortality Information System (SIM). The country remains far from the goal established by the United Nations (UN), which stipulates a 50% reduction in the number of victims over 10 years, beginning in 2011, and it ranks fifth among countries with the most traffic deaths, behind only India, China, the United States, and Russia.
In 2016, 301,351 accidents were recorded, of which 169,163 occurred on federal highways inspected by the Federal Highway Police (PRF), representing approximately 56% of the total number of accidents. Of these accidents that occurred on federal highways, 4% had fatalities, 37% had injuries, and 59% were accidents where only property damage occurred. Approximately 67% of fatal accidents occurred in rural areas [2]. Due to the growth in traffic accidents, the sectors responsible are increasingly being questioned with regard to road quality and safety.
Although there are a variety of methods available to detect road safety problems, detailed analysis of road accidents remains one of the main indicators of network deficiencies. The decision-making process in road safety management depends on indicators that can objectively express the safety level of the components of a given transportation network [2]. In this sense, researchers have sought to relate historical series of traffic accidents to the geometric and operational attributes of the road using statistical regression models called Accident Prediction Models (MPA) or road safety performance functions. Although MPAs have been explored for more than two decades in countries such as Canada, the United States, England, and Sweden, in Brazil they are still in an incipient stage of development.
According to IPEA/ANTP [1], the particular characteristics of Brazilian traffic (pedestrians crossing outside of the crosswalk, only motorized vehicles stopping at traffic lights, lack of bicycle lanes, radars only reducing vehicle speed 100 meters before and after, among others) make it difficult to use studies and data from other countries as a reference source. It is necessary to search for data that express the Brazilian reality and that allow studies to be based on these data. It is therefore necessary to collect traffic accident data that allow these studies to be carried out, generate diagnoses, and indicate alternative solutions to the problems detected. One of the main difficulties, within this context, is related to the scarcity of computerized databases on accidents [3]. What does exist may also have flaws in collection and consequently little data reliability.
For many municipalities in Brazil, the only maps available showing their territories are those provided by the Brazilian Institute of Geography and Statistics (IBGE). These are often complex, outdated, and little known, causing the local reality and the cartography to be far apart.
When it comes to geometric road design, the problem is even greater. Around 70% of Brazilian highways were built in the 1960s and their projects are either on paper or digitized as PDF files. According to Augusto Nardes, auditor of the Federal Audit Court (TCU), in an interview with the Globo newspaper [4], lack of planning is one of the TCU's biggest concerns, and this can be evidenced by the absence of necessary documentation (basic design, planning, and others) in the execution of transportation projects. Another aggravating fact is that traditional methods for updating maps have not kept pace with the increase in the number of roads caused by the country's socioeconomic growth in recent decades, due to the difficulty of accessing some locations, the difficulty in finding specialized technical personnel, or the high cost.
Because the updating of cartographic information is not just a topic of technical interest, but, above all, of economic and social interest, cartographic updating projects are being carried out with the use of geoprocessing and remote sensing techniques, beginning with Geographic Information Systems (GIS) and interpretation of satellite images, making data available in digital format. For Salbego et al. [5], the availability of information in digital format allows for the costs of the updating and replacement processes to be reduced, because products generated from GIS can be updated, edited, printed, and duplicated faster and more easily than those generated through traditional methods.
For more than 80 years, aerial photographs have been an indispensable tool for the development of a variety of engineering projects, such as highways, gas pipelines, transmission lines, and many others in Brazil. The spatial resolution of these images has greatly increased in recent decades, enabling their use in the implementation of highway projects. The automatic or semi-automatic extraction of roads can be the most convenient way to overcome the problem of the lack of project documentation for road safety.
Finding an efficient way to extract road networks automatically or semi-automatically is an important topic that has been discussed in many studies [6] [7] [8] [9] [10], in which different methods and algorithms have been used. Most studies agree that extracting roads from aerial images is a complicated task due to occlusion, shadows, and trees, as well as the different types of roads that appear in aerial images, and these conditions make it difficult to accurately extract roads [11] [12] [13] [14]. In addition, although many road extraction methods based on remotely sensed images have been used in various studies, most of them were designed for urban or high-quality roads [15], while few are applicable to rural roads.
Despite the great potential for information extraction, image classification techniques face some challenges, such as the large amount of data to be processed, which can reduce classifier efficiency. Traditional pattern recognition methods for multispectral classification of remotely-sensed images are based on standard statistical techniques such as maximum likelihood [16]. These methods do not produce satisfactory results for hyperspectral imaging because they have a limited ability to resolve confusion between classes, especially classes with some spectral similarity.
Methods based on machine learning algorithms (MLA) have been applied to extract relevant information from hyperspectral data. The most popular approaches for hyperspectral image classification, such as SVM and ANN, are considerably time-consuming for large databases, especially in the training stage. To speed up SVMs, some variations have been developed, such as LASVM and SVM without kernel mapping. The former is limited to binary classification and the latter considerably reduces the classification accuracy in cases of class overlap [17].
A new classifier that has been highlighted in the literature is a technique called Optimum-Path Forest (OPF). The OPF proposes to classify patterns using graph theory concepts [18] [19]. This approach emerged as a generalization of the Image Foresting Transform (IFT) [17]. Due to its ease of use and efficiency during data training, it has been shown to be an interesting approach to classification problems [18]. As the technique is relatively recent in the literature and there are few studies on strategies to extract road segment information from hyperspectral images, this paper proposes to introduce the OPF algorithm to extract road geometric features from hyperspectral images and evaluate the effectiveness with a comparison between multi and hyperspectral images. The OPF classifier is not a substitute for deep networks, but can be used as a complement. The idea in this manuscript is not to act directly on feature extraction, as deep learning does, but on the classification stage, and the OPF can be used with the features learned by these networks.
In this method, a geometric base of rural roads will initially be built from road segments extracted from the image. Then, techniques for grouping and reconstructing missing segments from the road network will be used.

Use of Remote Sensing Data to Identify Geometric Features of Roads
When classifying roads with remote sensing data, identification of small objects and linear features is important, while a high spatial resolution is required for more accurate classification [20]. Generally, satellite images with submetric spatial resolution are available only in a single panchromatic spectral band. Multispectral images generally have lower spatial resolution than panchromatic bands. This may not be sufficient to accurately distinguish the roads present.
Hyperspectral imaging systems are characterized by the division of the electromagnetic spectrum into a large number of bands (over 40) and are the result of technological advances in imaging. The narrow (typically 10 to 20 nm wide) and contiguous bands allow for the extraction of reflectance spectra at the pixel scale [21]. High spectral resolution allows for a more detailed analysis of land cover spectral signatures and human action than multispectral images [22]. The use of images to extract road features is not recent, however, separating land use classes such as roads and urban areas is not an easy task. The use of hyperspectral images for highway projects is still incipient and the perspective is that geometric characteristics of roads can be identified more simply with image spectroscopy.
When extracting highway information from data generated by remote sensing, it is often more important to have high spatial resolution than high spectral resolution. However, for Taherzadeh et al. [23], hyperspectral remote sensing has great potential for application in the analysis of complex urban scenes. Low spatial resolution images make it difficult to identify roads. However, even for high spatial resolution sensors, if the materials present in the scene have a similar spectral response, they may be confused, depending on the number of bands used in the image acquisition process. In fact, the use of a hyperspectral sensor improves the chances of differentiation of these materials and their correct identification in the scene. The narrower the spectral bands, that is, the higher the spectral resolution of the sensor, the more detail about the spectral response of the targets can be extracted, making it possible to considerably reduce misclassification.
The characteristics of the road system's linear features in digital images are mainly linked to the spectral and geometric properties of the target, which directly influence the extraction processes. The pathways in a Remote Sensing image, from the geometric and spectral point of view, can be considered as narrow and continuous ranges of high brightness intensity, bordered by low intensity regions [24]. Xi and Weng [24] further state that the brightness intensity does not vary greatly over short distances along roads, due to the fact that their spectral properties are similar over short stretches. Geometrically, a pathway is usually composed of straight and curved segments, most commonly in the form of circular arcs. Eslami and Mohammadzadeh [25] describe rural or non-urban road sections of a digital image as having characteristics such as constant width, continuous curvature change, and homogeneous local distribution.
In low-resolution images, roads appear mainly as lines that form a more or less dense network, and are directly related to the degree of anthropogenic occupation of the region [26]. In this type of image, the pathways are expressed as lines about 1 to 3 pixels wide and then modeled as lines in the extraction process [27]. In high resolution images, geometric features such as structure and shape play a crucial role in recognizing the road network. Road network extraction over hyperspectral images has great advantages over multispectral images in that it increases the ability to discriminate the materials that make up the road surface from most other types of materials that make up the landscape.
The main advantage of hyperspectral remote sensing in road studies is the ability to measure spectral features or identify unique absorption bands present in materials in this environment. The large number of contiguous bands allows for the extraction of information about the chemical and physical properties of materials [28]. A target that is difficult to identify by traditional multispectral sensors can therefore be discriminated by hyperspectral sensors depending on the spectral features present in the pixels.
However, due to the spatial variability of spectral signatures, the extraction of hyperspectral image characteristics is widely recognized as one of the most challenging tasks in processing hyperspectral images [29] [30]. Many existing methods use manual classification [19] [27] [28] [31] [32] [33], which involves the experience of specialists. Gabor filters [34], adaptive filters [35], and Markov chains [36] are often adopted.
In recent years, pattern recognition methods have attracted widespread interest in remote sensing [37] [38] [39]. They are considered to have great potential for classifying hyperspectral images. From training sets, pattern recognition methods can effectively describe the characteristics of the data. Automated information extraction from images attempts to reproduce how the brain interprets features. One way to make image interpretation simpler is to separate groups of pixels with similar spectral characteristics, that is, to recognize patterns. Pattern recognition involves techniques for assigning patterns to their respective classes automatically with minimal human interference. When it is uncertain whether the data follow a Gaussian distribution, nonparametric algorithms are most often recommended for classification, such as SVM (Support Vector Machine) and Decision Tree classifiers, which are supervised and nonparametric. Although widely used to classify images from remote sensors, they require a lot of computational effort to achieve acceptable accuracy rates on a test set. They often become unfeasible in situations that require constant retraining and especially in applications with large data volumes.
Detailed studies of deep learning models for processing remote sensing data have been carried out. Chen et al. [29] propose a classification strategy based on deep belief networks (DBN). The multilayer DBN model is designed to learn the features of the hyperspectral data, and the learned features are then classified by logistic regression. Ding et al. [40] propose a method for classifying hyperspectral images based on convolutional neural networks (CNN), where convolutional kernels can be automatically learned from the data through clustering. Wu et al. [41] propose a convolutional recurrent neural network (CRNN) for classifying hyperspectral data. Convolutional layers are used to extract locally invariant features, which are then fed into recurrent layers to further extract contextual information between different spectral bands. Li et al. [42] propose a CNN-based pixel pair extraction structure for classifying hyperspectral images. A pixel pair model is designed to exploit the similarity between pixels and ensure a sufficient amount of data for the CNN.
Some ensemble learning methods are based on the support vector machine [20] [31]. Gao et al. [44] used Random Multi-Graphs (RMG), a graph-based ensemble method that uses systematically constructed trees created from randomly selected subsets of features. In other words, the trees are built in randomly chosen subspaces. This randomness can improve the performance of hyperspectral image classification and mitigate the well-known Hughes phenomenon. However, graph-based ensemble learning methods have rarely been considered for hyperspectral image classification.
Although there are many studies and methods for extracting roads based on remotely sensed images [45]-[50], most were designed for urban roads or high-quality images, making few of them applicable to rural roads [15]. In their study, Jian et al. highlight three problems related to rural roads. First, the variability of materials used for pavement (asphalt, cement, gravel, stone, etc.) that have different spectral signatures can be a problem when the above methods are applied.
Second, rural roads are generally narrow, and some road segments can be completely obscured by shadows from clouds, buildings, or other elements of the network. Third, rural roads have more curves than urban roads. It is difficult for existing methods to extract complete roads. Another problem with rural roads, specifically in Northeast Brazil, is the predominantly caatinga vegetation. In periods of drought, these plants can be easily confused with exposed soil and unpaved roads.
This study makes use of the graph-based machine learning model, which was rarely considered for classification of hyperspectral images. In particular, this method introduces OPF in hyperspectral classification for use in geometric design of roads.

OPF Classifiers
The Optimum-Path Forest (OPF) classifier was presented by Papa et al. [19], with two proposed variants: unsupervised and supervised, which are subdivided into complete-graph OPF and k-nearest neighbor (k-NN) graph OPF, the most widely used graphs. This model has produced good results when applied to other problems [17] [18] [51]. For this classifier, a forest is created where the nodes are sample descriptors. A complete adjacency relationship is considered for these nodes, and a distance function is used to define the relationships between the nodes. Within the forest, trees are classified in such a way that each tree is associated with a class.
The same class can be represented by more than one tree. This form of constructing the classifier allows non-linearly separable classes to be represented, with spatially dispersed samples [51]. Every supervised classifier is built from an initial training phase, in which a set of samples representative of each class to be characterized is provided. Subsequently, the classifier must be able to determine to which class a new sample presented belongs. In the case of the OPF classifier, the set of training samples is modeled using a labeled graph in which each sample is represented by a vertex. These vertices are organized into a forest, that is, a set of trees. Each tree in the forest contains only vertices of the same class, and for the same class there may be several trees. In this way, each class is characterized by a set of trees within the built forest.
To generate the set of representative trees for each class, this method applies a process of maximizing the connectivity map between samples of the same class. In this process, each vertex receives a value characterizing the cost of its connectivity to its group. This value is associated with the lowest cost path of that vertex with some prototype vertex of its tree. The prototype vertices are those within the tree that are closest to the vertices of another class.
To classify a new sample, the vertex having the lowest connectivity cost with the new sample is sought from within the entire forest. The class of that vertex is assigned to the new sample.
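The training and classification steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch of the complete-graph supervised variant, not the LibOPF implementation: it assumes Euclidean distance, a path cost equal to the largest edge along the path, prototypes taken from minimum spanning tree edges that join different classes, and at least two classes in the training set. All function names are hypothetical.

```python
import numpy as np

def pairwise_distances(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def find_prototypes(dist, labels):
    """Mark the endpoints of MST edges that join samples of different classes."""
    n = len(labels)
    in_tree = np.zeros(n, dtype=bool)
    best = np.full(n, np.inf)      # cheapest edge linking each vertex to the tree
    parent = np.full(n, -1)
    best[0] = 0.0
    prototypes = np.zeros(n, dtype=bool)
    for _ in range(n):             # Prim's algorithm on the complete graph
        u = int(np.argmin(np.where(in_tree, np.inf, best)))
        in_tree[u] = True
        if parent[u] >= 0 and labels[u] != labels[parent[u]]:
            prototypes[u] = prototypes[parent[u]] = True
        closer = ~in_tree & (dist[u] < best)
        best[closer] = dist[u][closer]
        parent[closer] = u
    return prototypes

def train_opf(features, labels):
    """Return the connectivity cost and propagated label of every training vertex."""
    dist = pairwise_distances(features, features)
    prototypes = find_prototypes(dist, labels)
    cost = np.where(prototypes, 0.0, np.inf)
    out_labels = labels.copy()
    done = np.zeros(len(labels), dtype=bool)
    for _ in range(len(labels)):
        s = int(np.argmin(np.where(done, np.inf, cost)))
        done[s] = True
        offered = np.maximum(cost[s], dist[s])   # cost of reaching each vertex via s
        better = ~done & (offered < cost)
        cost[better] = offered[better]
        out_labels[better] = out_labels[s]
    return cost, out_labels

def classify_opf(model, train_features, samples):
    """Assign each sample the label of the vertex offering the lowest cost."""
    cost, labels = model
    dist = pairwise_distances(samples, train_features)
    offered = np.maximum(cost[None, :], dist)
    return labels[np.argmin(offered, axis=1)]
```

A typical call would be `model = train_opf(X, y)` followed by `classify_opf(model, X, new_samples)`, where `X` holds one spectral vector per row and `y` is an integer label array.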
The OPF presented results similar to SVM and better than neural networks and Bayesian classifiers. The biggest difference is the execution time, which can be faster depending on the size of the database [52] [53].
Using CBERS-2B CCD satellite imagery covering a sedimentation area of the Pedras river in the city of Itatinga, Sao Paulo, Brazil, Souza, Lotufo and Rittner [31] compared the OPF classifier with the ANN-MLP, BC, and SVM classifiers. Although the results were similar to SVM, OPF was 65 times faster, which is especially relevant for large data volumes. Freitas et al. [32] estimated rainfall in agricultural areas using GOES satellite weather images, and compared OPF classifiers with SVM, ANN-MLP, and K-NN, with OPF demonstrating a superior runtime. OPF recognized collapsed areas from GeoEye-MS satellite imagery, yielding results similar to cutting-edge techniques [33]. Papa et al. [54] proposed a method combining the OPF classifier with three optimization algorithms (PSO, HS, and GSA) to mitigate the problem of reducing hyperspectral image data through band selection. The combination of OPF with HS and GSA has produced promising results. Macedo et al. [43] used the combination of Hyperion hyperspectral imagery with the Optimum-Path Forest (OPF) algorithm for supervised classification of areas affected by desertification and compared the efficacy of the OPF and SVM classifiers when applied to these areas. Validation of the land cover thematic maps was performed based on confusion matrix analysis using the same set of validation points for consistency. Both classifiers produced reasonable results; however, the OPF algorithm outperformed SVM. The higher classification accuracy obtained by OPF was attributed principally to the ability to differentiate better between degraded areas (DA, DOC, and DP) and preserved areas (PGP and PDP).

Study Area
The scope of the analysis was the BR 232 Highway between km 141 and 356, la-
Figure 2 shows the architecture of the proposed approach. In the area to be processed, a Gaussian filter will be used to soften the original image and remove noise to improve the image quality. The edge detection algorithm will be used to detect high gradient regions within the image. The edge image will be generated and candidate road segments will be identified. Based on geometric knowledge, the adjacent candidate road segments will be linked together. For road segments missing because of image quality, occlusion, or other reasons, inference methods will be used and, finally, the complete road vectors will be extracted. The following sections describe the main modules of the architecture.

Data Pre-Processing Landsat 8
The digital image processing was carried out using satellite images from the

Data Pre-Processing Hyperion
The hyperspectral image used in the experiments was a section of the scene 216/65, supplied by the USGS, which covers a semiarid region of the state of Pernambuco between latitudes 8˚02'30''S and 8˚39'27''S and longitudes 36˚11'56''W and 37˚48'57''W, acquired by the Hyperion sensor. Hyperion has 242 bands, each 10 nm wide, with 30 m spatial resolution, and covers a range from 400 to 2500 nm of the electromagnetic spectrum.
The bands sensitive to water absorption (bands 121-130, 166-180, and 233-242) and the uncalibrated bands (233-242) were removed by default, using the Erdas Imagine image processing software. After the analysis of the remaining 207 bands, the bands with many negative reflectance values were removed (1-12 and 58-76). The resulting image composed of 159 spectral bands was converted to ASCII format, resulting in a matrix of 159 columns and 75,452 instances.
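The first band-removal step can be illustrated with a boolean mask over the band axis. The sketch below is not the Erdas Imagine workflow; it assumes the scene has been loaded as a NumPy array of shape (rows, columns, bands), and the tiny `cube` array and the helper name `keep_mask` are hypothetical. Only the water absorption and uncalibrated ranges quoted above are applied; the same helper could be reused for any further band ranges.

```python
import numpy as np

def keep_mask(n_bands, removed_ranges):
    """Boolean mask over bands; ranges are 1-based and inclusive."""
    keep = np.ones(n_bands, dtype=bool)
    for first, last in removed_ranges:
        keep[first - 1:last] = False
    return keep

# Ranges quoted in the text for water absorption and uncalibrated bands.
WATER_AND_UNCALIBRATED = [(121, 130), (166, 180), (233, 242)]

# Hypothetical stand-in for the Hyperion scene: (rows, cols, bands).
cube = np.zeros((4, 5, 242), dtype=np.float32)

mask = keep_mask(cube.shape[2], WATER_AND_UNCALIBRATED)
reduced = cube[:, :, mask]

# One row per pixel and one column per band, as in the exported matrix.
table = reduced.reshape(-1, reduced.shape[2])
```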

Extraction of Road Geometric Features
The semiautomatic methodology developed for the extraction of the road network from images using the OPF supervised classifier is presented below.

Removal of Noise
To improve image quality, a Gaussian filter was adopted in order to identify a function to normalize the levels for each color. It is therefore necessary to determine the mean and standard deviation for each color. The result of processing a given image with the Gaussian filter can be accompanied by a graph of the Gaussian distributions, in which each curve represents one of the RGB colors. According to [55], linear smoothing filters use a linear function capable of blurring and reducing noise in the image. This pre-processing is important for the extraction of larger objects. Noise can vary in brightness because of failures in the capture phase, failures in the transmission phase, or even because of steps performed in the pre-processing. These linear smoothing filters calculate the value of each pixel in the image by averaging the intensity levels of its neighbors, which are defined by a mask of size m × n.
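As an illustration of linear smoothing with a weighted mask, the sketch below builds a normalized Gaussian kernel and applies it to a single image channel. The kernel size, the value of sigma, and the border handling (replicating edge pixels) are assumptions made for the example, and the function names are hypothetical; the kernel size is expected to be odd.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian mask (size should be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def smooth(channel, kernel):
    """Weighted average of each pixel's neighborhood, replicating border pixels."""
    kh, kw = kernel.shape
    rows, cols = channel.shape
    padded = np.pad(channel.astype(float),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((rows, cols), dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out
```

For an RGB image, `smooth` would be applied to each of the three channels in turn.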

Roadway Detection
The detection of road elements requires, in addition to the satellite image itself (in RGB format), a mask where the streets, unpaved roads, and other places of interest are represented. Elements such as trees can interfere with the identification of objects of interest and, therefore, the first step of the method is to identify and subsequently remove them, so that the region of importance is reduced. For this purpose, the image is converted to the HSV (hue, saturation, and value) color spectrum. The purpose of this step is to improve accuracy when working with a color image.
In the filtering algorithm, where road identification is performed, the brightness value (V) (which varies between 0 and 1, with 0 meaning black and 1 meaning white) is assigned a certain threshold. Then, all pixels that have a value below the chosen threshold are assigned a value of 0. To improve and enhance the image, the saturation (S) was set to a value of 50%.
After the road identification process, other undesirable elements that can corrupt the data analysis are eliminated. Elements that represent buildings can be confused with traffic routes and for that reason they should also be removed.
To do this, the image mask (mentioned at the beginning of the section) is required. The algorithm that performs this step consists of assigning the value 0 (black) to the spaces corresponding to the building elements in the analyzed image, based on the mask.
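The thresholding and masking steps of this section can be sketched as follows. The example relies on the fact that the V channel of HSV equals the largest of the R, G, and B components, so no full color conversion is needed for the threshold. The threshold value, the function name, and the array layout (floating-point RGB in the range 0 to 1, plus a boolean building mask) are assumptions for illustration; the saturation adjustment is not reproduced.

```python
import numpy as np

def suppress_dark_and_buildings(rgb, building_mask, v_threshold=0.5):
    """Zero out pixels whose brightness is below the threshold or that fall on buildings.

    rgb: float array in [0, 1] with shape (rows, cols, 3).
    building_mask: boolean array with shape (rows, cols), True on buildings.
    """
    value = rgb.max(axis=2)            # V channel of HSV
    out = rgb.copy()
    out[value < v_threshold] = 0.0     # dark pixels set to black
    out[building_mask] = 0.0           # building pixels set to black
    return out
```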

Edge Detection
After the initial pre-processing steps, the image is submitted to an edge detection algorithm. This step consists of looking for sudden variations (discontinuities) of color intensity, as this helps the objects and their characteristics (area, geometric shape) to be easily identified. In this study, it was decided to use the Sobel edge detection algorithm, as it has advantages over its competitors in the scope proposed by this study, giving more weight to the points close to the center pixel, which allows it to obtain more prominent edges.
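A minimal NumPy sketch of the Sobel operator is given below. It uses the standard 3 × 3 Sobel masks, whose larger central weights (2 and -2) give more influence to the pixels nearest the center. The border handling (edge replication) and the function names are assumptions for the example.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(image, kernel):
    """Slide a 3 x 3 mask over the image, replicating border pixels."""
    rows, cols = image.shape
    padded = np.pad(image.astype(float), 1, mode="edge")
    out = np.zeros((rows, cols), dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

def sobel_magnitude(image):
    """Gradient magnitude from the horizontal and vertical Sobel responses."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return np.hypot(gx, gy)
```

Thresholding the returned magnitude yields a binary edge image from which candidate road segments can be taken.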

Reconstruction of Segments Missing from the Road Network
Through the classification process based on defined rules, it was possible to delineate the most relevant sections of the road network present in the scenes under study [56]. However, due to occlusions from elements such as trees and buildings, there were still separate and non-continuous road sections after the initial stages. Some features of interest were not completely extracted and other features that were not of interest (noise) were detected. This is due to the limitations imposed by the images under study. Routines based on morphological operators were therefore introduced, which allowed non-extracted linear features of interest to be obtained, as well as preprocessing, filtering, and further refinement of solutions. All routines were developed in ArcGIS software.
To recover the lost features of interest in the classification, a morphological reconstruction operation was used, which consists of applying successive dilations to a marker image until it fits into a second mask image. As a marker image, the binary image produced from the results of object-based classification was used, where elements belonging to the roads are represented by the value one (1) and background elements by the value zero (0). This was initially used to enhance the roads, using a full mask structuring element of size 3 × 3. This operation can be considered as a high-pass filter, which in addition to extracting high frequency information, also performs image smoothing [57]. After that, a thresholding operation was performed on the filtered image to separate the highlighted features of interest. The most appropriate threshold was determined by using the histogram of the filtered image.
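Binary morphological reconstruction by dilation can be sketched as below: the marker is dilated with a full 3 × 3 structuring element, clipped to the mask, and the process is repeated until nothing changes. This is an illustrative NumPy version, not the ArcGIS routine used in the study; both inputs are assumed to be boolean arrays of the same shape, and the function names are hypothetical.

```python
import numpy as np

def dilate3x3(binary):
    """Binary dilation with a full 3 x 3 structuring element."""
    rows, cols = binary.shape
    padded = np.pad(binary, 1, mode="constant", constant_values=False)
    out = np.zeros_like(binary)
    for i in range(3):
        for j in range(3):
            out |= padded[i:i + rows, j:j + cols]
    return out

def reconstruct(marker, mask):
    """Grow the marker inside the mask until it stops changing."""
    current = marker & mask
    while True:
        grown = dilate3x3(current) & mask
        if np.array_equal(grown, current):
            return current
        current = grown
```

The result keeps every connected region of the mask that is touched by the marker and discards the rest.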

Refining the Features of Interest and Obtaining the Road Network
The extracted features, according to the procedures described above, are linear segments in the form of polygons that represent the stretches of the road network. To obtain the paths in the form of simple lines, representative of the middle axes of the segments, it was at first necessary to apply operations to connect small disconnected sections and eliminate gaps as well as objects still present in the scene that are not part of the roads. Initially, the small discontinuous sections were connected using a dilation operation. Subsequently, the gaps were eliminated using the morphological closure operation. The area opening operation was the next step, applied to remove objects smaller than a certain number of pixels that are not part of the roadways. In all of these operations, a 3 × 3 full mask-shaped structuring element was used. To obtain the road network in the form of lines, the morphological thinning operation was applied until convergence, that is, each road segment is represented by a single line, with a width of one pixel. The semi-automatic vectorization procedure was used for all of the highways present in the images. Layout was performed in the ArcGIS environment, setting the screen scale to 1:6000 in order to standardize target interpretation and obtain homogeneous and detailed reference line vectors. In this process, it was sought to delineate the center line of the roads.
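Two of the refinement operations described above, closing and area opening, can be sketched as follows. The example assumes boolean input images, 8-connectivity for the area opening, and an erosion that treats pixels outside the image as foreground; the minimum object size is a free parameter, and the function names are hypothetical. Thinning is not included.

```python
import numpy as np
from collections import deque

def dilate3x3(binary):
    """Binary dilation with a full 3 x 3 structuring element."""
    rows, cols = binary.shape
    padded = np.pad(binary, 1, mode="constant", constant_values=False)
    out = np.zeros_like(binary)
    for i in range(3):
        for j in range(3):
            out |= padded[i:i + rows, j:j + cols]
    return out

def erode3x3(binary):
    """Erosion as the dual of dilation (outside the image counts as foreground)."""
    return ~dilate3x3(~binary)

def close3x3(binary):
    """Morphological closing: dilation followed by erosion."""
    return erode3x3(dilate3x3(binary))

def area_open(binary, min_pixels):
    """Remove 8-connected objects with fewer than min_pixels pixels."""
    rows, cols = binary.shape
    seen = np.zeros_like(binary)
    out = np.zeros_like(binary)
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c] or seen[r, c]:
                continue
            seen[r, c] = True
            queue = deque([(r, c)])
            component = []
            while queue:                      # breadth-first flood fill
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            if len(component) >= min_pixels:
                for y, x in component:
                    out[y, x] = True
    return out
```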

Comparison between Classification Algorithms
To quantify the errors generated by the different classification techniques used in this study, error or confusion matrices were constructed, which allow for the data from the validation samples and the classification results to be compared. Among the most widely used methods for assessing the reliability of a classification is the Kappa index, which is an accuracy measurement technique that can be used to determine whether one error matrix is significantly different from another. This index is based on the difference between the observed agreement (the overall accuracy, indicated by the matrix diagonal) and the agreement expected by chance, which is derived from the row and column totals of the confusion matrix.
Reliability is the main limitation in classifications from satellite images, generally with less than 90% accuracy [58]. Among the factors that interfere with accuracy are mixed pixels, the overlap between reflectance data from different targets in space, the low representativeness of the training samples, and the classifier's own ability to deal with inconsistencies in the process [59].
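The overall accuracy and the Kappa index can be computed from a confusion matrix as in the sketch below, which uses the usual definition of Cohen's kappa: observed agreement minus chance agreement, divided by one minus chance agreement. The matrix is assumed to be square with the same class order on rows and columns; the function names and the example counts in the usage note are hypothetical.

```python
import numpy as np

def overall_accuracy(cm):
    """Share of samples on the diagonal of the confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    return np.trace(cm) / cm.sum()

def kappa(cm):
    """Cohen's kappa from a square confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    observed = np.trace(cm) / total
    # Chance agreement from the products of matching row and column totals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    return (observed - expected) / (1.0 - expected)
```

For a two-class matrix `[[45, 5], [10, 40]]`, the overall accuracy is 0.85, the chance agreement is 0.5, and the kappa is 0.7.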

Experiments
The road classification process was performed using Erdas Imagine. As an initial step of this process, the objects were obtained through image segmentation. The sample sets were divided into two disjoint sets, each containing 50% of the total samples: one used to train the classifiers and the other used to evaluate their precision (Table 1).
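A 50/50 division into disjoint training and testing sets can be sketched as follows. The source does not state whether the split was random or stratified by class, so the per-class shuffling, the fixed seed, and the sample layout `(features, label)` are assumptions made only for illustration.

```python
import random


def split_half(samples, seed=0):
    """Split (features, label) samples into disjoint halves, class by class.

    Assumption: the split is stratified, so each class contributes half of
    its samples to training and the remainder to testing.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample in samples:
        by_class.setdefault(sample[1], []).append(sample)
    train, test = [], []
    for label in sorted(by_class):
        group = list(by_class[label])
        rng.shuffle(group)
        half = len(group) // 2
        train.extend(group[:half])
        test.extend(group[half:])
    return train, test


# Hypothetical samples: one-band feature tuples with a class label.
samples = [((i,), "road") for i in range(10)] + [((i,), "soil") for i in range(6)]
train_set, test_set = split_half(samples)
print(len(train_set), len(test_set))
```

Fixing the seed keeps the split reproducible, so that two classifiers can be trained and evaluated on exactly the same sets.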
Three main experiments were performed. The first evaluated the effectiveness of OPF for classifying hyperspectral images with their original hyperdimensional characteristics. The second was a quantitative and qualitative comparison of OPF precision between multispectral images (Landsat 8) and hyperspectral images (Hyperion). The third compared the OPF and SVM classification methods on hyperspectral images from the Hyperion sensor. For the OPF classifier, the C library LibOPF [54], a library for the design of optimum-path forest classifiers, was used. The same training set was used for the SVM classification.
To compare the classification results across sensors and classifiers, confusion matrices were generated in the Erdas Imagine software and the kappa coefficient was calculated according to the literature [58].

Supervised Classification
At first, the classes described above were grouped, and five classes were considered for image classification and for the comparison between hyperspectral and multispectral images using the OPF classifier: (1) Water (W), (2) Exposed Soil (ES; Exposed Soil + Unpaved Highway), (3) Vegetation (V; Dry Dense Caatinga + Green Dense Caatinga), (4) Urban Area (UA), and (5) Road (R; Paved Road Asphalt). The classes of Paved Road Concrete, Clouds, and Shadows were not considered. This decision was based on the fact that the multispectral images do not present good separability between the constituent materials of the road (asphalt, concrete, gravel, etc.) [15]. Despite being a simpler experiment, it meets the needs of the Brazilian Northeast, where most rural highways are asphalt. In both classifications, urban areas and highways were confused with each other. Although Hyperion's classification accuracy was statistically superior to that of Landsat, and both commission and omission errors were lower for all classes studied, the quality of the two classifications can be considered similar (Figure 7 and Figure 8). Although the OLI sensor collects approximately 75 times less data than the Hyperion sensor [60], the positioning and width of its spectral bands are sufficient to classify highways.
Subsequently, the image classification for the Hyperion sensor was performed considering the 10 classes described above and using the OPF classifier (Figure 5). All experiments were repeated using the SVM classifier (Figure 6) in order to compare the results.
Although low-resolution satellite images are easy to access, free of charge, and available on a large number of platforms, processing them is difficult because they contain a large amount of undesirable noise and imperfections that hinder feature extraction. Pattern recognition is essential for remote sensing. However, when the database is very large, training a classification algorithm can be too costly and time consuming.
In this paper, multispectral images (Landsat 8) and hyperspectral images (Hyperion) were classified using the OPF classifier. Experimental results showed that the OPF obtained similar recognition rates for both image types. The study was a pioneer in the use of the OPF classifier to extract geometric road network features from hyperspectral images. These images store large volumes of data that ultimately reduce the efficiency of traditional classifiers such as SVM. OPF also had a longer execution time, but was 48 times faster than the classifiers tested in the training and testing stage. In the classification of the full image, OPF was about 180 times faster than other classifiers cited in the literature. Due to the greater amount of spectral information provided by the Hyperion sensor, the classification accuracy for this sensor's image was higher than that of the Landsat 8 sensor. The results were better for land use and land cover classes in which spectral differences predominate than for classes whose spectral signatures are very similar. The OPF for multispectral images also presented greater distinction between regions of the studied areas (Urban Area and Roads). There was some minor confusion between these areas, but it did not affect the final result of the classification.
The quality of the land use and land cover classification obtained from the OLI sensor image was similar to that obtained from the Hyperion sensor. However, the higher level of spectral detail in hyperspectral images makes it easier to detect subtle differences and to distinguish between urban areas and roads.

Separability of the Classes
Using the data from the six reflective bands of the OLI sensor, the accuracy in discriminating the five classes of interest in the studied scene was 96.5% with a Kappa value equal to 0.88.
Regarding the discrimination of classes of interest using the first eight bands of Hyperion, the classification accuracy was 97.9% and the Kappa value was 0.93.
The separability of the classes obtained by OPF-Hyperspectral showed a better level of accuracy than that obtained by OPF-Multispectral. Specifically, for the classes related to Roads (R), OPF-Multispectral obtained 92.78% accuracy, while OPF-Hyperspectral reached 99.75% accuracy (Figure 9).
In the classification of multispectral data (Table 2), the smallest commission errors (pixels from other classes that were assigned to the reference class) and omission errors (pixels belonging to a reference class that were assigned to other classes) were observed for water (W), 1.34% and 2.88%, respectively, giving it the best performance of the classifier. Although 91.27% of the pixels belonging to the urban area class and 91.80% of the exposed soil class were correctly classified, the largest commission errors were observed for these classes. This indicates that 32.28% of the pixels that were classified as urban area and 31.42% of those that were classified as exposed soil belong, in fact, to other classes.
In the classification of hyperspectral data (Table 3), the best classification was also found for water (W), but with lower commission and omission errors (1.02% and 1.16%, respectively).
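The commission and omission errors discussed above can be derived from a confusion matrix as in the following sketch. The matrix orientation (rows as reference classes, columns as assigned classes), the function name, and the counts are assumptions chosen for illustration and do not reproduce Table 2 or Table 3.

```python
def commission_omission(matrix):
    """Return per-class (commission, omission) error fractions.

    Assumed layout: matrix[i][j] counts pixels of reference class i
    that the classifier assigned to class j.
    """
    k = len(matrix)
    errors = []
    for c in range(k):
        reference_total = sum(matrix[c])                       # row total
        assigned_total = sum(matrix[i][c] for i in range(k))   # column total
        # Omission: reference pixels of class c assigned to other classes.
        omission = 1 - matrix[c][c] / reference_total
        # Commission: pixels assigned to class c that belong to other classes.
        commission = 1 - matrix[c][c] / assigned_total
        errors.append((commission, omission))
    return errors


# Hypothetical two-class counts (class A, class B).
counts = [[90, 10],
          [40, 160]]
for name, (com, om) in zip("AB", commission_omission(counts)):
    print(f"class {name}: commission {com:.2%}, omission {om:.2%}")
```

In this hypothetical case, 90% of the class A reference pixels are correctly classified, yet about 31% of the pixels labeled A belong to class B. The same pattern, a high proportion of correctly classified pixels together with a large commission error, is the one reported above for the urban area and exposed soil classes.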
In the classification of hyperspectral data considering the 10 classes of interest, the classification accuracy was 97.9% and the Kappa value was 0.93. Table 4 and Table 5 show the confusion matrices obtained for the classification using the OPF and SVM classifiers, respectively. Classes W, PRA, C, and S were correctly classified, meaning that the classifiers were able to discriminate these classes. Among the road classes, separability was adequate for PRA (paved road asphalt); for UR and ES, however, the spectral confusion was evident. The spectral similarity of the two classes most likely favored confusion in the classification. The same occurred for the UA and PRC classes.

Image Spectroscopy
These results show that hyperspectral images perform better than multispectral images in cases with very subtle spectral differences, such as lane discrimination and exposed soil. The highways class has a spectral signature similar to that of the urban class; despite the similarity in the intensity of the absorption features, however, the two spectral curves differ in the magnitude of reflectance (Figure 10). Imaging spectroscopy shows the ability of hyperspectral sensors to capture electromagnetic energy in very narrow wavelength ranges and to detect small absorption features.
Figure 11 and Figure 12 show the result of the edge detection for the images from the Hyperion sensor and from the Landsat 8 sensor for the study area. The left side shows the original image, while the right side shows the result of the edge detection. It can be seen that the algorithms accurately detected all of the obvious gradient changes in the image; however, the linear features are not so evident because of the low spatial resolution. Roads are identified more accurately in straight sections. In Figure 12 in particular, where there are two types of roads, a concrete road and an asphalt road, the results of edge detection are good for both. However, on curved sections or near intersections, there are still a small number of incorrect features. It can be seen that the result is fragmented and that few of the extracted roads are complete. This is probably because the spectral information limited to a single pixel cannot reflect the characteristics of rural roads.

Results for Extraction of Road Segments
These results are best when the training samples are replaced by the spectral signatures of the targets. It is also worth noting that the roads extracted incorrectly are mainly those with vegetation shadows or intersections, which lead to inaccurate inferences or interconnection of road segments. In general, OPF performed better in the extraction of roads with different materials and those having major changes in curvature.
Despite performing well in the identification of roads, the classification alone is not sufficient to construct road bases that support road safety models, which need the geometric design for the correct identification of straight and curved sections, as well as the radius of those curves. To complement this classification, an extraction process based on the inference of geometric characteristics is necessary [15]. A small test area (red rectangle) was selected to better show the results (Figure 13).
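As an illustration of the kind of geometric inference mentioned above, the radius of a curve can be estimated from three consecutive center-line vertices using the circumscribed circle. This is only one possible approach and is not described in the source; the function name and the coordinates (assumed to be in meters in a projected system) are hypothetical.

```python
import math


def curve_radius(p1, p2, p3):
    """Radius of the circle through three (x, y) points.

    Returns math.inf when the points are collinear, which corresponds
    to a straight section.
    """
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # Twice the signed triangle area, from the 2D cross product.
    cross = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p2[1] - p1[1]) * (p3[0] - p1[0])
    area = abs(cross) / 2
    if area == 0:
        return math.inf
    return a * b * c / (4 * area)


# Three hypothetical vertices lying on a circle of radius 100 m.
print(curve_radius((100, 0), (0, 100), (-100, 0)))
# Three collinear vertices: a straight section.
print(curve_radius((0, 0), (1, 1), (2, 2)))
```

With vertices extracted from low-resolution imagery, positional noise makes exact collinearity unlikely, so in practice a tolerance on the area or a minimum radius threshold would be needed to separate straight from curved sections.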

Comparison of Results
The results were evaluated using real field information, provided by Embrapa and lower for hyperspectral images. Table 6 shows the accuracy and execution time.
When the data sets were compared, it was found that, for all levels of detail, the classifications obtained from hyperspectral images were significantly better than those obtained from multispectral images for both classifiers. For hyperspectral images, however, the OPF classifier was superior. On average, the OPF classifier ran 11 times faster than SVM.
This result shows that the accurate characterization of targets present in the urban environment requires a high spatial resolution; however, combining this important feature with high spectral resolution can lead to more detailed and accurate results.

Conclusions
Brazil ranks fifth in traffic accident deaths, according to the United Nations. To reduce this number, effective road safety management actions are necessary. This management depends on many factors associated with accidents and the places where they occur. For this, the geometric road base is indispensable. However, most Brazilian municipalities do not have maps. This reality is even worse in the Brazilian Northeast and in rural areas, due to the difficulty of access and the high cost of mapping by traditional methods or using high-resolution satellite images. Although high spatial resolution images are commonly used worldwide for road extraction, they are not feasible due to the high cost of acquisition. Multi- and hyperspectral images show satisfactory results, mainly due to the asphalt composition of most of the principal roads. However, hyperspectral images are better for distinguishing various constituent materials (concrete, asphalt, exposed soil, etc.) due to their high spectral resolution. This lack of information regarding road design can therefore be addressed with the use of hyperspectral images. The goal of hyperspectral remote sensing in road studies is to measure spectral features or identify unique absorption bands present in materials in the environment. The large number of contiguous bands allows for the extraction of information about the chemical and physical properties of materials. A target that is difficult to identify with traditional multispectral sensors can therefore be discriminated by hyperspectral sensors, depending on the spectral features present in the pixels. The results show that the OPF classifier performed better using hyperspectral images, with an average hit rate of 97.90%. The use of satellite imagery can be an alternative for collecting information for accident databases.
In general, the classification of images alone does not meet the needs of road safety models, where it is important to reconstruct and accurately identify the elements of the road network (straight segments, curved segments, radius of curves, etc.) for later association with accident data (location, type of accident, climatic conditions, among others). Therefore, geoprocessing techniques are necessary to properly elaborate these bases. The methodology proposed in this study for identifying and extracting roads from low-resolution satellite images using digital processing and pattern recognition produced results with a good level of accuracy. The worst results were found for unpaved roads and exposed soil. Other geoprocessing techniques, such as reducing the excessive number of vertices, reconstructing curved elements, and smoothing segments, can be added to improve the geometric quality of the road base. Additional work could also be carried out to examine the possibility of applying the proposed method using neural networks to classify road networks.
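As an example of the vertex reduction mentioned above, the sketch below applies the Douglas-Peucker line simplification algorithm to a vectorized center line. The source does not name a specific technique, so this choice, the function names, the tolerance value, and the coordinates are assumptions made for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop vertices closer than tolerance to the simplified line."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split vertex appears in both halves


# Hypothetical center line: a nearly straight stretch followed by a turn.
line = [(0, 0), (1, 0.05), (2, 0), (3, 0.04), (4, 0), (5, 3)]
print(simplify(line, tolerance=0.1))
```

The small oscillations along the straight stretch are removed while the vertex at the turn is kept. The tolerance would normally be chosen in relation to the pixel size of the imagery.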