Automated Burned Scar Mapping Using Sentinel-2 Imagery

The Sentinel-2 satellites provide an unparalleled wealth of high-resolution remotely sensed information with a short revisit cycle, which is ideal for mapping burned areas both accurately and in a timely manner. However, the high level of detail and volume of information actually encumber the automation of the mapping process, at least for the level of automation required to map wildfires systematically on a national level. This paper proposes a fully automated methodology for mapping burn scars using Sentinel-2 data. Information extracted from a pair of Sentinel-2 images, one pre-fire and one post-fire, is jointly used to automatically label a set of training patterns via two empirical rules. An initial pixel-based classification is derived from this training set by means of a Support Vector Machine (SVM) classifier. The latter is subsequently smoothed following a multiple spectral-spatial classification (MSSC) approach, which increases the mapping accuracy and thematic consistency of the final burned area delineation. The proposed methodology was tested on six recent wildfire events in Greece, selected to cover representative cases of the Greek ecosystems and to present challenges in burned area mapping. The lowest classification accuracy achieved was 92%, whereas the Matthews correlation coefficient (MCC) was greater than or equal to 0.85.


Introduction
Wildfires constitute a ubiquitous problem in the Mediterranean countries, introducing a high risk of direct damage to humans and structures, especially in the highly populated coastal regions [1]. Most wildfires in Europe (over 85% of the total burned area) take place in its Mediterranean region, where about 65,000 fires occur every year on average, burning approximately half a million hectares of wildland and forest areas [2]. The analyses performed by the European Forest Fire Information System (EFFIS) indicate an increase in the length of the wildfire season over the last 30 years, whereas the fire regime is projected to change almost everywhere in Europe in the coming decades [3].
Timely and accurate burned area mapping is essential for quantifying the environmental impact of wildfires, for compiling statistics, and for designing effective short- to mid-term impact mitigation measures (e.g., prevention of soil erosion or of possible impacts of the fire/heavy rainfall combination). To support such use cases on a national level, however, the burned area mapping methodology should be as automated as possible, requiring minimal human interaction, or desirably none at all.
Satellite imagery has been successfully employed for mapping burned areas for several decades, since it offers a more accurate, timely, and resource-efficient alternative to field surveys [4], while permitting various levels of automation of the mapping process, especially in view of the great advancements the field of machine learning has seen in recent years. Traditionally, moderate- to coarse-resolution satellite sensors have been used for the task, such as MODIS (Moderate Resolution Imaging Spectroradiometer) [5] and MERIS (MEdium Resolution Imaging Spectrometer) [6]. These sensors offer the advantage of daily (or sub-daily) global coverage and the possibility to identify the date of the fire through a fully automated workflow. However, their coarse spatial resolution (pixel size of 500 m or greater) provides only a rough estimate of the fire perimeter.
When a detailed mapping of the affected area is required, high-resolution satellite imagery can be used instead, sacrificing the temporal frequency of sensors such as MODIS in favor of increased spatial resolution. Landsat data (with 30 m spatial resolution) have been predominantly used for this purpose [7] [8] [9], due to their rich spectral information (especially the shortwave infrared (SWIR) bands, which are important for burned area mapping) and the free data provision policy of the United States Geological Survey (USGS) since 2008 [10]. However, Landsat has a temporal resolution of 16 days, which constrains the rapid mapping of wildfires, especially in regions with frequent cloud coverage (e.g., mountainous areas). Rapid and detailed mapping can alternatively be performed with commercial very high-resolution satellite imagery [11], but this option involves high costs, especially if expedited satellite tasking is requested, which limits its applicability to a few cases of special significance (e.g., when extensive damages are involved).
The Sentinel-2 mission, developed and operated by the European Space Agency (ESA) as part of the Copernicus programme of the European Commission (EC), has been providing high-resolution optical imagery free of charge since 2015. Sentinel-2 data are characterized by high spatial resolution (10-20 m, depending on the band), rich spectral information (more or less a superset of the Landsat 7 ETM+ and Landsat 8 OLI bands), and high temporal frequency (5 days), characteristics which make them attractive for setting up an operational burned area mapping service on a national level. Since the Sentinel-2 data are relatively new, though, only a few studies have investigated their potential for burned area mapping [12] [13] [14] and even fewer have proposed automated mapping workflows [15] [16]. This study presents a novel methodology for mapping burned areas using Sentinel-2 data, aiming at eliminating user interaction and achieving mapping accuracy that is acceptable for operational use. The proposed methodology uses a pair of Sentinel-2 images, one acquired before the wildfire and one after the fire has been extinguished. A number of difference spectral indices are calculated and a set of empirical rules is employed in order to classify a portion of the image pixels, namely those that can be unambiguously characterized as burned or unburned. These pixels serve as training patterns, and a supervised learning approach, implemented via the Support Vector Machine (SVM) classifier [17], is employed to derive an initial mapping. The latter is subsequently refined using the so-called Multiple Spectral-Spatial Classification with Minimum Spanning Forest (MSSC-MSF) approach [18], which eliminates the salt-and-pepper phenomenon of the pixel-based classification and increases the accuracy of the final burned area map.

Datasets Used
The proposed methodology uses Sentinel-2 MultiSpectral Instrument (MSI) data.
We considered neither the bands with a spatial resolution of 60 m, which are primarily useful for atmospheric correction, nor the two additional red-edge bands (bands 5 and 7), which are primarily used for calculating spectral indices sensitive to vegetation stress. The nominal specifications [19] of the eight bands that were ultimately considered in this study are reported in Table 1.
Six wildfire events in 2016 and 2018 in Greece were considered in this study (Table 2 and Figure 1), selected to cover representative cases of the Greek ecosystems and to present challenges in burned area mapping. More specifically, Elata (Figure 1(a)) was a sparsely vegetated environment with many plantations of mastic trees (Pistacia lentiscus). In Farakla (Figure 1 … where the fire expanded to populated areas with significant vegetation cover (mostly pine trees in-between houses and within house yards).
We used Level-2A products, i.e., atmospherically corrected Sentinel-2 images (bottom-of-atmosphere reflectance values) [20]. The 2018 images were downloaded directly as Level-2A products from the Copernicus Open Access Hub (https://scihub.copernicus.eu/). For the 2016 images, the Sen2Cor processor (http://step.esa.int/main/third-party-plugins-2/sen2cor/) was run locally to derive the Level-2A products. A rectangular subset of the image that includes the burned area with a small buffer around it was cropped out of each original Sentinel-2 tile, as shown in Figure 1. In the current work, this was done manually, although we are presently exploring ways to automate this step as well. The bands reported in Table 1 were subsequently stacked into a single image, considering absolute reflectance values in the range [0, 1] (i.e., dividing the Level-2A product's pixel values by 10,000). For this purpose, all 20 m resolution bands were resampled to 10 m through nearest neighbor resampling. Moreover, non-land areas were masked out using the official NUTS (Nomenclature of Territorial Units for Statistics) layer.

For validation purposes, burned area perimeters derived from the so-called Object-based Burned Area Mapping (OBAM) service were used, which was developed within the context of the Greek National Observatory of Forest Fires (NOFFi) [21]. NOFFi-OBAM is a semi-automated methodology that uses a single post-fire Sentinel-2 image to derive accurate burned area perimeters, but involves significant user interaction. We should note that although these reference mappings were derived from a semi-automated methodology, manual corrections were performed at the end through careful photointerpretation. Therefore, the reference dataset is accurate, at least to the level allowed by Sentinel-2's spatial resolution (which in any case is high enough to support the photointerpretation process).

Proposed Methodology
Figure 2 depicts the workflow of the proposed methodology.
First, spectral indices are calculated from both the pre-fire and post-fire Sentinel-2 images (Level-2A products, as mentioned previously). The differences (or ratios) of these indices are used to label a portion of the image's pixels through a set of empirical rules; these labeled pixels constitute the training set. The rules are defined in such a way that only pixels that can be characterized unambiguously as burned or unburned are labeled, in order to avoid introducing misclassifications into the training set as much as possible. An augmented set of features, comprising the post-fire image's band values (reflectance values) and the difference/ratio spectral indices, is used to perform a preliminary pixel-based classification via the SVM classifier. The latter is subsequently refined via the MSSC-MSF methodology, out of which the final burned area perimeter is obtained. The rest of this section details each step of the proposed methodology.

Spectral Indices
A number of spectral indices frequently used in burned area mapping studies have been calculated, which are reported in Table 3. Most of them employ the classical normalized difference formulation of the Normalized Difference Vegetation Index (NDVI) [22], but using one or both of the SWIR bands, which are sensitive to moisture content and consequently facilitate the discrimination of the burned areas. The indices are calculated considering the ground reflectance values of the Level-2A products in the range [0, 1], from either the pre-fire image, the post-fire one, or both (as described in the following subsections). In the following, we use the notation SI_pre and SI_post to denote any index SI calculated from the pre-fire or the post-fire image, respectively (e.g., NDVI_pre).
Moreover, the empirical rule-based classification (described in the next subsection) also considers the difference between the pre-fire and post-fire values for some of the indices, which is an indication of disturbance resulting from the fire. The difference, which is denoted as dSI for any index SI (e.g., dNBR), is calculated as the pre-fire minus the post-fire value, i.e., dSI = SI_pre - SI_post.
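As a concrete illustration, the normalized-difference formulation and the dSI difference can be sketched with numpy as follows; the reflectance values below are hypothetical, and the actual index formulas are those of Table 3:

```python
import numpy as np

def normalized_difference(a, b):
    """Generic normalized-difference formulation (a - b) / (a + b),
    e.g. NBR computed from the NIR and SWIR2 bands."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return (a - b) / (a + b + 1e-12)   # small epsilon guards against 0/0

# Toy reflectance values in [0, 1] for one pixel before and after the fire
# (for burned vegetation, NIR reflectance drops and SWIR rises).
nbr_pre = normalized_difference(0.35, 0.10)    # NIR=0.35, SWIR2=0.10
nbr_post = normalized_difference(0.12, 0.25)   # NIR=0.12, SWIR2=0.25
dnbr = nbr_pre - nbr_post                      # dSI = SI_pre - SI_post
```

A strongly positive dNBR such as the one above is the kind of disturbance signal the empirical rules threshold on.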

Empirical Rule-Based Classification
The supervised classification approach followed by the proposed methodology requires a training set that is labeled automatically. To do so, we defined two empirical rules that employ thresholds on differences or ratios of the pre-fire and post-fire spectral indices, following an approach similar to previous studies [8] [15]. It is generally impossible to define empirical rules that will classify correctly all (or almost all) pixels, due to the great variety of ecosystem types (e.g., sparsely vegetated areas, dense forests, shrublands, agricultural or pasture areas, etc.), illumination conditions, topography, and other factors. Therefore, these rules are defined in such a way that only pixels that can be characterized unambiguously as burned or unburned are labeled, in order to avoid introducing misclassifications into the training set as much as possible.

Table 3. Spectral indices considered in this study. The Sentinel-2 band names are those reported in Table 1.
To this end, the features used to define the rules and the respective threshold values were determined empirically through a trial-and-error procedure on many wildfire events in Greece. Eventually, a pixel is classified as burned if it satisfies the rule defined by Equations (1)-(2). Accordingly, a pixel is classified as unburned if it satisfies the rule defined by Equation (3). In both cases, the pre-fire MNDWI index is considered for discriminating water surfaces and artificial areas (buildings, roads, etc.), whereas the vegetation or burned area identification is performed by the remaining part of the corresponding rule. To avoid misclassifications caused by differences in illumination between the pre-fire and post-fire image acquisitions, possibly in combination with topography, a morphological opening (erosion followed by dilation) operator [30] is applied to the resulting rule-based classified image, considering a square structuring element with a side of 3 pixels (i.e., the 8-neighborhood). The opening operator effectively removes isolated single-pixel classifications, which exhibit a higher probability of having been misclassified.
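The effect of the opening step can be sketched with scipy, using a toy binary mask as a stand-in for the rule-based classification:

```python
import numpy as np
from scipy import ndimage

# Toy rule-based burned mask: a coherent 3x3 burned patch plus one
# isolated single-pixel detection (likely an illumination artifact).
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True   # coherent burned patch
mask[0, 6] = True       # isolated pixel

# Opening (erosion followed by dilation) with a 3x3 square structuring
# element (the 8-neighborhood) removes isolated single-pixel labels
# while preserving the coherent patch.
opened = ndimage.binary_opening(mask, structure=np.ones((3, 3), dtype=bool))
```

After opening, the isolated pixel is gone and the 3x3 patch survives intact.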
Depending on the area, a larger or smaller proportion of pixels may be classified by the above procedure. Densely vegetated forest areas or severely burned areas are more likely to be labeled. In any case, the labeled pixels are used to formulate the training set and the remaining ones are categorized by the supervised classifier, as explained in the following.

Initial Pixel-Based Classification
The dataset used for the supervised classification considers 21 features extracted from multiple sources ("classification features image" module in Figure 2). More specifically, it is formulated considering: 1) the eight bands of the post-fire Sentinel-2 image reported in Table 1, 2) the first seven spectral indices of Table 3, calculated from the post-fire image, and 3) the six difference (or ratio) spectral indices computed from the pre-fire and post-fire images.
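A minimal scikit-learn sketch of this step is given below; the data are synthetic stand-ins for the 21-feature stack, and the RBF kernel and parameter values are illustrative assumptions rather than the paper's exact configuration:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the 21-feature stack (rows = pixels): in practice
# these are the 8 post-fire bands plus the post-fire and difference indices.
n_labeled = 200
X_burned = rng.normal(loc=-0.3, scale=0.1, size=(n_labeled, 21))
X_unburned = rng.normal(loc=+0.3, scale=0.1, size=(n_labeled, 21))
X_train = np.vstack([X_burned, X_unburned])
y_train = np.array([1] * n_labeled + [0] * n_labeled)   # 1 = burned

# SVM trained on the automatically labeled pixels from the empirical rules.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

# Classify the remaining (unlabeled) pixels of the feature image.
X_rest = rng.normal(loc=-0.3, scale=0.1, size=(10, 21))
pred = clf.predict(X_rest)
```

Reshaping `pred` back to the image grid yields the initial pixel-based classification that the MSSC-MSF stage then refines.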

Final Burned Scar Map
Pixel-based classifications are typically severely affected by the salt-and-pepper phenomenon, whereby individual pixels or small isolated regions are misclassified, either as unburned within the fire perimeter or as burned outside it (in many cases far away from it), with a human observer being able to identify them easily as errors through visual inspection. These isolated errors are the result of image artifacts (caused by illumination angle, topography, specular reflections, etc.) in combination with the fact that the pixel-based classification does not exploit any spatial information; the phenomenon becomes more prominent as spatial resolution increases. To overcome this limitation, the initial pixel-based classification described in the previous subsection is refined via the MSSC-MSF approach [18], which can effectively exploit both spatial and spectral information.
Given an input image and a pixel-based classification, MSSC-MSF starts by segmenting the image by means of three different image segmentation algorithms (see Figure 2). For this purpose, we used the watershed segmentation [31], the Fuzzy C-Means (FCM) clustering [32], and the mean shift segmentation [33] algorithms. In all cases, the segmentation was performed considering only the four 10 m bands of Sentinel-2 (B02, B03, B04, and B08 in Table 1). This choice was made in order to a) reduce the overall computation requirements, b) exploit the finest spatial resolution provided by Sentinel-2, and c) increase the methodology's generalization capabilities by using a simpler feature set for the segmentation process than that used for the pixel-based classification.
Watershed is a morphological approach to image segmentation that combines region growing and edge detection. The method is typically applied to a gradient of the original image and segments it into small, adjacent, but non-overlapping regions, where each region is associated with one minimum (i.e., homogeneous region) of the gradient image. As in the original approach [18], we used the robust color morphological gradient (RCMG) [34] for this purpose. Conversely, FCM is a clustering algorithm, but we can derive a segmentation of the image by identifying and uniquely labeling the connected components (CCs) on the resulting clustering map. Finally, mean shift is an efficient feature-space analysis approach that has been successfully used in the past for remotely sensed image segmentation [35] [36], as it balances between the spectral and the spatial information provided by the multispectral image.
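The clustering-to-segmentation step (uniquely labeling the connected components of a clustering map) can be sketched as follows; KMeans is used here as a simple stand-in for FCM, and the 4-band image is synthetic:

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Toy 10x10 image with 4 bands (standing in for the four 10 m bands).
img = rng.random((10, 10, 4))
img[:, :5] += 5.0   # make the left half spectrally distinct

# Cluster the pixels in feature space (KMeans as a stand-in for FCM).
flat = img.reshape(-1, 4)
cluster_map = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_map = cluster_map.fit_predict(flat).reshape(10, 10)

# Turn the clustering map into a segmentation by uniquely labeling the
# connected components (CCs) of each cluster.
segments = np.zeros((10, 10), dtype=int)
next_id = 0
for c in np.unique(cluster_map):
    cc, n = ndimage.label(cluster_map == c)
    segments[cc > 0] = cc[cc > 0] + next_id
    next_id += n
```

Because clusters are defined in feature space only, one cluster can split into several spatially disconnected segments; the CC labeling makes each of those a separate segment.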
For each segmentation, an equivalent classification map is produced by fusing the segmentation result with the pixel-based classification obtained by SVM.
More specifically, a majority voting is performed within each segment of the image, i.e., all pixels belonging to the segment are assigned to the class exhibiting the highest frequency in the classification map within this segment. Ultimately, the three independent classification maps are fused to select a set of markers. A pixel is labeled as a marker if all three independent classification maps agree (i.e., assign it to the same class, in our case either burned or unburned). Marker pixels retain their class, whereas all others are considered unclassified. Effectively, this Multiple Spectral-Spatial Classification (MSSC) approach eliminates the salt-and-pepper phenomenon of the base pixel-based classification by exploiting the spatial information provided by the segment-wise processing of the image. If all three independent classifications agree, it is assumed that the pixel has been correctly classified (marker), whereas the remaining pixels are considered ambiguous.
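The segment-wise majority voting and the marker selection can be sketched in numpy on a tiny toy example (the 3x3 maps and segmentations below are illustrative, not real data):

```python
import numpy as np

def segment_majority_vote(labels_px, segments):
    """Assign every pixel of a segment to the majority class (0/1) of the
    pixel-based classification inside that segment."""
    out = np.empty_like(labels_px)
    for seg_id in np.unique(segments):
        m = segments == seg_id
        out[m] = np.bincount(labels_px[m], minlength=2).argmax()
    return out

# Pixel-based SVM map with two isolated misclassifications, and two toy
# segmentations (column-wise and row-wise splits of the image).
svm_map = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
seg_cols = np.array([[0, 0, 1], [0, 0, 1], [0, 0, 1]])
seg_rows = np.array([[0, 0, 0], [0, 0, 0], [1, 1, 1]])
vote_a = segment_majority_vote(svm_map, seg_cols)
vote_b = segment_majority_vote(svm_map, seg_rows)

# Markers: pixels on which all independent maps agree; -1 marks the
# ambiguous pixels left for the MSF stage.
agree = (vote_a == vote_b) & (vote_b == svm_map)
markers = np.where(agree, vote_a, -1)
```

Here both segment-wise votes smooth the map to all-burned, so only the two pixels where the raw SVM map disagrees remain ambiguous.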
Finally, those ambiguous pixels are labeled by growing a Minimum Spanning Forest (MSF) rooted at the markers. Each image pixel is considered a vertex of an undirected graph, connected with its eight immediate neighbors through edges. A weight is assigned to each edge, which is proportional to the dissimilarity between the two pixels. As a measure of dissimilarity, we used the spectral angle mapper (SAM), which is defined by Equation (4): SAM(x_i, x_j) = arccos((x_i · x_j) / (||x_i|| ||x_j||)), where x_i and x_j are the feature vectors of two neighboring pixels and B is the cardinality of the feature space (i.e., the number of bands in the image). The SAM value is maximized for dissimilar feature vectors and minimized for equal ones. As shown in Figure 2, the SAM measure is calculated considering the full classification feature space, comprising the 21 features described in the previous subsection. In the case that two neighboring pixels are both markers, a zero weight is assigned if they belong to the same class and an infinite weight otherwise.
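Equation (4) translates directly into a few lines of numpy (the vectors below are hypothetical feature vectors):

```python
import numpy as np

def sam(x, y):
    """Spectral angle mapper between two B-dimensional feature vectors:
    the angle arccos(<x, y> / (|x| |y|)) between them, in radians."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards rounding

# Parallel vectors have zero angle; orthogonal ones, pi/2.
a = np.array([0.2, 0.4, 0.6])
```

Note that SAM depends only on the direction of the feature vectors, not their magnitude, which makes it relatively insensitive to overall brightness differences.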
An additional vertex for each class (here two, burned and unburned) is inserted into the graph and connected with all markers belonging to the respective class. A root vertex is also inserted, connected with those extra vertices. With this configuration, running a minimum spanning tree algorithm (e.g., Prim's algorithm [37]) and then removing the extra vertices induces an MSF over the graph. Assigning all pixels grown from each marker to its class ultimately results in labeling all unclassified pixels in the image. Effectively, each marker grows its tree by labeling neighboring unlabeled pixels, in a way that the total dissimilarity among pixels is minimized. A detailed description of the MSF procedure can be found in [18] and [38].
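The marker-based growth can be sketched as a Prim-style best-first expansion from all markers at once; this is a simplified illustration (Euclidean distance standing in for SAM, a tiny synthetic feature image), not the exact construction of [18]:

```python
import heapq
import numpy as np

def grow_msf(features, markers):
    """Grow a minimum spanning forest rooted at the markers: repeatedly take
    the cheapest edge connecting a labeled pixel to an unlabeled 8-neighbor.
    `markers` holds the class (0/1) for marker pixels and -1 elsewhere."""
    h, w = markers.shape
    labels = markers.copy()
    heap = []

    def push_neighbours(r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < h and 0 <= cc < w and labels[rr, cc] == -1:
                    wgt = float(np.linalg.norm(features[r, c] - features[rr, cc]))
                    heapq.heappush(heap, (wgt, rr, cc, int(labels[r, c])))

    for r, c in zip(*np.nonzero(markers != -1)):
        push_neighbours(r, c)
    while heap:
        _, r, c, cls = heapq.heappop(heap)
        if labels[r, c] != -1:
            continue              # already claimed by a cheaper edge
        labels[r, c] = cls        # attach pixel to the growing tree
        push_neighbours(r, c)
    return labels

# Two markers of opposite classes at opposite corners of a 5x5 image whose
# single feature varies smoothly from left (0) to right (1).
feat = np.stack([np.linspace(0, 1, 5)[None, :].repeat(5, axis=0)], axis=-1)
mk = -np.ones((5, 5), dtype=int)
mk[0, 0] = 0
mk[4, 4] = 1
result = grow_msf(feat, mk)
```

Zero-cost edges are consumed first, so each marker immediately claims the spectrally identical pixels of its own column before the two trees compete over the middle of the image.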
Vectorizing the final classification map and keeping only the areas labeled as burned, we derive the final burned area perimeter, which can subsequently be viewed or analyzed in a GIS environment.

Example of the Proposed Algorithm's Application
Before reporting the numerical results, we provide here an example of the proposed algorithm's application in Figure 3, in order to visually explain the various steps of the algorithm. We use the case of the Kallitechnoupoli wildfire (see Table 2), with Figure 3(a) showing a false-color composite of the post-fire Sentinel-2 image and the reference burned area perimeter superimposed with a black line. This was a devastating wildfire that started on July 23, 2018 in Attica, Greece and claimed the lives of more than 100 people. It presents several difficulties as a burned area mapping task, because the fire expanded to populated areas with significant vegetation cover (mostly pine trees in-between houses and within house yards), with some of the residential properties completely destroyed and others partially burned. As such, several regions in the scene are difficult to classify correctly by a fully automated algorithm.
The training set derived from the empirical rules (see Section 3.2) is depicted in Figure 3(b), and the initial pixel-based SVM classification in Figure 3(c). The three spectral-spatial classifications (see Section 3.4) produce noticeably different results. Watershed (Figure 3(d)) is more or less a superpixel segmentation algorithm (i.e., it usually generates very small segments) and, as such, it removes the salt-and-pepper misclassifications, but also exhibits a relatively high commission error, misclassifying large areas to the north of and outside the fire perimeter as burned. On the other hand, FCM (Figure 3(e)) is a clustering algorithm and, as such, it is closer to the pixel-based SVM result. The mean shift classification (Figure 3(f)) is somewhere in-between the previous two, since it exploits both spatial and spectral information.
The markers (Figure 3(g)) are defined as all pixels on which the three segmentation-based classifications agree. Closely inspecting Figure 3(a), Figure 3(b), and Figure 3(g), we can observe that indeed the hard-to-discriminate regions are left unmarked. The MSF algorithm (Figure 3(h)) finally labels the unmarked regions, resulting in the most accurate classification result. Comparing the initial (Figure 3(c)) and the final (Figure 3(h)) classifications, it becomes apparent that the MSSC-MSF approach significantly improves the thematic consistency of the mapping process, which is also reflected in the numerical comparison presented in the following.

Quantitative Results
For quantifying accuracy, well-known measures derived from the confusion matrix were calculated, namely, sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC) [39]. Correct classifications in a confusion matrix are either true positives (TP) or true negatives (TN), i.e., pixels correctly classified as burned or unburned, respectively. The errors are described as false positives (FP; pixels classified as burned whose true label according to the reference was unburned) and false negatives (FN; pixels classified as unburned whose true label according to the reference was burned). Sensitivity is (inversely) related to omission errors (burned areas misclassified as unburned) and is defined by Equation (5): Sensitivity = TP / P, where P is the total number of pixels labeled as burned in the reference set. Specificity is related to commission errors (unburned areas misclassified as burned) and is defined by Equation (6): Specificity = TN / N, where N is the total number of pixels labeled as unburned in the reference set.
Accuracy is the ratio of correctly classified pixels to the total number of pixels (being the most commonly used measure of classification performance) and is defined by Equation (7): Accuracy = (TP + TN) / (P + N). Finally, MCC takes into consideration both omission and commission errors, being a correlation coefficient between the observed and predicted binary classifications; it is generally regarded as a balanced measure that can be used even if the classes are of very different sizes [40]. It is defined by Equation (8): MCC = (TP·TN - FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)).

Table 4 reports the values of the accuracy measures for both the initial pixel-based SVM classification and the final one for each test case. The proposed methodology exhibits higher accuracy values than the initial pixel-based classification in most of the regions, which is mainly attributable to higher sensitivity (fewer omission errors) at similar specificity values. For two cases (Zemeno and Kallitechnoupoli), the performance of the proposed method is much higher than that of the base classification, especially in terms of MCC values. In purely absolute terms, the proposed method exhibits classification accuracies greater than or equal to 0.92 and MCC values greater than or equal to 0.85. Such a performance is perfectly acceptable for operational applications on a national level, especially taking into consideration that it is a fully automated process.
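The four measures of Equations (5)-(8) can be computed directly from the confusion-matrix counts; the counts below are toy values, not results from Table 4:

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy and MCC from the confusion-matrix
    counts, following Equations (5)-(8)."""
    p, n = tp + fn, tn + fp   # reference burned / unburned totals
    sensitivity = tp / p
    specificity = tn / n
    accuracy = (tp + tn) / (p + n)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return sensitivity, specificity, accuracy, mcc

# Toy counts: mostly correct, with a few omission and commission errors.
sens, spec, acc, mcc = confusion_metrics(tp=900, tn=8800, fp=200, fn=100)
```

Note how MCC penalizes both error types at once: with a 9:1 class imbalance, the example still reports 0.97 accuracy while MCC stays around 0.84.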

Visual Comparison
Complementary to the numerical results reported above, we provide a visual comparison between the initial pixel-based SVM classification and the final one for all six wildfires of Table 2 in Figures 4-9. This comparison confirms the differences and additionally highlights the qualitative superiority of the proposed framework. On the one hand, the proposed methodology eliminates the salt-and-pepper phenomenon of the initial pixel-based classification outside the fire perimeter. Because these are usually isolated pixel misclassifications, the specificity measure (see Table 4) is not greatly affected. Nevertheless, the phenomenon is significant for the pixel-based classification in many cases, such as the Farakla (Figure 5), the Kallitechnoupoli (Figure 8), and, most prominently, the Zemeno (Figure 7) wildfires. On the other hand, the pixel-based SVM classification seems to systematically underestimate the burned area within the fire perimeter, resulting in a higher omission error.

Conclusions and Future Work
This paper presented a novel methodology for mapping burned areas using Sentinel-2 data, without any user interaction. A pair of Sentinel-2 images, a pre-fire and a post-fire one, is used for this purpose. Difference and ratio spectral indices are calculated from this pair of images and a set of empirical rules is employed to automatically label a portion of the image pixels, which formulate the training set. The latter is used to train a pixel-based SVM classifier, which labels all other image pixels. This initial classification is further refined by the MSSC-MSF approach, which increases the mapping accuracy, mitigates the salt-and-pepper phenomenon inherent in all pixel-based classifications, and significantly increases the quality of the derived map. The results on a set of recent large wildfires in Greece showed that the proposed methodology exhibits quantitative and qualitative performance that is satisfactory for operational use on a national level, while requiring no user interaction.

A notable limitation of the proposed methodology is that the empirical rules that formulate the training set for the initial pixel-based classification have been derived through a trial-and-error procedure. Although we tried to consider a set of past wildfires that is as representative as possible, it is undeniable that this approach is suboptimal and requires substantial effort for updating the rules as information from new wildfires becomes available. Perhaps most importantly, it encumbers the method's transferability to other ecosystems with characteristics different from the Euro-Mediterranean (i.e., Greek) one considered in this study. Future work will try to address this limitation by devising an automated methodology for updating the empirical rules or creating them anew.