Lithological Mapping Using Landsat 8 OLI in the Meso-Cenozoic Tarfaya Laayoune Basin (South of Morocco): Comparison between ANN and SID Classification ()
1. Introduction
Orbital technology is continuously improving advanced levels which are useful for lithologic mapping as well as mineral exploration [1] - [13].
Advances in image classification and the ability to integrate multiple data sources have further improved geological surveys using remote sensing technologies. There are a multitude of classification algorithms that have proven to be effective in geological mapping: Artificial Neural Networks (ANN) [14] - [21], Spectral Angle Mapper (SAM) [13] [15] [22] [23] [24] [25], Support Vector Machines (SVM) [13] [26] [27] [28] [29].
On the other hand, advances in Landsat sensors with two additional spectral bands and a narrower bandwidth are an advantage for applications requiring thinner, narrow bands, as well the development of spectral indices for various applications of Landsat data, including agriculture, land cover mapping, cool and coastal water mapping, snow and ice, soil and geology [30]. In this sense, Landsat bands are well-known for particular applications: band 7 (geological band), band 5 (soil and rock discrimination) and band 3 (soil/vegetation discrimination) [31] [32] [33].
The Meso-Cenozoic Tarfaya-Laayoune-Dakhla basin is located NW of the Archean-Proterozoic ridge of Réguibate, NW of the West African craton. The studied region belongs to the Tarfaya-Laayoune basin, the central region of the Meso-Cenozoic Tarfaya-Laayoune-Dakhla basin, which has been a relatively stable tectonic platform since the Jurassic period [34]. Primarily, the Meso-Cenozoic formations mainly arbitrate layers of bituminous, phosphate and limestone rocks of obvious economic interest.
Like all arid regions, the geological mapping is more complicated because most of the formations are invaded by large movements of sand (34 m/s) accelerated by the flat morphology of the land [35]. However, the geological mapping coverage has been reported by several maps at a scale of 1:100.000 and which concern the type-localities (from the North to the South): Tarfaya, Tah, Oum Debaa and Laayoune [34].
In another sense, this work aims to classify lithofacies by processing Landsat 8 multispectral data. Artificial Neural Network (ANN) and Spectral Information Divergence (SID) were used for classification while the spectra of the regions of interest (ROI) of the image were used as end members. For better precision, The ANN and SID classifiers were compared, and a choice was made for the classifier to be used as a support for geological mapping. Finally, we are guided to make a comparison between the two classification methods to choose the most efficient for a good presentation of the lithological cartography of the area.
2. Geological Settings
In Morocco, the Saharan domain presents morphological, climatic and hydrological peculiarities. This area is characterized by arid conditions, precipitation less than 54 mm/year, an annual average temperature that varies between 17˚C on the coast and 25˚C elsewhere and 3239.81 hours of sunshine per year. This domain is established on Precambrian crystalline lands (climate-data.org).
The Tarfaya-Laayoune-Dakhla (Figure 1) basin is part of the western margin of the platform. It is made up of a set of elongated NE-SW basins, parallel to the coast [34] [36]. These basins are from N to S: Tarfaya-Laayoune, Boujdour, Dakhla, Lagwira. They were formed during the Mesozoic and Cenozoic in the marine direction of the Archean-Proterozoic formations of the Reguibate Ridge and the folded Paleozoic sediments of the Paleozoic basin of Tindouf.
The Tarfaya Laayoune basin is separated from the Essouira-Agadir basin by the stable structural top of the Anti-Atlas. It is limited to the N and NE, successively, by the anti-Atlas Proterozoic domain and the Paleozoic-Cenozoic basin of Tindouf, to the E by the Archean and Proterozoic crystalline massif of the Tiris domain and to the SE and S by the Archean and Proterozoic crystalline massif of the domain of Oulad Dlim “dominates affected by the Hercynian phase”.
![]()
Figure 1. localization of the sector on the scale of Morocco (A) and at the level of the region of Laayoune (B). Geological setting in the studied locality (C), Tarfaya Laayoune Basin, Morocco. Adaptation from the geological map of West Africa [37].
The Meso-Cenozoic Tarfaya-Laayoune-Dakhla basin, a relatively tectonically stable platform since the Jurassic, is located to the NW of the Archean-Proterozoic ridge of Réguibate “NW of the West African craton” (Figure 1). Its sedimentary filling is asymmetrical, increasing from east to west, reaching a stack of about 14 kilometers [38]. The studied basin is a consequence of the opening of the central Atlantic [39]. This basin summarizes the geological history from the Trias to the Neogene at the northern borders of the West African Craton.
This basin was initially studied during oil exploration [40]. In general, the thickness of the Mesozoic series is greater than in the basins located north of the High Atlas. In the Laayoune basin, they reach over 10,000 m. The structural framework of the Layoune-Tarfaya basins, sketched by Heyman (1988), is a set of hemigrabens developed from NNW dipping listric faults, connected at a depth of about 16 km on a detachment fault.
To the west, the Paleozoic series are folded and form the Appalachian reliefs of the Zemmur, a little more exposed to the winds and rains of the Atlantic. In the north of the country Mechem, the Hamada Tindouf (or Draa) from a vast flat entablature, consisting mainly of limestones, which dominates the Paleozoic lands. The few wadis are directed to the center of the basin of Tindouf. To the west of the Reguibate and Zemmour extends the Atlantic plain of Tarfaya, Laayoune, Boujdour and Dakhla, whose substratum, mainly carbonated, is rich in phosphate deposits.
On the structural level, the Tarfaya-Laayoune basin boundary with the Anti-Atlas is rectilinear in a North Northeast-South Southouest direction [41]. In the South, this limit is aligned with the major accident North Northeast-South Southwest of Zemmour and two major structural directorates control the Tarfaya-Laayoune basin and its hinterland [42]:
1) East Northeast-West Southwest to North East-South West: the Atlas direction limits the Precambrian-Paleozoic to the hinterland of the Tarfaya-Laayoune Basin; this direction is materialized by the axis of the Tindouf Paleozoic basin, the Anti-Atlas chain, the Reguibat Precambrian Massive Middle Axis, the South-Atlas Fault (East Northeast-West Southwest) and the Agadir Fault (N45˚) [38] [43].
2) North Northeast-South Southwest: the Meso-Cenozoic Atlantic direction coincides with the general elongation of the Tarfaya-Laayoune basin and the direction of the Zemmour fault, inherited from the Hercynian cycle [44].
3) A third N90˚ direction, intersecting with the coast, is apparent from the analysis of the bathymetric and gravity maps of the northwestern African margin. This family of transverse accidents segments the margin in low zones and high zones which control, from the Triassic, the distribution of the evaporite deposits [45].
3. Material and Methods
3.1. Data Characteristics
The Landsat 8 satellite rotates in a sun-synchronous, quasi-polar orbit at an altitude of 705 km, inclined at 98.2 degrees, and performs an Earth orbit every 99 minutes. The Operational Land Imager (OLI) satellite and the Thermal Infrared Sensor (TIRS) are sensors on board the Landsat 8 satellite, which was commissioned in 2013. This satellite collects images of the Earth with a cycle of 16-day repetition (Table 1).
The OLI sensor acquires data with improved radiometric accuracy over a 12-bit dynamic range, which improves the signal-to-overall noise ratio. This translates to 4096 potential gray levels, compared to just 256 gray levels in Landsat 1-7 which has only 8-bit instruments. The OLI collects data for two new bands, a coastal/aerosol band (band 1) and a cirrus band (band 9), as well as the heritage Landsat multispectral bands [31] [32] [33].
3.2. Methodology
In our study, we used as data sources the geological maps of Laayoune area (1:100.000) (Figure 2) and the correspondent Landsat 8 satellite image.
As shown in the flowchart (Figure 3), we begin the study using data sources (geological map and satellite image) and a software package for atmospheric and radiometric correction, pre-processing and processing.
Then, the pre-processing step requires the stacking of the data sources necessary for the stacking of the layers, the atmospheric correction and the radiometric correction of the corrected satellite image. It is an important step in remote sensing which aims to obtain a physical parameter independent of lighting conditions and even atmospheric conditions, which allows us to use images from different eras to detect changes.
A certain number of “radiometric noises” can be present in the image due either to deficiencies of the sensors, or to problems of data transmission, or finally of interpretation (coding/decoding). Basically, these radiometric corrections are carried out directly on the image reception by reassigning codes corresponding to neighboring pixels or to defective points.
![]()
Table 1. Landsat 8 band characteristics.
![]()
Figure 2. Geological map showing the outlines of the studied zone (Adaptation from the geological maps of LAAYOUNE area, scale: 1:100,000 [46] [47].
![]()
Figure 3. Methodology flowchart of the processing techniques applied for the lithological mapping based on Landasat 8 OLI data.
On the other hand, the sensors installed on board Earth observation satellites operating in the spectral range of solar emission (wavelengths of 0.24 µm) are radiometers which measure the luminance reflected by the earth + atmosphere assembly lit by the sun. In a non-cloudy atmosphere, the radiometric signal depends on the reflectance of the earth’s surface but also on the effects of the atmosphere that occur during the two paths (descending, from the Sun to the surface, and ascending, from the surface to the sensor) effected by solar radiation through the atmosphere. The simple calibration of the sensor data, in luminance (absolute values measured in) or in reflectance’s (relative values), therefore does not provide information on the surface but a composite signal. the objective of this correction is to extract this signal provides information independent of the effects of the atmosphere, which vary in time and space, and only concerning the terrestrial surface, which is the object to be studied.
3.3. Optimal Index Factor (OIF)
The Optimal Index Factor (OIF) is a statistical value that can be used to select the optimal combination of three bands in a satellite image with which you want to create a color composite developed by Chavez et al. (1984). It ranks the 20 three-band combinations that can be made from six TM data bands (not counting the thermal infrared band) and the optimal band combination out of all the possible 3-band combinations is the one with the greatest amount of d. ‘“Information” (highest sum of standard deviations), with the least duplication (lowest correlation between pairs of bands). This technique is valid for any multispectral remote sensing dataset. It is based on the amount of total variance and correlation within and between various combinations of bands [48]. The algorithm used to figure the OIF for any subset of three bands is:
(1)
where:
N: is the total number of bands in the map list (For 3 bands, there is only 1 combination; for 4 bands, there are 4 combinations, for 5 bands, there are 10 combinations; for 6 bands there are 20 combinations; and for 7 bands, there are 35 combinations).
Then, for each combination of three bands, the OIF is calculated as:
(2)
where:
: standard deviation of band i
: standard deviation of band j
: standard deviation of band k
: correlation coefficient of band i and band j
: correlation coefficient of band i and band k
: correlation coefficient of band j and band k
3.4. Minimum Noise Fraction (MNF)
Minimum noise fraction (MNF) is a powerful technique for denoising remote sensing data. This is a preparatory transformation which condenses most of the components into a few spectral bands and to classify these bands in decreasing orders of interest [49] [50]. Thus, the lower band numbers contain more spectral information and present great interest in lithological mapping [49] [51] [52] [53]. Particularly, the MNF uses a noise covariance matrix to decorrelate and resize the noise in the data and then run a pca to convert the noise-bleached data [54]. In this way, the noise is evenly distributed between the bands, which allows the method to achieve good discrimination of spectral characteristics [55].
3.5. Principal Component Analysis (PCA)
Principal component analysis (PCA) is a multivariate statistical method initiated by Pearson (1901) and widely used in the scientific community. It selects the uncorrelated linear combinations of variables such that each component successively extracts the linear combination and presents a lower variance [51].
In remote sensing, its role is to extract the desired spectral signatures by transforming a number of correlated spectral bands into a smaller number of uncorrelated spectral bands called principal components. His interest in mapping is reinforced by his ability to improve and separate certain types of spectral signatures from the background, which has led various authors to use the PCA technique [1] [2] [11] [51] [56] - [63].
3.6. End Members Extraction
The identification of lithological units and the classification of images using spatial remote sensing appear problematic because few images pixels display “pure” spectra [64]. Thus, the previous analyzes relating to PCA and MNF have a preponderant function for the determination of endmembers (Training Samples). Indeed, the use of Endmembers is mainly a critical step to classify the input data in the classification algorithm used [65] [66].
This approach requires a set of input endmembers that represent “pure” spectra of representative lithologies in the studied zone, where these endmembers can be extracted directly from imagery, so that all relevant materials are included that represent true surfaces and are under the same viewing and lighting conditions as all spectra in the scene. However, to limit the spectral similarity between endmembers, we used the separability of the ROI pairs to select only the most spectrally distinct. In this work, the representative endmember image spectra were derived for regions of interest (ROI) defined mainly from the existing geological map [34].
3.7. Artificial Neural Network (ANN)
ANNs are computer systems inspired by the neural networks that make up the human brain. Its concept is that they “learn” to do tasks from examples, generally without being programmed with task-specific rules [67] because it manages to find a solution in a non-algorithmic and unstructured way based on the adjustment of the weights connected to the network neurons [68].
For each neuron, the input value is calculated as follows [69] [70]:
(3)
where:
: the input value of ith neuron in nth layer;
correspond to the connection weight between ith neuron (layer) and jth neuron in the (n − 1)th layer;
is the output of jth neuron in the (n − 1)th layer;
m is the number of neurons in the (n − 1)th layer.
In each neuron, the value calculated from the Equation (3) is transferred by an activation function. The common function for this purpose is the sigmoid function, and is given by:
(4)
The ANN algorithm has been widely used for the classification of several sets of remote sensor data and has produced very powerful results (Jensen, 2015) compared to traditional classification methods [71] [72] [73] [74] [75]. Thus, the ANN is able to perform calculations in order to acquire, represent and calculate a map of a multivariate information space from an initial data set [76].
Image classification using ANN is performed by extracting texture features and then applying the back propagation algorithm. The typical back propagation network is characterized by input and output layers and at least one hidden one. Theoretically, there is no limit on the number of hidden layers, but there is usually only one or two. Neural Network technique uses standard back propagation for supervised learning which activates by adjusting weights in the node to minimize the difference between the activation of the exit node and the exit [77].
3.8. Spectral Information Divergence (SID)
SID is a spectral classification method that considers each pixel as a random variable and uses its spectral histogram to define a probability distribution [70]. For a given multispectral hyperspectral pixel vector
each component
is a pixel of band image
.
Then x can be modeled as a random variable by defining an appropriate probability distribution. The component
’s in x are nonnegative due to the nature of radiance or reflectance. With this assumption
can be normalized
within the range [0, 1] by defining
so that
is the desired probability vector resulting from the pixel vector x.
In order to further study how to use concepts arising from information theory to capture relationship and correlation between two multispectral hyperspectral
pixel vectors, assume that there is another pixel vector
with the probability distribution given by
and
Using p and q we define Spectral Information Divergence (SID) by:
(5)
where
And
It should note that
is called the relative entropy of y with respect to x which is also known as Kullack-Leibler information function, directed divergence or cross entropy. The SID defined by Equation (5) can be used to measure the spectral similarity between two pixel vectors x, and x2.
The spectral similarity between two pixels is then measured by the difference in probabilistic behaviors between their spectra; the smaller the divergence, the more likely the pixels are to be similar [78]. Regarding methodology, the Endmember spectra used by SID can come from ASCII files or spectral libraries, or you can extract them directly from an image (as average ROI spectra) [79]; this algorithm improves the precision of the estimation of Endmember used in the optimal classification [80].
4. Results
The morphology of the studied area corresponds to a vast plateau (hamada) not presenting major reliefs, except for depressions (known as sebkhas) which does not rarely show most Meso-Cenozoic formations (mainly along oueds and edges of sebkhas).
Obviously, the use of remote sensing data is very valuable for the study of geological features especially in large areas with little or no in-situ data. In order to obtain better results, the Landsat 8 OLI data has been radiometrically and geometrically corrected. This pre-processing step is essential to obtain spatial and radiometric corrected images to delineate the geological units in the study area. Then, the image used requires more enhancement through the following methods (OIF, MNF and PCA).
4.1. Enhancement Images
First, the Optimal Index Factor (OIF) is a statistical value which help to find the optimal combination of three bands in a satellite image in order to create a color composite [10] [81]. Thus, Figure 4 shows that the band composition 6-7-8 of Landsat 8 data, represent the most optimal RGB color combination image that will be chosen for geological mapping in the study area according to the results of the OIF. Indeed, this RGB color combination represents the combination of the most information (as measured by variance) with the least duplication (as measured by correlation) and thus exhibits a noticeable color variation corresponding to large geological formations.
The MNF technique allows to segregate noise, feature extraction, spectral data reduction but also to better discriminate formation still confused at this stage. Figure 5 displays a color composite image comprising the MNF components 1, 2 and, 3 which are displayed in red, green and blue respectively. Table 2 summarizes the most influential bands in the MNF transformation, measured by eigenvectors.
![]()
Figure 4. False colour combination with highest OIF. RGB composite of bands 6 (red), 7 (green), and 8 (blue) of Landsat 8 OLI data of the studied area.
![]()
Figure 5. MNF RGB color combination (MNF 1, MNF 2, MNF 3).
![]()
Table 2. Percentage of information contained in each MNF bands of the studied data.
The PCA transform was applied to Landsat 8 OLI (Figure 6) data in order to produce uncorrelated output bands, segregate noise components, and reduce the dimensionality of the studied data. Table 3 summarizes the most influential bands in the PCA transformation, measured by eigenvectors.
The obtained results expect that the first PC represents the highest percentage of variance or eigenvalue and the last PC represents lower variance; this is computed by the presence of noise in the original spectral bands of the Landsat 8 data.
Thus, more colorful color composite images are produced using PCA instead of the spectral color composite images because the spectral bands are not correlated (Figure 6 shows PCA bands 3, 1 and 2 in RGB for the study area). Thus, the structural information is well presented in the Landsat 8 image and is ready for lithological classification.
Before starting the lithological classification step, the extraction of endmembers helps to be near of the surface reality as much as possible. In this work, we chose first to study several typical cross-sections (along Oued Saquia Hamra and Sebkha Oum Debaa) and the geological map at a scale 1/100,000 [34]; at the same level, a visual interpretation on the enhanced images supported by the spectral profile of the pixels from the best OIF image.
On the other hand, to attenuate the spectral similarity between endmembers, we used the separability of the ROI pairs to select only the most spectrally distinct. In this study, and after several attempts to collect the ROIs covering the same facies on all the enhanced images (PCA and MNF), we opted for a separability threshold of 2 so that the ROI pairs are acceptable.
4.2. SID
Due to the lithological diversity down to the pixel, this type of classification, based on outcrop spectra, was not very satisfactory. Figure 7 shows the resulting image. Indeed, examination of the image reveals large unclassified areas and a large disparity with the geological map.
For example, this classification does not allow to detect the large dune formation which is cut horizontally by the Oued Saquia Hamra and it replaces it by the Amgraw formation (PVlh3) and the bituminous Upper Cretaceous formation (Com). In the outcrop, the bituminous formations appear only along the Wadi and the deep sebkhas. Also, this classification does not distinguish between water and alluvial land.
Basically, the comparison of this classification with the geological map shows that the SID classification does not present a great coherence with the reality on the ground despite having presented good results in other works [82] [83] [84].
![]()
Table 3. Percentage of information contained in each PCA bands of the studied data.
![]()
Figure 6. PCA RGB color combination (PCA1, PCA2, PCA3).
![]()
Figure 7. Classification map obtained by Spectral Information Divergence method (SID).
Quantitatively, the validation test confirms these remarks and findings and during which the accuracy assessment is 49.6144% and the Kappa coefficient does not exceed 0.4398. Probably, the high lithological diversity was not very adequate because of the pixel resolution of the used image.
4.3. ANN
Unlike the SID classification, the supervised classification ANN has presented very adequate results (Figure 8) and an almost perfect correlation with the geological map of Laayoune where the formation was effortless recognized.
Indeed, the dune formation is well identified by this classification as well as the bituminous formations of Upper Cretaceous age which constitute minor thickness and have been plotted along the Oued Saquia Hamra and the edges of Sebkha Oum Debaa. Also, the Cenozoic formations show good similitude with the geological map: the hammada slab (q5), the Miocene sandstone formations (Mla1, Mla2…). Also, it should be noted that this classification differentiates the water zones from the alluvial surfaces.
The performance of these results is tested by the accuracy assessment. This assessment shows that the classification displays a precision equivalent to 92.56% and a Kappa coefficient of 0.9143.
![]()
Figure 8. Classification map by the Artificial Neural Network classification.
5. Discussion
In order to specify which lithological classification is the most precise between the ANN and SID classifications, we used the landsat 8 OLI image which has been subjected to atmospheric and radiometric corrections. After the computation of the OIF, we obtain the highest combination of RGB bands (b6, b7, b8) which will perform a good presentation of the image data. Later, we calculated the PCA which is used to reduce the dimensionality of the satellite image and the MNF to separate the noise in the data, and to reduce the computational requirements for further processing.
The creation of ROIs is done, from the geological map of Laayoune, to supervise over the different facies in order to use it in the two classifications, then we calculate the parability which has values close to 2 that it is principal that we are correctly on the best way in our study.
After performing the two classifications: ANN and SID, the results, based on the overall precision and the confusion matrix, obtained showed that the ANN has the highest overall precision of 92.56% with a Kappa coefficient of 0.9143 (Figure 9). By ANN, the overlay of this classification image with the geological map shows great similarity between the classes and geological formations on the map and allows to have a greater precision in the lithofacies classification and
![]()
Figure 9. Accuracy assessment of the studied classifications (SID and ANN).
![]()
Figure 10. Qualitative validation with samples used with the ANN image.
presents a good contribution to the spectral analysis based on the study. Definitely, adding in situ measurements of the spectral characteristics of the typical formations of the region, the Landsat 8 images could present results with more precision in the classification of the facies.
On the qualitative level, the validation of the results of the ANN classification was endorsed with outcrop images (Figure 10) with GPS position showing the land reality and the concordance with the ANN classification. Allegedly, the ANN classification shows excellent results in these very particular morphological conditions dominated by flat terrain and sand migration which hinder the visualization of satellite images.
6. Conclusions
The present study was carried around the Meso-Cenozoic Tarfaya-Laayoune basin (Saharian domain, Morocco). The studied area represents an arid region which corresponds to a vast plateau (hamada) not presenting major reliefs, except for depressions (known as sebkhas) which does not rarely show most Meso-Cenozoic formations (mainly along oueds and edges of sebkhas).
In this work, we have examined two methods of classification (ANN and SID) and compared them with the current map using an entire image processing process starting with atmospheric and radiometric correction, the georeferencing of the image and the geological map then performing the OIF, PCA, MNF calculations and creating the ROIs on which the separability calculation was performed.
After the creation of classifications input, the results were confirmed with the calculation of the confusion matrix and the Kappa index, which inform that the ANN classification is the most accurate method with an overall accuracy percentage of 92.56% and kappa index of 0.9143.
To conclude, the remote sensing processing on Landsat 8 OLI data can allow better precision in the lithofacies classification, and it gives geologists a very good opportunity to improve their investigations in areas of difficult access. This will be a major contribution to the spectral analysis based on the study. Indeed, with a good improvement of these images by the multiple methods currently available and an adapted classification (ANN for our case) one can easily draw up a very precise lithofacies map.