TITLE:
An Explainable Wavelet-Based Feature Decomposition and Machine Learning Framework for Land Cover Classification
AUTHORS:
Saviour Mantey, Richmond Akwasi Nsiah, Maame Boama Poku
KEYWORDS:
Land Cover, Machine Learning, Wavelet Transform, Classification, Explainable AI (xAI)
JOURNAL NAME:
Open Journal of Applied Sciences, Vol.16 No.1, January 4, 2026
ABSTRACT: Accurate land cover classification is essential for environmental monitoring, urban planning, and resource management. Conventional classifiers trained on raw spectral bands are often limited by noise, inter-class spectral similarity, and intra-class variability. This study introduces a wavelet-based feature decomposition and machine learning framework to address these challenges. Landsat-8 Operational Land Imager (OLI) Level-2 surface reflectance data were pre-processed and decomposed using a one-dimensional discrete wavelet transform to isolate low- and high-frequency components. The decomposed features were concatenated with the raw bands to form an enriched dataset, which was used to train and validate three supervised classifiers: Random Forest (RF), Gradient Boosting (GBM), and Decision Tree (DT). Model performance was evaluated using 5-fold cross-validation. The best-performing model, the Random Forest Wavelet Transform (RF_WT), achieved the highest macro-averaged user's accuracy (UA), producer's accuracy (PA), and F1 score across all classes (UA = 0.876, PA = 0.864, F1 = 0.865), ahead of the RF (UA = 0.864, PA = 0.852, F1 = 0.853), GBM (UA = 0.849, PA = 0.832, F1 = 0.832), and DT (UA = 0.815, PA = 0.808, F1 = 0.809) classifiers trained on raw spectral bands. Explainable AI methods (Gini importance, permutation importance, and SHapley Additive exPlanations (SHAP) values) confirmed the complementary role of raw and decomposed features, with wavelet-derived detail components in the Short-Wave Infrared (SWIR) and visible bands strongly influencing classification. The approach can support ecological monitoring and resource management. However, difficulties in classifying settlements highlight the limitations of medium-resolution data. Future work could integrate higher-resolution imagery or multi-temporal data to improve performance.
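The feature-enrichment and scoring steps described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not name the wavelet, the decomposition level, or the bands used, so the single-level Haar wavelet, the end-padding rule for odd-length inputs, the six-value example pixel, and the function names `haar_dwt_1d`, `enrich_pixel`, and `macro_scores` are all assumptions made for illustration. The macro scores are computed as unweighted means of per-class UA (precision), PA (recall), and F1, which is one common convention and may differ from the paper's exact procedure.

```python
import math


def haar_dwt_1d(signal):
    """Single-level 1D Haar DWT (assumed wavelet; the abstract does not name one).

    Returns (approximation, detail): the low- and high-frequency coefficients.
    """
    values = list(signal)
    if len(values) % 2 == 1:
        # Assumption: pad odd-length input by repeating the last sample.
        values.append(values[-1])
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def enrich_pixel(bands):
    """Concatenate raw band values with their wavelet approximation and detail."""
    approx, detail = haar_dwt_1d(bands)
    return list(bands) + approx + detail


def macro_scores(confusion):
    """Macro-averaged (UA, PA, F1) from a square confusion matrix.

    confusion[i][j] = number of reference-class-i samples predicted as class j.
    """
    n = len(confusion)
    ua_sum = pa_sum = f1_sum = 0.0
    for k in range(n):
        tp = confusion[k][k]
        reference_total = sum(confusion[k])                      # row total
        predicted_total = sum(confusion[i][k] for i in range(n))  # column total
        pa = tp / reference_total if reference_total else 0.0    # producer's accuracy
        ua = tp / predicted_total if predicted_total else 0.0    # user's accuracy
        f1 = 2 * ua * pa / (ua + pa) if (ua + pa) else 0.0
        ua_sum += ua
        pa_sum += pa
        f1_sum += f1
    return ua_sum / n, pa_sum / n, f1_sum / n


if __name__ == "__main__":
    # Hypothetical surface-reflectance values for one pixel (not from the paper).
    pixel = [0.04, 0.06, 0.05, 0.30, 0.20, 0.10]
    print(enrich_pixel(pixel))
    # Hypothetical two-class confusion matrix (not from the paper).
    print(macro_scores([[8, 2], [1, 9]]))
```

In a full pipeline, the enriched vectors produced by a function like `enrich_pixel` would form the rows of the training matrix passed to a classifier, and a scoring routine like `macro_scores` would be applied to each cross-validation fold.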