TITLE:
Machine Learning-Based Outlier Detection in Long-Term Climate Data: Evidence from Burkina Faso’s Synoptic Network
AUTHORS:
Zamantakonè Guillaume Ki, Soumaila Gandema, Wenceslas Somda, François Dabilgou, Marcel Bawindsom Kébré
KEYWORDS:
Machine Learning, Climate Data, Anomaly Detection, Burkina Faso, PyOD
JOURNAL NAME:
Atmospheric and Climate Sciences, Vol.15 No.3, July 8, 2025
ABSTRACT: In recent decades, the impact of climate change on natural resources has increased, and reliable meteorological records are needed to assess it. However, such records often contain missing, outlying, or erroneous values. This work focuses on outlier detection in long-term climate data using machine learning models. The study uses meteorological data collected over 40 years (1981-2021) from ten synoptic stations operated by Burkina Faso’s National Meteorological Agency (ANAM). The methodology applies 18 machine learning algorithms from the PyOD library, including probabilistic, linear, proximity-based, and ensemble models. Both univariate and multivariate analyses are performed. The multivariate analysis focuses on two key variables, maximum temperature and minimum relative humidity, which are consistently strongly correlated across all stations. To make outlier detection robust, thresholds are based on extreme percentiles. The results show that models such as KPCA, LSCP, LOF, and Feature Bagging are best suited to capturing anomalies in complex time series. These results will contribute to more reliable climate analyses and improved modeling of extreme climate events in data-scarce regions.
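The idea of thresholding on an extreme percentile can be illustrated with a minimal sketch. It is not the authors' pipeline and does not use PyOD: it substitutes a simple proximity-based score (mean distance to the k nearest neighbours) computed with the Python standard library. The data are synthetic, and the variable names, k = 5, and the 99th-percentile cut-off are all assumptions made for illustration only.

```python
import math
import random


def standardize(rows):
    """Z-score each column so both variables share a common scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds))
            for row in rows]


def knn_scores(points, k=5):
    """Outlier score = mean distance to the k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q)
                       for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def percentile(values, q):
    """Linearly interpolated percentile, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def flag_outliers(rows, k=5, q=99.0):
    """Flag rows whose score exceeds the q-th percentile of all scores."""
    scores = knn_scores(standardize(rows), k)
    threshold = percentile(scores, q)
    return [s > threshold for s in scores], threshold


# Synthetic (assumed) daily records: maximum temperature in degrees C and
# minimum relative humidity in %, negatively correlated.
rng = random.Random(0)
records = []
for _ in range(300):
    tmax = rng.gauss(35.0, 3.0)
    rh_min = 80.0 - 1.5 * (tmax - 20.0) + rng.gauss(0.0, 3.0)
    records.append((tmax, rh_min))

# Three hypothetical gross errors appended at known positions.
injected_idx = [300, 301, 302]
records.extend([(60.0, 95.0), (5.0, 5.0), (48.0, 90.0)])

flags, threshold = flag_outliers(records, k=5, q=99.0)
flagged = [i for i, is_outlier in enumerate(flags) if is_outlier]
print("threshold:", round(threshold, 3), "flagged indices:", flagged)
```

A percentile threshold always flags a fixed share of the records, so the choice of q controls how many points are reported rather than whether any are. With PyOD detectors, the same thresholding could presumably be applied to each fitted model's outlier scores.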