Research Info

Title
A novel hybrid feature selection and tree‑based models for predicting dust event frequency in West Asia
Type
Article
Keywords
Arid regions · Dust pollution · Environmental factors · Machine learning · Remote sensing
Abstract
Accurate and reliable prediction of dust event frequency, based on stable environmental factors and an understanding of their contribution, is essential to reduce harmful effects on human health and the environment. This study proposes a novel consensus-based voting strategy that combines six feature selection methods (correlation analysis, mutual information, elastic net, genetic algorithm, recursive feature elimination, and random forest) with variance inflation factor analysis to identify the most stable environmental factors governing dust event frequency in Iran’s Central Plateau, West Asia. Six tree-based machine learning models (Decision Tree, Random Forest, Extra Trees, XGBoost, LightGBM, and CatBoost) were trained to predict dust event frequency. Their performance was compared using a multi-criterion ranking approach based on R-squared, root mean square error, mean absolute error, median absolute error, mean absolute percentage error, and uncertainty analyses on the training and test sets. Shapley additive explanation values were applied to interpret predictor importance. Among thirty-four environmental variables, twelve were identified as key factors affecting monthly dust event frequency variability. According to the proposed approach, CatBoost outperformed the other models, followed by Random Forest, XGBoost, Decision Tree, LightGBM, and Extra Trees. The best model yielded R-squared values of 0.94 for training and 0.61 for testing, with corresponding (training, testing) values for root mean square error (8.8, 20.6), mean absolute error (7.1, 16.6), mean absolute percentage error (42.2, 101.3), and median absolute error (6.1, 13.7). Monthly evapotranspiration, surface pressure, rain, and soil moisture variability were the strongest governing factors. The results form a foundation for early warning systems, dust abatement planning, and policy making in arid environments.
Researchers
Zohre Ebrahimi-Khusfi (First researcher)
Seyed Arman Smadi-Todar (Second researcher)
Mohammad Khosroshahi (Third researcher)
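
Illustrative note
The consensus-based voting idea described in the abstract can be sketched as a simple vote count: each feature selection method nominates a set of variables, and a variable is kept when enough methods agree on it. The sketch below is only a minimal illustration under stated assumptions, not the authors' implementation. The function name `consensus_features`, the `min_votes` threshold, the per-method selections, and the `wind_speed` variable are all hypothetical toy inputs, not data or results from the study, and the variance inflation factor screening step is omitted.

```python
from collections import Counter


def consensus_features(selections, min_votes):
    """Keep features nominated by at least `min_votes` selection methods.

    `selections` maps a method name to the features that method selected.
    Returns the kept features (most-voted first) and the full vote tally.
    """
    votes = Counter()
    for chosen in selections.values():
        # set() ensures each method casts at most one vote per feature.
        votes.update(set(chosen))
    kept = [feature for feature, n in votes.items() if n >= min_votes]
    # Sort by descending vote count, then alphabetically for stable output.
    kept.sort(key=lambda feature: (-votes[feature], feature))
    return kept, votes


# Hypothetical toy selections (NOT results from the study).
selections = {
    "correlation": ["evapotranspiration", "surface_pressure", "rain"],
    "mutual_information": ["evapotranspiration", "soil_moisture", "rain"],
    "elastic_net": ["evapotranspiration", "surface_pressure", "wind_speed"],
    "genetic_algorithm": ["surface_pressure", "soil_moisture", "rain"],
    "rfe": ["evapotranspiration", "surface_pressure", "soil_moisture"],
    "random_forest": ["evapotranspiration", "rain", "soil_moisture"],
}

# Assumed threshold: a feature must be chosen by at least 4 of the 6 methods.
stable, votes = consensus_features(selections, min_votes=4)
print(stable)
```

Raising `min_votes` makes the retained set smaller and more conservative; the threshold actually used in the study is not stated in the abstract.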