Soil is highly vulnerable to environmental degradation processes,
especially in arid and semiarid regions, and research is needed to understand it. To date, no detailed soil maps cover a large extent of
the Middle East region, especially for calcium carbonate content. We therefore used topsoil
data (0–20 cm) from more than 5,000 sites to map nearly 3,338,000 km² of
the Middle East, using covariates obtained from remote sensing data
and the random forest (RF) algorithm. Around 65% of the soil information was acquired
from Iranian datasets and the remainder from the World Soil Information Service
dataset. Using 30 covariate layers representing soil, climate, relief, parent material and age
features, we trained and tuned prediction models in R and used the
optimal ones (those with the minimum root mean square error) to make spatial
predictions, within Google Earth Engine, of topsoil attributes and their associated
uncertainties at 30 m resolution. All covariates contributed to mapping
topsoil attributes, with relative importance ranging from 4% to 98%. Annual precipitation, temperature annual
range and elevation were the most important (> 31%). Overall, the prediction
models trained by RF explained around 40–66% of the variation present in topsoil
attributes. The ratio of the performance to interquartile distance (RPIQ) ranged
between 1.59 and 2.83, suggesting accurate models. Our predicted maps indicated
that sandy and loamy soils with poor organic carbon levels, alkaline reaction and high
calcium carbonate content were widespread in Middle Eastern topsoils. Our framework
overcomes some limitations related to high computational requirements and enables
accurate predictions of topsoil attributes. Our maps showed correct pedological
correspondences, realistic spatial patterns and interesting levels of
uncertainty.
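
To illustrate the two accuracy measures named above, the following Python sketch computes the root mean square error (RMSE) and the RPIQ, taken here as the interquartile range of the observed values divided by the RMSE, and then picks the candidate model with the lowest RMSE. It is not the authors' R code: the function names, the candidate labels and the toy values are hypothetical, and the quartile method (inclusive, with linear interpolation) is an assumption.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpiq(observed, predicted):
    """Interquartile range of the observations divided by the RMSE."""
    # Assumption: inclusive quartiles with linear interpolation.
    q1, _median, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse(observed, predicted)


def best_by_rmse(observed, candidates):
    """Return the name of the candidate whose predictions give the lowest RMSE."""
    return min(candidates, key=lambda name: rmse(observed, candidates[name]))


# Toy values only; they do not come from the study.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "model_a": [1.0, 2.0, 3.0, 4.0, 7.0],  # hypothetical tuning result
    "model_b": [3.0, 3.0, 3.0, 3.0, 3.0],  # hypothetical tuning result
}

best = best_by_rmse(observed, candidates)
print(best, round(rmse(observed, candidates[best]), 3),
      round(rpiq(observed, candidates[best]), 3))
```

Because the RPIQ scales the error by the spread of the observations, a higher value means the prediction error is small relative to the variability of the attribute being mapped.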