This study was conducted to estimate soil clay content at two depths using geophysical techniques, namely
ground-penetrating radar (GPR) and electromagnetic induction (EMI), and ancillary variables (remote sensing and
topographic data) in an arid region of southeastern Iran. GPR measurements were taken along ten
transects of 100 m length with a line spacing of 10 m, and EMI measurements were made every 10 m along
the same transects at six sites. Ten soil cores were sampled randomly at each site, soil samples were taken
from the 0–20 and 20–40 cm depths, and the clay fraction of each of the sixty soil samples was measured in
the laboratory. Clay content was predicted using three sets of input properties (geophysical data, ancillary
data, and a combination of both) with multiple linear regression (MLR) and the decision tree-based
Chi-Squared Automatic Interaction Detection (CHAID) algorithm. The results of the CHAID and MLR
models with the combined data showed that geophysical data were the most important variables for predicting
clay content at both depths in the study area. The proposed MLR model, using the combined data, could
explain only 44% and 31% of the total variability of clay content at the 0–20 and 20–40 cm depths, respectively.
In contrast, the coefficient of determination (R2) values for clay content prediction using the CHAID
model with the combined data were 0.82 and 0.76 at the 0–20 and 20–40 cm depths, respectively. CHAID models,
therefore, showed greater potential for predicting soil clay content from geophysical and ancillary data than
traditional regression methods (i.e., the MLR models). Overall, the results may encourage
researchers to use georeferenced GPR and EMI data as ancillary variables, together with the CHAID algorithm,
to improve the estimation of soil clay content.
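
The coefficient of determination used above to compare the models can be illustrated with a minimal sketch. It assumes the standard definition R2 = 1 - SS_res / SS_tot; the function name `r_squared` and the clay values are hypothetical, invented purely for illustration, and are not data from this study. Under this definition an R2 of 0.82, for example, corresponds to about 82% of the variability in clay content being explained by the model.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination, R2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    obs_mean = mean(observed)
    # Residual sum of squares: variability left unexplained by the model
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variability of the observations around their mean
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical clay contents (%), NOT measurements from the study
observed_clay = [12.0, 18.0, 25.0, 31.0, 22.0, 15.0]
predicted_clay = [14.0, 17.0, 23.0, 28.0, 24.0, 16.0]

r2 = r_squared(observed_clay, predicted_clay)
percent_explained = 100.0 * r2  # R2 expressed as percent of variance explained
print(f"R2 = {r2:.2f} ({percent_explained:.0f}% of variance explained)")
```

The same function can be applied to the predictions of any model (MLR, CHAID, or another regressor), so values computed this way are directly comparable across methods as long as they are evaluated on the same samples.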