Improving predictions of hydrological low-flow indices in ungaged basins using machine learning
Scott C. Worland, William H. Farmer, Julie E. Kiang
We compare the ability of eight machine-learning models (elastic net, gradient boosting, kernel k-nearest neighbors, two variants of support vector machines, M5-cubist, random forest, and a meta-learning ensemble M5-cubist model) and four baseline models (ordinary kriging, a unit area discharge model, and two variants of censored regression) to estimate the annual minimum 7-day mean streamflow with an annual exceedance probability of 90% (7Q10) at 224 unregulated sites in South Carolina, Georgia, and Alabama, USA. The machine-learning models produced substantially lower cross-validation errors than the baseline models. The meta-learning M5-cubist model had the lowest root-mean-square error, 26.72 cubic feet per second. Partial dependence plots suggest that 7Q10s are moderated by late summer and early fall precipitation and by the infiltration capacity of basin soils.
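To make the 7Q10 definition concrete, the following minimal sketch computes the annual minimum 7-day mean flow for each year of a gaged record and then takes the empirical 10th percentile of those minima, that is, the flow exceeded in about 90% of years. It is an illustration under stated assumptions, not the procedure used in this study: the function names and the input layout (a dictionary of daily flows keyed by year) are hypothetical, years are treated as simple calendar blocks, and an empirical quantile stands in for the fitted frequency distribution that operational low-flow analyses commonly use.

```python
from statistics import mean


def annual_min_7day_mean(daily_flows):
    """Return the smallest 7-day moving-average flow in one year of daily values."""
    if len(daily_flows) < 7:
        raise ValueError("need at least 7 daily values")
    # Slide a 7-day window over the year and keep the lowest window mean.
    return min(mean(daily_flows[i:i + 7]) for i in range(len(daily_flows) - 6))


def empirical_7q10(flows_by_year):
    """Empirical 10th percentile of the annual 7-day minima.

    Assumption: linear interpolation between order statistics replaces
    a fitted probability distribution.
    """
    minima = sorted(annual_min_7day_mean(f) for f in flows_by_year.values())
    pos = 0.10 * (len(minima) - 1)      # fractional rank of the 10% quantile
    lo = int(pos)
    hi = min(lo + 1, len(minima) - 1)
    return minima[lo] + (pos - lo) * (minima[hi] - minima[lo])


if __name__ == "__main__":
    # Synthetic record (hypothetical data): 11 years, each with a constant flow.
    record = {1990 + k: [float(k + 1)] * 30 for k in range(11)}
    print(empirical_7q10(record))
```

With a short record the empirical quantile is sensitive to individual dry years, which is one reason a fitted distribution is usually preferred at gaged sites.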
Scott C. Worland is a hydrologist who applies statistical learning methods to answer pressing earth science questions. He hopes to bring strong disciplinary knowledge to interdisciplinary research teams. Research interests include statistical hydrology, machine learning, and Bayesian statistics.