Oceanic lead concentrations and isotopes mapped using explainable machine learning
Olivelli and her colleagues (2025, see reference below) trained three machine learning models with the non-linear regression algorithm XGBoost to generate global climatologies of lead (Pb) concentrations and of the 206Pb/207Pb and 208Pb/207Pb isotope ratios. The models were trained and tested on high-quality Pb data collected as part of the GEOTRACES programme, which the authors supplemented with historical data. They use climatologies of oceanographic and atmospheric variables as the features from which Pb concentrations and isotope compositions are predicted.
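To make the idea behind this kind of regression concrete, here is a minimal sketch of gradient boosting with one-split trees (stumps) in plain Python. It is not XGBoost and not the authors' code: XGBoost adds deeper trees, regularisation, and second-order gradient information. The two features, the target formula, and every value are synthetic assumptions made up for illustration, not GEOTRACES data, and the error is measured on the training data only.

```python
import random

def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lmean, rmean)
    return best[1:]  # (feature index, threshold, left value, right value)

def predict_stump(stump, row):
    j, t, lval, rval = stump
    return lval if row[j] <= t else rval

def fit_boosted(X, y, n_rounds=30, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a damped copy."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)

# Synthetic, hypothetical data: [temperature, salinity] -> made-up target.
random.seed(0)
X = [[random.uniform(0.0, 30.0), random.uniform(33.0, 37.0)] for _ in range(100)]
y = [10.0 + 0.05 * (t - 15.0) ** 2 + 2.0 * (s - 35.0) for t, s in X]

model = fit_boosted(X, y)
mean_y = sum(y) / len(y)
mse_baseline = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse_model = sum((yi - predict(model, row)) ** 2 for yi, row in zip(y, X)) / len(y)
print(f"baseline MSE: {mse_baseline:.3f}, boosted MSE: {mse_model:.3f}")
```

The boosted ensemble captures the curved temperature dependence that a single mean (or a straight line) would miss, which is the sense in which such models are non-linear regressors.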
Using the SHAP framework, they examined which factors most influenced the models' predictions. Interestingly, the most important predictors of Pb concentrations differ slightly from those of Pb isotope ratios. All three models rely on seawater temperature, atmospheric dust, and salinity, but 206Pb/207Pb ratios are also sensitive to dissolved oxygen concentration, whereas Pb concentrations and 208Pb/207Pb ratios are more sensitive to atmospheric black carbon and surface chlorophyll a.
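SHAP attributions are built on Shapley values, which split the difference between a prediction and a reference prediction among the input features. The sketch below computes exact Shapley values by brute force for a tiny hand-written function. It is an illustration of the underlying idea only: it does not use the SHAP library, and `toy_model`, its coefficients, and the input values are hypothetical, with feature names borrowed from the predictors mentioned above.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for sample x relative to a baseline sample.

    Features absent from a coalition are set to their baseline values.
    Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_model(features):
    """Hypothetical stand-in for a trained model (not the authors' model)."""
    temperature, dust, salinity = features
    return 2.0 * temperature + dust * salinity

x = [1.0, 2.0, 3.0]         # hypothetical sample: temperature, dust, salinity
baseline = [0.0, 0.0, 0.0]  # hypothetical reference sample

phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["temperature", "dust", "salinity"], phi):
    print(f"{name}: {contribution:+.2f}")
```

The contributions sum to `toy_model(x) - toy_model(baseline)`, and the interaction term `dust * salinity` is shared equally between the two features involved. Ranking features by the average magnitude of such contributions over many samples is one common way to read feature importance from this kind of attribution.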
The model predictions match observations well. They confirm that the Indian Ocean is the basin most impacted by lead pollution, whereas lead pollution is decreasing in the Atlantic and Pacific Oceans. Intermediate waters of southern origin are spreading and carrying highly anthropogenic Pb isotope signatures to the other oceans, making the Southern Ocean a key area to prioritise in future sampling campaigns.

Reference:
Olivelli, A., Arcucci, R., Rehkämper, M., & van de Flierdt, T. (2025). Mapping the global distribution of lead and its isotopes in seawater with explainable machine learning. Earth System Science Data, 17, 3679–3699. Access the paper: 10.5194/essd-17-3679-2025