Prediction of activation energy barrier of island diffusion processes using data-driven approaches

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We present data-driven models for predicting the activation energy barriers of diffusion processes of small adatom islands (1-4 atoms). A set of easily accessible geometric and energetic features, extracted by analyzing how the energy barriers vary across a large number of processes on the homo-epitaxial metallic systems Cu, Ni, Pd, and Ag, is used together with the activation energy barriers to train and test linear and non-linear statistical models. A multivariate linear regression model trained on the energy barriers of the Cu, Pd, and Ag systems explains 92% of the variation in the energy barriers of the Ni system, and a non-linear artificial neural network model raises this slightly to 93%. In a second mode of calculation, which uses the barriers of all four systems for training, the models predict the barriers of randomly picked processes from those systems with high correlation coefficients: 94.4% for the linear regression model and 97.7% for the artificial neural network model. Kinetic parameters for Ni dimer and trimer diffusion on the Ni(111) surface, such as the types of frequently executed processes and the effective energy barrier, obtained from KMC simulations using the predicted (data-enabled) energy barriers, are in close agreement with those obtained using energy barriers calculated from an interatomic interaction potential.


💡 Research Summary

The paper addresses the computational bottleneck in kinetic Monte Carlo (KMC) simulations of surface diffusion, namely the repeated calculation of activation energy barriers for a large number of elementary processes. By leveraging an existing self‑learning KMC (SLKMC) database, the authors assembled a dataset of 844 diffusion events involving 1‑ to 4‑atom adatom islands on the (111) surfaces of four homo‑epitaxial metals: Cu, Ni, Pd, and Ag. Each event is characterized by six physically motivated descriptors: (i) the number of bonds broken or formed during the hop (Δb), (ii) the shift of the island’s geometric centre (Δr), (iii) a binary label distinguishing A‑step (100) from B‑step (111) migration (x₃), (iv) the number of atoms participating in the move (x₄), (v) the binding energy of the island to the substrate (E_b), and (vi) the lateral interaction energy among island atoms (E_lat). These descriptors were chosen to be easily computable, largely uncorrelated, and to capture both geometric and energetic aspects of the diffusion process.
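As an illustration of how one such event could be stored and turned into a model input, here is a minimal sketch. The class name, field names, and the numeric values in the example are hypothetical placeholders chosen for this sketch; they are not taken from the paper or from the SLKMC database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffusionEvent:
    """One diffusion process described by the six descriptors (names are hypothetical)."""
    delta_b: int         # (i) bonds broken or formed during the hop
    delta_r: float       # (ii) shift of the island's geometric centre
    is_b_step: bool      # (iii) binary step-type label: False = A-step, True = B-step
    n_moving_atoms: int  # (iv) atoms participating in the move
    e_bind: float        # (v) binding energy of the island to the substrate, eV
    e_lat: float         # (vi) lateral interaction energy among island atoms, eV
    barrier: float       # target value: activation energy barrier, eV

    def feature_vector(self):
        """Return the six descriptors as a flat list of floats."""
        return [
            float(self.delta_b),
            self.delta_r,
            1.0 if self.is_b_step else 0.0,
            float(self.n_moving_atoms),
            self.e_bind,
            self.e_lat,
        ]


# Placeholder numbers, for illustration only.
example = DiffusionEvent(
    delta_b=2, delta_r=1.47, is_b_step=False,
    n_moving_atoms=1, e_bind=-2.9, e_lat=-0.35, barrier=0.12,
)
features = example.feature_vector()
```

Keeping the target barrier next to the descriptors makes it straightforward to build training and test sets from a list of such records.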

Two predictive frameworks were constructed. The first is a multivariate linear regression (MLR) model that assigns a linear weight to each descriptor. The second is a feed‑forward artificial neural network (ANN) with one hidden layer (10–15 neurons) to capture possible nonlinearities. Both models were trained and validated using cross‑validation and random hold‑out sets; regularization (L2) and early‑stopping were employed to avoid over‑fitting.
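The linear model can be sketched as an ordinary least-squares fit. The example below is a minimal, standard-library sketch under stated assumptions: it solves the normal equations directly, uses only two toy descriptors and noise-free synthetic data, and omits the regularization and cross-validation mentioned above. The function names and the weights are hypothetical and are not the authors' implementation or fitted coefficients.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, barriers):
    """Ordinary least squares; returns [intercept, w_1, ..., w_k]."""
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, barriers)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(weights, feature_row):
    """Evaluate the fitted linear model for one descriptor vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))


# Synthetic, noise-free demo with two toy descriptors (a bond-count change
# and a binding energy). The weights are invented for illustration.
rng = random.Random(0)
true_weights = [0.10, 0.25, -0.05]
features = [(rng.randint(0, 4), rng.uniform(-3.0, -1.0)) for _ in range(50)]
barriers = [predict_mlr(true_weights, f) for f in features]
weights = fit_mlr(features, barriers)
```

Because the synthetic barriers are generated without noise, the fit should recover the invented weights to within floating-point error. With real data one would normally use a numerical library and add the L2 penalty described above.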

When the MLR model was trained on Cu, Pd, and Ag data and tested on the Ni system, it achieved a coefficient of determination R² = 0.92 and a root‑mean‑square error (RMSE) of ≈0.07 eV. The ANN improved these metrics modestly to R² = 0.93 and RMSE ≈ 0.06 eV. Including all four metals in the training set and testing on randomly picked processes from those systems increased performance further: the abstract reports correlation coefficients of 94.4% for the linear model and 97.7% for the ANN. Feature‑importance analysis (regression coefficients for MLR and SHAP values for ANN) highlighted Δb and E_b as the dominant contributors, with Δr and the step‑type label providing secondary refinement for processes where bond breaking is minimal.
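The two evaluation metrics and the train-on-three, test-on-one protocol can be written down compactly. This is a minimal sketch: the record layout, the function names, and the barrier values are hypothetical and serve only to show the split and the formulas R² = 1 − SS_res/SS_tot and RMSE = sqrt(mean squared error).

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root-mean-square error, in the same units as the barriers (eV)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def leave_one_system_out(records, held_out):
    """Train on every metal except `held_out`; test on `held_out` only."""
    train = [r for r in records if r["metal"] != held_out]
    test = [r for r in records if r["metal"] == held_out]
    return train, test


# Invented records, for illustration only (barriers in eV).
records = [
    {"metal": "Cu", "barrier": 0.03},
    {"metal": "Pd", "barrier": 0.04},
    {"metal": "Ag", "barrier": 0.06},
    {"metal": "Ni", "barrier": 0.05},
    {"metal": "Ni", "barrier": 0.15},
]
train, test = leave_one_system_out(records, held_out="Ni")
```

A model fitted on `train` would then be scored with `r_squared` and `rmse` on the barriers in `test`, which mirrors the Ni hold-out experiment described above.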

To demonstrate practical utility, the predicted barriers were fed into KMC simulations of Ni dimer and trimer diffusion on Ni(111). The resulting kinetic observables (the frequencies of single‑atom, multi‑atom, and concerted moves, and the effective diffusion barrier) matched those obtained with barriers computed directly from embedded‑atom method (EAM) potentials to within 5%. This indicates that, for these systems, the data‑driven approach can stand in for expensive barrier calculations without degrading the long‑time kinetic predictions.
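To show where predicted barriers enter a KMC simulation, the sketch below converts barriers into Arrhenius rates and performs rejection-free KMC steps. It is a generic textbook-style illustration, not the SLKMC code used in the paper. The process names, barrier values, temperature, and the common attempt frequency of 10¹² s⁻¹ are all assumptions made for this example.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_rate(barrier_ev, temperature_k, prefactor_hz=1e12):
    """Rate = prefactor * exp(-E / (k_B T)); the prefactor is an assumed value."""
    return prefactor_hz * math.exp(-barrier_ev / (K_B * temperature_k))


def kmc_step(rates, rng):
    """Pick one process with probability rate / total and draw the time step."""
    total = sum(rates.values())
    threshold = rng.random() * total
    cumulative = 0.0
    chosen = None
    for name, rate in rates.items():
        cumulative += rate
        chosen = name
        if threshold < cumulative:
            break
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


# Hypothetical predicted barriers (eV) for three competing processes.
predicted_barriers = {"single_atom": 0.05, "multi_atom": 0.10, "concerted": 0.02}
temperature = 300.0  # K, assumed
rates = {name: arrhenius_rate(e, temperature) for name, e in predicted_barriers.items()}

rng = random.Random(42)
counts = {name: 0 for name in rates}
elapsed = 0.0
for _ in range(10000):
    process, dt = kmc_step(rates, rng)
    counts[process] += 1
    elapsed += dt
```

The `counts` dictionary corresponds to the "frequency of executed processes" observable. An effective barrier would typically be extracted afterwards from an Arrhenius fit of the diffusion coefficient over several temperatures, which this sketch does not attempt.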

The authors acknowledge several limitations. The descriptor set is currently tailored to very small islands; scaling to larger clusters or more complex surface reconstructions may require additional features. All training data derive from semi‑empirical EAM potentials, so the transferability to first‑principles (DFT) barriers remains to be tested. Finally, while the ANN captures nonlinear trends, its black‑box nature limits direct physical interpretation, prompting the authors to suggest future work with graph neural networks and transfer learning across a broader materials space.

In summary, the study presents a robust, physics‑informed machine‑learning pipeline that accurately predicts activation energy barriers for surface diffusion of small adatom islands across multiple metals. By achieving high correlation with explicit calculations and demonstrating successful integration into KMC, the work paves the way for accelerated atomistic simulations of surface evolution, catalysis, and thin‑film growth.

