Classifying hadronic objects in ATLAS with ML/AI algorithms
The identification of hadronic final states plays a crucial role in the physics programme of the ATLAS Experiment at the CERN LHC. Sophisticated artificial intelligence (AI) algorithms are employed to classify jets according to their origin, distinguishing between quark- and gluon-initiated jets, and identifying hadronically decaying heavy objects such as W bosons and top quarks. This contribution summarises recent developments in constituent-based tagging architectures, including graph neural networks (GNNs) and transformer-based approaches, their performance in simulated and real data, and future perspectives towards data-driven optimisation and model-independent tagging strategies.
💡 Research Summary
The ATLAS Collaboration has presented a comprehensive review of modern machine‑learning techniques for classifying hadronic objects in LHC proton‑proton collisions. Traditional taggers relied on high‑level jet substructure variables (track multiplicity, width, etc.) combined in boosted decision trees (BDTs). Recent advances have shifted the paradigm toward constituent‑based architectures that ingest the four‑vectors of all particles (or particle‑flow objects) inside a jet, thereby exploiting the full detector granularity.
The paper first outlines baseline deep‑learning approaches: fully‑connected DNNs (FC‑DNN), Energy Flow Networks (EFNs) and Particle Flow Networks (PFNs); the latter two are built on the Deep Sets formalism, which guarantees invariance under permutations of the jet constituents. While EFNs restrict themselves to infrared‑and‑collinear‑safe observables for theoretical robustness, PFNs accept a broader set of features.
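The structural idea behind EFNs and PFNs can be sketched in a few lines: a shared network φ embeds each constituent, the embeddings are summed (so the result cannot depend on the ordering of the particles), and a second network ρ classifies the pooled representation. The sketch below uses random illustrative weights and a toy jet, not the trained ATLAS models.

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(4, 8))   # shared per-particle embedding (illustrative)
W_rho = rng.normal(size=(8, 2))   # classifier head (illustrative)

def deep_sets_logits(constituents):
    """constituents: (n_particles, 4) array of e.g. (pt, eta, phi, E)."""
    embedded = np.maximum(constituents @ W_phi, 0.0)  # phi applied per particle, ReLU
    pooled = embedded.sum(axis=0)                     # permutation-invariant sum pooling
    return pooled @ W_rho                             # rho maps pooled features to logits

jet = rng.normal(size=(5, 4))          # toy jet with 5 constituents
shuffled = jet[rng.permutation(5)]     # same jet, constituents reordered
assert np.allclose(deep_sets_logits(jet), deep_sets_logits(shuffled))
```

The assertion makes the Deep Sets guarantee explicit: reordering the constituents cannot change the output, because the only cross-particle operation is a sum.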
Graph neural networks (GNNs) and transformer‑based models constitute the core of the state‑of‑the‑art. ParticleNet treats jet constituents as nodes in a graph, learning edge relations through message‑passing, and has become a reference for jet‑flavour and boosted‑object tagging. Transformer architectures such as DeParT (Dynamically Enhanced Particle Transformer) and GN2 replace static BDT features with attention blocks that dynamically weight relationships among particles. DeParT operates on particle‑flow objects (PFOs) and relational features, achieving markedly higher gluon‑jet rejection across a wide p_T range for small‑radius (R = 0.4) jets. GN2 is augmented with auxiliary tasks (vertex and track‑origin classification) that stabilise training, and delivers roughly a three‑fold improvement in light‑jet rejection relative to the Run‑2 baselines DL1r and DL1d.
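What "attention blocks that dynamically weight relationships among particles" means can be illustrated with a toy single-head self-attention layer (random illustrative weights, not the DeParT or GN2 architectures): every constituent computes input-dependent weights over all other constituents, in contrast to a fixed set of precomputed features.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6  # toy per-particle feature dimension
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))  # illustrative weights

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """x: (n_particles, d) constituent features -> (n_particles, d) outputs."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    attn = softmax(q @ k.T / np.sqrt(d))  # (n, n) particle-particle weights, input-dependent
    return attn @ v                       # each particle aggregates all others

x = rng.normal(size=(4, d))   # toy jet with 4 constituents
out = self_attention(x)
```

Like the sum pooling of Deep Sets, this layer respects the unordered nature of jet constituents: permuting the input rows simply permutes the output rows.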
For quark–gluon discrimination, Figure 2 in the paper shows that DeParT reaches gluon‑jet rejection factors of order 10² at a quark‑identification efficiency of 70 %, outperforming FC‑DNN, EFN, PFN, and ParticleNet. The performance remains robust as jet transverse momentum increases.
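The working-point numbers quoted above follow a generic recipe that is easy to state in code (shown here on toy Gaussian score distributions, not the ATLAS data): choose the score cut that retains the desired fraction of signal (quark) jets, then report the inverse of the background (gluon) efficiency at that cut as the rejection factor.

```python
import numpy as np

def rejection_at_efficiency(sig_scores, bkg_scores, target_eff=0.70):
    """Background rejection (1 / mistag rate) at a fixed signal efficiency."""
    cut = np.quantile(sig_scores, 1.0 - target_eff)  # keep top target_eff of signal
    bkg_eff = np.mean(bkg_scores >= cut)             # fraction of background passing
    return 1.0 / bkg_eff if bkg_eff > 0 else np.inf

rng = np.random.default_rng(2)
quark = rng.normal(1.5, 1.0, size=100_000)  # toy "quark-like" tagger scores
gluon = rng.normal(0.0, 1.0, size=100_000)  # toy "gluon-like" tagger scores
rej70 = rejection_at_efficiency(quark, gluon, 0.70)
```

Tightening the working point (lower signal efficiency) raises the rejection, which is why tagger comparisons such as Figure 2 are always quoted at a stated efficiency.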
In the boosted regime (large‑radius R = 1.0 jets), the Particle Transformer (ParT) and LundNet are highlighted. ParT, a transformer that combines PFO kinematics with track information, attains background rejection exceeding 10³ at 80 % signal efficiency for W‑boson jets. LundNet exploits the Lund Jet Plane clustering history as a graph, and its adversarially trained variant LundNetANN decorrelates the tagger output from jet mass, thereby reducing systematic uncertainties associated with mass‑dependent background modelling. Nevertheless, the study reports a substantial dependence on the underlying Monte‑Carlo generator: switching parton‑shower models can degrade performance by up to 40 %.
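The Lund Jet Plane input that LundNet consumes has a simple definition worth spelling out (a sketch of the standard coordinates, not the ATLAS implementation): each step of the declustering history, splitting a (sub)jet into a harder and a softer branch, is mapped to the point (ln(1/Δ), ln(k_t)), where Δ is the angular separation of the branches and k_t = p_T,soft · Δ.

```python
import math

def lund_coordinates(pt_soft, delta_r):
    """Lund Jet Plane coordinates of one declustering step.

    pt_soft : transverse momentum of the softer branch (GeV)
    delta_r : angular separation of the two branches
    """
    kt = pt_soft * delta_r                       # relative transverse momentum
    return math.log(1.0 / delta_r), math.log(kt)

# e.g. a 20 GeV soft branch at Delta R = 0.2 inside a large-radius jet
ln_inv_delta, ln_kt = lund_coordinates(20.0, 0.2)
```

Traversing the full clustering tree yields one such point per splitting; LundNet treats the tree of these points as a graph, which is the representation its adversarially decorrelated variant LundNetANN also uses.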
Top‑quark tagging similarly benefits from constituent‑based methods. ParticleNet delivers the best overall accuracy across a broad p_T spectrum, though its computational cost is higher than that of LundNet. The latter shows improved robustness against generator variations, making it attractive for analyses where systematic control is paramount.
The authors conclude that constituent‑based GNNs and transformers have become the leading architectures for jet tagging, surpassing traditional observable‑based methods in both discrimination power and flexibility. However, challenges remain: (i) generator‑level dependence, (ii) systematic uncertainty mitigation, and (iii) real‑time deployment constraints. Future work will focus on data‑driven validation, uncertainty reduction techniques (e.g., adversarial training, decorrelation losses), and hybrid models that combine transformer‑style attention with clustering‑tree‑based graph representations. Such developments aim to deliver model‑independent, physics‑transparent taggers that can be reliably applied to the upcoming high‑luminosity LHC data‑taking periods.