B-Tagging through Latent Representation and GNN at the CMS Experiment.
In High Energy Physics (HEP), jets are collections of hadrons originating from the production of a quark or gluon. Reconstructed jets point predominantly along the direction of propagation of the original quark or gluon and carry its momentum. In experiments such as CMS, the energy deposited by this collection of hadrons, described in the eta-phi plane, is measured by specialized detectors called calorimeters. Jets are frequently among the main contributors to the data recorded in high-energy collisions, which makes jet classification algorithms a relevant research subject.
In this work, we focus on jets originating from the bottom quark (b-jets). Accurately identifying b-jets is crucial when studying and characterizing various channels, such as top quark events and numerous new physics scenarios. It holds significant relevance in experimental HEP because it allows physicists to distinguish events involving b quarks from other particle interactions.
We propose to explore the use of two Machine Learning techniques (Latent Representation and Graph Neural Networks) to discriminate b-jets from c-jets and from jets of other flavours (u, d, s, or gluon). For this purpose, the trajectories of the charged particles (tracks) within the jets and the flavour from Monte Carlo truth were combined with information from the jets themselves to create a concatenated vector of the jet-track system. Starting from this initial representation, a Convolutional AutoEncoder architecture was employed to map these vectors to a latent space, yielding a new representation that could distinguish features of different jet flavours. This latent representation was then used to build a graph model of the entire jet, which was fed to a Graph Neural Network to learn the flavour discrimination.
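The data flow described above (jet-track vectors, a latent encoding, a graph of the jet, and a graph-level flavour score) can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption made for illustration and is not taken from the analysis: the feature counts, the random untrained weights, the fully connected graph with one node per track, and the mean-aggregation message-passing layer. In particular, a plain linear map stands in for the encoder half of the Convolutional AutoEncoder, and the Monte Carlo truth flavour is left out of the vectors.

```python
import numpy as np

def jet_track_vectors(jet_features, track_features):
    """Concatenate the jet-level features onto each track's features."""
    n_tracks = track_features.shape[0]
    tiled_jet = np.tile(jet_features, (n_tracks, 1))
    return np.concatenate([tiled_jet, track_features], axis=1)

def fully_connected_adjacency(n_nodes):
    """Adjacency matrix linking every pair of distinct nodes (no self-loops)."""
    return np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)

def message_passing_step(node_features, adjacency, w_self, w_neigh):
    """One mean-aggregation message-passing layer followed by a ReLU."""
    degree = np.maximum(adjacency.sum(axis=1, keepdims=True), 1.0)
    neighbour_mean = adjacency @ node_features / degree
    return np.maximum(node_features @ w_self + neighbour_mean @ w_neigh, 0.0)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

rng = np.random.default_rng(0)

# Hypothetical inputs: 4 jet-level features and 6 tracks with 5 features each.
jet = rng.normal(size=4)
tracks = rng.normal(size=(6, 5))
vectors = jet_track_vectors(jet, tracks)  # one row per jet-track pair

# Stand-in for a trained encoder: a random linear map to a 3-dim latent space.
latent_dim = 3
encoder = rng.normal(size=(vectors.shape[1], latent_dim))
latent = vectors @ encoder

# Graph of the jet: one node per track, all tracks connected (assumed topology).
adj = fully_connected_adjacency(latent.shape[0])
w_self = rng.normal(size=(latent_dim, latent_dim))
w_neigh = rng.normal(size=(latent_dim, latent_dim))
hidden = message_passing_step(latent, adj, w_self, w_neigh)

# Graph-level readout: mean over nodes, then scores for (b, c, other).
w_out = rng.normal(size=(latent_dim, 3))
scores = softmax(hidden.mean(axis=0) @ w_out)
```

With untrained weights the scores carry no physical meaning; the sketch only shows how the array shapes move from per-track vectors to a single per-jet flavour prediction.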