Project 01  ·  HSF-India Project

Higgs to WW Analysis

Using CMS 2016 Ultra-Legacy NanoAOD Open Data

Overview

This project develops a complete analysis pipeline for the Higgs boson decaying into two W bosons in the opposite-sign, different-flavour (electron-muon) final state, targeting the gluon-gluon fusion (ggH) production mode. It uses CMS 2016 Ultra-Legacy Open Data to probe Standard Model physics while demonstrating the scope of CERN Open Data for education and research.

Analysis Highlights
  • Analysis Pipeline — End-to-end reconstruction of \(H \to WW \to e^\pm\mu^\mp\) (ggH) using CMS Open Data, from trigger selection to final-state object definition.
  • Object Reconstruction — Implementation of lepton identification, isolation criteria, and missing transverse energy \( E_{\text{T}}^{\text{miss}} \) reconstruction for neutrino inference.
  • Signal Extraction — Kinematic and topological selections applied to isolate signal from dominant backgrounds (top and Drell-Yan).
  • Statistical Analysis — Signal significance estimation with evaluation of statistical uncertainties in a low signal-to-background regime.
  • Reproducible Workflow — Development of a documented and modular analysis chain for HSF-India, emphasizing transparency, corrections, and known limitations.
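
As a rough illustration of the significance estimate mentioned above, the sketch below compares the naive \( s/\sqrt{b} \) figure with the Asimov approximation for a single-bin counting experiment. The choice of estimator, the function names, and the yields are assumptions made for illustration; the project does not state which formula it uses, and the numbers are not analysis results.

```python
import math


def asimov_significance(s: float, b: float) -> float:
    """Asimov approximation to the median expected discovery significance
    for a counting experiment with signal yield s and background yield b
    (statistical uncertainty only)."""
    if b <= 0:
        raise ValueError("background yield must be positive")
    if s <= 0:
        return 0.0
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


def naive_significance(s: float, b: float) -> float:
    """Simple s / sqrt(b) estimate, valid when s is much smaller than b."""
    if b <= 0:
        raise ValueError("background yield must be positive")
    return s / math.sqrt(b)


# Hypothetical yields, chosen only to mimic a low signal-to-background regime.
s_example, b_example = 50.0, 5000.0
z_asimov = asimov_significance(s_example, b_example)
z_naive = naive_significance(s_example, b_example)
print(f"Asimov Z = {z_asimov:.3f}, s/sqrt(b) = {z_naive:.3f}")
```

When the signal is small compared with the background, the two estimates nearly coincide. Neither accounts for systematic uncertainties on the background, which a full profile likelihood fit would include.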
Technical Pipeline
  • Core Stack — Uproot (I/O), Awkward Array (jagged structures), Vector (4-vector arithmetic), Hist (yield accumulation).
  • Distributed Computing — Dask for scalable parallel processing of millions of NanoAOD events.
  • Statistical Inference — CMS Combine for simultaneous profile likelihood fits.
  • Scale Factors — Trigger (HLT), lepton ID, and isolation corrections applied to simulation event weights.
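
The way scale factors enter the simulation weights can be sketched as a simple product. This is a minimal sketch under stated assumptions: the `SimEvent` fields, the function name, and all numbers are hypothetical, and the normalisation (cross-section times luminosity divided by the sum of generator weights) is the usual convention assumed here rather than something the project description spells out.

```python
from dataclasses import dataclass


@dataclass
class SimEvent:
    """Hypothetical per-event record for a simulated event."""
    gen_weight: float      # generator-level weight
    sf_trigger: float      # trigger (HLT) efficiency scale factor
    sf_lepton_id: float    # lepton identification scale factor
    sf_isolation: float    # lepton isolation scale factor


def event_weight(ev: SimEvent, xsec_pb: float, lumi_inv_pb: float,
                 sum_gen_weights: float) -> float:
    """Weight that scales one simulated event to the data luminosity
    and applies the data/simulation correction factors."""
    if sum_gen_weights == 0:
        raise ValueError("sum of generator weights must be non-zero")
    normalisation = xsec_pb * lumi_inv_pb / sum_gen_weights
    corrections = ev.sf_trigger * ev.sf_lepton_id * ev.sf_isolation
    return normalisation * ev.gen_weight * corrections


# Illustrative values only; not taken from the analysis.
example_event = SimEvent(gen_weight=1.0, sf_trigger=0.98,
                         sf_lepton_id=0.97, sf_isolation=0.99)
w = event_weight(example_event, xsec_pb=1.0, lumi_inv_pb=1000.0,
                 sum_gen_weights=1000.0)
print(f"event weight = {w:.6f}")
```

In a columnar workflow the same product would be evaluated on whole arrays at once; the scalar form is used here only to keep the sketch dependency-free.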
Project 02  ·  M.Sc. Thesis

Drell-Yan Process at CMS

Data Analysis of \( Z \to e^+ e^- \) at the CMS Experiment

Overview

Master's thesis centred on the Drell-Yan process at the LHC, working with Run 2 Monte Carlo samples from the CMS experiment to study Standard Model processes. The analysis focused on the \( Z \to e^+ e^- \) channel, isolating a clean Z boson signal from dominant backgrounds.

Analysis Highlights
  • Signal Extraction — Precision reconstruction of the \( Z \to e^+ e^- \) peak (60–120 GeV) using CMS Open Data with MVA-based electron identification (Fall17 V2).
  • Event Selection — Optimised kinematic cuts and lepton isolation criteria to enhance signal purity and data quality.
  • Background Suppression — Projected \( E_{\text{T}}^{\text{miss}} \) algorithm exploiting lepton–\( E_{\text{T}}^{\text{miss}} \) angular correlations to suppress \( t\bar{t} \) and diboson backgrounds.
  • Analysis Framework — C++/ROOT-based pipeline using TChain and TLorentzVector with custom data slimming for efficient multi-file processing.
  • Validation — Shape-based comparison with six MC backgrounds (including \( t\bar{t},\ WW,\ WZ,\ ZZ \)) via normalized data–simulation overlays.
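
The projected \( E_{\text{T}}^{\text{miss}} \) variable named above can be sketched as follows. The definition used here is one common in dilepton analyses: if the smallest azimuthal angle between \( E_{\text{T}}^{\text{miss}} \) and any selected lepton is below \( \pi/2 \), only the component transverse to that lepton is kept. The thesis's exact definition is not given in this summary, so the formula and the function names should be read as assumptions.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return dphi


def projected_met(met: float, met_phi: float, lepton_phis: list[float]) -> float:
    """Projected missing transverse energy (assumed definition).

    If the missing-energy vector is closer than pi/2 in azimuth to the
    nearest lepton, return its component perpendicular to that lepton;
    otherwise return the full magnitude.
    """
    if not lepton_phis:
        raise ValueError("at least one lepton is required")
    dphi_min = min(delta_phi(met_phi, phi) for phi in lepton_phis)
    if dphi_min < math.pi / 2.0:
        return met * math.sin(dphi_min)
    return met


# Illustrative values only: 40 GeV of missing energy, 30 degrees in
# azimuth from the nearest of two leptons.
example = projected_met(40.0, 0.0, [math.pi / 6.0, 3.0])
print(f"projected MET = {example:.2f} GeV")
```

Missing energy that points along a lepton, as can happen when a lepton is mismeasured, is reduced by the projection, while missing energy well separated from both leptons is left unchanged.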
Technical Implementation

C++-based ROOT scripts for event filtering, selection, and histogram production.
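
The thesis code itself is C++/ROOT; as a language-neutral illustration of the core selection step, the Python sketch below computes a dielectron invariant mass in the massless-electron approximation and tests it against the 60–120 GeV window quoted above. The function names and the example kinematics are hypothetical, and the formula stands in for what summing two four-vectors and taking the mass would give when the electron mass is neglected.

```python
import math


def dielectron_mass(pt1: float, eta1: float, phi1: float,
                    pt2: float, eta2: float, phi2: float) -> float:
    """Invariant mass (GeV) of two electrons treated as massless:
    m^2 = 2 * pt1 * pt2 * (cosh(delta_eta) - cos(delta_phi))."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def in_z_window(mass: float, low: float = 60.0, high: float = 120.0) -> bool:
    """True if the mass lies inside the Z-peak window (GeV)."""
    return low <= mass <= high


# Illustrative kinematics: two back-to-back central electrons.
m_ee = dielectron_mass(45.6, 0.0, 0.0, 45.6, 0.0, math.pi)
print(f"m_ee = {m_ee:.1f} GeV, in window: {in_z_window(m_ee)}")
```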

Broader Interests

What I'm Curious About

🤖
AI & ML in HEP
How machine learning is reshaping signal extraction, anomaly detection, and the future of BSM searches at the LHC.
🌐
Open Science
Democratising collider physics through open data, reproducible pipelines, and tools accessible to researchers without full collaboration access.
🔭
Beyond Standard Model
New physics searches — extended Higgs sectors, dark matter mediators, and precision measurements probing BSM effects.
⚙️
HEP Computing
Next-generation columnar analysis tools, distributed computing with Dask, and the HL-LHC data challenge.