Disentangling Jet Categories at Colliders
Machine Learning for Jet Physics Workshop
Eric M. Metodiev, Center for Theoretical Physics, Massachusetts Institute of Technology
Joint work with Patrick Komiske and Jesse Thaler
November 16, 2018
[1809.01140] [1802.00008]
Jet-by-jet classification
Menu: “What type of jet is this?”
u, d, s; g; c; b; W/Z; t; H
Or New Physics (who ordered that?)
Eric M. Metodiev, MIT Disentangling Jet Categories 2
Disentangling Distributions
Menu: “What types of jets are these?”
Unsupervised Learning? Data-driven categories?
Disentangling Distributions
This talk: Towards experimentally measuring separate quark and gluon distributions
Why?
• Better understand QCD jets
• Data-driven quark/gluon taggers
• Well-defined jet categories & labels
• Parton shower tuning?
• Better extraction of $\alpha_s$?
Think distribution-level, not per-jet level.
Don’t need a perfect tagger! Don’t need MC fractions or templates!
Jet-Level Techniques
• Classification: Jet Tagging (q vs. g)
• Regression: Pileup Removal
Clustering: Jet Finding
Dataset-Level Techniques
• Topic Modeling: This talk. Our Goal: Find “jet types” that best explain the data
• Anomaly Detection: ML New Physics Searches
Topic modeling
Treat text documents as statistical mixtures of “topics”: distributions over words.
Can you extract the underlying “topics” given only the documents?
Yes* (*Terms and conditions apply)
Topic modeling
Can you extract the underlying “topics” given only the documents?
Yes, as long as the topics are “mutually irreducible” (M.I.) [1710.01167] [1204.1956]: each topic must have an “anchor” word that doesn’t appear in any other topic.
A quick example: The term “energy conservation” appears in Physics papers and in Climate Science papers. However, only Physics papers contain “Noether’s Theorem” and only Climate Science papers contain “Kyoto Protocol”. These are the anchor words. Hence Physics and Climate Science are mutually irreducible topics.
An Example
Let’s model physicists as random jargon emitters.
[Figure: speech bubbles with jargon: “Deep Learning for Jet Tagging?”, “IRC safety!”, “Use ROOT.”, “Trivial.”]
An Example
Listen to the jargon emitted from two different conferences.
[Figure: speech bubbles at Conf. A and Conf. B: “Deep Learning for Jet Tagging?”, “IRC safety!”, “Trivial.”, “Use ROOT.”]
$\frac{N^{\text{Conf. A}}_{\text{"ROOT"}}}{N^{\text{Conf. B}}_{\text{"ROOT"}}} = \frac{f^{\text{Conf. A}}_{\text{Expt.}}}{f^{\text{Conf. B}}_{\text{Expt.}}}$
$\frac{N^{\text{Conf. A}}_{\text{"Trivial"}}}{N^{\text{Conf. B}}_{\text{"Trivial"}}} = \frac{1 - f^{\text{Conf. A}}_{\text{Expt.}}}{1 - f^{\text{Conf. B}}_{\text{Expt.}}}$
An Example
Disentangle theorist and experimentalist vocabularies from the jargon at conferences.
Pure theorist and experimentalist jargon.
[Figure: the same speech bubbles and anchor-phrase ratio equations as on the previous slide, plus “phase space” is key]
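The anchor-phrase counting in this example can be sketched in a few lines of Python. The vocabularies, probabilities, and conference fractions below are invented purely for illustration and are not from the talk; the sketch only shows that ratios of anchor-phrase frequencies between two conferences are enough to recover the experimentalist fractions.

```python
# Hypothetical jargon vocabularies: probability of each phrase per utterance.
# "Use ROOT." is an anchor phrase for experimentalists, "Trivial." for theorists.
P_EXPT = {"Deep Learning for Jet Tagging?": 0.6, "Use ROOT.": 0.4}
P_THRY = {"Deep Learning for Jet Tagging?": 0.5, "IRC safety!": 0.2, "Trivial.": 0.3}


def conference_jargon(f_expt):
    """Phrase frequencies at a conference with experimentalist fraction f_expt."""
    phrases = set(P_EXPT) | set(P_THRY)
    return {
        w: f_expt * P_EXPT.get(w, 0.0) + (1.0 - f_expt) * P_THRY.get(w, 0.0)
        for w in phrases
    }


def fractions_from_anchors(freq_a, freq_b, expt_anchor, thry_anchor):
    """Solve for the experimentalist fractions of conferences A and B.

    Because each anchor phrase appears in only one vocabulary:
      r_expt = f_A / f_B   and   r_thry = (1 - f_A) / (1 - f_B).
    """
    r_expt = freq_a[expt_anchor] / freq_b[expt_anchor]
    r_thry = freq_a[thry_anchor] / freq_b[thry_anchor]
    f_b = (1.0 - r_thry) / (r_expt - r_thry)
    f_a = r_expt * f_b
    return f_a, f_b


# Assumed (made-up) truth: conference A is 80% experimentalists, B is 30%.
conf_a = conference_jargon(0.8)
conf_b = conference_jargon(0.3)
f_a, f_b = fractions_from_anchors(conf_a, conf_b, "Use ROOT.", "Trivial.")
print(f_a, f_b)
```

The solver never looks at the individual vocabularies, only at the two observed conference-level frequency tables, which is the sense in which the procedure is data-driven.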
Collider data as mixtures of jet types
A mathematical correspondence between topic models and jet distributions.
Collider data as mixtures of jet types
This is an unfamiliar way to think about machine learning and jet physics.
[Figure: Jet Samples A, B, and C, each a mixture of quark jets and gluon jets]
We are going to use observables and model outputs not as classifiers, but as feature spaces to extract mixture fractions.
Disentangling Distributions
Menu: “What types of jets are these?”
• Take your favorite jet algorithm: Anti-kT, R=0.4
• Consider multiple jet samples: Sample A: Z + jet; Sample B: dijets
  $p_{\text{sample } A}(\mathbf{x}) = f_A^q\, p_{\text{quark}}(\mathbf{x}) + (1 - f_A^q)\, p_{\text{gluon}}(\mathbf{x})$
  $p_{\text{sample } B}(\mathbf{x}) = f_B^q\, p_{\text{quark}}(\mathbf{x}) + (1 - f_B^q)\, p_{\text{gluon}}(\mathbf{x})$
• Select a substructure feature space: Constituent Multiplicity, Jet Mass, Soft Drop Multiplicity, Model Output
Goal: Find the underlying categories which explain the variation in substructure among the samples.
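Demixing the mixtures
$p_A(\mathbf{x}) = f_A^q\, p_{\text{quark}}(\mathbf{x}) + (1 - f_A^q)\, p_{\text{gluon}}(\mathbf{x})$
$p_B(\mathbf{x}) = f_B^q\, p_{\text{quark}}(\mathbf{x}) + (1 - f_B^q)\, p_{\text{gluon}}(\mathbf{x})$
$\kappa_{AB} \equiv \min_{\mathbf{x}} \frac{p_A(\mathbf{x})}{p_B(\mathbf{x})} = \frac{1 - f_A^q}{1 - f_B^q}$, $\kappa_{BA} \equiv \min_{\mathbf{x}} \frac{p_B(\mathbf{x})}{p_A(\mathbf{x})} = \frac{f_B^q}{f_A^q}$
With reducibility factors $\kappa_{AB}$ and $\kappa_{BA}$, solve for the quark and gluon fractions and distributions:
$f_A^q = \frac{1 - \kappa_{AB}}{1 - \kappa_{AB}\kappa_{BA}}$, $f_B^q = \frac{\kappa_{BA}(1 - \kappa_{AB})}{1 - \kappa_{AB}\kappa_{BA}}$
$p_{\text{quark}}(\mathbf{x}) = \frac{p_A(\mathbf{x}) - \kappa_{AB}\, p_B(\mathbf{x})}{1 - \kappa_{AB}}$, $p_{\text{gluon}}(\mathbf{x}) = \frac{p_B(\mathbf{x}) - \kappa_{BA}\, p_A(\mathbf{x})}{1 - \kappa_{BA}}$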
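The demixing formulas can be checked on a toy binned example. The sketch below is a minimal illustration, not the authors' implementation: the four-bin quark and gluon histograms and the fractions 0.8 and 0.3 are made up, the two underlying distributions are chosen to be mutually irreducible (each has a bin the other never populates), and sample A is taken to be the quark-enriched one.

```python
def reducibility(p_num, p_den):
    """kappa = min over bins of p_num / p_den, skipping bins where p_den is zero."""
    return min(n / d for n, d in zip(p_num, p_den) if d > 0)


def demix(p_a, p_b):
    """Extract quark fractions and topic distributions from two binned mixtures."""
    k_ab = reducibility(p_a, p_b)
    k_ba = reducibility(p_b, p_a)
    denom = 1.0 - k_ab * k_ba
    f_a = (1.0 - k_ab) / denom
    f_b = k_ba * (1.0 - k_ab) / denom
    p_quark = [(a - k_ab * b) / (1.0 - k_ab) for a, b in zip(p_a, p_b)]
    p_gluon = [(b - k_ba * a) / (1.0 - k_ba) for a, b in zip(p_a, p_b)]
    return f_a, f_b, p_quark, p_gluon


def mix(f_quark, p_quark, p_gluon):
    """Binned mixture f * p_quark + (1 - f) * p_gluon."""
    return [f_quark * q + (1.0 - f_quark) * g for q, g in zip(p_quark, p_gluon)]


# Hypothetical, mutually irreducible toy distributions over four bins.
TRUE_QUARK = [0.5, 0.3, 0.2, 0.0]
TRUE_GLUON = [0.0, 0.2, 0.3, 0.5]

p_a = mix(0.8, TRUE_QUARK, TRUE_GLUON)  # quark-enriched sample (assumed fraction)
p_b = mix(0.3, TRUE_QUARK, TRUE_GLUON)  # gluon-enriched sample (assumed fraction)

f_a, f_b, p_quark, p_gluon = demix(p_a, p_b)
print(f_a, f_b)
```

This toy uses exact probabilities. With finite statistics, the minimum of a ratio of noisy histograms is driven by low-count bins and would need more careful treatment, which this sketch does not attempt.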
Exploring substructure feature spaces
Why restrict ourselves to multiplicity? It works, but we can explore this choice.
We can also use a trained classifier (with CWoLa) as an observable in its own right.
Observables
• Multiplicity $n_{\text{const}}$: Number of particles in the jet
• Soft Drop Multiplicity $n_{\text{SD}}$: Probes number of perturbative emissions
• Image Activity $N_{95}$: Number of pixels with 95% of jet $p_T$
• N-subjettiness $\tau_2^{(\beta=1)}$: Probes how multi-pronged the jet is
• Jet Mass $m$: Mass of the total jet four-vector
• Width $w$: Probes the girth of the jet
Models (See Patrick’s talk!)
• PFN-ID: Full particle-level information
• PFN: Full four-momentum information
• EFN: Full IRC-safe information
• EFPs: Full IRC-safe information, linearly
• CNN: Trained on two-channel jet images
• DNN: Trained on an N-subjettiness basis
[P.T. Komiske, EMM, J. Thaler, 1810.05165]
Extracting quark and gluon fractions
With the topics procedure, the quark and gluon fractions of the samples can be obtained.
Extracting quark and gluon distributions
The extracted quark and gluon fractions can be used to obtain any quark/gluon distributions.
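One way to read this statement: once the quark fractions of two samples are known, the two mixture equations form a 2x2 linear system in every bin of any observable, so the pure distributions follow by inversion even for observables that are not mutually irreducible. The sketch below illustrates this under assumed inputs; the three-bin histograms and the fractions are hypothetical, and `unmix` is an illustrative helper name, not something from the talk.

```python
def unmix(p_a, p_b, f_a, f_b):
    """Invert the two-sample mixture equations bin by bin.

    p_a = f_a * p_quark + (1 - f_a) * p_gluon
    p_b = f_b * p_quark + (1 - f_b) * p_gluon
    """
    if f_a == f_b:
        raise ValueError("samples must have different quark fractions")
    d = f_a - f_b
    p_quark = [((1.0 - f_b) * a - (1.0 - f_a) * b) / d for a, b in zip(p_a, p_b)]
    p_gluon = [(f_a * b - f_b * a) / d for a, b in zip(p_a, p_b)]
    return p_quark, p_gluon


# Hypothetical observable whose quark and gluon distributions overlap in every
# bin, so it has no anchor bins of its own.
TRUE_QUARK = [0.6, 0.3, 0.1]
TRUE_GLUON = [0.2, 0.3, 0.5]
F_A, F_B = 0.8, 0.3  # assumed to have been extracted in another feature space

p_a = [F_A * q + (1.0 - F_A) * g for q, g in zip(TRUE_QUARK, TRUE_GLUON)]
p_b = [F_B * q + (1.0 - F_B) * g for q, g in zip(TRUE_QUARK, TRUE_GLUON)]

p_quark, p_gluon = unmix(p_a, p_b, F_A, F_B)
print(p_quark, p_gluon)
```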
(Self-)calibrating quark and gluon classifiers
The extracted quark and gluon fractions can calibrate any data-driven quark/gluon classifiers.
Jet topics from perturbative QCD
Topic modeling for jets can be understood and calculated from perturbative QCD.
In the formulas below, $\kappa_{ab} \equiv \min p_a / p_b$, the same convention as $\kappa_{AB}$.
Jet mass (like many shape observables) exhibits Casimir scaling at Leading Logarithmic accuracy:
$\Sigma_g(m) = \Sigma_q(m)^{C_A/C_F}$
$\kappa_{qg}^{\text{Cas.}} = \frac{C_F}{C_A} = \frac{4}{9}$, $\kappa_{gq}^{\text{Cas.}} = 0$
Soft Drop Multiplicity (like many count observables) exhibits Poisson scaling at Leading Logarithmic accuracy:
$p_q(n) = \text{Pois}(n; C_F \lambda)$, $p_g(n) = \text{Pois}(n; C_A \lambda)$
$\kappa_{gq}^{\text{Pois.}} = e^{\lambda (C_F - C_A)}$, $\kappa_{qg}^{\text{Pois.}} = 0$
See back-up slides for more.
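The quoted reducibility factors can be verified numerically. The sketch below is a rough check under stated assumptions: $C_F = 4/3$ and $C_A = 3$ are the standard QCD color factors, the value $\lambda = 1.5$ is arbitrary, the multiplicity scan is truncated at a finite `n_max`, and the ratio of densities for the Casimir case is obtained from the chain rule applied to $\Sigma_g = \Sigma_q^{C_A/C_F}$.

```python
import math

C_F, C_A = 4.0 / 3.0, 3.0  # QCD color factors


def poisson_pmf(n, mu):
    """Poisson probability, evaluated in log space for numerical stability."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def poisson_kappas(lam, n_max=60):
    """Reducibility factors for p_q = Pois(n; C_F*lam), p_g = Pois(n; C_A*lam).

    Scans n = 0..n_max, so the factor that vanishes only as n -> infinity
    comes out tiny rather than exactly zero.
    """
    ns = range(n_max + 1)
    k_gq = min(poisson_pmf(n, C_A * lam) / poisson_pmf(n, C_F * lam) for n in ns)
    k_qg = min(poisson_pmf(n, C_F * lam) / poisson_pmf(n, C_A * lam) for n in ns)
    return k_gq, k_qg


def casimir_kappas(n_grid=10000):
    """Reducibility factors under Casimir scaling, scanned over Sigma_q in (0, 1].

    Differentiating Sigma_g = Sigma_q**(C_A/C_F) with respect to the mass gives
    p_g / p_q = (C_A/C_F) * Sigma_q**(C_A/C_F - 1).
    """
    sigmas = [(i + 1) / n_grid for i in range(n_grid)]
    ratio_gq = [(C_A / C_F) * s ** (C_A / C_F - 1.0) for s in sigmas]
    k_gq = min(ratio_gq)                    # tends to 0 as the grid approaches Sigma_q = 0
    k_qg = min(1.0 / r for r in ratio_gq)   # attained at Sigma_q = 1
    return k_gq, k_qg


LAM = 1.5  # arbitrary illustrative value
pois_gq, pois_qg = poisson_kappas(LAM)
cas_gq, cas_qg = casimir_kappas()
print(pois_gq, math.exp(LAM * (C_F - C_A)))
print(cas_qg, C_F / C_A)
```

In the Poisson case the nonzero factor $e^{\lambda(C_F - C_A)}$ shrinks as $\lambda$ grows, whereas in the Casimir case the factor stays fixed at $4/9$.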