Towards Data-Driven Particle Physics Classifiers

  1. Towards Data-Driven Particle Physics Classifiers. Deep Learning in the Natural Sciences, University of Hamburg, March 1, 2019. Eric M. Metodiev, Center for Theoretical Physics, Massachusetts Institute of Technology. Based on work with Patrick Komiske, Benjamin Nachman, Matthew Schwartz, and Jesse Thaler [1708.02949] [1801.10158] [1802.00008] [1809.01140].

  2. Outline: Classification at Colliders; Training on Data; Disentangling Categories. Eric M. Metodiev, MIT, Data-Driven Particle Physics Classifiers.

  4. Jet Classification. Jet origins: u, d, s; g; c; b; W/Z; t; H; or new physics (???).

  5. Jet Classification. Jet origins: quark; gluon; c; b; W/Z; t; H; or new physics (???).

  6. Jet Classification. Jet origins: quark; gluon; c; b; W/Z; t; H; or new physics (???). Quark vs. gluon: new physics signal: quark jets; QCD background: gluon jets.

  7. Jet Classification. Jet origins: quark; gluon; c; b; W/Z; t; H; or new physics (???). Quark vs. gluon: new physics signal: quark jets; QCD background: gluon jets. Color factors: $C_q = 4/3$, $C_g = 3$, so gluon jets are "twice as wide" as quark jets.

  8. Machine Learning with Jets. Jet representations: images, observables, sequences, point clouds, … Examples: [M. Andrews, et al., 1902.08276] [T. Cheng, 1711.02633] [L. de Oliveira, et al., 1511.05190] [G. Louppe, et al., 1702.00748] [K. Datta, A. Larkoski, 1704.08249] [P.T. Komiske, EMM, M.D. Schwartz, 1612.01551] [P.T. Komiske, EMM, J. Thaler, 1712.07124] [G. Kasieczka, N. Kiefer, T. Plehn, J. Thompson, 1812.09223]. All supervised classification methods require training data. It is impossible to isolate pure samples of quark jets and gluon jets. Classifiers often rely on simulation, which is sensitive to mismodeling.

  9. Simulation vs. Data. Simple two-feature quark vs. gluon jet classifier using simulation and data. [Figure: two panels, Simulation and Data, each with axes "number of particles in the jet" and "width of the jet".] Very different! [ATLAS Collaboration, 1405.6583] Is it possible to train classifiers on data?

  10. Outline. Classification at Colliders: classifying jets based on their originating particles. Training on Data. Disentangling Categories.

  12. Training on pure samples: Cat vs. Dog jets. Classifier: cat jets vs. dog jets, with binary labels (1 vs. 0).

  13. Training on mixed samples: Cat vs. Dog jets. Classifier: cat-enriched jets vs. dog-enriched jets, with binary labels (1 vs. 0). This defines an equivalent classifier to the pure case!

  14. Classification without labels (CWoLa). Mixed samples: $p_{M_1}(\mathbf{x}) = f_1\, p_{\text{cat}}(\mathbf{x}) + (1 - f_1)\, p_{\text{dog}}(\mathbf{x})$ and $p_{M_2}(\mathbf{x}) = f_2\, p_{\text{cat}}(\mathbf{x}) + (1 - f_2)\, p_{\text{dog}}(\mathbf{x})$. Likelihood ratios: $L_{\text{cat/dog}}(\mathbf{x}) = \frac{p_{\text{cat}}(\mathbf{x})}{p_{\text{dog}}(\mathbf{x})}$ and $L_{M_1/M_2}(\mathbf{x}) = \frac{p_{M_1}(\mathbf{x})}{p_{M_2}(\mathbf{x})} = \frac{f_1\, L_{\text{cat/dog}}(\mathbf{x}) + (1 - f_1)}{f_2\, L_{\text{cat/dog}}(\mathbf{x}) + (1 - f_2)}$. The optimal mixed-sample classifier is a monotonic rescaling of the optimal cat vs. dog classifier. Hence they define equivalent classifiers. [EMM, B. Nachman, J. Thaler, 1708.02949] [P.T. Komiske, EMM, B. Nachman, M.D. Schwartz, 1801.10158] See also [L. Dery, B. Nachman, F. Rubbo, A. Schwartzman, 1702.00414] [T. Cohen, M. Freytsis, B. Ostdiek, 1706.09451].
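
The monotonicity claim on this slide can be checked in one line. The sketch below assumes the two mixtures have cat fractions $f_1 > f_2$ and writes $L$ for the pure cat vs. dog likelihood ratio; these symbol names are assumptions chosen for illustration.

```latex
\[
L_{M_1/M_2} = \frac{f_1 L + (1 - f_1)}{f_2 L + (1 - f_2)},
\qquad
\frac{\partial L_{M_1/M_2}}{\partial L}
  = \frac{f_1 - f_2}{\bigl(f_2 L + (1 - f_2)\bigr)^{2}} > 0
\quad \text{for } f_1 > f_2 .
\]
```

Since the derivative is positive for every $L \ge 0$, any threshold on the mixed-sample ratio corresponds to exactly one threshold on the pure ratio, which is why the two classifiers are equivalent.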

  15. Training on pure samples: Quark vs. Gluon jets. Classifier: quark jets vs. gluon jets, with binary labels (1 vs. 0).

  16. Training on mixed samples: Quark vs. Gluon jets. Classifier: quark-enriched jets (Z + jet) vs. gluon-enriched jets (dijets), with binary labels (1 vs. 0). This defines an equivalent classifier to the pure case!

  17. Performance with expert observables. Compare 80-20% mixtures to pure samples, and vary the mixture purity. Can train on mixed samples! Works for very impure mixtures! [EMM, B. Nachman, J. Thaler, 1708.02949] Also works for convolutional neural networks and jet images. [P.T. Komiske, EMM, B. Nachman, M.D. Schwartz, 1801.10158]
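
The comparison of 80-20% mixtures with pure samples can be tried on toy data. The sketch below is not the setup of the cited papers: it assumes two-dimensional Gaussian "quark" and "gluon" features, a hand-written logistic regression, and hypothetical helper names (`sample_jets`, `fit_logistic`, `auc`, `train`).

```python
import numpy as np

rng = np.random.default_rng(0)
MU = np.array([1.0, 0.5])  # toy assumption: quarks centred at -MU, gluons at +MU

def sample_jets(n, quark_fraction):
    """Draw n toy jets; each is a quark with probability quark_fraction."""
    is_quark = rng.random(n) < quark_fraction
    centres = np.where(is_quark[:, None], -MU, MU)
    return centres + rng.normal(size=(n, 2)), is_quark

def fit_logistic(X, y, steps=500, lr=0.5):
    """Fit sigmoid(X @ w + b) to labels y with plain gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def auc(scores, labels):
    """Area under the ROC curve from the rank-sum (Mann-Whitney) statistic."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def train(f1, f2, n=20000):
    """Label sample 1 (quark fraction f1) as 1 and sample 2 (fraction f2) as 0."""
    X1, _ = sample_jets(n, f1)
    X2, _ = sample_jets(n, f2)
    X = np.vstack([X1, X2])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return fit_logistic(X, y)

# Evaluate both classifiers on the same test set with true quark/gluon labels.
X_test, is_quark = sample_jets(20000, 0.5)
w_pure, b_pure = train(1.0, 0.0)    # fully supervised reference
w_mixed, b_mixed = train(0.8, 0.2)  # mixed samples, no per-jet labels
auc_pure = auc(X_test @ w_pure + b_pure, is_quark)
auc_mixed = auc(X_test @ w_mixed + b_mixed, is_quark)
print(f"AUC pure: {auc_pure:.3f}, AUC mixed: {auc_mixed:.3f}")
```

The mixed-sample classifier never sees a per-jet truth label during training; the truth labels are used only to score it afterwards. Changing `train(0.8, 0.2)` to purities closer to each other is one way to explore the "very impure mixtures" regime in this toy.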

  18. Outline. Classification at Colliders: classifying jets based on their originating particles. Training on Data: weak supervision with mixed jet samples. Disentangling Categories.

  20. What do we even mean by quark and gluon jets? Quarks are color triplets. Gluons are color octets. Hadrons in jets are color singlets. There is no unambiguous definition of quark and gluon jets, only various definitions of increasing verbosity. [P. Gras, et al., 1704.03878] We obtained a quark vs. gluon jet classifier without a definition… Operational, data-driven definition of quark and gluon jets.

  21. Topic Modeling and Blind Source Separation. [Image: D. Blei] [Image: J. Bobin]

  22. Disentangling Categories. Let's model cats and dogs as random animal noise producers. Cat noises: Meow, Purr, Growl. Dog noises: Woof, Howl, Growl.

  23. Disentangling Categories. Listen to the animal noises from two different pet stores. Noises: Meow, Purr, Growl, Woof, Howl. $\frac{N^{\text{Store A}}_{\text{"Meow"}}}{N^{\text{Store B}}_{\text{"Meow"}}} = \frac{f^{\text{Store A}}_{\text{Cat}}}{f^{\text{Store B}}_{\text{Cat}}}$ and $\frac{N^{\text{Store A}}_{\text{"Bark"}}}{N^{\text{Store B}}_{\text{"Bark"}}} = \frac{1 - f^{\text{Store A}}_{\text{Cat}}}{1 - f^{\text{Store B}}_{\text{Cat}}}$.

  24. Disentangling Categories. Disentangle cat and dog vocabularies from the animal noises at pet stores. Noises: Meow, Purr, Growl, Woof, Howl. Pure cat and dog noise "phase space" is key. $\frac{N^{\text{Store A}}_{\text{"Meow"}}}{N^{\text{Store B}}_{\text{"Meow"}}} = \frac{f^{\text{Store A}}_{\text{Cat}}}{f^{\text{Store B}}_{\text{Cat}}}$ and $\frac{N^{\text{Store A}}_{\text{"Bark"}}}{N^{\text{Store B}}_{\text{"Bark"}}} = \frac{1 - f^{\text{Store A}}_{\text{Cat}}}{1 - f^{\text{Store B}}_{\text{Cat}}}$.

  25. Disentangling Categories. An operational definition of quark and gluon jets. [EMM, J. Thaler, 1802.00008] [P.T. Komiske, EMM, J. Thaler, 1809.01140] Reducibility factors: $\kappa_{AB} \equiv \min_{\mathbf{x}} \frac{p_A(\mathbf{x})}{p_B(\mathbf{x})} = \frac{1 - f_A^q}{1 - f_B^q}$ and $\kappa_{BA} \equiv \min_{\mathbf{x}} \frac{p_B(\mathbf{x})}{p_A(\mathbf{x})} = \frac{f_B^q}{f_A^q}$. [Plot axis: number of particles in the jet.]

  26. Disentangling Categories. An operational definition of quark and gluon jets. [EMM, J. Thaler, 1802.00008] [P.T. Komiske, EMM, J. Thaler, 1809.01140] Reducibility factors: $\kappa_{AB} \equiv \min_{\mathbf{x}} \frac{p_A(\mathbf{x})}{p_B(\mathbf{x})} = \frac{1 - f_A^q}{1 - f_B^q}$ and $\kappa_{BA} \equiv \min_{\mathbf{x}} \frac{p_B(\mathbf{x})}{p_A(\mathbf{x})} = \frac{f_B^q}{f_A^q}$. [Plot axis: number of particles in the jet.] With reducibility factors $\kappa_{AB}$ and $\kappa_{BA}$, solve for the quark and gluon distributions: $p_{\text{quark}}(\mathbf{x}) = \frac{p_A(\mathbf{x}) - \kappa_{AB}\, p_B(\mathbf{x})}{1 - \kappa_{AB}}$ and $p_{\text{gluon}}(\mathbf{x}) = \frac{p_B(\mathbf{x}) - \kappa_{BA}\, p_A(\mathbf{x})}{1 - \kappa_{BA}}$. Can also use machine learning to determine the feature space.
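
The extraction on this slide can be illustrated with a small binned example. This is a minimal sketch under stated assumptions: the four-bin `p_quark` and `p_gluon` histograms and the fractions `f_A`, `f_B` are made-up toy inputs, chosen so that each category has a bin the other never populates, and `reducibility` is a hypothetical helper name.

```python
import numpy as np

# Toy binned distributions (assumed): bin 0 is pure quark, bin 3 is pure gluon.
p_quark = np.array([0.5, 0.3, 0.2, 0.0])
p_gluon = np.array([0.0, 0.2, 0.3, 0.5])

# Two mixed samples; A is the quark-enriched one. In practice only p_A and p_B
# would be observed, not the fractions or the pure distributions.
f_A, f_B = 0.8, 0.3
p_A = f_A * p_quark + (1 - f_A) * p_gluon
p_B = f_B * p_quark + (1 - f_B) * p_gluon

def reducibility(p_num, p_den):
    """Minimum of p_num / p_den over bins where the denominator is non-zero."""
    mask = p_den > 0
    return np.min(p_num[mask] / p_den[mask])

kappa_AB = reducibility(p_A, p_B)
kappa_BA = reducibility(p_B, p_A)

# Subtract the maximal amount of the other sample and renormalise.
topic_quark = (p_A - kappa_AB * p_B) / (1 - kappa_AB)
topic_gluon = (p_B - kappa_BA * p_A) / (1 - kappa_BA)

print("kappa_AB =", kappa_AB, "kappa_BA =", kappa_BA)
print("quark topic:", topic_quark)
print("gluon topic:", topic_gluon)
```

With exact probabilities the extracted topics coincide with the toy inputs. With finite data the minimum of a histogram ratio is noisy in sparsely populated bins, so a real analysis would need some care in how the minimum is estimated.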

  27. Collider data as mixtures of jet types. Theoretical and experimental definition of jet categories. Theoretically tractable: calculate reducibility factors from perturbative QCD for certain observables. Can use the fractions to calibrate ROC curves. Allows for any observable distributions to be extracted for quark and gluon jets separately. See extra slides for more.
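
The fractions mentioned on this slide follow from the two reducibility factors by simple algebra. The worked step below is a sketch that assumes sample A is the quark-enriched one and that the relations $\kappa_{AB} = (1 - f_A)/(1 - f_B)$ and $\kappa_{BA} = f_B / f_A$ hold, where $f_A$ and $f_B$ are the quark fractions of the two samples.

```latex
\[
\kappa_{AB} = \frac{1 - f_A}{1 - f_B}, \quad
\kappa_{BA} = \frac{f_B}{f_A}
\;\Longrightarrow\;
f_A = \frac{1 - \kappa_{AB}}{1 - \kappa_{AB}\,\kappa_{BA}}, \quad
f_B = \frac{\kappa_{BA}\,(1 - \kappa_{AB})}{1 - \kappa_{AB}\,\kappa_{BA}} .
\]
```

As a check with made-up numbers, $f_A = 0.8$ and $f_B = 0.3$ give $\kappa_{AB} = 2/7$ and $\kappa_{BA} = 3/8$, and inserting these into the right-hand side returns $f_A = 0.8$ and $f_B = 0.3$.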

  28. Summary. Classification at Colliders: classifying jets based on their originating particles. Training on Data: weak supervision with mixed jet samples. Disentangling Categories: topic modeling to define data-driven jet categories.

  29. The End. Thank you!

  30. Extra Slides
