Deep learning in TMVA


  1. Deep learning in TMVA: Benchmarking TMVA DNN; Integration of a Deep Autoencoder. Marc Huwiler (CERN), August 28, 2017.

  2. Outline
     1. Introduction: About me; ROOT, TMVA and Machine learning
     2. Benchmarking TMVA DNN vs PyKeras: Methodology; Benchmarking implementation; Results
     3. Implementing Autoencoder classes for TMVA: Concepts; Autoencoder integration in TMVA; Results
     4. Further possibilities
     5. Acknowledgements

  3. Introduction / About me. Master in High Energy Physics at EPFL in Lausanne. Hiking, travelling, reading.

  5. Introduction / ROOT, TMVA and Machine learning / ROOT and TMVA.
     ROOT: data analysis framework for HEP, developed mainly at CERN. Written in C++ (can also be run through ROOT's C++ interpreter).
     TMVA: Toolkit for Multivariate Analysis. Includes several machine learning algorithms, such as Likelihood, KNN, Fisher, MLP, SVM, Neural Networks, BDT, etc.

  6. Introduction / ROOT, TMVA and Machine learning / Machine learning. "Machine learning is a type of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes without being explicitly programmed (...) by receiving input data and using statistical analysis to predict an output value within an acceptable range." Source: http://whatis.techtarget.com/definition/machine-learning

  7. Introduction / ROOT, TMVA and Machine learning / Neural networks. $\text{output} = \alpha\left(\vec{\omega} \cdot \vec{x} + b\right)$, with $\alpha$ the activation function.
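
The single-neuron formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the choice of sigmoid and ReLU as activation functions, and the numeric values are assumptions, not taken from the slides.

```python
import math


def neuron_output(weights, inputs, bias, activation):
    """Return activation(weights . inputs + bias) for a single neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


def sigmoid(z):
    """Logistic activation, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear activation, zero for negative arguments."""
    return max(0.0, z)


# Illustrative values only (not from the slides): the weighted sum is exactly 0.
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 0.5]
bias = 0.5

sigmoid_out = neuron_output(weights, inputs, bias, sigmoid)
relu_out = neuron_output(weights, inputs, bias, relu)
```

A layer applies this to many neurons sharing the same inputs, and a deep network chains several such layers.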

  10. Introduction / ROOT, TMVA and Machine learning / Autoencoder. A special kind of neural network. The inputs $\{x_i\}$ are encoded into a lower-dimensional set of variables that contains a compressed representation of the data. They can then be decoded into an output $\{x'_i\}$ of the same dimension as the input, meant to be as close to it as possible.
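
The encode/decode idea can be illustrated with a toy example. The sketch below assumes a purely linear autoencoder (no activation functions) with hand-picked rather than trained weights, mapping 4 inputs to a 2-dimensional code and back; all names and numbers are hypothetical and unrelated to the TMVA classes discussed later in the talk.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def encode(x, enc_weights):
    """Map the input to the lower-dimensional code."""
    return matvec(enc_weights, x)


def decode(code, dec_weights):
    """Map the code back to the input dimension."""
    return matvec(dec_weights, code)


def reconstruction_error(x, x_prime):
    """Mean squared difference between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_prime)) / len(x)


# Hand-picked weights (an assumption, not trained): 4 inputs -> 2 code -> 4 outputs.
ENC = [[0.5, 0.5, 0.0, 0.0],
       [0.0, 0.0, 0.5, 0.5]]
DEC = [[1.0, 0.0],
       [1.0, 0.0],
       [0.0, 1.0],
       [0.0, 1.0]]

x = [1.0, 2.0, 3.0, 4.0]
code = encode(x, ENC)            # 2 values: the compressed representation
x_prime = decode(code, DEC)      # 4 values: same dimension as the input
error = reconstruction_error(x, x_prime)
```

Training an autoencoder amounts to adjusting the encoder and decoder weights so that this reconstruction error becomes as small as possible over the data set.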

  11. Benchmarking TMVA DNN vs PyKeras (section 2 of the outline: Methodology; Benchmarking implementation; Results).

  12. Benchmarking TMVA DNN vs PyKeras / Methodology / Procedure. How the benchmarking is performed:
      - Run the same training on a similar neural network layout in both TMVA DNN and PyKeras.
      - Benchmarks: ROC curve integral, CPU time, real time.
      - Study the benchmarks as a function of any common parameter.
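
As a rough illustration of the difference between the CPU time and real time benchmarks, the sketch below times an arbitrary function with Python's standard `time` module. The `benchmark` helper and the dummy workload are assumptions made for this example; the slides do not say how the macros measure time.

```python
import time


def benchmark(train_fn):
    """Run train_fn once and report CPU time and real (wall-clock) time in seconds."""
    cpu_start = time.process_time()    # CPU time of this process, summed over threads
    real_start = time.perf_counter()   # elapsed wall-clock time
    result = train_fn()
    cpu_time = time.process_time() - cpu_start
    real_time = time.perf_counter() - real_start
    return {"result": result, "cpu_time": cpu_time, "real_time": real_time}


def dummy_training():
    """Stand-in workload (an assumption); a real benchmark would train a network here."""
    total = 0.0
    for i in range(200_000):
        total += i * 1e-6
    return total


timings = benchmark(dummy_training)
```

Because CPU time is summed over all threads of the process, a multi-threaded training can report a CPU time well above its real time.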

  14. Benchmarking TMVA DNN vs PyKeras / Methodology / Common basis between TMVA DNN and PyKeras. Input parameters in common:

      TMVA DNN               PyKeras
      network layout         network layout
      WeightInitialization   initializer
      ErrorStrategy          loss
      LearningRate           lr
      Momentum               momentum
      ConvergenceSteps       TriesEarlyStopping
      DropConfig             Dropout

      Sampling and preprocessing are performed by TMVA (Factory). There are also parameters that have no counterpart in the other framework.
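
One way to keep the two configurations in step is to store the correspondence as a lookup table and derive the PyKeras-side names from a single parameter set. The sketch below is an assumption about how this could be organised, not the author's code: it only renames keys (values are passed through unchanged), it uses the spelling `DropConfig` for the dropout option, and `SomeTmvaOnlyOption` is a made-up name standing for a parameter without a counterpart.

```python
# Name correspondence taken from the table on this slide.
TMVA_TO_PYKERAS = {
    "WeightInitialization": "initializer",
    "ErrorStrategy": "loss",
    "LearningRate": "lr",
    "Momentum": "momentum",
    "ConvergenceSteps": "TriesEarlyStopping",
    "DropConfig": "Dropout",
}


def to_pykeras_names(tmva_params):
    """Split parameters into those with a PyKeras counterpart (renamed) and the rest."""
    shared, tmva_only = {}, {}
    for name, value in tmva_params.items():
        if name in TMVA_TO_PYKERAS:
            shared[TMVA_TO_PYKERAS[name]] = value
        else:
            tmva_only[name] = value
    return shared, tmva_only


# "SomeTmvaOnlyOption" is hypothetical, used only to show the second return value.
example = {"LearningRate": 0.1, "Momentum": 0.0, "ConvergenceSteps": 100,
           "SomeTmvaOnlyOption": 1}
shared, tmva_only = to_pykeras_names(example)
```

In practice the values may also need translating between the two frameworks, which this sketch does not attempt.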

  15. Benchmarking TMVA DNN vs PyKeras / Benchmarking implementation / Benchmarking workflow. [Diagram: Batch script, Input file, TMVA DNN macro, PyKeras macro.] A batch script does the following:
      - Generate a common input file with the given parameters.
      - Feed it into the macros for both TMVA DNN and PyKeras.
      - The macros write the benchmarks into a common file.
      These steps are repeated while changing a given parameter.
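
The workflow above can be sketched as a small driver loop. Everything in this example is an assumption made for illustration: the JSON input format, the file names `input.json` and `benchmarks.csv`, and the `run_macro_stub` function, which stands in for the real TMVA DNN and PyKeras macros and records only the scanned value instead of training anything.

```python
import csv
import json
import os
import tempfile


def write_input_file(path, params):
    """Write the common input file read by both macros."""
    with open(path, "w") as handle:
        json.dump(params, handle)


def run_macro_stub(framework, input_path, results_path, scanned):
    """Hypothetical stand-in for a macro: read the input file, append one row.

    A real macro would train a network and append its benchmarks instead.
    """
    with open(input_path) as handle:
        params = json.load(handle)
    with open(results_path, "a", newline="") as handle:
        csv.writer(handle).writerow([framework, scanned, params[scanned]])


def run_scan(base_params, scanned, values, workdir):
    """Repeat the workflow once per value of the scanned parameter."""
    input_path = os.path.join(workdir, "input.json")
    results_path = os.path.join(workdir, "benchmarks.csv")
    for value in values:
        params = dict(base_params)
        params[scanned] = value
        write_input_file(input_path, params)
        for framework in ("TMVA DNN", "PyKeras"):
            run_macro_stub(framework, input_path, results_path, scanned)
    return results_path


with tempfile.TemporaryDirectory() as workdir:
    path = run_scan({"nLayers": 3, "Batchsize": 128}, "nNeurons", [20, 50, 100], workdir)
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))
```

Each scanned value produces one row per framework in the common results file, so the two frameworks can be compared point by point.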

  16. Benchmarking TMVA DNN vs PyKeras / Benchmarking implementation / Input parameters used.

      Input parameter     Value
      nNeurons            100
      nLayers             3
      Activation          2
      Lastactivation      3
      Initializer         1
      Lossfunction        0
      Transformations     "N,D"
      Factorystring       !V:!Silent:Color:DrawProgressBar:Transformations=I:AnalysisType=Classification
      Learningrate        0.1
      Momentum            0.0
      Batchsize           128
      Convergencesteps    100
      Dropout             0.0
      Ntrainsignal        50000
      Ntrainbackground    50000
      Ntestsignal         100000
      Ntestbackground     100000
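
The Factorystring entry follows TMVA's option-string convention: options are separated by colons, a leading `!` switches a boolean flag off, and `key=value` sets a named option. The parser below is a simplified sketch written for this example (the function name is hypothetical and it is not part of TMVA); it does not handle more complex option values.

```python
def parse_option_string(option_string):
    """Parse a colon-separated option string into a dict.

    "Flag" -> True, "!Flag" -> False, "Key=Value" -> "Value" (kept as a string).
    """
    options = {}
    for token in option_string.split(":"):
        token = token.strip()
        if not token:
            continue
        if "=" in token:
            key, value = token.split("=", 1)
            options[key] = value
        elif token.startswith("!"):
            options[token[1:]] = False
        else:
            options[token] = True
    return options


FACTORY_STRING = ("!V:!Silent:Color:DrawProgressBar:"
                  "Transformations=I:AnalysisType=Classification")
factory_options = parse_option_string(FACTORY_STRING)
```

Read this way, the string above turns verbose and silent modes off, enables colored output and the progress bar, and sets the transformation and analysis type.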

  18. Benchmarking TMVA DNN vs PyKeras / Results / ROC curve. Example of a ROC curve. [Figure: Signal efficiency vs. Background rejection (DNN CPU); x-axis: Signal efficiency (Sensitivity); y-axis: Background rejection (Specificity). nNeurons = 20, Convergence steps = 100, Batch size = 128, Dropout = 0.]
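
For readers unfamiliar with this kind of plot, the sketch below shows one way to build a background rejection vs. signal efficiency curve from classifier scores and integrate it with the trapezoidal rule. It is an illustration under stated assumptions, not TMVA's implementation: an event is selected as signal when its score is strictly above the cut, and the scores are made-up toy values.

```python
def roc_points(signal_scores, background_scores):
    """Return (signal efficiency, background rejection) points, efficiency ascending."""
    cuts = sorted(set(signal_scores) | set(background_scores))
    # Cut below every score: all signal is kept and no background is rejected.
    points = [(1.0, 0.0)]
    for cut in cuts:
        sig_eff = sum(s > cut for s in signal_scores) / len(signal_scores)
        bkg_rej = sum(b <= cut for b in background_scores) / len(background_scores)
        points.append((sig_eff, bkg_rej))
    # Raising the cut lowers the efficiency, so reverse to get ascending x.
    return points[::-1]


def roc_integral(points):
    """Area under the curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy scores (assumptions): perfect separation and complete overlap.
separated = roc_integral(roc_points([0.8, 0.9], [0.1, 0.2]))
overlapping = roc_integral(roc_points([0.3, 0.7], [0.3, 0.7]))
```

An integral of 1 corresponds to perfect separation of signal and background, while 0.5 corresponds to a classifier that cannot tell them apart.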

  19. Benchmarking TMVA DNN vs PyKeras / Results / ROC curve integral. Varying the number of neurons. [Figure: ROC curve integral vs. number of neurons per layer for TMVA DNN, PyKeras Tensorflow SGD, PyKeras Theano SGD, PyKeras Tensorflow Adam and PyKeras Theano Adam. Convergence steps = 100, Batch size = 128, Dropout = 0.]

  20. Benchmarking TMVA DNN vs PyKeras / Results / CPU time. Varying the number of neurons. [Figure: CPU time [s] vs. number of neurons per layer for TMVA DNN, PyKeras Tensorflow SGD, PyKeras Theano SGD, PyKeras Tensorflow Adam and PyKeras Theano Adam. Convergence steps = 100, Batch size = 128, Dropout = 0.]

  21. Benchmarking TMVA DNN vs PyKeras / Results / Real time. Varying the number of neurons. [Figure: real time [s] vs. number of neurons per layer for TMVA DNN, PyKeras Tensorflow SGD, PyKeras Theano SGD, PyKeras Tensorflow Adam and PyKeras Theano Adam. Convergence steps = 100, Batch size = 128, Dropout = 0.]

  22. Benchmarking TMVA DNN vs PyKeras / Results / ROC curve integral. Statistical validity. [Figure: ROC curve integral vs. number of neurons per layer for five TMVA DNN runs (TMVA DNN 1 to 5) and their mean. Convergence steps = 100, Batch size = 128, Dropout = 0.]

  23. Benchmarking TMVA DNN vs PyKeras / Results / CPU time. Statistical validity. [Figure: CPU time [s] vs. number of neurons per layer for five TMVA DNN runs (TMVA DNN 1 to 5) and their mean. Convergence steps = 100, Batch size = 128, Dropout = 0.]

  24. Benchmarking TMVA DNN vs PyKeras / Results / Real time. Statistical validity. [Figure: real time [s] vs. number of neurons per layer for five TMVA DNN runs (TMVA DNN 1 to 5) and their mean. Convergence steps = 100, Batch size = 128, Dropout = 0.]
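
The statistical validity plots compare five repeated trainings with their mean. A minimal sketch of that kind of summary is given below. The numbers are placeholders invented for the example and are not read from the plots, and the sample standard deviation is an addition that the slides do not show.

```python
import statistics


def summarize_runs(values):
    """Return the mean and sample standard deviation of repeated measurements."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, spread


# Placeholder values for five repeated runs (NOT taken from the slides).
placeholder_roc_integrals = [0.80, 0.82, 0.78, 0.81, 0.79]
mean, spread = summarize_runs(placeholder_roc_integrals)
```

A spread that is small compared with the difference between two frameworks suggests that the difference is not just run-to-run fluctuation.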
