Machine Learning: Brushing up on tools used for summer student projects
Overview
● Workbooks available online
➢ https://cms-caltech-ml.cern.ch/tree
➢ CERN OpenStack VM
➢ IPython server under Apache 2.4
➢ Access limited to e-group cms-caltech-ml@cern.ch
➢ Authentication via CERN single sign-on
● GPUs available
➢ GeForce GT 610 on pccitevo.cern.ch (Jean-Roch's desktop)
➢ Tesla K40c on felk40.cern.ch (courtesy of Felice Pantaleo, CERN)
➢ Useful only for theano-based code so far
➢ Download a notebook as a Python script and run it locally:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python code.py
matplotlib http://matplotlib.org/
● Python library for graphical representation
● Documentation is a bit scattered around
● Lots of practical examples exist
● Used to visualise the results of k-means clustering and PCA (see the sketch below)
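A minimal plotting sketch on toy data (all names and values below are illustrative, not from the projects):

    import numpy as np
    import matplotlib.pyplot as plt

    # Toy 2D points in three blobs, standing in for real features
    rng = np.random.RandomState(0)
    points = np.vstack([rng.randn(50, 2) + c for c in ([0, 0], [4, 4], [0, 4])])
    labels = np.repeat([0, 1, 2], 50)   # e.g. cluster labels from k-means

    plt.scatter(points[:, 0], points[:, 1], c=labels, s=10)
    plt.xlabel('feature 1')
    plt.ylabel('feature 2')
    plt.title('k-means clusters (toy data)')
    plt.show()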
numpy http://www.numpy.org/
● Python library for scientific computation
● Documentation is a bit opaque
● Lots of practical examples exist
● Syntax can be very cryptic, but very powerful
● ROOT I/O is not well supported in Python libraries
● Used for dataset manipulation (a sketch follows below)
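A minimal sketch of the kind of dataset manipulation meant above (array shapes and names are illustrative):

    import numpy as np

    data = np.random.randn(1000, 5)       # 1000 events, 5 features
    mean = data.mean(axis=0)              # per-feature mean
    std = data.std(axis=0)                # per-feature spread
    normalized = (data - mean) / std      # standardise each feature
    selected = normalized[normalized[:, 0] > 0.]   # boolean-mask event selection
    print(selected.shape)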
h5py http://www.h5py.org/
● Python library for managing (big) datasets
● Supported on multiple platforms
● Strong documentation
● I/O for large datasets (see the sketch below)
● Dataset engine for NADE (see later)
● So far only barely used to its full potential
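A minimal I/O sketch (file and dataset names are hypothetical):

    import numpy as np
    import h5py

    # Write a large feature array with compression
    with h5py.File('events.h5', 'w') as f:
        f.create_dataset('features', data=np.random.randn(100000, 20),
                         compression='gzip')

    # Read back lazily: only the requested slice is loaded into memory
    with h5py.File('events.h5', 'r') as f:
        first_batch = f['features'][:1000]
    print(first_batch.shape)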
scikit-learn http://scikit-learn.org/dev/index.html
● Python library for machine learning
● Well-documented suite of methods
● Lots of examples to get started
● Implements most of the classical methods (PCA, SVM, random forests, …)
● Implements several unsupervised clustering algorithms (k-means, …)
● Used for clustering with k-means, and for PCA (see the sketch below)
● A Self-Organizing Map implementation is in progress (https://github.com/scikit-learn/scikit-learn/pull/2996); much discussion about it is ongoing
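A minimal sketch of the PCA + k-means usage mentioned above, on toy data (the real features come from the project datasets):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    X = np.random.randn(500, 10)                    # 500 samples, 10 features
    reduced = PCA(n_components=2).fit_transform(X)  # project onto 2 components
    kmeans = KMeans(n_clusters=3).fit(reduced)      # cluster in the PCA plane
    print(kmeans.labels_[:10])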
pybrain http://pybrain.org/
● Python library for neural-net training
● Well documented
● Easy to get started on neural-net training (see the sketch below)
● Does not support GPU acceleration
● Used initially to get started with NNs
● Faced performance issues early on
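A minimal pybrain training sketch on a toy XOR dataset (layer sizes and epoch count are illustrative):

    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.datasets import SupervisedDataSet
    from pybrain.supervised.trainers import BackpropTrainer

    net = buildNetwork(2, 4, 1)           # 2 inputs, 4 hidden units, 1 output
    ds = SupervisedDataSet(2, 1)
    for sample, target in [((0, 0), (0,)), ((0, 1), (1,)),
                           ((1, 0), (1,)), ((1, 1), (0,))]:
        ds.addSample(sample, target)

    trainer = BackpropTrainer(net, ds)
    for _ in range(100):                  # CPU only; this loop is the bottleneck
        trainer.train()                   # one epoch of backprop per call
    print(net.activate((0, 1)))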
theano http://deeplearning.net/software/theano/
● Python library for manipulating mathematical expressions
● Very complete library
● Extensive tutorial available
● A bit opaque to use by itself
● Full GPU acceleration support
● A bit like CERNLIB: requires a higher-level software wrapper
● Easy software manipulation comes with a performance hit
● Used for convolutional neural network implementations
● Used as the mathematical engine for higher-level libraries (see the sketch below)
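A minimal sketch of theano's workflow: build a symbolic expression, compile it, evaluate it on data (the expression itself is illustrative):

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.matrix('x')                             # symbolic input matrix
    w = theano.shared(np.ones((3, 2), dtype='float32'), name='w')
    y = T.nnet.sigmoid(T.dot(x, w))               # symbolic expression graph

    f = theano.function([x], y)                   # compiled (GPU if configured)
    print(f(np.random.randn(4, 3).astype('float32')))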
theanets http://theanets.readthedocs.org
● Python library for neural-net training
● Feels like “pybrain done right”
● Easy to get started on neural-net training (see the sketch below)
● Uses theano as its computation engine
● Now used for neural nets in three projects
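A minimal sketch, assuming the theanets Experiment-style API of the time (argument names differ between versions; the data is a toy stand-in):

    import numpy as np
    import theanets

    X = np.random.randn(1000, 10).astype('float32')
    y = np.random.randn(1000, 1).astype('float32')

    # Regression net: 10 inputs, one hidden layer of 20 units, 1 output
    exp = theanets.Experiment(theanets.Regressor, layers=(10, 20, 1))
    exp.train([X, y])     # theano does the heavy lifting under the hood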
rNADE http://www.benignouria.com/en/research/RNADE/ http://arxiv.org/abs/1306.0186
● Python software for the neural autoregressive density estimator
● In touch with the authors: Iain Murray, Hugo Larochelle, Benigno Uria
● Transformed into a Python library for usage inside notebooks
● Being benchmarked for usability in background estimation
● Possible solutions for outlier detection suggested by the authors
Deep Learning
● Kolmogorov's theorem http://www.sciencedirect.com/science/article/pii/0893608092900128
● “Make it deep enough and it will learn anything”
● Provided …
➢ Enough data to fit all the parameters
➢ Enough computing acceleration
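For reference, the superposition theorem behind this claim states that any continuous function of n variables can be built from univariate functions alone (standard statement, added here for context):

    f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

In NN terms: a fixed two-layer composition of one-dimensional nonlinearities suffices in principle, given enough data and computing.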
spearmint https://github.com/HIPS/Spearmint
● Software to perform Bayesian optimization
● Used for neural-net optimization in the UCI Higgs2tautau/razor papers
● Will be used to optimize NN topologies (see the sketch below)
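A minimal sketch of the Spearmint wiring (the objective and parameter names are hypothetical; Spearmint also needs a config.json declaring the variables to optimize):

    def train_and_evaluate(n_hidden, learning_rate):
        # Hypothetical stand-in for the real NN training + validation loop
        return (n_hidden - 32) ** 2 * 1e-4 + (learning_rate - 0.01) ** 2

    def main(job_id, params):
        # Spearmint calls this entry point once per suggested configuration
        n_hidden = int(params['n_hidden'][0])
        learning_rate = float(params['learning_rate'][0])
        return train_and_evaluate(n_hidden, learning_rate)  # value to minimise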
NMS Projects Overview
● Assuming Kolmogorov's theorem: we can learn anything
● NN Tracking
✔ Present the full list of hits
✔ Train on hit/track association
➔ Get track-candidate categorization
➔ Software acceleration
➔ Possibility of optimization with respect to existing tracking
● NN Trigger
✔ Present low-level reconstruction objects
✔ Train on trigger bits
➔ Emulate the trigger selection
➔ Software acceleration; no optimization with respect to the existing trigger table, but could free a good fraction of the timing in the HLT
● NN Calo Id
✔ Present the energy deposition in a jet
✔ Train on generator particle identity
➔ Get the jet “label” (quark, gluon, photon, electron, …)
➔ Has a huge potential impact; room for optimization
Cost Function and Regularisation
● Regularization adds the sum of squared weights to the cost function.
● It is supposed to stabilize training and prevent overfitting.
● The observation was that the error (NN mismatch with the target) was going down as expected, but the cost grew back during training.
● The explanation is that discrete gradient descent goes over a barrier in cost to reach a better fit.
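In formulas (standard L2 regularisation, matching the description above): the cost C adds the squared weights to the plain error E, with strength λ,

    C(\mathbf{w}) \;=\; E(\mathbf{w}) \;+\; \lambda \sum_i w_i^2

so E can keep decreasing while C temporarily grows when the optimizer crosses a barrier in the penalty term.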
NN HLT 1/2
NN HLT 2/2