In practice OPTIMIZATION ERROR: TRAINING / VALIDATION / TEST. Training set: used to train the classifier. Validation set: used to monitor performance in real time and check for overfitting. Test set: used to evaluate the final performance of the classifier.
NO CHEATING! NEVER USE YOUR TRAINING DATA TO VALIDATE OR TEST YOUR ALGORITHM!
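As a concrete illustration, here is a minimal sketch of such a three-way split with scikit-learn; the 60/20/20 proportions and the synthetic dataset are arbitrary choices, not part of the original slides.

```python
# Minimal sketch of a three-way split (train / validation / test) with scikit-learn.
# The 60/20/20 proportions are an arbitrary illustrative choice.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# First carve out the test set, then split the remainder into train and validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600, 200, 200
```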
The algorithm used to minimize the error is called OPTIMIZATION. THERE ARE SEVERAL OPTIMIZATION TECHNIQUES, AND THEY DEPEND ON THE MACHINE LEARNING ALGORITHM. NEURAL NETWORKS USE GRADIENT DESCENT, AS WE WILL SEE LATER: at each epoch, the weights to be learned are updated as W_{t+1} = W_t - λ ∇f(W_t), where λ is the learning rate.
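A minimal NumPy sketch of this update rule, assuming a toy quadratic loss so that the gradient can be written by hand; the loss, starting point and learning rate are illustrative only.

```python
import numpy as np

# Sketch of the gradient-descent update W_{t+1} = W_t - lambda * grad f(W_t),
# illustrated on a toy quadratic loss f(W) = ||W - W*||^2 (purely illustrative choice).
w_true = np.array([2.0, -1.0])   # weights the toy loss is centered on
w = np.zeros(2)                  # initial weights
learning_rate = 0.1              # lambda in the slide's notation

def grad_f(w):
    # Gradient of f(W) = ||W - W*||^2 is 2 (W - W*)
    return 2.0 * (w - w_true)

for epoch in range(100):         # one update per epoch in this toy setting
    w = w - learning_rate * grad_f(w)

print(w)  # approaches w_true as the iterations proceed
```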
f_W(x): the differences between methods lie in the function that is used: CARTS (decision trees), RANDOM FORESTS, SUPPORT VECTOR MACHINES (kernel algorithms), ARTIFICIAL NEURAL NETWORKS (DEEP LEARNING).
HOW TO CHOOSE YOUR CLASSICAL CLASSIFIER? NO RULE OF THUMB - REALLY DEPENDS ON THE APPLICATION

CARTS / RANDOM FOREST
++ : easy to interpret ("white box"), little data preparation, handles both numerical and categorical data, fast
-- : over-complex trees, unstable, biased trees if some classes dominate
Python: sklearn.ensemble.RandomForestClassifier, sklearn.ensemble.RandomForestRegressor

SVM
++ : the kernel trick allows non-linear problems
-- : not very well suited to multi-class problems
Python: sklearn.svm.SVC

NEURAL NETWORK (NN)
++ : seed of deep learning, very efficient with large amounts of data, as we will see
-- : more difficult to interpret, computing intensive
Python: sklearn.neural_network.MLPClassifier, sklearn.neural_network.MLPRegressor
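A minimal sketch fitting the three scikit-learn estimators from the table on a synthetic dataset; the dataset and hyperparameters are arbitrary illustrative choices, not recommendations.

```python
# Fit the three sklearn estimators from the table on a toy dataset and compare accuracies.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm (rbf kernel)": SVC(kernel="rbf"),
    "neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))  # accuracy on the held-out test set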
THE CHOICE CAN DEPEND ON YOUR MAIN INTEREST.
IT IS ALSO INFLUENCED BY “MAINSTREAM” TRENDS.
PART II: A FOCUS ON “SHALLOW” NEURAL NETWORKS
THE NEURON: INSPIRED BY NEUROSCIENCE? Credit: Karpathy
Mark I Perceptron: FIRST IMPLEMENTATION OF A NEURAL NETWORK [Rosenblatt, 1957!]. INTENDED TO BE A MACHINE (NOT AN ALGORITHM): it had an array of 400 photocells, randomly connected to the "neurons". Weights were encoded in potentiometers, and weight updates during learning were performed by electric motors.
TODAY'S ARTIFICIAL NEURON: pre-activation z(x) = W · x + b; output f(x) = g(W · x + b), with input x, weight vector W, bias b and activation function g.
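A minimal NumPy sketch of a single artificial neuron; the sigmoid is just one possible choice for the activation function g, and the numerical values are made up for illustration.

```python
import numpy as np

# One artificial neuron: pre-activation z(x) = W·x + b, output f(x) = g(z(x)).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # input vector
W = np.array([0.1, 0.4, -0.2])   # weights (illustrative values)
b = 0.3                          # bias

z = W @ x + b                    # pre-activation
output = sigmoid(z)              # activation
print(z, output)
```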
LAYER OF NEURONS: f(x) = g(W x + b). SAME IDEA, except that now W becomes a matrix and b a vector.
HIDDEN LAYERS OF NEURONS. FIRST LAYER, applied to the INPUT: z_h(x) = W_h x + b_h
ACTIVATION FUNCTION OF THE HIDDEN LAYER: h(x) = g(z_h(x)) = g(W_h x + b_h)
OUTPUT LAYER: z_o(x) = W_o h(x) + b_o
PREDICTION: f(x) = softmax(z_o(x))
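Putting the four steps together, a minimal NumPy sketch of the forward pass of such a one-hidden-layer network; the layer sizes, the random weights and the choice of ReLU for g are illustrative assumptions.

```python
import numpy as np

# Forward pass of a one-hidden-layer ("shallow") network, following the slides:
#   z_h(x) = W_h x + b_h       (first layer)
#   h(x)   = g(z_h(x))         (hidden activation, ReLU chosen here as g)
#   z_o(x) = W_o h(x) + b_o    (output layer)
#   f(x)   = softmax(z_o(x))   (prediction)
rng = np.random.default_rng(0)

n_in, n_hidden, n_classes = 4, 8, 3
W_h, b_h = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
W_o, b_o = rng.normal(size=(n_classes, n_hidden)), np.zeros(n_classes)

def softmax(z):
    e = np.exp(z - z.max())      # subtract max for numerical stability
    return e / e.sum()

x = rng.normal(size=n_in)        # one input example
z_h = W_h @ x + b_h              # first layer pre-activation
h = np.maximum(0.0, z_h)         # ReLU activation
z_o = W_o @ h + b_o              # output pre-activation
f = softmax(z_o)                 # class probabilities, sum to 1
print(f, f.sum())
```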
"CLASSICAL" MACHINE LEARNING: f_W(x) = y, the LABEL. A NEURAL NETWORK REPLACES THIS BY A GENERAL NON-LINEAR FUNCTION WITH SOME PARAMETERS W: p = g_3(W_3 g_2(W_2 g_1(W_1 x_0)))
WHY HIDDEN LAYERS? Stacking layers lets the network represent increasingly complex functions. Credit: Karpathy
SO LET'S GO DEEPER AND DEEPER! YES, BUT… IT IS NOT SO STRAIGHTFORWARD: DEEPER MEANS MORE WEIGHTS, A MORE DIFFICULT OPTIMIZATION AND A HIGHER RISK OF OVERFITTING…
LET’S FIRST EXAMINE IN MORE DETAIL HOW SIMPLE “SHALLOW” NETWORKS WORK
ACTIVATION FUNCTIONS? THEY ADD NON-LINEARITIES TO THE PROCESS.
ACTIVATION FUNCTIONS (+ MANY OTHERS!)
Sigmoid: f(x) = 1 / (1 + e^(-x))
Tanh: f(x) = tanh(x)
ReLU: f(x) = max(0, x)
Soft ReLU (softplus): f(x) = log(1 + e^x)
Leaky ReLU: f(x) = εx + (1 - ε) max(0, x)
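These functions are easy to write down directly; a minimal NumPy sketch, where the slope ε of the leaky ReLU is an arbitrary illustrative value.

```python
import numpy as np

# The activation functions listed above, written as plain NumPy functions.
sigmoid    = lambda x: 1.0 / (1.0 + np.exp(-x))
tanh       = lambda x: np.tanh(x)
relu       = lambda x: np.maximum(0.0, x)
soft_relu  = lambda x: np.log1p(np.exp(x))                               # softplus
leaky_relu = lambda x, eps=0.01: eps * x + (1 - eps) * np.maximum(0.0, x)

x = np.linspace(-3, 3, 7)
for name, g in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu),
                ("soft relu", soft_relu), ("leaky relu", leaky_relu)]:
    print(name, np.round(g(x), 3))
```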
WHAT IS THE MEANING OF THE ACTIVATION FUNCTION? Any real function on an interval (a, b) can be approximated by a linear combination of translated and scaled ReLU functions.
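A minimal sketch illustrating this claim numerically: a function on an interval is fitted by least squares with a linear combination of translated ReLUs. The target function and the number of ReLU "knots" are arbitrary choices.

```python
import numpy as np

# Approximate a function on an interval (a, b) with a linear combination of
# translated ReLUs, c_0 + sum_k c_k * max(0, x - t_k), fitted by least squares.
a, b, n_knots = -2.0, 2.0, 20
x = np.linspace(a, b, 400)
target = np.sin(2 * x)                                # function to approximate

knots = np.linspace(a, b, n_knots)
basis = np.maximum(0.0, x[:, None] - knots[None, :])  # one shifted ReLU per knot
basis = np.hstack([np.ones((x.size, 1)), basis])      # constant term for the offset

coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)  # least-squares fit of the weights
approx = basis @ coeffs
print("max abs error:", np.abs(approx - target).max())   # small for enough knots
```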