Search for Cosmic Ray Sources Using Deep Learning on Spherical Data
Niklas Langner, Martin Erdmann, Marcus Wirtz
27.09.2019 · niklas.langner@rwth-aachen.de
Motivation
• Cosmic rays from one source experience rigidity-dependent deflection during their passage through the Galactic magnetic field (GMF) and arrive as multiplets.
• Identifying multiplets therefore means identifying sources → a pattern recognition task.
• Use convolutional neural networks (CNNs), as those perform very well in pattern recognition tasks. ✓
• No signal probability of 1.0 in 10 million samples.
Spherical Convolutions
Classical CNNs analyze 2D images → challenge: the skymap of CR data is spherical.
• Approach 1: project the spherical data into 2D images. The network needs to learn the spatial relations; distortions and overlap → data not optimally used.
• Approach 2: use a method of convolution on the sphere and keep the data as a whole in its spherical form → let's try this!
Spherical Convolutions
• Use spherical data in the HEALPix format (pixelization of the sphere into pixels of equal area).
• Module enabling convolutions on HEALPix grids: DeepSphere.
Figure: convolution operation on the HEALPix grid.
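As an illustration of the graph construction (a minimal sketch, not the DeepSphere code): connect every HEALPix pixel to its neighbours and form the normalized graph Laplacian used below. DeepSphere additionally weights the edges by the distance between pixel centres; unit weights and a small `nside` are assumptions made here for brevity.

```python
# Minimal sketch (assumptions: healpy + scipy, unit edge weights, small grid):
# build a graph from a HEALPix grid by connecting every pixel to its
# neighbours, then form the normalized graph Laplacian used for filtering.
import numpy as np
import healpy as hp
import scipy.sparse as sp

nside = 16                       # small grid for illustration; the talk's maps use nside = 64 (49152 pixels)
npix = hp.nside2npix(nside)

# adjacency: 1 where two pixels are HEALPix neighbours (-1 marks a missing neighbour)
neighbours = hp.get_all_neighbours(nside, np.arange(npix))   # shape (8, npix)
rows, cols = [], []
for pix in range(npix):
    for nb in neighbours[:, pix]:
        if nb >= 0:
            rows.append(pix)
            cols.append(nb)
adjacency = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(npix, npix))

# normalized graph Laplacian: L_sym = I - D^(-1/2) A D^(-1/2)
degree = np.asarray(adjacency.sum(axis=1)).ravel()
d_inv_sqrt = sp.diags(1.0 / np.sqrt(degree))
laplacian = sp.identity(npix) - d_inv_sqrt @ adjacency @ d_inv_sqrt
```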
Spherical Convolutions
• Build a graph from the HEALPix map: every pixel is a vertex, neighbouring pixels are connected.
• Graph Laplacian $\mathbf{L}$: matrix describing the connections of the graph. Normalized graph Laplacian $\mathbf{L}_\mathrm{sym} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{T}$ with the eigenvectors $\mathbf{U} = [\mathbf{u}_1, \dots, \mathbf{u}_{N_\mathrm{pix}}]$.
• Filter a graph signal $\mathbf{f} \in \mathbb{R}^{N_\mathrm{pix}}$ by a kernel $h$: $h(\mathbf{L}_\mathrm{sym})\,\mathbf{f} = \mathbf{U}\,\big(h(\boldsymbol{\Lambda})\,\mathbf{U}^{T}\mathbf{f}\big)$.
• Computationally efficient polynomial form: $h(\mathbf{L}_\mathrm{sym})\,\mathbf{f} = \sum_{l=0}^{K} \theta_l\, \mathbf{L}_\mathrm{sym}^{\,l}\,\mathbf{f}$.
Figure: the first 16 eigenvectors of the graph Laplacian, an equivalent of Fourier modes.
Spherical convolutions
$h(\mathbf{L}_\mathrm{sym})\,\mathbf{f} = \sum_{l=0}^{K} \theta_l\, \mathbf{L}_\mathrm{sym}^{\,l}\,\mathbf{f}$, where $\big[\mathbf{L}_\mathrm{sym}^{\,l}\big]_{jk}$ is
• the sum of all weighted paths of length $l$ between the vertices $v_j$ and $v_k$, where each weight is the product of the edge weights along the path,
• non-zero if and only if $v_j$ and $v_k$ are connected by at least one path of length $l$.
Filtering can thus be interpreted as a weighted linear combination of neighbouring pixel values: analogous to the classical setting, but with the weights determined by $\theta$ and $\mathbf{L}_\mathrm{sym}$, with one coefficient per neighbourhood.
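A minimal sketch of this polynomial filtering, assuming the sparse `laplacian` from the graph sketch above: applying the filter only needs repeated sparse matrix–vector products, and the coefficients $\theta_l$ play the role of the learnable kernel weights. DeepSphere uses Chebyshev polynomials; plain monomials are used here to keep the sketch short.

```python
# Sketch of the polynomial filter h(L_sym) f = sum_{l=0}^{K} theta_l L_sym^l f.
# Only repeated sparse matrix-vector products are needed, and each power
# L_sym^l reaches pixels up to l hops away from a given pixel.
import numpy as np

def polynomial_filter(laplacian, signal, theta):
    """Return sum_l theta[l] * (L^l @ signal) without forming L^l explicitly."""
    filtered = np.zeros(len(signal))
    power = np.asarray(signal, dtype=float)        # L^0 @ f
    for coeff in theta:
        filtered += coeff * power
        power = laplacian @ power                  # next power of the Laplacian applied to f
    return filtered

# usage sketch with the `laplacian` built above; K = 5 as preferred by the
# random search, random stand-in coefficients instead of learned ones:
# theta = np.random.randn(6)
# f = np.random.rand(laplacian.shape[0])           # a per-pixel map
# filtered_map = polynomial_filter(laplacian, f, theta)
```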
Simulation: Minimalistic deflection model
Simulate 1000 cosmic rays (all He, $E_\mathrm{min} = 40$ EeV), 5.5% of them originating from one randomly positioned source.
1. Coherent deflection: rotation with rotation angles motivated by typical Galactic magnetic field models (position independent), $\propto 1/R$.
2. Blurring: use 50% of the maximal blurring of the JF12 model, $\sigma = 2.79\,\mathrm{rad}\cdot\mathrm{EeV}/R$.
Figure: simulation pipeline — random source position, random arrival directions; 1. rotation ($\propto 1/R$), 2. blurring ($\propto 1/R$); isotropic background.
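A minimal sketch of this toy data set (not the original simulation code). The energy spectrum, the definition of the rigidity $R = E/Z$, the coherent rotation amplitude, and the tangent-plane approximation of the blurring are assumptions made for illustration.

```python
# Minimal sketch of the toy data set (not the original simulation code).
# Assumptions: falling spectrum above E_min, rigidity R = E/Z in EV for E in EeV,
# a fixed-axis coherent rotation of amplitude ~1/R, and a tangent-plane
# (small-angle) approximation of the blurring.
import numpy as np

rng = np.random.default_rng(42)
n_crs, f_sig = 1000, 0.055
n_sig = int(f_sig * n_crs)

def random_directions(n):
    """Isotropically distributed unit vectors."""
    phi = rng.uniform(0.0, 2.0 * np.pi, n)
    costheta = rng.uniform(-1.0, 1.0, n)
    sintheta = np.sqrt(1.0 - costheta**2)
    return np.stack([sintheta * np.cos(phi), sintheta * np.sin(phi), costheta], axis=-1)

def rotate(vec, axis, angle):
    """Rodrigues rotation of `vec` about the unit vector `axis` by `angle` (rad)."""
    return (vec * np.cos(angle)
            + np.cross(axis, vec) * np.sin(angle)
            + axis * np.dot(axis, vec) * (1.0 - np.cos(angle)))

# helium above E_min = 40 EeV (spectral index assumed), rigidity for Z = 2
energy = 40.0 * (1.0 - rng.uniform(size=n_crs)) ** (-0.5)   # EeV
rigidity = energy / 2.0                                      # EV

# start from an isotropic background, then overwrite the first n_sig events with signal
directions = random_directions(n_crs)
source = random_directions(1)[0]
rot_axis = random_directions(1)[0]           # stand-in for the GMF-motivated rotation axis
sigma = 0.5 * 2.79 / rigidity                # 50% of the maximal JF12 blurring, in rad

for i in range(n_sig):
    # 1. coherent deflection: rigidity-dependent rotation (amplitude assumed)
    coherent = rotate(source, rot_axis, 1.0 / rigidity[i])
    # 2. blurring: random smearing around the coherently deflected direction
    offset = rng.normal(scale=sigma[i], size=3)
    tangent = offset - np.dot(offset, coherent) * coherent
    directions[i] = (coherent + tangent) / np.linalg.norm(coherent + tangent)
```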
Data preparation
Figure: model input.
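A sketch of one possible data-preparation step, under the assumption that the two input channels are a count map and a mean-energy map on an nside = 64 grid (49152 pixels); the actual channel definition in the slide's figure may differ. The `directions` and `energy` arrays can be taken from the simulation sketch above.

```python
# Sketch of one possible data preparation, assuming the two input channels are
# a count map and a mean-energy map on an nside = 64 HEALPix grid (49152 pixels).
import numpy as np
import healpy as hp

nside = 64
npix = hp.nside2npix(nside)                                   # 49152

# placeholder events (reuse `directions` and `energy` from the simulation sketch in practice)
rng = np.random.default_rng(0)
directions = rng.normal(size=(1000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
energy = np.full(1000, 40.0)

pix = hp.vec2pix(nside, directions[:, 0], directions[:, 1], directions[:, 2])
count_map = np.bincount(pix, minlength=npix).astype(float)
energy_sum = np.bincount(pix, weights=energy, minlength=npix)
mean_energy_map = np.divide(energy_sum, count_map,
                            out=np.zeros(npix), where=count_map > 0)

model_input = np.stack([count_map, mean_energy_map], axis=-1)  # (49152, 2): mostly zero pixels
```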
Building a classifier
Input (2 × 49152 pixels) → multiple spherical convolutions → feature maps (64 × 48 pixels) → statistical layer (take $\mu$ and $\sigma^2$ of each feature map; provides invariance to rotation) → vector (128 entries) → multiple dense layers → results: probabilities $p_\mathrm{signal}$ and $p_\mathrm{isotropy}$ of the two cases (after softmax).
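A sketch of the statistical layer and the dense head, assuming a tf.keras implementation; the sizes of the dense layers are assumptions. Reducing every feature map to its mean and variance over all pixels is what makes the following layers insensitive to where on the sphere the pattern lies.

```python
# Sketch of the statistical layer and the classifier head, assuming tf.keras
# (layer sizes of the head are assumptions): each feature map is reduced to
# its mean and variance over all pixels before the dense layers.
import tensorflow as tf

class StatisticalLayer(tf.keras.layers.Layer):
    def call(self, feature_maps):
        # feature_maps: (batch, n_pixels, n_features), e.g. (batch, 48, 64)
        mean = tf.reduce_mean(feature_maps, axis=1)
        var = tf.math.reduce_variance(feature_maps, axis=1)
        return tf.concat([mean, var], axis=-1)       # (batch, 2 * n_features) -> 128 entries

head = tf.keras.Sequential([
    StatisticalLayer(),
    tf.keras.layers.Dense(64, activation="relu"),    # "multiple dense layers" in the talk
    tf.keras.layers.Dense(2, activation="softmax"),  # p_signal, p_isotropy
])
```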
Classifying toy simulation
• Evaluate the model on data sets that are:
  ▪ JF12-based (position-dependent deflection and blurring)
  ▪ all He or mixed charges (15% H, 45% He, 40% CNO)
• Calculate p-values to judge the model capability. p-value: relative amount of isotropic skymaps with $p_\mathrm{signal}$ larger than or equal to the median $p_\mathrm{signal}$ of the signal skymaps.
Figure: skymaps with signal fraction $f_\mathrm{sig}$ and isotropic skymaps are passed through the network; the median $p_\mathrm{signal}$ of the signal maps sets the threshold for the isotropic distribution.
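The p-value definition translates into a few lines, assuming arrays of network outputs for signal and isotropic skymaps are available:

```python
# Sketch of the p-value as defined above: fraction of isotropic skymaps whose
# network output p_signal is at least the median p_signal of the signal skymaps.
import numpy as np

def p_value(p_signal_on_signal_maps, p_signal_on_isotropic_maps):
    threshold = np.median(p_signal_on_signal_maps)
    return np.mean(p_signal_on_isotropic_maps >= threshold)
```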
Building a classifier – random search
Search space: # conv. layers (3 / 4 / 6), # dense layers (1 / 2 / 4), conv. and dense activation (ReLU / sigmoid / tanh / leaky ReLU / softsign), pooling (average / max), K-order (3 / 5 / 10), dropout (0.2 / 0.3 / 0.4), learning rate (1e-5 / 1e-4).

Models with the lowest p-values:
Rank | # Conv. layers | # Dense layers | Conv. act. | Dense act. | Pooling | K | Dropout | Learning rate
1st  | 6 | 1 | ReLU | leaky ReLU | average | 5 | 0.2 | 1e-4
2nd  | 6 | 1 | ReLU | leaky ReLU | average | 5 | 0.3 | 1e-4
3rd  | 6 | 1 | ReLU | ReLU       | average | 5 | 0.4 | 1e-4
4th  | 6 | 1 | ReLU | leaky ReLU | average | 5 | 0.4 | 1e-4
5th  | 6 | 1 | ReLU | ReLU       | average | 5 | 0.3 | 1e-4

Fixed choices: optimizer Adam, L2 regularization 0.1, cross-entropy loss, Chebyshev polynomials.
• Many common parameters between the best models
• Clear preference for K = 5 and for as many convolutional layers as possible
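A sketch of how such a random search can be drawn from the grid in the table; `build_and_train` is a hypothetical stand-in for constructing and training a DeepSphere classifier with the sampled configuration, and the number of trials is an assumption.

```python
# Sketch of the random search over the grid from the table; build_and_train is
# a hypothetical routine that trains a classifier with the sampled configuration
# and returns its p-value on a validation set.
import random

search_space = {
    "n_conv_layers":    [3, 4, 6],
    "n_dense_layers":   [1, 2, 4],
    "conv_activation":  ["relu", "sigmoid", "tanh", "leaky_relu", "softsign"],
    "dense_activation": ["relu", "sigmoid", "tanh", "leaky_relu", "softsign"],
    "pooling":          ["average", "max"],
    "k_order":          [3, 5, 10],
    "dropout":          [0.2, 0.3, 0.4],
    "learning_rate":    [1e-5, 1e-4],
}

trials = []
for _ in range(50):                       # number of trials assumed
    config = {name: random.choice(options) for name, options in search_space.items()}
    # p = build_and_train(config)         # hypothetical training + evaluation
    # trials.append((p, config))
# trials.sort()                           # lowest p-value first = best models
```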
Performance on simulated data
The p-value drops below $5\sigma$ at ~30 signal cosmic rays (3% of 1000 cosmic rays).
Figure: p-value versus number of signal cosmic rays.
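For reference, the quoted significance corresponds to the usual one-sided Gaussian conversion of the p-value (sketch using scipy):

```python
# Sketch of the p-value -> Gaussian-significance conversion behind the "5 sigma"
# statement (one-sided convention assumed).
from scipy.stats import norm

def significance(p):
    return norm.isf(p)      # inverse survival function of the standard normal

# significance(2.87e-7) ~ 5.0, i.e. p-values below ~2.9e-7 correspond to > 5 sigma
```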
Looking into the black box
• Understand why the model makes its decision using layer-wise relevance propagation (LRP).
• Each pixel is given a sensitivity that tells how much it contributed to the signal probability output of the network.
Figure: sensitivity map for a skymap with $f_\mathrm{sig} = 1\%$.
Conclusion and Outlook
• DeepSphere is a useful tool for analyzing skymaps
  ▪ can be used for classification or regression tasks
• On simulated data it is capable of identifying multiplets from one source with a sensitivity of $5\sigma$ for skymaps of 1000 CRs with ~30 of them originating from the source
• Next: simulate a universe with multiple sources and let networks distinguish the simulated universe from isotropy
  Figure: example skymaps of 1000 cosmic rays from 10, 50, and 250 sources
• No optimal usage of the data with DeepSphere: predefined grid with mostly zero pixels → try dynamic graph network methods where the graph is built according to the CRs
Backup
Layer-Wise Relevance Propagation
• Deep neural network: feed-forward graph of neurons
  ▪ $x_k^{(l+1)} = \max\!\left(0,\; \sum_j x_j^{(l)}\, w_{jk}^{(l,l+1)} + b_k^{(l+1)}\right)$
• Use the network output $f(\mathbf{x})$ and a backward pass through the same graph to calculate relevance scores
  ▪ $R_j^{(l)} = \sum_k \frac{z_{jk}}{\sum_{j'} z_{j'k}}\, R_k^{(l+1)}$ with $z_{jk} = x_j^{(l)}\, w_{jk}^{(l,l+1)}$
  ▪ $j$: index of a neuron at layer $l$; $\sum_k$ sums over all upper-layer neurons to which neuron $j$ contributes
  ▪ Conservation property: $\sum_p R_p^{(1)} = f(\mathbf{x})$
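A minimal numpy sketch of the relevance rule above for a tiny ReLU network; the ε term that stabilizes vanishing denominators is an addition not on the slide, and the weights and inputs are random placeholders.

```python
# Minimal numpy sketch of the LRP rule above for a tiny ReLU network.
# The eps stabilization is an addition; weights and inputs are placeholders.
import numpy as np

def lrp_backward(activations, weights, relevance_upper, eps=1e-9):
    """Distribute relevance from layer l+1 back to layer l via z_jk = x_j * w_jk."""
    z = activations[:, None] * weights          # z_jk, shape (n_l, n_{l+1})
    denom = z.sum(axis=0) + eps                 # sum over j' of z_j'k
    return (z / denom) @ relevance_upper        # R_j = sum_k z_jk / denom_k * R_k

# toy forward pass of a two-layer ReLU network
rng = np.random.default_rng(0)
x0 = rng.random(5)
w1, w2 = rng.normal(size=(5, 4)), rng.normal(size=(4, 1))
x1 = np.maximum(0.0, x0 @ w1)
f = (x1 @ w2).item()                            # network output f(x)

# backward pass: start from f(x), propagate relevance down to the input
r2 = np.array([f])
r1 = lrp_backward(x1, w2, r2)
r0 = lrp_backward(x0, w1, r1)
# conservation (up to the eps stabilization): r0.sum() ~ r1.sum() ~ f
```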