Biologically-Inspired Sparse Restricted Boltzmann Machines
Pablo Tostado, Michael Wiest, Alice Yepremyan
Contents
1. Motivation
2. Background
   a. Restricted Boltzmann Machines
   b. Sparsity
3. Methods
   a. Pruning algorithm
   b. Evaluation criteria
4. Results
5. Discussion
6. Future directions
Motivation
Sparsity
In biological systems: [1]
● Reduces computational complexity
● Increases speed
● Yields higher probabilities
● Total energy consumed decreases with increasing sparsity
In artificial systems: [2]
● Increases computational efficiency
● Less prone to overfitting
● Can often lead to better solutions in neural networks (AlexNet and dropout)
Background
Restricted Boltzmann Machines
● Generative, stochastic artificial neural networks
● Fully connected bipartite graphs
● Able to learn probability distributions over their set of inputs
● Building block of deeper neural networks
● Hebbian nature in the learning algorithm
RBM: Structure [3]
RBM: Energy of Network
E(v, h) = -Σ_i a_i v_i - Σ_j b_j h_j - Σ_{i,j} v_i w_ij h_j
P(v, h) = e^{-E(v, h)} / Z,  where Z = Σ_{v, h} e^{-E(v, h)}
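A minimal NumPy sketch of this energy and the conditional activation probabilities it implies (the names W, a, b and the function names are assumptions, not taken from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, W, a, b):
    """E(v, h) for a binary visible vector v, hidden vector h,
    weights W (n_visible x n_hidden), and biases a (visible), b (hidden)."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def p_h_given_v(v, W, b):
    """P(h_j = 1 | v) for every hidden unit j."""
    return sigmoid(b + v @ W)

def p_v_given_h(h, W, a):
    """P(v_i = 1 | h) for every visible unit i."""
    return sigmoid(a + W @ h)
```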
Methods
RBM architecture
● Visible units: 784 (MNIST, 28x28 images) [3]
● Hidden units: 1) 100 nodes, 2) 500 nodes
Our Algorithm
1. Do an initial round of training (1000 epochs).
2. Prune the P* lowest-weight connections (set their weights to zero).
3. Train again (400 epochs).
4. Repeat steps 2 and 3 until the desired amount of pruning is done.
5. Do a final round of training (1000 epochs).
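A minimal sketch of this train/prune/retrain schedule, assuming an `rbm` object exposing a weight matrix `rbm.W` and a contrastive-divergence `fit(data, epochs, weight_mask)` method (all hypothetical names; the slides do not specify the implementation, and P* is treated here as a fraction of the surviving weights):

```python
import numpy as np

def prune_lowest(W, mask, frac):
    """Zero the `frac` fraction of surviving weights with the smallest
    magnitude; returns the updated flat binary mask."""
    alive = np.flatnonzero(mask)
    n_prune = int(frac * alive.size)
    order = np.argsort(np.abs(W.ravel()[alive]))        # smallest |w| first
    mask = mask.copy()
    mask[alive[order[:n_prune]]] = 0
    return mask

def train_with_pruning(rbm, data, frac, n_rounds):
    mask = np.ones(rbm.W.size, dtype=np.uint8)
    rbm.fit(data, epochs=1000)                          # 1. initial training
    for _ in range(n_rounds):                           # 4. repeat steps 2 and 3
        mask = prune_lowest(rbm.W, mask, frac)          # 2. pick lowest weights
        rbm.W *= mask.reshape(rbm.W.shape)              #    and set them to zero
        rbm.fit(data, epochs=400, weight_mask=mask)     # 3. retrain
    rbm.fit(data, epochs=1000, weight_mask=mask)        # 5. final training
    return rbm, mask
```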
Data
● Training: 1000 MNIST images (100 of each digit)
● Testing: 100 MNIST images (10 of each digit)
*Reduced set sizes due to computing time.
Evaluation Criteria for Pruned and Unpruned Networks
● Accuracy* of image reconstruction from:
  1. Noisy image
  2. Occluded image
● Altering the parameters:
  ○ Hidden nodes: 100 and 500
  ○ Percent noisy/occluded: 5, 10, 25, 50%
● Visible nodes represented by hidden nodes
● Pruning over time
Evaluation Criteria: Noise
Example: 20% noise
Evaluation Criteria: Square Occlusion
Example: 20% occlusion
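A sketch of the two corruption models, assuming binary 28x28 images, random pixel flips for noise, and a zeroed square for occlusion (the exact noise model and the occlusion fill value are assumptions):

```python
import numpy as np

rng = np.random.default_rng()

def add_noise(img, frac):
    """Flip a random `frac` fraction of the pixels of a binary image."""
    noisy = img.copy()
    idx = rng.choice(noisy.size, size=int(frac * noisy.size), replace=False)
    noisy.flat[idx] = 1 - noisy.flat[idx]
    return noisy

def occlude(img, frac):
    """Zero a randomly placed square covering roughly `frac` of the image."""
    occluded = img.copy()
    side = int(round(np.sqrt(frac) * img.shape[0]))
    r, c = rng.integers(0, img.shape[0] - side, endpoint=True, size=2)
    occluded[r:r + side, c:c + side] = 0
    return occluded
```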
Image Recovery Example (N steps)
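One possible reading of the N-step recovery loop, reusing the conditional probabilities and `rng` sketched earlier; whether hidden states are sampled or kept as probabilities, and how the known pixels are treated, are assumptions:

```python
def recover(rbm, corrupted, n_steps):
    """Run n_steps of alternating Gibbs updates starting from the
    corrupted image; returns the final visible probabilities."""
    v = corrupted.reshape(-1).astype(float)
    for _ in range(n_steps):
        ph = p_h_given_v(v, rbm.W, rbm.b)
        h = (rng.random(ph.shape) < ph).astype(float)    # sample hidden layer
        v = p_v_given_h(h, rbm.W, rbm.a)                 # mean-field visible update
    return v.reshape(28, 28)
```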
Noise Convergence (20% noise, 25% pruning)
Occlusion Convergence (20% occlusion, 25% pruning)
Image Recovery Scoring
● 10 examples of each digit
● For each image, run 100 iterations of random noise/occlusion
● For each iteration i, compute the recovery score (see the sketch below)
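A sketch of this scoring loop, reusing `recover` from the earlier sketch; `score` is a stand-in for the KL-divergence-based error measure cited later in the deck, and its exact form and the 50-step recovery length are assumptions:

```python
def evaluate(rbm, test_images, corrupt, n_iters=100, n_steps=50):
    """Average recovery score over all test images and corruptions."""
    scores = []
    for img in test_images:                  # 10 examples of each digit
        for _ in range(n_iters):             # 100 random corruptions per image
            recon = recover(rbm, corrupt(img), n_steps)
            scores.append(score(recon, img)) # stand-in error measure
    return float(np.mean(scores))
```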
Results
Pruning preference (100 hidden nodes)
Percentage pruned: 0%, 5%, 10%, 25%, 50%, 80%
Pruning preference (500 hidden nodes)
Percentage pruned: 0%, 5%, 10%, 25%, 50%
100 Hidden Nodes
100 Hidden Nodes
500 Hidden Nodes
500 Hidden Nodes
Discussion: Example of Denoising
Panels: input image, 50% noise, 10% pruning, 0% pruning
Number of Training Epochs
Training Error
Future directions
● Train over more images
● Alternate pruning heuristics (L1 norm)
● Train with more hidden nodes
● Computational efficiency evaluation
● Build a classifier on top of MNIST pixels
Thanks!
Additional Slides
100 Hidden Nodes
100 Hidden Nodes
500 Hidden Nodes
500 Hidden Nodes
Works Cited
[1] B. A. Olshausen and D. J. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:3311–3325, 1997.
[2] S. Changpinyo, M. Sandler, and A. Zhmoginov. The power of sparsity in convolutional neural networks. CoRR, abs/1702.06257, 2017.
[3] C. Nicholson, A. Gibson, and the Skymind team. "A Beginner's Tutorial for Restricted Boltzmann Machines." Deeplearning4j, deeplearning4j.org/restrictedboltzmannmachine.
Measure of Error: KL-Divergence
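For reference, the standard discrete KL divergence between the target distribution P (clean image) and the reconstruction Q, presumably the measure meant here, is:

D_KL(P || Q) = Σ_x P(x) log( P(x) / Q(x) )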