When Neurons Fail
El Mahdi El Mhamdi, Rachid Guerraoui
BDA, Chicago, July 25th, 2016
Table of Contents
1. Motivations
2. Problem statement
3. Results
Motivations: Universality
NNs are everywhere.
Motivations: Universality
Model
[Figure: Feed-forward neural network. Nodes: neurons; links: synapses.]
Motivations: Universality
Model

$$F_{\mathrm{neu}}(x) = \sum_{i=1}^{N_L} w_i^{(L+1)} y_i^{(L)}$$

with $y_j^{(l)} = \varphi\big(s_j^{(l)}\big)$ for $1 \le l \le L$, $y_j^{(0)} = x_j$, and $s_j^{(l)} = \sum_{i=1}^{N_{l-1}} w_{ji}^{(l)} y_i^{(l-1)}$.
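As a minimal sketch (not from the slides), the forward pass above fits in a few lines of NumPy; the layer sizes, random weights, and sigmoid activation are illustrative assumptions.

```python
import numpy as np

def phi(s):
    """Sigmoid activation (an assumed choice for the slides' phi)."""
    return 1.0 / (1.0 + np.exp(-s))

def forward(x, weights, w_out):
    """Compute F_neu(x) for a feed-forward network.

    weights: list of L matrices, weights[l] has shape (N_l, N_{l-1})
    w_out:   output weight vector of length N_L
    """
    y = x
    for W in weights:          # y^(l) = phi(W^(l) y^(l-1))
        y = phi(W @ y)
    return w_out @ y           # F_neu(x) = sum_i w_i^(L+1) y_i^(L)

# Tiny example with random weights (illustrative only)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(4, 4))]
w_out = rng.normal(size=4)
print(forward(np.array([0.5, -1.0, 2.0]), weights, w_out))
```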
Motivations: Scalability
Software-simulated NNs
Motivations: Scalability
Hardware-based NNs: SyNAPSE (DARPA, IBM), Human Brain Project (SP9 on neuromorphic computing), Brains in Silicon at Stanford...
Motivations: Fault tolerance
How robust is this?
Crash failure: a component stops working.
Motivations: Fault tolerance
How robust is this?
Byzantine failure: a component sends arbitrary values.
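To make the two failure models concrete, here is a hedged sketch (mine, not the authors') that injects failures into one layer's outputs: a crashed neuron outputs 0, while a Byzantine neuron outputs an arbitrary value, bounded here by an assumed synaptic capacity C.

```python
import numpy as np

def inject_failures(y, crashed, byzantine, C, rng):
    """Apply the two failure models to a layer's output vector y.

    crashed:   indices of neurons that stop working (output 0)
    byzantine: indices of neurons that send arbitrary values in [-C, C]
    """
    y = y.copy()
    y[crashed] = 0.0                                   # crash failure
    y[byzantine] = rng.uniform(-C, C, len(byzantine))  # Byzantine failure
    return y

rng = np.random.default_rng(1)
y = np.array([0.2, 0.8, 0.5, 0.9])
print(inject_failures(y, crashed=[0], byzantine=[2], C=1.0, rng=rng))
```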
Motivations: Fault tolerance
Biological plausibility: examples of extreme robustness in nature [1]

[1] Feuillet et al., 2007. Brain of a white-collar worker. Lancet, 370(9583), p.262.
Motivations: Experimental observations
Classical training leads to non-robust NNs.
E: difference between desired and actual outputs on a training set.

$$\Delta w_{ij}^{(l)} = -\frac{dE}{dw_{ij}^{(l)}}$$

Robust weight distributions exist; the goal is to reach them with learning!
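For concreteness, the update rule above is plain gradient descent on E; a one-weight numerical sketch, with a learning rate eta added as an assumption:

```python
# Gradient step on a single weight w for E(w) = (target - w*x)^2
x, target, w, eta = 1.5, 2.0, 0.1, 0.05
for _ in range(100):
    dE_dw = -2 * x * (target - w * x)  # dE/dw
    w -= eta * dE_dw                   # Delta w = -eta * dE/dw
print(w)  # converges toward target / x
```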
Motivations: Solution
Dropout: randomly switch neurons off during the training phase.
Kerlirzin and Vallet (1991, 1993), Hinton et al. (2012, 2014).

Minimize $E_{av} = \sum_D E_D\, P(D)$ where $P(D) = (1-p)^{|D|}\, p^{\,N-|D|}$.
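A minimal dropout sketch, assuming p in P(D) above is the keep probability (consistent with the formula if D is the set of dropped neurons, each dropped with probability 1 - p):

```python
import numpy as np

def dropout_forward(y, p_keep, rng, training=True):
    """Randomly switch neurons off during the training phase.

    Each neuron is kept with probability p_keep (dropped with 1 - p_keep);
    at test time the full network is used with outputs scaled by p_keep.
    """
    if training:
        mask = rng.random(y.shape) < p_keep   # independent Bernoulli mask
        return y * mask
    return y * p_keep

rng = np.random.default_rng(2)
y = np.array([0.2, 0.8, 0.5, 0.9])
print(dropout_forward(y, p_keep=0.5, rng=rng))
```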
Motivations: Lack of theory
[Figure: Experimentally observed robustness (from Kerlirzin 1993, edited).]
Over-provisioning: upper bound?
Table of Contents
1. Motivations
2. Problem statement
3. Results
Problem statement
Given a precision $\epsilon$, derive a tight bound on failures to keep $\epsilon$-precision for any neural network approximating a function $F$.
Note: learning is taken for granted.
Problem statement: Theoretical background, universality
Theorem (Cybenko 1989, Hornik 1991): $\forall (F, \epsilon)$, $\exists$ a NN generating $F_{\mathrm{neu}}$ s.t. $\|F_{\mathrm{neu}} - F\| < \epsilon$.
Problem statement
Minimal networks are not robust (not to mention: impossible to derive).
Given an over-provision $\epsilon'$ ($\epsilon' < \epsilon$), what condition on failures preserves $\epsilon$-precision?
Table of Contents
1. Motivations
2. Problem statement
3. Results
Results: Single layer, crash failures

$$f \le \frac{\epsilon - \epsilon'}{w_m}$$

More over-provision → more robustness.
Unequal weight distribution → single point of failure.
No Byzantine FT → need for bounded synaptic capacity.
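Read as a condition on the number f of crashed neurons, the bound (as reconstructed above) can be checked numerically; the values below are made up for illustration.

```python
# Crash tolerance of a single layer: f <= (eps - eps_prime) / w_m
eps, eps_prime, w_m = 0.10, 0.02, 0.01   # made-up values
f_max = (eps - eps_prime) / w_m
print(f_max)  # up to 8 crashed neurons tolerated with these numbers
```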
Results: General case
Multilayer networks, Byzantine failures.
A failure at layer $l$ propagates through layers $l' > l$ (Byzantine and crash).
Factors: weights, number of layers, number of neurons, Lipschitz coefficient of $\varphi$.
The total error propagated to the output should stay within $\epsilon - \epsilon'$.
Results: General case
Multilayer networks, Byzantine failures.
Bounded channel capacity $C$ (otherwise no robustness to Byzantine failures):

$$\text{Propagated error} \le C \sum_{l=1}^{L} \left( f_l\, K^{L-l}\, w_m^{(L+1)} \prod_{l'=l+1}^{L} (N_{l'} - f_{l'})\, w_m^{(l')} \right)$$

$C$: capacity; $K$: Lipschitz coefficient of $\varphi$; $w_m^{(l)}$: maximal weight to layer $l$; $N_l$: number of neurons in layer $l$; $f_l$: number of failures in layer $l$.
Results: General case
How to read the formula

$$\underbrace{\epsilon - \epsilon'}_{\text{error margin permitted by the over-provision}} \;\ge\; \underbrace{C \sum_{l=1}^{L} \left( f_l\, K^{L-l}\, w_m^{(L+1)} \prod_{l'=l+1}^{L} (N_{l'} - f_{l'})\, w_m^{(l')} \right)}_{\text{worst-case propagated error}}$$

An error (at most $C$ is transmitted) at the $f_l$ failed neurons of layer $l$ propagates through layers $l' > l$; only the $(N_{l'} - f_{l'})$ correct neurons propagate it, each multiplying it by $K w_m^{(l')}$.
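A hedged sketch of the bound as code: this is my transcription of the formula above, evaluated on made-up network parameters.

```python
def propagated_error_bound(C, K, w_m, N, f):
    """Worst-case propagated error:
    C * sum_l f_l * K^(L-l) * w_m[L+1] * prod_{l'=l+1..L} (N[l'] - f[l']) * w_m[l']

    Lists are 1-indexed (index 0 unused): N[l] neurons in layer l,
    f[l] failures in layer l, w_m[l] maximal weight to layer l,
    and w_m[L+1] the maximal output weight.
    """
    L = len(N) - 1
    total = 0.0
    for l in range(1, L + 1):
        term = f[l] * K ** (L - l) * w_m[L + 1]
        for lp in range(l + 1, L + 1):
            term *= (N[lp] - f[lp]) * w_m[lp]
        total += term
    return C * total

# Illustrative parameters: 2 hidden layers of 100 neurons, 1 failure each
C, K = 1.0, 1.0
N = [None, 100, 100]          # N[1], N[2]
f = [None, 1, 1]              # f[1], f[2]
w_m = [None, 0.1, 0.1, 0.1]   # w_m[1..3]; w_m[3] = output weights
err = propagated_error_bound(C, K, w_m, N, f)
print(err, "tolerated if <=", 0.10 - 0.02)  # compare against eps - eps'
```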
Results: General case
Unbounded capacity: taking $C \to \infty$ in

$$\epsilon - \epsilon' \ge C \sum_{l=1}^{L} \left( f_l\, K^{L-l}\, w_m^{(L+1)} \prod_{l'=l+1}^{L} (N_{l'} - f_{l'})\, w_m^{(l')} \right)$$

forces $f_l = 0$ for every $l$: no Byzantine FT.
Results: Applications
Generalization to synaptic failures.
Applications of the bound: memory cost, neuron duplication, synchrony.
Other neural computing models.
Questions?
More details: https://infoscience.epfl.ch/record/217561