Performance Guarantees for Random Fourier Features – Limitations and Merits
Zoltán Szabó
Joint work with Bharath K. Sriperumbudur (PSU)
ML@SITraN, University of Sheffield
June 25, 2015
Context
Given:
\[ k(x, y) = \int_{\mathbb{R}^d} e^{i \omega^T (x - y)} \, d\Lambda(\omega) = \int_{\mathbb{R}^d} \cos\left( \omega^T (x - y) \right) d\Lambda(\omega). \]
$\hat{k}(x, y)$: Monte-Carlo estimator of $k(x, y)$ using $(\omega_j)_{j=1}^m \overset{\text{i.i.d.}}{\sim} \Lambda$ [Rahimi and Recht, 2007].
Motivation:
Primal form – fast linear solvers.
Kernel function approximation: out-of-sample extension.
Online applications.
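The Monte-Carlo estimator on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the Gaussian kernel $k(x, y) = \exp(-\|x - y\|^2 / 2)$, whose spectral measure $\Lambda$ is the standard normal distribution, and the name `rff_kernel_estimate` is hypothetical.

```python
import numpy as np

def rff_kernel_estimate(x, y, omegas):
    """Monte-Carlo estimate (1/m) * sum_j cos(omega_j^T (x - y))."""
    return float(np.mean(np.cos(omegas @ (x - y))))

rng = np.random.default_rng(0)
d, m = 3, 20000

# Assumption: Gaussian kernel with unit bandwidth, so Lambda = N(0, I_d).
omegas = rng.standard_normal((m, d))

x = rng.uniform(-1.0, 1.0, d)
y = rng.uniform(-1.0, 1.0, d)

k_exact = float(np.exp(-0.5 * np.sum((x - y) ** 2)))
k_hat = rff_kernel_estimate(x, y, omegas)
print(k_exact, k_hat)
```

The two printed values should agree up to a Monte-Carlo error of order $1/\sqrt{m}$ at this single pair $(x, y)$; the rest of the talk concerns how that error behaves uniformly over a set $S$.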
Performance measures
Uniform ($r = \infty$):
\[ \|k - \hat{k}\|_S := \sup_{x, y \in S} \left| k(x, y) - \hat{k}(x, y) \right|. \]
$L^r$ ($1 \le r < \infty$):
\[ \|k - \hat{k}\|_{L^r(S)} := \left( \int_S \int_S \left| k(x, y) - \hat{k}(x, y) \right|^r dx \, dy \right)^{1/r}. \]
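Both error measures can be approximated numerically on a grid. The sketch below is only an illustration under stated assumptions: $d = 1$, $S = [-1, 1]$ discretized into 41 points, a Gaussian kernel with $\Lambda = N(0, 1)$, and a Riemann sum in place of the double integral.

```python
import numpy as np

rng = np.random.default_rng(1)
m, r = 2000, 2

omegas = rng.standard_normal(m)        # assumed: Lambda = N(0, 1), d = 1
grid = np.linspace(-1.0, 1.0, 41)      # assumed: S = [-1, 1], discretized
diff = grid[:, None] - grid[None, :]   # all x - y on the grid, shape (41, 41)

k_exact = np.exp(-0.5 * diff ** 2)
k_hat = np.mean(np.cos(diff[..., None] * omegas), axis=-1)

err = np.abs(k_exact - k_hat)
sup_err = err.max()                    # grid version of the uniform norm

cell = (grid[1] - grid[0]) ** 2        # area of one grid cell
lr_err = (np.sum(err ** r) * cell) ** (1.0 / r)  # Riemann sum for the L^r norm
print(sup_err, lr_err)
```

On a bounded $S$ the discretized $L^r$ error is at most the uniform error times a volume factor, which is the sense in which an $L^r$ guarantee can follow from a uniform one.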
Approximation of kernel derivatives
One could also consider $\partial^{p,q} k$.
Motivation [Zhou, 2008, Shi et al., 2010, Rosasco et al., 2010, Rosasco et al., 2013, Ying et al., 2012, Sriperumbudur et al., 2014]:
semi-supervised learning with gradient information,
nonlinear variable selection,
fitting of infinite-dimensional exponential family distributions.
Many of the presented results hold for derivatives ($[p; q] \ne 0$).
Goal
Large deviation inequalities:
\[ \Lambda^m \left( \|k - \hat{k}\|_S \le \epsilon \right) \ge f_1(\epsilon, d, m, |S|), \]
\[ \Lambda^m \left( \|k - \hat{k}\|_{L^r(S)} \le \epsilon \right) \ge f_2(\epsilon, d, m, |S|). \]
Scaling of $|S|$ and $m$ ensuring a.s. convergence?
Existing results on the approximation quality
Notations: $X_n = O_p(r_n)$ ($O_{a.s.}(r_n)$) denotes that $X_n / r_n$ is bounded in probability (almost surely).
[Rahimi and Recht, 2007]:
\[ \left\| \hat{k}(x, y) - k(x, y) \right\|_S = O_p\left( |S| \sqrt{\frac{\log m}{m}} \right). \]
[Sutherland and Schneider, 2015]: better constants.
Contents
Uniform guarantee (empirical process theory).
Two $L^r$ guarantees (uniform consequence, direct).
Kernel derivatives.
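High-level proof
1. Empirical process form:
\[ \|k - \hat{k}\|_S = \sup_{g \in \mathcal{G}} |\Lambda g - \Lambda_m g| = \|\Lambda - \Lambda_m\|_{\mathcal{G}}. \]
2. $\|\Lambda - \Lambda_m\|_{\mathcal{G}}$ concentrates by its bounded difference property:
\[ \|\Lambda - \Lambda_m\|_{\mathcal{G}} \lesssim \mathbb{E}_{\omega_{1:m}} \|\Lambda - \Lambda_m\|_{\mathcal{G}} + \frac{1}{\sqrt{m}}. \]
3. $\mathcal{G}$ is a uniformly bounded, separable Carathéodory family $\Rightarrow$
\[ \mathbb{E}_{\omega_{1:m}} \|\Lambda - \Lambda_m\|_{\mathcal{G}} \lesssim \mathbb{E}_{\omega_{1:m}} \mathcal{R}(\mathcal{G}, \omega_{1:m}). \]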
High-level proof
4. Using Dudley's entropy integral:
\[ \mathcal{R}(\mathcal{G}, \omega_{1:m}) \lesssim \frac{1}{\sqrt{m}} \int_0^{|\mathcal{G}|_{L^2(\Lambda_m)}} \sqrt{\log N(\mathcal{G}, L^2(\Lambda_m), r)} \, dr. \]
5. $\mathcal{G}$ is smoothly parameterized by a compact set $\Rightarrow$
\[ \log N(\mathcal{G}, L^2(\Lambda_m), r) \le \log\left( \frac{C(\omega_{1:m})}{r} + 1 \right) \Rightarrow \mathbb{E}_{\omega_{1:m}} \mathcal{R}(\mathcal{G}, \omega_{1:m}) \lesssim \frac{1}{\sqrt{m}}. \]
6. Putting together:
\[ \|k - \hat{k}\|_S \lesssim \frac{1}{\sqrt{m}} + \frac{1}{\sqrt{m}} = O\left( \sqrt{\frac{\log |S|}{m}} \right). \]
Step-1: empirical process form
Notation:
\[ \Lambda g = \int g(\omega) \, d\Lambda(\omega), \qquad \Lambda_m g = \int g(\omega) \, d\Lambda_m(\omega) = \frac{1}{m} \sum_{j=1}^m g(\omega_j). \]
Reformulation of the objective:
\[ \sup_{x, y \in S} \left| k(x, y) - \hat{k}(x, y) \right| = \sup_{g \in \mathcal{G}} |\Lambda g - \Lambda_m g| =: \|\Lambda - \Lambda_m\|_{\mathcal{G}}, \]
where
\[ \mathcal{G} = \{ g_z : z \in S_\Delta \}, \qquad S_\Delta = S - S = \{ x - y : x, y \in S \}, \qquad g_z : \omega \mapsto \cos\left( \omega^T z \right). \]
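The reformulation can be checked numerically on a finite stand-in for $S$. The sketch below assumes a Gaussian kernel, for which $\Lambda g_z = \exp(-\|z\|^2 / 2)$ with $\Lambda = N(0, I_d)$; the helper names `Lambda_g` and `Lambda_m_g` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, n = 2, 500, 30

omegas = rng.standard_normal((m, d))      # assumed: Lambda = N(0, I_d)
S = rng.uniform(-1.0, 1.0, size=(n, d))   # assumed: finite sample standing in for S

def Lambda_g(z):
    # Lambda g_z = E[cos(omega^T z)] = exp(-||z||^2 / 2) for omega ~ N(0, I_d)
    return np.exp(-0.5 * np.sum(z ** 2, axis=-1))

def Lambda_m_g(z):
    # Lambda_m g_z = (1/m) * sum_j cos(omega_j^T z)
    return np.mean(np.cos(z @ omegas.T), axis=-1)

# Left-hand side: sup over pairs (x, y) of |k(x, y) - k_hat(x, y)|.
lhs = max(
    abs(np.exp(-0.5 * np.sum((x - y) ** 2)) - np.mean(np.cos(omegas @ (x - y))))
    for x in S for y in S
)

# Right-hand side: sup over g_z in G, with z ranging over S_Delta = S - S.
Z = S[:, None, :] - S[None, :, :]         # shape (n, n, d)
rhs = np.max(np.abs(Lambda_g(Z) - Lambda_m_g(Z)))
print(lhs, rhs)
```

The two quantities coincide up to floating-point error because each pair $(x, y)$ enters only through $z = x - y$.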
Step-2: bounded difference property of $\|\Lambda - \Lambda_m\|_{\mathcal{G}}$
McDiarmid inequality: Let $\omega_1, \ldots, \omega_m \in D$ be independent random variables, and let $f : D^m \to \mathbb{R}$ satisfy the bounded difference property ($\forall r$):
\[ \sup_{u_1, \ldots, u_m, u_r' \in D} \left| f(u_1, \ldots, u_m) - f(u_1, \ldots, u_{r-1}, u_r', u_{r+1}, \ldots, u_m) \right| \le c_r. \]
Then for $\forall \beta > 0$
\[ P\left( f(\omega_1, \ldots, \omega_m) - \mathbb{E}[f(\omega_1, \ldots, \omega_m)] \ge \beta \right) \le e^{- \frac{2 \beta^2}{\sum_{r=1}^m c_r^2}}. \]
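As a small illustration, the right-hand side of McDiarmid's inequality can be evaluated directly from the constants $c_r$. The function name `mcdiarmid_tail` and the example constants are assumptions made for this sketch: $c_r = 1/m$ corresponds to the sample mean of $[0, 1]$-valued variables, where the bound reduces to Hoeffding's $e^{-2 m \beta^2}$.

```python
import math

def mcdiarmid_tail(beta, c):
    """Upper bound exp(-2 * beta^2 / sum_r c_r^2) on P(f - E[f] >= beta)."""
    return math.exp(-2.0 * beta ** 2 / sum(cr ** 2 for cr in c))

m = 1000
c = [1.0 / m] * m                 # illustrative: mean of [0, 1]-valued variables
bound = mcdiarmid_tail(0.05, c)   # equals exp(-2 * m * 0.05^2) for these constants
print(bound)
```

Smaller constants $c_r$ give a sharper tail, which is why the next step bounds how much $f$ can change when a single $\omega_r$ is replaced.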
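Step-2: bounded difference property of $\|\Lambda - \Lambda_m\|_{\mathcal{G}}$
Our choice: $f(\omega_1, \ldots, \omega_m) := \|\Lambda - \Lambda_m\|_{\mathcal{G}}$.
\[
\begin{aligned}
& \left| f(\omega_1, \ldots, \omega_{r-1}, \omega_r, \omega_{r+1}, \ldots, \omega_m) - f(\omega_1, \ldots, \omega_{r-1}, \omega_r', \omega_{r+1}, \ldots, \omega_m) \right| \\
&\quad = \left| \sup_{g \in \mathcal{G}} \left| \Lambda g - \frac{1}{m} \sum_{j=1}^m g(\omega_j) \right| - \sup_{g \in \mathcal{G}} \left| \Lambda g - \frac{1}{m} \sum_{j=1}^m g(\omega_j) + \frac{1}{m} \left( g(\omega_r) - g(\omega_r') \right) \right| \right| \\
&\quad \overset{(*)}{\le} \frac{1}{m} \sup_{g \in \mathcal{G}} \left| g(\omega_r) - g(\omega_r') \right|
\le \frac{1}{m} \sup_{g \in \mathcal{G}} \left( |g(\omega_r)| + |g(\omega_r')| \right) \\
&\quad \le \frac{1}{m} \left[ \sup_{g \in \mathcal{G}} |g(\omega_r)| + \sup_{g \in \mathcal{G}} |g(\omega_r')| \right]
\le \frac{1 + 1}{m} = \frac{2}{m}.
\end{aligned}
\]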
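As a worked consequence (a sketch, with the confidence level $\delta \in (0, 1)$ introduced here only for illustration): taking $c_r = 2/m$ in the McDiarmid inequality gives $\sum_{r=1}^m c_r^2 = 4/m$, which is where the $1/\sqrt{m}$ term of the high-level proof comes from.

```latex
\[
P\left( \|\Lambda - \Lambda_m\|_{\mathcal{G}} - \mathbb{E}_{\omega_{1:m}} \|\Lambda - \Lambda_m\|_{\mathcal{G}} \ge \beta \right)
\le e^{- \frac{2 \beta^2}{4/m}} = e^{- \frac{m \beta^2}{2}} .
\]
Setting $e^{- m \beta^2 / 2} = \delta$, i.e. $\beta = \sqrt{\frac{2 \log(1/\delta)}{m}}$, yields with probability at least $1 - \delta$
\[
\|\Lambda - \Lambda_m\|_{\mathcal{G}} \le \mathbb{E}_{\omega_{1:m}} \|\Lambda - \Lambda_m\|_{\mathcal{G}} + \sqrt{\frac{2 \log(1/\delta)}{m}} .
\]
```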
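Step-2: (*) = reverse triangle inequality with sup
Lemma: $\mathcal{G}$: set of functions, $a, b : \mathcal{G} \to \mathbb{R}$ maps; then
\[ \left| \sup_{g \in \mathcal{G}} |a(g)| - \sup_{g \in \mathcal{G}} |a(g) + b(g)| \right| \le \sup_{g \in \mathcal{G}} |b(g)|. \]
Proof: combine
\[ \sup_{g \in \mathcal{G}} |a(g) + b(g)| \le \sup_{g \in \mathcal{G}} \left( |a(g)| + |b(g)| \right) \le \sup_{g \in \mathcal{G}} |a(g)| + \sup_{g \in \mathcal{G}} |b(g)|, \]
with the same bound applied to $a = (a + b) + (-b)$:
\[ \sup_{g \in \mathcal{G}} |a(g)| \le \sup_{g \in \mathcal{G}} |a(g) + b(g)| + \sup_{g \in \mathcal{G}} |b(g)|. \]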
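The lemma is easy to sanity-check numerically. In the sketch below, $\mathcal{G}$ is replaced by a finite index set of 1000 elements and $a$, $b$ are random real-valued maps on it; both choices are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumption: a finite index set of size 1000 stands in for G,
# and a, b are arbitrary (here random) real-valued maps on it.
a = rng.standard_normal(1000)
b = rng.standard_normal(1000)

lhs = abs(np.max(np.abs(a)) - np.max(np.abs(a + b)))
rhs = np.max(np.abs(b))
print(lhs <= rhs)
```

In Step-2 the lemma is used with $a(g) = \Lambda g - \frac{1}{m} \sum_{j=1}^m g(\omega_j)$ and $b(g) = \frac{1}{m} \left( g(\omega_r) - g(\omega_r') \right)$.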