Near optimal finite time identification of arbitrary linear dynamical systems
Tuhin Sarkar & Alexander Rakhlin
Massachusetts Institute of Technology
June 12, 2019
ICML 2019

Plan
1. Problem Definition
2. Analysis and Techniques
3. Results
4. Main Results
5. Poster Details

Linear Time Invariant (LTI) Systems
LTI systems appear in autoregressive processes, control and RL systems. Formally,
$$X_{t+1} = A X_t + \eta_{t+1} \qquad (1)$$
- $X_t, \eta_t \in \mathbb{R}^n$. $X_t$ is the state vector, $\eta_t$ is the noise vector.
- $A$ is the state transition matrix: it characterizes the LTI system.
- Assume $\{\eta_t\}_{t=1}^{\infty}$ is isotropic and subGaussian.

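A minimal sketch of equation (1) as a simulation, using only the Python standard library. The function name `simulate_lti`, the zero initial state, the standard Gaussian noise (one isotropic subGaussian choice among many) and the example matrix `A` are all assumptions made for illustration; none of them come from the slides.

```python
import random

def simulate_lti(A, T, seed=0):
    """Return [X_0, ..., X_T] for X_{t+1} = A X_t + eta_{t+1}.

    Assumptions (not from the slides): X_0 = 0 and eta_t ~ N(0, I).
    """
    rng = random.Random(seed)
    n = len(A)
    x = [0.0] * n
    traj = [x]
    for _ in range(T):
        # One independent standard Gaussian noise draw per coordinate.
        x = [sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, 1.0)
             for i in range(n)]
        traj.append(x)
    return traj

# Hypothetical 2x2 stable example (both eigenvalues inside the unit circle).
A = [[0.9, 0.1],
     [0.0, 0.5]]
traj = simulate_lti(A, T=100)
```

The list `traj` holds the initial state followed by 100 simulated states; only the states are observed, which is the setting of the learning problem that follows.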
Learning A from data
Goal: learn $A$ from $\{X_t\}_{t=1}^{T}$ by ordinary least squares (OLS),
$$\hat{A} = \arg\min_{A_o} \sum_{t=1}^{T} \|X_{t+1} - A_o X_t\|_2^2$$
Estimation error:
$$E = \hat{A} - A = \Big(\sum_{t=1}^{T} \eta_{t+1} X_t^{\top}\Big)\Big(\sum_{t=1}^{T} X_t X_t^{\top}\Big)^{+} \qquad (2)$$
where $^{+}$ denotes the pseudo-inverse.
The error analysis is hard: $\{X_t\}_{t=1}^{T}$ are not independent.

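The least squares estimator has the closed form $\hat{A} = (\sum_t X_{t+1} X_t^{\top})(\sum_t X_t X_t^{\top})^{+}$. The sketch below computes it for a 2x2 system in plain Python. It is an illustration under stated assumptions: the helper names `simulate_lti` and `ols_2x2`, the zero initial state, Gaussian noise, and the example matrix are hypothetical, and an ordinary inverse replaces the pseudo-inverse (valid only when the sample covariance is invertible).

```python
import random

def simulate_lti(A, T, seed=0):
    """Trajectory of X_{t+1} = A X_t + eta_{t+1}, assuming X_0 = 0, eta ~ N(0, I)."""
    rng = random.Random(seed)
    n = len(A)
    x = [0.0] * n
    traj = [x]
    for _ in range(T):
        x = [sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, 1.0)
             for i in range(n)]
        traj.append(x)
    return traj

def ols_2x2(traj):
    """A_hat = (sum X_{t+1} X_t^T)(sum X_t X_t^T)^{-1} for state dimension 2."""
    sxy = [[0.0, 0.0], [0.0, 0.0]]  # sum of X_{t+1} X_t^T
    sxx = [[0.0, 0.0], [0.0, 0.0]]  # sum of X_t X_t^T (sample covariance)
    for x, x_next in zip(traj[:-1], traj[1:]):
        for i in range(2):
            for j in range(2):
                sxy[i][j] += x_next[i] * x[j]
                sxx[i][j] += x[i] * x[j]
    # Closed-form inverse of a 2x2 matrix; assumes sxx is invertible.
    det = sxx[0][0] * sxx[1][1] - sxx[0][1] * sxx[1][0]
    inv = [[sxx[1][1] / det, -sxx[0][1] / det],
           [-sxx[1][0] / det, sxx[0][0] / det]]
    return [[sum(sxy[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0.9, 0.1],
     [0.0, 0.5]]
A_hat = ols_2x2(simulate_lti(A, T=5000))
# Largest entrywise deviation between the estimate and the true matrix.
err = max(abs(A_hat[i][j] - A[i][j]) for i in range(2) for j in range(2))
```

Note that every $X_t$ in the sums depends on all earlier noise terms, which is the dependence that makes the error analysis non-standard.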
Related Work
- Faradonbeh et al. (2017). Finite time identification in unstable linear systems.
- Simchowitz et al. (2018). Learning without mixing: Towards a sharp analysis of linear system identification.
Past works fail to capture the correct behavior for all $A$.

Main Technique
The analysis proceeds in two steps:
1. Show invertibility of the sample covariance matrix: $\sum_{t=1}^{T} X_t X_t^{\top} \approx f(T) I$.
2. Show the following for the self-normalized martingale term:
$$\Big(\sum_{t=1}^{T} \eta_{t+1} X_t^{\top}\Big)\Big(\sum_{t=1}^{T} X_t X_t^{\top}\Big)^{-1/2} = O(1)$$

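One way to see why these two steps suffice is to split the inverse sample covariance in the estimation error (2) into two square-root factors. The following is a sketch that assumes the sample covariance is invertible and takes the $O(1)$ and $f(T)$ statements at face value:

```latex
\begin{aligned}
E &= \Big(\sum_{t=1}^{T} \eta_{t+1} X_t^{\top}\Big)
     \Big(\sum_{t=1}^{T} X_t X_t^{\top}\Big)^{-1/2}
     \Big(\sum_{t=1}^{T} X_t X_t^{\top}\Big)^{-1/2}, \\
\|E\|_2 &\le
  \underbrace{\Big\|\Big(\sum_{t=1}^{T} \eta_{t+1} X_t^{\top}\Big)
     \Big(\sum_{t=1}^{T} X_t X_t^{\top}\Big)^{-1/2}\Big\|_2}_{O(1)\ \text{(step 2)}}
  \cdot
  \underbrace{\Big\|\Big(\sum_{t=1}^{T} X_t X_t^{\top}\Big)^{-1/2}\Big\|_2}_{\approx\, f(T)^{-1/2}\ \text{(step 1)}}.
\end{aligned}
```

Under this reading, $f(T) = \Theta(T)$ gives an error of order $T^{-1/2}$ and $f(T) = \Omega(T^2)$ gives order $T^{-1}$, so faster growth of the sample covariance translates directly into a faster rate.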
Sample Covariance Matrix
Let $\rho_i(A)$ be the absolute value of the $i$-th eigenvalue of $A$, with $\rho_i(A) \ge \rho_{i+1}(A)$. Then
- $\rho_i \in S_0 \implies \rho_i(A) \le 1 - C/T$
- $\rho_i \in S_1 \implies \rho_i(A) \in [1 - C/T,\ 1 + C/T]$
- $\rho_i \in S_2 \implies \rho_i(A) \ge 1 + C/T$
Theorem
- $\rho_i(A) \in S_0 \implies \sum_{t=1}^{T} X_t X_t^{\top} = \Theta(T)$
- $\rho_i(A) \in S_1 \implies \sum_{t=1}^{T} X_t X_t^{\top} = \Omega(T^2)$
- $\rho_i(A) \in S_2 \implies \sum_{t=1}^{T} X_t X_t^{\top} = \Omega(e^{aT})$ (under necessary and sufficient "regularity" conditions only)

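The three growth regimes can be probed numerically in the simplest possible setting. The sketch below uses a scalar system ($n = 1$), so the sample covariance is just $\sum_t x_t^2$; the function name `sum_of_squares`, the zero initial state, Gaussian noise and the chosen values of `a` are assumptions for illustration, and a single simulated path is only suggestive, not a verification of the theorem.

```python
import random

def sum_of_squares(a, T, seed=0):
    """Return sum_{t=1}^{T} x_t^2 for x_{t+1} = a x_t + eta_{t+1}, x_0 = 0."""
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(T):
        x = a * x + rng.gauss(0.0, 1.0)  # eta ~ N(0, 1) is an assumption
        total += x * x
    return total

# a = 0.5 (stable), a = 1.0 (marginally stable), a = 1.1 (explosive).
# With a fixed seed, the T = 400 path extends the T = 200 path.
for a in (0.5, 1.0, 1.1):
    s_T = sum_of_squares(a, 200)
    s_2T = sum_of_squares(a, 400)
    print(f"a={a}: T=200 -> {s_T:.3e}, T=400 -> {s_2T:.3e}, ratio {s_2T / s_T:.3e}")
```

Doubling $T$ should roughly double a linearly growing sum, roughly quadruple a quadratically growing one (with considerable randomness in the marginal case), and multiply an exponentially growing one by a very large factor.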
Self Normalized Martingale
Theorem (Abbasi-Yadkori et al. 2011)
Let $V$ be a deterministic matrix with $V \succ 0$. For any $0 < \delta < 1$ and $\{\eta_t, X_t\}_{t=1}^{T}$ defined as before, we have with probability at least $1 - \delta$
$$\Big\| (\bar{Y}_{T-1})^{-1/2} \sum_{t=0}^{T-1} X_t \eta_{t+1}' \Big\|_2 \le R \sqrt{8 n \log\Big(\frac{5 \det(\bar{Y}_{T-1})^{1/(2n)} \det(V)^{-1/(2n)}}{\delta^{1/n}}\Big)} \qquad (3)$$
where $\bar{Y}_{\tau}^{-1} = (Y_{\tau} + V)^{-1}$ and $R^2$ is the subGaussian parameter of $\eta_t$.

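To get a feel for the size of the right-hand side of (3), it can be evaluated directly. The sketch below is only a calculator for that expression; the function name `self_normalized_bound` and the choice to pass log-determinants (rather than determinants, which can overflow for fast-growing systems) are assumptions for illustration.

```python
import math

def self_normalized_bound(R, n, delta, logdet_Ybar, logdet_V):
    """Right-hand side of (3), given log det(Ybar_{T-1}) and log det(V).

    Uses log(5 * det(Ybar)^{1/(2n)} * det(V)^{-1/(2n)} / delta^{1/n})
        = log 5 + (logdet_Ybar - logdet_V) / (2n) - log(delta) / n.
    """
    log_term = (math.log(5.0)
                + (logdet_Ybar - logdet_V) / (2.0 * n)
                - math.log(delta) / n)
    return R * math.sqrt(8.0 * n * log_term)

# Hypothetical inputs: n = 2, det(Ybar) = 400, det(V) = 1, delta = 0.05.
# Then the log argument is 5 * 400^{1/4} / 0.05^{1/2} = 5 * sqrt(20) * sqrt(20) = 100.
bound = self_normalized_bound(R=1.0, n=2, delta=0.05,
                              logdet_Ybar=math.log(400.0), logdet_V=0.0)
```

The bound depends on the data only through the log-determinant ratio, so it grows slowly even when the sample covariance itself grows quickly.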
Main Result 1
Combining the previous results (and a few more matrix manipulations) we show
Theorem
- $\rho_i(A) \in S_0 \cup S_1 \cup S_2 \implies \|E\|_2 = O(T^{-1/2})$
- $\rho_i(A) \in S_1 \cup S_2 \implies \|E\|_2 = O(T^{-1})$
- $\rho_i(A) \in S_2 \implies \|E\|_2 = O(e^{-aT})$ (under necessary and sufficient "regularity" conditions only)

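The difference between these rates is easy to observe in the scalar case, where the OLS estimate reduces to $\hat{a} = \sum_t x_{t+1} x_t / \sum_t x_t^2$. The following sketch is an informal experiment, not part of the analysis; the function name `scalar_ols_error`, the zero initial state, Gaussian noise, and the values of `a` and `T` are assumptions chosen for illustration.

```python
import random

def scalar_ols_error(a, T, seed=0):
    """|a_hat - a| for x_{t+1} = a x_t + eta_{t+1}, x_0 = 0, eta ~ N(0, 1)."""
    rng = random.Random(seed)
    x, sxy, sxx = 0.0, 0.0, 0.0
    for _ in range(T):
        x_next = a * x + rng.gauss(0.0, 1.0)
        sxy += x_next * x  # running sum of x_{t+1} x_t
        sxx += x * x       # running sum of x_t^2
        x = x_next
    return abs(sxy / sxx - a)

# Stable, marginally stable, and explosive scalar systems at the same horizon.
for a in (0.5, 1.0, 1.1):
    print(f"a={a}: |a_hat - a| at T=200 is {scalar_ols_error(a, 200):.3e}")
```

A single run per value of `a` is noisy, so averaging over several seeds gives a clearer picture of how the error shrinks as the eigenvalue moves from inside to outside the unit circle.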
Main Result 2
Regularity condition: all eigenvalues greater than one should have geometric multiplicity one.
Theorem
- If the regularity conditions are violated then OLS is inconsistent.
- OLS cannot learn $A = \rho I$ where $\rho \ge 1.5$. $E$ has a non-trivial probability distribution.

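A small Monte Carlo experiment can contrast a matrix that violates the regularity condition with one that satisfies it. This is a hedged sketch: the helper names `ols_error_2x2` and `median_error`, the two example matrices, the horizons, Gaussian noise, the zero initial state, and the use of the Frobenius norm and a plain 2x2 inverse are all assumptions for illustration. The code only measures errors; it does not establish the theorem, and the horizons are kept short because the sample covariance becomes numerically ill-conditioned for explosive systems.

```python
import math
import random

def ols_error_2x2(A, T, seed):
    """Frobenius norm of A_hat - A for one simulated trajectory (n = 2, X_0 = 0)."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    sxy = [[0.0, 0.0], [0.0, 0.0]]  # sum of X_{t+1} X_t^T
    sxx = [[0.0, 0.0], [0.0, 0.0]]  # sum of X_t X_t^T
    for _ in range(T):
        x_next = [A[i][0] * x[0] + A[i][1] * x[1] + rng.gauss(0.0, 1.0)
                  for i in range(2)]
        for i in range(2):
            for j in range(2):
                sxy[i][j] += x_next[i] * x[j]
                sxx[i][j] += x[i] * x[j]
        x = x_next
    det = sxx[0][0] * sxx[1][1] - sxx[0][1] * sxx[1][0]
    inv = [[sxx[1][1] / det, -sxx[0][1] / det],
           [-sxx[1][0] / det, sxx[0][0] / det]]
    a_hat = [[sxy[i][0] * inv[0][j] + sxy[i][1] * inv[1][j] for j in range(2)]
             for i in range(2)]
    return math.sqrt(sum((a_hat[i][j] - A[i][j]) ** 2
                         for i in range(2) for j in range(2)))

def median_error(A, T, trials=200):
    """Median OLS error over independent trajectories (seeds 0..trials-1)."""
    errs = sorted(ols_error_2x2(A, T, seed) for seed in range(trials))
    return errs[len(errs) // 2]

rho_identity = [[1.5, 0.0], [0.0, 1.5]]  # eigenvalue 1.5, geometric multiplicity 2
distinct = [[1.5, 0.0], [0.0, 1.2]]      # each eigenvalue has multiplicity 1
for T in (10, 20, 30):
    print(T, median_error(rho_identity, T), median_error(distinct, T))
```

If the theorem's prediction carries over to this setup, the median error for `rho_identity` should fail to shrink as `T` grows, while the error for `distinct` should decrease.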
Poster Details
Please come to our poster at Pacific Ballroom #193 at 6.30 pm today.
Thank you!