  1. Dictionary Learning Applications in Control Theory
Paul Irofti, Florin Stoican
Politehnica University of Bucharest, Faculty of Automatic Control and Computers, Department of Automatic Control and Systems Engineering
Email: paul@irofti.net, florin.stoican@acse.pub.ro
Recent Advances in Artificial Intelligence, June 20th, 2017
Acknowledgment: This work was supported by the Romanian National Authority for Scientific Research, CNCS - UEFISCDI, project number PN-II-RU-TE-2014-4-2713.

  2. Sparse Representation (SR)
y = D · x  (a signal y written as the product of the dictionary D and a sparse vector x)

  3. Orthogonal Matching Pursuit (OMP)
Algorithm 1: OMP [Pati, Rezaiifar, and Krishnaprasad 1993]
1. Arguments: D, y, s
2. Initialize: r = y, I = ∅
3. for k = 1 : s do
4.   Compute correlations with the residual: z = D^T r
5.   Select new column: i = arg max_j |z_j|
6.   Increase support: I ← I ∪ {i}
7.   Compute new solution: x = LS(D, y, I)
8.   Update residual: r = y − D_I x_I
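
A minimal numpy sketch of Algorithm 1 (the helper name `omp` and the least-squares solver choice are mine, not the authors' implementation):

```python
import numpy as np

def omp(D, y, s):
    """Greedy OMP: pick s columns (atoms) of D that best explain y.
    Assumes the columns of D are l2-normalized."""
    r = y.copy()                      # residual starts as the signal itself
    support = []                      # indices of the selected atoms
    x = np.zeros(D.shape[1])
    for _ in range(s):
        z = D.T @ r                   # correlations with the residual
        i = int(np.argmax(np.abs(z))) # new atom = most correlated column
        support.append(i)
        # least-squares solution on the current support
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coef
        r = y - D[:, support] @ coef  # update the residual
    return x, support
```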

  4. Dictionary Learning (DL)
Y ≈ D · X

  5. The Dictionary Learning (DL) Problem
Given a data set Y ∈ R^{p×m} and a sparsity level s, minimize the bivariate function

  min_{D,X}  ‖Y − DX‖_F^2                      (1)
  s.t.       ‖d_j‖_2 = 1,  1 ≤ j ≤ n
             ‖x_i‖_0 ≤ s,  1 ≤ i ≤ m,

where D ∈ R^{p×n} is the dictionary (whose columns are called atoms) and X ∈ R^{n×m} is the sparse representation matrix.

  6. Approach
Algorithm 2: Dictionary learning – general structure
1. Arguments: signal matrix Y, target sparsity s
2. Initialize: dictionary D (with normalized atoms)
3. for k = 1, 2, ... do
4.   With fixed D, compute the sparse representations X
5.   With fixed X, update the atoms d_j, j = 1 : n
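
A sketch of this alternation, reusing the `omp` function above; for simplicity the atom-update step here is a plain least-squares (MOD-style) refit rather than the per-atom K-SVD update described next:

```python
import numpy as np

def dictionary_learning(Y, n_atoms, s, iters=20):
    """Alternate sparse coding (OMP) and a simple dictionary refit."""
    p, m = Y.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((p, n_atoms))
    D /= np.linalg.norm(D, axis=0)                 # normalized atoms
    X = np.zeros((n_atoms, m))
    for _ in range(iters):
        # with fixed D, compute sparse representations column by column
        X = np.column_stack([omp(D, Y[:, i], s)[0] for i in range(m)])
        # with fixed X, refit all atoms at once (MOD-style), then renormalize
        D = Y @ np.linalg.pinv(X)
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D, X
```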

  7. DL Algorithms
K-SVD [Aharon, Elad, and Bruckstein 2006] solves, in sequence for each atom, the optimization problem

  min_{d_j, X_{j,I_j}}  ‖ (Y_{I_j} − Σ_{ℓ≠j} d_ℓ X_{ℓ,I_ℓ}) − d_j X_{j,I_j} ‖_F,     (2)

where all atoms except d_j are fixed. This is seen as a rank-1 approximation and the solution is given by the singular vectors corresponding to the largest singular value:

  d_j = u_1,   X_{j,I_j} = σ_1 v_1^T.     (3)
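
A sketch of the rank-1 atom update in (2)–(3); the restriction to the columns I_j that currently use atom j follows the slide, while the names and edge-case handling are mine:

```python
import numpy as np

def ksvd_atom_update(Y, D, X, j):
    """Update atom j and its coefficients via the leading singular pair."""
    I = np.flatnonzero(X[j, :])        # signals that currently use atom j
    if I.size == 0:
        return D, X                    # unused atom: leave it untouched
    # representation error without atom j's contribution, restricted to I
    E = Y[:, I] - D @ X[:, I] + np.outer(D[:, j], X[j, I])
    U, S, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, j] = U[:, 0]                  # d_j = u_1
    X[j, I] = S[0] * Vt[0, :]          # X_{j,I_j} = sigma_1 * v_1^T
    return D, X
```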

  8. LC-KSVD

  min_{D,X,A,W}  ‖Y − DX‖_F^2 + α‖Q − AX‖_F^2 + β‖H − WX‖_F^2     (4)
  s.t.           ‖d_j‖_2 = 1,  1 ≤ j ≤ n
                 ‖x_i‖_0 ≤ s,  1 ≤ i ≤ m,

where:
- the dictionary atoms are evenly split among the classes;
- q_i has non-zero entries where y_i and the atoms share the same label;
- the linear transformation A encourages discrimination in X;
- h_i = e_j, where j is the class label of y_i;
- W represents the learned classifier parameters.
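
The discriminative targets Q and H follow directly from the labels; a small sketch (the function name and the per-class atom split parameterization are assumptions consistent with the bullet points above):

```python
import numpy as np

def lcksvd_targets(labels, atoms_per_class, n_classes):
    """Q: 'ideal' codes, the atom block of class c active for signals of class c.
    H: one-hot label matrix with h_i = e_j for class label j of y_i."""
    m = len(labels)
    n = atoms_per_class * n_classes            # atoms evenly split among classes
    Q = np.zeros((n, m))
    H = np.zeros((n_classes, m))
    for i, c in enumerate(labels):
        Q[c * atoms_per_class:(c + 1) * atoms_per_class, i] = 1.0
        H[c, i] = 1.0
    return Q, H
```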

  9. Fault Detection and Isolation in Water Networks

  10. FDI via DL
Water networks pose some interesting issues:
- large-scale, distributed networks with few sensors;
- user demand is unknown or imprecise;
- pressure dynamics are nonlinear (analytic solutions impractical).

The DL approach for FDI:
- a residual signal compares expected and measured pressures,

    r_i(t) = p_i(f_i(t), f_j(t), t) − p̄_i,  ∀ i, j;     (5)

- each fault is assigned a class, and DL provides the atoms that discriminate between them;
- each residual is sparsely described by atoms and thus FDI is achieved iff the classification is unambiguous.

  11. Hanoi Network
[Figure: the Hanoi benchmark water network with nodes 1–31 and pipe lengths. Legend: junction node, tank node, node with sensor, junction partition, pipe connection, fault event.]

  12. Sensor Placement
Let R ∈ R^{n×mn} collect the measured pressure residuals in all n network nodes; for each node we simulate m different faults. Given s < n available sensors, apply OMP to each column r:

  min_x  ‖r − I_n x‖_2^2     (6)
  s.t.   ‖x‖_0 ≤ s,

resulting in a matrix X with s-sparse columns approximating R.

  13. Placement Strategies
(a) select the s most commonly used atoms;
(b) from each m-block select the s most frequent atoms; of the resulting n·s atoms, select again the first s.

[Figure: nodes selected by strategies (a) and (b) for a number of sensors ranging from 1 to 10; node indices 1–31 on the horizontal axis.]
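
One possible reading of the two strategies as code, building on the identity-dictionary OMP of (6), which simply keeps the s largest-magnitude entries of each residual column (the helper name and the vote-counting details are assumptions):

```python
import numpy as np

def place_sensors(R, s, m):
    """R: n x (m*n) residuals, m simulated faults per node; returns the
    sensor nodes chosen by strategies (a) and (b)."""
    n = R.shape[0]
    # per-column supports: indices of the s largest-magnitude entries
    supports = np.argsort(-np.abs(R), axis=0)[:s, :]
    # (a) the s most commonly used atoms over all columns
    counts = np.bincount(supports.ravel(), minlength=n)
    nodes_a = np.argsort(-counts)[:s]
    # (b) the s most frequent atoms in each m-column block, then re-vote
    votes = []
    for b in range(R.shape[1] // m):
        block = supports[:, b * m:(b + 1) * m]
        cb = np.bincount(block.ravel(), minlength=n)
        votes.extend(np.argsort(-cb)[:s].tolist())
    counts_b = np.bincount(np.array(votes), minlength=n)
    nodes_b = np.argsort(-counts_b)[:s]
    return nodes_a, nodes_b
```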

  14. Learning
Algorithm 3: Placement and FDI learning [Irofti and Stoican 2017]
1. Inputs: training residuals R ∈ R^{n×nm}, parameters s, α, β
2. Result: dictionary D, classifier W, sensor nodes I_s
3. Select s sensor nodes I_s based on matrix R using strategy (a) or (b)
4. Let R_{I_s} be the restriction of R to the rows in I_s
5. Use R_{I_s}, α and β to learn D and W from (4)
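
A loose end-to-end sketch of Algorithm 3. It reuses the helpers above and replaces the LC-KSVD solver with the common trick of running plain DL on the stacked matrix [Y; √α Q; √β H]; all names, shapes and this substitution are assumptions, not the authors' code:

```python
import numpy as np

def learn_placement_and_fdi(R, labels, s, n_atoms, n_classes, alpha, beta, m):
    """Sensor placement + discriminative dictionary/classifier learning."""
    nodes, _ = place_sensors(R, s, m)              # strategy (a)
    R_s = R[nodes, :]                              # restrict R to sensor rows
    Q, H = lcksvd_targets(labels, n_atoms // n_classes, n_classes)
    # stack data and discriminative targets, then run plain DL on the stack
    Y_aug = np.vstack([R_s, np.sqrt(alpha) * Q, np.sqrt(beta) * H])
    D_aug, _ = dictionary_learning(Y_aug, n_atoms, s)
    p, nq = R_s.shape[0], Q.shape[0]
    D = D_aug[:p, :]                               # reconstructive part
    W = D_aug[p + nq:, :] / np.sqrt(beta)          # classifier part
    # renormalize the atoms and rescale the classifier consistently
    norms = np.linalg.norm(D, axis=0) + 1e-12
    return D / norms, W / norms, nodes
```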

  15. Fault Detection
Algorithm 4: Fault detection and isolation
1. Inputs: testing residuals R ∈ R^{s×mn}, dictionary D, classifier W
2. Result: prediction P ∈ N^{mn}
3. for k = 1 to mn do
4.   Use OMP to obtain x_k from r_k and D
5.   Label: L_k = W x_k
6.   Classify: p_k = arg max_c L_k
The position c of the largest entry of L_k is the predicted class.
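
The classification loop of Algorithm 4 as a sketch, reusing the `omp` function above:

```python
import numpy as np

def classify_faults(R_test, D, W, s):
    """Sparse-code each residual column, score it with W, take the argmax."""
    preds = []
    for k in range(R_test.shape[1]):
        x_k, _ = omp(D, R_test[:, k], s)   # sparse representation of r_k
        L_k = W @ x_k                      # class scores
        preds.append(int(np.argmax(L_k)))  # predicted fault class
    return np.array(preds)
```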

  16. Today
Improved sensor placement. Iteratively choose s rows from R, solving at each step

  i = arg min_k  (1/2) ‖proj_{R_I} r_k‖_2^2 + λ / ‖δ_{k,I}‖_1,   r_k ∈ R_{I^c},     (7)

where I is the set of currently selected rows and the elements of the vector δ_{k,I} are the distances from node k to the nodes in I.

Graph-aware DL. Adding graph regularization [Yankelevsky and Elad 2016]:

  ‖Y − DX‖_F^2 + α‖Q − AX‖_F^2 + β‖H − WX‖_F^2 + γ Tr(D^T L D) + λ Tr(X L_c X^T) + μ ‖L‖_F^2,     (8)

where L is the graph Laplacian.
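
For concreteness, a small sketch of the Laplacian and of evaluating the extra regularization terms in (8); building L from a given adjacency matrix is my assumption about how the network graph enters the problem:

```python
import numpy as np

def graph_laplacian(A):
    """Combinatorial Laplacian L = diag(A 1) - A of the network graph."""
    return np.diag(A.sum(axis=1)) - A

def graph_regularization(D, X, L, Lc, gamma, lam, mu):
    """Value of the terms added to (4) in (8)."""
    return (gamma * np.trace(D.T @ L @ D)
            + lam * np.trace(X @ Lc @ X.T)
            + mu * np.linalg.norm(L, 'fro') ** 2)
```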

  17. Zonotopic Area Coverage

  18. Zonotopic Sets
Area packing, mRPI (over)approximation and other related notions may be described via unions of zonotopic sets:

  min_{Z_k}  vol(S) − vol(⋃_k Z_k)     (9)
  s.t.       Z_k ⊆ S.

Zonotopes, given in generator representation [Fukuda 2004],

  Z_k = Z(c_k, G_k) = { c_k + G_k ξ : ‖ξ‖_∞ ≤ 1 },     (10)

are easy to handle for:
- Minkowski sum: Z(G_1, c_1) ⊕ Z(G_2, c_2) = Z([G_1 G_2], c_1 + c_2);
- linear mappings: R Z(G_1, c_1) = Z(R G_1, R c_1).
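
Both operations are one-liners on the generator representation; a minimal sketch (the class name and interface are assumptions):

```python
import numpy as np

class Zonotope:
    """Generator representation Z(c, G) = { c + G xi : ||xi||_inf <= 1 }."""
    def __init__(self, c, G):
        self.c = np.asarray(c, dtype=float)   # center, shape (n,)
        self.G = np.asarray(G, dtype=float)   # generators, shape (n, N)

    def minkowski_sum(self, other):
        # Z(G1, c1) (+) Z(G2, c2) = Z([G1 G2], c1 + c2)
        return Zonotope(self.c + other.c, np.hstack([self.G, other.G]))

    def linear_map(self, R):
        # R Z(G, c) = Z(R G, R c)
        return Zonotope(R @ self.c, R @ self.G)
```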

  19. Formulation
Each zonotope is parameterized by its center and a scaling vector (c_k, λ_k). These variables help formulate:

- the inclusion constraint Z(c_k, G·diag(λ_k)) ⊆ U:

    s_i^T c_k + Σ_j |s_i^T G_j| λ_{jk} ≤ r_i,  ∀ i,     (11)

  where U = { u : s_i^T u ≤ r_i };

- an explicit description of the volume [Gover and Krikorian 2010]:

    vol(Z(c_k, G Λ_k)) = Σ_{1 ≤ j_1 < ... < j_n ≤ N}  |det(G_{j_1 ... j_n})| · Π_{j ∈ {j_1, ..., j_n}} λ_{jk}.     (12)

The formulation becomes simpler if the scaling is homogeneous (λ*_k = λ_{jk}, ∀ j).
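
Formula (12) can be evaluated directly by summing over all n-column subsets of the scaled generators; a sketch (the function name is mine; any constant factor such as 2^n is omitted, as on the slide, and the cost grows combinatorially in N):

```python
import numpy as np
from itertools import combinations

def zonotope_volume(G, lam):
    """Volume of Z(c, G diag(lam)) via (12): sum of |det| over all
    n-subsets of the scaled generator columns."""
    n, N = G.shape
    Gs = G * lam                       # scale generator j by lambda_j
    vol = 0.0
    for cols in combinations(range(N), n):
        vol += abs(np.linalg.det(Gs[:, list(cols)]))
    return vol
```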

  20. Implementation
We follow the OMP formalism, without its theoretical convergence guarantees:

Algorithm 5: Area coverage with zonotopic sets
1. Inputs: area to be covered U, sparsity constraint s
2. Result: pairs of centers and scaling factors (c_k, λ_k)
3. for k = 1 to s do
4.   Enlarge the zonotopes until they saturate the constraints
5.   Select Z_k where k = arg min_k vol(S_k \ Z_k)
6.   Update the uncovered area: vol(S_{k+1}) = vol(S_k \ Z_k)

  21. Result
[Figure: the resulting zonotopic coverage of the target area.]

  22. Thank You! Questions?
