In the Shallows of the DeePC: Data-Enabled Predictive Control

In the Shallows of the DeePC: Data-Enabled Predictive Control. Florian Dörfler, Automatic Control Laboratory, ETH Zürich. Acknowledgements: John Lygeros, Jeremy Coulson. Funding: ETH Zürich. Simulation data: M. Zeilinger and C. Jones.


  1. In the Shallows of the DeePC: Data-Enabled Predictive Control. Florian Dörfler, Automatic Control Laboratory, ETH Zürich

  2. Acknowledgements: John Lygeros, Jeremy Coulson. Funding: ETH Zürich. Simulation data: M. Zeilinger and C. Jones. Brainstorming: B. Bamieh, B. Recht, A. Cherukuri, and M. Morari

  3. Feedback: our central paradigm. [Figure: feedback loop between the physical world and information technology. Sensing ("making sense of the world") feeds inference and data science; automation and control drive actuation ("making a difference to the world").]

  4. Big, deep, data, and so on • unprecedented availability of computation, storage, and data • theoretical advances in optimization, statistics, and machine learning • ...and big-data frenzy → increasing importance of data-centric methods in all of science / engineering. Make up your own opinion, but machine learning works too well to be ignored.

  5. Control in a data-rich world • ever-growing trend in CS and robotics: data-driven control by-passing models • canonical problem: black/gray-box system control based on I/O samples. [Figure: unknown system with input samples $u_1, u_2$ and output samples $y_1, y_2$, closed in a loop by data-driven control.] Q: Why give up physical modeling and reliable model-based algorithms? Data-driven control is a viable alternative when • models are too complex to be useful (e.g., control of fluid dynamics) • first-principle models are not conceivable (e.g., human-in-the-loop applications) • modeling and system ID is too costly (e.g., non-critical robotics applications). Central promise: It is often easier to learn control policies directly from data, rather than learning a model. Example: PID

  6. ...of course, we are all tempted, annoyed, ... machine learning often achieves super-human performance, but it performs nowhere near MPC ...but that's an entirely unfair comparison, is it? Today: preliminary ideas on a new approach that seems equally simple & powerful

  7. Snippets from the literature. 1. reinforcement learning / or stochastic adaptive control / or approximate dynamic programming, with key mathematical challenges • (approximate/neuro) DP to learn approx. value/Q-function or optimal policy • (stochastic) function approximation in continuous state and action spaces • exploration-exploitation trade-offs, and practical limitations • inefficiency: computation & samples • complex and fragile algorithms • safe real-time exploration ø suitable for physical control systems? [Figure: reinforcement learning control acting on an unknown system through action and observation, with a reward estimate.]

  8. Snippets from the literature cont'd. 2. gray-box safe learning & control • robust → conservative & complex control • adaptive → hard & asymptotic performance • contemporary learning algorithms (e.g., MPC + Gaussian processes / RL) → non-conservative, optimal, & safe ø limited applicability: need a-priori safety. [Figure: uncertain system with input $u$ and output $y$ under robust/adaptive control.] 3. Sequential system ID + control • ID with uncertainty quantification followed by robust control design → recent finite-sample & end-to-end ID + control pipelines out-performing RL ø ID seeks best but not most useful model ø "easier to learn policies than models". [Figure: I/O samples $(u_1, y_1)$, $(u_2, y_2)$ of an unknown system.]

  9. Key take-aways. Quintessence of literature review: • data-driven approach is no silver bullet (see previous ø), and we did not even discuss output feedback, safety constraints, ... • predictive models are preferable over data (even approximate) → models are tidied-up, compressed, and de-noised representations → model-based methods vastly out-perform model-agnostic strategies • but often easier to learn controllers from data rather than models ø deadlock? • a useful ML insight: non-parametric methods are often preferable over parametric ones (e.g., basis functions vs. kernels) → build a predictive & non-parametric model directly from raw data?

  10. Colorful idea. [Figure: impulse response samples $y_1, y_2, \dots, y_7$ obtained for $u_1 = 1$, $u_2 = u_3 = \cdots = 0$, $x_0 = 0$.] If you had the impulse response of an LTI system, then ... • can build state-space system identification (Kalman-Ho realization) • ...but can also build predictive model directly from raw data: $y_{\text{future}}(t) = \begin{bmatrix} y_1 & y_2 & y_3 & \dots \end{bmatrix} \cdot \begin{bmatrix} u_{\text{future}}(t) \\ u_{\text{future}}(t-1) \\ u_{\text{future}}(t-2) \\ \vdots \end{bmatrix}$ • model predictive control from data: dynamic matrix control (DMC) • today: can we do so with arbitrary, finite, and corrupted I/O samples?
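
A minimal sketch of the impulse-response predictor on slide 10, in zero-based indexing. The system matrices `A`, `B`, `C`, `D` and the helper `simulate` are hypothetical and chosen only for illustration; they are not from the slides. The point is that, for zero initial state, convolving the recorded impulse response with any input reproduces the simulated output.

```python
import numpy as np

def simulate(A, B, C, D, u, x0):
    """Simulate x(t+1) = A x(t) + B u(t), y(t) = C x(t) + D u(t) for a SISO system."""
    x = np.array(x0, dtype=float)
    y = np.zeros(len(u))
    for t, ut in enumerate(u):
        y[t] = C @ x + D * ut
        x = A @ x + B * ut
    return y

# Hypothetical stable second-order system (an assumption of this sketch).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([0.0, 1.0])
C = np.array([1.0, 0.0])
D = 0.0

N = 30
impulse = np.zeros(N)
impulse[0] = 1.0
h = simulate(A, B, C, D, impulse, np.zeros(2))  # impulse response from x0 = 0

rng = np.random.default_rng(0)
u = rng.standard_normal(N)

# y(t) = sum_k h[k] * u(t - k): the raw impulse response is the predictive model.
y_pred = np.convolve(h, u)[:N]
y_true = simulate(A, B, C, D, u, np.zeros(2))
max_err = np.max(np.abs(y_pred - y_true))
print(f"max prediction error: {max_err:.2e}")
```

The prediction needs no state-space matrices once `h` is recorded; they appear here only to generate the data and the reference output.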

  11. Contents Introduction Insights from Behavioral System Theory DeePC: Data-Enabled Predictive Control Beyond Deterministic LTI Systems Conclusions

  12. Behavioral view on LTI systems. Definition: A discrete-time dynamical system is a 3-tuple $(\mathbb{Z}_{\ge 0}, \mathbb{W}, \mathcal{B})$ where (i) $\mathbb{Z}_{\ge 0}$ is the discrete-time axis, (ii) $\mathbb{W}$ is a signal space, and (iii) $\mathcal{B} \subseteq \mathbb{W}^{\mathbb{Z}_{\ge 0}}$ is the behavior. Definition: The dynamical system $(\mathbb{Z}_{\ge 0}, \mathbb{W}, \mathcal{B})$ is (i) linear if $\mathbb{W}$ is a vector space & $\mathcal{B}$ is a subspace of $\mathbb{W}^{\mathbb{Z}_{\ge 0}}$, (ii) time-invariant if $\mathcal{B} \subseteq \sigma\mathcal{B}$, where $\sigma w_t = w_{t+1}$, and (iii) complete if $\mathcal{B}$ is closed ⇔ $\mathbb{W}$ is finite dimensional. In the remainder we focus on discrete-time LTI systems.

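  13. Behavioral view cont'd. Behavior $\mathcal{B}$ = set of trajectories in $\mathbb{W}^{\mathbb{Z}_{\ge 0}}$, and set of truncated trajectories $\mathcal{B}_T = \{ w \in \mathbb{W}^T \mid \exists\, v \in \mathcal{B} \text{ s.t. } w_t = v_t,\ t \in [0, T] \}$. A system $(\mathbb{Z}_{\ge 0}, \mathbb{W}, \mathcal{B})$ is controllable if any two truncated trajectories $w^1, w^2 \in \mathcal{B}$ can be patched together in finite time with a trajectory $w \in \mathcal{B}_{[T,T']}$. [Figure: trajectories $w^1$ and $w^2$ joined by $w$ between times $T$ and $T'$.] I/O: $\mathcal{B} = \mathcal{B}^u \times \mathcal{B}^y$ where $\mathcal{B}^u = (\mathbb{R}^m)^{\mathbb{Z}_{\ge 0}}$ and $\mathcal{B}^y \subseteq (\mathbb{R}^p)^{\mathbb{Z}_{\ge 0}}$ are the spaces of input and output signals ⇒ $w = \operatorname{col}(u, y) \in \mathcal{B}$. Parametric kernel representation: $\mathcal{B} = \{ \operatorname{col}(u, y) \in (\mathbb{R}^{m+p})^{\mathbb{Z}_{\ge 0}} \text{ s.t. } b_0 u + b_1 \sigma u + \dots + b_n \sigma^n u + a_0 y + a_1 \sigma y + \dots + a_n \sigma^n y = 0 \}$ ⇔ $\operatorname{col}(u, y) \in \ker \begin{bmatrix} b_0 & b_1 \sigma & \dots & b_n \sigma^n & a_0 & a_1 \sigma & \dots & a_n \sigma^n \end{bmatrix}$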

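  14. Behavioral view cont'd • parametric state-space representation with minimal realization $\mathcal{B}(A, B, C, D) = \{ \operatorname{col}(u, y) \in (\mathbb{R}^{m+p})^{\mathbb{Z}_{\ge 0}} \mid \exists\, x \in (\mathbb{R}^n)^{\mathbb{Z}_{\ge 0}} \text{ s.t. } \sigma x = Ax + Bu,\ y = Cx + Du \}$ • lag: smallest $\ell \in \mathbb{Z}_{>0}$ s.t. the observability matrix $\begin{bmatrix} C \\ CA \\ \vdots \\ CA^{\ell-1} \end{bmatrix}$ has rank $n$. Lemma [Markovsky & Rapisarda '08]: Consider a minimal state-space model $\mathcal{B}(A, B, C, D)$ & a trajectory $\operatorname{col}(u_{\text{ini}}, u, y_{\text{ini}}, y) \in \mathcal{B}_{T_{\text{ini}} + T_{\text{future}}}$ of length $T_{\text{ini}} + T_{\text{future}}$ with $T_{\text{ini}} \ge \ell$. Then $\exists$ unique $x_{\text{ini}} \in \mathbb{R}^n$ such that $y = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{\ell-1} \end{bmatrix} x_{\text{ini}} + \begin{bmatrix} D & 0 & \cdots & 0 \\ CB & D & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ CA^{N-2}B & \cdots & CB & D \end{bmatrix} u$, i.e., we can recover the initial condition from past $\ell$ samples.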
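
A hedged sketch of how the lemma on slide 14 can be used numerically. The system matrices and the helpers `obsv`, `toeplitz_io`, and `simulate` are hypothetical names introduced for illustration. This sketch assumes both the observability and the Toeplitz matrix are built with as many block rows as the window they are applied to (`T_ini` for the past, `T_future` for the future). It recovers the state from the past window by least squares, rolls it forward to $x_{\text{ini}}$, and predicts the future outputs.

```python
import numpy as np

def obsv(A, C, N):
    """Stack C, CA, ..., CA^(N-1)."""
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(N)])

def toeplitz_io(A, B, C, D, N):
    """Lower block-triangular Toeplitz matrix of Markov parameters D, CB, CAB, ..."""
    p, m = D.shape
    T = np.zeros((N * p, N * m))
    for i in range(N):
        for j in range(i + 1):
            blk = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
            T[i * p:(i + 1) * p, j * m:(j + 1) * m] = blk
    return T

def simulate(A, B, C, D, u, x0):
    """Return the outputs, shape (N, p), and the state reached after the last input."""
    x, ys = x0.copy(), []
    for ut in u:
        ys.append(C @ x + D @ ut)
        x = A @ x + B @ ut
    return np.array(ys), x

# Hypothetical minimal (controllable and observable) system with lag 2.
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
T_ini, T_future = 4, 5

rng = np.random.default_rng(1)
x_start = rng.standard_normal(2)
u = rng.standard_normal((T_ini + T_future, 1))
y, _ = simulate(A, B, C, D, u, x_start)
u_ini, y_ini = u[:T_ini], y[:T_ini]
u_f, y_f = u[T_ini:], y[T_ini:]

# Step 1: state at the start of the past window from y_ini = O x + T u_ini.
rhs = y_ini.ravel() - toeplitz_io(A, B, C, D, T_ini) @ u_ini.ravel()
x_hat = np.linalg.lstsq(obsv(A, C, T_ini), rhs, rcond=None)[0]

# Step 2: roll the estimate forward through the past inputs to obtain x_ini.
_, x_ini_hat = simulate(A, B, C, D, u_ini, x_hat)
_, x_ini_true = simulate(A, B, C, D, u_ini, x_start)

# Step 3: predict the future outputs from x_ini and the future inputs.
y_f_pred = (obsv(A, C, T_future) @ x_ini_hat
            + toeplitz_io(A, B, C, D, T_future) @ u_f.ravel())
print(np.allclose(x_ini_hat, x_ini_true), np.allclose(y_f_pred, y_f.ravel()))
```

The least-squares step has a unique solution here because `T_ini` is at least the lag, so the stacked observability matrix has full column rank.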

  15. LTI systems and matrix time series. Foundation of state-space subspace system ID & signal recovery algorithms. [Figure: sampled time series $u(t)$ and $y(t)$ with samples $u_1, \dots, u_7$ and $y_1, \dots, y_7$.] $(u(t), y(t))$ satisfy the recursive difference equation $b_0 u_t + b_1 u_{t+1} + \dots + b_n u_{t+n} + a_0 y_t + a_1 y_{t+1} + \dots + a_n y_{t+n} = 0$ (kernel representation) ⇒ $\begin{bmatrix} b_0 & a_0 & b_1 & a_1 & \dots & b_n & a_n \end{bmatrix}$ is in the left nullspace of the Hankel matrix $\mathcal{H}_L\binom{u}{y} = \begin{bmatrix} \binom{u_1}{y_1} & \binom{u_2}{y_2} & \binom{u_3}{y_3} & \cdots & \binom{u_{T-L+1}}{y_{T-L+1}} \\ \binom{u_2}{y_2} & \binom{u_3}{y_3} & \binom{u_4}{y_4} & \cdots & \vdots \\ \binom{u_3}{y_3} & \binom{u_4}{y_4} & \binom{u_5}{y_5} & \cdots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \binom{u_L}{y_L} & \cdots & \cdots & \cdots & \binom{u_T}{y_T} \end{bmatrix}$ (collected from data $\in \{1, \dots, T\}$); the converse ⇐ holds under assumptions.
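
The left-nullspace statement on slide 15 can be checked numerically. The following is a sketch under stated assumptions: the coefficients `b` and `a` define a hypothetical second-order SISO difference equation (not from the slides), `hankel` is an illustrative helper, and the Hankel depth is taken as `n + 1` so that one block row matches each coefficient pair.

```python
import numpy as np

def hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, q)."""
    T = w.shape[0]
    return np.column_stack([w[i:i + L].ravel() for i in range(T - L + 1)])

# Hypothetical coefficients of b0 u_t + b1 u_{t+1} + b2 u_{t+2}
#                          + a0 y_t + a1 y_{t+1} + a2 y_{t+2} = 0.
b = [1.0, 0.5, 0.0]
a = [-0.7, 1.5, -1.0]
n, T = 2, 40

rng = np.random.default_rng(2)
u = rng.standard_normal(T)
y = np.zeros(T)
y[:n] = rng.standard_normal(n)  # arbitrary initial condition
for t in range(T - n):
    # Solve the difference equation for y[t+2], using a[2] = -1.
    y[t + 2] = (b[0] * u[t] + b[1] * u[t + 1] + b[2] * u[t + 2]
                + a[0] * y[t] + a[1] * y[t + 1])

H = hankel(np.column_stack([u, y]), n + 1)  # rows ordered u_t, y_t, u_{t+1}, ...
r = np.array([b[0], a[0], b[1], a[1], b[2], a[2]])
residual = np.max(np.abs(r @ H))            # r annihilates every column of H

# Conversely, recover r (up to scaling) as the left singular vector that
# belongs to the smallest singular value, normalised so its last entry is a[2].
U, s, _ = np.linalg.svd(H)
r_hat = U[:, -1] * (a[2] / U[-1, -1])
print(f"residual: {residual:.1e}, rank: {np.linalg.matrix_rank(H)}")
```

Recovering `r_hat` from the data alone relies on the left nullspace being one-dimensional, which is the case in this example because the random input is sufficiently exciting.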

  16. The Fundamental Lemma. Definition: The signal $u = \operatorname{col}(u_1, \dots, u_T) \in \mathbb{R}^{Tm}$ is persistently exciting of order $L$ if $\mathcal{H}_L(u) = \begin{bmatrix} u_1 & \cdots & u_{T-L+1} \\ \vdots & \ddots & \vdots \\ u_L & \cdots & u_T \end{bmatrix}$ is of full row rank, i.e., if the signal is sufficiently rich and long ($T - L + 1 \ge mL$). Fundamental lemma [Willems et al, '05]: Let $T, t \in \mathbb{Z}_{>0}$. Consider • a controllable LTI system $(\mathbb{Z}_{\ge 0}, \mathbb{R}^{m+p}, \mathcal{B})$, and • a $T$-sample long trajectory $\operatorname{col}(u, y) \in \mathcal{B}_T$, where • $u$ is persistently exciting of order $t + n$. Then $\operatorname{colspan}\left(\mathcal{H}_t\binom{u}{y}\right) = \mathcal{B}_t$.
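
A small sketch of the persistency-of-excitation test from the definition on slide 16: build $\mathcal{H}_L(u)$ and check for full row rank. The function names `hankel` and `is_persistently_exciting` are illustrative, and the rank is computed numerically with NumPy's default tolerance, which is an assumption that may need tuning for noisy data.

```python
import numpy as np

def hankel(u, L):
    """Block Hankel matrix with L block rows from samples u of shape (T, m)."""
    T = u.shape[0]
    return np.column_stack([u[i:i + L].ravel() for i in range(T - L + 1)])

def is_persistently_exciting(u, L):
    """True if H_L(u) has full row rank m*L (numerical rank)."""
    u = np.asarray(u, dtype=float)
    if u.ndim == 1:
        u = u[:, None]          # treat a 1-D array as a scalar input signal
    T, m = u.shape
    if T - L + 1 < m * L:       # too few columns for full row rank
        return False
    return bool(np.linalg.matrix_rank(hankel(u, L)) == m * L)

rng = np.random.default_rng(3)
pe_random = is_persistently_exciting(rng.standard_normal(20), 5)   # rich and long
pe_constant = is_persistently_exciting(np.ones(20), 2)             # not rich
pe_short = is_persistently_exciting(rng.standard_normal(8), 5)     # not long enough
print(pe_random, pe_constant, pe_short)
```

The constant signal fails already at order 2 because every column of its Hankel matrix is identical, while the short signal fails the length condition $T - L + 1 \ge mL$ before any rank is computed.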

  17. Cartoon of Fundamental Lemma. [Figure: sampled time series $u(t)$ and $y(t)$ with samples $u_1, \dots, u_7$ and $y_1, \dots, y_7$.] Persistently exciting • controllable LTI • sufficiently many samples: $\mathcal{B}_t \equiv \operatorname{colspan} \begin{bmatrix} \binom{u_1}{y_1} & \binom{u_2}{y_2} & \binom{u_3}{y_3} & \cdots & \binom{u_{T-t+1}}{y_{T-t+1}} \\ \binom{u_2}{y_2} & \binom{u_3}{y_3} & \binom{u_4}{y_4} & \cdots & \vdots \\ \binom{u_3}{y_3} & \binom{u_4}{y_4} & \binom{u_5}{y_5} & \cdots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \binom{u_t}{y_t} & \cdots & \cdots & \cdots & \binom{u_T}{y_T} \end{bmatrix}$. All trajectories constructible from finitely many previous trajectories.
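
The cartoon can be reproduced numerically. This is a sketch only: the system matrices, the data length `T`, the window length `t`, and the helpers `hankel` and `simulate` are assumptions made for illustration. It checks that a fresh length-`t` trajectory of the same system lies in the column span of the data Hankel matrix, while a random vector generically does not.

```python
import numpy as np

def hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, q)."""
    T = w.shape[0]
    return np.column_stack([w[i:i + L].ravel() for i in range(T - L + 1)])

def simulate(A, B, C, D, u, x0):
    """Simulate a SISO system x(t+1) = A x + B u, y = C x + D u."""
    x = np.array(x0, dtype=float)
    y = np.zeros(len(u))
    for k, uk in enumerate(u):
        y[k] = C @ x + D * uk
        x = A @ x + B * uk
    return y

# Hypothetical controllable and observable system with n = 2, m = p = 1.
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([0.0, 1.0])
C = np.array([1.0, 0.0])
D = 0.0
n, m, t, T = 2, 1, 6, 50

rng = np.random.default_rng(4)
u_d = rng.standard_normal(T)  # random input, persistently exciting of order t + n
y_d = simulate(A, B, C, D, u_d, rng.standard_normal(2))
H = hankel(np.column_stack([u_d, y_d]), t)   # shape (2 t, T - t + 1)
rank_H = np.linalg.matrix_rank(H)            # expected m t + n

# A fresh trajectory from a different initial state and input ...
u_new = rng.standard_normal(t)
y_new = simulate(A, B, C, D, u_new, rng.standard_normal(2))
w_new = np.column_stack([u_new, y_new]).ravel()
g = np.linalg.lstsq(H, w_new, rcond=None)[0]
res_traj = np.linalg.norm(H @ g - w_new)     # ... is reproduced as H g

# A random vector of the same length is generically not a trajectory.
w_rand = rng.standard_normal(2 * t)
g_rand = np.linalg.lstsq(H, w_rand, rcond=None)[0]
res_rand = np.linalg.norm(H @ g_rand - w_rand)
print(rank_H, f"{res_traj:.1e}", f"{res_rand:.1e}")
```

The rank `m t + n` reflects the free choices in a length-`t` trajectory: `m t` input values plus `n` initial-state coordinates.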

  18. Consequences. Parametric state-space model: $x(t+1) = A x(t) + B u(t)$, $y(t) = C x(t) + D u(t)$. Non-parametric model from raw data: $\operatorname{colspan} \begin{bmatrix} \binom{u_1}{y_1} & \binom{u_2}{y_2} & \binom{u_3}{y_3} & \cdots \\ \binom{u_2}{y_2} & \binom{u_3}{y_3} & \binom{u_4}{y_4} & \cdots \\ \binom{u_3}{y_3} & \binom{u_4}{y_4} & \binom{u_5}{y_5} & \cdots \\ \vdots & \ddots & \ddots & \ddots \end{bmatrix}$. Now let us draw the dramatic corollaries ...
