Optimal Control and Dynamic Programming 4SC000 Q2 2017-2018 Duarte Antunes
Outline • Loop transfer recovery
Recall (slide 1)
The LQR loop for the system
$$\dot{x}(t) = Ax(t) + Bu(t), \quad u(t) \in \mathbb{R},$$
[Block diagram: feedback loop with reference 0, a "+" summing junction, and loop block $K(sI - A)^{-1}B$ with output $u(s)$]
where $K$ is such that $u(t) = Kx(t)$ is the optimal policy for the problem
$$\min_{u(t)} \int_0^\infty x(t)^\top Q x(t) + u(t)^\top R u(t)\, dt,$$
has the following properties:
- Guaranteed gain margins $[\tfrac{1}{2}, \infty)$ and phase margins $(-60^\circ, 60^\circ)$ for any matrices $Q$ and $R$.
- Guarantees on the sensitivity (<1) and complementary sensitivity (<2) for any $Q$, $R$.
- Root-square locus: place the poles of the loop by tuning $Q$, $R$.
Recall (slide 2)
By duality, the Kalman loop for the system
$$\dot{x}(t) = Ax(t) + w(t), \qquad E[w(t)w(t+\tau)^\top] = W\delta(\tau),$$
$$y(t) = Cx(t) + n(t), \qquad E[n(t)n(t+\tau)^\top] = V\delta(\tau), \qquad y(t) \in \mathbb{R},$$
[Block diagram: feedback loop with input $y(s)$, a summing junction with signs "+" and "−", and loop block $C(sI - A)^{-1}L$ with output $\hat{y}(s)$]
where $L$ is such that
$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t))$$
is the optimal estimator (in the sense of slide 11, lecture 11), has the following properties:
- Guaranteed gain margins $[\tfrac{1}{2}, \infty)$ and phase margins $(-60^\circ, 60^\circ)$ for any matrices $W$ and $V$.
- Guarantees on the sensitivity (<1) and complementary sensitivity (<2) for any $W$, $V$.
- Root-square locus: place the poles of the loop by tuning $W$, $V$.
Motivation for today's lecture (slide 3)
Since there are guaranteed frequency-domain properties for the LQR and Kalman loops (discovered by Kalman in the 1960s), it would be reasonable to search for such properties for LQG loops.
However, in 1978, John Doyle wrote a paper ("Guaranteed margins for LQG regulators") giving an example where, for some matrices $Q$, $R$, $W$, $V$, the LQG loop has arbitrarily small gain and phase margins.
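John Doyle's Example (slide 4)
Special case of John Doyle's example:
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad Q = W = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix}, \quad R = 0.01$$
[Block diagram: feedback loop with reference 0, a "+" summing junction, controller $K(sI - (A + BK - LC))^{-1}L$ with output $u(s)$, and plant $C(sI - A)^{-1}B$ with output $y(s)$]

```matlab
% Plant and weights (note: the slide lists R = 0.01, this listing uses 0.001)
A = [1 1; 0 1];
B = [0; 1];
C = [1 0];
n = size(A,1);
Q = 1*[1 1]'*[1 1];
R = 0.001;
W = [1 1]'*[1 1];
V = 1;

% LQR gain; lqr returns K for u = -K*x, the slides use u = K*x
K = lqr(A,B,Q,R); K = -K;

% Kalman gain L for process noise entering all states (inputs are [u; w])
[~,L] = kalman(ss(A,[B eye(n)],C,[0 zeros(1,n)]),W,V);

% Controller (with -K, i.e. negative-feedback convention) and plant
[numLQG,denLQG] = ss2tf(A+B*K-L*C,L,-K,0);
[numplant,denplant] = ss2tf(A,B,C,0);
tflqg = tf(numLQG,denLQG);
tfplant = tf(numplant,denplant);

% Nyquist plot of the LQG loop
nyquist(tfplant*tflqg)
```

[Figure: Nyquist diagram of the LQG loop; Real Axis from −1.2 to 0, Imaginary Axis from −0.4 to 0.4]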
Discussion (slide 5)
• While, as John Doyle's example shows, there are LQG loops whose gain and phase margins can be arbitrarily small, it is possible to avoid these cases via loop transfer recovery.
• Loop transfer recovery states that, for minimum-phase systems, the LQG loop transfer function converges to the open-loop transfer function of the Kalman filter loop when the LQR control penalty converges to zero (cheap control).
• Since the Kalman filter loop has good frequency-domain properties, LQG control tuned in this way can also have good frequency-domain properties (e.g. large gain and phase margins).
• Minimum-phase single-input single-output (SISO) systems are those whose zeros (if any) lie in the left half of the complex plane (stable zeros).
• Non-minimum-phase systems have limitations in the achievable closed-loop performance, and although there are some results for these systems as well, we will not address them here.
• We first need a preliminary result on LQR and Kalman loops; we then state the general loop transfer recovery result for model-based controllers and specialise it to LQG loops.
• For simplicity we will consider SISO systems, although the results extend to MIMO.
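Preliminary result (slide 6)
Consider the family of gains $K_\rho$ obtained from the optimal policy $u(t) = K_\rho x(t)$ for the problem
$$\min_{u(t)} \int_0^\infty x(t)^\top Q x(t) + u(t)^\top R u(t)\, dt, \qquad \dot{x}(t) = Ax(t) + Bu(t), \quad u(t) \in \mathbb{R},$$
with the weights $Q = C^\top C$ and $R = \rho$. Then
$$\lim_{\rho \to 0} \left(\sqrt{\rho}\, K_\rho\right) = -C \quad \text{or} \quad \lim_{\rho \to 0} \left(\sqrt{\rho}\, K_\rho\right) = C.$$
Note that, by duality, if $W = HH^\top$ and $V = \theta \to 0$, then the Kalman gains $L_\theta$ satisfy
$$\lim_{\theta \to 0} \left(\sqrt{\theta}\, L_\theta\right) = -H \quad \text{or} \quad \lim_{\theta \to 0} \left(\sqrt{\theta}\, L_\theta\right) = H.$$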
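The limit can be checked numerically. The following is a minimal sketch, assuming MATLAB's Control System Toolbox (`lqr`); the plant is borrowed from the special case of Doyle's example purely for illustration, and the values of `rho` are arbitrary choices, not values from the slides.

```matlab
% Sketch: check that sqrt(rho)*K_rho approaches +C or -C as rho -> 0.
% Assumption: illustrative plant (no zeros, hence minimum phase).
A = [1 1; 0 1];  B = [0; 1];  C = [1 0];
Q = C'*C;                               % state weight Q = C'*C

for rho = [1e-2 1e-4 1e-6]
    % lqr returns the gain for u = -K*x; the slides use u = K*x
    K_rho = -lqr(A, B, Q, rho);
    fprintf('rho = %g, sqrt(rho)*K_rho = [%.4f %.4f]\n', ...
            rho, sqrt(rho)*K_rho);
end
```

As `rho` decreases, the printed row vector should approach `C` up to a sign. The convergence can be slow for plants with relative degree larger than one, so fairly small values of `rho` may be needed.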
Justification (slide 7)
Start with the continuous-time algebraic Riccati equation
$$A^\top P + PA - PB\rho^{-1}B^\top P + C^\top C = 0,$$
or equivalently
$$A^\top P + PA - K^\top \rho K + C^\top C = 0,$$
and note that if $R = \rho \to 0$ then $P \to 0$, since
$$\rho A^\top P + \rho PA - PBB^\top P + \rho C^\top C = 0,$$
and then in the limit
$$\rho K^\top K \to C^\top C,$$
from which the conclusion follows.
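As a sanity check of this argument, consider a scalar instance (an illustrative assumption, not taken from the slides): $A = a$, $B = b \neq 0$, $C = c \neq 0$, with $K_\rho = -\rho^{-1} b P_\rho$. The Riccati equation and its positive solution can then be written in closed form:

$$
\begin{aligned}
& 2aP - \frac{b^2}{\rho}P^2 + c^2 = 0
\;\;\Rightarrow\;\;
P_\rho = \frac{\rho}{b^2}\left(a + \sqrt{a^2 + \frac{b^2 c^2}{\rho}}\right), \\
& K_\rho = -\frac{b P_\rho}{\rho} = -\frac{1}{b}\left(a + \sqrt{a^2 + \frac{b^2 c^2}{\rho}}\right), \\
& \sqrt{\rho}\, K_\rho = -\frac{1}{b}\left(a\sqrt{\rho} + \sqrt{a^2\rho + b^2 c^2}\right)
\;\xrightarrow[\rho \to 0]{}\; -\frac{|b|\,|c|}{b}.
\end{aligned}
$$

In this scalar case $P_\rho$ is of order $\sqrt{\rho}$, so $P_\rho \to 0$, and $\rho K_\rho^2 \to c^2$, which are the two limits used above. The limit of $\sqrt{\rho}\,K_\rho$ equals $c$ or $-c$ depending on the signs of $b$ and $c$.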
Loop transfer recovery (slide 8)
Consider a linear system
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \qquad y(t) \in \mathbb{R}, \quad u(t) \in \mathbb{R},$$
with no unstable zeros and a model-based controller, parameterized by gains $L$ and $K_\alpha$,
$$\dot{\bar{x}}(t) = A\bar{x}(t) + Bu(t) + L(y(t) - C\bar{x}(t)), \qquad u(t) = K_\alpha \bar{x}(t),$$
where:
(i) $L$ are fixed gains such that $(A - LC)$ is Hurwitz*.
(ii) $K_\alpha$ are a family of gains such that $(A + BK_\alpha)$ is Hurwitz and
$$\lim_{\alpha \to 0} \left(\sqrt{\alpha}\, K_\alpha\right) = -C \quad \text{or} \quad \lim_{\alpha \to 0} \left(\sqrt{\alpha}\, K_\alpha\right) = C.$$
Then, for each $s \in \mathbb{C}$,
$$\lim_{\alpha \to 0} C(sI - A)^{-1}B\, K_\alpha (sI - (A + BK_\alpha - LC))^{-1}L = C(sI - A)^{-1}L.$$
*A matrix is Hurwitz if all its eigenvalues have negative real part.
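A scalar instance makes the mechanism visible. This is a sketch under illustrative assumptions not taken from the slides: $A = a$, $B = b \neq 0$, $C = c \neq 0$, $L = l$ with $a - lc < 0$, and $K_\alpha = k$ with $a + bk < 0$ and $|k| \to \infty$ as $\alpha \to 0$.

$$
\begin{aligned}
C(sI - A)^{-1}B\, K_\alpha \big(sI - (A + BK_\alpha - LC)\big)^{-1} L
&= \frac{c}{s - a} \cdot \frac{b k}{s - a - bk + lc} \cdot l, \\
\frac{bk}{s - a - bk + lc} &\;\xrightarrow[|k| \to \infty]{}\; -1,
\qquad\text{so the product tends to}\quad -\frac{c\,l}{s - a}.
\end{aligned}
$$

Computed as a plain product, the limit carries a minus sign relative to $C(sI - A)^{-1}L = cl/(s - a)$. Under this reading of the block diagrams, that is consistent with the stated result: the LQG loop closes through a "+" summing junction while the Kalman loop closes through a "−" junction, so both describe the same negative-feedback loop gain $cl/(s - a)$.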
Loop transfer recovery for LQG (slide 9)
Consider a linear system
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \qquad y(t) \in \mathbb{R}, \quad u(t) \in \mathbb{R},$$
with no unstable zeros and an LQG controller
$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t)), \qquad u(t) = K_\rho \hat{x}(t),$$
where
$$K_\rho = -\rho^{-1}B^\top P_\rho, \qquad A^\top P_\rho + P_\rho A - P_\rho B\rho^{-1}B^\top P_\rho + C^\top C = 0,$$
$$L = \Phi C^\top V^{-1}, \qquad A\Phi + \Phi A^\top + W - \Phi C^\top V^{-1} C \Phi = 0.$$
Then, for each $s \in \mathbb{C}$,
$$\lim_{\rho \to 0} C(sI - A)^{-1}B\, K_\rho (sI - (A + BK_\rho - LC))^{-1}L = C(sI - A)^{-1}L.$$
Note that this is a direct consequence of the results of slides 6 and 8 (the preliminary result and the loop transfer recovery result).
Interpretation of LTR for LQG (slide 10)
• Loop transfer recovery states that, for minimum-phase systems, the LQG loop transfer function
$$K(sI - (A + BK - LC))^{-1}L \; C(sI - A)^{-1}B$$
converges to the open-loop transfer function of the Kalman filter loop
$$C(sI - A)^{-1}L$$
when $Q = C^\top C$ and the LQR control penalty $R = \rho$ converges to zero (cheap control).
[Block diagram: LQG loop with reference 0, a "+" summing junction, controller $K(sI - (A + BK - LC))^{-1}L$ with output $u(s)$, and plant $C(sI - A)^{-1}B$ with output $y(s)$]
(as $R = \rho \to 0$)
[Block diagram: Kalman loop with input $y(s)$, a summing junction with signs "+" and "−", and loop block $C(sI - A)^{-1}L$ with output $\hat{y}(s)$]
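The convergence can be observed numerically at a single test frequency. The following is a minimal sketch, assuming MATLAB's Control System Toolbox (`lqr`). The plant and noise intensities are borrowed from the special case of Doyle's example, the Kalman gain is computed through the dual LQR problem, and the test point `s` and the values of `rho` are arbitrary choices. The LQG loop is evaluated with `-K`, matching the negative-feedback convention used for the Nyquist plot of that example.

```matlab
% Sketch: compare the LQG loop and the Kalman loop at one frequency.
% Assumptions: illustrative plant (minimum phase), arbitrary test point s.
A = [1 1; 0 1];  B = [0; 1];  C = [1 0];
W = [1; 1]*[1 1];  V = 1;
n = size(A, 1);

% Kalman gain from the dual LQR problem: L = (lqr(A', C', W, V))'
L = lqr(A', C', W, V)';

s = 1j*2;                                  % test point s = j*2 rad/s
Gkal = C*((s*eye(n) - A)\L);               % Kalman loop C (sI - A)^{-1} L
Gplant = C*((s*eye(n) - A)\B);             % plant C (sI - A)^{-1} B

rhos = [1e-2 1e-4 1e-6 1e-8];
err = zeros(size(rhos));
for i = 1:numel(rhos)
    K = -lqr(A, B, C'*C, rhos(i));         % slide convention u = K*xhat
    Gctrl = K*((s*eye(n) - (A + B*K - L*C))\L);
    % -Gplant*Gctrl is the negative-feedback loop gain of the LQG loop
    err(i) = abs(-Gplant*Gctrl - Gkal);
    fprintf('rho = %g: |LQG loop - Kalman loop| = %.3e\n', rhos(i), err(i));
end
```

The printed mismatch should shrink as `rho` decreases; repeating the check over a grid of frequencies would give the same comparison as overlaying the two Nyquist plots.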
LTR/LQG design (slide 11)
Two steps:
1. Design the Kalman loop such that the desired frequency-domain properties are obtained, obtaining the gain $L$ (see lecture 11).
[Block diagram: Kalman loop with input $y(s)$, a summing junction with signs "+" and "−", and loop block $C(sI - A)^{-1}L$ with output $\hat{y}(s)$]
2. Set $Q = C^\top C$ and $R = \rho$ with $\rho$ very small, and obtain the LQR gain $K$.
Then the LQG controller
$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t)), \qquad u(t) = K\hat{x}(t)$$
will yield a closed loop with frequency-domain properties similar to those of the designed Kalman loop.
[Block diagram: LQG loop with reference 0, a "+" summing junction, controller $K(sI - (A + BK - LC))^{-1}L$ with output $u(s)$, and plant $C(sI - A)^{-1}B$ with output $y(s)$]
Interesting: design the observer first and then design a fast controller (contrary to the traditional paradigm of designing the controller first and then making the observer fast).
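Example (slide 12)
Let us use the LTR/LQG procedure to design a controller with good gain and phase margins for the following system.
Transfer function:
$$\frac{1}{(s+1)(s-2)(s+3)}$$
State-space:
$$A = \begin{bmatrix} -2 & 5 & 6 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$$
[Block diagram: feedback loop with reference 0, a "+" summing junction, an unknown controller "?" with output $u(s)$, and plant $\frac{1}{(s+1)(s-2)(s+3)}$ with output $y(s)$]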
Example (slide 13)
1. Design the Kalman loop such that the desired frequency-domain properties are obtained (gain and phase margins, sensitivity and complementary sensitivity). After tuning, these are found acceptable:
$$W = I, \qquad V = 0.1$$
[Figure: Nyquist diagram of the Kalman loop (Real Axis, Imaginary Axis)]
[Figure: three Bode diagrams, magnitude (dB) and phase (deg) versus frequency (rad/s) from $10^{-1}$ to $10^{2}$: open loop, complementary sensitivity, and sensitivity]
Example (slide 14)
2. Set $Q = C^\top C$ and $R = \rho$ with $\rho$ very small, and obtain the LQR gain $K$. Here $\rho = 1 \times 10^{-4}$.
[Figure: Nyquist diagram (Real Axis, Imaginary Axis). Blue: Kalman loop. Red: LQG loop.]
Example (slide 15)
2. Set $Q = C^\top C$ and $R = \rho$ with $\rho$ very small, and obtain the LQR gain $K$. Here $\rho = 1 \times 10^{-7}$.
[Figure: Nyquist diagram (Real Axis, Imaginary Axis). Blue: Kalman loop. Red: LQG loop.]
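The two-step procedure applied to this example can be sketched as follows. This is a minimal sketch, assuming MATLAB's Control System Toolbox and assuming that $W$ is the 3×3 identity; it is not the code used to produce the slides' figures. The Kalman gain is computed through the dual LQR problem, and the variable names are hypothetical.

```matlab
% Sketch of the LTR/LQG procedure on the example plant.
% Assumption: W is taken as the 3x3 identity; V = 0.1 as on the slide.
A = [-2 5 6; 1 0 0; 0 1 0];  B = [1; 0; 0];  C = [0 0 1];
n = size(A, 1);
W = eye(n);  V = 0.1;

% Step 1: Kalman gain (dual LQR problem) and target loop C (sI - A)^{-1} L
L = lqr(A', C', W, V)';
kalmanLoop = ss(A, L, C, 0);

% Step 2: cheap-control LQR gain with Q = C'*C and R = rho
rho = 1e-7;
K = -lqr(A, B, C'*C, rho);            % slide convention u = K*xhat

% LQG loop in negative-feedback form: plant times (-controller)
plant = ss(A, B, C, 0);
ctrl  = ss(A + B*K - L*C, L, -K, 0);

% Overlay both loops: blue for the Kalman loop, red for the LQG loop
nyquist(kalmanLoop, 'b', plant*ctrl, 'r');
legend('Kalman loop', 'LQG loop');
```

Changing `rho` to `1e-4` gives the less aggressive design of the previous slide. Gain and phase margins of either loop can then be read from the plot or computed with `margin`.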