
DISCRETE EVENT AND HYBRID SYSTEM MODELS AND METHODS FOR CYBER-PHYSICAL SYSTEMS
C. G. Cassandras, Division of Systems Engineering


  1. SELECTED REFERENCES - MODELING
Timed Automata and Timed Petri Nets
– Alur, R., and D.L. Dill, "A Theory of Timed Automata," Theoretical Computer Science, No. 126, pp. 183-235, 1994.
– Cassandras, C.G., and S. Lafortune, Introduction to Discrete Event Systems, Springer, 2008.
– Wang, J., Timed Petri Nets: Theory and Application, Kluwer Academic Publishers, Boston, 1998.
Hybrid Systems
– Bemporad, A., and M. Morari, "Control of Systems Integrating Logic, Dynamics and Constraints," Automatica, Vol. 35, No. 3, pp. 407-427, 1999.
– Branicky, M.S., V.S. Borkar, and S.K. Mitter, "A Unified Framework for Hybrid Control: Model and Optimal Control Theory," IEEE Trans. on Automatic Control, Vol. 43, No. 1, pp. 31-45, 1998.
– Cassandras, C.G., and J. Lygeros, Stochastic Hybrid Systems, Taylor and Francis, 2007.
– Grossman, R., A. Nerode, A. Ravn, and H. Rischel (Eds.), Hybrid Systems, Springer, New York, 1993.
– Hristu-Varsakelis, D., and W.S. Levine, Handbook of Networked and Embedded Control Systems, Birkhauser, Boston, 2005.
Christos G. Cassandras CODES Lab. - Boston University

  2. CONTROL AND OPTIMIZATION – CHALLENGES
1. SCALABILITY: Distributed Algorithms
2. DECENTRALIZATION
3. COMMUNICATION: Event-driven (asynchronous) Algorithms
4. NON-CONVEXITY: Global optimality, escape local optima
5. EXPLOIT DATA: Data-Driven Algorithms

  3. WHEN CAN WE DECENTRALIZE?

  4. MULTI-AGENT OPTIMIZATION: PROBLEM 1
Mission space $\Omega$; $s_i$: agent state, $i = 1,\dots,N$, with $s = [s_1,\dots,s_N]$; $O_j$: obstacle (constraint); $R(x)$: property of point $x$; $P(x,s)$: reward function.
$$\max_{s} H(s) = \int_\Omega P(x,s)\,R(x)\,dx, \qquad s_i \in F,\ i = 1,\dots,N$$
GOAL: Find the best state vector $s = [s_1,\dots,s_N]$ so that agents achieve a maximal reward from interacting with the mission space.

  5. MULTI-AGENT OPTIMIZATION: PROBLEM 2
Agents may also have dynamics:
$$\max_{u(t)} J = \int_0^T \int_\Omega P(x, s(u(t)))\,R(x)\,dx\,dt$$
$$\dot{s}_i = f_i(s_i, u_i, t), \qquad s_i(t) \in F, \quad i = 1,\dots,N$$
GOAL: Find the best state trajectories $s_i(t)$, $0 \le t \le T$, so that agents achieve a maximal reward from interacting with the mission space.

  6. WHEN CAN WE DECENTRALIZE A MULTI-AGENT PROBLEM 1?
Recall: $\max_s H(s) = \int_\Omega P(x,s)\,R(x)\,dx$, $s_i \in F$, $i = 1,\dots,N$, with
$$P(x,s) = 1 - \prod_{i=1}^{N}\bigl(1 - \hat{p}_i(x,s_i)\bigr), \qquad \hat{p}_i(x,s_i) = \begin{cases} p_i(x,s_i) & \text{if } x \in V(s_i) \\ 0 & \text{otherwise} \end{cases}$$
($V(s_i)$: region of $\Omega$ visible to agent $i$.)
Define agent $i$'s NEIGHBORHOOD: $B_i \triangleq \{k : \|s_i - s_k\| \le 2b_i,\ k = 1,\dots,N,\ k \ne i\}$

  7. OBJECTIVE FUNCTION DECOMPOSITION
THEOREM: If $P(x,s) = P(p_1,\dots,p_N)$ is a function of local reward functions $p_i$, then $H(s)$ can be expressed, for any $i = 1,\dots,N$, as
$$H(s) = H_1(s_i^L) + H_2(s_{-i})$$
where $s_i^L = [s_{i-b_i},\dots,s_i,\dots,s_{i+b_i}]$ (state of $i$ and its neighbors only) and $s_{-i} = [s_1,\dots,s_{i-1},s_{i+1},\dots,s_N]$ (state of all agents except $i$).
The theorem implies $\dfrac{\partial H(s)}{\partial s_i} = \dfrac{\partial H_1(s_i^L)}{\partial s_i}$
Distributed gradient-based algorithm: $s_i^{k+1} = s_i^k + \beta_k \dfrac{\partial H_1(s_i^L)}{\partial s_i}$

  8. OBJECTIVE FUNCTION DECOMPOSITION
Theorem 1 often applies and is easy to check in the "Problem 1" setting.
EXAMPLE: Coverage Control Problems

  9. COVERAGE: PROBLEM FORMULATION
- $N$ mobile sensors, each located at $s_i \in \mathbb{R}^2$
- Data source at $x$ emits signal with energy $E$
[figure: event density $R(x)$ (Hz / 50 m$^2$) over the mission space]
- Signal observed by sensor node $i$ (at $s_i$)
SENSING MODEL: $p_i(x, s_i) = P[\text{detected by } i \mid A(x), s_i]$, where $A(x)$ = "data source emits at $x$".
Sensing attenuation: $p_i(x, s_i)$ is monotonically decreasing in $d_i(x) \triangleq \|x - s_i\|$.

  10. COVERAGE: PROBLEM FORMULATION
Joint detection probability, assuming sensor independence ($s = [s_1,\dots,s_N]$: node locations):
$$P(x,s) = 1 - \prod_{i=1}^{N}\bigl(1 - p_i(x, s_i)\bigr)$$
OBJECTIVE: Determine locations $s = [s_1,\dots,s_N]$ to maximize the total detection probability:
$$\max_s \int_\Omega R(x)\,P(x,s)\,dx$$
Theorem 1 applies.
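The joint detection probability above can be sketched in a few lines. This is a minimal illustration on a 1-D slice of the mission space; the exponential attenuation model `p_detect` is an assumed sensing model for demonstration, not the specific one used in the talk.

```python
import math

def p_detect(x, s_i, lam=1.0):
    """Sensing probability, monotonically decreasing in d_i(x) = |x - s_i|.
    The exponential form and rate lam are illustrative assumptions."""
    return math.exp(-lam * abs(x - s_i))

def joint_detection(x, s, lam=1.0):
    """P(x, s) = 1 - prod_i (1 - p_i(x, s_i)): probability that at least one
    of the independent sensors located at s detects a source at x."""
    miss = 1.0
    for s_i in s:
        miss *= 1.0 - p_detect(x, s_i, lam)
    return 1.0 - miss

# Adding a sensor can only improve (or leave unchanged) the detection of x.
P_one = joint_detection(4.0, [2.0])
P_two = joint_detection(4.0, [2.0, 5.0])
```

A sensor placed exactly at `x` gives `p_i = 1`, so the joint probability saturates at 1 regardless of the other nodes.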

  11. DISTRIBUTED COOPERATIVE SCHEME
Set
$$H(s_1,\dots,s_N) = \int_\Omega R(x)\left[1 - \prod_{i=1}^{N}\bigl(1 - p_i(x)\bigr)\right]dx$$
Maximize $H(s_1,\dots,s_N)$ by forcing nodes to move using gradient information:
$$\frac{\partial H}{\partial s_i} = \int_\Omega R(x)\,\prod_{k \ne i}\bigl(1 - p_k(x)\bigr)\,\frac{dp_i(x)}{d\,d_i(x)}\,\frac{\partial d_i(x)}{\partial s_i}\,dx$$
Update: $s_i^{k+1} = s_i^k + \beta_k\,\partial H / \partial s_i$; desired displacement $= V \cdot \Delta t$.
Cassandras and Li, EJC, 2005; Zhong and Cassandras, IEEE TAC, 2011

  12. DISTRIBUTED COOPERATIVE SCHEME (CONTINUED)
The gradient
$$\frac{\partial H}{\partial s_i} = \int_\Omega R(x)\,\prod_{k \ne i}\bigl(1 - p_k(x)\bigr)\,\frac{dp_i(x)}{d\,d_i(x)}\,\frac{\partial d_i(x)}{\partial s_i}\,dx$$
has to be autonomously evaluated by each node so as to determine how to move to its next position: $s_i^{k+1} = s_i^k + \beta_k\,\partial H / \partial s_i$.
- Truncated $p_i(x)$: the integral is replaced by one over the node neighborhood $\Omega_i$
- Discretize $p_i(x)$ using a local grid
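The gradient-based node movement rule can be sketched end to end. Here the gradient is approximated by central differences on a 1-D mission space with a uniform $R(x)$ and an assumed exponential sensing model, so this is an illustration of the update $s_i^{k+1} = s_i^k + \beta_k\,\partial H/\partial s_i$, not the closed-form gradient used in the papers.

```python
import math

def H(s, lam=1.0, L=10.0, n_grid=200):
    """Coverage objective: integral of R(x) * (1 - prod_i (1 - p_i)) dx,
    midpoint rule on [0, L]; R(x) = 1 and exp attenuation are assumptions."""
    dx = L / n_grid
    total = 0.0
    for k in range(n_grid):
        x = (k + 0.5) * dx
        miss = 1.0
        for s_i in s:
            miss *= 1.0 - math.exp(-lam * abs(x - s_i))
        total += (1.0 - miss) * dx
    return total

def gradient_step(s, beta=0.5, h=1e-4):
    """Each node moves along its partial derivative of H (central difference)."""
    new_s = []
    for i in range(len(s)):
        up = s[:i] + [s[i] + h] + s[i + 1:]
        dn = s[:i] + [s[i] - h] + s[i + 1:]
        new_s.append(s[i] + beta * (H(up) - H(dn)) / (2 * h))
    return new_s

s = [4.0, 6.0]          # both nodes start near the center
H0 = H(s)
for _ in range(50):
    s = gradient_step(s)
H1 = H(s)               # nodes spread apart; coverage improves
```

The repulsion between overlapping sensing footprints pushes the two nodes apart, which is exactly the qualitative behavior the distributed scheme relies on.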

  13. CONTROL AND OPTIMIZATION – CHALLENGES
1. SCALABILITY: Distributed Algorithms
2. DECENTRALIZATION
3. COMMUNICATION: Event-driven (asynchronous) Algorithms
4. NON-CONVEXITY: Global optimality, escape local optima
5. EXPLOIT DATA: Data-Driven Algorithms

  14. EVENT-DRIVEN DISTRIBUTED ALGORITHMS

  15. DISTRIBUTED COOPERATIVE OPTIMIZATION
$N$ system components (processors, agents, vehicles, nodes) share one common objective:
$$\min_s H(s_1,\dots,s_N) \quad \text{s.t. constraints on } s_1,\dots,s_N$$
The problem is distributed across the nodes: each node $i$ solves
$$\min_{s_i} H(s_1,\dots,s_N) \quad \text{s.t. constraints on each } s_i$$

  16. DISTRIBUTED COOPERATIVE OPTIMIZATION
Controllable state $s_i$, $i = 1,\dots,N$:
$$s_i(k+1) = s_i(k) + \alpha\, d_i(s(k))$$
with step size $\alpha$ and update direction $d_i$, usually $d_i(s(k)) = -\nabla_i H(s(k))$.
Evaluating $d_i$ requires knowledge of all $s_1,\dots,s_N$: inter-node communication.
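The update rule above can be sketched on a toy objective. The quadratic terms, target values `c`, and mean-coupling term below are illustrative assumptions; the coupling is there precisely to show why each node's direction $d_i$ depends on the whole state vector, which is what forces inter-node communication.

```python
def grad_H(s, c=(1.0, -2.0, 3.0), mu=0.1):
    """Update-direction components: 2(s_i - c_i) + 2*mu*(s_i - mean(s)).
    The mean(s) term couples every node to ALL others."""
    m = sum(s) / len(s)
    return [2 * (si - ci) + 2 * mu * (si - m) for si, ci in zip(s, c)]

def synchronous_descent(s, alpha=0.1, iters=200):
    """s_i(k+1) = s_i(k) + alpha * d_i(s(k)) with d_i = -grad_i."""
    for _ in range(iters):
        d = [-g for g in grad_H(s)]      # every node reads the whole state s
        s = [si + alpha * di for si, di in zip(s, d)]
    return s

s_star = synchronous_descent([0.0, 0.0, 0.0])   # converges to the fixed point
```

At the fixed point every component of `grad_H` vanishes, i.e. $d_i(s^*) = 0$ for all $i$.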

  17. SYNCHRONIZED (TIME-DRIVEN) COOPERATION
All nodes COMMUNICATE + UPDATE at common clock ticks.
Drawbacks:
- Excessive communication (critical in wireless settings!)
- Faster nodes have to wait for slower ones
- Clock synchronization infeasible
- Bandwidth limitations
- Security risks

  18. ASYNCHRONOUS COOPERATION
Nodes are not synchronized; delayed information is used:
$$s_i(k+1) = s_i(k) + \alpha\, d_i(s(k))$$
converges, provided the update frequency of each node is bounded, plus technical conditions (Bertsekas and Tsitsiklis, 1997).

  19. ASYNCHRONOUS (EVENT-DRIVEN) COOPERATION
- UPDATE at node $i$: locally determined, arbitrary (possibly periodic)
- COMMUNICATE from node $i$: only when absolutely necessary

  20. WHEN SHOULD A NODE COMMUNICATE?
Node state at any time $t$: $x_i(t)$. Node state at update time $t_k$: $s_i(k) = x_i(t_k)$.
$\hat{x}^i_j(t)$: node $j$'s state as estimated by node $i$.
Estimate examples:
- Most recent value: $\hat{x}^i_j(t) = x_j(t^j_k)$ for $t \ge t^j_k$
- Linear prediction: $\hat{x}^i_j(t) = x_j(t^j_k) + \alpha\, d_j\bigl(x(t^j_k)\bigr)\,(t - t^j_k)$
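The two estimate examples can be sketched directly. All numerical values are illustrative; the point is that a linear prediction tracks a node moving with a constant update direction exactly, while the frozen "most recent value" estimate drifts.

```python
def estimate_most_recent(x_last):
    """Most recent value: the estimate stays frozen at the last received state."""
    return lambda t: x_last

def estimate_linear(x_last, d_last, alpha, t_k):
    """Linear prediction: x_last + alpha * d_last * (t - t_k)."""
    return lambda t: x_last + alpha * d_last * (t - t_k)

# Node i actually moves with constant direction d = 2.0, step size 0.5,
# starting from x = 1.0 at t_k = 0 (assumed dynamics for illustration).
true_state = lambda t: 1.0 + 0.5 * 2.0 * t
frozen = estimate_most_recent(1.0)
linear = estimate_linear(1.0, 2.0, 0.5, 0.0)

err_frozen = abs(true_state(3.0) - frozen(3.0))   # grows with t
err_linear = abs(true_state(3.0) - linear(3.0))   # exact for constant d
```

A better estimator means the error function crosses the communication threshold later, so fewer messages are sent.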

  21. WHEN SHOULD A NODE COMMUNICATE?
At any time $t$, $\hat{x}^j_i(t)$ is node $i$'s state as estimated by node $j$.
- If node $i$ knows how $j$ estimates its state, then it can evaluate $\hat{x}^j_i(t)$ itself.
- Node $i$ uses its own true state $x_i(t)$ and the estimate $\hat{x}^j_i(t)$ that $j$ uses, and evaluates an ERROR FUNCTION $g\bigl(x_i(t), \hat{x}^j_i(t)\bigr)$.
Error function examples: $\|x_i(t) - \hat{x}^j_i(t)\|_1$, $\|x_i(t) - \hat{x}^j_i(t)\|_2$

  22. WHEN SHOULD A NODE COMMUNICATE?
Compare the ERROR FUNCTION $g\bigl(x_i(t), \hat{x}^j_i(t)\bigr)$ to a THRESHOLD $\delta_i$: node $i$ communicates its state to node $j$ only when it detects that its true state $x_i(t)$ deviates from $j$'s estimate of it, so that
$$g\bigl(x_i(t), \hat{x}^j_i(t)\bigr) \ge \delta_i$$
This is Event-Driven Control.
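The threshold rule can be sketched as a small simulation. The drift signal, sampling, and threshold value are illustrative assumptions; the estimate held by node $j$ is the "most recent value" type from slide 20.

```python
def simulate(xs, delta):
    """Return the sample indices at which node i must communicate: node i
    replays j's estimate (last received value) and transmits whenever the
    error function |x_i - x_hat| reaches the threshold delta."""
    x_hat = xs[0]                  # j's estimate: most recent received value
    comms = []
    for t, x in enumerate(xs):
        if abs(x - x_hat) >= delta:    # error function crosses threshold
            x_hat = x                  # communicate: j's estimate refreshed
            comms.append(t)
    return comms

# A slowly drifting state sampled 50 times: with this threshold only a few
# messages are sent, versus one per step under time-driven synchronization.
xs = [0.1 * k for k in range(50)]
events = simulate(xs, delta=0.95)
```

The communication count scales with how fast the state drifts relative to the threshold, not with the sampling rate.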

  23. THRESHOLD PROCESS
Update direction: usually $d_i(s(k)) = -\nabla_i H(s(k))$, with threshold gain $K > 0$:
$$\delta_i(k) = \begin{cases} K\,\|d_i(s(k))\| & \text{if } k \in C_i \\ \delta_i(k-1) & \text{otherwise} \end{cases}, \qquad \delta_i(0) = K\,\|d_i(s(0))\|$$
Intuition: near convergence $\|d_i(s(k))\|$ is small, so the threshold shrinks and better estimates are needed.

  24. CONVERGENCE
Asynchronous distributed state update process at each node $i$:
$$s_i(k+1) = s_i(k) + \alpha\, d_i(\hat{s}(k))$$
where $\hat{s}(k)$ collects the estimates of the other nodes' states evaluated by node $i$, and
$$\delta_i(k) = \begin{cases} K\,\|d_i(\hat{s}(k))\| & \text{if } i \text{ sends an update at } k \\ \delta_i(k-1) & \text{otherwise} \end{cases}$$

  25. CONVERGENCE (CONTINUED)
ASSUMPTION 1: There exists a positive integer $B$ such that for all $i = 1,\dots,N$ and $k \ge 0$, at least one of the elements of the set $\{k-B+1, k-B+2, \dots, k\}$ belongs to $C_i$.
INTERPRETATION: Each node updates its state at least once during a period in which $B$ state update events take place (no time bound).
ASSUMPTION 2: The objective function $H : \mathbb{R}^m \to \mathbb{R}$, $m = \sum_{i=1}^{N} n_i$, satisfies:
(a) $H(s) \ge 0$ for all $s \in \mathbb{R}^m$;
(b) $H(\cdot)$ is continuously differentiable and $\nabla H$ is Lipschitz continuous, i.e., there exists $K_1$ such that for all $x, y \in \mathbb{R}^m$, $\|\nabla H(x) - \nabla H(y)\| \le K_1 \|x - y\|$.

  26. CONVERGENCE (CONTINUED)
ASSUMPTION 3: There exist positive constants $K_2$, $K_3$ such that for all $i = 1,\dots,N$ and $k \in C_i$:
(a) $d_i(k) \cdot \nabla_i H(s(k)) \le -\|d_i(k)\|^2 / K_3$;
(b) $\|d_i(k)\| \le K_2\, \|\nabla_i H(s(k))\|$.
NOTE: Very mild condition, immediately satisfied with $K_2 = K_3 = 1$ when we use the usual update direction $d_i(k) = -\nabla_i H(s(k))$.
ASSUMPTION 4: There exists a positive constant $K_4$ such that the ERROR FUNCTION satisfies
$$\|x_i(t) - \hat{x}^j_i(t)\| \le K_4\, g\bigl(x_i(t), \hat{x}^j_i(t)\bigr)$$
NOTE: Very mild condition, immediately satisfied with $K_4 = 1$ when we use the common choice $g\bigl(x_i(t), \hat{x}^j_i(t)\bigr) = \|x_i(t) - \hat{x}^j_i(t)\|$.

  27. CONVERGENCE (CONTINUED)
THEOREM: Under A1-A4, there exist positive constants $\alpha$ and $K$ such that
$$\lim_{k \to \infty} \nabla H(s(k)) = 0$$
Zhong and Cassandras, IEEE TAC, 2010
INTERPRETATION:
- Event-driven optimization is achievable with reduced communication requirements, hence energy savings
- No loss of performance

  28. CONVERGENCE (CONTINUED)
THEOREM: Under A1-A4, there exist positive constants $\alpha$ and $K$ such that $\lim_{k \to \infty} \nabla H(s(k)) = 0$.
BYPRODUCT OF PROOF: it yields the largest possible threshold gain $K$, and hence the smallest possible number of communication events. The admissible step size satisfies $0 < \alpha < 2/(K_1 K_3)$, and the bound on $K$ grows with the margin $2/(K_1 K_3) - \alpha$ while shrinking with $K_4$, the asynchrony bound $B$, and the state dimension $m$ (so the communication frequency scales with the network dimension).

  29. CONVERGENCE WHEN DELAYS ARE PRESENT
[figure: trajectories of the error function $g(x_i, \hat{x}^j_i)$ against the threshold $\delta_i(k)$ over the communication event times $t^{ij}_0, t^{ij}_1, t^{ij}_2, \dots$; with NO DELAY the error is reset at each communication event; with DELAY the error seen at $j$ (black curve) lags the true error at $i$ (red curve), so threshold crossings and resets occur later]

  30. CONVERGENCE WHEN DELAYS ARE PRESENT
Add a boundedness assumption:
ASSUMPTION 5: There exists a non-negative integer $D$ such that if a message is sent before $t_{k-D}$ from node $i$ to node $j$, it will be received before $t_k$.
INTERPRETATION: At most $D$ state update events can occur between a node sending a message and all destination nodes receiving this message.
THEOREM: Under A1-A5, there exist positive constants $\alpha$ and $K$ such that $\lim_{k \to \infty} \nabla H(s(k)) = 0$.
NOTE: The requirements on $\alpha$ and $K$ depend on $D$, and they are tighter.
Zhong and Cassandras, IEEE TAC, 2010

  31. SYNCHRONOUS vs. ASYNCHRONOUS
OPTIMAL COVERAGE PERFORMANCE: energy savings + extended lifetime.
[figures: number of communication events, synchronous vs. asynchronous, for a deployment problem with obstacles; both achieve optimality in a problem with obstacles]

  32. OPTIMAL COVERAGE IN A MAZE http://www.bu.edu/codes/research/distributed-control/ Zhong and Cassandras, 2008

  33. DEMO: OPTIMAL DISTRIBUTED DEPLOYMENT WITH OBSTACLES – SIMULATED AND REAL

  34. IT IS HARD TO DECENTRALIZE PROBLEM 2 … MORE ON THAT LATER …

  35. CONTROL AND OPTIMIZATION – CHALLENGES
1. SCALABILITY: Distributed Algorithms
2. DECENTRALIZATION
3. COMMUNICATION: Event-driven (asynchronous) Algorithms
4. NON-CONVEXITY: Global optimality, escape local optima
5. EXPLOIT DATA: Data-Driven Algorithms

  36. DATA-DRIVEN + EVENT-DRIVEN ALGORITHMS

  37. DATA-DRIVEN STOCHASTIC OPTIMIZATION
GOAL: $\max_\theta E[L(\theta)]$
Loop: a CONTROL/DECISION law (parameterized by $\theta$) drives the SYSTEM (subject to NOISE), producing the PERFORMANCE $E[L(\theta)]$; a GRADIENT ESTIMATOR fed by REAL-TIME DATA $x(t)$, $L(\theta)$ closes the loop through the update
$$\theta_{n+1} = \theta_n + \eta_n\, \nabla L(\theta_n)$$
MDP example: $\max_{u(t,\theta) \in U} E\left[\int_0^\infty e^{-t}\, c\bigl(x(t,\theta), u(t,\theta)\bigr)\,dt\right]$
DIFFICULTIES:
- $E[L(\theta)]$ is NOT available in closed form
- $\nabla L(\theta)$ is not easy to evaluate
- $L(\theta)$ may not be a good estimate of $E[L(\theta)]$

  38. DATA-DRIVEN STOCHASTIC OPTIMIZATION IN DES: INFINITESIMAL PERTURBATION ANALYSIS (IPA)
Same loop, with the sample path generated by a Discrete Event System (DES) model, control/decision parameterized by $\theta$, performance $E[L(\theta)]$; IPA processes the observed data $x(t)$, $L(\theta)$ to produce the gradient estimate used in
$$\theta_{n+1} = \theta_n + \eta_n \left.\frac{dL}{d\theta}\right|_{\theta_n}$$
For many (but NOT all) DES:
- Unbiased estimators
- General distributions
- Simple on-line implementation

  39. REAL-TIME STOCHASTIC OPTIMIZATION: HYBRID SYSTEMS
Same loop, with the sample path generated by a HYBRID SYSTEM (control/decision parameterized by $\theta$, performance $E[L(\theta)]$, IPA producing the gradient estimate): a general framework for an IPA theory in Hybrid Systems.

  40. PERFORMANCE OPTIMIZATION AND IPA
Performance metric (objective function): $J\bigl(\theta; x(\theta,0), T\bigr) = E\bigl[L\bigl(\theta; x(\theta,0), T\bigr)\bigr]$
IPA goal:
- Obtain unbiased estimates of $dJ/d\theta$, normally through $dL/d\theta$
- Then: $\theta_{n+1} = \theta_n + \eta_n \left.\dfrac{dL(\theta)}{d\theta}\right|_{\theta_n}$
NOTATION: $x'(t) \triangleq \dfrac{\partial x(\theta,t)}{\partial \theta}$, $\quad \tau_k' \triangleq \dfrac{\partial \tau_k}{\partial \theta}$

  41. THE IPA CALCULUS

  42. IPA: THREE FUNDAMENTAL EQUATIONS
System dynamics over $(\tau_k(\theta), \tau_{k+1}(\theta)]$: $\dot{x} = f_k(x, \theta, t)$
NOTATION: $x'(t) \triangleq \partial x(\theta,t)/\partial\theta$, $\tau_k' \triangleq \partial\tau_k/\partial\theta$
1. Continuity at events: $x(\tau_k^+) = x(\tau_k^-)$. Taking $d/d\theta$:
$$x'(\tau_k^+) = x'(\tau_k^-) + \bigl[f_{k-1}(\tau_k^-) - f_k(\tau_k^+)\bigr]\tau_k'$$
If there is no continuity, use the reset condition $x'(\tau_k^+) = \dfrac{d\,r(\theta, \tau_k, x)}{d\theta}$, where $r(\cdot)$ is the reset function.

  43. IPA: THREE FUNDAMENTAL EQUATIONS
2. Take $d/d\theta$ of the system dynamics $\dot{x} = f_k(x, \theta, t)$ over $(\tau_k(\theta), \tau_{k+1}(\theta)]$:
$$\frac{d\,x'(t)}{dt} = \frac{\partial f_k(t)}{\partial x}\, x'(t) + \frac{\partial f_k(t)}{\partial \theta}$$
Solving over $(\tau_k(\theta), \tau_{k+1}(\theta)]$:
$$x'(t) = e^{\int_{\tau_k}^{t} \frac{\partial f_k(u)}{\partial x} du}\left[\int_{\tau_k}^{t} \frac{\partial f_k(v)}{\partial \theta}\, e^{-\int_{\tau_k}^{v} \frac{\partial f_k(u)}{\partial x} du}\, dv + x'(\tau_k^+)\right]$$
NOTE: If there are no events (pure time-driven system), IPA reduces to this equation.
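The variational equation above can be checked numerically on a one-dimensional, event-free example. With $\dot{x} = f(x,\theta) = -\theta x$ we have $\partial f/\partial x = -\theta$ and $\partial f/\partial\theta = -x$, and the closed form $dx/d\theta = -t\,x_0\,e^{-\theta t}$ is known, so the propagated sensitivity can be compared against it. Forward Euler and the particular constants are illustrative choices.

```python
import math

def propagate(theta, x0, T=1.0, n=20000):
    """Co-integrate the state x and its sensitivity x' = dx/dtheta using
    d/dt x'(t) = (df/dx) x'(t) + df/dtheta (forward Euler)."""
    dt = T / n
    x, sens = x0, 0.0        # x'(0) = 0 since x(0) does not depend on theta
    for _ in range(n):
        dx = -theta * x
        dsens = -theta * sens - x     # (df/dx) * x' + df/dtheta
        x += dt * dx
        sens += dt * dsens
    return x, sens

theta, x0, T = 0.7, 2.0, 1.0
x_T, sens_T = propagate(theta, x0, T)
exact = -T * x0 * math.exp(-theta * T)   # analytic dx(T)/dtheta
```

The propagated sensitivity matches the analytic derivative to Euler accuracy, which is the content of equation 2 when no events intervene.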

  44. IPA: THREE FUNDAMENTAL EQUATIONS
3. Get $\tau_k'$ depending on the event type:
- Exogenous event: by definition, $\tau_k' = 0$
- Endogenous event: occurs when $g_k\bigl(x(\theta, \tau_k), \theta\bigr) = 0$, giving (provided $\frac{\partial g_k}{\partial x} f_{k-1}(\tau_k^-) \ne 0$)
$$\tau_k' = -\left[\frac{\partial g_k}{\partial x}\, f_{k-1}(\tau_k^-)\right]^{-1}\left(\frac{\partial g_k}{\partial \theta} + \frac{\partial g_k}{\partial x}\, x'(\tau_k^-)\right)$$
- Induced events: $\tau_k$ is triggered by the occurrence of another event at an earlier time, and $\tau_k'$ is obtained from the derivative $y'$ of the associated timer state $y(t)$.

  45. IPA: THREE FUNDAMENTAL EQUATIONS
Ignoring resets and induced events (recall $x'(t) \triangleq \partial x(\theta,t)/\partial\theta$, $\tau_k' \triangleq \partial\tau_k/\partial\theta$):
1. $x'(\tau_k^+) = x'(\tau_k^-) + \bigl[f_{k-1}(\tau_k^-) - f_k(\tau_k^+)\bigr]\tau_k'$
2. $x'(t) = e^{\int_{\tau_k}^{t} \frac{\partial f_k(u)}{\partial x} du}\left[\int_{\tau_k}^{t} \frac{\partial f_k(v)}{\partial \theta}\, e^{-\int_{\tau_k}^{v} \frac{\partial f_k(u)}{\partial x} du}\, dv + x'(\tau_k^+)\right]$
3. $\tau_k' = 0$ (exogenous) or $\tau_k' = -\left[\frac{\partial g_k}{\partial x} f_{k-1}(\tau_k^-)\right]^{-1}\left(\frac{\partial g_k}{\partial \theta} + \frac{\partial g_k}{\partial x}\, x'(\tau_k^-)\right)$ (endogenous)
Cassandras et al., Europ. J. Control, 2010

  46. IPA PROPERTIES
Back to the performance metric: $L = \sum_{k=0}^{N} \int_{\tau_k}^{\tau_{k+1}} L_k(x, \theta, t)\,dt$
NOTATION: $L_k'(x, \theta, t) \triangleq \partial L_k(x, \theta, t)/\partial\theta$
Then:
$$\frac{dL}{d\theta} = \sum_{k=0}^{N}\left[L_k(\tau_{k+1})\,\tau_{k+1}' - L_k(\tau_k)\,\tau_k' + \int_{\tau_k}^{\tau_{k+1}} \frac{\partial}{\partial\theta} L_k(x, \theta, t)\,dt\right]$$
(first two terms: what happens at event times; integral: what happens between event times)

  47. IPA PROPERTY 1: ROBUSTNESS
THEOREM 1: If either 1 or 2 holds, then $dL(\theta)/d\theta$ depends only on information available at event times $\tau_k$:
1. $L(x, \theta, t)$ is independent of $t$ over $[\tau_k(\theta), \tau_{k+1}(\theta))$ for all $k$;
2. $L(x, \theta, t)$ is only a function of $x$, and for all $t$ over $[\tau_k(\theta), \tau_{k+1}(\theta))$:
$$\frac{d}{dt}\frac{\partial L}{\partial x} = \frac{d}{dt}\frac{\partial f_k}{\partial x} = \frac{d}{dt}\frac{\partial f_k}{\partial \theta} = 0$$
IMPLICATION:
- Performance sensitivities can be obtained from information limited to event times, which is easily observed
- No need to track the system in between events!

  48. IPA PROPERTY 1: ROBUSTNESS
EXAMPLE WHERE THEOREM 1 APPLIES (simple tracking problem):
$$\min_\theta E\left[\frac{1}{T}\int_0^T \bigl(x(t) - g(\theta)\bigr)\,dt\right] \quad \text{s.t.} \quad \dot{x}_k = a_k x_k(t) + u_k(\theta_k) + w_k(t), \quad k = 1,\dots,N$$
NOTE: THEOREM 1 provides sufficient conditions only. IPA still depends on information limited to event times if
$$\dot{x}_k = a_k x_k(t) + u_k(\theta_k, t) + w_k(t), \quad k = 1,\dots,N$$
for "nice" functions $u_k(\theta_k, t)$, e.g., $b_k \theta_k t$.

  49. IPA PROPERTY 1: ROBUSTNESS
$$\dot{x} = f(x, u, w, t; \theta), \qquad \text{events at } \tau_k, \tau_{k+1}, \dots$$
Evaluating $x(t;\theta)$ requires full knowledge of $w$ and of the values of $f$ (obvious).
However, $\dfrac{dx(t;\theta)}{d\theta}$ may be independent of the values of $w$ and $f$ (NOT obvious). It often depends only on:
- the event times $\tau_k$
- possibly the dynamics values $f_{k-1}(\tau_k)$ at those times

  50. IPA PROPERTY 2: DECOMPOSABILITY
THEOREM 2: Suppose an endogenous event occurs at $\tau_k$ with switching function $g(x, \theta)$.
If $f_k(\tau_k^+) = 0$, then $x'(\tau_k^+)$ is independent of $f_{k-1}$.
If, in addition, $\dfrac{dg}{d\theta} = 0$, then $x'(\tau_k^+) = 0$.
IMPLICATION: Performance sensitivities are often reset to 0, so a sample path can be conveniently decomposed.

  51. IPA PROPERTY 3: SCALABILITY IPA scales with the EVENT SET, not the STATE SPACE ! As a complex system grows with the addition of more states, the number of EVENTS often remains unchanged or increases at a much lower rate. EXAMPLE: A queueing network may become very large, but the basic events used by IPA are still “arrival” and “departure” at different nodes . IPA estimators are EVENT-DRIVEN

  52. IPA PROPERTIES
In many cases:
- No need for a detailed model (captured by $f_k$) to describe state behavior in between events.
- This explains why simple abstractions of a complex stochastic system can be adequate to perform sensitivity analysis and optimization, as long as event times are accurately observed and local system behavior at these event times can also be measured.
- This is true in abstractions of DES as hybrid systems, since common performance metrics (e.g., workload) satisfy THEOREM 1.

  53. WHAT IS THE RIGHT ABSTRACTION LEVEL?
TOO FAR… model not detailed enough. TOO CLOSE… too much undesirable detail. JUST RIGHT… good model.
CREDIT: W.B. Gong

  54. A SMART CITY CPS APPLICATION: ADAPTIVE TRAFFIC LIGHT CONTROL

  55. TRAFFIC LIGHT CONTROL - BACKGROUND
A basic binary switching control (GREEN – RED) problem with a long history…
• Mixed Integer Linear Programming (MILP) [Dujardin et al., 2011]
• Extended Linear Complementarity Problem (ELCP) [De Schutter, 1999]
• MDP and Reinforcement Learning [Yu et al., 2006]
• Game Theory [Alvarez et al., 2010]
• Evolutionary algorithms [Taale et al., 1998]
• Fuzzy Logic [Murat et al., 2005]
• Expert Systems [Findler and Stapp, 1992]
• Perturbation Analysis

  56. TRAFFIC LIGHT CONTROL - BACKGROUND
• Perturbation Analysis, single intersection: [Panayiotou et al., 2005], [Geng and Cassandras, 2012]
Use a Hybrid System Model: the Stochastic Flow Model (SFM). The vehicle queue of the DES is abstracted into an SFM: aggregate states into modes and keep only the events causing mode transitions.

  57. SINGLE-INTERSECTION MODEL
Traffic light control: $\theta = [\theta_1, \theta_2, \theta_3, \theta_4]$, where $\theta_n$ is the GREEN light cycle at queue $n = 1, 2, 3, 4$.
OBJECTIVE: Determine $\theta$ to minimize the total weighted vehicle queues:
$$\min_\theta J_T(\theta) = \frac{1}{T}\, E\left[\int_0^T \sum_{n=1}^{4} w_n\, x_n(\theta, t)\,dt\right]$$

  58. SINGLE-INTERSECTION MODEL
$$\min_\theta J_T(\theta) = \frac{1}{T}\, E\left[\int_0^T \sum_{n=1}^{4} w_n\, x_n(\theta, t)\,dt\right] = E\bigl[L_T(\theta)\bigr]$$
IPA APPROACH:
- Observe events and event times; estimate $dJ_T(\theta)/d\theta$ through $dL_T(\theta)/d\theta$
- Then: $\theta_{n+1} = \theta_n + \eta_n \left.\dfrac{dL_T(\theta)}{d\theta}\right|_{\theta_n}$

  59. HYBRID SYSTEM STATE DYNAMICS
GREEN light "clock" $z_n(t)$ for queue $n$: it runs at rate 1 while queue $n$'s GREEN phase is active and is reset when the cycle completes:
$$\dot{z}_n(t) = \begin{cases} 1 & \text{if } 0 < z_n(t) \text{ or } z_{\bar n}(t) = \theta_{\bar n} \\ 0 & \text{otherwise} \end{cases}, \qquad z_n(t^+) = 0 \ \text{ if } z_n(t) = \theta_n$$
($\bar n$: the opposing queue, whose GREEN phase ending triggers queue $n$'s.) Control: the GREEN light cycle $\theta_n$.

  60. HYBRID SYSTEM STATE DYNAMICS
[RESOURCE DYNAMICS] Define GREEN at queue $n$:
$$G_n(t) = \begin{cases} 1 & \text{if } 0 < z_n(t) \text{ or } z_{\bar n}(t) = \theta_{\bar n} \\ 0 & \text{otherwise} \end{cases}$$
[USER DYNAMICS] Queue content $x_n(t)$, with vehicle arrival rate process $\alpha_n(t)$ and vehicle departure rate process $\beta_n(t)$:
$$\dot{x}_n(t) = \begin{cases} \alpha_n(t) & \text{if } G_n(z, \theta) = 0 \\ 0 & \text{if } x_n(t) = 0 \text{ and } \alpha_n(t) \le \beta_n(t) \\ \alpha_n(t) - \beta_n(t) & \text{otherwise} \end{cases}$$
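The SFM queue dynamics can be sketched for a single queue. The constant arrival/departure rates, the 60 s cycle, and the GREEN-first phase ordering are illustrative assumptions; the point is the mode switching between "RED: arrivals only", "GREEN and empty", and "GREEN: net drain".

```python
def simulate_queue(theta_green, alpha=0.3, beta=1.0, T=120.0, dt=0.01, cycle=60.0):
    """Integrate x_n(t) over [0, T]; the light is GREEN for the first
    theta_green seconds of each cycle. Returns the time-average queue."""
    x, cost, t = 0.0, 0.0, 0.0
    while t < T:
        green = (t % cycle) < theta_green
        if green:
            # empty queue with enough service capacity stays empty
            rate = 0.0 if (x <= 0.0 and alpha <= beta) else alpha - beta
        else:
            rate = alpha                    # RED: arrivals only
        x = max(0.0, x + rate * dt)
        cost += x * dt
        t += dt
    return cost / T

# A longer GREEN share drains the queue sooner and lowers the mean queue.
J_short = simulate_queue(theta_green=15.0)
J_long = simulate_queue(theta_green=30.0)
```

With these rates a 15 s GREEN cannot clear what accumulates during RED, so the average queue is much larger than with a 30 s GREEN, which is the kind of sensitivity the IPA-driven controller exploits.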

  61. EVENTS IN THE TLC MODEL
- Event G2R: GREEN light switches to RED (endogenous)
- Event R2G: RED light switches to GREEN (endogenous)
- Event E: Non-Empty-Period (NEP) ends (endogenous)
- Event S: Non-Empty-Period (NEP) starts (endogenous or exogenous)

  62. APPLY IPA EQUATIONS FOR θ AND τ VECTORS
FOR EXAMPLE: an endogenous event (end of a NEP) with switching function $g\bigl(x(\theta, \tau_k), \theta\bigr) = x_n(\theta, \tau_k) = 0$ gives
$$\tau_k' = -\frac{x_n'(\tau_k^-)}{\alpha_n(\tau_k) - \beta_n(\tau_k)}$$
and, substituting into $x'(\tau_k^+) = x'(\tau_k^-) + \bigl[f_{k-1}(\tau_k^-) - f_k(\tau_k^+)\bigr]\tau_k'$ with $f_{k-1} = \alpha_n - \beta_n$ and $f_k = 0$,
$$x_n'(\tau_k^+) = 0$$
The perturbation in queue $n$ is RESET to 0 when a NEP ends.

  63. COST DERIVATIVE IN THE mth NEP
Contribution of the $m$th NEP of queue $n$, over $[\xi_{n,m}(\theta), \eta_{n,m}(\theta)]$:
$$L_{n,m}'(\theta) = \frac{d}{d\theta}\int_{\xi_{n,m}(\theta)}^{\eta_{n,m}(\theta)} x_n(\theta, t)\,dt$$
NOTES:
- Need only TIMERS, COUNTERS, and state derivatives
- Scalable in the number of EVENTS, not states!

  64. TYPICAL SIMULATION RESULTS
9-fold cost reduction; when the traffic pattern changes, the controller adapts.

  65. IT IS HARD TO DECENTRALIZE PROBLEM 2 …

  66. MULTI-AGENT OPTIMIZATION: PROBLEM 2
Agents may also have dynamics:
$$\max_{u(t)} J = \int_0^T \int_\Omega P(x, s(u(t)))\,R(x)\,dx\,dt$$
$$\dot{s}_i = f_i(s_i, u_i, t), \qquad s_i(t) \in F, \quad i = 1,\dots,N$$
GOAL: Find the best state trajectories $s_i(t)$, $0 \le t \le T$, so that agents achieve a maximal reward from interacting with the mission space.

  67. PERSISTENT MONITORING PROBLEM
GOAL: Find the best state trajectories $s_i(t)$, $0 \le t \le T$, so that agents achieve a maximal reward from interacting with the mission space.
Need three model elements:
1. ENVIRONMENT MODEL: $\max_{u(t)} J = \int_0^T \int_\Omega P(x, s(u(t)))\,R(x)\,dx\,dt$
2. SENSING MODEL (how agents interact with the environment)
3. AGENT MODEL: $\dot{s}_i = f_i(s_i, u_i, t)$, $i = 1,\dots,N$

  68. PERSISTENT MONITORING PROBLEM
Start with a 1-dimensional mission space $\Omega = [0, L]$.
AGENT DYNAMICS: $\dot{s}_j = u_j$, $|u_j(t)| \le 1$
The analysis still holds for: $\dot{s}_j = g_j(s_j) + b_j u_j$, $|u_j(t)| \le 1$

  69. PERSISTENT MONITORING PROBLEM
SENSING MODEL: $p(x, s)$ = probability that an agent at $s(t)$ senses point $x$.

  70. PERSISTENT MONITORING PROBLEM
ENVIRONMENT MODEL: Associate to each point $x$ an Uncertainty Function $R(x, t)$. Use:
$$\dot{R}(x,t) = \begin{cases} 0 & \text{if } R(x,t) = 0 \text{ and } A(x) \le B\,p(x, s(t)) \\ A(x) - B\,p(x, s(t)) & \text{otherwise} \end{cases}$$
If $x$ is a known "target": $\dot{R}_x(t) = f_x(R_x, s, t) + \text{noise}$

  71. PERSISTENT MONITORING PROBLEM
Partition the mission space $\Omega = [0, L]$ into $M$ intervals $\Omega_1, \dots, \Omega_M$. For each interval $i = 1,\dots,M$, define the Uncertainty Function $R_i(t)$:
$$\dot{R}_i(t) = \begin{cases} 0 & \text{if } R_i(t) = 0 \text{ and } A_i \le B\,P_i(s(t)) \\ A_i - B\,P_i(s(t)) & \text{otherwise} \end{cases}$$
$$P_i(s) = 1 - \prod_{j=1}^{N}\bigl(1 - p_i(s_j)\bigr)$$
where $P_i(s)$ is the joint probability that interval $i$ is sensed by the agents located at $s = [s_1,\dots,s_N]$.
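The uncertainty dynamics for one interval can be sketched directly. The on/off sensing model with range `r` and the rate constants `A`, `B` are illustrative assumptions; the code implements the two-branch dynamics above, including the clamp that keeps $R_i = 0$ while the interval is adequately covered.

```python
def P_i(center, s, r=1.0):
    """Joint probability that the interval at `center` is sensed by agents at
    positions s, with an assumed on/off sensing footprint of radius r."""
    miss = 1.0
    for sj in s:
        p = 1.0 if abs(sj - center) <= r else 0.0
        miss *= 1.0 - p
    return 1.0 - miss

def step_R(R, center, s, A=0.5, B=2.0, dt=0.01):
    """One Euler step of dR_i/dt: 0 if R_i = 0 and A_i <= B*P_i, else A_i - B*P_i."""
    rate = A - B * P_i(center, s)
    if R <= 0.0 and rate <= 0.0:
        return 0.0              # uncertainty stays at zero while covered
    return max(0.0, R + rate * dt)

# Uncertainty grows while no agent is near, and is driven back to 0 once one
# arrives, since A - B < 0 under full coverage.
R = 0.0
for _ in range(100):            # agent far away: R grows at rate A
    R = step_R(R, center=5.0, s=[0.0])
R_grown = R
for _ in range(100):            # agent on target: R decays at rate A - B
    R = step_R(R, center=5.0, s=[5.0])
R_after = R
```

This grow/decay cycle is exactly what the agent trajectories trade off over the whole set of intervals.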

  72. PERSISTENT MONITORING (PM) WITH KNOWN TARGETS
$$\min_{u_1,\dots,u_N} J = \frac{1}{T}\int_0^T \sum_{i=1}^{M} R_i(t)\,dt$$
s.t.
$$\dot{s}_j = u_j, \quad |u_j(t)| \le 1, \quad 0 \le a \le s_j(t) \le b \le L$$
$$\dot{R}_i(t) = \begin{cases} 0 & \text{if } R_i(t) = 0 \text{ and } A_i \le B\,P_i(s(t)) \\ A_i - B\,P_i(s(t)) & \text{otherwise} \end{cases}$$

  73. PERSISTENT MONITORING WITH KNOWN TARGETS
Two time-varying networks are involved: the Agent-Target Interaction Network and the Agent Network. It is hard to decentralize a controller that involves time-varying agent-environment interactions.

  74. THREE TYPES OF NEIGHBORHOODS
[diagram: agents $A_1,\dots,A_5$ and targets $T_1,\dots,T_5$, illustrating the neighborhood types, including the conventional agent-to-agent one]

  75. PM WITH KNOWN TARGETS – 1D CASE
We have shown that:
1. Optimal trajectories are bounded: $x_1 \le s_j^*(t) \le x_M$, $j = 1,\dots,N$
2. Finite dwell times at targets exist on optimal trajectories. Under certain conditions: $s_j^*(t) = x_k$ and $u_j^*(t) = 0$ for $t \in [t_1, t_2]$
3. Under the constraint $s_j(t) < s_{j+1}(t)$, the agent ordering is maintained along an optimal trajectory
Zhou et al., IEEE CDC, 2016

  76. OPTIMAL CONTROL SOLUTION
The optimal trajectory is fully characterized by TWO parameter vectors per agent $j = 1,\dots,N$:
$$\theta_j = [\theta_{j1}, \dots, \theta_{jS}] \ \text{(switching points)}, \qquad w_j = [w_{j1}, \dots, w_{jS}] \ \text{(waiting times at the switching points, } w_{jk} \ge 0)$$
$$J(\theta, w) = \frac{1}{T}\sum_{k=0}^{K}\int_{\tau_k(\theta, w)}^{\tau_{k+1}(\theta, w)} \sum_{i=1}^{M} R_i(t)\,dt, \qquad \tau_k:\ k\text{th event time}$$
Under optimal control, this is a HYBRID SYSTEM.
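Evaluating $J(\theta, w)$ for a given parameterization can be sketched for one agent on $[0, L]$: the agent moves at unit speed between two switching points and dwells $w_k$ at each. The two targets, the on/off sensing range, and all rate constants are illustrative assumptions.

```python
def cost(theta, w, targets=(2.0, 8.0), A=0.5, B=2.0, r=1.0, T=60.0, dt=0.01):
    """Time-average total uncertainty J(theta, w) for a single agent whose
    trajectory is described by switching points theta and waiting times w."""
    th1, th2 = theta
    s, direction, wait = th1, 1, w[0]       # start dwelling at th1
    R = [0.0 for _ in targets]
    acc, t = 0.0, 0.0
    while t < T:
        # agent motion: dwell, then move at unit speed to the other point
        if wait > 0.0:
            wait -= dt
        else:
            s += direction * dt
            if direction > 0 and s >= th2:
                s, direction, wait = th2, -1, w[1]
            elif direction < 0 and s <= th1:
                s, direction, wait = th1, 1, w[0]
        # uncertainty dynamics for each target (on/off sensing of radius r)
        for i, x in enumerate(targets):
            sensed = 1.0 if abs(s - x) <= r else 0.0
            rate = A - B * sensed
            R[i] = 0.0 if (R[i] <= 0.0 and rate <= 0.0) else max(0.0, R[i] + rate * dt)
            acc += R[i] * dt
        t += dt
    return acc / T

# Dwelling at the targets beats sweeping past them without stopping.
J_dwell = cost(theta=(2.0, 8.0), w=(5.0, 5.0))
J_sweep = cost(theta=(2.0, 8.0), w=(0.0, 0.0))
```

This is precisely the finite-dimensional problem the IPA gradients on the next slides optimize: tune $(\theta, w)$ to minimize the simulated cost.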

  77. HYBRID SYSTEM EVENTS
Type 1: switches in $\dot{R}_i(t)$
Type 2: switches in agent sensing
Type 3: switches in $\dot{s}_j(t)$
Type 4: changes in neighbor sets

  78. HYBRID SYSTEM EVENTS: EXAMPLE
$$\dot{R}_i(t) = \begin{cases} 0 & \text{if } R_i(t) = 0 \text{ and } A_i \le B\,P_i(s(t)) \\ A_i - B\,P_i(s(t)) & \text{otherwise} \end{cases}$$
A simple example with 1 agent and 1 target ($\dot{s}_j = u_j$, $|u_j(t)| \le 1$):
- Event type 1: the uncertainty dynamics switch between $\dot{R}_i = A_i - B_i P_i(s(t))$ and $\dot{R}_i = 0$ ($R$ hits 0, or $R$ leaves 0), while the agent moves with $\dot{s}_j \in \{-1, 0, 1\}$
- Event type 2: the agent enters or leaves the target's sensing range, switching $P_i(s(t))$

  79. IPA GRADIENTS
Objective function gradient:
$$\nabla J(\theta, w) = \frac{1}{T}\sum_{i=1}^{M}\sum_{k=0}^{K}\int_{\tau_k(\theta, w)}^{\tau_{k+1}(\theta, w)} \nabla R_i(t)\,dt, \qquad \nabla R_i(t) = \left[\frac{\partial R_i(t)}{\partial \theta}, \frac{\partial R_i(t)}{\partial w}\right]$$
where $\nabla R_i(t)$ is obtained using the IPA Calculus ($\tau_k$: $k$th event time) and is updated on an EVENT-DRIVEN basis:
1. $x'(\tau_k^+) = x'(\tau_k^-) + \bigl[f_{k-1}(\tau_k^-) - f_k(\tau_k^+)\bigr]\tau_k'$
2. $x'(t) = e^{\int_{\tau_k}^{t} \frac{\partial f_k(u)}{\partial x} du}\left[\int_{\tau_k}^{t} \frac{\partial f_k(v)}{\partial \theta}\, e^{-\int_{\tau_k}^{v} \frac{\partial f_k(u)}{\partial x} du}\, dv + x'(\tau_k^+)\right]$
3. $\tau_k' = 0$ or $\tau_k' = -\left[\frac{\partial g_k}{\partial x} f_{k-1}(\tau_k^-)\right]^{-1}\left(\frac{\partial g_k}{\partial \theta} + \frac{\partial g_k}{\partial x}\, x'(\tau_k^-)\right)$
