Observation and Control
Observation and Control for Debugging Distributed Computations
Vijay K. Garg
Electrical and Computer Engineering Department
University of Texas at Austin, Austin, TX 78712
http://maple.ece.utexas.edu/~vijay/
© Vijay K. Garg
Acknowledgments
• Collaborators on various ideas: C. M. Chase, E. Fromentin, R. Kilgore, R. Kumar, J. R. Mitchell, V. V. Murty, M. T. Raghunath, M. Raynal, A. Tarafdar, A. I. Tomlinson, and B. Waldecker
Outline of the talk
• Introduction: our model
• Observation: Main ideas
  - Lack of shared clock
  - Lack of shared memory
  - Combinatorial explosion
• Observation: Algorithms
  - WCP algorithm, channel predicates
  - Detecting regular expressions
• Control
  - Delaying events: offline
  - Delaying events: online
  - Controlling order: offline
  - Controlling order: online
Characteristics of Distributed Systems
• Lack of shared clock
  - order of events is partial
• Lack of shared memory
  - meaning of global state
  - need messages for observing "global state"
• Multiple processes
  - combinatorial explosion
  - non-determinism
Model of a Distributed Program
Figure: two processes, labeled r[1] and r[2].
  - r[1] runs x := x+2; send(x); x := x-1 and passes through the local states (pc, x) = (0,1), (1,3), (2,3), (3,2).
  - r[2] runs y := y+3; receive(y); y := 2*y and passes through the local states (pc, y) = (0,1), (1,4), (2,3), (3,6).
• messages: asynchronous, reliable, no FIFO assumption
• no shared clock or memory
• local states
• Lamport's causally-precedes relation, concurrency relation
Motivation for Observation
Dear Watson, you see but you do not observe...
• Distributed debugging, testing
  - stop when the predicate q is true
  - predicate q = (P1 is in critical section) and (P2 is in critical section)
  - detect if the program violates any invariant
• Fault tolerance
  - monitoring while the program is operational
• Distributed active rules
  - on global condition p, trigger rule a
• General paradigm for observing distributed algorithms
  - termination detection, deadlock detection, loss of token
Lack of shared clock
• Problem: define the truth of the predicate $CS_1 \wedge CS_2$
  - based on real time
  - based on causality
• Real time considered harmful in distributed systems.
  - My clock synchronization algorithm achieves 10 ms
  - programs should work independent of processor speeds
• Reject linear time, accept vector time
  - Lamport 78, Fidge 89, Mattern 89
• Simultaneity vs. concurrency
Clock in a Distributed System
Figure: four processes with vector timestamps on their local states.
  - P1: (1,0,0,0), (2,1,0,0), (3,1,0,0)
  - P2: (0,1,0,0), (0,2,0,0), (2,3,3,1)
  - P3: (0,0,1,0), (0,0,2,1), (2,1,3,1), (2,1,4,1)
  - P4: (0,0,0,1), (0,0,0,2)
• Property: $s \rightarrow t$ iff $s.v < t.v$.
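The property above can be checked mechanically. The following is a minimal sketch, assuming vector timestamps are plain tuples of equal length and that `<` on vectors means componentwise less-or-equal with at least one strict inequality; the function names are illustrative and do not come from the talk.

```python
def happened_before(u, v):
    """True if vector timestamp u is strictly less than v (u -> v)."""
    return all(a <= b for a, b in zip(u, v)) and u != v


def concurrent(u, v):
    """True if neither timestamp precedes the other (for distinct states)."""
    return not happened_before(u, v) and not happened_before(v, u)
```

With the timestamps from the figure, `happened_before((1,0,0,0), (2,1,0,0))` holds, while `(3,1,0,0)` and `(0,0,0,2)` are concurrent.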
Lack of shared state
Figure: processes P1, P2, P3 exchanging messages m1, m2, m3, with two cuts C1 and C2.
• consistent global state
  - if the receive of a message is recorded, then its send must be recorded
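The consistency condition can be written as a short check. This is a sketch under an assumed encoding that is not from the talk: a cut is a tuple giving how many events of each process it includes, and each message is a tuple `(sender, send_index, receiver, recv_index)` with 1-based event indices.

```python
def is_consistent_cut(cut, messages):
    """A cut is consistent if every recorded receive has its send recorded."""
    for sender, send_idx, receiver, recv_idx in messages:
        receive_recorded = recv_idx <= cut[receiver]
        send_recorded = send_idx <= cut[sender]
        if receive_recorded and not send_recorded:
            return False
    return True
```

For a single message sent by the first event of process 0 and received by the first event of process 1, the cut `(0, 1)` is rejected because it records the receive without the send.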
Camera: Chandy and Lamport's Algorithm
• Algorithm to compute a snapshot of a computation: S
  - S is a possible global state in the computation
• Stable predicate: once true, stays true
  - e.g. termination detection, deadlock detection
• To monitor stable predicates: repeatedly take snapshots
• Disadvantages of the CL algorithm for predicate detection
  - Not useful for unstable predicates
  - Does not return the first cut
  - How often should the snapshot be taken?
  - Assumes FIFO
Unstable Predicates
Figure: two processes with local states s0, s1, s2, s3 and t0, t1, t2, t3, and the corresponding grid of global states with axes s and t, starting at (0,0) and marked 1, 2, 3 on each axis.
• Multiple timed executions consistent with one run
Two interpretations of predicates
• Two modalities: [Cooper and Marzullo 91], [Garg and Waldecker 91]
• Possibly:q (also called weak predicates)
  - there exists a path from the initial state to the final state along which q is true on some state
• Definitely:q (also called strong predicates)
  - for all paths from the initial state to the final state, q is true on some state
Communication Complexity
• Consider evaluation of the predicate $q(x_1, x_2)$
  - only $P_1$ knows all the values taken by $x_1$
  - only $P_2$ knows the values taken by $x_2$
• Is $q(x_1, x_2)$ true for some value of $x_1$ and $x_2$?
• Key question: number of values that need to be communicated
  - one value per internal event, or
  - one value per external event
Monotonicity
• Definition
  - Assume $x_1$ takes values from a totally ordered set
  - q is monotone w.r.t. the first argument if $\forall a, b, x_2 : (a < b) \Rightarrow (q(a, x_2) \Rightarrow q(b, x_2))$
• Examples
  - $q = (x_1 > x_2)$: monotonic w.r.t. $x_1$ and $x_2$
  - $q = l_1 \wedge l_2$: monotonic
  - $q = (x_1 = x_2)$: not monotonic
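On small finite domains, the definition can be tested by brute force. The sketch below only illustrates the definition for the first argument; the function name and the choice of finite test domains are assumptions made for the example.

```python
from itertools import product


def is_monotone_in_first(q, domain1, domain2):
    """Check that q(a, x2) implies q(b, x2) whenever a < b, over finite domains."""
    for a, b, x2 in product(domain1, domain1, domain2):
        if a < b and q(a, x2) and not q(b, x2):
            return False
    return True
```

Over the domain 0..4, the predicate `x1 > x2` passes this check, while `x1 == x2` fails it (it holds for `(0, 0)` but not for `(1, 0)`).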
Multiple Processes
• Intractability of the Global Predicate Detection Problem
  - Given: an execution S of N processes, N variables $x_1, \ldots, x_N$, and a predicate q defined on x.
  - Is there a consistent cut $G \in S$ such that $q(G)$ is true?
• Theorem [Chase and Garg 95]: The predicate detection problem is NP-complete.
• Proof: By reduction from SAT, e.g. $(x_1 \vee \bar{x}_2 \vee x_3) \wedge (\bar{x}_1 \vee x_2) \wedge \ldots$
Figure: processes x1, x2, x3, each with local states 0 and 1.
Linearity
• Forbidden predicate: forbidden(G, i) iff
  - $\forall H : G \subseteq H : (G[i] = H[i]) \Rightarrow \neg q(H)$
Figure: two cuts G and H.
• Predicate q is linear w.r.t. a computation S if
  - $\forall G : \neg q(G) \Rightarrow \exists i : \mathrm{forbidden}(G, i)$
• Examples
  - $l_1 \wedge l_2 \wedge \ldots \wedge l_n$
  - $x + y \geq k$, where x is non-increasing
  - channel is empty
Summary of Observation: Problems and Solutions

Characteristic        Problem                    Idea            Bonus
No shared clock       ordering events            causality       avoid race errors
No shared memory      message/state change       monotonicity    extremal functions
Multiple processes    combinatorial explosion    linearity       first cut
Cooper and Marzullo's Algorithm
• Possibly:p
  - construct the lattice of global states, check each global state for truth of p
• Definitely:p
  - for all paths from the initial state to the final state, p is true on some state
  - construct the lattice of global states
  - remove states satisfying p
  - is the last state reachable from the initial state? If not, Definitely:p holds.
• Complexity: $O(k^n)$ where
  - k: number of local states per process
  - n: number of processes
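The two lattice checks can be sketched with a breadth-first search. This is an illustration under assumed encodings, not the authors' code: a global state is a tuple of per-process event counts, messages are `(sender, send_index, receiver, recv_index)` tuples with 1-based indices, and all function names are hypothetical. The usage data encodes the two-process program from the model slide.

```python
from collections import deque


def consistent(cut, messages):
    """Every recorded receive must have its send recorded."""
    return all(not (r_idx <= cut[r] and s_idx > cut[s])
               for s, s_idx, r, r_idx in messages)


def successors(cut, sizes, messages):
    """Consistent cuts obtained by executing one more event on one process."""
    for i in range(len(cut)):
        if cut[i] < sizes[i]:
            nxt = cut[:i] + (cut[i] + 1,) + cut[i + 1:]
            if consistent(nxt, messages):
                yield nxt


def reachable(sizes, messages, allowed):
    """Cuts reachable from the initial cut using only cuts where allowed() holds."""
    start = tuple(0 for _ in sizes)
    if not allowed(start):
        return set()
    seen, queue = {start}, deque([start])
    while queue:
        cut = queue.popleft()
        for nxt in successors(cut, sizes, messages):
            if nxt not in seen and allowed(nxt):
                seen.add(nxt)
                queue.append(nxt)
    return seen


def possibly(sizes, messages, p):
    """Some consistent global state satisfies p."""
    return any(p(c) for c in reachable(sizes, messages, lambda c: True))


def definitely(sizes, messages, p):
    """After removing states satisfying p, the final state is unreachable."""
    return tuple(sizes) not in reachable(sizes, messages, lambda c: not p(c))


# Usage data for the two-process program from the model slide.
X = [1, 3, 3, 2]            # value of x after 0..3 events of the first process
Y = [1, 4, 3, 6]            # value of y after 0..3 events of the second process
SIZES = (3, 3)              # three events per process
MESSAGES = [(0, 2, 1, 2)]   # send(x) is event 2 of process 0, receive(y) is event 2 of process 1
```

The search visits every consistent global state in the worst case, which is where the $O(k^n)$ bound comes from.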
Weak Conjunctive Predicates
• WCP
  - Possibly: $l_1 \wedge l_2 \wedge \ldots \wedge l_n$
  - useful for bad or undesirable predicates
  - Example: the classical mutual exclusion problem.
  - Example: (John is sleeping) and (Mary is sleeping) and (Robert is sleeping)
  - detect errors that may be hidden in some run due to race conditions.
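One common way to detect a WCP with vector clocks is to look for a set of pairwise-concurrent local states, one per process, in each of which the local predicate holds. The sketch below follows that idea and is not necessarily the algorithm presented in the talk. It assumes each process supplies, in local order, the vector timestamps of the states where its local predicate is true; `detect_wcp` and `happened_before` are illustrative names.

```python
def happened_before(u, v):
    """True if vector timestamp u is strictly less than v."""
    return all(a <= b for a, b in zip(u, v)) and u != v


def detect_wcp(candidates):
    """Return the first pairwise-concurrent selection, one state per process, or None.

    candidates[i] lists the timestamps of the states of process i, in local
    order, in which its local predicate holds.
    """
    n = len(candidates)
    pos = [0] * n
    while all(pos[i] < len(candidates[i]) for i in range(n)):
        current = [candidates[i][pos[i]] for i in range(n)]
        # A state that precedes another process's candidate can never be
        # part of a solution with that candidate or any later one.
        stale = next((i for i in range(n) for j in range(n)
                      if i != j and happened_before(current[i], current[j])),
                     None)
        if stale is None:
            return current
        pos[stale] += 1
    return None
```

Because a discarded state is never revisited, each candidate is examined at most once, which avoids enumerating the whole lattice of global states.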
Importance of Weak Conjunctive Predicates
• Sufficient for detection of any boolean expression
  - which can be expressed as a disjunction of a small number of conjunctions.
    Example: x, y and z are in three different processes. Then
    $even(x) \wedge ((y < 0) \vee (z > 6)) \equiv (even(x) \wedge (y < 0)) \vee (even(x) \wedge (z > 6))$
  - for which the global predicate is satisfied by only a finite number of possible global states.
    Example: x and y are in different processes. $(x = y)$ is not a local predicate