
A Novel Parallel Deadlock Detection Algorithm and Architecture



  1. A Novel Parallel Deadlock Detection Algorithm and Architecture
     Pun H. Shiu (2), Yudong Tan (2), Vincent J. Mooney III (1)
     {ship, ydtan, mooney}@ece.gatech.edu
     http://codesign.ece.gatech.edu
     (1,2) Hardware/Software RTOS Group; (1) Low Power Compiler Group
     (1) Assistant Professor, (1,2) Electrical and Computer Engineering
     (1) Adjunct Assistant Professor, (1) College of Computing
     Georgia Institute of Technology, Atlanta, GA USA
     (1) http://crest.ece.gatech.edu
     CODES 2001 April, 2001

  2. Overall Outline
     - Motivation - Technology Trends
     - Background - Deadlock Detection
     - Parallel Algorithm
     - Parallel Architecture
     - Experimental Results
     - Conclusion

  3. Motivation - Technology Trends
     - Many of today's chip designs contain 2 processors, e.g., a DSP and a microcontroller
     - Future SoC designs are likely to include
       - 4-40 heterogeneous processors
       - 10-50 on-chip hardware resources
         - FFT, Viterbi filter, wireless communication
       - Multithreaded software which dynamically requests and uses the resources

  4. SoC Software
     - Ideally, programmers of such future SoC designs would only write deadlock-free code
     - If not, we provide a way to detect deadlock very fast
     - User can write code to recover from deadlock

  5. Deadlock Detection Unit (DDU)
     - Small & scalable parallel hardware unit
     - Multiple requestors & resources
     - In this paper, the only requestors are processors and the only resources are specialized hardware units like FFT

  6. Overall Outline
     - Motivation - Technology Trends
     - Background - Deadlock Detection
     - Parallel Algorithm
     - Parallel Architecture
     - Experimental Results
     - Conclusion

  7. Background: Deadlock Condition
     [Figure: graph over nodes P1, Q1, P2, Q2]
     - Properties of Resources
       - Mutual Exclusion: Any resource can be held exclusively, making it unavailable to other processors.
       - Non-preemption: Any resource can be released only by the processor holding it.
     - Behavior of processors
       - Partial Allocation: a processor may hold some resources while it requests additional resources.
       - Blocked Wait: a processor must wait for unavailable resources to become available.

  8. Previous Algorithms' Run Time
     Generally the run time is O(m*n), where m is the number of processors and n is the number of resources.
     - Path Based, O(e), where e is the number of edges and e ≤ m*n
     - Tree Based, O(m*n)
     - Matrix Based, O(m*n)
     - Message Passing Based, O(m*n)

  9. Overall Outline
     - Motivation - Technology Trends
     - Background - Deadlock Detection
     - Parallel Algorithm
     - Parallel Architecture
     - Experimental Results
     - Conclusion

  10. Example
     [Figure: processors and resources connected by request and grant edges]

  11. Example
     [Figure labels: source node, sink node, link edge, simple path]

  12. Matrix Representation
     - Each row corresponds to a requestor (processor)
       - p_i represents requestor (processor) i
     - Each column corresponds to a resource
       - q_j represents resource j
     - Entries in the matrix
       - r (r_ij) represents a request
       - g (g_ij) represents a grant
       - 0 represents no action (neither request nor grant)
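The matrix described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `build_matrix`, the zero-based indices, and the sample request and grant lists are all assumptions made for the example.

```python
def build_matrix(m, n, requests, grants):
    """Build an m x n state matrix with entries "r", "g", or "0".

    requests and grants are lists of (i, j) pairs (hypothetical input format):
    row i is requestor p_i, column j is resource q_j, both zero-based.
    """
    matrix = [["0"] * n for _ in range(m)]
    for i, j in grants:
        matrix[i][j] = "g"   # resource q_j is granted to p_i
    for i, j in requests:
        matrix[i][j] = "r"   # p_i is waiting for resource q_j
    return matrix


# Illustrative state: 2 processors, 3 resources.
example = build_matrix(
    2, 3,
    requests=[(0, 1), (1, 0)],
    grants=[(0, 0), (1, 1), (1, 2)],
)
for row in example:
    print(" ".join(row))
```

Keeping the entries symbolic makes the matrix easy to read; a hardware unit would more plausibly store each entry as a pair of bits.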

  13. Properties
     - Proposed Algorithm
       - Matrix Based
       - Modified Reduction Technique
       - Handling multiple requests and grants at the same time.
       - Requires simple bit-wise boolean operations.

  14. SoC Example
     P\Q      q1(IcP)  q2(PCI)  q3(WI)
     p1(DSP)  g        r        0
     p2(VSP)  r        g        g

  15. Deadlock and Cycle Relation
     - Deadlock ⇒ ∃ cycles
     - Cycles ⇒ ∃ deadlock (as shown in red)
     [Figure: graph over processors DSP, VSP and resources IcP, PCI, WI, with the cycle drawn in red]
     P\Q      q1(IcP)  q2(PCI)  q3(WI)
     p1(DSP)  g        r        0
     p2(VSP)  r        g        g
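To make the cycle in this example concrete, here is a minimal sketch of a conventional depth-first cycle check on the graph implied by the matrix. It is a path-style baseline, not the parallel algorithm of the talk. The edge directions (a request is an edge from processor to resource, a grant is an edge from resource to processor) and the name `has_cycle` are assumptions made for illustration.

```python
def has_cycle(matrix):
    """Return True if the request/grant graph of the matrix contains a cycle."""
    m, n = len(matrix), len(matrix[0])
    # Nodes 0..m-1 are processors, nodes m..m+n-1 are resources.
    adj = {v: [] for v in range(m + n)}
    for i in range(m):
        for j in range(n):
            if matrix[i][j] == "r":
                adj[i].append(m + j)      # request edge p_i -> q_j (assumed direction)
            elif matrix[i][j] == "g":
                adj[m + j].append(i)      # grant edge q_j -> p_i (assumed direction)

    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = [WHITE] * (m + n)

    def visit(v):
        color[v] = GRAY
        for w in adj[v]:
            if color[w] == GRAY:          # back edge: w is on the current path
                return True
            if color[w] == WHITE and visit(w):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in range(m + n))


soc = [["g", "r", "0"],   # p1 (DSP)
       ["r", "g", "g"]]   # p2 (VSP)
print(has_cycle(soc))     # DSP -> PCI -> VSP -> IcP -> DSP
```

Each edge is examined once, which corresponds to the O(e) cost quoted for path-based detection.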

  16. Matrix Representation
     $$M = \begin{bmatrix} g & r & 0 \\ r & g & g \end{bmatrix}$$
     With entries written as column vectors, $g = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and $r = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$:
     $$M_r = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$
     With entries written as row vectors, $g = [01]$ and $r = [10]$:
     $$M_c = \begin{bmatrix} 01 & 10 & 00 \\ 10 & 01 & 01 \end{bmatrix}$$

  17. Matrix Representation: calculation of M_rbo and XOR_right
     $$M_r = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix}, \quad M_{rbo} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \quad XOR_{right} = \begin{bmatrix} 1 \oplus 1 \\ 1 \oplus 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

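  18. Matrix Representation: calculation of M_cbo and XOR_below
     $$M_c = \begin{bmatrix} 01 & 10 & 00 \\ 10 & 01 & 01 \end{bmatrix}, \quad M_{cbo} = \begin{bmatrix} 11 & 11 & 01 \end{bmatrix}$$
     $$XOR_{below} = \begin{bmatrix} 1 \oplus 1 & 1 \oplus 1 & 0 \oplus 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$$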
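The row and column calculations can be reproduced with ordinary integer bit operations. The sketch below assumes the encoding r = 10, g = 01, 0 = 00 used in the matrices above, and stores each entry as a two-bit integer. The helper names (`encode`, `bitwise_or`, `xor_of_bits`, `or_and_xor`) are hypothetical.

```python
ENCODING = {"r": 0b10, "g": 0b01, "0": 0b00}


def encode(matrix):
    """Turn a matrix of "r"/"g"/"0" symbols into two-bit integers."""
    return [[ENCODING[entry] for entry in row] for row in matrix]


def bitwise_or(cells):
    """Bitwise OR of all two-bit entries in one row or one column."""
    acc = 0
    for cell in cells:
        acc |= cell
    return acc


def xor_of_bits(pair):
    """XOR of the request bit and the grant bit of a two-bit value."""
    return (pair >> 1) ^ (pair & 1)


def or_and_xor(bits):
    """Return (M_rbo, XOR_right, M_cbo, XOR_below) for an encoded matrix."""
    m_rbo = [bitwise_or(row) for row in bits]
    m_cbo = [bitwise_or(col) for col in zip(*bits)]
    xor_right = [xor_of_bits(p) for p in m_rbo]
    xor_below = [xor_of_bits(p) for p in m_cbo]
    return m_rbo, xor_right, m_cbo, xor_below


bits = encode([["g", "r", "0"],
               ["r", "g", "g"]])
m_rbo, xor_right, m_cbo, xor_below = or_and_xor(bits)
print([format(p, "02b") for p in m_cbo], xor_below)
print([format(p, "02b") for p in m_rbo], xor_right)
```

An XOR of 1 marks a row or column whose nonzero entries are all requests or all grants. In this example only column 3 qualifies.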

  19. Result of first iteration
     $$XOR_{below} = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}, \quad XOR_{right} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
     - Based on result, we set all entries in column 3 to zero:
     $$M = \begin{bmatrix} g & r & 0 \\ r & g & 0 \end{bmatrix}$$

  20. Multiple Iterations
     - Continuing in this way, we iterate until no more changes occur
     - When finished, if M is all zeros, we have no deadlock; otherwise, we do have deadlock
     - This algorithm requires at most 2*min(m,n) iterations
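Putting the steps together, the iteration can be sketched as a software loop. This is a sequential illustration under the assumed encoding r = 10, g = 01, 0 = 00, not the parallel hardware described in the talk. The function name `detect_deadlock` and the returned iteration count are additions made for the example.

```python
from functools import reduce
from operator import or_

ENCODING = {"r": 0b10, "g": 0b01, "0": 0b00}


def detect_deadlock(matrix):
    """Reduce the matrix until nothing changes.

    Returns (deadlock, iterations): deadlock is True if nonzero entries remain.
    """
    bits = [[ENCODING[entry] for entry in row] for row in matrix]
    iterations = 0
    while True:
        row_or = [reduce(or_, row, 0) for row in bits]
        col_or = [reduce(or_, col, 0) for col in zip(*bits)]
        # XOR of the two bits is 1 exactly for the values 01 and 10,
        # i.e. a row or column holding only grants or only requests.
        terminal_rows = [i for i, p in enumerate(row_or) if p in (0b01, 0b10)]
        terminal_cols = [j for j, p in enumerate(col_or) if p in (0b01, 0b10)]
        if not terminal_rows and not terminal_cols:
            break                          # no more changes
        for i in terminal_rows:
            bits[i] = [0] * len(bits[i])   # clear the whole row
        for j in terminal_cols:
            for row in bits:
                row[j] = 0                 # clear the whole column
        iterations += 1
    deadlock = any(any(row) for row in bits)
    return deadlock, iterations


print(detect_deadlock([["g", "r", "0"],
                       ["r", "g", "g"]]))   # entries remain after reduction
print(detect_deadlock([["g", "r"],
                       ["0", "g"]]))        # reduces to all zeros
```

In both sample runs the iteration count stays within the 2*min(m,n) bound stated above.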

  21. Overall Outline
     - Motivation - Technology Trends
     - Background - Deadlock Detection
     - Parallel Algorithm
     - Parallel Architecture
     - Experimental Results
     - Conclusion
