CS137b: Day2

CS137: Electronic Design Automation
Day 10: February 1, 2006
Dynamic Programming

CALTECH CS137 Winter2006 -- DeHon

Dynamic Programming Solution
• Solution described is general instance of dynamic programming
• Require:
  – optimal solution to subproblems is optimal solution to whole problem
  – (all optimal solutions equally good)
  – divide-and-conquer gets same (finite/small) number of subproblems
• Same technique used for instruction selection

Sequence Matching
• Find edit distance between two strings
  – E.g.
    • Insert cost 1
    • Delete cost 1
    • Replace cost 2
    • Match 0
• Primary Application:
  – DNA Sequence comparison
  – Often compare new sequence against database

Dynamic Programming
• Two Examples
  – SPLASH sequence matching/edit distances
    • O(N^2) operation in O(N) time with O(N) hardware
  – CMU parenthesis matching
    • O(N^3) operation in O(N) time with O(N^2) hardware

Edit Example
• SHMOO
  – Add E (cost 1)
• SHMOOE
  – Remove M (cost 1 + 1 = 2)
• SHOOE
  – Replace O with R (cost 2 + 2 = 4)
• SHORE

Dynamic Programming
• Build a table representing string prefixes
  – Only m×n cases to compute
• Fill in costs
• Cell (m,n) is result
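The running totals in the Edit Example can be checked by replaying the three edits with the costs from the Sequence Matching slide. This is only a sketch: `replay_edits` and the step list are hypothetical names, and the character positions are assumptions, since the slide names the characters but not their indices.

```python
# Costs from the Sequence Matching slide: insert 1, delete 1, replace 2.
COST = {"insert": 1, "delete": 1, "replace": 2}


def replay_edits(source, steps):
    """Apply (op, position, char) steps in order; return (result, total cost)."""
    chars = list(source)
    total = 0
    for op, pos, ch in steps:
        if op == "insert":
            chars.insert(pos, ch)
        elif op == "delete":
            del chars[pos]
        elif op == "replace":
            chars[pos] = ch
        else:
            raise ValueError(f"unknown edit operation: {op}")
        total += COST[op]
    return "".join(chars), total


# Assumed positions for the slide's three edits.
SHMOO_TO_SHORE = [
    ("insert", 5, "E"),   # SHMOO  -> SHMOOE, running cost 1
    ("delete", 2, ""),    # SHMOOE -> SHOOE,  running cost 2
    ("replace", 3, "R"),  # SHOOE  -> SHORE,  running cost 4
]

if __name__ == "__main__":
    print(replay_edits("SHMOO", SHMOO_TO_SHORE))  # ('SHORE', 4)
```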
Local Move Costs
• D(0,0) = 0
• D(i,0) = D(i-1,0) + Delete(S_i)
• D(0,j) = D(0,j-1) + Insert(T_j)
• D(i,j) = min of
  – D(i-1,j) + Delete(S_i)
  – D(i,j-1) + Insert(T_j)
  – D(i-1,j-1) + Replace(S_i,T_j)
[Constant work per cell to fill in]

Edit Distance Table
[Figure: edit distance table]

Systolic Array
• Feed strings from opposite ends
• Compute along diagonals
• When Src[i] and Targ[j] line up, compute cell (i,j)
[Figure, four animation frames: O O M H S entering the array from one end and S H O R E from the other]
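The recurrence and the "compute along diagonals" ordering can be sketched in software as follows. This is a sequential stand-in under stated assumptions, not the Splash implementation: the function name is hypothetical, the default costs are the ones from the Sequence Matching slide, and Replace is taken to cost 0 on a match. All cells on one anti-diagonal (i + j = d) depend only on the two previous diagonals, which is what lets an array of PEs compute them in the same cycle.

```python
def edit_distance_by_diagonals(src, tgt, ins=1, dele=1, rep=2):
    """Fill D(i,j) one anti-diagonal (i + j = d) at a time and return D(m,n)."""
    m, n = len(src), len(tgt)
    D = [[0] * (n + 1) for _ in range(m + 1)]  # D(0,0) = 0
    for d in range(1, m + n + 1):
        # Cells on this diagonal are mutually independent.
        for i in range(max(0, d - n), min(m, d) + 1):
            j = d - i
            if j == 0:
                D[i][0] = D[i - 1][0] + dele
            elif i == 0:
                D[0][j] = D[0][j - 1] + ins
            else:
                # Assumption: Replace(S_i, T_j) is 0 when the characters match.
                sub = 0 if src[i - 1] == tgt[j - 1] else rep
                D[i][j] = min(
                    D[i - 1][j] + dele,
                    D[i][j - 1] + ins,
                    D[i - 1][j - 1] + sub,
                )
    return D[m][n]


if __name__ == "__main__":
    print(edit_distance_by_diagonals("SHMOO", "SHORE"))  # 4
```

There are m + n diagonals in total, which is where the O(N) time with O(N) hardware figure comes from when each diagonal takes one step.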
Systolic Array
• PEDist = minimum of
  – TDin + Cost(Delete(SCin))
  – SDin + Cost(Insert(TCin))
  – PEDist + Cost(Substitute(TCin,SCin))
• SDout = TDout = PEDist
• … plus details for edge cases

In Operation
D(i,j) = min of
  – D(i-1,j) + Delete(S_i)
  – D(i,j-1) + Insert(T_j)
  – D(i-1,j-1) + Replace(S_i,T_j)
On previous cycle:
• Cell computes (i-1,j-1)
• T-neighbor (i-1,j)
• S-neighbor (i,j-1)

[Figure, animation frames: contents of the ten PEs on successive cycles]

Frame 1 (upper string: S H; lower string: M H S)
  PE distance:   3     2     2     1     1     0     1     1     2     2
  Cell:        (3,0) (2,0) (2,0) (1,0) (1,0) (0,0) (0,1) (0,1) (0,2) (0,2)

Frame 2 (upper string: S H O; lower string: M H S)
  PE distance:   3     3     2     2     1     0     1     2     2     3
  Cell:        (3,0) (3,0) (2,0) (2,0) (1,0) (1,1) (0,1) (0,2) (0,2) (0,3)

Frame 3 (upper string: S H O; lower string: O M H S)
  PE distance:   4     3     3     2     1     0     1     2     3     3
  Cell:        (4,0) (3,0) (3,0) (2,0) (2,1) (1,1) (1,2) (0,2) (0,3) (0,3)
In Operation (continued)

Frame 4 (upper string: S H O R; lower string: O M H S)
  PE distance: 4 3 2 1 0 1 2 3 4
  Cell:        (4,0) (4,0) (3,0) (3,1) (2,1) (2,2) (1,2) (1,3) (0,3) (0,4)

Edit Distance Table
[Figure: edit distance table]

Details/Variations
• Can have one stationary
  – Unidirectional: compute row at a time
  – Most useful when matching single target to large collection of sources
• Only need constant state per cell
  – Delta from neighbor bounded
  – Similarly, small, constant-width cost datapaths

Performance (un-normalized)
[Figure: performance comparison]

Implementation
• On Splash2
• Board with 16 XC4010s
• 12 MHz
• 16 PEs per XC4010
  – XC4010 has 400 CLBs = 800 4-LUTs
    • 50 4-LUTs/PE
  – Entire Splash 2: 12,800 4-LUTs
    • Less than XC2V6000 (40%)
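The unidirectional variation from the Details/Variations slide, with the target held stationary and one table row computed per source character, might look like this in software. It is a sketch under assumptions: `edit_rows` and `distance` are hypothetical names, the default costs are those from the Sequence Matching slide, and Replace is taken to cost 0 on a match. Only the previous row is kept, so each column holds constant state.

```python
def edit_rows(target, source, ins=1, dele=1, rep=2):
    """Yield table rows D(i, 0..n) for i = 0..len(source), one row at a time.

    The target is stationary; each source character produces the next row
    from the previous one, so only a single row of state is retained.
    """
    row = [j * ins for j in range(len(target) + 1)]  # D(0,j)
    yield row
    for s in source:
        nxt = [row[0] + dele]  # D(i,0)
        for j, t in enumerate(target, start=1):
            # Assumption: Replace is 0 when the characters match.
            sub = 0 if s == t else rep
            nxt.append(min(row[j] + dele, nxt[j - 1] + ins, row[j - 1] + sub))
        row = nxt
        yield row


def distance(target, source):
    """Edit distance is the last cell of the last row."""
    last = None
    for last in edit_rows(target, source):
        pass
    return last[-1]


if __name__ == "__main__":
    # One stationary target compared against a small collection of sources.
    for src in ["SHMOO", "SHORE", "SHOE"]:
        print(src, distance("SHORE", src))
```

With insert and delete costs of 1, neighboring cells in a row differ by at most 1, which is the bounded delta that lets a hardware cell store a small difference instead of the full distance.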
Normalized Computational Density
[Figure: normalized computational density comparison]
(CM5 used Sparc Processors)

Non-Local

Sub-dividing into Trees
• Optimal parenthesization
• Deciding where to split a tree
  – Perhaps for covering
• Search Tree

Parenthesization
• Have an associative sequence of operations
• What is least cost way to parenthesize?
  – E.g. 1 2 3 4
  – (((1 2) 3) 4)
  – ((1 (2 3)) 4)
  – ((1 2) (3 4))
  – (1 ((2 3) 4))
  – (1 (2 (3 4)))

Abstract Covering Problem
• Given: Graph (V,E) with a single weight (area) on each node and two weights (IO, cost) on the edges.
• Cluster nodes into subsets V_i, such that
  – Σ (Cost(V_i)) minimized
  – IO(V_i) < IO limit
  – A(V_i) < Area limit
  – Cost(V_i) = Σ (cost(e) | e ∈ E s.t. e_1 ∈ V_i and e_2 ∉ V_i)
[Figure labels: PEs, PE Communication]

Idea
• If we had an ordering of nodes
  – (wishful thinking)
• Then easy to know how to include more
  – Just pick the next node
• Order: 1D list of nodes
• Cluster: a contiguous sequence of nodes in list
  – Specify start, finish
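Given an ordering, a cluster is just a (start, finish) pair, so its area, IO, and cost can be evaluated directly. The sketch below makes several assumptions that the slides do not state: the edge record layout is hypothetical, edges are treated as undirected (an edge counts when exactly one endpoint is inside the cluster), and IO(V_i) is taken to be the sum of IO weights over those crossing edges.

```python
from typing import Dict, List, Tuple

# Hypothetical edge record: (node_u, node_v, io_weight, cost_weight).
Edge = Tuple[str, str, int, int]


def cluster_metrics(order: List[str], area: Dict[str, int],
                    edges: List[Edge], start: int, finish: int):
    """Return (area, io, cost) for the contiguous cluster order[start..finish]."""
    members = set(order[start:finish + 1])
    total_area = sum(area[node] for node in members)
    io = cost = 0
    for u, v, io_w, cost_w in edges:
        # Assumption: an edge contributes when it crosses the cluster boundary.
        if (u in members) != (v in members):
            io += io_w
            cost += cost_w
    return total_area, io, cost


def is_feasible(metrics, area_limit: int, io_limit: int) -> bool:
    """Strict limits, as written on the slide: A(V_i) < limit, IO(V_i) < limit."""
    total_area, io, _ = metrics
    return total_area < area_limit and io < io_limit


if __name__ == "__main__":
    order = ["a", "b", "c", "d"]
    area = {"a": 1, "b": 1, "c": 1, "d": 1}
    edges = [("a", "b", 1, 2), ("b", "c", 1, 3), ("c", "d", 1, 1), ("a", "d", 1, 5)]
    m = cluster_metrics(order, area, edges, 1, 2)  # cluster {b, c}
    print(m, is_feasible(m, area_limit=3, io_limit=3))
```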
Feasible Clusters (mult16a)
[Figure: feasible clusters for mult16a]

From Sequence to Clusters
• Easy to know if a contiguous subsequence
  – Meets area constraints
  – Meets IO constraints
• Cover
  – Set of (non-overlapping) subsequences
  – Include all nodes

Covering
• Not clear when to put more or less stuff in a cluster … versus leave with next cluster
  – → Can't build clusters greedily

Dynamic Programming
• For each subsequence start,end
  – Either the area and IO fit a single cluster
  – OR want to find a breakpoint between cluster sets
• Cluster sets start → midpoint, midpoint → end may each either be single or multiple clusters

Minimization Problem
• c(i,j) = w(i,j) + min_k (c(i,k) + c(k,j))
• Also filling in a table
• Work per table entry O(N)
  – Must look at all k's
• O(N^3) total work

Systolic Solution
• Solve in O(N) time
• Using O(N^2) hardware
• Each PE is one cell in table
  – PE computes the local min
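A sequential version of the minimization recurrence makes the O(N) work per entry and O(N^3) total visible as three nested loops. This is a sketch with assumptions the slide leaves open: indices are treated as boundaries 0..n with i < k < j (matching the start → midpoint, midpoint → end split), unit intervals are given cost 0, and the example weight function (sum of the items in the interval) is purely illustrative.

```python
def min_split_cost(n, w):
    """Fill c(i,j) = w(i,j) + min over i < k < j of (c(i,k) + c(k,j)).

    Indices are interval boundaries with 0 <= i < j <= n.
    """
    # Assumed base case: a unit interval needs no split and costs nothing.
    c = {(i, i + 1): 0 for i in range(n)}
    for span in range(2, n + 1):          # shorter intervals are filled first
        for i in range(n - span + 1):
            j = i + span
            # Must look at all k: O(N) work for this table entry.
            best = min(c[i, k] + c[k, j] for k in range(i + 1, j))
            c[i, j] = w(i, j) + best
    return c


# Illustrative weight: w(i,j) is the sum of items i..j-1.
WEIGHTS = [1, 2, 3, 4]


def interval_weight(i, j):
    return sum(WEIGHTS[i:j])


if __name__ == "__main__":
    table = min_split_cost(len(WEIGHTS), interval_weight)
    print(table[0, len(WEIGHTS)])  # 19
```

In the systolic solution each `c[i, j]` entry corresponds to one PE, and the inner `min` is the local minimum that PE accumulates as operand pairs arrive.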
Challenge
• Getting right pair of data to show up on each cycle
• c(i,j) = w(i,j) + min_k (c(i,k) + c(k+1,j))
• E.g. (5,0) wants to see
  – (5,1) (0,0)
  – (5,2) (1,0)
  – (5,3) (2,0)
  – (5,4) (3,0)
  – (5,5) (4,0)

Computational Array
(0,0) (1,0) (2,0) (3,0) (4,0) (5,0) (6,0)
      (1,1) (2,1) (3,1) (4,1) (5,1) (6,1)
            (2,2) (3,2) (4,2) (5,2) (6,2)
                  (3,3) (4,3) (5,3) (6,3)
                        (4,4) (5,4) (6,4)
                              (5,5) (6,5)
                                    (6,6)

Systolic Algorithm
• For cell at distance t from edge
  – Has completed data at time 2t
  – Send data along row and column
• Data travels 1 cell / cycle for t units of time
• Then travels 2 cells / cycle for rest of time

Computational Array
[Figure, two frames: the array above with numeric annotations (1, 3, 5, 7 along row 0; also 2, 6, 7, 8) and the operand pairs for (5,0)]
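The operand pairing in the Challenge slide can be generated mechanically, which is a quick way to check the (5,0) example. The sketch assumes cells are labeled (column, row) as in the Computational Array figure and that the k+1 form of the recurrence applies; `operand_pairs` and `array_cells` are hypothetical helper names.

```python
def operand_pairs(col, row):
    """Pairs of cells that cell (col, row) must combine, one pair per split k."""
    return [((col, k + 1), (k, row)) for k in range(row, col)]


def array_cells(n):
    """All cells (col, row) of the triangular array with 0 <= row <= col < n."""
    return [(col, row) for row in range(n) for col in range(row, n)]


if __name__ == "__main__":
    for first, second in operand_pairs(5, 0):
        print(first, second)
    print(len(array_cells(7)))  # 28 cells in the 7-wide triangle
```

Each pair shares a column with the target cell in one operand and a row in the other, which is why the systolic algorithm sends data along rows and columns.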