CENG5030 Part 1-2: Voltage Scaling - A Dynamic Programming Approach



slide-1
SLIDE 1

CENG5030 Part 1-2: Voltage Scaling

A Dynamic Programming Approach

Bei Yu

(Latest update: January 14, 2019)

Spring 2019

1 / 27

slide-2
SLIDE 2

Overview

◮ Introduction
◮ Background: NP problem
◮ Background: Dynamic Programming
◮ DAC’07: Voltage Partitioning
◮ ICCAD’06: Voltage Assignment on Netlist
◮ ICCAD’07: Voltage Assignment on Slicing Floorplanning

2 / 27

slide-3
SLIDE 3

Introduction

3 / 27

slide-4
SLIDE 4

Multi-Voltage Design @IBM [1]

[1] Ruchir Puri et al. (2003). “Pushing ASIC performance in a power envelope”. In: Proc. DAC, pp. 788–793.

3 / 27

slide-5
SLIDE 5

Level Converter [2]

A level converter is inserted at the boundary between low-voltage and high-voltage regions to avoid excessive static power consumption.

[2] Ruchir Puri et al. (2003). “Pushing ASIC performance in a power envelope”. In: Proc. DAC, pp. 788–793.

4 / 27

slide-6
SLIDE 6

Placement Level Multi-Voltage [3]

[3] Huaizhi Wu and Martin DF Wong (2009). “Incremental improvement of voltage assignment”. In: IEEE TCAD 28.2, pp. 217–230.

5 / 27

slide-7
SLIDE 7

[4]

[4] Kristof Blutman et al. (2017). “Floorplan and placement methodology for improved energy reduction in stacked power-domain design”. In: Proc. ASPDAC, pp. 444–449.

6 / 27

slide-8
SLIDE 8

Floorplanning Level Multi-Voltage

[Figure: floorplan of modules 1 to 8, each marked high voltage or low voltage; delay-power plot with the points (d1, p1) and (d2, p2)]

◮ Modules are assigned high voltage or low voltage.
◮ Low voltage → high delay.
◮ Trade-off between power saving and performance.

7 / 27

slide-9
SLIDE 9

Floorplanning Level Multi-Voltage

[Figure: floorplan of modules 1 to 8, each marked high voltage or low voltage; panel title: Power Network Resource]

◮ Modules are assigned high voltage or low voltage.
◮ Low voltage → high delay.
◮ Trade-off between power saving and performance.
◮ Consider the power network resource:
  ◮ High-voltage modules should be packed close together.
  ◮ Generate voltage islands.

7 / 27

slide-10
SLIDE 10

What Is a Netlist?

[Figure: netlist over modules 1, 2, ..., N-1, N; label: Tcycle]

8 / 27

slide-11
SLIDE 11

What Is Floorplanning?

[Figure: modules 1 to 8 before floorplanning and their arrangement after floorplanning]

9 / 27

slide-12
SLIDE 12

B*-Tree [5]

[5] Yun-Chih Chang et al. (2000). “B*-Trees: A New Representation for Non-Slicing Floorplans”. In: Proc. DAC, pp. 458–463.

10 / 27

slide-13
SLIDE 13

Classic Design Flow [7]

◮ Integer linear programming (ILP) based
◮ A more complicated ILP formulation is developed in ICCAD’07 [6].

[6] Wan-Ping Lee, Hung-Yi Liu, and Yao-Wen Chang (2007). “An ILP algorithm for post-floorplanning voltage-island generation considering power-network planning”. In: Proc. ICCAD, pp. 650–655.

[7] Wai-Kei Mak and Jr-Wei Chen (2007). “Voltage island generation under performance requirement for SoC designs”. In: Proc. ASPDAC, pp. 798–803.

11 / 27

slide-16
SLIDE 16

Background: NP problem

13 / 27

slide-17
SLIDE 17

NP-Completeness [Garey & Johnson, 1979] [8]

◮ Decision problem (yes/no problem)
◮ NP: set of problems with a nondeterministic polynomial-time algorithm
◮ P: set of problems with a (deterministic) polynomial-time algorithm
◮ NP-complete: the hardest problems in NP

If any NP-complete problem is solved in polynomial time → every problem in NP is solved in polynomial time.

[8] Some content and figures in this part come from Prof. Takahashi.

13 / 27

slide-18
SLIDE 18

NP-Completeness [Garey & Johnson, 1979] [8]

◮ Decision problem (yes/no problem)
◮ NP: set of problems with a nondeterministic polynomial-time algorithm
◮ P: set of problems with a (deterministic) polynomial-time algorithm
◮ NP-complete: the hardest problems in NP
◮ Conjecture: P ≠ NP

If any NP-complete problem is solved in polynomial time → every problem in NP is solved in polynomial time.

13 / 27

slide-19
SLIDE 19

Polynomial Time Reduction

◮ Provides a relative-difficulty relation between problems
◮ SAT is NP-complete → 3SAT, Hamiltonian cycle, TSP, graph coloring, ... are NP-complete by reduction

14 / 27

slide-20
SLIDE 20

NP-Hardness [Garey & Johnson, 1979]

◮ Optimization problem
◮ Not a decision problem, so it is neither in NP nor NP-complete
◮ NP-hard if a related decision problem is NP-complete
◮ E.g. Travelling Salesman Problem (TSP)
◮ No polynomial-time algorithm is known (none exists unless P = NP)
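The link between an optimization problem and its related decision problem can be sketched with TSP: the decision version asks whether a tour of length at most some bound exists, and a proposed tour can be checked in polynomial time, which is what membership in NP requires. The function name `verify_tsp_tour`, the distance-matrix encoding, and the bound `budget` are illustrative assumptions, not part of the slides.

```python
from typing import Sequence


def verify_tsp_tour(dist: Sequence[Sequence[float]],
                    tour: Sequence[int],
                    budget: float) -> bool:
    """Check a certificate for the TSP decision problem.

    Returns True iff `tour` visits every city exactly once and the
    closed tour length is at most `budget`. The check is polynomial
    in the number of cities.
    """
    n = len(dist)
    # The certificate must be a permutation of the cities 0..n-1.
    if sorted(tour) != list(range(n)):
        return False
    # Sum the edges, including the edge back to the starting city.
    length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return length <= budget


dist = [[0, 1, 4, 3],
        [1, 0, 2, 5],
        [4, 2, 0, 1],
        [3, 5, 1, 0]]
print(verify_tsp_tour(dist, [0, 1, 2, 3], 7))  # length 1 + 2 + 1 + 3 = 7
print(verify_tsp_tour(dist, [0, 2, 1, 3], 7))  # length 4 + 2 + 5 + 3 = 14
```

Checking a tour is easy; finding one that meets the bound is the part for which no polynomial-time algorithm is known.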

15 / 27

slide-21
SLIDE 21

Strategies of Algorithm Design

1. Check whether the problem is easy or not.
2. If possible, prove that it is NP-hard or NP-complete.
3. For an easy problem (in P):
4. For a hard problem (NP-hard):

16 / 27

slide-22
SLIDE 22

Background: Dynamic Programming

17 / 27

slide-24
SLIDE 24

Case 1: Calculating Binomial Coefficient

Question: Can we have a better algorithm?
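The algorithm shown on the original slide did not survive extraction. Assuming it is the direct recursion C(n, k) = C(n-1, k-1) + C(n-1, k), the sketch below contrasts it with a dynamic programming version that fills one row of Pascal's triangle in place. Both function names are illustrative.

```python
def binom_naive(n: int, k: int) -> int:
    """Direct recursion; recomputes the same subproblems exponentially often."""
    if k == 0 or k == n:
        return 1
    return binom_naive(n - 1, k - 1) + binom_naive(n - 1, k)


def binom_dp(n: int, k: int) -> int:
    """Bottom-up DP in O(n * k) time and O(k) space, assuming 0 <= k <= n."""
    row = [1] + [0] * k  # row[j] holds C(i, j) for the current i
    for i in range(1, n + 1):
        # Go right to left so row[j - 1] still holds C(i - 1, j - 1).
        for j in range(min(i, k), 0, -1):
            row[j] += row[j - 1]
    return row[k]


print(binom_naive(5, 2), binom_dp(5, 2))  # both print 10
```

Each entry of the table is computed once, which is where the saving over the recursion comes from.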

17 / 27

slide-25
SLIDE 25

Case 2: Knapsack Problem

$$\max \sum_{i=1}^{n} x_i v_i \quad \text{s.t.} \quad \sum_{i=1}^{n} x_i w_i \le W$$

◮ $v_i$: value of object $i$
◮ $w_i$: weight of object $i$
◮ $x_i \in \{0, 1\}$

Question
◮ Design a dynamic programming algorithm to solve it.
◮ What if $x_i$ can be a floating value?
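One possible answer to both questions, as a sketch rather than the intended solution: a table-based DP for the 0/1 case, assuming non-negative integer weights, and a greedy by value density for the case where x_i may take any value in [0, 1], assuming positive weights. The function names and the sample data are illustrative.

```python
def knapsack_01(values, weights, capacity):
    """0/1 knapsack by DP over capacities, O(n * W) time.

    best[c] is the best value achievable with total weight at most c
    using the objects processed so far.
    """
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Go downwards so each object is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


def knapsack_fractional(values, weights, capacity):
    """Fractional relaxation: take objects greedily by value per unit weight."""
    items = sorted(zip(values, weights),
                   key=lambda vw: vw[0] / vw[1], reverse=True)
    total, remaining = 0.0, capacity
    for v, w in items:
        take = min(w, remaining)  # possibly only part of the object
        total += v * take / w
        remaining -= take
        if remaining == 0:
            break
    return total


values, weights, W = [60, 100, 120], [10, 20, 30], 50
print(knapsack_01(values, weights, W))          # 220
print(knapsack_fractional(values, weights, W))  # 240.0
```

The fractional optimum is never below the 0/1 optimum, since every 0/1 solution is also feasible for the relaxation.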

18 / 27

slide-26
SLIDE 26

Principle of Optimality

In an optimal sequence of decisions or choices, each subsequence must also be optimal.
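Applied to the knapsack problem of Case 2, the principle gives a recurrence. The notation $F(i, c)$, the best value using objects $1, \dots, i$ with capacity $c$, is introduced here for illustration and is not from the slides.

```latex
F(0, c) = 0, \qquad
F(i, c) = \max\bigl\{\, F(i-1, c),\;\; F(i-1,\, c - w_i) + v_i \,\bigr\},
\quad \text{where the second term applies only if } w_i \le c .
```

Whichever choice is made for object $i$, the remaining objects must be packed optimally into the remaining capacity, so $F(n, W)$ is the optimum.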

19 / 27

slide-27
SLIDE 27

DAC’07: Voltage Partitioning

20 / 27

slide-28
SLIDE 28

[9]

[9] Hung-Yi Liu, Wan-Ping Lee, and Yao-Wen Chang (2007). “A provably good approximation algorithm for power optimization using multiple supply voltages”. In: Proc. DAC, pp. 887–890.

20 / 27

slide-29
SLIDE 29

◮ On perfect-number partition: https://en.wikipedia.org/wiki/Partition_problem
◮ Correction: [10]

[10] Tao Lin et al. (2010). “A revisit to voltage partitioning problem”. In: Proc. GLSVLSI, pp. 115–118.
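For reference, the partition problem in the linked article asks whether a multiset of integers can be split into two parts with equal sums. Below is a minimal sketch of the standard pseudo-polynomial DP over reachable subset sums, assuming non-negative integers. It illustrates the linked problem only and does not reproduce the algorithms of [9] or [10]; the function name is illustrative.

```python
def can_partition(nums):
    """Return True iff nums splits into two subsets with equal sums.

    Pseudo-polynomial: the set of reachable sums never exceeds
    sum(nums) / 2 + 1 entries.
    """
    total = sum(nums)
    if total % 2:
        return False  # an odd total cannot be halved
    target = total // 2
    reachable = {0}  # subset sums reachable with the numbers seen so far
    for x in nums:
        reachable |= {s + x for s in reachable if s + x <= target}
    return target in reachable


print(can_partition([3, 1, 1, 2, 2, 1]))  # True, e.g. {3, 2} vs {1, 1, 2, 1}
print(can_partition([1, 2, 5]))           # False
```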

21 / 27

slide-30
SLIDE 30

ICCAD’06: Voltage Assignment on Netlist

22 / 27

slide-31
SLIDE 31

ICCAD’06: Voltage Assignment on Netlist [11]

[Figures: (a) Algorithm Flow; (b) Notations]

Question:
◮ How to define a slack for each vertex vi?
◮ Please provide a mathematical formulation minimizing total power consumption.

[11] Wan-Ping Lee, Hung-Yi Liu, and Yao-Wen Chang (2006). “Voltage island aware floorplanning for power and timing optimization”. In: Proc. ICCAD, pp. 389–394.
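One common way to answer the first question, offered as a sketch and not necessarily the definition used in [11]: take the slack of a vertex as its required time minus its arrival time, both obtained by propagating delays over the netlist DAG. The function name `vertex_slacks`, the dictionaries, and the sample delays are assumptions for illustration; `graphlib` needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def vertex_slacks(delay, fanin, t_cycle):
    """Slack of every vertex of a DAG, as required time minus arrival time.

    delay[v]  : delay of vertex v
    fanin[v]  : list of predecessors of v (every vertex must be a key)
    t_cycle   : required time at vertices without successors
    """
    order = list(TopologicalSorter(fanin).static_order())  # predecessors first

    # Forward pass: latest arrival at the output of each vertex.
    arrival = {}
    for v in order:
        arrival[v] = delay[v] + max((arrival[u] for u in fanin[v]), default=0)

    # Backward pass: latest time each vertex output may settle.
    fanout = {v: [] for v in order}
    for v, preds in fanin.items():
        for u in preds:
            fanout[u].append(v)
    required = {}
    for v in reversed(order):
        required[v] = min((required[w] - delay[w] for w in fanout[v]),
                          default=t_cycle)

    return {v: required[v] - arrival[v] for v in order}


delay = {"v1": 2, "v2": 3, "v3": 1, "v4": 2}
fanin = {"v1": [], "v2": ["v1"], "v3": ["v1"], "v4": ["v2", "v3"]}
print(vertex_slacks(delay, fanin, t_cycle=10))
```

In this example the vertices on the longest path share the smallest slack, and the off-path vertex v3 has more room, which is the room a lower voltage could consume.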

22 / 27

slide-34
SLIDE 34

◮ Further speed-up: dual to min-cost flow [12]
◮ Overcome the reconvergence issue: [13]

[12] Qiang Ma and Evangeline FY Young (2008). “Network flow-based power optimization under timing constraints in MSV-driven floorplanning”. In: Proc. ICCAD, pp. 1–8.

[13] Yifang Liu and Jiang Hu (2009). “A New Algorithm for Simultaneous Gate Sizing and Threshold Voltage Assignment”. In: Proc. ISPD, pp. 27–34.

23 / 27

slide-35
SLIDE 35

Consistency Relaxation [14]

Backward solution propagation:

v4: (v4=1)[c=3.1, q=8.2]   (v4=2)[c=2.9, q=8]   (v4=3)[c=2.7, q=7.9]
v2: (v2=1: v4=1)[c=3, q=7]   (v2=2: v4=2)[c=2, q=5]
v3: (v3=1: v4=1)[c=4, q=7]   (v3=2: v4=2)[c=3, q=6]
v1: (v1=1: v2=2, v3=2)[c=5, q=3.5]   (v1=2: v2=1, v3=2)[c=5, q=4.4]

[Figure: graph on nodes v1, v2, v3, v4]

[14] Yifang Liu and Jiang Hu (2009). “A New Algorithm for Simultaneous Gate Sizing and Threshold Voltage Assignment”. In: Proc. ISPD, pp. 27–34.

24 / 27

slide-37
SLIDE 37

Consistency Relaxation [14]

v4: (v4=1)[c=3.1, q=8.2]   (v4=2)[c=2.9, q=8]
v2: (v2=1: v4=1)[c=3, q=7]
v3: (v3=2: v4=2)[c=3, q=6]
v1: (v1=2: v2=1, v3=2)[c=5, q=4.4]

v4 = 1 or v4 = 2?

[Figure: graph on nodes v1, v2, v3, v4]

24 / 27

slide-38
SLIDE 38

Consistency Restoration [14]

Forward solution propagation:

(v1=2)[a=2]   (v2=1)[a=3.6]   (v3=2)[a=3.6]
(v4=1)[a= , q=8.2]   (v4=2)[a= , q=8]   (v4=3)[a= , q=7.9]

[Figure: graph on nodes v1, v2, v3, v4]

D(vi,vj)   V4=3   V4=2   V4=1
V3=2       2.2    2      3
V2=1       2.2    3      1.2

24 / 27

slide-39
SLIDE 39

Consistency Restoration [14]

Forward solution propagation:

(v1=2)[a=2]   (v2=1)[a=3.6]   (v3=2)[a=3.6]
(v4=1)[a=6.6, q=8.2]   (v4=2)[a=6.6, q=8]   (v4=3)[a=5.8, q=7.9]

[Figure: graph on nodes v1, v2, v3, v4]

D(vi,vj)   V4=3   V4=2   V4=1
V3=2       2.2    2      3
V2=1       2.2    3      1.2
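The a values at v4 can be reproduced with a short sketch, under two assumptions that the extracted slide does not confirm: that a is an arrival time computed as the maximum over fan-ins of a(vi) + D(vi, vj), and that the D(vi, vj) numbers on this slide are arranged with rows V3=2 and V2=1 and columns V4=3, V4=2, V4=1. The variable and function names are illustrative.

```python
# Arrival times of the chosen options v2=1 and v3=2, as listed on the slide.
arrival = {"v2": 3.6, "v3": 3.6}

# Assumed reading of the D(vi, vj) numbers: delay from each fan-in of v4
# to v4, indexed by the option chosen for v4.
delay_to_v4 = {
    "v2": {1: 1.2, 2: 3.0, 3: 2.2},
    "v3": {1: 3.0, 2: 2.0, 3: 2.2},
}


def arrival_at_v4(option):
    """Latest arrival at v4 over both fan-ins, rounded to one decimal."""
    latest = max(arrival[u] + delay_to_v4[u][option] for u in arrival)
    return round(latest, 1)  # rounding hides floating-point noise


for option in (1, 2, 3):
    print(f"v4={option}: a={arrival_at_v4(option)}")
```

Under this reading the sketch gives a = 6.6, 6.6 and 5.8 for v4 = 1, 2, 3, matching the values listed on the slide.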

24 / 27

slide-43
SLIDE 43

Iterative Refinement [14]

[Figure: circuit in consideration]

24 / 27

slide-44
SLIDE 44

Iterative Refinement [14]

[Figure: circuit after relaxation; labels 3x, 5x, 6x, 1x, 3x, 8x]

24 / 27

slide-45
SLIDE 45

Iterative Refinement [14]

[Figure: circuit after Phase I restoration; labels 6x, 1x, 3x, 7x, 3x]

24 / 27

slide-46
SLIDE 46

Iterative Refinement [14]

[Figure: Phase II begins, with the solution propagation direction marked; labels 5x, 1x, 2x, 7x, 3x]

24 / 27

slide-47
SLIDE 47

Iterative Refinement [14]

[Figure: circuit with the solution propagation direction marked; labels 5x, 1x, 2x, 6x, 3x]

24 / 27

slide-48
SLIDE 48

Iterative Refinement [14]

[Figure: circuit with the solution propagation direction marked; labels 4x, 1x, 2x, 6x, 3x]

24 / 27

slide-49
SLIDE 49

Iterative Refinement [14]

[Figure: circuit with the solution propagation direction marked; labels 4x, 1x, 2x, 5x, 3x]

24 / 27

slide-50
SLIDE 50

Iterative Refinement [14]

◮ Monotonic improvement of the solution by iterative refinement

24 / 27

slide-51
SLIDE 51

ICCAD’07: Voltage Assignment on Slicing Floorplanning

25 / 27

slide-52
SLIDE 52

ICCAD’07: Voltage Assignment on Slicing Floorplanning [15]

[15] Qiang Ma and Evangeline FY Young (2007). “Voltage island-driven floorplanning”. In: Proc. ICCAD, pp. 644–649.

25 / 27

slide-53
SLIDE 53

Normalized Polish Expression (NPE)

◮ Slicing floorplan representation
◮ A sequence of operands and operators
  ◮ An operand denotes a block
  ◮ An operator denotes a cut direction
    ◮ ‘+’ denotes a horizontal cut
    ◮ ‘*’ denotes a vertical cut

NPE: n1 n2 n3 + * n4 +

Slicing tree: the root + has children * and n4; the node * has children n1 and +; the inner + has children n2 and n3.

[Figure: the corresponding slicing floorplan with blocks b1, b2, b3, b4]
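The step from an NPE to its slicing tree can be sketched with a stack, since the expression is in postfix order. The function names, the tuple encoding `(operator, left, right)`, and the block sizes below are illustrative assumptions. The normalization check uses the usual definition: no two identical operators appear consecutively.

```python
OPERATORS = ("+", "*")


def parse_npe(tokens):
    """Build a slicing tree (op, left, right) from a postfix expression."""
    stack = []
    for tok in tokens:
        if tok in OPERATORS:
            if len(stack) < 2:
                raise ValueError("operator without two operands")
            right = stack.pop()
            left = stack.pop()
            stack.append((tok, left, right))
        else:
            stack.append(tok)  # an operand is a block name
    if len(stack) != 1:
        raise ValueError("not a valid Polish expression")
    return stack[0]


def is_normalized(tokens):
    """Valid postfix expression with no two identical adjacent operators."""
    operands = operators = 0
    prev = None
    for tok in tokens:
        if tok in OPERATORS:
            operators += 1
            if operators >= operands or tok == prev:
                return False
        else:
            operands += 1
        prev = tok
    return operands == operators + 1


def bounding_box(tree, size):
    """Width and height of the floorplan described by a slicing tree."""
    if isinstance(tree, str):
        return size[tree]
    op, left, right = tree
    (wl, hl), (wr, hr) = bounding_box(left, size), bounding_box(right, size)
    if op == "+":  # horizontal cut: one part on top of the other
        return (max(wl, wr), hl + hr)
    return (wl + wr, max(hl, hr))  # vertical cut: parts side by side


npe = "n1 n2 n3 + * n4 +".split()
tree = parse_npe(npe)
print(tree)
print(is_normalized(npe))
# Hypothetical block sizes (width, height), for illustration only.
print(bounding_box(tree, {"n1": (2, 3), "n2": (2, 1), "n3": (2, 2), "n4": (4, 1)}))
```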

26 / 27

slide-54
SLIDE 54

Optimal Island Partitioning

◮ Given a candidate floorplan represented by an NPE, we can perform optimal island partitioning and voltage assignment on the slicing tree.
◮ Procedure TreePart(TreeNode u, Num_island k)
  ◮ Optimally partitions the tree rooted at u into k islands
  ◮ Solved by dynamic programming
◮ When k = 1
  ◮ Case 1: the island is in the left subtree
  ◮ Case 2: the island is in the right subtree
  ◮ Case 3: the whole tree rooted at u forms an island
  ◮ Case 4: the island is formed across the left and right subtrees

[Figure: a node + with left subtree L and right subtree R]

26 / 27

slide-55
SLIDE 55

Optimal Island Partitioning

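◮ Case 4: the island is formed across subtrees
  ◮ A set of contiguous right subtrees may also form an island when the operators along the left tree branches are the same.

[Figure: “Same operator”: a chain of three + nodes over the subtrees A, B, C, D, and the corresponding floorplan; “Not same operator”: a tree with operators + and * over n1, n2, n3, and the floorplan with blocks b1, b2, b3]

26 / 27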

slide-56
SLIDE 56

Optimal Island Partitioning

◮ The procedure NonSubtree() deals with Case 4.

NonSubtree(TreeNode u, num_island k)
  min_cost = ∞
  S = right_child(u)
  op = operator(u)
  While operator(left_child(u)) is op
    u = left_child(u)
    S = S ∪ right_child(u)
    C = TreePart(left_child(u), k-1) + cost(S)
    If min_cost > C, min_cost = C
  Return(min_cost)

[Figure: a chain of + nodes over the subtrees A, B, C, D, E; TreePart() is applied to the remaining left subtree]

26 / 27

slide-57
SLIDE 57

Optimal Island Partitioning

◮ When k is more than 1, exhaust all the different ways of distributing the k islands by dynamic programming.

TreePart(TreeNode u, num_island k)
  min_cost = NonSubtree(u, k)
  For i = 0 to k
    C = TreePart(left_child(u), i) + TreePart(right_child(u), k-i)
    If min_cost > C, min_cost = C
  Return(min_cost)

[Figure: a node + whose left subtree L is partitioned into i islands and whose right subtree R is partitioned into k-i islands, for i = 0, 1, …, k, plus the across-subtree islands]
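The two procedures can be put together in a runnable sketch. Several parts are assumptions, because the slides do not spell them out: the base cases (k = 0 and leaves), the handling of Case 3 when k = 1, the reading of k as "exactly k islands", and the whole cost model (`HIGH`, `LOW`, `ISLAND_OVERHEAD`, `cost`). Trees are encoded as tuples `(operator, left, right)` with block names as leaves, and `lru_cache` memoizes the results of `tree_part`.

```python
import math
from functools import lru_cache

# Hypothetical cost model, not from the slides: power of each block at the
# chip-level voltage, power inside a low-voltage island, and a fixed
# overhead charged once per island.
HIGH = {"A": 5, "B": 4, "C": 6, "D": 3}
LOW = {"A": 2, "B": 3, "C": 2, "D": 2}
ISLAND_OVERHEAD = 2


def blocks(u):
    """All block names in the subtree u."""
    return (u,) if isinstance(u, str) else blocks(u[1]) + blocks(u[2])


def cost(island):
    """Assumed cost of one island containing the given blocks."""
    return ISLAND_OVERHEAD + sum(LOW[b] for b in island)


@lru_cache(maxsize=None)
def tree_part(u, k):
    """Minimum cost of the subtree u when it contains exactly k islands."""
    if k < 0:
        return math.inf
    if k == 0:  # assumed base case: every block stays at high voltage
        return sum(HIGH[b] for b in blocks(u))
    if isinstance(u, str):  # assumed base case: a leaf holds at most one island
        return cost((u,)) if k == 1 else math.inf
    _, left, right = u
    best = non_subtree(u, k)  # Case 4
    if k == 1:
        best = min(best, cost(blocks(u)))  # Case 3: whole subtree is the island
    for i in range(k + 1):  # Cases 1 and 2, and every split for k > 1
        best = min(best, tree_part(left, i) + tree_part(right, k - i))
    return best


def non_subtree(u, k):
    """Case 4: one island made of contiguous right subtrees along the left spine."""
    best = math.inf
    op = u[0]
    island = blocks(u[2])
    while not isinstance(u[1], str) and u[1][0] == op:
        u = u[1]
        island = island + blocks(u[2])
        best = min(best, tree_part(u[1], k - 1) + cost(island))
    return best


chain = ("+", ("+", ("+", "A", "B"), "C"), "D")
print(tree_part(chain, 0))  # 18: no island
print(tree_part(chain, 1))  # 11: all four blocks in one island
print(tree_part(chain, 2))  # 13
```

With this particular cost model a second island only adds overhead, so the cost rises from k = 1 to k = 2; a different cost function would shift that balance.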

26 / 27

slide-58
SLIDE 58

Optimal Island Partitioning

◮ Use the cost table to speed up procedure TreePart(u, k)
  ◮ Store the best partitioning solution of each node
    ◮ Minimizes the number of recursive calls, a dynamic programming technique
  ◮ After each move, only the nodes lying on the path from the perturbed node to the root need to be updated

NPE: n1 n2 n3 n4 * + * n5 n6 n7 n8 * + * +

[Figure: the corresponding slicing tree, with its internal nodes numbered 1 to 7]

Cost Table: one row per internal node, 1 (*), 2 (+), 3 (*), 4 (*), 5 (+), 6 (*), 7 (+), and one column per number of islands, 1, 2, 3, …, K-1, K.

26 / 27

slide-59
SLIDE 59

Takeaway

◮ NP-completeness and NP-hardness
◮ Dynamic programming: when and how
◮ How to evaluate previous work

27 / 27