Online Task Remapping Strategies for Fault-tolerant Network-on-Chip Multiprocessors
Onur Derin, Deniz Kabakcı, Leandro Fiorin
ALaRI, Faculty of Informatics, University of Lugano, Lugano, Switzerland
derino@alari.ch
NOCS’11 - Pittsburgh, May 3, 2011
Outline
- Problem
- ILP formulation of the optimal mapping problem of KPN applications onto NoC
  - Minimization of the communication cost
  - Minimization of the total computation time
- Online Task Remapping
  - Optimal Task Remapping
  - Center of Gravity method (CoG)
  - Nonidentical Multiprocessor Scheduling Heuristics (NMS)
  - Localized NMS Heuristic (LNMS)
- Case study
- Results
O. Derin, ALaRI NOCS’11— Online Task Remapping Strategies for Fault-tolerant Network-on-Chip Multiprocessors 2/31
Introduction
Continuity of service support in the MADNESS NoC platform
- Starting point
  - Kahn Process Networks as the computation model
  - xpipes-based NoC from Uni. Cagliari, NORMA model, message-passing support
- Main tasks
  - a middleware to execute KPN on NoCs
  - fault detection/masking via self-testing and self-checking
  - reconfiguration via online task remapping
  - task migration from faulty nodes
  - fault-tolerant interconnect
Problem
[Figure: A KPN application running on a 3x3 mesh. Tasks t1 to t12 are connected by data dependencies e1 to e14; processing nodes n1 to n9 are connected by links l1 to l12. Core types by row: n1 RISC, n2 DSP, n3 RISC; n4 DSP, n5 RISC, n6 DSP; n7 RISC, n8 DSP, n9 RISC.]
Problem: Where should the tasks on the faulty core be moved?
[Figure: Processing node n5 becomes faulty. Same KPN application (tasks t1 to t12, data dependencies e1 to e14) on the same 3x3 mesh (nodes n1 to n9, links l1 to l12).]
Approach
- Solve the task mapping problem onto NoC-based heterogeneous multiprocessors optimally
- Consider different fault scenarios and find new optimal remappings
- Propose heuristics for the problem
- Compare their performance against the optimal results
Mapping problem
- Given a KPN task graph
- Given an architecture graph and a deterministic routing algorithm
- Find the optimal mapping, such that the computation time and the amount of communication are minimized
ILP formulation of the mapping problem
- A task graph $g_t = (V_t, E_t)$ is composed of tasks $t \in V_t$ and data dependencies $e \in E_t \subseteq V_t \times V_t$.
- An architecture graph $g_a = (V_a, E_a)$ is composed of processing nodes $n \in V_a$ and bidirectional communication links $l \in E_a \subseteq V_a \times V_a$.
- A task binding $\beta_t : V_t \to V_a$ is an assignment of tasks $t \in V_t$ to nodes $n \in V_a$.
- A communication binding $\beta_c : E_t \to E_a^i$ is an assignment of data dependencies $e \in E_t$ to paths of length $i$ in the architecture graph $g_a$.
- A path $p$ of length $i$ is given by the $i$-tuple $p = (l_1, l_2, \ldots, l_i)$.
ILP formulation of the mapping problem
- $\mathit{path} : (V_a, V_a) \to E_a^i$ is a function that implements a deterministic routing algorithm and returns a path between two given nodes.
- The path set $P$ is the set of paths between all node pairs:
$$P = \{\, p_k : p_k = \mathit{path}(n_i, n_j),\ \forall n_i, n_j \in V_a \wedge n_i \neq n_j \,\}$$
- The task graph can be annotated with demand values, where the demand $d_i$ on a data dependency $e_i \in E_t$ denotes the required bandwidth between the two tasks.
- The architecture graph can be annotated with capacity values, where the capacity $c_i$ of an architectural link $l_i \in E_a$ denotes the maximum bandwidth of the communication link between two architectural nodes.
- The core type set $C$ consists of the core types $C_i$ available in a given NoC platform.
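The path function and the path set $P$ can be illustrated with a small sketch. The slides do not name the routing algorithm, so XY (dimension-ordered) routing is assumed here purely as one example of a deterministic scheme. The row-by-row node numbering and the representation of a path as a tuple of directed hops `(u, v)` are also assumptions made for illustration.

```python
from itertools import permutations

ROWS, COLS = 3, 3  # 3x3 mesh; nodes numbered 1..9 row by row (assumed layout)

def coords(n):
    """Return (row, col) of node n under row-major numbering starting at 1."""
    return divmod(n - 1, COLS)

def node(r, c):
    """Inverse of coords: node number at (row, col)."""
    return r * COLS + c + 1

def xy_path(src, dst):
    """Assumed XY routing: travel along the row first, then along the column.
    Returns the path as a tuple of directed hops (u, v)."""
    r, c = coords(src)
    r_dst, c_dst = coords(dst)
    hops = []
    while c != c_dst:  # X dimension first
        c_next = c + (1 if c_dst > c else -1)
        hops.append((node(r, c), node(r, c_next)))
        c = c_next
    while r != r_dst:  # then Y dimension
        r_next = r + (1 if r_dst > r else -1)
        hops.append((node(r, c), node(r_next, c)))
        r = r_next
    return tuple(hops)

# Path set P: one path per ordered pair of distinct nodes.
ALL_NODES = range(1, ROWS * COLS + 1)
P = {(ni, nj): xy_path(ni, nj) for ni, nj in permutations(ALL_NODES, 2)}

print(len(P))       # number of ordered node pairs
print(P[(1, 9)])    # route from the top-left to the bottom-right corner
```

Because the routing is deterministic, each ordered node pair contributes exactly one path, so $|P| = |V_a|(|V_a| - 1)$.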
Minimization of the communication cost
Decision variables
$$X^{NT}_{ij} = \begin{cases} 1, & \text{if } t_j \in V_t \text{ is bound onto node } n_i \in V_a \\ 0, & \text{otherwise} \end{cases}$$
$$Y^{PE}_{ij} = \begin{cases} 1, & \text{if } e_j \in E_t \text{ is mapped to } p_i \in P \\ 0, & \text{otherwise} \end{cases}$$
Minimization of the communication cost
Parameters
$$M^{TE}_{ij} = \begin{cases} 1, & \text{if } \exists t_k,\ e_j = (t_i, t_k) \in E_t \\ -1, & \text{if } \exists t_k,\ e_j = (t_k, t_i) \in E_t \\ 0, & \text{otherwise} \end{cases}$$
$$M^{NP}_{ij} = \begin{cases} 1, & \text{if } \mathit{source}(p_j) = n_i \\ -1, & \text{if } \mathit{sink}(p_j) = n_i \\ 0, & \text{otherwise} \end{cases}$$
$$M^{PL}_{ij} = \begin{cases} 1, & \text{if } l_j \in p_i \\ 0, & \text{otherwise} \end{cases}$$
Minimization of the communication cost
Constraints
- Constraint 1 (routing): Task mapping and communication binding constrain each other through the routing algorithm implemented in the NoC.
$$X^{NT} M^{TE} = M^{NP} Y^{PE} \qquad (1)$$
- Constraint 2 (task mapping): Each task must be mapped onto exactly one node.
$$X^{TN} \mathbf{1}_{|V_a|} = \mathbf{1}_{|V_t|} \qquad (2)$$
- Constraint 3 (communication mapping): Each data dependency can be mapped onto at most one path.
$$Y^{EP} \mathbf{1}_{|P|} \le \mathbf{1}_{|E_t|} \qquad (3)$$
Here $X^{TN}$ and $Y^{EP}$ are the transposes of $X^{NT}$ and $Y^{PE}$, and $\mathbf{1}_{k}$ is the all-ones column vector of length $k$.
Minimization of the communication cost
- Constraint 4 (capacity): The total bandwidth demand on a link $l_j$ should not exceed the capacity $c_j$ of the link.
$$M^{LP} Y^{PE} d \le c \qquad (4)$$
- Objective 1 (communication cost): The total traffic on the links is the sum of all demands $d_i$ on the links of the paths that arise from a given mapping:
$$\min:\ d^{T} Y^{EP} M^{PL} \mathbf{1}_{|E_a|} \qquad (5)$$
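As a rough illustration of Constraint 4 and Objective 1, the sketch below evaluates a fixed task binding instead of solving the ILP. It assumes XY routing on a 3x3 mesh with row-major node numbering, models each bidirectional link as an unordered node pair, and uses a made-up three-task instance with a uniform link capacity. None of these values come from the slides.

```python
def xy_links(src, dst, cols=3):
    """Links (unordered node pairs) on the assumed XY route between two mesh nodes."""
    (r, c), (r2, c2) = divmod(src - 1, cols), divmod(dst - 1, cols)
    links = []
    while c != c2:  # X dimension first
        step = 1 if c2 > c else -1
        links.append(frozenset({r * cols + c + 1, r * cols + c + step + 1}))
        c += step
    while r != r2:  # then Y dimension
        step = 1 if r2 > r else -1
        links.append(frozenset({r * cols + c + 1, (r + step) * cols + c + 1}))
        r += step
    return links

def communication_cost(binding, edges, demand):
    """Objective 1: each demand is counted once per link of its path."""
    return sum(demand[e] * len(xy_links(binding[s], binding[t]))
               for e, (s, t) in edges.items())

def link_loads(binding, edges, demand):
    """Left-hand side of Constraint 4: accumulated demand per link."""
    loads = {}
    for e, (s, t) in edges.items():
        for link in xy_links(binding[s], binding[t]):
            loads[link] = loads.get(link, 0) + demand[e]
    return loads

# Hypothetical toy instance (not taken from the slides).
edges = {"e1": ("t1", "t2"), "e2": ("t2", "t3"), "e3": ("t1", "t3")}
demand = {"e1": 4, "e2": 2, "e3": 1}
binding = {"t1": 1, "t2": 2, "t3": 5}
capacity = 5  # same capacity assumed for every link

cost = communication_cost(binding, edges, demand)
loads = link_loads(binding, edges, demand)
feasible = all(load <= capacity for load in loads.values())
print(cost, feasible)
```

Two tasks bound to the same node produce an empty path and contribute nothing to the cost, which matches the "at most one path" wording of Constraint 3.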
Minimization of the total computation time
Parameters
$$M^{TC}_{\mathrm{cap},ij} = \begin{cases} 1, & \text{if } C_j \in C \text{ is capable of realizing task } t_i \in V_t \\ 0, & \text{otherwise} \end{cases}$$
$$T^{TC}_{\mathrm{cap},ij} = \begin{cases} \text{completion time of } t_i \text{ on } C_j, & \text{if } M^{TC}_{\mathrm{cap},ij} = 1 \\ 0, & \text{if } M^{TC}_{\mathrm{cap},ij} = 0 \end{cases}$$
$$M^{NC}_{ij} = \begin{cases} 1, & \text{if } n_i \in V_a \text{ is of core type } C_j \in C \\ 0, & \text{otherwise} \end{cases}$$
Minimization of the total computation time
Constraints
- Constraint 5 (capability): All tasks should be mapped onto cores that are capable of implementing those tasks.
$$M^{TC} = X^{TN} M^{NC} \le M^{TC}_{\mathrm{cap}} \qquad (6)$$
Minimization of the total computation time
- Objective 2 (total execution time): We calculate the total computation time of the application as the maximum, over all cores, of the sum of the execution times of the tasks mapped onto the same core.
$$\min:\ \max(T^{N}) = \max\left( X^{NT} \left( \left( (X^{TN} M^{NC}) \circ T^{TC}_{\mathrm{cap}} \right) \mathbf{1}_{|C|} \right) \right) \qquad (7)$$
where $\circ$ denotes the element-wise product.
- We apply linearization techniques to this formula to handle $\max()$ and the products of decision variables $x_{ij} \ast x_{kl}$.
- The analytical model for the adopted objectives is valid for acyclic KPN applications, in cases where communication is fast relative to computation and the network is not overloaded (no congestion).
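Objective 2 can be evaluated for a fixed binding without any matrix machinery. The sketch below is a plain-dictionary reading of equation (7), not the authors' tool. The node types, task names and completion times are hypothetical, and a missing entry in `exec_time` stands in for $M^{TC}_{\mathrm{cap},ij} = 0$.

```python
def node_times(binding, node_type, exec_time):
    """T^N: for every node, the sum of completion times of the tasks bound to it.
    Raises if a task is bound to a core type that cannot run it (Constraint 5)."""
    totals = {n: 0 for n in node_type}
    for task, n in binding.items():
        core = node_type[n]
        if core not in exec_time[task]:
            raise ValueError(f"{task} cannot run on a {core} core")
        totals[n] += exec_time[task][core]
    return totals

# Hypothetical instance: three nodes, three tasks.
node_type = {1: "RISC", 2: "DSP", 3: "RISC"}
exec_time = {
    "t1": {"RISC": 10, "DSP": 4},  # t1 can run on either core type
    "t2": {"DSP": 6},              # t2 is DSP-only
    "t3": {"RISC": 7},             # t3 is RISC-only
}
binding = {"t1": 2, "t2": 2, "t3": 1}

times = node_times(binding, node_type, exec_time)
objective = max(times.values())  # the value that Objective 2 minimizes
print(times, objective)
```

The most heavily loaded node determines the objective, so moving a task off that node is the only way to lower it.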
The mapping tool
- based on the IBM® ILOG® CPLEX® API
- multi-objective ILP problem solved with the ε-constraint method
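The ε-constraint method turns a two-objective problem into a series of single-objective ones: one objective is bounded by ε and the other is minimized, then ε is swept. The sketch below shows the idea on an enumerated list of candidate solutions rather than on the ILP itself. Which objective the tool bounds is not stated on the slide, so bounding the communication cost is an assumption, and the candidate values are invented.

```python
def epsilon_constraint(candidates, epsilons):
    """For each bound eps, minimize comp_time subject to comm_cost <= eps.
    candidates: (comm_cost, comp_time) pairs standing in for feasible mappings."""
    front = []
    for eps in epsilons:
        feasible = [c for c in candidates if c[0] <= eps]
        if not feasible:
            continue  # bound too tight: no mapping satisfies it
        # Ties on comp_time are broken in favour of the lower comm_cost.
        best = min(feasible, key=lambda c: (c[1], c[0]))
        if best not in front:
            front.append(best)
    return front

# Hypothetical (comm_cost, comp_time) values for five feasible mappings.
candidates = [(8, 20), (10, 14), (12, 14), (15, 11), (9, 25)]
front = epsilon_constraint(candidates, epsilons=range(8, 16))
print(front)
```

Each distinct result of the sweep is one trade-off point between the two objectives. In an ILP setting the inner `min` would be replaced by a solver call with the extra constraint added.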
Optimal Task Remapping
New constraints
- Constraint (faulty core): Given a faulty node $n_f$, a new constraint is added to the ILP formulation that forbids mapping tasks onto the faulty node $n_f$.
$$\sum_{j=1}^{|V_t|} X^{NT}_{fj} = 0 \qquad (8)$$
- Constraint (migrate only tasks on the faulty core): Given a faulty node $n_f$ and an initial task mapping $M^{NT}$, a new constraint is added to limit the reconfiguration to the tasks that are running on the faulty node $n_f$.
$$X^{NT}_{ij} = M^{NT}_{ij}, \quad 1 \le i \le |V_a|,\ 1 \le j \le |V_t|,\ i \ne f \qquad (9)$$
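The effect of the two remapping constraints can be sketched with an exhaustive search in place of the ILP solver. The sketch follows the prose of the slide: tasks on the faulty node must leave it, and all other tasks keep their nodes. The cost function (demand times Manhattan hop count on a 3x3 mesh), the task graph and the initial binding are hypothetical, and capability and capacity constraints are left out to keep it short.

```python
from itertools import product

def remap_faulty(binding, faulty, nodes, cost_fn):
    """Re-place only the tasks bound to `faulty`, trying every healthy node for each.
    Returns the cheapest new binding and its cost."""
    moved = [t for t, n in binding.items() if n == faulty]
    healthy = [n for n in nodes if n != faulty]
    best, best_cost = None, None
    for targets in product(healthy, repeat=len(moved)):
        candidate = dict(binding)              # untouched tasks keep their nodes
        candidate.update(zip(moved, targets))  # migrated tasks get new nodes
        cost = cost_fn(candidate)
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost

def hops(a, b, cols=3):
    """Manhattan distance between two nodes of a row-major numbered mesh."""
    (r1, c1), (r2, c2) = divmod(a - 1, cols), divmod(b - 1, cols)
    return abs(r1 - r2) + abs(c1 - c2)

# Hypothetical task graph: (source, sink) -> bandwidth demand.
edges = {("t1", "t2"): 4, ("t2", "t3"): 2}

def comm_cost(b):
    return sum(d * hops(b[s], b[t]) for (s, t), d in edges.items())

binding = {"t1": 4, "t2": 5, "t3": 6}
new_binding, new_cost = remap_faulty(binding, faulty=5,
                                     nodes=range(1, 10), cost_fn=comm_cost)
print(new_binding, new_cost)
```

The search space grows as (number of healthy nodes) to the power of (number of migrated tasks), which is one reason an exhaustive or ILP-based approach may be too slow to run online.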
Optimal Task Remapping
Why not encode optimal remapping solutions for every fault scenario?
- Number of scenarios, $N$:
$$N = \sum_{i=1}^{|V_a|} \binom{|V_a|}{i} = 2^{|V_a|} - 1$$
- Number of bits, $B$:
$$B = \left(2^{|V_a|} - 1\right)\, p\, |V_t|\, \lceil \log_2 |V_a| \rceil$$
- For $|V_a| = 9$, $|V_t| = 12$, $p = 5$, we have $B = 14.97$ Kbytes
- We may need heuristics!
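The storage estimate can be checked with a few lines of arithmetic. The sketch assumes a base-2 logarithm and 1 Kbyte = 1024 bytes, which is what reproduces the 14.97 Kbytes figure. The factor `p` is used exactly as it appears in the formula, since the slide does not spell out its meaning.

```python
from math import ceil, comb, log2

def fault_scenarios(num_nodes):
    """Number of non-empty subsets of nodes that can be faulty: 2^n - 1."""
    return 2 ** num_nodes - 1

def table_bits(num_nodes, num_tasks, p):
    """Bits needed to store one node index per task, p times, for every scenario."""
    bits_per_node_index = ceil(log2(num_nodes))
    return fault_scenarios(num_nodes) * p * num_tasks * bits_per_node_index

# Cross-check the closed form against the explicit binomial sum.
explicit_sum = sum(comb(9, i) for i in range(1, 10))

bits = table_bits(num_nodes=9, num_tasks=12, p=5)
kbytes = bits / 8 / 1024
print(explicit_sum, bits, round(kbytes, 2))
```

Because the scenario count doubles with every added node, the table grows exponentially in $|V_a|$, which supports the slide's conclusion that heuristics may be needed.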