Optimization COMP4601 Design Project B Seminar Presenter: Yao Yuan; Peng Yang; Jingran Cheng
Motivation & Background
• Zero Phone
• iPhone XS
NP-Complete Problem
• NP is the class of decision problems that can be solved in polynomial time by a non-deterministic machine (equivalently, whose candidate solutions can be verified in polynomial time).
• A problem X that is in NP is also NP-complete if and only if every other problem in NP can be quickly (i.e. in polynomial time) transformed into X.
• Existing methods to solve NP-complete problems:
  • Iterating through all possible solutions to find the answer, which can be very slow.
  • Using a greedy method, which may end up in a local optimum.
• Therefore, we introduce optimization algorithms that improve on these methods in both performance and result quality.
Literature Survey
• A number of heuristics have been explored to optimize the problem:
  • Genetic Algorithms
  • Simulated Annealing
  • Tabu Search
Genetic Algorithms
• GA is a powerful and widely used stochastic search-based algorithm.
• GA is an effective technique for solving combinatorial optimization problems, which are non-deterministic in nature and are associated with a large feasible solution space (search space).
• It has been demonstrated that GA is effective in avoiding local optima and achieves results close to the global optimum.
Genetic Algorithms
• The possible solutions to the problem under consideration are encoded as an array of finite length, referred to as a chromosome.
• Initial Population: The first set of potential solutions is called the initial population.
• Evaluate Fitness: The quality of each chromosome is assessed by an evaluation function. This determines the fitness of the chromosome.
• Reproduction: Based on the fitness of each individual, the next generation of possible solutions is created in the process of reproduction.
• Crossover: Creates new individuals by copying parts of two other individuals.
• Mutation: Introduces random transformations to an existing chromosome and creates a new individual.
• Termination criteria: The algorithm terminates when the fitness function stabilizes after iterating over a predetermined number of generations.
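The steps above can be sketched as a short Python loop. This is a minimal sketch, not the presenters' implementation: the toy fitness function (`one_max`, which counts 1 bits), the keep-the-better-half parent selection, and every parameter name and value are assumptions chosen only for illustration.

```python
import random

def one_max(chromosome):
    """Toy fitness (assumption): number of 1 bits in the chromosome."""
    return sum(chromosome)

def run_ga(fitness, length=20, pop_size=30, generations=50,
           mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Initial population: random fixed-length bit arrays (chromosomes).
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # Termination: fixed number of generations.
        # Evaluate fitness and keep the better half as parents (reproduction).
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, length - 1)  # Single-point crossover.
            child = a[:cut] + b[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

best = run_ga(one_max)
print(one_max(best), best)
```

Because the parents survive unchanged into the next generation, the best fitness in this sketch never decreases from one generation to the next.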
Using GA to optimize hardware/software partitioning
Optimization for a System on a Chip (SoC) application
• Software devices
  • General-Purpose Processors (GPP) ✅
• Hardware devices
  • Application Specific Integrated Circuits (ASIC)
  • Field Programmable Gate Array (FPGA) ✅
    • It is reconfigurable => more generic
    • Partial reconfiguration in hardware has also been used to enhance resource utilization
Partition by GA
Chromosome:
• Length is determined by the total number of tasks (nodes) in the task graph.
• A trinary-valued array is used to classify tasks on hardware, reconfigurable hardware and software.
  • A ‘0’ represents software
  • A ‘1’ represents hardware with fixed configuration
  • A ‘2’ represents reconfigurable hardware
Initial Population:
• Randomly select a chromosome.
Fitness function:
• Consider many underlying factors:
  • Inter-processor communication overhead
  • Time overhead on account of memory–FPGA and memory–GPP communication, which is absent in the all-hardware and all-software design approaches
  • Reconfiguration requires additional time and power to reconfigure the resource
Partition by GA
• Reproduction:
  • A single-point crossover is adopted.
  • The mutation operation, in the case of partitioning, moves a task randomly from hardware to software or reconfigurable hardware, or vice versa.
• Termination criteria:
  • GA terminates after executing a predetermined number of iterations (generations).
Figure: A sample task graph
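The encoding and operators for partitioning can be sketched as follows. This is an illustrative sketch only: the function names (`random_chromosome`, `single_point_crossover`, `mutate`) are hypothetical, and the fitness function is deliberately left out because the slides list the factors it considers but not how they are combined.

```python
import random

SOFTWARE, FIXED_HW, RECONFIG_HW = 0, 1, 2  # Trinary encoding from the slides.

def random_chromosome(num_tasks, rng):
    """One gene per task in the task graph, each gene in {0, 1, 2}."""
    return [rng.choice((SOFTWARE, FIXED_HW, RECONFIG_HW))
            for _ in range(num_tasks)]

def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two parents at one random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

def mutate(chromosome, rng):
    """Move one randomly chosen task to a different resource type."""
    mutated = list(chromosome)
    idx = rng.randrange(len(mutated))
    choices = [v for v in (SOFTWARE, FIXED_HW, RECONFIG_HW)
               if v != mutated[idx]]
    mutated[idx] = rng.choice(choices)
    return mutated

rng = random.Random(1)
a = random_chromosome(8, rng)
b = random_chromosome(8, rng)
child_1, child_2 = single_point_crossover(a, b, rng)
print(a, b, child_1, mutate(child_1, rng))
```

Here `mutate` always changes exactly one gene, which matches the slide's description of moving a single task between software, fixed hardware and reconfigurable hardware.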
Interesting results
• Case 1: SW and fixed-configuration HW only
• Case 2: SW and HW, half fixed and half reconfigurable
• For time-critical applications, Case 2 has slightly better performance.
• For power-critical applications, the choice of reconfiguration may be avoided, since its introduction only deteriorates power.
Interesting results
• This depicts the variation in the resource utilization in hardware and software elements, depending on the objective applied.
• Among the hardware resources utilized, the reconfigurable resources outnumber the preconfigured hardware.
Simulated annealing – Background
• In metallurgy and materials science, annealing is a heat treatment involving heating and controlled cooling.
• Annealing occurs by the diffusion of atoms within a solid material, so that the material progresses towards its equilibrium state.
• Heat increases the rate of diffusion by providing the energy needed to break bonds.
• This alteration to existing dislocations allows a metal object to increase its ductility.
Simulated annealing – Steps involved in metallurgy
• A metal is heated to a high temperature
• The metal is gradually cooled on a specific schedule
• As the metal cools, its atoms settle into an optimal crystalline structure
• Annealing improves the cold-working properties of metal
Simulated annealing – Metropolis algorithm
• Metropolis introduced a simple algorithm that can be used to provide an efficient simulation of a collection of atoms in equilibrium at a given temperature.
• In each step of this algorithm, an atom is given a small random displacement and the resulting change ΔE in the energy of the system is computed.
• If ΔE ≤ 0, the new configuration is accepted; otherwise, the probability that it is accepted is calculated using the formula below:
  $P(\Delta E) = e^{-\Delta E / (kT)}$
• Where E is the energy of the system, T is the temperature, and k is the Boltzmann constant.
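The Metropolis acceptance rule can be written as a small Python helper. This is a sketch under assumptions: the function names are hypothetical, and `k` defaults to 1.0 (energy measured in units where the Boltzmann constant is one), which is a common convenience in optimization rather than something stated on the slide.

```python
import math
import random

def acceptance_probability(delta_e, temperature, k=1.0):
    """Metropolis rule: downhill moves always pass, else exp(-dE / (k*T))."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / (k * temperature))

def accept(delta_e, temperature, rng=random, k=1.0):
    """Draw a uniform random number and compare it with the probability."""
    return rng.random() < acceptance_probability(delta_e, temperature, k)

print(acceptance_probability(1.0, 10.0))  # high T: uphill move is likely accepted
print(acceptance_probability(1.0, 0.1))   # low T: uphill move is rarely accepted
```

The same uphill step ΔE becomes far less likely to be accepted as T falls, which is the property simulated annealing relies on.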
Simulated annealing – Algorithm inspired by annealing
• Simulated annealing is a probabilistic technique for approximating the global optimum of a given function.
• Inspired by annealing in metallurgy, it slowly decreases the probability of accepting worse solutions as the temperature is lowered.
How does simulated annealing work?
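One way to picture the procedure is the generic loop below. It is a minimal sketch, not the presenters' code: the geometric cooling schedule, the parameter values, and the toy one-dimensional cost function `bumpy` with its `step` neighbor move are all assumptions made for illustration.

```python
import math
import random

def simulated_annealing(cost, neighbor, start, t_start=10.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, seed=0):
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Accept improvements always; accept worse moves with
            # probability exp(-delta / T), which shrinks as T falls.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling  # Geometric cooling schedule (assumption).
    return best, best_cost

# Toy problem (assumption): a bumpy 1-D function with several local minima.
def bumpy(x):
    return x * x + 10 * math.sin(3 * x)

def step(x, rng):
    return x + rng.uniform(-1.0, 1.0)

print(simulated_annealing(bumpy, step, start=8.0))
```

At high temperature the search can climb out of local minima; as the temperature drops it behaves more and more like a greedy descent.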
NP-Complete Problem
Physical Design Of Computers • Partition • Placement • Wiring
Partition
• Decomposition of a complex system into smaller subsystems.
• The decomposition scheme has to minimize the interconnections between the subsystems.
• Decomposition is carried out hierarchically until each subsystem is of manageable size.
• Each subsystem can be designed independently.
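The "minimize the interconnections" objective can be made concrete by counting the nets that cross subsystem boundaries. The sketch below is an illustration under assumptions: the function name `cut_size`, the gate names and the netlist are all hypothetical, and real partitioners usually add a term that keeps the subsystems balanced.

```python
def cut_size(nets, assignment):
    """Count nets whose gates are spread over more than one subsystem."""
    return sum(1 for net in nets
               if len({assignment[gate] for gate in net}) > 1)

# Hypothetical netlist: each net lists the gates it connects.
nets = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"),
        ("g4", "g5"), ("g5", "g6"), ("g4", "g6"),
        ("g3", "g4")]

# Two candidate decompositions into subsystems 0 and 1.
clustered = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}
interleaved = {"g1": 0, "g2": 1, "g3": 0, "g4": 1, "g5": 0, "g6": 1}

print(cut_size(nets, clustered), cut_size(nets, interleaved))  # prints: 1 5
```

Keeping tightly connected gates together leaves one crossing net, while the interleaved assignment cuts five of the seven nets.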
Partitioning Example • Partition 1: 15 gates • Partition 2: 16 gates • Partition 3: 17 gates
Delay Implications at Different Levels
• Between blocks in a chip: 1x
• Between chips on a board: 10x
• Between boards in a system: 20x
Simulated Annealing Applied to Partition
• Test case: the logic design of the IBM 370, considered for partitioning into two chips.
Placement
An important step in the physical design cycle
• Poor placement requires a larger area
• Interconnection increases
• Also results in performance degradation
It is the process of arranging a set of modules on the layout surface
• Goal: reduce the overall wire length
Effect of Placement on Wirelength
• Wirelength = 10
• Wirelength = 12
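How a placement changes the wirelength can be illustrated with a small calculation. This sketch assumes two-pin nets and Manhattan (horizontal plus vertical) distance; the module names, coordinates and the function `manhattan_wirelength` are hypothetical and do not reproduce the 10 versus 12 example from the slide's figure.

```python
def manhattan_wirelength(nets, positions):
    """Total Manhattan length over two-pin nets for a given placement."""
    total = 0
    for a, b in nets:
        (xa, ya), (xb, yb) = positions[a], positions[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total

# Four modules connected in a ring (hypothetical netlist).
nets = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]

# Same modules, same grid slots, two different arrangements.
placement_1 = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
placement_2 = {"A": (0, 0), "B": (1, 1), "C": (1, 0), "D": (0, 1)}

print(manhattan_wirelength(nets, placement_1),
      manhattan_wirelength(nets, placement_2))  # prints: 4 6
```

Swapping only two modules raises the total from 4 to 6, which is the kind of difference a placement algorithm tries to eliminate.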
Simulated Annealing Applied to Placement
• Ninety-eight chips on the IBM 3081
• Chips are identified by number (1 to 100, with 20 and 100 absent)
• Different patterns of small squares represent different logic functions.
• The numbers at the left and lower edges indicate the net crossings of the vertical and horizontal wires.
Wiring • Two classes of moves that maintain the minimum wirelength • L move • Z move
Comparing Results
• Random assignment with L moves
• Aligning wires in the direction of least congestion
• Simulated annealing with L moves only
• Annealing with Z moves
Example
• Steve Jobs: “I'm gonna see it! I want it to be as beautiful as possible, even if it's inside the box. A great carpenter isn't going to use lousy wood for the back of a cabinet, even though nobody's going to see it.”
Any questions?