Vishal Gupta* (Georgia Tech), Ripal Nathuji (Microsoft Research)
* Work done during summer internship at Microsoft Research
Different types of CPU cores
[Figure: CPU cores arranged as a symmetric multicore processor (SMP) and as an asymmetric multicore processor (AMP)]
[Figure: an application with parts labeled A, B, and C running on an SMP and on an AMP, over a time axis marked T, 2T, 3T; the AMP run shows a speedup]
• How good are AMPs as compared to SMPs?
• Can datacenter applications save power using AMPs?
[Figure: a datacenter with throughput $\lambda_{datacenter}$ is built from servers (S); each server contains a processor and other components; the processor is either an SMP or an AMP]
$P^{AMP}_{datacenter} < P^{SMP}_{datacenter}$ ?
• Constant work
• Meet latency SLA
• Energy Scaling: sequential execution
• Parallel Speedup: parallel execution
Sequential application
[Figure: an area-equivalent SMP and AMP running a sequential application]
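[Figure: execution timelines against the latency SLA. The SMP finishes at $T_{SMP}$, leaving slack before $T_{SLA}$; the AMP finishes at $T_{AMP}$, with time segments labeled $t_{small}$ and $t_{large}$]
Smaller core = lower power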
Parallel application
[Figure: an SMP and an AMP running a parallel application]
[Figure: a parallel application with a sequential phase, on an SMP and on an AMP]
SMP, small cores: the sequential phase is the bottleneck
AMP: run the sequential phase on the fast core
Speedup = Higher throughput
M/M/1 Queuing Model
[Figure: requests arrive at a server queue with arrival rate λ and are served at service rate µ, subject to a latency SLA]
Avg. response time: $E[T] = \frac{1}{\mu - \lambda}$
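The M/M/1 response-time formula above can be sketched in a few lines of Python. The function name `mm1_response_time` and the sample rates are illustrative assumptions, not taken from the slides.

```python
def mm1_response_time(mu: float, lam: float) -> float:
    """Average response time E[T] = 1 / (mu - lam) of an M/M/1 queue.

    mu  : service rate (requests per unit time)
    lam : arrival rate (requests per unit time)
    The formula only holds for a stable queue, i.e. lam < mu.
    """
    if mu <= 0 or lam < 0:
        raise ValueError("mu must be positive and lam non-negative")
    if lam >= mu:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (mu - lam)


# Hypothetical example: a server that completes 100 requests/s
# and receives 80 requests/s.
print(mm1_response_time(mu=100.0, lam=80.0))  # 0.05
```

Note how the response time grows without bound as the arrival rate approaches the service rate, which is why a latency SLA caps the load a server can accept.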
Parallel Speedup (PS) (refer to paper for ES)
Amdahl's Law for Multicores
[Figure: an SMP and an AMP running a parallel application]
n = Chip area
r = Area(Big Core)
Big core: Area = r, Perf = perf(r)
Small core: Area = 1
f = fraction of computation that can be parallelized
[Figure: three example chips: SMP (n=16, r=1), SMP (n=16, r=4), AMP (n=16, r=4)]
$$\mu_{SMP}(f, n, r) = \frac{1}{\dfrac{1 - f}{perf(r)} + \dfrac{f}{\frac{n}{r} \cdot perf(r)}}$$
$$\mu_{AMP}(f, n, r) = \frac{1}{\dfrac{1 - f}{perf(r)} + \dfrac{f}{n - r}}$$
Ref: Hill and Marty, Amdahl's law in the multicore era (IEEE Computer '08)
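As a rough sketch, the two service-rate expressions can be evaluated as follows. The slides do not say what `perf(r)` is; the default used here, `perf(r) = sqrt(r)`, is the assumption Hill and Marty use in their article and should be treated as an assumption in this sketch too. The function names are illustrative.

```python
import math
from typing import Callable


def perf_sqrt(r: float) -> float:
    # Assumption: core performance grows as the square root of its area.
    return math.sqrt(r)


def _check(f: float, n: float, r: float) -> None:
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    if not 0.0 < r <= n:
        raise ValueError("r must satisfy 0 < r <= n")


def mu_smp(f: float, n: float, r: float,
           perf: Callable[[float], float] = perf_sqrt) -> float:
    """SMP with n/r identical cores of area r each."""
    _check(f, n, r)
    sequential = (1.0 - f) / perf(r)        # one core runs the serial part
    parallel = f / ((n / r) * perf(r))      # all n/r cores run the parallel part
    return 1.0 / (sequential + parallel)


def mu_amp(f: float, n: float, r: float,
           perf: Callable[[float], float] = perf_sqrt) -> float:
    """AMP with one big core of area r and n - r small cores of area 1."""
    _check(f, n, r)
    if r >= n:
        raise ValueError("an AMP needs r < n so that small cores remain")
    sequential = (1.0 - f) / perf(r)        # the big core runs the serial part
    parallel = f / (n - r)                  # the small cores run the parallel part
    return 1.0 / (sequential + parallel)


# The three example chips from the slides, with a hypothetical f = 0.5.
for label, value in [("SMP n=16 r=1", mu_smp(0.5, 16, 1)),
                     ("SMP n=16 r=4", mu_smp(0.5, 16, 4)),
                     ("AMP n=16 r=4", mu_amp(0.5, 16, 4))]:
    print(f"{label}: {value:.3f}")
```

Both rates are relative to a single area-1 core, so they are only meaningful as ratios unless a base service rate is supplied.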
$\lambda^{peak}_{server} = \mu - \frac{1}{T_{SLA}}$
Datacenter capacity = No. of servers * Server throughput
$\lambda_{datacenter} = N^{SMP}_{server} \cdot \lambda^{SMP}_{server}$
Constant work:
$\lambda_{datacenter} = N^{AMP}_{server} \cdot \lambda^{AMP}_{server}$
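A minimal sketch of the capacity calculation follows. The peak rate comes from setting the M/M/1 response time $1/(\mu - \lambda)$ equal to $T_{SLA}$. Rounding the server count up to a whole number is an assumption added here; the slides use a plain equality. Function names and sample numbers are illustrative.

```python
import math


def peak_arrival_rate(mu: float, t_sla: float) -> float:
    """Largest arrival rate a server can take while E[T] <= t_sla."""
    if mu <= 0 or t_sla <= 0:
        raise ValueError("mu and t_sla must be positive")
    lam_peak = mu - 1.0 / t_sla
    if lam_peak <= 0:
        raise ValueError("server cannot meet the SLA even when idle")
    return lam_peak


def servers_needed(lam_datacenter: float, mu: float, t_sla: float) -> int:
    """Servers required to carry lam_datacenter at the peak per-server rate.

    Assumption: the count is rounded up to a whole server.
    """
    return math.ceil(lam_datacenter / peak_arrival_rate(mu, t_sla))


# Hypothetical numbers: mu = 10 requests/s, SLA of 0.5 s, 100 requests/s total.
print(peak_arrival_rate(10.0, 0.5))      # 8.0
print(servers_needed(100.0, 10.0, 0.5))  # 13
```

Because the datacenter load is held constant, a processor with a higher service rate needs fewer servers for the same work.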
Datacenter power (P) = No. of servers * Server power
$P^{SMP}_{datacenter} = N^{SMP}_{server} \cdot P^{SMP}_{server}$
$P^{AMP}_{datacenter} = N^{AMP}_{server} \cdot P^{AMP}_{server}$
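Combining the constant-work condition with the power definition gives a single ratio to check. This is a sketch that treats the server counts as real numbers (no rounding to whole servers), which is an assumption added here.

```latex
% Constant work: N^{SMP}_{server} \lambda^{SMP}_{server} = N^{AMP}_{server} \lambda^{AMP}_{server}
\frac{P^{AMP}_{datacenter}}{P^{SMP}_{datacenter}}
  = \frac{N^{AMP}_{server} \, P^{AMP}_{server}}{N^{SMP}_{server} \, P^{SMP}_{server}}
  = \frac{\lambda^{SMP}_{server}}{\lambda^{AMP}_{server}}
    \cdot \frac{P^{AMP}_{server}}{P^{SMP}_{server}}
```

Under that assumption the AMP datacenter draws less power exactly when the ratio is below 1, that is, when an AMP server spends less power per unit of throughput than an SMP server.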
[Figure: server power consumption P(U) versus CPU utilization (U), rising from idle power to peak power]
Ref: The Case for Energy-Proportional Computing, Barroso & Hölzle, IEEE Computer 2007
Server load distribution ($W_{load}$)
[Figure: fraction of time versus CPU utilization (U)]
$P_{server} = \sum_{U} W_{load}(U) \cdot P_{server}(U)$
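The weighted sum above can be sketched as follows. The slides show the power curve only as a figure, so the linear interpolation between idle and peak power used here is an assumption, as are the function names and the sample load distribution.

```python
from typing import Callable, Dict


def linear_power(u: float, p_idle: float, p_peak: float) -> float:
    # Assumption: power rises linearly from p_idle (U = 0) to p_peak (U = 1).
    if not 0.0 <= u <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return p_idle + (p_peak - p_idle) * u


def average_server_power(w_load: Dict[float, float],
                         power_at: Callable[[float], float]) -> float:
    """P_server = sum over U of W_load(U) * P_server(U).

    w_load maps a utilization level U to the fraction of time spent there;
    the fractions must sum to 1.
    """
    if abs(sum(w_load.values()) - 1.0) > 1e-9:
        raise ValueError("load fractions must sum to 1")
    return sum(weight * power_at(u) for u, weight in w_load.items())


# Hypothetical server: 100 units of power when idle, 200 at peak, mostly lightly loaded.
w_load = {0.1: 0.2, 0.3: 0.5, 0.6: 0.3}
avg = average_server_power(w_load, lambda u: linear_power(u, 100.0, 200.0))
print(round(avg, 2))  # about 135
```

Passing the power curve as a function keeps the sketch independent of the linear assumption; a measured curve could be substituted without changing the averaging step.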
$P^{AMP}_{datacenter} < P^{SMP}_{datacenter}$ ?
Up to 52% power savings (n = 64)
[Figure: power savings of AMP over SMP (0% to 60%) versus fraction of work that can be parallelized, f (0 to 1), with curves for r = 4, 8, 16, and 32]
Up to 14% power savings
[Figure: power savings of AMP over SMP (-25% to 20%) versus fraction of area sacrificed for the small core (5% to 45%), with curves for Application A (small core bias), Application B (uniform bias), and Application C (large core bias)]
• PS looks more promising than ES
• Can we achieve these savings in reality?
Savings need a high (but not too high!) f and a high r (realistic r = 3)
[Figure: power savings of AMP over SMP (0% to 60%) versus fraction of work that can be parallelized, f (0 to 1), with curves for r = 4, 8, 16, and 32]
• Scalability: Amdahl's law assumes unbounded scalability
• Migration overhead: the model assumes zero migration overhead
• Perfect scheduling: the model assumes an oracle scheduler
Actual savings are going to be lower
• Potential for power savings in datacenters using AMPs
• Parallel Speedup more promising than Energy Scaling
• Practical considerations to realize full benefits
Future work: Extend our analysis to functional asymmetry