
The Variability Expeditions: Variability-Aware Software for Efficient Computing With Nanoscale Devices



  1. The Variability Expeditions: Variability-Aware Software for Efficient Computing With Nanoscale Devices. Rajesh K. Gupta; Lara Dolecek, UCLA; Nikil Dutt, UCI; Subhasish Mitra, Stanford; Punit Gupta, UCLA; YY Zhou, UCSD; Mani Srivastava, UCLA; Tajana Rosing, UCSD; Lucas Wanner, UCLA; Alex Nicolau, UCI; Steve Swanson, UCSD; Ranjit Jhala, UCSD; Sorin Lerner, UCSD; Rakesh Kumar, UIUC; Dennis Sylvester, UMich

  2. To a software designer, all chips look alike. To a hardware engineer, a chip is delivered as per contract in a data-sheet.

  3. From Chiseled Objects to Molecular Assemblies. [Figure: clock guardband versus actual circuit delay; labeled sources of variation: temperature, VCC droop, aging, across-wafer frequency.] (20-May-13, Abbas Rahimi, UC San Diego)

  4. What if? [Figure: software stack of Application, Operating System and Hardware Abstraction Layer (HAL) on top of underdesigned hardware; axis: time or part.]

  5. New Hardware-Software Interface. [Figure: Application, Operating System and Hardware Abstraction Layer (HAL) form the Opportunistic Software, which sits on Underdesigned Hardware with minimal variability handling in hardware; axis: time or part.] Builds upon 50 years of rich research in fault tolerance.

  6. UNO Computing Machines Seek Opportunities Based on Sensing Results. Possible responses: do nothing; change algorithms; change program parameters; change runtime parameters; change hardware operating point. Metadata mechanisms: reflection, introspection. Supporting elements: models, sensors.

  7. Building Machines that Leverage the Move from Crash & Recover to Sense & Adapt
     • Machines that consist of parts with variations in performance, power and reliability
     • Machines that incorporate sensing circuits
     • Machines with interfaces to change ongoing computation and structures
     • New machine models: QoS or relaxed-reliability parts (SW and HW)

  8. Example: Procedure Hopping in a Clustered CPU, Each Core with Its Own Voltage Domain
     • Statically characterize each procedure for PLV.
     • A core increases its voltage if the monitored delay is high.
     • A procedure hops from one core to another if its voltage variation is high.
     • Less than 1% cycle overhead in EEMBC.
     [Figure: 16-core cluster (Core 0 ... Core 15) with shared instruction cache banks (I$ B0 ... I$ Bi-1) and TCDM banks (TCDM B0 ... TCDM Bj-1) behind Log. Interc. blocks; each core has CPM, PSS, DFS, SHM and level shifters, clocks f and f+180°, High/Typical/Low VDD rails and VA-VDD-hopping control.]
     Per-core values f0 ... f15 shown in the figure (four cores per group):
     VDD = 0.99V: 1408 1389 1408 1370 | 1370 1408 1408 1408 | 1370 1370 1389 1370 | 1408 1408 1389 1389
     VDD = 0.81V: 862 909 870 847 | 826 855 877 893 | 820 826 909 847 | 901 917 847 901
     VA-VDD-Hopping = (0.81V, 0.99V): 862 909 870 847 | 1370 855 877 893 | 1370 1370 909 847 | 901 917 847 901
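The voltage-selection rule on this slide can be illustrated with a small sketch. It is a static simplification, not the authors' controller: it assumes the f0 ... f15 figures are per-core maximum clock frequencies, takes the per-core values as they can be read from the slide's figure, and uses a hypothetical target frequency (`TARGET_FREQ = 830`) that does not appear in the source. A core that cannot meet the target at the low supply is moved to the high supply.

```python
# Per-core values read from the slide's figure (interpretation as maximum
# frequencies is an assumption; the slide gives no unit).
FREQ_LOW_VDD = [862, 909, 870, 847, 826, 855, 877, 893,
                820, 826, 909, 847, 901, 917, 847, 901]
FREQ_HIGH_VDD = [1408, 1389, 1408, 1370, 1370, 1408, 1408, 1408,
                 1370, 1370, 1389, 1370, 1408, 1408, 1389, 1389]

LOW_VDD, HIGH_VDD = 0.81, 0.99
TARGET_FREQ = 830  # hypothetical threshold, not stated on the slide


def assign_vdd(freq_low, target_freq):
    """Keep a core at the low supply if it meets the target there, else raise it."""
    return [LOW_VDD if f >= target_freq else HIGH_VDD for f in freq_low]


def resulting_freq(vdd, freq_low, freq_high):
    """Frequency each core reaches at its assigned supply voltage."""
    return [lo if v == LOW_VDD else hi
            for v, lo, hi in zip(vdd, freq_low, freq_high)]


vdd = assign_vdd(FREQ_LOW_VDD, TARGET_FREQ)
raised = [core for core, v in enumerate(vdd) if v == HIGH_VDD]
print(raised)  # [4, 8, 9]
print(resulting_freq(vdd, FREQ_LOW_VDD, FREQ_HIGH_VDD))
```

With this assumed threshold, only cores 4, 8 and 9 are raised to 0.99V, which reproduces the mixed VA-VDD-Hopping values listed for the figure. In the real system the decision is driven by runtime delay monitoring rather than a fixed table.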

  9. HW/SW Collaborative Architecture to Support Intra-cluster Procedure Hopping
     [Figure: caller core i and callee core k share the L1 I$ (holding Proc X, Proc X@Caller and Proc X@Callee) and the TCDM (shared stacks, local stacks, heap, PHIT); each core has an operating-condition monitor and an interrupt controller.]
     Call site:
         call Proc X            //conventional compile
         call Proc X@Caller     //VA-compile
     Proc X@Caller:
         if (calculate_PLV ≤ PLV_threshold)
             call Proc X
         else
             create_shared_stack_layout
             set_PHIT_for_Proc X
             send_broadcast_req
             set_timer
             wait_on_ack_or_timer
     Broadcast_ack_ISR:
         if (status X_PHIT == done)
             load_context&return_from_SSP X
     Broadcast_req_ISR:
         Proc X@Callee = search_in_PHIT
         call Proc X@Callee
     Proc X@Callee:
         if (calculate_PLV ≤ PLV_threshold)
             set_status X_PHIT = running
             load_context&param_from_SSP X
             set_all_param&pointers
             call Proc X
             store_context_to_SSP X
             set_status X_PHIT = done
             send_broadcast_ack
         else
             resume_normal_execution
     • The code is easily accessible via the shared-L1 I$.
     • The data and parameters are passed through the shared stack in TCDM.
     • A procedure hopping information table (PHIT) keeps the status for a migrated procedure.
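The caller/callee handshake on this slide can be sketched as a single-threaded simulation. This is only an illustration under stated assumptions: the `Core` and `SharedState` types, the `"pending"` status, the precomputed `plv` value per core, the threshold of 0.5, and the fallback of running locally when no core acknowledges are all hypothetical. The slide names a timer but does not say what happens when it expires, and broadcast interrupts are replaced here by a plain loop.

```python
from dataclasses import dataclass, field

PLV_THRESHOLD = 0.5  # hypothetical value; the slide only names PLV_threshold


@dataclass
class Core:
    core_id: int
    plv: float  # hypothetical: PLV of the procedure on this core right now


@dataclass
class SharedState:
    phit: dict = field(default_factory=dict)  # procedure name -> status
    ssp: dict = field(default_factory=dict)   # procedure name -> shared-stack data


def run_on_callee(callee, name, proc, shared):
    """Callee side: accept the request only if this core is safe for the procedure."""
    if callee.plv > PLV_THRESHOLD:
        return False                          # resume_normal_execution
    shared.phit[name] = "running"
    args = shared.ssp[name]["args"]           # load context and parameters
    shared.ssp[name]["result"] = proc(*args)  # call Proc X, store context
    shared.phit[name] = "done"
    return True                               # send_broadcast_ack


def call_with_hopping(caller, others, name, proc, args, shared):
    """Caller side: run locally if safe, otherwise offer the call to other cores."""
    if caller.plv <= PLV_THRESHOLD:
        return proc(*args), caller.core_id
    shared.ssp[name] = {"args": args}         # create_shared_stack_layout
    shared.phit[name] = "pending"             # set_PHIT_for_Proc X (status name assumed)
    for callee in others:                     # stands in for send_broadcast_req
        if run_on_callee(callee, name, proc, shared):
            return shared.ssp[name]["result"], callee.core_id
    # Assumed timeout behaviour: no core acknowledged, so run locally.
    return proc(*args), caller.core_id


shared = SharedState()
cores = [Core(1, 0.8), Core(2, 0.2)]
result, ran_on = call_with_hopping(Core(0, 0.9), cores, "square",
                                   lambda x: x * x, (7,), shared)
print(result, ran_on)  # 49 2
```

The procedure itself is never copied between cores in this sketch, which mirrors the slide's point that code is reachable through the shared L1 I$ while only parameters and context travel through the shared stack.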

  10. ViPZonE: Exploiting Memory Power Variability
     • Application Layer: source code annotations
     • Upper OS Layer: special GLIBC library, kernel system calls
     • Lower OS Layer: DIMM power variability-aware zoning and allocation
     • Hardware: memory controller; DIMM 1, DIMM 2, ... DIMM n, each with a power profile
     • App developers can optimize dynamic allocations for reduced power.
     • Linux + Glibc implementation.
     [Figure labels: Applications; Microarchitecture and Compilers; Runtime; Power, Performance, Errors; Ambient, Process, Vendor, Aging; CPU, Mem, Storage, Accelerators, Network, Energy Source (Batteries).]
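The lower-OS-layer idea of power variability-aware zoning and allocation can be illustrated with a toy allocator. This is a sketch under assumptions, not the ViPZonE implementation: the `Zone` type, the `power_mw` field, the `allocate` function and the sample numbers are all hypothetical. A request annotated as low-power is served from the lowest-power DIMM zone that still has room; other requests use the default zone order.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    dimm: str
    power_mw: float   # hypothetical measured power profile of this DIMM
    free_pages: int


def allocate(zones, pages, low_power=True):
    """Return the name of the DIMM that served the request, or None if none fits."""
    if low_power:
        # Prefer the DIMM whose power profile is lowest.
        candidates = sorted(zones, key=lambda z: z.power_mw)
    else:
        # Default policy: first zone with room, ignoring power profiles.
        candidates = list(zones)
    for zone in candidates:
        if zone.free_pages >= pages:
            zone.free_pages -= pages
            return zone.dimm
    return None


zones = [Zone("DIMM 1", 1200.0, 4), Zone("DIMM 2", 900.0, 2), Zone("DIMM 3", 1000.0, 8)]
print(allocate(zones, 2))                   # DIMM 2
print(allocate(zones, 2))                   # DIMM 3
print(allocate(zones, 1, low_power=False))  # DIMM 1
```

The `low_power` flag plays the role of the application-layer annotation: the developer states intent, and the lower layer maps it to a physical zone using the stored power profiles.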

  11. Example: UnO Stack for Duty-cycled Sensors
     [Figure: Application tasks (Monitor, Baseline, Sample, Forward, Timer) above the OS (asynchronous notification, reflection, introspection, Sysinfo, metadata) and the Hardware Signature Sensing Manager (activation; sample, event, sampling configuration; time-series sampling request). Many sensors: P_sleep, P_active, memory speed, temp, battery, ...]
     A:
         module SenseAndForward {
           provides energylevel LowFid<1>;
           provides energylevel MidFid<2>;
           provides energylevel HiFid<3>; }
         { On_event Timer call SensorRead();
           On_event LowFid call Timer(2500);
           On_event MidFid call Timer(2000);
           On_event HiFid call Timer(1650); }
     B:
         module SenseAndForward {
           provides energylevel LowFid<1>;
           provides energylevel MidFid<2>;
           provides energylevel HiFid<3>; }
         { On_event Timer call SensorRead();
           On_event MonitorTimer call SysinfoRead(&sysinfo);
           if Error > Delta call Timer(DownSample); }
     C:
         module SenseAndForward {
           provides energylevel LowFid<1>;
           provides energylevel MidFid<2>;
           provides energylevel HiFid<3>; }
         { On_event SysinfoChanged call SysinfoRead;
           if Error > Delta call Timer(DownSample); }
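The event handlers on this slide map naturally onto a small class. The sketch below is an approximation in Python, not the slide's language: the timer periods 2500, 2000 and 1650 come from variant A (the slide gives no unit), while the `delta` and `downsample_period` parameters, the starting level, and the way the error value is passed in are assumptions made for illustration.

```python
# Timer period per energy level, as listed in variant A on the slide.
TIMER_PERIOD = {"LowFid": 2500, "MidFid": 2000, "HiFid": 1650}


class SenseAndForward:
    def __init__(self, delta, downsample_period, initial_level="HiFid"):
        self.delta = delta                          # hypothetical error tolerance
        self.downsample_period = downsample_period  # hypothetical DownSample value
        self.timer_period = TIMER_PERIOD[initial_level]

    def on_energy_level(self, level):
        """Variant A: the OS signals an energy level, the app picks a duty cycle."""
        self.timer_period = TIMER_PERIOD[level]

    def on_sysinfo_changed(self, error):
        """Variants B and C: slow down sampling when the observed error exceeds delta."""
        if error > self.delta:
            self.timer_period = self.downsample_period


app = SenseAndForward(delta=0.1, downsample_period=5000)
app.on_energy_level("LowFid")
print(app.timer_period)   # 2500
app.on_sysinfo_changed(error=0.2)
print(app.timer_period)   # 5000
```

Variant B would call `on_sysinfo_changed` from a periodic monitor timer (polling), whereas variant C would call it only when the OS delivers a `SysinfoChanged` notification; the handler body is the same in both cases.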

  12. RESEARCH AND ITS ORGANIZATION: GRAND CHALLENGE, QUESTIONS AND RESEARCH PROGRESS

  13. Expedition Grand Challenge & Questions: “Can microelectronic variability be controlled and utilized in building better computer systems?”
     Three Goals:
     a. Address fundamental technical challenges (understand the problem)
     b. Create experimental systems (proof of concept prototypes)
     c. Educational and broader impact opportunities to make an impact (ensure training for future talent)
     Questions:
     • What are most effective ways to detect variability?
     • What are software-visible manifestations?
     • What are software mechanisms to exploit variability?
     • How can designers and tools leverage adaptation?
     • How do we verify and test hw-sw interfaces?

  14. Research Organization
     • Four thrust areas
       1. Measurement and Modeling
       2. Design Tools and Testing Methodologies
       3. Microarchitecture and Compilers
       4. Runtime Support
     • Two cross-cutting thrusts
       5. Applications and Testbeds
       6. Outreach and Education
     Thrusts span teams across universities, usually in pairs.

  15. Thrusts traverse institutions on testbed vehicles seeding various projects
     Groups:
     • Group A: Signature Detection and Generation
     • Group B: Variability Mitigation Measures
     • Group C: Opportunistic Software and Abstractions
     Projects (the assignment of projects to groups is not recoverable from the extracted text):
     • Characterizing variability in power consumption for modern computing platforms, and implications
     • Effective error resilience
     • Mitigating variability in solid-state storage devices
     • Hardware solutions to better understand and exploit variability
     • Runtime support and software adaptation for variable hardware
     • Negative bias temperature instability and electromigration
     • VarEmu emulation-based testbed for variability-aware software
     • Probabilistic analysis of faulty hardware
     • Memory-variability aware runtime systems
     • Understanding and exploiting variability in flash memory devices
     • Design-dependent ring oscillator and software testbed
     • Variability-aware opportunistic system software stack
     • Application robustification for stochastic processors
     • Executing programs under relaxed semantics
     • FPGA-based variability simulator

  16. Two years of building an Expedition
     • Kickoff, review, tape-outs and builds-ins
       – 82 peer-reviewed publications, 21% collaborative
       – 54 events/releases on variability.org/news
       – 64 presentations on variability.org/presentations
     • A collaborative community
       – 15 faculty, 25 GSRs, 1 postdoc, 10+ UG, 300 K-8-12
     [Timeline figure, Aug through Oct of the following year. Milestones shown: NSF announces Expeditions; Kickoff/AHM; Summit@EPFL; DFM&Y; Industry Advisory; Y1 Review; NSF NNI; COSMOS; LACC; 8 Teaching Modules (UCLA); Aging Simulator Released (UCLA/UIUC); Girls’ Hat Day; Sensorized ARM Chips; Intel, Google, Oracle, Cisco (UCSD), STmicro (Michigan); Wafer Pruning from UCLA (EEtimes). Dates shown: 6/10, 10/6, 8/19, 3/18, 11/19-20, 3/26, 8/23.]

  17. Timeline in Progress
     [Timeline figure, Oct through Dec of the following year. Milestones shown: Y1 Review; IMEC/ESWeek; ATS; Research Review (UCSD); Industry Advisory (Stanford); Y2 Review; LACC; COSMOS; Teaching Modules (UCLA); Complete Eval Boards w/ S-ARM; 28nm Test Chips; Girls’ Hat Day; S-ARM R2 Tapeout; CUDA Simulator; Samsung (Tapeout Measurements); ARM, TSMC (Benchmarking).]
