Heterogeneous Concurrency
  1. Heterogeneous Concurrency
     Michael L. Scott (on leave at Google Madison)
     www.cs.rochester.edu/u/scott/
     Schloss Dagstuhl, January 2015

  2. Future processors may not be pretty
     ● Specter of “dark silicon”: most of the chip may need to be “off” most of the time
     ● Architects likely to fill the space with customized circuits
       » compression, encryption, XML parsing, pattern matching, media transcoding, vector/matrix algebra, arbitrary-precision math, FFT, even FPGA ...
       » not to mention cores with different computational/energy tradeoffs
     ● “Typical” program may need to jump frequently from one core to another

  3. Progression of functionality
     ● FPU: pure simple function (e.g., arctan)
       » protection not really an issue
     ● GPU: fire-and-forget rendering
     ● GPGPU: compute and return (with memory access)
       » direct access from user space
       » one protection domain at a time
     ● first-class core: juggle multiple contexts safely
       » preemption, multiprogramming

  4. How do we ...
     ● arbitrate access to resources (cycles, scratchpad memory, bandwidth, ...)?
       » what do we need in HW that we don’t have now?
     ● choose among cores with non-trivial tradeoffs (speed, power, energy, load)?
     ● access system services on nontraditional cores?
     ● balance computational ability vs. locality?
       » how fast can we stream data from core to core?
     ● accommodate heterogeneous ISAs (esp. if choosing among cores on which these differ)?

  5. And (w.r.t. concurrency), how do we ...
     ● dispatch across cores (HW queues? flat combining?)
     ● manage stacks (contiguous vs. linked frames)
     ● wait for completion (spin? yield? deschedule? ship continuations?)
     ● avoid writing code in a different language for every accelerator
     ● unblock threads across cores? across languages?
       » connections here to Eliot’s talk

  6. (Unsupported) Hypotheses
     ● Traditional kernel interface will not suffice
       » must expose more of the underlying architecture, so run-time systems can figure out what to do
       » must not make everything a pthread [ Capriccio, Akaros, ... ]
     ● Contiguous stack frames will not suffice; neither will proliferating languages
       » compiler help will be required
     ● “Accelerator” cores will need “first-class status”
       » ability to request OS services directly [ GPUfs, ... ]
     ● Tree-structured dynamic call graph will be too restrictive
       » will sometimes want to “return” elsewhere than whence we came (continuation shipping)

  7. Plenty to keep us busy!
     www.cs.rochester.edu/u/scott/
