Using Today’s Fastest Chips to Design the Chips of Tomorrow


  1. Using Today’s Fastest Chips to Design the Chips of Tomorrow. Mauro Calderara, Sascha Brück, Mathieu Luisier

  3. Overview: What we want to do → Quantum Transport: electrons and structures; How we do it → How GPUs saved the day (Mauro Calderara, Apr 08 2016)

  4. Probably you’re familiar with this

  5. Zooming in

  6. The future? (link to video: http://iis.ee.ethz.ch/~mauro/movie_SC15.avi)

  11. From a somewhat more abstract POV [Figure: a box labeled “Device” with electrons (e) on both sides and a question mark]

  17. This is what we’re ultimately interested in! How do electrons behave w.r.t. the device? Change in parameters → change in behavior? Applies not just to transistors: batteries, storage devices, ... [Figure: device with electrons (e); labeled parameters: gate voltage, material properties, dimensions]

  19. How would we do that? The “easy” case: → device behaves like bulk material

  21. How would we do that? The “difficult” case: → device behaves like atomic structure

  22. The cost of going small. Why is this “easy” ... and this “difficult”?

  29. The cost of going small. Easy case: can assume it is “infinite” and use a semi-empirical model. Difficult case: very finite! Need to do it from first principles. [Figure: runtime of each case]

  30. The cost of going small. Runtime: semi-empirical → O(Hours); first principles → O(Months)

  33. Overview: What we want to do → Quantum Transport: electrons and structures; How we do it → How GPUs saved the day

  36. Where does all that time go? Invert the matrix from before (selectively!) using a recursive algorithm. Solve an eigenvalue problem (not discussed here). [Figure: runtime, annotated “~ 40x”]

  39. Avoiding the inversion, use a sparse solver instead. Instead of trying to invert selectively, solve the system using a generic sparse solver package. Gain: speed, parallelism, capacity for somewhat larger systems. Cost: code is now memory-bandwidth (mem-bw) bound. And: not such a good fit for GPUs ... [Figure: runtime, annotated “~ 40x”]
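The idea of solving a sparse system instead of inverting it can be sketched as follows. This is a minimal illustration, not the authors' code: the slides do not name the solver package, so SciPy's `spsolve` stands in for it here, and the tridiagonal test matrix and the helper name `build_tridiagonal` are hypothetical.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def build_tridiagonal(n):
    """Hypothetical toy matrix: a diagonally dominant tridiagonal system."""
    main = 4.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    # CSC is the storage format SciPy's direct sparse solver expects.
    return sp.diags([off, main, off], offsets=[-1, 0, 1], format="csc")


n = 1000
A = build_tridiagonal(n)
b = np.ones(n)

# Solve A x = b directly; the (generally dense) inverse is never formed.
x = spla.spsolve(A, b)

# Check the solution through the residual norm.
residual = np.linalg.norm(A @ x - b)

# Storage comparison: nonzeros kept for A versus entries of a dense inverse.
stored_nonzeros = A.nnz
dense_inverse_entries = n * n
```

The sparse matrix keeps about 3n entries, while an explicit inverse would in general need n² entries, which is one reason a solve tends to scale to larger systems than an inversion.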

  41. Tackling the eigenvalue problem. We’ve been able to solve that one. [Figure: two runtime charts, annotated “~ 200x”]

  47. Now what? Good speedup so far: ~ 70x overall (now: O(Days), still not quite there...). But: memory-bandwidth bound by the sparse solver. [Figure: runtime chart; two figures labeled “Advisor” and “PhD student”]

  51. A Sparse Solver for Transport Problems running on GPUs. Inverting the sparse system is not feasible. In our case: also not necessary. Need first and last block rows only. If we can compute this fast, we can: interleave the solving step with the boundary-condition (BC) computation; obtain the full solution very efficiently. [Figure: matrix inverse, “-1 =”]

  52. Obtaining the first and last block columns of the inverse. Recursive algorithm based on the Schwinger-Dyson equation:
      for i = N:1: $Y_i \leftarrow (B_{i,i} - B_{i,i+1} Y_{i+1}) \backslash B_{i,i-1}$
      for i = 2:N: $R_i \leftarrow -Y_i R_{i-1}$
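The two sweeps on the slide can be sketched in NumPy for a small block-tridiagonal matrix. This is an illustrative reading of the recursion, not the authors' GPU implementation. The function names and the random test matrix are hypothetical. The starting value R_1 = (B_{1,1} − B_{1,2} Y_2)^{-1} is an assumption, since the slide does not state it, as is treating Y_{N+1} as zero. Indices are 0-based in the code, and a dense inverse is formed only to check the result.

```python
import numpy as np


def first_block_column(diag, lower, upper):
    """Blocks of the first block column of A^{-1} for block-tridiagonal A.

    0-based convention (an assumption of this sketch):
    diag[i] = A[i, i], lower[i] = A[i+1, i], upper[i] = A[i, i+1].
    """
    n = len(diag)
    Y = [None] * n
    schur = diag[n - 1]  # last block: no Y term beyond the matrix edge
    # Backward sweep: Y_i = (B_ii - B_{i,i+1} Y_{i+1}) \ B_{i,i-1}
    for i in range(n - 1, 0, -1):
        Y[i] = np.linalg.solve(schur, lower[i - 1])
        schur = diag[i - 1] - upper[i - 1] @ Y[i]
    # Assumed starting value: R_1 = (B_11 - B_12 Y_2)^{-1}
    R = [np.linalg.inv(schur)]
    # Forward sweep: R_i = -Y_i R_{i-1}
    for i in range(1, n):
        R.append(-Y[i] @ R[i - 1])
    return R


def assemble(diag, lower, upper):
    """Build the dense matrix, used here only to verify the recursion."""
    n, b = len(diag), diag[0].shape[0]
    A = np.zeros((n * b, n * b))
    for i in range(n):
        A[i * b:(i + 1) * b, i * b:(i + 1) * b] = diag[i]
        if i + 1 < n:
            A[i * b:(i + 1) * b, (i + 1) * b:(i + 2) * b] = upper[i]
            A[(i + 1) * b:(i + 2) * b, i * b:(i + 1) * b] = lower[i]
    return A


rng = np.random.default_rng(0)
n_blocks, b = 5, 3
diag = [rng.standard_normal((b, b)) + 10.0 * np.eye(b) for _ in range(n_blocks)]
lower = [0.5 * rng.standard_normal((b, b)) for _ in range(n_blocks - 1)]
upper = [0.5 * rng.standard_normal((b, b)) for _ in range(n_blocks - 1)]

R = first_block_column(diag, lower, upper)
reference = np.linalg.inv(assemble(diag, lower, upper))[:, :b]
max_error = np.max(np.abs(np.vstack(R) - reference))
```

Each step touches only blocks of size b, so the cost grows linearly with the number of blocks. The last block column would follow from the mirrored sweeps, running forward first and then backward.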
