Using Today’s Fastest Chips to Design the Chips of Tomorrow
Mauro Calderara, Sascha Brück, Mathieu Luisier
Overview. What we want to do → Quantum Transport: electrons and structures. How we do it → How GPUs saved the day. | Mauro Calderara, Apr 08 2016
Probably you’re familiar with this
Zooming in
The future? (link to video: http://iis.ee.ethz.ch/~mauro/movie_SC15.avi)
From a somewhat more abstract POV. [Figure: a device, electrons (e), and a question mark]
This is what we’re ultimately interested in! How do electrons behave w.r.t. the device? Change in parameters → change in behavior? Applies not just to transistors: batteries, storage devices, ... [Figure: device with electrons (e), labeled with the parameters gate voltage, material properties, and dimensions]
How would we do that? The ‘‘easy’’ case → device behaves like bulk material.
How would we do that? The ‘‘difficult’’ case → device behaves like atomic structure.
The cost of going small. Why is this ‘‘easy’’ ... and this ‘‘difficult’’?
The cost of going small. ‘‘Easy’’ case: can assume [structure shown in figure] is ‘‘infinite’’ and use a semi-empirical model. ‘‘Difficult’’ case: very finite! Need to do it from first principles. [Figure: runtime bars for both cases]
The cost of going small. Semi-empirical → O(Hours). First principles → O(Months). [Figure: runtime bars for both models]
Overview. What we want to do → Quantum Transport: electrons and structures. How we do it → How GPUs saved the day.
Where does all that time go? Invert the matrix from before (selectively!) using a recursive algorithm. Solve an eigenvalue problem (not discussed here). [Figure: runtime bar, annotated ~40x]
Avoiding the inversion: use a sparse solver instead. Instead of trying to invert selectively, solve the system using a generic sparse solver package. Gain: speed, parallelism, capacity for somewhat larger systems. Cost: the code is now memory-bandwidth bound. And: not such a good fit for GPUs ... [Figure: runtime bar, annotated ~40x]
Tackling the eigenvalue problem. We’ve been able to solve that one. [Figure: runtime bars, annotated ~200x]
Now what? Good speedup so far: ~70x overall (now: O(Days), still not quite there...). But: memory-bandwidth bound by the sparse solver. [Figure: runtime bar; advisor and PhD student with a question mark]
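One way to sanity-check an overall speedup figure is Amdahl-style accounting over the phases of a run. The sketch below assumes, purely for illustration, that the ~40x and ~200x annotations are per-phase speedups of the linear-solve and eigenvalue steps, and that the two phases split the original runtime evenly. Neither assumption is stated on the slides, and `overall_speedup` is a hypothetical helper, not part of the authors' code.

```python
def overall_speedup(fractions, speedups):
    """Combined speedup when phase i takes fractions[i] of the original
    runtime and is accelerated by a factor speedups[i]."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    # New runtime (relative to the old one) is the sum of the shrunken phases.
    return 1.0 / sum(f / s for f, s in zip(fractions, speedups))


# Assumed 50/50 split between linear solve (~40x) and eigenvalue step (~200x).
combined = overall_speedup([0.5, 0.5], [40.0, 200.0])
print(round(combined, 1))  # 66.7
```

Under these assumed numbers the combined speedup lands in the same range as the ~70x quoted on the slide; a different split between the phases would shift it.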
A Sparse Solver for Transport Problems running on GPUs. Inverting the sparse system is not feasible. In our case: also not necessary. Need first and last block rows only. If we can compute this fast, we can interleave the solving step with the BC computation and obtain the full solution very efficiently. [Figure: matrix equation involving the inverse]
Obtaining the first and last block columns of the inverse. Recursive algorithm based on the Schwinger-Dyson equation:
for j = N:1: $Y_j \leftarrow (B_{j,j} - B_{j,j+1} Y_{j+1}) \backslash B_{j,j-1}$
for j = 2:N: $R_j \leftarrow -Y_j R_{j-1}$
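The two sweeps can be tried out on a small block-tridiagonal matrix. The following is a minimal pure-Python sketch, not the authors' GPU implementation. It assumes 0-based lists `diag[j]`, `upper[j]`, `lower[j]` holding the blocks B(j,j), B(j,j+1), B(j+1,j), and it assumes the recursion is closed at the first block with R_1 = (B_11 − B_12 Y_2)^-1, which the slide does not spell out. All function names are hypothetical, and only the first block column is computed.

```python
def matmul(a, b):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matsub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def solve(a, b):
    """Solution x of a x = b, by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(ra) + list(rb) for ra, rb in zip(a, b)]  # augmented [a | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


def first_block_column(diag, upper, lower):
    """Blocks R_1..R_N of the first block column of the inverse."""
    n = len(diag)
    y = [None] * n
    # Backward sweep: Y_j = (B_jj - B_j,j+1 Y_j+1) \ B_j,j-1, with Y_N+1 = 0.
    for j in range(n - 1, 0, -1):
        s = diag[j] if j == n - 1 else matsub(diag[j], matmul(upper[j], y[j + 1]))
        y[j] = solve(s, lower[j - 1])
    # Assumed closure for the first block: R_1 = (B_11 - B_12 Y_2)^-1.
    s = diag[0] if n == 1 else matsub(diag[0], matmul(upper[0], y[1]))
    r = [solve(s, identity(len(s)))]
    # Forward sweep: R_j = -Y_j R_j-1.
    for j in range(1, n):
        r.append([[-v for v in row] for row in matmul(y[j], r[j - 1])])
    return r


def max_residual(diag, upper, lower, x):
    """Largest entry of |B X - E_1|, where E_1 = [I, 0, ..., 0]^T."""
    n, k, worst = len(diag), len(diag[0]), 0.0
    for i in range(n):
        terms = [matmul(diag[i], x[i])]
        if i > 0:
            terms.append(matmul(lower[i - 1], x[i - 1]))
        if i < n - 1:
            terms.append(matmul(upper[i], x[i + 1]))
        for a in range(k):
            for b in range(k):
                target = 1.0 if (i == 0 and a == b) else 0.0
                worst = max(worst, abs(sum(t[a][b] for t in terms) - target))
    return worst


# Made-up, diagonally dominant test blocks (3 block rows, block size 2).
D = [[4.0, 1.0], [1.0, 4.0]]
U = [[1.0, 0.5], [0.0, 1.0]]
L = [[1.0, 0.0], [0.5, 1.0]]
diag, upper, lower = [D, D, D], [U, U], [L, L]
X = first_block_column(diag, upper, lower)
print(max_residual(diag, upper, lower, X))
```

The residual check multiplies the block-tridiagonal matrix by the computed column and compares against the first block column of the identity, so it verifies the recursion without ever forming the full inverse. The last block column would follow from the mirrored recursion, sweeping in the opposite direction.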