Invasive Malleable Applications Sebastian Buchwald, Manuel Mohr, Andreas Zwinkau Karlsruhe Institute of Technology TCRC 89 "Invasive Computing" ATPS 2015
No faster CPUs, only more of them
More cores means slower cores
(M. B. Taylor, A Landscape of the New Dark Silicon Design Regime, IEEE Micro, vol. 33, no. 5, 2013)
Battery power is more important these days
Challenge: Use more cores more efficiently
Invasive Computing means Tiled Manycore Architectures
Invasive Computing means completely rewritten stack
Custom Hardware: iNoC, CiC, i-Core
Custom Operating System: iRTSS, OctoPOS
Custom Programming Language: invadeX10
Applications ported/rewritten: HPC, Robotics, ...
The invasive framework gives access to more OS functionality
[Diagram of the software stack. Labels: Application Code, App, X10 Runtime, Invasive Framework, iRTSS, Agent, OctoPOS, Tile, Core]
Invasive Computing is about invade-infect-retreat
[Diagram of the claim life cycle. Labels: Constraints, i-let, Claim; transitions: invade, infect, reinvade, retreat]
Invasive Computing uses X10

    val ilet = (id:IncarnationID) => {
      Console.OUT.println("Hello World");
    }
    val constraints = new PEQuantity(4, 10)
                   && new ScalabilityHint(speedupCurve);
    val claim = Claim.invade(constraints);
    claim.infect(ilet);
    claim.retreat();
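The invade-infect-retreat life cycle can be modeled outside the invasive stack. The following is a minimal Python sketch, not the invadeX10 API: `Pool`, `Claim`, and their methods are hypothetical names chosen for illustration, threads stand in for processing elements (PEs), and the only constraint modeled is a PE quantity range like `PEQuantity(4, 10)`.

```python
import threading


class Claim:
    """Hypothetical claim: a set of PEs granted to one application."""

    def __init__(self, pool, pes):
        self.pool = pool
        self.pes = pes

    def infect(self, ilet):
        """Run one incarnation of the i-let per claimed PE and wait for all."""
        threads = [threading.Thread(target=ilet, args=(pe,)) for pe in self.pes]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    def retreat(self):
        """Give all claimed PEs back to the pool."""
        with self.pool.lock:
            self.pool.free.extend(self.pes)
        self.pes = []


class Pool:
    """Hypothetical stand-in for the resource manager: tracks free PEs."""

    def __init__(self, num_pes):
        self.free = list(range(num_pes))
        self.lock = threading.Lock()

    def invade(self, min_pes, max_pes):
        """Grant between min_pes and max_pes free PEs, or fail."""
        with self.lock:
            if len(self.free) < min_pes:
                raise RuntimeError("constraints cannot be satisfied")
            granted = self.free[:max_pes]
            self.free = self.free[max_pes:]
        return Claim(self, granted)


if __name__ == "__main__":
    pool = Pool(6)
    claim = pool.invade(4, 10)   # at least 4, at most 10 PEs
    claim.infect(lambda pe: print("Hello World from PE", pe))
    claim.retreat()
```

In this model the application asks for a range rather than a fixed count, so the pool is free to grant fewer PEs than the maximum when the machine is busy.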
Invasive Applications are Malleable
(D. G. Feitelson, L. Rudolph, Towards convergence in job schedulers for parallel supercomputers, IPPS 1996)

                  on submission   at runtime
User decides      rigid           evolving
System decides    moldable        malleable
Example: Heat Dissipation with Multigrid Approach
(M. Schreiber, A. Zwinkau, et al., Invasive computing in HPC with X10, X10 Workshop 2013)
Multigrid lets resources idle
[Figure annotation: idle!]
Invasive Apps exchange cores
Asynchronously Malleable: "System decides any time at runtime"
[Diagram labels: Master, Master, Slave, Slave, Slave]
Works nicely for Master-Slave Applications
Asynchronously Malleable is more efficient
[Figure annotation: Not idle!]
Integration is async-malleable f(x) = sin(x²)
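One way to see why numerical integration tolerates resizing: the interval can be cut into independent jobs, and the result does not depend on how many workers process them. The Python sketch below is an illustration under stated assumptions, not the authors' implementation. The midpoint rule, the job count, and the names `make_jobs`, `midpoint`, and `integrate` are all hypothetical, and the worker count changes only between rounds, which is a simplification of truly asynchronous resizing.

```python
import math
import threading


def f(x):
    return math.sin(x * x)


def make_jobs(a, b, num_jobs):
    """Split [a, b] into num_jobs independent sub-intervals."""
    width = (b - a) / num_jobs
    return [(i, a + i * width, a + (i + 1) * width) for i in range(num_jobs)]


def midpoint(lo, hi, steps=1000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


def integrate(a, b, num_jobs, workers_per_round):
    """Process the jobs in rounds; each round uses the next worker count."""
    assert all(n > 0 for n in workers_per_round)
    jobs = make_jobs(a, b, num_jobs)
    partial = [0.0] * num_jobs

    def run(i, lo, hi):
        partial[i] = midpoint(lo, hi)   # each job writes only its own slot

    done = 0
    round_no = 0
    while done < num_jobs:
        n = workers_per_round[round_no % len(workers_per_round)]
        batch = jobs[done:done + n]
        threads = [threading.Thread(target=run, args=job) for job in batch]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        done += len(batch)
        round_no += 1
    # Summing by job index keeps the result independent of the schedule.
    return math.fsum(partial)


if __name__ == "__main__":
    fixed = integrate(0.0, 2.0, 16, [4])
    resized = integrate(0.0, 2.0, 16, [4, 2, 7, 1])
    print(fixed == resized)
```

Because every job owns one slot of `partial` and the final sum runs in job order, changing the worker count mid-computation only changes how long the run takes, not what it returns.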
Sorting is async-malleable
(Patrick Flick, Peter Sanders, Jochen Speck, Malleable Sorting, IPDPS 2013)
    val ilet = (id:IncarnationID) => {
      for (job in queue) {
        if (queue.checkTermination(id)) break;
        job.do();
      }
    }
    val resizeHandler = (add:List[PE], remove:List[PE]) => {
      for (pe in add) queue.addWorker(pe, ilet);
      queue.adapt();
      for (pe in remove) queue.signalTermination(pe);
    }
    val constraints = new PEQuantity(4, 10)
                   && new AsyncMalleable(resizeHandler)
                   && new ScalabilityHint(speedupCurve);
    val claim = Claim.invade(constraints);
    queue.adaptTo(claim);
    claim.infect(ilet);
    claim.retreat();
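The resize-handler pattern in the listing above can be tried out with ordinary threads. The Python sketch below is a model under assumptions, not the invadeX10 queue: `MalleableQueue` and its methods are hypothetical names that only mirror `addWorker`, `signalTermination`, and `checkTermination`, and integer ids stand in for PEs. A worker checks its termination flag before taking a job, so a removed worker finishes its current job and no job is lost.

```python
import queue
import threading


class MalleableQueue:
    """Hypothetical job queue whose worker set can change while jobs run."""

    def __init__(self, jobs):
        self.jobs = queue.Queue()
        for job in jobs:
            self.jobs.put(job)
        self.stop = {}       # worker id -> Event, set when the worker must leave
        self.threads = {}
        self.results = []
        self.lock = threading.Lock()

    def _work(self, pe):
        # Check the flag between jobs, like checkTermination in the listing.
        while not self.stop[pe].is_set():
            try:
                job = self.jobs.get_nowait()
            except queue.Empty:
                break
            value = job()
            with self.lock:
                self.results.append(value)

    def add_worker(self, pe):
        self.stop[pe] = threading.Event()
        thread = threading.Thread(target=self._work, args=(pe,))
        self.threads[pe] = thread
        thread.start()

    def signal_termination(self, pe):
        self.stop[pe].set()

    def resize(self, add, remove):
        """Counterpart of the resize handler: the system calls it at any time."""
        for pe in add:
            self.add_worker(pe)
        for pe in remove:
            self.signal_termination(pe)

    def join(self):
        for thread in list(self.threads.values()):
            thread.join()


def demo():
    q = MalleableQueue([lambda i=i: i * i for i in range(200)])
    q.resize(add=[0, 1], remove=[])    # initial claim: two workers
    q.resize(add=[2, 3], remove=[0])   # system grows the claim, takes one PE back
    q.join()
    return sorted(q.results)


if __name__ == "__main__":
    print(len(demo()))
```

The design choice worth noting is that termination is cooperative: the system only sets a flag, and the worker leaves at the next job boundary, which is what makes the pattern a good fit for master-slave applications with small jobs.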
    val ilet = (id:IncarnationID) => {
      for (job in queue) {
        if (queue.checkTermination(id)) break;
        job.do();
      }
    }
    val resizeHandler = (add:List[PE], remove:List[PE]) => {
      for (pe in add) queue.addWorker(pe, ilet);
      queue.adapt();
      for (pe in remove) queue.signalTermination(pe);
    }
    val constraints = new PEQuantity(4, 10)
                   && new Malleable(resizeHandler)
                   && new ScalabilityHint(speedupCurve);
    val claim = Claim.invade(constraints);
    queue.adaptTo(claim);
    claim.infect(ilet);
    claim.retreat();
Invasive Malleable Applications
Appendix
Case Study: Multigrid
(M. Schreiber, A. Zwinkau, et al., Invasive computing in HPC with X10, X10 Workshop 2013)