Parallel Computing at the Desktop
Aaron Smith - March 2015 GSPS
Outline

/usr/local
|--- Why Parallel?
|--- Closer Look @
|    |--- Hardware
|    |--- Software
|--- Language Considerations
|--- Parallel Paradigms
|--- Example Code
     |--- Serial
     |--- MPI
     |--- OpenMP

Aaron Smith | UT Austin | Parallel Computing at the Desktop
Why parallel?

>> Speed up code  [ processing power ]
   Slow is relative (minutes/days/months).
>> Share the workload  [ big/distributed data ]
   Big is relative (MB, GB, TB).
Amdahl's Law

>> Serial sections limit the parallel effectiveness.

                     1
   Speedup  =  -------------
               f_s + f_p / p

   f_s = serial fraction
   f_p = parallel fraction
   p   = number of processors
What resources do you have?

Hardware

>> Know the basic architecture.
>> What exactly is multi-core?
   CPU = Central Processing Unit
   SMP = Symmetric Multiprocessing
   CMP = Chip-level Multiprocessing
         Big pool of slower cache and separate fast memory/cycles
   SMT = Simultaneous Multithreading
         e.g. quad-core, hyperthreaded processors
         Effectively 2x4x2 - lower latency
>> Distributed and Shared Memory
   What processor owns the data?
   Race conditions and other problems
   Communication overhead / bottlenecks

Software

>> Compilers are smart!
   We don't have to try as hard.
>> Who's developing?
   Open source community
   Well-established standards
>> Version Control (git/hg)
>> Documentation
>> User-friendliness
   Unified codebase
   Trustworthy
   Unit Testing
   Installation
   Languages...
The Language Landscape

Compiled

>> C/C++ and FORTRAN
>> Code is reduced to machine-specific instructions (executable).
>> Faster runtimes, easy to optimize
>> Low-level access to data structures
>> Less flexible -- static types

Interpreted

>> Python, Java, C#, bash
>> Code is saved as written and must be translated at runtime.
>> Faster development times
>> Convenient high-level functions
>> Extra freedom -- dynamic types, type checking, extra information

Just-In-Time (JIT)

>> Web-based applications (Java)
>> Julia -- smart compiler, still under development; read the docs thoroughly to avoid pitfalls.
>> Ongoing development & support
Paradigms in Parallel Programming

1. Run several serial programs
   e.g. shell scripting - not processor or memory limited
2. Message-Passing Interface (MPI)
   STANDARD - "necessary" for large clusters and supercomputers
3. Open Multi-Processing (OpenMP)
   STANDARD - incremental parallelization, easy, shared memory
4. Hybrid Programming
   Important enough to be its own category - more memory & processors
5. Graphics Processing Units (GPU)
   Very efficient for certain kinds of operations, but not everything
6. Useful but more obscure methods
   Native to languages, architecture-centric, many integrated cores (MIC), ...
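Paradigm 1 needs no parallel library at all: the shell can launch independent serial runs in the background and wait for them all. A minimal sketch (the `./mc_pi` program name is hypothetical; `echo` stands in for it here):

```shell
# Launch four independent serial jobs; "&" puts each in the
# background, "wait" blocks until all of them have finished.
for seed in 1 2 3 4; do
    echo "result for seed $seed" > "run_$seed.log" &   # stand-in for: ./mc_pi $seed
done
wait

# collect the per-run results
cat run_1.log run_2.log run_3.log run_4.log
```

This works well when the runs are independent (e.g. different seeds or input files) and each fits in memory on its own.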
Example: MC integration

        4 x (# Hits)
   pi = -------------
        (# Attempts)

   Points are drawn uniformly in the unit square; the fraction that
   lands inside the quarter circle estimates its area, pi/4.

[Figure: random (x, y) points in the unit square; hits fall inside the quarter circle]
Example: Serial MC integration

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

int main ( int argc, char * argv[] )
{
    double x, y, r, pi;
    int i, count = 0, niter = 1e8;
    srand(time(NULL)); /* set random seed */

    /* main loop */
    for ( i = 0; i < niter; ++i )
    {
        /* get random points */
        x = ( double )rand() / RAND_MAX;
        y = ( double )rand() / RAND_MAX;
        r = sqrt(x*x + y*y);
        /* check to see if point is in unit circle */
        if ( r <= 1 ) ++count;
    } /* end main loop */

    pi = 4.0 * ( ( double )count / ( double )niter );
    printf("Pi: %f\n", pi); /* pi = 4 * (hits / attempts) */
    return 0;
}
Example: Serial MC integration (structure)

#include <...>

int main ( int argc, char * argv[] )
{
    /* declare variables */

    srand(time(NULL)); /* random seed */

    for ( i = 0; i < niter; ++i )
    { /* test if random points are in unit circle */ }

    pi = 4.0 * ( ( double )count / ( double )niter );
    printf("Pi: %f\n", pi); /* pi = 4 * (hits / attempts) */

    return 0;
}