Porting Charm++ to a New System: Writing a Machine Layer
Sayantan Chakravorty
5/01/2008, Parallel Programming Laboratory
Why have a Machine Layer?

The machine layer sits at the bottom of the software stack:
- User Code (.ci, .C, .h)
- Charm++: load balancing, virtualization
- Converse: scheduler, memory management, timers
- Machine Layer: message delivery
Where is the Machine Layer?
- Code lives in charm/src/arch/<Layer Name>
- Files needed for a machine layer:
  - machine.c: contains the C code
  - conv-mach.sh: defines environment variables
  - conv-mach.h: defines macros that choose the version of machine.c
- Many variants can be produced from the same machine.c by varying conv-mach-<option>.*
  - 132 versions based on only 18 machine.c files
What does a Machine Layer do?

Every process, front end and compute nodes alike, goes through the same life cycle of machine-layer entry points:
- ConverseInit: startup
- CmiSyncSendFn / CmiSyncBroadcastFn: message transmission
- ConverseExit: clean shutdown
- CmiAbort: error exit
Different kinds of Machine Layers

Differentiated by startup method:
- Uses a lower-level library/run time:
  - MPI: mpirun is the frontend (cray, sol, bluegenep)
  - VMI: vmirun is the frontend (amd64, ia64)
  - ELAN: prun is the frontend (axp, ia64)
- Charm run time does startup:
  - Network based (net): charmrun is the frontend (amd64, ia64, ppc)
  - Infiniband, Ethernet, Myrinet
Net Layer: Why?
- Why do we need startup in the Charm RTS?
  - With a low-level interconnect API, no startup is provided
- Why use a low-level API?
  - Faster. Why faster?
    - Lower overheads
    - We can design for a message-driven system
  - More flexible. Why more flexible?
    - Can implement functionality with exactly the semantics needed
Net Layer: What?
- Code base for implementing a machine layer on a low-level interconnect API
- Startup: ConverseInit calls CmiMachineInit and node_addresses_obtain; charmrun runs req_client_connect
- Messaging: CmiSyncSendFn and CmiSyncBroadcastFn send through DeliverViaNetwork; CommunicationServer handles incoming traffic
- Shutdown: ConverseExit calls CmiMachineExit; CmiAbort handles errors
Net Layer: Startup

charmrun.c:

    main(){
      //read node file
      nodetab_init();
      //fire off compute node processes
      start_nodes_rsh();
      //wait for all nodes to reply,
      //send nodes their node table
      req_client_connect();
      //poll for requests
      while (1) req_poll();
    }

machine.c:

    ConverseInit(){
      //open socket with charmrun
      skt_connect(..);
      //initialize the interconnect
      CmiMachineInit();
      //send my node data, get the node table
      node_addresses_obtain(..);
      //start the Charm++ user code
      ConverseRunPE();
    }

Each compute process sends its node data to charmrun, which gathers all of it and replies with the complete node table.
Net Layer: Sending messages

    CmiSyncSendFn(int proc, int size, char *msg){
      //common function for send
      CmiGeneralSend(proc, size, 'S', msg);
    }

    CmiGeneralSend(int proc, int size, int freemode, char *data){
      OutgoingMsg ogm = PrepareOutgoing(cs, pe, size, freemode, data);
      DeliverOutgoingMessage(ogm);
      //check for incoming messages and completed sends
      CommunicationServer();
    }

    DeliverOutgoingMessage(OutgoingMsg ogm){
      //send the message on the interconnect
      DeliverViaNetwork(ogm, ..);
    }
Net Layer: Exit

    ConverseExit(){
      //shut down the interconnect cleanly
      CmiMachineExit();
      //shut down Converse
      ConverseCommonExit();
      //inform charmrun this process is done
      ctrl_sendone_locking("ending", NULL, 0, NULL, 0);
    }
Net Layer: Receiving Messages
- Notice that no receive calls have appeared so far
- This is a result of the message-driven paradigm: there are no explicit receive calls
- Receiving starts in CommunicationServer:
  - Interconnect-specific code collects the received message
  - Calls CmiPushPE to hand the message over to Converse
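The receive path above can be sketched in miniature: a polling loop drains whatever the interconnect has ready and pushes each message onto a per-PE queue. Here `push_pe` stands in for CmiPushPE, and the queue is a toy stand-in for Converse's scheduler queue; all names are illustrative, not the real Converse API.

```c
#include <stddef.h>

#define QCAP 16
static void *pe_queue[QCAP];   /* toy stand-in for the Converse scheduler queue */
static int qlen = 0;

static void push_pe(void *msg) {   /* stand-in for CmiPushPE */
    pe_queue[qlen++] = msg;
}

/* Drain whatever the "interconnect" has ready; next_msg() is assumed
 * to return NULL when the port is empty.  Returns messages delivered. */
int communication_server(void *(*next_msg)(void)) {
    int delivered = 0;
    void *m;
    while ((m = next_msg()) != NULL) {
        push_pe(m);
        delivered++;
    }
    return delivered;
}

/* Toy feed for testing: two pending messages, then an empty port. */
static int fed = 0;
static int msgs[2];
static void *toy_next(void) { return fed < 2 ? (void *)&msgs[fed++] : NULL; }
```

User code never calls a receive function; it only sees messages once the scheduler pops them off the queue.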
Let’s write a Net based Machine Layer
A Simple Interconnect
- Let’s make up an interconnect
- Simple:
  - Each node has a port
  - Other nodes send it messages on that port
  - A node reads its port for incoming messages
  - Messages are received atomically
- Reliable
- Does flow control itself
The Simple Interconnect API
- Initialization:
  - void si_init()
  - int si_open()
  - NodeID si_getid()
- Send a message:
  - int si_write(NodeID node, int port, int size, char *msg)
- Receive a message:
  - int si_read(int port, int size, char *buf)
- Exit:
  - int si_close(int port)
  - void si_done()
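Since SI is made up, a loopback stub of this API is enough to exercise the layer without hardware. The sketch below follows the signatures on this slide; the single in-memory buffer, its size limit, and the return conventions (si_write returns 0 on failure, si_read returns bytes read) are assumptions for illustration.

```c
#include <string.h>

typedef int NodeID;

#define SI_MAX_MSG 4096
static char si_buf[SI_MAX_MSG];   /* single loopback "wire" */
static int  si_buf_len = 0;       /* bytes waiting to be read */

void si_init(void)   { si_buf_len = 0; }
int  si_open(void)   { return 1; }   /* one port per node (assumed) */
NodeID si_getid(void){ return 0; }   /* single-node loopback */

/* Deliver size bytes of msg to (node, port); here: append to the local buffer. */
int si_write(NodeID node, int port, int size, char *msg) {
    (void)node; (void)port;
    if (si_buf_len + size > SI_MAX_MSG) return 0;   /* 0 = failure */
    memcpy(si_buf + si_buf_len, msg, size);
    si_buf_len += size;
    return 1;
}

/* Read up to size bytes from the port; returns bytes read, 0 if empty. */
int si_read(int port, int size, char *buf) {
    (void)port;
    int n = si_buf_len < size ? si_buf_len : size;
    memcpy(buf, si_buf, n);
    memmove(si_buf, si_buf + n, si_buf_len - n);
    si_buf_len -= n;
    return n;
}

int  si_close(int port) { (void)port; return 0; }
void si_done(void)      { si_buf_len = 0; }
```

With this stub, the machine layer's send and receive paths can be unit-tested on a single workstation before touching the real interconnect.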
Let’s start
- Net layer based implementation for SI

conv-mach-si.sh:

    CMK_INCDIR="-I/opt/si/include"
    CMK_LIBDIR="-L/opt/si/lib"
    CMK_LIBS="$CMK_LIBS -lsi"

conv-mach-si.h:

    #undef CMK_USE_SI
    #define CMK_USE_SI 1
    //Polling based net layer
    #undef CMK_NETPOLL
    #define CMK_NETPOLL 1
Net based SI Layer

machine.c selects the interconnect-specific code at compile time:

    /* machine.c */
    #if CMK_USE_GM
      #include "machine-gm.c"
    #elif CMK_USE_SI
      #include "machine-si.c"
    #elif ...
    #endif
    #include "machine-dgram.c"

machine-si.c (which includes "si.h") implements CmiMachineInit, DeliverViaNetwork, CommunicationServer, and CmiMachineExit; machine-dgram.c provides the common message delivery code.
Initialization

machine-si.c:

    NodeID si_nodeID;
    int si_port;

    CmiMachineInit(){
      si_init();
      si_port = si_open();
      si_nodeID = si_getid();
    }

machine.c:

    static OtherNode nodes;

    void node_addresses_obtain(..){
      ChSingleNodeinfo me;
    #ifdef CMK_USE_SI
      me.info.nodeID = si_nodeID;
      me.info.port = si_port;
    #endif
      //send node data to charmrun
      ctrl_sendone_nolock("initnode", &me, sizeof(me), NULL, 0);
      //receive and store node table
      ChMessage_recv(charmrun_fd, &tab);
      for(i=0; i<Cmi_num_nodes; i++){
        nodes[i].nodeID = tab->data[i].nodeID;
        nodes[i].port = tab->data[i].port;
      }
    }

charmrun.c:

    void req_client_connect(){
      //collect all node data
      for(i=0; i<nClients; i++){
        ChMessage_recv(req_clients[i], &msg);
        ChSingleNodeInfo *m = msg->data;
    #ifdef CMK_USE_SI
        nodetab[m->PE].nodeID = m->info.nodeID;
        nodetab[m->PE].port = m->info.port;
    #endif
      }
      //send node table to all
      for(i=0; i<nClients; i++){
        //send nodetab on req_clients[i]
      }
    }
Messaging: Design
- A small header travels with every message:
  - Contains the size of the message
  - Source NodeID (not strictly necessary)
- Receive path:
  - Read the header
  - Allocate a buffer for the incoming message
  - Read the message into the buffer
  - Send it up to Converse
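The wire format described above is just a header followed by the payload. A minimal sketch of framing and parsing, assuming the header layout from this design (the pack/unpack helpers are illustrative, not part of the real layer):

```c
#include <string.h>

typedef int NodeID;

typedef struct {
    unsigned int size;   /* payload bytes that follow the header */
    NodeID nodeID;       /* source node (not strictly necessary) */
} si_header;

/* Frame msg into out; returns total bytes written (header + payload). */
int frame_msg(NodeID me, const char *msg, unsigned int size, char *out) {
    si_header hdr;
    hdr.size = size;
    hdr.nodeID = me;
    memcpy(out, &hdr, sizeof hdr);
    memcpy(out + sizeof hdr, msg, size);
    return (int)(sizeof hdr + size);
}

/* Parse a frame: fill in hdr, return a pointer to the payload. */
const char *parse_msg(const char *in, si_header *hdr) {
    memcpy(hdr, in, sizeof *hdr);
    return in + sizeof *hdr;
}
```

Because the header is read first, the receiver always knows exactly how many payload bytes to allocate and wait for.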
Messaging: Code

machine-si.c:

    typedef struct{
      unsigned int size;
      NodeID nodeID;
    } si_header;

    void DeliverViaNetwork(OutgoingMsg ogm, int dest, ...){
      DgramHeaderMake(ogm->data, ...);
      si_header hdr;
      hdr.nodeID = si_nodeID;
      hdr.size = ogm->size;
      OtherNode n = nodes[dest];
      if(!si_write(n.nodeID, n.port, sizeof(hdr), &hdr)){ /* error */ }
      if(!si_write(n.nodeID, n.port, hdr.size, ogm->data)){ /* error */ }
    }

    void CommunicationServer(){
      si_header hdr;
      while(si_read(si_port, sizeof(hdr), &hdr) != 0){
        void *buf = CmiAlloc(hdr.size);
        int readSize, readTotal = 0;
        while(readTotal < hdr.size){
          if((readSize = si_read(si_port, hdr.size - readTotal,
                                 (char *)buf + readTotal)) < 0){ /* error */ }
          readTotal += readSize;
        }
        //hand the message over to Converse
      }
    }
Exit

machine-si.c:

    CmiMachineExit(){
      si_close(si_port);
      si_done();
    }
More complex Layers
- Receive buffers need to be posted:
  - Packetization
- Unreliable interconnect:
  - Error and drop detection
  - Packetization
  - Retransmission
- Interconnect requires memory to be registered:
  - CmiAlloc implementation
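The packetization mentioned above can be sketched as splitting a message into fixed-size packets carrying a sequence number, then reassembling on the receiver. The packet size and header fields below are made up for illustration; a real layer would also handle timeouts and retransmission.

```c
#include <string.h>

#define PKT_PAYLOAD 8   /* assumed per-packet payload limit */

typedef struct {
    int seq;                 /* packet index within the message */
    int nbytes;              /* payload bytes in this packet */
    int total;               /* total message size, for the receiver */
    char data[PKT_PAYLOAD];
} packet;

/* Split msg into pkts[]; returns the packet count. */
int packetize(const char *msg, int size, packet *pkts) {
    int n = 0, off = 0;
    while (off < size) {
        int chunk = size - off < PKT_PAYLOAD ? size - off : PKT_PAYLOAD;
        pkts[n].seq = n;
        pkts[n].nbytes = chunk;
        pkts[n].total = size;
        memcpy(pkts[n].data, msg + off, chunk);
        off += chunk;
        n++;
    }
    return n;
}

/* Reassemble n packets (in any arrival order) into out. */
void reassemble(const packet *pkts, int n, char *out) {
    for (int i = 0; i < n; i++)
        memcpy(out + pkts[i].seq * PKT_PAYLOAD, pkts[i].data, pkts[i].nbytes);
}
```

Because each packet carries its sequence number and the total size, the receiver can place packets as they arrive and detect when the message is complete, which is also the hook where drop detection and retransmission would attach.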
Thank You