A Sleep-based Communication Mechanism to Save Processor Utilization in Distributed Streaming Systems

Shoaib Akram, Angelos Bilas
Foundation for Research and Technology - Hellas (FORTH)
Institute of Computer Science (ICS)
May 1, 2011

Outline

1 Introduction
2 Our Work
3 Experimental Platform
4 Results
5 A Broader Picture of Our Work
6 Conclusions

Efficiency in Back-end Processing

• Efficiency in back-end processing is important.
• Scalability is important, but the software stacks of individual nodes are becoming complex:
  • Runtime bloat (Nick Mitchell).
  • Complex messaging protocols.
  • Layers of software, libraries, etc.
• This leads to over-provisioning of resources for back-end processing.

Distributed Streaming Systems

• Recently gaining attention due to the large amounts of data to be processed/filtered.
• Static queries and moving data.
• Operators similar to those of traditional databases.
• Reasons for adopting a distributed model:
  • Geographically distributed sources of data.
  • Speed-up of application queries.
• Borealis (academic consortium) and System S (IBM) are common examples.

Key Requirements of Distributed Streaming Systems

• Scalability to many nodes.
• Provisioning for heavy inter-node communication.
• Rich library of stream operators.
• Communication protocol and operators should be decoupled.

The Architecture of Borealis - Event Structure

• Event-driven architecture.
• The notion of streams and tuples.

[Figure: structure of a stream. Labels: Stream; Tuples Info (Number of Tuples, Size of Tuple, Source of Tuples); Tuple1, Tuple2, Tuple3, Tuple4; each tuple: Time Stamp, Field1, Field2, Field3.]

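To make the stream/tuple layout in the figure concrete, here is a minimal Python sketch of one event: a tuples-info header followed by fixed-size tuples. Only the labels come from the slide. The byte layout, the field types, and the names `StreamTuple`, `pack_event`, and `unpack_event` are assumptions for illustration and are not the Borealis wire format.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

# Assumed header layout (not from the slides): number of tuples,
# size of one tuple in bytes, and a numeric source id, each a uint32.
HEADER = struct.Struct("<III")
# Assumed tuple layout: a float64 time stamp plus three int32 fields.
TUPLE = struct.Struct("<diii")


@dataclass
class StreamTuple:
    timestamp: float
    field1: int
    field2: int
    field3: int


def pack_event(source_id: int, tuples: List[StreamTuple]) -> bytes:
    """Serialize a batch of tuples behind a tuples-info header."""
    header = HEADER.pack(len(tuples), TUPLE.size, source_id)
    body = b"".join(
        TUPLE.pack(t.timestamp, t.field1, t.field2, t.field3) for t in tuples
    )
    return header + body


def unpack_event(buf: bytes) -> Tuple[int, List[StreamTuple]]:
    """Read the header, then walk the buffer one tuple at a time."""
    count, size, source_id = HEADER.unpack_from(buf, 0)
    tuples = []
    for i in range(count):
        offset = HEADER.size + i * size
        tuples.append(StreamTuple(*TUPLE.unpack_from(buf, offset)))
    return source_id, tuples
```

Packing several tuples behind one header is also where a batching factor would apply: a larger batch amortizes the header and the per-message cost over more tuples.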
The Architecture of Borealis - Threads and Data Structures

• Four threads that work asynchronously:
  • receive thread
  • process thread
  • prepare thread
  • send thread
• Data structures for inter-thread communication.

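The four-thread structure can be sketched as a pipeline of stages connected by queues. This is an illustration only: the slide names the threads but not the data structures, so the use of `queue.Queue`, the `STOP` sentinel, and the toy work done in each stage are all assumptions.

```python
import queue
import threading

STOP = object()  # sentinel that tells each stage to shut down


def stage(work, inbox, outbox):
    """Take items from inbox, apply work, and pass results downstream."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(work(item))


def run_pipeline(raw_messages):
    # Five queues connect the four stages and collect the final output.
    queues = [queue.Queue() for _ in range(5)]
    names = ["receive", "process", "prepare", "send"]
    work = [
        lambda raw: raw.split(","),               # receive: unpack fields
        lambda fields: [f for f in fields if f],  # process: toy filter
        lambda fields: ",".join(fields),          # prepare: repack
        lambda text: text.encode("ascii"),        # send: bytes for the wire
    ]
    threads = [
        threading.Thread(
            target=stage, args=(work[i], queues[i], queues[i + 1]), name=names[i]
        )
        for i in range(4)
    ]
    for t in threads:
        t.start()
    for raw in raw_messages:
        queues[0].put(raw)
    queues[0].put(STOP)

    sent = []
    while True:
        item = queues[4].get()
        if item is STOP:
            break
        sent.append(item)
    for t in threads:
        t.join()
    return sent
```

Note that `queue.Queue.get` blocks until an item arrives, so this sketch shows only the thread structure, not the sleep-based waiting discussed later in the talk.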
Communication Subsystems in Distributed Middleware Systems

• Send/Receive operations are implemented using:
  • Interrupts - High overhead at high network speeds and large message rates.
  • Polling - Wastes CPU cycles at low network rates.
• Send/Receive API provided by Linux sockets:
  • Blocking sockets (interrupts).
  • Non-blocking sockets (polling).
  • Monitoring multiple sockets (blocking call to select).
• Problems with monitoring multiple sockets with select.

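The three socket styles listed above can be tried side by side with Python's standard `socket` and `select` modules. This is a small sketch on a local `socket.socketpair()`, not the Borealis communication code; the function name `socket_modes_demo` and the result keys are made up for the example.

```python
import select
import socket


def socket_modes_demo():
    """Exercise non-blocking, select-based, and blocking receives."""
    a, b = socket.socketpair()
    results = {}
    try:
        # Non-blocking (polling): recv returns immediately, and raises
        # BlockingIOError when no data is waiting.
        b.setblocking(False)
        try:
            b.recv(16)
            results["poll_found_data"] = True
        except BlockingIOError:
            results["poll_found_data"] = False

        # select: wait on a set of sockets, here with a short timeout.
        readable, _, _ = select.select([b], [], [], 0.05)
        results["ready_before_send"] = len(readable)

        a.sendall(b"tuple")
        readable, _, _ = select.select([b], [], [], 1.0)
        results["ready_after_send"] = len(readable)

        # Blocking: recv suspends the caller until data is available.
        b.setblocking(True)
        results["data"] = b.recv(16)
    finally:
        a.close()
        b.close()
    return results


if __name__ == "__main__":
    print(socket_modes_demo())
```

A polling loop built on the non-blocking call keeps a core busy even when no data arrives, which is the wasted-cycles case from the slide.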
Sleeping - An Alternative Approach

• Sleep for a specific amount of time if no communication is expected.
• Regulation of sleeping time:
  • Kernel issues.
  • Multiple applications.
  • The parameters of a single application may change.
  • The granularity of the sleeping time may change with a different kernel.

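The point about sleep granularity can be checked on a given machine by comparing the requested and the observed sleep time. The sketch below uses only `time.sleep` and `time.perf_counter`; the helper name `measure_sleep` and the chosen intervals are assumptions for illustration, and the numbers it prints depend on the kernel, timer configuration, and system load.

```python
import time


def measure_sleep(requested_s, repeats=5):
    """Return the observed duration of each sleep, in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        time.sleep(requested_s)
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    for requested in (0.0001, 0.001, 0.010):
        samples = measure_sleep(requested)
        worst = max(samples)
        print(f"requested {requested * 1e3:.1f} ms, "
              f"worst observed {worst * 1e3:.3f} ms")
```

If the observed time is much longer than the requested time for short intervals, a sleeping interval tuned on one kernel may not carry over to another.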
Our Approach: Distribution/Accumulation of Work

• A typical configuration of a data streaming system is a pipeline of senders/receivers.
• Send and receive threads work asynchronously.
• Goal of the send thread:
  • The node downstream has enough work to perform.
• Goal of the receive thread:
  • Unpack the events and give work to the process thread.
  • Layers above the communication protocol have enough work to do.

Working in Waves

• Both send and receive threads maintain messaging queues.
• The receive thread informs the send thread of the availability of free slots in the queue by sending a message (credit message).
• After processing a few buffers, the receive thread sends a credit message to the send thread.
• The credit message allows the send thread to send data in buffers that the receive thread has already made available.
• If the send thread cannot find a credit message, it sleeps.

Working in Waves

• The receive thread unpacks the events, hands the events to the event handler, and then checks for an event in the next slot in the queue.
• If the receive thread cannot find data in the buffer, it sleeps.
• While it is sleeping, the send thread fills up the queue with new events.

Working in Waves: Summary

• Sleeping criterion for the send thread:
  • Criterion: Sleep for a fixed amount of time if no credits are available.
  • Rationale: The receiver is busy unpacking messages and will send credits at some point.
• Sleeping criterion for the receive thread:
  • Criterion: Sleep for a fixed amount of time if no new message is available.
  • Rationale:
    • All the available messages were unpacked and distributed to the layer above.
    • Processing is much heavier than unpacking.
    • Collect work while consuming no extra CPU cycles.

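The two sleeping criteria can be sketched as a small two-thread simulation. This is a simplified model, not the authors' implementation: both threads live in one process, the credit message is modeled as a shared counter rather than a network message, and the names `Link`, `sender`, `receiver`, and `run_demo` as well as the constants are assumptions (the sleep is shortened so that the demo runs quickly).

```python
import collections
import threading
import time

SLEEP_S = 0.001    # fixed sleeping time (shortened for the demo)
QUEUE_SLOTS = 8    # assumed number of receive-side slots
CREDIT_BATCH = 4   # assumed: return credits after this many buffers


class Link:
    """Shared state standing in for the network between two nodes."""

    def __init__(self, slots):
        self.lock = threading.Lock()
        self.data = collections.deque()  # receive-side queue
        self.credits = slots             # free slots known to the sender
        self.max_depth = 0               # deepest queue occupancy seen


def sender(link, messages, stats):
    for msg in messages:
        while True:
            with link.lock:
                if link.credits > 0:
                    link.credits -= 1
                    link.data.append(msg)
                    link.max_depth = max(link.max_depth, len(link.data))
                    break
            # No credits: the receiver is busy, so sleep instead of spinning.
            stats["send_sleeps"] += 1
            time.sleep(SLEEP_S)


def receiver(link, expected, out, stats):
    freed = 0
    while len(out) < expected:
        with link.lock:
            msg = link.data.popleft() if link.data else None
        if msg is None:
            # Nothing to unpack: sleep while the sender refills the queue.
            stats["recv_sleeps"] += 1
            time.sleep(SLEEP_S)
            continue
        out.append(msg)  # stands in for handing the event to the layer above
        freed += 1
        if freed == CREDIT_BATCH:
            with link.lock:
                link.credits += freed  # the "credit message"
            freed = 0


def run_demo(n_messages):
    link = Link(QUEUE_SLOTS)
    out = []
    stats = {"send_sleeps": 0, "recv_sleeps": 0}
    tx = threading.Thread(target=sender, args=(link, range(n_messages), stats))
    rx = threading.Thread(target=receiver, args=(link, n_messages, out, stats))
    rx.start()
    tx.start()
    tx.join()
    rx.join()
    return out, stats, link.max_depth


if __name__ == "__main__":
    delivered, stats, depth = run_demo(1000)
    print(len(delivered), stats, depth)
```

Because the sender only transmits against credits, the receive-side queue never holds more than `QUEUE_SLOTS` entries, and neither thread spins while it waits. The sketch requires `CREDIT_BATCH` to be no larger than `QUEUE_SLOTS`; otherwise the sender could run out of credits before the receiver returns any.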
Machine Parameters and Benchmark for Evaluation

• Four server-type systems running Linux CentOS release 5.4.
• Two Intel Xeon Quad-core (2-way hyper-threaded).
• 14 GB DRAM.
• 10 Gbit/s Ethernet NIC from Myrinet.
• 10 Gbit/s Ethernet HP ProCurve 3400cl switch.
• A custom benchmark that filters the incoming data (the filter condition is always true, to load the network).
• The first node generates the tuples; the next two process the tuples.
• The last node receives the tuples and consumes them internally.

Some Parameters of Borealis

• Number of instances of Borealis (8).
• Batching factor (varying).
• Tuple size (varying).
• Size of send-side queue (10).
• Size of receive-side queue (100).
• Frequency of exchanging credits (every 10 buffers).
• Sleeping time is 10 ms.

Myrinet MX - A User-level Networking API

• Provides a user-level networking API.
• Baseline throughput is higher:
  • Removes one copy on the send side.
  • Removes two copies on the receive path.
  • Reduces the number of interrupts on the receive side.
• Fine-grained control for managing buffers.
• Ease of implementation of flow-control mechanisms.