CS 644: Introduction to Big Data
Chapter 8. Enabling Big-data Scientific Workflows in High-performance Networks
Chase Wu
New Jersey Institute of Technology
Outline
• Introduction
• Challenges & Objectives
• A Three-layer Architecture Solution
  - Enabling Technologies: networking and computing
• Networking for Big Data
  - Software-defined Networking (SDN)
  - High-performance Networking (HPN)
• Computing for Big Data
  - Workflow Management and Optimization
• Simulation/Experimental Results
• Conclusion
Introduction
• Supercomputing for big-data science
  - Astrophysics
  - Computational biology
  - Nanoscience
  - Climate research
  - Neutron sciences
  - Flow dynamics
  - Fusion simulation
  - Computational materials
Terascale Supernova Initiative (TSI)
• Collaborative project
  - Supernova explosion
• TSI simulation
  - 1 terabyte a day with a small portion of parameters
  - From TSI to PSI
• Transfer to remote sites
  - Interactive distributed visualization
  - Collaborative data analysis
  - Computation monitoring
  - Computation steering
[Figure: a client and a supercomputer or cluster, connected by a visualization channel, a visualization control channel, and a computation steering channel]
Challenges for Extreme-scale Scientific Applications
• A typical way of conducting research
  - Run simulation code on a supercomputer in batch mode
    Ø One day: 1 terabyte of datasets
  - Move datasets to HPSS
    Ø About 8 hours
  - Transfer datasets to remote sites over the Internet
    Ø TCP-based transfer tools: up to one week
  - Visualization (several hours or days):
    Ø Filter out data of interest
    Ø Partition dataset for parallel processing
    Ø Extract geometry data
    Ø Generate images on rendering engine
    Ø Display results on desktop, laptop, powerwall, etc.
Start over if any parameter values are not set appropriately!
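The time figures on this slide can be turned into rough average throughputs. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal units (1 TB = 10^12 bytes), ignores protocol overhead and disk I/O, and the 10 Gbps comparison rate is an illustrative assumption rather than a figure from the slide.

```python
# Back-of-the-envelope transfer arithmetic (assumes 1 TB = 10**12 bytes).
TERABYTE_BITS = 8 * 10**12

def transfer_seconds(size_bits: float, rate_bps: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and disk I/O."""
    return size_bits / rate_bps

def implied_rate_bps(size_bits: float, seconds: float) -> float:
    """Average throughput implied by moving size_bits in the given time."""
    return size_bits / seconds

WEEK_S = 7 * 24 * 3600      # "up to one week" over the Internet
HPSS_S = 8 * 3600           # "about 8 hours" to HPSS

week_rate_mbps = implied_rate_bps(TERABYTE_BITS, WEEK_S) / 1e6
hpss_rate_mbps = implied_rate_bps(TERABYTE_BITS, HPSS_S) / 1e6
# Assumed comparison point: a dedicated 10 Gbps channel.
ten_gbps_minutes = transfer_seconds(TERABYTE_BITS, 10e9) / 60

print(f"1 TB in one week  ~ {week_rate_mbps:.1f} Mbps average")
print(f"1 TB in 8 hours   ~ {hpss_rate_mbps:.1f} Mbps average")
print(f"1 TB at 10 Gbps   ~ {ten_gbps_minutes:.1f} minutes")
```

Under these assumptions, a week-long transfer of one day's output corresponds to an average rate in the low tens of Mbps, which is why the simulation produces data faster than it can be moved.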
Challenges in Modern Sciences
• BIG DATA: from T to P, to E, to Z, to Y, and beyond…
  - Simulation
    • Astrophysics, climate modeling, combustion research, etc.
  - Experimental
    • Spallation Neutron Source, Large Hadron Collider, etc.
  - Observational
    • Large-scale sensor networks, astronomical image data (Dark Energy Camera), etc.
No matter which type of data is considered, we need an end-to-end workflow solution for data transfer, processing, and analysis!
Big-data Scientific Workflows
• Require massively distributed resources
  - Hardware
    • Computing facilities, storage systems, special rendering engines, display devices (tiled display, powerwall, etc.), network infrastructure, etc.
  - Software
    • Domain-specific data analytics/processing tools, programs, etc.
  - Data type
    • Real-time, archival
• Feature different complexities
  - Simple case: linear pipeline (a special case of DAG)
  - Complex case: DAG-structured graph
• Support different application types
  - Interactive: minimize total end-to-end delay for fast response
  - Streaming: maximize frame rate to achieve smooth data flow
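The two application objectives can be made concrete for the simple linear-pipeline case. The following is a minimal sketch, not the course's own model: it assumes each stage has a fixed compute time and a fixed output-transfer time, and that in streaming mode computing and transfer steps overlap so the slowest single step limits the frame rate. The `Stage` class, stage names, and timings are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    compute_s: float   # time to process one dataset/frame on its node
    transfer_s: float  # time to ship the result to the next stage

def end_to_end_delay(stages):
    """Interactive objective: total delay of one dataset through the pipeline."""
    return sum(s.compute_s + s.transfer_s for s in stages)

def frame_rate(stages):
    """Streaming objective: frames per second, limited by the slowest step."""
    bottleneck = max(max(s.compute_s, s.transfer_s) for s in stages)
    return 1.0 / bottleneck

# Hypothetical three-stage visualization pipeline.
pipeline = [
    Stage("filter", compute_s=2.0, transfer_s=1.0),
    Stage("extract_geometry", compute_s=4.0, transfer_s=0.5),
    Stage("render", compute_s=1.0, transfer_s=0.25),
]

print("end-to-end delay (s):", end_to_end_delay(pipeline))
print("frame rate (frames/s):", frame_rate(pipeline))
```

The two metrics can pull in different directions: the delay depends on the sum of all steps, while the frame rate depends only on the bottleneck step.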
Ultimate Goals
- Support distributed workflows in heterogeneous environments
- Optimize workflow performance to meet various user requirements
  Ø Delay, throughput, reliability, etc.
  Ø Remote visualization, online computational monitoring and steering, etc.
- Make the best use of computing and networking resources
Solution: A Three-layer Architecture
Enabling Technologies
• Three layers
  - Top: Abstract scientific workflow
  - Middle: Virtual overlay network (grid, cloud)
  - Bottom: Physical high-performance network
• Top and bottom layers meet at middle layer
  - From bottom to middle: resource abstraction
    • Bandwidth scheduling
    • Performance modeling and prediction
  - From top to middle: workflow mapping
    • Optimization: where to execute modules?
• Workflow execution
  - Actual data transfer: transport control
  - Actual module running: job scheduling
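The mapping question ("where to execute modules?") can be illustrated with a toy exhaustive search over a linear pipeline. This is a sketch under stated assumptions, not the optimization method used in the course: the workloads, data sizes, node speeds, and link bandwidths are all hypothetical, the first and last modules are pinned to fixed nodes, and the cost model is simply computing time plus transfer time between modules placed on different nodes.

```python
from itertools import product

# Hypothetical inputs (arbitrary units).
work = [8.0, 4.0, 2.0]            # computational work of each module
out_size = [4.0, 1.0]             # data sent from module i to module i+1
speed = {"A": 1.0, "B": 4.0}      # processing speed of each node
bandwidth = {("A", "B"): 2.0, ("B", "A"): 2.0}  # link bandwidth

def delay(mapping):
    """End-to-end delay of the pipeline under a module-to-node mapping."""
    total = sum(w / speed[n] for w, n in zip(work, mapping))
    for i, size in enumerate(out_size):
        src, dst = mapping[i], mapping[i + 1]
        if src != dst:  # co-located modules exchange data at no cost
            total += size / bandwidth[(src, dst)]
    return total

def best_mapping(first="A", last="A"):
    """Try every placement of the middle modules; keep the fastest one."""
    nodes = list(speed)
    candidates = (
        (first, *middle, last)
        for middle in product(nodes, repeat=len(work) - 2)
    )
    return min(candidates, key=delay)

choice = best_mapping()
print(choice, delay(choice))
```

Exhaustive search grows exponentially with the number of modules, so it only serves to show what is being optimized; practical mapping schemes need more scalable algorithms.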
Networking Requirements
• Provision dedicated channels to meet different transport objectives
  - High bandwidths
    • Multiples of 10 Gbps up to terabit networking
    • Support bulk data transfers
  - Stable bandwidths
    • 100s of Mbps
    • Support interactive control operations
• Why not the Internet?
  - Only the backbone has high bandwidth (the last mile does not)
  - Packet-level resource sharing
  - Best-effort IP routing
  - TCP: hard to sustain 10s of Gbps or to stabilize throughput
An Overview of the TCP/IP Stack
Software-Defined Networking
• The Concept of Virtualization
• Virtualization of Computing
• Virtualization of Networking
• Software-Defined Network
• Possible Directions
Concept of Virtualization
• Decoupling HW/SW
• Abstraction and layering
• Using, demanding, but not owning or configuring
• Resource pool: flexible to slice, resize, combine, and distribute
• A degree of automation by software
Benefits of Virtualization
• An analogy: owning a huge house
  - Real estate, immovable property
  - Does not generate cash and income
• How to gain more profit?
  - Divide this huge house into suites, and RENT to people!
  - Renting suites: using but not owning
  - Transform a static investment into cash generators!!!
Virtualization of Computing
• Partitioning one physical machine
  - Virtual instances
    • Running concurrently, sharing resources
• Hypervisor: Virtual Machine Monitor (VMM)
  - A software layer that presents an abstraction of the physical resources
Key Factor of Virtualization
Networks are Hard to Manage
• Operating a network is expensive
  - More than half the cost of a network
  - Yet, operator error causes most outages
• Buggy software in the equipment
  - Routers with 20+ million lines of code
  - Cascading failures, vulnerabilities, etc.
• The network is “in the way”
  - Especially a problem in data centers
  - … and home networks
Traditional Computer Networks
Management plane: Collect measurements and configure the equipment
Software Defined Networking (SDN)
• Logically-centralized control: smart, slow
• API to the data plane (e.g., OpenFlow)
• Switches: dumb, fast
Provide Choices
[Figure: unified control plane stack, top to bottom]
• Networking applications: Bandwidth-on-Demand, Dynamic Optical Bypass, Application-Aware QoS, Unified Recovery, Traffic Engineering
• Unified control plane: NETWORK OPERATING SYSTEM over a VIRTUALIZATION (SLICING) PLANE
• Switch abstraction: OpenFlow Protocol
• Underlying data plane switching: packet switches, wavelength switch, time-slot switch, multi-layer switch, packet & circuit switches
Architecture
[Figure labels: Control Plane / Applications; API; Logical Forwarding Plane; Abstractions; Control Commands; Logical States; Mapping; Network Info Base; Distributed System; Onix / Network OS; Network Hypervisor; Distributes, Configures; Real States; OpenFlow]
Switch Forwarding Pipeline
Logical Forwarding Plane
As packets/flows traverse the network, they move through both the logical and the physical forwarding plane → logical context
Data-Plane: Simple Packet Handling
• Simple packet-handling rules
  - Pattern: match packet header bits
  - Actions: drop, forward, modify, send to controller
  - Priority: disambiguate overlapping patterns
  - Counters: #bytes and #packets
1. src=1.2.*.*, dest=3.4.5.* → drop
2. src=*.*.*.*, dest=3.4.*.* → forward(2)
3. src=10.1.2.3, dest=*.*.*.* → send to controller
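The pattern/action/priority/counter model can be sketched as a tiny flow table holding the three example rules. This is an illustration only, not the OpenFlow wire format or any real switch API: the class and function names are hypothetical, the numeric priorities (rule 1 highest) are an assumption since the slide does not give them, and the table-miss behavior is likewise assumed.

```python
from dataclasses import dataclass

def octets_match(pattern: str, addr: str) -> bool:
    """Match a dotted pattern such as '3.4.*.*' against an IPv4 address."""
    return all(p == "*" or p == a
               for p, a in zip(pattern.split("."), addr.split(".")))

@dataclass
class Rule:
    priority: int          # higher value wins among overlapping patterns
    src: str
    dst: str
    action: str
    packet_count: int = 0  # counters updated on every match
    byte_count: int = 0

class FlowTable:
    def __init__(self, rules):
        self.rules = sorted(rules, key=lambda r: r.priority, reverse=True)

    def handle(self, src: str, dst: str, size: int) -> str:
        for rule in self.rules:
            if octets_match(rule.src, src) and octets_match(rule.dst, dst):
                rule.packet_count += 1
                rule.byte_count += size
                return rule.action
        return "send to controller"  # assumed behavior on a table miss

# The three rules from the slide, with assumed priorities.
table = FlowTable([
    Rule(30, "1.2.*.*", "3.4.5.*", "drop"),
    Rule(20, "*.*.*.*", "3.4.*.*", "forward(2)"),
    Rule(10, "10.1.2.3", "*.*.*.*", "send to controller"),
])
```

For example, `table.handle("1.2.9.9", "3.4.5.6", 100)` matches both rule 1 and rule 2, and the priority field is what makes the switch drop the packet instead of forwarding it.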
Unifies Different Kinds of Boxes
• Router
  - Match: longest destination IP prefix
  - Action: forward out a link
• Switch
  - Match: destination MAC address
  - Action: forward or flood
• Firewall
  - Match: IP addresses and TCP/UDP port numbers
  - Action: permit or deny
• NAT
  - Match: IP address and port
  - Action: rewrite address and port
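As one instance of these match/action pairs, the router's "longest destination IP prefix" match can be sketched with Python's standard `ipaddress` module. The forwarding table contents and link names below are hypothetical, and the linear scan is chosen for clarity rather than speed.

```python
import ipaddress

# Hypothetical forwarding table: destination prefix -> outgoing link.
routes = {
    ipaddress.ip_network("0.0.0.0/0"): "link0",    # default route
    ipaddress.ip_network("3.4.0.0/16"): "link1",
    ipaddress.ip_network("3.4.5.0/24"): "link2",
}

def lookup(dst: str) -> str:
    """Return the link of the most specific prefix containing dst."""
    addr = ipaddress.ip_address(dst)
    matching = [net for net in routes if addr in net]
    best = max(matching, key=lambda net: net.prefixlen)
    return routes[best]

print(lookup("3.4.5.6"))   # covered by all three prefixes
print(lookup("3.4.9.1"))   # covered by /16 and the default route
print(lookup("8.8.8.8"))   # covered by the default route only
```

Real routers use specialized structures such as tries or TCAMs for this lookup; the point here is only that routing fits the same match-then-act shape as the other boxes.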
Controller: Programmability
[Figure: Controller Application running on a Network OS]
• Events from switches
  - Topology changes
  - Traffic statistics
  - Arriving packets
• Commands to switches
  - (Un)install rules
  - Query statistics
  - Send packets
Example OpenFlow Applications
• Dynamic access control
• Seamless mobility/migration
• Server load balancing
• Network virtualization
• Using multiple wireless access points
• Energy-efficient networking
• Adaptive traffic monitoring
• Denial-of-Service attack detection
See http://www.openflow.org/videos/
Example: Dynamic Access Control
• Inspect first packet of a connection
• Consult the access control policy
• Install rules to block or route traffic
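The three steps above can be sketched as a reactive controller paired with a switch that caches its decisions. This is a simplified model, not a real controller API: the class names, the `packet_in` method, and the policy format (a set of allowed source/destination pairs) are all hypothetical.

```python
class AccessController:
    """Toy controller: decides once per connection, based on a policy."""

    def __init__(self, allowed_pairs):
        self.allowed = set(allowed_pairs)   # assumed policy format

    def packet_in(self, src, dst):
        # Steps 1-2: inspect the first packet and consult the policy.
        return "forward" if (src, dst) in self.allowed else "drop"

class Switch:
    """Toy switch: asks the controller only when no rule is installed."""

    def __init__(self, controller):
        self.controller = controller
        self.rules = {}            # (src, dst) -> installed action
        self.controller_hits = 0   # packets that reached the controller

    def receive(self, src, dst):
        key = (src, dst)
        if key not in self.rules:
            self.controller_hits += 1
            # Step 3: install a rule so later packets stay in the switch.
            self.rules[key] = self.controller.packet_in(src, dst)
        return self.rules[key]

switch = Switch(AccessController({("10.0.0.1", "10.0.0.2")}))
```

Only the first packet of each connection reaches the controller; repeated calls to `switch.receive` with the same pair are answered from the installed rule.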
Seamless Mobility/Migration
• See host send traffic at new location
• Modify rules to reroute the traffic
Server Load Balancing
• Pre-install load-balancing policy
• Split traffic based on source IP
[Figure labels: src=0*, src=1*]
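Reading `src=0*` and `src=1*` as patterns on the leading bit of the source address, the split can be sketched as two pre-installed prefix rules. The replica names are hypothetical, and interpreting the patterns as the prefixes 0.0.0.0/1 and 128.0.0.0/1 is an assumption about the figure.

```python
import ipaddress

# Pre-installed policy: one rule per half of the source address space.
# src=0* -> 0.0.0.0/1, src=1* -> 128.0.0.0/1 (assumed interpretation).
rules = [
    (ipaddress.ip_network("0.0.0.0/1"), "replica_A"),
    (ipaddress.ip_network("128.0.0.0/1"), "replica_B"),
]

def pick_replica(src: str) -> str:
    """Return the server replica chosen for a client source address."""
    addr = ipaddress.ip_address(src)
    for prefix, replica in rules:
        if addr in prefix:
            return replica
    raise ValueError(f"no rule covers {src}")

print(pick_replica("10.0.0.1"))
print(pick_replica("192.168.1.1"))
```

Because the rules are installed ahead of time, no packet has to visit the controller; an uneven split can be obtained by using longer, unequal prefixes.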
Network Virtualization
Partition the space of packet headers
[Figure: Controller #1, Controller #2, Controller #3]
OpenFlow in the Wild
• Open Networking Foundation
  - Google, Facebook, Microsoft, Yahoo, Verizon, Deutsche Telekom, and many other companies
• Commercial OpenFlow switches
  - HP, NEC, Quanta, Dell, IBM, Juniper, …
• Network operating systems
  - NOX, Beacon, Floodlight, Nettle, ONIX, POX, Frenetic
• Network deployments
  - Eight campuses, and two research backbone networks
  - Commercial deployments (e.g., Google backbone)
Controller Delay and Overhead
• Controller is much slower than the switch
• Processing packets at the controller leads to delay and overhead
• Need to keep most packets in the “fast path”
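A rough calculation shows why the fast path matters. The numbers below are illustrative assumptions, not measurements from the slides: 10 microseconds to forward a packet in the switch and an extra 10 milliseconds whenever a packet is sent to the controller.

```python
def mean_delay_us(miss_fraction: float,
                  switch_us: float = 10.0,
                  controller_us: float = 10_000.0) -> float:
    """Average per-packet delay when a fraction of packets visit the controller.

    Every packet pays the switch delay; only the missed fraction also pays
    the (much larger) controller delay. Default values are assumptions.
    """
    return switch_us + miss_fraction * controller_us

for fraction in (0.0, 0.001, 0.01, 0.1):
    print(f"{fraction:>6.1%} to controller -> "
          f"{mean_delay_us(fraction):8.1f} us average")
```

With these assumed values, sending even 1% of packets to the controller raises the average delay by roughly an order of magnitude, which is the motivation for installing rules so that most packets never leave the switch.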
Distributed Controller
For scalability and reliability
Partition and replicate state
[Figure: two controller instances, each with a Controller Application on a Network OS]