Data-Intensive Scalable Computing
Randal E. Bryant, Carnegie Mellon University
http://www.cs.cmu.edu/~bryant

Examples of Big Data Sources
- Wal-Mart
  - 267 million items/day, sold at 6,000 stores
  - HP built them a 4 PB data warehouse
  - Mines the data to manage its supply chain, understand market trends, and formulate pricing strategies
- LSST
  - Chilean telescope will scan the entire sky every 3 days
  - A 3.2-gigapixel digital camera
  - Will generate 30 TB/day of image data

Why So Much Data?
- We can get it
  - Automation + the Internet
- We can keep it
  - Seagate Barracuda: 1.5 TB @ $150 (10¢/GB)
- We can use it
  - Scientific breakthroughs
  - Business process efficiencies
  - Realistic special effects
  - Better health care
- Could we do more?
  - Apply more computing power to this data

Google Data Center (The Dalles, Oregon)
- Hydroelectric power @ 2¢ / kWh
- 50 megawatts: enough to power 6,000 homes

Varieties of Cloud Computing
- Hosted services: “I don’t want to be a system administrator. You handle my data & applications.”
  - Documents, web-based email, etc.
  - Can access from anywhere
  - Easy sharing and collaboration
- Data-intensive scalable computing (DISC): “I’ve got terabytes of data. Tell me what they mean.”
  - Very large, shared data repository
  - Complex analysis

Oceans of Data, Skinny Pipes
- 1 terabyte: easy to store, hard to move

  Disks                     MB/s       Time to move 1 TB
  Seagate Barracuda         115        2.3 hours
  Seagate Cheetah           125        2.2 hours

  Networks                  MB/s       Time to move 1 TB
  Home Internet             < 0.625    > 18.5 days
  Gigabit Ethernet          < 125      > 2.2 hours
  PSC Teragrid connection   < 3,750    > 4.4 minutes

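A quick sanity check on the table, as a small Python sketch: time to move 1 TB at each quoted rate, assuming decimal units (1 TB = 10^6 MB). The rates are the ones on the slide; results roughly reproduce the times shown.

```python
# Time to move 1 TB at the sustained rates quoted on the slide.
# Assumes decimal units: 1 TB = 1,000,000 MB.
TERABYTE_MB = 1_000_000

rates_mb_per_s = {
    "Seagate Barracuda (disk)": 115,
    "Seagate Cheetah (disk)": 125,
    "Home Internet": 0.625,
    "Gigabit Ethernet": 125,
    "PSC Teragrid connection": 3_750,
}

for name, rate in rates_mb_per_s.items():
    seconds = TERABYTE_MB / rate
    if seconds >= 86_400:
        print(f"{name}: {seconds / 86_400:.1f} days")
    elif seconds >= 3_600:
        print(f"{name}: {seconds / 3_600:.1f} hours")
    else:
        print(f"{name}: {seconds / 60:.1f} minutes")
```
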
Data-Intensive System Challenge
- For computation that accesses 1 TB in 5 minutes:
  - Data distributed over 100+ disks, assuming uniform data partitioning
  - Compute using 100+ processors
  - Connected by gigabit Ethernet (or equivalent)
- System requirements
  - Lots of disks
  - Lots of processors
  - Located in close proximity, within reach of a fast local-area network

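A back-of-envelope check of the 1 TB in 5 minutes target. The per-disk sustained rate below is an illustrative assumption, not a figure from the slide; with it, the aggregate bandwidth requirement lands in the neighborhood of the slide's "100+ disks".

```python
# How much aggregate bandwidth does 1 TB in 5 minutes require,
# and how many disks/links does that imply?
data_mb = 1_000_000          # 1 TB, decimal units
window_s = 5 * 60            # 5 minutes

aggregate = data_mb / window_s   # ~3,333 MB/s of aggregate bandwidth
per_disk = 35                    # assumed sustained MB/s per disk (illustrative)
per_link = 125                   # ideal gigabit Ethernet, MB/s

print(f"aggregate bandwidth needed: {aggregate:,.0f} MB/s")
print(f"disks at {per_disk} MB/s each: {aggregate / per_disk:.0f}")    # on the order of 100
print(f"gigabit links at {per_link} MB/s: {aggregate / per_link:.0f}") # ~27 at best
```
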
Desiderata for DISC Systems
- Focus on data: terabytes, not tera-FLOPS
- Problem-centric programming: platform-independent expression of data parallelism
- Interactive access: from simple queries to massive computations
- Robust fault tolerance: component failures handled as routine events
- Contrast to existing supercomputer / HPC systems

System Comparison: Programming Models
- Conventional supercomputers (layers: application programs → software packages → machine-dependent programming model → hardware)
  - Programs described at a very low level: specify detailed control of processing & communications
  - Rely on a small number of software packages, written by specialists
  - Limits the classes of problems & solution methods
- DISC (layers: application programs → machine-independent programming model → runtime system → hardware)
  - Application programs written in terms of high-level operations on data
  - Runtime system controls scheduling, load balancing, …

System Comparison: Reliability
- Runtime errors are commonplace in large-scale systems: hardware failures, transient errors, software bugs
- Conventional supercomputers: “brittle” systems
  - Main recovery mechanism is to recompute from the most recent checkpoint
  - Must bring the system down for diagnosis, repair, or upgrades
- DISC: flexible error detection and recovery
  - Runtime system detects and diagnoses errors
  - Selective use of redundancy and dynamic recomputation
  - Replace or upgrade components while the system is running
  - Requires a flexible programming model & runtime environment

Exploring Parallel Computation Models
- Models such as SETI@home, MapReduce, MPI, threads, and PRAM span a spectrum from low communication / coarse-grained to high communication / fine-grained
- DISC + MapReduce provides coarse-grained parallelism
  - Computation done by independent processes
  - File-based communication
- Observations
  - Relatively “natural” programming model
  - Research issue to explore full potential and limits (Dryad project at MSR, Pig project at Yahoo!)

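To make the “natural” programming model concrete, here is a minimal single-machine word-count sketch of the MapReduce model in Python. The names map_fn, reduce_fn, and run_mapreduce are illustrative; real frameworks run the same structure across many machines with file-based intermediate results.

```python
from collections import defaultdict

def map_fn(_, line):
    # Emit (word, 1) for every word in the input line.
    for word in line.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    # Sum all counts emitted for the same word.
    yield word, sum(counts)

def run_mapreduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)
    for key, value in records:
        for k, v in map_fn(key, value):
            groups[k].append(v)          # "shuffle": group by intermediate key
    results = []
    for k, vs in sorted(groups.items()):
        results.extend(reduce_fn(k, vs))
    return results

lines = ["the quick brown fox", "the lazy dog"]
print(run_mapreduce(enumerate(lines), map_fn, reduce_fn))
```
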
Existing HPC Machines
- Two organizations: message passing (processes P1 … P5, each with its own memory) and shared memory (P1 … P5 over a common memory)
- Characteristics
  - Long-lived processes
  - Make use of spatial locality
  - Hold all program data in memory
  - High-bandwidth communication
- Strengths
  - High utilization of resources
  - Effective for many scientific applications
- Weaknesses
  - Very brittle: relies on everything working correctly and in close synchrony

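A minimal sketch of the message-passing style described above, using mpi4py; the library choice is an assumption, since the slide names no specific API. Each rank is a long-lived process that holds its share of the data in memory and communicates explicitly; if any rank dies, the whole job typically fails.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each process owns a slice of the data for the life of the job.
local_data = range(rank * 1000, (rank + 1) * 1000)
local_sum = sum(local_data)

# Explicit, tightly synchronized communication: combine partial results at rank 0.
total = comm.reduce(local_sum, op=MPI.SUM, root=0)

if rank == 0:
    print(f"global sum over {size} processes: {total}")

# Run with, e.g.: mpirun -n 4 python mpi_sum.py
```
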
HPC Fault Tolerance
- (Timeline figure: processes P1 … P5 with periodic checkpoints, a restore after a failure, and the intervening computation wasted.)
- Checkpoint
  - Periodically store the state of all processes
  - Significant I/O traffic
- Restore
  - When a failure occurs, reset state to that of the last checkpoint
  - All intervening computation is wasted
- Performance scaling
  - Very sensitive to the number of failing components

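A minimal sketch of the checkpoint/restart scheme described above, for a single process with pickle-based snapshots; the file name, interval, and stand-in computation are illustrative assumptions.

```python
import os
import pickle

CHECKPOINT = "state.ckpt"
INTERVAL = 100                          # iterations between checkpoints

def load_checkpoint():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)       # resume from the last saved state
    return {"step": 0, "accum": 0.0}    # fresh start

def save_checkpoint(state):
    with open(CHECKPOINT, "wb") as f:
        pickle.dump(state, f)           # I/O cost paid at every checkpoint

state = load_checkpoint()
while state["step"] < 10_000:
    state["accum"] += state["step"] * 0.5   # stand-in for real computation
    state["step"] += 1
    if state["step"] % INTERVAL == 0:
        save_checkpoint(state)

# A crash between checkpoints wastes up to INTERVAL iterations of work.
```
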
Map/Reduce Operation
- Characteristics
  - Computation broken into many short-lived tasks: mapping, reducing
  - Use disk storage to hold intermediate results
- Strengths
  - Great flexibility in placement, scheduling, and load balancing
  - Handle failures by recomputation
  - Can access large data sets
- Weaknesses
  - Higher overhead
  - Lower raw performance

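A sketch of what the short-lived, stream-based tasks look like, in the style of a Hadoop Streaming word count; mapper and reducer are combined in one script here for brevity, and the grouping between them is done by an external sort, which the framework normally provides.

```python
import sys

def mapper(stream):
    for line in stream:
        for word in line.split():
            print(f"{word.lower()}\t1")       # emit intermediate (key, value)

def reducer(stream):
    current, count = None, 0
    for line in stream:                       # input arrives sorted by key
        word, value = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    {"map": mapper, "reduce": reducer}[sys.argv[1]](sys.stdin)
```

Local usage, simulating the framework's sort step: `python wc.py map < input.txt | sort | python wc.py reduce`
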
Generalizing Map/Reduce (e.g., the Microsoft Dryad project)
- (Figure: inputs x1 … xn flowing through layers of operators Op1, Op2, …, Opk.)
- Computational model
  - Acyclic graph of operators, but expressed as a textual program
  - Each operator takes a collection of objects and produces objects
  - Purely functional model
- Implementation concepts
  - Objects stored in files or memory
  - Any object may be lost; any operator may fail
  - Replicate & recompute for fault tolerance
  - Dynamic scheduling: # operators >> # processors

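A toy sketch of the acyclic-operator idea, not the actual Dryad API: each node is a pure operator over its inputs, so any lost output can simply be recomputed from its upstream nodes. The Node class and the eviction method are illustrative inventions.

```python
class Node:
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs
        self._cached = None        # stands in for a file or in-memory object

    def value(self):
        if self._cached is None:   # lost or never computed: (re)compute
            args = [n.value() for n in self.inputs]
            self._cached = self.op(*args)
        return self._cached

    def evict(self):
        self._cached = None        # simulate losing the stored object

# x1..x3 -> Op1 (square) -> Op2 (sum): a tiny acyclic graph
sources = [Node(lambda v=v: v) for v in (1, 2, 3)]
squares = [Node(lambda a: a * a, s) for s in sources]
total = Node(lambda *a: sum(a), *squares)

print(total.value())                 # 14
squares[1].evict(); total.evict()    # "lose" stored objects after a failure
print(total.value())                 # recomputed from upstream nodes: still 14
```
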
Concluding Thoughts
- Data-intensive computing is becoming commonplace
  - Facilities available from Google/IBM, Yahoo!, …
  - Hadoop becoming the platform of choice
- Lots of applications are fairly straightforward
  - Use Map to do embarrassingly parallel execution
  - Make use of Hadoop's load balancing and reliable file system
- What remains
  - Integrating more demanding forms of computation: computations over large graphs, sparse numerical applications
  - Challenges: programming, implementation efficiency