  1. MapReduce: Simplified Data Processing on Large Clusters Jeffrey Dean and Sanjay Ghemawat Presented by: Chaochao Yan 04/25/2018

  2. MapReduce A programming model and an associated implementation for processing and generating large data sets.

  3. Motivation: Large Scale Data Processing ❏ Want to process lots of data (>1 PB data) ❏ Want to use hundreds or thousands of CPUs ❏ Want to make this easy

  4. MapReduce ❏ Automatic parallelization & distribution ❏ Fault-tolerant ❏ Provides status and monitoring tools ❏ Clean abstraction for programmers

  5. Programming model ❏ Input & Output: each a set of key/value pairs ❏ Similar in spirit to divide and conquer ❏ Programmer specifies two functions: map(in_key, in_value) -> list(out_key, intermediate_value) and reduce(out_key, list(intermediate_value)) -> list(out_value)
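     As an illustration only (these type aliases are not part of MapReduce itself), the contract of the two user-supplied functions can be written as Python type hints:

         from typing import Callable, Iterator, List, Tuple

         # map(in_key, in_value) -> list of (out_key, intermediate_value) pairs
         MapFn = Callable[[str, str], List[Tuple[str, str]]]

         # reduce(out_key, iterator over intermediate_value) -> list of out_value
         ReduceFn = Callable[[str, Iterator[str]], List[str]]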

  6. WordCount Pseudo-code
     map(String input_key, String input_value):
       // input_key: document name, input_value: document contents
       for each word w in input_value:
         EmitIntermediate(w, "1");

     reduce(String output_key, Iterator intermediate_values):
       // output_key: a word, intermediate_values: a list of counts
       int result = 0;
       for each v in intermediate_values:
         result += ParseInt(v);
       Emit(AsString(result));
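     The pseudo-code above translates almost directly into a runnable, single-process Python sketch. The run_mapreduce driver below is a toy stand-in for the framework (it performs the group-by-key step that the real implementation does during the shuffle), and unlike the paper's Emit it returns (word, count) pairs for readability:

         from collections import defaultdict

         def map_fn(input_key, input_value):
             # input_key: document name, input_value: document contents
             for word in input_value.split():
                 yield word, "1"

         def reduce_fn(output_key, intermediate_values):
             # output_key: a word, intermediate_values: a list of counts
             result = sum(int(v) for v in intermediate_values)
             yield output_key, str(result)

         def run_mapreduce(inputs, map_fn, reduce_fn):
             # Toy driver: apply map, group intermediate pairs by key, then apply reduce.
             groups = defaultdict(list)
             for key, value in inputs:
                 for out_key, out_value in map_fn(key, value):
                     groups[out_key].append(out_value)
             output = []
             for out_key in sorted(groups):
                 output.extend(reduce_fn(out_key, groups[out_key]))
             return output

         docs = [("doc1", "see spot run"), ("doc2", "run spot run")]
         print(run_mapreduce(docs, map_fn, reduce_fn))
         # [('run', '3'), ('see', '1'), ('spot', '2')]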

  7. Illustrated WordCount (intermediate values grouped by key, e.g. “see”: [“1”, “1”]) Picture from http://ranger.uta.edu/~sjiang/CSE6350-spring-18/lecture-8.pdf

  8. Distributed and Parallel Computing ❏ map() functions run distributed and in parallel, creating different intermediate values from different input data sets ❏ reduce() functions also run distributed and in parallel, each working on a different output key ❏ All values are processed independently
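     A single-machine sketch of the same point: because each map call and each reduce call is independent, they can be handed to a process pool, which here stands in for a cluster of workers. The function and variable names are invented for illustration.

         from collections import defaultdict
         from concurrent.futures import ProcessPoolExecutor

         def map_split(doc):
             # One map task: emit (word, 1) pairs for its own input split.
             name, contents = doc
             return [(w, 1) for w in contents.split()]

         def reduce_key(item):
             # One reduce task: the values for a single key are processed independently.
             key, values = item
             return key, sum(values)

         if __name__ == "__main__":
             splits = [("doc1", "see spot run"), ("doc2", "run spot run")]
             with ProcessPoolExecutor() as pool:
                 # Map tasks run in parallel over different input splits.
                 groups = defaultdict(list)
                 for pairs in pool.map(map_split, splits):
                     for k, v in pairs:
                         groups[k].append(v)
                 # Reduce tasks run in parallel, one output key per call.
                 counts = dict(pool.map(reduce_key, groups.items()))
             print(counts)  # {'see': 1, 'spot': 2, 'run': 3}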

  9. Implementation Overview ❏ 100s/1000s of 2-CPU x86 machines, 2-4 GB of memory ❏ Commodity networking hardware is used ❏ Storage is on local IDE disks ❏ GFS: distributed file system manages data ❏ Job scheduling system: jobs made up of tasks, scheduler assigns tasks to machines

  10. High-level MapReduce Pipeline Picture from http://mapreduce-tutorial.blogspot.com/2011/04/mapreduce-data-flow.html

  11. High-level MapReduce Pipeline Picture from https://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0008.html

  12. Question 1 Use Figure 1 to explain a MR program’s execution.

  13. Picture 1 from Google MapReduce Paper, OSDI04

  14. Question 2 Describe how MR handles worker and master failures

  15. Fault Tolerance
      ❏ Detect failures via periodic heartbeats
      ❏ Worker failure:
        ❏ Map and reduce tasks in progress are rescheduled
        ❏ Completed map tasks are also rescheduled (data on local disk)
        ❏ Completed reduce tasks do not need to be re-executed (data on GFS)
      ❏ Master failure: abort the computation
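      A minimal sketch of this rescheduling rule, with invented data structures and timeout (the real master's bookkeeping is more involved):

          import time

          HEARTBEAT_TIMEOUT = 10.0  # assumed value: seconds without a heartbeat before a worker is presumed dead

          def handle_worker_failures(workers, tasks, now=None):
              # workers: {worker_id: last_heartbeat_time}
              # tasks:   list of dicts with keys 'kind' ('map' or 'reduce'), 'state', 'worker'
              now = time.time() if now is None else now
              dead = {w for w, last in workers.items() if now - last > HEARTBEAT_TIMEOUT}
              for task in tasks:
                  if task["worker"] not in dead:
                      continue
                  if task["state"] == "in_progress":
                      task["state"], task["worker"] = "idle", None   # always reschedule in-progress work
                  elif task["state"] == "completed" and task["kind"] == "map":
                      task["state"], task["worker"] = "idle", None   # map output lived on the dead worker's local disk
                  # completed reduce tasks stay 'completed': their output is already on GFS
              return dead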

  16. Question 3 Compared with traditional parallel programming models, such as multithreading and MPI, what are the major advantages of MapReduce? Ease of use, scalability, and reliability

  17. Comparison with Traditional Models Picture from http://ranger.uta.edu/~sjiang/CSE6350-spring-18/lecture-8.pdf

  18. Locality ❏ The master divides up map tasks based on the location of the input data: it tries to schedule a map() task on the machine that physically holds the data, or as “near” to it as possible ❏ Map task inputs are divided into 16-64 MB blocks (the Google File System chunk size is 64 MB)
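      A sketch of that preference, with an invented function name (the real scheduler also considers rack-level “nearness”, omitted here):

          def pick_worker_for_split(split_replicas, idle_workers):
              # split_replicas: machines holding a GFS replica of this 16-64 MB input split
              # idle_workers:   machines currently free to take a map task
              for worker in idle_workers:
                  if worker in split_replicas:
                      return worker            # data-local assignment: the input never crosses the network
              return idle_workers[0] if idle_workers else None   # otherwise any idle machine

          # The split lives on B and C; B is idle, so B gets the map task.
          print(pick_worker_for_split({"B", "C"}, ["A", "B", "D"]))   # -> B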

  19. Task Granularity And Pipelining ❏ Fine-granularity tasks: many more map tasks than machines ❏ Minimizes time for fault recovery ❏ Can pipeline shuffling with map execution ❏ Better dynamic load balancing
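      A toy illustration of the load-balancing effect of fine-grained tasks: with many more tasks than workers, whichever worker is free pulls the next task, so a fast worker simply ends up doing more of them. All names and timings below are made up.

          import queue
          import threading
          import time

          def run_tasks(num_tasks=20):
              tasks = queue.Queue()
              for i in range(num_tasks):
                  tasks.put(i)
              completed = {"fast": 0, "slow": 0}

              def worker(name, seconds_per_task):
                  while True:
                      try:
                          tasks.get_nowait()
                      except queue.Empty:
                          return
                      time.sleep(seconds_per_task)   # simulate doing one map task
                      completed[name] += 1

              threads = [threading.Thread(target=worker, args=("fast", 0.01)),
                         threading.Thread(target=worker, args=("slow", 0.05))]
              for t in threads:
                  t.start()
              for t in threads:
                  t.join()
              return completed

          print(run_tasks())   # e.g. {'fast': 16, 'slow': 4}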

  20. Task Granularity And Pipelining Picture from https://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0009.html

  21. Question 4 The implementation of MapReduce enforces a barrier between the Map and Reduce phases, i.e., no reducers can proceed until all mappers have completed their assigned workload. For higher efficiency, is it possible for a reducer to start its execution earlier, and why? (clue: think of availability of inputs to reducers)

  22. Backup Tasks Slow workers significantly delay completion time: ❏ Other jobs consuming resources on the machine ❏ Bad disks with soft errors transfer data slowly ❏ Weird things: processor caches disabled (!!) Solution: near the end of a phase, schedule backup copies of in-progress tasks ❏ Whichever copy finishes first "wins"
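      A single-machine sketch of the idea using Python's concurrent.futures: launch a backup copy of a possibly slow task and take whichever copy finishes first. In the real system the master does this only near the end of a phase and then ignores or kills the losing copy; this toy just lets the slower copy finish in the background.

          import concurrent.futures as cf
          import random
          import time

          def run_task(task_id, attempt):
              # Simulate a task that is sometimes a straggler.
              time.sleep(random.choice([0.05, 2.0]))
              return "task %s finished (%s copy)" % (task_id, attempt)

          def run_with_backup(task_id):
              pool = cf.ThreadPoolExecutor(max_workers=2)
              futures = [pool.submit(run_task, task_id, "primary"),
                         pool.submit(run_task, task_id, "backup")]
              done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
              winner = next(iter(done)).result()     # whichever copy finishes first "wins"
              pool.shutdown(wait=False)              # don't block on the slower copy
              return winner

          print(run_with_backup(7))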

  23. Sort Performance ❏ 10^10 100-byte records (about 1 TB of data, 1800 nodes)

  24. Refinement
      ❏ Sorting guarantees within each reduce partition
      ❏ Combiner: reduce in advance on the map side; useful for saving network bandwidth
      ❏ User-defined counters: useful for debugging
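      A sketch of a word-count combiner: the map task pre-aggregates its own output locally, so one (word, total) pair per distinct word crosses the network instead of one pair per occurrence. The function name is invented for illustration.

          from collections import Counter

          def map_with_combiner(document_contents):
              # Local reduce on this map task's own output before anything is shuffled:
              # emit one (word, count) pair per distinct word instead of one per occurrence.
              local_counts = Counter(document_contents.split())
              return [(word, str(count)) for word, count in local_counts.items()]

          pairs = map_with_combiner("the quick fox and the lazy dog and the cat")
          print(pairs)   # 7 pairs instead of 10, e.g. ('the', '3'), ('and', '2'), ...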
