Big Data Processing with Apache Spark

Jay Urbain, PhD



  1. Big Data Processing with Apache Spark. Jay Urbain, PhD. Credits: Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker, Ion Stoica. http://spark.apache.org/

  2. Motivation

  3. Example: MapReduce

  4. Example: MapReduce

  5. Example: MapReduce

  6. Example: MapReduce

  7. Example: MapReduce

  8. Idea: cache data in-memory http://people.csail.mit.edu/matei/papers/2010/hotcloud_spark.pdf

  9. Example: MapReduce http://people.csail.mit.edu/matei/papers/2010/hotcloud_spark.pdf

  10. Goal: In-Memory Data Sharing

  11. Challenge

  12. Challenge http://web.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf http://piccolo.news.cs.nyu.edu/piccolo.pdf

  13. Solution: Resilient Distributed Datasets (RDDs)

  14. RDD Recovery

  15. Generality of RDDs

  16. Tradeoffs

  17. Tradeoffs

  18. Tradeoffs

  19. http://databricks.com/blog/2014/11/05/spark-officially-sets-a-new-record-in-large-scale-sorting.html

  20. Programming API

  21. Programming Spark • Written in Scala ("scah-lah"), which runs on the JVM • Applications can be written in Scala, Java, Python, and R • Interactive use: Scala, Python, R

  22. http://mesos.apache.org/

  23. Spark References • http://spark.apache.org/docs/latest/programming-guide.html • http://spark.apache.org/docs/latest/api/python/index.html

  24. http://shop.oreilly.com/product/0636920028512.do

  25. http://shop.oreilly.com/product/0636920028512.do

  26. http://shop.oreilly.com/product/0636920028512.do

  27. http://shop.oreilly.com/product/0636920028512.do
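
Slides 13 and 14 name Resilient Distributed Datasets and RDD recovery without showing the mechanism. The following is a minimal sketch in plain Python, not Spark code, of the idea described in the RDD paper credited on slide 1: a dataset remembers the transformations that produced it (its lineage), so a lost partition can be recomputed from stable input rather than restored from a replica. The names `ToyRDD`, `compute`, and `lose_partition` are hypothetical and exist only for this illustration; they are not part of the Spark API.

```python
class ToyRDD:
    """Toy stand-in for an RDD: input partitions plus the lineage to rebuild results."""

    def __init__(self, source_partitions, transforms=()):
        self.source_partitions = source_partitions  # stable input, one list per partition
        self.transforms = tuple(transforms)         # lineage: ordered element-wise functions
        self.cache = {}                             # partition index -> computed result

    def map(self, fn):
        # Transformations are lazy: record the step in the lineage, compute nothing yet.
        return ToyRDD(self.source_partitions, self.transforms + (fn,))

    def compute(self, index):
        # Recompute a partition from its source by replaying the lineage if it is not cached.
        if index not in self.cache:
            data = self.source_partitions[index]
            for fn in self.transforms:
                data = [fn(x) for x in data]
            self.cache[index] = data
        return self.cache[index]

    def collect(self):
        # Gather every partition's result into one list, in partition order.
        return [x for i in range(len(self.source_partitions)) for x in self.compute(i)]

    def lose_partition(self, index):
        # Simulate a node failure: the in-memory copy of one partition disappears.
        self.cache.pop(index, None)


rdd = ToyRDD([[1, 2], [3, 4]]).map(lambda x: x * 10).map(lambda x: x + 1)
first = rdd.collect()   # computes and caches both partitions
rdd.lose_partition(1)   # drop the cached copy of partition 1
second = rdd.collect()  # partition 1 is rebuilt from lineage; partition 0 comes from cache
```

Only the lost partition is recomputed here; the surviving partition is served from the in-memory cache, which is the property that makes lineage-based recovery cheap compared with replicating every intermediate result.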
