
Design Patterns Leveraging Spark in PDI


  1. Design Patterns Leveraging Spark in PDI • Chris Skirde, Pentaho Director of Sales Engineering, Hitachi Vantara • Rakesh Saha, Pentaho Senior Product Manager, Hitachi Vantara

  2. Quiz Time! • What is Spark? A. A good way to start a fire. B. Necessary for a well-running internal combustion engine. C. A fast, general-purpose engine for large-scale data processing. D. All of the above. • True or false: Pentaho supports Spark? • Who is using Spark today (with or without Pentaho)?

  3. Agenda • Introduction to Spark • Common design patterns • How to leverage Spark with Pentaho

  4. Introduction to Spark • Why are we interested? • What is it really? • What’s been done?

  5. Spark Application Architecture • [Diagram; recoverable label: PDI/Server Daemon]

  6. What Do Those Applications Have in Common?

  7. Common Design Patterns • Filter/Organize • Join • Sum • Transform/Enrich • Query • Machine Learning/Data Science

  8. Filter/Organize
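
The slide gives only the pattern name, so the following is a minimal sketch of what filter/organize means, written in plain Python rather than Spark or PDI code. The sample rows, field names, and the `filter_and_organize` helper are all hypothetical: rows that fail a predicate are dropped, and the survivors are sorted.

```python
# Hypothetical sample rows; the field names are assumptions for illustration.
orders = [
    {"id": 1, "region": "EU", "amount": 120.0},
    {"id": 2, "region": "US", "amount": 15.5},
    {"id": 3, "region": "EU", "amount": 42.0},
    {"id": 4, "region": "US", "amount": 300.0},
]


def filter_and_organize(rows, min_amount):
    """Keep rows at or above min_amount, then sort by region and descending amount."""
    kept = [r for r in rows if r["amount"] >= min_amount]
    return sorted(kept, key=lambda r: (r["region"], -r["amount"]))


organized = filter_and_organize(orders, 40.0)
print([r["id"] for r in organized])
```

On a cluster, an engine such as Spark would apply the same filter and sort across partitions of the data instead of a single in-memory list.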

  9. Join
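
As a rough illustration of the join pattern, here is an inner hash join in plain Python (not Spark or PDI code). The `hash_join` helper, the sample rows, and the `cust` key are hypothetical: one input is indexed by the join key, and each row of the other input probes that index.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner join: index the right side by key, then probe it with each left row."""
    index = defaultdict(list)
    for rrow in right:
        index[rrow[key]].append(rrow)
    # Left rows without a match in the index produce no output (inner join).
    return [{**lrow, **rrow} for lrow in left for rrow in index.get(lrow[key], [])]


# Hypothetical sample data.
customers = [{"cust": "a", "name": "Ann"}, {"cust": "b", "name": "Bob"}]
purchases = [
    {"cust": "a", "total": 10},
    {"cust": "a", "total": 5},
    {"cust": "c", "total": 7},
]

joined = hash_join(purchases, customers, "cust")
print(joined)
```

Indexing the smaller input keeps memory use low; that is a design choice of this sketch, not something stated in the deck.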

  10. Sum (and Other Aggregations)
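
A minimal sketch of the sum pattern as a group-by aggregation, in plain Python rather than Spark or PDI code. The `sum_by_key` helper and the sample rows are hypothetical: each row adds its value to a running total for its group key.

```python
from collections import defaultdict


def sum_by_key(rows, key, value):
    """Group rows by `key` and sum the `value` field within each group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Hypothetical sample data.
sales = [
    {"region": "EU", "amount": 120.0},
    {"region": "US", "amount": 15.5},
    {"region": "EU", "amount": 42.0},
    {"region": "US", "amount": 300.0},
]

totals_by_region = sum_by_key(sales, "region", "amount")
print(totals_by_region)
```

Other aggregations (count, min, max) follow the same shape with a different update step per row.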

  11. Transform/Enrich • Any step you like!

  12. Query – Easy! • On Cloudera, use Hive-on-Spark with Hive2 • On Hortonworks, use SparkSQL via Simba

  13. Machine Learning/Data Science

  14. Recap What we covered today: • Reviewed what Spark is and why organizations are adopting it • Discussed several common data integration design patterns • Linked those design patterns to Pentaho features for you to try

  15. Questions?

  16. Next Steps Want to learn more? • “Meet the Experts” Matt Casters and Mark Hall! • Adaptive Execution Layer http://www.pentaho.com/blog/introducing-adaptive-execution-layer-spark-architecture • SQL on Spark http://www.pentaho.com/blog/operationalize-spark-big-data-newest-enhancements
