Stratosphere Demo: Triangle Enumeration & TPC-H Query 3

Thomas Bodner, Matthias Ringwald (TU Berlin). Demo of triangle enumeration and TPC-H Query 3 on Stratosphere.

  1. Stratosphere Demo: Triangle Enumeration & TPC-H Query 3. Thomas Bodner, Matthias Ringwald, TU Berlin. StratoSphere Above the Clouds

  2. Triangle Enumeration
     Input
     ■ Set of undirected edges
     ■ Friend-of-a-Friend RDF data from Billion Triple Challenge 2011
     Goal
     ■ Find triples of edges that form a triangle
     ■ Used as preprocessing to find highly connected subgraphs
     Stratosphere: Information Management above the Clouds

  3. Triangle Enumeration – Step 1 & 2

  4. Triangle Enumeration – Step 3 & 4

  5. Triangle Enumeration on PACTs

  6. StratoSphere Above the Clouds CODE + DEMO

  7. TPC-H Query 3
     ■ OLAP-style query
     ■ 2 Tables: Orders, Lineitem
     ■ Join and Aggregation
     SELECT l_orderkey, o_shippriority, sum(l_extendedprice) AS revenue
     FROM orders, lineitem
     WHERE l_orderkey = o_orderkey
       AND o_custkey IN [X]
       AND o_orderdate > [Y]
     GROUP BY l_orderkey, o_shippriority

  8. StratoSphere Above the Clouds CODE + DEMO

  9. Additional Information
     ■ www.stratosphere.eu provides the open-source release and additional examples:
       □ WordCount
       □ One Iteration of K-Means
       □ Pair-wise shortest path computation in graphs
       □ Weblog file analysis
     ■ "MapReduce and PACT - Comparing Data Parallel Programming Models", A. Alexandrov et al., BTW 2011
       □ Compares MapReduce and PACT implementations of additional examples

  10. StratoSphere Above the Clouds Demo Screenshots: ENUMERATING TRIANGLES FOR SOCIAL NETWORK MINING

  11. Enumerating Triangles – Job Preview

  12. Enumerating Triangles – Optimized Plan

  13. Enumerating Triangles – Nephele Schedule in Execution

  14. Enumerating Triangles – Result
      Edge 1 ID | Edge 2 ID | Edge 3 ID
      1669672241|957516469|13113271|
      1174119379|957443913|195638598|
      1805945648|956415427|448134175|
      1950197714|956415427|448134175|
      2016831532|956415427|448134175|
      1669297417|956305207|315643403|
      1039976411|956305207|315643403|
      1467833050|953504954|878633592|
      1672783901|950586510|524583308|
      1840098659|949391994|562197935|
      1146307869|947061533|121415420|
      1564227243|945488147|536289824|
      1892548695|945488147|536289824|

  15. StratoSphere Above the Clouds Demo Screenshots: TPCH QUERY 3 (SIMPLIFIED)

  16. TPCH3 – Plan Preview

  17. TPCH3 – Optimized Plan

  18. TPCH3 – Nephele Schedule in Execution

  19. TPCH Query – Result
      Order Nr. | Priority (0 = Normal) | Total Sales Volume
      2948|0|5896|
      6691|0|1887|
      6691|1|26837|
      9507|0|61605|
      9665|0|28995|
      12641|0|75846|
      17282|0|34564|
      24964|0|124820|
      27330|0|191310|
      29858|0|22865|
      30533|0|183198|
      30561|0|91683|
      41255|0|206275|
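
The simplified TPC-H Query 3 on slide 7 is a filter on orders, an equi-join with lineitem, and a grouped sum. As a rough illustration of that dataflow, here is a single-machine Python sketch. It is not the PACT program shown in the demo: the function name `simplified_q3`, the dictionary-based row layout, and the parameters `custkeys` and `min_orderdate` (standing in for the `[X]` and `[Y]` placeholders) are assumptions made for this example.

```python
from collections import defaultdict
from datetime import date


def simplified_q3(orders, lineitems, custkeys, min_orderdate):
    """Return sorted (l_orderkey, o_shippriority, revenue) tuples.

    Rows are plain dicts keyed by TPC-H column names (an assumption of
    this sketch, not the demo's actual record format).
    """
    # Filter step: o_custkey IN [X] AND o_orderdate > [Y].
    # Build side of a hash join: order key -> ship priorities of matching orders.
    index = defaultdict(list)
    for o in orders:
        if o["o_custkey"] in custkeys and o["o_orderdate"] > min_orderdate:
            index[o["o_orderkey"]].append(o["o_shippriority"])

    # Probe side plus aggregation: sum l_extendedprice per
    # (l_orderkey, o_shippriority) group.
    revenue = defaultdict(int)
    for li in lineitems:
        for priority in index.get(li["l_orderkey"], ()):
            revenue[(li["l_orderkey"], priority)] += li["l_extendedprice"]

    return sorted((key, prio, total) for (key, prio), total in revenue.items())


# Hypothetical sample data, invented for illustration only.
sample_orders = [
    {"o_orderkey": 1, "o_custkey": 10, "o_orderdate": date(1995, 3, 20), "o_shippriority": 0},
    {"o_orderkey": 2, "o_custkey": 11, "o_orderdate": date(1995, 1, 1), "o_shippriority": 0},
    {"o_orderkey": 3, "o_custkey": 99, "o_orderdate": date(1996, 5, 5), "o_shippriority": 1},
]
sample_lineitems = [
    {"l_orderkey": 1, "l_extendedprice": 100},
    {"l_orderkey": 1, "l_extendedprice": 50},
    {"l_orderkey": 2, "l_extendedprice": 70},
    {"l_orderkey": 3, "l_extendedprice": 30},
]

rows = simplified_q3(sample_orders, sample_lineitems, {10, 11}, date(1995, 3, 15))
for orderkey, priority, total in rows:
    # Same pipe-delimited layout as the result slide.
    print(f"{orderkey}|{priority}|{total}|")
```

The sketch filters orders before joining, so the hash table holds only qualifying orders. Whether the demo's optimized plan makes the same choice is not stated in the slides.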

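The triangle enumeration goal on slide 2 (find triples of edges that form a triangle in a set of undirected edges) can also be sketched on a single machine. The slides do not spell out what steps 1 to 4 do, so the approach below is a generic one chosen for illustration, not the demo's PACT plan: store each edge once in a canonical orientation, pair up the neighbours of each vertex into open triads, and keep the triads whose closing edge exists. The function name `enumerate_triangles` and the sample edges are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def enumerate_triangles(edges):
    """Return the set of triangles as vertex triples (a, b, c) with a < b < c."""
    # Canonical orientation: store every undirected edge as (smaller, larger),
    # which also removes duplicates such as (1, 2) and (2, 1). Self-loops are dropped.
    canonical = {(min(u, v), max(u, v)) for u, v in edges if u != v}

    # Group edges by their smaller endpoint.
    higher_neighbours = defaultdict(set)
    for u, v in canonical:
        higher_neighbours[u].add(v)

    triangles = set()
    for a, neighbours in higher_neighbours.items():
        # Each pair of neighbours (b, c) of a is an open triad a-b, a-c.
        for b, c in combinations(sorted(neighbours), 2):
            # The triad is a triangle only if the closing edge b-c exists.
            if (b, c) in canonical:
                triangles.add((a, b, c))
    return triangles


# Hypothetical sample graph: two triangles sharing the edge 1-3.
sample_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 1), (2, 1)]
found = enumerate_triangles(sample_edges)
for a, b, c in sorted(found):
    print(f"{a}|{b}|{c}|")
```

Because every triangle is generated only from its smallest vertex, each one is reported exactly once. A parallel version would typically express the grouping and the closing-edge lookup as separate grouping and join operators.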