Effizienz-Optimierung daten-intensiver Data Mashups am Beispiel von Map-Reduce
Pascal Hirmer
BTW 2017 BigDS Workshop

Towards optimizing the efficiency of data-intensive data mashups based on the example of Map-Reduce
Pascal Hirmer
BTW 2017 BigDS Workshop

Motivation: Big Data
• Big Data: the volume and complexity of data are increasing rapidly
• New paradigms: Internet of Things, Industrie 4.0, Data Lakes, …
• It is important to gain knowledge through data processing and analysis (knowledge discovery)
• But: gaining knowledge is difficult because of the (at least) three Vs of Big Data:
  • Volume
  • Variety
  • Velocity

Data Mashups - Definition
• Goal: flow-based processing, analytics, and integration of data
• Modeling of data operations based on Pipes and Filters
  [Figure: example data flow built from the operations extract, filter, join, and analyze]
• Famous example: Yahoo! Pipes
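The Pipes and Filters idea from this slide can be illustrated with a minimal Python sketch. It is not FlexMash code: the function names (`extract`, `keep_if`, `join_on`, `analyze`, `pipe`) and the sample records are assumptions made up for illustration, and each filter is simply a function that consumes and produces a stream of records.

```python
from typing import Callable, Iterable, Iterator

Record = dict  # hypothetical record type: one flat dictionary per data item


def extract(rows: Iterable[Record]) -> Iterator[Record]:
    """Source: yields records from a (hypothetical) in-memory data source."""
    yield from rows


def keep_if(predicate: Callable[[Record], bool]):
    """Filter: passes on only the records that satisfy the predicate."""
    def _filter(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if predicate(r))
    return _filter


def join_on(key: str, other: Iterable[Record]):
    """Join: merges each record with the matching record of a second source."""
    lookup = {r[key]: r for r in other}

    def _join(records: Iterable[Record]) -> Iterator[Record]:
        for r in records:
            match = lookup.get(r[key])
            if match is not None:
                yield {**r, **match}
    return _join


def analyze(records: Iterable[Record]) -> float:
    """Sink: computes the mean of the 'value' field."""
    values = [r["value"] for r in records]
    return sum(values) / len(values) if values else 0.0


def pipe(source: Iterable[Record], *filters):
    """Connects the filters so that each one reads the previous one's output."""
    stream = source
    for f in filters:
        stream = f(stream)
    return stream


# Made-up sample data for the two "extract" nodes of the data flow.
sensors = [{"id": 1, "value": 10.0}, {"id": 2, "value": 30.0}, {"id": 3, "value": -5.0}]
locations = [{"id": 1, "site": "A"}, {"id": 2, "site": "B"}]

joined = pipe(
    extract(sensors),
    keep_if(lambda r: r["value"] >= 0),
    join_on("id", extract(locations)),
)
result = analyze(joined)
```

Because every stage only sees a stream of records, stages can be rearranged or replaced without touching the others, which is the property the graphical modeling relies on.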

Motivation: Data Processing Tools
• Data mashup tools, ETL tools, and data analytics tools (e.g., KNIME) offer means to process and analyze data
• Focus on approaches that support abstract modeling based on the Pipes and Filters pattern
  • Nodes: data operations (e.g., extraction, transformation, analysis)
  • Edges: data flow
  • Nodes are associated with services that process the data (orchestrated by workflows)
• These tools offer an explorative means to process data
• The focus lies on the open-source data mashup tool FlexMash, developed at the University of Stuttgart
• The concepts are also applicable to other data processing approaches

Motivation
• Overall goal of this work: increasing the efficiency of service-based data processing
• State of the art: data processing "in-service" (in memory) → scalability / memory issues
  [Figure: data flow across the services S1 to S5]
• Approach in a nutshell:
  • Move data processing to computing clusters and process data in parallel
  • Integrate modern data processing techniques and technologies (Map-Reduce, Apache Spark, …)
  • Cope with the generated overhead (where is the cost-value limit?)
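To make the Map-Reduce idea concrete, the following is a minimal local sketch in Python. It is an illustration under assumptions, not the approach's implementation: a thread pool stands in for a computing cluster, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the sample partitions are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(partition):
    """Map: emits one (key, value) pair per record of a single partition."""
    return [(record["site"], record["value"]) for record in partition]


def shuffle(mapped_partitions):
    """Shuffle: groups all emitted values by key across partitions."""
    groups = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: aggregates the values of each key (here: a sum)."""
    return {key: sum(values) for key, values in groups.items()}


def map_reduce(partitions, workers=2):
    # The thread pool is only a local stand-in for cluster nodes (assumption);
    # on a real cluster each partition would be mapped on a separate machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, partitions))
    return reduce_phase(shuffle(mapped))


# Made-up input, already split into two partitions.
partitions = [
    [{"site": "A", "value": 1}, {"site": "B", "value": 2}],
    [{"site": "A", "value": 3}],
]
totals = map_reduce(partitions)
```

Partitioning, scheduling, and shuffling are work that in-service processing does not have; for small inputs this overhead can outweigh the gain from parallelism, which is the cost-value question raised on the slide.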

FlexMash
[Figure: FlexMash overview. Recoverable labels: Domain-specific Modeling (Modeling Tool, Mashup Modeler), Mashup Plan, Pattern Selection (Pattern Selection & Combination; patterns: Robust, Secure, Time-Critical, Robust & Secure, …), Pattern-based Transformation and Execution (Mashup Execution Environments, cloud-based execution), Result, Visualization]

FlexMash – Graphical User Interface
Download FlexMash on GitHub: https://github.com/hirmerpl/FlexMash

Main contribution (I)
[Figure: the non-executable Mashup Plan (extract, filter, join, analyze) is transformed into an executable representation of the data flow model. Two execution options are shown: in-service data processing in the service runtime, and parallel data processing based on computing clusters]

Main contribution – decision: in-service vs. distributed/parallel
[Figure: the transformation turns the non-executable Mashup Plan into an executable model. Inputs shown: Requirements (e.g., costs) and a Service Repository holding Services with their Policies/Capabilities]
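The slide does not state how the decision is made, so the following Python sketch is only one conceivable rule under loudly stated assumptions: the classes `ServiceCapability` and `Requirements`, the function `choose_execution`, and the size and cost figures are all hypothetical and are not part of FlexMash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceCapability:
    """Hypothetical capability entry of a service in the service repository."""
    name: str
    max_in_memory_mb: int   # assumed limit for in-service processing
    supports_cluster: bool  # assumed flag: a distributed variant exists


@dataclass(frozen=True)
class Requirements:
    """Hypothetical user requirements attached to a Mashup Plan."""
    max_cost: float  # cost budget in arbitrary units


def choose_execution(data_mb: float, capability: ServiceCapability,
                     requirements: Requirements, cluster_cost: float) -> str:
    """Returns 'in-service' or 'distributed' for one data operation.

    Assumed rule: stay in-service while the data fits into the service's
    memory; otherwise go distributed if the service supports it and the
    cluster cost stays within the budget.
    """
    if data_mb <= capability.max_in_memory_mb:
        return "in-service"
    if capability.supports_cluster and cluster_cost <= requirements.max_cost:
        return "distributed"
    raise ValueError(
        f"no execution option for {capability.name}: data does not fit in "
        "memory and distributed execution is unavailable or too expensive"
    )


# Made-up example values.
filter_service = ServiceCapability("filter", max_in_memory_mb=512, supports_cluster=True)
budget = Requirements(max_cost=10.0)

small = choose_execution(100, filter_service, budget, cluster_cost=5.0)
large = choose_execution(10_000, filter_service, budget, cluster_cost=5.0)
```

A fixed memory threshold is a deliberate simplification; a realistic rule would have to rest on measurements of the overhead introduced by distributed execution.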

Conclusion and future work
• First approach to increase the efficiency of service-based data processing tools
• Parallelization enables large efficiency gains
• Finding the cost-value limit is difficult
• Future/ongoing work:
  • Conducting measurements for comparison and for finding the cost-value limit
  • Concretizing the concepts
  • Generation of Map-Reduce jobs

Questions & Discussion?

Thank you!
Pascal Hirmer
E-mail: Pascal.Hirmer@ipvs.uni-stuttgart.de
Phone: +49 (0) 711 685-88297
Fax: +49 (0) 711 685-78217
Universität Stuttgart, Universitätsstraße 38, 70569 Stuttgart, Germany
