Dependency Driven Analytics: a Compass for Uncharted Data Oceans/Jungles
Ruslan Mavlyutov, Carlo Curino, Boris Asipov, Phil Cudre-Mauroux
The production job "JobA" failed… impact? debug? re-run?
1) look in the logs (PBs of logs produced daily)
2) ask local experts (they know “how” to look)
But don’t bother them too much…
The Problem
Focused analyses of massive, loosely structured, evolving data have prohibitive cognitive and computational costs:
• the cost of understanding raw data
• the cost of processing raw data
A better vantage point?
Dependency Driven Analytics (DDA)
DDA today:
• Derive a dependency graph (DG) from raw data
• The DG serves as: a Conceptual Map, and a Sparse Index for the raw data
DDA vision:
• Automation
• Language-integration
• Real-time
• …
DDA: infrastructure logs "incarnation"
Query Interface over the raw data (logs), e.g., "JobA's impact?"
The DG stores: provenance + telemetry
• NODES: jobs / files / machines / tasks / …
• EDGES: job-reads-file, task-runs-on-machine, …
• PROPERTIES: timestamps / resource usage / …
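The property-graph model above (labeled nodes, typed edges, telemetry as properties) can be sketched in a few lines of plain Python; all identifiers and property names here are illustrative, not the authors' actual schema:

```python
# Minimal sketch of the DG's property-graph model: nodes carry a label
# plus telemetry properties, edges carry a type. Hypothetical names only.
from collections import defaultdict

class DependencyGraph:
    def __init__(self):
        self.nodes = {}                     # node_id -> {"label": ..., **props}
        self.out_edges = defaultdict(list)  # node_id -> [(edge_type, dst_id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, src, edge_type, dst):
        self.out_edges[src].append((edge_type, dst))

dg = DependencyGraph()
dg.add_node("JobA", "job", procHours=3.5)   # a job with telemetry
dg.add_node("f1", "file", sizeGB=12)
dg.add_node("t42", "task")
dg.add_node("m7", "machine")
dg.add_edge("JobA", "reads", "f1")          # job-reads-file
dg.add_edge("t42", "runs-on", "m7")         # task-runs-on-machine
```

In the actual system this structure lives in a graph store rather than in memory, but the node/edge/property shape is the same.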
Current implementation (pipeline):
raw data (stored in Cosmos / the Big Data system) → Scope extraction, driven by a schema + extraction rules → dependency graph (stored in Neo4J) → DG definition and querying
Extract "jobs processing hours"

extStart = EXTRACT *
           FROM "ProcStarted_%Y%m%d.log"
           USING EventExtractor("ProcStarted");

startData = SELECT ProcessGuid AS ProcessId,
                   CurrentTimeStamp.Value AS StartTime,
                   JobGuid AS JobId
            FROM extStart
            WHERE ProcessGuid != null
              AND JobGuid != null
              AND CurrentTimeStamp.HasValue;
…
procH = SELECT endData.JobId,
               SUM((End - Start).TotalMs)/1000/3600 AS procHours
        FROM startData
        INNER JOIN endData
          ON startData.ProcessId == endData.ProcessId
         AND startData.JobId == endData.JobId
        GROUP BY JobId;

OUTPUT (SELECT JobId, procHours FROM procH) TO "processingHours.csv";
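The SCOPE script above joins start and end events per process and sums durations per job. The same computation can be sketched in Python on toy in-memory events; the field names mirror the script, but the record layout is a hypothetical stand-in for the real log format:

```python
# Hedged Python analogue of the SCOPE "processing hours" extraction:
# inner-join start/end events on (ProcessId, JobId), then aggregate
# per-job durations (milliseconds -> hours). Toy data, hypothetical names.
def processing_hours(start_events, end_events):
    starts = {(e["ProcessId"], e["JobId"]): e["StartTime"] for e in start_events}
    hours = {}
    for e in end_events:
        key = (e["ProcessId"], e["JobId"])
        if key in starts:                       # inner join
            ms = e["EndTime"] - starts[key]     # duration in milliseconds
            hours[e["JobId"]] = hours.get(e["JobId"], 0.0) + ms / 1000 / 3600
    return hours

starts = [{"ProcessId": "p1", "JobId": "JobA", "StartTime": 0}]
ends   = [{"ProcessId": "p1", "JobId": "JobA", "EndTime": 7_200_000}]
print(processing_hours(starts, ends))   # {'JobA': 2.0}
```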
Example: "Measure JobA's impact"

graph.traversal().V()
  .has("JobTemplateName", "JobA_*")
  .local(
    emit().repeat(out()).times(100)
    …
    .hasLabel("job").dedup()
    .values("procHours").sum()
  ).mean()
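The Gremlin traversal above walks the DG downstream from JobA, keeps the reachable job vertices, and sums their procHours. A hedged Python analogue of that traversal on a toy adjacency-list graph (all names hypothetical, no graph store required):

```python
# Toy analogue of the impact query: breadth-first walk downstream from a
# start node, keep reachable "job" nodes, sum their procHours telemetry.
def impact(adj, labels, proc_hours, start):
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    jobs = {n for n in seen if labels.get(n) == "job"}
    return sum(proc_hours.get(j, 0.0) for j in jobs)

# JobA writes out1, which JobB reads; JobB writes out2, which JobC reads.
adj = {"JobA": ["out1"], "out1": ["JobB"], "JobB": ["out2"], "out2": ["JobC"]}
labels = {"JobA": "job", "JobB": "job", "JobC": "job",
          "out1": "file", "out2": "file"}
hours = {"JobB": 5.0, "JobC": 2.5}
print(impact(adj, labels, hours, "JobA"))   # 7.5
```

This is the "sparse index" idea in miniature: the query touches only the small DG, never the PBs of raw logs.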
DDA: Initial Experiments
Improvements of up to:
• 7x fewer LoC*
• 700x less run-time
• > 50,000x less CPU-time
• > 800x less I/O
* heavily under-represents the hardness of the baseline
Not all queries are as easy…
• Simple search/browsing → UI (keyword search)
• Local or aggregate queries on telemetry / provenance (i.e., covering index) → graph queries on the DG (Neo4J)
• Complex/ad-hoc queries (e.g., debugging), mixing DG and raw data → Scope/Cosmos + Neo4J querying (clumsy today)
DDA: open challenges
• Automatically "map" the raw data
• Real-time log ingestion at scale
• Scale-out graph management
• Leverage specialized graph structures
• Integrated language for graph + relational + unstructured data
Scope
• Internet of Things
• Infrastructure logs
• Enterprise Search
• …
Conclusions
Problem:
• Focused analyses of massive, loosely structured, evolving data have prohibitive costs
DDA solution:
• Extract a Dependency Graph (DG): conceptual map + sparse index
• Current impl. leverages existing BigData/Graph tech
Open challenges:
• automation / real-time / scalable graph tech / integrated language