Modern Graph Analytic Support in GSQL, TigerGraph's GQL
Alin Deutsch
TigerGraph Chief Scientist
Professor, UC San Diego

The Age of the Graph Is Upon Us (Again)
• Early-mid-90s: semi- or un-structured data research was all the rage
  – data logically viewed as graph
  – initially motivated by modeling WWW (page=vertex, link=edge)
  – query languages expressing constrained reachability in graph
• Late 90s-late 2000s: special case XML (graph restricted to tree shape)
  – Mature: W3C standard ecosystem for modeling and querying (XQuery, XPath, XLink, XSLT, XML Schema, …)
• Since mid 2000s: JSON and friends (also restricted to tree shape)
  – MongoDB, Couchbase, SparkSQL, GraphQL, AsterixDB, …
• Present: back to unrestricted graphs
  – Initially motivated by analytic tasks in social networks
  – Now universal use (most interesting data is linked, after all)

The Traditional Graph Data Model
• Nodes correspond to entities
• Edges correspond to binary relationships
• Edges may be directed or undirected (asymmetric, resp. symmetric relationships)
• Nodes and edges may be labeled/typed
• Nodes and edges annotated with data
  – both have sets of attributes (key-value pairs)

Example: Customers Buy Products
[Figure: customer vertices connected to product vertices by bought edges. Attribute labels shown in the figure: price, quantity, name, discount.]
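The data model above can be sketched in Python as follows. This is only an illustration: the class names, the sample values, and the placement of `name` are assumptions. Putting `price` on the product and `quantity`/`discount` on the bought edge follows the revenue example later in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vid: str
    vtype: str                                   # label/type, e.g. "Customer"
    attrs: dict = field(default_factory=dict)    # key-value attributes


@dataclass
class Edge:
    src: str
    dst: str
    etype: str                                   # label/type, e.g. "Bought"
    directed: bool = True
    attrs: dict = field(default_factory=dict)    # edges carry attributes too


# Hypothetical sample data: one customer who bought one product.
vertices = {
    "c1": Vertex("c1", "Customer", {"name": "Ann"}),
    "p1": Vertex("p1", "Product", {"name": "kite", "price": 20.0}),
}
edges = [Edge("c1", "p1", "Bought", attrs={"quantity": 2, "discount": 0.1})]

# All (customer, product) pairs connected by a Bought edge.
bought = [(e.src, e.dst) for e in edges if e.etype == "Bought"]
print(bought)
```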

Key Traditional Language Ingredients
• Pioneered by academic work on relational query extensions for graphs (since '87)
  – Path expressions (PEs) for navigation
  – Variables for referring to and manipulating data found during navigation
  – Stitching multiple PEs into complex navigation patterns → conjunctive path queries
  – Constructors for new nodes and edges

Path Expressions
• Express reachability via constrained paths
• Early graph-specific extension over conjunctive queries
• Introduced initially in academic prototypes in early 90s
  – StruQL (AT&T Research - Fernandez, Halevy, Suciu)
  – WebSQL (U Toronto - Mendelzon, Mihaila, Milo)
  – Lorel (Stanford - Widom et al.)
• Supported by modern languages
  – SPARQL, Cypher, Gremlin, GSQL

Path Expression Examples (1)
• Pairs of customer and product they bought:
  -Bought->
• Pairs of customer and product they were involved with (bought or reviewed):
  -Bought|Reviewed->
• Pairs of customers who bought same product (lists customers with themselves):
  -Bought->.<-Bought-

Path Expression Examples (2)
• Pairs of customers involved with same product (like-minded):
  -Bought|Reviewed->.<-Bought|Reviewed-
• Pairs of customers connected via a chain of like-minded customer pairs:
  (-Bought|Reviewed->.<-Bought|Reviewed-)*
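One way to read the path expressions above is as operations on binary relations: alternation is set union, concatenation is relational composition, a reversed edge is the inverse relation, and the star is a reflexive-transitive closure. The Python sketch below illustrates that reading on made-up data. The helper names (`inverse`, `compose`, `star`) are hypothetical, and this is not how any particular engine evaluates paths.

```python
# Edge relations as sets of (source, target) pairs; all data is made up.
bought = {("ann", "kite"), ("bob", "kite"), ("bob", "lego")}
reviewed = {("cat", "lego"), ("dan", "doll")}


def inverse(r):
    """Reverse every edge: <-R- is the inverse of -R->."""
    return {(t, s) for (s, t) in r}


def compose(r1, r2):
    """Concatenation r1.r2: follow an r1 edge, then an r2 edge."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}


def star(r, nodes):
    """Kleene star: zero or more repetitions of r over the given nodes."""
    closure = {(n, n) for n in nodes} | set(r)
    while True:
        step = closure | compose(closure, closure)
        if step == closure:
            return closure
        closure = step


involved = bought | reviewed                        # -Bought|Reviewed->
same_product = compose(bought, inverse(bought))     # -Bought->.<-Bought-
like_minded = compose(involved, inverse(involved))  # like-minded pairs
customers = {s for (s, _) in involved}
chained = star(like_minded, customers)              # (...)*

print(sorted(same_product))
print(("ann", "cat") in like_minded, ("ann", "cat") in chained)
```

In this data, ann and cat share no product, yet they are chained through bob, who is like-minded with both.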

Conjunctive Regular Path Queries
• Path expressions as atomic building blocks
• Explicitly introduce variables binding to source and target nodes of path expressions
• Variables can be used to stitch multiple path expression atoms into complex patterns

CRPQ Examples
• Pairs of customers who have bought same product (do not list a customer with herself):
  Q1(c1,c2) :- c1 -Bought->.<-Bought- c2, c1 != c2
• Customers who have bought a product and also reviewed it:
  Q2(c) :- c -Bought-> p, c -Reviewed-> p
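The two CRPQs above can be mimicked with Python set comprehensions, where a shared variable becomes an equality test between atoms. This is only a sketch of the query semantics over invented edge sets, not of an evaluation strategy.

```python
# Made-up edge relations as sets of (customer, product) pairs.
bought = {("ann", "kite"), ("bob", "kite"), ("bob", "lego")}
reviewed = {("bob", "lego"), ("cat", "kite")}

# Q1(c1,c2) :- c1 -Bought->.<-Bought- c2, c1 != c2
# The intermediate product must coincide; the inequality drops self-pairs.
q1 = {
    (c1, c2)
    for (c1, p1) in bought
    for (c2, p2) in bought
    if p1 == p2 and c1 != c2
}

# Q2(c) :- c -Bought-> p, c -Reviewed-> p
# Both atoms share the variables c and p, so the same pair must be in both.
q2 = {c for (c, p) in bought if (c, p) in reviewed}

print(sorted(q1))
print(sorted(q2))
```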

Key Language Ingredients Needed in Modern Applications
– All primitives inherited from past
  • path expressions + variables + conjunctive patterns + node/edge construction
– Support for large-scale graph analytics
  • Aggregation of data encountered during navigation → requires bag semantics for pattern matches
  • Control flow support for class of iterative algorithms that converge in multiple steps
    – (e.g. PageRank-class, recommender systems, shortest paths, etc.)

Aggregation

Aggregation in Modern Graph QLs
• PGQL, Gremlin and SPARQL use an SQL-style GROUP BY clause
• Cypher's RETURN clause uses syntax similar to that of aggregation-extended CQs
• GSQL uses aggregating containers called "accumulators"
  – (soon to add above solutions as syntactic sugar, but accumulators remain strictly more versatile)

GSQL Accumulators
• GSQL traversals collect and aggregate data by writing it into accumulators
• Accumulators are containers (data types) that
  – hold a data value
  – accept inputs
  – aggregate inputs into the data value using a binary operator
• May be built-in (sum, max, min, etc.) or user-defined
• May be
  – global (a single container)
  – vertex-attached (one container per vertex)
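The accumulator idea described above can be modeled in a few lines of Python: a container holds a value and folds each input into it with a binary operator. The class and helper names below are hypothetical and are not GSQL's API. The sketch only mirrors the global versus vertex-attached distinction.

```python
import operator
from collections import defaultdict


class Accumulator:
    """Holds a value and aggregates inputs into it with a binary operator."""

    def __init__(self, initial, combine):
        self.value = initial
        self._combine = combine

    def __iadd__(self, item):
        # "acc += x" folds x into the stored value, like GSQL's += on accumulators.
        self.value = self._combine(self.value, item)
        return self


def sum_accum():
    return Accumulator(0, operator.add)


def max_accum():
    return Accumulator(float("-inf"), max)


total = sum_accum()                    # global: a single container
biggest = max_accum()                  # global, aggregating with max
per_vertex = defaultdict(sum_accum)    # vertex-attached: one container per vertex

for vertex, x in [("a", 3), ("b", 5), ("a", 4)]:
    total += x
    biggest += x
    per_vertex[vertex] += x

print(total.value, biggest.value, per_vertex["a"].value, per_vertex["b"].value)
```

A user-defined accumulator in this model is simply another `(initial, combine)` pair passed to the constructor.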

Vertex-Attached Accumulator Example: Revenue per Customer and per Product
[Figure: customer vertices carry an @cSales accumulator and product vertices carry an @pSales accumulator. Each bought edge (labels shown: price, quantity, discount) yields a thisSaleRevenue value that is added (+) to both @cSales and @pSales.]

Vertex-Attached Accumulator Example: Revenue per Customer and per Product

  SumAccum<float> @cSales, @pSales;    // accumulator declaration

  SELECT c
  FROM Customer:c -(Bought:b)-> Product:p
  ACCUM float thisSaleRevenue = b.quantity*(1-b.discount)*p.price,
        c.@cSales += thisSaleRevenue,
        p.@pSales += thisSaleRevenue;

• Same sale revenue contributes to two aggregations, each by distinct grouping criteria
• Groups are distributed: each node accumulates its own group
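To make the single-pass, two-grouping behavior concrete, here is a small Python simulation of the query above. The sample prices and purchases are invented, and the two dictionaries stand in for the vertex-attached accumulators @cSales and @pSales.

```python
from collections import defaultdict

# Made-up data: product prices and Bought edges
# given as (customer, product, quantity, discount).
price = {"kite": 20.0, "lego": 50.0}
bought = [
    ("ann", "kite", 2, 0.0),
    ("bob", "kite", 1, 0.5),
    ("bob", "lego", 1, 0.0),
]

c_sales = defaultdict(float)   # plays the role of @cSales (one per customer)
p_sales = defaultdict(float)   # plays the role of @pSales (one per product)

# One pass over the pattern matches, like the ACCUM clause.
for customer, product, quantity, discount in bought:
    this_sale_revenue = quantity * (1 - discount) * price[product]
    c_sales[customer] += this_sale_revenue   # grouped by customer
    p_sales[product] += this_sale_revenue    # grouped by product

print(dict(c_sales))
print(dict(p_sales))
```

Each match contributes once to each grouping, so no second pass or second GROUP BY query is needed.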

Recommended Toys Ranked by Log-Cosine Similarity

  SumAccum<float> @rank, @lc;
  SumAccum<int> @inCommon;

  Me = {Customer.1};

  SELECT p INTO ToysILike, o INTO OthersWhoLikeThem
  FROM Me:c -(Likes)-> Product:p <-(Likes)- Customer:o
  WHERE p.category == "Toys" and o != c
  ACCUM o.@inCommon += 1
  POST-ACCUM o.@lc = log(1 + o.@inCommon);

  ToysTheyLike =
    SELECT t
    FROM OthersWhoLikeThem:o -(Likes)-> Product:t
    WHERE t.category == "Toys"
    ACCUM t.@rank += o.@lc;

  RecommendedToys = ToysTheyLike - ToysILike;
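The two-step recommendation query above can be traced in Python. This is a sketch on an invented likes graph. The function name `recommend` and all sample data are assumptions, and the dictionaries play the roles of the @inCommon, @lc and @rank accumulators.

```python
import math
from collections import defaultdict

# Made-up data: who likes which product, and each product's category.
likes = {
    "me": {"kite", "lego"},
    "bob": {"kite", "lego", "doll"},
    "cat": {"kite", "robot"},
    "dan": {"book"},
}
category = {"kite": "Toys", "lego": "Toys", "doll": "Toys",
            "robot": "Toys", "book": "Books"}


def recommend(me):
    toys_i_like = {p for p in likes[me] if category[p] == "Toys"}

    # Step 1: count toys in common with every other customer (@inCommon),
    # then take the log-based similarity (@lc).
    in_common = defaultdict(int)
    for other, liked in likes.items():
        if other == me:
            continue
        for _ in liked & toys_i_like:
            in_common[other] += 1
    lc = {o: math.log(1 + n) for o, n in in_common.items()}

    # Step 2: each similar customer adds their similarity to every toy they like (@rank).
    rank = defaultdict(float)
    for o, weight in lc.items():
        for t in likes[o]:
            if category[t] == "Toys":
                rank[t] += weight

    # Drop the toys I already like (ToysTheyLike minus ToysILike).
    return {t: r for t, r in rank.items() if t not in toys_i_like}


ranked = recommend("me")
for toy, score in sorted(ranked.items(), key=lambda kv: -kv[1]):
    print(toy, round(score, 3))
```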

Control Flow Primitives

Loops Are Essential
• Loops (until condition is satisfied)
  – Necessary to program iterative algorithms, e.g. PageRank, recommender systems, shortest-path, etc.
  – They synergize with accumulators. This GSQL-unique combination concisely expresses sophisticated graph analytics
  – Can be used to program unbounded-length path traversal under various semantics

PageRank in GSQL

  CREATE QUERY pageRank (float maxChange, int maxIteration, float dampingFactor) {
    MaxAccum<float> @@maxDifference = 9999;  // max score change in an iteration
    SumAccum<float> @received_score = 0;     // sum of scores received from neighbors
    SumAccum<float> @score = 1;              // initial score for every vertex is 1.

    AllV = {Page.*};                         // start with all vertices of type Page

    WHILE @@maxDifference > maxChange LIMIT maxIteration DO
      @@maxDifference = 0;
      S = SELECT s
          FROM AllV:s -(Linkto)-> :t
          ACCUM t.@received_score += s.@score/s.outdegree()
          POST-ACCUM s.@score = 1-dampingFactor + dampingFactor * s.@received_score,
                     s.@received_score = 0,
                     @@maxDifference += abs(s.@score - s.@score');
    END;
  }
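For readers who want to step through the loop, here is a Python sketch that mirrors the query above: one dictionary per vertex-attached accumulator and a plain variable for the global @@maxDifference. The default parameter values and the sample graph are assumptions. The sketch also assumes every link target appears as a key of `out_links` and updates every vertex in each round.

```python
def page_rank(out_links, max_change=0.001, max_iteration=50, damping=0.85):
    """Iterate until the largest score change drops below max_change."""
    score = {v: 1.0 for v in out_links}          # @score, initially 1
    max_difference = float("inf")                # @@maxDifference
    iteration = 0
    while max_difference > max_change and iteration < max_iteration:
        max_difference = 0.0
        received = {v: 0.0 for v in out_links}   # @received_score, reset each round
        # ACCUM: every vertex sends an equal share of its score along each out-edge.
        for s, targets in out_links.items():
            for t in targets:
                received[t] += score[s] / len(targets)
        # POST-ACCUM: recompute each score and track the largest change.
        for v in out_links:
            new_score = 1 - damping + damping * received[v]
            max_difference = max(max_difference, abs(new_score - score[v]))
            score[v] = new_score
        iteration += 1
    return score


# Made-up three-page graph: a links to b and c, b links to c, c links to a.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = page_rank(links)
print({v: round(s, 3) for v, s in scores.items()})
```

The aggregation inside the loop body and the convergence test in the loop header correspond to the accumulator-plus-loop combination the slides describe.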

Takeaway
• Serendipitous synergy of flexible aggregation + loops, from the point of view of both
  – expressive power (conciseness, naturalness)
  – performance
