Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines
Jingjing Wang, Magdalena Balazinska, Daniel Halperin
University of Washington
Modern Analytics Requires Iteration
• Graph applications
  – Graph reachability
  – Connected components
  – Shortest path
• Machine learning
  – Clustering algorithms
  – Logistic regression
• Scientific analytics
  – N-body simulation
• …
Jingjing Wang - University of Washington 2
Galaxy Evolution: An Iterative Example
A Simulation of the Universe
(Figure: galaxies along a timeline from the Big Bang, through millions of years ago, to the present day. Picture from D. H. Stalder et al., arXiv:1208.3444 [astro-ph.CO].)
Galaxy Evolution: Iterative Lineage Tracing
(Figure: particles and galaxies at the present day and at snapshots from millions of years ago.)
Galaxy Evolution: Why It Is Not Easy
• Large-scale data sizes
  – Scalability
• Iteration is the core
  – Support efficient iterative constructs
• Users are data scientists
  – Provide an easy-to-use query interface
• Shared datasets and resources
  – Within a data management system
Iterative Analytics: Where to Do It
• SQL Server
  – Single-node, cannot handle huge scale
• MapReduce
  – Rigid programming model
  – Writes to disk, expensive iteration
• In-memory systems such as Spark
  – Synchronous operations
• Graph engines such as GraphLab
  – Think like a vertex
No Existing System Meets All Requirements
• Synchronous iterations only
  – AsterixDB, HaLoop, Pregel, REX, Spark, PrIter, Glog, …
• Single-node
  – LogicBlox, DatalogFS, …
• No declarative language
  – Stratosphere, Naiad, Grace, GraphLab, …
• Specialized for graphs
  – GraphLab, Grace, …
• Not a data management system
  – SociaLite, …
• Theory on recursive queries
  – DatalogFS, …
Outline and Contributions
• Full-stack solution for iterative processing
  – Declarative relational query language
      • A subset of Datalog-with-Aggregation
  – Scalable and easily implementable
      • Small extensions to existing shared-nothing systems
  – Efficient iterative computation
      • Execution models and optimizations
      • Implementation and empirical evaluation using
Outline and Contributions (next)
  – Declarative relational query language
      • A subset of Datalog-with-Aggregation
  – Scalable and easily implementable
      • Small extensions to existing shared-nothing systems
From Datalog Programs to Asynchronous Query Plans
• Datalog: a relational query language
  – Nicely expresses recursions
        CC(x,x) :- Edges(x,_)
        CC(y,$Min(v)) :- CC(x,v), Edges(x,y)
        :- CC(y,v)
• Two special operators
  – IDBController
      • Maintains state of “nonconstant” relations
  – TerminationController
  – Easy extensions to an existing engine
• Automatic compilation

    DECLARE @id AS INT, @lvl AS INT
    SET @id = 3
    SET @lvl = 2
    ;WITH cte (id, parent, child, lvl) AS (
        SELECT id, parent, child, 0
        FROM t WHERE id = 1
        UNION ALL
        SELECT E.id, E.parent, E.child, M.lvl+1
        FROM t AS E JOIN CTE AS M ON E.parent = M.child
        WHERE lvl < @lvl
    )
    SELECT * FROM CTE --where lvl=@lvl --OPTION (MAXRECURSION 10)
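To make the two connected-components rules concrete, here is a minimal single-process sketch in Python. It is not the engine's compiled query plan. It assumes the blank second argument of Edges in the first rule is a wildcard, and the function name `connected_components` is hypothetical. The dictionary `cc` plays the role the slide assigns to the IDBController (state of the recursive relation), and the loop condition plays the role of the TerminationController.

```python
from collections import defaultdict


def connected_components(edges):
    """Evaluate the two rules to a fixpoint over directed (x, y) pairs:

        CC(x, x)       :- Edges(x, _)
        CC(y, $Min(v)) :- CC(x, v), Edges(x, y)
    """
    out = defaultdict(list)            # hash index on Edges.x for the join
    for x, y in edges:
        out[x].append(y)

    # State of the recursive relation CC: vertex -> smallest label so far.
    cc = {x: x for x in out}           # base rule: CC(x, x) :- Edges(x, _)
    delta = dict(cc)                   # tuples that changed in the last round

    # Stop once a round derives no improved tuple.
    while delta:
        candidates = {}
        for x, v in delta.items():     # join only the changed tuples with Edges
            for y in out.get(x, ()):
                if v < candidates.get(y, float("inf")):
                    candidates[y] = v  # $Min(v) within this round
        # Keep only candidates that improve on the stored state.
        delta = {y: v for y, v in candidates.items()
                 if v < cc.get(y, float("inf"))}
        cc.update(delta)
    return cc
```

For an undirected graph, pass each edge in both directions; every vertex then ends up labeled with the smallest vertex id in its component.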
Outline and Contributions (next)
  – Efficient iterative computation
      • Execution models and optimizations
      • Implementation and empirical evaluation using
Iterative Computation: How Can We Do Better
• Performance impact: # of intermediate tuples
  – More tuples, more work, more resources
• Optimization: recursive execution models
  – Synchronous vs. asynchronous
• Optimization: prioritizing tuples
  – For asynchronous model, favor new tuples vs. base tuples
Optimization: Recursive Execution Models
• Synchronous
  – Stop at the end of each iteration
• Asynchronous
  – No barrier, propagate updates when ready
• Galaxy Evolution
  – Synchronous
      • Find all galaxies at timestep 1, then 2, …
  – Asynchronous
      • Galaxy A is a part of the evolution history
      • A shares particles with galaxy B
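The difference between the two models can be sketched on the connected-components computation. The Python below is an illustration under simplifying assumptions (one process, no network), not the engine's implementation; the function names `cc_synchronous` and `cc_asynchronous` are hypothetical. The synchronous version applies updates only at the end of each round, while the asynchronous version applies and forwards each update as soon as it is dequeued.

```python
from collections import defaultdict, deque


def _index(edges):
    """Group directed (x, y) edges by source vertex."""
    out = defaultdict(list)
    for x, y in edges:
        out[x].append(y)
    return out


def cc_synchronous(edges):
    """Round-based evaluation: updates become visible only at the barrier."""
    out = _index(edges)
    label = {x: x for x in out}
    frontier, rounds = dict(label), 0
    while frontier:
        rounds += 1
        proposed = {}
        for x, v in frontier.items():
            for y in out.get(x, ()):
                proposed[y] = min(v, proposed.get(y, v))
        # Barrier: the whole round is finished before any label changes.
        frontier = {y: v for y, v in proposed.items()
                    if v < label.get(y, float("inf"))}
        label.update(frontier)
    return label, rounds


def cc_asynchronous(edges):
    """Queue-based evaluation: no barrier between updates."""
    out = _index(edges)
    label = {x: x for x in out}
    pending = deque(label.items())
    processed = 0
    while pending:
        x, v = pending.popleft()
        processed += 1
        if v > label[x]:
            continue                   # a smaller label has already arrived
        for y in out.get(x, ()):
            if v < label.get(y, float("inf")):
                label[y] = v           # apply immediately
                pending.append((y, v)) # and propagate right away
    return label, processed
```

Both functions reach the same fixpoint; they differ in how many rounds or individual updates they process on the way, which is the quantity the slides relate to run time.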
Galaxy Evolution: Execution Model Does Not Matter Much
(Figure: run time in seconds vs. number of workers (8, 16, 32, 64); 80 GB, 27 snapshots, 16 machines.)
Another Application: Least Common Ancestor
(Figure: citation graph of papers 1 to 5, with citation edges and distance labels dist:1, dist:2, dist:3.)
LCA: Asynchronous Can Be Much Slower Than Synchronous
(Figure: run time in seconds vs. number of workers (8, 16, 32, 64); 2 million papers, 8 million citations.)
Optimization: Prioritizing Tuples
• For asynchronous processing
  – Choice: favor new tuples vs. base tuples
• Example: connected components
(Figure: two copies of a four-vertex example graph, vertices 1 to 4.)
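One way to picture the pull-order choice is a toy single-worker model of the join between the recursive CC tuples and the base Edges tuples. The Python sketch below is an assumption-laden illustration, not the engine's operator: the function `async_cc`, the flag `new_tuples_first`, and the `derived` counter are all hypothetical names. The join keeps a table for each input, and the flag decides which input is read when both have unread tuples.

```python
from collections import defaultdict, deque


def async_cc(edges, new_tuples_first):
    """Asynchronous connected components with a configurable pull order.

    Returns (labels, derived), where `derived` counts the intermediate
    tuples handed to the min aggregate.
    """
    base = deque(edges)                # unread base tuples (the Edges scan)
    new = deque()                      # unread new CC tuples from the recursion
    edge_table = defaultdict(list)     # join state: consumed edges by source
    cc_table = {}                      # join state: latest consumed CC label
    best = {}                          # aggregate state: current min per vertex
    derived = 0

    def derive(y, v):
        nonlocal derived
        derived += 1
        if v < best.get(y, float("inf")):
            best[y] = v
            new.append((y, v))         # only improvements re-enter the loop

    while base or new:
        pull_new = bool(new) and (new_tuples_first or not base)
        if pull_new:
            x, v = new.popleft()
            cc_table[x] = v            # labels for x arrive in decreasing order
            for y in edge_table[x]:    # probe the edges consumed so far
                derive(y, v)
        else:
            x, y = base.popleft()
            derive(x, x)               # base rule: CC(x, x) :- Edges(x, _)
            edge_table[x].append(y)
            if x in cc_table:          # probe the CC labels consumed so far
                derive(y, cc_table[x])
    return best, derived
```

Both settings of the flag produce the same labels; only the number of intermediate tuples, returned as `derived`, can differ, and which order yields fewer depends on the input.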
Connected Components: Pull Order Impacts Run Time
(Figure: run time in seconds vs. number of workers (8, 16, 32, 64) for three series: Sync; Async, new tuples first; Async, base tuples first. 21 million vertices, 776 million edges.)
Conclusion
• Full-stack solution for iterative big-data analytics
  – A declarative language
  – Small extensions to existing shared-nothing engines
  – Efficient iterative execution
  – Failure handling methods
  – More details in the paper
• Empirical evaluation of various models
  – No single method outperforms others
  – Future work: an adaptive cost-based optimizer