An Introduction to Distributed Data Streaming: Elements and Systems
Paris Carbone <parisc@kth.se>
PhD Candidate, KTH Royal Institute of Technology
How to avoid this?
Motivation
Standing Query Q
Preliminaries
• Data Streaming Paradigm
• Incoming data is unbounded: it arrives continuously
• Standing queries are evaluated continuously
• Queries operate on the full data stream or on the most recent views of the stream (windows)
Data Streams Basics
• Events/Tuples: elements of computation that respect a schema
• Data Streams: unbounded sequences of events
• Stream Operators: consume streams and generate new ones
• Events are consumed once; there is no backtracking!
[Figure: operator f with stream labels S1, S’1, S2, S’2, So]
Streaming Pipelines
[Figure: sources emit stream1 and stream2 into query Q; its results (approximations, predictions, alerts, …) flow to sinks]
Core Abstractions
• Windows
• Synopses (summary state)
• Partitioning
Windows
Discussion: Why do we need windows?
Windows
• We are often interested only in fresh data
• f = “average temperature over the last minute every 20 sec”
• Range: most data stream processing systems allow window operations on the most recent history (e.g. 1 minute, 1000 tuples)
• Slide: the frequency/granularity at which f is evaluated on a given range
• W: range = 1 min, slide = 20 sec
[Figure: f produces Average #1, Average #2, Average #3 along a time axis from 0 to 100 seconds]
Window Types
• Sliding: range > slide
• Tumbling: range = slide
• Jumping: range < slide
[Figure: one timeline in seconds per window type, showing where Average #1, Average #2, Average #3 are computed]
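The relation between range and slide can be made concrete with a small sketch. This is an illustrative Python sketch, not code from the talk: the function name `windowed_averages` is hypothetical, events are assumed to be `(timestamp_in_seconds, value)` pairs with integer timestamps, and window k is assumed to cover `[k * slide, k * slide + range)`.

```python
from collections import defaultdict


def windowed_averages(events, range_sec, slide_sec):
    """Return {window_start: average} for windows [start, start + range_sec).

    Assumption of this sketch: windows start at 0, slide_sec, 2 * slide_sec, ...
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in events:
        # Index of the latest window whose start is <= ts.
        k = ts // slide_sec
        # Walk back over every earlier window that still covers ts.
        # With range < slide (jumping) an event may fall in no window at all.
        while k >= 0 and k * slide_sec + range_sec > ts:
            sums[k * slide_sec] += value
            counts[k * slide_sec] += 1
            k -= 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


readings = [(0, 10), (10, 20), (30, 30), (50, 40), (70, 50)]
sliding = windowed_averages(readings, range_sec=60, slide_sec=20)
tumbling = windowed_averages(readings, range_sec=40, slide_sec=40)
jumping = windowed_averages(readings, range_sec=20, slide_sec=40)
```

With these readings, the sliding case assigns each event to up to three windows, the tumbling case to exactly one, and the jumping case drops the events at 30 and 70 seconds because they fall between windows.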
Synopses
We cannot store all events seen indefinitely
• Synopsis: a summary of an infinite stream
• It is in principle any streaming operator state
• Examples: samples, histograms, sketches, state machines…
[Figure: operator f with state s, a summary of everything seen so far, turning input t into output t’]
1. process t, s
2. update s
3. produce t’
What about window synopses?
Synopses-Aggregations
• Discussion - Rolling Aggregations
• Propose a synopsis, s=? when
• f= max
• f= ArithmeticMean
• f= stDev
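One possible answer to the discussion, sketched in Python. The class name `RollingStats` is hypothetical, and the choice of Welford's online update for the standard deviation is an assumption of this sketch rather than something stated on the slide; keeping a count, a sum and a sum of squares would be another valid synopsis.

```python
import math


class RollingStats:
    """Constant-size synopsis for max, arithmetic mean and standard deviation."""

    def __init__(self):
        self.count = 0
        self.maximum = None   # synopsis for f = max
        self.mean = 0.0       # with count: synopsis for f = ArithmeticMean
        self.m2 = 0.0         # sum of squared deviations, for f = stDev

    def update(self, x):
        """Fold one event into the synopsis (Welford's online update)."""
        self.count += 1
        self.maximum = x if self.maximum is None else max(self.maximum, x)
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def stdev(self):
        """Population standard deviation of everything seen so far."""
        return math.sqrt(self.m2 / self.count) if self.count else 0.0
```

Each event is processed once and then discarded, which matches the "consumed once, no backtracking" constraint: the state never grows with the length of the stream.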
Synopses-Approximations
• Discussion - Approximate Results
• Propose a synopsis, s=? when
• f= uniform random sample of k records over the whole stream
• f= filter distinct records over windows of 1000 records with a 5% error
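For the first question, reservoir sampling is one well-known synopsis. The sketch below is a minimal Python version under stated assumptions: the function name `reservoir_sample` and the optional `rng` parameter are hypothetical, and the slide itself does not name a technique. For the second question, a Bloom filter is a commonly used candidate; it is not shown here.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    sample = []  # the synopsis: at most k items, regardless of stream length
    for i, item in enumerate(stream):
        if i < k:
            # The first k items fill the reservoir.
            sample.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample
```

Passing a seeded `random.Random` makes a run repeatable, which helps when testing an operator that embeds the sampler.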
Synopses-ML and Graphs
• Examples of cool synopses to check out
• Sparsifiers/Spanners: approximating graph properties such as shortest paths
• Change detectors: detecting concept drift
• Incremental decision trees: continuous stream training and classification
Partitioning
• One stream operator is not enough
• Data might be too large to process, e.g. very high input rate, too many stream sources
• State could possibly not fit in memory
[Figure: a single operator f with state s is replaced by parallel instances, each with its own f and s]
How do we partition the input streams?
Partitioning
• Partitioning defines how we allocate events to each parallel instance. Typical partitioners are:
• Broadcast
• Shuffle
• Key-based
[Figure: for each case, a partitioner P in front of parallel instances of f, each with state s; the key-based example partitions by color]
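The three partitioners can be sketched as functions that map an event to the indices of the parallel instances that should receive it. This is an illustrative Python sketch: all function names are hypothetical, shuffle is implemented as round-robin (a random choice would be equally valid), and CRC32 is an arbitrary choice of stable hash.

```python
import itertools
import zlib


def broadcast(event, n):
    """Broadcast: every one of the n parallel instances receives the event."""
    return list(range(n))


def make_shuffle(n):
    """Shuffle: spread events evenly; round-robin is assumed in this sketch."""
    counter = itertools.count()

    def shuffle(event):
        return [next(counter) % n]

    return shuffle


def key_based(event, n, key):
    """Key-based: events with the same key always reach the same instance."""
    # crc32 is stable across runs, unlike Python's salted hash() for strings.
    digest = zlib.crc32(str(key(event)).encode("utf-8"))
    return [digest % n]
```

Key-based partitioning is what makes per-key state possible: since every event for a given key lands on the same instance, that instance can hold the synopsis for the key locally.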
Putting Everything Together
Fire Detection Pipeline
[Figure: input {area,temp}, triggered periodically, and input {area,smoke}, triggered on detection, feed a pipeline marked “?” that outputs {loc,alert!}]
• operators
• synopses
• windows
• partitioning
Operators
• Sensor Data Sources: Src {area,temp} for Periodic Temperature Updates; Src {area} for Smoke Detections
• A: Rolling Arithmetic Mean of Temperatures, {area,temp} → {area,avgTemp}, with state s (“A trivial…”)
• F: State Machine-based Fire Alarm, with state s and output {alarm}. What is the state and its transitions?
Partitioning
• We are only interested in correlating smoke and high temperature within the same area
• Events carry area information, so we can partition our computation by area
[Figure: source Src followed by partitioner P with key:area]
Windowing
• Individual sensor data could potentially be faulty
• We need to gather data from all temperature sensors of an area and produce an average
• We want fresh average temperatures
[Figure: Src emits {area,temp} to partitioner P (key:area), which feeds parallel instances of A, each with window w and state s; w = ?]
The Fire Alarm
[Figure: state machine F with state s; states OK, HOT, SMOKE, FIRE, connected by transitions labelled T and S]
• T: avgTemp>40
• T: avgTemp<40
• S: Smoke
• Example input: …TTTSTTSTTTT….
• synopsis = 1 state
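A minimal Python sketch of such a state machine follows. The four states (OK, HOT, SMOKE, FIRE) and the 40-degree threshold come from the slide, but the exact transitions did not survive in the figure, so every transition below is an assumption: a hot reading moves OK to HOT, smoke moves OK to SMOKE, seeing both in either order leads to FIRE, a cool reading resets to OK, and FIRE is treated as final. A reading of exactly 40 is treated as cool. The names `step` and `run` are hypothetical.

```python
OK, HOT, SMOKE, FIRE = "OK", "HOT", "SMOKE", "FIRE"
THRESHOLD = 40  # from the slide: avgTemp>40 versus avgTemp<40


def step(state, event):
    """Return the next state. Events are ("T", avg_temp) or ("S", None)."""
    kind, value = event
    if state == FIRE:
        return FIRE  # assumption: FIRE is final once reached
    if kind == "S":
        # Assumption: smoke while HOT means fire, otherwise remember the smoke.
        return FIRE if state == HOT else SMOKE
    if value > THRESHOLD:
        # Assumption: a hot reading after smoke means fire.
        return FIRE if state == SMOKE else HOT
    return OK  # assumption: a cool reading resets the alarm


def run(events):
    """Feed events through the machine; return (final_state, alert_positions)."""
    state, alerts = OK, []
    for i, event in enumerate(events):
        new_state = step(state, event)
        if new_state == FIRE and state != FIRE:
            alerts.append(i)  # alert once, when FIRE is first entered
        state = new_state
    return state, alerts
```

Whatever the real transitions are, the point made on the slide holds: the synopsis is a single state value per area, no matter how many events have been seen.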
Putting Everything Together
[Figure: Src {area,temp} → P (key:area) → parallel instances of A (window w, state s) → {area,avg_temp} → P (key:area) → parallel instances of F (state s) → {area, alert}; Src {area,smoke} → P (key:area) → F]
Systems: The Big Picture
• Proprietary: Google DataFlow, IBM Infosphere, Microsoft Azure
• Open Source: Flink, Samza, Spark, Storm
Evolution
Concepts
• ’88 Active DataBases
• ’95 Materialised Views
• ’01 Complex Event Processing
• ’05 High Availability on Streaming
• ’05 Decentralised Stream Queries
• ’12 Policy-Based Windowing
• ’13 Discretized Streams
• ’13 Parallel Recovery
• ’15 User-Defined Windows
Systems
• ’88 HiPac
• ’00 Eddies
• ’02 Aurora
• ’03 TelegraphCQ
• ’03 STREAM
• ’05 Borealis
• ’12 Twitter Storm
• ’12 IBM System S
• ’13 Google Millwheel
• ’13 Spark Streaming
• ’14 Apache Flink
Programming Models
Declarative
• Expose a high-level API
• Operators are higher order functions on abstract data stream types
• Advanced behaviour such as windowing is supported
• Self-Optimisation
Compositional
• Offer basic building blocks for composing custom operators and topologies
• Advanced behaviour such as windowing is often missing
• Custom Optimisation
Programming Model Types
Declarative
• Transformations abstract operator details
• DStream, DataStream, PCollection…
• Suitable for engineers and data analysts
Compositional
• Direct access to the execution graph / topology
• Suitable for engineers
Standing Queries with Apache Storm
• Step 1: Implement input (Spouts) and intermediate operators (Bolts)
• Step 2: Construct a Topology by combining operators
• Spouts are the topology sources. They listen to data feeds
• Bolts represent all intermediate computation vertices of the topology. They do arbitrary data manipulation
• Each operator can emit/subscribe to Streams (computation results)
[Figure: Spout → Bolt → Bolt]
Example: Topology Definition
[Figure: topology definition with the labels numbers, new_numbers, toFile]
Standing Queries with Apache Flink
[Figure: Streaming Program → Flink Client → Flink Job Graph Builder/Optimiser → Flink Runtime]
• Flink Job Graph Builder/Optimiser: Operator fusion, Window Pre-aggregates
• Flink Runtime: Deploy Long Running Tasks, Monitor Execution
Distributed Stream Execution Paradigms
1) Real Streaming (Distributed Data Flow)
• Long-lived task execution
• State is kept inside tasks
2) Batched Execution (Spark Streaming) (Hadoop, Spark)
Windows in Action
DStreams
• DStreams are already partitioned in time windows
• Only time windows supported
Window policies
• Windows decomposed into policies (range, slide)
• Policies can be user-defined too
Windows on Storm?
Source: http://www.michael-noll.com/blog/2013/01/18/implementing-real-time-trending-topics-in-storm/
Partitioning in Action
• Storm: shuffleGrouping(), allGrouping(), fieldsGrouping(), customGrouping()
• Flink: forward(), shuffle(), broadcast(), keyBy(), partitionCustom()
• Spark Streaming: repartition(num), reduceByKey(), updateStateByKey()
[Figure: the APIs are placed on a scale from “full control” to “no fine-grained control”]
Synopses in Action
Implementing a rolling max per key
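The code shown on this slide did not survive extraction. As a stand-in, here is a system-independent Python sketch of a rolling max per key; the function name `rolling_max_per_key` is hypothetical and the sketch uses no Storm, Flink or Spark API.

```python
def rolling_max_per_key(events):
    """Yield (key, max so far for that key) after each (key, value) event."""
    best = {}  # the synopsis: one value per key
    for key, value in events:
        if key not in best or value > best[key]:
            best[key] = value
        # One output per input: the rolling result is refreshed on every event.
        yield key, best[key]


updates = list(rolling_max_per_key([("a", 1), ("b", 5), ("a", 3), ("a", 2)]))
```

In a partitioned deployment each parallel instance would hold the `best` entries only for the keys routed to it, which is why this pattern is normally combined with key-based partitioning.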
State in Spark?
• Streams are partitioned into small batches
• There is practically no state kept in workers (stateless)
• How do we keep state?
• Put new states in the output RDD: dstream.updateStateByKey(…)
[Figure: Spark Streaming, with labels In and S’]
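The idea of carrying state from one micro-batch to the next can be imitated in plain Python. This is not Spark code: `update_state_by_key` and `running_max` are hypothetical helpers, and the `update_func(new_values, previous_state)` shape, with `None` dropping a key, is modelled on how the update function of `updateStateByKey` is commonly described.

```python
def update_state_by_key(batch, state, update_func):
    """Build the next state from one batch of (key, value) pairs and the old state."""
    grouped = {}
    for key, value in batch:
        grouped.setdefault(key, []).append(value)
    new_state = {}
    # Keys without new values are still visited so their state carries over.
    for key in set(grouped) | set(state):
        result = update_func(grouped.get(key, []), state.get(key))
        if result is not None:  # assumption: None drops the key
            new_state[key] = result
    return new_state


def running_max(new_values, previous):
    """Example update function: maximum value seen so far for a key."""
    candidates = list(new_values)
    if previous is not None:
        candidates.append(previous)
    return max(candidates) if candidates else None


state = {}
for batch in [[("a", 1), ("b", 5)], [("a", 3)], []]:
    # Each batch produces a fresh state object; nothing is mutated in place.
    state = update_state_by_key(batch, state, running_max)
```

The state is an output of each batch rather than something held inside a long-lived task, which is the contrast with the real-streaming paradigm drawn earlier in the deck.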
Implementing the alarm in Flink
So everything works, or…
[Figure: the fire detection pipeline from “Putting Everything Together”: temperature and smoke sources, key:area partitioners P, windowed averages A, fire alarm state machines F]
Unreliable Sources
[Figure: standing query Q]
• add more sensors
Unreliable Processing
[Figure: standing query Q]
• recovered!
• lost smoke events
Resilient Brokers
Main Features
• Topic-based partitioned queues
• Strongly consistent offset mapping to records
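The two features can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not the API of any real broker: the class `Topic` and its methods are hypothetical, records are routed to partitions by a CRC32 hash of their key, and an offset is simply a record's position in its partition.

```python
import zlib


class Topic:
    """Toy partitioned, append-only log; an offset is a position in a partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, record):
        """Store the record and return its (partition, offset)."""
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append(record)
        return p, len(self.partitions[p]) - 1

    def read(self, partition, offset):
        """Return every record of a partition from the given offset onwards."""
        return self.partitions[partition][offset:]


topic = Topic(num_partitions=1)
topic.append("area-7", "smoke")
topic.append("area-7", "temp=45")
# A consumer that crashed after offset 0 can re-read from offset 0 on recovery.
replayed = topic.read(0, 0)
```

Because an offset always maps to the same record, a recovering consumer can rewind to the last offset it had fully processed, which addresses the lost smoke events from the previous slide.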