
Streaming SQL to Unify Batch and Stream Processing: Theory and Practice with Apache Flink at Uber

Fabian Hueske, Shuyi Chen. Strata Data Conference, San Jose, March 7th, 2018.


  1. Streaming SQL to Unify Batch and Stream Processing: Theory and Practice with Apache Flink at Uber. Fabian Hueske, Shuyi Chen. Strata Data Conference, San Jose, March 7th, 2018.

  2. What is Apache Flink? Stateful Computations Over Data Streams
     - Data Stream Processing: realtime results from data streams
     - Batch Processing: process static and historic data
     - Event-driven Applications: data-driven actions and services

  3. What is Apache Flink? Stateful computations over streams, real-time and historic
     - fast, scalable, fault tolerant, in-memory, event time, large state, exactly-once
     - [Diagram labels: Application, Queries, Streams, Applications, Database, Devices etc., Stream, Historic Data, File / Object Storage]

  4. Hardened at scale
     - Streaming Platform Service: billions of messages per day, a lot of Stream SQL
     - Streaming Platform as a Service: 3700+ containers running Flink, 1400+ nodes, 22k+ cores, 100s of jobs
     - Fraud detection, Streaming Analytics Platform
     - Streaming SQL as a platform: 100s of jobs, 1000s of nodes, TBs of state; metrics, analytics, real time ML

  5. Powerful Abstractions: layered abstractions to navigate simple to complex use cases
     - SQL / Table API (dynamic tables): High-level Analytics API
     - DataStream API (streams, windows): Stream- & Batch Data Processing

           val stats = stream
             .keyBy("sensor")
             .timeWindow(Time.seconds(5))
             .sum((a, b) -> a.add(b))

     - Process Function (events, state, time): Stateful Event-Driven Applications

           def processElement(event: MyEvent, ctx: Context, out: Collector[Result]) = {
             // work with event and state
             (event, state.value) match { … }
             out.collect(…)    // emit events
             state.update(…)   // modify state
             // schedule a timer callback
             ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
           }
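What the DataStream snippet on this slide computes can be sketched outside Flink. The following is a minimal Python illustration, not Flink code: it groups events by sensor, assigns each event to a 5-second tumbling window, and sums the readings. The event layout `(sensor, timestamp_ms, value)` and the sample data are assumptions made for the sketch, and it ignores watermarks and window firing, which a real stream processor has to handle.

```python
from collections import defaultdict

WINDOW_MS = 5_000  # 5-second window, mirroring timeWindow(Time.seconds(5))


def keyed_tumbling_sum(events):
    """Sum values per (sensor, window_start) for (sensor, timestamp_ms, value) events."""
    sums = defaultdict(float)
    for sensor, ts, value in events:
        # Tumbling windows are aligned to multiples of the window size.
        window_start = ts - (ts % WINDOW_MS)
        sums[(sensor, window_start)] += value
    return dict(sums)


# Hypothetical sample events: (sensor, timestamp in ms, reading)
events = [
    ("s1", 1_000, 2.0),
    ("s2", 1_500, 1.0),
    ("s1", 4_999, 3.0),
    ("s1", 5_000, 4.0),  # falls into the next window
]
result = keyed_tumbling_sum(events)
```

Keying on `(sensor, window_start)` is the whole trick here: the key partitions the stream, and the window start partitions time.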

  6. Apache Flink’s Relational APIs
     - ANSI SQL

           SELECT user, COUNT(url) AS cnt
           FROM clicks
           GROUP BY user

     - LINQ-style Table API

           tableEnvironment
             .scan("clicks")
             .groupBy('user)
             .select('user, 'url.count as 'cnt)

     - Unified APIs for batch & streaming data: a query specifies exactly the same result regardless of whether its input is static batch data or streaming data.

  7. Query Translation
     - Both forms of the query are translated, whether the input data is bounded (batch) or unbounded (streaming):

           tableEnvironment
             .scan("clicks")
             .groupBy('user)
             .select('user, 'url.count as 'cnt)

           SELECT user, COUNT(url) AS cnt
           FROM clicks
           GROUP BY user

  8. What if “clicks” is a file?
     - Clicks (input data is read at once):

           user  cTime     url
           Mary  12:00:00  https://…
           Bob   12:00:00  https://…
           Mary  12:00:02  https://…
           Liz   12:00:03  https://…

     - Query:

           SELECT user, COUNT(url) AS cnt
           FROM clicks
           GROUP BY user

     - Result (produced at once):

           user  cnt
           Mary  2
           Bob   1
           Liz   1

  9. What if “clicks” is a stream?
     - Clicks (input data is continuously read):

           user  cTime     url
           Mary  12:00:00  https://…
           Bob   12:00:00  https://…
           Mary  12:00:02  https://…
           Liz   12:00:03  https://…

     - Query:

           SELECT user, COUNT(url) AS cnt
           FROM clicks
           GROUP BY user

     - Result (continuously produced; Mary's count is updated from 1 to 2):

           user  cnt
           Mary  2
           Bob   1
           Liz   1

     - The result is identical!
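The claim that both modes yield the same result can be checked with a small sketch. Here SQLite (through Python's `sqlite3` module) stands in for a batch engine running the slide's query once over all rows, and a plain incremental counter stands in for a continuous query that updates its result per arriving row. Neither is Flink, and the URLs are placeholders because the slide elides them.

```python
import sqlite3
from collections import Counter

# Rows from the slide; the URL values are hypothetical placeholders.
CLICKS = [
    ("Mary", "12:00:00", "https://example.com/1"),
    ("Bob", "12:00:00", "https://example.com/2"),
    ("Mary", "12:00:02", "https://example.com/3"),
    ("Liz", "12:00:03", "https://example.com/4"),
]

QUERY = "SELECT user, COUNT(url) AS cnt FROM clicks GROUP BY user"


def batch_result(rows):
    """Load all rows, then evaluate the query once (bounded input)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (user TEXT, cTime TEXT, url TEXT)")
    conn.executemany("INSERT INTO clicks VALUES (?, ?, ?)", rows)
    result = dict(conn.execute(QUERY).fetchall())
    conn.close()
    return result


def streaming_results(rows):
    """Consume rows one at a time and yield the updated result after each."""
    counts = Counter()
    for user, _ctime, url in rows:
        if url is not None:  # COUNT(url) skips NULL values
            counts[user] += 1
        yield dict(counts)


batch = batch_result(CLICKS)
snapshots = list(streaming_results(CLICKS))
```

The last streaming snapshot equals the batch result, while the earlier snapshots show the intermediate states (for example Mary at 1 before she reaches 2).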

  10. Why is stream-batch unification important?
     - Usability
       - ANSI SQL syntax: no custom “StreamSQL” syntax.
       - ANSI SQL semantics: no stream-specific results.
     - Portability
       - Run the same query on bounded and unbounded data
       - Run the same query on recorded and real-time data
     - [Timeline diagram: start of the stream, past, now, future; bounded query, unbounded query]
     - Do we need to soften SQL semantics for streaming?

  11. DBMSs Run Queries on Streams
     - Materialized views (MV) are similar to regular views, but persisted to disk or memory
       - Used to speed up analytical queries
       - MVs need to be updated when the base tables change
     - MV maintenance is very similar to SQL on streams
       - Base table updates are a stream of DML statements
       - MV definition query is evaluated on that stream
       - MV is the query result and is continuously updated
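The analogy between view maintenance and stream processing can be made concrete with a small sketch. It treats base-table changes as a stream of DML operations and patches a count-per-user view incrementally, checking after every step that the patched view matches a full recomputation. The table layout, the operation names, and the sample data are assumptions chosen for illustration; real DBMSs use more elaborate maintenance strategies.

```python
from collections import Counter


def full_recompute(base_rows):
    """Evaluate the view definition (clicks per user) from scratch."""
    return dict(Counter(user for user, _url in base_rows))


def apply_dml(view, base_rows, op, row):
    """Apply one INSERT or DELETE to the base table and patch the view in place."""
    user, _url = row
    if op == "INSERT":
        base_rows.append(row)
        view[user] = view.get(user, 0) + 1
    elif op == "DELETE":
        base_rows.remove(row)
        view[user] -= 1
        if view[user] == 0:
            del view[user]  # no clicks left, so the view row disappears
    else:
        raise ValueError(f"unsupported operation: {op}")


# Hypothetical stream of DML statements against the base table.
dml_stream = [
    ("INSERT", ("Mary", "/home")),
    ("INSERT", ("Bob", "/cart")),
    ("INSERT", ("Mary", "/cart")),
    ("DELETE", ("Bob", "/cart")),
]

base_rows, view = [], {}
for op, row in dml_stream:
    apply_dml(view, base_rows, op, row)
    # The incrementally maintained view always equals the recomputed one.
    assert view == full_recompute(base_rows)
```

Each DML statement touches only the affected view row, which is the property that makes the same approach workable on unbounded streams.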

  12. Continuous Queries in Flink
     - Core concept is a “Dynamic Table”
       - Dynamic tables are changing over time
     - Queries on dynamic tables
       - produce new dynamic tables (which are updated based on input)
       - do not terminate
     - Stream ↔ Dynamic table conversions

  13. Stream ↔ Dynamic Table Conversions
     - Append Conversions
       - Records are only inserted/appended
     - Upsert Conversions
       - Records are inserted/updated/deleted and have a (composite) unique key
     - Changelog Conversions
       - Records are inserted/updated/deleted
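To make the difference between upsert and changelog conversions tangible, the sketch below encodes the updates of a count-per-user table in two ways and replays each stream back into a table. The message formats (`"+"`/`"-"` flags for the changelog, `"upsert"` tuples keyed on user) are one possible encoding chosen for this illustration, not a wire format taken from the slides.

```python
def to_changelog(clicks):
    """Encode table updates as add ("+") and retract ("-") messages; no key needed."""
    counts = {}
    for user in clicks:
        old = counts.get(user)
        if old is not None:
            yield ("-", (user, old))  # retract the outdated row
        counts[user] = (old or 0) + 1
        yield ("+", (user, counts[user]))  # add the current row


def to_upsert(clicks):
    """Encode the same updates as upserts; requires `user` as unique key."""
    counts = {}
    for user in clicks:
        counts[user] = counts.get(user, 0) + 1
        yield ("upsert", user, counts[user])


def replay_changelog(messages):
    """Rebuild the table by applying adds and retractions."""
    table = []
    for flag, row in messages:
        if flag == "+":
            table.append(row)
        else:
            table.remove(row)
    return sorted(table)


def replay_upsert(messages):
    """Rebuild the table by overwriting rows by key."""
    table = {}
    for _op, key, cnt in messages:
        table[key] = cnt
    return sorted(table.items())


clicks = ["Mary", "Bob", "Mary", "Liz"]  # users of the arriving click events
changelog = list(to_changelog(clicks))
upserts = list(to_upsert(clicks))
```

An update costs two messages in the changelog encoding and one in the upsert encoding, which is the price of not needing a key. An append-only stream is the special case in which no row is ever updated or retracted.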

  14. SQL Feature Set in Flink 1.5.0
     - SELECT FROM WHERE
     - GROUP BY / HAVING
       - Non-windowed, TUMBLE, HOP, SESSION windows
     - JOIN
       - Windowed INNER, LEFT / RIGHT / FULL OUTER JOIN
       - Non-windowed INNER JOIN
     - Scalar, aggregation, table-valued UDFs
     - SQL CLI Client (beta)
     - [streaming only] OVER / WINDOW
       - UNBOUNDED / BOUNDED PRECEDING
     - [batch only] UNION / INTERSECT / EXCEPT / IN / ORDER BY

  15. What can I build with this?
     - Data Pipelines
       - Transform, aggregate, and move events in real-time
     - Low-latency ETL
       - Convert and write streams to file systems, DBMS, K-V stores, indexes, …
       - Convert appearing files into streams
     - Stream & Batch Analytics
       - Run analytical queries over bounded and unbounded data
       - Query and compare historic and real-time data
     - Data Preparation for Live Dashboards
       - Compute and update data to visualize in real-time

  16. The New York Taxi Rides Data Set
     - The New York City Taxi & Limousine Commission provides a public data set about taxi rides in New York City
     - We can derive a streaming table from the data
     - Table: TaxiRides

           rideId:  BIGINT     // ID of the taxi ride
           isStart: BOOLEAN    // flag for pick-up (true) or drop-off (false) event
           lon:     DOUBLE     // longitude of pick-up or drop-off location
           lat:     DOUBLE     // latitude of pick-up or drop-off location
           rowtime: TIMESTAMP  // time of pick-up or drop-off event

  17. Identify popular pick-up / drop-off locations
     - Every 5 minutes, compute for each location the number of departing and arriving taxis of the last 15 minutes.

           SELECT cell,
                  isStart,
                  HOP_END(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS hopEnd,
                  COUNT(*) AS cnt
           FROM (SELECT rowtime, isStart, toCellId(lon, lat) AS cell
                 FROM TaxiRides
                 WHERE isInNYC(lon, lat))
           GROUP BY cell,
                    isStart,
                    HOP(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE)
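How a hopping window with a 5-minute slide and a 15-minute size assigns events can be sketched in a few lines. The Python below is an illustration under simplifying assumptions: timestamps are whole minutes since an arbitrary origin, windows are aligned to multiples of the slide, cell IDs are made-up strings, and the `toCellId` and `isInNYC` functions from the query are not reproduced.

```python
from collections import Counter

SLIDE_MIN = 5   # a new window starts every 5 minutes
SIZE_MIN = 15   # each window covers 15 minutes


def hop_window_ends(minute):
    """Return the end minute of every hopping window [start, end) containing `minute`."""
    last_start = minute - (minute % SLIDE_MIN)
    # Walk back through earlier window starts that still cover `minute`.
    starts = range(last_start, minute - SIZE_MIN, -SLIDE_MIN)
    return [start + SIZE_MIN for start in starts]


def count_per_window(rides):
    """Count rides per (cell, is_start, hop_end), like the GROUP BY in the query."""
    counts = Counter()
    for cell, is_start, minute in rides:
        for hop_end in hop_window_ends(minute):
            counts[(cell, is_start, hop_end)] += 1
    return counts


# Hypothetical ride events: (cell, is_start, minute)
rides = [
    ("A", True, 2),
    ("A", True, 7),
    ("A", False, 12),
]
counts = count_per_window(rides)
```

Because the size is three times the slide, every event contributes to three overlapping windows, which is why each ride is counted in three result rows.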

  18. Average ride duration per pick-up location
     - Join start ride and end ride events on rideId and compute average ride duration per pick-up location.

           SELECT pickUpCell,
                  AVG(TIMESTAMPDIFF(MINUTE, s.rowtime, e.rowtime)) AS avgDuration
           FROM (SELECT rideId, rowtime, toCellId(lon, lat) AS pickUpCell
                 FROM TaxiRides
                 WHERE isStart) s
           JOIN (SELECT rideId, rowtime
                 FROM TaxiRides
                 WHERE NOT isStart) e
           ON s.rideId = e.rideId AND
              e.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR
           GROUP BY pickUpCell
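The logic of this time-bounded join can be sketched in plain Python. The sketch matches start and end events on the ride ID, keeps only pairs whose end lies within one hour of the start, and averages the duration (end time minus start time) per pick-up cell. Timestamps as whole minutes, the tuple layouts, and the sample rides are assumptions for illustration. The code processes complete lists, whereas a streaming engine evaluates the join incrementally and can use the one-hour bound to discard old events.

```python
from collections import defaultdict

MAX_RIDE_MIN = 60  # join bound from the query: end within 1 hour of start


def avg_duration_per_pickup_cell(starts, ends):
    """starts: (ride_id, minute, pickup_cell) tuples; ends: (ride_id, minute) tuples."""
    ends_by_ride = defaultdict(list)
    for ride_id, end_minute in ends:
        ends_by_ride[ride_id].append(end_minute)

    durations = defaultdict(list)
    for ride_id, start_minute, cell in starts:
        for end_minute in ends_by_ride[ride_id]:
            # Same predicate as BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR.
            if start_minute <= end_minute <= start_minute + MAX_RIDE_MIN:
                durations[cell].append(end_minute - start_minute)

    return {cell: sum(d) / len(d) for cell, d in durations.items()}


# Hypothetical events; ride 3 ends more than an hour after it starts.
starts = [(1, 0, "A"), (2, 10, "A"), (3, 20, "B")]
ends = [(1, 30), (2, 20), (3, 100)]
averages = avg_duration_per_pickup_cell(starts, ends)
```

Ride 3 falls outside the one-hour bound and is dropped by the inner join, so cell "B" does not appear in the result at all.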

  19. Building a Dashboard

           SELECT cell,
                  isStart,
                  HOP_END(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS hopEnd,
                  COUNT(*) AS cnt
           FROM (SELECT rowtime, isStart, toCellId(lon, lat) AS cell
                 FROM TaxiRides
                 WHERE isInNYC(lon, lat))
           GROUP BY cell,
                    isStart,
                    HOP(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE)

     - [Diagram labels: Kafka, Elastic Search]

  20. Flink SQL in Production @ UBER

  21. Uber's business is Real-Time

  22. Challenges
     - Infrastructure
       - 100s of billions of messages / day
       - At-least-once processing
       - Exactly-once state processing
       - 99.99% SLA on availability
       - 99.99% SLA on latency
     - Productivity
       - Target audience: operation people, data scientists, engineers
       - Integrations: logging, backend services, storage systems, data management, monitoring
     - Operation
       - ~1000 streaming jobs
       - Multiple DCs

  23. Stream processing @ Uber
     - Apache Samza (since Jul. 2015)
       - Scalable
       - At-least-once message processing
       - Managed state
       - Fault tolerance
     - Apache Flink (since May 2017)
       - All of the above
       - Exactly-once stateful computation
       - Accurate
       - Unified stream & batch processing with SQL

  24. Lifecycle of building a streaming job

  25. Writing the job
     - Steps: Business Logics, Input / Output, Testing, Debugging
     - Java/Scala
     - Streaming/batch
     - Duplicate code

  26. Running the job
     - Steps: Resource estimation, Deployment, Monitoring & Alerts, Logging, Maintenance
     - Manual process
     - Hard to scale beyond 10 jobs

  27. Job from idea to production takes days

  28. How can we improve efficiency as a platform?

  29. Flink SQL to be the savior

           SELECT AVG(…) FROM eats_order WHERE …

  30. Connectors

           SELECT AVG(…) FROM eats_order WHERE …

     - [Diagram labels: HTTP, Pinot]

  31. UI & Backend services: to make it self-service
     - SQL composition & validation
     - Connectors management

  32. UI & Backend services: to make it self-service
     - Job compilation and generation
     - Resource estimation
       - Analyze input: Kafka input rate, Hive metastore data
       - Analyze query: SELECT * FROM ...
       - Test deployment: YARN containers, CPU, heap memory

  33. UI & Backend services: to make it self-service
     - Job deployment
       - Sandbox: functional correctness; play around with SQL
       - Staging: system generated estimate; production like load
       - Production: managed
     - [Diagram: Sandbox, Staging, and Production stages with a Promote step]

  34. UI & Backend services: to make it self-service
     - Job management
