ETL is dead; long live streams


ETL is dead; long live streams. Neha Narkhede, Co-founder & CTO, Confluent. Data and data systems have really changed in the past decade. Old world: two popular locations for data (diagram labels: DB, DWH). Operational databases. Relational


  1. #1: Powerful and lightweight Java library; needs just Kafka and your app

  2. #2: Convenient DSL with all sorts of operators: join(), map(), filter(), windowed aggregates, etc.

  3. Word count program using Kafka’s Streams API

  4. #3: True event-at-a-time stream processing; no microbatching

  5. #4: Dataflow-style windowing based on event time; handles late-arriving data

  6. #5: Out-of-the-box support for local state; supports fast stateful processing

  7. External state

  8. Local state

  9. Fault-tolerant local state

  10. #6: Kafka’s Streams API allows reprocessing; useful to upgrade apps or do A/B testing

  11. Reprocessing

  12. Real-time dashboard for security monitoring

  13. Kafka’s Streams API: simple is beautiful (Vision 1, Vision 2)

  14. Logs unify batch and stream processing

  15. New shiny future of ETL: Kafka. Connect API (source): Extract; Streams API (app): Transforms; Connect API (sink): Load

  16. A giant mess! (diagram labels: App, cache, MQ, search, monitoring, security, DWH, Hadoop)
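
The slides on word count, event-at-a-time processing, and fault-tolerant local state can be illustrated with a small sketch. The code below is plain Java and deliberately does not use the Kafka Streams API; the class and method names (LocalStateWordCount, process, restore) are hypothetical, and an in-memory list stands in for a durable changelog. It is a rough picture of the idea under those assumptions, not how the library is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java illustration only; this is NOT the Kafka Streams API.
// All names here (LocalStateWordCount, process, restore) are hypothetical.
class LocalStateWordCount {
    // Local state: lives in the same process as the processing logic.
    private final Map<String, Long> store = new HashMap<>();
    // Stand-in for a durable changelog (an in-memory list in this sketch).
    private final List<Map.Entry<String, Long>> changelog;

    LocalStateWordCount(List<Map.Entry<String, Long>> changelog) {
        this.changelog = changelog;
    }

    // Handle one event as soon as it arrives; nothing is buffered into batches.
    List<Map.Entry<String, Long>> process(String line) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        for (String word : line.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split() yields "" when the line starts with punctuation
            }
            long count = store.merge(word, 1L, Long::sum);
            Map.Entry<String, Long> update = Map.entry(word, count);
            changelog.add(update); // record every state change before emitting it
            emitted.add(update);
        }
        return emitted;
    }

    long count(String word) {
        return store.getOrDefault(word, 0L);
    }

    // Recover after a crash by replaying the changelog; the last entry per word wins.
    static LocalStateWordCount restore(List<Map.Entry<String, Long>> changelog) {
        LocalStateWordCount restored = new LocalStateWordCount(new ArrayList<>(changelog));
        for (Map.Entry<String, Long> update : changelog) {
            restored.store.put(update.getKey(), update.getValue());
        }
        return restored;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> changelog = new ArrayList<>();
        LocalStateWordCount app = new LocalStateWordCount(changelog);
        app.process("ETL is dead");
        app.process("long live streams, streams!");
        // Simulate a restart: rebuild the local store from the changelog alone.
        LocalStateWordCount recovered = LocalStateWordCount.restore(changelog);
        System.out.println("streams=" + recovered.count("streams"));
    }
}
```

Replaying the same log from the beginning into a fresh instance is also the basic shape of reprocessing: a new version of the app can rebuild its state from the full history instead of migrating the old state.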
