How-to for real-time streaming and analytics at scale



  1. How-to for real-time streaming and analytics at scale with Apache Kafka and Apache Ignite. Viktor Gamov, Confluent, @gamussa; Denis Magda, GridGain, @denismagda

  2. Digital Transformation Challenges. Application layer (web-scale apps, IoT, mobile apps, social media): 10-100x more queries and transactions (per second); 50x as much data today as a decade ago (big data storage); overnight analytics becomes real-time, 10-1000x faster (hours to seconds). Data layer: RDBMS, NoSQL, Hadoop.

  3. In-Memory Computing and Real-Time Streaming to Solve the Challenges. Performance increases of 10x to 1,000x; act faster by analyzing streams of data; scalability up to petabytes of data. Diagram layers: application layer (web-scale apps, IoT, mobile apps, social media); GridGain in-memory computing platform and Confluent streaming platform; transactional persistence.

  4. Pre-Streaming Era

  5. Streaming-First World

  6. Origins in Stream Processing. Serving layer: Apache Ignite, GridGain, etc. Continuous computation: Java apps with Kafka Streams or KSQL. High-throughput streaming platform. API-based clustering.

  7. Diagram labels: search, stream processing, real-time analytics, DW, RDBMS, KV, apps, monitoring.

  8. Producer and consumer applications. Open questions: Where to restart? How to handle failure and retries? How to scale and parallelize? How to properly use the producer/consumer API? What metrics to capture?
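The "where to restart" and "failure and retries" questions on slide 8 can be illustrated with a toy model. The sketch below uses a plain Python list as a stand-in for one partition's log; `ToyConsumer` and its methods are hypothetical names invented for illustration and are not part of any Kafka client API. It assumes processing failures surface as `ValueError` and that an offset is committed only after a record has been handled successfully.

```python
class ToyConsumer:
    """Hypothetical stand-in for a consumer; not the Kafka client API."""

    def __init__(self, log, committed_offset=0):
        self.log = log                      # stand-in for one partition's records
        self.committed = committed_offset   # next offset to read after a restart

    def poll_and_process(self, handler, max_retries=2):
        """Process records from the committed offset, committing after each success."""
        offset = self.committed
        while offset < len(self.log):
            record = self.log[offset]
            for attempt in range(max_retries + 1):
                try:
                    handler(record)
                    break
                except ValueError:
                    if attempt == max_retries:
                        # Give up: the offset stays uncommitted, so a restart retries it.
                        return self.committed
            offset += 1
            self.committed = offset
        return self.committed


log = ["a", "b", "bad", "c"]
processed = []


def handler(record):
    if record == "bad":
        raise ValueError("cannot process record")
    processed.append(record)


consumer = ToyConsumer(log)
stopped_at = consumer.poll_and_process(handler)

# Simulate repairing the record, then restarting from the committed offset.
log[2] = "fixed"
restarted = ToyConsumer(log, committed_offset=stopped_at)
final_offset = restarted.poll_and_process(handler)
print(stopped_at, final_offset, processed)
```

Committing only after a successful handler call gives at-least-once behaviour in this toy model: a restart resumes at the first record that was never fully processed.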

  9. Kafka Connect. Source side: connector, SMTs, converter (producer). Sink side: converter, SMTs, connector (consumer). Handled by the framework: offset management, task distribution, configuration management, elastic scalability, metrics, parallelization, failure and retries, REST API, schemas and data types.
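To make the connector, SMT, and converter chain from slide 9 concrete, here is a minimal sketch that builds a connector definition of the kind accepted by the Kafka Connect REST API (`POST /connectors`). It uses the file source connector, the `HoistField` and `InsertField` transforms, and the JSON converter that ship with Apache Kafka; the connector name, file path, and topic are made-up values for illustration, and the request is only built and printed, not sent.

```python
import json


def build_connector_request(name, file_path, topic):
    """Build the JSON body for creating a connector (all argument values are illustrative)."""
    config = {
        # Source connector: reads lines from a file.
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": file_path,
        "topic": topic,
        # Converter: serializes record values as schemaless JSON.
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable": "false",
        # SMT chain: wrap each line in a map, then add a static field.
        "transforms": "MakeMap,InsertSource",
        "transforms.MakeMap.type": "org.apache.kafka.connect.transforms.HoistField$Value",
        "transforms.MakeMap.field": "line",
        "transforms.InsertSource.type": "org.apache.kafka.connect.transforms.InsertField$Value",
        "transforms.InsertSource.static.field": "data_source",
        "transforms.InsertSource.static.value": name,
    }
    return {"name": name, "config": config}


request_body = build_connector_request("demo-file-source", "/tmp/demo.txt", "demo-topic")
payload = json.dumps(request_body, indent=2)
print(payload)
```

The point of the sketch is that the pipeline is declared as configuration: no producer or consumer code is written, and the framework takes care of the concerns listed on the slide.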

  10. Discover connectors, SMTs, and converters

  11. Discover connectors, SMTs, and converters: descriptions, licensing, support, and more

  12. Lower the Bar to Enter the World. Coding sophistication versus user population: core developers who use Java/Scala (streams); core developers who don’t use Java/Scala; data engineers, architects, DevOps/SRE; BI analysts.

  13. GridGain and Kafka Connect 💶

  14. GridGain: Real-time Streaming and Analytics

  15. Essential GridGain APIs. Distributed memory-centric storage: combines the performance and scale of in-memory computing with disk durability and strong consistency in one system. Co-located computations: brings the computations to the servers where the data actually resides, eliminating the need to move data over the network. Distributed key-value: read, write, and transact with fast key-value APIs. Distributed SQL: horizontally scalable, fault-tolerant distributed SQL database that treats memory and disk as active storage tiers. ACID transactions: supports distributed ACID transactions for key-value as well as SQL operations. Machine and deep learning: a set of simple, scalable, and efficient tools for building predictive machine learning models without costly data transfers (ETL).

  16. GridGain SQL for Real-Time Analytics. Steps: (1) initial query; (2) query execution over local data on each Ignite node; (3) reduce multiple results into one. Example data in the diagram: Canada (Toronto, Montreal, Ottawa, Calgary) and India (Mumbai, New Delhi).
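The three steps on slide 16 (initial query, local execution, reduce) can be sketched with the Python standard library. In this illustration two in-memory `sqlite3` databases stand in for Ignite nodes; this is not the Ignite or GridGain API, and the way the cities are split across the two "nodes" is an assumption made for the example.

```python
import sqlite3
from collections import Counter


def make_node(rows):
    """Create an in-memory database standing in for one node's local data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE city (name TEXT, country TEXT)")
    conn.executemany("INSERT INTO city VALUES (?, ?)", rows)
    return conn


# Assumed partitioning of the slide's example cities across two nodes.
nodes = [
    make_node([("Toronto", "Canada"), ("Montreal", "Canada"), ("Mumbai", "India")]),
    make_node([("Ottawa", "Canada"), ("Calgary", "Canada"), ("New Delhi", "India")]),
]

# Step 1: the initial query, sent unchanged to every node.
LOCAL_QUERY = "SELECT country, COUNT(*) FROM city GROUP BY country"


def run_distributed_count(nodes):
    totals = Counter()
    for node in nodes:
        # Step 2: each node executes the query over its local data only.
        for country, count in node.execute(LOCAL_QUERY):
            # Step 3: partial results are reduced into one result set.
            totals[country] += count
    return dict(totals)


result = run_distributed_count(nodes)
print(result)
```

Only the small per-node aggregates cross the "network" in this model, which is why the reduce step stays cheap even when the local tables are large.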

  17. Demo

  18. Q&A
