
Cloud Native Data Pipelines with Apache Kafka, Gwen Shapira



  1. Cloud Native Data Pipelines with Apache Kafka. Gwen Shapira, Software Engineer, @gwenshap

  2. What is a Cloud Native Application?

  3. Common ideas: Resilience, Elasticity, Agility, DevOps

  4. You will build Cloud Native Applications from Non Cloud Native components

  5. What do Cloud Native architectures look like?

  6. You Have Microservices

  7. They need to communicate

  8. I know! I’ll use REST APIs. (Diagram: Orders, Validate Order, Returns?, Inventory, Fulfill Order)

  9. But, we forgot something…

  10. The Problem is DATA

  11. Cloud Native Architectures are Different. We need data architectures for cloud. And data is about context and sharing.

  12. Let’s say I have this: ValidateOrder(id, user, product, price, amount..) returns True. (Diagram: Order Service, Validation)

  13. We need Fraud Detection

  14. Option (WARNING: Antipattern): ValidateOrder(id, user, product, price, amount..) returns True. (Diagram: Order Service, Validation, Fraud Service, Alert Service)

  15. Option (WARNING: Antipattern): ValidateOrder(id, user, product, price, amount..) returns True. (Diagram: Order Service, Validation, Fraud Service, Order history Service, credit alerts Service, customer history Service)

  16. What I want is a really smart validator: ValidateOrder(id, user, product, price, amount..) returns True. (Diagram: Order Service, Validation)

  17. Maybe even more than one. (Diagram: Order Service, Proxy, Validation, new Validation)

  18. The challenges: ● Services are really stateful ● Data has history ● Data is shared

  19. Let’s Look at Patterns

  20. Publish Events

  21. Events are not: • Commands • Queries • Requests. Events are: • Things that happened • Notification • Data

  22. Buying an iPad (with REST): - Orders Service calls Shipping Service to tell it to ship the item. - Shipping Service looks up the address to ship to (from Customer Service). (Diagram: Webserver, Submit Order, Customer Service, Orders Service, Shipping Service, shipOrder(), getCustomer())

  23. Using events for Notification: - Orders Service no longer knows about the Shipping Service (or any other service). Events are fire and forget. (Diagram: Webserver, Submit Order, Customer Service, Orders Service, Shipping Service, Notification, REST getCustomer(), Order Created, Event Bus == Kafka)
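The fire-and-forget notification on this slide can be sketched with a tiny in-memory bus. This is only an illustration under stated assumptions: `EventBus` is a hypothetical stand-in for a Kafka topic, not a Kafka client API, and the event fields are made up for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical in-memory stand-in for the event bus (not a Kafka API)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, event: dict) -> None:
        # Fire and forget: the publisher names no consumer and gets nothing back.
        for handler in self._subscribers[event_type]:
            handler(event)


bus = EventBus()
shipped: List[int] = []

# Shipping Service decides on its own to react to "Order Created".
bus.subscribe("OrderCreated", lambda event: shipped.append(event["order"]))

# Orders Service only states what happened; it never calls shipOrder().
bus.publish("OrderCreated", {"order": 1, "product": "ipad"})

print(shipped)  # [1]
```

Adding another consumer later (for example a fraud check) would mean one more `subscribe` call and no change to the Orders Service.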

  24. Using events to share facts: data is replicated. - Call to Customer Service is gone. - Instead, data is replicated, as events, into the Shipping Service, where it is queried locally. (Diagram: Webserver, Submit Order, Customer Service, Orders Service, Shipping Service, Customer Updated, Order Created, Event Bus == Kafka)
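A minimal sketch of the replication idea on this slide: the Shipping Service keeps its own copy of customer addresses, built from "Customer Updated" events, and reads it locally when an order arrives. The class, field names, and sample addresses are assumptions for illustration only.

```python
from typing import Dict, List

# Hypothetical "Customer Updated" events, in arrival order.
customer_updated_events: List[dict] = [
    {"customer": 7, "address": "1 Main St"},
    {"customer": 8, "address": "9 Elm Rd"},
    {"customer": 7, "address": "22 Oak Ave"},  # customer 7 moved
]


class ShippingService:
    def __init__(self) -> None:
        # Local replica of the customer data this service needs.
        self.addresses: Dict[int, str] = {}

    def on_customer_updated(self, event: dict) -> None:
        # Latest event wins for each customer.
        self.addresses[event["customer"]] = event["address"]

    def on_order_created(self, event: dict) -> str:
        # Local lookup: no getCustomer() call to the Customer Service.
        address = self.addresses[event["customer"]]
        return f"ship order {event['order']} to {address}"


shipping = ShippingService()
for ev in customer_updated_events:
    shipping.on_customer_updated(ev)

label = shipping.on_order_created({"order": 1, "customer": 7})
print(label)  # ship order 1 to 22 Oak Ave
```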

  25. Need someone else’s events? Change Data Capture. (Diagram: Mainframe, Kafka Connect, APACHE KAFKA)

  26. Need someone else’s events? Change Data Capture. The statement "update table accounts set total=total+50 where id=600" becomes the event {key=600, old_record={… vip=f, total=300 …}, new_record={… vip=f, total=350, …}}. (Diagram: Database, Debezium, Kafka Connect, APACHE KAFKA)
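One way a consumer might use such a change event is to overwrite its local copy of the row with the new record and compare old and new values. The sketch below follows the field names shown on the slide (`key`, `old_record`, `new_record`); these are taken from the slide, not from the actual Debezium envelope, and `apply_change` is a hypothetical helper.

```python
from typing import Dict

# Change event shaped like the one on the slide (elided fields left out).
change_event = {
    "key": 600,
    "old_record": {"vip": False, "total": 300},
    "new_record": {"vip": False, "total": 350},
}


def apply_change(replica: Dict[int, dict], event: dict) -> int:
    """Apply one change event to a local replica and return the change in total."""
    replica[event["key"]] = dict(event["new_record"])
    return event["new_record"]["total"] - event["old_record"]["total"]


# Local replica of the accounts table before the update arrives.
accounts: Dict[int, dict] = {600: {"vip": False, "total": 300}}
delta = apply_change(accounts, change_event)

print(accounts[600]["total"], delta)  # 350 50
```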

  27. Local state for Microservices

  28. We have a stream of events: event 1: {order:1, product: iphone, status: created}; event 2: {order:1, product: iphone, status: valid}; event 3: {order:2, product: ipad, status: created}; event 4: {order:1, product: iphone, status: shipped}

  29. Store current state: from the same events 1 to 4, keep Order 1 -> iphone, shipped and Order 2 -> ipad, created
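Deriving the current state from the stream amounts to replaying the events in order and keeping the latest value per order. A minimal sketch using the four events from the slides (the function name `current_state` is an assumption):

```python
from typing import Dict, List, Tuple

# The four events from the slides, oldest first.
events: List[dict] = [
    {"order": 1, "product": "iphone", "status": "created"},
    {"order": 1, "product": "iphone", "status": "valid"},
    {"order": 2, "product": "ipad", "status": "created"},
    {"order": 1, "product": "iphone", "status": "shipped"},
]


def current_state(stream: List[dict]) -> Dict[int, Tuple[str, str]]:
    """Replay events in order; the last event seen for each order wins."""
    state: Dict[int, Tuple[str, str]] = {}
    for event in stream:
        state[event["order"]] = (event["product"], event["status"])
    return state


state = current_state(events)
for order, (product, status) in sorted(state.items()):
    print(f"Order {order} -> {product}, {status}")
# Order 1 -> iphone, shipped
# Order 2 -> ipad, created
```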

  30. Duplicate data? - Low risk due to shared event stream - Just the data you need - Sharded with the application

  31. (no text on this slide)

  32. Sharded View. Odd orders: {order:1, product: iphone, status: created}, {order:1, product: iphone, status: valid}, {order:1, product: iphone, status: shipped} give Order 1 -> iphone, shipped. Even orders: {order:2, product: ipad, status: created} gives Order 2 -> ipad, created.
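The sharded view can be sketched by routing each event to a shard by its order id, so each application instance holds only its own slice of the state. The odd/even split mirrors the slide; the routing function `shard_for` is an assumption, and a Kafka producer would by default partition on a hash of the record key rather than on parity.

```python
from typing import Dict, List, Tuple

NUM_SHARDS = 2  # shard 1 holds odd orders, shard 0 holds even orders

events: List[dict] = [
    {"order": 1, "product": "iphone", "status": "created"},
    {"order": 1, "product": "iphone", "status": "valid"},
    {"order": 2, "product": "ipad", "status": "created"},
    {"order": 1, "product": "iphone", "status": "shipped"},
]


def shard_for(order_id: int) -> int:
    """Hypothetical routing rule: all events for one order land on one shard."""
    return order_id % NUM_SHARDS


# One local view per shard, as if each lived inside its own app instance.
views: List[Dict[int, Tuple[str, str]]] = [dict() for _ in range(NUM_SHARDS)]
for event in events:
    view = views[shard_for(event["order"])]
    view[event["order"]] = (event["product"], event["status"])

print(views[1])  # {1: ('iphone', 'shipped')}
print(views[0])  # {2: ('ipad', 'created')}
```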

  33. Better than shared DB: ● The data I need, the way I need it ● Reduced dependencies ● Low latency ● Events are also triggers

  34. select order_id, customer_id, product where total_value>10000 … And also, if you get one like that in the future, execute callback()
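The "query now, and trigger on future matches" idea can be sketched by reusing one predicate both over the existing data and on each new event. Everything here (the sample orders, `on_new_order`, the body of `callback`) is an assumption made for illustration.

```python
from typing import List


def is_big_order(order: dict) -> bool:
    # Same condition as the query: total_value > 10000.
    return order["total_value"] > 10000


def project(order: dict) -> dict:
    # Same columns as the select list.
    return {k: order[k] for k in ("order_id", "customer_id", "product")}


history: List[dict] = [
    {"order_id": 1, "customer_id": 10, "product": "ipad", "total_value": 500},
    {"order_id": 2, "customer_id": 11, "product": "server", "total_value": 25000},
]

# Part 1: answer the query over the data we already have.
matches = [project(o) for o in history if is_big_order(o)]

# Part 2: keep the same predicate around as a trigger for future events.
triggered: List[int] = []


def callback(row: dict) -> None:
    triggered.append(row["order_id"])


def on_new_order(order: dict) -> None:
    history.append(order)
    if is_big_order(order):
        callback(project(order))


on_new_order({"order_id": 3, "customer_id": 12, "product": "case", "total_value": 100})
on_new_order({"order_id": 4, "customer_id": 13, "product": "rack", "total_value": 40000})

print(matches)    # [{'order_id': 2, 'customer_id': 11, 'product': 'server'}]
print(triggered)  # [4]
```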

  35. Reporting Live from Streams of Events

  36. Requirements: ● Aggregated reports ● Combining data from many services ● Updated in real time ● Scalable and resilient

  37. (Diagram: Orders, shipments, customers, Reporter 1, Reporter 2, My Browser)

  38. State Recovery. (Diagram: Instance 1 Trade Stats App, Instance 2 Trade Stats App, restore, Changelog Topic)
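State recovery from a changelog can be sketched as follows: every state update is also appended to a changelog, and a replacement instance rebuilds its state by replaying that log before it processes new input. The list below is a hypothetical stand-in for the changelog topic, and the trade symbols and per-symbol counting are assumptions, not details from the slide.

```python
from typing import Dict, List, Tuple

# Stand-in for the changelog topic: (symbol, latest count) records.
changelog: List[Tuple[str, int]] = []


class TradeStatsApp:
    def __init__(self, log: List[Tuple[str, int]]) -> None:
        self.log = log
        self.counts: Dict[str, int] = {}

    def on_trade(self, symbol: str) -> None:
        # Update local state, then record the new value in the changelog.
        self.counts[symbol] = self.counts.get(symbol, 0) + 1
        self.log.append((symbol, self.counts[symbol]))

    def restore(self) -> None:
        # Replay the changelog; the last record per symbol wins.
        self.counts = {}
        for symbol, count in self.log:
            self.counts[symbol] = count


instance1 = TradeStatsApp(changelog)
for symbol in ["AAPL", "MSFT", "AAPL"]:
    instance1.on_trade(symbol)

# Instance 1 is lost; instance 2 takes over with empty local state.
instance2 = TradeStatsApp(changelog)
instance2.restore()
instance2.on_trade("AAPL")

print(instance2.counts)  # {'AAPL': 3, 'MSFT': 1}
```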

  39. 3-layer data model

  40. Who controls the data format? ● Publishers? ● Consumers? ● How do we share events? (Diagram: Producer, Raw events, pre-processor, Clean Standard events, Integrator, Enriched Aggregated events, Consumer, Consumer)

  41. In the Event Streaming World, Event Schemas ARE the API

  42. Take Away Points!

  43. Remember This: ● As you design cloud-native architectures, don’t forget the data ● Publish events ● Build views and reports from events ● Be nice to each other

  44. Orchestration vs Choreography

  45. Orchestration: One Service to Rule them all. Orchestrator: Step 1; If success: Step 2; Else: Step 3; Finally: Step 4. (Diagram: Step 1, Step 2, Step 3, Step 4)

  46. Choreography: We react to each other. (Diagram: Step 1, Step 2, Step 3, Step 4; events: Orders, Success, Fail, Shipped, emailed)
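The contrast between the two styles can be sketched side by side. In the orchestrated version one function owns the whole flow; in the choreographed version each step only reacts to an event and emits the next one. The wiring of events to steps below (Orders triggers Step 1, Success triggers Step 2, Fail triggers Step 3, Shipped and emailed trigger Step 4) is an assumption read off the diagram labels, not something the slides state.

```python
from typing import Callable, Dict, List


def orchestrate(step1_ok: bool) -> List[str]:
    """Orchestration: a single orchestrator decides which step runs next."""
    trace: List[str] = []
    try:
        trace.append("Step 1")
        if step1_ok:
            trace.append("Step 2")
        else:
            trace.append("Step 3")
    finally:
        trace.append("Step 4")
    return trace


def choreograph(step1_ok: bool) -> List[str]:
    """Choreography: steps subscribe to events; nobody owns the overall flow."""
    trace: List[str] = []
    handlers: Dict[str, List[Callable[[], None]]] = {}

    def emit(event: str) -> None:
        for handler in handlers.get(event, []):
            handler()

    def step1() -> None:
        trace.append("Step 1")
        emit("Success" if step1_ok else "Fail")

    def step2() -> None:
        trace.append("Step 2")
        emit("Shipped")

    def step3() -> None:
        trace.append("Step 3")
        emit("emailed")

    def step4() -> None:
        trace.append("Step 4")

    # Assumed wiring of events to the steps that react to them.
    handlers.update({
        "Orders": [step1],
        "Success": [step2],
        "Fail": [step3],
        "Shipped": [step4],
        "emailed": [step4],
    })
    emit("Orders")
    return trace


print(orchestrate(True))    # ['Step 1', 'Step 2', 'Step 4']
print(choreograph(False))   # ['Step 1', 'Step 3', 'Step 4']
```

Both versions run the same steps in the same order; the difference is where the knowledge of the flow lives.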
