A Social Network on Streams



  1. DRIVETRIBE ENGINEERING: A SOCIAL NETWORK ON STREAMS

  2. DRIVETRIBE ▸ The world's biggest motoring community. ▸ A social platform for petrolheads. ▸ By Clarkson, Hammond and May.

  3. DRIVETRIBE ▸ A content destination at the core. ▸ Users consume feeds of content: images, videos, long-form articles. ▸ Content is organised in homogeneous categories called “tribes”. ▸ Different users have different interests, and the tribe model lets them mix and match at will.

  4. DRIVETRIBE ARTICLE ▸ Single article by James May. ▸ Contains a plethora of content and engagement information. ▸ What do we need to compute an aggregate like this?

  5. DRIVETRIBE ARTICLE ▸ getUser(id: Id[User])

  6. DRIVETRIBE ARTICLE ▸ getUser(id: Id[User]) ▸ getTribe(id: Id[Tribe])

  7. DRIVETRIBE ARTICLE ▸ getUser(id: Id[User]) ▸ getTribe(id: Id[Tribe]) ▸ getArticle(id: Id[Article])

  8. DRIVETRIBE ARTICLE ▸ getUser(id: Id[User]) ▸ getTribe(id: Id[Tribe]) ▸ getArticle(id: Id[Article]) ▸ countViews(id: Id[Article])

  9. DRIVETRIBE ARTICLE ▸ getUser(id: Id[User]) ▸ getTribe(id: Id[Tribe]) ▸ getArticle(id: Id[Article]) ▸ countViews(id: Id[Article]) ▸ countComments(id: Id[Article])

  10. DRIVETRIBE ARTICLE ▸ getUser(id: Id[User]) ▸ getTribe(id: Id[Tribe]) ▸ getArticle(id: Id[Article]) ▸ countViews(id: Id[Article]) ▸ countComments(id: Id[Article]) ▸ countBumps(id: Id[Article])

  11. DRIVETRIBE FEED OF ARTICLES ▸ rankArticles(forUserId).flatMap { a => … } ▸ getUser(id: Id[User]) ▸ getTribe(id: Id[Tribe]) ▸ getArticle(id: Id[Article]) ▸ countViews(id: Id[Article]) ▸ …

  12. QUINTESSENTIAL PREREQUISITES ▸ Scalable. Jeremy Clarkson has 7.2M Twitter followers. We cannot hack it together now and worry about scale later. ▸ Performant. Low latency is key, and mobile networks add quite a bit of it. ▸ Flexible. Almost nobody gets it right the first time around. The ability to iterate is paramount. ▸ Maintainable. Spaghetti code works like interest on debt.

  13. THREE TIER APPROACH ▸ Clients interact with a fleet of stateless servers (aka “API” servers or “Backend”) via HTTP (which is stateless). ▸ Global shared mutable state (aka the Database). ▸ Starting simple: store data in a DB. ▸ Starting simple: compute the aggregated views on the fly.

  14. DRIVETRIBE ARTICLE ▸ getUser(id: Id[User]) ▸ getTribe(id: Id[Tribe]) ▸ getArticle(id: Id[Article]) ▸ countComments(id: Id[Article]) ▸ countBumps(id: Id[Article]) ▸ countViews(id: Id[Article])

  15. READ TIME AGGREGATION ▸ (6 queries per item) × (Y items per page). ▸ Plus the cost of ranking and personalisation. ▸ That is a lot of work at read time. ▸ Slow. Not really Performant.
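
The query count above can be illustrated with a minimal sketch. Only the six accessor names come from the slides; the `Id`, `Counts`, `ArticleView` and `Db` types are hypothetical stand-ins, and the in-memory `Db` exists purely to count how many queries a page of articles costs.

```scala
object ReadTimeAggregation {
  // Hypothetical, simplified types: the slides only name the accessor functions.
  final case class Id[A](value: String)
  final case class User(id: Id[User], name: String)
  final case class Tribe(id: Id[Tribe], name: String)
  final case class Article(id: Id[Article], author: Id[User], tribe: Id[Tribe], title: String)
  final case class Counts(views: Long, comments: Long, bumps: Long)
  final case class ArticleView(article: Article, author: User, tribe: Tribe, counts: Counts)

  // In-memory stand-in for the database; it counts every query it serves.
  final class Db(
      users: Map[Id[User], User],
      tribes: Map[Id[Tribe], Tribe],
      articles: Map[Id[Article], Article],
      counts: Map[Id[Article], Counts]
  ) {
    var queries: Int = 0
    private def query[A](result: => A): A = { queries += 1; result }
    private def countsFor(id: Id[Article]): Counts = counts.getOrElse(id, Counts(0, 0, 0))

    def getUser(id: Id[User]): Option[User] = query(users.get(id))
    def getTribe(id: Id[Tribe]): Option[Tribe] = query(tribes.get(id))
    def getArticle(id: Id[Article]): Option[Article] = query(articles.get(id))
    def countViews(id: Id[Article]): Long = query(countsFor(id).views)
    def countComments(id: Id[Article]): Long = query(countsFor(id).comments)
    def countBumps(id: Id[Article]): Long = query(countsFor(id).bumps)
  }

  // Assemble one aggregated article at read time: six queries when everything exists.
  def articleView(db: Db, id: Id[Article]): Option[ArticleView] =
    for {
      article <- db.getArticle(id)
      author  <- db.getUser(article.author)
      tribe   <- db.getTribe(article.tribe)
    } yield ArticleView(article, author, tribe,
      Counts(db.countViews(id), db.countComments(id), db.countBumps(id)))

  // A page of Y articles therefore costs 6 x Y queries, before any ranking.
  def feed(db: Db, ids: List[Id[Article]]): List[ArticleView] =
    ids.flatMap(id => articleView(db, id))
}
```

With two articles on a page, `db.queries` ends at 12 in this sketch, which is the multiplication the slide warns about.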

  16. WRITE TIME AGGREGATION ▸ Compute the aggregation at write time. ▸ Then a single query can fetch all the views at once. That scales.

  20. WRITE TIME AGGREGATION - EVOLUTION ▸ sendNotification

  21. WRITE TIME AGGREGATION - EVOLUTION ▸ sendNotification ▸ updateUserStats

  22. WRITE TIME AGGREGATION - EVOLUTION ▸ sendNotification ▸ updateUserStats ▸ What if we have a cache? ▸ Or a different database for search?

  23. WRITE TIME AGGREGATION ▸ A simple user action triggers a potentially endless sequence of side effects. ▸ Most of them need network IO. ▸ Many of them can fail.

  24. ATOMICITY? ▸ What happens if one of them fails? What happens if the server fails in the middle? ▸ We may have transaction support in the DB, but what about external systems? ▸ Inconsistent.
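
The atomicity problem can be made concrete with a small sketch. `sendNotification` and `updateUserStats` are named on the slides; the `Systems` class, `incrementBumps` and `handleBump` are hypothetical, and the mutable counters merely stand in for the database and an external notification service.

```scala
import scala.util.Try

object WriteTimeSideEffects {
  // Hypothetical in-memory stand-ins for the database and an external service.
  final class Systems(val notificationsUp: Boolean) {
    var bumpCount = 0
    var userStats = 0
    var notificationsSent = 0

    def incrementBumps(): Unit = bumpCount += 1

    // Simulates network IO that can fail.
    def sendNotification(): Unit =
      if (notificationsUp) notificationsSent += 1
      else throw new RuntimeException("notification service unavailable")

    def updateUserStats(): Unit = userStats += 1
  }

  // Handle a "bump" by running every side effect in sequence at write time.
  // There is no transaction spanning the three systems.
  def handleBump(s: Systems): Try[Unit] = Try {
    s.incrementBumps()
    s.sendNotification()
    s.updateUserStats()
  }
}
```

If the notification call fails, the bump count has already been incremented while the user stats have not, so the stores disagree until someone repairs them by hand.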

  25. CONCURRENCY? ▸ Concurrent mutations on a global shared state entail race conditions. ▸ State mutations are destructive and cannot be (easily) undone. ▸ A bug can corrupt the data permanently.

  26. ITALIAN PASTA: EXTENSIBILITY? ▸ Model evolution becomes difficult. Reads and writes are tightly coupled. ▸ Migrations are scary. ▸ This is neither Extensible nor Maintainable.

  27. DIFFERENT APPROACH ▸ Let’s take a step back and try to decouple things. ▸ Clients send events to the API: “John liked Jeremy’s post”, “Maria updated her profile”. ▸ Events are immutable. They capture a user action at some point in time. ▸ Every application state instance can be modelled as a projection of those events.

  28. ▸ Persisting those events yields an append-only log. ▸ An event reducer can then produce application state instances. ▸ Even retroactively, because the log is immutable. ▸ This is event sourcing.
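
A minimal sketch of an event reducer, assuming hypothetical event and state types (none of these names come from the slides): state is a left fold over the log, and a projection added later can be computed from the same immutable log.

```scala
object EventSourcingSketch {
  // Hypothetical events; each one records a user action and is never mutated.
  sealed trait Event
  final case class ArticlePosted(articleId: String, authorId: String) extends Event
  final case class ArticleViewed(articleId: String, userId: String) extends Event
  final case class ArticleBumped(articleId: String, userId: String) extends Event

  final case class ArticleStats(views: Long = 0, bumps: Long = 0)
  type State = Map[String, ArticleStats]

  private def update(state: State, id: String)(f: ArticleStats => ArticleStats): State =
    state.updated(id, f(state.getOrElse(id, ArticleStats())))

  // The reducer: (current state, next event) => next state.
  def reduce(state: State, event: Event): State = event match {
    case ArticlePosted(id, _) => state.updated(id, ArticleStats())
    case ArticleViewed(id, _) => update(state, id)(s => s.copy(views = s.views + 1))
    case ArticleBumped(id, _) => update(state, id)(s => s.copy(bumps = s.bumps + 1))
  }

  // Application state is a projection of the whole log.
  def replay(log: Seq[Event]): State =
    log.foldLeft(Map.empty[String, ArticleStats])(reduce)

  // A projection introduced later still sees every past event.
  def bumpsByUser(log: Seq[Event]): Map[String, Long] =
    log.collect { case ArticleBumped(_, user) => user }
      .groupBy(identity)
      .map { case (user, bumps) => user -> bumps.size.toLong }
}
```

Because nothing is overwritten, fixing a bug in `reduce` only requires replaying the log with the corrected reducer.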

  29. ▸ The write-time model (command model) and the read-time model (query model) can be separated. ▸ Decoupling the two models opens the door to more efficient, custom implementations. ▸ This is known as Command Query Responsibility Segregation, aka CQRS.
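
The separation can be sketched as two small classes, assuming an in-memory buffer in place of a real log. `CommandSide`, `QuerySide` and `catchUp` are hypothetical names chosen for illustration, not taken from the talk.

```scala
import scala.collection.mutable

object CqrsSketch {
  final case class Bumped(articleId: String, userId: String)

  // Command model: it only appends events to the log and never reads views.
  final class CommandSide(log: mutable.ArrayBuffer[Bumped]) {
    def bump(articleId: String, userId: String): Unit = {
      log += Bumped(articleId, userId)
      ()
    }
  }

  // Query model: a read-optimised view, updated separately from the writes.
  final class QuerySide {
    private val bumps = mutable.Map.empty[String, Long].withDefaultValue(0L)
    private var offset = 0

    // Apply every event that has not been seen yet.
    def catchUp(log: collection.Seq[Bumped]): Unit = {
      log.drop(offset).foreach(e => bumps(e.articleId) += 1)
      offset = log.size
    }

    def countBumps(articleId: String): Long = bumps(articleId)
  }
}
```

The view lags behind the log until `catchUp` runs, so in this sketch reads are cheap but only eventually consistent with the writes.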

  30. EVENT SOURCING APPROACH [Diagram labels: Like!!]

  31. EVENT SOURCING APPROACH [Diagram labels: Like!!, Store Like Event]

  32. EVENT SOURCING APPROACH [Diagram labels: Like!!, Store Like Event, ArticleStatsReducer]

  33. EVENT SOURCING APPROACH [Diagram labels: Like!!, Store Like Event, ArticleStatsReducer, NotificationReducer]

  34. EVENT SOURCING APPROACH [Diagram labels: Like!!, Store Like Event, ArticleStatsReducer, NotificationReducer, UserStatsReducer]

  35. EVENT SOURCING APPROACH [Diagram labels: Like!!, Store Like Event, ArticleStatsReducer, NotificationReducer, UserStatsReducer, And so on...]

  36. EVENT SOURCING APPROACH [Diagram labels: Like!!, Store Like Event, ArticleStatsReducer, NotificationReducer, UserStatsReducer, Get Articles, Sky is the limit]

  37. EVENT SOURCING APPROACH [Same diagram, with the callout: Extensibility? Maintainability?]

  38. EVENT SOURCING APPROACH [Same diagram, with the callout: Performance?]

  39. EVENT SOURCING APPROACH [Same diagram, with the callout: Atomicity?]

  40. EVENT SOURCING APPROACH [Same diagram, with the callout: Concurrency?]

  41. IMPLEMENTATION?

  42. WE NEED A LOG

  43. APACHE KAFKA ▸ Distributed, fault-tolerant, durable and fast append-only log. ▸ Can scale to thousands of nodes, producers and consumers. ▸ Each business event type can be stored in its own topic.

  44. WE NEED A STREAM PROCESSOR

  45. APACHE FLINK ▸ Scalable, performant, mature. ▸ Elegant high-level APIs in Scala. ▸ Powerful low-level APIs for advanced tuning. ▸ Multiple battle-tested integrations. ▸ Very nice and active community.

  46. WE NEED A DATASTORE

  47. ELASTICSEARCH ▸ Horizontally scalable document store. ▸ Rich and expressive query language. ▸ Dispensable. Can be replaced.

  48. WE NEED AN API

  49. AKKA HTTP ▸ Asynchronous web application framework. ▸ Written in Scala. ▸ Very expressive routing DSL. ▸ Any modern web application framework would do.

  50. EVENT SOURCING IN PRACTICE [Diagram labels: Store raw events, Consume raw events, Produce aggregated views, Retrieve aggregated views]

  51. EVENT SOURCING IN PRACTICE [Diagram labels: Store raw events, Consume raw events, Stateful, Produce aggregated views, Retrieve aggregated views]

  52. EVENT SOURCING IN PRACTICE [Diagram labels: Store raw events, Consume raw events, Produce aggregated views, Retrieve aggregated views]

  53. BLUE/GREEN APPROACH [Diagram label: MIRROR]

  54. A REAL WORLD EXAMPLE

  55. COUNTING BUMPS ▸ Thousands of people like the fact that Jeremy Clarkson is a really tall guy. ▸ Users can “bump” a post if they like it.

  56. EVENT SOURCING IN PRACTICE [Diagram labels: Store raw events, Consume raw events, Produce aggregated views, Retrieve aggregated views]

  57. COUNTING BUMPS

  58. COUNTING BUMPS - FIRST ATTEMPT

  59. COUNTING BUMPS - FIRST ATTEMPT

  60. COUNTING BUMPS - FIRST ATTEMPT ▸ Use Flink with at-least-once semantics.

  61. COUNTING BUMPS - FIRST ATTEMPT ▸ Use Flink with at-least-once semantics. ▸ Our system is eventually consistent.
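
Under at-least-once delivery an event may be processed more than once, for example when a failed job replays part of the log. The sketch below is plain Scala, not Flink code, and the `eventId` field is an assumption introduced for illustration; the slides shown here do not say how duplicates are handled. It only contrasts a naive counter with one that is idempotent with respect to redelivery.

```scala
object AtLeastOnceSketch {
  // Hypothetical event shape: eventId uniquely identifies one user action.
  final case class Bump(eventId: String, articleId: String)

  // Naive counting: every delivery increments the counter, duplicates included.
  def naiveCount(delivered: Seq[Bump]): Map[String, Int] =
    delivered.groupBy(_.articleId).map { case (article, events) =>
      article -> events.size
    }

  // Idempotent counting: a redelivered event does not change the result,
  // because only distinct event ids are counted.
  def idempotentCount(delivered: Seq[Bump]): Map[String, Int] =
    delivered.groupBy(_.articleId).map { case (article, events) =>
      article -> events.map(_.eventId).distinct.size
    }
}
```

Given two bumps where one is delivered twice, the naive counter reports three and the idempotent one reports two.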
