
The Architecture of Uber's Realtime System
March 25, 2015
Amos Barreto (@amos_barreto), Danny Yuan (@g9yuayon)


  1. The Architecture of Uber's Realtime System March 25, 2015 Amos Barreto Danny Yuan @amos_barreto @g9yuayon

  2. What is Uber

  3. Uber is a transportation platform

  4. Uber connects riders to drivers

  5. Transportation at your fingertips

  6. What is Realtime?

  7. It’s the brain of Uber’s logistics platform

  8. It assigns drivers to riders [diagram: driver, riders]

  9. It balances driver & rider satisfaction

  10. Sounds pretty simple, right?

  11. Not Really [diagram: Realtime Analytics]

  15. We didn’t start with this

  16. Instead…

  17. The Beginning • PHP application • Transactions appended to flat files • Half of the code in Español • Lifespan: 6-9 months

  18. But we had to evolve

  19. So we built a distributed state machine

  20. We tried to follow good practices

  21. Simulator is very helpful

  22. Extensive instrumentation

  23. Asynchronous State Machines

  24. Scaling Horizontally [diagram: one worker (W)]

  25. Scaling Horizontally [diagram: five workers (W)]

  26. Stateless Workers [diagram: five workers (W)]

  27. Stateless Workers [diagram: HAProxy in front of five workers (W)]

  28. Shared States [diagram: HAProxy in front of five workers (W)]

  29. Shared States [diagram: HAProxy, five workers (W), Redis with Twemproxy]

  30. Everything can go down [diagram: HAProxy, five workers (W), Redis with Twemproxy]

  31. Everything can go down [diagram: HAProxy, five workers (W), Redis with Twemproxy, Riak]

  32. There’s a tougher problem to solve

  33. Dispatch needs coordination • Trip states happen in partial order • Rider and driver states may need to be synchronized
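
The "partial order" of trip states can be pictured as a table of allowed transitions. The sketch below is only an illustration: `driver_accepted` and `driver_arrived` appear later in the deck, while every other state name and the `transition` helper are hypothetical.

```javascript
// Allowed transitions between trip states. Only driver_accepted and
// driver_arrived come from the talk; the other state names are assumptions.
const TRANSITIONS = new Map([
  ['requested', ['driver_accepted', 'cancelled']],
  ['driver_accepted', ['driver_arrived', 'cancelled']],
  ['driver_arrived', ['trip_started', 'cancelled']],
  ['trip_started', ['trip_completed']],
  ['trip_completed', []],
  ['cancelled', []],
]);

// Returns the new state, or throws if the move violates the allowed order.
function transition(current, next) {
  const allowed = TRANSITIONS.get(current);
  if (!allowed) {
    throw new Error('unknown state: ' + current);
  }
  if (!allowed.includes(next)) {
    throw new Error('illegal transition: ' + current + ' -> ' + next);
  }
  return next;
}
```

A request that arrives out of order, for example `driver_arrived` for a trip that is still `requested`, is rejected rather than applied, which is why all requests for one trip need to be executed in order.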

  34. Coordination is hard: “Minimizing coordination, or blocking communication between concurrently executing operations, is key to maximizing scalability, availability, and high performance in database systems.” (Coordination Avoidance in Database Systems)

  35. Coordination is hard • Row lock? Distributed lock? Consensus protocols? • Insight: What we really need is ordered execution • Solution: Assign requests with the same user-defined key to the same stateless worker node

  36. What about load balance? [diagram: five workers (W)]

  37. What about load balance? [diagram: HAProxy in front of five workers (W)]

  40. What about load balance? Consistent Hashing to the rescue
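
A minimal consistent hash ring, to show why the same key keeps landing on the same worker and why losing one worker moves only that worker's keys. This is a sketch under assumptions, not Ringpop's implementation: the MD5-based `hash32`, the `HashRing` class, and the default of 100 points per worker are all chosen here for illustration.

```javascript
const crypto = require('crypto');

// Map a string to a 32-bit unsigned integer position on the ring.
function hash32(value) {
  return crypto.createHash('md5').update(value).digest().readUInt32BE(0);
}

class HashRing {
  constructor(replicas = 100) {
    this.replicas = replicas; // points per worker, to even out the load
    this.points = [];         // sorted array of { position, worker }
  }

  addWorker(worker) {
    for (let i = 0; i < this.replicas; i++) {
      this.points.push({ position: hash32(worker + '#' + i), worker });
    }
    this.points.sort((a, b) => a.position - b.position);
  }

  removeWorker(worker) {
    this.points = this.points.filter((p) => p.worker !== worker);
  }

  // Find the first point at or after the key's position, wrapping around.
  lookup(key) {
    if (this.points.length === 0) return null;
    const position = hash32(key);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].position < position) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].worker;
  }
}
```

With workers added through `addWorker`, `lookup('some-rider-id')` returns the same worker on every call until membership changes, and removing a worker reassigns only the keys that worker owned.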

  42. Membership changes • A worker in a cluster can crash • A worker can join a cluster • We need a fast and reliable failure detector and fast, reliable membership updates

  43. Key Insights • Separate failure detection from membership updates • Do not rely on a single peer for failure detection • Membership changes via gossip-like protocols

  44. SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol
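
The idea of not relying on a single peer for failure detection can be sketched as one SWIM-style probe: ping the target directly, and if that fails, ask a few other members to try before suspecting it. The `probe` function and the injected `ping(from, to)` transport are hypothetical, and the model is simplified: in SWIM the prober sends a ping-req to each helper, which is collapsed here into `ping(helper, target)`.

```javascript
// One SWIM-style probe of `target`, as seen from `self`.
// `ping(from, to)` is a hypothetical transport function that resolves to
// true when an ack arrives in time and to false otherwise.
async function probe(self, target, members, ping, k = 3) {
  // 1. Direct probe.
  if (await ping(self, target)) return 'alive';

  // 2. Indirect probe: ask up to k other members to reach the target, so
  //    that one bad link between self and target is not enough to declare
  //    a failure. (A real implementation would pick helpers at random.)
  const helpers = members
    .filter((m) => m !== self && m !== target)
    .slice(0, k);
  const acks = await Promise.all(helpers.map((h) => ping(h, target)));
  if (acks.some(Boolean)) return 'alive';

  // 3. Nobody reached the target: mark it as suspect. The change would then
  //    be spread by piggybacking on later protocol messages (gossip).
  return 'suspect';
}
```

Failure detection (the probe) and dissemination (the gossip mentioned in step 3) stay separate, which matches the first key insight on the previous slide.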

  45. Ringpop • Open source • Hash ring abstraction • Implements a variation of SWIM • Flap damping • Use checksums to verify correctness of ring state • Proxying capabilities

  46. Ringpop
      var ringpop = new RingPop({
        app: 'myapp',
        hostPort: 'myhost:30000'
      });
      ringpop.bootstrap(['myhost:30001', 'myhost2:30000']);
      ringpop.on('ready', function() {
        // do something
      });
      var node = ringpop.lookup('[unique-request-id]');
      if (node === ringpop.whoami()) {
        // process request
      } else {
        // forward request
      }

  47. Ringpop Serial • Simple ringpop wrapper • Requests are queued by key • Processed serially, one at a time • Emulates transactions
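
Queuing requests by key and processing them one at a time can be sketched with a promise chain per key. This is not the Ringpop Serial code; the `SerialQueue` class and its `enqueue` method are hypothetical names used to illustrate the behaviour listed on the slide.

```javascript
// Requests with the same key run one at a time, in arrival order.
// Requests with different keys do not wait on each other.
class SerialQueue {
  constructor() {
    this.tails = new Map(); // key -> promise of the last queued task
  }

  // `task` is a function that returns a value or a promise.
  enqueue(key, task) {
    const previous = this.tails.get(key) || Promise.resolve();
    // Start the task once the previous one has settled, even if it failed.
    const current = previous.catch(() => {}).then(() => task());
    this.tails.set(key, current);
    // Drop the entry once the queue for this key has drained.
    const cleanup = () => {
      if (this.tails.get(key) === current) this.tails.delete(key);
    };
    current.then(cleanup, cleanup);
    return current;
  }
}
```

Because each task sees the effects of every earlier task for the same key, a read-modify-write on one trip behaves like a small transaction, as long as all requests for that key reach the same worker.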

  48. Transactions

  49. Transactions • Conflicts are possible during membership changes

  50. Transactions • Conflicts are possible during membership changes • Need smart application level conflict resolution

  51. Global Geospatial Index

  52. Global Geospatial Index • High volume of location updates • Mild, but expensive, query volume • Large search space (the world)
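
The deck does not say how the index is built. As one possible illustration of the trade-off on this slide (cheap, frequent updates against rarer, more expensive queries), here is a fixed-grid index. Everything in it is an assumption: the `GridIndex` class, the `CELL_DEGREES` cell size, and the choice of a flat latitude/longitude grid, which ignores wrap-around at the antimeridian and distortion near the poles.

```javascript
const CELL_DEGREES = 0.01; // hypothetical cell size, about 1.1 km of latitude

function cellCoords(lat, lng) {
  return [Math.floor(lat / CELL_DEGREES), Math.floor(lng / CELL_DEGREES)];
}

class GridIndex {
  constructor() {
    this.cells = new Map();    // cell id -> Set of driver ids
    this.location = new Map(); // driver id -> cell id
  }

  // A location update only moves the driver between two sets.
  update(driverId, lat, lng) {
    const next = cellCoords(lat, lng).join(':');
    const prev = this.location.get(driverId);
    if (prev === next) return;
    if (prev !== undefined) {
      const members = this.cells.get(prev);
      members.delete(driverId);
      if (members.size === 0) this.cells.delete(prev);
    }
    if (!this.cells.has(next)) this.cells.set(next, new Set());
    this.cells.get(next).add(driverId);
    this.location.set(driverId, next);
  }

  // A query scans the cell containing the point and its eight neighbours,
  // instead of the whole search space.
  nearby(lat, lng) {
    const [cx, cy] = cellCoords(lat, lng);
    const found = [];
    for (let dx = -1; dx <= 1; dx++) {
      for (let dy = -1; dy <= 1; dy++) {
        const members = this.cells.get((cx + dx) + ':' + (cy + dy));
        if (members) found.push(...members);
      }
    }
    return found.sort();
  }
}
```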

  53. What about crash recovery?

  54. Sevnup • Open sourced node.js module • Ringpop extension • Key ownership hand-off • Customizable recovery & release • Pluggable persistence layer

  55. [diagram: keys (key432, key988, key654) map to virtual nodes (1 2 3 4 5 …, 1024 in total), which map to the nodes A, B, C, D of the Ringpop application cluster]
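
The two-level mapping in the diagram (key to virtual node, virtual node to cluster node) can be sketched as below. Only the figure of 1024 virtual nodes and the key and node names come from the slide. The hash function, the round-robin `ownerOf` stand-in for a ring lookup, and the in-memory `keysByVnode` map standing in for a persistence layer are assumptions, not Sevnup's actual code.

```javascript
const crypto = require('crypto');

const VNODE_COUNT = 1024; // number of virtual nodes shown on the slide

// Hash a key to one of the virtual nodes.
function vnodeFor(key) {
  const h = crypto.createHash('md5').update(key).digest().readUInt32BE(0);
  return h % VNODE_COUNT;
}

// Hypothetical stand-in for the ring lookup: virtual nodes are assigned to
// cluster nodes round-robin. A hash ring lookup would go here instead.
function ownerOf(vnode, nodes) {
  return nodes[vnode % nodes.length];
}

// Hypothetical persistence layer: virtual node -> set of keys in flight.
const keysByVnode = new Map();

function trackKey(key) {
  const vnode = vnodeFor(key);
  if (!keysByVnode.has(vnode)) keysByVnode.set(vnode, new Set());
  keysByVnode.get(vnode).add(key);
  return vnode;
}

// After a membership change, a node asks which tracked keys it now owns so
// that it can recover them.
function keysOwnedBy(node, nodes) {
  const owned = [];
  for (const [vnode, keys] of keysByVnode) {
    if (ownerOf(vnode, nodes) === node) owned.push(...keys);
  }
  return owned.sort();
}
```

Since keys are recorded per virtual node rather than per cluster node, a crash of node D only changes which node answers for D's virtual nodes; the recorded keys themselves do not move.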

  57. Reliable Timers • Node.js offers in-memory timers • Use sevnup to make them reliable • Riak as persistence layer
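
One way to read "reliable timers" is: persist each timer before arming the in-memory one, and re-arm from the persisted records after a crash or hand-off. The sketch below assumes that reading. The `schedule`, `recover`, and `cancel` helpers are hypothetical, and a `Map` stands in for the Riak persistence layer named on the slide.

```javascript
// Hypothetical persistence layer; the talk uses Riak, a Map stands in here.
const store = new Map(); // timerId -> { fireAt, payload }

// Arm an in-memory timer for a persisted record.
function arm(timerId, record, onFire, now = Date.now()) {
  const remaining = Math.max(0, record.fireAt - now);
  setTimeout(() => {
    // Fire only if the timer is still persisted, that is, it has not been
    // cancelled or already fired through another armed copy.
    if (!store.has(timerId)) return;
    store.delete(timerId);
    onFire(timerId, record.payload);
  }, remaining);
}

// Persist the timer first, then arm it in memory.
function schedule(timerId, delayMs, payload, onFire, now = Date.now()) {
  const record = { fireAt: now + delayMs, payload };
  store.set(timerId, record);
  arm(timerId, record, onFire, now);
}

// After a crash or an ownership hand-off, the new owner re-arms every
// persisted timer. Overdue timers fire as soon as possible.
function recover(onFire, now = Date.now()) {
  for (const [timerId, record] of store) {
    arm(timerId, record, onFire, now);
  }
}

function cancel(timerId) {
  store.delete(timerId);
}
```

In a multi-node setup the check-and-delete inside `arm` would have to be atomic in the real store; with the in-memory `Map` it is safe because Node.js runs callbacks on a single thread.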

  58. Data Center Failure • How do we replicate trip data? • Constant updates • Write-heavy • Temporal, and minimal loss expected

  59. Realtime Trip Replication (RTTR) Key insight: each driver application has trip data already

  60. RTTR • A key-value store on the phone • A time-series store for partner GPS points on the phone • Piggyback on existing communication protocols • All data encrypted

  61. Data in Realtime [diagram: Realtime Analytics]

  62. Ops need realtime analytics

  66. Dispatch needs data for decisions [diagram: Realtime Analytics]

  69. Dispatch needs data for decisions

  70. Applications need real-time data

  71. Applications need real-time data • Notification

  72. Applications need real-time data • Notification • Marketing

  73. Applications need real-time data • Notification • Marketing • Fraud detection

  74. Dispatch can’t do everything

  75. But we use data to empower people

  76. An event-based data platform • Example event: state: driver_arrived, from_state: driver_accepted, timestamp: 13244323342, latitude: 12.23, longitude: 30.00

  77. An event-based data platform

  78. An event-based data platform • Reliable replication of states

  79. An event-based data platform • Reliable replication of states • Canonical state representation

  80. An event-based data platform • Reliable replication of states • Canonical state representation • Domain specific APIs

  81. Reliable replication of states

  84. Canonical representation of states

  85. Canonical representation of states • Consistency matters

  86. Canonical representation of states • Consistency matters • Normalize your events if possible. E.g., no PII

  87. Canonical representation of states • Consistency matters • Normalize your events if possible. E.g., no PII • More generally: keep apps robust by minimizing assumptions
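
Normalizing events into one canonical shape can be sketched as a whitelist of fields. The field names follow the example event shown earlier in the deck; the `normalize` function, the numeric coercion, and the `rider_name` field used as an example of PII are assumptions made for illustration.

```javascript
// Fields of the canonical event, following the example event in the talk.
const CANONICAL_FIELDS = ['state', 'from_state', 'timestamp', 'latitude', 'longitude'];

// Turn a raw event into the canonical shape: keep only whitelisted fields
// (so PII such as a hypothetical rider_name never reaches consumers),
// coerce numeric fields to numbers, and reject events with missing fields.
function normalize(raw) {
  const event = {};
  for (const field of CANONICAL_FIELDS) {
    if (raw[field] === undefined || raw[field] === null) {
      throw new Error('missing field: ' + field);
    }
    event[field] = raw[field];
  }
  event.timestamp = Number(event.timestamp);
  event.latitude = Number(event.latitude);
  event.longitude = Number(event.longitude);
  return event;
}
```

Consumers such as notification, marketing, or fraud detection can then rely on one shape and one set of types, which is the "minimizing assumptions" point on this slide.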
