Extreme Programming meets Real Time Data

Gel Goldsby & Tom Johnson, Unruly


  1. Extreme Programming meets Real Time Data
     Gel Goldsby & Tom Johnson, Unruly

  2. When Santa Got Stuck Up The Chimney

  3. When Data Got Stuck Up The Chimney

  4. Hello. My name is...
     ● Gel Goldsby: Reporting and Data Team Lead
     ● Tom Johnson: Senior Developer

  5. We Believe In XP

  6. Extreme Programming Values
     ● Communication
     ● Simplicity
     ● Feedback
     ● Courage

  7. Simplicity

  8. Simplicity

  9. Simplicity

  10. Simplicity

  11. Simplicity

  12. Simplicity

  13. Our Reporting Pipeline
      [Diagram labels: events, pipeline]

  14. Our Reporting Pipelines
      [Diagram labels: events, super duper wizzy pipeline, old pipeline]

  15. Shut It Off!
      [Diagram labels: events, super duper wizzy pipeline, old pipeline]

  16. A Closer Look At Our Pipeline
      [Diagram labels: consumer, events, events pipeline]

  17. It’s Not A Truck, It’s A Series of Tubes
      [Diagram labels: consumer, events, nginx, parser, sequencer]

  18. Queueing with S3
      [Diagram labels: consumer, events, nginx, parser, sequencer, S3 (x3)]

  19. Queueing with S3
      [Diagram labels: consumer, events, nginx, parser, sequencer, S3 (x6)]

  20. We Need More Power, Cap’n
      [Diagram labels: consumer, events, sequencer, nginx + parser (x1)]

  21. We Need More Power, Cap’n
      [Diagram labels: consumer, events, sequencer, nginx + parser (x2)]

  22. We Need More Power, Cap’n
      [Diagram labels: consumer, events, sequencer, nginx + parser (x3)]

  23. We Need More Power, Cap’n
      [Diagram labels: consumer, events, sequencer, nginx + parser (x4)]

  24. Two Writes Can Make A Wrong
      [Diagram labels: consumer, events, nginx, parser, sequencer]

  25. Two Writes Can Make A Wrong
      [Diagram labels: consumer, events, nginx, parser, sequencer]

  26. Christmas was saved!

  27. Simplicity
      ● Each component does one thing and does it well

  28. Just Another Report, Right?
      ● Improving targeting
      ● Correlate events for same ad call
      ● Need to join on session id
      ● Needs disaggregated data

  29. Aggregation
      Campaign  Site
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Nyan.cat
      Brawndo   Zombo.com
      Brawndo   Nyan.cat
      Brawndo   Nyan.cat

  30. Aggregation
      Campaign  Site
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Nyan.cat
      Brawndo   Zombo.com
      Brawndo   Nyan.cat
      Brawndo   Nyan.cat

  31. Aggregation
      Campaign  Site
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Nyan.cat
      Brawndo   Zombo.com
      Brawndo   Nyan.cat
      Brawndo   Nyan.cat

  32. Aggregation
      Count  Campaign  Site
      1      Acme      Zombo.com
      1      Acme      Zombo.com
      1      Acme      Zombo.com
      1      Acme      Nyan.cat
      1      Brawndo   Zombo.com
      1      Brawndo   Nyan.cat
      1      Brawndo   Nyan.cat

  33. Aggregation
      Count  Campaign  Site
      3      Acme      Zombo.com
      1      Acme      Nyan.cat
      1      Brawndo   Zombo.com
      2      Brawndo   Nyan.cat

  34. Aggregation
      Count  Campaign  Site       Lots  More
      3      Acme      Zombo.com  ...   ...
      1      Acme      Nyan.cat   ...   ...
      1      Brawndo   Zombo.com  ...   ...
      2      Brawndo   Nyan.cat   ...   ...
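
The aggregation walked through on slides 29 to 33 can be sketched in a few lines of Python. This is only an illustration using the sample rows from the slides; the `aggregate` function name and the use of `collections.Counter` are assumptions, not the speakers' implementation.

```python
from collections import Counter

# Disaggregated events as (campaign, site) pairs, copied from the slides.
events = [
    ("Acme", "Zombo.com"),
    ("Acme", "Zombo.com"),
    ("Acme", "Zombo.com"),
    ("Acme", "Nyan.cat"),
    ("Brawndo", "Zombo.com"),
    ("Brawndo", "Nyan.cat"),
    ("Brawndo", "Nyan.cat"),
]


def aggregate(rows):
    """Collapse identical rows into one entry per distinct key, with a count."""
    return Counter(rows)


counts = aggregate(events)
for (campaign, site), count in counts.items():
    print(count, campaign, site)
```

Seven input rows collapse to four aggregated rows, matching the table on slide 33.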

  35. Lots of buckets

  36. Micro-Aggregations
      ● Roughly 20k events per second
      ● Batched: window size 20s
      ● x7 reduction factor
      ● Reduces writes to db
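
One way to picture the micro-aggregation step is a counter keyed by a 20-second window plus the aggregation key. The sketch below is a guess at the general shape, not the team's code: the `(timestamp, campaign, site)` tuple layout, the `micro_aggregate` name, and the sample events are all made up for illustration. Only the 20s window size comes from the slide.

```python
from collections import Counter

WINDOW_SECONDS = 20  # window size stated on the slide


def micro_aggregate(events, window=WINDOW_SECONDS):
    """Count events per (window start, campaign, site).

    `events` is an iterable of (timestamp_seconds, campaign, site) tuples;
    this layout is an assumption made for the example.
    """
    counts = Counter()
    for ts, campaign, site in events:
        window_start = int(ts // window) * window
        counts[(window_start, campaign, site)] += 1
    return counts


# Hypothetical sample events, not real data.
events = [
    (0.5, "Acme", "Zombo.com"),
    (3.2, "Acme", "Zombo.com"),
    (19.9, "Acme", "Zombo.com"),
    (20.0, "Acme", "Zombo.com"),
    (21.0, "Brawndo", "Nyan.cat"),
    (39.0, "Brawndo", "Nyan.cat"),
]

rows = micro_aggregate(events)
reduction = len(events) / len(rows)  # events in per row written out
```

Each window then produces one database write per distinct key instead of one write per event, which is where a reduction factor like the x7 on the slide would come from.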

  37. Make America Aggregate Again
      ● Daily
      ● From ~800 million events
      ● Compacts to ~2 million rows
      ● 400x reduction
      ● Reduces disk usage
      ● Speeds up queries

  38. Querying data
      [Diagram labels: user, view, query, historic data, today’s data]
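
The querying slide suggests a single view that sits over both the compacted historic data and today's still-arriving data. The slide does not say how that view is built, so the following is a sketch under assumptions: the table names, column names, sample rows, and the choice of SQLite with a `UNION ALL` view are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: daily-compacted rows and today's micro-aggregated rows.
    CREATE TABLE historic (day TEXT, campaign TEXT, site TEXT, event_count INTEGER);
    CREATE TABLE today (campaign TEXT, site TEXT, event_count INTEGER);

    -- One view so that queries do not need to know where a row lives.
    CREATE VIEW report AS
        SELECT campaign, site, event_count FROM historic
        UNION ALL
        SELECT campaign, site, event_count FROM today;
    """
)

# Made-up sample rows for illustration only.
conn.executemany(
    "INSERT INTO historic VALUES (?, ?, ?, ?)",
    [("day-1", "Acme", "Zombo.com", 3), ("day-1", "Brawndo", "Nyan.cat", 2)],
)
conn.executemany(
    "INSERT INTO today VALUES (?, ?, ?)",
    [("Acme", "Zombo.com", 1)],
)

totals = conn.execute(
    "SELECT campaign, site, SUM(event_count) FROM report "
    "GROUP BY campaign, site ORDER BY campaign, site"
).fetchall()
print(totals)
```

Because counts are additive, summing across the view gives the same answer whether a row has already been compacted or not.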

  39. Aggregatable facts
      Campaign  Site
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Zombo.com
      Acme      Nyan.cat
      Brawndo   Zombo.com
      Brawndo   Nyan.cat
      Brawndo   Nyan.cat

  40. Add in session ids
      Campaign  Site       Session Id
      Acme      Zombo.com  Wo5Meiri
      Acme      Zombo.com  Xotaipu6
      Acme      Zombo.com  Xu1goor7
      Acme      Nyan.cat   eVai6OhS
      Brawndo   Zombo.com  oiMoo7Du
      Brawndo   Nyan.cat   aiSh1eej
      Brawndo   Nyan.cat   rae8ieY5

  41. Does not aggregate well
      Campaign  Site       Session Id
      Acme      Zombo.com  Wo5Meiri
      Acme      Zombo.com  Xotaipu6
      Acme      Zombo.com  Xu1goor7
      Acme      Nyan.cat   eVai6OhS
      Brawndo   Zombo.com  oiMoo7Du
      Brawndo   Nyan.cat   aiSh1eej
      Brawndo   Nyan.cat   rae8ieY5
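
Why the session id column defeats aggregation can be checked directly on the sample rows from slide 40: once a near-unique value is part of the key, no two rows share a key, so nothing collapses. The `distinct_keys` helper below is a hypothetical name used only for this sketch.

```python
# Rows as (campaign, site, session_id), copied from the slide.
rows = [
    ("Acme", "Zombo.com", "Wo5Meiri"),
    ("Acme", "Zombo.com", "Xotaipu6"),
    ("Acme", "Zombo.com", "Xu1goor7"),
    ("Acme", "Nyan.cat", "eVai6OhS"),
    ("Brawndo", "Zombo.com", "oiMoo7Du"),
    ("Brawndo", "Nyan.cat", "aiSh1eej"),
    ("Brawndo", "Nyan.cat", "rae8ieY5"),
]


def distinct_keys(rows, key_width):
    """Number of aggregated rows when grouping on the first `key_width` columns."""
    return len({row[:key_width] for row in rows})


without_session = distinct_keys(rows, 2)  # group by campaign, site
with_session = distinct_keys(rows, 3)     # group by campaign, site, session id
print(without_session, with_session)
```

Grouping on campaign and site yields 4 rows from 7; adding the session id yields 7 rows from 7, so the aggregated table is no smaller than the raw events.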

  42. What next?

  43. What next? Spikes!

  44. Big Data!

  45. Big data: big choices
      ● Many options
      ● Available documentation was:
        ○ Academic
        ○ Evangelical
        ○ Naive/Trivial

  46. Spark!

  47. Big data: big costs
      ● Infrastructure
      ● Language (Scala)
      ● Incompatible with current approach
      ● Performance tradeoffs

  48. Why we could step away
      ● Understood our data better
      ● Underestimated costs
      ● We know our code
      ● We can change our code

  49. Feedback
      ● Regular retrospectives
      ● Shared understanding of “research”
      ● Shared understanding of value

  50. Courage
      ● Not afraid to try new things
      ● Not afraid to change direction
      ● Not lured by what we “ought” to do

  51. The Shape of our Data

  52. The Shape of our Data
      Disaggregated

  53. The Shape of our Data
      Disaggregated, Unsampled

  54. The Shape of our Data
      Disaggregated, Unsampled, Real Time

  55. Programmatic Pacing
      Disaggregated, Unsampled, Real Time

  56. Operational Debugging
      Disaggregated, Unsampled, Real Time

  57. Auction Data
      Disaggregated, Unsampled, Real Time

  58. Advertising 101
      [Diagram labels: user loads page, ad call, auction, payments, user interaction]

  59. Funnel of data
      [Diagram labels: user loads page, ad call, auction, payments, user interaction]

  60. Pipelines to match data shape
      [Diagram labels: user loads page, ad call, auction, payments, user interaction]

  61. Our Actual Reporting Pipelines
      [Diagram labels: events, payments pipeline, user interaction pipeline, auction pipeline, ad call pipeline]

  62. When We Get Overloaded...
      [Diagram labels: events, payments pipeline, user interaction pipeline, auction pipeline, ad call pipeline]

  63. When We Get Overloaded...
      [Diagram labels: events, payments pipeline, user interaction pipeline, auction pipeline, ad call pipeline]

  64. When We Get Overloaded...
      [Diagram labels: events, payments pipeline, user interaction pipeline, auction pipeline, ad call pipeline]

  65. Ensuring real time performance

  66. Ensuring real time performance

  67. Communication
      ● How data was used
      ● Performance requirements
        ○ What was needed
        ○ What wasn’t needed
        ○ Hard vs soft requirements

  68. Simplicity
      ● Green cards
      ● 10 pair-days total
      ● Incremental
      ● Separable

  69. Let's talk about our databases

  70. Row-based database
      [Table columns: Column A, Column B, Column C, Column D, Column E]

  71. Row-based database
      [Table columns: Column A, Column B, Column C, Column D, Column E]
