RPC Metrics at Google

JBD, Google (@rakyll)


  1. RPC Metrics at Google JBD, Google (@rakyll)

  2. gRPC Metrics at Google JBD, Google (@rakyll)

  3. Request Metrics at Google JBD, Google (@rakyll)

  4. "100% is the wrong reliability target for basically everything." -- Benjamin Treynor Sloss, VP of Engineering, Google

  5. "A service is available if users cannot tell that there was an outage."

  6. SLOs: Principled way of saying what level of downtime is acceptable.
       ● Error rate
       ● Latency expectations
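
As a minimal sketch of what slide 6 describes, an SLO can be modeled as an error-rate ceiling plus a latency expectation and checked against observed requests. The `SLO` and `Request` types, the `exampleSLO` targets, and the `syntheticTraffic` helper are all hypothetical names and numbers chosen for illustration; none of them come from the talk.

```go
package main

import (
	"fmt"
	"time"
)

// SLO is a hypothetical objective with the two parts named on the slide:
// an error-rate ceiling and a latency expectation.
type SLO struct {
	MaxErrorRate    float64       // largest acceptable fraction of failed requests
	LatencyBound    time.Duration // latency a "fast" request must stay within
	LatencyFraction float64       // smallest acceptable fraction of fast requests
}

// Request is one observed request.
type Request struct {
	Latency time.Duration
	Failed  bool
}

// Meets reports whether the observed requests satisfy both parts of the SLO.
func (s SLO) Meets(reqs []Request) bool {
	if len(reqs) == 0 {
		return true // no traffic, so nothing was violated
	}
	var failed, fast int
	for _, r := range reqs {
		if r.Failed {
			failed++
		}
		if r.Latency <= s.LatencyBound {
			fast++
		}
	}
	n := float64(len(reqs))
	return float64(failed)/n <= s.MaxErrorRate && float64(fast)/n >= s.LatencyFraction
}

// exampleSLO uses made-up targets: 99.9% success, 99% of requests within 100ms.
var exampleSLO = SLO{
	MaxErrorRate:    0.001,
	LatencyBound:    100 * time.Millisecond,
	LatencyFraction: 0.99,
}

// syntheticTraffic builds n requests of 10ms each; the first `failures` of them failed.
func syntheticTraffic(n, failures int) []Request {
	reqs := make([]Request, n)
	for i := range reqs {
		reqs[i] = Request{Latency: 10 * time.Millisecond, Failed: i < failures}
	}
	return reqs
}

func main() {
	fmt.Println("no failures:", exampleSLO.Meets(syntheticTraffic(1000, 0)))
	fmt.Println("5 failures: ", exampleSLO.Meets(syntheticTraffic(1000, 5)))
}
```

With these made-up targets, five failures in a thousand requests exceed the 0.1% error ceiling, so the second check reports a violation even though every request was fast.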

  7. Analytics frontend server Authentication Reporting Users ... Spanner Blob Store

  8. Questions infra teams want to ask:
       ● Are we meeting the SLO for the other team?
       ● What’s the impact of a product on infra?
       ● How much do we need to scale up if the product grows 10%?

  9. High-Cardinality Breaking down the metrics data...

  10. Query the collected data in various ways:
        ● Latency distribution for RPCs originated at Google Analytics.
        ● Requests that took more than 100ms for customer #123.
        ● Compare the request latency initiated at web vs. mobile frontend.

  11. Analytics frontend server Authentication Reporting Users ... Spanner originator=analytics; ... Blob Store

  12. Blob store read errors by originator
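
Slides 11 and 12 suggest that a tag such as `originator=analytics` travels with the request, so that a low-level service can break its errors down by the product that started the call. The sketch below imitates this inside one process using Go's `context` package. In a real deployment the tag would have to cross process boundaries with the RPC, which is not shown here. `WithOriginator`, `ErrorCounter`, and `frontendRequest` are hypothetical names, not the API used in the talk.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// originatorKey is an unexported context key type, so other packages cannot collide with it.
type originatorKey struct{}

// WithOriginator attaches the originating product name to the request context.
func WithOriginator(ctx context.Context, name string) context.Context {
	return context.WithValue(ctx, originatorKey{}, name)
}

// Originator reads the tag back, or returns "unknown" if none was attached.
func Originator(ctx context.Context) string {
	if v, ok := ctx.Value(originatorKey{}).(string); ok {
		return v
	}
	return "unknown"
}

// ErrorCounter counts read errors per originator (a stand-in for a blob store's metrics).
type ErrorCounter struct {
	mu     sync.Mutex
	counts map[string]int64
}

// NewErrorCounter returns an empty counter.
func NewErrorCounter() *ErrorCounter {
	return &ErrorCounter{counts: make(map[string]int64)}
}

// RecordReadError increments the error count for whichever originator the context carries.
func (c *ErrorCounter) RecordReadError(ctx context.Context) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[Originator(ctx)]++
}

// Count returns the number of read errors recorded for one originator.
func (c *ErrorCounter) Count(originator string) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[originator]
}

// frontendRequest simulates a frontend starting a request on behalf of a product.
func frontendRequest(originator string) context.Context {
	return WithOriginator(context.Background(), originator)
}

func main() {
	errs := NewErrorCounter()
	errs.RecordReadError(frontendRequest("analytics"))
	errs.RecordReadError(frontendRequest("analytics"))
	errs.RecordReadError(frontendRequest("reporting"))
	fmt.Println("analytics:", errs.Count("analytics"))
	fmt.Println("reporting:", errs.Count("reporting"))
}
```

The storage layer never needs to know which products exist. It only records whatever tag arrived with the request, which is what makes the per-originator breakdown possible.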

  13. Dynamically choose aggregation (split between recording and aggregation)
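
One way to picture the split on slide 13: instrumented code only records raw, tagged measurements, and the choice of aggregation (a count, a latency distribution, and which tag to group by) is made separately. The following is a simplified in-memory sketch under that assumption. `Recorder`, `View`, `Count`, and `Distribution` are hypothetical names for illustration, and a production system would aggregate as data arrives rather than keep every raw value.

```go
package main

import "fmt"

// Measurement is one recorded latency together with the tags it carried.
type Measurement struct {
	LatencyMs float64
	Tags      map[string]string
}

// Recorder only stores raw measurements; it knows nothing about aggregation.
type Recorder struct{ data []Measurement }

// Record appends one raw measurement.
func (r *Recorder) Record(latencyMs float64, tags map[string]string) {
	r.data = append(r.data, Measurement{LatencyMs: latencyMs, Tags: tags})
}

// Aggregation summarizes the values of one group as a slice of counts.
type Aggregation func(values []float64) []int64

// Count summarizes a group as a single element: how many values it has.
func Count(values []float64) []int64 {
	return []int64{int64(len(values))}
}

// Distribution buckets values by ascending bounds. Bucket i holds values in
// [bounds[i-1], bounds[i]); the last bucket holds everything at or above the final bound.
func Distribution(bounds ...float64) Aggregation {
	return func(values []float64) []int64 {
		buckets := make([]int64, len(bounds)+1)
		for _, v := range values {
			i := 0
			for i < len(bounds) && v >= bounds[i] {
				i++
			}
			buckets[i]++
		}
		return buckets
	}
}

// View groups the recorded data by one tag key and applies the chosen aggregation.
func (r *Recorder) View(tagKey string, agg Aggregation) map[string][]int64 {
	groups := make(map[string][]float64)
	for _, m := range r.data {
		k := m.Tags[tagKey]
		groups[k] = append(groups[k], m.LatencyMs)
	}
	out := make(map[string][]int64, len(groups))
	for k, vs := range groups {
		out[k] = agg(vs)
	}
	return out
}

func main() {
	rec := &Recorder{}
	rec.Record(20, map[string]string{"originator": "analytics", "frontend": "web"})
	rec.Record(150, map[string]string{"originator": "analytics", "frontend": "mobile"})
	rec.Record(30, map[string]string{"originator": "reporting", "frontend": "web"})

	// The same recorded data, aggregated two different ways after the fact.
	fmt.Println(rec.View("originator", Count))
	fmt.Println(rec.View("frontend", Distribution(100)))
}
```

Because recording and aggregation are decoupled, the web vs. mobile comparison from slide 10 becomes a different `View` call over the same data, with no change to the instrumented code.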

  14. Exemplars
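
The slide gives only the term, so the following is an assumption about what is meant: in metrics systems, an exemplar is commonly a sample request (for instance, a trace ID) kept alongside a histogram bucket, so that a suspicious bucket can be linked to one concrete request. A minimal sketch of that idea, with hypothetical `Histogram` and `Bucket` types and made-up trace IDs:

```go
package main

import (
	"fmt"
	"math"
)

// Bucket counts observations up to UpperBoundMs and remembers one example request.
type Bucket struct {
	UpperBoundMs    float64
	Count           int64
	ExemplarTraceID string // most recent trace that landed in this bucket
}

// Histogram is a latency histogram whose buckets carry exemplars.
type Histogram struct{ buckets []Bucket }

// NewHistogram creates buckets for the given ascending upper bounds plus a final +Inf bucket.
func NewHistogram(boundsMs ...float64) *Histogram {
	h := &Histogram{}
	for _, b := range boundsMs {
		h.buckets = append(h.buckets, Bucket{UpperBoundMs: b})
	}
	h.buckets = append(h.buckets, Bucket{UpperBoundMs: math.Inf(1)})
	return h
}

// bucketFor returns the first bucket whose upper bound is at or above latencyMs.
func (h *Histogram) bucketFor(latencyMs float64) *Bucket {
	for i := range h.buckets {
		if latencyMs <= h.buckets[i].UpperBoundMs {
			return &h.buckets[i]
		}
	}
	return &h.buckets[len(h.buckets)-1] // only reached for NaN input
}

// Observe counts the latency and stores the trace ID as the bucket's exemplar.
func (h *Histogram) Observe(latencyMs float64, traceID string) {
	b := h.bucketFor(latencyMs)
	b.Count++
	b.ExemplarTraceID = traceID
}

// Exemplar returns the example trace ID for the bucket that latencyMs falls into.
func (h *Histogram) Exemplar(latencyMs float64) string {
	return h.bucketFor(latencyMs).ExemplarTraceID
}

func main() {
	h := NewHistogram(10, 100)
	h.Observe(5, "trace-a")
	h.Observe(250, "trace-b")
	h.Observe(300, "trace-c")
	// Look up one concrete slow request to investigate.
	fmt.Println("slow bucket exemplar:", h.Exemplar(500))
}
```

Keeping only the latest trace per bucket is one simple policy among several; it keeps memory use constant while still giving each bucket a concrete request to inspect.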

  15. /rpcz and /statz

  16. http://server:7777/debug/rpcz

  17. Export? Monarch, Prometheus, and more.

  18. import "cloud.google.com/go/pubsub"

  19. +

  20. Thank you! JBD, Google jbd@google.com @rakyll
