
Attila Szegedi, Software Engineer (@asz): Twitter’s Open Source Involvements



  1. Attila Szegedi, Software Engineer, @asz (Wednesday, November 23, 2011)

  2. Twitter’s Open Source Involvements

  3. Both users and producers
     • Twitter’s systems are almost completely based on Open Source software
     • Our finance department runs Windows and Outlook, though…

  5. Contributor agreements
     • Twitter has signed company-wide contributor agreements with:
        • Oracle (OpenJDK)
        • Eclipse Foundation
        • Apache Software Foundation
     • Our employees can contribute to these projects automatically, with no further red tape involved.

  6. Twitter’s own Open Source projects https://github.com/twitter

  7. Twitter’s own Open Source projects
     • We use these projects internally, and either:
        • develop on GitHub, or…
        • … frequently sync to GitHub
     • You get access to the same versions that we use.
     • Lots of things for both front-end presentation and back-end capacity and scalability.

  8. Hearst Castle

  9. Hearst Castle
     • William Randolph Hearst had it built between 1919 and 1947
     • “The estate is a pastiche of historic architectural styles that its owner admired in his travels around Europe. Hearst was an omnivorous buyer who did not so much purchase art and antiques to furnish his home as built his home to get his bulging collection out of warehouses… The floor plan of the Main Building is chaotic due to his habit of buying centuries-old ceilings, which dictated the proportions and decor of various rooms.” --Wikipedia

  11. Bootstrap https://twitter.github.com/bootstrap

  12. Bootstrap
      • Bootstrap is Twitter's front-end HTML, CSS and JavaScript toolkit for kickstarting websites.
      • It includes base CSS styles for typography, forms, buttons, tables, grids, navigation, alerts, and more.
      • Supports IE7 and up
      • Very small (CSS is ~7Kb)

  13. Incredibly popular
      • 3rd most watched GitHub project (after Ruby on Rails and Node.js)

  19. Built around a complete styleguide
      • Scaffolding
         • grid, fixed-width and variable width
      • Typography
         • headings, body text, quotes, lists, code, labels
      • Navigation
         • fixed topbar, tab and pill navigation, breadcrumbs, pagination
      • Alerts, dialogs
      • Media thumbnails, tables, forms, buttons

  23. Bootstrap
      • Lets you quickly build websites that have a consistent, beautiful look.

  24. Finagle https://twitter.github.com/finagle

  25. Finagle
      • Switching gears from front end to back end now…
      • Finagle is a library for building asynchronous RPC servers and clients on the JVM.

  26. Finagle
      • Built on top of Netty
      • Supports request-response, streaming, pipelining.
      • Supports stateful RPC styles.

  27. Client features
      • Connection pooling
      • Load balancing
      • Failure detection
      • Failover/retry
      • Distributed tracing
      • Service discovery
      • Sharding
      • Native OpenSSL support
      • Rich statistics

  28. Server features
      • Backpressure (against abusive clients)
      • Service registration
      • Native OpenSSL bindings

  29. Protocol support
      • HTTP
      • Streaming HTTP (“Comet”)
      • Thrift
      • Memcached
      • Kestrel
      • In no way limited to these only…

  30. Minimal HTTP server:

          val service: Service[HttpRequest, HttpResponse] = new Service[HttpRequest, HttpResponse] {
            def apply(request: HttpRequest) = Future(new DefaultHttpResponse(HTTP_1_1, OK))
          }

          val server: Server[HttpRequest, HttpResponse] = ServerBuilder()
            .codec(Http)
            .bindTo(new InetSocketAddress(10000))
            .name("HttpServer")
            .build(service)

      … same in Java:

          Service<HttpRequest, HttpResponse> service = new Service<HttpRequest, HttpResponse>() {
            public Future<HttpResponse> apply(HttpRequest request) {
              return Future.value(
                new DefaultHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK));
            }
          };

          Server server = ServerBuilder.safeBuild(service, ServerBuilder.get()
            .codec(Http.get())
            .name("HttpServer")
            .bindTo(new InetSocketAddress("localhost", 10000)));

  31. Minimal HTTP client

          val client: Service[HttpRequest, HttpResponse] = ClientBuilder()
            .codec(Http)
            .hosts(address)
            .hostConnectionLimit(1)
            .build()

          // Issue a request, get a response:
          val request: HttpRequest = new DefaultHttpRequest(HTTP_1_1, GET, "/")
          val responseFuture: Future[HttpResponse] = client(request) onSuccess { response =>
            println("Received response: " + response)
          }

  32. Robust client

          val client = ClientBuilder()
            .codec(Http)
            .hosts("localhost:10000,localhost:10001,localhost:10003")
            .hostConnectionLimit(1)              // max num of connections at a time to a host
            .connectionTimeout(1.second)         // max time to spend establishing a conn
            .retries(2)                          // (1) per-request retries
            .reportTo(new OstrichStatsReceiver)  // export host-level load data
            .logger(Logger.getLogger("http"))
            .build()

  33. Architecture

  34. Architecture

  35. Futures
      • Unifying abstraction for asynchronous computation
      • A computation that has not yet completed
         • can succeed or fail
      • Either block and wait for it to return, or…
      • … register a completion callback.
      • Completion callbacks provide scaling, timeouts, scatter-gather, etc.

  36. Futures
      • Socket handler is not blocked while the response is being generated.
      • Socket handler can time out if the operation takes too long.
      • Response generator can scatter its operation, and return once every sub-operation has completed or timed out.
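
The scatter-gather idea on the Futures slides can be sketched in plain Java. This is only an illustration under stated assumptions: it uses the JDK's `java.util.concurrent.CompletableFuture` (Java 9 or later) rather than Finagle's own `Future` type, and the `lookup` helper, the shard names, and the `"timeout"` marker value are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

class ScatterGather {
    // Hypothetical sub-operation: answers asynchronously after delayMillis.
    static CompletableFuture<String> lookup(String shard, long delayMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return shard + ":ok";
        });
    }

    // Gather: the returned future completes once every sub-operation has
    // either produced a value or been replaced by "timeout" after timeoutMillis.
    static CompletableFuture<List<String>> gather(
            List<CompletableFuture<String>> parts, long timeoutMillis) {
        List<CompletableFuture<String>> bounded = parts.stream()
            .map(f -> f.completeOnTimeout("timeout", timeoutMillis, TimeUnit.MILLISECONDS))
            .collect(Collectors.toList());
        return CompletableFuture
            .allOf(bounded.toArray(new CompletableFuture[0]))
            // Every part is complete here, so join() does not block.
            .thenApply(ignored -> bounded.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        // Scatter two sub-operations; the slow one is cut off by the timeout.
        List<CompletableFuture<String>> parts =
            List.of(lookup("shard-a", 10), lookup("shard-b", 2000));
        gather(parts, 500).thenAccept(System.out::println).join();
    }
}
```

The caller never blocks a thread per sub-operation; it only registers callbacks, which is the property the slides attribute to completion callbacks.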

  37. Futures
      • Blocking style

          val future = dispatch(request)
          val response = future() // blocks

      • Event handler style

          val future = dispatch(request)
          future onSuccess { value =>
            // do something asynchronously
          }

      • Non-blocking style

          val future = dispatch(request)
          if (future.isDefined()) {
            val response = future()
          } else {
            // do something - timeout?
          }

  38. Cassandra

  39. Cassandra
      • Onto distributed storage...
      • Cassandra is a decentralized, fault-tolerant, highly scalable distributed database
      • Multi-master, multi-datacenter
      • Linearly scalable
      • High performance

  40. Project
      • Multiple committers at Twitter
      • Twitter is one of the largest users
      • Has contributed major patches in performance, scalability, and operational efficiency.
      • Hundreds of nodes in production
      • Serving millions of reads/writes per second!

  41. Use Cases
      • Spiderduck (real-time crawler)
      • Cuckoo (real-time monitoring/alerting engine for Twitter infrastructure)
      • Tweet button
      • Geolocation
      • Distributed RPC tracing store
      • Real-time spam/IP store
      • and more!

  42. Features
      • Supports eventual AND strong consistency!
      • Distributed counters
      • CQL (SQL-like interface, e.g. select * from table)
      • Secondary indexing
      • Hadoop support
      • Compression

  43. Twitter at Scale
      • Add capacity by racks, not servers
      • Measure everything in percentiles (p95, p99, p999)
      • Tune Cassandra to better integrate with the kernel and our hardware platforms
      • Profile, profile and profile!
      • Agile build and deployment processes (Jenkins, BitTorrent)
      • Automated performance and distributed testing
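
As a rough illustration of what "measure everything in percentiles" means in practice, the sketch below computes p95, p99 and p999 from a batch of latency samples using the nearest-rank definition. The class name, the method, and the sample data are assumptions made for this example; the slides do not say how Twitter actually computes its percentiles.

```java
import java.util.Arrays;

class Percentiles {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0.0 || p > 100.0) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds: 1, 2, ..., 1000.
        long[] latencies = new long[1000];
        for (int i = 0; i < latencies.length; i++) {
            latencies[i] = i + 1;
        }
        System.out.println("p95  = " + percentile(latencies, 95.0));
        System.out.println("p99  = " + percentile(latencies, 99.0));
        System.out.println("p999 = " + percentile(latencies, 99.9));
    }
}
```

Tail percentiles such as p999 expose the slowest requests, which an average would hide.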

  44. FlockDB https://twitter.github.com/flockdb

  45. FlockDB
      Distributed graph database for storing adjacency lists

  46. FlockDB goals
      • Support a high rate of add/update/remove operations
      • Support potentially complex set arithmetic queries
      • Support paging through query result sets containing millions of entries
      • Ability to “archive” and later restore archived edges
      • Horizontal scaling including replication
      • Online data migration
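
Two of these goals, set arithmetic over adjacency lists and paging through large result sets, can be pictured with a small in-memory sketch. This is not FlockDB's API or storage model: the class and method names are hypothetical, and a sorted set of user IDs stands in for one adjacency list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class AdjacencySketch {
    // Set arithmetic: IDs present in both adjacency lists,
    // e.g. "accounts that follow both A and B".
    static NavigableSet<Long> intersect(NavigableSet<Long> a, NavigableSet<Long> b) {
        NavigableSet<Long> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Cursor-based paging: up to pageSize IDs strictly greater than cursor.
    // The last ID of a page serves as the cursor for the next page.
    static List<Long> page(NavigableSet<Long> ids, long cursor, int pageSize) {
        List<Long> out = new ArrayList<>();
        for (Long id : ids.tailSet(cursor, false)) {
            if (out.size() == pageSize) {
                break;
            }
            out.add(id);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<Long> followersOfA = new TreeSet<>(List.of(1L, 2L, 3L, 5L, 8L));
        NavigableSet<Long> followersOfB = new TreeSet<>(List.of(2L, 3L, 4L, 8L, 9L));
        NavigableSet<Long> both = intersect(followersOfA, followersOfB);
        System.out.println(page(both, 0L, 2)); // first page
        System.out.println(page(both, 3L, 2)); // next page, resuming after ID 3
    }
}
```

Paging by cursor rather than by offset means a page can be fetched without walking past all earlier entries, which matters when a result set holds millions of IDs.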

  47. FlockDB
      • Simpler, because it solves fewer problems than generic graph databases.
      • Scales horizontally, and is designed for low-latency, high-throughput environments.
      • Twitter uses it to store its social graph (“follows” and “blocks” relations).

  48. Gizzard https://twitter.github.com/gizzard

  49. Gizzard
      A Scala framework for creating fault-tolerant distributed databases.
