The Gearman Cookbook OSCON 2010 Eric Day http://oddments.org/

  1. The Gearman Cookbook OSCON 2010 Eric Day http://oddments.org/ Senior Software Engineer @ Rackspace

  2. Thanks for being here!

  3. Ask questions! Grab a mic for long questions.

  4. Use the source... Source: 00

  5. What is Gearman?

  6. It is not German. (well, not entirely at least)

  7. A protocol with multiple implementations.

  8. A message queue.

  9. A job coordinator.

  10. MANAGER GEARMAN

  11. “A massively distributed, massively fault tolerant fork mechanism.” - Joe Stump, SimpleGeo

  12. A building block for distributed architectures.

  13. Features
      ● Open Source
      ● Simple & Fast
      ● Multi-language
      ● Flexible application design
      ● Embeddable
      ● No single point of failure

  14. How does Gearman work?

  17. While large-scale architectures work well, you can start off simple. Source: 01

  18. Foreground (synchronous) or Background (asynchronous) Source: 02
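
The difference between the two submission modes can be sketched without a job server. This is a minimal illustration using Python's standard thread pool as a stand-in; `reverse` is a hypothetical worker function, and none of this is the Gearman client API. One caveat: a real Gearman background job hands the client only a job handle and does not return the result, whereas the future below still can.

```python
from concurrent.futures import ThreadPoolExecutor

def reverse(data: str) -> str:
    """Hypothetical worker function: reverse the payload."""
    return data[::-1]

with ThreadPoolExecutor(max_workers=1) as pool:
    # Foreground (synchronous): submit the job and block until it finishes.
    foreground_result = pool.submit(reverse, "hello").result()

    # Background (asynchronous): submit the job and keep going.
    # Only a handle comes back; the caller does not wait here.
    background_handle = pool.submit(reverse, "world")
```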

  19. Questions?

  20. Let's get cooking!

  21. Required Ingredients:

  22. Job Server
      ● Perl Server (Gearman::Server in CPAN)
        ● The original implementation
        ● Actively maintained by folks at SixApart
      ● C Server (https://launchpad.net/gearmand)
        ● Rewrite for performance and threading
        ● Added new features like persistent queues
        ● Different port (IANA assigned 4730)
        ● Now moving to C++

  23. Client API
      ● Available for most common languages
      ● Command line tool
      ● User defined functions in SQL databases
        ● MySQL
        ● PostgreSQL
        ● Drizzle

  24. Worker API
      ● Available for most common languages
      ● Usually in the same packages as the client API
      ● Command line tool

  25. Optional Ingredients
      ● Databases
      ● Shared or distributed file systems
      ● Other network protocols
        ● HTTP
        ● E-Mail
      ● Domain specific libraries
        ● Image manipulation
        ● Full-text indexing

  26. Recipes
      ● Scatter/Gather
      ● Map/Reduce
      ● Asynchronous Queues
      ● Pipeline Processing

  27. Scatter/Gather
      ● Perform a number of tasks concurrently
      ● Great way to speed up web applications
      ● Tasks don't need to be related
      ● Allocate dedicated resources for different tasks
      ● Push logic down to where data exists

  28. Scatter/Gather
      [Diagram: Client with tasks Full-text Search, DB Query, Location Search, DB Query, Image Resize]

  29. Scatter/Gather
      ● Start simple with a single task
      ● Multiple tasks
      ● Concurrent tasks
      Source: 03

  30. Scatter/Gather
      ● Concurrent tasks with different workers
      ● All tasks finish in the time taken by the longest-running one
      ● Must have enough workers available
      Source: 04
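
The timing claim on the scatter/gather slides can be checked with a small in-process sketch. It assumes threads stand in for Gearman workers and `time.sleep` stands in for real work; the three task functions and their names are hypothetical, not taken from the talk's source files.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independent workers; sleep simulates work.
def full_text_search(query):
    time.sleep(0.1)
    return f"documents matching {query}"

def location_search(city):
    time.sleep(0.2)
    return f"places near {city}"

def image_resize(name):
    time.sleep(0.3)
    return f"thumbnail of {name}"

def scatter_gather(tasks):
    """Start every task at once (scatter), then wait for all results (gather)."""
    # One thread per task mirrors "must have enough workers available".
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, arg) for name, (fn, arg) in tasks.items()}
        return {name: future.result() for name, future in futures.items()}

start = time.monotonic()
results = scatter_gather({
    "search": (full_text_search, "gearman"),
    "places": (location_search, "Portland"),
    "thumb": (image_resize, "photo.jpg"),
})
# Elapsed time tracks the slowest task (0.3 s), not the sum (0.6 s).
elapsed = time.monotonic() - start
```

If the pool had fewer threads than tasks, some tasks would queue behind others and the total time would grow, which is the point of the "enough workers" bullet.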

  31. Note on Resize Worker

  32. Web Applications
      ● Reduce page load time with concurrency
      ● Don't tie up web server resources
      ● Improve time to first byte
        ● Start non-blocking requests
        ● Send first part of response
        ● Block when you need one of the results
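
The time-to-first-byte steps above can be sketched as a generator that yields response chunks. This is an illustration only: `slow_lookup` and `render_page` are hypothetical names, and a thread pool stands in for non-blocking Gearman tasks.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_lookup(user_id):
    """Hypothetical backend call that takes a while."""
    time.sleep(0.1)
    return f"recommendations for user {user_id}"

def render_page(pool, user_id):
    # 1. Start the non-blocking request before producing any output.
    pending = pool.submit(slow_lookup, user_id)
    # 2. Send the first part of the response right away.
    yield "<html><head><title>Home</title></head><body>"
    # 3. Block only at the point where the result is needed.
    yield f"<p>{pending.result()}</p>"
    yield "</body></html>"

with ThreadPoolExecutor(max_workers=2) as pool:
    chunks = list(render_page(pool, 42))
```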

  33. Questions?

  34. Map/Reduce
      ● Similar to scatter/gather, but split up one task
      ● Push logic to where data exists (map)
      ● Report aggregates or other summary (reduce)
      ● Can be multi-tier

  35. Map/Reduce
      [Diagram: Client submits Task T, which is split into Tasks T0, T1, T2, T3]

  36. Map/Reduce
      [Diagram: Client submits Task T, which is split into Tasks T0, T1, T2, T3; Task T0 is split again into Tasks T00, T01, T02]
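
A single-tier version of this recipe can be sketched as a word count: one task is split into shards, each shard is counted independently (map), and the partial counts are merged (reduce). The function names and the sharding scheme are assumptions for illustration, and threads stand in for workers placed next to the data.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def map_count(lines):
    """Map step: count words in one shard of the input."""
    return Counter(word for line in lines for word in line.split())

def reduce_counts(partials):
    """Reduce step: merge the per-shard counts into one summary."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

def map_reduce(lines, shards=3):
    # Split the one big task into sub-tasks (T0, T1, T2 in the diagram).
    chunks = [lines[i::shards] for i in range(shards)]
    with ThreadPoolExecutor(max_workers=shards) as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce_counts(partials)

totals = map_reduce(["a b", "b c", "c c", "a"])
```

A multi-tier variant would have `map_count` itself call `map_reduce` on its shard before returning.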

  37. Log Service
      ● Push all log entries to log_collect queue
        ● tail -f access_log | gearman -n -f log_collect
      ● Natural spreading between workers when busy
      ● Can shut down workers to help balance
      ● Worker for each operation per log server
      ● Push operations to where data resides

  38. Log Service Source: 05

  39. Questions?

  40. Asynchronous Queues
      ● They help you scale
      ● Not everything needs immediate processing
        ● Sending e-mail, tweets, …
        ● Log entries and other notifications
        ● Data insertion and indexing
      ● Allows for batch operations

  41. Delayed E-Mail
      ● Replace:
          # Send email right now
          mail($to_address, $subject, $body, $headers);
      ● With:
          # Put email in queue to send
          $client = new GearmanClient();
          $client->addServer('127.0.0.1', 4730);
          $client->doBackground('send_email', serialize($email_options));
      Source: 06
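
The slide shows only the client side. The worker side can be sketched in-process as follows, assuming a `queue.Queue` in place of the job server, JSON in place of PHP's `serialize`, and a list append in place of actually sending mail. `do_background` and `worker` are hypothetical helpers, not Gearman API calls.

```python
import json
import queue
import threading

jobs = queue.Queue()   # stand-in for the job server's queue
sent = []              # stand-in for mail actually delivered

def do_background(function_name, workload):
    """Queue the job and return immediately; the caller never waits."""
    jobs.put((function_name, workload))

def worker():
    while True:
        function_name, workload = jobs.get()
        if function_name is None:      # sentinel used to stop this sketch
            return
        if function_name == "send_email":
            options = json.loads(workload)
            sent.append((options["to"], options["subject"]))

thread = threading.Thread(target=worker)
thread.start()

do_background("send_email",
              json.dumps({"to": "user@example.com", "subject": "Welcome"}))
jobs.put((None, None))
thread.join()
```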

  42. Database Updates
      ● Also useful as a database trigger
      ● Start background jobs on database changes
      ● Requires MySQL UDF package
          CREATE TRIGGER tweet_blog BEFORE INSERT ON blog_entries
            FOR EACH ROW SET @ret=gman_do_background('send_tweet',
              CONCAT(NEW.title, " - ", NEW.url));

  43. Questions?

  44. Pipeline Processing
      ● Some tasks need a series of transformations
      ● Chain workers to send data for the next step
      [Diagram: Client → Task T → Worker (Operation 1) → Worker (Operation 2) → Worker (Operation 3) → Output]
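
The chaining idea can be sketched with one thread per stage and a queue between each pair of stages. The three operations are hypothetical examples; in Gearman each stage would be a worker that submits a job for the next function instead of writing to a local queue.

```python
import queue
import threading

def stage(operation, inbox, outbox):
    """Apply one operation to each item and pass the result downstream."""
    while True:
        item = inbox.get()
        if item is None:       # sentinel: forward it and stop
            outbox.put(None)
            return
        outbox.put(operation(item))

# Operation 1, 2, 3 from the diagram (hypothetical transformations).
operations = [str.strip, str.lower, lambda text: text.replace(" ", "-")]
queues = [queue.Queue() for _ in range(len(operations) + 1)]
threads = [
    threading.Thread(target=stage, args=(op, queues[i], queues[i + 1]))
    for i, op in enumerate(operations)
]
for thread in threads:
    thread.start()

for item in ["  Hello World ", " Gearman Cookbook  "]:
    queues[0].put(item)
queues[0].put(None)

# Read from the last queue until the sentinel arrives.
output = list(iter(queues[-1].get, None))
for thread in threads:
    thread.join()
```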

  45. Search Engine
      ● Insert URLs, track duplicates
      ● Fetch contents of URLs
      ● Store URLs with title and body
      ● Search stored URLs

  46. Search Engine
      [Diagram: Insert, Fetch, Search, Store/Search]
      Source: 07
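
The four steps of the search engine recipe can be sketched in a single process. Everything here is an assumption made for illustration: `PAGES` holds canned content instead of real HTTP fetches, the index is a plain dictionary, and in the Gearman version each function would be its own worker.

```python
# Hypothetical canned pages standing in for real HTTP fetches.
PAGES = {
    "http://example.com/a": ("Gearman intro", "Gearman is a job coordinator."),
    "http://example.com/b": ("Queues", "Asynchronous queues help you scale."),
}

seen = set()   # URLs already inserted, used to track duplicates
index = {}     # url -> (title, body)

def insert(url):
    """Accept a URL once; report False for duplicates."""
    if url in seen:
        return False
    seen.add(url)
    return True

def fetch(url):
    """Return (title, body) for a URL."""
    return PAGES[url]

def store(url, title, body):
    index[url] = (title, body)

def search(term):
    """Case-insensitive substring match over stored titles and bodies."""
    term = term.lower()
    return sorted(
        url for url, (title, body) in index.items()
        if term in title.lower() or term in body.lower()
    )

def crawl(url):
    # Chain the steps: insert, then fetch, then store.
    if insert(url):
        title, body = fetch(url)
        store(url, title, body)

for url in ["http://example.com/a", "http://example.com/b", "http://example.com/a"]:
    crawl(url)

hits = search("gearman")
```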

  47. Questions?

  48. Persistent Queues
      ● By default, jobs are only stored in memory
      ● Various contributions from community
        ● MySQL/Drizzle
        ● PostgreSQL
        ● SQLite
        ● Tokyo Cabinet
        ● memcached (not always “persistent”)

  49. Persistent Queues
      ● Use at your own risk, test in your environment!
      ● Configure back-end to meet your performance and durability needs
      Source: 08
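
What persistence buys can be shown with a small SQLite-backed queue: a job written before a restart is still there afterwards. This is a sketch of the idea only. The `jobs` table and its columns are hypothetical and are not the schema gearmand's queue modules use.

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk queue file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "queue.db")

def open_queue(db_path):
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, function TEXT, workload TEXT)"
    )
    return db

# Queue a background job, then "crash": close the connection.
db = open_queue(path)
db.execute("INSERT INTO jobs (function, workload) VALUES (?, ?)",
           ("send_email", "hello"))
db.commit()
db.close()

# After the restart the job is recovered from disk.
db = open_queue(path)
recovered = db.execute(
    "SELECT function, workload FROM jobs ORDER BY id").fetchall()
db.close()
```

How often the back-end flushes to disk is the performance versus durability trade-off the slide refers to.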

  50. Timeouts
      ● By default, operations block forever
      ● Clients may want a timeout on foreground jobs
      ● Workers may need to periodically run other code besides job callback
      Source: 09
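
Both timeout cases can be sketched with standard-library primitives. The assumptions: a future stands in for a foreground job, a `queue.Queue` stands in for a worker waiting on the job server, and `slow_job` is a hypothetical function. The durations are arbitrary.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def slow_job():
    time.sleep(0.3)
    return "done"

# Client side: give up on a foreground job after a deadline.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_job)
    try:
        client_result = future.result(timeout=0.05)
    except TimeoutError:
        client_result = None   # the client moves on; the job still runs

# Worker side: wake up periodically even when no job arrives.
jobs = queue.Queue()
idle_ticks = 0
for _ in range(3):
    try:
        job = jobs.get(timeout=0.05)
    except queue.Empty:
        idle_ticks += 1        # periodic housekeeping would go here
```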

  51. gearmand --help
      ● --job-retries - Prevent poisonous jobs
      ● --worker-wakeup - Don't wake up all workers for every job
      ● --threads - Run multiple I/O threads (C only)
      ● --protocol - Load pluggable protocols (C only)
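
The idea behind a retry cap can be sketched as follows: a job that keeps failing is re-queued a limited number of times and then dropped, so it cannot block the queue forever. This illustrates the concept only. `drain`, `handler`, and `max_attempts` are hypothetical names, and the exact counting semantics of gearmand's `--job-retries` are not modelled here.

```python
from collections import deque

def drain(jobs, handler, max_attempts):
    """Run jobs in order; re-queue failures until max_attempts is reached."""
    pending = deque((job, 0) for job in jobs)
    done, dropped = [], []
    while pending:
        job, attempts = pending.popleft()
        try:
            done.append(handler(job))
        except Exception:
            if attempts + 1 >= max_attempts:
                dropped.append(job)                   # poisonous: give up
            else:
                pending.append((job, attempts + 1))   # try again later
    return done, dropped

def handler(job):
    """Hypothetical worker callback that always fails on one input."""
    if job == "poison":
        raise ValueError("worker crashed")
    return job.upper()

done, dropped = drain(["a", "poison", "b"], handler, max_attempts=3)
```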

  52. New Distributed Applications
      ● Think of scalable cloud architectures
      ● Not just LAMP on a virtual machine
      ● Elastic servers and services (workers)
      ● New data models
      ● Use eventual consistency whenever possible
      ● Blogs, wikis, and other web apps powered by eventual consistency (EC) and queues, not a single logical database

  53. Get involved!
      ● http://gearman.org/
        ● Mailing list, documentation, related projects
      ● #gearman on irc.freenode.net
      ● Contact me at: http://oddments.org/
      ● Stickers!
