
Swarm: Transparently distributed computation in the cloud (Ian Clarke)



  1. Swarm “Transparently distributed computation in the cloud” Ian Clarke ian@uprizer.com Sunday, September 13, 2009

  2. Swarm “Transparently Distributed Computation in the cloud” Ian Clarke ian.clarke@gmail.com

  3. About me
     • Degree in AI and Comp Sci from Edinburgh University, Scotland (1995-1999)
     • Designer and co-ordinator of Freenet, the first decentralized P2P architecture (1999-present)
     • Designed P2P video streaming system that later became part of “Joost” (2003-2004)
     • Founder and Chief Scientist of Revver (2004-2006)
     • CEO of Uprizer Labs (2007-present)

  4. The Problem

  5. Building a web-app? You want your development process to be:
     • Cheap and fast to implement
     • Scalable in the event of success

  6. Building a web-app? You want your development process to be:
     • Cheap and fast to implement
     • Scalable in the event of success
     Pick one!

  7. Examples

  10. • 22-hour outage after IPO in 1999
      • Estimated cost: over $2M

  12. • 22-hour outage after IPO in 1999
      • Estimated cost: over $2M
      • Periodic outages since it started, most recently August ’09
      • Forced fundamental rearchitecture
      • Aside: started with Ruby on Rails, now using Scala

  14. How is this solved today?

  15. Database Architecture
      [Diagram: MySQL; Cache, Cache, Cache; WebNode, WebNode, WebNode]

  16. Replicate databases
      [Diagram: MySQL, MySQL, MySQL; Cache, Cache, Cache; WebNode, WebNode, WebNode]

  17. Map Reduce
      • Certain problems may be broken into “map” and “reduce” operations
      • Interesting because the data stays still, the computation moves
      • Good at things like distributed sort, distributed grep, etc.
      • Not general-purpose
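
The “map” and “reduce” split can be illustrated with a word count. This is a minimal single-process sketch in Python, not code from the talk; the shard contents and the function names `map_phase`, `shuffle`, and `reduce_phase` are assumptions made for illustration. In a real deployment each shard would sit on a different machine and `map_phase` would run where the shard lives.

```python
from collections import defaultdict
from functools import reduce

def map_phase(shard):
    """Emit (word, 1) pairs for one shard of text."""
    return [(word, 1) for line in shard for word in line.split()]

def shuffle(pairs):
    """Group the emitted values by key so each key can be reduced on its own."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Fold each key's list of values into a single total."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in grouped.items()}

# Hypothetical data: two shards standing in for two machines.
shards = [["the swarm moves", "the data stays"], ["the computation moves"]]
mapped = [pair for shard in shards for pair in map_phase(shard)]
counts = reduce_phase(shuffle(mapped))
print(counts["the"], counts["moves"])
```

The sketch also hints at the “not general-purpose” bullet: the work has to be expressible as independent per-shard maps followed by per-key folds.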

  18. Our Proposal: Swarm

  19. But first... Some background

  20. Scala

  21. Scala
      • Compiles to Java bytecode
        • so it's fast and widely supported

  22. Scala
      • Compiles to Java bytecode
        • so it's fast and widely supported
      • Supports closures and type inference
        • so it solves most of Java’s problems

  23. Scala
      • Compiles to Java bytecode
        • so it's fast and widely supported
      • Supports closures and type inference
        • so it solves most of Java’s problems
      • The upcoming Scala 2.8 supports “portable continuations”

  24. Continuations

  25. What do continuations do?
      • Store the state of a computer program
      • Like saving your position in a video game
      • Resume execution at some point in the future
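
The “save your position, resume later” idea can be approximated with a Python generator. This is only an analogy, assuming nothing about Swarm itself: a generator can be paused and resumed but not serialized and shipped to another machine, which is what the portable continuations in Scala 2.8 add. The function name `program` is hypothetical.

```python
def program():
    """A computation that pauses after each step, keeping its local state."""
    total = 0
    for step in (1, 2, 3):
        total += step
        yield total  # pause here; `total` and the loop position are preserved

saved = program()     # nothing has run yet
first = next(saved)   # run until the first pause
# ... the caller is free to do unrelated work here ...
second = next(saved)  # resume exactly where the computation stopped
third = next(saved)
print(first, second, third)
```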

  26. Scala 2.8’s continuations support

  27. Scala 2.8’s continuations support
      • “Delimited”

  28. Scala 2.8’s continuations support
      • “Delimited”
      • Portable

  29. Scala 2.8’s continuations support
      • “Delimited”
      • Portable
      • Implemented through a code transformation

  30. Scala 2.8’s continuations support
      • “Delimited”
      • Portable
      • Implemented through a code transformation
      • Complicated!
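
To give a feel for what “implemented through a code transformation” can mean, here is a hand-written continuation-passing-style (CPS) rewrite of a two-step function. It is a sketch in Python, not the output of the Scala plugin, and every name in it (`direct`, `step1`, `step2`, `cps`, `capture`) is an assumption made for illustration. Once “the rest of the computation” is an ordinary function, it can be stored instead of called and run later.

```python
def direct(x):
    """Direct style: the steps run one after another."""
    y = x + 1
    z = y * 2
    return z

def step1(x, k):
    """CPS form of the first step: hand the result to the continuation k."""
    return k(x + 1)

def step2(y, k):
    """CPS form of the second step."""
    return k(y * 2)

def cps(x, k):
    """The same computation with 'what happens next' made explicit."""
    return step1(x, lambda y: step2(y, k))

captured = []

def capture(y):
    """Instead of continuing, store the remaining work for later."""
    captured.append(lambda: step2(y, lambda z: z))

result_direct = direct(4)
result_cps = cps(4, lambda z: z)
step1(4, capture)        # runs step 1, then stops with the rest captured
resumed = captured[0]()  # finish the computation at a later time
print(result_direct, result_cps, resumed)
```

The stored function covers only the work up to the end of `cps`, not the whole program, which is roughly what “delimited” refers to.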

  31. The Solution

  32. What if we could distribute data and computation across multiple computers such that the programmer need not think about it?

  33. But how?

  34. But how?
      • Move the computation, not the data

  35. But how?
      • Move the computation, not the data
      • Handle this transparently within the framework

  36. But how?
      • Move the computation, not the data
      • Handle this transparently within the framework
      • Arrange the data to minimize movement of the computation

  37. How does it work?
      [Diagram: data items a, b, c]
      Program:
      1. print a
      2. print b
      3. print c
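
The three-line program on this slide can be simulated to show the computation following the data. The diagram did not survive extraction, so the placement of a, b, and c on `node1` and `node2`, the stored values, and the names `LOCATION`, `STORE`, `program`, and `run` are all assumptions. This is a single-process sketch in Python, not Swarm's implementation: a generator stands in for the continuation, and “migration” is just a change of the current node.

```python
# Hypothetical placement: which node stores each item, and the items' values.
LOCATION = {"a": "node1", "b": "node2", "c": "node1"}
STORE = {"node1": {"a": "apple", "c": "cherry"}, "node2": {"b": "banana"}}

def program():
    """print a; print b; print c. Yields the name of each item it needs."""
    output = []
    for name in ("a", "b", "c"):
        value = yield name  # ask the runtime for `name`; may resume elsewhere
        output.append(value)
    return output

def run(task, start="node1"):
    """Drive the task, moving it to whichever node holds the requested item."""
    current, hops = start, []
    try:
        wanted = next(task)
        while True:
            home = LOCATION[wanted]
            if home != current:  # data is remote: move the computation
                hops.append((current, home))
                current = home
            wanted = task.send(STORE[current][wanted])  # read locally, resume
    except StopIteration as done:
        return done.value, hops

output, hops = run(program())
print(output, hops)
```

With this placement the program needs two hops; putting a, b, and c on one node would need none, which is the motivation for arranging data carefully.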

  41. Arranging data with graph clustering
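
The goal of arranging data can be stated as a small optimization problem: choose a placement of items on nodes so that consecutive accesses rarely cross a node boundary. The sketch below uses exhaustive search as a stand-in for a real graph clustering algorithm, which the slide does not specify; the access trace, the node names, and the per-node capacity are all hypothetical.

```python
from itertools import product

# Hypothetical access trace: the order in which a program touches data items.
TRACE = ["a", "b", "a", "b", "c", "d", "c", "d", "a"]
ITEMS = sorted(set(TRACE))
NODES = ("node1", "node2")
CAPACITY = 2  # assumed limit on the number of items per node

def migrations(placement, trace):
    """Count how often consecutive accesses land on different nodes."""
    return sum(placement[x] != placement[y] for x, y in zip(trace, trace[1:]))

def best_placement(items, nodes, trace):
    """Try every placement that respects CAPACITY and keep the cheapest."""
    best = None
    for assignment in product(nodes, repeat=len(items)):
        if any(assignment.count(node) > CAPACITY for node in nodes):
            continue
        placement = dict(zip(items, assignment))
        cost = migrations(placement, trace)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best

cost, placement = best_placement(ITEMS, NODES, TRACE)
naive = migrations({"a": "node1", "b": "node2", "c": "node1", "d": "node2"}, TRACE)
print(cost, naive, placement)
```

Exhaustive search only works for toy inputs; the point is the cost function, which a clustering heuristic would try to minimize at scale.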

  42. Forcing Swarm to migrate the continuation

  44. Forced remote variable

  46. What next?
      • Just a simple prototype
      • Many interesting sub-problems
      • Open source
      • Need your help!

  47. Storage
      • How do we arrange the data for optimal efficiency?
      • What about concurrency?
      • Software transactional memory
      • Replication and redundancy
      • Garbage collection

  48. A “universal” codebase
      • Swarm requires that every node has the same binary
      • We could use the JVM’s classloader mechanism to retrieve binaries as needed from a global namespace
      • Will need to address issues of versioning and security

  49. “Swarm aware” libraries
      • Need “Swarm aware” collection classes like Map, List, and Set
      • Develop a storage system with capabilities similar to a relational database
      • Create a web framework around Swarm (similar to “Rails” or “LiftWeb”)

  50. Swarm tools
      • Continuations plugin imposes restrictions on the code that can be migrated
        • “foreach”
        • Serializable
      • A Scala compiler plugin that understood these limitations would be very useful

  51. Interested in helping? http://code.google.com/p/swarm-dpl/ ian@uprizer.com
