LARGE SCALE OPEN SOURCE DEVELOPMENT MODELS: A COMPARATIVE ANALYSIS



  1. LARGE SCALE OPEN SOURCE DEVELOPMENT MODELS: A COMPARATIVE ANALYSIS, by Joe Gordon

  2. ABOUT ME
     - OpenStack developer at HP
     - Hacking on OpenStack for 4 years
     - Contact information: jogo on freenode, github.com/jogo

  3. WHY
     - Saw OpenStack grow from around 60 developers to 2,000 developers
     - Unusual development model
     - But how do other projects solve the same problems?

  4. WHY IS PICKING THE RIGHT DEVELOPMENT MODEL IMPORTANT?

  5. ACCELERATING GROWTH
     - Linux kernel: 2 years to reach 100 contributors in 1991
     - Linux 2.0 had 190 contributors in its credits in 1996
     - OpenStack took 1 year to reach 100 contributors in 2010
     - Docker had over 300 contributors in its first year in 2013
     - 200 contributors per month
       - Linux: 1991 - June 2004 (13 years)
       - Debian: 1993 - March 2007 (14 years)
       - OpenStack: 2010 - October 2012 (2 years)

  6. ACCELERATING GROWTH
     [Charts: Linux, Debian, Docker, OpenStack (clockwise from top left). Source: openhub]

  7. OPEN SOURCE IS BIG BUSINESS
     - Open source instead of standards bodies
     - Balancing corporate interests
     - Linux Foundation Gold and Platinum Members

  8. PICKING THE RIGHT DEVELOPMENT MODEL
     CONWAY'S LAW: "organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations"

  9. DEVELOPMENT MODELS: PROJECTS COVERED
     - Linux Kernel
     - Apache Software Foundation
     - Debian
     - OpenStack
     - Docker

  10. LINUX KERNEL

  11. RELEASE
      - Time based release model (2-3 months)
      - Rolling development model, continually integrating major changes
      - Separate stable team
      - Release single artifact
      - Rarely consumed directly by end users

  12. SCALING MODEL
      - Per month
        - 1,000 contributors
        - 5,000 to 7,000 patches
      - Lieutenants / subsystem maintainers
        - 100-150 maintainers
      - Chain of trust
        - No elections for technical positions
      - Decentralized review process
        - Each maintainer has their own git tree
        - Only about 1% of patches are directly merged by the BDFL
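
The figures on this slide imply a rough count of patches the BDFL merges directly each month. A minimal back-of-the-envelope sketch, using only the slide's numbers; the function name `directly_merged` is hypothetical and the 1% figure is the slide's own approximation:

```python
def directly_merged(patches_per_month: int, direct_percent: int = 1) -> int:
    """Estimate patches per month merged directly by the BDFL.

    direct_percent defaults to the slide's "about 1%" figure.
    """
    # Integer arithmetic keeps this rough estimate free of float rounding.
    return patches_per_month * direct_percent // 100


# The slide gives a range of 5,000 to 7,000 patches per month.
low, high = directly_merged(5_000), directly_merged(7_000)
print(f"roughly {low} to {high} patches per month merged directly")
```

Everything else reaches mainline through the subsystem maintainers' trees, which is the point of the chain-of-trust model.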

  13. HIERARCHY
      1. BDFL
      2. Subsystem maintainers
      3. Contributors
      Usually one layer of subsystem maintainers, but sometimes up to three.

  14. TOOLING
      - Communication: mailing lists
      - Git
      - CI: no automated pre-commit CI, but yes post-commit
      - Code review: decentralized, via more mailing lists

  15. LIFE OF A PATCH
      Process can be quick for minor fixes or take years for controversial changes.
      1. Design: preferably in the open, but not required
      2. Early review via mailing list: find the correct maintainer, submit patches via email
      3. Wider review: accepted into a subsystem maintainer's tree and into the -next trees
      4. Merged into mainline

  16. CULTURE
      - Chain of trust
      - About the individual
      - Value frankness over politeness
      - Corporate friendly
        - No single company controls
      - Not much automated pre-commit testing
        - Failing testing is very bad for the author

  17. APACHE SOFTWARE FOUNDATION

  18. RELEASE
      - ASF is more of a governance umbrella and culture
      - Each project does its own thing
      - 150+ separate releases

  19. SCALING MODEL
      - Separate projects
        - 4,431 committers
        - 150+ top level projects
        - 740 contributors in past 12 months?
      - In-project scaling is up to each project
        - Apache Spark had 570 contributors in past 12 months
        - OpenOffice had 31
      - Flat(ish) trust model
      - 'Review then commit' vs. 'commit then review'
      "In order to reduce friction and allow for diversity to emerge, rather than forcing a monoculture from the top ... each project is delegated authority over development of its software, and is given a great deal of latitude in designing its own technical charter and its own governing rules."

  20. HIERARCHY
      1. ASF Member
      2. Project Management Committee: makes technical decisions
      3. Committer
      4. Developer
      "When the group felt that the person had "earned" the merit to be part of the development community, they granted direct access to the code repository, thus increasing the group and increasing the ability of the group to develop the program, and to maintain and develop it more effectively."

  21. TOOLING
      - Communication: mailing list
      - SVN
      - Optional CI
      - Central review system: Review Board
      - Lazy consensus
      "If it didn't happen on a mailing list, it didn't happen."

  22. LIFE OF A PATCH
      - Different projects have different review flows
      - Review then commit, or commit then review
      - Review Board

  23. CULTURE
      - Lazy consensus
      - Focus is on the team
        - All decisions are team based
      - Focus is on contributors, not companies
      - No monoculture
      "Within the ASF we worry about any community which centers around a few individuals who are working virtually uncontested."

  24. DEBIAN

  25. RELEASE
      - When it's ready, not time based
        - Notoriously slow
        - Every 2 years
      - Lots and lots of artifacts
      - Unstable, Testing, Stable

  26. SCALING MODEL
      - Package maintainers
        - 3,200 Debian Developers
        - Can have individual maintainers or groups (via a mailing list)
      - No review; trust/burden maintainers more

  27. HIERARCHY
      Roles:
      - Maintainer: the person making the Debian package of the program.
      - Sponsor: a person who helps maintainers to upload packages to the official Debian package archive (after checking their contents).
      - Debian Developer (DD): a member of the Debian project with full upload rights to the official Debian package archive.
      - Debian Maintainer (DM): a person with limited upload rights to the official Debian package archive.

  28. TOOLING
      - Communication
        - Mailing list
        - Web services
        - Lots of IRC
      - Poor automated testing
        - Quality control is ultimately up to individual maintainers
        - Half of the CI available isn't official
      - No peer review system
        - Except for new packages (FTP Masters)

  29. LIFE OF A PACKAGE

  30. CULTURE
      - Rotating leadership (elections)
      - Do-ocracy: "An individual Developer may make any technical or nontechnical decision with regard to their own work"
      - Open development
      - Independent
        - Not 'profit-driven': no decisions imposed by whoever has the money, infrastructure, or people
        - No benevolent dictator, no oligarchy
      - It is all about the individual (although individuals can form groups)
      - Territorial

  31. OPENSTACK

  32. RELEASE
      - Time based, every 6 months
      - Continuous delivery
      - Set of separate but related projects
        - Usually 1-way dependencies
      - Lots of artifacts
      - Sometimes consumed directly by consumers (without distro)
      - No rolling development; freeze development on master before a release

  33. SCALING MODEL
      - Break down repositories and build teams around each repository
        - 31 teams
        - 150+ repositories
        - 5,000 commits per month from 500 contributors
        - 282 core developers
      - Flat trust model
      - Strong centralized review process (two core reviews)
      - Automated testing to reduce reviewer burden
      - Having trouble with scaling the team responsible for a single repository
        - Can't get past 15 or so members on a core team
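
The review rule on this slide (two core reviews, backed by automated testing) can be sketched as a simple merge check. This is a minimal illustration, not Gerrit's actual data model or API: the `Change` class, its fields, and `can_merge` are hypothetical names, and the requirement that tests pass before merging is an assumption the slide implies but does not state.

```python
from dataclasses import dataclass, field
from typing import Set

# From the slide: "Strong centralized review process (two core reviews)".
REQUIRED_CORE_REVIEWS = 2


@dataclass
class Change:
    """A proposed change under review (hypothetical model)."""
    core_approvals: Set[str] = field(default_factory=set)  # names of approving core reviewers
    tests_passed: bool = False  # result of automated testing (assumed to be required)


def can_merge(change: Change) -> bool:
    """Return True once tests pass and enough distinct core reviewers approve."""
    return change.tests_passed and len(change.core_approvals) >= REQUIRED_CORE_REVIEWS


change = Change(tests_passed=True)
change.core_approvals.add("alice")
print(can_merge(change))  # one approval is not enough
change.core_approvals.add("bob")
print(can_merge(change))  # two distinct core approvals plus passing tests
```

Storing approvals as a set means the same reviewer approving twice still counts once. Running tests first keeps reviewers from spending time on changes that do not even pass, which is the "reduce reviewer burden" point.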

  34. HIERARCHY
      Flat as possible:
      1. TC
      2. PTL
      3. Core Teams

  35. TOOLING
      - Communication
        - Mailing lists
        - IRC
        - Code reviews
      - Git
      - Code review: Gerrit
      - Lots and lots of automated testing
      - In-person design summits twice a year

  36. LIFE OF A PATCH

  37. CULTURE
      - Group over individual
        - Egalitarian
        - Elections
      - Welcoming to new contributors
      - Corporate friendly
        - Not controlled by a single company
      - Lazy consensus
      - Decentralized design
      - Uniform tooling/process across projects

  38. CULTURE: OPENSTACK'S 4 OPENS
      - Open Source, not open core
      - Open Design
      - Open Development
      - Open Community
      - Lazy consensus
      - Technical governance is a meritocracy
      - Put everything in the public

  39. FACTORS LIMITING GROWTH
      - Cross project issues
      - Team size
      - Single vision

  40. DOCKER: 'GitHub' development model

  41. RELEASE
      - Every 2 months
      - Separate release branch
      - Master isn't frozen

  42. SCALING MODEL
      - 37 maintainers in Docker
      - 10-15 repos in total
      - Maintainers / subsystem maintainers
        - Have to submit a pull request when going on vacation!
      - No direct pushes
      - Centralized review in GitHub
      "1) They share responsibility in the project's success. 2) They have made a long-term, recurring time investment to improve the project. 3) They spend that time doing whatever needs to be done, not necessarily what is the most interesting or fun."
      This "cellular division" is the primary mechanism for scaling maintenance of the project as it grows.

  43. HIERARCHY
      1. BDFL
      2. Project Leader (day-to-day work)
      3. Core maintainers
      4. Subsystem maintainers
      5. Contributors

  44. BDFL
      Ideally, the BDFL role is like the Queen of England: awesome crown, but not an actual operational role day-to-day. The real job of a BDFL is to NEVER GO AWAY. ... the BDFL will always be there, preserving the philosophy and principles of the project, and keeping ultimate authority over its fate. This gives us great flexibility in experimenting with various governance models, knowing that we can always press the "reset" button without fear of fragmentation or deadlock. See the US congress for a counter-example.
      BDFL daily routine:
      * Is the project governance stuck in a deadlock or irreversibly fragmented?
        * If yes: refactor the project governance
      * Are there issues or conflicts escalated by core?
        * If yes: resolve them
      * Go back to polishing that crown.
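
The "BDFL daily routine" above is effectively a short decision procedure. A tongue-in-cheek sketch in the same spirit follows; the function name, parameters, and returned strings are hypothetical and only mirror the slide's bullet points:

```python
from typing import List


def bdfl_daily_routine(governance_stuck: bool, escalated_issues: List[str]) -> List[str]:
    """Return the BDFL's actions for the day, in the order the slide lists them."""
    actions: List[str] = []
    # Step 1: deadlocked or irreversibly fragmented governance gets refactored.
    if governance_stuck:
        actions.append("refactor the project governance")
    # Step 2: anything escalated by the core maintainers gets resolved.
    for issue in escalated_issues:
        actions.append(f"resolve: {issue}")
    # Step 3: on every day, including quiet ones, go back to the crown.
    actions.append("polish the crown")
    return actions


print(bdfl_daily_routine(False, []))
print(bdfl_daily_routine(True, ["naming dispute"]))
```

On a normal day both checks are false and the only action left is the last one, which matches the slide's claim that the role is not operational day to day.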

  45. TOOLING
      - Communication
        - IRC
        - Google Groups
        - Pull requests (all decisions are a pull request)
      - Git
        - GitHub
      - CI
        - Jenkins
        - Gate pull requests

  46. LIFE OF A PATCH
      5 states of a review:
      1. Triage: check DCO etc.; partially automated
      2. Design review
      3. Code review
      4. Docs review
      5. Merge
      Commit message bodies are optional!
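
The five review states can be modelled as a small state machine. This is a sketch under one loud assumption: that a pull request moves through the states strictly in the listed order, which the slide suggests but does not spell out. `ReviewState` and `next_state` are hypothetical names, not part of any Docker or GitHub tooling.

```python
from enum import Enum
from typing import Optional


class ReviewState(Enum):
    """The five states of a review, numbered as on the slide."""
    TRIAGE = 1          # check DCO etc.; partially automated
    DESIGN_REVIEW = 2
    CODE_REVIEW = 3
    DOCS_REVIEW = 4
    MERGE = 5


def next_state(state: ReviewState) -> Optional[ReviewState]:
    """Return the following state, or None once the change is merged.

    Assumes strictly linear progression (an assumption, not stated in the source).
    """
    if state is ReviewState.MERGE:
        return None
    return ReviewState(state.value + 1)


# Walk a pull request through every state in order.
state: Optional[ReviewState] = ReviewState.TRIAGE
path = []
while state is not None:
    path.append(state.name)
    state = next_state(state)
print(" -> ".join(path))
```

A real workflow would also need a way to send a change back to an earlier state when a review fails; that is left out here because the slide does not describe it.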

  47. LIFE OF A PATCH
