LARGE SCALE OPEN SOURCE DEVELOPMENT MODELS A COMPARATIVE ANALYSIS By Joe Gordon
ABOUT ME OpenStack Developer at HP Hacking on OpenStack for 4 years contact information jogo on freenode github.com/jogo
WHY Saw OpenStack grow from around 60 developers to 2,000 developers Unusual development model But how do other projects solve the same problems?
WHY IS PICKING THE RIGHT DEVELOPMENT MODEL IMPORTANT?
ACCELERATING GROWTH Linux kernel (started 1991) took 2 years to reach 100 contributors Linux 2.0 listed 190 contributors in its credits in 1996 OpenStack (started 2010) took 1 year to reach 100 contributors Docker had over 300 contributors in its first year (2013) Time to reach 200 contributors per month: Linux: 1991 - June 2004 (13 years) Debian: 1993 - March 2007 (14 years) OpenStack: 2010 - October 2012 (2 years)
ACCELERATING GROWTH Linux, Debian, Docker, OpenStack (clockwise from top left) source: openhub
OPEN SOURCE IS BIG BUSINESS Open source instead of standards bodies Balancing corporate interests Linux Foundation Gold and Platinum Members
PICKING THE RIGHT DEVELOPMENT MODEL CONWAY'S LAW "organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations"
DEVELOPMENT MODELS PROJECTS COVERED Linux Kernel Apache Software Foundation Debian OpenStack Docker
LINUX KERNEL
RELEASE Time based release model (2-3 months) Rolling development model, continually integrating major changes Separate stable team Release single artifact Rarely consumed directly by end users
SCALING MODEL Per month 1,000 contributors 5,000 to 7,000 patches Lieutenants / subsystem maintainers 100-150 maintainers Chain of Trust No elections for technical positions Decentralized review process Each maintainer has their own git tree Only about 1% of patches are directly merged by BDFL
HIERARCHY 1. BDFL 2. Subsystem maintainers 3. Contributors Usually one layer of subsystem maintainer but sometimes up to three
TOOLING Communication: mailing lists Git No automated pre-commit CI, but there is post-commit CI Code review: decentralized, on more mailing lists
LIFE OF A PATCH Process can be quick for minor fixes or take years for controversial changes 1. Design Preferably in the open, but not required 2. Early review via mailing list Find the correct maintainer Submit patches via email 3. Wider review: accepted into a subsystem maintainer's tree and into the -next trees 4. Merged into mainline
CULTURE Chain of trust About the individual Value frankness over politeness Corporate friendly No single company controls it Not much automated pre-commit testing Failing tests reflect very badly on the author
APACHE SOFTWARE FOUNDATION
RELEASE ASF is more of a governance umbrella and culture Each project does its own thing 150+ separate releases
SCALING MODEL Separate projects 4,431 committers 150+ top level projects 740 contributors in past 12 months? In-project scaling is up to each project Apache Spark had 570 contributors in past 12 months OpenOffice had 31 Flat(ish) trust model 'Review then commit' vs. 'commit then review' "In order to reduce friction and allow for diversity to emerge, rather than forcing a monoculture from the top ... each project is delegated authority over development of its software, and is given a great deal of latitude in designing its own technical charter and its own governing rules."
HIERARCHY 1. ASF Member 2. Project Management Committee Makes technical decisions 3. Committer 4. Developer "When the group felt that the person had "earned" the merit to be part of the development community, they granted direct access to the code repository, thus increasing the group and increasing the ability of the group to develop the program, and to maintain and develop it more effectively."
TOOLING Communication mailing list SVN Optional CI Central review system: Review Board Lazy consensus If it didn't happen on a mailing list, it didn't happen.
LIFE OF A PATCH Different projects have different review flows: 'Review then commit' or 'commit then review' Review Board
CULTURE Lazy consensus Focus is on the team All decisions are team based Focus is on contributors not companies No monoculture Within the ASF we worry about any community which centers around a few individuals who are working virtually uncontested.
DEBIAN
RELEASE When it's ready, not time based Notoriously slow Every 2 years Lots and lots of artifacts Unstable, Testing, Stable
SCALING MODEL Package Maintainers 3,200 Debian Developers Can have individual maintainers or groups (via a mailing list) No review; maintainers are trusted (and burdened) more
HIERARCHY Roles Maintainer: the person making the Debian package of the program. Sponsor: a person who helps maintainers to upload packages to the official Debian package archive (after checking their contents). Debian Developer (DD): a member of the Debian project with full upload rights to the official Debian package archive. Debian Maintainer (DM): a person with limited upload rights to the official Debian package archive.
TOOLING Communication Mailing list Web services Lots of IRC Poor automated testing Quality control is ultimately up to individual maintainers Half of the CI available isn't official No peer review system Except for new packages (FTP Masters)
LIFE OF A PACKAGE
CULTURE Rotating leadership (elections) Do-ocracy: An individual Developer may make any technical or nontechnical decision with regard to their own work Open development Independent, not 'profit-driven': no decisions imposed by whoever has money, infrastructure, or people No benevolent dictator, no oligarchy It is all about the individual (although individuals can form groups) Territorial
OPENSTACK
RELEASE Time based, every 6 months Continuous delivery Set of separate but related projects, usually with one-way dependencies Lots of artifacts Sometimes consumed directly by end users (without a distro) No rolling development: development on master is frozen before a release
SCALING MODEL Break down repositories and build teams around each repository 31 teams 150+ repositories 5,000 commits per month from 500 contributors 282 core developers Flat trust model Strong centralized review process (two core reviews) Automated testing to reduce reviewer burden Having trouble with scaling the team responsible for a single repository Can't get past 15 or so members on a core team
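The centralized review rule on this slide (two core reviews, with automated testing to reduce reviewer burden) can be sketched as a simple merge check. This is only an illustration under stated assumptions: `Change`, `can_merge`, and `REQUIRED_CORE_REVIEWS` are hypothetical names, not part of Gerrit or any OpenStack tool, and the real gating process has more steps.

```python
from dataclasses import dataclass, field

# Assumption: two approvals from core reviewers, as stated on the slide.
REQUIRED_CORE_REVIEWS = 2


@dataclass
class Change:
    """A proposed patch (hypothetical model, not Gerrit's data model)."""
    approved_by: set = field(default_factory=set)
    tests_passed: bool = False


def can_merge(change, core_team):
    """Return True when tests pass and enough core reviewers approved."""
    # Approvals from people outside the core team do not count.
    core_approvals = change.approved_by & set(core_team)
    return change.tests_passed and len(core_approvals) >= REQUIRED_CORE_REVIEWS


core_team = {"alice", "bob", "carol"}
change = Change(approved_by={"alice", "dave"}, tests_passed=True)
print(can_merge(change, core_team))  # "dave" is not core, so one approval short
change.approved_by.add("bob")
print(can_merge(change, core_team))  # second core approval has arrived
```

Because every merge needs two people from the same small core team, the check also hints at why a single repository becomes a bottleneck once its core team stops growing.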
HIERARCHY Flat as possible 1. TC 2. PTL 3. Core Teams
TOOLING Communication Mailing lists IRC Code reviews Git Code review: Gerrit Lots and lots of automated testing In person design summits twice a year
LIFE OF A PATCH
CULTURE Group over individual Egalitarian Elections Welcoming to new contributors Corporate friendly Not controlled by single company Lazy consensus Decentralized design Uniform tooling/process across projects
CULTURE OPENSTACK'S 4 OPENS Open Source, not open core Open Design Open Development Open Community Lazy consensus Technical governance is a meritocracy Put everything in the public
FACTORS LIMITING GROWTH Cross project issues Team size Single vision
DOCKER 'GitHub' development model
RELEASE Every 2 months Separate release branch Master isn't frozen
SCALING MODEL 37 maintainers in Docker 10-15 repos in total Maintainers / subsystem maintainers Have to submit a pull request when going on vacation! No direct pushes Centralized review in GitHub "1) They share responsibility in the project's success. 2) They have made a long-term, recurring time investment to improve the project. 3) They spend that time doing whatever needs to be done, not necessarily what is the most interesting or fun." This "cellular division" is the primary mechanism for scaling maintenance of the project as it grows.
HIERARCHY 1. BDFL 2. Project Leader (day to day work) 3. Core maintainers 4. Subsystem maintainers 5. Contributors
BDFL Ideally, the BDFL role is like the Queen of England: awesome crown, but not an actual operational role day-to-day. The real job of a BDFL is to NEVER GO AWAY. ... the BDFL will always be there, preserving the philosophy and principles of the project, and keeping ultimate authority over its fate. This gives us great flexibility in experimenting with various governance models, knowing that we can always press the "reset" button without fear of fragmentation or deadlock. See the US Congress for a counterexample. BDFL daily routine: * Is the project governance stuck in a deadlock or irreversibly fragmented? * If yes: refactor the project governance * Are there issues or conflicts escalated by core? * If yes: resolve them * Go back to polishing that crown.
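The "BDFL daily routine" on this slide is already written as a small decision procedure, so it translates directly into code. The sketch below is purely illustrative: the function name `bdfl_daily_routine` and its two inputs are hypothetical and do not come from Docker's governance documents.

```python
def bdfl_daily_routine(governance_stuck, escalated_issues):
    """Return the BDFL's actions for the day, in the order the slide lists them.

    governance_stuck: True if governance is deadlocked or irreversibly fragmented.
    escalated_issues: issues or conflicts escalated by the core maintainers.
    """
    actions = []
    if governance_stuck:
        actions.append("refactor the project governance")
    for issue in escalated_issues:
        actions.append(f"resolve: {issue}")
    # The routine always ends the same way, whether or not anything was wrong.
    actions.append("go back to polishing that crown")
    return actions


# A quiet day: nothing is stuck and nothing was escalated.
print(bdfl_daily_routine(False, []))
# A busy day: one conflict was escalated by core.
print(bdfl_daily_routine(False, ["disagreement over plugin API"]))
```

On a normal day the list contains only the last step, which matches the slide's point that the role is not meant to be operational.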
TOOLING Communication: IRC, Google Groups, pull requests (all decisions are a pull request) Git GitHub CI: Jenkins Gate pull requests
LIFE OF A PATCH 5 States of a review 1. Triage Check DCO etc. Partially automated 2. Design review 3. Code review 4. Docs review 5. Merge Commit message bodies are optional!
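The five review states listed on this slide can be modelled as an ordered enumeration. This is a minimal sketch that assumes a pull request moves strictly forward through the states; in practice a review may send a change back to an earlier state. `ReviewState` and `next_state` are hypothetical names, not GitHub or Docker APIs.

```python
from enum import Enum


class ReviewState(Enum):
    """The five states of a review, numbered as on the slide."""
    TRIAGE = 1         # check DCO etc., partially automated
    DESIGN_REVIEW = 2
    CODE_REVIEW = 3
    DOCS_REVIEW = 4
    MERGE = 5


def next_state(state):
    """Return the state that follows `state`, or None once merged."""
    if state is ReviewState.MERGE:
        return None
    return ReviewState(state.value + 1)


# Walk a pull request through the whole pipeline.
state = ReviewState.TRIAGE
while state is not None:
    print(state.name)
    state = next_state(state)
```

Keeping design, code, and docs review as separate states makes it explicit that a patch with good code can still be held up by its design or its documentation.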
LIFE OF A PATCH