The Social Structure of Open Source Software Development Authors: Kevin Crowston and James Howison Presented by Bill Shenk
What is social structure? ● Control ● Coordination ● Socialization ● Continuity
Why study social structure for OSSE? “little is known about how people in these communities coordinate software development … or about what software processes, work practices, and organizational contexts are necessary to their success.” Walt Scacchi, “Software Development Practices in Open Software Development Communities: A Comparative Case Study”, 2002
Why does this matter? Understanding social structure can help with: ● Development planning ● Predictable relationships between code structure and social structure ● Risk management ● Team members that are vital to the success of a project
What's important when studying social structure? ● Individuals – Group size: small, large? constant, growing, shrinking? ● Their actions – How is the work divided? Who contributes what? ● Their interactions – Who talks to whom, how often, where? – Is the communication funneled somehow?
Centralization and decentralization: Development ● Actions: code is written ● Centralized: a small core group of code contributors – cathedral ● Decentralized: contributions from a larger portion of project individuals – bazaar
Centralization and decentralization: Communication ● Interactions between project members (via email, IM, wiki, bug-tracking, etc.) ● Centralized: a small group speaks to the larger group, and the larger group talks only to the small group ● Decentralized: project members speak to each other as a whole
Decentralized development and communication Raymond (1998), Kuwabara (2000), Cox (1998) suggest that most OSS projects are decentralized in development, communication, or both ● Decentralized development and planning are a good indication of decentralized communication ● “decentralized development” surrounded in “clamor … anyone is welcome – the more people, the louder the clamor, the better it is.” - Kuwabara (2000) ● The more the merrier!
Decentralized communication and OSS Alan Cox argues that bazaar projects can lead to “clique” formation ● Linux for 8086: noise from inexperienced programmers prompted the core developers to form a secluded group ● Discussions should focus on existing code rather than opinions and ideas, avoid “town councilors”
Centralized communication and OSS ● Centralized social structures can sometimes lead to “ownership” of a project ● Informal ownership often goes to founding member(s) – e.g., Linux ● Raymond (1998) believes that some centralization is vital to OSS success
Authors' Study ● Examined communication centralization during the bug-fixing stage ● Chosen because bug fixing is a “microcosm of coordination problems” (Crowston 1997) and involves collaboration across many individuals and roles ● Data taken from SourceForge through spiders and parsers ● Criteria: relatively active projects with at least seven developers and at least 100 bugs per project
How the data was analyzed ● SourceForge ID was used as an individual's identifier ● Each message tied to a bug counts as one interaction from one sender to another sender, starting with the reporter ● 23% of messages were sent anonymously; these “nobody interactions” were considered extraneous and excluded from the analysis
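A minimal sketch of how one bug thread might be turned into interactions, under assumptions the slide does not state: each follow-up message is treated as directed at the immediately preceding poster, anonymous posts are dropped before pairing, and the function name and the `"nobody"` label are hypothetical.

```python
def interactions_from_bug(posters, anonymous="nobody"):
    """Turn one bug thread into (sender, receiver) pairs.

    posters: SourceForge IDs in posting order, reporter first.
    Assumption: each message is directed at the previous known poster.
    Assumption: anonymous posts are removed before pairing.
    """
    known = [p for p in posters if p != anonymous]
    # Pair each poster with the one before; skip replies to oneself.
    return [
        (sender, previous)
        for previous, sender in zip(known, known[1:])
        if sender != previous
    ]


thread = ["alice", "nobody", "bob", "alice"]
pairs = interactions_from_bug(thread)
print(pairs)
```

Other readings of the slide are possible, for example directing every reply at the original reporter; only the pairing rule would change.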
Some raw figures ● 120 projects (out of 50,000 at the time) fit the criteria, had available bug-tracking data, and were analyzed ● 61,068 bug reports, avg. of 509 per project – only bugs with at least one reply were counted ● 14,922 total unique users (posters) with avg. of 140 users per project
Centralization scores ● Central project individuals are those who send and receive a greater number of messages ● Centrality is measured from the messages each member sends (out-degree centrality) ● In a very centralized project, a single individual will have a high out-degree; in a decentralized project, no one person stands out
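One way to turn individual out-degrees into a single project score is Freeman-style degree centralization, sketched below. This is an illustration, not necessarily the authors' exact computation: it assumes out-degree counts distinct recipients (so the score falls between 0 and 1), and the function name is hypothetical.

```python
from collections import defaultdict


def out_degree_centralization(edges):
    """Freeman-style out-degree centralization for a directed graph.

    edges: iterable of (sender, receiver) pairs.
    Assumption: out-degree = number of distinct recipients per sender.
    Returns 1.0 for a pure star (one sender reaches everyone else)
    and 0.0 when every member has the same out-degree.
    """
    nodes = set()
    targets = defaultdict(set)
    for sender, receiver in edges:
        nodes.update((sender, receiver))
        if sender != receiver:
            targets[sender].add(receiver)

    n = len(nodes)
    if n < 2:
        return 0.0

    degrees = [len(targets[node]) for node in nodes]
    d_max = max(degrees)
    # Sum of gaps to the most central member, divided by the largest
    # possible sum, which a star graph attains: (n - 1) ** 2.
    return sum(d_max - d for d in degrees) / ((n - 1) ** 2)


star = [("a", "b"), ("a", "c"), ("a", "d")]
ring = [("a", "b"), ("b", "c"), ("c", "a")]
print(out_degree_centralization(star))
print(out_degree_centralization(ring))
```

The star and ring cases mark the two extremes; real projects, such as the ones plotted on the following slides, fall in between.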
Pretty graph – interaction plot Figure 5 – openrpg
Pretty graph – interaction plot Figure 9 – curl, centralization = 0.922
Pretty graph – interaction plot Figure 10 – squirrelmail, centralization = 0.377
Distribution Figure 8 – centralization scores for projects
Project size vs. centralization ● Authors discovered large projects are typically less centralized ● Possible interpretation: in a large project, it is difficult for a single individual to fix every bug ● Growing projects lead to modularity and formation of smaller groups
What does it all mean?! ● Data and graphs show that the bug-tracking communication, on the whole, was neither centralized nor decentralized ● Average centralization: 0.56 ± 0.20
Questions & Further Analysis ● Projects that changed leaders? (centralized → decentralized) ● Might posters with a high out-degree simply be verbose or unclear, and so score artificially high? ● Mailing list communication centralization was also measured for 52 projects and showed similar results. What about other ways of measuring communication centralization? ● Communication centralization vs. development centralization? Would it show similar results?
Thank you!