“Never doubt that a small group of thoughtful, committed citizens can change the world. Indeed, it is the only thing that ever has.” --Margaret Mead
Thank You R Hackers of NYC
Harvesting & Analyzing Interaction Data in R: The Case of MyLyn Sean P. Goggins, PhD Drexel University outdoors@acm.org MyLyn Research Collaborators: Peppo Valetto, PhD (PI) & Kelly Blincoe
I Study Small Groups I use electronic trace data, interviews, field notes, electronic content & surveys for raw data
Coolest Open* Data to Me ❖ Groups Emerging & Evolving ❖ Group Formation & Development ❖ The long tail of social computing, which I describe as everything *except* Wikipedia & Facebook ❖ Groups constructing knowledge, creating information and forming identity. *Available, but not always easy to get in an analyzable form
Points ❖ Harvesting Small, Open Data [MyLyn] ❖ Analyzing ❖ Temporal Changes in the MyLyn Network ❖ Work ❖ Talk ❖ Libraries Used & Source Code ❖ StatNet ❖ iGraph ❖ TNET ❖ R source code and data will be available for download at http://www.groupinformatics.org . If you use this data or these scripts, please cite: ❖ Goggins, S. P., Laffey, J., Amelung, C., and Gallagher, M. 2010. Social Intelligence In Completely Online Groups. IEEE International Conference on Social Computing. 500-507. DOI=10.1109/SocialCom.2010.79. ❖ Blincoe, K., Valetto, G., and Goggins, S. 2011. Leveraging Task Contexts for Managing Developers’ Coordination. Under Review.
Data for R An Example From the MyLyn Project
More About MyLyn: http://tasktop.com/blog/ http://www.eclipse.org/mylyn/ [Diagram labels: .zip file, MyLyn Context Uploads, Bug Database, HTML Parser, MySQL Database; data streams labeled Work and Talk]
Talk Cues [Figure labels: Work, Talk]
Coordination Requirements & Dependencies. MyLyn data has 2 advantages for analysis compared to source control systems analysis: 1. You see files *viewed* together. 2. Discourse on a bug is directly connected to the files read and edited, which gives a closer connection between analysis of work & talk. [Figure labels: Work, Talk]
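One way to picture the first advantage is as a weighted tie between any two developers whose task contexts touch the same files. The sketch below is plain Python, not the R scripts from the talk. The function name `coview_edges`, the developer ids, and the file names are all hypothetical, and weighting a pair by the count of shared files is an assumption rather than the project's actual measure.

```python
from collections import defaultdict
from itertools import combinations

def coview_edges(task_contexts):
    """Weight each developer pair by the number of distinct files both touched."""
    files_by_dev = defaultdict(set)
    for dev, files in task_contexts:
        files_by_dev[dev].update(files)
    edges = {}
    for a, b in combinations(sorted(files_by_dev), 2):
        shared = files_by_dev[a] & files_by_dev[b]
        if shared:  # only keep pairs that overlap on at least one file
            edges[(a, b)] = len(shared)
    return edges

# Hypothetical task contexts: (developer id, files viewed or edited in one task)
contexts = [
    ("dev_a", {"Task.java", "TaskList.java"}),
    ("dev_b", {"TaskList.java", "Editor.java"}),
    ("dev_a", {"Editor.java"}),
    ("dev_c", {"Readme.txt"}),
]
work_edges = coview_edges(contexts)
print(work_edges)  # {('dev_a', 'dev_b'): 2}
```

The resulting (developer, developer, weight) triples have the same shape as a weighted edge list, which is the kind of input a weighted network package expects.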
Harvesting Data for R An Example From the MyLyn Project
MyLyn Interaction Datamart [Diagram labels: MyLyn Interaction Warehouse, ETC, CANS; data streams labeled Work and Talk]
Analyzing Open Data with R An Example From the MyLyn Project
Analysis Tools ❖ Eight Mylyn Releases (Temporal Analysis) ❖ R Packages Used ❖ TNET ❖ iGraph ❖ Statnet
Weighted Network: TNET
The Dense Graph (Work) ❖ Developers create a dense graph. Not a complete graph, but dense.
A Sparser Graph (Talk) ❖ Commenters create a sparse graph.
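The dense versus sparse contrast can be made concrete with graph density: the share of possible ties that are actually present, where a complete graph scores 1.0. The following is a minimal Python sketch, not the talk's R code. The edge lists are invented for illustration and are not MyLyn data.

```python
def density(n_nodes, edges):
    """Share of possible undirected ties that are present (1.0 = complete graph)."""
    possible = n_nodes * (n_nodes - 1) / 2
    # Treat (a, b) and (b, a) as the same tie and ignore self-loops.
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    return len(unique) / possible if possible else 0.0

# Hypothetical networks over four people each
work_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]  # dense, but not complete
talk_edges = [(1, 2), (3, 4)]                          # sparse

print(round(density(4, work_edges), 3))  # 0.833
print(round(density(4, talk_edges), 3))  # 0.333
```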
Release One (2.0) Analysis [Figure: Release 1; panels Discussion (Talk) and Code (Work); iGraph]
STATNET for Discussion ❖ StatNet [Figure: Release 1, Talk; Red = Bug Commenter, Blue = Bug Opener; StatNET]
Release One Work & Talk
Release 1 (2.0) iGraph & Statnet [Figure: Release 1, Talk; Red = Bug Commenter, Blue = Bug Opener; labels: Clusters, In Degree & Out Degree; StatNET, iGraph]
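In degree and out degree count the ties arriving at and leaving each person in a directed network. Here is a small Python sketch of that count, not the talk's R code. It assumes an edge runs from a commenter to the person who opened the bug, and both that direction and the participant ids are hypothetical.

```python
from collections import Counter

def degrees(directed_edges):
    """Return (in_degree, out_degree) counters for a list of (source, target) ties."""
    out_deg = Counter(src for src, _ in directed_edges)
    in_deg = Counter(dst for _, dst in directed_edges)
    return in_deg, out_deg

# Hypothetical ties: (commenter, bug opener)
comments = [("p1", "p2"), ("p3", "p2"), ("p2", "p1"), ("p1", "p3")]
in_deg, out_deg = degrees(comments)

print(in_deg["p2"], out_deg["p2"])  # 2 1
print(in_deg["p1"], out_deg["p1"])  # 1 2
```

Under this assumed direction, a high in degree marks someone whose bugs draw comments, and a high out degree marks someone who comments on many bugs.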
Release One (2.0): Filtered [Figure: Release 1; panels Code (Work) and Discussion (Talk); Red = Bug Commenter, Blue = Bug Opener; annotation: Google Summer Coder] 304, 373, 399 & 143 form the strongest connections in both networks.
Release One (2.0): Filtered [Figure: Release 1; panels Code (Work) and Discussion (Talk); Red = Bug Commenter, Blue = Bug Opener; annotation: Google Summer Coder] 457, 391 & 159 – Comment & Open. 304, 373, 399 & 143 form the strongest connections in both networks.
Compare Over Time First & Last Release
Release 1 (2.0) Compared to Release 8 (3.3) [Figure: Talk; StatNET & ordinary plotting] Release 1: 304, 399, 143, 159, 173, 373. Release 8: 399, 118, 304, 159, 391, 416.
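A first-versus-last comparison like this one reduces to set arithmetic on participant ids. The Python sketch below uses the two id lists from this slide, not the talk's R code. It assumes the first list belongs to Release 1 and the second to Release 8, and the variable names are illustrative.

```python
# Participant ids as listed on the slide (assumed order: Release 1, then Release 8)
release_1 = {304, 399, 143, 159, 173, 373}
release_8 = {399, 118, 304, 159, 391, 416}

stayed = sorted(release_1 & release_8)   # present in both releases
left = sorted(release_1 - release_8)     # only in Release 1
joined = sorted(release_8 - release_1)   # only in Release 8

print(stayed)  # [159, 304, 399]
print(left)    # [143, 173, 373]
print(joined)  # [118, 391, 416]
```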
Release 1 (2.0) Compared to Release 8 (3.3) [Figure: Work; Release 1 and Release 8; iGraph] Annotations: 304, 373, 399 & 143; 143 & 304 disengaged or missing entirely; two disconnected graphs in Release 8.
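Whether a release's network has split into disconnected graphs can be checked by listing its connected components. This is a minimal breadth-first sketch in Python, not the talk's R code. The edge list is hypothetical and only mimics a network that has split in two.

```python
from collections import defaultdict, deque

def components(edges):
    """Return the connected components of an undirected graph as sorted node lists."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:  # breadth-first walk from the unvisited start node
            node = queue.popleft()
            comp.append(node)
            for nb in sorted(adj[node]):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        result.append(sorted(comp))
    return result

# Hypothetical ties between developers in a later release
later_release = [("a", "b"), ("b", "c"), ("d", "e")]
parts = components(later_release)
print(len(parts), parts)  # 2 [['a', 'b', 'c'], ['d', 'e']]
```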
Release Eight Work & Talk
Release 8 (3.3): Filtered [Figure: Release 8; panels Discussion (Talk) and Code (Work); Red = Bug Commenter, Blue = Bug Opener] Nobody is “Just Blue”.
Release 8 (3.3): Filtered [Figure: Release 8; panels Discussion (Talk) and Code (Work); Red = Bug Commenter, Blue = Bug Opener] Notice 416 in Talk & Second Coder Graph.
Release 8 (3.3) iGraph & Statnet: 399, 118 & 159 are significant, but play with different clusters of other people. [Figure: Release 8, Talk; Red = Bug Commenter, Blue = Bug Opener; labels: Clusters, Blue Cluster, In Degree & Out Degree; StatNET, iGraph]
Releases One → Eight High Level Views Over Time
Discussion, Releases 1 – 8: Where there is no color, there are multiple, incomplete graphs.
Code, Releases 1 – 8: One possible explanation: a few central people who slowly but observably begin to engage other contributors in an open source software development project. Structure evolves. Key groups evolve. [Figure: iGraph]
Next Step: The Story But that’s the research part, not the cool “R Stuff” Part
The People: 373, 304, 399, 159, 143. Our next step is piecing together a narrative about the groups that emerged on this project, and describing each of the individuals. This is all open data. When we finish this part, we will publish one or more papers. For now, let’s look at the cool “R Stuff”.
Interaction Traces from Small Groups: The Case of MyLyn Sean P. Goggins, PhD Drexel University outdoors@acm.org Collaborators: Peppo Valetto, PhD & Kelly Blincoe Questions? In the after session.