Reproducibility & Generalizability @ Twitter Strengthening - PowerPoint PPT Presentation

Reproducibility & Generalizability @ Twitter
Strengthening Reproducibility in Network Science workshop, NetSci 2017
Brandon Roy (@bcroygbiv)
June 19, 2017

What is Twitter?
Twitter is a real-time information network: it’s what’s happening right now.
- I choose other users to follow
- All tweets by those users render into my timeline
- A tweet can be retweeted
- If some users I follow in turn follow me, it’s a mutual follow

HUB team: Health, Usage and Behavior
- Define and model user “health” at individual and population level
- Identify causal factors for health and usage
- Characterize user interests
- Translate insights into experiments and build prototype systems
- ...

Twitter Science (and friends)
- Analytics & Machine Learning
- Machine learning infrastructure / platforms
- User metrics and revenue modeling
- Content understanding (text, images, video)
- Data services and integration
- User modeling
- …

Science
The systematic study of the structure and behavior of the physical and natural world through observation and experiment.
Newton, observing an apple falling from a tree, develops a theory:
- Apples are attracted toward the Earth?
- Fruit is attracted toward the Earth?
- Unobserved force attracts all masses to one another
Develops the Law of Universal Gravitation.
Depends on a minimal set of conditions.
Scientific findings are reproducible under appropriate conditions.
Assumption: laws of physics are stable.

Science
Developmental psychology: how do children learn words? Study through observation and experiment.
- Observational study preserves the natural system. Can correlate features of objects & environment (e.g. shape, color, salience) with words learned.
- Experimental study can isolate and test factors, and may be more easily repeatable. But it may also lose important aspects of the system under analysis.
Assumption: human nature / behavior is relatively stable.
Medina et al., 2011

Studying Twitter
Twitter is both a social and a technical system. Parts are simple, but the system is complex.
Consists of millions of users producing, sharing, and consuming content.
How can we make Twitter better? How can we grow the platform?
[Figure labels: Regular use, Trial, Awareness]

Product experimentation
A/B testing
- Randomly assign users into control / treatment groups
- Record key metrics and look for statistically significant differences between groups
In this example, we learn the green button is “better” than the blue one, but we don’t necessarily have a “theory” of button color.
If we are able to replicate on other websites, with other text, and potentially with other background colors, we’ll start to feel more confident about green buttons.
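The slide does not say which statistical test is used to compare groups. As one common choice, a two-proportion z-test is sketched below for a click-through metric. The function name and the click counts are hypothetical, chosen only for illustration.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: clicks on a blue (control) vs. green (treatment) button.
z, p = two_proportion_z_test(successes_a=480, n_a=10_000,
                             successes_b=560, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value here says only that the groups differ on this metric in this experiment. It does not supply the “theory” of button color that the slide notes is missing.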

Product experimentation
DDG = “Duck Duck Goose”
- We’ve been experimenting with account recommendations to new users
- Change the recommendation algorithm for a subset of new users and compare to a control group
- Feel confident the finding would be valid (on average) for all users due to the random sampling strategy
- If things look good, we expect reproducibility and will “ship it” to all users
Caveat: other parts of the system may change, which could affect these findings!
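The talk does not describe how DDG assigns users to groups. One widely used approach, shown here purely as an assumption, is to hash the user ID together with an experiment name so that assignment is effectively random across users yet repeatable for any one user. The function name, experiment name, and ID format are hypothetical.

```python
import hashlib

def assign_bucket(user_id, experiment, treatment_fraction=0.5):
    """Deterministically assign a user to 'treatment' or 'control'."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if u < treatment_fraction else "control"

# Hypothetical experiment name and user IDs.
buckets = [assign_bucket(uid, "new_user_recs_v2") for uid in range(10_000)]
share = buckets.count("treatment") / len(buckets)
print(f"treatment share: {share:.3f}")
```

Because the bucket depends only on the experiment name and the user ID, rerunning the assignment reproduces the same groups, which helps when an analysis has to be repeated later.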

Observational data analysis
Many questions we would like to answer, but cannot (easily) manipulate through experiment. But we can try to study these questions using other methods.
Example: what makes a user “healthy”?
- Graph actions
- Graph state
- Production
- Consumption
- Active engagements
- Passive engagements
- Social interaction
- Rich media

Characterizing graph state
Link type is described by two attributes of account B:
B’s # followers
- 0 - 60
- 61 - 500
- 501 - 3,000
- 3,001 - 25,000
- 25,001 - 200,000
- 200,001 - 2,000,000
- 2,000,000+
B’s usage state
- Near zero
- Very light
- Light
- Medium Non-Tweeter
- Medium Tweeter
- Heavy Non-Tweeter
- Heavy Tweeter
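The follower-count ranges listed above can be applied with a simple lookup. The sketch below uses the bucket boundaries from the slide. The function name is hypothetical, and placing exactly 2,000,000 followers in the sixth bucket is an assumption, since the slide's last two ranges overlap at that value.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first six buckets, taken from the slide.
UPPER_BOUNDS = [60, 500, 3_000, 25_000, 200_000, 2_000_000]
LABELS = [
    "0 - 60",
    "61 - 500",
    "501 - 3,000",
    "3,001 - 25,000",
    "25,001 - 200,000",
    "200,001 - 2,000,000",
    "2,000,000+",
]

def follower_bucket(n_followers):
    """Return the slide's follower-count bucket label for account B."""
    if n_followers < 0:
        raise ValueError("follower count cannot be negative")
    # bisect_left finds the first upper bound >= n_followers.
    return LABELS[bisect_left(UPPER_BOUNDS, n_followers)]

print(follower_bucket(1_250))
```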

Analysis
Hypothesis: a user’s graph supports their activity, and only certain types of links are important for driving heavy usage.
Analysis:
- Match users with the same covariates except the variable in question
- Compare matched users who differ on the variable in question
For example, find a pair of users who have the same graph summary counts except for the # of small, heavy tweeter accounts followed, and look for different health outcomes.
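The matching procedure described above can be sketched as exact matching on covariates. This is a minimal illustration, not the method used in the talk: the field names, the binary form of the variable in question, the outcome measure, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def matched_difference(users, covariates, variable, outcome):
    """Average within-stratum outcome difference between users who differ
    only on `variable`, weighted by stratum size. Returns None if no
    stratum contains both kinds of user."""
    strata = defaultdict(lambda: {True: [], False: []})
    for u in users:
        key = tuple(u[c] for c in covariates)  # identical covariates
        strata[key][bool(u[variable])].append(u[outcome])
    weighted_sum, total_weight = 0.0, 0
    for groups in strata.values():
        if groups[True] and groups[False]:  # skip unmatched users
            weight = len(groups[True]) + len(groups[False])
            weighted_sum += weight * (mean(groups[True]) - mean(groups[False]))
            total_weight += weight
    return weighted_sum / total_weight if total_weight else None

# Hypothetical users: graph summary counts, the variable in question,
# and a stand-in "health" outcome.
users = [
    {"n_large": 3, "n_mutual": 5, "follows_small_heavy": True, "active_days": 20},
    {"n_large": 3, "n_mutual": 5, "follows_small_heavy": False, "active_days": 12},
    {"n_large": 10, "n_mutual": 1, "follows_small_heavy": True, "active_days": 9},
    {"n_large": 10, "n_mutual": 1, "follows_small_heavy": False, "active_days": 5},
    {"n_large": 7, "n_mutual": 2, "follows_small_heavy": True, "active_days": 30},
]
effect = matched_difference(users, ["n_large", "n_mutual"],
                            "follows_small_heavy", "active_days")
print(effect)
```

The last user has no match on the covariates and is excluded. Matching controls only for the covariates that are measured, so a difference found this way remains observational evidence.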

Observational data analysis
- Very excited when we first got this result
- Was intuitive, suggests ingredients for a great Twitter experience
- But it would be more convincing if we could reproduce the analysis with different data
- Better yet, reproduce the effect with a controlled experiment. But how to implement this change?

Reproducibility recommendations from Sandve et al., 2013
1. For every result, keep track of how it was produced
2. Avoid manual data manipulation steps
3. Archive the exact versions of all external programs used
4. Version control all custom scripts
5. Record all intermediate results, when possible in standardized formats
6. For analyses that include randomness, note underlying random seeds
7. Always store raw data behind plots
8. Generate hierarchical analysis output, allowing layers of increasing detail to be inspected
9. Connect textual statements to underlying results
10. Provide public access to scripts, runs, and results
Great to reproduce analysis … even better to reproduce the effect!
Sandve et al., 2013
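A few of these rules (1, 3, 6 and 7) can be illustrated by saving a small provenance record alongside each result. The sketch below is one possible way to do this, not a prescription from Sandve et al.: the function names, the record fields, and the toy bootstrap analysis are hypothetical.

```python
import hashlib
import json
import platform
import random

def run_analysis(values, seed):
    """Toy analysis with randomness: mean of one bootstrap resample."""
    rng = random.Random(seed)  # local generator, so the seed fully determines the draw
    sample = [rng.choice(values) for _ in values]
    return sum(sample) / len(sample)

def run_with_provenance(values, seed):
    """Run the analysis and return the result with a record of how it was produced."""
    return {
        "result": run_analysis(values, seed),
        "seed": seed,                                   # rule 6
        "python_version": platform.python_version(),    # rule 3
        "input_sha256": hashlib.sha256(
            json.dumps(values).encode("utf-8")).hexdigest(),  # rule 1
        "raw_values": values,                           # rule 7
    }

record = run_with_provenance([3, 1, 4, 1, 5, 9, 2, 6], seed=2017)
print(json.dumps(record, indent=2))
```

Writing the record as JSON keeps it in a standardized format (rule 5), and rerunning with the stored seed and inputs should give the same result on the same Python version.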

References
Sandve, G. K., Nekrutenko, A., Taylor, J., & Hovig, E. (2013). Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. https://doi.org/10.1371/journal.pcbi.1003285
Medina, T., Snedeker, J., Trueswell, J., & Gleitman, L. (2011). How words can and cannot be learned by observation. Proceedings of the National Academy of Sciences, 108(22), 9014.
