Reproducibility & Generalizability @ Twitter

Strengthening Reproducibility in Network Science workshop

NetSci 2017

Brandon Roy

@bcroygbiv

June 19, 2017

Twitter is a real-time information network – it’s what’s happening right now

What is Twitter?

I choose other users to follow

All tweets by those users render into my timeline

A tweet can be retweeted

If some users I follow in turn follow me, it’s a mutual follow

Health, Usage and Behavior

HUB team

- Define and model user “health” at individual and population level
- Identify causal factors for health and usage
- Characterize user interests
- Translate insights into experiments and build prototype systems
- ...

- Analytics & Machine Learning
- Machine learning infrastructure / platforms
- User metrics and revenue modeling
- Content understanding (text, images, video)
- Data services and integration
- User modeling
- …

Twitter Science (and friends)

The systematic study of the structure and behavior of the physical and natural world through observation and experiment

Newton, observing an apple falling from a tree, develops a theory:
- Apples are attracted toward the Earth?
- Fruit is attracted toward the Earth?
- Unobserved force attracts all masses to one another

Develops Law of Universal Gravitation

Depends on a minimal set of conditions

Scientific findings are reproducible under appropriate conditions

Assumption: laws of physics are stable

Science

Developmental psychology – how do children learn words?

Study through observation and experiment

Observational study preserves natural system. Can correlate features of objects & environment (e.g. shape, color, salience) with words learned

Experimental study can isolate and test factors, may be more easily repeatable. But may also lose important aspects of system under analysis

Assumption: human nature / behavior is relatively stable

Science

Medina et al., 2011

Studying Twitter

Twitter is both a social and a technical system. Its parts are simple, but the system is complex

Consists of millions of users producing, sharing, and consuming content

Awareness

Regular use

How can we make Twitter better? How can we grow the platform?

Randomly assign users into control / treatment groups

Record key metrics and look for statistically significant differences between groups

Product experimentation

A/B testing

In this example, we learn green button is “better” than blue…

But we don’t necessarily have a “theory” of button color

If we are able to replicate on other websites, with other text, with potentially other background colors, we’ll start to feel more confident about green buttons
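The A/B testing recipe (assign users to control / treatment, record a metric, test for a difference between groups) can be sketched in a few lines. This is only an illustration under stated assumptions, not Twitter's experimentation code: `assign_bucket` and `two_proportion_z_test` are hypothetical names, hashing the user ID stands in for random assignment so that the split can be recomputed later, and the metric is assumed to be a simple per-user success rate such as a button click.

```python
import hashlib
import math


def assign_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically place a user in 'control' or 'treatment'.

    Assumption: hashing (experiment, user_id) is used as a repeatable
    stand-in for random assignment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Read the first 8 hex digits as a fraction in [0, 1).
    fraction = int(digest[:8], 16) / 16**8
    return "treatment" if fraction < treatment_share else "control"


def two_proportion_z_test(successes_a: int, n_a: int, successes_b: int, n_b: int):
    """Two-sided z-test for a difference in success rates (group B minus group A)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

Calling `two_proportion_z_test(clicks_blue, users_blue, clicks_green, users_green)` returns the z statistic and p-value for the button example. A small p-value says the groups differ on this metric; it still gives no "theory" of button color.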

We’ve been experimenting with account recommendations to new users

Change recommendation algorithm for subset of new users and compare to control group

We feel confident the finding would be valid (on average) for all users because of the random sampling strategy

If things look good, we expect reproducibility and will “ship it” to all users

Caveat: other parts of system may change, could affect these findings!

Product experimentation

DDG = “Duck Duck Goose”

Graph state, Consumption, Passive engagements, Rich media

Graph actions, Production, Active engagements, Social interaction

Many questions we would like to answer but cannot (easily) manipulate through experiment

But we can try to study these questions using other methods

Example: what makes a user “healthy”?

Observational data analysis

Characterizing graph state

Link type

B’s # followers: 0 - 60; 61 - 500; 501 - 3,000; 3,001 - 25,000; 25,001 - 200,000; 200,001 - 2,000,000; 2,000,000+

B’s usage state: Near zero; Very light; Light; Medium Non-Tweeter; Medium Tweeter; Heavy Non-Tweeter; Heavy Tweeter
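One way to turn these two axes into a summary of a user's graph state is to count followed accounts by (follower bin, usage state). The sketch below assumes that each followed account B is available as a `(n_followers, usage_state)` pair; the function names and the input shape are hypothetical, and only the bin edges and state labels come from the slide.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper edges of the follower bins listed on the slide.
FOLLOWER_BIN_UPPER = [60, 500, 3_000, 25_000, 200_000, 2_000_000]
FOLLOWER_BIN_LABELS = [
    "0 - 60",
    "61 - 500",
    "501 - 3,000",
    "3,001 - 25,000",
    "25,001 - 200,000",
    "200,001 - 2,000,000",
    "2,000,000+",
]


def follower_bin(n_followers: int) -> str:
    """Map B's follower count to its bin label."""
    return FOLLOWER_BIN_LABELS[bisect_left(FOLLOWER_BIN_UPPER, n_followers)]


def graph_summary(followed) -> Counter:
    """Count a user's followed accounts by (follower bin, usage state).

    `followed` is an iterable of (n_followers, usage_state) pairs, one per
    account B that the user follows (an assumed input format).
    """
    return Counter((follower_bin(n), state) for n, state in followed)
```

The resulting counts are one possible form of the per-user graph summary that a matching analysis could use as covariates.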

Hypothesis

A user’s graph supports their activity, and only certain types of links are important for driving heavy usage

Analysis

Match users with the same covariates except the variable in question

Compare matched users who differ on the variable in question

For example, find a pair of users who have the same graph summary counts except for # of small, heavy tweeter accounts followed, and look for different health outcomes
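The matching step can be sketched as exact matching on every covariate except the one under study. This is a simplified illustration, not the analysis actually run: the record layout (a `covariates` dict plus an outcome field) and the function name are assumptions, users may appear in more than one pair, and a real analysis would likely match on coarsened values and test the differences statistically.

```python
from collections import defaultdict
from itertools import combinations


def matched_pair_differences(users, variable, outcome):
    """Return outcome differences for pairs that differ only on `variable`.

    Each user is a dict with a 'covariates' dict and an outcome field
    (an assumed record layout). Each difference is the outcome of the user
    with the larger value of `variable` minus that of the other user.
    """
    # Group users whose remaining covariates are identical.
    strata = defaultdict(list)
    for user in users:
        key = tuple(sorted(
            (name, value)
            for name, value in user["covariates"].items()
            if name != variable
        ))
        strata[key].append(user)

    differences = []
    for group in strata.values():
        for a, b in combinations(group, 2):
            value_a = a["covariates"][variable]
            value_b = b["covariates"][variable]
            if value_a == value_b:
                continue  # the pair does not differ on the variable in question
            high, low = (a, b) if value_a > value_b else (b, a)
            differences.append(high[outcome] - low[outcome])
    return differences
```

If the differences are consistently positive, users who follow more accounts of the given type tend to have the better health outcome among otherwise identical users. That is suggestive, but still observational.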

Very excited when we first got this result

Was intuitive, suggests ingredients for a great Twitter experience

But would be more convincing if we could reproduce analysis with different data.

Better yet, reproduce effect with controlled experiment.

But how to implement this change?

1. For every result, keep track of how it was produced
2. Avoid manual data manipulation steps
3. Archive the exact versions of all external programs used
4. Version control all custom scripts
5. Record all intermediate results, when possible in standardized formats
6. For analyses that include randomness, note underlying random seeds
7. Always store raw data behind plots
8. Generate hierarchical analysis output, allowing layers of increasing detail to be inspected
9. Connect textual statements to underlying results
10. Provide public access to scripts, runs, and results

Reproducibility recommendations

from Sandve et al., 2013
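Rules 1 and 6 in particular are cheap to follow in code. The sketch below is one possible approach, with hypothetical names throughout: it runs an analysis with an explicit random seed and stores the seed, the Python version, and a hash of the raw input next to the result. `bootstrap_mean` is only a toy analysis that uses randomness.

```python
import hashlib
import json
import platform
import random


def run_with_provenance(analysis, raw_data, seed):
    """Run `analysis(raw_data, rng)` and record how the result was produced."""
    rng = random.Random(seed)  # rule 6: the seed is explicit and recorded
    result = analysis(raw_data, rng)
    data_bytes = json.dumps(raw_data, sort_keys=True).encode("utf-8")
    return {
        "result": result,
        "seed": seed,
        "python_version": platform.python_version(),
        # Fingerprint of the raw input, so the exact data can be identified later.
        "raw_data_sha256": hashlib.sha256(data_bytes).hexdigest(),
    }


def bootstrap_mean(data, rng, n_resamples=200):
    """Toy analysis with randomness: the average of bootstrap-resampled means."""
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(data) for _ in data]
        means.append(sum(sample) / len(sample))
    return sum(means) / len(means)
```

Writing the returned record to a file with `json.dump` keeps the result and its provenance together; rerunning with the same seed and data in the same environment should give the same record.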

Great to reproduce analysis… even better to reproduce the effect!

Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. https://doi.org/10.1371/journal.pcbi.1003285

Medina, T., Snedeker, J., Trueswell, J., & Gleitman, L. (2011). How words can and cannot be learned by observation. Proceedings of the National Academy of Sciences, 108(22), 9014.

References
