Mining the Modern Code Review Repositories: A Dataset of People, Process and Product (MSR 2016)

Xin Yang, Raula G. Kula, Norihiro Yoshida, Hajimu Iida

May 14–15, 2016. Austin, Texas

MSR 2016 data showcase

Osaka University

Japan

Nagoya University

Japan

NAIST

Japan

NAIST

Japan

An Overview of the Code Review Dataset

● Code Review

● Source Code

● Human / Social

Why did we make this dataset?

*Hamasaki et al., “Who does what during a code review? Datasets of OSS peer review repositories”. MSR '13

Our JSON-based Dataset (Hamasaki et al. MSR '13)*

Our previous work (Hamasaki et al. MSR '13)*

Some feedback:

● “Hard to query...”

● “Hard to convert...”

● “Unable to access the source code...”

Script

Typical Modern Code Review Process

Process

Product

People

You can mine from three different aspects

4 years    3 years    7 years    4 years    3 years
611        20         567        111        189
173,749    13,597     63,610     110,172    9,168
5,091      437        3,334      1,437      759

Dataset Statistics (updated to May 2015)

goo.gl/Wi4UoJ

Download the Dataset

Get Your Copy Now!!!
