Noises in Interactions Traces Data and their Impact on Previous Studies. Zéphyrin Soh, Thomas Drioul, Pierre-Antoine Rappe, Foutse Khomh, Yann-Gaël Guéhéneuc, Naji Habra. ESEM, October 22nd, 2015, Beijing, China


Page 1: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interactions Traces Data and their Impact on Previous Studies

Zéphyrin Soh, Thomas Drioul, Pierre-Antoine Rappe, Foutse Khomh, Yann-Gaël Guéhéneuc, Naji Habra

ESEM, October 22nd, 2015

Beijing, China

Page 2: Noises in Interactions Traces Data and their Impact on Previous Studies

Outline

Background

Problem and Motivations

Noises in Interaction Traces

Correction of Noises

Impact of Noises on Previous Studies

Conclusion

Page 3: Noises in Interactions Traces Data and their Impact on Previous Studies

Background

Interaction traces (ITs): activity logs collected when developers interact with the IDE

Interaction event: each interaction with the IDE
– Kind
– Handle
– OriginID
– StartDate
– EndDate
– ...
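As an illustration, an interaction event with the attributes listed on this slide could be modelled as below. This is only a sketch: the class name, the example values, and the meaning given to each attribute in the comments are assumptions, not Mylyn's actual API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class InteractionEvent:
    """One interaction with the IDE (attribute names follow the slide)."""
    kind: str             # e.g. "edit" or "selection"
    handle: str           # assumed: identifier of the program element involved
    origin_id: str        # assumed: identifier of the IDE part that raised the event
    start_date: datetime
    end_date: datetime

    @property
    def duration(self) -> float:
        """Duration of the event in seconds."""
        return (self.end_date - self.start_date).total_seconds()


# Hypothetical event: all values are made up for the example.
event = InteractionEvent(
    kind="edit",
    handle="hypothetical/Foo.java",
    origin_id="hypothetical.editor",
    start_date=datetime(2015, 10, 22, 9, 0, 0),
    end_date=datetime(2015, 10, 22, 9, 0, 12),
)
print(event.duration)  # 12.0
```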

Page 4: Noises in Interactions Traces Data and their Impact on Previous Studies

Background

Mylyn
– Create, activate, deactivate

Available traces

ITs are a valuable source of information

Page 5: Noises in Interactions Traces Data and their Impact on Previous Studies

Background

Page 6: Noises in Interactions Traces Data and their Impact on Previous Studies

Problem and Motivations

Are there noises in ITs?

Can we correct noises (if any) in ITs?

How do noises impact previous studies?

Page 7: Noises in Interactions Traces Data and their Impact on Previous Studies

Problem and Motivations

ITs are collected in a real-work environment

The time mined from ITs is the time spent by the developers performing the maintenance task

Eclipse
– Interleaved activities
– Interruptions/Idle times

Page 8: Noises in Interactions Traces Data and their Impact on Previous Studies

Problem and Motivations

Kind = « edit »: change activity
– Edit with non-null duration

Intent = time spent in the editor is more productive

VS.

Previous studies depend on the accuracy of the information mined from ITs

Page 9: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

15 participants

Four systems

1 participant = 1 task

Collected data
– Mylyn Interaction Traces (RITs)
– Videos → Video Transcription → Video-based Interaction Traces (VITs)

Page 10: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

Time-related Noise
– Global Time (GT) = endDate(e3) - startDate(e1)
– Accumulated Time (AT) = d1 + d2 + d3

[Figure: timeline of events e1, e2, e3 with durations d1, d2, d3, overlap time ot and idle time it]
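The two time measures can be sketched as follows. Events are represented here as hypothetical (start, end) pairs in seconds. GT is taken from the earliest start to the latest end, which matches endDate(e3) - startDate(e1) when the events are in chronological order.

```python
def global_time(events):
    """GT: end of the last event minus start of the first event."""
    return max(end for _, end in events) - min(start for start, _ in events)


def accumulated_time(events):
    """AT: sum of the individual event durations (d1 + d2 + d3)."""
    return sum(end - start for start, end in events)


# Hypothetical events: e2 overlaps e1 by 2 s (ot),
# and 5 s of idle time (it) separate e2 from e3.
events = [(0, 10), (8, 20), (25, 30)]

gt = global_time(events)       # 30 - 0 = 30
at = accumulated_time(events)  # 10 + 12 + 5 = 27
print(gt, at)
```

In this example GT and AT differ because AT counts the 2 s overlap twice and ignores the 5 s of idle time: 27 - 2 + 5 = 30.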

Page 11: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

GT vs. AT

VITs: same results (by definition)

RITs: different results

Page 12: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

VITs vs. RITs

RITs miss on average 6% of the time spent to perform the task.

Difference due to overlaps between events

Page 13: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

RITs

Average individual idle times ≈ 30 sec.

Time (RITs) = AT - ot + d
– d = it if it < 30 sec.

[Figure: timeline of events e1, e2, e3 with durations d1, d2, d3, overlap time ot and idle time it]
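A minimal sketch of the formula Time (RITs) = AT - ot + d, with events given as hypothetical (start, end) pairs in seconds. The slide only states the case it < 30 sec.; dropping idle times of 30 s or more (d = 0) is an assumption made here for illustration.

```python
IDLE_THRESHOLD = 30.0  # seconds, average individual idle time from the slide


def corrected_time(events, threshold=IDLE_THRESHOLD):
    """Time (RITs) = AT - ot + d for (start, end) events sorted by start."""
    at = sum(end - start for start, end in events)
    overlap = 0.0     # ot: time counted twice by overlapping events
    kept_idle = 0.0   # d: idle times shorter than the threshold
    covered_until = events[0][1]
    for start, end in events[1:]:
        gap = start - covered_until
        if gap < 0:
            # The event starts before the previous ones ended.
            overlap += min(-gap, end - start)
        elif gap < threshold:
            kept_idle += gap
        # Assumption: gaps >= threshold are treated as interruptions (d = 0).
        covered_until = max(covered_until, end)
    return at - overlap + kept_idle


# Hypothetical trace: 2 s overlap, a 5 s idle time, then a 70 s interruption.
events = [(0, 10), (8, 20), (25, 30), (100, 110)]
print(corrected_time(events))  # 37 - 2 + 5 = 40.0
```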

Page 14: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

Edit-related Noise
– #edit events (duration <> 0)
• VITs vs. RITs

Page 15: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

Edit-related Noise
– Not all edits are real modifications of the code
– edit(VITs) vs. edit(RITs)

ITs contain about 28% false edit-events

Page 16: Noises in Interactions Traces Data and their Impact on Previous Studies

Noises in Interaction Traces

Edit-related Noise
– Feedback from Mylyn community
• « ... the argument that there is noise in the edit events makes sense to me. »
• « The edit events don’t have to be textual edits ... »

Page 17: Noises in Interactions Traces Data and their Impact on Previous Studies

Correction of Noises

Correction approach

[Diagram: VITs and RITs → Traces Alignment → Correction rules; RITs and Correction rules → Traces Correction → CITs]

Page 18: Noises in Interactions Traces Data and their Impact on Previous Studies

Correction of Noises

Traces alignment

An edit-event is a false edit-event if it takes less than 24.01 seconds but more than 0 seconds

[Figure: alignment of a VIT with a RIT]
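The false edit-event rule stated on this slide can be sketched as a simple predicate. Events are represented here as hypothetical (kind, duration) pairs; the function and constant names are illustrative only.

```python
FALSE_EDIT_THRESHOLD = 24.01  # seconds, from the slide


def is_false_edit(kind, duration):
    """True for an edit event lasting more than 0 s but less than 24.01 s."""
    return kind == "edit" and 0 < duration < FALSE_EDIT_THRESHOLD


# Hypothetical events as (kind, duration in seconds).
events = [("edit", 3.2), ("edit", 0.0), ("edit", 40.0), ("selection", 5.0)]
flags = [is_false_edit(kind, duration) for kind, duration in events]
print(flags)  # [True, False, False, False]
```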

Page 19: Noises in Interactions Traces Data and their Impact on Previous Studies

Correction of Noises

Correction rules

CITs = RITs + application of correction rules

[Decision tree for events with Kind = « edit »; node labels: No change, Code change, Time <= 24.01s, Time > 24.01s, Double click, open, Static navigation (F3), From search view, others]

Page 20: Noises in Interactions Traces Data and their Impact on Previous Studies

Impact of Noises

Editing Styles
– A. Ying and M. Robillard. The influence of the task on programmer behaviour. ICPC 2011

An edit event e is in the first half if more than 50% of time(e) is in the first half

Fraction(e = first)
– 0 to 19% = edit-last
– 87 to 100% = edit-first
– Otherwise = edit-throughout

[Figure: timeline from 0 to 1 with midpoint 0.5 and edit events e1 and e2]
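The classification above could be sketched as follows. Two points are assumptions for the sake of the example: the "first half" is taken as the first half of the task's time span, and Fraction(e = first) is taken as the percentage of edit events that fall in the first half. Edit events are hypothetical (start, end) pairs in seconds.

```python
def in_first_half(start, end, midpoint):
    """True if more than 50% of the event's time falls before the midpoint."""
    time_in_first = max(0.0, min(end, midpoint) - start)
    return time_in_first / (end - start) > 0.5


def editing_style(edit_events, task_start, task_end):
    """Classify a trace from the percentage of edit events in the first half."""
    midpoint = (task_start + task_end) / 2
    # Only edit events with a non-null duration are considered.
    timed = [(s, e) for s, e in edit_events if e > s]
    if not timed:
        return None
    first = sum(in_first_half(s, e, midpoint) for s, e in timed)
    fraction = 100.0 * first / len(timed)
    if fraction <= 19:
        return "edit-last"
    if fraction >= 87:
        return "edit-first"
    return "edit-throughout"


# Hypothetical traces for a task lasting from 0 s to 100 s.
print(editing_style([(60, 70), (80, 90)], 0, 100))            # edit-last
print(editing_style([(5, 10), (20, 30)], 0, 100))             # edit-first
print(editing_style([(10, 20), (40, 60), (70, 80)], 0, 100))  # edit-throughout
```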

Page 21: Noises in Interactions Traces Data and their Impact on Previous Studies

Impact of Noises

VITs (oracle), RITs, CITs

Evaluation
– Precision & recall

[Diagram: the editing styles from RITs and from CITs are each evaluated against the editing styles from VITs to measure the impact of noise]

Page 22: Noises in Interactions Traces Data and their Impact on Previous Studies

Impact of Noises

Experiment data

                   RITs                     CITs
                   Precision (%) Recall (%) Precision (%) Recall (%)
edit-first         0             0          0             0
edit-last          100           11.11      100           33.33
edit-throughout    18.18         100        22.22         100

Bugzilla data
– 1 970 ITs for 4 systems
– 66% same categorisation
– 34% different categorisation
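For reference, per-style precision and recall against an oracle can be computed as in this sketch. The label lists are hypothetical and are not the study's data. Reporting 0 when a style is never predicted or never present is an assumption.

```python
def precision_recall(oracle, predicted, label):
    """Precision and recall (in %) of one editing style against the oracle."""
    true_pos = sum(o == label and p == label for o, p in zip(oracle, predicted))
    n_predicted = sum(p == label for p in predicted)
    n_actual = sum(o == label for o in oracle)
    precision = 100.0 * true_pos / n_predicted if n_predicted else 0.0
    recall = 100.0 * true_pos / n_actual if n_actual else 0.0
    return precision, recall


# Hypothetical labels: oracle (e.g. from VITs) vs. styles mined from traces.
oracle = ["edit-last", "edit-last", "edit-last", "edit-throughout"]
predicted = ["edit-last", "edit-throughout", "edit-throughout", "edit-throughout"]

for style in ("edit-first", "edit-last", "edit-throughout"):
    p, r = precision_recall(oracle, predicted, style)
    print(style, round(p, 2), round(r, 2))
```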

Page 23: Noises in Interactions Traces Data and their Impact on Previous Studies

Impact of Noises

Edit ratio: #edits / #events

– Measure developers' productivity

• Kersten and Murphy. FSE 2006

• Sanchez et al. SANER 2015

– Characterize developers' behaviour

• Soh et al. WCRE 2013
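The edit ratio (#edits / #events) is straightforward to compute from the event kinds of a trace. The sketch below uses hypothetical kind lists; relabelling a false edit as "selection" in the corrected trace is an assumption made only to show how noise can change the ratio.

```python
def edit_ratio(kinds):
    """Edit ratio: number of edit events divided by the number of events."""
    if not kinds:
        return 0.0
    return sum(kind == "edit" for kind in kinds) / len(kinds)


# Hypothetical trace before and after relabelling one false edit event.
raw_kinds = ["selection", "edit", "edit", "selection", "edit"]
corrected_kinds = ["selection", "selection", "edit", "selection", "edit"]

print(edit_ratio(raw_kinds))        # 0.6
print(edit_ratio(corrected_kinds))  # 0.4
```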

Wilcoxon test
– Experiment data

[Table: p-values for VITs vs. RITs, VITs vs. CITs and RITs vs. CITs, for ECF, jEdit, JHotDraw and PDE]

Page 24: Noises in Interactions Traces Data and their Impact on Previous Studies

Conclusion

[Figure: timeline of events e1, e2, e3 with durations d1, d2, d3, overlap time ot and idle time it]