Community-Assisted Software Engineering Decision Making

Gregory Gay and Mats Heimdahl

University of Minnesota

AI in SE: A Success Story

Large, active field, with:
● Growing research community
● Numerous conferences and workshops, such as MSR, PROMISE, RAISE
● Large data repositories
● History of collaboration between industry and academia


We're already good at drawing useful conclusions. We expect further algorithmic improvements.

But...

We need to improve our data!


Problem 1: We don't know what data we need.

Trying to solve complex problems. Make guesses, then collect data.

Results in missing attributes, added noise.


Problem 2: The data we have is often weak.

Solution quality depends on data quality.

Some commonly used data sets are infamous for missing values, unhelpful attributes, and poor recording standards.


We should improve data standards, but... we need to use the data we have.

Synergy of human feedback and AI to turn static data models into dynamic models.

Bring a Wikipedia model to data sets.


Inspiration: Recommender Systems


Enhanced Feedback Loop


Recommendation: MC/DC

Helpful? Yes

New Values for Existing Attributes:

Num. Boolean Expressions: 219

Num. Numeric Calculations: 73

New Attributes to Collect (and Values):

Ratio of Boolean to Numeric Calculations: 3:1

Data to Delete: Projects 1, 3, 7
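The feedback form above can be read as a small record that is folded back into the data set. The following is a minimal sketch of that idea, not the authors' implementation: the `Feedback` class, its field names, the `apply_feedback` helper, the placeholder data set, and the choice of project 2 as the project the feedback describes are all hypothetical. Only the values (MC/DC, 219, 73, the 3:1 ratio, and projects 1, 3, 7) come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    """One user's response to a recommendation (all field names are hypothetical)."""
    recommendation: str
    helpful: bool
    updated_values: dict = field(default_factory=dict)    # new values for existing attributes
    new_attributes: dict = field(default_factory=dict)    # attributes the data set did not have
    projects_to_delete: set = field(default_factory=set)  # records the user flags for removal


def apply_feedback(dataset, project_id, fb):
    """Return a new data set with the feedback folded in.

    `dataset` maps a project id to a dict of attribute values.
    """
    # Copy every record except the ones flagged for deletion.
    updated = {pid: dict(attrs) for pid, attrs in dataset.items()
               if pid not in fb.projects_to_delete}
    # Update (or create) the record the feedback describes.
    row = updated.setdefault(project_id, {})
    row.update(fb.updated_values)
    row.update(fb.new_attributes)
    return updated


fb = Feedback(
    recommendation="MC/DC",
    helpful=True,
    updated_values={"num_boolean_expressions": 219, "num_numeric_calculations": 73},
    new_attributes={"boolean_to_numeric_ratio": 219 / 73},  # the 3:1 ratio from the form
    projects_to_delete={1, 3, 7},
)

# Placeholder data set of eight projects; project 2 is assumed to be the user's own.
dataset = {pid: {"num_boolean_expressions": 0} for pid in range(1, 9)}
dataset = apply_feedback(dataset, project_id=2, fb=fb)
```

Returning a new dictionary rather than mutating the old one keeps each round of feedback easy to audit or roll back, which may matter if contributions have to be reviewed before they are trusted.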

Why should we enhance our data?

These dynamic data models allow:
● Low start-up costs.
● Build body of evidence over time.
● Address data quality issues.
● Human-in-the-loop feedback.


Challenge 1: How do we collect feedback?


Challenge 2: How do we use feedback?

Fundamental trade-off between human curation and automated AI learning.

When should attributes be filtered? Un-updated data phased out? New data added?
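As a starting point for discussion, one fully automated answer to these questions is a pair of threshold rules. This is only a sketch under stated assumptions: the function names, the `last_updated` field, the one-year age limit, the 50% fill threshold, and the sample records are all made up for illustration, and a real policy would likely combine such rules with human curation.

```python
from datetime import date, timedelta


def phase_out_stale(records, today, max_age_days=365):
    """Keep only records whose 'last_updated' date is within max_age_days of today."""
    cutoff = today - timedelta(days=max_age_days)
    return [r for r in records if r["last_updated"] >= cutoff]


def filter_sparse_attributes(records, min_fill=0.5):
    """Drop attributes that have a non-None value in fewer than min_fill of the records."""
    if not records:
        return []
    keys = {k for r in records for k in r}
    keep = {
        k for k in keys
        if sum(r.get(k) is not None for r in records) / len(records) >= min_fill
    }
    return [{k: v for k, v in r.items() if k in keep} for r in records]


# Invented sample records, for illustration only.
records = [
    {"last_updated": date(2012, 5, 1), "loc": 1200, "team_size": None},
    {"last_updated": date(2012, 3, 15), "loc": 800, "team_size": None},
    {"last_updated": date(2009, 1, 10), "loc": 450, "team_size": 4},
]

fresh = phase_out_stale(records, today=date(2012, 6, 1))  # drops the 2009 record
curated = filter_sparse_attributes(fresh)                 # drops the empty 'team_size'
```

Note the interaction between the two rules: the only record with a `team_size` value is also the stale one, so phasing it out is what makes the attribute sparse enough to filter. Effects like this are part of the trade-off between automated learning and human curation.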


Challenge 3: Motivating Users

How do we motivate users to:
● Provide feedback.
● Add new data.
● Update old data.


Motivation requires:
1. Incentive.
2. Ease of use/contribution.
3. Utility from and trust in the model.


We propose feedback-driven dynamic data models maintained by a synergy of user feedback and automated AI techniques.

We propose that dynamic data will allow for low start-up costs, a stronger body of evidence over time, and adaptations to changing industrial conditions.


For discussion...
1. Is this even a good idea?
2. What can we do to solve data quality issues? (other than just the idea suggested here)
3. What kind of data would benefit from dynamic adaptation?
4. How do we motivate users to provide feedback, new data, and update old data?

