Co-filtering human interaction and object segmentation

Ferran Cabezas

Supervised by: Vincent Charvillat, Axel Carlier, Xavier Giró-i-Nieto, Amaia Salvador

Outline

1. Motivation

2. Related Work

3. Treatment of human interaction

a) Removing human interaction - Combination of object candidates

b) Taking advantage of all human interaction - Foreground map algorithm

4. Automatic categorization of the users

5. Conclusions

6. Future work

Crowdsourcing object segmentation

Filtering out bad human interactions

Goal

[Figure: a correct human interaction and the result it produces, next to an incorrect human interaction and the result it produces.]

Click’n’Cut

• Web tool for interactive object segmentation designed for crowdsourcing tasks.

A. Carlier, V. Charvillat, A. Salvador, X. Giró-i-Nieto, O. Marques, Click’n’Cut: Crowdsourced Interactive Segmentation with Object Candidates. In CrowdMM’14, 2014

DEMO

Data

• 20 users who fully completed the Click’n’Cut experiment.

• Testing set: 100 objects with associated ground truth from the Berkeley-DCU dataset.

• Training set: 5 images from Pascal VOC 2012, used as gold standard images.

How are the masks obtained from the clicks?

• Combination of different precomputed binary object candidates

• Foreground map algorithm

A. Carlier, Combining Content Analysis with Usage Analysis to better understand visual contents, PhD Thesis, 2014.

A. Carlier, V. Charvillat, A. Salvador, X. Giró-i-Nieto, O. Marques, Click’n’Cut: Crowdsourced Interactive Segmentation with Object Candidates. In CrowdMM’14, 2014

Information from users is not always reliable

[Figure: a bad user interaction next to a good user interaction.]

First approach - How are good user interactions separated from bad ones?

• Removing users based on their error rate on the gold standard images (training set)

[Figure: for each user, the error rate on each of the 5 gold standard (GS) images, 1st GS to 5th GS, and the resulting mean error rate.]

Removing users based on their error rate

• Remove users based on an error rate threshold

[Figure: mean error rate over the 5 gold standard images (5GS) for each user, User1 to User20.]
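The error-rate filtering above could be sketched as follows. This is only an illustration: the slides do not define the error rate, so the code assumes it is the fraction of a user's clicks whose foreground/background label disagrees with the ground truth mask at the clicked pixel. The function names, the click format and the 0.2 threshold are hypothetical.

```python
from statistics import mean

# Assumed formats: a click is (row, col, is_foreground);
# a mask is a list of rows of booleans (True = object pixel).

def error_rate(clicks, gt_mask):
    """Fraction of clicks whose label disagrees with the ground truth (assumed definition)."""
    if not clicks:
        return 0.0  # a user without clicks makes no wrong click
    wrong = sum(1 for r, c, is_fg in clicks if gt_mask[r][c] != is_fg)
    return wrong / len(clicks)

def mean_error_rate(clicks_per_gs, gt_masks):
    """Average the error rate of one user over the gold standard images."""
    return mean(error_rate(c, m) for c, m in zip(clicks_per_gs, gt_masks))

def keep_users(mean_errors, threshold):
    """Keep the users whose mean error rate does not exceed the threshold."""
    return [user for user, err in mean_errors.items() if err <= threshold]

# Toy example with a single 2x2 gold standard image.
gt = [[False, True],
      [False, True]]
good_clicks = [(0, 1, True), (1, 0, False)]   # both labels are right
bad_clicks = [(0, 0, True), (1, 1, False)]    # both labels are wrong
errors = {
    "user1": mean_error_rate([good_clicks], [gt]),
    "user2": mean_error_rate([bad_clicks], [gt]),
}
kept = keep_users(errors, threshold=0.2)      # hypothetical threshold
print(kept)  # ['user1']
```

Note that under this assumed definition a user who clicks very little can still reach an error rate of 0, so a low error rate alone does not guarantee a good mask.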

How are the obtained masks evaluated?

[Figure: the clicks produce one mask with the object candidate technique and one with the foreground map algorithm; each mask is compared with the ground truth mask.]

Jaccard index

Measure of similarity between the mask obtained from the Click’n’Cut experiment (A) and the ground truth mask (B):

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
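As a small worked example of the Jaccard index, the sketch below counts the pixels in the intersection and in the union of two binary masks. The function name, the list-of-rows mask format and the convention for two empty masks are assumptions made for the illustration.

```python
def jaccard_index(mask_a, mask_b):
    """|A ∩ B| / |A ∪ B| for two binary masks given as lists of rows."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            a, b = bool(a), bool(b)
            intersection += a and b   # pixel belongs to both masks
            union += a or b           # pixel belongs to at least one mask
    # Two empty masks are treated as identical (a convention assumed here).
    return intersection / union if union else 1.0

obtained = [[1, 1, 0],
            [0, 1, 0]]
ground_truth = [[0, 1, 1],
                [0, 1, 0]]
print(jaccard_index(obtained, ground_truth))  # 0.5 (2 shared pixels out of 4)
```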

Outline

3. Treatment of human interaction

a) Removing human interaction - Combination of object candidates

• Removing users

• Removing clicks

• Removing clicks and users

Impact of good and bad users in the resulting mask

[Figure: masks obtained for the same image from 1 user (good user), from 12 users (good users) and from all 20 users.]

• A lot of errors can be removed just by discarding bad users

Users filtering

[Figure: two user interactions, both with error rate = 0; one gives Jaccard index = 0.0214 and the other gives Jaccard index = 0.9402.]

No obvious correlation between error rate and Jaccard index.

Jaccard index for each user

[Figure: for each user, the Jaccard index on each of the 5 gold standard (GS) images, 1st GS to 5th GS, and the resulting mean Jaccard index.]

• Gives a better idea of the contribution of each user to the final result

Jaccard index for each user

• Remove users based on a Jaccard index threshold

[Figure: mean Jaccard index over the 5 gold standard images (5GS) for each user, User1 to User20.]
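A possible sketch of this selection step, assuming the per-image Jaccard indices of each user on the 5 gold standard images have already been computed. The user names, the scores and the 0.6 threshold are made up for the example. Ranking the users first also makes it easy to take only the best n of them.

```python
from statistics import mean

def rank_users(jaccard_per_gs):
    """Sort users by their mean Jaccard index over the gold standard images, best first."""
    means = {user: mean(values) for user, values in jaccard_per_gs.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)

def select_users(jaccard_per_gs, threshold):
    """Keep the users whose mean Jaccard index reaches the threshold."""
    return [user for user, score in rank_users(jaccard_per_gs) if score >= threshold]

# Hypothetical Jaccard indices on the 5 gold standard images.
scores = {
    "user1": [0.9, 0.8, 0.95, 0.85, 0.9],
    "user2": [0.1, 0.0, 0.2, 0.1, 0.1],
    "user3": [0.7, 0.6, 0.8, 0.7, 0.7],
}
selected = select_users(scores, threshold=0.6)   # hypothetical threshold
best_two = [user for user, _ in rank_users(scores)[:2]]
print(selected)  # ['user1', 'user3']
```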

Jaccard index for the test set

[Figure: the users are split into maintained users and removed users; a Jaccard index is computed for each test image (Image 1 to Image 100) and the 100 values are averaged into a mean.]

Results for the test set

[Figure: Jaccard index by taking different number of users. x-axis: number of users (0 to 20); y-axis: Jaccard index (0.1 to 0.9). Curves: users sorted by ascending Jaccard index; users sorted by descending error rate.]

Schematic

[Figure: image with non filtered clicks → over-segmentation (SLIC, Felzenszwalb, N-cuts, or nothing) → discarding the clicks in the same superpixel → image with filtered clicks → combination of object candidates → obtaining mask.]

• Three different techniques to over-segment an image

• Two techniques for discarding the clicks in the same superpixel

Superpixel techniques

• Felzenszwalb
  • K = 20
  • σ = 0.5
  • m = 20

• SLIC
  • Region size = 10
  • Regularizer = 0.1

• N-cuts

Filtering clicks in the same superpixel

1) Total removal of conflict clicks: discarding all clicks in conflicting superpixels

2) Partial removal of conflict clicks: discarding the clicks that are in the minority, or tied, inside conflicting superpixels
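The two filtering techniques can be illustrated with the sketch below. It assumes that a conflicting superpixel is one that contains both foreground and background clicks, and that a tie in partial removal discards all clicks of that superpixel; the slides do not state either point explicitly. The function name and data formats are hypothetical.

```python
from collections import defaultdict

def filter_clicks(clicks, superpixels, mode):
    """Discard conflicting clicks; mode is "total" or "partial".

    clicks: list of (row, col, is_foreground)
    superpixels: list of rows of integer superpixel labels
    """
    if mode not in ("total", "partial"):
        raise ValueError("mode must be 'total' or 'partial'")

    # Group the clicks by the superpixel they fall into.
    groups = defaultdict(list)
    for click in clicks:
        row, col, _ = click
        groups[superpixels[row][col]].append(click)

    kept = []
    for group in groups.values():
        fg = [c for c in group if c[2]]
        bg = [c for c in group if not c[2]]
        if not fg or not bg:
            kept.extend(group)  # no conflict (assumed definition): keep everything
        elif mode == "partial" and len(fg) != len(bg):
            kept.extend(fg if len(fg) > len(bg) else bg)  # keep the majority label
        # "total" mode, or a tie in "partial" mode: drop all clicks of the superpixel
    return kept

# Toy 2x4 image with two superpixels (labels 0 and 1).
superpixels = [[0, 0, 1, 1],
               [0, 0, 1, 1]]
clicks = [(0, 0, True), (0, 1, True), (1, 0, False), (0, 2, False)]
total = filter_clicks(clicks, superpixels, "total")
partial = filter_clicks(clicks, superpixels, "partial")
print(total)    # [(0, 2, False)]
print(partial)  # [(0, 0, True), (0, 1, True), (0, 2, False)]
```

In the toy example superpixel 0 holds two foreground clicks and one background click, so total removal drops all three while partial removal drops only the background click.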

Results

• Jaccard index for all users in the test set

Without applying any technique of filtering clicks: 0.14

Techniques of filtering clicks in the same superpixel:

          Partial removal of conflict clicks   Total removal of conflict clicks
SLIC      0.2109                               0.2412
N-CUTS    0.2735                               0.3330
FELZ      0.2104                               0.2240

Results

• Users sorted by descending Jaccard index

[Figure, partial filtering: Comparing results with partial filtering and without filtering. x-axis: number of users sorted by descending Jaccard index (0 to 20); y-axis: Jaccard index (0.1 to 0.9). Curves: Felz. superpixel technique, N-cuts superpixel technique, SLIC superpixel technique, no click filtering.]

[Figure, total filtering: Comparing results with total filtering and without filtering. Same axes and curves.]

Outline

3. Treatment of human interaction

b) Taking advantage of all human interaction - Foreground map algorithm

Foreground map algorithm

[Figure: a set of clicks, and two Felzenszwalb superpixel segmentations of the same image, with k=100 and k=300.]

• Each click has a measure of confidence based on the user's error on the 5 gold standard images (5GS).

• Superpixels are weighted based on the clicks.
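One way the superpixel weighting could work is sketched below for a single superpixel level. The slides do not give the formulas, so everything here is an assumption: the confidence of a click is taken as one minus its user's mean error rate on the gold standard images, the score of a superpixel is the confidence-weighted fraction of foreground clicks inside it, and superpixels without clicks get a score of 0.

```python
from collections import defaultdict

def foreground_map(clicks, superpixels, user_error):
    """Per-pixel foreground score for one superpixel level.

    clicks: list of (row, col, is_foreground, user)
    superpixels: list of rows of integer superpixel labels
    user_error: mean error rate of each user on the gold standard images
    """
    fg_weight = defaultdict(float)
    total_weight = defaultdict(float)
    for row, col, is_fg, user in clicks:
        confidence = 1.0 - user_error[user]   # assumed confidence measure
        label = superpixels[row][col]
        total_weight[label] += confidence
        if is_fg:
            fg_weight[label] += confidence

    def score(label):
        total = total_weight.get(label, 0.0)
        # Superpixels without (trusted) clicks are scored 0 (assumption).
        return fg_weight.get(label, 0.0) / total if total > 0 else 0.0

    return [[score(label) for label in row] for row in superpixels]

# Toy 2x3 image with three superpixels and two users.
superpixels = [[0, 0, 1],
               [2, 2, 1]]
user_error = {"a": 0.0, "b": 0.5}
clicks = [(0, 0, True, "a"), (0, 1, False, "b"), (0, 2, False, "a")]
fg_map = foreground_map(clicks, superpixels, user_error)
# Superpixel 0 scores 1.0 / 1.5; superpixels 1 and 2 score 0.
```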

Foreground map algorithm

• Superpixel combination
  • SLIC: 6 levels
  • Felzenszwalb: 8 levels

R. Vieux, J. Benois, J. Domenger, A. Braquelaire, Segmentation-based multi-class semantic object detection, Multimedia Tools and Applications, 2010

Parameters to adjust after the combination

• Threshold

• Structuring element for hole filling

Combining all Felz. and SLIC levels

• Felz.: k = 10, 20, 50, 100, 200, 300, 400, 500

• SLIC: region size = 5, 10, 20, 30, 40, 50

• SE = 7

• Threshold = 0.56, Jaccard index = 0.8603

[Figure: Combining SLIC and Felzenszwalb superpixel techniques in the train set. x-axis: threshold (0 to 1); y-axis: Jaccard index (0 to 1). Marked point: threshold 0.56, Jaccard index 0.8891.]

[Figure: Combining SLIC and Felzenszwalb superpixel techniques in the test set. x-axis: threshold (0 to 1); y-axis: Jaccard index (0 to 1). Marked point: threshold 0.56, Jaccard index 0.8603.]
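The combination and thresholding step might look like the sketch below. It assumes that the per-level foreground maps are simply averaged pixel by pixel, which the slides do not confirm, and it then applies the 0.56 threshold reported above. The hole filling with the structuring element (SE = 7) is left out of the sketch.

```python
def combine_levels(maps):
    """Average the per-pixel foreground scores over all superpixel levels (assumed rule)."""
    n = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / n for c in range(cols)]
            for r in range(rows)]

def binarize(fg_map, threshold=0.56):
    """Turn the combined foreground map into a binary mask."""
    return [[value >= threshold for value in row] for row in fg_map]

# Two toy foreground maps standing in for two superpixel levels.
level_1 = [[1.0, 0.5],
           [0.0, 0.2]]
level_2 = [[0.5, 0.7],
           [0.0, 0.2]]
combined = combine_levels([level_1, level_2])
mask = binarize(combined)
print(mask)  # [[True, True], [False, False]]
```

In the setting of the slides, the list passed to `combine_levels` would hold one map per level, that is 8 Felzenszwalb levels and 6 SLIC levels.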

Results combining all Felz. and SLIC levels

Threshold = 0.56, SE = 7

Type of users and their particularities

• Painter: a lot of foreground clicks inside the object to segment

• Tired: few clicks per image

• Border guards: most of the bg clicks are on the contour of the image

• Surrounders: most of the fg clicks are on the contour of the image

• Mirrors: have understood the experiment upside-down

• Spammers: randomly placed foreground clicks over the image

• Experts: have understood the experiment well and made only a few mistakes

• Different pattern: do not follow the same pattern of clicks in all images

Manual categorization

• A manual categorization is done by considering just the 5 gold standard images

User   Manual categorization
1      Painter
2      Expert
3      Mirror
4      Expert
5      Border guard
6      Expert
7      Tired
8      Border guard
9      Expert
10     Different pattern
11     Different pattern
12     Expert
13     Expert
14     Expert
15     Expert
16     Expert
17     Tired
18     Surrounder
19     Spammer
20     Expert

Manual rules for automatic user categorization

• According to the particularities of each type of user, a set of features and their rules are created:

Feature             Painter      Mirror   Border guard   Surrounder   Spammer   Tired      Expert
# clicks            >150/image   -        -              -            -         <5/image   -
fg clicks (%)       >95%         -        <20%           >95%         >90%      -          -
errors (%)          <3%          >90%     -              -            >40%      <20%       -
Jaccard index (%)   -            <10%     -              -            -         <80%       >80%
Contour fg (%)      -            -        -              >80%         <80%      -          -
Contour bg (%)      -            -        >70%           -            -         -          -

Contour fg (%) = fg contour clicks / total fg clicks. Contour bg (%) = bg contour clicks / total bg clicks.
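The rules above can be turned into a small rule-based classifier, sketched below. The thresholds are the ones from the slide; the feature names, the order in which the rules are tried (the column order of the table) and the fallback to "Different pattern" when no rule matches are assumptions made for the illustration.

```python
# Each rule is (category, predicate on a feature dictionary).
# Feature keys are hypothetical names; percentages are expressed in 0-100.
RULES = [
    ("Painter", lambda f: f["clicks_per_image"] > 150 and f["fg_pct"] > 95
                          and f["error_pct"] < 3),
    ("Mirror", lambda f: f["error_pct"] > 90 and f["jaccard_pct"] < 10),
    ("Border guard", lambda f: f["fg_pct"] < 20 and f["contour_bg_pct"] > 70),
    ("Surrounder", lambda f: f["fg_pct"] > 95 and f["contour_fg_pct"] > 80),
    ("Spammer", lambda f: f["fg_pct"] > 90 and f["error_pct"] > 40
                          and f["contour_fg_pct"] < 80),
    ("Tired", lambda f: f["clicks_per_image"] < 5 and f["error_pct"] < 20
                        and f["jaccard_pct"] < 80),
    ("Expert", lambda f: f["jaccard_pct"] > 80),
]

def categorize(features):
    """Return the first category whose rule matches (rule order is an assumption)."""
    for category, rule in RULES:
        if rule(features):
            return category
    return "Different pattern"  # assumed fallback when no rule applies

# Hypothetical users.
expert = {"clicks_per_image": 20, "fg_pct": 60, "error_pct": 2,
          "jaccard_pct": 90, "contour_fg_pct": 10, "contour_bg_pct": 10}
mirror = dict(expert, fg_pct=50, error_pct=95, jaccard_pct=5)
tired = dict(expert, clicks_per_image=3, error_pct=10, jaccard_pct=50)
print(categorize(expert), categorize(mirror), categorize(tired))
# Expert Mirror Tired
```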

Automatic categorization evaluation for the test set

Rows: ground truth. Columns: prediction.

                Painter   Mirror   Expert   Spammer   Surrounder   Border guard   Tired   Diff. pattern
Painter         1         0        0        0         0            0              0       0
Mirror          0         1        0        0         0            0              0       0
Expert          0         0        9        0         0            0              0       1
Spammer         0         0        0        1         0            0              0       0
Surrounder      0         0        0        0         1            0              0       0
Border guard    0         0        0        0         0            1              0       1
Tired           0         0        0        0         0            0              1       1
Diff. pattern   0         0        0        0         0            0              0       2
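To summarize the confusion matrix above in a single figure, the sketch below computes the overall accuracy and the per-category recall from the counts on the slide. The variable names are hypothetical; only the counts come from the matrix.

```python
LABELS = ["Painter", "Mirror", "Expert", "Spammer",
          "Surrounder", "Border guard", "Tired", "Diff. pattern"]

# Rows are ground truth categories, columns are predictions, in LABELS order.
CONFUSION = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 2],
]

correct = sum(CONFUSION[i][i] for i in range(len(LABELS)))   # diagonal entries
total = sum(sum(row) for row in CONFUSION)                   # all users
accuracy = correct / total
recall = {label: CONFUSION[i][i] / sum(CONFUSION[i])
          for i, label in enumerate(LABELS)}

print(correct, total, accuracy)  # 17 20 0.85
```

All three misclassified users (one Expert, one Border guard, one Tired) are predicted as "Diff. pattern".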

Conclusions

• Jaccard index is a better measure than error rate to separate bad users from good ones

[Figure: Jaccard index by taking different number of users. x-axis: number of users (0 to 20); y-axis: Jaccard index (0.1 to 0.9). Curves: users sorted by ascending Jaccard index; users sorted by descending error rate.]

• Better results with partial than with total filtering

• Filtering clicks only makes sense when dealing with bad users

[Figure, partial filtering: Comparing results with partial filtering and without filtering. x-axis: number of users sorted by descending Jaccard index (0 to 20); y-axis: Jaccard index (0.1 to 0.9). Curves: Felz. superpixel technique, N-cuts superpixel technique, SLIC superpixel technique, no click filtering.]

[Figure, total filtering: Comparing results with total filtering and without filtering. Same axes and curves.]

• In the foreground map algorithm, the best result is reached by combining the Felzenszwalb and SLIC superpixel techniques with different levels

[Figure: Combining SLIC and Felzenszwalb superpixel techniques in the train set (marked point: threshold 0.56, Jaccard index 0.8891) and in the test set (marked point: threshold 0.56, Jaccard index 0.8603).]

• It is not possible to automatically categorize users that do not follow the same pattern of clicks in all images

[Figure: images from User 11.]

Future work

• Study different techniques for filtering clicks in the same superpixel.

• Take advantage of the clicks of some users to create a better mask (e.g. Border guard and Surrounder users)

• Train a classifier for automatic user categorization

Questions & Answers