Towards Performance Evaluation of Symbol Recognition & Spotting Systems in a Localization Context Mathieu Delalandre CVC, Barcelona, Spain EuroMed Meeting LORIA, Nancy city, France Monday 18th of May 2009


Page 1:

Towards Performance Evaluation of Symbol Recognition & Spotting Systems in a Localization Context

Mathieu Delalandre
CVC, Barcelona, Spain

EuroMed Meeting
LORIA, Nancy city, France
Monday 18th of May 2009

Page 2:

Introduction

[Figure: a graphical document made of symbol, background and text layers. Recognition relies on a learning database and returns labels (sofa, sink, tub, door, door). Spotting answers a Query By Example (QBE) against a document database and returns ranked results (r1, r2, r3).]

Symbol spotting: “a way to efficiently localize possible symbols and limit the computational complexity, without using full recognition methods” [Tombre2003] [Dosch2004] [Tabbone2004] [Zuwala2006] [Locteau2007] [Qureshi2007] [Rusinol2007]

Symbol recognition: “a particular application of the general problem of pattern recognition, in which an unknown input pattern (i.e. input image) is classified as belonging to one of the relevant classes (i.e. predefined symbols) in the application domain” [Chhabra1998] [Cordella1999] [Llados2002] [Tombre2005]

[Figure: example documents: electrical diagram, mechanical drawing, utility map; scanned, CAD file, Web image.]

Page 3:

Introduction

[Figure: performance evaluation combines groundtruthing (producing the groundtruth), the system (producing the results) and characterisation.]

Performance evaluation: Information Retrieval [Salton1992], Computer Vision [Thacker2005], CBIR [Muller2001], DIA [Haralick2000]

Case of symbol recognition & spotting: [Ezra2008] [Delalandre2008]

[Figure: performance evaluation pipeline. Training data (learning) and test data feed the spotting/recognition system. Its results, either labels (sofa, sink, tub, door, door) or QBE ranks (r1, r2, r3) with their regions of interest, go through groundtruth mapping (truth vs. results) and then characterization.]

Page 4:

Plan

1. Groundtruth and test documents
2. Performance characterization
3. Conclusions and perspectives

Page 5:

Groundtruth and test documents: Overview of approaches

1. Overview of approaches
2. Existing datasets

Real approach

[Figure: each real document gets its groundtruth through groundtruthing. A scale from "- - weak" to "++ good" rates the real approach and the synthetic approach.]

[Figure: groundtruthing framework with ground-truthing, validation and alerts, groundtruthed drawings, and evaluation of recognition results on test images (Dosch et al. 2006).]

[Figure: connected, parallel and overlapped cases (Yan et al. 2004).]

[Figure: example from Rusinol et al. 2009.]

Page 6:

Groundtruth and test documents: Overview of approaches

Synthetic approach

[Figure: documents and their groundtruth are produced together from a setting. A scale from "- - weak" to "++ good" rates the real approach and the synthetic approach.]

[Figure: examples from Aksoy 2000, and binary noise and vectorial noise examples (Valveny et al. 2007, Zhai et al. 2003).]

Page 7:

Groundtruth and test documents: Overview of approaches

Graphical documents are composed of two layers: symbol and background.

The same background layer can be reused with different symbol layers [Delalandre2008].
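The two-layer idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system of [Delalandre2008]: images are small 0/1 grids, symbol models are tiny bitmaps, and the names `build_document`, `symbols` and `placements` are hypothetical. It shows how one shared background yields several documents, each with a groundtruth obtained for free from the placements.

```python
def build_document(background, symbols, placements):
    """Overlay symbol bitmaps on a copy of the shared background layer.

    background: 2D list of 0/1 pixels (the background layer).
    symbols: dict mapping a label to a 2D list of 0/1 pixels (hypothetical models).
    placements: list of (label, top, left) giving each symbol's top-left corner.
    Returns (image, groundtruth); groundtruth holds one label and box per symbol.
    """
    image = [row[:] for row in background]  # never modify the shared layer
    groundtruth = []
    for label, top, left in placements:
        bitmap = symbols[label]
        height, width = len(bitmap), len(bitmap[0])
        for dy in range(height):
            for dx in range(width):
                if bitmap[dy][dx]:
                    image[top + dy][left + dx] = 1
        # Box stored as (x0, y0, x1, y1), with exclusive upper bounds.
        groundtruth.append({"label": label,
                            "box": (left, top, left + width, top + height)})
    return image, groundtruth


# One background (a "wall" on the first row), two different symbol layers.
background = [[0] * 8 for _ in range(6)]
background[0] = [1] * 8
symbols = {"door": [[1, 1], [1, 0]], "tub": [[1, 1, 1]]}

doc_a, gt_a = build_document(background, symbols, [("door", 2, 1)])
doc_b, gt_b = build_document(background, symbols, [("tub", 4, 3), ("door", 2, 5)])
```

Because the groundtruth is written at the same time as the pixels, no manual groundtruthing step is needed, which is the main appeal of the synthetic approach.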

Page 8:

Groundtruth and test documents: Overview of approaches [Delalandre2008]

[Figure: positioning constraints. Panels: symbol model, loaded symbol, bounding box and control point, alignment. Labels in the figure: c1, c2, M1 to M4, C1 to C4, L1, θ1, p1, L2, θ2, p2.]

Page 9:

Groundtruth and test documents: Overview of approaches [Delalandre2008]

[Figure: document generation system. Components: background image, symbol models, positioning constraints, symbol positioning, building engine, document generation with groundtruth (GT). User loop: (1) edit, (2) run, (3) display.]

Page 10:

Groundtruth and test documents: Existing datasets

Group   Name        Datasets  Images  Symbols  Degradations  Models
GREC    GREC’03     30        3000    3000     10            5-50
GREC    GREC’05     16        1000    1000     6             25-150
GREC    GREC’07     6         2100    2100     6             50-150
ICPR    ICPR’00     9         450     11250    9             25
SESYD   bags        16        1600    15046    none          25-150
SESYD   floorplans  10        1000    26830    none          16
SESYD   diagrams    10        1000    14100    none          21
SESYD   queries     6         6000    6000     none          16-21
Others  Rusinol’09  1         42      344      none          38


Page 17:

Groundtruth and test documents: Existing datasets

Groundtruth: generator of queries

1. Random selection of a document
2. Random selection of a symbol
3. Random crop

[Equations and plot not recoverable from the extraction. Surviving symbols: x, y, s ∈ [0,1], v0, vmax, v, and the series expansion of the error function, \( \operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{n!\,(2n+1)} \).]
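The three steps of the query generator can be sketched as follows. This is a minimal illustration, not the actual generator: the document structure, the `margin` parameter and the function name `generate_query` are hypothetical, and the crop uses a plain uniform draw because the law shown on the slide could not be recovered.

```python
import random


def generate_query(documents, rng, margin=10):
    """Pick a random document, a random symbol in it, and a random crop around it.

    documents: list of dicts with "size" (width, height) and "symbols",
               each symbol being {"label": str, "box": (x0, y0, x1, y1)}.
    rng: a random.Random instance, so that runs can be reproduced.
    margin: maximum number of extra pixels kept on each side (assumed parameter).
    """
    document = rng.choice(documents)           # 1. random selection of a document
    symbol = rng.choice(document["symbols"])   # 2. random selection of a symbol
    x0, y0, x1, y1 = symbol["box"]
    width, height = document["size"]
    # 3. random crop: widen each side by a uniform random amount, clipped to the page
    crop = (
        max(0, x0 - rng.randint(0, margin)),
        max(0, y0 - rng.randint(0, margin)),
        min(width, x1 + rng.randint(0, margin)),
        min(height, y1 + rng.randint(0, margin)),
    )
    return {"label": symbol["label"], "crop": crop}


docs = [{
    "size": (100, 80),
    "symbols": [
        {"label": "sofa", "box": (5, 5, 25, 20)},
        {"label": "door", "box": (60, 40, 95, 78)},
    ],
}]
query = generate_query(docs, random.Random(0))
```

Since the crop always contains the groundtruth box of the selected symbol, the label of each generated query is known by construction.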


Page 19:

Plan

1. Groundtruth and test documents
2. Performance characterization
3. Conclusions and perspectives

Page 20:

Performance characterization: Introduction

Performance characterisation (segmented symbols) [Valveny2004] [Dosch2006] [Valveny2007,2008a,2008b]: recognition rate, precision/recall, homogeneity, separability

Performance characterisation (real context)

[Figure: performance evaluation pipeline. Learning feeds the spotting/recognition system. Its results, either labels (sofa, sink, tub, door, door) or QBE ranks (r1, r2, r3) with their regions of interest, go through groundtruth mapping (truth vs. results) and then characterization.]

Page 21:

Performance characterization: About mapping

Layout analysis [Antonacopoulos1999]

[Figure: groundtruth vs. segmentation.]

Text/graphics separation [Wenyin1997]

[Figure: groundtruth vs. segmentation (truth, results).]

Mapping cases

• Single: a model line matches exactly one detected line.
• Split: two model lines match one detected line.
• Merge: a model line matches two detected lines.
• False alarm: a detected line does not match any model line.
• Miss: a model line does not match any detected line.
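The five mapping cases can be sketched as a small classification routine. This is an illustrative sketch only: it follows the wording of the slide for split and merge, generalizes "two" to "two or more", and assumes that the set of matching pairs has already been decided by some criterion that is outside its scope. The function name `mapping_cases` is hypothetical.

```python
def mapping_cases(n_models, n_detected, matches):
    """Label model and detected lines with the mapping cases listed above.

    matches: set of (model_index, detected_index) pairs judged to match
             (how a match is decided is an assumption left outside this sketch).
    """
    by_model = {m: set() for m in range(n_models)}
    by_detected = {d: set() for d in range(n_detected)}
    for m, d in matches:
        by_model[m].add(d)
        by_detected[d].add(m)

    cases = {"single": [], "split": [], "merge": [], "false_alarm": [], "miss": []}
    for m, detected in by_model.items():
        if not detected:
            cases["miss"].append(m)                       # model line never detected
        elif len(detected) > 1:
            cases["merge"].append((m, sorted(detected)))  # one model, several detections
        else:
            (d,) = detected
            if len(by_detected[d]) == 1:
                cases["single"].append((m, d))            # one-to-one match
    for d, models in by_detected.items():
        if not models:
            cases["false_alarm"].append(d)                # detection without model line
        elif len(models) > 1:
            cases["split"].append((sorted(models), d))    # several models, one detection
    return cases


cases = mapping_cases(
    n_models=5,
    n_detected=5,
    matches={(0, 0), (1, 1), (2, 1), (3, 2), (3, 3)},
)
```

Counting the entries of each list gives the raw figures from which detection and error rates can then be derived.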

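Symbol spotting [Rusinol2009]

[Figure: mapping between groundtruth (g1, g2) and results (r), with c1 and c2.]

\[
\mathrm{Precision} = \frac{c_1 + c_2}{r}, \qquad
\mathrm{Recall} = \frac{c_1 + c_2}{g_1 + g_2}
\]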
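A region-based precision and recall of this kind can be sketched with axis-aligned boxes. The sketch rests on assumptions that the slide does not confirm: regions are boxes (x0, y0, x1, y1), c1 and c2 are read as the areas shared by the result with each groundtruth region, the groundtruth regions do not overlap each other, and the quantities are combined by sums. The function names are hypothetical.

```python
def area(box):
    """Area of a box (x0, y0, x1, y1); empty or inverted boxes count as 0."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def intersection(a, b):
    """Intersection of two boxes; the result may be inverted when they are disjoint."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def precision_recall(groundtruth, result):
    """Area-based precision and recall of one result region.

    groundtruth: list of non-overlapping groundtruth boxes (g1, g2, ...).
    result: the region returned by the system (r).
    """
    common = sum(area(intersection(g, result)) for g in groundtruth)  # c1 + c2 + ...
    result_area = area(result)
    truth_area = sum(area(g) for g in groundtruth)
    precision = common / result_area if result_area else 0.0
    recall = common / truth_area if truth_area else 0.0
    return precision, recall


g1, g2 = (0, 0, 10, 10), (20, 0, 30, 10)
r = (5, 0, 25, 5)
precision, recall = precision_recall([g1, g2], r)
print(precision, recall)  # 0.5 0.25
```

Here half of the result region covers groundtruth, while only a quarter of the groundtruth area is retrieved.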

Page 22:

Performance characterization: Mapping, application to symbols

Which representation? How to define the regions?

• Wrapper (box, ellipse) or convex polygon: the precision depends on the model and could be weak.
• Concave polygon: precise, but comparison is time consuming.
• Point: how to define local thresholds? Is it compatible with recognition systems?

[Figure: groundtruth vs. segmentation.]

Open questions

• Does the polarized part of the capacitor belong to the symbol? Same for the moving area of the door?
• A lot of systems use sliding windows to detect symbols, providing only points [Adam2001] [Dosch2004] [Rusinol2007].
• Systems providing regions of interest can “tune” their results: how to limit the over-segmentation cases?
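When a system reports only points, one possible mapping is to test whether each point falls inside a groundtruth region. The sketch below uses the standard ray-casting test, which works for convex and concave polygons alike. It is an illustration of the trade-off discussed above, not a method taken from the cited systems, and the names `point_in_polygon` and `match_points` are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies strictly inside the polygon.

    polygon: list of (x, y) vertices in order, convex or concave.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def match_points(points, regions):
    """Map each detected point to the label of the first region containing it.

    regions: dict mapping a groundtruth label to its polygon.
    Points that fall in no region map to None (candidate false alarms).
    """
    result = []
    for point in points:
        label = next((name for name, polygon in regions.items()
                      if point_in_polygon(point, polygon)), None)
        result.append((point, label))
    return result


# An L-shaped (concave) groundtruth region: its bounding box would also
# accept the point (3, 3), which lies in the notch outside the symbol.
regions = {"door": [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]}
matches = match_points([(1, 1), (3, 3), (5, 5)], regions)
```

The example makes the precision issue concrete: a wrapper box around the L shape would accept (3, 3), while the concave polygon rejects it at the cost of a per-edge test.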

Page 23:

Performance characterization: Work in progress

Comparison of some criteria: system of [Qureshi’08], 100 floorplans (2521 symbols)

• Domain definition of the ROI
• Orientation sampling [0-2π]
• Reporting [0-2π]
• Region size dx×dy

[Figure: plot of rates (%), results vs. groundtruth.]

Signature based characterization

Page 24:

Plan

1. Groundtruth and test documents
2. Performance characterization
3. Conclusions and perspectives

Page 25:

Conclusions and perspectives

• Conclusions
– Large databases of segmented symbol images exist: “GREC”
– Synthetic databases in real context exist: “SESYD”
– True-life documents and groundtruth are around the corner: “EPEIRES”
– Characterization tools have been proposed: “SymbolRec”

• Perspectives
– Continue to produce other databases, using existing platforms
– Mapping is the key problem today for achieving a performance evaluation in real context

Page 26:

Thanks

All the referenced papers can be found in:

[1] M. Delalandre, E. Valveny and J. Lladós. Performance Evaluation of Symbol Recognition and Spotting Systems: An Overview. Workshop on Document Analysis Systems (DAS), pp. 497-505, 2008.