Upload
cameron-lawson
View
214
Download
0
Tags:
Embed Size (px)
Citation preview
Fast System for the Retrieval of Ornamental Letter Image
M. Delalandre1, J.M. Ogier2, J. Lladós1
1 CVC, Barcelona, Spain2 L3i, La Rochelle, France
L3i MeetingL3i, La Rochelle, FranceThursday 13th July 2007
Plan
• Introduction• Our System• Works in Progress• Conclusions and Perspectives
Introduction (1/3)Old Printed Graphics
Old books of XV° and XVI° centuries
Bartolomeo (1534)Alciati (1511) Laurens (1621)
figure
ornamental letter
headlineheadline
41% ornamental letter59% others
Graphics type
63% textual 37% graphical
Foreground pixel [Jour’05]
4755 (3.4 per page)Graphics
1385Page
46Book
Graphical parts
Graphics/Book
0100200300400500600700
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46
Books
Gra
ph
ics n
um
ber
CESR Database
1. Old printed graphics2. Needs of Historian People3. Problematic & Approaches
Vascosan 1555 Marnef 1576(1) Wood plug tracking
Printing house
plugexchange
copy
1531-1548
1511-1542
1555-1578
1497-1507
Introduction (2/3)Needs of Historian People
(2) User-driven historical metadata acquisition
Metadata file
Metadata file
Metadata file
Metadata file
Without retrieval
With retrieval more faster reduce error
Wood plug(bottom view)
Retrieve similar printings
Plug 1 Plug 2 Plug 3
Printing 1
Printing 2
Meeting with people of CESR
(Tours)
1. Old printed graphics2. Needs of Historian People3. Problematic & Approaches
Introduction (3/3)Problematic & Approaches
DB
queryquery
results result
Noise
Offset
Shape complexity
Accuracy
Scalability several hundred of classesseveral thousand of images
Real-time
Problematic
1. Old printed graphics2. Needs of Historian People3. Problematic & Approaches
Existing Approaches
descriptors
FastAccurate
To scalar [Loncaric’98] Hough, Radon, Zernike, Hu,
Fourrier scaled and
orientation invariant fast local (character, symbol, digit)
To image [Gesu’99] template matching,
Hausdorff distance no scaled and
orientation invariant slow global (scene)
Template matching versus scalar descriptors
Accuracy and low complexity are needed for our problematic,Key idea: to work on a compressed representation of image
Key Idea
Plan
• Introduction• Our System• Works in Progress• Conclusions and Perspectives
Our System (1/4)System Overview
Query
Compression
Centeringand
Comparison
R1 R2 R3
Checking
Image Database
Our System (2/4)Checking
CompressionCentering
andComparison
Checking
Several image providersSeveral digitalization toolsLong time processHuman supervisedComplex plate-form…
Digitalization problems
250 to 350Resolutions
UncompressCompression
TiffFormats
grayModel
279.7 MpSize
2038Files
Results
QUEID EngineBase
accepted
rejected
Parameters
Checking
Checking
Our System (3/4)Compression
Compression
image foreground background both
pixel
runc n
nt 1
]1,[ pixelrun nn
[1,0[ct
Compression rate/Dropcap
0,7
0,8
0,9
1
1 201 401 601 801 1001 1201 1401 1601 1801 2001
Dropcap
Co
mp
res
sio
n r
ate
0.75
0.950.88
Results
CompressionCentering
andComparison
Filtering
Centering
kg ,...2,1
lh ,...2,1lk
k
i i
jiikl
jyx h
ghd
10, min
x2 x2x2
x1x1 x1
x2 x2
x1
line (y) image 1
line (y+dy) image
2
xstack
reference
while x2 x1 handle image 2while x1 x2 handle image 1
Comparison
Our System (4/4)Centering and Comparison
CompressionCentering
andComparison
Filtering
ResultsRaster sizes
0
200
400
600
1 201 401 601 801 1001 1201 1401 1601 1801 2001
Dropcap
Size
(k.p
ixel)
903.62600.8Max
337.06137.7Mean
176.677.74Min
Time s
Size k.pixel
Run Sizes
0
200
400
600
1 201 401 601 801 1001 1201 1401 1601 1801 2001
Dropcap
Size
(K.ru
n)
137.0687.8Max
41.6815.5Mean
22.321.1Min
Time s
Size k.run
r
n
ii tntt
1
image database
query image
7 times faster
Plan
• Introduction• Our System• Works in Progress• Conclusions and Perspectives
Mean query of 40 s, how to reduce again
without using a lossless compression
and to loose accuracy ?
2121
2121
,max,max vvuu
vvuud
Level 1 : image sizes Level 2 : black, white pixelsLevel 3 : RLE comparison
Our first system
Works in Progress (1/3)
How to process the distance
curve ?
2
clusterth
Distance curve
00,10,20,30,40,50,60,70,8
1 167 333 499 665 831 997 1163 1329 1495 1661 1827 1993
Dropcap
Dis
tan
ce
1
2
if 1 - 2 < 0
push x, cluster
while 1 - 2 < 0
next
Using a basic clustering algorithm
‘elbow criteria’
query
1st Level
2sd Level
To use a system approach using different level of operator (from more speed to more accurate) to
select image to compare
Our key idea
Speed
Depth
From 4% to 59%, how to reduce the variability ? To work on a better selection criteria seems ambiguous
…
0
5
10
15
20
25
30
35
40
0
5
10
15
20
25
30
35
40
0
5
10
15
20
25
30
35
40
Works in Progress (2/3)
To add an intermediate operator between
scalar and image data
Our key idea
Selection results
Selection results
0%
20%
40%
60%
80%
100%
1 195 389 583 777 971 1165 1359 1553 1747 1941
Dropcap
Sel
ectio
n (%
)
SizesDensities
59%Max
24%Mean
4%Min
Selection%
4 times faster
BaseIHM
Retrieve engine
control
display
retrieve
Labels
driven labelling
Bench1 Bench2 Bench2To produce
Example of query result
0.1947 0.2517 0.3485 0.3616 0.3819 0.4064
Same plug
Next plug
Query
0.4109 0.4209
First results seem good, but how to get the ground truth and to evaluate our system?
Works in Progress (3/3)
To use our engine to produce benchmark
database
Our key idea
Plan
• Introduction• Our System• Works in Progress• Conclusions and Perspectives
ConclusionsQUEID to check image database Speedup of comparison ( 30 times faster)
RLE compression ( 7 times faster)Image selection ( 4 times faster)
PerspectivesTo add operator to reduce the variability of selection processand to speed again the processTo extend our system to do performance evaluation
Conclusions and Perspectives