Madonne Talk (Tours University), 7th November 2006
A Fast System for Dropcap Image Retrieval
Mathieu Delalandre and Jean-Marc Ogier
L3i, La Rochelle University, France
Short CV
Personal information: Mathieu Delalandre, 32 years old, married
Academic degrees:
- 1995-1998: Lic.Sc. in Industrial Computing, Rouen University, France
- 1998-2001: M.Sc. in Computer Science, Rouen University, France
Research experience (5 years, graphics recognition):
- 04/01-09/01: Master, PSI Laboratory (Rouen, France)
- 10/01-04/05: PhD, PSI Laboratory (Rouen, France)
- 05/05-09/05: Post-doc, SCSIT (Nottingham, England)
- 10/05-10/06: Post-doc, L3i Laboratory (La Rochelle, France)
- 11/06-12/06: Post-doc, PSI Laboratory (Rouen, France)
- 01/07-12/09: Post-doc, CVC (Barcelona, Spain)
Introduction
- Old books
- Old graphics retrieval
- Our problem
Introduction: Old books
Old books of the 15th and 16th centuries
Samples: Bartolomeo (1534), Alciati (1511), Laurens (1621)
Page elements: figure, dropcap, headline
Example of digitized database (BVH, CESR Tours)
Books: 46
Pages: 1385
Graphics: 4755 (3.4/page)
Foreground pixels: 63% textual, 37% graphical
Graphics types: 41% dropcap, 59% others
[Figure: Old graphics, graphics per book; x-axis: books (1 to 46), y-axis: number of graphics (0 to 700)]
Introduction: Old graphics retrieval
System overview: general architecture
[Figure: two stages, Indexing and Retrieval; blocks: Image Database, Query, Extraction, Comparison, Index, Manual Index]
Samples:
- Pareti'05: graphics style, Zipf law
- Uttama'05: document layout, MST
- Baudrier'05: sub-image, Hausdorff distance
- Bigun'96: stroke image, orientation radiograms
Retrieval criterion: letter (c), topic (vegetal), pattern (cross)
Introduction: Our problem (1/2)
Context:
- MAsse de DOnnées issues de la Numérisation du patrimoiNE (MADONNE) project
- Bibliothèques Virtuelles Humanistes (BVH) of the Centre d'Etudes Supérieures de la Renaissance (CESR)
Wood plug tracking
[Figure: a wood plug (bottom view) and its printing; dropcap classes 1, 2 and 3; samples from Vascosan (1555) and Marnef (1576); labels: printing house, stamp, exchange, copy; periods shown: 1497-1507, 1511-1542, 1531-1548, 1555-1578]
Introduction: Our problem (2/2)
Problem features:
- No scaling, no rotation
- Noise
- Offset
- Complexity
[Figure: descriptors placed between accuracy and scalability, from fast local to complex global]
Descriptor choice:
- To scalar [Loncaric'98]: Hough, Radon, Zernike, Hu, Fourier; scale and orientation invariant; fast, local
- To image [Gesu'99]: template matching, Hausdorff distance; not scale and orientation invariant; global (scene)
[Figure: Query and Image Database; steps: Formatting, Compression, Centering and comparison; results R1, R2, R3]
Our system
- Formatting
- Compression
- Centering and comparison
Our system: Formatting
Digitization problems [Lawrence'00]
Problem sources:
- Several image providers
- Several digitization tools
- Length of process
- Human supervised
- …
QUEID: « QUery Engine on Image Database »
[Figure: QUEID workflow; labels: Base, Diagnostic, Expertise, QUEID, query, charts, analysis, Format]
OLDB (Ornamental Letters Database)
              Before                    After
Files         2803                      2038
Size          377.7 Mp                  279.7 Mp
Model         gray and colour           gray
Formats       Jpeg and Tiff             Tiff
Compression   Packbits and Jpeg         uncompressed
Resolutions   ?; from 72 to 450 dpi     250 to 350 dpi
Our system: Compression
Run based compression: Run Length Encoding (RLE)
Compression rate: $t_c = 1 - \frac{n_{run}}{n_{pixel}}$, with $n_{run} \in [1, n_{pixel}]$ and $t_c \in [0, 1[$
RLE types: image, foreground, background, both
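The compression rate above can be illustrated with a minimal Python sketch. It assumes the "both" RLE type (foreground and background runs are both stored) and counts runs row by row; the function names `rle_encode` and `compression_rate` are hypothetical and do not come from the talk.

```python
from typing import List, Tuple


def rle_encode(row: List[int]) -> List[Tuple[int, int]]:
    """Encode one binary row as (pixel value, run length) pairs."""
    runs: List[Tuple[int, int]] = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            # Same value as the current run: extend it by one pixel.
            runs[-1] = (pixel, runs[-1][1] + 1)
        else:
            # Value changed (or first pixel): open a new run.
            runs.append((pixel, 1))
    return runs


def compression_rate(image: List[List[int]]) -> float:
    """Return t_c = 1 - n_run / n_pixel over all rows (assumption: runs
    are counted per row and never continue across a row boundary)."""
    n_pixel = sum(len(row) for row in image)
    if n_pixel == 0:
        raise ValueError("empty image")
    n_run = sum(len(rle_encode(row)) for row in image)
    return 1.0 - n_run / n_pixel


image = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 0, 0],
]
print(rle_encode(image[0]))     # [(0, 4), (1, 4)]
print(compression_rate(image))  # 5 runs for 16 pixels: 0.6875
```

Long uniform runs push the rate towards 1, which is why binarised dropcaps with large background areas compress well.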
OLDB results: fixed threshold binarisation, "both" RLE
[Figure: compression rate per dropcap; x-axis: dropcap (1 to 2001), y-axis: compression rate (0.7 to 1); marked values: 0.75, 0.95, 0.88]
Our system: Centering and comparison
Centering
[Figure: run positions x1 on line (y) of image 1 and x2 on line (y+dy) of image 2, merged on an x stack with a pointer]
while x2 precedes x1: handle image 2
while x1 precedes x2: handle image 1
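The two-pointer walk over the runs of both lines can be sketched as follows. This is only an illustration under assumptions: each line is stored as (pixel value, run length) pairs, `x1` and `x2` are the end positions of the current runs, and the quantity accumulated is the number of differing pixels. The names `count_mismatches` and `image_mismatches` are hypothetical.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (pixel value, run length)


def count_mismatches(line1: List[Run], line2: List[Run]) -> int:
    """Count differing pixels between two non-empty RLE lines
    without decompressing them."""
    i = j = 0         # pointers into line1 and line2
    x = 0             # current position along the line
    x1 = line1[0][1]  # end of the current run of image 1
    x2 = line2[0][1]  # end of the current run of image 2
    mismatches = 0
    while True:
        end = min(x1, x2)
        if line1[i][0] != line2[j][0]:
            mismatches += end - x  # whole overlap differs
        x = end
        # Advance the line whose run ends first (both on a tie).
        step1, step2 = x1 == end, x2 == end
        if step1:
            i += 1
        if step2:
            j += 1
        if i == len(line1) or j == len(line2):
            return mismatches
        if step1:
            x1 += line1[i][1]
        if step2:
            x2 += line2[j][1]


def image_mismatches(img1: List[List[Run]], img2: List[List[Run]], dy: int = 0) -> int:
    """Compare line y of image 1 with line y + dy of image 2."""
    total = 0
    for y, line1 in enumerate(img1):
        if 0 <= y + dy < len(img2):
            total += count_mismatches(line1, img2[y + dy])
    return total


a = [(0, 4), (1, 4)]          # 0000 1111
b = [(0, 2), (1, 4), (0, 2)]  # 0011 1100
print(count_mismatches(a, b))            # 4
print(image_mismatches([a, b], [b, a]))  # 8
```

The cost of one line comparison grows with the number of runs rather than the number of pixels, which is the point of comparing in the compressed domain.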
OLDB results
[Figure: raster sizes per dropcap; x-axis: dropcap (1 to 2001), y-axis: size in k.pixel (0 to 600)]
Raster   Time (s)   Size (k.pixel)
Max      903.62     600.8
Mean     337.06     137.7
Min      176.67     7.74
[Figure: run sizes per dropcap; x-axis: dropcap (1 to 2001), y-axis: size in k.run (0 to 600)]
Run      Time (s)   Size (k.run)
Max      137.06     87.8
Mean     41.68      15.5
Min      22.32      1.1
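Comparison
[Equation: distance d between the run sequences g_1, ..., g_k and h_1, ..., h_l of the two images, taken as a minimum over the offsets (x, y), with values in [0, 1]]
[Equation: retrieval time t_r, accumulated over the comparison times t_i of the query image with the n images of the image database]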
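In progress
[Equation: distance d between two feature pairs (u1, v1) and (u2, v2), normalised by max(u1, u2) and max(v1, v2); cluster threshold th]
[Figure: query filtered by the 1st level, then by the 2nd level]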
Our problem:
- Current time: 40 s
- Target time: < 4 s
Key idea:
- To use a lossless compression
- To use a system approach
First system:
- Level 1: image sizes
- Level 2: black, white pixels
- Level 3: RLE comparison
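The three levels can be read as a top-down cascade in which cheap tests discard candidates before the costly comparison. The sketch below is an illustration only: fixed tolerances (`size_tol`, `density_tol`) stand in for the talk's selection algorithm, a plain pixel comparison (`pixel_distance`) stands in for the Level 3 RLE comparison, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]  # binary image, 1 = black (foreground)


@dataclass
class Entry:
    name: str
    pixels: Image

    @property
    def width(self) -> int:
        return len(self.pixels[0])

    @property
    def height(self) -> int:
        return len(self.pixels)

    @property
    def black(self) -> int:
        return sum(sum(row) for row in self.pixels)


def rel_diff(a: int, b: int) -> float:
    """Relative difference in [0, 1]; 0 means identical."""
    return abs(a - b) / max(a, b, 1)


def pixel_distance(a: Entry, b: Entry) -> float:
    """Stand-in for Level 3: share of differing pixels on the overlap."""
    h, w = min(a.height, b.height), min(a.width, b.width)
    diff = sum(a.pixels[y][x] != b.pixels[y][x]
               for y in range(h) for x in range(w))
    return diff / (h * w)


def retrieve(query: Entry, database: List[Entry],
             size_tol: float = 0.1, density_tol: float = 0.3,
             level3: Callable[[Entry, Entry], float] = pixel_distance
             ) -> List[Tuple[float, str]]:
    # Level 1: keep images whose width and height are close to the query.
    level1 = [e for e in database
              if rel_diff(e.width, query.width) <= size_tol
              and rel_diff(e.height, query.height) <= size_tol]
    # Level 2: keep images with a similar number of black pixels.
    level2 = [e for e in level1
              if rel_diff(e.black, query.black) <= density_tol]
    # Level 3: rank the survivors with the expensive comparison.
    return sorted((level3(query, e), e.name) for e in level2)


query = Entry("q", [[1, 1, 0, 0], [1, 1, 0, 0]])
database = [
    Entry("a", [[1, 1, 0, 0], [1, 0, 0, 0]]),
    Entry("b", [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]),  # dropped at level 1
    Entry("c", [[1, 1, 1, 1], [1, 1, 1, 1]]),                # dropped at level 2
    Entry("d", [[1, 1, 0, 0], [1, 1, 0, 0]]),
]
print(retrieve(query, database))  # [(0.0, 'd'), (0.125, 'a')]
```

Only the images that survive the first two levels pay for the third, so the overall time depends on how selective the cheap levels are.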
[Figure: levels arranged along two axes, Depth and Speed]
Selection algorithm
[Figure: distance curve; x-axis: dropcap (1 to 1993), y-axis: distance (0 to 0.8); marks 1 and 2]
if 1 - 2 < 0: push x, cluster
while 1 - 2 < 0: next
In progress
[Figure: depth level charts (scale 0 to 40)]
[Figure: depth per dropcap for the Sizes and Densities levels; x-axis: dropcap (1 to 1941), y-axis: depth (0% to 100%)]
OLDB results
Depth (%): max 59%, mean 24%, min 4%
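A rough consistency check on these depth figures is possible under two assumptions that the slides do not state: the depth is the share of database images that still reach the final RLE comparison, and the earlier levels cost a negligible time. With the current time of 40 s taken as the cost of a full RLE comparison:

```latex
\begin{align*}
t_{\text{top-down}} &\approx \text{depth} \times t_{\text{RLE}}
  \approx 0.24 \times 40\,\text{s} = 9.6\,\text{s} \\
\text{speed-up} &\approx \frac{1}{\text{depth}}:
  \quad \frac{1}{0.59} \approx 1.7, \qquad
  \frac{1}{0.24} \approx 4.2, \qquad
  \frac{1}{0.04} = 25
\end{align*}
```

Under that reading, the mean depth brings the retrieval time to the order of 10 s, still above the < 4 s target.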
To decrease variability:
- To work on selection
- To add a level: run based signature
In progress
Query example
[Figure: query image and retrieved dropcaps, ordered by increasing distance]
Same plug: 0.1947, 0.2517, 0.3485, 0.3616, 0.3819, 0.4064
Next plug: 0.4109, 0.4209
Performance evaluation
Benchmark system
[Figure: blocks Base, IHM, Retrieve engine, Labels; links: control, display, retrieve, driven labelling; benches to produce: Bench1, Bench2, Bench2]
Criterion?
- Scalability
- Accuracy
- Processing time
Conclusions and perspectives
Conclusions:
- Dropcap image retrieval, « wood tracking »
- Formatting image database (QUEID)
- Fast approach, two features: RLE comparison (7 to 9), top-down strategy (2 to 20)
Results: 10 s for 2000 images (300 MB)
Perspectives:
- Working on RLE signature
- Benchmark system for performance evaluation
Bibliography
1. J. Bigun, S. Bhattacharjee, and S. Michel. Orientation radiograms for image retrieval: An alternative to segmentation. In International Conference on Pattern Recognition (ICPR), volume 3, pages 346-350, 1996.
2. V. D. Gesu and V. Starovoitov. Distance based function for image comparison. Pattern Recognition Letters (PRL), 20(2):207-214, 1999.
3. S. Loncaric. A survey of shape analysis techniques. Pattern Recognition (PR), 31(8):983-1001, 1998.
4. R. Pareti and N. Vincent. Global discrimination of graphics styles. In Workshop on Graphics Recognition (GREC), pages 120-128, 2005.
5. S. Uttama, M. Hammoud, C. Garrido, P. Franco, and J. Ogier. Ancient graphic documents characterization. In Workshop on Graphics Recognition (GREC), pages 97-105, 2005.
6. E. Baudrier, G. Millon, F. Nicolier, and S. Ruan. A fast binary-image comparison method with local-dissimilarity quantification. In International Conference on Pattern Recognition (ICPR), volume 3, pages 216-219, 2006.