18
DOCLIB Overview A developer perspective Guangyu Zhu and David Doermann Language and Media Processing Lab University of Maryland

DOCLIB Overview A developer perspective

Embed Size (px)

DESCRIPTION

DOCLIB Overview A developer perspective. Guangyu Zhu and David Doermann Language and Media Processing Lab University of Maryland. Agenda. DOCLIB Background Inside DOCLIB Architecture Overview DOCLIB Development Overview Development Environment/Processes DOCLIB Add-on Structure - PowerPoint PPT Presentation

Citation preview

Page 1: DOCLIB Overview A developer perspective

DOCLIB Overview A developer perspective

Guangyu Zhu and David DoermannLanguage and Media Processing Lab

University of Maryland

Page 2: DOCLIB Overview A developer perspective

Agenda

DOCLIB Background

Inside DOCLIB

Architecture Overview

DOCLIB Development Overview

Development Environment/Processes

DOCLIB Add-on Structure

Coding Conventions

DOCLIB Resources

LAMP DOCLIB Project page

DOCLIB Wiki

DOCLIB online demo

DoclibCommandLine

Page 3: DOCLIB Overview A developer perspective

DOCLIB Objectives

To develop a single image processing library that supports typical document processing needs and multiple platforms

Efficient Technology Transfer Software compatibility Balance of academia, governemnt, and industry needs A common framework for document processing

Scalability Rapid prototyping of new methods Simple algorithm comparison

Robustness and Stability High quality standards Platform-independence Accommodating frequently changing requirements

Page 4: DOCLIB Overview A developer perspective

DOCLIB Components

Page 5: DOCLIB Overview A developer perspective

DOCLIB Architecture

DLImage

e.g. XML input/output page segmentation text line extraction logo detection page layout analysis

DocLib´s two essential pillars

e.g. image rotation image deskewing image conversions cc calculation shape drawing

DLImage: Image Processing Functions

DLDocument: Container of Document Metadata for different processing task

DLDoc.

Page 6: DOCLIB Overview A developer perspective

DLImage Design

DLImageFactory{ Imported from New Project1 }

Attributes

Operations

+ dlLoadImage( filename : char* ...+ dlSaveImage( image : DLImag...

DLImage{ Imported from New Project...

Attributes

+ BLACKWHITEIMAGE :...+ GRAYIMAGE : short int+ COLORIMAGE : short int+ TIFFORMAT : short int+ BMPFORMAT : short int+ JPGFORMAT : short int+ PPMFORMAT : short int+ RGBCOLOR : short int

DLImageHead{ Imported from New Project1 }

Attributes

# width : int# height : int# depth : int# channels : int# colorModel : int# dataOrder : int# rowSize : int# dataAlignment : int# resolution : int

DLImageJPGHead{ Imported from New Proj...

Attributes

Operations

+ DLImageJPGHead() :

DLImagePPMHead{ Imported from New Proj...

Attributes

Operations

+ DLImagePPMHead() : + ~DLImagePPMHead...

DLImageTIFFHead{ Imported from New Project1 }

Attributes

Operations

+ DLImageTIFFHead() : + ~DLImageTIFFHead() :

DLImageBMPHead{ Imported from New Proj...

Attributes

Operations

+ DLImageBMPHead() :

DLTIFFImage{ Imported from Ne...

Attributes

Operations

+ DLTIFFImage(...+ ~DLTIFFImag...

DLPPMImage{ Imported from N...

Attributes

Operations

+ DLPPMImag...+ ~DLPPMIma...

DLJPGImage{ Imported from New ...

Attributes

Operations

DLBMPImage{ Imported from N...

Attributes

Operations

+ DLBMPImag...+ ~DLBMPIma...

dlImageHead

Page 7: DOCLIB Overview A developer perspective

DLImageFactor Design

DLImageFactory

DLTiffImage

DLJpegImge

DLPPMImage

DLPPImage

::

Registers

DLBaseImage

Design Factors

Image Type objects are static/singleton objects created on startup

DLImageFactory is a static/singleton object

Image Type objects registers itself with the DLImageFactory during startup

DLImageFactory keeps a list of supported Image objects as each image type calls the register function

Additional image types can be plugged into DOCLIB without modifying existing DOCLIB code

Page 8: DOCLIB Overview A developer perspective

Supported Image Processing Functions

1. Open TIFF, JPEG, GIF, PPM, PNG, BMP, PBM(P1 and P4) 1a. Memory based read TIFF 1b. Memory based read GIF

1c. Memory based read PNG1d. Memory based read BMP1e. Memory based read PPM1f. Memory based read JPEG1g. Memory based read PBM1h. Supports multiple page TIFF file

2. Save TIFF, JPEG, GIF, PPM, and PNG images to disk2a. Memory based write TIFF2b. Memory based write BMP

3. Calculate connected components 4. Save individual components of an image to disk5. Resize an image6. Rotate an image (large image takes long time)7. Copy an image8. Extract sub-image9. Flip image10. Contour image11. Reverse image12. Paste Image13. Set Pixel of an image14. Convert images to and from BIT(1 bit, black/white), BYTE(8 bits, gray), COLOR (24 bits)

15. Draw shapes within images a. Line (color) b. box (color)16. Dilate image17. Erode image18. Sharpen image19. Blur image20. Mask image21. Convert to YCrCb22. YCrCbBinarization23. 256Color quantized 24. PercentThresholdBinarization percent (i.e. 90 or .9)25. ThresholdBinarization rThresh gTresh bThresh26. Color2Gray_global27. Gray2Binary_global thresh28. Binary2Color29. Gray2Color30. Binary2Gray31. loadDocFile - Load an image as a document. Flips bits if black pixels are more than white pixels.32. Histgram for 1 and 8 bit images33. Projection34. Skeletonize35. deskew36. Loading an unknown image from memory or from file (Unknown image type must be one of the supported images)37. Skeletonize38. DLDocument39. Centroid Calculation

Page 9: DOCLIB Overview A developer perspective

DLDocument Design

DLDocument:

…DLPage 1: DLPage

1:DLPage2:

DLPageN:

DLDoc.

DLZone 1:Zone1:

Zone2:

ZoneM:

Zone1_1:

Zone1_2:

Zone1_3:

...

DLZone class allows users to subclass their own objects: e.g. DLSegment, DLLine, DLCharacter, DLTable etc.

Page 10: DOCLIB Overview A developer perspective

DOCLIB Development Overview

STL C++ code without platform specific

dependencies, such as MFC etc

developed under Microsoft´s .net environment

(mainly Visual Studio .NET 2003, i.e. v7.1)

compiles and runs under Windows and Linux

accessible via cvs server by core developers

consistent coding and documentation conventions

support add-on applications using core DOCLIB as

the foundatation layer

Page 11: DOCLIB Overview A developer perspective

Software Development Processes

Source Control Bug Tracking Software Integrity Documentation Standards Distribution Portal Software development

documentation under /doc and /ReleaseDoc directories of DOCLIB release

Page 12: DOCLIB Overview A developer perspective

Software Development Process

Accept Bug/Feature

Implement

Unit Test

Bug?

Run Regression Test

Pass?

Y

N

N

Submit BugExternal

Failure?

Debug

Resolve Bug Check into CVS

Y

Y

N

CVS Update

Page 13: DOCLIB Overview A developer perspective

DOCLIB Add-on Concept

DLImage DLDoc.

DOCLIB Core

Add-on1

Add-on3

Add-on4

Add-on5

Add-on2

Page 14: DOCLIB Overview A developer perspective

DOCLIB Add-on Structure

Page 15: DOCLIB Overview A developer perspective

DOCLIB Coding Conventions

Function names should start with lowercase“dl” void dlGetName()

Class names should start with uppercase “DL” class DLImage { }

Each function will be documented to conform to Doxygen standards

Page 16: DOCLIB Overview A developer perspective

DOCLIB Coding Conventions Run-time exceptions should be thrown using

standards defined in DLException.h

No creation of interim files during processing. All data manipulation in memory

Page 17: DOCLIB Overview A developer perspective

Generate documentation using Doxygen

Load doxygen config file under the/doc directory of the core DOCLIB release and customize for your add-on

Page 18: DOCLIB Overview A developer perspective

DOCLIB Resources

LAMP DOCLIB Project page http://lampsrv01.umiacs.umd.edu/projdb/project.php?id

=55 Include citation to DOCLIB as appropriate

DOCLIB Wiki (access controlled) https://wiki.umiacs.umd.edu/lamp/index.php?title

=DOCLIB

DOCLIB online demo http://lampsrv01.umiacs.umd.edu/doclib/upload.php

DOCLIBCommandLine A utility shipped with core DOCLIB that supports

command line, console, and script mode image processing

Its source code is intent to demo the key DOCLIB image processing features