Design of a Search Engine for Metadata Search Based on Metalogy
Ing-Xiang Chen, Che-Min Chen, and Cheng-Zen Yang
Dept. of Computer Engineering and Science
Yuan-Ze University
http://syslab.cse.yzu.edu.tw/
ICADL 2001 - 2001/12/11
[email protected] YZU, Taiwan - ICADL2001 2
Outline
• Introduction
• Related Technologies
• System Architecture
• An Experimental Prototype
• Conclusions
• Future work
Introduction
• Metadata management is not an easy task:
  – It requires specific domain knowledge for appropriate data categorization.
  – It needs to deal with the complicated relationships between the metadata items.
  – A good management tool for easing metadata construction and manipulation is necessary.
Introduction
• Metalogy
  – Metalogy is a management system developed by the ROSS project group in Taiwan.
  – It can be used to manipulate various digitized items and export/import XML records.
  – It is mainly designed for the metadata management of each digital library.
Introduction
• Search across digital libraries:
  – Metalogy does not consider how to search for information across digital libraries.
  – As digital libraries are widely deployed, searching for information across several digital libraries becomes important.
  – We design a search engine to help users find resources without connecting to each digital library and entering the same query terms repeatedly.
Introduction
• We design this search engine based on the XML data exported from Metalogy for two reasons:
  – XML/Metalogy provides comprehensive metadata descriptions and DTD information for metadata search.
  – The quality of the distributed service highly depends on the quality of the data resource.
Related Technologies
• Z39.50
  – It was proposed to search and retrieve information from heterogeneous databases over networks.
  – It provides an abstract search capability.
  – It is difficult to implement because of its extensive functionality.
• OAI – Arc
  – Arc is developed for cross-archive searching.
  – It adopts the OAI protocol to harvest digital archives.
Related Technologies
• Harp
  – Harp provides a uniform query interface across legacy public libraries through HarpSQL.
  – A HarpSQL server acts as a query agent that stores and handles the intermediate query results, not as a search engine that collects and stores all metadata.
• METALICA
  – It adopts a meta-search engine like MetaCrawler to provide a uniform user interface for supporting cross-archive search.
System Architecture
[Figure: system architecture diagram. Components: Digital Library 1, Digital Library 2, ..., Digital Library n (each providing XML and DTD); XML Parser (Java Application); Index Database; Search Engine (Java Servlet); DTD Manager (Java Servlet); User Interface; Manager Interface; Browser. Labeled flows: Query, Request, Metadata, DTD.]
System Architecture
• The search engine is constructed with three modules:
  – Search engine module
    • Provide an integrated user interface.
    • Adopt Java servlets to provide search services.
  – Index database module
    • Provide a metadata repository for digital library sources.
    • Adopt the simple Dublin Core set as the default metadata.
    • Store DTD mapping relationships.
System Architecture
  – Metadata/DTD manager
    • Provide an administration interface to manage XML/DTD mapping relationships.
    • Parse and translate the XML/DTD documents provided by remote digital libraries.
    • Gather information from remote digital libraries and update the index database repeatedly.
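The translation step of the Metadata/DTD manager can be pictured as renaming each library-specific field to a Dublin Core element through a stored mapping table. The following is a minimal sketch under that assumption, not the prototype's actual code; the class name `DtdMappingSketch`, the method `toDublinCore`, and the field names (`WorkName`, `Artist`, `ShelfCode`) are hypothetical and do not come from the slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: translate one library's field names to Dublin Core names.
class DtdMappingSketch {

    // Rename the fields of one record according to a per-library mapping table.
    // Fields that have no mapping entry are skipped.
    static Map<String, String> toDublinCore(Map<String, String> record,
                                            Map<String, String> mapping) {
        Map<String, String> dc = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            String dcName = mapping.get(field.getKey());
            if (dcName != null) {
                // If two source fields map to the same element, join their values.
                dc.merge(dcName, field.getValue(), (a, b) -> a + "; " + b);
            }
        }
        return dc;
    }

    public static void main(String[] args) {
        // Assumed mapping for one digital library (illustrative names only).
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("WorkName", "title");
        mapping.put("Artist", "creator");
        mapping.put("Period", "date");

        Map<String, String> record = new LinkedHashMap<>();
        record.put("WorkName", "Sample scroll");
        record.put("Artist", "Sample artist");
        record.put("ShelfCode", "A-17"); // no mapping entry, so it is dropped

        // prints {title=Sample scroll, creator=Sample artist}
        System.out.println(toDublinCore(record, mapping));
    }
}
```

Keeping the mapping as data rather than code matches the idea of an administration interface: adding a new digital library would then mean adding a new table, not changing the translator.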
An Experimental Prototype
• Development tools:
  – Implement this search engine in Java to achieve platform independence.
  – Parse XML information with the JAXP (Java API for XML Parsing) package.
  – The database is built on MySQL, an open-source database.
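As an illustration of the JAXP parsing step, the sketch below reads one exported XML record with the DOM API that ships with current JDKs and pulls out a single field. It is only an assumed shape for such a record: the element names (`record`, `WorkName`, `Artist`) and the helper `firstText` are hypothetical, and the slides do not show the prototype's parsing code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: extract one field from an XML record using JAXP (DOM).
class JaxpParseSketch {

    // Return the trimmed text of the first element with the given tag name,
    // or null if the record has no such element.
    static String firstText(String xml, String tag) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0
                    ? null
                    : nodes.item(0).getTextContent().trim();
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("could not parse record", e);
        }
    }

    public static void main(String[] args) {
        // Assumed record layout; real exported records may look different.
        String xml = "<record><WorkName>Sample scroll</WorkName>"
                   + "<Artist>Sample artist</Artist></record>";
        System.out.println(firstText(xml, "WorkName")); // prints Sample scroll
    }
}
```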
An Experimental Prototype
• XML/DTD manager
[Figure callout: management functionality]
An Experimental Prototype
• A mapping example
[Figure callout: mapping information]
An Experimental Prototype
• A search example
[Figure callout: a famous calligrapher, Hsi-Chih Wang (303-361 AD)]
An Experimental Prototype
• Search results
[Figure callouts: matched metadata; link to the resource file]
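To make the result list concrete, the sketch below shows one way a query could be matched against records gathered from several libraries, returning the matched metadata together with a link to the resource. It uses an in-memory list as a stand-in for the MySQL index database and a plain substring match; the class `IndexSearchSketch`, the `IndexedRecord` type, and the example.org URLs are hypothetical, and the prototype's real matching logic is not described in the slides. It needs Java 9 or later for `Map.of`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: search Dublin Core records collected from several libraries.
class IndexSearchSketch {

    // One indexed record: source library, Dublin Core fields, link to the resource.
    static final class IndexedRecord {
        final String library;
        final Map<String, String> dc;
        final String url;

        IndexedRecord(String library, Map<String, String> dc, String url) {
            this.library = library;
            this.dc = dc;
            this.url = url;
        }
    }

    // Case-insensitive substring match over all Dublin Core values of each record.
    static List<IndexedRecord> search(List<IndexedRecord> index, String query) {
        String q = query.toLowerCase(Locale.ROOT);
        List<IndexedRecord> hits = new ArrayList<>();
        for (IndexedRecord rec : index) {
            for (String value : rec.dc.values()) {
                if (value.toLowerCase(Locale.ROOT).contains(q)) {
                    hits.add(rec);
                    break; // one matching field is enough to report the record
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Illustrative data only; the URLs are placeholders.
        List<IndexedRecord> index = new ArrayList<>();
        index.add(new IndexedRecord("Library 1",
                Map.of("title", "Sample scroll", "creator", "Artist A"),
                "http://example.org/lib1/1"));
        index.add(new IndexedRecord("Library 2",
                Map.of("title", "Sample painting", "creator", "Artist B"),
                "http://example.org/lib2/7"));

        for (IndexedRecord hit : search(index, "scroll")) {
            // prints Library 1 -> http://example.org/lib1/1
            System.out.println(hit.library + " -> " + hit.url);
        }
    }
}
```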
Conclusions
• We present the design of a search engine for searching information across digital libraries based on metadata/XML.
• The design of the search engine has three advantages:
  – First, the system architecture is simple and the cost is low.
Conclusions
  – Second, the system is highly extensible for newly required services.
  – Third, with this uniform user interface, users need not know how or where to search for information.
Future Work
• Quality control of the metadata provided by the original digital library sources should be addressed.
• The mapping scheme for supporting more heterogeneous digital archives should be further discussed.
Future Work
• Performance should be further addressed for large-scale environments.
• How to effectively update information from the remote digital libraries is another important task.