
A Digital Museum of Taiwanese Butterflies

Jen-Shin Hong, Herng-Yow Chen, and Jieh Hsiang Department of Computer Science and Information Engineering

National Chi-Nan University Puli, Nantou, TAIWAN 545

E-mail: {jshong, hychen, hsiang}@csie.ncnu.edu.tw

ABSTRACT Taiwan is renowned for its great variety of butterflies. There are about 400 species, a number of which are unique to Taiwan, across its 36,500 sq km of land. Last year we built a comprehensive digital collection of Taiwan's butterflies to provide a modern research environment on butterflies for academic institutions, as well as an interactive butterfly educational environment for the general public. Our digital museum emphasizes ease of use, and provides a number of innovative features to help the user fully utilize the information provided by the system. The digital museum is accessible through the Web at http://digimuse.nmns.edu.tw.

KEYWORDS: Digital museum, butterflies, content-based retrieval, courseware, FAQ.

INTRODUCTION Our digital museum of butterflies is being developed jointly by the National Chi-Nan University and the National Museum of Natural Science. Our goal is to provide a comprehensive database of Taiwanese butterflies, with extremely user-friendly interfaces and query systems. We envision the system being used both by researchers to study butterflies and by the general public to understand and appreciate the ecology and variety of butterflies. Our work is sponsored in part by the National Science Council of Taiwan under grant number NSC-88-2745-P-260-006.

Our digital museum framework contains six major modules: (1) XML-based information organization for digitized butterfly collections, (2) content-based image retrieval of butterflies, (3) SMIL-based synchronized multimedia exhibition, (4) compositional FAQ, (5) interactive games on the butterfly ecosystem, and (6) online courseware on butterflies.

MANAGING THE BUTTERFLY DATA The National Museum of Natural Science has about 12,000 specimens of about 300 species of butterflies. From these specimens we selected 876 of the better ones, photographed them onto 35mm slides, and digitized the slides. We also scanned 1034 photographs of butterflies in their natural habitat, including eggs, larvae, pupae, and adults. There are another 325 images of natural habitats, host plants, and honey plants.

To ensure portability and platform independence, we built our data entirely in XML. With the help of our colleagues in the Department of Library and Information Science of the National Taiwan University, we built a metadata system for butterflies and the associated XML DTD (document type definition). The text information is divided into five categories: an overall description of the species, together with related information about the collection process, and one category for each of the four stages of the life cycle: egg, larva, pupa, and adult.

Currently there are 318 XML documents, each describing one species. The metadata [1] was entered by butterfly experts through our XML input interface. The contents of the XML documents are indexed for our XML search engine, written in Perl, which supports both full-text and attribute search.
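To make the two search modes concrete, the following is a minimal sketch in Python (not the Perl engine described above). The element names, attribute names, and sample record are hypothetical and do not reproduce the project's actual DTD; the record only mirrors the five text categories described earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical species record: one overview plus one element per life stage.
# Element and attribute names are illustrative assumptions, not the real DTD.
SAMPLE = """
<species id="sp001" family="Papilionidae">
  <overview>A large swallowtail found in lowland forests.</overview>
  <egg>Pale yellow eggs laid singly on citrus leaves.</egg>
  <larva>Green caterpillar feeding on citrus.</larva>
  <pupa>Brown chrysalis attached to a twig.</pupa>
  <adult>Black wings with a band of white spots.</adult>
</species>
"""

def full_text_search(docs, term, section=None):
    """Return ids of documents whose text (optionally one section) contains term."""
    term = term.lower()
    hits = []
    for doc in docs:
        root = ET.fromstring(doc)
        scope = root if section is None else root.find(section)
        if scope is None:
            continue  # this document has no such section
        if term in " ".join(scope.itertext()).lower():
            hits.append(root.get("id"))
    return hits

def attribute_search(docs, name, value):
    """Return ids of documents whose root element has attribute name == value."""
    roots = (ET.fromstring(doc) for doc in docs)
    return [root.get("id") for root in roots if root.get(name) == value]
```

For example, `full_text_search([SAMPLE], "citrus")` matches the record through its egg and larva text, while `attribute_search([SAMPLE], "family", "Papilionidae")` matches it through the root attribute. A production engine would search a prebuilt index rather than reparse every document per query.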

CONTENT-BASED IMAGE RETRIEVAL One of the most interesting challenges for a digital museum of butterflies is to provide a natural way for a (novice) user to find a specific butterfly. Since a user usually does not know the name of a butterfly, we designed an image retrieval system that allows the user to describe, pictorially, the butterfly he or she has seen. We identified three attributes, color, shape, and pattern, as the most distinctive features of the visual appearance of a butterfly. For each attribute we chose several representative features and built, semi-automatically, a vector of visual feature values for each image in our database. We also incorporated a notion of “fuzziness” to deal with differences in human perception. (For instance, “gray” is closer to “white” than to “blue”.)
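The idea of fuzzy matching on perceptual features can be illustrated with a small Python sketch. The similarity values, feature names, and scoring rule below are assumptions made for illustration only; the paper does not specify the actual features or how fuzziness is computed.

```python
# Hypothetical fuzzy similarity between color names (1.0 means identical).
# The numbers are illustrative; they only encode that gray is nearer to
# white than to blue.
COLOR_SIMILARITY = {
    ("gray", "white"): 0.6,
    ("gray", "black"): 0.6,
    ("gray", "blue"): 0.2,
    ("white", "black"): 0.1,
}

def color_similarity(a, b):
    """Symmetric lookup; unknown pairs are treated as completely dissimilar."""
    if a == b:
        return 1.0
    return COLOR_SIMILARITY.get((a, b), COLOR_SIMILARITY.get((b, a), 0.0))

def fuzzy_match(query, image):
    """Average similarity over only the attributes the user specified."""
    scores = []
    for attr, wanted in query.items():
        actual = image.get(attr)
        if attr == "color":
            scores.append(color_similarity(wanted, actual))
        else:
            # Shape and pattern are matched exactly in this sketch.
            scores.append(1.0 if wanted == actual else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

With these assumed values, a user who remembers a "white, striped" butterfly still gets a high score for an image stored as gray with stripes, because the color term contributes partial credit instead of zero.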

Our query system employs two methods, query by features (QBF) and query by example (QBE), which can be used interactively. When making a query, the user is presented with an outline of a butterfly and menus of colors, patterns, and shapes. The user then chooses an item from one or several menus, and the chosen features are imposed on the query image. After the features are chosen, a QBF command is issued and a list of butterfly thumbnails is returned. If the user does not find what she wants, she can point to one of the thumbnails and request all similar butterflies (QBE) or refine the features (further QBF).
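The interplay of the two query modes can be sketched as follows in Python. This is a simplified illustration under stated assumptions: the image names, feature values, and the overlap-count ranking are hypothetical and stand in for whatever feature vectors and distance measure the real system uses.

```python
# Hypothetical feature database: image name -> chosen feature values.
DATABASE = {
    "img_a": {"color": "white", "pattern": "stripes", "shape": "rounded"},
    "img_b": {"color": "black", "pattern": "spots", "shape": "tailed"},
    "img_c": {"color": "white", "pattern": "spots", "shape": "rounded"},
}

def overlap(features, candidate):
    """Number of requested features that the candidate image shares."""
    return sum(1 for key, value in features.items() if candidate.get(key) == value)

def query_by_features(features, db=DATABASE):
    """QBF: rank images by how many of the menu-selected features they match."""
    scored = [(overlap(features, feats), name) for name, feats in db.items()]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [name for score, name in ranked if score > 0]

def query_by_example(example, db=DATABASE):
    """QBE: rank all other images by similarity to a chosen thumbnail."""
    reference = db[example]
    scored = [(overlap(reference, feats), name)
              for name, feats in db.items() if name != example]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [name for score, name in ranked]
```

A session might start with `query_by_features({"color": "white"})`, which returns the two white images, and continue with `query_by_example("img_c")` once the user spots a thumbnail that looks close to the butterfly she remembers.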

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Digital Libraries, San Antonio, TX. Copyright 2000 ACM 1-58113-231-X/00/0006…$5.00


The major difficulty to overcome is that different people may perceive the same butterfly differently. One person may see a butterfly as white with black stripes. Another may see the same butterfly as very dark gray with white spots. A third may remember only the distinctive stripes and not the color at all. Our system appears robust enough to lead to the target butterfly from these different descriptions. It is also very easy to use, since it is based on visual perception and requires no typing.

A major difference between our image retrieval approach and other CBIR methods (e.g. [2]) is that, because we focus on butterflies, the “granularity” that defines precision is much finer than it is for collections of more diverse images.

COMPOSITIONAL FAQ Many information providers maintain an FAQ (frequently asked questions) list. In our system, we take the concept one step further and allow a user to ask his or her own questions. The answers are composed from existing FAQs. For instance, if a user asks “How many legs does a butterfly have?”, the system will retrieve, from the existing list of over 100 FAQs, all the relevant ones, such as “How many legs does a butterfly have?”, “How many legs does a caterpillar have?”, and “Can a butterfly stand on glasses?”. Our mechanism uses keyword matching against an inverted file of the existing FAQs. If there are no relevant answers in the database, the system asks the user to leave an e-mail address. Since all questions asked by users are logged, we choose the better questions, send the answers back to the users by e-mail, and add them to the FAQ list.
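Keyword matching over an inverted file can be sketched in a few lines of Python. The stopword list, tokenizer, and ranking by shared keyword count are assumptions made for this illustration; the paper states only that an inverted file and keyword match are used.

```python
import re
from collections import defaultdict

# A few FAQ questions taken from the example above, plus one unrelated entry.
FAQS = [
    "How many legs does a butterfly have?",
    "How many legs does a caterpillar have?",
    "Can a butterfly stand on glasses?",
    "What do caterpillars eat?",
]

# Hypothetical stopword list; common words carry no retrieval value.
STOPWORDS = {"a", "an", "the", "does", "do", "how", "many", "can", "on", "what", "have"}

def keywords(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def build_inverted_file(faqs):
    """Map each keyword to the set of FAQ indices that contain it."""
    index = defaultdict(set)
    for i, question in enumerate(faqs):
        for word in keywords(question):
            index[word].add(i)
    return index

def related_faqs(question, faqs, index):
    """Return FAQs sharing at least one keyword, most shared keywords first."""
    counts = defaultdict(int)
    for word in keywords(question):
        for i in index.get(word, ()):
            counts[i] += 1
    ranked = sorted(counts, key=lambda i: (-counts[i], i))
    return [faqs[i] for i in ranked]
```

With `index = build_inverted_file(FAQS)`, the question "How many legs does a butterfly have?" retrieves the three FAQs quoted in the text and leaves out the unrelated one. An empty result corresponds to the case where the system asks the user for an e-mail address.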

SMIL-BASED MULTIMEDIA EXHIBITION The third interesting feature of our system is a set of narrated on-line slide shows presenting the most essential information about butterflies. The slide shows use SMIL (Synchronized Multimedia Integration Language) [3], and the goal is to provide the maximum amount of information about butterflies with the minimum number of clicks. To conserve bandwidth, the slide shows use still images with audio narration. There are currently 18 shows covering the life cycle, food, predators, environmental impact on butterflies, the differences between butterflies and moths, the defense mechanisms of butterflies, and procedures for preparing butterfly specimens. The topics, narration, and image contents were all selected and prepared by butterfly experts. So as not to overburden the user, we have limited each slide show to between 40 seconds and 2 minutes. The slide shows can be viewed with RealPlayer G2. A person who has gone through all the slide shows should have most of the basic knowledge about butterflies and their environment.

INTERACTIVE GAMES OF BUTTERFLY ECOSYSTEM We hope to educate children about the ecology of butterflies and to promote environmental awareness. To this end we have designed interactive games with educational content. A user chooses a species, and the system then simulates its development, starting from the egg. Questions concerning food, predators, climate, and simple facts about butterflies are presented in multiple-choice form. As the user proceeds, the egg hatches and the butterfly goes through its life cycle until it becomes an adult. While the first version of the games was built in 3D VRML, we realized that this was not practical with current bandwidth. We therefore provide only a 2D Java version on our current website.
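The game flow described above can be modeled as a small state machine. The following Python sketch is illustrative only: the rule that each correct answer advances the butterfly by one stage, the species name, and the sample question are all assumptions, and the actual game is a Java applet whose rules are not detailed here.

```python
# The four life-cycle stages in biological order.
STAGES = ["egg", "larva", "pupa", "adult"]

class ButterflyGame:
    """Tracks one simulated butterfly from egg to adult."""

    def __init__(self, species):
        self.species = species
        self.stage_index = 0  # every game starts at the egg

    @property
    def stage(self):
        return STAGES[self.stage_index]

    @property
    def finished(self):
        return self.stage == "adult"

    def answer(self, question, choice):
        """Check a multiple-choice answer; a correct one advances one stage.

        Assumption: one stage per correct answer. Wrong answers leave the
        stage unchanged.
        """
        correct = choice == question["answer"]
        if correct and not self.finished:
            self.stage_index += 1
        return correct

# Hypothetical question in the assumed dictionary format.
QUESTION = {
    "text": "What does a caterpillar mainly eat?",
    "choices": ["leaves of its host plant", "other insects", "soil"],
    "answer": "leaves of its host plant",
}
```

Starting from `ButterflyGame("example species")`, three correct answers take the simulated butterfly from egg to adult, and further answers no longer change the stage.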

ON-LINE BUTTERFLY COURSEWARE In addition to letting children visit the digital museum and learn by themselves, another effective way of reaching a wider audience is to give teachers a convenient way to extract course material from the digital museum. We have designed courseware with which a teacher can collect material (webpages) while browsing the digital museum, put it in folders (each folder may represent one lecture), and later show it to the class. Students can also go through the material off-line at their own pace. There are additional mechanisms for establishing discussion groups to encourage on-line student participation, and for making quizzes with which students can test their own progress. The advantage of our courseware system is that a teacher can assemble lectures as he or she sees fit. The mixture of webpages, audio/slide simulcast, and flexible queries should make the instruction quite dynamic and interesting. The discussion groups and quizzes also help students focus on their work.

DISCUSSION In this paper we briefly outlined the content and technical contributions of a digital museum of butterflies. In addition to containing comprehensive information on almost all species of butterflies in Taiwan, we have incorporated a number of interesting features such as a compositional FAQ mechanism, an image-based query system, SMIL-based multimedia exhibition, interactive games, and facilities for producing courses on butterflies. Our system employs XML as the basic organizational framework.

REFERENCES

1. Marshall, C. C. Making metadata: a study of metadata creation for a mixed physical-digital collection, in Proc. of the 3rd ACM Conference on Digital Libraries (Pittsburgh, June 23-26, 1998), ACM Press, pp. 162-171.

2. Flickner, M., Sawhney, H., Niblack, W., Ashley, J., et al. Query by image and video content: the QBIC system. IEEE Computer 28, 9 (Sep 1995), 23-32.

3. Rutledge, L., et al. Practical application of existing hypermedia standards and tools, in Proc. of the 3rd ACM Conference on Digital Libraries (Pittsburgh, June 23-26, 1998), ACM Press, pp. 191-199.