Databasing the World: Biodiversity and the 2000s Written by Bowker, G. C. Presented by Chen Zhang (Mike)


  • 1. Databasing the World: Biodiversity and the 2000s
    Written by Bowker, G. C.
    Presented by Chen Zhang (Mike)

2. Four Key Aspects
Database Infrastructure
Standards: flexible, stable
Technology: stable
Communication
Data Sharing
Ownership
Disarticulation
Data collection
3. Four Key Aspects
Distributed Collective Practice
Collaborative work
New Knowledge Economy
Accounting for life
Development of Classification
Cladistics
The Future
4. Database Infrastructure
5. Standards
Why do we need standards?
Example from the air-conditioner industry
Diameter match between the screw and the hole on the panel
Reasons databases need standards
Need a handshake among various media
MIME protocol
Each layer of infrastructure requires its own set of standards
Need standardized categories.
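The "handshake among various media" point can be illustrated with MIME content-type labels. The following is a minimal sketch using Python's standard `email` package; the subject line, file name, and placeholder bytes are assumptions made up for illustration and do not come from the paper.

```python
from email.message import EmailMessage

# Build one message that carries two different media types.
msg = EmailMessage()
msg["Subject"] = "Specimen photo"            # hypothetical subject
msg.set_content("Field notes in plain text.")

# Placeholder bytes standing in for image data (not a real PNG file).
msg.add_attachment(
    b"\x89PNG placeholder",
    maintype="image",
    subtype="png",
    filename="specimen.png",                 # hypothetical file name
)

# Each part declares its own media type, so a receiver knows how to handle it.
part_types = [part.get_content_type() for part in msg.iter_parts()]
print(msg.get_content_type())
print(part_types)
```

The sender and receiver never need to agree on anything beyond the MIME labels themselves, which is the sense in which a standard acts as a handshake between media.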
6. Standards
The best standards will not always win
Some best-known standards
QWERTY keyboard
7. Standards
The best standards will not always win
Some best-known standards
VHS (Video Home System) standard
8. Standards
The best standards will not always win
Some best-known standards
DOS computing system
9. Standards
The best standards will not always win
Why?
The best standard may not win the largest market
Standards setting is a key site of political work
An inferior standard may be favored by political agencies (such as standards-setting bodies)
10. Standards
Interoperability
Continuum of strategies for standards setting
One Standard Fits All
Let a Thousand Standards Bloom
11. Standards
Interoperability
Some Related Standards
1. ANSI/NISO Z39.50
ANSI/NISO Z39.50 is the American National Standard Information Retrieval Application Service Definition and Protocol Specification for Open Systems Interconnection.
It makes it easier to use large information databases by standardizing the procedures and features for searching and retrieving information.
12. Standards
Interoperability
Some Related Standards
ANSI/NISO Z39.50
13. Standards
Interoperability
Some Related Standards
1. ANSI/NISO Z39.50
Allows a single enquiry over multiple databases.
Widely adopted in the library world.
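The idea of "a single enquiry over multiple databases" can be sketched as a federated search. This is not an implementation of the Z39.50 protocol; it is a small in-memory illustration, and the catalogue names, titles, and the `search_all` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    source: str  # name of the catalogue the hit came from
    title: str


# Hypothetical in-memory catalogues standing in for remote library databases.
CATALOGUES = {
    "library_a": ["Atlas of European Birds", "Flora of North America"],
    "library_b": ["Birds of North America", "Insects of the Tropics"],
}


def search_all(term: str) -> List[Record]:
    """Run one case-insensitive title query against every catalogue."""
    needle = term.lower()
    hits = []
    for source, titles in CATALOGUES.items():
        for title in titles:
            if needle in title.lower():
                hits.append(Record(source=source, title=title))
    return hits


results = search_all("birds")
for record in results:
    print(f"{record.source}: {record.title}")
```

The user writes the query once, and a shared search interface takes care of asking every database in the same way.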
14. Standards
Interoperability
Some Related Standards
2. XML
Extensible Markup Language (XML) is a set of rules for encoding documents in machine-readable form.
Two extremes:
a. Colonial model
b. Democratic model (won out)
People's established computing environments
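To make "rules for encoding documents in machine-readable form" concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The tag names (`specimen`, `genus`, `species`, `collected`) and the sample values are assumptions chosen for illustration; a community could define entirely different tags, which is the point of the democratic model.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout; XML itself does not prescribe these tag names.
specimen = ET.Element("specimen")
ET.SubElement(specimen, "genus").text = "Quercus"
ET.SubElement(specimen, "species").text = "robur"
ET.SubElement(specimen, "collected").text = "1998-05-14"  # made-up date

# Serialize to plain text that any XML-aware program can exchange.
xml_text = ET.tostring(specimen, encoding="unicode")
print(xml_text)

# A different program can read the text back without knowing how it was produced.
parsed = ET.fromstring(xml_text)
fields = {child.tag: child.text for child in parsed}
print(fields)
```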
15. Technology
Technology must be stable
Nothing guarantees the stability of vast data sets
Failure of Paul Otlet's well-catalogued microfiches
Development of computer memory
Hard to retrieve information
16. Technology
Technology must be stable
Data accessible and usable
Infrastructure will require a continued maintenance effort
Reasons
a. Data is passed from one medium to another
b. Data is analyzed by successive generations of database technology
17. Issues of Communication
Problem of reliable metadata
Metadata: data about data
The blue lines are metadata
18. Issues of Communication
Problem of reliable metadata
The standard name of certain kinds of data
Searchable: easy to search over multiple databases
Issue: how detailed should the name of the data be?
Too little detail: the information about the data is useless
Too much detail: more time, more work
19. Issues of Communication
Dublin Core
The Dublin Core set of metadata elements provides a small and fundamental group of text elements through which most resources can be described and cataloged.
The Simple Dublin Core Metadata Element Set (DCMES) consists of 15 metadata elements:
Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
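As an illustration of how the 15 elements describe a resource, the sketch below builds a small Dublin Core record in XML with Python's standard library. The namespace URI is the one used for the simple element set; the `metadata` wrapper element, the `make_record` helper, and the `language` value are assumptions for illustration, while the title and creator come from the first slide.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # namespace of the 15 simple elements
ET.register_namespace("dc", DC_NS)

DCMES = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def make_record(fields):
    """Build a record, rejecting any name outside the 15-element set."""
    unknown = set(fields) - set(DCMES)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name in DCMES:
        if name in fields:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = fields[name]
    return root


record = make_record({
    "title": "Databasing the World: Biodiversity and the 2000s",
    "creator": "Bowker, G. C.",
    "language": "en",  # assumed value, for illustration only
})
record_xml = ET.tostring(record, encoding="unicode")
print(record_xml)
```

Every element is optional and repeatable in Dublin Core, so a record with only three fields, as here, is still a valid description.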
20. Data Sharing
21. Ownership
Control of knowledge
Mid-nineteenth century:
only professionally trained scientists and doctors
New information economy:
from many people
Example: patient groups
22. Ownership
Privacy
Keeping data private is difficult:
Example: data is compiled by a third-party company to generate a new, marketable form of knowledge
New Patterns of ownership
Science has frequently been analyzed as a public good
Increasing privatization of knowledge:
It is unclear to what extent the vaunted openness of the scientific community will last
23. Disarticulation
Ideal database
According to most practitioners, it should be theory-neutral yet serve as a common basis on which a number of scientific disciplines can progress.
Example: genome databank; new kind of science; genome; construct arguments about genetic causation; the process of mapping the genome

  • Data must be reusable by scientists

24. The data in a database should be easily manipulated by other scientists.