Revolution & Kids: Building the Future of the Net &
Understanding the Structures of the World
Bruce R. Schatz
CANIS - Community Systems Laboratory
University of Illinois at Urbana-Champaign
[email protected], http://canis.uiuc.edu
Special Holiday Lecture sponsored by IRIS-CISE
National Science Foundation, Dec 16, 1997
Building the Future of the Net
The Grand Challenge is
Semantic Federation across a
Billion Community Repositories
Evolution of the Net
• The Present: Access
– The Net fetches documents
• The Future: Organization
– The Net searches repositories
• The Millennium: Analysis
– The Net correlates information
From the Internet (data transmission)
to the Interspace (information manipulation)
Timeline (1965, 1975, 1985, 1995, 2000, 2010): ARPANET, Internet, Interspace
The Third Wave of Net Evolution
PROTOCOLS IP FTP HTTP CORBA CP
SERVICES Distributed Files, Global Hypermedia, Distributed Objects, Global Semantics
FUNCTION Access Organization Analysis
UNITS Packets Files Links Objects Concepts
Distributed Paths
Categories
SMP
System Model Timeline
• 1984 Telesophy proposed at Bellcore
– revolutionary system prototype developed
• 1989 Schatz systems advisor at NCSA
– 20th Anniversary ARPANET Symposium
• 1994 NCSA Mosaic gains many users
– Mosaic 1M, Netscape 50M, Explorer 100M
• 1999 Web achieves Telesophy functionality
– browsing & sharing multimedia documents
Telesophy
Publishing Cycle
USER request
LIBRARY reference
INDEXER classify
PUBLISHER quality
AUTHOR generate
• users are authors, computers are publishers
• every person & machine performs every role
Community System
browse and share all the knowledge of a community
Formal                                Informal
data (database management)            results (electronic mail)
literature (information retrieval)    news (bulletin boards)
knowledge (hypertext annotations)
Worm Community System
• WCS Information:
– Literature: BIOSIS, Medline, newsletter, meetings
– Data: genes, maps, sequences, strains, people
• WCS Functionality:
– Browsing: search, navigation
– Filtering: selection, analysis
– Sharing: linking, publishing
WCS: 250 users at 50 labs across the Internet
Worm Community System (WCS)
World of a Billion Repositories
• Every community has its own repository
– objects and pointers in information space
• Local collections but global solutions
– peer-peer architecture for infrastructure
• Net must support semantic correlation
– vocabulary switching in concept spaces
Levels of Indexes
FORMAL (traditional): Technology, Engineering, Electrical, IEEE
INFORMAL (digital): communities, groups, individuals
Community Repositories
• repository is an organized collection
• DLI establishing large repositories for major publishers
• WCS integrated large formal (journals) and small informal (bulletins)
• need semantic retrieval across publishers
• need semantic indexing for small publishers
Semantic Retrieval
• automatic indexing of concepts
– find context of terms (phrases) within documents
– generates concept space from co-occurrence frequency
• useful for interactive searching
– given a term, can suggest other terms
– merging concept spaces supports vocabulary switching
• concept spaces require supercomputing
– space for Inspec (400K) took 1 day on SGI Challenge
– 600 spaces for Compendex (4M) took 3.5 days on HP Convex Exemplar
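As a rough illustration of how a concept space might be generated from co-occurrence frequency and then used to suggest terms, here is a minimal sketch. It is not the algorithm used in the DLI work: the weighting (documents containing both terms divided by documents containing the first term) is a simplifying assumption, and the function names `build_concept_space` and `suggest` and the sample documents are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_concept_space(docs):
    """Build {term: {other_term: weight}} from a list of per-document term lists.

    Assumed weight: (# docs containing both terms) / (# docs containing term).
    The weight is asymmetric, so a -> b can differ from b -> a.
    """
    doc_freq = Counter()
    pair_freq = Counter()
    for terms in docs:
        unique = set(terms)  # count each term once per document
        doc_freq.update(unique)
        pair_freq.update(permutations(sorted(unique), 2))  # ordered pairs
    space = {}
    for (a, b), n_ab in pair_freq.items():
        space.setdefault(a, {})[b] = n_ab / doc_freq[a]
    return space


def suggest(space, term, k=3):
    """Given a term, suggest up to k related terms, strongest first."""
    related = space.get(term, {})
    return sorted(related, key=lambda t: (-related[t], t))[:k]


# Hypothetical toy collection: each document is already reduced to its phrases.
docs = [
    ["semantic retrieval", "concept space", "indexing"],
    ["concept space", "vocabulary switching"],
    ["concept space", "indexing", "supercomputing"],
]
space = build_concept_space(docs)
print(suggest(space, "concept space", k=2))
```

The pair counting grows with the square of the number of distinct terms per document, which gives some feel for why spaces over hundreds of thousands of abstracts needed supercomputer time.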
Vocabulary Switching
• Grand Challenge of Digital Libraries
– semantic interoperability across subject domains
– vocabulary switching to suggest across domains
• Generating 1000 community repositories
– 600 categories across engineering (38 top-level)
– 150 categories across EE, CS, physics
– 3M raw abstracts, about 10M in community spaces
• large-scale supercomputer simulation
– 7 days of dedicated computation (10 days overall)
– have space navigation; need space intersection
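One way to picture vocabulary switching is as a lookup that crosses from one domain's concept space into another through terms the two spaces share. The sketch below is only an assumption about how such a switch could work, not the method from the simulation: the function `switch_vocabulary`, the product-of-weights scoring, and both toy spaces are hypothetical.

```python
def switch_vocabulary(space_a, space_b, term, k=3):
    """Suggest terms from domain B for a term that lives in domain A.

    Each space is {term: {related_term: weight}}. Terms related to `term`
    in A act as bridges whenever they also have entries in B; a candidate's
    score is the sum over bridges of (weight in A) * (weight in B).
    """
    scores = {}
    for bridge, w_a in space_a.get(term, {}).items():
        for target, w_b in space_b.get(bridge, {}).items():
            if target != term:
                scores[target] = scores.get(target, 0.0) + w_a * w_b
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical concept spaces for two subject domains.
space_ee = {"signal": {"noise": 0.8, "filter": 0.5}}
space_cs = {
    "noise": {"error correction": 0.9, "randomness": 0.4},
    "filter": {"query": 0.6},
}
print(switch_vocabulary(space_ee, space_cs, "signal", k=2))
```

Only bridge terms present in both spaces contribute, so the quality of the switch depends on how much the two vocabularies overlap.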
Interspace Prototype
• semantic retrieval: concept spaces
• semantic interoperability: vocabulary switching
• semantic indexing: object categorization
• semantic clustering: category maps
• information spaceflight: region visualization
• analysis environment: path correlations
Analysis Environments
• Concept Spaces for interactive suggestion
• Category Maps for semantic clustering
• Information Spaceflight for navigation
• examples from DLI Research
http://csl.ncsa.uiuc.edu/interspace.html
Understanding the Structures of the World
The Grand Challenge is
Point-of-View Spaceflight across a
Billion Community Repositories
Global Cultural Memory
• Enable Everyone to understand
– How They Have Lived and
– How They Will Live
• The Structures of Everyday Life
– recorded over space and time
– correlated in a global information space
• Appreciate the Past and Predict the Future
Evolutionary System Model
• Getty project using commercial technology
• Museum-style Web collection
• curators to classify from big sources
• Champaign County Historical information
• county-level community repository
• 5000 Historical Societies in US
Revolutionary System Model
• A planet for every kid’s local environment
• Federating the planets into a universe
• Ordering all planets from kid’s POV
• Flying through the Kids Universe
• Finding similar kids from different POVs
• Connecting historically through museums
Category Map (2D)
Information Spaceflight (3D)
Research Issues
• Community Repositories [sovereign individual]
– indexing all the knowledge of small groups
• Semantic Federation [transient centralization]
– creating virtual libraries from distributed repositories
• Information Analysis [think globally, act locally]
– correlating across repositories to solve problems
The World of a Billion Repositories:
The Interspace of the 21st Century
http://dli.grainger.uiuc.edu/_ppt/vanguard/vanguard1/index.htm
Societal Implications
• Sovereign Individual
– community repositories for custom indexing
• Transient Centralization
– community curators as relationship bonders
• Think Globally, Act Locally
– vocabulary switches for correlation infrastructure
• Situational Analysis
– dynamic categorization for problem solving
• Zen of the Net
– at one with the knowledge of the world (mushin)