Breakout Groups
Group 1
Reagan Moore
Hyeon Kim
Ching-Chih Chen
Jonghoon Chun
Sam Oh
Ulf Hermjakob
Karl Lo
Su-Shing Chen
Sung Been Moon
Group 2
Ron Larsen
Sung-Hyuk Kim
Gregory Crane
Doo-Kwon Baik
Michael Gertz
Stephen Helmreich
Bruce Miller
Bob Allen
Soon Joo Hyun
Yongchae Kim
Group 3
Ed Fox
Sung Hyon Myaeng
Lee Zia
Kang-Tak Oh
Sang-Ho Lee
Lois Delcambre
Young-Suk Lee
Hae Chang Rim
Sang-Goo Lee
List, Explain, and Prioritize
• List -> Prioritize
  – Actually, brainstorm to be inclusive -> full list
  – Maybe just need to categorize based on: Urgency? Window of opportunity? Short vs. long term?
• Applications / Application Domains
• Technologies
  – Research areas
  – Technical (challenge) problems
• Benefits
  – From US-Korean collaboration
  – Justify: impact on science, on (each) society
• Funding
  – strategies, tactics, scale/costs
  – matching, leveraging (including investment in related activities)
Charge for Fri 1:30pm Groups
• What is the problem?
• What specific part(s) of it can be solved in the short (1yr), medium (3yr), long term (5yr)? What are the dimensions of solution(s)?
  – Technology, policy, legal, digitization, management, …
• What is new? Why invest in solving the problem(s) now? What difference will it make if solved? Who will benefit? How?
• Who needs to be involved? Why?
• What resources exist to support this application?
• How will you know when you have succeeded? What are the evaluation method(s)? Deliverables?
Cultural Heritage – The Opportunity
• Korea has large collections of cultural heritage resources that are available only in Korea. International researchers cannot use them unless they travel to Korea.
• Ancient documents have been scanned for preservation purposes but are available only in image form. (Korea is responsible for this digitization effort.)
• Even the materials that are available digitally do not conform to international standards.
• Language barriers seriously impede usage.
• Precious ancient artifacts must be preserved before they are lost or destroyed.
The Problem, cont’d
The US-Korean cultural exchange of information is lop-sided.
– Non-conformance to standards
– Language barrier
– Low proportion of information digitized (most is in page image form)
– Lack of metadata support
– Complications of character sets and dialects
– OCR not up to the challenge
The Solution
• Phase 1 – It’s new & novel!
  – Build balanced core corpus
    • Text, images (art objects, places, people, rare books…), maps, dictionaries …
    • 3D representations of objects and spaces
  – Build bilingual resources
    • Dictionaries, lexicons
    • Assemble parallel & comparable corpora
  – Build best-of-class prototype based on current state of art
    • Demonstrate capability, feasibility, and functionality
    • Establish critical mass of people, infrastructure, and information resources
    • Evaluate sufficiency of current standards
      – E.g., TEI DTDs for Korean CH resources
    • Prepare for issues of long-term preservation
The Solution, cont’d
• Phase 2 – It’s useful & desirable!
  – Interoperable applications
    • Extend prototype into educational domain
  – Language technologies
    • Metadata
    • Build ontologies
    • Refine translation tools
    • Develop scholarly translation resources (commentaries, hand-tooled translations, …)
  – Prepare for scale-up
    • Validate architecture (may need some retrofit based on new R&D)
    • Begin production operations
    • User evaluation studies (needs the critical mass of resources)
The Solution, cont’d
• Phase 3 – It’s assumed & unnoticed!
  – Cross-lingual transparency
    • CLIR
    • Multimedia support
    • Extraction, summarization
  – Cross-disciplinary research and analysis
  – Cross-cultural learning and collaboration
  – Transformation of manual scholarly practice
    • Documents designed for digital library
    • Geo-referenced everything (e.g., all images)
    • Many details
      – Disambiguation of proper names
      – Co-reference
      – Authority control
(Some) Dimensions of Problem
• Policy
  – Make clear IP arrangements up front and make no compromises
  – Lock in (what you thought was) the obvious
  – Once in, never out (only move forward)
• Technology
  – Content-based multimedia information retrieval
What’s New? Why Now?
• Maturation of basic DL technologies
• Globally networked world
  – Broadband, expanding US infrastructure
  – Wired & wireless infrastructure in Korea
• Korean commitment to digitization of cultural heritage resources for global consumption
• Enlightened self-interest in global conformance to standards and best practices
• Emerging international DL interests and opportunities
Who Benefits?
• Who doesn’t?
• Revolutionizes access to high-quality information on Korean culture
  – Essentially inaccessible anywhere in the US today
• Expands accessibility of American cultural resources to Korea
• Enables greatly enhanced multicultural education and collaboration opportunities
Who’s Involved… Why?
• Universities
• Government
• Other disciplines emerging as relevant
  – Cognitive scientists
  – Librarians
What’s needed?
• Digitized resources produced through government resources (e.g., MIC)
• Existing DLs for comparable corpora in the West
• Language technologies existing and under development
• Coherent leadership (beginning now)
How will we know it worked?
• Quantitative measures (Size & usage statistics)
  – Number of users
  – Number of objects
  – US vs. Korean users
• Qualitative measures
  – International usability and usage