
Breakout Groups Group 1 Reagan Moore Hyeon Kim Ching-Chih Chen Jonghoon Chun Sam Oh Ulf Hermjakob Karl Lo Su-Shing Chen Sung Been Moon Group 2 Ron Larsen


Breakout Groups

Group 1

Reagan Moore

Hyeon Kim

Ching-Chih Chen

Jonghoon Chun

Sam Oh

Ulf Hermjakob

Karl Lo

Su-Shing Chen

Sung Been Moon

Group 2

Ron Larsen

Sung-Hyuk Kim

Gregory Crane

Doo-Kwon Baik

Michael Gertz

Stephen Helmreich

Bruce Miller

Bob Allen

Soon Joo Hyun

Yongchae Kim

Group 3

Ed Fox

Sung Hyon Myaeng

Lee Zia

Kang-Tak Oh

Sang-Ho Lee

Lois Delcambre

Young-Suk Lee

Hae Chang Rim

Sang-Goo Lee


List, Explain, and Prioritize

• List -> Prioritize
  – Actually, brainstorm to be inclusive -> full list
  – Maybe just need to categorize based on: Urgency? Window of opportunity? Short vs. long term?
• Applications / Application Domains
• Technologies
  – Research areas
  – Technical (challenge) problems
• Benefits
  – From US-Korean collaboration
  – Justify: impact on science, on (each) society
• Funding
  – strategies, tactics, scale/costs
  – matching, leveraging (including investment in related activities)


Charge for Fri 1:30pm Groups

• What is the problem?
• What specific part(s) of it can be solved in the short (1 yr), medium (3 yr), or long term (5 yr)? What are the dimensions of the solution(s)?
  – Technology, policy, legal, digitization, management, …
• What is new? Why invest in solving the problem(s) now? What difference will it make if solved? Who will benefit? How?
• Who needs to be involved? Why?
• What resources exist to support this application?
• How will you know when you have succeeded? What are the evaluation method(s)? Deliverables?


Cultural Heritage – The Opportunity

• Korea has large collections of cultural heritage resources that are available only in Korea. International researchers cannot use them without traveling to Korea.

• Ancient documents have been scanned for preservation purposes but are available only in image form. (Korea is responsible for this digitization effort.)

• Even the materials that are available digitally do not conform to international standards.

• Language barriers seriously impede usage.

• Precious ancient artifacts must be preserved before they are lost or destroyed.


The Problem, cont’d

The US-Korean cultural exchange of information is lop-sided.

– Non-conformance to standards

– Language barrier

– Low proportion of information digitized (most is in page image form)

– Lack of metadata support

– Complications of character sets and dialects

– OCR not up to the challenge


The Solution

• Phase 1 – It’s new & novel!
  – Build balanced core corpus
    • Text, images (art objects, places, people, rare books…), maps, dictionaries …
    • 3D representations of objects and spaces
  – Build bilingual resources
    • Dictionaries, lexicons
    • Assemble parallel & comparable corpora
  – Build best-of-class prototype based on current state of art
    • Demonstrate capability, feasibility, and functionality
    • Establish critical mass of people, infrastructure, and information resources
    • Evaluate sufficiency of current standards
      – E.g., TEI DTDs for Korean CH resources
    • Prepare for issues of long-term preservation


The Solution, cont’d

• Phase 2 – It’s useful & desirable!
  – Interoperable applications
    • Extend prototype into educational domain
  – Language technologies
    • Metadata
    • Build ontologies
    • Refine translation tools
    • Develop scholarly translation resources (commentaries, hand-tooled translations,
  – Prepare for scale-up
    • Validate architecture (may need some retrofit based on new R&D)
    • Begin the production operations
    • User evaluation studies (needs the critical mass of resources)


The Solution, cont’d

• Phase 3 – It’s assumed & unnoticed!
  – Cross-lingual transparency
    • CLIR
    • Multimedia support
    • Extraction, summarization
  – Cross-disciplinary research and analysis
  – Cross-cultural learning and collaboration
  – Transformation of manual scholarly practice
    • Documents designed for digital library
    • Geo-referenced everything (e.g., all images)
    • Many details
      – Disambiguation of proper names
      – Co-reference
      – Authority control


(Some) Dimensions of Problem

• Policy
  – Make clear IP arrangements up front and make no compromises
  – Lock in (what you thought was) the obvious
  – Once in, never out (only move forward)
• Technology
  – Content-based multimedia information retrieval


What’s New? Why Now?

• Maturation of basic DL technologies
• Globally networked world
  – Broadband, expanding US infrastructure
  – Wired & wireless infrastructure in Korea
• Korean commitment to digitization of cultural heritage resources for global consumption
• Enlightened self-interest in global conformance to standards and best practices
• Emerging international DL interests and opportunities


Who Benefits?

• Who doesn’t?
• Revolutionizes access to high-quality information about Korean culture
  – Essentially inaccessible anywhere in the US today
• Expands accessibility of American cultural resources to Korea
• Enables greatly enhanced multicultural education and collaboration opportunities


Who’s Involved… Why?

• Universities

• Government –

• Other disciplines emerging as relevant
  – Cognitive scientists
  – Librarians


What’s needed?

• Digitized resources produced through government resources (e.g., MIC)

• Currently existing DLs for comparable corpora in the West

• Language technologies existing and under development

• Coherent leadership (beginning now)


How will we know it worked?

• Quantitative measures (size & usage statistics)
  – Number of users
  – Number of objects
  – US vs. Korean users
• Qualitative measures
  – International usability and usage