Replicating Linguistic Resources. B2SAFE: MPI-TLA CLARIN Center. Willem Elbers (MPI-TLA). 2nd EUDAT Conference. Date: 29 October 2013

Page 1: Replicating Linguistic Resources

Replicating Linguistic Resources

B2SAFE: MPI-TLA CLARIN Center

Willem Elbers (MPI-TLA)

2nd EUDAT Conference

Date: 29 October 2013

Page 2: Replicating Linguistic Resources

The Language Archive

• Data on languages:

– about 60 terabytes of well-described resources

– about 20,000 hours of digitized audio/video recordings

– about 73,000 metadata-described sessions

– about 4.5 million annotated segments

– data on more than 200 languages

– among these, data from about 60 DOBES teams

– acquisition, speech, multimodal, multilingual, language and cognition, brain imaging, ethnological and other data.

• Mission:

– Maintaining access to all stored resources for the current generation of researchers, language communities and the interested public.

– Preserving the valuable cultural heritage for current and future generations.

Page 3: Replicating Linguistic Resources

B2SAFE

• Goals

– Replication of data

• B2SAFE!

– Replication of services

• RZG providing Language Archive Technology services at the replica side

• B2SAFE Community extensions:

– Replication based on logical structure defined in the IMDI/CMDI metadata

– Integrated with underlying SAM-FS
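
The idea of replicating along the logical structure defined in the metadata can be illustrated with a minimal sketch. It assumes a simplified tree of metadata records; the `MetadataNode` class, the `replication_plan` function and the file names are hypothetical and do not reflect the actual IMDI/CMDI schemas or the B2SAFE implementation.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class MetadataNode:
    """Hypothetical stand-in for one metadata record (not a real IMDI/CMDI type)."""
    name: str
    resources: List[str] = field(default_factory=list)  # files described by this record
    children: List["MetadataNode"] = field(default_factory=list)  # nested corpora/sessions


def replication_plan(node: MetadataNode, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (logical_path, resource) pairs by walking the metadata tree depth-first."""
    path = f"{prefix}/{node.name}"
    for resource in node.resources:
        yield path, resource
    for child in node.children:
        yield from replication_plan(child, path)


# Assumed example hierarchy: one corpus containing two sessions.
corpus = MetadataNode("corpus", children=[
    MetadataNode("session-a", resources=["a.wav", "a.eaf"]),
    MetadataNode("session-b", resources=["b.mpg"]),
])

plan = list(replication_plan(corpus))
for logical_path, resource in plan:
    print(logical_path, resource)
```

In this sketch the replica keeps the same logical paths as the source, so a session's resources stay grouped under the metadata record that describes them.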

Page 4: Replicating Linguistic Resources

[Figure: two labels, each reading "Approx. 3 TB, #objects"]

Page 5: Replicating Linguistic Resources

Summary

“Cultural Heritage Data replicated for the future”

• Data replication running in production

• LAT Software stack running @ RZG (beta)

• Replication of authorization records running (beta)
