RA TRAINING DAY GRF Corpus project

Grf corpus project training 1

DESCRIPTION

Training workshop presentation

Page 1: Grf corpus project training 1

RA TRAINING DAY

GRF Corpus project

Page 2: Grf corpus project training 1

Sign in to the project

Get your user account and log in to https://grfcorpus.teamworkpm.net/

Page 3: Grf corpus project training 1

Get the software

Download the software from: http://tla.mpi.nl/tools/tla-tools/elan/
Or from the project page

Page 4: Grf corpus project training 1

ELAN working environment

An ELAN project consists of 2 files:
1) .etf file
2) source audio file

Download 2 files from Teamwork:
1) your personal audio file as per your task
2) the standard etf template file

Page 5: Grf corpus project training 1

Create your new project

File -> New -> select your wav/mp3 audio file + the etf template file.

The annotation work consists of 2 parts:
1) segmentation
2) transcription

Page 6: Grf corpus project training 1

Segmentation 1

Options -> segmentation mode

Listen first. Different participants are recorded.

Page 7: Grf corpus project training 1

Segmentation 2

Start with Speaker1 - Sentence tier

Segment each speaker separately.
Fine-tune boundaries
Delete, move, merge and split

Page 8: Grf corpus project training 1

Transcription 1

Options -> transcription mode

Select Speech

Page 9: Grf corpus project training 1

Transcription 2

Listen and type

Page 10: Grf corpus project training 1

Transcription 3

This phase:

Page 11: Grf corpus project training 1

1st copy of segmentation

Options -> Annotation mode
Tier -> Create annotations on dependent tiers
Speech -> JyutPing, Translation

Page 12: Grf corpus project training 1

More transcription

Use this or transcription view to enter text
For JyutPing transcription use the website: http://hktv.cc/hp/cantonesetojyutping/
Pay attention to spaces

Page 13: Grf corpus project training 1

Tokenizing

Tier -> Tokenize tiers: JyutPing -> Words

Adjust segments while pressing Alt
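
Why spaces matter here: tokenizing cuts each JyutPing annotation into Words at the spaces. The snippet below is only a rough sketch of that idea in Python, assuming whitespace is the delimiter; it is not ELAN's own code. The function names and the sample line are hypothetical.

```python
def tokenize_jyutping(line: str) -> list[str]:
    """Split one JyutPing annotation into word tokens on whitespace."""
    return line.split()


def has_space_problems(line: str) -> bool:
    """True if the line has leading, trailing or repeated spaces."""
    return line != " ".join(line.split())


# Hypothetical annotation with an accidental double space.
sample = "nei5 hou2  maa3"
print(tokenize_jyutping(sample))
print(has_space_problems(sample))
```

A line flagged by `has_space_problems` is worth fixing in the JyutPing tier before you tokenize.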

Page 14: Grf corpus project training 1

2nd copy of segmentation

Tier -> Create annotations on dependent tiers
Words -> English Gloss, IPA, Language

Language has Controlled Vocabulary: E, C, P, ?
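
A controlled vocabulary means only the listed values are valid on the Language tier. As a minimal sketch, assuming the Language values have been copied out as a plain list (the list and the function name below are hypothetical), a check could look like this:

```python
# Allowed values, taken from the controlled vocabulary on the slide.
ALLOWED = {"E", "C", "P", "?"}


def invalid_language_values(values):
    """Return (position, value) pairs for entries outside the vocabulary."""
    return [(i, v) for i, v in enumerate(values) if v not in ALLOWED]


# Hypothetical Language annotations for one sentence.
labels = ["C", "C", "E", "c", "?"]
print(invalid_language_values(labels))
```

Here the lowercase `c` is reported because the check is case-sensitive.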

Page 15: Grf corpus project training 1

Last 2 Tiers

Code switching types:
Annotation mode
Select a section with your mouse and double click
Choose an option

Translation:
Annotation mode or Transcription mode
Ctrl+Enter or Configure Verbal Unit Tier

Page 16: Grf corpus project training 1

More participants

Recreate tier structure for each participant
Tier -> Add new participant -> OK
Take a break and repeat the whole transcription process.

Save your work often
Try using a mouse

Page 17: Grf corpus project training 1

Finish

Upload your saved .eaf file to Teamwork and set the task to complete.
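
Before uploading, it can help to check that no tier was left empty. An .eaf file is XML, so a quick look is possible with Python's standard library. The sketch below assumes the usual EAF layout (`TIER` elements with a `TIER_ID`, each annotation holding an `ANNOTATION_VALUE`); the sample document and the function name are made up for illustration and are not project data.

```python
import xml.etree.ElementTree as ET

# Tiny made-up EAF fragment: one filled Speech annotation
# and one JyutPing annotation that was left empty.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="Speech" PARTICIPANT="Speaker1">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>sample text</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="JyutPing" PARENT_REF="Speech" PARTICIPANT="Speaker1">
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a2" ANNOTATION_REF="a1">
        <ANNOTATION_VALUE></ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def tier_summary(root):
    """Map each tier ID to (number of annotations, number left empty)."""
    summary = {}
    for tier in root.iter("TIER"):
        values = [(v.text or "").strip() for v in tier.iter("ANNOTATION_VALUE")]
        empty = sum(1 for v in values if not v)
        summary[tier.get("TIER_ID")] = (len(values), empty)
    return summary


root = ET.fromstring(SAMPLE_EAF)
for tier_id, (total, empty) in tier_summary(root).items():
    print(f"{tier_id}: {total} annotations, {empty} empty")
```

For a real file, replace `ET.fromstring(SAMPLE_EAF)` with `ET.parse("yourfile.eaf").getroot()`, where the file name is whatever you saved your project as.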