5aSC8. Transcription and forced alignment of the Digital Archive of Southern Speech
Margaret E. L. Renwick, Michael L. Olsen, Rachel Miller Olsen, Joseph A. Stanley
Code    Meaning
{D:}    Doubt: e.g. {D: doubted words}
{X}     Unintelligible
{B}     Beep: added to anonymize audio recordings available to the public
{C:}    Comment: e.g. {C: tape distortion}
{NW}    Non-word: e.g. laugh, cough
{NS}    Non-speech: e.g. dog barking, door closing
7. REFERENCES & ACKNOWLEDGMENTS
This research is supported by: NSF BCS #1625680 to co-PIs Kretzschmar and Renwick, the UGA Graduate School, and the American Dialect Society.
[1] Boersma, P., and Weenink, D. (2015). Praat: Doing phonetics by computer [Computer program], Version 5.4.08. Retrieved from http://www.praat.org
[2] Boudahmane, K., Manta, M., Antoine, F., Galliano, S., and Barras, C. (1998). Transcriber [Computer program], Version 1.5.2. Retrieved from http://trans.sourceforge.net/
[3] Fromont, R., and Hay, J. (2012). “LaBB-CAT,” Proceedings of the Australasian Language Technology Workshop, 10, 113–117.
[4] Gorman, K., Howell, J., and Wagner, M. (2011). “Prosodylab-Aligner: A Tool for Forced Alignment of Laboratory Speech,” Canadian Acoustics, 39(3), 192–193.
[5] Kretzschmar, W. A. J. (2011). Linguistic Atlas Project. http://www.lap.uga.edu/
[6] Kretzschmar, W. A., Bounds, P., Hettel, J., Pederson, L., Juuso, I., Opas-Hänninen, L. L., and Seppänen, T. (2013). “The Digital Archive of Southern Speech (DASS),” Southern Journal of Linguistics, 37(2), 17–38.
[7] Pederson, L., McDaniel, S. L., and Adams, C. M. (Eds.) (1986). Linguistic Atlas of the Gulf States, University of Georgia Press, Athens, Georgia, Vols. 1-7.
[8] Reddy, S., and Stanford, J. N. (2015). “Toward completely automated vowel extraction: Introducing DARLA,” Linguistics Vanguard, 1(1), 15–28. doi:10.1515/lingvan-2015-0002
[9] Rosenfelder, I., Fruehwald, J., Evanini, K., and Yuan, J. (2011). FAVE (Forced Alignment and Vowel Extraction) program suite [Computer program]. Retrieved from http://fave.ling.upenn.edu
1. THE DIGITAL ARCHIVE OF SOUTHERN SPEECH (DASS)
• Subset of the Linguistic Atlas of the Gulf States (LAGS) [7]
• 64 speakers recorded in sociolinguistic interviews from 1968–1983 in 8 U.S. Gulf States
• 30 female, 34 male; born 1886–1965; mean age 61 years
• 4 speakers for each of 16 LAGS geographical sectors [6] (Fig. 1)
• 1 African American (AA) speaker, and 3 European American (EA) speaker “Types”
• 372 hours of audio (2.5–10 hours per interview; µ = 5.75 hours)
• .wav files filtered in Praat [1] to remove artifactual noise above 17 kHz
Speaker Type      Description
1 “Folk”          Older, less educated, less connected
2 “Common”        Younger, better educated & connected
3 “Cultivated”    Most educated, culturally aware, connected to the community
AA                African American
3. GOALS
• To offer methods for transcription and automatic phonetic analysis of a large speech corpus
• To automatically extract as much good acoustic data as possible from these legacy recordings
• To use this rich historical data to explore the sociophonetics of Southern speech
CODING:
• Codes (below) are employed in curly brackets {} within Transcriber
• Each code is transcribed on its own line (Fig. 3, top)
TEXTGRID OUTPUT:
• Time alignments from Transcriber are mapped to TextGrid intervals
• Each speaker (interviewee, interviewer) receives a separate tier
• Only interviewee tier is phonetically analyzed
• Intervals containing {} are excluded from phonetic analysis
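The tier and exclusion rules above can be sketched in Python. This is only an illustration under stated assumptions, not the project's conversion scripts: the `Interval` tuple, the tier names and the sample rows are hypothetical, and only the curly-bracket convention is taken from the coding scheme.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """One time-aligned stretch of transcript (hypothetical structure)."""
    speaker: str   # assumed tier name, e.g. "interviewee"
    start: float   # seconds
    end: float     # seconds
    text: str


# Matches any curly-bracket code such as {X}, {NW} or {D: doubted words}.
CODE_PATTERN = re.compile(r"\{[^{}]*\}")


def has_code(text):
    """Return True if the text contains at least one {...} code."""
    return CODE_PATTERN.search(text) is not None


def build_tiers(intervals):
    """Group intervals into one list (tier) per speaker."""
    tiers = defaultdict(list)
    for interval in intervals:
        tiers[interval.speaker].append(interval)
    return dict(tiers)


def analyzable(intervals, target="interviewee"):
    """Keep only the target speaker's intervals that contain no code."""
    return [iv for iv in intervals
            if iv.speaker == target and not has_code(iv.text)]


# Invented sample data for illustration only.
sample = [
    Interval("interviewer", 0.0, 1.2, "what did you call that"),
    Interval("interviewee", 1.2, 3.0, "the cats made it sorta scary"),
    Interval("interviewee", 3.0, 3.6, "{NW}"),
    Interval("interviewee", 3.6, 5.1, "{D: always}"),
]
tiers = build_tiers(sample)
kept = analyzable(sample)
```

Filtering on the bracket pattern rather than on a fixed list of codes means free-text codes such as {C: ...} are caught without enumerating their contents.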
ENSURING CONSISTENCY ACROSS TRANSCRIPTIONS:
• In-house transcription guidelines (e.g. spelling, punctuation)
• Dictionary of non-standard words (e.g. uh-huh, gonna)
Figure 3. Transcriber software graphical user interface
4. METHODS FOR LARGE-SCALE TRANSCRIPTION
• Transcriber software [2] is used for orthographic transcription (Fig. 3)
• Facilitates user-friendly, precisely time-aligned, multi-tier transcription
• Approximately 40 undergraduates are assigned a single speaker each, […]: 54 minutes.
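As a rough illustration of what a time-aligned, multi-speaker transcript looks like to a script, the sketch below reads speaker turns from a small Transcriber-style XML string. The element and attribute names (`Speaker`, `Turn`, `Sync`, `startTime`, `endTime`, `time`) are an assumption about the .trs layout and should be checked against real files; the sample content is invented. It is also a simplification: it ignores overlapping speech and any markup nested between synchronization points.

```python
import xml.etree.ElementTree as ET

# Invented, minimal stand-in for a Transcriber .trs file (assumed layout).
SAMPLE_TRS = """<Trans>
  <Speakers>
    <Speaker id="spk1" name="interviewer"/>
    <Speaker id="spk2" name="interviewee"/>
  </Speakers>
  <Episode>
    <Section type="report" startTime="0" endTime="5.1">
      <Turn speaker="spk1" startTime="0" endTime="1.2">
        <Sync time="0"/>
        what did you call that
      </Turn>
      <Turn speaker="spk2" startTime="1.2" endTime="5.1">
        <Sync time="1.2"/>
        the cats made it sorta scary
        <Sync time="3.0"/>
        {NW}
      </Turn>
    </Section>
  </Episode>
</Trans>"""


def read_segments(trs_text):
    """Return (speaker, start, end, text) tuples, one per Sync point."""
    root = ET.fromstring(trs_text)
    names = {s.get("id"): s.get("name") for s in root.iter("Speaker")}
    segments = []
    for turn in root.iter("Turn"):
        speaker_id = turn.get("speaker")
        speaker = names.get(speaker_id, speaker_id)
        turn_end = float(turn.get("endTime"))
        syncs = turn.findall("Sync")
        for i, sync in enumerate(syncs):
            start = float(sync.get("time"))
            # A segment ends at the next Sync, or at the end of the turn.
            if i + 1 < len(syncs):
                end = float(syncs[i + 1].get("time"))
            else:
                end = turn_end
            # The transcript text follows the empty Sync element (its tail).
            text = " ".join((sync.tail or "").split())
            if text:
                segments.append((speaker, start, end, text))
    return segments


segments = read_segments(SAMPLE_TRS)
```

Each tuple carries a speaker label and a start and end time, which is the information needed to place the text on a per-speaker tier.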
6. FORCED ALIGNMENT AND FORMANT EXTRACTION
• TextGrids and .wav files submitted to Dartmouth Linguistic Automation (DARLA) [8] for forced alignment (Fig. 4) and vowel extraction
• DARLA filters data, and by default does not return measurements for every token
• We are testing DARLA against three non-filtering formant extraction techniques (Fig. 5)
  - In-house Praat script: extracts all data, but formant tracking is errorful for back vowels
  - FAVE [9]: extracts all tokens, but its Bayesian formant tracking algorithm is not specialized for Southern speech; training data come from many U.S. varieties
  - Modified FAVE: extracts all tokens; Bayesian algorithm trained on mean formant values from 4 fully-transcribed DASS interviews; requires an extra step for data extraction
• Modified FAVE appears to perform best: it provides a clean, well-separated vowel space similar to DARLA’s output, but without data loss due to filtering
• At present, 18 interviews are fully transcribed, and 20+ are in progress
• Use our QR code to interact with this dataset in your web browser!
• Visit poster 5aSC9 for further acoustic analysis methods and results!
3-LISTEN SYSTEM:
1ST LISTEN: Orthographically record who said what, when.
2ND LISTEN: Correct spelling, ensure that transcription is properly time-aligned & in-house conventions are followed.
3RD LISTEN: 2-3 graduate students check all transcriptions to ensure consistency across the corpus.
Figure 1. DASS speakers by LAGS sector and type
2. MOTIVATION FOR TRANSCRIBING DASS
• Within the Linguistic Atlas Project [5], a limited number of target lexical items were impressionistically transcribed in LAGS Protocols (Fig. 2), with no acoustic analysis
• Maximum of 1031 transcribed items per speaker; little intraspeaker variation represented
• Transcribing full DASS interviews is expected to yield a searchable corpus of 1.5 million words, time-aligned to the audio, with corresponding acoustic data.
Figure 2. Example of LAGS Speaker Protocol transcriptions
Figure 4. Force-aligned TextGrid returned by DARLA
Phone tier: DH IY0 | K AE1 T S | M EY1 D | IH1 T | S AO1 R T AH0 | S K EH1 R IY0
Word tier: the | cats | made | it | sorta | scary
Time axis: 0 to 1.801 s
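The phone labels in Figure 4 appear to follow the ARPABET convention, in which vowels carry a trailing stress digit (0, 1 or 2) and consonants do not. Under that assumption, the vowel tokens in an aligned phone tier can be picked out as sketched below. The function names are hypothetical and this is not DARLA's or FAVE's own code.

```python
# Phone labels transcribed from Figure 4 ("the cats made it sorta scary").
PHONES = ("DH IY0 K AE1 T S M EY1 D IH1 T "
          "S AO1 R T AH0 S K EH1 R IY0").split()


def is_vowel(phone):
    """Assumed rule: a label ending in a stress digit is a vowel."""
    return phone[-1] in "012"


def split_stress(phone):
    """Split a vowel label into (vowel class, stress level)."""
    return phone[:-1], int(phone[-1])


# All vowel tokens, in order of occurrence.
vowels = [p for p in PHONES if is_vowel(p)]

# Vowel classes of the tokens bearing primary stress (digit 1).
primary = [split_stress(p)[0] for p in vowels if split_stress(p)[1] == 1]
```

For this utterance the sketch finds eight vowel tokens, five of them with primary stress (AE, EY, IH, AO, EH).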
Figure 5. Comparison of vowel formant extraction methods
5. TRANSCRIPTION AND DATA PROCESSING WORKFLOW
1 hour audio
→ Transcription by undergraduate using Transcriber, including double-checking: 12.5 hours of work (2 listens)
→ Spot-checked for consistency in Transcriber (3rd listen)
→ File conversion via LaBB-CAT [3] scripts: .trs (.xml) → .txt; .trs → .TextGrid
→ Automatic phonetic analysis!
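The workflow's ratio of 12.5 hours of work per hour of audio can be scaled up to the whole archive with simple arithmetic. The sketch below uses only figures stated on the poster (372 hours of audio, interviews of 2.5 to 10 hours with a mean of 5.75, and an expected 1.5 million words). The variable names are illustrative, and the totals are back-of-envelope estimates covering the first two listens only, not reported results.

```python
AUDIO_HOURS = 372             # total audio in DASS
WORK_PER_AUDIO_HOUR = 12.5    # hours of work per audio hour (2 listens)
MEAN_INTERVIEW_HOURS = 5.75   # mean interview length
MIN_INTERVIEW_HOURS = 2.5     # shortest interview
MAX_INTERVIEW_HOURS = 10      # longest interview
EXPECTED_WORDS = 1_500_000    # expected corpus size

# Estimated transcription effort for the full archive.
total_work_hours = AUDIO_HOURS * WORK_PER_AUDIO_HOUR

# Estimated effort for an interview of mean, minimum and maximum length.
mean_work_hours = MEAN_INTERVIEW_HOURS * WORK_PER_AUDIO_HOUR
min_work_hours = MIN_INTERVIEW_HOURS * WORK_PER_AUDIO_HOUR
max_work_hours = MAX_INTERVIEW_HOURS * WORK_PER_AUDIO_HOUR

# Implied average density of transcribed words per hour of audio.
words_per_audio_hour = EXPECTED_WORDS / AUDIO_HOURS

print(f"Total work: {total_work_hours:.0f} hours")
print(f"Per interview: {min_work_hours:.2f} to {max_work_hours:.2f} hours "
      f"(mean {mean_work_hours:.3f})")
print(f"Words per audio hour: {words_per_audio_hour:.0f}")
```

On these figures the first two listens amount to about 4650 hours of work in total, or roughly 72 hours for an interview of mean length.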
[Figure 1 map labels: Dallas, Austin, Houston, Little Rock, New Orleans, Shreveport, Jackson, Memphis, Nashville, Knoxville, Atlanta, Macon, Birmingham, Montgomery, Jacksonville, Orlando, Miami, Key West. Legend: Type 1, 2, 3, AA]