7/31/2019 NLP Documentation
ABV-Indian Institute of Information Technology
and Management
Natural Language Processing Lab Assignment
Submitted To:
Dr. Mahua Bhattacharya
Submitted By:
Ishan Gupta (2008IPG-37)
iii. In the third step, the recordings were created.
iv. To create the word-level transcription, a words.mlf file was created. Next, the mkphones0.led script was created to facilitate the creation of phone-level transcriptions. The following command was then executed to create the phone-level transcriptions:
HLEd -A -D -T 1 -l '*' -d dict -i phones0.mlf mkphones0.led words.mlf
This command creates phones0.mlf. The screenshot of the execution of this command is given below.
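To make the effect of this step concrete, the following Python sketch mimics what the HLEd run is expected to do, assuming mkphones0.led holds the edit commands commonly used in the HTK tutorial (EX, IS sil sil, DE sp). The contents of mkphones0.led and dict are not shown in this report, so the toy dictionary and the function name words_to_phones are hypothetical.

```python
def words_to_phones(words, pron_dict):
    """Expand a word sequence into a phone sequence, in the spirit of HLEd."""
    phones = []
    for word in words:
        # EX: replace each word by its dictionary pronunciation
        phones.extend(pron_dict[word])
    # DE sp: delete short-pause labels
    phones = [p for p in phones if p != "sp"]
    # IS sil sil: insert silence at the start and end of the utterance
    return ["sil"] + phones + ["sil"]


# Toy pronunciation dictionary (assumed entries, not taken from the real dict)
toy_dict = {
    "DIAL": ["d", "ay", "ah", "l", "sp"],
    "ONE": ["w", "ah", "n", "sp"],
}

print(words_to_phones(["DIAL", "ONE"], toy_dict))
```

The real phones0.mlf is produced by HLEd itself; this sketch only illustrates the word-to-phone expansion on a single utterance.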
v. In the fifth step, the audio files had to be converted to MFCC files. This was done by creating the codetrain.scp file and tuning the parameters in the config file. After both files were created, the following command was executed:
HCopy -A -D -T 1 -C wav_config -S codetrain.scp
The screenshot of execution of this command is given below:
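HCopy reads codetrain.scp as a list of source and target file pairs, one pair per line. A minimal sketch of how such a list could be generated is shown here; the directory names wav and mfcc, the file names, and the helper codetrain_lines are assumptions for illustration, since the actual layout used in this assignment is not shown.

```python
from pathlib import PurePosixPath


def codetrain_lines(wav_paths, mfcc_dir):
    """Return one 'source target' line per audio file for an HCopy script file."""
    lines = []
    for wav in wav_paths:
        wav = PurePosixPath(wav)
        # Same base name as the recording, with an .mfc extension
        target = PurePosixPath(mfcc_dir) / (wav.stem + ".mfc")
        lines.append(f"{wav} {target}")
    return lines


# Hypothetical recordings
print("\n".join(codetrain_lines(["wav/sample1.wav", "wav/sample2.wav"], "mfcc")))
```

The printed lines can be saved as codetrain.scp and passed to HCopy with the -S option.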
This command creates the files proto and vFloors in the hmm0 folder. The screenshot of the execution of this step is given below:
The next step involves the creation of flat-start monophones. This was done with the following steps:
a. Create a new file called hmmdefs in your 'voxforge/manual/hmm0' folder: copy the monophones1 file to your hmm0 folder, then rename the copy to hmmdefs.
b. For each phone in hmmdefs: put the phone in double quotes; add '~h ' before the phone (note the space after the '~h'); and copy from line 5 onwards (i.e. from "<BEGINHMM>" to "<ENDHMM>") of the hmm0/proto file and paste it after each phone. Leave one blank line at the end of your file.
A file named macros also had to be created, which involved the following steps:
a. Create a new file called macros in hmm0.
b. Copy vFloors to macros.
c. Copy the first 3 lines of hmm0/proto (from ~o to <DIAGC>) and add them to the top of the macros file.
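The manual editing of hmmdefs and macros described above can also be sketched in Python. This is only an illustration under assumptions: the toy proto and vFloors texts stand in for the real HCompV output (which is much longer), the prototype body is assumed to start with <BEGINHMM> on line 5, and the helper names build_hmmdefs and build_macros are hypothetical.

```python
def build_hmmdefs(proto_text, phones):
    """Repeat the prototype body (line 5 onwards) once per phone."""
    body = proto_text.splitlines()[4:]  # skip the first 4 lines of proto
    lines = []
    for phone in phones:
        lines.append(f'~h "{phone}"')  # phone in double quotes after '~h '
        lines.extend(body)
    # Final newline plus one blank line at the end of the file
    return "\n".join(lines) + "\n\n"


def build_macros(proto_text, vfloors_text):
    """Put the first 3 lines of proto on top of the vFloors contents."""
    header = proto_text.splitlines()[:3]
    return "\n".join(header) + "\n" + vfloors_text


# Toy stand-ins for hmm0/proto and hmm0/vFloors (assumed layout)
toy_proto = "\n".join([
    "~o",
    "<STREAMINFO> 1 2",
    "<VECSIZE> 2<NULLD><MFCC_0_D_N_Z><DIAGC>",
    '~h "proto"',
    "<BEGINHMM>",
    "<NUMSTATES> 5",
    "<ENDHMM>",
])
toy_vfloors = "~v varFloor1\n<VARIANCE> 2\n 0.1 0.1\n"
toy_phones = ["sil", "aa"]

print(build_hmmdefs(toy_proto, toy_phones))
print(build_macros(toy_proto, toy_vfloors))
```

In practice the two strings would be read from hmm0/proto and hmm0/vFloors, the phone list from the monophones file, and the results written to hmm0/hmmdefs and hmm0/macros.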
Next, nine folders, hmm1 to hmm9, were created in the project directory (htk1), and the following command was executed, which created hmm1/hmmdefs and hmm1/macros:
HERest -A -D -T 1 -C config -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm0/macros -H hmm0/hmmdefs -M hmm1 monophones1
The screenshot of execution is given below
Similarly, the following two commands were executed to create the files in folders hmm2 and hmm3 respectively:
HERest -A -D -T 1 -C config1 -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm1/macros -H hmm1/hmmdefs -M hmm2 monophones1
HERest -A -D -T 1 -C config1 -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm2/macros -H hmm2/hmmdefs -M hmm3 monophones1
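Since each re-estimation pass differs only in its input and output folders, the command lines can be generated rather than typed by hand. The sketch below builds the argument list for one pass and prints the two commands above. The helper herest_command and its default values are illustrative assumptions; the commands are only printed here, not executed.

```python
import shlex


def herest_command(src, dst, config="config1", mlf="phones0.mlf",
                   hmm_list="monophones1",
                   pruning=("250.0", "150.0", "1000.0")):
    """Build the HERest argument list for one pass from folder src to folder dst."""
    return [
        "HERest", "-A", "-D", "-T", "1",
        "-C", config,
        "-I", mlf,
        "-t", *pruning,
        "-S", "train.scp",
        "-H", f"{src}/macros",
        "-H", f"{src}/hmmdefs",
        "-M", dst,
        hmm_list,
    ]


# Passes hmm1 -> hmm2 and hmm2 -> hmm3
for n in (1, 2):
    print(shlex.join(herest_command(f"hmm{n}", f"hmm{n + 1}")))
```

If HTK is installed and on the PATH, each list could be passed to subprocess.run to execute the pass.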
vii. In the next step, the main task was to fix the silence model. This was done by creating an sp model. First, the contents of folder hmm3 were copied to folder hmm4, and then the following steps were performed:
a. Copy and paste the sil model in hmmdefs and rename the new one sp (don't delete your old "sil" model, you will need it; just make a copy of it).
b. Remove states 2 and 4 from the new sp model (i.e. keep the 'centre state' of the old sil model in the new sp model).
c. Change <NUMSTATES> to 3.
d. Change <STATE> to 2.
e. Change <TRANSP> to 3.
f. Change the matrix in <TRANSP> to a 3 by 3 array.
g. Change the numbers in the matrix as follows:
0.0 1.0 0.0
0.0 0.9 0.1
0.0 0.0 0.0
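The construction of the sp model from the sil model can be sketched as follows. This is a simplified illustration, not the procedure actually used: it assumes the sil model is available as a list of lines in which the states are introduced by lines reading exactly "<STATE> 3" and "<STATE> 4", and the toy model and the helper make_sp_model are hypothetical.

```python
def make_sp_model(sil_lines):
    """Derive a 3-state sp model from the lines of a 5-state sil model."""
    start = sil_lines.index("<STATE> 3")
    end = sil_lines.index("<STATE> 4")
    centre = sil_lines[start + 1:end]  # parameters of the centre state only
    return (
        ['~h "sp"', "<BEGINHMM>", "<NUMSTATES> 3", "<STATE> 2"]
        + centre
        + [
            "<TRANSP> 3",
            " 0.0 1.0 0.0",
            " 0.0 0.9 0.1",
            " 0.0 0.0 0.0",
            "<ENDHMM>",
        ]
    )


# Toy sil model with one-dimensional states (real models are much larger)
toy_sil = [
    '~h "sil"', "<BEGINHMM>", "<NUMSTATES> 5",
    "<STATE> 2", "<MEAN> 1", " 0.1",
    "<STATE> 3", "<MEAN> 1", " 0.2",
    "<STATE> 4", "<MEAN> 1", " 0.3",
    "<TRANSP> 5", " (5 by 5 matrix omitted)", "<ENDHMM>",
]

print("\n".join(make_sp_model(toy_sil)))
```

The original sil lines are left untouched, so both models can be kept in hmmdefs.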
Then the sil.hed file was created and the following command was executed:
HHEd -A -D -T 1 -H hmm4/macros -H hmm4/hmmdefs -M hmm5 sil.hed monophones1
This command created the files hmmdefs and macros in the folder hmm5. The screenshot of the execution of this command is given below:
Next, the following commands were executed to create the hmmdefs and macros files in the hmm6 and hmm7 folders respectively:
HERest -A -D -T 1 -C config -I phones1.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm5/macros -H hmm5/hmmdefs -M hmm6 monophones0
HERest -A -D -T 1 -C config1 -I phones1.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm6/macros -H hmm6/hmmdefs -M hmm7 monophones1
viii. The training data was realigned with the following command:
HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config -H hmm7/macros -H hmm7/hmmdefs -i aligned.mlf -m -t 250.0 150.0 1000.0 -y lab -a -I words.mlf -S train.scp dict monophones0 > HVite_log
The snapshot of execution of this command is given below:
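After realignment it can be useful to confirm that every training file listed in train.scp received an entry in aligned.mlf. The sketch below compares base names from the two files. It assumes that each MLF entry starts with a quoted label path such as "*/sample1.lab"; the file names and the helper unaligned_utterances are hypothetical.

```python
from pathlib import PurePosixPath


def unaligned_utterances(scp_lines, mlf_lines):
    """Return base names listed in train.scp that have no entry in the MLF."""
    aligned = {
        PurePosixPath(line.strip().strip('"')).stem
        for line in mlf_lines
        if line.startswith('"')  # entry headers are quoted label paths
    }
    wanted = [PurePosixPath(line.strip()).stem
              for line in scp_lines if line.strip()]
    return [name for name in wanted if name not in aligned]


# Toy inputs: sample2 is missing from the aligned output
toy_scp = ["mfcc/sample1.mfc", "mfcc/sample2.mfc"]
toy_mlf = ["#!MLF!#", '"*/sample1.lab"', "sil", "d", "sil", "."]

print(unaligned_utterances(toy_scp, toy_mlf))
```

Any names reported this way would need to be checked before the next HERest passes are run with aligned.mlf.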
Finally, the following two commands were executed to create the hmmdefs and macros files in the hmm8 and hmm9 folders respectively:
HERest -A -D -T 1 -C config1 -I aligned.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm7/macros -H hmm7/hmmdefs -M hmm8 monophones1
HERest -A -D -T 1 -C config1 -I aligned.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm8/macros -H hmm8/hmmdefs -M hmm9 monophones1