Notes While Installing Kaldi on the atal Machine

Steps to install Kaldi

Questions to ask in the forum:
1. Missing help files
   This is official Kaldi readme. You are now in Kaldi/trunk mirror.
   Read Kaldi.md and INSTALL.md first!
2. Which scripts expect L.fst to exist? Does utils/prepare_lang.sh expect it?
3. What new scripts are written?
4. Purpose of dict_common
5. Can we get a list of the files that we create and the files/folders created by scripts?
6. What is the difference between the folder structure of hindi and tamilDemo?

Self notes
1. The egs directory contains directories named after the specific task/database on which ASR is to be done, e.g., the Switchboard database.
-----------------------------------------------------------------------------
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
./check_dependencies.sh: zlib is not installed.
./check_dependencies.sh: automake is not installed.
./check_dependencies.sh: libtool is not installed.
./check_dependencies.sh: autoconf is not installed.
./check_dependencies.sh: we recommend that you run (our best guess):
 sudo apt-get install zlib1g-dev automake libtool autoconf
You should probably do: sudo apt-get install libatlas3-base
/bin/sh is linked to dash, and currently some of the scripts will not run properly.
We recommend to run: sudo ln -s -f bash /bin/sh
atal:~/kaldi/kaldi-trunk/tools/extras% file /bin/sh
/bin/sh: symbolic link to `dash'
atal:~/kaldi/kaldi-trunk/tools/extras% sudo apt-get install zlib1g-dev automake libtool autoconf
[sudo] password for tauseef:
tauseef is not in the sudoers file. This incident will be reported.
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
You should probably do: sudo apt-get install libatlas3-base
/bin/sh is linked to dash, and currently some of the scripts will not run properly.
We recommend to run: sudo ln -s -f bash /bin/sh
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
You should probably do: sudo apt-get install libatlas3-base
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
./check_dependencies.sh: all OK.
atal:~/kaldi/kaldi-trunk/tools/extras%
-----------------------------------------------------------------------------------------
Error while running configure
atal:~/kaldi/kaldi-trunk/src% ./configure
Configuring ...
Checking OpenFST library in /home/tauseef/kaldi/kaldi-trunk/tools/openfst ...
***configure failed: Could not find file /home/tauseef/kaldi/kaldi-trunk/tools/openfst/include/fst/fst.h:
   you may not have installed OpenFst. See ../tools/INSTALL ***
atal:~/kaldi/kaldi-trunk/src%
----------------------------------------------------------------------------------
The above error occurred with Kaldi freshly downloaded from SourceForge.
Hence I resorted to using the trunk.tar.gz given by the IITM team at WISP.
Following is the log output of running the configure command:
atal:~/wispWorkshop/kaldi/trunk/src% ./configure
Configuring ...
Checking OpenFST library in /home/tauseef/wispWorkshop/kaldi/trunk/tools/openfst ...
Checking OpenFst library was patched.
Doing OS specific configurations ...
On Linux: Checking for linear algebra header files ...
Using ATLAS as the linear algebra library.
... no libatlas.so in /usr/lib
... no libatlas.so in /usr/lib/atlas
... no libatlas.so in /usr/lib/atlas-sse2
... no libatlas.so in /usr/lib/atlas-sse3
... no libatlas.so in /usr/lib64
... no libatlas.so in /usr/lib64/atlas
... no libatlas.so in /usr/lib64/atlas-sse2
... no libatlas.so in /usr/lib64/atlas-sse3
... no libatlas.so in /usr/local/lib
... no libatlas.so in /usr/local/lib/atlas
... no libatlas.so in /usr/local/lib/atlas-sse2
... no libatlas.so in /usr/local/lib/atlas-sse3
... no libatlas.so in /usr/local/lib64
... no libatlas.so in /usr/local/lib64/atlas
... no libatlas.so in /usr/local/lib64/atlas-sse2
... no libatlas.so in /usr/local/lib64/atlas-sse3
... no libatlas.so in /home/tauseef/wispWorkshop/kaldi/trunk/src/../tools/ATLAS/build/install/lib/
... no libatlas.so in /home/tauseef/wispWorkshop/kaldi/trunk/tools/ATLAS/lib
Could not find libatlas.so in any of the obvious places, will most likely try static:
Could not find libatlas.a in any of the generic-Linux places, but we'll try other stuff...
Successfully configured for Debian 7 [dynamic libraries] with ATLASLIBS =/usr/lib/atlas-base/libatlas.so.3.0 /usr/lib/atlas-base/libf77blas.so.3.0 /usr/lib/atlas-base/libcblas.so.3 /usr/lib/atlas-base/liblapack_atlas.so.3
CUDA will not be used! If you have already installed cuda drivers and cuda toolkit, try using --cudatk-dir=... option. Note: this is only relevant for neural net experiments
-----------------------------------------------------------------
The following steps were done to train and test on the Hindi database:
1. Copy the hindi folder
atal:~/wispWorkshop/kaldi/trunk% cp -rf ~/wispWorkshop/kaldi/trunkUsedAtWISP/egs/hindi egs/.
2. Rename the exp/ and mfcc/ directories that came from WISP
~/wispWorkshop/kaldi/trunk/egs/hindi% mv exp/ expAtWISP
~/wispWorkshop/kaldi/trunk/egs/hindi% mv mfcc/ mfccAtWISP
3. Change path.sh
~/wispWorkshop/kaldi/trunk/egs/hindi% gvim path.sh &
4. Replace speech by tauseef in wav.scp
/home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/train/wav.scp
atal:~/wispWorkshop/kaldi/trunk/egs/hindi/data/train% head wav.scp
aaloo_FYCGQM002CNBUP_0030 /home/tauseef/wispWorkshop/iiit_workshop_guru/wav/train_wav/aaloo_FYCGQM002CNBUP_0030.wav
aaloo_FYCPQM002CNBUP_0321 /home/tauseef/wispWorkshop/iiit_workshop_guru/wav/train_wav/aaloo_FYCPQM002CNBUP_0321.wav
~/wispWorkshop/kaldi/trunk/egs/hindi/data/test/wav.scp
5. If you delete data/lang/ and run the script, you get the error below
rm -rf data/lang/*
~/wispWorkshop/kaldi/trunk/egs/hindi% utils/prepare_lang.sh data/local/dict '!SIL' data/local/lang data/lang
SIL: Event not found.
rm -rf data/lang/L.fs
~/wispWorkshop/kaldi/trunk/egs/hindi% utils/prepare_lang.sh data/local/dict '!SIL' data/local/lang data/lang
SIL: Event not found.
Note: "Event not found" is the csh/tcsh history-expansion error for !SIL, which single quotes do not suppress in that shell, so the message most likely comes from the shell and not from the deleted files. Writing '\!SIL' or running the command from bash should avoid it.
6. fst.sh fails with "fstcompile: command not found" until path.sh is sourced
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ ./fst.sh
./fst.sh: line 22: fstcompile: command not found
Checking /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt ...
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt is OK
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ . path.sh
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ ./fst.sh
Checking /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt ...
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt is OK
Checking /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.{txt, int} ...
--> 1 entry/entries in /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.txt
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.int corresponds to /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.txt
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.{txt, int} are OK
--> SUCCESS
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ find . -name 'L.fst'
./data/lang/L.fst
./1hr_data/lang/L.fst
./1hr_data/local/lang_test_selvi/L.fst
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ find . -name 'G.fst'
./data/lang/G.fst
./1hr_data/lang/G.fst
./1hr_data/local/lang_test_selvi/G.fst
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Steps taken to run agmark on the atal machine
1. Copy 4 folders from hindi to agmark
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ cp -rf local/ steps/ utils/ conf/ ../agmark/.
2. Create data
3. Create folders within data
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark$ mkdir data/train data/test data/local
4. Copy transcription, wav.scp, utt2spk, spk2utt
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark$ cp -rf /home/tauseef/wispWorkshop/agmark/doc/kaldi/train/* data/train/.
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark$ cp -rf /home/tauseef/wispWorkshop/agmark/doc/kaldi/test/* data/test/.
5. Create two more directories under data: lang and lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data$ ls
local/ test/ train/
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data$ mkdir lang lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data$ chmod a+rx lang/ lang_test/
6. Create 4 directories under data/local: dict dict_comm lang lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data/local$ mkdir dict dict_comm lang lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data/local$ chmod a+rx dict dict_comm lang lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data/local$ ls
dict/ dict_comm/ lang/ lang_test/
7. Copy the dictionary, phone and filler files
tauseef@atal:~/wispWorkshop/agmark/doc$ cp marathiAgmark.dic marathiAgmark.filler marathiAgmark.phone /home/tauseef/wispWorkshop/kaldi/trunk/egs/agmark/data/local/dict/.
Compare with the files used in tamilDemo:
lexicon.txt
nonsilence_phones.txt = marathiAgmark.phone
optional_silence.txt
silence_phones.txt
Note:
1. marathiAgmark.dic does not have sil
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/tamilDemo/data/local/dict$ tail lexicon.txt
fb_laugh SIL
fb_ln SIL
fb_pron SIL
fb_pau SIL SIL
fb_uu SIL
fb_whisper SIL
fb_br SIL
sil SIL
!SIL SIL
2. optional_silence.txt and silence_phones.txt look like the following, so just copy them from tamilDemo
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/tamilDemo/data/local/dict$ head optional_silence.txt
SIL
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/tamilDemo/data/local/dict$ head silence_phones.txt
SIL
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Use of perl, emacs for text processing
1. Change the following line in the transcription
to
2. Use a macro in emacs. A macro is a sequence of keystrokes:
F3 (start macro) -> Begin of line 1 -> End -> Backspace -> Ctrl R -> Shift F -> Shift End -> Cut -> goto begin by Home -> Paste -> goto begin of next line -> end macro by F4

Write down the following steps for agmark test/train:
how to modify transcription
how to get utt2spk and spk2utt
how to modify wav.scp
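
Two of the open items above (how to modify wav.scp, how to get spk2utt from utt2spk) can be sketched in shell. This is only a sketch under assumptions, not the scripts used at the workshop: the old path prefix /home/speech/ is a guess based on the note "replace speech by tauseef", the utterance and speaker IDs are made up, and the function names fix_wav_scp and make_spk2utt are hypothetical.

```shell
#!/bin/sh
# Sketch only: file contents, IDs and function names are hypothetical.

# Rewrite the account name in wav.scp lines read from stdin.
# Assumption: the old paths start with /home/speech/.
fix_wav_scp() {
    sed 's#/home/speech/#/home/tauseef/#'
}

# Build spk2utt lines ("<speaker> <utt1> <utt2> ...") from utt2spk lines
# ("<utterance> <speaker>") read from stdin. Utterances keep their input
# order per speaker; speakers are sorted in the C locale.
make_spk2utt() {
    awk '{ utts[$2] = utts[$2] " " $1 }
         END { for (spk in utts) print spk utts[spk] }' | LC_ALL=C sort
}

# Small demonstration on made-up entries.
printf '%s\n' 'utt_0001 /home/speech/wispWorkshop/wav/utt_0001.wav' | fix_wav_scp
printf '%s\n' 'spkA_0001 spkA' 'spkA_0002 spkA' 'spkB_0001 spkB' | make_spk2utt
```

On real data the functions would be used as filters, for example fix_wav_scp < data/train/wav.scp > wav.scp.new followed by a check of the result before replacing the original. Kaldi's utils/ directory normally ships utt2spk_to_spk2utt.pl for the second step, so the awk version is only there to show what that conversion does.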