34
P10071·BBN PATROL Personne l 8.2 Johns Hopkins Universi ty Dr. Hynek Hermansky is a rull Professor of Elect ri cal and Computer I:ngineering al the Johns Hop kins University in Haltimorc. Maryland. He is also a Prolessor (o n leave of absence) at the Bma University of Technol ogy. Czech Rep ublic: an Adjunct Professor at the Oregon Ilcalth and Sciences University, Purthllld. Oregon; and an Ex ternal fellow at the Intenmtjonal Computer Science In stitute at Berkeley. Ca liforn ia. 1Ie is a Fellow of IEEE for "invention a nd development o fp crccplua ll y-based speech prm:ess in g methods," His ma in area or research is robust acoust ic processing of speech. Among the techniques that he has pioneered arc: Ilerceptua ll .inear Prediction (PI.P). RA STA modulation rr t! quency filtering. mu lt i·stream recog nition ofspccc h. d,It<t·guided feature extraction. frequcncy·domain PLP. and posterio based features for spl!ech recognition. He has applied these Int!thods to speech recognition. speaker recognition. and speech and audio cod in g. A number or the robust processing teehni411es thar he piom:i:red are now in wide usage wor ld w ide. Dr. Hermansky has been wo rking in speech processing for ov er 30 years and has held lhe fo ll ow ing pos it ions: Director of Research at th e IDIAP Research Institute. Martigny. Switzerland: Research Fellow alth e University of Tokyo: Research Engineer al Panasonic Technologies in Santa I::lClrbara. Ca lif ornj,,; Se ni or Member of Re search Stall' at US WEST Advanced Technol og ies: and Prof esso r and Director o rthe Ce nLe r fur In formation Processing at OHSU. Portland. Orego n. In h is proft! ss ional activities. Dr. Hermans ky served as Technical Cha ir at th e 1998 ICASSP in Sea n It! and an Associa te Ed it or for IEEE Transac ti on on Spt!ech and Audio. and is a member of the Organiz ing Committee for ICASSP 20 II in Prague. He is member of the Editorial Board of Speech Com munication, holds 7 US patents. and authored or co·a ulhored ovt!r 200 papers in re viewed jou rnal s and conterenee proceedings. 8.3 University of Maryland Dr. Shihab Shamma is a Full Professor. Department of Electrical Eng in eering. Uni versity of Maryl and (UMD). College Park. with ajoint appointment at th e Institute for System Research. UMD. Ilis research deals with questions in computational neurosc ience. speech and audio processing, neurornorphic eng in eering, and the developmerll of mi croscnsor systems for exper imental researc h and ncural prosth eses, Primary focus has been on study in g the computational principles undt!rlying thc proct!ssing and recognition of complex so und s in the aud it ory system. and the relationship between aud it ory and visual process ing. Sig nal processing algorithms inspired by data from neurophysiological and psychoacoustical experiments arc being developed and applied in a var iety of systems such as speech and voice re cogn iti on. diagnost ic s in industrial manufacturing. and underwater and battlefield acolls ti cs. Dr. Shaml113 has developcd nove l models of auditory cortical process in g. in spired by physiological and psycho acoustic experiments over the last decade th at have revealed a rich and nexible rep resentation of the perceptually significant fe<ltures of so und. The resu lt ant hi gher· dimcnsional co n ical representation (a frequency/ralc/scale vector as a func ti on of time) is able to separate speech so und s from other unwanted signals. slIch as various types of noise or music. He has been able to use this unique sep<lration property to design rmrlti·dimensional filters that enl1<lllce the quality of' noisy speech and suppress the noise, He has al so used the cortical representati on to design front ends to a speech activity detector (SAD) for many types of noise. li e has tested SAD with animal vocalizations. mus ic. and env iron mental so unds. includingje L 32 Use or disclosure of the data contained on this page is subject to the restriction on the title paga of this proposal. epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000103

PATROL Personnel - EPIC · P10071·BBN PATROL Personnel 8.2 Johns Hopkins University Dr. Hynek Hermansky is a rull Professor of Electrical and Computer I:ngineering al the Johns Hopkins

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

P10071·BBN PATROL Personnel

8.2 Johns Hopkins University

Dr. Hynek Hermansky is a rull Professor of Electrica l and Com puter I:ngineering al the Johns Hopkins University in Haltimorc. Maryland. He is also a Prolessor (on leave of absence) at the Bma University of Techno logy. Czech Republic: an Adjunct Pro fessor at the Oregon Ilcalth and Sc iences Un iversity, Purthllld. Oregon; and an External fellow at the Intenmtjonal Computer Sc ience Institute at Berkeley. Ca liforn ia. 1 Ie is a Fellow of I EEE for "invention and development ofpcrccplua ll y-based speech prm:ess ing methods," His ma in area or research is robust acoust ic processing o f speech. Among the techniques that he has pioneered arc: Ilerceptuall .inear Prediction (P I.P). RA STA modu lation rrt!quency filter ing. mu lt i· stream recognition ofspccch. d,It<t·guided feature ex traction. frequcncy·domain PLP. and posterior· based features for spl!ech recognition. He has applied these Int!thods to speech recognition. speaker recognition. and s peech and audi o cod ing. A num ber or the robust process ing teehni411es thar he piom:i:red are now in wide usage wor ldw ide.

Dr. Hermansky has been working in speech processing for over 30 years and has held lhe fo llowing pos it ions: Director of Research at the I DIAP Research Inst itute. Martigny. Switzerland: Research Fe llow althe University of Tokyo: Research Engineer al Panasonic Techno logies in Santa I::lClrbara. Ca lifornj,,; Seni or Member of Resea rch Stall' at US WEST Advanced Technolog ies: and Professor and Director orthe CenLer fur In formation Processing at OHSU. Portland. Oregon.

In his proft!ssional acti vities. Dr. Hermansky se rved as Technica l Cha ir at the 1998 ICASSP in Sean It! and an Associate Ed itor for IEEE Transaction on Spt!ech and Audio. and is a member of the Organiz ing Committee for ICASSP 20 II in Prague. He is member of the Editorial Board of Speech Communication, ho lds 7 US patents. and authored or co·aulhored ovt!r 200 pape rs in re viewed journal s and con terenee proceedings.

8.3 University of Maryland Dr. Shihab Shamma is a Full Professor. Department of Electrical Engineering. Uni vers ity of Mary land (UMD). College Park. with ajoint appo intment at the Inst itute for System Research . UMD. I lis research deal s with questions in computational neuroscience. speech and audio processing, neurornorphic eng ineering, and the developmerll of microscnsor systems fo r experimenta l research and ncural prostheses, Primary focus has been on study ing the computationa l principles undt!rlying thc proct!ssing and recognition of complex sounds in the aud itory system. and the relationship between aud itory and visual processing. Signal processing algorithms inspired by data from neu rophysiolog ical and psychoacoustical experiments arc being developed and applied in a variety of systems such as speech and voice recogn ition. diagnostics in industrial manufacturing. and underwater and batt lefie ld acollsti cs.

Dr. Shaml113 has develo pcd nove l model s of aud itory cort ica l process in g. inspired by physiological and psychoacoustic expe riments over the last decade that have revea led a rich and nexible represen tation of the perce ptually s ignificant fe<ltures of sound. The resultant higher· d imcnsional con ica l represen tation (a frequency/ralc/scale vector as a func tion of time) is able to separa te speech sounds from other unwanted s ignals. slIch as various types of noise or music. He has been ab le to use thi s unique sep<lration property to design rmrlti·dimensional filters that enl1<lllce the quality of' no isy speech and suppress the noi se, He has a lso used the cort ica l represen tat ion to design front ends to a speech activity detector (SA D) for many types of noise. li e has tested hi~ SAD wit h animal voca lizat ions. music. and env iron mental sounds. includingjeL

32 Use or disclosure of the data contained on this page is subject to the restriction on the title paga of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000103

Pl0071·BBN PATROL Personnel

engine no ise. factory no ise. destroyer engine sounds. milita ry vehic les . cars. and speech babble recorded in di ffe re nt environments li ke restaurants. a irports. and ex hibition ha lls. and. more recently. reverberant speech. Hi s speech ac tiv ity detector has been found to be superior to other SAl) methods in independent lests by several government agencies.

I)r. Shanllna is a Fe llow of the Acoustica l Soc iety or America. Senior Member of I EE E. member of the Association for Rcseurch in Otolaryngo logy_ and member urlhe Society fo r Neuroscience. Ilc is al s(,l Acti on Editor lor th e Journal o f Compu tational Ne urosc ience. Academic Editor for PLoS. and Academic Board Member fo r T rends in Cogni tive Sciences . I lc h,IS been Co­organize r and Oi rector of numerous Workshops and Sympos ia including most recent ly the Annual Te ll uride Workshops on Neuro lllorphic Engineering ( 1997- presenl. part ially funded by NSr, ON R. DARPA. and the Whitaker and Gatsby Foundati ons), and th e NIPS and COSYNE wo rkshops on Neural Mechani sms or Music Percepti on (1 999), Thalamo-cortica l Processing (2002). and Attention and Streaming (2006). J Ie is the ho lder ofa patent on " A Cochlear Filte r Ban k with Switched Ca pacitor Circuits". and has patent s pend ing on "Inte llig ibility Assessme nt Usi ng Spectro-tempoml Modul ations" and "Speech Discrimination Based on Multisca le Spectro­Tempora l Modu lat ions."

8.4 Cambridge University Dr. Mark (lales will serve as the primary person in charge of the RATS w(wk at Cambridge Uni vers ity. Professor Phil Woodland will serve as senior technical ad visor.

Dr. Mark Gales is Reader in In fo rmation Engi nee ring at Cambr idge Un iversi ty where he has been a member of faculty stafT since 1999. I)rior to this he was a Research Sta rr Member at IBM. He has work ed on speech process ing for over 18 years. with a part icular interest in acousti c modeling. acoustic environll1c lll robustness. and speaker udaptation. He has parti c ipated in a range o f DA RI'AIN IST automatic speech recognition evaluations s ince 1994. at both Cambridge and IBM . These include recent evaluations under the DARPA EA RS and GALE programs. Or. Gales has been awarded mu lt iple IEEE paper awards lo r his work on speech recognit ion. From 200 1 to 2004 he was a member of the IEEE Speech Techn ical Committee. He is currentl y an Associate Editor of the IEEE S ignal Process ing Letters. IEEE Transactions on Audio Speech and Language. and on the editorial board of Computer Speech and Language journa l. li e has publi shed over 100 papers, primari ly in the area of acoustic modeling for speech recognition.

Dr. Ga les has extensive experience in noise robustness. speaker adaplation. and speech recognit ion. He developed one of the fi rst approaches to mode l-based compensation for acoustic mode ls to handle noise cond itions - para lle l mode l combination. This modcl·based approach to handling hi gh levels o f background no ise has been refined to handle a range of im portant techniques to make it appli cable to state· of-thc-arl speech process ing applications, including di scriminative adaptive training. and improv ing efliciency to a ll ow its applicati on to large systems. He developed one o rthe standard approaches fo r speech recogni tion and adaptive training. Constra ined Max imum Likelihood Linear Regression (CMLLR). as we ll as re fining this to be more suited for speaker adaptation in the presence o fltigh leve ls of background noise. Under the DA RPA EA RS and GA LE progra ms he has bcen in vo lved in deve loping large vocabul ary transcripti on systems for Eng li sh. Mandarin and Arabic. In additi on. he has worked on sequence kernel s that form an importanl pa rt of speaker identificat ion and verifi cation systems and has re(;entl y been applied to handling hi gh-level s of background no ise.

33 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000104

P10071-BBN PATROL Personnel

Phil Woodland is Professor of Information Engineering at the University of Cambridge where he leads the Speech Group. li e has been a mcmber offaculty stafTat Cambridge since f 989. li e has published almost 200 acadcmie papers in the area or speech and language technology. Ma ny o r these papers are hi gh ly cited. including the most highly cited paper to appear in the jou rnal Complilcr !';peech and Language. He has won best paper awa rds for work in the areas of speaker adaptation and dis(;riminative training of large vocabulary speech recognition systems. He has also worked on stati st ica l language mode ling. auditory mode ling. stati stical speech synthesis. and spokcn documcnt rctrie va l. Ili s team has developed large-vocabulary speech recognition systcms that have rrequentl y g iven the lowest error rates in international research eva luations organized by the US Gove rnment. He was one of the original co-authors orthe wide ly-used IITK too lkit. which now has about 100.000 registered use rs worldwide. and has continued to playa major ro le in its development. lie was a member orthe IEEE SPS Speech Technical Committee from 1999 to 2003. a member o r the r:ditorial Hoard oJ' Computer Speech and Language rro m 1994 unt il 2009. and is a current member of the Editoria l Board of Speech Communica ti on.

Pro!: Wood land has ex tens ive experience in many areas of speech technology. particularly in speech recognition systems. and has worked in the fie ld for the last 25 ycars. Of spec ial relevance to RATS is his work on linear transform-based adaptation approaches. in which he deve lored the w idely-used MLLR technique. as we ll as techniques for unsupervised adaptation in low recognition rate situations, via lattice-based adaptation. and discriminative adaptation based ll1ethous. He has de ve loped widely used discriminative training techniques and shown that such techniques arc useful fo r situations where test data is poorly matched to the traini ng. He has worked on de veloping speech transcription techno logy for many years. including work on the DARPA EA RS and GA LE programs. fo r both of th ese he was the III at Cambridge Univers ity. During thi s time h is tea m has deve loped systems in a number oflanguages inc luding Engli sh. Ma ndarin and Ara bic.

8.5 Bmo University of Technology

Dr. Lukas Rurget w ill serve as the primary person in charge of the RA TS work at BUT. Work ing closely w ith Dr. Burget wi ll be Dr. Pavel Matejka .

Dr. Lukas Burget has been an Ass istant Professo r at the Faculty of In fo rmation Techno logy (F IT). University a fTeehnology, Brno, Czech Republic. s ince 2004. He serves as Sc ientifi c l1ireetor o rthe Speech@FITresea rchgroup. From 2000 to 2002. he was a visiting resea rcher at OG L Portland. OR. under supervision of Professor Hynek Ilermansky. In 2008. he was in vited to lead th e "Robust Speaker Recognition over Vary ing Channels" team at the Johns Hopk ins Uni versity CLSP summer workshop. and will lead [he team of BOSARI S wo rkshop in 20 I O.

Dr. Burget's sc ientific interests arc in the field o f speech process ing. namely acoustic modeling tor speech, speaker and language recognition, including their software implementations. He has authored or co-authored more than 40 papers in journal s and confe ren ces . li e was leader of the top-performing teams in the NIST LI D 2005 and 20tH evaluations and the NIST S ID 2006 and 2008 evaluations, scoring firsi in a remarbble number of condi tions. li e contributed significant ly to the team de veloping AM I L VCSR systems slleeessful in th e NIST RT 2005. 2006 and 2007 evaluations.

Dr. Burget has participated in EU-sponsored projects "Multi moda l Meeting Manage r" (M4), "Augmented MultiParty InteractionH (AM I). and "A ugmented Mult iParty interaction with

34 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000105

P10071-BBN PATROL Personnel

Distant Acccss" (AMIDA). as wcll as in seveml projects sponsored at the local C:I.ech level. Currently. he is participating in the EU pro.iect "Mobile Riometry"·. He is principal investigator of US-A ir Force EOARD sponsored project " Improvi ng the capacity of langwige recognition systems to handle nlrc languages using radio broadcast data".

Dr. Pavel Matejka is senior researcher in the Speech((~ FrT research group. Department of Computer Graphics and Multimedia. Faculty of Info rmation Technology. Hrno Uni versi ty of Technology (BUT). lie was a vis it ing sc ienti st witb the Anthropic speech processing group at the Oregon Graduate Inst itute o f Sc ience <IIld Technology in 2002-2003. He has participated in the EU pr~jccts M4.AMI. and AM IDA mentioned above. and in language identification projects sponsored by US Air Force EOARD and the Czech Ministry of Defense. He lOok part in NIST 2005.2007 and 2009 language recognition and NIST 2006 and 2008 speaker recogni tion evaluations. whcre BUT had exce llent results. li e is currently leading BUr s speakcr identification activities in the Mobile lliometry EU project.

Or. Matejka is author or co-author of more tban 15 papers in journals and ,It rt:viewed international conferences. His resea rch interests include speaker recognit ion. language ident ifi cation. speeeh recognition (namely. phone recognition bascd on novel fcature ex traction us ing tempora l pilLterns and neural networks). He is acti ve in keyword spalling and on- line implementation o f speech processing algorithms. lie was finali st in the Student paper contest at ICASSr 2006 in Toulouse and participated in the JI·IU workshop group "Recovery from Model Inconsistency in Multilingual Speech I{ecognition" ill 2007.

8.6 Time Commitments

Table 8-1 shows the planned time commitments lo r the Key Personncl for ca lendar years 2010. 20 I I. and 2012. The numbers for 20 I 0 assume thatlhe elTort w ill start 15 August 20 I O. so 20 I 0 is only a partial year. It is ant icipated that a re lative ly slllall eftort will be devoted for Technica l Area I during the lirst six months of the project while data is being co llected by the Data Co llection team.

For the IlBN Key Personnel. we have showilihe timc cOlllmitments both in terms of hou rs and percentage time. At certain universities. personnel time is accounted for by percentage or time rather than by hours. whi le at others. time is spec ilicd in months.

35 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000106

P10071 -BBN PATROL Personnel

Key Individual Project Pending/Current 2010 2011 2012

John Makhoul RATS Proposed 90 hrs 528 hrs 616 hrs (15%) (30%) (35%)

GALE Current 350 hrs 403 hrs "', (60%) (23%)

Richard Schwartz RATS Proposed 100 hrs 792 hrs 880 hr.; (17%) (45%) (50%)

GALE CUffenl 410 hrs 440 hrs oJ, (70%) (25%)

Spyros Matsoukas RATS Proposed 153 hrs 1196 hrs 1472 hrs (25%) (65%) (80%)

GALE Current 429 hrs 460 hrs "', (70%) (25%)

Hynek Hermansky RATS Proposed 12% 25% 25%

IARPAIARL Current 12_5% 12.5% "', Speaker ID

Shihab Shamma RATS Proposed 0.25 month 1 month 1 month

IARPAIARL Current 1 month 2 months "', Speaker ID

ONR Pending 0.25 month 1 month 1 month

Mark Gales RATS Proposed 50 hrs 200 hrs 200 hrs

GALE Current 70 70 "', Phil Woodland RATS Proposed 20 hrs 50 h'" 50 hr.;

GALE Current 70 hrs 70 hrs "', lukas Burget RATS Proposed 10% 50% 40%

IARPA Current 41% 40% 40% Speaker ID

Pavel Matejka RATS Proposed 10% 55% 40%

IARPA Current 61 % 35% 40% Speaker lD

Table 8-1 K ... y personnel lind their ti me commitmenlS.

36 Use or disclosure of Ihe data contained on this paga is subjact to the restriction on /he litle page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000107

P10071-BBN PATROL Personnel

8.7 Selected Bibliographies Below is a se lected bibliography from each orthe PATROL s ites with references pertaining to the RATS program.

8.7.1 BBN R. IJrasad. S. Tsakalidis. I. Bulyko. C. Kao. and P. Nata rajan. "Pashta Speech Recognition wi th Limited Pronunciation Lexicon." IEEE Jnfc:rnaliol1a/ Conference on Acoustics. '\-'I'eech. and Signal Processing, Dallas. TX. Ma rch 20 I O.

S. Tsaka lidi s. R. Prasad. and P. Nalaraja n. "Co ntext-dependent Pronunc iation Modeling f'or Iraqi ASR," IEEE Inlernaliol1al Conference on ACOl(s/ics. ",Jeech, lind 5iignal Processing. Tai wan. April 2009.

M. Siu, H. Glsh and X. Yang. "Disc rimi nati ve ly Tra ined UMMs for Language C lass ification Using Boosting Methods" , IEFl:' Trans . 011 ;ludio. ,)jJeech and Language Processing, Vo l. 17, No. 1,2009.

L. Nguyen. T . Ng. K. Nguyen, Rabih Zbib, and John Mak hou l. "I.exical and phonetic mode ling for Arabic automatic speech recogn ilion: ' InlerS'peech. Brighton . UK. pp. 7 12-7 15. 2009.

S. Novot ney. R. Schwartz. and J. Ma. "Unsuperv ised Acoust ic and Language Model Tra ining with Small Amo unts of 1.a beled Dala:' I(,ASSP, Taipei. C hina . pp. 4297-4300. 2009.

J. Ma and R. Schwartz. "Unsupervised ve rsus Superv ised Tra ining of Acoustic Models." Inler.\·lxl..'ch. Hrisbanc, Australia. pp. 2378-238 I, 2008.

J. Au- Yeu ng and M. Siu , "Evaluation of Ihc robustness of the polynom ial segment models to noisy environments w ith unsupervised adaptation", ,)jJeech Communicaliol1s. Vo l. 50. 2008.

K. Subramanian, R. Prasad. and P. Natarajan. "Optimal Estimation of Rejection Thresholds for Top ic Spotting," IEEE Inlemaliolla.Co.1/erellce on ACOlfSlics. Speech. and Signal Processing. Iiono lulu. I II, April 2007.

D. R. 1-1. M ill er. M. Kleber. C. Kao. O. Kimball. T . Co lthurst. S. A. Lowe, R. Schwartz. and II. Gish. "Rapid and Accurate Spoken Term Detection", Inler.weech. pp. 314-371. Antwerp. Be lgiu m. 2007.

J. Ma, S. Matsoukas. O. Kimball, and R. Schwartz. "Unsupervised Trai ning on La rge Amounts of Broadcast News I)ata. " IC;lSSP. Toulouse. France. vo l.3. pp . 1056-1059,2006.

S. Sa leem, R. Prasad. and P. Natarajan, "Colloquial Iraqi ASR for Speec h-to-Speech 'l'ranslation." Inti. CO'1/ on Spoken Langlltlge Processing, Pittsburg. PA, September. 2006.

B. Zhang. S. Malso ukas. R. Schwartz: " Recenl progress on the discriminative region-dcpcndent lransfiJrm for speech feature ext racti on". lnterspeech. Pittsburgh, PA, September, 2006.

II. Jin. R. Schwartz. F. G. Wall s. and S. P. S ista, "Method and Apparatus for Score Norma lization for I nformation Retrieval App li cations". lJS Patent No. 6.65 I .057. issued on November 18, 2003. and US Patent No. 7.062.485. issued on June 13.2006 (continuation).

L. Zhai. M. Siu. X. Yang and 1 r. Gish. "Disc riminati vc ly traincd Language Models Using Support Vector Machin es fo r La nguage Identification:' ()(~vssey. June 2006.

B. Z ha ng, S. Matso ukas, and R. Schwartz, "Discriminatively Trained Region Dependent Feature Transforms for Speech Recogniti on," ICASSP, Toulouse, rrance. vo LI . pp.313-3 I 6,2006.

37 Use or disclosure of the data contained on this page Is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000108

P10071·BBN PATROL Personnel

B. Zhang. S. Matsoukas. and R. Schwartz. "Recent Progress on the Discriminative Region­dcpcndcnI Translorm for Speech Feature Extraction." IlI/eJ"speech. Pittsburgh. PI\. pp. 1495-1498.2006.

K.Y. Leung. M.W. Mak. M. Sill. and S.Y. Kung. "Speaker Verification Using Adapted Articu latory Feature-based Conditional Pronunciation Modeling," IEEE Inl. COI?ference on AcolIsli('s . .),leech wul SiJ.!l'Ia/ Processing. PI'. 18 1- 184. Ph ilade lphia. PA. 2005.

R. Xiang. L. Nguyen. S. Matsoukas. and R. Schwartz. "Cluster-dependent m:uuslic modding." ICASSP 2()U5. Ph iladclph i,l. Pennsy lvania. 2005.

K.K. Wong. M. Siu. "Automat ic language identification using di scre te hidden Markov mode l. " Inlernalioll(fl Conferen<.'l! on .~/wken Lungl/axe frocessillJ!. pp. 1633- 1636. Jcju Is land. Korea. 2004.

W. Andrcws and J. Hernandez-Cordero. "Using Speech Compress ion lor Speaker Detect ion". Speaker ()(~Ysscy Workshop. Toledo, Spain. 2004.

L. Nguyen and B. X iang. " Light supervision in acollst ic rnodel training:' ICASS/ 1• Montrea l.

Canada. 2004.

D. Reytw lds. W. A ndrews. J. Campbel l, J . Navrli !. B. Peskin. A. Adami. Q. Jill. D. Kluseek. J . Abrarnson, R. MihaesctL. J. Godfrey. D. Jones. and B. Xiang. "SuperSID Pr~ject: Exploiting High-Lcvc l ln fo rmation for Iligh Accuracy Speaker Recognition". ICAS5iP. llong Kong. China. 2003.

J. Navrlil. Q. Jin. W. Andrews and J. Campbell. " Phonetic Speakcr Recognition Using Maximum-Like li hood Binary- Dec is ion Tn.'e Models". ICA5;,S·P. Hong Kong. China, 2003.

L. Nguye n, X. Guo. R. Schwartz. and J . Makhou l, "Japanese broadcast news transcription system." ICSLP. Dcnvcr. Colorado. 2002.

M. Siu. A. Chan. "Robust Specch Recognition Against Short-T imc noise:' Inlel'lwlional CfJl1fel'ence on .S/JOken l,al1~lI{Jge Process in,;. Vol. 2. pp. 1049- 1052. 2002,

R. Prasad. L. Nguyen, R. Schwartz. and J. Makhoul. "Automatic Transcription of Courtroom Speech." In! . ConI Spoke/1 Language Processing. Denver. CO. pp. 1745- 1748. 2002.

R. Prasad. L. Nguyen. R. Schwartz, and J. M<lkhoul, "MeelingLoggcr: Rich Transc ription of Courtroom Speech:' Inl. COIlr HlIman L(III~I/(IXI! Tec:llllology. San Diego. CA. Morgan Kaufmann Publishers. pr. 303-306. March 24-27. 2002.

S. Sista, R. Schwartz. T. Leek , and J. Makhoul. "An tdgorilhm fo r Unsuperviscd Topic Discovery from Broadcast Ncws Stories," Inl. ('O/?/ Human 1.(lI1gl/a~e Techn%K}'. San Diego. CA. Morgan Kaufmann Publi shers. pp. 110- 11 4.2002.

\V. Andrews. M. Koh ler. and J. Campbell. "Phonetic Spcakcr Recognition." El/roS/JCech. Aalborg. Denmark. 200 I.

H. Jin, R. Schwartz. S. S isla. r. Walls."Topic Tracking for Radio. TV Broadcast. and Ncwswirc ," Elll'Ospeecll, Budapest. Hungary. pp. 2439-2442. 1999.

II. J in. S. Matsoukas. R. Schwartz. and F Kuba la. " Fast Robust Inverse Transform SAT and Multi-stage Adaptation". DARPA Broadcast Neil'S li'an8c/'//)/hm and Understanding Workshop. Leesburg VA. 1998.

38 Use or disclosure of the dato contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000109

P10071 ·BBN PATROL Personnel

H. Jin. F. Kubala. and R. Schwartz. "Automatic speaker clustering." DAHI'A ,)/)Cecll Reco){nilioll

Worhl1Op. Chantilly VA. 1997.

S. M .. tsoukas. R. Schwartz. II. .l in. and L. NguYl.'n. " ]Jraclica ll mplcmentations orSpeaker Ada ptive Training", DARPA 5ipeech Heco}?niliol1 Workshop. Chantilly V 1\. 1997.

T. I\nastasakos. J. McDonough. and J . Makhoul. "Speaker Adaptive Training: A Maximum Likelihood Approach to SpeakcrNormalization"'.ICA,)'SP. Washington DC. 1997 .

S. M.llsoukas. R. Schwartz. II. .lin. and L. Nguyen. "rraclicallmplcmenta lions ofSpcaker. Adaptive Training'", IF.F.E Rec(Jj!l1ilio/l Workshop. Chanti Hy. Virginia. pp 133- IJR, 1997.

A. I\nastasakos, J. Mcdonough. R. Schwartz:. and J. Makhou l. "'/\ Compact Model for Speaker­Adaptive Training," ICSLP. 1996

R. Schwartz. 1\. Dcrr. 1\. Wilgus. and J. Makhoul. "Speaker Verilieation IR&D Final Report. " BAN Technical Re port 0.6661. Nov. 1987.

H. Gish. M. Krasner. W. Ru sse ll , and J. WolL "Methods and experiments for text-independent speaker recognition over telephone channels." IEEE Inl . ConI Aco/lsr" !'J'fJeech. Signal Processing. pp. 865-868. 1986.

V. Viswanathan. C .M. Il cnry. R. Schwartz. and S. Roucos. "Evaluation o f Multi sensor Speech Input for Speech Recogni tion in High Ambient Noise:' IEEE Inrernalional Conji.'rcmce on Al:OIIslics. Speech. and Sixnal PrvcessillK. Tokyo. Japan. pp. 85-88. 1986.

S. Roucos. V. Viswanathan. C. Henry. and R. Schwanz. "Word Recogniti on Using Multi sensor Speech Input in I li gh Ambient No ise." IEf.F Inlernaliol1111 Conference on Acollsl i(". ... Speech. alltl Signal PrvcessinX. Tokyo, Japan, pp. 737-740. 1986.

II. Uish. K. Kamofsky , M. Krasner. S. Roucos. R. Schwartz. and 1. WolL "Investi gation of Text­Independent Speaker Idenlilica tion Ovcr Telephone Channels." IEEI:.· llIIernalional Conference 01/ Ac()Wlics. Speech. and Signal Proces.\'ing. Tampa, FL, PI'. 379-382, 1985.

M. Kntsncr. J . Wa ll'. K. Kamorsky. R. Schwartz. S. ROllcos. and II. Gi sh. "Investigation of tex t­independent speaker identificf:l tion techniqucs under conditions o f variab le data." IF.EE Inl. COl!/.' Acollsl .. .\'pee(-h, 5;ij!na/ Pl'Oce.\·sing. pp. ISB.S. I-I SI3 .5A. 1984.

J. Walt: M. Krasner. K. Kamorsky. R. Schwartz. and S. Roucos. "Further Invest igations o f Probabi listie Methods for Text- Independent Speaker Identification'" IEEF. Internalional COI?/erence all Acollstics. Speech. and Sixnal Pl'oc:essinK. Boston. MA. Pl'. 55 1-554. 1983 .

R. Schwartz, S. Roucos. and M. Acrouti. "The Applicat ion of Probabi lity Oensity Estimation to Text- Independe nt Speaker Identification." ICASSf'. Pari s. France. pp. 1649- 1652. May 1982.

M. Berout;' R. Schwartz. and J. Makhoul. "Enhancement of Speech Corrupted by Acoustic Noise." ICASSP, Washington. DC. pp. 208-2 11. 1979. Also in J .Lim (Cd.). Speech Enlul/1cellll!lIl , New Jersey; Prentice- I ln ll. 1983.

8.7.2 Johns Hopkins University

M. Alhineos. H. I lcrmansky. and D. P. W. Elli s, "I. P-TRAP: Linear Pred ictive Tempora l Patterns." Proc. ICSI.f. Jeju, S. Korea. pp. 11 54-1157. 2004.

S. Ga napath y,S. Thomas and H. Hermansky. "Temporal Envelope Subtmction for Robusl Speech Recognition Us ing Modulat ion Spectrum". Proc. ASRU. Mcrano, lta ly.2009.

39 Use or disclosure of the data contained on this page is subject (0 the restn'ction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000110

P10071-BBN PATROL Personnel

S. Ganapathy. S. Thomas and I-I. Il ermansky. "Robust Spcctro·temporal Features based on Autoregressive Models of Hilbert Envelopes", Proc. ICAS.\'/' 1010, Dallas. TX. 20 10.

S. Thomas. S. Gunapathy and H. Il crmansky. "Tandem Representations o fSpcctral Envelope and Modulation Frequency Features for ASR", Pm,", IlIferSp(!f:c:h. Brighton. UK. 2009.

S. Thomas. S. Ganapalhy and H. Il crmansky. "Recognition or Reverberant Speech Using Frequency Domain Linear l)rcd iction", I££E Si~nal Proce.~.\·il1g I.eller .... 2008.

II. Hennansky. "Specu lations on Knowledge Versus Data in ASR:' 1'J'()e;. ESCA -NA'/'O TlJ/orilll and Research Workshop on Rohus/ Speech Reco~nili(}n.li)r Vnknoll'n Communication Channels. Pont-a-Mousson. France. 1997.

F. Valente and H. Hcrmansky, "Discriminant Linea r Processing of Tilllc-Frcquency Planc." Proi'. International (·()I?/i.m:l1ce ol15i'<lken Language !'n1('('ssing, PittSburgh, PA, 2006.

H. Hennansky and 0.1>. W. Ell is and S. Sharma. "Collnectionist Fcatunc Extract ion for Conventiona l HMM Systems", Proc. ICAS.\'P '()(), 2000.

H. Ilcnnam;ky and N. Morgan. ""RASTA Processing of Speech". IEEI:.' Tral1SaCliof1s on Speech lind Audio I'rocessing. Vo l. 2. NO.4. pp. 587-589. 1994.

H. Hcrmansky, "Robust Spcech Recognition:' in Co le, llirshrnan ct al .. "The Chall enge of Spoken Language Systcms, Research Directions fo r the Nineties:' IEEE 7i-al1slIclions on Sp(!(!('h and Audio l'roCL'ssing. Vol. 3. No. I. pp. 1-2 1. 1995.

H. Hcrmansky and S. Sharma. "TRAPS - Class ifie rs ofTcmpora l Patte rns." !'roc. IlIlernalional COI?terence on Spoken I.ongl/oge Processing. Sydney. Auslra li<l. 1998.

H. Hennansky and S. Sharma, "Temporal Patterns (TRAPS) in ASR. o/"Noisy Speech:' Pmc. IE£E Inferno/iollal Cmrferem.:e on Acu/{s/ics, SpeedJ and Sigl1al Processil1K, Phoen ix. AZ 1999.

II. Hermansky and P. rousek, "M ulti-resolution RASTA filte ring for TAN L>EM-based ASR:' Pmc. l~lIropean C0I1ference 011 Speech C(Jfllll1lmimlion lIl1d Technology. Lisbon. Ponugal, 2005.

N. Mesgarani. G.S.V.S. Sivaram. S. Nen1<lla. II. Hermansky. "I)isc rimin:.mt Spectrotempoml Features lor Phoneme Recognition".l'roc. /111eJ"sl'ee(;h 2()()I), I1righton. UK. 2009.

G.S.V.S . Sivaram. S. Nemnla . M.Elhilnli.l. Tran. and H. Hcrmansky. "Sparse Coding lo r Speech Recognit ion".ICASSP 2()/O. Dallas. TX. 20 10.

J. Pinto. Y. l1ayya. H. Ilcrmansky. M. Magimai-L>oss. "Exploiting Contextual Inlo nnation For Improved Phoneme Recognition". Proc. I1:.EE Imerna/iolla! Conjerence on Acoll.wics. Speech. and Signal Processing (lC/ ISSP). 2008.

F. Valente and H. Ile rrnansky. " Hierarchical and Parallel Process ing of ModulaLi on Spectrum for ASR applications". Proc. IEt::I:.·/n/erna/iol1(J/ ConJerel1ce on Acoustic .... Speech. lind Signal Processing (leAS.')'p) . 2008 .

H. Fletcher and R.lI. Galt. "The perception of speech and ils relmion to telephony", The .follmal of the Aco/ls/iclll Socidy (?/"America, 1950.

A. Boothroyd and S. Ni uroucr. "Mathematical treatment or conlext e ne-cIs in phoneme and word rccognition", Jotlrl/ol (~fIhe Am/lsIical Society (!f America, Vol 84( I). pp. 101-114. July 2008.

40 Use or disclosure of the data contamed on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000111

P10071-BBN PATROL Personnel

H. McGurk and J. MacDonald. "Hearing lips and seeing voices", Nature 26-1, pp. 746 - 748. December 1976.

F. Va lente and H. Hermansky. "Comhillat ion of Acoustic Classi fiers baseu on Dempster-Shafer Theory of Ev idence:' Prof.'. IHEE Inlemllfivnal COf!ferelllX on Acoustics, ,\iJ<'(!ch and .\'iXl1o/ Pl'm'(!ssing. Honolulu. III. 2007.

8.7.3 University of Maryland S. Shamma, "Encoding sound timbre in the auditory syslcm"/ETEJ. Res. 49(2}. 193-205.2003.

L. At las and S. Shamma. "Joint acollstic and modulation frequency" , EURA$'1P.J. AI'I'!. Sig. ['roc. 7. Jeju. S. Korea. pr. 661:1-675. 2003.

P. Julian. A. And reOli. L. Riddle. S. Shamma. D. Goldberg. and O. Cauwcnbergh s. ,," Comparative Study or Sound I ,ocalizalion Algorithms fo r Energy Aware Sensor Network Nodes:' IEEE Trans. Cir. ,):11.\'" 51 (4).2004

A. Schai . S Shamma"A NCLIromorphic Sound Localizer for a Smart MEMS System:' Analo)! Integrated Circuits (IIulS'jgnal Processillg. 39. 267~273. 2004

N. Massgarani. M. Slaney. and S. Shamma. "Colltent-l3ased Audio Classifica tion Rased On Muhiscale Spectro-Temporal Features", IEEt: T/'(I/1s. Alldio (lnd .'jJeech. 2005.

N. Massgarani. M. Slaney. and S. Shamma. "Content-Based Audio Classification Based On Multiscale Spectro-Temporal features". IEEF Trans. Audio and SjX!ech. 2005.

D. Zotkin. S. Sharnma. P. Ru. and R. Duraiswami. L Davis "Neuromimetic sound representation for percept detection and manipulation." EURASIP J Applied Signlll Processing. 2005.

D. ZOlkin. S. Shamma. P. Ru. and R. Duraiswailli. I.. Davis "Ncuroillimel ic sound representation for percept detection and manirulation." ElJRASIP J Applied Si),!nol Processing. 2005.

J. Fritz. S. Shamma. M. Elhiliali, " Active Li sten ing: Task-dependent plasticity of recepti ve fields in primary auditory cortex", Hearing Hits. 206. 159-176. 2005.

T. Chi. P. Ru. and S. Shamma. "Mult ireso lution spectrotemporal analys is of complex sounds", .I. ACOlIsf. SOl.:. Am, 1 18. pp.887. 2006.

T. Chi and S. Shamma. "Spectrum restoration from multi sca le auditory phase singu larit ies by generalized projections"./t·EE TrailS. ;!udioandSpeech. 14(4). 11 8.8872006.

J. Fritz. M. Elh ilal i. S. David. S Shamma "Auditory Attention - foclIsing the search light on sound". Curl'. Opin. Nellrobiol, 17(4):437-55.2007

M. Elhilali ,md S. Sh<.lll1lna "The cocktail party problem-with a cortica l twi st"../. Acolfs/. Soc:. Alii .. 124(6).2008

S. Shamma. "On the emergence and awa reness of auditory objects:' PI,oS biology. 6(6) 2008

S. Shamma. "Characterizing Auditory Rccepti ve Fields", Nellron. 58(6), /1.829-831. 2008

M. Elhi lali. J Xiang. S Sharnnm. and J Simon' " Interaclion between attention and bottom-up sa liency mediates the representation of foreground and b<lckground in an auditory scene:' PLoS 7(6) 2009.

41 Use or disclosure of the dala contained on this page is Subject 10 Ihe restriction on the ti/le page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000112

P10071-BBN PATROL Personnel

N. Mesgarani. J Frit z, and S Shamma. "A Comrutational Mode l of Rapid Task-Related Plast icit), of Auditory Cortical Receptive Fields:' J COIIIV Neuroscience, 2009.

8.7.4 Cambridge University F. Clego and M..I.F . Ga les, "Discriminative Adaptive Training with VTS and JUD," Pro". IEEt: Automatic ,">/Jeedl Remgnitiol1 (lnd Undt!rstandil1g Worh·hop. December 200,).

II. Xu. M..I.r. Gales, and K.K. Chin, "'ltnrroving Joint Uncertaint), Decod ing Perfo rmance by Predicti ve Methods for Noise Robust Speech Recognition:' Proc. IEf:I:.' Automafic Speech Recognitiol/ and Understanding Workshop, December 2009.

M.J.F. Ga les. "Maximum likelihood Linear Transformatio ns for II MM-bascd Speeeh Recognition."' COlI/jJlIler !))x'ech and Language. Vo lume 12. 1998.

O.K. Kim and M.J .r. Ga les. "Noisy Constrained Maximum Likelihood Linear Regression for Noisc-Robust Speech Recogniti on:' submitted to lEt'/:': Transactions on IIl1dio Spee('h 011(1 Language Processing.

M. Tomalin. J . Park, F. Diehl. M..I .r. Ga les, and P.C. Woodland, " Recent Improvements to the Cambridge Arabic SpeeclHo-Texl Systems:' lell,S'SI> 2010. Dallas. TX. 2010.

C. Longworth and M.J.F. Gales. "Combin ing Derivative and Parametric Kernels for Speaker Veriti catio n," 1l'.'F:E Transactions 011 Alldio Speech lind LanKlIliKe Pro"cssing, May 2009.

M.J .F. Gales and F. rlego. "l>iscriminativc C lassifiers wit h AU,lptivc Kernels for Noise Robust Speech Recognition." to appear CvmfJIIICr Speech and Lan};l/llge. 20 I O.

c.J. Leggelter and P.e. Woodland, "Maximum likelihood linear regress ion fo r speaker adaptation of continuous density Il iuden Markov Illodels." Computer speech and {m/KlIlIge. 1995.

L.F. Uebe l and PC Woodland. "Speaker adaptation using lattice-based MLLR," ITRW 011

Adaptation MethodsJin- S"ec('" Recognilion, 200 I.

L. Wang and P.c. Woodland. "MPc-based discriminative linear transfurms for speaker adaptati on." Compuler SjJeedl & L1II1~II(lKe, Vol. 22. 0.3. 200R.

K. Yu, M. Gales and P.c. WooJland , "Unsupervi sed ada rta1i on with di sc riminative mapping transforms:' IEEE Trans. ASI.r, Vol. 17. No. 4. 2009.

D. Pavey and P.C Woodland. "M inimum phone error and I-smooth ing fo r improved discril11inative training," Pmc. ICASSf'. Orlando. FL. 2002.

R. Cordoba, P.c. Woodland. and M. Gales. "Improved cross-task recognition using MM IE tra ining:' Proc. I('ASr.;;P. Orlando. Fl. . 2002.

M. Tomalin. J. Park, F. Diehl. M.J.F Gales and P.c. Wood land "Recent imrrovements to the Cambridge Arabic speech-to-text systems" Proc. ICA,">:5P'20/0, Dallas. TX, 2010.

42 Use or disclosure of the data contained on this page is subject to the restriction Oil the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000113

P10071-BBN PATROL Personnel

8.7.5 Smo University of Technology N. Brummer. L. Burget, J . CernockY. O. Glembck. r. Gn.!zL M. Karari.1L D. van Leeuwen. P. Matejka. P. Schwarz. and A. S[rasheim. "fusion ofhctcrogeneolls spe,lker recognition systems in the STI3U submiss ion for the NIST speaker recognition eva luation 2006:' IEEE Transaclions on Audio, Spee('''' and Lallg1u1xe Processing. Vol. 15. NO.7 . pp. 2072-2084. 2007.

N. Brummer. A. Strasheim. V. Hubeika. P. Matejka. L.13urgct. and O. Glcmbek. "Disc riminative Acoustic Language Recognition via Channel-Compensated GMM Statistics:' Pruc. II1'er.~~Jeec:h, I3righ lOll . UK. pp. 2187-2190. 2009.

L.l3 l1rget. P. Matej ka. and J. CcrnockY. " Discriminati ve Training Techniques for Acoustic Language Idcntification··. Proc. leA,""SP. Toulollse. FR. pp. 209·2 12. 2006.

L. Burgct. P. Matej ka. P. Schwarz. O. G lel1lbek. and J. ("crnocky, "Analysis of feature extraction and channel compensation in G MM s r eakcr recognition system". IEEt: "f'ran.Wlcfions on Audio. Speech. and Langu<t}te Pmce.\".\·ill}t. Vol. IS. No.7. rr. 1979· 1986 2007.

I.. Gllrget. N. BrUmmer. D. Reynolds. P. Kenny. J. Pclecanos. R. Vogl. F. Castaldo. J. Dchak. R. Dehak. O. Glembck, Z Karall1 , J . Noecker Jr .. Y. Na Hye, C. Costin C ipriano V. ll llbeika. S. Kaj arekar, N . Scheffer. and J . Ce rnockY. "Robust Speaker Recognition Ove r Vary ing Channels" . .II/V. Galtimore. MD. p. 8 1. 2008.

L. Gurget. P. Matejka. V. lIubcika. and J. Ce rnockY. " Investigation into variant s o fJo im Factor Analysis for speaker recognition". PI'OC. Inlerspeech. nrighton . UK, pp. 1263- 1266. 2009.

l.. Burget. M. Fapso. V. Hubeika. O. Glembe. M. Karali.h, M. Kockmann. r. Mat~ika. P. Schwarz. and J. Ce rnockY. "(3UT system for N IST 2008 speaker recognition eva luation". Proc. Il1lerspeech. Brighton, UK. pro 2335·2338. 2009.

O. G lell1bek. L. Burget. N. Dehak. N , BrUmmer and P. Kenny: "Comparison of scoring methods used in speaker recogniti on with joint factor analysis". accepted to ICASSP. 2009 .

Z. Jane ik_ O. I'lchot, N. Brlimmer, L. Burget. O. Glembck. V. Hubeika. M. Karafiat, P. Matcjka , T. Miko lov. A. Strasheirn and J. Cemocky, "Data select ion and calibration isslies in automatic language recognition". submitted to O(~V"·.'iey 2010. Ilrno, 2010.

M, Kockmann . L. Burget "Contour modeling of prosodic and acollstic features for speaker recogn it ion-·. Proc. IEE£ Workshop on ,""jJoken Langl/{I1-!C' Technology. p. 4. 2008.

P. Matej ka. p, Schwarz. L. l::JurgeL and J. CernockY. "Use o f'anti -model s to further improve state·of-the·art PRLM language recognition system". Pro£:. I(,ASSP. Toulo use. FR. pp. 197· 200. 2006.

r. Matejka. L. Burget.. P. SchwarL. O. Glembek. M. Karafhil. F. Circzl. J. CcrnockY. D. van Leeuwen. N. BrOmmer, A. Strashcim. "STBU system for the N IST 2006 speaker recognition eva luation". Proc. IEEE Inlemafiol1a/ Con./erenC(! 0/1 Acoustics. 5i'peech and Signa! Pl'Ocessil1~.

Hono lulu. HI. pp . 22 1-224. 2007.

P. Matejka. L. Burget. O. Glembek. P. SchwarL. V. lI ubeika. M. Fapso. 1'. Miko lov. and O. Plehot. "BUT system description for NIST I.RE 2001". NIST '-HE 2007 Workshop. Orlando. FL 2007.

43 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000114

Pl0071-BBN PATROL Personnel

T. Miko lov. O. Plchot. O. G lelllbck. P. Matl:jka.l. Burget. and J. (·crnoc kY. ·'PCA·based Feature Ex traction for Phonotact ic Language Recognition", submitted to (}l~l's.\'ey 20tO, Bmo, 20 10.

r. O ldrich. V. Hu bcika. L. RurgcL P. Schwarz. P. M'ltejka. and J. l:crnockY. "Acq ui s ition or Te lephone Data from Radio Broadcasts wit h Appl ications to Language Recogniti on: Technica l Report" ,lec:lmimll'eporl. BUT. Ja nuary 2009.

P. Schwarz. " Phoneme recognition based on long tempora l context", FhD Thesis, 81'110. I1 UT. 2009.

44 Use or disclosure of the data contained on this page is subject to the restriction on the fitle page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000115

P10071·BBN PATROL

9 Project Management and Interaction Plan

9 1 Project Management

Raytheon BBN Technologies Robert Elmer, President

Edward Campbell, hecutive VP

I John Makhoul

B8N VP & Chief St ientist PATROL Principal Investigator

Richard Schwart~ Tethnicallead

! -r !

Project Management

Spyros MatsoukiU Co-PI & Technital Mano1ger

! Cambrldse University

University of Milryland Johns Hopkins Unillenity ~ ...• Srna Unillef1lty of Technology

M ark Gales ... Lukas Burget Phil Woodland Shihab Shamma Hynek Hermansky

Palle' Matejka

Figure 9·1 Urgani7-3tional chart ror Ihe PATROL Team.

Figure 9- 1 shows the management organiz.ation fo r the project. Dr. John Makhoul will serve as Principal Invest igator and the ()vcrall Program Manager ror thi s effort: he will coordinate and manage the acti vities o rthe BBN-Ied PATROL Tea m ill the RATS Program. Dr. Makhou l is an experienced project leader and program manager. He has served as Plan a large number of DARPA and other government projects. Most recently. he managed and coordinated the BBN­led team under the DARPA EA RS program and is current ly the Plan the BBN-Ied team under the DARPA GALl:: program. [n both programs. the BBN-Ied teams have been co nsistently top perfonners.

Under the BBN GA LE effort. Dr. Makhoul oversees the activ it ies of a dozen subcontractors. distributed geographica lly across multiple locat ions. from Ca lifo rnia 10 Europe. The work under the GALE prograrn is quite diverse; it inc ludes work on spcech recognition. machine trans lation. distillati on. the linguistic annotation of text. and the integmtion of techno logies into ope rationa l systems that take s peech or text from multiple languages and produce an English translation. The resulting technologies have been trall siti oned into BliN's Broadcast Monitoring and Web Monitoring products which have been deployed in at least 18 Gove rnment locations for 2417 operation. The GALE program is expected to end in /\pril20 II. but our experience in running such a large and success fld project will serve us we ll in managing the 1313 -led PATROL Team in the RATS program.

Dr. Makhoul will be assisted at I3BN by Lwo senior n:searchcrs. both o l' whom have had prominent and important roles in the EARS and GALES programs. Spyros MaLSoukas. who has demonstrated in the GA LES program first-rate skills in performing technical management across many s ites. will serve as co-PI and Technical Manager. He will manage the technical activities at BBN and across the other four s ites on the PATROL team. Rich Schwartz. who has been a foun tain of innovat ion in various aspects o f speech processi ng and pattern recognition for ovcr three decades. will serve as Technical Lead for the eflort.

45 Use or disclosure of the data contained on this page is subjeclto the restriction on tile title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000116

P10071-BBN PATROL Project Management

Hav ing been on the BBN-Icd team on GALE for the last four and a ha lf years. Pro fessors Mark Gall's and rhil Wood land o f Cambridge Uni ve rsity ha ve worked close ly with Makhou l. Matsoukas. and Schwart:t .. and together have developed one of the most ad va nced multi- lingua l speech recogniti on systems in the wor ld . That strong working re lationship will contillue durin g the RA TS project.

Professor Hynek I-Icrmansky or Johns Hopkins University (JHU). who has an ongoingjo inl appo intment with the S mo lJ ni versity of Technology , has a very strung relationship with Drs. Lukas Burget and Pavel Matejka there. both hav ing perfo rmed research with him in the past and have attended summer workshops at JII U. Current ly. [lurgel and Matejka serve as consultant s On one of Hermansky's pn~jects in the US.

Professo r Shihab S ham rna of the Uni ve rsity of Maryland (UM D) currentl y has a jo int pro.iect on speaker identifi cation w ith Pro fessor Hermansky. sponso red hy IAR PA under the BEST progra m. Furthermore. both of them work on auditory modeling o f sound and wi ll be able to join up to provide the PATROL Leam unparalleled expertise in that .. rea. which we expect to be very important in iso lating the speech part of tile noisy speech signal. Sham ma and Hermansky w ill be providing auditory-based feature extraction so ftw are to the other team members for usc in building systems for the four RATS applications.

BBN and UM I) have also hwJ a very strong work ing rel,ltionship under the GALE program. Rieh Sdl\vartz from HR N. who is physica lly located in Virg in ia. has supervised the work o f several students from UMD, Schwartz Ims been meeting with the stllJcn ts m the 13BN otlice in Columbia. MD. on a week ly basis. Under RATS. Schwa rtz will continue his rel ati onship w ith UM D and hi s in teract ion with the students there.

The above described worki ng relationships that alreally ex ist among thc PATROl. team members will be a very good starting po int lor establi shing a multilateral. open. and cooperative work ing re lationship among a il leam members.

9.2 Team Size and Composition From ou r experience in rhe EA RS and GALE programs. we have found that. in programs with challenging goa ls. it is important to have the most qualilied people atll1u hipl e sites work together. bounce 01'1' ideas aga inst each other. and co llaborate in implementing and integrating ideas and solutions that lead to the best performance and accuracy. By any measure. the goa ls o f rhe RA TS program are very challengi ng and. therefore. req uire diverse ideas from a team that has the requis ite expertise anll proven record o f innovat ion and superior performance. We bel ieve we have assembled sllch a team. Going with the old adage o l' .. two heads are better than one", we have endeavored to ensure that , in each primary technica l area. we ha ve at least two sites that focus on each area of research.

Tab le 9-1 li sts the fi ve sites on the PATROL team and the areas of primary concentration in the pro posed effort . Illlhe area o f robust feature extraction and speech enhancement. Or. Hermansky at JH U and Dr. Shamma at UMD ha ve each deve loped audio processing methods that tend to separate the speech from the noisc. They will make their methods and so ftware avai lable to the PA TROL team in ordcr to see which features work best with large leve ls of noise.

Even with the lise of robust featu res. there will st ill be the need fo r noise mode ling. BBN and Cambridge Unive rsity are both very experienced in thi s area and \vi ll continue to innovate

46 Use or disclosure of the data contained on this page is subject to the restrictIon on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000117

Pl0071·BBN PATROL Project Management

so lutions to this thorny problem. Brno Uni versity of Technology (RUT) wi ll usc their joint raclar analysis OF") work to render the I[) tasks more immune to noi se.

Robust Feature Noise Speech language Speaker Keyword Software Extractionl Modeling Act ivity 10 10 Spotting Integration Enhancement Detection

BBN X X X X X X

Johns HOPkins U X X

U Marvland X X

Cambridge U X X

Brno UTech X X X . . Tahle 9-1 Or~amzallOn.!; oflhc PA I ROL telll11 and their primary arCHS of con cent ratIo II III RA rs . Or. Sham rna at UMD has obtained very good results in perronning speech acti vity detection (SAD) in va rious types of noise by us ing his corliC<l I represen tat ion or sound. He wi ll continue to work on this pro blem while BH N will use ad vanced statistical methods to perform SAD.

HRN and BUT will work on language I D and speaker I D. both of which are the primary areas or stren gth at BUT. In addition. Ur. lIermansky wi ll wo rk on sp~aker 10. especially that he cu rrent ly is workin g on that problem on an IAR llA elTort .

HBN is a n:cognized leader in the area o f keyword spotting and has deli vered systems to the Govefllmt rH for keyword spotting in various languages. Cambridge Uni vers ity - well known for their expertise in speech recognition - will work with i3I3N on kcyword spotting.

Sites would be expected to ddi ver to BBN working so n ware modules (a task that is not shown explicitly in the abo ve table). BHN will need to integrate aillhis software into a working system that can be delivered to the Government. Naturally . BBN will be the on ly s ite performing integrat ion and delivery.

Even though each s ite will concentrate on the tasks listed in Tuhle 9- r. the sites. th rough jo int meetings. will also be available to consult on the o ther tasks. as their ex pcrti se permits.

As one can sec from the above. BBN has bu ilt a team with some built-in redundancy. but with maximum eOeetivcness. We strongly believe Iha t this team. hoth in size and compos ition. is both necessary and sufTieient to meetlhc program objectives.

9.3 Working/Meeting Models BBN has establi shed learning relationships with each o f the PATROL team members. The: Statement o f Work for each team member contains specific technical requirements by task. project management reporting responsibilities. anticipated support for meetings and presentations. and the delivery of so ftware. The teaming strategy draws upon unique capabilities of each participant to form a comprehensive execution plan for the RATS efTon.

i3BN plans to have a face-to-face kick off meeting early in the project. The kick-ofT mceting w ill concentrate on working ou t an overall plan 10 be accompli shed during Phase I u rlhe project. and a timeline for exec uting the plan. After the kick off meeting. most of the day-to-day coordination ofae ti vit ies will take place through regu lar weekl y h:leconfcrences for each urt he major tasks. The technical revi~\Vs for e:ach task will include: e:oo rdination of activities across sites. planning fo r the research to be performed. the assignment of specific tasks. and the review afprogrcss against the work plan. In addition to Ihe frequent tel econferences. periodic face-to-face meetings will be scheduled at va rious locations. taking advantage of major conferences and OARPA

47 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000118

P10071·BBN PATROL Project Management

meetings. Such team meetings will strengthen the lies among team members and allow for in ­depth technical di scussions and planning. In additi on to the DARPA requested meetings and workshops specified upon award oflhe contract. meetings and tclecollfe~nccs wi th the DARPA RATS Program Manager wi ll be scheduled as needed.

9.4 Software/Code Management

9.4.1 Software Integration/Management BBN is highly experienced in the integration o flcchno logy from other sites into functioning systems. with arrnopriatc GU ls. whi ch have been delivered to lhe Government. ei ther for internal usc or for deployment in the fie ld for 2417 operations. Our experti se incl udes devel oping COTS Human Langu<lge Tcclllwlogy (H L T) software systems. s uch as the Broadcast Mon itori ng System (I3MS). which integrates software from two other s ites : it arenites with minimal human oversight at over 18 Gove rnment locations. In additi on to commercial products. BI3N deli ve rs state·of·the·an HLT so ftware suitcs direc tly to Government customers ror their usc in research and operations . One such system is Ryhl os. BAN's state·of .. tbe·ar1. Iminable spcech·to·text (STT) systcill. Byblos is used hy Government stalTto run STT and to create new SlT model s using Byblos 100 is to lrdin on noisy. rea l- wurld telephone data in a variety of languages. BBN al so delivers keyword search (K WS ) systems to the- Governmen t. inctuding word and rhone· based systems that operate at vcry high speeds. The K WS so liware suites are des igned for lise in larger Government systems. so they are I:a llable via an API jointly designed by the Government and BI3N. HB N has ex tens ive experience supporting the Government customer to heir them make STT. K WS. and other lILT work in a w ide va riety of languages and doma ins. We ha ve a lso delivered a (JU I·dri ven lrainah/e OCR system to the Government (the OCR system is bused on th e I1I3N Byblos STT system).

In the RATS project the intcgration of the PATROL soft warc system wi lt be perfo rmed atl3l3 N. Well be fo re integratio n. BBN. in constlltution with the team members. w ill create AP ls that c learly define interfaces of the modules each site is developing. Early integration of prototype modules in I)hase I will take place to test API soundness and ensure successful final integration for delive ry to the Eva luation Team. Our teammates will contribute software in thl! form of sou rce code. enabling BBN to create a single tightl y integrated software suite. Ha ving the source code in a centrali zed locati on wi ll al so a llow us to rap idly debug and !ix any iss ues discovered during integration. Whenever s ites submit ne w vers ions of software modules. they will also subm it test programs that exerci se A PI runcti onal ity and the results of test runs demonstrating the module· s confo rmance to the defined AP I.

All source code will he checked into a central repository at BBN dedi cated to the RATS project. This repository wi ll be used for integration of a ll subcontractor components into the comple te system . In additio n. this repository w ill support Ihe usual revis ion contro l functi ons necessa ry for RnN staff to eflieiciltly conduct research exploring alternati ve software algorithms. Spec ifically. BBN researchers wi ll check in to the reposi tory a ll changes to the research code base. typ ica lly cheCking into a private branch of tbe repository until a change has been shown to be bencfic i<tl cxperimentally. al which time the ch"lIlge is merged into the reros itory ·s "main trunk." The RA 'I·S project will adopt a number 01" so ftware practices that I3BN cu rrently Llses to maintain a cons istent so ft ware repository. This includes c reation ora symbolica lly tagged. date· stamped ·· Iocal release" every time the trllnk o rthe re.pository is updated. Software will also be tagged , built. and released whe never new software contrihulions are rece ived from subcontr<lc tors. The symbo lic tags ensure our ability to analy%:e and identiry the so urce orany

48 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal,

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000119

P10071-BBN PATROL Project Management

unexpected changes in system behavior. Other software pract ices we wi ll mJopl include the use o f automated nigh tl y builds and regression tests. with cmails automatica lly sent when any error or warning occurs in the process, along wi th information about the specific module and orig inating site. We wi llll1CI1 work with that sill: to fi x any problems.

HRN"s soft wa re system will a lso ad here [ 0 the AP ls and input and output formalling conventions defined by the Eva luation Team for periodic eva luati on del iveries and by military end users for the trainable system deli ve ries. BBN will Ies! a ll so liwan.:: component s extensively prior to delivery to ensure rooustness and confo rnHlIlcc to exte rnal All is.

I3BN will support the Eva lumion Team by providing access to all developed system and component son ware: addressing solhva re instrumentat ion needs: identify ing failure causes: meeting required del ivery dates to support eva luation eve nts: and respond ing to any other test­re l<l ted issues. BRN's extens ive ex perience supporting Uyb los and other Human Language Technology systems at remOle customer s ites gives us the confidence to ensure that the Eva lua tion Team can successful ly run our RATS system.

9.4.2 Secure Processing The work under Technical Area I of the RA TS program docs not require the process ing ofallY class ified data . Howe ver. the Eva luat ion Team is s lated to test the deli vered systems on classi fied data. poss ib ly at the SCI leve l. In the eve nt that the Evaluat ion Team should meet w ith an y issues when running the PATROL system on classified dam. BBN has numerous starr in its speech processing group who are c leared al the SC l lcvcl and who wi ll be able to interact with the Eva luation Team or any future mili tary cnd use rs 10 ensure that the software suite is correct ly configured and ru ns sllccessfully on their data. In addition. BON has a TS/SCI Se lf equipped with a secure connection to TS/SCI Government networks and. therefore. will be able to process the same classilicd data in its SCIF. should the need arise to do that.

9.5 University Participation In thi s project. we arc fortunate to have as partners fo ur univers ities that have exce lled in their areas of experti se w hich are relevant to the RATS program. as mentioned above . The Key Pcrsonnel pro ressors in those universities will be supervis ing a mi x o f research stall including post-docs. and graduate student s who wi ll be partic ipa tin g as research assistant s. The statTanJ students. under the guidance of their professors. will be writing sotiw<l re and perform ing experiments 10 im prove accu racy in the Jour RA TS applications in noi se.

In addition to deVelopi ng !lew resea rch ideas and writing code. these universities wi ll be t:xpccted to deliver to I3B N softwa re that wi ll be intt:grmcd into the BI1N system for delivery to the Eva luation Team and to potcnt ialuse rs. Unde r an existing IARPA contract. JII U. UMD. and HUT wi ll be delivering software for an external eva luation team as well. So. some o fLheir sonware wi ll have al ready undergone hardening for delivery to th ird parties. Fu rthermo[e. RUT hllS a lot of expe rie nce in transiti oning their so ftware fo r government and commcrcial app lications. We plan to take advantage of a ll thi s experience to integra te the soli ware from these sites with ou rs fo r delivery. The speech group at Cambridge University has a long hi story o f bu ilding so lid software (especially the ir IITK speech recogni tion too lkit) ror ex terna l use.

For over 20 years. BUN has had a cooperative relationship with Northeastern Univers ity. whcre Dr, Ma khoul has a n appointlllent as Adjunct I'rofessor. In this re lat ionship. gradl1ate students from Nonheastern work as research assi stants and do their graduate thesis work at BBN.

49 Use or disclosure of the data contained on th is page is subject to the restnction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000120

P10071-BBN PATROL Project Management

supervised by senior BBN stafT. wi th Dr. Makhoul acting as their ollicin l thesis advisor. We expect \0 continue thi s arnmgcmcnt. so we have budgeted one research assistant from Northeastern to work on th is project.

9.S Government's Role We expeCllhat the Gove rnment wi ll provide lIS with the necessa ry data with wh ich 10 build our models fo r the four application arcas. We will wo rk dosely with the Program Manager to ensure that our work is we ll aligned with the fiovernlllt!'1l1's expectations.

9.7 Project Management at BBN BON has an extensive hi story of program management success that is predicated 011 ta iloring Ihe management approach 10 the needs orlhe projecL Our project ac tivities range from very large. rormal programs to small. short-term projects w ith only a few engi neers. Our project activities range rrom research and development to technical transition to commerc ial offerings. We have the fl exibility and trai neLi talcnt within rJl3 N to work at all these levels . BHN is dedicateu to the principle th.tt program manage ment must take a balanceu and comprehensivc approach to tht: achievement ortechnica l. eosL. and schedule Object ives. To mee,t these objectives. HB N has de ve loped a program organization and management approach proven effective by prior success ful contract performances with multiple team members.

Every project has a designmed program manager who is responsible for the overa ll management leading to the successfu l completion with in lime and budget. The program manager is respons ible for day -to-day management. sta ffing plans. fin::mcial tracking and reporting. subcontractor relationships. and tracking o r milestones and deliverab les. Every project is reqllired to ha ve a project plan Lletailing th.1t project's tasks. work organi zation. and budgeted costs. The plan. sc hedule. ass ignments and budgets arc communicated to all program team members. Monthly tracking and control procedures mcasure the progress and cost aga inst the plan. Each projed is subjecteu to monthly reviews that report on project status. so that any difficulties or risks that may impact project completion can be identified and addressed in a timely manner.

The team holds regular project meetings to coordinate work, identify any potential problems, anLi ensure the prQject is proceeding on schedule. Program ri sks will be identified and discussed. and mitigation plans implemented. I n a program of the magnitude of RATS the team meetings arc required ill both the program and task leve l.

A project web site is estab lished to slore. retrieve. anLi share technica l and manugement documents. All work products undergo an appropriate level of peer review. We provide status reports prepared and submitted in accordance w ith procedures in the award document. At the end ofench phase. we prepare and submit a Final Report that summarizes the project and tasks at the end ortha! phase.

BI3 N has the infrastructure to support the management of large government contracts. A number of systems (such as BBN's time reporting system. Microsoft PrQjcct for schedule creation and management. and PRI SM lo r project finances) are in place to monitor project status and faci litate project contro l. Most importantly. we have customizable project management practices. We unLierstand and appn:ciate the nced to be Il exib le. efficient. and highly responsive to the evo lving goals of the RATS Pmgram.

50 Use or disclosure of the dara contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000121

Pl0071-BBN PATROL Cost Summaries

10 Cost Summaries

The fo llowing four pages provide a complete cost summary for the proposed PA TROL efTort under the RATS program.

The lirst page contains a top-level view o rlhe proposed costs. broken down by phase. major task. and performing s ite. There a rc seven major tec hni ca l tasks: Feature Extraction. No ise Model ing/Compensation . Speech Acti vity Detection. Lunguagc Identification. Spt;:akcr Identilication. Keyword Spotting. and Systl!lll Integration. In additi on to the seven major technica l tasks. we have also priced Program Management and IJrogram InfnlstruclU re as add itional major ta sks. Progra m Management includes program admin istration. all BBN travel , and data se rvices (:such as low-leve l annotat ions, whi ch will be made ava ilable to the other Rt\ 'I'S performing sites), Program Infmstructure includes materials (computer equipment) , computer system admini stra tion. and da ta acquisition . For Phases 2 and 3. we also show at the boltom o f the pagc the cost ing lo r the proJX>scd Speech Transc ript ion option.

The second page contains the same cost ing information for I>hase I. but broken down by month. for the 18-month duration. The third and fourt h pages contain the corresponding monthly costing information for Phases 2 and 3. each lasting 12 months.

51 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000122

PATROL Cost Summaries

RATS PATROL SUMMARY

Base Proposal Phase 1 Phase 2 Phase 3

Feature Extraction Johns Hopkins University $ 191 ,501 $ 128,763 $ 131,141

University of tllaryland $ 129,261 $ 95,315 $ 94,500

Noise Modeling/Compensation BBN $ 664,876 $ 442,559 $ 441 ,829

Bma University $ 51 ,520 $ 38,011 $ 37,687

Cambridge University $ 134,649 $ 54,900 $ 55,978

Speech Activity Detection (SAD) BBN $ 475,039 $ 276,875 $ 243,125

University of tJaryland $ 129,261 $ 95,315 $ 94,500

Language ID (LID) BBN $ 644,170 $ 461 ,689 $ 495,500

Brna University $ 103,040 $ 76,036 $ 75,372

Speaker ID (SID) BBN $ 486,644 $ 398,574 $ 419,302

Brna University $ 103,040 $ 76,036 $ 75,372

Johns Hopkins University $ 191,501 $ 128,763 $ 131 ,140

Keyword Spotting (KWS) BBN $ 616,423 $ 461 ,364 $ 483,307

Cambridge University $ 134,650 $ 54,904 $ 55,978

Northeastern University $ 43,063 $ 44,782 $ 46,573

System Integration BBN $ 313,267 $ 342,757 $ 377,522

Program Management * BBN $ 328,681 $ 247,608 $ 234,355 Program Infrastructure .. BBN $ 552,786 $ 470,811 $ 400,930

Total Base $5,293,372 $ 3,895,062 $3,694,111

Speech Transcription Option BBN $ 325,701 $ 337, 100

Cambridge University $ 157,064 $ 162,459

Total Option $ 482,785 $ 499,559

* Program Management includes program administration, travel and data services .

• " Program Infrastructure includes system administration, material, and data acquisition.

52 Use or disclosure of the data contained on this page is subject to the restriction on (he title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000123

P10071 ·BBN PATROL Cost Summar ies

Phase 1 by Month

Pn ... 1 S ...

." iI<1'_.~"'f

Comt n<li/O u-. ... ""Y

$~. ,h k'''' ;lyC .... ' I.n (SAOI

'" ~"Y.f"~

lMI~ ... ~IDIUDI ~, _u.-_ . ..,. Spe . ... ' ID !SlD)

'"

Koywo>rd S poiling IKWS)

'" comorrigo ..... """"11)' _." .... lkweftlty

',og'.",OI>tI.O~"""'

'" P,olllam lf1,,,, ,,, "O "'fO ."

JuI·" Nov_" 0.<_11

10 ,&112 , 10,6&1 IO.6l!l '0 tiIIl $ 10,M l S 106l!l 10.W .0&'12 u·\ $ usa

lUll

1.112 ". 711J $ 7 '32 $ 1.1&0 1.1~1 71!2 $ 1181 $ 1.1al S 11 ~ 1

1?~l

". 7 lal

,,~

1.181

~ _10\! ,.,

S lun 'X'l2l $ 30107 S 30707 '30. 1~ 'l',2lS

S 7.(111 S 1,081 5 lve7 $ ;,0I!1 ! 7<)1~ S Hl81

1 1 2~

,~

21,532

1,13~

30 1~1

7,1110

1)10&5 '23801 i l"OO I " .lill I Z'SJS S V2'2 $ 2IIG2D ,., I $3., S tQMl S 100112 $ 10 612 $ lOSe! S ,Mal

12,1 12

1.011

lO 152

>CII

:lIlIS,

,~, ""' ,.,

' 06l!1

35,157

1t3~

$ I ~M

1,1I!1

ll.m $ 3/1.001

~.sa7 $ ,saT

2U71 .. , 10 "U

lG ,233

7.1157

". '"'

28,601<

I ,la.

"ICIt

e fll

IS,S:MI ,., lJ ,TSO

l ,ca l

.," I .I~ , 112e

.~

,-

n ,15ee

7 III

' .2i'I , • 28'

"' lill Y

7 ,1! 1

ll,ll l

1111

"" 10117

15lS2 $ ~.~7 S 3l1.m s 3' m s 31.U& S 3"~J

1.\0&1 S 11,1 $ ISSI S 1\0&1 S I .SSl i ISST

,., 10!&1

]1 ,132

1.1I!7

UIe

'., .m

,., , o.6~l

J~ , 115

'til

'1,01)

M81

,O,U,

~,n.

I.on 1.1i!!

l5.'-

.~

7!~7

7t!2

1.081 $ 111!1

25,lOJ

7 \!/

/I ,IQ I

3,517

~1.06' S 11 ~9 S lOOJO

I .SSl S 1.5S1 S

1t3! $ I CII S 1087

1I2~ I , &28

TO'.'

"' ,R' IH.lII

"",,/i 11 .UII

n o,""

.. &.u. IIIJ.MII

"'.$01

III, UI

,U,UO

IJ.OII

S &0601 S '3.>11 S 13177 $ '3m S 13m S 20m S 19.0< 1 S 11.30>7 S ' 9 392 S 11,5" S I1&11S 1 11.= S '6.111Xl S 1119' S 11.:.06 S 17 1!1 S '6,= S·I._ $ " 9501 , 1".:t1

S li.031 S 2' a'a $ 'z.~ $ '2 1 '~ " 2011 S 13:100 S IUJO "2 011 5 17,m 5 217C1 S 2UII i 2'j~JJ S 15M 121912 i '2.0<'0 111733 5151'93 I 11~1 I 10.1IZO 1 '>I.U I

$ 1\ ,115 $ 12111 1 11 975 $22! ,157 51 1, 11 1 $ 11.915 I 12 1fO $ 11 ,1176 '"5,170 11/,!i/i/ S 12310 1 12,17' S 12,!'062 S '2.370 S 12,!i/i3 $ ' 2J 7~ 112,503 S '2 ,370 S 1211~ I 111.117

snu,: I~J,IH n U,n, 1"1.1H nn.l~ $lM.n' I ZlI,IU 181.1011 I JlI.3U IZM.'" Im.IM UCIII,7H 1:t7.1.2' IHI.' I' 1211.110' UII.117 I:IU,702 1:11.'" 1221.'. u .tU.1n

53 Use Of dIsclosure of the data contained on this page is subjecllo tile res triction on the tJl/e page of thIS proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000124

P10071·BBN PATROL Cost Summaries

Phase 2 by Month

Phase 2 Base

Fe ature ElCtrllc t io n JoIv6 Hopkins UllIversJly Uroo,ers,11y of MaTYlantl

Feb-12 Mar-12 Apr-12

S 9,90S S 9,905 $ 9 .905 S $ 4.039 5 8.079 $ 8079 ,

May-12 Jun-12

9,905 S 51. 905 S 8.079 S 7.977 S

JUI-12 Aug-12 SCp-12

9.905 S 9,905 S 7,675 S 7.875 S

9905 S 7.874 ,

OCt-12

9,905 5 7.875 S

No v_12

9,905 5 7,675 S

Dec-12 Jlln-1J

9,905 S 9.905 S 7.875 $ 7.875 5

F, b·13 Total

9,903 $ 128,763 3936 S 95,315

Noise Mod.~nglCo~nsatioll BB N 5 34,140 S 33.039 5 34,196 S 34,256 S 35,020 S 35589 $ 35,832

3.222 S 3.222 S 3,222 S 3,140 S 3.141 S 3. ' 4 1 4,535 S 4.1535 S 4,635 S 4,635 $ 4 ,635 S 4.554

S 35,589 S $ 3,14 1 $ $ 4,519 $

34,196 $ ] ,1 40 $ 4,519 $

34,256 3,140 4,519

$ 34.196 $ 3. 140 5 4,51i

S 33,057 $ 29.182 S 3,140

S 442,558 S 38,011 S 54.900

arro University CClmbndge lJni...erslty

Speech Activity BBN UrivefSily of MllryBrd

LangUllge 10 (LID) BBN Brno Uri...ersily

S pn h r lOlSID) BBN 8m;) Urivers ity Johns Hopkins Unilll!rsity

Keywot\1 Spotting BBN Cambridge Unllf!r5Ity ~l1heastem UI1IYel'1 ily

SyStDm Integration

s s

3,22:2 2,317

S S

$ 20.859 S 21188 S 21 .2"4 $ 4 ,039 $ 8.079 $ 807i

$ 22.2"3 S 8,079

S 22.099 S 21 .718 $ $ 7 976 S 7,875 $

22.595 7,875

S 22.128 S $ 1875 S

21 .814 7.875

$ 21 .959 S $ 7.875 $

S 4 5 19 S 2,259

20959 $ 7.875 $

20.124 5 17947 7,875 $ 3938

$ 216,8 15 S 95,315

.5 33,190 S 3U27 S 34,352 S 34,823 S 38,090 S 36 ,315 S 37.156 S 36,315 $ 35,836 $ 36,217 S 35.491 S 37.761 S 34,316 $ 461 ,689 S 6.444 $ 6.444 S 6.444 S 6444 S 6.293 S 6.281 $ 6.281 $ 8281 $ 6.281 S 6281 S 8.281 S 6.281 S 76,036

$ 29.956 $ 6 ,444 $ 9,905

5 34.624 S 2.318 $ 1.899

$ 30,559 S 6.444 S ~ , 903

S 35.723 $ 4.635 $ 1,899

$ 30720 $ 6.444 S 9,e05

S 31.628 $ 8.444 $ 9,905

S 3631$ 5 S 4635 $ $ 1899 $

36.689 4 ,638 1,900

$ 31 ,289 $ 32012 $ 31 .803 S 6,295 S 6.28 1 $ 628 1 S i,905 S 9.905 $ 9.905

S 31 .599 S 31289 S 6,281 $ 6.281 $ 9,905 S 9905

S 36.660 S 36,660 $ 4.637 S 4.637 $ 9.229 $ 9,229

$ 35.478 5 $ 4.556 5 $ 9.230 S

38.248 5 4 $19 5 1900 S

35.836 4,518 1.899

5 31 .216 $ 30.720 S 26.302 $ 27.679 $ $ 8,281 $ 6281 S 6,281 $ $ 9,905 5 9,905 5 9905 S 9,905 $

$ 38.149 $ 4.518 $ 1.900

$ 35492 $ 33.107 $ 32.383 $ $ 4518 5 4.518 $ 2,259 $ $ U99 S 1.899 $

398.572 16,038

128,763

461 ,364 54,904 44,782

BBN $ 23,176 $ 22.B03 S 24.400 S 23.703 $ 24,29B $ 28 ,507 $ 30,380 $ 28,507 $ 28.889 $ 29,251 S 29,"48 $ 24,364 S 24751 $ 342,757

Progr.lm Management BBN S 21 .263 $ 28,165 $ 19.0711 $ 14460 $ 26018 $ 30577 $ 16978 S 20,653 5 14.336 S 14.048 S 181711 5 11663 S 12.189 $ 247,608

Program Infrastructure BB N $ 228 ,892 $ 12. ' 10 S 12,500 $ 12,499 S 12.499 $ 12.499 $ 12.302 $ 10S,017 $ 12 499 5 12.497 5 12,499 $ 12,500 $ 12489 $ 470,812

loui Phase2 Ban $476,632. $280,659 1276.053 $273,550 $293,965 $l03,639 $291,907 $378,256 $276,693 5277,792 S21t1,682 $263,016 $223,158 $3,895,082

Speech Transcription Option 8BN S 24.153 S 24,698 $ 24.492 $ 25,847 S 24,998 5 25.S82 S 24.492 $ 2S,562 $ 24.998 S 2S.847 $ 24 ,492 $ 25.562 S 24998 $ 325,701 CambridgeUri vt!rsify S 12064 S 12083 $ 12083 $ 12,083 S 12,084 $ 12,084 S 12.083 $ 12084 $ 12 .0B4 $ 12 ,083 S 12,083 $ 12,083 $ 12083 5 157,084 TolalOptioo 536.231 $ 36,781 $ 36,515 $ 31,930 $ 31,082 531,646 $ 36,575 $ 37,646 $ 37,082 5 37,930 $ 36,$75 $ 37,64$ $ 37,081 S 482,785

54 Use or disclosure of the data contained on Il!is page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000125

P10071-BBN

Phase 3 by Month

Phase 3 Base

Feature Extraction John!; Hopkins University

lInrver$'ty of Marylilnd

PATROL Cost Summaries

Feb-13 Mar-13 Apr-13 May-13 Jun-13 Jul-13 Aug-13 Sep-13 Oct-13 Nov-13 Oec-13 Jan -1 4 Feb-14 Total

$ 10.087 $ 10.068 $ 10,087 S 10,Oae S 10.066 S 10,086 5 10.088 S 10086 S 10.068 $ 10,088 $ 100S6 S 10,088 S 10,067 S 131,141 S 7.269 $ 7.269 S 7,269 S 1.269 S 1269 S 7,269 S 7,270 S 7.270 S 1.270 $ 7.269 S 7.269 $ 7.269 S 7,269 S 94,500

Noise Mode lingfCompensalion BBN $ 33219 S 33050 S 35.019 S 35,945 S 37,967 S 37,455 S 38,067 S 36.339 S 35.235 S 35030 S 34,382 S 30.3 16 $ 18,793 S 441,829 Brno University S 3.141 S 3141 S 3, 141 $ 3, 141 S 3, 140 $ 3,140 S ),140 S 3.140 5 3,140 5 3, 141 $ 3,141 5 3,141 $ 37,681 Cambtodge Un....erslty S 2.332 $ 4.665 S 4,565 5 4,665 S 4.665 S 4.665 S 4,565 S 4.665 S 4,565 $ 4.665 $ 4665 S 4.664 S 2.332 $ 55,978

Spel!'ch Activity Detection (SAD) BBN 5 18,525 5 17,938 S 19,623 $ 19,347 $ 20.058 $ 21,630 5 20,233 $ 20.980 $ 19146 $ 18.697 5 19174 5 16,962 $ 10,612 $ 243,125 Univers ityofM::lryland S 7,269 $ 7. 269 S 7,269 $ 7,269 $ 7,269 S 7,269 S 7.269 S 7270 S 7,270 S 7,270 $ 7,269 S 7,269 S ],26S $ S4,500

Language ID (LID)

"N erno lkliverslly

Speake r 10 (SID)

'" Blno l.)f1 ivers ,ty Johns Hopkins University

1<eyYlOrd Spotting (1<WS) BBN Cambridge lJoiVersity Northeastem Unrversity

System IntegratIon BBN

Program Manage.,.nt BBN

Proyrlm Infrutrl.lCturlt BBN

lol.1 Phase 3 Bnlt

S 37,082 539,102 S 40.033 S 42064 $ 40.750 S 41 ,854 $ 38.053 $ 40.148 $ 40,750 $ 40. 116 S 38,085 S 34.602 522.861 $ 495,500 $ 6.281 5 6.281 S 6.281 $ 6.281 S 6.281 $ 6.281 5 6.281 $ 6.281 $ 6281 5 6.281 S 6281 $ 6.281 S 75,372

S 31.519 S 31 .006 S 32,623 S 32,222 S 32,196 S 32,517 S 32,947 S 32,517 S 32,196 S 32.222 S 32,623 $ 32,09 1 $ 32,623 $ 419,302 $ 6,281 S 6,281 S 6,28 1 S 6281 $ 6.28 1 $ 6,28 1 S 6,281 S 6,281 S 6,28 1 S 6.281 S 6,281 S 6.28 1 $ 15,372 S 10.087 S 10.008 S 10088 S 10.088 $ 10.087 5 10088 S 10088 S 10088 S 10088 S 10,088 S 10088 S 10.081 S 10.007 S 131,140

S 35.768 S 36 323 S 37,506 S 37,594 $ 31,018 S 37,269 S 37,a31 S 37.269 S 37,018 S 37.594 S 37,506 $ 37,594 $ 37.018 S 483,306 S 2,332 S 4.665 S 4,685 5 4,665 S 4,665 $ 4,665 S 4,665 S 4,665 S 4,665 S 4,664 S 4,665 $ 4,665 $ 2,332 S 55,978 $ 1,916 $ 1,976 $ 1,976 S 1.976 $ 9,599 $ 9,599 S 9,599 $ 1,976 $ 1.974 5 1,974 S 1,974 S 1,974 $ 46,573

S 28.420 S 21-877 S 29 415 $ 29,091 S 291n S 29,090 $ 29.415 $ 29.090 $ 29175 $ 29090 529,414 528853 $ 29 415 $ 377,522

S 21,095 5 22,568 5 18.482 $ 13,860 S 13,305 $ 29.977 S 16.382 5 25,949 5 13,305 S 13.1)60 $ 18.406 S 13,097 S 14,069 $ 234,355

$208,835 $ 12.499 $ 12.903 S 12,90t $ 12.90 1 $ 12,901 5 12.697 5 50,784 $ 12.901 S 12901 $ 12.~03 $ 12900 S 12.902 $ 400,931

$471,516 $282,086 $288,52& $284,747 5 292,716 $312,048 $294,971 $ 334,&00 5281 .448 $211,231 $214,214 $268,139 $217,669 $3,894,111

Speeeh Transeriptlon Option BBN 5 24,998 $ 25.562 $ 25.349 5 26,752 $ 25.673 S 26 457 $ 25,349 S 25 457 $ 25.873 S 26,752 S 25348 S 26,457 S 2S.a7J S 337 100 Cambodgel..lroi"'!!rslly $ 12.497 S 12,. 97 512496 S 12.497 S 12.497 $ 12.497 $ 12.496 S 12,497 S 12497 S 12A97 S 12497 512497 $ 12 497 5 162 459 Total Option S 37,495 $ 38,059 5 37,845 $ 39,249 $ 38,370 S 38,954 $ 37,845 $ 38,954 $ )8,370 S 39,249 $ 17,845 $ 38,954 5 31,370 S 499.559

55 Use or disclosure of tile data contained on this page is subject to the restriction all the tIIle page of this proposal

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000126

P10071-BBN PATROL Organizational Conflict of Interest

11 Organizational Conflict of Interest Affirmations and Disclosure

Nonc.

56 Use or disclosure of the data contained Of) this page is Subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000127

P10071·BBN PATROL Human Use

12 Human Use

None.

57 Use or disclosure of the data contained on t/lis page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000128

P10071-88N PATROL Animal Use

13 Anima l Use

None.

58 Use or disclosure of the data contained on this page is subject 10 the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000129

PATROL Statement of Unique Capability

14 Statement of Unique Capability Provided by Government or Government-funded Team Member

Not App licable.

59 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000130

P10071-BBN PATROL Government-Funded Team Member

15 Government or Government-funded Team Member Eligibility

Not Applicable.

60 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000131

P10071-BBN PATROL Appendix A: References

Appendix A: References

[ II I), Mermelstein. '" Distance measures for spl!cch recognition. psycho logical and instrumen tal," in "altem Recognition and Arltf'icia/ Intelligence. C. II. Chen (Ed.). pp. 374-388. Acade mic. New York. 1976.

r21 II. Hermansky. "Perceptua l Linear Predic ti on (PLP) I\nalysis lo r Speech:' J. A,:ousl. Soc. Alllcr .. Vol. 87. pp. 1738- 1752. Apri l 1990.

l31 H. Il crll1<ltlsky <J lld N. Morgan, " RAST A Process ing or Speech." 18t:T:: Transadiol1s 011 ... ;,.,cech and Audio P/'oCf!s.\'ing. Vol. 2. No. 4. pp. 587-589. October. 1994.

r41 M. Berouti. R. Schwartz. and J. Mak houl. "Enhancement orspeech corrupted by acoListic noise." If.t:E Imemafiolla! COI!/erCI1Cc on Acoustics, 5;pecc/t (1m! ,\'iJ.:1101 ProcC!ssing. pp. 208-211. Apri l. 1979.

15 1 S. Gamlpalhy. S. Thomas, ,tnd II. Hermal1sky. "Robusl Spectro-lcmpora l Features based on Au toregressive Mode ls of Hilbert Enve lopes." I( 'ASSP, Dallas. ·I·X. March 20 I O.

16] II. Hermansky, "The Modulation Spectrum in Automatic Recogn ition of Speech:' IF.EE Workshol) on A IIlon1alic SjJeech R(!('()gnilion lind Undersllll1ding. S. Furui. B.-I I. Juang and W. Chou (Eds.). 1997.

[7] T. Chi. P. Ru, and S. Shamma, "Mu ltiresolution speclroternpora l analys is ofcomplcl' sounds:'.I. Aeo/lsf . .')'oc, AllieI' .. Vo l. II S. pp. 887-906. 2006,

rSl M. Mirbaghe r i, N. Mesgarani, and S. Sham rna "Nonlinear filtering ofspectrolemporal modulati ons in speech enhancement:' ICASSf'. Dallas. TX, Marc h 20 10.

19] 13. Zhang. S. Matsoukas, and R. Schwartz. " Oiscrimina tively Trained Region Dependent Feature Transforms for Speech Recogniti on," ICAS"P. Vo l. I. pp.I-31 3 - 1-316, May 2006.

[I OJ H. Hermansk y. D.P.W. Elli s. and S. Sharma. "Connectionist Feature Extraction for Conventional HMM Systems," I(,ASSP. 2000.

[II J H. Hcnnansky and S. Sharnm. ''Temporal Pattern s (TRA I)S) in ASR of No isy Speech," It:EF. Imerna/i(mal COl~terence on Acous/ics .• \)x:ech lind 5ii}!,l1al Prm.:es.\'ing, Phoenix. AZ. 1999.

r 12'] F. Va lente and H. Il ennansky. ;' Hie rarchical and Para ll el Process ing or Modu lat ion Spectrum for ASR applicat ions." IEEE InlenWfiol1o/ Conference 011 AmUslic,\'. ,\jX'ech and Signal Processing. 2008.

fl l J J.l'into. Y. Bayya. H. Hermansky. and M. Mag imai-Ooss. "Exploiting Contextua l Information For Improved Phoneme Recogn ition." I£I:F. Inl. Conf. 0/1 AcOW;li('s. Speech. ({lid Si1!,nal Processing, 2008 .

ll 4J B. Zhang. S, Malsoukas. and R. Schwartz. "Recent progress on the discriminative reg ion~

dependent transform for speech feature ex traction." Imcrspee(-h. PiLLsbu rgh, PA, September 2006.

r 15 J J . Ma and R. Schwartz. "Unsuperv ised versus supervi sed tra ining of acoustic mode ls:' hller.\'jJeech. 2008.

61 Use or disclosure of the data contained on this page is subject to the restn'clion on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000132

P10071·BBN PATROL Appendix A: References

[16] S. Novotney, R. Schwartz. and .I . Ma. "Unsllpervised acoustic and language model train ing wit h small amounts of labe led daHl:' ICAS.')'P. 2009.

ll 71 S. Novotney. and R. Schwalv. "Analysis of Low-Resource Acollstic Model Self-Training:' !(,ASSf'. 2009.

j1 8] II . Uish. M. Siu, A. Chan. and W. Belficld, " Unsuperv ised trai ning of an II MM-based speech recogn izer for topic classiticat ion." ICAS'S? 2009.

1 19] N, Mesgarani. M. Slaney. S. A. Shamma, "Disc ri minati on o f speech from non speech based 0 11 multi sca le spectro-tempora l modulations." [EEl'; '/hms(fClions 01/ Audio. Speech. lIml Language Processing. vo 1. 14. no.3. pp. 920- 930. May 2006.

f20] L. Deng, A. Acero. M. Il lumpc, and X. D. Huang. " Large vocabu lary speech recognition unde r ad ve rse aco ustic environmen ts." I(,S'Lt>. Vol.3 . pr. 806-809. October, 2000.

f21] M.J .F. Ga les and S.J . Young. "A n Improved /\pproach to the I lidden Markov Model Decomposition of Speech and Noise:' ICA.\'5W, Vo l. I, pp. 233-236. 1992.

[22 1 M..I .r:. Ga les. S.J. Young. "A Fast and Flex ible Implementation of Paralle l Model Cumbinatiol1 ,·· ICASSP. pr. 133-136. 1995.

123"1 A. Acero. I.. Deng. T. Kristjansson. and J. Zhang. " HMM adaptut ion using vector 'raylor series for no isy speech recognition: ' [(,SLP. September 2000.

1241 r. Flcgo and M ..I .F .. Gales. " Disc riminative Adaptive Tra ining with VTS and JUD." I£EE Au/omatic !lju:t:ch RecoKlIiliol1l1nd Understanding Workshop. December 2009.

l25 J H. Xu, M.J .F .. Ga les, and K.K, Chin , " Improv ing .Io int Uncertainty Decoding Performance by Predict ive Methods for Noise Robust Spe~ch Recognition." /f,/:.E Automatic' .\'peech Reco},Jnitiol1 find Unders/lIl1ding Workshop. December 2009.

l261 D.K, Ki m and M.J .F .. Gules. "No isy Constrained Maximum Likelihood Linear Regress ion ror Noise-Rohusl Speech Recogni tion," submitted to If FE Transac/ions on Audio 5jJeech and LlII1gLl{lKe Proces.\"il1j{. 2010.

[271 Z. Ghahmma ni and M. L Jordan. " Factorial hidden Markov models:' Machine Lea1'l1il1g. 29:245- 275. 1997.

[28J P. Ken ny. G. Boulianne, P. Oucllct. and P. Dumouchel. "Joint fac tor analysis versus cigenchannels in speaker recogni tion:' 1£££ Trans. Audio. Speech. Lang Pro(','s'}'ing, Vo l. 15. no. 4. May 2007.

l291 G. Shafer. A Ma/hemalkal Them:" oj F:vidence. M IT Press, 1976.

f30] F. Va lente and II. Hermansky. "Combination or Acoustic C lass ifiers based on Dempster­Shafer Theory o f Evidence," IEEE In/ernalional COI!terence on Acol/stic.'), Speech and Sigilli! Processing. Honolulu, HI. 2007.

[3 11 R. Schwartz. A. Derr. /\. Wil gus. and J. Makhou l. "Speaker Ve ri ti cation IR&D Final Report." nnN Technical Reporl No. 666 1. Nov. 1987.

[32 1 E, Singer and D.A. Reynolds. "A nalys is o f' multi target detection for speaker and language recogni tion." 1£££ Odyssey: 71le Speaker tIIlfl LUI1KlI{lgt: Re('ogni/ion Workshop, pp. 30 I· 308,2004.

62 Use or disclosure of the data contained on tllis page is subject to the restriction on the title page of this proposal:

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000133

P10071-BBN PATROL Appendix A: References

[331 H. Jin . R. Schwart7.. S. Sista, and F. Walls. "Topic Tracki ng for Rad io. TV Broadcast. and Newswirc:' 1~·lIrosl'eech. Uudapcsl. Hungary. pp. 2439-2442. September 1999.

1341 I-I. Jin . R. Schw<trtz. F. Wa ll s. and S. Sista. "Method and apparatus fo r score normalizll ti on for information retrieval appi icar ions:' US. I'll/ellis 7 .062.4~5 and 6.651 ,057.

[35 1 D.R.H. Miller. M. Klebe r. C. Kao. O. Kimball. T. Cohhurst. S.A. Lowe. R.M. Schwartz. and H. Gish. "Rapid and Acc umte Spoken Term Dt!leclion," Il1lerspe(!ch. Antwerp. Belgium. 2007.

[36] S. MaLsou kas. I .. Nguye n. J. Davenport. J . nitla. F. Richardson. M. Si u. D. Liuo R. Schwartz. and J. Makhoul. "The 1998 llllN Byblos Primary System App lied 10 English anu Spanish Broadcast News Transcription:' DARPA Bl'oadc:asl News Workshop. Ilc rndon , V A. Morgan Kaufmann Puhli shers. pp . 255-260. reb. 2R-Mar. 3, 1991).

[371 S. Galligan, "Automatic speech detcction and scgmentation or air tmnic cont rol audio using paramctric trajectory models," I/:;HE Aerospace Conft!l'ence, Big Sky, MontamL March 2007.

P8J A. Stolckc. L. Ferrcr. S. Kajarck~lr. E. Shriberg. and A. Vcnkataraman. -'M I. LR Tmnsforms as rea!ures in Speaker Recogniti on," Hums/leech, pp. 2425-2428. September 2005.

[39J E. E, Shriberg " Iligher level features in speakcr rec.ognition:· in ,\'peake /' C/asst/icaliol1 I. C. Muller (Ed.). Vol. 4343 of Lecture Noles il1 Compuler Sciencf!. Arti licial lntel ligcnce. Springer. pp. 24 1-259. 2007.

[40.1 C.J. I.cggetter and P.c. Woodland. "Maximum likelihood linear regression for speakcr adaptation or continuous density IIMMs'-' COII/pllln Sllet>ch ami Langu(lge. Vol. 9, pp. 171-186.1995 .

14 1) M.A. Z issman. "Comparison of lour approaches to automatic language idcn tification of telcphone speech ," IEEE 7i·(l/1S. 011 '\/Jecch (1/1(/ Audio Processing, 4( 1):3 1-44. 1996.

[42] L. Burget. P. Matejka. and J. Cernocy. "Discriminative Training Techniques lo r Acousti c Language Ident ifi cation." ICASSf', Tou louse. France. pp. 209-212. 2006.

[431 T. Anastasak os. J. McDonough, R. Schwartz. and J. Makhoul. "A compact model ror speaker-adaptive training." 1111. Cw!l Speech Language Pl'ocf.'ssiI1K. vo l. 2. Philadelphia, PA. pp. 11 37- 1140. 1996.

f44] II. Li <llld 13. Ma. "A Phonotaclic Langu<lge Model for Spoken Language Identification," -/3rd Annulil Meeling l!lfhe ACL Ann Arbor. MI. pp. 5 15- 522, June 2005.

[45] S. Sista. R. Schwartz. T . Leek. and J. Makhoul. "All Algorithm lor Unsupervised Topic Discovery from Rroadcast News Stories." HlIl11al1l.clIlgulIK(' Technology Workshop. San Dicgo_ CA. 2002.

f461 S. F. Boll, "Suppress ion o f acoustic noi se in speech us ing spect ral subtraclion:'I££E ASSI'-2i- pp_ 113- 120. Apri 11 979.

l47J H. Gish. Y. -L. Chow. and J. R. Rohlicek. "Probabilistic Vector Mapping of No isy Speech parameters for II MM word spotting," I( "ASSP, 1990.

f481 K. Ng. H. Gish. and J. R. Rohlicek. " Robust Mappi ng of noi sy speech rarameters for HM M word spoiling:' leA.SSt'. 1992.

63 Use or disclosure of the data contained on this page is Subject to Ihe restncllOn on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000134

P10071·BBN PATROL Appendix A: References

f 49] X. Zhu, "M ult i-Stage speaker diari zntion for canterellel! and lecture meeti ngs." http://www. itl . ni sI. go v l iad/m i g/tcs[s/rtl2 00 7/workshnp/R -1'07 -LI M S 1-S r K R -PrcsenlUli oll_ v2.pdf

[SOt C. Wooters and M.I luijbrcgts. "The ICSt 20U7 speaker diari zation system." http://www.ill .nist.gov/i.ld/ m igltC:Sls/rt/2007/workshop/RT07 -SR IICS I-SPK R­Prese nta ti on.pdf

15 1] X. Angucra. M. Agu ila. C. Wooters. C. Nadt>u. and J. Il crmmdo. "Hybrid Speech/non­speech detec tor applied 10 Speaker Diarizul io l1 of Meet ings." IEEt: Ody,\'sey Speaker and LanXlI(lge Rec.:ognilioll Workshop. pp. 1-6. 2006.

1521 E. Scheirer and M. Slaney. "Construct io n and evaluation of a robust Illult ifeatu rc speech/music disc riminato r:' ICASSI'. 1997.

[531 B. Kingsbury. G. Saon. L. Mangu. M Padm,mabhan and R. Sarikaya. "Robust speech recognition in noisy environments: The 200 1 113 M SPINE eva luat ion system," IC;l5iS'I', 2002.

fS4] D. A. Reynolds. R. C. Rose. and M. J. T. Smith. "A Mixtu re Model ing Approach to Text­Independent Speukcr Ident ifi cation:' ./01/1"1/(/1 (?( till.! AcolI.,'/ical Sociely l?/Amerim, Suppl. t. Vol. 87. 1990.

155] L. Burget. P. Mi:l t~jka. P. Schwarl:. O. Glembck. and J . Cernocky, "Anulysis of tealUre ex tract ion and cluHlncl compcnsation in GMM speaker recognition system."IE£/:; Tram'{lClio!1.\' onAlldio. SjX!cch. (lml Lang/taRe Processing, Vol. 15, No.7. 2007,

f56] W. Campbell. I). E. Stu rim, D. A. Reynolds, :'Support Vector Machines Using GMM Supervectors for Speaker Veri tica tion:' IEEE Sigmi/ ProcessinK Lellers. Vo l. 13. No , 5. pp, 308- 3 11 . May 2006.

[571 L. Burget, P. Matejka, V. Hubeika, and J. CcrnockY. " Investigation into variants ofJ01111 r:actor Ana lysis for speake r recognition:' Il1ter~pee,:h , Brighton, UK. pp. 1263- 1266.2009.

f58J A. SolomonolT. W. M. Campbel l. and C. Qui llen. "Nuisance Attribute Projection:' Speech ('oll1l11 lmiclil iol'l. Amsterdam. The Nethe rlands. May 2007.

f59] W. M. Campbell. J. P. Campbell. D. A. Reynolds. D. A. Jo nes. and T. R. Leek. " High-Level Speaker Verilieation with Support Vcctor Machines:' f(';/Ssr. Montreal. Quebec. Canada. pp. I: 7.1-76. Muy 2004.

[601 J. P. Campbe ll. D. A. Reynolds. and R. B. Dunn. "Fus ing High- and I.ow-Level Features fo r Speake r Recogni tion." £urospeech. 2003.

t6 1] N. BrOmmer, L. Burget. J . Cernoek)'. O. Glembek. r: . Grezl. M. Karaliat. D. va n Leeuwen. P. Matejka. P. Schwarz, and A. Strashe im. "Fusion o f heterogeneous speaker recogn it ion systems in the ST13U submiss ion fo r thc NIST speaker recogni ti on eva luat ion 2006." IEF.£ Transaclions 0'1 Alldio . .5jJecch. and l.angulI:.,:e !'rocessil1J:. Vo l. 15. No.7, pp. 2072-2084, 2007.

L 621 P. Mate.ika. P. Schwa rz, L. 13urgel. and J. Ccrnocky, "Use of anti-mode ls to further im prove state-o l~ the-art PR LM language recogn it ion system." Ie ASS!'. Toulouse. r:rance. pp. 197-200.2006.

64 Use or disclosure of the data contained on this page is subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000135

Pl0071·BBN PATROL Appendix A: References

[63 1 JR . Roh licek. P. Jcanrenaud. K. Ng. II. Gish. 8. Musicus. M. Siu ... ,lhonct ic trai ning and language mOdel ing for word spotting:' ICASSP. 1993.

(64 1 J,G. Fiscus. J . I\jot. J .S. Garoro lo. and G. Doddington. "Results or the 2006 Spoken Term Detecti on Eva hUll ion." ,\jJecia/ l/lferesl GrollI' 01'1 {I!thrll/al ion Retrieval (S IG I R ·07) Workshop ill .')'earching Spontaneous COl1l'erslIliol1al ,~/Jeech. 2007.

65 Use or disclosure of the data contained on this page ;s subject to the restriction on the title page of this proposal.

epic.org 14-10-09-DARPA-FOIA-20150527-Production-Raytheon 000136