25th Anniversary Distinguished Lecture Series

 

From "Open Sesame" to "Where’s Waldo?"
A personal strategic view of 20 years of progress
in Speaker and Language ID

John J. (Jack) Godfrey
Chief of Human Language Technology Research, U.S. National Security Agency

Friday, October 14, 11 a.m., TI Auditorium (ECSS 2.102). Refreshments at 10:45 a.m.

 

Abstract
When the U.S. Government, especially the Department of Defense, has language technology needs that require R&D, it usually establishes specific sponsored programs (at DARPA or the individual services or agencies) to address them. While both industry and academia have advanced speech processing and language technology, some major challenges require more far-reaching, coordinated efforts, which is where support from the U.S. Government is needed. Speech/language applications such as speaker identification (SID) and language identification (LID) have generally been in the shadow of other language-based programs, such as those directed at automatic speech recognition. For SID/LID, there have been only two such major research programs in the last 20 years. Instead of continuously supported R&D in this area, we have tried to create a "virtuous cycle" using National Institute of Standards and Technology (NIST) evaluations and associated data sets to drive progress in research, namely the NIST Speaker Recognition Evaluation (SRE, www.itl.nist.gov/iad/mig/tests/sre/) and the NIST Language Recognition Evaluation (LRE, www.itl.nist.gov/iad/mig/tests/lre/), leaving application development to government labs or industry contracts. This talk will review how this strategy has worked over the last decade, its strengths and weaknesses and some gaps in the science that might be addressed in the research community. Along these lines, it will describe several new "seedling" initiatives under way at Johns Hopkins University, MIT Lincoln Laboratory and BBN Technologies.

Bio
Dr. Godfrey received his PhD in Linguistics from Georgetown University, was a postdoctoral researcher at the U.S. Air Force Aerospace Medical Research Lab and subsequently spent 10 years at UTD’s Callier Center, where he focused on speech perception and psycholinguistics. Under NIH sponsorship, his team at Callier collaborated with neurologists at UT Southwestern Medical Center on research showing that dyslexia has a central auditory component, affecting the ability to discriminate and classify phonetic features in speech. He later joined Texas Instruments Speech Research, where, in addition to phonetics, he worked on corpus-based evaluation, which helped drive speech research for the next decade. He also served as Executive Director of the Linguistic Data Consortium before becoming Chief of Human Language Technology Research at the U.S. National Security Agency, where he oversees government R&D efforts in Speaker Recognition, Language Recognition and Multilingual Speech-to-Text-based Keyword Search Technologies. He has interacted with numerous TI researchers, including George Doddington, Barbara Wheatley and Joseph Picone. He also had a leadership role in establishing the Human Language Technology Center of Excellence at Johns Hopkins University in 2007.