Morphology Based Language Modeling for Turkish Speech Recognition (BAP)
BAP-funded project on morphology-based language modeling for Turkish speech recognition.
Status
Completed 2008 - 2009
Funding Agency: Bogazici University Research Fund, BAP (Project 08M103)
Project Manager: Tunga Güngör
Dates: 2008-2009
In this project, we aimed to develop a high-performance large-vocabulary continuous speech recognition system for Turkish. The most important contribution of this work was the development of a morphology-based language model for Turkish.
In our previous work, we built language resources for Turkish, including a morphological parser, a morphological disambiguator, and a web corpus. Using these resources, we developed an effective morphology-based language model for Turkish. We also replaced the static lexicon with a dynamic one based on the morphological parser, which greatly alleviated the out-of-vocabulary problem for Turkish. Finally, we developed a speech decoder that operates on morphology-integrated search networks.
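The following minimal Python sketch illustrates why modeling sub-word units can reduce the out-of-vocabulary problem for an agglutinative language such as Turkish. It is not the project's implementation: the hand-segmented toy data, the `+` boundary marker, the helper names (`surface`, `morphs`, `oov_rate`, `train_bigrams`, `bigram_prob`), and the add-one smoothed bigram model are all assumptions made for illustration. A real system would obtain segmentations from a morphological parser and disambiguator and would use a far larger corpus.

```python
from collections import Counter

# Hypothetical hand-segmented toy data; "+" marks a morpheme boundary.
TRAIN = [
    ["ev+ler+de", "okul+lar+da"],   # "in the houses", "in the schools"
    ["ev+de", "okul+lar"],          # "in the house", "schools"
]
TEST = ["okul+da", "ev+ler"]        # word forms never seen in TRAIN


def surface(word):
    """Return the surface word form with boundary markers removed."""
    return word.replace("+", "")


def morphs(word):
    """Split a segmented word into its morphemes."""
    return word.split("+")


def oov_rate(units, vocab):
    """Fraction of units that are missing from the vocabulary."""
    units = list(units)
    return sum(u not in vocab for u in units) / len(units)


def train_bigrams(sentences):
    """Count morpheme unigrams and bigrams, with sentence boundary tokens."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        seq = ["<s>"] + [m for w in sentence for m in morphs(w)] + ["</s>"]
        unigrams.update(seq[:-1])          # history counts
        bigrams.update(zip(seq, seq[1:]))
    return unigrams, bigrams


def bigram_prob(prev, cur, unigrams, bigrams, vocab_size):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)


word_vocab = {surface(w) for s in TRAIN for w in s}
morph_vocab = {m for s in TRAIN for w in s for m in morphs(w)}

word_oov = oov_rate((surface(w) for w in TEST), word_vocab)
morph_oov = oov_rate((m for w in TEST for m in morphs(w)), morph_vocab)

unigrams, bigrams = train_bigrams(TRAIN)
# Predictable units: every morpheme plus the end-of-sentence token.
vocab_size = len(morph_vocab) + 1
p_okul_da = bigram_prob("okul", "da", unigrams, bigrams, vocab_size)

print(f"word-level OOV rate:  {word_oov:.2f}")
print(f"morph-level OOV rate: {morph_oov:.2f}")
print(f"P(da | okul) = {p_okul_da:.3f}")
```

In this toy setting both test words are unseen as whole words, yet every morpheme they contain occurs in the training data, so the morpheme-level model can still assign them a probability. The sketch ignores word boundaries inside the morpheme sequence and does not model the dynamic lexicon or the decoder's search network.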