Creating Speech Recognition Grammars from Regular Expressions for Alphanumeric Concepts

  • Ye-Yi Wang,
  • Y. C. Ju

International Conference on Spoken Language Processing

Published by International Speech Communication Association

To bring speech recognition into the mainstream, researchers have been working on automatic grammar development tools. Most of this work has focused on modeling sentence-level commands for mixed-initiative dialogs. In this paper we describe a novel approach that enables developers with little grammar-authoring experience to construct high-performance speech grammars for alphanumeric concepts, which are often needed in the directed dialog systems that are more commonly used in practice. A developer simply writes down a regular expression for the concept, and the algorithm automatically constructs a W3C grammar with appropriate semantic interpretation tags. While the quality of the grammar is ultimately determined by the way in which the regular expression is written, the algorithm relieves developers of the difficult tasks of optimizing grammar structures and assigning appropriate semantic interpretation tags. It thus greatly speeds up grammar development and reduces the expertise required. Preliminary experimental results show that grammars created with this approach consistently outperformed the general alphanumeric rules in the grammar library. In some cases the semantic error rates were cut by more than 50%.
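
To make the idea concrete, the following is a minimal sketch of mapping a regular expression to a W3C grammar with semantic interpretation tags. It is not the algorithm described in the paper, which the abstract does not specify. It assumes a deliberately tiny regular-expression subset (`\d`, single letters, and `{n}` repetition), and the names `parse`, `regex_to_srgs`, and `DIGIT_WORDS` are hypothetical. The tag contents follow the general style of W3C semantic interpretation scripts and would need checking against a real recognizer.

```python
import re

SRGS_NS = "http://www.w3.org/2001/06/grammar"
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

# One token: "\d" or a single letter, optionally followed by "{n}".
# This restricted syntax is an assumption made for illustration.
TOKEN = re.compile(r"(\\d|[A-Za-z])(?:\{(\d+)\})?")


def parse(pattern):
    """Split the restricted regex into (symbol, repeat_count) pairs."""
    parts, pos = [], 0
    while pos < len(pattern):
        match = TOKEN.match(pattern, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at position {pos}: {pattern!r}")
        parts.append((match.group(1), int(match.group(2) or 1)))
        pos = match.end()
    return parts


def regex_to_srgs(pattern, root="concept"):
    """Return an SRGS XML string whose root rule mirrors the regex."""
    lines = [
        '<?xml version="1.0"?>',
        f'<grammar version="1.0" xmlns="{SRGS_NS}" xml:lang="en-US"',
        f'         root="{root}" tag-format="semantics/1.0">',
        f'  <rule id="{root}" scope="public">',
        '    <tag>out = "";</tag>',  # start from an empty string result
    ]
    for symbol, count in parse(pattern):
        if symbol == "\\d":
            # Any spoken digit; append its value to the result.
            body = '<ruleref uri="#digit"/><tag>out += rules.latest();</tag>'
        else:
            # A spoken letter; append the character written in the regex.
            body = f'{symbol.lower()}<tag>out += "{symbol}";</tag>'
        lines.append(f'    <item repeat="{count}">{body}</item>')
    lines.append("  </rule>")
    # The shared digit rule is always emitted, even if the regex has no "\d".
    lines.append('  <rule id="digit">')
    lines.append("    <one-of>")
    for value, word in enumerate(DIGIT_WORDS):
        lines.append(f'      <item>{word}<tag>out = "{value}";</tag></item>')
    lines.append("    </one-of>")
    lines.append("  </rule>")
    lines.append("</grammar>")
    return "\n".join(lines)
```

Calling `regex_to_srgs(r"A\d{3}")` yields a grammar whose root rule expects the letter "a" followed by three spoken digits, with tags intended to assemble a string such as "A123". A real tool would also need to handle alternation, ranges, optional parts, and alternative pronunciations (for example "oh" for zero), none of which this sketch attempts.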