Ying Jia and Jinyu Li
In this paper, we propose a new type of frame-based hidden Markov models (HMMs), in which a sequence of observations is generated using state-dependent autoregressive feature models. Based on this correlation model, it can be proved that expressing the probability of a sequence of observations as a product of probabilities of decorrelated individual observations does not require the assumption of frame independence. Under the maximum likelihood (ML) criterion, we also derive re-estimation formulae for the parameters (mean vectors, covariance matrices, and diagonal regression matrices) of the new HMMs using an Expectation Maximization (EM) algorithm. The formulae show that the new HMMs extend the standard HMMs by relaxing the frame independence limitation. An initial experiment conducted on the WSJ20K task shows an encouraging performance improvement with only 117 additional parameters in all.

The recognizer follows the maximum a posteriori (MAP) rule, $\hat{W} = \arg\max_W p(W \mid O) = \arg\max_W p(O \mid W)\, p(W)$, where $W$ is a word string hypothesis for a given acoustic observation $O$, $p(O \mid W)$ is the acoustic model, and $p(W)$ is the N-gram language model.
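To make the idea of scoring decorrelated observations concrete, the following is a minimal sketch and not the formulation derived in the paper. It assumes a first-order model in which each state j has a mean vector, a diagonal covariance, and a diagonal regression matrix, and in which the residual o_t - A_j o_{t-1} is Gaussian. The function names, the parameter layout, the diagonal covariance, and the zero history before the first frame are all assumptions made for illustration.

```python
import math


def diag_gauss_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (variances in var)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def ar_sequence_loglik(obs, states, params):
    """Log-likelihood of obs given a fixed state sequence.

    params[j] = (mean, var, reg); reg holds the diagonal of the
    hypothetical regression matrix for state j (an assumed layout).
    """
    total = 0.0
    prev = [0.0] * len(obs[0])  # assumption: zero history before frame 1
    for o_t, j in zip(obs, states):
        mean, var, reg = params[j]
        # Decorrelate: subtract the part of o_t predicted from o_{t-1}.
        resid = [x - a * p for x, a, p in zip(o_t, reg, prev)]
        # The sequence score is a sum of per-frame residual log densities.
        total += diag_gauss_logpdf(resid, mean, var)
        prev = o_t
    return total


if __name__ == "__main__":
    obs = [[1.0, 2.0], [1.5, 2.5], [0.5, 1.0]]
    states = [0, 0, 1]
    params = [
        ([1.0, 2.0], [1.0, 1.0], [0.3, 0.3]),
        ([0.0, 0.0], [2.0, 2.0], [0.1, 0.1]),
    ]
    print(ar_sequence_loglik(obs, states, params))
```

When every regression coefficient is zero, the residual equals the observation itself and the score reduces to the usual sum of per-frame Gaussian log densities, which is the sense in which such a model extends the standard one.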
Published in: Proc. ICASSP