Sabato M. Siniscalchi, Jinyu Li, et al.
Speech recognition has become common in many application
domains. Incorporating acoustic-phonetic knowledge into Automatic
Speech Recognition (ASR) system design has proven to be a viable approach
to raising ASR accuracy. Manner of articulation attributes such as
vowel, stop, fricative, approximant, nasal, and silence are examples of
such knowledge. Neural networks have already been used successfully as
detectors for manner of articulation attributes starting from representations
of speech signal frames. In this paper, a set of six detectors for the
above-mentioned attributes is designed based on the E-αNet model of
neural networks. This model was chosen for its capability to learn hidden
activation functions, which results in better generalization properties. The
experimental set-up and results are presented, showing an average 3.5%
improvement over a baseline neural network implementation.
In Lecture Notes in Computer Science