A unified statistical model for the identification of English baseNP

  • Changning Huang ,
  • Endong Xun ,
  • Zhou Ming

Published by Association for Computational Linguistics

Publication

This paper presents a novel statistical model for automatic identification of English baseNP. It uses two steps: the N-best Part-Of-Speech (POS) tagging and baseNP identification given the N-best POS-sequences. Unlike the other approaches where the two steps are separated, we integrate them into a unified statistical framework. Our model also integrates lexical information. Finally, Viterbi algorithm is applied to make global search in the entire sentence, allowing us to obtain linear complexity for the entire rocess. Compared with other methods using the same testing set, our approach achieves 92.3% in precision and 93.2% in recall. The result is comparable with or better than the previously reported results.