In Proceedings of the First Workshop on Natural Language Processing and Neural Networks

  • Lei Zhang
  • Ming Zhou
  • Changning Huang
  • Haihua Pan

Most existing approaches to error detection and correction in Chinese text adopt N-gram language models over characters, words, or POS tags. These models have two deficiencies: they employ only local language constraints, and they lack a process for unifying language models. This paper presents a multifeature-based approach to automatic error detection and correction that uses both local language features and wide-scope semantic features. Winnow is adopted in the learning step. In experiments, the method achieves an error detection recall of 85.1%, an error detection precision of 41.0%, and a correction rate of 51.2%. The approach outperforms existing approaches.
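
The abstract names Winnow as the learning algorithm but does not say which variant, features, or parameters were used. As a rough illustration only, the following is a minimal sketch of the textbook Winnow update (multiplicative promotion and demotion on mistakes) over sparse binary features. The class name, the feature names such as `left:the` and `ctx:money`, the threshold, and the update factors are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class Winnow:
    """Textbook Winnow: a mistake-driven linear classifier over binary features.

    All parameter values here are illustrative assumptions, not the paper's.
    """

    def __init__(self, threshold=2.0, promotion=2.0, demotion=0.5):
        self.threshold = threshold
        self.promotion = promotion  # factor > 1 applied after a missed positive
        self.demotion = demotion    # factor in (0, 1) applied after a false positive
        self.weights = defaultdict(lambda: 1.0)  # every feature starts at weight 1

    def score(self, active):
        # Sum the weights of the features that are present in this example.
        return sum(self.weights[f] for f in active)

    def predict(self, active):
        return self.score(active) > self.threshold

    def update(self, active, label):
        # Weights change only when the current prediction is wrong.
        if self.predict(active) == label:
            return
        factor = self.promotion if label else self.demotion
        for f in active:
            self.weights[f] *= factor


# Hypothetical toy data: each example is a set of active features plus a label
# saying whether a candidate word is acceptable in that context.
examples = [
    ({"left:the", "ctx:money"}, True),   # local feature + wide-scope feature
    ({"left:the", "ctx:river"}, False),
]

model = Winnow()
for _ in range(5):  # a few passes over the toy data
    for active, label in examples:
        model.update(active, label)

print(model.predict({"left:the", "ctx:money"}))
print(model.predict({"left:the", "ctx:river"}))
```

In this sketch the shared local feature `left:the` cannot separate the two examples, so the updates end up shifting weight onto the wider-context features. This is one way a single learner could combine local and wide-scope evidence, though the paper's actual feature design may differ.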