We present a supervised method for training a sentence-level confidence measure for machine translation (MT) output using a human-annotated corpus. We evaluate a variety of machine learning methods. The resulting measure, although trained on a very small dataset, correlates well with human judgments and proves effective in one task-based evaluation. Although the experiments have been run on only one MT system, we believe the features gathered are general enough that the approach will also work well on other systems.
Publisher: European Language Resources Association
Printed / Distributed with the permission of ELRA. This paper was published within the proceedings of the LREC'2004 Conference. © 2004 ELRA - European Language Resources Association. All rights reserved.