Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot

A novel technique for maximum “a posteriori” (MAP) adaptation of maximum entropy (MaxEnt) and maximum entropy Markov models (MEMM) is presented.
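As a rough sketch of what such MAP adaptation typically looks like for a MaxEnt model (the abstract does not spell out the prior, so the Gaussian form below is an assumption), the adapted weights \Lambda maximize the in-domain log-likelihood under a prior centered at the background weights \Lambda^0:

$$
\hat{\Lambda} \;=\; \arg\max_{\Lambda} \sum_{i} \log P_{\Lambda}(y_i \mid x_i) \;-\; \sum_{j} \frac{(\lambda_j - \lambda_j^{0})^{2}}{2\sigma_j^{2}}
$$

Here \lambda_j^0 is the j-th weight of the background model and \sigma_j controls how far adaptation on the small in-domain set may pull each weight away from it.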

The technique is applied to the problem of recovering the correct capitalization of uniformly cased text: a “background” capitalizer trained on 20 Mwds of Wall Street Journal (WSJ) text from 1987 is adapted to two Broadcast News (BN) test sets from 1996, one containing ABC Primetime Live text and the other NPR Morning News/CNN Morning Edition text.

The “in-domain” performance of the WSJ capitalizer is 45% better than that of the 1-gram baseline when evaluated on a test set drawn from WSJ 1994. When evaluating on the mismatched “out-of-domain” test data, the 1-gram baseline is outperformed by 60%; the improvement brought by the adaptation technique using a very small amount of matched BN data (25–70 kwds) is about 20–25% relative. Overall, an automatic capitalization error rate of 1.4% is achieved on BN data.
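A minimal, self-contained sketch of the adaptation recipe described above, assuming the Gaussian-prior formulation: a binary MaxEnt (logistic-regression) model is first trained on plentiful “background” data, then re-trained on a small “in-domain” set with its weights tied to the background solution. All data, features, and hyperparameters (sigma, learning rate) here are illustrative placeholders, not the paper's actual setup.

import numpy as np

def train_maxent(X, y, w0=None, sigma=1.0, lr=0.1, epochs=500):
    """Binary MaxEnt trained by gradient ascent on
    sum_i log P(y_i | x_i) - ||w - w0||^2 / (2 sigma^2).
    With w0=None the prior is centered at zero (ordinary L2-regularized
    training, used for the background model); passing the background
    weights as w0 gives MAP adaptation."""
    n, d = X.shape
    prior_mean = np.zeros(d) if w0 is None else w0
    w = prior_mean.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))            # P(y=1 | x) under current weights
        grad = X.T @ (y - p) - (w - prior_mean) / sigma**2
        w += lr * grad / n                          # one gradient-ascent step
    return w

rng = np.random.default_rng(0)
d = 5
# Large "background" sample (stand-in for WSJ) and a small, slightly
# mismatched "in-domain" sample (stand-in for BN).
Xb = rng.normal(size=(2000, d))
yb = (Xb @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) > 0).astype(float)
Xi = rng.normal(size=(100, d))
yi = (Xi @ np.array([2.0, -1.0, 0.5, 1.5, 0.0]) > 0).astype(float)

w_bg  = train_maxent(Xb, yb)                        # background capitalizer
w_map = train_maxent(Xi, yi, w0=w_bg, sigma=0.5)    # MAP-adapted on little data

A small sigma keeps the adapted model close to the background weights where the in-domain sample is too small to speak for itself; a larger sigma lets the matched in-domain data dominate.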

2004-chelba-emnlp.pdf (PDF)

In Proc. of EMNLP

Details
Type: Inproceedings