Microsoft Research Question-Answering Corpus

This download contains data only: a text file of 1.4K questions aimed at the text of Encarta 98, the full text of Encarta 98, and a set of human annotations identifying passages in Encarta that fully or partially answer each question. The annotations also record the precise nature of each match, such as whether the linguistic forms of the question and the answer are similar. The annotation data has been split in two ways to support different algorithm-training methodologies: 1) 10 files, each containing 10 percent of the original 1.4K questions along with the full set of answers for each question, and 2) 10 files, each containing 10 percent of the full, pooled set of 10K+ question/answer pairs.
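
The difference between the two splits can be illustrated with a minimal Python sketch. It does not read the corpus files, whose layout is not described here; the function names, the toy data, and the use of seeded random shuffling are all assumptions made for illustration, not the procedure used to build the distributed files.

```python
import random


def split_by_question(pairs, n_folds=10, seed=0):
    """Scheme 1 (illustrative): partition the questions, so each fold
    holds every answer for the questions assigned to it."""
    questions = sorted({q for q, _ in pairs})
    random.Random(seed).shuffle(questions)
    # Assign questions to folds round-robin after shuffling.
    fold_of = {q: i % n_folds for i, q in enumerate(questions)}
    folds = [[] for _ in range(n_folds)]
    for q, a in pairs:
        folds[fold_of[q]].append((q, a))
    return folds


def split_by_pair(pairs, n_folds=10, seed=0):
    """Scheme 2 (illustrative): partition the pooled question/answer
    pairs directly, so one question's answers may span several folds."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


# Hypothetical toy data: 20 questions with 1 to 3 answers each.
pairs = [(f"q{i}", f"a{i}_{j}") for i in range(20) for j in range(1 + i % 3)]

question_folds = split_by_question(pairs)
pair_folds = split_by_pair(pairs)
```

Under the first scheme every fold has the same number of questions but a varying number of pairs; under the second, folds have near-equal numbers of pairs but a question's answers can be spread across training and held-out folds.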

Details

Type: Download
File Name: MSR Encarta QA Corpus.msi
Version: 1.0.0
Date Published: 13 November 2008
Download Size: 36.76 MB

Note: By installing, copying, or otherwise using this software, you agree to be bound by the terms of its license. Read the license.