IMPORTANT DATES (for the year 2011; Japan time)
| March 1 | Training queries and nuggets released |
| April 28 | Formal run queries released |
| May 16 | Run submissions due |
| May 23 | Formal run nuggets (tentative version) released |
| June 13 | Feedback on formal run nuggets due |
| July 1 | Nugget match evaluation begins |
| August 31 | Nugget match evaluation ends |
| September 8 | Formal run results released |
| September 20 | First draft papers due (for all NTCIR-9 tasks) |
| November 4 | Camera-ready papers due (for all NTCIR-9 tasks) |
| December 6-9 | NTCIR-9 meeting |
March 1: We will release about 40 training queries with their nuggets in order to show what kinds of information we expect the participating systems to return for each question type (See Task Definition).
April 28 - May 16: We will release 60 formal run queries to participating teams. Participating teams must submit their runs to the organisers on or before May 16.
May 23 - June 13: We will release a tentative version of the formal run nuggets (i.e. right answers) to participants. Participants have three weeks to examine them and complain to the organisers: "Hey, why isn't this piece of information included in the nugget set!?" We will revise the nugget sets if necessary.
June 20-August 31: Using a web-based nugget match evaluation interface (See Evaluation Method), organisers and participants assess the submitted runs. The evaluation task is basically to compare a system output with a nugget set. As this is a Japanese information access task, we will ask the Japanese participants for cooperation.
September 8 - November 4: We will release the official evaluation scores for all runs. Based on the results, all participating teams must submit a paper to NTCIR. If you are participating in both INTENT and 1CLICK, you must write one paper for INTENT and a separate paper for 1CLICK.
December 6-9: See you all at NII, Tokyo!
1CLICK aims at the following information access scenario: The user enters a query and clicks on the SEARCH button. After that, there is no need to click or scroll any further: the system returns exactly what the user wants right away, and the information need is satisfied immediately.
This year, we will focus on textual output (of length up to X characters). Systems should try to present important pieces of information first and to minimise the amount of text the user has to read.
This year, we will focus on Japanese queries and Japanese textual output.
There will be four types of queries: CELEBRITY, LOCAL, DEFINITION and QA. The first three query types are simple phrases, while the QA queries are single sentences. We will release a query set file, where each line contains the following two fields:
Note that we will not explicitly provide the query type information before the run submission deadline.
We will release 60 formal run queries, 15 for each query type.
Optionally, participating systems can use the Oracle URLs that we will provide for each query. These URLs were actually used for creating the gold-standard nuggets. Thus, instead of using a search API, participants may choose to treat these URLs as input, just like in summarisation evaluation.
Each participating system must return, for each query, an X-character string (which we call X-string) designed to satisfy the information need. The expected types of information depend on query types. Details are given in the nugget creation policy document and the sample queries and nuggets document.
We accept two types of runs (system output files):
DESKTOP runs ("D-runs"), for which the first X=500 characters for each query will be evaluated. The length limit of 500 roughly approximates the amount of text from top five Japanese snippets in a Web search result page, which the user can typically see without scrolling the browser.
MOBILE runs ("M-runs"), for which the first X=140 characters for each query will be evaluated. This length limit approximates a mobile phone display size.
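As a rough sketch of how this truncation to X counted characters might work, the following counts only non-punctuation characters toward the limit while keeping all characters in the output. The exact set of ignored characters ("punctuation marks etc.") is an assumption; the official interface's counting rules are not fully specified here.

```python
import unicodedata

def truncate_xstring(text: str, limit: int) -> str:
    """Truncate `text` so that at most `limit` counted characters remain.

    Punctuation, symbols and whitespace are not counted toward the limit
    (an assumption based on "excluding punctuation marks etc."), but they
    are kept in the truncated output.
    """
    counted = 0
    for i, ch in enumerate(text):
        # Unicode general categories: P* = punctuation, S* = symbol,
        # Z* = separator/whitespace; these are skipped when counting.
        if unicodedata.category(ch)[0] in ("P", "S", "Z"):
            continue
        counted += 1
        if counted == limit:
            return text[: i + 1]
    return text  # already within the limit

# D-runs are cut at 500 counted characters, M-runs at 140.
print(len(truncate_xstring("a" * 600, 500)))  # 500
```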
Participating systems may utilise any documents available on the Web. If a run relies at least partially on the Oracle URLs that we provide, it is called an oracle run. Otherwise (i.e., the run does not use the Oracle URLs in any way, for example because it relies on a Web search API instead), it is called an open run.
Runs must be generated automatically.
Each team can submit up to TWO runs. Participating teams may choose to evaluate additional runs for themselves later using the nugget match evaluation interface.
The file name of each run must be of the following format: <teamID>-<runtype>-<knowledgesource>-<integer>.txt
<teamID> is your registered teamID, e.g. MSRA
<runtype> is either "D" (desktop) or "M" (mobile).
The nugget match evaluation interface will truncate each D-run's X-string down to 500 characters, and each M-run's X-string down to 140 characters (excluding punctuation marks etc.) before evaluation.
<knowledgesource> is either "ORCL" (uses the Oracle URLs in some way) or "OPEN".
<integer> is either 1 or 2, since we only accept two runs per team.
So a legitimate run name would be something like MSRA-D-ORCL-1.txt.
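A quick sanity check of run file names can be sketched with a regular expression. The character set allowed in <teamID> is our assumption (the task description does not restrict it), and `is_valid_run_name` is a hypothetical helper name.

```python
import re

# Hypothetical checker for the run file naming scheme
# <teamID>-<runtype>-<knowledgesource>-<integer>.txt,
# where runtype is D or M, knowledgesource is ORCL or OPEN,
# and the integer is 1 or 2 (two runs per team).
RUN_NAME = re.compile(r"^[A-Za-z0-9]+-(D|M)-(ORCL|OPEN)-[12]\.txt$")

def is_valid_run_name(name: str) -> bool:
    return RUN_NAME.fullmatch(name) is not None

print(is_valid_run_name("MSRA-D-ORCL-1.txt"))  # True
print(is_valid_run_name("MSRA-X-ORCL-1.txt"))  # False
```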
Run file specifications:
All run files should be encoded in UTF-8. TAB is used as the separator.
Each run file begins with exactly one system description line, which should be in the following format:
SYSDESC[tab]<brief one-sentence system description in English>
Please make sure the description text does not contain a newline character.
Below the system description line, there must be an output line for every query. Each output line should contain an X-string, in the following format:
<queryID>[tab]OUT[tab]<X-string>
Again, X-strings should not contain any newline characters. We recommend that the target length of each X-string be around 600 characters for D-runs and around 200 characters for M-runs (i.e. slightly longer than the official values X=500 and X=140). The nugget match evaluation interface will truncate each run before evaluation, ignoring punctuation marks etc.
Each output line should be followed by at least one URL line. These lines represent the knowledge sources from which you generated the X-string. Each URL line must be in the following format:
<queryID>[tab]URL[tab]<URL>
These lines will be used for investigating what kinds of knowledge sources the participating teams have utilised.
If your run used only the Oracle URLs to produce the X-strings, then your run file should contain these URLs.
In summary, a run file should look something like this:
SYSDESC Bing API top 10 snippets; tfidf for sentence selection
1C1-0001 OUT これは例です
1C1-0001 URL http://research.microsoft.com/en-us/people/tesakai/1CLICK.html
1C1-0001 URL http://research.nii.ac.jp/ntcir/index-en.html
: : :
1C1-0060 OUT これも例です
1C1-0060 URL http://research.nii.ac.jp/ntcir/index-en.html
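A run file in the format above can be read with a minimal parser such as the following sketch. The field layout (a SYSDESC line, then tab-separated OUT and URL records) is taken from the example; `parse_run_file` is a hypothetical helper name, not an official tool.

```python
def parse_run_file(lines):
    """Minimal parser for the run file format sketched above.

    Returns (sysdesc, records), where records maps each queryID to a
    (x_string, url_list) pair. Fields are separated by tabs.
    """
    sysdesc = None
    records = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "SYSDESC":
            sysdesc = fields[1]
        elif fields[1] == "OUT":
            records[fields[0]] = (fields[2], [])
        elif fields[1] == "URL":
            records[fields[0]][1].append(fields[2])
    return sysdesc, records

sample = [
    "SYSDESC\tBing API top 10 snippets; tfidf for sentence selection",
    "1C1-0001\tOUT\tこれは例です",
    "1C1-0001\tURL\thttp://research.nii.ac.jp/ntcir/index-en.html",
]
desc, recs = parse_run_file(sample)
print(desc)
print(recs["1C1-0001"])
```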
We will evaluate runs based on nuggets. Please refer to the nugget creation policy document and the sample queries and nuggets document. Please note that our nuggets are designed to represent established facts as of December 31, 2010. We will not reward inclusion of fresher information (e.g. events that took place in 2011) in the X-string.
Organisers and (Japanese) participants will use the nugget match evaluation interface to compare the X-string and the nugget set for each query. That is, nuggets in the X-strings will be identified manually. Please refer to the nugget match evaluation instructions for details.
We will use S-measure as the primary evaluation metric (See Sakai/Kato/Song CIKM'11 paper). This generalises weighted nugget recall (W-recall) to encourage systems to present important nuggets first and to minimise the amount of text the user has to read. Given the result of the nugget match evaluation, S-measure and W-recall can be computed using NTCIREVAL.
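Our reading of these two metrics can be sketched as follows. The linear position discount, the Pseudo Minimal Output normalisation, and the patience parameter L=1000 are assumptions based on the CIKM'11 paper; NTCIREVAL is the authoritative implementation, and the function and variable names below are ours.

```python
def w_recall(matched, nuggets):
    """Weighted nugget recall: total weight of matched nuggets
    over the total weight of all nuggets (position-agnostic)."""
    return sum(w for w, _ in matched) / sum(w for w, _ in nuggets.values())

def s_measure(matched, nuggets, L=1000):
    """Position-aware S-measure sketch (after Sakai/Kato/Song, CIKM'11).

    matched: (weight, offset) pairs; offset is the character position
             at which each matched nugget appears in the X-string.
    nuggets: {nugget_id: (weight, length_in_characters)}.
    L:       patience parameter; the value 1000 is an assumption.
    """
    # Observed, position-discounted gain: nuggets appearing earlier
    # in the X-string contribute more.
    num = sum(w * max(0, L - off) for w, off in matched)
    # Ideal gain from the Pseudo Minimal Output: all nuggets presented
    # as densely as possible, in decreasing order of weight.
    den, offset = 0.0, 0
    for w, length in sorted(nuggets.values(), key=lambda x: -x[0]):
        offset += length
        den += w * max(0, L - offset)
    return num / den

nuggets = {"n1": (2, 10), "n2": (1, 5)}
perfect = [(2, 10), (1, 15)]        # matches the PMO layout exactly
print(s_measure(perfect, nuggets))  # 1.0
print(w_recall([(2, 10)], nuggets))
```

With this formulation, an X-string that presents every nugget as early and densely as the Pseudo Minimal Output scores S = 1, while pushing nuggets toward the end of the X-string (or past L characters) reduces S even when W-recall is unchanged.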
The key difference between our 1CLICK evaluation and traditional nugget-based QA/summarisation evaluation is that we take the positions of nugget matches into account.
Microsoft Research Asia
If you have any questions, please contact ntcadm-intent at nii dot ac dot jp.