1CLICK@NTCIR-9 Home Page

last updated September 1, 2011

One Click Access (1CLICK) is a subtask of the NTCIR-9 INTENT task.

IMPORTANT DATES

(for the year 2011; Japan time)

blue=participants' actions; red=organisers' actions; black=both 

March 1       Training queries and nuggets released
April 28      Formal run queries released
May 16        Run submissions due
May 23        Formal run nuggets (tentative version) released
June 13       Feedback on formal run nuggets due
July 1        Nugget match evaluation begins
August 31     Nugget match evaluation ends
September 8   Formal run results released
September 20  First draft papers due (for all NTCIR-9 tasks)
November 4    Camera-ready papers due (for all NTCIR-9 tasks)
December 6-9  NTCIR-9 meeting

 

March 1: We will release about 40 training queries with their nuggets in order to show what kinds of information we expect the participating systems to return for each question type (See Task Definition).

April 28-May 16: We will release 60 formal run queries to participating teams. Participating teams must submit their runs to the organisers on or before May 16.

May 23-June 13: We will release a tentative version of the formal run nuggets (i.e. the right answers) to participants. Participants have three weeks to examine them and complain to the organisers: "Hey, why isn't this piece of information included in the nugget set!?" We will revise the nugget sets if necessary.

June 20-August 31: Using a web-based nugget match evaluation interface (See Evaluation Method), organisers and participants assess the submitted runs. The evaluation task is basically to compare a system output with a nugget set. As this is a Japanese information access task, we will ask the Japanese participants for cooperation.

September 8-November 4: We will release the official evaluation scores for all runs. Based on the results, all participating teams must submit a paper to NTCIR. If you are participating in multiple subtasks of INTENT, you must write one paper for INTENT and a separate paper for 1CLICK.

December 6-9: See you all at NII, Tokyo!

TASK DEFINITION

Outline:

1CLICK targets the following information access scenario: the user enters a query and clicks on the SEARCH button. After that, the user does not need to click or scroll any further; the system returns exactly what the user wants right away, and the information need is satisfied immediately.

This year, we will focus on textual output (of length up to X characters). Systems should try to present important pieces of information first and to minimise the amount of text the user has to read.

Target language:

This year, we will focus on Japanese queries and Japanese textual output.

Input:

There will be four types of queries: CELEBRITY, LOCAL, DEFINITION and QA. The first three query types are simple phrases, while the QA queries are single sentences. We will release a query set file, where each line contains the following two fields:

<queryID> <querystring>

Note that we will not explicitly provide the query type information before the run submission deadline.

We will release 60 formal run queries, 15 for each query type.

Optionally, participating systems can use the Oracle URLs that we will provide for each query. These URLs were actually used for creating the gold-standard nuggets. Thus, instead of using a search API, participants may choose to treat these URLs as input, just like in summarisation evaluation.
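For illustration only (this is not an official tool), here is a minimal Python sketch of reading the query set file. It assumes the two fields are separated by whitespace or a TAB and that the file is UTF-8 encoded; the file name is just a placeholder.

# Minimal sketch: load the 1CLICK query set file.
# Assumption: the two fields are separated by whitespace/TAB; encoding is UTF-8.
def load_queries(path):
    queries = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            query_id, query_string = line.split(None, 1)
            queries[query_id] = query_string
    return queries

# Usage with a placeholder file name:
# queries = load_queries("1click-formalrun-queries.txt")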

Output:

Each participating system must return, for each query, an X-character string (which we call X-string) designed to satisfy the information need. The expected types of information depend on query types. Details are given in the nugget creation policy document and the sample queries and nuggets document.

We accept two types of runs (system output files):

DESKTOP runs ("D-runs"), for which the first X=500 characters for each query will be evaluated. The length limit of 500 roughly approximates the amount of text in the top five Japanese snippets on a Web search result page, which the user can typically see without scrolling the browser.

MOBILE runs ("M-runs"), for which the first X=140 characters for each query will be evaluated. This length limit approximates a mobile phone display size.

Participating systems may utilise any documents available on the Web. If a run relies at least partially on the Oracle URLs that we provide, it is called an oracle run. Otherwise (i.e. the run does not use the Oracle URLs in any way, for example because it uses a Web search API instead), the run is called an open run.

Runs must be generated automatically.
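Since only the first X characters of each X-string will be evaluated, it may be useful to truncate system output on the participant side as a sanity check. Below is a minimal sketch for illustration only; it uses plain character counting, whereas the official evaluation interface additionally excludes punctuation marks etc. when counting.

# Character limits for the two run types (D = desktop, M = mobile).
X_LIMITS = {"D": 500, "M": 140}

def truncate_xstring(x_string, run_type):
    # Simple character-count truncation; the official interface also
    # skips punctuation when counting, which is not modelled here.
    return x_string[:X_LIMITS[run_type]]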

RUN SUBMISSION

Each team can submit up to TWO runs. Participating teams may choose to evaluate additional runs for themselves later using the nugget match evaluation interface.

File names:

The file name of each run must be of the following format:

<teamID>-<runtype>-<knowledgesource>-<integer>.txt

where

<teamID> is your registered teamID, e.g. MSRA

<runtype> is either "D" (desktop) or "M" (mobile).

The nugget match evaluation interface will truncate each D-run's X-string down to 500 characters, and each M-run's X-string down to 140 characters (excluding punctuation marks etc.) before evaluation.

<knowledgesource> is either "ORCL" (uses the Oracle URLs in some way) or "OPEN".

<integer> is either 1 or 2, since we only accept two runs per team.

So a legitimate run name would be something like MSRA-D-ORCL-1.txt.
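One simple way to catch naming mistakes before submission is to check the file name against a regular expression. Here is an illustrative Python sketch (the team ID pattern, letters and digits only, is an assumption):

import re

# Sketch: validate a run file name such as MSRA-D-ORCL-1.txt.
# Assumption: team IDs consist of letters and digits only.
RUN_NAME = re.compile(r"^[A-Za-z0-9]+-(D|M)-(ORCL|OPEN)-[12]\.txt$")

def is_valid_run_name(name):
    return RUN_NAME.match(name) is not None

# is_valid_run_name("MSRA-D-ORCL-1.txt")  -> True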

Run file specifications:

All run files should be encoded in UTF-8. TAB is used as the separator.

Each run file begins with exactly one system description line, which should be in the following format:

SYSDESC[TAB]<brief one-sentence system description in English>

Please make sure the description text does not contain any newline characters.

Below the system description line, there must be an output line for every query. Each output line should contain an X-string. The required format is:

<queryID>[TAB]OUT[TAB]<X-string>

Again, X-strings should not contain any newline characters. We recommend that the target length of each X-string be around 600 characters for D-runs and around 200 characters for M-runs (i.e. slightly longer than the official values X=500 and X=140). The nugget match evaluation interface will truncate each run before evaluation, ignoring punctuation marks etc.

Each output line should be followed by at least one URL line. These lines represent the knowledge sources from which you generated the X-string. Each URL line must be in the following format:

<queryID>[TAB]URL[TAB]<url>

These lines will be used for investigating what kinds of knowledge sources the participating teams have utilised.

If your run used only the Oracle URLs to produce the X-strings, then your run file should contain those URLs.

In summary, a run file should look something like this:

SYSDESC    Bing API top 10 snippets; tfidf for sentence selection

1C1-0001    OUT    これは例です

1C1-0001    URL    http://research.microsoft.com/en-us/people/tesakai/1CLICK.html

1C1-0001    URL    http://research.nii.ac.jp/ntcir/index-en.html

: : :

1C1-0060    OUT    これも例です

1C1-0060    URL    http://research.nii.ac.jp/ntcir/index-en.html
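To avoid formatting mistakes, a run file can also be generated programmatically. The following is a minimal sketch for illustration only; the function name and the structure of the results argument are placeholders, not part of the submission specification.

# Sketch: write a 1CLICK run file in the format described above.
# results is assumed to map queryID -> (x_string, list_of_source_urls).
def write_run_file(path, sysdesc, results):
    with open(path, "w", encoding="utf-8") as f:
        f.write("SYSDESC\t" + sysdesc + "\n")  # exactly one system description line
        for query_id in sorted(results):
            x_string, urls = results[query_id]
            # X-strings must not contain newline characters.
            x_string = x_string.replace("\r", " ").replace("\n", " ")
            f.write(query_id + "\tOUT\t" + x_string + "\n")
            for url in urls:  # at least one URL line per query
                f.write(query_id + "\tURL\t" + url + "\n")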

EVALUATION METHOD

We will evaluate runs based on nuggets. Please refer to the nugget creation policy document and the sample queries and nuggets document. Please note that our nuggets are designed to represent established facts as of December 31, 2010. We will not reward the inclusion of fresher information (e.g. events that took place in 2011) in the X-string.

Organisers and (Japanese) participants will use the nugget match evaluation interface to compare the X-string and the nugget set for each query. That is, nuggets in the X-strings will be identified manually. Please refer to the nugget match evaluation instructions for details.

Evaluation metrics:

We will use S-measure as the primary evaluation metric (See Sakai/Kato/Song CIKM'11 paper). This generalises weighted nugget recall (W-recall) to encourage systems to present important nuggets first and to minimise the amount of text the user has to read. Given the result of the nugget match evaluation, S-measure and W-recall can be computed using NTCIREVAL.

The key difference between our 1CLICK evaluation and traditional nugget-based QA/summarisation evaluation is that we take the positions of nugget matches into account.
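To make the W-recall component concrete, here is a minimal sketch of weighted nugget recall for a single query, for illustration only. The nugget IDs and weights are placeholders; S-measure itself additionally discounts matched nuggets by their position in the X-string, as defined in the CIKM'11 paper and implemented in NTCIREVAL, and is not reproduced here.

# Sketch: weighted nugget recall (W-recall) for one query.
# nugget_weights: dict nuggetID -> weight (from the gold-standard nugget set).
# matched: set of nugget IDs that assessors found in the X-string.
def w_recall(nugget_weights, matched):
    total = sum(nugget_weights.values())
    hit = sum(w for n, w in nugget_weights.items() if n in matched)
    return hit / total if total > 0 else 0.0

# Example with placeholder weights:
# w_recall({"N001": 3, "N002": 2, "N003": 1}, {"N001", "N003"})  # 4/6 = 0.667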

ORGANISERS

Tetsuya Sakai (Microsoft Research Asia)
Makoto Kato (Kyoto University)
Youngin Song (Microsoft Research Asia)
Ruihua Song (Microsoft Research Asia)
Min Zhang (Tsinghua University)
Yiqun Liu (Tsinghua University)
Nick Craswell (Microsoft)

 

If you have any questions, please contact ntcadm-intent at nii dot ac dot jp.