1CLICK-2@NTCIR-10 Home Page (This is the OLD homepage)

WHAT'S NEW

IMPORTANT DATES

ALL DATES ARE BASED ON JAPAN TIME


April 30, 2012       sample queries and iUnits released RIGHT HERE
Aug 31, 2012         test queries released
Oct 31, 2012         run submissions due
Nov 2012-Jan 2013    iUnit match evaluation
Feb 01, 2013         very early draft overview released
Feb 28, 2013         evaluation results released
Mar 01, 2013         draft participant papers due
May 01, 2013         all camera-ready papers due
Jun 18-21, 2013      NTCIR-10

TASK DEFINITION

Overview

Current web search engines usually return a ranked list of URLs in response to a query. The user then often has to visit several web pages and locate the relevant parts within long pages. However, for some classes of queries, the system should be able to gather relevant information and return it directly to the user, satisfying her immediately after her click on the search button ("one click access").

The 1CLICK task focuses on evaluating textual output based on information units (iUnits) rather than document relevance. Moreover, we require the systems to try to minimise the amount of text the user has to read or, equivalently, the time she has to spend in order to obtain the information. This type of information access is particularly important for mobile search. The systems are thus expected to search the web and return a multi-document summary of the retrieved relevant web pages that fits a small screen.

Queries

The first round of 1CLICK at NTCIR-9 (1CLICK-1) dealt with Japanese queries only. There, based on a study of desktop and mobile query logs, we considered four query types: CELEBRITY, LOCAL, DEFINITION and QA.

For this second round of 1CLICK (1CLICK-2), we have expanded our language scope to English and Japanese. Moreover, we will use the following, more fine-grained query types:

ARTIST (10) user wants important facts about musicians, novelists etc. who produce works of art;
ACTOR (10) user wants important facts about actors, actresses, TV personalities etc.;
POLITICIAN (10) user wants important facts about politicians;
ATHLETE (10) user wants important facts about athletes;
FACILITY (15) user wants access and contact info of a particular landmark, facility etc.;
GEO (15) user wants access and contact information of entities with geographical constraints, e.g. sushi restaurants near Tokyo Station;
DEFINITION (15) user wants to look up an unfamiliar term, an idiom etc.;
QA (15) user wants to know factual (but not necessarily factoid) answers to a natural language question.

The number of queries for each query type is shown in parentheses. Thus, we will use a total of 100 queries for the Japanese task, and another 100 for the English task. The queries will be selected from real mobile query logs. 

Task types

Registered 1CLICK participants must submit at least one run to the Main Tasks or the Query Classification Subtasks.

MAIN TASKS (Japanese, English)

This is similar to the 1CLICK-1 task described in the Overview: given a query, return a single textual output (X-string). The length of the X-string is limited as follows:
- For English, 1000 chars for DESKTOP run and 280 chars for MOBILE run; and
- For Japanese, 500 chars for DESKTOP run and 140 chars for MOBILE run.
Note that symbols (such as ',' and '(') are excluded when counting the number of characters.
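
A minimal sketch of one way to apply this counting rule, assuming that "symbols" means anything other than letters and digits (Python treats kana and kanji as letters, so Japanese text is counted as well); the organisers' official counting procedure may differ:

    def count_chars(x_string: str) -> int:
        """Count the characters of an X-string that are subject to the length limit.

        Assumption: only letters and digits are counted; punctuation, symbols
        and whitespace are excluded. The official counter may differ.
        """
        return sum(1 for ch in x_string if ch.isalnum())

    # Hypothetical check of a MOBILE English run against the 280-character limit:
    assert count_chars("An example X-string for a MOBILE run.") <= 280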

We require systems to return important pieces of factual information and minimise the amount of text the user has to read (time she has to spend to obtain the information).

There are three types of Main Task runs:

Mandatory Runs: Organisers will provide baseline web search results and their page contents for each query. Participants must use only these contents to generate X-strings. Using the baseline data will enhance the repeatability and comparability of 1CLICK experiments.

Oracle Runs (OPTIONAL): Organisers will provide the supporting URLs and their page contents for the gold standard iUnits for each query. Participants can use the data either wholly or partially to generate X-strings. If this data set is used in any way at all, the run is considered as an Oracle run.

Open Runs (OPTIONAL): Participants may choose to search the live web on their own to generate X-strings. Any run that does NOT use the oracle data but uses at least some privately-obtained web search results is considered as an Open run, even if it also uses the baseline data.

QUERY CLASSIFICATION SUBTASKS (Japanese, English)

This is a relatively easy subtask: given a query, return its query type. The query type should be chosen from the taxonomy mentioned above. Main Task participants whose systems involve query classification are encouraged to participate in this subtask to "componentise" evaluation.
The input of the Query Classification Subtasks is the same as that of the Main Tasks: a query set file, which contains pairs of a query ID and a query string, as explained below.

NOTE: For the Query Classification Subtasks, there are no run types such as Mandatory/Oracle/Open.

 

Input

We will release a query set file, in which each line contains the following two fields:

<queryID>[TAB]<querystring>

Note that we will not explicitly provide the query type information before the run submission deadline.

For ORACLE and MANDATORY runs, we will provide a set of HTML documents for each query, from which participants are expected to generate X-strings. In addition, an index file for each query will be released, which contains the title, URL, rank, and snippet (the summary of an HTML document presented under its title in a SERP) of the HTML documents. The ranks and snippets of the HTML documents are derived from Bing's web search results.

Index files are named <queryID>-index.tsv, and each line is of the following format:

<rank>[TAB]<filename>[TAB]<title>[TAB]<url>[TAB]<snippet>

where <filename> is the name of an HTML document distributed by the organisers.
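
As an illustration, here is a minimal sketch for reading a query set file and an index file under the formats above, assuming UTF-8 encoding and plain TAB-separated fields with no quoting; the commented usage refers to the sample data listed in the Examples below, with a made-up query ID:

    import csv

    def read_query_set(path):
        """Read a query set file: <queryID>[TAB]<querystring> per line."""
        with open(path, encoding="utf-8", newline="") as f:
            reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
            return {row[0]: row[1] for row in reader if len(row) >= 2}

    def read_index(path):
        """Read a <queryID>-index.tsv file:
        <rank>[TAB]<filename>[TAB]<title>[TAB]<url>[TAB]<snippet> per line."""
        docs = []
        with open(path, encoding="utf-8", newline="") as f:
            for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
                if len(row) != 5:
                    continue  # skip empty or malformed lines
                rank, filename, title, url, snippet = row
                docs.append({"rank": int(rank), "filename": filename,
                             "title": title, "url": url, "snippet": snippet})
        return docs

    # Hypothetical usage with the sample data (the query ID is illustrative):
    # queries = read_query_set("1C2-E-SAMPLE.tsv")
    # docs = read_index("1C2-E-SAMPLE-0001-index.tsv")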

Examples

  • Input files (Download via Microsoft SkyDrive)
    • 1C2-E-SAMPLE.tsv (a query set file that contains five queries)
    • 1C2-E-SAMPLE-000x-index.tsv (an index file for the query ID 1C2-E-SAMPLE-000x)
    • 1C2-E-SAMPLE-000x folder (a folder that contains HTML documents for the query ID 1C2-E-SAMPLE-000x)

RUN SUBMISSION

MAIN TASKS

Each team can submit up to FOUR runs for Japanese and SIX runs for English. Note that, if the organisers cannot evaluate all the submissions due to lack of resources, only the run files with higher priority will be evaluated (see <integer> in the file name specification below).

File name specification

The file name of each run must be of the following format:

<teamID>-<language>-<runtype>-<integer>.tsv

where

  • <teamID> is your registered team ID, e.g. MSRA
  • <language> is either E (English) or J (Japanese)
  • <runtype> is the identifier of a run type, which specifies DESKTOP (D) or MOBILE (M), and MANDATORY (MAND), ORACLE (ORCL), or OPEN. More formally, <runtype> must be of the following format: (D|M)-(MAND|ORCL|OPEN).

    For example:

    • D-MAND: a DESKTOP and MANDATORY run file,
    • D-ORCL: a DESKTOP and ORACLE run file, and
    • M-OPEN: a MOBILE and OPEN run file.
  • <integer> is a unique integer for each team's run starting from 1, which represents the priority of the run file. Run files with a smaller <integer> will be evaluated with higher priority in the event that the organisers do not have enough resources to evaluate all the submissions. Note that at least each team's highest-priority MANDATORY run will be evaluated, regardless of its <integer>.

Some example run names for a team "MSRA" would be:

  • MSRA-J-D-MAND-1.tsv
  • MSRA-J-M-OPEN-2.tsv
  • MSRA-J-D-MAND-3.tsv
  • MSRA-J-M-OPEN-4.tsv
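
For convenience, the naming convention above can be checked with a regular expression such as the sketch below; the character set allowed in team IDs is an assumption:

    import re

    # <teamID>-<language>-<runtype>-<integer>.tsv, e.g. MSRA-J-D-MAND-1.tsv
    RUN_NAME = re.compile(
        r"^(?P<team>[A-Za-z0-9]+)"        # registered team ID (charset assumed)
        r"-(?P<lang>[EJ])"                # E = English, J = Japanese
        r"-(?P<device>[DM])"              # D = DESKTOP, M = MOBILE
        r"-(?P<runtype>MAND|ORCL|OPEN)"   # MANDATORY, ORACLE or OPEN
        r"-(?P<priority>[1-9][0-9]*)"     # priority, starting from 1
        r"\.tsv$")

    for name in ("MSRA-J-D-MAND-1.tsv", "MSRA-J-M-OPEN-2.tsv", "MSRA-E-D-BAD.tsv"):
        m = RUN_NAME.match(name)
        print(name, "->", m.groupdict() if m else "does not match")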

 

Run file specification

All run files should be encoded in UTF-8. TAB is used as the separator.

Each run file begins with exactly one system description line, which should be in the following format:

SYSDESC[TAB]<brief one-sentence system description in English>

Please make sure the description text does not contain a newline symbol.

Below the system description line, there must be an output line for each query. Each output line should contain an X-string. The required format is:

<queryID>[TAB]OUT[TAB]<X-string>

Again, X-strings should not contain any newline symbols. The nugget match evaluation interface will truncate each X-string to the length limit before evaluation, ignoring punctuation marks etc. in the count (as noted above).

Each output line should be followed by at least one SOURCE line. These lines represent the knowledge sources from which you generated the X-string. Each SOURCE line must be in the following format:

<queryID>[TAB]SOURCE[TAB]<source>

These lines will be used for investigating what kinds of knowledge sources the participating teams have utilized. <source> must be either a URL (for ORACLE and OPEN runs) or a filename (for MANDATORY runs).
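
Putting the run file specification together, here is a minimal sketch of how such a file could be written; the system description, query ID, X-string and source in the usage example are made up for illustration:

    def write_run(path, sysdesc, results):
        """Write a Main Task run file.

        `results` maps each query ID to (x_string, sources), where `sources`
        is a list of file names (MANDATORY) or URLs (ORACLE/OPEN).
        Descriptions, X-strings and sources must not contain TABs or newlines.
        """
        with open(path, "w", encoding="utf-8") as f:
            f.write(f"SYSDESC\t{sysdesc}\n")
            for qid, (x_string, sources) in results.items():
                f.write(f"{qid}\tOUT\t{x_string}\n")
                for src in sources:  # at least one SOURCE line per query
                    f.write(f"{qid}\tSOURCE\t{src}\n")

    # Hypothetical example (query ID, X-string and source are made up):
    write_run("MSRA-E-D-MAND-1.tsv",
              "Extractive summariser over the baseline search results.",
              {"1C2-E-SAMPLE-0001": ("An example X-string ...", ["0001.html"])})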

Examples

 

QUERY CLASSIFICATION SUBTASKS

File name specification

The file name of each run in the Query Classification Subtasks must be of the following format:

<teamID>-QC-<integer>.tsv

where <teamID> is your registered team ID, and <integer> is a unique integer for each team's run, starting from 1. You can submit as many Query Classification runs as you like.

Run file specification

Each line in the run file should contain the following two fields:

<queryID>[TAB]<querytype>

where <querytype> is a query type predicted by your system, which must be one of the following eight types: ARTIST, ACTOR, POLITICIAN, ATHLETE, FACILITY, GEO, DEFINITION, and QA.
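
A minimal sketch for writing a Query Classification run file; only the line format and the eight type labels come from the specification above, and the example prediction is illustrative:

    VALID_TYPES = {"ARTIST", "ACTOR", "POLITICIAN", "ATHLETE",
                   "FACILITY", "GEO", "DEFINITION", "QA"}

    def write_qc_run(path, predictions):
        """Write <queryID>[TAB]<querytype> lines, rejecting unknown labels."""
        with open(path, "w", encoding="utf-8") as f:
            for qid, qtype in predictions.items():
                if qtype not in VALID_TYPES:
                    raise ValueError(f"{qid}: unknown query type {qtype!r}")
                f.write(f"{qid}\t{qtype}\n")

    # Hypothetical usage (the query ID and prediction are illustrative):
    write_qc_run("MSRA-QC-1.tsv", {"1C2-E-SAMPLE-0001": "DEFINITION"})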

EVALUATION METHOD

We will follow and possibly extend the S-measure evaluation methodology as described in a CIKM11 paper and Overview of NTCIR-9 1CLICK. S-measure is like weighted nugget recall, but it encourages systems to
(a) present important pieces of information first; and
(b) minimise the amount of text the user has to read.
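
To our reading of the CIKM11 paper, S-measure takes roughly the following form (notation ours), where N is the set of gold-standard iUnits for the query, M the subset matched in the X-string, w(n) the weight of iUnit n, offset(n) the character position at which n has been read in the X-string, offset*(n) the corresponding position in a Pseudo Minimal Output built from the gold-standard iUnits, and L a patience parameter measured in characters; please consult the paper for the authoritative definition:

    \[
      \text{S-measure} =
        \frac{\sum_{m \in M} w(m)\,\max\bigl(0,\; L - \mathrm{offset}(m)\bigr)}
             {\sum_{n \in N} w(n)\,\max\bigl(0,\; L - \mathrm{offset}^{*}(n)\bigr)}
    \]

Under this form, the max(0, L - offset) factor rewards placing heavily weighted iUnits early in a short X-string, which is exactly properties (a) and (b) above.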

In addition, unnecessarily long X-strings will be penalised, using a nugget-precision-like metric, as described in an AIRS12 paper.

We also plan to devise new iUnit-based evaluation methods using ideas from a WSDM12 paper.

PAPER WRITING AND ATTENDING NTCIR-10

TBA

TASK PARTICIPANT REGISTRATION (CLOSED)

Please register online. Note that registered participants must write a participant paper and physically attend NTCIR-10 in June 2013 to present a poster.

Participants returning from NTCIR-9 1CLICK-1 are encouraged to continue using the same team name, to facilitate progress monitoring.

ORGANISERS

(ntcadm-1click at nii.ac.jp)
Makoto Kato  Kyoto University, Japan 
Tetsuya Sakai  MSRA, China 
Virgil Pavlu  Northeastern University, USA 
Takehiro Yamamoto  Kyoto University, Japan 
Mayu Iwata  Osaka University, Japan 
Zhicheng Dou  MSRA, China 
Matthew Ekstrand-Abueg  Northeastern University, USA 
Shahzad Rajput  Northeastern University, USA 

 

PAPERS

  • Li, J., Huffman, S. and Tokuda, A.: Good Abandonment in Mobile and PC Internet Search, ACM SIGIR 2009, pp.43-50, 2009. doi
  • Pavlu, V., Rajput, S., Golbus, P.B. and Aslam, J.A.: IR System Evaluation using Nugget-based Test Collections, ACM WSDM 2012, pp.393-402, February 2012. doi
  • Sakai, T. and Kato, M.P.: One Click One Revisited: Enhancing Evaluation based on Information Units, AIRS 2012, to appear, 2012.
  • Sakai, T., Kato, M.P. and Song, Y.-I.: Overview of NTCIR-9 1CLICK, NTCIR-9 Proceedings, pp.180-201, December 2011. pdf
  • Sakai, T., Kato, M.P. and Song, Y.-I.: Click the Search Button and Be Happy: Evaluating Direct and Immediate Information Access, ACM CIKM 2011, pp.621-630, October 2011. preprint