Users who submit the same query may have different intents. For an ambiguous query, users may seek different interpretations. For a faceted topic, users may be interested in different subtopics. In this project, we investigate how many queries are ambiguous in real search logs; we propose methods to diversify search results; we experiment with new metrics to measure diversity; and we organize the NTCIR INTENT and IMINE tasks to provide common data for the IR community.
This is the new homepage for
- New URL for the INTENT-2 homepage: http://research.microsoft.com/INTENT/
IMPORTANT DATES (in Japan Time: UTC+9)
|May 31, 2012||Chinese/Japanese topics, query suggestions and non-diversified baseline Document Ranking runs released|
|June 18, 2012||English topics (for Subtopic Mining only) released (same as TREC 2012 web topics)|
|July 31, 2012||all submissions due (CLOSED)|
|Aug-Dec 2012||identify intents from subtopics -> per-intent relevance assessments|
|Dec 21, 2012||evaluation results released, with a draft overview|
|March 1, 2013||draft participant papers due|
|May 1, 2013||camera ready papers due|
|June 18-21, 2013||NTCIR-10 at NII, Tokyo|
The second round of the INTENT task (INTENT-2) is similar to INTENT-1: please refer to the Overview of INTENT-1. As before, we have Subtopic Mining and Document Ranking subtasks for both Chinese and Japanese. In addition, this time we also have an English Subtopic Mining task by sharing topics with the TREC 2012 web track diversity task.
Subtopic Mining: given a query, return a ranked list of possible "subtopic strings." (See below for the definition.)
The subtopic strings returned by participants will be manually merged into intents for evaluating both Subtopic Mining and Document Ranking.
Document Ranking: given a query, return a selectively diversified ranked list of web pages.
The document collections for Chinese and Japanese are the same as those used at INTENT-1 (SogouT and ClueWeb09-JA). See the INTENT-1 homepage for obtaining these collections.
WHAT'S NEW AT INTENT-2
Besides the introduction of English Subtopic Mining, here are the departures from INTENT-1:
We have provided a clear definition of what a subtopic string should be (see below).
Organisers will provide query suggestions from major search engines to participants instead of making them scrape the suggestions for themselves. This will make the experiments more repeatable and comparable.
Similarly, organisers will provide non-diversified baseline search results, complete with their web contents, to participants. Thus, even those who do not have the document collections can participate by reranking the baselines.
The INTENT-2 Chinese and Japanese topic sets (shared across Subtopic Mining and Document Ranking) will contain not only ambiguous and underspecified queries (which probably require diversification), but also one-item search queries (which probably don't require diversification). Here, a one-item search query is one that requires only one answer or one particular webpage. Thus, in Document Ranking, selective diversification will be encouraged.
INTENT-2 participants will be asked to process not only the INTENT-2 topics but also the INTENT-1 topics, for the purpose of discussing topic set comparability and monitoring progress.
Similarly, INTENT-1 participants who have come back for INTENT-2 will be encouraged to run their INTENT-1 systems with the INTENT-2 topics as well, to discuss progress.
WHAT IS A SUBTOPIC STRING?
In the Subtopic Mining subtask, participants are required to return a ranked list of subtopic strings, not a ranked list of document IDs. What is a subtopic string?
A subtopic string of a given query is a query that specialises and/or disambiguates the search intent of the original query. If a string returned in response to the query does neither, it is considered incorrect.
original query: "harry potter" (underspecified)
subtopic string: "harry potter philosophers stone movie"
incorrect: "harry potter hp" (does not specialise)
original query: "office" (ambiguous)
subtopic string: "office workplace"
incorrect: "office office" (does not disambiguate; does not specialise)
It is encouraged that participants submit subtopics of the form
wherever appropriate, although we do allow subtopics that do NOT contain the original query:
original query: "avp"
subtopic string: "aliens vs predators"
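The definition above can be partly checked mechanically before a run is submitted. The following is a minimal sketch, not part of the official task tooling: the function name `filter_subtopic_candidates` is hypothetical, and it assumes whitespace-tokenised text (Chinese and Japanese strings would need a proper tokeniser). It only removes duplicates and candidates that add no term beyond the original query, such as "office office". It cannot detect cases like "harry potter hp", which add a term without specialising; those still require human judgment.

```python
def filter_subtopic_candidates(query, candidates):
    """Drop candidates that trivially fail to specialise or disambiguate.

    Hypothetical helper: assumes whitespace tokenisation and
    case-insensitive matching.
    """
    query_tokens = set(query.lower().split())
    seen = set()
    kept = []
    for cand in candidates:
        # Normalise case and collapse repeated whitespace.
        norm = " ".join(cand.lower().split())
        if norm in seen:
            continue  # exact duplicate of an earlier candidate
        seen.add(norm)
        if set(norm.split()) <= query_tokens:
            continue  # adds no term beyond the original query
        kept.append(cand)
    return kept


candidates = ["office workplace", "office office", "Office  Workplace", "office"]
print(filter_subtopic_candidates("office", candidates))
# prints ['office workplace']
```

Candidates that do not contain the original query, such as "aliens vs predators" for "avp", pass this filter unchanged.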
Visit: Submitting Runs to INTENT-2
Following INTENT-1, we plan to use D(#)-nDCG proposed in Sakai/Song SIGIR11 as our primary evaluation metric. In addition, for the Document Ranking subtask only, we will also use DIN(#)-nDCG proposed in Sakai WWW12 to encourage systems to diversify by considering whether each intent is informational or navigational (i.e. a one-item-search intent).
The basic idea is, given a space for 10 URLs (i.e. the first Search Engine Result Page), allocate more space to the more popular intents in comparison to the less popular ones; and allocate more space to informational intents and give just one URL slot for each navigational intent. D(#)-nDCG is type-agnostic (i.e. does not consider the intent types), but DIN(#)-nDCG is type-sensitive (i.e. considers informational and navigational intent types).
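As an illustration of the type-agnostic metric, here is a minimal sketch of D#-nDCG as we understand it from Sakai/Song: the global gain of a document is the sum of its per-intent gains weighted by intent probability, D-nDCG normalises the discounted global gain by that of an ideal list, and D#-nDCG averages D-nDCG with intent recall (I-rec) using a weight gamma, assumed here to be 0.5. The function name, argument layout and toy data are assumptions made for this example; the official evaluation tools should be used for reported results.

```python
import math


def d_sharp_ndcg(ranked_docs, intent_probs, gains, cutoff=10, gamma=0.5):
    """Sketch of D#-nDCG at a cutoff.

    ranked_docs:  list of document ids (assumed to contain no duplicates)
    intent_probs: {intent: P(intent | query)}
    gains:        {doc: {intent: gain value}} for judged relevant documents
    """
    def global_gain(doc):
        return sum(intent_probs[i] * g for i, g in gains.get(doc, {}).items())

    def discounted_sum(values):
        return sum(v / math.log2(r + 1) for r, v in enumerate(values, start=1))

    top = ranked_docs[:cutoff]
    system = [global_gain(d) for d in top]
    # Ideal list: all judged documents sorted by global gain.
    ideal = sorted((global_gain(d) for d in gains), reverse=True)[:cutoff]
    ideal_sum = discounted_sum(ideal)
    d_ndcg = discounted_sum(system) / ideal_sum if ideal_sum > 0 else 0.0

    # I-rec: fraction of known intents covered by the top-ranked documents.
    covered = {i for d in top for i, g in gains.get(d, {}).items() if g > 0}
    i_rec = len(covered) / len(intent_probs)

    return gamma * i_rec + (1 - gamma) * d_ndcg


# Toy topic with a popular intent A and a less popular intent B.
intent_probs = {"A": 0.7, "B": 0.3}
gains = {"d1": {"A": 3}, "d2": {"B": 3}, "d3": {"A": 1}}

diversified = d_sharp_ndcg(["d1", "d2", "d3"], intent_probs, gains)
single_intent = d_sharp_ndcg(["d1", "d3"], intent_probs, gains)
print(diversified, single_intent)
```

In this toy topic the list that covers both intents scores higher than the list that covers only intent A, both through I-rec and through the discounted global gain.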
Type-sensitive metrics can naturally evaluate selective diversification: for topics that have multiple intents, systems can diversify; for those that have exactly one navigational intent (and nothing else), diversification might actually hurt the ranked list.
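The effect of type sensitivity can be sketched as follows. This example assumes our reading of DIN-nDCG: for a navigational intent, only the first relevant document retrieved contributes gain, while the ideal list is the same type-agnostic one used for D-nDCG. The function name `din_ndcg` and the toy topics are hypothetical and are not taken from the official evaluation tools.

```python
import math


def din_ndcg(ranked_docs, intent_probs, gains, nav_intents, cutoff=10):
    """Sketch of a type-sensitive D-nDCG variant (assumed DIN-nDCG behaviour).

    nav_intents: set of intents treated as navigational; each of them
    earns gain only from the first relevant document in the ranking.
    """
    def discounted_sum(values):
        return sum(v / math.log2(r + 1) for r, v in enumerate(values, start=1))

    satisfied = set()  # navigational intents that already have a relevant doc
    system = []
    for doc in ranked_docs[:cutoff]:
        gg = 0.0
        for intent, g in gains.get(doc, {}).items():
            if intent in nav_intents:
                if g <= 0 or intent in satisfied:
                    continue  # redundant for a navigational intent
                satisfied.add(intent)
            gg += intent_probs[intent] * g
        system.append(gg)

    # Type-agnostic ideal list, as for D-nDCG (an assumption of this sketch).
    ideal = sorted(
        (sum(intent_probs[i] * g for i, g in per_intent.items())
         for per_intent in gains.values()),
        reverse=True,
    )[:cutoff]
    ideal_sum = discounted_sum(ideal)
    return discounted_sum(system) / ideal_sum if ideal_sum > 0 else 0.0


# Topic with exactly one navigational intent: a second relevant page adds nothing.
one_probs = {"home": 1.0}
one_gains = {"d1": {"home": 3}, "d2": {"home": 3}}
with_second_hit = din_ndcg(["d1", "d2"], one_probs, one_gains, {"home"})
without_second_hit = din_ndcg(["d1", "dX"], one_probs, one_gains, {"home"})

# Topic with a navigational and an informational intent: diversifying pays off.
two_probs = {"home": 0.5, "reviews": 0.5}
two_gains = {"d1": {"home": 3}, "d2": {"home": 3}, "d3": {"reviews": 3}}
diverse = din_ndcg(["d1", "d3"], two_probs, two_gains, {"home"})
redundant = din_ndcg(["d1", "d2"], two_probs, two_gains, {"home"})
print(with_second_hit, without_second_hit, diverse, redundant)
```

Under these assumptions, a second page for an already satisfied navigational intent earns nothing, so the metric rewards spending the remaining slots on other intents only when the topic actually has them.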
We are also open to trying other evaluation metrics.
(ntcadm-intent at nii.ac.jp)
|Min Zhang||Tsinghua University|
|Yiqun Liu||Tsinghua University|
|Makoto Kato||Kyoto University|
|Mayu Iwata||Osaka University|
|Takehiro Yamamoto||Kyoto University|
Sakai, T. and Song, R.: Diversified Search Evaluation: Lessons from the NTCIR-9 INTENT Task, Information Retrieval, to appear, 2013.
Sakai, T., Dou, Z., Song, R. and Kando, N.: The Reusability of a Diversified Search Test Collection, AIRS 2012 (LNCS 7675), pp.26-38, 2012.
Sakai, T.: Evaluation with Informational and Navigational Intents, WWW 2012, pp.499-508, April 2012.
Song, R., Zhang, M., Sakai, T., Kato, M.P., Liu, Y., Sugimoto, M., Wang, Q. and Orii, N.: Overview of the NTCIR-9 INTENT Task, NTCIR-9 Proceedings, pp.82-105, December 2011.
Sakai, T. and Song, R.: Evaluating Diversified Search Results Using Per-Intent Graded Relevance, ACM SIGIR 2011, pp.1043-1052, July 2011.