|
Geographic Query Parsing

A geographic query is usually composed of three components: “what”, “geo-relation” and “where”. Parsing queries to extract these components is a key problem in geographic information retrieval (GIR). The keywords in the “what” component indicate what users want to search for; “where” indicates the geographic area users are interested in; “geo-relation” stands for the relationship between “what” and “where”. For example, for the query “Restaurant in Beijing, China”, “what” = “Restaurant”, “where” = “Beijing, China”, and “geo-relation” = “IN”. For another query, “Mountains in the south of United States”, “what” = “Mountains”, “where” = “United States”, and “geo-relation” = “SOUTH-OF”. We categorize the “what” component into three types, as listed below:
Table 1. Geo-relation Types
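The decomposition into “what”, “geo-relation” and “where” described above can be illustrated with a minimal rule-based sketch. It is not the method used in the task: the pattern table RELATION_PATTERNS, the ParsedQuery type and the parse_query function are hypothetical names introduced here, and only the two relations that appear in the examples (IN and SOUTH-OF) are covered.

```python
import re
from typing import NamedTuple, Optional


class ParsedQuery(NamedTuple):
    what: str
    geo_relation: Optional[str]
    where: Optional[str]


# Hypothetical pattern table (an assumption, not the task's full relation list).
# Longer phrases come first so "in the south of" is tried before plain "in".
RELATION_PATTERNS = [
    (r"in the south of", "SOUTH-OF"),
    (r"in", "IN"),
]


def parse_query(query: str) -> ParsedQuery:
    """Split a query at the first matching relation phrase."""
    for phrase, label in RELATION_PATTERNS:
        match = re.search(rf"\s+{phrase}\s+", query, flags=re.IGNORECASE)
        if match:
            what = query[:match.start()].strip()
            where = query[match.end():].strip()
            return ParsedQuery(what, label, where)
    # No relation phrase found: treat the whole query as the "what" part.
    return ParsedQuery(query.strip(), None, None)


print(parse_query("Mountains in the south of United States"))
# ParsedQuery(what='Mountains', geo_relation='SOUTH-OF', where='United States')
```

A sketch like this splits at the first occurrence of a relation phrase and does not check that the “where” part is a real place name, so it will mislabel queries in which “in” is part of the “what” component. A realistic parser would need a gazetteer or a trained model on top of it.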
Data Set

A total of 800,000 queries were collected from Windows Live Search logs (http://search.live.com/). Most of them were geographical queries. A labeled sample of 100 queries was provided as a training set. This data set has been used in the geographic query parsing task of GeoCLEF 2007.

The query set is in XML format. Each query has two attributes: <QUERYNO> and <QUERY>.

    <QUERYNO>1</QUERYNO>
    <QUERY>Restaurant in Beijing, China</QUERY>
    <QUERYNO>2</QUERYNO>
    <QUERY>Real estate in Florida</QUERY>
    <QUERYNO>3</QUERYNO>
    <QUERY>Mountains in the south of United States</QUERY>

The sample labeled set is in the following format. Each labeled query has six more attributes: <LOCAL>, <WHAT>, <WHAT-TYPE>, <GEO-RELATION>, <WHERE> and <LAT-LONG>.

    <QUERYNO>1</QUERYNO>
    <QUERY>Restaurant in Beijing, China</QUERY>
    <LOCAL>YES</LOCAL>
    <WHAT>Restaurant</WHAT>
    <WHAT-TYPE>Yellow page</WHAT-TYPE>
    <GEO-RELATION>IN</GEO-RELATION>
    <WHERE>Beijing, China</WHERE>
    <LAT-LONG>40.24, 116.42</LAT-LONG>
    <QUERYNO>2</QUERYNO>
    <QUERY>Lottery in Florida</QUERY>
    <LOCAL>YES</LOCAL>
    <WHAT>Lottery</WHAT>
    <WHAT-TYPE>Information</WHAT-TYPE>
    <GEO-RELATION>IN</GEO-RELATION>
    <WHERE>Florida</WHERE>
    <LAT-LONG>28.38, -81.75</LAT-LONG>
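The labeled format shown above can be read with a short script such as the following sketch. It assumes the records are a flat sequence of sibling elements, as in the sample, with every record starting at a <QUERYNO> element; the actual file may have a root element or per-query wrappers, in which case the grouping step would change. The names load_records and lat_long are hypothetical helpers introduced for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

SAMPLE = """
<QUERYNO>1</QUERYNO>
<QUERY>Restaurant in Beijing, China</QUERY>
<LOCAL>YES</LOCAL>
<WHAT>Restaurant</WHAT>
<WHAT-TYPE>Yellow page</WHAT-TYPE>
<GEO-RELATION>IN</ GEO-RELATION>
<WHERE>Beijing, China</WHERE>
<LAT-LONG>40.24, 116.42</LAT-LONG>
<QUERYNO>2</QUERYNO>
<QUERY>Lottery in Florida</QUERY>
<LOCAL>YES</LOCAL>
<WHAT>Lottery</WHAT>
<WHAT-TYPE>Information</WHAT-TYPE>
<GEO-RELATION>IN</GEO-RELATION>
<WHERE>Florida</WHERE>
<LAT-LONG>28.38, -81.75</LAT-LONG>
"""


def load_records(fragment: str) -> List[Dict[str, str]]:
    """Group a flat run of elements into one dict per query."""
    # Defensive step: a closing tag written as "</ NAME>" is not
    # well-formed XML, so drop the space before parsing.
    cleaned = re.sub(r"</\s+", "</", fragment)
    # Assumption: the fragment has no root element, so add a synthetic one.
    root = ET.fromstring(f"<ROOT>{cleaned}</ROOT>")
    records: List[Dict[str, str]] = []
    for elem in root:
        if elem.tag == "QUERYNO":
            records.append({})  # a new record starts here
        if records:
            records[-1][elem.tag] = (elem.text or "").strip()
    return records


def lat_long(record: Dict[str, str]) -> Tuple[float, float]:
    """Convert the "lat, long" string into a pair of floats."""
    lat, lon = record["LAT-LONG"].split(",")
    return float(lat), float(lon)


records = load_records(SAMPLE)
for rec in records:
    print(rec["QUERYNO"], rec["WHAT"], rec["GEO-RELATION"], rec["WHERE"], lat_long(rec))
```

Keeping each record as a plain dictionary keyed by tag name means the same loader also works for the unlabeled query set, where only QUERYNO and QUERY are present.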
|