Anitha Kannan, Inmar E. Givoni, Rakesh Agrawal, and Ariel Fuxman
1 August 2010
An e-commerce catalog typically comprises specifications for millions of products. The search engine receives millions of sales offers from thousands of independent merchants, and each offer must be matched to the right product. This problem is hard for several reasons. First, most offers lack unique identifiers. Second, although the product specifications are well structured, offers are described in free text. Third, offers mention attribute values without providing the corresponding attribute names. Fourth, the values of many attributes are often missing from the offer description. Finally, offers may contain words that are neither attribute names nor attribute values.
We present an automated technique for matching unstructured offers to structured product descriptions. A novel aspect of our approach is the semantic parsing of offer descriptions using dictionaries built from the structured catalog. Another novelty is that the matching function we learn accounts not only for matching attribute values but also for mismatched and missing ones. Our approach has been implemented in an experimental search engine and is used daily to match all the offers received by Bing Shopping to the Bing product catalog. We present extensive experimental results from this implementation that demonstrate the effectiveness of the proposed approach.
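To make the two ideas above concrete, the following is a minimal Python sketch, not the implementation described in this work. It builds a value dictionary from a toy catalog, tags offer n-grams that equal a known catalog value, and scores each product by counting matched, mismatched, and missing attributes. The catalog contents, all function names, the exact n-gram lookup, and the weights are assumptions chosen for illustration; in particular, the weights here are set by hand, whereas the approach described above learns the matching function.

```python
from collections import defaultdict

# Hypothetical toy catalog: product id -> {attribute name: value}.
CATALOG = {
    "p1": {"brand": "canon", "model": "eos 450d", "color": "black"},
    "p2": {"brand": "canon", "model": "eos 500d", "color": "black"},
    "p3": {"brand": "nikon", "model": "d90", "color": "black"},
}


def build_dictionary(catalog):
    """Map each catalog attribute value to the attribute names it appears under."""
    value_to_attrs = defaultdict(set)
    for spec in catalog.values():
        for attr, value in spec.items():
            value_to_attrs[value.lower()].add(attr)
    return value_to_attrs


def parse_offer(text, value_to_attrs, max_ngram=3):
    """Tag word n-grams of the offer text that equal a known catalog value."""
    tokens = text.lower().split()
    parsed = defaultdict(set)
    for n in range(max_ngram, 0, -1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            for attr in value_to_attrs.get(phrase, ()):
                parsed[attr].add(phrase)
    return parsed


def count_features(parsed, spec):
    """Return (matches, mismatches, missing) for one offer/product pair."""
    matches = mismatches = missing = 0
    for attr, value in spec.items():
        if attr not in parsed:
            missing += 1      # the offer says nothing about this attribute
        elif value.lower() in parsed[attr]:
            matches += 1      # the offer agrees with the catalog value
        else:
            mismatches += 1   # the offer names a different value
    return matches, mismatches, missing


# Illustrative hand-set weights (an assumption): reward matches, penalize
# mismatches heavily, and penalize missing values only slightly.
WEIGHTS = (1.0, -2.0, -0.1)


def score(parsed, spec, weights=WEIGHTS):
    """Weighted sum of the match, mismatch, and missing counts."""
    return sum(w * c for w, c in zip(weights, count_features(parsed, spec)))


def best_match(offer_text, catalog):
    """Return the id of the highest-scoring product for the offer."""
    dictionary = build_dictionary(catalog)
    parsed = parse_offer(offer_text, dictionary)
    return max(catalog, key=lambda pid: score(parsed, catalog[pid]))
```

For example, `best_match("New Canon EOS 450D digital camera free shipping", CATALOG)` selects `"p1"`: the offer agrees with it on brand and model and is merely silent on color, whereas `"p2"` incurs a model mismatch. Treating a missing value as a much milder signal than a mismatch is the design choice this sketch is meant to illustrate.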