Transformation-based Framework for Record Matching

Arvind Arasu, Surajit Chaudhuri, and Raghav Kaushik

Abstract

Today's record matching infrastructure offers no flexible way to account for synonyms such as "Robert" and "Bob", which refer to the same name, or for more general forms of string transformations such as abbreviations. We propose a programmatic framework for record matching that takes such user-defined string transformations as input. To the best of our knowledge, this is the first proposal for such a framework. This transformational framework, while expressive, poses significant computational challenges, which we address. We empirically evaluate our techniques over real data.

Details

Publication type: Inproceedings
Published in: Proceedings of the 24th International Conference on Data Engineering, ICDE 2008
Publisher: IEEE Computer Society