In this project we develop a large-scale recommendation system that takes into account meta-data describing users and items.
One of the great challenges in the interconnected world of the web is to connect people with other people, content, or products they care about. This challenge appears in tasks like product recommendation, social matchmaking, targeted advertising and content filtering. Interestingly, the problem also occurs in more technical challenges such as picking the right algorithm for a given problem.
We have developed a large-scale Bayesian recommender system called Matchbox, which has been tailored to meet the requirements posed by these diverse challenges. At its core, Matchbox learns about people’s preferences by observing how they rate items such as movies, content, or other products. Based on those observations, Matchbox can then recommend new items to users on request.
Matchbox has been designed to use the available data for each user as efficiently as possible. Its learning algorithm is designed specifically for the large streams of data typical of web-scale applications. Its main feature, however, is that it takes advantage of meta-data available for both users and items. This means that what is learnt about one user or item can be transferred to other users or items that share similar meta-data.
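The transfer of knowledge through meta-data can be illustrated with a small sketch. This is not Matchbox's Bayesian learning algorithm: it is a simplified point-estimate model trained by stochastic gradient descent, and the class name, feature encodings, and learning rate are all assumptions made for illustration. Each user and item is described only by a meta-data feature vector, which is mapped linearly to a latent trait vector; the predicted rating is the inner product of the two trait vectors.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(matrix, features):
    """Map a meta-data feature vector to a latent trait vector."""
    return [dot(row, features) for row in matrix]


class FeatureBilinearModel:
    """Hypothetical stand-in for a meta-data-aware recommender (not Matchbox)."""

    def __init__(self, n_user_features, n_item_features, n_traits=2, seed=0):
        rng = random.Random(seed)
        self.U = [[rng.gauss(0, 0.1) for _ in range(n_user_features)]
                  for _ in range(n_traits)]
        self.V = [[rng.gauss(0, 0.1) for _ in range(n_item_features)]
                  for _ in range(n_traits)]

    def predict(self, x, y):
        return dot(project(self.U, x), project(self.V, y))

    def update(self, x, y, rating, lr=0.05):
        """One online gradient step on the squared error for a single rating."""
        s = project(self.U, x)  # user traits
        t = project(self.V, y)  # item traits
        err = rating - dot(s, t)
        for k in range(len(self.U)):
            for i, xi in enumerate(x):
                self.U[k][i] += lr * err * t[k] * xi
            for j, yj in enumerate(y):
                self.V[k][j] += lr * err * s[k] * yj
        return err


# Assumed toy meta-data: users are [is_teen, is_adult], items are [is_action, is_drama].
teen, adult = [1.0, 0.0], [0.0, 1.0]
action, drama = [1.0, 0.0], [0.0, 1.0]

# Ratings observed from existing users (+1 = like, -1 = dislike).
stream = [(teen, action, 1.0), (teen, drama, -1.0),
          (adult, action, -1.0), (adult, drama, 1.0)]

model = FeatureBilinearModel(n_user_features=2, n_item_features=2)
for _ in range(500):
    for x, y, r in stream:
        model.update(x, y, r)

# A brand-new user with no rating history, described by meta-data alone.
new_teen = [1.0, 0.0]
score_action = model.predict(new_teen, action)
score_drama = model.predict(new_teen, drama)
```

Because the parameters are tied to meta-data features rather than to individual user or item identifiers, the new user receives sensible predictions without having rated anything. Processing one rating at a time also mirrors, in a very loose way, the streaming setting described above.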
Matchbox in Project Emporia
Project Emporia is a joint FUSE Labs/Microsoft Research project that uses Matchbox to provide a content-based recommender system for the real-time web. The key idea is to learn the relevance of short content snippets, called tweets, from explicit like/dislike feedback provided by users, with learning bootstrapped from the content of Twitter lists.
Both tweets and users are described by meta-data that allow Matchbox to provide recommendations for new tweets and new users. In addition, the context of a user is described by a set of categories called lenses. The lenses make it possible even for non-registered users to enjoy the benefits of the recommendation system and filter down the public timeline of tweets.
Furthermore, the system combines human and computational intelligence to provide recommended web links, which are aggregated from the recommended tweets. It thus constitutes a search engine for the real-time web based on the wisdom of the crowd.
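One way such link aggregation could work is sketched below. The source does not describe the aggregation rule, so everything here is an assumption: the function name `aggregate_links`, the relevance threshold, and the choice to rank each link by the summed relevance scores of the recommended tweets that mention it.

```python
import re
from collections import defaultdict

URL_PATTERN = re.compile(r"https?://\S+")


def aggregate_links(scored_tweets, threshold=0.5):
    """Rank links by the total score of the recommended tweets that contain them.

    scored_tweets: iterable of (tweet_text, predicted_relevance) pairs.
    Tweets scoring below the (assumed) threshold are not treated as recommended.
    """
    totals = defaultdict(float)
    for text, score in scored_tweets:
        if score < threshold:
            continue
        # Count each link at most once per tweet; trim trailing punctuation.
        urls = {u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(text)}
        for url in urls:
            totals[url] += score
    # Highest total first; ties broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative input with made-up scores and placeholder URLs.
tweets = [
    ("Great read http://example.com/a", 0.9),
    ("Also see http://example.com/a and http://example.com/b", 0.8),
    ("Low relevance http://example.com/c", 0.1),
]
ranked = aggregate_links(tweets)
ranked_urls = [url for url, _ in ranked]
```

In this sketch, a link mentioned by several highly rated tweets rises to the top, which is one simple reading of "wisdom of the crowd"; other weighting schemes would be equally plausible.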
- David Stern, Ralf Herbrich, Thore Graepel, Horst Samulowitz, Luca Pulina, and Armando Tacchella, Collaborative Expert Portfolio Management, in Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10), to appear, July 2010.
- David Stern, Ralf Herbrich, and Thore Graepel, Matchbox: Large Scale Bayesian Recommendations, in Proceedings of the 18th International World Wide Web Conference, 2009.