Social Networks and Streams

The ‘Social Web’ is currently undergoing a revolution: the number of users of social networking sites such as Twitter and Facebook is growing exponentially, and people are discovering new ways of using these platforms to communicate with each other. Our goal is to leverage machine learning techniques to enable entirely new online social experiences. We believe that combining the areas of knowledge representation & reasoning, machine learning, mechanism design and information retrieval is pivotal to intelligent Social Web scenarios. Our agenda is motivated by a number of application scenarios that are likely to play central roles in the web of the future, including:

  • automatic planning systems for “personal assistant” services
  • automatic question answering and context-aware information services
  • recommendation systems based on understanding of user intent and patterns of social interaction

Our current projects are listed below.

Personalised Streams

Public streams of messages from social networks such as Twitter and from weblogs contain masses of valuable information, but much of it is lost because there is also a great deal of irrelevant content and users do not have time to sift through everything.

Social networks provide one way of filtering public streams through the social graph – in the case of Twitter, this involves subscribing to feeds from friends. The social graph traditionally acts as a routing and distribution mechanism for messages flowing through the graph of real-time web information. We propose a complementary filtering and routing mechanism based on the patterns of preference that users show for messages.

Machine intelligence can be applied to produce much more powerful personalised filters on these massive flows of information. It is possible to learn about a user’s tastes by observing the way they interact with messages and with their friends. Sources of such ‘implicit’ feedback include what the user reads, what they write and which feeds they subscribe to. A second aspect of our work is to develop user experiences which allow the user to give explicit feedback about the content they like, so that we can generate more interesting streams for them in the future.
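As a rough illustration of learning a personalised filter from implicit feedback, the Python sketch below trains a small online logistic regression on (author, text, engaged) observations and uses it to rank new messages. It is not the model used in this project: the class name ImplicitFeedbackFilter, the bag-of-words and author features, the learning rate and the example messages are all assumptions made for the example.

```python
import math


class ImplicitFeedbackFilter:
    """Online logistic regression over author and bag-of-words features.

    Hypothetical illustration only; feature choice and update rule are assumptions.
    """

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight, missing features count as 0

    def _features(self, author, text):
        feats = {"bias": 1.0, "author=" + author: 1.0}
        for word in set(text.lower().split()):
            feats["word=" + word] = 1.0
        return feats

    def score(self, author, text):
        """Estimated probability that the user engages with the message."""
        feats = self._features(author, text)
        z = sum(self.weights.get(name, 0.0) * value for name, value in feats.items())
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def observe(self, author, text, engaged):
        """One gradient step from a single implicit signal (read / ignored)."""
        error = (1.0 if engaged else 0.0) - self.score(author, text)
        for name, value in self._features(author, text).items():
            self.weights[name] = (
                self.weights.get(name, 0.0) + self.learning_rate * error * value
            )

    def rank(self, messages):
        """Sort (author, text) pairs from most to least likely to interest the user."""
        return sorted(messages, key=lambda m: self.score(m[0], m[1]), reverse=True)


# Made-up interaction log: the user reads alice's posts and skips bob's.
filt = ImplicitFeedbackFilter()
for _ in range(20):
    filt.observe("alice", "machine learning talk", engaged=True)
    filt.observe("bob", "football match tonight", engaged=False)

ranked = filt.rank([
    ("bob", "football results"),
    ("alice", "new machine learning paper"),
])
```

Explicit feedback (for example a like or dismiss button) could be fed through the same observe call, possibly with a larger learning rate, since it is a stronger signal than a passive read.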

Probabilistic Databases

The challenge of designing new user experiences for the Social Web derives not only from the sheer scale of the web, but also from the uncertainty associated with web-extracted information and from the potentially adversarial nature of user behaviour. Representing the uncertain knowledge about user intention and attributes such as trust, authority and relevance will necessitate combining traditional data representation frameworks, such as relational databases, with modern probabilistic graphical models. Progress in this area will rely on insights into the interplay of computer languages and probabilistic inference.

Probabilistic databases are an extension of today’s frameworks for storing and querying data collections – relational databases – with the concept of uncertainty. Reasoning within probabilistic databases requires the application of new approximate inference methods that fully exploit the relational structure of the database with a focus on the crucial trade-off between approximation quality and computational complexity. Additional challenges are posed by distributed storage and parallel inference.
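To make the idea of querying uncertain data concrete, the Python sketch below models a tiny relation in which each row exists with an independent probability, and answers one query both exactly and by sampling possible worlds. It is a toy under stated assumptions, not PQL and not the inference method used in this project: the RECOMMENDS table, its probabilities, the independence assumption and both function names are invented for the example.

```python
import random

# Hypothetical uncertain relation: (user, item, probability the row is true).
# Rows are assumed to be independent of one another.
RECOMMENDS = [
    ("alice", "article-1", 0.9),
    ("bob", "article-1", 0.5),
    ("carol", "article-2", 0.2),
]


def exact_probability(rows, item):
    """P(at least one recommendation for `item` exists), using row independence."""
    p_none = 1.0
    for _, row_item, p in rows:
        if row_item == item:
            p_none *= 1.0 - p
    return 1.0 - p_none


def sampled_probability(rows, item, num_worlds, seed=0):
    """Monte Carlo estimate: sample possible worlds, count those satisfying the query."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_worlds):
        # A possible world keeps each row with its own probability.
        world = [(user, row_item) for user, row_item, p in rows if rng.random() < p]
        if any(row_item == item for _, row_item in world):
            hits += 1
    return hits / num_worlds


exact = exact_probability(RECOMMENDS, "article-1")
estimate = sampled_probability(RECOMMENDS, "article-1", num_worlds=20000)
```

The number of sampled worlds is the knob that trades approximation quality against computation: more worlds give a tighter estimate at higher cost. For queries where no closed form like the one in exact_probability is available, some form of approximate inference is the only practical option.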

We have developed a new declarative modelling language called PQL (Probabilistic Query Language), which resembles and extends SQL (Structured Query Language). Key features of PQL include:

  • Succinct description of probabilistic models on top of data residing in relational databases.
  • Execution of approximate inference in order to answer queries against uncertain attributes.

We have developed prototype implementations of PQL on both a distributed architecture using DryadLINQ and a centralised architecture using SQL Server.

David Stern

Ralf Herbrich