Ceren Budak, Anitha Kannan, Rakesh Agrawal, and Jan Pedersen
We address the problem of inferring users' interests from microblogging sites such as Twitter, based on their utterances and interactions in the social network. Inferring user interests is important for systems such as search and recommendation engines, which can then provide information better attuned to their users' preferences. In this paper, we propose a probabilistic generative model of user utterances that encapsulates both user and network information. This model captures the complex interactions among a user's varied interests, the user's level of activity in the network, and the information propagation from the user's neighbors. As exact probabilistic inference in this model is intractable, we propose an online variational inference algorithm that also takes into account the evolving social graph as well as the interests of the user and of the user's neighbors. We prove the optimality of the online inference with respect to an equivalent batch update. We present experimental results on actual Twitter users that validate our approach. We also present extensive results showing the inadequacy of the Mechanical Turk platform for large-scale validation.