Twahpic: Twitter topic modeling

"Twahpic" shows what tweets on Twitter™ are about in terms of both topics (like sports, politics, Internet, etc.) and the axes of Substance, Social, Status, and Style. Twahpic uses Partially Labeled Dirichlet Allocation (PLDA) to identify 200 topics used on Twitter.

30 July 2013: The demo is currently unavailable. However, you can see some sample output from real queries.

  • Status: tweets for this query had many terms related to status updates (going places)
  • Substance: tweets for these queries had many terms related to specific entities and events (making money, news)
  • Style: this query's tweets use particular linguistic style conventions (example)
  • Social: this query's tweets are all about "where the party's at" (example)
  • Mixture: this query has an even mix of substance (sports, particularly) and style (football)
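
As a rough illustration of how labels like those above could arise, the sketch below sums per-topic weights for a query into shares for the four axes and names the dominant axis, or "Mixture" when none dominates. It is not Twahpic's actual code: the topic names, the topic-to-axis mapping, the weights, and the 0.6 dominance threshold are all hypothetical, and a real PLDA model would supply the per-topic weights.

```python
from collections import defaultdict

AXES = ("Substance", "Social", "Status", "Style")


def axis_mixture(topic_weights, topic_axis):
    """Sum per-topic weights into per-axis shares that total 1.0."""
    totals = defaultdict(float)
    for topic, weight in topic_weights.items():
        totals[topic_axis[topic]] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {axis: 0.0 for axis in AXES}
    return {axis: totals[axis] / grand_total for axis in AXES}


def summarize(mixture, dominance=0.6):
    """Return the dominant axis, or "Mixture" if no axis reaches the threshold."""
    top_axis = max(mixture, key=mixture.get)
    return top_axis if mixture[top_axis] >= dominance else "Mixture"


# Hypothetical mapping from a handful of topics to the four axes.
topic_axis = {
    "sports": "Substance",
    "politics": "Substance",
    "slang": "Style",
    "greetings": "Social",
    "travel": "Status",
}

# Hypothetical per-topic weights for two queries.
football = {"sports": 0.45, "slang": 0.45, "greetings": 0.10}
going_places = {"travel": 0.8, "sports": 0.2}

football_label = summarize(axis_mixture(football, topic_axis))
going_places_label = summarize(axis_mixture(going_places, topic_axis))
print(football_label, going_places_label)
```

With these made-up weights, the football query splits evenly between Substance and Style and so falls under the threshold, while the travel-heavy query is dominated by Status.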

Screenshot of Twahpic interface