WebSensor (InformationSensor)

With the rapid growth of the web, making sense of web data poses grand challenges: large volume, high velocity, high variety, and unknown veracity. In the physical world, a sensor is a converter that measures a physical quantity and converts it into a signal that can be read by an observer or by an instrument (today, mostly electronic). This project creates a virtual WebSensor layer atop the web.

WebSensor Platform

A WebSensor is a programmable, focused crawler that continuously discovers, extracts, and aggregates structured information about a topic. The WebSensor platform, based on Windows PowerShell and the .NET Framework, makes it easy for developers to create WebSensors that continuously extract information from the web and generate time-series stream data. End users can also easily create WebSensors for everyday use.

The WebSensor platform has many built-in capabilities for extracting and collecting time-sequenced data embedded in websites. These capabilities include:

  • Convenient wrapper generation on webpages (with just a few clicks)
  • Automatic wrapper adaptation to page layout changes
  • Easy to configure and run
  • Easy to extend using a simple scripting language
  • Easy to manage and retrieve the collected data
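
To make the idea of a wrapper more concrete, here is a minimal sketch in Python, not the platform's own PowerShell/.NET implementation. It assumes a wrapper can be reduced to "the text of the first element with a given class", and it stands in for layout adaptation with a simple ordered list of fallback classes; the platform's actual adaptation method is not described here. The class names `follower-count` and `stat-followers`, the sample HTML, and the function names are all hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ClassTextExtractor(HTMLParser):
    """Collects the text of the first element carrying a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0        # > 0 while inside the target element
        self.chunks = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0 and not self.done:
            self.chunks.append(data)


def extract_by_class(html, target_class):
    """Return the stripped text of the first matching element, or None."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    return text or None


def run_wrapper(html, candidate_classes):
    """Try each candidate class in order; return (class_used, text) or None."""
    for cls in candidate_classes:
        text = extract_by_class(html, cls)
        if text is not None:
            return cls, text
    return None


# Hypothetical page before and after a layout change.
old_page = '<div><span class="follower-count">8,903,947</span></div>'
new_page = '<section><strong class="stat-followers">8,903,947</strong><br></section>'
candidates = ["follower-count", "stat-followers"]
```

Calling `run_wrapper(old_page, candidates)` and `run_wrapper(new_page, candidates)` both recover the same text even though the surrounding markup differs, which is the behavior a layout-tolerant wrapper is meant to provide.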

WebSensors can be connected to form a sensor network for more complex analysis tasks that involve multiple time-sequenced data streams.
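
As a rough illustration of what combining sensors could look like, the Python sketch below merges two time series into a derived one. It assumes each series is a list of `(timestamp, value)` pairs sorted by timestamp, and it aligns them by carrying the last observation forward; both the data layout and the alignment rule are assumptions made for this example, not a description of how the platform connects sensors.

```python
from bisect import bisect_right


def value_at(series, t):
    """Latest value observed at or before time t, or None if there is none."""
    times = [ts for ts, _ in series]
    i = bisect_right(times, t)
    return series[i - 1][1] if i > 0 else None


def combine(series_a, series_b, fn):
    """Apply fn to the two series at every timestamp where both have a value."""
    timestamps = sorted({ts for ts, _ in series_a} | {ts for ts, _ in series_b})
    combined = []
    for t in timestamps:
        a = value_at(series_a, t)
        b = value_at(series_b, t)
        if a is not None and b is not None:
            combined.append((t, fn(a, b)))
    return combined


# Illustrative values only; timestamps are plain integers for brevity.
sensor_a = [(1, 10), (3, 12)]
sensor_b = [(2, 5), (4, 6)]
difference = combine(sensor_a, sensor_b, lambda a, b: a - b)
# difference == [(2, 5), (3, 7), (4, 6)]
```

Timestamp 1 is dropped because the second sensor has no reading yet; from then on each new reading from either sensor produces a new combined point.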

Examples

Tracking the count of Bill Gates' followers on twitter.com

Tracking Bill Gates' follower count is very easy: just click on the current follower count (8,903,947 in the following snapshot). A time series is then generated and kept up to date.

The original Bill Gates Twitter page

The time series output by the sensor that tracks Bill Gates' follower count
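
The step from a clicked value to a time series can be sketched as follows. This is a minimal Python illustration under stated assumptions: the extracted text is a fully written-out number such as "8,903,947" (abbreviated forms like "8.9M" are not handled), and the `FollowerSeries` class, its `record` method, and the timestamp used are hypothetical stand-ins rather than part of the platform.

```python
import re
from datetime import datetime, timezone


def parse_count(text):
    """Convert display text such as '8,903,947' into an integer."""
    digits = re.sub(r"\D", "", text)
    if not digits:
        raise ValueError(f"no digits found in {text!r}")
    return int(digits)


class FollowerSeries:
    """Hypothetical in-memory time series of (timestamp, count) points."""

    def __init__(self):
        self.points = []

    def record(self, raw_text, when=None):
        # Use the supplied timestamp, or the current UTC time if none is given.
        when = when if when is not None else datetime.now(timezone.utc)
        point = (when, parse_count(raw_text))
        self.points.append(point)
        return point


series = FollowerSeries()
# Arbitrary timestamp chosen for illustration only.
series.record("8,903,947", datetime(2013, 1, 1, tzinfo=timezone.utc))
```

Each later poll of the page would call `record` again with the newly extracted text, so the list of points grows into the kind of time series the sensor outputs.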

Tracking product price

Price of Microsoft Surface (32GB), on http://www.amazon.com/gp/product/B009XNBFJK/

Predicting the Results of the 2012 Presidential Election

We used WebSensors to help predict the results of the 2012 Presidential Election. Please check here for details.