[ Kentaro Toyama's Home Page | MSR ]
Advances in location-aware devices, such as GPS, cellphones, WiFi-enabled devices, and so on, allow us to easily collect a person’s location history – a record of a subject’s location over an interval of time. (Location histories are often called “tracks” or “breadcrumbs” among GPS enthusiasts.) There is a lot of interesting data, much of it semantically meaningful, that can be mined from a person’s location history. Project Lachesis seeks to model and analyze this data.
Fig. 1. Data from a few months of Kentaro’s location history, collected using a handheld GPS device: (a) line segments connecting adjacent points in the location history; (b) automatically extracted stays marked as dots; (c) automatically extracted destinations marked as circles; and (d) stays and destinations extracted at a much coarser resolution.
In Project Lachesis, we are defining data structures and algorithms for modeling and analyzing location histories. For example, we believe the following entities form the basis for a reasonable set of low-level data structures: stays, destinations, trips, and paths.
We are in the process of rigorously defining these entities algorithmically (see below for publications), such that the definitions are independent of the method of data collection or of geographic representation. In our work, as long as there is a metric function which returns a single real-valued distance between two locations, we can extract stays, destinations, trips, and paths. Note that this allows for location histories that are based on textual placenames, as well as more traditional representations of geography using, for example, latitude and longitude.
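To illustrate why a single distance function is enough, here is a minimal sketch of stay extraction that touches locations only through a caller-supplied metric. It is not the definition from our publications: the thresholds `max_dist` and `min_duration`, the anchor-point rule, and the names `Stay`, `extract_stays`, and `haversine_m` are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class Stay:
    location: Any   # first point of the stay (an arbitrary representative)
    start: float    # timestamp of the first sample in the stay
    end: float      # timestamp of the last sample in the stay


def haversine_m(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def extract_stays(
    history: Sequence[Tuple[float, Any]],
    metric: Callable[[Any, Any], float],
    max_dist: float,
    min_duration: float,
) -> List[Stay]:
    """Hypothetical stay extractor.

    history is a time-sorted sequence of (timestamp, location). A stay is a
    maximal run of samples within max_dist of the run's first sample that
    lasts at least min_duration.
    """
    stays: List[Stay] = []
    i, n = 0, len(history)
    while i < n:
        t0, anchor = history[i]
        j = i + 1
        # Extend the run while samples remain close to the anchor.
        while j < n and metric(anchor, history[j][1]) <= max_dist:
            j += 1
        t_last = history[j - 1][0]
        if t_last - t0 >= min_duration:
            stays.append(Stay(anchor, t0, t_last))
            i = j          # continue after the stay
        else:
            i += 1         # too short: try the next sample as anchor
    return stays
```

The same function works with `haversine_m` on latitude/longitude pairs or with a trivial metric on placenames, for example `lambda a, b: 0.0 if a == b else 1.0`, which is the point of requiring only a metric.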
Once we have these low-level data structures and algorithms for extracting them from raw location histories, we can model the location history using probabilistic models. At this point, we have a model that uses a time-dependent Markov chain to model a subject’s destination.
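As a rough illustration of what a time-dependent Markov chain over destinations can look like, the sketch below estimates transition probabilities that depend on both the current destination and a time bucket, then samples a synthetic sequence from them. The hourly sampling, the bucketing by `t % n_buckets`, the stay-put fallback, and the function names `fit` and `synthesize` are assumptions for the example, not a description of our actual model.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Model = Dict[Tuple[int, Hashable], Dict[Hashable, float]]


def fit(sequence: Sequence[Hashable], n_buckets: int = 24) -> Model:
    """Estimate P(next destination | time bucket, current destination).

    sequence holds one destination index per time step (assumed hourly here).
    """
    counts = defaultdict(Counter)
    for t in range(len(sequence) - 1):
        counts[(t % n_buckets, sequence[t])][sequence[t + 1]] += 1
    model: Model = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        model[key] = {dest: k / total for dest, k in counter.items()}
    return model


def synthesize(
    model: Model,
    start: Hashable,
    n_steps: int,
    n_buckets: int = 24,
    rng: Optional[random.Random] = None,
) -> List[Hashable]:
    """Sample a sequence of n_steps destinations beginning at start."""
    if rng is None:
        rng = random.Random()
    seq = [start]
    for t in range(n_steps - 1):
        dist = model.get((t % n_buckets, seq[-1]))
        if dist is None:
            # Unseen (time, destination) pair: stay put (arbitrary fallback).
            seq.append(seq[-1])
            continue
        dests = list(dist)
        weights = [dist[d] for d in dests]
        seq.append(rng.choices(dests, weights=weights)[0])
    return seq
```

Dropping the dependence on the current destination (keying only on the time bucket) would give a non-Markovian variant of the same sketch.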
Fig. 2. (a) Typical and (b) atypical weeks for one subject. The index of the destination is plotted against time. (The ordering of the destination indices is entirely arbitrary.)
Fig. 3. Plots of synthesized weeks, using a model trained on the same subject’s data: (a) using the non-Markovian model and (b) with Markovian transitions.
There are many possible applications of location histories, once algorithms for analyzing them are developed. These include…
• commute optimization (what’s the best way to work, if I leave at 7:50am?)
• smart appointment scheduling (I’m most likely to be in
• collaborative filtering (if I visit a particular set of restaurants, what are other restaurants which I might also like?)
• location spoofing for privacy (I want to make my location public, except when on vacation, when I want to appear as if I’m living my normal life. Also, analysis of even a week’s worth of location history easily establishes where one lives. Home can be “dithered” upon request so that others can tell I’m home, but not where home is.)
Obviously, privacy issues are a major concern for any work dealing with a person’s geographic location. This is particularly true for location histories, which all but tell the life story of an individual. Although our work is academic in nature, and does not address privacy concerns directly (except where we build applications that allow better preservation of privacy), we believe privacy should be protected securely, and that a person’s location-history data should be collected and transferred on an opt-in basis only.
Data
As part of this project, I have collected nearly two years of continuous GPS data of my whereabouts. This was collected using Garmin eTrex and Geko devices.
I’m throwing all caution to the wind and making this available to anyone who’d like to use it for research purposes. Please e-mail gpsdata(a)microsoft.com stating your name and project goals, and I will send you this data in GPX format. I apologize if my response is slow – if it takes me longer than a month, please ping me again. :) If you end up using this data, we would appreciate it if you could cite our paper below (Hariharan and
Software tools for collecting and viewing this data are available at http://wwmx.org/Downloads.aspx.
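For readers who prefer to load the GPX data directly, the following is a minimal sketch using only the Python standard library. It assumes the file stores samples as `trkpt` elements with `lat` and `lon` attributes and an optional ISO 8601 `time` child, as is usual for GPX track logs; the function name `read_gpx_points` is made up for the example, and the actual files may need adjustments.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import List, Optional, Tuple


def _local_name(tag: str) -> str:
    """Strip an XML namespace such as '{http://...}trkpt' down to 'trkpt'."""
    return tag.rsplit("}", 1)[-1]


def read_gpx_points(gpx_text: str) -> List[Tuple[Optional[datetime], float, float]]:
    """Return (time, lat, lon) for every track point, in document order."""
    root = ET.fromstring(gpx_text)
    points = []
    for elem in root.iter():
        if _local_name(elem.tag) != "trkpt":
            continue
        lat = float(elem.attrib["lat"])
        lon = float(elem.attrib["lon"])
        when = None
        for child in elem:
            if _local_name(child.tag) == "time" and child.text:
                # Map a trailing 'Z' to an explicit UTC offset before parsing.
                when = datetime.fromisoformat(child.text.strip().replace("Z", "+00:00"))
        points.append((when, lat, lon))
    return points
```

Ignoring the namespace keeps the sketch independent of the GPX schema version declared in the file.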
Publications
Hariharan, R.,
Related Work
Ashbrook, D., Starner, T. Learning significant locations and predicting user movement with GPS. In: Billinghurst, M., eds. 6th International Symposium on Wearable Computers (ISWC), 2002, pp. 101-108.
Liao, L., Fox, D., Kautz, H. Learning and inferring transportation routines. In: Proceedings of AAAI-04, 2004. Location history modeling for learning daily routines. Uses an HMM to model transitions between destinations. Won the Outstanding Paper Award at AAAI 2004.
Marmasse, N., Schmandt, C. Location-aware information delivery with ComMotion. In: Thomas, P.J., and Gellersen, H., eds. Handheld and Ubiquitous Computing, Second International Symposium (HUC), 2000, pp. 157-171.