|
8:00-8:30 |
Breakfast & registration |
|
8:30-8:35 |
Introduction |
|
8:35-9:30 |
Invited paper
Continuous Lifelong Capture of Personal Experience Using Eyetap
Steve Mann, Anurag Sehgal, James Fung
Humans feel a growing need to engage with their computational world as much as they engage with the real physical world. Following this need to always stay connected, computers are becoming more and more wearable. Because sight and sound are the two primary senses involved in human-computer interaction, we need an interface that makes this everyday interaction possible, easy, and intuitive. We introduce an experience capturing system known as an Eyetap. Eyetap devices cause the eye to, in effect, function as if it were both a camera and display, by mapping an effective camera and display inside the eye. This paper will discuss the evolution of the technology over the last 30 years, together with some new designs. We also discuss some of the applications that immediately arise from the use of such technology and the new practices that Eyetap technology makes possible. The paper is divided into two sections: (1) capture/sensors and experiential sampling, and (2) applications, security, and privacy. |
|
9:30-10:30 |
Session 1
Efficient Retrieval of Life Log Based on Context and Content
Kiyoharu Aizawa, Datchakorn Tancharoen, Shinya Kawasaki, Toshihiko Yamasaki
In this paper, we present continuous capture of our life log with
various sensors and additional data, and propose effective retrieval
using context and content based on them. Our life log system contains
video, audio, acceleration sensor, gyro, GPS, annotations, documents,
web pages, and emails. In our previous studies, we showed our retrieval
methodology, which mainly depends on context information from the sensor
data. In this paper, we extend our methodology with two additional
functions: (1) spatio-temporal sampling for extraction of key frames for
summarization and (2) conversation scene detection. In the
spatio-temporal sampling, key frames for the summarization are extracted
using time and location data (GPS). Because our life log captures dense
location data, we can also make use of derivatives of location data,
that is, the speed and acceleration of the person's movement. The
summarized key frames are made using them. We also introduce content
analysis in the form of conversation scene detection. Our previous work
investigated context-based retrieval, which differs from the majority of
work in image/video retrieval, where the focus is on content-based
retrieval. In this paper, we introduce visual and audio content analysis
for conversation scene detection. Detected conversation scenes will
serve as very important tags for retrieval of our life log data. We
describe our present system and the additional functions, and give
preliminary results for the additional functions.
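As a rough illustration of the spatio-temporal sampling idea in the abstract above, the following Python sketch derives speed and acceleration from timestamped GPS fixes and picks key-frame times whenever enough distance has been covered or the speed changes sharply. It is not the authors' method: the names `GpsFix`, `haversine_m`, and `select_key_times`, and both thresholds, are assumptions made only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class GpsFix:
    t: float    # timestamp in seconds
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def haversine_m(a: GpsFix, b: GpsFix) -> float:
    """Great-circle distance between two fixes in metres."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def select_key_times(fixes, min_dist_m=50.0, accel_thresh=1.0):
    """Return timestamps at which a key frame would be sampled.

    A key time is emitted when the distance travelled since the last key
    reaches min_dist_m, or when |acceleration| reaches accel_thresh (m/s^2).
    Both thresholds are hypothetical values chosen for illustration.
    """
    if not fixes:
        return []
    keys = [fixes[0].t]
    travelled = 0.0
    prev_speed = None
    for a, b in zip(fixes, fixes[1:]):
        dt = b.t - a.t
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        d = haversine_m(a, b)
        speed = d / dt  # first derivative of location
        # Second derivative of location; undefined for the first interval.
        accel = 0.0 if prev_speed is None else (speed - prev_speed) / dt
        travelled += d
        if travelled >= min_dist_m or abs(accel) >= accel_thresh:
            keys.append(b.t)
            travelled = 0.0
        prev_speed = speed
    return keys
```

The returned timestamps could then be used to pull the nearest video frames; a stationary wearer yields only the initial key time, so long idle periods collapse to a single frame.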
A Layered Interpretation of Human Interaction Captured by Ubiquitous Sensors
Masashi Takahashi, Sadanori Ito, Yasuyuki Sumi, Megumu Tsuchikawa, Kiyoshi Kogure, Kenji Mase, and Toyoaki Nishida
We are developing a machine-readable interaction corpus, a collection of
human interaction data captured by multiple sensors, in order to use it
in recording various episodes in our daily life. To develop such a
corpus, we have prototyped ubiquitous/wearable sensor systems that
collaboratively capture human interactions from multiple points of view
in a poster exhibition site. An infrared ID system, which recognizes the
presence of persons or objects, enables us to estimate the user's state
of gazing at a particular person/object or of staying at a particular
place. A throat microphone, which detects the volume of the user's
voice, tells us whether he or she is making an utterance. The purpose of
this study is to interpret human interaction patterns automatically from
the data of various sensors and to give the interaction corpus
machine-readable indices to make it more useful. In this paper, we
propose a layered model for these interpretations based on a bottom-up
approach to using sensors. In this model, interpretations of human
interactions are hierarchically abstracted so that each layer has unique
semantic/syntactic information represented by machine-readable indices.
This layered model enables us to use various sensors and to refer to
various indices according to the purpose. Furthermore, we assume that we
can apply this model to multiple domains because of its hierarchical
structure. These indices enable us to use various sources such as images
and audio more efficiently, for example, to search for significant
scenes. Accordingly, we introduce various applications that adaptively
utilize such interaction indices at an exhibition site in order to
demonstrate their effectiveness in a corpus. Finally, we introduce our
further attempts to capture human interactions in another domain, a
meeting situation, with the same system in order to assess its
versatility.
|
|
10:30-11:00 |
Coffee break & Demos |
|
11:00-12:00 |
Session 2
Minimal-Impact Audio-Based Personal Archives
Daniel P.W. Ellis & Keansub Lee
Collecting and storing continuous personal
archives has become cheap and easy, but we are still far from creating a
useful, ubiquitous memory aid. We view the inconvenience to the user of
being "instrumented" as one of the key barriers to the broader
development and adoption of these technologies. Audio-only recordings,
however, can have minimal impact, requiring only that a device the size
and weight of a cellphone be carried somewhere on the person. We have
conducted some small-scale experiments on collecting continuous personal
recordings of this kind and investigated how they can be automatically
analyzed and indexed, visualized, and correlated with other
minimal-impact, opportunistic data feeds (such as online calendars and
digital photo collections). We describe our unsupervised segmentation
and clustering experiments in which we can achieve good agreement with
hand-marked environment/situation labels. We also discuss some of the
broader issues raised by this kind of work including privacy concerns,
and describe our future plans to address these and other questions.
Passive Capture and Ensuing Issues for a Personal Lifetime Store
Jim Gemmell, Lyndsay Williams, Ken Wood, Gordon Bell and Roger Lueder
Passive capture lets people record their experiences without having to
operate recording equipment, and without even having to give recording
conscious thought. The advantages are increased capture, and improved
participation in the event itself. However, passive capture also
presents many new challenges. One key challenge is how to deal with the
increased volume of media for retrieval, browsing, and organizing. This
paper describes the SenseCam device, which combines a camera with a
number of sensors in a pendant worn around the neck. Data from SenseCam
is uploaded into a MyLifeBits repository, where a number of features,
but especially correlation and relationships, are used to manage the
data.
|
|
12:00-1:30 |
Lunch & Demos |
|
1:30-2:30 |
Session 3
Personal Chronicling Tools for Enhancing Information Archival and Collaboration in Enterprise
Pilho Kim, Mark Podlaseck, and Gopal Pingali
One of the greatest challenges in enterprises today is the lack of dynamic and ongoing information about individuals' activities, interests, and expertise. Availability of such personal chronicles can provide rich benefits at both an individual and enterprise level. For example, personal chronicles can help individuals to far more effectively retrieve and review their activities and interactions, while at an enterprise level they can be data-mined to identify groups of common and complementary interests and skills, or to identify implicit work processes that are commonplace in every enterprise. Today's tools are very limited in their support for dynamic capture of ongoing activities, in the organization and presentation of captured information, and in supporting rich annotation, search, retrieval, and publication of this information. In this paper, we propose a set of Personal Chronicling Tools (PCT) to support enterprise knowledge workers in digital event archiving and collaboration-oriented publishing. PCT is composed of four primary tools with the following capabilities: (1) event monitoring, (2) interactive annotation, (3) browse/search, and (4) edit/publish. All are designed to exploit existing enterprise infrastructure, storing captured raw data and metadata in secure databases. The first tool is a group of event monitors. These run on user client devices and capture user events such as emails, web pages browsed, instant messaging sessions, and documents edited. Monitors for new event classes are easily added as plug-ins through an XML interface. The second tool, the event annotator, enables context-sensitive user tagging and bookmarking of interesting moments. The third is an event browser which extends corporate email tools, providing semantic search (by embedding WordNet as a common dictionary) and the ability to follow threads of many kinds. Finally, a publishing tool facilitates the publication of relevant events with a fraction of the effort required to maintain a manual chronicle such as a weblog. This paper presents the overall system architecture, a prototype implementation, and preliminary results from field studies.
Uniscript: a model for persistent and incremental knowledge storage
Adorjan Kiss, Joel Quinqueton
We present in this paper a model of personal knowledge representation for lifetime storage. In the model we separate the knowledge layer from the resource layer. The knowledge layer consists of a network of atomic knowledge units situated in space and time. Resources are data packages (bit sequences) that can be rendered by some device into any human-perceivable form. The two parts complement each other: the knowledge network can be seen as annotations of the resource base (multimedia store), while resources can serve as a means of interpreting knowledge units as well as a way to index and access them. For the knowledge network we propose a simple formalism that we believe could support the emergence of a language capable of describing increasingly complex situations of the real world and, over time, of representing any information that is expressible in natural language. |
|
2:30-3:30 |
Session 4
Memory Cues for Meeting Video Retrieval
Alejandro Jaimes, Kengo Omura, Takeshi Nagamine, Kazutaka Hirata
We advocate a new approach to meeting video retrieval based on the use
of memory cues. First we present a new survey involving 519 people in
which we investigate the types of items people use to review meeting
contents (e.g., minutes and video). Then we present a novel memory study
involving 15 subjects in which we investigate what people remember about
past meetings (e.g., seating position). Based on these studies and
related research we propose a novel framework for meeting video
retrieval based on memory cues. Our proposed system graphically
represents important memory retrieval cues such as room layout and
participants' faces and sitting positions. Queries are formulated
dynamically: as the user graphically manipulates the cues, the query
results are shown. Our system (1) helps users easily express the cues
they recall about a particular meeting and (2) helps users remember new
cues for meeting video retrieval. Finally, we present our approach to
automatic indexing of meeting videos.
Total Recall: Are Privacy Changes Inevitable? (a position paper)
William Cheng, Leana Golubchik, David Kay
Total Recall is a system that records a personal version of the world using personal sensors such as a microphone array in a pair of glasses or a camera in a necklace. Total Recall has many applications, in areas such as health care and education, that can significantly improve people's quality of life. However, data recorded by such a system may also be used by the legal system. Hence, pervasive use of such a system will likely change our social structure, as there may be no question in the future as to who said what or who did what. It is natural then that privacy advocates might consider such technology dangerous, because such data can be misused or abused by law enforcement. In this paper, we discuss privacy concerns in the context of systems like Total Recall and propose a solution that may alleviate some of these concerns. We discuss the ramifications of this solution and its possible implementations. |
|
3:30-4:00 |
Coffee Break & Demos |
|
4:00-4:30 |
Panel
What is the future of CARPE - continuous or just lifelong? What are the commercial prospects?
Brian Clarkson, Abigail Sellen & Ivo Stivoric |
|
4:30-5:00 |
The final word
Gordon Bell |
Demonstrations
Experience Sharing by Retrieving Captured Conversations using Non-Verbal Features
Christof Mueller, Yasuyuki Sumi, Kenji Mase
We present a system that retrieves the voice part of human
communications captured by our collaborative experience capturing system. For
segmenting, interpreting, and retrieving past conversation scenes from a huge
amount of captured data, the system focuses on the non-verbal aspects, i.e. the
contextual information captured by ubiquitous sensors, rather than the verbal
(semantic) aspects of the data. The retrieved communications are presented to
other people who are in situations similar to those of the communicators. This
experience sharing enables people to gain more information about their situation
or surroundings. The system's current domain is a poster exhibition at an
academic conference, where the system provides a visitor with additional
information about the exhibited posters.
Paper (PDF 0.3 MB)
PWS & PHA: Posture Web Server and Posture History Archiver
Yasuhiko Ooe, Kentaro Yamasaki, Tsukasa Noma
Our physical experiences are best represented by body movements. Many limitations of existing systems/devices, however, prevent their use in archival of daily experiences. This paper proposes an integrated system composed of PWS (Posture Web Server) and PHA (Posture History Archiver). The PWS has a palm-size controller and 15 light-weight tilt sensor devices, newly developed by us. The distinguishing feature of our tilt device is that it measures inclinations over a full 360 degrees in two directions. The PWS is worn by a user and continuously monitors his/her body posture. It acts as a posture web server, that is, it sends his/her current postural data upon request via a wireless network. The PHA, running on a PC, sends requests to the PWS periodically via a network and then archives a time series of his/her postures, called the posture history. All or part of the history can be visualized depending on his/her preferences. In this paper, system design issues, development of tilt sensor devices, implementation, and our experimental results are described.
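The request/archive loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `PostureHistoryArchiver`, `poll_once`, `window`, and `fake_pws` are hypothetical names, the PWS request is replaced by an injected callable so the example runs without a network, and each of the 15 tilt sensors is assumed to report one pair of inclination angles in degrees.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Assumed data shape: one (angle_1, angle_2) pair in degrees per tilt sensor.
Posture = List[Tuple[float, float]]


@dataclass
class PostureHistoryArchiver:
    # Hypothetical stand-in for a network request to the posture web server.
    fetch: Callable[[], Posture]
    # Posture history: a time series of (timestamp, posture) samples.
    history: List[Tuple[float, Posture]] = field(default_factory=list)

    def poll_once(self, now: float) -> None:
        """Request the current posture and append it to the history."""
        self.history.append((now, self.fetch()))

    def window(self, start: float, end: float) -> List[Tuple[float, Posture]]:
        """Return the part of the history with start <= timestamp <= end."""
        return [(t, p) for t, p in self.history if start <= t <= end]


def fake_pws() -> Posture:
    """Simulated server reply: 15 sensors, all reporting an upright pose."""
    return [(0.0, 0.0)] * 15


archiver = PostureHistoryArchiver(fetch=fake_pws)
for second in range(5):  # stands in for a periodic timer firing once per second
    archiver.poll_once(float(second))
```

Passing the request function in as a parameter keeps the archiving logic independent of the transport, so a real request over the wireless network could be substituted later; `archiver.window(1.0, 3.0)` then returns the slice of the history that a viewer would visualize.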
Augmenting and Sharing Memory with eyeBlog
Connor Dickie, Roel Vertegaal, Daniel Chen, David Fono, Daniel Cheng, Chenguk Sohn
eyeBlog is an automatic personal video recording system. It consists
of ECSGlasses [XX], a pair of glasses augmented with a wireless eye-contact and
glyph sensing camera, and a web application that visualizes the video from the
ECSGlasses camera as chronologically delineated blog entries. The blog format
allows for easy annotation, grading, cataloging and searching of video segments
by the wearer or anyone else with Internet access. eyeBlog reduces the editing
effort of video bloggers by recording video only when something of interest is
registered by the camera. Interest is determined by a combination of independent
methods. For example, recording can be triggered upon detection of eye contact
towards the wearer of the glasses, allowing all face-to-face interactions to be
recorded. Recording can also be triggered by the detection of image patterns
such as glyphs in the frame of the camera. This allows the wearer to record
their interactions with any object that has an associated marker. Finally, by
pressing a button the user can initiate recording manually.
Paper (PDF 2.6 MB)
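One way to picture the trigger logic in the eyeBlog abstract is the small Python sketch below, which turns per-frame trigger flags into recorded segments. It assumes the independent methods are combined with a simple logical OR and that the flags arrive as dictionaries keyed `eye_contact`, `glyph`, and `button`; the function name and that data layout are hypothetical and are not taken from the eyeBlog system.

```python
def recording_segments(frames):
    """Return (start, end) frame-index ranges during which recording is on.

    `frames` is a list of dicts holding optional boolean flags. Recording is
    assumed to be active whenever at least one trigger fires (logical OR).
    The `end` index is exclusive.
    """
    segments = []
    start = None  # index where the current segment began, if any
    for i, flags in enumerate(frames):
        active = (
            flags.get("eye_contact", False)
            or flags.get("glyph", False)
            or flags.get("button", False)
        )
        if active and start is None:
            start = i  # a trigger fired: begin a new segment
        elif not active and start is not None:
            segments.append((start, i))  # all triggers off: close the segment
            start = None
    if start is not None:
        segments.append((start, len(frames)))  # still recording at the end
    return segments
```

Each resulting segment would correspond to one candidate blog entry, for example a face-to-face interaction detected through eye contact followed by a marked object coming into view.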
Industrial Demonstrations
Deja View Camwear Model 100
Sid Reich, Les Goldberg, Stephen Hudek
Deja View Camwear Model 100 is the first in a family of wearable camcorders designed to free the user from being shackled to his viewfinder. The Model 100 is designed to ensure that the user never misses that important tidbit. While the initial use is for active lifestyles, we are exploring its use in Security, Military, Training, Automotive and sundry other vertical markets.
Quindi Meeting Capture
Stan Rosenschein
In this paper, we describe the Quindi Meeting Companion, a
personal software tool for documenting content-rich meetings. We examine the
principal motivations for the system, key design decisions, and new practices
enabled by the technology.
Paper (PDF 0.1 MB)
The BodyMedia Platform: Continuous Body Intelligence
Astro Teller & John Stivoric
Only by making computing intimate to the body can products begin to know our states of mind, our contexts, our states of health, etc. and respond (or have other aspects of the world respond) in intelligent ways. Sympathetic products, driven by computers worn on the body, are coming and the industry will grow up with wearable body monitoring at its core. BodyMedia, Inc. has been building toward this vision since 1999 and today has a commercially available, clinically tested, consumer-acceptable, environmentally hardened body monitoring platform. This platform is available today or will soon be available in healthcare markets as well as tangential areas including safety, security, entertainment, and affective computing. This brief document will highlight the current state of the platform and some of its on-going evolution.