Our research
1–25 of 178
J.-B. Huang, Q. Cai, Z. Liu, N. Ahuja, and Z. Zhang
Cross-ratio (CR) based methods offer many attractive properties for remote gaze estimation using a single camera in an uncalibrated setup by exploiting invariance of a plane projectivity. Unfortunately, due to several simplifying assumptions, the performance of CR-based eye gaze trackers decays significantly as the subject moves away from the calibration position. In this paper, we introduce an adaptive homography mapping for achieving gaze prediction with higher accuracy at the calibration position and...
Publication details
Date: 1 March 2014
Type: Inproceeding
Publisher: ACM
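The projective invariance that the abstract above relies on can be illustrated with a small sketch. It is not the authors' gaze tracker: it only checks, in one dimension, that the cross-ratio of four collinear points is unchanged by a projective map. The function names and the map coefficients are hypothetical, chosen for illustration.

```python
from fractions import Fraction

def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four distinct collinear points given by 1D coordinates."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

def projective_map(x, p=2, q=1, r=1, s=3):
    """Hypothetical 1D projective map x -> (p*x + q) / (r*x + s); requires p*s - q*r != 0."""
    return (p * x + q) / (r * x + s)

# Exact rational arithmetic avoids floating-point noise in the comparison.
points = [Fraction(n) for n in (0, 1, 2, 3)]
before = cross_ratio(*points)
after = cross_ratio(*(projective_map(x) for x in points))
print(before, after)  # both cross-ratios are equal
```

Because the quantity survives the projection, it can be measured in the camera image without calibrating the camera, which is the property CR-based methods exploit.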
Z. Zhang and Q. Cai
The cross-ratio approach has recently attracted increasing attention in eye-gaze tracking due to its simplicity in setting up a tracking system. Its accuracy, however, is lower than that of the model-based approach, and substantial efforts have been devoted to improving its accuracy. Binocular fixation is essential for humans to have good depth perception, and this paper presents a technique leveraging this constraint. It is used in two ways: First, in estimating jointly the homography matrices for both...
Publication details
Date: 1 March 2014
Type: Inproceeding
Publisher: ACM
Tao Mei, Yong Rui, Shipeng Li, and Qi Tian
The explosive growth and widespread accessibility of community contributed media content on the Internet have led to a surge of research activity in multimedia search. Approaches that apply text search techniques for multimedia search have achieved limited success as they entirely ignore visual content as a ranking signal. Multimedia search re-ranking, which reorders visual documents based on multimodal cues to improve initial text-only searches, has received increasing attention in recent years. Such a...
Publication details
Date: 1 January 2014
Type: Article
Jaesik Park, Sudipta N. Sinha, Yasuyuki Matsushita, Yu-Wing Tai, and In So Kweon
We propose a method for accurate 3D shape reconstruction using uncalibrated multiview photometric stereo. A coarse mesh reconstructed using multiview stereo is first parameterized using a planar mesh parameterization technique. Subsequently, multiview photometric stereo is performed in the 2D parameter domain of the mesh, where all geometric and photometric cues from multiple images can be treated uniformly. Unlike traditional methods, there is no need for merging view-dependent surface normal maps. Our...
Publication details
Date: 3 December 2013
Type: Inproceeding
Publisher: International Conference on Computer Vision
Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun
Publication details
Date: 1 November 2013
Type: Article
Publisher: IEEE Computer Society
Xian-Sheng Hua, Linjun Yang, Jingdong Wang, Jing Wang, Ming Ye, Kuansan Wang, Yong Rui, and Jin Li
The semantic gap between low-level visual features and high-level semantics has been investigated for decades but still remains a big challenge in multimedia. When "search" became one of the most frequently used applications, "intent gap", the gap between query expressions and users' search intents, emerged. Researchers have been focusing on three approaches to bridge the semantic and intent gaps: 1) developing more representative features, 2) exploiting better learning approaches or statistical models to...
Publication details
Date: 21 October 2013
Type: Inproceeding
Publisher: ACM Conference on Multimedia
Wenyuan Yin, Tao Mei, and Chang Wen Chen
The ongoing revolution in media consumption from traditional PCs to the pervasiveness of mobile devices is driving the adoption of social media in our daily lives. More and more people are using their mobile devices to enjoy social media content while on the move. However, mobile display constraints create challenges for presenting and authoring the rich media content on screens with limited display size. This paper presents an innovative system to automatically generate magazine-like social media visual...
Publication details
Date: 1 October 2013
Type: Inproceeding
Publisher: ACM Multimedia
Wu Liu, Tao Mei, Yongdong Zhang, Jintao Li, and Shipeng Li
Mobile video is quickly becoming a mass consumer phenomenon. More and more people are using their smartphones to search and browse video content while on the move. In this paper, we have developed an innovative instant mobile video search system through which users can discover videos by simply pointing their phones at a screen to capture just a few seconds of what they are watching. The system is able to index large-scale video data using a new layered audio-video indexing approach in the cloud, as well...
Publication details
Date: 1 October 2013
Type: Inproceeding
Publisher: ACM Multimedia
Ting Yao, Tao Mei, Chong-Wah Ngo, and Shipeng Li
The problem of tagging is mostly considered from the perspectives of machine learning and data-driven philosophy. A fundamental issue that underlies the success of these approaches is the visual similarity, ranging from the nearest neighbor search to manifold learning, to identify similar instances of an example for tag completion. The need to search millions of visual examples in high-dimensional feature space, however, makes the task computationally expensive. Moreover, the results can suffer from...
Publication details
Date: 1 October 2013
Type: Inproceeding
Publisher: ACM Multimedia
Ivan Tashev
Kinect is a device for human-machine interaction, which adds two more input modalities to the palette of the user interface designer: gestures and speech. Kinect is transforming how people interact with computers, kiosks, and other motion-controlled devices from fun applications like playing a virtual violin, to applications in health care and physical therapy, retail, education, and training. The Kinect for Windows SDK and toolkit contain drivers, tools, APIs, device interfaces, and code samples to...
Publication details
Date: 1 September 2013
Type: Article
Publisher: IEEE
Publication details
Date: 1 September 2013
Type: Inproceeding
Publisher: British Machine Vision Conference (BMVC)
Zicheng Liao, Neel Joshi, and Hugues Hoppe
Given a short video we create a representation that captures a spectrum of looping videos with varying levels of dynamism, ranging from a static image to a highly animated loop. In such a progressively dynamic video, scene liveliness can be adjusted interactively using a slider control. Applications include background images and slideshows, where the desired level of activity may depend on personal taste or mood. The representation also provides a segmentation of the scene into independently looping...
Publication details
Date: 22 July 2013
Type: Article
Publisher: Association for Computing Machinery, Inc.
Dilip Krishnan, Raanan Fattal, and Richard Szeliski
We present a new multi-level preconditioning scheme for discrete Poisson equations that arise in various computer graphics applications such as colorization, edge-preserving decomposition for two-dimensional images, and geodesic distances and diffusion on three-dimensional meshes. Our approach interleaves the selection of fine- and coarse-level variables with the removal of weak connections between potential fine-level variables (sparsification) and the compensation for these changes by strengthening...
Publication details
Date: 1 July 2013
Type: Article
Publisher: ACM SIGGRAPH
Number: 4
Qiang Hao, Rui Cai, Zhiwei Li, Lei Zhang, Yanwei Pang, Feng Wu, and Yong Rui
3D model-based object recognition has been a noticeable research trend in recent years. Common methods find 2D-to-3D correspondences and make recognition decisions by pose estimation, whose efficiency usually suffers from noisy correspondences caused by the increasing number of target objects. To overcome this scalability bottleneck, we propose an efficient 2D-to-3D correspondence filtering approach, which combines a light-weight neighborhood-based step with a finer-grained pairwise step to remove spurious...
Publication details
Date: 25 June 2013
Type: Inproceeding
Publisher: Institute of Electrical and Electronics Engineers, Inc.
Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun
Product quantization is an effective vector quantization approach to compactly encode high-dimensional vectors for fast approximate nearest neighbor (ANN) search. The essence of product quantization is to decompose the original high-dimensional space into the Cartesian product of a finite number of low-dimensional subspaces that are then quantized separately. Optimal space decomposition is important for the performance of ANN search, but still remains unaddressed. In this paper, we optimize product...
Publication details
Date: 1 June 2013
Type: Inproceeding
Publisher: IEEE Computer Society
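The basic product quantization step described in the abstract above can be sketched as follows. This is a minimal illustration, not the optimized decomposition the paper proposes: the vector is split into fixed contiguous subvectors, and the tiny codebooks are hand-picked assumptions rather than learned (in practice they would typically come from k-means on training data). All function names are hypothetical.

```python
def split(vec, m):
    """Split vec into m contiguous subvectors of equal length."""
    d = len(vec) // m
    return [vec[i * d:(i + 1) * d] for i in range(m)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pq_encode(vec, codebooks):
    """Return one nearest-centroid index per subspace."""
    subs = split(vec, len(codebooks))
    return [min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
            for sub, cb in zip(subs, codebooks)]

def pq_decode(code, codebooks):
    """Concatenate the selected centroids to approximate the original vector."""
    out = []
    for j, cb in zip(code, codebooks):
        out.extend(cb[j])
    return out

# Hand-picked codebooks (an assumption): two subspaces, two centroids each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # subspace 0 covers dimensions 0-1
    [[0.0, 1.0], [1.0, 0.0]],  # subspace 1 covers dimensions 2-3
]
code = pq_encode([0.9, 1.2, 0.1, 0.8], codebooks)
approx = pq_decode(code, codebooks)
print(code, approx)
```

A 4-dimensional vector is stored as two small indices instead of four floats. With k centroids per subspace and m subspaces, the scheme can represent k^m distinct reconstructions while storing only m codebooks of size k.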
Alessandro Bergamo, Sudipta Sinha, and Lorenzo Torresani
In this paper we propose a new technique for learning a discriminative codebook for local feature descriptors, specifically designed for scalable landmark classification. The key contribution lies in exploiting the knowledge of correspondences within sets of feature descriptors during codebook learning. Feature correspondences are obtained using structure from motion (SfM) computation on Internet photo collections which serve as the training data. Our codebook is defined by a random forest that is trained...
Publication details
Date: 1 June 2013
Type: Inproceeding
Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun
Publication details
Date: 28 May 2013
Type: Technical report
Number: MSR-TR-2013-59
Ting Yao, Yuan Liu, Chong-Wah Ngo, and Tao Mei
The search for entities is the most common search behavior on the Web, especially in social media communities where entities (such as images, videos, people, locations, and tags) are highly heterogeneous and correlated. While previous research usually deals with these social media entities separately, we are investigating in this paper a unified, multi-level, and correlative entity graph to represent the unstructured social media data, through which various applications (e.g., friend suggestion,...
Publication details
Date: 1 May 2013
Type: Inproceeding
Andrew Cross, Mydhili Bayyapunedi, Edward Cutrell, Anant Agarwal, and William Thies
Recent years have seen enormous growth of online educational videos, spanning K-12 tutorials to university lectures. As this content has grown, so too has the number of presentation styles. Some educators have strong allegiance to handwritten recordings (using pen and tablet), while others use only typed (PowerPoint) presentations. In this paper, we present the first systematic comparison of these two presentation styles and how they are perceived by viewers. Surveys on edX and Mechanical Turk...
Publication details
Date: 29 April 2013
Type: Inproceeding
Christopher Smowton, Jacob R. Lorch, David Molnar, Stefan Saroiu, and Alec Wolman
This paper proposes "seamless customer identification" (SCI), a means to identify physically present customers without any effort on customers' part beyond a one-time opt-in. With SCI, customers need not present cards or operate smartphones to convey their identities. So, stores can provide personalized shopping experiences at any time, not just at check-out. SCI uses two complementary technologies: device detection and face recognition. Device detection identifies customers by detecting their phones...
Publication details
Date: 14 March 2013
Type: Technical report
Number: MSR-TR-2013-31
Ivan Tashev and Malcolm Slaney
Audio signal enhancement often involves the application of a time-varying filter, or suppression rule, to the frequency-domain transform of a corrupted signal. Classic approaches use rules derived under Gaussian models and interpret them as spectral estimators in a Bayesian statistical framework. This mathematical approach provides rules that satisfy certain optimization criteria – maximum likelihood, mean square error, etc. In this paper we propose to learn the suppression rule from a representative...
Publication details
Date: 14 February 2013
Type: Inproceeding
Publisher: University of California - San Diego
Cha Zhang and Dinei Florencio
In this letter, we provide a theoretical analysis of optimal predictive transform coding based on the Gaussian Markov random field (GMRF) model. It is shown that the eigen-analysis of the precision matrix of the GMRF model is optimal in decorrelating the signal. The resulting graph transform degenerates to the well-known 2-D discrete cosine transform (DCT) for a particular 2-D first order GMRF, although it is not a unique optimal solution. Furthermore, we present an optimal scheme to perform predictive...
Publication details
Date: 1 January 2013
Type: Article
Publisher: IEEE
Tao Mei, Lin-Xie Tang, Jinhui Tang, and Xian-Sheng Hua
The ever increasing volume of video content on the Web has created profound challenges for developing efficient indexing and search techniques to manage video data. Conventional techniques such as video compression and summarization strive for the two commonly conflicting goals of low storage and high visual and semantic fidelity. With the goal of balancing both video compression and summarization, this paper presents a novel approach, called "Near-Lossless Semantic Summarization" (NLSS), to summarize a...
Publication details
Date: 1 January 2013
Type: Article
Publisher: ACM
Publication details
Date: 1 January 2013
Type: Article
Kenichi Kumatani, Takayuki Arakawa, Kazumasa Yamamoto, John McDonough, Bhiksha Raj, Rita Singh, and Ivan Tashev
Distant speech recognition (DSR) holds out the promise of providing a natural human-computer interface in that it enables verbal interactions with computers without the necessity of donning intrusive body- or head-mounted devices. Recognizing distant speech robustly, however, remains a challenge. This paper provides an overview of DSR systems based on microphone arrays. In particular, we present recent work on acoustic beamforming for DSR, along with experimental results verifying the effectiveness of the...
Publication details
Date: 5 December 2012
Type: Inproceeding