﻿<?xml version="1.0" encoding="utf-8" standalone="no"?>
<rss version="2.0" xmlns:live="http://live.com/schema/media/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:dcterms="http://purl.org/dc/terms/">
  <channel>
    <title>Microsoft Research Lectures</title>
    <link>http://research.microsoft.com/apps/dp/vi/videos.aspx</link>
    <description>Watch the latest lectures from Microsoft Research</description>
    <copyright>© 2012 Microsoft Corporation. All rights reserved.</copyright>
    <language>en-US</language>
    <lastBuildDate>Tue, 15 May 2012 22:24:04 GMT</lastBuildDate>
    <item>
      <title>Generate-and-Test Models for Machine Translation</title>
      <description>[Speaker: Chris Dyer] I discuss translation as an optimization problem subject to three kinds of constraints: lexical, configurational, and constraints enforcing target-language wellformedness. Lexical constraints ensure that the lexical choices in the output are meaning-preserving; configurational constraints ensure that the relationships between source words and phrases (e.g., semantic roles and modifier-head relationships) are properly transformed in translation; and target-language wellformedness constraints ensure the grammaticality of the output. The constraint-based framework suggests a generate-and-test (discriminative) model of translation in which features sensitive to input and output structures are engineered by language and translation experts, and the feature weights are trained to maximize the conditional likelihood of a corpus of example translations. The specified features represent empirical hypotheses about what correlates (but not why) and thus encode domain-specific knowledge; the learned weights indicate to what extent these hypotheses are confirmed or refuted. To verify the usefulness of the feature-based approach, I discuss the performance of two models: first, a lexical translation model evaluated by the word alignments it learns. Unlike previous unsupervised alignment models, the new model utilizes features that capture diverse lexical and alignment relationships, including morphological relatedness, orthographic similarity, and conventional co-occurrence statistics. Results from typologically diverse language pairs demonstrate that the generate-and-test model provides substantial performance benefits compared to state-of-the-art generative baselines. Second, I discuss the results of an end-to-end translation model in which lexical, configurational, and wellformedness constraints are modeled explicitly. 
This model is substantially more compact than state-of-the-art translation models, but still performs significantly better on languages where source-target word order differences are substantial. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=164254</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/164254/164254.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4861" lang="en" fileSize="999058333" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/164254/i/large.jpg" height="240" width="320" />
      <media:keywords>Chris Dyer</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Mon, 14 May 2012 17:30:00 GMT</pubDate>
    </item>
    <item>
      <title>NW-NLP 2012 Afternoon Talks</title>
      <description>[Speakers: Emily Prud'hommeaux, Congle Zhang, and Max Whitney] 3:30 Graph-based alignment of narratives for automated neurological assessment Emily Prud'hommeaux and Brian Roark 3:50 Ontological Smoothing for Relation Extraction with Minimal Supervision Congle Zhang, Raphael Hoffmann and Daniel Weld 4:10 Bootstrapping via Graph Propagation Max Whitney and Anoop Sarkar </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=164255</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/164255/164255.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="3413" lang="en" fileSize="648025885" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/164255/i/large.jpg" height="240" width="320" />
      <media:keywords>Emily Prud'hommeaux; Congle Zhang; Max Whitney</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Fri, 11 May 2012 22:30:00 GMT</pubDate>
    </item>
    <item>
      <title>NW-NLP 2012 Morning Talks</title>
      <description>[Speakers: Matt Hohensee, Anthony Stark, Shafiq Joty, and Ryan Georgi] 11:00 Getting More from Morphology in Multilingual Dependency Parsing Matt Hohensee and Emily M. Bender 11:20 Hello, Who is Calling?: Can Words Reveal the Social Nature of Conversations? Anthony Stark, Izhak Shafran and Jeffrey Kaye 11:40 A Novel Discriminative Framework for Sentence-Level Discourse Analysis Shafiq Joty, Giuseppe Carenini and Raymond Ng 12:00 Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms Ryan Georgi, Fei Xia and William Lewis </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=164256</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/164256/164256.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4795" lang="en" fileSize="912938177" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/164256/i/large.jpg" height="240" width="320" />
      <media:keywords>Matt Hohensee; Anthony Stark; Shafiq Joty; Ryan Georgi</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Fri, 11 May 2012 18:00:00 GMT</pubDate>
    </item>
    <item>
      <title>NW-NLP 2012 Welcome and Introduction</title>
      <description>[Speakers: Will Lewis and Luke Zettlemoyer] 9:30 Gather (Coffee Break Food Available) 10:00 Welcome to NW NLP (10min) </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=164196</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/164196/164196.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="1137" lang="en" fileSize="216076229" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/164196/i/large.jpg" height="240" width="320" />
      <media:keywords>Will Lewis; Luke Zettlemoyer</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Fri, 11 May 2012 17:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Metro Design Principles</title>
      <description>[Speaker: Darlene Wong] Please join us for our monthly XAPfest speaker presentation event! We will have presentations on the second Thursday of every month. Pizza will be served. Current Event: Metro Design Principles What is Metro? Where do Metro’s design principles come from? Come and hear Darlene share the story of Metro and how to apply the principles while designing and building your next app. Agenda: 6:00 PM – Arrive, mingle, get food 6:30 PM – Presentation 7:30 PM – Mingle, Q&amp;A, etc. 8:00 PM – Event ends </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=164146</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/164146/164146.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="2892" lang="en" fileSize="530014759" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/164146/i/large.jpg" height="240" width="320" />
      <media:keywords>Darlene Wong</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Fri, 11 May 2012 01:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Modeling and Inspecting the Question-Asking Process in Educational Dialogues</title>
      <description>[Speaker: Lee Becker] While many studies have demonstrated that dialogue-based tutoring systems have a positive effect on learning, the significant amount of human effort required to author, design, and tune system behaviors remains a major barrier to widespread deployment and adoption of these systems. Machine learning presents a path towards reduced human effort; however, the custom-built nature of these systems means that any learned behavior is strictly tied to a single implementation. Ideally, these behaviors should extend to a variety of materials and concepts. Enabling this kind of generalization requires a meta-level model of the dialogue that abstracts utterances to their action, function, and content. In this talk, I describe the DISCUSS dialogue move taxonomy, an intermediate representation that allows for lesson-independent modeling of dialogue behavior. To demonstrate the utility of this representation, I explore how DISCUSS-based features assist in the process of ranking and selecting follow-up questions within the context of the My Science Tutor (MyST) intelligent tutoring system. Moreover, I show how DISCUSS enables us to model and identify the factors driving the decisions made by experienced human tutors when teaching. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163949</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163949/163949.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4216" lang="en" fileSize="796622709" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163949/i/large.jpg" height="240" width="320" />
      <media:keywords>Lee Becker</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Thu, 10 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>A Key Value Store that Supports Strict SLAs and the Applications that Need it</title>
      <description>[Speaker: Christopher Stewart] Emerging datacenter workloads differ from traditional e-commerce workloads in two ways. First, emerging big-science workloads can access networked storage thousands of times for one simulation. Even a few slow accesses can slow down the entire simulation. Second, emerging niche workloads in green computing have non-technical competitive advantages. Providers of these workloads want to keep technical costs as low as possible while barely meeting their users' performance needs. Both of these workloads point to an emerging technical need: managing performance-oriented service level agreements (SLAs). In this talk, I will present Zoolander, a key value store that can complete a very high percentage of storage accesses (i.e., a service level) within tight time constraints. The key idea behind Zoolander is to revisit replication for predictability, an old but seldom-used approach to mask the effects of uncommon events. Zoolander mixes replication for predictability, partitioning, and traditional replication to scale efficiently, meeting strict SLAs while using half as many cloud nodes. I will present results where Zoolander can complete 99.99% of its storage accesses within a strict 15ms latency bound. It does this by reducing 99th percentile latencies by 78%. Zoolander provides write-order consistency, survives failures, and has been scaled in our tests to 32 nodes. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=164197</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/164197/164197.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4883" lang="en" fileSize="909554705" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/164197/i/large.jpg" height="240" width="320" />
      <media:keywords>Christopher Stewart</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Thu, 10 May 2012 17:30:00 GMT</pubDate>
    </item>
    <item>
      <title>18 Minutes: Find Your Focus, Master Distraction, and Get the Right Things Done</title>
      <description>[Speaker: Peter Bregman] Strategic advisor Peter Bregman explains how busy people can create a plan for managing their day in just 18 minutes. Bregman works from the premise that the best way to combat constant and distracting interruptions is to create productive distractions of one's own. His approach shows how to navigate through the constant chatter of emails, text messages, phone calls, and meetings to better focus on the things that are truly important. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=164198</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/164198/164198.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4010" lang="en" fileSize="779253467" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/164198/i/large.jpg" height="240" width="320" />
      <media:keywords>Peter Bregman</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Wed, 09 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Subliminal: How Your Unconscious Mind Rules Your Behavior</title>
      <description>[Speaker: Leonard Mlodinow] In Subliminal, Leonard Mlodinow presents an illuminating examination of the ways in which the unconscious mind shapes our lives. Over the past two decades researchers have developed new tools for probing the subliminal workings of the mind. This explosion of research has led to a sea change in our understanding of how the mind affects the way we live. Scientists are becoming increasingly convinced that how we experience the world is largely driven by the mind's subliminal processes and not by the conscious ones, as we have long believed. Mlodinow unravels the subliminal mind and reveals its influence on how we interact with the people around us. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163950</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163950/163950.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="3602" lang="en" fileSize="669827019" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163950/i/large.jpg" height="240" width="320" />
      <media:keywords>Leonard Mlodinow</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Tue, 08 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Rethinking the Architecture of Warehouse-Scale Computers: Improving Efficiency and Utilization</title>
      <description>[Speaker: Jason Mars] The class of datacenters coined as “warehouse scale computers” (WSCs) house large-scale data intensive web services such as websearch, maps, social networking, docs, video sharing, etc. Companies like Google, Microsoft, Yahoo, and Amazon spend tens to hundreds of millions to construct and operate WSCs to provide these services. Maximizing the efficiency of this class of computing reduces cost and has energy implications for a greener planet. However, WSC design and architecture remain in their relative infancy. WSCs are built using commodity processor architectures (Intel/AMD) and software components (Linux, GCC, JVM, etc.) that have been engineered and optimized for traditional computing environments and workloads, such as those you’d find in the desktop / laptop environment. However, there are many characteristics, assumptions, and requirements present in the WSC computing domain that impact design decisions within these components. In this presentation, we rethink how WSCs are designed and architected, identify sources of inefficiency, and develop solutions to improve WSCs, with a particular focus on the interaction between the application layer, system software stack, and the underlying hardware platform. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163951</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163951/163951.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4852" lang="en" fileSize="926274519" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163951/i/large.jpg" height="240" width="320" />
      <media:keywords>Jason Mars</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Tue, 08 May 2012 17:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Mitigating Resource Contention in Warehouse-scale Computers</title>
      <description>[Speaker: Lingjia Tang] The class of modern datacenters hosting large-scale Internet services such as web-search, mail, and social networking has gained significant momentum in today’s computing environments. However, these datacenters, recently coined as warehouse scale computers (WSCs), are extremely expensive to construct and operate. Improving software performance and server utilization is key to improving the efficiency and reducing the enormous cost in WSCs. Modern WSCs are constructed using commodity multicore processors, on which part of the memory subsystem is shared. When multiple applications are co-located on a multicore machine, contention for the shared memory resources, such as caches and memory bandwidth, may occur. This contention can cause severe cross-core performance interference, and significantly degrade application performance. Mitigating resource contention is critical for improving application performance. However, despite the wealth of research effort on contention management, little is known about how emerging large-scale web-service applications interact with the shared memory resources on commodity processors, and how this contention can be mitigated to improve the performance of these applications. In addition to performance, mitigating contention is also critical for improving the server utilization in WSCs. As multicore processors with expanding core counts continue to dominate the server market, the overall utilization of WSCs depends heavily on the consolidation of workloads to take advantage of the total computing potential provided by modern processors. However, many of the applications running in WSCs are user-facing, latency-sensitive applications with quality of service (QoS) requirements. These QoS requirements can be violated by the performance interference that can occur when multiple applications are consolidated on a single machine. 
As a result, the current common practice in WSCs is to disallow the co-location of latency-sensitive applications with other applications. This approach is undesirable as it results in low machine utilization in WSCs and millions of dollars wasted. In this talk I present novel compilation and runtime approaches to significantly mitigating contention and improving performance, QoS and machine utilization in datacenters. Specifically, this talk presents: 1) comprehensive investigation and characterization of the impact of memory resource sharing on industry-strength large-scale datacenter workloads, which expose new characteristics and insights contrary to recent literature; 2) the design of a heuristic based system and a runtime system to intelligently map application threads to cores to promote positive resource sharing and mitigate resource contention to improve application performance; and 3) the design of novel compilation techniques and runtime systems that statically and dynamically manipulate applications’ contentious nature to enable the co-location of applications with varying QoS requirements, and as a result, greatly improve server utilization in WSCs. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163952</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163952/163952.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="3638" lang="en" fileSize="678619277" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163952/i/large.jpg" height="240" width="320" />
      <media:keywords>Lingjia Tang</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Mon, 07 May 2012 17:30:00 GMT</pubDate>
    </item>
    <item>
      <title>The Convex Geometry of Inverse Problems</title>
      <description>[Speaker: Ben Recht] Deducing the state or structure of a system from partial, noisy measurements is a fundamental task throughout the sciences and engineering. The resulting inverse problems are often ill-posed because there are fewer measurements available than the ambient dimension of the model to be estimated. In practice, however, many interesting signals or models contain few degrees of freedom relative to their ambient dimension: a small number of genes may constitute the signature of a disease, very few parameters may specify the correlation structure of a time series, or a sparse collection of geometric constraints may determine a sensor network configuration. Discovering, leveraging, or recognizing such low-dimensional structure plays an important role in making inverse problems well-posed. In this talk, I will propose a unified approach to transform notions of simplicity and latent low-dimensionality into convex penalty functions. This approach builds on the success of generalizing compressed sensing to matrix completion, and greatly extends the catalog of objects and structures that can be recovered from partial information. I will focus on a suite of data analysis algorithms designed to decompose general signals into sums of atoms from a simple (but not necessarily discrete) set. These algorithms are derived in an optimization framework that encompasses previous methods based on l1-norm minimization and nuclear norm minimization for recovering sparse vectors and low-rank matrices. I will provide sharp estimates of the number of generic measurements required for exact and robust estimation of a variety of structured models. I will then detail several example applications and describe how to scale the corresponding algorithms to massive data sets. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163953</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163953/163953.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4979" lang="en" fileSize="935587281" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163953/i/large.jpg" height="240" width="320" />
      <media:keywords>Ben Recht</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Fri, 04 May 2012 17:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Skeleton Automata for FPGAs: Reconfiguring without Reconstructing</title>
      <description>[Speaker: Louis Woods] While the performance opportunities of field-programmable gate arrays (FPGAs) for high-volume query processing are well-known, system makers still have to compromise between desired query expressiveness and high compilation effort. The cost of the latter is the primary limitation in building efficient FPGA/CPU hybrids. In this talk I will present an FPGA-based stream processing engine that does not have this limitation. It provides a hardware implementation of XML projection that can be reconfigured in less than a microsecond, yet supports a rich and expressive dialect of XPath. By performing XML projection in the network, we can fully leverage its filtering effect and improve XQuery performance by several factors. These improvements are made possible by a new design approach for FPGA acceleration, called skeleton automata. Skeleton automata separate the structure of finite-state automata from their semantics. Since individual queries only affect the latter, this approach accommodates query workload changes quickly and with high expressiveness. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163954</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163954/163954.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="3610" lang="en" fileSize="663819067" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163954/i/large.jpg" height="240" width="320" />
      <media:keywords>Louis Woods</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Thu, 03 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Creating Innovators: The Making of Young People Who Will Change the World</title>
      <description>[Speaker: Tony Wagner] In Creating Innovators, Tony Wagner addresses the question of how we create the next generation of innovators. By profiling young creators and examining cutting-edge programs, Wagner finds that the answer is to embrace the principles of play, passion, and purpose, and shows how we can remake our schools and workplaces to better cultivate the change-makers of tomorrow. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163758</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163758/163758.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="3326" lang="en" fileSize="616297369" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163758/i/large.jpg" height="240" width="320" />
      <media:keywords>Tony Wagner</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Thu, 03 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>The Case for Continuous Time</title>
      <description>[Speaker: Christian Shelton] Time is a continuous quantity. This talk begins with theoretical and experimental problems that arise when time is treated as a discrete quantity in stochastic systems. I will then discuss continuous time Bayesian networks (CTBNs), a variable-based representation of continuous-time Markov processes. I will cover their representation and semantics and a bit about inference and learning in the models. Finally, I will present my group's recent work in employing CTBNs on real-world applications. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163761</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163761/163761.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4440" lang="en" fileSize="831984047" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163761/i/large.jpg" height="240" width="320" />
      <media:keywords>Christian Shelton</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Tue, 01 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Failures and Other Challenges of Big-Data Analytics</title>
      <description>[Speaker: Magdalena Balazinska] An important challenge faced by today's big-data analytics systems is fault-tolerance: When running a parallel query at large scale, some form of failure is likely to occur during execution. Existing systems typically take one of two radically different strategies to handle failures: restart entire queries or materialize the output of each operator and restart only failed operator partitions. The former approach adds significant overhead when a failure occurs, while the latter adds overhead at runtime and typically introduces global synchronization barriers. In this talk, we present FTOpt, a new approach for making online, parallel query plans fault-tolerant: FTOpt provides intra-query fault-tolerance without blocking. Additionally, it does so by using different fault-tolerance techniques at different operators within a query plan. Enabling each operator to use a different fault-tolerance strategy leads to a space of fault-tolerance plans amenable to cost-based optimization. FTOpt comprises a protocol for mixing-and-matching fault-tolerance techniques within a single query plan and an optimizer for selecting the technique to use in order to minimize the expected processing time with failures for the entire query. Experiments show that with as little as one failure, the choice of fault-tolerance approach can result in a 70% difference in query runtimes, that hybrid query plans often lead to the best performance, and that our optimizer is able to select a winning plan. In addition to FTOpt, we will also present a broad overview of other research challenges tackled by the ongoing Nuage, CQMS, and Data Ecoytem projects at the University of Washington. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163759</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163759/163759.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="3611" lang="en" fileSize="678379073" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163759/i/large.jpg" height="240" width="320" />
      <media:keywords>Magdalena Balazinska</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Tue, 01 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Halo: Primordium The Forerunner Saga</title>
      <description>[Speaker: Greg Bear] Halo: Primordium continues the story of the enigmatic creators and builders of the Halos that began in Halo: Cryptum. In the wake of the apparent self-destruction of the Forerunner empire, two humans—Chakas and Riser—end up on a strange inverted world where horizons rise into the sky. Their epic journey across this damaged Halo takes them into the domain of a powerful and monstrous intelligence who claims to be the Last Precursor. Called the Captive by Forerunners, and the Primordial by ancient humans, this intelligence has taken charge of, and perverted, the Master Builder’s already horrifying research into the Flood. Chakas and Riser unwittingly become trapped in an ancient game of vengeance between the powers who seeded the galaxy with life and the Forerunners, who have taken up the sacred Mantle of duty to protect all living things. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163760</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163760/163760.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="2948" lang="en" fileSize="605782861" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163760/i/large.jpg" height="240" width="320" />
      <media:keywords>Greg Bear</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Tue, 01 May 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Erasure Codes for Big Data over Hadoop and Large-scale Sparse PCA for Twitter Analysis</title>
      <description>[Speakers: Alex Dimakis and Dimitris S. Papailiopoulos] 1) Erasure Codes for Big Data over Hadoop As big data grows faster than infrastructure, triple replication becomes prohibitively expensive for distributed storage systems. For this reason most large systems use erasure codes for archival files to provide high reliability with small storage overheads. Reed-Solomon codes are the common choice, a classical error-correcting construction relying on polynomials over finite fields. Unfortunately, classical error-correcting codes are not sufficient for distributed systems since repairing a single failure requires large amounts of disk IO and network transfer. We will present a new family of erasure codes that minimize repair communication and disk IO during single node failures. An implementation over HDFS will be described and several tradeoffs and research directions will be discussed. 2) Large-scale Sparse PCA for Twitter Analysis Large-scale data analytics are now becoming a trend in big data set applications, such as event detection and sentiment analysis in social networks that host millions of users. Sparse PCA, a new tool in dimensionality reduction, generates sparse collections of data features that identify major trends in a data set. The sparsity of this tool is fundamental to the high interpretability of the results, e.g., in event detection applications, a small collection of words that "describe" major events is easier to interpret than a bag of thousands of words. Unfortunately, sparse PCA is NP-hard and most of its polynomial-time relaxations involve solving subproblems that become intractable in high dimensions. We will present a new, parallelizable spectral algorithm for sparse PCA that inspires a novel feature elimination technique, which on real Twitter data sets reduces the problem size from hundreds of thousands to hundreds of features. 
Our algorithm comes with specific optimality and approximation guarantees and is based on a new way to visualize the span of matrices using auxiliary spherical variables. Future directions on implementing our algorithm in the Hadoop MapReduce framework will be discussed. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163611</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163611/163611.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="6036" lang="en" fileSize="1122769623" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163611/i/large.jpg" height="240" width="320" />
      <media:keywords>Alex Dimakis; Dimitris S. Papailiopoulos</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Mon, 30 Apr 2012 22:15:00 GMT</pubDate>
    </item>
    <item>
      <title>The Battle for Control of Online Communications</title>
      <description>[Speaker: Nick Feamster] The Internet offers users many opportunities for communicating and exchanging ideas, but abuse, censorship, and the manipulation of Internet traffic have put free and open communication at risk. Recent estimates suggest that spam constitutes about 95% of all email traffic; hundreds of thousands of online scam domains emerge every day; online social networks may be used to spread propaganda; and more than 60 countries around the world censor Internet traffic. In this talk, I will present approaches that we have developed to preserve free and open communication on the Internet in the face of these threats. First, I will describe the threat of message abuse (e.g., spam) and the methods we have developed for mitigating it. I will briefly discuss a 13-month study of the network-level behavior of spammers, and present SNARE, a spam filtering system we developed that classifies email messages based on their network-level traffic characteristics rather than their contents. Next, I will turn to information censorship, and describe Collage, a system that circumvents censorship without arousing the suspicion of the censor. Finally, I will discuss the various forms of information manipulation, including the spread of propaganda in social networks and online "filter bubbles". Although it is difficult to prevent all forms of manipulation, our goal is to make it more transparent to users. Towards this goal, I will describe my broader research agenda and plans, which aim to improve transparency for aspects of Internet communication ranging from network performance to social media to search results by aggregating data from a wide variety of vantage points. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163612</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163612/163612.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="4511" lang="en" fileSize="848024473" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163612/i/large.jpg" height="240" width="320" />
      <media:keywords>Nick Feamster</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Mon, 30 Apr 2012 20:30:00 GMT</pubDate>
    </item>
    <item>
      <title>From Plastic to Pixels: In Pursuit of Effective Touch-Typing on Touch Screens</title>
      <description>[Speaker: Jacob O. Wobbrock] Fast, accurate and satisfying text entry remains a significant challenge on touch screens. The lack of tactile feedback from physical keys and the loss of distinction between touching and pressing on touch screen keyboards are two of many challenges. The challenges increase for mobile touch screen text entry, where small screens and walking-induced situational impairments compromise accuracy. In this talk, I will present a study of “touch-typing on flat glass” to understand finger-strike patterns for touch screen keyboards. I will also describe an adaptive keyboard built for Microsoft Surface that morphs its key layout to remain positioned beneath users’ fingers. I will also show a way to incorporate stroke gestures for non-alphanumeric input into this keyboard. For mobile text entry, I will describe WalkType, a keyboard made more accurate while walking by incorporating accelerometer data and inference about users’ walking behavior. Finally, I will describe Perkinput, a Perkins Brailler-based method for eyes-free text entry using Braille-like patterns. Taken together, these projects highlight the potential for effective touch screen text input, and point to future possibilities where exciting work remains. </description>
      <link>http://research.microsoft.com/apps/video/default.aspx?id=163613</link>
      <media:content url="http://msrvideo.vo.msecnd.net/rmcvideos/163613/163613.asf" type="video/x-ms-asf" medium="video" height="480" width="640" duration="5060" lang="en" fileSize="948283767" bitrate="1500000" />
      <media:thumbnail url="http://msrvideo.vo.msecnd.net/rmcvideos/163613/i/large.jpg" height="240" width="320" />
      <media:keywords>Jacob O. Wobbrock</media:keywords>
      <media:category>Science and Technology</media:category>
      <pubDate>Mon, 30 Apr 2012 17:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>
