The second New England Machine Learning Day will be held **May 1st, 2013**, from 10:00am-5:00pm at Microsoft Research New England, One Memorial Drive, Cambridge, MA 02142. The event will bring together local academics and researchers in machine learning and its applications. As at NEML 2012, there will be a lively poster session during lunch.

### Schedule

| Time | Speaker | Talk |
| --- | --- | --- |
| 10:00-10:05 | Jennifer Chayes (MSR) | Opening remarks |
| 10:10-10:40 | Sham Kakade (MSR) | Learning latent structure in documents, social networks, and more... |
| 10:45-11:15 | Stefanie Tellex (Brown) | Learning Word Meanings for Human-Robot Interaction |
| 11:20-11:50 | Pablo Parrilo (MIT) | From Sparsity to Rank, and Beyond: algebra, geometry, and convexity |
| 11:50-1:45 | Posters and lunch | |
| 1:45-2:15 | Erik Sudderth (Brown) | Toward Reliable Bayesian Nonparametric Learning |
| 2:20-2:50 | Ryan Adams (Harvard) | Practical Bayesian Optimization of Machine Learning Algorithms |
| 2:50-3:20 | Coffee break | |
| 3:20-3:50 | Hanna Wallach (UMass Amherst) | Machine Learning for Complex Social Processes |
| 3:55-4:25 | Cynthia Rudin (MIT) | ML for the Future: Healthcare, Energy, and the Internet |
| 4:30-5:00 | Antonio Torralba (MIT) | Who is to blame in object detection failures? |

### Poster session

There will be a poster session during lunch. To submit, please email a brief abstract describing the project to nemlposter@hotmail.com by **April 22nd, 2013**.

**Directions**. The event will be on the first floor of One Memorial Drive, Cambridge, MA 02142. We are very close to the Kendall T stop; parking is available for 27 dollars for the day.

### Titles and Abstracts

**Sham Kakade.** Learning latent structure in documents, social networks, and more...

In many applications, we face the challenge of modeling the interactions between multiple observations and hidden causes; such problems range from document retrieval, where we seek to model the underlying topics, to community detection in social networks. The (unsupervised) learning problem is to accurately estimate the model (e.g. the hidden topics, the underlying clusters, or the hidden communities in a social network) with only samples of the observed variables. In practice, many of these models are fit with local search heuristics. This talk will overview how simple and scalable linear algebra approaches provide closed-form estimation methods for a wide class of these models---including Gaussian mixture models, hidden Markov models, topic models (including latent Dirichlet allocation), and mixed membership models for communities in social networks.
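As a toy illustration of the moment-based, closed-form estimation the abstract alludes to (not the talk's actual algorithms, which handle far richer models): for an equal-weight mixture of two 1D Gaussians with known variance, the component means follow directly from the first two empirical moments, since (mu1 - mu2)^2 / 4 = E[x^2] - sigma^2 - E[x]^2. All parameter values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
mu1, mu2, sigma = -2.0, 3.0, 1.0           # hidden parameters to recover
n = 200000
z = rng.integers(0, 2, size=n)             # equal-weight component labels
x = np.where(z == 0, rng.normal(mu1, sigma, n), rng.normal(mu2, sigma, n))

m1, m2 = x.mean(), (x ** 2).mean()         # first two empirical moments
# For equal weights: (mu1 - mu2)^2 / 4 = m2 - sigma^2 - m1^2
gap = np.sqrt(max(m2 - sigma ** 2 - m1 ** 2, 0.0))
est = sorted([m1 - gap, m1 + gap])
print(est)                                  # close to [-2.0, 3.0]
```

Unlike EM or other local-search fits, nothing here is iterative: the estimate is a deterministic function of sample moments, which is the flavor of guarantee the spectral methods in the talk generalize.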

**Stefanie Tellex.** Learning Word Meanings for Human-Robot Interaction

As robots become more powerful and autonomous, it is critical to develop ways for untrained users to quickly and easily tell them what to do. Natural language is a powerful and flexible modality for conveying complex requests, but in order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. I will present approaches to learning these grounded meaning representations from a corpus of natural language sentences paired with a robot's perceptual model of the environment. The robot can use these learned models to recognize events, follow commands, ask questions, and request help.

**Pablo Parrilo.** From Sparsity to Rank, and Beyond: algebra, geometry, and convexity

Optimization problems involving sparse vectors or low-rank matrices are of great importance in applied mathematics and engineering. They provide a rich and fruitful interaction between algebraic-geometric concepts and convex optimization, with strong synergies with popular techniques like L1 and nuclear norm minimization. In this lecture we will provide a gentle introduction to this exciting research area, highlighting key algebraic-geometric ideas as well as a survey of recent developments, including extensions to very general families of parsimonious models such as sums of a few permutation matrices, low-rank tensors, orthogonal matrices, and atomic measures, as well as the corresponding structure-inducing norms. Based on joint work with Venkat Chandrasekaran, Maryam Fazel, Ben Recht, Sujay Sanghavi, and Alan Willsky.
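A minimal, hypothetical sketch of the sparsity end of this story (not code from the talk): recovering a sparse vector from underdetermined linear measurements via L1-regularized least squares, solved by iterative soft-thresholding (ISTA). The problem sizes and regularization weight are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 50, 100, 5                       # measurements, dimension, sparsity
A = rng.normal(size=(n, d)) / np.sqrt(n)   # random sensing matrix
x_true = np.zeros(d)
support = rng.choice(d, size=k, replace=False)
x_true[support] = rng.uniform(1.0, 3.0, size=k) * rng.choice([-1.0, 1.0], size=k)
b = A @ x_true                             # noiseless measurements

lam = 0.01                                 # L1 weight (illustrative)
L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
x = np.zeros(d)
for _ in range(2000):
    z = x - A.T @ (A @ x - b) / L          # gradient step on the quadratic part
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox of the L1 norm

print(np.linalg.norm(x - x_true))          # small: the sparse signal is recovered
```

The same template carries over to the other parsimonious models in the abstract by swapping the L1 norm for the appropriate structure-inducing norm (e.g. the nuclear norm for low rank), which changes only the proximal step.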

**Erik Sudderth.** Toward Reliable Bayesian Nonparametric Learning

Applications of Bayesian nonparametrics increasingly involve datasets with rich hierarchical, temporal, spatial, or relational structure. While basic inference algorithms such as the Gibbs sampler are easily generalized to such models, in practice they can fail in subtle and hard-to-diagnose ways. We explore this issue via variants of a simple and popular nonparametric Bayesian model, the hierarchical Dirichlet process. By optimizing variational learning objectives in non-traditional ways, we build improved models of text, image, and social network data.

**Ryan Adams.** Practical Bayesian Optimization of Machine Learning Algorithms

Machine learning algorithms frequently involve careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a "black art" requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. I will describe my recent work on solving this problem with Bayesian nonparametrics, using Gaussian processes. This approach of "Bayesian optimization" models the generalization performance as an unknown objective function with a GP prior. I will discuss new algorithms that account for variable cost in function evaluation and take advantage of parallelism in evaluation. These new algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms including latent Dirichlet allocation for text analysis, structured SVMs for protein motif finding, and convolutional neural networks for visual object recognition.
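A minimal sketch of the basic loop the abstract describes, on a toy 1D problem (this is an assumption-laden illustration, not the speaker's implementation, which adds cost-awareness and parallelism): model the unknown objective with a Gaussian-process posterior and choose each next evaluation by maximizing expected improvement. The kernel, its length-scale, and the toy objective are all illustrative choices.

```python
import numpy as np
from math import erf

def rbf(a, b, length=0.3):
    # squared-exponential kernel on 1D inputs
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # exact GP regression: posterior mean and variance at test points Xs
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    v = np.linalg.solve(L, Ks)
    mu = Ks.T @ alpha
    var = np.diag(rbf(Xs, Xs)) - np.sum(v * v, axis=0)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best):
    # E[max(best - f(x), 0)] under the GP posterior (for minimization)
    sigma = np.sqrt(var)
    z = (best - mu) / sigma
    pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    cdf = 0.5 * (1.0 + np.array([erf(t / np.sqrt(2)) for t in z]))
    return (best - mu) * cdf + sigma * pdf

f = lambda x: (x - 0.6) ** 2               # toy objective (minimum at 0.6)
X = np.array([0.1, 0.5, 0.9])              # initial design
y = f(X)
grid = np.linspace(0.0, 1.0, 200)
for _ in range(5):                          # five rounds of sequential search
    mu, var = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, var, y.min()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))

print(X[np.argmin(y)])                      # best point found, near 0.6
```

In hyperparameter tuning, `f` would be a full training-and-validation run, so each evaluation is expensive and the GP's sample efficiency is what makes the approach practical.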

**Hanna Wallach.** Machine Learning for Complex Social Processes

From the activities of the US Patent Office or the National Institutes of Health to communications between scientists or political legislators, complex social processes---groups of people interacting with each other in order to achieve specific and sometimes contradictory goals---underlie almost all human endeavor. In order to draw thorough, data-driven conclusions about complex social processes, researchers and decision-makers need new quantitative tools for exploring, explaining, and making predictions using massive collections of interaction data. In this talk, I will discuss the development of machine learning methods for modeling interaction data. I will concentrate on exploratory analysis of communication networks---specifically, discovery and visualization of topic-specific subnetworks in email data sets. I will present a new Bayesian latent variable model of network structure and content and explain how this model can be used to analyze intra-governmental email networks.

**Cynthia Rudin.** ML for the Future: Healthcare, Energy, and the Internet

I will overview recent applications of ML to some of society's critical domains, including healthcare, energy grid reliability, and information retrieval. Specifically:

1) Stroke risk prediction in medical patients, using ML techniques for interpretable predictive modeling.

2) Energy grid reliability in New York City, using point process models.

3) Growing a list using the Internet, using clustering techniques.

These applications illustrate how real-world problems can drive the development of effective new ML techniques.

Collaborators: Ben Letham, Seyda Ertekin, Tyler McCormick, David Madigan, and Katherine Heller

**Antonio Torralba.** Who is to blame in object detection failures?

### Posters

1. *Priors for Diversity in Generative Latent Variable Models* by James Zou and Ryan Adams.

2. *Generalized Random Utility Models* by Hossein Azari, David C. Parkes, and Lirong Xia.

3. *Approximate Inference in Collective Graphical Models* by Daniel Sheldon, Tao Sun, Akshat Kumar, and Thomas G. Dietterich.

4. *Discovering Structure in Spiking Networks* by Scott Linderman and Ryan Adams.

5. *Poisson Statistics and the Future of Internet Marketing* by Delaram Motamedvaziri, Mohammad Hossein Rohban, Venkatesh Saligrama, and David Castanon.

6. *Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events* by Lisa Friedland, David Jensen, and Michael Lavine.

7. *An Impossibility Result for High Dimensional Supervised Learning* by M. H. Rohban, P. Ishwar, B. Orten, W. C. Karl, and V. Saligrama.

8. *Localizing 3D Cuboids in Single-view Images* by Jianxiong Xiao, Bryan C. Russell, and Antonio Torralba.

10. *Accelerating Inference: Towards a Full Language, Compiler and Hardware Stack* by Lyric Labs - Analog Devices.

11. *Efficient Nearest-Neighbor Search in the Probability Simplex* by Kriste Krstovski, David A. Smith, Hanna M. Wallach, Andrew McGregor, and Michael J. Kurtz.

12. *Image Caption Generation* by Rebecca Mason.

13. *The Gesture Recognition Toolkit* by Nicholas Gillian and Joseph Paradiso.

14. *The incidental parameter problem in network analysis for neural spiking data* by Dahlia Nadkarni and Matthew Harrison.

15. *Knowledge Mining Blood Pressure Data with Dynamic Bayesian Network Modeling* by Alex Waldin, Kalyan Veeramachaneni, and Una-May O'Reilly.

16. *The network you keep: Graphlet-Based discrimination of persons of interest* by Saber Shokat Fadaee, Javed A. Aslam, Nikos Passas, and Ravi Sundaram.

17. *Probabilistic reasoning about human edits in information integration* by Michael Wick, Ari Kobren, and Andrew McCallum.

18. *Spectral Discovery of Clinical Autism Phenotypes with Subspace Regularization* by Finale Doshi-Velez, Deniz Oktay, Ben Mayne, and Isaac Kohane.

19. *Predicting Age Distribution—A Generative Bayesian Model* by Huseyin Oktay, Aykut Firat, and David Jensen.

20. *An Improved Message-Passing Algorithm Incorporating Certainty Information* by Nate Derbinsky, José Bento Ayres Pereira, Veit Elser, and Jonathan S. Yedidia.

21. *A New Geometric Approach to Latent Topic Modeling and Discovery* by Weicong Ding, Mohammad H. Rohban, Prakash Ishwar, and Venkatesh Saligrama.

22. *Coco-Q: Learning in Stochastic Games with Side Payments* by Eric Sodomka, Elizabeth Hilliard, Amy Greenwald, and Michael Littman.

23. *Modeling Clinical Prognosis by Learning Interpretable Representations from Massive Health Data* by Rohit Joshi and Peter Szolovits.

24. *An Efficient Atomic Norm Minimization Approach to Identification of Low Order Models* by Burak Yilmaz, Constantino Lagoa, and Mario Sznaier.

25. *Agglomerative Clustering of Bagged Data Using Joint Distributions* by David Arbour, James Atwood, Ahmed El-Kishky, and David Jensen.

26. *Hankel Based Maximum Margin Classifiers: A Connection Between Machine Learning and Wiener Systems Identification* by Fei Xiong, Yongfang Cheng, Octavia Camps, Mario Sznaier, and Constantino Lagoa.

27. *Fitting Large-Scale GLMs with Implicit Updates* by Panos Toulis, Jason Rennie, and Edo Airoldi.

28. *Automatic delineation of radiosensitive structures in CT images using statistical appearance models and level sets* by Karl D. Fritscher and Gregory Sharp.

29. *Topic-Partitioned Multinetwork Embeddings* by Peter Krafft, Juston Moore, Bruce Desmarais, and Hanna Wallach.

30. *Evaluating Crowdsourcing Participants in the Absence of Ground-Truth* by Ramanathan Subramanian, Romer Rosales, Glenn Fung, and Jennifer Dy.

31. *Nonparametric Mixture of Gaussian Processes with Constraints* by James C. Ross.

32. *Sparse Signal Processing with Linear and Non-Linear Observations: A Unified Shannon Theoretic Approach* by Cem Aksoylar, George Atia, and Venkatesh Saligrama.

33. *More Efficient Dual Decomposition for Corpus Wide Inference* by Alexandre Passos, David Belanger, Sebastian Riedel, and Andrew McCallum.

34. *Learning with Irregularly Sampled Time Series Data* by Steve Cheng-Xian Li and Benjamin M. Marlin.

35. *Batch-iFDD for Representation Expansion in Large MDPs* by Alborz Geramifard, Tom Walsh, Nicholas Roy, and Jonathan How.

36. *Leveraging Hierarchical Structure in Diagnostic Codes for Predicting Incident Heart Failure* by Anima Singh and John Guttag.

37. *Layered Model for Video Analysis* by Deqing Sun, Jonas Wulff, Erik B. Sudderth, Hanspeter Pfister, and Michael J. Black.

38. *FlexGP: a Divide and Conquer Approach to Machine Learning on the Cloud* by Kalyan Veeramachaneni, Owen Derby, Dylan Sherry, and Una-May O'Reilly.

39. *Density Estimation and Anomaly Detection Using the Relevance Vector Machine* by Jose Lopez.

40. *Reasoning about Independence in Probabilistic Models of Relational Data* by Marc Maier, Katerina Marazopoulou, and David Jensen.

41. *On a Particle-Stabilized Wang-Landau Algorithm* by Luke Bornn, Pierre Jacob, Arnaud Doucet, and Pierre Del Moral.

42. *Posterior Consistency for the Number of Components in a Finite Mixture* by Jeffrey W. Miller and Matthew T. Harrison.

### Organization

NEML 2013 is organized by:

**Edo Airoldi** (Harvard)

**Tommi Jaakkola** (MIT)

**Adam Tauman Kalai** (Microsoft Research, Chair)

**Andrew McCallum** (UMass Amherst)

NEML is intended to be an annual event. The steering committee that selects the organizers each year consists of Sham Kakade, Adam Kalai, and Joshua Tenenbaum.

**Hospitality Notice for University and Government Employees:**
Microsoft Research is providing hospitality at this event. Please consult with your institution to determine whether you can accept meals and other hospitality
under your institution's ethics rules and any other laws that might apply. By accepting our invitation, you confirm that this invitation is compliant with your institution's policies.