Improving Survey Aggregation with Sparsely Represented Signals

KDD 2016: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, California |

Published by Knowledge Discovery and Data Mining (KDD)

Publication

In this paper, we develop a new aggregation technique to reduce the cost of surveying. Our method aims to jointly estimate a vector of target quantities such as public opinion or voter intent across time and maintain good estimates when using only a fraction of the data. Inspired by the James-Stein estimator, we resolve this challenge by shrinking the estimates to a global mean which is assumed to have a sparse representation in some known basis. This assumption has led to two different methods for estimating the global mean: orthogonal matching pursuit and deep learning. Both of which significantly reduce the number of samples needed to achieve good estimates of the true means of the data and, in the case of presidential elections, can estimate the outcome of the 2012 United States elections while saving hundreds of thousands of samples and maintaining accuracy.