John C. Platt, Emre Kiciman, and David Maltz
December 2007
Web servers on the Internet need to maintain high reliability, but the cause of intermittent failures of web transactions is non-obvious. We use approximate Bayesian inference to diagnose problems with web services. This diagnosis problem is far larger than any previously attempted: it requires inference of 10^4 possible faults from 10^5 observations. Further, such inference must be performed in less than a second. Inference can be done at
this speed by combining a mean-field variational approximation and the use of stochastic gradient descent to optimize a variational cost function. We use this fast inference to diagnose a time series of anomalous HTTP requests taken from a real web service. The inference is fast enough to analyze network logs with billions of entries in a matter of hours.
![]() PDF file |
In: Advances in Neural Information Processing Systems 20
| Type: | Inproceedings |
| Pages: | 1169-1176 |