Performance Assessment

Assessment of learning algorithms

To rank the performance of machine learning algorithms, many researchers conduct experiments on certain benchmark data sets. Most learning algorithms have domain-specific parameters, and it is a popular custom to adjust these parameters to minimize the error on a holdout set. The error on that same holdout set is then used to rank the algorithm, which causes an optimistic bias: the reported error is the minimum over several noisy estimates and therefore tends to lie below the true error. We quantify this bias and show why, when, and to what extent this inappropriate experimental setting distorts the results.
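The effect can be illustrated with a small simulation. This is only a minimal sketch under simplifying assumptions, not the analysis from the paper referenced on this page: every parameter setting is assumed to have the same true error, and the holdout errors of different settings are drawn independently (in practice they are correlated, because they share one holdout set). The function name `simulate_selection_bias` and all of its parameters are hypothetical.

```python
import random


def simulate_selection_bias(n_configs=20, holdout_size=100, true_error=0.3,
                            n_trials=500, seed=0):
    """Compare the holdout error of the selected setting with its error on fresh data.

    Assumption (illustrative only): all settings share the same true error,
    and their holdout errors are independent binomial estimates.
    """
    rng = random.Random(seed)

    def empirical_error():
        # Fraction of misclassified points among `holdout_size` test points.
        mistakes = sum(rng.random() < true_error for _ in range(holdout_size))
        return mistakes / holdout_size

    reported_sum = 0.0
    fresh_sum = 0.0
    for _ in range(n_trials):
        # Tune: evaluate every setting on the holdout set and keep the best one.
        holdout_errors = [empirical_error() for _ in range(n_configs)]
        reported_sum += min(holdout_errors)
        # Unbiased check: evaluate the selected setting on independent data.
        fresh_sum += empirical_error()

    return reported_sum / n_trials, fresh_sum / n_trials


if __name__ == "__main__":
    reported, fresh = simulate_selection_bias()
    print(f"mean error reported on the tuning holdout set: {reported:.3f}")
    print(f"mean error of the selected setting on fresh data: {fresh:.3f}")
```

Under these assumptions, the error reported on the holdout set used for tuning falls clearly below the true error, while the error measured on independent data does not. The gap would be expected to grow with the number of settings tried and to shrink as the holdout set gets larger.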

References

  • Tobias Scheffer and Ralf Herbrich. Unbiased Assessment of Learning Algorithms. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 798-803, 1997.
  • Thomas G. Dietterich. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation, 10(7):1895-1924, 1998.


This site was last updated 29-10-2004