Ralf Herbrich, Thore Graepel, and John Shawe-Taylor
January 2000
We provide small sample size bounds on the generalisation error of linear classifiers that take advantage of large observed margins on the training set and sparsity in the data dependent expansion coefficients. It is already known from results in the luckiness framework that both criteria independently have a large impact on the generalisation error. Our new results show that they can be combined which theoretically justifies learning algorithms like the Support Vector Machine or the Relevance Vector Machine. In contrast to previous studies we avoid using the classical technique of symmetrisation by a ghost sample but directly using the sparsity for the estimation of the generalisation error. We demonstrate that our result leads to practical useful results even in case of small sample size if the training set witnesses our prior belief in sparsity and large margins.
![]() PostScript file |
In: Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
| Type: | Inproceedings |
| Pages: | 304–308 |