CSL 864  Special Topics in AI: Classification
This is an introductory course on machine learning focusing on
classification. The course has three major objectives. First, to
familiarize students with basic classification methods so that these
can be used as tools to tackle practical machine learning
problems. Second, to equip students with the mathematical skills
needed to theoretically analyze these methods and modify and extend
them to tackle new problems. Finally, to introduce students to
basic optimization techniques so that they can start writing their own
code for these methods.
 Lecture 1: Introduction to supervised learning, overfitting,
probability theory, and decision theory.
 Lecture 2: Generative Methods and Naïve Bayes.
 Lecture 3: Toy example.
 Lecture 4: Discriminative Methods and Logistic
Regression. Equivalence to Naïve Bayes.
 Lecture 5: Logistic Regression optimization and
extensions.
 Lecture 6: Support Vector Machines.
 Lecture 7: SVMs continued: Kernels.
 Lecture 8: Multi-Class SVMs. Digression on VC dimension.
 Lecture 9: SVM optimization.
 Lecture 10: Kernel learning and optimization.
 Lecture 11: Boosting.
 Lecture 12: Boosting continued.
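As a quick illustration of the discriminative theme running through Lectures 4 to 9, the logistic loss (Logistic Regression) and the hinge loss (SVMs) can be compared at a few margin values. This is a hedged Python sketch for orientation only; the course demos themselves are in MATLAB, and the margin values below are made up:

```python
import numpy as np

# Margins m = y * f(x): positive means the example is classified
# correctly with some confidence, negative means it is misclassified.
margins = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# Logistic loss: log(1 + exp(-m)), smooth and strictly positive.
logistic_loss = np.log1p(np.exp(-margins))

# Hinge loss: max(0, 1 - m), zero once the margin exceeds 1.
hinge_loss = np.maximum(0.0, 1.0 - margins)

for m, ll, hl in zip(margins, logistic_loss, hinge_loss):
    print(f"margin {m:+.1f}: logistic {ll:.3f}, hinge {hl:.3f}")
```

Both losses are convex upper bounds on the 0-1 loss and decrease with the margin; the hinge loss is exactly zero past margin 1, while the logistic loss only approaches zero asymptotically.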
Slides
Code
Most code was written five minutes before the start of each lecture
and comes with no guarantees, comments, or documentation. In
particular, no attempt has been made to bulletproof the code. For
instance, if there is no feasible solution for your parameter settings,
the figure-plotting subroutines will crash (the LR and SVM
learning routines should be stable). In any case, run the code at your
own peril.
 Some common MATLAB tools needed for
the demos (you'll need the Optimization Toolbox ver. 4.2 or
higher for fmincon's LBFGS, used for Logistic Regression)
 MATLAB code for demos
 Naïve Bayes
 Regularized Logistic Regression (code is kernelized and
displays dual weights)
 Naïve Bayes vs Logistic Regression
 Multiclass Logistic Regression (Multinomial, 1-vs-All,
1-vs-1 DAG, 1-vs-1 majority vote)
 Linear SVMs vs Logistic Regression
 Nonlinear SVMs
 Multiclass SVMs (multiclass hinge loss, Multinomial Logistic Regression, 1-vs-All SVM, 1-vs-1 DAG SVM, 1-vs-1 majority vote SVM)
Links to code by other people can be found in the slides.
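For readers without MATLAB, the flavor of the regularized Logistic Regression demo can be approximated in a few lines of Python. This is a hedged sketch, not the course code: SciPy's L-BFGS-B stands in for fmincon's LBFGS, the toy data is invented, and the regularization weight is arbitrary:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(w, X, y, lam):
    """Regularized logistic loss: sum_i log(1 + exp(-y_i x_i.w)) + (lam/2)||w||^2.

    Returns the loss and its gradient so the optimizer can use jac=True.
    """
    margins = y * (X @ w)
    loss = np.sum(np.log1p(np.exp(-margins))) + 0.5 * lam * (w @ w)
    # d/dw of log(1+exp(-m)) is -y*x / (1 + exp(m)).
    grad = -X.T @ (y / (1.0 + np.exp(margins))) + lam * w
    return loss, grad

# Invented linearly separable-ish toy data with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=100))

# Quasi-Newton minimization, analogous in spirit to fmincon's LBFGS.
res = minimize(neg_log_likelihood, np.zeros(2), args=(X, y, 1.0),
               jac=True, method="L-BFGS-B")
w = res.x
accuracy = np.mean(np.sign(X @ w) == y)
```

Supplying the gradient explicitly (`jac=True`) is what makes quasi-Newton methods like L-BFGS practical here; the course's MATLAB demos make the analogous choice with fmincon.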
Recommended Reading

D. Bertsekas.
Nonlinear Programming.
Athena Scientific, 1999.

C. M. Bishop.
Pattern Recognition and Machine Learning.
Springer, 2006.

S. Boyd and L. Vandenberghe.
Convex Optimization.
Cambridge University Press, 2004.

N. Cristianini and J. Shawe-Taylor.
An Introduction to Support Vector Machines and
Other Kernel-based Learning Methods.
Cambridge University Press, 2000.

R. O. Duda, P. E. Hart, and D. G. Stork.
Pattern Classification.
John Wiley and Sons, second edition, 2001.

T. Hastie, R. Tibshirani, and J. Friedman.
The Elements of Statistical Learning.
Springer, second edition, 2009.

T. Mitchell.
Machine Learning.
McGraw Hill, 1997.

B. Schölkopf and A. Smola.
Learning with Kernels.
MIT Press, 2002.
Please see the slides for links to relevant research papers.