Many statistical learning techniques assume that training and testing samples are generated from the same underlying distribution. Often, however, an “unadapted classifier” is trained on samples drawn from a training distribution that differs from the target (or test-time) distribution. Moreover, in many applications, while an essentially unlimited amount of labeled “training data” may be available, only a small amount of labeled “adaptation data” drawn from the target distribution can be obtained. The problem of adaptive learning (or adaptation), then, is to learn a new classifier from the unadapted classifier and the limited adaptation data so as to achieve the best possible classification performance on the target distribution.

The goal of this dissertation is to investigate the theory, algorithms, and applications of adaptive learning. Specifically, we propose a Bayesian “fidelity prior” for classifier adaptation, which leads to simple yet principled adaptation strategies for both generative and discriminative models. In the PAC-Bayesian framework, this prior relates the generalization error bound to the KL-divergence between the training and target distributions. Based on the fidelity prior, we develop “regularized adaptation” algorithms for support vector machines and multi-layer perceptrons. We evaluate these algorithms on a vowel classification corpus for speaker adaptation and on an object recognition corpus for lighting condition adaptation. Experiments show that regularized adaptation yields superior performance compared with the other adaptation strategies evaluated.

The theoretical and algorithmic work on adaptive learning was originally motivated by the development of the “Vocal Joystick” (VJ), a voice-based computer interface for individuals with motor impairments. The final part of this dissertation describes the VJ engine architecture, with a focus on the signal processing and pattern recognition modules. We discuss the application of regularized adaptation algorithms to a vowel classifier and a discrete sound recognizer in the VJ, which substantially improved the engine's performance. In addition, we present other machine learning techniques developed for the VJ, including a novel pitch tracking algorithm and an online adaptive filter algorithm.
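
To make the idea of regularized adaptation concrete, the following is a minimal sketch, not the algorithms developed in this dissertation. It assumes that adaptation can be read as minimizing the loss on the adaptation data plus a penalty on the squared distance from the unadapted parameters, and it uses logistic regression as a simple stand-in for the support vector machines and multi-layer perceptrons mentioned above. The function name `adapt_logistic` and the parameter `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def adapt_logistic(w_unadapted, X, y, lam=1.0, steps=500):
    """Illustrative regularized adaptation (an assumption, not the thesis method).

    Minimizes, by gradient descent,
        mean_log_loss(w; X, y) + (lam / 2) * ||w - w_unadapted||^2
    where X (n x d) and y (n,) in {0, 1} are the small adaptation set.
    """
    n = len(y)
    # Step size 1/L, where L upper-bounds the curvature of the objective.
    smoothness = lam + 0.25 * np.linalg.norm(X, 2) ** 2 / n
    lr = 1.0 / smoothness

    w = w_unadapted.astype(float).copy()
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad_loss = X.T @ (p - y) / n          # gradient of the mean log-loss
        grad_penalty = lam * (w - w_unadapted)  # pulls w back toward the unadapted model
        w -= lr * (grad_loss + grad_penalty)
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_adapt = rng.normal(size=(20, 3))                      # tiny adaptation set
    y_adapt = (X_adapt @ np.array([2.0, -1.0, 0.5]) > 0).astype(float)
    w_old = np.zeros(3)                                     # stand-in unadapted weights
    print(adapt_logistic(w_old, X_adapt, y_adapt, lam=1.0))
```

In this sketch, a large `lam` keeps the adapted weights close to the unadapted classifier, which is the safer choice when adaptation data are scarce, while a small `lam` lets the adaptation data dominate.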