Connecting Deep Learning Features to Log-Linear Models

  • Li Deng

Published by MIT Press | 2015

A log-linear model by itself is a shallow architecture, given its fixed, nonadaptive, human-engineered feature functions, but its flexibility in using those feature functions allows it to exploit diverse high-level features computed automatically by deep learning systems. We propose and explore a paradigm of connecting deep learning features as inputs to log-linear models which, in combination with the feature hierarchy, form a powerful deep classifier. Three case studies are provided in this chapter to instantiate this paradigm. First, deep stacking networks and their kernel version are used to provide deep learning features for a static log-linear model, the softmax classifier or maximum-entropy model. Second, deep-neural-network features are extracted and fed to a sequential log-linear model, the conditional random field. Third, a log-linear model is used as a stacking-based ensemble learning machine to integrate the outputs of several deep learning systems. The effectiveness of all three types of deep classifiers is verified in experiments. Finally, compared with the traditional log-linear modeling approach, which relies on human feature engineering, we point out one main weakness of the new framework: its inability to naturally embed domain knowledge. Future directions are discussed for overcoming this weakness by integrating deep neural networks with deep generative models.
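The first case study's core idea, a softmax (maximum-entropy) classifier stacked on top of fixed deep-learning features, can be sketched as follows. This is a minimal illustration, not the chapter's actual system: the "deep features" are simulated here with class-dependent Gaussian clusters standing in for the output of a trained deep stacking network, and all dimensions and hyperparameters are arbitrary choices for the sketch.

```python
# Sketch: a log-linear (softmax / maximum-entropy) classifier trained on
# fixed feature vectors. In the chapter's paradigm these features would be
# produced by a deep network; here they are simulated (hypothetical data).
import numpy as np

rng = np.random.default_rng(0)

n, d, k = 300, 16, 3                      # samples, feature dim, classes
y = rng.integers(0, k, size=n)            # class labels
# Simulated "deep" features: one Gaussian cluster per class.
centers = rng.normal(size=(k, d))
X = centers[y] + 0.5 * rng.normal(size=(n, d))

W = np.zeros((d, k))                      # log-linear weights
b = np.zeros(k)                           # biases

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

onehot = np.eye(k)[y]
for _ in range(200):                      # batch gradient descent on the
    p = softmax(X @ W + b)                # negative log-likelihood
    g = p - onehot                        # gradient of cross-entropy wrt logits
    W -= 0.1 * (X.T @ g) / n
    b -= 0.1 * g.mean(axis=0)

acc = (softmax(X @ W + b).argmax(axis=1) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Only the linear classifier is trained here; the feature extractor stays fixed, which is exactly the division of labor the paradigm describes. The sequential variant of the same idea replaces the per-sample softmax with a conditional random field over label sequences.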