Jing Gao, Bolin Ding, Wei Fan, Jiawei Han, and Philip S. Yu
2008
Classification is an important data analysis tool that uses a model built from historical data to predict class labels for new observations. A growing number of applications produce data streams rather than finite stored data sets, and these streams pose a challenge for traditional classification algorithms. Concept drift and skewed class distributions, two common properties of data stream applications, make learning from streams difficult. The authors propose a new approach to classifying skewed data streams: an ensemble of models built to match the distribution over under-samples of negative examples and repeated samples of positive examples.
In IEEE Internet Computing
Publisher IEEE Computer Society
| Type | Article |
| URL | http://www.computer.org/portal/web/csdl/doi/10.1109/MIC.2008.119 |
| Pages | 37-49 |
| Volume | 12 |
| Number | 6 |