

Senior Research Software Design Engineer, Machine Learning and Applied Statistics, Microsoft Research
E-mail: alexeib@microsoft.com ; Alexei.Bocharov@microsoft.com
© 2003-2006, Ludmila Zamiatina
(This is a two-sided iso-area mask, made from a single traditional 6” origami square. One side is my portrait as a young man; the other side is my portrait now. Published with the kind permission of my wife Ludmila.)
I am passionate about all kinds of mathematical algorithms. My background is in pure mathematics, but I had been active in mathematical software development and publishing for more than 10 years before joining Microsoft.
After working for more than five years on the core Algebra and Calculus features of Mathematica® (http://www.wolfram.com), I went on to work as a principal contractor developing an algebraic application that was subsequently published as firmware for the Casio® Algebra FX scientific calculator and the Casio® ClassPad 300 (http://www.calculatorsinc.com/casio/classpad.HTML).
A lot of my code implementing both symbolic and numeric math is running in commercial products.
My current projects are in the areas of Keyword Analysis, Adversarial Problems (spam, phishing) and Data Mining.
The following paper on the stability of time series forecasting was presented at the Data Mining and Information Engineering 2006 conference and published in its proceedings (“Data Mining VII”, WIT Press, 2006, pp. 141-150):
Stability Analysis of Time Series Forecasting with ART Models
Alexei Bocharov, David Chickering, David Heckerman
In technical terms, cases of long-range forecasting instability are characterized by rapid growth of the mean absolute prediction error with time, which may or may not be accompanied by significant growth of the predicted standard deviation. In practice, the cases of instability where the predicted StDev stays tame are especially misleading, since they can furnish unreliable predictions with few or no visual cues that would mark them as unreliable. The method described in the paper is designed to detect and control long-range forecasting instabilities and to cull the unreliable predictions.
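The two symptoms described above can be illustrated with a small Python sketch. This is not the method from the paper; it is only a hypothetical check, assuming that several forecast runs are available together with the actual values and a predicted StDev for each horizon step. The function names and the thresholds `growth_limit` and `stdev_limit` are illustrative assumptions.

```python
from statistics import mean


def horizon_mae(actuals, forecasts):
    """Mean absolute error at each horizon step, averaged over forecast runs."""
    steps = len(forecasts[0])
    return [
        mean(abs(a[h] - f[h]) for a, f in zip(actuals, forecasts))
        for h in range(steps)
    ]


def flag_instability(mae, stdev, growth_limit=5.0, stdev_limit=2.0):
    """Return (unstable, misleading) for one forecast horizon.

    unstable:   the error at the last step is more than growth_limit times
                the error at the first step (hypothetical threshold).
    misleading: unstable, yet the predicted StDev grew by less than
                stdev_limit times, so it gives no visual warning.
    """
    error_growth = mae[-1] / max(mae[0], 1e-12)
    stdev_growth = stdev[-1] / max(stdev[0], 1e-12)
    unstable = error_growth > growth_limit
    misleading = unstable and stdev_growth < stdev_limit
    return unstable, misleading


# Made-up illustration data: two forecast runs over four horizon steps.
actuals = [[10, 10, 10, 10], [12, 12, 12, 12]]
forecasts = [[11, 12, 16, 26], [11, 10, 6, -4]]
predicted_stdev = [1.0, 1.1, 1.2, 1.3]

mae = horizon_mae(actuals, forecasts)
unstable, misleading = flag_instability(mae, predicted_stdev)
```

With this made-up data the error grows sixteenfold across the horizon while the predicted StDev barely moves, which is the misleading case the abstract warns about.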
Data Mining Reloaded
The two main
functions of data mining are classification and prediction (or
forecasting). Data mining helps you make sense of those countless gigabytes
of raw data stored in databases by finding important patterns and rules
present in the data or derived from it. Analysts then use this knowledge to
make predictions and recommendations about new or future data. The main
business applications of data mining are learning who your
customers are and what they need, understanding where the sales are coming
from and what factors affect them, fashioning marketing strategies, and
predicting future business indicators. With the
release of SQL Server 2000, Microsoft rebranded
OLAP Services as Analysis Services to reflect the addition of newly developed
data-mining capabilities. The data-mining toolset in SQL Server 2000
included only two classical analysis algorithms (Clustering and Decision
Trees), a special-purpose data-mining management and query-expression
language named DMX, and limited client-side controls, viewers, and
development tools. SQL Server 2005
Analysis Services comes with a greatly expanded set of data-mining methods
and an array of completely new client-side analysis and development tools
designed to cover most common business intelligence
(BI) needs. The Business Intelligence Framework in SQL Server 2005 offers a
new data-mining experience for analysts and developers alike.
A version of this paper has been reprinted as part of the book A Jump Start to SQL Server BI.
A wealth of Data Mining news briefs, tutorials, scenario discussions and tips is being posted at the ever-expanding http://www.sqlserverdatamining.com.
No, I am not a “Save the Whales” activist (yet) – I simply like and admire the whales. Some pictures are available through either of the links below (but prepare for a 40MB download).
Whales/Whales.mht Whales\Whales.ppt © Alex Bocharov, 2004