

Principal Research Software Design Engineer, Machine Learning and Applied Statistics, Microsoft Research
E-mail: alexeib@microsoft.com ; Alexei.Bocharov@microsoft.com
© 2003-2006, Ludmila Zamiatina
(This is a two-sided iso-area mask, made from a single traditional 6” origami square. One side is my portrait as a young man; the other side is my portrait now. Published with the kind permission of my wife Ludmila.)
I am passionate about all kinds of mathematical algorithms. My background is in pure mathematics, but I had been active in mathematical software development and publishing for more than 10 years before joining Microsoft.
After working for more than five years on the core Algebra and Calculus features of Mathematica® (http://www.wolfram.com), I went on to work as a principal contractor developing an algebraic application that was subsequently published as firmware for the Casio® Algebra FX scientific calculator and the Casio® ClassPad 300 (http://www.calculatorsinc.com/casio/classpad.HTML).
A lot of my code implementing both symbolic and numeric math is running in commercial products.
My current projects are in the areas of Keyword Analysis, Adversarial Problems (spam, phishing) and Data Mining.
Scott Yih and Chris Meek are my principal collaborators in the area of Keyword Analysis.
Algorithms and components from our keyword research platform power selected features and interfaces in these web applications:
Keyword Services Platform (see also Wikipedia:KSP)
Microsoft Dynamics CRM (see also Wikipedia:CRM)
The following paper on the stability of time series forecasting was presented at the Data Mining and Information Engineering 2006 conference and published in its proceedings (“Data Mining VII”, WIT Press, 2006, pp. 141-150):
Stability Analysis of Time Series Forecasting with ART Models
Alexei Bocharov, David Chickering, David Heckerman
In technical terms, cases of long-range forecasting instability are characterized by rapid growth of the mean absolute prediction error with time, which may or may not be accompanied by significant growth of the predicted standard deviation. In practice, the cases of instability where the predicted StDev stays tame are especially misleading, since they can furnish unreliable predictions with few or no visual cues that would mark them as unreliable. The method described in the paper is designed to detect and control long-range forecasting instabilities and to cull the unreliable predictions.
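The distinction between growing error and tame predicted StDev can be illustrated with a small sketch. This is not the method from the paper; it is a naive check under stated assumptions: several back-tested forecast runs are available, the function names are hypothetical, and the ratio threshold `max_ratio` is an arbitrary illustrative choice.

```python
from statistics import mean


def mean_absolute_error_by_step(actuals, forecasts):
    """MAE at each forecast step, averaged over several forecast runs."""
    n_steps = len(forecasts[0])
    return [
        mean(abs(a[h] - f[h]) for a, f in zip(actuals, forecasts))
        for h in range(n_steps)
    ]


def first_unreliable_step(mae, predicted_std, max_ratio=3.0):
    """First step where MAE exceeds max_ratio * predicted StDev, else None.

    max_ratio is a hypothetical threshold, not a value from the paper.
    """
    for h, (err, std) in enumerate(zip(mae, predicted_std)):
        if err > max_ratio * std:
            return h
    return None


# Toy back-test: the error grows quickly while the predicted StDev stays tame.
actuals = [[10.0, 10.0, 10.0, 10.0], [12.0, 12.0, 12.0, 12.0]]
forecasts = [[10.5, 11.0, 14.0, 22.0], [11.5, 13.0, 8.0, 0.0]]
predicted_std = [1.0, 1.0, 1.1, 1.2]

mae = mean_absolute_error_by_step(actuals, forecasts)
cutoff = first_unreliable_step(mae, predicted_std)

# Cull every prediction from the first unreliable step onward.
reliable = forecasts[0][:cutoff] if cutoff is not None else forecasts[0]
print(mae, cutoff, reliable)
```

In this toy data the predicted StDev barely moves, so a plot of the forecast band alone would give no warning; only the comparison against realized error exposes the steps that should be culled.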
Data Mining Reloaded
The two main functions of data mining are classification and prediction (or forecasting). Data mining helps you make sense of those countless gigabytes of raw data stored in databases by finding important patterns and rules present in the data or derived from it. Analysts then use this knowledge to make predictions and recommendations about new or future data. The main business applications of data mining are learning who your customers are and what they need, understanding where the sales are coming from and what factors affect them, fashioning marketing strategies, and predicting future business indicators.
With the release of SQL Server 2000, Microsoft rebranded OLAP Services as Analysis Services to reflect the addition of newly developed data-mining capabilities. The data-mining toolset in SQL Server 2000 included only two classical analysis algorithms (Clustering and Decision Trees), a special-purpose data-mining management and query-expression language named DMX, and limited client-side controls, viewers, and development tools.
SQL Server 2005 Analysis Services comes with a greatly expanded set of data-mining methods and an array of completely new client-side analysis and development tools designed to cover most common business intelligence (BI) needs. The Business Intelligence Framework in SQL Server 2005 offers a new data-mining experience for analysts and developers alike.
A version of this paper has been reprinted as part of the book A Jump Start to SQL Server BI.
A wealth of data mining news briefs, tutorials, scenario discussions and tips is being posted at the ever-expanding http://www.sqlserverdatamining.com.
Computational finance is my newfound hobby. I’ll start here by making fun of various trading myths in a series of short essays and papers.
Here is one: "Key Reversal Myth"
Here is another: "Covered Call vs Naked Put Dilemma"
No, I am not a “Save the Whales” activist (yet) – I simply like and admire the whales. Some pictures are available through either of the links below (but prepare for a 40MB download).
Whales/Whales.mht Whales\Whales.ppt © Alex Bocharov, 2004