Yi Wei

I am a post-doc researcher in the Constraint Reasoning group at Microsoft Research Cambridge. My research interests are dynamic and static program analysis, software testing, program correction, specification mining, and machine learning. I finished my PhD in 2012 at the Chair of Software Engineering, ETH Zurich, under the supervision of Prof. Bertrand Meyer. Bertrand is a really cool person. The first words he said to me when I started as a PhD student were "Have fun!", and I did have a lot of fun during my PhD working on automated software testing and correction at the Chair of Software Engineering.

At Microsoft Research

I'm working on two projects, the Bing Code Search project and the PowerPoint assistant project.

The Bing Code Search project (you can try it online) suggests ready-to-use code snippets in Visual Studio from users’ natural-language queries. I started this project when I was an intern at MSR Cambridge in 2011. The project was selected for presentation at TechFest’12, Microsoft Summit’12 and TechFest’14. In 2012, I entered the VentureKick startup competition with this idea and won venture capital support in Switzerland. Now, at Microsoft, working together with the VS platform team, we are about to release a Visual Studio add-in in DevLabs.


The Bing Code Search Add-in for Visual Studio 2013

In Bing Code Search, we (together with Mukund Raghothaman) used a Hidden Markov Model to learn patterns from existing code repositories; IBM translation Model 1 to learn a mapping from natural-language words to C# tokens; a Bayesian network to decide the most likely control flow and variable assignments; Roslyn to perform data-flow analysis; and FastRank to train a code ranker that outperforms Bing’s default ranking for code snippets. Because the data is large in scale, we used COSMOS to process it, Aether to run model training, and Azure (websites, worker roles, SQL databases, tables, queues, traffic manager) to host the code suggestion services.
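To give a feel for the word-to-token mapping step, here is a minimal sketch of IBM Model 1 training with EM on a toy corpus. It is an illustration only, not the code used in Bing Code Search: the function names, the example pairs and the omission of the usual NULL word are all assumptions made for brevity.

```python
from collections import defaultdict

# Hypothetical toy corpus: (natural-language words, code tokens) pairs.
PAIRS = [
    (["open", "file"], ["File", "Open"]),
    (["open", "socket"], ["Socket", "Open"]),
    (["close", "socket"], ["Socket", "Close"]),
]

def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(token, word)] = P(token | word) with EM (no NULL word)."""
    token_vocab = {tok for _, tokens in pairs for tok in tokens}
    # Start from a uniform distribution over code tokens.
    t = defaultdict(lambda: 1.0 / len(token_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected (token, word) co-occurrences
        total = defaultdict(float)  # expected occurrences of each word
        # E-step: share each token among the words of its query.
        for words, tokens in pairs:
            for tok in tokens:
                norm = sum(t[(tok, w)] for w in words)
                for w in words:
                    frac = t[(tok, w)] / norm
                    count[(tok, w)] += frac
                    total[w] += frac
        # M-step: renormalise the expected counts per word.
        t = defaultdict(float, {(tok, w): c / total[w]
                                for (tok, w), c in count.items()})
    return t

def best_token(t, word):
    """Return the code token with the highest probability for a word."""
    candidates = {tok: p for (tok, w), p in t.items() if w == word}
    return max(candidates, key=candidates.get)
```

Calling `best_token(train_ibm_model1(PAIRS), "socket")` picks the token that co-occurs most consistently with that word. A real system would train on far more data and would typically add smoothing and a NULL source word.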

FlashPoint is a PowerPoint add-in that suggests new material to people building slides. FlashPoint provides a zero-query experience: the algorithm analyses the user’s current slide to decide whether to suggest pictures, suggest wiki definitions, or perform fact checking. The project was selected for presentation at TechFest’14.

In my free time

I like programming a lot and I like to analyze stocks (although the Chinese stock market in recent years does not justify my love). In my free time, I'm the main developer of two websites that I started, Sogule and CoreTX. Both projects analyze the Chinese stock market: Sogule does fundamental analysis and CoreTX does technical analysis.

In Sogule, I developed sophisticated scraping techniques to collect information from more than 10 different sources. I applied various classification techniques to predict a listed company’s performance from its fundamental data, such as balance sheets. I also built a simple sentiment analyzer to classify company news, only to realize that doing the opposite of what the news suggests gives you 54% odds of winning. In other words, the news misleads you into the wrong decisions!


Main page of Sogule

In CoreTX, the main task is to find patterns in stock price charts. I applied (piecewise) linear regression to simplify price charts, classification to identify price change signals and remove potential fake signals, and clustering to group stocks for portfolio analysis. I wrote various simulators to estimate trading strategy performance. I also wrote a framework to conduct real-time futures trading.
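As an illustration of how piecewise linear regression can simplify a price chart, here is a minimal sketch that fits a least-squares line and splits it recursively at the worst-fitting point. The top-down splitting rule, the function names and the `tol` parameter are assumptions for this example, not necessarily what CoreTX does.

```python
def fit_line(ys, lo, hi):
    """Least-squares line through the points (i, ys[i]) for lo <= i <= hi."""
    n = hi - lo + 1
    xs = range(lo, hi + 1)
    mean_x = sum(xs) / n
    mean_y = sum(ys[lo:hi + 1]) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ys[x] - mean_y) for x in xs)
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def segment(ys, lo, hi, tol):
    """Cover ys[lo..hi] with (lo, hi, slope, intercept) line segments.

    A segment is accepted once every point lies within `tol` of its
    fitted line; otherwise it is split at the point of largest error.
    """
    slope, intercept = fit_line(ys, lo, hi)
    residuals = [abs(ys[i] - (slope * i + intercept))
                 for i in range(lo, hi + 1)]
    worst = max(residuals)
    if worst <= tol or hi - lo < 2:
        return [(lo, hi, slope, intercept)]
    # Keep the split strictly inside the range so both halves shrink.
    split = min(max(lo + residuals.index(worst), lo + 1), hi - 1)
    return segment(ys, lo, split, tol) + segment(ys, split, hi, tol)
```

For a series such as `[1, 2, 3, 4, 5, 4, 3, 2, 1]`, `segment(prices, 0, len(prices) - 1, tol=0.1)` reduces the chart to one rising and one falling segment, which is the kind of simplified shape that later pattern detection can work on.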


An identified "Flag" shape in technical analysis, indicating the price may go up


Previously, I was a software engineer at Eiffel Software. I developed a library to launch external processes for the Eiffel language; improved the project navigation system for the EiffelStudio development environment; and developed the new metrics tool to calculate software product metrics. All of this code is in the official release of EiffelStudio.