Microsoft Research India, Bangalore, India
December 13 and 14, 2010
This workshop featured an introduction to DryadLINQ and its supporting technologies, as well as a tutorial on DryadLINQ, presented by several of the inventors and primary developers of the system. The introduction and tutorial were followed by a hands-on lab in which students had the opportunity to write distributed programs and execute them on clusters of hundreds of computers.
DryadLINQ is a system that makes it easy to compute over large-scale datasets. It is built on Microsoft .NET language technologies and HPC cluster-management software. Developers use familiar high-level languages such as C# and development tools such as Visual Studio, and write in a simple, imperative programming style. The DryadLINQ system automatically parallelizes the resulting code and distributes its execution over all the computing cores in an HPC cluster installation. DryadLINQ is ideally suited for large-scale scientific applications (for example, fluid dynamics and astronomy), infrastructure planning and development (for example, hydrology modeling or weather analysis), data mining (for example, census processing and customer database management), and machine learning and inference (for example, forecasting and computer vision).
- Dennis Fetterly – Microsoft Research Silicon Valley
- Michael Isard – Microsoft Research Silicon Valley
- Frank McSherry – Microsoft Research Silicon Valley
- Chandu Thekkath – Microsoft Research Silicon Valley
- Yuan Yu – Microsoft Research Silicon Valley
Downloads (Documents, code and slides)
- DryadLINQ User Manual
- Yuan Yu - Visual Studio and C# and LINQ
- Michael Isard - From LINQ to DryadLINQ to Dryad
- Frank McSherry - Large-scale machine learning
- Solution for large-scale machine learning lab
- Dennis Fetterly - Large-scale Data Processing with DryadLINQ