I currently work as an Architect and group lead in the eXtreme Computing Group (XCG), a new organization in Microsoft Research established to push the boundaries of computing. My team is responsible for developing core tools and services for compute- and data-intensive research. Our goal is to make simple yet powerful tools available that any researcher can use to extract insights by mining and combining diverse data sets. Specific examples include NCBI BLAST on Azure, the Daytona iterative MapReduce runtime for Windows Azure, and Excel DataScope. Our team also offers tutorials on cloud computing, identifies best practices for deploying research applications and data collections in the cloud, such as the AzureScope service, and serves as a thought leader on the application of cloud computing for research. I am a frequent public speaker and Microsoft spokesperson on the topic.
Prior to joining XCG I worked as a Principal Architect for External Research (MSR), where I founded and led the Advanced Research Services and Tools (ARTS) team. The ARTS team was responsible for developing innovative tools and services using Microsoft products and technology to accelerate research, such as the Trident Scientific Workflow Workbench, the Research Information Centre VRE, and Dryad/DryadLINQ on HPCS. Our team also provided strategic and tactical hands-on technological leadership to projects across External Research’s international engagements.
I joined Microsoft in 1997 as a Researcher in the Database Group of Microsoft Research, where I was involved in a number of systems research projects and product development efforts in database systems, application recovery, workflow, and stream processing. Throughout my career at Microsoft I have enjoyed developing ideas from basic research, through proof-of-concept prototypes, to incubation efforts in product groups. Microsoft Research, and the eXtreme Computing Group in particular, is a truly unique and rewarding organization in which to work.
Iterative MapReduce Research on Azure, demonstration at SC'11 in Seattle WA Nov. 2011 (PDF)
Cloud Computing at Scale, invited talk at the 2011 eXtremely Large Databases Conference & Workshop (XLDB'11) held at SLAC in Palo Alto CA, October 2011 (PPT show)
Windows Azure for Research, Tutorial presented at the IEEE eScience'10 Conference in Brisbane Australia and invited talk at NICTA in Sydney, December 2010 (PPT show).
NCBI BLAST on Windows Azure, Presented at SuperComputing 2010 (SC10) in New Orleans LA, November 2010 (PPT show).
Data Laden Clouds, invited keynote at MTAGS'10 Workshop at SC'10 (PPT show).
Azure for Science Research: From Desktop to the Cloud, Invited Presentation, Microsoft Research Faculty Summit 2010, Redmond WA, July 2010 (PDF).
Cloud Computing for Science, Keynote Presentation, 22nd International Conference on Scientific and Statistical Database Management, SSDBM'10, Heidelberg, Germany, June 2010 (PDF).
Emerging Trends and Converging Technologies in Data Intensive Computing, Invited Keynote, National e-Science Centre Workshop, Edinburgh, March 2010.
Data Intensive Research in the Cloud, Invited Talk, National e-Science Centre Workshop, Edinburgh, March 2009 (PDF).
Lessons from Building the Research Information Centre (RIC), Presented at the UK eScience All Hands Meeting (PDF).
Opportunities and Challenges in Enabling Effective eScience, Invited Keynote, presented at ADVCOMP’08.
Trends in Data Storage and Management, Invited Talk, Oxford University, 2008.
Scientific Workflow for Project Neptune, Invited Talk, ETH MICCS 2008 (PPT show).
Recent Professional Service
- IEEE eScience 2012 organizing committee.
- 12th IEEE/ACM International Conference on Grid Computing, Vice Chair of cloud computing and virtualization (Grid 2012).
- 24th International Conference on Scientific and Statistical Database Systems (SSDBM 2012).
- 28th International Conference on Data Engineering (ICDE 2012).
- Co-Chair of the 2nd Int'l Workshop on Data Intensive Computing in the Cloud, DataCloud-SC 2011.
- XLDB (eXtremely Large Database Systems) - Europe 2011.
- DataCloud 2011 - First International Workshop on Data Intensive Computing in the Clouds, in conjunction with IEEE IPDPS'11.
- DIDC 2011 - Fourth International Workshop on Data Intensive Distributed Computing, in conjunction with HPDC 2011.
- 20th International ACM Symposium on High Performance Parallel and Distributed Computing (HPDC 2011).
- 27th International Conference on Data Engineering (ICDE 2011)
- Challenges of Large Applications in Distributed Environments (CLADE 2010)
- ACM ScienceCLOUD 2010
- 4th Int’l Workshop on Workflow systems in e-Science (WSES09)
- International Conference on Very Large Database Systems (VLDB) 2008, 2006, and 2001
- ACM Special Interest Group on Management of Data (SIGMOD) 2008 and 2002.
- IEEE Scientific Workflow 2011, 2010, 2009 and 2008.
- IEEE Advances in Computing (ADVCOMP) 2009 and 2008.
- IEEE Services Computing Conference (SCC) 2007.
- IEEE Distributed Event Based Systems (DEBS) 2007.
- Roger S. Barga, Jaliya Ekanayake, and Wei Lu, Project Daytona: Data Analytics as a Cloud Service, in Proceedings of the International Conference on Data Engineering (ICDE), 7 March 2012.
- Jaliya Ekanayake, Jared Jackson, Wei Lu, Roger Barga, and Atilla Soner Balkir, A Scalable Communication Runtime for Clouds, in Proceedings IEEE Cloud 2011, The 4th International Conference on Cloud Computing, IEEE Computer Society, 4 July 2011.
- Ankur Dave, Roger Barga, Wei Lu, and Jared Jackson, CloudClustering: Toward an iterative data processing pattern on the cloud, in Proceedings of IEEE DataCloud 2011, IEEE, 16 June 2011.
- Badrish Chandramouli, Jonathan Goldstein, Roger Barga, Mirek Riedewald, and Ivo Santos, Accurate Latency Estimation in a Distributed Event Processing System, in 27th International Conference on Data Engineering (ICDE '11), IEEE, April 2011.
- Roger Barga, Dennis Gannon, and Daniel Reed, The Client and the Cloud: Democratizing Research Computing, in IEEE Internet Computing, IEEE Computer Society, 2011.
- Roger Barga, Bill Howe, David Beck, Stuart Bowers, William Dobyns, Winston Haynes, Roger Higdon, Chris Howard, Christian Roth, Elizabeth Stewart, Dean Welch, and Eugene Kolker, Bioinformatics and Data-Intensive Scientific Discovery in the Beginning of the 21st Century, in OMICS A Journal of Integrative Biology, Mary Ann Liebert, Inc., 2011.
- Wei Lu, Jared Jackson, Jaliya Ekanayake, and Roger Barga, Performing Large Science Experiments on Azure: Pitfalls and Solutions, in Proceedings of the 2nd IEEE Int'l Conference on Cloud Computing Technology and Science, IEEE Computer Society, 30 November 2010.
- Xiaohong Qiu, Jaliya Ekanayake, Scott Beason, Thilina Gunarathne, Geoffrey Fox, Roger Barga, and Dennis Gannon, Cloud Technologies for Bioinformatics Applications, in IEEE Transactions on Parallel and Distributed Systems, TPDSSI-2010-01-0021, IEEE, September 2010.
- Roger Barga, Yogesh Simmhan, Eran Chinthaka, Satya Sahoo, Jared Jackson, and Nelson Araujo, Provenance for Scientific Workflows: Towards Reproducible Research, in Bulletin of the Technical Committee on Data Engineering, IEEE Computer Society, September 2010.
- Badrish Chandramouli, Jonathan Goldstein, Roger Barga, Mirek Riedewald, and Ivo Santos, Accurate Latency Estimation in a Distributed Event Processing System, no. MSR-TR-2010-146, July 2010.
- Keith Crochow, Bill Howe, Mark Stoermer, Roger Barga, and Ed Lazowska, Client + Cloud: Evaluating Seamless Architectures for Visual Data Analytics in the Ocean Sciences, in Proceedings of the 22nd International Conference on Scientific and Statistical Database Management, Springer Verlag, 28 June 2010.
- Wei Lu, Jared Jackson, and Roger Barga, AzureBlast: A Case Study of Developing Science Applications on the Cloud, in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
- Eran Chinthaka Withana, Beth Plale, Roger Barga, and Nelson Araujo, Versioning for Workflow Evolution, in Proceedings of The Third International Workshop on Data Intensive Distributed Computing, Association for Computing Machinery, Inc., 21 June 2010.
- Yogesh Simmhan and Roger Barga, Analysis of Approaches for Supporting the Open Provenance Model: A Case Study of the Trident Workflow Workbench, in Future Generation Computer Systems - In Submission, Elsevier, 2010.
- Yogesh Simmhan, Catharine van Ingen, Roger Barga, Alex Szalay, and Jim Heasley, Building Reliable Data Pipelines for Managing Community Data using Scientific Workflows, in IEEE eScience Conference, IEEE, 9 December 2009.
- Jaliya Ekanayake, Atilla Soner Balkir, Christophe Poulain, Nelson Araujo, Roger Barga, Thilina Gunarathne, and Geoffrey Fox, DryadLINQ for Scientific Analyses, 8 December 2009.
- Nelson Araujo, Roger Barga, Eran Chinthaka, and Beth Plale, Workflow Evolution: Tracing Workflows Through Time, 7 December 2009.
- John R. Delaney and Roger S. Barga, Observing the Oceans - A 2020 Vision for Ocean Science, in The Fourth Paradigm: Data Intensive Scientific Discovery, Microsoft Research, 22 November 2009.
- Yogesh Simmhan, Roger Barga, Catharine van Ingen, Ed Lazowska, and Alex Szalay, Building the Trident Scientific Workflow Workbench for Data Management in the Cloud, in International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP), IEEE, 11 October 2009.
- Yogesh Simmhan, Catharine van Ingen, Roger Barga, Alex Szalay, and Jim Heasley, Reliable Management of Community Data Pipelines using Scientific Workflows, no. MSR-TR-2009-125, 15 September 2009.
- Yogesh Simmhan, Maria Nieto-Santisteban, Roger Barga, Tamas Budavari, Laszlo Dobos, Nolan Li, Michael Shipway, Alexander S. Szalay, Ani Thakar, Jan Vandenberg, Alainna Wonders, Sue Werner, Richard Wilton, Dan Fay, Michael Thomassy, Catharine van Ingen, Jim Heasley, and Conrad Holmberg, GrayWulf: Scalable Software Architecture for Data Intensive Computing, in Hawaii International Conference on System Sciences (HICSS), IEEE Computer Society, January 2009.
- Nelson Araujo, Roger Barga, Dean Guo, Jared Jackson, Yogesh Simmhan, and N. Gautam, The Trident Scientific Workflow Workbench, 7 December 2008.
- Maria Nieto-Santisteban, Yogesh Simmhan, Roger Barga, Laszlo Dobos, Jim Heasley, Conrad Holmberg, Nolan Li, Michael Shipway, Alexander S. Szalay, Catharine van Ingen, and Sue Werner, Pan-STARRS: Learning to Ride the Data Tsunami, in Microsoft eScience Workshop, December 2008.
- Yogesh Simmhan, Roger Barga, Catharine van Ingen, Ed Lazowska, and Alex Szalay, On Building Scientific Workflow Systems for Data Management in the Cloud, in IEEE eScience Conference, December 2008.
- Roger Barga, Jared Jackson, Nelson Araujo, Dean Guo, Nitin Gautam, and Yogesh Simmhan, The Trident Scientific Workflow Workbench, in IEEE eScience Conference, IEEE, December 2008.