Roger S Barga, Jaliya Ekanayake, and Wei Lu
7 March 2012
Spreadsheets are established data collection and analysis tools in business, technical computing, and academic research. Excel, for example, offers an attractive user interface, provides an easy-to-use data entry model, and offers substantial interactivity for what-if analysis. However, spreadsheets and other common client applications do not offer scalable computation for large-scale data analytics and exploration. Increasingly, researchers in domains ranging from the social sciences to the environmental sciences are faced with a deluge of data, often sitting in spreadsheets such as Excel or in other client applications, and they lack a convenient way to explore the data, to find related datasets, or to invoke scalable analytical models over the data. To address these limitations, we have developed a cloud data analytics service based on Daytona, an iterative MapReduce runtime optimized for data analytics. In our model, Excel and other existing client applications provide the data entry and user interaction surfaces, Daytona provides a scalable runtime on the cloud for data analytics, and our service seamlessly bridges the gap between the client and the cloud. Any analyst can use our data analytics service to discover and import data from the cloud, invoke cloud-scale data analytics algorithms to extract information from large datasets, invoke data visualization, and then store the data back to the cloud, all through a spreadsheet or other client application they are already familiar with.
In Proceedings of the International Conference on Data Engineering (ICDE)
Publisher: International Conference on Data Engineering