Starfish: A MADDER and Self-tuning System for Big Data Analytics
Timely and cost-effective analytics over “big data” is now a key ingredient for success in businesses and scientific disciplines. The Hadoop platform (consisting of an extensible MapReduce execution engine, pluggable distributed storage engines, and a range of procedural to declarative interfaces) is a popular choice for big data analytics. Hadoop’s performance out of the box can be poor, causing suboptimal use of resources, time, and money. Unfortunately, practitioners of big data analytics such as business analysts, computational scientists, and researchers often lack the expertise to tune the Hadoop platform for good performance.
I will introduce Starfish, a self-tuning system for big data analytics. Starfish builds on Hadoop while adapting to system workloads and user needs to provide good performance automatically, without any need for users to understand and manipulate the many tuning knobs in the Hadoop platform. The novelty in Starfish’s approach comes from how it focuses simultaneously on different workload granularities (overall workload, workflows, and jobs, both procedural and declarative) as well as across various decision points (provisioning, optimization, scheduling, and data layout).
Starfish is available at: http://www.cs.duke.edu/starfish
Speaker Details
Herodotos Herodotou is a Ph.D. Candidate in the Department of Computer Science at Duke University, expecting to graduate in May 2012. He received his M.S. degree from Duke in 2009. He was a recipient of the Steele Endowed Fellowship at Duke in 2008. His research interests are in large-scale Data Processing Systems and Relational Database Systems. In particular, his work focuses on ease-of-use, manageability, and automated tuning of both centralized and distributed data-intensive computing systems. In addition, he is interested in applying database techniques in other areas like scientific computing, bioinformatics, and numerical analysis.
- Series: Microsoft Research Talks
- Speakers: Herodotos Herodotou
- Affiliation: Duke University