Strategies for scaling up data mining algorithms

Today, data is generated by and collected from a wide range of sources, such as mechanical systems, sensor-network-based Earth science systems, hardware infrastructures, and information networks. Many existing data analysis algorithms do not scale to such large data sets. In this talk, I will present some of our work on speeding up existing data mining algorithms so that they scale to very large data sets. I will first describe how outlier detection can be done efficiently using an indexing strategy and parallel computing on clusters. This will be followed by a discussion of a general framework for checking model fidelity in very large, loosely coupled distributed systems and of how the framework can be adapted for system health monitoring.

Speaker Details

Kanishka Bhaduri is a research scientist in the Intelligent Data Understanding group at NASA Ames Research Center. He received his PhD from the University of Maryland Baltimore County (2008) in the area of scalable and distributed data mining. His research interests include distributed and parallel data mining and machine learning, text mining, web analytics, and distributed systems. He has over 20 peer-reviewed conference and journal publications and has won many awards, including the NASA Group Achievement Award for the Toyota Unintended Acceleration study, the NASA Aeronautics Associate Administrator Exemplary Performance Award for Technology and Innovation, and the NASA Ames Contractor Council Excellence Award. He has been a visiting research scientist in the Computer Science department at Technische Universität Dortmund, Germany. Kanishka regularly serves as a PC member for the SIGKDD, IEEE ICDM, SIAM Data Mining, ASONAM, and PAKDD conferences and reviews articles for journals such as IEEE TKDE, ACM TKDD, IEEE SMC, and Data Mining and Knowledge Discovery. He recently served as the Associate Editor of a special issue of DMKD on Data Mining for a Sustainable World. More information about him can be found at http://ti.arc.nasa.gov/profile/kbhaduri/.

Date:
Speakers:
Kanishka Bhaduri
Affiliation:
NASA Ames Research Center