Graph Cuts without Eigenvectors

Graph clustering—clustering the nodes of a graph—is a fundamental problem arising in many machine learning and data mining applications. The last few years have seen a surge of interest in spectral clustering methods, which use eigenvectors of the graph adjacency matrix (or a matrix derived from it) to optimize one of several graph cut objective functions. These methods are powerful and theoretically well-motivated; however, computing eigenvectors of a large matrix is computationally expensive and carries significant memory overhead.

I will describe my recent research on optimizing spectral clustering objectives without costly eigenvector computation. After deriving a mathematical connection between spectral clustering objectives and weighted kernel k-means, I will propose a simple iterative algorithm for optimizing various graph cut objectives, and then embed it in a multilevel framework. The resulting algorithm outperforms state-of-the-art spectral clustering algorithms in terms of quality (graph cut value), speed (up to 2000 times faster on graphs under a million nodes), and memory usage (requiring memory on the order of the size of the input graph). I will also discuss applications of this algorithm to image segmentation, protein network clustering, and social network analysis.
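To make the connection concrete, here is a minimal sketch of weighted kernel k-means applied to a graph. It uses the construction from the published Dhillon–Guan–Kulis work, where normalized cut corresponds to node weights equal to the degrees and kernel K = σD⁻¹ + D⁻¹AD⁻¹ (σ shifts K toward positive semidefiniteness). The function and variable names below are illustrative, not taken from the talk, and this is a toy implementation, not the multilevel algorithm itself.

```python
import numpy as np

def weighted_kernel_kmeans(K, w, labels, n_iter=20):
    """Batch weighted kernel k-means (illustrative sketch).
    Distances to cluster means are computed purely via the kernel:
    ||phi(x_i) - m_c||^2 = K_ii - 2*sum_j w_j K_ij / s_c
                         + sum_{j,l} w_j w_l K_jl / s_c^2,
    where the sums run over cluster c and s_c = sum_j w_j."""
    n, k = K.shape[0], labels.max() + 1
    diagK = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            idx = labels == c
            sw = w[idx].sum()
            if sw == 0:  # empty cluster: leave its distances at inf
                continue
            second = K[:, idx] @ w[idx] / sw
            third = w[idx] @ K[np.ix_(idx, idx)] @ w[idx] / sw**2
            dist[:, c] = diagK - 2 * second + third
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):  # converged
            break
        labels = new
    return labels

# Toy graph: two triangles joined by a single edge (nodes 0-2 and 3-5).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
d = A.sum(axis=1)

# Normalized-cut setup: weights w_i = d_i, kernel sigma*D^-1 + D^-1 A D^-1.
sigma = 1.0
Dinv = np.diag(1.0 / d)
K = sigma * Dinv + Dinv @ A @ Dinv

# Start from a deliberately wrong assignment of node 0.
labels = weighted_kernel_kmeans(K, d, np.array([1, 0, 0, 1, 1, 1]))
print(labels)  # -> [0 0 0 1 1 1]: the two triangles separate
```

Each iteration costs a few sparse matrix-vector products rather than an eigendecomposition, which is the source of the speed and memory advantage the abstract describes; kernel k-means only guarantees a local optimum, which is where the multilevel strategy comes in.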

This is joint work with Inderjit Dhillon and Yuqiang Guan.

Speaker Details

Brian Kulis is a PhD student at the University of Texas at Austin working on machine learning and data mining. He received his BA in computer science and mathematics from Cornell University in 2003. His research focuses on large-scale optimization, kernel methods, and spectral methods. This summer he is working with Arun Surendran as an intern in the Knowledge Tools group.

Speakers:
Brian Kulis
Affiliation:
University of Texas at Austin