Nikhil Rasiwasia, Dhruv Mahajan, Vijay Mahadevan, and Gaurav Aggarwal
In this paper we present cluster canonical correlation analysis (cluster-CCA) for joint dimensionality reduction of two sets of data points. Unlike the standard setting with pairwise correspondences between data points, in our problem each set is partitioned into multiple clusters or classes, and the class labels define the correspondences between the sets. Cluster-CCA learns discriminant low-dimensional representations that maximize the correlation between the two sets while segregating the different classes. Furthermore, we present a kernel extension, kernel cluster canonical correlation analysis (cluster-KCCA), that extends cluster-CCA to account for non-linear relationships. Cluster-(K)CCA is shown to be computationally efficient, with complexity similar to that of standard (K)CCA. Experimental evaluation on benchmark datasets shows that cluster-(K)CCA achieves state-of-the-art performance on cross-modal retrieval tasks.
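To make the idea of class-defined correspondences concrete, the following is a minimal sketch, not the authors' implementation. It assumes one plausible reading of the abstract: every point in one set is treated as paired with every point of the same class in the other set, and ordinary linear CCA is then solved on the covariances accumulated over those pairs. The function name `cluster_cca_sketch`, the ridge term `reg`, and the synthetic data are all hypothetical and introduced only for illustration. The covariances are accumulated per class without enumerating the pairs, which is in line with the abstract's claim that the cost is similar to that of standard CCA.

```python
import numpy as np

def cluster_cca_sketch(X, x_labels, Y, y_labels, n_components=2, reg=1e-6):
    """Linear CCA over all same-class (x, y) pairs (assumed formulation)."""
    dx, dy = X.shape[1], Y.shape[1]
    classes = np.intersect1d(np.unique(x_labels), np.unique(y_labels))
    n_pairs = 0
    sum_x, sum_y = np.zeros(dx), np.zeros(dy)
    Sxx, Syy, Sxy = np.zeros((dx, dx)), np.zeros((dy, dy)), np.zeros((dx, dy))
    for c in classes:
        Xc, Yc = X[x_labels == c], Y[y_labels == c]
        nx, ny = len(Xc), len(Yc)
        n_pairs += nx * ny
        sx, sy = Xc.sum(axis=0), Yc.sum(axis=0)
        # Each x in class c appears in ny pairs; each y appears in nx pairs.
        sum_x += ny * sx
        sum_y += nx * sy
        Sxx += ny * (Xc.T @ Xc)
        Syy += nx * (Yc.T @ Yc)
        # Sum of x y^T over all same-class pairs factorizes into an outer product.
        Sxy += np.outer(sx, sy)
    mx, my = sum_x / n_pairs, sum_y / n_pairs
    # Centered covariances over the pair set; reg is an assumed ridge for stability.
    Sxx = Sxx / n_pairs - np.outer(mx, mx) + reg * np.eye(dx)
    Syy = Syy / n_pairs - np.outer(my, my) + reg * np.eye(dy)
    Sxy = Sxy / n_pairs - np.outer(mx, my)
    # Whiten with Cholesky factors, then take the SVD of Lx^{-1} Sxy Ly^{-T}.
    Lx, Ly = np.linalg.cholesky(Sxx), np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T)
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components], mx, my

# Synthetic example: two sets with different sizes and dimensions, three classes.
rng = np.random.default_rng(0)
x_labels = np.repeat([0, 1, 2], 10)   # 30 points in the first set
y_labels = np.repeat([0, 1, 2], 15)   # 45 points in the second set
class_means = rng.normal(size=(3, 2))
X = class_means[x_labels] @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(30, 5))
Y = class_means[y_labels] @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(45, 4))

Wx, Wy, corr, mx, my = cluster_cca_sketch(X, x_labels, Y, y_labels)
X_proj = (X - mx) @ Wx   # shared low-dimensional representation of the first set
Y_proj = (Y - my) @ Wy   # shared low-dimensional representation of the second set
```

Note that the two sets need not have the same number of points, since no pairwise correspondence is required. For cross-modal retrieval, a query from one set would be projected with its own matrix and compared against projections of the other set, for example by nearest-neighbor search in the shared space.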