Rakesh Agrawal, Behzad Golshan, and Evimaria Terzi
Given a large class of students, each exhibiting a different ability level, how can we group them into sections so that the overall gain for students is maximized? This question has long been a topic of central concern and debate among social scientists and policy makers. We propose a framework for rigorously studying this question from a computational perspective. We present a formal definition of the grouping problem and investigate some of its variants. These variants are determined by the desired number of groups as well as the definition of the gain for each student in the group.
We focus on two natural instantiations of the gain function and show that, for both of them, the problem of identifying a single group of students that maximizes the gain among its members can be solved in polynomial time. The corresponding partitioning problem, where the goal is to partition the students into non-overlapping groups, appears to be much harder. However, the algorithms for the single-group versions can be leveraged for solving the more complex partitioning problem. Our experiments with generated data coming from different distributions demonstrate that our algorithm is significantly better than the current strategies in vogue for dividing students in a class into sections.
|Published in||ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)|