Automated Problem Generation for Education

Intelligent Tutoring Systems (ITS) can significantly enhance the educational experience, both in the classroom and online. A key aspect of ITS is the ability to automatically generate problems of a certain difficulty level and that exercise use of certain concepts. This can help avoid copyright or plagiarism issues and help generate personalized workflows. This project develops technologies for problem generation in various subject domains including math, logic, and even language learning.


This is an inter-disciplinary project that brings together researchers from different research areas, different MSR labs, and our academic collaborators across different continents. Our academic collaborators include Umair Ahmed and Amey Karkare (from IIT Kanpur, India), Krishnendu Chatterjee (from IST Austria), and Erik Andersen and Zoran Popovic (from Univ of Washington).


Computer-aided Education can help revolutionize the pedagogy inside/outside traditional classrooms, as well as enrich the experience offered by online classrooms [talk video]. This project focusses on one key aspect of computer-aided education, namely automatic problem generation. We have developed tools that can help teachers generate hundreds of problems that are similar to a given problem for use in exams (to prevent plagiarism). These tools can also help students by generating personalized workflows with a dynamic progression of problems of increasing difficulty. We believe that this technology will become an integral part of traditional/online classrooms, replace expensive books containing practice problems, and offer unique value with personalized instruction leading to better and quick learning.

We have developed automatic problem generation tools for a variety of subject domains including algebraic proof problems, procedural math problems, sentence completion SAT problems, logic problems, automata construction problems, and board-game problems. These tools build over state-of-the-art advances in multiple computer science disciplines including logical reasoning and natural language understanding, and also leverage domain-specific knowledge associated with a given subject domain.

Algebraic Proof Problems

Given an algebraic identity (like the red problem below), our tool [AAAI 2012] is able to generate fresh algebraic identities (shown as green problems below) that are similar to the input identity. In addition, the teacher can also enter a sample solution to the red problem, and our tool can generate sample solutions to the new green problems. This technology (which generates fresh problems and solutions to those problems from an example problem and its sample solution) can be likened to the Flash Fill feature in Excel 2013 that can perform string transformations by examples.

The tool works for a variety of algebraic domains such as trigonometry, calculus, determinant, and combinatorics.

The tool has an interactive setting. It generalizes the input (red) identity into a template (shown in orange), which replaces each operator in the input identity by a hole that can be instantiated with another operator of the same type, and some additional constraints (also shown in orange on the right side) that restrict the instantiation choices for the holes. The teacher can modify the constraints to control the variants that get generated. A more restrictive set of constraints will lead to fewer problems being generated, but those that might be more similar to the original one. A less restrictive set of constraints may lead to a generation of a larger number of problems, but the search process might take up more time.

The tool performs a brute force search over all possible instantiations of the holes in the template and validates the correctness of each instantiation by doing random testing on free variables of the problem. The correctness of this procedure follows from an extension of the classical randomized polynomial identity testing theorem.

Procedural Math Problems

 Procedural problems, such as addition, multiplication, and computation of least common multiple, are those whose solution entails following a step-by-step procedure. These are unlike conceptual problems, such as proof problems, whose solution requires creative thinking as opposed to memorizing and executing a procedure.

Procedural problems are common in middle school mathematics curriculum. The standard pedagogy in teaching procedural skills involves presenting a student with a progression of problems of increasing difficulty. For example, an effective progression for addition procedure may first include problems that involve adding single digits, then problems that involve adding multiple digit numbers without any carry, then problems that involve single carry, double carry, and so on.

Our tool [CHI 2013] generates problems for a given procedural concept by using off-the-shelf test input generation tools (a classic technology from software engineering literature) that achieve various forms of coverage over the underlying procedure written as code, as shown in the following figure. 


A partial order over program traces yields a partial order over corresponding test inputs or problems, which can be used to arrange problems into a progression. Our methodology, which characterizes a problem using its trace characteristics, can be used for various interesting applications such as filling holes in a given progression, comparing different progressions from different textbooks, and generating personalized progressions as part of interactive tutoring.

Sentence Completion Problems

Our sentence completion demo illustrates the way in which we can use natural language processing technology to generate candidate vocabulary questions, which a teacher can choose to accept or reject for use in the classroom. In this demo, we take a list of vocabulary words to test on, and then generate sentence completion questions that each use a desired vocabulary word. For example, for the vocabulary word “flouted,” it might generate:

But there were rules, and purists irritably noted when they were __________.
a) installed 
b) trapped
c) broached
d) flouted
e) fortified

To do this, the program first selects a vocabulary word, and then selects a sentence that uses the word from a database of over 2 million newspaper sentences. Since these sentences are chosen from published newspaper text, they illustrate a correct usage of the target word. To generate alternates or decoys, we analyze the sentence to determine the way in which the target word has been used. For example, in the sentence “He cleans his room every day, and it is always very orderly,” the word “orderly” is used as an adjective. But in the sentence “The general asked his orderly for coffee,” it is used as a noun. Once the specific linguistic usage of a word has been determined, the program selects alternates from amongst other words that are commonly used in the same way, and presents the question for a teacher to accept or reject.

In the past, generating sentence completion questions was a time consuming process requiring a great deal of thought to construct a good carrier sentence, and then to think up the alternates. With our technology, it takes just a few seconds to judge whether a question is acceptable, and a very large variety of questions can be generated.

Logic Problems

Natural deduction, which is a method for establishing validity of propositional type arguments, helps develop important reasoning skills and is thus a key ingredient in a course on introductory logic. A natural deduction problem consists of a set of premises and a conclusion, and the goal is to derive the conclusion starting from the premises using a given set of inference rules.

Our tool can help generate new variants of a given problem by generating valid replacement for one of the premises or the conclusion, as shown in the following figure.


Our tool also supports two other interfaces for problem generation (paper). These include: (i) generation of problems whose proof/solution structure is similar to that of a given input problem, (ii) generation of problems with given structural and solution characteristics that include number of variables, number of premises, size of each premise, size of solution, and the set of inference rules used in the solution.

The tool views problem generation as reverse of solution generation. It operates by searching for appropriate premises starting from a conclusion.

Our tool can also generate sample solutions to any given problem. It leverages the fact that the problem instances have small size, and hence it performs some offline computation and makes use of bitvector data-structures to alleviate the cost associated with symbolic reasoning.

Board Game Problems

Board games, like Tic-Tac-Toe and CONNECT-4, play an important role not only in development of mathematical and logical skills, but also in emotional and social development. Our tool can generate starting positions for a given game (specified as a set of rules) and a given expertise level of two players. Generating these starting positions allows for leveling the playing field between two players who may have different expertise levels. It may make certain game variants interesting (such as Tic-Tac-Toe 4*4, which when started from the empty board is heavily biased towards player 1). It also provides other benefits such as educational value in teaching certain strategies, prevents against memorization of moves, and enables faster finish time. The following figure shows starting positions of various difficulty level generated for a beginner player playing against an expert opponent for the game of Tic-Tac-Toe over a 4*4 board that involves making row and column matches of size 3. 

Our tool for generating fresh starting positions can be likened to the above tools for generating fresh problems. Our tool operates by performing backward reachability on winning states (paper). Hence, it is very similar to the tool for logic problem generation in that it also views problem generation as reverse of solution generation.


Related Publications

  • Automatically Generating Algebra Problems, AAAI 2012, Rohit Singh, Sumit Gulwani, Sriram Rajamani
    [abstract | pdf | full-version(pdf) | ppt slides]
  • A Trace-based Framework for Analyzing and Synthesizing Educational Progressions, CHI 2013, Erik Andersen, Sumit Gulwani, Zoran Popovic
    [abstract | pdf]
  • Automatically Generating Problems and Solutions for Natural Deduction, IJCAI 2013, Umair Ahmed, Sumit Gulwani, Amey Karkare 
    [abstract | pdf ]
  • Automatic Generation of Starting Positions in Board Games, Technical Report, Umair Ahmed, Krishnendu Chatterjee, Sumit Gulwani