Education is acknowledged to be the primary vehicle for accelerating economic development, and textbooks are known to be the educational input most consistently associated with improvements in student learning. With the emergence of abundant online content, cloud computing, and electronic reading devices, textbooks are poised for transformative changes. Taking into account the vast number of existing textbooks designed for the traditional print medium and the potential for enabling new kinds of functionality through electronic textbooks, we present the results of our research into algorithmically diagnosing and enhancing the quality of textbooks. Specifically, we first describe a diagnostic tool that helps authors and educators identify weaknesses in a textbook. We then discuss techniques for augmenting different sections of a textbook with links to selected web content and for providing easy access to concepts explained elsewhere in the book that are necessary for understanding the present section. We have used a corpus of Indian (NCERT) high-school textbooks and graduate-level textbooks published in the U.S.A. to drive our research. We have also built a demonstration of the ideas running on Microsoft Surface and other Windows 8 devices as well as on Aakash, the low-cost tablet being developed by the Indian government for distribution to millions of students. While the results are encouraging and indicate the feasibility of developing technological approaches to embellishing textbooks, the practical deployment issues and a systematic evaluation of the effectiveness of these ideas in enhancing learning merit further research.
Diagnostic tool to identify weaknesses in a textbook
Our diagnostic tool consists of two components. Abstracting from the education literature, we identify the following properties of good textbooks: (1) Focus: Each section explains only a few concepts, (2) Unity: For every concept, there is a unique section that best explains the concept, and (3) Sequentiality: Concepts are discussed in a sequential fashion so that a concept is explained prior to occurrences of this concept or any related concept. Further, a tie for precedence in presentation between two mutually related concepts is broken in favor of the more significant of the two.
The first component assesses the extent to which a textbook follows these properties and quantifies the comprehension load that the textbook imposes on the reader due to non-sequential presentation of concepts. The second component identifies sections that are not well written and can benefit from further exposition. For this purpose we propose a probabilistic decision model based on the syntactic complexity of the writing and on a new semantic notion, the dispersion of concepts in the section.
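As an illustration of the sequentiality check, the following is a minimal sketch, not the metric used in our papers. It assumes that each section has already been reduced to a set of concept phrases and that, following the Unity property, each concept is mapped to the index of the one section that best explains it. The names `forward_references`, `comprehension_load`, and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def forward_references(
    sections: List[Set[str]], explained_in: Dict[str, int]
) -> List[List[str]]:
    """For each section, list the concepts that occur in it but are
    explained only in a later section (a sequentiality violation)."""
    violations = []
    for idx, concepts in enumerate(sections):
        # A concept missing from explained_in is treated as not violating.
        late = sorted(c for c in concepts if explained_in.get(c, idx) > idx)
        violations.append(late)
    return violations


def comprehension_load(
    sections: List[Set[str]], explained_in: Dict[str, int]
) -> float:
    """Illustrative score: fraction of concept occurrences that appear
    before the section that explains them."""
    total = sum(len(s) for s in sections)
    late = sum(len(v) for v in forward_references(sections, explained_in))
    return late / total if total else 0.0


# Hypothetical three-section chapter.
sections = [
    {"force", "mass"},
    {"acceleration", "force", "momentum"},
    {"momentum", "mass"},
]
# Hypothetical Unity mapping: concept -> index of its best-explaining section.
explained_in = {"force": 0, "mass": 0, "acceleration": 1, "momentum": 2}

violations = forward_references(sections, explained_in)
load = comprehension_load(sections, explained_in)
```

Here the second section uses "momentum" before the third section explains it, so it is the only forward reference among seven concept occurrences. A fuller treatment would also account for related concepts and their relative significance, which this sketch ignores.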
Study aids for amplifying reading
Our system provides two study aids in the form of related material that can benefit a student while reading a book section. The first aid, the study navigator, provides easy access to concepts explained elsewhere in the book that are necessary for understanding the present section. These concept references are algorithmically generated using a model of how students read textbooks. The second study aid augments different sections of a book with links to selected web articles, images, and videos. We first identify the set of key concept phrases contained in a section. Using these phrases, we find web articles that represent the central concepts presented in the section and endow the section with links to them. We also find images that are most relevant to a section of the textbook, while respecting the constraint that the same image is not repeated in different sections of the same chapter. We view the problem of matching images to the sections of a textbook chapter as an optimization problem and solve it optimally. Finally, we augment each section with the videos most relevant to it, taking into account the focus of the section. We compute the focus of each section using selective combinations of concept phrases, crawl educational videos from the web and obtain their transcripts, and perform matching to identify relevant videos.
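The image-matching constraint can be sketched as a small assignment problem. The example below is a simplified illustration, not the formulation from our papers: it assumes one image per section, a precomputed relevance score for every section and image pair, and a chapter small enough for exhaustive search. The function name `assign_images` and the scores are hypothetical.

```python
from itertools import permutations
from typing import List, Tuple


def assign_images(relevance: List[List[float]]) -> Tuple[float, List[int]]:
    """Pick one distinct image per section so that total relevance is maximal.

    relevance[s][i] is the (assumed, precomputed) relevance of image i
    to section s. Returns the best total score and, for each section,
    the index of the image assigned to it.
    """
    n_sections = len(relevance)
    n_images = len(relevance[0]) if relevance else 0
    if n_images < n_sections:
        raise ValueError("need at least as many images as sections")

    best_score, best_choice = float("-inf"), []
    # Each permutation assigns distinct images to sections, so no image
    # is repeated within the chapter.
    for choice in permutations(range(n_images), n_sections):
        score = sum(relevance[s][i] for s, i in enumerate(choice))
        if score > best_score:
            best_score, best_choice = score, list(choice)
    return best_score, best_choice


# Hypothetical scores for a chapter with three sections and three images.
relevance = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
    [0.10, 0.70, 0.60],
]
best_score, best_choice = assign_images(relevance)
```

In this example, giving each section its top-scoring image in order would hand image 0 to the first section and leave the second section with a poor match, for a total of 1.9. The exhaustive search instead assigns images 1, 0, and 2 to the three sections, for a total of 2.25. Exhaustive search grows factorially with chapter size, so a polynomial-time assignment algorithm such as the Hungarian method would be the practical choice for larger inputs.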
To illustrate the technical feasibility of the ideas, we have built a demonstration using a corpus of Indian (NCERT) high-school textbooks and graduate-level textbooks published in the U.S.A., running on Microsoft Surface and other Windows 8 devices. We have also ported the demo to run on the Aakash2, the low-cost tablet being developed by the Indian government for distribution to millions of students. The demo shows the key concept phrases identified for each book section and the corresponding video augmentation. It also incorporates various diagnostic statistics computed for the book sections.
- Rakesh Agrawal, Sunandan Chakraborty, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi, Empowering Authors to Diagnose Comprehension Burden in Textbooks, in ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), ACM, August 2012
- Rakesh Agrawal, Sunandan Chakraborty, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi, Quality of Textbooks: An Empirical Study, in ACM Symposium on Computing for Development (ACM DEV), ACM, March 2012
- Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi, Data Mining for Improving Textbooks (invited paper), in SIGKDD Explorations Newsletter, ACM, December 2011
- Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi, Enriching Textbooks with Images, in International Conference on Information and Knowledge Management (CIKM), ACM, October 2011
- Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi, Identifying Enrichment Candidates in Textbooks, in International World Wide Web Conference (WWW), ACM, March 2011
- Rakesh Agrawal, Sreenivas Gollapudi, Krishnaram Kenthapadi, Nitish Srivastava, and Raja Velu, Enriching Textbooks Through Data Mining, in ACM Symposium on Computing for Development (ACM DEV), ACM, December 2010