AutoCaption: Automatic Caption Generation for Personal Photos

Krishnan Ramnath, Simon Baker, Lucy Vanderwende, Motaz El-Saban, Sudipta Sinha, Anitha Kannan, Noran Hassan, Michel Galley, Yi Yang, Deva Ramanan, Alessandro Bergamo, and Lorenzo Torresani


AutoCaption is a system that helps a smartphone user generate a caption for their photos. It operates by uploading the photo to a cloud service where a number of parallel modules are applied to recognize a variety of entities and relations. The outputs of the modules are combined to generate a large set of candidate captions, which are returned to the phone. The phone client includes a convenient user interface that allows users to select their favorite caption and then reorder, add, or delete words to obtain the grammatical style they prefer. The user can also select from multiple candidates returned by the recognition modules.


Publication type: Inproceedings
Publisher: IEEE Winter Conference on Applications of Computer Vision

Previous versions

Yi Yang, Simon Baker, Anitha Kannan, and Deva Ramanan. Recognizing Proxemics in Personal Photos, June 2012.
