GrabCut

  C. Rother, V. Kolmogorov, A. Blake, M. Brown


 

Note: the different bounding boxes used in the paper "Image Segmentation with A Bounding Box Prior" (Lempitsky, Kohli, Rother, Sharp, ICCV 2009) are available here.

 

IMPORTANT: This page is currently under construction and will change (really) soon!

Easy object cut and paste

   GrabCut is an efficient, interactive tool for foreground segmentation in still images. For moderately difficult examples, marking the object with a rectangle (or lasso) is sufficient to obtain the desired result. Classical image editing tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors. GrabCut combines both types of information. It considerably extends the graph-cut based segmentation technique introduced by Boykov and Jolly at ICCV 2001. First, a more powerful, iterative version of the optimisation technique has been developed. Secondly, the power of the iterative algorithm is used to substantially simplify the user interaction needed for a given quality of result. Thirdly, a robust algorithm for "border matting" has been developed to simultaneously estimate the alpha-matte around an object boundary and the colours of foreground pixels.

 

Scientific publications

  

  1. C. Rother, V. Kolmogorov, A. Blake. GrabCut: Interactive Foreground Extraction using Iterated Graph Cuts. ACM Transactions on Graphics (SIGGRAPH'04), 2004.
  2. A. Blake, C. Rother, M. Brown, P. Perez, P. Torr. Interactive Image Segmentation using an Adaptive GMMRF Model. Proc. European Conference on Computer Vision (ECCV), 2004.

 

  


Video and extra material

The SIGGRAPH and Press talk contains many compressed images. Please contact us (carrot 'at' microsoft.com) if you need a high-quality version of any of these images.


Ground truth database

   To evaluate our method, we designed a new ground truth database of 50 images. The following zip files contain: Data, Segmentation, Labelling-Lasso, Labelling-Rectangle. Due to license issues, please download the images listed in readme.txt from the zip file available at the Berkeley image database. Explanation of the datasets:

 

 

 

[Example images: Data | Segmentation | Labelling-Lasso | Labelling-Rectangle]

Segmentation: A tri-map which specifies background (0), foreground (255) and mixed area (128). The mixed area contains pixels which are a combination of foreground and background texture. Note that in low-contrast regions the true boundary is not observed; in this case the ground truth is a "good guess".
Labelling-Lasso: Imitates a tri-map obtained by a lasso or pen tool. The colour coding is: background (0); background used for colour model training (64); inference (unknown) region (128); foreground used for colour model training (255). Note that a lasso tool can be imitated by relabelling the foreground region (255) as unknown (128).
Labelling-Rectangle: Imitates a tri-map obtained by two mouse clicks (rectangle). Same colour coding as in Labelling-Lasso.
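
The grey-level codes described above can be read directly from the tri-map images. The following Python sketch is only an illustration: the helper names (mask_of, count, imitate_lasso) and the tiny hand-written labelling are hypothetical and not part of the database. Loading a real tri-map from the image files is not shown.

```python
# Grey-level codes as described in the dataset explanation.
BACKGROUND = 0
BACKGROUND_TRAIN = 64   # background used for colour model training
UNKNOWN = 128           # inference region (labellings) / mixed area (segmentation)
FOREGROUND = 255


def mask_of(image, value):
    """Return a boolean mask marking the pixels equal to `value`."""
    return [[pixel == value for pixel in row] for row in image]


def count(image, value):
    """Count the pixels that carry a given grey level."""
    return sum(row.count(value) for row in image)


def imitate_lasso(labelling):
    """Relabel the foreground region (255) as unknown (128)."""
    return [[UNKNOWN if pixel == FOREGROUND else pixel for pixel in row]
            for row in labelling]


# Hypothetical 3x4 labelling, written by hand for illustration only.
labelling = [
    [0, 64, 64, 0],
    [64, 128, 255, 64],
    [0, 64, 64, 0],
]

lasso = imitate_lasso(labelling)
print(count(labelling, UNKNOWN), count(lasso, UNKNOWN))  # prints: 1 2
```

After the relabelling, no pixel is marked as foreground any more, so the colour models would have to be trained from the background region (64) and the unknown region alone.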

 Benchmark results with the GMMRF model (see ECCV 04)

Segmentation Model | Error rate (%)
GMMRF; optimally chosen gamma using ground truth (K = 10 full Gaussians) | to be updated
GMMRF; discriminatively learned gamma = 50 (K = 10 full Gaussians) | to be updated
Learned GMMRF parameters (K = 30 isotropic Gaussians) | to be updated
GMMRF; discriminatively learned gamma = 50 (K = 30 isotropic Gaussians) | to be updated
Strong interaction model (gamma = 1000; K = 30 isotropic Gaussians) | to be updated
Ising model (gamma = 25; K = 30 isotropic Gaussians) | to be updated
Simple mixture model - no interaction (gamma = 0; K = 30 isotropic Gaussians) | to be updated
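
This page does not define how the error rate is computed. The Python sketch below assumes one plausible convention: only pixels in the inference region of the labelling (128) are scored, pixels in the mixed area of the ground truth (128) are skipped, and the result is the percentage of scored pixels whose predicted label differs from the ground truth. The function name error_rate and the one-row example are hypothetical; check the ECCV 04 paper for the measure actually used.

```python
def error_rate(predicted, ground_truth, labelling):
    """Percentage of misclassified pixels inside the inference region.

    Assumption (not stated on this page): only pixels marked 128 in the
    labelling are scored, and pixels marked 128 (mixed) in the ground
    truth are skipped.
    """
    scored = 0
    wrong = 0
    for pred_row, gt_row, lab_row in zip(predicted, ground_truth, labelling):
        for pred, gt, lab in zip(pred_row, gt_row, lab_row):
            if lab != 128 or gt == 128:
                continue  # outside the inference region, or a mixed pixel
            scored += 1
            if pred != gt:
                wrong += 1
    return 100.0 * wrong / scored if scored else 0.0


# Hypothetical one-row example: 0 = background, 255 = foreground.
labelling    = [[0, 128, 128, 128, 255]]
ground_truth = [[0,   0, 128, 255, 255]]
predicted    = [[0, 255, 255, 255, 255]]

print(error_rate(predicted, ground_truth, labelling))  # prints: 50.0
```

Two pixels are scored here (the second and the fourth); one of them is wrong, which gives 50%.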

Note that these results differ from the evaluation presented in the ECCV 04 paper (see Fig. 4) because the database was modified (due to license issues). Furthermore, gamma here refers to the parameter in the SIGGRAPH 04 paper (see eqn. 4).
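
To give a rough feeling for what gamma controls in the models listed in the table, the Python sketch below assumes a contrast-sensitive pairwise term of the form gamma * exp(-beta * (z_m - z_n)^2), charged only when two neighbouring pixels receive different labels. This form, the restriction to grey-level pixels at unit distance, the name pairwise_cost and the value of beta are assumptions made for illustration; the SIGGRAPH 04 paper gives the exact definition.

```python
import math


def pairwise_cost(z_m, z_n, label_m, label_n, gamma, beta):
    """Cost of assigning labels label_m, label_n to two neighbouring pixels.

    Assumed form: gamma * exp(-beta * (z_m - z_n)^2) if the labels differ,
    0 otherwise. Grey-level pixels and unit neighbour distance are
    simplifications made for this sketch.
    """
    if label_m == label_n:
        return 0.0
    return gamma * math.exp(-beta * (z_m - z_n) ** 2)


beta = 0.01  # hypothetical value, chosen only for this example

# Cutting between similar pixels is expensive, cutting across an edge is cheap.
weak_edge = pairwise_cost(100, 105, 0, 1, gamma=50, beta=beta)
strong_edge = pairwise_cost(100, 200, 0, 1, gamma=50, beta=beta)

# gamma = 0 removes the interaction between neighbours entirely.
no_interaction = pairwise_cost(100, 105, 0, 1, gamma=0, beta=beta)

# With beta = 0 the penalty ignores contrast, as in an Ising-type prior.
ising = pairwise_cost(100, 200, 0, 1, gamma=25, beta=0.0)
```

Under these assumptions, a larger gamma makes every label change between neighbours more costly and so favours smoother segmentations, while gamma = 0 leaves each pixel to be classified by the colour model alone.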


Web site designed and maintained by A. Criminisi