Image Compression - the Mechanics of the JPEG 2000
Microsoft Research, Signal Processing,
Email: jinl@microsoft.com
ABSTRACT
We briefly review the mechanics of the coding engine of JPEG 2000, a state-of-the-art image compression system. The transform, entropy coding, and bitstream assembler modules are examined in detail. Our goal is to give readers a good understanding of modern scalable media compression technologies without being swamped by the details.
Keywords: Image compression, JPEG 2000, transform, wavelet, entropy coder, sub-bitplane entropy coder, bitstream assembler.
1. INTRODUCTION
Compression is a process that creates a compact data representation for storage and transmission. Media compression usually calls for special compression tools because media differs from generic data. A generic data file, such as a computer executable program or a Word document, must be compressed losslessly, since even a single bit error may render the data useless. In media compression, on the other hand, distortion is tolerable, because it is the content of the media that is of paramount importance rather than its exact bits. Since the original media, whether an image, a sound clip, or a movie clip, is usually very large, it is essential to compress it at a very high compression ratio. Such high-ratio media compression is usually achieved through two mechanisms: a) ignoring the media components that are less perceptible, and b) using entropy coding to exploit the information redundancies that exist in the source data.
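To make mechanism b) concrete, the following minimal sketch estimates the empirical zeroth-order entropy of a symbol sequence, which is the average number of bits per symbol that an ideal entropy coder would need under a memoryless model. The function name `empirical_entropy` and the sample data are illustrative assumptions and are not part of JPEG 2000.

```python
import math
from collections import Counter


def empirical_entropy(symbols):
    """Return the empirical zeroth-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    # Shannon entropy: -sum(p * log2(p)) over the observed symbol frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical data: four possible symbols, so a fixed-length code needs 2 bits each.
uniform = [0, 1, 2, 3] * 4                        # no statistical redundancy
skewed = [0] * 12 + [1] * 2 + [2] * 1 + [3] * 1   # one symbol dominates

print(empirical_entropy(uniform))  # equals the fixed-length cost of 2 bits
print(empirical_entropy(skewed))   # noticeably below 2 bits
```

The gap between the fixed-length cost and the entropy of the skewed sequence is the redundancy that an entropy coder can remove without introducing any distortion.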
Different applications may have different requirements for compression ratio and different tolerance for compression distortion. A publishing application may require a compression scheme with very little distortion, while a web application may tolerate relatively large distortion in exchange for a smaller compressed file. Recently, a category of media compression algorithms termed scalable compression has emerged, offering the ability to trade between compression ratio and distortion even after the compressed bitstream has been generated. In scalable compression, the media is first compressed into a master bitstream, from which a subset may be extracted to form an application bitstream with a higher compression ratio. With scalable compression, compressed media can be quickly tailored for applications with vastly different compression ratio and quality requirements, which is especially useful in media storage and transmission.
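The relation between a master bitstream and an application bitstream can be illustrated with a toy embedded coder. This is only a sketch under simplifying assumptions: it stores non-negative integer samples as raw bitplanes, most significant plane first, with no transform and no entropy coding, so it is not the JPEG 2000 coding engine. The names `encode_bitplanes` and `decode_bitplanes` are hypothetical.

```python
def encode_bitplanes(values, num_planes):
    """Split non-negative integers into bitplanes, most significant plane first."""
    return [[(v >> p) & 1 for v in values]
            for p in range(num_planes - 1, -1, -1)]


def decode_bitplanes(planes, num_planes):
    """Rebuild values from the leading planes (at least one plane is required).

    Bits belonging to planes that were dropped are assumed to be zero.
    """
    values = [0] * len(planes[0])
    for k, plane in enumerate(planes):
        shift = num_planes - 1 - k  # bit position carried by this plane
        for i, bit in enumerate(plane):
            values[i] |= bit << shift
    return values


pixels = [200, 13, 97, 255]            # hypothetical 8-bit samples
master = encode_bitplanes(pixels, 8)   # master bitstream: all 8 planes
application = master[:3]               # application bitstream: a prefix of the master

print(decode_bitplanes(master, 8))       # [200, 13, 97, 255]
print(decode_bitplanes(application, 8))  # [192, 0, 96, 224]
```

Because the most significant information comes first, any prefix of the master bitstream is itself decodable. Truncation raises the compression ratio at the cost of coarser sample values, and no re-encoding of the original data is needed.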
In the remainder of the paper, we use image compression, and in particular the JPEG 2000 image compression standard, to illustrate the important mechanics of a modern scalable media compression algorithm. The paper is organized as follows. The basic concepts of scalable image compression and its applications are discussed in Section 2. The JPEG 2000 standard and its development history are briefly reviewed in Section 3. The transform, quantization, entropy coding, and bitstream assembler modules are examined in detail in Sections 4 through 7. Our goal is to describe the key mechanics of the JPEG 2000 coding engine so that readers may gain a good understanding of the standard without being swamped by the details. Readers who are further interested may refer to [1][2][3].
Digital images are used every day now. A digital image is essentially a 2D data array x(i,j), where i and j index the row and column of the data array, and each data point x(i,j) is referred to as a pixel. For a gray image, each pixel has an intensity value G. For a color image, each pixel consists of a color vector (R, G, B), which represents the intensity of the red, green and blue components, respectively. Because it is the content of the digital image that matters the most, the underlying 2D data array may undergo big changes and still convey the content to the user. An example is shown in Figure 1, where the original image

Figure 1 Digital image and image manipulation.
Among the operations, compression creates a compact representation of the image data. It is an essential operation for image storage and transmission. JPEG 2000, a next-generation image compression standard, is a highly scalable image compression algorithm. From the compressed JPEG 2000 bitstream, it is possible to extract a subset of the bitstream that decodes to an image of lower quality (with higher compression ratio), lower resolution, and/or smaller spatial region. In other words, instead of manipulating the image in the space domain as shown in Figure 1, we may manipulate the image directly in the compressed domain and form a new bitstream that better suits the application.
Scalable image compression has important applications in image storage and delivery. Let us first examine the application of digital photography. At present, digital cameras on the market all use non-scalable image compression technologies, mainly JPEG. A camera with a fixed amount of memory can accommodate either a small number of high-quality, high-resolution images or a large number of low-quality, low-resolution images. Unfortunately, the image quality and resolution settings have to be determined before the photos are shot. This leads to a painful trade-off between removing lovely photos to make space for exciting new shots and shooting photos at poor quality and resolution settings. With scalable image compression, it is possible for a digital camera to adjust the image quality and resolution after the photo is shot. The camera may always shoot images at the highest possible quality and resolution settings. When the camera memory is used up, the compressed bitstreams of existing shots may be truncated to a smaller size to leave room for the upcoming shots. By dynamically trading between the number of images and the image quality, the precious camera memory is not wasted, and the quality of the shots is maximized.
Another important application of scalable image compression is web browsing. As the resolution of digital cameras and digital scanners becomes higher and higher, high-resolution digital images have become a reality. It is very pleasing to view a high-resolution image; however, it is equally painful to wait for the long compressed bitstream to be delivered over the web. Before scalable image compression technology was available, it was common practice to generate and put on the web multiple copies of the compressed bitstream with different spatial regions, resolutions and compression ratios. However, keeping multiple copies of the bitstream for the same media file causes headaches in media management and wastes valuable server space. A better solution is to put a scalable master bitstream of the compressed image on the server. During image browsing, the user may specify a region of interest (ROI) with certain spatial and resolution constraints. The browser downloads only the subset of the compressed media bitstream covering the current ROI, and the download can be performed in a progressive fashion, so that a coarse view of the ROI can be rendered very quickly and then gradually refined as more and more bits arrive. With scalable image compression, it is possible to browse large images quickly and on demand over the Internet, as shown by the Vmedia project [16].
3. The JPEG 2000 standard
We first briefly review the history of the development of the JPEG 2000 standard. In the early 1990s, a number of new image compression algorithms, such as CREW (compression with reversible embedded wavelets) [5] and EZW (embedded zerotree wavelet) [6], emerged to provide not only superior compression performance but also a new set of features unseen before. Based on industrial demand, the JPEG 2000[1] project was approved as a new work item in 1996. A call for technical contributions was issued in March 1997 [10]. The first evaluation was performed in November 1997 in
JPEG 2000 standardizes the decoder and the bitstream syntax. Nevertheless, information on the encoder implementation is provided to ensure that encoders achieve reasonable performance. We choose to describe JPEG 2000 from the encoder perspective, since it is easier to understand that way.

Figure 2 Operation flow of the JPEG 2000 standard.
The operation flow of a typical JPEG 2000 encoder is shown in Figure 2. The first module is component and tile separation, whose function is to cut the image into manageable chunks and to decorrelate the color components. Huge original images, e.g., aerial photography images, are separated into spatially non-overlapping tiles of equal size. For multi-component (color) images, a component transform is performed to decorrelate the components. For example, a color image with RGB (red, green and blue) components can be transformed to the YCrCb (luminance, chrominance red and chrominance blue) or RCT (reversible component transform) component space. Each tile of each component is then processed separately. The d