JPEG - JPEG XR

Overview of JPEG XR

The JPEG XR standard provides a practical coding technology for a broad range of applications with excellent compression capability and important additional functionalities. The primary intended application of JPEG XR is the representation of continuous-tone still images such as photographic images.

As of 2008, the most widely used digital photography format is a nominal implementation of the first JPEG coding format as specified in Rec. ITU-T T.81 | ISO/IEC 10918-1. This encoding uses a bit depth of 8 for each of three channels, giving 256 representable values per channel (16,777,216 representable color values in total). More demanding applications may require a bit depth of 16, providing 65,536 representable values per channel and over 2.8 × 10^14 color values. Additional scenarios may necessitate even greater bit depths and other sample representation formats. When memory or processing power is at a premium, as few as five or six bits per channel may be used.
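The figures quoted above follow directly from the bit depth. As a quick check, the short sketch below recomputes them; the function name `representable_colors` is chosen here for illustration only and does not come from any JPEG XR specification.

```python
def representable_colors(bits_per_channel: int, channels: int = 3) -> int:
    """Number of distinct color values for a given per-channel bit depth."""
    levels = 2 ** bits_per_channel   # representable values per channel
    return levels ** channels        # combinations across all channels


# Bit depths mentioned in the text: 5, 6, 8 and 16 bits per channel.
for bits in (5, 6, 8, 16):
    print(f"{bits:2d} bits: {2 ** bits:6d} levels, "
          f"{representable_colors(bits):.3e} colors")
```

For 8 bits per channel this gives 16,777,216 colors, and for 16 bits per channel 281,474,976,710,656, which is roughly 2.8 × 10^14.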

The JPEG XR specification enables greater effective use of compressed imagery with this broadened diversity of application requirements. JPEG XR supports a wide range of color encoding formats including monochrome, RGB, CMYK and n-component encodings using a variety of unsigned integer, fixed point, and floating point decoded numerical representations with a variety of bit depths. The primary goal is to provide a compressed format specification appropriate for a wide range of applications while keeping the implementation requirements for encoders and decoders simple. A special focus of the design is support for emerging high dynamic range (HDR) imagery applications.

JPEG XR combines the benefits of optimized image quality and compression efficiency together with low-complexity encoding and decoding implementation requirements. It also provides an extensive set of additional functionalities, including:

high compression capability,
low computational and memory resource requirements,
lossless and lossy compression,
image tile segmentation for random access and large image formats,
support for low-complexity compressed-domain image manipulations,
support for embedded thumbnail images and progressive resolution refinement – embedded codestream scalability for both image resolution and fidelity,
alpha plane support,
bit-exact decoder results for fixed and floating point image formats.
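To illustrate why tile segmentation helps random access, the sketch below works out which tiles a decoder would need to touch for a given region of interest. It assumes a uniform tile grid and uses a hypothetical helper name (`tiles_for_region`); the actual tile layout rules are defined by the specification and are not reproduced here.

```python
def tiles_for_region(x0: int, y0: int, x1: int, y1: int,
                     tile_w: int, tile_h: int) -> list[tuple[int, int]]:
    """Return (column, row) indices of tiles that intersect the
    half-open pixel region [x0, x1) x [y0, y1).

    Assumes a uniform grid of tile_w x tile_h tiles (an assumption
    made for this sketch only).
    """
    cols = range(x0 // tile_w, (x1 - 1) // tile_w + 1)
    rows = range(y0 // tile_h, (y1 - 1) // tile_h + 1)
    return [(c, r) for r in rows for c in cols]


# A 300x200 pixel crop at (300, 100) on a grid of 256x256 tiles.
print(tiles_for_region(300, 100, 600, 300, 256, 256))
```

Only the listed tiles need to be decoded to reconstruct the crop; the rest of a large image can be skipped.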

Important detailed design properties include:

high performance, embedded system friendly compression,
small memory footprint,
integer-only operations with no divides,
a signal processing structure that is highly amenable to parallel processing,
use of the same signal processing operations for both lossless and lossy compression,
support for a wide range of decoded sample formats (many of which support high dynamic range imagery):
- monochrome, RGB, CMYK or n-component image representation,
- 8- or 16-bit unsigned integer,
- 16- or 32-bit fixed point,
- 16- or 32-bit floating point,
- several packed bit formats,
- 1-bit per sample monochrome,
- 5- or 10-bit per sample RGB,
- Radiance RGBE.
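As an example of how one of these sample formats carries high dynamic range data, the sketch below converts a Radiance RGBE pixel (three 8-bit mantissas sharing one 8-bit exponent) to floating point. It follows one common decoding convention for RGBE; the function name is illustrative, and nothing here is taken from the JPEG XR specification itself.

```python
import math


def rgbe_to_float(r: int, g: int, b: int, e: int) -> tuple[float, float, float]:
    """Decode one RGBE pixel to linear floating point RGB.

    Uses the common convention value = mantissa * 2**(e - 128 - 8);
    an exponent byte of 0 denotes black.
    """
    if e == 0:
        return (0.0, 0.0, 0.0)
    scale = math.ldexp(1.0, e - 136)   # 2 ** (e - 136)
    return (r * scale, g * scale, b * scale)


print(rgbe_to_float(128, 128, 128, 129))   # mid mantissas, exponent 129
print(rgbe_to_float(128, 64, 32, 140))     # a much brighter pixel
```

Because the exponent is shared, four bytes per pixel cover a far wider range of intensities than 8-bit unsigned integer RGB.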

The algorithm uses a reversible hierarchical lifting-based lapped biorthogonal transform. The transform has lossless image representation capability and requires only a small number of integer processing operations for both encoding and decoding. The processing is based on 16×16 macroblocks in the transform domain, which may or may not affect overlapping areas in the spatial domain (with the overlapping property selected under the control of the encoder). The design provides encoding and decoding with a minimal memory footprint suitable for embedded implementations.
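The reversibility of a lifting-based transform can be seen in a very small example. The sketch below is a generic two-sample integer lifting step (sometimes called the S-transform), not the JPEG XR lapped biorthogonal transform itself; it is meant only to show how adds and shifts, with no divides, can give an exactly invertible integer transform.

```python
def lift_forward(a: int, b: int) -> tuple[int, int]:
    """Forward lifting step: returns (floor average, difference)."""
    d = a - b           # high-pass: difference of the two samples
    s = b + (d >> 1)    # low-pass: floor((a + b) / 2) using a shift
    return s, d


def lift_inverse(s: int, d: int) -> tuple[int, int]:
    """Inverse lifting step: undoes lift_forward exactly."""
    b = s - (d >> 1)    # same shift as the forward step, subtracted
    a = b + d
    return a, b


s, d = lift_forward(10, 3)
print(s, d)                  # 6 7
print(lift_inverse(s, d))    # (10, 3)
```

Each forward step is undone by applying the same integer operation with the opposite sign, so rounding in the shift never causes loss. Lossy coding can then be obtained by quantizing the transform coefficients rather than by changing the transform.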

The algorithm provides native support for both RGB and CMYK color types by converting these color formats to an internal luma-dominant format through the use of a reversible color transform. In addition, YUV, monochrome and arbitrary n-channel color formats are supported.
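A reversible color transform of this kind can be illustrated with the well-known YCoCg-R transform, which maps RGB to one luma and two chroma channels using integer lifting steps. It is used here as a stand-in for illustration; the exact transform used by JPEG XR is defined in the specification and may differ in detail.

```python
def rgb_to_ycocg_r(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Forward YCoCg-R: integer RGB to (Y, Co, Cg), exactly invertible."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def ycocg_r_to_rgb(y: int, co: int, cg: int) -> tuple[int, int, int]:
    """Inverse YCoCg-R: undoes the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


print(rgb_to_ycocg_r(255, 255, 255))                 # gray maps to luma only
print(ycocg_r_to_rgb(*rgb_to_ycocg_r(200, 30, 99)))  # round trip
```

Neutral colors end up with zero chroma, which is what "luma-dominant" suggests: most of the signal energy is concentrated in one channel that compresses well.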

The transforms employed are reversible; both lossless and lossy operations are supported using the same algorithm. Using the same algorithm for both types of operation simplifies implementation, which is especially important for embedded applications.

A wide range of numerical encodings at multiple bit depths is supported. 8-bit and 16-bit formats, as well as additional specialized packed bit formats, can be coded with both lossy and lossless compression, while 32-bit formats are supported using lossy compression. Up to 24 bits are retained through the various transforms. While only integer arithmetic is used for internal processing, lossless and lossy coding are supported for floating point and fixed point image data as well as for integer image formats.

Part 1: System architecture

JPEG XR Part 1, known in ITU-T as T.Sup2, provides a technical overview and informative guidelines for applications of the JPEG XR image coding format.

Part 2: Image coding specification

JPEG XR Part 2, known in ITU-T as T.832, is the core coding specification for JPEG XR. It specifies the syntax and semantics of JPEG XR coded images and the associated decoding process.

Part 3: Motion JPEG XR

JPEG XR Part 3, known in ITU-T as T.833, is the Motion JPEG XR file format specification. The Motion JPEG XR file format is designed to contain one or more motion sequences of JPEG XR images, with their timing. The format is based on the ISO base media file format.
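Files based on the ISO base media file format are organized as a sequence of boxes, each starting with a 32-bit big-endian size and a four-character type. The sketch below walks the top-level boxes of such a file held in memory. It is a generic reader for that box structure, not a Motion JPEG XR parser, and the box types in the sample data are ordinary ISO base media types chosen only to make the example runnable.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (type, offset, size) for each top-level box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:                       # 64-bit size follows the type
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:                     # box extends to end of data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break                           # malformed or truncated box
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


# Two synthetic boxes: a 16-byte 'ftyp' and an empty 8-byte 'free'.
sample = (struct.pack(">I4s", 16, b"ftyp") + b"\x00" * 8
          + struct.pack(">I4s", 8, b"free"))
print(list(iter_boxes(sample)))
```

A real reader would descend into container boxes to locate the image samples and their timing information.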

Part 4: Conformance testing

JPEG XR Part 4, known in ITU-T as T.834, is the JPEG XR conformance testing specification. It specifies tests designed to verify whether codestreams, files, encoders, and decoders meet the normative requirements of the JPEG XR specifications.

Part 5: Reference software

JPEG XR Part 5, known in ITU-T as T.835, provides reference software source code for the JPEG XR image coding format. Such reference software is useful in aiding users of an image coding standard to establish and test conformance and interoperability, and to educate users and demonstrate the capabilities of the standard. The reference software includes both encoder and decoder functionality.