In the first post of this series, we named the popular methods for texture compression in OpenGL ES, particularly Imagination Technologies’ PVRTC, which is supported on all Apple and many Android mobile devices. Now, let’s explore what texture compression involves, what PVRTC does, and how it differs from other approaches.
Graphics processing units (GPUs) render images from the basic building blocks of polygons. Implementing a technique called “hardware tessellation”, GPUs can divide polygons into smaller polygons until images are both highly detailed and quickly rendered. Modern GPUs handle millions of polygons, but without further detail each is just a flat, lifeless shape. Achieving 3D realism calls for texture maps applied to the polygons combined with lighting, shading, and projection techniques, along with other advanced effects like transparency and fog.
Texture mapping is sort of like covering the side of a box with wallpaper – except in most realistic scenes, it is more like millions of boxes with hundreds or thousands of different kinds of wallpaper. Textures are not always simple patterns, as we see from this complex rendering example; achieving a high quality look calls for more detail in the texture map, increasing the size of the image. This drives memory size and bandwidth requirements, and ultimately determines the rendering time needed and the overall frame rate of the scene.
image courtesy Walt Disney Animation Studios
Texture compression comes to the rescue, reducing the size of texture maps. Keep in mind that each texture map is rendered onto polygons with varying orientation and effects as the scene progresses, so the GPU must be able to read any part of it at any time. Traditional image compression schemes like JPEG and PNG can achieve smaller file sizes, but they do not permit random access to compressed pixel data; images must be fully decompressed before they can be processed, pretty much undoing any savings.
All this is magnified on a mobile GPU, where any polishing and grinding of images requires not only time but also power, which is in short supply. While more advanced techniques are possible, mobile implementations favor efficiency. The search was on for efficient texture compression schemes that offered both strong compression and simple processing, without sacrificing too much image quality.
The first wave of texture compression schemes implemented block-based encoding, with the basic function typically working on a 4×4 group of pixels (each pixel a 24-bit RGB value, or 24bpp in shorthand) to produce a more compact 64-bit word. That turns 16 × 24 = 384 bits into 64 bits, a 6:1 reduction to an effective 4bpp. The advantage is fixed-rate decompression with a single memory access, allowing a hardware unit in a mobile GPU to work directly with the compressed texture.
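To make the numbers concrete, here is a small Python sketch of the arithmetic behind a fixed-rate block format. The function names and the row-major block layout are assumptions made for illustration; real formats may order their blocks differently in memory.

```python
# Illustrative arithmetic for a fixed-rate 4x4 block format.
# Names and the row-major block layout are assumptions, not any specific format.

BLOCK_DIM = 4            # blocks are 4x4 pixels
RAW_BITS_PER_PIXEL = 24  # uncompressed RGB
BLOCK_BITS = 64          # one compressed word per block


def compression_ratio():
    """Raw bits in one block divided by compressed bits."""
    raw_bits = BLOCK_DIM * BLOCK_DIM * RAW_BITS_PER_PIXEL  # 384
    return raw_bits / BLOCK_BITS


def compressed_bits_per_pixel():
    """Effective bits per pixel after compression."""
    return BLOCK_BITS / (BLOCK_DIM * BLOCK_DIM)


def block_byte_offset(x, y, texture_width):
    """Byte offset of the block holding pixel (x, y), assuming row-major blocks."""
    blocks_per_row = texture_width // BLOCK_DIM
    block_index = (y // BLOCK_DIM) * blocks_per_row + (x // BLOCK_DIM)
    return block_index * (BLOCK_BITS // 8)


if __name__ == "__main__":
    print(compression_ratio())            # 6.0
    print(compressed_bits_per_pixel())    # 4.0
    print(block_byte_offset(5, 9, 256))   # 1032
```

Because every block is the same size, finding the data for any pixel is a single multiplication rather than a search through a variable-length stream, which is what makes random access cheap.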
The inventor of PVRTC, Simon Fenney, makes the key observation in his original publication: each 4×4 block of texels is handled completely independently in these block-based schemes. These approaches are efficient, but they fail to take any advantage of possible similarity between adjacent blocks, and they create a big opportunity for discontinuities at block boundaries as compression produces unexpected results, leading to undesirable artifacts.
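The independence problem can be illustrated with a toy grayscale block codec. This is a hypothetical endpoint-plus-index scheme written for this post, not the bit layout of any real format: each block stores its own minimum and maximum plus a 2-bit index per value, and knows nothing about its neighbors.

```python
# Toy grayscale block codec: two endpoints plus a 2-bit index per value.
# Hypothetical scheme for illustration; not any real format's bit layout.

def encode_block(values):
    """Encode 16 grayscale values as (lo, hi, indices) with indices in 0..3."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo, hi, [0] * len(values)
    # Snap each value to the nearest of 4 evenly spaced levels between lo and hi.
    indices = [round((v - lo) * 3 / (hi - lo)) for v in values]
    return lo, hi, indices


def decode_block(lo, hi, indices):
    """Rebuild approximate values from the endpoints and indices."""
    return [lo + (hi - lo) * i / 3 for i in indices]


if __name__ == "__main__":
    # A smooth ramp split across two neighboring blocks, encoded separately.
    left = list(range(0, 160, 10))      # 0, 10, ..., 150
    right = list(range(160, 320, 10))   # 160, 170, ..., 310
    for block in (left, right):
        decoded = decode_block(*encode_block(block))
        print([round(d) for d in decoded])
```

Each block snaps its values to its own four levels, so the smooth ramp turns into steps, and nothing ties the levels chosen in one block to those chosen next door.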
The question then becomes whether a scheme can be designed that takes into account the entire texture image, or at least bigger regions of it, while still retaining relatively simple decompression operations that can be implemented in GPU hardware. The answer is a combination of analyzing the variation in the texture image and applying a modulation scheme that uses low frequency signals instead of constant per-block color values, with a heavy dose of bilinear upscaling.
Here is the very abbreviated version of the PVRTC process: the texture map image is low-pass filtered and decomposed into two separate low resolution images centered around the original signal, plus an image of modulation weights used to guide blending of the two upscaled low resolution images back into the final texture map. The data for the two images and the modulation values are strung together into 64-bit words, each of which can be processed using relatively straightforward operations back into a 4×4 block of texels.
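The decode side of that description can be sketched in a few lines of Python. This is a simplified, single-channel illustration and not Imagination’s implementation: the function names are made up for this post, the bit packing is skipped entirely, edges are simply clamped, and the four modulation weights (0, 3/8, 5/8, 1) are an assumption about the 4bpp mode.

```python
import math

# Assumed modulation weights selected by a 2-bit index per texel.
WEIGHTS = (0.0, 3 / 8, 5 / 8, 1.0)


def bilinear_upscale(low, scale=4):
    """Bilinearly upscale a 2D list of floats by `scale`, clamping at the edges."""
    h, w = len(low), len(low[0])

    def at(r, c):
        # Clamp coordinates so lookups past the border reuse the edge value.
        return low[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    out = []
    for y in range(h * scale):
        fy = (y + 0.5) / scale - 0.5   # output texel center in low-res coordinates
        y0 = math.floor(fy)
        ty = fy - y0
        row = []
        for x in range(w * scale):
            fx = (x + 0.5) / scale - 0.5
            x0 = math.floor(fx)
            tx = fx - x0
            top = at(y0, x0) * (1 - tx) + at(y0, x0 + 1) * tx
            bottom = at(y0 + 1, x0) * (1 - tx) + at(y0 + 1, x0 + 1) * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out


def decode(low_a, low_b, mod_index, scale=4):
    """Blend upscaled images A and B per texel: A + weight * (B - A)."""
    up_a = bilinear_upscale(low_a, scale)
    up_b = bilinear_upscale(low_b, scale)
    return [
        [a + WEIGHTS[m] * (b - a) for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(up_a, up_b, mod_index)
    ]


if __name__ == "__main__":
    # One low-res texel per image covers a single 4x4 block.
    dark, bright = [[0.0]], [[1.0]]
    checker = [[3 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
    for row in decode(dark, bright, checker):
        print(row)
```

The point of the design is that the two low resolution images vary smoothly across block boundaries after upscaling, while the per-texel weights put the fine detail back in.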
If even that oversimplified description sounds complicated, remember the filtering and compression process is handled offline as part of the application design. The act of compressing texture maps can be fairly sophisticated if it results in something that is easier to decompress at GPU render time, and that was the goal of PVRTC. Fenney’s original work suggests the compression scheme was the limiting factor at the time, and it has since matured.
Offline pre-processing of texture images is where Imagination’s PVRTexTool comes in, or for iOS developers, Apple’s texturetool using the PVR format. You might notice that the Apple examples encode PNG images, which use lossless compression. JPEG images are already somewhat compromised by lossy compression, and layers of lossy compression, no matter how good each step is, tend to have undesirable results. Best results come from starting with original images of the best possible quality, preferably uncompressed.
The overall mobile graphics experience is a combination of starting image quality, display quality, the GPU hardware, and the texture compression scheme. The visual results can be impacted by many choices, and the variables of memory usage, performance, and power consumption also factor in. PVRTC (and its successor, PVRTC2, as described by Imagination) combined with the PowerVR GPU family architecture can produce world-class results, perhaps best showcased by the Apple iPhone and iPad implementations but certainly available in other devices.
Next up: how things look in a texture compressed world, with PVRTC applied.