When we last visited texture compression technology for OpenGL ES on mobile GPUs, we mentioned Squish image quality results in passing, but weren’t able to explore a key technology at the top of the results. With today’s introduction of the ARM Mali-T720 GPU IP, let’s look at the texture compression technology inside: Adaptive Scalable Texture Compression, or ASTC.
In contrast to some proprietary texture compression implementations, ASTC is backed by the Khronos Group – and royalty-free. This multi-vendor backing by the likes of AMD, ARM, NVIDIA, Qualcomm, and even Imagination Technologies bodes well, and adoption is gaining momentum. For example, CES 2014 saw NVIDIA’s Tegra K1 and Imagination’s PowerVR Series6XT GPU core both add ASTC support to the mix, and the latest homegrown Adreno 420 GPU core from Qualcomm, showcased in the Snapdragon 805, also features ASTC.
Until these recent developments, many have equated ASTC technology with ARM, probably due to their leadership in the Khronos specification activity, and an IP onslaught designed to challenge Imagination head on. ARM was able to stake out a claim to ASTC support beginning with the Mali-T624 and subsequent versions of their GPU IP family.
First released in 2012, ASTC had the advantage of learning from real-world use of other popular texture compression formats and targeting the room for improvement. The problem, as the opening of the original Nystad presentation on ASTC from HPG 2012 puts it, is the wide field of use cases with varying requirements for color components, dynamic range, 2D versus 3D, and quality. Most of the existing formats do well in a particular use case, but not in others, and mobile designers have generally opted for more decode efficiency at the expense of some image quality.
ASTC is a lossy, block-based scheme and, like other schemes, is designed to decode textures in constant time with one memory access. The big difference is that every block occupies a fixed 128 bits while the number of texels it covers varies, which gives a finely scalable bit rate. ASTC also supports both 2D and 3D textures, LDR and HDR pixel formats, and an orthogonal choice of base format – L, LA, RGB, RGBA – so it can adapt to almost any situation without affecting the bit rate.
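To make the scalable bit rate concrete, here is a minimal sketch of the arithmetic: the 128 bits of a block are simply divided by the number of texels the block covers. The footprint list is the set of 2D block sizes from the ASTC specification; the function and constant names are illustrative only and do not come from any ARM tool.

```python
# Every ASTC block occupies 128 bits, however many texels it covers.
BLOCK_BITS = 128

# 2D block footprints (width, height) defined by the ASTC specification.
FOOTPRINTS_2D = [
    (4, 4), (5, 4), (5, 5), (6, 5), (6, 6), (8, 5), (8, 6),
    (8, 8), (10, 5), (10, 6), (10, 8), (10, 10), (12, 10), (12, 12),
]


def bits_per_texel(width, height, depth=1):
    """Return the bit rate for a block covering width x height x depth texels."""
    return BLOCK_BITS / (width * height * depth)


if __name__ == "__main__":
    for w, h in FOOTPRINTS_2D:
        print(f"{w}x{h}: {bits_per_texel(w, h):.2f} bits per texel")
```

A 4x4 block works out to 8 bits per texel and a 12x12 block to roughly 0.89, with a dozen steps in between, which is where the fine-grained control comes from.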
All this control provides a way to scale image quality more precisely, without changing compression schemes for different situations, but at first glance one might think it would blow up storage space requirements with all those settings running around. The math magic behind ASTC is fairly complex, depending on a strategy called bounded integer sequence encoding (BISE).
We know power-of-two encoding can be dreadfully inefficient, but BISE takes a seemingly impossible approach: effectively, fractional bits per value. Long story short (follow the above BISE link for a more complete ARM explanation), BISE mixes base-2, base-3, and base-5 digits so that the needed values pack efficiently into a fixed 128-bit block. That magic explains the above bit rate table and its rather non-intuitive block sizes.
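The fractional-bit idea is easier to see with numbers. The sketch below packs five base-3 digits (trits) into 8 bits and three base-5 digits (quints) into 7 bits, which are the groupings BISE uses. It relies on plain positional packing for clarity; that is an assumption made for illustration, since the real format uses its own bit layout chosen for cheap hardware decode. All function names are hypothetical.

```python
import math


def pack_digits(values, base):
    """Pack base-`base` digits into one integer, least significant digit first."""
    packed = 0
    for v in reversed(values):
        if not 0 <= v < base:
            raise ValueError(f"digit {v} out of range for base {base}")
        packed = packed * base + v
    return packed


def unpack_digits(packed, base, count):
    """Recover `count` base-`base` digits from an integer built by pack_digits."""
    values = []
    for _ in range(count):
        packed, digit = divmod(packed, base)
        values.append(digit)
    return values


if __name__ == "__main__":
    trits = [2, 0, 1, 2, 1]
    packed = pack_digits(trits, 3)
    assert unpack_digits(packed, 3, 5) == trits

    # 3**5 = 243 fits in 8 bits, and 5**3 = 125 fits in 7 bits.
    print("bits per trit, naive 2-bit storage:", 2)
    print("bits per trit, 5 trits in 8 bits:  ", 8 / 5)
    print("bits per trit, theoretical minimum:", round(math.log2(3), 3))
    print("bits per quint, naive 3-bit storage:", 3)
    print("bits per quint, 3 quints in 7 bits: ", round(7 / 3, 3))
    print("bits per quint, theoretical minimum:", round(math.log2(5), 3))
```

Grouping digits this way costs 1.6 bits per trit instead of 2 and about 2.33 bits per quint instead of 3, very close to the theoretical minimums of log2(3) and log2(5).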
There are a few further insights to ASTC that make it pretty amazing:
- As with other texture compression schemes, the heavy lifting is done during encoding, a software tool running on a host – all the mobile GPU has to worry about is fast texture decompression.
- With all these options in play, the only things fixed in the implementation are the 128-bit block size and the block dimensions chosen for the texture as a whole – the remaining settings can vary block-to-block, meaning image encoding can change across an image dynamically based on the needs. (In theory, at least. I’m not sure ARM’s encoder tool actually does this, and in most comparisons, a group of settings applies to the entire image.)
- The end result of more efficient and finer grained encoding is better signal-to-noise ratios – those better Squish results we mentioned earlier, with ARM indicating differences of 0.25dB can be detected by some human eyes.
- ARM’s ASTC implementation is synthesizable RTL (plus ARM POP IP technology for hardening), allowing it to find homes with customers choosing Mali GPU IP or customers implementing their own GPUs like the ones listed above – and the absence of per-unit royalties is attractive for many potential users.
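For a sense of scale on the signal-to-noise point in the list above, here is a minimal sketch of peak signal-to-noise ratio (PSNR), a common way such quality comparisons are expressed. Whether the Squish results use exactly this metric is an assumption, and the function names are illustrative.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs, no error at all
    return 10 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(db_gain):
    """Factor by which mean squared error shrinks for a given PSNR gain in dB."""
    return 10 ** (db_gain / 10)


if __name__ == "__main__":
    print(f"0.25 dB gain = MSE lower by a factor of {mse_ratio_for_gain(0.25):.4f}")
```

Because the scale is logarithmic, a 0.25 dB improvement corresponds to roughly 6% less mean squared error.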
Now, ARM breaks back into the midrange with the cost-optimized Mali-T720 GPU core targeting Android devices, fully supporting ASTC and other enhancements including a scalable 1 to 8 core engine and partial rendering. As a companion to the optimized ARM Cortex-A17 core, the Mali-T720 continues to shrink die area targeting a 28nm process, while improving energy efficiency and graphics performance running at 695 MHz.
ASTC may be in the early stages of taking over mobile GPU texture compression, especially on Android implementations where platform variety is larger. The adoption by Qualcomm is especially significant, and I’m looking forward to Jon Peddie’s 2014 mobile GPU data to see what kind of impact the availability of more ARM Mali GPU IP is having. Stay tuned.