We’re currently experiencing a once-in-a-decade revolution in the world of digital entertainment. The rising popularity of video streaming services among consumers is creating whole new business models for companies involved in this fast-growing market.

The good news is that standards bodies have been working diligently to prepare the industry for the shift to high-resolution video streaming. Several groups are focused on delivering video to mobile and embedded devices as efficiently as possible, including the ITU-T (H.264 and H.265) and the WebM Project (VP8 and VP9).

The primary aim of these current codecs is to achieve significant reductions in bitrate while maintaining the same (or higher) level of quality compared to their predecessors. For many mobile and embedded applications, this means that multimedia codecs must be aggressively optimized to handle the increase in computational complexity and meet the needs of low power and cost-sensitive devices.

[Figure: The typical architecture of a video codec]

Until now, the optimization process usually required a deep understanding of the underlying hardware and a lot of hand-written assembly code specific to a particular microarchitecture. Unfortunately, this approach often became a nightmare for developers, who had to maintain separate code trees for different architectures and platforms.

[Figure: The VP9 vertical motion compensation filter]

To address this problem, Imagination introduced the MIPS SIMD Architecture (MSA), a standardized set of instructions designed to meet the requirements of a wide range of compute-intensive applications.

In a whitepaper published on our website, we’ve recently demonstrated how developers can use our MSA technology to optimize video codecs in a manner that’s flexible, scalable and works seamlessly across multiple platforms.

Today we are announcing that these new optimizations are now available for the latest video decoders under FFmpeg v2.8.1 (HEVC, VP9, VP8, AVC, MPEG-4/H.263, MPEG-2) as well as for the AVC encoder under x264.

Using only compiler-friendly C code, we’ve developed a library of optimized, cross-platform built-in datatypes and intrinsics to accelerate video codecs on MIPS P- and I-class Warrior CPUs. The library covers frequently used arithmetic vector operations (addition, subtraction, multiplication, shifts, etc.) and offers a complete replacement for handcrafted assembly code that is reusable across any platform integrating MSA-capable MIPS Warrior CPUs.

This approach also improves maintainability and significantly reduces development time, which makes it easy to port the code to any future MSA architecture. The code is written in the order in which the data is processed, so it stays readable, while register allocation and instruction scheduling are left to the compiler.
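As a rough illustration of what compiler-friendly vector code can look like, here is a minimal sketch of a rounded average of two pixel rows, a common step in motion compensation. It does not use the MSA library described above: it relies on the generic vector extensions available in GCC and Clang, and the names vec8u16 and avg_row8 are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* Generic GCC/Clang vector type: eight 16-bit lanes in one 128-bit value.
   Illustrative stand-in only; this is not a type from the MSA library. */
typedef uint16_t vec8u16 __attribute__((vector_size(16)));

/* Rounded average of two rows of 8 pixels:
   dst[i] = (a[i] + b[i] + 1) >> 1.
   The arithmetic is written once, on whole vectors; the compiler picks
   the registers and schedules the instructions. */
static void avg_row8(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    const vec8u16 one = {1, 1, 1, 1, 1, 1, 1, 1};
    vec8u16 va, vb, sum;

    /* Widen the 8-bit pixels to 16 bits so the sum cannot overflow. */
    for (int i = 0; i < 8; i++) {
        va[i] = a[i];
        vb[i] = b[i];
    }

    /* One vector add and one vector shift cover all eight pixels. */
    sum = (va + vb + one) >> one;

    /* Narrow back to 8 bits; the average of two bytes always fits. */
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)sum[i];
}
```

The body reads in processing order (widen, add, shift, narrow) and contains no register names or instruction mnemonics, which is the property the paragraph above describes.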

Last but not least, the performance gains are competitive. For example, we’ve reduced the total instruction count by up to 10 times for a vertical filter in the VP9 codec.
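To make the vertical filter more concrete, the sketch below shows an 8-tap vertical filter in two forms: a scalar reference and a version that handles four pixels per step. It is only an approximation of the idea, under several assumptions: it uses generic GCC/Clang vector extensions instead of MSA intrinsics, the function names are hypothetical, the width is assumed to be a multiple of four, and example_taps is an illustrative set of coefficients summing to 128, in the style of a VP9 half-sample filter. It is not the code measured above and says nothing about actual instruction counts.

```c
#include <stdint.h>

#define TAPS        8   /* filter length */
#define FILTER_BITS 7   /* taps sum to 1 << FILTER_BITS = 128 */

/* Four 32-bit lanes; a generic GCC/Clang vector type, not an MSA type. */
typedef int32_t vec4i32 __attribute__((vector_size(16)));

/* Illustrative coefficients (assumption): they sum to 128. */
static const int16_t example_taps[TAPS] = {-1, 6, -19, 78, 78, -19, 6, -1};

static uint8_t clip_u8(int32_t v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : (uint8_t)v);
}

/* Scalar reference. src points at the first of the TAPS input rows.
   Relies on arithmetic right shift of negative values, as GCC and
   Clang provide. */
static void vfilter_scalar(uint8_t *dst, const uint8_t *src, int stride,
                           const int16_t *taps, int width)
{
    for (int x = 0; x < width; x++) {
        int32_t sum = 1 << (FILTER_BITS - 1);        /* rounding term */
        for (int k = 0; k < TAPS; k++)
            sum += src[k * stride + x] * taps[k];
        dst[x] = clip_u8(sum >> FILTER_BITS);
    }
}

/* Vector sketch: the same arithmetic on four adjacent pixels at once.
   width must be a multiple of 4. */
static void vfilter_vec4(uint8_t *dst, const uint8_t *src, int stride,
                         const int16_t *taps, int width)
{
    const vec4i32 round = {64, 64, 64, 64};          /* 1 << 6 per lane */
    const vec4i32 shift = {7, 7, 7, 7};              /* FILTER_BITS per lane */

    for (int x = 0; x < width; x += 4) {
        vec4i32 sum = round;
        for (int k = 0; k < TAPS; k++) {
            const uint8_t *row = src + k * stride + x;
            vec4i32 pix = {row[0], row[1], row[2], row[3]};
            vec4i32 tap = {taps[k], taps[k], taps[k], taps[k]};
            sum += pix * tap;                        /* multiply-accumulate */
        }
        sum = sum >> shift;
        for (int i = 0; i < 4; i++)
            dst[x + i] = clip_u8(sum[i]);
    }
}
```

Called on the same eight input rows with example_taps, both functions should produce identical output; on a flat image every output pixel equals the input value, because the taps sum to 128 and the result is shifted right by 7.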

To learn more about MSA and optimizing video codecs, download the whitepaper from our website. If you’d like to start prototyping software for MSA-capable Warrior CPUs, we’ve integrated this functionality in the latest QEMU emulator provided by the prpl foundation and the free Codescape SDK offered by Imagination. More advanced users can contact us to get access to the professional edition of our Codescape SDK.

About the author: Alex Voica

Before deciding to pursue his dream of working in technology marketing, Alexandru held various engineering roles at leading semiconductor companies in Europe. His background also includes research in computer graphics and VR at the School of Advanced Studies Sant'Anna in Pisa. You can follow him on Twitter @alexvoica.