Originally created in the early 1980s, the MIPS instruction set architecture is one of the earliest RISC architectures and remains one of the few CPU architectures in wide use today. To stay usable and relevant for modern market segments, the architecture has continued to evolve over time.

Since I became manager of the instruction set architecture group within MIPS, we have introduced significant enhancements to the MIPS architecture.

With the introduction of Release 5, I wanted to provide a brief glimpse into the architecture’s evolution during my time here at MIPS. This has included major upgrades to the base MIPS32® and MIPS64® architectures, an entirely new instruction set architecture called microMIPS™, and new extensions to the MIPS architecture that provide functionality targeting specific application areas. These extensions are available either as separate Application Specific Extensions (ASEs), or incorporated as optional modules within the base architecture.

Base Architecture Release 2.8

In 2009, we released an update of the architecture in which we concentrated on the memory management unit (MMU). With the introduction of the MIPS32® and MIPS64® architectures in 1999, the joint translation lookaside buffer (JTLB) MMU was standardized. This standardization allowed multiple MIPS vendors to share kernel code as well as user application code. Because this JTLB MMU originated in the early 1990s with the R4000 CPU, it had seen two decades of service and was becoming too small for modern programs and their large working sets. Release 2.8 included the following enhancements to the MMU to deal with this issue.

  • Support for a JTLB MMU with more than 64 TLB entries. The larger JTLB aids in reducing the frequency of TLB misses.
  • An MMU configuration which supports both a variable-page-size VTLB and a fixed-page-size, set-associative FTLB. The VTLB has the same capabilities as the JTLB, and its main purpose is to allow variable-sized pages. The set-associative FTLB can be implemented with normal RAM arrays, which allows for a very large TLB array (entries in the thousands are possible).
  • For MIPS64, very large TLB pages, from 1GB to 256TB, are supported. One large page can replace many smaller pages.
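
The effect of more entries and larger pages can be estimated with simple "TLB reach" arithmetic. The sketch below is only an illustration: the entry counts and page sizes in the usage note are hypothetical, and the function names are not part of any MIPS tool or specification. It assumes the classic MIPS arrangement in which each TLB entry maps an even/odd pair of pages.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_reach(entries: int, page_bytes: int, pages_per_entry: int = 2) -> int:
    """Bytes of virtual memory that can be mapped at once without a TLB miss.

    pages_per_entry defaults to 2 on the assumption that each entry
    maps an even/odd pair of pages.
    """
    return entries * pages_per_entry * page_bytes


def pages_replaced(large_page_bytes: int, small_page_bytes: int) -> int:
    """Number of small pages that a single large page stands in for."""
    return large_page_bytes // small_page_bytes
```

For example, `tlb_reach(64, 4 * KIB)` gives 512KB of reach for a hypothetical 64-entry TLB with 4KB pages, and `pages_replaced(GIB, 4 * KIB)` shows that one 1GB page covers the same range as 262,144 pages of 4KB.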

Another important issue we addressed at that time was support for symmetric multi-processing (SMP), as MIPS introduced its first SMP product during this timeframe. Improvements for SMP support included:

  • The PAUSE instruction to de-allocate a (virtual) processor when arbitration for a lock doesn’t succeed. This allows for lower power consumption as well as lower snoop traffic when multiple (virtual) processors are arbitrating for a lock.
  • More flavors of memory barriers, available through the stype field of the SYNC instruction. The newer memory barriers attempt to minimize the number of pipeline stalls during memory synchronization operations.
  • An enhancement of the CACHE management instruction to broadcast its operations to work on multiple cores in a coherent fashion.

In addition to the instruction set changes for SMP, the MIPS architecture also included standardized system-level components for SMP such as:

  • A Coherency Manager, which sends the memory transaction snoops to all of the coherent processors.
  • A Global Interrupt Controller, which can route up to 256 system-level interrupts to any virtual/physical processor.
  • An IO Coherency Unit, which allows IO devices to generate coherent memory transactions.
  • Coherent memory transactions, which follow the OCP-IP 3.0 specification.

Other enhancements within Release 2.8 included:

  • Scratch registers within Coprocessor0 for kernel mode software: this feature aids in quicker exception handling by not requiring the saving of user-mode registers onto the stack before kernel-mode software uses those registers.
  • The CDMM (common device memory map) scheme for the placement of small I/O devices into the physical address space: this scheme allows for efficient packing of such I/O devices into a memory region much smaller than a TLB page.
  • An EIC (external interrupt controller) mode where the EIC controller supplies a 16-bit interrupt vector: this allows different interrupts to share code.

Base Architecture Release 3

In 2010, MIPS Technologies introduced Release 3 of the MIPS architecture. The major feature of this release was the microMIPS™ instruction set. microMIPS is a complete ISA that can be used to replace MIPS32 or MIPS64 to reduce code size. This was MIPS’ second generation instruction set targeting applications where minimal code/image sizes are important. These include systems where expensive FLASH memory dominates the system cost, as well as other cost-constrained systems. Improvements from the older MIPS16e™ instruction set included:

  • Availability of floating point instructions.
  • Availability of privileged instructions for kernel-mode OS code.
  • Achieving good code sizes while retaining MIPS32/64 levels of performance.

In Release 3, MIPS continued to add features to aid memory management:

  • A more flexible version of the context register that can point to any power-of-two sized data structure. This allows the context register to be used for the first memory access of the TLB refill handler, removing the need to calculate that first pointer address during the TLB refill handler, and thus making the refill handler run faster.
  • Additional protection bits in the TLB entries that allow for non-executable and write-only virtual pages: these security bits help stop malicious code methods such as buffer overflow attacks, and help create secure systems by protecting pages from being read.
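
To illustrate why a pointer-shaped context register shortens the refill handler, the sketch below models how a table base address and the faulting virtual page number can be combined into a ready-to-use pointer. The layout used here (4KB pages, one 16-byte slot per even/odd page pair) and the names `pte_base`, `page_shift` and `slot_bytes` are assumptions made for illustration; this is not the architected register format.

```python
def refill_pointer(pte_base: int, bad_vaddr: int,
                   page_shift: int = 12, slot_bytes: int = 16) -> int:
    """Model of a context-style pointer into a flat page table.

    pte_base   -- table base, assumed aligned to the full table size
    bad_vaddr  -- virtual address that missed in the TLB
    page_shift -- log2 of the page size (assumed 4KB pages)
    slot_bytes -- power-of-two size of the slot describing one page pair
    """
    if slot_bytes <= 0 or slot_bytes & (slot_bytes - 1):
        raise ValueError("slot_bytes must be a power of two")
    vpn2 = bad_vaddr >> (page_shift + 1)  # index of the even/odd page pair
    # With an aligned base, OR-ing in the scaled index equals base + offset,
    # so the handler can load from this value without further arithmetic.
    return pte_base | (vpn2 * slot_bytes)
```

Because the slot size is a parameter, the same model covers tables with 8-byte, 16-byte or larger slots, which is the kind of flexibility the more general context register is meant to provide.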

MCU Application Specific Extension (ASE)

Along with microMIPS, at this time we also introduced other features to aid our customers developing products for the microcontroller (MCU) market. These were collected in the MCU Application Specific Extension (ASE), and included these features:

  • Improved interrupt delivery through an increase in the number of CPU interrupt inputs from 6 signals to 8 signals.
  • Improved interrupt latency through automation of some of the control register saving and restoring on the stack.
  • Improved interrupt latency through the ability to handle multiple interrupts before returning from the exception handler.
  • Instructions to atomically set and clear bits within I/O devices.

Base Architecture Release 3.5

In early 2012, Release 3.5 incorporated important new features to extend the life of the 32-bit version of the architecture by making the virtual address spaces more usable.

This included a more programmable virtual address space map without fixed cache-ability and map-ability attributes. This Enhanced Virtual Addressing (EVA) scheme allows the implementations to decide how large/small uncached/unmapped segments need to be. The EVA capabilities are implemented through the Segmentation Control registers.

Along with the programmable virtual address map, it is possible to create separate user-mode and kernel-mode views of segments. This allows a larger kernel virtual address space to be defined. To access both the larger kernel address space and the overlapping user space, we introduced additional load/store instructions.

Prior to these enhancements, MIPS implementations were limited to 512MB of memory that was directly addressable in kernel mode (without using the TLB). That effectively prevented Linux-based MIPS systems from using more than 256MB of physical memory without complex workarounds.

With these new features, it is now possible to build MIPS32 systems with up to 3.5GB of memory which is directly addressable in kernel mode.
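
For context, the fixed legacy MIPS32 address map that EVA makes programmable can be modeled in a few lines. The sketch below is a simplified illustration rather than an implementation of Segmentation Control: the segment boundaries are the traditional ones, while the table layout and the function names are hypothetical.

```python
# (name, first address, last address, translated by the TLB?, cache attribute)
LEGACY_SEGMENTS = [
    ("kuseg",   0x00000000, 0x7FFFFFFF, True,  "per TLB entry"),
    ("kseg0",   0x80000000, 0x9FFFFFFF, False, "cacheable"),
    ("kseg1",   0xA0000000, 0xBFFFFFFF, False, "uncached"),
    ("kseg2/3", 0xC0000000, 0xFFFFFFFF, True,  "per TLB entry"),
]


def classify(vaddr: int) -> str:
    """Return the legacy segment name that contains a 32-bit virtual address."""
    for name, first, last, _mapped, _cache in LEGACY_SEGMENTS:
        if first <= vaddr <= last:
            return name
    raise ValueError("not a 32-bit address: %#x" % vaddr)


def segment_size(name: str) -> int:
    """Size in bytes of the named legacy segment."""
    for seg, first, last, _mapped, _cache in LEGACY_SEGMENTS:
        if seg == name:
            return last - first + 1
    raise KeyError(name)


def unmapped_to_physical(vaddr: int) -> int:
    """Physical address for a kseg0/kseg1 address (no TLB involved)."""
    if classify(vaddr) not in ("kseg0", "kseg1"):
        raise ValueError("address is translated by the TLB")
    return vaddr & 0x1FFFFFFF  # both windows alias the low 512MB
```

Here `segment_size("kseg0")` is 512MB, which is the directly addressable limit described above, and both unmapped windows alias the same low physical range.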

Other features introduced in Release 3.5 included:

  • TLB invalidate instructions: these are necessary with Segmentation Control, as it is now possible to create a virtual address map without unmapped segments. Previously unmapped addresses were used to create invalid TLB entries.
  • Support for IEEE-754-2008 FPU behaviors (as opposed to behaviors of the older IEEE-754-1985 standard).
  • Hardware TLB Page Walking, which removes the need for taking an exception and draining the execution pipelines in order to fill the TLB. This aids CPU performance in programs with large working sets that would otherwise take many TLB refill exceptions.
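
As a rough illustration of the work a page table walker takes over from the software refill handler, the sketch below walks a generic two-level table. Everything about the layout is an assumption for illustration (4KB pages, 10 index bits per level, Python dictionaries standing in for tables in memory); it does not reflect the table formats the MIPS architecture actually defines.

```python
PAGE_SHIFT = 12                      # assumed 4KB pages
INDEX_BITS = 10                      # assumed index width at each level
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class WalkFailed(Exception):
    """No valid translation found; software would have to take over."""


def walk(root: dict, vaddr: int) -> int:
    """Translate vaddr using a two-level table and return a physical address."""
    top = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    leaf = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    second_level = root.get(top)      # first memory access of the walk
    if second_level is None:
        raise WalkFailed(hex(vaddr))
    frame = second_level.get(leaf)    # second memory access of the walk
    if frame is None:
        raise WalkFailed(hex(vaddr))
    return (frame << PAGE_SHIFT) | (vaddr & OFFSET_MASK)
```

With a table such as `{1: {2: 0x12345}}`, the address `0x00402ABC` translates to `0x12345ABC`. When these lookups are done by hardware, the common case no longer needs an exception and pipeline drain.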

Base Architecture Release 5

In December 2012, MIPS publicly announced Release 5 of the MIPS architecture (we skipped Release 4 because the number four is considered by many to be inauspicious or unlucky).  The major components of this release are the introduction of two new optional modules:

  • Hardware support for virtualization – VZ Module. Virtualization is used to run multiple operating systems simultaneously and securely on one CPU. This allows for easier migration of applications and their associated OS between different CPUs within enterprise applications. It also allows specific OS versions to be run for specific applications. For consumer devices, this enables increased security between one OS on which users are allowed to add their own applications, and a secondary mission-critical OS with which users cannot tamper. The benefits of the VZ Module are described more fully in other white papers covering security and the details of the VZ Module's capabilities.
  • 128b-wide Single-Instruction-Multiple-Data (SIMD) architecture – MSA Module. SIMD instructions are used to increase performance in media processing, signal processing and graphics rendering. The MSA doubles the vector length when compared to the preceding Digital Signal Processing (DSP) ASE. Both fixed-point and floating-point SIMD operations are supported in the MSA. The benefits of the MSA Module are described more fully in another white paper.
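
The idea of a 128-bit vector can be sketched as a fixed-width register split into equal lanes, with one operation applied to every lane. The model below is only an illustration: the function names are hypothetical, it covers a single integer operation, and the wrap-around on overflow is an assumption of the sketch rather than a statement about any particular MSA instruction.

```python
VECTOR_BITS = 128


def lanes(element_bits: int) -> int:
    """Number of elements of the given width that fit in one vector."""
    if element_bits not in (8, 16, 32, 64):
        raise ValueError("unsupported element width")
    return VECTOR_BITS // element_bits


def simd_add(a: list, b: list, element_bits: int = 32) -> list:
    """Lane-wise integer add; each lane wraps around independently."""
    n = lanes(element_bits)
    if len(a) != n or len(b) != n:
        raise ValueError("expected %d elements per vector" % n)
    mask = (1 << element_bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]
```

With 32-bit elements there are four lanes, so `simd_add([1, 2, 3, 0xFFFFFFFF], [10, 20, 30, 1])` produces `[11, 22, 33, 0]` in what would be a single vector operation, while 8-bit elements give sixteen lanes.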

In addition, with Release 5, MIPS has made some older ASEs available as part of the base architecture as modules. This includes:

  • Digital Signal Processing – DSP Module. The DSP Module is an older and lighter-weight SIMD instruction set targeting audio and voice processing.
  • Multi-threading support – MT Module. Multi-threading is a method of increasing processing throughput when there are frequent cache misses.

Previously, these modules were sold as separate add-on ASE products. These modules are now part of the Release 5 Base Architecture license.

The MIPS instruction set architecture provides a modern, capable CPU solution

By continuously adding to the capabilities of the MIPS architecture, we are able to meet the ever-increasing and changing demands of multiple market segments. This allows the MIPS architecture to be appropriate and scalable across a very wide range of markets – all the way from small cost-constrained systems such as microcontrollers to very large scale, high-performance enterprise applications.

Optimizing architectural support for small-footprint, entry-level 32-bit systems has driven the addition of an entirely new and improved code-size-reduction instruction set, with improved interrupt handling features ideal for low-end systems such as microcontrollers and other deeply embedded products.

For high-end systems, we improved memory management and introduced SMP support and hardware support for virtualization. For high-performance systems, we introduced wide SIMD support and hardware page table walking. As it has for nearly 30 years, the MIPS architecture will continue to evolve to support future generations of electronic products, with their ever-increasing requirements for greater processing capability.

About the author: David Lau
