Key Takeaways
- RISC-V is expanding from microcontroller-centric applications to include application processors that require virtualization support.
- The complexity of the RISC-V MMU standard makes it challenging for verification teams to develop comprehensive test plans.
- Virtualization offers benefits like multiple active programs, memory overflow management, and memory isolation between processes.
- Breker Verification is working to create a system VIP for MMU verification to assist design and verification teams.
In the early days of RISC-V adoption, applications were microcontroller-centric with no need for virtualization support. But horizons expanded and now RISC-V is appearing in application processors, which very much need to be able to virtualize multiple apps concurrently. Take another step forward to datacenter servers running virtual machines under hypervisors, with each virtual machine running multiple virtual processes. Virtualization is turning up everywhere in RISC-V, supported by a standard for the MMU that virtualization requires. But the complexity of the standard is taxing verification teams when it comes to developing comprehensive testplans. I talked to Adnan Hamid (President and CTO) and Dave Kelf (CEO) of Breker Verification to get insight.
A quick recap on virtual memory and MMUs
The basic idea is quite simple. Each software developer can assume their program is running standalone with as much memory as it needs. The operating system (and the MMU) supports this fiction through an indirection between virtual and physical memory. This memory space indirection allows the OS/MMU to allocate and move around chunks of memory in the form of pages to support multiple processes occupying physical memory and/or offline storage at the same time. Virtualization delivers multiple benefits: more than one program can be active at a time; each program can assume it has access to more memory than is physically available, since overflow can be swapped out to disk; the OS/MMU can transparently optimize to reduce memory fragmentation as running processes complete; and the OS/MMU can ensure memory isolation between processes, so if one process tries to access an out-of-bounds address in its own space, that attempt doesn’t affect other processes running at the same time.
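To make the indirection concrete, here is a minimal sketch in Python. It assumes 4 KiB pages and models each process's page table as a plain dictionary; the class name `AddressSpace` and the frame numbers are hypothetical, chosen only for illustration. The same virtual address in two processes lands in different physical frames, and an unmapped address faults without touching the other process.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages

class AddressSpace:
    """One process's private view: virtual page number -> physical frame number."""

    def __init__(self, name):
        self.name = name
        self.page_table = {}

    def map(self, vpn, pfn):
        # Record that virtual page `vpn` lives in physical frame `pfn`.
        self.page_table[vpn] = pfn

    def translate(self, vaddr):
        # Split the address into page number and offset within the page.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            # The fault is confined to this address space.
            raise LookupError(f"{self.name}: page fault at {vaddr:#x}")
        return self.page_table[vpn] * PAGE_SIZE + offset

proc_a = AddressSpace("A")
proc_b = AddressSpace("B")
proc_a.map(0, 7)  # hypothetical frame numbers
proc_b.map(0, 3)

print(hex(proc_a.translate(0x10)))  # 0x7010
print(hex(proc_b.translate(0x10)))  # 0x3010
```

A real OS would handle the fault by allocating or swapping in a page rather than raising an error; the sketch stops at the point where the fault is detected.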
Hypervisors add another level of indirection. These run multiple virtual machines, each in their own virtual space hosting an OS with services, in turn running multiple virtualized processes within that virtualized space. Nothing really complicated there.
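The extra level of indirection can be sketched as two lookups in sequence: guest virtual to guest physical (managed by the guest OS), then guest physical to host physical (managed by the hypervisor). The RISC-V hypervisor extension refers to these as VS-stage and G-stage translation. The Python below is an illustrative sketch only; the function names and table contents are hypothetical, and it ignores the fact that in real hardware the guest's own page tables live in guest physical memory, so each step of the guest walk also needs a second-stage translation.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages

def translate(table, addr, stage):
    """Single-stage lookup: page number -> page number, offset preserved."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise LookupError(f"{stage} fault at {addr:#x}")
    return table[page] * PAGE_SIZE + offset

def two_stage_translate(guest_table, host_table, guest_vaddr):
    # Stage 1: the guest OS maps guest virtual -> guest physical.
    guest_paddr = translate(guest_table, guest_vaddr, "guest-stage")
    # Stage 2: the hypervisor maps guest physical -> host physical.
    return translate(host_table, guest_paddr, "host-stage")

guest_table = {0x2: 0x5}  # guest virtual page 2 -> guest physical page 5
host_table = {0x5: 0x9}   # guest physical page 5 -> host physical page 9

print(hex(two_stage_translate(guest_table, host_table, 0x2ABC)))  # 0x9abc
```

Either stage can fault independently, which is one reason the verification space grows so quickly once a hypervisor is involved.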
Where it gets complicated
Seems pretty straightforward, right? Unfortunately it gets a whole lot more tangled in the details. MMU complexity isn’t unique to RISC-V. Adnan has previously worked on MMU verification for x86 and Arm-based systems and confirms there is plenty of complexity in both. Still, the RISC-V definition is unique in a few ways. First, the definition was finalized more recently, implying perhaps more time is needed for the standard (or at least documentation of the standard) to fully mature through widespread deployment. Second, in keeping with the RISC-V philosophy, MMU support is defined through extensions to the ISA, but the compatibility test framework requires demonstrating system level compatibility between multiple processors, probably coherent networks, the MMU, external memory and backing store. Third, the RISC-V standard teams saw an opportunity to further generalize the definition, no doubt adding more capability but also more complexity.
Some of the complexity is just in the nature of MMUs. Process image data is stored in pages, each page 4KB by default, but different profiles allow for larger pages, even a mix of page sizes. Pages are indexed by page tables, a lookup mechanism storing virtual and physical offsets for each page in memory. When a load or store is made to an address, the MMU will attempt to find the corresponding reference in these page tables. Naturally this lookup is supported by a cache (TLB) to enhance performance. If the appropriate address is already in a page in memory, the value can be returned/updated. If not, the access faults; the appropriate page is then found in main memory or backing store and brought in, making space by evicting some least recently used page currently in memory. When a hypervisor is active, lookup must go through two levels of indirection.
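The lookup path described above (TLB first, then page table, then a fault that pages in and evicts) can be sketched as a toy model. This is a deliberately simplified illustration, not how any particular MMU or OS is built: the class `TinyMMU`, its fields, and the exact-LRU policy are all assumptions. Real systems usually approximate LRU, split the work between hardware and the OS, and track dirty pages before eviction.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages

class TinyMMU:
    """Toy model: a small TLB in front of a set of resident pages."""

    def __init__(self, num_frames, tlb_entries):
        self.resident = OrderedDict()  # vpn -> frame, least recently used first
        self.free_frames = list(range(num_frames))
        self.tlb = OrderedDict()       # vpn -> frame, least recently used first
        self.tlb_entries = tlb_entries
        self.stats = {"tlb_hit": 0, "tlb_miss": 0, "page_fault": 0}

    def _page_in(self, vpn):
        # Page fault: take a free frame, or evict the least recently used page.
        self.stats["page_fault"] += 1
        if self.free_frames:
            frame = self.free_frames.pop(0)
        else:
            victim, frame = self.resident.popitem(last=False)
            self.tlb.pop(victim, None)  # drop any stale cached translation
        self.resident[vpn] = frame
        return frame

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.stats["tlb_hit"] += 1
            frame = self.tlb[vpn]
            self.tlb.move_to_end(vpn)
        else:
            self.stats["tlb_miss"] += 1
            frame = self.resident.get(vpn)
            if frame is None:
                frame = self._page_in(vpn)
            self.tlb[vpn] = frame
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)  # evict the oldest TLB entry
        self.resident.move_to_end(vpn)  # mark the page as most recently used
        return frame * PAGE_SIZE + offset

mmu = TinyMMU(num_frames=2, tlb_entries=1)
for addr in (0x0000, 0x0004, 0x1008, 0x0000, 0x2000):
    mmu.translate(addr)
print(mmu.stats)
```

Note the line that removes the victim from the TLB on eviction. Forgetting to invalidate a stale translation is exactly the kind of corner case a testplan needs to cover.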
Add to this multiple levels of page table to accelerate lookup, multiple address translation protocols, privilege management, and other goodies which play into the details of how the MMU should function to be compatible with the RISC-V compliance tests. There is a written specification which Adnan repeatedly called “dense”, meaning long and complex. No doubt very carefully thought through by experts, though there still seems to be some debate about whether it is fully finalized.
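As one example of a multi-level scheme, RISC-V's Sv39 mode splits a 39-bit virtual address into three 9-bit indices plus a 12-bit page offset, and walks three levels of table. The Python below is a much simplified sketch of that walk under stated assumptions: physical memory is modeled as a dictionary from address to 64-bit PTE value, and the helper names (`walk_sv39`, `make_pte`, `PageFault`) are hypothetical. It omits permission and privilege checks, accessed/dirty bit handling, canonical address checks, and physical memory protection, all of which the specification requires.

```python
PAGE_SHIFT = 12   # 4 KiB base pages
LEVELS = 3        # Sv39 uses three levels of page table
VPN_BITS = 9      # 512 entries per table
PTE_SIZE = 8      # bytes per page table entry
PTE_V, PTE_R, PTE_W, PTE_X = 1 << 0, 1 << 1, 1 << 2, 1 << 3

class PageFault(Exception):
    pass

def make_pte(ppn, flags):
    # In Sv39 the physical page number sits above the low 10 flag bits.
    return (ppn << 10) | flags

def walk_sv39(memory, root_ppn, vaddr):
    """memory: dict of physical address -> PTE value (a modeling assumption)."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    vpn = [(vaddr >> (PAGE_SHIFT + VPN_BITS * i)) & 0x1FF for i in range(LEVELS)]
    table_ppn = root_ppn
    for level in range(LEVELS - 1, -1, -1):
        pte_addr = (table_ppn << PAGE_SHIFT) + vpn[level] * PTE_SIZE
        pte = memory.get(pte_addr, 0)
        # Invalid entry, or the reserved write-without-read encoding.
        if not (pte & PTE_V) or ((pte & PTE_W) and not (pte & PTE_R)):
            raise PageFault(f"invalid PTE at level {level}")
        ppn = (pte >> 10) & ((1 << 44) - 1)
        if pte & (PTE_R | PTE_X):
            # Leaf entry. Above level 0 this maps a larger page, whose
            # low PPN bits must be zero and are filled from the address.
            low_mask = (1 << (VPN_BITS * level)) - 1
            if ppn & low_mask:
                raise PageFault("misaligned superpage")
            low_vpn = (vaddr >> PAGE_SHIFT) & low_mask
            return ((ppn | low_vpn) << PAGE_SHIFT) | offset
        table_ppn = ppn  # pointer entry: descend to the next level
    raise PageFault("no leaf PTE found")

# Hypothetical table layout: root at PPN 0x100, next levels at 0x101 and 0x102.
memory = {
    0x100008: make_pte(0x101, PTE_V),                  # root[1] -> level-1 table
    0x101008: make_pte(0x102, PTE_V),                  # L1[1]   -> level-0 table
    0x102008: make_pte(0x555, PTE_V | PTE_R | PTE_W),  # L0[1]   -> 4 KiB page
    0x101010: make_pte(0x600, PTE_V | PTE_R | PTE_X),  # L1[2]   -> 2 MiB page
}

print(hex(walk_sv39(memory, 0x100, 0x40201234)))  # 0x555234
print(hex(walk_sv39(memory, 0x100, 0x40403010)))  # 0x603010
```

Even this stripped-down version has several distinct fault paths and a mix of page sizes. Multiply that by the other translation modes, privilege levels and two-stage translation, and it becomes clearer why the specification reads as dense.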
Fairly quickly I get out of my depth in all this complexity. Instead I’ll turn to my own level of indirection by talking about what Breker has been doing to help DV teams in this space. Industrial experience in working with the standard is a pretty good indicator of maturity. One important point to remember is that the standard defines ISA extensions for MMU support, and it provides a system compatibility reference checker. It doesn’t tell you how to build your MMU or how to verify it. Both are left as exercises for the design and verification teams.
Breker SystemVIP for MMU verification
Breker hosted a tutorial on MMU testing at DVCon which was well attended (90 people). So popular that they have subsequently repeated the tutorial, reaching similar crowds. The tutorials reinforce that DV experts are struggling to know how to write testplans around MMUs for RISC-V-based systems.
Breker has put a lot of work into understanding these requirements to build a system VIP which can provide a canned starting point for DV testplans and test implementation. Adnan freely confesses that they aren’t all the way there yet. In Breker’s own work and in talking with clients, they know of holes in the Breker solution. Adnan says they have frequent and spirited discussions around whether the Breker interpretation is correct on any given point. At this point Adnan feels that Breker has it right more often than not, but they still consider feedback both to test and to drive refinements to their implementation. Meanwhile clients and prospects keep coming back to Breker with questions and arguments. A pretty good indication that even if incomplete, Breker is still leading the pack!
Very interesting. MMU system testing in the RISC-V world may be a niche but it’s a very important niche for anyone building a system which claims to support virtualization. You can learn more about Breker’s work in this space HERE.