The I/O memory management unit (IOMMU) is a type of memory management unit (MMU) that connects a Direct Memory Access (DMA) capable expansion bus to the main memory. It extends the system architecture by adding support for the virtualization of memory addresses used by peripheral devices. Additionally, it provides memory isolation and protection by enabling system software to control which areas of physical memory an I/O device may access. It also helps filter and remap interrupts from peripheral devices. Let’s have a look at IOMMU advantages, disadvantages, and how it is implemented from the perspective of hardware and software.
Advantages of IOMMU usage:
- One single contiguous virtual memory region can be mapped to multiple non-contiguous physical memory regions. IOMMU can make a non-contiguous memory region appear contiguous to a device (scatter/gather). Scatter/gather optimizes streaming DMA performance for the I/O device.
- Memory isolation and protection: device can only access memory regions that are mapped for it. Hence faulty and/or malicious devices can’t corrupt memory.
- Memory isolation allows safe device assignment to a virtual machine without compromising host and other guest OSes.
- IOMMU enables 32-bit DMA capable devices to access to > 4GB memory.
- Support hardware interrupt remapping. It extends limited hardware interrupts to software interrupts. Primary uses are the interrupt isolation and translation between interrupt domains (e.g. IOAPIC vs x2APIC on x86)
Disadvantages of IOMMU usage:
- Latency in dynamic DMA mapping, translation overhead penalty.
- Host software has to maintain in-memory data structures for use by the IOMMU
IOMMU’s architecture is designed to accommodate a variety of system topologies. There can be multiple IOMMUs located at a variety of places in the system fabric. It can be placed at bridges inside one bus or at bridges between busses of the same or different types. In modern systems, IOMMU is commonly integrated with the PCIe root complex. It connects to the PCIe bus from downstream and to the system bus (e.g. infinity fabric) from upstream. One example topology is shown below.
The IOMMU is configured and controlled via two sets of registers. One in the PCI configuration space and another set mapped in system address space. Since the IOMMU appears to OS as a PCI function, it has a capability block in the PCI configuration space. Additionally, up to eight data structures are placed in the main system memory. These data structures contain information about the mapping of interrupts and memory for the usage of peripheral devices. Information about the device’s memory access permissions is stored there too.
IOMMU is a generic name for technologies such as VT-d by Intel, AMD-Vi by AMD, TCE by IBM and SMMU by ARM. Make sure that your CPU supports one of these before you try to enable IOMMU.
First of all, IOMMU has to be initiated by UEFI/BIOS and information about it
has to be passed to the kernel in ACPI tables. For the end-user, that means that
you have to enter the UEFI/BIOS settings and set the IOMMU option to
For readers interested in developing firmware capable of IOMMU initialization,
the next post in this topic will describe the process of enabling IOMMU for PC
Engines apu2 in coreboot.
Enable IOMMU support by setting the correct kernel parameter depending on the type of CPU in use:
- For Intel CPUs (VT-d) set
- For AMD CPUs (AMD-Vi) set
- Additionally if you interested in PCIe passthrough set
After rebooting, you can check dmesg output to confirm that IOMMU is enabled
On Intel platforms:
Output should look somewhat like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
[ 0.000000] ACPI: DMAR 0x00000000BDCB1CB0 0000B8 (v01 INTEL BDW 00000001 INTL 00000001) [ 0.000000] Intel-IOMMU: enabled [ 0.028543] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a [ 0.028831] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap d2008c20660462 ecap f010da [ 0.028924] IOAPIC id 8 under DRHD base 0xfed91000 IOMMU 1 [ 0.536201] DMAR: No ATSR found [ 0.536211] IOMMU 0 0xfed90000: using Queued invalidation [ 0.536228] IOMMU 1 0xfed91000: using Queued invalidation [ 0.536236] IOMMU: Setting RMRR: [ 0.536246] IOMMU: Setting identity map for device 0000:00:02.0 [0xbf000000 - 0xcf1fffff] [ 0.537511] IOMMU: Setting identity map for device 0000:00:14.0 [0xbdea8000 - 0xbdeb6fff] [ 0.537521] IOMMU: Setting identity map for device 0000:00:1a.0 [0xbdea8000 - 0xbdeb6fff] [ 0.537535] IOMMU: Setting identity map for device 0000:00:1d.0 [0xbdea8000 - 0xbdeb6fff] [ 0.537547] IOMMU: Prepare 0-16MiB unity mapping for LPC [ 0.537559] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff] [ 2.039251] [drm] DMAR active, disabling use of stolen memory
On AMD platforms:
Output should look somewhat like this:
1 2 3 4 5 6
[ 0.805831] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40 [ 0.805832] pci 0000:00:00.2: AMD-Vi: Extended features (0x4f77ef22294ada): [ 0.805834] AMD-Vi: Interrupt remapping enabled [ 0.805834] AMD-Vi: Virtual APIC enabled [ 0.806045] AMD-Vi: Lazy IO/TLB flushing enabled [ 4.535573] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <firstname.lastname@example.org>
One of the most interesting use cases of IOMMU is PCIe Passthrough. With the help of the IOMMU, it is possible to remap all DMA accesses and interrupts of a device to a guest virtual machine OS address space, by doing so, the host gives up complete control of the device to the guest OS. It implicates that the host doesn’t have to virtualize this device nor translate communication between it and the guest, this almost completely removes performance overhead and latency caused by such translations. For example, by performing a passthrough of GPU it is possible to play games on Windows as a guest OS with performance unnoticeably different compared to the bare metal Windows. Another frequent use case is passing through a network interface card to use its full performance on the guest OS. The obvious drawback would be the inability to use passed devices by the host or other guests. Some solution to this problem is a technology called SR-IOV. It allows different virtual machines in a virtual environment to share a single PCI Express hardware interface, though very few devices support SR-IOV. It is almost exclusively available in server-grade devices. It is worth noting that not always IOMMU is capable of isolating a single device. Depending on the PCIe tree configuration, some devices are inseparable. If it is a case, those devices will be placed by the kernel in a single IOMMU group. These groups will be the topic of one of a future post.
IOMMU is a useful device with many advantages. Among other things, It protects a system from DMA attacks and allows better isolation of virtual machines. Its main disadvantage is the latency of DMA transfers due to the need to check mapping and permission saved in main memory when the transfer occurs, but this latency can be largely alleviated by caching that information inside IOMMU. In summary, the use of IOMMU seems to be beneficial in almost every case.
If you think we can help in improving the security of your firmware or you
looking for someone who can boost your product by leveraging advanced features
of used hardware platform, feel free to book a call with us
or drop us email to
contact<at>3mdeb<dot>com. If you are interested in similar
content feel free to sign up to our newsletter