We continue our effort to enable IOMMU, and as a side effect I have to play with various technologies to build a reliable development environment based on RTE.
In this blog post I would like to present a semi-automated technique for debugging firmware, Xen and the Linux kernel. The goal is to have a set of tools that help in enabling various features in a Debian-based dom0.
We would like to:
- update the Linux kernel, which is exposed over an HTTP server
- update the rootfs provided through NFS
I will use the following components:
My workstation environment is QubesOS 4.0 with Debian stretch VMs, but it should not make any difference. I had to work around one obstacle related to our environment, which is behind a VPN, because I also wanted to access the outside world from my fw-dev VM. More information about that can be found here
First, I assume that you have a working pxe-server and an RTE connected to apu2.
We will start with automating Linux kernel deployment, since this is crucial while debugging.
Initially this blog post was motivated by the coreboot development effort to enable IOMMU, and by the error I get with the 4.14.50 kernel and the mentioned coreboot patches:
Building kernel with Xen support for apu2
The config fetched from GitHub has a couple of features enabled that make it work as
Options that are worth enabling when debugging IOMMU support in the Linux kernel are
Enable IOMMU debugging aka
Ansible for Linux kernel development
Typically, the manual procedure for deploying a new kernel and rootfs to pxe-server and NFS looks like this:
- Compile the new kernel as described above
- Update the rootfs using *.deb packages from point 1
- Update the kernel using bzImage from point 1
- Boot the new system over iPXE
The *.deb packages and bzImage have to be deployed to the NFS server and installed inside the rootfs, which typically means using chroot. Installation on a system booted over NFS is much slower.
We assume that the server we are working with is dedicated to developers. In our infrastructure we have 2 VMs: one with the production pxe-server and one with pxe-server-dev. After exercising a configuration on pxe-server-dev, we apply it to production.
Flat ansible playbook
I’m not familiar with Ansible design patterns, so for now I made it a flat playbook. The rough steps performed by the scripts below are:
- Copy all *.deb files mentioned on the command line to pxe-server
- Mount the required subsystems into the Debian Dom0 rootfs
- Create a script that will be executed in chroot (upgrade and kernel installation)
- Unmount the subsystems from point 2
- Force an update of netboot to the revision that has support for
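The mount, chroot and unmount steps above can be sketched in Python as a dry-run plan. This is only an illustration under assumptions, not the playbook itself: the list of subsystems, the `/tmp` locations and the function names `upgrade_script` and `chroot_plan` are hypothetical, and the functions only build command strings without executing anything.

```python
import shlex
from pathlib import PurePosixPath

# Assumption: the post does not list the mounted subsystems; these are typical for a chroot.
SUBSYSTEMS = ("proc", "sys", "dev")


def upgrade_script(debs, deb_dir="/tmp"):
    """Build the script run inside the chroot: upgrade, then install the kernel packages."""
    paths = " ".join(shlex.quote(f"{deb_dir}/{name}") for name in debs)
    return "\n".join([
        "#!/bin/sh",
        "set -e",
        "apt-get update",
        "apt-get -y upgrade",
        f"dpkg -i {paths}",
    ]) + "\n"


def chroot_plan(rootfs, script="/tmp/upgrade.sh"):
    """Return the shell commands for the mount, chroot and unmount steps, in order."""
    root = PurePosixPath(rootfs)
    mounts = [f"mount --bind /{s} {shlex.quote(str(root / s))}" for s in SUBSYSTEMS]
    run = [f"chroot {shlex.quote(str(root))} /bin/sh {shlex.quote(script)}"]
    # Unmount in reverse order so that later mounts are released first.
    umounts = [f"umount {shlex.quote(str(root / s))}" for s in reversed(SUBSYSTEMS)]
    return mounts + run + umounts
```

Printing the plan for a rootfs path before running anything makes it easy to review what would be mounted and where, which is useful when the same steps are later moved into Ansible tasks.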
Things left out:
- automatic selection of *.deb packages that were created by the build process
- cleanup of previous kernels in the rootfs
- modification of menu.ipxe - we now rely on a branch in the netboot repository; this is not the best solution, because all modifications go through the repository
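The first two items left out above could be approached roughly as follows. This is a sketch under assumptions: it presumes the package file names contain the kernel version string and that kernel images in the rootfs follow the Debian `vmlinuz-<version>` naming; the helper names `select_debs` and `stale_kernels` are hypothetical.

```python
from pathlib import Path


def select_debs(build_dir, kernel_version):
    """Return the *.deb files in build_dir whose name contains kernel_version."""
    return sorted(
        p for p in Path(build_dir).glob("*.deb") if kernel_version in p.name
    )


def stale_kernels(boot_dir, keep_version):
    """Return vmlinuz-* images in boot_dir that do not belong to keep_version."""
    keep = f"vmlinuz-{keep_version}"
    return sorted(p for p in Path(boot_dir).glob("vmlinuz-*") if p.name != keep)
```

Both helpers only report paths; deciding whether to copy or remove them is left to the caller, so a mistake in version matching cannot delete anything by itself.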
My Xen rootfs looks like this:
Run the above code with a command similar to:
This command is convoluted and surely needs simplification, but so far I have not managed to figure out a better solution.
This script should update the rootfs and add the required kernel. Now we would like to test what we did with RTE.
Run Xen Linux dev with RTE
Internally we developed an extensive infrastructure that can leverage various features of RTE, for example:
- reserve the device under test so no one else can interfere with test execution - this is great in a shared environment
- check the hardware configuration to decide whether it makes sense to run a given test
- automatically support all OSes exposed by
To verify our new kernel we would like to use the last feature. The simplest test looks like this:
There are a couple of interesting things to explain here:
- we use rtectrl-rest-api/rtectrl-rest-api.robot - this library provides control over GPIO; it is quite easy to implement your own if you have RTE, since it is just interaction with
- Run iPXE shell and iPXE boot entry come from our iPXE library, which just parses the PC Engines apu2 serial output; it works only with pxe-server and firmware released by 3mdeb at pcengines.github.io.
- Serial root login Linux is just a login prompt wrapper with the password as a parameter
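To illustrate what a login prompt wrapper with the password as a parameter can look like, here is a minimal Python sketch. It is not the Robot Framework keyword used in our tests. It assumes a port object with `read(size)` and `write(data)` methods, where `read` returns empty bytes on timeout (pyserial's `serial.Serial` behaves this way when a timeout is set), and it assumes the usual `login: ` and `Password: ` prompts. The names `wait_for` and `serial_root_login` are hypothetical.

```python
def wait_for(port, token):
    """Read byte by byte until the received data ends with token."""
    seen = b""
    while not seen.endswith(token):
        chunk = port.read(1)
        if not chunk:  # an empty read means the port timed out
            raise TimeoutError(f"never saw {token!r} on the serial port")
        seen += chunk
    return seen


def serial_root_login(port, password):
    """Log in as root over a serial console once the login prompt appears."""
    wait_for(port, b"login: ")
    port.write(b"root\n")
    wait_for(port, b"Password: ")
    port.write(password.encode() + b"\n")
```

Reading one byte at a time is slow but keeps the matching logic trivial, which is usually acceptable at serial console speeds.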
Also, to run the above test you need a modified Robot Framework, which you can find here.
If you are interested in using RTE, please feel free to contact us. With RTE you can achieve the same goal using various other methods (without our RF scripts).
What does an RTE-supported development workflow look like?
Typically you work on your kernel modification and want to run it on hardware, so you point the above Ansible playbook at pxe-server to deploy the code.
You may ask: why use an external pxe-server and not just install everything locally? A local setup brings a couple of problems:
- the target hardware has to be connected to your local network
- every time you reboot your computer you have some additional steps to finish the setup
- you can start a container automatically, but it still consumes resources on your local machine which you may want to use for other purposes (e.g. compilation)
RTE is first about remote and second about automation. Of course RTE pxe-server should always be behind a VPN.
Getting back to the workflow, it may look like this:
- build a custom kernel as described above - time highly depends on your hardware
- deploy the kernel to pxe-server - time: 1min15s
- run a test, e.g. booting Xen Linux dev over iPXE - RTE time: 1min40s
- rebuild firmware, assuming you use pce-fw-builder - RTE time: ~5min
- firmware flashing and verification - RTE time: 2min (depends on how many SPI blocks differ)
- firmware flashing without verification - RTE time: 27s
Please note that:
- rebuilding firmware is not just building coreboot, but putting together all components (memtest, SeaBIOS, sortbootorder, iPXE) to make sure we didn’t distclean every time; we plan to change that so that it can optionally reuse cached repositories, please track this issue
- verification means booting over iPXE into the OS and checking whether the flashed version is the same as the version exposed by the provided binary
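To get a feeling for the cost of one iteration, the timings listed above can simply be added up. The sketch below uses only the numbers from the workflow list. The kernel build is left out because it depends on your hardware, and the firmware rebuild is the approximate 5 minutes mentioned above.

```python
from datetime import timedelta

# Step durations taken from the workflow list; the firmware rebuild is approximate.
STEPS = {
    "deploy kernel to pxe-server": timedelta(minutes=1, seconds=15),
    "boot Xen Linux dev over iPXE": timedelta(minutes=1, seconds=40),
    "rebuild firmware": timedelta(minutes=5),
    "flash firmware with verification": timedelta(minutes=2),
}


def total(steps, names=None):
    """Sum the durations of the selected steps (all steps when names is None)."""
    selected = steps if names is None else {n: steps[n] for n in names}
    return sum(selected.values(), timedelta())


kernel_only = total(
    STEPS, ["deploy kernel to pxe-server", "boot Xen Linux dev over iPXE"]
)
full_cycle = total(STEPS)
print(kernel_only, full_cycle)
```

With these numbers a kernel-only iteration costs a bit under 3 minutes on top of the build, while an iteration that also rebuilds and flashes firmware takes roughly 10 minutes.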
Then you can run dev.robot to see what the boot log looks like. In my case, mentioned at the beginning, I initially wanted to get better logs from the kernel to continue investigating the repeating:
We strongly believe in automation in firmware and embedded systems development. We think there is not enough validation in the coming IoT era. Security requires reproducibility and validation. Because of that we try to automate our workflows, which is time consuming, but leaves us with automation that is always helpful in streamlining everyday work. We never know how many iterations a given debugging session will take, so why not automate it? Or even better, why not try Test Driven Bug Fixing?
If you think we can help with validation of your firmware, or you are looking for someone who can boot your product by leveraging advanced features of the hardware platform used, feel free to drop us an email to