We continue our effort to enable IOMMU, and as a side effect I had to play with various technologies to build a reliable development environment based on RTE.
In this blog post I would like to present a semi-automated technique for debugging firmware, Xen, and the Linux kernel. The goal is to have a set of tools that help in enabling various features in a Debian-based dom0.
We would like to:
- update the Linux kernel, which is exposed over an HTTP server
- update the rootfs provided through NFS
I will use the following components:
- PC Engines apu2c
- RTE
- pxe-server - our dockerized HTTP and NFS server
- Xen 4.8
- Linux kernel 4.14.y
My workstation environment is QubesOS 4.0 with Debian stretch VMs, but it should not make any difference. I had to work around one obstacle related to our environment, which is behind a VPN, while I also wanted to access the outside world from my fw-dev VM. More information about that can be found here.
First, I assume that you have a working pxe-server and RTE connected to the apu2.
We will start with automating Linux kernel deployment, since this is crucial while debugging.
Initially this blog post was motivated by the coreboot development effort to enable IOMMU. This is the error I get with the 4.14.50 kernel and the mentioned coreboot patches:
|
|
Building kernel with Xen support for apu2
|
|
The config fetched from GitHub has a couple of features enabled that make it work as dom0, e.g. `CONFIG_XEN_DOM0`. An option worth enabling when debugging IOMMU support in the Linux kernel is Enable IOMMU debugging, aka `CONFIG_IOMMU_DEBUG`.
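Before building, it is worth double-checking that those options actually made it into `.config`. Below is a minimal sketch of such a check; the sample config written to `/tmp` is purely illustrative, and in practice you would point the check at the `.config` in your kernel tree:

```shell
#!/bin/sh
# Illustrative sample config; in a real tree check linux/.config instead.
cat > /tmp/sample.config <<'EOF'
CONFIG_XEN_DOM0=y
CONFIG_IOMMU_DEBUG=y
EOF

# Report whether a given option is enabled in the config file
check_opt() {
    if grep -q "^$1=y" /tmp/sample.config; then
        echo "$1 enabled"
    else
        echo "$1 MISSING"
    fi
}

check_opt CONFIG_XEN_DOM0      # -> CONFIG_XEN_DOM0 enabled
check_opt CONFIG_IOMMU_DEBUG   # -> CONFIG_IOMMU_DEBUG enabled
```

In a real kernel tree you can also enable missing options with the in-tree helper, e.g. `scripts/config --enable CONFIG_XEN_DOM0 --enable CONFIG_IOMMU_DEBUG` followed by `make olddefconfig`.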
Ansible for Linux kernel development
Typically, the manual procedure of deploying a new kernel and rootfs to pxe-server and NFS would look like this:
- Compile a new kernel as described above
- Update the rootfs using `*.deb` packages from point 1
- Update the kernel using `bzImage` from point 1
- Boot the new system over iPXE

The `*.deb` packages and `bzImage` have to be deployed to the NFS server and installed inside the rootfs, which typically means chroot. Installation with a system booted over NFS is much slower.
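The manual steps above can be sketched in shell. All paths here are assumptions (your rootfs and netboot locations will differ), and the script defaults to a dry run that only prints the commands:

```shell
#!/bin/sh
# Sketch of the manual deployment, not the actual playbook.
# ROOTFS and the netboot path are illustrative assumptions.
ROOTFS="${ROOTFS:-/srv/nfs/rootfs}"
DRYRUN="${DRYRUN:-1}"   # dry-run by default; run with DRYRUN= to execute

run() {
    # With DRYRUN set, print the command instead of executing it.
    if [ -n "$DRYRUN" ]; then echo "$@"; else "$@"; fi
}

# 1. Bind-mount the pseudo-filesystems dpkg/apt expect inside the chroot
for fs in proc sys dev; do
    run mount --bind "/$fs" "$ROOTFS/$fs"
done

# 2. Install the freshly built kernel packages inside the rootfs
run chroot "$ROOTFS" /bin/sh -c 'dpkg -i /root/linux-*.deb'

# 3. Clean up the bind mounts
for fs in proc sys dev; do
    run umount "$ROOTFS/$fs"
done

# 4. Expose the new kernel to iPXE clients
run cp arch/x86/boot/bzImage /var/netboot/kernels/vmlinuz-dev
```

Reviewing the dry-run output first is a cheap way to catch a wrong rootfs path before it bind-mounts or chroots into the wrong place.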
We assume that the server we are working with is dedicated to developers. In our infrastructure we have 2 VMs: one with the production pxe-server and one with the development pxe-server-dev. After exercising a configuration on pxe-server-dev we apply it to production.
Flat ansible playbook
I'm not familiar with Ansible design patterns, so for now I made it a flat playbook. The rough steps of the scripts below are:
- Copy all `*.deb` files mentioned on the command line to pxe-server
- Mount required subsystems into the Debian Dom0 rootfs
- Create a script to be executed in chroot (upgrade and kernel installation)
- Umount the subsystems from point 2
- Copy `bzImage` to `/var/netboot/kernels/vmlinuz-dev`
- Force a netboot update to a revision that has support for `*-dev` menu options
Things left out:
- automatic selection of the `*.deb` packages created by the build process
- cleanup of previous kernels in the rootfs
- modification of `menu.ipxe` - we currently rely on a branch in the netboot repository, which is not the best solution, because all modifications have to go through the repository
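For reference, a `*-dev` entry in `menu.ipxe` could look roughly like the snippet below. This is a hypothetical sketch, not the actual netboot repository content - the label name, kernel path, command line, and NFS export are all assumptions:

```
:debian-dev
kernel http://${next-server}/kernels/vmlinuz-dev console=ttyS0,115200 root=/dev/nfs nfsroot=${next-server}:/srv/nfs/rootfs ip=dhcp rw
boot
```

`${next-server}` is a standard iPXE setting pointing at the boot server, so the same entry works regardless of where pxe-server runs.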
My Xen rootfs looks like this:
|
|
Run the above code with a command similar to:
|
|
This command is convoluted and for sure needs simplification, but for now I haven't managed to figure out a better solution.
This script should update the rootfs and add the required kernel. Now we would like to test what we did with RTE.
Run Xen Linux dev with RTE
Internally we developed an extensive infrastructure that can leverage various features of RTE, for example:
- reserve the device under test so no one else will intercept test execution - this is great in a shared environment
- check the hardware configuration to see if it makes sense to run a given test
- automatically support all OSes exposed by pxe-server
To verify our new kernel we would like to use the last feature. The simplest `dev.robot` may look like this:
|
|
There are a couple of interesting things to explain here:
- we use `rtectrl-rest-api/rtectrl-rest-api.robot` - this library provides control over GPIO; it is quite easy to implement your own if you have RTE, since it is just interaction with `sysfs`
- `Run iPXE shell`, `iPXE menu`, and `iPXE boot entry` come from our iPXE library, which just parses the PC Engines apu2 serial output; it works only with pxe-server and firmware released by 3mdeb at pcengines.github.io
- `Serial root login Linux` is just a login prompt wrapper with the password as a parameter
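To illustrate why reimplementing the GPIO part is easy: the library ultimately just writes to `sysfs`. The sketch below mimics that interaction against a mock directory by default; the GPIO number is illustrative, and on a real RTE you would set `SYSFS=/sys/class/gpio` and use the line wired to the signal you care about:

```shell
#!/bin/sh
# Mock sysfs root by default so the sketch is safe to run anywhere;
# set SYSFS=/sys/class/gpio on a real RTE.
SYSFS="${SYSFS:-/tmp/mock-gpio}"
GPIO=199   # illustrative line number

# On real hardware the kernel creates gpio$GPIO/ after the export write;
# for the mock we create it ourselves.
mkdir -p "$SYSFS/gpio$GPIO"
echo "$GPIO" > "$SYSFS/export"
echo out > "$SYSFS/gpio$GPIO/direction"

# Drive the line high or low by writing to its value file
gpio_set() { echo "$1" > "$SYSFS/gpio$GPIO/value"; }

gpio_set 1   # e.g. assert a relay / power-button line
gpio_set 0   # release it
cat "$SYSFS/gpio$GPIO/value"   # -> 0
```

Wrapping exactly this kind of read/write in an HTTP handler is essentially what the REST API does.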
Also, to run the above test you need a modified Robot Framework, which you can find here.
If you are interested in RTE usage, please feel free to contact us. With RTE you can achieve the same goal using various other methods (without our RF scripts).
We plan to provide some working examples of RTE and Robot Framework during our workshop session at the Open Source Firmware Conference.
What does an RTE-supported development workflow look like?
Typically you work on your kernel modification and want to run it on hardware, so you point the above Ansible playbook to deploy the code to pxe-server.
You may ask: why use an external pxe-server and not just install everything locally? A local setup implies a couple of problems:
- the target hardware has to be connected to your local network
- every time you reboot your computer, you have additional steps to finish the setup
- you can start the container automatically, but it still consumes resources on your local machine which you may want for other purposes (e.g. compilation)
RTE is first about remote access and second about automation. Of course, RTE and pxe-server should always be behind a VPN.
Getting back to the workflow, it may look like this:
- build a custom kernel as described above - time highly depends on your hardware
- deploy the kernel to pxe-server - time: 1min15s
- run a test - e.g. booting Xen Linux dev over iPXE - RTE time: 1min40s
- rebuild firmware - assuming you use pce-fw-builder - RTE time: ~5min
- firmware flashing and verification - RTE time: 2min (depends on how many SPI blocks differ)
- firmware flashing without verification - RTE time: 27s
Please note that:
- rebuilding firmware is not just building coreboot, but putting together all components (memtest, SeaBIOS, sortbootorder, iPXE) to make sure we didn't mess something up
- `pce-fw-builder` performs `distclean` every time; we plan to change that so it optionally reuses cached repositories - please track this issue
- verification means booting over iPXE to the OS and checking whether the flashed version is the same as the version exposed by the provided binary
Then you can run `dev.robot` to see what the boot log looks like. In my case, as mentioned at the beginning, I initially wanted to get better logs from the kernel to continue investigating the repeating:
|
|
Summary
We strongly believe in automation in firmware and embedded systems development. We think there is not enough validation in the coming IoT era. Security requires reproducibility and validation. That is why we try to automate our workflows; it is time-consuming, but it leaves us with automation that helps streamline everyday work. We never know how many iterations a given debugging session will take, so why not automate it? Or even better, why not try Test Driven Bug Fixing?
If you think we can help in validating your firmware, or you are looking for someone who can boot your product by leveraging advanced features of the hardware platform you use, feel free to drop us an email at contact<at>3mdeb.com.