This article originally appeared in issue 85 of Linux User & Developer.
The latest kernel release brought with it the usual round of regression reports, compiled by the ever-vigilant Rafael J Wysocki from the existing bug reports. These show a slightly worrying trend for regressions to go unfixed for some time between releases, although many of the scarier issues have been fixed. Others took the opportunity of the release to provide updates on their own kernel source trees based upon the 2.6.33 kernel. This included Thomas Gleixner, who let us know that a 2.6.33 series Real Time kernel patchset would be available soon for those interested in developing the RT features. He had been delayed in getting to this because he was fixing a somewhat ugly problem with unfortunate POSIX process priority semantics in the Real Time code.
Debugging the kernel with KGDB
Like many open source developers, I spend a lot of my time playing with virtual machines (VMs, also known as ‘guests’) these days. The KVM (Kernel Virtual Machine) support within the Linux kernel makes it easy to take advantage of the hardware virtualisation support in modern systems and run those virtual machines at near-native speed. On the management side, I use libvirt (libvirt.org), which has been a part of Fedora and other distributions for some time. All was well until a while back, when the host system I use to run my virtual guests became increasingly unstable. Sometimes it would fall over several times a week; eventually, more than once in a single day.
When a Linux system crashes (we call this a ‘panic’), it typically outputs some useful diagnostic information on the console. In many cases, the console is the display screen that you are using, but a graphical environment such as the X Window System often means that the user never sees the panic messages. Kernel developers typically use a serial console – a cable hooked up between serial ports on two machines – to capture this output, and we use serial because it almost always works (even when the system has almost completely gone out to lunch). Of course, as often happens, I didn’t have a serial console configured on the system in question (I have since bought a fancy – but now very cheap on eBay – Linux-based Cyclades terminal server so that all of my machines have such consoles), and had to wait to reproduce the problem several times before I could capture the panic output and post it to the kernel mailing list as a ‘PROBLEM’.
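Once console output is being captured on a second machine, picking the panic out of a long log can be automated. What follows is only a minimal sketch, not anything from the setup described above: it assumes the captured output arrives as lines of text, and it relies on the "Kernel panic - not syncing" line that the kernel prints when it panics. The function name `panic_reports` and its `before` and `after` parameters are hypothetical, chosen purely for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, List

# Text the kernel prints at the start of its panic message.
PANIC_MARKER = "Kernel panic - not syncing"


def panic_reports(lines: Iterable[str], before: int = 20,
                  after: int = 20) -> Iterator[List[str]]:
    """Yield each panic line together with some surrounding context.

    `lines` can be any iterable of text lines, such as a saved console
    log or a file object opened on the capturing machine (assumption).
    `before` and `after` are hypothetical knobs for how much context
    to keep on either side of the panic line.
    """
    recent = deque(maxlen=before)  # rolling window of preceding lines
    stream = iter(lines)
    for raw in stream:
        line = raw.rstrip("\r\n")
        if PANIC_MARKER in line:
            report = list(recent) + [line]
            # Collect up to `after` following lines, if any arrive.
            for _ in range(after):
                following = next(stream, None)
                if following is None:
                    break
                report.append(following.rstrip("\r\n"))
            yield report
            recent.clear()
        else:
            recent.append(line)
```

The lines preceding the panic are often the most valuable part of a report, since any earlier oops or backtrace appears there, which is why the sketch keeps a rolling window rather than only the panic line itself. Fed a saved log, for example `panic_reports(open("console.log"))` with a hypothetical file name, it yields one list of lines per panic, ready to paste into a report.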