I gave the keynote speech at this year’s Linux Symposium. The talk was entitled ‘State of the Kernel’ and summarised the past year of kernel development. Attempting to summarise an entire year in the space of an hour is a daunting challenge, and it required over 70 hours of preparation to review each of the many proposals and discussions that have taken place. With that work done, however, I’d like to share a few of the things I have learned with readers.
New features galore
The past year has seen a lot of work on enterprise-level features. This isn’t too surprising, as several of the major vendors are working on new products based around the 2.6.32 kernel. Among the new bits were performance enhancements (it’s now possible to apply more fine-grained limits on I/O bandwidth to different running programs, thanks to the blkio cgroup I/O controller work), scalability improvements (it’s now possible to use over 4,096 CPUs) and a huge amount of progress in the virtualisation space. On that last point, there were too many developments to summarise fully, but they included an in-kernel virtualised network server to aid performance, and new support for detecting identical regions of memory in guest machines in order to share a single copy, which can reduce overall memory usage considerably.
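To make the memory-sharing idea concrete, here is a minimal sketch in Python. It is a toy model, not the kernel's implementation: the function names, the 4,096-byte page size and the use of SHA-256 digests to spot identical pages are all assumptions chosen for illustration.

```python
from hashlib import sha256

PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only


def share_identical_pages(guest_pages):
    """Keep one stored copy per distinct page content.

    guest_pages maps a guest name to a list of pages (bytes objects).
    Returns (store, mapping): store maps a content digest to the single
    kept copy, and mapping maps (guest, page_no) to that digest.
    """
    store = {}
    mapping = {}
    for guest, pages in guest_pages.items():
        for page_no, page in enumerate(pages):
            digest = sha256(page).hexdigest()
            # Only the first page with this content is stored;
            # later identical pages simply point at it.
            store.setdefault(digest, page)
            mapping[(guest, page_no)] = digest
    return store, mapping


def pages_saved(guest_pages):
    """Number of page copies that no longer need to be kept."""
    total = sum(len(pages) for pages in guest_pages.values())
    store, _ = share_identical_pages(guest_pages)
    return total - len(store)
```

Two guests that each hold a zero-filled page would end up pointing at a single stored copy. A real implementation must also handle a guest writing to a shared page, which this sketch ignores.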
Embedded systems have seen a lot of action too. Support was added for Intel’s Moorestown platform, which is x86-based but is not compatible with the PC standard (it is intended for use in low-power embedded gadgets). Thomas Gleixner aptly stated that this indicated “the arrival of the embedded nightmare to arch/x86”. There was also the addition of new – and very aggressive – power management features that allow for the selective shutdown of buses that are not in use in order to save power, and support for faster suspend and resume of the overall system through parallelisation.
Some new features have been unexpected, or unintuitive. Google proposed an ability to add ‘idle cycle injection’ as a means to temporarily handle data centre power deficiencies not through shutting down servers, but by adding a number of power-saving idle cycles proportional to the power savings needed. We also saw the revelation that ATA ‘TRIM’ support – used to tell disks (especially flash wear-levelling SSDs) that data blocks are not being used any more – might not be as good as advertised. Mark Lord found that many drive firmwares lie about TRIM support and perform terribly if issued a number of TRIM commands back-to-back, as happens in real-world use.
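Returning to idle cycle injection, the ‘proportional’ part can be illustrated with a little arithmetic. The sketch below assumes a deliberately simple linear model in which a server draws a fixed wattage when busy and a lower fixed wattage when idle; the function name, parameters and figures are hypothetical and are not taken from Google’s proposal.

```python
def idle_fraction_needed(busy_watts, idle_watts, watts_to_save):
    """Fraction of cycles to force idle on a fully busy server.

    Assumes average power falls linearly from busy_watts to idle_watts
    as the idle fraction rises from 0 to 1 (a toy model).
    """
    span = busy_watts - idle_watts
    if span <= 0:
        raise ValueError("busy_watts must exceed idle_watts")
    fraction = watts_to_save / span
    # A server cannot be less than 0% or more than 100% idle.
    return min(max(fraction, 0.0), 1.0)


# Example: a 300 W server that idles at 100 W must shed 50 W.
print(idle_fraction_needed(300, 100, 50))  # prints 0.25
```

Under this model, asking for more savings than the gap between busy and idle power simply pins the server at fully idle, which is the point at which shutting machines down becomes the only remaining option.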
Developers have benefited from increased acceptance of debugging tools, and we have seen the merging of in-kernel GDB support where it was once thought that this would never happen. Other tools, such as kmemleak and Mudflap, allow the kernel to detect leaked or use-after-free memory errors and issue liberal warnings to developers via system logs. Even the localmodconfig patches from Steven Rostedt were useful in allowing developers to easily build test kernels containing only code that applies to their machines (such kernels build many times faster, and developers do many builds each day).
Some features just didn’t make it, in spite of the best efforts of many. Among these were the Fanotify patches that have been developed by Eric Paris as a compromise to aid those ‘anti-malware’ vendors who wish to hook into the guts of the Linux kernel and do such things as on-demand file access scanning prior to opening files. This feature isn’t loved by many kernel developers – who often view viruses as a consequence of bad design in other operating systems – but Eric has done a good job of making the patches palatable. In my keynote, I pointed out that these are likely to get merged; it’s just going to take a little more time – perhaps 2.6.36, perhaps later.