
Shellshocked Linux kernel – the kernel column

Jon Masters informs us of the kernel’s role in the latest Shellshock security vulnerability, and summarises the work in the kernel community towards a final 3.17 release


Linux turned 23 years old a couple of months ago, just in time for the tail end of the 3.17 kernel development cycle. In fact, the fifth Release Candidate was almost released on Linux’s birthday, but Linus decided against that as he is “not an overly sentimental person, so screw that”. This has been a quieter cycle in some respects, and Linus considered making RC6 the final Release Candidate. In the end he did make “yes, another rc” (RC7), since “‘convenience’ [to his travel schedule] isn’t really part of the release criteria”. We will have a full summary of Linux 3.17 in the next issue.

Is the Linux kernel ‘shellshocked’?

A recent security vulnerability in the Bash (Bourne Again SHell) has had everyone talking about being ‘shellshocked’.

It’s strange to think of Bash making front-page headlines, but such is the world in which we live. When it’s not featured on the front page of the New York Times, Bash is a humble (yet almost universal) shell, most often experienced as the familiar Linux or Mac OS X command-line interface spawned when running a terminal emulation program (by clicking on the Terminal application icon). But Bash is also quite often the shell behind /bin/sh (which on many distributions is a symlink to the bash executable) and is thus reachable by remote users who ask a web server to run (for example) a CGI script.

The specific security vulnerability in Bash is pretty straightforward. A bug was inadvertently introduced into Bash (more than two decades ago) which causes it to misparse function declarations that are passed into it via the process environment. Like any other application, running instances of Bash have a Unix process environment which can be viewed or modified using the “env” command. The environment is simply a linear memory array of an arbitrary size sitting within the process’ overall address space. Its bounds are recorded in the kernel within the “mm” struct that forms a part of every process (called a “task” when seen within the kernel), and kernel code reaches them through an expression such as current->mm->env_start (which yields the address at which the environment of the currently running program begins). The kernel is involved in setting up the environment originally as part of the binfmt_elf code that sets up Executable and Linkable Format (ELF) executables, such as Bash.
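To make the idea of a “linear array” concrete, here is a minimal sketch, assuming a Linux system with /proc mounted. It starts a process with a known environment and prints the block the kernel set up for it, which is stored as NUL-terminated strings laid end to end. The name show_raw_environ is a hypothetical helper, not a standard tool.

```shell
# show_raw_environ is a hypothetical helper (assumes Linux with /proc mounted).
# It prints the environment block of a freshly exec'd process, one entry per line.
show_raw_environ() {
    # "env -i" starts from an empty environment and adds only the NAME=value
    # pairs given as arguments; cat then reads its own block back from the
    # kernel. The entries are NUL-separated, so tr turns each NUL into a newline.
    env -i "$@" "$(command -v cat)" /proc/self/environ | tr '\000' '\n'
}

show_raw_environ HOME=/home/jcm LANG=C
```

Because the child process is started with exactly two variables, the output contains exactly those two entries, in the order they were given.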

The environment of a process (between env_start and env_end) is visible to userspace through a global variable (extern char **environ), and through the file /proc/self/environ, where “self” can be replaced with another process ID if you have suitable capabilities (eg root) to read the environment of another process. The “env” utility (part of the “coreutils” Linux package on most distributions) simply reads an array of strings from the global Unix environ variable and displays them. These strings have a declarative form, such as can be seen in “HOME=/home/jcm”, and are known as “environment variables”. Various tools, such as the Bash shell, will also read these entries on start up and set various internal state based upon these variables. Environment variables can include function declarations (such as “x=() { :;}”, which defines a function named “x” that does nothing) and these can be exported to subsequent commands through use of the “export -f x” command. Normally, this command will only cause the environment to gain an “x=() { :;}” entry, but there are other ways to set the process environment besides using “export”, such as using the “env” command directly. This leads us to the now famous vulnerability test code:

env x='() { :;}; echo vulnerable' bash -c :

The second Bash process is passed in a function named “x” but erroneously executes the “echo vulnerable” command tacked onto the end in the course of parsing the “x” function environment variable. In fact, if you were to remove the “-c :” from the end of that line and allow the second shell to remain running, you would notice that the “env” command doesn’t show anything untoward (because it doesn’t correctly parse the content either). However, running “cat /proc/$$/environ” to display the environment as directly reported by the kernel will show the “echo vulnerable” addendum. Newer fixed versions of Bash improve parsing and also add new “BASH_FUNC_” prefixes to the names of exported functions. But the kernel continues to do its thing, having never been more than a spectator in the process of handing the environment from one task to another.
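The test line can be wrapped in a small function so that it reports a result either way. This is a minimal sketch rather than an authoritative scanner: check_shellshock is a hypothetical name, and it only probes the original form of the bug described here. The final command assumes a Bash recent enough to use the “BASH_FUNC_” naming and simply lists how an exported function then appears in the environment.

```shell
# check_shellshock is a hypothetical helper that wraps the well-known test line.
# Usage: check_shellshock [path-to-shell]   (defaults to "bash")
check_shellshock() {
    shell="${1:-bash}"
    # Pass a function definition with a trailing command through the
    # environment; a vulnerable Bash runs the trailing "echo" while importing x.
    out="$(env x='() { :;}; echo vulnerable' "$shell" -c : 2>/dev/null)"
    if [ "$out" = "vulnerable" ]; then
        echo "vulnerable"
    else
        echo "not vulnerable"
    fi
}

check_shellshock bash

# Assumption: a patched Bash. Export a do-nothing function named x and show
# the environment entry that child processes would receive for it.
bash -c 'x() { :; }; export -f x; env' 2>/dev/null | grep '^BASH_FUNC_' || true
```

Pointing the function at another path, such as “check_shellshock /bin/sh”, tests whichever shell sits behind that name.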

Ongoing development

An interesting thread around the topic of non-blocking file reads sprang up under the initial subject heading “read()/readv() only from page cache”. The Linux page cache is an in-memory data structure containing data pages (fragments of disc files, also known as “inodes” in that context) representing those portions of files that have been previously read into memory and for which there is (at least temporarily) sufficient spare RAM to keep a cached copy on hand in case of imminent repeat access. The idea of reading file data only from the page cache is that file reads can often block for very long periods of time (milliseconds): more than enough time to ruin any kind of determinism in a low latency or real-time system. Typically, special purpose real-time systems take precautions around data storage for these reasons. But there are those who would like to use a general purpose file system and simply leverage the presence of the page cache, allowing for per request control over whether a read operation will block, should data not be previously cached therein.

New patches for POWER8 architecture support were posted, including selection of the CPU type itself, as well as support for controlling attributes of the Transactional Memory hardware support when in use by a particular process undergoing debug through the ptrace API. Transactional Memory is a pretty nifty feature that several architectures are growing support for (recent x86 chips ship with this feature built in). It allows a sequence of memory operations to be treated as a unit that will either be completed entirely (and atomically) or not take place at all. It’s typically used to implement various high performance critical sections in place of conventional locking (hence Intel use the term Transactional Synchronization Extensions or ‘TSX’ in their implementation), and is usually achieved by restricting the size of an overall transaction to a multiple of the L1 cache line width and using cache organisational tricks.
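Whether a given x86 machine advertises TSX can be checked from the shell. The sketch below assumes the x86 convention of listing the “rtm” and “hle” feature flags on the “flags” line of /proc/cpuinfo; other architectures, POWER8 included, report their features differently, so a negative answer there means little. The name cpu_has_flag is a hypothetical helper.

```shell
# cpu_has_flag is a hypothetical helper: it succeeds if the named feature flag
# appears on the first "flags" line of a cpuinfo-style file.
# Usage: cpu_has_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
cpu_has_flag() {
    flag="$1"
    file="${2:-/proc/cpuinfo}"
    # Split the flags line into one word per line, then look for an exact match.
    grep -m1 '^flags' "$file" 2>/dev/null | tr ' \t' '\n\n' | grep -qx "$flag"
}

if cpu_has_flag rtm; then
    echo "RTM (part of TSX) is advertised by this CPU"
else
    echo "RTM (part of TSX) is not advertised here"
fi
```

The optional file argument exists so the function can be tried against a saved copy of another machine’s cpuinfo.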

The community-run Linux Plumbers Conference is seeking organisers for its 2015 event. The LPC is co-located with Linux Foundation events such as LinuxCon, and is under the overall technical direction of the Linux Foundation Technical Advisory Board (TAB). The new TAB was announced this month following fresh elections held at this year’s LinuxCon in Chicago (full disclosure: this author was nominated to stand in absentia) and it is seeking assistance in organising the event, to be held in Seattle. The new TAB sees the departure of several long-term members and the addition of a few fresh faces including the new chair, Grant Likely. It’s particularly gratifying to see that the ten-person TAB now has two female members (with the addition of Kristen Accardi, joining Sarah Sharp). This is of course far from enough, but it is at least a welcome trend in a positive direction.

Finally this month, a patch was recently posted that enables support for the upcoming GCC 5. For those who find the notion of experimenting with new and novel compilers interesting, this is definitely something to take a look at.
