By Jake Edge
August 19, 2009
Creating a sandbox—a safe area in which to run untrusted
code—is a difficult problem. The successful sandbox implementations
tend to come with completely new languages (e.g. Java) that are
specifically designed to support that functionality. Trying to sandbox C
code is a much more difficult task, but one that the Google Chrome web browser team has
been working on.
The basic idea is to restrict the WebKit-based renderer—along with
the various image and other format libraries that are linked to
it—so that browser-based vulnerabilities are unable to affect the
system as a whole. A successful sandbox for the browser would eliminate
a whole class of problems that plague Firefox and other browsers that
require
frequent, critical security updates. Essentially, the browser would protect
users from bugs in the rendering of maliciously-crafted web pages, so that
they could not lead to system or user data compromise.
The Chrome browser and its free software counterpart, Chromium, are designed around
the idea of a separate process for each tab, both for robustness and
security. A misbehaving web page can only affect the process controlling
that particular tab, so it won't bring the entire browser down if it causes
the process to crash. In addition, these processes are considered to be
"untrusted", in
that they could have been compromised by some web page exploiting a bug in
the renderer. The sandbox scheme works by severely restricting the actions
that
untrusted processes can take directly.
At some level, Linux already has a boundary that isolates programs from the
underlying system: system calls. A program that does no system calls
should not be able to affect anything else, at least permanently. But it
is a trivial program indeed that does not need to call on some system
services. A little-known kernel feature, seccomp, allows a process to
restrict itself to a very small subset of system calls—just read(),
write(), sigreturn(), and exit()—with the kernel aborting
a process that attempts to call any other. That is the starting point for
the Chromium sandbox.
But, there are other system calls that the browser might need to make. For
one thing, memory allocation might require the brk() system call.
Also, the renderer needs to be able to share memory with the X server for
drawing. And so on. Any additional system calls, beyond the four that
seccomp allows, have to be handled differently.
A proposed change to seccomp
that would allow finer-grained control over which system calls were allowed
didn't get very far. In any case, that wasn't a near-term solution, so
Markus Gutschke of the Chrome team went in another direction. By splitting
the renderer process into trusted and untrusted threads, some system
calls could be allowed for the untrusted thread by making the equivalent of
a remote procedure call
(RPC) to the trusted thread. The trusted thread could then verify that
the system call, and its arguments, were reasonable and, if so, perform the
requested action.
Chrome team member Adam Langley describes it this way:
So that's what we do: each untrusted thread has a trusted helper thread
running in the same process. This certainly presents a fairly hostile
environment for the trusted code to run in. For one, it can only trust its
CPU registers - all memory must be assumed to be hostile. Since C code will
spill to the stack when needed and may pass arguments on the stack, all the
code for the trusted thread has to [be] carefully written in assembly.
The trusted thread can receive requests to make system calls from the
untrusted thread over a socket pair, validate the system call number and
perform them on its behalf. We can stop the untrusted thread from breaking
out by only using CPU registers and by refusing to let the untrusted code
manipulate the VM in unsafe ways with mmap, mprotect etc.
There are still problems with that approach, however. For one thing, the
renderer code is large, with many different system calls scattered
throughout. Turning each of those into an RPC is possible, but the
resulting changes would then have to
be maintained by the Chromium developers going forward. The upstream
projects (WebKit et al.) would
not be terribly interested in those changes, so each new revision from
upstream would need to be patched and then checked for new system calls.
Another approach might be to use LD_PRELOAD trickery to intercept
the calls in glibc. That has its own set of problems as Langley points
out: "we could try and intercept at dynamic linking time, assuming
that all the system calls are via glibc. Even if that were true, glibc's
functions make system calls directly, so we would have to patch at the
level of functions like printf rather than write."
So, a method of finding and patching the system calls at runtime was
devised. It uses a disassembler on the executable code, finds each system
call and turns it into an RPC to the trusted thread. Correctly parsing x86
machine code is notoriously difficult, but it doesn't have to be perfect.
Because the untrusted thread runs in seccomp mode, any system call that is
missed will not lead to a security breach, as the kernel will abort the
thread if it attempts any but the trusted four system calls. As Langley
puts it:
But we don't need a perfect disassembler so long as it works in practice
for the code that we have. It turns out that a simple disassembler does the
job perfectly well, with only a very few corner cases.
The last piece of the puzzle is handling time-of-check-to-time-of-use race
conditions. System call arguments that are passed in memory—via
pointers, or for system calls with too many arguments to fit in
registers—can be changed by the presumably subverted untrusted
thread between the time they are checked for validity and the time they are used.
To handle that, a trusted process, which is shared between all of the
renderers, is created to check system calls that cannot be verified within
the address space of the untrusted renderer.
The trusted process shares a few pages of memory with each trusted thread,
which are read-only to the trusted thread, and read-write for the trusted
process. System calls that cannot be handled by the trusted thread, either
because some of the arguments live in memory, or because the verification
process is too complex to be reasonably done in assembly code, are handed off to
the trusted process. The arguments are copied by the trusted process into
its address space, so they are immune to changes from the untrusted code.
While the current implementation is for x86 and x86-64—though there are
still a few issues to be worked out with the V8 Javascript engine on
x86-64—there is a clear path for other architectures. Adapting or writing a
disassembler and writing the assembly language trusted thread are the two
pieces needed to support each additional architecture. According to
Langley:
The former is probably
easier on many other architectures because they are likely to be more
RISC like. The latter takes some work, but it's a coding problem only
at this point.
There are some potential pitfalls in this sandbox mechanism. Bugs in the
implementation of the trusted pieces—either coding errors or mistakes
made in determining which system calls and arguments are "safe"—could
certainly lead to problems. Currently, deciding which calls to allow is
done on an ad hoc basis: running the renderer, seeing which calls
it makes, and deciding which are reasonable. The outcome of those
decisions is then codified in syscall_table.c.
One additional, important area that is not covered by the sandbox is
plugins such as Flash. Restricting what plugins can do does not fit well with
what users expect, which makes plugins a major vector for attack. Langley said
that the plugin support on Linux is relatively new, but "our experience
on Windows is that, in order for Flash to do all the things that
various sites expect it to be able to do, the sandbox has to be so
full of holes that it's rather useless". He is currently looking at
SELinux as a way to potentially restrict plugins, but, for now, they are
wide open.
This is a rather—some would say overly—complex scheme. It is
still in the experimental stage, so changes are likely, but it does show
one way to protect browser users from bugs in the HTML renderer that might
lead to system or data compromise. It certainly doesn't solve all of the
web's security problems, but could, over time, largely eliminate a whole
class of attacks. It is definitely a project worth keeping an eye on.
[ Many thanks to Adam Langley, whose document was used as a basis for this
article, and who patiently answered questions from the author. ]