Sandboxes
Emilua provides support for creating actors in isolated processes using Capsicum, FreeBSD jails, Seccomp, Linux namespaces or Landlock. The idea is to prevent potentially exploitable code from accessing resources beyond what has been explicitly handed to them. That’s the basis for capability-based security systems, and it maps pretty well to APIs implementing the actor model such as Emilua.
Even modern operating systems are still rooted in an age when we did not know how to partition computer resources to match user needs while keeping the design focused on practical, deliberate security. Several solutions are stacked together to fill this gap, and they work for most applications, but not for all of them.
Consider the web browser. There is an active movement that tries to push for a future where only the web browser exists and users will handle all of their communications, store & share their photos, book hotels & tickets, check their medical history, manage their banking accounts, and much more… all without ever leaving the browser. In such a scenario, any protection the OS offers to shield programs from each other is rendered useless! Only a single program exists. If an attacker exploits the right vulnerability, all of the user’s data will be stolen. There is no real compartmentalisation.
The browser is part of a special class of programs. The browser is a shell. A shell is any interface that acts as a layer between the user and the world. The web browser is the shell for the world of the web. Web browser or not, any shell will face similar problems and has to be consciously designed to safely isolate contexts that distrust each other. The Emilua team is not aware of anything better than FreeBSD’s Capsicum to do just this. In the absence of Capsicum, we have Linux Landlock, which can be used to build something close. Browsers actually use Linux namespaces, which are older.
The API
Compartmentalised application development is, of necessity, distributed application development, with software components running in different processes and communicating via message passing.
Robert N. M. Watson, Jonathan Anderson, Ben Laurie, and Kris Kennaway
Emilua’s API to spawn an actor is within reach of a simple function call:
local my_channel = spawn_vm(module)
Check the manual elsewhere to understand the details. As for sandboxes, the idea is to spawn an actor where no system resources are available (e.g. the filesystem is mostly empty, no network interfaces are available, no PIDs from other processes can be seen, …).
Consider the hypothetical sandbox class:
local mysandbox1 = sandbox.new()
local my_channel = spawn_vm(mysandbox1:context(module))
mysandbox1:handshake()
That would be the ideal we’re pursuing: at most two extra lines of code in your application, with all the complexity of creating sandboxes taken care of by specialized teams of security experts. The Capsicum paper[1] released in 2010 analysed and compared different sandboxing technologies and showed some interesting figures. Consider the following figure that we reproduce here:
Operating system | Model | Line count | Description |
---|---|---|---|
Windows | ACLs | 22350 | Windows ACLs and SIDs |
Linux | | 605 | |
Mac OS X | Seatbelt | 560 | Path-based MAC sandbox |
Linux | SELinux | 200 | Restricted sandbox type enforcement domain |
Linux | | 11301 | |
FreeBSD | Capsicum | 100 | Capsicum sandboxing using |
Do notice that line count is not the only metric of interest. The original paper includes a very interesting discussion detailing the applicability, risks, and levels of security offered by each approach. Just a few years after the paper was released, user namespaces were merged into Linux, making yet another sandboxing option available. Fast-forward a few more years and we also have Linux Landlock, which is even better than Linux namespaces. Within this discussion, we can discard most of the approaches (DAC-based, MAC-based, or too intrusive to abstract away as a reusable component) as inadequate to our endeavour.
Out of them, Capsicum wins hands down. It’s just as capable of isolating parts of an application, but with much less room for error (for the Chromium patchset, it was just 100 lines of extra C code after all). Unfortunately, Capsicum is not available in every modern OS.
Do keep in mind that this is code written by experts in their own fields, and their salary is nothing less than what Google can afford. 11301 lines of code written by a team of Google engineers for a lifetime project such as Google Chromium is not an investment that every project can afford. Sandboxing technology needs to be democratized so that even small projects can afford it. That’s why it’s important to use sound models that are easy to analyse, such as capability-based security systems. That’s why it’s important to offer an API that only adds two extra lines of code to your application. That’s the only way to democratize access to such technology.
Rust programmers' vision of security is to rewrite the world in Rust, a rather unfeasible undertaking, and a huge waste of resources. In a similar fashion, Deno was released to exploit V8 as the basis for its sandboxing features (now they expect the world to be rewritten in TypeScript). The heart of Emilua’s sandboxing relies on technologies that can isolate any code (e.g. C libraries to parse media streams).
Back to our API, the hypothetical sandbox class that we showed earlier will have to be some library that abstracts the differences between each sandbox technology in the different platforms. The API that Emilua actually exposes as of this release abstracts all of the semantics related to actor messaging, work/lifetime accounting, process reaping, DoS protection, serialization, lots of Linux namespaces details (e.g. PID1), and much more, but it still expects you to actually initialize the sandbox.
The init.script
Every process carries associated credentials that enable operations on system-wide addressable objects such as filesystem objects and sockets. We set up a sandbox by disabling this ambient authority so the address space itself becomes inaccessible. Sandboxed code should thus run only after this setup has completed successfully. The proper hook to perform this setup is init.script, which runs right after the process is created.
After the sandboxed actor is up, it can receive access to new resources through its inbox. If the sandboxed code is exploited, then only the objects it has access to are rendered vulnerable (the damage is thus contained in its compartment).
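The idea of removing ambient authority and then handing over resources explicitly can be illustrated with a rough in-process analogy in plain Lua. This is only a sketch and not how Emilua builds sandboxes: real compartments are enforced by the kernel (Capsicum, Landlock, namespaces), while a Lua environment table is not a security boundary. The names `run_confined`, `untrusted_source` and `log` are hypothetical and exist only for this illustration.

```lua
-- In-process analogy only: the confined chunk sees no globals (no `io`,
-- no `os`) except the capabilities explicitly handed to it.
local untrusted_source = [[
    -- Probe for ambient authority, then use the one capability received.
    local ambient = (io ~= nil) or (os ~= nil)
    log("hello from the compartment")
    return ambient
]]

-- Hypothetical helper: run `source` in an environment that contains
-- nothing but the entries of `capabilities`.
local function run_confined(source, capabilities)
    local env = {}
    for name, cap in pairs(capabilities) do
        env[name] = cap
    end
    local chunk, err
    if setfenv then
        -- Lua 5.1 / LuaJIT
        chunk, err = loadstring(source, "confined")
        if chunk then setfenv(chunk, env) end
    else
        -- Lua 5.2 and later
        chunk, err = load(source, "confined", "t", env)
    end
    if not chunk then error(err) end
    return chunk()
end

local messages = {}
local saw_ambient = run_confined(untrusted_source, {
    log = function(msg) messages[#messages + 1] = msg end,
})
print(saw_ambient, messages[1])
```

If the confined chunk misbehaves, the only thing it can touch is the `log` function it was given, which mirrors how damage stays contained to the objects a sandboxed actor has received.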
Landlock (Linux)
local init_script = [[
    -- Handle (and thereby deny by default) every filesystem access right.
    local rules = C.landlock_create_ruleset{ handled_access_fs = {
        "execute", "write_file", "read_file", "read_dir", "remove_dir",
        "remove_file", "make_char", "make_dir", "make_reg", "make_sock",
        "make_fifo", "make_block", "make_sym", "refer", "truncate" } }
    -- Landlock requires no_new_privs before the ruleset can be enforced.
    set_no_new_privs()
    C.landlock_restrict_self(rules)
]]

spawn_vm{
    subprocess = {
        init = { script = init_script }
    }
}
Landlock as of now can only control access to filesystem objects, but future versions will be more complete.
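Landlock only restricts the access rights listed in handled_access_fs, so a narrower list yields a looser sandbox. The sketch below generates an init.script that handles only write-related rights, which should leave reading and executing files unrestricted. The helper `landlock_init_script` is hypothetical (it is not part of Emilua); the generated script uses only the calls shown in the example above.

```lua
-- Hypothetical helper (not part of Emilua): renders an init.script that
-- handles, and therefore denies, exactly the given Landlock access rights.
local function landlock_init_script(rights)
    local quoted = {}
    for i, right in ipairs(rights) do
        quoted[i] = string.format("%q", right)
    end
    return table.concat({
        "local rules = C.landlock_create_ruleset{ handled_access_fs = {",
        "    " .. table.concat(quoted, ", "),
        "} }",
        "set_no_new_privs()",
        "C.landlock_restrict_self(rules)",
    }, "\n")
end

-- Write-related rights only: "execute", "read_file" and "read_dir" are
-- deliberately left out, so they are not restricted by this ruleset.
local read_only = landlock_init_script{
    "write_file", "remove_dir", "remove_file", "make_char", "make_dir",
    "make_reg", "make_sock", "make_fifo", "make_block", "make_sym",
    "refer", "truncate",
}
print(read_only)
```

The resulting string can be passed as `init = { script = read_only }` in the same way as `init_script` in the example above.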