spawn_vm

Synopsis

spawn_vm(module: string) -> channel
spawn_vm(opts: table) -> channel

Description

Creates a new actor and returns a tx-channel.

The new actor will execute with _CONTEXT='worker' (this _CONTEXT is not propagated to imported submodules within the actor).
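A minimal sketch of how a single file might branch on _CONTEXT. It assumes that the module '.' (listed as valid under Parameters) makes the new actor run this same file; that reading is an assumption, not something stated above. The spawn_vm guard is only there so the snippet can also be loaded by a plain Lua interpreter, where that global does not exist.

```lua
-- Decide the role of this VM. Per the text above, a spawned actor sees
-- _CONTEXT == 'worker'; anything else is treated as the parent here.
local role = (_CONTEXT == 'worker') and 'worker' or 'parent'

if role == 'worker' then
    -- Actor body goes here.
    print('running as a worker actor')
elseif spawn_vm then
    -- Assumption: '.' re-runs this same module inside the new actor.
    local ch = spawn_vm('.')  -- tx-channel to the new actor
end
```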

Threading with work-stealing

Spawn more VMs than threads and spawn them all in the same thread-pool. The system will transparently steal VMs from the shared pool to keep the work-queue somewhat fair between the threads.

Threading with load-balancing

Spawn each VM in a new thread pool and make sure each one has only one thread. Then use messaging to apply a load-balancing strategy of your choice.
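The two strategies above can be sketched as option tables plus a small dispatcher. The module name './worker', the helper names, and the round-robin policy are all hypothetical; round-robin is just one possible load-balancing strategy. The spawn_vm guard lets the helpers be exercised outside the Emilua runtime.

```lua
local WORKER_MODULE = './worker'  -- hypothetical module name

-- Work-stealing: inherit_context defaults to true, so the new actor
-- joins the caller's thread pool.
local function work_stealing_opts()
    return { module = WORKER_MODULE }
end

-- Load-balancing: a private thread pool that starts with one thread.
local function load_balancing_opts()
    return { module = WORKER_MODULE, inherit_context = false }
end

-- Returns a function that sends each message to the next channel in turn
-- and reports the index of the channel it used.
local function make_round_robin(channels)
    local i = 0
    return function(msg)
        i = i % #channels + 1
        channels[i]:send(msg)
        return i
    end
end

-- spawn_vm only exists under the Emilua runtime, so the calls are guarded.
if spawn_vm then
    local channels = {}
    for n = 1, 4 do
        channels[n] = spawn_vm(load_balancing_opts())
    end
    local dispatch = make_round_robin(channels)
    dispatch('job 1')
    dispatch('job 2')
end
```

For the work-stealing variant, the same loop would use work_stealing_opts() and no dispatcher is needed, since the shared pool does the balancing.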

Parameters

module: string|filesystem.path
string

The module that will serve as the entry point for the new actor.

'.' is also a valid module to use when you spawn actors.

filesystem.path

Only valid for IPC-based actors (see parameter subprocess below).

inherit_context: boolean = true

Whether to inherit the thread pool of the parent VM (i.e. the one calling spawn_vm()). On false, a new thread pool (starting with 1 thread) is created to run the new actor.

Emilua can handle multiple VMs running on the same thread just fine. Cooperative multitasking is used to alternate execution among the ready VMs.

A thread pool is one type of an execution context. The API prefers the term “context” as it’s more general than “thread pool”.

concurrency_hint: integer|"safe" = "safe"
integer

A suggestion to the new thread pool (inherit_context should be false) as to the number of active threads that should be used for scheduling actors[1].

You still need to call spawn_context_threads() to create the extra threads.

"safe"

The default. No assumption is made upfront on the number of active threads that will be created through spawn_context_threads().

new_master: boolean = false

The first VM (actor) to run in a process has different responsibilities as that’s the VM that will spawn all other actors in the system. The Emilua runtime will restrict modification of global process resources that don’t play nice with threads such as the current working directory and signal handling disposition to this VM.

Upon spawning a new actor, it’s possible to transfer ownership over these resources to the new VM. After spawn_vm() returns, the calling actor ceases to be the master VM in the process and can no longer recover its previous role as the master VM.

subprocess: table|nil
table

Spawn the actor in a new subprocess.

Not available on Windows.

nil

Default. Don’t spawn the actor in a new subprocess.

subprocess.newns_uts: boolean = false

Whether to create the process within a new Linux UTS namespace.

subprocess.newns_ipc: boolean = false

Whether to create the process within a new Linux IPC namespace.

subprocess.newns_pid: boolean = false

Whether to create the process within a new Linux PID namespace.

The first process in a PID namespace is PID1 within that namespace. PID1 has a few special responsibilities. After subprocess.init.script exits, the Emilua runtime will fork if it’s running as PID1. This new child will assume the role of starting your module (the Lua VM). The PID1 process will perform the following jobs:

  • Forward SIGTERM, SIGUSR1, SIGUSR2, SIGHUP, SIGINT, and SIGRTMIN+4 to the child. There is no point in re-routing every signal, but more may be added to this set if you present a compelling case.

  • Reap zombie processes.

  • Exit with the child’s exit code when the child dies.

subprocess.newns_user: boolean = false

Whether to create the process within a new Linux user namespace.

subprocess.newns_net: boolean = false

Whether to create the process within a new Linux net namespace.

subprocess.newns_mount: boolean = false

Whether to create the process within a new Linux mount namespace.

subprocess.pd_daemon: boolean = false

Instead of the default terminate-on-close behaviour, allow the process to live until it is explicitly killed with kill(2).

Only available on FreeBSD.

subprocess.environment: { [string] = string }|nil

A table of strings that will be used as the created process' envp. On nil, an empty envp will be used.

subprocess.stdin,stdout,stderr: "share"|file_descriptor|nil
"share"

The spawned process will share the specified standard handle (stdin, stdout, or stderr) with the caller process.

file_descriptor

Use the file descriptor as the specified standard handle (stdin, stdout, or stderr) for the spawned process.

nil

Create and use a closed pipe end as the specified standard handle (stdin, stdout, or stderr) for the spawned process.
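As an illustration of the fields described so far, here is a sketch of an options table for an IPC-based actor. The module name './worker' and the environment contents are hypothetical, and only parameters already covered above are shown. The spawn_vm guard lets the table be inspected from a plain Lua interpreter.

```lua
local opts = {
    module = './worker',          -- hypothetical module name
    subprocess = {
        -- envp for the child; leaving this nil would mean an empty envp.
        environment = { LANG = 'C' },
        -- stdin is left nil: the child gets a closed pipe end.
        stdout = 'share',         -- share the caller's stdout
        stderr = 'share',         -- share the caller's stderr
    },
}

-- spawn_vm only exists under the Emilua runtime, so the call is guarded.
if spawn_vm then
    local child = spawn_vm(opts)
end
```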

subprocess.init.script: string

The source code for a script that is used to initialize the sandbox in the child process.

subprocess.init.arg: file_descriptor|nil

A file descriptor that will be sent to the init.script. The script can access this fd through the variable arg that is available within the script.

subprocess.source_tree_cache: table|nil

The Lua source code cache will be pre-populated with this data. Emilua always queries the cache before the filesystem when loading Lua modules, so you may use this cache to bundle the application that will run inside sandboxes without filesystem access (e.g. Capsicum on FreeBSD, Landlock/seccomp on Linux).

That’s a recursive structure (a tree). Each key must be a string with the component name and the value might be a string (the Lua source code) or another tree.
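The shape of such a tree can be checked with a few lines of plain Lua. The sketch below encodes only the rule stated above (string keys; values are either source strings or nested trees). The helper name is_source_tree and the component names are hypothetical; in particular, whether component names carry a .lua suffix is an assumption.

```lua
-- Returns true when node follows the recursive layout described above.
local function is_source_tree(node)
    if type(node) ~= 'table' then return false end
    for name, value in pairs(node) do
        if type(name) ~= 'string' then return false end
        if type(value) == 'table' then
            if not is_source_tree(value) then return false end
        elseif type(value) ~= 'string' then
            return false
        end
    end
    return true
end

-- Hypothetical cache: one top-level source file and one nested component.
local cache = {
    ['app.lua'] = "print('hello from the sandbox')",
    lib = {
        ['util.lua'] = "return { greet = function() return 'hi' end }",
    },
}

assert(is_source_tree(cache))
```

A table like cache would then be passed as subprocess.source_tree_cache in the options given to spawn_vm().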

subprocess.native_modules_cache: string[]|"all"|nil
string[]

A list of plugins to resolve (but not load) on the host and send as file descriptors to be fdlopen()ed on the subprocess. Plugin file descriptors will be stored in a special cache on the subprocess, but will only be loaded once require()d from Lua code.

If the character ":" is appended to a module-id, a file descriptor to the containing directory will be sent instead.

Under FreeBSD, these file descriptors are protected using Capsicum. Under Linux, you’re pretty much exposing the whole mount namespace and should prepare accordingly.

"all"

Send file descriptors to all EMILUA_PATH directories (and the related builtin search paths as well).

subprocess.ld_library_directories: file_descriptor[]

A list of directory file descriptors that the runtime handles as follows:

  1. dup() each file descriptor.

  2. For each duplicate, cap_rights_limit().

  3. Send the duplicates to the new subprocess to fill the environment variable LD_LIBRARY_PATH_FDS.

Only available on FreeBSD.

subprocess.libc_service: libc_service.slave|nil

The proxy used to override functions from libc that are used for ambient authority access within the new subprocess.

The object is consumed by the call and cannot be reused elsewhere afterwards.

It’s wise to combine this with a real syscall firewall (e.g. FreeBSD’s Capsicum, Linux’s seccomp).

channel functions

send(self, msg)

Sends a message.

You can send the address of other actors (or self) by sending the channel as a message. A clone of the tx-channel will be made and sent over.

This simple foundation is enough to:

[…​] gives Actors the ability to create and participate in arbitrarily variable topological relationships with one another […​]
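As a small illustration of building a topology this way, the sketch below introduces two actors to each other by sending each one the other's channel. The helper name introduce and the module names are hypothetical, and the receiving side (how each actor reads the message) is not shown. The spawn_vm guard lets the helper be exercised outside the Emilua runtime.

```lua
-- Send each actor the address of the other. Per the text above, a clone
-- of the tx-channel is what actually travels in the message.
local function introduce(a, b)
    a:send(b)
    b:send(a)
end

-- spawn_vm only exists under the Emilua runtime, so the calls are guarded.
if spawn_vm then
    local producer = spawn_vm('./producer')  -- hypothetical module name
    local consumer = spawn_vm('./consumer')  -- hypothetical module name
    introduce(producer, consumer)
end
```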

close(self)

Closes the channel. No further messages can be sent after a channel is closed.

detach(self)

Detaches the calling VM/actor from the role of supervisor for the process/actor represented by self. After this operation is done, the process/actor represented by self is allowed to outlive the calling process.

The channel remains open.

This method is only available for channels associated with IPC-based actors that are direct children of the caller.

kill(self, signo: integer = system.signal.SIGKILL)

Sends signo to the subprocess. On SIGKILL, it’ll also close the channel.

This method is only available for channels associated with IPC-based actors that are direct children of the caller.

A PID file descriptor is used to send signo so no races involving PID numbers ever happen.

channel properties

child_pid: integer

The process id used by the OS to represent this child process (e.g. the number that shows up in /proc on some UNIX systems).

Keep in mind that process reaping happens automatically and the PID won’t remain reserved once the child dies, so it’s racy to use the PID. Even if process reaping were not automatic, races would still be possible if the parent died while some other process was using this PID. Use child_pid only as a last resort.

You can only access this field for channels associated with IPC-based actors that are direct children of the caller.