For the latest stable version, please use Emilua API 0.10! |
spawn_vm
Description
Creates a new actor and returns a tx-channel.
The new actor will execute with _CONTEXT='worker'
(this _CONTEXT
is not
propagated to imported submodules within the actor).
Threading with work-stealing
Spawn more VMs than threads and spawn them all in the same thread-pool. The system will transparently steal VMs from the shared pool to keep the work-queue somewhat fair between the threads. |
Threading with load-balancing
Spawn each VM in a new thread pool and make sure each-one has only one thread. Now use messaging to apply some load-balancing strategy of your choice. |
Parameters
module: string
-
The module that will serve as the entry point for the new actor.
For IPC-based actors, this argument no longer means an actual module when Linux namespaces are involved. It’ll just be passed along to the new process. '.'
is also a valid module to use when you spawn actors. inherit_context: boolean = true
-
Whether to inherit the thread pool of the parent VM (i.e. the one calling
spawn_vm()
). Onfalse
, a new thread pool (starting with1
thread) is created to run the new actor.Emilua can handle multiple VMs running on the same thread just fine. Cooperative multitasking is used to alternate execution among the ready VMs.
A thread pool is one type of an execution context. The API prefers the term “context” as it’s more general than “thread pool”. concurrency_hint: integer|"safe" = "safe"
-
integer
-
A suggestion to the new thread pool (
inherit_context
should befalse
) as to the number of active threads that should be used for scheduling actors[1].You still need to call spawn_context_threads()
to create the extra threads.
"safe"
-
The default. No assumption is made upfront on the number of active threads that will be created through
spawn_context_threads()
.
new_master: boolean = false
-
The first VM (actor) to run in a process has different responsibilities as that’s the VM that will spawn all other actors in the system. The Emilua runtime will restrict modification of global process resources that don’t play nice with threads such as the current working directory and signal handling disposition to this VM.
Upon spawning a new actor, it’s possible to transfer ownership over these resources to the new VM. After
spawn_vm()
returns, the calling actor ceases to be the master VM in the process and can no longer recover its previous role as the master VM. subprocess: table|nil
-
table
-
Spawn the actor in a new subprocess.
Not available on Windows.
nil
-
Default. Don’t spawn the actor in a new subprocess.
subprocess.newns_uts: boolean = false
-
Whether to create the process within a new Linux UTS namespace.
subprocess.newns_ipc: boolean = false
-
Whether to create the process within a new Linux IPC namespace.
subprocess.newns_pid: boolean = false
-
Whether to create the process within a new Linux PID namespace.
The first process in a PID namespace is PID1 within that namespace. PID1 has a few special responsibilities. After
subprocess.init.script
exits, the Emilua runtime will fork if it’s running as PID1. This new child will assume the role of starting your module (the Lua VM). The PID1 process will perform the following jobs:-
Forward
SIGTERM
,SIGUSR1
,SIGUSR2
,SIGHUP
, andSIGINT
to the child. There is no point in re-routing every signal, but more may be added to this set if you present a compelling case. -
Reap zombie processes.
-
Exit when the child dies with the same exit code as the child’s.
-
subprocess.newns_user: boolean = false
-
Whether to create the process within a new Linux user namespace.
Even if it’s a sandbox, and root inside the sandbox doesn’t mean root outside it, maybe you still want to drop all root privileges at the end of the
init.script
:C.cap_set_proc('=')
It won’t be particularly useful for most people, but that technique is still useful to — for instance — create alternative LXC/FlatPak front-ends to run a few programs (if the program can’t update its own binary files, new possibilities for sandboxing practice open up).
subprocess.newns_net: boolean = false
-
Whether to create the process within a new Linux net namespace.
subprocess.newns_mount: boolean = false
-
Whether to create the process within a new Linux mount namespace.
subprocess.environment: { [string] = string }|nil
-
A table of strings that will be used as the created process'
envp
. Onnil
, an emptyenvp
will be used. subprocess.stdin,stdout,stderr: "share"|file_descriptor|nil
-
"share"
-
The spawned process will share the specified standard handle (
stdin
,stdout
, orstderr
) with the caller process. file_descriptor
-
Use the file descriptor as the specified standard handle (
stdin
,stdout
, orstderr
) for the spawned process. nil
-
Create and use a closed pipe end as the specified standard handle (
stdin
,stdout
, orstderr
) for the spawned process.
subprocess.init.script: string
-
The source code for a script that is used to initialize the sandbox in the child process.
errexit
We don’t want to accidentally ignore errors from the C API exposed to the
init.script
. That’s why we borrow an idea from BASH. One common folklore among BASH programmers is the unofficial strict mode. Among other things, this mode dictates the use of BASH’sset -o errexit
.And
errexit
exists for theinit.script
as well. Forinit.script
,errexit
is just a global boolean. Every time the C API fails, the Emilua wrapper for the function will check its value. Onerrexit=true
(the default when the script starts), the process will abort whenever some C API fails. That’s specially important when you’re using the API to drop process credentials/rights.The controlling terminalThe Emilua runtime won’t call
setsid()
norsetpgid()
by itself, so the process will stay in the same session as its parent, and it’ll have access to the same controlling terminal.If you want to block the new actor from accessing the controlling terminal, you may perform the usual calls in
init.script
:C.setsid() C.setpgid(0, 0)
subprocess.init.arg: file_descriptor
-
A file descriptor that will be sent to the
init.script
. The script can access this fd through the variablearg
that is available within the script.
channel
functions
send(self, msg)
Sends a message.
You can send the address of other actors (or self) by sending the channel as a message. A clone of the tx-channel will be made and sent over. This simple foundation is enough to:
|
detach(self)
Detaches the calling VM/actor from the role of supervisor for the process/actor
represented by self
. After this operation is done, the process/actor
represented by self
is allowed to outlive the calling process.
The channel remains open. |
This method is only available for channels associated with IPC-based actors that are direct children of the caller. |
kill(self, signo: integer = system.signal.SIGKILL)
Sends signo
to the subprocess. On SIGKILL
, it’ll also close the channel.
This method is only available for channels associated with IPC-based actors that are direct children of the caller. |
A PID file descriptor is used to send signo so no races involving PID
numbers ever happen.
|
channel
properties
child_pid: integer
The process id used by the OS to represent this child process (e.g. the number
that shows up in /proc
on some UNIX systems).
Do keep in mind that process reaping happens automatically and the PID won’t
remain reserved once the child dies, so it’s racy to use the PID. Even if
process reaping was not automatic, it’d still be possible to have races if the
parent died while some other process was using this PID. Use child_pid
only as
a last resort.
You can only access this field for channels associated with IPC-based actors that are direct children of the caller. |
The C API exposed to init.script
Helpers
mode(user: integer, group: integer, other: integer) → integer
function mode(user, group, other)
return bit.bor(bit.lshift(user, 6), bit.lshift(group, 3), other)
end
receive_with_fd(fd: integer, buf_size: integer) → string, integer, integer
Returns three values:
-
String with the received message (or
nil
on error). -
File descriptor received (or
-1
on none). -
The errno value (or
0
on success).
Functions
These functions live inside the global table C
. errno
(or 0
on success) is
returned as the second result.
-
read()
. Opposed to the C function, it receives two arguments. The second argument is the size of the buffer. The buffer is allocated automatically, and returned as a string in the first result (unless an error happens, thennil
is returned). -
write()
. Opposed to the C function, it receives two arguments. The second one is a string which will be used as the buffer. -
sethostname()
. Opposed to the C function, it only receives the string argument. -
setdomainname()
. Opposed to the C function, it only receives the string argument. -
setgroups()
. Opposed to the C function, it receives a list of numbers as its single argument. -
cap_set_proc()
. Opposed to the C function, it receives a string as its single argument. The string is converted to thecap_t
type using the functioncap_from_text()
. -
cap_drop_bound()
. Opposed to the C function, it receives a string as its single argument. The string is converted to thecap_value_t
type using the functioncap_from_name()
. -
cap_set_ambient()
. Opposed to the C function, it receives a string as its first argument. The string is converted to thecap_value_t
type using the functioncap_from_name()
. The second parameter is a boolean. -
execve()
. Opposed to the C function,argv
andenvp
are specified as a Lua table. -
fexecve()
. Opposed to the C function,argv
andenvp
are specified as a Lua table.
Other exported functions work as usual (except that errno
or 0
is returned
as the second result):
-
open()
. -
mkdir()
. -
chdir()
. -
link()
. -
symlink()
. -
chown()
. -
chmod()
. -
umask()
. -
mount()
. -
umount()
. -
umount2()
. -
pivot_root()
. -
chroot()
. -
setsid()
. -
setpgid()
. -
setresuid()
. -
setresgid()
. -
cap_reset_ambient()
. -
cap_set_secbits()
. -
unshare()
. -
setns()
. -
cap_enter()
. -
jail_attach()
.
Constants
These constants live inside the global table C
.
-
O_CLOEXEC
. -
EAFNOSUPPORT
. -
EADDRINUSE
. -
EADDRNOTAVAIL
. -
EISCONN
. -
E2BIG
. -
EDOM
. -
EFAULT
. -
EBADF
. -
EBADMSG
. -
EPIPE
. -
ECONNABORTED
. -
EALREADY
. -
ECONNREFUSED
. -
ECONNRESET
. -
EXDEV
. -
EDESTADDRREQ
. -
EBUSY
. -
ENOTEMPTY
. -
ENOEXEC
. -
EEXIST
. -
EFBIG
. -
ENAMETOOLONG
. -
ENOSYS
. -
EHOSTUNREACH
. -
EIDRM
. -
EILSEQ
. -
ENOTTY
. -
EINTR
. -
EINVAL
. -
ESPIPE
. -
EIO
. -
EISDIR
. -
EMSGSIZE
. -
ENETDOWN
. -
ENETRESET
. -
ENETUNREACH
. -
ENOBUFS
. -
ECHILD
. -
ENOLINK
. -
ENOLCK
. -
ENODATA
. -
ENOMSG
. -
ENOPROTOOPT
. -
ENOSPC
. -
ENOSR
. -
ENXIO
. -
ENODEV
. -
ENOENT
. -
ESRCH
. -
ENOTDIR
. -
ENOTSOCK
. -
ENOSTR
. -
ENOTCONN
. -
ENOMEM
. -
ENOTSUP
. -
ECANCELED
. -
EINPROGRESS
. -
EPERM
. -
EOPNOTSUPP
. -
EWOULDBLOCK
. -
EOWNERDEAD
. -
EACCES
. -
EPROTO
. -
EPROTONOSUPPORT
. -
EROFS
. -
EDEADLK
. -
EAGAIN
. -
ERANGE
. -
ENOTRECOVERABLE
. -
ETIME
. -
ETXTBSY
. -
ETIMEDOUT
. -
ENFILE
. -
EMFILE
. -
EMLINK
. -
ELOOP
. -
EOVERFLOW
. -
EPROTOTYPE
. -
O_CREAT
. -
O_RDONLY
. -
O_WRONLY
. -
O_RDWR
. -
O_DIRECTORY
. -
O_EXCL
. -
O_NOCTTY
. -
O_NOFOLLOW
. -
O_TMPFILE
. -
O_TRUNC
. -
O_APPEND
. -
O_ASYNC
. -
O_DIRECT
. -
O_DSYNC
. -
O_LARGEFILE
. -
O_NOATIME
. -
O_NONBLOCK
. -
O_PATH
. -
O_SYNC
. -
S_IRWXU
. -
S_IRUSR
. -
S_IWUSR
. -
S_IXUSR
. -
S_IRWXG
. -
S_IRGRP
. -
S_IWGRP
. -
S_IXGRP
. -
S_IRWXO
. -
S_IROTH
. -
S_IWOTH
. -
S_IXOTH
. -
S_ISUID
. -
S_ISGID
. -
S_ISVTX
. -
MS_REMOUNT
. -
MS_BIND
. -
MS_SHARED
. -
MS_PRIVATE
. -
MS_SLAVE
. -
MS_UNBINDABLE
. -
MS_MOVE
. -
MS_DIRSYNC
. -
MS_LAZYTIME
. -
MS_MANDLOCK
. -
MS_NOATIME
. -
MS_NODEV
. -
MS_NODIRATIME
. -
MS_NOEXEC
. -
MS_NOSUID
. -
MS_RDONLY
. -
MS_REC
. -
MS_RELATIME
. -
MS_SILENT
. -
MS_STRICTATIME
. -
MS_SYNCHRONOUS
. -
MS_NOSYMFOLLOW
. -
MNT_FORCE
. -
MNT_DETACH
. -
MNT_EXPIRE
. -
UMOUNT_NOFOLLOW
. -
CLONE_NEWCGROUP
. -
CLONE_NEWIPC
. -
CLONE_NEWNET
. -
CLONE_NEWNS
. -
CLONE_NEWPID
. -
CLONE_NEWTIME
. -
CLONE_NEWUSER
. -
CLONE_NEWUTS
. -
SECBIT_NOROOT
. -
SECBIT_NOROOT_LOCKED
. -
SECBIT_NO_SETUID_FIXUP
. -
SECBIT_NO_SETUID_FIXUP_LOCKED
. -
SECBIT_KEEP_CAPS
. -
SECBIT_KEEP_CAPS_LOCKED
. -
SECBIT_NO_CAP_AMBIENT_RAISE
. -
SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED
.
C.landlock_create_ruleset(attr: table|nil, flags: table|nil) → integer, integer
Parameters:
-
attr.handled_access_fs: string[]
-
"execute"
-
"write_file"
-
"read_file"
-
"read_dir"
-
"remove_dir"
-
"remove_file"
-
"make_char"
-
"make_dir"
-
"make_reg"
-
"make_sock"
-
"make_fifo"
-
"make_block"
-
"make_sym"
-
"refer"
-
"truncate"
-
-
flags: string[]
-
"version"
-
Returns two values:
-
landlock_create_ruleset()
return. -
The errno value (or
0
on success).
C.landlock_add_rule(ruleset_fd: integer, rule_type: "path_beneath", attr: table) → integer, integer
Parameters:
-
attr.allowed_access: string[]
-
"execute"
-
"write_file"
-
"read_file"
-
"read_dir"
-
"remove_dir"
-
"remove_file"
-
"make_char"
-
"make_dir"
-
"make_reg"
-
"make_sock"
-
"make_fifo"
-
"make_block"
-
"make_sym"
-
"refer"
-
"truncate"
-
-
attr.parent_fd: integer
Returns two values:
-
landlock_add_rule()
return. -
The errno value (or
0
on success).
C.landlock_restrict_self(ruleset_fd: integer) → integer, integer
Returns two values:
-
landlock_restrict_self()
return. -
The errno value (or
0
on success).
C.jail_set(params: { [string]: string|boolean }, flags: string[]|nil) → integer, integer
Create or modify a jail. Optionally locks the current process in it.
Jail parameters are given as strings and they’ll be transparently converted to the native format accepted by the kernel.
flags
may contain the following values:
-
"create"
-
"update"
-
"attach"
-
"dying"
See jail(8) for more information on the core jail parameters.