For the latest stable version, please use Emilua API 0.10!
Filesystem API
Emilua offers its own cross-platform filesystem API. The hard part of a cross-platform filesystem API is basically Windows. As Ryan Gordon (of SDL fame) succinctly put it:
Windows. Windows is the problem.
Windows wants you to mess with UTF-16 strings for Unicode filepaths, everything else wants UTF-8.
Windows wants you to use Win32 APIs, everything else uses POSIX.
Windows wants you to use FILETIME (100-nanosecond increments since 1601), everything else uses POSIX (time_t, Unix epoch).
Windows wants you to use '\\', everything else uses '/'.
Windows has drive letters, everything else has mount points.
Windows sorta has symlinks in modern times, many other things always do. But some things don’t at all!
On top of what Ryan said, I’d add the following points:
- Windows wants you to mess with GetLastError(), everything else wants errno.
- Windows is case-insensitive, everything else is case-sensitive.
Except for case sensitivity, Emilua absorbs all of these problems on your behalf with an API that abstracts such differences away. On top of that, when running on Windows it uses Microsoft’s own implementation of these translation layers[1]. In other words, if you bypass Emilua’s abstractions because you don’t trust our knowledge of the Windows API, all you would be avoiding is Microsoft’s own code, which you can’t really avoid anyway.
Of course, a few non-Windows extensions are also offered. If you’re not (only) targeting Windows, the common UNIX concepts are a must-have, and they’re here (otherwise you wouldn’t be able to use Emilua to build containers, which is something we also support).
The object filesystem.path
filesystem.path is the central piece of our design. As the name implies, it
represents a path. On the Lua side, you just deal with UTF-8 encoding.
Internally, this class keeps the representation in the native format and
translates to UTF-8 as needed to interact with Lua code.
local fs = require "filesystem"
local my_path1 = fs.path.new("/home/user")
local my_path2 = fs.path.from_generic("Downloads/music")
There are two constructors. One takes the path in the native format. The other
uses a generic format. The generic format always uses "/" as the directory
separator. The native format gives "/" no special meaning and just relies on
the native directory separator of the underlying platform (but it still
handles conversions from UTF-8 to the native encoding).
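To make the difference between the two formats concrete, here is a minimal plain-Lua sketch of the idea behind the generic format. It is not how Emilua implements fs.path.from_generic(); the helper name generic_to_native and its optional sep parameter are made up for illustration, and it only deals with separators (the real class also takes care of encoding conversion).

```lua
-- Hypothetical helper (not part of Emilua): rewrite a generic-format
-- path string so that it uses the given directory separator.
-- package.config is standard Lua; its first character is the directory
-- separator of the platform the interpreter is running on.
local native_sep = package.config:sub(1, 1)

local function generic_to_native(generic, sep)
    sep = sep or native_sep
    -- A function replacement avoids any special handling of "%" in sep.
    -- The outer parentheses drop gsub's second return value (the count).
    return (generic:gsub("/", function() return sep end))
end

-- Prints the path using whatever separator the current platform uses.
print(generic_to_native("Downloads/music"))
```

With a path object you never need such a helper: the conversion happens once, at construction time, and everything afterwards works on the native representation.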
When you’re composing paths, you can use the overloaded operators as they’ll automatically use the native directory separator for the underlying platform:
function foobar(path)
    return path / "Downloads" / "myfile" .. ".txt"
end
You can also query their dynamic properties to perform path decomposition:
function foobar(path)
    return path.parent_path, path.filename
end
Or decompose them through iteration:
function foobar(path)
    for component in path:iterator() do
        print(component)
    end
end
Paths are immutable. Operations that modify a path always return a new path while the original is left untouched.
Nowhere does the Emilua API accept a plain string as a file path. You’ll need
to use path objects explicitly, even in UNIX socket operations. This design
helps to disambiguate cases where multiple types are accepted but mean
different things (e.g. program in system.spawn()). It also helps to centralize
platform differences related to path representation in a single class (e.g.
you can grep through your codebase for it to refactor code or to look for
wrong assumptions).
This class only handles the path itself. It’s just an in-memory
representation. When you use its member functions (e.g. lexically_normal()),
you’re NOT doing any operation on the filesystem itself. There’s no danger of
accidentally performing filesystem operations by just playing with the path
object alone (that’s also why some functions are non-members: it’s a hint that
they might touch the actual filesystem to complete their task).
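To illustrate why a purely lexical operation needs no filesystem access, below is a simplified plain-Lua sketch of lexical normalization on generic-format strings. It is an illustration only, not Emilua’s implementation: the function name lexically_normal_sketch is hypothetical, and the sketch ignores details such as trailing separators, empty paths and Windows root names.

```lua
-- Simplified sketch: normalize a generic-format path string by dropping
-- "." components and resolving ".." against the preceding component.
-- Only the string is inspected; the filesystem is never consulted, so
-- symlinks are NOT taken into account.
local function lexically_normal_sketch(generic)
    local absolute = generic:sub(1, 1) == "/"
    local out = {}
    for part in generic:gmatch("[^/]+") do
        if part == "." then
            -- "." refers to the current directory: nothing to add
        elseif part == ".." then
            if #out > 0 and out[#out] ~= ".." then
                table.remove(out)       -- cancel the previous component
            elseif not absolute then
                out[#out + 1] = ".."    -- keep leading ".." of relative paths
            end
            -- a ".." directly under the root is dropped
        else
            out[#out + 1] = part
        end
    end
    local result = table.concat(out, "/")
    if absolute then
        return "/" .. result
    end
    if result == "" then
        return "."
    end
    return result
end

print(lexically_normal_sketch("/home/user/../user/./Downloads"))
-- prints "/home/user/Downloads"
```

Because nothing here asks the operating system anything, the result can differ from what a symlink-resolving function would return for the same input.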
Filesystem operations
The module filesystem presents plenty of useful functions such as:
- Directory iteration (flat and recursive).
- Path normalization algorithms (e.g. resolve symlinks, make relative to some base, etc.).
- Create a directory and any missing parent.
- Copy subtrees.
- Manipulate links.
Any of these operations might fail and the platform will report the associated
error. Emilua will just propagate the original error to your program. If you
want to handle the error portably, you may call the method togeneric() to
convert the platform-specific error code into a POSIX errno-like object:
function handle_error(e)
    if e:togeneric() == generic_error.EEXIST then
        -- EEXIST on POSIX or
        -- ERROR_ALREADY_EXISTS on Windows
        return handle_eexist(e)
    else
        error(e)
    end
end
It’s important to preserve the original error when you’re actually trying to
understand why an operation fails on some platform. That’s why Emilua doesn’t
try to hide it away under generic_error automatically, and you must always opt
in to the translation here. Try to keep the original error value in logs and
only convert it to generic_error when you’re actually handling the error, i.e.
when matching it against a set of conditions your program is able to handle.
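The advice above (log the original error, convert only when matching) can be sketched as a small wrapper around pcall(). This is a sketch under assumptions, not an Emilua API: run_tolerating_eexist and the log callback are hypothetical names, and the generic_error table is passed in by the caller because the text above does not show how it is obtained. The only thing assumed about the error object is the togeneric() method described above.

```lua
-- Sketch only. `operation` is any function that may raise an error
-- object; `generic_error` is the table of generic codes used in the
-- snippet above; `log` is a hypothetical logging callback.
-- Returns true on success and false when the failure maps to EEXIST.
-- Any other error is raised again unchanged.
local function run_tolerating_eexist(operation, generic_error, log)
    local ok, e = pcall(operation)
    if ok then
        return true
    end
    -- Log the original, platform-specific error before anything else.
    log(tostring(e))
    -- Convert only at the point where the error is matched. Values that
    -- cannot carry a togeneric() method (e.g. plain strings) skip the check.
    local kind = type(e)
    if (kind == "table" or kind == "userdata")
        and e:togeneric() == generic_error.EEXIST then
        return false
    end
    error(e, 0)
end
```

A caller would pass a closure performing the filesystem operation, for instance one that creates a directory, and treat a false result as "it was already there". The log keeps the untranslated error, so platform-specific details remain available for debugging.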
On Windows, the translation to POSIX error codes is done by code written by Microsoft. We do not hardcode any mapping ourselves. That’s as close as it gets to any form of official support from the native platform. You can’t do any better than that, and you should feel safe to use the Emilua API directly instead of trying to bypass it.
Async IO and threading
Unfortunately, async filesystem operations never really gained traction in any
mainstream operating system (and the scenario is unlikely to change).
Read/write on files may make use of async IO, but moving files, iterating on
directories, etc. all rely on blocking operations. It’d be terribly
inefficient to create a thread for each of these operations. Using thread
pools instead of plain threads would also have huge drawbacks. Therefore,
Emilua opts to just block on all of these operations. If you need to perform
operations from the module filesystem without blocking the current thread, use
spawn_vm{inherit_context=false} to spawn an actor in a new thread from which
you can unapologetically perform blocking operations.