content-link
I’ve made content-link
as a simple content-addressed file registry, which just symlinks files
using their multihash.
This complements tools like Git, git-annex, IPFS, Nix, etc. which all refer to files using a “content address”, i.e. their hash. Those tools look up hashes in various “stores”, and can even download missing files. Whilst that is certainly convenient, it can also cause duplication, which is a problem when dealing with large numbers of large files.
The idea of content-link is to be much simpler: it
cannot fetch anything, and never makes copies. Instead, it just
associates a file’s hash to its existing path. This provides a
layer of indirection, where tools and scripts can refer to the files
they need by hash, and content-link gives them the
right path for the current system (if known).
A motivating example: virtual machine disk images. These are large,
rarely change, and may be stored in different locations on different
machines. content-link put can register each image once; a
launcher script can then use content-link get to resolve
the path it needs, falling back gracefully (e.g. telling the user where
to obtain missing images) if no match is found.
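As a rough illustration of that launcher pattern, here is a minimal sketch. It relies only on the documented behaviour of content-link get (print the path on success, exit with an error otherwise); the function name resolve_image, the hash and the download hint are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helper for a launcher script: resolve an image by hash,
# or tell the user how to obtain it. Not part of content-link itself.
resolve_image() (
  hash=$1   # multibase-encoded multihash of the wanted image
  hint=$2   # where a missing image can be obtained (placeholder)
  # Extra safety check (an assumption, not documented behaviour):
  # also make sure the resolved path still exists on this machine.
  if path=$(content-link get "$hash" 2>/dev/null) && [ -e "$path" ]; then
    printf '%s\n' "$path"
  else
    printf 'Image %s is not registered on this machine.\n' "$hash" >&2
    printf 'Obtain it from %s, then run: content-link put <file>\n' "$hint" >&2
    exit 1
  fi
)

# Example use inside a launcher (hash and URL are placeholders):
#   image=$(resolve_image "$IMAGE_HASH" "https://example.org/images") || exit 1
#   qemu-system-x86_64 -cdrom "$image"
```

Because the launcher only ever mentions the hash, the same script works unchanged on machines that keep their images in different directories.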
How it works
The registry lives in ~/.content-link by default; this can be
overridden via the STORE_DIR environment variable. A useful pattern
I’ve found is to actually make that a symlink to some system-wide
location (like /var/lib/content-link) so all users can
share the same links.
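Setting up that shared layout might look something like the sketch below. The helper name link_shared_store is hypothetical, and it assumes the shared directory already exists with permissions that let the relevant users create entries in it.

```shell
#!/bin/sh
# Hypothetical helper: make a per-user registry path a symlink to a
# shared directory, without clobbering anything that is already there.
link_shared_store() (
  shared=$1                        # e.g. /var/lib/content-link
  link=${2:-$HOME/.content-link}   # default per-user registry path
  [ -d "$shared" ] || { echo "no such directory: $shared" >&2; exit 1; }
  if [ -e "$link" ] || [ -L "$link" ]; then
    echo "refusing to replace existing $link" >&2
    exit 1
  fi
  ln -s "$shared" "$link"
)

# Example: link_shared_store /var/lib/content-link
```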
Each entry in that “store” is a symlink, whose name is the symlink target’s hash:
$ ls -l ~/.content-link/
total 1
lrwxrwxrwx 1 chris users 85 Feb 25 04:07 f122000413fc780bbf41e6c0baeff05dbaa9576114443c21ea95e2f92853abd525b47 -> '/mnt/shared/Images/NixOS_25.11.iso'
The underlying idea is very simple: to add a path, we take its hash and create a symlink with that hash as its name. To retrieve the file associated with a given hash, we just look for a symlink with that hash as its name.
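To make the idea concrete, here is a minimal sketch of both operations. It is not the actual content-link.sh source: the names sketch_put and sketch_get are hypothetical, it assumes sha256sum and realpath from GNU coreutils, it accepts only hex names on lookup, and overwriting an existing entry on re-registration is an assumption.

```shell
#!/bin/sh
# Illustrative sketch only; not the real content-link implementation.
STORE_DIR=${STORE_DIR:-$HOME/.content-link}

# Register a file: name a symlink after the file's hash.
sketch_put() (
  [ -f "$1" ] || { echo "not a file: $1" >&2; exit 1; }
  target=$(realpath "$1") || exit 1
  # "f1220" marks a hex-encoded SHA256 multihash, followed by the digest.
  name="f1220$(sha256sum < "$target" | cut -d' ' -f1)"
  mkdir -p "$STORE_DIR"
  ln -sfn "$target" "$STORE_DIR/$name"   # assumption: re-adding overwrites
  printf '%s\n' "$name"
)

# Look up a hash: succeed only if a symlink with that name exists.
sketch_get() (
  [ -L "$STORE_DIR/$1" ] || { echo "unknown hash: $1" >&2; exit 1; }
  readlink "$STORE_DIR/$1"
)
```

Note that sketch_get never reads the file itself, so a lookup costs the same no matter how large the registered file is.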
Note that we don’t validate the hashes during lookup; it’s assumed that the hash won’t change, e.g. that we’re using immutable files. If that’s a problem, you can just delete and re-add the offending entry to use its latest hash (I decided not to implement such “management” commands, to avoid accidental deletions).
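If stale entries are a concern, a check along these lines could be run by hand. This is a sketch rather than a content-link feature: verify_entry is a hypothetical name, it assumes sha256sum from GNU coreutils, and it assumes entries are named with the lowercase-hex f1220 prefix shown in the listing above.

```shell
#!/bin/sh
# Hypothetical check: does a registry entry's name still match the
# SHA256 of the file it points to? Not provided by content-link.
verify_entry() (
  entry=$1                            # path to a symlink in the store
  [ -e "$entry" ] || { echo "dangling entry: $entry" >&2; exit 1; }
  expected=$(basename "$entry")
  # Reading through the symlink hashes the target's current contents.
  actual="f1220$(sha256sum < "$entry" | cut -d' ' -f1)"
  if [ "$expected" != "$actual" ]; then
    echo "stale entry: $entry now hashes to $actual" >&2
    exit 1
  fi
)

# Example: verify_entry ~/.content-link/f1220...   (name elided)
```

Unlike a lookup, this reads the whole file, so it is best kept as an occasional audit for large files.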
Other than that, we just have to choose some implementation details. If we allowed multiple hash algorithms, that would require insertions and lookups to coordinate their choice of algorithm, which seems like a headache; hence we only support one hash algorithm. I’ve gone with SHA256, since it’s widely used and not obviously terrible; wrapped in a multihash for forward compatibility.
Those SHA256 multihashes are hex-encoded to form the symlink filenames, but lookups accept any multibase encoding, in case users prefer base64 or something else.
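The resulting names can be read field by field. The sketch below splits a hex-encoded identifier into its parts, using the prefix values from the multibase and multihash tables ("f" for lowercase base16, 0x12 for sha2-256, 0x20 for a 32-byte digest). The function name describe_id is hypothetical, and it deliberately handles only this one encoding.

```shell
#!/bin/sh
# Hypothetical helper: explain the fields of a hex-encoded SHA256
# multihash such as the ones content-link prints.
describe_id() (
  id=$1
  case $id in
    f1220*) ;;
    *) echo "not a hex-encoded SHA256 multihash: $id" >&2; exit 1 ;;
  esac
  digest=${id#f1220}
  # A 32-byte digest is 64 hex digits.
  [ "${#digest}" -eq 64 ] || {
    echo "expected 64 hex digits, got ${#digest}" >&2
    exit 1
  }
  printf 'multibase:     f (base16, lowercase)\n'
  printf 'hash function: 0x12 (sha2-256)\n'
  printf 'digest length: 0x20 (32 bytes)\n'
  printf 'digest:        %s\n' "$digest"
)

# Example: describe_id "$(content-link put /path/to/file)"
```

The self-describing prefix is what would allow a different hash function to be introduced later without ambiguity about which one produced an existing name.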
Usage
Build a self-contained executable with nix-build, or use
content-link.sh directly (requires the rust-multibase
CLI on $PATH).
Register a file
content-link put /path/to/file
Outputs the file’s SHA256 multihash, as multibase-encoded hex:
f1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
Retrieve a path by hash
content-link get f1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
Prints the absolute path if that hash is registered, or exits with an error otherwise. The hash argument does not need to be hex; it can be in any multibase encoding supported by rust-multibase.
Custom registry location
STORE_DIR=/my/iso/registry content-link put RedHat9.iso