I am interested in degrowth, smol tech, permacomputing, and e-ink readers.
I dream of a post-scarcity, post-capitalist future.
I keep my code on Codeberg.
I also have a gemsite. I like the idea of a semantic web where styling and display choices are left up to the end user. Gemini is not that, but it's closer to that than the web is. Meanwhile, this webpage is minimally styled and as basic as possible in order to play nicely with user stylesheet plugins.
This website is a fragment, originally published and authored from a hacked reMarkable 2 tablet using a salvaged laptop keyboard controlled by a Teensy 3.2. It will remain a fragment until I figure out a good local-first way to sync files between my devices and the servers I publish to. I am documenting my efforts toward that goal here, directly in vim on the tilde.town server. See my 2022-12-10 entry for the design I have in mind for satisfying the first two goals below.
- Local-first. I should not have to be connected to the Internet to author or edit files. Changes should be synchronized automatically when connectivity is restored.
- Scalable/subjective. Some files may be large and unsuitable for keeping locally on small, space-constrained devices, such as the 4GB tablet I started writing this on. It should be possible to replace files on a device with links to an external presence, sacrificing the "local-first" goal for those files only.
- Native. Modern browsers are resource hogs. Some of the devices I want to use this system with are too resource-constrained to run anything web-based.
- Some sort of extended-attribute-based tag system. The "single source of truth" for tags should be the metadata stored on the files themselves, rather than an external database. This simplifies backups and the task of keeping files associated with their tags. A cache, such as an SQLite database, should be derivable from the attributes in order to speed up queries but should, as with any cache, be as seamless as possible and entirely disposable. Finally, it should be possible to export the tags in a format that does not require filesystem support for extended attributes, and then to restore tags from this export artifact later.
- Near-seamless publication. The system should double as a "digital garden," or personal wiki. It should be possible to derive a meaningful semantic accounting for all the files involved and to communicate the purpose and context of these files both to others and to my future self. Old, forgotten corners of my data-body should feel less like archeological digs and more like museum exhibits. Only files marked for release to the public should be published, of course, with the rest being synchronized only to private servers.
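To make the tag idea above concrete, here's a rough Python sketch of the export artifact and the disposable SQLite cache. The JSON format and function names are placeholders, not a settled design, and the xattr reads themselves are elided; this only shows the "export, restore, derive a cache" round trip.

```python
import json
import sqlite3

def export_tags(tags_by_path):
    """Serialize {path: [tags]} to an export artifact that needs no
    filesystem xattr support. (JSON is a placeholder format.)"""
    return json.dumps(tags_by_path, indent=2, sort_keys=True)

def restore_tags(artifact):
    """Inverse of export_tags: recover the {path: [tags]} mapping."""
    return json.loads(artifact)

def build_cache(tags_by_path):
    """Derive a disposable SQLite cache from the tags to speed up queries.
    The xattrs stay the source of truth; this can be thrown away."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tags (path TEXT, tag TEXT)")
    db.executemany(
        "INSERT INTO tags VALUES (?, ?)",
        [(p, t) for p, tags in tags_by_path.items() for t in tags],
    )
    return db

def paths_with_tag(db, tag):
    """Query the cache for every path carrying a given tag."""
    rows = db.execute("SELECT path FROM tags WHERE tag = ?", (tag,))
    return sorted(r[0] for r in rows)
```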
2023-02-03: Back on LSD
I've decided to call my syncing daemon "Lykso's Syncing Daemon," in keeping with the naming convention of LUFS ("Lykso's Union File System"). The work that needed to be done to finish the things I'd overlooked in LUFS went very quickly, so now I'm back to working on LSD. My work with LUFS should help the work go smoothly, but the trick will be (as ever) carving out time to get it done. My wife has been very understanding and took more than her fair share of the baby-tending duties so I could wrap up LUFS, and now I've got to make up for it.
2023-02-02: Working on LUFS again
The changes in the syncing daemon setup meant I could remove some code from LUFS, which I did. In working on this, I discovered my LUFS implementation was not as complete as I'd thought. "Renaming" (i.e., moving) files required I implement an interface I'd previously overlooked, and I discovered that there were similar interfaces I'd overlooked for creating directories, listing extended attributes, setting extended attributes, and getting extended attributes. I've finished implementing the interfaces for moving files and creating directories and will shortly begin work on the extended-attribute-related interfaces.
I'm of a mind to introduce a notion of sidecar layers in LUFS, and have in fact done most of the required work so far, but upon further reflection I'm not sure it actually makes sense to introduce this notion at that layer. I'd hoped to avoid having to monitor the filesystem for changes, but I don't know if that's actually something I can work around here...
Addendum: I've given this some thought and I think I've worked out a way to keep sidecar files and main files synced without having to resort to filesystem monitoring, at the cost of one additional mount point. I've also determined that LUFS is absolutely the wrong place to be managing sidecar files, given that I want changes to underlying layers to flow over into other layers without having to go through the user's mountpoint. Going through the user's mountpoint does not work for Syncthing-initiated changes, and a change propagated to a "main" file from a "sidecar" file on another node (one without the "main" file) is exactly that. I'm going to remove the sidecar code from LUFS, continue implementing the missing interfaces, and then work on introducing the necessary sidecar code to the syncing daemon. Syncthing will be syncing through a FUSE mount created by the syncing daemon, allowing the daemon to catch and respond to all file events as they happen. In theory this will work as a cross-platform alternative to watching for file events and as a resource-light alternative to polling the directory tree for changes. (I'm not entirely committed to making this thing cross-platform, but I'm going to try to at least avoid putting up roadblocks to that goal.)
2023-01-20: Progress Report 2
Finished the setup code a few days ago. Was a lot simpler than I'd thought it would be. Not all the nodes have to be in setup mode, as I'd feared would be the case. Syncthing, it turns out, handles shared folder identity a lot better than I'd thought it would. Shared folders are considered the same if they have the same ID, which is set by the user, not by Syncthing. So all I had to do was slap an "lsd-" prefix on the SHA256 hash of the sorted list of node GUIDs for each nodeset folder and I had a folder ID which would be calculated the same way by each node.
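In Python terms, the derivation is just the following. (The newline separator I hash over here is incidental; any fixed serialization of the sorted GUIDs gives every node the same ID.)

```python
import hashlib

def nodeset_folder_id(guids):
    """Deterministic Syncthing folder ID for a nodeset: an "lsd-" prefix
    on the SHA256 hex digest of the sorted node GUIDs. Sorting first makes
    the ID independent of the order the GUIDs are listed in, so every node
    in the set computes the same folder ID independently."""
    digest = hashlib.sha256("\n".join(sorted(guids)).encode()).hexdigest()
    return "lsd-" + digest
```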
Now I'm trying to work out how to deal with messages and conflicts. The "correct" way would be to use CRDTs to resolve conflicts, but I'm not sure I want to go that hard. There's a whole research project in there, and I already have one stalled project, maybe two, that would qualify as that. I just want something that works well enough for now. So I think my decision is to just let things conflict. The conflicts not already handled by Syncthing are just exclusion and reduction conflicts. I think the way they'll play out without global synchronization works well enough for my use case. Thinking it through as I write, here are the cases that are apparent to me:
- Node A excludes node B while offline, node B excludes node A while connected to the rest of the nodeset, then node A comes online. The end result without rollbacks will be that both nodes are excluded. Since I don't bother trying to protect against denial of service by malicious nodes in my security model, this may even be what I intended to happen anyway, despite the fact that node B should not have been able to exclude node A if we had global consistency. Further, there's a rule that no node may exclude itself if it's the only remaining node with the given file, so there is no concern of accidentally removing the last copy of the file from the set of all the nodes.
- Similar to the previous case, node A excludes node B while offline, node B edits the file while connected, node A comes online. Syncthing should detect this as a conflict within the nodeset containing node B and handle it as configured, and node A's version of the file without node B's edits will exist at the same subpath in the folder for the nodeset without node B.
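A toy model of the first case, just to convince myself the result is order-independent. This is my own illustration, not code from the daemon; it also folds in the last-copy rule mentioned above.

```python
def apply_exclusions(holders, events):
    """Toy model of exclusion without rollbacks. `holders` is the set of
    nodes holding a file; each event is an (excluder, target) pair that
    removes its target from the set, except that the last remaining
    holder is never excluded (the last-copy rule). Because events only
    ever remove members, the converged result is the same no matter what
    order the events arrive in."""
    holders = set(holders)
    for _excluder, target in events:
        if holders != {target}:  # refuse to drop the last copy
            holders.discard(target)
    return holders
```

Running the A-excludes-B / B-excludes-A case through this in either arrival order leaves both excluded, which matches the outcome described above.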
There are more cases once we start digging into reduction, but I'm tired of typing all this out. The fact that this is a prototype meant to be operated by a single person with a more or less consistent understanding of their filesystem will hopefully be my saving grace here. The truly correct way to do all of this would be to ditch Syncthing and build up my own syncing scheme based on ElmerFS or similar. Or else I should join Bazil's efforts, given the years of work they've put into basically this exact problem...
Unedited stream-of-consciousness ahead:
As I consider this problem more, maybe my trouble is in the notion of nodes trying to exclude data from other nodes. This seems like a fundamentally flawed notion, like trying to turn back the hands of time. Clawing back information once shared. Perhaps nodes should only be able to exclude themselves, and, where other nodes are concerned, the notion should change to something more like requesting that a node exclude itself. Perhaps, given my reasons for writing this whole thing in the first place, I could do away with the notion of trying to "take back" files once shared altogether.
I suppose there's a fair notion of excluding other nodes from future changes to a file, which is what it seems like not attempting to resolve exclusion and reduction conflicts amounts to. Maybe this is the mental model I should try to employ.
2023-01-14: Progress Report
Made some progress, largely on the "day off" my wife gave me a couple days ago. (We've discussed each having a 24-hour period once a month where one of us takes care of the baby and the other can go be baby-free for that time, and I've just had the first of these.) I set up at a sort of beer hall with food carts around and got my head on straight about how to proceed with the setup step. Wrote a bit of code as well, but ran out of time before I could get everything done. (I did do other things as well; this did not take the whole 24 hours!)
I'll share the code once the functionality has caught up to the speculative README I wrote to organize my thoughts.
Not much progress lately. Partly due to the holidays, partly due to a mental block I have around doing anything with Docker again. I used to use Docker all the time, but it's been a while, and my last contact with it involved trying to get it to stop overriding my iptables rules, which left me feeling that it's a bit of a sprawling, batteries-included mess. So I've been put off using it, but I'm also unsure how else to automate testing. I may just... do it live. This is a bit of a toy project, and I'll be keeping backups of everything anyhow. This notion irks me too, though, because I've been bitten by that approach more times than I can remember.
2022-12-24: Setup is a Necessary Evil
I think the solution proposed in my last entry will work. I've settled on two verb pairs for now: exclude/include and reduce/restore. So there will have to be "reduced-by" and "excluded-by" tags.
Getting the initial folder and device synchronization going will be bothersome. I'd wanted to lazily create "node set" directories as they were needed, but I think I'm going to have to have a "setup" command that creates all the possible combinations and shares them all at once. So any time I want to add a node, all the nodes will have to be in "setup" mode and connected to each other, which... is not ideal. I don't think there is a way around it without replacing Syncthing with my own code, though.
Each node set will choose the member node with the "lowest" GUID as the "leader," meaning it will be the node responsible for creating a directory for the node set locally and then creating bidirectional shares with each node.
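Sketching the setup step in Python: enumerate every unordered combination of nodes and pick the lowest GUID in each as that set's leader. Function names here are hypothetical; this is the enumeration logic only, not the actual share-creation code.

```python
from itertools import combinations

def setup_plan(guids):
    """Map every unordered nodeset (sizes 1 through n) to its leader,
    the member with the "lowest" GUID, which will be responsible for
    creating the set's directory and the bidirectional shares."""
    plan = {}
    for size in range(1, len(guids) + 1):
        for combo in combinations(sorted(guids), size):
            plan[combo] = min(combo)
    return plan
```

With n nodes this produces 2^n - 1 nodesets, which is why I'd rather create them all once in a setup phase than lazily.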
I think this is going to have to be a functioning prototype, with a proper implementation taking control of the synchronization layer as well. Hacking together something based on ElmerFS seems like it might be a viable route forward, should I still have the energy to do this properly once the prototype is done.
If I don't, I suppose I'll at least have something usable, if a bit hackish.
2022-12-17: I Did Say It Was Tentative
I've just realized a node can request files it should not have access to, and there is no reliable way to prevent this in my current design. As I've started writing out possibilities here, I think my best option might be to write an "excluded-by" tag with a reference to the GUID of the node that excluded the file, to be removed when the file is restored, and without which the file cannot be restored by the requesting node.
The first piece is more or less ready. Lykso's Union Filesystem is a FUSE-based union filesystem with a couple of unusual customizations that make it work a bit better for my requirements than the more common unionfs-fuse would. First, if there is an unlink followed within one second by a create at the same path, the new file is created in the same directory tree the old file was deleted from, as it would be with a simple write. This ensures that programs, such as vim, that unlink a file and then create a new file at the same path when saving don't also end up moving that file from the lower directory tree it originally resided in to the top one. Second, when a directory is unlinked, it deletes everything at the same path across all the mounted directory trees. This is to avoid ending up with a bunch of empty directories across the mounted directory trees after an `rm -rf`.
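The unlink-then-recreate rule boils down to something like the following standalone sketch (with an injectable clock so it's testable); this is an illustration of the policy, not the actual FUSE code.

```python
import time

class CreatePlacementPolicy:
    """Sketch of the unlink-then-recreate rule: if a path is recreated
    within `window` seconds of being unlinked, place the new file in the
    directory tree (layer) the old file lived in; otherwise it goes to
    the top layer as usual. Names here are hypothetical."""

    def __init__(self, top_layer, window=1.0, clock=time.monotonic):
        self.top = top_layer
        self.window = window
        self.clock = clock
        self._recent = {}  # path -> (originating layer, unlink time)

    def unlinked(self, path, layer):
        """Record an unlink so a quick recreate can find its old layer."""
        self._recent[path] = (layer, self.clock())

    def layer_for_create(self, path):
        """Decide which layer a newly created file belongs in."""
        entry = self._recent.pop(path, None)
        if entry is not None and self.clock() - entry[1] <= self.window:
            return entry[0]  # vim-style save: keep it in its old tree
        return self.top
```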
There may be bugs lurking in its corners and there may be further improvements to make, but so far it seems to do what I need it to do.
2022-12-10: A Tentative Design
I've spent the past couple of days weighing my options, arriving a few minutes ago at what I think may be my best shot at achieving the first two of these goals with minimal programming on my part.
Let there be a set of directories, one for each unordered combination of devices n through 1, where n is the number of devices. Each directory is synchronized via Syncthing with each device referenced in each directory's set of devices. A device in a set may choose to exclude a file from another device in that set by moving the file to the directory representing the set without the excluded device. It may likewise reduce a file to its metadata (e.g., external references) by first excluding the file and then putting a metadata file in the file's old location. (N.B.: I may instead make a separate directory tree for these files to make it easier to discover reduced files on a device.) Restoration is achieved by removing the metadata file and returning the original file to its original set.
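A sketch of the bookkeeping behind that scheme; the directory naming convention here is invented purely for illustration.

```python
def nodeset_dir(devices):
    """Directory name for the nodeset containing exactly these devices.
    (The "sync-" prefix and "+" separator are placeholders.)"""
    return "sync-" + "+".join(sorted(devices))

def exclusion_target(current_set, excluded_device):
    """Where a file moves when one device in its current set is excluded:
    the directory for the same set minus the excluded device."""
    remaining = set(current_set) - {excluded_device}
    if not remaining:
        raise ValueError("cannot exclude the only device in the set")
    return nodeset_dir(remaining)
```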
Let there be a separate directory for messages between devices, synchronized between all the devices. A device may request a file's exclusion from, reduction in, or restoration to their sets by creating a message file containing a signed request. Message files shall be named via hybrid logical clock and processed in order. When the message file is received by the other devices, a daemon running on the other devices checks the signature and honors the request, deleting the message file after.
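A minimal hybrid logical clock for naming those message files might look like the following sketch. The receive-side merge of remote timestamps is omitted, and the filename format is a placeholder; the point is just that zero-padded components make lexicographic filename order match issue order.

```python
import time

class HLC:
    """Minimal hybrid logical clock: a physical-time component plus a
    logical counter that breaks ties within the same second."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.l = 0  # last physical timestamp seen
        self.c = 0  # logical counter within that timestamp

    def tick(self):
        """Advance the clock for a local event (send-side only; merging
        timestamps from received messages is not shown here)."""
        pt = int(self.clock())
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def filename(self, node_guid):
        """Message filename whose lexicographic order matches HLC order,
        thanks to fixed-width zero padding. Format is a placeholder."""
        l, c = self.tick()
        return f"{l:014d}-{c:06d}-{node_guid}.msg"
```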
Metadata files representing "reduced" files are "sidecars," which I had hoped to avoid, but they do not appear to be avoidable, given the "reduce and restore" mechanism, without writing a custom file synchronization layer to replace Syncthing. The task of keeping the locations and properties of these sidecars in sync with their files will likely fall to the message processing daemon. Unique IDs will also be assigned, via extended attribute, to each file/sidecar pair to assist in repairing file/sidecar synchronization failures.
A device's directories are layered atop one another via a custom FUSE-based union filesystem that ensures each file remains in its originating directory tree when modified or moved, providing a unified view into the structure and allowing each device to have a different "default sync set" for files created locally. A special utility will be provided to make it easy to exclude, reduce, or restore files and to view any responses (e.g., error messages) originating from requests made via the message-passing directory.
.stignore files may of course still be used by nodes to exclude certain files, but this scheme is meant to avoid accidentally syncing large files outside of ignored paths, to avoid having to write utilities to manage .stignores, and to keep the .stignore files from possibly growing unmanageably long. It also has the advantage of providing a high degree of granularity for "source" devices regarding which other devices each file or directory structure should be synced to. Finally, the message-based reduction mechanism allows for the prevention of accidentally reducing away the last copy of a file.