SDF went down over a month ago because they ran out of disk space. I’m calling this a software security bug because it caused a needless loss of availability. It should have been only a partial loss of availability.
What should have happened
When the server detects disk space getting low, it should not just crap out. It should switch into a read-only mode. In read-only mode, users can still log in and access existing content, but new posts, comments, and edits are disabled. The node should also signal to other fedi nodes that connect to it that it is in read-only mode.
Perhaps most importantly, users could then be informed /in band/ that the server still lives. We often see fedi nodes simply vanish out of the blue, and users suddenly lose their data and relationships. Logging into a read-only server would settle some nerves… keep people’s blood pressure down.
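A minimal sketch of the read-only switch described above, in Python for brevity (Lemmy itself is not written in Python). The 10% threshold, the operation names, and the wording of the notice are all hypothetical, picked only for illustration; nothing here is Lemmy's actual API.

```python
import shutil
from enum import Enum


class Mode(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"


# Hypothetical threshold: go read-only when less than 10% of the disk is free.
READ_ONLY_BELOW_FRACTION = 0.10

# Hypothetical operation names for the user actions that create or change content.
WRITE_OPS = {"create_post", "create_comment", "edit_post", "edit_comment"}


def choose_mode(free_bytes: int, total_bytes: int,
                threshold: float = READ_ONLY_BELOW_FRACTION) -> Mode:
    """Pick the server mode from the current free-space fraction."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if free_bytes / total_bytes < threshold:
        return Mode.READ_ONLY
    return Mode.READ_WRITE


def gate_request(op: str, mode: Mode) -> tuple[int, str]:
    """Refuse content-changing operations in read-only mode; allow the rest."""
    if mode is Mode.READ_ONLY and op in WRITE_OPS:
        # The in-band notice: the server is alive, just not accepting writes.
        return 503, "Server is in read-only mode (low disk space). Existing content is still available."
    return 200, "ok"


if __name__ == "__main__":
    usage = shutil.disk_usage("/")  # named tuple with total, used, free in bytes
    mode = choose_mode(usage.free, usage.total)
    print(mode.value, gate_request("create_post", mode))
```

Logging in and reading go through unchanged, so users get the "server still lives" message instead of a dead host.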
Lemmy itself doesn’t use disk storage; it just writes to a DB. If the disk is full, then that’s a Postgres issue, not a Lemmy one. Lemmy might not even know the disk is full if the DB is on a different machine.
Interesting that the Lemmy server has no disk access of its own (in effect, it just uses inter-process communication). Apparently it is possible to query a DB for the space remaining wherever that DB lives.
So IMO it is still a Lemmy issue. In the event that Postgres cannot handle the query (which I have not checked), it is still a Lemmy issue because Lemmy should not choose a DB that cannot provide storage info.
That’s a Microsoft sql server command…
Oof… well, hopefully that was just a bad example. Hopefully a FOSS SQL server exists with the same capability.
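For what it's worth, a hedged sketch of what a Postgres-side check could look like. As far as I know, Postgres has no plain SQL function that reports the free space of the host filesystem, but `pg_database_size()` does report how much space a database occupies. The sketch compares that figure against a quota that the admin would have to configure; the quota, the 90% fraction, and the function names are assumptions for illustration, and `cursor` is any Python DB-API cursor connected to the database.

```python
# Real Postgres functions: pg_database_size() and current_database().
DB_SIZE_SQL = "SELECT pg_database_size(current_database())"


def database_size_bytes(cursor) -> int:
    """Return the on-disk size of the connected database, in bytes."""
    cursor.execute(DB_SIZE_SQL)
    row = cursor.fetchone()
    return int(row[0])


def over_quota(size_bytes: int, quota_bytes: int, fraction: float = 0.9) -> bool:
    """True once the database has used `fraction` of an admin-configured quota.

    The quota is an assumption: the admin has to supply it, because the
    size query alone says nothing about how much room the disk has left.
    """
    return size_bytes >= quota_bytes * fraction
```

This works even when the DB lives on a different machine, since the check goes over the same connection Lemmy already uses, but it only approximates "disk nearly full" as well as the quota is chosen.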
Depends on the dependencies. I’m not sure how Lemmy is set up, but even in read-only mode you need to write logs and auth calls.
That’s not an obstacle. It’s a matter of where to draw the line for switching to read-only mode. A good design would obviously switch to read-only (w.r.t. user ops) when there is still plenty of space for logs.
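One way to draw that line, sketched under made-up numbers: size the read-only threshold from how much the server still has to write (logs, auth records) while user writes are off. The log growth rate, the reserve period, and the overhead figure are hypothetical inputs an admin would have to measure.

```python
def read_only_threshold_bytes(log_bytes_per_day: int, reserve_days: int,
                              fixed_overhead_bytes: int = 0) -> int:
    """Free space to hold back so logs and auth writes keep working.

    All inputs are operator estimates, not values Lemmy provides.
    """
    return log_bytes_per_day * reserve_days + fixed_overhead_bytes


def should_be_read_only(free_bytes: int, threshold_bytes: int) -> bool:
    """Disable user writes once free space falls to the reserved amount."""
    return free_bytes <= threshold_bytes


if __name__ == "__main__":
    MIB = 1024 ** 2
    # Hypothetical figures: 200 MiB of logs per day, 30 days of headroom,
    # plus 512 MiB for sessions and other housekeeping writes.
    threshold = read_only_threshold_bytes(200 * MIB, 30, 512 * MIB)
    print(threshold // MIB, "MiB reserved")
    print(should_be_read_only(free_bytes=4096 * MIB, threshold_bytes=threshold))
```

The point of the reserve is that the switch happens while there is still a month of log space left, not at the moment the disk hits zero.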
True, I’m a little bitter from coming into companies that do cloud just because and get themselves into issues like having one big partition that fills up and refuses to boot.

