17 June 2022

Antoine Beaupré: Matrix notes

I have some concerns about Matrix (the protocol, not the movie that came out recently, although I do have concerns about that as well). I've been watching the project for a long time, and it seems like a promising alternative to many protocols like IRC, XMPP, and Signal. This review may sound a bit negative, because it focuses on those concerns.

I am the operator of an IRC network and people keep asking me to bridge it with Matrix. I have myself considered just giving up on IRC and converting to Matrix. This is a living document exploring my research of that problem space. The TL;DR is that no, I'm not setting up a bridge just yet, and I'm still on IRC.

This article was written over the course of the last three months, but I have been watching the Matrix project for years (my logs seem to say 2016 at least). The article is rather long. It will likely take you half an hour to read, so copy this over to your ebook reader, your tablet, or dead trees, and lean back and relax as I show you around the Matrix. Or, alternatively, just jump to a section that interests you, most likely the conclusion.

Introduction to Matrix

Matrix is an "open standard for interoperable, decentralised, real-time communication over IP. It can be used to power Instant Messaging, VoIP/WebRTC signalling, Internet of Things communication - or anywhere you need a standard HTTP API for publishing and subscribing to data whilst tracking the conversation history". It's also (when compared with XMPP) "an eventually consistent global JSON database with an HTTP API and pubsub semantics - whilst XMPP can be thought of as a message passing protocol." According to their FAQ, the project started in 2014, has about 20,000 servers, and millions of users. Matrix works over HTTPS but over a special port: 8448.

Security and privacy

I have some concerns about the security promises of Matrix. It's advertised as "secure" with "E2E [end-to-end] encryption", but how does it actually work?

Data retention defaults

One of my main concerns with Matrix is data retention, which is a key part of security in a threat model where (for example) a hostile state actor wants to surveil your communications and can seize your devices. On IRC, servers don't actually keep messages all that long: they pass them along to other servers and clients as fast as they can, only keep them in memory, and move on to the next message. There are no concerns about data retention on messages (and their metadata) other than the network layer. (I'm ignoring the issues with user registration, which is a separate, if valid, concern.) Obviously, a hostile server could log everything passing through it, but IRC federations are normally tightly controlled. So, if you trust your IRC operators, you should be fairly safe. Obviously, clients can (and often do, even if OTR is configured!) log all messages, but this is generally not the default. Irssi, for example, does not log by default. IRC bouncers are more likely to log to disk, of course, to be able to do what they do.

Compare this to Matrix: when you send a message to a Matrix homeserver, that server first stores it in its internal SQL database. Then it will transmit that message to all clients connected to that server and room, and to all other servers that have clients connected to that room. Those remote servers, in turn, will keep a copy of that message and all its metadata in their own database, by default forever. On encrypted rooms those messages are encrypted, but not their metadata. There is a mechanism to expire entries in Synapse, but it is not enabled by default. So one should generally assume that a message sent on Matrix is never expired.

GDPR in the federation

But even if that setting was enabled by default, how do you control it? This is a fundamental problem of the federation: if any user is allowed to join a room (which is the default), those users' servers will log all content and metadata from that room. That includes private, one-on-one conversations, since those are essentially rooms as well. In the context of the GDPR, this is really tricky: who is the responsible party (known as the "data controller") here? It's basically any yahoo who fires up a home server and joins a room. In a federated network, one has to wonder whether GDPR enforcement is even possible at all. But in Matrix in particular, if you want to enforce your right to be forgotten in a given room, you would have to:
  1. enumerate all the users that ever joined the room while you were there
  2. discover all their home servers
  3. start a GDPR procedure against all those servers
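The first two steps of that list are at least mechanical. Here is a minimal sketch, assuming Matrix user IDs of the form `@localpart:server` (optionally with a port); the user IDs and the helper name `homeservers_of` are made up for illustration, and step 3 remains a legal process rather than a technical one.

```python
def homeservers_of(user_ids):
    """Return the sorted set of home servers behind a list of Matrix user IDs.

    Assumes IDs look like "@localpart:server.name" (optionally with a port);
    the server part is everything after the first colon.
    """
    servers = set()
    for user_id in user_ids:
        if not user_id.startswith("@") or ":" not in user_id:
            raise ValueError(f"not a Matrix user ID: {user_id!r}")
        _localpart, server = user_id.split(":", 1)
        servers.add(server.lower())
    return sorted(servers)


# Hypothetical membership history of one room (made-up IDs).
members = [
    "@alice:example.org",
    "@bob:example.com",
    "@carol:example.org",
    "@dave:chat.example.net:8448",
]
for server in homeservers_of(members):
    print("GDPR request needed for:", server)
```

Even in this toy room, four users already mean three separate parties to contact.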
I recognize this is a hard problem to solve while still keeping an open ecosystem. But I believe that Matrix should have much stricter defaults towards data retention than right now. Message expiry should be enforced by default, for example. (Note that there are also redaction policies that could be used to implement part of the GDPR automatically, see the privacy policy discussion below on that.)

Also keep in mind that, in the brave new peer-to-peer world that Matrix is heading towards, the boundary between server and client is likely to be fuzzier, which would make applying the GDPR even more difficult.

Update: this comment links to this post (in German) which apparently studied the question and concluded that Matrix is not GDPR-compliant.

In fact, maybe Synapse should be designed so that there's no configurable flag to turn off data retention. A bit like how most system loggers in UNIX (e.g. syslog) come with a log retention system that typically rotates logs after a few weeks or months. Historically, this was designed to keep hard drives from filling up, but it also has the added benefit of limiting the amount of personal information kept on disk in this modern day. (Arguably, syslog doesn't rotate logs on its own, but, say, Debian GNU/Linux, as an installed system, does have log retention policies well defined for installed packages, and those can be discussed. And "no expiry" is definitely a bug.)

privacy policy

When I first looked at Matrix, five years ago, was called and had a rather dubious privacy policy:
We currently use cookies to support our use of Google Analytics on the Website and Service. Google Analytics collects information about how you use the Website and Service. [...] This helps us to provide you with a good experience when you browse our Website and use our Service and also allows us to improve our Website and our Service.
When I asked Matrix people about why they were using Google Analytics, they explained this was for development purposes and they were aiming for velocity at the time, not privacy (paraphrasing here). They also included a "free to snitch" clause:
If we are or believe that we are under a duty to disclose or share your personal data, we will do so in order to comply with any legal obligation, the instructions or requests of a governmental authority or regulator, including those outside of the UK.
Those are really broad terms, above and beyond what is typically expected legally. Like the current retention policies, such user tracking and ... "liberal" collaboration practices with the state set a bad precedent for other home servers. Thankfully, since the above policy was published (2017), the GDPR was "implemented" (2018) and it seems like both the privacy policy and the privacy policy have been somewhat improved since. Notable points of the new privacy policies:
  • the "federation" section actually outlines that "Federated homeservers and Matrix clients which respect the Matrix protocol are expected to honour these controls and redaction/erasure requests, but other federated homeservers are outside of the span of control of Element, and we cannot guarantee how this data will be processed"
  • 2.6: users under the age of 16 should not use the service
  • 2.10: Upcloud, Mythic Beast, Amazon, and CloudFlare possibly have access to your data (it's nice to at least mention this in the privacy policy: many providers don't even bother admitting to this kind of delegation)
  • Element 2.2.1: mentions many more third parties (Twilio, Stripe, Quaderno, LinkedIn, Twitter, Google, Outplay, PipeDrive, HubSpot, Posthog, Sentry, and Matomo (phew!)) used when you are paying for hosting
I'm not super happy with all the trackers they have on the Element platform, but then again you don't have to use that service. Your favorite homeserver (assuming you are not on probably has their own Element deployment, hopefully without all that garbage. Overall, this is all a huge improvement over the previous privacy policy, so hats off to the Matrix people for figuring out a reasonable policy in such a tricky context. I particularly like this bit:
We will forget your copy of your data upon your request. We will also forward your request to be forgotten onto federated homeservers. However - these homeservers are outside our span of control, so we cannot guarantee they will forget your data.
It's great they implemented those mechanisms and, after all, if there's a hostile party in there, nothing can prevent them from using screenshots to just exfiltrate your data away from the client side anyways, even with services typically seen as more secure, like Signal.

As an aside, I also appreciate that has a fairly decent code of conduct, based on the TODO CoC which checks all the boxes in the geekfeminism wiki.

Metadata handling

Overall, privacy protections in Matrix mostly concern message contents, not metadata. In other words, who's talking with who, when and from where is not well protected. Compared to a tool like Signal, which goes to great lengths to anonymize that data with features like private contact discovery, disappearing messages, sealed senders, and private groups, Matrix is definitely behind. (Note: there is an issue open about message lifetimes in Element since 2020, but it's not even at the MSC stage yet.)

This is a known issue (opened in 2019) in Synapse, but this is not just an implementation issue, it's a flaw in the protocol itself. Home servers keep join/leave of all rooms, which gives clear text information about who is talking to whom. Synapse logs may also contain privately identifiable information that home server admins might not be aware of in the first place. Those log rotation policies are separate from the server-level retention policy, which may be confusing for a novice sysadmin.

Combine this with the federation: even if you trust your home server to do the right thing, the second you join a public room with third-party home servers, those ideas kind of get thrown out because those servers can do whatever they want with that information. Again, a problem that is hard to solve in any federation.

To be fair, IRC doesn't have a great story here either: any client knows not only who's talking to who in a room, but also typically their client IP address. Servers can (and often do) obfuscate this, but often that obfuscation is trivial to reverse. Some servers do provide "cloaks" (sometimes automatically), but that's kind of a "slap-on" solution that actually moves the problem elsewhere: now the server knows a little more about the user.

Overall, I would worry much more about a Matrix home server seizure than an IRC or Signal server seizure. Signal does get subpoenas, and they can only give out a tiny bit of information about their users: their phone number, and their registration and last connection dates. Matrix carries a lot more information in its database.

Amplification attacks on URL previews

I (still!) run an Icecast server and sometimes share links to it on IRC which, obviously, also ends up on (more than one!) Matrix home servers because some people connect to IRC using Matrix. This, in turn, means that Matrix will connect to that URL to generate a link preview. I feel this outlines a security issue, especially because those sockets would be kept open seemingly forever. I tried to warn the Matrix security team but somehow, I don't think this issue was taken very seriously. Here's the disclosure timeline:
  • January 18: contacted Matrix security
  • January 19: response: already reported as a bug
  • January 20: response: can't reproduce
  • January 31: timeout added, considered solved
  • January 31: I respond that I believe the security issue is underestimated, ask for clearance to disclose
  • February 1: response: asking for two weeks delay after the next release (1.53.0) including another patch, presumably in two weeks' time
  • February 22: Matrix 1.53.0 released
  • April 14: I notice the release, ask for clearance again
  • April 14: response: referred to the public disclosure
There are a couple of problems here:
  1. the bug was publicly disclosed in September 2020, and not considered a security issue until I notified them, and even then, I had to insist
  2. no clear disclosure policy timeline was proposed or seems established in the project (there is a security disclosure policy but it doesn't include any predefined timeline)
  3. I wasn't informed of the disclosure
  4. the actual solution is a size limit (10MB, already implemented), a time limit (30 seconds, implemented in PR 11784), and a content type allow list (HTML, "media" or JSON, implemented in PR 11936), and I'm not sure it's adequate
  5. (pure vanity:) I did not make it to their Hall of fame
I'm not sure those solutions are adequate because they all seem to assume a single home server will pull that one URL for a little while then stop. But in a federated network, many (possibly thousands) home servers may be connected in a single room at once. If an attacker drops a link into such a room, all those servers would connect to that link all at once. This is an amplification attack: a small amount of traffic will generate a lot more traffic to a single target. It doesn't matter there are size or time limits: the amplification is what matters here. It should also be noted that clients that generate link previews have more amplification because they are more numerous than servers. And of course, the default Matrix client (Element) does generate link previews as well. That said, this is possibly not a problem specific to Matrix: any federated service that generates link previews may suffer from this. I'm honestly not sure what the solution is here. Maybe moderation? Maybe link previews are just evil? All I know is there was this weird bug in my Icecast server and I tried to ring the bell about it, and it feels it was swept under the rug. Somehow I feel this is bound to blow up again in the future, even with the current mitigation.
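To put rough numbers on that amplification, here is a back-of-the-envelope sketch. The 10MB cap comes from the mitigation described above; the number of servers (1000) and the size of the attacker's message (200 bytes) are assumptions picked for illustration, not measurements.

```python
def amplification(servers, bytes_fetched_per_server, attacker_bytes):
    """Return (total bytes requested from the target, amplification factor)."""
    total = servers * bytes_fetched_per_server
    return total, total / attacker_bytes


# Assumed numbers: 1000 home servers in the room, a 200-byte message
# containing the link, and every server downloading up to the 10MB cap.
total, factor = amplification(
    servers=1000,
    bytes_fetched_per_server=10 * 1000 * 1000,
    attacker_bytes=200,
)
print(f"{total / 1e9:.0f} GB requested from the target, {factor:,.0f}x amplification")
```

Lowering the size cap shrinks the total, but the multiplier still grows linearly with the number of servers (or clients) in the room, which is why per-request limits alone do not remove the problem.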

Moderation

In Matrix like elsewhere, moderation is a hard problem. There is a detailed moderation guide and much of this problem space is actively worked on in Matrix right now. A fundamental problem with moderating a federated space is that a user banned from a room can rejoin the room from another server. This is why spam is such a problem in email, and why IRC networks stopped federating ages ago (see the IRC history for that fascinating story).

The mjolnir bot

The mjolnir moderation bot is designed to help with some of those things. It can kick and ban users, redact all of a user's messages (as opposed to one by one), all of this across multiple rooms. It can also subscribe to a federated block list published by to block known abusers (users or servers). Bans are pretty flexible and can operate at the user, room, or server level. Matrix people suggest making the bot admin of your channels, because you can't take back admin from a user once given.

The command-line tool

There's also a new command line tool designed to do things like:
  • System notify users (all users/users from a list, specific user)
  • delete sessions/devices not seen for X days
  • purge the remote media cache
  • select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
  • purge history of these rooms
  • shutdown rooms
This tool and Mjolnir are based on the admin API built into Synapse.

Rate limiting

Synapse has pretty good built-in rate-limiting which blocks repeated login, registration, joining, or messaging attempts. It may also end up throttling servers on the federation based on those settings.
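As an illustration of what such a limit does, here is a generic token bucket. This is a sketch of the general technique, not Synapse's code: the `TokenBucket` class is hypothetical, and its two parameters merely echo the `per_second` and `burst_count` names used in Synapse's rate limiting settings.

```python
class TokenBucket:
    """Allow `burst_count` actions at once, refilled at `per_second` actions/second."""

    def __init__(self, per_second, burst_count):
        self.per_second = per_second
        self.burst_count = burst_count
        self.tokens = float(burst_count)
        self.last = 0.0

    def allow(self, now):
        """Return True if an action at time `now` (in seconds) is permitted."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never above the burst size.
        self.tokens = min(self.burst_count, self.tokens + elapsed * self.per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(per_second=0.5, burst_count=3)
# Five messages sent in the same instant: only the burst gets through.
print([bucket.allow(now=0.0) for _ in range(5)])
# Two seconds later, exactly one more message is allowed.
print(bucket.allow(now=2.0))
```

The same shape of limiter can be keyed per user, per IP address, or per remote server, which is how a limit meant for abusive clients can also end up throttling a busy federated peer.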

Fundamental federation problems

Because users joining a room may come from another server, room moderators are at the mercy of the registration and moderation policies of those servers. Matrix is like IRC's +R mode ("only registered users can join") by default, except that anyone can register their own homeserver, which makes this limited. Server admins can block IP addresses and home servers, but those tools are not easily available to room admins. There is an API ( in /devtools) but it is not reliable (thanks Austin Huang for the clarification).

Matrix has the concept of guest accounts, but it is not used very much, and virtually no client or homeserver supports it. This contrasts with the way IRC works: by default, anyone can join an IRC network even without authentication. Some channels require registration, but in general you are free to join and look around (until you get blocked, of course).

I have seen anecdotal evidence (CW: Twitter, nitter link) that "moderating bridges is hell", and I can imagine why. Moderation is already hard enough on one federation; when you bridge a room with another network, you inherit all the problems from that network but without the entire abuse control tools from the original network's API...

Room admins

Matrix, in particular, has the problem that room administrators (which have the power to redact messages, ban users, and promote other users) are bound to their Matrix ID which is, in turn, bound to their home servers. This implies that a home server administrator could (1) impersonate a given user and (2) use that to hijack the room. So in practice, the home server is the trust anchor for rooms, not the user themselves.

That said, if the administrator of server B hijacks user joe on server B, they will hijack that room on that specific server. This will not (necessarily) affect users on the other servers, as servers could refuse parts of the updates or ban the compromised account (or server).

It does seem like a major flaw that room credentials are bound to Matrix identifiers, as opposed to the E2E encryption credentials. In an encrypted room even with fully verified members, a compromised or hostile home server can still take over the room by impersonating an admin. That admin (or even a newly minted user) can then send events or listen on the conversations.

This is even more frustrating when you consider that Matrix events are actually signed and therefore have some authentication attached to them, acting like some sort of Merkle tree (as it contains a link to previous events). That signature, however, is made from the homeserver PKI keys, not the client's E2E keys, which makes E2E feel like it has been "bolted on" later.
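The consequence of signing with the server's key can be shown with a toy model of hash-linked events. This is a sketch under loud assumptions: an HMAC with a made-up shared secret stands in for the real public-key signatures, the event layout is invented, and `make_event` and `verify` are hypothetical helpers, not Matrix APIs.

```python
import hashlib
import hmac
import json


def make_event(server_key, sender, body, prev_event=None):
    """Build an event linked to its predecessor and authenticated by the server key."""
    event = {
        "sender": sender,
        "body": body,
        "prev": prev_event["hash"] if prev_event else None,
    }
    payload = json.dumps(event, sort_keys=True).encode()
    event["hash"] = hashlib.sha256(payload).hexdigest()
    # Stand-in for a signature: keyed by the *server*, not by the user's device.
    event["signature"] = hmac.new(server_key, payload, hashlib.sha256).hexdigest()
    return event


def verify(server_key, event):
    """Check the event's hash and signature against the server key."""
    unsigned = {k: event[k] for k in ("sender", "body", "prev")}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(server_key, payload, hashlib.sha256).hexdigest()
    return (hashlib.sha256(payload).hexdigest() == event["hash"]
            and hmac.compare_digest(expected, event["signature"]))


server_b_key = b"held by the admin of server B"  # made-up key
first = make_event(server_b_key, "@joe:b.example", "hello")
# The admin of server B never touched joe's device, yet can speak as joe:
forged = make_event(server_b_key, "@joe:b.example", "ban everyone", prev_event=first)
print(verify(server_b_key, forged))
```

The forged event verifies because nothing in the chain depends on a secret that only joe holds; binding events to the client's E2E keys would be what closes that gap.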

Availability

While Matrix has a strong advantage over Signal in that it's decentralized (so anyone can run their own homeserver), I couldn't find an easy way to run a "multi-primary" setup, or even a "redundant" setup (even if with a single primary backend), short of going full-on "replicate PostgreSQL and Redis data", which is not typically for the faint of heart.

How this works in IRC

On IRC, it's quite easy to set up redundant nodes. All you need is:
  1. a new machine (with its own public address with an open port)
  2. a shared secret (or certificate) between that machine and an existing one on the network
  3. a connect block on both servers
That's it: the node will join the network and people can connect to it as usual and share the same user/namespace as the rest of the network. The servers take care of synchronizing state: you do not need to worry about replicating a database server. (Now, experienced IRC people will know there's a catch here: IRC doesn't have authentication built in, and relies on "services" which are basically bots that authenticate users (I'm simplifying, don't nitpick). If that service goes down, the network still works, but then people can't authenticate, and they can start doing nasty things like steal people's identity if they get knocked offline. But still: basic functionality still works: you can talk in rooms and with users that are on the reachable network.)

User identities

Matrix is more complicated. Each "home server" has its own identity namespace: a specific user (say is bound to that specific home server. If that server goes down, that user is completely disconnected. They could register a new account elsewhere and reconnect, but then they basically lose all their configuration: contacts, joined channels are all lost. (Also notice how the Matrix IDs don't look like a typical user address like an email in XMPP. They at least did their homework and got the allocation for the scheme.)

Rooms

Users talk to each other in "rooms", even in one-to-one communications. (Rooms are also used for other things like "spaces", they're basically used for everything, think "everything is a file" kind of tool.) For rooms, home servers act more like IRC nodes in that they keep a local state of the chat room and synchronize it with other servers. Users can keep talking inside a room if the server that originally hosts the room goes down. Rooms can have a local, server-specific "alias" so that, say, is also visible as on the home server. Both addresses refer to the same underlying room. (Finding this in the Element settings is not obvious though, because that "alias" is actually called a "local address" there. So to create such an alias (in Element), you need to go in the room settings' "General" section, "Show more" in "Local address", then add the alias name (e.g. foo), and then that room will be available on your homeserver as So a room doesn't belong to a server, it belongs to the federation, and anyone can join the room from any server (if the room is public, or if invited otherwise). You can create a room on server A and when a user from server B joins, the room will be replicated on server B as well. If server A fails, server B will keep relaying traffic to connected users and servers.

A room is therefore not fundamentally addressed with the above alias, instead, it has an internal Matrix ID, which is basically a random string. It has a server name attached to it, but that was made just to avoid collisions. That can get a little confusing. For example, the room is an alias on the server, but the room ID is ! That's because the room was created on, but the preferred branding is now.

As an aside, rooms, by default, live forever, even after the last user quits. There's an admin API to delete rooms and a tombstone event to redirect to another one, but neither has a GUI yet. The latter is part of MSC1501 ("Room version upgrades") which allows a room admin to close a room, with a message and a pointer to another room.
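The relationship between room IDs, aliases, and replicas can be summed up in a toy model. This is only a sketch: the `Room` class, the room ID, the alias names, and the server names are all made up, and real servers exchange events rather than share one Python object.

```python
class Room:
    """Toy model: a room is identified by an opaque ID, not by its aliases."""

    def __init__(self, room_id, created_on):
        self.room_id = room_id
        self.servers = {created_on}   # servers holding a replica of the room
        self.aliases = {}             # alias -> room ID, one per server namespace

    def join_from(self, server):
        """A user joining from `server` causes the room to be replicated there."""
        self.servers.add(server)

    def add_alias(self, alias):
        self.aliases[alias] = self.room_id

    def reachable_servers(self, down=()):
        """Return the replicas that keep relaying when some servers are down."""
        return sorted(self.servers - set(down))


room = Room("!opaque123:a.example", created_on="a.example")
room.join_from("b.example")
room.join_from("c.example")
room.add_alias("#foo:a.example")
room.add_alias("#foo:b.example")

# Both aliases resolve to the same underlying room ID.
print(set(room.aliases.values()))
# The room survives the loss of the server it was created on.
print(room.reachable_servers(down=["a.example"]))
```

Note that the server name embedded in the room ID plays no role in `reachable_servers`: it only keeps IDs from colliding, which matches the description above.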

Spaces

Discovering rooms can be tricky: there is a per-server room directory, but people are trying to deprecate it in favor of "Spaces". Room directories were ripe for abuse: anyone can create a room, so anyone can show up in there. It's possible to restrict who can add aliases, but anyways directories were seen as too limited. In contrast, a "Space" is basically a room that's an index of other rooms (including other spaces), so existing moderation and administration mechanisms that work in rooms can (somewhat) work in spaces as well. This enables a room directory that works across federation, regardless of which server they were originally created on.

New users can be added to a space or room automatically in Synapse. (Existing users can be told about the space with a server notice.) This gives admins a way to pre-populate a list of rooms on a server, which is useful to build clusters of related home servers, providing some sort of redundancy, at the room -- not user -- level.
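Since a space is just an index that may point at other spaces, listing its rooms is a tree walk. Here is a minimal sketch, assuming the parent-to-children links have already been collected into a plain dictionary; the room IDs and the `rooms_in_space` helper are hypothetical.

```python
def rooms_in_space(space, children, seen=None):
    """Yield every room reachable from `space`, descending into nested spaces."""
    seen = {space} if seen is None else seen
    for child in children.get(space, []):
        if child in seen:
            continue              # spaces can reference each other; avoid loops
        seen.add(child)
        if child in children:     # the child is itself a space
            yield from rooms_in_space(child, children, seen)
        else:
            yield child


# Made-up hierarchy: "!top" contains a nested space "!dev", which points back at "!top".
children = {
    "!top": ["!dev", "!general"],
    "!dev": ["!synapse", "!top"],
}
print(list(rooms_in_space("!top", children)))
```

The `seen` set matters because nothing stops two spaces from listing each other, and a naive recursive walk would never terminate on such a loop.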

Home servers

So while you can work around a home server going down at the room level, there's no such thing at the home server level, for user identities. So if you want those identities to be stable in the long term, you need to think about high availability. One limitation is that the domain name (e.g. must never change in the future, as renaming home servers is not supported.

The documentation used to say you could "run a hot spare" but that has been removed. Last I heard, it was not possible to run a high-availability setup where multiple, separate locations could replace each other automatically. You can have high performance setups where the load gets distributed among workers, but those are based on a shared database (Redis and PostgreSQL) backend.

So my guess is it would be possible to create a "warm" spare server of a Matrix home server with regular PostgreSQL replication, but that is not documented in the Synapse manual. This sort of setup would also not be useful to deal with networking issues or denial of service attacks, as you will not be able to spread the load over multiple network locations easily. Redis and PostgreSQL heroes are welcome to provide their multi-primary solution in the comments. In the meantime, I'll just point out this is a problem that's handled somewhat more gracefully in IRC, by having the possibility of delegating the authentication layer.

Delegations

If you do not want to run a Matrix server yourself, it's possible to delegate the entire thing to another server. There's a server discovery API which uses the .well-known pattern (or SRV records, but that's "not recommended" and a bit confusing) to delegate that service to another server. Be warned that the server still needs to be explicitly configured for your domain. You can't just put:
  "m.server": ""  
... on and start using as a Matrix ID. That's because Matrix doesn't support "virtual hosting" and you'd still be connecting to rooms and people with your identity, not as you would normally expect. This is also why you cannot rename your home server. The server discovery API is what allows servers to find each other. Clients, on the other hand, use the client-server discovery API: this is what allows a given client to find your home server when you type your Matrix ID on login.
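To make the two discovery mechanisms concrete, here is a small sketch that parses the JSON documents involved, without doing any network access. It assumes the server document is served at `/.well-known/matrix/server` with an `m.server` key, the client document at `/.well-known/matrix/client` with an `m.homeserver` base URL, and that a missing port falls back to 8448 (which skips the SRV lookup step); the host names and helper functions are hypothetical.

```python
import json

DEFAULT_FEDERATION_PORT = 8448


def parse_server_delegation(body):
    """Return (host, port) from the body of /.well-known/matrix/server."""
    target = json.loads(body)["m.server"]
    host, sep, port = target.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    # No explicit port: simplified fallback to the default federation port.
    return target, DEFAULT_FEDERATION_PORT


def parse_client_discovery(body):
    """Return the base URL from the body of /.well-known/matrix/client."""
    return json.loads(body)["m.homeserver"]["base_url"]


# Made-up documents, as a domain owner delegating to another host might publish them.
print(parse_server_delegation('{"m.server": "matrix.example.net:443"}'))
print(parse_server_delegation('{"m.server": "matrix.example.net"}'))
print(parse_client_discovery('{"m.homeserver": {"base_url": "https://matrix.example.net"}}'))
```

Parsing the delegation is the easy half: as noted above, the server on the other end still has to be configured to answer for the delegating domain.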

Performance

The high availability discussion brushed over the performance of Matrix itself, but let's now dig into that.

Horizontal scalability

There were serious scalability issues with the main Matrix server, Synapse, in the past. So the Matrix team has been working hard to improve its design. Since Synapse 1.22 the home server can horizontally scale to multiple workers (see this blog post for details) which can make it easier to scale large servers.

Other implementations

There are other promising home server implementations from a performance standpoint (dendrite, Golang, entered beta in late 2020; conduit, Rust, beta; others), but none of those are feature-complete so there's a trade-off to be made there. Synapse is also adding a lot of features fast, so it's an open question whether the others will ever catch up. (I have heard that Dendrite might actually surpass Synapse in features within a few years, which would put Synapse in a more "LTS" situation.)

Latency

Matrix can feel slow sometimes. For example, joining the "Matrix HQ" room in Element (from takes a few minutes and then fails. That is because the home server has to sync the entire room state when you join the room. There was promising work on this announced in the lengthy 2021 retrospective, and some of that work landed (partial sync) in the 1.53 release already. Other improvements coming include sliding sync, lazy loading over federation, and fast room joins. So that's actually something that could be fixed in the fairly short term.

But in general, communication in Matrix doesn't feel as "snappy" as on IRC or even Signal. It's hard to quantify this without instrumenting a full latency test bed (for example the tools I used in the terminal emulators latency tests), but even just typing in a web browser feels slower than typing in an xterm or Emacs for me.

Even in conversations, I "feel" people don't immediately respond as fast. In fact, this could be an interesting double-blind experiment to make: have people guess whether they are talking to a person on Matrix, XMPP, or IRC, for example. My theory would be that people could notice that Matrix users are slower, if only because of the TCP round-trip time each message has to take.

Transport

Some courageous person actually made some tests of various messaging platforms on a congested network. His evaluation was basically:
  • Briar: uses Tor, so unusable except locally
  • Matrix: "struggled to send and receive messages", joining a room takes forever as it has to sync all history, "took 20-30 seconds for my messages to be sent and another 20 seconds for further responses"
  • XMPP: "worked in real-time, full encryption, with nearly zero lag"
So that was interesting. I suspect IRC would have also fared better, but that's just a feeling. Other improvements to the transport layer include support for websocket and the CoAP proxy work from 2019 (targeting 100bps links), but both seem stalled at the time of writing. The Matrix people have also announced the pinecone p2p overlay network which aims at solving large, internet-scale routing problems. See also this talk at FOSDEM 2022.


Onboarding and workflow

The workflow for joining a room, when you use Element web, is not great:
  1. click on a link in a web browser
  2. land on (say)
  3. offers "Element", yeah that sounds great, let's click "Continue"
  4. land on and then you need to register, aaargh
As you might have guessed by now, there is a specification to solve this, but web browsers need to adopt it as well, so that's far from actually being solved. At least browsers generally know about the matrix: scheme, it's just not exactly clear what they should do with it, especially when the handler is just another web page (e.g. Element web).

In general, when compared with tools like Signal or WhatsApp, Matrix doesn't fare so well in terms of user discovery. I probably have some of my normal contacts that have a Matrix account as well, but there's really no way to know. It's kind of creepy when Signal tells you "this person is on Signal!" but it's also pretty cool that it works, and they actually implemented it pretty well.

Registration is also less obvious: in Signal, the app confirms your phone number automatically. It's friction-less and quick. In Matrix, you need to learn about home servers, pick one, register (with a password! aargh!), and then set up encryption keys (not default), etc. It's a lot more friction.

And look, I understand: giving away your phone number is a huge trade-off. I don't like it either. But it solves a real problem and makes encryption accessible to a ton more people. Matrix does have "identity servers" that can serve that purpose, but I don't feel confident sharing my phone number there. It doesn't help that the identity servers don't have private contact discovery: giving them your phone number is a more serious security compromise than with Signal.

There's a catch-22 here too: because no one feels like giving away their phone numbers, no one does, and everyone assumes that stuff doesn't work anyways. Like it or not, Signal forcing people to divulge their phone number actually gives them critical mass that means actually a lot of my relatives are on Signal and I don't have to install crap like WhatsApp to talk with them.

5 minute clients evaluation

Throughout all my tests I evaluated a handful of Matrix clients, mostly from Flathub because almost none of them are packaged in Debian. Right now I'm using Element, the flagship client from, in a web browser window, with the PopUp Window extension. This makes it look almost like a native app, and opens links in my main browser window (instead of a new tab in that separate window), which is nice. But I'm tired of buying memory to feed my web browser, so this indirection has to stop. Furthermore, I'm often getting completely logged off from Element, which means re-logging in, recovering my security keys, and reconfiguring my settings. That is extremely annoying.

Coming from Irssi, Element is really "GUI-y" (pronounced "gooey"). Lots of clickety happening. To mark conversations as read, in particular, I need to click-click-click on all the tabs that have some activity. There's no "jump to latest message" or "mark all as read" functionality as far as I could tell. In Irssi the former is built-in (alt-a) and I made a custom /READ command for the latter:
/ALIAS READ script exec \$_->activity(0) for Irssi::windows
And yes, that's a Perl script in my IRC client. I am not aware of any Matrix client that does stuff like that, except maybe Weechat, if we can call it a Matrix client, or Irssi itself, now that it has a Matrix plugin (!). As for other clients, I have looked through the Matrix Client Matrix (confusing right?) to try to figure out which one to try, and, even after selecting Linux as a filter, the chart is just too wide to figure out anything. So I tried those, kind of randomly:
  • Fractal
  • Mirage
  • Nheko
  • Quaternion
Unfortunately, I lost my notes on those, I don't actually remember which one did what. I still have a session open with Mirage, so I guess that means it's the one I preferred, but I remember they were also all very GUI-y. Maybe I need to look at weechat-matrix or gomuks. At least Weechat is scriptable so I could continue playing the power-user. Right now my strategy with messaging (and that includes microblogging like Twitter or Mastodon) is that everything goes through my IRC client, so Weechat could actually fit well in there. Going with gomuks, on the other hand, would mean running it in parallel with Irssi or ... ditching IRC, which is a leap I'm not quite ready to take just yet. Oh, and basically none of those clients (except Nheko and Element) support VoIP, which is still kind of a second-class citizen in Matrix. It does not support large multimedia rooms, for example: Jitsi was used for FOSDEM instead of the native videoconferencing system.

Bots This falls a little aside the "usability" section, but I didn't know where to put this... There's a few Matrix bots out there, and you are likely going to be able to replace your existing bots with Matrix bots. It's true that IRC has a long and impressive history with lots of various bots doing various things, but given how young Matrix is, there's still a good variety:
  • maubot: generic bot with tons of usual plugins like sed, dice, karma, xkcd, echo, rss, reminder, translate, react, exec, gitlab/github webhook receivers, weather, etc
  • opsdroid: framework to implement "chat ops" in Matrix, connects with Matrix, GitHub, GitLab, Shell commands, Slack, etc
  • matrix-nio: another framework, used to build lots more bots like:
    • hemppa: generic bot with various functionality like weather, RSS feeds, calendars, cron jobs, OpenStreetmaps lookups, URL title snarfing, wolfram alpha, astronomy pic of the day, Mastodon bridge, room bridging, oh dear
    • devops: ping, curl, etc
    • podbot: play podcast episodes from AntennaPod
    • cody: Python, Ruby, Javascript REPL
    • eno: generic bot, "personal assistant"
  • mjolnir: moderation bot
  • hookshot: bridge with GitLab/GitHub
  • matrix-monitor-bot: latency monitor
One thing I haven't found an equivalent for is Debian's MeetBot. There's an archive bot but it doesn't have topics or a meeting chair, or HTML logs.

Working on Matrix As a developer, I find Matrix kind of intimidating. The specification is huge. The official specification itself looks somewhat digestible: it's only 6 APIs so that looks, at first, kind of reasonable. But whenever you start asking complicated questions about Matrix, you quickly fall into the Matrix Spec Change specification (which, yes, is a separate specification). And there are literally hundreds of MSCs flying around. It's hard to tell what's been adopted and what hasn't, and even harder to figure out if your specific client has implemented it. (One trendy answer to this problem is to "rewrite it in rust": Matrix are working on implementing a lot of those specifications in a matrix-rust-sdk that's designed to take the implementation details away from users.) Just taking the latest weekly Matrix report, you find that three new MSCs were proposed, just last week! There's even a graph that shows the number of MSCs is progressing steadily, at 600+ proposals total, with the majority (300+) "new". I would guess the "merged" ones are at about 150. That's a lot of text which includes stuff like 3D worlds which, frankly, I don't think you should be working on when you have such important security and usability problems. (The internet as a whole, arguably, doesn't fare much better. RFC600 is a really obscure discussion about "INTERFACING AN ILLINOIS PLASMA TERMINAL TO THE ARPANET". Maybe that's how many MSCs will end up as well, left forgotten in the pits of history.) And that's the thing: maybe the Matrix people have a different objective than I have. They want to connect everything to everything, and make Matrix a generic transport for all sorts of applications, including virtual reality, collaborative editors, and so on. I just want secure, simple messaging. Possibly with good file transfers, and video calls. That it works with existing stuff is good, and it should be federated to remove the "Signal point of failure".
So I'm a bit worried with the direction all those MSCs are taking, especially when you consider that clients other than Element are still struggling to keep up with basic features like end-to-end encryption or room discovery, never mind voice or spaces...

Conclusion Overall, Matrix is somehow in the space XMPP was a few years ago. It has a ton of features, pretty good clients, and a large community. It seems to have gained some of the momentum that XMPP has lost. It may have the most potential to replace Signal if something bad were to happen to it (like, I don't know, getting banned or going nuts with cryptocurrency)... But it's really not there yet, and I don't see Matrix trying to get there either, which is a bit worrisome.

Looking back at history I'm also worried that we are repeating the errors of the past. The history of federated services is really fascinating. IRC, FTP, HTTP, and SMTP were all created in the early days of the internet, and are all still around (except, arguably, FTP, which was removed from major browsers recently). All of them had to face serious challenges in growing their federation. IRC had numerous conflicts and forks, both at the technical level but also at the political level. The history of IRC is really something that anyone working on a federated system should study in detail, because they are bound to make the same mistakes if they are not familiar with it. The "short" version is:
  • 1988: Finnish researcher publishes first IRC source code
  • 1989: 40 servers worldwide, mostly universities
  • 1990: EFnet ("eris-free network") fork which blocks the "open relay", named Eris - followers of Eris form the A-net, which promptly dissolves itself, with only EFnet remaining
  • 1992: Undernet fork, which offered authentication ("services"), routing improvements and timestamp-based channel synchronisation
  • 1994: DALnet fork, from Undernet, again on a technical disagreement
  • 1995: Freenode founded
  • 1996: IRCnet forks from EFnet, following a flame war of historical proportion, splitting the network between Europe and the Americas
  • 1997: Quakenet founded
  • 1999: (XMPP founded)
  • 2001: 6 million users, OFTC founded
  • 2002: DALnet peaks at 136,000 users
  • 2003: IRC as a whole peaks at 10 million users, EFnet peaks at 141,000 users
  • 2004: (Facebook founded), Undernet peaks at 159,000 users
  • 2005: Quakenet peaks at 242,000 users, IRCnet peaks at 136,000 (Youtube founded)
  • 2006: (Twitter founded)
  • 2009: (WhatsApp, Pinterest founded)
  • 2010: (TextSecure AKA Signal, Instagram founded)
  • 2011: (Snapchat founded)
  • ~2013: Freenode peaks at ~100,000 users
  • 2016: IRCv3 standardisation effort started (TikTok founded)
  • 2021: Freenode self-destructs, Libera chat founded
  • 2022: Libera peaks at 50,000 users, OFTC peaks at 30,000 users
(The numbers were taken from the Wikipedia page. Note that I also include the launch of other networks in parentheses for context.) Pretty dramatic, don't you think? Eventually, somehow, IRC became irrelevant for most people: few people are even aware of it now. With fewer than a million active users, it's smaller than Mastodon, XMPP, or Matrix at this point.1 If I were to venture a guess, I'd say that infighting, lack of a standardization body, and a somewhat annoying protocol meant the network could not grow. It's also possible that the decentralised yet centralised structure of IRC networks limited their reliability and growth. But large social media companies have also taken over the space: observe how IRC numbers peak around the time the wave of large social media companies emerge, especially Facebook (2.9B users!!) and Twitter (400M users).

Where the federated services are in history Right now, Matrix and Mastodon (and email!) are at the "pre-EFnet" stage: anyone can join the federation. Mastodon has started working on a global block list of fascist servers which is interesting, but it's still an open federation. Right now, Matrix is totally open, but publishes a (federated) block list of hostile servers (yes, of course it's a room). Interestingly, Email is also in that stage, where there are block lists of spammers, and it's a race between those blockers and spammers. Large email providers, obviously, are getting closer to the EFnet stage: you could consider they only accept email from themselves or between themselves. It's getting increasingly hard to deliver mail to Outlook and Gmail for example, partly because of bias against small providers, but also because they are including more and more machine-learning tools to sort through email and those systems are, fundamentally, unknowable. It's not quite the same as splitting the federation the way EFnet did, but the effect is similar. HTTP has somehow managed to live in a parallel universe, as it's technically still completely federated: anyone can start a web server if they have a public IP address and anyone can connect to it. The catch, of course, is how you find the darn thing. Which is how Google became one of the most powerful corporations on earth, and how they became the gatekeepers of human knowledge online. I have only briefly mentioned XMPP here, and my XMPP fans will undoubtedly comment on that, but I think it's somewhere in the middle of all of this. It was co-opted by Facebook and Google, and both corporations have abandoned it to its fate. I remember fondly the days where I could do instant messaging with my contacts who had a Gmail account. Those days are gone, and I don't talk to anyone over Jabber anymore, unfortunately. And this is a threat that Matrix still has to face. It's also the threat Email is currently facing.
On the one hand corporations like Facebook want to completely destroy it and have mostly succeeded: many people just have an email account to register on things and talk to their friends over Instagram or (lately) TikTok (which, I know, is not Facebook, but they started that fire). On the other hand, you have corporations like Microsoft and Google who are still using and providing email services because, frankly, you still do need email for stuff, just like fax is still around, but they are more and more isolated in their own silo. At this point, it's only a matter of time before they reach critical mass and just decide that the risk of allowing external mail coming in is not worth the cost. They'll simply flip the switch and work on an allow-list principle. Then we'll have closed the loop and email will be dead, just like IRC is "dead" now. I wonder which path Matrix will take. Could it liberate us from these vicious cycles? Update: this generated some discussions on

  1. According to Wikipedia, there are currently about 500 distinct IRC networks operating, on about 1,000 servers, serving over 250,000 users. In contrast, Mastodon seems to be around 5 million users, claimed at FOSDEM 2021 to have about 28 million globally visible accounts, and Signal lays claim to over 40 million souls. XMPP claims to have "millions" of users on the homepage but the FAQ says they don't actually know. On the proprietary silo side of the fence, this page says
    • Facebook: 2.9 billion users
    • WhatsApp: 2B
    • Instagram: 1.4B
    • TikTok: 1B
    • Snapchat: 500M
    • Pinterest: 480M
    • Twitter: 397M
    Notable omission from that list: Youtube, with its mind-boggling 2.6 billion users... Those are not the kind of numbers you just "need to convince a brother or sister" to grow the network...

10 June 2022

Thomas Koch: shared infrastructure coop

Posted on February 5, 2014
I'm working in a very small web agency with 4 employees, one of them part time and our boss who doesn't do programming. It shouldn't come as a surprise, that our development infrastructure is not perfect. We have many ideas and dreams how we could improve it, but not the time. Now we have two obvious choices: Either we just do nothing or we buy services from specialized vendors like github, atlassian, travis-ci, heroku, google and others. Doing nothing does not work for me. But just buying all this stuff doesn't please me either. We'd depend on proprietary software, lock-in effects or one-size-fits-all offerings. Another option would be to find other small web shops like us, form a cooperative and share essential services. There are thousands of web shops in the same situation like us and we all need the same things: As I said, all of the above is available as commercial offerings. But I'd prefer the following to be satisfied: Does something like that already exist? There already is the German cooperative hostsharing which is kind of similar but does provide mainly hosting, not services. But I'll ask them next after writing this blog post. Is your company interested in joining such an effort? Does it sound silly? Comments: Sounds promising. I already answered by mail. Dirk Deimeke (Homepage) on 16.02.2014 08:16 Homepage: I'm sorry for accidentally removing a comment that linked to while moderating comments. I'm really looking forward to another blogging engine Thomas Koch on 16.02.2014 12:20 Why? What are you missing? I am using s9y for 9 years now. Dirk Deimeke (Homepage) on 16.02.2014 12:57

26 May 2022

Sergio Talens-Oliag: New Blog Config

As promised, in this post I'm going to explain how I've configured this blog using hugo, asciidoctor and the papermod theme, how I publish it using nginx, how I've integrated the remark42 comment system and how I've automated its publication using gitea and json2file-go. It is a long post, but I hope that at least parts of it can be interesting for some, feel free to ignore it if that is not your case.

Hugo Configuration

Theme settings The site is using the PaperMod theme and as I'm using asciidoctor to publish my content I've adjusted the settings to improve how things are shown with it. The current config.yml file is the one shown below (probably some of the settings are not required nor being used right now, but I'm including the current file, so this post will always have the latest version of it):
title: Mixinet BlogOps
paginate: 5
theme: PaperMod
destination: public/
enableInlineShortcodes: true
enableRobotsTXT: true
buildDrafts: false
buildFuture: false
buildExpired: false
enableEmoji: true
pygmentsUseClasses: true
  disableXML: true
  minifyOutput: true
    languageName: "English"
    description: "Mixinet BlogOps -"
    author: "Sergio Talens-Oliag"
    weight: 1
    title: Mixinet BlogOps
      Title: "Sergio Talens-Oliag Technical Blog"
      Content: >
        ![Mixinet BlogOps](/images/mixinet-blogops.png)
      category: categories
      tag: tags
      series: series
        - name: Archive
          url: archives
          weight: 5
        - name: Categories
          url: categories/
          weight: 10
        - name: Tags
          url: tags/
          weight: 10
        - name: Search
          url: search/
          weight: 15
    - HTML
    - RSS
    - JSON
  env: production
  defaultTheme: light
  disableThemeToggle: false
  ShowShareButtons: true
  ShowReadingTime: true
  disableSpecial1stPost: true
  disableHLJS: true
  displayFullLangName: true
  ShowPostNavLinks: true
  ShowBreadCrumbs: true
  ShowCodeCopyButtons: true
  ShowRssButtonInSectionTermList: true
  ShowFullTextinRSS: true
  ShowToc: true
  TocOpen: false
  comments: true
  remark42SiteID: "blogops"
  remark42Url: "/remark42"
    enabled: false
    title: Sergio Talens-Oliag Technical Blog
    imageUrl: "/images/mixinet-blogops.png"
    imageTitle: Mixinet BlogOps
      - name: Archives
        url: archives
      - name: Categories
        url: categories
      - name: Tags
        url: tags
    - name: CV
      url: ""
    - name: Debian
      url: ""
    - name: GitHub
      url: ""
    - name: GitLab
      url: ""
    - name: Linkedin
      url: ""
    - name: RSS
      url: "index.xml"
    disableHLJS: true
    favicon: "/favicon.ico"
    favicon16x16:  "/favicon-16x16.png"
    favicon32x32:  "/favicon-32x32.png"
    apple_touch_icon:  "/apple-touch-icon.png"
    safari_pinned_tab:  "/safari-pinned-tab.svg"
    isCaseSensitive: false
    shouldSort: true
    location: 0
    distance: 1000
    threshold: 0.4
    minMatchCharLength: 0
    keys: ["title", "permalink", "summary", "content"]
    backend: html5s
    extensions: ['asciidoctor-html5s','asciidoctor-diagram']
    failureLevel: fatal
    noHeaderOrFooter: true
    preserveTOC: false
    safeMode: unsafe
    sectionNumbers: false
    trace: false
    verbose: false
    workingFolderCurrent: true
    disabled: false
    simple: true
    disabled: false
    enableDNT: true
    simple: true
    disabled: false
    simple: true
    disabled: false
    privacyEnhanced: true
    disableInlineCSS: true
    disableInlineCSS: true
      - '^asciidoctor$'
      - '^dart-sass-embedded$'
      - '^go$'
      - '^npx$'
      - '^postcss$'
Some notes about the settings:
  • disableHLJS and assets.disableHLJS are set to true; we plan to use rouge on adoc and the inclusion of the hljs assets adds styles that collide with the ones used by rouge.
  • ShowToc is set to true and the TocOpen setting is set to false to make the ToC appear collapsed initially. My plan was to use the asciidoctor ToC, but after trying I believe that the theme one looks nice and I don't need to adjust styles, although it has some issues with the html5s processor (the admonition titles use <h6> and they are shown on the ToC, which is weird). To fix it I've copied the layouts/partial/toc.html to my site repository and replaced the range of headings to end at 5 instead of 6 (in fact 5 still seems a lot, but as I don't think I'll use that heading level on the posts it doesn't really matter).
  • params.profileMode values are adjusted, but for now I've left it disabled, setting params.profileMode.enabled to false, and I've set the homeInfoParams to show more or less the same content with the latest posts under it (I've added some styles to my custom.css style sheet to center the text and image of the first post to match the look and feel of the profile).
  • On the asciidocExt section I've adjusted the backend to use html5s, I've added the asciidoctor-html5s and asciidoctor-diagram extensions to asciidoctor and adjusted the workingFolderCurrent to true to make asciidoctor-diagram work right (haven't tested it yet).

Theme customisations To write in asciidoctor using the html5s processor I've added some files to the assets/css/extended directory:
  1. As said before, I've added the file assets/css/extended/custom.css to make the homeInfoParams look like the profile page and I've also slightly changed some theme styles to make things look better with the html5s output:
    /* Fix first entry alignment to make it look like the profile */
    .first-entry { text-align: center; }
    .first-entry img { display: inline; }
    /*
     * Remove margin for .post-content code and reduce padding to make it look
     * better with the asciidoctor html5s output.
     */
    .post-content code { margin: auto 0; padding: 4px; }
  2. I've also added the file assets/css/extended/adoc.css with some styles taken from the asciidoctor-default.css, see this blog post about the original file; mine is the same after formatting it with css-beautify and editing it to use variables for the colors to support light and dark themes:
    /* AsciiDoctor*/
        border-collapse: collapse;
        border-spacing: 0
        border-collapse: separate;
        border: 0;
        background: none;
        width: 100%
    .admonitionblock>table td.icon {
        text-align: center;
        width: 80px
    }
    .admonitionblock>table td.icon img {
        max-width: none
    }
    .admonitionblock>table td.icon .title {
        font-weight: bold;
        font-family: "Open Sans", "DejaVu Sans", sans-serif;
        text-transform: uppercase
    }
    .admonitionblock>table td.content {
        padding-left: 1.125em;
        padding-right: 1.25em;
        border-left: 1px solid #ddddd8;
        color: var(--primary)
    }
    .admonitionblock>table td.content>:last-child>:last-child {
        margin-bottom: 0
    }
    .admonitionblock td.icon [class^="fa icon-"] {
        font-size: 2.5em;
        text-shadow: 1px 1px 2px var(--secondary);
        cursor: default
    }
    .admonitionblock td.icon .icon-note::before {
        content: "\f05a";
        color: var(--icon-note-color)
    }
    .admonitionblock td.icon .icon-tip::before {
        content: "\f0eb";
        color: var(--icon-tip-color)
    }
    .admonitionblock td.icon .icon-warning::before {
        content: "\f071";
        color: var(--icon-warning-color)
    }
    .admonitionblock td.icon .icon-caution::before {
        content: "\f06d";
        color: var(--icon-caution-color)
    }
    .admonitionblock td.icon .icon-important::before {
        content: "\f06a";
        color: var(--icon-important-color)
    }
        display: inline-block;
        color: #fff !important;
        background-color: rgba(100, 100, 0, .8);
        -webkit-border-radius: 100px;
        border-radius: 100px;
        text-align: center;
        font-size: .75em;
        width: 1.67em;
        height: 1.67em;
        line-height: 1.67em;
        font-family: "Open Sans", "DejaVu Sans", sans-serif;
        font-style: normal;
        font-weight: bold
    .conum[data-value] * {
        color: #fff !important
    }
        display: none
        content: attr(data-value)
    pre .conum[data-value] {
        position: relative;
        top: -.125em
    }
    b.conum * {
        color: inherit !important
    }
        display: none
  3. The previous file uses variables from a partial copy of the theme-vars.css file that changes the highlighted code background color and adds the color definitions used by the admonitions:
        /* Solarized base2 */
        /* --hljs-bg: rgb(238, 232, 213); */
        /* Solarized base3 */
        /* --hljs-bg: rgb(253, 246, 227); */
        /* Solarized base02 */
        --hljs-bg: rgb(7, 54, 66);
        /* Solarized base03 */
        /* --hljs-bg: rgb(0, 43, 54); */
        /* Default asciidoctor theme colors */
        --icon-note-color: #19407c;
        --icon-tip-color: var(--primary);
        --icon-warning-color: #bf6900;
        --icon-caution-color: #bf3400;
        --icon-important-color: #bf0000
        --hljs-bg: rgb(7, 54, 66);
        /* Asciidoctor theme colors with tint for dark background */
        --icon-note-color: #3e7bd7;
        --icon-tip-color: var(--primary);
        --icon-warning-color: #ff8d03;
        --icon-caution-color: #ff7847;
        --icon-important-color: #ff3030
  4. The previous styles use font-awesome, so I've downloaded its resources for version 4.7.0 (the one used by asciidoctor), storing font-awesome.css in the assets/css/extended dir (that way it is merged with the rest of the .css files) and copying the fonts to the static/assets/fonts/ dir (they will be served directly):
    curl "$FA_BASE_URL/css/font-awesome.css" \
      > assets/css/extended/font-awesome.css
    for f in FontAwesome.otf fontawesome-webfont.eot \
      fontawesome-webfont.svg fontawesome-webfont.ttf \
      fontawesome-webfont.woff fontawesome-webfont.woff2; do
        curl "$FA_BASE_URL/fonts/$f" > "static/assets/fonts/$f"
    done
  5. As already said, the default highlighter is disabled (it provided a css compatible with rouge) so we need a css to do the highlight styling; as rouge provides a way to export them, I've created the assets/css/extended/rouge.css file with the thankful_eyes theme:
    rougify style thankful_eyes > assets/css/extended/rouge.css
  6. To support the use of the html5s backend with admonitions I've added a variation of the example found on this blog post to assets/js/adoc-admonitions.js:
    // replace the default admonitions block with a table that uses a format
    // similar to the standard asciidoctor ... as we are using fa-icons here there
    // is no need to add the icons: font entry on the document.
    window.addEventListener('load', function () {
      const admonitions = document.getElementsByClassName('admonition-block')
      for (let i = admonitions.length - 1; i >= 0; i--) {
        const elm = admonitions[i]
        const type = elm.classList[1]
        const title = elm.getElementsByClassName('block-title')[0];
        const label = title.getElementsByClassName('title-label')[0]
          .innerHTML.slice(0, -1);
        const text = elm.innerHTML
        const parent = elm.parentNode
        const tempDiv = document.createElement('div')
        tempDiv.innerHTML = `<div class="admonitionblock ${type}">
          <table>
            <tbody>
              <tr>
                <td class="icon">
                  <i class="fa icon-${type}" title="${label}"></i>
                </td>
                <td class="content">
                  ${text}
                </td>
              </tr>
            </tbody>
          </table>
        </div>`
        const input = tempDiv.childNodes[0]
        parent.replaceChild(input, elm)
      }
    })
    and enabled its minified use on the layouts/partials/extend_footer.html file adding the following lines to it:
    {{- $admonitions := slice (resources.Get "js/adoc-admonitions.js")
      | resources.Concat "assets/js/adoc-admonitions.js" | minify | fingerprint }}
    <script defer crossorigin="anonymous" src="{{ $admonitions.RelPermalink }}"
      integrity="{{ $admonitions.Data.Integrity }}"></script>

Remark42 configuration To integrate Remark42 with the PaperMod theme I've created the file layouts/partials/comments.html with the following content based on the remark42 documentation, including extra code to sync the dark/light setting with the one set on the site:
<div id="remark42"></div>
<script>
  var remark_config = {
    host: {{ .Site.Params.remark42Url }},
    site_id: {{ .Site.Params.remark42SiteID }},
    url: {{ .Permalink }},
    locale: {{ .Site.Language.Lang }}
  };
  (function(c) {
    /* Adjust the theme using the local-storage pref-theme if set */
    if (localStorage.getItem("pref-theme") === "dark") {
      remark_config.theme = "dark";
    } else if (localStorage.getItem("pref-theme") === "light") {
      remark_config.theme = "light";
    }
    /* Add remark42 widget */
    for(var i = 0; i < c.length; i++){
      var d = document, s = d.createElement('script');
      s.src = remark_config.host + '/web/' + c[i] +'.js';
      s.defer = true;
      (d.head || d.body).appendChild(s);
    }
  })(remark_config.components || ['embed']);
</script>
In development I use it with anonymous comments enabled, but to avoid SPAM the production site uses social logins (for now I've only enabled Github & Google, if someone requests additional services I'll check them, but those were the easy ones for me initially). To support theme switching with remark42 I've also added the following inside the layouts/partials/extend_footer.html file:
{{- if (not site.Params.disableThemeToggle) }}
/* Function to change theme when the toggle button is pressed */
document.getElementById("theme-toggle").addEventListener("click", () => {
  if (typeof window.REMARK42 != "undefined") {
    if (document.body.className.includes('dark')) {
{{- end }}
With this code, if the theme-toggle button is pressed we change the remark42 theme before the PaperMod one (that's needed here only; on page loads the remark42 theme is synced with the main one using the code from the layouts/partials/comments.html shown earlier).

Development setup To preview the site on my laptop I'm using docker-compose with the following configuration:
version: "2"
      context: ./docker/hugo-adoc
      dockerfile: ./Dockerfile
    image: sto/hugo-adoc
    container_name: hugo-adoc-blogops
    restart: always
      - .:/documents
    command: server --bind -D -F
    user: ${APP_UID}:${APP_GID}
    image: nginx:latest
    container_name: nginx-blogops
    restart: always
      - ./nginx/default.conf:/etc/nginx/conf.d/default.conf
      -  1313:1313
      context: ./docker/remark42
      dockerfile: ./Dockerfile
    image: sto/remark42
    container_name: remark42-blogops
    restart: always
      - ./.env
      - ./remark42/
      - ./remark42/
To run it properly we have to create the .env file with the current user ID and GID on the variables APP_UID and APP_GID (if we don't do it the files can end up being owned by a user that is not the same as the one running the services):
$ echo "APP_UID=$(id -u)\nAPP_GID=$(id -g)" > .env
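How `echo` treats the `\n` escape differs between shells: some print a newline, others print the two characters literally, which would leave both variables on a single line. A minimal sketch of a more portable variant, assuming a hypothetical helper named `write_env_file` (not part of the original setup), uses `printf` instead:

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not taken from the original post):
# write the APP_UID and APP_GID variables with printf, so that each one
# lands on its own line regardless of how the shell's echo handles "\n".
write_env_file() {
  # $1: path of the file to create; defaults to .env in the current directory
  printf 'APP_UID=%s\nAPP_GID=%s\n' "$(id -u)" "$(id -g)" > "${1:-.env}"
}
```

Calling `write_env_file` without arguments would create the `.env` file in the current directory, which is where docker-compose looks for it by default.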
The Dockerfile used to generate the sto/hugo-adoc is:
FROM asciidoctor/docker-asciidoctor:latest
RUN gem install --no-document asciidoctor-html5s &&\
 apk update && apk add --no-cache curl libc6-compat &&\
 repo_path="gohugoio/hugo" &&\
 api_url="$repo_path/releases/latest" &&\
 download_url="$(\
  curl -sL "$api_url" |\
  sed -n "s/^.*download_url\": \"\\(.*.extended.*Linux-64bit.tar.gz\)\"/\1/p"\
 )" &&\
 curl -sL "$download_url" -o /tmp/hugo.tgz &&\
 tar xf /tmp/hugo.tgz hugo &&\
 install hugo /usr/bin/ &&\
 rm -f hugo /tmp/hugo.tgz &&\
 /usr/bin/hugo version &&\
 apk del curl && rm -rf /var/cache/apk/*
# Expose port for live server
ENTRYPOINT ["/usr/bin/hugo"]
CMD [""]
If you review it you will see that I'm using the docker-asciidoctor image as the base; the idea is that this image has all I need to work with asciidoctor and to use hugo I only need to download the binary from their latest release at github (as we are using an image based on alpine we also need to install the libc6-compat package, but once that is done things are working fine for me so far). The image does not launch the server by default because I don't want it to; in fact I use the same docker-compose.yml file to publish the site in production simply calling the container without the arguments passed on the docker-compose.yml file (see later). When running the containers with docker-compose up (or docker compose up if you have the docker-compose-plugin package installed) we also launch a nginx container and the remark42 service so we can test everything together. The Dockerfile for the remark42 image is the original one with an updated version of the script:
FROM umputun/remark42:latest
The updated script is similar to the original, but allows us to use an APP_GID variable and updates the /etc/group file of the container so the files get the right user and group (with the original script the group is always 1001):
#!/sbin/dinit /bin/sh
uid="$(id -u)"
if [ "${uid}" -eq "0" ]; then
  echo "init container"
  # set container's time zone
  cp "/usr/share/zoneinfo/${TIME_ZONE}" /etc/localtime
  echo "${TIME_ZONE}" >/etc/timezone
  echo "set timezone ${TIME_ZONE} ($(date))"
  # set UID & GID for the app
  if [ "${APP_UID}" ] || [ "${APP_GID}" ]; then
    [ "${APP_UID}" ] || APP_UID="1001"
    [ "${APP_GID}" ] || APP_GID="${APP_UID}"
    echo "set custom APP_UID=${APP_UID} & APP_GID=${APP_GID}"
    sed -i "s/^app:x:1001:1001:/app:x:${APP_UID}:${APP_GID}:/" /etc/passwd
    sed -i "s/^app:x:1001:/app:x:${APP_GID}:/" /etc/group
  else
    echo "custom APP_UID and/or APP_GID not defined, using 1001:1001"
  fi
  chown -R app:app /srv /home/app
fi
echo "prepare environment"
# replace {% REMARK_URL %} by content of REMARK_URL variable
find /srv -regex '.*\.\(html\|js\|mjs\)$' -print \
  -exec sed -i "s|{% REMARK_URL %}|${REMARK_URL}|g" {} \;
if [ -n "${SITE_ID}" ]; then
  #replace "site_id: 'remark'" by SITE_ID
  sed -i "s|'remark'|'${SITE_ID}'|g" /srv/web/*.html
fi
echo "execute \"$*\""
if [ "${uid}" -eq "0" ]; then
  exec su-exec app "$@"
else
  exec "$@"
fi
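The UID/GID handling can be tried outside a container. What follows is a minimal sketch, assuming throwaway sample passwd and group files in a temporary directory instead of the real /etc/passwd and /etc/group; the remap_ids function name and the sample values (1000 and 2000) are mine and are not part of remark42. It relies on sed -i as found in GNU sed and busybox.

```shell
#!/bin/sh
# Sketch: apply the same defaulting and sed rewrites as the init script,
# but against scratch copies of passwd/group (hypothetical sample data).
set -e

remap_ids() {
  # $1 = passwd file, $2 = group file; APP_UID / APP_GID are read as globals
  _passwd="$1"
  _group="$2"
  if [ "${APP_UID}" ] || [ "${APP_GID}" ]; then
    # Fall back to 1001 for the UID and to the UID for the GID
    [ "${APP_UID}" ] || APP_UID="1001"
    [ "${APP_GID}" ] || APP_GID="${APP_UID}"
    sed -i "s/^app:x:1001:1001:/app:x:${APP_UID}:${APP_GID}:/" "$_passwd"
    sed -i "s/^app:x:1001:/app:x:${APP_GID}:/" "$_group"
  fi
}

demo_dir="$(mktemp -d)"
echo "app:x:1001:1001:app:/home/app:/bin/sh" >"$demo_dir/passwd"
echo "app:x:1001:" >"$demo_dir/group"

# Hypothetical host IDs
APP_UID="1000"
APP_GID="2000"
remap_ids "$demo_dir/passwd" "$demo_dir/group"

new_passwd="$(cat "$demo_dir/passwd")"
new_group="$(cat "$demo_dir/group")"
rm -rf "$demo_dir"
echo "$new_passwd"
echo "$new_group"
```

With only APP_UID set, the group would take the same number as the user, which is the defaulting the script applies.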
The environment file used with remark42 for development is quite minimal:
And the nginx/default.conf file used to publish the service locally is simple too:
server {
 listen 1313;
 server_name localhost;
 location / {
    proxy_pass http://hugo:1313;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
 }
 location /remark42/ {
    rewrite /remark42/(.*) /$1 break;
    proxy_pass http://remark42:8080/;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
 }
}

Production setup The VM where I'm publishing the blog runs Debian GNU/Linux and uses binaries from local packages and applications packaged inside containers. To run the containers I'm using docker-ce (I could have used podman instead, but I already had docker-ce installed on the machine, so I stayed with it). The binaries used in this project are included in the following packages from the main Debian repository:
  • git to clone & pull the repository,
  • jq to parse json files from shell scripts,
  • json2file-go to save the webhook messages to files,
  • inotify-tools to detect when new files are stored by json2file-go and launch scripts to process them,
  • nginx to publish the site using HTTPS and work as proxy for json2file-go and remark42 (I run it using a container),
  • task-spooler to queue the scripts that update the deployment.
And I'm using docker and docker compose from the Debian packages in the docker repository:
  • docker-ce to run the containers,
  • docker-compose-plugin to run docker compose (it is a plugin, so no - in the name).

Repository checkout To manage the git repository I've created a deploy key, added it to gitea and cloned the project on the /srv/blogops path (that path is owned by a regular user that has permissions to run docker, as I said before).

Compiling the site with hugo To compile the site we use the docker-compose.yml file seen before; to be able to run it we first build the container images and, once we have them, launch hugo using docker compose run:
$ cd /srv/blogops
$ git pull
$ docker compose build
$ if [ -d "./public" ]; then rm -rf ./public; fi
$ docker compose run hugo --
The compilation leaves the static HTML on /srv/blogops/public (we remove the directory first because hugo does not clean the destination folder as jekyll does). The deploy script re-generates the site as described and moves the public directory to its final place for publishing.

Running remark42 with docker In the /srv/blogops/remark42 folder I have the following docker-compose.yml:
version: "2"
services:
  remark42:
    build:
      context: ../docker/remark42
      dockerfile: ./Dockerfile
    image: sto/remark42
    env_file:
      - ../.env
      - ./
    container_name: remark42
    restart: always
    volumes:
      - ./
The ../.env file is loaded to get the APP_UID and APP_GID variables that are used by my version of the script to adjust file permissions, and the second file contains the rest of the settings for remark42, including the social network tokens (see the remark42 documentation for the available parameters; I don't include my configuration here because some of them are secrets).

Nginx configuration The nginx configuration for the site is as simple as:
server {
  listen 443 ssl http2;
  ssl_certificate /etc/letsencrypt/live/;
  ssl_certificate_key /etc/letsencrypt/live/;
  include /etc/letsencrypt/options-ssl-nginx.conf;
  ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
  access_log /var/log/nginx/;
  error_log  /var/log/nginx/;
  root /srv/blogops/nginx/public_html;
  location / {
    try_files $uri $uri/ =404;
  }
  include /srv/blogops/nginx/remark42.conf;
}
server {
  listen 80 ;
  listen [::]:80 ;
  access_log /var/log/nginx/;
  error_log  /var/log/nginx/;
  if ($host =  ) {
    return 301 https://$host$request_uri;
  }
  return 404;
}
In this configuration the certificates are managed by certbot and the server root directory is /srv/blogops/nginx/public_html and not /srv/blogops/public. The reason is that I want to be able to compile without affecting the running site: the deployment script generates the site in /srv/blogops/public and, if all works well, we rename folders to do the switch, making the change feel almost atomic.
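The rename-based switch can be tried on a scratch directory tree. This is a minimal sketch, not the deployment script itself: the swap_site function name and the sample files are hypothetical, and the layout under a temporary directory only mimics /srv/blogops.

```shell
#!/bin/sh
# Sketch of the "build aside, then rename" switch on a scratch directory tree.
set -e

swap_site() {
  # $1 = freshly built directory, $2 = live directory served by nginx
  _new="$1"
  _live="$2"
  _ts="$(date +%Y%m%d-%H%M%S)"
  # Keep the previous version under a timestamped name, then move the build in
  if [ -d "$_live" ]; then
    mv "$_live" "$_live-$_ts"
  fi
  mv "$_new" "$_live"
}

base="$(mktemp -d)"
mkdir -p "$base/public" "$base/nginx/public_html"
echo "new" >"$base/public/index.html"
echo "old" >"$base/nginx/public_html/index.html"

swap_site "$base/public" "$base/nginx/public_html"

live_content="$(cat "$base/nginx/public_html/index.html")"
old_copies="$(find "$base/nginx" -mindepth 1 -maxdepth 1 \
  -name 'public_html-*' -type d | wc -l | tr -d ' ')"
rm -rf "$base"
echo "live: $live_content, old copies kept: $old_copies"
```

Both mv calls are plain renames when source and destination live on the same filesystem, which is why the switch is fast; there is still a short window between the two renames in which the live directory does not exist, hence "almost" atomic.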

json2file-go configuration As I have a working WireGuard VPN between the machine running gitea at my home and the VM where the blog is served, I'm going to configure json2file-go to listen for connections on a high port using a self-signed certificate, on IP addresses only reachable through the VPN. To do it we create a systemd socket to run json2file-go and adjust its configuration to listen on a private IP (we use the FreeBind option in its definition to be able to launch the service even when the IP is not available, that is, when the VPN is down). The following script can be used to set up the json2file-go configuration:
set -e
# ---------
# ---------
# ----
# ----
# Install packages used with json2file for the blogops site
sudo apt update
sudo apt install -y json2file-go uuid
if ! command -v mkcert >/dev/null 2>&1; then
  sudo apt install -y mkcert
fi
sudo apt clean
# Configuration file values
J2F_USER="$(id -u)"
J2F_GROUP="$(id -g)"
# Configure json2file
[ -d "$J2F_DIR" ] || mkdir "$J2F_DIR"
sudo sh -c "echo '$J2F_DIR' >'$J2F_BASEDIR_FILE'"
[ -d "$TLS_DIR" ] || mkdir "$TLS_DIR"
if [ ! -f "$J2F_CRT_PATH" ] || [ ! -f "$J2F_KEY_PATH" ]; then
  mkcert -cert-file "$J2F_CRT_PATH" -key-file "$J2F_KEY_PATH" "$(hostname -f)"
fi
sudo sh -c "echo '$J2F_CRT_PATH' >'$J2F_CRT_FILE'"
sudo sh -c "echo '$J2F_KEY_PATH' >'$J2F_KEY_FILE'"
sudo sh -c "cat >'$J2F_DIRLIST_FILE'" <<EOF
$(echo "$J2F_DIRLIST" | tr ';' '\n')
EOF
# Service override
[ -d "$J2F_SERVICE_DIR" ] || sudo mkdir "$J2F_SERVICE_DIR"
sudo sh -c "cat >'$J2F_SERVICE_OVERRIDE'" <<EOF
# Socket override
[ -d "$J2F_SOCKET_DIR" ] || sudo mkdir "$J2F_SOCKET_DIR"
sudo sh -c "cat >'$J2F_SOCKET_OVERRIDE'" <<EOF
# Set FreeBind to listen on missing addresses (the VPN can be down sometimes)
# Set ListenStream to nothing to clear its value and add the new value later
# Restart and enable service
sudo systemctl daemon-reload
sudo systemctl stop "$J2F_SERVICE_NAME"
sudo systemctl start "$J2F_SERVICE_NAME"
sudo systemctl enable "$J2F_SERVICE_NAME"
# ----
# vim: ts=2:sw=2:et:ai:sts=2
Warning: The script uses mkcert to create the temporary certificates; to install the package on bullseye, the backports repository must be available.
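To make the FreeBind and ListenStream comments in the script more concrete, here is a sketch of what a socket override of that kind could contain. The address 10.0.0.1 and port 4443 are hypothetical placeholders, and the override is written to a temporary file rather than to a real systemd drop-in directory; only the [Socket] directives themselves (FreeBind and ListenStream) are standard systemd options.

```shell
#!/bin/sh
# Sketch: generate a systemd socket override in a temporary file.
set -e

# Hypothetical values: replace with the VPN address and port actually in use
J2F_INTERNAL_IP="10.0.0.1"
J2F_PORT="4443"

override_file="$(mktemp)"
cat >"$override_file" <<EOF
[Socket]
# Allow binding to an address that is not (yet) configured on any interface
FreeBind=true
# An empty assignment clears the ListenStream values of the packaged unit
ListenStream=
ListenStream=$J2F_INTERNAL_IP:$J2F_PORT
EOF

override_text="$(cat "$override_file")"
rm -f "$override_file"
printf '%s\n' "$override_text"
```

The empty ListenStream= line matters: without it the new address would be added to the ones defined by the packaged unit instead of replacing them.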

Gitea configuration To make gitea use our json2file-go server we go to the project and enter the hooks/gitea/new page; once there we create a new webhook of type gitea, set the target URL to point at the json2file-go listener and, in the secret field, put the token generated with uuid by the setup script:
sed -n -e 's/blogops://p' /etc/json2file-go/dirlist
The rest of the settings can be left as they are:
  • Trigger on: Push events
  • Branch filter: *
Warning: We are using an internal IP and a self-signed certificate; that means we have to check that the webhook section of the app.ini of our gitea server allows us to call the IP and skips the TLS verification (you can see the available options in the gitea documentation). The [webhook] section of my server looks like this:
Once we have the webhook configured we can try it and if it works our json2file server will store the file on the /srv/blogops/webhook/json2file/blogops/ folder.
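The sed call used to read the token can be tried on a sample file. The sketch below assumes the dirlist holds one "directory:token" entry per line, which is what the setup script produces when it splits J2F_DIRLIST on ';'. The tokens shown are made-up placeholders, not real secrets.

```shell
#!/bin/sh
# Sketch: build a sample dirlist and extract the token of the blogops entry.
set -e

# Hypothetical entries, separated by ';' as in the setup script
J2F_DIRLIST="blogops:11111111-2222-3333-4444-555555555555;other:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"

dirlist="$(mktemp)"
echo "$J2F_DIRLIST" | tr ';' '\n' >"$dirlist"

# Same sed expression as above, pointed at the sample file:
# -n suppresses normal output, p prints only lines where the substitution matched
blogops_token="$(sed -n -e 's/blogops://p' "$dirlist")"
rm -f "$dirlist"
echo "$blogops_token"
```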

The json2file spooler script With the previous configuration our system is ready to receive webhook calls from gitea and store the messages in files, but we have to do something to process those files once they are saved on our machine. An option could be to use a cronjob to look for new files, but we can do better on Linux using inotify: we will use the inotifywait command from inotify-tools to watch the json2file output directory and execute a script each time a new file is moved inside it or closed after writing (IN_CLOSE_WRITE and IN_MOVED_TO events). To avoid concurrency problems we are going to use task-spooler to launch the scripts that process the webhooks, running a single job at a time so they are executed one by one in FIFO order. The spooler script is this:
set -e
# ---------
# ---------
# ---------
# ---------
queue_job() {
  echo "Queuing job to process file '$1'"
    tsp -n "$WEBHOOK_COMMAND" "$1"
}
# ----
# ----
if [ ! -d "$INPUT_DIR" ]; then
  echo "Input directory '$INPUT_DIR' does not exist, aborting!"
  exit 1
fi
[ -d "$TSP_DIR" ] || mkdir "$TSP_DIR"
echo "Processing existing files under '$INPUT_DIR'"
find "$INPUT_DIR" -type f | sort | while read -r _filename; do
  queue_job "$_filename"
done
# Use inotifywait to process new files
echo "Watching for new files under '$INPUT_DIR'"
inotifywait -q -m -e close_write,moved_to --format "%w%f" -r "$INPUT_DIR" |
  while read -r _filename; do
    queue_job "$_filename"
  done
# ----
# vim: ts=2:sw=2:et:ai:sts=2
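The first loop of the spooler (the one that handles files already present at startup) can be exercised without tsp or inotifywait. In this sketch record_job is a hypothetical stand-in for queue_job that just logs the order, and the timestamp-like file names are made up; the point is only that piping find through sort makes the backlog processing order deterministic.

```shell
#!/bin/sh
# Sketch: process a backlog of files in sorted order, as the spooler does.
set -e

input_dir="$(mktemp -d)"
order_log="$(mktemp)"

# Stand-in for queue_job: record the file name instead of calling tsp
record_job() {
  basename "$1" >>"$order_log"
}

# Hypothetical names, created out of order on purpose
touch "$input_dir/20220523-100002.json" "$input_dir/20220523-100000.json" \
  "$input_dir/20220523-100001.json"

# The while loop runs in a subshell because of the pipe, so results are
# written to a file instead of being kept in shell variables
find "$input_dir" -type f | sort | while read -r _filename; do
  record_job "$_filename"
done

processed_order="$(paste -s -d ' ' "$order_log")"
rm -rf "$input_dir" "$order_log"
echo "$processed_order"
```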
To run it as a daemon we install it as a systemd service using the following script:
set -e
# ---------
# ---------
# Configuration file values
J2F_USER="$(id -u)"
J2F_GROUP="$(id -g)"
# ----
# ----
# Install packages used with the webhook processor
sudo apt update
sudo apt install -y inotify-tools jq task-spooler
sudo apt clean
# Configure process service
sudo sh -c "cat > $SPOOLER_SERVICE_FILE" <<EOF
Description=json2file processor for $J2F_USER
# Restart and enable service
sudo systemctl daemon-reload
sudo systemctl stop "$SPOOLER_SERVICE_NAME" || true
sudo systemctl start "$SPOOLER_SERVICE_NAME"
sudo systemctl enable "$SPOOLER_SERVICE_NAME"
# ----
# vim: ts=2:sw=2:et:ai:sts=2

The gitea webhook processor Finally, the script that processes the JSON files does the following:
  1. First, it checks if the repository and branch are right,
  2. Then, it fetches and checks out the commit referenced on the JSON file,
  3. Once the files are updated, compiles the site using hugo with docker compose,
  4. If the compilation succeeds the script renames directories to swap the old version of the site by the new one.
If there is a failure the script aborts. Before aborting, or after a successful swap, the system sends an email with a log of what happened to the configured address and/or the user that pushed the updates to the repository. The current script is this one:
set -e
# ---------
# ---------
# Values
# Address that gets all messages, leave it empty if not wanted
# If the following variable is set to 'true' the pusher gets mail on failures
# If the following variable is set to 'true' the pusher gets mail on success
# gitea's conf/app.ini value of NO_REPLY_ADDRESS, it is used for email domains
# when the KeepEmailPrivate option is enabled for a user
# Directories
# Files
TODAY="$(date +%Y%m%d)"
OUTPUT_BASENAME="$(date +%Y%m%d-%H%M%S.%N)"
# Query to get variables from a gitea webhook json
  printf "%s" \
    '(.             | @sh "gt_ref=\(.ref);"),' \
    '(.             | @sh "gt_after=\(.after);"),' \
    '(.repository   | @sh "gt_repo_clone_url=\(.clone_url);"),' \
    '(.repository   | @sh "gt_repo_name=\(.name);"),' \
    '(.pusher       | @sh "gt_pusher_full_name=\(.full_name);"),' \
    '(.pusher       | @sh "gt_pusher_email=\(.email);")'
# ---------
# Functions
# ---------
  echo "$(date -R) $*" >>"$WEBHOOK_LOGFILE_PATH"
    [ -d "$_d" ] || mkdir "$_d"
  # Try to remove empty dirs
    if [ -d "$_d" ]; then
      rmdir "$_d" 2>/dev/null || true
  webhook_log "Accepted: $*"
  webhook_log "Rejected: $*"
  if [ -f "$WEBHOOK_JSON_INPUT_FILE" ]; then
  exit 0
  webhook_log "Deployed: $*"
  webhook_log "Troubled: $*"
  # Add the pusher email address unless it is from the domain NO_REPLY_ADDRESS,
  # which should match the value of that variable on the gitea 'app.ini' (it
  # is the domain used for emails when the user hides it).
  # shellcheck disable=SC2154
  if [ -n "${gt_pusher_email##*@"${NO_REPLY_ADDRESS}"}" ] &&
    [ -z "${gt_pusher_email##*@*}" ]; then
    _user_email="\"$gt_pusher_full_name <$gt_pusher_email>\""
  if [ "$_addr" ] && [ "$_user_email" ]; then
    echo "$_addr,$_user_email"
  elif [ "$_user_email" ]; then
    echo "$_user_email"
  elif [ "$_addr" ]; then
    echo "$_addr"
  if [ "$MAIL_LOGFILE" = "true" ]; then
    to_addr="$(print_mailto "$to_addr")"
  if [ "$to_addr" ]; then
    # shellcheck disable=SC2154
    subject="OK - $gt_repo_name updated to commit '$gt_after'"
    mail -s "${MAIL_PREFIX}${subject}" "$to_addr" \
  if [ "$MAIL_ERRFILE" = true ]; then
    to_addr="$(print_mailto "$to_addr")"
  if [ "$to_addr" ]; then
    # shellcheck disable=SC2154
    subject="KO - $gt_repo_name update FAILED for commit '$gt_after'"
    mail -s "${MAIL_PREFIX}${subject}" "$to_addr" \
# ----
# ----
# Check directories
# Go to the base directory
cd "$BASE_DIR"
# Check if the file exists
if [ ! -f "$WEBHOOK_JSON_INPUT_FILE" ]; then
  webhook_reject "Input arg '$1' is not a file, aborting"
fi
# Parse the file
webhook_log "Processing file '$WEBHOOK_JSON_INPUT_FILE'"
# Check that the repository clone url is right
# shellcheck disable=SC2154
if [ "$gt_repo_clone_url" != "$REPO_CLONE_URL" ]; then
  webhook_reject "Wrong repository: '$gt_repo_clone_url'"
fi
# Check that the branch is the right one
# shellcheck disable=SC2154
if [ "$gt_ref" != "$REPO_REF" ]; then
  webhook_reject "Wrong repository ref: '$gt_ref'"
fi
# Accept the file
# shellcheck disable=SC2154
webhook_accept "Processing '$gt_repo_name'"
# Update the checkout
ret="0"
git fetch >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Repository fetch failed"
fi
# shellcheck disable=SC2154
git checkout "$gt_after" >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Repository checkout failed"
fi
# Remove the build dir if present
if [ -d "$PUBLIC_DIR" ]; then
  rm -rf "$PUBLIC_DIR"
fi
# Build site
docker compose run hugo -- >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
# go back to the main branch
git switch main && git pull
# Fail if public dir was missing
if [ "$ret" -ne "0" ] || [ ! -d "$PUBLIC_DIR" ]; then
  webhook_troubled "Site build failed"
fi
# Remove old public_html copies
webhook_log 'Removing old site versions, if present'
find "$NGINX_BASE_DIR" -mindepth 1 -maxdepth 1 -name 'public_html-*' -type d \
  -exec rm -rf {} \; >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Removal of old site versions failed"
fi
# Switch site directory
TS="$(date +%Y%m%d-%H%M%S)"
if [ -d "$PUBLIC_HTML_DIR" ]; then
  webhook_log "Moving '$PUBLIC_HTML_DIR' to '$PUBLIC_HTML_DIR-$TS'"
  mv "$PUBLIC_HTML_DIR" "$PUBLIC_HTML_DIR-$TS" >>"$WEBHOOK_LOGFILE_PATH" 2>&1 ||
    ret="$?"
fi
if [ "$ret" -eq "0" ]; then
  webhook_log "Moving '$PUBLIC_DIR' to '$PUBLIC_HTML_DIR'"
  mv "$PUBLIC_DIR" "$PUBLIC_HTML_DIR" >>"$WEBHOOK_LOGFILE_PATH" 2>&1 ||
    ret="$?"
fi
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Site switch failed"
else
  webhook_deployed "Site deployed successfully"
fi
# ----
# vim: ts=2:sw=2:et:ai:sts=2
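The address check used before mailing the pusher asks two questions: is the address outside the no-reply domain, and does it contain an "@" at all? The sketch below expresses the same test with a case statement instead of the parameter expansions used in the script, which I find easier to read. The is_mailable and check names and the noreply.example.org domain are hypothetical.

```shell
#!/bin/sh
# Sketch: decide whether a pusher address should receive mail.
set -e

# Hypothetical value; it should match NO_REPLY_ADDRESS in gitea's app.ini
NO_REPLY_ADDRESS="noreply.example.org"

# Succeeds only for addresses that contain an "@" and are not in the
# no-reply domain (the domain is quoted so it is matched literally)
is_mailable() {
  case "$1" in
  *@"$NO_REPLY_ADDRESS") return 1 ;;
  *@*) return 0 ;;
  *) return 1 ;;
  esac
}

check() {
  if is_mailable "$1"; then echo "mail"; else echo "skip"; fi
}

real_addr="$(check "jane@example.com")"
hidden_addr="$(check "jane@noreply.example.org")"
no_addr="$(check "jane")"
echo "$real_addr $hidden_addr $no_addr"
```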

22 May 2022

Russ Allbery: Review: On a Sunbeam

Review: On a Sunbeam, by Tillie Walden
Publisher: Tillie Walden
Copyright: 2016-2017
Format: Online graphic novel
Pages: 544
On a Sunbeam is a web comic that was published in installments between Fall 2016 and Spring 2017, and then later published in dead tree form. I read the on-line version, which is still available for free from its web site. It was nominated for an Eisner Award and won a ton of other awards, including the Los Angeles Times Book Prize. Mia is a new high school graduate who has taken a job with a construction crew that repairs old buildings (that are floating in space, but I'll get to that in a moment). Alma, Elliot, and Charlotte have been together for a long time; Jules is closer to Mia's age and has been with them for a year. This is not the sort of job one commutes to: they live together on a spaceship that travels to the job sites, share meals together, and are more of an extended family than a group of coworkers. It's all a bit intimidating for Mia, but Jules provides a very enthusiastic welcome and some orientation. The story of Mia's new job is interleaved with Mia's school experience from five years earlier. As a new frosh at a boarding school, Mia is obsessed with Lux, a school sport that involves building and piloting ships through a maze to capture orbs. Sent to the principal's office on the first day of school for sneaking into the Lux tower when she's supposed to be at assembly, she meets Grace, a shy girl with sparkly shoes and an unheard-of single room. Mia (a bit like Jules in the present timeline) overcomes Grace's reticence by being persistently outgoing and determinedly friendly, while trying to get on the Lux team and dealing with the typical school problems of bullies and in-groups. On a Sunbeam is science fiction in the sense that it seems to take place in space and school kids build flying ships. It is not science fiction in the sense of caring about technological extrapolation or making any scientific sense whatsoever. The buildings that Mia and the crew repair appear to be hanging in empty space, but there's gravity. 
No one wears any protective clothing or air masks. The spaceships look (and move) like giant tropical fish. If you need realism in your science fiction graphical novels, it's probably best not to think of this as science fiction at all, or even science fantasy despite the later appearance of some apparently magical or divine elements. That may sound surrealistic or dream-like, but On a Sunbeam isn't that either. It's a story about human relationships, found family, and diversity of personalities, all of which are realistically portrayed. The characters find their world coherent, consistent, and predictable, even if it sometimes makes no sense to the reader. On a Sunbeam is simply set in its own universe, with internal logic but without explanation or revealed rules. I kind of liked this approach? It takes some getting used to, but it's an excuse for some dramatic and beautiful backgrounds, and it's oddly freeing to have unremarked train tracks in outer space. There's no way that an explanation would have worked; if one were offered, my brain would have tried to nitpick it to the detriment of the story. There's something delightful about a setting that follows imaginary physical laws this unapologetically and without showing the author's work. I was, sadly, not as much of a fan of the art, although I am certain this will be a matter of taste. Walden mixes simple story-telling panels with sweeping vistas, free-floating domes, and strange, wild asteroids, but she uses a very limited color palette. Most panels are only a few steps away from monochrome, and the colors are chosen more for mood or orientation in the story (Mia's school days are all blue, the Staircase is orange) than for any consistent realism. There is often a lot of detail in the panels, but I found it hard to appreciate because the coloring confused my eye. 
I'm old enough to have been a comics reader during the revolution in digital coloring and improved printing, and I loved the subsequent dramatic improvement in vivid colors and shading. I know the coloring style here is an intentional artistic choice, but to me it felt like a throwback to the days of muddy printing on cheap paper. I have a similar complaint about the lettering: On a Sunbeam is either hand-lettered or closely simulates hand lettering, and I often found the dialogue hard to read due to inconsistent intra- and interword spacing or ambiguous letters. Here too I'm sure this was an artistic choice, but as a reader I'd much prefer a readable comics font over hand lettering. The detail in the penciling is more to my liking. I had occasional trouble telling some of the characters apart, but they're clearly drawn and emotionally expressive. The scenery is wildly imaginative and often gorgeous, which increased my frustration with the coloring. I would love to see what some of these panels would have looked like after realistic coloring with a full palette. (It's worth noting again that I read the on-line version. It's possible that the art was touched up for the print version and would have been more to my liking.) But enough about the art. The draw of On a Sunbeam for me is the story. It's not very dramatic or event-filled at first, starting as two stories of burgeoning friendships with a fairly young main character. (They are closely linked, but it's not obvious how until well into the story.) But it's the sort of story that I started reading, thought was mildly interesting, and then kept reading just one more chapter until I had somehow read the whole thing. There are some interesting twists towards the end, but it's otherwise not a very dramatic or surprising story. What it is instead is open-hearted, quiet, charming, and deeper than it looks. 
The characters are wildly different and can be abrasive, but they invest time and effort into understanding each other and adjusting for each other's preferences. Personal loss drives a lot of the plot, but the characters are also allowed to mature and be happy without resolving every bad thing that happened to them. These characters felt like people I would like and would want to get to know (even if Jules would be overwhelming). I enjoyed watching their lives. This reminded me a bit of a Becky Chambers novel, although it's less invested in being science fiction and sticks strictly to humans. There's a similar feeling that the relationships are the point of the story, and that nearly everyone is trying hard to be good, with differing backgrounds and differing conceptions of good. All of the characters are female or non-binary, which is left as entirely unexplained as the rest of the setting. It's that sort of book. I wouldn't say this is one of the best things I've ever read, but I found it delightful and charming, and it certainly sucked me in and kept me reading until the end. One also cannot argue with the price, although if I hadn't already read it, I would be tempted to buy a paper copy to support the author. This will not be to everyone's taste, and stay far away if you are looking for realistic science fiction, but recommended if you are in the mood for an understated queer character story full of good-hearted people. Rating: 7 out of 10

6 May 2022

Antoine Beaupré: Wallabako 1.4.0 released

I don't particularly like it when people announce their personal projects on their blog, but I'm making an exception for this one, because it's a little special for me. You see, I have just released Wallabako 1.4.0 (and a quick, mostly irrelevant 1.4.1 hotfix) today. It's the first release of that project in almost 3 years (the previous was 1.3.1, before the pandemic). The other reason I figured I would mention it is that I have almost never talked about Wallabako on this blog at all, so many of my readers probably don't even know I sometimes meddle in Golang, which surprises even me sometimes.

What's Wallabako Wallabako is a weird little program I designed to read articles on my E-book reader. I use it to spend less time on the computer: I save articles in a read-it-later app named Wallabag (hosted by a generous friend), and then Wallabako connects to that app, downloads an EPUB version of the book, and then I can read it on the device directly. When I'm done reading the book, Wallabako notices and sets the article as read in Wallabag. I also set it to delete the book locally, but you can actually configure it to keep those books around forever if you feel like it. Wallabako supports syncing read status with the built-in Kobo interface (called "Nickel"), Koreader and Plato. I happen to use Koreader for everything nowadays, but it should work equally well on the others. Wallabako is actually set up to be started by udev when there's a connection change detected by the kernel, which is kind of a gross hack. It's clunky, but it actually works. I thought for a while about switching to something else, but it's really the easiest way to go, and the one that requires the least interaction by the user.

Why I'm (still) using it I wrote Wallabako because I read a lot of articles on the internet. It's actually most of my reading. I read about 10 books a year (which I don't think is much), but I probably read more in terms of time and pages in Wallabag. I haven't actually done the math, but I estimate I spend at least twice as much time reading articles as reading books. If I didn't have Wallabag, I would have hundreds of tabs open in my web browser all the time. So at least that problem is easily solved: throw everything in Wallabag, sort and read later. If I didn't have Wallabako however, I would either spend that time reading on the computer -- which I prefer to spend working on free software or work -- or on my phone -- which is kind of better, but really cramped. I had stopped using (and developing) Wallabako for a while, actually. Around 2019, I got tired of always reading those technical articles (basically work stuff!) at home. I realized I was just not "reading" (as in books! fiction! fun stuff!) anymore, at least not as much as I wanted. So I tried to make this separation: the ebook reader is for cool book stuff. The rest is work. But because I had the Wallabag Android app on my phone and tablet, I could still read those articles there, which I thought was pretty neat. But that meant that I was constantly looking at my phone, which is something I'm generally trying to avoid, as it sets a bad example for the kids (small and big) around me. Then I realized there was one stray ebook reader lying around at home. I had recently bought a Kobo Aura HD to read books, and I like that device. And it's going to stay locked down to reading books. But there's still that old battered Kobo Glo HD reader lying around, and I figured I could just borrow it to read Wallabag articles.

What is this new release But oh boy that was a lot of work. Wallabako was kind of a mess: it was using the deprecated go dep tool, which lost the battle with go mod. Cross-compilation was broken for older devices, and I had to implement support for Koreader.

go mod So I had to learn go mod. I'm still not sure I got that part right: LSP is yelling at me because it can't find the imports, and I'm generally just "YOLO everything" every time I get anywhere close to it. That's not the way to do Go, in general, and not how I like to do it either. But I guess that, given time, I'll figure it out and make it work for me. It certainly works now. I think.

Cross compilation The hard part was different. You see, Nickel uses SQLite to store metadata about books, so Wallabako actually needs to tap into that SQLite database to propagate read status. Originally, I just linked against some sqlite3 library I found lying around. It's basically a wrapper around the C-based SQLite and generally works fine. But that means you actually link your Golang program against a C library. And that's when things get a little nutty. If you would just build Wallabako naively, it would fail when deployed on the Kobo Glo HD. That's because the device runs a really old kernel: the prehistoric Linux kobo #2049 PREEMPT Mon Jan 9 13:33:11 CST 2017 armv7l GNU/Linux. That was built in 2017, but the kernel was actually released in 2010, a whole 5 years before the Glo HD was released, in 2015, which is kind of outrageous. And yes, that is with the latest firmware release. My bet is they just don't upgrade the kernel on those things, as the Glo was probably bought around 2017... In any case, the problem is we are cross-compiling here. And Golang is pretty good about cross-compiling, but because we have C in there, we're actually cross-compiling with "CGO", which means the Go toolchain also has to drive a C cross-compiler (GCC here). And that's much, much harder to figure out because you need to pass down flags into GCC and so on. It was a nightmare. That's until I found this outrageous "little" project. What that thing does (with a hefty dose of dependencies that would make any Debian developer recoil in horror) is to transpile the SQLite C source code to Golang. You read that right: it rewrites SQLite in Go. On the fly. It's nuts. But it works. And you end up with a "pure go" program, and that thing compiles much faster and runs fine on older kernels. I still wasn't sure I wanted to just stick with that forever, so I kept the old sqlite3 code around, behind a compile-time tag. At the top of the nickel_modernc.go file, there's this magic string:
//+build !sqlite3
And at the top of nickel_sqlite3.go file, there's this magic string:
//+build sqlite3
So now, by default, the modernc file gets included, but if I pass --tags sqlite3 to the Go compiler (to go install or whatever), it will actually switch to the other implementation. Pretty neat stuff.

Koreader port The last part was something I was hesitant to do for a long time, but that turned out to be pretty easy. I have basically switched to using Koreader to read everything. Books, PDF, everything goes through it. I really like that it stores its metadata in sidecar files: I synchronize all my books with Syncthing which means I can carry my read status, annotations and all that stuff without having to think about it. (And yes, I installed Syncthing on my Kobo.) The koreader.go port was less than 80 lines, and I could even make a nice little test suite so that I don't have to redeploy that thing to the ebook reader at every code iteration. I had originally thought I should add some sort of graphical interface in Koreader for Wallabako as well, and had requested that feature upstream. Unfortunately (or fortunately?), they took my idea and just ran with it. Some courageous soul actually wrote a full Wallabag plugin for koreader, in Lua of course. Compared to the Wallabako implementation however, the koreader plugin is much slower, probably because it downloads articles serially instead of concurrently. It is, however, much more usable as the user is given visible feedback on the various steps. I still had to enable full debugging to diagnose a problem (which was that I shouldn't have a trailing slash, and that some special characters don't work in passwords). It's also better to write the config file with a normal text editor, over SSH or with the Kobo mounted on your computer, instead of typing those really long strings on the Kobo. There's no sample config file, which makes that harder, but a workaround is to save the configuration with dummy values and fix them up after. Finally I also found the default setting ("Remotely delete finished articles") really dangerous as it can basically lead to data loss (Wallabag article being deleted!) for an unsuspecting user... 
So basically, I started working on Wallabako again because the koreader implementation of their Wallabag client was not up to spec for me. It might be good enough for you, but I guess if you like Wallabako, you should thank the koreader folks for their sloppy implementation, as I'm now working again on Wallabako.

Actual release notes Those are the actual release notes for 1.4.0.
Ship a lot of fixes that have accumulated in the 3 years since the last release. Features:
  • add timestamp and git version to build artifacts
  • cleanup and improve debugging output
  • switch to pure go sqlite implementation, which helps
  • update all module dependencies
  • port to wallabago v6
  • support Plato library changes from 0.8.5+
  • support reading koreader progress/read status
  • Allow containerized builds, use gomod and avoid GOPATH hell
  • overhaul Dockerfile
  • switch to go mod
Documentation changes:
  • remove instability warning: this works well enough
  • README: replace branch name master by main in links
  • tweak mention of libreoffice to clarify concern
  • replace "kobo" references by "nickel" where appropriate
  • make a section about related projects
  • mention NickelMenu
  • quick review of the koreader implementation
  • handle errors in http request creation
  • Use OutputDir configuration instead of hardcoded wallabako paths
  • do not noisily fail if there's no entry for book in plato
  • regression: properly detect read status again after koreader (or plato?) support was added

How do I use this?
This is amazing. I can't believe someone did something that awesome. I want to cover you with gold and Tesla cars and fresh water.
You're weird please stop. But if you want to use Wallabako, head over to the README file which has installation instructions. It basically uses a hack in Kobo e-readers that will happily overwrite their root filesystem as soon as you drop this file named KoboRoot.tgz in the .kobo directory of your e-reader. Note that there is no uninstall procedure and it messes with the reader's udev configuration (to trigger runs on wifi connect). You'll also need to create a JSON configuration file and configure a client in Wallabag. And if you're looking for Wallabag hosting, offers a 14-day free trial. You can also, obviously, host it yourself. Which is not the case for Pocket, even years after Mozilla bought the company. All this wouldn't actually be necessary if Pocket was open-source because Nickel actually ships with a Pocket client. Shame on you, Mozilla. But you still make an awesome browser, so keep doing that.

22 April 2022

Jonathan Dowland: 3D-printed replacement battery cover

Print next to the original
new cover in the light
My first self-designed functional 3D print is a replacement battery cover for an LED fake-candle that my daughter uses as a night-light. I measured the original cover (we have three of the candles) using a newly-purchased micrometer and tried to re-create it in OpenSCAD. I skipped the screw-hole that is for securing the cover as we don't use that. I sliced it using Cura and printed it using PETG on our office 3D printer, an Ender 3. Print time was about an hour. To my amazement, my first take fits snugly! I've uploaded the OpenSCAD source here: batterycover.scad. It's covered under the terms of Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).

20 April 2022

Russell Coker: Android Without Play

A while ago I was given a few reasonably high-end Android phones to give away. I gave two very nice phones to someone who looks after refugees so a couple of refugee families could make video calls to relatives. The third phone is a Huawei Nova 7i [1] which doesn't have the Google Play Store. The Nova 7i is a ridiculously powerful computer (8G of RAM in a phone!!!) but without the Google Play Store it's not much use to the average phone user. It has the Huawei App Gallery, which isn't as bad as most of the proprietary app stores of small players in the Android world; it has SnapChat, TikTok, Telegram, Alibaba, WeChat, and Grays auction (an app I didn't even know existed) along with many others. It also links to ApkPure (apparently a 3rd party app installer that obtains APK files for major commercial apps) for Facebook among others. The ApkPure thing might be Huawei outsourcing the violation of Facebook terms of service. For the moment I've decided to only use free software on this phone and use my old phone for non-free stuff (Facebook, LinkedIn, etc). The eventual aim is that I can carry only a phone with free software for normal use and carry a second phone if I'm active on LinkedIn or something. My recollection is that when I first got the phone (almost 2 years ago) it didn't have such a range of apps.
The first thing to install was F-Droid [2] as the app repository. F-Droid has a repository of thousands of free software Android apps as well as some apps that are slightly less free, which are tagged appropriately. You can install the F-Droid app from the web site. As an aside, I had to go to settings and enable "force old index format" to get the list of packages; I don't know why, as other phones had worked without it. Here are the F-Droid apps I installed:
Future Plans
The current main things I'm missing are a calendar, a contact list, and a shared note taking system (like Google Keep). For calendaring and a contact list the CalDAV and CardDAV protocols seem best. The most common implementation on the server side appears to be DAViCal [5]. The Nextcloud system supports CalDAV, CardDAV, web editing of notes and documents (including LibreOffice if you install that plugin) [6]. But it is huge and demands write access to all its own code (bad for security), and it's not packaged for Debian. Also, in my tests it gave me an error 401 when I tried to authenticate to it from the Android Nextcloud client. I've seen a positive review about Radicale, a simple CalDAV and CardDAV server that doesn't need a database [7]. I prefer the Unix philosophy of keeping things simple with file storage unless there's a real need for anything else. I don't think that anything I ever do with calendaring will require the PostgreSQL database that DAViCal uses. I'll give Radicale a go for CalDAV and CardDAV, but I still need something for shared notes (shopping lists etc). Suggestions welcome.
Current Status
Lack of a contacts list is a major loss of functionality in a phone. I could store contacts in the phone memory or on the SIM, but I would still have to get all my old contacts in there, and getting something half working reduces motivation for getting it working properly. Lack of a calendar is also a problem; again I could work around that by exporting all my Google calendars as iCal URLs, but I'd rather get it working correctly. The lack of shared notes may be a harder problem to solve given the failure of Nextcloud. For that I would consider just having the web site always open in Mozilla, at least in the short term. At the moment I require two phones: my new Android phone without Google and the old one for my contacts list etc. Hopefully in a week or so I'll have my new phone doing contacts, calendaring, and notes. Then my old phone will just be for proprietary apps which I don't need most of the time, and I can leave it at home when I don't need that sort of thing.

8 April 2022

Reproducible Builds: Reproducible Builds in March 2022

Welcome to the March 2022 report from the Reproducible Builds project! In our monthly reports we outline the most important things that we have been up to over the past month.
The in-toto project was accepted as an incubating project within the Cloud Native Computing Foundation (CNCF). in-toto is a framework that protects the software supply chain by collecting and verifying relevant data. It does so by enabling libraries to collect information about software supply chain actions and then allowing software users and/or project managers to publish policies about software supply chain practices that can be verified before deploying or installing software. The CNCF hosts a number of critical components of the global technology infrastructure under the auspices of the Linux Foundation. (View full announcement.)
Hervé Boutemy posted to our mailing list with an announcement that the Java Reproducible Central has hit the milestone of "500 fully reproduced builds of upstream projects". Indeed, at the time of writing, according to the nightly rebuild results, 530 releases were found to be fully reproducible, with 100% reproducible artifacts.
GitBOM is a relatively new project to enable build tools to trace every source file that is incorporated into build artifacts. As an experiment and/or proof-of-concept, the GitBOM developers are rebuilding Debian to generate side-channel build metadata for versions of Debian that have already been released. This only works because Debian is (partially) reproducible, so one can be sure that, in the case where build artifacts are identical, any metadata generated during these instrumented builds applies to the binaries that were built and released in the past. More information on their approach is available in the README file in the bomsh repository.
Ludovic Courtès has published an academic paper discussing how the performance requirements of high-performance computing are not (as usually assumed) at odds with reproducible builds. The received wisdom is that vendor-specific libraries and platform-specific CPU extensions have resulted in a culture of local recompilation to ensure the best performance, rendering the property of reproducibility unobtainable or even meaningless. In his paper, Ludovic explains how Guix has:
[…] implemented what we call package multi-versioning for C/C++ software that lacks function multi-versioning and run-time dispatch […]. It is another way to ensure that users do not have to trade reproducibility for performance. (full PDF)

Kit Martin posted to the FOSSA blog a post titled "The Three Pillars of Reproducible Builds". Inspired by the "shock of infiltrated or intentionally broken NPM packages, supply chain attacks, long-unnoticed backdoors", the post goes on to outline the high-level steps that lead to a reproducible build:
It is one thing to talk about reproducible builds and how they strengthen software supply chain security, but it's quite another to effectively configure a reproducible build. Concrete steps for specific languages are a far larger topic than can be covered in a single blog post, but today we'll be talking about some guiding principles when designing reproducible builds. […]
The article was discussed on Hacker News.
Finally, Bernhard M. Wiedemann noticed that the GNU Helloworld project varies depending on whether it is being built during a full moon! (Reddit announcement, openSUSE bug report)

Events
There will be an in-person "Debian Reunion" in Hamburg, Germany later this year, taking place from 23-30 May. Although this is a "Debian" event, there will be some folks from the broader Reproducible Builds community and, of course, everyone is welcome. Please see the event page on the Debian wiki for more information. Bernhard M. Wiedemann posted to our mailing list about a meetup for Reproducible Builds folks at the openSUSE conference in Nuremberg, Germany. It was also recently announced that DebConf22 will take place this year as an in-person conference in Prizren, Kosovo. The pre-conference meeting (or "DebCamp") will take place from 10-16 July, and the main talks, workshops, etc. will take place from 17-24 July.

Misc news
Holger Levsen updated the Reproducible Builds website to improve the documentation for the SOURCE_DATE_EPOCH environment variable, both by expanding parts of the existing text as well as clarifying meaning by removing text in other places. In addition, Chris Lamb added a Twitter Card to our website's metadata too. On our mailing list this month:

Distribution work
In Debian this month:
  • Johannes Schauer Marin Rodrigues posted to the debian-devel list mentioning that he exploited the property of reproducibility within Debian to demonstrate that automatically converting a large number of packages to a new internal source version did not change the resulting packages. The proposed change could therefore be applied without causing breakage:
So now we have 364 source packages for which we have a patch and for which we can show that this patch does not change the build output. Do you agree that with those two properties, the advantages of the 3.0 (quilt) format are sufficient such that the change shall be implemented at least for those 364?
In openSUSE, Bernhard M. Wiedemann posted his usual monthly reproducible builds status report.

Tooling
diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 207, 208 and 209 to Debian unstable, as well as made the following changes to the code itself:
  • Update minimum version of Black to prevent test failure on Ubuntu jammy.
  • Updated the R test fixture for the 4.2.x series of the R programming language.
Brent Spillner also worked on adding graceful handling for UNIX sockets and named pipes to diffoscope. Vagrant Cascadian also updated the diffoscope package in GNU Guix.
reprotest is the Reproducible Builds project's end-user tool to build the same source code twice in widely different environments and check whether the binaries produced by the builds have any differences. This month, Santiago Ruano Rincón added a new --append-build-command option, which was subsequently uploaded to Debian unstable by Holger Levsen.
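The final step of the "build twice, then compare" idea described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not how reprotest or diffoscope are implemented: the function names `tree_digests` and `differing_files` are hypothetical, and the sketch assumes the two builds have already been written to two separate directories.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(build_a, build_b):
    """Return the sorted relative paths that are missing or differ."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    # A path counts as differing if it exists on only one side
    # or if its digest is not identical on both sides.
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty result means the two output trees are bit-for-bit identical. A non-empty result only says which files differ; explaining why is the job of a content-aware tool such as diffoscope.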

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework
The Reproducible Builds project runs a significant testing framework to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Replace a local copy of the dsa-check-running-kernel script with a packaged version.
    • Don't hide the status of offline hosts in the Jenkins shell monitor.
    • Detect undefined service problems in the node health check.
    • Update the sources.list file for our mail server as it's still running Debian buster.
    • Add our mail server to our node inventory so it is included in the Jenkins maintenance processes.
    • Remove the debsecan package everywhere; it got installed accidentally via the Recommends relation.
    • Document the usage of the osuosl174 host.
Regular node maintenance was also performed by Holger Levsen, Vagrant Cascadian and Mattia Rizzolo.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 April 2022

Kees Cook: security things in Linux v5.10

Previously: v5.9
Linux v5.10 was released in December, 2020. Here's my summary of various security things that I found interesting:
AMD SEV-ES
While guest VM memory encryption with AMD SEV has been supported for a while, Joerg Roedel, Thomas Lendacky, and others added register state encryption (SEV-ES). This means it's even harder for a VM host to reconstruct a guest VM's state.
x86 static calls
Josh Poimboeuf and Peter Zijlstra implemented static calls for x86, which operate very similarly to the "static branch" infrastructure in the kernel. With static branches, an if/else choice can be hard-coded, instead of being run-time evaluated every time. Such branches can be updated too (the kernel just rewrites the code to switch around the "branch"). All these principles apply to static calls as well, but they're for replacing indirect function calls (i.e. a call through a function pointer) with a direct call (i.e. a hard-coded call address). This eliminates the need for Spectre mitigations (e.g. RETPOLINE) for these indirect calls, and avoids a memory lookup for the pointer. For hot-path code (like the scheduler), this has a measurable performance impact. It also serves as a kind of Control Flow Integrity implementation: an indirect call got removed, and the potential destinations have been explicitly identified at compile-time.
network RNG improvements
In an effort to improve the pseudo-random number generator used by the network subsystem (for things like port numbers and packet sequence numbers), Linux's home-grown pRNG has been replaced by the SipHash round function, and perturbed by (hopefully) hard-to-predict internal kernel states. This should make it very hard to brute force the internal state of the pRNG and make predictions about future random numbers just from examining network traffic. Similarly, ICMP's global rate limiter was adjusted to avoid leaking details of network state, as a start to fixing recent DNS Cache Poisoning attacks.
SafeSetID handles GID
Thomas Cedeno improved the SafeSetID LSM to handle group IDs (which required teaching the kernel about which syscalls were actually performing setgid). Like the earlier setuid policy, this lets the system owner define an explicit list of allowed group ID transitions under CAP_SETGID (instead of to just any group), providing a way to keep the power of granting this capability much more limited. (This isn't complete yet, though, since handling setgroups() is still needed.)
improve kernel's internal checking of file contents
The kernel provides LSMs (like the Integrity subsystem) with details about files as they're loaded. (For example, loading modules, new kernel images for kexec, and firmware.) There wasn't very good coverage for cases where the contents were coming from things that weren't files. To deal with this, new hooks were added that allow the LSMs to introspect the contents directly, and to do partial reads. This will give the LSMs much finer-grained visibility into these kinds of operations.
set_fs removal continues
With the earlier work landed to free the core kernel code from set_fs(), Christoph Hellwig made it possible for set_fs() to be optional for an architecture. Subsequently, he removed set_fs() entirely for x86, riscv, and powerpc. These architectures will now be free from the entire class of "kernel address limit" attacks that only needed to corrupt a single value in struct thread_info.
sysfs_emit() replaces sprintf() in /sys
Joe Perches tackled one of the most common bug classes with sprintf() and snprintf() in /sys handlers by creating a new helper, sysfs_emit(). This will handle the cases where kernel code was not correctly dealing with the length results from sprintf() calls, which might lead to buffer overflows in the PAGE_SIZE buffer that /sys handlers operate on. With the helper in place, it was possible to start the refactoring of the many sprintf() callers.
nosymfollow mount option
Mattias Nissler and Ross Zwisler implemented the nosymfollow mount option. This entirely disables symlink resolution for the given filesystem, similar to other mount options where noexec disallows execve(), nosuid disallows setid bits, and nodev disallows device files. Quoting the patch, it is "useful as a defensive measure for systems that need to deal with untrusted file systems in privileged contexts." (i.e. for when /proc/sys/fs/protected_symlinks isn't a big enough hammer.) Chrome OS uses this option for its stateful filesystem, as symlink traversal has been a common attack-persistence vector.
ARMv8.5 Memory Tagging Extension support
Vincenzo Frascino added support to arm64 for the coming Memory Tagging Extension, which will be available for ARMv8.5 and later chips. It provides 4 bits of tags (covering multiples of 16 byte spans of the address space). This is enough to deterministically eliminate all linear heap buffer overflow flaws (1 tag for "free", and then rotate even values and odd values for neighboring allocations), which is probably one of the most common bugs being currently exploited. It also makes use-after-free and over/under indexing much more difficult for attackers (but still possible if the target's tag bits can be exposed). Maybe some day we can switch to 128 bit virtual memory addresses and have fully versioned allocations. But for now, 16 tag values is better than none, though we do still need to wait for anyone to actually be shipping ARMv8.5 hardware.
fixes for flaws found by UBSAN
The work to make UBSAN generally usable under syzkaller continues to bear fruit, with various fixes all over the kernel for stuff like shift-out-of-bounds, divide-by-zero, and integer overflow. Seeing these kinds of patches land reinforces the rationale of shifting the burden of these kinds of checks to the toolchain: these run-time bugs continue to pop up.
flexible array conversions
The work on flexible array conversions continues. Gustavo A. R. Silva and others continued to grind on the conversions, getting the kernel ever closer to being able to enable the -Warray-bounds compiler flag and clear the path for saner bounds checking of array indexes and memcpy() usage.
That's it for now! Please let me know if you think anything else needs some attention. Next up is Linux v5.11.

© 2022, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.

24 March 2022

Ingo Juergensmann: New Server NVMe Issues

My current server is somewhat aged. I bought it new in July 2014 with a 6-core Xeon E5-2630L, 32 GB RAM and 4x 3.5" hot-swappable drives. Luckily I had the opportunity to extend the memory to 128 GB RAM at no additional cost by using memory from my ex-employer. It also has 4x 2 TB WD Red HDDs with 5400 rpm hooked up to the SATA backplane, but unfortunately only two of them are SATA-3 with 6 Gbit/s. The new server is a used/refurbished Supermicro server with 2x 14-core Xeon E5-2683 and 256 GB RAM and 4x 3.5" hot-swappable drives. It also came with a Hardware-RAID SAS/SATA 8-port controller with BBU. I also ordered two slim drive kits (MCP-220-81504-0N & MCP-220-81506-0N) to be able to use 2x 3.5" slots for rotational HDDs as cheap storage. Right now I have added 2x 128 GB Supermicro SATA DOMs, 4x WD Red 4 TB SSDs and a Sonnet Fusion 4x4 Silent with 4x 1 TB Seagate Firecuda 520 NVMe disks.
And here the issue starts: The NVMe disks should be capable of 4-5 GB/s, but they are connected to a PCIe 3.0 x16 port via the Sonnet Fusion 4x4, which itself features a PCIe bridge, so bifurcation is not necessary. When doing some tests with bonnie++ I get around 1 GB/s transfer rates out of a RAID10 setup with all 4 NVMes. In fact, regardless of the RAID level there are only transfer rates of about 1-1.2 GB/s with bonnie++. (All software RAIDs with mdadm.) Also, when constructing a RAID, each NVMe gives around 300-600 MB/s in sync speed, with one exception: RAID1. Regardless of how many NVMe disks are in a RAID1 setup, the sync speed is up to 2.5 GB/s for each of the NVMe disks. So the lower transfer rates with bonnie++ or other RAID levels shouldn't be limited by bus speed nor by CPU speed. Alas, atop shows up to 100% CPU usage for all tests. I even tested In my understanding, RAID10 should perform similarly to RAID1 in terms of syncing and better in bonnie++ tests (up to 2x write and 4x read speed compared to a single disk).
For the bonnie++ tests I even made some tests that are available here. You can find the test parameters listed in the hostname column: Baldur is the hostname, then followed by the layout (near-2, far-2, offset-2), chunk size and concurrency of bonnie++. In the end there was no big impact of the chunk size of the RAID. So now I'm wondering what the reason for the slow performance of those 4x NVMe disks is. Bus speed of the PCIe 3.0 x16 shouldn't be the cause, because I assume that the software RAID will need to transfer the blocks in RAID1 as well as in RAID10 over the bus. Same goes for the CPU: the amount of CPU work should be roughly the same for RAID1 and for RAID10. RAID10 should even have an advantage because the blocks only need to be synced to 2 disks in a stripe set. Bonnie++ tests are a different topic for sure. But when testing reading with dd from the md-devices I only get around 1-1.5 GB/s as well. Even when using LVM RAID instead of LVM on top of md RAID. All NVMe disks are already set to 4k and the IO scheduler is set to mq-deadline. Is there anything I could do to improve the performance of the NVMe disks? On the other hand, pure transfer rates are not that important to a server that runs a dozen VMs. Here the improved IOPS performance over rotational disks is a clear performance gain. But I'm still curious if I could get maybe 2 GB/s out of a RAID10 setup with the NVMe disks. Then again, having two independent RAID1 setups for MariaDB and for PostgreSQL databases might be a better choice over a single RAID10 setup?
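The claim that the PCIe 3.0 x16 bus should not be the bottleneck can be checked with some back-of-the-envelope arithmetic. The sketch below uses the published PCIe 3.0 figures (8 GT/s per lane, 128b/130b encoding); the even split of four lanes per NVMe disk behind the bridge is an assumption on my part, not something stated in the post, and the function name is made up for the illustration.

```python
# PCIe 3.0 signals at 8 GT/s per lane and uses 128b/130b encoding.
TRANSFERS_PER_SECOND = 8e9
ENCODING_EFFICIENCY = 128 / 130


def pcie3_bandwidth_gb_per_s(lanes):
    """Theoretical one-direction PCIe 3.0 bandwidth in GB/s."""
    bits_per_second = TRANSFERS_PER_SECOND * ENCODING_EFFICIENCY * lanes
    return bits_per_second / 8 / 1e9


slot = pcie3_bandwidth_gb_per_s(16)      # the x16 slot holding the card
per_disk = pcie3_bandwidth_gb_per_s(4)   # ASSUMPTION: x4 lanes per NVMe disk
print(f"x16 slot: {slot:.2f} GB/s, per disk (x4): {per_disk:.2f} GB/s")
# prints "x16 slot: 15.75 GB/s, per disk (x4): 3.94 GB/s"
```

These are upper bounds that ignore packet and protocol overhead, so real throughput will be lower. Even so, they sit well above the 1-1.5 GB/s observed, which is consistent with the suspicion that the bus itself is not what limits the RAID10 results.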

15 March 2022

Russell Coker: Librem 5 First Impression

I just received the Purism Librem 5 that I paid for years ago (I think it was 2018). The process of getting the basic setup done was typical (choosing keyboard language, connecting to wifi, etc). Then I tried doing things. One thing I did was update to the latest PureOS release, which gave me a list of the latest Debian packages installed, which is nice. The first problem I found was the lack of notification when the phone is trying to do something. I'd select an app to launch, nothing would happen, then a few seconds later it would appear. When I go to the PureOS app store and get a list of apps in a category, nothing happens for ages (it shows a blank list), then it might show actual apps, or it might not. I don't know what it's doing, maybe downloading a list of apps; if so it should display how many apps have had their summary data downloaded or how many KB of data have been downloaded so I know if it's doing something and how long it might take.
Running any of the productivity applications requires a GNOME keyring. I selected a keyring password of a few characters and it gave a warning about having no password (does this mean it took 3 characters to be the same as 0 characters?). Then I couldn't unlock it later. I tried deleting the key file and creating a new one with a single character password and got the same result. I think that such keyring apps have little benefit: all processes in the session have the same UID and presumably can use ptrace to get data from each other's memory space. If the keyring program was SETGID and the directory used to store the keyring files was a system directory with execute access only for that group then it might provide some benefit (SETGID means that ptrace is denied). Ptrace is denied for the keyring, but relying on a user space prompt for the passphrase to a file that the user can read seems of minimal benefit, as a hostile process could read the file and prompt for the passphrase. This is probably more of a Debian issue, and I reproduced the keyring issue with my Debian workstation.
The Librem 5 is a large phone (unusually thick by modern phone standards) and is rumoured to be energy hungry. When I tried charging it from the USB port on my PC (HP ML110 Gen9) the charge level went down. I used the same USB port and USB cable that I use to charge my Huawei Mate 10 Pro every day, so other phones can draw power from that USB port and cable faster than they use it. The on-screen keyboard for the Librem 5 is annoying: it doesn't have a TAB key and the cursor control keys are unreasonably small. The keyboard used by ConnectBot (the most popular SSH client for Android) is much better; it has its own keys for CTRL, ESC, TAB, arrows, HOME, and END in addition to the regular on-screen keyboard. The Librem 5 comes with a terminal app by default which is much more difficult to use than it should be due to the lack of TAB filename completion etc.
The phone has a specified temperature range of 0C to 35C, which is not great for Australia where even the cooler cities have summer temperatures higher than that. When charging on a fast charger (one that can provide energy faster than the phone uses it) the phone gets quite warm. It feels like more than 10C hotter than the ambient temperature, so I guess I can't charge it on Monday afternoon when the forecast is 31C! Maybe I should put a USB charger by my fridge with a long enough cable that I can charge a phone that's inside the fridge, seriously. Having switches to disable networking is a good security feature, and designing the phone with separate components that can't interfere with each other is good too. There are reports that software fixes will reduce the electricity use, which will alleviate the problems with charging and temperature. Most of my problems are clearly software related and therefore things that I can fix (in theory at least, I don't have unlimited coding time). Overall this wasn't the experience I had hoped for after spending something like $700 and waiting about 4 years (so long that I can't easily find the records of how long and how much money).
Getting It Working
It seems that the PureOS app store app doesn't work properly. I can visit the app site and then select an app to install, which then launches the app store app to do the install, which failed for every app I tried. Then I tried going to the terminal and running the following:
sudo bash
apt update
apt install openssh-server netcat
So I should be able to use APT to install everything I want and use the PureOS web site as a guide to what is expected to work on the phone. As an aside, the PureOS apt repository, which they call Byzantium, appears to be a mirror or rebuild of the Debian/Bullseye arm64 archive without non-free components. Then I could ssh to my phone via ssh purism@purism (after adding an entry to /etc/hosts with the name purism and a static entry in my DHCP configuration to match) and run sudo bash to get root. To be able to log in to root directly I had to install an ssh key (root is set up to log in without a password) and run usermod --expiredate= root (empty string for expire date) to have direct root logins. I put the following in /etc/ssh/sshd_config.d/local.conf to restrict who can log in (I added the users I want to the sshusers group). It also uses the ClientAlive checks because having sessions break due to IP address changes is really common with mobile devices and we don't want disconnected sessions taking up memory forever.
AllowGroups sshusers
PasswordAuthentication no
UseDNS no
ClientAliveInterval 60
ClientAliveCountMax 6
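As a rough check of what the ClientAlive settings above amount to: sshd sends a keepalive request after ClientAliveInterval seconds of silence and drops the session once ClientAliveCountMax of them go unanswered, so a dead session lingers for about the product of the two values. The small Python sketch below computes that; `client_alive_timeout` is a hypothetical helper with a deliberately naive parser (it ignores Match blocks and `key=value` syntax), and the fallback values are the OpenSSH defaults.

```python
LOCAL_CONF = """\
AllowGroups sshusers
PasswordAuthentication no
UseDNS no
ClientAliveInterval 60
ClientAliveCountMax 6
"""


def client_alive_timeout(config_text):
    """Approximate seconds before sshd drops an unresponsive client."""
    settings = {}
    for line in config_text.splitlines():
        parts = line.split()
        # Naive parsing: only plain "Keyword value" lines are considered.
        if len(parts) == 2 and not parts[0].startswith("#"):
            settings[parts[0].lower()] = parts[1]  # keywords are case-insensitive
    interval = int(settings.get("clientaliveinterval", 0))   # OpenSSH default: 0
    count_max = int(settings.get("clientalivecountmax", 3))  # OpenSSH default: 3
    return interval * count_max


print(client_alive_timeout(LOCAL_CONF))  # prints 360
```

So with this configuration a session whose client has silently vanished should be cleaned up after roughly six minutes.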
Notifications
The GNOME notification system is used for notifications in the phone UI. So if you install the package libnotify-bin you get a utility notify-send that allows sending notifications from shell scripts.
Final Result
Now it basically works as a Debian workstation with a single-button mouse. So I just need to configure it as I would a Debian system and fix bugs along the way. Presumably any bug fixes I get into Debian will appear in PureOS shortly after the next Debian release.
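For the notify-send utility mentioned in the Notifications section, a script-side wrapper might look like the following. The post talks about shell scripts, where the call is simply `notify-send "summary" "body"`; this Python sketch does the same through `subprocess`. The function names are hypothetical, and the only notify-send feature assumed beyond the summary and body arguments is its `--urgency` option.

```python
import shutil
import subprocess

URGENCY_LEVELS = ("low", "normal", "critical")


def build_notify_command(summary, body="", urgency="normal"):
    """Build the argument list for a notify-send call."""
    if urgency not in URGENCY_LEVELS:
        raise ValueError(f"urgency must be one of {URGENCY_LEVELS}")
    command = ["notify-send", f"--urgency={urgency}", summary]
    if body:
        command.append(body)
    return command


def send_notification(summary, body="", urgency="normal"):
    """Send a desktop notification; return False if notify-send is missing."""
    if shutil.which("notify-send") is None:
        return False  # libnotify-bin is probably not installed
    subprocess.run(build_notify_command(summary, body, urgency), check=True)
    return True
```

Passing the arguments as a list avoids shell quoting problems when the summary or body contains spaces or quotes.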

16 February 2022

Joerg Jaspert: Funny CPU usage - rewrite it in rust

Munin plugin and it s CPU usage (and a rewrite in rust) With my last blog on the Munin plugins CPU usage I complained about Oracle Linux doing something really weird, driving up CPU usage when running a fairly simple Shell script with a loop in. Turns out, I was wrong. It is not OL7 that makes this problem show up. It appears to be something from the Oracle Enterprise Database installed on the system, that makes it go this crazy. I ve now had this show up on RedHat7 systems too, and the only thing that singles them out is that overpriced index card system on it. I still don t know what the actual reason for this is, and honestly, don t have enough time to dig deep into it. It is not something that a bit of debugging/tracing finds - especially as it does start out all nice, and accumulates more CPU usage over time. Which would suggest some kind of leak leading to more processing needed, or so - but then it is only CPU affected, not memory, and ONLY on systems with that database on. Meh. Well, I recently (December vacation) got me to look deeper into learning Rust. My first project with that was a multi-threaded milter to do some TLS checks on outgoing mails (kind of fun customer requirements there), and heck, Rust did make that a surprisingly easy task in the end. (Comparing the old, single-threaded C code with my multi-threaded Rust version, a third of the code length doing more, and being way easier to extend with wanted new features is nice). So my second project was Replace this shell script with a Rust binary doing the same . Hell yeah. Didn t take that long and looks good (well, the result. Not sure about the code. People knowing rust may possibly scratch out eyes when looking at it). Not yet running for that long, but even compared to the shell on systems that did not show the above mentioned bugs (read: Debian, without Oracle foo), uses WAY less CPU (again, mentioned by highly accurate outputs of the top command). 
So longer term I hope this version won't run into the same problems as the shell one. Time will tell. If you are interested in the code, go find it here, and if you happen to know Rust and not run away screaming, I'm happy for tips and code fixes; I'm sure this can be improved lots. (At least cargo clippy is happy, so the basics are done.) Update: According to munin, the Rust version creates 14 forks/second fewer than the shell one. And the fork rate change is the same on machines with/without the database. That 14 is more than I would have guessed. CPU usage is as expected: only on the problem hosts with Oracle Database installed can you see a huge difference; otherwise it is not an easily noticeable difference. That is, on an otherwise idle host (munin graph shows average use of low one-digit numbers), one can see a drop of around 1% in the CPU usage graph from munin. Oh well, poor shell.

13 February 2022

Gunnar Wolf: Got to boot a RPi Zero 2 W with Debian

About a month ago, I got tired of waiting for the newest member of the Raspberry product lineup to be sold in Mexico, and I bought it from a Chinese reseller through a big online shopping platform. I paid quite a bit of premium (~US$85 instead of the advertised US$15), and got it delivered ten days later. Anyway, it's known this machine does not yet boot mainline Linux. The vast majority of ARM systems require the bootloader to load a Device Tree file, presenting the hardware characteristics map. And while the RPi Zero 2 W (hey, what an awful and confusing naming scheme they chose!) is mostly similar to a RPi3B+, it is not quite the same thing. A kernel with the RPi3B+'s device tree will refuse to boot. Anyway, I started digging, and found that some days ago Stephan Wahren sent a patch to the linux-arm-kernel mailing list with a matching device tree. Read the patch! It's quite simple to read (what is harder is to know where each declaration should go, if you want to write your own, of course). It basically includes all basic details for the main chip in the RPi3 family (BCM2837), also pulls in the declarations from the BCM2836 present in the RPi2, and adds the necessary bits for the USB OTG connection and the WiFi and Bluetooth declarations. It registers the model name as "Raspberry Pi Zero 2 W", which you can easily see in the following photo, informs the kernel it has 512MB RAM, and... Well, really, it's an easy device tree to read, don't be shy! So, I booted my RPi 3B+ with a freshly downloaded Bookworm image, installed and unpacked linux-source-5.15, applied Stephan's patch, and added the following for the DTB to be generated in the arm64 tree as well:
--- /dev/null   2022-01-26 23:35:40.747999998 +0000
+++ arch/arm64/boot/dts/broadcom/bcm2837-rpi-zero-2-w.dts       2022-02-13 06:28:29.968429953 +0000
@@ -0,0 +1 @@
+#include "arm/bcm2837-rpi-zero-2-w.dts"
Then, I ran a simple make dtbs, and... it failed, because bcm283x-rpi-wifi-bt.dtsi is not yet in the kernel. OK, no worries: getting wireless to work is a later step. I commented out the lines causing conflict (10, 33-35, 134-136), and:
root@rpi3-20220212:/usr/src/linux-source-5.15# make dtbs
  DTC     arch/arm64/boot/dts/broadcom/bcm2837-rpi-zero-2-w.dtb
Great! I just copied that generated file over to /boot/firmware/, moved the SD over to my RPiZ2W, and behold! It boots! When I bragged about it in #debian-raspberrypi, steev suggested I pull in the WiFi patch, which has also been submitted (but not yet accepted) for kernel inclusion. I did so, uncommented the lines I modified, and built again. It builds correctly, and I again copied the DTB over. It still does not find the WiFi; dmesg still complains about a missing bit of firmware (failed to load brcm/brcmfmac43430b0-sdio.raspberrypi,model-zero-2-w.bin). Steev pointed out it can be downloaded from RPi Distro's GitHub page, but I called it a night and didn't pursue it any further ;-) So I understand this post is still a far cry from saying "our images properly boot under a RPi 0 2 W", but we will get there.

11 February 2022

Ingo Juergensmann: Old Buildd.Net Database

Since March/April 2000 I was deeply involved in Debian m68k and operated multiple m68k autobuilders for over a decade. In fact my Amiga 3000 named "arrakis" was the second buildd for m68k, in addition to the Debian-owned Amiga 3000UX named "kullervo". Back then there was a small website running on Kullervo to display some information about the Debian autobuilders. After some time we (as m68k porters) moved that webpage away from Kullervo to my root server. Step by step this site evolved into Buildd.Net and was extended to other archs and to suites besides unstable, like backports or non-volatile. The project got more and more complex, and the complete rewrite it needed was beyond my ability. So, in 2016 I asked for adoption of the project and in 2018 I shut it down, because (apparently) there was nobody taking over. From November 2005 until January 2018 I do have entries in my PostgreSQL database for Buildd.Net. I think the data in the database might be interesting for those who want to examine it. You can use the data to see how build times have increased over time, which e.g. led to the expulsion of m68k as a release arch, because the arch couldn't keep up anymore. I could imagine that you could do other interesting analyses with that data. For example, how new versions of the toolchain increased the build times, maybe even whether a specific version of e.g. binutils or gcc had a positive effect on certain archs, but a negative effect on others. If there is interest in this data I could open the database to the public or even upload a dump of the database so that you can download and install it on your own.

26 January 2022

Timo Jyrinki: Unboxing Dell XPS 13 - openSUSE Tumbleweed alongside preinstalled Ubuntu

A look at the 2021 model of Dell XPS 13 - available with Linux pre-installed
I received a new laptop for work - a Dell XPS 13. Dell has long been famous for offering certain models with pre-installed Linux as a supported option, and opting for those is nice for moving some euros/dollars from a certain PC desktop OS monopoly towards Linux desktop engineering costs. Notably Lenovo also offers Ubuntu and Fedora options on many models these days (like Carbon X1 and P15 Gen 2).
[Photos: black box; opened box; accessories and a leaflet about Linux support; laptop lifted from the box, closed; laptop with lid open; Ubuntu running; openSUSE running]
Obviously a smooth, ready-to-rock Ubuntu installation is nice for most people already, but I need openSUSE, so after checking everything is fine with Ubuntu, I continued to install openSUSE Tumbleweed as a dual boot option. As I'm a funny little tinkerer, I obviously went with some special things. I wanted:
  • Ubuntu to remain as the reference supported OS on a small(ish) partition, useful to compare to if trying out new development versions of software on openSUSE and finding oddities.
  • openSUSE as the OS consuming most of the space.
  • LUKS encryption for openSUSE without LVM.
  • ext4's new fancy fast_commit feature in use during filesystem creation.
  • As a result of all that, I ended up juggling back and forth installation screens a couple of times (even more than shown below, and also because I forgot I wanted to use encryption the first time around).
First boots to pre-installed Ubuntu and installation of openSUSE Tumbleweed as the dual-boot option:
(if the embedded video is not shown, use a direct link)
Some notes from the openSUSE installation:
  • The openSUSE installer's partition editor apparently does not support resizing or automatically installing side-by-side with another Linux distribution, so I did part of the setup completely on my own.
  • The installation package download hung a couple of times, and only passed when I entered a mirror manually. On my TW I've also noticed download problems recently; there might be a problem with some mirror that I need to escalate.
  • The installer doesn't very clearly show the encryption status of the target installation - it took me a couple of attempts before I even noticed the small encrypted column and icon (well, very small, see below), which also did not spell out the device mapper name but only the main partition name. In the end it was going to do the right thing right away and use my pre-created encrypted target partition as I wanted, but it could be a better UX. Then again I was doing my very own tweaks anyway.
  • Let's not go into the details of why I'm so old-fashioned and use ext4 :)
  • openSUSE's installer does not work well with a HiDPI screen. Funnily, the tty consoles seem to be fine and use a big font.
  • At the end of the video I install the two GNOME extensions I can't live without, Dash to Dock and Sound Input & Output Device Chooser.

3 January 2022

Russ Allbery: Review: Crashed

Review: Crashed, by Adam Tooze
Publisher: Penguin Books
Copyright: 2018
Printing: 2019
ISBN: 0-525-55880-2
Format: Kindle
Pages: 615
The histories of the 2008 financial crisis that I have read focus almost exclusively on the United States. They also stop after the bank rescue and TARP or, if they press on into the aftermath, focus on the resulting damage to the US economy and the widespread pain of falling housing prices and foreclosure. Crashed does neither, instead arguing that 2008 was a crisis of European banks as much as American banks. It extends its history to cover the sovereign debt crisis in the eurozone, treating it as a continuation of the same crisis in a different guise. In the process, Tooze makes a compelling argument that one can draw a clear, if wandering, line from the moral revulsion at the propping up of the international banking system to Brexit and Trump. Qualifications first, since they are important for this type of comprehensive and, in places, surprising and counterintuitive history. Adam Tooze is Kathryn and Shelby Cullom Davis Professor of History at Columbia University and the director of its European Institute. His previous books have won multiple awards, and Crashed won the Lionel Gelber Prize for non-fiction on foreign policy. That it won a prize in that topic, rather than history or economics, is a hint at Tooze's chosen lens. The first half of the book is the lead-up and response to the crisis provoked by the collapse in value of securitized US mortgages and leading to the failure of Lehman Brothers, the failure in all but name of AIG, and a massive bank rescue. The financial instruments at the center of the crisis are complex and difficult to understand, and Tooze provides only brief explanation. This therefore may not be the best first book on the crisis; for that, I would still recommend Bethany McLean and Joe Nocera's All the Devils Are Here, although it's hard to beat Michael Lewis's storytelling in The Big Short. Tooze is not interested in dwelling on a blow-by-blow account of the crisis and initial response, and some of his account feels perfunctory. 
He is instead interested in describing its entangled global sweep. The new detail I took from the first half of Crashed is the depth of involvement of the European banks in what is often portrayed as a US crisis. Tooze goes into more specifics than other accounts on the eurodollar market, run primarily through the City of London, and the vast dollar-denominated liabilities of European banks. When the crisis struck, the breakdown of liquidity markets left those banks with no source of dollar funding to repay dollar-denominated short-term loans. The scale of dollar borrowing by European banks was vast, dwarfing the currency reserves or trade surpluses of their home countries. An estimate from the Bank for International Settlements put the total dollar funding needs for European banks at more than $2 trillion. The institution that saved the European banks was the United States Federal Reserve. This was an act of economic self-protection, not largesse; in the absence of dollar liquidity, the fire sale of dollar assets by European banks in a desperate attempt to cover their loans would have exacerbated the market crash. But it's remarkable in its extent, and in how deeply this contradicts the later public political position that 2008 was an American recession caused by American banks. 52% of the mortgage-backed securities purchased by the Federal Reserve in its quantitative easing policies (popularly known as QE1, QE2, and QE3) were sold by foreign banks. Deutsche Bank and Credit Suisse unloaded more securities on the Fed than any American bank by a significant margin. And when that wasn't enough, the Fed went farther and extended swap lines to major national banks, providing them dollar liquidity that they could then pass along to their local institutions. 
In essence, in Tooze's telling, the US Federal Reserve became the reserve bank for the entire world, preventing a currency crisis by providing dollars to financial systems both foreign and domestic, and it did so with a remarkable lack of scrutiny. Its swap lines avoided public review until 2010, when Bloomberg won a court fight to extract the records. That allowed the European banks that benefited to hide the extent of their exposure.
In Europe, the bullish CEOs of Deutsche Bank and Barclays claimed exceptional status because they avoided taking aid from their national governments. What the Fed data reveal is the hollowness of those boasts. The banks might have avoided state-sponsored recapitalization, but every major bank in the entire world was taking liquidity assistance on a grand scale from its local central bank, and either directly or indirectly by way of the swap lines from the Fed.
The emergency steps taken by Timothy Geithner in the Treasury Department were nearly as dramatic as those of the Federal Reserve. Without regard for borders, and pushing the boundary of their legal authority, they intervened massively in the world (not just the US) economy to save the banking and international finance system. And it worked. One of the benefits of a good history is to turn stories about heroes and villains into more nuanced information about motives and philosophies. I came away from Sheila Bair's account of the crisis furious at Geithner's protection of banks from any meaningful consequences for their greed. Tooze's account, and analysis, agrees with Bair in many respects, but Bair was continuing a personal fight and Tooze has more space to put Geithner into context. That context tells an interesting story about the shape of political economics in the 21st century. Tooze identifies Geithner as an institutionalist. His goal was to keep the system running, and he was acutely aware of what would happen if it failed. He therefore focused on the pragmatic and the practical: the financial system was about to collapse, he did whatever was necessary to keep it working, and that effort was successful. Fairness, fault, and morals were treated as irrelevant. This becomes more obvious when contrasted with the eurozone crisis, which started with a Greek debt crisis in the wake of the recession triggered by the 2008 crisis. Greece is tiny by the standards of the European economy, so at first glance there is no obvious reason why its debt crisis should have perturbed the financial system. Under normal circumstances, its lenders should have been able to absorb such relatively modest losses. But the immediate aftermath of the 2008 crisis was not normal circumstances, particularly in Europe. The United States had moved aggressively to recapitalize its banks using the threat of compensation caps and government review of their decisions. 
The European Union had not; European countries had done very little, and their banks were still in a fragile state. Worse, the European Central Bank had sent signals that the market interpreted as guaranteeing the safety of all European sovereign debt equally, even though this was explicitly ruled out by the Lisbon Treaty. If Greece defaulted on its debt, not only would that be another shock to already-precarious banks, it would indicate to the market that all European debt was not equal and other countries may also be allowed to default. As the shape of the Greek crisis became clearer, the cost of borrowing for all of the economically weaker European countries began rising towards unsustainable levels. In contrast to the approach taken by the United States government, though, Europe took a moralistic approach to the crisis. Jean-Claude Trichet, then president of the European Central Bank, held the absolute position that defaulting on or renegotiating the Greek debt was unthinkable and would not be permitted, even though there was no realistic possibility that Greece would be able to repay. He also took a conservative hard line on the role of the ECB, arguing that it could not assist in this crisis. (Tooze is absolutely scathing towards Trichet, who comes off in this account as rigidly inflexible, volatile, and completely irrational.) Germany's position, represented by Angela Merkel, was far more realistic: Greece's debt should be renegotiated and the creditors would have to accept losses. This is, in Tooze's account, clearly correct, and indeed is what eventually happened. But the problem with Merkel's position was the potential fallout. The German government was still in denial about the health of its own banks, and political opinion, particularly in Merkel's coalition, was strongly opposed to making German taxpayers responsible for other people's debts. 
Stopping the progression of a Greek default to a loss of confidence in other European countries would require backstopping European sovereign debt, and Merkel was not willing to support this. Tooze is similarly scathing towards Merkel, but I'm not sure it's warranted by his own account. She seemed, even in his account, boxed in by domestic politics and the tight constraints of the European political structure. Regardless, even after Trichet's term ended and he was replaced by the far more pragmatic Mario Draghi, Germany and Merkel continued to block effective action to relieve Greece's debt burden. As a result, the crisis lurched from inadequate stopgap to inadequate stopgap, forcing crippling austerity, deep depressions, and continued market instability while pretending unsustainable debt would magically become payable through sufficient tax increases and spending cuts. US officials such as Geithner, who put morals and arguably legality aside to do whatever was needed to save the system, were aghast. One takeaway from this is that expansionary austerity is the single worst macroeconomic idea that anyone has ever had.
In the summer of 2012 [the IMF's] staff revisited the forecasts they had made in the spring of 2010 as the eurozone crisis began and discovered that they had systematically underestimated the negative impact of budget cuts. Whereas they had started the crisis believing that the multiplier was on average around 0.5, they now concluded that from 2010 forward it had been in excess of 1. This meant that cutting government spending by 1 euro, as the austerity programs demanded, would reduce economic activity by more than 1 euro. So the share of the state in economic activity actually increased rather than decreased, as the programs presupposed. It was a staggering admission. Bad economics and faulty empirical assumptions had led the IMF to advocate a policy that destroyed the economic prospects for a generation of young people in Southern Europe.
Another takeaway, though, is central to Tooze's point in the final section of the book: the institutionalists in the United States won the war on financial collapse via massive state interventions to support banks and the financial system, a model that Europe grudgingly had to follow when attempting to reject it caused vast suffering while still failing to stabilize the financial system. But both did so via actions that were profoundly and obviously unfair, and only questionably legal. Bankers suffered few consequences for their greed and systematic mismanagement, taking home their normal round of bonuses while millions of people lost their homes and unemployment rates for young men in some European countries exceeded 50%. In Europe, the troika's political pressure against Greece and Italy was profoundly anti-democratic. The financial elite achieved their goal of saving the financial system. It could have failed, that failure would have been catastrophic, and their actions are defensible on pragmatic grounds. But they completely abandoned the moral high ground in the process. The political forces opposed to centrist neoliberalism attempted to step into that moral gap. On the Left, that came in the form of mass protest movements, Occupy Wall Street, Bernie Sanders, and parties such as Syriza in Greece. The Left, broadly, took the moral side of debtors, holding that the primary pain of the crisis should instead be borne by the wealthy creditors who were more able to absorb it. The Right by contrast, in the form of the Tea Party movement inside the Republican Party in the United States and the nationalist parties in Europe, broadly blamed debtors for taking on excessive debt and focused their opposition on use of taxpayer dollars to bail out investment banks and other institutions of the rich. 
Tooze correctly points out that the Right's embrace of racist nationalism and incoherent demagoguery obscures the fact that their criticism of the elite center has real merit and is partly shared by the Left. As Tooze sketches out, the elite centrist consensus held in most of Europe, beating back challenges from both the Left and the Right, although it faltered in the UK, Poland, and Hungary. In the United States, the Democratic Party similarly solidified around neoliberalism and saw off its challenges from the Left. The Republican Party, however, essentially abandoned the centrist position, embracing the Right. That left the Democratic Party as the sole remaining neoliberal institutionalist party, supplemented by a handful of embattled Republican centrists. Wall Street and its money swung to the Democratic Party, but it was deeply unpopular on both the Left and the Right and this shift may have hurt them more than helped. The Democrats, by not abandoning the center, bore the brunt of the residual anger over the bank bailout and subsequent deep recession. Tooze sees in that part of the explanation for Trump's electoral victory over Hillary Clinton. This review is already much too long, and I haven't even mentioned Tooze's clear explanation of the centrality of treasury bonds to world finances, or his discussions of Russia and Ukraine, China, or Brexit, all of which I thought were excellent. This is not only a comprehensive history of both of the crises and international politics of the time period. It is also a thought-provoking look at how drastic of interventions are required to keep the supposed free market working, who is left to suffer after those interventions, and the political consequences of the choice to prioritize the stability of a deeply inequitable and unsafe financial system. 
At least in the United States, there is now a major political party that is likely to oppose even mundane international financial institutions, let alone another major intervention. The neoliberal center is profoundly weakened. But nothing has been done to untangle the international financial system, and little has been done to reduce its risk. The world will go into the next financial challenge still suffering from a legitimacy crisis. Given the miserly, condescending, and dismissive treatment of the suffering general populace after moving heaven and earth to save the banking system, that legitimacy crisis is arguably justified, but an uncontrolled crash of the financial system is not likely to be any kinder to the average citizen than it is to the investment bankers. Crashed is not the best-written book at a sentence-by-sentence level. Tooze's prose is choppy and a bit awkward, and his paragraphs occasionally wander away from a clear point. But the content is excellent and thought-provoking, filling in large sections of the crisis picture that I had not previously been aware of and making a persuasive argument for its continuing effects on current politics. Recommended if you're not tired of reading about financial crises. Rating: 8 out of 10

4 December 2021

Jonathan Dowland: Haskell mortgage calculator

A few months ago I was trying to compare two mortgage offers, and ended up writing a small mortgage calculator to help me. Both mortgages were fixed-term for the same time period (5 years). One of the mortgages had a lower rate than the other, but much higher arrangement fees. A broker recommended the mortgage with the higher rate but lower fee, on an affordability basis for the fixed term: overall, we would spend less money within the fixed term on that deal than the other. (I thought) this left one bit of information missing: what remaining balance would there be at the end of the term? The mortgages I want to model are defined in terms of a monthly repayment figure and an annual interest rate for the fixed period. I think interest is usually recalculated on a daily basis, so I convert the annual rate down to a daily rate. Repayments only happen once a month. Months are not all the same size. Using mod 30 on the 'day' approximates a monthly payment. Over 5 years, there would be 60 months, meaning 60 repayments. (I'm ignoring leap years.)
 > length . filter id . take (5*365) $ [ x `mod` 30 == 0 | x <- [1..]]
Here's what I came up with. I was a little concerned the repayment approximation was too far out so I compared the output with a more precise (but boring) spreadsheet and they agreed to within an acceptable tolerance. The numbers that follow are all made up to illustrate the function and don't reflect my actual mortgage. :)
borrowed = 1000000 -- day 0 amount outstanding
aer   = 0.89       -- annual interest rate
repay = 1000       -- monthly repayment
der   = aer / 365  -- daily interest rate
owed n | n == 0          = borrowed
       | n `mod` 30 == 0 = last + interest - repay
       | otherwise       = last + interest
    where
        last     = owed (n - 1)
        interest = last * der
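
To answer the original question (which offer leaves the smaller balance at the end of the term, and what each costs within the term), the same recurrence can be wrapped in a small comparison. What follows is only a sketch under assumptions: the Offer record, the names balanceAfter and paidWithin, and every figure in offerA and offerB are hypothetical and made up for illustration, and the 30-day "month" approximation from above is kept as-is.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical record describing one fixed-term offer (not from the original post).
data Offer = Offer
  { label     :: String
  , principal :: Double  -- amount outstanding on day 0
  , annual    :: Double  -- annual interest rate as a fraction (0.02 means 2%)
  , monthly   :: Double  -- repayment made on every 30th day
  , fee       :: Double  -- one-off arrangement fee
  }

-- Balance after n days: interest accrues daily on the previous balance,
-- and a repayment is subtracted on every 30th day.
balanceAfter :: Offer -> Int -> Double
balanceAfter o n = foldl' step (principal o) [1 .. n]
  where
    daily = annual o / 365
    step bal day
      | day `mod` 30 == 0 = bal + bal * daily - monthly o
      | otherwise         = bal + bal * daily

-- Money handed over within the first n days: the fee plus all repayments.
paidWithin :: Offer -> Int -> Double
paidWithin o n = fee o + monthly o * fromIntegral (n `div` 30)

-- Made-up example offers: same loan and repayment, different rate and fee.
offerA, offerB :: Offer
offerA = Offer "lower rate, higher fee" 200000 0.015 900 2000
offerB = Offer "higher rate, lower fee" 200000 0.019 900 0

main :: IO ()
main = mapM_ report [offerA, offerB]
  where
    term = 5 * 365  -- five years, ignoring leap years
    report o = putStrLn $
      label o ++ ": paid " ++ show (paidWithin o term)
              ++ ", still owed " ++ show (balanceAfter o term)
```

Using a strict left fold instead of the recursive definition gives the same day-by-day recurrence while keeping memory use flat; comparing "paid" plus "still owed" across the two offers is one way to weigh the fee against the rate.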

1 December 2021

Utkarsh Gupta: FOSS Activities in December 2021

Here's my (twenty-sixth) monthly but brief update about the activities I've done in the F/L/OSS world.

This was my 35th month of actively contributing to Debian. I became a DM in late March 2019 and a DD on Christmas '19! \o/ Just churning through the backlog again this month. Ugh. Anyway, I did the following stuff in Debian:

Uploads and bug fixes:
  • rails (2: - No-change rebuild for unstable.

Other $things:
  • Mentoring for newcomers.
  • Moderation of -project mailing list.

This was my 10th month of actively contributing to Ubuntu. Now that I've joined Canonical to work on Ubuntu full-time, there's a bunch of things I do! \o/ I mostly worked on different things, I guess. I was too lazy to maintain a list of things I worked on so there's no concrete list atm. Maybe I'll get back to this section later or will start to list stuff from next year onward, as I was doing before. :D

Debian (E)LTS
Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. And Debian Extended LTS (ELTS) is its sister project, extending support to the Jessie release (+2 years after LTS support). This was my twenty-sixth month as a Debian LTS and seventeenth month as a Debian ELTS paid contributor.
I was assigned 30.00 hours for LTS and 45.00 hours for ELTS and worked on the following things:

LTS CVE Fixes and Announcements:
  • Issued DLA 2813-1, fixing CVE-2021-33829 and CVE-2021-37695, for ckeditor.
    For Debian 9 stretch, these problems have been fixed in version 4.5.7+dfsg-2+deb9u1.
  • Issued DLA 2817-1, fixing CVE-2021-23214 and CVE-2021-23222, for postgresql-9.6.
    For Debian 9 stretch, these problems have been fixed in version 9.6.24-0+deb9u1.
  • Issued DLA 2836-1, fixing CVE-2021-43527, for nss.
    For Debian 9 stretch, these problems have been fixed in version 2:3.26.2-1.1+deb9u3.
  • Started working on src:samba for CVE-2020-25717 to CVE-2020-25722 and CVE-2021-23192 for jessie and stretch, both.
    The version difference b/w the suites is a bit too much for the patch(es) to be easily backported. I've talked to Anton to work something out. \o/
  • Found the problem w/ libjdom1-java. Will have to roll the regression upload.
    I've prepared the patch but it needs some testing before it is finally rolled out. Same for jessie.
  • Started working on libgit2.

ELTS CVE Fixes and Announcements:

Other (E)LTS Work:
  • Front-desk duty from 29-11 to 05-12 for both LTS and ELTS.
  • Triaged udisk2, wordpress, samba, gmp, nss, ntfs-3g, and openssh.
  • Auto EOL'ed dwarfutils, radare2, mongodb, linux for jessie.
  • As FD, did a deep dive into the no-pu-update issue. Will write to list shortly.
  • Attended monthly Debian LTS meeting.
  • Answered questions (& discussions) on IRC (#debian-lts and #debian-elts).
  • General and other discussions on LTS private and public mailing list.

Debian LTS Survey I've spent 3 hours on the LTS survey on the following bits:
  • Talking to Laura to revive the old a/c on
  • Setting up stuff there.
  • Discussing the survey questions and other bits w/ Jeremiah.
  • Partly reviewing the questions of the survey.
  • Doing a walkthru of the LimeSurvey instance we have to make sure there are no changes.

Until next time.
:wq for today.

28 November 2021

Joachim Breitner: Zero-downtime upgrades of Internet Computer canisters

TL;DR: Zero-downtime upgrades are possible if you stick to the basic actor model.

Background DFINITY's Internet Computer provides a kind of serverless compute platform, where the services are WebAssembly programs called "canisters". These services run without stopping (or at least that's what it feels like from the service's perspective; this is called "orthogonal persistence"), and process one message after another. Messages not only come from the outside ("ingress" calls), but are also exchanged between canisters. On top of these uni-directional messages, the system provides the concept of "inter-canister calls", which associates a response message with the outgoing message, and guarantees that a response will come. This RPC-like interface allows canister developers to program in the popular async/await model, where these inter-canister calls look almost like normal function calls, and the subsequent code is suspended until the response comes back.

The problem This is all very well, until you try to upgrade your canister, i.e. install new code to fix a bug or add a feature. Because if you used the await pattern, there may still be suspended computations waiting for the response. If you swap out the program now, the code of that suspended computation will no longer be present, and the response cannot be handled! Worse, because of an infelicity with the current system's API, when the response comes back, it may actually corrupt your service's state. That is why upgrading a canister requires stopping it first, which means waiting for all outstanding calls to come back. During this time, your canister is not available for new calls (so there is downtime), and worse, the length of the downtime is at the whims of the canisters you called: they could withhold the response ad infinitum, rendering your canister unupgradeable. Clearly, this is not acceptable for any serious application. In this post, I'll explore some of the ways to mitigate this problem, and how to create canisters that are safely and instantaneously (no downtime) upgradeable.

It's a spectrum Some canisters are trivially upgradeable, for others all hope is lost; it depends on what the canister does and how. As an overview, here is the spectrum:
  1. A canister that never performs inter-canister calls can always be upgraded without stopping.
  2. A canister that only does one-way calls, and does them in a particular way (see below), can always be upgraded without stopping.
  3. A canister that performs calls, and where it is acceptable to simply drop outstanding responses, can always be upgraded without stopping, once the System API has been improved and your Canister Development Kit (CDK; Motoko or Rust) has adapted.
  4. A canister that performs calls, but uses explicit continuations to handle responses instead of the await convenience, based on an eventually fixed System API, can be upgraded without stopping, and will even handle responses afterwards.
  5. A canister that uses await to do inter-canister calls cannot be upgraded without stopping.
In this post I will explain variant 2, which is possible now, in more detail. Variants 3 and 4 only become reality if and when the System API has improved.

One-way calls A one-way call is a call where you don't care about the response; neither the replied data, nor possible failure conditions. Since you don't care about the response, you can pass an invalid continuation to the system (technical detail: a Wasm table index of -1). Because it is invalid for any (realistic) Wasm module, it will stay invalid even after an upgrade, and the problem of silent corruption mentioned above is avoided. And otherwise it's fine for this to be invalid: it means the canister traps once the response comes back, which is harmless (and possibly even cheaper than a do-nothing computation). This requires your CDK to support this kind of call. Mostly incidentally, Motoko (and Candid) actually have the concept of one-way calls in their type system, namely shared functions with return type () instead of async ... (Motoko is actually older than the system, and not every prediction about what the system will provide has proven successful). So, once this PR is released, Motoko will implement one-way calls in this way. In Rust, you have to use the System API directly or wait for cdk-rs to provide this ability (patches welcome, happy to advise). You might wonder: How are calls useful if I don't get to look at the response? Of course, this is a setback: calls with responses are useful, and await is convenient. And if you have to integrate with an existing service that only provides normal calls, you are out of luck. But if you get to design the canister and all called canisters together, it may be possible to use only one-way messages. You'd be programming in the plain actor model now, with all its advantages (simple concurrency, easy to upgrade, general robustness). Consider for example a token ledger canister, not unlike the ICP ledger canister. For the most part, it doesn't have to do any outgoing calls (and is thus trivially upgradeable). 
But say we need to add notify functionality, where the ledger canister tells other canisters about a transaction. This is a good example of a one-way call: Maybe the ledger canister doesn't care if that notification was received? The ICP ledger does care (once it comes back successful, this particular notification cannot be sent again), but maybe your ledger can do it differently: let the other canister confirm the receipt via another one-way call, instead of via the reply; or simply charge for each notification and do not worry about repeated notifications. Maybe you want to add archiving functionality, where the ledger canister streams its data to an archive canister. There, again, instead of using successful responses to confirm receipt, the archive canister can ping the ledger canister with the latest received index directly. Yes, it changes the programming model a bit, and all involved parties have to play together, but the gain (zero-downtime upgrades) is quite valuable, and removes a fair number of other sources of issues.
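To make the archiving idea concrete, here is a rough Python simulation of a ledger and an archive that talk only through one-way messages. It is a toy model and not Internet Computer code: the `Network`, `Ledger` and `Archive` classes and the `store`/`ack` method names are all made up for illustration. The point is only that the ledger never keeps a continuation around; confirmation arrives as a separate one-way `ack` message.

```python
from collections import deque


class Network:
    """Toy message queue standing in for one-way inter-canister messages."""

    def __init__(self):
        self.queue = deque()
        self.actors = {}

    def register(self, name, actor):
        self.actors[name] = actor
        actor.net = self

    def send(self, to, method, *args):
        # Fire and forget: the sender keeps no continuation for a reply.
        self.queue.append((to, method, args))

    def run(self):
        # Deliver queued messages one after another until none are left.
        while self.queue:
            to, method, args = self.queue.popleft()
            getattr(self.actors[to], method)(*args)


class Ledger:
    def __init__(self):
        self.entries = []
        self.archived_upto = 0  # number of entries the archive has confirmed

    def record(self, entry):
        self.entries.append(entry)
        self.net.send("archive", "store", len(self.entries) - 1, entry)

    def ack(self, index):
        # One-way ping from the archive; this takes the place of a reply.
        self.archived_upto = max(self.archived_upto, index + 1)


class Archive:
    def __init__(self):
        self.stored = []

    def store(self, index, entry):
        if index == len(self.stored):  # accept only the next expected entry
            self.stored.append(entry)
        if self.stored:
            # Tell the ledger the latest index we hold, again one-way.
            self.net.send("ledger", "ack", len(self.stored) - 1)


net = Network()
ledger, archive = Ledger(), Archive()
net.register("ledger", ledger)
net.register("archive", archive)
ledger.record("alice -> bob: 5")
ledger.record("bob -> carol: 2")
net.run()
print(ledger.archived_upto, archive.stored)
```

Because the ledger's state between messages is plain data (`entries` and `archived_upto`), nothing in this model would be lost if the ledger's code were swapped out between any two messages.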

And in the future? The above is possible with today's Internet Computer. If the System API gets improved the way I hope it will, you have a possible middle ground: You still don't get to use await and instead have to write your response handlers as separate functions, but this way you can call any canister again, and you get the system's assistance in mapping responses to calls. With this in place, any canister can be rewritten to a form that supports zero-downtime upgrades, without affecting its interface or what the canister can do.
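As a loose illustration of why separate handler functions help, here is a Python sketch in which outstanding calls are remembered by handler name rather than as suspended code. Everything here is hypothetical (`CanisterV1`, `CanisterV2`, `call`, `deliver_response`, `on_balance`), and it does not model any real or proposed System API; in particular, the sketch does the response-to-call bookkeeping itself, whereas the hope expressed above is that the system would assist with that.

```python
class CanisterV1:
    def __init__(self, state=None):
        # State survives upgrades; it holds only plain data, never closures.
        self.state = state or {"pending": {}, "next_id": 0, "log": []}

    def call(self, callee, arg, handler_name):
        # Record which named handler should receive the response.
        # (The actual sending of `arg` to `callee` is left out of this sketch.)
        call_id = self.state["next_id"]
        self.state["next_id"] += 1
        self.state["pending"][call_id] = handler_name
        return call_id

    def deliver_response(self, call_id, response):
        # Look the handler up by name in whatever code is installed now.
        handler_name = self.state["pending"].pop(call_id)
        getattr(self, handler_name)(response)

    def on_balance(self, response):
        self.state["log"].append(f"v1 saw balance {response}")


class CanisterV2(CanisterV1):
    # "Upgraded" code: same handler name, new behaviour.
    def on_balance(self, response):
        self.state["log"].append(f"v2 saw balance {response}")


old = CanisterV1()
call_id = old.call("ledger", "balance_of alice", "on_balance")
new = CanisterV2(old.state)  # upgrade while the call is still outstanding
new.deliver_response(call_id, 42)
print(new.state["log"])
```

The response that was requested by the old code is handled by the new code, because the only thing that had to survive the upgrade was a name and some data.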

20 November 2021

Jonathan Dowland: hledger footguns

I wrote in budgeting tools that I was taking a look at Plain Text Accounting and in particular, hledger. The jury's still out on the tools, but in the time I've been looking at them I've come across a couple of foot-guns I thought were worth writing down. hledger's ledger format is derived from that of its predecessor ledger, and so some of the problems might be inherited. 1. significant white space delimiters The basic syntax for a transaction looks like this:
2020-03-15 client payment
    assets:checking         $ 2000
    income:consulting       $-2000
There are some significant white space delimiters in play. The most subtle is what separates the account names from the values: it is two or more spaces. With only a single space, the value is treated as part of the account name. For some reason I hit this frequently when trying to encode opening balances: the account name used as the source of the initial balances is something not otherwise generally referred to again (something like equity:opening balances) and the transaction amount is inferred where possible, so I ended up with a bunch of accounts named equity:opening balances 100 and similar. 2. flexible decimal delimiter The value of transactions can be interspersed with commas and periods to make it more readable: e.g. $2000 could be written as $2,000. Different locales have different conventions here: It seems some(/most/all?) of Europe uses periods to separate out the units and a comma to delimit the fractional part, whereas the US and the UK do the opposite. There is no built-in association between the currency symbol you are using and the period/comma convention: it's quite possible to accidentally write a number which is interpreted differently to how you intended, and it doesn't matter whether you are using $ or some other currency symbol. 3. new syntax has unexpected results in old versions Finally, my favourite. hledger has a notion of rules that can be used to match transactions when importing from CSV. The format looks like this:
if (match rule)
& (another rule)
account1 some:account:from
account2 some:account:to
By default, multiple rules in sequence like above are OR'd: any of them can match. The & prefix switches the behaviour to AND. But, & is a relatively new addition: it's not supported in 1.18.1, the version in Debian stable, which upstream released in June 2020. In prior versions the & prefix is not a syntax error, or at least, not one that's reported: it's silently ignored; meaning, the line with the & does nothing, and any of the other rules in the set will match. This is easy to miss, and means imports could be incorrectly posted.
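The difference between the two behaviours can be sketched in a few lines of Python. This is a simplified model of the matching described above and not hledger's actual implementation: the function `block_matches`, its `supports_and` flag, the sample patterns and the sample CSV record are all invented for illustration, and case-insensitive regular expression matching against the whole record is an assumption of the sketch.

```python
import re


def block_matches(patterns, record, supports_and=True):
    """Evaluate the pattern lines of one `if` block against a CSV record."""
    result = None
    for line in patterns:
        is_and = line.startswith("&")
        if is_and and not supports_and:
            # Older behaviour described above: the line silently does nothing.
            continue
        pattern = line.lstrip("& ").strip()
        hit = re.search(pattern, record, re.IGNORECASE) is not None
        if result is None:
            result = hit
        elif is_and:
            result = result and hit  # "&" prefix: both must match
        else:
            result = result or hit   # default: any line may match
    return bool(result)


patterns = ["amazon", "& refund"]
record = "2021-11-02,AMAZON MARKETPLACE,-19.99"

print(block_matches(patterns, record))                      # with "&" support
print(block_matches(patterns, record, supports_and=False))  # "&" ignored
```

With `&` honoured, the record is rejected because it does not mention a refund; with the `&` line ignored, the same record matches on the first pattern alone and would be posted to the accounts in the rule.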