Search Results: "bma"

19 December 2023

Fran ois Marier: Filtering your own spam using SpamAssassin

I know that people rave about GMail's spam filtering, but it didn't work for me: I was seeing too many false positives. I personally prefer to see some false negatives (i.e. letting some spam through), but to reduce false positives as much as possible (and ideally have a way to tune this). Here's the local SpamAssassin setup I have put together over many years. In addition to the parts I describe here, I also turn off greylisting on my email provider (KolabNow) because I don't want to have to wait for up to 10 minutes for a "2FA" email to go through. This setup assumes that you download all of your emails to your local machine. I use fetchmail for this, though similar tools should work too.

Three tiers of emails The main reason my setup works for me, despite my receiving hundreds of spam messages every day, is that I split incoming emails into three tiers via procmail:
  1. not spam: delivered to inbox
  2. likely spam: quarantined in a soft_spam/ folder
  3. definitely spam: silently deleted
I only ever have to review the likely spam tier for false positives, which is on the order of 10-30 spam emails a day. I never even see the the hundreds that are silently deleted due to a very high score. This is implemented based on a threshold in my .procmailrc:
# Use spamassassin to check for spam
:0fw: .spamassassin.lock
  /usr/bin/spamassassin
# Throw away messages with a score of > 12.0
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*
/dev/null
:0:
* ^X-Spam-Status: Yes
$HOME/Mail/soft_spam/
# Deliver all other messages
:0:
$ DEFAULT 
I also use the following ~/.muttrc configuration to easily report false negatives/positives and examine my likely spam folder via a shortcut in mutt:
unignore X-Spam-Level
unignore X-Spam-Status
macro index S "c=soft_spam/\n" "Switch to soft_spam"
# Tell mutt about SpamAssassin headers so that I can sort by spam score
spam "X-Spam-Status: (Yes No), (hits score)=(-?[0-9]+\.[0-9])" "%3"
folder-hook =soft_spam 'push ol'
folder-hook =spam 'push ou'
# <Esc>d = de-register as non-spam, register as spam, move to spam folder.
macro index \ed "<enter-command>unset wait_key\n<pipe-entry>spamassassin -r\n<enter-command>set wait_key\n<save-message>=spam\n" "report the message as spam"
# <Esc>u = unregister as spam, register as non-spam, move to inbox folder.
macro index \eu "<enter-command>unset wait_key\n<pipe-entry>spamassassin -k\n<enter-command>set wait_key\n<save-message>=inbox\n" "correct the false positive (this is not spam)"

Custom SpamAssassin rules In addition to the default ruleset that comes with SpamAssassin, I've also accrued a number of custom rules over the years. The first set comes from the (now defunct) SpamAssassin Rules Emporium. The second set is the one that backs bugs.debian.org and lists.debian.org. Note this second one includes archived copies of some of the SARE rules and so I only use some of the rules in the common/ directory. Finally, I wrote a few custom rules of my own based on specific kinds of emails I have seen slip through the cracks. I haven't written any of those in a long time and I suspect some of my rules are now obsolete. You may want to do your own testing before you copy these outright. In addition to rules to match more spam, I've also written a ruleset to remove false positives in French emails coming from many of the above custom rules. I also wrote a rule to get a bonus to any email that comes with a patch:
describe FM_PATCH   Includes a patch
body FM_PATCH   /\bdiff -pruN\b/
score FM_PATCH  -1.0
since it's not very common in spam emails :)

SpamAssassin settings When it comes to my system-wide SpamAssassin configuration in /etc/spamassassin/, I enable the following plugins:
loadplugin Mail::SpamAssassin::Plugin::AntiVirus
loadplugin Mail::SpamAssassin::Plugin::AskDNS
loadplugin Mail::SpamAssassin::Plugin::ASN
loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold
loadplugin Mail::SpamAssassin::Plugin::Bayes
loadplugin Mail::SpamAssassin::Plugin::BodyEval
loadplugin Mail::SpamAssassin::Plugin::Check
loadplugin Mail::SpamAssassin::Plugin::DKIM
loadplugin Mail::SpamAssassin::Plugin::DNSEval
loadplugin Mail::SpamAssassin::Plugin::FreeMail
loadplugin Mail::SpamAssassin::Plugin::FromNameSpoof
loadplugin Mail::SpamAssassin::Plugin::HashBL
loadplugin Mail::SpamAssassin::Plugin::HeaderEval
loadplugin Mail::SpamAssassin::Plugin::HTMLEval
loadplugin Mail::SpamAssassin::Plugin::HTTPSMismatch
loadplugin Mail::SpamAssassin::Plugin::ImageInfo
loadplugin Mail::SpamAssassin::Plugin::MIMEEval
loadplugin Mail::SpamAssassin::Plugin::MIMEHeader
loadplugin Mail::SpamAssassin::Plugin::OLEVBMacro
loadplugin Mail::SpamAssassin::Plugin::PDFInfo
loadplugin Mail::SpamAssassin::Plugin::Phishing
loadplugin Mail::SpamAssassin::Plugin::Pyzor
loadplugin Mail::SpamAssassin::Plugin::Razor2
loadplugin Mail::SpamAssassin::Plugin::RelayEval
loadplugin Mail::SpamAssassin::Plugin::ReplaceTags
loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody
loadplugin Mail::SpamAssassin::Plugin::SpamCop
loadplugin Mail::SpamAssassin::Plugin::TextCat
loadplugin Mail::SpamAssassin::Plugin::TxRep
loadplugin Mail::SpamAssassin::Plugin::URIDetail
loadplugin Mail::SpamAssassin::Plugin::URIEval
loadplugin Mail::SpamAssassin::Plugin::VBounce
loadplugin Mail::SpamAssassin::Plugin::WelcomeListSubject
loadplugin Mail::SpamAssassin::Plugin::WLBLEval
Some of these require extra helper packages or Perl libraries to be installed. See the comments in the relevant *.pre files. My ~/.spamassassin/user_prefs file contains the following configuration:
required_hits   5
ok_locales en fr
# Bayes options
score BAYES_00 -4.0
score BAYES_40 -0.5
score BAYES_60 1.0
score BAYES_80 2.7
score BAYES_95 4.0
score BAYES_99 6.0
bayes_auto_learn 1
bayes_ignore_header X-Miltered
bayes_ignore_header X-MIME-Autoconverted
bayes_ignore_header X-Evolution
bayes_ignore_header X-Virus-Scanned
bayes_ignore_header X-Forwarded-For
bayes_ignore_header X-Forwarded-By
bayes_ignore_header X-Scanned-By
bayes_ignore_header X-Spam-Level
bayes_ignore_header X-Spam-Status
as well as manual score reductions due to false positives, and manual score increases to help push certain types of spam emails over the 12.0 definitely spam threshold. Finally, I have the FuzzyOCR package installed since it has occasionally flagged some spam that other tools had missed. It is a little resource intensive though and so you may want to avoid this one if you are filtering spam for other people. As always, feel free to leave a comment if you do something else that works well and that's not included in my setup. This is a work-in-progress.

31 October 2023

Iustin Pop: Raspberry PI OS: upgrading and cross-grading

One of the downsides of running Raspberry PI OS is the fact that - not having the resources of pure Debian - upgrades are not recommended, and cross-grades (migrating between armhf and arm64) is not even mentioned. Is this really true? It is, after all a Debian-based system, so it should in theory be doable. Let s try!

Upgrading The recently announced release based on Debian Bookworm here says:
We have always said that for a major version upgrade, you should re-image your SD card and start again with a clean image. In the past, we have suggested procedures for updating an existing image to the new version, but always with the caveat that we do not recommend it, and you do this at your own risk. This time, because the changes to the underlying architecture are so significant, we are not suggesting any procedure for upgrading a Bullseye image to Bookworm; any attempt to do this will almost certainly end up with a non-booting desktop and data loss. The only way to get Bookworm is either to create an SD card using Raspberry Pi Imager, or to download and flash a Bookworm image from here with your tool of choice.
Which means, it s time to actually try it turns out it s actually trivial, if you use RPIs as headless servers. I had only three issues:
  • if using an initrd, the new initrd-building scripts/hooks are looking for some binaries in /usr/bin, and not in /bin; solution: install manually the usrmerge package, and then re-run dpkg --configure -a;
  • also if using an initrd, the scripts are looking for the kernel config file in /boot/config-$(uname -r), and the raspberry pi kernel package doesn t provide this; workaround: modprobe configs && zcat /proc/config.gz > /boot/config-$(uname -r);
  • and finally, on normal RPI systems, that don t use manual configurations of interfaces in /etc/network/interface, migrating from the previous dhcpcd to NetworkManager will break network connectivity, and require you to log in locally and fix things.
I expect most people to hit only the 3rd, and almost no-one to use initrd on raspberry pi. But, overall, aside from these two issues and a couple of cosmetic ones (login.defs being rewritten from scratch and showing a baffling diff, for example), it was easy. Is it worth doing? Definitely. Had no data loss, and no non-booting system.

Cross-grading (32 bit to 64 bit userland) This one is actually painful. Internet searches go from it s possible, I think to it s definitely not worth trying . Examples: Aside from these, there are a gazillion other posts about switching the kernel to 64 bit. And that s worth doing on its own, but it s only half the way. So, armed with two different systems - a RPI4 4GB and a RPI Zero W2 - I tried to do this. And while it can be done, it takes many hours - first system was about 6 hours, second the same, and a third RPI4 probably took ~3 hours only since I knew the problematic issues. So, what are the steps? Basically:
  • install devscripts, since you will need dget
  • enable new architecture in dpkg: dpkg --add-architecture arm64
  • switch over apt sources to include the 64 bit repos, which are different than the 32 bit ones (Raspberry PI OS did a migration here; normally a single repository has all architectures, of course)
  • downgrade all custom rpi packages/libraries to the standard bookworm/bullseye version, since dpkg won t usually allow a single library package to have different versions (I think it s possible to override, but I didn t bother)
  • install libc for the arm64 arch (this takes some effort, it s actually a set of 3-4 packages)
  • once the above is done, install whiptail:amd64 and rejoice at running a 64-bit binary!
  • then painfully go through sets of packages and migrate the set to arm64:
    • sometimes this work via apt, sometimes you ll need to use dget and dpkg -i
    • make sure you download both the armhf and arm64 versions before doing dpkg -i, since you ll need to rollback some installs
  • at one point, you ll be able to switch over dpkg and apt to arm64, at which point the default architecture flips over; from here, if you ve done it at the right moment, it becomes very easy; you ll probably need an apt install --fix-broken, though, at first
  • and then, finish by replacing all packages with arm64 versions
  • and then, dpkg --remove-architecture armhf, reboot, and profit!
But it s tears and blood to get to that point

Pain point 1: RPI custom versions of packages Since the 32bit armhf architecture is a bit weird - having many variations - it turns out that raspberry pi OS has many packages that are very slightly tweaked to disable a compilation flag or work around build/test failures, or whatnot. Since we talk here about 64-bit capable processors, almost none of these are needed, but they do make life harder since the 64 bit version doesn t have those overrides. So what is needed would be to say downgrade all armhf packages to the version in debian upstream repo , but I couldn t find the right apt pinning incantation to do that. So what I did was to remove the 32bit repos, then use apt-show-versions to see which packages have versions that are no longer in any repo, then downgrade them. There s a further, minor, complication that there were about 3-4 packages with same version but different hash (!), which simply needed apt install --reinstall, I think.

Pain point 2: architecture independent packages There is one very big issue with dpkg in all this story, and the one that makes things very problematic: while you can have a library package installed multiple times for different architectures, as the files live in different paths, a non-library package can only be installed once (usually). For binary packages (arch:any), that is fine. But architecture-independent packages (arch:all) are problematic since usually they depend on a binary package, but they always depend on the default architecture version! Hrmm, and I just realise I don t have logs from this, so I m only ~80% confident. But basically:
  • vim-solarized (arch:all) depends on vim (arch:any)
  • if you replace vim armhf with vim arm64, this will break vim-solarized, until the default architecture becomes arm64
So you need to keep track of which packages apt will de-install, for later re-installation. It is possible that Multi-Arch: foreign solves this, per the debian wiki which says:
Note that even though Architecture: all and Multi-Arch: foreign may look like similar concepts, they are not. The former means that the same binary package can be installed on different architectures. Yet, after installation such packages are treated as if they were native architecture (by definition the architecture of the dpkg package) packages. Thus Architecture: all packages cannot satisfy dependencies from other architectures without being marked Multi-Arch foreign.
It also has warnings about how to properly use this. But, in general, not many packages have it, so it is a problem.

Pain point 3: remove + install vs overwrite It seems that depending on how the solver computes a solution, when migrating a package from 32 to 64 bit, it can choose either to:
  • overwrite in place the package (akin to dpkg -i)
  • remove + install later
The former is OK, the later is not. Or, actually, it might be that apt never can do this, for example (edited for brevity):
# apt install systemd:arm64 --no-install-recommends
The following packages will be REMOVED:
  systemd
The following NEW packages will be installed:
  systemd:arm64
0 upgraded, 1 newly installed, 1 to remove and 35 not upgraded.
Do you want to continue? [Y/n] y
dpkg: systemd: dependency problems, but removing anyway as you requested:
 systemd-sysv depends on systemd.
Removing systemd (247.3-7+deb11u2) ...
systemd is the active init system, please switch to another before removing systemd.
dpkg: error processing package systemd (--remove):
 installed systemd package pre-removal script subprocess returned error exit status 1
dpkg: too many errors, stopping
Errors were encountered while processing:
 systemd
Processing was halted because there were too many errors.
But at the same time, overwrite in place is all good - via dpkg -i from /var/cache/apt/archives. In this case it manifested via a prerm script, in other cases is manifests via dependencies that are no longer satisfied for packages that can t be removed, etc. etc. So you will have to resort to dpkg -i a lot.

Pain point 4: lib- packages that are not lib During the whole process, it is very tempting to just go ahead and install the corresponding arm64 package for all armhf lib package, in one go, since these can coexist. Well, this simple plan is complicated by the fact that some packages are named libfoo-bar, but are actual holding (e.g.) the bar binary for the libfoo package. Examples:
  • libmagic-mgc contains /usr/lib/file/magic.mgc, which conflicts between the 32 and 64 bit versions; of course, it s the exact same file, so this should be an arch:all package, but
  • libpam-modules-bin and liblockfile-bin actually contain binaries (per the -bin suffix)
It s possible to work around all this, but it changes a 1 minute:
# apt install $(dpkg -i   grep ^ii   awk ' print $2 ' grep :amrhf sed -e 's/:armhf/:arm64')
into a 10-20 minutes fight with packages (like most other steps).

Is it worth doing? Compared to the simple bullseye bookworm upgrade, I m not sure about this. The result? Yes, definitely, the system feels - weirdly - much more responsive, logged in over SSH. I guess the arm64 base architecture has some more efficient ops than the lowest denominator armhf , so to say (e.g. there was in the 32 bit version some rpi-custom package with string ops), and thus migrating to 64 bit makes more things faster , but this is subjective so it might be actually not true. But from the point of view of the effort? Unless you like to play with dpkg and apt, and understand how these work and break, I d rather say, migrate to ansible and automate the deployment. It s doable, sure, and by the third system, I got this nailed down pretty well, but it was a lot of time spent. The good aspect is that I did 3 migrations:
  • rpi zero w2: bullseye 32 bit to 64 bit, then bullseye to bookworm
  • rpi 4: bullseye to bookworm, then bookworm 32bit to 64 bit
  • same, again, for a more important system
And all three worked well and no data loss. But I m really glad I have this behind me, I probably wouldn t do a fourth system, even if forced And now, waiting for the RPI 5 to be available See you!

1 October 2023

Paul Wise: FLOSS Activities September 2023

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Spam: reported 2 Debian bug reports
  • Debian wiki: RecentChanges for the month
  • Debian BTS usertags: changes for the month
  • Debian screenshots:
    • approved fzf lame lsd termshark vifm
    • rejected orthanc (private data), gpr/orthanc (Windows), qrencode (random QR codes), weboob-qt (chess website)

Administration
  • Debian IRC: fix #debian-pkg-security topic/metadata
  • Debian wiki: unblock IP addresses, approve accounts

Communication

Sponsors The SWH work was sponsored. All other work was done on a volunteer basis.

30 September 2023

Ian Jackson: DKIM: rotate and publish your keys

If you are an email system administrator, you are probably using DKIM to sign your outgoing emails. You should be rotating the key regularly and automatically, and publishing old private keys. I have just released dkim-rotate 1.0; dkim-rotate is a tool to do this key rotation and publication. If you are an email user, your email provider ought to be doing this. If this is not done, your emails are non-repudiable , meaning that if they are leaked, anyone (eg, journalists, haters) can verify that they are authentic, and prove that to others. This is not desirable (for you). Non-repudiation of emails is undesirable This problem was described at some length in Matthew Green s article Ok Google: please publish your DKIM secret keys. Avoiding non-repudiation sounds a bit like lying. After all, I m advising creating a situation where some people can t verify that something is true, even though it is. So I m advocating casting doubt. Crucially, though, it s doubt about facts that ought to be private. When you send an email, that s between you and the recipient. Normally you don t intend for anyone, anywhere, who happens to get a copy, to be able to verify that it was really you that sent it. In practical terms, this verifiability has already been used by journalists to verify stolen emails. Associated Press provide a verification tool. Advice for all email users As a user, you probably don t want your emails to be non-repudiable. (Other people might want to be able to prove you sent some email, but your email system ought to serve your interests, not theirs.) So, your email provider ought to be rotating their DKIM keys, and publishing their old ones. At a rough guess, your provider probably isn t :-(. How to tell by looking at email headers A quick and dirty way to guess is to have a friend look at the email headers of a message you sent. (It is important that the friend uses a different email provider, since often DKIM signatures are not applied within a single email system.) If your friend sees a DKIM-Signature header then the message is DKIM signed. If they don t, then it wasn t. Most email traversing the public internet is DKIM signed nowadays; so if they don t see the header probably they re not looking using the right tools, or they re actually on the same email system as you. In messages signed by a system running dkim-rotate, there will also be a header about the key rotation, to notify potential verifiers of the situation. Other systems that avoid non-repudiation-through-DKIM might do something similar. dkim-rotate s header looks like this:
DKIM-Signature-Warning: NOTE REGARDING DKIM KEY COMPROMISE
 https://www.chiark.greenend.org.uk/dkim-rotate/README.txt
 https://www.chiark.greenend.org.uk/dkim-rotate/ae/aeb689c2066c5b3fee673355309fe1c7.pem
But an email system might do half of the job of dkim-rotate: regularly rotating the key would cause the signatures of old emails to fail to verify, which is a good start. In that case there probably won t be such a header. Testing verification of new and old messages You can also try verifying the signatures. This isn t entirely straightforward, especially if you don t have access to low-level mail tooling. Your friend will need to be able to save emails as raw whole headers and body, un-decoded, un-rendered. If your friend is using a traditional Unix mail program, they should save the message as an mbox file. Otherwise, ProPublica have instructions for attaching and transferring and obtaining the raw email. (Scroll down to How to Check DKIM and ARC .) Checking that recent emails are verifiable Firstly, have your friend test that they can in fact verify a DKIM signature. This will demonstrate that the next test, where the verification is supposed to fail, is working properly and fails for the right reasons. Send your friend a test email now, and have them do this on a Linux system:
    # save the message as test-email.mbox
    apt install libmail-dkim-perl # or equivalent on another distro
    dkimproxy-verify <test-email.mbox
You should see output containing something like this:
    originator address: ijackson@chiark.greenend.org.uk
    signature identity: @chiark.greenend.org.uk
    verify result: pass
    ...
If the output ontains verify result: fail (body has been altered) then probably your friend didn t manage to faithfully save the unalterered raw message. Checking old emails cannot be verified When you both have that working, have your friend find an older email of yours, from (say) month ago. Perform the same steps. Hopefully they will see something like this:
    originator address: ijackson@chiark.greenend.org.uk
    signature identity: @chiark.greenend.org.uk
    verify result: fail (bad RSA signature)
or maybe
    verify result: invalid (public key: not available)
This indicates that this old email can no longer be verified. That s good: it means that anyone who steals a copy, can t verify it either. If it s leaked, the journalist who receives it won t know it s genuine and unmodified; they should then be suspicious. If your friend sees verify result: pass, then they have verified that that old email of yours is genuine. Anyone who had a copy of the mail can do that. This is good for email thieves, but not for you. For email admins: announcing dkim-rotate 1.0 I have been running dkim-rotate 0.4 on my infrastructure, since last August. and I had entirely forgotten about it: it has run flawlessly for a year. I was reminded of the topic by seeing DKIM in other blog posts. Obviously, it is time to decreee that dkim-rotate is 1.0. If you re a mail system administrator, your users are best served if you use something like dkim-rotate. The package is available in Debian stable, and supports Exim out of the box, but other MTAs should be easy to support too, via some simple ad-hoc scripting. Limitation of this approach Even with this key rotation approach, emails remain nonrepudiable for a short period after they re sent - typically, a few days. Someone who obtains a leaked email very promptly, and shows it to the journalist (for example) right away, can still convince the journalist. This is not great, but at least it doesn t apply to the vast bulk of your email archive. There are possible email protocol improvements which might help, but they re quite out of scope for this article.
Edited 2023-10-01 00:20 +01:00 to fix some grammar


comment count unavailable comments

26 September 2023

Ravi Dwivedi: Fixing keymaps in Chromebook Running Debian Bookworm

I recently bought an HP Chromebook from Abhas who had already flashed coreboot in it. I ran a fresh installation of Debian 12 (Bookworm) on it with KDE Plasma. Right after installation, the Wi-Fi and bluetooth were working, but I was facing two issues: So I asked my friend Alper for help on fixing the same as he has some experience with Chromebooks. Thanks a lot Alper for the help. I am documenting our steps here for helping others who are facing this issue. Note: This works in X11. For wayland, the steps might differ. To set system-wide keyboard configuration on Debian systems:
$ sudo dpkg-reconfigure keyboard-configuration
Choose Chromebook as the Keyboard Model . Each DE should default to the system configuration, but might need its own configuration which would similarly be available in their GUI tools. But you can check and set it manually from the command line, for example as in this thread. To check the keyboard model Xorg-based DEs:
$ setxkbmap -print -query   grep model:
model:  pc104
To change it temporarily, until a reboot:
$ setxkbmap -model chromebook
If it s not there in KDE settings that would be a bug, To change it persistently for KDE:
$ cat >>.config/kxkbrc <<EOF
[Layout]
Model=chromebook
EOF
This thread was helpful.

Ravi Dwivedi: Fixing audio and keymaps in Chromebook Running Debian Bookworm

I recently bought an HP Chromebook from Abhas who had already flashed coreboot in it. I ran a fresh installation of Debian 12 (Bookworm) on it with KDE Plasma. Right after installation, the Wi-Fi and bluetooth were working, but I was facing two issues:

Fixing audio I ran the script mentioned here and that fixed the audio. The instructions from that link are:
git clone https://github.com/WeirdTreeThing/chromebook-linux-audio
cd chromebook-linux-audio
./setup-audio

Fixing keyboard I asked my friend Alper for help on fixing the keyboard as he has some experience with Chromebooks. Thanks a lot Alper for the help. I am documenting our steps here for helping others who are facing this issue. Note: This works in X11. For wayland, the steps might differ. To set system-wide keyboard configuration on Debian systems:
$ sudo dpkg-reconfigure keyboard-configuration
Choose Chromebook as the Keyboard Model . Each DE should default to the system configuration, but might need its own configuration which would similarly be available in their GUI tools. But you can check and set it manually from the command line, for example as in this thread. To check the keyboard model Xorg-based DEs:
$ setxkbmap -print -query   grep model:
model:  pc104
To change it temporarily, until a reboot:
$ setxkbmap -model chromebook
If it s not there in KDE settings that would be a bug, To change it persistently for KDE:
$ cat >>.config/kxkbrc <<EOF
[Layout]
Model=chromebook
EOF
This thread was helpful.

30 July 2023

Russell Coker: Links July 2023

Phys.org has an interesting article about finding evidence for nanohertz gravity waves [1]. 1nano-Herz is a wavelength of 31.7 light years! Wired has an interesting story about OpenAI saying that no further advances will be made with larger training models [2]. Bruce Schneier and Nathan Sanders wrote an insightful article about the need for government run GPT type systems [3]. He focuses on the US, but having other countries/groups of countries do it would be good too. We could have a Chinese one, an EU one, etc. I don t think it would necessarily make sense for a small country like Australia to have one but it would make a lot more sense than having nuclear submarines (which are much more expensive). The Roadmap project is a guide for learning new technologies [4]. The content seems quite good. Bigthink has an informative and darkly amusing article Horror stories of cryonics: The gruesome fates of futurists hoping for immortality [5]. From this month in Australia psilocybin (active ingredient in Magic Mushrooms) can be prescribed for depression and MDMA (known as Ecstacy on the streets) can be prescribed for PTSD [6]. That s great news! Slate has an interesting article about the Operation Underground Railroad organisation that purports to help sex trafficed chilren [7]. This is noteworthy now with the controverst over the recent movie about that. Apparently they didn t provide much help for kids after they had been rescued and at least some of the kids were trafficed specifically to fulfill the demand that they created by offering to pay for it. Vigilantes aren t as effective as law enforcement. The ACCC is going to prevent Apple and Google from forcing app developers to give them a share of in-app purchases in Australia [8]. We need this in every country! This site has links to open source versions of proprietary games [9]. Vice has an interesting article about the Hungarian neuroscientist Viktor T th who taught rats to play Doom 2 [10]. The next logical step is to have mini tanks that they can use in real battlefields. Like the Mason s Rats episode of Love Death and Robots on Netflix. Brian Krebs wrote a mind boggling pair of blog posts about the Ashley Adison hack [11]. A Jewish disgruntled ex-employee sending anti-semitic harassment to the Jewish CEO and maybe cooperating with anti-semitic organisations to harass him is one of the people involved, but he killed himself (due to mental health problems) before the hack took place. Long Now has an insightful blog post about digital avatars being used after the death of the people they are based on [12]. Tavis Ormandy s description of the zenbleed bug is interesting [13]. The technique for finding the bug is interesting as well as the information on how the internals of the CPUs in question work. I don t think this means AMD is bad, trying to deliver increasing performance while limited by the laws of physics is difficult and mistakes are sometimes made. Let s hope the microcode updates are well distributed. The Hacktivist documentary about Andrew Bunnie Huang is really good [14]. Bunnie s lecture about supply chain attacks is worth watching [15]. Most descriptions of this issue don t give nearly as much information. However bad you thought this problem was, after you watch this lecture you will realise it s worse than that!

1 July 2023

Paul Wise: FLOSS Activities June 2023

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Spam: reported 29 Debian mailing list posts
  • Debian wiki: RecentChanges for the month
  • Debian BTS usertags: changes for the month
  • Debian screenshots:
    • approved lsd stockfish x86dis
    • rejected libselinux1 (photo), stockfish (mobile app), php-bcmath (selfie), chromium-browser/binutils-alpha-linux-gnu (medical photo), weboob-qt (QR code)

Administration
  • Debian QA services: deploy changes
  • Debian IRC: updated #debian-live admin list
  • Debian mentors: investigate reason for missing package
  • Debian wiki: unblock IP addresses, approve accounts

Communication

Sponsors The sptag work was sponsored. All other work was done on a volunteer basis.

6 June 2023

Russell Coker: PinePhonePro First Impression

Hardware I received my PinePhone Pro [1] on Thursday, it seems in many ways better than the Purism Librem 5 [2] that I have previously written about. The PinePhone is thinner, lighter, and yet has a much longer battery life. A friend described the Librem5 as the CyberTruck phone and not in a good way. In a test I had my PinePhone and my Librem5 fully charged, left them for 4.5 hours without doing anything much with them, and then the PinePhone was at 85% and the Librem5 was at 57%. So the Librem5 will run out of battery after about 10 hours of not being used while a PinePhonePro can be expected to last about 30 hours. The PinePhonePro isn t as good as some of the recent Android phones in this regard but it shows the potential to be quite usable. For this test both phones were connected to a 2.4GHz Wifi network (which uses less power than 5GHz) and doing nothing much with an out of the box configuration. A phone that is checking email, social networking, and a couple of IM services will use the battery faster. But even if the PinePhone has it s battery used twice as fast in a more realistic test that will still be usable. Here are the passmark results from the PinePhone Pro [3] which got a CPU score of 888 compared to 507 for the Librem 5 and 678 for one of the slower laptops I ve used. The results are excluded from the Passmark averages because they identified the CPU as only having 4 cores (expecting just 4*A72) while the PinePhonePro has 6 cores (2*A72+4*A53). This phone definitely has the CPU power for convergence [4]! Default OS By default the PinePhone has a KDE based GUI and the Librem5 has a GNOME based GUI. I don t like any iteration of GNOME (I have tried them all and disliked them all) and I like KDE so I will tend to like anything that is KDE based more than anything GNOME based. But in addition to that the PinePhone has an interface that looks a lot like Android with the three on-screen buttons at the bottom of the display and the way it has the slide up tray for installed apps. Android is the most popular phone OS and looking like the most common option is often a good idea for a new and different product, this seems like an objective criteria to determine that the default GUI on the PinePhone is a better choice (at least for the default). When I first booted it and connected it to Wifi the updates app said that there were 633 updates to apply, but never applied them (I tried clicking on the update button but to no avail) and didn t give any error message. For me not being Debian is enough reason to dislike Manjaro, but if that wasn t enough then the failure to update would be a good start. When I ran pacman in a terminal window it said that each package was corrupt and asked if I wanted to delete it. According to tar tvJf the packages weren t corrupt. After downloading them again it said that they were corrupt again so it seemed that pacman wasn t working correctly. When the screen is locked and a call comes in it gives a window with Accept and Reject buttons but neither of them works. The default country code for Spacebar (the SMS app) is +1 (US) even though I specified Australia on the initial login. It also doesn t get the APN unlike Android phones which seem to have some sort of list of APNs. Upgrading to Debian The Debian Wiki page about Installing on the PinePhone Pro has the basic information [5]. The first thing it covers is installing the TOW boot loader which is already installed by default in recent PinePhones (such as mine). You can recognise that TOW is installed by pressing the volume-up button in the early stages of boot up (described as before and during the second vibration ), then the LED will turn blue and the phone will act as a USB mass storage device which makes it easy to do other install/recovery tasks. The other TOW option is to press volume-down to boot from a MicroSD card (the default is to boot the OS on the eMMC). The images linked from the Debian wiki page are designed to be installed with bmaptool from the bmap-tools Debian package. After installing that package and downloading the pre-built Mobian image I installed it with the command bmaptool copy mobian-pinephonepro-phosh-bookworm-12.0-rc3.img.gz /dev/sdb where /dev/sdb is the device that the USB mapped PinePhone storage was located. That took 6 minutes and then I rebooted my PinePhone into Mobian! Unfortunately the default GUI for Mobian is GNOME/Phosh. Changing it to KDE is my next task.

1 June 2023

Paul Wise: FLOSS Activities May 2023

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration
  • Debian IRC: set topic on new #debian-sa channel
  • Debian wiki: unblock IP addresses, approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The SIMDe, gensim, sptag work was sponsored. All other work was done on a volunteer basis.

6 February 2023

Reproducible Builds: Reproducible Builds in January 2023

Welcome to the first report for 2023 from the Reproducible Builds project! In these reports we try and outline the most important things that we have been up to over the past month, as well as the most important things in/around the community. As a quick recap, the motivation behind the reproducible builds effort is to ensure no malicious flaws can be deliberately introduced during compilation and distribution of the software that we run on our devices. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.


News In a curious turn of events, GitHub first announced this month that the checksums of various Git archives may be subject to change, specifically that because:
the default compression for Git archives has recently changed. As result, archives downloaded from GitHub may have different checksums even though the contents are completely unchanged.
This change (which was brought up on our mailing list last October) would have had quite wide-ranging implications for anyone wishing to validate and verify downloaded archives using cryptographic signatures. However, GitHub reversed this decision, updating their original announcement with a message that We are reverting this change for now. More details to follow. It appears that this was informed in part by an in-depth discussion in the GitHub Community issue tracker.
The Bundesamt f r Sicherheit in der Informationstechnik (BSI) (trans: The Federal Office for Information Security ) is the agency in charge of managing computer and communication security for the German federal government. They recently produced a report that touches on attacks on software supply-chains (Supply-Chain-Angriff). (German PDF)
Contributor Seb35 updated our website to fix broken links to Tails Git repository [ ][ ], and Holger updated a large number of pages around our recent summit in Venice [ ][ ][ ][ ].
Noak J nsson has written an interesting paper entitled The State of Software Diversity in the Software Supply Chain of Ethereum Clients. As the paper outlines:
In this report, the software supply chains of the most popular Ethereum clients are cataloged and analyzed. The dependency graphs of Ethereum clients developed in Go, Rust, and Java, are studied. These client are Geth, Prysm, OpenEthereum, Lighthouse, Besu, and Teku. To do so, their dependency graphs are transformed into a unified format. Quantitative metrics are used to depict the software supply chain of the blockchain. The results show a clear difference in the size of the software supply chain required for the execution layer and consensus layer of Ethereum.

Yongkui Han posted to our mailing list discussing making reproducible builds & GitBOM work together without gitBOM-ID embedding. GitBOM (now renamed to OmniBOR) is a project to enable automatic, verifiable artifact resolution across today s diverse software supply-chains [ ]. In addition, Fabian Keil wrote to us asking whether anyone in the community would be at Chemnitz Linux Days 2023, which is due to take place on 11th and 12th March (event info). Separate to this, Akihiro Suda posted to our mailing list just after the end of the month with a status report of bit-for-bit reproducible Docker/OCI images. As Akihiro mentions in their post, they will be giving a talk at FOSDEM in the Containers devroom titled Bit-for-bit reproducible builds with Dockerfile and that my talk will also mention how to pin the apt/dnf/apk/pacman packages with my repro-get tool.
The extremely popular Signal messenger app added upstream support for the SOURCE_DATE_EPOCH environment variable this month. This means that release tarballs of the Signal desktop client do not embed nondeterministic release information. [ ][ ]

Distribution work

F-Droid & Android There was a very large number of changes in the F-Droid and wider Android ecosystem this month: On January 15th, a blog post entitled Towards a reproducible F-Droid was published on the F-Droid website, outlining the reasons why F-Droid signs published APKs with its own keys and how reproducible builds allow using upstream developers keys instead. In particular:
In response to [ ] criticisms, we started encouraging new apps to enable reproducible builds. It turns out that reproducible builds are not so difficult to achieve for many apps. In the past few months we ve gotten many more reproducible apps in F-Droid than before. Currently we can t highlight which apps are reproducible in the client, so maybe you haven t noticed that there are many new apps signed with upstream developers keys.
(There was a discussion about this post on Hacker News.) In addition:
  • F-Droid added 13 apps published with reproducible builds this month. [ ]
  • FC Stegerman outlined a bug where baseline.profm files are nondeterministic, developed a workaround, and provided all the details required for a fix. As they note, this issue has now been fixed but the fix is not yet part of an official Android Gradle plugin release.
  • GitLab user Parwor discovered that the number of CPU cores can affect the reproducibility of .dex files. [ ]
  • FC Stegerman also announced the 0.2.0 and 0.2.1 releases of reproducible-apk-tools, a suite of tools to help make .apk files reproducible. Several new subcommands and scripts were added, and a number of bugs were fixed as well [ ][ ]. They also updated the F-Droid website to improve the reproducibility-related documentation. [ ][ ]
  • On the F-Droid issue tracker, FC Stegerman discussed reproducible builds with one of the developers of the Threema messenger app and reported that Android SDK build-tools 31.0.0 and 32.0.0 (unlike earlier and later versions) have a zipalign command that produces incorrect padding.
  • A number of bugs related to reproducibility were discovered in Android itself. Firstly, the non-deterministic order of .zip entries in .apk files [ ] and then newline differences between building on Windows versus Linux that can make builds not reproducible as well. [ ] (Note that these links may require a Google account to view.)
  • And just before the end of the month, FC Stegerman started a thread on our mailing list on the topic of hiding data/code in APK embedded signatures which has been made possible by the Android APK Signature Scheme v2/v3. As part of this, they made an Android app that reads the APK Signing block of its own APK and extracts a payload in order to alter its behaviour called sigblock-code-poc.

Debian As mentioned in last month s report, Vagrant Cascadian has been organising a series of online sprints in order to clear the huge backlog of reproducible builds patches submitted by performing NMUs (Non-Maintainer Uploads). During January, a sprint took place on the 10th, resulting in the following uploads: During this sprint, Holger Levsen filed Debian bug #1028615 to request that the tracker.debian.org service display results of reproducible rebuilds, not just reproducible CI results. Elsewhere in Debian, strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, version 1.13.1-1 was uploaded to Debian unstable by Holger Levsen, including a fix by FC Stegerman (obfusk) to update a regular expression for the latest version of file(1) [ ]. (#1028892) Lastly, 65 reviews of Debian packages were added, 21 were updated and 35 were removed this month adding to our knowledge about identified issues.

Other distributions In other distributions:

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes to diffoscope, including preparing and uploading versions 231, 232, 233 and 234 to Debian:
  • No need for from __future__ import print_function import anymore. [ ]
  • Comment and tidy the extras_require.json handling. [ ]
  • Split inline Python code to generate test Recommends into a separate Python script. [ ]
  • Update debian/tests/control after merging support for PyPDF support. [ ]
  • Correctly catch segfaulting cd-iccdump binary. [ ]
  • Drop some old debugging code. [ ]
  • Allow ICC tests to (temporarily) fail. [ ]
In addition, FC Stegerman (obfusk) made a number of changes, including:
  • Updating the test_text_proper_indentation test to support the latest version(s) of file(1). [ ]
  • Use an extras_require.json file to store some build/release metadata, instead of accessing the internet. [ ]
  • Updating an APK-related file(1) regular expression. [ ]
  • On the diffoscope.org website, de-duplicate contributors by e-mail. [ ]
Lastly, Sam James added support for PyPDF version 3 [ ] and Vagrant Cascadian updated a handful of tool references for GNU Guix. [ ][ ]

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In January, the following changes were made by Holger Levsen:
  • Node changes:
  • Debian-related changes:
    • Only keep diffoscope s HTML output (ie. no .json or .txt) for LTS suites and older in order to save diskspace on the Jenkins host. [ ]
    • Re-create pbuilder base less frequently for the stretch, bookworm and experimental suites. [ ]
  • OpenWrt-related changes:
    • Add gcc-multilib to OPENWRT_HOST_PACKAGES and install it on the nodes that need it. [ ]
    • Detect more problems in the health check when failing to build OpenWrt. [ ]
  • Misc changes:
    • Update the chroot-run script to correctly manage /dev and /dev/pts. [ ][ ][ ]
    • Update the Jenkins shell monitor script to collect disk stats less frequently [ ] and to include various directory stats. [ ][ ]
    • Update the real year in the configuration in order to be able to detect whether a node is running in the future or not. [ ]
    • Bump copyright years in the default page footer. [ ]
In addition, Christian Marangi submitted a patch to build OpenWrt packages with the V=s flag to enable debugging. [ ]
If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can get in touch with us via:

2 December 2022

Paul Wise: FLOSS Activities November 2022

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration
  • Debian IRC: setup topic/entrymsg to redirect OFTC #debian-jp to the Debian Japan IRC server, cleaned up #debianppc ChanServ metadata
  • Debian wiki: unblock IP addresses, approve domains, approve accounts

Communication

Sponsors The libemail-outlook-message-perl, purple-discord, sptag, celery work was sponsored. All other work was done on a volunteer basis.

30 November 2022

Bits from Debian: New Debian Developers and Maintainers (September and October 2022)

The following contributors got their Debian Developer accounts in the last two months: The following contributors were added as Debian Maintainers in the last two months: Congratulations!

16 November 2022

Antoine Beaupr : Wayland: i3 to Sway migration

I started migrating my graphical workstations to Wayland, specifically migrating from i3 to Sway. This is mostly to address serious graphics bugs in the latest Framwork laptop, but also something I felt was inevitable. The current status is that I've been able to convert my i3 configuration to Sway, and adapt my systemd startup sequence to the new environment. Screen sharing only works with Pipewire, so I also did that migration, which basically requires an upgrade to Debian bookworm to get a nice enough Pipewire release. I'm testing Wayland on my laptop, but I'm not using it as a daily driver because I first need to upgrade to Debian bookworm on my main workstation. Most irritants have been solved one way or the other. My main problem with Wayland right now is that I spent a frigging week doing the conversion: it's exciting and new, but it basically sucked the life out of all my other projects and it's distracting, and I want it to stop. The rest of this page documents why I made the switch, how it happened, and what's left to do. Hopefully it will keep you from spending as much time as I did in fixing this. TL;DR: Wayland is mostly ready. Main blockers you might find are that you need to do manual configurations, DisplayLink (multiple monitors on a single cable) doesn't work in Sway, HDR and color management are still in development. I had to install the following packages:
apt install \
    brightnessctl \
    foot \
    gammastep \
    gdm3 \
    grim slurp \
    pipewire-pulse \
    sway \
    swayidle \
    swaylock \
    wdisplays \
    wev \
    wireplumber \
    wlr-randr \
    xdg-desktop-portal-wlr
And did some of tweaks in my $HOME, mostly dealing with my esoteric systemd startup sequence, which you won't have to deal with if you are not a fan.

Why switch? I originally held back from migrating to Wayland: it seemed like a complicated endeavor hardly worth the cost. It also didn't seem actually ready. But after reading this blurb on LWN, I decided to at least document the situation here. The actual quote that convinced me it might be worth it was:
It s amazing. I have never experienced gaming on Linux that looked this smooth in my life.
... I'm not a gamer, but I do care about latency. The longer version is worth a read as well. The point here is not to bash one side or the other, or even do a thorough comparison. I start with the premise that Xorg is likely going away in the future and that I will need to adapt some day. In fact, the last major Xorg release (21.1, October 2021) is rumored to be the last ("just like the previous release...", that said, minor releases are still coming out, e.g. 21.1.4). Indeed, it seems even core Xorg people have moved on to developing Wayland, or at least Xwayland, which was spun off it its own source tree. X, or at least Xorg, in in maintenance mode and has been for years. Granted, the X Window System is getting close to forty years old at this point: it got us amazingly far for something that was designed around the time the first graphical interface. Since Mac and (especially?) Windows released theirs, they have rebuilt their graphical backends numerous times, but UNIX derivatives have stuck on Xorg this entire time, which is a testament to the design and reliability of X. (Or our incapacity at developing meaningful architectural change across the entire ecosystem, take your pick I guess.) What pushed me over the edge is that I had some pretty bad driver crashes with Xorg while screen sharing under Firefox, in Debian bookworm (around November 2022). The symptom would be that the UI would completely crash, reverting to a text-only console, while Firefox would keep running, audio and everything still working. People could still see my screen, but I couldn't, of course, let alone interact with it. All processes still running, including Xorg. (And no, sorry, I haven't reported that bug, maybe I should have, and it's actually possible it comes up again in Wayland, of course. But at first, screen sharing didn't work of course, so it's coming a much further way. After making screen sharing work, though, the bug didn't occur again, so I consider this a Xorg-specific problem until further notice.) There were also frustrating glitches in the UI, in general. I actually had to setup a compositor alongside i3 to make things bearable at all. Video playback in a window was laggy, sluggish, and out of sync. Wayland fixed all of this.

Wayland equivalents This section documents each tool I have picked as an alternative to the current Xorg tool I am using for the task at hand. It also touches on other alternatives and how the tool was configured. Note that this list is based on the series of tools I use in desktop. TODO: update desktop with the following when done, possibly moving old configs to a ?xorg archive.

Window manager: i3 sway This seems like kind of a no-brainer. Sway is around, it's feature-complete, and it's in Debian. I'm a bit worried about the "Drew DeVault community", to be honest. There's a certain aggressiveness in the community I don't like so much; at least an open hostility towards more modern UNIX tools like containers and systemd that make it hard to do my work while interacting with that community. I'm also concern about the lack of unit tests and user manual for Sway. The i3 window manager has been designed by a fellow (ex-)Debian developer I have a lot of respect for (Michael Stapelberg), partly because of i3 itself, but also working with him on other projects. Beyond the characters, i3 has a user guide, a code of conduct, and lots more documentation. It has a test suite. Sway has... manual pages, with the homepage just telling users to use man -k sway to find what they need. I don't think we need that kind of elitism in our communities, to put this bluntly. But let's put that aside: Sway is still a no-brainer. It's the easiest thing to migrate to, because it's mostly compatible with i3. I had to immediately fix those resources to get a minimal session going:
i3 Sway note
set_from_resources set no support for X resources, naturally
new_window pixel 1 default_border pixel 1 actually supported in i3 as well
That's it. All of the other changes I had to do (and there were actually a lot) were all Wayland-specific changes, not Sway-specific changes. For example, use brightnessctl instead of xbacklight to change the backlight levels. See a copy of my full sway/config for details. Other options include:
  • dwl: tiling, minimalist, dwm for Wayland, not in Debian
  • Hyprland: tiling, fancy animations, not in Debian
  • Qtile: tiling, extensible, in Python, not in Debian (1015267)
  • river: Zig, stackable, tagging, not in Debian (1006593)
  • velox: inspired by xmonad and dwm, not in Debian
  • vivarium: inspired by xmonad, not in Debian

Status bar: py3status waybar I have invested quite a bit of effort in setting up my status bar with py3status. It supports Sway directly, and did not actually require any change when migrating to Wayland. Unfortunately, I had trouble making nm-applet work. Based on this nm-applet.service, I found that you need to pass --indicator for it to show up at all. In theory, tray icon support was merged in 1.5, but in practice there are still several limitations, like icons not clickable. Also, on startup, nm-applet --indicator triggers this error in the Sway logs:
nov 11 22:34:12 angela sway[298938]: 00:49:42.325 [INFO] [swaybar/tray/host.c:24] Registering Status Notifier Item ':1.47/org/ayatana/NotificationItem/nm_applet'
nov 11 22:34:12 angela sway[298938]: 00:49:42.327 [ERROR] [swaybar/tray/item.c:127] :1.47/org/ayatana/NotificationItem/nm_applet IconPixmap: No such property  IconPixmap 
nov 11 22:34:12 angela sway[298938]: 00:49:42.327 [ERROR] [swaybar/tray/item.c:127] :1.47/org/ayatana/NotificationItem/nm_applet AttentionIconPixmap: No such property  AttentionIconPixmap 
nov 11 22:34:12 angela sway[298938]: 00:49:42.327 [ERROR] [swaybar/tray/item.c:127] :1.47/org/ayatana/NotificationItem/nm_applet ItemIsMenu: No such property  ItemIsMenu 
nov 11 22:36:10 angela sway[313419]: info: fcft.c:838: /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf: size=24.00pt/32px, dpi=96.00
... but that seems innocuous. The tray icon displays but is not clickable. Note that there is currently (November 2022) a pull request to hook up a "Tray D-Bus Menu" which, according to Reddit might fix this, or at least be somewhat relevant. If you don't see the icon, check the bar.tray_output property in the Sway config, try: tray_output *. The non-working tray was the biggest irritant in my migration. I have used nmtui to connect to new Wifi hotspots or change connection settings, but that doesn't support actions like "turn off WiFi". I eventually fixed this by switching from py3status to waybar, which was another yak horde shaving session, but ultimately, it worked.

Web browser: Firefox Firefox has had support for Wayland for a while now, with the team enabling it by default in nightlies around January 2022. It's actually not easy to figure out the state of the port, the meta bug report is still open and it's huge: it currently (Sept 2022) depends on 76 open bugs, it was opened twelve (2010) years ago, and it's still getting daily updates (mostly linking to other tickets). Firefox 106 presumably shipped with "Better screen sharing for Windows and Linux Wayland users", but I couldn't quite figure out what those were. TL;DR: echo MOZ_ENABLE_WAYLAND=1 >> ~/.config/environment.d/firefox.conf && apt install xdg-desktop-portal-wlr

How to enable it Firefox depends on this silly variable to start correctly under Wayland (otherwise it starts inside Xwayland and looks fuzzy and fails to screen share):
MOZ_ENABLE_WAYLAND=1 firefox
To make the change permanent, many recipes recommend adding this to an environment startup script:
if [ "$XDG_SESSION_TYPE" == "wayland" ]; then
    export MOZ_ENABLE_WAYLAND=1
fi
At least that's the theory. In practice, Sway doesn't actually run any startup shell script, so that can't possibly work. Furthermore, XDG_SESSION_TYPE is not actually set when starting Sway from gdm3 which I find really confusing, and I'm not the only one. So the above trick doesn't actually work, even if the environment (XDG_SESSION_TYPE) is set correctly, because we don't have conditionals in environment.d(5). (Note that systemd.environment-generator(7) do support running arbitrary commands to generate environment, but for some some do not support user-specific configuration files... Even then it may be a solution to have a conditional MOZ_ENABLE_WAYLAND environment, but I'm not sure it would work because ordering between those two isn't clear: maybe the XDG_SESSION_TYPE wouldn't be set just yet...) At first, I made this ridiculous script to workaround those issues. Really, it seems to me Firefox should just parse the XDG_SESSION_TYPE variable here... but then I realized that Firefox works fine in Xorg when the MOZ_ENABLE_WAYLAND is set. So now I just set that variable in environment.d and It Just Works :
MOZ_ENABLE_WAYLAND=1

Screen sharing Out of the box, screen sharing doesn't work until you install xdg-desktop-portal-wlr or similar (e.g. xdg-desktop-portal-gnome on GNOME). I had to reboot for the change to take effect. Without those tools, it shows the usual permission prompt with "Use operating system settings" as the only choice, but when we accept... nothing happens. After installing the portals, it actualyl works, and works well! This was tested in Debian bookworm/testing with Firefox ESR 102 and Firefox 106. Major caveat: we can only share a full screen, we can't currently share just a window. The major upside to that is that, by default, it streams only one output which is actually what I want most of the time! See the screencast compatibility for more information on what is supposed to work. This is actually a huge improvement over the situation in Xorg, where Firefox can only share a window or all monitors, which led me to use Chromium a lot for video-conferencing. With this change, in other words, I will not need Chromium for anything anymore, whoohoo! If slurp, wofi, or bemenu are installed, one of them will be used to pick the monitor to share, which effectively acts as some minimal security measure. See xdg-desktop-portal-wlr(1) for how to configure that.

Side note: Chrome fails to share a full screen I was still using Google Chrome (or, more accurately, Debian's Chromium package) for some videoconferencing. It's mainly because Chromium was the only browser which will allow me to share only one of my two monitors, which is extremely useful. To start chrome with the Wayland backend, you need to use:
chromium  -enable-features=UseOzonePlatform -ozone-platform=wayland
If it shows an ugly gray border, check the Use system title bar and borders setting. It can do some screensharing. Sharing a window and a tab seems to work, but sharing a full screen doesn't: it's all black. Maybe not ready for prime time. And since Firefox can do what I need under Wayland now, I will not need to fight with Chromium to work under Wayland:
apt purge chromium
Note that a similar fix was necessary for Signal Desktop, see this commit. Basically you need to figure out a way to pass those same flags to signal:
--enable-features=WaylandWindowDecorations --ozone-platform-hint=auto

Email: notmuch See Emacs, below.

File manager: thunar Unchanged.

News: feed2exec, gnus See Email, above, or Emacs in Editor, below.

Editor: Emacs okay-ish Emacs is being actively ported to Wayland. According to this LWN article, the first (partial, to Cairo) port was done in 2014 and a working port (to GTK3) was completed in 2021, but wasn't merged until late 2021. That is: after Emacs 28 was released (April 2022). So we'll probably need to wait for Emacs 29 to have native Wayland support in Emacs, which, in turn, is unlikely to arrive in time for the Debian bookworm freeze. There are, however, unofficial builds for both Emacs 28 and 29 provided by spwhitton which may provide native Wayland support. I tested the snapshot packages and they do not quite work well enough. First off, they completely take over the builtin Emacs they hijack the $PATH in /etc! and certain things are simply not working in my setup. For example, this hook never gets ran on startup:
(add-hook 'after-init-hook 'server-start t) 
Still, like many X11 applications, Emacs mostly works fine under Xwayland. The clipboard works as expected, for example. Scaling is a bit of an issue: fonts look fuzzy. I have heard anecdotal evidence of hard lockups with Emacs running under Xwayland as well, but haven't experienced any problem so far. I did experience a Wayland crash with the snapshot version however. TODO: look again at Wayland in Emacs 29.

Backups: borg Mostly irrelevant, as I do not use a GUI.

Color theme: srcery, redshift gammastep I am keeping Srcery as a color theme, in general. Redshift is another story: it has no support for Wayland out of the box, but it's apparently possible to apply a hack on the TTY before starting Wayland, with:
redshift -m drm -PO 3000
This tip is from the arch wiki which also has other suggestions for Wayland-based alternatives. Both KDE and GNOME have their own "red shifters", and for wlroots-based compositors, they (currently, Sept. 2022) list the following alternatives: I configured gammastep with a simple gammastep.service file associated with the sway-session.target.

Display manager: lightdm gdm3 Switched because lightdm failed to start sway:
nov 16 16:41:43 angela sway[843121]: 00:00:00.002 [ERROR] [wlr] [libseat] [common/terminal.c:162] Could not open target tty: Permission denied
Possible alternatives:

Terminal: xterm foot One of the biggest question mark in this transition was what to do about Xterm. After writing two articles about terminal emulators as a professional journalist, decades of working on the terminal, and probably using dozens of different terminal emulators, I'm still not happy with any of them. This is such a big topic that I actually have an entire blog post specifically about this. For starters, using xterm under Xwayland works well enough, although the font scaling makes things look a bit too fuzzy. I have also tried foot: it ... just works! Fonts are much crisper than Xterm and Emacs. URLs are not clickable but the URL selector (control-shift-u) is just plain awesome (think "vimperator" for the terminal). There's cool hack to jump between prompts. Copy-paste works. True colors work. The word-wrapping is excellent: it doesn't lose one byte. Emojis are nicely sized and colored. Font resize works. There's even scroll back search (control-shift-r). Foot went from a question mark to being a reason to switch to Wayland, just for this little goodie, which says a lot about the quality of that software. The selection clicks are a not quite what I would expect though. In rxvt and others, you have the following patterns:
  • single click: reset selection, or drag to select
  • double: select word
  • triple: select quotes or line
  • quadruple: select line
I particularly find the "select quotes" bit useful. It seems like foot just supports double and triple clicks, with word and line selected. You can select a rectangle with control,. It correctly extends the selection word-wise with right click if double-click was first used. One major problem with Foot is that it's a new terminal, with its own termcap entry. Support for foot was added to ncurses in the 20210731 release, which was shipped after the current Debian stable release (Debian bullseye, which ships 6.2+20201114-2). A workaround for this problem is to install the foot-terminfo package on the remote host, which is available in Debian stable. This should eventually resolve itself, as Debian bookworm has a newer version. Note that some corrections were also shipped in the 20211113 release, but that is also shipped in Debian bookworm. That said, I am almost certain I will have to revert back to xterm under Xwayland at some point in the future. Back when I was using GNOME Terminal, it would mostly work for everything until I had to use the serial console on a (HP ProCurve) network switch, which have a fancy TUI that was basically unusable there. I fully expect such problems with foot, or any other terminal than xterm, for that matter. The foot wiki has good troubleshooting instructions as well. Update: I did find one tiny thing to improve with foot, and it's the default logging level which I found pretty verbose. After discussing it with the maintainer on IRC, I submitted this patch to tweak it, which I described like this on Mastodon:
today's reason why i will go to hell when i die (TRWIWGTHWID?): a 600-word, 63 lines commit log for a one line change: https://codeberg.org/dnkl/foot/pulls/1215
It's Friday.

Launcher: rofi rofi?? rofi does not support Wayland. There was a rather disgraceful battle in the pull request that led to the creation of a fork (lbonn/rofi), so it's unclear how that will turn out. Given how relatively trivial problem space is, there is of course a profusion of options:
Tool In Debian Notes
alfred yes general launcher/assistant tool
bemenu yes, bookworm+ inspired by dmenu
cerebro no Javascript ... uh... thing
dmenu-wl no fork of dmenu, straight port to Wayland
Fuzzel ITP 982140 dmenu/drun replacement, app icon overlay
gmenu no drun replacement, with app icons
kickoff no dmenu/run replacement, fuzzy search, "snappy", history, copy-paste, Rust
krunner yes KDE's runner
mauncher no dmenu/drun replacement, math
nwg-launchers no dmenu/drun replacement, JSON config, app icons, nwg-shell project
Onagre no rofi/alfred inspired, multiple plugins, Rust
menu no dmenu/drun rewrite
Rofi (lbonn's fork) no see above
sirula no .desktop based app launcher
Ulauncher ITP 949358 generic launcher like Onagre/rofi/alfred, might be overkill
tofi yes, bookworm+ dmenu/drun replacement, C
wmenu no fork of dmenu-wl, but mostly a rewrite
Wofi yes dmenu/drun replacement, not actively maintained
yofi no dmenu/drun replacement, Rust
The above list comes partly from https://arewewaylandyet.com/ and awesome-wayland. It is likely incomplete. I have read some good things about bemenu, fuzzel, and wofi. A particularly tricky option is that my rofi password management depends on xdotool for some operations. At first, I thought this was just going to be (thankfully?) impossible, because we actually like the idea that one app cannot send keystrokes to another. But it seems there are actually alternatives to this, like wtype or ydotool, the latter which requires root access. wl-ime-type does that through the input-method-unstable-v2 protocol (sample emoji picker, but is not packaged in Debian. As it turns out, wtype just works as expected, and fixing this was basically a two-line patch. Another alternative, not in Debian, is wofi-pass. The other problem is that I actually heavily modified rofi. I use "modis" which are not actually implemented in wofi or tofi, so I'm left with reinventing those wheels from scratch or using the rofi + wayland fork... It's really too bad that fork isn't being reintegrated... For now, I'm actually still using rofi under Xwayland. The main downside is that fonts are fuzzy, but it otherwise just works. Note that wlogout could be a partial replacement (just for the "power menu").

Image viewers: geeqie ? I'm not very happy with geeqie in the first place, and I suspect the Wayland switch will just make add impossible things on top of the things I already find irritating (Geeqie doesn't support copy-pasting images). In practice, Geeqie doesn't seem to work so well under Wayland. The fonts are fuzzy and the thumbnail preview just doesn't work anymore (filed as Debian bug 1024092). It seems it also has problems with scaling. Alternatives: See also this list and that list for other list of image viewers, not necessarily ported to Wayland. TODO: pick an alternative to geeqie, nomacs would be gorgeous if it wouldn't be basically abandoned upstream (no release since 2020), has an unpatched CVE-2020-23884 since July 2020, does bad vendoring, and is in bad shape in Debian (4 minor releases behind). So for now I'm still grumpily using Geeqie.

Media player: mpv, gmpc / sublime This is basically unchanged. mpv seems to work fine under Wayland, better than Xorg on my new laptop (as mentioned in the introduction), and that before the version which improves Wayland support significantly, by bringing native Pipewire support and DMA-BUF support. gmpc is more of a problem, mainly because it is abandoned. See 2022-08-22-gmpc-alternatives for the full discussion, one of the alternatives there will likely support Wayland. Finally, I might just switch to sublime-music instead... In any case, not many changes here, thankfully.

Screensaver: xsecurelock swaylock I was previously using xss-lock and xsecurelock as a screensaver, with xscreensaver "hacks" as a backend for xsecurelock. The basic screensaver in Sway seems to be built with swayidle and swaylock. It's interesting because it's the same "split" design as xss-lock and xsecurelock. That, unfortunately, does not include the fancy "hacks" provided by xscreensaver, and that is unlikely to be implemented upstream. Other alternatives include gtklock and waylock (zig), which do not solve that problem either. It looks like swaylock-plugin, a swaylock fork, which at least attempts to solve this problem, although not directly using the real xscreensaver hacks. swaylock-effects is another attempt at this, but it only adds more effects, it doesn't delegate the image display. Other than that, maybe it's time to just let go of those funky animations and just let swaylock do it's thing, which is display a static image or just a black screen, which is fine by me. In the end, I am just using swayidle with a configuration based on the systemd integration wiki page but with additional tweaks from this service, see the resulting swayidle.service file. Interestingly, damjan also has a service for swaylock itself, although it's not clear to me what its purpose is...

Screenshot: maim grim, pubpaste I'm a heavy user of maim (and a package uploader in Debian). It looks like the direct replacement to maim (and slop) is grim (and slurp). There's also swappy which goes on top of grim and allows preview/edit of the resulting image, nice touch (not in Debian though). See also awesome-wayland screenshots for other alternatives: there are many, including X11 tools like Flameshot that also support Wayland. One key problem here was that I have my own screenshot / pastebin software which will needed an update for Wayland as well. That, thankfully, meant actually cleaning up a lot of horrible code that involved calling xterm and xmessage for user interaction. Now, pubpaste uses GTK for prompts and looks much better. (And before anyone freaks out, I already had to use GTK for proper clipboard support, so this isn't much of a stretch...)

Screen recorder: simplescreenrecorder wf-recorder In Xorg, I have used both peek or simplescreenrecorder for screen recordings. The former will work in Wayland, but has no sound support. The latter has a fork with Wayland support but it is limited and buggy ("doesn't support recording area selection and has issues with multiple screens"). It looks like wf-recorder will just do everything correctly out of the box, including audio support (with --audio, duh). It's also packaged in Debian. One has to wonder how this works while keeping the "between app security" that Wayland promises, however... Would installing such a program make my system less secure? Many other options are available, see the awesome Wayland screencasting list.

RSI: workrave nothing? Workrave has no support for Wayland. activity watch is a time tracker alternative, but is not a RSI watcher. KDE has rsiwatcher, but that's a bit too much on the heavy side for my taste. SafeEyes looks like an alternative at first, but it has many issues under Wayland (escape doesn't work, idle doesn't work, it just doesn't work really). timekpr-next could be an alternative as well, and has support for Wayland. I am also considering just abandoning workrave, even if I stick with Xorg, because it apparently introduces significant latency in the input pipeline. And besides, I've developed a pretty unhealthy alert fatigue with Workrave. I have used the program for so long that my fingers know exactly where to click to dismiss those warnings very effectively. It makes my work just more irritating, and doesn't fix the fundamental problem I have with computers.

Other apps This is a constantly changing list, of course. There's a bit of a "death by a thousand cuts" in migrating to Wayland because you realize how many things you were using are tightly bound to X.
  • .Xresources - just say goodbye to that old resource system, it was used, in my case, only for rofi, xterm, and ... Xboard!?
  • keyboard layout switcher: built-in to Sway since 2017 (PR 1505, 1.5rc2+), requires a small configuration change, see this answer as well, looks something like this command:
     swaymsg input 0:0:X11_keyboard xkb_layout de
    
    or using this config:
     input *  
         xkb_layout "ca,us"
         xkb_options "grp:sclk_toggle"
      
    
    That works refreshingly well, even better than in Xorg, I must say. swaykbdd is an alternative that supports per-window layouts (in Debian).
  • wallpaper: currently using feh, will need a replacement, TODO: figure out something that does, like feh, a random shuffle. swaybg just loads a single image, duh. oguri might be a solution, but unmaintained, used here, not in Debian. wallutils is another option, also not in Debian. For now I just don't have a wallpaper, the background is a solid gray, which is better than Xorg's default (which is whatever crap was left around a buffer by the previous collection of programs, basically)
  • notifications: currently using dunst in some places, which works well in both Xorg and Wayland, not a blocker, salut a possible alternative (not in Debian), damjan uses mako. TODO: install dunst everywhere
  • notification area: I had trouble making nm-applet work. based on this nm-applet.service, I found that you need to pass --indicator. In theory, tray icon support was merged in 1.5, but in practice there are still several limitations, like icons not clickable. On startup, nm-applet --indicator triggers this error in the Sway logs:
     nov 11 22:34:12 angela sway[298938]: 00:49:42.325 [INFO] [swaybar/tray/host.c:24] Registering Status Notifier Item ':1.47/org/ayatana/NotificationItem/nm_applet'
     nov 11 22:34:12 angela sway[298938]: 00:49:42.327 [ERROR] [swaybar/tray/item.c:127] :1.47/org/ayatana/NotificationItem/nm_applet IconPixmap: No such property  IconPixmap 
     nov 11 22:34:12 angela sway[298938]: 00:49:42.327 [ERROR] [swaybar/tray/item.c:127] :1.47/org/ayatana/NotificationItem/nm_applet AttentionIconPixmap: No such property  AttentionIconPixmap 
     nov 11 22:34:12 angela sway[298938]: 00:49:42.327 [ERROR] [swaybar/tray/item.c:127] :1.47/org/ayatana/NotificationItem/nm_applet ItemIsMenu: No such property  ItemIsMenu 
     nov 11 22:36:10 angela sway[313419]: info: fcft.c:838: /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf: size=24.00pt/32px, dpi=96.00
    
    ... but it seems innocuous. The tray icon displays but, as stated above, is not clickable. If you don't see the icon, check the bar.tray_output property in the Sway config, try: tray_output *. Note that there is currently (November 2022) a pull request to hook up a "Tray D-Bus Menu" which, according to Reddit might fix this, or at least be somewhat relevant. This was the biggest irritant in my migration. I have used nmtui to connect to new Wifi hotspots or change connection settings, but that doesn't support actions like "turn off WiFi". I eventually fixed this by switching from py3status to waybar.
  • window switcher: in i3 I was using this bespoke i3-focus script, which doesn't work under Sway, swayr an option, not in Debian. So I put together this other bespoke hack from multiple sources, which works.
  • PDF viewer: currently using atril (which supports Wayland), could also just switch to zatura/mupdf permanently, see also calibre for a discussion on document viewers
See also this list of useful addons and this other list for other app alternatives.

More X11 / Wayland equivalents For all the tools above, it's not exactly clear what options exist in Wayland, or when they do, which one should be used. But for some basic tools, it seems the options are actually quite clear. If that's the case, they should be listed here:
X11 Wayland In Debian
arandr wdisplays yes
autorandr kanshi yes
xdotool wtype yes
xev wev yes
xlsclients swaymsg -t get_tree yes
xrandr wlr-randr yes
lswt is a more direct replacement for xlsclients but is not packaged in Debian. See also: Note that arandr and autorandr are not directly part of X. arewewaylandyet.com refers to a few alternatives. We suggest wdisplays and kanshi above (see also this service file) but wallutils can also do the autorandr stuff, apparently, and nwg-displays can do the arandr part. Neither are packaged in Debian yet. So I have tried wdisplays and it Just Works, and well. The UI even looks better and more usable than arandr, so another clean win from Wayland here. TODO: test kanshi as a autorandr replacement

Other issues

systemd integration I've had trouble getting session startup to work. This is partly because I had a kind of funky system to start my session in the first place. I used to have my whole session started from .xsession like this:
#!/bin/sh
. ~/.shenv
systemctl --user import-environment
exec systemctl --user start --wait xsession.target
But obviously, the xsession.target is not started by the Sway session. It seems to just start a default.target, which is really not what we want because we want to associate the services directly with the graphical-session.target, so that they don't start when logging in over (say) SSH. damjan on #debian-systemd showed me his sway-setup which features systemd integration. It involves starting a different session in a completely new .desktop file. That work was submitted upstream but refused on the grounds that "I'd rather not give a preference to any particular init system." Another PR was abandoned because "restarting sway does not makes sense: that kills everything". The work was therefore moved to the wiki. So. Not a great situation. The upstream wiki systemd integration suggests starting the systemd target from within Sway, which has all sorts of problems:
  • you don't get Sway logs anywhere
  • control groups are all messed up
I have done a lot of work trying to figure this out, but I remember that starting systemd from Sway didn't actually work for me: my previously configured systemd units didn't correctly start, and especially not with the right $PATH and environment. So I went down that rabbit hole and managed to correctly configure Sway to be started from the systemd --user session. I have partly followed the wiki but also picked ideas from damjan's sway-setup and xdbob's sway-services. Another option is uwsm (not in Debian). This is the config I have in .config/systemd/user/: I have also configured those services, but that's somewhat optional: You will also need at least part of my sway/config, which sends the systemd notification (because, no, Sway doesn't support any sort of readiness notification, that would be too easy). And you might like to see my swayidle-config while you're there. Finally, you need to hook this up somehow to the login manager. This is typically done with a desktop file, so drop sway-session.desktop in /usr/share/wayland-sessions and sway-user-service somewhere in your $PATH (typically /usr/bin/sway-user-service). The session then looks something like this:
$ systemd-cgls   head -101
Control group /:
-.slice
 user.slice (#472)
    user.invocation_id: bc405c6341de4e93a545bde6d7abbeec
    trusted.invocation_id: bc405c6341de4e93a545bde6d7abbeec
   user-1000.slice (#10072)
      user.invocation_id: 08f40f5c4bcd4fd6adfd27bec24e4827
      trusted.invocation_id: 08f40f5c4bcd4fd6adfd27bec24e4827
     user@1000.service   (#10156)
        user.delegate: 1
        trusted.delegate: 1
        user.invocation_id: 76bed72a1ffb41dca9bfda7bb174ef6b
        trusted.invocation_id: 76bed72a1ffb41dca9bfda7bb174ef6b
       session.slice (#10282)
         xdg-document-portal.service (#12248)
           9533 /usr/libexec/xdg-document-portal
           9542 fusermount3 -o rw,nosuid,nodev,fsname=portal,auto_unmount,subt 
         xdg-desktop-portal.service (#12211)
           9529 /usr/libexec/xdg-desktop-portal
         pipewire-pulse.service (#10778)
           6002 /usr/bin/pipewire-pulse
         wireplumber.service (#10519)
           5944 /usr/bin/wireplumber
         gvfs-daemon.service (#10667)
           5960 /usr/libexec/gvfsd
         gvfs-udisks2-volume-monitor.service (#10852)
           6021 /usr/libexec/gvfs-udisks2-volume-monitor
         at-spi-dbus-bus.service (#11481)
           6210 /usr/libexec/at-spi-bus-launcher
           6216 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2 
           6450 /usr/libexec/at-spi2-registryd --use-gnome-session
         pipewire.service (#10403)
           5940 /usr/bin/pipewire
         dbus.service (#10593)
           5946 /usr/bin/dbus-daemon --session --address=systemd: --nofork --n 
       background.slice (#10324)
         tracker-miner-fs-3.service (#10741)
           6001 /usr/libexec/tracker-miner-fs-3
       app.slice (#10240)
         xdg-permission-store.service (#12285)
           9536 /usr/libexec/xdg-permission-store
         gammastep.service (#11370)
           6197 gammastep
         dunst.service (#11958)
           7460 /usr/bin/dunst
         wterminal.service (#13980)
           69100 foot --title pop-up
           69101 /bin/bash
           77660 sudo systemd-cgls
           77661 head -101
           77662 wl-copy
           77663 sudo systemd-cgls
           77664 systemd-cgls
         syncthing.service (#11995)
           7529 /usr/bin/syncthing -no-browser -no-restart -logflags=0 --verbo 
           7537 /usr/bin/syncthing -no-browser -no-restart -logflags=0 --verbo 
         dconf.service (#10704)
           5967 /usr/libexec/dconf-service
         gnome-keyring-daemon.service (#10630)
           5951 /usr/bin/gnome-keyring-daemon --foreground --components=pkcs11 
         gcr-ssh-agent.service (#10963)
           6035 /usr/libexec/gcr-ssh-agent /run/user/1000/gcr
         swayidle.service (#11444)
           6199 /usr/bin/swayidle -w
         nm-applet.service (#11407)
           6198 /usr/bin/nm-applet --indicator
         wcolortaillog.service (#11518)
           6226 foot colortaillog
           6228 /bin/sh /home/anarcat/bin/colortaillog
           6230 sudo journalctl -f
           6233 ccze -m ansi
           6235 sudo journalctl -f
           6236 journalctl -f
         afuse.service (#10889)
           6051 /usr/bin/afuse -o mount_template=sshfs -o transform_symlinks - 
         gpg-agent.service (#13547)
           51662 /usr/bin/gpg-agent --supervised
           51719 scdaemon --multi-server
         emacs.service (#10926)
            6034 /usr/bin/emacs --fg-daemon
           33203 /usr/bin/aspell -a -m -d en --encoding=utf-8
         xdg-desktop-portal-gtk.service (#12322)
           9546 /usr/libexec/xdg-desktop-portal-gtk
         xdg-desktop-portal-wlr.service (#12359)
           9555 /usr/libexec/xdg-desktop-portal-wlr
         sway.service (#11037)
           6037 /usr/bin/sway
           6181 swaybar -b bar-0
           6209 py3status
           6309 /usr/bin/i3status -c /tmp/py3status_oy4ntfnq
           6969 Xwayland :0 -rootless -terminate -core -listen 29 -listen 30 - 
       init.scope (#10198)
         5909 /lib/systemd/systemd --user
         5911 (sd-pam)
     session-7.scope (#10440)
       5895 gdm-session-worker [pam/gdm-password]
       6028 /usr/libexec/gdm-wayland-session --register-session sway-user-serv 
[...]
I think that's pretty neat.

Environment propagation At first, my terminals and rofi didn't have the right $PATH, which broke a lot of my workflow. It's hard to tell exactly how Wayland gets started or where to inject environment. This discussion suggests a few alternatives and this Debian bug report discusses this issue as well. I eventually picked environment.d(5) since I already manage my user session with systemd, and it fixes a bunch of other problems. I used to have a .shenv that I had to manually source everywhere. The only problem with that approach is that it doesn't support conditionals, but that's something that's rarely needed.

Pipewire This is a whole topic onto itself, but migrating to Wayland also involves using Pipewire if you want screen sharing to work. You can actually keep using Pulseaudio for audio, that said, but that migration is actually something I've wanted to do anyways: Pipewire's design seems much better than Pulseaudio, as it folds in JACK features which allows for pretty neat tricks. (Which I should probably show in a separate post, because this one is getting rather long.) I first tried this migration in Debian bullseye, and it didn't work very well. Ardour would fail to export tracks and I would get into weird situations where streams would just drop mid-way. A particularly funny incident is when I was in a meeting and I couldn't hear my colleagues speak anymore (but they could) and I went on blabbering on my own for a solid 5 minutes until I realized what was going on. By then, people had tried numerous ways of letting me know that something was off, including (apparently) coughing, saying "hello?", chat messages, IRC, and so on, until they just gave up and left. I suspect that was also a Pipewire bug, but it could also have been that I muted the tab by error, as I recently learned that clicking on the little tiny speaker icon on a tab mutes that tab. Since the tab itself can get pretty small when you have lots of them, it's actually quite frequently that I mistakenly mute tabs. Anyways. Point is: I already knew how to make the migration, and I had already documented how to make the change in Puppet. It's basically:
apt install pipewire pipewire-audio-client-libraries pipewire-pulse wireplumber 
Then, as a regular user:
systemctl --user daemon-reload
systemctl --user --now disable pulseaudio.service pulseaudio.socket
systemctl --user --now enable pipewire pipewire-pulse
systemctl --user mask pulseaudio
An optional (but key, IMHO) configuration you should also make is to "switch on connect", which will make your Bluetooth or USB headset automatically be the default route for audio, when connected. In ~/.config/pipewire/pipewire-pulse.conf.d/autoconnect.conf:
context.exec = [
      path = "pactl"        args = "load-module module-always-sink"  
      path = "pactl"        args = "load-module module-switch-on-connect"  
    #  path = "/usr/bin/sh"  args = "~/.config/pipewire/default.pw"  
]
See the excellent as usual Arch wiki page about Pipewire for that trick and more information about Pipewire. Note that you must not put the file in ~/.config/pipewire/pipewire.conf (or pipewire-pulse.conf, maybe) directly, as that will break your setup. If you want to add to that file, first copy the template from /usr/share/pipewire/pipewire-pulse.conf first. So far I'm happy with Pipewire in bookworm, but I've heard mixed reports from it. I have high hopes it will become the standard media server for Linux in the coming months or years, which is great because I've been (rather boldly, I admit) on the record saying I don't like PulseAudio. Rereading this now, I feel it might have been a little unfair, as "over-engineered and tries to do too many things at once" applies probably even more to Pipewire than PulseAudio (since it also handles video dispatching). That said, I think Pipewire took the right approach by implementing existing interfaces like Pulseaudio and JACK. That way we're not adding a third (or fourth?) way of doing audio in Linux; we're just making the server better.

Keypress drops Sometimes I lose keyboard presses. This correlates with the following warning from Sway:
d c 06 10:36:31 curie sway[343384]: 23:32:14.034 [ERROR] [wlr] [libinput] event5  - SONiX USB Keyboard: client bug: event processing lagging behind by 37ms, your system is too slow 
... and corresponds to an open bug report in Sway. It seems the "system is too slow" should really be "your compositor is too slow" which seems to be the case here on this older system (curie). It doesn't happen often, but it does happen, particularly when a bunch of busy processes start in parallel (in my case: a linter running inside a container and notmuch new). The proposed fix for this in Sway is to gain real time privileges and add the CAP_SYS_NICE capability to the binary. We'll see how that goes in Debian once 1.8 gets released and shipped.

Improvements over i3

Tiling improvements There's a lot of improvements Sway could bring over using plain i3. There are pretty neat auto-tilers that could replicate the configurations I used to have in Xmonad or Awesome, see:

Display latency tweaks TODO: You can tweak the display latency in wlroots compositors with the max_render_time parameter, possibly getting lower latency than X11 in the end.

Sound/brightness changes notifications TODO: Avizo can display a pop-up to give feedback on volume and brightness changes. Not in Debian. Other alternatives include SwayOSD and sway-nc, also not in Debian.

Debugging tricks The xeyes (in the x11-apps package) will run in Wayland, and can actually be used to easily see if a given window is also in Wayland. If the "eyes" follow the cursor, the app is actually running in xwayland, so not natively in Wayland. Another way to see what is using Wayland in Sway is with the command:
swaymsg -t get_tree

Other documentation

Conclusion In general, this took me a long time, but it mostly works. The tray icon situation is pretty frustrating, but there's a workaround and I have high hopes it will eventually fix itself. I'm also actually worried about the DisplayLink support because I eventually want to be using this, but hopefully that's another thing that will hopefully fix itself before I need it.

A word on the security model I'm kind of worried about all the hacks that have been added to Wayland just to make things work. Pretty much everywhere we need to, we punched a hole in the security model: Wikipedia describes the security properties of Wayland as it "isolates the input and output of every window, achieving confidentiality, integrity and availability for both." I'm not sure those are actually realized in the actual implementation, because of all those holes punched in the design, at least in Sway. For example, apparently the GNOME compositor doesn't have the virtual-keyboard protocol, but they do have (another?!) text input protocol. Wayland does offer a better basis to implement such a system, however. It feels like the Linux applications security model lacks critical decision points in the UI, like the user approving "yes, this application can share my screen now". Applications themselves might have some of those prompts, but it's not mandatory, and that is worrisome.

Antoine Beaupr : A ZFS migration

In my tubman setup, I started using ZFS on an old server I had lying around. The machine is really old though (2011!) and it "feels" pretty slow. I want to see how much of that is ZFS and how much is the machine. Synthetic benchmarks show that ZFS may be slower than mdadm in RAID-10 or RAID-6 configuration, so I want to confirm that on a live workload: my workstation. Plus, I want easy, regular, high performance backups (with send/receive snapshots) and there's no way I'm going to use BTRFS because I find it too confusing and unreliable. So off we go.

Installation Since this is a conversion (and not a new install), our procedure is slightly different than the official documentation but otherwise it's pretty much in the same spirit: we're going to use ZFS for everything, including the root filesystem. So, install the required packages, on the current system:
apt install --yes gdisk zfs-dkms zfs zfs-initramfs zfsutils-linux
We also tell DKMS that we need to rebuild the initrd when upgrading:
echo REMAKE_INITRD=yes > /etc/dkms/zfs.conf

Partitioning This is going to partition /dev/sdc with:
  • 1MB MBR / BIOS legacy boot
  • 512MB EFI boot
  • 1GB bpool, unencrypted pool for /boot
  • rest of the disk for zpool, the rest of the data
     sgdisk --zap-all /dev/sdc
     sgdisk -a1 -n1:24K:+1000K -t1:EF02 /dev/sdc
     sgdisk     -n2:1M:+512M   -t2:EF00 /dev/sdc
     sgdisk     -n3:0:+1G      -t3:BF01 /dev/sdc
     sgdisk     -n4:0:0        -t4:BF00 /dev/sdc
    
That will look something like this:
    root@curie:/home/anarcat# sgdisk -p /dev/sdc
    Disk /dev/sdc: 1953525168 sectors, 931.5 GiB
    Model: ESD-S1C         
    Sector size (logical/physical): 512/512 bytes
    Disk identifier (GUID): [REDACTED]
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 1953525134
    Partitions will be aligned on 16-sector boundaries
    Total free space is 14 sectors (7.0 KiB)
    Number  Start (sector)    End (sector)  Size       Code  Name
       1              48            2047   1000.0 KiB  EF02  
       2            2048         1050623   512.0 MiB   EF00  
       3         1050624         3147775   1024.0 MiB  BF01  
       4         3147776      1953525134   930.0 GiB   BF00
Unfortunately, we can't be sure of the sector size here, because the USB controller is probably lying to us about it. Normally, this smartctl command should tell us the sector size as well:
root@curie:~# smartctl -i /dev/sdb -qnoserial
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-14-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black Mobile
Device Model:     WDC WD10JPLX-00MBPT0
Firmware Version: 01.01H01
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue May 17 13:33:04 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Above is the example of the builtin HDD drive. But the SSD device enclosed in that USB controller doesn't support SMART commands, so we can't trust that it really has 512 bytes sectors. This matters because we need to tweak the ashift value correctly. We're going to go ahead the SSD drive has the common 4KB settings, which means ashift=12. Note here that we are not creating a separate partition for swap. Swap on ZFS volumes (AKA "swap on ZVOL") can trigger lockups and that issue is still not fixed upstream. Ubuntu recommends using a separate partition for swap instead. But since this is "just" a workstation, we're betting that we will not suffer from this problem, after hearing a report from another Debian developer running this setup on their workstation successfully. We do not recommend this setup though. In fact, if I were to redo this partition scheme, I would probably use LUKS encryption and setup a dedicated swap partition, as I had problems with ZFS encryption as well.

Creating pools ZFS pools are somewhat like "volume groups" if you are familiar with LVM, except they obviously also do things like RAID-10. (Even though LVM can technically also do RAID, people typically use mdadm instead.) In any case, the guide suggests creating two different pools here: one, in cleartext, for boot, and a separate, encrypted one, for the rest. Technically, the boot partition is required because the Grub bootloader only supports readonly ZFS pools, from what I understand. But I'm a little out of my depth here and just following the guide.

Boot pool creation This creates the boot pool in readonly mode with features that grub supports:
    zpool create \
        -o cachefile=/etc/zfs/zpool.cache \
        -o ashift=12 -d \
        -o feature@async_destroy=enabled \
        -o feature@bookmarks=enabled \
        -o feature@embedded_data=enabled \
        -o feature@empty_bpobj=enabled \
        -o feature@enabled_txg=enabled \
        -o feature@extensible_dataset=enabled \
        -o feature@filesystem_limits=enabled \
        -o feature@hole_birth=enabled \
        -o feature@large_blocks=enabled \
        -o feature@lz4_compress=enabled \
        -o feature@spacemap_histogram=enabled \
        -o feature@zpool_checkpoint=enabled \
        -O acltype=posixacl -O canmount=off \
        -O compression=lz4 \
        -O devices=off -O normalization=formD -O relatime=on -O xattr=sa \
        -O mountpoint=/boot -R /mnt \
        bpool /dev/sdc3
I haven't investigated all those settings and just trust the upstream guide on the above.

Main pool creation This is a more typical pool creation.
    zpool create \
        -o ashift=12 \
        -O encryption=on -O keylocation=prompt -O keyformat=passphrase \
        -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
        -O compression=zstd \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/ -R /mnt \
        rpool /dev/sdc4
Breaking this down:
  • -o ashift=12: mentioned above, 4k sector size
  • -O encryption=on -O keylocation=prompt -O keyformat=passphrase: encryption, prompt for a password, default algorithm is aes-256-gcm, explicit in the guide, made implicit here
  • -O acltype=posixacl -O xattr=sa: enable ACLs, with better performance (not enabled by default)
  • -O dnodesize=auto: related to extended attributes, less compatibility with other implementations
  • -O compression=zstd: enable zstd compression, can be disabled/enabled by dataset to with zfs set compression=off rpool/example
  • -O relatime=on: classic atime optimisation, another that could be used on a busy server is atime=off
  • -O canmount=off: do not make the pool mount automatically with mount -a?
  • -O mountpoint=/ -R /mnt: mount pool on / in the future, but /mnt for now
Those settings are all available in zfsprops(8). Other flags are defined in zpool-create(8). The reasoning behind them is also explained in the upstream guide and some also in [the Debian wiki][]. Those flags were actually not used:
  • -O normalization=formD: normalize file names on comparisons (not storage), implies utf8only=on, which is a bad idea (and effectively meant my first sync failed to copy some files, including this folder from a supysonic checkout). and this cannot be changed after the filesystem is created. bad, bad, bad.
[the Debian wiki]: https://wiki.debian.org/ZFS#Advanced_Topics

Side note about single-disk pools Also note that we're living dangerously here: single-disk ZFS pools are rumoured to be more dangerous than not running ZFS at all. The choice quote from this article is:
[...] any error can be detected, but cannot be corrected. This sounds like an acceptable compromise, but its actually not. The reason its not is that ZFS' metadata cannot be allowed to be corrupted. If it is it is likely the zpool will be impossible to mount (and will probably crash the system once the corruption is found). So a couple of bad sectors in the right place will mean that all data on the zpool will be lost. Not some, all. Also there's no ZFS recovery tools, so you cannot recover any data on the drives.
Compared with (say) ext4, where a single disk error can recovered, this is pretty bad. But we are ready to live with this with the idea that we'll have hourly offline snapshots that we can easily recover from. It's trade-off. Also, we're running this on a NVMe/M.2 drive which typically just blinks out of existence completely, and doesn't "bit rot" the way a HDD would. Also, the FreeBSD handbook quick start doesn't have any warnings about their first example, which is with a single disk. So I am reassured at least.

Creating mount points Next we create the actual filesystems, known as "datasets" which are the things that get mounted on mountpoint and hold the actual files.
  • this creates two containers, for ROOT and BOOT
     zfs create -o canmount=off -o mountpoint=none rpool/ROOT &&
     zfs create -o canmount=off -o mountpoint=none bpool/BOOT
    
    Note that it's unclear to me why those datasets are necessary, but they seem common practice, also used in this FreeBSD example. The OpenZFS guide mentions the Solaris upgrades and Ubuntu's zsys that use that container for upgrades and rollbacks. This blog post seems to explain a bit the layout behind the installer.
  • this creates the actual boot and root filesystems:
     zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/debian &&
     zfs mount rpool/ROOT/debian &&
     zfs create -o mountpoint=/boot bpool/BOOT/debian
    
    I guess the debian name here is because we could technically have multiple operating systems with the same underlying datasets.
  • then the main datasets:
     zfs create                                 rpool/home &&
     zfs create -o mountpoint=/root             rpool/home/root &&
     chmod 700 /mnt/root &&
     zfs create                                 rpool/var
    
  • exclude temporary files from snapshots:
     zfs create -o com.sun:auto-snapshot=false  rpool/var/cache &&
     zfs create -o com.sun:auto-snapshot=false  rpool/var/tmp &&
     chmod 1777 /mnt/var/tmp
    
  • and skip automatic snapshots in Docker:
     zfs create -o canmount=off                 rpool/var/lib &&
     zfs create -o com.sun:auto-snapshot=false  rpool/var/lib/docker
    
    Notice here a peculiarity: we must create rpool/var/lib to create rpool/var/lib/docker otherwise we get this error:
     cannot create 'rpool/var/lib/docker': parent does not exist
    
    ... and no, just creating /mnt/var/lib doesn't fix that problem. In fact, it makes things even more confusing because an existing directory shadows a mountpoint, which is the opposite of how things normally work. Also note that you will probably need to change storage driver in Docker, see the zfs-driver documentation for details but, basically, I did:
    echo '  "storage-driver": "zfs"  ' > /etc/docker/daemon.json
    
    Note that podman has the same problem (and similar solution):
    printf '[storage]\ndriver = "zfs"\n' > /etc/containers/storage.conf
    
  • make a tmpfs for /run:
     mkdir /mnt/run &&
     mount -t tmpfs tmpfs /mnt/run &&
     mkdir /mnt/run/lock
    
We don't create a /srv, as that's the HDD stuff. Also mount the EFI partition:
mkfs.fat -F 32 /dev/sdc2 &&
mount /dev/sdc2 /mnt/boot/efi/
At this point, everything should be mounted in /mnt. It should look like this:
root@curie:~# LANG=C df -h -t zfs -t vfat
Filesystem            Size  Used Avail Use% Mounted on
rpool/ROOT/debian     899G  384K  899G   1% /mnt
bpool/BOOT/debian     832M  123M  709M  15% /mnt/boot
rpool/home            899G  256K  899G   1% /mnt/home
rpool/home/root       899G  256K  899G   1% /mnt/root
rpool/var             899G  384K  899G   1% /mnt/var
rpool/var/cache       899G  256K  899G   1% /mnt/var/cache
rpool/var/tmp         899G  256K  899G   1% /mnt/var/tmp
rpool/var/lib/docker  899G  256K  899G   1% /mnt/var/lib/docker
/dev/sdc2             511M  4.0K  511M   1% /mnt/boot/efi
Now that we have everything setup and mounted, let's copy all files over.

Copying files This is a list of all the mounted filesystems
for fs in /boot/ /boot/efi/ / /home/; do
    echo "syncing $fs to /mnt$fs..." && 
    rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
done
You can check that the list is correct with:
mount -l -t ext4,btrfs,vfat   awk ' print $3 '
Note that we skip /srv as it's on a different disk. On the first run, we had:
root@curie:~# for fs in /boot/ /boot/efi/ / /home/; do
        echo "syncing $fs to /mnt$fs..." && 
        rsync -aSHAXx --info=progress2 $fs /mnt$fs
    done
syncing /boot/ to /mnt/boot/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/299)  
syncing /boot/efi/ to /mnt/boot/efi/...
     16,831,437 100%  184.14MB/s    0:00:00 (xfr#101, to-chk=0/110)
syncing / to /mnt/...
 28,019,293,280  94%   47.63MB/s    0:09:21 (xfr#703710, ir-chk=6748/839220)rsync: [generator] delete_file: rmdir(var/lib/docker) failed: Device or resource busy (16)
could not make way for new symlink: var/lib/docker
 34,081,267,990  98%   50.71MB/s    0:10:40 (xfr#736577, to-chk=0/867732)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
syncing /home/ to /mnt/home/...
rsync: [sender] readlink_stat("/home/anarcat/.fuse") failed: Permission denied (13)
 24,456,268,098  98%   68.03MB/s    0:05:42 (xfr#159867, ir-chk=6875/172377) 
file has vanished: "/home/anarcat/.cache/mozilla/firefox/s2hwvqbu.quantum/cache2/entries/B3AB0CDA9C4454B3C1197E5A22669DF8EE849D90"
199,762,528,125  93%   74.82MB/s    0:42:26 (xfr#1437846, ir-chk=1018/1983979)rsync: [generator] recv_generator: mkdir "/mnt/home/anarcat/dist/supysonic/tests/assets/\#346" failed: Invalid or incomplete multibyte or wide character (84)
*** Skipping any contents from this failed directory ***
315,384,723,978  96%   76.82MB/s    1:05:15 (xfr#2256473, to-chk=0/2993950)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Note the failure to transfer that supysonic file? It turns out they had a weird filename in their source tree, since then removed, but still it showed how the utf8only feature might not be such a bad idea. At this point, the procedure was restarted all the way back to "Creating pools", after unmounting all ZFS filesystems (umount /mnt/run /mnt/boot/efi && umount -t zfs -a) and destroying the pool, which, surprisingly, doesn't require any confirmation (zpool destroy rpool). The second run was cleaner:
root@curie:~# for fs in /boot/ /boot/efi/ / /home/; do
        echo "syncing $fs to /mnt$fs..." && 
        rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
    done
syncing /boot/ to /mnt/boot/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/299)  
syncing /boot/efi/ to /mnt/boot/efi/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/110)  
syncing / to /mnt/...
 28,019,033,070  97%   42.03MB/s    0:10:35 (xfr#703671, ir-chk=1093/833515)rsync: [generator] delete_file: rmdir(var/lib/docker) failed: Device or resource busy (16)
could not make way for new symlink: var/lib/docker
 34,081,807,102  98%   44.84MB/s    0:12:04 (xfr#736580, to-chk=0/867723)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
syncing /home/ to /mnt/home/...
rsync: [sender] readlink_stat("/home/anarcat/.fuse") failed: Permission denied (13)
IO error encountered -- skipping file deletion
 24,043,086,450  96%   62.03MB/s    0:06:09 (xfr#151819, ir-chk=15117/172571)
file has vanished: "/home/anarcat/.cache/mozilla/firefox/s2hwvqbu.quantum/cache2/entries/4C1FDBFEA976FF924D062FB990B24B897A77B84B"
315,423,626,507  96%   67.09MB/s    1:14:43 (xfr#2256845, to-chk=0/2994364)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Also note the transfer speed: we seem capped at 76MB/s, or 608Mbit/s. This is not as fast as I was expecting: the USB connection seems to be at around 5Gbps:
anarcat@curie:~$ lsusb -tv   head -4
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
     __ Port 1: Dev 4, If 0, Class=Mass Storage, Driver=uas, 5000M
        ID 0b05:1932 ASUSTek Computer, Inc.
So it shouldn't cap at that speed. It's possible the USB adapter is failing to give me the full speed though. It's not the M.2 SSD drive either, as that has a ~500MB/s bandwidth, acccording to its spec. At this point, we're about ready to do the final configuration. We drop to single user mode and do the rest of the procedure. That used to be shutdown now, but it seems like the systemd switch broke that, so now you can reboot into grub and pick the "recovery" option. Alternatively, you might try systemctl rescue, as I found out. I also wanted to copy the drive over to another new NVMe drive, but that failed: it looks like the USB controller I have doesn't work with older, non-NVME drives.

Boot configuration Now we need to enter the new system to rebuild the boot loader and initrd and so on. First, we bind mounts and chroot into the ZFS disk:
mount --rbind /dev  /mnt/dev &&
mount --rbind /proc /mnt/proc &&
mount --rbind /sys  /mnt/sys &&
chroot /mnt /bin/bash
Next we add an extra service that imports the bpool on boot, to make sure it survives a zpool.cache destruction:
cat > /etc/systemd/system/zfs-import-bpool.service <<EOF
[Unit]
DefaultDependencies=no
Before=zfs-import-scan.service
Before=zfs-import-cache.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/zpool import -N -o cachefile=none bpool
# Work-around to preserve zpool cache:
ExecStartPre=-/bin/mv /etc/zfs/zpool.cache /etc/zfs/preboot_zpool.cache
ExecStartPost=-/bin/mv /etc/zfs/preboot_zpool.cache /etc/zfs/zpool.cache
[Install]
WantedBy=zfs-import.target
EOF
Enable the service:
systemctl enable zfs-import-bpool.service
I had to trim down /etc/fstab and /etc/crypttab to only contain references to the legacy filesystems (/srv is still BTRFS!). If we don't already have a tmpfs defined in /etc/fstab:
ln -s /usr/share/systemd/tmp.mount /etc/systemd/system/ &&
systemctl enable tmp.mount
Rebuild boot loader with support for ZFS, but also to workaround GRUB's missing zpool-features support:
grub-probe /boot   grep -q zfs &&
update-initramfs -c -k all &&
sed -i 's,GRUB_CMDLINE_LINUX.*,GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/debian",' /etc/default/grub &&
update-grub
For good measure, make sure the right disk is configured here, for example you might want to tag both drives in a RAID array:
dpkg-reconfigure grub-pc
Install grub to EFI while you're there:
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=debian --recheck --no-floppy
Filesystem mount ordering. The rationale here in the OpenZFS guide is a little strange, but I don't dare ignore that.
mkdir /etc/zfs/zfs-list.cache
touch /etc/zfs/zfs-list.cache/bpool
touch /etc/zfs/zfs-list.cache/rpool
zed -F &
Verify that zed updated the cache by making sure these are not empty:
cat /etc/zfs/zfs-list.cache/bpool
cat /etc/zfs/zfs-list.cache/rpool
Once the files have data, stop zed:
fg
Press Ctrl-C.
Fix the paths to eliminate /mnt:
sed -Ei "s /mnt/? / " /etc/zfs/zfs-list.cache/*
Snapshot initial install:
zfs snapshot bpool/BOOT/debian@install
zfs snapshot rpool/ROOT/debian@install
Exit chroot:
exit

Finalizing One last sync was done in rescue mode:
for fs in /boot/ /boot/efi/ / /home/; do
    echo "syncing $fs to /mnt$fs..." && 
    rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
done
Then we unmount all filesystems:
mount   grep -v zfs   tac   awk '/\/mnt/  print $3 '   xargs -i  umount -lf  
zpool export -a
Reboot, swap the drives, and boot in ZFS. Hurray!

Benchmarks This is a test that was ran in single-user mode using fio and the Ars Technica recommended tests, which are:
  • Single 4KiB random write process:
     fio --name=randwrite4k1x --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
    
  • 16 parallel 64KiB random write processes:
     fio --name=randwrite64k16x --ioengine=posixaio --rw=randwrite --bs=64k --size=256m --numjobs=16 --iodepth=16 --runtime=60 --time_based --end_fsync=1
    
  • Single 1MiB random write process:
     fio --name=randwrite1m1x --ioengine=posixaio --rw=randwrite --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
    
Strangely, that's not exactly what the author, Jim Salter, did in his actual test bench used in the ZFS benchmarking article. The first thing is there's no read test at all, which is already pretty strange. But also it doesn't include stuff like dropping caches or repeating results. So here's my variation, which i called fio-ars-bench.sh for now. It just batches a bunch of fio tests, one by one, 60 seconds each. It should take about 12 minutes to run, as there are 3 pair of tests, read/write, with and without async. My bias, before building, running and analysing those results is that ZFS should outperform the traditional stack on writes, but possibly not on reads. It's also possible it outperforms it on both, because it's a newer drive. A new test might be possible with a new external USB drive as well, although I doubt I will find the time to do this.

Results All tests were done on WD blue SN550 drives, which claims to be able to push 2400MB/s read and 1750MB/s write. An extra drive was bought to move the LVM setup from a WDC WDS500G1B0B-00AS40 SSD, a WD blue M.2 2280 SSD that was at least 5 years old, spec'd at 560MB/s read, 530MB/s write. Benchmarks were done on the M.2 SSD drive but discarded so that the drive difference is not a factor in the test. In practice, I'm going to assume we'll never reach those numbers because we're not actually NVMe (this is an old workstation!) so the bottleneck isn't the disk itself. For our purposes, it might still give us useful results.

Rescue test, LUKS/LVM/ext4 Those tests were performed with everything shutdown, after either entering the system in rescue mode, or by reaching that target with:
systemctl rescue
The network might have been started before or after the test as well:
systemctl start systemd-networkd
So it should be fairly reliable as basically nothing else is running. Raw numbers, from the ?job-curie-lvm.log, converted to MiB/s and manually merged:
test read I/O read IOPS write I/O write IOPS
rand4k4g1x 39.27 10052 212.15 54310
rand4k4g1x--fsync=1 39.29 10057 2.73 699
rand64k256m16x 1297.00 20751 1068.57 17097
rand64k256m16x--fsync=1 1290.90 20654 353.82 5661
rand1m16g1x 315.15 315 563.77 563
rand1m16g1x--fsync=1 345.88 345 157.01 157
Peaks are at about 20k IOPS and ~1.3GiB/s read, 1GiB/s write in the 64KB blocks with 16 jobs. Slowest is the random 4k block sync write at an abysmal 3MB/s and 700 IOPS The 1MB read/write tests have lower IOPS, but that is expected.

Rescue test, ZFS This test was also performed in rescue mode. Raw numbers, from the ?job-curie-zfs.log, converted to MiB/s and manually merged:
test read I/O read IOPS write I/O write IOPS
rand4k4g1x 77.20 19763 27.13 6944
rand4k4g1x--fsync=1 76.16 19495 6.53 1673
rand64k256m16x 1882.40 30118 70.58 1129
rand64k256m16x--fsync=1 1865.13 29842 71.98 1151
rand1m16g1x 921.62 921 102.21 102
rand1m16g1x--fsync=1 908.37 908 64.30 64
Peaks are at 1.8GiB/s read, also in the 64k job like above, but much faster. The write is, as expected, much slower at 70MiB/s (compared to 1GiB/s!), but it should be noted the sync write doesn't degrade performance compared to async writes (although it's still below the LVM 300MB/s).

Conclusions Really, ZFS has trouble performing in all write conditions. The random 4k sync write test is the only place where ZFS outperforms LVM in writes, and barely (7MiB/s vs 3MiB/s). Everywhere else, writes are much slower, sometimes by an order of magnitude. And before some ZFS zealot jumps in talking about the SLOG or some other cache that could be added to improved performance, I'll remind you that those numbers are on a bare bones NVMe drive, pretty much as fast storage as you can find on this machine. Adding another NVMe drive as a cache probably will not improve write performance here. Still, those are very different results than the tests performed by Salter which shows ZFS beating traditional configurations in all categories but uncached 4k reads (not writes!). That said, those tests are very different from the tests I performed here, where I test writes on a single disk, not a RAID array, which might explain the discrepancy. Also, note that neither LVM or ZFS manage to reach the 2400MB/s read and 1750MB/s write performance specification. ZFS does manage to reach 82% of the read performance (1973MB/s) and LVM 64% of the write performance (1120MB/s). LVM hits 57% of the read performance and ZFS hits barely 6% of the write performance. Overall, I'm a bit disappointed in the ZFS write performance here, I must say. Maybe I need to tweak the record size or some other ZFS voodoo, but I'll note that I didn't have to do any such configuration on the other side to kick ZFS in the pants...

Real world experience This section document not synthetic backups, but actual real world workloads, comparing before and after I switched my workstation to ZFS.

Docker performance I had the feeling that running some git hook (which was firing a Docker container) was "slower" somehow. It seems that, at runtime, ZFS backends are significant slower than their overlayfs/ext4 equivalent:
May 16 14:42:52 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87\x2dinit-merged.mount: Succeeded.
May 16 14:42:52 curie systemd[5161]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87\x2dinit-merged.mount: Succeeded.
May 16 14:42:52 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:53 curie dockerd[1723]: time="2022-05-16T14:42:53.087219426-04:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10 pid=151170
May 16 14:42:53 curie systemd[1]: Started libcontainer container af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10.
May 16 14:42:54 curie systemd[1]: docker-af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10.scope: Succeeded.
May 16 14:42:54 curie dockerd[1723]: time="2022-05-16T14:42:54.047297800-04:00" level=info msg="shim disconnected" id=af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10
May 16 14:42:54 curie dockerd[998]: time="2022-05-16T14:42:54.051365015-04:00" level=info msg="ignoring event" container=af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
May 16 14:42:54 curie systemd[2444]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[5161]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[2444]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:54 curie systemd[5161]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:54 curie systemd[1]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
Translating this:
  • container setup: ~1 second
  • container runtime: ~1 second
  • container teardown: ~1 second
  • total runtime: 2-3 seconds
Obviously, those timestamps are not quite accurate enough to make precise measurements... After I switched to ZFS:
mai 30 15:31:39 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf\x2dinit.mount: Succeeded. 
mai 30 15:31:39 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf\x2dinit.mount: Succeeded. 
mai 30 15:31:40 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:40 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:41 curie dockerd[3199]: time="2022-05-30T15:31:41.551403693-04:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 pid=141080 
mai 30 15:31:41 curie systemd[1]: run-docker-runtime\x2drunc-moby-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142-runc.ZVcjvl.mount: Succeeded. 
mai 30 15:31:41 curie systemd[5287]: run-docker-runtime\x2drunc-moby-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142-runc.ZVcjvl.mount: Succeeded. 
mai 30 15:31:41 curie systemd[1]: Started libcontainer container 42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142. 
mai 30 15:31:45 curie systemd[1]: docker-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142.scope: Succeeded. 
mai 30 15:31:45 curie dockerd[3199]: time="2022-05-30T15:31:45.883019128-04:00" level=info msg="shim disconnected" id=42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 
mai 30 15:31:45 curie dockerd[1726]: time="2022-05-30T15:31:45.883064491-04:00" level=info msg="ignoring event" container=42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" 
mai 30 15:31:45 curie systemd[1]: run-docker-netns-e45f5cf5f465.mount: Succeeded. 
mai 30 15:31:45 curie systemd[5287]: run-docker-netns-e45f5cf5f465.mount: Succeeded. 
mai 30 15:31:45 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:45 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded.
That's double or triple the run time, from 2 seconds to 6 seconds. Most of the time is spent in run time, inside the container. Here's the breakdown:
  • container setup: ~2 seconds
  • container run: ~4 seconds
  • container teardown: ~1 second
  • total run time: about ~6-7 seconds
That's a two- to three-fold increase! Clearly something is going on here that I should tweak. It's possible that code path is less optimized in Docker. I also worry about podman, but apparently it also supports ZFS backends. Possibly it would perform better, but at this stage I wouldn't have a good comparison: maybe it would have performed better on non-ZFS as well...

Interactivity While doing the offsite backups (below), the system became somewhat "sluggish". I felt everything was slow, and I estimate it introduced ~50ms latency in any input device. Arguably, those are all USB and the external drive was connected through USB, but I suspect the ZFS drivers are not as well tuned with the scheduler as the regular filesystem drivers...

Recovery procedures For test purposes, I unmounted all systems during the procedure:
umount /mnt/boot/efi /mnt/boot/run
umount -a -t zfs
zpool export -a
And disconnected the drive, to see how I would recover this system from another Linux system in case of a total motherboard failure. To import an existing pool, plug the device, then import the pool with an alternate root, so it doesn't mount over your existing filesystems, then you mount the root filesystem and all the others:
zpool import -l -a -R /mnt &&
zfs mount rpool/ROOT/debian &&
zfs mount -a &&
mount /dev/sdc2 /mnt/boot/efi &&
mount -t tmpfs tmpfs /mnt/run &&
mkdir /mnt/run/lock

Offsite backup Part of the goal of using ZFS is to simplify and harden backups. I wanted to experiment with shorter recovery times specifically both point in time recovery objective and recovery time objective and faster incremental backups. This is, therefore, part of my backup services. This section documents how an external NVMe enclosure was setup in a pool to mirror the datasets from my workstation. The final setup should include syncoid copying datasets to the backup server regularly, but I haven't finished that configuration yet.

Partitioning The above partitioning procedure used sgdisk, but I couldn't figure out how to do this with sgdisk, so this uses sfdisk to dump the partition from the first disk to an external, identical drive:
sfdisk -d /dev/nvme0n1   sfdisk --no-reread /dev/sda --force

Pool creation This is similar to the main pool creation, except we tweaked a few bits after changing the upstream procedure:
zpool create \
        -o cachefile=/etc/zfs/zpool.cache \
        -o ashift=12 -d \
        -o feature@async_destroy=enabled \
        -o feature@bookmarks=enabled \
        -o feature@embedded_data=enabled \
        -o feature@empty_bpobj=enabled \
        -o feature@enabled_txg=enabled \
        -o feature@extensible_dataset=enabled \
        -o feature@filesystem_limits=enabled \
        -o feature@hole_birth=enabled \
        -o feature@large_blocks=enabled \
        -o feature@lz4_compress=enabled \
        -o feature@spacemap_histogram=enabled \
        -o feature@zpool_checkpoint=enabled \
        -O acltype=posixacl -O xattr=sa \
        -O compression=lz4 \
        -O devices=off \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/boot -R /mnt \
        bpool-tubman /dev/sdb3
The change from the main boot pool are: Main pool creation is:
zpool create \
        -o ashift=12 \
        -O encryption=on -O keylocation=prompt -O keyformat=passphrase \
        -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
        -O compression=zstd \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/ -R /mnt \
        rpool-tubman /dev/sdb4

First sync I used syncoid to copy all pools over to the external device. syncoid is a thing that's part of the sanoid project which is specifically designed to sync snapshots between pool, typically over SSH links but it can also operate locally. The sanoid command had a --readonly argument to simulate changes, but syncoid didn't so I tried to fix that with an upstream PR. It seems it would be better to do this by hand, but this was much easier. The full first sync was:
root@curie:/home/anarcat# ./bin/syncoid -r  bpool bpool-tubman
CRITICAL ERROR: Target bpool-tubman exists but has no snapshots matching with bpool!
                Replication to target would require destroying existing
                target. Cowardly refusing to destroy your existing target.
          NOTE: Target bpool-tubman dataset is < 64MB used - did you mistakenly run
                 zfs create bpool-tubman  on the target? ZFS initial
                replication must be to a NON EXISTENT DATASET, which will
                then be CREATED BY the initial replication process.
INFO: Sending oldest full snapshot bpool/BOOT@test (~ 42 KB) to new target filesystem:
44.2KiB 0:00:00 [4.19MiB/s] [========================================================================================================================] 103%            
INFO: Updating new target filesystem with incremental bpool/BOOT@test ... syncoid_curie_2022-05-30:12:50:39 (~ 4 KB):
2.13KiB 0:00:00 [ 114KiB/s] [===============================================================>                                                         ] 53%            
INFO: Sending oldest full snapshot bpool/BOOT/debian@install (~ 126.0 MB) to new target filesystem:
 126MiB 0:00:00 [ 308MiB/s] [=======================================================================================================================>] 100%            
INFO: Updating new target filesystem with incremental bpool/BOOT/debian@install ... syncoid_curie_2022-05-30:12:50:39 (~ 113.4 MB):
 113MiB 0:00:00 [ 315MiB/s] [=======================================================================================================================>] 100%
root@curie:/home/anarcat# ./bin/syncoid -r  rpool rpool-tubman
CRITICAL ERROR: Target rpool-tubman exists but has no snapshots matching with rpool!
                Replication to target would require destroying existing
                target. Cowardly refusing to destroy your existing target.
          NOTE: Target rpool-tubman dataset is < 64MB used - did you mistakenly run
                 zfs create rpool-tubman  on the target? ZFS initial
                replication must be to a NON EXISTENT DATASET, which will
                then be CREATED BY the initial replication process.
INFO: Sending oldest full snapshot rpool/ROOT@syncoid_curie_2022-05-30:12:50:51 (~ 69 KB) to new target filesystem:
44.2KiB 0:00:00 [2.44MiB/s] [===========================================================================>                                             ] 63%            
INFO: Sending oldest full snapshot rpool/ROOT/debian@install (~ 25.9 GB) to new target filesystem:
25.9GiB 0:03:33 [ 124MiB/s] [=======================================================================================================================>] 100%            
INFO: Updating new target filesystem with incremental rpool/ROOT/debian@install ... syncoid_curie_2022-05-30:12:50:52 (~ 3.9 GB):
3.92GiB 0:00:33 [ 119MiB/s] [======================================================================================================================>  ] 99%            
INFO: Sending oldest full snapshot rpool/home@syncoid_curie_2022-05-30:12:55:04 (~ 276.8 GB) to new target filesystem:
 277GiB 0:27:13 [ 174MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/home/root@syncoid_curie_2022-05-30:13:22:19 (~ 2.2 GB) to new target filesystem:
2.22GiB 0:00:25 [90.2MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var@syncoid_curie_2022-05-30:13:22:47 (~ 5.6 GB) to new target filesystem:
5.56GiB 0:00:32 [ 176MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/cache@syncoid_curie_2022-05-30:13:23:22 (~ 627.3 MB) to new target filesystem:
 627MiB 0:00:03 [ 169MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/lib@syncoid_curie_2022-05-30:13:23:28 (~ 69 KB) to new target filesystem:
44.2KiB 0:00:00 [1.40MiB/s] [===========================================================================>                                             ] 63%            
INFO: Sending oldest full snapshot rpool/var/lib/docker@syncoid_curie_2022-05-30:13:23:28 (~ 442.6 MB) to new target filesystem:
 443MiB 0:00:04 [ 103MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254 (~ 6.3 MB) to new target filesystem:
6.49MiB 0:00:00 [12.9MiB/s] [========================================================================================================================] 102%            
INFO: Updating new target filesystem with incremental rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254 ... syncoid_curie_2022-05-30:13:23:34 (~ 4 KB):
1.52KiB 0:00:00 [27.6KiB/s] [============================================>                                                                            ] 38%            
INFO: Sending oldest full snapshot rpool/var/lib/flatpak@syncoid_curie_2022-05-30:13:23:36 (~ 2.0 GB) to new target filesystem:
2.00GiB 0:00:17 [ 115MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/tmp@syncoid_curie_2022-05-30:13:23:55 (~ 57.0 MB) to new target filesystem:
61.8MiB 0:00:01 [45.0MiB/s] [========================================================================================================================] 108%            
INFO: Clone is recreated on target rpool-tubman/var/lib/docker/ed71ddd563a779ba6fb37b3b1d0cc2c11eca9b594e77b4b234867ebcb162b205 based on rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254
INFO: Sending oldest full snapshot rpool/var/lib/docker/ed71ddd563a779ba6fb37b3b1d0cc2c11eca9b594e77b4b234867ebcb162b205@syncoid_curie_2022-05-30:13:23:58 (~ 218.6 MB) to new target filesystem:
 219MiB 0:00:01 [ 151MiB/s] [=======================================================================================================================>] 100%
Funny how the CRITICAL ERROR doesn't actually stop syncoid and it just carries on merrily doing when it's telling you it's "cowardly refusing to destroy your existing target"... Maybe that's because my pull request broke something though... During the transfer, the computer was very sluggish: everything feels like it has ~30-50ms latency extra:
anarcat@curie:sanoid$ LANG=C top -b  -n 1   head -20
top - 13:07:05 up 6 days,  4:01,  1 user,  load average: 16.13, 16.55, 11.83
Tasks: 606 total,   6 running, 598 sleeping,   0 stopped,   2 zombie
%Cpu(s): 18.8 us, 72.5 sy,  1.2 ni,  5.0 id,  1.2 wa,  0.0 hi,  1.2 si,  0.0 st
MiB Mem :  15898.4 total,   1387.6 free,  13170.0 used,   1340.8 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   1319.8 avail Mem 
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     70 root      20   0       0      0      0 S  83.3   0.0   6:12.67 kswapd0
4024878 root      20   0  282644  96432  10288 S  44.4   0.6   0:11.43 puppet
3896136 root      20   0   35328  16528     48 S  22.2   0.1   2:08.04 mbuffer
3896135 root      20   0   10328    776    168 R  16.7   0.0   1:22.93 zfs
3896138 root      20   0   10588    788    156 R  16.7   0.0   1:49.30 zfs
    350 root       0 -20       0      0      0 R  11.1   0.0   1:03.53 z_rd_int
    351 root       0 -20       0      0      0 S  11.1   0.0   1:04.15 z_rd_int
3896137 root      20   0    4384    352    244 R  11.1   0.0   0:44.73 pv
4034094 anarcat   30  10   20028  13960   2428 S  11.1   0.1   0:00.70 mbsync
4036539 anarcat   20   0    9604   3464   2408 R  11.1   0.0   0:00.04 top
    352 root       0 -20       0      0      0 S   5.6   0.0   1:03.64 z_rd_int
    353 root       0 -20       0      0      0 S   5.6   0.0   1:03.64 z_rd_int
    354 root       0 -20       0      0      0 S   5.6   0.0   1:04.01 z_rd_int
I wonder how much of that is due to syncoid, particularly because I often saw mbuffer and pv in there which are not strictly necessary to do those kind of operations, as far as I understand. Once that's done, export the pools to disconnect the drive:
zpool export bpool-tubman
zpool export rpool-tubman

Raw disk benchmark Copied the 512GB SSD/M.2 device to another 1024GB NVMe/M.2 device:
anarcat@curie:~$ sudo dd if=/dev/sdb of=/dev/sdc bs=4M status=progress conv=fdatasync
499944259584 octets (500 GB, 466 GiB) copi s, 1713 s, 292 MB/s
119235+1 enregistrements lus
119235+1 enregistrements  crits
500107862016 octets (500 GB, 466 GiB) copi s, 1719,93 s, 291 MB/s
... while both over USB, whoohoo 300MB/s!

Monitoring ZFS should be monitoring your pools regularly. Normally, the [[!debman zed]] daemon monitors all ZFS events. It is the thing that will report when a scrub failed, for example. See this configuration guide. Scrubs should be regularly scheduled to ensure consistency of the pool. This can be done in newer zfsutils-linux versions (bullseye-backports or bookworm) with one of those, depending on the desired frequency:
systemctl enable zfs-scrub-weekly@rpool.timer --now
systemctl enable zfs-scrub-monthly@rpool.timer --now
When the scrub runs, if it finds anything it will send an event which will get picked up by the zed daemon which will then send a notification, see below for an example. TODO: deploy on curie, if possible (probably not because no RAID) TODO: this should be in Puppet

Scrub warning example So what happens when problems are found? Here's an example of how I dealt with an error I received. After setting up another server (tubman) with ZFS, I eventually ended up getting a warning from the ZFS toolchain.
Date: Sun, 09 Oct 2022 00:58:08 -0400
From: root <root@anarc.at>
To: root@anarc.at
Subject: ZFS scrub_finish event for rpool on tubman
ZFS has finished a scrub:
   eid: 39536
 class: scrub_finish
  host: tubman
  time: 2022-10-09 00:58:07-0400
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:33:57 with 0 errors on Sun Oct  9 00:58:07 2022
config:
        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb4    ONLINE       0     1     0
            sdc4    ONLINE       0     0     0
        cache
          sda3      ONLINE       0     0     0
errors: No known data errors
This, in itself, is a little worrisome. But it helpfully links to this more detailed documentation (and props up there: the link still works) which explains this is a "minor" problem (something that could be included in the report). In this case, this happened on a server setup on 2021-04-28, but the disks and server hardware are much older. The server itself (marcos v1) was built around 2011, over 10 years ago now. The hard drive in question is:
root@tubman:~# smartctl -i -qnoserial /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-15-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 11 11:02:32 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Some more SMART stats:
root@tubman:~# smartctl -a -qnoserial /dev/sdb   grep -e  Head_Flying_Hours -e Power_On_Hours -e Total_LBA -e 'Sector Sizes'
Sector Sizes:     512 bytes logical, 4096 bytes physical
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       12464 (206 202 0)
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10966h+55m+23.757s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       21107792664
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3201579750
That's over a year of power on, which shouldn't be so bad. It has written about 10TB of data (21107792664 LBAs * 512 byte/LBA), which is about two full writes. According to its specification, this device is supposed to support 55 TB/year of writes, so we're far below spec. Note that are still far from the "non-recoverable read error per bits" spec (1 per 10E15), as we've basically read 13E12 bits (3201579750 LBAs * 512 byte/LBA = 13E12 bits). It's likely this disk was made in 2018, so it is in its fourth year. Interestingly, /dev/sdc is also a Seagate drive, but of a different series:
root@tubman:~# smartctl -qnoserial  -i /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-15-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 11 11:21:35 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
It has seen much more reads than the other disk which is also interesting:
root@tubman:~# smartctl -a -qnoserial /dev/sdc   grep -e  Head_Flying_Hours -e Power_On_Hours -e Total_LBA -e 'Sector Sizes'
Sector Sizes:     512 bytes logical, 4096 bytes physical
  9 Power_On_Hours          0x0032   059   059   000    Old_age   Always       -       36240
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       33994h+10m+52.118s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       30730174438
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       51894566538
That's 4 years of Head_Flying_Hours, and over 4 years (4 years and 48 days) of Power_On_Hours. The copyright date on that drive's specs goes back to 2016, so it's a much older drive. SMART self-test succeeded.

Remaining issues
  • TODO: move send/receive backups to offsite host, see also zfs for alternatives to syncoid/sanoid there
  • TODO: setup backup cron job (or timer?)
  • TODO: swap still not setup on curie, see zfs
  • TODO: document this somewhere: bpool and rpool are both pools and datasets. that's pretty confusing, but also very useful because it allows for pool-wide recursive snapshots, which are used for the backup system

fio improvements I really want to improve my experience with fio. Right now, I'm just cargo-culting stuff from other folks and I don't really like it. stressant is a good example of my struggles, in the sense that it doesn't really work that well for disk tests. I would love to have just a single .fio job file that lists multiple jobs to run serially. For example, this file describes the above workload pretty well:
[global]
# cargo-culting Salter
fallocate=none
ioengine=posixaio
runtime=60
time_based=1
end_fsync=1
stonewall=1
group_reporting=1
# no need to drop caches, done by default
# invalidate=1
# Single 4KiB random read/write process
[randread-4k-4g-1x]
rw=randread
bs=4k
size=4g
numjobs=1
iodepth=1
[randwrite-4k-4g-1x]
rw=randwrite
bs=4k
size=4g
numjobs=1
iodepth=1
# 16 parallel 64KiB random read/write processes:
[randread-64k-256m-16x]
rw=randread
bs=64k
size=256m
numjobs=16
iodepth=16
[randwrite-64k-256m-16x]
rw=randwrite
bs=64k
size=256m
numjobs=16
iodepth=16
# Single 1MiB random read/write process
[randread-1m-16g-1x]
rw=randread
bs=1m
size=16g
numjobs=1
iodepth=1
[randwrite-1m-16g-1x]
rw=randwrite
bs=1m
size=16g
numjobs=1
iodepth=1
... except the jobs are actually started in parallel, even though they are stonewall'd, as far as I can tell by the reports. I sent a mail to the fio mailing list for clarification. It looks like the jobs are started in parallel, but actual (correctly) run serially. It seems like this might just be a matter of reporting the right timestamps in the end, although it does feel like starting all the processes (even if not doing any work yet) could skew the results.

Hangs during procedure During the procedure, it happened a few times where any ZFS command would completely hang. It seems that using an external USB drive to sync stuff didn't work so well: sometimes it would reconnect under a different device (from sdc to sdd, for example), and this would greatly confuse ZFS. Here, for example, is sdd reappearing out of the blue:
May 19 11:22:53 curie kernel: [  699.820301] scsi host4: uas
May 19 11:22:53 curie kernel: [  699.820544] usb 2-1: authorized to connect
May 19 11:22:53 curie kernel: [  699.922433] scsi 4:0:0:0: Direct-Access     ROG      ESD-S1C          0    PQ: 0 ANSI: 6
May 19 11:22:53 curie kernel: [  699.923235] sd 4:0:0:0: Attached scsi generic sg2 type 0
May 19 11:22:53 curie kernel: [  699.923676] sd 4:0:0:0: [sdd] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
May 19 11:22:53 curie kernel: [  699.923788] sd 4:0:0:0: [sdd] Write Protect is off
May 19 11:22:53 curie kernel: [  699.923949] sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 19 11:22:53 curie kernel: [  699.924149] sd 4:0:0:0: [sdd] Optimal transfer size 33553920 bytes
May 19 11:22:53 curie kernel: [  699.961602]  sdd: sdd1 sdd2 sdd3 sdd4
May 19 11:22:53 curie kernel: [  699.996083] sd 4:0:0:0: [sdd] Attached SCSI disk
Next time I run a ZFS command (say zpool list), the command completely hangs (D state) and this comes up in the logs:
May 19 11:34:21 curie kernel: [ 1387.914843] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=71344128 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914859] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=205565952 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914874] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=272789504 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914906] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=270336 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.914932] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=1073225728 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.914948] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=1073487872 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.915165] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=272793600 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.915183] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=339853312 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.915648] WARNING: Pool 'bpool' has encountered an uncorrectable I/O failure and has been suspended.
May 19 11:34:21 curie kernel: [ 1387.915648] 
May 19 11:37:25 curie kernel: [ 1571.558614] task:txg_sync        state:D stack:    0 pid:  997 ppid:     2 flags:0x00004000
May 19 11:37:25 curie kernel: [ 1571.558623] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.558640]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.558650]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.558670]  schedule_timeout+0x8b/0x140
May 19 11:37:25 curie kernel: [ 1571.558675]  ? __next_timer_interrupt+0x110/0x110
May 19 11:37:25 curie kernel: [ 1571.558678]  io_schedule_timeout+0x4c/0x80
May 19 11:37:25 curie kernel: [ 1571.558689]  __cv_timedwait_common+0x12b/0x160 [spl]
May 19 11:37:25 curie kernel: [ 1571.558694]  ? add_wait_queue_exclusive+0x70/0x70
May 19 11:37:25 curie kernel: [ 1571.558702]  __cv_timedwait_io+0x15/0x20 [spl]
May 19 11:37:25 curie kernel: [ 1571.558816]  zio_wait+0x129/0x2b0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.558929]  dsl_pool_sync+0x461/0x4f0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559032]  spa_sync+0x575/0xfa0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559138]  ? spa_txg_history_init_io+0x101/0x110 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559245]  txg_sync_thread+0x2e0/0x4a0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559354]  ? txg_fini+0x240/0x240 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559366]  thread_generic_wrapper+0x6f/0x80 [spl]
May 19 11:37:25 curie kernel: [ 1571.559376]  ? __thread_exit+0x20/0x20 [spl]
May 19 11:37:25 curie kernel: [ 1571.559379]  kthread+0x11b/0x140
May 19 11:37:25 curie kernel: [ 1571.559382]  ? __kthread_bind_mask+0x60/0x60
May 19 11:37:25 curie kernel: [ 1571.559386]  ret_from_fork+0x22/0x30
May 19 11:37:25 curie kernel: [ 1571.559401] task:zed             state:D stack:    0 pid: 1564 ppid:     1 flags:0x00000000
May 19 11:37:25 curie kernel: [ 1571.559404] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.559409]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.559412]  ? __kmalloc_node+0x141/0x2b0
May 19 11:37:25 curie kernel: [ 1571.559417]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.559420]  schedule_preempt_disabled+0xa/0x10
May 19 11:37:25 curie kernel: [ 1571.559424]  __mutex_lock.constprop.0+0x133/0x460
May 19 11:37:25 curie kernel: [ 1571.559435]  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
May 19 11:37:25 curie kernel: [ 1571.559537]  spa_all_configs+0x41/0x120 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559644]  zfs_ioc_pool_configs+0x17/0x70 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559752]  zfsdev_ioctl_common+0x697/0x870 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559758]  ? _copy_from_user+0x28/0x60
May 19 11:37:25 curie kernel: [ 1571.559860]  zfsdev_ioctl+0x53/0xe0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559866]  __x64_sys_ioctl+0x83/0xb0
May 19 11:37:25 curie kernel: [ 1571.559869]  do_syscall_64+0x33/0x80
May 19 11:37:25 curie kernel: [ 1571.559873]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 19 11:37:25 curie kernel: [ 1571.559876] RIP: 0033:0x7fcf0ef32cc7
May 19 11:37:25 curie kernel: [ 1571.559878] RSP: 002b:00007fcf0e181618 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 19 11:37:25 curie kernel: [ 1571.559881] RAX: ffffffffffffffda RBX: 000055b212f972a0 RCX: 00007fcf0ef32cc7
May 19 11:37:25 curie kernel: [ 1571.559883] RDX: 00007fcf0e181640 RSI: 0000000000005a04 RDI: 000000000000000b
May 19 11:37:25 curie kernel: [ 1571.559885] RBP: 00007fcf0e184c30 R08: 00007fcf08016810 R09: 00007fcf08000080
May 19 11:37:25 curie kernel: [ 1571.559886] R10: 0000000000080000 R11: 0000000000000246 R12: 000055b212f972a0
May 19 11:37:25 curie kernel: [ 1571.559888] R13: 0000000000000000 R14: 00007fcf0e181640 R15: 0000000000000000
May 19 11:37:25 curie kernel: [ 1571.559980] task:zpool           state:D stack:    0 pid:11815 ppid:  3816 flags:0x00004000
May 19 11:37:25 curie kernel: [ 1571.559983] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.559988]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.559992]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.559995]  io_schedule+0x42/0x70
May 19 11:37:25 curie kernel: [ 1571.560004]  cv_wait_common+0xac/0x130 [spl]
May 19 11:37:25 curie kernel: [ 1571.560008]  ? add_wait_queue_exclusive+0x70/0x70
May 19 11:37:25 curie kernel: [ 1571.560118]  txg_wait_synced_impl+0xc9/0x110 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560223]  txg_wait_synced+0xc/0x40 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560325]  spa_export_common+0x4cd/0x590 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560430]  ? zfs_log_history+0x9c/0xf0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560537]  zfsdev_ioctl_common+0x697/0x870 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560543]  ? _copy_from_user+0x28/0x60
May 19 11:37:25 curie kernel: [ 1571.560644]  zfsdev_ioctl+0x53/0xe0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560649]  __x64_sys_ioctl+0x83/0xb0
May 19 11:37:25 curie kernel: [ 1571.560653]  do_syscall_64+0x33/0x80
May 19 11:37:25 curie kernel: [ 1571.560656]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 19 11:37:25 curie kernel: [ 1571.560659] RIP: 0033:0x7fdc23be2cc7
May 19 11:37:25 curie kernel: [ 1571.560661] RSP: 002b:00007ffc8c792478 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 19 11:37:25 curie kernel: [ 1571.560664] RAX: ffffffffffffffda RBX: 000055942ca49e20 RCX: 00007fdc23be2cc7
May 19 11:37:25 curie kernel: [ 1571.560666] RDX: 00007ffc8c792490 RSI: 0000000000005a03 RDI: 0000000000000003
May 19 11:37:25 curie kernel: [ 1571.560667] RBP: 00007ffc8c795e80 R08: 00000000ffffffff R09: 00007ffc8c792310
May 19 11:37:25 curie kernel: [ 1571.560669] R10: 000055942ca49e30 R11: 0000000000000246 R12: 00007ffc8c792490
May 19 11:37:25 curie kernel: [ 1571.560671] R13: 000055942ca49e30 R14: 000055942aed2c20 R15: 00007ffc8c795a40
Here's another example, where you see the USB controller bleeping out and back into existence:
mai 19 11:38:39 curie kernel: usb 2-1: USB disconnect, device number 2
mai 19 11:38:39 curie kernel: sd 4:0:0:0: [sdd] Synchronizing SCSI cache
mai 19 11:38:39 curie kernel: sd 4:0:0:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
mai 19 11:39:25 curie kernel: INFO: task zed:1564 blocked for more than 241 seconds.
mai 19 11:39:25 curie kernel:       Tainted: P          IOE     5.10.0-14-amd64 #1 Debian 5.10.113-1
mai 19 11:39:25 curie kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mai 19 11:39:25 curie kernel: task:zed             state:D stack:    0 pid: 1564 ppid:     1 flags:0x00000000
mai 19 11:39:25 curie kernel: Call Trace:
mai 19 11:39:25 curie kernel:  __schedule+0x282/0x870
mai 19 11:39:25 curie kernel:  ? __kmalloc_node+0x141/0x2b0
mai 19 11:39:25 curie kernel:  schedule+0x46/0xb0
mai 19 11:39:25 curie kernel:  schedule_preempt_disabled+0xa/0x10
mai 19 11:39:25 curie kernel:  __mutex_lock.constprop.0+0x133/0x460
mai 19 11:39:25 curie kernel:  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
mai 19 11:39:25 curie kernel:  spa_all_configs+0x41/0x120 [zfs]
mai 19 11:39:25 curie kernel:  zfs_ioc_pool_configs+0x17/0x70 [zfs]
mai 19 11:39:25 curie kernel:  zfsdev_ioctl_common+0x697/0x870 [zfs]
mai 19 11:39:25 curie kernel:  ? _copy_from_user+0x28/0x60
mai 19 11:39:25 curie kernel:  zfsdev_ioctl+0x53/0xe0 [zfs]
mai 19 11:39:25 curie kernel:  __x64_sys_ioctl+0x83/0xb0
mai 19 11:39:25 curie kernel:  do_syscall_64+0x33/0x80
mai 19 11:39:25 curie kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
mai 19 11:39:25 curie kernel: RIP: 0033:0x7fcf0ef32cc7
mai 19 11:39:25 curie kernel: RSP: 002b:00007fcf0e181618 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
mai 19 11:39:25 curie kernel: RAX: ffffffffffffffda RBX: 000055b212f972a0 RCX: 00007fcf0ef32cc7
mai 19 11:39:25 curie kernel: RDX: 00007fcf0e181640 RSI: 0000000000005a04 RDI: 000000000000000b
mai 19 11:39:25 curie kernel: RBP: 00007fcf0e184c30 R08: 00007fcf08016810 R09: 00007fcf08000080
mai 19 11:39:25 curie kernel: R10: 0000000000080000 R11: 0000000000000246 R12: 000055b212f972a0
mai 19 11:39:25 curie kernel: R13: 0000000000000000 R14: 00007fcf0e181640 R15: 0000000000000000
mai 19 11:39:25 curie kernel: INFO: task zpool:11815 blocked for more than 241 seconds.
mai 19 11:39:25 curie kernel:       Tainted: P          IOE     5.10.0-14-amd64 #1 Debian 5.10.113-1
mai 19 11:39:25 curie kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mai 19 11:39:25 curie kernel: task:zpool           state:D stack:    0 pid:11815 ppid:  2621 flags:0x00004004
mai 19 11:39:25 curie kernel: Call Trace:
mai 19 11:39:25 curie kernel:  __schedule+0x282/0x870
mai 19 11:39:25 curie kernel:  schedule+0x46/0xb0
mai 19 11:39:25 curie kernel:  io_schedule+0x42/0x70
mai 19 11:39:25 curie kernel:  cv_wait_common+0xac/0x130 [spl]
mai 19 11:39:25 curie kernel:  ? add_wait_queue_exclusive+0x70/0x70
mai 19 11:39:25 curie kernel:  txg_wait_synced_impl+0xc9/0x110 [zfs]
mai 19 11:39:25 curie kernel:  txg_wait_synced+0xc/0x40 [zfs]
mai 19 11:39:25 curie kernel:  spa_export_common+0x4cd/0x590 [zfs]
mai 19 11:39:25 curie kernel:  ? zfs_log_history+0x9c/0xf0 [zfs]
mai 19 11:39:25 curie kernel:  zfsdev_ioctl_common+0x697/0x870 [zfs]
mai 19 11:39:25 curie kernel:  ? _copy_from_user+0x28/0x60
mai 19 11:39:25 curie kernel:  zfsdev_ioctl+0x53/0xe0 [zfs]
mai 19 11:39:25 curie kernel:  __x64_sys_ioctl+0x83/0xb0
mai 19 11:39:25 curie kernel:  do_syscall_64+0x33/0x80
mai 19 11:39:25 curie kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
mai 19 11:39:25 curie kernel: RIP: 0033:0x7fdc23be2cc7
mai 19 11:39:25 curie kernel: RSP: 002b:00007ffc8c792478 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
mai 19 11:39:25 curie kernel: RAX: ffffffffffffffda RBX: 000055942ca49e20 RCX: 00007fdc23be2cc7
mai 19 11:39:25 curie kernel: RDX: 00007ffc8c792490 RSI: 0000000000005a03 RDI: 0000000000000003
mai 19 11:39:25 curie kernel: RBP: 00007ffc8c795e80 R08: 00000000ffffffff R09: 00007ffc8c792310
mai 19 11:39:25 curie kernel: R10: 000055942ca49e30 R11: 0000000000000246 R12: 00007ffc8c792490
mai 19 11:39:25 curie kernel: R13: 000055942ca49e30 R14: 000055942aed2c20 R15: 00007ffc8c795a40
I understand those are rather extreme conditions: I would fully expect the pool to stop working if the underlying drives disappear. What doesn't seem acceptable is that a command would completely hang like this.

References See the zfs documentation for more information about ZFS, and tubman for another installation and migration procedure.

3 October 2022

Thorsten Alteholz: My Debian Activities in September 2022

FTP master This month I accepted 226 and rejected 33 packages. The overall number of packages that got accepted was 232. All in all I addressed about 60 RM-bugs and either simply removed the package or added a moreinfo tag. In total I spent 5 hours for this task. Anyway, I have to repeat my comment from last month: please have a look at the removal page and check whether the created dak command is really what you wanted. It would also help if you check the reverse dependencies and write a comment whether they are important or can be ignored or also file a new bug for them. Each removal must have one bug! Debian LTS This was my ninety-ninth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. This month my all in all workload has been 14h. During that time I uploaded: I also started to work on frr. Last but not least I did some days of frontdesk duties and took care of issues on security-master. Debian ELTS This month was the fiftieth ELTS month. During my allocated time I uploaded: Last but not least I did some days of frontdesk duties. Debian Printing This month I uploaded new upstream versions or improved packaging of: Debian IoT This month I uploaded new upstream versions or improved packaging of: Debian Mobcom This month I started another upload session for new upstrea versions: Other stuff This month I uploaded new packages:

29 September 2022

Antoine Beaupr : Detecting manual (and optimizing large) package installs in Puppet

Well this is a mouthful. I recently worked on a neat hack called puppet-package-check. It is designed to warn about manually installed packages, to make sure "everything is in Puppet". But it turns out it can (probably?) dramatically decrease the bootstrap time of Puppet bootstrap when it needs to install a large number of packages.

Detecting manual packages On a cleanly filed workstation, it looks like this:
root@emma:/home/anarcat/bin# ./puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
0 unmanaged packages found
A messy workstation will look like this:
root@curie:/home/anarcat/bin# ./puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
288 unmanaged packages found
apparmor-utils beignet-opencl-icd bridge-utils clustershell cups-pk-helper davfs2 dconf-cli dconf-editor dconf-gsettings-backend ddccontrol ddrescueview debmake debootstrap decopy dict-devil dict-freedict-eng-fra dict-freedict-eng-spa dict-freedict-fra-eng dict-freedict-spa-eng diffoscope dnsdiag dropbear-initramfs ebtables efibootmgr elpa-lua-mode entr eog evince figlet file file-roller fio flac flex font-manager fonts-cantarell fonts-inconsolata fonts-ipafont-gothic fonts-ipafont-mincho fonts-liberation fonts-monoid fonts-monoid-tight fonts-noto fonts-powerline fonts-symbola freeipmi freetype2-demos ftp fwupd-amd64-signed gallery-dl gcc-arm-linux-gnueabihf gcolor3 gcp gdisk gdm3 gdu gedit gedit-plugins gettext-base git-debrebase gnome-boxes gnote gnupg2 golang-any golang-docker-credential-helpers golang-golang-x-tools grub-efi-amd64-signed gsettings-desktop-schemas gsfonts gstreamer1.0-libav gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-ugly gstreamer1.0-pulseaudio gtypist gvfs-backends hackrf hashcat html2text httpie httping hugo humanfriendly iamerican-huge ibus ibus-gtk3 ibus-libpinyin ibus-pinyin im-config imediff img2pdf imv initramfs-tools input-utils installation-birthday internetarchive ipmitool iptables iptraf-ng jackd2 jupyter jupyter-nbextension-jupyter-js-widgets jupyter-qtconsole k3b kbtin kdialog keditbookmarks keepassxc kexec-tools keyboard-configuration kfind konsole krb5-locales kwin-x11 leiningen lightdm lintian linux-image-amd64 linux-perf lmodern lsb-base lvm2 lynx lz4json magic-wormhole mailscripts mailutils manuskript mat2 mate-notification-daemon mate-themes mime-support mktorrent mp3splt mpdris2 msitools mtp-tools mtree-netbsd mupdf nautilus nautilus-sendto ncal nd ndisc6 neomutt net-tools nethogs nghttp2-client nocache npm2deb ntfs-3g ntpdate nvme-cli nwipe obs-studio okular-extra-backends openstack-clients openstack-pkg-tools paprefs pass-extension-audit pcmanfm pdf-presenter-console pdf2svg percol pipenv playerctl plymouth plymouth-themes popularity-contest progress prometheus-node-exporter psensor pubpaste pulseaudio python3-ldap qjackctl qpdfview qrencode r-cran-ggplot2 r-cran-reshape2 rake restic rhash rpl rpm2cpio rs ruby ruby-dev ruby-feedparser ruby-magic ruby-mocha ruby-ronn rygel-playbin rygel-tracker s-tui sanoid saytime scrcpy scrcpy-server screenfetch scrot sdate sddm seahorse shim-signed sigil smartmontools smem smplayer sng sound-juicer sound-theme-freedesktop spectre-meltdown-checker sq ssh-audit sshuttle stress-ng strongswan strongswan-swanctl syncthing system-config-printer system-config-printer-common system-config-printer-udev systemd-bootchart systemd-container tardiff task-desktop task-english task-ssh-server tasksel tellico texinfo texlive-fonts-extra texlive-lang-cyrillic texlive-lang-french texlive-lang-german texlive-lang-italian texlive-xetex tftp-hpa thunar-archive-plugin tidy tikzit tint2 tintin++ tipa tpm2-tools traceroute tree trocla ucf udisks2 unifont unrar-free upower usbguard uuid-runtime vagrant-cachier vagrant-libvirt virt-manager vmtouch vorbis-tools w3m wamerican wamerican-huge wfrench whipper whohas wireshark xapian-tools xclip xdg-user-dirs-gtk xlax xmlto xsensors xserver-xorg xsltproc xxd xz-utils yubioath-desktop zathura zathura-pdf-poppler zenity zfs-dkms zfs-initramfs zfsutils-linux zip zlib1g zlib1g-dev
157 old: apparmor-utils clustershell davfs2 dconf-cli dconf-editor ddccontrol ddrescueview decopy dnsdiag ebtables efibootmgr elpa-lua-mode entr figlet file-roller fio flac flex font-manager freetype2-demos ftp gallery-dl gcc-arm-linux-gnueabihf gcolor3 gcp gdu gedit git-debrebase gnote golang-docker-credential-helpers golang-golang-x-tools gtypist hackrf hashcat html2text httpie httping hugo humanfriendly iamerican-huge ibus ibus-pinyin imediff input-utils internetarchive ipmitool iptraf-ng jackd2 jupyter-qtconsole k3b kbtin kdialog keditbookmarks keepassxc kexec-tools kfind konsole leiningen lightdm lynx lz4json magic-wormhole manuskript mat2 mate-notification-daemon mktorrent mp3splt msitools mtp-tools mtree-netbsd nautilus nautilus-sendto nd ndisc6 neomutt net-tools nethogs nghttp2-client nocache ntpdate nwipe obs-studio openstack-pkg-tools paprefs pass-extension-audit pcmanfm pdf-presenter-console pdf2svg percol pipenv playerctl qjackctl qpdfview qrencode r-cran-ggplot2 r-cran-reshape2 rake restic rhash rpl rpm2cpio rs ruby-feedparser ruby-magic ruby-mocha ruby-ronn s-tui saytime scrcpy screenfetch scrot sdate seahorse shim-signed sigil smem smplayer sng sound-juicer spectre-meltdown-checker sq ssh-audit sshuttle stress-ng system-config-printer system-config-printer-common tardiff tasksel tellico texlive-lang-cyrillic texlive-lang-french tftp-hpa tikzit tint2 tintin++ tpm2-tools traceroute tree unrar-free vagrant-cachier vagrant-libvirt vmtouch vorbis-tools w3m wamerican wamerican-huge wfrench whipper whohas xdg-user-dirs-gtk xlax xmlto xsensors xxd yubioath-desktop zenity zip
131 new: beignet-opencl-icd bridge-utils cups-pk-helper dconf-gsettings-backend debmake debootstrap dict-devil dict-freedict-eng-fra dict-freedict-eng-spa dict-freedict-fra-eng dict-freedict-spa-eng diffoscope dropbear-initramfs eog evince file fonts-cantarell fonts-inconsolata fonts-ipafont-gothic fonts-ipafont-mincho fonts-liberation fonts-monoid fonts-monoid-tight fonts-noto fonts-powerline fonts-symbola freeipmi fwupd-amd64-signed gdisk gdm3 gedit-plugins gettext-base gnome-boxes gnupg2 golang-any grub-efi-amd64-signed gsettings-desktop-schemas gsfonts gstreamer1.0-libav gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-ugly gstreamer1.0-pulseaudio gvfs-backends ibus-gtk3 ibus-libpinyin im-config img2pdf imv initramfs-tools installation-birthday iptables jupyter jupyter-nbextension-jupyter-js-widgets keyboard-configuration krb5-locales kwin-x11 lintian linux-image-amd64 linux-perf lmodern lsb-base lvm2 mailscripts mailutils mate-themes mime-support mpdris2 mupdf ncal npm2deb ntfs-3g nvme-cli okular-extra-backends openstack-clients plymouth plymouth-themes popularity-contest progress prometheus-node-exporter psensor pubpaste pulseaudio python3-ldap ruby ruby-dev rygel-playbin rygel-tracker sanoid scrcpy-server sddm smartmontools sound-theme-freedesktop strongswan strongswan-swanctl syncthing system-config-printer-udev systemd-bootchart systemd-container task-desktop task-english task-ssh-server texinfo texlive-fonts-extra texlive-lang-german texlive-lang-italian texlive-xetex thunar-archive-plugin tidy tipa trocla ucf udisks2 unifont upower usbguard uuid-runtime virt-manager wireshark xapian-tools xclip xserver-xorg xsltproc xz-utils zathura zathura-pdf-poppler zfs-dkms zfs-initramfs zfsutils-linux zlib1g zlib1g-dev
Yuck! That's a lot of shit to go through. Notice how the packages get sorted between "old" and "new" packages. This is because popcon is used as a tool to mark which packages are "old". If you have unmanaged packages, the "old" ones are likely things that you can uninstall, for example. If you don't have popcon installed, you'll also get this warning:
popcon stats not available: [Errno 2] No such file or directory: '/var/log/popularity-contest'
The error can otherwise be safely ignored, but you won't get "help" prioritizing the packages to add to your manifests. Note that the tool ignores packages that were "marked" (see apt-mark(8)) as automatically installed. This implies that you might have to do a little bit of cleanup the first time you run this, as Debian doesn't necessarily mark all of those packages correctly on first install. For example, here's how it looks like on a clean install, after Puppet ran:
root@angela:/home/anarcat# ./bin/puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
127 unmanaged packages found
ca-certificates console-setup cryptsetup-initramfs dbus file gcc-12-base gettext-base grub-common grub-efi-amd64 i3lock initramfs-tools iw keyboard-configuration krb5-locales laptop-detect libacl1 libapparmor1 libapt-pkg6.0 libargon2-1 libattr1 libaudit-common libaudit1 libblkid1 libbpf0 libbsd0 libbz2-1.0 libc6 libcap-ng0 libcap2 libcap2-bin libcom-err2 libcrypt1 libcryptsetup12 libdb5.3 libdebconfclient0 libdevmapper1.02.1 libedit2 libelf1 libext2fs2 libfdisk1 libffi8 libgcc-s1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libgssapi-krb5-2 libhogweed6 libidn2-0 libip4tc2 libiw30 libjansson4 libjson-c5 libk5crypto3 libkeyutils1 libkmod2 libkrb5-3 libkrb5support0 liblocale-gettext-perl liblockfile-bin liblz4-1 liblzma5 libmd0 libmnl0 libmount1 libncurses6 libncursesw6 libnettle8 libnewt0.52 libnftables1 libnftnl11 libnl-3-200 libnl-genl-3-200 libnl-route-3-200 libnss-systemd libp11-kit0 libpam-systemd libpam0g libpcre2-8-0 libpcre3 libpcsclite1 libpopt0 libprocps8 libreadline8 libselinux1 libsemanage-common libsemanage2 libsepol2 libslang2 libsmartcols1 libss2 libssl1.1 libssl3 libstdc++6 libsystemd-shared libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl libtext-wrapi18n-perl libtinfo6 libtirpc-common libtirpc3 libudev1 libunistring2 libuuid1 libxtables12 libxxhash0 libzstd1 linux-image-amd64 logsave lsb-base lvm2 media-types mlocate ncurses-term pass-extension-otp puppet python3-reportbug shim-signed tasksel ucf usr-is-merged util-linux-extra wpasupplicant xorg zlib1g
popcon stats not available: [Errno 2] No such file or directory: '/var/log/popularity-contest'
Normally, there should be unmanaged packages here. But because of the way Debian is installed, a lot of libraries and some core packages are marked as manually installed, and are of course not managed through Puppet. There are two solutions to this problem:
  • really manage everything in Puppet (argh)
  • mark packages as automatically installed
I typically chose the second path and mark a ton of stuff as automatic. Then either they will be auto-removed, or will stop being listed. In the above scenario, one could mark all libraries as automatically installed with:
apt-mark auto $(./bin/puppet-package-check   grep -o 'lib[^ ]*')
... but if you trust that most of that stuff is actually garbage that you don't really want installed anyways, you could just mark it all as automatically installed:
apt-mark auto $(./bin/puppet-package-check)
In my case, that ended up keeping basically all libraries (because of course they're installed for some reason) and auto-removing this:
dh-dkms discover-data dkms libdiscover2 libjsoncpp25 libssl1.1 linux-headers-amd64 mlocate pass-extension-otp pass-otp plocate x11-apps x11-session-utils xinit xorg
You'll notice xorg in there: yep, that's bad. Not what I wanted. But for some reason, on other workstations, I did not actually have xorg installed. Turns out having xserver-xorg is enough, and that one has dependencies. So now I guess I just learned to stop worrying and live without X(org).

Optimizing large package installs But that, of course, is not all. Why make things simple when you can have an unreadable title that is trying to be both syntactically correct and click-baity enough to flatter my vain ego? Right. One of the challenges in bootstrapping Puppet with large package lists is that it's slow. Puppet lists packages as individual resources and will basically run apt install $PKG on every package in the manifest, one at a time. While the overhead of apt is generally small, when you add things like apt-listbugs, apt-listchanges, needrestart, triggers and so on, it can take forever setting up a new host. So for initial installs, it can actually makes sense to skip the queue and just install everything in one big batch. And because the above tool inspects the packages installed by Puppet, you can run it against a catalog and have a full lists of all the packages Puppet would install, even before I even had Puppet running. So when reinstalling my laptop, I basically did this:
apt install puppet-agent/experimental
puppet agent --test --noop
apt install $(./puppet-package-check --debug \
    2>&1   grep ^puppet\ packages 
      sed 's/puppet packages://;s/ /\n/g'
      grep -v -e onionshare -e golint -e git-sizer -e github-backup -e hledger -e xsane -e audacity -e chirp -e elpa-flycheck -e elpa-lsp-ui -e yubikey-manager -e git-annex -e hopenpgp-tools -e puppet
) puppet-agent/experimental
That massive grep was because there are currently a lot of packages missing from bookworm. Those are all packages that I have in my catalog but that still haven't made it to bookworm. Sad, I know. I eventually worked around that by adding bullseye sources so that the Puppet manifest actually ran. The point here is that this improves the Puppet run time a lot. All packages get installed at once, and you get a nice progress bar. Then you actually run Puppet to deploy configurations and all the other goodies:
puppet agent --test
I wish I could tell you how much faster that ran. I don't know, and I will not go through a full reinstall just to please your curiosity. The only hard number I have is that it installed 444 packages (which exploded in 10,191 packages with dependencies) in a mere 10 minutes. That might also be with the packages already downloaded. In any case, I have that gut feeling it's faster, so you'll have to just trust my gut. It is, after all, much more important than you might think.

Similar work The blueprint system is something similar to this:
It figures out what you ve done manually, stores it locally in a Git repository, generates code that s able to recreate your efforts, and helps you deploy those changes to production
That tool has unfortunately been abandoned for a decade at this point. Also note that the AutoRemove::RecommendsImportant and AutoRemove::SuggestsImportant are relevant here. If it is set to true (the default), a package will not be removed if it is (respectively) a Recommends or Suggests of another package (as opposed to the normal Depends). In other words, if you want to also auto-remove packages that are only Suggests, you would, for example, add this to apt.conf:
AutoRemove::SuggestsImportant false;
Paul Wise has tried to make the Debian installer and debootstrap properly mark packages as automatically installed in the past, but his bug reports were rejected. The other suggestions in this section are also from Paul, thanks!

11 September 2022

Shirish Agarwal: Politics, accessibility, books

Politics I have been reading books, both fiction and non-fiction for a long long time. My first book was a comic most probably when I was down with Malaria when I was a kid. I must be around 4-5 years old. Over the years, books have given me great joy and I continue to find nuggets of useful information, both in fiction as well as non-fiction books. So here s to sharing something and how that can lead you to a rabbit hole. This entry would be a bit NSFW as far as language is concerned. NYPD Red 5 by James Patterson First of all, have no clue as to why James Patterson s popularity has been falling. He used to be right there with Lee Child and others, but not so much now. While I try to be mysterious about books, I would give a bit of heads-up so people know what to expect. This is probably more towards the Adult crowd as there is a bit of sex as well as quite a few grey characters. The NYPD Red is a sort of elite police task force that basically is for celebrities. In the book series, they do a lot of ass-kissing (figuratively more than literally). Now the reason I have always liked fiction is that however wild the assumption or presumption is, it does have somewhere a grain of truth. And each and every time I read a book or two, that gets cemented. One of the statements in the book told something about how 9/11 took a lot of police personnel out of the game. First, there were a number of policemen who were patrolling the Two Towers, so they perished literally during the explosion. Then there were policemen who were given the cases to close the cases (bring the cases to conclusion). When you are investigating your own brethren or even civilians who perished 9/11 they must have experienced emotional trauma and no outlet. Mental health even in cops is the same and given similar help as you and me (i.e. next to none.) But both of these were my assumptions. The only statement that was in the book was they lost a lot of bench strength. Even NYFD (New York Fire Department). This led me to me to With Crime At Record Lows, Should NYC Have Fewer Cops? This is more right-wing sentiment and in fact, there have been calls to defund the police. This led me to https://cbcny.org/ and one specific graph. Unfortunately, this tells the story from 2010-2022 but not before. I was looking for data from around 1999 to 2005 because that will tell whether or not it happened. Then I remembered reading in newspapers the year or two later how 9/11 had led NYC to recession. I looked up online and for sure NY was booming before 9/11. One can argue that NYC could come down and that is pretty much possible, everything that goes up comes down, it s a law of nature but it would have been steady rather than abrupt. And once you are in recession, the first thing to go is personnel. So people both from NYPD and NYFD were let go, even though they were needed the most then. As you can see, a single statement in a book can take you to places & time literally. Edit: Addition 11th September There were quite a few people who also died from New York Port Authority and they also lost quite a number of people directly and indirectly and did a lot of patrolling of the water bodies near NYC. Later on, even in their department, there were a lot of early retirements.

Kosovo A couple of days back I had a look at the Debconf 2023 BOF that was done in Kosovo. One of the interesting things that happened during the BOF is when a woman participant chimed in and asks India to recognize Kosovo. Immediately it triggered me and I opened the Kosovo Wikipedia page to get some understanding of the topic. Reading up on it, came to know Russia didn t agree and doesn t recognize Kosovo. Mr. Modi likes Putin and India imports a lot of its oil from Russia. Unrelatedly, but still useful, we rejected to join IPEF. Earlier, we had rejected China s BRI. India has never been as vulnerable as she is now. Our foreign balance has reached record lows. Now India has been importing quite a bit of Russian crude and has been buying arms and ammunition from them. We are also scheduled to buy a couple of warships and submarines etc. We even took arms and ammunition from them on lease. So we can t afford that they are displeased with India. Even though Russia has more than friendly relations with both China and Pakistan. At the same time, the U.S. is back to aiding Pakistan which the mainstream media in India refuses to even cover. And to top all of this, we have the Chip 4 Alliance but that needs its own article, truth be told but we will do with a paragraph  Edit Addition 11th September Seems Kosovo isn t unique in that situation, there are 3-4 states like that. A brief look at worldpopulationreview tells you there are many more.

Chip 4 Alliance For almost a decade I have been screaming about this on my blog as well as everywhere that chip fabrication is a national security thing. And for years, most people deny it. And now we have chip 4 alliance. Now to understand this, you have to understand that China for almost a decade, somewhere around 2014 or so came up with something called the big fund . Now one can argue one way or the other how successful the fund has been, but it has, without doubt, created ripples so strong that the U.S., Taiwan, Japan, and probably South Korea will join and try to stem the tide. Interestingly, in this grouping, South Korea is the weakest in the statements and what they have been saying. Within the group itself, there is a lot of tension and China would use that and there are a number of unresolved issues between the three countries that both China & Russia would exploit. For e.g. the Comfort women between South Korea and Japan. Or the 1985 Accord Agreement between Japan and the U.S. Now people need to understand this, this is not just about China but also about us. If China has 5-6x times India s GDP and their research budget is at the very least 100x times what India spends, how do you think we will be self-reliant? Whom are we fooling? Are we not tired of fooling ourselves  In diplomacy, countries use leverage. Sadly, we let go of some of our most experienced negotiators in 2014 and since then have been singing in the wind

Accessibility, Jitsi, IRC, Element-Desktop The Wikipedia page on Accessibility says the following Accessibility is the design of products, devices, services, vehicles, or environments so as to be usable by people with disabilities. The concept of accessible design and practice of accessible development ensures both direct access (i.e. unassisted) and indirect access meaning compatibility with a person s assistive technology. Now IRC or Internet Relay Chat has been accessible for a long time. I know of even blind people who have been able to navigate IRC quite effortlessly as there has been a lot of work done to make sure all the joints speak to each other so people with one or more disabilities still can use, and contribute without an issue. It does help that IRC and many clients have been there since the 1970s so most of them have had more than enough time to get all the bugs fixed and both text-to-speech and speech-to-text work brilliantly on IRC. Newer software like Jitsi or for that matter Telegram is lacking those features. A few days ago, discovered on Telegram I was shared that Samsung Voice input is also able to do the same. The Samsung Voice Input works wonder as it translates voice to text, I have not yet tried the text-to-speech but perhaps somebody can and they can share whatever the results can be one way or the other. I have tried element-desktop both on the desktop as well as mobile phone and it has been disappointing, to say the least. On the desktop, it is unruly and freezes once in a while, and is buggy. The mobile version is a little better but that s not saying a lot. I prefer the desktop version as I can use the full-size keyboard. The bug I reported has been there since its Riot days. I had put up a bug report even then. All in all, yesterday was disappointing

26 August 2022

Antoine Beaupr : How to nationalize the internet in Canada

Rogers had a catastrophic failure in July 2022. It affected emergency services (as in: people couldn't call 911, but also some 911 services themselves failed), hospitals (which couldn't access prescriptions), banks and payment systems (as payment terminals stopped working), and regular users as well. The outage lasted almost a full day, and Rogers took days to give any technical explanation on the outage, and even when they did, details were sparse. So far the only detailed account is from outside actors like Cloudflare which seem to point at an internal BGP failure. Its impact on the economy has yet to be measured, but it probably cost millions of dollars in wasted time and possibly lead to life-threatening situations. Apart from holding Rogers (criminally?) responsible for this, what should be done in the future to avoid such problems? It's not the first time something like this has happened: it happened to Bell Canada as well. The Rogers outage is also strangely similar to the Facebook outage last year, but, to its credit, Facebook did post a fairly detailed explanation only a day later. The internet is designed to be decentralised, and having large companies like Rogers hold so much power is a crucial mistake that should be reverted. The question is how. Some critics were quick to point out that we need more ISP diversity and competition, but I think that's missing the point. Others have suggested that the internet should be a public good or even straight out nationalized. I believe the solution to the problem of large, private, centralised telcos and ISPs is to replace them with smaller, public, decentralised service providers. The only way to ensure that works is to make sure that public money ends up creating infrastructure controlled by the public, which means treating ISPs as a public utility. This has been implemented elsewhere: it works, it's cheaper, and provides better service.

A modest proposal Global wireless services (like phone services) and home internet inevitably grow into monopolies. They are public utilities, just like water, power, railways, and roads. The question of how they should be managed is therefore inherently political, yet people don't seem to question the idea that only the market (i.e. "competition") can solve this problem. I disagree. 10 years ago (in french), I suggested we, in Qu bec, should nationalize large telcos and internet service providers. I no longer believe is a realistic approach: most of those companies have crap copper-based networks (at least for the last mile), yet are worth billions of dollars. It would be prohibitive, and a waste, to buy them out. Back then, I called this idea "R seau-Qu bec", a reference to the already nationalized power company, Hydro-Qu bec. (This idea, incidentally, made it into the plan of a political party.) Now, I think we should instead build our own, public internet. Start setting up municipal internet services, fiber to the home in all cities, progressively. Then interconnect cities with fiber, and build peering agreements with other providers. This also includes a bid on wireless spectrum to start competing with phone providers as well. And while that sounds really ambitious, I think it's possible to take this one step at a time.

Municipal broadband In many parts of the world, municipal broadband is an elegant solution to the problem, with solutions ranging from Stockholm's city-owned fiber network (dark fiber, layer 1) to Utah's UTOPIA network (fiber to the premises, layer 2) and municipal wireless networks like Guifi.net which connects about 40,000 nodes in Catalonia. A good first step would be for cities to start providing broadband services to its residents, directly. Cities normally own sewage and water systems that interconnect most residences and therefore have direct physical access everywhere. In Montr al, in particular, there is an ongoing project to replace a lot of old lead-based plumbing which would give an opportunity to lay down a wired fiber network across the city. This is a wild guess, but I suspect this would be much less expensive than one would think. Some people agree with me and quote this as low as 1000$ per household. There is about 800,000 households in the city of Montr al, so we're talking about a 800 million dollars investment here, to connect every household in Montr al with fiber and incidentally a quarter of the province's population. And this is not an up-front cost: this can be built progressively, with expenses amortized over many years. (We should not, however, connect Montr al first: it's used as an example here because it's a large number of households to connect.) Such a network should be built with a redundant topology. I leave it as an open question whether we should adopt Stockholm's more minimalist approach or provide direct IP connectivity. I would tend to favor the latter, because then you can immediately start to offer the service to households and generate revenues to compensate for the capital expenditures. Given the ridiculous profit margins telcos currently have 8 billion $CAD net income for BCE (2019), 2 billion $CAD for Rogers (2020) I also believe this would actually turn into a profitable revenue stream for the city, the same way Hydro-Qu bec is more and more considered as a revenue stream for the state. (I personally believe that's actually wrong and we should treat those resources as human rights and not money cows, but I digress. The point is: this is not a cost point, it's a revenue.) The other major challenge here is that the city will need competent engineers to drive this project forward. But this is not different from the way other public utilities run: we have electrical engineers at Hydro, sewer and water engineers at the city, this is just another profession. If anything, the computing science sector might be more at fault than the city here in its failure to provide competent and accountable engineers to society... Right now, most of the network in Canada is copper: we are hitting the limits of that technology with DSL, and while cable has some life left to it (DOCSIS 4.0 does 4Gbps), that is nowhere near the capacity of fiber. Take the town of Chattanooga, Tennessee: in 2010, the city-owned ISP EPB finished deploying a fiber network to the entire town and provided gigabit internet to everyone. Now, 12 years later, they are using this same network to provide the mind-boggling speed of 25 gigabit to the home. To give you an idea, Chattanooga is roughly the size and density of Sherbrooke.

Provincial public internet As part of building a municipal network, the question of getting access to "the internet" will immediately come up. Naturally, this will first be solved by using already existing commercial providers to hook up residents to the rest of the global network. But eventually, networks should inter-connect: Montr al should connect with Laval, and then Trois-Rivi res, then Qu bec City. This will require long haul fiber runs, but those links are not actually that expensive, and many of those already exist as a public resource at RISQ and CANARIE, which cross-connects universities and colleges across the province and the country. Those networks might not have the capacity to cover the needs of the entire province right now, but that is a router upgrade away, thanks to the amazing capacity of fiber. There are two crucial mistakes to avoid at this point. First, the network needs to remain decentralised. Long haul links should be IP links with BGP sessions, and each city (or MRC) should have its own independent network, to avoid Rogers-class catastrophic failures. Second, skill needs to remain in-house: RISQ has already made that mistake, to a certain extent, by selling its neutral datacenter. Tellingly, MetroOptic, probably the largest commercial dark fiber provider in the province, now operates the QIX, the second largest "public" internet exchange in Canada. Still, we have a lot of infrastructure we can leverage here. If RISQ or CANARIE cannot be up to the task, Hydro-Qu bec has power lines running into every house in the province, with high voltage power lines running hundreds of kilometers far north. The logistics of long distance maintenance are already solved by that institution. In fact, Hydro already has fiber all over the province, but it is a private network, separate from the internet for security reasons (and that should probably remain so). But this only shows they already have the expertise to lay down fiber: they would just need to lay down a parallel network to the existing one. In that architecture, Hydro would be a "dark fiber" provider.

International public internet None of the above solves the problem for the entire population of Qu bec, which is notoriously dispersed, with an area three times the size of France, but with only an eight of its population (8 million vs 67). More specifically, Canada was originally a french colony, a land violently stolen from native people who have lived here for thousands of years. Some of those people now live in reservations, sometimes far from urban centers (but definitely not always). So the idea of leveraging the Hydro-Qu bec infrastructure doesn't always work to solve this, because while Hydro will happily flood a traditional hunting territory for an electric dam, they don't bother running power lines to the village they forcibly moved, powering it instead with noisy and polluting diesel generators. So before giving me fiber to the home, we should give power (and potable water, for that matter), to those communities first. So we need to discuss international connectivity. (How else could we consider those communities than peer nations anyways?c) Qu bec has virtually zero international links. Even in Montr al, which likes to style itself a major player in gaming, AI, and technology, most peering goes through either Toronto or New York. That's a problem that we must fix, regardless of the other problems stated here. Looking at the submarine cable map, we see very few international links actually landing in Canada. There is the Greenland connect which connects Newfoundland to Iceland through Greenland. There's the EXA which lands in Ireland, the UK and the US, and Google has the Topaz link on the west coast. That's about it, and none of those land anywhere near any major urban center in Qu bec. We should have a cable running from France up to Saint-F licien. There should be a cable from Vancouver to China. Heck, there should be a fiber cable running all the way from the end of the great lakes through Qu bec, then up around the northern passage and back down to British Columbia. Those cables are expensive, and the idea might sound ludicrous, but Russia is actually planning such a project for 2026. The US has cables running all the way up (and around!) Alaska, neatly bypassing all of Canada in the process. We just look ridiculous on that map. (Addendum: I somehow forgot to talk about Teleglobe here was founded as publicly owned company in 1950, growing international phone and (later) data links all over the world. It was privatized by the conservatives in 1984, along with rails and other "crown corporations". So that's one major risk to any effort to make public utilities work properly: some government might be elected and promptly sell it out to its friends for peanuts.)

Wireless networks I know most people will have rolled their eyes so far back their heads have exploded. But I'm not done yet. I want wireless too. And by wireless, I don't mean a bunch of geeks setting up OpenWRT routers on rooftops. I tried that, and while it was fun and educational, it didn't scale. A public networking utility wouldn't be complete without providing cellular phone service. This involves bidding for frequencies at the federal level, and deploying a rather large amount of infrastructure, but it could be a later phase, when the engineers and politicians have proven their worth. At least part of the Rogers fiasco would have been averted if such a decentralized network backend existed. One might even want to argue that a separate institution should be setup to provide phone services, independently from the regular wired networking, if only for reliability. Because remember here: the problem we're trying to solve is not just technical, it's about political boundaries, centralisation, and automation. If everything is ran by this one organisation again, we will have failed. However, I must admit that phone services is where my ideas fall a little short. I can't help but think it's also an accessible goal maybe starting with a virtual operator but it seems slightly less so than the others, especially considering how closed the phone ecosystem is.

Counter points In debating these ideas while writing this article, the following objections came up.

I don't want the state to control my internet One legitimate concern I have about the idea of the state running the internet is the potential it would have to censor or control the content running over the wires. But I don't think there is necessarily a direct relationship between resource ownership and control of content. Sure, China has strong censorship in place, partly implemented through state-controlled businesses. But Russia also has strong censorship in place, based on regulatory tools: they force private service providers to install back-doors in their networks to control content and surveil their users. Besides, the USA have been doing warrantless wiretapping since at least 2003 (and yes, that's 10 years before the Snowden revelations) so a commercial internet is no assurance that we have a free internet. Quite the contrary in fact: if anything, the commercial internet goes hand in hand with the neo-colonial internet, just like businesses did in the "good old colonial days". Large media companies are the primary censors of content here. In Canada, the media cartel requested the first site-blocking order in 2018. The plaintiffs (including Qu becor, Rogers, and Bell Canada) are both content providers and internet service providers, an obvious conflict of interest. Nevertheless, there are some strong arguments against having a centralised, state-owned monopoly on internet service providers. FDN makes a good point on this. But this is not what I am suggesting: at the provincial level, the network would be purely physical, and regional entities (which could include private companies) would peer over that physical network, ensuring decentralization. Delegating the management of that infrastructure to an independent non-profit or cooperative (but owned by the state) would also ensure some level of independence.

Isn't the government incompetent and corrupt? Also known as "private enterprise is better skilled at handling this, the state can't do anything right" I don't think this is a "fait accomplit". If anything, I have found publicly ran utilities to be spectacularly reliable here. I rarely have trouble with sewage, water, or power, and keep in mind I live in a city where we receive about 2 meters of snow a year, which tend to create lots of trouble with power lines. Unless there's a major weather event, power just runs here. I think the same can happen with an internet service provider. But it would certainly need to have higher standards to what we're used to, because frankly Internet is kind of janky.

A single monopoly will be less reliable I actually agree with that, but that is not what I am proposing anyways. Current commercial or non-profit entities will be free to offer their services on top of the public network. And besides, the current "ha! diversity is great" approach is exactly what we have now, and it's not working. The pretense that we can have competition over a single network is what led the US into the ridiculous situation where they also pretend to have competition over the power utility market. This led to massive forest fires in California and major power outages in Texas. It doesn't work.

Wouldn't this create an isolated network? One theory is that this new network would be so hostile to incumbent telcos and ISPs that they would simply refuse to network with the public utility. And while it is true that the telcos currently do also act as a kind of "tier one" provider in some places, I strongly feel this is also a problem that needs to be solved, regardless of ownership of networking infrastructure. Right now, telcos often hold both ends of the stick: they are the gateway to users, the "last mile", but they also provide peering to the larger internet in some locations. In at least one datacenter in downtown Montr al, I've seen traffic go through Bell Canada that was not directly targeted at Bell customers. So in effect, they are in a position of charging twice for the same traffic, and that's not only ridiculous, it should just be plain illegal. And besides, this is not a big problem: there are other providers out there. As bad as the market is in Qu bec, there is still some diversity in Tier one providers that could allow for some exits to the wider network (e.g. yes, Cogent is here too).

What about Google and Facebook? Nationalization of other service providers like Google and Facebook is out of scope of this discussion. That said, I am not sure the state should get into the business of organising the web or providing content services however, but I will point out it already does do some of that through its own websites. It should probably keep itself to this, and also consider providing normal services for people who don't or can't access the internet. (And I would also be ready to argue that Google and Facebook already act as extensions of the state: certainly if Facebook didn't exist, the CIA or the NSA would like to create it at this point. And Google has lucrative business with the US department of defense.)

What does not work So we've seen one thing that could work. Maybe it's too expensive. Maybe the political will isn't there. Maybe it will fail. We don't know yet. But we know what does not work, and it's what we've been doing ever since the internet has gone commercial.

Subsidies The absurd price we pay for data does not actually mean everyone gets high speed internet at home. Large swathes of the Qu bec countryside don't get broadband at all, and it can be difficult or expensive, even in large urban centers like Montr al, to get high speed internet. That is despite having a series of subsidies that all avoided investing in our own infrastructure. We had the "fonds de l'autoroute de l'information", "information highway fund" (site dead since 2003, archive.org link) and "branchez les familles", "connecting families" (site dead since 2003, archive.org link) which subsidized the development of a copper network. In 2014, more of the same: the federal government poured hundreds of millions of dollars into a program called connecting Canadians to connect 280 000 households to "high speed internet". And now, the federal and provincial governments are proudly announcing that "everyone is now connected to high speed internet", after pouring more than 1.1 billion dollars to connect, guess what, another 380 000 homes, right in time for the provincial election. Of course, technically, the deadline won't actually be met until 2023. Qu bec is a big area to cover, and you can guess what happens next: the telcos threw up their hand and said some areas just can't be connected. (Or they connect their CEO but not the poor folks across the lake.) The story then takes the predictable twist of giving more money out to billionaires, subsidizing now Musk's Starlink system to connect those remote areas. To give a concrete example: a friend who lives about 1000km away from Montr al, 4km from a small, 2500 habitant village, has recently got symmetric 100 mbps fiber at home from Telus, thanks to those subsidies. But I can't get that service in Montr al at all, presumably because Telus and Bell colluded to split that market. Bell doesn't provide me with such a service either: they tell me they have "fiber to my neighborhood", and only offer me a 25/10 mbps ADSL service. (There is Vid otron offering 400mbps, but that's copper cable, again a dead technology, and asymmetric.)

Conclusion Remember Chattanooga? Back in 2010, they funded the development of a fiber network, and now they have deployed a network roughly a thousand times faster than what we have just funded with a billion dollars. In 2010, I was paying Bell Canada 60$/mth for 20mbps and a 125GB cap, and now, I'm still (indirectly) paying Bell for roughly the same speed (25mbps). Back then, Bell was throttling their competitors networks until 2009, when they were forced by the CRTC to stop throttling. Both Bell and Vid otron still explicitly forbid you from running your own servers at home, Vid otron charges prohibitive prices which make it near impossible for resellers to sell uncapped services. Those companies are not spurring innovation: they are blocking it. We have spent all this money for the private sector to build us a private internet, over decades, without any assurance of quality, equity or reliability. And while in some locations, ISPs did deploy fiber to the home, they certainly didn't upgrade their entire network to follow suit, and even less allowed resellers to compete on that network. In 10 years, when 100mbps will be laughable, I bet those service providers will again punt the ball in the public courtyard and tell us they don't have the money to upgrade everyone's equipment. We got screwed. It's time to try something new.

Updates There was a discussion about this article on Hacker News which was surprisingly productive. Trigger warning: Hacker News is kind of right-wing, in case you didn't know. Since this article was written, at least two more major acquisitions happened, just in Qu bec: In the latter case, vMedia was explicitly saying it couldn't grow because of "lack of access to capital". So basically, we have given those companies a billion dollars, and they are not using that very money to buy out their competition. At least we could have given that money to small players to even out the playing field. But this is not how that works at all. Also, in a bizarre twist, an "analyst" believes the acquisition is likely to help Rogers acquire Shaw. Also, since this article was written, the Washington Post published a review of a book bringing similar ideas: Internet for the People The Fight for Our Digital Future, by Ben Tarnoff, at Verso books. It's short, but even more ambitious than what I am suggesting in this article, arguing that all big tech companies should be broken up and better regulated:
He pulls from Ethan Zuckerman s idea of a web that is plural in purpose that just as pool halls, libraries and churches each have different norms, purposes and designs, so too should different places on the internet. To achieve this, Tarnoff wants governments to pass laws that would make the big platforms unprofitable and, in their place, fund small-scale, local experiments in social media design. Instead of having platforms ruled by engagement-maximizing algorithms, Tarnoff imagines public platforms run by local librarians that include content from public media.
(Links mine: the Washington Post obviously prefers to not link to the real web, and instead doesn't link to Zuckerman's site all and suggests Amazon for the book, in a cynical example.) And in another example of how the private sector has failed us, there was recently a fluke in the AMBER alert system where the entire province was warned about a loose shooter in Saint-Elz ar except the people in the town, because they have spotty cell phone coverage. In other words, millions of people received a strongly toned, "life-threatening", alert for a city sometimes hours away, except the people most vulnerable to the alert. Not missing a beat, the CAQ party is promising more of the same medicine again and giving more money to telcos to fix the problem, suggesting to spend three billion dollars in private infrastructure.

31 July 2022

Paul Wise: FLOSS Activities July 2022

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration
  • Debian BTS: unarchive/reopen/triage bugs for reintroduced packages
  • Debian servers: check full disks, ping users of excessive disk usage, restart a hung service
  • Debian wiki: approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The SPTAG, SIMDEverywhere, cwidget, aptitude, tldextract work was sponsored. All other work was done on a volunteer basis.

Next.