Search Results: "mkd"

25 April 2024

Petter Reinholdtsen: 45 orphaned Debian packages moved to git, 391 to go

Nine days ago, I started migrating orphaned Debian packages with no version control system listed in debian/control of the source to git. At the time there were 438 such packages. Now there are 391, according to the UDD. In reality it is slightly less, as there is a delay between uploads and UDD updates. In the nine days since, I have thus been able to work my way through ten percent of the packages. I am starting to run out of steam, and hope someone else will also help brushing some dust of these packages. Here is a recipe how to do it. I start by picking a random package by querying the UDD for a list of 10 random packages from the set of remaining packages:

PGPASSWORD="udd-mirror" psql --port=5432 --host=udd-mirror.debian.net \
  --username=udd-mirror udd -c "select source from sources \
   where release = 'sid' and (vcs_url ilike '%anonscm.debian.org%' \
   OR vcs_browser ilike '%anonscm.debian.org%' or vcs_url IS NULL \
   OR vcs_browser IS NULL) AND maintainer ilike '%packages@qa.debian.org%' \
   order by random() limit 10;"

Next, I visit http://salsa.debian.org/debian and search for the package name, to ensure no git repository already exist. If it does, I clone it and try to get it to an uploadable state, and add the Vcs-* entries in d/control to make the repository more widely known. These packages are a minority, so I will not cover that use case here. For packages without an existing git repository, I run the following script debian-snap-to-salsa to prepare a git repository with the existing packaging.

#!/bin/sh
#
# See also https://bugs.debian.org/804722#31
set -e
# Move to this Standards-Version.
SV_LATEST=4.7.0
PKG="$1"
if [ -z "$PKG" ]; then
    echo "usage: $0 "
    exit 1
fi
if [ -e "$ PKG -salsa" ]; then
    echo "error: $ PKG -salsa already exist, aborting."
    exit 1
fi
if [ -z "ALLOWFAILURE" ] ; then
    ALLOWFAILURE=false
fi
# Fetch every snapshotted source package.  Manually loop until all
# transfers succeed, as 'gbp import-dscs --debsnap' do not fail on
# download failures.
until debsnap --force -v $PKG   $ALLOWFAILURE ; do sleep 1; done
mkdir $ PKG -salsa; cd $ PKG -salsa
git init
# Specify branches to override any debian/gbp.conf file present in the
# source package.
gbp import-dscs  --debian-branch=master --upstream-branch=upstream \
    --pristine-tar ../source-$PKG/*.dsc
# Add Vcs pointing to Salsa Debian project (must be manually created
# and pushed to).
if ! grep -q ^Vcs- debian/control ; then
    awk "BEGIN   s=1   /^\$/   if (s==1)   print \"Vcs-Browser: https://salsa.debian.org/debian/$PKG\"; print \"Vcs-Git: https://salsa.debian.org/debian/$PKG.git\"  ; s=0     print  " < debian/control > debian/control.new && mv debian/control.new debian/control
    git commit -m "Updated vcs in d/control to Salsa." debian/control
fi
# Tell gbp to enforce the use of pristine-tar.
inifile +inifile debian/gbp.conf +create +section DEFAULT +key pristine-tar +value True
git add debian/gbp.conf
git commit -m "Added d/gbp.conf to enforce the use of pristine-tar." debian/gbp.conf
# Update to latest Standards-Version.
SV="$(grep ^Standards-Version: debian/control awk ' print $2 ')"
if [ $SV_LATEST != $SV ]; then
    sed -i "s/\(Standards-Version: \)\(.*\)/\1$SV_LATEST/" debian/control
    git commit -m "Updated Standards-Version from $SV to $SV_LATEST." debian/control
fi
if grep -q pkg-config debian/control; then
    sed -i s/pkg-config/pkgconf/ debian/control
    git commit -m "Replaced obsolete pkg-config build dependency with pkgconf." debian/control
fi
if grep -q libncurses5-dev debian/control; then
    sed -i s/libncurses5-dev/libncurses-dev/ debian/control
    git commit -m "Replaced obsolete libncurses5-dev build dependency with libncurses-dev." debian/control
fi

Some times the debsnap script fail to download some of the versions. In those cases I investigate, and if I decide the failing versions will not be missed, I call it using ALLOWFAILURE=true to ignore the problem and create the git repository anyway. With the git repository in place, I do a test build (gbp buildpackage) to ensure the build is actually working. If it does not I pick a different package, or if the build failure is trivial to fix, I fix it before continuing. At this stage I revisit http://salsa.debian.org/debian and create the project under this group for the package. I then follow the instructions to publish the local git repository. Here is from a recent example:

git remote add origin git@salsa.debian.org:debian/perl-byacc.git
git push --set-upstream origin master upstream pristine-tar
git push --tags

With a working build, I have a look at the build rules if I want to remove some more dust. I normally try to move to debhelper compat level 13, which involves removing debian/compat and modifying debian/control to build depend on debhelper-compat (=13). I also test with 'Rules-Requires-Root: no' in debian/control and verify in debian/rules that hardening is enabled, and include all of these if the package still build. If it fail to build with level 13, I try with 12, 11, 10 and so on until I find a level where it build, as I do not want to spend a lot of time fixing build issues. Some times, when I feel inspired, I make sure debian/copyright is converted to the machine readable format, often by starting with 'debhelper -cc' and then cleaning up the autogenerated content until it matches realities. If I feel like it, I might also clean up non-dh-based debian/rules files to use the short style dh build rules. Once I have removed all the dust I care to process for the package, I run 'gbp dch' to generate a debian/changelog entry based on the commits done so far, run 'dch -r' to switch from 'UNRELEASED' to 'unstable' and get an editor to make sure the 'QA upload' marker is in place and that all long commit descriptions are wrapped into sensible lengths, run 'debcommit --release -a' to commit and tag the new debian/changelog entry, run 'debuild -S' to build a source only package, and 'dput ../perl-byacc_2.0-10_source.changes' to do the upload. During the entire process, and many times per step, I run 'debuild' to verify the changes done still work. I also some times verify the set of built files using 'find debian' to see if I can spot any problems (like no file in usr/bin any more or empty package). I also try to fix all lintian issues reported at the end of each 'debuild' run. If I find Debian specific patches, I try to ensure their metadata is fairly up to date and some times I even try to reach out to upstream, to make the upstream project aware of the patches. Most of my emails bounce, so the success rate is low. For projects with no Homepage entry in debian/control I try to track down one, and for packages with no debian/watch file I try to create one. But at least for some of the packages I have been unable to find a functioning upstream, and must skip both of these. If I could handle ten percent in nine days, twenty people could complete the rest in less then five days. I use approximately twenty minutes per package, when I have twenty minutes spare time to spend. Perhaps you got twenty minutes to spare too? As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Lukas M rdian: Creating a Netplan enabled system through Debian-Installer

With the work that has been done in the debian-installer/netcfg merge-proposal !9 it is possible to install a standard Debian system, using the normal Debian-Installer (d-i) mini.iso images, that will come pre-installed with Netplan and all network configuration structured in /etc/netplan/. In this write-up I d like to run you through a list of commands for experiencing the Netplan enabled installation process first-hand. For now, we ll be using a custom ISO image, while waiting for the above-mentioned merge-proposal to be landed. Furthermore, as the Debian archive is going through major transitions builds of the unstable branch of d-i don t currently work. So I implemented a small backport, producing updated netcfg and netcfg-static for Bookworm, which can be used as localudebs/ during the d-i build. Let s start with preparing a working directory and installing the software dependencies for our virtualized Debian system:

$ mkdir d-i_bookworm && cd d-i_bookworm
$ apt install ovmf qemu-utils qemu-system-x86

Now let s download the custom mini.iso, linux kernel image and initrd.gz containing the Netplan enablement changes, as mentioned above. TODO: localudebs/

$ wget https://people.ubuntu.com/~slyon/d-i/bookworm/mini.iso
$ wget https://people.ubuntu.com/~slyon/d-i/bookworm/linux
$ wget https://people.ubuntu.com/~slyon/d-i/bookworm/initrd.gz

Next we ll prepare a VM, by copying the EFI firmware files, preparing some persistent EFIVARs file, to boot from FS0:\EFI\debian\grubx64.efi, and create a virtual disk for our machine:

$ cp /usr/share/OVMF/OVMF_CODE_4M.fd .
$ cp /usr/share/OVMF/OVMF_VARS_4M.fd .
$ qemu-img create -f qcow2 ./data.qcow2 5G

Finally, let s launch the installer using a custom preseed.cfg file, that will automatically install Netplan for us in the target system. A minimal preseed file could look like this:

# Install minimal Netplan generator binary
d-i preseed/late_command string in-target apt-get -y install netplan-generator

For this demo, we re installing the full netplan.io package (incl. Python CLI), as the netplan-generator package was not yet split out as an independent binary in the Bookworm cycle. You can choose the preseed file from a set of different variants to test the different configurations:

Netplan + systemd-resolved configuration
- https://people.ubuntu.com/~slyon/d-i/bookworm/netplan-preseed+networkd.cfg
Netplan + NetworkManager configuration
- https://people.ubuntu.com/~slyon/d-i/bookworm/netplan-preseed+nm.cfg

We re using the custom linux kernel and initrd.gz here to be able to pass the PRESEED_URL as a parameter to the kernel s cmdline directly. Launching this VM should bring up the normal debian-installer in its netboot/gtk form:

$ export U=https://people.ubuntu.com/~slyon/d-i/bookworm/netplan-preseed+networkd.cfg
$ qemu-system-x86_64 \
	-M q35 -enable-kvm -cpu host -smp 4 -m 2G \
	-drive if=pflash,format=raw,unit=0,file=OVMF_CODE_4M.fd,readonly=on \
	-drive if=pflash,format=raw,unit=1,file=OVMF_VARS_4M.fd,readonly=off \
	-device qemu-xhci -device usb-kbd -device usb-mouse \
	-vga none -device virtio-gpu-pci \
	-net nic,model=virtio -net user \
	-kernel ./linux -initrd ./initrd.gz -append "url=$U" \
	-hda ./data.qcow2 -cdrom ./mini.iso;

Now you can click through the normal Debian-Installer process, using mostly default settings. Optionally, you could play around with the networking settings, to see how those get translated to /etc/netplan/ in the target system.

After you confirmed your partitioning changes, the base system gets installed. I suggest not to select any additional components, like desktop environments, to speed up the process.

During the final step of the installation (finish-install.d/55netcfg-copy-config) d-i will detect that Netplan was installed in the target system (due to the preseed file provided) and opt to write its network configuration to /etc/netplan/ instead of /etc/network/interfaces or /etc/NetworkManager/system-connections/.

Done! After the installation finished you can reboot into your virgin Debian Bookworm system. To do that, quit the current Qemu process, by pressing Ctrl+C and make sure to copy over the EFIVARS.fd file that was written by grub during the installation, so Qemu can find the new system. Then reboot into the new system, not using the mini.iso image any more:

$ cp ./OVMF_VARS_4M.fd ./EFIVARS.fd
$ qemu-system-x86_64 \
        -M q35 -enable-kvm -cpu host -smp 4 -m 2G \
        -drive if=pflash,format=raw,unit=0,file=OVMF_CODE_4M.fd,readonly=on \
        -drive if=pflash,format=raw,unit=1,file=EFIVARS.fd,readonly=off \
        -device qemu-xhci -device usb-kbd -device usb-mouse \
        -vga none -device virtio-gpu-pci \
        -net nic,model=virtio -net user \
        -drive file=./data.qcow2,if=none,format=qcow2,id=disk0 \
        -device virtio-blk-pci,drive=disk0,bootindex=1
        -serial mon:stdio

Finally, you can play around with your Netplan enabled Debian system! As you will find, /etc/network/interfaces exists but is empty, it could still be used (optionally/additionally). Netplan was configured in /etc/netplan/ according to the settings given during the d-i installation process.

In our case we also installed the Netplan CLI, so we can play around with some of its features, like netplan status:

Thank you for following along the Netplan enabled Debian installation process and happy hacking! If you want to learn more join the discussion at Salsa:installer-team/netcfg and find us at GitHub:netplan.

18 April 2024

Thomas Koch: Minimal overhead VMs with Nix and MicroVM

Posted on March 17, 2024

Tags: debian, free software, nix

Joachim Breitner wrote about a Convenient sandboxed development environment and thus reminded me to blog about MicroVM. I ve toyed around with it a little but not yet seriously used it as I m currently not coding. MicroVM is a nix based project to configure and run minimal VMs. It can mount and thus reuse the hosts nix store inside the VM and thus has a very small disk footprint. I use MicroVM on a debian system using the nix package manager. The MicroVM author uses the project to host production services. Otherwise I consider it also a nice way to learn about NixOS after having started with the nix package manager and before making the big step to NixOS as my main system. The guests root filesystem is a tmpdir, so one must explicitly define folders that should be mounted from the host and thus be persistent across VM reboots. I defined the VM as a nix flake since this is how I started from the MicroVM projects example:

 
  description = "Haskell dev MicroVM";
  inputs.impermanence.url = "github:nix-community/impermanence";
  inputs.microvm.url = "github:astro/microvm.nix";
  inputs.microvm.inputs.nixpkgs.follows = "nixpkgs";
  outputs =   self, impermanence, microvm, nixpkgs  :
    let
      persistencePath = "/persistent";
      system = "x86_64-linux";
      user = "thk";
      vmname = "haskell";
      nixosConfiguration = nixpkgs.lib.nixosSystem  
          inherit system;
          modules = [
            microvm.nixosModules.microvm
            impermanence.nixosModules.impermanence
            ( pkgs, ...  :  
            environment.persistence.$ persistencePath  =  
                hideMounts = true;
                users.$ user  =  
                  directories = [
                    "git" ".stack"
                  ];
                 ;
               ;
              environment.sessionVariables =  
                TERM = "screen-256color";
               ;
              environment.systemPackages = with pkgs; [
                ghc
                git
                (haskell-language-server.override   supportedGhcVersions = [ "94" ];  )
                htop
                stack
                tmux
                tree
                vcsh
                zsh
              ];
              fileSystems.$ persistencePath .neededForBoot = nixpkgs.lib.mkForce true;
              microvm =  
                forwardPorts = [
                    from = "host"; host.port = 2222; guest.port = 22;  
                    from = "guest"; host.port = 5432; guest.port = 5432;   # postgresql
                ];
                hypervisor = "qemu";
                interfaces = [
                    type = "user"; id = "usernet"; mac = "00:00:00:00:00:02";  
                ];
                mem = 4096;
                shares = [  
                  # use "virtiofs" for MicroVMs that are started by systemd
                  proto = "9p";
                  tag = "ro-store";
                  # a host's /nix/store will be picked up so that no
                  # squashfs/erofs will be built for it.
                  source = "/nix/store";
                  mountPoint = "/nix/.ro-store";
                   
                  proto = "virtiofs";
                  tag = "persistent";
                  source = "~/.local/share/microvm/vms/$ vmname /persistent";
                  mountPoint = persistencePath;
                  socket = "/run/user/1000/microvm-$ vmname -persistent";
                 
                ];
                socket = "/run/user/1000/microvm-control.socket";
                vcpu = 3;
                volumes = [];
                writableStoreOverlay = "/nix/.rwstore";
               ;
              networking.hostName = vmname;
              nix.enable = true;
              nix.nixPath = ["nixpkgs=$ builtins.storePath <nixpkgs> "];
              nix.settings =  
                extra-experimental-features = ["nix-command" "flakes"];
                trusted-users = [user];
               ;
              security.sudo =  
                enable = true;
                wheelNeedsPassword = false;
               ;
              services.getty.autologinUser = user;
              services.openssh =  
                enable = true;
               ;
              system.stateVersion = "24.11";
              systemd.services.loadnixdb =  
                description = "import hosts nix database";
                path = [pkgs.nix];
                wantedBy = ["multi-user.target"];
                requires = ["nix-daemon.service"];
                script = "cat $ persistencePath /nix-store-db-dump nix-store --load-db";
               ;
              time.timeZone = nixpkgs.lib.mkDefault "Europe/Berlin";
              users.users.$ user  =  
                extraGroups = [ "wheel" "video" ];
                group = "user";
                isNormalUser = true;
                openssh.authorizedKeys.keys = [
                  "ssh-rsa REDACTED"
                ];
                password = "";
               ;
              users.users.root.password = "";
              users.groups.user =  ;
             )
          ];
         ;
    in  
      packages.$ system .default = nixosConfiguration.config.microvm.declaredRunner;
     ;

I start the microVM with a templated systemd user service:

[Unit]
Description=MicroVM for Haskell development
Requires=microvm-virtiofsd-persistent@.service
After=microvm-virtiofsd-persistent@.service
AssertFileNotEmpty=%h/.local/share/microvm/vms/%i/flake/flake.nix
[Service]
Type=forking
ExecStartPre=/usr/bin/sh -c "[ /nix/var/nix/db/db.sqlite -ot %h/.local/share/microvm/nix-store-db-dump ]   nix-store --dump-db >%h/.local/share/microvm/nix-store-db-dump"
ExecStartPre=ln -f -t %h/.local/share/microvm/vms/%i/persistent/ %h/.local/share/microvm/nix-store-db-dump
ExecStartPre=-%h/.local/state/nix/profile/bin/tmux new -s microvm -d
ExecStart=%h/.local/state/nix/profile/bin/tmux new-window -t microvm: -n "%i" "exec %h/.local/state/nix/profile/bin/nix run --impure %h/.local/share/microvm/vms/%i/flake"

The above service definition creates a dump of the hosts nix store db so that it can be imported in the guest. This is necessary so that the guest can actually use what is available in /nix/store. There is an effort for an overlayed nix store that would be preferable to this hack. Finally the microvm is started inside a tmux session named microvm . This way I can use the VM with SSH or through the console and also access the qemu console. And for completeness the virtiofsd service:

[Unit]
Description=serve host persistent folder for dev VM
AssertPathIsDirectory=%h/.local/share/microvm/vms/%i/persistent
[Service]
ExecStart=%h/.local/state/nix/profile/bin/virtiofsd \
 --socket-path=$ XDG_RUNTIME_DIR /microvm-%i-persistent \
 --shared-dir=%h/.local/share/microvm/vms/%i/persistent \
 --gid-map :995:%G:1: \
 --uid-map :1000:%U:1:

10 March 2024

Vasudev Kamath: Cloning a laptop over NVME TCP

Recently, I got a new laptop and had to set it up so I could start using it. But I wasn't really in the mood to go through the same old steps which I had explained in this post earlier. I was complaining about this to my colleague, and there came the suggestion of why not copy the entire disk to the new laptop. Though it sounded like an interesting idea to me, I had my doubts, so here is what I told him in return.

I don't have the tools to open my old laptop and connect the new disk over USB to my new laptop.
I use full disk encryption, and my old laptop has a 512GB disk, whereas the new laptop has a 1TB NVME, and I'm not so familiar with resizing LUKS.

He promptly suggested both could be done. For step 1, just expose the disk using NVME over TCP and connect it over the network and do a full disk copy, and the rest is pretty simple to achieve. In short, he suggested the following:

Export the disk using nvmet-tcp from the old laptop.
Do a disk copy to the new laptop.
Resize the partition to use the full 1TB.
Resize LUKS.
Finally, resize the BTRFS root disk.

Exporting Disk over NVME TCP The easiest way suggested by my colleague to do this is using systemd-storagetm.service. This service can be invoked by simply booting into storage-target-mode.target by specifying rd.systemd.unit=storage-target-mode.target. But he suggested not to use this as I need to tweak the dracut initrd image to involve network services as well as configuring WiFi from this mode is a painful thing to do. So alternatively, I simply booted both my laptops with GRML rescue CD. And the following step was done to export the NVME disk on my current laptop using the nvmet-tcp module of Linux:

modprobe nvmet-tcp
cd /sys/kernel/config/nvmet
mkdir ports/0
cd ports/0
echo "ipv4" > addr_adrfam
echo 0.0.0.0 > addr_traaddr
echo 4420 > addr_trsvcid
echo tcp > addr_trtype
cd /sys/kernel/config/nvmet/subsystems
mkdir testnqn
echo 1 >testnqn/allow_any_host
mkdir testnqn/namespaces/1
cd testnqn
# replace the device name with the disk you want to export
echo "/dev/nvme0n1" > namespaces/1/device_path
echo 1 > namespaces/1/enable
ln -s "../../subsystems/testnqn" /sys/kernel/config/nvmet/ports/0/subsystems/testnqn

These steps ensure that the device is now exported using NVME over TCP. The next step is to detect this on the new laptop and connect the device:

nvme discover -t tcp -a <ip> -s 4420
nvme connectl-all -t tcp -a <> -s 4420

Finally, nvme list shows the device which is connected to the new laptop, and we can proceed with the next step, which is to do the disk copy.

Copying the Disk I simply used the dd command to copy the root disk to my new laptop. Since the new laptop didn't have an Ethernet port, I had to rely only on WiFi, and it took about 7 and a half hours to copy the entire 512GB to the new laptop. The speed at which I was copying was about 18-20MB/s. The other option would have been to create an initial partition and file system and do an rsync of the root disk or use BTRFS itself for file system transfer.

dd if=/dev/nvme2n1 of=/dev/nvme0n1 status=progress bs=40M

Resizing Partition and LUKS Container The final part was very easy. When I launched parted, it detected that the partition table does not match the disk size and asked if it can fix it, and I said yes. Next, I had to install cloud-guest-utils to get growpart to fix the second partition, and the following command extended the partition to the full 1TB:

growpart /dev/nvem0n1 p2

Next, I used cryptsetup-resize to increase the LUKS container size.

cryptsetup luksOpen /dev/nvme0n1p2 ENC
cryptsetup resize ENC

Finally, I rebooted into the disk, and everything worked fine. After logging into the system, I resized the BTRFS file system. BTRFS requires the system to be mounted for resize, so I could not attempt it in live boot.

btfs fielsystem resize max /

Conclussion The only benefit of this entire process is that I have a new laptop, but I still feel like I'm using my existing laptop. Typically, setting up a new laptop takes about a week or two to completely get adjusted, but in this case, that entire time is saved. An added benefit is that I learned how to export disks using NVME over TCP, thanks to my colleague. This new knowledge adds to the value of the experience.

25 February 2024

Jacob Adams: AAC and Debian

Currently, in a default installation of Debian with the GNOME desktop, Bluetooth headphones that require the AAC codec¹ cannot be used. As the Debian wiki outlines, using the AAC codec over Bluetooth, while technically supported by PipeWire, is explicitly disabled in Debian at this time. This is because the fdk-aac library needed to enable this support is currently in the non-free component of the repository, meaning that PipeWire, which is in the main component, cannot depend on it.

How to Fix it Yourself If what you, like me, need is simply for Bluetooth Audio to work with AAC in Debian s default desktop environment², then you ll need to rebuild the pipewire package to include the AAC codec. While the current version in Debian main has been built with AAC deliberately disabled, it is trivial to enable if you can install a version of the fdk-aac library. I preface this with the usual caveats when it comes to patent and licensing controversies. I am not a lawyer, building this package and/or using it could get you into legal trouble. These instructions have only been tested on an up-to-date copy of Debian 12.

Install pipewire s build dependencies

sudo apt install build-essential devscripts
sudo apt build-dep pipewire

Install libfdk-aac-dev
```
sudo apt install libfdk-aac-dev
```
If the above doesn t work you ll likely need to enable non-free and try again
```
sudo sed -i 's/main/main non-free/g' /etc/apt/sources.list
sudo apt update
```
Alternatively, if you wish to ensure you are maximally license-compliant and patent un-infringing³, you can instead build fdk-aac-free which includes only those components of AAC that are known to be patent-free³. This is what should eventually end up in Debian to resolve this problem (see below).
```
sudo apt install git-buildpackage
mkdir fdk-aac-source
cd fdk-aac-source
git clone https://salsa.debian.org/multimedia-team/fdk-aac
cd fdk-aac
gbp buildpackage
sudo dpkg -i ../libfdk-aac2_*deb ../libfdk-aac-dev_*deb
```
Get the pipewire source code
```
mkdir pipewire-source
cd pipewire-source
apt source pipewire
```
This will create a bunch of files within the pipewire-source directory, but you ll only need the pipewire-<version> folder, this contains all the files you ll need to build the package, with all the debian-specific patches already applied. Note that you don t want to run the apt source command as root, as it will then create files that your regular user cannot edit.

Fix the dependencies and build options To fix up the build scripts to use the fdk-aac library, you need to save the following as pipewire-source/aac.patch

--- debian/control.orig
+++ debian/control
@@ -40,8 +40,8 @@
             modemmanager-dev,
             pkg-config,
             python3-docutils,
-               systemd [linux-any]
-Build-Conflicts: libfdk-aac-dev
+               systemd [linux-any],
+               libfdk-aac-dev
 Standards-Version: 4.6.2
 Vcs-Browser: https://salsa.debian.org/utopia-team/pipewire
 Vcs-Git: https://salsa.debian.org/utopia-team/pipewire.git
--- debian/rules.orig
+++ debian/rules
@@ -37,7 +37,7 @@
 		-Dauto_features=enabled \
 		-Davahi=enabled \
 		-Dbluez5-backend-native-mm=enabled \
-		-Dbluez5-codec-aac=disabled \
+		-Dbluez5-codec-aac=enabled \
 		-Dbluez5-codec-aptx=enabled \
 		-Dbluez5-codec-lc3=enabled \
 		-Dbluez5-codec-lc3plus=disabled \

Then you ll need to run patch from within the pipewire-<version> folder created by apt source:

patch -p0 < ../aac.patch

Build pipewire
```
cd pipewire-*
debuild
```
Note that you will likely see an error from debsign at the end of this process, this is harmless, you simply don t have a GPG key set up to sign your newly-built package⁴. Packages don t need to be signed to be installed, and debsign uses a somewhat non-standard signing process that dpkg does not check anyway.

Install libspa-0.2-bluetooth

sudo dpkg -i libspa-0.2-bluetooth_*.deb

Restart PipeWire and/or Reboot
```
sudo reboot
```
Theoretically there s a set of services to restart here that would get pipewire to pick up the new library, probably just pipewire itself. But it s just as easy to restart and ensure everything is using the correct library.

Why This is a slightly unusual situation, as the `fdk-aac` library is licensed under what even the GNU project acknowledges is a free software license. However, this license explicitly informs the user that they need to acquire a patent license to use this software⁵:
3. NO PATENT LICENSE NO EXPRESS OR IMPLIED LICENSES TO ANY PATENT CLAIMS, including without limitation the patents of Fraunhofer, ARE GRANTED BY THIS SOFTWARE LICENSE. Fraunhofer provides no warranty of patent non-infringement with respect to this software. You may use this FDK AAC Codec software or modifications thereto only for purposes that are authorized by appropriate patent licenses.
To quote the GNU project:
Because of this, and because the license author is a known patent aggressor, we encourage you to be careful about using or redistributing software under this license: you should first consider whether the licensor might aim to lure you into patent infringement.
AAC is covered by a number of patents, which expire at some point in the 2030s⁶. As such the current version of the library is potentially legally dubious to ship with any other software, as it could be considered patent-infringing³.

Fedora s solution Since 2017, Fedora has included a modified version of the library as fdk-aac-free, see the announcement and the bugzilla bug requesting review. This version of the library includes only the AAC LC profile, which is believed to be entirely patent-free³. Based on this, there is an open bug report in Debian requesting that the `fdk-aac` package be moved to the main component and that the `pipwire` package be updated to build against it.

The Debian NEW queue To resolve these bugs, a version of `fdk-aac-free` has been uploaded to Debian by Jeremy Bicha. However, to make it into Debian proper, it must first pass through the ftpmaster s NEW queue. The current version of fdk-aac-free has been in the NEW queue since July 2023. Based on conversations in some of the bugs above, it s been there since at least 2022⁷. I hope this helps anyone stuck with AAC to get their hardware working for them while we wait for the package to eventually make it through the NEW queue. Discuss on Hacker News

Such as, for example, any Apple AirPods, which only support AAC AFAICT.

Which, as of Debian 12 is GNOME 3 under Wayland with PipeWire.

I m not a lawyer, I don t know what kinds of infringement might or might not be possible here, do your own research, etc. ² ³ ⁴

And if you DO have a key setup with `debsign` you almost certainly don t need these instructions.

This was originally phrased as explicitly does not grant any patent rights. It was pointed out on Hacker News that this is not exactly what it says, as it also includes a specific note that you ll need to acquire your own patent license. I ve now quoted the relevant section of the license for clarity.

Wikipedia claims the base patents expire in 2031, with the extensions expiring in 2038, but its source for these claims is some guy s spreadsheet in a forum. The same discussion also brings up Wikipedia s claim and casts some doubt on it, so I m not entirely sure what s correct here, but I didn t feel like doing a patent deep-dive today. If someone can provide a clear answer that would be much appreciated.

According to Jeremy B cha: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1021370#17

3 January 2024

John Goerzen: Live Migrating from Raspberry Pi OS bullseye to Debian bookworm

I ve been getting annoyed with Raspberry Pi OS (Raspbian) for years now. It s a fork of Debian, but manages to omit some of the most useful things. So I ve decided to migrate all of my Pis to run pure Debian. These are my reasons:

Raspberry Pi OS has, for years now, specified that there is no upgrade path. That is, to get to a newer major release, it s a reinstall. While I have sometimes worked around this, for a device that is frequently installed in hard-to-reach locations, this is even more important than usual. It s common for me to upgrade machines for a decade or more across Debian releases and there s no reason that it should be so much more difficult with Raspbian.
As I noted in Consider Security First, the security situation for Raspberry Pi OS isn t as good as it is with Debian.
Raspbian lags behind Debian often times by 6 months or more for major releases, and days or weeks for bug fixes and security patches.
Raspbian has no direct backports support, though Raspberry Pi 3 and above can use Debian s backports (per my instructions as Installing Debian Backports on Raspberry Pi)
Raspbian uses a custom kernel without initramfs support

It turns out it is actually possible to do an in-place migration from Raspberry Pi OS bullseye to Debian bookworm. Here I will describe how. Even if you don t have a Raspberry Pi, this might still be instructive on how Raspbian and Debian packages work.

WARNINGS Before continuing, back up your system. This process isn t for the neophyte and it is entirely possible to mess up your boot device to the point that you have to do a fresh install to get your Pi to boot. This isn t a supported process at all.

Architecture Confusion Debian has three ARM-based architectures:

armel, for the lowest-end 32-bit ARM devices without hardware floating point support

armhf, for the higher-end 32-bit ARM devices with hardware float (hence hf )

arm64, for 64-bit ARM devices (which all have hardware float)

Although the Raspberry Pi 0 and 1 do support hardware float, they lack support for other CPU features that Debian s armhf architecture assumes. Therefore, the Raspberry Pi 0 and 1 could only run Debian s armel architecture. Raspberry Pi 3 and above are capable of running 64-bit, and can run both armhf and arm64. Prior to the release of the Raspberry Pi 5 / Raspbian bookworm, Raspbian only shipped the armhf architecture. Well, it was an architecture they called armhf, but it was different from Debian s armhf in that everything was recompiled to work with the more limited set of features on the earlier Raspberry Pi boards. It was really somewhere between Debian s armel and armhf archs. You could run Debian armel on those, but it would run more slowly, due to doing floating point calculations without hardware support. Debian s raspi FAQ goes into this a bit. What I am going to describe here is going from Raspbian armhf to Debian armhf with a 64-bit kernel. Therefore, it will only work with Raspberry Pi 3 and above. It may theoretically be possible to take a Raspberry Pi 2 to Debian armhf with a 32-bit kernel, but I haven t tried this and it may be more difficult. I have seen conflicting information on whether armhf really works on a Pi 2. (If you do try it on a Pi 2, ignore everything about arm64 and 64-bit kernels below, and just go with the `linux-image-armmp-lpae` kernel per the ARMMP page) There is another wrinkle: Debian doesn t support running 32-bit ARM kernels on 64-bit ARM CPUs, though it does support running a 32-bit userland on them. So we will wind up with a system with kernel packages from arm64 and everything else from armhf. This is a perfectly valid configuration as the arm64 like x86_64 is multiarch (that is, the CPU can natively execute both the 32-bit and 64-bit instructions). (It is theoretically possible to crossgrade a system from 32-bit to 64-bit userland, but that felt like a rather heavy lift for dubious benefit on a Pi; nevertheless, if you want to make this process even more complicated, refer to the CrossGrading page.)

Prerequisites and Limitations In addition to the need for a Raspberry Pi 3 or above in order for this to work, there are a few other things to mention. If you are using the GPIO features of the Pi, I don t know if those work with Debian. I think Raspberry Pi OS modified the desktop environment more than other components. All of my Pis are headless, so I don t know if this process will work if you use a desktop environment. I am assuming you are booting from a MicroSD card as is typical in the Raspberry Pi world. The Pi s firmware looks for a FAT partition (MBR type 0x0c) and looks within it for boot information. Depending on how long ago you first installed an OS on your Pi, your /boot may be too small for Debian. Use `df -h /boot` to see how big it is. I recommend 200MB at minimum. If your /boot is smaller than that, stop now (or use some other system to shrink your root filesystem and rearrange your partitions; I ve done this, but it s outside the scope of this article.) You need to have stable power. Once you begin this process, your pi will mostly be left in a non-bootable state until you finish. (You did make a backup, right?)

Basic idea The basic idea here is that since bookworm has almost entirely newer packages then bullseye, we can just switch over to it and let the Debian packages replace the Raspbian ones as they are upgraded. Well, it s not quite that easy, but that s the main idea.

Preparation First, make a backup. Even an image of your MicroSD card might be nice. OK, I think I ve said that enough now. It would be a good idea to have a HDMI cable (with the appropriate size of connector for your particular Pi board) and a HDMI display handy so you can troubleshoot any bootup issues with a console.

Preparation: access The Raspberry Pi OS by default sets up a user named pi that can use sudo to gain root without a password. I think this is an insecure practice, but assuming you haven t changed it, you will need to ensure it still works once you move to Debian. Raspberry Pi OS had a patch in their sudo package to enable it, and that will be removed when Debian s sudo package is installed. So, put this in /etc/sudoers.d/010_picompat:

pi ALL=(ALL) NOPASSWD: ALL

Also, there may be no password set for the root account. It would be a good idea to set one; it makes it easier to log in at the console. Use the passwd command as root to do so.

Preparation: bluetooth Debian doesn t correctly identify the Bluetooth hardware address. You can save it off to a file by running `hcitool dev > /root/bluetooth-from-raspbian.txt`. I don t use Bluetooth, but this should let you develop a script to bring it up properly.

Preparation: Debian archive keyring You will next need to install Debian s archive keyring so that apt can authenticate packages from Debian. Go to the bookworm download page for debian-archive-keyring and copy the URL for one of the files, then download it on the pi. For instance:

wget http://http.us.debian.org/debian/pool/main/d/debian-archive-keyring/debian-archive-keyring_2023.3+deb12u1_all.deb

Use sha256sum to verify the checksum of the downloaded file, comparing it to the package page on the Debian site. Now, you ll install it with:

dpkg -i debian-archive-keyring_2023.3+deb12u1_all.deb

Package first steps From here on, we are making modifications to the system that can leave it in a non-bootable state. Examine /etc/apt/sources.list and all the files in /etc/apt/sources.list.d. Most likely you will want to delete or comment out all lines in all files there. Replace them with something like:

deb http://deb.debian.org/debian/ bookworm main non-free-firmware contrib non-free
deb http://security.debian.org/debian-security bookworm-security main non-free-firmware contrib non-free
deb https://deb.debian.org/debian bookworm-backports main non-free-firmware contrib non-free

(you might leave off contrib and non-free depending on your needs) Now, we re going to tell it that we ll support arm64 packages:

dpkg --add-architecture arm64

And finally, download the bookworm package lists:

apt-get update

If there are any errors from that command, fix them and don t proceed until you have a clean run of apt-get update.

Moving /boot to /boot/firmware The boot FAT partition I mentioned above is mounted at /boot by Raspberry Pi OS, but Debian s scripts assume it will be at /boot/firmware. We need to fix this. First:

umount /boot
mkdir /boot/firmware

Now, edit fstab and change the reference to /boot to be to /boot/firmware. Now:

mount -v /boot/firmware
cd /boot/firmware
mv -vi * ..

This mounts the filesystem at the new location, and moves all its contents back to where apt believes it should be. Debian s packages will populate /boot/firmware later.

Installing the first packages Now we start by installing the first of the needed packages. Eventually we will wind up with roughly the same set Debian uses.

apt-get install linux-image-arm64
apt-get install firmware-brcm80211=20230210-5
apt-get install raspi-firmware

If you get errors relating to firmware-brcm80211 from any commands, run that install firmware-brcm80211 command and then proceed. There are a few packages that Raspbian marked as newer than the version in bookworm (whether or not they really are), and that s one of them.

Configuring the bootloader We need to configure a few things in /etc/default/raspi-firmware before proceeding. Edit that file. First, uncomment (or add) a line like this:

KERNEL_ARCH="arm64"

Next, in /boot/cmdline.txt you can find your old Raspbian boot command line. It will say something like:

root=PARTUUID=...

Save off the bit starting with PARTUUID. Back in /etc/default/raspi-firmware, set a line like this:

ROOTPART=PARTUUID=abcdef00

(substituting your real value for abcdef00). This is necessary because the microSD card device name often changes from /dev/mmcblk0 to /dev/mmcblk1 when switching to Debian s kernel. raspi-firmware will encode the current device name in /boot/firmware/cmdline.txt by default, which will be wrong once you boot into Debian s kernel. The PARTUUID approach lets it work regardless of the device name.

Purging the Raspbian kernel Run:

dpkg --purge raspberrypi-kernel

Upgrading the system At this point, we are going to run the procedure beginning at section 4.4.3 of the Debian release notes. Generally, you will do:

apt-get -u upgrade
apt full-upgrade

Fix any errors at each step before proceeding to the next. Now, to remove some cruft, run:

apt-get --purge autoremove

Inspect the list to make sure nothing important isn t going to be removed.

Removing Raspbian cruft You can list some of the cruft with:

apt list '~o'

And remove it with:

apt purge '~o'

I also don t run Bluetooth, and it seemed to sometimes hang on boot becuase I didn t bother to fix it, so I did:

apt-get --purge remove bluez

Installing some packages This makes sure some basic Debian infrastructure is available:

apt-get install wpasupplicant parted dosfstools wireless-tools iw alsa-tools
apt-get --purge autoremove

Installing firmware Now run:

apt-get install firmware-linux

Resolving firmware package version issues If it gives an error about the installed version of a package, you may need to force it to the bookworm version. For me, this often happened with firmware-atheros, firmware-libertas, and firmware-realtek. Here s how to resolve it, with firmware-realtek as an example:

Go to https://packages.debian.org/PACKAGENAME for instance, https://packages.debian.org/firmware-realtek. Note the version number in bookworm in this case, 20230210-5.
Now, you will force the installation of that package at that version:
```
apt-get install firmware-realtek=20230210-5
```
Repeat with every conflicting package until done.
Rerun apt-get install firmware-linux and make sure it runs cleanly.

Also, in the end you should be able to:

apt-get install firmware-atheros firmware-libertas firmware-realtek firmware-linux

Dealing with other Raspbian packages The Debian release notes discuss removing non-Debian packages. There will still be a few of those. Run:

apt list '?narrow(?installed, ?not(?origin(Debian)))'

Deal with them; mostly you will need to force the installation of a bookworm version using the procedure in the section Resolving firmware package version issues above (even if it s not for a firmware package). For non-firmware packages, you might possibly want to add --mark-auto to your apt-get install command line to allow the package to be autoremoved later if the things depending on it go away. If you aren t going to use Bluetooth, I recommend apt-get --purge remove bluez as well. Sometimes it can hang at boot if you don t fix it up as described above.

Set up networking We ll be switching to the Debian method of networking, so we ll create some files in /etc/network/interfaces.d. First, eth0 should look like this:

allow-hotplug eth0
iface eth0 inet dhcp
iface eth0 inet6 auto

And wlan0 should look like this:

allow-hotplug wlan0
iface wlan0 inet dhcp
    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

Raspbian is inconsistent about using eth0/wlan0 or renamed interface. Run ifconfig or ip addr. If you see a long-named interface such as enx<something> or wlp<something>, copy the eth0 file to the one named after the enx interface, or the wlan0 file to the one named after the wlp interface, and edit the internal references to eth0/wlan0 in this new file to name the long interface name. If using wifi, verify that your SSIDs and passwords are in /etc/wpa_supplicant/wpa_supplicant.conf. It should have lines like:

network= 
   ssid="NetworkName"
   psk="passwordHere"

(This is where Raspberry Pi OS put them).

Deal with DHCP Raspberry Pi OS used dhcpcd, whereas bookworm normally uses isc-dhcp-client. Verify the system is in the correct state:

apt-get install isc-dhcp-client
apt-get --purge remove dhcpcd dhcpcd-base dhcpcd5 dhcpcd-dbus

Set up LEDs To set up the LEDs to trigger on MicroSD activity as they did with Raspbian, follow the Debian instructions. Run apt-get install sysfsutils. Then put this in a file at /etc/sysfs.d/local-raspi-leds.conf:

class/leds/ACT/brightness = 1
class/leds/ACT/trigger = mmc1

Prepare for boot To make sure all the `/boot/firmware` files are updated, run `update-initramfs -u`. Verify that `root` in `/boot/firmware/cmdline.txt` references the `PARTUUID` as appropriate. Verify that `/boot/firmware/config.txt` contains the lines `arm_64bit=1` and `upstream_kernel=1`. If not, go back to the section on modifying `/etc/default/raspi-firmware` and fix it up.

The moment arrives Cross your fingers and try rebooting into your Debian system:

reboot

For some reason, I found that the first boot into Debian seems to hang for 30-60 seconds during bootstrap. I m not sure why; don t panic if that happens. It may be necessary to power cycle the Pi for this boot.

Troubleshooting If things don t work out, hook up the Pi to a HDMI display and see what s up. If I anticipated a particular problem, I would have documented it here (a lot of the things I documented here are because I ran into them!) So I can t give specific advice other than to watch boot messages on the console. If you don t even get kernel messages going, then there is some problem with your partition table or `/boot/firmware` FAT partition. Otherwise, you ve at least got the kernel going and can troubleshoot like usual from there.

9 December 2023

Simon Josefsson: Classic McEliece goes to IETF and OpenSSH

My earlier work on Streamlined NTRU Prime has been progressing along. The IETF document on sntrup761 in SSH has passed several process points. GnuPG s libgcrypt has added support for sntrup761. The libssh support for sntrup761 is working, but the merge request is stuck mostly due to lack of time to debug why the regression test suite sporadically errors out in non-sntrup761 related parts with the patch. The foundation for lattice-based post-quantum algorithms has some uncertainty around it, and I have felt that there is more to the post-quantum story than adding sntrup761 to implementations. Classic McEliece has been mentioned to me a couple of times, and I took some time to learn it and did a cut n paste job of the proposed ISO standard and published draft-josefsson-mceliece in the IETF to make the algorithm easily available to the IETF community. A high-quality implementation of Classic McEliece has been published as libmceliece and I ve been supporting the work of Jan Moj to package libmceliece for Debian, alas it has been stuck in the ftp-master NEW queue for manual review for over two months. The pre-dependencies librandombytes and libcpucycles are available in Debian already. All that text writing and packaging work set the scene to write some code. When I added support for sntrup761 in libssh, I became familiar with the OpenSSH code base, so it was natural to return to OpenSSH to experiment with a new SSH KEX for Classic McEliece. DJB suggested to pick mceliece6688128 and combine it with the existing X25519+sntrup761 or with plain X25519. While a three-algorithm hybrid between X25519, sntrup761 and mceliece6688128 would be a simple drop-in for those that don t want to lose the benefits offered by sntrup761, I decided to start the journey on a pure combination of X25519 with mceliece6688128. The key combiner in sntrup761x25519 is a simple SHA512 call and the only good I can say about that is that it is simple to describe and implement, and doesn t raise too many questions since it is already deployed. After procrastinating coding for months, once I sat down to work it only took a couple of hours until I had a successful Classic McEliece SSH connection. I suppose my brain had sorted everything in background before I started. To reproduce it, please try the following in a Debian testing environment (I use podman to get a clean environment).

# podman run -it --rm debian:testing-slim
apt update
apt dist-upgrade -y
apt install -y wget python3 librandombytes-dev libcpucycles-dev gcc make git autoconf libz-dev libssl-dev
cd ~
wget -q -O- https://lib.mceliece.org/libmceliece-20230612.tar.gz   tar xfz -
cd libmceliece-20230612/
./configure
make install
ldconfig
cd ..
git clone https://gitlab.com/jas/openssh-portable
cd openssh-portable
git checkout jas/mceliece
autoreconf
./configure # verify 'libmceliece support: yes'
make # CC="cc -DDEBUG_KEX=1 -DDEBUG_KEXDH=1 -DDEBUG_KEXECDH=1"

You should now have a working SSH client and server that supports Classic McEliece! Verify support by running ./ssh -Q kex and it should mention mceliece6688128x25519-sha512@openssh.com. To have it print plenty of debug outputs, you may remove the # character on the final line, but don t use such a build in production. You can test it as follows:

./ssh-keygen -A # writes to /usr/local/etc/ssh_host_...
# setup public-key based login by running the following:
./ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ""
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
adduser --system sshd
mkdir /var/empty
while true; do $PWD/sshd -p 2222 -f /dev/null; done &
./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date

On the client you should see output like this:

OpenSSH_9.5p1, OpenSSL 3.0.11 19 Sep 2023
...
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: mceliece6688128x25519-sha512@openssh.com
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: ssh-ed25519 SHA256:YognhWY7+399J+/V8eAQWmM3UFDLT0dkmoj3pIJ0zXs
...
debug1: Host '[localhost]:2222' is known and matches the ED25519 host key.
debug1: Found key in /root/.ssh/known_hosts:1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey in after 134217728 blocks
...
debug1: Sending command: date
debug1: pledge: fork
debug1: permanently_set_uid: 0/0
Environment:
  USER=root
  LOGNAME=root
  HOME=/root
  PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
  MAIL=/var/mail/root
  SHELL=/bin/bash
  SSH_CLIENT=::1 46894 2222
  SSH_CONNECTION=::1 46894 ::1 2222
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
Sat Dec  9 22:22:40 UTC 2023
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 1048044, received 3500 bytes, in 0.0 seconds
Bytes per second: sent 23388935.4, received 78108.6
debug1: Exit status 0

Notice the kex: algorithm: mceliece6688128x25519-sha512@openssh.com output. How about network bandwidth usage? Below is a comparison of a complete SSH client connection such as the one above that log in and print date and logs out. Plain X25519 is around 7kb, X25519 with sntrup761 is around 9kb, and mceliece6688128 with X25519 is around 1MB. Yes, Classic McEliece has large keys, but for many environments, 1MB of data for the session establishment will barely be noticeable.

./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date 2>&1   grep ^Transferred
Transferred: sent 3028, received 3612 bytes, in 0.0 seconds
./ssh -v -p 2222 localhost -oKexAlgorithms=sntrup761x25519-sha512@openssh.com date 2>&1   grep ^Transferred
Transferred: sent 4212, received 4596 bytes, in 0.0 seconds
./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date 2>&1   grep ^Transferred
Transferred: sent 1048044, received 3764 bytes, in 0.0 seconds

So how about session establishment time?

date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date > /dev/null 2>&1; i= expr $i + 1 ; done; date
Sat Dec  9 22:39:19 UTC 2023
Sat Dec  9 22:39:25 UTC 2023
# 6 seconds
date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=sntrup761x25519-sha512@openssh.com date > /dev/null 2>&1; i= expr $i + 1 ; done; date
Sat Dec  9 22:39:29 UTC 2023
Sat Dec  9 22:39:38 UTC 2023
# 9 seconds
date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date > /dev/null 2>&1; i= expr $i + 1 ; done; date
Sat Dec  9 22:39:55 UTC 2023
Sat Dec  9 22:40:07 UTC 2023
# 12 seconds

I never noticed adding sntrup761, so I m pretty sure I wouldn t notice this increase either. This is all running on my laptop that runs Trisquel so take it with a grain of salt but at least the magnitude is clear. Future work items include:

Use a better hybrid mode combiner than SHA512? See for example KEM Combiners.
Write IETF document describing the Classic McEliece SSH protocol
Submit my patch to the OpenSSH community for discussion and feedback, please review meanwhile!
Implement a mceliece6688128sntrup761x25519 multi-hybrid mode?
Write a shell script a la sntrup761.sh to import a stripped-down mceliece6688128 implementation to avoid having OpenSSH depend on libmceliece?
My kexmceliece6688128x25519.c is really just kexsntrup761x25519.c with the three calls to sntrup761 replaced with mceliece6688128. This should be parametrized to avoid cut n paste of code, if the OpenSSH community is interested.
Consider if the behaviour of librandombytes related to closing of file descriptors is relevant to OpenSSH.

Happy post-quantum SSH ing! Update: Changing the mceliece6688128_keypair call to mceliece6688128f_keypair (i.e., using the fully compatible f-variant) results in McEliece being just as fast as sntrup761 on my machine. Update 2023-12-26: An initial IETF document draft-josefsson-ssh-mceliece-00 published.

21 November 2023

Mike Hommey: How I (kind of) killed Mercurial at Mozilla

Did you hear the news? Firefox development is moving from Mercurial to Git. While the decision is far from being mine, and I was barely involved in the small incremental changes that ultimately led to this decision, I feel I have to take at least some responsibility. And if you are one of those who would rather use Mercurial than Git, you may direct all your ire at me. But let's take a step back and review the past 25 years leading to this decision. You'll forgive me for skipping some details and any possible inaccuracies. This is already a long post, while I could have been more thorough, even I think that would have been too much. This is also not an official Mozilla position, only my personal perception and recollection as someone who was involved at times, but mostly an observer from a distance. From CVS to DVCS From its release in 1998, the Mozilla source code was kept in a CVS repository. If you're too young to know what CVS is, let's just say it's an old school version control system, with its set of problems. Back then, it was mostly ubiquitous in the Open Source world, as far as I remember. In the early 2000s, the Subversion version control system gained some traction, solving some of the problems that came with CVS. Incidentally, Subversion was created by Jim Blandy, who now works at Mozilla on completely unrelated matters. In the same period, the Linux kernel development moved from CVS to Bitkeeper, which was more suitable to the distributed nature of the Linux community. BitKeeper had its own problem, though: it was the opposite of Open Source, but for most pragmatic people, it wasn't a real concern because free access was provided. Until it became a problem: someone at OSDL developed an alternative client to BitKeeper, and licenses of BitKeeper were rescinded for OSDL members, including Linus Torvalds (they were even prohibited from purchasing one). Following this fiasco, in April 2005, two weeks from each other, both Git and Mercurial were born. The former was created by Linus Torvalds himself, while the latter was developed by Olivia Mackall, who was a Linux kernel developer back then. And because they both came out of the same community for the same needs, and the same shared experience with BitKeeper, they both were similar distributed version control systems. Interestingly enough, several other DVCSes existed:

SVK, a DVCS built on top of Subversion, allowing users to create local (offline) branches of remote Subversion repositories. It was also known for its powerful merging capabilities. I picked it at some point for my Debian work, mainly because I needed to interact with Subversion repositories.
Arch (tla), later known as GNU arch. From what I remember, it was awful to use. You think Git is complex or confusing? Arch was way worse. It was forked as "Bazaar", but the fork was abandoned in favor of "Bazaar-NG", now known as "Bazaar" or "bzr", a much more user-friendly DVCS. The first release of Bzr actually precedes Git's by two weeks. I guess it was too new to be considered by Linus Torvalds for the Linux kernel needs.
Monotone, which I don't know much about, but it was mentioned by Linus Torvalds two days before the initial Git commit of Git. As far as I know, it was too slow for the Linux kernel's needs. I'll note in passing that Monotone is the creation of Graydon Hoare, who also created Rust.
Darcs, with its patch-based model, rather than the common snapshot-based model, allowed more flexible management of changes. This approach came, however, at the expense of performance.

In this landscape, the major difference Git was making at the time was that it was blazing fast. Almost incredibly so, at least on Linux systems. That was less true on other platforms (especially Windows). It was a game-changer for handling large codebases in a smooth manner. Anyways, two years later, in 2007, Mozilla decided to move its source code not to Bzr, not to Git, not to Subversion (which, yes, was a contender), but to Mercurial. The decision "process" was laid down in two rather colorful blog posts. My memory is a bit fuzzy, but I don't recall that it was a particularly controversial choice. All of those DVCSes were still young, and there was no definite "winner" yet (GitHub hadn't even been founded). It made the most sense for Mozilla back then, mainly because the Git experience on Windows still wasn't there, and that mattered a lot for Mozilla, with its diverse platform support. As a contributor, I didn't think much of it, although to be fair, at the time, I was mostly consuming the source tarballs. Personal preferences Digging through my archives, I've unearthed a forgotten chapter: I did end up setting up both a Mercurial and a Git mirror of the Firefox source repository on alioth.debian.org. Alioth.debian.org was a FusionForge-based collaboration system for Debian developers, similar to SourceForge. It was the ancestor of salsa.debian.org. I used those mirrors for the Debian packaging of Firefox (cough cough Iceweasel). The Git mirror was created with hg-fast-export, and the Mercurial mirror was only a necessary step in the process. By that time, I had converted my Subversion repositories to Git, and switched off SVK. Incidentally, I started contributing to Git around that time as well. I apparently did this not too long after Mozilla switched to Mercurial. As a Linux user, I think I just wanted the speed that Mercurial was not providing. Not that Mercurial was that slow, but the difference between a couple seconds and a couple hundred milliseconds was a significant enough difference in user experience for me to prefer Git (and Firefox was not the only thing I was using version control for) Other people had also similarly created their own mirror, or with other tools. But none of them were "compatible": their commit hashes were different. Hg-git, used by the latter, was putting extra information in commit messages that would make the conversion differ, and hg-fast-export would just not be consistent with itself! My mirror is long gone, and those have not been updated in more than a decade. I did end up using Mercurial, when I got commit access to the Firefox source repository in April 2010. I still kept using Git for my Debian activities, but I now was also using Mercurial to push to the Mozilla servers. I joined Mozilla as a contractor a few months after that, and kept using Mercurial for a while, but as a, by then, long time Git user, it never really clicked for me. It turns out, the sentiment was shared by several at Mozilla. Git incursion In the early 2010s, GitHub was becoming ubiquitous, and the Git mindshare was getting large. Multiple projects at Mozilla were already entirely hosted on GitHub. As for the Firefox source code base, Mozilla back then was kind of a Wild West, and engineers being engineers, multiple people had been using Git, with their own inconvenient workflows involving a local Mercurial clone. The most popular set of scripts was moz-git-tools, to incorporate changes in a local Git repository into the local Mercurial copy, to then send to Mozilla servers. In terms of the number of people doing that, though, I don't think it was a lot of people, probably a few handfuls. On my end, I was still keeping up with Mercurial. I think at that time several engineers had their own unofficial Git mirrors on GitHub, and later on Ehsan Akhgari provided another mirror, with a twist: it also contained the full CVS history, which the canonical Mercurial repository didn't have. This was particularly interesting for engineers who needed to do some code archeology and couldn't get past the 2007 cutoff of the Mercurial repository. I think that mirror ultimately became the official-looking, but really unofficial, mozilla-central repository on GitHub. On a side note, a Mercurial repository containing the CVS history was also later set up, but that didn't lead to something officially supported on the Mercurial side. Some time around 2011~2012, I started to more seriously consider using Git for work myself, but wasn't satisfied with the workflows others had set up for themselves. I really didn't like the idea of wasting extra disk space keeping a Mercurial clone around while using a Git mirror. I wrote a Python script that would use Mercurial as a library to access a remote repository and produce a git-fast-import stream. That would allow the creation of a git repository without a local Mercurial clone. It worked quite well, but it was not able to incrementally update. Other, more complete tools existed already, some of which I mentioned above. But as time was passing and the size and depth of the Mercurial repository was growing, these tools were showing their limits and were too slow for my taste, especially for the initial clone. Boot to Git In the same time frame, Mozilla ventured in the Mobile OS sphere with Boot to Gecko, later known as Firefox OS. What does that have to do with version control? The needs of third party collaborators in the mobile space led to the creation of what is now the gecko-dev repository on GitHub. As I remember it, it was challenging to create, but once it was there, Git users could just clone it and have a working, up-to-date local copy of the Firefox source code and its history... which they could already have, but this was the first officially supported way of doing so. Coincidentally, Ehsan's unofficial mirror was having trouble (to the point of GitHub closing the repository) and was ultimately shut down in December 2013. You'll often find comments on the interwebs about how GitHub has become unreliable since the Microsoft acquisition. I can't really comment on that, but if you think GitHub is unreliable now, rest assured that it was worse in its beginning. And its sustainability as a platform also wasn't a given, being a rather new player. So on top of having this official mirror on GitHub, Mozilla also ventured in setting up its own Git server for greater control and reliability. But the canonical repository was still the Mercurial one, and while Git users now had a supported mirror to pull from, they still had to somehow interact with Mercurial repositories, most notably for the Try server. Git slowly creeping in Firefox build tooling Still in the same time frame, tooling around building Firefox was improving drastically. For obvious reasons, when version control integration was needed in the tooling, Mercurial support was always a no-brainer. The first explicit acknowledgement of a Git repository for the Firefox source code, other than the addition of the .gitignore file, was bug 774109. It added a script to install the prerequisites to build Firefox on macOS (still called OSX back then), and that would print a message inviting people to obtain a copy of the source code with either Mercurial or Git. That was a precursor to current bootstrap.py, from September 2012. Following that, as far as I can tell, the first real incursion of Git in the Firefox source tree tooling happened in bug 965120. A few days earlier, bug 952379 had added a mach clang-format command that would apply clang-format-diff to the output from hg diff. Obviously, running hg diff on a Git working tree didn't work, and bug 965120 was filed, and support for Git was added there. That was in January 2014. A year later, when the initial implementation of mach artifact was added (which ultimately led to artifact builds), Git users were an immediate thought. But while they were considered, it was not to support them, but to avoid actively breaking their workflows. Git support for mach artifact was eventually added 14 months later, in March 2016. From gecko-dev to git-cinnabar Let's step back a little here, back to the end of 2014. My user experience with Mercurial had reached a level of dissatisfaction that was enough for me to decide to take that script from a couple years prior and make it work for incremental updates. That meant finding a way to store enough information locally to be able to reconstruct whatever the incremental updates would be relying on (guess why other tools hid a local Mercurial clone under hood). I got something working rather quickly, and after talking to a few people about this side project at the Mozilla Portland All Hands and seeing their excitement, I published a git-remote-hg initial prototype on the last day of the All Hands. Within weeks, the prototype gained the ability to directly push to Mercurial repositories, and a couple months later, was renamed to git-cinnabar. At that point, as a Git user, instead of cloning the gecko-dev repository from GitHub and switching to a local Mercurial repository whenever you needed to push to a Mercurial repository (i.e. the aforementioned Try server, or, at the time, for reviews), you could just clone and push directly from/to Mercurial, all within Git. And it was fast too. You could get a full clone of mozilla-central in less than half an hour, when at the time, other similar tools would take more than 10 hours (needless to say, it's even worse now). Another couple months later (we're now at the end of April 2015), git-cinnabar became able to start off a local clone of the gecko-dev repository, rather than clone from scratch, which could be time consuming. But because git-cinnabar and the tool that was updating gecko-dev weren't producing the same commits, this setup was cumbersome and not really recommended. For instance, if you pushed something to mozilla-central with git-cinnabar from a gecko-dev clone, it would come back with a different commit hash in gecko-dev, and you'd have to deal with the divergence. Eventually, in April 2020, the scripts updating gecko-dev were switched to git-cinnabar, making the use of gecko-dev alongside git-cinnabar a more viable option. Ironically(?), the switch occurred to ease collaboration with KaiOS (you know, the mobile OS born from the ashes of Firefox OS). Well, okay, in all honesty, when the need of syncing in both directions between Git and Mercurial (we only had ever synced from Mercurial to Git) came up, I nudged Mozilla in the direction of git-cinnabar, which, in my (biased but still honest) opinion, was the more reliable option for two-way synchronization (we did have regular conversion problems with hg-git, nothing of the sort has happened since the switch). One Firefox repository to rule them all For reasons I don't know, Mozilla decided to use separate Mercurial repositories as "branches". With the switch to the rapid release process in 2011, that meant one repository for nightly (mozilla-central), one for aurora, one for beta, and one for release. And with the addition of Extended Support Releases in 2012, we now add a new ESR repository every year. Boot to Gecko also had its own branches, and so did Fennec (Firefox for Mobile, before Android). There are a lot of them. And then there are also integration branches, where developer's work lands before being merged in mozilla-central (or backed out if it breaks things), always leaving mozilla-central in a (hopefully) good state. Only one of them remains in use today, though. I can only suppose that the way Mercurial branches work was not deemed practical. It is worth noting, though, that Mercurial branches are used in some cases, to branch off a dot-release when the next major release process has already started, so it's not a matter of not knowing the feature exists or some such. In 2016, Gregory Szorc set up a new repository that would contain them all (or at least most of them), which eventually became what is now the mozilla-unified repository. This would e.g. simplify switching between branches when necessary. 7 years later, for some reason, the other "branches" still exist, but most developers are expected to be using mozilla-unified. Mozilla's CI also switched to using mozilla-unified as base repository. Honestly, I'm not sure why the separate repositories are still the main entry point for pushes, rather than going directly to mozilla-unified, but it probably comes down to switching being work, and not being a top priority. Also, it probably doesn't help that working with multiple heads in Mercurial, even (especially?) with bookmarks, can be a source of confusion. To give an example, if you aren't careful, and do a plain clone of the mozilla-unified repository, you may not end up on the latest mozilla-central changeset, but rather, e.g. one from beta, or some other branch, depending which one was last updated. Hosting is simple, right? Put your repository on a server, install hgweb or gitweb, and that's it? Maybe that works for... Mercurial itself, but that repository "only" has slightly over 50k changesets and less than 4k files. Mozilla-central has more than an order of magnitude more changesets (close to 700k) and two orders of magnitude more files (more than 700k if you count the deleted or moved files, 350k if you count the currently existing ones). And remember, there are a lot of "duplicates" of this repository. And I didn't even mention user repositories and project branches. Sure, it's a self-inflicted pain, and you'd think it could probably(?) be mitigated with shared repositories. But consider the simple case of two repositories: mozilla-central and autoland. You make autoland use mozilla-central as a shared repository. Now, you push something new to autoland, it's stored in the autoland datastore. Eventually, you merge to mozilla-central. Congratulations, it's now in both datastores, and you'd need to clean-up autoland if you wanted to avoid the duplication. Now, you'd think mozilla-unified would solve these issues, and it would... to some extent. Because that wouldn't cover user repositories and project branches briefly mentioned above, which in GitHub parlance would be considered as Forks. So you'd want a mega global datastore shared by all repositories, and repositories would need to only expose what they really contain. Does Mercurial support that? I don't think so (okay, I'll give you that: even if it doesn't, it could, but that's extra work). And since we're talking about a transition to Git, does Git support that? You may have read about how you can link to a commit from a fork and make-pretend that it comes from the main repository on GitHub? At least, it shows a warning, now. That's essentially the architectural reason why. So the actual answer is that Git doesn't support it out of the box, but GitHub has some backend magic to handle it somehow (and hopefully, other things like Gitea, Girocco, Gitlab, etc. have something similar). Now, to come back to the size of the repository. A repository is not a static file. It's a server with which you negotiate what you have against what it has that you want. Then the server bundles what you asked for based on what you said you have. Or in the opposite direction, you negotiate what you have that it doesn't, you send it, and the server incorporates what you sent it. Fortunately the latter is less frequent and requires authentication. But the former is more frequent and CPU intensive. Especially when pulling a large number of changesets, which, incidentally, cloning is. "But there is a solution for clones" you might say, which is true. That's clonebundles, which offload the CPU intensive part of cloning to a single job scheduled regularly. Guess who implemented it? Mozilla. But that only covers the cloning part. We actually had laid the ground to support offloading large incremental updates and split clones, but that never materialized. Even with all that, that still leaves you with a server that can display file contents, diffs, blames, provide zip archives of a revision, and more, all of which are CPU intensive in their own way. And these endpoints are regularly abused, and cause extra load to your servers, yes plural, because of course a single server won't handle the load for the number of users of your big repositories. And because your endpoints are abused, you have to close some of them. And I'm not mentioning the Try repository with its tens of thousands of heads, which brings its own sets of problems (and it would have even more heads if we didn't fake-merge them once in a while). Of course, all the above applies to Git (and it only gained support for something akin to clonebundles last year). So, when the Firefox OS project was stopped, there wasn't much motivation to continue supporting our own Git server, Mercurial still being the official point of entry, and git.mozilla.org was shut down in 2016. The growing difficulty of maintaining the status quo Slowly, but steadily in more recent years, as new tooling was added that needed some input from the source code manager, support for Git was more and more consistently added. But at the same time, as people left for other endeavors and weren't necessarily replaced, or more recently with layoffs, resources allocated to such tooling have been spread thin. Meanwhile, the repository growth didn't take a break, and the Try repository was becoming an increasing pain, with push times quite often exceeding 10 minutes. The ongoing work to move Try pushes to Lando will hide the problem under the rug, but the underlying problem will still exist (although the last version of Mercurial seems to have improved things). On the flip side, more and more people have been relying on Git for Firefox development, to my own surprise, as I didn't really push for that to happen. It just happened organically, by ways of git-cinnabar existing, providing a compelling experience to those who prefer Git, and, I guess, word of mouth. I was genuinely surprised when I recently heard the use of Git among moz-phab users had surpassed a third. I did, however, occasionally orient people who struggled with Mercurial and said they were more familiar with Git, towards git-cinnabar. I suspect there's a somewhat large number of people who never realized Git was a viable option. But that, on its own, can come with its own challenges: if you use git-cinnabar without being backed by gecko-dev, you'll have a hard time sharing your branches on GitHub, because you can't push to a fork of gecko-dev without pushing your entire local repository, as they have different commit histories. And switching to gecko-dev when you weren't already using it requires some extra work to rebase all your local branches from the old commit history to the new one. Clone times with git-cinnabar have also started to go a little out of hand in the past few years, but this was mitigated in a similar manner as with the Mercurial cloning problem: with static files that are refreshed regularly. Ironically, that made cloning with git-cinnabar faster than cloning with Mercurial. But generating those static files is increasingly time-consuming. As of writing, generating those for mozilla-unified takes close to 7 hours. I was predicting clone times over 10 hours "in 5 years" in a post from 4 years ago, I wasn't too far off. With exponential growth, it could still happen, although to be fair, CPUs have improved since. I will explore the performance aspect in a subsequent blog post, alongside the upcoming release of git-cinnabar 0.7.0-b1. I don't even want to check how long it now takes with hg-git or git-remote-hg (they were already taking more than a day when git-cinnabar was taking a couple hours). I suppose it's about time that I clarify that git-cinnabar has always been a side-project. It hasn't been part of my duties at Mozilla, and the extent to which Mozilla supports git-cinnabar is in the form of taskcluster workers on the community instance for both git-cinnabar CI and generating those clone bundles. Consequently, that makes the above git-cinnabar specific issues a Me problem, rather than a Mozilla problem. Taking the leap I can't talk for the people who made the proposal to move to Git, nor for the people who put a green light on it. But I can at least give my perspective. Developers have regularly asked why Mozilla was still using Mercurial, but I think it was the first time that a formal proposal was laid out. And it came from the Engineering Workflow team, responsible for issue tracking, code reviews, source control, build and more. It's easy to say "Mozilla should have chosen Git in the first place", but back in 2007, GitHub wasn't there, Bitbucket wasn't there, and all the available options were rather new (especially compared to the then 21 years-old CVS). I think Mozilla made the right choice, all things considered. Had they waited a couple years, the story might have been different. You might say that Mozilla stayed with Mercurial for so long because of the sunk cost fallacy. I don't think that's true either. But after the biggest Mercurial repository hosting service turned off Mercurial support, and the main contributor to Mercurial going their own way, it's hard to ignore that the landscape has evolved. And the problems that we regularly encounter with the Mercurial servers are not going to get any better as the repository continues to grow. As far as I know, all the Mercurial repositories bigger than Mozilla's are... not using Mercurial. Google has its own closed-source server, and Facebook has another of its own, and it's not really public either. With resources spread thin, I don't expect Mozilla to be able to continue supporting a Mercurial server indefinitely (although I guess Octobus could be contracted to give a hand, but is that sustainable?). Mozilla, being a champion of Open Source, also doesn't live in a silo. At some point, you have to meet your contributors where they are. And the Open Source world is now majoritarily using Git. I'm sure the vast majority of new hires at Mozilla in the past, say, 5 years, know Git and have had to learn Mercurial (although they arguably didn't need to). Even within Mozilla, with thousands(!) of repositories on GitHub, Firefox is now actually the exception rather than the norm. I should even actually say Desktop Firefox, because even Mobile Firefox lives on GitHub (although Fenix is moving back in together with Desktop Firefox, and the timing is such that that will probably happen before Firefox moves to Git). Heck, even Microsoft moved to Git! With a significant developer base already using Git thanks to git-cinnabar, and all the constraints and problems I mentioned previously, it actually seems natural that a transition (finally) happens. However, had git-cinnabar or something similarly viable not existed, I don't think Mozilla would be in a position to take this decision. On one hand, it probably wouldn't be in the current situation of having to support both Git and Mercurial in the tooling around Firefox, nor the resource constraints related to that. But on the other hand, it would be farther from supporting Git and being able to make the switch in order to address all the other problems. But... GitHub? I hope I made a compelling case that hosting is not as simple as it can seem, at the scale of the Firefox repository. It's also not Mozilla's main focus. Mozilla has enough on its plate with the migration of existing infrastructure that does rely on Mercurial to understandably not want to figure out the hosting part, especially with limited resources, and with the mixed experience hosting both Mercurial and git has been so far. After all, GitHub couldn't even display things like the contributors' graph on gecko-dev until recently, and hosting is literally their job! They still drop the ball on large blames (thankfully we have searchfox for those). Where does that leave us? Gitlab? For those criticizing GitHub for being proprietary, that's probably not open enough. Cloud Source Repositories? "But GitHub is Microsoft" is a complaint I've read a lot after the announcement. Do you think Google hosting would have appealed to these people? Bitbucket? I'm kind of surprised it wasn't in the list of providers that were considered, but I'm also kind of glad it wasn't (and I'll leave it at that). I think the only relatively big hosting provider that could have made the people criticizing the choice of GitHub happy is Codeberg, but I hadn't even heard of it before it was mentioned in response to Mozilla's announcement. But really, with literal thousands of Mozilla repositories already on GitHub, with literal tens of millions repositories on the platform overall, the pragmatic in me can't deny that it's an attractive option (and I can't stress enough that I wasn't remotely close to the room where the discussion about what choice to make happened). "But it's a slippery slope". I can see that being a real concern. LLVM also moved its repository to GitHub (from a (I think) self-hosted Subversion server), and ended up moving off Bugzilla and Phabricator to GitHub issues and PRs four years later. As an occasional contributor to LLVM, I hate this move. I hate the GitHub review UI with a passion. At least, right now, GitHub PRs are not a viable option for Mozilla, for their lack of support for security related PRs, and the more general shortcomings in the review UI. That doesn't mean things won't change in the future, but let's not get too far ahead of ourselves. The move to Git has just been announced, and the migration has not even begun yet. Just because Mozilla is moving the Firefox repository to GitHub doesn't mean it's locked in forever or that all the eggs are going to be thrown into one basket. If bridges need to be crossed in the future, we'll see then. So, what's next? The official announcement said we're not expecting the migration to really begin until six months from now. I'll swim against the current here, and say this: the earlier you can switch to git, the earlier you'll find out what works and what doesn't work for you, whether you already know Git or not. While there is not one unique workflow, here's what I would recommend anyone who wants to take the leap off Mercurial right now:

Make sure git is installed. Chances are you already have it.

Install git-cinnabar where mach bootstrap would install it.

$ mkdir -p ~/.mozbuild/git-cinnabar
$ cd ~/.mozbuild/git-cinnabar
$ curl -sOL https://raw.githubusercontent.com/glandium/git-cinnabar/master/download.py
$ python3 download.py && rm download.py

Add git-cinnabar to your PATH. Make sure to also set that wherever you keep your PATH up-to-date (.bashrc or wherever else).
```
$ PATH=$PATH:$HOME/.mozbuild/git-cinnabar
```
Enter your mozilla-central or mozilla-unified Mercurial working copy, we'll do an in-place conversion, so that you don't need to move your mozconfigs, objdirs and what not.

Initialize the git repository from GitHub.

$ git init
$ git remote add origin https://github.com/mozilla/gecko-dev
$ git remote update origin

Switch to a Mercurial remote.

$ git remote set-url origin hg::https://hg.mozilla.org/mozilla-unified
$ git config --local remote.origin.cinnabar-refs bookmarks
$ git remote update origin --prune

Fetch your local Mercurial heads.
```
$ git -c cinnabar.refs=heads fetch hg::$PWD refs/heads/default/*:refs/heads/hg/*
```
This will create a bunch of hg/<sha1> local branches, not all relevant to you (some come from old branches on mozilla-central). Note that if you're using Mercurial MQ, this will not pull your queues, as they don't exist as heads in the Mercurial repo. You'd need to apply your queues one by one and run the command above for each of them.
Or, if you have bookmarks for your local Mercurial work, you can use this instead:
```
$ git -c cinnabar.refs=bookmarks fetch hg::$PWD refs/heads/*:refs/heads/hg/*
```
This will create hg/<bookmark_name> branches.
Now, make git know what commit your working tree is on.
```
$ git reset $(git cinnabar hg2git $(hg log -r . -T ' node '))
```
This will take a little moment because Git is going to scan all the files in the tree for the first time. On the other hand, it won't touch their content or timestamps, so if you had a build around, it will still be valid, and mach build won't rebuild anything it doesn't have to.

As there is no one-size-fits-all workflow, I won't tell you how to organize yourself from there. I'll just say this: if you know the Mercurial sha1s of your previous local work, you can create branches for them with:

$ git branch <branch_name> $(git cinnabar hg2git <hg_sha1>)

At this point, you should have everything available on the Git side, and you can remove the .hg directory. Or move it into some empty directory somewhere else, just in case. But don't leave it here, it will only confuse the tooling. Artifact builds WILL be confused, though, and you'll have to ./mach configure before being able to do anything. You may also hit bug 1865299 if your working tree is older than this post. If you have any problem or question, you can ping me on #git-cinnabar or #git on Matrix. I'll put the instructions above somewhere on wiki.mozilla.org, and we can collaboratively iterate on them. Now, what the announcement didn't say is that the Git repository WILL NOT be gecko-dev, doesn't exist yet, and WON'T BE COMPATIBLE (trust me, it'll be for the better). Why did I make you do all the above, you ask? Because that won't be a problem. I'll have you covered, I promise. The upcoming release of git-cinnabar 0.7.0-b1 will have a way to smoothly switch between gecko-dev and the future repository (incidentally, that will also allow to switch from a pure git-cinnabar clone to a gecko-dev one, for the git-cinnabar users who have kept reading this far). What about git-cinnabar? With Mercurial going the way of the dodo at Mozilla, my own need for git-cinnabar will vanish. Legitimately, this begs the question whether it will still be maintained. I can't answer for sure. I don't have a crystal ball. However, the needs of the transition itself will motivate me to finish some long-standing things (like finalizing the support for pushing merges, which is currently behind an experimental flag) or implement some missing features (support for creating Mercurial branches). Git-cinnabar started as a Python script, it grew a sidekick implemented in C, which then incorporated some Rust, which then cannibalized the Python script and took its place. It is now close to 90% Rust, and 10% C (if you don't count the code from Git that is statically linked to it), and has sort of become my Rust playground (it's also, I must admit, a mess, because of its history, but it's getting better). So the day to day use with Mercurial is not my sole motivation to keep developing it. If it were, it would stay stagnant, because all the features I need are there, and the speed is not all that bad, although I know it could be better. Arguably, though, git-cinnabar has been relatively stagnant feature-wise, because all the features I need are there. So, no, I don't expect git-cinnabar to die along Mercurial use at Mozilla, but I can't really promise anything either. Final words That was a long post. But there was a lot of ground to cover. And I still skipped over a bunch of things. I hope I didn't bore you to death. If I did and you're still reading... what's wrong with you? ;) So this is the end of Mercurial at Mozilla. So long, and thanks for all the fish. But this is also the beginning of a transition that is not easy, and that will not be without hiccups, I'm sure. So fasten your seatbelts (plural), and welcome the change. To circle back to the clickbait title, did I really kill Mercurial at Mozilla? Of course not. But it's like I stumbled upon a few sparks and tossed a can of gasoline on them. I didn't start the fire, but I sure made it into a proper bonfire... and now it has turned into a wildfire. And who knows? 15 years from now, someone else might be looking back at how Mozilla picked Git at the wrong time, and that, had we waited a little longer, we would have picked some yet to come new horse. But hey, that's the tech cycle for you.

12 October 2023

Jonathan McDowell: Installing Debian on the BananaPi M2 Zero

My previously mentioned C.H.I.P. repurposing has been partly successful; I ve found a use for it (which I still need to write up), but unfortunately it s too useful and the fact it s still a bit flaky has become a problem. I spent a while trying to isolate exactly what the problem is (I m still seeing occasional hard hangs with no obvious debug output in the logs or on the serial console), then realised I should just buy one of the cheap ARM SBC boards currently available. The C.H.I.P. is based on an Allwinner R8, which is a single ARM v7 core (an A8). So it s fairly low power by today s standards and it seemed pretty much any board would probably do. I considered a Pi 2 Zero, but couldn t be bothered trying to find one in stock at a reasonable price (I ve had one on backorder from CPC since May 2022, and yes, I know other places have had them in stock since but I don t need one enough to chase and I m now mostly curious about whether it will ever ship). As the title of this post gives away, I settled on a Banana Pi BPI-M2 Zero, which is based on an Allwinner H3. That s a quad-core ARM v7 (an A7), so a bit more oompfh than the C.H.I.P. All in all it set me back 25, including a set of heatsinks that form a case around it. I started with the vendor provided Debian SD card image, which is based on Debian 9 (stretch) and so somewhat old. I was able to dist-upgrade my way through buster and bullseye, and end up on bookworm. I then discovered the bookworm 6.1 kernel worked just fine out of the box, and even included a suitable DTB. Which got me thinking about whether I could do a completely fresh Debian install with minimal tweaking. First thing, a boot loader. The Allwinner chips are nice in that they ll boot off SD, so I just needed a suitable u-boot image. Rather than go with the vendor image I had a look at mainline and discovered it had support! So let s build a clean image:

noodles@buildhost:~$ mkdir ~/BPI
noodles@buildhost:~$ cd ~/BPI
noodles@buildhost:~/BPI$ ls
noodles@buildhost:~/BPI$ git clone https://source.denx.de/u-boot/u-boot.git
Cloning into 'u-boot'...
remote: Enumerating objects: 935825, done.
remote: Counting objects: 100% (5777/5777), done.
remote: Compressing objects: 100% (1967/1967), done.
remote: Total 935825 (delta 3799), reused 5716 (delta 3769), pack-reused 930048
Receiving objects: 100% (935825/935825), 186.15 MiB   2.21 MiB/s, done.
Resolving deltas: 100% (785671/785671), done.
noodles@buildhost:~/BPI$ mkdir u-boot-build
noodles@buildhost:~/BPI$ cd u-boot
noodles@buildhost:~/BPI/u-boot$ git checkout v2023.07.02
...
HEAD is now at 83cdab8b2c Prepare v2023.07.02
noodles@buildhost:~/BPI/u-boot$ make O=../u-boot-build bananapi_m2_zero_defconfig
  HOSTCC  scripts/basic/fixdep
  GEN     Makefile
  HOSTCC  scripts/kconfig/conf.o
  YACC    scripts/kconfig/zconf.tab.c
  LEX     scripts/kconfig/zconf.lex.c
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
make[1]: Leaving directory '/home/noodles/BPI/u-boot-build'
noodles@buildhost:~/BPI/u-boot$ cd ../u-boot-build/
noodles@buildhost:~/BPI/u-boot-build$ make CROSS_COMPILE=arm-linux-gnueabihf-
  GEN     Makefile
scripts/kconfig/conf  --syncconfig Kconfig
...
  LD      spl/u-boot-spl
  OBJCOPY spl/u-boot-spl-nodtb.bin
  COPY    spl/u-boot-spl.bin
  SYM     spl/u-boot-spl.sym
  MKIMAGE spl/sunxi-spl.bin
  MKIMAGE u-boot.img
  COPY    u-boot.dtb
  MKIMAGE u-boot-dtb.img
  BINMAN  .binman_stamp
  OFCHK   .config
noodles@buildhost:~/BPI/u-boot-build$ ls -l u-boot-sunxi-with-spl.bin
-rw-r--r-- 1 noodles noodles 494900 Aug  8 08:06 u-boot-sunxi-with-spl.bin

I had the advantage here of already having a host setup to cross build armhf binaries, but this was all done on a Debian bookworm host with packages from main. I ve put my build up here in case it s useful to someone - everything else below can be done on a normal x86_64 host. Next I needed a Debian installer. I went for the netboot variant - although I was writing it to SD rather than TFTP booting I wanted as much as possible to come over the network.

noodles@buildhost:~/BPI$ wget https://deb.debian.org/debian/dists/bookworm/main/installer-armhf/20230607%2Bdeb12u1/images/netboot/netboot.tar.gz
...
2023-08-08 10:15:03 (34.5 MB/s) -  netboot.tar.gz  saved [37851404/37851404]
noodles@buildhost:~/BPI$ tar -axf netboot.tar.gz

Then I took a suitable microSD card and set it up with a 500M primary VFAT partition, leaving the rest for Linux proper. I could have got away with a smaller VFAT partition but I d initially thought I might need to put some more installation files on it.

noodles@buildhost:~/BPI$ sudo fdisk /dev/sdb
Welcome to fdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): o
Created a new DOS (MBR) disklabel with disk identifier 0x793729b3.
Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p):
Using default response p.
Partition number (1-4, default 1):
First sector (2048-60440575, default 2048):
Last sector, +/-sectors or +/-size K,M,G,T,P  (2048-60440575, default 60440575): +500M
Created a new partition 1 of type 'Linux' and of size 500 MiB.
Command (m for help): t
Selected partition 1
Hex code or alias (type L to list all): c
Changed type of partition 'Linux' to 'W95 FAT32 (LBA)'.
Command (m for help): n
Partition type
   p   primary (1 primary, 0 extended, 3 free)
   e   extended (container for logical partitions)
Select (default p):
Using default response p.
Partition number (2-4, default 2):
First sector (1026048-60440575, default 1026048):
Last sector, +/-sectors or +/-size K,M,G,T,P  (534528-60440575, default 60440575):
Created a new partition 2 of type 'Linux' and of size 28.3 GiB.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
$ sudo mkfs -t vfat -n BPI-UBOOT /dev/sdb1
mkfs.fat 4.2 (2021-01-31)

The bootloader image gets written 8k into the SD card (our first partition starts at sector 2048, i.e. 1M into the device, so there s plenty of space here):

noodles@buildhost:~/BPI$ sudo dd if=u-boot-build/u-boot-sunxi-with-spl.bin of=/dev/sdb bs=1024 seek=8
483+1 records in
483+1 records out
494900 bytes (495 kB, 483 KiB) copied, 0.0282234 s, 17.5 MB/s

Copy the Debian installer files onto the VFAT partition:

noodles@buildhost:~/BPI$ cp -r debian-installer/ /media/noodles/BPI-UBOOT/

Unmount the SD from the build host, pop it into the M2 Zero, boot it up while connected to the serial console, hit a key to stop autoboot and tell it to boot the installer:

U-Boot SPL 2023.07.02 (Aug 08 2023 - 09:05:44 +0100)
DRAM: 512 MiB
Trying to boot from MMC1
U-Boot 2023.07.02 (Aug 08 2023 - 09:05:44 +0100) Allwinner Technology
CPU:   Allwinner H3 (SUN8I 1680)
Model: Banana Pi BPI-M2-Zero
DRAM:  512 MiB
Core:  60 devices, 17 uclasses, devicetree: separate
WDT:   Not starting watchdog@1c20ca0
MMC:   mmc@1c0f000: 0, mmc@1c10000: 1
Loading Environment from FAT... Unable to read "uboot.env" from mmc0:1...
In:    serial
Out:   serial
Err:   serial
Net:   No ethernet found.
Hit any key to stop autoboot:  0
=> setenv dibase /debian-installer/armhf
=> fatload mmc 0:1 $ kernel_addr_r  $ dibase /vmlinuz
5333504 bytes read in 225 ms (22.6 MiB/s)
=> setenv bootargs "console=ttyS0,115200n8"
=> fatload mmc 0:1 $ fdt_addr_r  $ dibase /dtbs/sun8i-h2-plus-bananapi-m2-zero.dtb
25254 bytes read in 7 ms (3.4 MiB/s)
=> fdt addr $ fdt_addr_r  0x40000
Working FDT set to 43000000
=> fatload mmc 0:1 $ ramdisk_addr_r  $ dibase /initrd.gz
31693887 bytes read in 1312 ms (23 MiB/s)
=> bootz $ kernel_addr_r  $ ramdisk_addr_r :$ filesize  $ fdt_addr_r 
Kernel image @ 0x42000000 [ 0x000000 - 0x516200 ]
## Flattened Device Tree blob at 43000000
   Booting using the fdt blob at 0x43000000
Working FDT set to 43000000
   Loading Ramdisk to 481c6000, end 49fffc3f ... OK
   Loading Device Tree to 48183000, end 481c5fff ... OK
Working FDT set to 48183000
Starting kernel ...

At this point the installer runs and you can do a normal install. Well, except the wifi wasn t detected, I think because the netinst images don t include firmware. I spent a bit of time trying to figure out how to include it but ultimately ended up installing over a USB ethernet dongle, which Just Worked and was less faff. Installing firmware-brcm80211 once installation completed allowed the built-in wifi to work fine. After install you need to configure u-boot to boot without intervention. At the u-boot prompt (i.e. after hitting a key to stop autoboot):

=> setenv bootargs "console=ttyS0,115200n8 root=LABEL=BPI-ROOT ro"
=> setenv bootcmd 'ext4load mmc 0:2 $ fdt_addr_r  /boot/sun8i-h2-plus-bananapi-m2-zero.dtb ; fdt addr $ fdt_addr_r  0x40000 ; ext4load mmc 0:2 $ kernel_addr_r  /boot/vmlinuz ; ext4load mmc 0:2 $ ramdisk_addr_r  /boot/initrd.img ; bootz $ kernel_addr_r  $ ramdisk_addr_r :$ filesize  $ fdt_addr_r '
=> saveenv
Saving Environment to FAT... OK
=> reset

This is assuming you have /boot on partition 2 on the SD - I left the first partition as VFAT (that s where the u-boot environment will be saved) and just used all of the rest as a single ext4 partition. I did have to do an e2label /dev/sdb2 BPI-ROOT to label / appropriately; otherwise I occasionally saw the SD card appear as mmc1 for Linux (I m guessing due to asynchronous boot order with the wifi). You should now find the device boots without intervention.

Reproducible Builds: Reproducible Builds in September 2023

Welcome to the September 2023 report from the Reproducible Builds project In these reports, we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.
Andreas Herrmann gave a talk at All Systems Go 2023 titled Fast, correct, reproducible builds with Nix and Bazel . Quoting from the talk description:

You will be introduced to Google s open source build system Bazel, and will learn how it provides fast builds, how correctness and reproducibility is relevant, and how Bazel tries to ensure correctness. But, we will also see where Bazel falls short in ensuring correctness and reproducibility. You will [also] learn about the purely functional package manager Nix and how it approaches correctness and build isolation. And we will see where Bazel has an advantage over Nix when it comes to providing fast feedback during development.

Andreas also shows how you can get the best of both worlds and combine Nix and Bazel, too. A video of the talk is available.

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb fixed compatibility with file(1) version 5.45 [ ] and updated some documentation [ ]. In addition, Vagrant Cascadian extended support for GNU Guix [ ][ ] and updated the version in that distribution as well. [ ].

Yet another reminder that our upcoming Reproducible Builds Summit is set to take place from October 31st November 2nd 2023 in Hamburg, Germany. If you haven t been before, our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. If you re interested in joining us this year, please make sure to read the event page, the news item, or the invitation email that Mattia Rizzolo sent out recently, all of which have more details about the event and location. We are also still looking for sponsors to support the event, so please reach out to the organising team if you are able to help. Also note that PackagingCon 2023 is taking place in Berlin just before our summit.
On the Reproducible Builds website, Greg Chabala updated the JVM-related documentation to update a link to the BUILDSPEC.md file. [ ] And Fay Stegerman fixed the builds failing because of a YAML syntax error.

Distribution work In Debian, this month:

Debian bug #940234 was originally opened in September 2019 by Aurelien Jarno to request that Debian add a new requirement that repeatedly building the source package in the same environment produces identical `.dsc` file modulo the GPG signature . This month, however, Russ Allbery closed the bug due to concerns about the viability of source reproducibility.

10 reviews of Debian packages were added, 56 were updated and 68 were removed this month adding to our knowledge about identified issues.

Bastian Blank posted a mail to a number of Debian mailing lists titled Upcoming changes to Debian Linux kernel packages . Under Kernel modules will be signed with an ephemeral key, Bastian wrote: Yes, this will make the build unreproducible, but no better solution currently exists.

Vagrant Cascadian posted about various experiments with verification builds for Debian and snapshotting the Debian archive

September saw F-Droid add ten new reproducible apps, and one existing app switched to reproducible builds. In addition, two reproducible apps were archived and one was disabled for a current total of 199 apps published with Reproducible Builds and using the upstream developer s signature. [ ] In addition, an extensive blog post was posted on f-droid.org titled Reproducible builds, signing keys, and binary repos .

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

openSUSE monthly

Bernhard M. Wiedemann:

`linuxsampler` (benchmarking issue)

`antlr3` (date)

`rpm` (embeds too many build details)

`seamonkey` (date)

`conky` (date and ordering-related issue)

`lsp-plugins-shared` (date/copyright year issue)

`build-compare`

`helix` (ASLR-related non-determinism)

`intel-graphics-compiler` (ASLR)

Chris Lamb:

#1052950 filed against `sphinxcontrib-mermaid`.

#1052970 filed against `mkdocs-material`.

#1052989 filed against `apophenia`.

#1053205 filed against `lapackpp`.

#1053263 filed against `blaspp`.

Fridrich Strba applied 124 Java-related patches into openSUSE:

`mysql-connector-java`, `java-21-openjdk`, `apache-ivy`, `maven-assembly-plugin`, `eclipse`, `antlr3`, `groovy18`, `hbci4java`, `ini4j`, `hppc`, `checkstyle`, `glassfish-jaxb`, `tycho`, `xmvn`, `mockito`, `languagetool`, `json-lib`, `jnr-unixsocket`, `jnr-ffi`, `jnr-enxio`, `jboss-jaxrs-2.0-api`, `istack-commons`, `rxtx-java`, `glassfish-jaxb`, `glassfish-hk2`, `findbugs`, `docker-client-java`, `maven`, `xmvn-connector-ivy`, `xmlstreambuffer`, `checkstyle`, `cglib`, `bean-validation-api`, `aws-sdk-java`, `javapackages-tools`, `ant`, `scala`, `osgi-service-log`, `jmdns`, `xml-security`, `super-csv`, `osgi-service-jdbc`, `msv`, `junit5`, `jsr-311`, `jersey`, `itextpdf`, `httpcomponents-asyncclient`, `ed25519-java`, `jnacl`, `javaparser`, `picocli`, `freemarker`, `extra166y`, `javaparser`, `xstream`, `woodstox-core`, `uom-lib`, `unit-api`, `uncommons-maths`, `tycho`, `treelayout`, `tiger-types`, `super-csv`, `stax-ex`, `stax2-api`, `sqlite-jdbc`, `reflectasm`, `prometheus-simpleclient-java`, `powermock`, `paranamer`, `opennlp`, `netty3`, `mybatis`, `morfologik-stemming`, `minlog`, `maven-archetype`, `mariadb-java-client`, `logback`, `kryo`, `jsonp`, `jopt-simple`, `jnr-posix`, `jnr-constants`, `jnr-a64asm`, `jfreechart`, `jffi`, `jetty-schemas`, `jetty-minimal`, `jeromq`, `jctools`, `jcsp`, `jboss-websocket-1.0-api`, `jboss-marshalling`, `jboss-logmanager`, `jboss-logging`, `javaewah`, `jatl`, `janino`, `jackson-modules-base`, `jackson-jaxrs-providers`, `jackson-datatypes-collections`, `jackson-dataformat-xml`, `jackson-dataformats-text`, `jackson-dataformats-binary`, `indriya`, `google-gson`, `glassfish-websocket-api`, `glassfish-transaction-api`, `glassfish-jsp`, `glassfish-jax-rs-api`, `glassfish-hk2`, `glassfish-fastinfoset`, `felix-scr`, `felix-gogo-shell`, `felix-gogo-command`, `disruptor`, `apache-commons-ognl`, `apache-commons-math`, `apache-commons-csv`, `antlr4`, `jettison`, `sisu`, `maven`

Pol Dellaiera:

Since version 2.6.4, Composer (the PHP package manager) is now fully reproducible. See the work in the corresponding PR.

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In August, a number of changes were made by Holger Levsen:

Disable `armhf` and `i386` builds due to Debian bug #1052257. [ ][ ][ ][ ]

Run diffoscope with a lower `ionice` priority. [ ]

Log every build in a simple text file [ ] and create persistent stamp files when running diffoscope to ease debugging [ ].

Run schedulers one hour after `dinstall` again. [ ]

Temporarily use diffoscope from the host, and not from a `schroot` running the tested suite. [ ][ ]

Fail the diffoscope distribution test if the diffoscope version cannot be determined. [ ]

Fix a spelling error in the email to IRC gateway. [ ]

Force (and document) the reconfiguration of all jobs, due to the recent rise of zombies. [ ][ ][ ][ ]

Deal with a rare condition when killing processes which should not be there. [ ]

Install the Debian backports kernel in an attempt to address Debian bug #1052257. [ ][ ]

In addition, Mattia Rizzolo fixed a call to `diffoscope --version` (as suggested by Fay Stegerman on our mailing list) [ ], worked on an openQA credential issue [ ] and also made some changes to the machine-readable reproducible metadata, `reproducible-tracker.json` [ ]. Lastly, Roland Clobus added instructions for manual configuration of the openQA secrets [ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Mailing list: `rb-general@lists.reproducible-builds.org`

Mastodon: @reproducible_builds

Twitter: @ReproBuilds

6 October 2023

Emanuele Rocca: Custom Debian Installer and Kernel on a USB stick

There are many valid reasons to create a custom Debian Installer image. You may need to pass some special arguments to the kernel, use a different GRUB version, automate the installation by means of preseeding, use a custom kernel, or modify the installer itself.

If you have a EFI system, which is probably the case in 2023, there is no need to learn complex procedures in order to create a custom Debian Installer stick.

The source of many frustrations is that the ISO format for CDs/DVDs is read-only, but you can just create a VFAT filesystem on a USB stick, copy all ISO contents onto the stick itself, and modify things at will.

Create a writable USB stick

First create a FAT32 filesystem on the removable device and mount it. The device is sdX in the example.

$ sudo parted --script /dev/sdX mklabel msdos
$ sudo parted --script /dev/sdX mkpart primary fat32 0% 100%
$ sudo mkfs.vfat /dev/sdX1
$ sudo mount /dev/sdX1 /mnt/data/

Then copy to the USB stick the installer ISO you would like to modify, debian-testing-amd64-netinst.iso here.

$ sudo kpartx -v -a debian-testing-amd64-netinst.iso
# Mount the first partition on the ISO and copy its contents to the stick
$ sudo mount /dev/mapper/loop0p1 /mnt/cdrom/
$ sudo rsync -av /mnt/cdrom/ /mnt/data/
$ sudo umount /mnt/cdrom
# Same story with the second partition on the ISO
$ sudo mount /dev/mapper/loop0p2 /mnt/cdrom/
$ sudo rsync -av /mnt/cdrom/ /mnt/data/
$ sudo umount /mnt/cdrom
$ sudo kpartx -d debian-testing-amd64-netinst.iso
$ sudo umount /mnt/data

Now try booting from the USB stick just to verify that everything went well and we can start customizing the image.

Boot loader, preseeding, installer hacks

The easiest things we can change now are the shim, GRUB, and GRUB s configuration. The USB stick contains the shim under /EFI/boot/bootx64.efi, while GRUB is at /EFI/boot/grubx64.efi. This means that if you want to test a different shim / GRUB version, you just replace the relevant files. That s it. Take for example /usr/lib/grub/x86_64-efi/monolithic/grubx64.efi from the package grub-efi-amd64-bin, or the signed version from grub-efi-amd64-signed and copy them under /EFI/boot/grubx64.efi. Or perhaps you want to try out systemd-boot? Then take /usr/lib/systemd/boot/efi/systemd-bootx64.efi from the package systemd-boot-efi, copy it to /EFI/boot/bootx64.efi and you re good to go. Figuring out the right systemd-boot configuration needed to start the Installer is left as an exercise.

By editing /boot/grub/grub.cfg you can pass arbitrary arguments to the kernel and the Installer itself. See the official Installation Guide for a comprehensive list of boot parameters.

One very commong thing to do is automating the installation using a preseed file. Add the following to the kernel command line: preseed/file=/cdrom/preseed.cfg and create a /preseed.cfg file on the USB stick. As a little example:

d-i time/zone select Europe/Rome
d-i passwd/root-password this-is-the-root-password
d-i passwd/root-password-again this-is-the-root-password
d-i passwd/user-fullname string Emanuele Rocca
d-i passwd/username string ema
d-i passwd/user-password password lol-haha-uh
d-i passwd/user-password-again password lol-haha-uh
d-i apt-setup/no_mirror boolean true
d-i popularity-contest/participate boolean true
tasksel tasksel/first multiselect standard

See Steve McIntyre s awesome page with the full list of available settings and their description: https://preseed.einval.com/debian-preseed/.

Two noteworthy settings are early_command and late_command. They can be used to execute arbitrary commands and provide thus extreme flexibility! You can go as far as replacing parts of the installer with a sed command, or maybe wgetting an entirely different file. This is a fairly easy way to test minor Installer patches. As an example, I ve once used this to test a patch to grub-installer:

d-i partman/early_command string wget https://people.debian.org/~ema/grub-installer-1035085-1 -O /usr/bin/grub-installer

Finally, the initrd contains all early stages of the installer. It s easy to unpack it, modify whatever component you like, and repack it. Say you want to change a given udev rule:

$ mkdir /tmp/new-initrd
$ cd /tmp/new-initrd
$ zstdcat /mnt/data/install.a64/initrd.gz   sudo cpio -id
$ vi lib/udev/rules.d/60-block.rules
$ find .   cpio -o -H newc   zstd --stdout > /mnt/data/install.a64/initrd.gz

Custom udebs

From a basic architectural standpoint the Debian Installer can be seen as an initrd that loads a series of special Debian packages called udebs. In the previous section we have seen how to (ab)use early_command to replace one of the scripts used by the Installer, namely grub-installer. It turns out that such script is installed by a udeb, so let s do things right and build a new Installer ISO with our custom grub udeb.

Fetch the code for the grub-installer udeb, make your changes and build it with a classic dpkg-buildpackage -rfakeroot.

Then get the Installer code and install all dependencies:

$ git clone https://salsa.debian.org/installer-team/debian-installer/
$ cd debian-installer/
$ sudo apt build-dep .

Now add the grub-installer udeb to the localudebs directory and create a new netboot image:

$ cp /path/to/grub-installer_1.198_arm64.udeb build/localudebs/
$ cd build
$ fakeroot make clean_netboot build_netboot

Give it some time, soon enough you ll have a brand new ISO to test under dest/netboot/mini.iso.

Custom kernel

Perhaps there s a kernel configuration option you need to enable, or maybe you need a more recent kernel version than what is available in sid.

The Debian Linux Kernel Handbook has all the details for how to do things properly, but here s a quick example.

Get the Debian kernel packaging from salsa and generate the upstream tarball:

$ git clone https://salsa.debian.org/kernel-team/linux/
$ ./debian/bin/genorig.py https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

For RC kernels use the repo from Linus instead of linux-stable.

Now do your thing, for instance change a config setting by editing debian/config/amd64/config. Don t worry about where you put it in the file, there s a tool from https://salsa.debian.org/kernel-team/kernel-team to fix that:

$ /path/to/kernel-team/utils/kconfigeditor2/process.py .

Now build your kernel:

$ export MAKEFLAGS=-j$(nproc)
$ export DEB_BUILD_PROFILES='pkg.linux.nokerneldbg pkg.linux.nokerneldbginfo pkg.linux.notools nodoc'
$ debian/rules orig
$ debian/rules debian/control
$ dpkg-buildpackage -b -nc -uc

After some time, if everything went well, you should get a bunch of .deb files as well as a .changes file, linux_6.6~rc3-1~exp1_arm64.changes here. To generate the udebs used by the Installer you need to first get a linux-signed .dsc file, and then build it with sbuild in this example:

$ /path/to/kernel-team/scripts/debian-test-sign linux_6.6~rc3-1~exp1_arm64.changes
$ sbuild --dist=unstable --extra-package=$PWD linux-signed-arm64_6.6~rc3+1~exp1.dsc

Excellent, now you should have a ton of .udebs. To build a custom installer image with this kernel, copy them all under debian-installer/build/localudebs/ and then run

fakeroot make clean_netboot
build_netboot

as described in the previous section. In case you are trying to use a different kernel version from what is currently in sid, you will have to install the linux-image package on the system building the ISO, and change LINUX_KERNEL_ABI in build/config/common. The linux-image dependency in debian/control probably needs to be tweaked as well.

That s it, the new Installer ISO should boot with your custom kernel!

There is going to be another minor obstacle though, as anna will complain that your new kernel cannot be found in the archive. Copy the kernel udebs you have built onto a vfat formatted USB stick, switch to a terminal, and install them all with udpkg:

~ # udpkg -i *.udeb

Now the installation should proceed smoothly.

16 September 2023

Sergio Talens-Oliag: GitLab CI/CD Tips: Using a Common CI Repository with Assets

This post describes how to handle files that are used as assets by jobs and pipelines defined on a common gitlab-ci repository when we include those definitions from a different project.

Problem descriptionWhen a .giltlab-ci.yml file includes files from a different repository its contents are expanded and the resulting code is the same as the one generated when the included files are local to the repository. In fact, even when the remote files include other files everything works right, as they are also expanded (see the description of how included files are merged for a complete explanation), allowing us to organise the common repository as we want. As an example, suppose that we have the following script on the assets/ folder of the common repository:

dumb.sh

#!/bin/sh
echo "The script arguments are: '$@'"

If we run the following job on the common repository:

job:
  script:
    - $CI_PROJECT_DIR/assets/dumb.sh ARG1 ARG2

the output will be:

The script arguments are: 'ARG1 ARG2'

But if we run the same job from a different project that includes the same job definition the output will be different:

/scripts-23-19051/step_script: eval: line 138: d./assets/dumb.sh: not found

The problem here is that we include and expand the YAML files, but if a script wants to use other files from the common repository as an asset (configuration file, shell script, template, etc.), the execution fails if the files are not available on the project that includes the remote job definition.

SolutionsWe can solve the issue using multiple approaches, I ll describe two of them:

Create files using scripts
Download files from the common repository

Create files using scriptsOne way to dodge the issue is to generate the non YAML files from scripts included on the pipelines using HERE documents. The problem with this approach is that we have to put the content of the files inside a script on a YAML file and if it uses characters that can be replaced by the shell (remember, we are using HERE documents) we have to escape them (error prone) or encode the whole file into base64 or something similar, making maintenance harder. As an example, imagine that we want to use the dumb.sh script presented on the previous section and we want to call it from the same PATH of the main project (on the examples we are using the same folder, in practice we can create a hidden folder inside the project directory or use a PATH like /tmp/assets-$CI_JOB_ID to leave things outside the project folder and make sure that there will be no collisions if two jobs are executed on the same place (i.e. when using a ssh runner). To create the file we will use hidden jobs to write our script template and reference tags to add it to the scripts when we want to use them. Here we have a snippet that creates the file with cat:

.file_scripts:
  create_dumb_sh:
    -  
      # Create dumb.sh script
      mkdir -p "$ CI_PROJECT_DIR /assets"
      cat >"$ CI_PROJECT_DIR /assets/dumb.sh" <<EOF
      #!/bin/sh
      echo "The script arguments are: '\$@'"
      EOF
      chmod +x "$ CI_PROJECT_DIR /assets/dumb.sh"

Note that to make things work we ve added 6 spaces before the script code and escaped the dollar sign. To do the same using base64 we replace the previous snippet by this:

.file_scripts:
  create_dumb_sh:
    -  
      # Create dumb.sh script
      mkdir -p "$ CI_PROJECT_DIR /assets"
      base64 -d >"$ CI_PROJECT_DIR /assets/dumb.sh" <<EOF
      IyEvYmluL3NoCmVjaG8gIlRoZSBzY3JpcHQgYXJndW1lbnRzIGFyZTogJyRAJyIK
      EOF
      chmod +x "$ CI_PROJECT_DIR /assets/dumb.sh"

Again, we have to indent the base64 version of the file using 6 spaces (all lines of the base64 output have to be indented) and to make changes we have to decode and re-code the file manually, making it harder to maintain. With either version we just need to add a !reference before using the script, if we add the call on the first lines of the before_script we can use the downloaded file in the before_script, script or after_script sections of the job without problems:

job:
  before_script:
    - !reference [.file_scripts, create_dumb_sh]
  script:
    - $ CI_PROJECT_DIR /assets/dumb.sh ARG1 ARG2

The output of a pipeline that uses this job will be the same as the one shown in the original example:

The script arguments are: 'ARG1 ARG2'

Download the files from the common repositoryAs we ve seen the previous solution works but is not ideal as it makes the files harder to read, maintain and use. An alternative approach is to keep the assets on a directory of the common repository (in our examples we will name it assets) and prepare a YAML file that declares some variables (i.e. the URL of the templates project and the PATH where we want to download the files) and defines a script fragment to download the complete folder. Once we have the YAML file we just need to include it and add a reference to the script fragment at the beginning of the before_script of the jobs that use files from the assets directory and they will be available when needed. The following file is an example of the YAML file we just mentioned:

bootstrap.yml

variables:
  CI_TMPL_API_V4_URL: "$ CI_API_V4_URL /projects/common%2Fci-templates"
  CI_TMPL_ARCHIVE_URL: "$ CI_TMPL_API_V4_URL /repository/archive"
  CI_TMPL_ASSETS_DIR: "/tmp/assets-$ CI_JOB_ID "
.scripts_common:
  bootstrap_ci_templates:
    -  
      # Downloading assets
      echo "Downloading assets"
      mkdir -p "$CI_TMPL_ASSETS_DIR"
      wget -q -O - --header="PRIVATE-TOKEN: $CI_TMPL_READ_TOKEN" \
        "$CI_TMPL_ARCHIVE_URL?path=assets&sha=$ CI_TMPL_REF:-main "  
        tar --strip-components 2 -C "$ASSETS_DIR" -xzf -

The file defines the following variables:

CI_TMPL_API_V4_URL: URL of the common project, in our case we are using the project ci-templates inside the common group (note that the slash between the group and the project is escaped, that is needed to reference the project by name, if we don t like that approach we can replace the url encoded path by the project id, i.e. we could use a value like $ CI_API_V4_URL /projects/31)
CI_TMPL_ARCHIVE_URL: Base URL to use the gitlab API to download files from a repository, we will add the arguments path and sha to select which sub path to download and from which commit, branch or tag (we will explain later why we use the CI_TMPL_REF, for now just keep in mind that if it is not defined we will download the version of the files available on the main branch when the job is executed).
CI_TMPL_ASSETS_DIR: Destination of the downloaded files.

And uses variables defined in other places:

CI_TMPL_READ_TOKEN: token that includes the read_api scope for the common project, we need it because the tokens created by the CI/CD pipelines of other projects can t be used to access the api of the common one.We define the variable on the gitlab CI/CD variables section to be able to change it if needed (i.e. if it expires)
CI_TMPL_REF: branch or tag of the common repo from which to get the files (we need that to make sure we are using the right version of the files, i.e. when testing we will use a branch and on production pipelines we can use fixed tags to make sure that the assets don t change between executions unless we change the reference).We will set the value on the .gitlab-ci.yml file of the remote projects and will use the same reference when including the files to make sure that everything is coherent.

This is an example YAML file that defines a pipeline with a job that uses the script from the common repository:

pipeline.yml

include:
  - /bootstrap.yaml
stages:
  - test
dumb_job:
  stage: test
  before_script:
    - !reference [.bootstrap_ci_templates, create_dumb_sh]
  script:
    - $ CI_TMPL_ASSETS_DIR /dumb.sh ARG1 ARG2

To use it from an external project we will use the following gitlab ci configuration:

gitlab-ci.yml

include:
  - project: 'common/ci-templates'
    ref: &ciTmplRef 'main'
    file: '/pipeline.yml'
variables:
  CI_TMPL_REF: *ciTmplRef

Where we use a YAML anchor to ensure that we use the same reference when including and when assigning the value to the CI_TMPL_REF variable (as far as I know we have to pass the ref value explicitly to know which reference was used when including the file, the anchor allows us to make sure that the value is always the same in both places). The reference we use is quite important for the reproducibility of the jobs, if we don t use fixed tags or commit hashes as references each time a job that downloads the files is executed we can get different versions of them. For that reason is not a bad idea to create tags on our common repo and use them as reference on the projects or branches that we want to behave as if their CI/CD configuration was local (if we point to a fixed version of the common repo the way everything is going to work is almost the same as having the pipelines directly in our repo). But while developing pipelines using branches as references is a really useful option; it allows us to re-run the jobs that we want to test and they will download the latest versions of the asset files on the branch, speeding up the testing process. However keep in mind that the trick only works with the asset files, if we change a job or a pipeline on the YAML files restarting the job is not enough to test the new version as the restart uses the same job created with the current pipeline. To try the updated jobs we have to create a new pipeline using a new action against the repository or executing the pipeline manually.

ConclusionFor now I m using the second solution and as it is working well my guess is that I ll keep using that approach unless giltab itself provides a better or simpler way of doing the same thing.

12 July 2023

Reproducible Builds: Reproducible Builds in June 2023

Welcome to the June 2023 report from the Reproducible Builds project In our reports, we outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.

We are very happy to announce the upcoming Reproducible Builds Summit which set to take place from October 31st November 2nd 2023, in the vibrant city of Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. If you re interesting in joining us this year, please make sure to read the event page] which has more details about the event and location. (You may also be interested in attending PackagingCon 2023 held a few days before in Berlin.)

This month, Vagrant Cascadian will present at FOSSY 2023 on the topic of Breaking the Chains of Trusting Trust:

Corrupted build environments can deliver compromised cryptographically signed binaries. Several exploits in critical supply chains have been demonstrated in recent years, proving that this is not just theoretical. The most well secured build environments are still single points of failure when they fail. [ ] This talk will focus on the state of the art from several angles in related Free and Open Source Software projects, what works, current challenges and future plans for building trustworthy toolchains you do not need to trust.

Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, FOSSY aims to be a community-focused event: Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you . More information on the event is available on the FOSSY 2023 website, including the full programme schedule.

Marcel Fourn , Dominik Wermke, William Enck, Sascha Fahl and Yasemin Acar recently published an academic paper in the 44th IEEE Symposium on Security and Privacy titled It s like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security . The abstract reads as follows:

The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created.

However, in contrast to other papers that touch on some theoretical aspect of reproducible builds, the authors paper takes a different approach. Starting with the observation that much of the software industry believes R-Bs are too far out of reach for most projects and conjoining that with a goal of to help identify a path for R-Bs to become a commonplace property , the paper has a different methodology:

We conducted a series of 24 semi-structured expert interviews with participants from the Reproducible-Builds.org project, and iterated on our questions with the reproducible builds community. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We identify experiences that help and hinder adoption, which heavily include communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.

A PDF of the paper is now available, as is an entry on the CISPA Helmholtz Center for Information Security website and an entry under the TeamUSEC Human-Centered Security research group.

On our mailing list this month:

Martin Monperrus asked whether there are any project[s] where reproducibility is checked in a continuous integration pipeline? which received a number of replies from various projects.
Vagrant Cascadian mentioned that Packaging Con 2023 is being held in Berlin, the weekend before the Reproducible Builds summit later this year. In particular, Vagrant noticed that the Call for Proposals (CFP) closes at the end of July.
Larry Doolittle was searching Usenet archives and discovered a thread from December 1999 titled Time independent checksum(cksum) on comp.unix.programming. Larry notes that it starts with Jayan asking about comparing binaries that might have difference in their embedded timestamps (that is, perhaps, Foreshadowing diffoscope, amiright? ) and goes on to observe that:

The antagonist is David Schwartz, who correctly says There are dozens of complex reasons why what seems to be the same sequence of operations might produce different end results, but goes on to say I totally disagree with your general viewpoint that compilers must provide for reproducability [sic]. Dwight Tovey and I (Larry Doolittle) argue for reproducible builds. I assert Any program especially a mission-critical program like a compiler that cannot reproduce a result at will is broken. Also it s commonplace to take a binary from the net, and check to see if it was trojaned by attempting to recreate it from source.

Lastly, there were a few changes to our website this month too, including Bernhard M. Wiedemann adding a simplified Rust example to our documentation about the SOURCE_DATE_EPOCH environment variable [ ], Chris Lamb made it easier to parse our summit announcement at a glance [ ], Mattia Rizzolo added the summit announcement at a glance [ ] itself [ ][ ][ ] and Rahul Bajaj added a taxonomy of variations in build environments [ ].

Distribution work 27 reviews of Debian packages were added, 40 were updated and 8 were removed this month adding to our knowledge about identified issues. A new `randomness_in_documentation_generated_by_mkdocs` toolchain issue was added by Chris Lamb [ ], and the `deterministic` flag on the `paths_vary_due_to_usrmerge` issue as we are not currently testing `usrmerge` issues [ ] issues.
Roland Clobus posted his 18th update of the status of reproducible Debian ISO images on our mailing list. Roland reported that all major desktops build reproducibly with `bullseye`, `bookworm`, `trixie` and `sid` , but he also mentioned amongst many changes that not only are the `non-free` images being built (and are reproducible) but that the live images are generated officially by Debian itself. [ ]
Jan-Benedict Glaw noticed a problem when building NetBSD for the VAX architecture. Noting that Reproducible builds [are] probably not as reproducible as we thought , Jan-Benedict goes on to describe that when two builds from different source directories won t produce the same result and adds various notes about sub-optimal handling of the `CFLAGS` environment variable. [ ]
F-Droid added 21 new reproducible apps in June, resulting in a new record of 145 reproducible apps in total. [ ]. (This page now sports missing data for March May 2023.) F-Droid contributors also reported an issue with broken resources in APKs making some builds unreproducible. [ ]
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE

Upstream patches

Bernhard M. Wiedemann:

`bcachefs` (sort find / filesys)

`build-compare` (reports files as identical)

`build-time` (toolchain date)

`cockpit` (merged, gzip mtime)

`gcc13` (gcc13 toolchain LTO parallelism)

`ghc-rpm-macros` (toolchain parallelism)

`golangcli-lint` (date)

`gutenprint` (date+time)

`mage` (date (golang))

`mumble` (filesys)

`pcr` (date)

`python-nss` (drop sphinx .doctrees)

`python310` (merged, bisected+backported)

`warpinator` (merged, date)

`xroachng` (date)

Chris Lamb:

#1037159 filed against `elinks`.

#1037205 filed against `multipath-tools`.

#1037216 filed against `mkdocstrings-python-handlers`.

#1038730 filed against `fribidi`.

#1038957 filed against `jtreg7`.

#1039932 filed against `python-bitstring` (forwarded upstream).

Vagrant Cascadian:

#1037235 filed against `gradle-kotlin-dsl`.

#1037240 filed against `libsdl-console`.

#1037267 filed against `kawari8`.

#1037269 filed against `freetds`.

#1037272 filed against `gbrowse`.

#1037276 filed against `bglibs`.

#1037277 filed against `advi`.

#1037278 filed against `afterstep`.

#1037280 filed against `simstring`.

#1037296 filed against `manderlbot`.

#1037298 filed against `erlang-proper`.

#1037300 filed against `comedilib`.

#1037309 filed against `libint`.

#1037427 filed against `newlib`.

#1038908 filed against `binutils-msp430`.

#1038970 & #1038971 filed against `c-munipack`.

#1039044 filed against `python-marshmallow-sqlalchemy`.

#1039457 filed against `mplayer`.

#1039493 & #1039500 filed against `menu`.

#1039506 filed against `mini-buildd`.

#1039610 filed against `pnetcdf`.

#1039617 & #1039618 filed against `liblopsub`.

#1039619 filed against `wcc`.

#1039623 filed against `shotcut`.

#1039624 filed against `icu`.

#1039741 filed against `libapache-poi-java`.

#1039808 filed against `atf`.

#1039865 filed against `valgrind`.

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:

Additions to a (relatively) new Documented Jenkins Maintenance (djm) script to automatically shrink a cache & save a backup of old data [ ], automatically split out previous months data from logfiles into specially-named files [ ], prevent concurrent remote logfile fetches by using a lock file [ ] and to add/remove various debugging statements [ ].

Updates to the automated system health checks to, for example, to correctly detect new kernel warnings due to a wording change [ ] and to explicitly observe which old/unused kernels should be removed [ ]. This was related to an improvement so that various kernel issues on Ubuntu-based nodes are automatically fixed. [ ]

Holger and Vagrant Cascadian updated all thirty-five hosts running Debian on the `amd64`, `armhf`, and `i386` architectures to Debian bookworm, with the exception of the Jenkins host itself which will be upgraded after the release of Debian 12.1. In addition, Mattia Rizzolo updated the email configuration for the `@reproducible-builds.org` domain to correctly accept incoming mails from `jenkins.debian.net` [ ] as well as to set up DomainKeys Identified Mail (DKIM) signing [ ]. And working together with Holger, Mattia also updated the Jenkins configuration to start testing Debian trixie which resulted in stopped testing Debian buster. And, finally, Jan-Benedict Glaw contributed patches for improved NetBSD testing.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Mailing list: `rb-general@lists.reproducible-builds.org`

Twitter: @ReproBuilds

9 July 2023

Vasudev Kamath: Using LUKS-Encrypted USB Stick with TPM2 Integration

I use a LUKS-encrypted USB stick to store my GPG and SSH keys, which acts as a backup and portable key setup when working on different laptops. One inconvenience with LUKS-encrypted USB sticks is that you need to enter the password every time you want to mount the device, either through a Window Manager like KDE or using the cryptsetup luksOpen command. Fortunately, many laptops nowadays come equipped with TPM2 modules, which can be utilized to automatically decrypt the device and subsequently mount it. In this post, we'll explore the usage of systemd-cryptenroll for this purpose, along with udev rules and a set of scripts to automate the mounting of the encrypted USB. First, ensure that your device has a TPM2 module. You can run the following command to check:

sudo journalctl -k --grep=tpm2

The output should resemble the following:

Jul 08 18:57:32 bhairava kernel: ACPI: SSDT 0x00000000BBEFC000 0003C6 (v02
LENOVO Tpm2Tabl 00001000 INTL 20160422) Jul 08 18:57:32 bhairava kernel:
ACPI: TPM2 0x00000000BBEFB000 000034 (v03 LENOVO TP-R0D 00000830
PTEC 00000002) Jul 08 18:57:32 bhairava kernel: ACPI: Reserving TPM2 table
memory at [mem 0xbbefb000-0xbbefb033]

You can also use the systemd-cryptenroll command to check for the availability of a TPM2 device on your laptop:

systemd-cryptenroll --tpm2-device=list

The output will be something like following:

blog git:(master) systemd-cryptenroll --tpm2-device=list
PATH        DEVICE      DRIVER
/dev/tpmrm0 MSFT0101:00 tpm_tis
   blog git:(master)

Next, ensure that you have connected your encrypted USB device. Note that systemd-cryptenroll only works with LUKS2 and not LUKS1. If your device is LUKS1-encrypted, you may encounter an error while enrolling the device, complaining about the LUKS2 superblock not found. To determine if your device uses a LUKS1 header or LUKS2, use the cryptsetup luksDump <device> command. If it is LUKS1, the header will begin with:

LUKS header information for /dev/sdb1
Version:        1
Cipher name:    aes
Cipher mode:    xts-plain64
Hash spec:      sha256
Payload offset: 4096

Converting from LUKS1 to LUKS2 is a simple process, but for safety, ensure that you backup the header using the cryptsetup luksHeaderBackup command. Once backed up, use the following command to convert the header to LUKS2:

sudo cryptsetup convert --type luks2 /dev/sdb1

After conversion, the header will look like this:

Version:        2
Epoch:          4
Metadata area:  16384 [bytes]
Keyslots area:  2064384 [bytes]
UUID:           000b2670-be4a-41b4-98eb-9adbd12a7616
Label:          (no label)
Subsystem:      (no subsystem)
Flags:          (no flags)

The next step is to enroll the new LUKS key for the encrypted device using systemd-cryptenroll. Run the following command:

sudo systemd-cryptenroll --tpm2-device=/dev/tpmrm0 --tpm2-pcrs="0+7" /dev/sdb1

This command will prompt you to provide the existing key to unseal the device. It will then add a new random key to the volume, allowing it to be unlocked in addition to the existing keys. Additionally, it will bind this new key to PCRs 0 and 7, representing the system firmware and Secure Boot state. If there is only one TPM device on the system, you can use --tpm2-device=auto to automatically select the device. To confirm that the new key has been enrolled, you can dump the LUKS configuration and look for a systemd-tpm2 token entry, as well as an additional entry in the Keyslots section. To test the setup, you can use the /usr/lib/systemd/systemd-cryptsetup command. Additionally, you can check if the device is unsealed by using lsblk:

sudo /usr/lib/systemd/systemd-cryptsetup attach GPG_USB "/dev/sdb1" - tpm2-device=auto
lsblk

The lsblk command should display the unsealed and mounted device, like this:

NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda           8:0    0 223.6G  0 disk
 sda1        8:1    0   976M  0 part  /boot/efi
 sda2        8:2    0 222.6G  0 part
   root    254:0    0 222.6G  0 crypt /
sdb           8:16   1   7.5G  0 disk
 sdb1        8:17   1   7.5G  0 part
   GPG_USB 254:1    0   7.5G  0 crypt /media/vasudev/GPG_USB

Auto Mounting the device Now that we have solved the initial problem of unsealing the USB device using TPM2 instead of manually entering the key, the next step is to automatically mount the device upon insertion and remove the mapping when the device is removed. This can be achieved using the following udev rules:

ACTION=="add", KERNEL=="sd*", ENV DEVTYPE =="partition", ENV ID_BUS =="usb", ENV SYSTEMD_WANTS ="mount-gpg-usb@$env DEVNAME .service"
ACTION=="remove", KERNEL=="sd*", ENV DEVTYPE =="partition", ENV ID_BUS =="usb", RUN+="/usr/local/bin/umount_enc_usb.sh '%E ID_FS_UUID '"

When a device is added, a systemd service is triggered to mount the device at a specific location. Initially, I used a script with the RUN directive, but it resulted in an exit code of 32. This might be due to systemd-cryptsetup taking some time to return, causing udev to time out. To address this, I opted to use a systemd service instead. For device removal, even though the physical device is no longer present, the mapping may still remain, causing issues upon reinsertion. To resolve this, I created a script to close the luks mapping upon device removal. Below are the systemd service and script files: mount_enc_usb.sh:

#!/bin/bash
set -x
if [[ "$#" -ne 1 ]]; then
    echo "$(basename $0) <device>"
    exit 1
fi
device_uuid="$(blkid --output udev $1   grep ID_FS_UUID=   cut -d= -f2)"
if [[ "$device_uuid" == 000b2670-be4a-41b4-98eb-9adbd12a7616 ]]; then
    # Found our device, let's trigger systemd-cryptsetup
    /usr/lib/systemd/systemd-cryptsetup attach GPG_USB "$1" - tpm2-device=auto
    [[ -d /media/vasudev/GPG_USB ]]   (mkdir -p /media/vasudev/GPG_USB/ && chown vasudev:vasudev /media/vasudev/GPG_USB)
    mount /dev/mapper/GPG_USB /media/vasudev/GPG_USB
else
    echo "Not the interested device. Ignoring."
    exit 0
fi

umount_enc_usb.sh:

#!/bin/bash
if [[ "$#" -ne 1 ]]; then
  echo "$(basename $0) <fsuuid>"
  exit 1
fi
if [[ "$1" == "000b2670-be4a-41b4-98eb-9adbd12a7616" ]]; then
  # Our device is removed, let's close the luks mapping
  [[ -e /dev/mapper/GPG_USB ]] && cryptsetup luksClose /dev/mapper/GPG_USB
else
  echo "Not our device."
  exit 0
fi

mount-gpg-usb@.service:

[Unit]
Description=Mount the encrypted USB device service
[Service]
Type=simple
ExecStart=/usr/local/bin/mount_enc_usb.sh

With this setup, plugging in the USB device will automatically unseal and mount it, and upon removal, the luks mapping will be closed.

Note

This can be even done for LUKS2 encrypted root disk but will need some tweaking in initramfs.

References

29 June 2023

Antoine Beaupr : Using signal-cli to cancel your Signal account

For obscure reasons, I have found myself with a phone number registered with Signal but without any device associated with it. This is the I lost my phone section in Signal support, which rather unhelpfully tell you that, literally:

Until you have access to your phone number, there is nothing that can be done with Signal.

To be fair, I guess that sort of makes sense: Signal relies heavily on phone numbers for identity. It's how you register to the service and how you recover after losing your phone number. If you have your PIN ready, you don't even change safety numbers! But my case is different: this phone number was a test number, associated with my tablet, because you can't link multiple Android device to the same phone number. And now that I brilliantly bricked that tablet, I just need to tell people to stop trying to contact me over that thing (which wasn't really working in the first place anyway because I wasn't using the tablet that much, but I digress). So. What do you do? You could follow the above "lost my phone" guide and get a new Android or iOS phone to register on Signal again, but that's pretty dumb: I don't want another phone, I already have one. Lo and behold, signal-cli to the rescue!

Disclaimer: no warranty or liability Before following this guide, make sure you remember the license of this website, which specifically has a Section 5 Disclaimer of Warranties and Limitation of Liability. If you follow this guide literally, you might actually get into trouble. You have been warned. All Cats Are Beautiful.

Installing in Docker Because signal-cli is not packaged in Debian (but really should be), I need to bend over backwards to install it. The installation instructions suggest building from source (what is this, GentooBSD?) or installing binary files (what is this, Debiandows?), that's all so last millennium. I want something fresh and fancy, so I went with the extremely legit Docker registry ran by the not-shady-at-all gitlab.com/packaging group which is suspiciously not owned by any GitLab.com person I know of. This is surely perfectly safe.

(Insert long digression on supply chain security here and how Podman is so much superior to Docker. Feel free to dive deep into how RedHat sold out to the nazis or how this is just me ranting about something I don't understand, again. I'm not going to do all the work for you.)

Anyway. The magic command is:

mkdir .config/signal-cli
podman pull registry.gitlab.com/packaging/signal-cli/signal-cli-jre:latest
# lightly hit computer with magic supply chain verification wand
alias signal-cli="podman run --rm --publish 7583:7583 --volume .config/signal-cli:/var/lib/signal-cli --tmpfs /tmp:exec   registry.gitlab.com/packaging/signal-cli/signal-cli-jre:latest --config /var/lib/signal-cli"

At this point, you have a signal-cli alias that should more or less behave as per upstream documentation. Note that it sets up a network service on port 7583 which is unnecessary because you likely won't be using signal-cli's "daemon mode" here, this is a one-shot thing. But I'll probably be reusing those instructions later on, so I figured it might be a safe addition. Besides, it's what the instructions told me to do so I'm blindly slamming my head in the bash pipe, as trained. Also, you're going to have the signal-cli configuration persist in ~/.config/signal-cli there. Again, totally unnecessary.

Re-registering the number Back to our original plan of canceling our Signal account. The next step is, of course, to register with Signal.

Yes, this is a little counter-intuitive and you'd think there would be a "I want off this boat" button on https://signal.org that would do this for you, but hey, I guess that's only reserved for elite hackers who want to ~~screw people over~~, I mean close their accounts. Mere mortals don't get access to such beauties. Update: a friend reminded me there used to be such a page at https://signal.org/signal/unregister/ but it's mysteriously gone from the web, but still available on the wayback machine although surely that doesn't work anymore. Untested.

To register an account with signal-cli, you first need to pass a CAPTCHA. Those are the funky images generated by deep neural networks that try to fool humans into thinking other neural networks can't break them, and generally annoy the hell out of people. This will generate a URL that looks like:

signalcaptcha://signal-hcaptcha.$UUID.registration.$THIRTYTWOKILOBYTESOFGARBAGE

Yes, it's a very long URL. Yes, you need the entire thing. The URL is hidden behind the Open Signal link, you can right-click on the link to copy it or, if you want to feel like it's 1988 again, use view-source: or butterflies or something. You will also need the phone number you want to unregister here, obviously. We're going to take a not quite random phone number as an example, +18002677468.

Don't do this at home kids! Use the actual number and don't copy-paste examples from random websites!

So the actual command you need to run now is:

signal-cli -a +18002677468 register --captcha signalcaptcha://signal-hcaptcha.$UUID.registration.$THIRTYTWOKILOBYTESOFGARBAGE

To confirm the registration, Signal will send a text message (SMS) to that phone number with a verification code. (Fun fact: it's actually Twilio relaying that message for Signal and that is... not great.) If you don't have access to SMS on that number, you can try again with the --voice option, which will do the same thing with a actual phone call. I wish it would say "Ok boomer" when it calls, but it doesn't. If you don't have access to either, you're screwed. You may be able to port your phone number to another provider to gain control of the phone number again that said, but at that point it's a whole different ball game. With any luck now you've received the verification code. You use it with:

signal-cli -a +18002677468 verify 131213

If you want to make sure this worked, you can try writing to another not random number at all, it should Just Work:

signal-cli -a +18002677468 send -mtest +18005778477

This is almost without any warning on the other end too, which says something amazing about Signal's usability and something horrible about its security.

Unregistering the number Now we get to the final conclusion, the climax. Can you feel it? I'll try to refrain from further rants, I promise. It's pretty simple and fast, just call:

signal-cli -a +18002677468 unregister

That's it! Your peers will now see an "Invite to Signal" button instead of a text field to send a text message.

Cleanup Optionally, cleanup the mess you left on this computer:

rm -r ~/.config/signal-cli
podman image rm registry.gitlab.com/packaging/signal-cli/signal-cli-jre

17 June 2023

Bastian Venthur: Blag 2.0 released

A few days ago, I released a major update on blag, my blog-aware static-site generator, which introduces a few backwards-incompatible changes and many improvements over the old version. Good-looking default theme The old bare-bones default theme has been replaced with a good-looking one, based on the one used on this blog: Blag Screenshot

It comes with a light- and dark theme that switches automatically based on the browser setting, as well as fitting light- and dark syntax highlighting themes for code blocks. Improved quickstart The blag quickstart command has been improved. Additionally to generating the configuration, it now also populates the working directory with the templates-, static- and content directories, containing the updated default theme and a few content pages to get you started. No internal fallback templates anymore Related to the changes in quickstart, the internal fallback template has been removed, and blag now completely relies on the templates in the local templates directory. This makes it more transparent for the user what is happening while simplifying blag s internal logic. However, this is a backwards incompatible change! In the case of a missing template, the user will be warned with a hint on how to obtain the missing template. Index and archive are now separate Previously, the front-page would always show the archive of all articles. This is not very useful when your blog contains more than a few dozen articles. With blag 2.0, the previous archive has been split into index and archive, where index is the front-page showing only the most recent 15 articles by default and linking to the archive which shows all articles. There s also two corresponding templates in the templates directory. Miscellaneous

Various dependencies have been updated.
blag s documentation has been migrated from Sphinx to MkDocs, which is a bit more lightweight and easier to maintain
A packaging issue has been fixed, where the tests/conftest.py was missing in the source distribution

Blag 2.0 is available on pypi, debian/unstable and github

16 June 2023

John Goerzen: Using git-annex for Data Archiving

In my recent post about data archiving to removable media, I laid out the difference between backing up and archiving, and also said I d evaluate git-annex and dar. This post evaluates git-annex. The next will look at dar, and then I ll make a comparison post. What is git-annex? git-annex is a fantastic and versatile program that does well, it s one of those things that can do so much that it s a bit hard to describe. Its homepage says:

git-annex allows managing large files with git, without storing the file contents in git. It can sync, backup, and archive your data, offline and online. Checksums and encryption keep your data safe and secure. Bring the power and distributed nature of git to bear on your large files with git-annex.

I think the particularly interesting features of git-annex aren t actually included in that list. Among the features of git-annex that make it shine for this purpose, its location tracking is key. git-annex can know exactly which device has which file at which version at all times. Combined with its preferred content settings, this lets you very easily say things like:

I want exactly 1 copy of every file to exist within the set #1 of backup drives. Here s a drive in that set; copy to it whatever needs to be copied to satisfy that requirement.
Now I have another set of backup drives. Periodically I will swap sets offsite. Copy whatever is needed to this drive in the second set, making sure that there is 1 copy of every file within this set as well, regardless of what s in the first set.
Here s a directory I want to use to track the status of everything else. I don t want any copies at all here.

git-annex can be set to allow a configurable amount of free space to remain on a device, and it will fill it up with whatever copies are necessary up until it hits that limit. Very convenient! git-annex will store files in a folder structure that mirrors the origin folder structure, in plain files just as they were. This maximizes the ability for a future person to access the content, since it is all viewable without any special tool at all. Of course, for things like optical media, git-annex will essentially be creating what amounts to incrementals. To obtain a consistent copy of the original tree, you would still need to use git-annex to process (export) the archives. git-annex challenges In my prior post, I related some challenges with git-annex. The biggest of them quite poor performance of the directory special remote when dealing with many files has been resolved by Joey, git-annex s author! That dramatically improves the git-annex use scenario here! The fixing commit is in the source tree but not yet in a release. git-annex no doubt may still have performance challenges with repositories in the 100,000+-range, but in that order of magnitude it now looks usable. I m not sure about 1,000,000-file repositories (I haven t tested); there is a page about scalability. A few other more minor challenges remain:

git-annex doesn t really preserve POSIX attributes; for instance, permissions, symlink destinations, and timestamps are all not preserved. Of these, timestamps are the most important for my particular use case.
If your data set to archive contains Git repositories itself, these will not be included.

I worked around the timestamp issue by using the mtree-netbsd package in Debian. mtree writes out a summary of files and metadata in a tree, and can restore them. To save: mtree -c -R nlink,uid,gid,mode -p /PATH/TO/REPO -X <(echo './.git') > /tmp/spec And, after restoration, the timestamps can be applied with: mtree -t -U -e < /tmp/spec Walkthrough: initial setup To use git-annex in this way, we have to do some setup. My general approach is this:

There is a source of data that lives outside git-annex. I'll call this $SOURCEDIR.
I'm going to name the directories holding my data $REPONAME.
There will be a "coordination" git-annex repo. It will hold metadata only, and no data. This will let us track where things live. I'll call it $METAREPO.
There will be drives. For this example, I'll call their mountpoints $DRIVE01 and $DRIVE02. For easy demonstration purposes, I used a ZFS dataset with a refquota set (to observe the size handling), but I could have as easily used a LVM volume, btrfs dataset, loopback filesystem, or USB drive. For optical discs, this would be a staging area or a UDF filesystem.

Let's get started! I've set all these shell variables appropriately for this example, and REPONAME to "testdata". We'll begin by setting up the metadata-only tracking repo.



$ REPONAME=testdata

$ mkdir "$METAREPO"

$ cd "$METAREPO"

$ git init

$ git config annex.thin true

There is a sort of complicated topic of how git-annex stores files in a repo, which varies depending on whether the data for the file is present in a given repo, and whether the file is locked or unlocked. Basically, the options I use here cause git-annex to mostly use hard links instead of symlinks or pointer files, for maximum compatibility with non-POSIX filesystems such as NTFS and UDF, which might be used on these devices. thin is part of that. Let's continue:



$ git annex init 'local hub'

init local hub ok

(recording state in git...)

$ git annex wanted . "include=* and exclude=$REPONAME/*"

wanted . ok

(recording state in git...)

In a bit, we are going to import the source data under the directory named $REPONAME (here, testdata). The wanted command says: in this repository (represented by the bare dot), the files we want are matched by the rule that says eveyrthing except what's under $REPONAME. In other words, we don't want to make an unnecessary copy here. Because I expect to use an mtree file as documented above, and it is not under $REPONAME/, it will be included. Let's just add it and tweak some things.



$ touch mtree

$ git annex add mtree

add mtree

ok

(recording state in git...)

$ git annex sync

git-annex sync will change default behavior to operate on --content in a future version of git-annex. Recommend you explicitly use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)

commit

[main (root-commit) 6044742] git-annex in local hub

 1 file changed, 1 insertion(+)

 create mode 120000 mtree

ok

$ ls -l

total 9

lrwxrwxrwx 1 jgoerzen jgoerzen 178 Jun 15 22:31 mtree -> .git/annex/objects/pX/ZJ/...

OK! We've added a file, and it got transformed into a symlink. That's the thing I said we were going to avoid, so:



 git annex adjust --unlock-present

adjust

Switched to branch 'adjusted/main(unlockpresent)'

ok

$ ls -l

total 1

-rw-r--r-- 2 jgoerzen jgoerzen 0 Jun 15 22:31 mtree

You'll notice it transformed into a hard link (nlinks=2) file. Great! Now let's import the source data. For that, we'll use the directory special remote.



$ git annex initremote source type=directory  directory=$SOURCEDIR importtree=yes \

   encryption=none

initremote source ok

(recording state in git...)

$ git annex enableremote source directory=$SOURCEDIR

enableremote source ok

(recording state in git...)

$ git config remote.source.annex-readonly true

$ git config annex.securehashesonly true

$ git config annex.genmetadata true

$ git config annex.diskreserve 100M

$ git config remote.source.annex-tracking-branch main:$REPONAME

OK, so here we created a new remote named "source". We enabled it, and set some configuration. Most notably, that last line causes files from "source" to be imported under $REPONAME/ as we wanted earlier. Now we're ready to scan the source.



$ git annex sync

At this point, you'll see git-annex computing a hash for every file in the source directory. I can verify with du that my metadata-only repo only uses 14MB of disk space, while my source is around 4GB. Now we can see what git-annex thinks about file locations:



$ git-annex whereis   less

whereis mtree (1 copy)

        8aed01c5-da30-46c0-8357-1e8a94f67ed6 -- local hub [here]

ok

whereis testdata/[redacted] (0 copies)

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

failed

... many more lines ...

So remember we said we wanted mtree, but nothing under testdata, under this repo? That's exactly what we got. git-annex knows that the files under testdata can be found under the "source" special remote, but aren't in any git-annex repo -- yet. Now we'll start adding them. Walkthrough: removable drives I've set up two 500MB filesystems to represent removable drives. We'll see how git-annex works with them.



$ cd $DRIVE01

$ df -h .

Filesystem                     Size  Used Avail Use% Mounted on

acrypt/no-backup/annexdrive01  500M  1.0M  499M   1% /acrypt/no-backup/annexdrive01

$ git clone $METAREPO

Cloning into 'testdata'...

done.

$ cd $REPONAME

$ git config annex.thin true

$ git annex init "test drive #1"

$ git annex adjust --hide-missing --unlock

adjust

Switched to branch 'adjusted/main(hidemissing-unlocked)'

ok

$ git annex sync

OK, that's the initial setup. Now let's enable the source remote and configure it the same way we did before:



$ git annex enableremote source directory=$SOURCEDIR

enableremote source ok

(recording state in git...)

$ git config remote.source.annex-readonly true

$ git config remote.source.annex-tracking-branch main:$REPONAME

$ git config annex.securehashesonly true

$ git config annex.genmetadata true

$ git config annex.diskreserve 100M

Now, we'll add the drive to a group called "driveset01" and configure what we want on it:



$ git annex group . driveset01

$ git annex wanted . '(not copies=driveset01:1)'

What this does is say: first of all, this drive is in a group named driveset01. Then, this drive wants any files for which there isn't already at least one copy in driveset01. Now let's load up some files!



$ git annex sync --content

As the messages fly by from here, you'll see it mentioning that it got mtree, and then various files from "source" -- until, that is, the filesystem had less than 100MB free, at which point it complained of no space for the rest. Exactly like we wanted! Now, we need to teach $METAREPO about $DRIVE01.



$ cd $METAREPO

$ git remote add drive01 $DRIVE01/$REPONAME

$ git annex sync drive01

git-annex sync will change default behavior to operate on --content in a future version of git-annex. Recommend you explicitly use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)

commit

On branch adjusted/main(unlockpresent)

nothing to commit, working tree clean

ok

merge synced/main (Merging into main...)

Updating d1d9e53..817befc

Fast-forward

(Merging into adjusted branch...)

Updating 7ccc20b..861aa60

Fast-forward

ok

pull drive01

remote: Enumerating objects: 214, done.

remote: Counting objects: 100% (214/214), done.

remote: Compressing objects: 100% (95/95), done.

remote: Total 110 (delta 6), reused 0 (delta 0), pack-reused 0

Receiving objects: 100% (110/110), 13.01 KiB   1.44 MiB/s, done.

Resolving deltas: 100% (6/6), completed with 6 local objects.

From /acrypt/no-backup/annexdrive01/testdata

 * [new branch]      adjusted/main(hidemissing-unlocked) -> drive01/adjusted/main(hidemissing-unlocked)

 * [new branch]      adjusted/main(unlockpresent)        -> drive01/adjusted/main(unlockpresent)

 * [new branch]      git-annex                           -> drive01/git-annex

 * [new branch]      main                                -> drive01/main

 * [new branch]      synced/main                         -> drive01/synced/main

ok

OK! This step is important, because drive01 and drive02 (which we'll set up shortly) won't necessarily be able to reach each other directly, due to not being plugged in simultaneously. Our $METAREPO, however, will know all about where every file is, so that the "wanted" settings can be correctly resolved. Let's see what things look like now:



$ git annex whereis   less

whereis mtree (2 copies)

        8aed01c5-da30-46c0-8357-1e8a94f67ed6 -- local hub [here]

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

ok

whereis testdata/[redacted] (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

If I scroll down a bit, I'll see the files past the 400MB mark that didn't make it onto drive01. Let's add another example drive! Walkthrough: Adding a second drive The steps for $DRIVE02 are the same as we did before, just with drive02 instead of drive01, so I'll omit listing it all a second time. Now look at this excerpt from whereis:



whereis testdata/[redacted] (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]


  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

whereis testdata/[redacted] (1 copy)

        c4540343-e3b5-4148-af46-3f612adda506 -- test drive #2 [drive02]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

Look at that! Some files on drive01, some on drive02, some neither place. Perfect! Walkthrough: Updates So I've made some changes in the source directory: moved a file, added another, and deleted one. All of these were copied to drive01 above. How do we handle this? First, we update the metadata repo:



$ cd $METAREPO

$ git annex sync

$ git annex dropunused all

OK, this has scanned $SOURCEDIR and noted changes. Let's see what whereis says:



$ git annex whereis   less

...

whereis testdata/cp (0 copies)

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

failed

whereis testdata/file01-unchanged (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

So this looks right. The file I added was a copy of /bin/cp. I moved another file to one named file01-unchanged. Notice that it realized this was a rename and that the data still exists on drive01. Well, let's update drive01.



$ cd $DRIVE01/$REPONAME

$ git annex sync --content

Looking at the testdata/ directory now, I see that file01-unchanged has been renamed, the deleted file is gone, but cp isn't yet here -- probably due to space issues; as it's new, it's undefined whether it or some other file would fill up free space. Let's work along a few more commands.



$ git annex get --auto

$ git annex drop --auto

$ git annex dropunused all

And now, let's make sure metarepo is updated with its state.



$ cd $METAREPO

$ git annex sync

We could do the same for drive02. This is how we would proceed with every update. Walkthrough: Restoration Now, we have bare files at reasonable locations in drive01 and drive02. But, to generate a consistent restore, we need to be able to actually do an export. Otherwise, we may have files with old names, duplicate files, etc. Let's assume that we lost our source and metadata repos and have to restore from scratch. We'll make a new $RESTOREDIR. We'll begin with drive01 since we used it most recently.



$ mv $METAREPO $METAREPO.disabled

$ mv $SOURCEDIR $SOURCEDIR.disabled

$ git clone $DRIVE01/$REPONAME $RESTOREDIR

$ cd $RESTOREDIR

$ git config annex.thin true

$ git annex init "restore"

$ git annex adjust --hide-missing --unlock

Now, we need to connect the drive01 and pull the files from it.



$ git remote add drive01 $DRIVE01/$REPONAME

$ git annex sync --content

Now, repeat with drive02:



$ git remote add drive02 $DRIVE02/$REPONAME

$ git annex sync --content

Now we've got all our content back! Here's what whereis looks like:



whereis testdata/file01-unchanged (3 copies)

        3d663d0f-1a69-4943-8eb1-f4fe22dc4349 -- restore [here]

        9e48387e-b096-400a-8555-a3caf5b70a64 -- source

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [origin]

ok

...

I was a little surprised that drive01 didn't seem to know what was on drive02. Perhaps that could have been remedied by adding more remotes there? I'm not entirely sure; I'd thought would have been able to do that automatically. Conclusions I think I have demonstrated two things: First, git-annex is indeed an extremely powerful tool. I have only scratched the surface here. The location tracking is a neat feature, and being able to just access the data as plain files if all else fails is nice for future users. Secondly, it is also a complex tool and difficult to get right for this purpose (I think much easier for some other purposes). For someone that doesn't live and breathe git-annex, it can be hard to get right. In fact, I'm not entirely sure I got it right here. Why didn't drive02 know what files were on drive01 and vice-versa? I don't know, and that reflects some kind of misunderstanding on my part about how metadata is synced; perhaps more care needs to be taken in restore, or done in a different order, than I proposed. I initially tried to do a restore by using git annex export to a directory special remote with exporttree=yes, but I couldn't ever get it to actually do anything, and I don't know why. These two cut against each other. On the one hand, the raw accessibility of the data to someone with no computer skills is unmatched. On the other hand, I'm not certain I have the skill to always prepare the discs properly, or to do a proper consistent restore.

14 May 2023

Holger Levsen: 20230514-fwupd

How-To use fwupd As one cannot use fwupd on Qubes OS to update firmwares this is a quick How-To for using fwupd on Grml for future me. (Qubes 4.2 will bring qubes-fwupd.)

boot into Grml.
mkdir /efi ; mount /boot/efi to /efi or set OverrideESPMountPoint=/boot/efi/EFI if you mount to the usual path.
apt update ; apt install fwupd fwupd-amd64-signed udisks2 policykit-1
fwupdmgr get-devices
fwupdmgr refresh
fwupdmgr get-updates
fwupdmgr update
reboot into Qubes OS.

23 April 2023

Petter Reinholdtsen: Speech to text, she APTly whispered, how hard can it be?

While visiting a convention during Easter, it occurred to me that it would be great if I could have a digital Dictaphone with transcribing capabilities, providing me with texts to cut-n-paste into stuff I need to write. The background is that long drives often bring up the urge to write on texts I am working on, which of course is out of the question while driving. With the release of OpenAI Whisper, this seem to be within reach with Free Software, so I decided to give it a go. OpenAI Whisper is a Linux based neural network system to read in audio files and provide text representation of the speech in that audio recording. It handle multiple languages and according to its creators even can translate into a different language than the spoken one. I have not tested the latter feature. It can either use the CPU or a GPU with CUDA support. As far as I can tell, CUDA in practice limit that feature to NVidia graphics cards. I have few of those, as they do not work great with free software drivers, and have not tested the GPU option. While looking into the matter, I did discover some work to provide CUDA support on non-NVidia GPUs, and some work with the library used by Whisper to port it to other GPUs, but have not spent much time looking into GPU support yet. I've so far used an old X220 laptop as my test machine, and only transcribed using its CPU. As it from a privacy standpoint is unthinkable to use computers under control of someone else (aka a "cloud" service) to transcribe ones thoughts and personal notes, I want to run the transcribing system locally on my own computers. The only sensible approach to me is to make the effort I put into this available for any Linux user and to upload the needed packages into Debian. Looking at Debian Bookworm, I discovered that only three packages were missing, tiktoken, triton, and openai-whisper. For a while I also believed ffmpeg-python was needed, but as its upstream seem to have vanished I found it safer to rewrite whisper to stop depending on in than to introduce ffmpeg-python into Debian. I decided to place these packages under the umbrella of the Debian Deep Learning Team, which seem like the best team to look after such packages. Discussing the topic within the group also made me aware that the triton package was already a future dependency of newer versions of the torch package being planned, and would be needed after Bookworm is released. All required code packages have been now waiting in the Debian NEW queue since Wednesday, heading for Debian Experimental until Bookworm is released. An unsolved issue is how to handle the neural network models used by Whisper. The default behaviour of Whisper is to require Internet connectivity and download the model requested to ~/.cache/whisper/ on first invocation. This obviously would fail the deserted island test of free software as the Debian packages would be unusable for someone stranded with only the Debian archive and solar powered computer on a deserted island. Because of this, I would love to include the models in the Debian mirror system. This is problematic, as the models are very large files, which would put a heavy strain on the Debian mirror infrastructure around the globe. The strain would be even higher if the models change often, which luckily as far as I can tell they do not. The small model, which according to its creator is most useful for English and in my experience is not doing a great job there either, is 462 MiB (deb is 414 MiB). The medium model, which to me seem to handle English speech fairly well is 1.5 GiB (deb is 1.3 GiB) and the large model is 2.9 GiB (deb is 2.6 GiB). I would assume everyone with enough resources would prefer to use the large model for highest quality. I believe the models themselves would have to go into the non-free part of the Debian archive, as they are not really including any useful source code for updating the models. The "source", aka the model training set, according to the creators consist of "680,000 hours of multilingual and multitask supervised data collected from the web", which to me reads material with both unknown copyright terms, unavailable to the general public. In other words, the source is not available according to the Debian Free Software Guidelines and the model should be considered non-free. I asked the Debian FTP masters for advice regarding uploading a model package on their IRC channel, and based on the feedback there it is still unclear to me if such package would be accepted into the archive. In any case I wrote build rules for a OpenAI Whisper model package and modified the Whisper code base to prefer shared files under /usr/ and /var/ over user specific files in ~/.cache/whisper/ to be able to use these model packages, to prepare for such possibility. One solution might be to include only one of the models (small or medium, I guess) in the Debian archive, and ask people to download the others from the Internet. Not quite sure what to do here, and advice is most welcome (use the debian-ai mailing list). To make it easier to test the new packages while I wait for them to clear the NEW queue, I created an APT source targeting bookworm. I selected Bookworm instead of Bullseye, even though I know the latter would reach more users, is that some of the required dependencies are missing from Bullseye and I during this phase of testing did not want to backport a lot of packages just to get up and running. Here is a recipe to run as user root if you want to test OpenAI Whisper using Debian packages on your Debian Bookworm installation, first adding the APT repository GPG key to the list of trusted keys, then setting up the APT repository and finally installing the packages and one of the models:

curl https://geekbay.nuug.no/~pere/openai-whisper/D78F5C4796F353D211B119E28200D9B589641240.asc \
  -o /etc/apt/trusted.gpg.d/pere-whisper.asc
mkdir -p /etc/apt/sources.list.d
cat > /etc/apt/sources.list.d/pere-whisper.list <<EOF
deb https://geekbay.nuug.no/~pere/openai-whisper/ bookworm main
deb-src https://geekbay.nuug.no/~pere/openai-whisper/ bookworm main
EOF
apt update
apt install openai-whisper

The package work for me, but have not yet been tested on any other computer than my own. With it, I have been able to (badly) transcribe a 2 minute 40 second Norwegian audio clip to test using the small model. This took 11 minutes and around 2.2 GiB of RAM. Transcribing the same file with the medium model gave a accurate text in 77 minutes using around 5.2 GiB of RAM. My test machine had too little memory to test the large model, which I believe require 11 GiB of RAM. In short, this now work for me using Debian packages, and I hope it will for you and everyone else once the packages enter Debian. Now I can start on the audio recording part of this project. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

20 April 2023

Simon Josefsson: Sigstore for Apt Archives: apt-cosign

As suggested in my initial announcement of apt-sigstore my plan was to look into stronger uses of Sigstore than rekor, and I m now happy to announce that the apt-cosign plugin has been added to apt-sigstore and the operational project debdistcanary is publishing cosign-statements about the InRelease file published by the following distributions: Trisquel GNU/Linux, PureOS, Gnuinos, Ubuntu, Debian and Devuan. Summarizing the commands that you need to run as root to experience the great new world:

# run everything as root: su / sudo -i / doas -s
apt-get install -y apt gpg bsdutils wget
wget -nv -O/usr/local/bin/apt-verify-gpgv https://gitlab.com/debdistutils/apt-verify/-/raw/main/apt-verify-gpgv
chmod +x /usr/local/bin/apt-verify-gpgv
mkdir -p /etc/apt/verify.d
ln -s /usr/bin/gpgv /etc/apt/verify.d
echo 'APT::Key::gpgvcommand "apt-verify-gpgv";' > /etc/apt/apt.conf.d/75verify
wget -O/usr/local/bin/cosign https://github.com/sigstore/cosign/releases/download/v2.0.1/cosign-linux-amd64
echo 924754b2e62f25683e3e74f90aa5e166944a0f0cf75b4196ee76cb2f487dd980  /usr/local/bin/cosign   sha256sum -c
chmod +x /usr/local/bin/cosign
wget -nv -O/etc/apt/verify.d/apt-cosign https://gitlab.com/debdistutils/apt-sigstore/-/raw/main/apt-cosign
chmod +x /etc/apt/verify.d/apt-cosign
mkdir -p /etc/apt/trusted.cosign.d
dist=$(lsb_release --short --id   tr A-Z a-z)
wget -O/etc/apt/trusted.cosign.d/cosign-public-key-$dist.txt "https://gitlab.com/debdistutils/debdistcanary/-/raw/main/cosign/cosign-public-key-$dist.txt"
echo "Cosign::Base-URL \"https://gitlab.com/debdistutils/canary/$dist/-/raw/main/cosign\";" > /etc/apt/apt.conf.d/77cosign

Then run your usual apt-get update and look in the syslog to debug things. This is the kind of work that gets done while waiting for the build machines to attempt to reproducibly build PureOS. Unfortunately, the results is that a meager 16% of the 765 added/modifed packages are reproducible by me. There is some infrastructure work to be done to improve things: we should use sbuild for example. The build infrastructure should produce signed statements for each package it builds: One statement saying that it attempted to reproducible build a particular binary package (thus generated some build logs and diffoscope-output for auditing), and one statements saying that it actually was able to reproduce a package. Verifying such claims during apt-get install or possibly dpkg -i is a logical next step. There is some code cleanups and release work to be done now. Which distribution will be the first apt-based distribution that includes native support for Sigstore? Let s see. Sigstore is not the only relevant transparency log around, and I ve been trying to learn a bit about Sigsum to be able to support it as well. The more improved confidence about system security, the merrier!

Next.

Search Results: "mkd"

25 April 2024

18 April 2024

10 March 2024

25 February 2024

3 January 2024

WARNINGS Before continuing, back up your system. This process isn t for the neophyte and it is entirely possible to mess up your boot device to the point that you have to do a fresh install to get your Pi to boot. This isn t a supported process at all.

Basic idea The basic idea here is that since bookworm has almost entirely newer packages then bullseye, we can just switch over to it and let the Debian packages replace the Raspbian ones as they are upgraded. Well, it s not quite that easy, but that s the main idea.

Preparation: bluetooth Debian doesn t correctly identify the Bluetooth hardware address. You can save it off to a file by running hcitool dev > /root/bluetooth-from-raspbian.txt. I don t use Bluetooth, but this should let you develop a script to bring it up properly.

Purging the Raspbian kernel Run: dpkg --purge raspberrypi-kernel

Removing Raspbian cruft You can list some of the cruft with: apt list '~o' And remove it with: apt purge '~o' I also don t run Bluetooth, and it seemed to sometimes hang on boot becuase I didn t bother to fix it, so I did: apt-get --purge remove bluez

Installing some packages This makes sure some basic Debian infrastructure is available: apt-get install wpasupplicant parted dosfstools wireless-tools iw alsa-tools apt-get --purge autoremove

Installing firmware Now run: apt-get install firmware-linux

Deal with DHCP Raspberry Pi OS used dhcpcd, whereas bookworm normally uses isc-dhcp-client. Verify the system is in the correct state: apt-get install isc-dhcp-client apt-get --purge remove dhcpcd dhcpcd-base dhcpcd5 dhcpcd-dbus

Set up LEDs To set up the LEDs to trigger on MicroSD activity as they did with Raspbian, follow the Debian instructions. Run apt-get install sysfsutils. Then put this in a file at /etc/sysfs.d/local-raspi-leds.conf: class/leds/ACT/brightness = 1 class/leds/ACT/trigger = mmc1

The moment arrives Cross your fingers and try rebooting into your Debian system: reboot For some reason, I found that the first boot into Debian seems to hang for 30-60 seconds during bootstrap. I m not sure why; don t panic if that happens. It may be necessary to power cycle the Pi for this boot.

9 December 2023

21 November 2023

12 October 2023

6 October 2023

16 September 2023

ConclusionFor now I m using the second solution and as it is working well my guess is that I ll keep using that approach unless giltab itself provides a better or simpler way of doing the same thing.

12 July 2023

9 July 2023

29 June 2023

Cleanup Optionally, cleanup the mess you left on this computer: rm -r ~/.config/signal-cli podman image rm registry.gitlab.com/packaging/signal-cli/signal-cli-jre

17 June 2023

16 June 2023

14 May 2023

23 April 2023

20 April 2023

Preparation: bluetooth Debian doesn t correctly identify the Bluetooth hardware address. You can save it off to a file by running `hcitool dev > /root/bluetooth-from-raspbian.txt`. I don t use Bluetooth, but this should let you develop a script to bring it up properly.

Purging the Raspbian kernel Run:

`dpkg --purge raspberrypi-kernel`

Removing Raspbian cruft You can list some of the cruft with:

`apt list '~o'`

And remove it with:

`apt purge '~o'`

I also don t run Bluetooth, and it seemed to sometimes hang on boot becuase I didn t bother to fix it, so I did:

`apt-get --purge remove bluez`

Installing some packages This makes sure some basic Debian infrastructure is available:

`apt-get install wpasupplicant parted dosfstools wireless-tools iw alsa-tools apt-get --purge autoremove`

Installing firmware Now run:

`apt-get install firmware-linux`

Deal with DHCP Raspberry Pi OS used dhcpcd, whereas bookworm normally uses `isc-dhcp-client`. Verify the system is in the correct state:

`apt-get install isc-dhcp-client apt-get --purge remove dhcpcd dhcpcd-base dhcpcd5 dhcpcd-dbus`

Set up LEDs To set up the LEDs to trigger on MicroSD activity as they did with Raspbian, follow the Debian instructions. Run `apt-get install sysfsutils`. Then put this in a file at `/etc/sysfs.d/local-raspi-leds.conf`:

`class/leds/ACT/brightness = 1 class/leds/ACT/trigger = mmc1`

Cleanup Optionally, cleanup the mess you left on this computer:
`rm -r ~/.config/signal-cli podman image rm registry.gitlab.com/packaging/signal-cli/signal-cli-jre`