Search Results: "lex"

15 October 2024

Lukas M rdian: Waiting for a Linux system to be online

Designed by Freepik

What is an online system? Networking is a complex topic, and there is lots of confusion around the definition of an online system. Sometimes the boot process gets delayed up to two minutes, because the system still waits for one or more network interfaces to be ready. Systemd provides the network-online.target that other service units can rely on, if they are deemed to require network connectivity. But what does online actually mean in this context, is a link-local IP address enough, do we need a routable gateway and how about DNS name resolution? The requirements for an online network interface depend very much on the services using an interface. For some services it might be good enough to reach their local network segment (e.g. to announce Zeroconf services), while others need to reach domain names (e.g. to mount a NFS share) or reach the global internet to run a web server. On the other hand, the implementation of network-online.target varies, depending on which networking daemon is in use, e.g. systemd-networkd-wait-online.service or NetworkManager-wait-online.service. For Ubuntu, we created a specification that describes what we as a distro expect an online system to be. Having a definition in place, we are able to tackle the network-online-ordering issues that got reported over the years and can work out solutions to avoid delayed boot times on Ubuntu systems. In essence, we want systems to reach the following networking state to be considered online:
  1. Do not wait for optional interfaces to receive network configuration
  2. Have IPv6 and/or IPv4 link-local addresses on every network interface
  3. Have at least one interface with a globally routable connection
  4. Have functional domain name resolution on any routable interface

A common implementation NetworkManager and systemd-networkd are two very common networking daemons used on modern Linux systems. But they originate from different contexts and therefore show different behaviours in certain scenarios, such as wait-online. Luckily, on Ubuntu we already have Netplan as a unification layer on top of those networking daemons, that allows for common network configuration, and can also be used to tweak the wait-online logic. With the recent release of Netplan v1.1 we introduced initial functionality to tweak the behaviour of the systemd-networkd-wait-online.service, as used on Ubuntu Server systems. When Netplan is used to drive the systemd-networkd backend, it will emit an override configuration file in /run/systemd/system/systemd-networkd-wait-online.service.d/10-netplan.conf, listing the specific non-optional interfaces that should receive link-local IP configuration. In parallel to that, it defines a list of network interfaces that Netplan detected to be potential global connections, and waits for any of those interfaces to reach a globally routable state. Such override config file might look like this:
[Unit]
ConditionPathIsSymbolicLink=/run/systemd/generator/network-online.target.wants/systemd-networkd-wait-online.service

[Service]
ExecStart=
ExecStart=/lib/systemd/systemd-networkd-wait-online -i eth99.43:carrier -i lo:carrier -i eth99.42:carrier -i eth99.44:degraded -i bond0:degraded
ExecStart=/lib/systemd/systemd-networkd-wait-online --any -o routable -i eth99.43 -i eth99.45 -i bond0
In addition to the new features implemented in Netplan, we reached out to upstream systemd, proposing an enhancement to the systemd-networkd-wait-online service, integrating it with systemd-resolved to check for the availability of DNS name resolution. Once this is implemented upstream, we re able to fully control the systemd-networkd backend on Ubuntu Server systems, to behave consistently and according to the definition of an online system that was lined out above.

Future work The story doesn t end there, because Ubuntu Desktop systems are using NetworkManager as their networking backend. This daemon provides its very own nm-online utility, utilized by the NetworkManager-wait-online systemd service. It implements a much higher-level approach, looking at the networking daemon in general instead of the individual network interfaces. By default, it considers a system to be online once every autoconnect profile got activated (or failed to activate), meaning that either a IPv4 or IPv6 address got assigned. There are considerable enhancements to be implemented to this tool, for it to be controllable in a fine-granular way similar to systemd-networkd-wait-online, so that it can be instructed to wait for specific networking states on selected interfaces.

A note of caution Making a service depend on network-online.target is considered an antipattern in most cases. This is because networking on Linux systems is very dynamic and the systemd target can only ever reflect the networking state at a single point in time. It cannot guarantee this state to be remained over the uptime of your system and has the potentially to delay the boot process considerably. Cables can be unplugged, wireless connectivity can drop, or remote routers can go down at any time, affecting the connectivity state of your local system. Therefore, instead of wondering what to do about network.target, please just fix your program to be friendly to dynamically changing network configuration. [source].

10 October 2024

Freexian Collaborators: Debian Contributions: Packaging Pydantic v2, Reworking of glib2.0 for cross bootstrap, Python archive rebuilds and more! (by Anupa Ann Joseph)

Debian Contributions: 2024-09 Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Pydantic v2, by Colin Watson Pydantic is a useful library for validating data in Python using type hints: Freexian uses it in a number of projects, including Debusine. Its Debian packaging had been stalled at 1.10.17 in testing for some time, partly due to needing to make sure everything else could cope with the breaking changes introduced in 2.x, but mostly due to needing to sort out packaging of its new Rust dependencies. Several other people (notably Alexandre Detiste, Andreas Tille, Drew Parsons, and Timo R hling) had made some good progress on this, but nobody had quite got it over the line and it seemed a bit stuck. Colin upgraded a few Rust libraries to new upstream versions, packaged rust-jiter, and chased various failures in other packages. This eventually allowed getting current versions of both pydantic-core and pydantic into testing. It should now be much easier for us to stay up to date routinely.

Reworking of glib2.0 for cross bootstrap, by Helmut Grohne Simon McVittie (not affiliated with Freexian) earlier restructured the libglib2.0-dev such that it would absorb more functionality and in particular provide tools for working with .gir files. Those tools practically require being run for their host architecture (practically this means running under qemu-user) which is at odds with the requirements of architecture cross bootstrap. The qemu requirement was expressed in package dependencies and also made people unhappy attempting to use libglib2.0-dev for i386 on amd64 without resorting to qemu. The use of qemu in architecture bootstrap is particularly problematic as it tends to not be ready at the time bootstrapping is needed. As a result, Simon proposed and implemented the introduction of a libgio-2.0-dev package providing a subset of libglib2.0-dev that does not require qemu. Packages should continue to use libglib2.0-dev in their Build-Depends unless involved in architecture bootstrap. Helmut reviewed and tested the implementation and integrated the necessary changes into rebootstrap. He also prepared a patch for libverto to use the new package and proposed adding forward compatibility to glib2.0. Helmut continued working on adding cross-exe-wrapper to architecture-properties and implemented autopkgtests later improved by Simon. The cross-exe-wrapper package now provides a generic mechanism to a program on a different architecture by using qemu when needed only. For instance, a dependency on cross-exe-wrapper:i386 provides a i686-linux-gnu-cross-exe-wrapper program that can be used to wrap an ELF executable for the i386 architecture. When installed on amd64 or i386 it will skip installing or running qemu, but for other architectures qemu will be used automatically. This facility can be used to support cross building with targeted use of qemu in cases where running host code is unavoidable as is the case for GObject introspection. This concludes the joint work with Simon and Niels Thykier on glib2.0 and architecture-properties resolving known architecture bootstrap regressions arising from the glib2.0 refactoring earlier this year.

Analyzing binary package metadata, by Helmut Grohne As Guillem Jover (not affiliated with Freexian) continues to work on adding metadata tracking to dpkg, the question arises how this affects existing packages. The dedup.debian.net infrastructure provides an easy playground to answer such questions, so Helmut gathered file metadata from all binary packages in unstable and performed an explorative analysis. Some results include: Guillem also performed a cursory analysis and reported other problem categories such as mismatching directory permissions for directories installed by multiple packages and thus gained a better understanding of what consistency checks dpkg can enforce.

Python archive rebuilds, by Stefano Rivera Last month Stefano started to write some tooling to do large-scale rebuilds in debusine, starting with finding packages that had already started to fail to build from source (FTBFS) due to the removal of setup.py test. This month, Stefano did some more rebuilds, starting with experimental versions of dh-python. During the Python 3.12 transition, we had added a dependency on python3-setuptools to dh-python, to ease the transition. Python 3.12 removed distutils from the stdlib, but many packages were expecting it to still be available. Setuptools contains a version of distutils, and dh-python was a convenient place to depend on setuptools for most package builds. This dependency was never meant to be permanent. A rebuild without it resulted in mass-filing about 340 bugs (and around 80 more by mistake). A new feature in Python 3.12, was to have unittest s test runner exit with a non-zero return code, if no tests were run. We added this feature, to be able to detect tests that are not being discovered, by mistake. We are ignoring this failure, as we wouldn t want to suddenly cause hundreds of packages to fail to build, if they have no tests. Stefano did a rebuild to see how many packages were affected, and found that around 1000 were. The Debian Python community has not come to a conclusion on how to move forward with this. As soon as Python 3.13 release candidate 2 was available, Stefano did a rebuild of the Python packages in the archive against it. This was a more complex rebuild than the others, as it had to be done in stages. Many packages need other Python packages at build time, typically to run tests. So transitions like this involve some manual bootstrapping, followed by several rounds of builds. Not all packages could be tested, as not all their dependencies support 3.13 yet. The result was around 100 bugs in packages that need work to support Python 3.13. Many other packages will need additional work to properly support Python 3.13, but being able to build (and run tests) is an important first step.

Miscellaneous contributions
  • Carles prepared the update of python-pyaarlo package to a new upstream release.
  • Carles worked on updating python-ring-doorbell to a new upstream release. Unfinished, pending to package a new dependency python3-firebase-messaging RFP #1082958 and its dependency python3-http-ece RFP #1083020.
  • Carles improved po-debconf-manager. Main new feature is that it can open Salsa merge requests. Aiming for a lightning talk in MiniDebConf Toulouse (November) to be functional end to end and get feedback from the wider public for this proof of concept.
  • Carles helped one translator to use po-debconf-manager (added compatibility for bullseye, fixed other issues) and reviewed 17 package templates.
  • Colin upgraded the OpenSSH packaging to 9.9p1.
  • Colin upgraded the various YubiHSM packages to new upstream versions, enabled more tests, fixed yubihsm-shell build failures on some 32-bit architectures, made yubihsm-shell build reproducibly, and fixed yubihsm-connector to apply udev rules to existing devices when the package is installed. As usual, bookworm-backports is up to date with all these changes.
  • Colin fixed quite a bit of fallout from setuptools 72.0.0 removing setup.py test, backported a large upstream patch set to make buildbot work with SQLAlchemy 2.0, and upgraded 25 other Python packages to new upstream versions.
  • Enrico worked with Jakob Haufe to get him up to speed for managing sso.debian.org
  • Rapha l did remove spam entries in the list of teams on tracker.debian.org (see #1080446), and he applied a few external contributions, fixing a rendering issue and replacing the DDPO link with a more useful alternative. He also gave feedback on a couple of merge requests that required more work. As part of the analysis of the underlying problem, he suggested to the ftpmasters (via #1083068) to auto-reject packages having the too-many-contacts lintian error, and he raised the severity of #1076048 to serious to actually have that 4 year old bug fixed.
  • Rapha l uploaded zim and hamster-time-tracker to fix issues with Python 3.12 getting rid of setuptools. He also uploaded a new gnome-shell-extension-hamster to cope with the upcoming transition to GNOME 47.
  • Helmut sent seven patches and sponsored one upload for cross build failures.
  • Helmut uploaded a Nagios/Icinga plugin check-smart-attributes for monitoring the health of physical disks.
  • Helmut collaborated on sbuild reviewing and improving a MR for refactoring the unshare backend.
  • Helmut sent a patch fixing coinstallability of gcc-defaults.
  • Helmut continued to monitor the evolution of the /usr-move. With more and more key packages such as libvirt or fuse3 fixed. We re moving into the boring long-tail of the transition.
  • Helmut proposed updating the meson buildsystem in debhelper to use env2mfile.
  • Helmut continued to update patches maintained in rebootstrap. Due to the work on glib2.0 above, rebootstrap moves a lot further, but still fails for any architecture.
  • Santiago reviewed some Merge Request in Salsa CI, such as: !478, proposed by Otto to extend the information about how to use additional runners in the pipeline and !518, proposed by Ahmed to add support for Ubuntu images, that will help to test how some debian packages, including the complex MariaDB are built on Ubuntu. Santiago also prepared !545, which will make the reprotest job more consistent with the result seen on reproducible-builds.
  • Santiago worked on different tasks related to DebConf 25. Especially he drafted the fundraising brochure (which is almost ready).
  • Thorsten Alteholz uploaded package libcupsfilter to fix the autopkgtest and a dependency problem of this package. After package splix was abandoned by upstream and OpenPrinting.org adopted its maintenance, Thorsten uploaded their first release.
  • Anupa published posts on the Debian Administrators group in LinkedIn and moderated the group, one of the tasks of the Debian Publicity Team.
  • Anupa helped organize DebUtsav 2024. It had over 100 attendees with hand-on sessions on making initial contributions to Linux Kernel, Debian packaging, submitting documentation to Debian wiki and assisting Debian Installations.

7 October 2024

Reproducible Builds: Reproducible Builds in September 2024

Welcome to the September 2024 report from the Reproducible Builds project! Our reports attempt to outline what we ve been up to over the past month, highlighting news items from elsewhere in tech where they are related. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. New binsider tool to analyse ELF binaries
  2. Unreproducibility of GHC Haskell compiler 95% fixed
  3. Mailing list summary
  4. Towards a 100% bit-for-bit reproducible OS
  5. Two new reproducibility-related academic papers
  6. Distribution work
  7. diffoscope
  8. Other software development
  9. Android toolchain core count issue reported
  10. New Gradle plugin for reproducibility
  11. Website updates
  12. Upstream patches
  13. Reproducibility testing framework

New binsider tool to analyse ELF binaries Reproducible Builds developer Orhun Parmaks z has announced a fantastic new tool to analyse the contents of ELF binaries. According to the project s README page:
Binsider can perform static and dynamic analysis, inspect strings, examine linked libraries, and perform hexdumps, all within a user-friendly terminal user interface!
More information about Binsider s features and how it works can be found within Binsider s documentation pages.

Unreproducibility of GHC Haskell compiler 95% fixed A seven-year-old bug about the nondeterminism of object code generated by the Glasgow Haskell Compiler (GHC) received a recent update, consisting of Rodrigo Mesquita noting that the issue is:
95% fixed by [merge request] !12680 when -fobject-determinism is enabled. [ ]
The linked merge request has since been merged, and Rodrigo goes on to say that:
After that patch is merged, there are some rarer bugs in both interface file determinism (eg. #25170) and in object determinism (eg. #25269) that need to be taken care of, but the great majority of the work needed to get there should have been merged already. When merged, I think we should close this one in favour of the more specific determinism issues like the two linked above.

Mailing list summary On our mailing list this month:
  • Fay Stegerman let everyone know that she started a thread on the Fediverse about the problems caused by unreproducible zlib/deflate compression in .zip and .apk files and later followed up with the results of her subsequent investigation.
  • Long-time developer kpcyrd wrote that there has been a recent public discussion on the Arch Linux GitLab [instance] about the challenges and possible opportunities for making the Linux kernel package reproducible , all relating to the CONFIG_MODULE_SIG flag. [ ]
  • Bernhard M. Wiedemann followed-up to an in-person conversation at our recent Hamburg 2024 summit on the potential presence for Reproducible Builds in recognised standards. [ ]
  • Fay Stegerman also wrote about her worry about the possible repercussions for RB tooling of Debian migrating from zlib to zlib-ng as reproducibility requires identical compressed data streams. [ ]
  • Martin Monperrus wrote the list announcing the latest release of maven-lockfile that is designed aid building Maven projects with integrity . [ ]
  • Lastly, Bernhard M. Wiedemann wrote about potential role of reproducible builds in combatting silent data corruption, as detailed in a recent Tweet and scholarly paper on faulty CPU cores. [ ]

Towards a 100% bit-for-bit reproducible OS Bernhard M. Wiedemann began writing on journey towards a 100% bit-for-bit reproducible operating system on the openSUSE wiki:
This is a report of Part 1 of my journey: building 100% bit-reproducible packages for every package that makes up [openSUSE s] minimalVM image. This target was chosen as the smallest useful result/artifact. The larger package-sets get, the more disk-space and build-power is required to build/verify all of them.
This work was sponsored by NLnet s NGI Zero fund.

Distribution work In Debian this month, 14 reviews of Debian packages were added, 12 were updated and 20 were removed, all adding to our knowledge about identified issues. A number of issue types were updated as well. [ ][ ] In addition, Holger opened 4 bugs against the debrebuild component of the devscripts suite of tools. In particular:
  • #1081047: Fails to download .dsc file.
  • #1081048: Does not work with a proxy.
  • #1081050: Fails to create a debrebuild.tar.
  • #1081839: Fails with E: mmdebstrap failed to run error.
Last month, an issue was filed to update the Salsa CI pipeline (used by 1,000s of Debian packages) to no longer test for reproducibility with reprotest s build_path variation. Holger Levsen provided a rationale for this change in the issue, which has already been made to the tests being performed by tests.reproducible-builds.org. This month, this issue was closed by Santiago R. R., nicely explaining that build path variation is no longer the default, and, if desired, how developers may enable it again. In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading version 278 to Debian:
  • New features:
    • Add a helpful contextual message to the output if comparing Debian .orig tarballs within .dsc files without the ability to fuzzy-match away the leading directory. [ ]
  • Bug fixes:
    • Drop removal of calculated os.path.basename from GNU readelf output. [ ]
    • Correctly invert X% similar value and do not emit 100% similar . [ ]
  • Misc:
    • Temporarily remove procyon-decompiler from Build-Depends as it was removed from testing (via #1057532). (#1082636)
    • Update copyright years. [ ]
For trydiffoscope, the command-line client for the web-based version of diffoscope, Chris Lamb also:
  • Added an explicit python3-setuptools dependency. (#1080825)
  • Bumped the Standards-Version to 4.7.0. [ ]

Other software development disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into system calls to reliably flush out reproducibility issues. This month, version 0.5.11-4 was uploaded to Debian unstable by Holger Levsen making the following changes:
  • Replace build-dependency on the obsolete pkg-config package with one on pkgconf, following a Lintian check. [ ]
  • Bump Standards-Version field to 4.7.0, with no related changes needed. [ ]

In addition, reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, version 0.7.28 was uploaded to Debian unstable by Holger Levsen including a change by Jelle van der Waa to move away from the pipes Python module to shlex, as the former will be removed in Python version 3.13 [ ].

Android toolchain core count issue reported Fay Stegerman reported an issue with the Android toolchain where a part of the build system generates a different classes.dex file (and thus a different .apk) depending on the number of cores available during the build, thereby breaking Reproducible Builds:
We ve rebuilt [tag v3.6.1] multiple times (each time in a fresh container): with 2, 4, 6, 8, and 16 cores available, respectively:
  • With 2 and 4 cores we always get an unsigned APK with SHA-256 14763d682c9286ef .
  • With 6, 8, and 16 cores we get an unsigned APK with SHA-256 35324ba4c492760 instead.

New Gradle plugin for reproducibility A new plugin for the Gradle build tool for Java has been released. This easily-enabled plugin results in:
reproducibility settings [being] applied to some of Gradle s built-in tasks that should really be the default. Compatible with Java 8 and Gradle 8.3 or later.

Website updates There were a rather substantial number of improvements made to our website this month, including:

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In September, a number of changes were made by Holger Levsen, including:
  • Debian-related changes:
    • Upgrade the osuosl4 node to Debian trixie in anticipation of running debrebuild and rebuilderd there. [ ][ ][ ]
    • Temporarily mark the osuosl4 node as offline due to ongoing xfs_repair filesystem maintenance. [ ][ ]
    • Do not warn about (very old) broken nodes. [ ]
    • Add the risc64 architecture to the multiarch version skew tests for Debian trixie and sid. [ ][ ][ ]
    • Mark the virt 32,64 b nodes as down. [ ]
  • Misc changes:
    • Add support for powercycling OpenStack instances. [ ]
    • Update the fail2ban to ban hosts for 4 weeks in total [ ][ ] and take care to never ban our own Jenkins instance. [ ]
In addition, Vagrant Cascadian recorded a disk failure for the virt32b and virt64b nodes [ ], performed some maintenance of the cbxi4a node [ ][ ] and marked most armhf architecture systems as being back online.

Finally, If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

3 October 2024

Mike Gabriel: Creating (a) new frontend(s) for Polis

After (quite) a summer break, here comes the 4th article of the 5-episode blog post series on Polis, written by Guido Berh rster, member of staff at my company Fre(i)e Software GmbH. Have fun with the read on Guido's work on Polis,
Mike
Table of Contents of the Blog Post Series
  1. Introduction
  2. Initial evaluation and adaptation
  3. Issues extending Polis and adjusting our goals
  4. Creating (a) new frontend(s) for Polis (this article)
  5. Current status and roadmap
4. Creating (a) new frontend(s) for Polis Why a new frontend was needed... Our initial experiences of working with Polis, the effort required to implement more invasive changes and the desire of iterating changes more rapidly ultimately lead to the decision to create a new foundation for frontend development that would be independent of but compatible with the upstream project. Our primary objective was thus not to develop another frontend but rather to make frontend development more flexible and to facilitate experimentation and rapid prototyping of different frontends by providing abstraction layers and building blocks. This also implied developing a corresponding backend since the Polis backend is tightly coupled to the frontend and is neither intended to be used by third-party projects nor supporting cross-domain requests due to the expectation of being embedded as an iframe on third-party websites. The long-term plan for achieving our objectives is to provide three abstraction layers for building frontends: The Particiapp Project Under the umbrella of the Particiapp project we have so far developed two new components: Both the participation frontend and backend are fully compatible and require an existing Polis installation and can be run alongside the upstream frontend. More specifically, the administration frontend and common backend are required to administrate conversations and send out notifications and the statistics processing server is required for processing the voting results. Particiapi server For the backend the Python language and the Flask framework were chosen as a technological basis mainly due to developer mindshare, a large community and ecosystem and the smaller dependency chain and maintenance overhead compared to Node.js/npm. Instead of integrating specific identity providers we adopted the OpenID Connect standard as an abstraction layer for authentication which allows delegating authentication either to a self-hosted identity provider or a large number of existing external identity providers. Particiapp Example Frontend The experimental example frontend serves both as a test bed for the client library and as a tool for better understanding the needs of frontend designers. It also features a completely redesigned user interface and results visualization in line with our goals. Branded variants are currently used for evaluation and testing by the stakeholders. In order to simplify evaluation, development, testing and deployment a Docker Compose configuration is made available which contains all necessary components for running Polis with our experimental example frontend. In addition, a development environment is provided which includes a preconfigured OpenID Connect identity provider (KeyCloak), SMTP-Server with web interface (MailDev), and a database frontend (PgAdmin). The new frontend can also be tested using our public demo server.

1 October 2024

Colin Watson: Free software activity in September 2024

Almost all of my Debian contributions this month were sponsored by Freexian. You can also support my work directly via Liberapay. Pydantic My main Debian project for the month turned out to be getting Pydantic back into a good state in Debian testing. I ve used Pydantic quite a bit in various projects, most recently in Debusine, so I have an interest in making sure it works well in Debian. However, it had been stalled on 1.10.17 for quite a while due to the complexities of getting 2.x packaged. This was partly making sure everything else could cope with the transition, but in practice mostly sorting out packaging of its new Rust dependencies. Several other people (notably Alexandre Detiste, Andreas Tille, Drew Parsons, and Timo R hling) had made some good progress on this, but nobody had quite got it over the line and it seemed a bit stuck. Learning Rust is on my to-do list, but merely not knowing a language hasn t stopped me before. So I learned how the Debian Rust team s packaging works, upgraded a few packages to new upstream versions (including rust-half and upstream rust-idna test fixes), and packaged rust-jiter. After a lot of waiting around for various things and chasing some failures in other packages I was eventually able to get current versions of both pydantic-core and pydantic into testing. I m looking forward to being able to drop our clunky v1 compatibility code once debusine can rely on running on trixie! OpenSSH I upgraded the Debian packaging to OpenSSH 9.9p1. YubiHSM I upgraded python-yubihsm, yubihsm-connector, and yubihsm-shell to new upstream versions. I noticed that I could enable some tests in python-yubihsm and yubihsm-shell; I d previously thought the whole test suite required a real YubiHSM device, but when I looked closer it turned out that this was only true for some tests. I fixed yubihsm-shell build failures on some 32-bit architectures (upstream PRs #431, #432), and also made it build reproducibly. Thanks to Helmut Grohne, I fixed yubihsm-connector to apply udev rules to existing devices when the package is installed. As usual, bookworm-backports is up to date with all these changes. Python team setuptools 72.0.0 removed the venerable setup.py test command. This caused some fallout in Debian, some of which was quite non-obvious as packaging helpers sometimes fell back to different ways of running test suites that didn t quite work. I fixed django-guardian, manuel, python-autopage, python-flask-seeder, python-pgpdump, python-potr, python-precis-i18n, python-stopit, serpent, straight.plugin, supervisor, and zope.i18nmessageid. As usual for new language versions, the addition of Python 3.13 caused some problems. I fixed psycopg2, python-time-machine, and python-traits. I fixed build/autopkgtest failures in keymapper, python-django-test-migrations, python-rosettasciio, routes, transmissionrpc, and twisted. buildbot was in a bit of a mess due to being incompatible with SQLAlchemy 2.0. Fortunately by the time I got to it upstream had committed a workable set of patches, and the main difficulty was figuring out what to cherry-pick since they haven t made a new upstream release with all of that yet. I figured this out and got us up to 4.0.3. Adrian Bunk asked whether python-zipp should be removed from trixie. I spent some time investigating this and concluded that the answer was no, but looking into it was an interesting exercise anyway. On the other hand, I looked into flask-appbuilder, concluded that it should be removed, and filed a removal request. I upgraded some embedded CSS files in nbconvert. I upgraded importlib-resources, ipywidgets, jsonpickle, pydantic-settings, pylint (fixing a test failure), python-aiohttp-session, python-apptools, python-asyncssh, python-django-celery-beat, python-django-rules, python-limits, python-multidict, python-persistent, python-pkginfo, python-rt, python-spur, python-zipp, stravalib, transmissionrpc, vulture, zodbpickle, zope.exceptions (adopting it), zope.i18nmessageid, zope.proxy, and zope.security to new upstream versions. debmirror The experimental and *-proposed-updates suites used to not have Contents-* files, and a long time ago debmirror was changed to just skip those files in those suites. They were added to the Debian archive some time ago, but debmirror carried on skipping them anyway. Once I realized what was going on, I removed these unnecessary special cases (#819925, #1080168).

29 September 2024

Reproducible Builds: Supporter spotlight: Kees Cook on Linux kernel security

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about our project and the work that we do. This is the eighth installment in a series featuring the projects, companies and individuals who support the Reproducible Builds project. We started this series by featuring the Civil Infrastructure Platform project, and followed this up with a post about the Ford Foundation as well as recent ones about ARDC, the Google Open Source Security Team (GOSST), Bootstrappable Builds, the F-Droid project, David A. Wheeler and Simon Butler. Today, however, we will be talking with Kees Cook, founder of the Kernel Self-Protection Project.

Vagrant Cascadian: Could you tell me a bit about yourself? What sort of things do you work on? Kees Cook: I m a Free Software junkie living in Portland, Oregon, USA. I have been focusing on the upstream Linux kernel s protection of itself. There is a lot of support that the kernel provides userspace to defend itself, but when I first started focusing on this there was not as much attention given to the kernel protecting itself. As userspace got more hardened the kernel itself became a bigger target. Almost 9 years ago I formally announced the Kernel Self-Protection Project because the work necessary was way more than my time and expertise could do alone. So I just try to get people to help as much as possible; people who understand the ARM architecture, people who understand the memory management subsystem to help, people who understand how to make the kernel less buggy.
Vagrant: Could you describe the path that lead you to working on this sort of thing? Kees: I have always been interested in security through the aspect of exploitable flaws. I always thought it was like a magic trick to make a computer do something that it was very much not designed to do and seeing how easy it is to subvert bugs. I wanted to improve that fragility. In 2006, I started working at Canonical on Ubuntu and was mainly focusing on bringing Debian and Ubuntu up to what was the state of the art for Fedora and Gentoo s security hardening efforts. Both had really pioneered a lot of userspace hardening with compiler flags and ELF stuff and many other things for hardened binaries. On the whole, Debian had not really paid attention to it. Debian s packaging building process at the time was sort of a chaotic free-for-all as there wasn t centralized build methodology for defining things. Luckily that did slowly change over the years. In Ubuntu we had the opportunity to apply top down build rules for hardening all the packages. In 2011 Chrome OS was following along and took advantage of a bunch of the security hardening work as they were based on ebuild out of Gentoo and when they looked for someone to help out they reached out to me. We recognized the Linux kernel was pretty much the weakest link in the Chrome OS security posture and I joined them to help solve that. Their userspace was pretty well handled but the kernel had a lot of weaknesses, so focusing on hardening was the next place to go. When I compared notes with other users of the Linux kernel within Google there were a number of common concerns and desires. Chrome OS already had an upstream first requirement, so I tried to consolidate the concerns and solve them upstream. It was challenging to land anything in other kernel team repos at Google, as they (correctly) wanted to minimize their delta from upstream, so I needed to work on any major improvements entirely in upstream and had a lot of support from Google to do that. As such, my focus shifted further from working directly on Chrome OS into being entirely upstream and being more of a consultant to internal teams, helping with integration or sometimes backporting. Since the volume of needed work was so gigantic I needed to find ways to inspire other developers (both inside and outside of Google) to help. Once I had a budget I tried to get folks paid (or hired) to work on these areas when it wasn t already their job.
Vagrant: So my understanding of some of your recent work is basically defining undefined behavior in the language or compiler? Kees: I ve found the term undefined behavior to have a really strict meaning within the compiler community, so I have tried to redefine my goal as eliminating unexpected behavior or ambiguous language constructs . At the end of the day ambiguity leads to bugs, and bugs lead to exploitable security flaws. I ve been taking a four-pronged approach: supporting the work people are doing to get rid of ambiguity, identify new areas where ambiguity needs to be removed, actually removing that ambiguity from the C language, and then dealing with any needed refactoring in the Linux kernel source to adapt to the new constraints. None of this is particularly novel; people have recognized how dangerous some of these language constructs are for decades and decades but I think it is a combination of hard problems and a lot of refactoring that nobody has the interest/resources to do. So, we have been incrementally going after the lowest hanging fruit. One clear example in recent years was the elimination of C s implicit fall-through in switch statements. The language would just fall through between adjacent cases if a break (or other code flow directive) wasn t present. But this is ambiguous: is the code meant to fall-through, or did the author just forget a break statement? By defining the [[fallthrough]] statement, and requiring its use in Linux, all switch statements now have explicit code flow, and the entire class of bugs disappeared. During our refactoring we actually found that 1 in 10 added [[fallthrough]] statements were actually missing break statements. This was an extraordinarily common bug! So getting rid of that ambiguity is where we have been. Another area I ve been spending a bit of time on lately is looking at how defensive security work has challenges associated with metrics. How do you measure your defensive security impact? You can t say because we installed locks on the doors, 20% fewer break-ins have happened. Much of our signal is always secondary or retrospective, which is frustrating: This class of flaw was used X much over the last decade so, and if we have eliminated that class of flaw and will never see it again, what is the impact? Is the impact infinity? Attackers will just move to the next easiest thing. But it means that exploitation gets incrementally more difficult. As attack surfaces are reduced, the expense of exploitation goes up.
Vagrant: So it is hard to identify how effective this is how bad would it be if people just gave up? Kees: I think it would be pretty bad, because as we have seen, using secondary factors, the work we have done in the industry at large, not just the Linux kernel, has had an impact. What we, Microsoft, Apple, and everyone else is doing for their respective software ecosystems, has shown that the price of functional exploits in the black market has gone up. Especially for really egregious stuff like a zero-click remote code execution. If those were cheap then obviously we are not doing something right, and it becomes clear that it s trivial for anyone to attack the infrastructure that our lives depend on. But thankfully we have seen over the last two decades that prices for exploits keep going up and up into millions of dollars. I think it is important to keep working on that because, as a central piece of modern computer infrastructure, the Linux kernel has a giant target painted on it. If we give up, we have to accept that our computers are not doing what they were designed to do, which I can t accept. The safety of my grandparents shouldn t be any different from the safety of journalists, and political activists, and anyone else who might be the target of attacks. We need to be able to trust our devices otherwise why use them at all?
Vagrant: What has been your biggest success in recent years? Kees: I think with all these things I am not the only actor. Almost everything that we have been successful at has been because of a lot of people s work, and one of the big ones that has been coordinated across the ecosystem and across compilers was initializing stack variables to 0 by default. This feature was added in Clang, GCC, and MSVC across the board even though there were a lot of fears about forking the C language. The worry was that developers would come to depend on zero-initialized stack variables, but this hasn t been the case because we still warn about uninitialized variables when the compiler can figure that out. So you still still get the warnings at compile time but now you can count on the contents of your stack at run-time and we drop an entire class of uninitialized variable flaws. While the exploitation of this class has mostly been around memory content exposure, it has also been used for control flow attacks. So that was politically and technically a large challenge: convincing people it was necessary, showing its utility, and implementing it in a way that everyone would be happy with, resulting in the elimination of a large and persistent class of flaws in C.
Vagrant: In a world where things are generally Reproducible do you see ways in which that might affect your work? Kees: One of the questions I frequently get is, What version of the Linux kernel has feature $foo? If I know how things are built, I can answer with just a version number. In a Reproducible Builds scenario I can count on the compiler version, compiler flags, kernel configuration, etc. all those things are known, so I can actually answer definitively that a certain feature exists. So that is an area where Reproducible Builds affects me most directly. Indirectly, it is just being able to trust the binaries you are running are going to behave the same for the same build environment is critical for sane testing.
Vagrant: Have you used diffoscope? Kees: I have! One subset of tree-wide refactoring that we do when getting rid of ambiguous language usage in the kernel is when we have to make source level changes to satisfy some new compiler requirement but where the binary output is not expected to change at all. It is mostly about getting the compiler to understand what is happening, what is intended in the cases where the old ambiguity does actually match the new unambiguous description of what is intended. The binary shouldn t change. We have used diffoscope to compare the before and after binaries to confirm that yep, there is no change in binary .
Vagrant: You cannot just use checksums for that? Kees: For the most part, we need to only compare the text segments. We try to hold as much stable as we can, following the Reproducible Builds documentation for the kernel, but there are macros in the kernel that are sensitive to source line numbers and as a result those will change the layout of the data segment (and sometimes the text segment too). With diffoscope there s flexibility where I can exclude or include different comparisons. Sometimes I just go look at what diffoscope is doing and do that manually, because I can tweak that a little harder, but diffoscope is definitely the default. Diffoscope is awesome!
Vagrant: Where has reproducible builds affected you? Kees: One of the notable wins of reproducible builds lately was dealing with the fallout of the XZ backdoor and just being able to ask the question is my build environment running the expected code? and to be able to compare the output generated from one install that never had a vulnerable XZ and one that did have a vulnerable XZ and compare the results of what you get. That was important for kernel builds because the XZ threat actor was working to expand their influence and capabilities to include Linux kernel builds, but they didn t finish their work before they were noticed. I think what happened with Debian proving the build infrastructure was not affected is an important example of how people would have needed to verify the kernel builds too.
Vagrant: What do you want to see for the near or distant future in security work? Kees: For reproducible builds in the kernel, in the work that has been going on in the ClangBuiltLinux project, one of the driving forces of code and usability quality has been the continuous integration work. As soon as something breaks, on the kernel side, the Clang side, or something in between the two, we get a fast signal and can chase it and fix the bugs quickly. I would like to see someone with funding to maintain a reproducible kernel build CI. There have been places where there are certain architecture configurations or certain build configuration where we lose reproducibility and right now we have sort of a standard open source development feedback loop where those things get fixed but the time in between introduction and fix can be large. Getting a CI for reproducible kernels would give us the opportunity to shorten that time.
Vagrant: Well, thanks for that! Any last closing thoughts? Kees: I am a big fan of reproducible builds, thank you for all your work. The world is a safer place because of it.
Vagrant: Likewise for your work!


For more information about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

28 September 2024

Dave Hibberd: EuroBSDCon 2024 Report

This year I attended EuroBSDCon 2024 in Dublin. I always appreciate an excuse to head over to Ireland, and this seemed like a great chance to spend some time in Dublin and learn new things. Due to constraints on my time I didn t go to the 2 day devsummit that precedes the conference, only the main event itself.

The Event EuroBSDCon was attended by about 200-250 people, the hardcore of the BSD community! Attendees came from all over, I met Canadians, USAians, Germanians, Belgians and Irelandians amongst other nationalities! The event was at UCD Dublin, which is a gorgeous university campus about 10km south of Dublin proper in Stillorgan. The speaker hotel was a 20 minute walk (at my ~9min/km pace) from the hotel, or a quick bus journey. It was a pleasant walk, through the leafy campus and then along some pretty broad pavements, albeit beside a dual carriageway. The cycle infrastructure was pretty excellent too, but I sadly was unable to lease a city bike and make my way around on 2 wheels - Dublinbikes don t extend that far out the city. Lunch each day was Irish themed food - Saturday was beef stew (a Frenchman asked me what it was called - his only equivalent words were Beef Bourguignon ) and Sunday was Bangers & Mash! The kitchen struggled a bit - food was brought out in bowls in waves, and that ensured there was artificial scarcity that clearly left anxiety for some that they weren t going to be fed! Everyone I met was friendly from the day I arrived, and that set me very much at ease and made the event much more enjoyable - things are better shared with others. Big shout out to dch and Blake Willis for spending a lot of time talking to me over the weekend!

Talks I Attended

Keynote: Evidence based Policy formation in the EU Tom Smyth This talk given by Tom Smyth was an interesting look into his work with EU Policymakers in ensuring fair competition for his small, Irish ISP. It was an enlightening look into the workings of the EU and the various bodies that set, and manage policy. It truly is a complicated beast, but the feeling I left with was that there are people all through the organisation who are desperate to do the right thing for EU citizens at all costs. Sadly none of it is directly applicable to me living in the UK, but I still get to have a say on policy and vote in polls as an Irish citizen abroad.

10(ish) years of FreeBSD/arm64 Andrew Turner I have been a fan of ARM platforms for a long, long time. I had an early ARM Chromebook and have been equal measures excited and frustrated by the raspberry pi since first contact. I tend to find other ARM people at events and this was no exception! It was an interesting view into one person s dedication to making arm64 a platform for FreeBSD, starting out with no documentation or hardware to becoming a first-class platform. It s interesting to see the roadmap and things upcoming too and makes me hopeful for the future of arm64 in various OSes!

1-800-RC(8)-HELP: Dial Into FreeBSD Service Scripts Mastery! Mateusz Piotrowski rc scripts and startup applications scare me a bit. I m better at systemd units than sysvinit scripts, but that isn t really a transferable skill! This was a deep dive into lots of the functionality that FreeBSD s RC offers, and highlighted things that I only thought were limited to Linux s systemd. I am much more aware of what it s capable of now, but I m still scared to take it on! Afterwards I had a great chat in the hallway with Mateusz about our OS s different approaches to this problem and was impressed with the pragmatic view he had on startup, systemd, rc and the future!

Package management without borders. Using Ravenports on multiple BSDs Michael Reim Ports on the BSDs interest me, but I hadn t realised that outside of each major BSD s collection there were other, cross platform ports collections offered. Ravenports is one of these under developments, and it was good to understand the hows, why and what s happenings of the system. Plus, with my hibbian obsession on building other people s software as my own packages, it s interesting to see how others are doing it!

Building a Modern Packet Radio Network using Open Software me I spoke for 45 minutes to share my passion and frustration for amateur radio, packet radio, the law, the technology and what we re doing in the UK Packet network. This was a lot of fun - it felt like I had a busy room, lots of people interested in the stupid stuff I do with technology and I had lots of conversations after the fact about radio, telecoms, networking and at one point was cornered by what I describe as the Erlang Mafia to talk about how they could help!

Hacking - 30 years ago Walter Belgers This unrecorded talk looked at the history of the Dutch hacker scene, and a young group of hackers explorations of the early internet before modern security was a thing. It was exciting, enrapturing, well presented and a great story of a well spent youth in front of computers.

Social By 1730 I was pretty drained so I took myself back to the hotel missing the last talks, had some down time, and got the DART train to the social event at Brewdog. This invovled about an hour s walking and some train time and that was a nice time to reset my head and just watch the world. The train I was on had a particularly interesting feature where when the motors were not loaded (slowdown or coasting) the lights slowly flickered dim-bright-dim. I don t know if this is across the fleet or just this one, but it was fun to pontificate as I looked out the window at South Dublin passing by. The social was good - a few beer tokens (cider in my case, trying to avoid beer-driven hangovers still), some pleasant junk food and plenty of good company to talk to, lots of people wanting to talk about radio and packet to me! Brewdog struggled a bit - both in bar speed (a linear queue formed despite the staff preferring the crowd-around method of queue) and buffet food appeared in somewhat disjointed waves, meaning that people loitered around the food tables and cleared the plates of wings, sliders, fries, onion rings, mac & cheese as they appeared 4-5 plates at a time. Perhaps a few hundred hungry bodies was a bit too much for them to feed at once. They had shuffleboard that was played all night by various groups! I caught the last bus home, which was relatively painless!

Is our software sustainable? Kent Inge Fagerland Simonsen This was an interesting look into reducing the footprint of software to make it a net benefit. Lots of examples of how little changes can barrel up to big, gigawatthour changes when aggregated over the entire installbase of android or iOS!

A Packet s Journey Through the OpenBSD Network Stack Alexander Bluhm This was an analysis of what happens at each stage of networking in OpenBSD and was pretty interesting to see. Lots of it was out my depth, but it s cool to get an explanation and appreciation for various elements of how software handles each packet that arrives and the differences in the ipv4 and ipv6 stack!

FreeBSD at 30 Years: Its Secrets to Success Kirk McKusick This was a great statistical breakdown of FreeBSD since inception, including top committers, why certain parts of the system and community work so well and what has given it staying power compared to some projects on the internet that peter out after just a few years! Kirk s excitement and passion for the project really shone through, and I want to read his similarly titled article in the FreeBSD Journal now!

Building an open native FreeBSD CI system from scratch with lua, C, jails & zfs Dave Cottlehuber Dave spoke pretty excitedly about his work on a CI system using tools that FreeBSD ships with, and introduced me to the integration of C and Lua which I wasn t fully aware of before. Or I was, and my brain forgot it! With my interest in software build this year, it was quite a timely look at how others are thinking of doing things (I am doing similar stuff with zfs!). I look forward to playing with it when it finally is released to the Real World!

Building an Appliance Allan Jude This was an interesting look into the tools that FreeBSD provides which can be used to make immutable, appliance OSes without too much overhead. Fail safe upgrades and boots with ZFS, running approved code with secure boot, factory resetting and more were discussed! I have had thoughts around this in the recent past, so it was good to have some ideas validated, some challenged and gave me food for thought.

Experience as a speaker I really enjoyed being a speaker at the event! I ve spoken at other things before, but this really was a cut above. The event having money to provide me a hotel was a really welcome surprise, and also receiving a gorgeous scarf as a speaker gift was a great surprise (and it has already been worn with the change of temperature here in Scotland this week!). I would definitely consider returning, either as an attendee or as a speaker. The community of attendees were pragmatic, interesting, engaging and welcoming, the organising committee were spot-on in their work making it happen and the whole event, while turning my brain to mush with all the information, was really enjoyable and I left energised and excited by things instead of ground down and tired.

26 September 2024

Melissa Wen: Reflections on 2024 Linux Display Next Hackfest

Hey everyone! The 2024 Linux Display Next hackfest concluded in May, and its outcomes continue to shape the Linux Display stack. Igalia hosted this year s event in A Coru a, Spain, bringing together leading experts in the field. Samuel Iglesias and I organized this year s edition and this blog post summarizes the experience and its fruits. One of the highlights of this year s hackfest was the wide range of backgrounds represented by our 40 participants (both on-site and remotely). Developers and experts from various companies and open-source projects came together to advance the Linux Display ecosystem. You can find the list of participants here. The event covered a broad spectrum of topics affecting the development of Linux projects, user experiences, and the future of display technologies on Linux. From cutting-edge topics to long-term discussions, you can check the event agenda here.

Organization Highlights The hackfest was marked by in-depth discussions and knowledge sharing among Linux contributors, making everyone inspired, informed, and connected to the community. Building on feedback from the previous year, we refined the unconference format to enhance participant preparation and engagement. Structured Agenda and Timeboxes: Each session had a defined scope, time limit (1h20 or 2h10), and began with an introductory talk on the topic.
  • Participant-Led Discussions: We pre-selected in-person participants to lead discussions, allowing them to prepare introductions, resources, and scope.
  • Transparent Scheduling: The schedule was shared in advance as GitHub issues, encouraging participants to review and prepare for sessions of interest.
Engaging Sessions: The hackfest featured a variety of topics, including presentations and discussions on how participants were addressing specific subjects within their companies.
  • No Breakout Rooms, No Overlaps: All participants chose to attend all sessions, eliminating the need for separate breakout rooms. We also adapted run-time schedule to keep everybody involved in the same topics.
  • Real-time Updates: We provided notifications and updates through dedicated emails and the event matrix room.
Strengthening Community Connections: The hackfest offered ample opportunities for networking among attendees.
  • Social Events: Igalia sponsored coffee breaks, lunches, and a dinner at a local restaurant.
  • Museum Visit: Participants enjoyed a sponsored visit to the Museum of Estrela Galicia Beer (MEGA).

Fruitful Discussions and Follow-up The structured agenda and breaks allowed us to cover multiple topics during the hackfest. These discussions have led to new display feature development and improvements, as evidenced by patches, merge requests, and implementations in project repositories and mailing lists. With the KMS color management API taking shape, we discussed refinements and best approaches to cover the variety of color pipeline from different hardware-vendors. We are also investigating techniques for a performant SDR<->HDR content reproduction and reducing latency and power consumption when using the color blocks of the hardware.

Color Management/HDR Color Management and HDR continued to be the hottest topic of the hackfest. We had three sessions dedicated to discuss Color and HDR across Linux Display stack layers.

Color/HDR (Kernel-Level) Harry Wentland (AMD) led this session. Here, kernel Developers shared the Color Management pipeline of AMD, Intel and NVidia. We counted with diagrams and explanations from HW-vendors developers that discussed differences, constraints and paths to fit them into the KMS generic color management properties such as advertising modeset needs, IN\_FORMAT, segmented LUTs, interpolation types, etc. Developers from Qualcomm and ARM also added information regarding their hardware. Upstream work related to this session:

Color/HDR (Compositor-Level) Sebastian Wick (RedHat) led this session. It started with Sebastian s presentation covering Wayland color protocols and compositor implementation. Also, an explanation of APIs provided by Wayland and how they can be used to achieve better color management for applications and discussions around ICC profiles and color representation metadata. There was also an intensive Q&A about LittleCMS with Marti Maria. Upstream work related to this session:

Color/HDR (Use Cases and Testing) Christopher Cameron (Google) and Melissa Wen (Igalia) led this session. In contrast to the other sessions, here we focused less on implementation and more on brainstorming and reflections of real-world SDR and HDR transformations (use and validation) and gainmaps. Christopher gave a nice presentation explaining HDR gainmap images and how we should think of HDR. This presentation and Q&A were important to put participants at the same page of how to transition between SDR and HDR and somehow emulating HDR. We also discussed on the usage of a kernel background color property. Finally, we discussed a bit about Chamelium and the future of VKMS (future work and maintainership).

Power Savings vs Color/Latency Mario Limonciello (AMD) led this session. Mario gave an introductory presentation about AMD ABM (adaptive backlight management) that is similar to Intel DPST. After some discussions, we agreed on exposing a kernel property for power saving policy. This work was already merged on kernel and the userspace support is under development. Upstream work related to this session:

Strategy for video and gaming use-cases Leo Li (AMD) led this session. Miguel Casas (Google) started this session with a presentation of Overlays in Chrome/OS Video, explaining the main goal of power saving by switching off GPU for accelerated compositing and the challenges of different colorspace/HDR for video on Linux. Then Leo Li presented different strategies for video and gaming and we discussed the userspace need of more detailed feedback mechanisms to understand failures when offloading. Also, creating a debugFS interface came up as a tool for debugging and analysis.

Real-time scheduling and async KMS API Xaver Hugl (KDE/BlueSystems) led this session. Compositor developers have exposed some issues with doing real-time scheduling and async page flips. One is that the Kernel limits the lifetime of realtime threads and if a modeset takes too long, the thread will be killed and thus the compositor as well. Also, simple page flips take longer than expected and drivers should optimize them. Another issue is the lack of feedback to compositors about hardware programming time and commit deadlines (the lastest possible time to commit). This is difficult to predict from drivers, since it varies greatly with the type of properties. For example, color management updates take much longer. In this regard, we discusssed implementing a hw_done callback to timestamp when the hardware programming of the last atomic commit is complete. Also an API to pre-program color pipeline in a kind of A/B scheme. It may not be supported by all drivers, but might be useful in different ways.

VRR/Frame Limit, Display Mux, Display Control, and more and beer We also had sessions to discuss a new KMS API to mitigate headaches on VRR and Frame Limit as different brightness level at different refresh rates, abrupt changes of refresh rates, low frame rate compensation (LFC) and precise timing in VRR more. On Display Control we discussed features missing in the current KMS interface for HDR mode, atomic backlight settings, source-based tone mapping, etc. We also discussed the need of a place where compositor developers can post TODOs to be developed by KMS people. The Content-adaptive Scaling and Sharpening session focused on sharpening and scaling filters. In the Display Mux session, we discussed proposals to expose the capability of dynamic mux switching display signal between discrete and integrated GPUs. In the last session of the 2024 Display Next Hackfest, participants representing different compositors summarized current and future work and built a Linux Display wish list , which includes: improvements to VTTY and HDR switching, better dmabuf API for multi-GPU support, definition of tone mapping, blending and scaling sematics, and wayland protocols for advertising to clients which colorspaces are supported. We closed this session with a status update on feature development by compositors, including but not limited to: plane offloading (from libcamera to output) / HDR video offloading (dma-heaps) / plane-based scrolling for web pages, color management / HDR / ICC profiles support, addressing issues such as flickering when color primaries don t match, etc. After three days of intensive discussions, all in-person participants went to a guided tour at the Museum of Extrela Galicia beer (MEGA), pouring and tasting the most famous local beer.

Feedback and Future Directions Participants provided valuable feedback on the hackfest, including suggestions for future improvements.
  • Schedule and Break-time Setup: Having a pre-defined agenda and schedule provided a better balance between long discussions and mental refreshments, preventing the fatigue caused by endless discussions.
  • Action Points: Some participants recommended explicitly asking for action points at the end of each session and assigning people to follow-up tasks.
  • Remote Participation: Remote attendees appreciated the inclusive setup and opportunities to actively participate in discussions.
  • Technical Challenges: There were bandwidth and video streaming issues during some sessions due to the large number of participants.

Thank you for joining the 2024 Display Next Hackfest We can t help but thank the 40 participants, who engaged in-person or virtually on relevant discussions, for a collaborative evolution of the Linux display stack and for building an insightful agenda. A big thank you to the leaders and presenters of the nine sessions: Christopher Cameron (Google), Harry Wentland (AMD), Leo Li (AMD), Mario Limoncello (AMD), Sebastian Wick (RedHat) and Xaver Hugl (KDE/BlueSystems) for the effort in preparing the sessions, explaining the topic and guiding discussions. My acknowledge to the others in-person participants that made such an effort to travel to A Coru a: Alex Goins (NVIDIA), David Turner (Raspberry Pi), Georges Stavracas (Igalia), Joan Torres (SUSE), Liviu Dudau (Arm), Louis Chauvet (Bootlin), Robert Mader (Collabora), Tian Mengge (GravityXR), Victor Jaquez (Igalia) and Victoria Brekenfeld (System76). It was and awesome opportunity to meet you and chat face-to-face. Finally, thanks virtual participants who couldn t make it in person but organized their days to actively participate in each discussion, adding different perspectives and valuable inputs even remotely: Abhinav Kumar (Qualcomm), Chaitanya Borah (Intel), Christopher Braga (Qualcomm), Dor Askayo (Red Hat), Jiri Koten (RedHat), Jonas dahl (Red Hat), Leandro Ribeiro (Collabora), Marti Maria (Little CMS), Marijn Suijten, Mario Kleiner, Martin Stransky (Red Hat), Michel D nzer (Red Hat), Miguel Casas-Sanchez (Google), Mitulkumar Golani (Intel), Naveen Kumar (Intel), Niels De Graef (Red Hat), Pekka Paalanen (Collabora), Pichika Uday Kiran (AMD), Shashank Sharma (AMD), Sriharsha PV (AMD), Simon Ser, Uma Shankar (Intel) and Vikas Korjani (AMD). We look forward to another successful Display Next hackfest, continuing to drive innovation and improvement in the Linux display ecosystem!

25 September 2024

Melissa Wen: Reflections on 2024 Linux Display Next Hackfest

Hey everyone! The 2024 Linux Display Next hackfest concluded in May, and its outcomes continue to shape the Linux Display stack. Igalia hosted this year s event in A Coru a, Spain, bringing together leading experts in the field. Samuel Iglesias and I organized this year s edition and this blog post summarizes the experience and its fruits. One of the highlights of this year s hackfest was the wide range of backgrounds represented by our 40 participants (both on-site and remotely). Developers and experts from various companies and open-source projects came together to advance the Linux Display ecosystem. You can find the list of participants here. The event covered a broad spectrum of topics affecting the development of Linux projects, user experiences, and the future of display technologies on Linux. From cutting-edge topics to long-term discussions, you can check the event agenda here.

Organization Highlights The hackfest was marked by in-depth discussions and knowledge sharing among Linux contributors, making everyone inspired, informed, and connected to the community. Building on feedback from the previous year, we refined the unconference format to enhance participant preparation and engagement. Structured Agenda and Timeboxes: Each session had a defined scope, time limit (1h20 or 2h10), and began with an introductory talk on the topic.
  • Participant-Led Discussions: We pre-selected in-person participants to lead discussions, allowing them to prepare introductions, resources, and scope.
  • Transparent Scheduling: The schedule was shared in advance as GitHub issues, encouraging participants to review and prepare for sessions of interest.
Engaging Sessions: The hackfest featured a variety of topics, including presentations and discussions on how participants were addressing specific subjects within their companies.
  • No Breakout Rooms, No Overlaps: All participants chose to attend all sessions, eliminating the need for separate breakout rooms. We also adapted run-time schedule to keep everybody involved in the same topics.
  • Real-time Updates: We provided notifications and updates through dedicated emails and the event matrix room.
Strengthening Community Connections: The hackfest offered ample opportunities for networking among attendees.
  • Social Events: Igalia sponsored coffee breaks, lunches, and a dinner at a local restaurant.
  • Museum Visit: Participants enjoyed a sponsored visit to the Museum of Estrela Galicia Beer (MEGA).

Fruitful Discussions and Follow-up The structured agenda and breaks allowed us to cover multiple topics during the hackfest. These discussions have led to new display feature development and improvements, as evidenced by patches, merge requests, and implementations in project repositories and mailing lists. With the KMS color management API taking shape, we discussed refinements and best approaches to cover the variety of color pipeline from different hardware-vendors. We are also investigating techniques for a performant SDR<->HDR content reproduction and reducing latency and power consumption when using the color blocks of the hardware.

Color Management/HDR Color Management and HDR continued to be the hottest topic of the hackfest. We had three sessions dedicated to discuss Color and HDR across Linux Display stack layers.

Color/HDR (Kernel-Level) Harry Wentland (AMD) led this session. Here, kernel Developers shared the Color Management pipeline of AMD, Intel and NVidia. We counted with diagrams and explanations from HW-vendors developers that discussed differences, constraints and paths to fit them into the KMS generic color management properties such as advertising modeset needs, IN\_FORMAT, segmented LUTs, interpolation types, etc. Developers from Qualcomm and ARM also added information regarding their hardware. Upstream work related to this session:

Color/HDR (Compositor-Level) Sebastian Wick (RedHat) led this session. It started with Sebastian s presentation covering Wayland color protocols and compositor implementation. Also, an explanation of APIs provided by Wayland and how they can be used to achieve better color management for applications and discussions around ICC profiles and color representation metadata. There was also an intensive Q&A about LittleCMS with Marti Maria. Upstream work related to this session:

Color/HDR (Use Cases and Testing) Christopher Cameron (Google) and Melissa Wen (Igalia) led this session. In contrast to the other sessions, here we focused less on implementation and more on brainstorming and reflections of real-world SDR and HDR transformations (use and validation) and gainmaps. Christopher gave a nice presentation explaining HDR gainmap images and how we should think of HDR. This presentation and Q&A were important to put participants at the same page of how to transition between SDR and HDR and somehow emulating HDR. We also discussed on the usage of a kernel background color property. Finally, we discussed a bit about Chamelium and the future of VKMS (future work and maintainership).

Power Savings vs Color/Latency Mario Limonciello (AMD) led this session. Mario gave an introductory presentation about AMD ABM (adaptive backlight management) that is similar to Intel DPST. After some discussions, we agreed on exposing a kernel property for power saving policy. This work was already merged on kernel and the userspace support is under development. Upstream work related to this session:

Strategy for video and gaming use-cases Leo Li (AMD) led this session. Miguel Casas (Google) started this session with a presentation of Overlays in Chrome/OS Video, explaining the main goal of power saving by switching off GPU for accelerated compositing and the challenges of different colorspace/HDR for video on Linux. Then Leo Li presented different strategies for video and gaming and we discussed the userspace need of more detailed feedback mechanisms to understand failures when offloading. Also, creating a debugFS interface came up as a tool for debugging and analysis.

Real-time scheduling and async KMS API Xaver Hugl (KDE/BlueSystems) led this session. Compositor developers have exposed some issues with doing real-time scheduling and async page flips. One is that the Kernel limits the lifetime of realtime threads and if a modeset takes too long, the thread will be killed and thus the compositor as well. Also, simple page flips take longer than expected and drivers should optimize them. Another issue is the lack of feedback to compositors about hardware programming time and commit deadlines (the lastest possible time to commit). This is difficult to predict from drivers, since it varies greatly with the type of properties. For example, color management updates take much longer. In this regard, we discusssed implementing a hw_done callback to timestamp when the hardware programming of the last atomic commit is complete. Also an API to pre-program color pipeline in a kind of A/B scheme. It may not be supported by all drivers, but might be useful in different ways.

VRR/Frame Limit, Display Mux, Display Control, and more and beer We also had sessions to discuss a new KMS API to mitigate headaches on VRR and Frame Limit as different brightness level at different refresh rates, abrupt changes of refresh rates, low frame rate compensation (LFC) and precise timing in VRR more. On Display Control we discussed features missing in the current KMS interface for HDR mode, atomic backlight settings, source-based tone mapping, etc. We also discussed the need of a place where compositor developers can post TODOs to be developed by KMS people. The Content-adaptive Scaling and Sharpening session focused on sharpening and scaling filters. In the Display Mux session, we discussed proposals to expose the capability of dynamic mux switching display signal between discrete and integrated GPUs. In the last session of the 2024 Display Next Hackfest, participants representing different compositors summarized current and future work and built a Linux Display wish list , which includes: improvements to VTTY and HDR switching, better dmabuf API for multi-GPU support, definition of tone mapping, blending and scaling sematics, and wayland protocols for advertising to clients which colorspaces are supported. We closed this session with a status update on feature development by compositors, including but not limited to: plane offloading (from libcamera to output) / HDR video offloading (dma-heaps) / plane-based scrolling for web pages, color management / HDR / ICC profiles support, addressing issues such as flickering when color primaries don t match, etc. After three days of intensive discussions, all in-person participants went to a guided tour at the Museum of Extrela Galicia beer (MEGA), pouring and tasting the most famous local beer.

Feedback and Future Directions Participants provided valuable feedback on the hackfest, including suggestions for future improvements.
  • Schedule and Break-time Setup: Having a pre-defined agenda and schedule provided a better balance between long discussions and mental refreshments, preventing the fatigue caused by endless discussions.
  • Action Points: Some participants recommended explicitly asking for action points at the end of each session and assigning people to follow-up tasks.
  • Remote Participation: Remote attendees appreciated the inclusive setup and opportunities to actively participate in discussions.
  • Technical Challenges: There were bandwidth and video streaming issues during some sessions due to the large number of participants.

Thank you for joining the 2024 Display Next Hackfest We can t help but thank the 40 participants, who engaged in-person or virtually on relevant discussions, for a collaborative evolution of the Linux display stack and for building an insightful agenda. A big thank you to the leaders and presenters of the nine sessions: Christopher Cameron (Google), Harry Wentland (AMD), Leo Li (AMD), Mario Limoncello (AMD), Sebastian Wick (RedHat) and Xaver Hugl (KDE/BlueSystems) for the effort in preparing the sessions, explaining the topic and guiding discussions. My acknowledge to the others in-person participants that made such an effort to travel to A Coru a: Alex Goins (NVIDIA), David Turner (Raspberry Pi), Georges Stavracas (Igalia), Joan Torres (SUSE), Liviu Dudau (Arm), Louis Chauvet (Bootlin), Robert Mader (Collabora), Tian Mengge (GravityXR), Victor Jaquez (Igalia) and Victoria Brekenfeld (System76). It was and awesome opportunity to meet you and chat face-to-face. Finally, thanks virtual participants who couldn t make it in person but organized their days to actively participate in each discussion, adding different perspectives and valuable inputs even remotely: Abhinav Kumar (Qualcomm), Chaitanya Borah (Intel), Christopher Braga (Qualcomm), Dor Askayo, Jiri Koten (RedHat), Jonas dahl (Red Hat), Leandro Ribeiro (Collabora), Marti Maria (Little CMS), Marijn Suijten, Mario Kleiner, Martin Stransky (Red Hat), Michel D nzer (Red Hat), Miguel Casas-Sanchez (Google), Mitulkumar Golani (Intel), Naveen Kumar (Intel), Niels De Graef (Red Hat), Pekka Paalanen (Collabora), Pichika Uday Kiran (AMD), Shashank Sharma (AMD), Sriharsha PV (AMD), Simon Ser, Uma Shankar (Intel) and Vikas Korjani (AMD). We look forward to another successful Display Next hackfest, continuing to drive innovation and improvement in the Linux display ecosystem!

8 September 2024

Jacob Adams: Linux's Bedtime Routine

How does Linux move from an awake machine to a hibernating one? How does it then manage to restore all state? These questions led me to read way too much C in trying to figure out how this particular hardware/software boundary is navigated. This investigation will be split into a few parts, with the first one going from invocation of hibernation to synchronizing all filesystems to disk. This article has been written using Linux version 6.9.9, the source of which can be found in many places, but can be navigated easily through the Bootlin Elixir Cross-Referencer: https://elixir.bootlin.com/linux/v6.9.9/source Each code snippet will begin with a link to the above giving the file path and the line number of the beginning of the snippet.

A Starting Point for Investigation: /sys/power/state and /sys/power/disk These two system files exist to allow debugging of hibernation, and thus control the exact state used directly. Writing specific values to the state file controls the exact sleep mode used and disk controls the specific hibernation mode1. This is extremely handy as an entry point to understand how these systems work, since we can just follow what happens when they are written to.

Show and Store Functions These two files are defined using the power_attr macro: kernel/power/power.h:80
#define power_attr(_name) \
static struct kobj_attribute _name##_attr =     \
    .attr   =               \
        .name = __stringify(_name), \
        .mode = 0644,           \
     ,                  \
    .show   = _name##_show,         \
    .store  = _name##_store,        \
 
show is called on reads and store on writes. state_show is a little boring for our purposes, as it just prints all the available sleep states. kernel/power/main.c:657
/*
 * state - control system sleep states.
 *
 * show() returns available sleep state labels, which may be "mem", "standby",
 * "freeze" and "disk" (hibernation).
 * See Documentation/admin-guide/pm/sleep-states.rst for a description of
 * what they mean.
 *
 * store() accepts one of those strings, translates it into the proper
 * enumerated value, and initiates a suspend transition.
 */
static ssize_t state_show(struct kobject *kobj, struct kobj_attribute *attr,
			  char *buf)
 
	char *s = buf;
#ifdef CONFIG_SUSPEND
	suspend_state_t i;
	for (i = PM_SUSPEND_MIN; i < PM_SUSPEND_MAX; i++)
		if (pm_states[i])
			s += sprintf(s,"%s ", pm_states[i]);
#endif
	if (hibernation_available())
		s += sprintf(s, "disk ");
	if (s != buf)
		/* convert the last space to a newline */
		*(s-1) = '\n';
	return (s - buf);
 
state_store, however, provides our entry point. If the string disk is written to the state file, it calls hibernate(). This is our entry point. kernel/power/main.c:715
static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
			   const char *buf, size_t n)
 
	suspend_state_t state;
	int error;
	error = pm_autosleep_lock();
	if (error)
		return error;
	if (pm_autosleep_state() > PM_SUSPEND_ON)  
		error = -EBUSY;
		goto out;
	 
	state = decode_state(buf, n);
	if (state < PM_SUSPEND_MAX)  
		if (state == PM_SUSPEND_MEM)
			state = mem_sleep_current;
		error = pm_suspend(state);
	  else if (state == PM_SUSPEND_MAX)  
		error = hibernate();
	  else  
		error = -EINVAL;
	 
 out:
	pm_autosleep_unlock();
	return error ? error : n;
 
kernel/power/main.c:688
static suspend_state_t decode_state(const char *buf, size_t n)
 
#ifdef CONFIG_SUSPEND
	suspend_state_t state;
#endif
	char *p;
	int len;
	p = memchr(buf, '\n', n);
	len = p ? p - buf : n;
	/* Check hibernation first. */
	if (len == 4 && str_has_prefix(buf, "disk"))
		return PM_SUSPEND_MAX;
#ifdef CONFIG_SUSPEND
	for (state = PM_SUSPEND_MIN; state < PM_SUSPEND_MAX; state++)  
		const char *label = pm_states[state];
		if (label && len == strlen(label) && !strncmp(buf, label, len))
			return state;
	 
#endif
	return PM_SUSPEND_ON;
 
Could we have figured this out just via function names? Sure, but this way we know for sure that nothing else is happening before this function is called.

Autosleep Our first detour is into the autosleep system. When checking the state above, you may notice that the kernel grabs the pm_autosleep_lock before checking the current state. autosleep is a mechanism originally from Android that sends the entire system to either suspend or hibernate whenever it is not actively working on anything. This is not enabled for most desktop configurations, since it s primarily for mobile systems and inverts the standard suspend and hibernate interactions. This system is implemented as a workqueue2 that checks the current number of wakeup events, processes and drivers that need to run3, and if there aren t any, then the system is put into the autosleep state, typically suspend. However, it could be hibernate if configured that way via /sys/power/autosleep in a similar manner to using /sys/power/state to manually enable hibernation. kernel/power/main.c:841
static ssize_t autosleep_store(struct kobject *kobj,
			       struct kobj_attribute *attr,
			       const char *buf, size_t n)
 
	suspend_state_t state = decode_state(buf, n);
	int error;
	if (state == PM_SUSPEND_ON
	    && strcmp(buf, "off") && strcmp(buf, "off\n"))
		return -EINVAL;
	if (state == PM_SUSPEND_MEM)
		state = mem_sleep_current;
	error = pm_autosleep_set_state(state);
	return error ? error : n;
 
power_attr(autosleep);
#endif /* CONFIG_PM_AUTOSLEEP */
kernel/power/autosleep.c:24
static DEFINE_MUTEX(autosleep_lock);
static struct wakeup_source *autosleep_ws;
static void try_to_suspend(struct work_struct *work)
 
	unsigned int initial_count, final_count;
	if (!pm_get_wakeup_count(&initial_count, true))
		goto out;
	mutex_lock(&autosleep_lock);
	if (!pm_save_wakeup_count(initial_count)  
		system_state != SYSTEM_RUNNING)  
		mutex_unlock(&autosleep_lock);
		goto out;
	 
	if (autosleep_state == PM_SUSPEND_ON)  
		mutex_unlock(&autosleep_lock);
		return;
	 
	if (autosleep_state >= PM_SUSPEND_MAX)
		hibernate();
	else
		pm_suspend(autosleep_state);
	mutex_unlock(&autosleep_lock);
	if (!pm_get_wakeup_count(&final_count, false))
		goto out;
	/*
	 * If the wakeup occurred for an unknown reason, wait to prevent the
	 * system from trying to suspend and waking up in a tight loop.
	 */
	if (final_count == initial_count)
		schedule_timeout_uninterruptible(HZ / 2);
 out:
	queue_up_suspend_work();
 
static DECLARE_WORK(suspend_work, try_to_suspend);
void queue_up_suspend_work(void)
 
	if (autosleep_state > PM_SUSPEND_ON)
		queue_work(autosleep_wq, &suspend_work);
 

The Steps of Hibernation

Hibernation Kernel Config It s important to note that most of the hibernate-specific functions below do nothing unless you ve defined CONFIG_HIBERNATION in your Kconfig4. As an example, hibernate itself is defined as the following if CONFIG_HIBERNATE is not set. include/linux/suspend.h:407
static inline int hibernate(void)   return -ENOSYS;  

Check if Hibernation is Available We begin by confirming that we actually can perform hibernation, via the hibernation_available function. kernel/power/hibernate.c:742
if (!hibernation_available())  
	pm_pr_dbg("Hibernation not available.\n");
	return -EPERM;
 
kernel/power/hibernate.c:92
bool hibernation_available(void)
 
	return nohibernate == 0 &&
		!security_locked_down(LOCKDOWN_HIBERNATION) &&
		!secretmem_active() && !cxl_mem_active();
 
nohibernate is controlled by the kernel command line, it s set via either nohibernate or hibernate=no. security_locked_down is a hook for Linux Security Modules to prevent hibernation. This is used to prevent hibernating to an unencrypted storage device, as specified in the manual page kernel_lockdown(7). Interestingly, either level of lockdown, integrity or confidentiality, locks down hibernation because with the ability to hibernate you can extract bascially anything from memory and even reboot into a modified kernel image. secretmem_active checks whether there is any active use of memfd_secret, and if so it prevents hibernation. memfd_secret returns a file descriptor that can be mapped into a process but is specifically unmapped from the kernel s memory space. Hibernating with memory that not even the kernel is supposed to access would expose that memory to whoever could access the hibernation image. This particular feature of secret memory was apparently controversial, though not as controversial as performance concerns around fragmentation when unmapping kernel memory (which did not end up being a real problem). cxl_mem_active just checks whether any CXL memory is active. A full explanation is provided in the commit introducing this check but there s also a shortened explanation from cxl_mem_probe that sets the relevant flag when initializing a CXL memory device. drivers/cxl/mem.c:186
* The kernel may be operating out of CXL memory on this device,
* there is no spec defined way to determine whether this device
* preserves contents over suspend, and there is no simple way
* to arrange for the suspend image to avoid CXL memory which
* would setup a circular dependency between PCI resume and save
* state restoration.

Check Compression The next check is for whether compression support is enabled, and if so whether the requested algorithm is enabled. kernel/power/hibernate.c:747
/*
 * Query for the compression algorithm support if compression is enabled.
 */
if (!nocompress)  
	strscpy(hib_comp_algo, hibernate_compressor, sizeof(hib_comp_algo));
	if (crypto_has_comp(hib_comp_algo, 0, 0) != 1)  
		pr_err("%s compression is not available\n", hib_comp_algo);
		return -EOPNOTSUPP;
	 
 
The nocompress flag is set via the hibernate command line parameter, setting hibernate=nocompress. If compression is enabled, then hibernate_compressor is copied to hib_comp_algo. This synchronizes the current requested compression setting (hibernate_compressor) with the current compression setting (hib_comp_algo). Both values are character arrays of size CRYPTO_MAX_ALG_NAME (128 in this kernel). kernel/power/hibernate.c:50
static char hibernate_compressor[CRYPTO_MAX_ALG_NAME] = CONFIG_HIBERNATION_DEF_COMP;
/*
 * Compression/decompression algorithm to be used while saving/loading
 * image to/from disk. This would later be used in 'kernel/power/swap.c'
 * to allocate comp streams.
 */
char hib_comp_algo[CRYPTO_MAX_ALG_NAME];
hibernate_compressor defaults to lzo if that algorithm is enabled, otherwise to lz4 if enabled5. It can be overwritten using the hibernate.compressor setting to either lzo or lz4. kernel/power/Kconfig:95
choice
	prompt "Default compressor"
	default HIBERNATION_COMP_LZO
	depends on HIBERNATION
config HIBERNATION_COMP_LZO
	bool "lzo"
	depends on CRYPTO_LZO
config HIBERNATION_COMP_LZ4
	bool "lz4"
	depends on CRYPTO_LZ4
endchoice
config HIBERNATION_DEF_COMP
	string
	default "lzo" if HIBERNATION_COMP_LZO
	default "lz4" if HIBERNATION_COMP_LZ4
	help
	  Default compressor to be used for hibernation.
kernel/power/hibernate.c:1425
static const char * const comp_alg_enabled[] =  
#if IS_ENABLED(CONFIG_CRYPTO_LZO)
	COMPRESSION_ALGO_LZO,
#endif
#if IS_ENABLED(CONFIG_CRYPTO_LZ4)
	COMPRESSION_ALGO_LZ4,
#endif
 ;
static int hibernate_compressor_param_set(const char *compressor,
		const struct kernel_param *kp)
 
	unsigned int sleep_flags;
	int index, ret;
	sleep_flags = lock_system_sleep();
	index = sysfs_match_string(comp_alg_enabled, compressor);
	if (index >= 0)  
		ret = param_set_copystring(comp_alg_enabled[index], kp);
		if (!ret)
			strscpy(hib_comp_algo, comp_alg_enabled[index],
				sizeof(hib_comp_algo));
	  else  
		ret = index;
	 
	unlock_system_sleep(sleep_flags);
	if (ret)
		pr_debug("Cannot set specified compressor %s\n",
			 compressor);
	return ret;
 
static const struct kernel_param_ops hibernate_compressor_param_ops =  
	.set    = hibernate_compressor_param_set,
	.get    = param_get_string,
 ;
static struct kparam_string hibernate_compressor_param_string =  
	.maxlen = sizeof(hibernate_compressor),
	.string = hibernate_compressor,
 ;
We then check whether the requested algorithm is supported via crypto_has_comp. If not, we bail out of the whole operation with EOPNOTSUPP. As part of crypto_has_comp we perform any needed initialization of the algorithm, loading kernel modules and running initialization code as needed6.

Grab Locks The next step is to grab the sleep and hibernation locks via lock_system_sleep and hibernate_acquire. kernel/power/hibernate.c:758
sleep_flags = lock_system_sleep();
/* The snapshot device should not be opened while we're running */
if (!hibernate_acquire())  
	error = -EBUSY;
	goto Unlock;
 
First, lock_system_sleep marks the current thread as not freezable, which will be important later7. It then grabs the system_transistion_mutex, which locks taking snapshots or modifying how they are taken, resuming from a hibernation image, entering any suspend state, or rebooting.

The GFP Mask The kernel also issues a warning if the gfp mask is changed via either pm_restore_gfp_mask or pm_restrict_gfp_mask without holding the system_transistion_mutex. GFP flags tell the kernel how it is permitted to handle a request for memory. include/linux/gfp_types.h:12
 * GFP flags are commonly used throughout Linux to indicate how memory
 * should be allocated.  The GFP acronym stands for get_free_pages(),
 * the underlying memory allocation function.  Not every GFP flag is
 * supported by every function which may allocate memory.
In the case of hibernation specifically we care about the IO and FS flags, which are reclaim operators, ways the system is permitted to attempt to free up memory in order to satisfy a specific request for memory. include/linux/gfp_types.h:176
 * Reclaim modifiers
 * -----------------
 * Please note that all the following flags are only applicable to sleepable
 * allocations (e.g. %GFP_NOWAIT and %GFP_ATOMIC will ignore them).
 *
 * %__GFP_IO can start physical IO.
 *
 * %__GFP_FS can call down to the low-level FS. Clearing the flag avoids the
 * allocator recursing into the filesystem which might already be holding
 * locks.
gfp_allowed_mask sets which flags are permitted to be set at the current time. As the comment below outlines, preventing these flags from being set avoids situations where the kernel needs to do I/O to allocate memory (e.g. read/writing swap8) but the devices it needs to read/write to/from are not currently available. kernel/power/main.c:24
/*
 * The following functions are used by the suspend/hibernate code to temporarily
 * change gfp_allowed_mask in order to avoid using I/O during memory allocations
 * while devices are suspended.  To avoid races with the suspend/hibernate code,
 * they should always be called with system_transition_mutex held
 * (gfp_allowed_mask also should only be modified with system_transition_mutex
 * held, unless the suspend/hibernate code is guaranteed not to run in parallel
 * with that modification).
 */
static gfp_t saved_gfp_mask;
void pm_restore_gfp_mask(void)
 
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	if (saved_gfp_mask)  
		gfp_allowed_mask = saved_gfp_mask;
		saved_gfp_mask = 0;
	 
 
void pm_restrict_gfp_mask(void)
 
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	WARN_ON(saved_gfp_mask);
	saved_gfp_mask = gfp_allowed_mask;
	gfp_allowed_mask &= ~(__GFP_IO   __GFP_FS);
 

Sleep Flags After grabbing the system_transition_mutex the kernel then returns and captures the previous state of the threads flags in sleep_flags. This is used later to remove PF_NOFREEZE if it wasn t previously set on the current thread. kernel/power/main.c:52
unsigned int lock_system_sleep(void)
 
	unsigned int flags = current->flags;
	current->flags  = PF_NOFREEZE;
	mutex_lock(&system_transition_mutex);
	return flags;
 
EXPORT_SYMBOL_GPL(lock_system_sleep);
include/linux/sched.h:1633
#define PF_NOFREEZE		0x00008000	/* This thread should not be frozen */
Then we grab the hibernate-specific semaphore to ensure no one can open a snapshot or resume from it while we perform hibernation. Additionally this lock is used to prevent hibernate_quiet_exec, which is used by the nvdimm driver to active its firmware with all processes and devices frozen, ensuring it is the only thing running at that time9. kernel/power/hibernate.c:82
bool hibernate_acquire(void)
 
	return atomic_add_unless(&hibernate_atomic, -1, 0);
 

Prepare Console The kernel next calls pm_prepare_console. This function only does anything if CONFIG_VT_CONSOLE_SLEEP has been set. This prepares the virtual terminal for a suspend state, switching away to a console used only for the suspend state if needed. kernel/power/console.c:130
void pm_prepare_console(void)
 
	if (!pm_vt_switch())
		return;
	orig_fgconsole = vt_move_to_console(SUSPEND_CONSOLE, 1);
	if (orig_fgconsole < 0)
		return;
	orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE);
	return;
 
The first thing is to check whether we actually need to switch the VT kernel/power/console.c:94
/*
 * There are three cases when a VT switch on suspend/resume are required:
 *   1) no driver has indicated a requirement one way or another, so preserve
 *      the old behavior
 *   2) console suspend is disabled, we want to see debug messages across
 *      suspend/resume
 *   3) any registered driver indicates it needs a VT switch
 *
 * If none of these conditions is present, meaning we have at least one driver
 * that doesn't need the switch, and none that do, we can avoid it to make
 * resume look a little prettier (and suspend too, but that's usually hidden,
 * e.g. when closing the lid on a laptop).
 */
static bool pm_vt_switch(void)
 
	struct pm_vt_switch *entry;
	bool ret = true;
	mutex_lock(&vt_switch_mutex);
	if (list_empty(&pm_vt_switch_list))
		goto out;
	if (!console_suspend_enabled)
		goto out;
	list_for_each_entry(entry, &pm_vt_switch_list, head)  
		if (entry->required)
			goto out;
	 
	ret = false;
out:
	mutex_unlock(&vt_switch_mutex);
	return ret;
 
There is an explanation of the conditions under which a switch is performed in the comment above the function, but we ll also walk through the steps here. Firstly we grab the vt_switch_mutex to ensure nothing will modify the list while we re looking at it. We then examine the pm_vt_switch_list. This list is used to indicate the drivers that require a switch during suspend. They register this requirement, or the lack thereof, via pm_vt_switch_required. kernel/power/console.c:31
/**
 * pm_vt_switch_required - indicate VT switch at suspend requirements
 * @dev: device
 * @required: if true, caller needs VT switch at suspend/resume time
 *
 * The different console drivers may or may not require VT switches across
 * suspend/resume, depending on how they handle restoring video state and
 * what may be running.
 *
 * Drivers can indicate support for switchless suspend/resume, which can
 * save time and flicker, by using this routine and passing 'false' as
 * the argument.  If any loaded driver needs VT switching, or the
 * no_console_suspend argument has been passed on the command line, VT
 * switches will occur.
 */
void pm_vt_switch_required(struct device *dev, bool required)
Next, we check console_suspend_enabled. This is set to false by the kernel parameter no_console_suspend, but defaults to true. Finally, if there are any entries in the pm_vt_switch_list, then we check to see if any of them require a VT switch. Only if none of these conditions apply, then we return false. If a VT switch is in fact required, then we move first the currently active virtual terminal/console10 (vt_move_to_console) and then the current location of kernel messages (vt_kmsg_redirect) to the SUSPEND_CONSOLE. The SUSPEND_CONSOLE is the last entry in the list of possible consoles, and appears to just be a black hole to throw away messages. kernel/power/console.c:16
#define SUSPEND_CONSOLE	(MAX_NR_CONSOLES-1)
Interestingly, these are separate functions because you can use TIOCL_SETKMSGREDIRECT (an ioctl11) to send kernel messages to a specific virtual terminal, but by default its the same as the currently active console. The locations of the previously active console and the previous kernel messages location are stored in orig_fgconsole and orig_kmsg, to restore the state of the console and kernel messages after the machine wakes up again. Interestingly, this means orig_fgconsole also ends up storing any errors, so has to be checked to ensure it s not less than zero before we try to do anything with the kernel messages on both suspend and resume. drivers/tty/vt/vt_ioctl.c:1268
/* Perform a kernel triggered VT switch for suspend/resume */
static int disable_vt_switch;
int vt_move_to_console(unsigned int vt, int alloc)
 
	int prev;
	console_lock();
	/* Graphics mode - up to X */
	if (disable_vt_switch)  
		console_unlock();
		return 0;
	 
	prev = fg_console;
	if (alloc && vc_allocate(vt))  
		/* we can't have a free VC for now. Too bad,
		 * we don't want to mess the screen for now. */
		console_unlock();
		return -ENOSPC;
	 
	if (set_console(vt))  
		/*
		 * We're unable to switch to the SUSPEND_CONSOLE.
		 * Let the calling function know so it can decide
		 * what to do.
		 */
		console_unlock();
		return -EIO;
	 
	console_unlock();
	if (vt_waitactive(vt + 1))  
		pr_debug("Suspend: Can't switch VCs.");
		return -EINTR;
	 
	return prev;
 
Unlike most other locking functions we ve seen so far, console_lock needs to be careful to ensure nothing else is panicking and needs to dump to the console before grabbing the semaphore for the console and setting a couple flags.

Panics Panics are tracked via an atomic integer set to the id of the processor currently panicking. kernel/printk/printk.c:2649
/**
 * console_lock - block the console subsystem from printing
 *
 * Acquires a lock which guarantees that no consoles will
 * be in or enter their write() callback.
 *
 * Can sleep, returns nothing.
 */
void console_lock(void)
 
	might_sleep();
	/* On panic, the console_lock must be left to the panic cpu. */
	while (other_cpu_in_panic())
		msleep(1000);
	down_console_sem();
	console_locked = 1;
	console_may_schedule = 1;
 
EXPORT_SYMBOL(console_lock);
kernel/printk/printk.c:362
/*
 * Return true if a panic is in progress on a remote CPU.
 *
 * On true, the local CPU should immediately release any printing resources
 * that may be needed by the panic CPU.
 */
bool other_cpu_in_panic(void)
 
	return (panic_in_progress() && !this_cpu_in_panic());
 
kernel/printk/printk.c:345
static bool panic_in_progress(void)
 
	return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
 
kernel/printk/printk.c:350
/* Return true if a panic is in progress on the current CPU. */
bool this_cpu_in_panic(void)
 
	/*
	 * We can use raw_smp_processor_id() here because it is impossible for
	 * the task to be migrated to the panic_cpu, or away from it. If
	 * panic_cpu has already been set, and we're not currently executing on
	 * that CPU, then we never will be.
	 */
	return unlikely(atomic_read(&panic_cpu) == raw_smp_processor_id());
 
console_locked is a debug value, used to indicate that the lock should be held, and our first indication that this whole virtual terminal system is more complex than might initially be expected. kernel/printk/printk.c:373
/*
 * This is used for debugging the mess that is the VT code by
 * keeping track if we have the console semaphore held. It's
 * definitely not the perfect debug tool (we don't know if _WE_
 * hold it and are racing, but it helps tracking those weird code
 * paths in the console code where we end up in places I want
 * locked without the console semaphore held).
 */
static int console_locked;
console_may_schedule is used to see if we are permitted to sleep and schedule other work while we hold this lock. As we ll see later, the virtual terminal subsystem is not re-entrant, so there s all sorts of hacks in here to ensure we don t leave important code sections that can t be safely resumed.

Disable VT Switch As the comment below lays out, when another program is handling graphical display anyway, there s no need to do any of this, so the kernel provides a switch to turn the whole thing off. Interestingly, this appears to only be used by three drivers, so the specific hardware support required must not be particularly common.
drivers/gpu/drm/omapdrm/dss
drivers/video/fbdev/geode
drivers/video/fbdev/omap2
drivers/tty/vt/vt_ioctl.c:1308
/*
 * Normally during a suspend, we allocate a new console and switch to it.
 * When we resume, we switch back to the original console.  This switch
 * can be slow, so on systems where the framebuffer can handle restoration
 * of video registers anyways, there's little point in doing the console
 * switch.  This function allows you to disable it by passing it '0'.
 */
void pm_set_vt_switch(int do_switch)
 
	console_lock();
	disable_vt_switch = !do_switch;
	console_unlock();
 
EXPORT_SYMBOL(pm_set_vt_switch);
The rest of the vt_switch_console function is pretty normal, however, simply allocating space if needed to create the requested virtual terminal and then setting the current virtual terminal via set_console.

Virtual Terminal Set Console With set_console, we begin (as if we haven t been already) to enter the madness that is the virtual terminal subsystem. As mentioned previously, modifications to its state must be made very carefully, as other stuff happening at the same time could create complete messes. All this to say, calling set_console does not actually perform any work to change the state of the current console. Instead it indicates what changes it wants and then schedules that work. drivers/tty/vt/vt.c:3153
int set_console(int nr)
 
	struct vc_data *vc = vc_cons[fg_console].d;
	if (!vc_cons_allocated(nr)   vt_dont_switch  
		(vc->vt_mode.mode == VT_AUTO && vc->vc_mode == KD_GRAPHICS))  
		/*
		 * Console switch will fail in console_callback() or
		 * change_console() so there is no point scheduling
		 * the callback
		 *
		 * Existing set_console() users don't check the return
		 * value so this shouldn't break anything
		 */
		return -EINVAL;
	 
	want_console = nr;
	schedule_console_callback();
	return 0;
 
The check for vc->vc_mode == KD_GRAPHICS is where most end-user graphical desktops will bail out of this change, as they re in graphics mode and don t need to switch away to the suspend console. vt_dont_switch is a flag used by the ioctls11 VT_LOCKSWITCH and VT_UNLOCKSWITCH to prevent the system from switching virtual terminal devices when the user has explicitly locked it. VT_AUTO is a flag indicating that automatic virtual terminal switching is enabled12, and thus deliberate switching to a suspend terminal is not required. However, if you do run your machine from a virtual terminal, then we indicate to the system that we want to change to the requested virtual terminal via the want_console variable and schedule a callback via schedule_console_callback. drivers/tty/vt/vt.c:315
void schedule_console_callback(void)
 
	schedule_work(&console_work);
 
console_work is a workqueue2 that will execute the given task asynchronously.

Console Callback drivers/tty/vt/vt.c:3109
/*
 * This is the console switching callback.
 *
 * Doing console switching in a process context allows
 * us to do the switches asynchronously (needed when we want
 * to switch due to a keyboard interrupt).  Synchronization
 * with other console code and prevention of re-entrancy is
 * ensured with console_lock.
 */
static void console_callback(struct work_struct *ignored)
 
	console_lock();
	if (want_console >= 0)  
		if (want_console != fg_console &&
		    vc_cons_allocated(want_console))  
			hide_cursor(vc_cons[fg_console].d);
			change_console(vc_cons[want_console].d);
			/* we only changed when the console had already
			   been allocated - a new console is not created
			   in an interrupt routine */
		 
		want_console = -1;
	 
...
console_callback first looks to see if there is a console change wanted via want_console and then changes to it if it s not the current console and has been allocated already. We do first remove any cursor state with hide_cursor. drivers/tty/vt/vt.c:841
static void hide_cursor(struct vc_data *vc)
 
	if (vc_is_sel(vc))
		clear_selection();
	vc->vc_sw->con_cursor(vc, false);
	hide_softcursor(vc);
 
A full dive into the tty driver is a task for another time, but this should give a general sense of how this system interacts with hibernation.

Notify Power Management Call Chain kernel/power/hibernate.c:767
pm_notifier_call_chain_robust(PM_HIBERNATION_PREPARE, PM_POST_HIBERNATION)
This will call a chain of power management callbacks, passing first PM_HIBERNATION_PREPARE and then PM_POST_HIBERNATION on startup or on error with another callback. kernel/power/main.c:98
int pm_notifier_call_chain_robust(unsigned long val_up, unsigned long val_down)
 
	int ret;
	ret = blocking_notifier_call_chain_robust(&pm_chain_head, val_up, val_down, NULL);
	return notifier_to_errno(ret);
 
The power management notifier is a blocking notifier chain, which means it has the following properties. include/linux/notifier.h:23
 *	Blocking notifier chains: Chain callbacks run in process context.
 *		Callouts are allowed to block.
The callback chain is a linked list with each entry containing a priority and a function to call. The function technically takes in a data value, but it is always NULL for the power management chain. include/linux/notifier.h:49
struct notifier_block;
typedef	int (*notifier_fn_t)(struct notifier_block *nb,
			unsigned long action, void *data);
struct notifier_block  
	notifier_fn_t notifier_call;
	struct notifier_block __rcu *next;
	int priority;
 ;
The head of the linked list is protected by a read-write semaphore. include/linux/notifier.h:65
struct blocking_notifier_head  
	struct rw_semaphore rwsem;
	struct notifier_block __rcu *head;
 ;
Because it is prioritized, appending to the list requires walking it until an item with lower13 priority is found to insert the current item before. kernel/notifier.c:252
/*
 *	Blocking notifier chain routines.  All access to the chain is
 *	synchronized by an rwsem.
 */
static int __blocking_notifier_chain_register(struct blocking_notifier_head *nh,
					      struct notifier_block *n,
					      bool unique_priority)
 
	int ret;
	/*
	 * This code gets used during boot-up, when task switching is
	 * not yet working and interrupts must remain disabled.  At
	 * such times we must not call down_write().
	 */
	if (unlikely(system_state == SYSTEM_BOOTING))
		return notifier_chain_register(&nh->head, n, unique_priority);
	down_write(&nh->rwsem);
	ret = notifier_chain_register(&nh->head, n, unique_priority);
	up_write(&nh->rwsem);
	return ret;
 
kernel/notifier.c:20
/*
 *	Notifier chain core routines.  The exported routines below
 *	are layered on top of these, with appropriate locking added.
 */
static int notifier_chain_register(struct notifier_block **nl,
				   struct notifier_block *n,
				   bool unique_priority)
 
	while ((*nl) != NULL)  
		if (unlikely((*nl) == n))  
			WARN(1, "notifier callback %ps already registered",
			     n->notifier_call);
			return -EEXIST;
		 
		if (n->priority > (*nl)->priority)
			break;
		if (n->priority == (*nl)->priority && unique_priority)
			return -EBUSY;
		nl = &((*nl)->next);
	 
	n->next = *nl;
	rcu_assign_pointer(*nl, n);
	trace_notifier_register((void *)n->notifier_call);
	return 0;
 
Each callback can return one of a series of options. include/linux/notifier.h:18
#define NOTIFY_DONE		0x0000		/* Don't care */
#define NOTIFY_OK		0x0001		/* Suits me */
#define NOTIFY_STOP_MASK	0x8000		/* Don't call further */
#define NOTIFY_BAD		(NOTIFY_STOP_MASK 0x0002)
						/* Bad/Veto action */
When notifying the chain, if a function returns STOP or BAD then the previous parts of the chain are called again with PM_POST_HIBERNATION14 and an error is returned. kernel/notifier.c:107
/**
 * notifier_call_chain_robust - Inform the registered notifiers about an event
 *                              and rollback on error.
 * @nl:		Pointer to head of the blocking notifier chain
 * @val_up:	Value passed unmodified to the notifier function
 * @val_down:	Value passed unmodified to the notifier function when recovering
 *              from an error on @val_up
 * @v:		Pointer passed unmodified to the notifier function
 *
 * NOTE:	It is important the @nl chain doesn't change between the two
 *		invocations of notifier_call_chain() such that we visit the
 *		exact same notifier callbacks; this rules out any RCU usage.
 *
 * Return:	the return value of the @val_up call.
 */
static int notifier_call_chain_robust(struct notifier_block **nl,
				     unsigned long val_up, unsigned long val_down,
				     void *v)
 
	int ret, nr = 0;
	ret = notifier_call_chain(nl, val_up, v, -1, &nr);
	if (ret & NOTIFY_STOP_MASK)
		notifier_call_chain(nl, val_down, v, nr-1, NULL);
	return ret;
 
Each of these callbacks tends to be quite driver-specific, so we ll cease discussion of this here.

Sync Filesystems The next step is to ensure all filesystems have been synchronized to disk. This is performed via a simple helper function that times how long the full synchronize operation, ksys_sync takes. kernel/power/main.c:69
void ksys_sync_helper(void)
 
	ktime_t start;
	long elapsed_msecs;
	start = ktime_get();
	ksys_sync();
	elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
	pr_info("Filesystems sync: %ld.%03ld seconds\n",
		elapsed_msecs / MSEC_PER_SEC, elapsed_msecs % MSEC_PER_SEC);
 
EXPORT_SYMBOL_GPL(ksys_sync_helper);
ksys_sync wakes and instructs a set of flusher threads to write out every filesystem, first their inodes15, then the full filesystem, and then finally all block devices, to ensure all pages are written out to disk. fs/sync.c:87
/*
 * Sync everything. We start by waking flusher threads so that most of
 * writeback runs on all devices in parallel. Then we sync all inodes reliably
 * which effectively also waits for all flusher threads to finish doing
 * writeback. At this point all data is on disk so metadata should be stable
 * and we tell filesystems to sync their metadata via ->sync_fs() calls.
 * Finally, we writeout all block devices because some filesystems (e.g. ext2)
 * just write metadata (such as inodes or bitmaps) to block device page cache
 * and do not sync it on their own in ->sync_fs().
 */
void ksys_sync(void)
 
	int nowait = 0, wait = 1;
	wakeup_flusher_threads(WB_REASON_SYNC);
	iterate_supers(sync_inodes_one_sb, NULL);
	iterate_supers(sync_fs_one_sb, &nowait);
	iterate_supers(sync_fs_one_sb, &wait);
	sync_bdevs(false);
	sync_bdevs(true);
	if (unlikely(laptop_mode))
		laptop_sync_completion();
 
It follows an interesting pattern of using iterate_supers to run both sync_inodes_one_sb and then sync_fs_one_sb on each known filesystem16. It also calls both sync_fs_one_sb and sync_bdevs twice, first without waiting for any operations to complete and then again waiting for completion17. When laptop_mode is enabled the system runs additional filesystem synchronization operations after the specified delay without any writes. mm/page-writeback.c:111
/*
 * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
 * a full sync is triggered after this time elapses without any disk activity.
 */
int laptop_mode;
EXPORT_SYMBOL(laptop_mode);
However, when running a filesystem synchronization operation, the system will add an additional timer to schedule more writes after the laptop_mode delay. We don t want the state of the system to change at all while performing hibernation, so we cancel those timers. mm/page-writeback.c:2198
/*
 * We're in laptop mode and we've just synced. The sync's writes will have
 * caused another writeback to be scheduled by laptop_io_completion.
 * Nothing needs to be written back anymore, so we unschedule the writeback.
 */
void laptop_sync_completion(void)
 
	struct backing_dev_info *bdi;
	rcu_read_lock();
	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
		del_timer(&bdi->laptop_mode_wb_timer);
	rcu_read_unlock();
 
As a side note, the ksys_sync function is simply called when the system call sync is used. fs/sync.c:111
SYSCALL_DEFINE0(sync)
 
	ksys_sync();
	return 0;
 

The End of Preparation With that the system has finished preparations for hibernation. This is a somewhat arbitrary cutoff, but next the system will begin a full freeze of userspace to then dump memory out to an image and finally to perform hibernation. All this will be covered in future articles!
  1. Hibernation modes are outside of scope for this article, see the previous article for a high-level description of the different types of hibernation.
  2. Workqueues are a mechanism for running asynchronous tasks. A full description of them is a task for another time, but the kernel documentation on them is available here: https://www.kernel.org/doc/html/v6.9/core-api/workqueue.html 2
  3. This is a bit of an oversimplification, but since this isn t the main focus of this article this description has been kept to a higher level.
  4. Kconfig is Linux s build configuration system that sets many different macros to enable/disable various features.
  5. Kconfig defaults to the first default found
  6. Including checking whether the algorithm is larval? Which appears to indicate that it requires additional setup, but is an interesting choice of name for such a state.
  7. Specifically when we get to process freezing, which we ll get to in the next article in this series.
  8. Swap space is outside the scope of this article, but in short it is a buffer on disk that the kernel uses to store memory not current in use to free up space for other things. See Swap Management for more details.
  9. The code for this is lengthy and tangential, thus it has not been included here. If you re curious about the details of this, see kernel/power/hibernate.c:858 for the details of hibernate_quiet_exec, and drivers/nvdimm/core.c:451 for how it is used in nvdimm.
  10. Annoyingly this code appears to use the terms console and virtual terminal interchangeably.
  11. ioctls are special device-specific I/O operations that permit performing actions outside of the standard file interactions of read/write/seek/etc. 2
  12. I m not entirely clear on how this flag works, this subsystem is particularly complex.
  13. In this case a higher number is higher priority.
  14. Or whatever the caller passes as val_down, but in this case we re specifically looking at how this is used in hibernation.
  15. An inode refers to a particular file or directory within the filesystem. See Wikipedia for more details.
  16. Each active filesystem is registed with the kernel through a structure known as a superblock, which contains references to all the inodes contained within the filesystem, as well as function pointers to perform the various required operations, like sync.
  17. I m including minimal code in this section, as I m not looking to deep dive into the filesystem code at this time.

7 September 2024

Sergio Durigan Junior: Chatting in the 21st century

Several people have been asking me to explain and/or write about my solution for chatting nowadays. I realize that the current scenario is much more complex than, say, 10 or 20 years ago. Back then, this post would probably be more about the IRC client I used than about different chatting technologies. I have also spent a non trivial amount of time setting things up the way I want, so I understand that it s about time to write about my setup not only because I think it can be helpful to others, but also because I would like to document things for myself.

The backbone: Matrix I chose to use Matrix as the place where I integrate everything. Despite there being some heavy (and justified) criticism on the protocol itself, it serves me well for what I need right now. Obviously, I don t like the fact that I have to provide Matrix and all of its accompanying bridges a VPS with 4GB of RAM and 3 vCPUs, but I think that that ship has sailed, unfortunately. In an ideal world, I would be using XMPP and dedicating only a fraction of the resources I m using today to have a full chat system. And since I have been running my personal XMPP server for more than a decade now, I did try to find a solution that would allow me to keep using it, but unfortunately the protocol became almost a hobbyist thing, so there s that.

A few disclaimers I self-host everything, including my Matrix server. Much of what I did won t work if you don t self-host Matrix, so keep that in mind. This won t be a post teaching you how to deploy the services. My intention is to describe what I use and for what purpose. Also, as much as I try to use Debian packages for everything I do, I opted to deploy all services using a community-maintained Ansible playbook which is very well written and organized: matrix-docker-ansible-deploy. Last but not least, as I said above, you will likely need a machine with a good amount of RAM, CPU and storage, especially if you deploy Synapse as your Matrix homeserver (which is what I recommend if you plan to use the bridges I ll mention). My current VPS has 4GB of RAM, 3 vCPUs and 80GB of storage (of which I m currently using approximately 55GB).

Problem #1: my Matrix client(s) There are a lot of clients that can talk the Matrix protocol, but most of them are either web clients or GUI programs. I live on the terminal, more specifically inside Emacs, so I settled for the amazing ement.el Emacs mode. It works surprisingly well, but unfortunately doesn t support end-to-end encryption out of the box; for that, you have to hook it up with pantalaimon. Unfortunately, the project seems abandoned and therefore I don t recommend you to use it. I don t use it myself. When I have to reply some E2E encrypted message from another user, I go to my web browser and use my self-hosted Element client. It s a nuisance, but one that I m willing to accept because of security concerns. If you re into web clients and don t want to use Element (because it is heavy), you can try Cinny. It s lightweight and supports a decent set of features. If you re a terminal lover but don t use Emacs, you may want to try gomuks or iamb.

Problem #2: IRC bridging There are basically two types of IRC bridges for Matrix:
  • The regular and most used matrix-appservice-irc. This bridge takes Matrix to IRC (think of IRC users with the [m] suffix appended to their nicknames), and is what the matrix.org and other big homeservers (including matrix.debian.social) use. It s a complex service which allows thousands of Matrix users to connect to IRC networks, but that unfortunately has complex problems and is only worth using if you intend to host a community server.
  • A bouncer-like bridge called Heisenbridge. This is what I use personally. It takes IRC to Matrix, which means that people on IRC will not know that you re using Matrix. This bridge is much simpler, and because it acts like a bouncer it s pretty much impossible for it to cause problems with the IRC network.
Due to the fact that I sometimes like to use other IRC clients, I still run a regular ZNC bouncer, and I use Heisenbridge to connect to my ZNC. This means that I can use, e.g., ERC inside Emacs and my Matrix bridge at the same time. But you don t necessarily need to run another bouncer; you can simply use Heisenbridge and connect directly to the IRC network(s) you want. A word of caution, though: unlike ZNC, Heisenbridge doesn t support per-user configuration when you use it in bouncer mode. This is the reason why you need to self-host it, and why it s not possible to offer the service to other users (they would have access to your IRC network configuration otherwise). It s also worth talking about logs. I find that keeping logs of everything that goes on IRC has saved me a bunch of times, and so I find it really important to continue doing that. Unfortunately, neither ement.el nor Element support logging things out of the box (at least not that I know). This is also one of the reasons why I still keep my ZNC around: I configure it to log everything.

Problem #3: Telegram I don t use Telegram myself, but unfortunately several people from the Debian community do, especially in Brazil. There is a whole Debian community on Telegram, and I wanted to be able to bridge our Debian Matrix channels to their Telegram counterparts. I am currently using mautrix-telegram for that, and it s working great. You need someone with a Telegram account to configure their credentials so that the bridge can connect to it, but afterwards it s really easy to bridge channels together.

Problem #4: GitLab webhooks Something else I wanted to be able to do was to receive notifications regarding new issues, merge requests and other activities from Salsa. For this, I m using maubot, which is awesome and has a huge list of plugins. I m using the gitlab one.

Final thoughts Overall, I m satisfied with the setup I have now. It has certainly taken some time and effort to find the right tool for each problem I needed to solve, and I still feel like there are some rough edges to soften (like the fact that my Emacs client doesn t support E2E encryption out of the box, or the whole logging situation), but otherwise things are working fine and I haven t had any big problems with the deployment. You do have to be much more careful about stuff (for example, when I installed an unrelated service that hijacked my Apache configuration and made Matrix s federation silently stop working), though. If you have more specific questions about any part of my setup, shoot me an email and I ll do my best to help. Happy chatting!

1 September 2024

Bits from Debian: Bits from the DPL

Dear Debian community, this are my bits from DPL for August. Happy Birthday Debian On 16th of August Debian celebrated its 31th birthday. Since I'm unable to write a better text than our great publicity team I'm simply linking to their article for those who might have missed it: https://bits.debian.org/2024/08/debian-turns-31.html Removing more packages from unstable Helmut Grohne argued for more aggressive package removal and sought consensus on a way forward. He provided six examples of processes where packages that are candidates for removal are consuming valuable person-power. I d like to add that the Bug of the Day initiative (see below) also frequently encounters long-unmaintained packages with popcon votes sometimes as low as zero, and often fewer than ten. Helmut's email included a list of packages that would meet the suggested removal criteria. There was some discussion about whether a popcon vote should be included in these criteria, with arguments both for and against it. Although I support including popcon, I acknowledge that Helmut has a valid point in suggesting it be left out. While I ve read several emails in agreement, Scott Kitterman made a valid point "I don't think we need more process. We just need someone to do the work of finding the packages and filing the bugs." I agree that this is crucial to ensure an automated process doesn t lead to unwanted removals. However, I don t see "someone" stepping up to file RM bugs against other maintainers' packages. As long as we have strict ownership of packages, many people are hesitant to touch a package, even for fixing it. Asking for its removal might be even less well-received. Therefore, if an automated procedure were to create RM bugs based on defined criteria, it could help reduce some of the social pressure. In this aspect the opinion of Niels Thykier is interesting: "As much as I want automation, I do not mind the prototype starting as a semi-automatic process if that is what it takes to get started." The urgency of the problem to remove packages was put by CharlesPlessy into the words: "So as of today, it is much less work to keep a package rotting than removing it." My observation when trying to fix the Bug of the Day exactly fits this statement. I would love for this discussion to lead to more aggressive removals that we can agree upon, whether they are automated, semi-automated, or managed by a person processing an automatically generated list (supported by an objective procedure). To use an analogy: I ve found that every image collection improves with aggressive pruning. Similarly, I m convinced that Debian will improve if we remove packages that no longer serve our users well. DEP14 / DEP18 There are two DEPs that affect our workflow for maintaining packages particularly for those who agree on using Git for Debian packages. DEP-14 recommends a standardized layout for Git packaging repositories, which benefits maintainers working across teams and makes it easier for newcomers to learn a consistent repository structure. DEP-14 stalled for various reasons. Sam Hartman suspected it might be because 'it doesn't bring sufficient value.' However, the assumption that git-buildpackage is incompatible with DEP-14 is incorrect, as confirmed by its author, Guido G nther. As one of the two key tools for Debian Git repositories (besides dgit) fully supports DEP-14, though the migration from the previous default is somewhat complex. Some investigation into mass-converting older formats to DEP-14 was conducted by the Perl team, as Gregor Hermann pointed out.. The discussion about DEP-14 resurfaced with the suggestion of DEP-18. Guido G nther proposed the title Encourage Continuous Integration and Merge Request-Based Collaboration for Debian Packages , which more accurately reflects the DEP's technical intent. Otto Kek l inen, who initiated DEP-18 (thank you, Otto), provided a good summary of the current status. He also assembled a very helpful overview of Git and GitLab usage in other Linux distros. More Salsa CI As a result of the DEP-18 discussion, Otto Kek l inen suggested implementing Salsa CI for our top popcon packages. I believe it would be a good idea to enable CI by default across Salsa whenever a new repository is created. Progress in Salsa migration In my campaign, I stated that I aim to reduce the number of packages maintained outside Salsa to below 2,000. As of March 28, 2024, the count was 2,368. Today, it stands at 2,187 (UDD query: SELECT DISTINCT count(*) FROM sources WHERE release = 'sid' and vcs_url not like '%salsa%' ;). After a third of my DPL term (OMG), we've made significant progress, reducing the amount in question (369 packages) by nearly half. I'm pleased with the support from the DDs who moved their packages to Salsa. Some packages were transferred as part of the Bug of the Day initiative (see below). Bug of the Day As announced in my 'Bits from the DPL' talk at DebConf, I started an initiative called Bug of the Day. The goal is to train newcomers in bug triaging by enabling them to tackle small, self-contained QA tasks. We have consistently identified target packages and resolved at least one bug per day, often addressing multiple bugs in a single package. In several cases, we followed the Package Salvaging procedure outlined in the Developers Reference. Most instances were either welcomed by the maintainer or did not elicit a response. Unfortunately, there was one exception where the recipient of the Package Salvage bug expressed significant dissatisfaction. The takeaway is to balance formal procedures with consideration for the recipient s perspective. I'm pleased to confirm that the Matrix channel has seen an increase in active contributors. This aligns with my hope that our efforts would attract individuals interested in QA work. I m particularly pleased that, within just one month, we have had help with both fixing bugs and improving the code that aids in bug selection. As I aim to introduce newcomers to various teams within Debian, I also take the opportunity to learn about each team's specific policies myself. I rely on team members' assistance to adapt to these policies. I find that gaining this practical insight into team dynamics is an effective way to understand the different teams within Debian as DPL. Another finding from this initiative, which aligns with my goal as DPL, is that many of the packages we addressed are already on Salsa but have not been uploaded, meaning their VCS fields are not published. This suggests that maintainers are generally open to managing their packages on Salsa. For packages that were not yet on Salsa, the move was generally welcomed. Publicity team wants you The publicity team has decided to resume regular meetings to coordinate their efforts. Given my high regard for their work, I plan to attend their meetings as frequently as possible, which I began doing with the first IRC meeting. During discussions with some team members, I learned that the team could use additional help. If anyone interested in supporting Debian with non-packaging tasks reads this, please consider introducing yourself to debian-publicity@lists.debian.org. Note that this is a publicly archived mailing list, so it's not the best place for sharing private information. Kind regards Andreas.

Colin Watson: Free software activity in August 2024

All but about four hours of my Debian contributions this month were sponsored by Freexian. (I ended up going a bit over my 20% billing limit this month.) You can also support my work directly via Liberapay. man-db and friends I released libpipeline 1.5.8 and man-db 2.13.0. Since autopkgtests are great for making sure we spot regressions caused by changes in dependencies, I added one to man-db that runs the upstream tests against the installed package. This required some preparatory work upstream, but otherwise was surprisingly easy to do. OpenSSH I fixed the various 9.8 regressions I mentioned last month: socket activation, libssh2, and Twisted. There were a few other regressions reported too: TCP wrappers support, openssh-server-udeb, and xinetd were all broken by changes related to the listener/per-session binary split, and I fixed all of those. Once all that had made it through to testing, I finally uploaded the first stage of my plan to split out GSS-API support: there are now openssh-client-gssapi and openssh-server-gssapi packages in unstable, and if you use either GSS-API authentication or key exchange then you should install the corresponding package in order for upgrades to trixie+1 to work correctly. I ll write a release note once this has reached testing. Multiple identical results from getaddrinfo I expect this is really a bug in a chroot creation script somewhere, but I haven t been able to track down what s causing it yet. My sbuild chroots, and apparently Lucas Nussbaum s as well, have an /etc/hosts that looks like this:
$ cat /var/lib/schroot/chroots/sid-amd64/etc/hosts
127.0.0.1       localhost
127.0.1.1       [...]
127.0.0.1       localhost ip6-localhost ip6-loopback
The last line clearly ought to be ::1 rather than 127.0.0.1; but things mostly work anyway, since most code doesn t really care which protocol it uses to talk to localhost. However, a few things try to set up test listeners by calling getaddrinfo("localhost", ...) and binding a socket for each result. This goes wrong if there are duplicates in the resulting list, and the test output is typically very confusing: it looks just like what you d see if a test isn t tearing down its resources correctly, which is a much more common thing for a test suite to get wrong, so it took me a while to spot the problem. I ran into this in both python-asyncssh (#1052788, upstream PR) and Ruby (ruby3.1/#1069399, ruby3.2/#1064685, ruby3.3/#1077462, upstream PR). The latter took a while since Ruby isn t one of my languages, but hey, I ve tackled much harder side quests. I NMUed ruby3.1 for this since it was showing up as a blocker for openssl testing migration, but haven t done the other active versions (yet, anyway). OpenSSL vs. cryptography I tend to care about openssl migrating to testing promptly, since openssh uploads have a habit of getting stuck on it otherwise. Debian s OpenSSL packaging recently split out some legacy code (cryptography that s no longer considered a good idea to use, but that s sometimes needed for compatibility) to an openssl-legacy-provider package, and added a Recommends on it. Most users install Recommends, but package build processes don t; and the Python cryptography package requires this code unless you set the CRYPTOGRAPHY_OPENSSL_NO_LEGACY=1 environment variable, which caused a bunch of packages that build-depend on it to fail to build. After playing whack-a-mole setting that environment variable in a few packages build process, I decided I didn t want to be caught in the middle here and filed an upstream issue to see if I could get Debian s OpenSSL team and cryptography s upstream talking to each other directly. There was some moderately spirited discussion and the issue remains open, but for the time being the OpenSSL team has effectively reverted the change so it s no longer a pressing problem. GCC 14 regressions Continuing from last month, I fixed build failures in pccts (NMU) and trn4. Python team I upgraded alembic, automat, gunicorn, incremental, referencing, pympler (fixing compatibility with Python >= 3.10), python-aiohttp, python-asyncssh (fixing CVE-2023-46445, CVE-2023-46446, and CVE-2023-48795), python-avro, python-multidict (fixing a build failure with GCC 14), python-tokenize-rt, python-zipp, pyupgrade, twisted (fixing CVE-2024-41671 and CVE-2024-41810), zope.exceptions, zope.interface, zope.proxy, zope.security, and zope.testrunner to new upstream versions. In the process, I added myself to Uploaders for zope.interface; I m reasonably comfortable with the Zope Toolkit and I seem to be gradually picking up much of its maintenance in Debian. A few of these required their own bits of yak-shaving: I improved some Multi-Arch: foreign tagging (python-importlib-metadata, python-typing-extensions, python-zipp). I fixed build failures in pipenv, python-stdlib-list, psycopg3, and sen, and fixed autopkgtest failures in autoimport (upstream PR), python-semantic-release and rstcheck. Upstream for zope.file (not in Debian) filed an issue about a test failure with Python 3.12, which I tracked down to a Python 3.12 compatibility PR in zope.security. I made python-nacl build reproducibly (upstream PR). I moved aliased files from / to /usr in timekpr-next (#1073722). Installer team I applied a patch from Ubuntu to make os-prober support building with the noudeb profile (#983325).

Russ Allbery: Review: Reasons Not to Worry

Review: Reasons Not to Worry, by Brigid Delaney
Publisher: Harper
Copyright: 2022
Printing: October 2023
ISBN: 0-06-331484-3
Format: Kindle
Pages: 295
Reasons Not to Worry is a self-help non-fiction book about stoicism, focusing specifically on quotes from Seneca, Epictetus, and Marcus Aurelius. Brigid Delaney is a long-time Guardian columnist who has written on a huge variety of topics, including (somewhat relevantly to this book) her personal experiences trying weird fads. Stoicism is having a moment among the sort of men who give people life advice in podcast form. Ryan Holiday, a former marketing executive, has made a career out of being the face of stoicism in everyone's podcast feed (and, of course, hosting his own). He is far from alone. If you pay attention to anyone in the male self-help space right now (Cal Newport, in my case), you have probably heard something vague about the "wisdom of the stoics." Given that the core of stoicism is easily interpreted as a strategy for overcoming your emotions with logic, this isn't surprising. Philosophies that lean heavily on college dorm room logic, discount emotion, and argue that society is full of obvious flaws that can be analyzed and debunked by one dude with some blog software and a free afternoon have been very popular in tech circles for the past ten to fifteen years, and have spread to some extent into popular culture. Intriguingly, though, stoicism is a system of virtue ethics, which means it is historically in opposition to consequentialist philosophies like utilitarianism, the ethical philosophy behind effective altruism and other related Silicon Valley fads. I am pretty exhausted with the whole genre of men talking to each other about how to live a better life Cal Newport by himself more than satisfies the amount of that I want to absorb but I was still mildly curious about stoicism. My education didn't provide me with a satisfying grounding in major historical philosophical movements, so I occasionally look around for good introductions. Stoicism also has some reputation as an anxiety-reduction technique, and I could use more of those. When I saw a Discord recommendation for Reasons Not to Worry that specifically mentioned its lack of bro perspective, I figured I'd give it a shot. Reasons Not to Worry is indeed not a bro book, although I would have preferred fewer appearances of the author's friend Andrew, whose opinions on stoicism I could not possibly care less about. What it is, though, is a shallow and credulous book that falls squarely in the middle of the lightweight self-help genre. Delaney is here to explain why stoicism is awesome and to convince you that a school of Greek and Roman philosophers knew exactly how you should think about your life today. If this sounds quasi-religious, well, I'll get to that. Delaney does provide a solid introduction to stoicism that I think is a bit more approachable than reading the relevant Wikipedia article. In her presentation, the core of stoicism is the practice of four virtues: wisdom, courage, moderation, and justice. The modern definition of "stoic" as someone who is impassive in the presence of pleasure or pain is somewhat misleading, but Delaney does emphasize a goal of ataraxia, or tranquility of mind. By making that the goal rather than joy or pleasure, stoicism tries to avoid the trap of the hedonic treadmill in favor of a more achievable persistent contentment. As an aside, some quick Internet research makes me doubt Delaney's summary here. Other material about stoicism I found focuses on apatheia and associates ataraxia with Epicureanism instead. But I won't start quibbling with Delaney's definitions; I'm not qualified and this review is already too long. The key to ataraxia, in Delaney's summary of stoicism, is to focus only on those parts of life we can control. She summarizes those as our character, how we treat others, and our actions and reactions. Everything else wealth, the esteem of our colleagues, good health, good fortune is at least partly outside of our control, and therefore we should enjoy it when we have it but try to be indifferent to whether it will last. Attempting to control things that are outside of our control is doomed to failure and will disturb our tranquility. Essentially all of this book is elaborations and variations on this theme, specialized to some specific area of life like social media, anxiety, or grief and written in the style of a breezy memoir. If you're familiar with modern psychological treatment frameworks like cognitive behavioral therapy or acceptance and commitment therapy, this summary of stoicism may sound familiar. (Apparently this is not an accident; the predecessor to CBT used stoicism as a philosophical basis.) Stoicism, like those treatment approaches, tries to refocus your attention on the things that you can improve and de-emphasizes the things outside of your control. This is a lot of the appeal, at least to me (and I think to Delaney as well). Hearing that definition, you may have some questions. Why those virtues specifically? They sound good, but all virtues sound good almost by definition. Is there any measure of your success in following those virtues outside your subjective feeling of ataraxia? Does the focus on only things you can control lead to ignoring problems only mostly outside of your control, where your actions would matter but only to a small degree? Doesn't this whole philosophy sound a little self-centered? What do non-stoic virtue ethics look like, and why do they differ from stoicism? What is the consequentialist critique of stoicism? This is where the shortcomings of this book become clear: Delaney is not very interested in questions like this. There are sections on some of those topics, particularly the relationship between stoicism and social justice, but her treatment is highly unsatisfying. She raises the question, talks about her doubts about stoicism's applicability, and then says that, after further thought, she decided stoicism is entirely consistent with social justice and the stoics were right after all. There is a little bit more explanation than that, but not much. Stoicism can apparently never be wrong; it can only be incompletely understood. Self-help books often fall short here, and I suspect this may be what the audience wants. Part of the appeal of the self-help genre is artificial certainty. Becoming a better manager, starting a business, becoming more productive, or working out an entire life philosophy are not problems amenable to a highly approachable and undemanding book. We all know that at some level, but the seductive allure of the self-help genre is the promise of simplifying complex problems down to a few approachable bullet points. Here is a life philosophy in a neatly packaged form, and if you just think deeply about its core principles, you will find they can be applied to any situation and any doubts you were harboring will turn out to be incorrect. I am all too familiar with this pattern because it's also how fundamentalist Christianity works. The second time Delaney talked about her doubts about the applicability of stoicism and then claimed a few pages later that those doubts disappeared with additional thought and discussion, my radar went off. This book was sounding less like a thoughtful examination of one specific philosophy out of many and more like the soothing adoption of religious certainty by a convert. I was therefore entirely unsurprised when Delaney all but says outright in the epilogue that she's adopted stoicism as her religion and approaches it with the same dedicated practice that she used to bring to Catholicism. I think this is where a lot of self-help books end up, although most of them don't admit it. There's nothing wrong with this, to be clear. It sounds like she was looking for a non-theistic religion, found one that she liked, and is excited to tell other people about it. But it's a profound mismatch with what I was looking for in an introduction to stoicism. I wanted context, history, and a frank discussion of the problems with adopting philosophy to everyday issues. I also wanted some acknowledgment that it is highly unlikely that a few men who lived 2000 years ago in a wildly different social context, and with drastically limited information about cultures other than their own, figured out a foolproof recipe for how to approach life. The subsequent two millennia of philosophical debates prove that stoicism didn't end the argument, and that a lot of other philosophers thought that stoicism got a few things wrong. You would never know that from this book. What I wanted is outside the scope of this sort of undemanding self-help book, though, and this is the problem that I keep having with philosophy. The books I happen across are either nigh-incomprehensibly dense and academic, or they're simplified into catechism. This was the latter. That's probably more the fault of my reading selection than it is the fault of the book, but it was still annoying. What I will say for this book, and what I suspect may be the most useful property of self-help books in general, is that it prompts you to think about basic stoic principles without getting in the way of your thoughts. It's like background music for the brain: nothing Delaney wrote was very thorny or engaging, but she kept quietly and persistently repeating the basic stoic formula and turning my thoughts back to it. Some of those thoughts may have been useful? As a source of prompts for me to ponder, Reasons Not to Worry was therefore somewhat successful. The concept of not trying to control things outside of my control is simple but valid, and it probably didn't hurt me to spend a week thinking about it. "It kind of works as an undemanding meditation aid" is not a good enough reason for me to recommend this book, but maybe that's what someone else is looking for. Rating: 5 out of 10

27 August 2024

Russ Allbery: Review: Dark Horse

Review: Dark Horse, by Michelle Diener
Series: Class 5 #1
Publisher: Eclipse
Copyright: June 2015
ISBN: 0-9924559-3-6
Format: Kindle
Pages: 366
Dark Horse is a science fiction romance novel, the first of a five book series as of this writing. It is self-published, although it is sufficiently well-edited and packaged that I had to do some searching to confirm that. Rose was abducted by aliens. The Tecrans picked her up along with a selection of Earth animals, kept her in a cell in their starship, and experimented on her. As the book opens, she has managed to make her escape with the aid of an AI named Sazo who was also imprisoned on the Tecran ship. Sazo dealt with the Tecrans, dropped the ship in the middle of Grih territory, and then got Rose and most of the animals on shuttles to a nearby planet. Dav Jallan is the commander of the ship the Grih sent to investigate the unexplained appearance of a Class 5 Tecran warship in the middle of their territory. The Grih and the Tecran, along with three other species, are members of the United Council, which means in theory they're all at peace. With the Tecran, that theory is often strained. Dav is not going to turn down one of their highly-advanced Class 5 warships delivered to him on a silver platter. There is only the matter of the unexpected cargo, the first orange dots (indicating unknown life forms) that most of the Grih have ever seen. There is a romance. That romance did not work for me. I thought it was highly unprofessional on Dav's part and a bit too obviously constructed on the author's part. It also leans on the subgenre convention that aliens can be remarkably physically similar and sexually compatible, which always causes problems for my suspension of disbelief even though I know it's no less plausible than faster-than-light travel. Despite that, I had so much fun with this book! It was absolutely delightful and weirdly grabby in a way that caught me by surprise. I was skimming some parts of it to write this review and found myself re-reading multiple pages before I dragged myself back on task. I think the most charming part of this book is that the United Council has a law called the Sentient Beings Agreement that makes what the Tecran were doing extremely illegal, and the Grih and the other non-Tecran aliens take this very seriously and with a refreshing lack of cynicism. Rose has a typical human reaction to ending up in a place where she doesn't know the rules and isn't entirely an expected guest. She almost reflexively smoothes over miscommunications and tensions, trying to adapt to their expectations. And then, repeatedly, the Grih realize how much work she's doing to adapt to them, feel enraged at the Tecran and upset that they didn't understand or properly explain something, and find some way to make Rose feel more comfortable. It's surprisingly soothing and comforting to read. It occurred to me in several places that Dark Horse could be read as a wish-fulfillment fantasy of what life as a woman could be like if men took their fair share of the mental load. (This concept is usually applied to housework, but I think it generalizes to other social and communication contexts.) I suspect this was not an accident. There is a lot of wish fulfillment in this book. The Grih are very human-like but hunky, which is convenient for the romance subplot. They struggle to sing, value music exceptionally highly, and consider Rose's speaking voice beautifully musical. Her typical human habit of singing to herself is a source of immediate and almost overwhelming fascination. The supplies Rose takes from the Tecran ship when she flees just happen to be absurdly expensive scented shampoo and equally expensive luxury adaptable clothing. The world she lands on, and the Grih ship, are low-gravity compared to Earth, so Rose is unusually strong for her size. Grih military camouflage has no effect on her human vision. The book is set up to make Rose special. If that type of wish fulfillment is going to grate, wait on this book until you're more in the mood for it. But I like wish fulfillment books when they're done well. Part of why I like to read is to imagine a better world. And Rose isn't doted on; despite their hospitality, she's constantly underestimated by the Grih. Even with their deep belief in the Sentient Beings Agreement, they find it hard to believe that an unknown sentient, even an advanced sentient, is really their equal. Their concern at the start is somewhat patronizing, so watching Rose constantly surprise them delighted the part of my brain that likes both competence porn and deserved reversals, even though the competence here is often due to accidents of biology. It helps that Diener tells the story in alternating perspectives, so the reader first watches Rose do something practical and straightforward from her perspective and then gets to enjoy the profound surprise and chagrin of the aliens. There is a plot beneath this first contact story, and beyond the political problem of figuring out what to do with Rose and the Tecran. Sazo, Rose's AI friend, does not want the Grih to know he exists. He has a history that Rose does not know about and may not be entirely safe. As the political situation with the Tecran escalates, Sazo is pursuing goals of his own, and Rose has a firm opinion about where her loyalties should lie. The resolution is nothing ground-breaking as far as SF goes, but I thought it was satisfyingly tense and complex. Dark Horse leaves obvious room for a sequel, but it comes to a satisfying conclusion. The writing is serviceable, particularly once you get into the story. I would not call it great, and it's not going to win any literary awards, but it didn't interfere with my enjoyment of the story. This is not the sort of book that will make anyone's award list, but it is easily in the top five of books I had the most fun reading this year. Maybe save it for when you're looking for something light and wholesome and don't mind some rather obvious tropes, but if you're in the mood for imagining people who take laws seriously and sincerely try to help other people, I found this an utterly delightful way to pass the time. I immediately bought the sequel. Recommended. Followed by Dark Deeds. Rating: 8 out of 10

22 August 2024

Jonathan McDowell: Thoughts on Advent of Code + Rust

Diego wrote about his dislike for Advent of Code and that reminded me I hadn t written up my experience from 2023. Mostly because, spoiler, I never actually completed it and always intended to do so and then write it up. I think it s time to accept I m not going to do that, and write down some thoughts before I forget all of them. These are somewhat vague, given the time that s elapsed, but I think still relevant. You might also find Roger s problem write up interesting. I ve tried AoC a couple of times before; I think I had a very brief attempt back in 2021, and I got 4 days in for 2022. For Advent of Code 2023 I tried much harder to actually complete the challenges, and got most of the way there. I didn t allow myself to move on to the next day until fully completing the previous day, and didn t end up doing the second half of December 24th, or any of December 25th.

Rust First I want to talk about Rust, which is the language I chose to use for the problems. I ve dabbled a little in it, but I d like more familiarity with the basic language, and some programming problems seemed like a good way to get that. It s a language I want to like; I ve spent a lot of my career writing C, do more in Go these days, and generally think Rust promises a low level, run-time light environment like C but with the rough edges taken off. I set myself the challenge of using just bare Rust; no external crates, no use of cargo. I was accused of playing on hard mode by doing this, but it really wasn t the intention - I figured that I should be able to do what I needed without recourse to anything outside the core language, and didn t want what seemed like the extra complexity of dealing with cargo. That caused problems, however. I m used to by-default generic error handling in Go through the error type, but Rust seems to have much more tightly typed errors. I was pointed at anyhow as the right way to do this in Rust. I still find this surprising; I ended up using unwrap() a lot when I think with more generic error handling I could have used ?. The other thing I discovered is that by default rustc is heavy on the debug output. I got significantly better results on some of the solutions with rustc -O -C target-cpu=native source.rs. I probably shouldn t be surprised by this, but worth noting. Rust, to me, has a syntax only a C++ programmer could love. I am not a C++ programmer. Coming from C I found Go to be a nice, simple syntax to learn. Rust has not been the same. There s a lot more punctuation, and it s not always clear to me what it s doing. This applies more when reading other people s code than when writing it myself, obviously, but I see a lot of Rust code that could give Perl a run for its money in terms of looking like line noise. The borrow checker didn t bug me too much, but did add overhead to my thinking. The Rust compiler is generally very good at outputting helpful error messages when the programmer is an idiot. I ended up having to use a RefCell for one solution, and using .iter() for loops rather than explicit iterators (why, why is this different?). I also kept forgetting to explicitly mark variables as mutable when declaring them. Things I liked? There s a rich set of first class data types. Look, I m a C programmer, I m easily pleased. You give me some sort of hash array and I ll be happy. Rust manages that, tuples, strings, all the standard bits any modern language can provide. The whole impl thing for adding methods to structures I like as a way of providing some abstraction, though I think Go has a nicer syntax for it. The compiler, as mentioned, is great at spitting out useful errors for the most part. Also although I wasn t using external crates for AoC I do appreciate there s a decent ecosystem there now (though that brings up another gripe: rust seems to still be a fairly fast moving target, to the extent I can no longer rely on the compiler in Debian stable to be able to compile random projects I find).

Advent of Code Let s talk about the advent of code bit now. Hopefully it s long enough since it came out that this won t be spoilers for anyone, but if you haven t attempted the 2023 AoC and might, you might want to stop reading here. First, a refresher on the format for those who might not be aware of it. Problems are posted daily from December 1st until the 25th. Each is in 2 parts; the second part is not viewable until you have provided the correct answer for the first part. There s a whole leaderboard thing going on, but the puzzle opens at midnight UTC-5 so generally by the time I wake up and have time to look the problem has been solved many times over; no chance of getting listed. Credit to AoC creator, Eric Wastl, for writing up the set of problems in an entertaining fashion. I quite enjoyed seeing how the puzzle would be phrased each day, and the whole thing obviously brings a lot of joy to folk I know. I always start AoC thinking it ll be a fun set of puzzles to solve. Then something happens and I miss a day or two, and all of a sudden I ve a bunch of catching up to do and it s all a bit more of a chore. I hit that at some points this time, but made a concerted effort to try and power through it. That perseverance was required up front, because I found the second part of Day 1 to be ill specified, and had to iterate a few times to actually calculate the desired solution (IIRC, issues about whether sevenone at the end of a line ended up as 7 or 1 really tripped me up). I don t recall any other problems that bit me as hard on the specification as this one, but it happening up front was unfortunate. The short example input doesn t always help with this either; either it s not enough to be able to extrapolate patterns, or it doesn t show all the variations you need to account for (that aren t fully specified in the text), or in a few cases it turned out I needed to understand the shape of the actual data to produce a solution that could actually complete in a reasonable time. Which brings me to another matter, sometimes brute force doesn t actually work. This is fine, but the second part of the day s problem can change the approach you d take. So sometimes I got lucky in the way I handled the first half, and doing the second half was a simple 5 minute tweak, and sometimes I had to entirely change the way I was storing data. You might claim that if I was a better programmer I d have always produced a first half solution that was amenable to extension for the second half. First, I dispute that; I think there are always situations where the problem domain can change in enough directions that you can t handle all of them without a lot of effort. Secondly, I didn t find AoC an environment that encouraged me to optimise for generic solutions. Maybe some of the puzzles in isolation would allow for that, but a month of daily problems to solve while still engaging in regular life meant I hacked things up, took short cuts based on the knowledge I had of the input data, etc, etc. Overall I can see the appeal, but the sheer quantity and the fact I write code as part of my day job just made it feel too much like a chore, rather than a fun mental exercise. I did wonder how they d look as a set of interview puzzles (obviously a subset, rather than all of them), but I m not sure how you d actually use them for that - I wouldn t want anyone to have to solve them in a live interview. So, in case it s not obvious, I m not planning to engage in AoC again this yet. But I m continuing to persevere with Rust (though most of my work stuff is thankfully still Go).

21 August 2024

Russ Allbery: Review: These Burning Stars

Review: These Burning Stars, by Bethany Jacobs
Series: Kindom Trilogy #1
Publisher: Orbit
Copyright: October 2023
ISBN: 0-316-46342-6
Format: Kindle
Pages: 430
These Burning Stars is a science fiction thriller with cyberpunk vibes. It is Bethany Jacobs's first novel and the first of an expected trilogy, and won the 2024 Philip K. Dick Award for the best SF paperback original published in the US. Generation starships brought humanity to the three star systems of the Treble, where they've built a new and thriving culture of billions. The Treble is ruled by the Kindom, a tripartite government structure built around the worship of six gods and the aristocratic power of the First Families. The Clerisy handle religion, the Secretaries run the bureaucracy, and the Cloaksaan enforce the decisions of the other branches. The Nightfoots are one of the First Families. They control sevite, the propellant required to move between the systems of the Treble now that the moon Jeve and the sole source of natural jevite has been destroyed. Esek Nightfoot is a cleric, theoretically following the rules of the Clerisy, but she has made a career of training cloaksaan. She is is mercurial, powerful, ruthless, ambitious, politically well-connected, and greatly feared. She is also obsessed with a person named Six: an orphan she first encountered at a training school who was too young to have a gender or a name but who was already one of the best fighters in the school. In the sort of manipulative challenge typical of Esek, she dangled the offer of a place as a student and challenged the child to learn enough to do something impressive. The subsequent twenty years of elusive taunts and mysterious gifts from the impossible-to-locate Six have driven Esek wild. Cleric Chono was beside Esek for much of that time. One of Six's classmates and another of Esek's rescues, Chono is the rare student who became a cleric rather than a cloaksaan. She is pious, cautious, and careful, the opposite of Esek's mercurial rage, but it's impossible to spend that much time around the woman and not be affected and manipulated by her. As this story opens, Chono is summoned by the First Cleric to join Esek on an assignment: recover a data coin that was stolen from a pirate raid on the Nightfoot compound. He refuses to tell them what data is on it, only saying that he believes it could be used to undermine public trust in the Nightfoot family. Jun is a hacker with considerably fewer connections to power or government and no desire to meet any of these people. She and her partner Liis make a dubiously legal living from smaller, quieter jobs. Buying a collection of stolen data coins for an archivist consortium is riskier than she prefers, but she's been tracking down rumors of this coin for months. The deal is worth a lot of money, enough to make a huge difference for her family. This is the second book I've read recently with strong cyberpunk vibes, although These Burning Stars mixes them with political thriller. This is a messy world with complicated political and religious systems, a lot of contentious history, and vast inequality. The story is told in two interleaved time sequences: the present-day fight over the data coin and the information that it contains, and a sequence of flashbacks telling the history of Esek's relationship with Six and Chono. Jun's story is the most cyberpunk and the one I found the most enjoyable to read, but Chono is a good viewpoint character for Esek's vicious energy and abusive charisma. Six is not a viewpoint character. For most of the book, they're present mostly in shadows, glimpses, and consequences, but they're the strongest character of the book. Both Esek and Six are larger than life, creatures of legend stuffed into mundane politics but too full of strong emotions, both good and bad, to play by any of the rules. Esek has the power base and access to the levers of government, but Six's quiet competence and mercilessly targeted morality may make them the more dangerous of the pair. I found the twisty political thriller part of this book engrossing and very difficult to put down, but it was also a bit too much drama for me in places. Jacobs has some surprises in store, one of which I did not expect at all, and they're set up beautifully and well-done within the story, but Esek and Six become an emotional star that the other characters orbit around and are in danger of getting pulled into. Chono is an accomplished and powerful character in her own right, but she's also an abuse victim, and while those parts are realistic, I didn't entirely enjoy reading them. There is quiet competence here alongside the drama, but I think I wanted the balance of emotion to tip a bit more towards the competence. There is one thing that Jacobs does with the end of the book that greatly impressed me. Unfortunately I can't even hint at it for fear of spoilers, but the ending is unsettling in a way that I found surprising and thought-provoking. I think what I can say is that this book respects the intelligence and skill of secondary characters in a way that I think is rare in a story with such overwhelming protagonists. I'm still thinking about that, and it's going to pull me right into the sequel. This is not going to be to everyone's taste. Esek is a viewpoint character and she can be very nasty. There's a lot of violence and abuse, including one rather graphic fight scene that I thought dragged on much longer than it needed to. But it's a satisfying, complex story with a true variety of characters and some real surprises. I'm glad I read it. Followed by On Vicious Worlds, not yet published as I write this. Content warnings: emotional and physical abuse, graphic violence, off-screen rape and sexual abuse of minors. Rating: 7 out of 10

14 August 2024

Lukas M rdian: Netplan v1.1 released

I m happy to announce that Netplan version 1.1 is now available on GitHub and is soon to be deployed into a Debian and/or Ubuntu installation near you! Six months and 120 commits after the previous version (including one patch release v1.0.1), this release is brought to you by 17 free software contributors from around the globe.  Kudos to everybody involved!

Highlights
  • Custom systemd-networkd-wait-online logic override to wait for link-local and routable interfaces. (#456, #482)
  • Modification of the embedded-switch-mode setting without virtual-function (VF) definitions on SR-IOV devices (#454)
  • Parser flag to ignore individual, broken configurations, instead of not generating any backend configuration (#412)
  • Fixes for @ProtonVPN (#495) and @microsoft Azure Linux (#445), contributed by those companies

Releasing v1.1

Documentation

Bug fixes

New Contributors Full Changelog: 1.0 1.1

8 August 2024

Reproducible Builds: Reproducible Builds in July 2024

Welcome to the July 2024 report from the Reproducible Builds project! In our reports, we outline what we ve been up to over the past month and highlight news items in software supply-chain security more broadly. As always, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Reproducible Builds Summit 2024
  2. Pulling Linux up by its bootstraps
  3. Towards Idempotent Rebuilds?
  4. AROMA: Automatic Reproduction of Maven Artifacts
  5. Community updates
  6. Android Reproducible Builds at IzzyOnDroid with rbtlog
  7. Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems
  8. Development news
  9. Website updates
  10. Upstream patches
  11. Reproducibility testing framework


Reproducible Builds Summit 2024 Last month, we were very pleased to announce the upcoming Reproducible Builds Summit, set to take place from September 17th 19th 2024 in Hamburg, Germany. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. If you re interesting in joining us this year, please make sure to read the event page, which has more details about the event and location. We are very much looking forward to seeing many readers of these reports there.

Pulling Linux up by its bootstraps (LWN) In a recent edition of Linux Weekly News, Daroc Alden has written an article on bootstrappable builds. Starting with a brief introduction that
a bootstrappable build is one that builds existing software from scratch for example, building GCC without relying on an existing copy of GCC. In 2023, the Guix project announced that the project had reduced the size of the binary bootstrap seed needed to build its operating system to just 357-bytes not counting the Linux kernel required to run the build process.
The article goes onto to describe that now, the live-bootstrap project has gone a step further and removed the need for an existing kernel at all. and concludes:
The real benefit of bootstrappable builds comes from a few things. Like reproducible builds, they can make users more confident that the binary packages downloaded from a package mirror really do correspond to the open-source project whose source code they can inspect. Bootstrappable builds have also had positive effects on the complexity of building a Linux distribution from scratch [ ]. But most of all, bootstrappable builds are a boon to the longevity of our software ecosystem. It s easy for old software to become unbuildable. By having a well-known, self-contained chain of software that can build itself from a small seed, in a variety of environments, bootstrappable builds can help ensure that today s software is not lost, no matter where the open-source community goes from here

Towards Idempotent Rebuilds? Trisquel developer Simon Josefsson wrote an interesting blog post comparing the output of the .deb files from our tests.reproducible-builds.org testing framework and the ones in the official Debian archive. Following up from a previous post on the reproducibility of Trisquel, Simon notes that typically [the] rebuilds do not match the official packages, even when they say the package is reproducible , Simon correctly identifies that the purpose of [these] rebuilds are not to say anything about the official binary build, instead the purpose is to offer a QA service to maintainers by performing two builds of a package and declaring success if both builds match. However, Simon s post swiftly moves on to announce a new tool called debdistrebuild that performs rebuilds of the difference between two distributions in a GitLab pipeline and displays diffoscope output for further analysis.

AROMA: Automatic Reproduction of Maven Artifacts Mehdi Keshani, Tudor-Gabriel Velican, Gideon Bot and Sebastian Proksch of the Delft University of Technology, Netherlands, have published a new paper in the ACM Software Engineering on a new tool to automatically reproduce Apache Maven artifacts:
Reproducible Central is an initiative that curates a list of reproducible Maven libraries, but the list is limited and challenging to maintain due to manual efforts. [We] investigate the feasibility of automatically finding the source code of a library from its Maven release and recovering information about the original release environment. Our tool, AROMA, can obtain this critical information from the artifact and the source repository through several heuristics and we use the results for reproduction attempts of Maven packages. Overall, our approach achieves an accuracy of up to 99.5% when compared field-by-field to the existing manual approach [and] we reveal that automatic reproducibility is feasible for 23.4% of the Maven packages using AROMA, and 8% of these packages are fully reproducible.

Community updates On our mailing list this month:
  • Nichita Morcotilo reached out to the community, first to share their efforts to build reproducible packages cross-platform with a new build tool called rattler-build, noting that as you can imagine, building packages reproducibly on Windows is the hardest challenge (so far!) . Nichita goes onto mention that the Apple ecosystem appears to be using ZERO_AR_DATE over SOURCE_DATE_EPOCH. [ ]
  • Roland Clobus announced that the Debian bookworm 12.6 live images are nearly reproducible , with more detail in the post itself and input in the thread from other contributors.
  • As reported in last month s report, Pol Dellaiera completed his master thesis on Reproducibility in Software Engineering at the University of Mons, Belgium. This month, Pol announced this on the list with more background info. Since the master thesis sources have been available, it has received some feedback and contributions. As a result, an updated version of the thesis has been published containing those community fixes.
  • Daniel Gr ber asked for help in getting the Yosys documentation to build reproducibly, citing issues in inter alia the PDF generation causing differing CreationDate metadata values.
  • James Addison continued his long journey towards getting the Sphinx documentation generator to build reproducible documentation. In this thread, James concerns himself with the problem that even when SOURCE_DATE_EPOCH is configured, Sphinx projects that have configured their copyright notices using dynamic elements can produce nonsensical output under some circumstances. James query ended up generating a number of replies.
  • Allen gunner Gunner posted a brief update on the progress the core team is making towards introducing a Code of Conduct (CoC) such that it is in place in time for the RB Summit in Hamburg in September . In particular, gunner asks if you are interested in helping with CoC design and development in the weeks ahead, simply email rb-core@lists.reproducible-builds.org and let us know . [ ]

Android Reproducible Builds at IzzyOnDroid with rbtlog On our mailing list, Fay Stegerman announced a new Reproducible Builds collaboration in the Android ecosystem:
We are pleased to announce Reproducible Builds, special client support and more in our repo : a collaboration between various independent interoperable projects: the IzzyOnDroid team, 3rd-party clients Droid-ify & Neo Store, and rbtlog (part of my collection of tools for Android Reproducible Builds) to bring Reproducible Builds to IzzyOnDroid and the wider Android ecosystem.

Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems Congratulations to Marina Moore of the New York Tandon School of Engineering who has submitted her PhD thesis on Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems. The introduction outlines its contributions to the field:
[S]oftware repositories are a vital component of software development and release, with packages downloaded both for direct use and to use as dependencies for other software. Further, when software is updated due to patched vulnerabilities or new features, it is vital that users are able to see and install this patched version of the software. However, this process of updating software can also be the source of attack. To address these attacks, secure software update systems have been proposed. However, these secure software update systems have seen barriers to widespread adoption. The Update Framework (TUF) was introduced in 2010 to address several attacks on software update systems including repository compromise, rollback attacks, and arbitrary software installation. Despite this, compromises continue to occur, with millions of users impacted by such compromises. My work has addressed substantial challenges to adoption of secure software update systems grounded in an understanding of practical concerns. Work with industry and academic communities provided opportunities to discover challenges, expand adoption, and raise awareness about secure software updates. [ ]

Development news In Debian this month, 12 reviews of Debian packages were added, 13 were updated and 6 were removed this month adding to our knowledge about identified issues. A new toolchain issue type was identified as well, specifically ordering_differences_in_pkg_info.
Colin Percival filed a bug against the LLVM compiler noting that building i386 binaries on the i386 architecture is different when building i386 binaries under amd64. The fix was narrowed down to x87 excess precision, which can result in slightly different register choices when the compiler is hosted on x86_64 or i386 and a fix committed. [ ]
Fay Stegerman performed some in-depth research surrounding her apksigcopier tool, after some Android .apk files signed with the latest apksigner could no longer be verified as reproducible. Fay identified the issue as follows:
Since build-tools >= 35.0.0-rc1, backwards-incompatible changes to apksigner break apksigcopier as it now by default forcibly replaces existing alignment padding and changed the default page alignment from 4k to 16k (same as Android Gradle Plugin >= 8.3, so the latter is only an issue when using older AGP). [ ]
She documented multiple available workarounds and filed a bug in Google s issue tracker.
Lastly, diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded version 272 and Mattia Rizzolo uploaded version 273 to Debian, and the following changes were made as well:
  • Chris Lamb:
    • Ensure that the convert utility is from ImageMagick version 6.x. The command-line interface has seemingly changed with the 7.x series of ImageMagick. [ ]
    • Factor out version detection in test_jpeg_image. [ ]
    • Correct the import of the identify_version method after a refactoring change in a previous commit. [ ]
    • Move away from using DSA OpenSSH keys in tests as support has been deprecated and removed in OpenSSH version 9.8p1. [ ]
    • Move to assert_diff in the test_openssh_pub_key package. [ ]
    • Update copyright years. [ ]
  • Mattia Rizzolo:
    • Add support for ffmpeg version 7.x which adds some extra context to the diff. [ ]
    • Rework the handling of OpenSSH testing of DSA keys if OpenSSH is strictly 9.7, and add an OpenSSH key test with a ed25519-format key [ ][ ][ ]
    • Temporarily disable a few packages that are not available in Debian testing. [ ][ ]
    • Stop ignoring the results of Debian testing in the continuous integration system. [ ]
    • Adjust options in debian/source to make sure not to pack the Python sdist directory into the binary Debian package. [ ]
    • Adjust Lintian overrides. [ ]

Website updates There were a number of improvements made to our website this month, including:

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In July, a number of changes were made by Holger Levsen, including:
  • Grant bremner access to the ionos7 node. [ ][ ]
  • Perform a dummy change to force update of all jobs. [ ][ ]
In addition, Vagrant Cascadian performed some necessary node maintenance of the underlying build hosts. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Jonathan Carter: DebConf24 Busan, South Korea

I m finishing typing up this blog entry hours before my last 13 hour leg back home, after I spent 2 weeks in Busan, South Korea for DebCamp24 and DebCamp24. I had a rough year and decided to take it easy this DebConf. So this is the first DebConf in a long time where I didn t give any talks. I mostly caught up on a bit of packaging, worked on DebConf video stuff, attended a few BoFs and talked to people. Overall it was a very good DebConf, which also turned out to be more productive than I expeced it would. In the welcome session on the first day of DebConf, Nicolas Dandrimont mentioned that a benefit of DebConf is that it provides a sort of caffeine for your Debian motivation. I could certainly feel that affect swell as the days went past, and it s nice to be excited about some ideas again that would otherwise be fading.

Recovering DPL It s a bit of a gear shift being DPL for 4 years, and DebConf Committee for nearly 5 years before that, and then being at DebConf while some issue arise (as it always does during a conference). At first I jump into high alert mode, but then I have to remind myself it s not your problem anymore and let others deal with it. It was nice spending a little in-person time with Andreas Tille, our new DPL, we did some more handover and discussed some current issues. I still have a few dozen emails in my DPL inbox that I need to collate and forward to Andreas, I hope to finish all that up by the end of August. During the Bits from the DPL talk, the usual question came up whether Andreas will consider running for DPL again, to which he just responded in a slide Maybe . I think it s a good idea for a DPL to do at least two terms if it all works out for everyone, since it takes a while to get up to speed on everything. Also, having been DPL for four years, I have a lot to say about it, and I think there s a lot we can fix in the role, or at least discuss it. If I had the bandwidth for it I would have scheduled a BoF for it, but I ll very likely do that for the next DebConf instead!

Video team I set up the standby loop for the video streaming setup. We call it loopy, it s a bunch of OBS scenes that provide announcements, shows sponsors, the schedule and some social content. I wrote about it back in 2020, but it s evolved quite a bit since then, so I m probably due to write another blog post with a bunch of updates on it. I hope to organise a video team sprint in Cape Town in the first half of next year, so I ll summarize everything before then.

It would ve been great if we could have some displays in social areas that could show talks, the loop and other content, but we were just too pressed for time for that. This year s DebConf had a very compressed timeline, and there was just too much that had to be done and that had to be figured out on the last minute. This put quite a lot of strain on the organisers, but I was glad to see how, for the most part, most attendees were very sympathetic to some rough edges (but I digress ). I added more of the OBS machine setup to the videoteam s ansible repository, so as of now it just needs an ansible setup and the OBS data and it s good to go. The loopy data is already in the videoteam git repository, so I could probably just add a git pull and create some symlinks in ansible and then that machine can be installed from 0% to 100% by just installing via debian-installer with our ansible hooks. This DebConf I volunteered quite a bit for actual video roles during the conference, something I didn t have much time for in recent DebConfs, and it s been fun, especially in a session or two where nearly none of the other volunteers showed up. Sometimes chaos is just fun :-)
Baekyongee is the university mascot, who s visible throughout the university. So of course we included this four legged whale creature on the loop too!

Packaging I was hoping to do more packaging during DebCamp, but at least it was a non-zero amount:
  • Uploaded gdisk 1.0.10-2 to unstable (previously tested effects of adding dh-sequence-movetousr) (Closes: #1073679).
  • Worked a bit on bcachefs-tools (updating git to 1.9.4), but has a build failure that I need to look into (we might need a newer bindgen) update: I m probably going to ROM this package soon, it doesn t seem suitable for packaging in Debian.
  • Calamares: Tested a fix for encrypted installs, and uploaded it.
  • Calamares: Uploaded (3.3.8-1) to backports (at the time of writing it s still in backports-NEW).
  • Backport obs-gradient-source for bookworm.
  • Did some initial packaging on Cambalache, I ll upload to unstable once wlroots (0.18) hits unstable.
  • Pixelorama 1.0 I did some initial packaging for Pixelorama back when we did the MiniDebConf Gaming Edition, but it had a few stoppers back then. Version 1.0 seems to fix all of that, but it depends on Godot 4.2 and we re still on the 3 series in Debian, so I ll upload this once Godot 4.2 hits at least experimental. Godot software/games is otherwise quite easy to run, it s basically just source code / data that is installed and then run via godot-runner (godot3-runner package in Debian).

BoFs Python Team BoF Link to the etherpad / pad archive link and video can be found on the talk page: https://debconf24.debconf.org/talks/31-python-bof/ The session ended up being extended to a second part, since all the issues didn t fit into the first session. I was distracted by too many thing during the Python 3.12 transition (to the point where I thought that 3.11 was still new in Debian), so it was very useful listening to the retrospective of that transition. There was a discussion whether Python 3.13 could still make it to testing in time for freeze, and it seems that there is consensus that it can, although, likely with new experimental features like disabling the global interpreter lock and the just in time compiler disabled. I learned for the first time about the dead batteries project, PEP-0594, which removes ancient modules that have mostly been superseded, from the Python standard library. There was some talk about the process for changing team policy, and a policy discussion on whether we should require autopkgtests as a SHOULD or a MUST for migration to testing. As with many things, the devil is in the details and in my opinion you could go either way and achieve a similar result (the original MUST proposal allowed exceptions which imho made it the same as the SHOULD proposal). There s an idea to do some ongoing remote sprints, like having co-ordinated days for bug squashing / working on stuff together. This is a nice idea and probably a good way to energise the team and also to gain some interest from potential newcomers. Louis-Philipe V ronneau was added as a new team admin and there was some discussion on various Sphinx issues and which Lintian tags might be needed for Python 3.13. If you want to know more, you probably have to watch the videos / read the notes :)
    Debian.net BoF Link to the etherpad / pad archive link can be found on the talk page: https://debconf24.debconf.org/talks/37-debiannet-team-bof Debian Developers can set up services on subdomains on debian.net, but a big problem we ve had before was that developers were on their own for hosting those services. This meant that they either hosted it on their DSL/fiber connection at home, paid for the hosting themselves, or hosted it at different services which became an accounting nightmare to claim back the used funds. So, a few of us started the debian.net hosting project (sometimes we just call it debian.net, this is probably a bit of a bug) so that Debian has accounts with cloud providers, and as admins we can create instances there that gets billed directly to Debian. We had an initial rush of services, but requests have slowed down since (not really a bad thing, we don t want lots of spurious requests). Last year we did a census, to check which of the instances were still used, whether they received system updates and to ask whether they are performing backups. It went well and some issues were found along the way, so we ll be doing that again. We also gained two potential volunteers to help run things, which is great. Debian Social BoF Link to the etherpad / pad archive link can be found on the talk page: https://debconf24.debconf.org/talks/34-debiansocial-bof We discussed the services we run, you can view the current state of things at: https://wiki.debian.org/Teams/DebianSocial Pleroma has shown some cracks over the last year or so, and there are some forks that seem promising. At the same time, it might be worth while considering Mastodon too. So we ll do some comparison of features and maintenance and find a way forward. At the time when Pleroma was installed, it was way ahead in terms of moderation features. Pixelfed is doing well and chugging along nicely, we should probably promote it more. Peertube is working well, although we learned that we still don t have all the recent DebConf videos on there. A bunch of other issues should be fixed once we move it to a new machine that we plan to set up. We re removing writefreely and plume. Nice concepts, but it didn t get much traction yet, and no one who signed up for these actually used it, which is fine, some experimentation with services is good and sometimes they prove to be very popular and other times not. The WordPress multisite instance has some mild use, otherwise haven t had any issues. Matrix ended up to be much, much bigger than we thought, both in usage and in its requirements. It s very stateful and remembers discussions for as long as you let it, so it s Postgres database is continuously expanding, this will also be a lot easier to manage once we have this on the new host. Jitsi is also quite popular, but it could probably be on jitsi.debian.net instead (we created this on debian.social during the initial height of COVID-19 where we didn t have the debian.net hosting yet), although in practice it doesn t really matter where it lives. Most of our current challenges will be solved by moving everything to a new big machine that has a few public IPs available for some VMs, so we ll be doing that shortly. Debian Foundation Discussion BoF This was some brainstorming about the future structure of Debian, and what steps might be needed to get there. It s way too big a problem to take on in a BoF, but we made some progress in figuring out some smaller pieces of the larger puzzle. The DPL is going to get in touch with some legal advisors and our trusted organisations so that we can aim to formalise our relationships a bit more by the time it s DebConf again. I also introduced my intention to join the Debian Partners delegation. When I was DPL, I enjoyed talking with external organisations who wanted to help Debian, but helping external organisations help Debian turned out to be too much additional load on the usual DPL roles, so I m pursuing this with the Debian Partners team, more on that some other time. This session wasn t recorded, but if you feel like you missed something, don t worry, all intentions will be communicated and discussed with project members before anything moves forward. There was a strong agreement in the room though that we should push forward on this, and not reach another DebConf where we didn t make progress on formalising Debian s structure more.

    Social Conference Dinner
    Conference Dinner Photo from Santiago
    The conference dinner took place in the university gymnasium. I hope not many people do sports there in the summer, because it got HOT. There was also some interesting observations on the thermodynamics of the attempted cooling solutions, which was amusing. On the plus side, the food was great, the company was good, and the speeches were kept to a minimum, so it was a great conference dinner, even though it was probably cut a bit short due to the heat. Cheese and Wine Cheese and Wine happened on 1 August, which happens to be the date I became a DD at DebConf17 in Montr al seven years before, so this was a nice accidental celebration of my Debiversary :) Since I m running out of time, I ll add some more photos to this post some time after publishing it :P Group Photo As per DebConf tradition, Aigars took the group photo. You can find the high resolution version on Debian s GitLab instance.
    Debian annual conference Debconf 24, Busan, South Korea
    Photography: Aigars Mahinovs aigarius@debian.org
    License: CC-BYv3+ or GPLv2+
    Talking Ah yes, talking to people is a big part of DebConf, but I didn t keep track of it very well.
    • I mostly listened to Alper a bit about his ideas for his talk about debian installer.
    • I talked to Rhonda a bit about ActivityPub and MQTT and whether they could be useful for publicising Debian activity.
    • Listened to Gunnar and Julian have a discussion about GPG and APT which was interesting.
    • I learned that you can learn Hangul, the Korean alphabet, in about an hour or so (I wish I knew that in all my years of playing StarCraft II).
    • We had the usual continuous keysigning party. Besides it s intended function, this is always a good ice breaker and a way to for shy people to meet other shy people.
    • and many other fly-by discussions.

    Stuff that didn t happen this DebConf
    • loo.py A simple Python script that could eventually replace the obs-advanced-scene-switcher sequencer in OBS. It would also be extremely useful if we d ever replace OBS for loopy. I was hoping to have some time to hack on this, and try to recreate the current loopy in loo.py, but didn t have the time.
    • toetally This year videoteam had to scramble to get a bunch of resistors to assemble some tally light. Even when assembled, they were a bit troublesome. It would ve been nice to hack on toetally and get something ready for testing, but it mostly relies on having something like a rasbperry pi zero with an attached screen in order to work on further. I ll try to have something ready for the next mini conf though.
    • extrepo on debian live I think we should have extrepo installed by default on desktop systems, I meant to start a discussion on this, but perhaps it s just time I go ahead and do it and announce it.
    • Live stream to peertube server It would ve been nice to live stream DebConf to PeerTube, but the dependency tree to get this going got a bit too huge. Following our plans discussed in the Debian Social BoF, we should have this safely ready before the next MiniDebConf and should be able to test it there.
    • Desktop Egg there was this idea to get a stand-in theme for Debian testing/unstable until the artwork for the next release is finalized (Debian bug: #1038660), I have an idea that I meant to implement months ago, but too many things got in the way. It s based on Juliette Taka s Homeworld theme, and basically transforms the homeworld into an egg. Get it? Something that hasn t hatched yet? I also only recently noticed that we never used the actual homeworld graphics (featuring the world image) in the final bullseye release. lol.
    So, another DebConf and another new plush animal. Last but not least, thanks to PKNU for being such a generous and fantastic host to us! See you again at DebConf25 in Brest, France next year!

      Next.

      Previous.