11 June 2025

Freexian Collaborators: Monthly report about Debian Long Term Support, May 2025 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors

In May, 22 contributors were paid to work on Debian LTS; their reports are available:
  • Abhijith PA did 8.0h (out of 0.0h assigned and 8.0h from previous period).
  • Adrian Bunk did 26.0h (out of 26.0h assigned).
  • Andreas Henriksson did 1.0h (out of 15.0h assigned and 3.0h from previous period), thus carrying over 17.0h to the next month.
  • Andrej Shadura did 3.0h (out of 10.0h assigned), thus carrying over 7.0h to the next month.
  • Bastien Roucariès did 20.0h (out of 20.0h assigned).
  • Ben Hutchings did 8.0h (out of 20.0h assigned and 4.0h from previous period), thus carrying over 16.0h to the next month.
  • Carlos Henrique Lima Melara did 12.0h (out of 11.0h assigned and 1.0h from previous period).
  • Chris Lamb did 15.5h (out of 0.0h assigned and 15.5h from previous period).
  • Daniel Leidert did 25.0h (out of 26.0h assigned), thus carrying over 1.0h to the next month.
  • Emilio Pozuelo Monfort did 21.0h (out of 16.75h assigned and 11.0h from previous period), thus carrying over 6.75h to the next month.
  • Guilhem Moulin did 11.5h (out of 8.5h assigned and 6.5h from previous period), thus carrying over 3.5h to the next month.
  • Jochen Sprickerhof did 3.5h (out of 8.75h assigned and 17.5h from previous period), thus carrying over 22.75h to the next month.
  • Lee Garrett did 26.0h (out of 12.75h assigned and 13.25h from previous period).
  • Lucas Kanashiro did 20.0h (out of 18.0h assigned and 2.0h from previous period).
  • Markus Koschany did 20.0h (out of 26.25h assigned), thus carrying over 6.25h to the next month.
  • Roberto C. Sánchez did 20.75h (out of 24.0h assigned), thus carrying over 3.25h to the next month.
  • Santiago Ruano Rincón did 15.0h (out of 12.5h assigned and 2.5h from previous period).
  • Sean Whitton did 6.25h (out of 6.0h assigned and 2.0h from previous period), thus carrying over 1.75h to the next month.
  • Sylvain Beucler did 26.25h (out of 26.25h assigned).
  • Thorsten Alteholz did 15.0h (out of 15.0h assigned).
  • Tobias Frost did 12.0h (out of 12.0h assigned).
  • Utkarsh Gupta did 1.0h (out of 15.0h assigned), thus carrying over 14.0h to the next month.

Evolution of the situation

In May, we released 54 DLAs. The LTS Team was particularly active in May, publishing a higher than normal number of advisories, as well as helping with a wide range of updates to packages in stable and unstable, plus some other interesting work. We are also pleased to welcome several updates from contributors outside the regular team.
  • Notable security updates:
    • containerd, prepared by Andreas Henriksson, fixes a vulnerability that could cause containers launched as non-root users to be run as root
    • libapache2-mod-auth-openidc, prepared by Moritz Schlarb, fixes a vulnerability which could allow an attacker to crash an Apache web server with libapache2-mod-auth-openidc installed
    • request-tracker4, prepared by Andrew Ruthven, fixes multiple vulnerabilities which could result in information disclosure, cross-site scripting and use of weak encryption for S/MIME emails
    • postgresql-13, prepared by Bastien Roucariès, fixes an application crash vulnerability that could affect the server or applications using libpq
    • dropbear, prepared by Guilhem Moulin, fixes a vulnerability which could potentially result in execution of arbitrary shell commands
    • openjdk-17, openjdk-11, prepared by Thorsten Glaser, fixes several vulnerabilities, which include denial of service, information disclosure or bypass of sandbox restrictions
    • glibc, prepared by Sean Whitton, fixes a privilege escalation vulnerability
  • Notable non-security updates:
    • wireless-regdb, prepared by Ben Hutchings, updates information reflecting changes to radio regulations in many countries
This month's contributions from outside the regular team include the libapache2-mod-auth-openidc update mentioned above, prepared by Moritz Schlarb (the maintainer of the package); the update of request-tracker4, prepared by Andrew Ruthven (the maintainer of the package); and the updates of openjdk-17 and openjdk-11, also noted above, prepared by Thorsten Glaser.
Additionally, LTS Team members contributed stable updates of the following packages:
  • rubygems and yelp/yelp-xsl, prepared by Lucas Kanashiro
  • simplesamlphp, prepared by Tobias Frost
  • libbson-xs-perl, prepared by Roberto C. Sánchez
  • fossil, prepared by Sylvain Beucler
  • setuptools and mydumper, prepared by Lee Garrett
  • redis and webpy, prepared by Adrian Bunk
  • xrdp, prepared by Abhijith PA
  • tcpdf, prepared by Santiago Ruano Rincón
  • kmail-account-wizard, prepared by Thorsten Alteholz
Other contributions were also made by LTS Team members to packages in unstable:
  • proftpd-dfsg DEP-8 tests (autopkgtests) were provided to the maintainer, prepared by Lucas Kanashiro
  • a regular upload of libsoup2.4, prepared by Sean Whitton
  • a regular upload of setuptools, prepared by Lee Garrett
Freexian, the entity behind the management of the Debian LTS project, has been working for some time now on the development of an advanced CI platform for Debian-based distributions, called Debusine. Recently, Debusine has reached a level of feature implementation that makes it very usable. Some members of the LTS Team have been using Debusine informally, and during May LTS coordinator Santiago Ruano Rincón made a call for the team to help with testing of Debusine, and to help evaluate its suitability for the LTS Team to eventually begin using as the primary mechanism for uploading packages into Debian. Team members who have started using Debusine are providing valuable feedback to the Debusine development team, thus helping to improve the platform for all users. In fact, a number of updates, for both bullseye and bookworm, made during the month of May were handled using Debusine, e.g. rubygems's DLA-4163-1. By the way, if you are a Debian Developer, you can easily test Debusine following the instructions found at https://wiki.debian.org/DebusineDebianNet.

DebConf, the annual Debian Conference, is coming up in July and, as is customary each year, the week preceding the conference will feature an event called DebCamp. The DebCamp week provides an opportunity for teams and other interested groups/individuals to meet together in person in the same venue as the conference itself, with the purpose of doing focused work, often called "sprints". LTS coordinator Roberto C. Sánchez has announced that the LTS Team is planning to hold a sprint primarily focused on the Debian security tracker and the associated tooling used by the LTS Team and the Debian Security Team.

Thanks to our sponsors

Sponsors that joined recently are in bold.

5 June 2025

Matthew Garrett: Twitter's new encrypted DMs aren't better than the old ones

(Edit: Twitter could improve this significantly with very few changes - I wrote about that here. It's unclear why they'd launch without doing that, since it entirely defeats the point of using HSMs)

When Twitter[1] launched encrypted DMs a couple of years ago, it was the worst kind of end-to-end encrypted - technically e2ee, but in a way that made it relatively easy for Twitter to inject new encryption keys and get everyone's messages anyway. It was also lacking a whole bunch of features such as "sending pictures", so the entire thing was largely a waste of time. But a couple of days ago, Elon announced the arrival of "XChat", a new encrypted message platform built on Rust with (Bitcoin style) encryption, whole new architecture. Maybe this time they've got it right?

tl;dr - no. Use Signal. Twitter can probably obtain your private keys, and admit that they can MITM you and have full access to your metadata.

The new approach is pretty similar to the old one in that it's based on pretty straightforward and well tested cryptographic primitives, but merely using good cryptography doesn't mean you end up with a good solution. This time they've pivoted away from using the underlying cryptographic primitives directly and into higher level abstractions, which is probably a good thing. They're using Libsodium's boxes for message encryption, which is, well, fine? It doesn't offer forward secrecy (if someone's private key is leaked then all existing messages can be decrypted) so it's a long way from the state of the art for a messaging client (Signal's had forward secrecy for over a decade!), but it's not inherently broken or anything. It is, however, written in C, not Rust[2].

That's about the extent of the good news. Twitter's old implementation involved clients generating keypairs and pushing the public key to Twitter. Each client (a physical device or a browser instance) had its own private key, and messages were simply encrypted to every public key associated with an account. This meant that new devices couldn't decrypt old messages, and also meant there was a maximum number of supported devices and terrible scaling issues and it was pretty bad. The new approach generates a keypair and then stores the private key using the Juicebox protocol. Other devices can then retrieve the private key.

Doesn't this mean Twitter has the private key? Well, no. There's a PIN involved, and the PIN is used to generate an encryption key. The stored copy of the private key is encrypted with that key, so if you don't know the PIN you can't decrypt the key. So we brute force the PIN, right? Juicebox actually protects against that - before the backend will hand over the encrypted key, you have to prove knowledge of the PIN to it (this is done in a clever way that doesn't directly reveal the PIN to the backend). If you ask for the key too many times while providing the wrong PIN, access is locked down.

But this is true only if the Juicebox backend is trustworthy. If the backend is controlled by someone untrustworthy[3] then they're going to be able to obtain the encrypted key material (even if it's in an HSM, they can simply watch what comes out of the HSM when the user authenticates if there's no validation of the HSM's keys). And now all they need is the PIN. Turning the PIN into an encryption key is done using the Argon2id key derivation function, using 32 iterations and a memory cost of 16MB (the Juicebox white paper says 16KB, but (a) that's laughably small and (b) the code says 16 * 1024 in an argument that takes kilobytes), which makes it computationally and moderately memory expensive to generate the encryption key used to decrypt the private key. How expensive? Well, on my (not very fast) laptop, that takes less than 0.2 seconds. How many attempts do I need to crack the PIN? Twitter's chosen to fix that to 4 digits, so a maximum of 10,000. You aren't going to need many machines running in parallel to bring this down to a very small amount of time, at which point private keys can, to a first approximation, be extracted at will.
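The arithmetic behind that claim is easy to check. The following is a rough sketch using only the figures quoted above (a 4-digit PIN and under 0.2 seconds per Argon2id derivation); the machine counts are illustrative assumptions, and the function name is made up for this example.

```python
# Back-of-the-envelope estimate of offline PIN brute-force time.
# Figures from the text: 4-digit PIN, under 0.2 s per Argon2id derivation.
# The machine counts used below are illustrative assumptions.

PIN_DIGITS = 4
SECONDS_PER_GUESS = 0.2  # upper bound quoted for a slow laptop


def worst_case_seconds(machines: int) -> float:
    """Time to try every PIN when the keyspace is split evenly across machines."""
    keyspace = 10 ** PIN_DIGITS  # 10,000 candidate PINs
    return keyspace * SECONDS_PER_GUESS / machines


if __name__ == "__main__":
    for machines in (1, 10, 100):
        secs = worst_case_seconds(machines)
        print(f"{machines:>3} machine(s): {secs:7.1f} s worst case")
```

Under these assumptions a single slow laptop exhausts the keyspace in about 2,000 seconds (roughly 33 minutes), and 100 such machines need about 20 seconds. The expected time is half the worst case, since on average the right PIN turns up halfway through.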

Juicebox attempts to defend against this by supporting sharding your key over multiple backends, and only requiring a subset of those to recover the original. Twitter does seem to be making use of this: it uses three backends and requires data from at least two, but all the backends used are under x.com so are presumably under Twitter's direct control. Trusting the keystore without needing to trust whoever's hosting it requires a trustworthy communications mechanism between the client and the keystore. If the device you're talking to can prove that it's an HSM that implements the attempt limiting protocol and has no other mechanism to export the data, this can be made to work. Signal makes use of something along these lines using Intel SGX for contact list and settings storage and recovery, and Google and Apple also have documentation about how they handle this in ways that make it difficult for them to obtain backed up key material. Twitter has no documentation of this, and as far as I can tell does nothing to prove that the backend is in any way trustworthy. (Edit to add: The Juicebox API does support authenticated communication between the client and the HSM, but that relies on you having some way to prove that the public key you're presented with corresponds to a private key that only exists in the HSM. Twitter gives you the public key whenever you communicate with them, so even if they've implemented this properly you can't prove they haven't made up a new key and MITMed you the next time you retrieve your key)
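To make the "three backends, any two suffice" idea concrete, here is a generic 2-of-3 threshold sketch using Shamir secret sharing. This is an illustration of the general technique only, not Juicebox's actual construction; the field size and the function names are assumptions made for this example.

```python
import secrets

# Illustrative field: the Mersenne prime 2**127 - 1. Not taken from Juicebox.
PRIME = 2**127 - 1


def split_secret(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must fit in the field")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = secrets.randbelow(PRIME)
    backends = split_secret(key, threshold=2, shares=3)
    # Any two of the three shares are enough to rebuild the key.
    assert recover_secret(backends[:2]) == key
    assert recover_secret([backends[0], backends[2]]) == key
```

The scheme only helps when the shares sit with independent parties. If one operator runs all three backends, it can combine any two shares itself, which is the concern raised above.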

On the plus side, Juicebox is written in Rust, so Elon's not 100% wrong. Just mostly wrong.

But ok, at least you've got viable end-to-end encryption even if someone can put in some (not all that much, really) effort to obtain your private key and render it all pointless? Actually no, since you're still relying on the Twitter server to give you the public key of the other party and there's no out of band mechanism to do that or verify the authenticity of that public key at present. Twitter can simply give you a public key where they control the private key, decrypt the message, and then reencrypt it with the intended recipient's key and pass it on. The support page makes it clear that this is a known shortcoming and that it'll be fixed at some point, but they said that about the original encrypted DM support and it never was, so that's probably dependent on whether Elon gets distracted by something else again. And the server knows who and when you're messaging even if they haven't bothered to break your private key, so there's a lot of metadata leakage.

Signal doesn't have these shortcomings. Use Signal.

[1] I'll respect their name change once Elon respects his daughter

[2] There are implementations written in Rust, but Twitter's using the C one with these JNI bindings

[3] Or someone nominally trustworthy but who's been compelled to act against your interests - even if Elon were absolutely committed to protecting all his users, his overarching goals for Twitter require him to have legal presence in multiple jurisdictions that are not necessarily above placing employees in physical danger if there's a perception that they could obtain someone's encryption keys

29 May 2025

Arthur Diniz: Bringing Kubernetes Back to Debian

I've been part of the Debian Project since 2019, when I attended DebConf held in Curitiba, Brazil. That event sparked my interest in the community, packaging, and how Debian works as a distribution. In the early years of my involvement, I contributed to various teams such as the Python, Golang and Cloud teams, packaging dependencies and maintaining various tools. However, I soon felt the need to focus on packaging software I truly enjoyed, tools I was passionate about using and maintaining. That's when I turned my attention to Kubernetes within Debian.

A Broken Ecosystem

The Kubernetes packaging situation in Debian had been problematic for some time. Given its large codebase and complex dependency tree, the initial packaging approach involved vendorizing all dependencies. While this allowed a somewhat functional package to be published, it introduced several long-term issues, especially security concerns. Vendorized packages bundle third-party dependencies directly into the source tarball. When vulnerabilities arise in those dependencies, it becomes difficult for Debian's security team to patch and rebuild affected packages system-wide. This approach broke Debian's best practices, and it eventually led to the abandonment of the Kubernetes source package, which had stalled at version 1.20.5. Due to this abandonment, critical bugs emerged and the package was removed from Debian's testing channel, as we can see in the package tracker.

New Debian Kubernetes Team

Around this time, I became a Debian Maintainer (DM), with permissions to upload certain packages. I saw an opportunity to both contribute more deeply to Debian and to fix Kubernetes packaging. In early 2024, just before DebConf Busan in South Korea, I founded the Debian Kubernetes Team. The mission of the team was to repackage Kubernetes in a maintainable, security-conscious, and Debian-compliant way. At DebConf, I shared our progress with the broader community and received great feedback and more visibility, along with people interested in contributing to the team. Our first task was to migrate existing Kubernetes-related tools such as kubectx, kubernetes-split-yaml and kubetail into a dedicated namespace on Salsa, Debian's GitLab instance. Many of these tools were stored across different teams (like the Go team), and consolidating them helped us organize development and focus our efforts.

De-vendorizing Kubernetes

Our main goal was to un-vendorize Kubernetes and bring it up-to-date with upstream releases. This meant:
  • Removing the vendor directory and all embedded third-party code.
  • Trimming the build scope to focus solely on building kubectl, the Kubernetes CLI.
  • Using Files-Excluded in debian/copyright to cleanly drop unneeded files during source imports.
  • Rebuilding the dependency tree, ensuring all Go modules were separately packaged in Debian.
We used uscan, a standard Debian packaging tool that fetches upstream tarballs and prepares them accordingly. The Files-Excluded directive in our debian/copyright file instructed uscan to automatically remove unnecessary files during the repackaging process:
$ uscan
Newest version of kubernetes on remote site is 1.32.3, specified download version is 1.32.3
Successfully repacked ../v1.32.3 as ../kubernetes_1.32.3+ds.orig.tar.gz, deleting 30616 files from it.
The results were dramatic. By comparing the original upstream tarball with our repackaged version, we can see that our approach reduced the tarball size by over 75%:
$ du -h upstream-v1.32.3.tar.gz kubernetes_1.32.3+ds.orig.tar.gz
14M	upstream-v1.32.3.tar.gz
3.2M	kubernetes_1.32.3+ds.orig.tar.gz
This significant reduction wasn't just about saving space. By removing over 30,000 files, we simplified the package, making it more maintainable. Each dependency could now be properly tracked, updated, and patched independently, resolving the security concerns that had plagued the previous packaging approach.
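As a rough illustration of the exclusion step described above, the sketch below filters a list of paths against glob patterns in the style of a Files-Excluded field. It is not uscan's implementation: the patterns, the file names and the rule that a matching directory drops everything beneath it are assumptions made for this example.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns; the real list lives in debian/copyright.
FILES_EXCLUDED = ["vendor", "third_party/*", "*.exe"]


def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if the path, or any directory above it, matches a pattern."""
    parts = path.split("/")
    prefixes = ["/".join(parts[: i + 1]) for i in range(len(parts))]
    return any(fnmatchcase(p, pat) for p in prefixes for pat in patterns)


def repack(paths: list[str], patterns: list[str]) -> list[str]:
    """Return the paths that would survive repacking."""
    return [p for p in paths if not is_excluded(p, patterns)]


if __name__ == "__main__":
    # Hypothetical upstream tree, for illustration only.
    upstream = [
        "cmd/kubectl/kubectl.go",
        "vendor/github.com/foo/bar.go",
        "third_party/forked/x.go",
        "go.mod",
    ]
    for kept in repack(upstream, FILES_EXCLUDED):
        print(kept)
```

Here `fnmatchcase` lets `*` match across `/`, so a single pattern can cover a whole subtree.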

Dependency Graph

To give you an idea of the complexity involved in packaging Kubernetes for Debian, the image below is a dependency graph generated with debtree, visualizing all the Go modules and other dependencies required to build the kubectl binary. This web of nodes and edges represents every module and its relationship during the compilation process of kubectl. Each box is a Debian package, and the lines connecting them show how deeply intertwined the ecosystem is. What might look like a mess of blue spaghetti is actually a clear demonstration of the vast and interconnected upstream world that tools like kubectl rely on. But more importantly, this graph is a testament to the effort that went into making kubectl build entirely using Debian-packaged dependencies only, no vendoring, no downloading from the internet, no proprietary blobs.

Upstream Version 1.32.3 and Beyond

After nearly two years of work, we successfully uploaded version 1.32.3+ds of kubectl to Debian unstable.
kubernetes/-/merge_requests/1
The new package also includes:
  • Zsh, Fish, and Bash completions installed automatically
  • Man pages and metadata for improved discoverability
  • Full integration with kind and docker for testing purposes

Integration Testing with Autopkgtest

To ensure the reliability of kubectl in real-world scenarios, we developed a new autopkgtest suite that runs integration tests using real Kubernetes clusters created via Kind. Autopkgtest is a Debian tool used to run automated tests on binary packages. These tests are executed after the package is built but before it's accepted into the Debian archive, helping catch regressions and integration issues early in the packaging pipeline. Our test workflow validates kubectl by performing the following steps:
  • Installing Kind and Docker as test dependencies.
  • Spinning up two local Kubernetes clusters.
  • Switching between cluster contexts to ensure multi-cluster support.
  • Deploying and scaling a sample nginx application using kubectl.
  • Cleaning up the entire test environment to avoid side effects.
  • debian/tests/kubectl.sh
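The workflow above could be sketched roughly as follows. This is not the contents of debian/tests/kubectl.sh: the cluster names, replica count and timeout are assumptions, and installing Kind and Docker is left to the test's declared dependencies. The sketch prints the commands by default and only executes them when dry_run is disabled.

```python
import shlex
import subprocess

# Hypothetical cluster names, chosen for this example.
CLUSTERS = ["test-a", "test-b"]


def build_plan(clusters: list[str]) -> tuple[list[list[str]], list[list[str]]]:
    """Return (test steps, cleanup steps) as argument lists."""
    steps = [["kind", "create", "cluster", "--name", c] for c in clusters]
    # kind names each kubeconfig context "kind-<cluster name>".
    steps += [["kubectl", "config", "use-context", f"kind-{c}"] for c in clusters]
    steps += [
        ["kubectl", "create", "deployment", "nginx", "--image=nginx"],
        ["kubectl", "scale", "deployment", "nginx", "--replicas=3"],
        ["kubectl", "rollout", "status", "deployment/nginx", "--timeout=120s"],
    ]
    cleanup = [["kind", "delete", "cluster", "--name", c] for c in clusters]
    return steps, cleanup


def run_plan(steps, cleanup, dry_run: bool = True) -> None:
    """Run the steps in order; always attempt cleanup, even after a failure."""
    try:
        for cmd in steps:
            print("+", shlex.join(cmd))
            if not dry_run:
                subprocess.run(cmd, check=True)
    finally:
        for cmd in cleanup:
            print("+", shlex.join(cmd))
            if not dry_run:
                subprocess.run(cmd, check=False)


if __name__ == "__main__":
    run_plan(*build_plan(CLUSTERS))
```

Keeping cleanup in a `finally` block means a failed step does not leave clusters behind for later tests.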

Popcon: Measuring Adoption

To measure real-world usage, we rely on data from Debian's popularity contest (popcon), which gives insight into how many users have each binary installed. Here's what the data tells us:
  • kubectl (new binary): Already installed on 2,124 systems.
  • golang-k8s-kubectl-dev: This is the Go development package (a library), useful for other packages and developers who want to interact with Kubernetes programmatically.
  • kubernetes-client: The legacy package that kubectl is replacing. We expect this number to decrease in future releases as more systems transition to the new package.
Although the popcon data shows activity for kubectl before the official Debian upload date, it's important to note that those numbers represent users who had it installed from upstream source-lists, not from the Debian repositories. This distinction underscores a demand that existed even before the package was available in Debian proper, and it validates the importance of bringing it into the archive.
Also worth mentioning: this number is not the real total number of installations, since users can choose not to participate in the popularity contest. So the actual adoption is likely higher than what popcon reflects.

Community and Documentation

The team also maintains a dedicated wiki page which documents:
  • Maintained tools and packages
  • Contribution guidelines
  • Our roadmap for the upcoming Debian releases
https://debian-kubernetes.org

Looking Ahead to Debian 13 (Trixie)

The next stable release of Debian will ship with kubectl version 1.32.3, built from a clean, de-vendorized source. This version includes nearly all the latest upstream features, and will be the first time in years that Debian users can rely on an up-to-date, policy-compliant kubectl directly from the archive. Compared with upstream, our Debian package even delivers more out of the box, including shell completions, which upstream still requires users to generate manually.

In 2025, the Debian Kubernetes team will continue expanding our packaging efforts for the Kubernetes ecosystem. Our roadmap includes:
  • kubelet: The primary node agent that runs on each node. This will enable Debian users to create fully functional Kubernetes nodes without relying on external packages.
  • kubeadm: A tool for creating Kubernetes clusters. With kubeadm in Debian, users will then be able to bootstrap minimum viable clusters directly from the official repositories.
  • helm: The package manager for Kubernetes that helps manage applications through Kubernetes YAML files defined as charts.
  • kompose: A conversion tool that helps users familiar with docker-compose move to Kubernetes by translating Docker Compose files into Kubernetes resources.

Final Thoughts

This journey was only possible thanks to the amazing support of the debian-devel-br community and the collective effort of contributors who stepped up to package missing dependencies, fix bugs, and test new versions. Special thanks to:
  • Carlos Henrique Melara (@charles)
  • Guilherme Puida (@puida)
  • João Pedro Nobrega (@jnpf)
  • Lucas Kanashiro (@kanashiro)
  • Matheus Polkorny (@polkorny)
  • Samuel Henrique (@samueloph)
  • Sergio Cipriano (@cipriano)
  • Sergio Durigan Junior (@sergiodj)
I look forward to continuing this work, bringing more Kubernetes tools into Debian and improving the developer experience for everyone.

Arthur Diniz: Bringing Kubernetes Back to Debian

I ve been part of the Debian Project since 2019, when I attended DebConf held in Curitiba, Brazil. That event sparked my interest in the community, packaging, and how Debian works as a distribution. In the early years of my involvement, I contributed to various teams such as the Python, Golang and Cloud teams, packaging dependencies and maintaining various tools. However, I soon felt the need to focus on packaging software I truly enjoyed, tools I was passionate about using and maintaining. That s when I turned my attention to Kubernetes within Debian.

A Broken Ecosystem The Kubernetes packaging situation in Debian had been problematic for some time. Given its large codebase and complex dependency tree, the initial packaging approach involved vendorizing all dependencies. While this allowed a somewhat functional package to be published, it introduced several long-term issues, especially security concerns. Vendorized packages bundle third-party dependencies directly into the source tarball. When vulnerabilities arise in those dependencies, it becomes difficult for Debian s security team to patch and rebuild affected packages system-wide. This approach broke Debian s best practices, and it eventually led to the abandonment of the Kubernetes source package, which had stalled at version 1.20.5. Due to this abandonment, critical bugs emerged and the package was removed from Debian s testing channel, as we can see in the package tracker.

New Debian Kubernetes Team Around this time, I became a Debian Maintainer (DM), with permissions to upload certain packages. I saw an opportunity to both contribute more deeply to Debian and to fix Kubernetes packaging. In early 2024, just before DebConf Busan in South Korea, I founded the Debian Kubernetes Team. The mission of the team was to repackage Kubernetes in a maintainable, security-conscious, and Debian-compliant way. At DebConf, I shared our progress with the broader community and received great feedback and more visibility, along with people interested in contributing to the team. Our first tasks was to migrate existing Kubernetes-related tools such as kubectx, kubernetes-split-yaml and kubetail into a dedicated namespace on Salsa, Debian s GitLab instance. Many of these tools were stored across different teams (like the Go team), and consolidating them helped us organize development and focus our efforts.

De-vendorizing Kubernetes Our main goal was to un-vendorize Kubernetes and bring it up-to-date with upstream releases. This meant:
  • Removing the vendor directory and all embedded third-party code.
  • Trimming the build scope to focus solely on building kubectl, Kubernetes CLI.
  • Using Files-Excluded in debian/copyright to cleanly drop unneeded files during source imports.
  • Rebuilding the dependency tree, ensuring all Go modules were separately packaged in Debian.
We used uscan, a standard Debian packaging tool that fetches upstream tarballs and prepares them accordingly. The Files-Excluded directive in our debian/copyright file instructed uscan to automatically remove unnecessary files during the repackaging process:
$ uscan
Newest version of kubernetes on remote site is 1.32.3, specified download version is 1.32.3
Successfully repacked ../v1.32.3 as ../kubernetes_1.32.3+ds.orig.tar.gz, deleting 30616 files from it.
The results were dramatic. By comparing the original upstream tarball with our repackaged version, we can see that our approach reduced the tarball size by over 75%:
$ du -h upstream-v1.32.3.tar.gz kubernetes_1.32.3+ds.orig.tar.gz
14M	upstream-v1.32.3.tar.gz
3.2M	kubernetes_1.32.3+ds.orig.tar.gz
This significant reduction wasn t just about saving space. By removing over 30,000 files, we simplified the package, making it more maintainable. Each dependency could now be properly tracked, updated, and patched independently, resolving the security concerns that had plagued the previous packaging approach.

Dependency Graph To give you an idea of the complexity involved in packaging Kubernetes for Debian, the image below is a dependency graph generated with debtree, visualizing all the Go modules and other dependencies required to build the kubectl binary. kubectl-depgraph This web of nodes and edges represents every module and its relationship during the compilation process of kubectl. Each box is a Debian package, and the lines connecting them show how deeply intertwined the ecosystem is. What might look like a mess of blue spaghetti is actually a clear demonstration of the vast and interconnected upstream world that tools like kubectl rely on. But more importantly, this graph is a testament to the effort that went into making kubectl build entirely using Debian-packaged dependencies only, no vendoring, no downloading from the internet, no proprietary blobs.

Upstream Version 1.32.3 and Beyond After nearly two years of work, we successfully uploaded version 1.32.3+ds of kubectl to Debian unstable. kubernetes/-/merge_requests/1 The new package also includes:
  • Zsh, Fish, and Bash completions installed automatically
  • Man pages and metadata for improved discoverability
  • Full integration with kind and docker for testing purposes

Integration Testing with Autopkgtest To ensure the reliability of kubectl in real-world scenarios, we developed a new autopkgtest suite that runs integration tests using real Kubernetes clusters created via Kind. Autopkgtest is a Debian tool used to run automated tests on binary packages. These tests are executed after the package is built but before it s accepted into the Debian archive, helping catch regressions and integration issues early in the packaging pipeline. Our test workflow validates kubectl by performing the following steps:
  • Installing Kind and Docker as test dependencies.
  • Spinning up two local Kubernetes clusters.
  • Switching between cluster contexts to ensure multi-cluster support.
  • Deploying and scaling a sample nginx application using kubectl.
  • Cleaning up the entire test environment to avoid side effects.
  • debian/tests/kubectl.sh

Popcon: Measuring Adoption To measure real-world usage, we rely on data from Debian s popularity contest (popcon), which gives insight into how many users have each binary installed. popcon-graph popcon-table Here s what the data tells us:
  • kubectl (new binary): Already installed on 2,124 systems.
  • golang-k8s-kubectl-dev: This is the Go development package (a library), useful for other packages and developers who want to interact with Kubernetes programmatically.
  • kubernetes-client: The legacy package that kubectl is replacing. We expect this number to decrease in future releases as more systems transition to the new package.
Although the popcon data shows activity for kubectl before the official Debian upload date, it s important to note that those numbers represent users who had it installed from upstream source-lists, not from the Debian repositories. This distinction underscores a demand that existed even before the package was available in Debian proper, and it validates the importance of bringing it into the archive.
Also worth mentioning: this number is not the real total number of installations, since users can choose not to participate in the popularity contest. So the actual adoption is likely higher than what popcon reflects.

Community and Documentation The team also maintains a dedicated wiki page which documents:
  • Maintained tools and packages
  • Contribution guidelines
  • Our roadmap for the upcoming Debian releases
https://debian-kubernetes.org

Looking Ahead to Debian 13 (Trixie) The next stable release of Debian will ship with kubectl version 1.32.3, built from a clean, de-vendorized source. This version includes nearly all the latest upstream features, and will be the first time in years that Debian users can rely on an up-to-date, policy-compliant kubectl directly from the archive. By comparing with upstream, our Debian package even delivers more out of the box, including shell completions, which the upstream still requires users to generate manually. In 2025, the Debian Kubernetes team will continue expanding our packaging efforts for the Kubernetes ecosystem. Our roadmap includes:
  • kubelet: The primary node agent that runs on each node. This will enable Debian users to create fully functional Kubernetes nodes without relying on external packages.
  • kubeadm: A tool for creating Kubernetes clusters. With kubeadm in Debian, users will then be able to bootstrap minimum viable clusters directly from the official repositories.
  • helm: The package manager for Kubernetes that helps manage applications through Kubernetes YAML files defined as charts.
  • kompose: A conversion tool that helps users familiar with docker-compose move to Kubernetes by translating Docker Compose files into Kubernetes resources.

Final Thoughts

This journey was only possible thanks to the amazing support of the debian-devel-br community and the collective effort of contributors who stepped up to package missing dependencies, fix bugs, and test new versions. Special thanks to:
  • Carlos Henrique Melara (@charles)
  • Guilherme Puida (@puida)
  • João Pedro Nobrega (@jnpf)
  • Lucas Kanashiro (@kanashiro)
  • Matheus Polkorny (@polkorny)
  • Samuel Henrique (@samueloph)
  • Sergio Cipriano (@cipriano)
  • Sergio Durigan Junior (@sergiodj)
I look forward to continuing this work, bringing more Kubernetes tools into Debian and improving the developer experience for everyone.

27 May 2025

Russell Coker: Leaf ZE1

I've just got a second hand Nissan LEAF. It's not nearly as luxurious as the Genesis EV that I test drove [1]. It's also just over 5 years old so it's not as slick as the MG4 I test drove [2]. But the going rate for a LEAF of that age is $17,000 vs $35,000 or more for a new MG4 or $130,000+ for a Genesis. At this time the LEAF is the only EV in Australia that's available on the second hand market in quantity. Apparently the cheapest new EV in Australia is a Great Wall one which is $32,000 and which had a wait list last time I checked, so $17,000 is a decent price if you want an electric car and aren't interested in paying the price of a new car.

Starting the Car

One thing I don't like about most recent cars (petrol as well as electric) is that they needlessly break traditions of car design. Inserting a key and turning it clockwise to start a car is a long standing tradition that shouldn't be broken without a good reason. With traditional keys you know that when a car has the key removed it can't be operated: there's no situation of the person with the key walking away and leaving the car driveable, and there's no possibility of the owner driving somewhere without the key and then being unable to start it.

To start a LEAF you have to have the key fob device in range, hold down the brake pedal, and then press the power button. To turn on accessories you do the same but without holding down the brake pedal. They also have patterns of pushes: push twice to turn it on, push three times to turn it off. This is all a lot easier with a key where you can just rotate it as many clicks as needed.

The change of car design for the key means that no physical contact is needed to unlock the car. If someone stands by a car fiddling with the door lock it will get noticed, which deters certain types of crime. If a potential thief can sit in a nearby car to try attack methods and only walk to the target vehicle once it's unlocked, it makes the crime a lot easier.
Even if the electronic key is as secure as a physical key, allowing attempts to unlock remotely weakens security. Reports on forums suggest that the electronic key is vulnerable to replay attacks. I guess I just have to hope that as car thieves typically get less than 10% of the value of a car it's just not worth their effort to steal a $17,000 car. Unlocking doors remotely is a common feature that's been around for a while but starting a car without a key being physically inserted is a new thing.

Other Features

The headlights turn on automatically when the car thinks that the level of ambient light warrants it. There is an option to override this to turn on lights but no option to force the lights to be off. So if you have your car in the on state while parked the headlights will be on even if you are parked and listening to the radio.

The LEAF has a bunch of luxury features which seem a bit ridiculous like seat warmers. It also has a heated steering wheel which has turned out to be a good option for me as I have problems with my hands getting cold. According to the My Nissan LEAF Forum the seat warmer uses a maximum of 50W per seat while the car heater uses a minimum of 250W [3]. So if there are one or two people in the car then significantly less power is used by just heating the seats, and keeping the car air cool also reduces window fog.

The Bluetooth audio support works well. I've done hands free calls and used it for playing music from my phone. This is the first car I've owned with Bluetooth support. It also has line-in which might have had some use in 2019 but is becoming increasingly useless as phones with Bluetooth become more popular. It has support for two devices connecting via Bluetooth at the same time which could be handy if you wanted to watch movies on a laptop or tablet while waiting for someone.

The LEAF has some of the newer safety features: it tracks lane markers and notifies the driver via beeps and vibration if they stray from their lane.
It also tries to read speed limit signs and display the last observed speed limit on the dash display. It also has a skid alert which in my experience goes off under hard acceleration when it's not skidding but doesn't go off if you lose grip when cornering. The features for detecting changing lanes when close to other cars and for emergency braking when another car is partly in the lane (even if moving out of the lane) don't seem well tuned for Australian driving; the common trend on Australian roads is lawful-evil, to use DND terminology.

Range

My most recent driving was just over 2 hours with a distance of a bit over 100km which took the battery from 62% to 14%. So it looks like I can drive a bit over 200km at an average speed of 50km/h. I have been unable to find out the battery size for my car; my model will have either a 40kWh or 62kWh battery. Google results say it should be printed on the B pillar (it's not) and that it can be deduced from the VIN (it can't). I'm guessing that my car is the cheaper option which is supposed to do 240km when new, which means that a bit over 200km at an average speed of 50km/h when 6 years old is about what's expected. If it has the larger battery designed to do 340km then doing 200km in real use would be rather disappointing.

Assuming the battery is 40kWh that means it's 5km/kWh or 10kW average for the duration. That means that the 250W or so used by the car heater should only make about a 2% difference to range, which is something that a human won't usually notice. If I was to drive to another state I'd definitely avoid using the heater or airconditioner as an extra 4km could really matter when trying to find a place to charge when you aren't familiar with the area. It's also widely reported that the LEAF is less efficient at highway speeds which is an extra difficulty for that.
It seems that the LEAF just isn't designed for interstate driving in Australia; it would be fine for driving between provinces of the Netherlands as it's difficult to drive for 200km without leaving that country. Driving 700km to another city in a car with 200km range would mean charging 3 times along the way, that's 2 hours of charging time when using fast chargers. This isn't a problem at all as the average household in Australia has 1.8 cars and the battery electric vehicles only comprise 6.3% of the market. So if a household had a LEAF and a Prius they could just use the Prius for interstate driving. A recent Prius could drive from Melbourne to Canberra or Adelaide without refuelling on the way. If I was driving to another state a couple of times a year I could rent an old fashioned car to do that and still be saving money when compared to buying petrol all the time.

Running Cost

Currently I'm paying about $0.28 per kWh for electricity. It's reported that the efficiency of charging a LEAF is as low as 83% with the best efficiency when fast charging. I don't own the fast charge hardware and don't plan to install it as that would require getting a replacement of the connection to my home from the street, a new switchboard, and other expenses. So I expect I'll be getting 83% efficiency when charging, which means 48kWh for 200km or 96kWh for the equivalent of a $110 tank of petrol. At $0.28/kWh it will cost $26 for the same amount of driving as $110 of petrol. I also anticipate saving money on service as there's no need for engine oil changes and all the other maintenance of a petrol engine, and regenerative braking will reduce the incidence of brake pad replacement. I expect to save over $1100 per annum on using electricity instead of petrol even if I pay the full rate.
But if I charge my car in the middle of the day when there is over supply and I don't get paid for feeding electricity from my solar panels into the grid (as is common nowadays) it could be almost free to charge the car and I could save about $1500 on fuel.

Comfort

Electric cars are much quieter than cars with petrol or Diesel engines which is a major luxury feature. This car is also significantly newer than any other car I've driven much so it has features like Bluetooth audio which weren't in other cars I've driven. When doing 100km/h I can hear a lot of noise from the airflow. Part of that would be due to the LEAF not having the extreme streamlining features that are associated with Teslas (such as retracting door handles) and part of that would be due to the car being older and the door seals not being as good as they were when new. It's still a very quiet car with a very smooth ride. It would be nice if they used the quality of seals and soundproofing that VW uses in the Passat but I guess the car would be heavier and have a shorter range if they did that.

This car has less space for the driver than any other car I've driven (with the possible exception of a 1989 Ford Laser AKA Mazda 323). The front seats have less space than the Prius. Also the batteries seem to be under the front seats so there's a bulge in the floor going slightly in front of the front seats when they are moved back, which gives less space for the front passenger to move their legs and less space for the driver when sitting in a parked car. There is a selection of electric cars from MG, BYD, and Great Wall that have more space in the front seats. If those cars were on the second hand market I might have made a different choice but a second hand LEAF is the only option for a cheap electric car in Australia now.
The heated steering wheel and heated seats took a bit of getting used to but I have come to appreciate the steering wheel, and the heated seats are a good way of extending the range of the car.

Misc Notes

The LEAF is a fun car to drive and being quiet is a luxury feature; it's no different to other EVs in this regard. It isn't nearly as fast as a Tesla, but is faster than most cars actually drive on the road.

When I was looking into buying a LEAF from one of the car sales sites I was looking at models less than 5 years old. But the ZE1 series went from 2017 to 2023 so there's probably not much difference between a 2019 model and a 2021 model but there is a significant price difference. I didn't deliberately choose a 2019 car, it was what a relative was selling at a time when I needed a new car. But knowing what I know now I'd probably look at that age of LEAF if choosing from the car sales sites.

Problems

When I turn the car off the side mirrors fold in but when I turn it on they usually don't automatically unfold if I have anything connected to the cigarette lighter power port. This is a well known problem and documented on forums. This is something that Nissan really should have tested before release because phone chargers that connect to the car cigarette lighter port have been common for at least 6 years before my car was manufactured and at least 4 years before the ZE1 model was released.

The built in USB port doesn't supply enough power to match the power use of a Galaxy Note 9 running Google maps and playing music through Bluetooth. On its own this isn't a big deal but combined with the mirror issue of using a charger in the cigarette lighter port it's a problem.

The cover over the charging ports doesn't seem to lock easily enough; I had it come open when doing 100km/h on a freeway. This wasn't a big deal but as the cover opens in a suicide-door manner at a higher speed it could have broken off.

The word is that LEAF service in Australia is not done well.
Why do you need regular service of an electric car anyway? For petrol and Diesel cars it's engine oil replacement that makes it necessary to have regular service. Surely you can just drive it until either the brakes squeak or the tires seem worn.

I have been having problems charging, sometimes it will charge from ~20% to 100% in under 24 hours, sometimes in 14+ hours it only gets to 30%.

Conclusion

This is a good car and the going price on them is low. I generally recommend them as long as you aren't really big and aren't too worried about the poor security. It's a fun car to drive even with a few annoying things like the mirrors not automatically extending on start.

The older ones like this are cheap enough that they should be able to cover the entire purchase cost in 10 years by the savings from not buying petrol even if you don't drive a lot. With a petrol car I use about 13 tanks of petrol a year so my driving is about half the average for Australia. Some people could cover the purchase price of a second hand LEAF in under 5 years.
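The range and running-cost arithmetic in this post can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, the 40kWh capacity is the post's own guess, and the inputs are the rounded figures quoted above (200km range, 83% charging efficiency, $0.28/kWh, 96kWh per tank-equivalent, 13 tanks a year), so the results land close to the post's rounded numbers rather than matching them exactly.

```python
def estimated_range_km(trip_km, start_pct, end_pct):
    """Extrapolate a full-battery range from the charge used on one trip."""
    return trip_km * 100.0 / (start_pct - end_pct)


def average_power_kw(battery_kwh, range_km, speed_kmh):
    """Average power draw if the whole battery is used over range_km at speed_kmh."""
    return battery_kwh * speed_kmh / range_km


def grid_energy_kwh(battery_kwh, efficiency):
    """Energy drawn from the wall to put battery_kwh into the battery."""
    return battery_kwh / efficiency


def annual_fuel_saving(tanks_per_year, tank_cost, kwh_per_tank, price_per_kwh):
    """Yearly saving from buying electricity instead of tanks of petrol."""
    return tanks_per_year * (tank_cost - kwh_per_tank * price_per_kwh)


if __name__ == "__main__":
    # All inputs below are figures quoted in the post; 40 kWh is an assumption.
    print("range km:", round(estimated_range_km(100, 62, 14)))
    power = average_power_kw(40, 200, 50)
    print("average kW:", power)
    print("heater share of draw:", round(0.25 / power * 100, 1), "%")
    print("kWh from grid per 200km:", round(grid_energy_kwh(40, 0.83), 1))
    print("saving per year:", round(annual_fuel_saving(13, 110, 96, 0.28)))
```

Swapping in 62 for the battery capacity shows how the per-km figures would change if the car turned out to have the larger battery.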

21 May 2025

Simon Quigley: Fences and Values

"Don't knock the fence down before you know why it's up." I repeat this phrase over and over again, yet the (metaphorical) Homeowner's Association still decides my fence is the wrong color.

Well, now you get to know why the fence is up. If anyone's actually willing to challenge me on this level, I'd welcome it.

The four ideas I'd like to discuss are this: quantum physics, Lutheranism, mental resilience, and psychology. I've been studying these topics intensely for the past decade as a passion project. I'm just going to let my thoughts flow, but I'd like to hear other opinions on this.

Can the mysteries of the mind, the subatomic world, and faith converge to reveal deeper truths?

When it comes to self-taught knowledge on analysis, I'm mostly learned on Freud, with some hints of Jung and Peterson. I've read much of the original source material, and watched countless presentations on it. This all being said, I'm both learned on Rothbard and Marx, so if there is a major flaw in the way of Freud is frowned upon, I'd genuinely like to know so I can update my research and juxtapose the two schools of thought.

Alongside this, although probably not directly relevant, I'm learned on John Locke and transcendentalism. What I'd like to focus on here is this: the Id.

The Id is the pleasure-seeking, instinctual part of the psyche. Jung further extends this into the idea of the shadow self, and Peterson maps the meanings of these texts into a combined work (at least in my rudimentary understanding).

In my research, the Id represents the part of your psyche that deals with religious values. As an example, if you're an impulsive person, turning to a spiritual or religious outlet can be highly beneficial. I've been using references from the foundational text of the Judaeo-Christian value system this entire time, feel free to re-read my other blog posts (instead of claiming they don't exist).

Let's tie this into quantum physics. This is the part where I'll struggle most.
I've watched several movies about this, read several books, and even learned about it academically, but quantum physics is likely to be my weak spot here.

I did some research, and here are the elements I'm looking for: uncertainty principle, wave-particle duality, quantum entanglement, and the observer effect.

I already know about the cat in the box. And the Cat in the Hat, for that matter. I know about wave-particle duality from an incredibly intelligent high school physics teacher of mine. I know about the uncertainty principle purely in a colloquial sense. The remaining element I need to wrap my head around is quantum entanglement, but it feels like I'm almost there.

These concepts do actually challenge the idea of pure free will. It's almost like we're coming full circle. Some theologians (including myself, if you can call me a self-taught one) do believe the idea of quantum indeterminacy can be a space where divine action may take place. You could also liken the unpredictable nature of the Id to quantum indeterminacy as well. These are ones to think about, because in all reality, they're subjective opinions. I do believe they're interconnected.

In terms of Lutheranism, I'll be short on this one. Please do go read the full history behind Martin Luther and his turbulent relationship with Catholicism. I'm not a Bible thumper, and I actually think this is the first time I've mentioned religion publicly at all. This being said, now I'm actually ready to defend the points on an academic level.

The Id represents hidden psychological forces, quantum physics reveals subatomic mysteries, and Lutheranism emphasizes faith in the unseen God. Okay, so we have the baseline. Now, time for some mental resilience. When I think of mental resilience, the first people I think of are David Goggins and Jocko Willink. I've also enjoyed Dr. Andrew Huberman's podcast.

The idea there is simple: if you understand exactly how to learn, you know your fundamentals well enough to draw them and explain them vividly on a whiteboard, and you can make it a habit, at that point you're ready to work on your mental resilience. Little by little, gradually, how far can you push the bar towards the ceiling?

There's obviously limits. People sometimes get scared when I mention mental resilience, but obviously that's a bit of a catch 22. There are plenty of satirical videos out there, and of course, I don't believe in Goggins or Jocko wholeheartedly. They're just tools in the toolbox when times get tough.

I wish you all well, and I hope this gets you thinking about those people who just insist there is no God or higher being, and think you're stupid for believing there is one. Those people obviously haven't read analysis, in my own opinion.

Have a great night!

17 May 2025

Daniel Lange: Polkitd (Policy Kit Daemon) in Trixie ... getting rid of "Authentication is required to create a color profile"

On the way to Trixie, polkitd (Policy Kit Daemon) has lost the functionality to evaluate its .pkla (Polkit Local Authority) files.
$ zcat /usr/share/doc/polkitd/NEWS.Debian.gz 
policykit-1 (121+compat0.1-2) experimental; urgency=medium
  This version of polkit changes the syntax used for local policy rules:
  it is now the same JavaScript-based format used by the upstream polkit
  project and by other Linux distributions.
  System administrators can override the default security policy by
  installing local policy overrides into /etc/polkit-1/rules.d/*.rules,
  which can either make the policy more restrictive or more
  permissive. Some sample policy rules can be found in the
  /usr/share/doc/polkitd/examples directory. Please see polkit(8) for
  more details.
  Some Debian packages include security policy overrides, typically to
  allow members of the sudo group to carry out limited administrative
  actions without re-authenticating. These packages should install their
  rules as /usr/share/polkit-1/rules.d/*.rules. Typical examples can be
  found in packages like flatpak, network-manager and systemd.
  Older Debian releases used the "local authority" rules format from
  upstream version 0.105 (.pkla files with an .desktop-like syntax,
  installed into subdirectories of /etc/polkit-1/localauthority
  or /var/lib/polkit-1/localauthority). The polkitd-pkla package
  provides compatibility with these files: if it is installed, they
  will be processed at a higher priority than most .rules files. If the
  polkitd-pkla package is removed, .pkla files will no longer be used.
 -- Simon McVittie   Wed, 14 Sep 2022 21:33:22 +0100
This applies now to the polkitd version 126-2 destined for Trixie. The most prominent issue is that you will get an error message: "Authentication is required to create a color profile" asking for the root(!) password every time you remotely log into a Debian Trixie system via RDP, x2go or the like. This used to be mendable with a .pkla file dropped into /etc/polkit-1/localauthority/50-local.d/ ... but these .pkla files are void now and need to be replaced with a JavaScript "rules" file. The background to this is quite a fascinating read ... 13 years later:
https://davidz25.blogspot.com/2012/06/authorization-rules-in-polkit.html
The solution has been listed in DevAnswers as other distros (Fedora, ArchLinux, OpenSuse) have been faster to deprecate the .pkla files and require .rules files. I amended the solution given there with checking for root to be automatically authenticated, too. So, create a 50-color-manager.rules file in /etc/polkit-1/rules.d/:
polkit.addRule(function(action, subject) {
    if (action.id.startsWith("org.freedesktop.color-manager.") && (subject.isInGroup("users") || (subject.user == "root"))) {
        return polkit.Result.YES;
    }
});
and run systemctl restart polkit. You should be good until polkit is rewritten in Rust.
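The color-manager file may not be the only override affected, so it can be worth listing every .pkla file left in the two "local authority" directories that NEWS.Debian names. The following is a minimal Python sketch, not part of polkit: the function name find_pkla_files is made up for illustration, and reading /etc/polkit-1/localauthority may require root.

```python
from pathlib import Path

# The two legacy locations named in polkitd's NEWS.Debian.
PKLA_ROOTS = (
    Path("/etc/polkit-1/localauthority"),
    Path("/var/lib/polkit-1/localauthority"),
)


def find_pkla_files(roots=PKLA_ROOTS):
    """Return a sorted list of all .pkla files below the given directories."""
    found = []
    for root in roots:
        root = Path(root)
        if root.is_dir():  # skip locations that do not exist on this system
            found.extend(p for p in root.rglob("*.pkla") if p.is_file())
    return sorted(found)


if __name__ == "__main__":
    for path in find_pkla_files():
        print(f"needs a .rules replacement: {path}")
```

Each file it prints is a candidate for a hand-written replacement in /etc/polkit-1/rules.d/, unless the polkitd-pkla compatibility package stays installed.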

13 May 2025

Ben Hutchings: Report for Debian BSP near Leuven in April 2025

On 26th and 27th April we held a Debian bug-squashing party near Leuven, Belgium. Several longstanding and new Debian contributors gathered to work through some of the highest priority bugs affecting the upcoming release of Debian 13 "trixie".

We were hosted by the Familia community centre in Tildonk. As this venue currently does not have an Internet connection, we brought a mobile hotspot and a local Debian mirror.

In attendance were:

The new contributors were variously using Arch, Fedora, and Ubuntu, and the DDs spent some time setting them up with Debian development environments.

The bugs we worked on included:

15 April 2025

Russell Coker: What Desktop PCs Need

It seems to me that we haven't had much change in the overall design of desktop PCs since floppy drives were removed, and modern PCs still have bays the size of 5.25" floppy drives despite having nothing modern that can fit in such spaces other than DVD drives (which aren't really modern) and carriers for 4*2.5" drives, both of which most people don't use. We had the PC System Design Guide [1] which was last updated in 2001 and which should have been updated more recently to address some of these issues; the thing that most people will find familiar in that standard is the colours for audio ports. Microsoft developed the Legacy Free PC [2] concept which was a good one. There's a lot of things that could be added to the list of legacy stuff to avoid: TPM 1.2, 5.25" drive bays, inefficient PSUs, hardware that doesn't sleep when idle or which prevents the CPU from sleeping, VGA and DVI ports, ethernet slower than 2.5Gbit, and video that doesn't include HDMI 2.1 or DisplayPort 2.1 for 8K support. There are recently released high-end PCs on sale right now with 1Gbit ethernet as standard and hardly any PCs support resolutions above 4K properly.

Here are some of the things that I think should be in a modern PC System Design Guide.

Power Supply

The power supply is a core part of the computer and its central location dictates the layout of the rest of the PC. GaN PSUs are more power efficient and therefore require less cooling. A 400W USB power supply is about 1/4 the size of a standard PC PSU and doesn't have a cooling fan. A new PC standard should include less space for the PSU except for systems with multiple CPUs or that are designed for multiple GPUs.

A Dell T630 server has an option of a 1600W PSU that is 20*8.5*4cm = 680cc. The typical dimensions of an ATX PSU are 15*8.6*14cm = 1806cc. The SFX (small form factor variant of ATX) PSU is 12.5*6.3*10cm = 787cc. There is a reason for the ATX and SFX PSUs having a much worse ratio of power to size and that is the airflow.
Server class systems are designed for good airflow and can efficiently cool the PSU with less space and they are also designed for uses where people are less concerned about fan noise. But the 680cc used for a 1600W Dell server PSU that predates GaN technology could be used for a modern GaN PSU that supplies the ~600W needed for a modern PC while being quiet. There are several different smaller size PSUs for name-brand PCs (where compatibility with other systems isn't needed) that have been around for ~20 years but there hasn't been a standard so all white-box PC systems have had really large PSUs.

PCs need USB-C PD ports that can charge a laptop etc. There are phones that can draw 80W for fast charging and it's not unreasonable to expect a PC to be able to charge a phone at its maximum speed.

GPUs should have USB-C alternate mode output and support full USB functionality over the cable as well as PD that can power the monitor. Having a monitor with a separate PSU, a HDMI or DP cable to the PC, and a USB cable between PC and monitor is an annoyance. There should be one cable between PC and monitor and then keyboard, mouse, etc should connect to the monitor.

All devices that are connected to a PC should use USB-C for power connection. That includes monitors that are using HDMI or DisplayPort for video, desktop switches, home Wifi APs, printers, and speakers (even when using line-in for the audio signal). The European Commission Common Charger Directive is really good but it only covers portable devices, keyboards, and mice.

Motherboard Features

Latest versions of Wifi and Bluetooth on the motherboard (this is becoming a standard feature).

On motherboard video that supports 8K resolution. An option of a PCIe GPU is a good thing to have but it would be nice if the motherboard had enough video capabilities to satisfy most users.
There are several options for video that have a higher resolution than 4K and making things just work at 8K means that there will be less e-waste in future.

ECC RAM should be a standard feature on all motherboards. Having a single bit error cause a system crash is a MS-DOS thing, we need to move past that.

There should be built in hardware for monitoring the system status that is better than BIOS beeps on boot. Lenovo laptops have a feature for having the BIOS play a tune on a serious error with an Android app to decode the meaning of the tune; we could have a standard for this. For desktop PCs there should be a standard for LCD status displays similar to the ones on servers, this would be cheap if everyone did it.

Case Features

The way the Framework Laptop can be expanded with modules is really good [3]. There should be something similar for PC cases. While you can buy USB devices for these things they are messy and risk getting knocked out of their sockets when moving cables around. While the Framework laptop expansion cards are much more expensive than other devices with similar functions that are aimed at a mass market, if there was a standard for PCs then the devices to fit them would become cheap.

The PC System Design Guide specifies colors for ports (which is good) but not the feel of them. While some ports like Ethernet ports allow someone to feel which way the connector should go, it isn't possible to easily feel which way a HDMI or DisplayPort connector should go. It would be good if there was a standard that required plastic spikes on one side or some other way of feeling which way a connector should go.

GPU Placement

In modern systems it's fairly common to have a high heatsink on the CPU with a fan to blow air in at the front and out the back of the PC. The GPU (which often dissipates twice as much heat as the CPU) has fans blowing air in sideways and not out the back. This gives some sort of compromise between poor cooling and excessive noise.
What we need is to have air blown directly through a GPU heatsink and out of the case. One option for a tower case that needs minimal changes is to have the PCIe slot nearest the bottom of the case used for the GPU and have a grille in the bottom to allow air to go out; the case could have feet to keep it a few cm above the floor or desk. Another possibility is to have a PCIe slot parallel to the rear surface of the case (right angles to the other PCIe slots).

A common case with desktop PCs is to have the GPU use more than half the total power of the PC. The placement of the GPU shouldn't be an afterthought, it should be central to the design.

Is a PCIe card even a good way of installing a GPU? Could we have a standard GPU socket on the motherboard next to the CPU socket and use the same type of heatsink and fan for GPU and CPU?

External Cooling

There are a range of aftermarket cooling devices for laptops that push cool air in the bottom or suck it out the side. We need to have similar options for desktop PCs. I think it would be ideal to have standard attachments for airflow on the front and back of tower PCs. The larger a fan is the slower it can spin to give the same airflow and therefore the less noise it will produce. Instead of just relying on 10cm fans at the front and back of a PC to push air in and suck it out you could have a conical rubber duct connected to a 30cm diameter fan. That would allow quieter fans to do most of the work in pushing air through the PC and also allow the hot air to be directed somewhere suitable. When doing computer work in summer it's not great to have a PC sending 300+W of waste heat into the room you are in. If it could be directed out a window that would be good.

Noise

For restricting noise of PCs we have industrial relations legislation that seems to basically require that workers not be exposed to noise louder than a blender, so if a PC is quieter than that then it's OK.
For name brand PCs there are specs about how much noise is produced but there are usually caveats like "under typical load" or "with a typical feature set" that excuse them from liability if the noise is louder than expected. It doesn't seem possible for someone to own a PC, determine that the noise from it is what is acceptable, and then buy another that is close to the same. We need regulations about this, and the EU seems the best jurisdiction for it as they cover the purchase of a lot of computer equipment that is also sold without change in other countries. The regulations need to also cover updates, for example I have a Dell T630 which is unreasonably loud and Dell support doesn't have much incentive to be particularly helpful about it. BIOS updates routinely tweak things like fan speeds without the developers having an incentive to keep it as quiet as it was when it was sold.

What Else?

Please comment about other things you think should be standard PC features.

Russell Coker: Storage Trends 2025

It's been almost 15 months since I blogged about Storage Trends 2024 [1]. There hasn't been much change in this time (in Australia at least, I'm not tracking prices in other countries). The change was so small I had to check how the Australian dollar has performed against other currencies to see if changes to currencies had countered changes to storage prices, but there has been little overall change when compared to the Chinese Yuan and the Australian dollar is only about 11% worse against the US dollar when compared to a year ago. Generally there's a trend of computer parts decreasing in price by significantly more than 11% per annum.

Small Storage

The cheapest storage device from MSY now is a Patriot P210 128G SATA SSD for $19, cheaper than the $24 last year and the same price as the year before. So over the last 2 years there has been no change to the cheapest storage device on sale. It would almost never make sense to buy that as a 256G SATA SSD (also Patriot P210) is $25 and has twice the lifetime (120TBW vs 60TBW). There are also 256G NVMe devices for $29 and $30 which would be better options if the system has a NVMe socket built in.

The cheapest 500G devices are $42.50 for a 512G SATA SSD and $45 for a 500G NVMe. Last year the prices were $33 for SATA and $36 for NVMe in that size so there's been a significant increase in price there. The difference is enough that if someone was on a tight budget they might reasonably decide to use smaller storage than they might have used last year!

2TB hard drives are still $89, the same price as last year! Last year a 2TB SATA SSD was $118 and a 2TB NVMe was $145, now a 2TB SATA SSD is $157 and a 2TB NVMe is $127. So NVMe has become cheaper than SATA in that segment but overall prices are higher than last year. Again for business use 2TB seems a sensible minimum for most systems if you are paying MSY rates (or similar rates from Amazon etc).

Medium Storage

Last year 4TB HDDs were $135, now they are $148.
Last year the cheapest 4TB SSD was $299, now the cheapest is a $309 NVMe. While the prices have all gone up, the price difference between hard drives and SSDs has decreased in that size range. So for a small server (a lot of home servers and small business servers) 4TB of RAID-1 storage is all that's needed, and for that SSDs are the best option. The price difference between $296 for 4TB of RAID-1 HDDs and $618 for RAID-1 NVMe is small enough to be justified by the benefits of speed and being quiet for most small server uses.

In 2023 an 8TB hard drive cost $179 and an 8TB SSD cost $739. Last year an 8TB hard drive cost $239 and an 8TB SATA SSD cost $899. Now an 8TB HDD costs $229 and MSY doesn't sell 8TB SSDs, but for comparison Amazon has a Samsung 8TB SATA SSD for $919. So for storing 8TB+ there are benefits to hard drives, as SSDs are difficult to get in that size range and more expensive than they were before. It seems that 8TB SSDs aren't used by enough people to have a large market in the home and small office space, so those of us who want the larger storage sizes will have to get second hand enterprise gear. It will probably be another few years before 8TB enterprise SSDs start appearing on the second hand market.

Serious Storage

Last year I wrote about the affordability of U.2 devices. I regret not buying some then, as there are fewer on sale now and prices are higher.

Hard drives still aren't a good choice for most users because most users don't have more than 4TB of data. For large quantities of data hard drives are still a good option, a 22TB disk costs $899. For companies this is a good option for many situations. For home users there is the additional problem of Shingled Magnetic Recording, which has serious performance issues for some uses, and it's very difficult to determine which drives use it.

Conclusion

For corporate purchases the options for serious storage are probably decent.
But for small companies and home users things definitely don't seem to have improved as much as we expect from the computer industry. I had expected 8TB SSDs to go for $450 by now and SSDs less than 500G to not even be sold new any more.

The prices on 8TB SSDs have gone up more in the last 2 years than the ASX 200 (index of the 200 biggest companies in the Australian stock market). I would never recommend using SSDs as an investment, but in retrospect 8TB SSDs could have been a good one.

$20 seems to be about the minimum cost that SSDs approach, while hard drives have a higher minimum price of a bit under $100 because they are larger, heavier, and more fragile. It seems that the market is likely to move to most SSDs being close to $20; if they can make 2TB SSDs cheaply enough to sell for about that price then that would cover the majority of the market.

I've created a table of the prices. I should have done this before, but I initially didn't plan an ongoing series of posts on this topic.
Jun 2020 Apr 2021 Apr 2023 Jan 2024 Apr 2025
128G SSD $49 $19 $24 $19
500G SSD $97 $73 $32 $33 $42.50
2TB HDD $95 $72 $75 $89 $89
2TB SSD $335 $245 $149
4TB HDD $115 $135 $148
4TB SSD $895 $349 $299 $309
8TB HDD $179 $239 $229
8TB SSD $949 $739 $899 $919
10TB HDD $549 $395
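
The RAID-1 comparison and the per-terabyte differences discussed above can be checked with a few lines of arithmetic. The following is only a sketch: the prices are the April 2025 figures quoted in the post (in Australian dollars), while the function names `cost_per_tb` and `raid1_cost` are made up for this illustration.

```python
# April 2025 prices (AUD) as quoted in the post: (capacity in TB, price).
PRICES_2025 = {
    "4TB HDD": (4, 148),
    "4TB NVMe SSD": (4, 309),
    "8TB HDD": (8, 229),
    "8TB SATA SSD (Amazon)": (8, 919),
    "22TB HDD": (22, 899),
}


def cost_per_tb(capacity_tb, price):
    """Price divided by raw capacity."""
    return price / capacity_tb


def raid1_cost(price, mirrors=2):
    """RAID-1 stores every block on each mirror, so N devices buy the
    usable capacity of one device."""
    return price * mirrors


if __name__ == "__main__":
    for name, (capacity, price) in PRICES_2025.items():
        print(f"{name}: ${cost_per_tb(capacity, price):.2f}/TB, "
              f"RAID-1 pair ${raid1_cost(price)}")
```

With these inputs the RAID-1 pair costs come out at $296 for the 4TB hard drives and $618 for the 4TB NVMe devices, matching the figures in the text.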

11 April 2025

Reproducible Builds: Reproducible Builds in March 2025

Welcome to the third report in 2025 from the Reproducible Builds project. Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. As usual, however, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. Debian bookworm live images now fully reproducible from their binary packages
  2. How NixOS and reproducible builds could have detected the xz backdoor
  3. LWN: Fedora change aims for 99% package reproducibility
  4. Python adopts PEP standard for specifying package dependencies
  5. OSS Rebuild real-time validation and tooling improvements
  6. SimpleX Chat server components now reproducible
  7. Three new scholarly papers
  8. Distribution roundup
  9. An overview of Supply Chain Attacks on Linux distributions
  10. diffoscope & strip-nondeterminism
  11. Website updates
  12. Reproducibility testing framework
  13. Upstream patches

Debian bookworm live images now fully reproducible from their binary packages

Roland Clobus announced on our mailing list this month that all the major desktop variants (i.e. GNOME, KDE, etc.) can be reproducibly created for Debian bullseye, bookworm and trixie from their (pre-compiled) binary packages. Building reproducible Debian live images does not require building from reproducible source code, but this is still a remarkable achievement. Some large proportion of the binary packages that comprise these live images can be (and were) built reproducibly, but live image generation works at a higher level. (By contrast, full or end-to-end reproducibility of a bootable OS image will, in time, require both the compile-the-packages and the build-the-bootable-image stages to be reproducible.) Nevertheless, Roland's announcement generated significant congratulations as well as some discussion regarding the finer points of the terms employed: a full outline of the replies can be found here. The news was also picked up by Linux Weekly News (LWN) as well as Hacker News.

How NixOS and reproducible builds could have detected the xz backdoor

Julien Malka aka luj published an in-depth blog post this month with the highly-stimulating title "How NixOS and reproducible builds could have detected the xz backdoor for the benefit of all". Starting with a dive into the relevant technical details of the XZ Utils backdoor, Julien's article goes on to describe how we might avoid the xz "catastrophe" in the future by building software from trusted sources and building trust into untrusted release tarballs by way of comparing sources and leveraging bitwise reproducibility, i.e. applying the practices of Reproducible Builds. The article generated significant discussion on Hacker News as well as on Linux Weekly News (LWN).

LWN: Fedora change aims for 99% package reproducibility

Linux Weekly News (LWN) contributor Joe Brockmeier has published a detailed round-up titled "Fedora change aims for 99% package reproducibility". The article opens by mentioning that although "Debian has been working toward reproducible builds for more than a decade", the Fedora project has now:
progressed far enough that the project is now considering a change proposal for the Fedora 43 development cycle, expected to be released in October, with a goal of making 99% of Fedora's package builds reproducible. So far, reaction to the proposal seems favorable and focused primarily on how to achieve the goal with minimal pain for packagers rather than whether to attempt it.
The Change Proposal itself is worth reading:
Over the last few releases, we [Fedora] changed our build infrastructure to make package builds reproducible. This is enough to reach 90%. The remaining issues need to be fixed in individual packages. After this Change, package builds are expected to be reproducible. Bugs will be filed against packages when an irreproducibility is detected. The goal is to have no fewer than 99% of package builds reproducible.
Further discussion can be found on the Fedora mailing list as well as on Fedora's Discourse instance.

Python adopts PEP standard for specifying package dependencies

Python developer Brett Cannon reported on Fosstodon that PEP 751 was recently accepted. This design document has the purpose of describing "a file format to record Python dependencies for installation reproducibility". As the abstract of the proposal states:
This PEP proposes a new file format for specifying dependencies to enable reproducible installation in a Python environment. The format is designed to be human-readable and machine-generated. Installers consuming the file should be able to calculate what to install without the need for dependency resolution at install-time.
The PEP, which itself supersedes PEP 665, mentions that "there are at least five well-known solutions to this problem in the community".

OSS Rebuild real-time validation and tooling improvements

OSS Rebuild aims to automate rebuilding upstream language packages (e.g. from PyPI, crates.io, npm registries) and publish signed attestations and build definitions for public use. OSS Rebuild is now attempting rebuilds as packages are published, shortening the time to validating rebuilds and publishing attestations. Aman Sharma contributed classifiers and fixes for common sources of non-determinism in JAR packages. Improvements were also made to some of the core tools in the project:
  • timewarp for simulating the registry responses from sometime in the past.
  • proxy for transparent interception and logging of network activity.
  • and stabilize, yet another nondeterminism fixer.

SimpleX Chat server components now reproducible

SimpleX Chat is a privacy-oriented decentralised messaging platform that eliminates user identifiers and metadata, offers end-to-end encryption and has a unique approach to decentralised identity. Starting from version 6.3, SimpleX has implemented reproducible builds for its server components. This advancement allows anyone to verify that the binaries distributed by SimpleX match the source code, improving transparency and trustworthiness.

Three new scholarly papers

Aman Sharma of the KTH Royal Institute of Technology of Stockholm, Sweden published a paper on Build and Runtime Integrity for Java (PDF). The paper's abstract notes that "Software Supply Chain attacks are increasingly threatening the security of software systems" and goes on to compare build- and run-time integrity:
Build-time integrity ensures that the software artifact creation process, from source code to compiled binaries, remains untampered. Runtime integrity, on the other hand, guarantees that the executing application loads and runs only trusted code, preventing dynamic injection of malicious components.
Aman's paper explores solutions to safeguard Java applications and proposes some novel techniques to detect malicious code injection. A full PDF of the paper is available.
In addition, Hamed Okhravi and Nathan Burow of Massachusetts Institute of Technology (MIT) Lincoln Laboratory along with Fred B. Schneider of Cornell University published a paper in the most recent edition of IEEE Security & Privacy on Software Bill of Materials as a Proactive Defense:
The recently mandated software bill of materials (SBOM) is intended to help mitigate software supply-chain risk. We discuss extensions that would enable an SBOM to serve as a basis for making trust assessments thus also serving as a proactive defense.
A full PDF of the paper is available.
Lastly, congratulations to Giacomo Benedetti of the University of Genoa for publishing their PhD thesis. Titled Improving Transparency, Trust, and Automation in the Software Supply Chain, Giacomo's thesis:
addresses three critical aspects of the software supply chain to enhance security: transparency, trust, and automation. First, it investigates transparency as a mechanism to empower developers with accurate and complete insights into the software components integrated into their applications. To this end, the thesis introduces SUNSET and PIP-SBOM, leveraging modeling and SBOMs (Software Bill of Materials) as foundational tools for transparency and security. Second, it examines software trust, focusing on the effectiveness of reproducible builds in major ecosystems and proposing solutions to bolster their adoption. Finally, it emphasizes the role of automation in modern software management, particularly in ensuring user safety and application reliability. This includes developing a tool for automated security testing of GitHub Actions and analyzing the permission models of prominent platforms like GitHub, GitLab, and BitBucket.

Distribution roundup

In Debian this month:
The IzzyOnDroid Android APK repository reached another milestone in March, crossing the 40% coverage mark: specifically, more than 42% of the apps in the repository are now reproducible. Thanks to funding by NLnet/Mobifree, the project was also able to put more time into their tooling. For instance, developers can now easily run their own verification builder in "less than 5 minutes". This currently supports Debian-based systems, but support for RPM-based systems is incoming. Future work is in the pipeline, including documentation, guidelines and helpers for debugging.
Fedora developer Zbigniew Jędrzejewski-Szmek announced a work-in-progress script called fedora-repro-build which attempts to reproduce an existing package within a Koji build environment. Although the project's README file lists a "number of fields [that] will always or almost always vary" (and there is a non-zero list of other known issues), this is an excellent first step towards full Fedora reproducibility (see above for more information).
Lastly, in openSUSE news, Bernhard M. Wiedemann posted another monthly update for his work there.

An overview of Supply Chain Attacks on Linux distributions

Fenrisk, a cybersecurity risk-management company, has published a lengthy overview of Supply Chain Attacks on Linux distributions. Authored by Maxime Rinaudo, the article asks:
[What] would it take to compromise an entire Linux distribution directly through their public infrastructure? Is it possible to perform such a compromise as simple security researchers with no available resources but time?

diffoscope & strip-nondeterminism

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 290, 291, 292 and 293 to Debian:
  • Bug fixes:
    • file(1) version 5.46 now returns XHTML document for .xhtml files such as those found nested within our .epub tests.
    • Also consider .aar files as APK files, at least for the sake of diffoscope.
    • Require the new, upcoming, version of file(1) and update our quine-related testcase.
  • Codebase improvements:
    • Ensure all calls to our_check_output in the ELF comparator have the potential CalledProcessError exception caught.
    • Correct an import masking issue.
    • Add a missing subprocess import.
    • Reformat openssl.py.
    • Update copyright years.
In addition, Ivan Trubach contributed a change to ignore the st_size metadata entry for directories as it is essentially arbitrary and introduces unnecessary or even spurious changes.
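
To illustrate why a directory's st_size is not useful when comparing two build outputs, here is a minimal sketch in Python. It is not diffoscope's actual implementation: the function names `comparable_metadata` and `metadata_differences` are hypothetical, and only the file mode and size are compared to keep the example short.

```python
import os
import stat
import tempfile


def comparable_metadata(path):
    """Collect the stat fields we want to compare for a single path."""
    st = os.lstat(path)
    fields = {"mode": stat.filemode(st.st_mode)}
    # A directory's st_size depends on the filesystem and on the directory's
    # history (for instance how many entries it has held), not on the build
    # result itself, so it is skipped for directories.
    if not stat.S_ISDIR(st.st_mode):
        fields["size"] = st.st_size
    return fields


def metadata_differences(path_a, path_b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    a = comparable_metadata(path_a)
    b = comparable_metadata(path_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as busy, \
            tempfile.TemporaryDirectory() as empty:
        # Fill one directory so that, on many filesystems, its st_size grows.
        for i in range(100):
            open(os.path.join(busy, f"file{i}"), "w").close()
        print(metadata_differences(busy, empty))
```

Regular files still have their sizes compared, so a genuine difference in a build artifact would continue to show up.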

Website updates

Once again, there were a number of improvements made to our website this month, including:

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add links to two related bugs about buildinfos.debian.net.
    • Add an extra sync to the database backup.
    • Overhaul description of what the service is about.
    • Improve the documentation to indicate that we need to fix synchronisation pipes.
    • Improve the statistics page by breaking down output by architecture.
    • Add a copyright statement.
    • Add a space after the package name so one can search for specific packages more easily.
    • Add a script to work around/implement a missing feature of debrebuild.
  • Misc:
    • Run debian-repro-status at the end of the chroot-install tests.
    • Document that we have unused diskspace at Ionos.
In addition:
  • James Addison made a number of changes to the reproduce.debian.net homepage.
  • Jochen Sprickerhof updated the statistics generation to catch "No space left on device" issues.
  • Mattia Rizzolo added a better command to stop the builders and fixed the reStructuredText syntax in the README.infrastructure file.
And finally, node maintenance was performed by Holger Levsen and Mattia Rizzolo.

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Bits from Debian: Bits from the DPL

Dear Debian community,

this is bits from DPL for March (sorry for the delay, I was waiting for some additional input).

Conferences

In March, I attended two conferences, each with a distinct motivation. I joined FOSSASIA to address the imbalance in geographical developer representation. Encouraging more developers from Asia to contribute to Free Software is an important goal for me, and FOSSASIA provided a valuable opportunity to work towards this. I also attended Chemnitzer Linux-Tage, a conference I have been part of for over 20 years. To me, it remains a key gathering for the German Free Software community, a place where contributors meet, collaborate, and exchange ideas.

I have a remark about submitting an event proposal to both FOSDEM and FOSSASIA:

Cross distribution experience exchange
As Debian Project Leader, I have often reflected on how other Free Software distributions address challenges we all face. I am interested in discussing how we can learn from each other to improve our work and better serve our users. Recognizing my limited understanding of other distributions, I aim to bridge this gap through open knowledge exchange. My hope is to foster a constructive dialogue that benefits the broader Free Software ecosystem. Representatives of other distributions are encouraged to participate in this BoF whether as contributors or official co-speakers. My intention is not to drive the discussion from a Debian-centric perspective but to ensure that all distributions have an equal voice in the conversation.
This event proposal was part of my commitment from my 2024 DPL platform, specifically under the section "Reaching Out to Learn". Had it been accepted, I would have also attended FOSDEM. However, both FOSDEM and FOSSASIA rejected the proposal. In hindsight, reaching out to other distribution contributors beforehand might have improved its chances. I may take this approach in the future if a similar opportunity arises. That said, rejecting an interdistribution discussion without any feedback is, in my view, a missed opportunity for collaboration.

FOSSASIA Summit

The 14th FOSSASIA Summit took place in Bangkok. As a leading open-source technology conference in Asia, it brings together developers, startups, and tech enthusiasts to collaborate on projects in AI, cloud computing, IoT, and more. With a strong focus on open innovation, the event features hands-on workshops, keynote speeches, and community-driven discussions, emphasizing open-source software, hardware, and digital freedom. It fosters a diverse, inclusive environment and highlights Asia's growing role in the global FOSS ecosystem.

I presented a talk on Debian as a Global Project and led a packaging workshop. Additionally, to further support attendees interested in packaging, I hosted an extra self-organized workshop at a hacker café, initiated by participants eager to deepen their skills. There was another Debian related talk given by Ananthu titled "The Herculean Task of OS Maintenance - The Debian Way!"

To further my goal of increasing diversity within Debian, particularly by encouraging more non-male contributors, I actively engaged with attendees, seeking opportunities to involve new people in the project. Whether through discussions, mentoring, or hands-on sessions, I aimed to make Debian more approachable for those who might not yet see themselves as contributors.
I was fortunate to have the support of Debian enthusiasts from India and China, who ran the Debian booth and helped create a welcoming environment for these conversations. Strengthening diversity in Free Software is a collective effort, and I hope these interactions will inspire more people to get involved.

Chemnitzer Linuxtage

The Chemnitzer Linux-Tage (CLT) is one of Germany's largest and longest-running community-driven Linux and open-source conferences, held annually in Chemnitz since 2000. It has been my favorite conference in Germany, and I have tried to attend every year. Focusing on Free Software, Linux, and digital sovereignty, CLT offers a mix of expert talks, workshops, and exhibitions, attracting hobbyists, professionals, and businesses alike. With a strong grassroots ethos, it emphasizes hands-on learning, privacy, and open-source advocacy while fostering a welcoming environment for both newcomers and experienced Linux users.

Despite my appreciation for the diverse and high-quality talks at CLT, my main focus was on connecting with people who share the goal of attracting more newcomers to Debian. Engaging with both longtime contributors and potential new participants remains one of the most valuable aspects of the event for me.

I was fortunate to be joined by Debian enthusiasts staffing the Debian booth, where I found myself among both experienced booth volunteers who have attended many previous CLT events and young newcomers. This was particularly reassuring, as I certainly can't answer every detailed question at the booth. I greatly appreciate the knowledgeable people who represent Debian at this event and help make it more accessible to visitors.

As a small point of comparison (while FOSSASIA and CLT are fundamentally different events), the gender ratio stood out. FOSSASIA had a noticeably higher proportion of women compared to Chemnitz. This contrast highlighted the ongoing need to foster more diversity within Free Software communities in Europe.
At CLT, I gave a talk titled "Tausend Freiwillige, ein Ziel" (Thousand Volunteers, One Goal), which was video recorded. It took place in the grand auditorium and attracted a mix of long-term contributors and newcomers, making for an engaging and rewarding experience. Kind regards Andreas.

31 March 2025

Simon Josefsson: On Binary Distribution Rebuilds

I rebuilt (the top-50 popcon) Debian and Ubuntu packages, on amd64 and arm64, and compared the results a couple of months ago. Since then the Reproduce.Debian.net effort has been launched. Unlike my small experiment, that effort is a full-scale rebuild with more architectures. Their goal is to reproduce what is published in the Debian archive.

One difference between these two approaches is the build inputs: the Reproduce Debian effort uses the same build inputs which were used to build the published packages. I'm using the latest version of published packages for the rebuild.

What does that difference imply? I believe reproduce.debian.net will be able to reproduce more of the packages in the archive. If you build a C program using one version of GCC you will get some binary output; and if you use a later GCC version you are likely to end up with a different binary output. This is a good thing: we want GCC to evolve and produce better output over time. However it means that in order to reproduce the binaries we publish and use, we need to rebuild them using whatever build dependencies were used to prepare those binaries. The conclusion is that we need to use the old GCC to rebuild the program, and this appears to be the Reproduce.Debian.Net approach.

It would be a huge success if the Reproduce.Debian.net effort were to reach 100% reproducibility, and this seems to be within reach.

However I argue that we need to go further than that. Being able to rebuild the packages reproducibly using older binary packages only begs the question: can we rebuild those older packages? I fear attempting to do so ultimately leads to a need to rebuild 20+ year old packages, with a non-negligible number of them being illegal to distribute or unable to be built anymore due to bit-rot. We won't solve the Trusting Trust concern if our rebuild effort assumes some initial binary blob that we can no longer build from source code.
I've made an illustration of the effort I'm thinking of, to reach something that is stronger than reproducible rebuilds. I am calling this concept an Idempotent Rebuild, which is an old concept that I believe is the same as what John Gilmore described many years ago.
The illustration shows how the Debian main archive is used as input to rebuild another stage #0 archive. This stage #0 archive can be compared with diffoscope to the main archive, and all differences are things that would be nice to resolve. The packages in the stage #0 archive are used to prepare a new container image with build tools, and the stage #0 archive is used as input to rebuild another version of itself, called the stage #1 archive. The differences between stage #0 and stage #1 are also useful to analyse and resolve. This process can be repeated many times. I believe it would be a useful property if this process terminated at some point, where the stage #N archive was identical to the stage #N-1 archive. If this happens, I label the output archive as an Idempotent Rebuild of the distribution.

How big is N today? The simplest assumption is that it is infinity. Any build timestamp embedded into binary packages will change on every iteration. This will cause the process to never terminate. Fixing embedded timestamps is something that the Reproduce.Debian.Net effort will also run into, and will have to resolve.

What other causes for differences could there be? It is easy to see that generally if some output is not deterministic, such as the sort order of assembler object code in binaries, then the output will be different. Trivial instances of this problem will be caught by the reproduce.debian.net effort as well.

Could there be higher order chains that lead to infinite N? It is easy to imagine the existence of these, but I don't know what they would look like in practice.

An ideal would be if we could get down to N=1. Is that technically possible? Compare building GCC: it performs an initial stage 0 build using the system compiler to produce a stage 1 intermediate, which is used to build itself again to stage 2. Stage 1 and 2 are compared, and on success (identical binaries), the compilation succeeds. Here N=2.
But this is performed using some unknown system compiler that is normally different from the GCC version being built. When rebuilding a binary distribution, you start with the same source versions. So it seems N=1 could be possible.

I'm unhappy to not be able to report any further technical progress now. The next step in this effort is to publish the stage #0 build artifacts in a repository, so they can be used to build stage #1. I already showed that stage #0 was around 30% reproducible compared to the official binaries, but I didn't save the artifacts in a reusable repository. Since the official binaries were not built using the latest versions, it is to be expected that the reproducibility number is low. But what happens at stage #1? The percentage should go up: we are now comparing the rebuilds with an earlier rebuild, using the same build inputs. I'm eager to see this materialize, and hope to eventually make progress on this. However to build stage #1 I believe I need to rebuild a much larger number of packages in stage #0; it could be roughly similar to the build-essentials-depends package set.

I believe the ultimate end goal of Idempotent Rebuilds is to be able to re-bootstrap a binary distribution like Debian from some other bootstrappable environment like Guix. In parallel to working on achieving the 100% Idempotent Rebuild of Debian, we can set up a Guix environment that builds Debian packages using Guix binaries. These builds ought to eventually converge to the same Debian binary packages, or there is something deeply problematic happening. This approach to re-bootstrap a binary distribution like Debian seems simpler than rebuilding all binaries going back to the beginning of time for that distribution.

What do you think?

PS.
I fear that Debian main may have already gone into a state where it is not able to rebuild itself at all anymore: the presence and assumption of non-free firmware and non-Debian signed binaries may have already corrupted the ability for Debian main to rebuild itself. To be able to complete the idempotent and bootstrapped rebuild of Debian, this needs to be worked out.
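
The stage #0, stage #1, ... stage #N process described in this post is a fixed-point iteration, and a toy model may make the role of N easier to see. The sketch below is purely illustrative and is not the author's tooling: the names `rebuild` and `find_fixed_point`, the package names, and the assumption that a compiler's behaviour is determined only by its own source version are all made up for the example.

```python
# Each "binary package" is modelled as a tuple:
#   (source version, source version of the compiler that built it, timestamp)
SOURCES = {"gcc": "gcc-14", "hello": "hello-2.12"}

# The published archive: built some time ago with an older compiler.
MAIN_ARCHIVE = {
    "gcc": ("gcc-14", "gcc-13", None),
    "hello": ("hello-2.12", "gcc-13", None),
}


def rebuild(archive, sources, stage, embed_timestamp=False):
    """Rebuild every source package using the compiler found in `archive`.

    Toy assumption: how a compiler behaves depends only on its source
    version (the first tuple field), not on who compiled it.
    """
    compiler_behaviour = archive["gcc"][0]
    stamp = stage if embed_timestamp else None
    return {name: (src, compiler_behaviour, stamp)
            for name, src in sources.items()}


def find_fixed_point(main_archive, sources, max_stages=10, **options):
    """Return the smallest N with stage #N == stage #N-1, or None."""
    previous = rebuild(main_archive, sources, 0, **options)  # stage #0
    for n in range(1, max_stages + 1):
        current = rebuild(previous, sources, n, **options)   # stage #n
        if current == previous:
            return n
        previous = current
    return None  # did not converge within max_stages


if __name__ == "__main__":
    print("deterministic builds:", find_fixed_point(MAIN_ARCHIVE, SOURCES))
    print("embedded timestamps: ",
          find_fixed_point(MAIN_ARCHIVE, SOURCES, embed_timestamp=True))
```

In this model stage #0 differs from the main archive (it was built with a newer compiler), stage #1 is identical to stage #0, and any embedded per-build timestamp prevents the process from ever terminating, which mirrors the argument made above. Real archives have many more ways to differ, so this says nothing about what N would be in practice.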

30 March 2025

Dirk Eddelbuettel: RcppZiggurat 0.1.8 on CRAN: Build Refinements

A new release 0.1.8 of RcppZiggurat is now on the CRAN network for R, following up on the 0.1.7 release last week which was the first release in four and a half years.

The RcppZiggurat package updates the code for the Ziggurat generator by Marsaglia and others which provides very fast draws from a Normal (or Exponential) distribution. The package provides a simple C++ wrapper class for the generator, improving on the very basic macros, and permits comparison among several existing Ziggurat implementations. This can be seen in the figure where Ziggurat from this package dominates accessing the implementations from the GSL, QuantLib and Gretl, all of which are still way faster than the default Normal generator in R (which is of course of higher code complexity).

This release switches the vignette to the standard trick of premaking it as a pdf and including it in a short Sweave document that imports it via pdfpages; this minimizes build-time dependencies on other TeXLive components. It also incorporates a change contributed by Tomas to rely on the system build of the GSL on Windows as well if Rtools 42 or later is found. No other changes.

The NEWS file entry below lists all changes.

Changes in version 0.1.8 (2025-03-30)
  • The vignette is now premade and rendered as Rnw via pdfpage to minimize the need for TeXLive package at build / install time (Dirk)
  • Windows builds now use the GNU GSL when Rtools is 42 or later (Tomas Kalibera in #25)

Courtesy of my CRANberries, there is a diffstat report relative to the previous release. More detailed information is on the RcppZiggurat page or the GitHub repository.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

28 March 2025

Ian Jackson: Rust is indeed woke

Rust, and resistance to it in some parts of the Linux community, has been in my feed recently. One undercurrent seems to be the notion that Rust is woke (and should therefore be rejected as part of culture wars).

I'm going to argue that Rust, the language, is woke. So the opponents are right, in that sense. Of course, as ever, dissing something for being woke is nasty and fascist-adjacent.

Community

The obvious way that Rust may seem woke is that it has the trappings, and many of the attitudes and outcomes, of a modern, nice, FLOSS community. Rust certainly does better than toxic environments like the Linux kernel, or Debian. This is reflected in a higher proportion of contributors from various kinds of minoritised groups. But Rust is not outstanding in this respect. It certainly has its problems. Many other projects do as well or better.

And this is well-trodden ground. I have something more interesting to say:

Technological values - particularly, compared to C/C++

Rust is woke technology that embodies a woke understanding of what it means to be a programming language.

Ostensible values

Let's start with Rust's strapline:
A language empowering everyone to build reliable and efficient software.
Surprisingly, this motto is not mere marketing puff. For Rustaceans, it is a key goal which strongly influences day-to-day decisions (big and small). Empowering everyone is a key aspect of this, which aligns with my own personal values. In the Rust community, we care about empowerment. We are trying to help liberate our users. And we want to empower everyone because everyone is entitled to technological autonomy. (For a programming language, empowering individuals means empowering their communities, of course.) This is all very airy-fairy, but it has concrete consequences:

Attitude to the programmer's mistakes

In Rust we consider it a key part of our job to help the programmer avoid mistakes; to limit the consequences of mistakes; and to guide programmers in useful directions. If you write a bug in your Rust program, Rust doesn't blame you. Rust asks "how could the compiler have spotted that bug". This is in sharp contrast to C (and C++). C nowadays is an insanely hostile programming environment. A C compiler relentlessly scours your program for any place where you may have violated C's almost incomprehensible rules, so that it can compile your apparently-correct program into a buggy executable. And then the bug is considered your fault. These aren't just attitudes implicitly embodied in the software. They are concrete opinions expressed by compiler authors, and also by language proponents. In other words: Rust sees programmers writing bugs as a systemic problem, which must be addressed by improvements to the environment and the system. The toxic parts of the C and C++ community see bugs as moral failings by individual programmers. Sound familiar?

The ideology of the hardcore programmer

Programming has long suffered from the myth of the "rockstar". Silicon Valley techbro culture loves this notion. In reality, though, modern information systems are far too complicated for a single person. Developing systems is a team sport. Nontechnical, and technical-adjacent, skills are vital: clear but friendly communication; obtaining and incorporating the insights of every member of your team; willingness to be challenged. Community building. Collaboration. Governance. The hardcore C community embraces the rockstar myth: they imagine that a few super-programmers (or super-reviewers) are able to spot bugs, just by being so brilliant. Of course this doesn't actually work at all, as we can see from the atrocious bugfest that is the Linux kernel. These rockstars want us to believe that there is a steep hierarchy in programming; that they are at the top of this hierarchy; and that being nice isn't important. Sound familiar?

Memory safety as a power struggle

Much of the modern crisis of software reliability arises from memory-unsafe programming languages, mostly C and C++. Addressing this is a big job, requiring many changes. This threatens powerful interests; notably, corporations who want to keep shipping junk. (See also, conniptions over the EU Product Liability Directive.) The harms of this serious problem mostly fall on society at large, but the convenience of carrying on as before benefits existing powerful interests. Sound familiar?

Memory safety via Rust as a power struggle

Addressing this problem via Rust is a direct threat to the power of established C programmers such as gatekeepers in the Linux kernel. Supplanting C means they will have to learn new things, and jostle for status against better Rustaceans, or be replaced. More broadly, Rust shows that it is practical to write fast, reliable software, and that this does not need (mythical) "rockstars". So established C programmer experts are existing vested interests, whose power is undermined by (this approach to) tackling this serious problem. Sound familiar?

Notes

This is not a RIIR manifesto

I'm not saying we should rewrite all the world's C in Rust. We should not try to do that. Rust is often a good choice for new code, or when a rewrite or substantial overhaul is needed anyway. But we're going to need other techniques to deal with all of our existing C. CHERI is a very promising approach. Sandboxing, emulation and automatic translation are other possibilities. The problem is a big one and we need a toolkit, not a magic bullet. But as for Linux: it is a scandal that substantial new drivers and subsystems are still being written in C. We could have been using Rust for new code throughout Linux years ago, and avoided very many bugs. Those bugs are doing real harm. This is not OK.

Disclosure

I first learned C from K&R I in 1989. I spent the first three decades of my life as a working programmer writing lots and lots of C. I've written C++ too. I used to consider myself an expert C programmer, but nowadays my C is a bit rusty and out of date. Why is my C rusty? Because I found Rust, and immediately liked and adopted it (despite its many faults). I like Rust because I care that the software I write actually works: I care that my code doesn't do harm in the world.

On the meaning of woke

The original meaning of woke is something much more specific, to do with racism. For the avoidance of doubt, I don't think Rust is particularly antiracist. I'm using woke (like Rust's opponents are) in the much broader, and now much more prevalent, culture wars sense.

Pithy conclusion

If you're a senior developer who knows only C/C++, doesn't want their authority challenged, and doesn't want to have to learn how to write better software, you should hate Rust. Also you should be fired.
Edited 2025-03-28 17:10 UTC to fix minor problems and add a new note about the meaning of the word "woke".



22 March 2025

Dirk Eddelbuettel: RcppZiggurat 0.1.7 on CRAN: New Generators, Many Updates

A new release 0.1.7 of RcppZiggurat is now on the CRAN network for R. This marks the first release in four and a half years. The RcppZiggurat package updates the code for the Ziggurat generator by Marsaglia and others which provides very fast draws from a Normal distribution. The package provides a simple C++ wrapper class for the generator improving on the very basic macros, and permits comparison among several existing Ziggurat implementations. This can be seen in the figure where Ziggurat from this package dominates the implementations accessed from the GSL, QuantLib and Gretl, all of which are still much faster than the default Normal generator in R (which is of course of higher code complexity). This release brings a number of changes. Notably, based on the work we did with the new package zigg (more on that in a second), we now also expose the Exponential generator, and the underlying Uniform generator. Otherwise many aspects of the package have been refreshed: updated builds, updated links, updated CI processes, more use of DOIs and more. The other big news is zigg, which should now be the preference for deployment of Ziggurat due to its much lighter-weight and zero-dependency setup. The NEWS file entry below lists all changes.

Changes in version 0.1.7 (2025-03-22)
  • The CI setup was updated to use run.sh from r-ci (Dirk).
  • The Windows build was updated to GSL 2.7, and UCRT support was added (Jeroen in #16).
  • Manual pages now use JSS DOIs for references per CRAN request
  • README.md links and badges have been updated
  • Continuous integration actions have been updated several times
  • The DESCRIPTION file now uses Authors@R as mandated
  • Use of multiple cores is eased via a new helper function reflecting option mc.cores or architecture defaults, used in tests
  • An inline function has been added to avoid a compiler nag
  • Support for exponential RNG draws zrexp has been added, the internal uniform generator is now also exposed via zruni
  • The vignette bibliography has been updated, and switched to DOIs
  • New package zigg is now mentioned in DESCRIPTION and vignette

Courtesy of my CRANberries, there is a diffstat report relative to the previous release. More detailed information is on the RcppZiggurat page or the GitHub repository.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

10 March 2025

Joachim Breitner: Extrinsic termination proofs for well-founded recursion in Lean

A few months ago I explained that one reason why this blog has become more quiet is that all my work on Lean is covered elsewhere. This post is an exception, because it is an observation that is (arguably) interesting, but does not lead anywhere, so where else to put it than my own blog? Want to share your thoughts about this? Please join the discussion on the Lean community zulip!

Background

When defining a function recursively in Lean that has nested recursion, e.g. a recursive call that is in the argument to a higher-order function like List.map, then extra attention used to be necessary so that Lean can see that xs.map applies its argument only to elements of the list xs. The usual idiom is to write xs.attach.map instead, where List.attach attaches to the list elements a proof that they are in that list. You can read more about this in my Lean blog post on recursive definitions and our new shiny reference manual; look for the example "Nested Recursion in Higher-order Functions". To make this step less tedious I taught Lean to automatically rewrite xs.map to xs.attach.map (where suitable) within the construction of well-founded recursion, so that nested recursion just works (issue #5471). We already do such a rewriting to change if c then … else … to the dependent if h : c then … else …, but the attach-introduction is much more ambitious (the rewrites are not definitionally equal, there are higher-order arguments etc.). Rewriting the terms in a way that we can still prove the connection later when creating the equational lemmas is hairy at best. Also, we want the whole machinery to be extensible by the user, setting up their own higher order functions to add more facts to the context of the termination proof. I implemented it like this (PR #6744) and it ships with 4.18.0, but in the course of this work I thought about a quite different and maybe better way to do this, and well-founded recursion in general:

A simpler fix

Recall that to use WellFounded.fix
WellFounded.fix : (hwf : WellFounded r) (F : (x : α) → ((y : α) → r y x → C y) → C x) (x : α) : C x
we have to rewrite the functorial of the recursive function, which naturally has type
F : ((y : α) → C y) → ((x : α) → C x)
to the one above, where all recursive calls take the termination proof r y x. This is a fairly hairy operation, mangling the type of matcher's motives and whatnot. Things are simpler for recursive definitions using the new partial_fixpoint machinery, where we use Lean.Order.fix
Lean.Order.fix : [CCPO α] (F : α → α) (hmono : monotone F) : α
so the functorial's type is unmodified (here α will be ((x : α) → C x)), and everything else is in the propositional side-condition monotone F. For this predicate we have a syntax-guided compositional tactic, and it's easily extensible, e.g. by
theorem monotone_mapM (f : γ → α → m β) (xs : List α) (hmono : monotone f) :
    monotone (fun x => xs.mapM (f x))
Once given, we don't care about the content of that proof. In particular proving the unfolding theorem only deals with the unmodified F that closely matches the function definition as written by the user. Much simpler!

Isabelle has it easier

Isabelle also supports well-founded recursion, and has great support for nested recursion. And it's much simpler! There, all you have to do to make nested recursion work is to define a congruence lemma of the form, for List.map, something like our List.map_congr_left
List.map_congr_left : (h : ∀ a ∈ l, f a = g a) :
    List.map f l = List.map g l
This is because in Isabelle, too, the termination proof is a side-condition that essentially states "the functorial F calls its argument f only on smaller arguments".

Can we have it easy, too?

I had wished we could do the same in Lean for a while, but that form of congruence lemma just isn't strong enough for us. But maybe there is a way to do it, using an existential to give a witness that F can alternatively be implemented using the more restrictive argument. The following callsOn P F predicate can express that F calls its higher-order argument only on arguments that satisfy the predicate P:
section setup
variable {α : Sort u}
variable {β : α → Sort v}
variable {γ : Sort w}
def callsOn (P : α → Prop) (F : (∀ y, β y) → γ) :=
  ∃ (F': (∀ y, P y → β y) → γ), ∀ f, F' (fun y _ => f y) = F f
variable (R : α → α → Prop)
variable (F : (∀ y, β y) → (∀ x, β x))
local infix:50 " ≺ " => R
def recursesVia : Prop := ∀ x, callsOn (· ≺ x) (fun f => F f x)
noncomputable def fix (wf : WellFounded R) (h : recursesVia R F) : (∀ x, β x) :=
  wf.fix (fun x => (h x).choose)
def fix_eq (wf : WellFounded R) h x :
    fix R F wf h x = F (fix R F wf h) x := by
  unfold fix
  rw [wf.fix_eq]
  apply (h x).choose_spec
This allows nice compositional lemmas to discharge callsOn predicates:
theorem callsOn_base (y : α) (hy : P y) :
    callsOn P (fun (f : ∀ x, β x) => f y) := by
  exists fun f => f y hy
  intros; rfl
@[simp]
theorem callsOn_const (x : γ) :
    callsOn P (fun (_ : ∀ x, β x) => x) :=
  ⟨fun _ => x, fun _ => rfl⟩
theorem callsOn_app
    {γ₁ : Sort uu} {γ₂ : Sort ww}
    (F₁ :  (∀ y, β y) → γ₂ → γ₁) -- can this also support dependent types?
    (F₂ :  (∀ y, β y) → γ₂)
    (h₁ : callsOn P F₁)
    (h₂ : callsOn P F₂) :
    callsOn P (fun f => F₁ f (F₂ f)) := by
  obtain ⟨F₁', h₁⟩ := h₁
  obtain ⟨F₂', h₂⟩ := h₂
  exists (fun f => F₁' f (F₂' f))
  intros; simp_all
theorem callsOn_lam
    {γ₁ : Sort uu}
    (F : γ₁ → (∀ y, β y) → γ) -- can this also support dependent types?
    (h : ∀ x, callsOn P (F x)) :
    callsOn P (fun f x => F x f) := by
  exists (fun f x => (h x).choose f)
  intro f
  ext x
  apply (h x).choose_spec
theorem callsOn_app2
    {γ₁ : Sort uu} {γ₂ : Sort ww}
    (g : γ₁ → γ₂ → γ)
    (F₁ :  (∀ y, β y) → γ₁) -- can this also support dependent types?
    (F₂ :  (∀ y, β y) → γ₂)
    (h₁ : callsOn P F₁)
    (h₂ : callsOn P F₂) :
    callsOn P (fun f => g (F₁ f) (F₂ f)) := by
  apply_rules [callsOn_app, callsOn_const]
With this setup, we can have the following, possibly user-defined, lemma expressing that List.map calls its arguments only on elements of the list:
theorem callsOn_map (δ : Type uu) (γ : Type ww)
    (P : α → Prop) (F : (∀ y, β y) → δ → γ) (xs : List δ)
    (h : ∀ x, x ∈ xs → callsOn P (fun f => F f x)) :
    callsOn P (fun f => xs.map (fun x => F f x)) := by
  suffices callsOn P (fun f => xs.attach.map (fun ⟨x, h⟩ => F f x)) by
    simpa
  apply callsOn_app
  · apply callsOn_app
    · apply callsOn_const
    · apply callsOn_lam
      intro ⟨x', hx'⟩
      dsimp
      exact (h x' hx')
  · apply callsOn_const
end setup
So here is the (manual) construction of a nested map for trees:
section examples
structure Tree (α : Type u) where
  val : α
  cs : List (Tree α)
-- essentially
-- def Tree.map (f : α → β) : Tree α → Tree β :=
--   fun t => ⟨f t.val, t.cs.map Tree.map⟩
noncomputable def Tree.map (f : α → β) : Tree α → Tree β :=
  fix (sizeOf · < sizeOf ·) (fun map t => ⟨f t.val, t.cs.map map⟩)
    (InvImage.wf (sizeOf ·) WellFoundedRelation.wf) <| by
  intro ⟨v, cs⟩
  dsimp only
  apply callsOn_app2
  · apply callsOn_const
  · apply callsOn_map
    intro t' ht'
    apply callsOn_base
    -- ht' : t' ∈ cs -- !
    -- ⊢ sizeOf t' < sizeOf { val := v, cs := cs }
    decreasing_trivial
end examples
This makes me happy! All details of the construction are now contained in a proof that can proceed by a syntax-driven tactic and that's easily (and likely robustly) extensible by the user. It also means that we can share a lot of code paths (e.g. everything related to equational theorems) between well-founded recursion and partial_fixpoint. I wonder if this construction is really as powerful as our current one, or if there are certain (likely dependently typed) functions where this doesn't fit, but the above is dependent, so it looks good. With this construction, functions defined by well-founded recursion will reduce even worse in the kernel, I assume. This may be a good thing.
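
For readers who do not read Lean, here is a loose, informal Python analogue of the idea behind callsOn. It is only a sketch: Python has no proofs, so the predicate is checked dynamically for one input rather than established for all inputs, and the names functorial, calls_on, fix and size are hypothetical helpers invented for this illustration, not part of Lean or of the construction above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Tree:
    val: int
    cs: List["Tree"] = field(default_factory=list)


def size(t: Tree) -> int:
    """Number of nodes; plays the role of sizeOf in the Lean example."""
    return 1 + sum(size(c) for c in t.cs)


def functorial(f: Callable[[int], int]):
    """The 'F' of Tree.map: the recursive call is passed in as `rec`."""
    def F(rec, t: Tree) -> Tree:
        return Tree(f(t.val), [rec(c) for c in t.cs])
    return F


def calls_on(P, F, t: Tree) -> bool:
    """Run F once at t with a probing `rec` and check that every
    argument it was called on satisfies P (a runtime check, not a proof)."""
    seen = []

    def probe(y: Tree) -> Tree:
        seen.append(y)
        return y  # placeholder result; only the arguments matter here

    F(probe, t)
    return all(P(y) for y in seen)


def fix(F):
    """Tie the knot: feed the function back to its own functorial."""
    def rec(t: Tree) -> Tree:
        return F(rec, t)
    return rec


t = Tree(1, [Tree(2), Tree(3, [Tree(4)])])
F = functorial(lambda v: v * 10)

# F only recurses on strictly smaller trees, the analogue of `· ≺ x`.
smaller_only = calls_on(lambda y: size(y) < size(t), F, t)
mapped = fix(F)(t)
```

The functorial F is left unmodified here, which mirrors the point of the construction: the fact that it only recurses on smaller arguments is stated (in Lean, proved) separately from the definition.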

The cake is a lie

What unfortunately kills this idea, though, is the generation of the functional induction principles, which I believe is not (easily) possible with this construction: The functional induction principle is proved by massaging F to return a proof, but since the extra assumptions (e.g. for ite or List.map) only exist in the termination proof, they are not available in F. Oh wey, how anticlimactic.

PS: Path dependencies

Curiously, if we didn't have functional induction at this point yet, then very likely I'd change Lean to use this construction, and then we'd either not get functional induction, or it would be implemented very differently, maybe a more syntactic approach that would re-prove termination. I guess that's called path dependence.

9 March 2025

Lisandro Damián Nicanor Pérez Meyer: Bahía Blanca floods - Mother nature says: no Nuremberg for you today

Update 20250309 13:20-03:00 - How to help

A friend of mine living in the USA sent me this link to help the flood victims: Support Bahía Blanca (Argentina) Flood Victims

Original blog post

This is not good news. In fact, quite the contrary. Compared to the real issue, the fact that I'm not able to attend Embedded World at Nuremberg is, well, a detail. Or at least that's what I'm forcing myself to believe, as I REALLY wanted to be there. But mother nature said otherwise.

Park "Dr. Alberto Martinelli", Las Cañitas, Bahía Blanca (Google Maps)

Bahía Blanca, the city I live in, has received a lot of rainfall. Really, a lot. Let me introduce the numbers like this: the previous highest recorded measurement was 170mm (6.69 inch)... in a month. Yesterday, Friday 07, we had more than 400mm (15.75 inch) in 9 hours. But those are just numbers. Some things are better seen in images. I'll start with some soft ones.

Sink in Fournier street near Cambaceres (Google Maps)

I also happen to do figure skating at the same school as the four-time world champions (where "world" means the whole world), the Roller Dreams precision skating team (Instagram), from Club El Nacional. Our skating rink got severely damaged by the hail we had about 3 weeks ago (yes, we had hail too!!!). Now it's just impossible:

Roller Dreams CEN skating rink

The "real" thing

Let's get to the heavy, heartbreaking part. I did go to downtown Bahía Blanca, but at night, so let me share some links, most of them in Spanish, but images are images: My alma mater, Universidad Nacional del Sur, lost its main library, a great part of the Physics department and a lot of labs :-( A nearby town, General Cerri, had even worse luck. Bahía Blanca, a city of 300k+ people, has around 400 evacuated people. General Cerri, a town of 3000? people, had at least 800.

Bahía Blanca, devil's land

Every place has its legends. We do too. This land was called "Huecuvú Mapú", something like "Devil's land", by the original inhabitants of the zone, due to its harsh climate: strong winters and hot summers, coupled with fierce wind. But back in 1855 the Cacique (chief) José María Bulnes Yanquetruz had a peace agreement with commander Nicanor Otamendi. But a battle ensued, which Yanquetruz won. At this point history differs depending upon who tells it. Some say Yanquetruz was assigned a military grade as Captain of the indigenous auxiliary forces and provided a military suit, some say he stole it, some say this was a setup by another chief wanting to disrupt peace. But what is known is that Yanquetruz was killed, and his wife, the "machi" (sorceress), issued a curse over the land that would last 1000 years, and the curse was on the climate.

Aftermath

No, we are not there yet. This has just happened. The third violent climate occurrence in 15 months. The city needs to mourn and start healing itself. Time will tell.

6 March 2025

Russell Coker: 8k Video Cards

I previously blogged about getting an 8K TV [1]. Now I'm working on getting 8K video out for a computer that talks to it. I borrowed an NVidia RTX A2000 card which according to its specs can do 8K [2] with a mini-DisplayPort to HDMI cable rated at 8K, but on both Windows and Linux the two highest resolutions on offer are 3840*2160 (regular 4K) and 4096*2160, which is strange and not useful. The various documents on the A2000 differ on whether it has DisplayPort version 1.4 or 1.4a. According to the DisplayPort Wikipedia page [3] both versions 1.4 and 1.4a have a maximum of HBR3 speed and the difference is what version of DSC (Display Stream Compression [4]) is in use. DSC apparently causes no noticeable loss of quality for movies or games but apparently can be bad for text. According to the DisplayPort Wikipedia page version 1.4 can do 8K uncompressed at 30Hz or 24Hz with high dynamic range. So this should be able to work. My theories as to why it doesn't work are:

To get some more input on this issue I posted on Lemmy, here is the Lemmy post [5]. I signed up to lemmy.ml because it was the first one I found that seemed reasonable and was giving away free accounts. I haven't tried any others and can't review it but it seems to work well enough and it's free. It's described as "A community of privacy and FOSS enthusiasts, run by Lemmy's developers", which is positive. I recommend that everyone who's into FOSS create an account there or on some other Lemmy server.

My Lemmy post was about what video cards to buy. I was looking at the Gigabyte RX 6400 Eagle 4G as a cheap card from a local store that does 8K. It also does DisplayPort 1.4 so might have the same issues, and apparently FOSS drivers don't support 8K on HDMI because the people who manage HDMI specs are jerks. It's a $200 card at MSY and a bit less on ebay so it's an amount I can afford to risk on a product that might not do what I want, but it seems to have a high probability of getting the same result.

The NVidia cards have the option of proprietary drivers which allow using HDMI, and there are cards with DisplayPort 1.4 (which can do 8K@30Hz) and HDMI 2.1 (which can do 8K@50Hz). So HDMI is a better option for some cards just based on card output and has the additional benefit of not needing DisplayPort to HDMI conversion. The best option apparently is the Intel cards, which do DisplayPort internally and convert to HDMI in hardware, which avoids the issue of FOSS drivers for HDMI at 8K. The Intel Arc B580 has nice specs [6]: HDMI 2.1a and DisplayPort 2.1 output, 12G of RAM, and it is faster than low end cards like the RX 6400. But the local computer store price is $470 and the ebay price is a bit over $400. If it turns out to not do what I need it still will be a long way from the worst way I've wasted money on computer gear. But I'm still hesitating about this. Any suggestions?
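
The claim that DisplayPort 1.4 can carry uncompressed 8K at 30Hz, or at 24Hz with high dynamic range, can be sanity-checked with some rough arithmetic. The sketch below assumes HBR3 runs at 8.1 Gbit/s per lane over 4 lanes with 8b/10b line coding, takes 8K to mean 7680*4320, and counts only active pixels; blanking intervals are ignored, so real requirements are somewhat higher than these figures. The function names are made up for this illustration.

```python
# Rough DisplayPort HBR3 bandwidth check (active pixels only).
HBR3_LANE_GBPS = 8.1          # raw line rate per lane
LANES = 4
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding


def link_payload_gbps() -> float:
    """Usable payload rate of a 4-lane HBR3 link in Gbit/s."""
    return HBR3_LANE_GBPS * LANES * ENCODING_EFFICIENCY


def video_gbps(width: int, height: int, refresh_hz: int, bits_per_pixel: int) -> float:
    """Uncompressed video data rate in Gbit/s, ignoring blanking (assumption)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def fits(width: int, height: int, refresh_hz: int, bits_per_pixel: int) -> bool:
    """True if the stream fits in the HBR3 payload rate under these assumptions."""
    return video_gbps(width, height, refresh_hz, bits_per_pixel) <= link_payload_gbps()


for hz, bpp in [(30, 24), (24, 30), (30, 30), (60, 24)]:
    rate = video_gbps(7680, 4320, hz, bpp)
    verdict = "fits" if fits(7680, 4320, hz, bpp) else "does not fit"
    print(f"8K@{hz}Hz, {bpp} bits/pixel: {rate:.2f} Gbit/s, {verdict}")
```

Under these assumptions 8K@30Hz at 24 bits per pixel and 8K@24Hz at 30 bits per pixel both need about 23.9 Gbit/s, just under the 25.92 Gbit/s payload rate, while 8K@60Hz needs roughly twice what the link can carry without DSC.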

5 March 2025

Dima Kogan: Shop scheduling with PuLP

I recently used the PuLP modeler to solve a work scheduling problem to assign workers to shifts. Here are notes about doing that. This is a common use case, but isn't explicitly covered in the case studies in the PuLP documentation. Here's the problem: The tool is supposed to allocate workers to the shifts to try to cover all the shifts, give everybody work, and try to match their preferences. I implemented the tool:
#!/usr/bin/python3
import sys
import os
import re
def report_solution_to_console(vars):
    # Print the schedule grouped by day, then by shift within each day
    for w in days_of_week:
        annotation = ''
        if human_annotate is not None:
            for s in shifts.keys():
                m = re.match(rf'{w} - ', s)
                if not m: continue
                if vars[human_annotate][s].value():
                    annotation = f" ({human_annotate} SCHEDULED)"
                    break
            if not len(annotation):
                annotation = f" ({human_annotate} OFF)"
        print(f"{w}{annotation}")
        for s in shifts.keys():
            m = re.match(rf'{w} - ', s)
            if not m: continue
            annotation = ''
            if human_annotate is not None:
                annotation = f" ({human_annotate} {shifts[s][human_annotate]})"
            print(f"    ---- {s[m.end():]}{annotation}")
            for h in humans:
                if vars[h][s].value():
                    print(f"         {h} ({shifts[s][h]})")
def report_solution_summary_to_console(vars):
    # Print each person's benefit and how many shifts of each availability they got
    print("\nSUMMARY")
    for h in humans:
        print(f"-- {h}")
        print(f"   benefit: {benefits[h].value():.3f}")
        counts = dict()
        for a in availabilities:
            counts[a] = 0
        for s in shifts.keys():
            if vars[h][s].value():
                counts[shifts[s][h]] += 1
        for a in availabilities:
            print(f"   {counts[a]} {a}")
human_annotate = None
days_of_week = ('SUNDAY',
                'MONDAY',
                'TUESDAY',
                'WEDNESDAY',
                'THURSDAY',
                'FRIDAY',
                'SATURDAY')
humans = ['ALICE', 'BOB',
          'CAROL', 'DAVID', 'EVE', 'FRANK', 'GRACE', 'HEIDI', 'IVAN', 'JUDY']
shifts = {'SUNDAY - SANDING 9:00 AM - 4:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'DISFAVORED',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'JUDY':  'NEUTRAL'},
          'WEDNESDAY - SAWING 7:30 AM - 2:30 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'EVE':   'REFUSED',
           'JUDY':  'REFUSED'},
          'THURSDAY - SANDING 9:00 AM - 4:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'PREFERRED',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED'},
          'SATURDAY - SAWING 7:30 AM - 2:30 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'FRANK': 'PREFERRED',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'EVE':   'REFUSED',
           'JUDY':  'REFUSED',
           'GRACE': 'REFUSED'},
          'SUNDAY - SAWING 9:00 AM - 4:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED'},
          'MONDAY - SAWING 9:00 AM - 4:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'PREFERRED',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED'},
          'TUESDAY - SAWING 9:00 AM - 4:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'NEUTRAL',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED'},
          'WEDNESDAY - PAINTING 7:30 AM - 2:30 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'EVE':   'REFUSED',
           'JUDY':  'REFUSED',
           'DAVID': 'REFUSED'},
          'THURSDAY - SAWING 9:00 AM - 4:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'PREFERRED',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED'},
          'FRIDAY - SAWING 9:00 AM - 4:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'DAVID': 'PREFERRED',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'PREFERRED',
           'IVAN':  'PREFERRED',
           'JUDY':  'DISFAVORED',
           'HEIDI': 'REFUSED'},
          'SATURDAY - PAINTING 7:30 AM - 2:30 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'FRANK': 'PREFERRED',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'EVE':   'REFUSED',
           'JUDY':  'REFUSED',
           'GRACE': 'REFUSED',
           'DAVID': 'REFUSED'},
          'SUNDAY - PAINTING 9:45 AM - 4:45 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'MONDAY - PAINTING 9:45 AM - 4:45 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'PREFERRED',
           'IVAN':  'PREFERRED',
           'JUDY':  'NEUTRAL',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'TUESDAY - PAINTING 9:45 AM - 4:45 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'NEUTRAL',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'WEDNESDAY - SANDING 9:45 AM - 4:45 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'DAVID': 'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'JUDY':  'NEUTRAL',
           'EVE':   'REFUSED'},
          'THURSDAY - PAINTING 9:45 AM - 4:45 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'NEUTRAL',
           'IVAN':  'PREFERRED',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'FRIDAY - PAINTING 9:45 AM - 4:45 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'PREFERRED',
           'FRANK': 'PREFERRED',
           'GRACE': 'PREFERRED',
           'IVAN':  'PREFERRED',
           'JUDY':  'DISFAVORED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'SATURDAY - SANDING 9:45 AM - 4:45 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'DAVID': 'PREFERRED',
           'FRANK': 'PREFERRED',
           'HEIDI': 'DISFAVORED',
           'IVAN':  'PREFERRED',
           'EVE':   'REFUSED',
           'JUDY':  'REFUSED',
           'GRACE': 'REFUSED'},
          'SUNDAY - PAINTING 11:00 AM - 6:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'PREFERRED',
           'IVAN':  'NEUTRAL',
           'JUDY':  'NEUTRAL',
           'DAVID': 'REFUSED'},
          'MONDAY - PAINTING 12:00 PM - 7:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'PREFERRED',
           'IVAN':  'NEUTRAL',
           'JUDY':  'NEUTRAL',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'TUESDAY - PAINTING 12:00 PM - 7:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'NEUTRAL',
           'HEIDI': 'REFUSED',
           'JUDY':  'REFUSED',
           'DAVID': 'REFUSED'},
          'WEDNESDAY - PAINTING 12:00 PM - 7:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'NEUTRAL',
           'JUDY':  'PREFERRED',
           'EVE':   'REFUSED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'THURSDAY - PAINTING 12:00 PM - 7:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'NEUTRAL',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED'},
          'FRIDAY - PAINTING 12:00 PM - 7:00 PM':
          {'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'NEUTRAL',
           'JUDY':  'DISFAVORED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED' ,
          'SATURDAY - PAINTING 12:00 PM - 7:00 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'NEUTRAL',
           'FRANK': 'NEUTRAL',
           'IVAN':  'NEUTRAL',
           'JUDY':  'DISFAVORED',
           'EVE':   'REFUSED',
           'HEIDI': 'REFUSED',
           'GRACE': 'REFUSED',
           'DAVID': 'REFUSED' ,
          'SUNDAY - SAWING 12:00 PM - 7:00 PM':
           'ALICE': 'PREFERRED',
           'BOB':   'PREFERRED',
           'CAROL': 'NEUTRAL',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'NEUTRAL',
           'JUDY':  'PREFERRED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED' ,
          'MONDAY - SAWING 2:00 PM - 9:00 PM':
           'ALICE': 'PREFERRED',
           'BOB':   'PREFERRED',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED' ,
          'TUESDAY - SAWING 2:00 PM - 9:00 PM':
           'ALICE': 'PREFERRED',
           'BOB':   'PREFERRED',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'HEIDI': 'REFUSED',
           'JUDY':  'REFUSED',
           'DAVID': 'REFUSED' ,
          'WEDNESDAY - SAWING 2:00 PM - 9:00 PM':
           'ALICE': 'PREFERRED',
           'BOB':   'PREFERRED',
           'CAROL': 'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'EVE':   'REFUSED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED' ,
          'THURSDAY - SAWING 2:00 PM - 9:00 PM':
           'ALICE': 'PREFERRED',
           'BOB':   'PREFERRED',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'HEIDI': 'REFUSED',
           'DAVID': 'REFUSED' ,
          'FRIDAY - SAWING 2:00 PM - 9:00 PM':
           'ALICE': 'PREFERRED',
           'BOB':   'PREFERRED',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'HEIDI': 'REFUSED',
           'JUDY':  'REFUSED',
           'DAVID': 'REFUSED' ,
          'SATURDAY - SAWING 2:00 PM - 9:00 PM':
           'ALICE': 'PREFERRED',
           'BOB':   'PREFERRED',
           'CAROL': 'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'EVE':   'REFUSED',
           'HEIDI': 'REFUSED',
           'GRACE': 'REFUSED',
           'DAVID': 'REFUSED' ,
          'SUNDAY - PAINTING 12:15 PM - 7:15 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'PREFERRED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'NEUTRAL',
           'DAVID': 'REFUSED' ,
          'MONDAY - PAINTING 2:00 PM - 9:00 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'DAVID': 'REFUSED' ,
          'TUESDAY - PAINTING 2:00 PM - 9:00 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'REFUSED',
           'DAVID': 'REFUSED' ,
          'WEDNESDAY - PAINTING 2:00 PM - 9:00 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'EVE':   'REFUSED',
           'DAVID': 'REFUSED' ,
          'THURSDAY - PAINTING 2:00 PM - 9:00 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'DAVID': 'REFUSED' ,
          'FRIDAY - PAINTING 2:00 PM - 9:00 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'DISFAVORED',
           'EVE':   'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'GRACE': 'NEUTRAL',
           'HEIDI': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'REFUSED',
           'DAVID': 'REFUSED' ,
          'SATURDAY - PAINTING 2:00 PM - 9:00 PM':
           'ALICE': 'NEUTRAL',
           'BOB':   'NEUTRAL',
           'CAROL': 'DISFAVORED',
           'FRANK': 'NEUTRAL',
           'HEIDI': 'NEUTRAL',
           'IVAN':  'DISFAVORED',
           'JUDY':  'DISFAVORED',
           'EVE':   'REFUSED',
           'GRACE': 'REFUSED',
           'DAVID': 'REFUSED' 
availabilities = ['PREFERRED', 'NEUTRAL', 'DISFAVORED']
import pulp
prob = pulp.LpProblem("Scheduling", pulp.LpMaximize)
vars = pulp.LpVariable.dicts("Assignments",
                             (humans, shifts.keys()),
                             None,None, # bounds; unused, since these are binary variables
                             pulp.LpBinary)
# Everyone works at least 2 shifts
Nshifts_min = 2
for h in humans:
    prob += (
        pulp.lpSum([vars[h][s] for s in shifts.keys()]) >= Nshifts_min,
        f"{h} works at least {Nshifts_min} shifts",
    )
# each shift is ~ 8 hours, so I limit everyone to 40/8 = 5 shifts
Nshifts_max = 5
for h in humans:
    prob += (
        pulp.lpSum([vars[h][s] for s in shifts.keys()]) <= Nshifts_max,
        f"{h} works at most {Nshifts_max} shifts",
    )
# all shifts staffed and not double-staffed
for s in shifts.keys():
    prob += (
        pulp.lpSum([vars[h][s] for h in humans]) == 1,
        f"{s} is staffed",
    )
# each human can work at most one shift on any given day
for w in days_of_week:
    for h in humans:
        prob += (
            pulp.lpSum([vars[h][s] for s in shifts.keys() if re.match(rf'{w} ',s)]) <= 1,
            f"{h} cannot be double-booked on {w}"
        )
#### Some explicit constraints; as an example
# DAVID can't work any PAINTING shift and is off on Thu and Sun
h = 'DAVID'
prob += (
    pulp.lpSum([vars[h][s] for s in shifts.keys() if re.search(r'- PAINTING',s)]) == 0,
    f"{h} can't work any PAINTING shift"
)
prob += (
    pulp.lpSum([vars[h][s] for s in shifts.keys() if re.match(r'THURSDAY|SUNDAY',s)]) == 0,
    f"{h} is off on Thursday and Sunday"
)
# Do not assign any "REFUSED" shifts
for s in shifts.keys():
    for h in humans:
        if shifts[s][h] == 'REFUSED':
            prob += (
                vars[h][s] == 0,
                f"{h} is not available for {s}"
            )
# Objective. I try to maximize the "happiness". Each human sees each shift as
# one of:
#
#   PREFERRED
#   NEUTRAL
#   DISFAVORED
#   REFUSED
#
# I set a hard constraint to handle "REFUSED", and arbitrarily, I set these
# benefit values for the others
benefit_availability = dict()
benefit_availability['PREFERRED']  = 3
benefit_availability['NEUTRAL']    = 2
benefit_availability['DISFAVORED'] = 1
# Not used, since this is a hard constraint. But the code needs this to be a
# part of the benefit. I can ignore these in the code, but let's keep this
# simple
benefit_availability['REFUSED' ] = -1000
benefits = dict()
for h in humans:
    benefits[h] = \
        pulp.lpSum([vars[h][s] * benefit_availability[shifts[s][h]] \
                    for s in shifts.keys()])
benefit_total = \
    pulp.lpSum([benefits[h] \
                for h in humans])
prob += (
    benefit_total,
    "happiness",
)
prob.solve()
if pulp.LpStatus[prob.status] == "Optimal":
    report_solution_to_console(vars)
    report_solution_summary_to_console(vars)
The set of workers is in the humans variable, and the shift schedule and the workers' preferences are encoded in the shifts dict. The problem is defined by a vars dict of dicts, each entry a boolean variable indicating whether a particular worker is scheduled for a particular shift. We define a set of constraints on these worker allocations to restrict ourselves to valid solutions. And among these valid solutions, we try to find the one that maximizes some benefit function, defined here as:
benefit_availability = dict()
benefit_availability['PREFERRED']  = 3
benefit_availability['NEUTRAL']    = 2
benefit_availability['DISFAVORED'] = 1
benefits = dict()
for h in humans:
    benefits[h] = \
        pulp.lpSum([vars[h][s] * benefit_availability[shifts[s][h]] \
                    for s in shifts.keys()])
benefit_total = \
    pulp.lpSum([benefits[h] \
                for h in humans])
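As a quick sanity check on these weights, the scoring can be reproduced without a solver. This is only a sketch: the score helper, the two toy schedules and the shift count are made up for this illustration and are not part of the original program.

```python
# Benefit values copied from the program above
benefit_availability = {'PREFERRED': 3, 'NEUTRAL': 2, 'DISFAVORED': 1}

def score(preferences):
    """Total benefit of a list of assigned-shift preferences (hypothetical helper).

    A day off is a day with no entry in the list, so it contributes 0."""
    return sum(benefit_availability[p] for p in preferences)

# Two hypothetical two-day schedules for one worker
neutral_plus_day_off = ['NEUTRAL']                   # works one day, off the other
two_disfavored       = ['DISFAVORED', 'DISFAVORED']  # works both days

print(score(neutral_plus_day_off), score(two_disfavored))

# Upper bound on the total if every shift were somebody's PREFERRED shift
Nshifts_total = 10  # made-up shift count, for illustration only
print(score(['PREFERRED'] * Nshifts_total))
```

The first line of output shows the two schedules scoring the same, which is the kind of trade-off the weights encode. Changing a single weight and rerunning is a cheap way to see what the solver will be indifferent to.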
So for instance each shift that was scheduled as somebody's PREFERRED shift gives us 3 benefit points. And if all the shifts ended up being PREFERRED, we'd have a total benefit value of 3*Nshifts. This is impossible, however, because that would violate some constraints in the problem. The exact trade-off between the different preferences is set in the benefit_availability dict. With the above numbers, it's equally good for somebody to have a NEUTRAL shift and a day off (2 + 0 points) as it is for them to have two DISFAVORED shifts (1 + 1 points). If we really want to encourage the program to work people as much as possible (days off discouraged), we'd want to raise the DISFAVORED benefit value. I run this program and I get:
....
Result - Optimal solution found
Objective value:                108.00000000
Enumerated nodes:               0
Total iterations:               0
Time (CPU seconds):             0.01
Time (Wallclock seconds):       0.01
Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.02   (Wallclock seconds):       0.02
SUNDAY
    ---- SANDING 9:00 AM - 4:00 PM
         EVE (PREFERRED)
    ---- SAWING 9:00 AM - 4:00 PM
         IVAN (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM
         FRANK (PREFERRED)
    ---- PAINTING 11:00 AM - 6:00 PM
         HEIDI (PREFERRED)
    ---- SAWING 12:00 PM - 7:00 PM
         ALICE (PREFERRED)
    ---- PAINTING 12:15 PM - 7:15 PM
         CAROL (PREFERRED)
MONDAY
    ---- SAWING 9:00 AM - 4:00 PM
         DAVID (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM
         IVAN (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM
         GRACE (PREFERRED)
    ---- SAWING 2:00 PM - 9:00 PM
         ALICE (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM
         HEIDI (NEUTRAL)
TUESDAY
    ---- SAWING 9:00 AM - 4:00 PM
         DAVID (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM
         EVE (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM
         FRANK (NEUTRAL)
    ---- SAWING 2:00 PM - 9:00 PM
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM
         HEIDI (NEUTRAL)
WEDNESDAY
    ---- SAWING 7:30 AM - 2:30 PM
         DAVID (PREFERRED)
    ---- PAINTING 7:30 AM - 2:30 PM
         IVAN (PREFERRED)
    ---- SANDING 9:45 AM - 4:45 PM
         FRANK (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM
         JUDY (PREFERRED)
    ---- SAWING 2:00 PM - 9:00 PM
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM
         ALICE (NEUTRAL)
THURSDAY
    ---- SANDING 9:00 AM - 4:00 PM
         GRACE (PREFERRED)
    ---- SAWING 9:00 AM - 4:00 PM
         CAROL (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM
         EVE (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM
         JUDY (PREFERRED)
    ---- SAWING 2:00 PM - 9:00 PM
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM
         ALICE (NEUTRAL)
FRIDAY
    ---- SAWING 9:00 AM - 4:00 PM
         DAVID (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM
         FRANK (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM
         GRACE (NEUTRAL)
    ---- SAWING 2:00 PM - 9:00 PM
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM
         HEIDI (NEUTRAL)
SATURDAY
    ---- SAWING 7:30 AM - 2:30 PM
         CAROL (PREFERRED)
    ---- PAINTING 7:30 AM - 2:30 PM
         IVAN (PREFERRED)
    ---- SANDING 9:45 AM - 4:45 PM
         DAVID (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM
         FRANK (NEUTRAL)
    ---- SAWING 2:00 PM - 9:00 PM
         ALICE (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM
         BOB (NEUTRAL)
SUMMARY
-- ALICE
   benefit: 13.000
   3 PREFERRED
   2 NEUTRAL
   0 DISFAVORED
-- BOB
   benefit: 14.000
   4 PREFERRED
   1 NEUTRAL
   0 DISFAVORED
-- CAROL
   benefit: 9.000
   3 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- DAVID
   benefit: 15.000
   5 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- EVE
   benefit: 9.000
   3 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- FRANK
   benefit: 13.000
   3 PREFERRED
   2 NEUTRAL
   0 DISFAVORED
-- GRACE
   benefit: 8.000
   2 PREFERRED
   1 NEUTRAL
   0 DISFAVORED
-- HEIDI
   benefit: 9.000
   1 PREFERRED
   3 NEUTRAL
   0 DISFAVORED
-- IVAN
   benefit: 12.000
   4 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- JUDY
   benefit: 6.000
   2 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
So we have a solution! We have 108 total benefit points. But it looks a bit uneven: Judy only works 2 days, while some people work many more: David works 5, for instance. Why is that? I update the program with human_annotate = 'JUDY', run it again, and it tells me more about Judy's preferences:
Objective value:                108.00000000
Enumerated nodes:               0
Total iterations:               0
Time (CPU seconds):             0.01
Time (Wallclock seconds):       0.01
Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.01   (Wallclock seconds):       0.02
SUNDAY (JUDY OFF)
    ---- SANDING 9:00 AM - 4:00 PM (JUDY NEUTRAL)
         EVE (PREFERRED)
    ---- SAWING 9:00 AM - 4:00 PM (JUDY PREFERRED)
         IVAN (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM (JUDY PREFERRED)
         FRANK (PREFERRED)
    ---- PAINTING 11:00 AM - 6:00 PM (JUDY NEUTRAL)
         HEIDI (PREFERRED)
    ---- SAWING 12:00 PM - 7:00 PM (JUDY PREFERRED)
         ALICE (PREFERRED)
    ---- PAINTING 12:15 PM - 7:15 PM (JUDY NEUTRAL)
         CAROL (PREFERRED)
MONDAY (JUDY OFF)
    ---- SAWING 9:00 AM - 4:00 PM (JUDY PREFERRED)
         DAVID (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM (JUDY NEUTRAL)
         IVAN (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM (JUDY NEUTRAL)
         GRACE (PREFERRED)
    ---- SAWING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         ALICE (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         HEIDI (NEUTRAL)
TUESDAY (JUDY OFF)
    ---- SAWING 9:00 AM - 4:00 PM (JUDY PREFERRED)
         DAVID (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM (JUDY PREFERRED)
         EVE (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM (JUDY REFUSED)
         FRANK (NEUTRAL)
    ---- SAWING 2:00 PM - 9:00 PM (JUDY REFUSED)
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM (JUDY REFUSED)
         HEIDI (NEUTRAL)
WEDNESDAY (JUDY SCHEDULED)
    ---- SAWING 7:30 AM - 2:30 PM (JUDY REFUSED)
         DAVID (PREFERRED)
    ---- PAINTING 7:30 AM - 2:30 PM (JUDY REFUSED)
         IVAN (PREFERRED)
    ---- SANDING 9:45 AM - 4:45 PM (JUDY NEUTRAL)
         FRANK (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM (JUDY PREFERRED)
         JUDY (PREFERRED)
    ---- SAWING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         ALICE (NEUTRAL)
THURSDAY (JUDY SCHEDULED)
    ---- SANDING 9:00 AM - 4:00 PM (JUDY PREFERRED)
         GRACE (PREFERRED)
    ---- SAWING 9:00 AM - 4:00 PM (JUDY PREFERRED)
         CAROL (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM (JUDY PREFERRED)
         EVE (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM (JUDY PREFERRED)
         JUDY (PREFERRED)
    ---- SAWING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         ALICE (NEUTRAL)
FRIDAY (JUDY OFF)
    ---- SAWING 9:00 AM - 4:00 PM (JUDY DISFAVORED)
         DAVID (PREFERRED)
    ---- PAINTING 9:45 AM - 4:45 PM (JUDY DISFAVORED)
         FRANK (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM (JUDY DISFAVORED)
         GRACE (NEUTRAL)
    ---- SAWING 2:00 PM - 9:00 PM (JUDY REFUSED)
         BOB (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM (JUDY REFUSED)
         HEIDI (NEUTRAL)
SATURDAY (JUDY OFF)
    ---- SAWING 7:30 AM - 2:30 PM (JUDY REFUSED)
         CAROL (PREFERRED)
    ---- PAINTING 7:30 AM - 2:30 PM (JUDY REFUSED)
         IVAN (PREFERRED)
    ---- SANDING 9:45 AM - 4:45 PM (JUDY REFUSED)
         DAVID (PREFERRED)
    ---- PAINTING 12:00 PM - 7:00 PM (JUDY DISFAVORED)
         FRANK (NEUTRAL)
    ---- SAWING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         ALICE (PREFERRED)
    ---- PAINTING 2:00 PM - 9:00 PM (JUDY DISFAVORED)
         BOB (NEUTRAL)
SUMMARY
-- ALICE
   benefit: 13.000
   3 PREFERRED
   2 NEUTRAL
   0 DISFAVORED
-- BOB
   benefit: 14.000
   4 PREFERRED
   1 NEUTRAL
   0 DISFAVORED
-- CAROL
   benefit: 9.000
   3 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- DAVID
   benefit: 15.000
   5 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- EVE
   benefit: 9.000
   3 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- FRANK
   benefit: 13.000
   3 PREFERRED
   2 NEUTRAL
   0 DISFAVORED
-- GRACE
   benefit: 8.000
   2 PREFERRED
   1 NEUTRAL
   0 DISFAVORED
-- HEIDI
   benefit: 9.000
   1 PREFERRED
   3 NEUTRAL
   0 DISFAVORED
-- IVAN
   benefit: 12.000
   4 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
-- JUDY
   benefit: 6.000
   2 PREFERRED
   0 NEUTRAL
   0 DISFAVORED
This tells us that on Monday Judy does not work, although she marked the 9:00 AM SAWING shift as PREFERRED. Instead David got that shift. What would happen if David gave that shift to Judy? He would lose 3 points, she would gain 3 points, and the total would remain exactly the same at 108. How would we favor a more even distribution? We need some sort of tie-break. I want to add a nonlinearity to strongly disfavor people getting a low number of shifts. But PuLP is very explicitly a linear programming solver, and cannot solve nonlinear problems. Here we can get around this by enumerating each specific case, and assigning it a nonlinear benefit function. The most obvious approach is to define another set of boolean variables, vars_Nshifts[human][N], and then use them to add extra benefit terms, with values nonlinearly related to Nshifts. Something like this:
benefit_boost_Nshifts = \
    {2: -0.8,
     3: -0.5,
     4: -0.3,
     5: -0.2}
for h in humans:
    benefits[h] = \
        ... + \
        pulp.lpSum([vars_Nshifts[h][n] * benefit_boost_Nshifts[n] \
                    for n in benefit_boost_Nshifts.keys()])
So in the previous example we considered giving David's 5th shift to Judy, for her 3rd shift. In that scenario, David's extra benefit would change from -0.2 to -0.3 (a shift of -0.1), while Judy's would change from -0.8 to -0.5 (a shift of +0.3), for a net gain of 0.2. So balancing out the shifts in this way would work: the solver would favor the solution with the higher benefit function. Great. In order for this to work, we need the vars_Nshifts[human][N] variables to function as intended: they need to be binary indicators of whether a specific person has that many shifts or not. That would need to be implemented with constraints. Let's plot it like this:
#!/usr/bin/python3
import numpy as np
import gnuplotlib as gp
Nshifts_eq  = 4
Nshifts_max = 10
Nshifts = np.arange(Nshifts_max+1)
i0 = np.nonzero(Nshifts != Nshifts_eq)[0]
i1 = np.nonzero(Nshifts == Nshifts_eq)[0]
gp.plot( # True value: var_Nshifts4==0, Nshifts!=4
         ( np.zeros(i0.shape),
           Nshifts[i0],
           dict(_with     = 'points pt 7 ps 1 lc "red"') ),
         # True value: var_Nshifts4==1, Nshifts==4
         ( np.ones(i1.shape),
           Nshifts[i1],
           dict(_with     = 'points pt 7 ps 1 lc "red"') ),
         # False value: var_Nshifts4==1, Nshifts!=4
         ( np.ones(i0.shape),
           Nshifts[i0],
           dict(_with     = 'points pt 7 ps 1 lc "black"') ),
         # False value: var_Nshifts4==0, Nshifts==4
         ( np.zeros(i1.shape),
           Nshifts[i1],
           dict(_with     = 'points pt 7 ps 1 lc "black"') ),
        unset=('grid'),
        _set = (f'xtics ("(Nshifts=={Nshifts_eq}) == 0" 0, "(Nshifts=={Nshifts_eq}) == 1" 1)'),
        _xrange = (-0.1, 1.1),
        ylabel = "Nshifts",
        title = "Nshifts equality variable: not linearly separable",
        hardcopy = "/tmp/scheduling-Nshifts-eq.svg")
scheduling-Nshifts-eq.svg
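The plot can also be backed up with a small numeric check. The idea, sketched here under the same setup as the plot (indicator value v on one axis, Nshifts on the other): the points (v=0, Nshifts=3) and (v=0, Nshifts=5) must both be allowed, and any linear constraint that allows both also allows their midpoint (v=0, Nshifts=4), which is exactly the point an equality indicator would have to forbid. The satisfies helper and the random sampling of constraints are illustrative assumptions, not part of the original program.

```python
import random

Nshifts_eq = 4

def satisfies(constraint, v, n):
    """True if the point (v, n) satisfies a*v + b*n <= c (hypothetical helper)."""
    a, b, c = constraint
    return a * v + b * n <= c

random.seed(0)
counterexamples = 0
for _ in range(10000):
    # A random linear constraint a*v + b*Nshifts <= c
    constraint = tuple(random.uniform(-10, 10) for _ in range(3))
    below_ok = satisfies(constraint, 0, Nshifts_eq - 1)
    above_ok = satisfies(constraint, 0, Nshifts_eq + 1)
    mid_ok   = satisfies(constraint, 0, Nshifts_eq)
    # Look for a constraint that keeps both neighbours but rejects the midpoint
    if below_ok and above_ok and not mid_ok:
        counterexamples += 1

print("constraints separating Nshifts==4 from its neighbours:", counterexamples)
```

No such constraint turns up, and none can: the feasible region of a set of linear constraints is convex, so a point between two feasible points is feasible too.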
So a hypothetical vars_Nshifts[h][4] variable (plotted on the x axis of this plot) would need to be defined by a set of linear AND constraints that linearly separate the true (red) values of this variable from the false (black) values. As can be seen in this plot, this isn't possible. So this representation does not work. How do we fix it? We can use inequality variables instead. I define a different set of variables vars_Nshifts_leq[human][N] that are 1 iff Nshifts <= N. The equality variable from before can be expressed as a difference of these inequality variables:
vars_Nshifts[human][N] = vars_Nshifts_leq[human][N] - vars_Nshifts_leq[human][N-1]
Can these vars_Nshifts_leq variables be defined by a set of linear AND constraints? Yes:
#!/usr/bin/python3
import numpy as np
import numpysane as nps
import gnuplotlib as gp
Nshifts_leq = 4
Nshifts_max = 10
Nshifts = np.arange(Nshifts_max+1)
i0 = np.nonzero(Nshifts >  Nshifts_leq)[0]
i1 = np.nonzero(Nshifts <= Nshifts_leq)[0]
def linear_slope_yintercept(xy0,xy1):
    m = (xy1[1] - xy0[1])/(xy1[0] - xy0[0])
    b = xy1[1] - m * xy1[0]
    return np.array(( m, b ))
x01     = np.arange(2)
x01_one = nps.glue( nps.transpose(x01), np.ones((2,1)), axis=-1)
y_lowerbound = nps.inner(x01_one,
                         linear_slope_yintercept( np.array((0, Nshifts_leq+1)),
                                                  np.array((1, 0)) ))
y_upperbound = nps.inner(x01_one,
                         linear_slope_yintercept( np.array((0, Nshifts_max)),
                                                  np.array((1, Nshifts_leq)) ))
y_lowerbound_check = (1-x01) * (Nshifts_leq+1)
y_upperbound_check = Nshifts_max - x01*(Nshifts_max-Nshifts_leq)
gp.plot( # True value: var_Nshifts_leq4==0, Nshifts>4
         ( np.zeros(i0.shape),
           Nshifts[i0],
           dict(_with     = 'points pt 7 ps 1 lc "red"') ),
         # True value: var_Nshifts_leq4==1, Nshifts<=4
         ( np.ones(i1.shape),
           Nshifts[i1],
           dict(_with     = 'points pt 7 ps 1 lc "red"') ),
         # False value: var_Nshifts_leq4==1, Nshifts>4
         ( np.ones(i0.shape),
           Nshifts[i0],
           dict(_with     = 'points pt 7 ps 1 lc "black"') ),
         # False value: var_Nshifts_leq4==0, Nshifts<=4
         ( np.zeros(i1.shape),
           Nshifts[i1],
           dict(_with     = 'points pt 7 ps 1 lc "black"') ),
         ( x01, y_lowerbound, y_upperbound,
           dict( _with     = 'filledcurves lc "green"',
                 tuplesize = 3) ),
         ( x01, nps.cat(y_lowerbound_check, y_upperbound_check),
           dict( _with     = 'lines lc "green" lw 2',
                 tuplesize = 2) ),
        unset=('grid'),
        _set = (f'xtics ("(Nshifts<={Nshifts_leq}) == 0" 0, "(Nshifts<={Nshifts_leq}) == 1" 1)',
                'style fill transparent pattern 1'),
        _xrange = (-0.1, 1.1),
        ylabel = "Nshifts",
        title = "Nshifts inequality variable: linearly separable",
        hardcopy = "/tmp/scheduling-Nshifts-leq.svg")
scheduling-Nshifts-leq.svg
So we can use two linear constraints to make each of these variables work properly. To use these in the benefit function we can use the equality constraint expression from above, or we can use these directly:
# I want to favor people getting more extra shifts at the start to balance
# things out: somebody getting one more shift on their pile shouldn't take
# shifts away from under-utilized people
benefit_boost_leq_bound = \
    {2: .2,
     3: .3,
     4: .4,
     5: .5}
# Constrain vars_Nshifts_leq variables to do the right thing
for h in humans:
    for b in benefit_boost_leq_bound.keys():
        prob += (pulp.lpSum([vars[h][s] for s in shifts.keys()])
                 >= (1 - vars_Nshifts_leq[h][b])*(b+1),
                 f"{h} at least {b} shifts: lower bound")
        prob += (pulp.lpSum([vars[h][s] for s in shifts.keys()])
                 <= Nshifts_max - vars_Nshifts_leq[h][b]*(Nshifts_max-b),
                 f"{h} at least {b} shifts: upper bound")
benefits = dict()
for h in humans:
    benefits[h] = \
        ... + \
        pulp.lpSum([vars_Nshifts_leq[h][b] * benefit_boost_leq_bound[b] \
                    for b in benefit_boost_leq_bound.keys()])
In this scenario, David would get a boost of 0.4 from giving up his 5th shift, while Judy would lose a boost of 0.2 from getting her 3rd, for a net gain of 0.2 benefit points. The exact numbers will need to be adjusted on a case-by-case basis, but this works. The full program, with this and other extra features, is available here.
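Both claims can be checked by brute force without running the solver. The sketch below is a standalone illustration: it reuses Nshifts_max = 5 and the boost values from the snippets above, while the feasible_indicator_values and boost helpers are hypothetical names introduced only for this check. It first confirms that the lower-bound and upper-bound constraints leave exactly one feasible value for each indicator, and then recomputes the David and Judy numbers.

```python
Nshifts_max = 5  # same limit as in the program above
benefit_boost_leq_bound = {2: .2, 3: .3, 4: .4, 5: .5}

def feasible_indicator_values(nshifts, b):
    """Indicator values in {0, 1} that satisfy both linear constraints
    for a worker with nshifts shifts (hypothetical helper)."""
    return [v for v in (0, 1)
            if nshifts >= (1 - v) * (b + 1)
            and nshifts <= Nshifts_max - v * (Nshifts_max - b)]

# The constraints should leave exactly one choice: 1 iff nshifts <= b
indicator_ok = all(
    feasible_indicator_values(nshifts, b) == [int(nshifts <= b)]
    for b in benefit_boost_leq_bound
    for nshifts in range(Nshifts_max + 1))
print("indicator constraints behave as intended:", indicator_ok)

def boost(nshifts):
    """Total boost for a worker: sum over every bound b with nshifts <= b."""
    return sum(value for b, value in benefit_boost_leq_bound.items()
               if nshifts <= b)

david_gain = boost(4) - boost(5)  # David drops from 5 shifts to 4
judy_loss  = boost(2) - boost(3)  # Judy goes from 2 shifts to 3
print(f"net change: {david_gain - judy_loss:.1f}")
```

Because the boosts accumulate over every bound a worker is under, the per-shift penalty grows with each additional shift, which is what makes the tie-break favor the under-utilized worker.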
