
9 November 2025

Colin Watson: Free software activity in October 2025

About 95% of my Debian contributions this month were sponsored by Freexian. You can also support my work directly via Liberapay or GitHub Sponsors.

OpenSSH

OpenSSH upstream released 10.1p1 this month, so I upgraded to that. In the process, I reverted a Debian patch that changed IP quality-of-service defaults; it made sense at the time but has since been reworked upstream anyway, so it makes sense to find out whether we still have similar problems. So far I haven't heard anything bad in this area. 10.1p1 caused a regression in the ssh-agent-filter package's tests, which I bisected and chased up with upstream. 10.1p1 also had a few other user-visible regressions (#1117574, #1117594, #1117638, #1117720); I upgraded to 10.2p1, which fixed some of these, and contributed some upstream debugging help to clear up the rest. While I was there, I also fixed "ssh-session-cleanup: fails due to wrong $ssh_session_pattern" in our packaging. Finally, I got all this into trixie-backports, which I intend to keep up to date throughout the forky development cycle.

Python packaging

For some time, ansible-core has had occasional autopkgtest failures that usually go away before anyone has a chance to look into them properly. I ran into these via openssh recently and decided to track them down. It turns out that they only happened when the libpython3.13-stdlib package had different versions in testing and unstable: an integration test setup script made a change that would be reverted if that package was ever upgraded in the testbed, and one of the integration tests accidentally failed to disable system apt sources comprehensively enough while testing the behaviour of the ansible.builtin.apt module. I fixed this in Debian and contributed the relevant part upstream. We've started working on enabling Python 3.14 as a supported version in Debian.
I fixed or helped to fix a number of packages for this, and upgraded several packages to new upstream versions. I packaged python-blockbuster and python-pytokens, needed as new dependencies of various other packages. Santiago Vila filed a batch of bugs about packages that fail to build when using the nocheck build profile, and I fixed several of these (generally just a matter of adjusting build-dependencies). I helped out with the scikit-learn 1.7 transition, fixed or helped to fix several other build/test failures, and fixed some other bugs. I investigated a python-py build failure, which turned out to have been fixed in Python 3.13.9. I adopted zope.hookable and zope.location for the Python team. Following an IRC question, I ported linux-gpib-user to pybuild-plugin-pyproject, and added tests to make sure the resulting binary package layout is correct.

Rust packaging

Another Pydantic upgrade meant I had to upgrade a corresponding stack of Rust packages to new upstream versions. I also upgraded rust-archery and rust-rpds.

Other bits and pieces

I fixed a few bugs in other packages I maintain. I investigated a malware report against tini, which I think we can prove to be a false positive (at least under the reasonable assumption that there isn't malware hiding in libgcc or glibc). Yay for reproducible builds! I noticed and fixed a small UI deficiency in debbugs, making the checkboxes under "Misc options" on package pages easier to hit; this is merged but we haven't yet deployed it. I noticed and fixed a typo in the "Being kind to porters" section of the Debian Developer's Reference.

Code reviews

Stefano Rivera: Debian Video Team Sprint: November 2025

This week, some of the DebConf Video Team met in Herefordshire (UK) for a sprint. We didn't have a sprint in 2024, and one was sorely needed by now. At the sprint we made good progress towards using Voctomix 2 more reliably, and made plans for our future hardware needs.

Attendees

Voctomix 2

DebConf 25 was the first event at which the team used Voctomix version 2. Testing it during DebCamp 25 (the week before DebConf), it seemed to work reliably. But during the event, we hit repeated audio dropout issues that affected about 18 of our recordings (and live streams). We had attempted to use Voctomix 2 at DebConf 24, and quickly rolled back to version 1 on day 1 of the conference when we hit similar issues. We thought these issues would be resolved for DebConf 25 by using more powerful (newer) mixing machines. Getting to the bottom of these issues was the main focus of the sprint. Nicolas brought two of Debian's cameras and the Framework laptop that we'd used at the conference, so we could reproduce the problem. It didn't take long to reproduce; in fact, we spent most of the week trying any configuration change we could think of to avoid it. The issue we've been seeing feels like a GStreamer bug, rather than something Voctomix is doing incorrectly; if anything, configuration changes merely avoid triggering it. Finally, on the last night of the sprint, we managed to run Voctomix all night without the problem appearing. But that isn't enough to feel confident that the issue is avoided; more testing will be required.

Detecting audio breakage

Kyle worked on a way to report audio quality in our Prometheus exporter, so we can automatically detect this kind of audio breakage. This was implemented in helios, our audio level monitor, and led to some related code refactoring.

Framework Laptops

Historically, the video team has relied on borrowed and rented computer hardware at conferences for our (software) video mixing, streaming, storage and encoding.
Many years ago, we'd even typically have a local Debian mirror and upload queue on site. Our video mixing machines had to be desktop-sized computers with two Blackmagic DeckLink Mini Recorder PCIe cards installed in them, to capture video from our cameras. Now that we reliably have more Internet bandwidth than we really need at our conference venues, we can rely on offsite cloud servers; we only need the video capture and mixing machines on site. Blackmagic also makes UltraStudio Recorder Thunderbolt capture boxes that we can use with a laptop. The project bought a couple of these and a Framework 13 AMD laptop to test at DebConf 25. We used it in production at DebConf, in the "Petit Amphi" room, where it seemed to work fairly well, although it was very picky about Thunderbolt cable and port combinations, refusing to even boot when some were connected. Since then, Framework firmware has fixed these issues, and in our testing at the sprint it worked almost perfectly. (One of the capture boxes got into a broken state, and had to be unplugged and re-connected to fix it.) We think these are the best option for the future, and plan to ask the project to buy some more of them.

HDCP

Apple Silicon devices seem to like to HDCP-encrypt their HDMI output whenever possible. This causes our HDMI capture hardware to display an "Encrypted" error, rather than any useful image. Chris experimented with a few different devices to strip HDCP from HDMI video; at least two of them worked.

Spring Cleaning

Kyle dug through the open issues in our Salsa repositories and cleaned up some of them.

DebConf 25 Video Encoding

The core video team at DebConf 25 was a little under-staffed, significantly overlapping with core conference organization, which took priority. That, combined with the Voctomix 2 audio dropout issues we'd hit, meant that there was quite a bit of work left to be done to get the conference videos properly encoded and released.
We found that the encodings had been done at the wrong resolution, which forced a re-encode of all videos. In the process, we reviewed videos for audio issues and made a list of the ones that need more work. We ran out of time, and this work isn't done yet.

DebConf 26 Preparation

Kyle reviewed floorplans and photographs of the proposed DebConf 26 talk venues, and built up a list of A/V kit that we'll need to hire.

Carl's Video Box

Carl uses much of the same stack as the video team for many other events in the US. He has been experimenting with using a Dell 7212 tablet in an all-in-one laser-cut box. Carl demonstrated this box, which could be perfect for small MiniDebConfs, at the sprint. Using Voctomix 2 on the box requires some work, because it doesn't use Blackmagic cards for video capture.

The Box: Front
The Box: Back

gst-fallbacksrc

Carl's box's needs led us to look at gst-fallbacksrc. This should let Voctomix 2 survive cameras (or network sources) going away for a moment. Matthias Geiger packaged it for us, and it's now in Debian NEW. Thanks!

voctomix-outcasts

Carl cut a release of voctomix-outcasts and Stefano uploaded it to unstable.

Ansible Configuration

The video team's stack is deployed with Ansible, and almost everything we do involves work on this stack. Carl upstreamed some of his features to us, and we updated our voctomix2 configuration to take advantage of our experiments at the sprint.

Miscellaneous Voctomix contributions

We fixed a couple of minor bugs in Voctomix.

More Nageru experimentation

In 2023, we tried to configure Nageru (another live video mixer) for the video team's needs. Like Voctomix, it needs some configuration and scaffolding to adapt it to your needs. Practically, this means writing a "theme" in Lua that controls the mixer. The team still has a preference for Voctomix (as we're all very familiar with it), but would like to have Nageru available as an option when we need it.
We fixed some minor issues in our theme, enough to get it running again on the Framework laptop. Much more work is needed to really make it a usable option.

Thank you

Thanks to the Debian project for funding the costs of the sprint, and to Chris Boot's extended family for providing us with a no-cost sprint venue. Thanks to c3voc for developing and maintaining Voctomix, and helping us to debug issues in it. Thank you to everyone in the video team who attended or helped out remotely, and to employers who let us work on Debian on company time. We'll likely need to keep working on our stack remotely in the lead-up to DebConf 26, and/or have another sprint before then.

Breakfast · Coffee · Hacklab · Trains!
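The audio-breakage detection described in this post could work roughly like the following pure-Python sketch: compute an RMS level in dBFS per block of samples, and flag a block that is suspiciously quiet for a live input. Function names and the threshold are illustrative assumptions; helios' actual implementation may differ, and an exporter would then publish such a value as a Prometheus gauge.

```python
import math

def rms_dbfs(samples):
    """RMS level of float samples (range -1.0..1.0) in dBFS.
    0 dBFS corresponds to a constant full-scale signal."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def looks_like_dropout(samples, threshold_dbfs=-60.0):
    """A sustained level this far below full scale on a live feed
    suggests the audio has dropped out (threshold is a guess)."""
    return rms_dbfs(samples) < threshold_dbfs
```

For example, a block of digital silence is classified as a dropout, while a half-scale signal (about -6 dBFS) is not.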

8 November 2025

Thorsten Alteholz: My Debian Activities in October 2025

Debian LTS

This was my hundred-and-thirty-sixth month doing some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded or worked on a number of packages, and I also attended the monthly LTS/ELTS meeting.

Debian ELTS

This month was the eighty-seventh ELTS month. During my allocated time I uploaded or worked on a number of packages, and I also attended the monthly LTS/ELTS meeting.

Debian Printing

This month I uploaded a new upstream version or a bugfix version of several packages. This work is generously funded by Freexian!

Debian Astro

This month I uploaded a new upstream version or a bugfix version of several packages.

Debian IoT

Unfortunately I didn't find any time to work on this topic.

Debian Mobcom

This month I uploaded a new upstream version or a bugfix version of several packages.

misc

This month I uploaded a new upstream version or a bugfix version of several packages. In my fight against outdated RFPs, I closed 31 of them in October. I could even close one RFP by uploading the new package gypsy. Meanwhile only 3373 are still open, so don't hesitate to help close one or another.

FTP master

This month I accepted 420 and rejected 45 packages. The overall number of packages that got accepted was 423. I would like to remind everybody that in case you don't agree with the removal of a package, please set the moreinfo tag on the corresponding bug. This is the only reliable way to prevent processing of that RM bug. Well, there is a second way: of course, you could also achieve this by closing the bug.

7 November 2025

Ravi Dwivedi: A Bad Day in Malaysia

Continuing from where Badri and I left off in the last post: on the 7th of December 2024, we boarded a bus from Singapore to the border town of Johor Bahru in Malaysia. The bus stopped at Singapore emigration for us to get off for the formalities. The process was similar to the immigration at the Singapore airport: it was automatic, and we just had to scan our passports for the gates to open. Here also, we didn't get Singapore stamps on our passports. After we were done with emigration, we had to find our bus. We remembered the name of the bus company and the number plate, which helped us recognize our bus. It wasn't there yet when we came out of emigration, but it arrived soon enough, and we boarded it promptly. From Singapore emigration, the bus travelled a few kilometers and dropped us at the Johor Bahru Sentral (JB Sentral) bus station, where we had to go through Malaysian immigration. The process was manual, unlike Singapore's: there was an immigration officer at the counter who stamped our passports (which I like) and recorded our fingerprints. At the bus terminal, we exchanged rupees at an exchange shop to get Malaysian ringgits. We could not find any free drinking water sources at the bus terminal, so we had to buy water. Badri later told me that Johor Bahru has a lot of data centers, which need a lot of water for cooling. When he read about it later, he immediately connected it with the fact that there was no free drinking water and we had to buy it: such data centers can lead to scarcity of water for others in the area. From JB Sentral, we took a bus to Larkin Terminal, as our hotel was nearby. It was 1.5 ringgits per person (30 rupees); to pay the fare, we had to put cash in a box near the driver's seat. Around half an hour later, we reached our hotel. The time was 23:30 hours. The hotel room was hot, as it didn't have air-conditioning; the weather in Malaysia is on the hotter side throughout the year.
It was a budget hotel, and we paid 70 ringgits for our room. Badri slept soon after we checked in. I went out at around 00:30. I was hungry, so I entered a small restaurant nearby, which was quite lively for the midnight hours. At the restaurant, I ordered a coffee and an omelet. I also asked for drinking water; the unique thing there was that they put ice in hot water to bring it to a normal temperature. My bill from the restaurant looked like the table below, as the item names were in the local language, Malay:
| Item          | Price (Malaysian ringgits) | Conversion to Indian rupees | Comments |
| Nescafe Tarik | 2.50                       | 50                          | Coffee   |
| Ais Kosong    | 0.50                       | 10                          | Water    |
| Telur Dadar   | 2.00                       | 40                          | Omelet   |
| SST Tax (6%)  | 0.30                       | 6                           |          |
| Total         | 5.30                       | 106                         |          |
After checking out from the restaurant, I explored nearby shops. I also bought some water before going back to the hotel room. The next day, we had a (pre-booked) bus to Kuala Lumpur. We checked out from the hotel 10 minutes after the check-out time (which was 14:00 hours). However, within those 10 minutes, the hotel staff came up three times asking us to clear out (which we were doing as fast as possible), and on the third time they said our deposit was forfeited, even though it was supposed to cover only keys and towels. The above-mentioned bus for Kuala Lumpur left from the nearby Larkin Bus Terminal. The terminal was right next to our hotel, so we walked there. Upon reaching it, we found that the process of boarding a bus in Malaysia resembled taking a flight: we needed to go to a counter to get our boarding passes, then report at our gate half an hour before the scheduled time. Furthermore, they had a separate waiting room and boarding gates, and there was a display listing buses with their arrivals and departures. Finally, to top it off, the buses had seatbelts. We got our boarding pass for 2 ringgits (40 rupees). After that, we went to get something to eat, as we were hungry. We tried a McDonald's, but couldn't order anything because of the long queue. We didn't have a lot of time, so we proceeded towards our boarding gate without having anything. The boarding gate was in a separate room, which had a vending machine. I tried to order something using my card, but the machine wasn't working. In Malaysia, there is a custom of queueing up to board buses even before the bus has arrived. We saw it in Johor Bahru as well; the culture is so strong that people even did it in Singapore while waiting for the Johor Bahru bus! Our bus departed at 15:30 as scheduled. The journey was around 5 hours. A couple of hours in, our bus stopped for a break. We got off the bus and went to the toilet.
As we were starving (we hadn't had anything the whole day), we thought it was a good opportunity to get a snack. There was a stall selling some food; however, I had to determine which options were vegetarian. We finally settled on a cylindrical box of potato chips, labelled Mister Potato, for 7 ringgits. We didn't know how long the bus was going to stop, and eating inside buses in Malaysia is forbidden. When we went to get some coffee from the stall, our bus driver was standing there and made a face; we got the impression that he didn't want us to have coffee. However, after we got into the bus, we had to wait a long time for it to resume its journey, as the driver was taking his sweet time drinking his own coffee. During the bus journey, we saw a lot of palm trees along the way. The landscape was beautiful, with good road infrastructure throughout the journey. Badri also helped me improve my blog post on obtaining a Luxembourg visa during the ride. The bus dropped us at the Terminal Bersepadu Selatan (TBS for short) in Kuala Lumpur at 21:30 hours, where we finally got something to eat. We also noticed that the TBS bus station had lockers, which gave us the idea of storing some of our luggage there later, while we would be in Brunei. We had booked a cheap AirAsia ticket which didn't include check-in luggage, and keeping the luggage in lockers for three days was cheaper than paying AirAsia's excess luggage penalty. As our hostel was close to a metro station, we followed up by taking the metro. All in all, this was a bad day: our deposit was forfeited unfairly, and we got almost nothing to eat. The hostel, located in the Bukit Bintang area, was called Manor by Mingle. I had stayed there earlier, in February 2024, for two nights; back then, I paid 1000 rupees per day for a dormitory bed. However, this time the same hostel was much cheaper.
We got a private room for 800 rupees per day, with breakfast included. It might have been pricier earlier because my stay fell on a weekend, or maybe February brings more tourists to Kuala Lumpur. That's it for this post. Stay tuned for our adventures in Malaysia!

5 November 2025

Reproducible Builds: Reproducible Builds in October 2025

Welcome to the October 2025 report from the Reproducible Builds project! Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. As ever, if you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website. In this report:

  1. Farewell from the Reproducible Builds Summit 2025
  2. Google's Play Store breaks reproducible builds for Signal
  3. Mailing list updates
  4. The Original Sin of Computing that no one can fix
  5. Reproducible Builds at the Transparency.dev summit
  6. Supply Chain Security for Go
  7. Three new academic papers published
  8. Distribution work
  9. Upstream patches
  10. Website updates
  11. Tool development

Farewell from the Reproducible Builds Summit 2025

Thank you to everyone who joined us at the Reproducible Builds Summit in Vienna, Austria! We were thrilled to host the eighth edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin, Hamburg and Athens. During this event, participants had the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim was to create an inclusive space that fosters collaboration, innovation and problem-solving. The agenda of the three main days is available online; however, some working sessions may still lack notes at time of publication. One tangible outcome of the summit is that Johannes Starosta finished their rebuilderd tutorial, which is now available online; Johannes is actively seeking feedback.

Google's Play Store breaks reproducible builds for Signal

On the issue tracker for the popular Signal messenger app, developer Greyson Parrelli reports that updates to the Google Play store have, in effect, broken reproducible builds:

The most recent issues have to do with changes to the APKs that are made by the Play Store. Specifically, they add some attributes to some .xml files around languages and resources, which is not unexpected because of how the whole bundle system works. This is trickier to resolve, because unlike current expected differences (like signing information), we can't just exclude a whole file from the comparison. We have to take a more nuanced look at the diff. I've been hesitant to do that because it'll complicate our currently-very-readable comparison script, but I don't think there's any other reasonable option here.
The full thread with additional context is available on GitHub.
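To illustrate the two approaches contrasted in the quote (this is a hypothetical sketch, not Signal's actual comparison script): today's simple approach excludes whole files whose contents are expected to differ, such as signing metadata under META-INF/, and hashes everything else so that any other byte difference is flagged. All names below are assumptions.

```python
import hashlib
import zipfile

# Paths whose contents are expected to differ between builds
# (e.g. signing information); an entirely illustrative list.
EXPECTED_DIFFERENT_PREFIXES = ("META-INF/",)

def apk_entry_hashes(path):
    """Map each entry of an APK (a zip file) to its SHA-256 digest,
    skipping entries that are allowed to differ."""
    hashes = {}
    with zipfile.ZipFile(path) as apk:
        for name in apk.namelist():
            if name.startswith(EXPECTED_DIFFERENT_PREFIXES):
                continue
            hashes[name] = hashlib.sha256(apk.read(name)).hexdigest()
    return hashes

def unexpected_differences(apk_a, apk_b):
    """Entries missing from one APK or differing in content,
    ignoring the expected-different paths."""
    a, b = apk_entry_hashes(apk_a), apk_entry_hashes(apk_b)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

The "more nuanced" approach Parrelli describes would go further: instead of failing on any byte difference in an .xml entry, it would diff inside the entry and accept only the specific attribute changes the Play Store is known to introduce.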

Mailing list updates On our mailing list this month:
  • kpcyrd forwarded a fascinating tidbit regarding so-called ninja and samurai build ordering, which uses data structures in which the pointer values returned from malloc are used to determine some order of execution.
  • Arnout Engelen, Justin Cappos, Ludovic Courtès and kpcyrd continued a conversation started in September regarding the "Minimum Elements for a Software Bill of Materials". (Full thread)
  • Felix Moessbauer of Siemens posted to the list reporting that he had recently stumbled upon "a couple of Debian source packages on the snapshot mirrors that are listed multiple times (same name and version), but each time with a different checksum". The thread, which Felix titled "Debian: what precisely identifies a source package", is about precisely that: what can be axiomatically relied upon by consumers of the Debian archives. It also indicates an issue where "we can't exactly say which packages were used during build time (even when having the .buildinfo files)".
  • Luca DiMaio posted to the list announcing the release of xfsprogs 6.17.0, which includes a commit that implements "the functionality to populate a newly created XFS filesystem directly from an existing directory structure", which "makes it easier to create populated filesystems without having to mount them [and thus is] particularly useful for reproducible builds". Luca asked the list how they might contribute to the docs of the System images page.
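The ninja/samurai tidbit above is a classic reproducibility hazard. A toy Python illustration (hypothetical code, not ninja's or samurai's actual data structures): ordering items by their heap address, which is what keying a structure on malloc()'s return value amounts to, gives an order that can vary between runs, so any output derived from it is not reproducible.

```python
class Task:
    """A stand-in for a build step stored in an address-keyed structure."""
    def __init__(self, name):
        self.name = name

tasks = [Task("compile"), Task("link"), Task("strip")]

# Sorting by id() (the object's memory address in CPython) mirrors
# ordering by malloc()'s return value in C: the result depends on
# where the allocator happened to place each object, which can
# differ between runs and machines.
address_order = [t.name for t in sorted(tasks, key=id)]

# The *set* of tasks is stable, but their order is not guaranteed.
# A reproducible build system must order by something deterministic,
# such as the task name:
deterministic_order = sorted(t.name for t in tasks)
```

Only the second ordering is safe to bake into build output; the first may silently differ on the next run.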

The Original Sin of Computing that no one can fix

Popular YouTuber @lauriewired published a video this month with an engaging take on the Trusting Trust problem. Titled "The Original Sin of Computing that no one can fix", the video touches on David A. Wheeler's Diverse Double-Compiling dissertation. GNU developer Janneke Nieuwenhuizen followed up with an email (additionally sent to our mailing list) as well, underscoring that GNU Mes's current solution to this issue "uses ancient softwares in its bootstrap path, such as gcc-2.95.3 and glibc-2.2.5". (According to Colby Russell, the GNU Mes bootstrapping sequence is shown at 18m54s in the video.)

Reproducible Builds at the Transparency.dev summit

Holger Levsen gave a talk at this year's Transparency.dev summit in Gothenburg, Sweden, outlining the achievements of the Reproducible Builds project over the last 12 years, covering both upstream developments as well as some distribution-specific details. As mentioned on the talk's page, Holger's presentation concluded with "an outlook into the future and an invitation to collaborate to bring transparency logs into Reproducible Builds projects". The slides of the talk are available, although a video has yet to be released. Nevertheless, as a result of the discussions at Transparency.dev, there is a new page on the Debian wiki with the aim of describing a potential transparency log setup for Debian.

Supply Chain Security for Go

Andrew Ayer has set up a new service at sourcespotter.com that aims to monitor supply chain security for Go releases. It consists of four separate trackers:
  1. A tool to verify that the Go Module Mirror and Checksum Database is behaving honestly and has not presented inconsistent information to clients.
  2. A module monitor that records every module version served by the Go Module Mirror and Checksum Database, allowing you to monitor for unexpected versions of your modules.
  3. A tool that verifies that the Go toolchains published in the Go Module Mirror can be reproduced from source code, making it difficult to hide backdoors in the binaries downloaded by the go command.
  4. A telemetry config tracker that tracks the names of telemetry counters uploaded by the Go toolchain, to ensure that Go telemetry is not violating users' privacy.
As the homepage of the service mentions, the trackers are free software and do not rely on Google infrastructure.

Three new academic papers published

Julien Malka of the Institut Polytechnique de Paris published an exciting paper this month on how NixOS could have detected the XZ supply-chain attack for the benefit of all, thanks to reproducible builds. Julien outlines his paper as follows:
In March 2024, a sophisticated backdoor was discovered in xz, a core compression library in Linux distributions, covertly inserted over three years by a malicious maintainer, Jia Tan. The attack, which enabled remote code execution via ssh, was only uncovered by chance when Andres Freund investigated a minor performance issue. This incident highlights the vulnerability of the open-source supply chain and the effort attackers are willing to invest in gaining trust and access. In this article, I analyze the backdoor's mechanics and explore how bitwise build reproducibility could have helped detect it.
A PDF of the paper is available online.
Iyán Méndez Veiga and Esther Hänggi (of the Lucerne University of Applied Sciences and Arts and ETH Zurich) published a paper this month on the topic of Reproducible Builds for Quantum Computing. The abstract of their paper mentions the following:
Although quantum computing is a rapidly evolving field of research, it can already benefit from adopting reproducible builds. This paper aims to bridge the gap between the quantum computing and reproducible builds communities. We propose a generalization of the definition of reproducible builds in the quantum setting, motivated by two threat models: one targeting the confidentiality of end users' data during circuit preparation and submission to a quantum computer, and another compromising the integrity of quantum computation results. This work presents three examples that show how classical information can be hidden in transpiled quantum circuits, and two cases illustrating how even minimal modifications to these circuits can lead to incorrect quantum computation results.
A full PDF of their paper is available.
Congratulations to Georg Kofler, who submitted their Master's thesis at the Johannes Kepler University of Linz, Austria on the topic of Reproducible builds of E2EE-messengers for Android using Nix hermetic builds:
The thesis focuses on providing a reproducible build process for two open-source E2EE messaging applications: Signal and Wire. The motivation to ensure reproducibility and thereby the integrity of E2EE messaging applications stems from their central role as essential tools for modern digital privacy. These applications provide confidentiality for private and sensitive communications, and their compromise could undermine encryption mechanisms, potentially leaking sensitive data to third parties.
A full PDF of their thesis is available online.
Shawkot Hossain of Aalto University, Finland has also submitted their Master's thesis on The Role of SBOM in Modern Development, with a focus on the extant tooling:
Currently, there are numerous solutions and techniques available in the market to tackle supply chain security, and all claim to be the best solution. This thesis delves deeper by implementing those solutions and evaluating them for better understanding. Some of the tools that this thesis implemented are Syft, Trivy, Grype, FOSSA, dependency-check, and Gemnasium. Software dependencies are generated in a Software Bill of Materials (SBOM) format by using these open-source tools, and the corresponding results have been analyzed. Among these tools, Syft and Trivy outperform the others, as they provide relevant and accurate information on software dependencies.
A PDF of the thesis is also available.

Distribution work

Michael Plura published an interesting article on Heise.de under the title "Trust is good, reproducibility is better":
In the wake of growing supply chain attacks, the FreeBSD developers are relying on a transparent build concept in the form of Zero-Trust Builds. The approach builds on the established Reproducible Builds, where binary files can be rebuilt bit-for-bit from the published source code. While reproducible builds primarily ensure verifiability, the zero-trust model goes a step further and removes trust from the build process itself. No single server, maintainer, or compiler can be considered more than potentially trustworthy.
The article mentions that this goal "has now been achieved with a slight delay and can be used in the current development branch for FreeBSD 15".
In Debian this month, 7 reviews of Debian packages were added, 5 were updated and 11 were removed, adding to our knowledge about identified issues. For the Debian CI tests, Holger fixed #786644 and set nocheck in DEB_BUILD_OPTIONS for the second build.
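DEB_BUILD_OPTIONS is a space-separated list of option tokens (per Debian Policy), so a second, test-less build can be requested simply by exporting nocheck. A minimal sketch of how a build script might honour it (the helper names are made up for illustration):

```python
import os

def deb_build_options(environ=None):
    """Parse DEB_BUILD_OPTIONS (a space-separated token list, per
    Debian Policy) into a set of options."""
    environ = os.environ if environ is None else environ
    return set(environ.get("DEB_BUILD_OPTIONS", "").split())

def should_run_tests(environ=None):
    # "nocheck" asks the package build to skip its test suite,
    # which is what the second CI build described above requests.
    return "nocheck" not in deb_build_options(environ)
```

For example, with DEB_BUILD_OPTIONS="nocheck parallel=4" in the environment, should_run_tests() returns False while the parallel=4 token is still available to the rest of the build.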
Lastly, Bernhard M. Wiedemann posted another openSUSE monthly update for their work there.

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Website updates

Once again, there were a number of improvements made to our website this month. In addition, a number of contributors added a series of notes from our recent summit to the website, including Alexander Couzens, Robin Candau and kpcyrd.

Tool development

diffoscope version 307 was uploaded to Debian unstable by Chris Lamb, who made a number of changes including fixing compatibility with LLVM version 21 and an experimental attempt to automatically deploy to PyPI, liaising with the PyPI developers/maintainers. In addition, Vagrant Cascadian updated diffoscope in GNU Guix to version 307.

Finally, if you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can also get in touch with us via:

3 November 2025

Melissa Wen: Kworkflow at Kernel Recipes 2025

Frank's drawing of Melissa Wen with Kernel Recipes mascots around

This was the first year I attended Kernel Recipes, and I can only say how much I enjoyed it and how grateful I am for the opportunity to talk about kworkflow to very experienced kernel developers. What I like most about Kernel Recipes is its intimate format, with only one track and many moments to get closer to experts and to people you normally only talk to online during the year. At the beginning of this year, I gave the talk "Don't let your motivation go, save time with kworkflow" at FOSDEM, introducing kworkflow to a more diversified audience with different levels of involvement in Linux kernel development. At this year's Kernel Recipes, I presented the second talk of the first day: "Kworkflow - mix & match kernel recipes end-to-end". The Kernel Recipes audience is a bit different from FOSDEM's, with mostly long-term kernel developers, so I decided to go directly to the point. I showed kworkflow being part of the daily life of a typical kernel developer, from the local setup and installing a custom kernel on different target machines to sending and applying patches to/from the mailing list. In short, I showed how to mix and match kernel workflow recipes end-to-end. As I went a bit fast when showing some features during my presentation, in this blog post I explain each slide from my speaker notes. You can see a summary of this presentation in the Kernel Recipes Live Blog, Day 1: morning.

Introduction First slide: Kworkflow by Melissa Wen. Hi, I'm Melissa Wen from Igalia. As we have already started sharing kernel recipes and even more are coming in the next three days, in this presentation I'll talk about kworkflow: a cookbook to mix & match kernel recipes end-to-end. Second slide: About Melissa Wen, the speaker of this talk. This is my first time attending Kernel Recipes, so let me introduce myself briefly.
  • As I said, I work for Igalia, mostly on kernel GPU drivers in the DRM subsystem.
  • In the past, I co-maintained VKMS and the v3d driver. Nowadays I focus on the AMD display driver, mostly for the Steam Deck.
  • Besides code, I contribute to the Linux kernel by mentoring newcomers in Outreachy, Google Summer of Code and the Igalia Coding Experience, and also by documenting and tooling the kernel.
Slide 3: and what's this cookbook called Kworkflow? - with kworkflow logo and KR penguin. And what's this cookbook called kworkflow?

Kworkflow (kw) Slide 4: text below. Kworkflow is a tool created by Rodrigo Siqueira, my colleague at Igalia. It's a single platform that combines software and tools to:
  • optimize your kernel development workflow;
  • reduce time spent in repetitive tasks;
  • standardize best practices;
  • ensure that deployment data flows smoothly and reliably between different kernel workflows;
Slide 5: kworkflow is mostly voluntary work. It's mostly done by volunteers: kernel developers using their spare time. Its features cover real use cases according to kernel developers' needs. Slide 6: Mix & Match the daily life of a kernel developer. Basically, it's about mixing and matching the daily life of a typical kernel developer with kernel workflow recipes and some secret sauces.

First recipe: A good GPU driver for my AMD laptop. Slide 7: Let's prepare our first recipe. So, it's time to start the first recipe: A good GPU driver for my AMD laptop. Slide 8: Ingredients and Tools. Before starting any recipe we need to check the necessary ingredients and tools. So, let's check what you have at home. With kworkflow, you can use: Slide 9: kw device and kw remote
  • kw device: to get information about the target machine, such as CPU model, kernel version, distribution, and GPU model
  • kw remote: to set the address of this machine for remote access
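A minimal sketch of this setup step, assuming kw is installed and the target machine is reachable over SSH. The remote name, user, and IP below are placeholders, and the exact subcommand syntax may vary between kw versions (check `kw remote --help`):

```shell
# Register a remote target machine (placeholder name and user@address)
kw remote --add deck deck@192.168.0.42

# Show CPU model, kernel version, distribution and GPU of the target
kw device --remote
```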
Slide 11: kw config
  • kw config: you can configure kw with kw config. With this command you can basically select the tools, flags and preferences that kw will use to build and deploy a custom kernel on a target machine. You can also define the recipients of your patches when sending them with kw send-patch. I'll explain more about each feature later in this presentation.
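As a sketch, kw config can be used interactively or driven directly; the `target.option` name below is an illustrative example rather than a definitive list (see `kw config --help` for the real option names):

```shell
# Open the interactive configuration menu for the current worktree
kw config

# Set a single option directly (illustrative target.option name)
kw config build.arch arm64
```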
Slide 13: kw kernel-config-manager
  • kw kernel-config-manager (or just kw k): to fetch the kernel .config file from a given machine, store multiple .config files, and list and retrieve them according to your needs.
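A sketch of the workflow described above; the flag spellings follow the kw documentation but may differ by version, and the config name is a placeholder:

```shell
# Fetch the .config from the configured target machine
kw kernel-config-manager --fetch

# Store, list, and retrieve named configs (kw k is the shortcut)
kw k --save deck-config
kw k --list
kw k --get deck-config
```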
Slide 15: Preparation. Now, with all ingredients and tools selected and well portioned, follow the right steps to prepare your custom kernel! First step: Mix ingredients with kw build or just kw b. Slide 16: kw build
  • kw b and its options wrap many of the routines involved in compiling a custom kernel.
    • You can run kw b -i to check the name, kernel version, and number of modules that will be compiled, and kw b --menu to change kernel configurations.
    • You can also pre-configure compiling preferences for kernel building in kw config: for example, the target architecture, the name of the generated kernel image, whether you need to cross-compile this kernel for a different system and which tool to use for it, different warning levels, compiling with CFlags, etc.
    • Then you can just run kw b to compile the custom kernel for a target machine.
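The build options above, as a short command sequence (assuming a worktree already configured via kw config; all flags are taken from the talk):

```shell
# Show kernel name, version and number of modules to be compiled
kw b -i

# Adjust kernel options via the menu interface
kw b --menu

# Build using the preferences stored via kw config
kw b
```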
Second step: Bake it with kw deploy or just kw d. Slide 18: kw deploy. After compiling the custom kernel, we want to install it on the target machine. Check the name of the custom kernel built, 6.17.0-rc6, and, with kw's SSH, access the target machine and see it's running the kernel from the Debian distribution, 6.16.7+deb14-amd64. As with build settings, you can also pre-configure some deployment settings, such as the compression type, the path to device tree binaries, the target machine (remote, local, vm), whether you want to reboot the target machine right after deploying your custom kernel, and whether you want to boot into the custom kernel when restarting the system after deployment. If you didn't pre-configure some options, you can still customize them as command options; for example, kw d --reboot will reboot the system after deployment even if I didn't set this in my preferences. By just running kw d --reboot I have installed the kernel on a given target machine and rebooted it, so when accessing the system again I can see it was booted into my custom kernel. Third step: Time to taste with kw debug. Slide 20: kw debug
  • kw debug wraps many tools for validating a kernel on a target machine. We can log basic dmesg info but also track events and use ftrace.
    • With kw debug --dmesg --history we can grab the full dmesg log from a remote machine; if you use the --follow option, you will monitor dmesg output continuously. You can also run a command with kw debug --dmesg --cmd="<my command>" and collect just the dmesg output related to that specific execution period.
    • In the example, I'll just unload the amdgpu driver. I use kw drm --gui-off to drop the graphical interface and release amdgpu for unloading. Then I run kw debug --dmesg --cmd="modprobe -r amdgpu" to unload the amdgpu driver, but it fails and I couldn't unload it.
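The deploy and debug steps above can be sketched as follows; all commands come from the talk and assume a configured remote target:

```shell
# Install the freshly built kernel on the target and reboot into it
kw d --reboot

# Grab the full dmesg history from the remote machine
kw debug --dmesg --history

# Drop the graphical session, then run a command remotely and capture
# only the dmesg output it triggers
kw drm --gui-off
kw debug --dmesg --cmd="modprobe -r amdgpu"
```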

Cooking Problems. Slide 22: kw patch-hub. Oh no! That custom kernel isn't tasting good. Don't worry: as with many recipe preparations, we can search the internet for suggestions on how to make it tasty, alternative ingredients, and other flavours according to your taste. With kw patch-hub you can search the lore kernel mailing list for patches that may fix your kernel issue. You can navigate the mailing lists, check series, bookmark them if you find them relevant, and apply them to your local kernel tree, creating a different branch for tasting... oops, for testing. In this example, I'm opening the amd-gfx mailing list, where I can find contributions related to the AMD GPU driver, bookmark and/or just apply the series to my work tree, and with kw bd I can compile & install the custom kernel with this possible bug fix in one shot. As I changed my kw config to reboot after deployment, I just need to wait for the system to boot to try unloading the amdgpu driver again with kw debug --dmesg --cmd="modprobe -r amdgpu". From the dmesg output retrieved by kw for this command, the driver was unloaded: the problem is fixed by this series and the kernel tastes good now. If I'm satisfied with the solution, I can even use kw patch-hub to access the bookmarked series and mark the checkbox that will reply to the patch thread with a Reviewed-by tag for me.
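The fix-testing loop described above, condensed; note that kw patch-hub itself is an interactive interface for browsing, bookmarking and applying series:

```shell
# Browse lore.kernel.org lists, bookmark and apply a candidate series
kw patch-hub

# Build and deploy the patched kernel in one shot
kw bd

# Re-run the previously failing operation and inspect its dmesg output
kw debug --dmesg --cmd="modprobe -r amdgpu"
```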

Second Recipe: Raspberry Pi 4 with Upstream Kernel. Slide 25: Second Recipe RPi 4 with upstream kernel. As in all recipes, we need ingredients and tools, but with kworkflow you can get everything set up as quickly as a scene change in a TV show. We can use kw env to switch to a different environment with all kw and kernel configuration set and the latest compiled kernel cached. I was preparing the first recipe for an x86 AMD laptop, and with kw env --use RPI_64 I use the same worktree but move to a different kernel workflow, now for the 64-bit Raspberry Pi 4. The previously compiled kernel, 6.17.0-rc6-mainline+, is there with 1266 modules, not the 6.17.0-rc6 kernel with 285 modules that I had just built & deployed. The kw build settings are also different: now I'm targeting the arm64 architecture with a kernel cross-compiled using the aarch64-linux-gnu- toolchain, and my kernel image is called kernel8 now. Slide 27: kw env. If you didn't plan for this recipe in advance, don't worry. You can create a new environment with kw env --create RPI_64_V2 and run kw init --template to start preparing your kernel recipe with the mirepoix ready... I mean, with the basic ingredients already cut... I mean, with the kw configuration set from a template. And you can use kw remote to set the IP address of your target machine and kw kernel-config-manager to fetch/retrieve the .config file from your target machine. Then just run kw bd to compile and install an upstream kernel for the Raspberry Pi 4.
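The environment juggling above, as commands (the environment names are the ones used in the talk):

```shell
# Switch the same worktree to the Raspberry Pi 4 workflow
kw env --use RPI_64

# Or start a fresh environment from a template
kw env --create RPI_64_V2
kw init --template

# Then compile and install the kernel in one shot
kw bd
```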

Third Recipe: The Mainline Kernel Ringing on my Steam Deck (Live Demo). Slide 30: Third Recipe - The Mainline Kernel Ringing on my Steam Deck. Let me show you how easy it is to build, install and test a custom kernel for the Steam Deck with Kworkflow. It's a live demo, but I also recorded it because I know the risks I'm exposed to and something can go very wrong just because of reasons :)

Report: how the live demo went. For this live demo, I took my OLED Steam Deck to the stage. I explained that, if I boot a mainline kernel on this device, there is no audio. So I turned it on and booted the mainline kernel I had installed beforehand. It was clear that there was no typical Steam Deck startup audio when the system was loaded. Frank's drawing of Melissa Wen doing a demo of kworkflow with the Steam Deck. As I started the demo in the kw environment for the Raspberry Pi 4, I first moved to another environment previously used for the Steam Deck. In this STEAMDECK environment, the mainline kernel was already compiled and cached, and all settings for accessing the target machine and for compiling and installing a custom kernel were retrieved automatically. My live demo followed these steps:
  1. With kw env --use STEAMDECK, switch to a kworkflow environment for Steam Deck kernel development.
  2. With kw b -i, show that kw will compile and install a kernel with 285 modules named 6.17.0-rc6-mainline-for-deck.
  3. Run kw config to show that, in this environment, the kw configuration changes to the x86 architecture, without cross-compilation.
  4. Run kw device to display information about the Steam Deck device, i.e. the target machine. It also proves that the remote access - user and IP - for this Steam Deck was already configured when using the STEAMDECK environment, as expected.
  5. Using git am, as usual, apply a hot fix on top of the mainline kernel. This hot fix makes the audio play again on Steam Deck.
  6. With kw b, build the kernel with the audio change. It will be fast because we are only compiling the affected files, since everything was previously done and cached. The compiled kernel, kw configuration and kernel configuration are retrieved just by moving to the STEAMDECK environment.
  7. Run kw d --force --reboot to deploy the new custom kernel to the target machine. The --force option enables us to install the mainline kernel even if mkinitcpio complains about missing support for downstream packages when generating the initramfs. The --reboot option reboots the Steam Deck automatically right after the deployment completes.
  8. After the deployment finishes, the Steam Deck reboots into the new custom kernel version and makes a clear resonant or vibrating sound. [Hopefully]
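The demo steps above as one command sequence; the patch file name is a hypothetical placeholder for the audio hot fix:

```shell
kw env --use STEAMDECK             # 1. switch to the Steam Deck environment
kw b -i                            # 2. check kernel name and module count
kw config                          # 3. inspect this environment's settings
kw device                          # 4. show target machine details
git am 0001-fix-deck-audio.patch   # 5. apply the (hypothetical) hot fix
kw b                               # 6. incremental rebuild
kw d --force --reboot              # 7. deploy and reboot into the new kernel
```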
Finally, I showed the audience that, if I wanted to send this patch upstream, I just needed to run kw send-patch, and kw would automatically add the subsystem maintainers, reviewers and mailing lists for the affected files as recipients and send the patch to the upstream community for assessment. As I didn't want to create unnecessary noise, I just did a dry run with kw send-patch -s --simulate to explain how it looks.

What else can kworkflow already mix & match? In this presentation, I showed that kworkflow supports different kernel development workflows, i.e. multiple distributions, different bootloaders and architectures, different target machines and different debugging tools, and automates the best practices of your kernel development routine, from development environment setup and verifying a custom kernel on bare metal to sending contributions upstream following the contributions-by-e-mail structure. I exemplified it with three different target machines: my ordinary x86 AMD laptop with Debian, a Raspberry Pi 4 with arm64 Raspbian (cross-compilation) and the Steam Deck with SteamOS (an x86 Arch-based OS). Besides those distributions, Kworkflow also supports Ubuntu, Fedora and PopOS. Now it's your turn: do you have any secret recipes to share? Please share them with us via kworkflow.

Birger Schacht: Status update, October 2025

At the beginning of the month I uploaded a new version of the sway package to Debian. This contains two backported patches, one to fix reported WM capabilities and one to revert the default behavior for drag_lock to disabled. I also uploaded new releases of cage (a kiosk for Wayland), labwc (the window-stacking Wayland compositor inspired by Openbox), and wf-recorder (a tool for creating screen recordings of wlroots-based Wayland compositors). If I don't forget, I try to update the watch file of the packages I touch to the new version 5 format. Simon Ser announced vali, a C library for Varlink. The blog post also mentions that this will be a dependency of the next version of the kanshi Wayland output management daemon, and the PR to do so is already merged. So I created ITP: vali - A Varlink C implementation and code generator, packaged the library, and it is now waiting in NEW. In addition to libscfg, this is now the second dependency of kanshi that is in NEW. On the Rust side of things I fixed a bug in carl. The fix introduces new date properties which can be used to highlight a calendar date. I also updated all the dependencies and plan to create a new release soon. Later I dug up a Rust project that I started a couple of years ago, where I try to use wasm-bindgen to implement interactive web components. There is a lot I have to refactor in this code base, but I will work on that and try to publish something in the next few months.

Miscellaneous. Two weeks ago I wrote A plea for <dialog>, which made the case for using standardized HTML elements instead of resorting to JavaScript libraries. I finally managed to update my shell server to Debian 13. I created an issue for the nextcloud-news Android client because I moved to a new phone and my starred articles did not show up in the news app, which is a bit annoying. I got my ticket for 39C3. In my day job I continued to work on the refactoring of the import logic of our apis-core-rdf app. I released version 0.56, which also introduced the #snackbar as the container for the toast message, as described in the <dialog> blog post. At the end of the month I released version 0.57 of apis-core-rdf, which got rid of the remaining leftovers of the old import logic. A couple of interesting articles I stumbled upon (or finally had the time to read):

2 November 2025

Guido G nther: Free Software Activities October 2025

Quite a few things made progress last month: we put out the Phosh 0.50 release, got closer to enabling media roles for audio by default in Phosh (see the related post), and reworked our image builds. You should also (hopefully) notice some nice quality-of-life improvements once the changes land in a distro near you, if you're using Phosh. See below for details: phosh phoc phosh-mobile-settings stevia (formerly phosh-osk-stub) phosh-tour meta-phosh xdg-desktop-portal-phosh libphosh-rs Calls Phrog phosh-recipes feedbackd feedbackd-device-themes Chatty Squeekboard Debian Cellbroadcastd gnome-settings-daemon gnome-control-center gnome-initial-setup sdm845-mainline gnome-session alpine droid-juicer phosh-site phosh-debs Linux Wireplumber Phosh.mobi e.V. demo-phones Reviews This is not code by me but reviews on other people's code. The list is (as usual) slightly incomplete. Thanks for the contributions! Comments? Join the Fediverse thread

31 October 2025

Emmanuel Kasper: Best Pick-up-and-play with a gamepad on Debian and other Linux distributions: SuperTux

After playing some 16-bit era classic games on my Mist FPGA, I was wondering what I could play on my Debian desktop as a semi-casual gamer. By semi-casual I mean that if a game needs more than 30 minutes to understand the mechanics, or needs 10 buttons on the gamepad, I usually drop it. After testing a dozen games available in the Debian archive, my favorite pick-up-and-play is SuperTux. SuperTux is a 2D platformer quite similar to Super Mario World or Sonic (also 16-bit classics), but of course you play a friendly penguin. What I like in SuperTux:
  • a completely free and open-source application packaged in the Debian main package repository, including all the game assets. So there is no fiddling around to get game data as with Quake / Doom3; everything is available in the Debian repositories. The game is also available from all major Linux distributions in their standard repositories.
  • the gamepad is immediately usable. The credit probably has to go to the SDL library, but my 8bitdo wireless controller was usable instantly, either via the 2.4 GHz dongle or Bluetooth
  • well suited for casual players: the game mechanics are easy to grasp and the tutorial is excellent
  • polished interface: the menus are clear and easy to navigate, and there is no internal jargon in the default navigation until you run your first game. (Something which confused me when playing the SuperTuxKart racing game: when I was offered to leave STK, I wondered what that STK mode was. I understood afterwards that STK is just the acronym of the game.)
  • it feels reasonably modern: the game does not start in a 640×480 window with 16 colors, and you could demo it without shame to a casual gamer audience.
What can be said of the game itself? You play a penguin who can run, shoot small fireballs, and fall on its back to hit enemies harder. I played 10 levels; most levels had to be tried between 1 and 10 times, which I find OK, and the difficulty rises in a very smooth curve. SuperTux has complete localization, hence my screenshots show French text. SuperTux tutorial: comprehensive in-game tutorial. World map: there is a large ice flow world, but we are going underground now. Example level: good level design that you have to use to avoid those spiky enemies. Underground level: the point where I had to pause the game, after missing those flying wigs 15 times in a row. SuperTux can be played with keyboard or gamepad, and has minimal hardware requirements; any computer with working 3D graphic acceleration released in the last 20 years will be able to run it.

28 October 2025

Russ Allbery: Review: Those Who Wait

Review: Those Who Wait, by Haley Cass
Publisher: Haley Cass
Copyright: 2020
ISBN: 979-8-9884929-1-7
Format: Kindle
Pages: 556
Those Who Wait is a stand-alone self-published sapphic romance novel. Given the lack of connection between political figures named in this book and our reality, it's also technically an alternate history, but it will be entirely unsatisfying to anyone who reads it in that genre. Sutton Spencer is an English grad student in New York City. As the story opens, she has recently realized that she's bisexual rather than straight. She certainly has not done anything about that revelation; the very thought makes her blush. Her friend and roommate Regan, not known for either her patience or her impulse control, decides to force the issue by stealing Sutton's phone, creating a profile on a lesbian dating app, and messaging the first woman Sutton admits being attracted to. Charlotte Thompson is a highly ambitious politician, current deputy mayor of New York City for health and human services, and granddaughter of the first female president of the United States. She fully intends to become president of the United States herself. The next step on that path is an open special election for a seat in the House of Representatives. With her family political connections and the firm support of the mayor of New York City (who is also dating her brother), she thinks she has an excellent shot of winning. Charlotte is also a lesbian, something she's known since she was a teenager and which still poses serious problems for a political career. She is therefore out to her family and a few close friends, but otherwise in the closet. Compared to her political ambitions, Charlotte considers her love life almost irrelevant, and therefore has a strict policy of limiting herself to anonymous one-night stands arranged on dating apps. Even that is about to become impossible given her upcoming campaign, but she indulges in one last glance at SapphicSpark before she deletes her account. 
Sutton is as far as possible from the sort of person who does one-night stands, which is a shame as far as Charlotte is concerned. It would have been a fun last night out. Despite that, both of them find the other unexpectedly enjoyable to chat with. (There are a lot of text message bubbles in this book.) This is when Sutton has her brilliant idea: Charlotte is charming, experienced, and also kind and understanding of Sutton's anxiety, at least in app messages. Maybe Charlotte can be her mentor? Tell her how to approach women, give her some guidance, point her in the right directions. Given the genre, you can guess how this (eventually) turns out. I'm going to say a lot of good things about this book, so let me get the complaints over with first. As you might guess from that introduction, Charlotte's political career and the danger of being outed are central to this story. This is a bit unfortunate because you should not, under any circumstances, attempt to think deeply about the politics in this book. In 550 pages, Charlotte does not mention or expound a single meaningful political position. You come away from this book as ignorant about what Charlotte wants to accomplish as a politician as you entered. Apparently she wants to be president because her grandmother was president and she thinks she'd be good at it. The closest the story comes to a position is something unbelievably vague about homeless services and Charlotte's internal assertion that she wants to help people and make real change. There are even transcripts of media interviews, later in the book, and they somehow manage to be more vacuous than US political talk shows, which is saying something. I also can't remember a single mention of fundraising anywhere in this book, which in US politics is absurd (although I will be generous and say this is due to Cass's alternate history). 
I assume this was a deliberate choice and Cass didn't want politics to distract from the romance, but as someone with a lot of opinions about concrete political issues, the resulting vague soft-liberal squishiness was actively off-putting. In an actual politician, this would be an entire clothesline of red flags. Thankfully, it's ignorable for the same reason; this is so obviously not the focus of the book that one can mostly perform the same sort of mental trick that one does when ignoring the backdrop in a cheap theater. My second complaint is that I don't know what Sutton does outside of the romance. Yes, she's an English grad student, and she does some grading and some vaguely-described work and is later referred to a prestigious internship, but this is as devoid of detail as Charlotte's political positions. It's not quite as jarring because Cass does eventually show Sutton helping concretely with her mother's work (about which I have some other issues that I won't get into), but it deprives Sutton of an opportunity to be visibly expert in something. The romance setup casts Charlotte as the experienced one to Sutton's naivete, and I think it would have been a better balance to give Sutton something concrete and tangible that she was clearly better at than Charlotte. Those complaints aside, I quite enjoyed this. It was a recommendation from the same BookTuber who recommended Delilah Green Doesn't Care, so her recommendations are quickly accumulating more weight. The chemistry between Sutton and Charlotte is quite believable; the dialogue sparkles, the descriptions of the subtle cues they pick up from each other are excellent, and it's just fun to read about how they navigate a whole lot of small (and sometimes large) misunderstandings and mismatches in personality and world view. 
Normally, misunderstandings are my least favorite part of a romance novel, but Sutton and Charlotte come from such different perspectives that their misunderstandings feel more justified than is typical. The characters are also fairly mature about working through them: Main characters who track the other character down and insist on talking when something happens they don't understand! Can you imagine! Only with the third-act breakup is the reader dragged through multiple chapters of both characters being miserable, and while I also usually hate third-act breakups, this one is so obviously coming and so clearly advertised from the initial setup that I couldn't really be mad. I did wish the payoff make-up scene at the end of the book had a bit more oomph, though; I thought Sutton's side of it didn't have quite the emotional catharsis that it could have had. I particularly enjoyed the reasons why the two characters fall in love, and how different they are. Charlotte is delighted by Sutton because she's awkward and shy but also straightforward and frequently surprisingly blunt, which fits perfectly with how much Charlotte is otherwise living in a world of polished politicians in constant control of their personas. Sutton's perspective is more physical, but the part I liked was the way that she treats Charlotte like a puzzle. Rather than trying to change how Charlotte expresses herself, she instead discovers that she's remarkably good at reading Charlotte if she trusts her instincts. There was something about Sutton's growing perceptiveness that I found quietly delightful. It's the sort of non-sexual intimacy that often gets lost among the big emotions in romance novels. The supporting cast was also great. Both characters have deep support networks of friends and family who are unambiguously on their side. Regan is pure chaos, and I would not be friends with her, but Cass shows her deep loyalty in a way that makes her dynamic with Sutton make sense. 
Both characters have thoughtful and loving families who support them but don't make decisions for them, which is a nice change of pace from the usually more mixed family situations of romance novel protagonists. There's a lot of emotional turbulence in the main relationship, and I think that only worked for me because of how rock-solid and kind the supporting cast is. This is, as you might guess from the title, a very slow burn, although the slow burn is for the emotional relationship rather than the physical one (for reasons that would be spoilers). As usual, I have no calibration for spiciness level, but I'd say that this was roughly on par with the later books in the Bright Falls series. If you know something about politics (or political history) and try to take that part of this book seriously, it will drive you to drink, but if you can put that aside and can deal with misunderstandings and emotional turmoil, this was both fun and satisfying. I liked both of the characters, I liked the timing of the alternating viewpoints, and I believed in the relationship and chemistry, as improbable and chaotic as some of the setup was. It's not the greatest thing I ever read, and I wish the ending was a smidgen stronger, but it was an enjoyable way to spend a few reading days. Recommended. Rating: 7 out of 10

27 October 2025

Dirk Eddelbuettel: #054: Faster r-ci Continuous Integration via r2u Container

Welcome to post 54 in the R4 series. The topic of continuous integration has been a recurrent theme here in the R4 series. Post #32 introduces r-ci, while post #41 brings r2u to r-ci but does not show a matrix deployment. Post #45 describes the updated r-ci setup that is now the default and contains a macOS and Ubuntu matrix, where the latter relies on r2u to keep things "fast, easy, reliable". Last but not least, the more recent post #52 shares a trick for ensuring coverage reports. Following #45, r-ci has seen steady use and very reliable performance at, for example, GitHub Actions. With the standard setup, a vanilla Ubuntu setup is changed into one supported by r2u. This requires downloading and installing a few Ubuntu packages, and has generally been fairly quick, on the order of around forty seconds. Now, the general variability of run-times for identical tasks in GitHub Actions is well documented by the results of the setup described in post #39, which still runs weekly. It runs the identical SQL query against a remote backend using two different package families. And lo and behold, the intra-method variability on unchanged code or setup, and therefore due solely to system variability, is about as large as the inter-method variability. In short, GitHub Actions performance varies randomly with significant variability. See the repo README.md for a chart that updates weekly (and see #39 for background). Of late, this variability became more noticeable during standard GitHub Actions runs, where it would regularly take more than two minutes of setup time before actual continuous integration work was done. Some caching seems to be in effect, so subsequent runs in the same repo seem faster and often came back to one minute or less. For lightweight and small packages, losing two minutes to setup when the actual test time is a minute or less gets old fast. Looking around, we noticed that container use can be combined with matrix use.
So we have now been deploying the following setup (not always over all the matrix elements though)
jobs:
  ci:
    strategy:
      matrix:
        include:
          - { name: container, os: ubuntu-latest, container: rocker/r2u4ci }
          - { name: macos,     os: macos-latest }
          - { name: ubuntu,    os: ubuntu-latest }

    runs-on: ${{ matrix.os }}
    container: ${{ matrix.container }}
GitHub Actions is smart enough to provide NULL for container in the two other cases, so container: ${{ matrix.container }} is ignored there. But when container is set, as here for the ci-enhanced version of r2u (which adds a few binaries commonly needed for CI such as git, curl, wget etc.), the CI job runs inside the container, and thereby skips most of the setup time as the container is already prepared. This also required some small adjustments in the underlying shell script doing the work. To not disrupt standard deployment, we placed these into a release candidate / development version one can opt into via a new variable dev_version:
      - name: Setup
        uses: eddelbuettel/github-actions/r-ci@master
        with:
          dev_version: 'TRUE'
Everything else remains the same and works as before, but faster, as much less time is spent on setup. You can see the actual full YAML files and actions in my repositories for rcpparmadillo and rcppmlpack-examples. Additional testing would be welcome, so feel free to deploy this in your actions now. Otherwise I will likely carry this over and make it the default in a few weeks' time. It will still work as before, but when the added container: line is used it will run much faster thanks to rocker/r2u4ci being already set up for CI.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub.

Russ Allbery: Review: On Vicious Worlds

Review: On Vicious Worlds, by Bethany Jacobs
Series: Kindom Trilogy #2
Publisher: Orbit
Copyright: October 2024
ISBN: 0-316-46362-0
Format: Kindle
Pages: 444
On Vicious Worlds is a science fiction thriller with bits of cyberpunk and a direct sequel to These Burning Stars. This is one of those series where each book has massive spoilers for the previous book and builds on characters and situations from that book. I would not read it out of order. It is Bethany Jacobs's second novel. Whooboy, how to review this without spoilers. There are so many major twists in the first book with lingering consequences that it's nearly impossible. I said at the end of my review of These Burning Stars that I was impressed with the ending for reasons that I can't reveal. One thread of this book follows the aftermath: What do you do after the plan? If you have honed yourself for one purpose, can you repurpose yourself? The other thread of the book is a murder mystery. The protectors of the community are being picked off, one by one. The culprit might be a hacker so good that they are causing Jun, the expert hacker of the first book, serious problems. Meanwhile, the political fault lines of the community are cracking open under pressure, and the leaders are untested, exhausted, and navigating difficult emotional terrain. These two story threads alternate, and interspersed are yet more flashbacks. As with the first book, the flashbacks fill in the backstory of Chono and Esek. This time, though, we get Six's viewpoint. The good news is that On Vicious Worlds tones down the sociopathy considerably without letting up on the political twists. This is the book where Chono comes into her own. She has much more freedom of action, despite being at the center of complicated and cut-throat politics, and I thoroughly enjoyed her principled solidity. She gets a chance to transcend her previous role as an abuse victim, and it's worth the wait. The bad news is that this is very much a middle book of a trilogy.
While there are a lot of bloody battles, emotional drama, political betrayals, and plot twists, the series plot has not advanced much by the end of the book. I would not say the characters were left in the same position they started (the character development is real and the perils have changed), but neither would I say that any of the open questions from These Burning Stars have been resolved. The last book I read used science-fiction world-building to tell a story about moral philosophy that was somewhat less drama-filled than one might have expected. That is so not the case here. On Vicious Worlds is, if anything, even more dramatic than the first book of the series. In Chono's thread, the slow burn attempt to understand Six's motives has been replaced with almost non-stop melodrama, full of betrayals, reversals, risky attempts, and emotional roller coasters. Jun's part of the story is a bit more sedate at first, but there too the interpersonal drama setting is headed towards 10. This is the novel equivalent of an action movie. Jun, and her part of the story, are fine. I like the new viewpoint character, I find their system of governance somewhat interesting (although highly optimized for small groups), and I think the climax worked. But I'm invested in this series for Chono and Six. Both of them, but particularly Six, are absurdly over the top, ten people's worth of drama stuffed into one character, unable to communicate in anything less than dramatic gestures and absurd plans, but I find them magnetically fascinating. I'm not sure if written characters can have charisma, but if so, they have it. I liked this entry in the series, but then I also liked the first book. It's trauma-filled and dramatic and involved a bit too much bloody maiming for my tastes, but this whole series is about revolutions and what happens when you decide to fight, and sometimes I'm in the mood for complicated and damaged action heroes who loathe oppression and want to kill some people.
This is the sort of series book that will neither be the reason you read the series nor the reason why you stop reading. If you enjoyed These Burning Stars, this is more of the same, with arguably better character development but less plot catharsis. If you didn't like These Burning Stars, this probably won't change your mind, although if you hated it specifically because of Esek's sociopathy, I think you would find this book more congenial. But maybe not; Jacobs is still the same author, and most of the characters in this series are made of sharp edges. I'm still in; I have already pre-ordered the next book. Followed by This Brutal Moon, due out in December of 2025 and advertised as the conclusion. Rating: 7 out of 10

23 October 2025

Russ Allbery: Review: Politics on the Edge

Review: Politics on the Edge, by Rory Stewart
Publisher: Penguin Books
Copyright: 2023, 2025
Printing: 2025
ISBN: 979-8-217-06167-9
Format: Kindle
Pages: 429
Rory Stewart is a former British diplomat, non-profit executive, member of Parliament, and cabinet minister. Politics on the Edge is a memoir of his time in the UK Parliament from 2010 to 2019 as a Tory (Conservative) representing the Penrith and The Border constituency in northern England. It ends with his failed run against Boris Johnson for leader of the Conservative Party and Prime Minister. This book provoked many thoughts, only some of which are about the book. You may want to get a beverage; this review will be long. Since this is a memoir told in chronological order, a timeline may be useful. After Stewart's time as a regional governor in occupied Iraq (see The Prince of the Marshes), he moved to Kabul to found and run an NGO to preserve traditional Afghani arts and buildings (the Turquoise Mountain Foundation, about which I know nothing except what Stewart wrote in this book). By his telling, he found that work deeply rewarding but thought the same politicians who turned Iraq into a mess were going to do the same to Afghanistan. He started looking for ways to influence the politics more directly, which led him first to Harvard and then to stand for Parliament. The bulk of this book covers Stewart's time as MP for Penrith and The Border. The choice of constituency struck me as symbolic of Stewart's entire career: He was not a resident and had no real connection to the district, which he chose for political reasons and because it was the nearest viable constituency to his actual home in Scotland. But once he decided to run, he moved to the district and seems sincerely earnest in his desire to understand it and become part of its community. After five years as a backbencher, he joined David Cameron's government in a minor role as Minister of State in the Department for Environment, Food, and Rural Affairs.
He then bounced through several minor cabinet positions (more on this later) before being elevated to Secretary of State for International Development under Theresa May. When May's government collapsed during the fight over the Brexit agreement, he launched a quixotic challenge to Boris Johnson for leader of the Conservative Party. I have enjoyed Rory Stewart's writing ever since The Places in Between. This book is no exception. Whatever one's other feelings about Stewart's politics (about which I'll have a great deal more to say), he's a talented memoir writer with an understated and contemplative style and a deft ability to shift from concrete description to philosophical debate without bogging down a story. Politics on the Edge is compelling reading at the prose level. I spent several afternoons happily engrossed in this book and had great difficulty putting it down. I find Stewart intriguing since, despite being a political conservative, he's neither a neoliberal nor any part of the new right. He is instead an apparently-sincere throwback to a conservatism based on epistemic humility, a veneration of rural life and long-standing traditions, and a deep commitment to the concept of public service. Some of his principles are baffling to me, and I think some of his political views are obvious nonsense, but there were several things that struck me throughout this book that I found admirable and depressingly rare in politics. First, Stewart seems to learn from his mistakes. This goes beyond admitting when he was wrong and appears to include a willingness to rethink entire philosophical positions based on new experience.
I had entered Iraq supporting the war on the grounds that we could at least produce a better society than Saddam Hussein's. It was one of the greatest mistakes in my life. We attempted to impose programmes made up by Washington think tanks, and reheated in air-conditioned palaces in Baghdad: a new taxation system modelled on Hong Kong; a system of ministers borrowed from Singapore; and free ports, modelled on Dubai. But we did it ultimately at the point of a gun, and our resources, our abstract jargon and optimistic platitudes could not conceal how much Iraqis resented us, how much we were failing, and how humiliating and degrading our work had become. Our mission was a grotesque satire of every liberal aspiration for peace, growth and democracy.
This quote comes from the beginning of this book and is a sentiment Stewart already expressed in The Prince of the Marshes, but he appears to have taken this so seriously that it becomes a theme of his political career. He not only realized how wrong he was on Iraq, he abandoned the entire neoliberal nation-building project without abandoning his belief in the moral obligation of international aid. And he, I think correctly, identified a key source of the error: an ignorant, condescending superiority that dismissed the importance of deep expertise.
Neither they, nor indeed any of the 12,000 peacekeepers and policemen who had been posted to South Sudan from sixty nations, had spent a single night in a rural house, or could complete a sentence in Dinka, Nuer, Azande or Bande. And the international development strategy written jointly between the donor nations resembled a fading mission statement found in a new space colony, whose occupants had all been killed in an alien attack.
Second, Stewart sincerely likes ordinary people. This shone through The Places in Between and recurs here in his descriptions of his constituents. He has a profound appreciation for individual people who have spent their life learning some trade or skill, expresses thoughtful and observant appreciation for aspects of local culture, and appears to deeply appreciate time spent around people from wildly different social classes and cultures than his own. Every successful politician can at least fake gregariousness, and perhaps that's all Stewart is doing, but there is something specific and attentive about his descriptions of other people, including long before he decided to enter politics, that makes me think it goes deeper than political savvy. Third, Stewart has a visceral hatred of incompetence. I think this is the strongest through-line of his politics in this book: Jobs in government are serious, important work; they should be done competently and well; and if one is not capable of doing that, one should not be in government. Stewart himself strikes me as an insecure overachiever: fiercely ambitious, self-critical, a bit of a micromanager (I suspect he would be difficult to work for), but holding himself to high standards and appalled when others do not do the same. This book is scathing towards multiple politicians, particularly Boris Johnson whom Stewart clearly despises, but no one comes off worse than Liz Truss.
David Cameron, I was beginning to realise, had put in charge of environment, food and rural affairs a Secretary of State who openly rejected the idea of rural affairs and who had little interest in landscape, farmers or the environment. I was beginning to wonder whether he could have given her any role she was less suited to, apart perhaps from making her Foreign Secretary. Still, I could also sense why Cameron was mesmerised by her. Her genius lay in exaggerated simplicity. Governing might be about critical thinking; but the new style of politics, of which she was a leading exponent, was not. If critical thinking required humility, this politics demanded absolute confidence: in place of reality, it offered untethered hope; instead of accuracy, vagueness. While critical thinking required scepticism, open-mindedness and an instinct for complexity, the new politics demanded loyalty, partisanship and slogans: not truth and reason but power and manipulation. If Liz Truss worried about the consequences of any of this for the way that government would work, she didn't reveal it.
And finally, Stewart has a deeply-held belief in state capacity and capability. He and I may disagree on the appropriate size and role of the government in society, but no one would be more disgusted by an intentional project to cripple government in order to shrink it than Stewart. One of his most-repeated criticisms of the UK political system in this book is the way the cabinet is formed. All ministers and secretaries come from members of Parliament and therefore branches of government are led by people with no relevant expertise. This is made worse by constant cabinet reshuffles that invalidate whatever small amounts of knowledge a minister was able to gain in nine months or a year in post. The center portion of this book records Stewart's time being shuffled from rural affairs to international development to Africa to prisons, with each move representing a complete reset of the political office and no transfer of knowledge whatsoever.
A month earlier, they had been anticipating every nuance of Minister Rogerson's diary, supporting him on shifts twenty-four hours a day, seven days a week. But it was already clear that there would be no pretence of a handover: no explanation of my predecessor's strategy, and uncompleted initiatives. The arrival of a new minister was Groundhog Day. Dan Rogerson was not a ghost haunting my office, he was an absence, whose former existence was suggested only by the black plastic comb.
After each reshuffle, Stewart writes of trying to absorb briefings, do research, and learn enough about his new responsibilities to have the hope of making good decisions, while growing increasingly frustrated with the system and the lack of interest by most of his colleagues in doing the same. He wants government programs to be successful and believes success requires expertise and careful management by the politicians, not only by the civil servants, a position that to me both feels obviously correct and entirely at odds with politics as currently practiced. I found this a fascinating book to read during the accelerating collapse of neoliberalism in the US and, to judge by current polling results, the UK. I have a theory that the political press are so devoted to a simplistic left-right political axis based on seating arrangements during the French Revolution that they are missing a significant minority whose primary political motivation is contempt for arrogant incompetence. They could be convinced to vote for Sanders or Trump, for Polanski or Farage, but will never vote for Biden, Starmer, Romney, or Sunak. Such voters are incomprehensible to those who closely follow and debate policies because their hostile reaction to the center is not about policies. It's about lack of trust and a nebulous desire for justice. They've been promised technocratic competence and the invisible hand of market forces for most of their lives, and all of it looks like lies. Everyday living is more precarious, more frustrating, more abusive and dehumanizing, and more anxious, despite (or because of) this wholehearted embrace of economic "freedom." They're sick of every complaint about the increasing difficulty of life being met with accusations about their ability and work ethic, and of being forced to endure another round of austerity by people who then catch a helicopter ride to a party on some billionaire's yacht. 
Some of this is inherent in the deep structural weaknesses in neoliberal ideology, but this is worse than an ideological failure. The degree to which neoliberalism started as a project of sincere political thinkers is arguable, but that is clearly not true today. The elite class in politics and business is now thoroughly captured by people whose primary skill is the marginal manipulation of complex systems for their own power and benefit. They are less libertarian ideologues than narcissistic mediocrities. We are governed by management consultants. They are firmly convinced their organizational expertise is universal, and consider the specific business of the company, or government department, irrelevant. Given that context, I found Stewart's instinctive revulsion towards David Cameron quite revealing. Stewart, later in the book, tries to give Cameron some credit by citing several policy accomplishments and comparing him favorably to Boris Johnson (which, true, is a bar Cameron probably flops over). But I think Stewart's baffled astonishment at Cameron's vapidity says a great deal about how we have ended up where we are. This last quote is long, but I think it provides a good feel for Stewart's argument in this book.
But Cameron, who was rumoured to be sceptical about nation-building projects, only nodded, and then looking confidently up and down the table said, "Well, at least we all agree on one extremely straightforward and simple point, which is that our troops are doing very difficult and important work and we should all support them." It was an odd statement to make to civilians running humanitarian operations on the ground. I felt I should speak. "No, with respect, we do not agree with that. Insofar as we have focused on the troops, we have just been explaining that what the troops are doing is often futile, and in many cases making things worse." Two small red dots appeared on his cheeks. Then his face formed back into a smile. He thanked us, told us he was out of time, shook all our hands, and left the room. Later, I saw him repeat the same line in interviews: "the purpose of this visit is straightforward... it is to show support for what our troops are doing in Afghanistan". The line had been written, in London, I assumed, and tested on focus groups. But he wanted to convince himself it was also a position of principle. "David has decided," one of his aides explained, when I met him later, "that one cannot criticise a war when there are troops on the ground." "Why?" "Well... we have had that debate. But he feels it is a principle of British government." "But Churchill criticised the conduct of the Boer War; Pitt the war with America. Why can't he criticise wars?" "British soldiers are losing their lives in this war, and we can't suggest they have died in vain." "But more will die, if no one speaks up..." "It is a principle thing. And he has made his decision. For him and the party." "Does this apply to Iraq too?" "Yes. Again he understands what you are saying, but he voted to support the Iraq War, and troops are on the ground." "But surely he can say he's changed his mind?" The aide didn't answer, but instead concentrated on his food. 
"It is so difficult," he resumed, "to get any coverage of our trip." He paused again. "If David writes a column about Afghanistan, we will struggle to get it published." "But what would he say in an article anyway?" I asked. "We can talk about that later. But how do you get your articles on Afghanistan published?" I remembered how the US politicians and officials had shown their mastery of strategy and detail. I remembered the earnestness of Gordon Brown when I had briefed him on Iraq. Cameron seemed somehow less serious. I wrote as much in a column in the New York Times, saying that I was afraid the party of Churchill was becoming the party of Bertie Wooster.
I don't know Stewart's reputation in Britain, or in the constituency that he represented. I know he's been accused of being a self-aggrandizing publicity hound, and to some extent this is probably true. It's hard to find an ambitious politician who does not have that instinct. But whatever Stewart's flaws, he can, at least, defend his politics with more substance than a corporate motto. One gets the impression that he would respond favorably to demonstrated competence linked to a careful argument, even if he disagreed. Perhaps this is an illusion created by his writing, but even if so, it's a step in the right direction. When people become angry enough at a failing status quo, any option that promises radical change and punishment for the current incompetents will sound appealing. The default collapse is towards demagogues who are skilled at expressing anger and disgust and are willing to promise simple cures because they are indifferent to honesty. Much of the political establishment in the US, and possibly (to the small degree that I can analyze it from an occasional news article) in the UK, can identify the peril of the demagogue, but they have no solution other than a return to "politics as usual," represented by the amoral mediocrity of a McKinsey consultant. The rare politicians who seem to believe in something, who will argue for personal expertise and humility, who are disgusted by incompetence and have no patience for facile platitudes, are a breath of fresh air. There are a lot of policies on which Stewart and I would disagree, and perhaps some of his apparent humility is an affectation from the rhetorical world of the 1800s that he clearly wishes he were inhabiting, but he gives the strong impression of someone who would shoulder a responsibility and attempt to execute it with competence and attention to detail. He views government as a job, where coworkers should cooperate to achieve defined goals, rather than a reality TV show. 
The arc of this book, like the arc of current politics, is the victory of the reality TV show over the workplace, and the story of Stewart's run against Boris Johnson is hard reading because of it, but there's a portrayal here of a different attitude towards politics that I found deeply rewarding. If you liked Stewart's previous work, or if you want an inside look at parliamentary politics, highly recommended. I will be thinking about this book for a long time. Rating: 9 out of 10

21 October 2025

Gunnar Wolf: LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation

This post is a review for Computing Reviews for "LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation", an article published in Proceedings of the ACM on Software Engineering, Volume 2, Issue ISSTA.
How good can large language models (LLMs) be at generating code? This may not seem like a very novel question, as several benchmarks (for example, HumanEval and MBPP, published in 2021) existed before LLMs burst into public view and started the current artificial intelligence (AI) inflation. However, as the paper's authors point out, code generation is very seldom done as an isolated function, but instead must be deployed in a coherent fashion together with the rest of the project or repository it is meant to be integrated into. Today, several benchmarks (for example, CoderEval or EvoCodeBench) measure the functional correctness of LLM-generated code via test case pass rates. This paper brings a new proposal to the table: evaluating LLM-generated repository-level code by examining the hallucinations it contains. The authors begin by running the Python code generation tasks proposed in the CoderEval benchmark against six code-generating LLMs. Next, they analyze the results and build a taxonomy to describe code-based LLM hallucinations, with three types of conflicts (task requirement, factual knowledge, and project context) as first-level categories and eight subcategories within them. The authors then compare the results of each of the LLMs per the main hallucination category. Finally, they try to find the root cause for the hallucinations. The paper is structured very clearly, not only presenting the three research questions (RQ) but also referring to them as needed to explain why and how each partial result is interpreted. RQ1 (establishing a hallucination taxonomy) is the most thoroughly explored. While RQ2 (LLM comparison) is clear, it just presents straightforward results without much analysis. RQ3 (root cause discussion) is undoubtedly interesting, but I feel it is much more speculative and not directly related to the analysis performed. After tackling their research questions, Zhang et al.
propose a possible mitigation to counter the effect of hallucinations: enhance the LLM with retrieval-augmented generation (RAG) so it better understands task requirements, factual knowledge, and project context. The presented results show that all of the models are clearly (though modestly) improved by the proposed RAG-based mitigation. The paper is clearly written and easy to read. It should provide its target audience with interesting insights and discussions. I would have liked more details on their RAG implementation, but I suppose that's for a follow-up work.

Emmanuel Kasper: How configuration is passed from the BinderHub helm chart to a running BinderHub

Context: At $WORK I am doing a lot of data science work around Jupyter Notebooks and their ecosystem. Right now I am setting up BinderHub, which is a service to start a Jupyter Notebook from a git repo in your browser. For setting up BinderHub I am using the BinderHub helm chart, and I was wondering how configuration changes are propagated from the BinderHub helm chart to the process running in a Kubernetes Pod. After going through this I can say I am not right now a great fan of Helm, as it looks to me like an unnecessary, overengineered abstraction layer on top of Kubernetes manifests. Or maybe it is just that I don't want to learn the golang templating syntax. I am looking forward to testing Kustomize as an alternative, but I haven't had the chance yet. Starting from the list of config parameters available: although many parameters are mentioned in the installer document, you have to go to the developer doc at https://binderhub.readthedocs.io/en/latest/reference/ref-index.html to get a whole overview. In my case I want to set the hostname parameter for the GitLab RepoProvider. This is the relevant snippet in the developer doc:
hostname
    c.GitLabRepoProvider.hostname = Unicode('gitlab.com')
    The host of the GitLab instance
The string c.GitLabRepoProvider.hostname here means that the value of the hostname parameter will be loaded at the path config.GitLabRepoProvider inside a configuration file. Using the YAML syntax, this means the configuration file should contain a snippet like:
config:
  GitLabRepoProvider:
    hostname: my-domain.com
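The mapping from a dotted traitlets-style path like c.GitLabRepoProvider.hostname to that nested YAML can be sketched in a few lines of Python. This is a hypothetical illustration of the structural correspondence, not BinderHub's actual configuration loader:

```python
# Hypothetical sketch: turn a dotted traitlets-style path such as
# "GitLabRepoProvider.hostname" into the nested mapping expected
# under the helm chart's top-level "config" key.
def dotted_to_nested(dotted_path, value):
    nested = value
    # Build the dict from the innermost key outwards.
    for key in reversed(dotted_path.split(".")):
        nested = {key: nested}
    return {"config": nested}

print(dotted_to_nested("GitLabRepoProvider.hostname", "my-domain.com"))
# {'config': {'GitLabRepoProvider': {'hostname': 'my-domain.com'}}}
```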
Digging through Kubernetes constructs: Helm values files. When installing BinderHub using the provided helm chart, we can either put the configuration snippet in the config.yaml or secret.yaml helm values files. In my case I have put the snippet in config.yaml, since the hostname is not a secret. I can verify with yq that it is correctly set:
$ yq --raw-output '.config.GitLabRepoProvider.hostname' config.yaml
my-domain.com
How do we make sure this parameter is properly applied to our running binder processes? As said previously, this parameter is passed in a values file to helm (--values or -f option) in the command:
$ helm upgrade \                                                                                  
    binderhub \                                                                                     
    jupyterhub/binderhub \                                                                          
    --install \                                                                                     
    --version=$(RELEASE) \                                                                          
    --create-namespace \                                                                            
    --namespace=binderhub \                                                                         
    --values secret.yaml \                                                                                
    --values config.yaml \                                                                                
    --debug 
According to the helm documentation at https://helm.sh/docs/helm/helm_install/, the values files are merged into a single object, and priority is given to the last (right-most) file specified. For example, if both myvalues.yaml and override.yaml contained a key called 'Test', the value set in override.yaml would take precedence:
$ helm install --values myvalues.yaml --values override.yaml  myredis ./redis
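That right-most-wins precedence rule can be illustrated with a small Python sketch. Helm itself is written in Go; this only mimics the documented merge behaviour (nested maps merged recursively, later files overriding earlier ones):

```python
# Illustration only: mimic Helm's documented values-file merging,
# where files are merged left to right and later files win.
def merge_values(base, override):
    result = dict(base)
    for key, value in override.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            # Nested maps are merged recursively rather than replaced.
            result[key] = merge_values(result[key], value)
        else:
            result[key] = value
    return result

myvalues = {"Test": "from-myvalues", "config": {"keep": True}}
override = {"Test": "from-override"}
print(merge_values(myvalues, override))
# {'Test': 'from-override', 'config': {'keep': True}}
```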
Digging through Kubernetes constructs: Secrets and Volumes. When helm upgrade is run, the helm values under the config key are stashed in a Kubernetes Secret named binder-secret: https://github.com/jupyterhub/binderhub/blob/main/helm-chart/binderhub/templates/secret.yaml#L12
stringData:
  {{- /*
    Stash away relevant Helm template values for
    the BinderHub Python application to read from
    in binderhub_config.py.
  */}}
  values.yaml: |
    {{- pick .Values "config" "imageBuilderType" "cors" "dind" "pink" "extraConfig" | toYaml | nindent 4 }}
We can verify that our hostname is passed to our Secret:
$ kubectl get secret binder-secret -o yaml | yq --raw-output '.data."values.yaml"' | base64 --decode
...
  GitLabRepoProvider:
    hostname: my-domain.com
...
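The base64 --decode step is needed because Kubernetes stores every value under a Secret's data field base64-encoded. The equivalent round trip in Python (stdlib only, with a made-up sample payload) looks like:

```python
import base64

# Kubernetes base64-encodes every value under a Secret's "data" field;
# decoding recovers the original YAML text stashed by the Helm template.
# The payload here is a made-up sample, not pulled from a real cluster.
encoded = base64.b64encode(b"GitLabRepoProvider:\n  hostname: my-domain.com\n")
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)
```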
Finally a configuration file inside the Binder pod is populated from the Secret, using the Kubernetes Volume construct. Looking at the Pod, we do see a volume called config, created from the binder-secret Secret:
$ kubectl get pod -l component=binder -o yaml | grep --context 4 binder-secret
    volumes:
    - name: config
      secret:
        defaultMode: 420
        secretName: binder-secret
That volume is mounted inside the pod at /etc/binderhub/config:
      volumeMounts:
      - mountPath: /etc/binderhub/config/
        name: config
        readOnly: true
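A small aside on the defaultMode: 420 in the volume spec above: Kubernetes manifests express file modes in decimal (JSON has no octal literals), and 420 decimal is simply the familiar octal mode 0644 (rw-r--r--). A one-liner confirms it:

```python
# defaultMode in a Kubernetes volume spec is given in decimal because
# JSON has no octal literals; 420 decimal equals 0644 octal (rw-r--r--).
assert 420 == 0o644
print(oct(420))  # 0o644
```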
Runtime verification Looking inside our pod we see our hostname value available in a file underneath the mount point:
oc exec binder-74d9c7db95-qtp8r -- grep hostname /etc/binderhub/config/values.yaml
    hostname: my-domain.com

Russ Allbery: Review: Space Trucker Jess

Review: Space Trucker Jess, by Matthew Kressel
Publisher: Fairwood Press
Copyright: July 2025
ISBN: 1-958880-27-2
Format: Kindle
Pages: 472
Space Trucker Jess is a stand-alone far-future space fantasy novel. Jess is a sixteen-year-old mechanic working grey-market jobs on Chadeisson Station with a couple of younger kids. She's there because her charming and utterly unreliable father got caught running a crypto scam and is sitting in detention. This was only the latest in a long series of scams, con jobs, and misadventures she's been dragged through since her mother disappeared without a word. Jess is cynical, world-weary, and infuriated by her own sputtering loyalty to her good-for-nothing dad. What Jess wants most in the universe is to own a CCM 6454 Spark Megahauler, the absolute best cargo ship in the universe according to Jess. She should know; she's worked on nearly every type of ship in existence. With her own ship, she could make a living hauling cargo, repairing her own ship, and going anywhere she wants, free of her father and his endless schemes. (A romantic relationship with her friend Leurie would be a nice bonus.) Then her father is taken off the station on a ship leaving the galactic plane, no one will tell her why, and all the records of the ship appear to have been erased. Jess thinks her father is an asshole, but that doesn't mean she can sit idly by when he disappears. That's how she ends up getting in serious trouble with station security due to some risky in-person sleuthing, followed by an expensive flight off the station with a dodgy guy and a kid in a stolen spaceship. The setup for this book was so great. Kressel felt the need to make up a futuristic slang for Jess and her friends to speak, which rarely works as well as the author expects and does not work here, but apart from that I was hooked. Jess is sarcastic, blustery, and a bit of a con artist herself, but with the idealistic sincerity of someone who knows that her life has been kind of broken and understands the value of friends.
She's profoundly cynical in the heartbreakingly defensive way of a sixteen-year-old with a rough life. I have a soft spot in my heart for working-class science fiction (there isn't nearly enough of it), and there are few things I enjoy more than reading about the kind of protagonist who has Opinions about starship models and a dislike of shoddy work. I think this is the only book I've bought solely on the basis of one of the Big Idea blog posts John Scalzi hosts. I really wish this book had stuck with the setup instead of morphing into a weird drug-enabled mystical space fantasy, to which Jess's family is bizarrely central. SPOILERS below because I can't figure out how to rant about what annoyed me without them. Search for the next occurrence of spoilers to skip past them. There are three places where this book lost me. The first was when Jess, after agreeing to help another kid find his father, ends up on a world obsessed with a religious cult involving using hallucinatory drugs to commune with alien gods. Jess immediately flags this as unbelievable bullshit and I was enjoying her well-founded cynicism until Kressel pulls the rug out from under both Jess and the reader by establishing that this new-age claptrap is essentially true. Kressel does try to put a bit of a science fiction gloss on it, but sadly I think that effort was unsuccessful. Sometimes absurdly powerful advanced aliens with near-telepathic powers are part of the fun of a good space opera, but I want the author to make an effort to connect the aliens to plausibility or, failing that, at least avoid sounding indistinguishable from psychic self-help grifters or religious fantasy about spiritual warfare. Stargate SG-1 and Babylon 5 failed on the first part but at least held the second line. Kressel gets depressingly close to Seth territory, although at least Jess is allowed to retain some cynicism about motives. 
The second, related problem is that Jess ends up being a sort of Chosen One, which I found intensely annoying. This may be a fault of reader expectations more than authorial skill, but one of the things I like to see in working-class science fiction is for the protagonist to not be absurdly central to the future of the galaxy, or to at least force themselves into that position through their own ethics and hard work. This book turns into a sort of quest story with epic fantasy stakes, which I thought was much less interesting than the story the start of the book promised and which made Jess a less interesting character. Finally, this is one of those books where Jess's family troubles and the plot she stumbles across turn into the same plot. Space Trucker Jess is far from alone in having that plot structure, and that's the problem. I'm not universally opposed to this story shape, but Jess felt like the wrong character for it. She starts the story with a lot of self-awareness about how messed up her family dynamics were, and I was rooting for her to find some space to construct her own identity separate from her family. To have her family turn out to be central not only to this story but to the entire galaxy felt like it undermined that human core of the story, although I admit it's a good analogy to the type of drama escalation that dysfunctional families throw at anyone attempting to separate from them. Spoilers end here. I rather enjoyed the first third of this book, despite being a bit annoyed at the constructed slang, and then started rolling my eyes and muttering things about the story going off the rails. Jess is a compelling enough character (and I'm stubborn enough) that I did finish the book, so I can say that I liked the very end. Kressel does finally arrive at the sort of story that I wanted to read all along. Unfortunately, I didn't enjoy the path he took to get there. 
I think much of my problem was that I wanted Jess to be a more defiant character earlier in the novel, and I wanted her family problems to influence her character growth but not be central to her story. Both of these may be matters of opinion and an artifact of coming into the book with the wrong assumptions. If you are interested in a flawed and backsliding effort to untangle one's identity from a dysfunctional family and don't mind some barely-SF space mysticism and chosen one vibes, it's possible this book will click with you. It's not one that I can recommend, though. I still want the book that I hoped I was getting from that Big Idea piece. Rating: 4 out of 10

19 October 2025

Colin Watson: Mistaken dichotomies about dgit

In "Could the XZ backdoor have been detected with better Git and Debian packaging practices?", Otto contrasts "git-buildpackage managed git repositories" with "dgit managed repositories", saying that the dgit managed repositories "cannot incorporate the upstream git history and are thus less useful for auditing the full software supply-chain in git". Otto does qualify this earlier with "a package that has not had the history recorded in dgit earlier", but the last sentence of the section is a misleading oversimplification. It's true for repositories that have been synthesized by dgit (which indeed was the focus of that section of Otto's article), but it's not true in general for repositories that are managed by dgit. I suspect this was just slightly unclear writing, so I don't want to nitpick here, but rather to take the opportunity to try to clear up some misconceptions around dgit that I've often heard at conferences and seen on mailing lists. I'm not a dgit developer, although I'm a happy user of it and I've tried to help out in various design discussions over the years.
dgit and git-buildpackage sit at different layers

It seems very common for people to think of git-buildpackage and dgit as alternatives, as the example I quoted at the start of this article suggests. It's really better to think of dgit as a separate and orthogonal layer. You can use dgit together with tools such as git-buildpackage. In that case, git-buildpackage handles the general shape of your git history, such as helping you to import new upstream versions, and dgit handles gatewaying between the archive and git. The advantages become evident when you start using tag2upload, in which case you can just use git debpush to push a tag and the tag2upload service deals with building the source package and uploading it to the archive for you. This is true regardless of how you put your package's git history together. (There's currently a wrinkle around pristine-tar support, so at the moment I personally tend to use dgit push-source for new upstream versions and git debpush for new Debian revisions, since I haven't yet convinced myself that I see no remaining value in pristine upstream tarballs.)
dgit supports complete history

If the maintainer has never used dgit, and so dgit clone synthesizes a repository based on the current contents of the Debian archive, then there's indeed no useful history there; in that situation it doesn't go back and import everything from the snapshot archive the way that gbp import-dscs --debsnap does. However, if the maintainer uses dgit, then dgit's view will include more history, and it's absolutely possible for that to include complete upstream git history as well. Try this:
$ dgit clone man-db
canonical suite name for unstable is sid
fetching existing git history
last upload to archive: specified git info (debian)
downloading http://ftp.debian.org/debian//pool/main/m/man-db/man-db_2.13.1.orig.tar.xz...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2060k  100 2060k    0     0  4643k      0 --:--:-- --:--:-- --:--:-- 4652k
downloading http://ftp.debian.org/debian//pool/main/m/man-db/man-db_2.13.1.orig.tar.xz.asc...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   833  100   833    0     0  16322      0 --:--:-- --:--:-- --:--:-- 16660
HEAD is now at 167835b0 releasing package man-db version 2.13.1-1
dgit ok: ready for work in man-db
$ git -C man-db log --graph --oneline | head
* 167835b0 releasing package man-db version 2.13.1-1
*   f7910493 New upstream release (2.13.1)
|\
| *   3073b72e Import man-db_2.13.1.orig.tar.xz
| |\
| | * 349ce503 Release man-db 2.13.1
| | * 0d6635c1 Update Russian manual page translation
| | * cbf87caf Update Italian translation
| | * fb5c5017 Update German manual page translation
| | * dae2057b Update Brazilian Portuguese manual page translation
That package uses git-dpm, since I prefer the way it represents patches. But it works fine with git-buildpackage too:
$ dgit clone isort
canonical suite name for unstable is sid
fetching existing git history
last upload to archive: specified git info (debian)
downloading http://ftp.debian.org/debian//pool/main/i/isort/isort_7.0.0.orig.tar.gz...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  786k  100  786k    0     0  1772k      0 --:--:-- --:--:-- --:--:-- 1774k
HEAD is now at f812aae releasing package isort version 7.0.0-1
dgit ok: ready for work in isort
$ git -C isort log --graph --oneline | head
* f812aae releasing package isort version 7.0.0-1
*   efde62f Update upstream source from tag 'upstream/7.0.0'
|\
| * 9694f3d New upstream version 7.0.0
* | 9cbfe0b releasing package isort version 6.1.0-1
* | 5423ffe Mark isort and python3-isort Multi-Arch: foreign
* | 5eaf5bf Update upstream source from tag 'upstream/6.1.0'
|\|
| * edafbfc New upstream version 6.1.0
* | aedfd25 Merge branch 'debian/master' into fix992793
If you look closely you'll see another difference here: the second example only includes one commit representing each new upstream release, and doesn't have complete upstream history. This doesn't represent a difference between git-dpm and git-buildpackage. Both tools can operate in both ways: for example, git-dpm import-new-upstream --parent and gbp import-orig --upstream-vcs-tag do broadly similar things, and something like gbp import-dscs --debsnap --upstream-vcs-tag='%(version)s' can be used to do a bulk import provided that upstream's tags are named consistently enough. This is not generally the default because adding complete upstream history requires extra setup: the maintainer has to add an extra git remote pointing to upstream and select the correct tag when importing a new version, and some upstreams forget to push git tags or don't have the sort of consistency you might want.

The Debian Python team's policy says that "Complete upstream Git history should be avoided in the upstream branch", which is why the isort history above looks the way it does. I don't love this because I think the results are less useful, but I understand why it's there: in a moderately large team maintaining thousands of packages, getting everyone to have the right git remotes set up would be a recipe for frustrating inconsistency. However, in packages I maintain myself, I strongly value having complete upstream history in order to make it easier to debug problems, and I think it makes things a bit more transparent to auditors too, so I'm willing to go to a little extra work to make that happen. Doing that is completely compatible with using dgit.
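As a rough sketch of what that extra setup amounts to, the following creates two throwaway repositories and merges an upstream tag into the packaging history so that the upstream commits become reachable. All repository names and the tag are hypothetical; in real packaging you would let gbp import-orig --upstream-vcs-tag or git-dpm import-new-upstream --parent create the merge for you:

```shell
# Hypothetical sketch: throwaway repos standing in for a real upstream
# and a real packaging repository.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q upstream
git -C upstream -c user.name=u -c user.email=u@example.org \
    commit -q --allow-empty -m 'upstream work'
git -C upstream tag v1.0
git init -q packaging
cd packaging
git -c user.name=d -c user.email=d@example.org \
    commit -q --allow-empty -m 'Debian packaging start'
# one-time setup: a remote pointing at upstream's repository
git remote add upstreamvcs ../upstream
git fetch -q upstreamvcs 'refs/tags/*:refs/tags/*'
# merge the upstream tag so its full history becomes part of ours
git -c user.name=d -c user.email=d@example.org \
    merge -q --allow-unrelated-histories -m 'New upstream version 1.0' v1.0
git log --graph --oneline
```

The resulting import commit has two parents, one on the packaging branch and one in upstream's history, which is exactly the two-parent shape visible in the man-db graph above.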

Otto Kekäläinen: Could the XZ backdoor have been detected with better Git and Debian packaging practices?

The discovery of a backdoor in XZ Utils in the spring of 2024 shocked the open source community, raising critical questions about software supply chain security. This post explores whether better Debian packaging practices could have detected this threat, offering a guide to auditing packages and suggesting future improvements. The XZ backdoor in versions 5.6.0/5.6.1 briefly made its way into many major Linux distributions such as Debian and Fedora, but luckily didn't reach that many actual users, as the backdoored releases were quickly removed thanks to the heroic diligence of Andres Freund. We are all extremely lucky that he detected a half-second performance regression in SSH, cared enough to trace it down, discovered malicious code in the XZ library loaded by SSH, and reported it promptly to various security teams for quick coordinated action. This episode left software engineers pondering many questions. As a Debian Developer, I decided to audit the xz package in Debian, share my methodology and findings in this post, and also suggest some improvements on how software supply-chain security could be tightened in Debian specifically. Note that the scope here is only to inspect how Debian imports software from its upstreams, and how it is distributed to Debian's users. This excludes the whole story of how to assess whether an upstream project is following software development security best practices. This post also doesn't discuss how to operate an individual computer running Debian to ensure it remains untampered, as there are plenty of guides on that already.

Downloading Debian and upstream source packages

Let's start by working backwards from what the Debian package repositories offer for download. As auditing binaries is extremely complicated, we skip that and assume the Debian build hosts are trustworthy and reliably build binaries from the source packages; the focus should therefore be on auditing the source packages. As with everything in Debian, there are multiple tools and ways to do the same thing, but in this post only one (and hopefully the best) way to do something is presented for brevity. The first step is to download the latest version and some past versions of the package from the Debian archive, which is easiest done with debsnap. The following command will download all Debian source packages of xz-utils from version 5.2.4-1 onwards:
$ debsnap --verbose --first 5.2.4-1 xz-utils
Getting json https://snapshot.debian.org/mr/package/xz-utils/
...
Getting dsc file xz-utils_5.2.4-1.dsc: https://snapshot.debian.org/file/a98271e4291bed8df795ce04d9dc8e4ce959462d
Getting file xz-utils_5.2.4.orig.tar.xz.asc: https://snapshot.debian.org/file/59ccbfb2405abe510999afef4b374cad30c09275
Getting file xz-utils_5.2.4-1.debian.tar.xz: https://snapshot.debian.org/file/667c14fd9409ca54c397b07d2d70140d6297393f
source-xz-utils/xz-utils_5.2.4-1.dsc:
Good signature found
validating xz-utils_5.2.4.orig.tar.xz
validating xz-utils_5.2.4.orig.tar.xz.asc
validating xz-utils_5.2.4-1.debian.tar.xz
All files validated successfully.
Once debsnap completes there will be a subfolder source-<package name> with the following types of files:
  • *.orig.tar.xz: source code from upstream
  • *.orig.tar.xz.asc: detached signature (if upstream signs their releases)
  • *.debian.tar.xz: Debian packaging source, i.e. the debian/ subdirectory contents
  • *.dsc: Debian source control file, including signature by Debian Developer/Maintainer
Example:
$ ls -1 source-xz-utils/
...
xz-utils_5.6.4.orig.tar.xz
xz-utils_5.6.4.orig.tar.xz.asc
xz-utils_5.6.4-1.debian.tar.xz
xz-utils_5.6.4-1.dsc
xz-utils_5.8.0.orig.tar.xz
xz-utils_5.8.0.orig.tar.xz.asc
xz-utils_5.8.0-1.debian.tar.xz
xz-utils_5.8.0-1.dsc
xz-utils_5.8.1.orig.tar.xz
xz-utils_5.8.1.orig.tar.xz.asc
xz-utils_5.8.1-1.1.debian.tar.xz
xz-utils_5.8.1-1.1.dsc
xz-utils_5.8.1-1.debian.tar.xz
xz-utils_5.8.1-1.dsc
xz-utils_5.8.1-2.debian.tar.xz
xz-utils_5.8.1-2.dsc

Verifying authenticity of upstream and Debian sources using OpenPGP signatures

As seen in the output of debsnap, it already automatically verifies that the downloaded files match the OpenPGP signatures. To have full clarity on what files were authenticated with what keys, we should verify the Debian packager's signature with:
$ gpg --verify --auto-key-retrieve --keyserver hkps://keyring.debian.org xz-utils_5.8.1-2.dsc
gpg: Signature made Fri Oct 3 22:04:44 2025 UTC
gpg: using RSA key 57892E705233051337F6FDD105641F175712FA5B
gpg: requesting key 05641F175712FA5B from hkps://keyring.debian.org
gpg: key 7B96E8162A8CF5D1: public key "Sebastian Andrzej Siewior" imported
gpg: Total number processed: 1
gpg: imported: 1
gpg: Good signature from "Sebastian Andrzej Siewior" [unknown]
gpg: aka "Sebastian Andrzej Siewior <bigeasy@linutronix.de>" [unknown]
gpg: aka "Sebastian Andrzej Siewior <sebastian@breakpoint.cc>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6425 4695 FFF0 AA44 66CC 19E6 7B96 E816 2A8C F5D1
Subkey fingerprint: 5789 2E70 5233 0513 37F6 FDD1 0564 1F17 5712 FA5B
The upstream tarball signature (if available) can be verified with:
$ gpg --verify --auto-key-retrieve xz-utils_5.8.1.orig.tar.xz.asc
gpg: assuming signed data in 'xz-utils_5.8.1.orig.tar.xz'
gpg: Signature made Thu Apr 3 11:38:23 2025 UTC
gpg: using RSA key 3690C240CE51B4670D30AD1C38EE757D69184620
gpg: key 38EE757D69184620: public key "Lasse Collin <lasse.collin@tukaani.org>" imported
gpg: Total number processed: 1
gpg: imported: 1
gpg: Good signature from "Lasse Collin <lasse.collin@tukaani.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 3690 C240 CE51 B467 0D30 AD1C 38EE 757D 6918 4620
Note that this only proves that there is a key that created a valid signature for this content. The authenticity of the keys themselves needs to be validated separately before trusting that they are in fact the keys of these people. That can be done by checking e.g. the upstream website for the key fingerprints they published, or the Debian keyring for Debian Developers and Maintainers, or by relying on the OpenPGP web of trust.
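As a tiny self-contained sketch of that comparison step: the "published" value below stands in for whatever fingerprint string you copy from the upstream website (here it is the one from the gpg transcript above), and the comparison ignores the spacing that fingerprints are usually displayed with:

```shell
# Compare a fingerprint copied from the upstream website against the
# one gpg printed, ignoring whitespace differences in formatting.
published="3690 C240 CE51 B467 0D30 AD1C 38EE 757D 6918 4620"  # assumed: copied from upstream's site
reported="3690C240CE51B4670D30AD1C38EE757D69184620"            # from the gpg output above
if [ "$(printf %s "$published" | tr -d ' ')" = "$reported" ]; then
    echo "fingerprint matches"
else
    echo "fingerprint MISMATCH" >&2
    exit 1
fi
```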

Verifying authenticity of upstream sources by comparing checksums

In case the upstream in question does not publish release signatures, the second best way to verify the authenticity of the sources used in Debian is to download the sources directly from upstream and check that the SHA-256 checksums match. This should be done using the debian/watch file inside the Debian packaging, which defines where the upstream source is downloaded from. Continuing the example above, we can unpack the latest Debian packaging and then run uscan to download:
$ tar xvf xz-utils_5.8.1-2.debian.tar.xz
...
debian/rules
debian/source/format
debian/source.lintian-overrides
debian/symbols
debian/tests/control
debian/tests/testsuite
debian/upstream/signing-key.asc
debian/watch
...
$ uscan --download-current-version --destdir /tmp
Newest version of xz-utils on remote site is 5.8.1, specified download version is 5.8.1
gpgv: Signature made Thu Apr 3 11:38:23 2025 UTC
gpgv: using RSA key 3690C240CE51B4670D30AD1C38EE757D69184620
gpgv: Good signature from "Lasse Collin <lasse.collin@tukaani.org>"
Successfully symlinked /tmp/xz-5.8.1.tar.xz to /tmp/xz-utils_5.8.1.orig.tar.xz.
The original files downloaded from upstream are now in /tmp along with the files renamed to follow Debian conventions. Using everything downloaded so far, the SHA-256 checksums can be compared across the files and against what the .dsc file advertises:
$ ls -1 /tmp/
xz-5.8.1.tar.xz
xz-5.8.1.tar.xz.sig
xz-utils_5.8.1.orig.tar.xz
xz-utils_5.8.1.orig.tar.xz.asc
$ sha256sum xz-utils_5.8.1.orig.tar.xz /tmp/xz-5.8.1.tar.xz
0b54f79df85912504de0b14aec7971e3f964491af1812d83447005807513cd9e xz-utils_5.8.1.orig.tar.xz
0b54f79df85912504de0b14aec7971e3f964491af1812d83447005807513cd9e /tmp/xz-5.8.1.tar.xz
$ grep -A 3 Sha256 xz-utils_5.8.1-2.dsc
Checksums-Sha256:
0b54f79df85912504de0b14aec7971e3f964491af1812d83447005807513cd9e 1461872 xz-utils_5.8.1.orig.tar.xz
4138f4ceca1aa7fd2085fb15a23f6d495d27bca6d3c49c429a8520ea622c27ae 833 xz-utils_5.8.1.orig.tar.xz.asc
3ed458da17e4023ec45b2c398480ed4fe6a7bfc1d108675ec837b5ca9a4b5ccb 24648 xz-utils_5.8.1-2.debian.tar.xz
In the example above the checksum 0b54f79df85... is the same across the files, so it is a match.
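The comparison is easy to script so that a mismatch fails loudly. This is a self-contained sketch in which two dummy files stand in for the archive's orig tarball and the tarball fetched from upstream:

```shell
# Stand-in files with identical content; in a real audit these would be
# the files downloaded by debsnap and uscan as shown above.
set -e
tmp=$(mktemp -d)
printf 'stand-in tarball content\n' > "$tmp/xz-utils_5.8.1.orig.tar.xz"
printf 'stand-in tarball content\n' > "$tmp/xz-5.8.1.tar.xz"
a=$(sha256sum "$tmp/xz-utils_5.8.1.orig.tar.xz" | cut -d' ' -f1)
b=$(sha256sum "$tmp/xz-5.8.1.tar.xz" | cut -d' ' -f1)
if [ "$a" = "$b" ]; then
    echo "checksums match: $a"
else
    echo "checksum MISMATCH" >&2
    exit 1
fi
```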

Repackaged upstream sources can't be verified as easily

Note that uscan may in rare cases repackage some upstream sources, for example to exclude files that don't adhere to Debian's copyright and licensing requirements. Those files and paths would be listed under the Files-Excluded field in the debian/copyright file. There are also other situations where the file that represents the upstream sources in Debian isn't bit-by-bit the same as what upstream published. If checksums don't match, an experienced Debian Developer should review all package settings (e.g. debian/source/options) to see if there was a valid and intentional reason for divergence.
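As a hypothetical illustration (these paths are made up, not taken from the xz-utils package), such an exclusion list in debian/copyright looks like:

```
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Files-Excluded: docs/nonfree-logo.png
                vendor/prebuilt/*.jar
```

When uscan downloads a new upstream release it then repacks the tarball without those files, conventionally marking the version with a suffix such as +ds or +dfsg, which is why the repacked tarball's checksum can never match upstream's.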

Reviewing changes between two source packages using diffoscope

Diffoscope is an incredibly capable and handy tool to compare arbitrary files. For example, to view a report in HTML format of the differences between two XZ releases, run:
diffoscope --html-dir xz-utils-5.6.4_vs_5.8.0 xz-utils_5.6.4.orig.tar.xz xz-utils_5.8.0.orig.tar.xz
browse xz-utils-5.6.4_vs_5.8.0/index.html
Inspecting diffoscope output of differences between two XZ Utils releases

If the changes are extensive, and you want to use an LLM to help spot potential security issues, generate the report of both the upstream and Debian packaging differences in Markdown with:
diffoscope --markdown diffoscope-debian.md xz-utils_5.6.4-1.debian.tar.xz xz-utils_5.8.1-2.debian.tar.xz
diffoscope --markdown diffoscope.md xz-utils_5.6.4.orig.tar.xz xz-utils_5.8.0.orig.tar.xz
The Markdown files created above can then be passed to your favorite LLM, along with a prompt such as:
Based on the attached diffoscope output for a new Debian package version compared with the previous one, list all suspicious changes that might have introduced a backdoor, followed by other potential security issues. If there are none, list a short summary of changes as the conclusion.

Reviewing Debian source packages in version control

As of today only 93% of all Debian source packages are tracked in git on Debian's GitLab instance at salsa.debian.org. Some key packages such as Coreutils and Bash are not using version control at all, as their maintainers apparently don't see value in using git for Debian packaging, and the Debian Policy does not require it. Thus, the only reliable and consistent way to audit changes in Debian packages is to compare the full versions from the archive as shown above. However, for packages that are hosted on Salsa, one can view the git history to gain additional insight into what exactly changed, when and why. For packages that are using version control, their location can be found in the Vcs-Git field in the debian/control file. For xz-utils the location is salsa.debian.org/debian/xz-utils. Note that the Debian Policy does not state anything about how Salsa should be used, or what git repository layout or development practices to follow. In practice most packages follow the DEP-14 proposal, and use git-buildpackage as the tool for managing changes and pushing and pulling them between upstream and salsa.debian.org. To get the XZ Utils source, run:
$ gbp clone https://salsa.debian.org/debian/xz-utils.git
gbp:info: Cloning from 'https://salsa.debian.org/debian/xz-utils.git'
At the time of writing this post the git history shows:
$ git log --graph --oneline
* bb787585 (HEAD -> debian/unstable, origin/debian/unstable, origin/HEAD) Prepare 5.8.1-2
* 4b769547 d: Remove the symlinks from -dev package.
* a39f3428 Correct the nocheck build profile
* 1b806b8d Import Debian changes 5.8.1-1.1
* b1cad34b Prepare 5.8.1-1
* a8646015 Import 5.8.1
*   2808ec2d Update upstream source from tag 'upstream/5.8.1'
|\
| * fa1e8796 (origin/upstream/v5.8, upstream/v5.8) New upstream version 5.8.1
| * a522a226 Bump version and soname for 5.8.1
| * 1c462c2a Add NEWS for 5.8.1
| * 513cabcf Tests: Call lzma_code() in smaller chunks in fuzz_common.h
| * 48440e24 Tests: Add a fuzzing target for the multithreaded .xz decoder
| * 0c80045a liblzma: mt dec: Fix lack of parallelization in single-shot decoding
| * 81880488 liblzma: mt dec: Don't modify thr->in_size in the worker thread
| * d5a2ffe4 liblzma: mt dec: Don't free the input buffer too early (CVE-2025-31115)
| * c0c83596 liblzma: mt dec: Simplify by removing the THR_STOP state
| * 831b55b9 liblzma: mt dec: Fix a comment
| * b9d168ee liblzma: Add assertions to lzma_bufcpy()
| * c8e0a489 DOS: Update Makefile to fix the build
| * 307c02ed sysdefs.h: Avoid <stdalign.h> even with C11 compilers
| * 7ce38b31 Update THANKS
| * 688e51bd Translations: Update the Croatian translation
* | a6b54dde Prepare 5.8.0-1.
* | 77d9470f Add 5.8 symbols.
* | 9268eb66 Import 5.8.0
* |   6f85ef4f Update upstream source from tag 'upstream/5.8.0'
|\ \
| * | afba662b New upstream version 5.8.0
| |/
| * 173fb5c6 doc/SHA256SUMS: Add 5.8.0
| * db9258e8 Bump version and soname for 5.8.0
| * bfb752a3 Add NEWS for 5.8.0
| * 6ccbb904 Translations: Run "make -C po update-po"
| * 891a5f05 Translations: Run po4a/update-po
| * 4f52e738 Translations: Partially fix overtranslation in Serbian man pages
| * ff5d9447 liblzma: Count the extra bytes in LZMA/LZMA2 decoder memory usage
| * 943b012d liblzma: Use SSE2 intrinsics instead of memcpy() in dict_repeat()
This shows both the changes on the debian/unstable branch as well as the intermediate upstream import branch and the actual upstream development branch. See my Debian source packages in git explainer for details of what these branches are used for. To only view changes on the Debian branch, run git log --graph --oneline --first-parent or git log --graph --oneline -- debian. The Debian branch should only have changes inside the debian/ subdirectory, which is easy to check with:
$ git diff --stat upstream/v5.8
 debian/README.source          |  16 +++
 debian/autogen.sh             |  32 +++++
 debian/changelog              | 949 ++++++++++++++++++++++
...
 debian/upstream/signing-key.asc |  52 +++++++++
 debian/watch                  |   4 +
 debian/xz-utils.README.Debian |  47 ++++++++
 debian/xz-utils.docs          |   6 +
 debian/xz-utils.install       |  28 +++++
 debian/xz-utils.postinst      |  19 +++
 debian/xz-utils.prerm         |  10 ++
 debian/xzdec.docs             |   6 +
 debian/xzdec.install          |   4 +
33 files changed, 2014 insertions(+)
All the files outside the debian/ directory originate from upstream, and for example running git blame on them should show only upstream commits:
$ git blame CMakeLists.txt
22af94128 (Lasse Collin 2024-02-12 17:09:10 +0200 1) # SPDX-License-Identifier: 0BSD
22af94128 (Lasse Collin 2024-02-12 17:09:10 +0200 2)
7e3493d40 (Lasse Collin 2020-02-24 23:38:16 +0200 3) ###############
7e3493d40 (Lasse Collin 2020-02-24 23:38:16 +0200 4) #
426bdc709 (Lasse Collin 2024-02-17 21:45:07 +0200 5) # CMake support for building XZ Utils
If the upstream in question signs commits or tags, they can be verified with e.g.:
$ git verify-tag v5.6.2
gpg: Signature made Wed 29 May 2024 09:39:42 AM PDT
gpg: using RSA key 3690C240CE51B4670D30AD1C38EE757D69184620
gpg: issuer "lasse.collin@tukaani.org"
gpg: Good signature from "Lasse Collin <lasse.collin@tukaani.org>" [expired]
gpg: Note: This key has expired!
The main benefit of reviewing changes in git is the ability to see detailed information about each individual change, instead of just staring at a massive list of changes without any explanations. In this example, to view all the upstream commits since the previous import to Debian, one would view the commit range from afba662b "New upstream version 5.8.0" to fa1e8796 "New upstream version 5.8.1" with git log --reverse -p afba662b...fa1e8796. However, a far superior way to review changes would be to browse this range using a visual git history viewer, such as gitk. Either way, looking at one code change at a time and reading the git commit message makes the review much easier.

Browsing git history in gitk --all
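The same one-patch-at-a-time review works in any repository. A self-contained sketch with two dummy commits in a throwaway directory (file contents and commit messages are hypothetical) shows the shape of the command:

```shell
# Build a tiny repo, then review a commit range oldest-first with -p,
# mirroring the afba662b...fa1e8796 review described above.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name reviewer
git config user.email reviewer@example.org
echo 'original line' > file.c
git add file.c
git commit -qm 'first upstream change'
echo 'changed line' > file.c
git add file.c
git commit -qm 'second upstream change'
# show each change in the range with its commit message and diff
git log --reverse -p HEAD~1..HEAD
```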

Comparing Debian source packages to git contents

As stated in the beginning of the previous section, and worth repeating, there is no guarantee that the contents of the Debian packaging git repository match what was actually uploaded to Debian. While the tag2upload project in Debian is getting more and more popular, Debian is still far from having any system to enforce that the git repository is in sync with the Debian archive contents. To detect such differences we can run diff across the Debian source packages downloaded with debsnap earlier (path source-xz-utils/xz-utils_5.8.1-2.debian) and the git repository cloned in the previous section (path xz-utils):
$ diff -u source-xz-utils/xz-utils_5.8.1-2.debian/ xz-utils/debian/
diff -u source-xz-utils/xz-utils_5.8.1-2.debian/changelog xz-utils/debian/changelog
--- debsnap/source-xz-utils/xz-utils_5.8.1-2.debian/changelog 2025-10-03 09:32:16.000000000 -0700
+++ xz-utils/debian/changelog 2025-10-12 12:18:04.623054758 -0700
@@ -5,7 +5,7 @@
 * Remove the symlinks from -dev, pointing to the lib package.
 (Closes: #1109354)

- -- Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Fri, 03 Oct 2025 18:32:16 +0200
+ -- Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Fri, 03 Oct 2025 18:36:59 +0200
In the case above diff revealed that the timestamp in the changelog in the version uploaded to Debian is different from what was committed to git. This is not malicious, just a mistake by the maintainer, who probably didn't run gbp tag immediately after upload but instead ran some dch command afterwards, ending up with a different timestamp in git compared to what was actually uploaded to Debian.
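This archive-versus-git check is easy to script, because diff exits non-zero when the trees differ. A self-contained sketch with stand-in directories (in a real audit the two paths would be the debsnap output and the cloned repository's debian/ directory):

```shell
# Two stand-in trees whose changelogs differ only in a trailer timestamp,
# like the real discrepancy found above.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/archive/debian" "$tmp/git/debian"
echo 'Fri, 03 Oct 2025 18:32:16 +0200' > "$tmp/archive/debian/changelog"
echo 'Fri, 03 Oct 2025 18:36:59 +0200' > "$tmp/git/debian/changelog"
if diff -ru "$tmp/archive/debian" "$tmp/git/debian" > "$tmp/out.diff"; then
    echo 'archive and git are in sync'
else
    echo 'differences found:'
    cat "$tmp/out.diff"
fi
```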

Creating synthetic Debian packaging git repositories

If no Debian packaging git repository exists, or if it is lagging behind what was uploaded to Debian's archive, one can use git-buildpackage's import-dscs feature to create synthetic git commits based on the files downloaded by debsnap, ensuring the git contents fully match what was uploaded to the archive. To import a single version there is gbp import-dsc (no s at the end), for which an example invocation is:
$ gbp import-dsc --verbose ../source-xz-utils/xz-utils_5.8.1-2.dsc
Version '5.8.1-2' imported under '/home/otto/debian/xz-utils-2025-09-29'
Example commit history from a repository with commits added with gbp import-dsc:
$ git log --graph --oneline
* 86aed07b (HEAD -> debian/unstable, tag: debian/5.8.1-2, origin/debian/unstable) Import Debian changes 5.8.1-2
* f111d93b (tag: debian/5.8.1-1.1) Import Debian changes 5.8.1-1.1
* 1106e19b (tag: debian/5.8.1-1) Import Debian changes 5.8.1-1
|\
| *   08edbe38 (tag: upstream/5.8.1, origin/upstream/v5.8, upstream/v5.8) Import Upstream version 5.8.1
| |\
| | * a522a226 (tag: v5.8.1) Bump version and soname for 5.8.1
| | * 1c462c2a Add NEWS for 5.8.1
| | * 513cabcf Tests: Call lzma_code() in smaller chunks in fuzz_common.h
An online example repository with only a few missing uploads added using gbp import-dsc can be viewed at salsa.debian.org/otto/xz-utils-2025-09-29/-/network/debian%2Funstable. An example repository that was fully crafted using gbp import-dscs can be viewed at salsa.debian.org/otto/xz-utils-gbp-import-dscs-debsnap-generated/-/network/debian%2Flatest. There is also dgit, which in a similar way creates a synthetic git history to allow viewing the Debian archive contents via git tools. However, its focus is on producing new package versions, so fetching with dgit a package that has not had its history recorded in dgit earlier will only show the latest version:
$ dgit clone xz-utils
canonical suite name for unstable is sid
starting new git history
last upload to archive: NO git hash
downloading http://ftp.debian.org/debian//pool/main/x/xz-utils/xz-utils_5.8.1.orig.tar.xz...
downloading http://ftp.debian.org/debian//pool/main/x/xz-utils/xz-utils_5.8.1.orig.tar.xz.asc...
downloading http://ftp.debian.org/debian//pool/main/x/xz-utils/xz-utils_5.8.1-2.debian.tar.xz...
dpkg-source: info: extracting xz-utils in unpacked
dpkg-source: info: unpacking xz-utils_5.8.1.orig.tar.xz
dpkg-source: info: unpacking xz-utils_5.8.1-2.debian.tar.xz
synthesised git commit from .dsc 5.8.1-2
HEAD is now at f9bcaf7 xz-utils (5.8.1-2) unstable; urgency=medium
dgit ok: ready for work in xz-utils
$ git -C xz-utils log --graph --oneline
*   f9bcaf7 xz-utils (5.8.1-2) unstable; urgency=medium 9 days ago (HEAD -> dgit/sid, dgit/dgit/sid)
|\
| * 11d3a62 Import xz-utils_5.8.1-2.debian.tar.xz 9 days ago
* 15dcd95 Import xz-utils_5.8.1.orig.tar.xz 6 months ago
Unlike git-buildpackage managed git repositories, the dgit managed repositories cannot incorporate the upstream git history and are thus less useful for auditing the full software supply-chain in git.

Comparing upstream source packages to git contents

Equally important to the note in the beginning of the previous section, one must also keep in mind that the upstream release source packages, often called release tarballs, are not guaranteed to have the exact same contents as the upstream git repository. Projects might strip out test data or extra development files from their release tarballs to avoid shipping unnecessary files to users, or projects might add documentation files or versioning information into the tarball that isn't stored in git. While a small minority, there are also upstreams that don't use git at all, so the plain files in a release tarball are still the lowest common denominator for all open source software projects, and exporting and importing source code needs to interface with them. In the case of XZ, the release tarball has additional version info and also a sizeable number of pregenerated build configuration files. Detecting and comparing differences between git contents and tarballs can of course be done manually by running diff across an unpacked tarball and a checked-out git repository. If using git-buildpackage, the difference between the git contents and tarball contents can be made visible directly in the import commit. In this XZ example, consider this git history:
* b1cad34b Prepare 5.8.1-1
* a8646015 Import 5.8.1
* 2808ec2d Update upstream source from tag 'upstream/5.8.1'
|\
| * fa1e8796 (debian/upstream/v5.8, upstream/v5.8) New upstream version 5.8.1
| * a522a226 (tag: v5.8.1) Bump version and soname for 5.8.1
| * 1c462c2a Add NEWS for 5.8.1
The commit a522a226 was the upstream release commit, which upstream also tagged v5.8.1. The merge commit 2808ec2d applied the new upstream import branch contents on the Debian branch. Between these is the special commit fa1e8796 New upstream version 5.8.1, tagged upstream/v5.8. This commit and tag exist only in the Debian packaging repository, and they show exactly what contents were imported into Debian. They are generated automatically by git-buildpackage when running gbp import-orig --uscan for Debian packages with the correct settings in debian/gbp.conf. By viewing this commit one can see exactly how the upstream release tarball differs from the upstream git contents (if at all). In the case of XZ, the difference is substantial, and shown below in full as it is very interesting:
$ git show --stat fa1e8796
commit fa1e8796dabd91a0f667b9e90f9841825225413a
(debian/upstream/v5.8, upstream/v5.8)
Author: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Date: Thu Apr 3 22:58:39 2025 +0200
New upstream version 5.8.1
.codespellrc   30 -
.gitattributes   8 -
.github/workflows/ci.yml   163 -
.github/workflows/freebsd.yml   32 -
.github/workflows/netbsd.yml   32 -
.github/workflows/openbsd.yml   35 -
.github/workflows/solaris.yml   32 -
.github/workflows/windows-ci.yml   124 -
.gitignore   113 -
ABOUT-NLS   1 +
ChangeLog   17392 +++++++++++++++++++++
Makefile.in   1097 +++++++
aclocal.m4   1353 ++++++++
build-aux/ci_build.bash   286 --
build-aux/compile   351 ++
build-aux/config.guess   1815 ++++++++++
build-aux/config.rpath   751 +++++
build-aux/config.sub   2354 +++++++++++++
build-aux/depcomp   792 +++++
build-aux/install-sh   541 +++
build-aux/ltmain.sh   11524 ++++++++++++++++++++++
build-aux/missing   236 ++
build-aux/test-driver   160 +
config.h.in   634 ++++
configure   26434 ++++++++++++++++++++++
debug/Makefile.in   756 +++++
doc/SHA256SUMS   236 --
doc/man/txt/lzmainfo.txt   36 +
doc/man/txt/xz.txt   1708 ++++++++++
doc/man/txt/xzdec.txt   76 +
doc/man/txt/xzdiff.txt   39 +
doc/man/txt/xzgrep.txt   70 +
doc/man/txt/xzless.txt   36 +
doc/man/txt/xzmore.txt   31 +
lib/Makefile.in   623 ++++
m4/.gitignore   40 -
m4/build-to-host.m4   274 ++
m4/gettext.m4   392 +++
m4/host-cpu-c-abi.m4   529 +++
m4/iconv.m4   324 ++
m4/intlmacosx.m4   71 +
m4/lib-ld.m4   170 +
m4/lib-link.m4   815 +++++
m4/lib-prefix.m4   334 ++
m4/libtool.m4   8488 +++++++++++++++++++++
m4/ltoptions.m4   467 +++
m4/ltsugar.m4   124 +
m4/ltversion.m4   24 +
m4/lt~obsolete.m4   99 +
m4/nls.m4   33 +
m4/po.m4   456 +++
m4/progtest.m4   92 +
po/.gitignore   31 -
po/Makefile.in.in   517 +++
po/Rules-quot   66 +
po/boldquot.sed   21 +
po/ca.gmo   Bin 0 -> 15587 bytes
po/cs.gmo   Bin 0 -> 7983 bytes
po/da.gmo   Bin 0 -> 9040 bytes
po/de.gmo   Bin 0 -> 29882 bytes
po/en@boldquot.header   35 +
po/en@quot.header   32 +
po/eo.gmo   Bin 0 -> 15060 bytes
po/es.gmo   Bin 0 -> 29228 bytes
po/fi.gmo   Bin 0 -> 28225 bytes
po/fr.gmo   Bin 0 -> 10232 bytes
To be able to easily inspect exactly what changed in the release tarball compared to the git release tag contents, the best tool for the job is Meld, invoked via git difftool --dir-diff fa1e8796^..fa1e8796. [Figure: Meld invoked by git difftool to show differences between git release tag and release tarball contents] To compare changes across the new and old upstream tarballs, one would instead compare the commits afba662b (New upstream version 5.8.0) and fa1e8796 (New upstream version 5.8.1) by running git difftool --dir-diff afba662b..fa1e8796. [Figure: Meld invoked by git difftool --dir-diff afba662b..fa1e8796 to show differences between two upstream release tarball contents] With all the above tips you can now go and try to audit your own favorite package in Debian and see if it is identical with upstream, and if not, how it differs.
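The manual route mentioned above, running diff across an unpacked tarball and a git checkout, can be sketched with a tiny self-contained example using the stdlib filecmp module (the directory names and file contents here are made up for illustration; in practice the two trees would be an unpacked release tarball and a checkout of the release tag):

```python
import filecmp
import pathlib
import tempfile

# Build a fake "git checkout" and a fake "unpacked tarball" that differ by
# one generated file, mimicking the tarball-vs-git situation described above.
tmp = pathlib.Path(tempfile.mkdtemp())
git_src = tmp / "git-src"
tar_src = tmp / "tarball-src"
git_src.mkdir()
tar_src.mkdir()
(git_src / "main.c").write_text("int main(void) { return 0; }\n")
(tar_src / "main.c").write_text("int main(void) { return 0; }\n")
# A pregenerated build file that exists only in the release tarball:
(tar_src / "configure").write_text("# generated by autoconf\n")

cmp = filecmp.dircmp(git_src, tar_src)
print(cmp.right_only)  # ['configure'] -- files present only in the "tarball"
```

The same result can of course be had with diff -rq on the two directories; the point is that any tarball-only file (here the hypothetical configure) shows up immediately.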

Should the XZ backdoor have been detected using these tools? The famous XZ Utils backdoor (CVE-2024-3094) consisted of two parts: the actual backdoor inside two binary blobs masquerading as test files (tests/files/bad-3-corrupt_lzma2.xz, tests/files/good-large_compressed.lzma), and a small modification in the build scripts (m4/build-to-host.m4) to extract the backdoor and plant it into the built binary. The build script was not tracked in version control, but generated with GNU Autotools at release time and only shipped as an additional file in the release tarball. The entire reason for me to write this post was to ponder whether a diligent engineer using git-buildpackage best practices could have reasonably spotted this while importing the new upstream release into Debian. The short answer is no. The malicious actor here clearly anticipated all the typical ways anyone might inspect both the git commits and the release tarball contents, and masqueraded the changes very well and over a long timespan. First of all, XZ has, for legitimate reasons, several carefully crafted .xz files as test data to help catch regressions in the decompression code path. The test files are shipped in the release so users can run the test suite and validate that the binary is built correctly and xz works properly. Debian famously runs massive amounts of testing in its CI and autopkgtest systems across tens of thousands of packages to uphold high quality despite frequent upgrades of the build toolchain and while supporting more CPU architectures than any other distro. Test data is useful and should stay. When git-buildpackage is used correctly, the upstream commits are visible in the Debian packaging for easy review, but the commit cf44e4b that introduced the test files does not deviate enough from regular sloppy coding practices to really stand out.
It is unfortunately very common for git commits to lack a message body explaining why a change was made, to not be properly atomic (test code and test data together in the same commit), and to be pushed directly to mainline without code review (the commit was not part of any PR in this case). Only another upstream developer could have spotted that this change was not on par with what the project expects, and that the test code was never added, only test data, and thus that this commit was not just sloppy but potentially malicious. Secondly, the fact that a new Autotools file (m4/build-to-host.m4) appeared in XZ Utils 5.6.0 is not suspicious. This is perfectly normal for Autotools. In fact, starting from version 5.8.1, XZ Utils ships an m4/build-to-host.m4 file that it actually uses. Spotting that there is anything fishy is practically impossible by simply reading the code, as Autotools files are full of custom m4 syntax interwoven with shell script, and there are plenty of backticks (`) that spawn subshells and evals that execute variable contents further, which is just normal for Autotools. Russ Cox's XZ post explains how exactly the Autotools code fetched the actual backdoor from the test files and injected it into the build. [Figure: Inspecting the m4/build-to-host.m4 changes in Meld launched via git difftool] There is only one tiny thing that maybe a very experienced Autotools user could have noticed: the serial 30 in the version header is way too high. In theory one could also have noticed that this Autotools file deviates from what other packages in Debian ship under the same filename, such as the serial 3, serial 5a or serial 5b versions. That would however require an insane amount of extra checking work, and is not something we should plan to start doing. A much simpler solution would be to strongly recommend that all open source projects stop using Autotools, to eventually get rid of it entirely.

Not detectable with reasonable effort While planting backdoors is evil, it is hard not to feel some respect for the level of skill and dedication of the people behind this. I've been involved in a bunch of security breach investigations during my IT career, and never have I seen anything this well executed. If it hadn't slowed down SSH by ~500 milliseconds and been discovered because of that, it would most likely have stayed undetected for months or years. Hiding backdoors in closed source software is relatively trivial, but hiding backdoors in plain sight in a popular open source project requires an unusual amount of expertise and creativity, as shown above.

Is the software supply-chain in Debian easy to audit? While maintaining a Debian package source using git-buildpackage can make the package history a lot easier to inspect, most packages have incomplete configurations in their debian/gbp.conf, and thus their package development histories are not always correctly constructed or uniform and easy to compare. The Debian Policy does not mandate git usage at all, and there are many important packages that do not use git at all. Additionally, the Debian Policy allows non-maintainers to upload new versions to Debian without committing anything to git, even for packages where the original maintainer wanted to use git. Uploads that bypass git unfortunately happen surprisingly often. Because of this situation, I am afraid that we could have multiple similar backdoors lurking that simply haven't been detected yet. More audits, that hopefully also get published openly, would be welcome! More people auditing the contents of the Debian archives would probably also help surface what tools and policies Debian might be missing to make the work easier, and thus help improve the security of Debian's users and the trust in Debian.
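For reference, the kind of debian/gbp.conf settings that make import histories uniform and easy to compare might look like the sketch below. The option names are real git-buildpackage options; the branch and tag values are illustrative defaults, not taken from any particular package:

```ini
# debian/gbp.conf -- illustrative sketch, not from a real package
[DEFAULT]
debian-branch = debian/latest
upstream-branch = upstream/latest
upstream-tag = upstream/v%(version)s
pristine-tar = True

[import-orig]
# Merge from the upstream release tag so the upstream git history is
# recorded in the packaging repository alongside the tarball import.
upstream-vcs-tag = v%(version)s
```

With settings along these lines, gbp import-orig --uscan produces the kind of New upstream version import commits discussed earlier, making tarball-vs-git differences reviewable in the packaging history itself.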

Is Debian currently missing some software that could help detect similar things? To my knowledge there is currently no system in place as part of Debian's QA or security infrastructure to verify that the upstream source packages in Debian are actually from upstream. I've come across a lot of packages where the debian/watch or other configs are incorrect, and even cases where maintainers have manually created upstream tarballs because it was easier than configuring the automation to work. It is obvious that for those packages the source tarball now in Debian is not at all the same as upstream's. I am not aware of any malicious cases though (if I were, I would of course report them). I am also aware of packages in the Debian repository that are misconfigured as type 1.0 (native) packages, mixing the upstream files and debian/ contents and having patches applied, while they should actually be configured as 3.0 (quilt) and not hide what the true upstream sources are. Debian should extend its QA tools to scan for such things. If I find a sponsor, I might build it myself as my next major contribution to Debian. In addition to better tooling for finding mismatches in the source code, Debian could also have better tooling for tracking which source files built binaries came from, but solutions like Fraunhofer-AISEC's supply-graph or Sony's ESSTRA are not practical yet. Julien Malka's post about NixOS discusses the role of reproducible builds, which may help in some cases across all distros.

Or, is Debian missing some policies or practices to mitigate this? Perhaps more importantly than more security scanning, the Debian Developer community should switch its general mindset from anyone is free to do anything to valuing shared workflows. The ability to audit anything is severely hampered by the fact that there are so many ways to do the same thing, and distinguishing a normal deviation from a malicious deviation is too hard, as the normal can be basically anything. Also, as there is no documented and recommended default workflow, both those who are old and new to Debian packaging might never learn any one optimal workflow, and end up doing many steps in the packaging process in a way that kind of works but is actually wrong or unnecessary, causing process deviations that look malicious but turn out to just be a result of not fully understanding what the right way to do something would have been. In the long run, once individual developers' workflows are more aligned, doing code reviews will become a lot easier and smoother as the excess noise of workflow differences diminishes, and reviews will feel much more productive to all participants. Debian fostering a culture of code reviews would allow us to slowly move from the current practice of mainly solo packaging work towards true collaboration forming around those code reviews. I have been promoting increased use of Merge Requests in Debian for some time already, for example by proposing DEP-18: Encourage Continuous Integration and Merge Request based Collaboration for Debian packages. If you are involved in Debian development, please give a thumbs up in dep-team/deps!21 if you want me to continue promoting it.

Can we trust open source software? Yes, and I would argue that we can only trust open source software. There is no way to audit closed source software, and anyone using e.g. Windows or macOS just has to trust the vendor's word that there are no intentional or accidental backdoors in their software. Or, when news gets out that the systems of a closed source vendor were compromised, as with CrowdStrike some weeks ago, we can't audit anything, and time after time we simply have to take their word that they have properly cleaned up their code base. In theory, a vendor could give some kind of contractual or financial guarantee to its customers that there are no preventable security issues, but in practice that never happens. I am not aware of a single case where e.g. Microsoft or Oracle paid damages to their customers after a security flaw was found in their software. In theory you could also pay a vendor more to have them focus more effort on security, but since there is no way to verify what they did, or to get compensation when they didn't, any increased fees are likely just pocketed as increased profit. Open source is clearly better overall. You can, if you are an individual with the time and skills, audit every step in the supply-chain, or you could as an organization invest in open source security improvements and actually verify what changes were made and how security improved. If your organisation is using Debian (or derivatives, such as Ubuntu) and you are interested in sponsoring my work to improve Debian, please reach out.

18 October 2025

Julian Andres Klode: Sound Removals

Problem statement Currently if you have an automatically installed package A (= 1) where
  • A (= 1) Depends B (= 1)
  • A (= 2) Depends B (= 2)
and you upgrade B from 1 to 2; then you can:
  1. Remove A (= 1)
  2. Upgrade A to version 2
If A was installed via a chain initiated by a Recommends (say X Recommends Y, Y Depends A), the solver sometimes preferred removing A (and anything depending on it). I have a fix pending to introduce eager Recommends which fixes the practical case, but this is still not sound. In fact we can show that the solver produces the wrong result for small minimal test cases, as well as the right result for some others without the fix (hooray?). Ensuring sound removals is more complex, and first of all it raises the question: when is a removal sound? This, of course, is on us to define. An easy case can be found in the Debian policy, 7.6.2 (Replacing whole packages, forcing their removal): if B (= 2) declares a Conflicts: A (= 1) and Replaces: A (= 1), then the removal is valid. However this is incomplete as well; consider it declares Conflicts: A (< 1) and Replaces: A (< 1): the solution to remove A rather than upgrade it would still be wrong. This indicates that we should only allow removing A if the conflict could not be solved by upgrading it. The other case to explore is package removals. If B is removed, A should be removed as well; however, if there is another package X that Provides: B (= 1) and it is marked for install, A should not be removed. That said, the solver is not allowed to install X to satisfy the dependency B (= 1), only to satisfy other dependencies [we do not want to get into endless loops where we switch between alternatives to keep reverse dependencies installed].

Proposed solution To solve this, I propose the following definition: Definition (sound removal): A removal of package P is sound if either:
  1. A version v is installed that package-conflicts with P.
  2. A package Q is removed and the installable versions of P package-depend on Q.
where the other definitions are: Definition (installable version): A version v is installable if either it is installed, or it is newer than an installed version of the same package (you may wish to change this to accommodate downgrades, or require strict pinning, but here be dragons). Definition (package-depends): A version v package-depends on a package B if either:
  1. there exists a dependency in v that can be solved by any version of B, or
  2. there exists a package C where v package-depends on C and any version c of C package-depends on B (transitivity)
Definition (package-conflicts): A version v package-conflicts with an installed package B if either:
  1. it declares a conflicts against an installable version of B; or
  2. there exists a package C where v package-conflicts with C, and b package-depends on C for installable versions b of B.
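Collapsing the version structure down to package names, the transitive part of the package-depends definition is plain reachability over the dependency graph. A minimal toy sketch (the function name and the example graph are invented for illustration, this is not apt's implementation):

```python
def package_depends(graph, v, b):
    """True if v package-depends on b: some direct dependency of v resolves
    to b, or transitively via an intermediate package (rule 2 above)."""
    stack = list(graph.get(v, ()))  # start from v's direct dependencies
    seen = set()
    while stack:
        cur = stack.pop()
        if cur == b:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(graph.get(cur, ()))
    return False

# A depends on B, B depends on C
graph = {"A": ["B"], "B": ["C"]}
assert package_depends(graph, "A", "C")      # holds via transitivity
assert not package_depends(graph, "C", "A")  # no path back
```

In the real solver each edge would additionally carry version constraints (a dependency counts only if it can be solved by some version of the target package), but the reachability skeleton is the same.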

Translating this into a (modified) SAT solver One approach may be to implement the logic in the conflict analysis that drives backtracking, i.e. we assume a package A, and when we reach not A, we analyse whether the implication graph for not A constitutes a sound removal, and then replace the assumption A with the assumption (A or learned reason). However, while this seems a plausible mechanism for a DPLL solver, for a modern CDCL solver it is not immediately evident how to analyse whether not A is sound if the reason for it is a learned clause rather than a problem clause. Instead we propose a static encoding of the rules into a slightly modified SAT solver: given c1, ..., cn that transitively conflict with A and D1, ..., Dn that A package-depends on, introduce the rule: A unless c1 or c2 or ... or cn or not D1 or not D2 or ... or not Dn. Rules of the form A... unless B... (where A... and B... are CNF) are intuitively the same as A... or B..., however the semantics here are different: we are not allowed to select B... to satisfy this clause. This requires a SAT solver that tracks a reason for each literal being assigned, such as solver3, rather than a SAT solver like MiniSAT that only tracks reasons across propagation (solver3 may track A depends on B or C as the reason for B without evaluating C, whereas MiniSAT would only track it as the reason given not C).
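The difference between an ordinary "A or B..." clause and the proposed "A unless B..." rule can be illustrated with a toy assignment checker. This is emphatically not apt's solver; the function and variable names are invented for illustration. The guard literals B... may satisfy the clause only when they are already true for an independent reason, never because the solver decided them in order to satisfy this clause:

```python
def unless_clause_satisfied(a_true, guards, assignment, decided_for_clause):
    """Check a toy 'A unless g1 or g2 or ...' rule.

    assignment: dict mapping each guard literal to its truth value.
    decided_for_clause: guard literals the solver set *in order to* satisfy
    this clause; using those is forbidden by the 'unless' semantics.
    """
    if a_true:
        return True  # keeping A installed always satisfies the rule
    # A is being removed: some guard must already hold for another reason.
    return any(
        assignment.get(g, False) and g not in decided_for_clause
        for g in guards
    )

# A conflicting package c1 is installed for an independent reason:
# the removal of A is sound, the clause is satisfied.
assert unless_clause_satisfied(False, ["c1"], {"c1": True}, set())
# c1 was only decided to satisfy this very clause: not allowed,
# so the clause stays unsatisfied and A may not be removed.
assert not unless_clause_satisfied(False, ["c1"], {"c1": True}, {"c1"})
```

This is why the encoding needs a solver that records a reason for every assigned literal: without per-literal reasons there is no way to tell the two cases in the example apart.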

Is it actually sound? The proposed definition of a sound removal may still prove unsound, as I may have missed something in the proposed definition that violates the goal I set out to achieve, or missed some of the goals entirely. I challenge you to find cases that cause removals that look wrong :D

17 October 2025

Dirk Eddelbuettel: ML quacks: Combining duckdb and mlpack

A side project I have been working on a little since last winter, and which explores extending duckdb with mlpack, is now public at the duckdb-mlpack repo. duckdb is an excellent small (as in runs as a self-contained binary) database engine with both a focus on analytical payloads (OLAP rather than OLTP) and an impressive number of already bolted-on extensions (for example for cloud data access) delivered as a single-build C++ executable (or of course as a library used from other front-ends). mlpack is an excellent C++ library containing many/most machine learning algorithms, also built in a self-contained manner (or as a library), making it possible to build compact yet powerful binaries, or to embed it (as opposed to other ML frameworks accessed from powerful but not lightweight run-times such as Python or R). The compact build aspect as well as the common build tools (C++, cmake) make these two natural candidates for combining. Moreover, duckdb is a champion of data access, management and control, and the machine learning insights and predictions offered by mlpack are fully complementary and hence fit rather well. duckdb also has a very robust and active extension system. To use it, one starts from a template repository and its use this template button, runs a script, and can then start experimenting. I have now grouped my initial start and test functions into a separate repository, duckdb-example-extension, to keep the duckdb-mlpack one focused on the mlpack extension aspect. duckdb-mlpack is right now an MVP, i.e. a minimally viable product (or demo). It just runs the adaboost classifier, but does so on any dataset fitting the rectangular setup with columns of features (real valued) and a final column (integer valued) of labels. I had hoped to use two select queries, one for the features and one for the labels, but it turns out a table function (returning a table of data from a query) can only run one select *.
So the basic demo, also on the repo README, is now to run the following script (where SELECT * FROM mlpack_adaboost((SELECT * FROM D)); is the key invocation of the added functionality):
#!/bin/bash

cat <<EOF | build/release/duckdb
SET autoinstall_known_extensions=1;
SET autoload_known_extensions=1; -- for httpfs

CREATE TEMP TABLE Xd AS SELECT * FROM read_csv("https://mlpack.org/datasets/iris.csv");
CREATE TEMP TABLE X AS SELECT row_number() OVER () AS id, * FROM Xd;
CREATE TEMP TABLE Yd AS SELECT * FROM read_csv("https://mlpack.org/datasets/iris_labels.csv");
CREATE TEMP TABLE Y AS SELECT row_number() OVER () AS id, CAST(column0 AS double) as label FROM Yd;
CREATE TEMP TABLE D AS SELECT * FROM X INNER JOIN Y ON X.id = Y.id;
ALTER TABLE D DROP id;
ALTER TABLE D DROP id_1;
CREATE TEMP TABLE A AS SELECT * FROM mlpack_adaboost((SELECT * FROM D));

SELECT COUNT(*) as n, predicted FROM A GROUP BY predicted;
EOF
to produce the following tabulation / group by:
./sampleCallRemote.sh
Misclassified: 1
┌───────┬───────────┐
│   n   │ predicted │
│ int64 │   int32   │
├───────┼───────────┤
│    50 │         0 │
│    49 │         1 │
│    51 │         2 │
└───────┴───────────┘
$
(Note that this requires the httpfs extension. Also, when you build from a freshly created extension repository you may be ahead of the most recent release of duckdb by a few commits. It is easy to check out the most recent release tag (or maybe the one matching the duckdb binary you run locally) to take advantage of the extensions you likely already have for that version. So here, in the middle of October 2025, I picked v1.4.1 as I run duckdb version 1.4.1 on my box.) There are many other neat duckdb extensions. The core ones are regrouped here, while lists of community extensions are here and here. For this (still rather minimal) extension, I added a few TODO items to the README.md. Please reach out if you are interested in working on any of this.
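The row-pairing and final GROUP BY shape of the script above can be sketched with the stdlib sqlite3 module, leaving out the duckdb-specific pieces (read_csv, the httpfs extension, and of course mlpack_adaboost). The explicit id column plays the role of row_number() OVER (), and the tiny made-up table stands in for the iris data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Features and labels arrive as two separate tables keyed by a row id,
# mirroring the X/Y tables built via row_number() in the duckdb script.
con.execute("CREATE TABLE X (id INTEGER PRIMARY KEY, sepal REAL)")
con.execute("CREATE TABLE Y (id INTEGER PRIMARY KEY, label INTEGER)")
con.executemany("INSERT INTO X VALUES (?, ?)", [(1, 5.1), (2, 4.9), (3, 6.3)])
con.executemany("INSERT INTO Y VALUES (?, ?)", [(1, 0), (2, 0), (3, 1)])
# Join on the shared id, then tabulate per label, as the final
# SELECT COUNT(*) ... GROUP BY predicted does in the demo script.
rows = con.execute(
    """
    SELECT COUNT(*) AS n, label
    FROM X JOIN Y USING (id)
    GROUP BY label
    ORDER BY label
    """
).fetchall()
print(rows)  # [(2, 0), (1, 1)]
```

The duckdb version differs mainly in that the joined table is handed to the mlpack_adaboost table function, whose predicted column is then tabulated instead of the raw label.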

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub.
