Search Results: "beh"

29 July 2025

Christoph Berg: The Debian Conference 2025 in Brest

It's Sunday and I'm now sitting in the train from Brest to Paris, where I will change trains towards Germany, on the way back from the annual Debian conference. A full week of presentations, discussions, talks and socializing lies behind me and my head is still spinning from the intensity.
Pollito and the gang of DebConf mascots wearing their conference badges (photo: Christoph Berg)
Sunday, July 13th

It started last Sunday with traveling to the conference. I got on the Eurostar in Duisburg and we left on time, but even before reaching Cologne, the train was already one hour delayed for external reasons, collecting yet another hour between Aachen and Liège for its own technical problems. "The train driver is working on trying to fix the problem." My original schedule had well over two hours for changing train stations in Paris, but being that late, I missed the connection to Brest in Montparnasse. At least in the end, the total delay was only one hour when finally arriving at the destination. With the French juillet quatorze fireworks approaching, buses in Brest were rerouted, but I managed to catch the right bus to the conference venue, already meeting a few Debian people on the way. The conference was hosted at the IMT Atlantique Brest campus, giving the event a nice university touch. I arrived shortly after 10 in the evening and, after settling down a bit, got on one of the "magic" buses for transportation to the camping site where half of the attendees were stationed. I shared a mobile home with three other Debianites, where I got a small room for myself.

Monday, July 14th

Next morning, we took the bus back to the venue for a small breakfast and the opening session, where Enrico Zini invited me to come to his and Nicolas Dandrimont's session about Debian community governance and curation, which I gladly did. Many ideas about conflict moderation and community steering were floated around. I hope some of that can be put into effect to make flamewars on the mailing lists less heated and more directed. After that, I attended Olly Betts' "Stemming with Snowball" session, about the stemmer that is also used in PostgreSQL. Text search is one of the areas in PostgreSQL that I never really looked closely at, including the integration into the postgresql-common package, so it was nice to get more information about that.

In preparation for the conference, a few of us Ham radio operators in Debian had decided to bring some radio gear to DebConf this year, in order to perhaps spark more interest in our hobby among the fellow geeks. In the afternoon after the talks, I found a quieter spot just outside of the main hall and set up a shortwave antenna by attaching a 10m mast to one of the park benches there. The 40m band was still pretty much closed, but I could work a few stations from England, just across the channel from Bretagne, answering questions from interested Debian people passing by between the contacts. Over time, the band opened and more European stations got into the log.
F/DF7CB in Brest (photo: Evangelos Ribeiro Tzaras)
Tuesday, July 15th

Tuesday started with Helmut Grohne's session about "Reviving (un)schroot". The schroot program has been Debian's standard way of managing build chroots for a long time, but it is more and more being regarded as obsolete, with all kinds of newer containerization and virtualization technologies taking over. Since many bits of Debian infrastructure depend on schroot, and its user interface is still very useful, Helmut reimplemented it using Linux namespaces and the "unshare" system call. I had already worked with him at the Hamburg MiniDebConf to replace the apt.postgresql.org buildd machinery with the new system, but we were not quite there yet (network isolation is nice, but we still sometimes need proper networking), so it was nice to see the effort is still progressing, and I will give his new scripts a try when I'm back home.

Next, Stefano Rivera and Colin Watson presented Debusine, a new package repository and workflow management system. It looks very promising for anyone running their own repository, so perhaps yet another bit of apt.postgresql.org infrastructure to replace in the future. After that, I went to the Debian LTS BoF session by Santiago Ruano Rincón and Bastien Roucariès - Debian releases plus LTS is what we are covering with apt.postgresql.org. Then there were bits from the DPL (Debian Project Leader), and a session moderated by Stefano Rivera, interesting to me as a member of the Debian Technical Committee, on the future structure of the packages required for cross-building in Debian, a topic which had been brought to the TC a while ago. I am happy that we could resolve the issue without having to issue a formal TC ruling, as the involved parties (kernel, glibc, gcc and the cross-build people) found a promising way forward themselves. DebConf is really a good way to get such issues unstuck.

Ten years ago at the 2015 Heidelberg DebConf, Enrico had given a seminal "Semi-serious stand-up comedy" talk, drawing parallels between the Debian Open Source community and the BDSM community - "People doing things consensually together". (Back then, the talk was announced as "probably unsuitable for people of all ages".) With his unique presentation style and witty insights, the session made a lasting impression on everyone attending. Now, ten years later (and he and many in the audience being ten years older), he gave an updated version of it. We are now looking forward to the sequel in 2035.

The evening closed with the famous DebConf tradition of the Cheese & Wine party in an old fort next to the coast, just below the conference venue. Even though he's a fellow Debian Developer, Ham and also a TC member, I had never met Paul Tagliamonte in person before, but we spent most of the evening together geeking out on all things Debian and Ham radio.
The northern coast of Ushant (photo: Christoph Berg)
Wednesday, July 16th

Wednesday already marked the end of the first half of the week, the day of the day trips. I had chosen to go to Ouessant island (Ushant in English), which marks the western end of the French mainland and hosts one of the lighthouses guiding the way into the English Channel. The ferry trip included surprisingly big waves which left some participants seasick, but everyone recovered fast. After around one and a half hours we arrived, picked up the bicycles, and spent the rest of the day roaming the island. The weather forecast was originally very cloudy and 18°C, but by noon this had turned into sunny and warm, so many got an unplanned sunburn. I enjoyed the trip very much - it made up for not having time to visit the city during the week. After returning, we spent the rest of the evening playing DebConf's standard game, Mao (spoiler alert: don't follow the link if you ever intend to play).
Having a nice day (photo: Christoph Berg)
Thursday, July 17th

The next day started with the traditional "Meet the Technical Committee" session. This year, we trimmed the usual slide deck down to remove the boring boilerplate parts, so after a very short introduction to the work of the committee by our chairman Matthew Vernon, we opened up the discussion with the audience, with seven (out of 8) TC members on stage. I think the format worked very well, with good input from attendees.

Next up was "Don't fear the TPM" by Jonathan McDowell. A common misconception in the Free Software community is that the TPM is evil DRM hardware working against the user, but while it could in theory be used that way, the necessary TPM attestations seem to be impossible to attain in practice, so that wouldn't happen anyway. Instead, it is a crypto coprocessor present in almost all modern computers that can be used to hold keys, for example to be used for SSH. It will also be interesting to research if we can make use of it for holding the Transparent Data Encryption keys for CYBERTEC's PostgreSQL Enterprise Edition.

Aigars Mahinovs then directed everyone into place for the DebConf group picture, and Lucas Nussbaum started a discussion about archive-wide QA tasks in Debian, an area where I did a lot of work in the past and that still interests me. Antonio Terceiro and Paul Gevers followed up with techniques to track archive-wide rebuilding and testing of packages, and in turn filing a lot of bugs to track the problems. The evening ended with the conference dinner, again in the fort close to the coast. DebConf is good for meeting new people, and I incidentally ran into another Chris, who happened to be one of the original maintainers of pgaccess, the pre-predecessor of today's pgadmin. I admit I still miss this PostgreSQL frontend for its simplicity and ability to easily edit table data, but it disappeared around 2004.

Friday, July 18th

On Friday, I participated in discussion sessions around contributors.debian.org (PostgreSQL is planning to set up something similar) and the New Member process, which I had helped to run and reform a decade or two ago. Agathe Porte (also a Ham radio operator, like so many others at the conference I had no idea about) then shared her work on rewriting the slower parts of Lintian, the Debian package linter, in Rust. Craig Small talked about "Free as in Bytes", the evolution of the Linux procps free command. Over time and many kernel versions, the summary numbers printed became better and better, but there will probably never be a version that suits all use cases alike. Later over dinner, Craig (who is also a TC member) and I shared our experiences with these numbers and customers (not) understanding them. He pointed out that for PostgreSQL, when looking at used memory in the presence of large shared memory buffers, USS (unique set size) and PSS (proportional set size) should be more realistic numbers than the standard RSS (resident set size) that the top utility is showing by default.

Antonio Terceiro and Paul Gevers again joined to lead a session, now on ci.debian.net and autopkgtest, the test driver used for running tests on packages after they have been installed on a system. The PostgreSQL packages are heavily using this to make sure no regressions creep in even after builds have successfully completed, and test re-runs are scheduled periodically.
The day ended with Bdale Garbee's electronics team BoF and Paul Tagliamonte and me setting up the radio station in the courtyard, again answering countless questions about ionospheric conditions and operating practice.

Saturday, July 19th

Saturday was the last conference day. In the first session, Nikos Tsipinakis and Federico Vaga from CERN announced that the LHC will be moving to Debian for the accelerator's frontend computers in their next "long shutdown" maintenance period next year. CentOS broke compatibility too often, and Debian trixie together with the extended LTS support will cover the time until the next long shutdown window in 2035, by when the computers should all have been replaced with newer processors supporting higher x86_64 baseline versions. The audience was very delighted to hear that Debian is now also being used in this prestigious project.

Ben Hutchings then presented new Linux kernel features. Particularly interesting for me was the support for atomic writes spanning more than one filesystem block. When configured correctly, this would mean PostgreSQL wouldn't have to record full-page images in the WAL anymore, increasing throughput and performance. After that, the Debian ftp team discussed ways to improve the review of new packages in the archive, and which of their processes could be relaxed with the new US laws around Open Source and cryptography algorithm exports. Emmanuel Arias led a session on Salsa CI, Debian's GitLab instance and standard CI pipeline. (I think it's too slow, but the runners are not under their control.) Julian Klode then presented new features in APT, Debian's package manager. I like the new display format (and a tiny bit of that is also from me sending in wishlist bugs).

In the last round of sessions this week, I then led the Ham radio BoF with an introduction to the hobby and how Debian can be used for it. Bdale mentioned that the sBitx family of SDR radios natively runs Debian, so stock packages can be used from the radio's touch display. We also briefly discussed his involvement in ARDC and the possibility of getting grants from them for Ham radio projects. Finally, DebConf wrapped up with everyone gathering in the main auditorium, cheering the organizers for making the conference possible and passing Pollito, the DebConf mascot, to the next organizer team.
Pollito on stage (photo: Christoph Berg)
Sunday, July 20th

Zoom back to the train: I made it through the Paris metro and I'm now on the Eurostar back to Germany. It has been an intense week with all the conference sessions and meeting all the people I had not seen for so long. There are a lot of new ideas to follow up on, both for my Debian and my PostgreSQL work. Next year's DebConf will take place in Santa Fe, Argentina. I haven't yet decided if I will be going, but I can recommend the experience to everyone! The post The Debian Conference 2025 in Brest appeared first on CYBERTEC PostgreSQL Services & Support.

28 July 2025

Scarlett Gately Moore: Request for Financial Support During Job Search

Dear friends, family, and community, I'm reaching out during a challenging time in my life to ask for your support. This year has been particularly difficult as I've been out of work for most of it due to a broken arm and a serious MRSA infection that required extensive treatment and recovery time.

Current Situation

While I've been recovering, I've been actively working to maintain and improve my professional skills by contributing to open source software projects. These contributions help me stay current with industry trends and demonstrate my ongoing commitment to my field, but unfortunately, they don't provide the income I need to cover my basic living expenses. Despite my efforts, I'm still struggling to secure employment, and I'm falling behind on essential bills including:
  • Rent/mortgage payments
  • Utilities
  • Medical expenses
  • Basic living costs

How You Can Help

Any financial assistance, no matter the amount, would make a meaningful difference in helping me stay afloat during this job search. Your support would allow me to:
  • Keep my housing stable
  • Maintain essential services
  • Focus fully on finding employment without the constant stress of unpaid bills
  • Continue contributing to the open source community

Moving Forward

I'm actively job searching and interviewing, and I'm confident that I'll be back on my feet soon. Your temporary support during this difficult period would mean the world to me and help bridge the gap until I can secure stable employment. If you're able to contribute, please do so via GoFundMe. If you're unable to donate, I completely understand, and sharing this request with others who might be able to help would be greatly appreciated. Thank you for taking the time to read this and for considering helping me during this challenging time. With gratitude, Scarlett

Dimitri John Ledkov: Achieving actually full disk encryption of UEFI ESP at rest with TCG OPAL, FIPS, LUKS

Achieving full disk encryption using FIPS, TCG OPAL and LUKS to encrypt UEFI ESP on bare-metal and in VMs
Many security standards such as CIS and STIG require protecting information at rest. For example, NIST SP 800-53r5 SC-28 advocates using cryptographic protection, offline storage and TPMs to enhance protection of information confidentiality and/or integrity.

Traditionally, to satisfy such controls on portable devices such as laptops, one would use software-based Full Disk Encryption - Mac OS X FileVault, Windows BitLocker, Linux cryptsetup LUKS2. In cases when FIPS cryptography is required, an additional burden would be placed onto these systems to operate their kernels in FIPS mode.

The Trusted Computing Group works on establishing many industry standards and specifications, which are widely adopted to improve safety and security of computing whilst keeping it easy to use. One of their most famous specifications is TCG TPM 2.0 (Trusted Platform Module). TPMs are now widely available on most devices and help to protect secret keys and attest systems. For example, most software full disk encryption solutions can use a TCG TPM to store full disk encryption keys, providing passwordless, biometric or PIN-based ways to unlock the drives, as well as attesting that the system has not been modified or compromised whilst offline.

The TCG Storage Security Subsystem Class: Opal Specification is a set of specifications for features of data storage devices. The authors and contributors to OPAL are leading and well trusted storage manufacturers such as Samsung, Western Digital, Seagate Technologies, Dell, Google, Lenovo, IBM, Kioxia, among others. One of the features that the Opal Specification enables is self-encrypting drives, which become very powerful when combined with pre-boot authentication. Out of the box, such drives always and transparently encrypt all disk data using hardware acceleration. To protect the data, one can enter the UEFI firmware setup (BIOS) and set an NVMe single user password (or user + administrator/recovery passwords) to encrypt the disk encryption key. If one's firmware didn't come with such features, one can also use SEDutil to inspect and configure all of this. The latest releases of major Linux distributions already package SEDutil.

Once a password is set, on startup, pre-boot authentication will request one to enter the password - prior to booting any operating system. It means that the full disk is actually encrypted, including the UEFI ESP and all operating systems that are installed in case of dual- or multi-boot installations. This also prevents tampering with the ESP, UEFI bootloaders and kernels, which with traditional software-based encryption often remain unencrypted and accessible. It also means one doesn't have to do special OS-level repartitioning or installation steps to ensure all data is encrypted at rest.

What about FIPS compliance? Well, the good news is that the majority of the OPAL-compliant hard drives and/or security sub-chips do have FIPS 140-3 certification, meaning they have been tested by independent laboratories to ensure they do in fact encrypt data. On the CMVP website one can search for module name terms "OPAL" or "NVMe" or the name of a hardware vendor to locate FIPS certificates.

Are such drives widely available? Yes. For example, a common ThinkPad X1 gen 11 has OPAL NVMe drives as standard, and they have FIPS certification too. Thus, it is likely these are already widely available in your hardware fleet. Use sedutil to check if the MediaEncrypt and LockingSupported features are available.
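For example, a rough check with sedutil might look like this (a minimal sketch; /dev/nvme0 is an example device path, and the exact field names can vary between drives and sedutil versions):

# list drives and whether they advertise OPAL support
sudo sedutil-cli --scan
# dump the drive's TCG capabilities; look for "LockingSupported = Y"
# and "MediaEncrypt = Y" in the Locking feature block
sudo sedutil-cli --query /dev/nvme0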
Well, this is great for laptops and physical servers, but you may ask - what about public or private cloud?

Actually, more or less the same is already in place in both. On the CMVP website, all major clouds have their disk encryption hardware certified, and all of them always encrypt all virtual machines with FIPS-certified cryptography, without an ability to opt out. One is however in full control of how the encryption keys are managed: cloud-provider or self-managed (either with a cloud HSM or KMS, or bring your own / external). See these relevant encryption options and key management docs for GCP, Azure, AWS. The key takeaway: without doing anything, at rest, VMs in public cloud are always encrypted and satisfy NIST SP 800-53 controls.

What about private cloud? Most Linux-based private clouds ultimately use qemu, typically with qcow2 virtual disk images. Qemu supports user-space encryption of qcow2 disks, see this manpage. Such encryption covers the full virtual machine disk, including the bootloader and ESP. And it is handled entirely outside of the VM on the host - meaning the VM never has access to the disk encryption keys. Qemu implements this encryption entirely in userspace using gnutls, nettle or libgcrypt, depending on how it was compiled. This also means one can satisfy FIPS requirements entirely in userspace, without a Linux kernel in FIPS mode. Higher-level APIs built on top of qemu also support qcow2 disk encryption, as in projects such as libvirt and OpenStack Cinder.

If you carefully read the docs, you may notice that agent support is sometimes explicitly called out as not supported, or not mentioned at all. Quite often agents running inside the OS may not have enough observability to assess whether there is external encryption. It does mean that monitoring the above encryption options requires different approaches - for example, monitor your cloud configuration using tools such as Wiz and Orca, rather than using agents inside individual VMs. For laptop / endpoint security agents, I do wish they would start gaining the capability to report OPAL SED availability and whether it is active or not.

What about using software encryption nonetheless, on top of the above solutions? This is commonly referred to as double or multiple encryption. There will be an additional performance impact, but it can be worthwhile. It really depends on what you define as data at rest for yourself and which controls you need. If one has a dual-boot laptop, and wants to keep one OS encrypted whilst booted into the other, it can be perfectly reasonable to encrypt the two using separate software encryption keys, in addition to the OPAL encryption of the ESP. For more targeted per-file / per-folder encryption, one can look into using gocryptfs, which is the best successor to the once popular, but now deprecated, eCryptfs (an amazing tool, but it has fallen behind in development and can lead to data loss).

All of the above mostly talks about cryptographic encryption, which only provides confidentiality but not data integrity. To protect integrity, one needs to choose how to maintain that. dm-verity is a good choice for read-only and rigid installations. For read-write workloads, it may be easier to deploy ZFS or Btrfs instead. If one is using a filesystem without built-in integrity support, such as XFS or Ext4, one can retrofit an integrity layer by using dm-integrity (either standalone, or via the dm-crypt/cryptsetup --integrity option).
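As a rough illustration of that last option, LUKS2 can carry a dm-integrity layer directly. This is a sketch only (it is destructive; the device path, cipher and integrity algorithm are assumptions to adapt to your own policy, and cryptsetup still flags authenticated encryption as experimental):

# format with LUKS2 authenticated encryption (dm-crypt + dm-integrity), then open and use it
sudo cryptsetup luksFormat --type luks2 --cipher aes-xts-plain64 --integrity hmac-sha256 /dev/sdX
sudo cryptsetup open /dev/sdX protected-data
sudo mkfs.ext4 /dev/mapper/protected-data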

If one has a large estate and a lot of encryption keys to keep track of, a key management solution is likely needed. The most popular solution is likely the one from Thales Group, marketed as the CipherTrust Data Security Platform (previously Vormetric), but there are many others, including OEM / vendor / hardware / cloud specific or agnostic solutions.

I hope this crash course guide piques your interest to learn about and discover modern confidentiality and integrity solutions, and to re-affirm or change your existing controls w.r.t. data protection at rest.

Full disk encryption, including the UEFI ESP /boot/efi, is now widely achievable by default on bare-metal machines and in VMs, including with FIPS certification. To discuss more, let's connect on LinkedIn.

26 July 2025

Birger Schacht: My DebConf 25 review

DebConf 25 happened between 14th July and 19th July and I was there. It was my first DebConf (the big one; I was at a Mini DebConf in Hamburg a couple of years ago) and it was interesting. DebConf 25 took place at a university campus on the outskirts of Brest and I was rather reluctant to go at first (EuroPython 25 was happening at the same time in Prague), but I decided to use the chance of DebConf happening in Europe, reachable by train from Vienna. We took the night train to Paris, then found our way through the maze that is the Paris underground system and then got to Brest with the TGV. On our way to the conference site we made a detour to a supermarket, which wasn't that easy because it was a national holiday in France and most of the shops were closed. But we weren't sure about the food situation at DebConf and we also wanted to get some beer. At the conference we were greeted by very friendly people at the badge station and the front desk and got our badges, swag and, most importantly, the keys to pretty nice rooms on the campus. Our rooms had a small private bathroom with a toilet and a shower, and between the two rooms was a shared kitchen with a refrigerator and a microwave. All in all, the accommodation was simple but provided everything we needed, especially a space to have some privacy.

During the next days I watched a lot of talks, met new people, caught up with old friends and also had a nice time with my travel buddies. There was a beach near the campus which I used nearly every day. It was mostly sunny except for the last day of the conference, which apparently is not common for the Brest area, so we got lucky regarding the weather.

Landscape view of the sea at Dellec beach

Given that we only arrived in the evening of the first day of DebConf, I missed the talk When Free Software Communities Unite: Tails, Tor, and the Fight for Privacy (recording), but I watched it on the way home and it was also covered by LWN. On Tuesday I started the day by visiting a talk about tag2upload (recording). The same day there was also an academic track and I watched the talk titled Integrating Knowledge Graphs into the Debian Ecosystem (recording), which presented a property graph showing relationships between various entities like packages, maintainers or bugs (there is a repository with parts of a paper, but not much other information). The speaker also mentioned the graphcast framework and the ontocast framework, which sound interesting - we might have use for something like this at $dayjob. In the afternoon there was a talk about the ArchWiki (recording) which gave a comprehensive insight into how the ArchWiki and the community behind it work. Right after that was a Debian Wiki BoF. There are various technical limitations with the current wiki software and there are not enough helping hands to maintain the service and do content curation. But the BoF had some nice results: there is now a new debian-wiki mailinglist and an IRC channel, a MediaWiki installation was set up during DebConf, there are efforts to migrate the data and, most importantly, there is a handful of people who want to maintain the service and organize the content of the wiki. I think the input from the ArchWiki folks gave some ideas of how that team could operate.

Tag at the wall at Dellec beach

Wednesday was the day of the day trip. I did not sign up for any of the trips and used the time to try out tag2upload, uploaded the latest labwc release to experimental and spent the rest of the day at the beach.
Other noteworthy sessions I attended were the Don't fear the TPM talk (recording), which showed me a lot of stuff to try out, the session about lintian-ng (no recording), which is an experimental approach to make lintian faster, the review of the first year of wcurl's existence (no recording yet) and the summary of Rust packaging in Debian (no recording yet). In between the sessions I started working on packaging wlr-sunclock (#1109230).

What did not work

Vegan food. I might be spoiled by other conferences. Both at EuroPython last year (definitely bigger, a lot more commercial) and at PyCon CZ 23 (similar in size, a lot more DIY) there was catering with explicitly vegan options. As I've mentioned in the beginning, we went to a supermarket before we went to the conference, and we had to go there one more time during the conference. I think there was a mixture of a total lack of awareness and a LOT of miscommunication. The breakfasts at the conference consisted of pastries and baguettes - I asked on the first day what the vegan options were and the answer was "I don't know, maybe the baguette?", and we were asked to only take as much baguette as the people who also got pastries. The lunch was prepared by the Restaurant associatif de Kernévent, which is a canteen at the university campus. When we asked if there was vegan food, the people there said that there was only a vegetarian option, so we only ate salad. Only later did we hear via word of mouth that one has to explicitly ask for a vegan meal, which was apparently prepared separately, and you had to find the right person who knows about it (I think that's very Debian-like). But even then a person once got a vegetarian option offered as vegan food. One problem was also the missing / confusing labeling of the food. At the conference dinner there was apparently vegan food, but it was mixed with all the other food. There were some labels, but with hundreds of hungry people around and caterers removing empty plates and dropping off plates with other stuff, everything got mixed up. In the end we ate bread soaked in olive oil, until the olive oil got taken away by the catering people literally while we were dipping the bread in it. And when these issues were raised, some of the reactions can be summarized as "You're holding it wrong", which was really frustrating. The dinners at the conference hall were similar. At some point I had the impression that vegan and vegetarian were simply seen as the same thing.

Dinner menu at the conference

If the menus had been written like a debian/copyright file, they would probably have looked like this:
Food: *
Diet: Vegan or Vegetarian
But the thing is that vegan and vegetarian cannot be mixed. It's similar to incompatible licenses: once you mix vegan food with vegetarian food, it's not vegan anymore. Don't get me wrong, I know it's hard to organize food for hundreds of people. But if you don't know what it means to provide a vegan option, just communicate that fact so people can look for alternatives in advance. During the week some of the vegan people shared food, which was really nice, and there were also a lot of non-vegan people who tried to help, organized extra food or simply listened to the hangry rants. Thanks for that!

Paris

Saturday was the last day of DebConf and it was a rainy day. On Sunday morning we took the TGV back to Paris and then stayed there for one night, because the next night train back to Vienna was on Monday. Luckily the weather was better in Paris. The first thing we did was to look up a vegan burger place. In the evening we strolled along the Seine and had a couple of beers at the Jardins du Trocadéro. On Monday the rain also arrived in Paris and we mostly went from one cafe to the next, but also managed to visit Notre Dame.

Conclusio

The next DebConf will be in Argentina and I think it's likely that DebConf 27 will also not happen anywhere within train-travelling distance. But even if it did, I think the Mini DebConfs are more my style of happening (there is one planned in Hamburg next spring, and a couple of days ago I learned that there will be a Back to the Future musical show in Hamburg during that time). Nonetheless I had a nice time and I stumbled over some projects I might get more involved in. Thanks also to my travel buddies who put up with me.

Matthew Palmer: Object deserialization attacks using Ruby's Oj JSON parser

tl;dr: there is an attack in the wild which is triggering dangerous-but-seemingly-intended behaviour in the Oj JSON parser when used in the default and recommended manner, which can lead to everyone's favourite kind of security problem: object deserialization bugs! If you have the oj gem anywhere in your Gemfile.lock, the quickest mitigation is to make sure you have Oj.default_options = { mode: :strict } somewhere, and that no library is overwriting that setting to something else.

Prologue

As a sensible sysadmin, all the sites I run send me a notification if any unhandled exception gets raised. Mostly, what I get sent is error-handling corner cases I missed, but now and then things get more interesting. In this case, it was a PG::UndefinedColumn exception, which looked something like this:
PG::UndefinedColumn: ERROR:  column "xyzzydeadbeef" does not exist
This is weird on two fronts: firstly, this application has been running for a while, and if there was a schema problem, I'd expect it to have made itself apparent long before now. And secondly, while I don't profess to perfection in my programming, I'm usually better at naming my database columns than that. Something is definitely hinky here, so let's jump into the mystery mobile!

The column name is coming from outside the building!

The exception notifications I get sent include a whole lot of information about the request that caused the exception, including the request body. In this case, the request body was JSON, and looked like this:
 "name":":xyzzydeadbeef", ... 
The leading colon looks an awful lot like the syntax for a Ruby symbol, but it's in a JSON string. Surely there's no way a JSON parser would be turning that into a symbol, right? Right?!? Immediately, I thought that that possibly was what was happening, because I use Sequel for my SQL database access needs, and Sequel treats symbols as database column names. It seemed like too much of a coincidence that a vaguely symbol-shaped string was being sent in, and the exact same name was showing up as a column name. But how the flying fudgepickles was a JSON string being turned into a Ruby symbol, anyway? Enter Oj.

Oj? I barely know aj

A long, long time ago, the standard Ruby JSON library had a reputation for being slow. Thus did many competitors flourish, claiming more features and better performance. Strong amongst the contenders was oj (for "Optimized JSON"), touted as "The fastest JSON parser and object serializer". Given the history, it's not surprising that people who wanted the best possible performance turned to Oj, leading to it being found in a great many projects, often as a sub-dependency of a dependency of a dependency (which is how it ended up in my project). You might have noticed in Oj's description that, in addition to claiming "fastest", it also describes itself as an "object serializer". Anyone who has kept an eye on the security bug landscape will recall that object deserialization is a rich vein of vulnerabilities to mine. Libraries that do object deserialization, especially ones with a history that goes back to before the vulnerability class was well understood, are likely to be trouble magnets. And thus it turns out to be with Oj. By default, Oj will happily turn any string that starts with a colon into a symbol:

>> require "oj"
>> Oj.load('{"name":":xyzzydeadbeef","username":"bob","answer":42}')
=> {"name"=>:xyzzydeadbeef, "username"=>"bob", "answer"=>42}

How that gets exploited is only limited by the creativity of an attacker. Which I'll talk about more shortly, but first, a word from my rant cortex.

Insecure By Default is a Cancer

While the object of my ire today is Oj and its fast-and-loose approach to deserialization, it is just one example of a pervasive problem in software: insecurity by default. Whether it's a database listening on 0.0.0.0 with no password as soon as it's installed, or a library whose default behaviour is to permit arbitrary code execution, it all contributes to a software ecosystem that is an appalling security nightmare. When a user (in this case, a developer who wants to parse JSON) comes across a new piece of software, they have by definition no idea what they're doing with that software. They're going to use the defaults, and follow the most easily-available documentation, to achieve their goal. It is unrealistic to assume that a new user of a piece of software is going to do things "the right way", unless that right way is the only way, or at least the by-far-the-easiest way. Conversely, the developer(s) of the software is/are the domain experts. They have knowledge of the problem domain, through their exploration while building the software, and unrivalled expertise in the codebase. Given this disparity in knowledge, it is tantamount to malpractice for the experts (the developers) to off-load the responsibility for the safe and secure use of the software to the party that has the least knowledge of how to do that (the new user).

To apply this general principle to the specific case, take the "Using" section of the Oj README. The example code there calls Oj.load, with no indication that this code will, in fact, parse specially-crafted JSON documents into Ruby objects. The brand-new user of the library, no doubt being under pressure to Get Things Done, is almost certainly going to look at this "Using" example, get the apparent result they were after (a parsed JSON document), and call it a day. It is unlikely that a brand-new user will, for instance, scroll down to the "Further Reading" section, find the second-to-last (of ten) listed documents, "Security.md", and carefully peruse it. If they do, they'll find an oblique suggestion that parsing untrusted input is never a good idea. While that's true, it's also rather unhelpful, because I'd wager that by far the majority of JSON parsed in the world is "untrusted", in one way or another, given the predominance of JSON as a format for serializing data passing over the Internet. This guidance is roughly akin to putting a label on a car's airbags saying that driving at speed can be hazardous to your health: true, but unhelpful under the circumstances.

The solution is for default behaviours to be secure, and any deviation from that default that has the potential to degrade security must, at the very least, be clearly labelled as such. For example, the Oj.load function should be named Oj.unsafe_load, and the Oj.load function should behave as the Oj.safe_load function does presently. By naming the unsafe function as explicitly unsafe, developers (and reviewers) have at least a fighting chance of recognising they're doing something risky. We put warning labels on just about everything in the real world; the same should be true of dangerous function calls. OK, rant over. Back to the story.

But how is this exploitable?

So far, I've hopefully made it clear that Oj does some Weird Stuff when parsing certain JSON strings. It caused an unhandled exception in a web application I run, which isn't cool, but apart from bombing me with exception notifications, what's the harm? For starters, let's look at our original example: when presented with a symbol, Sequel will interpret that as a column name, rather than a string value. Thus, if our "save an update to the user" code looked like this:

# request_body has the JSON representation of the form being submitted
body = Oj.load(request_body)
DB[:users].where(id: user_id).update(name: body["name"])

In normal operation, this will issue an SQL query along the lines of UPDATE users SET name='Jaime' WHERE id=42. If the name given is "Jaime O'Dowd", all is still good, because Sequel quotes string values, etc etc. All's well so far. But imagine there is a column in the users table that normally users cannot read, perhaps admin_notes. Or perhaps an attacker has gotten temporary access to an account, and wants to dump the user's password hash for offline cracking. So, they send an update claiming that their name is ":admin_notes" (or ":password_hash"). In JSON, that'll look like {"name":":admin_notes"}, and Oj.load will happily turn that into a Ruby object of {"name"=>:admin_notes}. When run through the above "update the user" code fragment, it'll produce the SQL UPDATE users SET name=admin_notes WHERE id=42. In other words, it'll copy the contents of the admin_notes column into the name column, which the attacker can then read out just by refreshing their profile page.

But Wait, There's More!

That an attacker can read other fields in the same table isn't great, but that's barely scratching the surface. Remember before I said that Oj does "object serialization"? That means that, in general, you can create arbitrary Ruby objects from JSON. Since objects contain code, it's entirely possible to trigger arbitrary code execution by instantiating an appropriate Ruby object. I'm not going to go into details about how to do this, because it's not really my area of expertise, and many others have covered it in detail. But rest assured, if an attacker can feed input of their choosing into a default call to Oj.load, they've been handed remote code execution on a platter.

Mitigations

As Oj's object deserialization is intended and documented behaviour, don't expect a future release to make any of this any safer. Instead, we need to mitigate the risks. Here are my recommended steps:
  1. Look in your Gemfile.lock (or SBOM, if that's your thing) to see if the oj gem is anywhere in your codebase (a quick shell check is sketched after this list). Remember that even if you don't use it directly, it's popular enough that it is used in a lot of places. If you find it in your transitive dependency tree anywhere, there's a chance you're vulnerable, limited only by the ingenuity of attackers to feed crafted JSON into a deeply-hidden Oj.load call.
  2. If you depend on oj directly and use it in your project, consider not doing that. The json gem is acceptably fast, and JSON.parse won t create arbitrary Ruby objects.
  3. If you really, really need to squeeze the last erg of performance out of your JSON parsing, and decide to use oj to do so, find all calls to Oj.load in your code and switch them to call Oj.safe_load.
  4. It is a really, really bad idea to ever use Oj to deserialize JSON into objects, as it lacks the safety features needed to mitigate the worst of the risks of doing so (for example, restricting which classes can be instantiated, as is provided by the permitted_classes argument to Psych.load). I'd make it a priority to move away from using Oj for that, and switch to something somewhat safer (such as the aforementioned Psych). At the very least, audit and comment heavily to minimise the risk of user-provided input sneaking into those calls somehow, and pass mode: :object as the second argument to Oj.load, to make it explicit that you are opting in to this far more dangerous behaviour only when it's absolutely necessary.
  5. To secure any unsafe uses of Oj.load in your dependencies, consider setting the default Oj parsing mode to :strict, by putting Oj.default_options = { mode: :strict } somewhere in your initialization code (and make sure no dependencies are setting it to something else later!). There is a small chance that this change of default might break something, if a dependency is using Oj to deliberately create Ruby objects from JSON, but the overwhelming likelihood is that Oj's just being used to parse ordinary JSON, and these calls are just RCE vulnerabilities waiting to give you a bad time.
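For step 1, a quick way to run that check is to grep the lockfile and your own source tree. A minimal sketch, assuming a standard Bundler project layout (adjust the paths and patterns to your repository):

# is the oj gem anywhere in the resolved dependency tree?
grep -nE '^ +oj \(' Gemfile.lock
# where does our own code call Oj.load directly?
grep -rn 'Oj\.load' app/ lib/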

Is Your Bacon Saved?

If I've helped you identify and fix potential RCE vulnerabilities in your software, or even just opened your eyes to the risks of object deserialization, please help me out by buying me a refreshing beverage. I would really appreciate any support you can give. Alternately, if you'd like my help in fixing these (and many other) sorts of problems, I'm looking for work, so email me.

20 July 2025

Michael Prokop: What to expect from Debian/trixie #newintrixie

Trixie Banner, Copyright 2024 Elise Couper

Update on 2025-07-28: added note about Debian 13/trixie support for OpenVox (thanks, Ben Ford!)

Debian v13 with codename trixie is scheduled to be published as the new stable release on 9th of August 2025. I was the driving force at several of my customers to be well prepared for the upcoming stable release (my efforts for trixie started in August 2024). On the one hand, to make sure packages we care about are available and actually make it into the release. On the other hand, to ensure there are no severe issues that make it into the release, and to get proper and working upgrades. So far everything is looking pretty good and working fine, the efforts seem to have paid off. :) As usual with major upgrades, there are some things to be aware of, and hereby I'm starting my public notes on trixie that might be worthwhile for other folks. My focus is primarily on server systems and looking at things from a sysadmin perspective.

Further readings

As usual, start at the official Debian release notes, and make sure to especially go through "What's new in Debian 13" + "issues to be aware of for trixie" (strongly recommended read!).

Package versions

As a starting point, let's look at some selected packages and their versions in bookworm vs. trixie as of 2025-07-20 (mainly having amd64 in mind):
Package bookworm/v12 trixie/v13
ansible 2.14.3 2.19.0
apache 2.4.62 2.4.64
apt 2.6.1 3.0.3
bash 5.2.15 5.2.37
ceph 16.2.11 18.2.7
docker 20.10.24 26.1.5
dovecot 2.3.19 2.4.1
dpkg 1.21.22 1.22.21
emacs 28.2 30.1
gcc 12.2.0 14.2.0
git 2.39.5 2.47.2
golang 1.19 1.24
libc 2.36 2.41
linux kernel 6.1 6.12
llvm 14.0 19.0
lxc 5.0.2 6.0.4
mariadb 10.11 11.8
nginx 1.22.1 1.26.3
nodejs 18.13 20.19
openjdk 17.0 21.0
openssh 9.2p1 10.0p1
openssl 3.0 3.5
perl 5.36.0 5.40.1
php 8.2+93 8.4+96
podman 4.3.1 5.4.2
postfix 3.7.11 3.10.3
postgres 15 17
puppet 7.23.0 8.10.0
python3 3.11.2 3.13.5
qemu/kvm 7.2 10.0
rsync 3.2.7 3.4.1
ruby 3.1 3.3
rust 1.63.0 1.85.0
samba 4.17.12 4.22.3
systemd 252.36 257.7-1
unattended-upgrades 2.9.1 2.12
util-linux 2.38.1 2.41
vagrant 2.3.4 2.3.7
vim 9.0.1378 9.1.1230
zsh 5.9 5.9
Misc unsorted

apt

The new apt version 3.0 brings several new features.

systemd

systemd got upgraded from v252.36-1~deb12u1 to 257.7-1 and there are lots of changes. Be aware that systemd v257 has a new net.naming_scheme: with v257 the PCI slot number is now read from the firmware_node/sun sysfs file, and the naming scheme based on devicetree aliases was extended to support aliases for individual interfaces of controllers with multiple ports. This might affect you, see e.g. #1092176 and #1107187; the Debian Wiki provides further useful information. There are new systemd tools available, the tools provided by systemd gained several new options, and Debian's systemd ships new binary packages.

Linux Kernel

The trixie release ships a Linux kernel based on the latest longterm version 6.12. As usual there are lots of changes in the kernel area, including better hardware support, and this might warrant a separate blog entry. See Kernelnewbies.org for further changes between kernel versions.

Configuration management

For puppet users, Debian provides the puppet-agent (v8.10.0), puppetserver (v8.7.0) and puppetdb (v8.4.1) packages. Puppet's upstream does not provide packages for trixie yet. Given how long it took them for Debian bookworm, and with their recent Plans for Open Source Puppet in 2025, it's unclear when (and whether at all) we might get something. As a result of upstream's behavior, the OpenVox project evolved, and they already provide Debian 13/trixie support (https://apt.voxpupuli.org/openvox8-release-debian13.deb). FYI: the AIO puppet-agent package for bookworm (v7.34.0-1bookworm) so far works fine for me on Debian/trixie. Be aware that due to the apt-key removal you need a recent version of the puppetlabs-apt module for usage with trixie. The puppetlabs-ntp module isn't ready for trixie yet (regarding ntp/ntpsec), if you should depend on that. ansible is available and made it into trixie with version 2.19.

Prometheus stack

The Prometheus server was updated from v2.42.0 to v2.53, and all the exporters that were shipped with bookworm are still around (in more recent versions, of course). Trixie also gained some new exporters.

Virtualization

docker (v26.1.5), ganeti (v3.1.0), libvirt (v11.3.0, be aware of significant changes to libvirt packaging), lxc (v6.0.4), podman (v5.4.2), openstack (see openstack-team on Salsa), qemu/kvm (v10.0.2) and xen (v4.20.0) are all still around. Proxmox already announced their PVE 9.0 BETA, being based on trixie and providing the 6.14.8-1 kernel, QEMU 10.0.2, LXC 6.0.4 and OpenZFS 2.3.3. Vagrant is available in version 2.3.7, but Vagrant upstream does not provide packages for trixie yet. Given that HashiCorp adopted the BSL, the future of vagrant in Debian is unclear. If you're relying on VirtualBox, be aware that upstream doesn't provide packages for trixie yet. VirtualBox is available from Debian/unstable (version 7.1.12-dfsg-1 as of 2025-07-20), but has not been shipped with a stable release for quite some time (due to lack of cooperation from upstream on security support for older releases, see #794466). Be aware that starting with Linux kernel 6.12, KVM initializes virtualization on module loading by default. This prevents VirtualBox VMs from starting. In order to avoid this, either add the kvm.enable_virt_at_load=0 parameter to the kernel command line or unload the corresponding kvm_intel / kvm_amd module.
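A minimal sketch of both workarounds (kvm_intel is for Intel systems, use kvm_amd on AMD; the GRUB path assumes a standard Debian setup):

# one-off: unload KVM so VirtualBox can claim VT-x/AMD-V
sudo modprobe -r kvm_intel kvm

# persistent: append kvm.enable_virt_at_load=0 to GRUB_CMDLINE_LINUX_DEFAULT
# in /etc/default/grub, then regenerate the config and reboot
sudo update-grub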
If you want to use Vagrant with VirtualBox on trixie, be aware that Debian's vagrant package as present in trixie doesn't support the VirtualBox package version 7.1 as present in Debian/unstable (manually patching vagrant's meta.rb and rebuilding the package without Breaks: virtualbox (>= 7.1) is known to be working).

util-linux

There are plenty of new options available in the tools provided by util-linux, some tools are no longer present in util-linux as of trixie, several binaries got moved from util-linux to the util-linux-extra package, and the util-linux-extra package also provides new tools.

OpenSSH

OpenSSH was updated from v9.2p1 to 10.0p1-5, so if you're interested in all the changes, check out the release notes between those versions (9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9 + 10.0). There are some notable behavior changes in Debian, as well as some notable new features.

Thanks to everyone involved in the release, looking forward to trixie, and happy upgrading!
Let's continue with working towards Debian/forky. :)

17 July 2025

Arnaud Rebillout: Acquire-By-Hash for APT packages repositories, and the lack of it in Kali Linux

This is a lengthy blog post. It features a long introduction that explains how apt update acquires various files from a package repository, what Acquire-By-Hash is, and how it all works for Kali Linux: a Debian-based distro that doesn't support Acquire-By-Hash, and which is distributed via a network of mirrors and a redirector. In a second part, I explore some "Hash Sum Mismatch" errors that we can hit with Kali Linux, errors that would not happen if only Acquire-By-Hash was supported. If anything, this blog post supports the case for adding Acquire-By-Hash support in reprepro, as requested at https://bugs.debian.org/820660. All of this could have just remained some personal notes for myself, but I got carried away and turned it into a blog post, dunno why... Hopefully others will find it interesting, but you really need to like troubleshooting stories, packed with details, and poorly written at that. You've been warned!

Introducing Acquire-By-Hash

Acquire-By-Hash is a feature of APT package repositories that might or might not be supported by your favorite Debian-based distribution. A repository that supports it says so, in the Release file, by setting the field Acquire-By-Hash: yes. It's easy to check. Debian and Ubuntu both support it:
$ wget -qO- http://deb.debian.org/debian/dists/sid/Release | grep -i ^Acquire-By-Hash:
Acquire-By-Hash: yes
$ wget -qO- http://archive.ubuntu.com/ubuntu/dists/devel/Release | grep -i ^Acquire-By-Hash:
Acquire-By-Hash: yes
What about other Debian derivatives?
$ wget -qO- http://http.kali.org/kali/dists/kali-rolling/Release | grep -i ^Acquire-By-Hash: || echo not supported
not supported
$ wget -qO- https://archive.raspberrypi.com/debian/dists/trixie/Release | grep -i ^Acquire-By-Hash: || echo not supported
not supported
$ wget -qO- http://packages.linuxmint.com/dists/faye/Release | grep -i ^Acquire-By-Hash: || echo not supported
not supported
$ wget -qO- https://apt.pop-os.org/release/dists/noble/Release | grep -i ^Acquire-By-Hash: || echo not supported
not supported
Huhu, Acquire-By-Hash is not ubiquitous. But wait, what is Acquire-By-Hash to start with? To answer that, we have to take a step back and cover some basics first.

The HTTP requests performed by 'apt update'

What happens when one runs apt update? APT first requests the Release file from the repository(ies) configured in the APT sources. This file is a starting point: it contains a list of other files (sometimes called "Index files") that are available in the repository, along with their hashes. After fetching the Release file, APT proceeds to request those Index files. To give you an idea, there are many kinds of Index files. There's an excellent Wiki page that details the structure of a Debian package repository: https://wiki.debian.org/DebianRepository/Format. Note that APT doesn't necessarily download ALL of those Index files. For simplicity, we'll limit ourselves to the minimal scenario, where apt update downloads only the Packages files. Let's try to make it more visual: here's a representation of an apt update transaction, assuming that all the components of the repository are enabled:
apt update -> Release -> Packages (main/amd64)
                      -> Packages (contrib/amd64)
                      -> Packages (non-free/amd64)
                      -> Packages (non-free-firmware/amd64)
Meaning that, in a first step, APT downloads the Release file, reads its content, and then in a second step it downloads the Index files in parallel. You can actually see that happen with a command such as apt -q -o Debug::Acquire::http=true update 2>&1 | grep ^GET. For Kali Linux you'll see something pretty similar to what I described above. Try it!
$ podman run --rm kali-rolling apt -q -o Debug::Acquire::http=true update 2>&1 | grep ^GET
GET /kali/dists/kali-rolling/InRelease HTTP/1.1    # <- returns a redirect, that is why the file is requested twice
GET /kali/dists/kali-rolling/InRelease HTTP/1.1
GET /kali/dists/kali-rolling/non-free/binary-amd64/Packages.gz HTTP/1.1
GET /kali/dists/kali-rolling/main/binary-amd64/Packages.gz HTTP/1.1
GET /kali/dists/kali-rolling/non-free-firmware/binary-amd64/Packages.gz HTTP/1.1
GET /kali/dists/kali-rolling/contrib/binary-amd64/Packages.gz HTTP/1.1
However, and it's now becoming interesting, for Debian or Ubuntu you won't see the same kind of URLs:
$ podman run --rm debian:sid apt -q -o Debug::Acquire::http=true update 2>&1 | grep ^GET
GET /debian/dists/sid/InRelease HTTP/1.1
GET /debian/dists/sid/main/binary-amd64/by-hash/SHA256/22709f0ce67e5e0a33a6e6e64d96a83805903a3376e042c83d64886bb555a9c3 HTTP/1.1
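If you want to convince yourself that this opaque filename is simply a hash taken from the Release file, a rough check might look like this (a sketch only; I'm assuming the deb.debian.org sid layout, the xz-compressed index, and that the SHA256 section comes last in the Release file):

# grab the SHA256 listed for the Packages index, then fetch it via its by-hash path
hash=$(wget -qO- http://deb.debian.org/debian/dists/sid/Release \
    | grep 'main/binary-amd64/Packages.xz$' | tail -n1 | awk '{print $1}')
wget -qO /tmp/Packages.xz "http://deb.debian.org/debian/dists/sid/main/binary-amd64/by-hash/SHA256/$hash"
sha256sum /tmp/Packages.xz   # should print the same hash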
APT doesn't download a file named Packages; instead it fetches a file named after a hash. Why? This is due to the field Acquire-By-Hash: yes that is present in Debian's Release file.

What does Acquire-By-Hash mean for 'apt update'

The idea with Acquire-By-Hash is that the Index files are named after their hash on the repository, so if the MD5 sum of main/binary-amd64/Packages is 77b2c1539f816832e2d762adb20a2bb1, then the file will be stored at main/binary-amd64/by-hash/MD5Sum/77b2c1539f816832e2d762adb20a2bb1. The path main/binary-amd64/Packages still exists (it's the "Canonical Location" of this particular Index file), but APT won't use it; instead it downloads the file located in the by-hash/ directory.

Why does it matter? This has to do with repository updates, and allowing the package repository to be updated atomically, without interruption of service, and without risk of failure client-side. It's important to understand that the Release file and the Index files are part of a whole, a set of files that go together, given that Index files are validated by their hash (as listed in the Release file) after download by APT. If those files are simply named "Release" and "Packages", it means they are not immutable: when the repository is updated, all of those files are updated "in place". And that causes problems. A typical failure mode for the client, during a repository update, is that: 1) APT requests the Release file, then 2) the repository is updated and finally 3) APT requests the Packages files, but their checksums don't match, causing apt update to fail. There are variations of this error, but you get the idea: updating a set of files "in place" is problematic.

The Acquire-By-Hash mechanism was introduced exactly to solve this problem: now the Index files have a unique, immutable name. When the repository is updated, at first new Index files are added in the by-hash/ directory, and only after that is the Release file updated. Old Index files in by-hash/ are retained for a while, so there's a grace period during which both the old and the new Release files are valid and working: the Index files that they refer to are available in the repo. As a result: no interruption of service, no failure client-side during repository updates. This is explained in more detail at https://www.chiark.greenend.org.uk/~cjwatson/blog/no-more-hash-sum-mismatch-errors.html, which is the blog post from Colin Watson that came out at the time Acquire-By-Hash was introduced in... 2016. This is still an excellent read in 2025.

So you might be wondering why I'm rambling about a problem that was solved 10 years ago, but then, as I've shown in the introduction, the problem is not solved for everyone. Support for Acquire-By-Hash server side is not a given, and unfortunately it never landed in reprepro, as one can see at https://bugs.debian.org/820660. reprepro is a popular tool for creating APT package repositories. In particular, at Kali Linux we use reprepro, and that's why there's no Acquire-By-Hash: yes in the Kali Release file. As one can guess, it leads to subtle issues during those moments when the repository is updated. However... we're not ready to talk about that yet! There's still another topic that we need to cover: this window of time during which a repository is being updated, and during which apt update might fail.

The window for Hash Sum Mismatches, and the APT trick that saves the day

Pay attention!
In this section, we're now talking about package repositories that do NOT support Acquire-By-Hash, such as the Kali Linux repository.

As I've said above, it's only when the repository is being updated that there is a "Hash Sum Mismatch Window", i.e. a moment when apt update might fail for some unlucky clients, due to invalid Index files. Surely, it's a very very short window of time, right? I mean, it can't take that long to update files on a server, especially when you know that a repository is usually updated via rsync, and rsync goes to great lengths to update files as atomically as it can (with the option --delay-updates). So if apt update fails for me, I've been very unlucky, but I can just retry in a few seconds and it should be fixed, right?

The answer is: it's not that simple. So far I pictured the "package repository" as a single server, for simplicity. But that's not always the case. For Kali Linux, by default users have http.kali.org configured in their APT sources, and it is a redirector, i.e. a web server that redirects requests to mirrors that are near the client. Some context that matters for what comes next: the Kali repository is synced with ~70 mirrors all around the world, 4 times a day.

What happens if your apt update requests are redirected to 2 mirrors close by, and one was just synced, while the other is still syncing (or even worse, failed to sync entirely)? You'll get a mix of old and new Index files. Hash Sum Mismatch! As you can see, with this setup the "Hash Sum Mismatch Window" becomes much longer than a few seconds: as long as nearby mirrors are syncing, the window is open. You could have a fast and a slow mirror next to you, and they can be out of sync with each other for several minutes every time the repository is updated, for example.

For Kali Linux in particular, there's a "detail" in our network of mirrors that, as a side-effect, almost guarantees that this window lasts several minutes at least. This is because the pool of mirrors includes kali.download, which is in fact the Cloudflare CDN, and from the redirector's point of view, it's seen as a "super mirror" that is present in every country. So when APT fires a bunch of requests against http.kali.org, it's likely that some of them will be redirected to the Kali CDN, and others will be redirected to a mirror near you. So far so good, but there's another point of detail to be aware of: the Kali CDN is synced first, before the other mirrors. Another thing: usually the mirrors that are the farthest from the Tier-0 mirror take the longest to sync. Putting all of that together: if you live somewhere in Asia, it's not uncommon for your "Hash Sum Mismatch Window" to be as long as 30 minutes, between the moment the Kali CDN is synced and the moment your nearby mirrors catch up and are finally in sync as well.

Having said all of that, and assuming you're still reading (anyone here?), you might be wondering... Does that mean that apt update is broken 4 times a day, for around 30 minutes, for every Kali user out there? How can they put up with that? The answer is: no, of course not, it's not broken like that. It works despite all of that, and this is thanks to yet another detail that we didn't go into yet. This detail lies in APT itself. APT is in fact "redirector aware", in a sense. When it fetches a Release file, and if ever the request is redirected, it then fires the subsequent requests against the server it was initially redirected to.
So you are guaranteed that the Release file and the Index files are retrieved from the same mirror! Which brings our "Hash Sum Mismatch Window" back to the window for a single server, i.e. something like a few seconds at worst, hopefully. And that's what makes it work for Kali, literally. Without this trick, everything would fall apart. For reference, this feature was implemented in APT back in... 2016! A busy year, it seems! Here's the link to the commit: use the same redirection mirror for all index files. To finish, a dump from the console. You can see this behaviour play out easily, again with APT debugging turned on. Below we can see that only the first request hits the Kali redirector:
$ podman run --rm kali-rolling apt -q -o Debug::Acquire::http=true update 2>&1 | grep -e ^Answer -e ^HTTP
Answer for: http://http.kali.org/kali/dists/kali-rolling/InRelease
HTTP/1.1 302 Found
Answer for: http://mirror.freedif.org/kali/dists/kali-rolling/InRelease
HTTP/1.1 200 OK
Answer for: http://mirror.freedif.org/kali/dists/kali-rolling/non-free-firmware/binary-amd64/Packages.gz
HTTP/1.1 200 OK
Answer for: http://mirror.freedif.org/kali/dists/kali-rolling/contrib/binary-amd64/Packages.gz
HTTP/1.1 200 OK
Answer for: http://mirror.freedif.org/kali/dists/kali-rolling/main/binary-amd64/Packages.gz
HTTP/1.1 200 OK
Answer for: http://mirror.freedif.org/kali/dists/kali-rolling/non-free/binary-amd64/Packages.gz
HTTP/1.1 200 OK
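As an aside, whether a repository advertises Acquire-By-Hash at all is easy to check by looking at its Release file. A quick sketch (output trimmed; as said above, the Kali Release file simply doesn't carry the field, so the second command prints nothing):
$ curl -sL http://deb.debian.org/debian/dists/sid/Release | grep ^Acquire-By-Hash
Acquire-By-Hash: yes
$ curl -sL http://http.kali.org/kali/dists/kali-rolling/Release | grep ^Acquire-By-Hash
$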
Interlude

Believe it or not, we're done with the introduction! At this point, we have a good understanding of what apt update does (in terms of HTTP requests), we know that Release files and Index files are part of a whole, and we know that a repository can be updated atomically thanks to the Acquire-By-Hash feature, so that users don't experience interruption of service or failures of any sort, even with a rolling repository that is updated several times a day, like Debian sid. We've also learnt that, despite the fact that Acquire-By-Hash landed almost 10 years ago, some distributions like Kali Linux are still doing without it... and yet it works! But the reason why it works is more complicated to grasp, especially when you add a network of mirrors and a redirector to the picture. Moreover, it doesn't work as flawlessly as with the Acquire-By-Hash feature: we still expect some short (seconds at worst) "Hash Sum Mismatch Windows" for those unlucky users who run apt update at the wrong moment.

This was a long intro, but it really sets the stage for what comes next: the edge cases. Some situations in which we can hit Hash Sum Mismatch errors with Kali. Error cases that I've collected and investigated over time... If anything, they support the case that Acquire-By-Hash is really something that should be implemented in reprepro. More on that in the conclusion, but for now, let's look at those edge cases.

Edge Case 1: the caching proxy

If you put a caching proxy (such as approx, my APT caching proxy of choice) between yourself and the actual package repository, then obviously it's the caching proxy that performs the HTTP requests, and therefore APT will never know about the redirections returned by the server, if any. So the APT trick of downloading all the Index files from the same server in case of a redirect doesn't work anymore. It was rather easy to confirm that by building a Kali package during a mirror sync, and watching it fail at the "Update chroot" step:
$ sudo rm /var/cache/approx/kali/dists/ -fr
$ gbp buildpackage --git-builder=sbuild
+------------------------------------------------------------------------------+
| Update chroot                                Wed, 11 Jun 2025 10:33:32 +0000 |
+------------------------------------------------------------------------------+
Get:1 http://http.kali.org/kali kali-dev InRelease [41.4 kB]
Get:2 http://http.kali.org/kali kali-dev/contrib Sources [81.6 kB]
Get:3 http://http.kali.org/kali kali-dev/main Sources [17.3 MB]
Get:4 http://http.kali.org/kali kali-dev/non-free Sources [122 kB]
Get:5 http://http.kali.org/kali kali-dev/non-free-firmware Sources [8297 B]
Get:6 http://http.kali.org/kali kali-dev/non-free amd64 Packages [197 kB]
Get:7 http://http.kali.org/kali kali-dev/non-free-firmware amd64 Packages [10.6 kB]
Get:8 http://http.kali.org/kali kali-dev/contrib amd64 Packages [120 kB]
Get:9 http://http.kali.org/kali kali-dev/main amd64 Packages [21.0 MB]
Err:9 http://http.kali.org/kali kali-dev/main amd64 Packages
  File has unexpected size (20984689 != 20984861). Mirror sync in progress? [IP: ::1 9999]
  Hashes of expected file:
   - Filesize:20984861 [weak]
   - SHA256:6cbbee5838849ffb24a800bdcd1477e2f4adf5838a844f3838b8b66b7493879e
   - SHA1:a5c7e557a506013bd0cf938ab575fc084ed57dba [weak]
   - MD5Sum:1433ce57419414ffb348fca14ca1b00f [weak]
  Release file created at: Wed, 11 Jun 2025 07:15:10 +0000
Fetched 17.9 MB in 9s (1893 kB/s)
Reading package lists...
E: Failed to fetch http://http.kali.org/kali/dists/kali-dev/main/binary-amd64/Packages.gz  File has unexpected size (20984689 != 20984861). Mirror sync in progress? [IP: ::1 9999]
   Hashes of expected file:
    - Filesize:20984861 [weak]
    - SHA256:6cbbee5838849ffb24a800bdcd1477e2f4adf5838a844f3838b8b66b7493879e
    - SHA1:a5c7e557a506013bd0cf938ab575fc084ed57dba [weak]
    - MD5Sum:1433ce57419414ffb348fca14ca1b00f [weak]
   Release file created at: Wed, 11 Jun 2025 07:15:10 +0000
E: Some index files failed to download. They have been ignored, or old ones used instead.
E: apt-get update failed
The obvious workaround is to NOT use the redirector in the approx configuration. Either use a mirror close by, or the Kali CDN:
$ grep kali /etc/approx/approx.conf 
#kali http://http.kali.org/kali <- do not use the redirector!
kali  http://kali.download/kali
Edge Case 2: debootstrap struggles

What if one tries to debootstrap Kali while mirrors are being synced? It can give you some ugly logs, but it might not be fatal:
$ sudo debootstrap kali-dev kali-dev http://http.kali.org/kali
[...]
I: Target architecture can be executed
I: Retrieving InRelease 
I: Checking Release signature
I: Valid Release signature (key id 827C8569F2518CC677FECA1AED65462EC8D5E4C5)
I: Retrieving Packages 
I: Validating Packages 
W: Retrying failed download of http://http.kali.org/kali/dists/kali-dev/main/binary-amd64/Packages.gz
I: Retrieving Packages 
I: Validating Packages 
W: Retrying failed download of http://http.kali.org/kali/dists/kali-dev/main/binary-amd64/Packages.gz
I: Retrieving Packages 
I: Validating Packages 
W: Retrying failed download of http://http.kali.org/kali/dists/kali-dev/main/binary-amd64/Packages.gz
I: Retrieving Packages 
I: Validating Packages 
W: Retrying failed download of http://http.kali.org/kali/dists/kali-dev/main/binary-amd64/Packages.gz
I: Retrieving Packages 
I: Validating Packages 
I: Resolving dependencies of required packages...
I: Resolving dependencies of base packages...
I: Checking component main on http://http.kali.org/kali...
I: Retrieving adduser 3.152
[...]
To understand this one, we have to go and look at the debootstrap source code. How does debootstrap fetch the Release file and the Index files? It uses wget, and it retries up to 10 times in case of failure. It's not as sophisticated as APT: it doesn't detect when the Release file is served via a redirect. As a consequence, what happens above can be explained as follows:
  1. debootstrap requests the Release file, gets redirected to a mirror, and retrieves it from there
  2. then it requests the Packages file, gets redirected to another mirror that is not in sync with the first one, and retrieves it from there
  3. validation fails, since the checksum is not as expected
  4. try again and again
Since debootstrap retries up to 10 times, at some point it's lucky enough to get redirected to the same mirror it got its Release file from, and this time it gets the right Packages file, with the expected checksum. So ultimately it succeeds.

Edge Case 3: post-debootstrap failure

I like this one, because it gets us to yet another detail that we haven't talked about yet. So, what happens after we have successfully debootstrapped Kali? We have only the main component enabled, and only the Index file for this component has been retrieved. It looks like this:
$ sudo debootstrap kali-dev kali-dev http://http.kali.org/kali
[...]
I: Base system installed successfully.
$ cat kali-dev/etc/apt/sources.list
deb http://http.kali.org/kali kali-dev main
$ ls -l kali-dev/var/lib/apt/lists/
total 80468
-rw-r--r-- 1 root root    41445 Jun 19 07:02 http.kali.org_kali_dists_kali-dev_InRelease
-rw-r--r-- 1 root root 82299122 Jun 19 07:01 http.kali.org_kali_dists_kali-dev_main_binary-amd64_Packages
-rw-r--r-- 1 root root    40562 Jun 19 11:54 http.kali.org_kali_dists_kali-dev_Release
-rw-r--r-- 1 root root      833 Jun 19 11:54 http.kali.org_kali_dists_kali-dev_Release.gpg
drwxr-xr-x 2 root root     4096 Jun 19 11:54 partial
So far so good. Next step would be to complete the sources.list with other components, then run apt update: APT will download the missing Index files. But if you're unlucky, that might fail:
$ sudo sed -i 's/main$/main contrib non-free non-free-firmware/' kali-dev/etc/apt/sources.list
$ cat kali-dev/etc/apt/sources.list
deb http://http.kali.org/kali kali-dev main contrib non-free non-free-firmware
$ sudo chroot kali-dev apt update
Hit:1 http://http.kali.org/kali kali-dev InRelease
Get:2 http://kali.download/kali kali-dev/contrib amd64 Packages [121 kB]
Get:4 http://mirror.sg.gs/kali kali-dev/non-free-firmware amd64 Packages [10.6 kB]
Get:3 http://mirror.freedif.org/kali kali-dev/non-free amd64 Packages [198 kB]
Err:3 http://mirror.freedif.org/kali kali-dev/non-free amd64 Packages
  File has unexpected size (10442 != 10584). Mirror sync in progress? [IP: 66.96.199.63 80]
  Hashes of expected file:
   - Filesize:10584 [weak]
   - SHA256:71a83d895f3488d8ebf63ccd3216923a7196f06f088461f8770cee3645376abb
   - SHA1:c4ff126b151f5150d6a8464bc6ed3c768627a197 [weak]
   - MD5Sum:a49f46a85febb275346c51ba0aa8c110 [weak]
  Release file created at: Fri, 23 May 2025 06:48:41 +0000
Fetched 336 kB in 4s (77.5 kB/s)  
Reading package lists... Done
E: Failed to fetch http://mirror.freedif.org/kali/dists/kali-dev/non-free/binary-amd64/Packages.gz  File has unexpected size (10442 != 10584). Mirror sync in progress? [IP: 66.96.199.63 80]
   Hashes of expected file:
    - Filesize:10584 [weak]
    - SHA256:71a83d895f3488d8ebf63ccd3216923a7196f06f088461f8770cee3645376abb
    - SHA1:c4ff126b151f5150d6a8464bc6ed3c768627a197 [weak]
    - MD5Sum:a49f46a85febb275346c51ba0aa8c110 [weak]
   Release file created at: Fri, 23 May 2025 06:48:41 +0000
E: Some index files failed to download. They have been ignored, or old ones used instead.
What happened here? Again, we need APT debugging options to have a hint:
$ sudo chroot kali-dev apt -q -o Debug::Acquire::http=true update 2>&1 | grep -e ^Answer -e ^HTTP
Answer for: http://http.kali.org/kali/dists/kali-dev/InRelease
HTTP/1.1 304 Not Modified
Answer for: http://http.kali.org/kali/dists/kali-dev/contrib/binary-amd64/Packages.gz
HTTP/1.1 302 Found
Answer for: http://http.kali.org/kali/dists/kali-dev/non-free/binary-amd64/Packages.gz
HTTP/1.1 302 Found
Answer for: http://http.kali.org/kali/dists/kali-dev/non-free-firmware/binary-amd64/Packages.gz
HTTP/1.1 302 Found
Answer for: http://kali.download/kali/dists/kali-dev/contrib/binary-amd64/Packages.gz
HTTP/1.1 200 OK
Answer for: http://mirror.sg.gs/kali/dists/kali-dev/non-free-firmware/binary-amd64/Packages.gz
HTTP/1.1 200 OK
Answer for: http://mirror.freedif.org/kali/dists/kali-dev/non-free/binary-amd64/Packages.gz
HTTP/1.1 200 OK
As we can see above, for the Release file we get a 304 (aka. "Not Modified") from the redirector. Why is that? This is due to If-Modified-Since, also known from RFC 7232. APT supports this feature when it retrieves the Release file: it basically says to the server "Give me the Release file, but only if it's newer than what I already have". If the file on the server is not newer than that, it answers with a 304, which basically tells the client "You already have the latest version". So APT doesn't get a new Release file; it uses the Release file that is already present locally in /var/lib/apt/lists/, and then it proceeds to download the missing Index files. And as we can see above: it then hits the redirector for each request, and might be redirected to a different mirror for each Index file.

So the important bit here is: the APT "trick" of downloading all the Index files from the same mirror only works if the Release file is served via a redirect. If it's not, like in this case, then APT hits the redirector for each file it needs to download, and it's subject to the "Hash Sum Mismatch" error again.

In practice, for the casual user running apt update every now and then, it's not an issue. If they have the latest Release file, no extra requests are done, because they also have the latest Index files, from a previous apt update transaction. So APT doesn't re-download those Index files. The only reason why they'd have the latest Release file but would miss some Index files is that they added new components to their APT sources, like we just did above. Not so common, and then they'd need to run apt update at an unlucky moment. I don't think many users are affected in practice.

Note that this issue is rather new for Kali Linux. The redirector running on http.kali.org is mirrorbits, and support for If-Modified-Since just landed in the latest release, version 0.6. This feature was added by none other than me, a great example of the expression "shooting oneself in the foot".

An obvious workaround here is to empty /var/lib/apt/lists/ in the chroot after debootstrap has completed. Or we could disable support for If-Modified-Since entirely for Kali's instance of mirrorbits.

Summary and Conclusion

The Hash Sum Mismatch failures above are caused by a combination of things:

At the same time:

All in all, it seems that all those issues would go away if only Acquire-By-Hash was supported in the Kali package repository. Now is not a bad moment to try to land this feature in reprepro. After development halted in 2019, there's now a new upstream, and patches are being merged again. But it won't be easy: reprepro is a C codebase of around 50k lines of code, and it will take time and effort for a newcomer to get acquainted with the codebase, to the point of being able to implement a significant feature like this one.

As an alternative, aptly is another popular tool to manage APT package repositories, and it seems to support Acquire-By-Hash already. Another alternative: I was told that debusine has (experimental) support for package repositories, and that Acquire-By-Hash is supported as well. Options are on the table, and I hope that Kali will eventually get support for Acquire-By-Hash, one way or another.

To finish, due credits: this blog post exists thanks to my employer OffSec. Thanks for reading!

15 July 2025

Valhalla's Things: Federated instant messaging, 100% debianized

Posted on July 15, 2025
Tags: madeof:bits, topic:xmpp, topic:debian
This is an approximation of what I told at my talk Federated instant messaging, 100% debianized at DebConf 25, for people who prefer reading text. There will also be a video recording at the link above, as soon as it's ready :)

Communicating is a basic human need, and today some kind of computer-mediated communication is a requirement for most people, especially those in this room. With everything that is happening, it's now more important than ever that these means of communication aren't controlled by entities that can't be trusted, whether because they can stop providing the service at any given time or, worse, because they are going to abuse it in order to extract more profit.

If only there was a well-established chat system based on some standard developed in an open way, with all of the features one expects from a chat system, but federated, so that one can choose between many different and independent providers, or even self-hosting. But wait, it does exist! I'm not talking about IRC, I'm talking about XMPP! While it has been around since the last millennium, it has not remained still, with hundreds of XMPP Extension Protocols, or XEPs, that have been developed to add all of the features that nobody in 1999 imagined we could need in Instant Messaging today, and more, such as IoT devices or even social networks.

There is a myth that this makes XMPP a mess of incompatible software, but there is an XEP for that: XEP-0479: XMPP Compliance Suites 2023, which is a list of XEPs that need to be supported by Instant Messaging servers and clients, including mobile ones, and all of the recommended ones will mostly just work. These include conversations.im on android, dino on linux, which also works pretty nicely on linux phones, gajim for a more fully featured option that includes the kitchen sink, profanity for text interface fanatics like me, and I've heard that monal works decently enough on the iThings.

One thing that sets XMPP apart from other federated protocols is that it has already gone through the phase where everybody was on one very big server, which then cut out federation, and we've learned from the experience. These days there are still a few places that cater to newcomers, like https://account.conversations.im/, https://snikket.org/ (which also includes tools to make it easier to host your own instance) and https://quicksy.im/, but most people are actually on servers of a manageable size.

My strong recommendation is for community hosting: not just self-hosting for yourself, but finding a community you feel part of and trust, and sharing a server with them, whether managed by volunteers from the community itself, or by a paid provider. If you are a Debian Developer, you already have one: you can go to https://db.debian.org/, select "Change rtc password" to set your own password, wait an hour or so, and you're good to go, as described at the bottom of https://wiki.debian.org/Teams/DebianSocial. A few years ago it had remained a bit behind, but these days it's managed by an active team, and if you're missing some features, or just want to know what's happening with it, you can join their BoF on Friday afternoon (and also thank them for their work).
But for most people in this room, I'd also recommend finding a friend or two who can help as a backup, and running a server for your own family or community: as a certified lazy person who doesn't like doing sysadmin jobs, I can guarantee it's perfectly feasible, roughly in the same range of difficulty as running your own web server for a static site. The two most popular servers for this, prosody and ejabberd, are well maintained in Debian, and these days there isn't a lot more to do than installing them, telling them your hostname, and setting up a few DNS entries; then you mostly need to keep the machine updated and very little else. After that, it's just applying system security updates, upgrading everything every couple of years (some configuration updates may be needed, but nothing major) and maybe helping some non-technical users, if you are hosting your non-technical friends (the kind who would need support on any other platform).
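To give an idea of how little configuration is involved, here is a rough sketch for prosody with a made-up domain example.org (the domain, hostname and records are placeholders, and certificates and account creation are left out; this is an illustration, not a complete setup):
$ sudo apt install prosody
$ echo 'VirtualHost "example.org"' | sudo tee /etc/prosody/conf.d/example.org.cfg.lua
$ sudo systemctl restart prosody
$ dig +short SRV _xmpp-client._tcp.example.org
0 5 5222 xmpp.example.org.
$ dig +short SRV _xmpp-server._tcp.example.org
0 5 5269 xmpp.example.org.
The two SRV records are what "setting up a few DNS entries" mostly amounts to: they point clients (port 5222) and other servers (port 5269) at the machine running prosody.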
Question time (including IRC questions) included which server would be recommended for very few users (I use prosody and I'm very happy with it, but I believe ejabberd works just fine too), then somebody reminded me that I had forgotten to mention https://www.chatons.org/, which lists free, ethical and decentralized services, including xmpp ones. I was also asked for a comparison with matrix, which does cover a very similar target as XMPP, but I am quite biased against it, and I'd rather speak well of my favourite platform than badly of its competitor.

12 July 2025

Christian Kastner: Easy dynamic dispatch using GLIBC Hardware Capabilities

TL;DR With GLIBC 2.33+, you can build a shared library multiple times targeting various optimization levels, and the dynamic linker/loader will pick the highest version supported by the current CPU. For example, with the layout below, on a Ryzen 9 5900X, x86-64-v3/libfoo0.so would be loaded:
/usr/lib/glibc-hwcaps/x86-64-v4/libfoo0.so
/usr/lib/glibc-hwcaps/x86-64-v3/libfoo0.so
/usr/lib/glibc-hwcaps/x86-64-v2/libfoo0.so
/usr/lib/libfoo0.so
Longer Version

GLIBC Hardware Capabilities, or "hwcaps", are an easy, almost trivial way to add a simple form of dynamic dispatch to any amd64 or POWER build, provided that either the build target or the compiler's optimizations can make use of certain CPU extensions. Mo Zhou pointed me towards this when I was faced with the challenge of creating a performant Debian package for ggml, the tensor library behind llama.cpp and whisper.cpp.
The Challenge

A performant yet universally loadable library needs to make use of some form of dynamic dispatch to leverage the most effective SIMD extensions available on any given CPU it may run on. Last January, when I first started with the packaging of ggml for Debian, ggml did have support for this through its GGML_CPU_ALL_VARIANTS=ON option, but this was limited to amd64. This meant that on all the other architectures that Debian supports, I would need to target some ancient baseline, thus effectively crippling the package there.
Dynamic Dispatch using hwcaps

hwcaps were introduced in GLIBC 2.33 and replace the (now) Legacy Hardware Capabilities, which were removed in 2.37. The way hwcaps work is delightfully simple: the dynamic linker/loader will look for a shared library not just in the standard library paths, but also in subdirectories thereof of the form glibc-hwcaps/<level>, starting with the highest <level> that the current CPU supports. The levels are predefined; I'm using the amd64 levels below. For ggml, this meant that I could simply build the library in multiple passes, each time targeting a different <level>, and install the result in the corresponding subdirectory, which resulted in the following layout (reduced to libggml.so for brevity):
/usr/lib/x86_64-linux-gnu/ggml/glibc-hwcaps/x86-64-v4/libggml.so
/usr/lib/x86_64-linux-gnu/ggml/glibc-hwcaps/x86-64-v3/libggml.so
/usr/lib/x86_64-linux-gnu/ggml/glibc-hwcaps/x86-64-v2/libggml.so
/usr/lib/x86_64-linux-gnu/ggml/libggml.so
In practice, this means that on a CPU supporting AVX512, the linker/loader would load x86-64-v4/libggml.so if it existed, and otherwise continue to look for the other levels, all the way down to the lowest one. On a CPU which supported only SSE4.2, the lookup process would be the same, ending with picking x86-64-v2/libggml.so. With QEMU, all of this was quickly verified. Note that the lowest-level library, targeting x86-64-v1, is not installed to a subdirectory, but to the path where the library would normally have been installed. This has the nice property that on systems not using GLIBC, and thus not having hwcaps available, package installation will still result in a loadable library, albeit the version with the worst performance. And a careful observer might have noticed that in the example above, the library is installed to a private ggml/ directory, so this mechanism also works when using RUNPATH or LD_LIBRARY_PATH. As mentioned above, Debian's ggml package will soon switch to GGML_CPU_ALL_VARIANTS=ON, but this was still quite the useful feature to discover.
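To make the multi-pass idea concrete, here is a rough sketch for a hypothetical CMake-based libfoo; the project name, paths and flags are assumptions for illustration, not the actual ggml packaging:
$ for level in x86-64-v2 x86-64-v3 x86-64-v4; do
>   cmake -S . -B build-$level -DCMAKE_C_FLAGS="-march=$level" -DCMAKE_CXX_FLAGS="-march=$level"
>   cmake --build build-$level
>   install -D build-$level/libfoo.so.0 \
>     debian/libfoo0/usr/lib/x86_64-linux-gnu/glibc-hwcaps/$level/libfoo.so.0
> done
$ # the baseline (x86-64-v1) build goes to the regular library path
$ cmake -S . -B build-baseline && cmake --build build-baseline
$ install -D build-baseline/libfoo.so.0 debian/libfoo0/usr/lib/x86_64-linux-gnu/libfoo.so.0
The dynamic linker/loader then does the rest at run time, picking the highest level the CPU supports, as described above.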

Reproducible Builds: Reproducible Builds in June 2025

Welcome to the 6th report from the Reproducible Builds project in 2025. Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. If you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website. In this report:
  1. Reproducible Builds at FOSSY 2025
  2. Distribution work
  3. diffoscope
  4. OSS Rebuild updates
  5. Website updates
  6. Upstream patches
  7. Reproducibility testing framework

Reproducible Builds at FOSSY 2025

On Saturday 2nd August, Vagrant Cascadian and Chris Lamb will be presenting at this year's FOSSY 2025. Their talk, titled "Never Mind the Checkboxes, Here's Reproducible Builds!", is being introduced as follows:
There are numerous policy compliance and regulatory processes being developed that target software development, but do they solve actual problems? Does it improve the quality of software? Do Software Bill of Materials (SBOMs) actually give you the information necessary to verify how a given software artifact was built? What is the goal of all these compliance checklists anyways, or more importantly, what should the goals be? If a software object is signed, who should be trusted to sign it, and can they be trusted forever?
The talk will introduce the audience to Reproducible Builds as a set of best practices which allow users and developers to verify that software artifacts were built from the source code, but also allow auditing for license compliance, provide security benefits, and remove the need to trust arbitrary software vendors. Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, USA, FOSSY aims to be a community-focused event: "Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you". More information on the event is available on the FOSSY 2025 website, including the full programme schedule. Vagrant and Chris will also be staffing a table this year, where they will be available to answer any questions about Reproducible Builds and discuss collaborations with other projects.

Distribution work

In Debian this month:
  • Holger Levsen has discovered that it is now possible to bootstrap a minimal Debian trixie using 100% reproducible packages. This result can itself be reproduced, using the debian-repro-status tool and mmdebstrap's support for hooks:
      $ mmdebstrap --variant=apt --include=debian-repro-status \
           --chrooted-customize-hook=debian-repro-status \
           trixie /dev/null 2>&1 | grep "Your system has"
       INFO  debian-repro-status > Your system has 100.00% been reproduced.
    
  • On our mailing list this month, Helmut Grohne wrote an extensive message raising an issue related to Uploads with conflicting buildinfo filenames:
    Having several .buildinfo files for the same architecture is something that we plausibly want to have eventually. Imagine running two sets of buildds and assembling a single upload containing buildinfo files from both buildds in the same upload. In a similar vein, as a developer I may want to supply several .buildinfo files with my source upload (e.g. for multiple architectures). Doing any of this is incompatible with current incoming processing and with reprepro.
  • 5 reviews of Debian packages were added, 4 were updated and 8 were removed this month adding to our ever-growing knowledge about identified issues.

In GNU Guix, Timothee Mathieu reported that a long-standing issue with reproducibility of shell containers across different host operating systems has been solved. In their message, Timothee mentions:
I discovered that pytorch (and maybe other dependencies) has a reproducibility problem of order 1e-5 when on AVX512 compared to AVX2. I first tried to solve the problem by disabling AVX512 at the level of pytorch, but it did not work. The dev of pytorch said that it may be because some components dispatch computation to MKL-DNN, I tried to disable AVX512 on MKL, and still the results were not reproducible, I also tried to deactivate in openmpi without success. I finally concluded that there was a problem with AVX512 somewhere in the dependencies graph but I gave up identifying where, as this seems very complicated.

The IzzyOnDroid Android APK repository made more progress in June. Not only have they just passed 48% reproducibility coverage, but Ben also started making their reproducible builds more visible, by offering rbtlog shields, a kind of badge that has been quickly picked up by many developers who are proud to present their applications' reproducibility status.
Lastly, in openSUSE news, Bernhard M. Wiedemann posted another monthly update for their work there.

diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 298, 299 and 300 to Debian:
  • Add python3-defusedxml to the Build-Depends in order to include it in the Docker image. [ ]
  • Handle the RPM format's HEADERSIGNATURES and HEADERIMMUTABLE as a special-case to avoid unnecessarily large diffs. Thanks to Daniel Duan for the report and suggestion. [ ][ ]
  • Update copyright years. [ ]
In addition, @puer-robustus fixed a regression introduced in an earlier commit which resulted in some differences being lost. [ ][ ] Lastly, Vagrant Cascadian updated diffoscope in GNU Guix to version 299 [ ][ ] and 300 [ ][ ].

OSS Rebuild updates

OSS Rebuild has added a new network analyzer that provides transparent HTTP(S) interception during builds, capturing all network traffic to monitor external dependencies and identify suspicious behavior, even in unmodified maintainer-controlled build processes. The text-based user interface now features automated failure clustering that can group similar rebuild failures and provides natural language failure summaries, making it easier to identify and understand patterns across large numbers of build failures. OSS Rebuild has also improved the local development experience with a unified interface for build execution strategies, allowing for more extensible environment setup for build execution. The team also designed a new website and logo.

Website updates

Once again, there were a number of improvements made to our website this month, including:
  • Arnaud Brousseau added Stage, a new Linux distribution, to our Tools page.
  • Chris Lamb improved the docker instructions on the diffoscope website. [ ]


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In June, however, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Installed and deployed rebuilderd version 0.24 from Debian unstable in order to make use of the new compression feature added by Jarl Gullberg for the database. This resulted in a massive decrease in the size of the SQLite databases:
      • 79G → 2.8G (all)
      • 84G → 3.2G (amd64)
      • 75G → 2.9G (arm64)
      • 45G → 2.1G (armel)
      • 48G → 2.2G (armhf)
      • 73G → 2.8G (i386)
      • 72G → 2.7G (ppc64el)
      • 45G → 2.1G (riscv64)
      for a combined saving from 521G to 20.8G. This naturally reduces the requirements to run an independent rebuilderd instance and will permit us to add more Debian suites as well.
    • During migration to the latest version of rebuilderd, make sure several services are not started. [ ]
    • Actually run rebuilderd from /usr/bin. [ ]
    • Raise temperatures for NVME devices on some riscv64 nodes that should be ignored. [ ][ ]
    • Use a 64KB kernel page size on the ppc64el architecture (see #1106757). [ ]
    • Improve ordering of some "failed to reproduce" statistics. [ ]
    • Detect a number of potential causes of build failures within the statistics. [ ][ ]
    • Add support for manually scheduling for the "any" architecture. [ ]
  • Misc:
    • Update the Codethink nodes as there are now many kernels installed. [ ][ ]
    • Install linux-sysctl-defaults on Debian trixie systems as we need ping functionality. [ ]
    • Limit the fs.nr_open kernel tunable. [ ]
    • Stop submitting results to deprecated buildinfo.debian.net service. [ ][ ]
In addition, Jochen Sprickerhof greatly improved the statistics and the logging functionality, including adapting to the new database format of rebuilderd version 0.24.0 [ ] and temporarily increasing the maximum log size in order to debug a nettlesome build [ ]. Jochen also dropped the CPUSchedulingPolicy=idle systemd flag on the workers. [ ]

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 July 2025

Bits from Debian: Bits from the DPL

Dear Debian community,

This is bits from the DPL for June.

The Challenge of Mentoring Newcomers

In June there was an extended discussion about the ongoing challenges around mentoring newcomers in Debian. As many of you know, this is a topic I've cared about deeply--long before becoming DPL. In my view, the issue isn't just a matter of lacking tools or needing to try harder to attract contributors. Anyone who followed the discussion will likely agree that it's more complex than that.

I sometimes wonder whether Debian's success contributes to the problem. From the outside, things may appear to "just work", which can lead to the impression: "Debian is doing fine without me--they clearly have everything under control." But that overlooks how much volunteer effort it takes to keep the project running smoothly.

We should make it clearer that help is always needed--not only in packaging, but also in writing technical documentation, designing web pages, reaching out to upstreams about license issues, finding sponsors, or organising events. (Speaking from experience, I would have appreciated help in patiently explaining Free Software benefits to upstream authors.) Sometimes we think too narrowly about what newcomers can do, and also about which tasks could be offloaded from overcommitted contributors. In fact, one of the most valuable things a newcomer can contribute is better documentation. Those of us who've been around for years may be too used to how things work--or make assumptions about what others already know. A person who just joined the project is often in the best position to document what's confusing, what's missing, and what they wish they had known sooner. In that sense, the recent "random new contributor's experience" posts might be a useful starting point for further reflection. I think we can learn a lot from positive user stories, like this recent experience of a newcomer adopting the courier package. I'm absolutely convinced that those who just found their way into Debian have valuable perspectives--and that we stand to learn the most from listening to them.

We should also take seriously what Russ Allbery noted in the discussion: "This says bad things about the project's sustainability and I think everyone knows that." Volunteers move on--that's normal and expected. But it makes it all the more important that we put effort into keeping Debian's contributor base at least stable, if not growing.

Project-wide LLM budget for helping people

Lucas Nussbaum has volunteered to handle the paperwork and submit a request on Debian's behalf to LLM providers, aiming to secure project-wide access for Debian Developers. If successful, every DD will be free to use this access--or not--according to their own preferences.

Kind regards,
Andreas.

4 July 2025

Sahil Dhiman: Secondary Authoritative Name Server Options for Self-Hosted Domains

In the past few months, I have moved the authoritative name servers (NS) of two of my domains (sahilister.net and sahil.rocks) in house using PowerDNS. Subdomains of sahilister.net see roughly 320,000 hits/day across my IN and DE mirror nodes, so adding secondary name servers with good availability (in addition to my own) was one of my first priorities. I explored the following options for my secondary NS, which also didn't cost me anything:

1984 Hosting

Hurricane Electric

Afraid.org

Puck

NS-Global

Asking friends

Two of my friends and fellow mirror hosts have their own authoritative name server setups: Shrirang (i.e. albony) and Luke. Shrirang gave me another POP in IN, and through Luke (who does have an insane amount of in-house NS, see dig ns jing.rocks +short) I added a JP POP. If we know each other, I would be glad to host a secondary NS for you (in IN and/or DE locations).
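For what it's worth, the primary side of such an arrangement doesn't need much: besides the NS records in the zone itself, it mostly boils down to allowing zone transfers to, and notifying, the secondaries. A rough sketch of the relevant bits of a PowerDNS pdns.conf (the addresses are placeholders, not my actual secondaries):
$ grep -E '^(primary|allow-axfr-ips|also-notify)' /etc/powerdns/pdns.conf
primary=yes
allow-axfr-ips=192.0.2.53,2001:db8::53
also-notify=192.0.2.53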

Some notes
  • Adding a third-party secondary is putting trust that the third party would serve your zone right.
  • Hurricane Electric and 1984 Hosting provide multiple NS. One can use some or all of them. Ideally, you can get away with just your own NS plus the full set from either of these two. Play around with adding and removing secondaries to see what gives you the best results. Using everyone is overkill anyway, unless you have specific reasons for it.
  • Moving NS in-house isn't that hard. Though, be prepared to get it wrong a few times (and some more). I have already faced partial outages because:
    • Recursive resolvers (RR) in the wild behave in a weird way and cache the wrong NS response for longer than the TTL.
    • NS expiry took longer than expected. 2 out of 3 of Netim's (my domain registrar) NS had stopped serving my domain, while RRs in the wild hadn't picked up my new in-house NS. I couldn't really do anything about it, though.
    • The dot at the end is pretty important.
    • With HE.net, I forgot to delegate my domain on their panel and just added it to my NS set, thinking I'd already done so (which I had, but for another domain), leading to a lame server situation.
  • In terms of serving traffic, there's no distinction between primary and secondary NS. RRs don't really care which server they send the query to. So one can have a hidden primary too.
  • I initially thought of adding periodic RIPE Atlas measurements from the global set but decided against it, as I already host a termux mirror, which brings in thousands of queries from around the world, leading to a diverse set of RRs querying my domain already.
  • In most cases, query resolution time will increase with out-of-zone NS servers (which external secondaries most likely are): 1 query vs. 2 queries. Pay close attention to the ADDITIONAL SECTION in Shrirang's case, followed by mine:
$ dig ns albony.in
; <<>> DiG 9.18.36 <<>> ns albony.in
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60525
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 9
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;albony.in.			IN	NS
;; ANSWER SECTION:
albony.in.		1049	IN	NS	ns3.albony.in.
albony.in.		1049	IN	NS	ns4.albony.in.
albony.in.		1049	IN	NS	ns2.albony.in.
albony.in.		1049	IN	NS	ns1.albony.in.
;; ADDITIONAL SECTION:
ns3.albony.in.		1049	IN	AAAA	2a14:3f87:f002:7::a
ns1.albony.in.		1049	IN	A	82.180.145.196
ns2.albony.in.		1049	IN	AAAA	2403:44c0:1:4::2
ns4.albony.in.		1049	IN	A	45.64.190.62
ns2.albony.in.		1049	IN	A	103.77.111.150
ns1.albony.in.		1049	IN	AAAA	2400:d321:2191:8363::1
ns3.albony.in.		1049	IN	A	45.90.187.14
ns4.albony.in.		1049	IN	AAAA	2402:c4c0:1:10::2
;; Query time: 29 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Fri Jul 04 07:57:01 IST 2025
;; MSG SIZE  rcvd: 286
vs mine
$ dig ns sahil.rocks
; <<>> DiG 9.18.36 <<>> ns sahil.rocks
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64497
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;sahil.rocks.			IN	NS
;; ANSWER SECTION:
sahil.rocks.		6385	IN	NS	ns5.he.net.
sahil.rocks.		6385	IN	NS	puck.nether.net.
sahil.rocks.		6385	IN	NS	colin.sahilister.net.
sahil.rocks.		6385	IN	NS	marvin.sahilister.net.
sahil.rocks.		6385	IN	NS	ns2.afraid.org.
sahil.rocks.		6385	IN	NS	ns4.he.net.
sahil.rocks.		6385	IN	NS	ns2.albony.in.
sahil.rocks.		6385	IN	NS	ns3.jing.rocks.
sahil.rocks.		6385	IN	NS	ns0.1984.is.
sahil.rocks.		6385	IN	NS	ns1.1984.is.
sahil.rocks.		6385	IN	NS	ns-global.kjsl.com.
;; Query time: 24 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Fri Jul 04 07:57:20 IST 2025
;; MSG SIZE  rcvd: 313
  • Theoretically speaking, a small increase/decrease in resolution would occur based on the chosen TLD and the popularity of the TLD in the query originator's area (already cached vs. fresh recursion).
  • One can get away with having only 3 NS (or be like Google and have 4 anycast NS, or like Amazon and have 8, or like Verisign and make it 13 :P).
  • Nowhere is it written that your NS needs to be called dns* or ns1, ns2, etc. Get creative with naming your NS; be deceptive with the naming :D.
  • A good understanding of RR behavior can help engineer a good authoritative NS system.
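As an aside, a quick way to spot a secondary that has stopped syncing is to compare SOA serials across all listed NS. A rough sketch, using my domain as the example:
$ for ns in $(dig +short NS sahil.rocks); do
>   printf '%s ' "$ns"; dig +short SOA sahil.rocks @"$ns" | awk '{print $3}'
> done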

Further reading

3 July 2025

Sergio Cipriano: Disable sleep on lid close

Disable sleep on lid close

I am using an old laptop in my homelab, but I want to do everything from my personal computer, with ssh. The default behavior in Debian is to suspend when the laptop lid is closed, but it's easy to change that: just edit
/etc/systemd/logind.conf
and change the line
#HandleLidSwitch=suspend
to
HandleLidSwitch=ignore
then
$ sudo systemctl restart systemd-logind
That's it.
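Alternatively, the same override can live in a drop-in file, which leaves the stock logind.conf untouched (a sketch; the file name is arbitrary):
$ sudo mkdir -p /etc/systemd/logind.conf.d
$ printf '[Login]\nHandleLidSwitch=ignore\n' | sudo tee /etc/systemd/logind.conf.d/lid.conf
$ sudo systemctl restart systemd-logind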

2 July 2025

Dirk Eddelbuettel: RcppArmadillo 14.6.0-1 on CRAN: New Upstream Minor Release

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1241 other packages on CRAN, downloaded 40.4 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 634 times according to Google Scholar. Conrad released minor version 14.6.0 yesterday, which offers new accessors for non-finite values. And despite being in Beautiful British Columbia on vacation, I had wrapped up two rounds of reverse dependency checks preparing his 14.6.0 release, and shipped this to CRAN this morning where it passed with flying colours and no human intervention, even with over 1200 reverse dependencies. The changes since the last CRAN release are summarised below.

Changes in RcppArmadillo version 14.6.0-1 (2025-07-02)
  • Upgraded to Armadillo release 14.6.0 (Caffe Mocha)
    • Added balance() to transform matrices so that column and row norms are roughly the same
    • Added omit_nan() and omit_nonfinite() to extract elements while omitting NaN and non-finite values
    • Added find_nonnan() for finding indices of non-NaN elements
    • Added standalone replace() function
  • The fastLm() help page now mentions that options to solve() can control its behavior.

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Dirk Eddelbuettel: Rcpp 1.1.0 on CRAN: C++11 now Minimum, Regular Semi-Annual Update

With a friendly Canadian hand wave from vacation in Beautiful British Columbia, and speaking on behalf of the Rcpp Core Team, I am excited to share that the (regularly scheduled bi-annual) update to Rcpp just brought version 1.1.0 to CRAN. Debian builds have been prepared and uploaded, Windows and macOS builds should appear at CRAN in the next few days, as will builds in different Linux distributions, and of course r2u should catch up tomorrow as well.

The key highlight of this release is the switch to C++11 as minimum standard. R itself did so in release 4.0.0 more than half a decade ago; if someone is really tied to an older version of R and an equally old compiler then using an older Rcpp with it has to be acceptable. Our own tests (using continuous integration at GitHub) still go back all the way to R 3.5.* and work fine (with a new-enough compiler). In the previous release post, we commented that we had only one reverse dependency (falsely) come up in the tests by CRAN; this time there was none among the well over 3000 packages using Rcpp at CRAN. Which really is quite amazing, and possibly also a testament to our rigorous continued testing of our development and snapshot releases on the key branch.

This release continues with the six-months January-July cycle started with release 1.0.5 in July 2020. As just mentioned, we do of course make interim snapshot dev or rc releases available. While we no longer regularly update the Rcpp drat repo, the r-universe page and repo now really fill this role admirably (and with many more builds besides just source). We continue to strongly encourage their use and testing; I run my systems with these versions, which tend to work just as well, and are of course also fully tested against all reverse dependencies.

Rcpp has long established itself as the most popular way of enhancing R with C or C++ code. Right now, 3038 packages on CRAN depend on Rcpp for making analytical code go faster and further. On CRAN, 13.6% of all packages depend (directly) on Rcpp, and 61.3% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 100.8 million times. The two published papers (also included in the package as preprint vignettes) have, respectively, 2023 (JSS, 2011) and 380 (TAS, 2018) citations, while the book (Springer useR!, 2013) has another 695.

As mentioned, this release switches to C++11 as the minimum standard. The diffstat display in the CRANberries comparison to the previous release shows how several (generated) source files with C++98 boilerplate have now been removed; we also flattened a number of if/else sections we no longer need to cater to older compilers (see below for details). We also managed more accommodation for the demands of tighter use of the C API of R by removing DATAPTR and CLOENV use. A number of other changes are detailed below. The full list below details all changes, their respective PRs and, if applicable, issue tickets. Big thanks from all of us to all contributors!

Changes in Rcpp release version 1.1.0 (2025-07-01)
  • Changes in Rcpp API:
    • C++11 is now the required minimal C++ standard
    • The std::string_view type is now covered by wrap() (Lev Kandel in #1356 as discussed in #1357)
    • A last remaining DATAPTR use has been converted to DATAPTR_RO (Dirk in #1359)
    • Under R 4.5.0 or later, R_ClosureEnv is used instead of CLOENV (Dirk in #1361 fixing #1360)
    • Use of lsInternal switched to lsInternal3 (Dirk in #1362)
    • Removed compiler detection macro in a header cleanup setting C++11 as the minimum (Dirk in #1364 closing #1363)
    • Variadic templates are now used unconditionally given C++11 (Dirk in #1367 closing #1366)
    • Remove RCPP_USING_CXX11 as a #define as C++11 is now a given (Dirk in #1369)
    • Additional cleanup for __cplusplus checks (Iñaki in #1371 fixing #1370)
    • Unordered set construction no longer needs a macro for the pre-C++11 case (Iñaki in #1372)
    • Lambdas are supported in Rcpp Sugar functions (Iñaki in #1373)
    • The Date(time)Vector classes now have default ctor (Dirk in #1385 closing #1384)
    • Fixed an issue where Rcpp::Language would duplicate its arguments (Kevin in #1388, fixing #1386)
  • Changes in Rcpp Attributes:
    • The C++26 standard now has plugin support (Dirk in #1381 closing #1380)
  • Changes in Rcpp Documentation:
    • Several typos were corrected in the NEWS file (Ben Bolker in #1354)
    • The Rcpp Libraries vignette mentions PACKAGE_types.h to declare types used in RcppExports.cpp (Dirk in #1355)
    • The vignettes bibliography file was updated to current package versions, and now uses doi references (Dirk in #1389)
  • Changes in Rcpp Deployment:
    • Rcpp.package.skeleton() creates URL and BugReports if given a GitHub username (Dirk in #1358)
    • R 4.4.* has been added to the CI matrix (Dirk in #1376)
    • Tests involving NA propagation are skipped under linux-arm64 as they are under macos-arm (Dirk in #1379 closing #1378)

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc. should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues).

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

1 July 2025

Guido Günther: Free Software Activities June 2025

Another short status update of what happened on my side last month. Phosh 0.48.0 is out with nice improvements, phosh.mobi e.V. is alive, helped a bit to get cellbroadcastd out, osk bugfixes and some more. See below for details on the above and more: phosh, phoc, phosh-mobile-settings, stevia (formerly phosh-osk-stub), phosh-osk-data, phosh-vala-plugins, pfs, xdg-desktop-portal-phosh, xdg-desktop-portal, phosh-debs, meta-phosh, feedbackd, feedbackd-device-themes, gmobile, GNOME clocks, Debian, Mobian, ModemManager, libmbim, libqmi, mobile-broadband-provider-info, Cellbroadcastd, iio-sensor-proxy, twenty-twenty-hugo, gotosocial.

Reviews: This is not code by me but reviews of other people's code. The list is (as usual) slightly incomplete. Thanks for the contributions!

Help Development: If you want to support my work see donations.

Comments? Join the Fediverse thread

30 June 2025

Otto Kekäläinen: Corporate best practices for upstream open source contributions

This post is based on a presentation given at the Validos annual members meeting on June 25th, 2025.
When I started getting into Linux and open source over 25 years ago, the majority of the software development in this area was done by academics and hobbyists. The number of companies participating in open source has since exploded in parallel with the growth of mobile and cloud software, the majority of which is built on top of open source. For example, Android powers most mobile phones today and is based on Linux. Almost all software used to operate large cloud provider data centers, such as AWS or Google, is either open source or made in-house by the cloud provider. Pretty much all companies, regardless of the industry, have been using open source software at least to some extent for years. However, the degree to which they collaborate with the upstream origins of the software varies. I encourage all companies in a technical industry to start contributing upstream. There are many benefits to having a good relationship with your upstream open source software vendors, both for the short term and especially for the long term. Moreover, with the rollout of the CRA in the EU in 2025-2027, the law will require software companies to contribute security fixes upstream to the open source projects their products use. To ensure the process is well managed, business-aligned and legally compliant, there are a few dos and don'ts that are important to be aware of.

Maintain your SBOMs

For every piece of software, regardless of whether the code was developed in-house, comes from an open source project, or is a combination of these, every company needs to produce a Software Bill of Materials (SBOM). The SBOMs provide a standardized and interoperable way to track what software and which versions are used where, what software licenses apply, who holds the copyright of which component, which security fixes have been applied and so forth. A catalog of SBOMs, or equivalent, forms the backbone of software supply-chain management in corporations.

Identify your strategic upstream vendors

The SBOMs are likely to reveal that for any piece of non-trivial software, there are hundreds or thousands of upstream open source projects in use. Few organizations have resources to contribute to all of their upstreams. If your organization is just starting to organize upstream contribution activities, identify the key projects that have the largest impact on your business and prioritize forming a relationship with them first. Organizations with a mature contribution process will be collaborating with tens or hundreds of upstreams.

Appoint an internal coordinator and champions

Having a written policy on how to contribute upstream will help ensure a consistent process and avoid common pitfalls. However, a written policy alone does not automatically translate into a well-running process. It is highly recommended to appoint at least one internal coordinator who is knowledgeable about how open source communities work, how software licensing and patents work, and is senior enough to have a good sense of what business priorities to optimize for. In small organizations it can be a single person, while larger organizations typically have a full Open Source Programs Office. This coordinator should oversee the contribution process, track all contributions made across the organization, and further optimize the process by working with stakeholders across the business, including legal experts, business owners and CTOs. The marketing and recruiting folks should also be involved, as upstream contributions will have a reputation-building aspect as well, which can be enhanced with systematic tracking and publishing of activities. Additionally, at least in the beginning, the organization should also appoint key staff members as open source champions. Implementing a new process always includes some obstacles and occasional setbacks, which may discourage employees from putting in the extra effort to reap the full long-term benefits for the company. Having named champions will empower them to make the first few contributions themselves, setting a good example and encouraging and mentoring others to contribute upstream as well.

Avoid excessive approvals To maintain a high quality bar, it is always good to have all outgoing submissions reviewed by at least one or two people. Two or three pairs of eyeballs are significantly more likely to catch issues that might slip by someone working alone. The review also slows down the process by a day or two, which gives the author time to "sleep on it", which usually helps ensure the final submission is well thought out. Do not require more than one or two reviewers. The marginal utility goes quickly to zero beyond a few reviewers, and at around four or five people the effect becomes negative, as the weight of each approval decreases and the reviewers begin to take less personal responsibility. Having too many people in the loop also makes each feedback round slow and expensive, to the extent that the author will hesitate to make updates and ask for re-reviews due to the costs involved. If the organization experiences setbacks due to mistakes slipping through the review process, do not respond by adding more reviewers, as that will just grind the contribution process to a halt. If there are quality concerns, invest in training for engineers, CI systems and perhaps an internal certification program for those making public upstream code submissions. A typical software engineer is more likely to put serious effort into becoming proficient, pass a one-off certification exam and then make multiple high-quality contributions, than a less experienced engineer is to improve, or even to keep wanting to contribute upstream at all, when every submission is weighed down by a heavy review process.

Don't expect upstream to accept all code contributions Sure, identifying the root cause of and fixing a tricky bug or writing a new feature requires significant effort. While an open source project will certainly appreciate the effort invested, that doesn't mean it will always welcome all contributions with open arms. Occasionally, the project won't agree that the code is correct or the feature is useful, and some contributions are bound to be rejected. You can minimize the chance of experiencing rejections by having a solid internal review process that includes assessing how the upstream community is likely to receive the proposal. Sometimes how things are communicated is more important than how they are coded. Polishing inline comments and git commit messages helps ensure high-quality communication, along with a commitment to respond quickly to review feedback and to follow up regularly until a contribution is finalized and accepted.

Start small to grow expertise and reputation In addition to keeping the open source contribution policy lean and nimble, it is also good to start practical contributions with small issues. Don't aim to contribute massive features until you have a track record of landing multiple small contributions. Keep in mind that not all open source projects are equal. Each has its own culture, written and unwritten rules, development process, documented requirements (which may be outdated) and more. Starting with a tiny contribution, even just a typo fix, is a good way to validate how code submissions, reviews and approvals work in a particular project. Once you have staff who have successfully landed smaller contributions, you can start planning larger proposals. The exact same proposal might fail when made by a newcomer and succeed when made by a person who already has a reputation for prior high-quality work.

Embrace any and all publicity you get Some companies have concerns about their employees working in the open. Indeed, every email and code patch an employee submits, and all related discussions, become public. This may initially sound scary, but it is actually a potential source of good publicity. Employees need to be trained on how to conduct themselves publicly, and the discussions about code should contain only information strictly related to the code, without any references to actual production environments or other sensitive information. In the long run most contributing employees have a positive impact, and the company should reap the benefits of positive publicity. If there are quality issues or employee judgment issues, hiding the activity or forcing employees to contribute under pseudonyms is not a proper solution. Instead, the problems should be addressed at the root, and bad behavior addressed rather than tolerated. When people work publicly, there tends to be an additional degree of pride involved, which motivates them to do their best. Contributions need to be public for the sponsoring corporation to later be able to claim copyright or licenses. Considering that thousands of companies participate in open source every day, the prevalence of bad publicity is quite low, and the benefits far exceed the risks.

Scratch your own itch When choosing what to contribute, select things that benefit your own company. This is not purely about being selfish - often the people working on resolving a problem they suffer from are the same people with the best expertise on what the problem is and what kind of solution is optimal. Also, the issues that are most pressing to your company are more likely to be universally useful to solve than any random bug or feature request in the upstream project's issue tracker.

Remember there are many ways to help upstream While submitting code is often considered the primary way to contribute, please keep in mind there are also other highly impactful ways to contribute. Submitting high-quality bug reports will help developers quickly identify and prioritize issues to fix. Providing good research, benchmarks, statistics or feedback helps guide development and helps the project make better design decisions. Documentation, translations, organizing events and providing marketing support can help increase adoption and strengthen long-term viability for the project. In some of the largest open source projects there are already far more pending contributions than the core maintainers can process. Therefore, developers who contribute code should also get into the habit of contributing reviews. As Linus's law states, given enough eyeballs, all bugs are shallow. Reviewing other contributors' submissions will help improve quality, and also alleviate the pressure on core maintainers, who are otherwise the only ones providing feedback. Reviewing code submitted by others is also a great learning opportunity for the reviewer. The reviewer does not need to be better than the submitter - any feedback is useful; merely posting review feedback is not the same thing as making an approval decision. Many projects are also happy to accept monetary support and sponsorships. Some offer specific perks in return. By human nature, the largest sponsors always get their voice heard in important decisions, as no open source project wants to take actions that scare away major financial contributors.

Starting is the hardest part Long-term success in open source comes from a positive feedback loop of an ever-increasing number of users and collaborators. As seen in the examples of countless corporations contributing to open source, the benefits are concrete, and the process usually runs well after the initial ramp-up and organizational learning phase has passed. In open source ecosystems, contributing upstream should be as natural as paying vendors in any business. If you are using open source and not contributing at all, you likely have latent business risks without realizing it. You don't want to wake up one morning to learn that your top talent left because they were forbidden from participating in open source for the company's benefit, or that you were fined due to CRA violations and mismanagement in sharing security fixes with the correct parties. The sooner you start with the process, the less likely those risks will materialize.

24 June 2025

Evgeni Golov: Using LXCFS together with Podman

JP was puzzled that using podman run --memory=2G would not result in the 2G limit being visible inside the container. While we were able to identify this as a visualization problem (tools like free(1) only look at /proc/meminfo, and that is not virtualized inside a container; you'd have to look at /sys/fs/cgroup/memory.max and friends instead), I couldn't leave it at that. And then I remembered there is actually something that can provide a virtual (cgroup-aware) /proc for containers: LXCFS! But does it work with Podman?! I always used it with LXC, but there is technically no reason why it wouldn't work with a different container solution; cgroups are cgroups, after all. As we all know: there is only one way to find out! Take a fresh Debian 12 VM, install podman and verify things behave as expected:
user@debian12:~$ podman run -ti --rm --memory=2G centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        6067396 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648
And after installing (and starting) lxcfs, we can use the virtual /proc/meminfo it generates by bind-mounting it into the container (LXC does that part automatically for us):
user@debian12:~$ podman run -ti --rm --memory=2G --mount=type=bind,source=/var/lib/lxcfs/proc/meminfo,destination=/proc/meminfo centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        2097152 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648
The same of course works with all the other proc entries lxcfs provides (cpuinfo, diskstats, loadavg, meminfo, slabinfo, stat, swaps, and uptime here), just bind-mount them. And yes, free(1) now works too!
bash-5.1# free -m
               total        used        free      shared  buff/cache   available
Mem:            2048           3        1976           0          67        2044
Swap:              0           0           0
Just don't blindly mount the whole /var/lib/lxcfs/proc over the container's /proc. It did work (as in: "bash and free didn't crash") for me, but with /proc/$PID etc missing, I bet things will go south pretty quickly.
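For completeness, here is a rough sketch of what mounting the individual entries (rather than the whole directory) could look like; the file list simply follows what this lxcfs version provides and may differ on other setups:
user@debian12:~$ podman run -ti --rm --memory=2G \
    --mount=type=bind,source=/var/lib/lxcfs/proc/cpuinfo,destination=/proc/cpuinfo \
    --mount=type=bind,source=/var/lib/lxcfs/proc/diskstats,destination=/proc/diskstats \
    --mount=type=bind,source=/var/lib/lxcfs/proc/loadavg,destination=/proc/loadavg \
    --mount=type=bind,source=/var/lib/lxcfs/proc/meminfo,destination=/proc/meminfo \
    --mount=type=bind,source=/var/lib/lxcfs/proc/slabinfo,destination=/proc/slabinfo \
    --mount=type=bind,source=/var/lib/lxcfs/proc/stat,destination=/proc/stat \
    --mount=type=bind,source=/var/lib/lxcfs/proc/swaps,destination=/proc/swaps \
    --mount=type=bind,source=/var/lib/lxcfs/proc/uptime,destination=/proc/uptime \
    centos:stream9
This keeps the rest of /proc (including /proc/$PID) intact while still giving cgroup-aware values to tools like free(1) and uptime(1).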

22 June 2025

Iustin Pop: Coding, as we knew it, has forever changed

Back when I was terribly naïve When I was younger, and definitely naïve, I was so looking forward to AI, which would help us write lots of good, reliable code faster. Well, principally me, not thinking what impact it would have industry-wide. Other more general concerns, like societal issues, the role of humans in the future and so on, were totally not on my radar. At the same time, I didn't expect this would actually happen. Even years later, things didn't change dramatically. Even the first release of ChatGPT a few years back didn't click for me, as the limitations were still significant.

Hints of serious change The first hint of the change, for me, was when a few months ago (yes, behind the curve) I asked ChatGPT to re-explain a concept to me, and it just wrote a lot of words, but without a clear explanation. On a whim, I asked Grok (then recently launched, I think) to do the same. And for the first time, the explanation clicked and I felt I could have a conversation with it. Of course, now I forgot again that theoretical CS concept, but the first step was done: I can ask an LLM to explain something, and it will, and I can have a back-and-forth logical discussion, even if on some theoretical concept. Additionally, I learned that not all LLMs are the same, and that means there's real competition and that leapfrogging is possible. Another topic on which I tried to adopt early and failed to get mileage out of was GitHub Copilot (in VSC). I tried, it helped, but I didn't feel any speed-up at all. Then more recently, in May, I asked Grok what's the state of the art in AI-assisted coding. It said either Claude in a browser tab, or in VSC via the continue.dev extension. The continue.dev extension/tooling is a bit of a strange/interesting thing. It seems to want to be a middle-man between the user and actual LLM services, i.e. you pay a subscription to continue.dev, not to Anthropic itself, and they manage the keys/APIs for whatever backend LLMs you want to use. The integration with Visual Studio Code is very nice, but I don't know if long-term their business model will make sense. Well, not my problem.

Claude: reverse engineering my old code and teaching new concepts So I installed the latter and subscribed, thinking 20 CHF for a month is good for testing. I skipped the tutorial model/assistant, created a new one from scratch, just enabled Claude 3.7 Sonnet, and started using it. And then, my mind was blown, not just by the LLM, but by the ecosystem. As said, I've used GitHub Copilot before, but it didn't seem effective. I don't know if a threshold has been reached, or Claude (3.7 at that time) is just better than ChatGPT. I didn't use the AI to write (non-trivial) code for me, at most boilerplate snippets. But I used it both as a partner for discussion - "I want to do x, what do you think, A or B?" - and as a teacher, especially for frontend topics, which I'm not familiar with. Since May, in mostly fragmented sessions, I've achieved more than in the last two years. Migration from old-school JS to ECMA modules, a webpacker (reducing bundle size by 50%), replacing an old Javascript library with hand-written code using modern APIs, implementing the zoom feature together with all of keyboard, mouse, touchpad and touchscreen support, simplifying layout from manually computed to automatic layout, and finding a bug in WebKit for which it also wrote a cool minimal test (cool, as in, way better than I'd have ever, ever written, because for me it didn't matter that much). And more. Could I have done all this? Yes, definitely, nothing was especially tricky here. But hours and hours of reading MDN, scouring Stack Overflow and Reddit, and lots of trial and error. So doable, but much more toily. This, to me, feels like cheating. 20 CHF per month to make me 3x more productive is free money; well, except that I don't make money on my code, which is written basically for myself. However, I don't get stuck anymore searching for hours on the web for guidance; I ask my question, and I get at least direction if not an answer, and I'm finished way earlier. I can now actually juggle more hobbies in the same amount of time, if my personal code takes less time, or, differently said, if I'm more efficient at it. Not all is roses, of course. Once, it did write code with such an endearing error that it made me laugh. It was so blatantly obvious that "you shouldn't keep other state in the array that holds pointer status because that confuses the calculation of how many pointers are down", probably to itself too if I'd have asked. But I didn't, since it felt a bit embarrassing to point out such a dumb mistake. Yes, I'm anthropomorphising again, because this is the easiest way to deal with things. In general, it does an OK-to-good-to-sometimes-awesome job, and the best thing is that it summarises documentation and all of Reddit and Stack Overflow. And gives links to those. Now, I have no idea yet what this means for the job of a software engineer. If, on open source code, my own code, it makes me 3x faster (reverse engineering my code from 10 years ago is no small feat), for working on large codebases it should do at least the same, if not more. As an example of how open-ended the assistance can be, at one point I started implementing a new feature, threading a new attribute to a large number of call points. This is not complex at all, just add a new field to a Haskell record, and modify everything to take it into account, populate it, merge it when merging the data structures, etc.
The code is not complex, tending toward boilerplate a bit, and I was wondering about a few possible choices for implementation, so, with just a few lines of code written that were not even compiling, I asked "I want to add a new feature, should I do A or B if I want it to behave like this", and the answer was something along the lines of "I see you want to add the specific feature I was working on, but the implementation is incomplete, you still need to do X, Y and Z". My mind was blown at this point, as I thought, if the code doesn't compile, surely the computer won't be able to parse it, but this is not a program, this is an LLM, so of course it could read it kind of as a human would. Again, the code complexity is not great, but the fact that it was able to read a half-written patch, understand what I was working towards, and reason about it, was mind-blowing, and scary. Like always.

Non-code writing Now, after all this, while writing a recent blog post, I thought this is going to be public anyway, so let me ask Claude what it thinks about it. And I was very surprised, again: gone was all the pain of rereading my post three times to catch typos (easy) or phrasing structure issues. It gave me very clear points, and helped me cut 30-40% of the total time. So not only coding, but wordsmithing too has changed. If I were an author, I'd be delighted (and scared). Here is the overall reply it gave me:
  • Spelling and grammar fixes, all of them on point except one mistake (I claimed I didn't capitalize one word, but I did). To the level of a good grammar checker.
  • Flow suggestions, which went way beyond normal spelling and grammar. It felt like a teacher telling me to do better in my writing, i.e. nitpicking on things that actually were true even if they'd still work. I.e. lousy phrase structure, still understandable, but lousy nevertheless.
  • Other notes: an overall summary. This was mostly just praising my post. I wish LLMs were not so focused on "praise the user".
So yeah, this speeds me up to about 2x on writing blog posts, too. It definitely doesn't feel fair.

Whither the future? After all this, I'm a bit flabbergasted. Gone are the 2000s with code without unit tests, gone are the 2010s without CI/CD, and now, mid-2020s, gone is the lone programmer that scours the internet to learn new things, alone? What this all means for our skills in software development, I have no idea, except I know things have irreversibly changed (a Butlerian jihad aside). Do I learn better with a dedicated tutor, even if I don't fight with the problem for so long? Or is struggling to find good docs the main method of learning? I don't know yet. I feel like I understand the topics I'm discussing with the AI, but who knows what it will mean long term in terms of stickiness of learning. For better or for worse, things have changed. After all the advances over the last five centuries in the mechanical sciences, it has now come to some aspects of the intellectual work. Maybe this is the answer to the ever-growing complexity of tech stacks? I.e. a return of the lone programmer that builds things end-to-end, but with AI taming the complexity added in the last 25 years? I can dream, of course, but this also means that the industry overall will increase in complexity even more, because large companies tend to do that, so maybe a net effect of not much. One thing I did learn so far is that my expectation that AI (at this level) will only help junior/beginner people, i.e. that it would flatten the skills band, is not true. I think AI can speed up at least the middle band, likely the middle-top band; I don't know about the 10x programmers (I'm not one of them). So, my question about AI now is how to best use it, not to lament how all my learning (90% self-learning, to be clear) is obsolete. No, it isn't. AI helps me start and finish one migration (that I delayed for ages), then start the second, in the same day. At the end of this somewhat rambling reflection on the past month and a half, I still have many questions about AI and humanity. But one has been answered: yes, "AI", quotes or no quotes, has already changed this field (producing software), and we've not seen the end of it, for sure.

21 June 2025

Ravi Dwivedi: Getting Brunei visa

In December 2024, my friend Badri and I were planning a trip to Southeast Asia. At this point, we were planning to visit Singapore, Malaysia and Vietnam. My Singapore visa had already been approved, and Malaysia was visa-free for us. For Vietnam, we had to apply for an e-visa online. We considered adding Brunei to our itinerary. I saw some videos of the Brunei visa process and got the impression that we needed to go to the Brunei embassy in Kuching, Malaysia in person. However, when I happened to search for Brunei on Organic Maps1, I stumbled upon the Brunei Embassy in Delhi. It seemed to be somewhere in Hauz Khas. As I was going to Delhi to collect my Singapore visa the next day, I figured I'd also visit the Brunei Embassy to get information about the visa process. The next day I went to the location displayed by Organic Maps. It was next to the embassy of Madagascar, and a sign on the road divider confirmed that I was at the right place. That said, it actually looked like someone's apartment. I entered and asked for directions to the Brunei embassy, but the people inside did not seem to understand my query. After some back and forth, I realized that the embassy wasn't there. I then searched for the Brunei embassy on the Internet, and this time I got an address in Vasant Vihar. It seemed like the embassy had been moved from Hauz Khas to Vasant Vihar. Going by the timings mentioned on the web page, the embassy was closing in an hour. I took the Metro from Hauz Khas to Vasant Vihar. After deboarding at the Vasant Vihar metro station, I took an auto to reach the embassy. The address listed on the webpage got me into the correct block. However, the embassy was still nowhere to be seen. I asked around, but security guards in that area pointed me to the Burundi embassy instead. After some more looking around, I did end up finding the embassy. I spoke to the security guards at the gate and told them that I would like to know the visa process. They dialled a number and asked that person to tell me the visa process. I spoke to a lady on the phone. She listed the documents required for the visa process and mentioned that the timings for visa applications were from 9 o'clock to 11 o'clock in the morning. She also informed me that the visa fee was 1000. I also asked about the process for Badri, who lives far away in Tamil Nadu and could not report at the embassy physically. She told me that I could submit a visa application on his behalf, along with an authorization letter. Having found the embassy in Delhi was a huge relief. The other plan - going to Kuching, Malaysia - was a bit uncertain, and we didn't know how much time it would take. Getting our passports submitted at an embassy in a foreign country was also not ideal. A few days later, Badri sent me all the documents required for his visa. I went to the embassy and submitted both applications. The lady who collected our visa submissions asked me for our flight reservations from Delhi to Brunei, whereas ours were (keeping with our itinerary) from Kuala Lumpur. She said that she might contact me later if it was required. For reference, here is the list of documents we submitted - I then asked about the procedure to collect the passports and visa results. Usually, embassies will tell you that they will contact you when they have decided on your applications. However, here I was informed that if they don't contact me within 5 days, I can come and collect our passports and visa results between 13:30-14:30 hours on the fifth day.
That was strange :) I did visit the embassy to collect our visa results on the fifth day. However, the lady scolded me for not bringing the receipt she had given me. I was afraid that I might have to go all the way back home and bring the receipt to get our passports. The travel date was close, and it would take some time for Badri to receive his passport via courier as well. Fortunately, she gave me our passports (with the visas attached) and asked me to share a scanned copy of the receipt via email after I got home. We were elated that our visas were approved. Now we could focus on booking our flights. If you are going to Brunei, remember to fill in their arrival card on their website within 48 hours of your arrival! Thanks to Badri and Contrapunctus for reviewing the draft before publishing the article.

  1. Nowadays, I prefer using Comaps instead of Organic Maps and recommend you do the same. Organic Maps had some issues with its governance, and community issues weren't being addressed.
