Search Results: "beh"

1 July 2025

Guido Günther: Free Software Activities June 2025

Another short status update of what happened on my side last month. Phosh 0.48.0 is out with nice improvements, phosh.mobi e.V. is alive, I helped a bit to get cellbroadcastd out, fixed some OSK bugs, and more. See below for details on the above and more. Projects touched: phosh, phoc, phosh-mobile-settings, stevia (formerly phosh-osk-stub), phosh-osk-data, phosh-vala-plugins, pfs, xdg-desktop-portal-phosh, xdg-desktop-portal, phosh-debs, meta-phosh, feedbackd, feedbackd-device-themes, gmobile, GNOME clocks, Debian, Mobian, ModemManager, libmbim, libqmi, mobile-broadband-provider-info, cellbroadcastd, iio-sensor-proxy, twenty-twenty-hugo, gotosocial. Reviews: this is not code by me but reviews of other people's code; the list is (as usual) slightly incomplete. Thanks for the contributions! Help development: if you want to support my work, see donations. Comments? Join the Fediverse thread.

30 June 2025

Otto Kekäläinen: Corporate best practices for upstream open source contributions

This post is based on a presentation given at the Validos annual members' meeting on June 25th, 2025.
When I started getting into Linux and open source over 25 years ago, the majority of the software development in this area was done by academics and hobbyists. The number of companies participating in open source has since exploded in parallel with the growth of mobile and cloud software, the majority of which is built on top of open source. For example, Android powers most mobile phones today and is based on Linux. Almost all software used to operate large cloud provider data centers, such as AWS or Google, is either open source or made in-house by the cloud provider. Pretty much all companies, regardless of industry, have been using open source software at least to some extent for years. However, the degree to which they collaborate with the upstream origins of the software varies. I encourage all companies in a technical industry to start contributing upstream. There are many benefits to having a good relationship with your upstream open source software vendors, both in the short term and especially in the long term. Moreover, with the rollout of the CRA in the EU in 2025-2027, the law will require software companies to contribute security fixes upstream to the open source projects their products use. To ensure the process is well managed, business-aligned and legally compliant, there are a few dos and don'ts that are important to be aware of.

Maintain your SBOMs For every piece of software, regardless of whether the code was developed in-house, taken from an open source project, or a combination of these, every company needs to produce a Software Bill of Materials (SBOM). SBOMs provide a standardized and interoperable way to track what software and which versions are used where, what software licenses apply, who holds the copyright of which component, which security fixes have been applied, and so forth. A catalog of SBOMs, or equivalent, forms the backbone of software supply-chain management in corporations.
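For illustration, a single component entry in an SBOM might look like the following CycloneDX JSON fragment (a minimal sketch only; the component, version and license shown are made-up examples, not a complete document):

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "version": 1,
  "components": [
    {
      "type": "library",
      "name": "openssl",
      "version": "3.0.13",
      "licenses": [ { "license": { "id": "Apache-2.0" } } ]
    }
  ]
}

Tooling can generate such documents automatically from a source tree or container image, which is what makes a catalog of SBOMs feasible at corporate scale.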

Identify your strategic upstream vendors The SBOMs are likely to reveal that for any piece of non-trivial software, there are hundreds or thousands of upstream open source projects in use. Few organizations have resources to contribute to all of their upstreams. If your organization is just starting to organize upstream contribution activities, identify the key projects that have the largest impact on your business and prioritize forming a relationship with them first. Organizations with a mature contribution process will be collaborating with tens or hundreds of upstreams.

Appoint an internal coordinator and champions Having a written policy on how to contribute upstream will help ensure a consistent process and avoid common pitfalls. However, a written policy alone does not automatically translate into a well-running process. It is highly recommended to appoint at least one internal coordinator who is knowledgeable about how open source communities work, how software licensing and patents work, and is senior enough to have a good sense of what business priorities to optimize for. In small organizations it can be a single person, while larger organizations typically have a full Open Source Programs Office. This coordinator should oversee the contribution process, track all contributions made across the organization, and further optimize the process by working with stakeholders across the business, including legal experts, business owners and CTOs. The marketing and recruiting folks should also be involved, as upstream contributions will have a reputation-building aspect as well, which can be enhanced with systematic tracking and publishing of activities. Additionally, at least in the beginning, the organization should also appoint key staff members as open source champions. Implementing a new process always includes some obstacles and occasional setbacks, which may discourage employees from putting in the extra effort to reap the full long-term benefits for the company. Having named champions will empower them to make the first few contributions themselves, setting a good example and encouraging and mentoring others to contribute upstream as well.

Avoid excessive approvals To maintain a high quality bar, it is always good to have all outgoing submissions reviewed by at least one or two people. Two or three pairs of eyeballs are significantly more likely to catch issues that might slip by someone working alone. The review also slows down the process by a day or two, which gives the author time to "sleep on it", which usually helps ensure the final submission is well thought out. Do not require more than one or two reviewers. The marginal utility drops quickly to zero beyond a few reviewers, and at around four or five people the effect becomes negative, as the weight of each approval decreases and the reviewers begin to take less personal responsibility. Having too many people in the loop also makes each feedback round slow and expensive, to the extent that the author will hesitate to make updates and ask for re-reviews due to the costs involved. If the organization experiences setbacks due to mistakes slipping through the review process, do not respond by adding more reviewers, as that will just grind the contribution process to a halt. If there are quality concerns, invest in training for engineers, CI systems and perhaps an internal certification program for those making public upstream code submissions. A typical software engineer is more likely to put serious effort into becoming proficient and passing a one-off certification exam, and then make multiple high-quality contributions, than to improve or even want to keep contributing upstream while burdened by a heavy review process on every submission.

Don't expect upstream to accept all code contributions Sure, identifying the root cause of and fixing a tricky bug or writing a new feature requires significant effort. While an open source project will certainly appreciate the effort invested, that doesn't mean it will always welcome all contributions with open arms. Occasionally, the project won't agree that the code is correct or the feature is useful, and some contributions are bound to be rejected. You can minimize the chance of rejection by having a solid internal review process that includes assessing how the upstream community is likely to receive the proposal. Sometimes how things are communicated is more important than how they are coded. Polishing inline comments and git commit messages helps ensure high-quality communication, along with a commitment to respond quickly to review feedback and to conduct regular follow-ups until a contribution is finalized and accepted.
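As an illustration, an upstream-ready git commit message might look like the following (a made-up example; the component, bug and author names are hypothetical):

parser: fix off-by-one in buffer length check

The length check compared against the buffer size inclusively,
allowing a one-byte overread when the input was exactly BUF_SIZE
bytes long. Use a strict comparison instead.

Signed-off-by: Jane Developer <jane@example.com>

The first line summarizes the change for git log; the body explains the why, which is what reviewers and future maintainers need most.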

Start small to grow expertise and reputation In addition to keeping the open source contribution policy lean and nimble, it is also good to start practical contributions with small issues. Don't aim to contribute massive features until you have a track record of making multiple small contributions. Keep in mind that not all open source projects are equal. Each has its own culture, written and unwritten rules, development process, documented requirements (which may be outdated) and more. Starting with a tiny contribution, even just a typo fix, is a good way to validate how code submissions, reviews and approvals work in a particular project. Once you have staff who have successfully landed smaller contributions, you can start planning larger proposals. The exact same proposal might be unsuccessful when made by a newcomer, and successful when made by a person who already has a reputation for prior high-quality work.

Embrace any and all publicity you get Some companies have concerns about their employees working in the open. Indeed, every email and code patch an employee submits, and all related discussions, become public. This may initially sound scary, but it is actually a potential source of good publicity. Employees need to be trained on how to conduct themselves publicly, and the discussions about code should contain only information strictly related to the code, without any references to actual production environments or other sensitive information. In the long run, most contributing employees have a positive impact, and the company should reap the benefits of positive publicity. If there are quality issues or employee judgment issues, hiding the activity or forcing employees to contribute under pseudonyms is not a proper solution. Instead, the problems should be addressed at the root, and bad behavior corrected rather than tolerated. When people work publicly, there also tends to be some additional pride involved, which motivates people to try their best. Contributions need to be public for the sponsoring corporation to later be able to claim copyright or licenses. Considering that thousands of companies participate in open source every day, the prevalence of bad publicity is quite low, and the benefits far exceed the risks.

Scratch your own itch When choosing what to contribute, select things that benefit your own company. This is not purely about being selfish: often the people working on resolving a problem they suffer from are the same people with the best expertise in what the problem is and what kind of solution is optimal. Also, the issues that are most pressing for your company are more likely to be universally useful to solve than any random bug or feature request in the upstream project's issue tracker.

Remember there are many ways to help upstream While submitting code is often considered the primary way to contribute, keep in mind that there are also other highly impactful ways to contribute. Submitting high-quality bug reports helps developers quickly identify and prioritize issues to fix. Providing good research, benchmarks, statistics or feedback helps guide development and helps the project make better design decisions. Documentation, translations, organizing events and providing marketing support can help increase adoption and strengthen the long-term viability of the project. In some of the largest open source projects there are already far more pending contributions than the core maintainers can process. Therefore, developers who contribute code should also get into the habit of contributing reviews. As Linus's law states, given enough eyeballs, all bugs are shallow. Reviewing other contributors' submissions helps improve quality, and also alleviates the pressure on core maintainers who would otherwise be the only ones providing feedback. Reviewing code submitted by others is also a great learning opportunity for the reviewer. The reviewer does not need to be better than the submitter; any feedback is useful, and merely posting review feedback is not the same thing as making an approval decision. Many projects are also happy to accept monetary support and sponsorships. Some offer specific perks in return. By human nature, the largest sponsors always get their voice heard in important decisions, as no open source project wants to take actions that scare away major financial contributors.

Starting is the hardest part Long-term success in open source comes from a positive feedback loop of an ever-increasing number of users and collaborators. As seen in the examples of countless corporations contributing to open source, the benefits are concrete, and the process usually runs well once the initial ramp-up and organizational learning phase has passed. In open source ecosystems, contributing upstream should be as natural as paying vendors in any business. If you are using open source and not contributing at all, you likely have latent business risks without realizing it. You don't want to wake up one morning to learn that your top talent left because they were forbidden from participating in open source for the company's benefit, or that you were fined due to CRA violations and mismanagement in sharing security fixes with the correct parties. The sooner you start the process, the less likely those risks are to materialize.

24 June 2025

Evgeni Golov: Using LXCFS together with Podman

JP was puzzled that using podman run --memory=2G would not result in the 2G limit being visible inside the container. While we were able to identify this as a visualization problem (tools like free(1) only look at /proc/meminfo, and that is not virtualized inside a container; you'd have to look at /sys/fs/cgroup/memory.max and friends instead), I couldn't leave it at that. And then I remembered there is actually something that can provide a virtual (cgroup-aware) /proc for containers: LXCFS! But does it work with Podman?! I had always used it with LXC, but there is technically no reason why it wouldn't work with a different container solution; cgroups are cgroups after all. As we all know: there is only one way to find out! Take a fresh Debian 12 VM, install podman and verify things behave as expected:
user@debian12:~$ podman run -ti --rm --memory=2G centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        6067396 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648
And after installing (and starting) lxcfs, we can use the virtual /proc/meminfo it generates by bind-mounting it into the container (LXC does that part automatically for us):
user@debian12:~$ podman run -ti --rm --memory=2G --mount=type=bind,source=/var/lib/lxcfs/proc/meminfo,destination=/proc/meminfo centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        2097152 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648
The same of course works with all the other proc entries lxcfs provides (cpuinfo, diskstats, loadavg, meminfo, slabinfo, stat, swaps, and uptime here), just bind-mount them. And yes, free(1) now works too!
bash-5.1# free -m
               total        used        free      shared  buff/cache   available
Mem:            2048           3        1976           0          67        2044
Swap:              0           0           0
Just don't blindly mount the whole /var/lib/lxcfs/proc over the container's /proc. It did work (as in: "bash and free didn't crash") for me, but with /proc/$PID etc missing, I bet things will go south pretty quickly.
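For completeness: if you want all of the individual lxcfs-provided entries in one go (rather than the whole directory), a small loop can assemble the bind-mount arguments. This is a sketch building on the commands above, untested:

user@debian12:~$ mounts=""
user@debian12:~$ for f in cpuinfo diskstats loadavg meminfo slabinfo stat swaps uptime; do mounts="$mounts --mount=type=bind,source=/var/lib/lxcfs/proc/$f,destination=/proc/$f"; done
user@debian12:~$ podman run -ti --rm --memory=2G $mounts centos:stream9

This keeps /proc/$PID and friends intact while still virtualizing everything lxcfs knows about.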

22 June 2025

Iustin Pop: Coding, as we knew it, has forever changed

Back when I was terribly naïve When I was younger, and definitely naïve, I was so looking forward to AI, which would help us write lots of good, reliable code faster. Well, principally me, not thinking about what impact it would have industry-wide. Other, more general concerns, like societal issues or the role of humans in the future, were totally not on my radar. At the same time, I didn't expect this would actually happen. Even years later, things didn't change dramatically. Even the first release of ChatGPT a few years back didn't click for me, as the limitations were still significant.

Hints of serious change The first hint of the change, for me, was when a few months ago (yes, behind the curve) I asked ChatGPT to re-explain a concept to me, and it just wrote a lot of words, but without a clear explanation. On a whim, I asked Grok (then recently launched, I think) to do the same. And for the first time, the explanation clicked and I felt I could have a conversation with it. Of course, now I've forgotten that theoretical CS concept again, but the first step was done: I can ask an LLM to explain something, and it will, and I can have a back-and-forth logical discussion, even on some theoretical concept. Additionally, I learned that not all LLMs are the same, which means there's real competition and that leapfrogging is possible. Another thing I had tried to adopt early but failed to get mileage out of was GitHub Copilot (in VSC). I tried it, it helped, but I didn't feel any speed-up at all. Then more recently, in May, I asked Grok what's the state of the art in AI-assisted coding. It said either Claude in a browser tab, or in VSC via the continue.dev extension. The continue.dev extension/tooling is a bit of a strange/interesting thing. It seems to want to be a middle-man between the user and the actual LLM services, i.e. you pay a subscription to continue.dev, not to Anthropic itself, and they manage the keys/APIs for whatever backend LLMs you want to use. The integration with Visual Studio Code is very nice, but I don't know if their business model will make sense long-term. Well, not my problem.

Claude: reverse engineering my old code and teaching new concepts So I installed the latter and subscribed, thinking 20 CHF for a month is good for testing. I skipped the tutorial model/assistant, created a new one from scratch, just enabled Claude 3.7 Sonnet, and started using it. And then my mind was blown, not just by the LLM, but by the ecosystem. As said, I've used GitHub Copilot before, but it didn't seem effective. I don't know if a threshold has been reached, or if Claude (3.7 at that time) is just better than ChatGPT. I didn't use the AI to write (non-trivial) code for me, at most boilerplate snippets. But I used it both as a partner for discussion ("I want to do x, what do you think, A or B?") and as a teacher, especially for frontend topics, which I'm not familiar with. Since May, in mostly fragmented sessions, I've achieved more than in the last two years: migration from old-school JS to ECMA modules, a webpacker (reducing bundle size by 50%), replacing an old JavaScript library with hand-written code using modern APIs, implementing the zoom feature together with all of keyboard, mouse, touchpad and touchscreen support, simplifying layout from manually computed to automatic layout, and finding a bug in WebKit for which it also wrote a cool minimal test (cool, as in, way better than I'd have ever written, because for me it didn't matter that much). And more. Could I have done all this? Yes, definitely, nothing was especially tricky here. But hours and hours of reading MDN, scouring Stack Overflow and Reddit, and lots of trial and error. So doable, but with much more toil. This, to me, feels like cheating. 20 CHF per month to make me 3x more productive is free money. Well, except that I don't make money on my code, which is written basically for myself. However, I don't get stuck anymore searching for hours on the web for guidance; I ask my question, I get at least a direction if not an answer, and I'm finished way earlier. I can now actually juggle more hobbies in the same amount of time, if my personal code takes less time, or, differently said, if I'm more efficient at it. Not all is roses, of course. Once, it did write code with such an endearing error that it made me laugh. It was so blatantly obvious that you shouldn't keep other state in the array that holds pointer status, because that confuses the calculation of "how many pointers are down", probably to itself too if I'd have asked. But I didn't, since it felt a bit embarrassing to point out such a dumb mistake. Yes, I'm anthropomorphising again, because this is the easiest way to deal with things. In general, it does an OK-to-good-to-sometimes-awesome job, and the best thing is that it summarises documentation and all of Reddit and Stack Overflow. And gives links to those. Now, I have no idea yet what this means for the job of a software engineer. If, on open source code (my own code), it makes me 3x faster (reverse engineering my code from 10 years ago is no small feat), then for working on large codebases it should do at least the same, if not more. As an example of how open-ended the assistance can be: at one point I started implementing a new feature, threading a new attribute to a large number of call points. This is not complex at all, just adding a new field to a Haskell record and modifying everything to take it into account, populate it, merge it when merging the data structures, etc.
The code is not complex, tending a bit toward boilerplate, and I was wondering about a few possible implementation choices. So, with just a few lines of code written that were not even compiling, I asked "I want to add a new feature, should I do A or B if I want it to behave like this?", and the answer was something along the lines of "I see you want to add the specific feature I was working on, but the implementation is incomplete; you still need to do X, Y and Z." My mind was blown at this point, as I had thought: if the code doesn't compile, surely the computer won't be able to parse it. But this is not a program, this is an LLM, so of course it could read it, kind of as a human would. Again, the code complexity is not great, but the fact that it was able to read a half-written patch, understand what I was working towards, and reason about it, was mind-blowing, and scary. Like always.

Non-code writing Now, after all this, while writing a recent blog post, I thought: this is going to be public anyway, so let me ask Claude what it thinks about it. And I was very surprised, again: gone was all the pain of rereading my post three times to catch typos (easy) or phrasing-structure issues. It gave me very clear points, and helped me cut 30-40% of the total time. So not only coding, but wordsmithing too has changed. If I were an author, I'd be delighted (and scared). Here is the overall reply it gave me:
  • Spelling and grammar fixes, all of them on point except one mistake (I claimed I didn't capitalize one word, but I did). At the level of a good grammar checker.
  • Flow suggestions, which went way beyond normal spelling and grammar. It felt like a teacher telling me to do better in my writing, i.e. nitpicking on things that were actually true even if they'd still work: lousy phrase structure, still understandable, but lousy nevertheless.
  • Other notes: an overall summary. This was mostly just praising my post. I wish LLMs were not so focused on "praise the user".
So yeah, this speeds me up to about 2x on writing blog posts, too. It definitely feels not fair.

Whither the future? After all this, I'm a bit flabbergasted. Gone are the 2000s with code without unit tests, gone are the 2010s without CI/CD, and now, mid-2020s, is the lone programmer who scours the internet to learn new things gone too? What this all means for our skills in software development, I have no idea, except that I know things have irreversibly changed (a Butlerian jihad aside). Do I learn better with a dedicated tutor, even if I don't fight with the problem for so long? Or is struggling to find good docs the main method of learning? I don't know yet. I feel like I understand the topics I'm discussing with the AI, but who knows what it will mean long-term for the stickiness of that learning. For better or for worse, things have changed. After all the advances over the last five centuries in the mechanical sciences, it has now come to some aspects of intellectual work. Maybe this is the answer to the ever-growing complexity of tech stacks? I.e. a return of the lone programmer who builds things end-to-end, but with AI taming the complexity added in the last 25 years? I can dream, of course, but this also means that the industry overall will increase in complexity even more, because large companies tend to do that, so maybe a net effect of not much. One thing I did learn so far is that my expectation that AI (at this level) would only help junior/beginner people, i.e. that it would flatten the skills band, is not true. I think AI can speed up at least the middle band, likely the middle-top band; I don't know about the 10x programmers (I'm not one of them). So my question about AI now is how to best use it, not to lament how all my learning (90% self-learning, to be clear) is obsolete. No, it isn't. AI helped me start and finish one migration (that I had delayed for ages), then start a second, in the same day. At the end of this somewhat rambling reflection on the past month and a half, I still have many questions about AI and humanity. But one has been answered: yes, "AI", quotes or no quotes, has already changed this field (producing software), and we've not seen the end of it, for sure.

21 June 2025

Ravi Dwivedi: Getting Brunei visa

In December 2024, my friend Badri and I were planning a trip to Southeast Asia. At this point, we were planning to visit Singapore, Malaysia and Vietnam. My Singapore visa had already been approved, and Malaysia was visa-free for us. For Vietnam, we had to apply for an e-visa online. We considered adding Brunei to our itinerary. I saw some videos of the Brunei visa process and got the impression that we needed to go to the Brunei embassy in Kuching, Malaysia in person. However, when I happened to search for Brunei on Organic Maps[1], I stumbled upon the Brunei Embassy in Delhi. It seemed to be somewhere in Hauz Khas. As I was going to Delhi to collect my Singapore visa the next day, I figured I'd also visit the Brunei Embassy to get information about the visa process. The next day I went to the location displayed by Organic Maps. It was next to the embassy of Madagascar, and a sign on the road divider confirmed that I was at the right place. That said, it actually looked like someone's apartment. I entered and asked for directions to the Brunei embassy, but the people inside did not seem to understand my query. After some back and forth, I realized that the embassy wasn't there. I then searched for the Brunei embassy on the Internet, and this time I got an address in Vasant Vihar. It seemed that the embassy had moved from Hauz Khas to Vasant Vihar. Going by the timings mentioned on the web page, the embassy was closing in an hour. I took a Metro from Hauz Khas to Vasant Vihar. After deboarding at the Vasant Vihar metro station, I took an auto to reach the embassy. The address listed on the webpage got me into the correct block. However, the embassy was still nowhere to be seen. I asked around, but security guards in that area pointed me to the Burundi embassy instead. After some more looking around, I did end up finding the embassy. I spoke to the security guards at the gate and told them that I would like to know the visa process. They dialled a number and asked that person to tell me the visa process. I spoke to a lady on the phone. She listed the documents required for the visa process and mentioned that the timings for visa applications were from 9 o'clock to 11 o'clock in the morning. She also informed me that the visa fee was 1000. I also asked about the process for Badri, who lives far away in Tamil Nadu and could not report to the embassy physically. She told me that I could submit a visa application on his behalf, along with an authorization letter. Having found the embassy in Delhi was a huge relief. The other plan, going to Kuching, Malaysia, was a bit uncertain, and we didn't know how much time it would take. Getting our passports submitted at an embassy in a foreign country was also not ideal. A few days later, Badri sent me all the documents required for his visa. I went to the embassy and submitted both applications. The lady who collected our visa submissions asked me for our flight reservations from Delhi to Brunei, whereas ours were (keeping with our itinerary) from Kuala Lumpur. She said that she might contact me later if it was required. For reference, here is the list of documents we submitted - I then asked about the procedure to collect the passports and visa results. Usually, embassies will tell you that they will contact you when they have decided on your applications. However, here I was informed that if they don't contact me within 5 days, I can come and collect our passports and visa results between 13:30-14:30 hours on the fifth day.
That was strange :) I did visit the embassy to collect our visa results on the fifth day. However, the lady scolded me for not bringing the receipt she had given me. I was afraid that I might have to go all the way back home and bring the receipt to get our passports. The travel date was close, and it would take some time for Badri to receive his passport via courier as well. Fortunately, she gave me our passports (with the visas attached) and asked me to share a scanned copy of the receipt via email after I got home. We were elated that our visas were approved. Now we could focus on booking our flights. If you are going to Brunei, remember to fill in their arrival card on the website within 48 hours of your arrival! Thanks to Badri and Contrapunctus for reviewing the draft before publishing the article.

  1. Nowadays, I prefer using CoMaps instead of Organic Maps and recommend you do the same. Organic Maps had some issues with its governance, and the community's issues weren't being addressed.

20 June 2025

Matthew Garrett: My a11y journey

23 years ago I was in a bad place. I'd quit my first attempt at a PhD for various reasons that were, with hindsight, bad, and I was suddenly entirely aimless. I lucked into picking up a sysadmin role back at TCM where I'd spent a summer a year before, but that's not really what I wanted in my life. And then Hanna mentioned that her PhD supervisor was looking for someone familiar with Linux to work on making Dasher, one of the group's research projects, more usable on Linux. I jumped.

The timing was fortuitous. Sun were pumping money and developer effort into accessibility support, and the Inference Group had just received a grant from the Gatsby Foundation that involved working with the ACE Centre to provide additional accessibility support. And I was suddenly hacking on code that was largely ignored by most developers, supporting use cases that were irrelevant to most developers. Being in a relatively green field space sounds refreshing, until you realise that you're catering to actual humans who are potentially going to rely on your software to be able to communicate. That's somewhat focusing.

This was, uh, something of an on the job learning experience. I had to catch up with a lot of new technologies very quickly, but that wasn't the hard bit - what was difficult was realising I had to cater to people who were dealing with use cases that I had no experience of whatsoever. Dasher was extended to allow text entry into applications without needing to cut and paste. We added support for introspection of the current application's UI so menus could be exposed via the Dasher interface, allowing people to fly through menu hierarchies and pop open file dialogs. Text-to-speech was incorporated so people could rapidly enter sentences and have them spoken out loud.

But what sticks with me isn't the tech, or even the opportunities it gave me to meet other people working on the Linux desktop and forge friendships that still exist. It was the cases where I had the opportunity to work with people who could use Dasher as a tool to increase their ability to communicate with the outside world, whose lives were transformed for the better because of what we'd produced. Watching someone use your code and realising that you could write a three line patch that had a significant impact on the speed they could talk to other people is an incomparable experience. It's been decades and in many ways that was the most impact I've ever had as a developer.

I left after a year to work on fruitflies and get my PhD, and my career since then hasn't involved a lot of accessibility work. But it's stuck with me - every improvement in that space is something that has a direct impact on the quality of life of more people than you expect, but is also something that goes almost unrecognised. The people working on accessibility are heroes. They're making all the technology everyone else produces available to people who would otherwise be blocked from it. They deserve recognition, and they deserve a lot more support than they have.

But when we deal with technology, we deal with transitions. A lot of the Linux accessibility support depended on X11 behaviour that is now widely regarded as a set of misfeatures. It's not actually good to be able to inject arbitrary input into an arbitrary window, and it's not good to be able to arbitrarily scrape out its contents. X11 never had a model to permit this for accessibility tooling while blocking it for other code. Wayland does, but suffers from the surrounding infrastructure not being well developed yet. We're seeing that happen now, though - Gnome has been performing a great deal of work in this respect, and KDE is picking that up as well. There isn't a full correspondence between X11-based Linux accessibility support and Wayland, but for many users the Wayland accessibility infrastructure is already better than with X11.

That's going to continue improving, and it'll improve faster with broader support. We've somehow ended up with the bizarre politicisation of Wayland as being some sort of woke thing while X11 represents the Roman Empire or some such bullshit, but the reality is that there is no story for improving accessibility support under X11 and sticking to X11 is going to end up reducing the accessibility of a platform.

When you read anything about Linux accessibility, ask yourself whether you're reading something written by either a user of the accessibility features, or a developer of them. If they're neither, ask yourself why they actually care and what they're doing to make the future better.


19 June 2025

Debian Outreach Team: GSoC 2025 Introduction: Make Debian for Raspberry Pi Build Again

Hello everyone! I am Kurva Prashanth, interested in the low-level workings of system software, CPUs/SoCs and hardware design. I was introduced to Open Hardware and Embedded Linux while studying electronics and embedded systems as part of robotics coursework. Initially, I did not pay much attention to it and quickly moved on. However, a short talk on "Liberating SBCs using Debian" by Yuvraj at MiniDebConf India 2021 caught my interest. The talk focused on Open Hardware platforms such as Olimex and BeagleBone Black, as well as the Debian distributions tailored for these ARM-based single-board computers, and it intrigued me enough to delve deeper into the realm of Open Hardware and Embedded Linux. These days I'm trying to improve my abilities to contribute to Debian and Linux kernel development. Before finding out about the Google Summer of Code project, I had already started my journey with Debian. I extensively used Debian system build tools (debootstrap, sbuild, dpkg-buildpackage, qemu-debootstrap) for building a Debian image for Bela Cape, a real-time OS for music making, to achieve extremely fast audio and sensor processing times. In 2023, I had the opportunity to attend DebConf23 in Kochi, India, thanks to Nilesh Patra (@nilesh). I met Hector Oron (@zumbi) over dinner at DebConf23, and it was nice talking about his contributions/work at Debian on the armhf port and Debian system administration. That conversation got me interested in knowing more about Debian ARM and the installer, and I found it fascinating that EmDebian was once an external project bringing Debian to embedded systems while now Debian itself can be run on many embedded systems. Also, during DebCamp I was introduced to PGP/GPG keys and the web of trust by Carlos Henrique Lima Melara (@charles), and I learned how to use and generate GPG keys. After DebConf23 I tried Debian packaging, and I miserably failed to get sponsorship for a Python library I packaged. I came across the Debian project for this year's Google Summer of Code, found the project titled "Make Debian for Raspberry Pi Build Again" quite interesting, and applied. Gladly, on May 8th I received an acceptance e-mail from GSoC. I got excited that I'd spend the summer working on something that I like doing. I am thrilled to be part of this project and super excited for the summer of '25. I'm looking forward to working on what I like most, to new connections, and to learning opportunities. So, let me talk a bit more about my project. I will be working on making Debian for Raspberry Pi SBCs build again under the guidance of Gunnar Wolf (@gwolf). In this post, I will describe the project I will be working on.

Why make Debian for Raspberry Pi build again? There is an available set of images for running Debian on Raspberry Pi computers (all models below the 5 series)! However, the maintainer severely lacks the time to take care of them; he has called for somebody to adopt them, but has not been successful. The image generation scripts might have bitrotted a bit, but it is mostly all done. And there is still a lot of interest and use in having the images freshly generated and decently tested! This GSoC project is about getting the Raspberry Pi Debian images site (https://raspi.debian.net/) working reliably, making daily-built images automatic again, and ideally making it easily deployable to project machines and migrating the existing hosting infrastructure to Debian.

How much does it differ from the Debian build process? While the goal is to stay as close as possible to the Debian build process, Raspberry Pi boards require some platform-specific changes, primarily in the early boot sequence and firmware handling. Unlike typical Debian systems, Raspberry Pi boards depend on a non-standard bootloader and use non-free firmware (raspi-firmware), introducing some hardware-specific differences in the initialization process. These differences are largely confined to the early boot and hardware initialization stages. Once the system boots, the userspace remains closely aligned with a typical Debian install, using Debian packages. The current modifications are required due to the non-free firmware. However, a few areas merit review and might be worth changing:
  1. Boot flow: Transitioning to a U-Boot based boot process (as used in Debian installer images for many other SBCs) would reduce divergence and better align with Debian Installer.
  2. Current scripts/workarounds: Some existing hacks may now be redundant with recent upstream support and could be removed.
  3. Board-specific images: Shift to architecture-specific base images with runtime detection could simplify builds and reduce duplication.
Debian is already building SD card images for a wide range of SBCs (e.g., BeagleBone, BananaPi, OLinuXino, Cubieboard, etc.) in installer-arm64/images/u-boot and installer-armhf/images/u-boot; a similar approach for Raspberry Pi could improve maintainability and consistency with Debian's broader SBC support.

Quoted from Mail Discussion Thread with Mentor (Gunnar Wolf)
"One direction we wanted to explore was whether we should still be building one image per family, or whether we could instead switch to one image per architecture (armel, armhf, arm64). There were some details to iron out as RPi3 and RPi4 were quite different, but I think it will be similar to the differences between the RPi 0 and 1, which are handled at first-boot time. To understand what differs between families, take a look at Cyril Brulebois generate-recipe (in the repo), which is a great improvement over the ugly mess I had before he contributed it"
In this project, I intend to build one image per architecture (armel, armhf, arm64) rather than continuing with the current model of building one image per board. This change simplifies image management, reduces redundancy, and leverages dynamic configuration at boot time to support all supported boards within each architecture. By using U-Boot and flash-kernel, we can detect the board type and configure kernel parameters, DTBs and firmware during the first boot, reducing duplication across images, simplifying the maintenance burden, and generalizing image creation while still supporting board-specific behavior at runtime. This method matches existing practices in the Debian Installer team, aligns with Debian's long-term maintainability goals, and better leverages upstream capabilities, ensuring a consistent and scalable boot experience. To streamline and standardize the process of building bootable Debian images for Raspberry Pi devices, I proposed a new workflow that leverages the U-Boot and flash-kernel Debian packages. This provides a clean, maintainable and reproducible way to generate images for armel, armhf and arm64 boards. The workflow is built on vmdb2, a lightweight, declarative tool designed to automate the creation of disk images. A typical vmdb2 recipe defines the disk layout, base system installation (via debootstrap), architecture-specific packages, and any custom post-install hooks; the image includes U-Boot (the u-boot-rpi package), flash-kernel, and a suitable Debian kernel package like linux-image-arm64 or linux-image-armmp. U-Boot serves as the platform's bootloader and is responsible for loading the kernel and initramfs. Unlike Raspberry Pi's non-free/proprietary firmware bootloader, U-Boot provides an open and scriptable interface, allowing us to follow a more standard Debian boot process. It can be configured to boot using either an extlinux.conf or a boot.scr script generated automatically by flash-kernel. The role of flash-kernel is to bridge Debian's kernel installation system with the specifics of embedded bootloaders like U-Boot. When installed, it automatically copies the kernel image, initrd and device tree blobs (DTBs) to the /boot partition. It also generates the necessary boot.scr script if the board configuration demands it. To work correctly, flash-kernel requires that the target machine be identified via /etc/flash-kernel/machine, which must correspond to an entry in its internal machine database. Once the vmdb2 build is complete, the resulting image will contain a fully configured bootable system with all necessary boot components correctly installed. The image can be flashed to an SD card and used to boot on the intended device without additional manual configuration. Because all key packages (U-Boot, kernel, flash-kernel) are managed through Debian's package system, kernel updates and boot script regeneration are handled automatically during system upgrades.
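For example, on a running board one can inspect the model string that flash-kernel matches against its database, and override it (e.g. inside a chroot where /proc is not the target's). This is a sketch; the model string shown is an assumed example:

$ cat /proc/device-tree/model
Raspberry Pi 4 Model B Rev 1.4
$ echo "Raspberry Pi 4 Model B Rev 1.4" | sudo tee /etc/flash-kernel/machine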

Current Workflow: Builds one Image per family The current vmdb2 recipe uses the Raspberry Pi GPU bootloader provided via the raspi-firmware package. This is the traditional boot process followed by Raspberry Pi OS, and it's tightly coupled with firmware files like bootcode.bin, start.elf, and fixup.dat. These files are installed to /boot/firmware, which is mounted from a FAT32 partition labeled RASPIFIRM. The device tree files (*.dtb) are manually copied from /usr/lib/linux-image-*-arm64/broadcom/ into this partition. The kernel is installed via the linux-image-arm64 package, and the boot arguments are injected by modifying /boot/firmware/cmdline.txt using sed commands. Booting depends on the root partition being labeled RASPIROOT, referenced through that file. There is no bootloader such as UEFI or U-Boot involved; the Raspberry Pi firmware directly loads the kernel, which is standard for Raspberry Pi boards.
- apt: install
  packages:
    ...
    - raspi-firmware  
The boot partition contents and kernel boot setup are tightly controlled via scripting in the recipe. Limitations of Current Workflow: While this setup works, it is
  1. Proprietary and Raspberry Pi-specific: it relies on the closed-source GPU bootloader from the raspi-firmware package, which is tightly coupled to specific Raspberry Pi models.
  2. Manual DTB handling: device tree files are manually copied and hardcoded, making upgrades or board-specific changes error-prone.
  3. Not easily extendable to future Raspberry Pi boards: any change in bootloader behavior (as seen in the Raspberry Pi 5, which introduces a more flexible firmware boot process) would require significant rework.
  4. No UEFI or U-Boot layer: the current method bypasses standard bootloader layers, making it inconsistent with other Debian ARM platforms and harder to maintain long-term.
As Raspberry Pi firmware and boot processes evolve, especially with the introduction of Pi 5 and potentially Pi 6, maintaining compatibility will require more flexibility - something best delivered by adopting U-Boot and flash-kernel.

New Workflow: Building Architecture-Specific Images with vmdb2, U-Boot, flash-kernel, and Debian Kernel This workflow outlines an improved approach to generating architecture-specific bootable Debian images using vmdb2, U-Boot, flash-kernel and Debian kernels, moving away from Raspberry Pi's proprietary bootloader to a fully open-source boot process, which improves maintainability, consistency and cross-board support.

New Method: Shift to U-Boot + flash-kernel U-Boot (via Debian's u-boot-rpi package) and flash-kernel bring the image-building process closer to how Debian officially boots ARM devices. flash-kernel integrates with the system's initramfs and kernel packages to install bootloaders, prepare boot.scr or extlinux.conf, and copy kernel/initrd/DTBs to /boot in a format that U-Boot expects. U-Boot will be used as a second-stage bootloader, loaded by the Raspberry Pi's built-in firmware. Once U-Boot is in place, it will read standard boot scripts (boot.scr) generated by flash-kernel, providing a Debian-compatible and board-flexible solution. Extending the YAML spec for the vmdb2 build with U-Boot and flash-kernel: the idea is to improve an existing vmdb2 YAML spec (https://salsa.debian.org/raspi-team/image-specs/raspi_master.yaml) to integrate U-Boot, flash-kernel, and the architecture-specific Debian kernel into the image build process. By incorporating u-boot-rpi and flash-kernel from Debian packages, alongside the standard initramfs-tools, we align the image closer to Debian best practices while supporting both armhf and arm64 architectures. Below are the key additions and adjustments needed in a vmdb2 YAML spec to support the workflow. Install U-Boot, flash-kernel, initramfs-tools and the architecture-specific Debian kernel:
- apt: install
  packages:
    - u-boot-rpi
    - flash-kernel
    - initramfs-tools
    - linux-image-arm64 # or linux-image-armmp for armhf 
  tag: tag-root
Replace linux-image-arm64 with the correct kernel package for the specific target architecture. These packages should be added under the tag-root section in the YAML spec for the vmdb2 build recipe. This ensures that the necessary bootloader, kernel and initramfs tools are included and properly configured in the image. Configure Raspberry Pi firmware to load U-Boot: install the U-Boot binary as kernel.img in /boot/firmware (we could also download and build U-Boot from source, but Debian provides tested binaries):
- shell: |
    cp /usr/lib/u-boot/rpi_4/u-boot.bin ${ROOT?}/boot/firmware/kernel.img
    echo "enable_uart=1" >> ${ROOT?}/boot/firmware/config.txt
  root-fs: tag-root
This makes the RPi firmware load u-boot.bin instead of the Linux kernel directly. Set up flash-kernel for a Debian-style boot: flash-kernel integrates with initramfs-tools and writes a boot config suitable for U-Boot. We need to make sure /etc/flash-kernel/db contains an entry for the board (most Raspberry Pi boards are already supported in Bookworm). Set up /etc/flash-kernel.conf with:
- create-file: /etc/flash-kernel.conf
  contents: |
    MACHINE="Raspberry Pi 4"
    BOOTPART="/dev/disk/by-label/RASPIFIRM"
    ROOTPART="/dev/disk/by-label/RASPIROOT"
  unless: rootfs_unpacked
This allows flash-kernel to write an extlinux.conf or boot.scr into /boot/firmware. Clean up Proprietary/Non-Free Firmware Bootflow Remove the direct kernel loading flow:
- shell: |
    rm -f ${ROOT?}/boot/firmware/vmlinuz*
    rm -f ${ROOT?}/boot/firmware/initrd.img*
    rm -f ${ROOT?}/boot/firmware/cmdline.txt
  root-fs: tag-root
Let U-Boot and flash-kernel manage kernel/initrd and boot parameters instead. Boot Flow After This Change
[SoC ROM] -> [start.elf] -> [U-Boot] -> [boot.scr] -> [Linux Kernel]
  1. This still depends on the Raspberry Pi firmware to start, but it only loads U-Boot, not the Linux kernel.
  2. U-Boot gives you more flexibility (e.g., networking, boot menus, signed boot).
  3. Using flash-kernel ensures kernel updates are handled the Debian Installer way.
  4. Test with a serial console (enable_uart=1) in case HDMI doesn't show early boot logs.
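As a quick post-build sanity check, the boot script that flash-kernel generated on the firmware partition can be inspected with mkimage from u-boot-tools, which prints the script's image header (name, type, size) if the file is a valid U-Boot script. A sketch, assuming the partition is mounted at /mnt/firmware:

$ mkimage -l /mnt/firmware/boot.scr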
Advantages of the New Workflow
  1. Upstream bootloader: replaces the proprietary Raspberry Pi bootloader with upstream U-Boot.
  2. Debian-native tooling: uses flash-kernel and initramfs-tools to manage boot configuration.
  3. Consistent across boards: works for both armhf and arm64, unifying the image build process.
  4. Easier to support new boards: like the Raspberry Pi 5 and future models.
This transition will somewhat standardize the image-building process, aligning it with upstream Debian Installer workflows.

vmdb2 configuration for arm64 using u-boot and flash-kernel NOTE: This is a baseline example and may require tuning.
# Raspberry Pi arm64 image using U-Boot and flash-kernel
steps:
  # ... (existing mkimg, partitions, mount, debootstrap, etc.) ...
  # Install U-Boot, flash-kernel, initramfs-tools and architecture specific kernel
  - apt: install
    packages:
      - u-boot-rpi
      - flash-kernel
      - initramfs-tools
      - linux-image-arm64   # or linux-image-armmp for armhf
    tag: tag-root
  # Install U-Boot binary as kernel.img in firmware partition
  - shell: |
      cp /usr/lib/u-boot/rpi_arm64/u-boot.bin ${ROOT?}/boot/firmware/kernel.img
      echo "enable_uart=1" >> ${ROOT?}/boot/firmware/config.txt
    root-fs: tag-root
  # Configure flash-kernel for Raspberry Pi
  - create-file: /etc/flash-kernel.conf
    contents: |
      MACHINE="Generic Raspberry Pi ARM64"
      BOOTPART="/dev/disk/by-label/RASPIFIRM"
      ROOTPART="/dev/disk/by-label/RASPIROOT"
    unless: rootfs_unpacked
  # Remove direct kernel boot files from Raspberry Pi firmware
  - shell: |
      rm -f ${ROOT?}/boot/firmware/vmlinuz*
      rm -f ${ROOT?}/boot/firmware/initrd.img*
      rm -f ${ROOT?}/boot/firmware/cmdline.txt
    root-fs: tag-root
  # flash-kernel will manage boot scripts and extlinux.conf
  # Rest of image build continues...

Required Changes to Support Raspberry Pi Boards in Debian (flash-kernel + U-Boot)

Overview of Required Changes
Component | Required Task
Debian U-Boot package | Add a build target for rpi_arm64 in u-boot-rpi. Optionally deprecate legacy 32-bit targets.
Debian flash-kernel package | Add or verify entries in db/all.db for Pi 4, Pi 5, Zero 2W and CM4. Ensure boot script generation works via bootscr.uboot-generic.
Debian kernel | Ensure DTBs are installed at /usr/lib/linux-image-<version>/ and available for flash-kernel to reference.

flash-kernel

Already Supported Boards in flash-kernel Debian Package https://sources.debian.org/src/flash-kernel/3.109/db/all.db/#L1700
Model | Arch | DTB-Id
Raspberry Pi 1 A/B/B+, Rev2 | armel | bcm2835-*
Raspberry Pi CM1 | armel | bcm2835-rpi-cm1-io1.dtb
Raspberry Pi Zero/Zero W | armel | bcm2835-rpi-zero*.dtb
Raspberry Pi 2B | armhf | bcm2836-rpi-2-b.dtb
Raspberry Pi 3B/3B+ | arm64 | bcm2837-*
Raspberry Pi CM3 | arm64 | bcm2837-rpi-cm3-io3.dtb
Raspberry Pi 400 | arm64 | bcm2711-rpi-400.dtb

u-boot

Already Supported Boards in Debian U-Boot Package https://salsa.debian.org/installer-team/flash-kernel/-/blob/master/db/all.db

arm64:
Model | Arch | Upstream Defconfig | Debian Target
Raspberry Pi 3B | arm64 | rpi_3_defconfig | rpi_3
Raspberry Pi 4B | arm64 | rpi_4_defconfig | rpi_4
Raspberry Pi 3B/3B+/CM3/CM3+/4B/CM4/400/5B/Zero 2W | arm64 | rpi_arm64_defconfig | rpi_arm64

armhf:
Model | Arch | Upstream Defconfig | Debian Target
Raspberry Pi 2 | armhf | rpi_2_defconfig | rpi_2
Raspberry Pi 3B (32-bit) | armhf | rpi_3_32b_defconfig | rpi_3_32b
Raspberry Pi 4B (32-bit) | armhf | rpi_4_32b_defconfig | rpi_4_32b

armel:
Model | Arch | Upstream Defconfig | Debian Target
Raspberry Pi | armel | rpi_defconfig | rpi
Raspberry Pi 1/Zero | armel | rpi_0_w | rpi_0_w

These boards are already defined in debian/rules under the u-boot-rpi source package, which generates usable U-Boot binaries for the corresponding Raspberry Pi models.

To-Do: Add Missing Board Support to U-Boot and flash-kernel in Debian Several Raspberry Pi models are missing from the Debian U-Boot and flash-kernel packages: upstream support exists and DTBs ship in the Debian kernel, but the flash-kernel database lacks the entries needed to enable bootloader installation and initrd handling.

Boards Not Yet Supported in flash-kernel Debian Package
Model | Arch | DTB-Id
Raspberry Pi 3A+ (32 & 64 bit) | armhf, arm64 | bcm2837-rpi-3-a-plus.dtb
Raspberry Pi 4B (32 & 64 bit) | armhf, arm64 | bcm2711-rpi-4-b.dtb
Raspberry Pi CM4 | arm64 | bcm2711-rpi-cm4-io.dtb
Raspberry Pi CM 4S | arm64 | -
Raspberry Pi Zero 2 W | arm64 | bcm2710-rpi-zero-2-w.dtb
Raspberry Pi 5 | arm64 | bcm2712-rpi-5-b.dtb
Raspberry Pi CM5 | arm64 | -
Raspberry Pi 500 | arm64 | -

Boards Not Yet Supported in Debian U-Boot Package
Model | Arch | Upstream defconfig(s)
Raspberry Pi 3A+/3B+ | arm64 | -, rpi_3_b_plus_defconfig
Raspberry Pi CM 4S | arm64 | -
Raspberry Pi 5 | arm64 | -
Raspberry Pi CM5 | arm64 | -
Raspberry Pi 500 | arm64 | -
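Filling one of these flash-kernel gaps would mean adding a stanza to db/all.db along the following lines. This is a sketch only: the field values, especially the Machine string, are assumptions, and the Machine string must match the device-tree model exactly:

Machine: Raspberry Pi 5 Model B
Kernel-Flavors: arm64
DTB-Id: bcm2712-rpi-5-b.dtb
Boot-Script-Path: /boot/boot.scr
U-Boot-Script-Name: bootscr.uboot-generic
Required-Packages: u-boot-tools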

So, what next? During the Community Bonding Period, I got hands-on with workflow improvements, set up test environments, and began reviewing Raspberry Pi support in Debian's U-Boot and flash-kernel. These are the logs of the project, where I provide weekly reports on the work done; you can check them here: Community Bonding Period logs. My next steps include submitting patches to the u-boot and flash-kernel packages to ensure all missing Raspberry Pi entries are built and shipped, confirming the kernel DTB installation paths and making sure the necessary files are included for all Raspberry Pi variants, and finally validating the changes with test builds on Raspberry Pi hardware. In parallel, I'm organizing my tasks and setting up my environment to contribute more effectively. It's been exciting to explore how things work under the hood and to prepare for a summer of learning and contributing to this great community.

18 June 2025

Sergio Durigan Junior: GCC, glibc, stack unwinding and relocations - a war story

I've been meaning to write a post about this bug for a while, so here it is (before I forget the details!). First, I'd like to thank a few people: I'll probably forget some details because it's been more than a week (and life at $DAYJOB moves fast), but we'll see.

The background story Wolfi OS takes security seriously, and one of the things we have is a package which sets the hardening compiler flags for C/C++ according to the best practices recommended by OpenSSF. At the time of this writing, these flags are (in GCC's spec file parlance):
*self_spec:
+ %{!O:%{!O1:%{!O2:%{!O3:%{!O0:%{!Os:%{!0fast:%{!0g:%{!0z:-O2}}}}}}}}} -fhardened -Wno-error=hardened -Wno-hardened %{!fdelete-null-pointer-checks:-fno-delete-null-pointer-checks} -fno-strict-overflow -fno-strict-aliasing %{!fomit-frame-pointer:-fno-omit-frame-pointer} -mno-omit-leaf-frame-pointer
*link:
+ --as-needed -O1 --sort-common -z noexecstack -z relro -z now
The important part for our bug is the usage of -z now and -fno-strict-aliasing. As I was saying, these flags are set for almost every build, but sometimes things don't work as they should and we need to disable them. Unfortunately, one of these problematic cases has been glibc. There was an attempt to enable hardening while building glibc, but that introduced a strange breakage in several of our packages and had to be reverted. Things stayed pretty much the same until a few weeks ago, when I started working on one of my roadmap items: figure out why hardening glibc wasn't working, and get it to work as much as possible.
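As an aside: a quick way to check whether a binary was linked with -z now is to look for the BIND_NOW/NOW dynamic flags. A sketch; the exact output format varies with the binutils version:

$ readelf -d ./some-binary | grep -E 'FLAGS'
 0x000000000000001e (FLAGS)              BIND_NOW
 0x000000006ffffffb (FLAGS_1)            Flags: NOW PIE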

Reproducing the bug I started off by trying to reproduce the problem. It's important to mention this because I often see young engineers forgetting to check if the problem is even valid anymore. I don't blame them; the anxiety to get the bug fixed can be really blinding. Fortunately, I already had one simple test to trigger the failure. All I had to do was install the py3-matplotlib package and then invoke:
$ python3 -c 'import matplotlib'
This would result in an abort with a coredump. I followed the steps above, and readily saw the problem manifesting again. OK, the first step was done; I wasn't getting out of this one easily.

Initial debug The next step is to actually try to debug the failure. In an ideal world you get lucky and are able to spot what's wrong after just a few minutes. Or even better: you can also devise a patch to fix the bug and contribute it upstream. I installed GDB, and then ran the py3-matplotlib command inside it. When the abort happened, I issued a backtrace command inside GDB to see where exactly things had gone wrong. I got a stack trace similar to the following:
#0  0x00007c43afe9972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007c43afe3d8be in raise () from /lib/libc.so.6
#2  0x00007c43afe2531f in abort () from /lib/libc.so.6
#3  0x00007c43af84f79d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007c43af86d4d8 in _Unwind_RaiseException () from /usr/lib/libgcc_s.so.1
#5  0x00007c43acac9014 in __cxxabiv1::__cxa_throw (obj=0x5b7d7f52fab0, tinfo=0x7c429b6fd218 <typeinfo for pybind11::attribute_error>, dest=0x7c429b5f7f70 <pybind11::reference_cast_error::~reference_cast_error() [clone .lto_priv.0]>)
    at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:93
#6  0x00007c429b5ec3a7 in ft2font__getattr__(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) [clone .lto_priv.0] [clone .cold] () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#7  0x00007c429b62f086 in pybind11::cpp_function::initialize<pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::scope, pybind11::sibling>(pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&):: lambda(pybind11::detail::function_call&)#1 ::_FUN(pybind11::detail::function_call&) [clone .lto_priv.0] ()
   from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#8  0x00007c429b603886 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
...
Huh. Initially this didn't provide me with much information. There was something strange about seeing the abort function being called right after _Unwind_RaiseException, but at the time I didn't pay much attention to it. OK, time to expand our horizons a little. Remember when I said that several of our packages would crash with a hardened glibc? I decided to look for another problematic package so that I could make it crash and get its stack trace. My thinking here was that maybe if I could compare both traces, something would come up. I happened to find an old discussion where Dann Frazier mentioned that Emacs was also crashing for him. He and I share the Emacs passion, and I totally agreed with him when he said that Emacs crashing is priority -1! (I'm paraphrasing.) I installed Emacs, ran it, and voilà: the crash happened again. OK, that was good. When I ran Emacs inside GDB and asked for a backtrace, here's what I got:
#0  0x00007eede329972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007eede323d8be in raise () from /lib/libc.so.6
#2  0x00007eede322531f in abort () from /lib/libc.so.6
#3  0x00007eede262879d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007eede2646e7c in _Unwind_Backtrace () from /usr/lib/libgcc_s.so.1
#5  0x00007eede3327b11 in backtrace () from /lib/libc.so.6
#6  0x000059535963a8a1 in emacs_backtrace ()
#7  0x000059535956499a in main ()
Ah, this backtrace is much simpler to follow. Nice. Hmmm. Now the crash is happening inside _Unwind_Backtrace. A pattern emerges! This must have something to do with stack unwinding (or so I thought; keep reading to discover the whole truth). You see, the backtrace function (yes, it's a function) and C++'s exception handling mechanism use similar techniques to do their jobs, and it pretty much boils down to unwinding frames from the stack. I looked into Emacs' source code, specifically the emacs_backtrace function, but could not find anything strange over there. This bug was probably not going to be an easy fix...
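For the record, those debugging sessions amount to something like the following commands (illustrative, not a transcript):
$ gdb --args python3 -c 'import matplotlib'
(gdb) run
(gdb) bt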

The quest for a minimal reproducer Being able to easily reproduce the bug is awesome and really helps with debugging, but even better is having a minimal reproducer for the problem. You see, py3-matplotlib is a huge package and pulls in a bunch of extra dependencies, so it's not easy to ask other people to "just install this big package plus these other dependencies, and then run this command", especially if we have to file an upstream bug and talk to people who may not even run the distribution we're using. So I set out to come up with a smaller recipe to reproduce the issue, ideally something that's not tied to a specific package from the distribution. Having all the information gathered from the initial debug session, especially the Emacs backtrace, I thought that I could write a very simple program that just invoked the backtrace function from glibc in order to trigger the code path that leads to _Unwind_Backtrace. Here's what I wrote:
#include <execinfo.h>

int
main(int argc, char *argv[])
{
  void *a[4096];
  backtrace (a, 100);
  return 0;
}
After compiling it, I determined that yes, the problem did happen with this small program as well. There was only a small nuisance: the manifestation of the bug was not deterministic, so I had to execute the program a few times until it crashed. But that's much better than what I had before, and a small price to pay. Having a minimal reproducer pretty much allows us to switch our focus to what really matters. I wouldn't need to dive into Emacs' or Python's source code anymore. At the time, I was sure this was a glibc bug. But then something else happened.
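Since the crash is non-deterministic, a tiny shell loop saves some typing when hunting for it (assuming the reproducer above was compiled to ./backtrace-test):
$ while ./backtrace-test; do :; done; echo "crashed with status $?"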

GCC 15 I had to stop my investigation efforts because something more important came up: it was time to upload GCC 15 to Wolfi. I spent a couple of weeks working on this (it involved rebuilding the whole archive, filing hundreds of FTBFS bugs, patching some programs, etc.), and by the end of it the transition went smoothly. When the GCC 15 upload was finally done, I switched my focus back to the glibc hardening problem. The first thing I did was to, yes, reproduce the bug again. It had been a few weeks since I had touched the package, after all. So I built a hardened glibc with the latest GCC and... the bug did not happen anymore! Fortunately, the very first thing I thought was "this must be GCC", so I rebuilt the hardened glibc with GCC 14, and the bug was there again. Huh, unexpected but very interesting.

Diving into glibc and libgcc At this point, I was ready to start some serious debugging. And then I got a message on Signal. It was one of those moments where two minds think alike: Gabriel decided to check how I was doing, and I was thinking about him because this involved glibc, and Gabriel contributed to the project for many years. I explained what I was doing, and he promptly offered to help. Yes, there are more people who love low-level debugging! We spent several hours going through the disassembly of certain functions (because we didn't have any debug information in the beginning), trying to make sense of what we were seeing. There was some heavy GDB involved; unfortunately I completely lost the session's history because it was done inside a container running inside an ephemeral VM. But we learned a lot. For example:
  • It was hard to actually understand the full stack trace leading to uw_init_context_1[cold]. _Unwind_Backtrace obviously didn't call it (it called uw_init_context_1, but what was that [cold] doing?). We had to investigate the disassembly of uw_init_context_1 in order to determine where uw_init_context_1[cold] was being called.
  • The [cold] suffix comes from a GCC function attribute that can be used to tell the compiler that the function is unlikely to be reached. When I read that, my mind immediately jumped to "this must be an assertion", so I went to the source code and found the spot.
  • We were able to determine that the return code of uw_frame_state_for was 5, which means _URC_END_OF_STACK. That's why the assertion was triggering.
After finding these facts without debug information, I decided to bite the bullet and recompile GCC 14 with -O0 -g3, so that we could debug what uw_frame_state_for was doing. After banging our heads a bit more, we found that fde is NULL at this excerpt:
// ...
  fde = _Unwind_Find_FDE (context->ra + _Unwind_IsSignalFrame (context) - 1,
                          &context->bases);
  if (fde == NULL)
    {
#ifdef MD_FALLBACK_FRAME_STATE_FOR
      /* Couldn't find frame unwind info for this function.  Try a
         target-specific fallback mechanism.  This will necessarily
         not provide a personality routine or LSDA.  */
      return MD_FALLBACK_FRAME_STATE_FOR (context, fs);
#else
      return _URC_END_OF_STACK;
#endif
    }
// ...
We're debugging on amd64, which means that MD_FALLBACK_FRAME_STATE_FOR is defined and therefore called. But that's not really important for our case here, because we had established before that _Unwind_Find_FDE would never return NULL when using a non-hardened glibc (or a glibc compiled with GCC 15). So we decided to look into what _Unwind_Find_FDE did. The function is complex because it deals with .eh_frame, but we were able to pinpoint the exact location where find_fde_tail (one of the functions called by _Unwind_Find_FDE) returns NULL:
if (pc < table[0].initial_loc + data_base)
  return NULL;
We looked at the addresses of pc and table[0].initial_loc + data_base, and found that the former fell within libgcc's text section, while the latter fell within the text section of /lib/ld-linux-x86-64.so.2. At this point, we were already too tired to continue. I decided to keep looking at the problem later and see if I could get any further.

Bisecting GCC The next day, I woke up determined to find what changed in GCC 15 that caused the bug to disappear. Unless you know GCC's internals like they are your own home (which I definitely don't), the best way to do that is to git bisect the commits between GCC 14 and 15. I spent a few days running the bisect. It took me more time than I'd have liked to find the right range of commits to pass to git bisect (because of how branches and tags are done in GCC's repository), and I also had to write some helper scripts that:
  • Modified the gcc.yaml package definition to make it build with the commit being bisected.
  • Built glibc using the GCC that was just built.
  • Ran tests inside a docker container (with the recently built glibc installed) to determine whether the bug was present.
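The overall shape of such a bisect looks roughly like this, assuming a helper script that builds GCC at the bisected commit, rebuilds the hardened glibc, and exits 0 while the crash still reproduces (the tag and script names here are illustrative):
$ git bisect start --term-old=broken --term-new=fixed
$ git bisect fixed releases/gcc-15.1.0
$ git bisect broken releases/gcc-14.2.0
$ git bisect run ~/check-hardened-glibc.sh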
At the end, I had a commit to point to:
commit 99b1daae18c095d6c94d32efb77442838e11cbfb
Author: Richard Biener <rguenther@suse.de>
Date:   Fri May 3 14:04:41 2024 +0200
    tree-optimization/114589 - remove profile based sink heuristics
Makes sense, right?! No? Well, it didn't for me either. Even after reading what was changed in the code and the upstream bug fixed by the commit, I was still clueless as to why this change "fixed" the problem (I say "fixed" because it may very well be an unintended consequence of the change, and some other problem might have been introduced).

Upstream takes over After obtaining the commit that possibly fixed the bug, while talking to Dann and explaining what I did, he suggested that I should file an upstream bug and check with them. Great idea, of course. I filed the following upstream bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120653 It's a bit long, very dense and complex, but ultimately upstream was able to find the real problem and have a patch accepted in just two days. Nothing like knowing the code base. The initial bug became: https://sourceware.org/bugzilla/show_bug.cgi?id=33088 In the end, the problem was indeed in how the linker defines __ehdr_start, which is consumed by this code (from elf/dl-support.c):
if (_dl_phdr == NULL)
  {
    /* Starting from binutils-2.23, the linker will define the
       magic symbol __ehdr_start to point to our own ELF header
       if it is visible in a segment that also includes the phdrs.
       So we can set up _dl_phdr and _dl_phnum even without any
       information from auxv.  */

    extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
    assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
    _dl_phdr = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
    _dl_phnum = __ehdr_start.e_phnum;
  }
But the following definition is the problematic one (from elf/rtld.c):
extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
This symbol (along with its counterpart, __ehdr_end) was being run-time relocated when it shouldn't be. The fix that was pushed added optimization barriers to prevent the compiler from doing the relocations. I don't claim to fully understand what was done here, and Jakub's analysis is a thing to behold, but in the end I was able to confirm that the patch fixed the bug. And in the end, it was indeed a glibc bug.

Conclusion This was an awesome bug to investigate. It's one of those that deserve a blog post, even though some of the final details of the fix flew over my head. I'd like to start blogging more about this sort of bug, because I've encountered my fair share of them throughout my career. And it was great being able to do some debugging with another person, exchange ideas, learn things together, and ultimately share that deep satisfaction when we find out why a crash is happening. I have at least one more bug on my TODO list to write about (another one with glibc, but this time I was able to get to the end of it and come up with a patch). Stay tuned. P.S.: After having published the post I realized that I forgot to explain why the -z now and -fno-strict-aliasing flags were important. -z now is the flag that I determined to be the root cause of the breakage. If I compiled glibc with every hardening flag except -z now, everything worked. So initially I thought that the problem had to do with how ld.so was resolving symbols at runtime. As it turns out, this ended up being more a symptom than the real cause of the bug. As for -fno-strict-aliasing, a Gentoo developer who commented on the GCC bug above mentioned that this OpenSSF bug had a good point against using this flag for hardening. I still have to do a deep dive on what was discussed in the issue, but this is certainly something to take into consideration. There's also a very good write-up about strict aliasing in general if you're interested in understanding it better.
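Incidentally, a quick way to check whether a given library was linked with -z now is to look for the BIND_NOW / NOW flags in its dynamic section, e.g.:
$ readelf -d /lib/libc.so.6 | grep -E 'BIND_NOW|FLAGS'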

17 June 2025

Evgeni Golov: Arguing with an AI or how Evgeni tried to use CodeRabbit

Everybody is trying out AI assistants these days, so I figured I'd jump on that train and see how fast it derails. I went with CodeRabbit because I've seen it on YouTube; ads work, I guess. I am trying to answer the following questions: To reduce the amount of output and not to confuse contributors, CodeRabbit was configured to only do reviews on demand. What follows is a rather unscientific evaluation of CodeRabbit based on PRs in two Foreman-related repositories, looking at the summaries CodeRabbit posted as well as the comments/suggestions it had about the code.

Ansible 2.19 support PR: theforeman/foreman-ansible-modules#1848 summary posted The summary CodeRabbit posted is technically correct.
This update introduces several changes across CI configuration, Ansible roles, plugins, and test playbooks. It expands CI test coverage to a new Ansible version, adjusts YAML key types in test variables, refines conditional logic in Ansible tasks, adds new default variables, and improves clarity and consistency in playbook task definitions and debug output.
Yeah, it does all of that, all right. But it kinda misses the point that the addition here is "Ansible 2.19 support", which starts with adding it to the CI matrix and then adjusting the code to actually work with that version. Also, the changes are not for "clarity" or "consistency"; they are fixing bugs in the code that the older Ansible versions accepted, but the new one is more strict about. Then it adds a table with the changed files and what changed in there. To me, as the author, it felt redundant, and IMHO doesn't add any clarity to understanding the changes. (And yes, same "clarity" vs bugfix mistake here, but that makes sense as it apparently misidentified the change reason.) And then the sequence diagrams... They probably help if you have a dedicated change to a library or a library consumer, but for this PR it's just noise, especially as it only covers two of the changes (addition of 2.19 to the test matrix and a change to the inventory plugin), completely ignoring other important parts. Overall verdict: noise, don't need this. comments posted CodeRabbit also posted 4 comments/suggestions on the changes.

Guard against undefined result.task IMHO a valid suggestion, even if on the picky side, as I am not sure how to make it undefined here. I ended up implementing it, even if with slightly different (and IMHO more readable) syntax.

Inconsistent pipeline in when for composite CV versions That one was funny! The original complaint was that the when condition used slightly different data manipulation than the data that was passed when the condition was true. The code was supposed to "clean up the data, but only if there are any items left after removing the first 5, as we always want to keep 5 items". And I do agree with the analysis that it's badly maintainable code. But the suggested fix was to re-use the data in the variable we later use for performing the cleanup. While this is (to my surprise!) valid Ansible syntax, it didn't make the code much more readable, as you need to go and look at the variable definition. The better suggestion then came from Ewoud: to compare the length of the data with the number we want to keep. Humans, so smart! But Ansible is not Ewoud's native turf, so he asked whether there is a more elegant way to count how much data we have than to use list | count in Jinja (the data comes from a Python generator, so needs to be converted to a list first). And the AI helpfully suggested to use count instead! However, count is just an alias for length in Jinja, so it behaves identically and needs a list. Luckily the AI quickly apologized for being wrong after being pointed at the Jinja source and didn't try to waste my time any further. Had I not known about the count alias, we'd have committed that suggestion and let CI fail before reverting it again.

Apply the same fix for non-composite CV versions The very same complaint was posted a few lines later, as the logic there is very similar, just with slightly different data to be filtered and cleaned up. Interestingly, here the suggestion also was to use the variable. But there is no variable with the data! The text actually says one needs to "define" it, yet the "committable suggestion" doesn't contain that part. Interestingly, when asked where it sees the "inconsistency" in that hunk, it said the inconsistency is with the composite case above.
That however is nonsense: while we want to keep the same number of composite and non-composite CV versions, the data used in the task is different (it even gets consumed by a totally different playbook), so there can't be any real consistency between the branches. I ended up applying the same logic as suggested by Ewoud above, as that refactoring was possible in a consistent way.

Ensure consistent naming for Oracle Linux subscription defaults One of the changes in Ansible 2.19 is that Ansible fails when there are undefined variables, even if they are only undefined for cases where they are unused. CodeRabbit complains that the names of the defaults I added are inconsistent. And that is technically correct. But those names are already used in other places in the code, so I'd have to refactor more to make it work properly. Once pointed at the fact that the variables already exist, the AI is, as usual, quick to apologize, yay.

add new parameters to the repository module PR: theforeman/foreman-ansible-modules#1860 summary posted Again, the summary is technically correct.
The repository module was updated to support additional parameters for repository synchronization and authentication. New options were added for ansible collections, ostree, Python packages, and yum repositories, including authentication tokens, filtering controls, and version retention settings. All changes were limited to module documentation and argument specification.
But it doesn't add anything you'd not get from looking at the diff, especially as it contains a large documentation chunk explaining those parameters. No sequence diagram this time. That's a good thing! Overall verdict: noise (even if the amount is small), don't need this. comments posted CodeRabbit generated two comments for this PR. Interestingly, none of them overlapped with the issues ansible-lint and friends found.

get rid of the FIXMEs Yepp, that's fair.

add validation for the new parameters Yepp, I forgot these (not intentionally!). The diff it suggests is nonsense, as it doesn't take into account the existing Ansible and Yum validations, but it clearly has read them, as the style etc. of the new ones matches. It also managed to group the parameters correctly by repository type, so it's something.
 if module.foreman_params['content_type'] != 'ansible_collection':
     invalid_list = [key for key in ['ansible_collection_requirements'] if key in module.foreman_params]
     if invalid_list:
         module.fail_json(msg="{0} can only be used with content_type 'ansible_collection'".format(",".join(invalid_list)))
+
+# Validate ansible_collection specific parameters
+if module.foreman_params['content_type'] != 'ansible_collection':
+    invalid_list = [key for key in ['ansible_collection_auth_token', 'ansible_collection_auth_url'] if key in module.foreman_params]
+    if invalid_list:
+        module.fail_json(msg="{0} can only be used with content_type 'ansible_collection'".format(",".join(invalid_list)))
+
+# Validate ostree specific parameters
+if module.foreman_params['content_type'] != 'ostree':
+    invalid_list = [key for key in ['depth', 'exclude_refs', 'include_refs'] if key in module.foreman_params]
+    if invalid_list:
+        module.fail_json(msg="{0} can only be used with content_type 'ostree'".format(",".join(invalid_list)))
+
+# Validate python package specific parameters
+if module.foreman_params['content_type'] != 'python':
+    invalid_list = [key for key in ['excludes', 'includes', 'package_types', 'keep_latest_packages'] if key in module.foreman_params]
+    if invalid_list:
+        module.fail_json(msg="{0} can only be used with content_type 'python'".format(",".join(invalid_list)))
+
+# Validate yum specific parameter
+if module.foreman_params['content_type'] != 'yum' and 'upstream_authentication_token' in module.foreman_params:
+    module.fail_json(msg="upstream_authentication_token can only be used with content_type 'yum'")
Interestingly, it also said "Note: If 'python' is not a valid content_type, please adjust the validation accordingly.", which is quite a hint at a bug in itself. The module currently does not even allow creating content_type=python repositories. That should have been more prominent, as it's a BUG!

parameter persistence in obsah PR: theforeman/obsah#72 summary posted Mostly correct. It did misinterpret the change to a test playbook as an actual "behavior" change: "Introduced new playbook variables for database configuration". There is no database configuration in this repository, just the test playbook using the same metadata as a consumer of the library. Later on it does say "Playbook metadata and test fixtures", so it's unclear whether this is a misinterpretation or just badly summarized. As long as you also look at the diff, it won't confuse you, but if you're using the summary as the sole source of information (bad!), it would. This time the sequence diagram is actually useful, yay. Again, not 100% accurate: it's missing the fact that saving the parameters is hidden behind an "if enabled" flag, something it did represent correctly for loading them. Overall verdict: not really useful, don't need this. comments posted Here I was a bit surprised, especially as the nitpicks were useful!

Persist-path should respect per-user state locations (nitpick) My original code used os.environ.get('OBSAH_PERSIST_PATH', '/var/lib/obsah/parameters.yaml') for the location of the persistence file. CodeRabbit correctly pointed out that this won't work for non-root users and one should respect XDG_STATE_HOME. Ewoud did point that out in his own review, so I am not sure whether CodeRabbit came up with this on its own or also took the human comments into account. The suggested code seems fine too; it just doesn't use /var/lib/obsah at all anymore. This might be a good idea for the generic library we're working on here, which could then be overridden to a static /var/lib path in a consumer (which always runs as root). In the end I did not implement it, but mostly because I was lazy and was sure we'd override it anyway.

Positional parameters are silently excluded from persistence (nitpick) The library allows you to generate both positional (foo, without --) and non-positional (--foo) parameters, but the code I wrote would only ever persist non-positional parameters. This was intentional, but there was no comment documenting the intent, which the rabbit thought would be worth pointing out. It's a fair nitpick and I ended up adding a comment.

Enforce FQDN validation for database_host The library has a way to perform type checking on passed parameters, and one of the supported types is "FQDN", so a fully qualified domain name, with dots and stuff. The test playbook I added has a database_host variable, but I didn't bother adding a type to it, as I don't really need any type checking here. While using "FQDN" might be a bit too strict here (technically a working database connection can also use a non-qualified name or an IP address), I was positively surprised by this suggestion. It shows that the rest of the repository was taken into context when preparing the suggestion.

reset_args() can raise AttributeError when a key is absent This is a correct finding; the code is not written in a way that would survive trying to reset things that were never set. However, that's only true for the case where users pass in --reset-<parameter> without ever having set that parameter before.
The complaint about the case where the parameter is part of the persisted set but not in the parsed args is wrong, as parsed args inherit from the persisted set. The suggested code is not well readable, so I ended up fixing it slightly differently.

Persisted values bypass argparse type validation When persisting, I just yaml.safe_dump the parsed parameters, which means the YAML will contain native types like integers. The argparse documentation warns that the type checking argparse does only applies to strings and is skipped if you pass anything else (via default values). While correct, it doesn't really hurt here, as the persisting only happens after the values were type-checked, so there is not really a reason to type-check them again. Well, unless the type changes, anyway. Not sure what I'll do with this comment.

consider using contextlib.suppress This was added when I asked CodeRabbit for a re-review after pushing some changes. Interestingly, the PR already contained try/except/pass code before, and it did not flag that. Also, the code suggestion contained import contextlib in the middle of the code, instead of at the top of the file. Who would do that?! But the comment as such was valid, so I fixed it in all places it is applicable, not only the one the rabbit found.

workaround to ensure LCE and CV are always sent together PR: theforeman/foreman-ansible-modules#1867 summary posted
A workaround was added to the _update_entity method in the ForemanAnsibleModule class to ensure that when updating a host, both content_view_id and lifecycle_environment_id are always included together in the update payload. This prevents partial updates that could cause inconsistencies.
Partial updates are not a thing. The workaround is purely for the fact that Katello expects both parameters to be sent, even if only one of them needs an actual update. No diagram, good. Overall verdict: misleading summaries are bad! comments posted Given a small patch, there was only one comment.

Implementation looks correct, but consider adding error handling for robustness. This reads correct at first glance. More error handling is always better, right? But if you dig into the argumentation, you see it's wrong. Either way, the AI accepted defeat once I asked it to analyze things in more detail, but why did I have to ask in the first place?!

Summary Well, idk, really. Did the AI find things that humans did not find (or didn't bother to mention)? Yes. It's debatable whether these were useful (see e.g. the database_host example), but I tend to be on the "better to nitpick/suggest more and dismiss than overlook" team, so IMHO a positive win. Did the AI output help the humans with the review (useful summary etc.)? In my opinion it did not. The summaries were either "lots of words, no real value" or plain wrong. The sequence diagrams were not useful either. Luckily all of that can be turned off in the settings, which is what I'd do if I continued using it. Did the AI output help the humans with the code (useful suggestions etc.)? While the actual patches it posted were "meh" at best, there were useful findings that resulted in improvements to the code. Was the AI output misleading? Absolutely! The whole Jinja discussion would have been easier without the AI "help". The same applies to the "error handling" in the workaround PR. Was the AI output distracting? The output is certainly a lot, so yes, I think it can be distracting. As mentioned, I think dropping the summaries can make the experience less distracting. What does all that mean? I will disable the summaries for the repositories, but will leave the @coderabbitai review trigger active in case someone wants an AI-assisted review. This won't be something that I'll force on our contributors and maintainers, but they surely can use it if they want. But I don't think I'll be using this myself on a regular basis. Yes, it can be made "usable". But so can vim ;-) Also, I'd prefer to have a junior human asking all the questions and making bad suggestions, so they can learn from it, and not some planet-burning machine.

11 June 2025

Iustin Pop: This blog finally goes git-annex!

A long, long time ago I put a few pictures on this blog, mostly in the earlier years, and even with small pictures, the git repository soon grew to 80MiB. This is not much in absolute terms, but the actual Markdown/Haskell/CSS/HTML total size is tiny compared to the pictures, PDFs and fonts. I realised I needed a better solution, probably about ten years ago, and that I should investigate git-annex. Then time passed, and I heard about git-lfs, so I thought that's the way forward. Now, I recently got interested again in doing something about this repository, and started researching.

Detour: git-lfs I was sure that git-lfs, being supported by large providers, would be the modern solution. But to my surprise, git-lfs is very server-centric, which in hindsight makes sense, but for a home setup, it's not very good. Maybe I misunderstood, but git-lfs is more a protocol/method for a forge to store files, rather than an end-user solution. But then you need to back up those files separately (together with the rest of the forge), or implement another way of safeguarding them. Further details, such as the fact that it keeps two copies of the files (one in the actual checked-out tree, one in internal storage), mean it's not a good solution. Well, for my blog yes, but not in general. Then posts on Reddit about horror stories (people being locked out of GitHub due to quota, as an example), or this Stack Overflow post about git-lfs constraining how one uses git, convinced me that's not what I want. To each their own, but not for me: I might want to push this blog's repo to GitHub, but I definitely wouldn't want in that case to pay for GitHub storage for my blog images (which are copies, not originals). And yes, even in 2025, those quotas are real GitHub limits, and I agree with GitHub: storage and large bandwidth can't be free.

Back to the future: git-annex So back to git-annex. I thought it was going to be a simple thing, but oh boy, was I wrong. It took me half a week of continuous (well, in free time) reading and discussions with LLMs to understand a bit how it works. I think, honestly, it's a bit too complex, which is why the workflows page lists seven (!) levels of workflow complexity, from fully-managed to fully-manual. IMHO, respect to the author for the awesome tool, but if you need a web app to help you manage git, it hints that the tool is too complex. I made the mistake of running git annex sync once, only to realise it actually starts pushing to my upstream repo and creating new branches and whatnot, so after enough reading, I settled on workflow 6/7, since I don't want another tool to manage my git history. Maybe I'm an outlier here, but everything automatic is a bit too much for me. Once you do understand how git-annex works (on the surface, at least), it is a pretty cool thing. It uses a git-annex git branch to store metainformation, and that is relatively clean. If you do run git annex sync, it creates some extra branches, which I don't like, but meh.

Trick question: what is a remote? One of the most confusing things about git-annex was understanding its remote concept. I thought a remote is a place where you replicate your data. But no, that's a special remote. A normal remote is a git remote, but one that is expected to offer git/ssh/command-line access. So if you have a git+ssh remote, git-annex will not only try to push its above-mentioned branch, but also copy the files. If such a remote is on a forge that doesn't support git-annex, then it will complain and get confused. Of course, if you read the extensive docs, you just do git config remote.<name>.annex-ignore true, and it will understand that it should not sync to it. But, aside from this case, git-annex expects that all checkouts and clones of the repository hold both metadata and data. And if you do any annex commands in them, all other clones will know about them! This can be unexpected, and you find people complaining about it, but nowadays there's a solution:
git clone ... dir && cd dir
git config annex.private true
git annex init "temp copy"
This is important. Any leaf git clone must be followed by that annex.private true config, especially on CI/CD machines. Honestly, I don't understand why clones should be official data stores by default, but it is what it is. I settled on not making any of my checkouts "stable", but only the actual storage places. Except those are not git repositories, but just git-annex storage things, i.e., special remotes. Is it confusing enough yet?

Special remotes The special remotes, as said, is what I expected to be the normal git annex remotes, i.e. places where the data is stored. But well, they exist, and while I m only using a couple simple ones, there is a large number of them. Among the interesting ones: git-lfs, a remote that allows also storing the git repository itself (git-remote-annex), although I m bit confused about this one, and most of the common storage providers via the rclone remote. Plus, all of the special remotes support encryption, so this is a really neat way to store your files across a large number of things, and handle replication, number of copies, from which copy to retrieve, etc. as you with.

And many other features git-annex has tons of other features, so to some extent, the sky's the limit. Automatic selection of what to store in git-annex vs plain git, encryption handling, number of copies, clusters, computed files, etc. etc. etc. I still think it's cool but too complex, though!

Uses Aside from my blog, of course. I've seen blog posts/comments about people using git-annex to track/store their photo collection, and I could see very well how the remote encrypted repos (any of the services supported by rclone) could be an N+2 copy or so. For me, tracking photos would be a bit too tedious, but it could maybe work after more research. A more practical thing would probably be replicating my local movie collection (all legal, to be clear) better than just running rsync from time to time, and tracking the large files in it via git-annex. That's an exercise for another day, though, once I get more mileage with it; my blog pictures are copies, so I don't care much if they get lost, but the movies are primary online copies, and I don't want to re-dump the discs. Anyway, for later.

Migrating to git-annex Migrating here means ending in a state where all large files are in git-annex, and the plain git repo is small. Just moving the files to git-annex at the current head doesn't remove them from history, so your git repository is still large; it won't grow in the future, but it remains at the old size (and contains the large files in its history). In my mind, a nice migration would be: run a custom command, and all the history is migrated to git-annex, so I can go back in time and still use git-annex. I naïvely expected this would be easy and already available, only to find comments on the git-annex site with unsure git-filter-branch calls and some web discussions. This is the discussion on the git-annex website, but it didn't make me confident it would do the right thing. And that discussion is now 8 years old. Surely in 2025, with git-filter-repo, it's easier? And, maybe I'm missing something, but it is not. Not from the point of view of plain git, that's easy, but because of interacting with git-annex, which stores its data in git itself, so doing this properly across successive steps of a repo (when replaying the commits) is, I think, not well-defined behaviour. So I was stuck here for a few days, until I got an epiphany: as I'm going to rewrite the repository, of course I'm keeping a copy of it from before git-annex. If so, I don't need the history, back in time, to be correct in the sense of being able to retrieve the binary files too. It just needs to be correct from the point of view of the actual Markdown and Haskell files that represent the meat of the blog. This simplified the problem a lot. At first, I wanted to just skip these files, but this could also drop commits (git-filter-repo, by default, drops commits if they're empty), and removing the files loses information: when they were added, what the paths were, etc. So instead I came up with a rather clever idea, if I might say so: since git-annex replaces files with symlinks already, just replace the files with symlinks in the whole history, except make them symlinks that are dangling (to represent the fact that the files are missing). One could also use empty files, but empty files are more "valid" in a sense than dangling symlinks, hence why I settled on those. Doing this with git-filter-repo is easy, in newer versions, with the new --file-info-callback. Here is the simple code I used:
import os
import os.path
import pathlib

# Extensions of the large binary files to be replaced with dangling symlinks.
SKIP_EXTENSIONS = {'jpg', 'jpeg', 'png', 'pdf', 'woff', 'woff2'}
FILE_MODES = {b"100644", b"100755"}
SYMLINK_MODE = b"120000"

# filename, mode, blob_id and value are the names that git-filter-repo
# passes to the --file-info-callback body.
fas_string = filename.decode()
path = pathlib.PurePosixPath(fas_string)
ext = path.suffix.removeprefix('.')

if ext not in SKIP_EXTENSIONS:
  return (filename, mode, blob_id)

if mode not in FILE_MODES:
  return (filename, mode, blob_id)

print(f"Replacing '{filename}' (extension '.{ext}') in {os.getcwd()}")

symlink_target = '/none/binary-file-removed-from-git-history'.encode()
new_blob_id = value.insert_file_with_contents(symlink_target)
return (filename, SYMLINK_MODE, new_blob_id)
This goes and replaces the files with a symlink to nowhere, but the symlink should explain why it's dangling. Later renames or moves of the files then work "naturally", as the rename/mv doesn't care about file contents. Then, the filtering is done via:
git-filter-repo --file-info-callback <(cat ~/filter-big.py) --force
It is easy to onboard to git annex:
  • remove all dangling symlinks
  • copy the (binary) files from the original repository
  • since they re named the same, and in the same places, git sees a type change
  • then simply run git annex add on those files
For me it was easy, as all such files were in a few directories, so it was just copying those directories back, a few git-annex add commands, and done. Of course, then adding a few rsync remotes, git annex copy --to, and the repository was ready. Well, I also found a bug in my own Hakyll setup: on a fresh clone, when the large files are just dangling symlinks, the builder doesn't complain, it just ignores the images. Will have to fix that.
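Condensed, that onboarding looks roughly like this (the paths and remote names are illustrative):
$ find . -xtype l -delete                    # remove the dangling placeholder symlinks
$ cp -r ../old-repo/images ./images          # bring back the real binaries
$ git annex add images/
$ git annex copy --to my-rsync-remote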

Other resources This is a blog that I read at the beginning, and I found it very useful as an intro: https://switowski.com/blog/git-annex/. It didn't help me understand how it works under the covers, but it is well written. The author does use the sync command, though, which is too magic for me, but he also agrees about its complexity.

The proof is in the pudding And now, for the actual first image to be added that never lived in the old plain git repository. It's not full-res/full-size; it's cropped a bit at the bottom. Earlier in the year, I went to Paris for a very brief work trip, and I walked around a bit; it was more beautiful than what I remembered from way, way back. So, a somewhat random selection of a picture, but here it is:
Un bateau sur la Seine (a boat on the Seine)
Enjoy!

Gunnar Wolf: Understanding Misunderstandings - Evaluating LLMs on Networking Questions

This post is a review for Computing Reviews for "Understanding Misunderstandings - Evaluating LLMs on Networking Questions", an article published in the Association for Computing Machinery (ACM) SIGCOMM Computer Communication Review.
Large language models (LLMs) have awed the world, emerging as the fastest-growing application of all time: ChatGPT reached 100 million active users in January 2023, just two months after its launch. After an initial cycle, they have gradually been mostly accepted and incorporated into various workflows, and their basic mechanics are no longer beyond the understanding of people with moderate computer literacy. Now, given that the technology is better understood, we face the question of how convenient LLM chatbots are for different occupations. This paper embarks on the question of whether LLMs can be useful for networking applications. The paper systematizes querying three popular LLMs (GPT-3.5, GPT-4, and Claude 3) with questions taken from several network management online courses and certifications, and presents a taxonomy of six axes along which the incorrect responses were classified. The authors also measure four strategies toward improving answers, observing that, while some of those strategies were marginally useful, they sometimes resulted in degraded performance. The authors queried the commercially available instances of Gemini and GPT, which achieved scores over 90 percent for basic subjects but fared notably worse in topics that require understanding and converting between different numeric notations, such as working with Internet protocol (IP) addresses, even if they are trivial (that is, presenting the subnet mask for a given network address expressed in the typical IPv4 dotted-quad representation). As a last item in the paper, the authors compare performance with three popular open-source models: Llama3.1, Gemma2, and Mistral, with their default settings. Although those models are almost 20 times smaller than the GPT-3.5 commercial model used, they reached comparable performance levels. Sadly, the paper does not delve deeper into these models, which can be deployed locally and adapted to specific scenarios. The paper is easy to read and does not require deep mathematical or AI-related knowledge. It presents a clear comparison along the described axes for the 503 multiple-choice questions presented. This paper can be used as a guide for structuring similar studies over different fields.

Freexian Collaborators: Monthly report about Debian Long Term Support, May 2025 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors In May, 22 contributors were paid to work on Debian LTS; their reports are available:
  • Abhijith PA did 8.0h (out of 0.0h assigned and 8.0h from previous period).
  • Adrian Bunk did 26.0h (out of 26.0h assigned).
  • Andreas Henriksson did 1.0h (out of 15.0h assigned and 3.0h from previous period), thus carrying over 17.0h to the next month.
  • Andrej Shadura did 3.0h (out of 10.0h assigned), thus carrying over 7.0h to the next month.
  • Bastien Roucariès did 20.0h (out of 20.0h assigned).
  • Ben Hutchings did 8.0h (out of 20.0h assigned and 4.0h from previous period), thus carrying over 16.0h to the next month.
  • Carlos Henrique Lima Melara did 12.0h (out of 11.0h assigned and 1.0h from previous period).
  • Chris Lamb did 15.5h (out of 0.0h assigned and 15.5h from previous period).
  • Daniel Leidert did 25.0h (out of 26.0h assigned), thus carrying over 1.0h to the next month.
  • Emilio Pozuelo Monfort did 21.0h (out of 16.75h assigned and 11.0h from previous period), thus carrying over 6.75h to the next month.
  • Guilhem Moulin did 11.5h (out of 8.5h assigned and 6.5h from previous period), thus carrying over 3.5h to the next month.
  • Jochen Sprickerhof did 3.5h (out of 8.75h assigned and 17.5h from previous period), thus carrying over 22.75h to the next month.
  • Lee Garrett did 26.0h (out of 12.75h assigned and 13.25h from previous period).
  • Lucas Kanashiro did 20.0h (out of 18.0h assigned and 2.0h from previous period).
  • Markus Koschany did 20.0h (out of 26.25h assigned), thus carrying over 6.25h to the next month.
  • Roberto C. Sánchez did 20.75h (out of 24.0h assigned), thus carrying over 3.25h to the next month.
  • Santiago Ruano Rincón did 15.0h (out of 12.5h assigned and 2.5h from previous period).
  • Sean Whitton did 6.25h (out of 6.0h assigned and 2.0h from previous period), thus carrying over 1.75h to the next month.
  • Sylvain Beucler did 26.25h (out of 26.25h assigned).
  • Thorsten Alteholz did 15.0h (out of 15.0h assigned).
  • Tobias Frost did 12.0h (out of 12.0h assigned).
  • Utkarsh Gupta did 1.0h (out of 15.0h assigned), thus carrying over 14.0h to the next month.

Evolution of the situation In May, we released 54 DLAs. The LTS Team was particularly active in May, publishing a higher than normal number of advisories, as well as helping with a wide range of updates to packages in stable and unstable, plus some other interesting work. We are also pleased to welcome several updates from contributors outside the regular team.
  • Notable security updates:
    • containerd, prepared by Andreas Henriksson, fixes a vulnerability that could cause containers launched as non-root users to be run as root
    • libapache2-mod-auth-openidc, prepared by Moritz Schlarb, fixes a vulnerability which could allow an attacker to crash an Apache web server with libapache2-mod-auth-openidc installed
    • request-tracker4, prepared by Andrew Ruthven, fixes multiple vulnerabilities which could result in information disclosure, cross-site scripting and use of weak encryption for S/MIME emails
    • postgresql-13, prepared by Bastien Roucariès, fixes an application crash vulnerability that could affect the server or applications using libpq
    • dropbear, prepared by Guilhem Moulin, fixes a vulnerability which could potentially result in execution of arbitrary shell commands
    • openjdk-17, openjdk-11, prepared by Thorsten Glaser, fixes several vulnerabilities, which include denial of service, information disclosure or bypass of sandbox restrictions
    • glibc, prepared by Sean Whitton, fixes a privilege escalation vulnerability
  • Notable non-security updates:
    • wireless-regdb, prepared by Ben Hutchings, updates information reflecting changes to radio regulations in many countries
This month's contributions from outside the regular team include the libapache2-mod-auth-openidc update mentioned above, prepared by Moritz Schlarb (the maintainer of the package); the update of request-tracker4, prepared by Andrew Ruthven (the maintainer of the package); and the updates of openjdk-17 and openjdk-11, also noted above, prepared by Thorsten Glaser. Additionally, LTS Team members contributed stable updates of the following packages:
  • rubygems and yelp/yelp-xsl, prepared by Lucas Kanashiro
  • simplesamlphp, prepared by Tobias Frost
  • libbson-xs-perl, prepared by Roberto C. Sánchez
  • fossil, prepared by Sylvain Beucler
  • setuptools and mydumper, prepared by Lee Garrett
  • redis and webpy, prepared by Adrian Bunk
  • xrdp, prepared by Abhijith PA
  • tcpdf, prepared by Santiago Ruano Rincón
  • kmail-account-wizard, prepared by Thorsten Alteholz
Other contributions were also made by LTS Team members to packages in unstable:
  • proftpd-dfsg DEP-8 tests (autopkgtests) were provided to the maintainer, prepared by Lucas Kanashiro
  • a regular upload of libsoup2.4, prepared by Sean Whitton
  • a regular upload of setuptools, prepared by Lee Garrett
Freexian, the entity behind the management of the Debian LTS project, has been working for some time now on the development of an advanced CI platform for Debian-based distributions, called Debusine. Recently, Debusine has reached a level of feature implementation that makes it very usable. Some members of the LTS Team have been using Debusine informally, and during May LTS coordinator Santiago Ruano Rincón made a call for the team to help with testing of Debusine, and to help evaluate its suitability for the LTS Team to eventually begin using as the primary mechanism for uploading packages into Debian. Team members who have started using Debusine are providing valuable feedback to the Debusine development team, thus helping to improve the platform for all users. In fact, a number of updates, for both bullseye and bookworm, made during the month of May were handled using Debusine, e.g. rubygems's DLA-4163-1. By the way, if you are a Debian Developer, you can easily test Debusine following the instructions found at https://wiki.debian.org/DebusineDebianNet. DebConf, the annual Debian Conference, is coming up in July and, as is customary each year, the week preceding the conference will feature an event called DebCamp. The DebCamp week provides an opportunity for teams and other interested groups/individuals to meet together in person in the same venue as the conference itself, with the purpose of doing focused work, often called "sprints". LTS coordinator Roberto C. Sánchez has announced that the LTS Team is planning to hold a sprint primarily focused on the Debian security tracker and the associated tooling used by the LTS Team and the Debian Security Team.

Thanks to our sponsors. Sponsors that joined recently are in bold.

4 June 2025

Gunnar Wolf: Computational modelling of robot personhood and relationality

This post is an unpublished review of "Computational modelling of robot personhood and relationality".
If humans and robots were to be able to roam around the same spaces, mutually recognizing each other for what they are, what would interaction be like? How can we model such interactions in a way that we can reason about and understand the implications of a given behavior? This book aims at answering this question. The book is split into two very different parts. Chapters 1 through 3 are mostly written from a philosophical angle. It starts by framing the possibility of having sentient androids exist in the same plane as humans, without them trying to pass as us or vice versa. The first chapters look at issues related to personhood, that is, how androids can be treated as valid interaction partners in a society with humans, and how interactions with them can be seen as meaningful. In doing so, several landmarks of the past 40 years in the AI field are reviewed. The issues of the "Significant Concerns" that make up a society and give it coherence, and of "Personhood and Relationality", describing how this permeates from a society into each of the individuals that make it up, the relations between them, and the social objects that bring individuals closer together (or farther apart), are introduced and explained. The second part of the book is written from a very different angle, and the change in pace took me somewhat by surprise. Each subsequent chapter presents a different angle of the Affinity system, a model that follows some aspects of human behavior over time and in a given space. Chapter 4 introduces the Affinity environment: a 3D simulated environment with simulated physical laws and characteristics, where a number of agents (30-50 is mentioned as usual) interact. Agents have a series of attributes ("value memory"), can adhere to different programs ("narratives"), and gain or lose on some vectors ("economy"). They can sense the world around them with sensors, and can modify the world or signal other agents using effectors. The last two chapters round out the book, as expected: the first presents a set of results from analyzing a given set of value systems, and the second gives readers the conclusions reached by the author. However, I was expecting more: either having at least a link to download the Affinity system and continue exploring it, or modifying some of the aspects it models to get it to model a set of agents with different stories and narratives, or extending it to yet unforeseen behaviors, or at least having the author present a more complete comparison of results than the evaluation of patterns resulting from a given run. The author is a well-known, prolific author in the field, and I was expecting bigger insights from this book. Nevertheless, the book is an interesting and fun read, with important insights in both the first and second parts. There is a certain lack of connection between their respective rhythms, even though the second part indeed builds on the concepts introduced in the first one. Overall, I enjoyed reading the book despite expecting more.

30 May 2025

Russell Coker: Machine Learning Security

I just read an interesting blog post about ML security recommended by Bruce Schneier [1]. This approach of having 2 AI systems where one processes user input and the second performs actions on quarantined data is good and solves some real problems. But I think the bigger issue is the need to do this. Why not have a multi stage approach, instead of a single user input to do everything (the example given is Can you send Bob the document he requested in our last meeting? Bob s email and the document he asked for are in the meeting notes file ) you could have get Bob s email address from the meeting notes file followed by create a new email to that address and find the document etc. A major problem with many plans for ML systems is that they are based around automating relatively simple tasks. The example of sending an email based on meeting notes is a trivial task that s done many times a day but for which expressing it verbally isn t much faster than doing it the usual way. The usual way of doing such things (manually finding the email address from the meeting notes etc) can be accelerated without ML by having a recent documents access method that gets the notes, having the email address be a hot link to the email program (IE wordprocessor or note taking program being able to call the MUA), having a put all data objects of type X into the clipboard (where X can be email address, URL, filename, or whatever), and maybe optimising the MUA UI. The problems that people are talking about solving via ML and treating everything as text to be arbitrarily parsed can in many cases by solved by having the programs dealing with the data know what they have and have support for calling system services accordingly. The blog post suggests a problem of user fatigue from asking the user to confirm all actions, that is a real concern if the system is going to automate everything such that the user gives a verbal description of the problem and then says yes many times to confirm it. But if the user is at every step of the way pushing the process take this email address attach this file it won t be a series of yes operations with a risk of saying yes once too often. I think that one thing that should be investigated is better integration between services to allow working live on data. If in an online meeting someone says I ll work on task A please send me an email at the end of the meeting with all issues related to it then you should be able to click on their email address in the meeting software to bring up the MUA to send a message and then just paste stuff in. The user could then not immediately send the message and clicking on the email address again would bring up the message in progress to allow adding to it (the behaviour of most MUAs of creating a new message for every click on a mailto:// URL is usually not what you desire). In this example you could of course use ALT-TAB or other methods to switch windows to the email, but imagine the situation of having 5 people in the meeting who are to be emailed about different things and that wouldn t scale. Another thing for the meeting example is that having a text chat for a video conference is a standard feature now and being able to directly message individuals is available in BBB and probably some other online meeting systems. 
It shouldn't be hard to add a feature to BBB and similar programs to have each user receive an email at the end of the meeting with the contents of every DM chat they were involved in, and to have everyone in the meeting receive an emailed transcript of the public chat.

In conclusion, I think that there are real issues with ML security, and something like this technology is needed. But for most cases the best option is simply not to have ML systems do such things. Also, there is significant scope for improving the integration of various existing systems in a non-ML way.

28 May 2025

Clint Adams: Potted meat is viewed differently by different cultures

I've been working on a multi-label email classification model. It's been a frustrating slog, fraught with challenges, including a lack of training data. Labeling emails is labor-intensive and error-prone. Also, I habitually delete certain classes of email immediately after their usefulness has been reduced. I use a CRM-114-based spam filtering system (actually I use two different instances of the same mailreaver config, but that's another story), which is differently frustrating, but I delete spam when it's detected or when it's trained. Fortunately, there's no shortage of incoming spam, so I can collect enough of that, but emails with other, arguably more important labels arrive infrequently. So, those labels need to be excluded, or their small sample sizes wreck the training feedback loop. Currently, I have ten active labels, and even though the point of this is not to be a spam filter, spam is one of the labels.

Out of curiosity, I decided to compare the performance of my three different models, and to do so on a neutral corpus (in other words, emails that none of them had ever been trained on). I grabbed the full TREC 2007 corpus and ran inference. The results were unexpected in many ways. For example, the Pearson correlation coefficient between my older CRM-114 model and my newer CRM-114 model was only about 0.78. I was even more surprised by how poorly all three performed. Were they overfit to my email? So, I decided to look at the TREC corpus for the first time, and lo and behold, the first spam-labeled email I checked was something I would definitely train all three models with as non-spam: as ham for the two CRM-114 models, and with an entirely different label for my experimental model.
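For readers who want to reproduce this kind of comparison, here is a minimal sketch in C (my own illustration; the scores are hypothetical, not the actual data from either model) of computing the Pearson correlation coefficient between two models' per-message spam scores over the same corpus:

/* Pearson correlation of two classifiers' scores on a shared corpus.
 * Hedged sketch: the score arrays below are made-up placeholders. */
#include <math.h>
#include <stdio.h>

/* Pearson r for two equal-length score vectors. */
static double pearson(const double *x, const double *y, int n) {
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sx += x[i]; sy += y[i];
        sxx += x[i] * x[i]; syy += y[i] * y[i];
        sxy += x[i] * y[i];
    }
    double cov = sxy - sx * sy / n;   /* n * covariance */
    double vx = sxx - sx * sx / n;    /* n * variance of x */
    double vy = syy - sy * sy / n;    /* n * variance of y */
    return cov / sqrt(vx * vy);
}

int main(void) {
    /* Hypothetical per-message spam scores from two models. */
    double old_model[] = {0.90, 0.10, 0.80, 0.40, 0.95, 0.20};
    double new_model[] = {0.85, 0.30, 0.60, 0.50, 0.90, 0.10};
    int n = sizeof old_model / sizeof old_model[0];
    printf("Pearson r = %.2f\n", pearson(old_model, new_model, n));
    return 0;
}

An r near 1 would mean the two models score messages almost identically; a value like the 0.78 reported above leaves substantial room for disagreement between them.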
Posted on 2025-05-28
Tags:

25 May 2025

Valhalla's Things: Honeycomb shirt

Posted on May 25, 2025
Tags: madeof:atoms, craft:sewing, FreeSoftWear, GNU Terry Pratchett
A woman wearing a purplish blue shirt with very wide sleeves, gathered at the cuffs and shoulder with honeycombing, and also a rectangle of honeycombing in the front between the neckline and just above the bust. The shirt is gathered at the waist with a wide belt, and an almost lilac towel hangs from the belt.

After cartridge pleating, the next fabric manipulation technique I wanted to try was smocking, of the honeycombing variety, on a shirt. My current go-to pattern for shirts is the 1880 menswear one I have on my website: I love the fact that most of the fabric is still cut as big rectangles, but the shaped yoke and armscyes make it significantly more comfortable than the earlier style where most of the shaping at the neck was done with gathers into a straight collar.

A woman wearing a shirt in the same fabric; this one has a slit in the front, is gathered into a tall rectangular collar and has dropped shoulders because it's cut from plain rectangles. The sleeves are still huge, and gathered into tall cuffs. It is worn belted (with the same wide white elastic belt used in the previous picture) and the woman is wearing a matching fabric mask, because the picture has been taken in 2021.

In my stash I had a cut of purple-blue, hopefully cotton, fabric I had bought for a cheap price and used for my first attempt at a historically accurate pirate / vampire shirt; that shirt has now become my official summer vaccine jab / blood test shirt (because it has the long sleeves I need, but they are pretty easy to roll up to give access to my arm). That shirt tends to get out of the washing machine pretty wearable even without ironing, which made me think the fabric could be a good choice for something that may be somewhat hard to iron (but also made me suspicious about the actual composition of the fabric, even if it feels nice enough even when worn in the summer).

A piece of fabric with many rows of honeycombing laid on top of the collar and yoke of the shirt; a metal snap peeks from behind the piece of honeycombed fabric. There are still basting lines for the armscyes.

Of course I wanted some honeycombing on the front, but I was afraid that the slit in the middle of it would interfere with the honeycombing and gape, so I decided to have the shirt open in a horizontal line at the yoke. I added instructions to the pattern page for how I changed the opening in the front: basically, it involved finishing the front edge of the yoke, and sewing the honeycombed yoke to a piece of tape with snaps.

Another change from the pattern is that I used plain rectangles for the sleeves, and a square gusset, rather than the "new style" tapered sleeve, because I wanted to have more fabric to gather at the wrist. I did the side and sleeve seams with a hem + whipstitch method rather than a felled seam, which may have helped, but the sleeves went into the fitted armscyes with no issue. I think that if (yeah, right: when) I make another sleeve in this style, I'll sew it into the side seam starting 2-3 cm lower than the place I've marked on the pattern for the original sleeve.

The back of the unbelted shirt: it has a fitted yoke, and then it is quite wide and unfitted, with the fabric gathered into the yoke with a row of honeycombing and some pleating on top.

I also used a row of honeycombing on the back and two on the upper part of the sleeves, instead of the gathering, and of course some rows to gather the cuffs.
The honeycombing on the back was a bit too far away from the edge, so it's a bit of an odd combination of honeycombing and pleating that I don't hate, but don't love either. It's on the back, so I don't mind. On the sleeves I've done the honeycombing closer to the edge, and I've decided to sew the sleeve as if it were a cartridge-pleated sleeve, and that worked better.

Because circumstances are still making access to my sewing machine more of a hassle than I'd want it to be, this was completely sewn by hand, and at a bit more than a month I have to admit that near the end it felt like it had been taking forever. I'm not sure whether it was the actual sewing being slow, some interruptions that happened when I had little time to work on it, or the fact that I've just gone through a time when my brain kept throwing new projects at me, and I kept thinking of how to make those. Thanks, brain. Even when in a hurry to finish it, however, it was still enjoyable sewing, and I think I'll want to do more honeycombing in the future.

The same woman with arms wide to show the big sleeves and the shirt unbelted to show that it is pretty wide also from the front, below the yoke and the honeycombing. The back can be seen to be about 10 cm longer than the front.

Anyway, it's done! And it's going straight into my daily garment rotation, because the weather is getting hot, and that means it's definitely shirt time.

21 May 2025

Simon Quigley: Fences and Values

Don't knock the fence down before you know why it's up. I repeat this phrase over and over again, yet the (metaphorical) Homeowner's Association still decides my fence is the wrong color. Well, now you get to know why the fence is up. If anyone's actually willing to challenge me on this level, I'd welcome it.

The four ideas I'd like to discuss are these: quantum physics, Lutheranism, mental resilience, and psychology. I've been studying these topics intensely for the past decade as a passion project. I'm just going to let my thoughts flow, but I'd like to hear other opinions on this. Can the mysteries of the mind, the subatomic world, and faith converge to reveal deeper truths?

When it comes to self-taught knowledge on analysis, I'm mostly learned on Freud, with some hints of Jung and Peterson. I've read much of the original source material, and watched countless presentations on it. This all being said, I'm also learned on both Rothbard and Marx, so if there is a major flaw in the way Freud is frowned upon, I'd genuinely like to know, so I can update my research and juxtapose the two schools of thought. Alongside this, although probably not directly relevant, I'm learned on John Locke and transcendentalism.

What I'd like to focus on here is the Id. The Id is the pleasure-seeking, instinctual part of the psyche. Jung further extends this into the idea of the shadow self, and Peterson maps the meanings of these texts into a combined work (at least in my rudimentary understanding). In my research, the Id represents the part of your psyche that deals with religious values. As an example, if you're an impulsive person, turning to a spiritual or religious outlet can be highly beneficial. I've been using references from the foundational text of the Judaeo-Christian value system this entire time; feel free to re-read my other blog posts (instead of claiming they don't exist).

Let's tie this into quantum physics. This is the part where I'll struggle most. I've watched several movies about this, read several books, and even learned about it academically, but quantum physics is likely to be my weak spot here. I did some research, and here are the elements I'm looking for: the uncertainty principle, wave-particle duality, quantum entanglement, and the observer effect. I already know about the cat in the box. And the Cat in the Hat, for that matter. I know about wave-particle duality from an incredibly intelligent high school physics teacher of mine. I know about the uncertainty principle purely in a colloquial sense. The remaining element I need to wrap my head around is quantum entanglement, but it feels like I'm almost there.

These concepts do actually challenge the idea of pure free will. It's almost like we're coming full circle. Some theologians (including myself, if you can call me a self-taught one) do believe the idea of quantum indeterminacy can be a space where divine action may take place. You could also liken the unpredictable nature of the Id to quantum indeterminacy as well. These are ones to think about, because in all reality, they're subjective opinions. I do believe they're interconnected.

In terms of Lutheranism, I'll be short on this one. Please do go read the full history behind Martin Luther and his turbulent relationship with Catholicism. I'm not a Bible thumper, and I actually think this is the first time I've mentioned religion publicly at all.
This being said, now I'm actually ready to defend the points on an academic level. The Id represents hidden psychological forces, quantum physics reveals subatomic mysteries, and Lutheranism emphasizes faith in the unseen God. Okay, so we have the baseline. Now, time for some mental resilience.

When I think of mental resilience, the first people I think of are David Goggins and Jocko Willink. I've also enjoyed Dr. Andrew Huberman's podcast. The idea there is simple: if you understand exactly how to learn, you know your fundamentals well enough to draw them and explain them vividly on a whiteboard, and you can make it a habit, at that point you're ready to work on your mental resilience. Little by little, gradually: how far can you push the bar towards the ceiling?

There are obviously limits. People sometimes get scared when I mention mental resilience, but obviously that's a bit of a catch-22. There are plenty of satirical videos out there, and of course, I don't believe in Goggins or Jocko wholeheartedly. They're just tools in the toolbox when times get tough.

I wish you all well, and I hope this gets you thinking about those people who just insist there is no God or higher being, and think you're stupid for believing there is one. Those people obviously haven't read analysis, in my own opinion. Have a great night!

19 May 2025

Melissa Wen: A Look at the Latest Linux KMS Color API Developments on AMD and Intel

This week, I reviewed the last available version of the Linux KMS Color API. Specifically, I explored the API proposed by Harry Wentland and Alex Hung (AMD) and their implementation for the AMD display driver, and tracked the parallel efforts of Uma Shankar and Chaitanya Kumar Borah (Intel) in bringing this plane color management to life. With this API in place, compositors will be able to provide better HDR support and advanced color management for Linux users.

To get a hands-on feel for the API's potential, I developed a fork of drm_info compatible with the new color properties. This allowed me to visualize the display hardware color management capabilities being exposed. If you're curious and want to peek behind the curtain, you can find my exploratory work on the drm_info/kms_color branch. The README there will guide you through the simple compilation and installation process.

Note: You will need to update libdrm to match the proposed API. You can find an updated version in my personal repository here. To avoid potential conflicts with your official libdrm installation, you can compile and install it in a local directory, then use the following command:

export LD_LIBRARY_PATH="/usr/local/lib/"

In this post, I invite you to familiarize yourself with the new API that is about to be released. You can start by doing as I did below: just deploy a custom kernel with the necessary patches and visualize the interface with the help of drm_info. Or, better yet, if you are a userspace developer, you can start developing use cases by experimenting with it. The more eyes, the better.

KMS Color API on AMD The great news is that AMD's driver implementation for plane color operations is being developed right alongside their Linux KMS Color API proposal, so it's easy to apply to your kernel branch and check it out. You can find details of their progress in AMD's series. I just needed to compile a custom kernel with this series applied, intentionally leaving out the AMD_PRIVATE_COLOR flag. The AMD_PRIVATE_COLOR flag guards driver-specific color plane properties, which experimentally expose hardware capabilities while we don't have the generic KMS plane color management interface available. If you don't know or don't remember the details of AMD driver-specific color properties, you can learn more about this work in my blog posts [1] [2] [3]. As driver-specific color properties and KMS colorops are redundant, the driver only advertises one of them, as you can see in AMD workaround patch 24. So, with the custom kernel image ready, I installed it on a system powered by AMD DCN3 hardware (i.e. my Steam Deck). Using my custom drm_info, I could clearly see the Plane Color Pipeline with eight color operations, as below:
 "COLOR_PIPELINE" (atomic): enum  Bypass, Color Pipeline 258  = Bypass
     Bypass
     Color Pipeline 258
         Color Operation 258
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D Curve
             "BYPASS" (atomic): range [0, 1] = 1
             "CURVE_1D_TYPE" (atomic): enum  sRGB EOTF, PQ 125 EOTF, BT.2020 Inverse OETF  = sRGB EOTF
         Color Operation 263
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = Multiplier
             "BYPASS" (atomic): range [0, 1] = 1
             "MULTIPLIER" (atomic): range [0, UINT64_MAX] = 0
         Color Operation 268
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 3x4 Matrix
             "BYPASS" (atomic): range [0, 1] = 1
             "DATA" (atomic): blob = 0
         Color Operation 273
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D Curve
             "BYPASS" (atomic): range [0, 1] = 1
             "CURVE_1D_TYPE" (atomic): enum  sRGB Inverse EOTF, PQ 125 Inverse EOTF, BT.2020 OETF  = sRGB Inverse EOTF
         Color Operation 278
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D LUT
             "BYPASS" (atomic): range [0, 1] = 1
             "SIZE" (atomic, immutable): range [0, UINT32_MAX] = 4096
             "LUT1D_INTERPOLATION" (immutable): enum  Linear  = Linear
             "DATA" (atomic): blob = 0
         Color Operation 285
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 3D LUT
             "BYPASS" (atomic): range [0, 1] = 1
             "SIZE" (atomic, immutable): range [0, UINT32_MAX] = 17
             "LUT3D_INTERPOLATION" (immutable): enum  Tetrahedral  = Tetrahedral
             "DATA" (atomic): blob = 0
         Color Operation 292
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D Curve
             "BYPASS" (atomic): range [0, 1] = 1
             "CURVE_1D_TYPE" (atomic): enum  sRGB EOTF, PQ 125 EOTF, BT.2020 Inverse OETF  = sRGB EOTF
         Color Operation 297
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D LUT
             "BYPASS" (atomic): range [0, 1] = 1
             "SIZE" (atomic, immutable): range [0, UINT32_MAX] = 4096
             "LUT1D_INTERPOLATION" (immutable): enum  Linear  = Linear
             "DATA" (atomic): blob = 0
Note that Gamescope is currently using the AMD driver-specific color properties implemented by me, Autumn Ashton and Harry Wentland. It doesn't use this KMS Color API, and therefore COLOR_PIPELINE is set to Bypass. Once the API is accepted upstream, all users of the driver-specific API (including Gamescope) should switch to the KMS generic API, as this will be the official plane color management interface of the Linux kernel.
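If you want to poke at the new property from your own code rather than through drm_info, here is a minimal sketch (my own illustration, not code from the patch series or from drm_info) that uses libdrm to walk every plane and print the options of its COLOR_PIPELINE enum. It assumes a kernel exposing the proposed API and the updated libdrm mentioned above:

/* Hedged sketch: list each plane's COLOR_PIPELINE enum values via libdrm,
 * roughly what drm_info does for the output shown above. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

int main(void) {
    int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
    if (fd < 0) { perror("open"); return 1; }

    /* Needed to see all planes and atomic-only properties. */
    drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);
    drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);

    drmModePlaneResPtr planes = drmModeGetPlaneResources(fd);
    if (!planes) { close(fd); return 1; }

    for (uint32_t i = 0; i < planes->count_planes; i++) {
        uint32_t plane_id = planes->planes[i];
        drmModeObjectPropertiesPtr props =
            drmModeObjectGetProperties(fd, plane_id, DRM_MODE_OBJECT_PLANE);
        for (uint32_t j = 0; props && j < props->count_props; j++) {
            drmModePropertyPtr prop = drmModeGetProperty(fd, props->props[j]);
            if (prop && strcmp(prop->name, "COLOR_PIPELINE") == 0) {
                printf("plane %u: COLOR_PIPELINE options:\n", plane_id);
                for (int k = 0; k < prop->count_enums; k++)
                    printf("    %s\n", prop->enums[k].name);
            }
            drmModeFreeProperty(prop);
        }
        drmModeFreeObjectProperties(props);
    }
    drmModeFreePlaneResources(planes);
    close(fd);
    return 0;
}

It should compile with gcc color_pipeline.c $(pkg-config --cflags --libs libdrm); point pkg-config at the locally installed libdrm if you built it in a custom prefix.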

KMS Color API on Intel On the Intel side, the driver implementation available upstream was built upon an earlier iteration of the API. This meant I had to apply a few tweaks to bring it in line with the latest specifications. You can explore their latest work here. For more simplified handling, combining the V9 of the Linux Color API, Intel's contributions, and my necessary adjustments, check out my dedicated branch. I then compiled a kernel from this integrated branch and deployed it on a system featuring Intel TigerLake GT2 graphics. Running my custom drm_info revealed a Plane Color Pipeline with three color operations, as follows:
 "COLOR_PIPELINE" (atomic): enum  Bypass, Color Pipeline 480  = Bypass
     Bypass
     Color Pipeline 480
         Color Operation 480
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, 1D LUT Mult Seg, 3x3 Matrix, Multiplier, 3D LUT  = 1D LUT Mult Seg
             "BYPASS" (atomic): range [0, 1] = 1
             "HW_CAPS" (atomic, immutable): blob = 484
             "DATA" (atomic): blob = 0
         Color Operation 487
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, 1D LUT Mult Seg, 3x3 Matrix, Multiplier, 3D LUT  = 3x3 Matrix
             "BYPASS" (atomic): range [0, 1] = 1
             "DATA" (atomic): blob = 0
         Color Operation 492
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, 1D LUT Mult Seg, 3x3 Matrix, Multiplier, 3D LUT  = 1D LUT Mult Seg
             "BYPASS" (atomic): range [0, 1] = 1
             "HW_CAPS" (atomic, immutable): blob = 496
             "DATA" (atomic): blob = 0
Observe that Intel's approach introduces additional properties like HW_CAPS at the color operation level, along with two new color operation types: 1D LUT with Multiple Segments and 3x3 Matrix. It's important to remember that this implementation is based on an earlier stage of the KMS Color API and is awaiting review.

A Shout-Out to Those Who Made This Happen I'm impressed by the solid implementation and clear direction of V9 of the KMS Color API. It aligns with the many insightful discussions we've had over the past years. A huge thank you to Harry Wentland and Alex Hung for their dedication in bringing this to fruition! Beyond their efforts, I deeply appreciate Uma's and Chaitanya's commitment to updating Intel's driver implementation to align with the freshest version of the KMS Color API. The collaborative spirit of the AMD and Intel developers in sharing their color pipeline work upstream is invaluable. We're now gaining a much clearer picture of the color capabilities embedded in modern display hardware, all thanks to their hard work, comprehensive documentation, and engaging discussions. Finally, thanks to all the userspace developers, color science experts, and kernel developers from various vendors who actively participate in the upstream discussions, meetings, workshops, each iteration of this API, and the crucial code review process. I'm happy to be part of the final stages of this long kernel journey, but I know that when it comes to colors, one step is completed only for new challenges to be unlocked. Looking forward to meeting you at this year's Linux Display Next hackfest, organized by AMD in Toronto, to further discuss HDR, advanced color management, and other display trends.

16 May 2025

Freexian Collaborators: Monthly report about Debian Long Term Support, April 2025 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors In April, 22 contributors were paid to work on Debian LTS; their reports are available below:
  • Adrian Bunk did 56.25h (out of 56.25h assigned).
  • Andreas Henriksson did 15.0h (out of 20.0h assigned), thus carrying over 5.0h to the next month.
  • Andrej Shadura did 10.0h (out of 6.0h assigned and 4.0h from previous period).
  • Bastien Roucariès did 31.5h (out of 31.5h assigned).
  • Ben Hutchings did 8.0h (out of 0.0h assigned and 12.0h from previous period), thus carrying over 4.0h to the next month.
  • Carlos Henrique Lima Melara did 11.0h (out of 12.0h assigned), thus carrying over 1.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 26.0h (out of 26.0h assigned).
  • Emilio Pozuelo Monfort did 30.0h (out of 39.25h assigned and 0.25h from previous period), thus carrying over 9.5h to the next month.
  • Guilhem Moulin did 8.5h (out of 3.25h assigned and 11.75h from previous period), thus carrying over 6.5h to the next month.
  • Jochen Sprickerhof did 12.5h (out of 20.75h assigned and 9.25h from previous period), thus carrying over 17.5h to the next month.
  • Lee Garrett did 26.25h (out of 7.75h assigned and 31.75h from previous period), thus carrying over 13.25h to the next month.
  • Lucas Kanashiro did 50.0h (out of 0.0h assigned and 52.0h from previous period), thus carrying over 2.0h to the next month.
  • Markus Koschany did 39.5h (out of 39.5h assigned).
  • Roberto C. Sánchez did 9.0h (out of 0.0h assigned and 12.0h from previous period), thus carrying over 3.0h to the next month.
  • Santiago Ruano Rincón did 12.5h (out of 7.5h assigned and 7.5h from previous period), thus carrying over 2.5h to the next month.
  • Sean Whitton did 7.0h (out of 7.0h assigned).
  • Stefano Rivera did 0.5h (out of 0.0h assigned and 10.0h from previous period), thus carrying over 9.5h to the next month.
  • Sylvain Beucler did 39.5h (out of 39.25h assigned and 0.25h from previous period).
  • Thorsten Alteholz did 15.0h (out of 15.0h assigned).
  • Tobias Frost did 12.0h (out of 7.75h assigned and 4.25h from previous period).
  • Utkarsh Gupta did 2.0h (out of 2.0h assigned).

Evolution of the situation In April, we released 46 DLAs.
  • Notable security updates:
    • jetty9, prepared by Markus Koschany, fixes an information disclosure and potential remote code execution vulnerability
    • zabbix, prepared by Tobias Frost, fixes several vulnerabilities, encompassing denial of service, information disclosure or remote code inclusion
    • glibc, prepared by Sean Whitton, fixes a buffer overflow vulnerability
  • Notable non-security updates:
    • tzdata, prepared by Emilio Pozuelo Monfort, brings the latest timezone database release
    • php-horde-editor and php-horde-imp, prepared by Sylvain Beucler, have been updated to switch from CKEditor v3, which is EOL, to CKEditor v4; this builds upon work done last month by Sylvain and Bastien for the complete removal of ckeditor3
    • distro-info-data, prepared by Stefano Rivera, adds information concerning future Debian and Ubuntu releases
The LTS team continues to welcome the collaboration of maintainers and other interested parties from outside the regular team. In April, we had external updates contributed by Yadd (lemonldap-ng) and Moritz Schlarb (libapache2-mod-auth-openidc).

A point release of the current stable Debian 12 (codename "bookworm") is planned for mid-May, and several LTS contributors have prepared packages for this update, many of them prepared in conjunction with related LTS updates of the same packages:
  • glib2.0, haproxy, imagemagick, poppler, and python-h11, prepared by Adrian Bunk
  • rubygems, prepared by Lucas Kanashiro
  • ruby3.1 (in collaboration with Lucas Kanashiro), twitter-bootstrap3, twitter-bootstrap4, wpa, and erlang, prepared by Bastien Roucariès (corresponding updates of twitter-bootstrap3 and twitter-bootstrap4 were also uploaded to Debian unstable)
  • abseil, prepared by Tobias Frost (a corresponding update was also uploaded to Debian unstable)
  • vips, prepared by Guilhem Moulin
Additional updates of ruby3.3 and rubygems were prepared for Debian unstable by Lucas Kanashiro. And finally, a highlight of our continued commitment to enhancing long term support efforts in upstream projects. Freexian, as the primary entity behind the management and execution of the LTS project, has partnered with Invisible Things Lab to extend the upstream security support of Xen 4.17, which is shipped in Debian 12 bookworm (the current stable release). This partnership will result in significantly improved lifecycle support for users of Xen on bookworm, and members of the LTS team will play a part in this endeavour. The Freexian announcement has additional details.

Thanks to our sponsors Sponsors that joined recently are shown in bold.

12 May 2025

Reproducible Builds: Reproducible Builds in April 2025

Welcome to our fourth report from the Reproducible Builds project in 2025. These monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly important area of software supply-chain security. Lastly, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. reproduce.debian.net
  2. Fifty Years of Open Source Software Supply Chain Security
  3. 4th CHAINS Software Supply Chain Workshop
  4. Mailing list updates
  5. Canonicalization for Unreproducible Builds in Java
  6. OSS Rebuild adds new TUI features
  7. Distribution roundup
  8. diffoscope & strip-nondeterminism
  9. Website updates
  10. Reproducibility testing framework
  11. Upstream patches

reproduce.debian.net The last few months have seen the introduction, development and deployment of reproduce.debian.net. In technical terms, this is an instance of rebuilderd, our server designed to monitor the official package repositories of Linux distributions and attempt to reproduce the observed results there. This month, however, we are pleased to announce that reproduce.debian.net now tests all the Debian trixie architectures except s390x and mips64el. The ppc64el architecture was added through the generous support of Oregon State University Open Source Laboratory (OSUOSL), and we can support the armel architecture thanks to CodeThink.

Fifty Years of Open Source Software Supply Chain Security Russ Cox has published a must-read article in ACM Queue on Fifty Years of Open Source Software Supply Chain Security. Subtitled "For decades, software reuse was only a lofty goal. Now it's very real.", Russ's article goes on to outline the history and original goals of software supply-chain security in the US military in the early 1970s, all the way to the XZ Utils backdoor of 2024. Through that lens, Russ explores the problem and how it has changed, and hasn't changed, over time. He concludes as follows:
We are all struggling with a massive shift that has happened in the past 10 or 20 years in the software industry. For decades, software reuse was only a lofty goal. Now it's very real. Modern programming environments such as Go, Node and Rust have made it trivial to reuse work by others, but our instincts about responsible behaviors have not yet adapted to this new reality. We all have more work to do.

4th CHAINS Software Supply Chain Workshop Convened as part of the CHAINS research project at the KTH Royal Institute of Technology in Stockholm, Sweden, the 4th CHAINS Software Supply Chain Workshop occurred during April, featuring a number of relevant talks. The full listing of the agenda is available on the workshop's website.

Mailing list updates On our mailing list this month:
  • Luca DiMaio of Chainguard posted to the list reporting that they had successfully implemented reproducible filesystem images with both ext4 and an EFI system partition. They go on to list the various methods, and the thread generated at least fifteen replies.
  • David Wheeler announced that the OpenSSF is building a glossary of sorts in order that they consistently use the same meaning for the same term and, moreover, that they have drafted a definition for "reproducible build". The thread generated a significant number of replies on the definition, leading to a potential update to the Reproducible Builds project's own definition.
  • Lastly, kpcyrd posted to the list with a timely reminder and update on their repro-env tool. As first reported in our July 2023 report, kpcyrd mentions that:
    "My initial interest in reproducible builds was 'how do I distribute pre-compiled binaries on GitHub without people raising security concerns about them'. I've cycled back to this original problem about 5 years later and built a tool that is meant to address this." [ ]

Canonicalization for Unreproducible Builds in Java Aman Sharma, Benoit Baudry and Martin Monperrus have published a new scholarly study related to reproducible builds within Java. Titled "Canonicalization for Unreproducible Builds in Java", the article's abstract is as follows:
[ ] Achieving reproducibility at scale remains difficult, especially in Java, due to a range of non-deterministic factors and caveats in the build process. In this work, we focus on reproducibility in Java-based software, archetypal of enterprise applications. We introduce a conceptual framework for reproducible builds, we analyze a large dataset from Reproducible Central and we develop a novel taxonomy of six root causes of unreproducibility. We study actionable mitigations: artifact and bytecode canonicalization using OSS-Rebuild and jNorm respectively. Finally, we present Chains-Rebuild, a tool that raises reproducibility success from 9.48% to 26.89% on 12,283 unreproducible artifacts. To sum up, our contributions are the first large-scale taxonomy of build unreproducibility causes in Java, a publicly available dataset of unreproducible builds, and Chains-Rebuild, a canonicalization tool for mitigating unreproducible builds in Java.
A full PDF of their article is available from arXiv.

OSS Rebuild adds new TUI features OSS Rebuild aims to automate rebuilding upstream language packages (e.g. from the PyPI, crates.io and npm registries) and to publish signed attestations and build definitions for public use. OSS Rebuild ships a text-based user interface (TUI) for viewing, launching, and debugging rebuilds. While the TUI previously required ownership of a full instance of OSS Rebuild's hosted infrastructure, it now supports a fully local mode of build execution and artifact storage. Thanks to Giacomo Benedetti for his usage feedback and work to extend the local-only development toolkit. Another feature added to the TUI was an experimental chatbot integration that provides interactive feedback on rebuild failure root causes and suggests fixes.

Distribution roundup In Debian this month:
  • Roland Clobus posted another status report on reproducible ISO images on our mailing list this month, with the summary that "all live images build reproducibly from the online Debian archive".
  • Debian developer Simon Josefsson published another two reproducibility-related blog posts this month, the first on the topic of Verified Reproducible Tarballs. Simon sardonically challenges the reader as follows: "Do you want a supply-chain challenge for the Easter weekend? Pick some well-known software and try to re-create the official release tarballs from the corresponding Git checkout. Is anyone able to reproduce anything these days?" After that, they also published a blog post on Building Debian in a GitLab Pipeline using their multi-stage rebuild approach.
  • Roland also posted to our mailing list to highlight that there is now another tool in Debian that generates reproducible output, equivs. This is a tool to create trivial Debian packages that might Depend on other packages. As Roland writes, "building the [equivs] package has been reproducible for a while, [but] now the output of the [tool] has become reproducible as well".
  • Lastly, 9 reviews of Debian packages were added, 10 were updated and 10 were removed this month, adding to our extensive knowledge about identified issues.
The IzzyOnDroid Android APK repository made more progress in April. Thanks to funding by NLnet and Mobifree, the project was also able to put more time into their tooling. For instance, developers can now easily run their own verification builder in "less than 5 minutes". This currently supports Debian-based systems, but support for RPM-based systems is incoming.
  • The rbuilder_setup tool can now set up the entire framework in less than five minutes. The process is configurable, too, so everything from just the basics needed to verify builds up to a fully-fledged RB environment is possible.
  • This tool works on Debian, RedHat and Arch Linux, as well as their derivatives. The project has received successful reports from Debian, Ubuntu, Fedora and some Arch Linux derivatives so far.
  • Documentation on how to work with reproducible builds (making apps reproducible, debugging unreproducible packages, etc) is available in the project s wiki page.
  • Future work is also in the pipeline, including documentation, guidelines and helpers for debugging.
NixOS defined an Outreachy project for improving build reproducibility. In the application phase, NixOS saw some strong candidates providing contributions, both on the NixOS side and upstream: guider-le-ecit analyzed a libpinyin issue. Tessy James fixed an issue in arandr and helped analyze one in libvlc that led to a proposed upstream fix. Finally, 3pleX fixed an issue which was accepted in upstream kitty, one in upstream maturin, one in upstream python-sip and one in the Nix packaging of python-libbytesize. Sadly, the funding for this internship fell through, so NixOS were forced to abandon their search. Lastly, in openSUSE news, Bernhard M. Wiedemann posted another monthly update for their work there.

diffoscope & strip-nondeterminism diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading a number of versions to Debian:
  • Use the --walk argument over the potentially dangerous alternative --scan when calling out to zipdetails(1). [ ]
  • Correct a longstanding issue where many >-based version tests used in conditional fixtures were broken. This was used to ensure that specific tests were only run when the version on the system was newer than a particular number. Thanks to Colin Watson for the report (Debian bug #1102658) [ ]
  • Address a long-hidden issue in the test_versions testsuite as well, where we weren't actually testing the greater-than comparisons mentioned above, as they were masked by the tests for equality. [ ]
  • Update copyright years. [ ]
In strip-nondeterminism, however, Holger Levsen updated the Continuous Integration (CI) configuration in order to use the standard Debian pipelines via debian/salsa-ci.yml instead of using .gitlab-ci.yml. [ ]

Website updates Once again, there were a number of improvements made to our website this month, including:
  • Aman Sharma added OSS-Rebuild's stabilize tool to the Tools page. [ ][ ]
  • Chris Lamb added a configure.ac (GNU Autotools) example for using SOURCE_DATE_EPOCH [ ]; a minimal sketch of the underlying idea follows after this list. Chris also updated the SOURCE_DATE_EPOCH snippet and moved the archive metadata to a more suitable location. [ ]
  • Denis Carikli added GNU Boot to our ever-evolving Projects page.
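As a companion to that configure.ac example, here is a minimal C sketch (my own illustration, following the general reproducible-builds.org SOURCE_DATE_EPOCH convention rather than any specific project's code) of how a build-time generator can embed a timestamp without breaking reproducibility: honor SOURCE_DATE_EPOCH when it is set, and format in UTC so the output is independent of the build machine's clock and timezone.

/* Hedged sketch: a tiny build-time generator that honors SOURCE_DATE_EPOCH,
 * emitting a generated header instead of relying on __DATE__/__TIME__. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    /* Prefer the clamped timestamp if the environment provides one. */
    const char *sde = getenv("SOURCE_DATE_EPOCH");
    time_t t = (sde != NULL) ? (time_t)strtoll(sde, NULL, 10) : time(NULL);

    /* Format in UTC so the result does not depend on the host timezone. */
    struct tm tm;
    if (gmtime_r(&t, &tm) == NULL)
        return 1;

    char buf[32];
    strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", &tm);
    printf("#define BUILD_DATE \"%s\"\n", buf);
    return 0;
}

Run with SOURCE_DATE_EPOCH=0, this always emits #define BUILD_DATE "1970-01-01T00:00:00Z", regardless of when or where the build happens.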

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In April, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add armel.reproduce.debian.net to support the armel architecture. [ ][ ]
    • Add a new ARM node, codethink05. [ ][ ]
    • Add ppc64el.reproduce.debian.net to support testing of the ppc64el architecture. [ ][ ][ ]
    • Improve the reproduce.debian.net front page. [ ][ ]
    • Make various changes to the ppc64el nodes. [ ][ ][ ][ ]
    • Make various changes to the arm64 and armhf nodes. [ ][ ][ ][ ]
    • Various changes related to the rebuilderd-worker entry point. [ ][ ][ ]
    • Create and deploy a pkgsync script. [ ][ ][ ][ ][ ][ ][ ][ ]
    • Fix the monitoring of the riscv64 architecture. [ ][ ]
    • Make a number of changes related to starting the rebuilderd service. [ ][ ][ ][ ]
  • Backup-related:
    • Backup the rebuilder databases every week. [ ][ ][ ][ ]
    • Improve the node health checks. [ ][ ]
  • Misc:
    • Re-use existing connections to the SSH proxy node on the riscv64 nodes. [ ][ ]
    • Node maintenance. [ ][ ][ ]
In addition:
  • Jochen Sprickerhof fixed the riscv64 host names [ ] and requested access to all the rebuilderd nodes [ ].
  • Mattia Rizzolo updated the self-serve rebuild scheduling tool, replacing the deprecated "SSO"-style authentication with OpenIDC, which authenticates against salsa.debian.org. [ ][ ][ ]
  • Roland Clobus updated the configuration for the osuosl3 node to designate 4 workers for bigger builds. [ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate, and this month we wrote a large number of such patches.

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us directly.
