Search Results: "ed"

27 June 2025

Reproducible Builds (diffoscope): diffoscope 300 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 300. This version includes the following changes:
[ "Alex" ]
* Fix a regression and add a test so that diffoscope picks up differences
  in metadata for identical files again. (Closes: reproducible-builds/diffoscope#411)
You find out more by visiting the project homepage.

26 June 2025

Bits from Debian: AMD Platinum Sponsor of DebConf25

amd-logo We are pleased to announce that AMD has committed to sponsor DebConf25 as a Platinum Sponsor. The AMD ROCm platform includes programming models, tools, compilers, libraries, and runtimes for AI and HPC solution development on AMD GPUs. Debian is an officially supported platform for AMD ROCm and a growing number of components are now included directly in the Debian distribution. For more than 55 years AMD has driven innovation in high-performance computing, graphics and visualization technologies. AMD is deeply committed to supporting and contributing to open-source projects, foundations, and open-standards organizations, taking pride in fostering innovation and collaboration within the open-source community. With this commitment as Platinum Sponsor, AMD is contributing to the annual Debian Developers Conference, directly supporting the progress of Debian and Free Software. AMD contributes to strengthening the worldwide community that collaborates on Debian projects year-round. Thank you very much, AMD, for your support of DebConf25! Become a sponsor too! DebConf25 will take place from 14 to 20 July 2025 in Brest, France, and will be preceded by DebCamp, from 7 to 13 July 2025. DebConf25 is accepting sponsors! Interested companies and organizations may contact the DebConf team through sponsors@debconf.org, and visit the DebConf25 website at https://debconf25.debconf.org/sponsors /become-a-sponsor/.

25 June 2025

Tollef Fog Heen: Pronoun support in userdir-ldap

Debian uses LDAP for storing information about users, hosts and other objects. The wrapping around this is called userdir-ldap, or ud-ldap for short. It provides a mail gateway, web UI and a couple of schemas for different object types. Back in late 2018 and early 2019, we (DSA) removed support for ISO5218 in userdir-ldap, and removed the corresponding data. This made some people upset, since they were using that information, as imprecise as it was, to infer people s pronouns. ISO5218 has four values for sex, unknown, male, female and N/A. This might have been acceptable when the standard was new (in 1976), but it wasn t acceptable any longer in 2018. A couple of days ago, I finally got around to adding support to userdir-ldap to let people specify their pronouns. As it should be, it s a free-form text field. (We don t have localised fields in LDAP, so it probably makes sense for people to put the English version of their pronouns there, but the software does not try to control that.) So far, it s only exposed through the LDAP gateway, not in the web UI. If you re a Debian developer, you can set your pronouns using
echo "pronouns: he/him"   gpg --clearsign   mail changes@db.debian.org
I see that four people have already done so in the time I ve taken to write this post.

24 June 2025

Evgeni Golov: Using LXCFS together with Podman

JP was puzzled that using podman run --memory=2G would not result in the 2G limit being visible inside the container. While we were able to identify this as a visualization problem tools like free(1) only look at /proc/meminfo and that is not virtualized inside a container, you'd have to look at /sys/fs/cgroup/memory.max and friends instead I couldn't leave it at that. And then I remembered there is actually something that can provide a virtual (cgroup-aware) /proc for containers: LXCFS! But does it work with Podman?! I always used it with LXC, but there is technically no reason why it wouldn't work with a different container solution cgroups are cgroups after all. As we all know: there is only one way to find out! Take a fresh Debian 12 VM, install podman and verify things behave as expected:
user@debian12:~$ podman run -ti --rm --memory=2G centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        6067396 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648
And after installing (and starting) lxcfs, we can use the virtual /proc/meminfo it generates by bind-mounting it into the container (LXC does that part automatically for us):
user@debian12:~$ podman run -ti --rm --memory=2G --mount=type=bind,source=/var/lib/lxcfs/proc/meminfo,destination=/proc/meminfo centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        2097152 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648
The same of course works with all the other proc entries lxcfs provides (cpuinfo, diskstats, loadavg, meminfo, slabinfo, stat, swaps, and uptime here), just bind-mount them. And yes, free(1) now works too!
bash-5.1# free -m
               total        used        free      shared  buff/cache   available
Mem:            2048           3        1976           0          67        2044
Swap:              0           0           0
Just don't blindly mount the whole /var/lib/lxcfs/proc over the container's /proc. It did work (as in: "bash and free didn't crash") for me, but with /proc/$PID etc missing, I bet things will go south pretty quickly.

Dirk Eddelbuettel: RcppRedis 0.2.6 on CRAN: Extensions

A new minor release 0.2.6 of our RcppRedis package arrived on CRAN today. RcppRedis is one of several packages connecting R to the fabulous Redis in-memory datastructure store (and much more). It works equally well with the newer fork Valkey. RcppRedis does not pretend to be feature complete, but it may do some things faster than the other interfaces, and also offers an optional coupling with MessagePack binary (de)serialization via RcppMsgPack. The package has been deployed in production as a risk / monitoring tool on a trading floor for several years. It also supports pub/sub dissemination of streaming market data as per this earlier example. This update brings new functions del, lrem, and lmove (for the matching Redis / Valkey commands) which may be helpful in using Redis (or Valkey) as a job queue. We also extended the publish accessor by supporting text (i.e. string) mode along with raw or rds (the prior default which always serialized R objects) just how listen already worked with these three cases. The change makes it possible to publish from R to subscribers not running R as they cannot rely on the R deserealizer. An example is provided by almm, a live market monitor, which we introduced in this blog post. Apart from that the continuous integration script received another mechanical update. The detailed changes list follows.

Changes in version 0.2.6 (2025-06-24)
  • The commands DEL, LREM and LMOVE have been added
  • The continuous integration setup was updated once more
  • The pub/sub publisher now supports a type argument similar to the listener, this allows string message publishing for non-R subscribers

Courtesy of my CRANberries, there is also a diffstat report for this this release. More information is on the RcppRedis page and at the repository and its issue tracker.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Uwe Kleine-K nig: Temperature and humitidy sensor on OpenWrt

I have a SHT3x humidity and temperature sensor connected to the i2c bus of my Turris Omnia that runs OpenWrt. To make it produce nice graphs shown in the webif I installed the packages collectd-mod-sensors, luci-app-statistics and kmod-hwmon-sht3x. To make the sht3x driver bind to the device I added
echo 'sht3x 0x44' > /sys/bus/i2c/devices/0-0070/channel-6/new_device
to /etc/rc.local. After that I only had to enable the Sensors plugin below Statistics -> Setup -> General plugins and check 'Monitor all except specified in its "Configure" dialog.

Matthew Garrett: Why is there no consistent single signon API flow?

Single signon is a pretty vital part of modern enterprise security. You have users who need access to a bewildering array of services, and you want to be able to avoid the fallout of one of those services being compromised and your users having to change their passwords everywhere (because they're clearly going to be using the same password everywhere), or you want to be able to enforce some reasonable MFA policy without needing to configure it in 300 different places, or you want to be able to disable all user access in one place when someone leaves the company, or, well, all of the above. There's any number of providers for this, ranging from it being integrated with a more general app service platform (eg, Microsoft or Google) or a third party vendor (Okta, Ping, any number of bizarre companies). And, in general, they'll offer a straightforward mechanism to either issue OIDC tokens or manage SAML login flows, requiring users present whatever set of authentication mechanisms you've configured.

This is largely optimised for web authentication, which doesn't seem like a huge deal - if I'm logging into Workday then being bounced to another site for auth seems entirely reasonable. The problem is when you're trying to gate access to a non-web app, at which point consistency in login flow is usually achieved by spawning a browser and somehow managing submitting the result back to the remote server. And this makes some degree of sense - browsers are where webauthn token support tends to live, and it also ensures the user always has the same experience.

But it works poorly for CLI-based setups. There's basically two options - you can use the device code authorisation flow, where you perform authentication on what is nominally a separate machine to the one requesting it (but in this case is actually the same) and as a result end up with a straightforward mechanism to have your users socially engineered into giving Johnny Badman a valid auth token despite webauthn nominally being unphisable (as described years ago), or you reduce that risk somewhat by spawning a local server and POSTing the token back to it - which works locally but doesn't work well if you're dealing with trying to auth on a remote device. The user experience for both scenarios sucks, and it reduces a bunch of the worthwhile security properties that modern MFA supposedly gives us.

There's a third approach, which is in some ways the obviously good approach and in other ways is obviously a screaming nightmare. All the browser is doing is sending a bunch of requests to a remote service and handling the response locally. Why don't we just do the same? Okta, for instance, has an API for auth. We just need to submit the username and password to that and see what answer comes back. This is great until you enable any kind of MFA, at which point the additional authz step is something that's only supported via the browser. And basically everyone else is the same.

Of course, when we say "That's only supported via the browser", the browser is still just running some code of some form and we can figure out what it's doing and do the same. Which is how you end up scraping constants out of Javascript embedded in the API response in order to submit that data back in the appropriate way. This is all possible but it's incredibly annoying and fragile - the contract with the identity provider is that a browser is pointed at a URL, not that any of the internal implementation remains consistent.

I've done this. I've implemented code to scrape an identity provider's auth responses to extract the webauthn challenges and feed those to a local security token without using a browser. I've also written support for forwarding those challenges over the SSH agent protocol to make this work with remote systems that aren't running a GUI. This week I'm working on doing the same again, because every identity provider does all of this differently.

There's no fundamental reason all of this needs to be custom. It could be a straightforward "POST username and password, receive list of UUIDs describing MFA mechanisms, define how those MFA mechanisms work". That even gives space for custom auth factors (I'm looking at you, Okta Fastpass). But instead I'm left scraping JSON blobs out of Javascript and hoping nobody renames a field, even though I only care about extremely standard MFA mechanisms that shouldn't differ across different identity providers.

Someone, please, write a spec for this. Please don't make it be me.

comment count unavailable comments

23 June 2025

Gunnar Wolf: Private key management Oh, the humanity...

If we ever thought a couple of years or decades of constant use would get humankind to understand how an asymetric key pair is to be handled It s time we moved back to square one. I had to do an online tramit with the Mexican federal government to get a statement certifying I successfully finished my studies, and I found this jewel of user interface: E.firma So I have to:
  1. Submit the asymetric key I use for tax purposes, as that s the ID the government has registered for me. OK, I didn t expect it to be used for this purpose as well, but I ll accept it. Of course, in our tax system many people don t require having a public key generated ( easier regimes are authenticated by password only), but all professionals with a c dula profesional (everybody getting a unviersitary title) is now compelled to do this step.
  2. Not only I have to submit my certificate (public key) But also the private part (and, of course, the password that secures it). I understand I m interacting with a Javascript thingie that runs only client-side, and I trust it is not shipping my private key to their servers. But given it is an opaque script, I have no assurance about it. And, of course, this irks me because I am who I am and because I ve spent several years thinking about cryptography. But for regular people, it just looks as a stupid inconvenience: they have to upload two weird files with odd names and provide a password. What for?
This is beyond stupid. I m baffled. (of course, I did it, because I need the fsckin document. Oh, and of course, I paid my MX$1770, 80, for it which does not make me too happy for a tramit that s not even shuffling papers, only storing the right bits in the right corner of the right datacenter, but anyhow )

Russell Coker: PFAs

For some time I ve been noticing news reports about PFAs [1]. I hadn t thought much about that issue, I grew up when leaded petrol was standard, when almost all thermometers had mercury, when all small batteries had mercury, and I had generally considered that I had already had so many nasty chemicals in my body that as long as I don t eat bottom feeding seafood often I didn t have much to worry about. I already had a higher risk of a large number of medical issues than I d like due to decisions made before I was born and there s not much to do about it given that there are regulations restricting the emissions of lead, mercury etc. I just watched a Veritasium video about Teflon and the PFA poisoning related to it s production [2]. This made me realise that it s more of a problem than I realised and it s a problem that s getting worse. PFA levels in the parts-per-trillion range in the environment can cause parts-per-billion in the body which increases the risks of several cancers and causes other health problems. Fortunately there is some work being done on water filtering, you can get filters for a home level now and they are working on filters that can work at a sufficient scale for a city water plant. There is a map showing PFAs in the environment in Australia which shows some sites with concerning levels that are near residential areas [3]. One of the major causes for that in Australia is fire retardant foam Australia has never had much if any Teflon manufacturing AFAIK. Also they noted that donating blood regularly can decrease levels of PFAs in the bloodstream. So presumably people who have medical conditions that require receiving donated blood regularly will have really high levels.

22 June 2025

Iustin Pop: Coding, as we knew it, has forever changed

Back when I was terribly na ve When I was younger, and definitely na ve, I was so looking forward to AI, which will help us write lots of good, reliable code faster. Well, principally me, not thinking what impact it will have industry-wide. Other more general concerns, like societal issues, role of humans in the future and so on were totally not on my radar. At the same time, I didn t expect this will actually happen. Even years later, things didn t change dramatically. Even the first release of ChatGPT a few years back didn t click for me, as the limitations were still significant.

Hints of serious change The first hint of the change, for me, was when a few months ago (yes, behind the curve), I asked ChatGPT to re-explain a concept to me, and it just wrote a lot of words, but without a clear explanation. On a whim, I asked Grok then recently launched, I think to do the same. And for the first time, the explanation clicked and I felt I could have a conversation with it. Of course, now I forgot again that theoretical CS concept, but the first step was done: I can ask an LLM to explain something, and it will, and I can have a back and forth logical discussion, even if on some theoretical concept. Additionally, I learned that not all LLMs are the same, and that means there s real competition and that leap frogging is possible. Another topic on which I tried to adopt early and failed to get mileage out of it, was GitHub Copilot (in VSC). I tried, it helped, but didn t feel any speed-up at all. Then more recently, in May, I asked Grok what s the state of the art in AI-assisted coding. It said either Claude in a browser tab, or in VSC via continue.dev extension. The continue.dev extension/tooling is a bit of a strange/interesting thing. It seems to want to be a middle-man between the user and actual LLM services, i.e. you pay a subscription to continue.dev, not to Anthropic itself, and they manage the keys/APIs, for whatever backend LLMs you want to use. The integration with Visual Studio Code is very nice, but I don t know if long-term their business model will make sense. Well, not my problem.

Claude: reverse engineering my old code and teaching new concepts So I installed the latter and subscribed, thinking 20 CHF for a month is good for testing. I skipped the tutorial model/assistant, created a new one from scratch, just enabled Claude 3.7 Sonnet, and started using it. And then, my mind was blown-not just by the LLM, but by the ecosystem. As said, I ve used GitHub copilot before, but it didn t seem effective. I don t know if a threshold has been reached, or Claude (3.7 at that time) is just better than ChatGPT. I didn t use the AI to write (non-trivial) code for me, at most boilerplate snippets. But I used it both as partner for discussion - I want to do x, what do you think, A or B? , and as a teacher, especially for fronted topics, which I m not familiar with. Since May, in mostly fragmented sessions, I ve achieved more than in the last two years. Migration from old school JS to ECMA modules, a webpacker (reducing bundle size by 50%), replacing an old Javascript library with hand written code using modern APIs, implementing the zoom feature together with all of keyboard, mouse, touchpad and touchscreen support, simplifying layout from manually computed to automatic layout, and finding a bug in webkit for which it also wrote a cool minimal test (cool, as in, way better than I d have ever, ever written, because for me it didn t matter that much). And more. Could I have done all this? Yes, definitely, nothing was especially tricky here. But hours and hours of reading MDN, scouring Stack Overflow and Reddit, and lots of trial and error. So doable, but much more toily. This, to me, feels like cheating. 20 CHF per month to make me 3x more productive is free money well, except that I don t make money on my code which is written basically for myself. However, I don t get stuck anymore searching hours in the web for guidance, I ask my question, and I get at least direction if not answer, and I m finished way earlier. I can now actually juggle more hobbies, in the same amount of time, if my personal code takes less time or differently said, if I m more efficient at it. Not all is roses, of course. Once, it did write code with such an endearing error that it made me laugh. It was so blatantly obvious that you shouldn t keep other state in the array that holds pointer status because that confuses the calculation of how many pointers are down , probably to itself too if I d have asked. But I didn t, since it felt a bit embarassing to point out such a dumb mistake. Yes, I m anthropomorphising again, because this is the easiest way to deal with things. In general, it does an OK-to-good-to-sometimes-awesome job, and the best thing is that it summarises documentation and all of Reddit and Stack Overflow. And gives links to those. Now, I have no idea yet what this means for the job of a software engineer. If on open source code, my own code, it makes me 3x faster reverse engineering my code from 10 years ago is no small feat for working on large codebases, it should do at least the same, if not more. As an example of how open-ended the assistance can be, at one point, I started implementing a new feature threading a new attribute to a large number of call points. This is not complex at all, just add a new field to a Haskell record, and modifying everything to take it into account, populate it, merge it when merging the data structures, etc. The code is not complex, tending toward boilerplate a bit, and I was wondering on a few possible choices for implementation, so, with just a few lines of code written that were not even compiling, I asked I want to add a new feature, should I do A or B if I want it to behave like this , and the answer was something along the lines of I see you want to add the specific feature I was working on, but the implementation is incomplete, you still need to to X, Y and Z . My mind was blown at this point, as I thought, if the code doesn t compile, surely the computer won t be able to parse it, but this is not a program, this is an LLM, so of course it could read it kind of as a human would. Again, the code complexity is not great, but the fact that it was able to read a half-written patch, understand what I was working towards, and reason about, was mind-blowing, and scary. Like always.

Non-code writing Now, after all this, while writing a recent blog post, I thought this is going to be public anyway, so let me ask Claude what it thinks about it. And I was very surprised, again: gone was all the pain of rereading three times my post to catch typos (easy) or phrasing structure issues. It gave me very clearly points, and helped me cut 30-40% of the total time. So not only coding, but word smithing too is changed. If I were an author, I d be delighted (and scared). Here is the overall reply it gave me:
  • Spelling and grammar fixes, all of them on point except one mistake (I claimed I didn t capitalize one word, but I did). To the level of a good grammar checker.
  • Flow Suggestions, which was way beyond normal spelling and grammar. It felt like a teacher telling me to do better in my writing, i.e. nitpicking on things that actually were true even if they d still work. I.e. lousy phrase structure, still understandable, but lousy nevertheless.
  • Other notes: an overall summary. This was mostly just praising my post . I wish LLMs were not so focused on praise the user .
So yeah, this speeds me up to about 2x on writing blog posts, too. It definitely feels not fair.

Wither the future? After all this, I m a bit flabbergasted. Gone are the 2000 s with code without unittests, gone are the 2010 s without CI/CD, and now, mid-2020 s, gone is the lone programmer that scours the internet to learn new things, alone? What this all means for our skills in software development, I have no idea, except I know things have irreversibly changed (a butlerian jihad aside). Do I learn better with a dedicated tutor even if I don t fight with the problem for so long? Or is struggling in finding good docs the main method of learning? I don t know yet. I feel like I understand the topics I m discussing with the AI, but who knows in reality what it will mean long term in terms of stickiness of learning. For the better, or for worse, things have changed. After all the advances over the last five centuries in mechanical sciences, it has now come to some aspects of the intellectual work. Maybe this is the answer to the ever-growing complexity of tech stacks? I.e. a return of the lone programmer that builds things end-to-end, but with AI taming the complexity added in the last 25 years? I can dream, of course, but this also means that the industry overall will increase in complexity even more, because large companies tend to do that, so maybe a net effect of not much One thing I did learn so far is that my expectation that AI (at this level) will only help junior/beginner people, i.e. it would flatten the skills band, is not true. I think AI can speed up at least the middle band, likely the middle top band, I don t know about the 10x programmers (I m not one of them). So, my question about AI now is how to best use it, not to lament how all my learning (90% self learning, to be clear) is obsolete. No, it isn t. AI helps me start and finish one migration (that I delayed for ages), then start the second, in the same day. At the end of this a bit rambling reflection on the past month and a half, I still have many questions about AI and humanity. But one has been answered: yes, AI , quotes or no quotes, already has changed this field (producing software), and we ve not seen the end of it, for sure.

Sahil Dhiman: Case of (broken) maharashtra.gov.in Authoritative Name Servers

Maharashtra is a state here in India, which has Mumbai, the financial capital of India, as its capital. maharashtra.gov.in is the official website of the State Government of Maharashtra. We re going to talk about authoritative name servers serving it (and bunch of child zones under maharashtra.gov.in). Here s a simple trace for the main domain:
$ dig +trace maharashtra.gov.in
; <<>> DiG 9.18.33-1~deb12u2-Debian <<>> +trace maharashtra.gov.in
;; global options: +cmd
.            33128    IN    NS    j.root-servers.net.
.            33128    IN    NS    h.root-servers.net.
.            33128    IN    NS    l.root-servers.net.
.            33128    IN    NS    k.root-servers.net.
.            33128    IN    NS    i.root-servers.net.
.            33128    IN    NS    g.root-servers.net.
.            33128    IN    NS    f.root-servers.net.
.            33128    IN    NS    e.root-servers.net.
.            33128    IN    NS    b.root-servers.net.
.            33128    IN    NS    d.root-servers.net.
.            33128    IN    NS    c.root-servers.net.
.            33128    IN    NS    m.root-servers.net.
.            33128    IN    NS    a.root-servers.net.
.            33128    IN    RRSIG    NS 8 0 518400 20250704050000 20250621040000 53148 . pGxGZftwj+6VNTSQtstTKVN95Z7/b5Q8GSjRCXI68GoVYbVai9HNelxs OGIRKL4YmSrsiSsndXuEsBuvL9QvQ+qbybNLkekJUAiicKYNgr3KM3+X 69rsS9KxHgT2T8/oqG8KN8EJLJ8VkuM2PJ2HfSKijtF7ULtgBbERNQ4i u2I/wQ7elOyeF2M76iEOa7UGhgiBHSBqPulsbpnB//WbKL71yyFhWSk0 tiFEPuZM+iLrN2qBsElriF4kkw37uRHq8sSGcCjfBVdkpbb3/Sb3sIgN /zKU17f+hOvuBQTDr5qFIymqGAENA5UZ2RQjikk6+zK5EfBUXNpq1+oo 2y64DQ==
;; Received 525 bytes from 9.9.9.9#53(9.9.9.9) in 3 ms
in.            172800    IN    NS    ns01.trs-dns.com.
in.            172800    IN    NS    ns01.trs-dns.net.
in.            172800    IN    NS    ns10.trs-dns.org.
in.            172800    IN    NS    ns10.trs-dns.info.
in.            86400    IN    DS    48140 8 2 5EE4748C2069B99C98BC39A56881A64AF17CC78711E6297D43AC5A4F 4B5BB6E5
in.            86400    IN    RRSIG    DS 8 1 86400 20250704050000 20250621040000 53148 . jkCotYosapreoKKPvr9zPOEDECYVe9OtJLjkQbFfTin8uYbm/kdWzieW CkN5sabif5IHTFU4FEVOShfu4DFeUolhNav56TPKjGqEGjQ7qCghpqTj dNN4iY2s8BcJ2ujHwhm6HRfdbQRVoKYQ73UUZ+oWSute6lXWHE9+Snk2 1ZCAYPdZ2s1s7NZhrZW2YXVw/nHIcRl/rHqWIQ9sgUlsd6MwmahcAAG+ v15HG9Q48rCG1A2gJlJPbxWpVe0EUEu8LzDsp+ORqy1pHhzgJynrJHJz qMiYU0egv2j7xVPSoQHXjx3PG2rsOLNnqDBYCA+piEXOLsY3d+7c1SZl w9u66g==
;; Received 679 bytes from 199.7.83.42#53(l.root-servers.net) in 3 ms
maharashtra.gov.in.    900    IN    NS    ns8.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns9.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns10.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns18.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns20.maharashtra.gov.in.
npk19skvsdmju264d4ono0khqf7eafqv.gov.in. 300 IN    NSEC3 1 1 0 - P0KKR4BMBGLJDOKBGBI0KDM39DSM0EA4 NS SOA MX TXT RRSIG DNSKEY NSEC3PARAM
npk19skvsdmju264d4ono0khqf7eafqv.gov.in. 300 IN    RRSIG NSEC3 8 3 300 20250626140337 20250528184339 48544 gov.in. Khcq3n1Jn34HvuBEZExusVqoduEMH6DzqkWHk9dFkM+q0RVBYBHBbW+u LsSnc2/Rqc3HAYutk3EZeS+kXVF07GA/A486dr17Hqf3lHszvG/MNT/s CJfcdrqO0Q8NZ9NQxvAwWo44bCPaECQV+fhznmIaVSgbw7de9xC6RxWG ZFcsPYwYt07yB5neKa99RlVvJXk4GHX3ISxiSfusCNOuEKGy5cMxZg04 4PbYsP0AQNiJWALAduq2aNs80FQdWweLhd2swYuZyfsbk1nSXJQcYbTX aONc0VkYFeEJzTscX8/wNbkJeoLP0r/W2ebahvFExl3NYpb7b2rMwGBY omC/QA==
npk19skvsdmju264d4ono0khqf7eafqv.gov.in. 300 IN    RRSIG NSEC3 13 3 300 20250718144138 20250619135610 22437 gov.in. mbj7td3E6YE7kIhYoSlDTZR047TXY3Z60NY0aBwU7obyg5enBQU9j5nl GUxn9zUiwVUzei7v5GIPxXS7XDpk7g==
6bflkoouitlvj011i2mau7ql5pk61sks.gov.in. 300 IN    NSEC3 1 1 0 - 78S0UO5LI1KV1SVMH1889FHUCNC40U6T TXT RRSIG
6bflkoouitlvj011i2mau7ql5pk61sks.gov.in. 300 IN    RRSIG NSEC3 8 3 300 20250626133905 20250528184339 48544 gov.in. M2yPThQpX0sEf4klooQ06h+rLR3e3Q/BqDTSFogyTIuGwjgm6nwate19 jGmgCeWCYL3w/oxsg1z7SfCvDBCXOObH8ftEBOfLe8/AGHAEkWFSu3e0 s09Ccoz8FJiCfBJbbZK5Vf4HWXtBLfBq+ncGCEE24tCQLXaS5cT85BxZ Zne6Y6u8s/WPgo8jybsvlGnL4QhIPlW5UkHDs7cLLQSwlkZs3dwxyHTn EgjNWClhghGXP9nlvOlnDjUkmacEYeq5ItnCQjYPl4uwh9fBJ9CD/8LV K+Tn3+dgqDBek6+2HRzjGs59NzuHX8J9wVFxP7/nd+fUgaSgz+sST80O vrXlHA==
6bflkoouitlvj011i2mau7ql5pk61sks.gov.in. 300 IN    RRSIG NSEC3 13 3 300 20250718141148 20250619135610 22437 gov.in. raWzWsQnPkXYtr2v1SRH/fk2dEAv/K85NH+06pNUwkxPxQk01nS8eYlq BPQ41b26kikg8mNOgr2ULlBpJHb1OQ==
couldn't get address for 'ns18.maharashtra.gov.in': not found
couldn't get address for 'ns20.maharashtra.gov.in': not found
;; Received 1171 bytes from 2620:171:813:1534:8::1#53(ns10.trs-dns.org) in 0 ms
;; communications error to 10.187.202.24#53: timed out
;; communications error to 10.187.202.24#53: timed out
;; communications error to 10.187.202.24#53: timed out
;; communications error to 10.187.202.28#53: timed out
;; communications error to 10.187.203.201#53: timed out
;; no servers could be reached
Quick takeaways: It s a hit or miss for this DNS query resolution.

Looking at in zone data Let s look at NS added in zone itself (with 9.9.9.9):
$ dig ns maharashtra.gov.in
; <<>> DiG 9.18.33-1~deb12u2-Debian <<>> ns maharashtra.gov.in
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 172
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 3
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;maharashtra.gov.in.        IN    NS
;; ANSWER SECTION:
maharashtra.gov.in.    300    IN    NS    ns8.maharashtra.gov.in.
maharashtra.gov.in.    300    IN    NS    ns9.maharashtra.gov.in.
;; ADDITIONAL SECTION:
ns9.maharashtra.gov.in.    300    IN    A    10.187.202.24
ns8.maharashtra.gov.in.    300    IN    A    10.187.202.28
;; Query time: 180 msec
;; SERVER: 9.9.9.9#53(9.9.9.9) (UDP)
;; WHEN: Sat Jun 21 23:00:49 IST 2025
;; MSG SIZE  rcvd: 115
Pay special attention to ADDITIONAL SECTION . Running dig ns9.maharashtra.gov.in and dig ns8.maharashtra.gov.in, return RFC 1918 ie these private addresses. This is coming from zone itself, so in zone A records of NS8 and NS9 point to 10.187.202.28 and 10.187.202.24 respectively. Cloudflare s 1.1.1.1 has a slightly different version:
$ dig ns maharashtra.gov.in @1.1.1.1
; <<>> DiG 9.18.33-1~deb12u2-Debian <<>> ns maharashtra.gov.in @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36005
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;maharashtra.gov.in.        IN    NS
;; ANSWER SECTION:
maharashtra.gov.in.    300    IN    NS    ns8.
maharashtra.gov.in.    300    IN    NS    ns10.maharashtra.gov.in.
maharashtra.gov.in.    300    IN    NS    ns9.
;; Query time: 7 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Sun Jun 22 10:38:30 IST 2025
;; MSG SIZE  rcvd: 100
Interesting response here for sure :D. The reason for difference between response from 1.1.1.1 and 9.9.9.9 is in the next section.

Looking at parent zone gov.in is the parent zone here. Tucows is operator for gov.in as well as .in ccTLD zone:
$ dig ns gov.in +short
ns01.trs-dns.net.
ns01.trs-dns.com.
ns10.trs-dns.org.
ns10.trs-dns.info.
Let s take a look at what parent zone (NS) hold:
$ dig ns maharashtra.gov.in @ns01.trs-dns.net.
; <<>> DiG 9.18.36 <<>> ns maharashtra.gov.in @ns01.trs-dns.net.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56535
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 6
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: f13027aa39632404010000006856fa2a9c97d6bbc973ba4f (good)
;; QUESTION SECTION:
;maharashtra.gov.in.        IN    NS
;; AUTHORITY SECTION:
maharashtra.gov.in.    900    IN    NS    ns8.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns18.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns10.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns9.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns20.maharashtra.gov.in.
;; ADDITIONAL SECTION:
ns20.maharashtra.gov.in. 900    IN    A    52.183.143.210
ns18.maharashtra.gov.in. 900    IN    A    35.154.30.166
ns10.maharashtra.gov.in. 900    IN    A    164.100.128.234
ns9.maharashtra.gov.in.    900    IN    A    103.23.150.89
ns8.maharashtra.gov.in.    900    IN    A    103.23.150.88
;; Query time: 28 msec
;; SERVER: 64.96.2.1#53(ns01.trs-dns.net.) (UDP)
;; WHEN: Sun Jun 22 00:00:02 IST 2025
;; MSG SIZE  rcvd: 248
The ADDITIONAL SECTION gives a completely different picture (different from in zone NSes). Maybe this was how it was supposed to be, but none of the IPs listed for NS10, NS18 and NS20 are responding to any DNS query. Assuming NS8 as 103.23.150.88 and NS9 as 103.23.150.89, checking SOA on each gives following:
$ dig soa maharashtra.gov.in @103.23.150.88 +short
ns8.maharashtra.gov.in. postmaster.maharashtra.gov.in. 2013116777 1200 600 1296000 300
$ dig soa maharashtra.gov.in @103.23.150.89 +short
ns8.maharashtra.gov.in. postmaster.maharashtra.gov.in. 2013116757 1200 600 1296000 300
NS8 (which is marked as primary in SOA) has serial 2013116777 and NS9 is on serial 2013116757, so looks like the sync (IXFR/AXFR) between primary and secondary is broken. That s why NS8 and NS9 are serving different responses, evident from the following:
$ dig ns8.maharashtra.gov.in @103.23.150.88 +short
103.23.150.88
$ dig ns8.maharashtra.gov.in @103.23.150.89 +short
10.187.202.28
$ dig ns9.maharashtra.gov.in @103.23.150.88 +short
103.23.150.89
$ dig ns9.maharashtra.gov.in @103.23.150.89 +short
10.187.202.24
$ dig ns maharashtra.gov.in @103.23.150.88 +short
ns9.
ns8.
ns10.maharashtra.gov.in.
$ dig ns maharashtra.gov.in @103.23.150.89 +short
ns9.maharashtra.gov.in.
ns8.maharashtra.gov.in.
$ dig ns10.maharashtra.gov.in @103.23.150.88 +short
10.187.203.201
$ dig ns10.maharashtra.gov.in @103.23.150.89 +short
# No/empty response ^
This is the reason for difference in 1.1.1.1 and 9.9.9.9 responses in previous section.

To summarize:
  • Primary and secondary NS aren t in sync. Serials aren t matching, while NS8 and NS9 are responding differently for same queries.
  • NSes have A records with private address, not reachable on the internet, so lame servers.
  • Incomplete NS address, not even FQDN in some cases.
  • Difference between NS delegated in parent zone and NS added in actual zone.
  • Name resolution works in very particular order (in my initial trace it failed).
Initially, I thought of citing RFCs, but I don t really think it s even required. 1.1.1.1, 8.8.8.8 and 9.9.9.9 are handling (lame servers, this probelm) well, handing out the A record for the main website, so dig maharashtra.gov.in would mostly pass and that was the reason I started this post with +trace to recurse the complete zone to show the problem. For later reference:
$ dig maharashtra.gov.in @8.8.8.8 +short
103.8.188.109

Email to SOA address I have sent the following email to address listed in SOA:
Subject - maharashtra.gov.in authoritative DNS servers not reachable Hello, I wanted to highlight the confusing state of maharashtra.gov.in authoritative DNS servers. Parent zone list following as name servers for your DNS zone:
  • ns8.maharashtra.gov.in.
  • ns18.maharashtra.gov.in.
  • ns10.maharashtra.gov.in.
  • ns9.maharashtra.gov.in.
  • ns20.maharashtra.gov.in.
Out of these, ns18 and ns20 don t have public A/AAAA records and are thus not reachable. ns10 keeps on shuffling between NO A record and 10.187.203.201 (private, not reachable address). ns8 keeps on shuffling between 103.23.150.88 and 10.187.202.28 (private, not reachable address). ns9 keeps on shuffling between 103.23.150.89 and 10.187.202.24 (private, not reachable address). These are leading to long, broken, or no DNS resolution for the website(s). Can you take a look at the problem? Regards, Sahil
I ll update here if I get a response. Hopefully, they ll listen and fix their problem.

21 June 2025

Ravi Dwivedi: Getting Brunei visa

In December 2024, my friend Badri and I were planning a trip to Southeast Asia. At this point, we were planning to visit Singapore, Malaysia and Vietnam. My Singapore visa had already been approved, and Malaysia was visa-free for us. For Vietnam, we had to apply for an e-visa online. We considered adding Brunei to our itinerary. I saw some videos of the Brunei visa process and got the impression that we needed to go to the Brunei embassy in Kuching, Malaysia in person. However, when I happened to search for Brunei on Organic Maps1, I stumbled upon the Brunei Embassy in Delhi. It seemed to be somewhere in Hauz Khas. As I was going to Delhi to collect my Singapore visa the next day, I figured I d also visit the Brunei Embassy to get information about the visa process. The next day I went to the location displayed by Organic Maps. It was next to the embassy of Madagascar, and a sign on the road divider confirmed that I was at the right place. That said, it actually looked like someone s apartment. I entered and asked for directions to the Brunei embassy, but the people inside did not seem to understand my query. After some back and forth, I realized that the embassy wasn t there. I now searched for the Brunei embassy on the Internet, and this time I got an address in Vasant Vihar. It seemed like the embassy had been moved from Hauz Khas to Vasant Vihar. Going by the timings mentioned on the web page, the embassy was closing in an hour. I took a Metro from Hauz Khas to Vasant Vihar. After deboarding at the Vasant Vihar metro station, I took an auto to reach the embassy. The address listed on the webpage got me into the correct block. However, the embassy was still nowhere to be seen. I asked around, but security guards in that area pointed me to the Burundi embassy instead. After some more looking around, I did end up finding the embassy. I spoke to the security guards at the gate and told them that I would like to know the visa process. They dialled a number and asked that person to tell me the visa process. I spoke to a lady on the phone. She listed the documents required for the visa process and mentioned that the timings for visa application were from 9 o clock to 11 o clock in the morning. She also informed me that the visa fees was 1000. I also asked about the process Badri, who lives far away in Tamil Nadu and cannot report at the embassy physically. She told me that I can submit a visa application on his behalf, along with an authorization letter. Having found the embassy in Delhi was a huge relief. The other plan - going to Kuching, Malaysia - was a bit uncertain, and we didn t know how much time it would take. Getting our passport submitted at an embassy in a foreign country was also not ideal. A few days later, Badri sent me all the documents required for his visa. I went to the embassy and submitted both the applications. The lady who collected our visa submissions asked me for our flight reservations from Delhi to Brunei, whereas ours were (keeping with our itinerary) from Kuala Lampur. She said that she might contact me later if it was required. For reference, here is the list of documents we submitted - I then asked about the procedure to collect the passports and visa results. Usually, embassies will tell you that they will contact you when they have decided on your applications. However, here I was informed that if they don t contact me within 5 days, I can come and collect our passports and visa result between 13:30-14:30 hours on the fifth day. That was strange :) I did visit the embassy to collect our visa results on the fifth day. However, the lady scolded me for not bringing the receipt she gave me. I was afraid that I might have to go all the way back home and bring the receipt to get our passports. The travel date was close, and it would take some time for Badri to receive his passport via courier as well. Fortunately, she gave me our passports (with the visa attached) and asked me to share a scanned copy of the receipt via email after I get home. We were elated that our visas were approved. Now we could focus on booking our flights. If you are going to Brunei, remember to fill their arrival card from the website within 48 hours of your arrival! Thanks to Badri and Contrapunctus for reviewing the draft before publishing the article.

  1. Nowadays, I prefer using Comaps instead of Organic Maps and recommend you do the same. Organic Maps had some issues with its governance and the community issues weren t being addressed.

20 June 2025

Sven Hoexter: Terraform: Validation Condition Cycles

Terraform 1.9 introduced some time ago the capability to reference in an input variable validation condition other variables, not only the one you're validating. What does not work is having two variables which validate each other, e.g.
variable "nat_min_ports"  
  description = "Minimal amount of ports to allocate for 'min_ports_per_vm'"
  default     = 32
  type        = number
  validation  
    condition = (
      var.nat_min_ports >= 32 &&
      var.nat_min_ports <= 32768 &&
      var.nat_min_ports < var.nat_max_ports
    )
    error_message = "Must be between 32 and 32768 and less than 'nat_max_ports'"
   
 
variable "nat_max_ports"  
  description = "Maximal amount of ports to allocate for 'max_ports_per_vm'"
  default     = 16384
  type        = number
  validation  
    condition = (
      var.nat_max_ports >= 64 &&
      var.nat_max_ports <= 65536 &&
      var.nat_max_ports > var.nat_min_ports
    )
    error_message = "Must be between 64 and 65536 and above 'nat_min_ports'"
   
 
That let directly to the following rather opaque error message: Received an error Error: Cycle: module.gcp_project_network.var.nat_max_ports (validation), module.gcp_project_network.var.nat_min_ports (validation) Removed the sort of duplicate check var.nat_max_ports > var.nat_min_ports on nat_max_ports to break the cycle.

Matthew Garrett: My a11y journey

23 years ago I was in a bad place. I'd quit my first attempt at a PhD for various reasons that were, with hindsight, bad, and I was suddenly entirely aimless. I lucked into picking up a sysadmin role back at TCM where I'd spent a summer a year before, but that's not really what I wanted in my life. And then Hanna mentioned that her PhD supervisor was looking for someone familiar with Linux to work on making Dasher, one of the group's research projects, more usable on Linux. I jumped.

The timing was fortuitous. Sun were pumping money and developer effort into accessibility support, and the Inference Group had just received a grant from the Gatsy Foundation that involved working with the ACE Centre to provide additional accessibility support. And I was suddenly hacking on code that was largely ignored by most developers, supporting use cases that were irrelevant to most developers. Being in a relatively green field space sounds refreshing, until you realise that you're catering to actual humans who are potentially going to rely on your software to be able to communicate. That's somewhat focusing.

This was, uh, something of an on the job learning experience. I had to catch up with a lot of new technologies very quickly, but that wasn't the hard bit - what was difficult was realising I had to cater to people who were dealing with use cases that I had no experience of whatsoever. Dasher was extended to allow text entry into applications without needing to cut and paste. We added support for introspection of the current applications UI so menus could be exposed via the Dasher interface, allowing people to fly through menu hierarchies and pop open file dialogs. Text-to-speech was incorporated so people could rapidly enter sentences and have them spoke out loud.

But what sticks with me isn't the tech, or even the opportunities it gave me to meet other people working on the Linux desktop and forge friendships that still exist. It was the cases where I had the opportunity to work with people who could use Dasher as a tool to increase their ability to communicate with the outside world, whose lives were transformed for the better because of what we'd produced. Watching someone use your code and realising that you could write a three line patch that had a significant impact on the speed they could talk to other people is an incomparable experience. It's been decades and in many ways that was the most impact I've ever had as a developer.

I left after a year to work on fruitflies and get my PhD, and my career since then hasn't involved a lot of accessibility work. But it's stuck with me - every improvement in that space is something that has a direct impact on the quality of life of more people than you expect, but is also something that goes almost unrecognised. The people working on accessibility are heroes. They're making all the technology everyone else produces available to people who would otherwise be blocked from it. They deserve recognition, and they deserve a lot more support than they have.

But when we deal with technology, we deal with transitions. A lot of the Linux accessibility support depended on X11 behaviour that is now widely regarded as a set of misfeatures. It's not actually good to be able to inject arbitrary input into an arbitrary window, and it's not good to be able to arbitrarily scrape out its contents. X11 never had a model to permit this for accessibility tooling while blocking it for other code. Wayland does, but suffers from the surrounding infrastructure not being well developed yet. We're seeing that happen now, though - Gnome has been performing a great deal of work in this respect, and KDE is picking that up as well. There isn't a full correspondence between X11-based Linux accessibility support and Wayland, but for many users the Wayland accessibility infrastructure is already better than with X11.

That's going to continue improving, and it'll improve faster with broader support. We've somehow ended up with the bizarre politicisation of Wayland as being some sort of woke thing while X11 represents the Roman Empire or some such bullshit, but the reality is that there is no story for improving accessibility support under X11 and sticking to X11 is going to end up reducing the accessibility of a platform.

When you read anything about Linux accessibility, ask yourself whether you're reading something written by either a user of the accessibility features, or a developer of them. If they're neither, ask yourself why they actually care and what they're doing to make the future better.

comment count unavailable comments

Russell Coker: The Intel Arc B580 and PCIe Slot Size

A few months ago I bought a Intel Arc B580 for the main purpose of getting 8K video going [1]. I had briefly got it working in a test PC but then I wanted to deploy it on my HP z840 that I use as a build server and for playing with ML stuff [2]. I only did brief tests of it previously and this was my first attempt at installing it in a system I use. My plan was to keep the NVidia RTX A2000 in place and run 2 GPUs, that s not an uncommon desire among people who want to do ML stuff and it s the type of thing that the z840 is designed for, the machine has slots 2, 4, and 6 being PCIe*16 so it should be able to fit 3 cards that each take 2 slots. So having one full size GPU, the half-height A2000, and a NVMe controller that uses *16 to run four NVMe devices should be easy. Intel designed the B580 to use every millimeter of space possible while still being able to claim to be a 2 slot card. On the circuit board side there is a plastic cover over the board that takes all the space before the next slot so a 2 slot card can t go on that side without having it s airflow blocked. On the other side it takes all the available space so that any card that wants to blow air through can t fit and also such that a medium size card (such as the card for 4 NVMe devices) would block it s air flow. So it s impossible to have a computer with 6 PCIe slots run the B580 as well as 2 other full size *16 cards. Support for this type of GPU is something vendors like HP should consider when designing workstation class systems. For HP there is no issue of people installing motherboards in random cases (the HP motherboard in question uses proprietary power connectors and won t even boot with an ATX PSU without significant work). So they could easily design a motherboard and case with a few extra mm of space between pairs of PCIe slots. The cards that are double width are almost always *16 so you could pair up a *16 slot and another slot and have extra space on each side of the pair. I think for most people a system with 6 PCIe slots with a bit of extra space for GPU cooling would be more useful than having 7 PCIe slots. But as HP have full design control they don t even need to reduce the number of PCIe slots, they could just make the case taller. If they added another 4 slots and increased the case size accordingly it still wouldn t be particularly tall by the standards of tower cases from the 90s! The z8 series of workstations are the biggest workstations that HP sells so they should design them to do these things. At the time that the z840 was new there was a lot of ML work being done and HP was selling them as ML workstations, they should have known how people would use them and design them accordingly. So I removed the NVidia card and decided to run the system with just the Arc card, things should have been fine but Intel designed the card to be as high as possible and put the power connector on top. This prevented installing the baffle for directing air flow over the PCIe slots and due to the design of the z840 (which is either ingenious or stupid depending on your point of view) the baffle is needed to secure the PCIe cards in place. So now all the PCIe cards are just secured by friction in the slots, this isn t an unusual situation for machines I assemble but it s not something I desired. This is the first time I ve felt compelled to write a blog post reviewing a product before even getting it working. But the physical design of the B580 is outrageously impractical unless you are designing your entire computer around the GPU. As an aside the B580 does look very nice. The plastic surround is very fancy, it s a pity that it interferes with the operation of the rest of the system.

Reproducible Builds (diffoscope): diffoscope 299 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 299. This version includes the following changes:
[ Chris Lamb ]
* Add python3-defusedxml to the Build-Depends in order to include it in the
  Docker image. (Closes: #407)
You find out more by visiting the project homepage.

19 June 2025

Jonathan Carter: My first tag2upload upload

Tag2upload? The tag2upload service has finally gone live for Debian Developers in an open beta. If you ve never heard of tag2upload before, here is a great primer presented by Ian Jackson and prepared by Ian Jackson and Sean Whitton. In short, the world has moved on to hosting and working with source code in Git repositories. In Debian, we work with source packages that are used to generated the binary artifacts that users know as .deb files. In Debian, there is so much tooling and culture built around this. For example, our workflow passes what we call the island test you could take every source package in Debian along with you to an island with no Internet, and you ll still be able to rebuild or modify every package. When changing the workflows, you risk losing benefits like this, and over the years there has been a number of different ideas on how to move to a purely or partially git flow for Debian, none that really managed to gain enough momentum or project-wide support. Tag2upload makes a lot of sense. It doesn t take away any of the benefits of the current way of working (whether technical or social), but it does make some aspects of Debian packages significantly simpler and faster. Even so, if you re a Debian Developer and more familiar with how the sausage have made, you ll have noticed that this has been a very long road for the tag2upload maintainers, they ve hit multiple speed bumps since 2019, but with a lot of patience and communication and persistence from all involved (and almost even a GR), it is finally materializing.

Performing my first tag2upload So, first, I needed to choose which package I want to upload. We re currently in hard freeze for the trixie release, so I ll look for something simple that I can upload to experimental.
I chose bundlewrap, it s quote a straightforward python package, and updates are usually just as straightforward, so it s probably a good package to work on without having to deal with extra complexities in learning how to use tag2upload. So, I do the usual uscan and dch -i to update my package
And then I realise that I still want to build a source package to test it in cowbuilder. Hmm, I remember that Helmut showed me that building a source package isn t necessary in sbuild, but I have a habit of breaking my sbuild configs somehow, but I guess I should revisit that. So, I do a dpkg-buildpackage -S -sa and test it out with cowbuilder, because that s just how I roll (at least for now, fixing my local sbuild setup is yak shaving for another day, let s focus!). I end up with a binary that looks good, so I m satisfied that I can upload this package to the Debian archives. So, time to configure tag2upload. The first step is to set up the webhook in Salsa. I was surprised two find two webhooks already configured:
I know of KGB that posts to IRC, didn t know that this was the mechanism it does that by before. Nice! Also don t know what the tagpending one does, I ll go look into that some other time. Configuring a tag2upload webhook is quite simple, add a URL, call the name tag2upload, and select only tag push events:
I run the test webhook, and it returned a code 400 message about a missing message header, which the documentation says is normal. Next, I install git-debpush from experimental. The wiki page simply states that you can use the git-debpush command to upload, but doesn t give any examples on how to use it, and its manpage doesn t either. And when I run just git-debpush I get:
jonathan@lapcloud:~/devel/debian/python-team/bundlewrap/bundlewrap-4.23.1$ git-debpush
git-debpush: check failed: upstream tag upstream/4.22.0 is not an ancestor of refs/heads/debian/master; probably a mistake ('upstream-nonancestor' check)
pristine-tar is /usr/bin/pristine-tar
git-debpush: some check(s) failed; you can pass --force to ignore them
I have no idea what that s supposed to mean. I was also not sure whether I should tag anything to begin with, or if some part of the tag2upload machinery automatically does it. I think I might have tagged debian/4.23-1 before tagging upstream/4.23 and perhaps it didn t like it, I reverted and did it the other way around and got a new error message. Progress!
jonathan@lapcloud:~/devel/debian/python-team/bundlewrap/bundlewrap-4.23.1$ git-debpush
git-debpush: could not determine the git branch layout
git-debpush: please supply a --quilt= argument
Looking at the manpage, it looks like quilt=baredebian matches my package the best, so I try that:
jonathan@lapcloud:~/devel/debian/python-team/bundlewrap/bundlewrap-4.23.1$ git-debpush --quilt=baredebian
Enumerating objects: 70, done.
Counting objects: 100% (70/70), done.
Delta compression using up to 12 threads
Compressing objects: 100% (37/37), done.
Writing objects: 100% (37/37), 8.97 KiB   2.99 MiB/s, done.
Total 37 (delta 30), reused 0 (delta 0), pack-reused 0 (from 0)
To salsa.debian.org:python-team/packages/bundlewrap.git
6f55d99..3d5498f debian/master -> debian/master
 * [new tag] upstream/4.23.1 -> upstream/4.23.1
 * [new tag] debian/4.23.1-1_exp1 -> debian/4.23.1-1_exp1
Ooh! That looked like it did something! And a minute later I received the notification of the upload in my inbox:
So, I m not 100% sure that this makes things much easier for me than doing a dput, but, it s not any more difficult or more work either (once you know how it works), so I ll be using git-debpush from now on, and I m sure as I get more used to the git workflow of doing things I ll understand more of the benefits. And at last, my one last use case for using FTP is now properly dead. RIP FTP :)

Russell Coker: Matching Intel CPUs

To run a SMP system with multiple CPUs you need to have CPUs that are identical , the question is what does identical mean. In this case I m interested in Intel CPUs because SMP motherboards and server systems for Intel CPUs are readily available and affordable. There are people selling matched pairs of CPUs on ebay which tend to be more expensive than randomly buying 2 of the same CPU model, so if you can identify 2 CPUs that are identical which are sold separately then you can save some money. Also if you own a two CPU system with only one CPU installed then buying a second CPU to match the first is cheaper and easier than buying two more CPUs and removing a perfectly working CPU. e5-2640 v4 cpus
Intel (R) Xeon (R)
E5-2640V4
SR2NZ 2.40GHZ
J717B324 (e4)
7758S4100843
Above is a pic of 2 E5-2640v4 CPUs that were in a SMP system I purchased along with a plain ASCII representation of the text on one of them. The bottom code (starting with 77 ) is apparently the serial number, one of the two codes above it is what determines how identical those CPUs are. The code on the same line as the nominal clock speed (in this case SR2NZ) is the spec number which is sometimes referred to as sspec [1]. The line below the sspec and above the serial number has J717B324 which doesn t have a google hit. I looked at more than 20 pics of E5-2640v4 CPUs on ebay, they all had the code SR2NZ but had different numbers on the line below. I conclude that the number on the line below probably indicates the model AND stepping while SR2NZ just means E5-2640v4 regardless of stepping. As I wasn t able to find another CPU on ebay with the same number on the line below the sspec I believe that it will be unreasonably difficult to get a match for an existing CPU. For the purpose of matching CPUs I believe that if the line above the serial number matches then the CPUs can be used together. I am not certain that CPUs with this number slightly mismatching won t work but I definitely wouldn t want to spend money on CPUs with this number being different.
smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2699A v4 @ 2.40GHz (family: 0x6, model: 0x4f, stepping: 0x1)
When you boot Linux the kernel identifies the CPU in a manner like the above, the combination of family and model seem to map to one spec number. The combination of family, model, and stepping should be all that s required to have them work together. I think that Intel did the wrong thing in not making this clearer. It would have been very easy to print the stepping on the CPU case next to the sspec or the CPU model name. It also wouldn t have been too hard to make the CPU provide the magic number that is apparently the required match for SMP to the OS. Having the Intel web site provide a mapping of those numbers to steppings of CPUs also shouldn t be difficult for them. If anyone knows more about these issues please let me know.

Debian Outreach Team: GSoC 2025 Introduction: Make Debian for Raspberry Pi Build Again

Hello everyone! I am Kurva Prashanth, Interested in the lower level working of system software, CPUs/SoCs and Hardware design. I was introduced to Open Hardware and Embedded Linux while studying electronics and embedded systems as part of robotics coursework. Initially, I did not pay much attention to it and quickly moved on. However, a short talk on Liberating SBCs using Debian by Yuvraj at MiniDebConf India, 2021 caught my interest. The talk focused on Open Hardware platforms such as Olimex and BeagleBone Black, as well as the Debian distributions tailored for these ARM-based single-board computers has intrigued me to delve deeper into the realm of Open Hardware and Embedded Linux. These days I m trying to improve my abilities to contribute to Debian and Linux Kernel development. Before finding out about the Google Summer of Code project, I had already started my journey with Debian. I extensively used Debian system build tools(debootstrap, sbuild, deb-build-pkg, qemu-debootstrap) for Building Debian Image for Bela Cape a real-time OS for music making to achieve extremely fast audio and sensor processing times. In 2023, I had the opportunity to attend DebConf23 in Kochi, India - thanks to Nilesh Patra (@nilesh) and I met Hector Oron (@zumbi) over dinner at DebConf23 and It was nice talking about his contributions/work at Debian on armhf port and Debian System Administration that conversation got me interested in knowing more about Debian ARM, Installer and I found it fascinating that EmDebian was once a external project bringing Debian to embedded systems and now, Debian itself can be run on many embedded systems. And, also during DebCamp I got Introduced to PGP/GPG keys and the web of trust by Carlos Henrique Lima Melara (@charles) I learned how to use and generate GPG keys. After DebConf23 I tried debian packaging and I miserably failed to get sponsorship for a python library I packaged. I came across the Debian project for this year s Google Summer of Code and found the project titled Make Debian for Raspberry Pi Build Again quite interesting to me and applied. Gladly, on May 8th, I received an acceptance e-mail from GSoC. I got excited that I ll spend the summer working on something that I like doing. I am thrilled to be part of this project and I am super excited for the summer of 25. I m looking forward to work on what I most like, new connections and learning opportunities. So, let me talk a bit more about my project. I will be working on to Make Debian for Raspberry Pi SBC s under the guidance of Gunnar Wolf (@gwolf). In this post, I will describe the project I will be working on.

Why make Debian for Raspberry Pi build again? There is an available set of images for running Debian in Raspberry Pi computers (all models below the 5 series)! However, the maintainer severely lacking time to take care for them; called for help for somebody to adopt them, but have not been successful. The image generation scripts might have bitrotted a bit, but it is mostly all done. And there is a lot of interest and use still in having the images freshly generated and decently tested! This GSoC project is about getting the [https://raspi.debian.net/ Raspberry Pi Debian images] site working reliably, daily-built images become automatic again and ideally making it easily deployable to be run in project machines and migrating exsisting hosting infrastructure to Debian.

How much it differ from Debian build process? While the goal is to stay as close as possible to the Debian build process, Raspberry Pi boards require some necessary platform-specific changes primarily in the early boot sequence and firmware handling. Unlike typical Debian systems, Raspberry Pi boards depend on a non-standard bootloader and use non-free firmware (raspi-firmware), Introducing some hardware-specific differences in the initialization process. These differences are largely confined to the early boot and hardware initialization stages. Once the system boots, the userspace remains closely aligned with a typical Debian install, using Debian packages. The current modifications are required due to non-free firmware. However, several areas merit review: but there are a few parts that might be worth changing.
  1. Boot flow: Transitioning to a U-Boot based boot process (as used in Debian installer images for many other SBCs) would reduce divergence and better align with Debian Installer.
  2. Current scripts/workarounds: Some existing hacks may now be redundant with recent upstream support and could be removed.
  3. Board-specific images: Shift to architecture-specific base images with runtime detection could simplify builds and reduce duplication.
Debian already already building SD card images for a wide range of SBCs (e.g., BeagleBone, BananaPi, OLinuXino, Cubieboard, etc.) installer-arm64/images/u-boot and installer-armhf/images/u-boot, a similar approach for Raspberry Pi could improve maintainability and consistency with Debian s broader SBC support.

Quoted from Mail Discussion Thread with Mentor (Gunnar Wolf)
"One direction we wanted to explore was whether we should still be building one image per family, or whether we could instead switch to one image per architecture (armel, armhf, arm64). There were some details to iron out as RPi3 and RPi4 were quite different, but I think it will be similar to the differences between the RPi 0 and 1, which are handled at first-boot time. To understand what differs between families, take a look at Cyril Brulebois generate-recipe (in the repo), which is a great improvement over the ugly mess I had before he contributed it"
In this project, I intend to to build one image per architecture (armel, armhf, arm64) rather than continuing with the current model of building one image per board. This change simplifies image management, reduces redundancy, and leverages dynamic configuration at boot time to support all supported boards within each architecture. By using U-Boot and flash-kernel, we can detect the board type and configure kernel parameters, DTBs, and firmware during the first boot, reducing duplication across images and simplifying the maintenance burden and we can also generalize image creation while still supporting board-specific behavior at runtime. This method aligns with existing practices in the DebianInstaller team and aligns with Debian s long-term maintainability goals and better leverages upstream capabilities, ensuring a consistent and scalable boot experience. To streamline and standardize the process of building bootable Debian images for Raspberry Pi devices, I proposed a new workflow that leverages U-Boot and flash-kernel Debian packages. This provides a clean, maintainable, and reproducible way to generate images for armel, armhf and arm64 boards. The workflow is vmdb2, a lightweight, declarative tool designed to automate the creation of disk images. A typical vmdb2 recipe defines the disk layout, base system installation (via debootstrap), architecture-specific packages, and any custom post-install hooks and the image should includes U-Boot (the u-boot-rpi package), flash-kernel, and a suitable Debian kernel package like linux-image-arm64 or linux-image-armmp. U-Boot serves as the platform s bootloader and is responsible for loading the kernel and initramfs. Unlike Raspberry Pi s non-free firmware/proprietary bootloader, U-Boot provides an open and scriptable interface, allowing us to follow a more standard Debian boot process. It can be configured to boot using either an extlinux.conf or a boot.scr script generated automatically by flash-kernel. The role of flash-kernel is to bridge Debian s kernel installation system with the specifics of embedded bootloaders like U-Boot. When installed, it automatically copies the kernel image, initrd, and device tree blobs (DTBs) to the /boot partition. It also generates the necessary boot.scr script if the board configuration demands it. To work correctly, flash-kernel requires that the target machine be identified via /etc/flash-kernel/machine, which must correspond to an entry in its internal machine database.\ Once the vmdb2 build is complete, the resulting image will contain a fully configured bootable system with all necessary boot components correctly installed. The image can be flashed to an SD card and used to boot on the intended device without additional manual configuration. Because all key packages (U-Boot, kernel, flash-kernel) are managed through Debian s package system, kernel updates and boot script regeneration are handled automatically during system upgrades.

Current Workflow: Builds one Image per family The current vmdb2 recipe uses the Raspberry Pi GPU bootloader provided via the raspi-firmware package. This is the traditional boot process followed by Raspberry Pi OS, and it s tightly coupled with firmware files like bootcode.bin, start.elf, and fixup.dat. These files are installed to /boot/firmware, which is mounted from a FAT32 partition labeled RASPIFIRM. The device tree files (*.dtb) are manually copied from /usr/lib/linux-image-*-arm64/broadcom/ into this partition. The kernel is installed via the linux-image-arm64 package, and the boot arguments are injected by modifying /boot/firmware/cmdline.txt using sed commands. Booting depends on the root partition being labeled RASPIROOT, referenced through that file. There is no bootloader like UEFI-based or U-Boot involved the Raspberry Pi firmware directly loads the kernel, which is standard for Raspberry Pi boards.
- apt: install
  packages:
    ...
    - raspi-firmware  
The boot partition contents and kernel boot setup are tightly controlled via scripting in the recipe. Limitations of Current Workflow: While this setup works, it is
  1. Proprietary and Raspberry Pi specific It relies on the closed-source GPU bootloader the raspi-firmware package, which is tightly coupled to specific Raspberry Pi models.
  2. Manual DTB handling Device tree files are manually copied and hardcoded, making upgrades or board-specific changes error-prone.
  3. Not easily extendable to future Raspberry Pi boards Any change in bootloader behavior (as seen in the Raspberry Pi 5, which introduces a more flexible firmware boot process) would require significant rework.
  4. No UEFI-based/U-Boot The current method bypasses the standard bootloader layers, making it inconsistent with other Debian ARM platforms and harder to maintain long-term.
As Raspberry Pi firmware and boot processes evolve, especially with the introduction of Pi 5 and potentially Pi 6, maintaining compatibility will require more flexibility - something best delivered by adopting U-Boot and flash-kernel.

New Workflow: Building Architecture-Specific Images with vmdb2, U-Boot, flash-kernel, and Debian Kernel This workflow outlines an improved approach to generating bootable Debian images architecture specific, using vmdb2, U-Boot, flash-kernel, and Debian kernels and also to move away from Raspberry Pi s proprietary bootloader to a fully open-source boot process which improves maintainability, consistency, and cross-board support.

New Method: Shift to U-Boot + flash-kernel U-Boot (via Debian su-boot-rpi package) and flash-kernel bring the image building process closer to how Debian officially boots ARM devices. flash-kernel integrates with the system s initramfs and kernel packages to install bootloaders, prepare boot.scr or extlinux.conf, and copy kernel/initrd/DTBs to /boot in a format that U-Boot expects. U-Boot will be used as a second-stage bootloader, loaded by the Raspberry Pi s built-in firmware. Once U-Boot is in place, it will read standard boot scripts ( boot.scr) generated by flash-kernel, providing a Debian-compatible and board-flexible solution. Extending YAML spec for vmdb2 build with U-Boot and flash-kernel To improve an existing vmdb2 YAML spec(https://salsa.debian.org/raspi-team/image-specs/raspi_master.yaml), to integrate U-Boot, flash-kernel, and the architecture-specific Debian kernel into the image build process. By incorporating u-boot-rpi and flash-kernel from Debian packages, alongside the standard initramfs-tools, we align the image closer to Debian best practices while supporting both armhf and arm64 architectures. Below are key additions and adjustments needed in a vmdb2 YAML spec to support the workflow: Install U-Boot, flash-kernel, initramfs-tools and the architecture-specific Debian kernel.
- apt: install
  packages:
    - u-boot-rpi
    - flash-kernel
    - initramfs-tools
    - linux-image-arm64 # or linux-image-armmp for armhf 
  tag: tag-root
Replace linux-image-arm64 with the correct kernel package for specific target architecture. These packages should be added under the tag-root section in YAML spec for vmdb2 build recipe. This ensures that the necessary bootloader, kernel, and initramfs tools are included and properly configured in the image. Configure Raspberry Pi firmware to Load U-Boot Install the U-Boot binary as kernel.img in /boot/firmware we can also download and build U-Boot from source, but Debian provides tested binaries.
- shell:  
    cp /usr/lib/u-boot/rpi_4/u-boot.bin $ ROOT? /boot/firmware/kernel.img
    echo "enable_uart=1" >> $ ROOT? /boot/firmware/config.txt
  root-fs: tag-root
This makes the RPi firmware load u-boot.bin instead of the Linux kernel directly. Set Up flash-kernel for Debian-style Boot flash-kernel integrates with initramfs-tools and writes boot config suitable for U-Boot. We need to make sure /etc/flash-kernel/db contains an entry for board (most Raspberry Pi boards already supported in Bookworm). Set up /etc/flash-kernel.conf with:
- create-file: /etc/flash-kernel.conf
  contents:  
    MACHINE="Raspberry Pi 4"
    BOOTPART="/dev/disk/by-label/RASPIFIRM"
    ROOTPART="/dev/disk/by-label/RASPIROOT"
  unless: rootfs_unpacked
This allows flash-kernel to write an extlinux.conf or boot.scr into /boot/firmware. Clean up Proprietary/Non-Free Firmware Bootflow Remove the direct kernel loading flow:
- shell:  
    rm -f $ ROOT? /boot/firmware/vmlinuz*
    rm -f $ ROOT? /boot/firmware/initrd.img*
    rm -f $ ROOT? /boot/firmware/cmdline.txt
  root-fs: tag-root
Let U-Boot and flash-kernel manage kernel/initrd and boot parameters instead. Boot Flow After This Change
[SoC ROM] -> [start.elf] -> [U-Boot] -> [boot.scr] -> [Linux Kernel]
  1. This still depends on the Raspberry Pi firmware to start, but it only loads U-Boot, not Linux kernel.
  2. U-Boot gives you more flexibility (e.g., networking, boot menus, signed boot).
  3. Using flash-kernel ensures kernel updates are handled the Debian Installer way.
  4. Test with a serial console (enable_uart=1) in case HDMI doesn t show early boot logs.
Advantage of New Workflow
  1. Replaces the proprietary Raspberry Pi bootloader with upstream U-Boot.
  2. Debian-native tooling Uses flash-kernel and initramfs-tools to manage boot configuration.
  3. Consistent across boards Works for both armhf and arm64, unifying the image build process.
  4. Easier to support new boards Like the Raspberry Pi 5 and future models.
This transition will standardize a bit image-building process, making it aligned with upstream Debian Installer workflows.

vmdb2 configuration for arm64 using u-boot and flash-kernel NOTE: This is a baseline example and may require tuning.
# Raspberry Pi arm64 image using U-Boot and flash-kernel
steps:
  # ... (existing mkimg, partitions, mount, debootstrap, etc.) ...
  # Install U-Boot, flash-kernel, initramfs-tools and architecture specific kernel
  - apt: install
    packages:
      - u-boot-rpi
      - flash-kernel
      - initramfs-tools
      - linux - image - arm64 # or linux - image - armmp for armhf
    tag: tag-root
  # Install U-Boot binary as kernel.img in firmware partition
  - shell:  
      cp /usr/lib/u-boot/rpi_arm64 /u-boot.bin $ ROOT? /boot/firmware/kernel.img
      echo "enable_uart=1" >> $ ROOT? /boot/firmware/config.txt
    root-fs: tag-root
  # Configure flash-kernel for Raspberry Pi
  - create-file: /etc/flash-kernel.conf
    contents:  
      MACHINE="Generic Raspberry Pi ARM64"
      BOOTPART="/dev/disk/by-label/RASPIFIRM"
      ROOTPART="/dev/disk/by-label/RASPIROOT"
    unless: rootfs_unpacked
  # Remove direct kernel boot files from Raspberry Pi firmware
  - shell:  
      rm -f $ ROOT? /boot/firmware/vmlinuz*
      rm -f $ ROOT? /boot/firmware/initrd.img*
      rm -f $ ROOT? /boot/firmware/cmdline.txt
    root-fs: tag-root
  # flash-kernel will manage boot scripts and extlinux.conf
  # Rest of image build continues...

Required Changes to Support Raspberry Pi Boards in Debian (flash-kernel + U-Boot)

Overview of Required Changes
Component Required Task
Debian U-Boot Package Add build target for rpi_arm64 in u-boot-rpi. Optionally deprecate legacy 32-bit targets.
Debian flash-kernel Package Add or verify entries in db/all.db for Pi 4, Pi 5, Zero 2W, CM4. Ensure boot script generation works via bootscr.uboot-generic.
Debian Kernel Ensure DTBs are installed at /usr/lib/linux-image-<version>/ and available for flash-kernel to reference.

flash-kernel

Already Supported Boards in flash-kernel Debian Package https://sources.debian.org/src/flash-kernel/3.109/db/all.db/#L1700
Model Arch DTB-Id
Raspberry Pi 1 A/B/B+, Rev2 armel bcm2835-*
Raspberry Pi CM1 armel bcm2835-rpi-cm1-io1.dtb
Raspberry Pi Zero/Zero W armel bcm2835-rpi-zero*.dtb
Raspberry Pi 2B armhf bcm2836-rpi-2-b.dtb
Raspberry Pi 3B/3B+ arm64 bcm2837-*
Raspberry Pi CM3 arm64 bcm2837-rpi-cm3-io3.dtb
Raspberry Pi 400 arm64 bcm2711-rpi-400.dtb

uboot

Already Supported Boards in Debian U-Boot Package https://salsa.debian.org/installer-team/flash-kernel/-/blob/master/db/all.db

arm64 Model Arch Upstream Defconfig Debian Target - - - Raspberry Pi 3B arm64 rpi_3_defconfig rpi_3 Raspberry Pi 4B arm64 rpi_4_defconfig rpi_4 Raspberry Pi 3B/3B+/CM3/CM3+/4B/CM4/400/5B/Zero 2W arm64 rpi_arm64_defconfig rpi_arm64
armhf Model Arch Upstream Defconfig Debian Target - - - Raspberry Pi 2 armhf rpi_2_defconfig rpi_2 Raspberry Pi 3B (32-bit) armhf rpi_3_32b_defconfig rpi_3_32b Raspberry Pi 4B (32-bit) armhf rpi_4_32b_defconfig rpi_4_32b
armel Model Arch Upstream Defconfig Debian Target - - - Raspberry Pi armel rpi_defconfig rpi Raspberry Pi 1/Zero armel rpi_0_w rpi_0_w These boards are already defined in debian/rules under the u-boot-rpi source package and generates usable U-Boot binaries for corresponding Raspberry Pi models.

To-Do: Add Missing Board Support to U-Boot and flash-kernel in Debian Several Raspberry Pi models are missing from the Debian U-Boot and flash-kernel packages, even though upstream support and DTBs exist in the Debian kernel but are missing entries in the flash-kernel database to enable support for bootloader installation and initrd handling.

Boards Not Yet Supported in flash-kernel Debian Package
Model Arch DTB-Id
Raspberry Pi 3A+ (32 & 64 bit) armhf, arm64 bcm2837-rpi-3-a-plus.dtb
Raspberry Pi 4B (32 & 64 bit) armhf, arm64 bcm2711-rpi-4-b.dtb
Raspberry Pi CM4 arm64 bcm2711-rpi-cm4-io.dtb
Raspberry Pi CM 4S arm64 -
Raspberry Zero 2 W arm64 bcm2710-rpi-zero-2-w.dtb
Raspberry Pi 5 arm64 bcm2712-rpi-5-b.dtb
Raspberry Pi CM5 arm64 -
Raspberry Pi 500 arm64 -

Boards Not Yet Supported in Debian U-Boot Package
Model Arch Upstream defconfig(s)
Raspberry Pi 3A+/3B+ arm64 -, rpi_3_b_plus_defconfig
Raspberry Pi CM 4S arm64 -
Raspberry Pi 5 arm64 -
Raspberry Pi CM5 arm64 -
Raspberry Pi 500 arm64 -

So, what next? During the Community Bonding Period, I got hands-on with workflow improvements, set up test environments, and began reviewing Raspberry Pi support in Debian s U-Boot and flash-kernel and these are the logs of the project, where I provide weekly reports on the work done. You can check here: Community Bonding Period logs. My next steps include submitting patches to the u-boot and flash-kernel packages to ensure all missing Raspberry Pi entries are built and shipped. And, also to confirm the kernel DTB installation paths and make sure the necessary files are included for all Raspberry Pi variants. Finally, plan to validate changes with test builds on Raspberry Pi hardware. In parallel, I m organizing my tasks and setting up my environment to contribute more effectively. It s been exciting to explore how things work under the hood and to prepare for a summer of learning and contributing to this great community.

18 June 2025

Sergio Durigan Junior: GCC, glibc, stack unwinding and relocations A war story

I ve been meaning to write a post about this bug for a while, so here it is (before I forget the details!). First, I d like to thank a few people: I ll probably forget some details because it s been more than a week (and life at $DAYJOB moves fast), but we ll see.

The background story Wolfi OS takes security seriously, and one of the things we have is a package which sets the hardening compiler flags for C/C++ according to the best practices recommended by OpenSSF. At the time of this writing, these flags are (in GCC s spec file parlance):
*self_spec:
+ % !O:% !O1:% !O2:% !O3:% !O0:% !Os:% !0fast:% !0g:% !0z:-O2  -fhardened -Wno-error=hardened -Wno-hardened % !fdelete-null-pointer-checks:-fno-delete-null-pointer-checks  -fno-strict-overflow -fno-strict-aliasing % !fomit-frame-pointer:-fno-omit-frame-pointer  -mno-omit-leaf-frame-pointer
*link:
+ --as-needed -O1 --sort-common -z noexecstack -z relro -z now
The important part for our bug is the usage of -z now and -fno-strict-aliasing. As I was saying, these flags are set for almost every build, but sometimes things don t work as they should and we need to disable them. Unfortunately, one of these problematic cases has been glibc. There was an attempt to enable hardening while building glibc, but that introduced a strange breakage to several of our packages and had to be reverted. Things stayed pretty much the same until a few weeks ago, when I started working on one of my roadmap items: figure out why hardening glibc wasn t working, and get it to work as much as possible.

Reproducing the bug I started off by trying to reproduce the problem. It s important to mention this because I often see young engineers forgetting to check if the problem is even valid anymore. I don t blame them; the anxiety to get the bug fixed can be really blinding. Fortunately, I already had one simple test to trigger the failure. All I had to do was install the py3-matplotlib package and then invoke:
$ python3 -c 'import matplotlib'
This would result in an abortion with a coredump. I followed the steps above, and readily saw the problem manifesting again. OK, first step is done; I wasn t getting out easily from this one.

Initial debug The next step is to actually try to debug the failure. In an ideal world you get lucky and are able to spot what s wrong after just a few minutes. Or even better: you also can devise a patch to fix the bug and contribute it to upstream. I installed GDB, and then ran the py3-matplotlib command inside it. When the abortion happened, I issued a backtrace command inside GDB to see where exactly things had gone wrong. I got a stack trace similar to the following:
#0  0x00007c43afe9972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007c43afe3d8be in raise () from /lib/libc.so.6
#2  0x00007c43afe2531f in abort () from /lib/libc.so.6
#3  0x00007c43af84f79d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007c43af86d4d8 in _Unwind_RaiseException () from /usr/lib/libgcc_s.so.1
#5  0x00007c43acac9014 in __cxxabiv1::__cxa_throw (obj=0x5b7d7f52fab0, tinfo=0x7c429b6fd218 <typeinfo for pybind11::attribute_error>, dest=0x7c429b5f7f70 <pybind11::reference_cast_error::~reference_cast_error() [clone .lto_priv.0]>)
    at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:93
#6  0x00007c429b5ec3a7 in ft2font__getattr__(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) [clone .lto_priv.0] [clone .cold] () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#7  0x00007c429b62f086 in pybind11::cpp_function::initialize<pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::scope, pybind11::sibling>(pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&):: lambda(pybind11::detail::function_call&)#1 ::_FUN(pybind11::detail::function_call&) [clone .lto_priv.0] ()
   from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#8  0x00007c429b603886 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
...
Huh. Initially this didn t provide me with much information. There was something strange seeing the abort function being called right after _Unwind_RaiseException, but at the time I didn t pay much attention to it. OK, time to expand our horizons a little. Remember when I said that several of our packages would crash with a hardened glibc? I decided to look for another problematic package so that I could make it crash and get its stack trace. My thinking here is that maybe if I can compare both traces, something will come up. I happened to find an old discussion where Dann Frazier mentioned that Emacs was also crashing for him. He and I share the Emacs passion, and I totally agreed with him when he said that Emacs crashing is priority -1! (I m paraphrasing). I installed Emacs, ran it, and voil : the crash happened again. OK, that was good. When I ran Emacs inside GDB and asked for a backtrace, here s what I got:
#0  0x00007eede329972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007eede323d8be in raise () from /lib/libc.so.6
#2  0x00007eede322531f in abort () from /lib/libc.so.6
#3  0x00007eede262879d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007eede2646e7c in _Unwind_Backtrace () from /usr/lib/libgcc_s.so.1
#5  0x00007eede3327b11 in backtrace () from /lib/libc.so.6
#6  0x000059535963a8a1 in emacs_backtrace ()
#7  0x000059535956499a in main ()
Ah, this backtrace is much simpler to follow. Nice. Hmmm. Now the crash is happening inside _Unwind_Backtrace. A pattern emerges! This must have something to do with stack unwinding (or so I thought keep reading to discover the whole truth). You see, the backtrace function (yes, it s a function) and C++ s exception handling mechanism use similar techniques to do their jobs, and it pretty much boils down to unwinding frames from the stack. I looked into Emacs source code, specifically the emacs_backtrace function, but could not find anything strange over there. This bug was probably not going to be an easy fix

The quest for a minimal reproducer Being able to easily reproduce the bug is awesome and really helps with debugging, but even better is being able to have a minimal reproducer for the problem. You see, py3-matplotlib is a huge package and pulls in a bunch of extra dependencies, so it s not easy to ask other people to just install this big package plus these other dependencies, and then run this command , especially if we have to file an upstream bug and talk to people who may not even run the distribution we re using. So I set up to try and come up with a smaller recipe to reproduce the issue, ideally something that s not tied to a specific package from the distribution. Having all the information gathered from the initial debug session, especially the Emacs backtrace, I thought that I could write a very simple program that just invoked the backtrace function from glibc in order to trigger the code path that leads to _Unwind_Backtrace. Here s what I wrote:
#include <execinfo.h>

int
main(int argc, char *argv[])
 
  void *a[4096];
  backtrace (a, 100);
  return 0;
 
After compiling it, I determined that yes, the problem did happen with this small program as well. There was only a small nuisance: the manifestation of the bug was not deterministic, so I had to execute the program a few times until it crashed. But that s much better than what I had before, and a small price to pay. Having a minimal reproducer pretty much allows us to switch our focus to what really matters. I wouldn t need to dive into Emacs or Python s source code anymore. At the time, I was sure this was a glibc bug. But then something else happened.

GCC 15 I had to stop my investigation efforts because something more important came up: it was time to upload GCC 15 to Wolfi. I spent a couple of weeks working on this (it involved rebuilding the whole archive, filing hundreds of FTBFS bugs, patching some programs, etc.), and by the end of it the transition went smooth. When the GCC 15 upload was finally done, I switched my focus back to the glibc hardening problem. The first thing I did was to yes, reproduce the bug again. It had been a few weeks since I had touched the package, after all. So I built a hardened glibc with the latest GCC and the bug did not happen anymore! Fortunately, the very first thing I thought was this must be GCC , so I rebuilt the hardened glibc with GCC 14, and the bug was there again. Huh, unexpected but very interesting.

Diving into glibc and libgcc At this point, I was ready to start some serious debugging. And then I got a message on Signal. It was one of those moments where two minds think alike: Gabriel decided to check how I was doing, and I was thinking about him because this involved glibc, and Gabriel contributed to the project for many years. I explained what I was doing, and he promptly offered to help. Yes, there are more people who love low level debugging! We spent several hours going through disassembles of certain functions (because we didn t have any debug information in the beginning), trying to make sense of what we were seeing. There was some heavy GDB involved; unfortunately I completely lost the session s history because it was done inside a container running inside an ephemeral VM. But we learned a lot. For example:
  • It was hard to actually understand the full stack trace leading to uw_init_context_1[cold]. _Unwind_Backtrace obviously didn t call it (it called uw_init_context_1, but what was that [cold] doing?). We had to investigate the disassemble of uw_init_context_1 in order to determined where uw_init_context_1[cold] was being called.
  • The [cold] suffix is a GCC function attribute that can be used to tell the compiler that the function is unlikely to be reached. When I read that, my mind immediately jumped to this must be an assertion , so I went to the source code and found the spot.
  • We were able to determine that the return code of uw_frame_state_for was 5, which means _URC_END_OF_STACK. That s why the assertion was triggering.
After finding these facts without debug information, I decided to bite the bullet and recompiled GCC 14 with -O0 -g3, so that we could debug what uw_frame_state_for was doing. After banging our heads a bit more, we found that fde is NULL at this excerpt:
// ...
  fde = _Unwind_Find_FDE (context->ra + _Unwind_IsSignalFrame (context) - 1,
                          &context->bases);
  if (fde == NULL)
     
#ifdef MD_FALLBACK_FRAME_STATE_FOR
      /* Couldn't find frame unwind info for this function.  Try a
         target-specific fallback mechanism.  This will necessarily
         not provide a personality routine or LSDA.  */
      return MD_FALLBACK_FRAME_STATE_FOR (context, fs);
#else
      return _URC_END_OF_STACK;
#endif
     
// ...
We re debugging on amd64, which means that MD_FALLBACK_FRAME_STATE_FOR is defined and therefore is called. But that s not really important for our case here, because we had established before that _Unwind_Find_FDE would never return NULL when using a non-hardened glibc (or a glibc compiled with GCC 15). So we decided to look into what _Unwind_Find_FDE did. The function is complex because it deals with .eh_frame , but we were able to pinpoint the exact location where find_fde_tail (one of the functions called by _Unwind_Find_FDE) is returning NULL:
if (pc < table[0].initial_loc + data_base)
  return NULL;
We looked at the addresses of pc and table[0].initial_loc + data_base, and found that the former fell within libgcc s text section, which the latter fell within /lib/ld-linux-x86-64.so.2 text. At this point, we were already too tired to continue. I decided to keep looking at the problem later and see if I could get any further.

Bisecting GCC The next day, I woke up determined to find what changed in GCC 15 that caused the bug to disappear. Unless you know GCC s internals like they are your own home (which I definitely don t), the best way to do that is to git bisect the commits between GCC 14 and 15. I spent a few days running the bisect. It took me more time than I d have liked to find the right range of commits to pass git bisect (because of how branches and tags are done in GCC s repository), and I also had to write some helper scripts that:
  • Modified the gcc.yaml package definition to make it build with the commit being bisected.
  • Built glibc using the GCC that was just built.
  • Ran tests inside a docker container (with the recently built glibc installed) to determine whether the bug was present.
At the end, I had a commit to point to:
commit 99b1daae18c095d6c94d32efb77442838e11cbfb
Author: Richard Biener <rguenther@suse.de>
Date:   Fri May 3 14:04:41 2024 +0200
    tree-optimization/114589 - remove profile based sink heuristics
Makes sense, right?! No? Well, it didn t for me either. Even after reading what was changed in the code and the upstream bug fixed by the commit, I was still clueless as to why this change fixed the problem (I say fixed because it may very well be an unintended consequence of the change, and some other problem might have been introduced).

Upstream takes over After obtaining the commit that possibly fixed the bug, while talking to Dann and explaining what I did, he suggested that I should file an upstream bug and check with them. Great idea, of course. I filed the following upstream bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120653 It s a bit long, very dense and complex, but ultimately upstream was able to find the real problem and have a patch accepted in just two days. Nothing like knowing the code base. The initial bug became: https://sourceware.org/bugzilla/show_bug.cgi?id=33088 In the end, the problem was indeed in how the linker defines __ehdr_start, which, according to the code (from elf/dl-support.c):
if (_dl_phdr == NULL)
   
    /* Starting from binutils-2.23, the linker will define the
       magic symbol __ehdr_start to point to our own ELF header
       if it is visible in a segment that also includes the phdrs.
       So we can set up _dl_phdr and _dl_phnum even without any
       information from auxv.  */


    extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
    assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
    _dl_phdr = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
    _dl_phnum = __ehdr_start.e_phnum;
   
But the following definition is the problematic one (from elf/rtld.c):
extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
This symbol (along with its counterpart, __ehdr_end) was being run-time relocated when it shouldn t be. The fix that was pushed added optimization barriers to prevent the compiler from doing the relocations. I don t claim to fully understand what was done here, and Jakub s analysis is a thing to behold, but in the end I was able to confirm that the patch fixed the bug. And in the end, it was indeed a glibc bug.

Conclusion This was an awesome bug to investigate. It s one of those that deserve a blog post, even though some of the final details of the fix flew over my head. I d like to start blogging more about these sort of bugs, because I ve encountered my fair share of them throughout my career. And it was great being able to do some debugging with another person, exchange ideas, learn things together, and ultimately share that deep satisfaction when we find why a crash is happening. I have at least one more bug in my TODO list to write about (another one with glibc, but this time I was able to get to the end of it and come up with a patch). Stay tunned. P.S.: After having published the post I realized that I forgot to explain why the -z now and -fno-strict-aliasing flags were important. -z now is the flag that I determined to be the root cause of the breakage. If I compiled glibc with every hardening flag except -z now, everything worked. So initially I thought that the problem had to do with how ld.so was resolving symbols at runtime. As it turns out, this ended up being more a symptom than the real cause of the bug. As for -fno-strict-aliasing, a Gentoo developer who commented on the GCC bug above mentioned that this OpenSSF bug had a good point against using this flag for hardening. I still have to do a deep dive on what was discussed in the issue, but this is certainly something to take into consideration. There s this very good write-up about strict aliasing in general if you re interested in understanding it better.

Next.

Previous.