
18 June 2025

Sergio Durigan Junior: GCC, glibc, stack unwinding and relocations - A war story

I've been meaning to write a post about this bug for a while, so here it is (before I forget the details!). First, I'd like to thank a few people. I'll probably forget some details because it's been more than a week (and life at $DAYJOB moves fast), but we'll see.

The background story Wolfi OS takes security seriously, and one of the things we have is a package which sets the hardening compiler flags for C/C++ according to the best practices recommended by OpenSSF. At the time of this writing, these flags are (in GCC's spec file parlance):
*self_spec:
+ %{!O:%{!O1:%{!O2:%{!O3:%{!O0:%{!Os:%{!0fast:%{!0g:%{!0z:-O2}}}}}}}}} -fhardened -Wno-error=hardened -Wno-hardened %{!fdelete-null-pointer-checks:-fno-delete-null-pointer-checks} -fno-strict-overflow -fno-strict-aliasing %{!fomit-frame-pointer:-fno-omit-frame-pointer} -mno-omit-leaf-frame-pointer
*link:
+ --as-needed -O1 --sort-common -z noexecstack -z relro -z now
The important part for our bug is the usage of -z now and -fno-strict-aliasing. As I was saying, these flags are set for almost every build, but sometimes things don't work as they should and we need to disable them. Unfortunately, one of these problematic cases has been glibc. There was an attempt to enable hardening while building glibc, but that introduced a strange breakage to several of our packages and had to be reverted. Things stayed pretty much the same until a few weeks ago, when I started working on one of my roadmap items: figure out why hardening glibc wasn't working, and get it to work as much as possible.
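As an aside, a quick way to check whether a given binary or shared library was actually linked with -z now (assuming binutils' readelf is available; the library path below is just an example, not something from the original post) is to look for the BIND_NOW / NOW flags in its dynamic section:

$ readelf -d /lib/libc.so.6 | grep -E 'BIND_NOW|FLAGS'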

Reproducing the bug I started off by trying to reproduce the problem. It's important to mention this because I often see young engineers forgetting to check if the problem is even valid anymore. I don't blame them; the anxiety to get the bug fixed can be really blinding. Fortunately, I already had one simple test to trigger the failure. All I had to do was install the py3-matplotlib package and then invoke:
$ python3 -c 'import matplotlib'
This would result in an abort with a core dump. I followed the steps above, and readily saw the problem manifesting again. OK, first step is done; I wasn't getting out of this one easily.

Initial debug The next step is to actually try to debug the failure. In an ideal world you get lucky and are able to spot what's wrong after just a few minutes. Or even better: you can also devise a patch to fix the bug and contribute it upstream. I installed GDB, and then ran the py3-matplotlib command inside it. When the abort happened, I issued a backtrace command inside GDB to see where exactly things had gone wrong. I got a stack trace similar to the following:
#0  0x00007c43afe9972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007c43afe3d8be in raise () from /lib/libc.so.6
#2  0x00007c43afe2531f in abort () from /lib/libc.so.6
#3  0x00007c43af84f79d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007c43af86d4d8 in _Unwind_RaiseException () from /usr/lib/libgcc_s.so.1
#5  0x00007c43acac9014 in __cxxabiv1::__cxa_throw (obj=0x5b7d7f52fab0, tinfo=0x7c429b6fd218 <typeinfo for pybind11::attribute_error>, dest=0x7c429b5f7f70 <pybind11::reference_cast_error::~reference_cast_error() [clone .lto_priv.0]>)
    at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:93
#6  0x00007c429b5ec3a7 in ft2font__getattr__(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) [clone .lto_priv.0] [clone .cold] () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#7  0x00007c429b62f086 in pybind11::cpp_function::initialize<pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::scope, pybind11::sibling>(pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&):: lambda(pybind11::detail::function_call&)#1 ::_FUN(pybind11::detail::function_call&) [clone .lto_priv.0] ()
   from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#8  0x00007c429b603886 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
...
Huh. Initially this didn't provide me with much information. There was something strange seeing the abort function being called right after _Unwind_RaiseException, but at the time I didn't pay much attention to it. OK, time to expand our horizons a little. Remember when I said that several of our packages would crash with a hardened glibc? I decided to look for another problematic package so that I could make it crash and get its stack trace. My thinking here is that maybe if I can compare both traces, something will come up. I happened to find an old discussion where Dann Frazier mentioned that Emacs was also crashing for him. He and I share the Emacs passion, and I totally agreed with him when he said that "Emacs crashing is priority -1!" (I'm paraphrasing). I installed Emacs, ran it, and voilà: the crash happened again. OK, that was good. When I ran Emacs inside GDB and asked for a backtrace, here's what I got:
#0  0x00007eede329972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007eede323d8be in raise () from /lib/libc.so.6
#2  0x00007eede322531f in abort () from /lib/libc.so.6
#3  0x00007eede262879d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007eede2646e7c in _Unwind_Backtrace () from /usr/lib/libgcc_s.so.1
#5  0x00007eede3327b11 in backtrace () from /lib/libc.so.6
#6  0x000059535963a8a1 in emacs_backtrace ()
#7  0x000059535956499a in main ()
Ah, this backtrace is much simpler to follow. Nice. Hmmm. Now the crash is happening inside _Unwind_Backtrace. A pattern emerges! This must have something to do with stack unwinding (or so I thought; keep reading to discover the whole truth). You see, the backtrace function (yes, it's a function) and C++'s exception handling mechanism use similar techniques to do their jobs, and it pretty much boils down to unwinding frames from the stack. I looked into Emacs' source code, specifically the emacs_backtrace function, but could not find anything strange over there. This bug was probably not going to be an easy fix…

The quest for a minimal reproducer Being able to easily reproduce the bug is awesome and really helps with debugging, but even better is being able to have a minimal reproducer for the problem. You see, py3-matplotlib is a huge package and pulls in a bunch of extra dependencies, so it's not easy to ask other people to "just install this big package plus these other dependencies, and then run this command", especially if we have to file an upstream bug and talk to people who may not even run the distribution we're using. So I set out to try and come up with a smaller recipe to reproduce the issue, ideally something that's not tied to a specific package from the distribution. Having all the information gathered from the initial debug session, especially the Emacs backtrace, I thought that I could write a very simple program that just invoked the backtrace function from glibc in order to trigger the code path that leads to _Unwind_Backtrace. Here's what I wrote:
#include <execinfo.h>

int
main(int argc, char *argv[])
{
  void *a[4096];

  /* Capture up to 100 return addresses; this walks the stack via the unwinder.  */
  backtrace (a, 100);
  return 0;
}
After compiling it, I determined that yes, the problem did happen with this small program as well. There was only a small nuisance: the manifestation of the bug was not deterministic, so I had to execute the program a few times until it crashed. But that's much better than what I had before, and a small price to pay. Having a minimal reproducer pretty much allows us to switch our focus to what really matters. I wouldn't need to dive into Emacs or Python's source code anymore. At the time, I was sure this was a glibc bug. But then something else happened.
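For the record, exercising a reproducer like this can look something like the following (the file name and the retry loop are mine, not from the post; the loop simply reruns the program until the non-deterministic abort finally happens):

$ gcc -o backtrace-test backtrace-test.c
$ while ./backtrace-test; do :; done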

GCC 15 I had to stop my investigation efforts because something more important came up: it was time to upload GCC 15 to Wolfi. I spent a couple of weeks working on this (it involved rebuilding the whole archive, filing hundreds of FTBFS bugs, patching some programs, etc.), and by the end of it the transition went smoothly. When the GCC 15 upload was finally done, I switched my focus back to the glibc hardening problem. The first thing I did was to (yes) reproduce the bug again. It had been a few weeks since I had touched the package, after all. So I built a hardened glibc with the latest GCC and the bug did not happen anymore! Fortunately, the very first thing I thought was "this must be GCC", so I rebuilt the hardened glibc with GCC 14, and the bug was there again. Huh, unexpected but very interesting.

Diving into glibc and libgcc At this point, I was ready to start some serious debugging. And then I got a message on Signal. It was one of those moments where two minds think alike: Gabriel decided to check how I was doing, and I was thinking about him because this involved glibc, and Gabriel contributed to the project for many years. I explained what I was doing, and he promptly offered to help. Yes, there are more people who love low level debugging! We spent several hours going through the disassembly of certain functions (because we didn't have any debug information in the beginning), trying to make sense of what we were seeing. There was some heavy GDB involved; unfortunately I completely lost the session's history because it was done inside a container running inside an ephemeral VM. But we learned a lot. For example:
  • It was hard to actually understand the full stack trace leading to uw_init_context_1[cold]. _Unwind_Backtrace obviously didn't call it (it called uw_init_context_1, but what was that [cold] doing?). We had to investigate the disassembly of uw_init_context_1 in order to determine where uw_init_context_1[cold] was being called.
  • The [cold] suffix is a GCC function attribute that can be used to tell the compiler that the function is unlikely to be reached. When I read that, my mind immediately jumped to "this must be an assertion", so I went to the source code and found the spot.
  • We were able to determine that the return code of uw_frame_state_for was 5, which means _URC_END_OF_STACK. That's why the assertion was triggering.
After finding these facts without debug information, I decided to bite the bullet and recompiled GCC 14 with -O0 -g3, so that we could debug what uw_frame_state_for was doing. After banging our heads a bit more, we found that fde is NULL at this excerpt:
// ...
  fde = _Unwind_Find_FDE (context->ra + _Unwind_IsSignalFrame (context) - 1,
                          &context->bases);
  if (fde == NULL)
    {
#ifdef MD_FALLBACK_FRAME_STATE_FOR
      /* Couldn't find frame unwind info for this function.  Try a
         target-specific fallback mechanism.  This will necessarily
         not provide a personality routine or LSDA.  */
      return MD_FALLBACK_FRAME_STATE_FOR (context, fs);
#else
      return _URC_END_OF_STACK;
#endif
    }
// ...
We're debugging on amd64, which means that MD_FALLBACK_FRAME_STATE_FOR is defined and therefore is called. But that's not really important for our case here, because we had established before that _Unwind_Find_FDE would never return NULL when using a non-hardened glibc (or a glibc compiled with GCC 15). So we decided to look into what _Unwind_Find_FDE did. The function is complex because it deals with .eh_frame, but we were able to pinpoint the exact location where find_fde_tail (one of the functions called by _Unwind_Find_FDE) is returning NULL:
if (pc < table[0].initial_loc + data_base)
  return NULL;
We looked at the addresses of pc and table[0].initial_loc + data_base, and found that the former fell within libgcc's text section, while the latter fell within the text of /lib/ld-linux-x86-64.so.2. At this point, we were already too tired to continue. I decided to keep looking at the problem later and see if I could get any further.

Bisecting GCC The next day, I woke up determined to find what changed in GCC 15 that caused the bug to disappear. Unless you know GCC's internals like they are your own home (which I definitely don't), the best way to do that is to git bisect the commits between GCC 14 and 15. I spent a few days running the bisect. It took me more time than I'd have liked to find the right range of commits to pass git bisect (because of how branches and tags are done in GCC's repository), and I also had to write some helper scripts that:
  • Modified the gcc.yaml package definition to make it build with the commit being bisected.
  • Built glibc using the GCC that was just built.
  • Ran tests inside a docker container (with the recently built glibc installed) to determine whether the bug was present.
At the end, I had a commit to point to:
commit 99b1daae18c095d6c94d32efb77442838e11cbfb
Author: Richard Biener <rguenther@suse.de>
Date:   Fri May 3 14:04:41 2024 +0200
    tree-optimization/114589 - remove profile based sink heuristics
Makes sense, right?! No? Well, it didn't for me either. Even after reading what was changed in the code and the upstream bug fixed by the commit, I was still clueless as to why this change fixed the problem (I say "fixed" because it may very well be an unintended consequence of the change, and some other problem might have been introduced).

Upstream takes over After obtaining the commit that possibly fixed the bug, while talking to Dann and explaining what I did, he suggested that I should file an upstream bug and check with them. Great idea, of course. I filed the following upstream bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120653 It's a bit long, very dense and complex, but ultimately upstream was able to find the real problem and have a patch accepted in just two days. Nothing like knowing the code base. The initial bug became: https://sourceware.org/bugzilla/show_bug.cgi?id=33088 In the end, the problem was indeed in how the linker defines __ehdr_start, which, according to the code (from elf/dl-support.c):
if (_dl_phdr == NULL)
  {
    /* Starting from binutils-2.23, the linker will define the
       magic symbol __ehdr_start to point to our own ELF header
       if it is visible in a segment that also includes the phdrs.
       So we can set up _dl_phdr and _dl_phnum even without any
       information from auxv.  */


    extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
    assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
    _dl_phdr = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
    _dl_phnum = __ehdr_start.e_phnum;
  }
But the following definition is the problematic one (from elf/rtld.c):
extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
This symbol (along with its counterpart, __ehdr_end) was being run-time relocated when it shouldn't be. The fix that was pushed added optimization barriers to prevent the compiler from doing the relocations. I don't claim to fully understand what was done here, and Jakub's analysis is a thing to behold, but in the end I was able to confirm that the patch fixed the bug. And in the end, it was indeed a glibc bug.

Conclusion This was an awesome bug to investigate. It's one of those that deserve a blog post, even though some of the final details of the fix flew over my head. I'd like to start blogging more about this sort of bug, because I've encountered my fair share of them throughout my career. And it was great being able to do some debugging with another person, exchange ideas, learn things together, and ultimately share that deep satisfaction when we find out why a crash is happening. I have at least one more bug in my TODO list to write about (another one with glibc, but this time I was able to get to the end of it and come up with a patch). Stay tuned. P.S.: After having published the post I realized that I forgot to explain why the -z now and -fno-strict-aliasing flags were important. -z now is the flag that I determined to be the root cause of the breakage. If I compiled glibc with every hardening flag except -z now, everything worked. So initially I thought that the problem had to do with how ld.so was resolving symbols at runtime. As it turns out, this ended up being more a symptom than the real cause of the bug. As for -fno-strict-aliasing, a Gentoo developer who commented on the GCC bug above mentioned that this OpenSSF bug had a good point against using this flag for hardening. I still have to do a deep dive on what was discussed in the issue, but this is certainly something to take into consideration. There's this very good write-up about strict aliasing in general if you're interested in understanding it better.

17 June 2025

Matthew Garrett: Locally hosting an internet-connected server

I'm lucky enough to have a weird niche ISP available to me, so I'm paying $35 a month for around 600MBit symmetric data. Unfortunately they don't offer static IP addresses to residential customers, and nor do they allow multiple IP addresses per connection, and I'm the sort of person who'd like to run a bunch of stuff myself, so I've been looking for ways to manage this.

What I've ended up doing is renting a cheap VPS from a vendor that lets me add multiple IP addresses for minimal extra cost. The precise nature of the VPS isn't relevant - you just want a machine (it doesn't need much CPU, RAM, or storage) that has multiple world routeable IPv4 addresses associated with it and has no port blocks on incoming traffic. Ideally it's geographically local and peers with your ISP in order to reduce additional latency, but that's a nice to have rather than a requirement.

By setting that up you now have multiple real-world IP addresses that people can get to. How do we get them to the machine in your house you want to be accessible? First we need a connection between that machine and your VPS, and the easiest approach here is Wireguard. We only need a point-to-point link, nothing routable, and none of the IP addresses involved need to have anything to do with any of the rest of your network. So, on your local machine you want something like:

[Interface]
PrivateKey = privkeyhere
ListenPort = 51820
Address = localaddr/32

[Peer]
Endpoint = VPS:51820
PublicKey = pubkeyhere
AllowedIPs = VPS/0


And on your VPS, something like:

[Interface]
Address = vpswgaddr/32
SaveConfig = true
ListenPort = 51820
PrivateKey = privkeyhere

[Peer]
PublicKey = pubkeyhere
AllowedIPs = localaddr/32


The addresses here are (other than the VPS address) arbitrary - but they do need to be consistent, otherwise Wireguard is going to be unhappy and your packets will not have a fun time. Bring that interface up with wg-quick and make sure the devices can ping each other. Hurrah! That's the easy bit.
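For example, assuming the config above has been saved as /etc/wireguard/wg0.conf (the wg0 name is my assumption, though it matches the wg0 used further down), bringing it up and testing it looks like:

wg-quick up wg0
ping -c 3 vpswgaddr   # run on the local machine, substituting the real VPS-side Wireguard address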

Now you want packets from the outside world to get to your internal machine. Let's say the external IP address you're going to use for that machine is 321.985.520.309 and the wireguard address of your local system is 867.420.696.005. On the VPS, you're going to want to do:

iptables -t nat -A PREROUTING -p tcp -d 321.985.520.309 -j DNAT --to-destination 867.420.696.005

Now, all incoming packets for 321.985.520.309 will be rewritten to head towards 867.420.696.005 instead (make sure you've set net.ipv4.ip_forward to 1 via sysctl!). Victory! Or is it? Well, no.
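(For reference, a quick, non-persistent way to enable forwarding on the VPS is:

sysctl -w net.ipv4.ip_forward=1

with the usual /etc/sysctl.d drop-in if you want it to survive a reboot.)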

What we're doing here is rewriting the destination address of the packets so instead of heading to an address associated with the VPS, they're now going to head to your internal system over the Wireguard link. Which is then going to ignore them, because the AllowedIPs statement in the config only allows packets coming from your VPS, and these packets still have their original source IP. We could rewrite the source IP to match the VPS IP, but then you'd have no idea where any of these packets were coming from, and that sucks. Let's do something better. On the local machine, in the peer, let's update AllowedIPs to 0.0.0.0/0 to permit packets from any source to appear over our Wireguard link. But if we bring the interface up now, it'll try to route all traffic over the Wireguard link, which isn't what we want. So we'll add table = off to the interface stanza of the config to disable that, and now we can bring the interface up without breaking everything but still allowing packets to reach us. However, we do still need to tell the kernel how to reach the remote VPN endpoint, which we can do with ip route add vpswgaddr dev wg0. Add this to the interface stanza as:

PostUp = ip route add vpswgaddr dev wg0
PreDown = ip route del vpswgaddr dev wg0


That's half the battle. The problem is that they're going to show up there with the source address still set to the original source IP, and your internal system is (because Linux) going to notice it has the ability to just send replies to the outside world via your ISP rather than via Wireguard and nothing is going to work. Thanks, Linux. Thinux.

But there's a way to solve this - policy routing. Linux allows you to have multiple separate routing tables, and define policy that controls which routing table will be used for a given packet. First, let's define a new table reference. On the local machine, edit /etc/iproute2/rt_tables and add a new entry that's something like:

1 wireguard


where "1" is just a standin for a number not otherwise used there. Now edit your wireguard config and replace table=off with table=wireguard - Wireguard will now update the wireguard routing table rather than the global one. Now all we need to do is to tell the kernel to push packets into the appropriate routing table - we can do that with ip rule add from localaddr lookup wireguard, which tells the kernel to take any packet coming from our Wireguard address and push it via the Wireguard routing table. Add that to your Wireguard interface config as:

PostUp = ip rule add from localaddr lookup wireguard
PreDown = ip rule del from localaddr lookup wireguard

and now your local system is effectively on the internet.
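If you want to sanity-check the result, something like the following (run on the local machine) should show the new rule and the route that Wireguard added to its own table:

ip rule show
ip route show table wireguard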

You can do this for multiple systems - just configure additional Wireguard interfaces on the VPS and make sure they're all listening on different ports. If your local IP changes then your local machines will end up reconnecting to the VPS, but to the outside world their accessible IP address will remain the same. It's like having a real IP without the pain of convincing your ISP to give it to you.
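As a sketch of what a second tunnel on the VPS might look like (all names, keys and addresses here are placeholders I made up, not values from the post), note the different ListenPort:

[Interface]
Address = vpswgaddr2/32
SaveConfig = true
ListenPort = 51821
PrivateKey = otherprivkeyhere

[Peer]
PublicKey = otherpubkeyhere
AllowedIPs = otherlocaladdr/32

with a matching DNAT rule pointing the second public IP at otherlocaladdr.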


16 June 2025

Kentaro Hayashi: Fixing long standing font issue about Debian Graphical Installer

Introduction This is just a quick note about how the long-standing font issue in the Debian Graphical Installer was fixed, ready for the upcoming trixie release. Recently, this issue was resolved by Cyril Brulebois. Thanks!

What is the problem? Because of Han unification, wrong font typefaces are rendered by default when you choose the Japanese language in the graphical Debian installer.
"Wrong" glyph for Japanese
Most of the typefaces seem correct, but some wrong typefaces (Simplified Chinese) are used for widget rendering. This issue cannot be solved as long as DroidSansFallback.ttf continues to be used for Japanese. Thus, we need to switch to a font that contains proper Japanese typefaces in order to fix this issue. If you want to know about how Han Unification is harmful in this context, see

What causes this problem? In short, fonts-android (DroidSansFallback.ttf) had been used for CJK, especially for Japanese. Since Debian 9 (stretch), fonts-android was adopted as the default for CJK fonts. Thus this issue was not resolved throughout the Debian 9, Debian 10, Debian 11 and Debian 12 release cycles!

What is the impact of this issue? Sadly (or fortunately), Japanese native speakers can still read such unexpectedly rendered "wrong" glyphs, so it is not hard to continue the Debian installation process. But even if the installer's functionality is not affected, it gives a terrible user experience to newcomers. For example, how could you trust an installer that is full of typos? It is a similar situation for Japanese users.

How was the Debian Graphical Installer fixed? In short, the new fonts-motoya-l-cedar-udeb package was bundled for Japanese, and the installer was changed to switch to that font via the gtk-set-font command. It was difficult to decide which font is best, balancing font file size and visibility. A Japanese font file typically occupies an extra few MB. Luckily, some space had been freed up in the installer, so it was not seen as a problem (I guess). As a bonus, we tried to investigate the possibility of a font compression mechanism for the installer, but it was regarded as too complicated and not suitable for the trixie release cycle.

Conclusion
  • The font issue was fixed in Debian Graphical Installer for Japanese
  • As the fix is recent, it is not officially shipped yet (NOTE: Debian Installer Trixie RC1 does not contain this fix). Try a daily build of the installer if you want to test it.
This article was written with an Ultimate Hacking Keyboard 60 v2 with Rizer 60 (my new gear!).

8 June 2025

Thorsten Alteholz: My Debian Activities in May 2025

Debian LTS This was the hundred-thirty-first month in which I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded or worked on: I also continued my work on libxmltok and suricata. This month I also had to do some support on seger, for example to inject packages newly needed for builds.

Debian ELTS This month was the eighty-second ELTS month. During my allocated time I uploaded or worked on: All packages I worked on have been on the list of longstanding packages. For example, espeak-ng has been on this list for more than nine months. I now understand that there is a reason why packages are on this list. Some parts of the software have been almost completely reworked, so that the patches need a reverse rework. For some packages this is easy, but for others this rework needs quite some time. I also continued to work on libxmltok and suricata.

Debian Printing Unfortunately I didn't find any time to work on this topic.

Debian Astro This month I uploaded bugfix versions of:

Debian Mobcom This month I uploaded bugfix versions of:

misc This month I uploaded bugfix versions of: Thanks a lot to the Release Team who quickly handled all my unblock bugs!

FTP master It is that time of the year when just a few packages arrive in NEW: it is Hard Freeze. So I enjoy this period and basically just take care of kernels or other important packages. As people seem to be more interested in discussions than in fixing RC bugs, my period of rest seems to continue for a while. So thanks for all these valuable discussions, and really thanks to the few people who still take care of Trixie. This month I accepted 146 and rejected 10 packages. The overall number of packages that got accepted was 147.

6 June 2025

Reproducible Builds: Reproducible Builds in May 2025

Welcome to our 5th report from the Reproducible Builds project in 2025! Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. If you are interested in contributing to the Reproducible Builds project, please do visit the Contribute page on our website. In this report:
  1. Security audit of Reproducible Builds tools published
  2. When good pseudorandom numbers go bad
  3. Academic articles
  4. Distribution work
  5. diffoscope and disorderfs
  6. Website updates
  7. Reproducibility testing framework
  8. Upstream patches

Security audit of Reproducible Builds tools published The Open Technology Fund's (OTF) security partner Security Research Labs recently conducted an audit of some specific parts of tools developed by Reproducible Builds. This form of security audit, sometimes called a "whitebox" audit, is a form of testing in which auditors have complete knowledge of the item being tested. The auditors assessed the various codebases for resilience against hacking, with key areas including differential report formats in diffoscope, common client web attacks, command injection, privilege management, hidden modifications in the build process and attack vectors that might enable denials of service. The audit focused on three core Reproducible Builds tools: diffoscope, a Python application that unpacks archives of files and directories and transforms their binary formats into human-readable form in order to compare them; strip-nondeterminism, a Perl program that improves reproducibility by stripping out non-deterministic information such as timestamps or other elements introduced during packaging; and reprotest, a Python application that builds source code multiple times in various environments in order to test reproducibility. OTF's announcement contains more of an overview of the audit, and the full 24-page report is available in PDF form as well.

When good pseudorandom numbers go bad Danielle Navarro published an interesting and amusing article on their blog on When good pseudorandom numbers go bad. Danielle sets the stage as follows:
[Colleagues] approached me to talk about a reproducibility issue they'd been having with some R code. They'd been running simulations that rely on generating samples from a multivariate normal distribution, and despite doing the prudent thing and using set.seed() to control the state of the random number generator (RNG), the results were not computationally reproducible. The same code, executed on different machines, would produce different random numbers. The numbers weren't just a little bit different in the way that we've all wearily learned to expect when you try to force computers to do mathematics. They were painfully, brutally, catastrophically, irreproducible different. Somewhere, somehow, something broke.
Thanks to David Wheeler for posting about this article on our mailing list.

Academic articles There were two scholarly articles published this month that related to reproducibility: Daniel Hugenroth and Alastair R. Beresford of the University of Cambridge in the United Kingdom and Mario Lins and René Mayrhofer of Johannes Kepler University in Linz, Austria published an article titled Attestable builds: compiling verifiable binaries on untrusted systems using trusted execution environments. In their paper, they:
present attestable builds, a new paradigm to provide strong source-to-binary correspondence in software artifacts. We tackle the challenge of opaque build pipelines that disconnect the trust between source code, which can be understood and audited, and the final binary artifact, which is difficult to inspect. Our system uses modern trusted execution environments (TEEs) and sandboxed build containers to provide strong guarantees that a given artifact was correctly built from a specific source code snapshot. As such it complements existing approaches like reproducible builds which typically require time-intensive modifications to existing build configurations and dependencies, and require independent parties to continuously build and verify artifacts.
The authors compare attestable builds with reproducible builds by noting that "an attestable build requires only minimal changes to an existing project, and offers nearly instantaneous verification of the correspondence between a given binary and the source code and build pipeline used to construct it", and proceed by determining that "the overhead (42 seconds start-up latency and 14% increase in build duration) is small in comparison to the overall build time."
Timo Pohl, Pavel Novák, Marc Ohm and Michael Meier have published a paper called Towards Reproducibility for Software Packages in Scripting Language Ecosystems. The authors note that past research into Reproducible Builds has focused primarily on compiled languages and their ecosystems, with a further emphasis on Linux distribution packages:
However, the popular scripting language ecosystems potentially face unique issues given the systematic difference in distributed artifacts. This Systemization of Knowledge (SoK) [paper] provides an overview of existing research, aiming to highlight future directions, as well as chances to transfer existing knowledge from compiled language ecosystems. To that end, we work out key aspects in current research, systematize identified challenges for software reproducibility, and map them between the ecosystems.
Ultimately, the three authors find that the literature is "sparse", focusing on a few individual problems and ecosystems, and therefore identify space for more critical research.

Distribution work In Debian this month:
Hans-Christoph Steiner of the F-Droid catalogue of open source applications for the Android platform published a blog post on Making reproducible builds visible. Noting that "Reproducible builds are essential in order to have trustworthy software", Hans also mentions that F-Droid has been delivering reproducible builds since 2015. However:
There is now a Reproducibility Status link for each app on f-droid.org, listed on every app's page. Our verification server shows a positive or a negative mark based on its build results, where the positive mark means our rebuilder reproduced the same APK file and the negative one means it did not. The IzzyOnDroid repository has developed a more elaborate system of badges which displays a badge for each rebuilder. Additionally, there is a sketch of a five-level graph to represent some aspects about which processes were run.
Hans compares the approach with projects such as Arch Linux and Debian that provide developer-facing tools to give feedback about reproducible builds, but do not display information about reproducible builds in the user-facing interfaces like the package management GUIs.
Arnout Engelen of the NixOS project has been working on reproducing the minimal installation ISO image. This month, Arnout has successfully reproduced the build of the minimal image for the 25.05 release without relying on the binary cache. Work on also reproducing the graphical installer image is ongoing.
In openSUSE news, Bernhard M. Wiedemann posted another monthly update for their work there.
Lastly in Fedora news, Jelle van der Waa opened issues tracking reproducible issues in Haskell documentation, Qt6 recording the host kernel and R packages recording the current date. The R packages can be made reproducible with packaging changes in Fedora.

diffoscope & disorderfs diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 295, 296 and 297 to Debian:
  • Don't rely on the zipdetails --walk argument being available, and only add that argument on newer versions after we test for that. [ ]
  • Review and merge support for NuGet packages from Omair Majid. [ ]
  • Update copyright years. [ ]
  • Merge support for an lzma comparator from Will Hollywood. [ ][ ]
Chris also merged an impressive changeset from Siva Mahadevan to make disorderfs more portable, especially on FreeBSD. disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues [ ]. This was then uploaded to Debian as version 0.6.0-1. Lastly, Vagrant Cascadian updated diffoscope in GNU Guix to version 296 [ ][ ] and 297 [ ][ ], and disorderfs to version 0.6.0 [ ][ ].

Website updates Once again, there were a number of improvements made to our website this month including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. However, Holger Levsen posted to our mailing list this month in order to bring a wider awareness to funding issues faced by the Oregon State University (OSU) Open Source Lab (OSL). As mentioned on OSL's public post, "recent changes in university funding makes our current funding model no longer sustainable" [and that] "unless we secure $250,000 in committed funds, the OSL will shut down later this year". As Holger notes in his post to our mailing list, the Reproducible Builds project relies on hardware nodes hosted there. Nevertheless, Lance Albertson of OSL posted an update to the funding situation later in the month with broadly positive news.
Separate to this, there were various changes to the Jenkins setup this month, which is used as the backend driver for both tests.reproducible-builds.org and reproduce.debian.net, including:
  • Migrating the central jenkins.debian.net server from AMD Opteron to Intel Haswell CPUs. Thanks to IONOS for hosting this server since 2012.
  • After testing it for almost ten years, the i386 architecture has been dropped from tests.reproducible-builds.org. This is because, with the upcoming release of Debian trixie, i386 is no longer supported as a regular architecture: there will be no official kernel and no Debian installer for i386 systems. As a result, a large number of nodes hosted by Infomaniak have been retooled from i386 to amd64.
  • Another node, ionos17-amd64.debian.net, which is used for verifying packages for all.reproduce.debian.net (hosted by IONOS) has had its memory increased from 40 to 64GB, and the number of cores doubled to 32 as well. In addition, two nodes generously hosted by OSUOSL have had their memory doubled to 16GB.
  • Lastly, we have been granted access to more riscv64 architecture boards, so now we have seven such nodes, all with 16GB memory and 4 cores that are verifying packages for riscv64.reproduce.debian.net. Many thanks to PLCT Lab, ISCAS for providing those.

Outside of this, a number of smaller changes were also made by Holger Levsen:
  • reproduce.debian.net-related:
    • Only use two workers for the ppc64el architecture due to RAM size. [ ]
    • Monitor nginx_request and nginx_status with the Munin monitoring system. [ ][ ]
    • Detect various variants of network and memory errors. [ ][ ][ ][ ]
    • Add a prominent link to reproducible-builds.org. [ ]
    • Add a rebuilderd-cache-cleanup.service and run it daily via timer. [ ][ ][ ][ ][ ]
    • Be more verbose what sources are being downloaded. [ ]
    • Correctly deal with packages with an epoch in their version [ ] and deal with binNMUs versions with an epoch as well [ ][ ].
    • Document how to reschedule all other errors on all archs. [ ]
    • Misc documentation improvements. [ ][ ][ ][ ]
    • Include the $HOSTNAME variable in the rebuilderd logfiles. [ ]
    • Install the equivs package on all worker nodes. [ ][ ]
  • Jenkins nodes:
    • Permit the sudo tool to fix up permission issues. [ ][ ]
    • Document how to manage diskspace with OpenStack. [ ]
    • Ignore a number of spurious monitoring errors on riscv64, FreeBSD, etc.. [ ][ ][ ][ ]
    • Install ntpsec-ntpdate (instead of ntpdate) as the former is available on Debian trixie and bookworm. [ ][ ]
    • Use the same SSH ControlPath for all nodes. [ ]
    • Make sure the munin user uses the same SSH config as the jenkins user. [ ]
  • tests.reproducible-builds.org-related:
    • Disable testing of the i386 architecture. [ ][ ][ ][ ][ ]
    • Document the current disk usage. [ ][ ]
    • Address some image placement now that we only test three architectures. [ ]
    • Keep track of build performance. [ ]
  • Misc:
    • Fix a (harmless) typo in the multiarch_versionskew script. [ ]
In addition, Jochen Sprickerhof made a series of changes related to reproduce.debian.net:
  • Add out of memory detection to the statistics page. [ ]
  • Reverse the sorting order on the statistics page. [ ][ ][ ][ ]
  • Improve the spacing between statistics groups. [ ]
  • Update a (hard-coded) line number in error message detection pertaining to a debrebuild line number. [ ]
    • Support Debian unstable in the rebuilder-debian.sh script. [ ]
  • Rely on rebuildctl to sync only arch-specific packages. [ ][ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. This month, we wrote a large number of such patches, including:

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Dirk Eddelbuettel: #49: The Two Cultures of Deploying Statistical Software

Welcome to post 49 in the R4 series. "The Two Cultures" is a term first used by C.P. Snow in a 1959 speech and monograph focused on the split between humanities and the sciences. Decades later, the term was (quite famously) re-used by Leo Breiman in a (somewhat prophetic) 2001 article about the split between "data models" and "algorithmic models". In this note, we argue that statistical computing practice and deployment can also be described via this Two Cultures moniker.

Referring to the term linking these foundational pieces is of course headline bait. Yet when preparing for the discussion of r2u in the invited talk in Mons (video, slides), it occurred to me that there is in fact a wide gulf between two alternative approaches of using R and, specifically, deploying packages.

On the one hand we have the approach described by my friend Jeff as "you go to the Apple store, buy the nicest machine you can afford, install what you need and then never ever touch it". A computer / workstation / laptop is seen as an immutable object where every attempt at change may lead to breakage, instability, and general chaos, and is hence best avoided. If you know Jeff, you know he exaggerates. Maybe only slightly though. Similarly, an entire sub-culture of users striving for reproducibility (and sometimes also replicability) does the same. This is for example evidenced by the popularity of package renv by Rcpp collaborator and pal Kevin. The expressed hope is that by nailing down a (sub)set of packages, outcomes are constrained to be unchanged. Hope springs eternal, clearly. (Personally, if need be, I do the same with Docker containers and their respective Dockerfile.)

On the other hand, rolling is a fundamentally different approach. One (well known) example is Google building everything "at @HEAD". The entire (ginormous) code base is considered as a mono-repo which at any point in time is expected to be buildable as is. All changes made are pre-tested to be free of side effects to other parts. This sounds hard, and likely is more involved than an alternative of a "whatever works" approach of independent changes and just hoping for the best.

Another example is a rolling (Linux) distribution such as Debian. Changes are first committed to a staging place (Debian calls this the "unstable" distribution) and, if no side effects are seen, propagated after a fixed number of days to the rolling distribution (called "testing"). With this mechanism, "testing" should always be installable too. And based on the rolling distribution, at certain times (for Debian roughly every two years) a release is made from "testing" into "stable" (following more elaborate testing). The released stable version is then immutable (apart from fixes for seriously grave bugs and of course security updates). So this provides the connection between frequent and rolling updates, and produces an immutable fixed set: a release.

This Debian approach has been influential for many other projects, including CRAN, as can be seen in aspects of its system providing a rolling set of curated packages. Instead of a staging area for all packages, extensive tests are made for candidate packages before adding an update. This aims to ensure quality and consistency, and has worked remarkably well. We argue that it has clearly contributed to the success and renown of CRAN.

Now, when accessing CRAN from R, we fundamentally have two accessor functions. But seemingly only one is widely known and used.
In what we may call the "Jeff model", everybody is happy to deploy install.packages() for initial installations. That sentiment is clearly expressed by this bsky post:
One of my #rstats coding rituals is that every time I load a @vincentab.bsky.social package I go check for a new version because invariably it's been updated with 18 new major features
And that is why we have two cultures. Because some of us, yours truly included, also use update.packages() at recurring (frequent!!) intervals: daily or near-daily for me. The goodness and, dare I say, gift of packages is not limited to those by my pal Vincent. CRAN updates all the time, and updates are (generally) full of (usually excellent) changes, fixes, or new features. So update frequently! Doing (many but small) updates (frequently) is less invasive than (large, infrequent) "waterfall"-style changes!

But the fear of change, or disruption, is clearly pervasive. One can only speculate why. Is the experience of updating so painful on other operating systems? Is it maybe a lack of exposure / tutorials on best practices? These Two Cultures coexist. When I delivered the talk in Mons, I briefly asked for a show of hands among all the R users in the audience to see who in fact does use update.packages() regularly. And maybe a handful of hands went up: surprisingly few!

Now back to the context of installing packages: clearly, only installing has its uses. For continuous integration checks we generally install into ephemeral temporary setups. Some debugging work may be with one-off container or virtual machine setups. But all other uses may well be under maintained setups. So consider calling update.packages() once in a while. Or even weekly or daily. The rolling feature of CRAN is a real benefit, and it is there for the taking and enrichment of your statistical computing experience.

So to sum up, the real power is to use install.packages() for initial installations, and update.packages() to keep the setup current. For both tasks, relying on binary installations accelerates and eases the process. And where available, using binary installation with system-dependency support as r2u does makes it easier still, following the r2u slogan of "Fast. Easy. Reliable. Pick All Three." Give it a try!
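As a minimal illustration of the two accessors (the package name is just an example I picked, not a recommendation from the post):

install.packages("digest")     # one-off: install a package and its dependencies
update.packages(ask = FALSE)   # recurring: refresh everything already installed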

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub.

5 June 2025

Matthew Garrett: Twitter's new encrypted DMs aren't better than the old ones

(Edit: Twitter could improve this significantly with very few changes - I wrote about that here. It's unclear why they'd launch without doing that, since it entirely defeats the point of using HSMs)

When Twitter[1] launched encrypted DMs a couple of years ago, it was the worst kind of end-to-end encrypted - technically e2ee, but in a way that made it relatively easy for Twitter to inject new encryption keys and get everyone's messages anyway. It was also lacking a whole bunch of features such as "sending pictures", so the entire thing was largely a waste of time. But a couple of days ago, Elon announced the arrival of "XChat", a new encrypted message platform built on Rust with (Bitcoin style) encryption, whole new architecture. Maybe this time they've got it right?

tl;dr - no. Use Signal. Twitter can probably obtain your private keys, and admit that they can MITM you and have full access to your metadata.

The new approach is pretty similar to the old one in that it's based on pretty straightforward and well tested cryptographic primitives, but merely using good cryptography doesn't mean you end up with a good solution. This time they've pivoted away from using the underlying cryptographic primitives directly and into higher level abstractions, which is probably a good thing. They're using Libsodium's boxes for message encryption, which is, well, fine? It doesn't offer forward secrecy (if someone's private key is leaked then all existing messages can be decrypted) so it's a long way from the state of the art for a messaging client (Signal's had forward secrecy for over a decade!), but it's not inherently broken or anything. It is, however, written in C, not Rust[2].

That's about the extent of the good news. Twitter's old implementation involved clients generating keypairs and pushing the public key to Twitter. Each client (a physical device or a browser instance) had its own private key, and messages were simply encrypted to every public key associated with an account. This meant that new devices couldn't decrypt old messages, and also meant there was a maximum number of supported devices and terrible scaling issues and it was pretty bad. The new approach generates a keypair and then stores the private key using the Juicebox protocol. Other devices can then retrieve the private key.

Doesn't this mean Twitter has the private key? Well, no. There's a PIN involved, and the PIN is used to generate an encryption key. The stored copy of the private key is encrypted with that key, so if you don't know the PIN you can't decrypt the key. So we brute force the PIN, right? Juicebox actually protects against that - before the backend will hand over the encrypted key, you have to prove knowledge of the PIN to it (this is done in a clever way that doesn't directly reveal the PIN to the backend). If you ask for the key too many times while providing the wrong PIN, access is locked down.

But this is true only if the Juicebox backend is trustworthy. If the backend is controlled by someone untrustworthy[3] then they're going to be able to obtain the encrypted key material (even if it's in an HSM, they can simply watch what comes out of the HSM when the user authenticates if there's no validation of the HSM's keys). And now all they need is the PIN. Turning the PIN into an encryption key is done using the Argon2id key derivation function, using 32 iterations and a memory cost of 16MB (the Juicebox white paper says 16KB, but (a) that's laughably small and (b) the code says 16 * 1024 in an argument that takes kilobytes), which makes it computationally and moderately memory expensive to generate the encryption key used to decrypt the private key. How expensive? Well, on my (not very fast) laptop, that takes less than 0.2 seconds. How many attempts do I need to crack the PIN? Twitter's chosen to fix that to 4 digits, so a maximum of 10,000. (10,000 attempts at under 0.2 seconds each is roughly half an hour of work on a single core, and the attempts parallelize trivially.) You aren't going to need many machines running in parallel to bring this down to a very small amount of time, at which point private keys can, to a first approximation, be extracted at will.

Juicebox attempts to defend against this by supporting sharding your key over multiple backends, and only requiring a subset of those to recover the original. Twitter uses three backends and requires data from at least two, but all the backends used are under x.com so are presumably under Twitter's direct control. Trusting the keystore without needing to trust whoever's hosting it requires a trustworthy communications mechanism between the client and the keystore. If the device you're talking to can prove that it's an HSM that implements the attempt limiting protocol and has no other mechanism to export the data, this can be made to work. Signal makes use of something along these lines using Intel SGX for contact list and settings storage and recovery, and Google and Apple also have documentation about how they handle this in ways that make it difficult for them to obtain backed up key material. Twitter has no documentation of this, and as far as I can tell does nothing to prove that the backend is in any way trustworthy. (Edit to add: The Juicebox API does support authenticated communication between the client and the HSM, but that relies on you having some way to prove that the public key you're presented with corresponds to a private key that only exists in the HSM. Twitter gives you the public key whenever you communicate with them, so even if they've implemented this properly you can't prove they haven't made up a new key and MITMed you the next time you retrieve your key)

On the plus side, Juicebox is written in Rust, so Elon's not 100% wrong. Just mostly wrong.

But ok, at least you've got viable end-to-end encryption even if someone can put in some (not all that much, really) effort to obtain your private key and render it all pointless? Actually no, since you're still relying on the Twitter server to give you the public key of the other party and there's no out of band mechanism to do that or verify the authenticity of that public key at present. Twitter can simply give you a public key where they control the private key, decrypt the message, and then reencrypt it with the intended recipient's key and pass it on. The support page makes it clear that this is a known shortcoming and that it'll be fixed at some point, but they said that about the original encrypted DM support and it never was, so that's probably dependent on whether Elon gets distracted by something else again. And the server knows who and when you're messaging even if they haven't bothered to break your private key, so there's a lot of metadata leakage.

Signal doesn't have these shortcomings. Use Signal.

[1] I'll respect their name change once Elon respects his daughter

[2] There are implementations written in Rust, but Twitter's using the C one with these JNI bindings

[3] Or someone nominally trustworthy but who's been compelled to act against your interests - even if Elon were absolutely committed to protecting all his users, his overarching goals for Twitter require him to have legal presence in multiple jurisdictions that are not necessarily above placing employees in physical danger if there's a perception that they could obtain someone's encryption keys


4 June 2025

Gunnar Wolf: The subjective value of privacy - Assessing individuals' calculus of costs and benefits in the context of state surveillance

This post is an unpublished review for The subjective value of privacy - Assessing individuals' calculus of costs and benefits in the context of state surveillance
Internet users, software developers, academics, entrepreneurs: basically everybody is now aware of the importance of considering privacy as a core part of our online experience. User demand, and various national or regional laws, have made privacy a continuously present subject. And privacy is such an all-encompassing, complex topic that the angles from which it can be studied seem never to end; I recommend computer networking-oriented newcomers to the topic to refer to Brian Kernighan's excellent work [1]. However, how do regular people like ourselves, in our many capacities, feel about privacy? Lukas Antoine presents a series of experiments aiming at better understanding how people throughout the world understand privacy, and when privacy is held as more or less important than security in different aspects. In particular, privacy is often portrayed as a value set at tension against surveillance, and particularly state surveillance, in the name of security: conventional wisdom presents the idea of a privacy calculus. That is, it is often assumed that individuals continuously evaluate the costs and benefits of divulging their personal data, sharing data when they expect a positive net outcome, and denying it otherwise. This framework has been accepted for decades, and the author wishes to challenge it. This book is clearly his doctoral thesis in political science, and its contents are as thorough as expected in this kind of product. The author presents three empirical studies based on cross-survey analysis. The first experiment explores the security justifications for surveillance and how they influence support for it. The second one examines whether the stance on surveillance can be made dependent on personal convenience or financial cost. The third study explores whether privacy attitude is context-dependent or can be seen as a stable personality trait. The studies aim to address the shortcomings of the published literature in the field, mainly: (a) the lack of comprehensive research on state surveillance, needed for better understanding privacy appreciation; (b) while several studies have tackled the subjective measure of privacy, there is a lack of cross-national studies to explain wide-ranging phenomena; (c) most studies in this regard are based on population-based surveys, which cannot establish causal relationships; (d) a seemingly blind acceptance of the privacy calculus mentioned above, with no strong evidence that it accurately measures people's motivations for disclosing or withholding their data. The specific take, including the framing of the tension between privacy and surveillance, has long been studied, as can be seen in Steven Nock's 1993 book [2], but as Sannon's 2022 article shows [3], social and technological realities require our understanding to be continuously kept up to date. The book is full of theoretical references and does a very good job of explaining the path followed by the author. It is, though, a heavy read, and, for people not coming from the social sciences tradition, it leads to the occasional feeling of being lost. The conceptual and theoretical frameworks and the presented studies are thorough and clear. The author is honest in explaining when the data points at some of his hypotheses being disproven, while others are confirmed. The aim of the book is for people digging deep into this topic.
Personally, I have authored several works on different aspects of privacy (such as a book [4] and a magazine issue [5]), but this book did get me thinking about many issues I had not previously considered. Looking for comparable works, I find that the chapter organization of Friedewald et al.'s 2017 book [6] follows a similar line of thought. My only complaint would be that, for a publication from such a highly prestigious publisher, little attention has been paid to editorial aspects: sub-subsection depth is often excessive and unclear. Also, when publishing monographs based on doctoral works, it is customary to no longer refer to the work as a thesis and to soften some of the formal requirements such a work often has, with the aim of producing a gentler and more readable book; this book reads just like the mass production of an (otherwise very interesting and well-made) thesis. References:

Gunnar Wolf: Humanities and big data in Ibero-America: Theory, methodology and practical applications

This post is an unpublished review for Humanities and big data in Ibero-America: Theory, methodology and practical applications.
Digital humanities is a young though established field. It deals with the different ways in which digital data manipulation techniques can be applied and used to analyze subjects that are identified as belonging to the humanities. Although most often used to analyze different aspects of literature or for social network analysis, it can also be applied to other humanistic disciplines or artistic expressions. Digital humanities employs many tools, but those categorized as big data are among the most frequently employed. This book samples different takes on digital humanities, with the particularity that it focuses on Ibero-American uses. It is worth noting that this book is the second in a series of four volumes, published or set to be published between 2022 and 2026. Being the output of a survey of the field, I perceive this book to be targeted towards fellow digital humanists: people interested in applying computational methods to further understand and research topics in the humanities. It is not a technical book in the sense Computer Science people would recognize as such, but several of the presented works do benefit from understanding some technical concepts. The 12 articles (plus an introduction) that make up this book are organized in three parts: (1) Theoretical Framework presents the ideas and techniques of data science (the tools for handling big data) and explores how data science can contribute to literary analysis, all while noting that many such techniques are usually frowned upon in Latin America because data science "smells neoliberal"; (2) Methodological Issues looks at specific issues through the lens of how they can be applied to big data, with specific attention given to works in Spanish; and (3) Practical Applications analyzes specific Spanish works and communities based on big data techniques. Several chapters treat a recurring theme: the simultaneous resistance and appropriation of big data by humanists. For example, at least three of the chapters describe the tension between humanism ("aesthesis") and cold, number-oriented data analysis ("mathesis"). The analyzed works of Parts 2 and 3 are interesting and relatively easy to follow. Some ideology inescapably shines through in several word choices, starting with the book's and the series' name, which refers to the Spanish-speaking regions as "Ibero-America", a term often seen as Eurocentric, in contrast with "Latin America", which is much more widely used throughout the region. I will end with some notes about the specific versions of the book I reviewed. I read both an EPUB version and a print copy. The EPUB did not include links for easy navigation to the footnotes; that is, the typographical superscripts are not hyperlinked to the location of the notes, so it is very impractical to try to follow them. The print version (unlike the EPUB) did not have an index; that is, the six pages before the introduction are missing from the print copy I received. For a book such as this one, not having an index hampers the ease of reading and referencing.

1 June 2025

Emmanuel Kasper: ARM64 desktop as daily driver

I have bought myself an expensive ARM64 workstation, the System76 Thelio Astra, that I intend to use as my main desktop computer for the next 15 years, running Debian. The box is basically a server motherboard repurposed in a good desktop chassis. In Europe it seems you can order similar ready-made systems here. The hardware is well supported by Debian 12 and Debian testing. I had some initial issues with graphics, due to the board being designed for server use, but I am solving these as we go. Annoyances I got so far:
  • When you power on the machine using the power supply switch, you have to wait for the BMC to finish its startup sequence before the front power button does anything. As starting the BMC can take 90 seconds, I initially thought the machine was dead on arrival.
  • The default graphical output is redirected to the BMC Serial over LAN, which means that if you want to install Debian using an attached display, you need to force the output to the attached display by passing console=tty0 as an installer parameter.
  • Finally, the Xorg Nouveau driver does not work with the Nvidia A400 GPU I got with the machine. After passing nomodeset as a kernel parameter, I can force Xorg to use an unaccelerated framebuffer, which at least displays something. I passed this parameter to the installer so that I could install in graphical mode. The driver from Nvidia works, but I would very much like to get Nouveau running. (A sketch of making these parameters permanent via GRUB follows this list.)
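As a side note, on the installed system such kernel parameters can usually be made permanent via GRUB (the default bootloader on a Debian arm64 EFI install). A rough sketch, to be adjusted to what your hardware actually needs:
# in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet nomodeset console=tty0"
# then regenerate the GRUB configuration
sudo update-grub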
Ugly point
  • A server motherboard, as we said. This means there is NO suspend to RAM; you have to power off if you don't want to keep the machine on all the time. As the boot sequence is long (server board, again), I am pondering setting a startup time in the UEFI firmware to turn the box on at specific usage times.
Good points
  • The firmware of the machine is standard EFI, which means you can use the Debian arm64 installer on a USB stick straight away, without any kind of device tree / bootloader fiddling.
  • The 3 NICs, Wi-Fi, and Bluetooth were all recognized on first boot.
  • I was afraid the machine would be loud. However, it is quiet: you hear the humming of a fan, but it is quieter than most desktops I have owned, from the Atari TT to the all-in-one Lenovo M92z I used for 10 years. I am certainly not a hardware and cooling specialist, but it seems to me the quietness comes from slow-rotating but very large fans.
  • Due to the clean design of Linux and Debian, thousands of packages work correctly on ARM64, starting with the Gnome desktop environment and Firefox.
  • The documentation from System76 is fine; their Ubuntu 20.04 setup guide was helpful for understanding the needed parameters mentioned above.
Update: The display is working correctly with the nouveau driver after installing the non-free Nvidia firmware. See the Debian wiki.

29 May 2025

Arthur Diniz: Bringing Kubernetes Back to Debian

I ve been part of the Debian Project since 2019, when I attended DebConf held in Curitiba, Brazil. That event sparked my interest in the community, packaging, and how Debian works as a distribution. In the early years of my involvement, I contributed to various teams such as the Python, Golang and Cloud teams, packaging dependencies and maintaining various tools. However, I soon felt the need to focus on packaging software I truly enjoyed, tools I was passionate about using and maintaining. That s when I turned my attention to Kubernetes within Debian.

A Broken Ecosystem The Kubernetes packaging situation in Debian had been problematic for some time. Given its large codebase and complex dependency tree, the initial packaging approach involved vendorizing all dependencies. While this allowed a somewhat functional package to be published, it introduced several long-term issues, especially security concerns. Vendorized packages bundle third-party dependencies directly into the source tarball. When vulnerabilities arise in those dependencies, it becomes difficult for Debian s security team to patch and rebuild affected packages system-wide. This approach broke Debian s best practices, and it eventually led to the abandonment of the Kubernetes source package, which had stalled at version 1.20.5. Due to this abandonment, critical bugs emerged and the package was removed from Debian s testing channel, as we can see in the package tracker.

New Debian Kubernetes Team Around this time, I became a Debian Maintainer (DM), with permissions to upload certain packages. I saw an opportunity to both contribute more deeply to Debian and to fix Kubernetes packaging. In early 2024, just before DebConf Busan in South Korea, I founded the Debian Kubernetes Team. The mission of the team was to repackage Kubernetes in a maintainable, security-conscious, and Debian-compliant way. At DebConf, I shared our progress with the broader community and received great feedback and more visibility, along with people interested in contributing to the team. Our first task was to migrate existing Kubernetes-related tools such as kubectx, kubernetes-split-yaml and kubetail into a dedicated namespace on Salsa, Debian's GitLab instance. Many of these tools were stored across different teams (like the Go team), and consolidating them helped us organize development and focus our efforts.

De-vendorizing Kubernetes Our main goal was to un-vendorize Kubernetes and bring it up-to-date with upstream releases. This meant:
  • Removing the vendor directory and all embedded third-party code.
  • Trimming the build scope to focus solely on building kubectl, the Kubernetes CLI.
  • Using Files-Excluded in debian/copyright to cleanly drop unneeded files during source imports.
  • Rebuilding the dependency tree, ensuring all Go modules were separately packaged in Debian.
We used uscan, a standard Debian packaging tool that fetches upstream tarballs and prepares them accordingly. The Files-Excluded directive in our debian/copyright file instructed uscan to automatically remove unnecessary files during the repackaging process:
$ uscan
Newest version of kubernetes on remote site is 1.32.3, specified download version is 1.32.3
Successfully repacked ../v1.32.3 as ../kubernetes_1.32.3+ds.orig.tar.gz, deleting 30616 files from it.
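To illustrate the mechanism (the patterns below are illustrative, not copied verbatim from the real package), the relevant stanza in debian/copyright looks something like this:
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Source: https://github.com/kubernetes/kubernetes
Files-Excluded: vendor/*
                third_party/*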
The results were dramatic. By comparing the original upstream tarball with our repackaged version, we can see that our approach reduced the tarball size by over 75%:
$ du -h upstream-v1.32.3.tar.gz kubernetes_1.32.3+ds.orig.tar.gz
14M	upstream-v1.32.3.tar.gz
3.2M	kubernetes_1.32.3+ds.orig.tar.gz
This significant reduction wasn t just about saving space. By removing over 30,000 files, we simplified the package, making it more maintainable. Each dependency could now be properly tracked, updated, and patched independently, resolving the security concerns that had plagued the previous packaging approach.

Dependency Graph To give you an idea of the complexity involved in packaging Kubernetes for Debian, the image below is a dependency graph generated with debtree, visualizing all the Go modules and other dependencies required to build the kubectl binary. [image: kubectl-depgraph] This web of nodes and edges represents every module and its relationship during the compilation process of kubectl. Each box is a Debian package, and the lines connecting them show how deeply intertwined the ecosystem is. What might look like a mess of blue spaghetti is actually a clear demonstration of the vast and interconnected upstream world that tools like kubectl rely on. But more importantly, this graph is a testament to the effort that went into making kubectl build entirely using Debian-packaged dependencies only, no vendoring, no downloading from the internet, no proprietary blobs.
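For the curious, a graph along these lines can be produced with debtree and Graphviz; the exact invocation below is only a sketch, not necessarily the one used for the image above, and it requires deb-src entries in your apt sources so that build dependency information is available:
$ apt install -q --yes debtree graphviz
$ debtree --build-dep kubectl | dot -T svg -o kubectl-depgraph.svg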

Upstream Version 1.32.3 and Beyond After nearly two years of work, we successfully uploaded version 1.32.3+ds of kubectl to Debian unstable. kubernetes/-/merge_requests/1 The new package also includes:
  • Zsh, Fish, and Bash completions installed automatically
  • Man pages and metadata for improved discoverability
  • Full integration with kind and docker for testing purposes

Integration Testing with Autopkgtest To ensure the reliability of kubectl in real-world scenarios, we developed a new autopkgtest suite that runs integration tests using real Kubernetes clusters created via Kind. Autopkgtest is a Debian tool used to run automated tests on binary packages. These tests are executed after the package is built but before it is accepted into the Debian archive, helping catch regressions and integration issues early in the packaging pipeline. Our test workflow validates kubectl by performing the following steps (a sketch of the test metadata is shown after the list):
  • Installing Kind and Docker as test dependencies.
  • Spinning up two local Kubernetes clusters.
  • Switching between cluster contexts to ensure multi-cluster support.
  • Deploying and scaling a sample nginx application using kubectl.
  • Cleaning up the entire test environment to avoid side effects.
  • debian/tests/kubectl.sh
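A rough sketch of what the test metadata and the script skeleton could look like follows; the dependency package names and restrictions here are illustrative assumptions rather than the actual contents of the package's debian/tests files:
# debian/tests/control (sketch)
Tests: kubectl
Depends: @, kind, docker.io
Restrictions: needs-root, isolation-machine, allow-stderr
#!/bin/sh
# debian/tests/kubectl.sh (skeleton)
set -eu
kind create cluster --name one
kind create cluster --name two
kubectl config use-context kind-one
kubectl create deployment web --image=nginx
kubectl scale deployment web --replicas=3
kubectl config use-context kind-two
kubectl get nodes
kind delete cluster --name one
kind delete cluster --name two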

Popcon: Measuring Adoption To measure real-world usage, we rely on data from Debian's popularity contest (popcon), which gives insight into how many users have each binary installed. [images: popcon-graph, popcon-table] Here's what the data tells us:
  • kubectl (new binary): Already installed on 2,124 systems.
  • golang-k8s-kubectl-dev: This is the Go development package (a library), useful for other packages and developers who want to interact with Kubernetes programmatically.
  • kubernetes-client: The legacy package that kubectl is replacing. We expect this number to decrease in future releases as more systems transition to the new package.
Although the popcon data shows activity for kubectl before the official Debian upload date, it s important to note that those numbers represent users who had it installed from upstream source-lists, not from the Debian repositories. This distinction underscores a demand that existed even before the package was available in Debian proper, and it validates the importance of bringing it into the archive.
Also worth mentioning: this number is not the real total number of installations, since users can choose not to participate in the popularity contest. So the actual adoption is likely higher than what popcon reflects.

Community and Documentation The team also maintains a dedicated wiki page which documents:
  • Maintained tools and packages
  • Contribution guidelines
  • Our roadmap for the upcoming Debian releases
https://debian-kubernetes.org

Looking Ahead to Debian 13 (Trixie) The next stable release of Debian will ship with kubectl version 1.32.3, built from a clean, de-vendorized source. This version includes nearly all the latest upstream features, and it will be the first time in years that Debian users can rely on an up-to-date, policy-compliant kubectl directly from the archive. Compared with upstream, our Debian package even delivers more out of the box, including shell completions, which upstream still requires users to generate manually. In 2025, the Debian Kubernetes team will continue expanding our packaging efforts for the Kubernetes ecosystem. Our roadmap includes:
  • kubelet: The primary node agent that runs on each node. This will enable Debian users to create fully functional Kubernetes nodes without relying on external packages.
  • kubeadm: A tool for creating Kubernetes clusters. With kubeadm in Debian, users will then be able to bootstrap minimum viable clusters directly from the official repositories.
  • helm: The package manager for Kubernetes that helps manage applications through Kubernetes YAML files defined as charts.
  • kompose: A conversion tool that helps users familiar with docker-compose move to Kubernetes by translating Docker Compose files into Kubernetes resources.

Final Thoughts This journey was only possible thanks to the amazing support of the debian-devel-br community and the collective effort of contributors who stepped up to package missing dependencies, fix bugs, and test new versions. Special thanks to:
  • Carlos Henrique Melara (@charles)
  • Guilherme Puida (@puida)
  • João Pedro Nobrega (@jnpf)
  • Lucas Kanashiro (@kanashiro)
  • Matheus Polkorny (@polkorny)
  • Samuel Henrique (@samueloph)
  • Sergio Cipriano (@cipriano)
  • Sergio Durigan Junior (@sergiodj)
I look forward to continuing this work, bringing more Kubernetes tools into Debian and improving the developer experience for everyone.

27 May 2025

Ravi Dwivedi: Singapore Visa Process

In November 2024, Badri and I applied for a Singapore visa to visit the country. To apply for a Singapore visa, you need to visit an authorized travel agent listed by the Singapore High Commission on their website. Unlike the Schengen visa (where only VFS can process applications), the Singapore visa has many authorized travel agents to choose from. I remember that the list mentioned as many as 25 authorized agents in Chennai. For my application, I randomly selected Ria International in Karol Bagh, New Delhi, from the list. Further, you need to apply no more than a month before your travel dates. As our travel dates were in December, we applied in November. For your reference, I submitted the following documents: I didn't have my photograph in the specified dimensions, so the travel agent took my photo on the spot. The visa application fee was ₹2,567. I submitted my application on a Saturday and received a call from the travel agent on Tuesday informing me that they had received my visa from the Singapore High Commission. The next day, I visited the travel agent's office and picked up my passport and a black-and-white copy of my e-visa. Later, I downloaded a PDF of my visa from the website mentioned on it and took a colored printout myself. Singapore granted me a multiple-entry visa valid for 2 months, even though I had applied for a 4-day single-entry visa. We were planning to add more countries to this trip; therefore, a multiple-entry visa would be helpful in case we wanted to use Singapore Airport, as it has good connectivity. However, it turned out that flights from Kuala Lumpur were much cheaper than those from Singapore, so we didn't enter Singapore again after leaving. Badri went through the same process, but entirely remotely: he posted the documents to the visa agency in Chennai and got his e-visa in a few days, followed by his original passport, which was delivered by courier. He got his photo taken in the dimensions mentioned above and printed with a matte finish, as instructed. However, the visa agents asked why his photo looked so faded. We don't know if they thought the matte finish was faded or what. To rectify this, Badri emailed them a digital copy of the photo (both the cropped version and the original) and they handled the reprinting on their end (which he never got to see). Before entering Singapore, we had to fill in an arrival card, an online form asking a few details about our trip, within 72 hours of our arrival in Singapore. That's it for now. Meet you in the next post. Thanks to Badri for reviewing the draft.

26 May 2025

Otto Kekäläinen: Creating Debian packages from upstream Git

In this post, I demonstrate the optimal workflow for creating new Debian packages in 2025, preserving the upstream git history. The motivation for this is to lower the barrier for sharing improvements to and from upstream, and to improve software provenance and supply-chain security by making it easy to inspect every change at any level using standard git tooling. Key elements of this workflow include: To make the instructions so concrete that anyone can repeat all the steps themselves on a real package, I demonstrate the steps by packaging the command-line tool Entr. It is written in C, has very few dependencies, and its final Debian source package structure is simple, yet it exemplifies all the important parts that go into a complete Debian package:
  1. Creating a new packaging repository and publishing it under your personal namespace on salsa.debian.org.
  2. Using dh_make to create the initial Debian packaging.
  3. Posting the first draft of the Debian packaging as a Merge Request (MR) and using Salsa CI to verify Debian packaging quality.
  4. Running local builds efficiently and iterating on the packaging process.

Create new Debian packaging repository from the existing upstream project git repository First, create a new empty directory, then clone the upstream Git repository inside it:
shell
mkdir debian-entr
cd debian-entr
git clone --origin upstreamvcs --branch master \
 --single-branch https://github.com/eradman/entr.git
Using a clean directory makes it easier to inspect the build artifacts of a Debian package, which will be output in the parent directory of the Debian source directory. The extra parameters given to git clone lay the foundation for the Debian packaging git repository structure where the upstream git remote name is upstreamvcs. Only the upstream main branch is tracked to avoid cluttering git history with upstream development branches that are irrelevant for packaging in Debian. Next, enter the git repository directory and list the git tags. Pick the latest upstream release tag as the commit to start the branch upstream/latest. This latest refers to the upstream release, not the upstream development branch. Immediately after, branch off the debian/latest branch, which will have the actual Debian packaging files in the debian/ subdirectory.
shell
cd entr
git tag # shows the latest upstream release tag was '5.6'
git checkout -b upstream/latest 5.6
git checkout -b debian/latest
%%{init: { 'gitGraph': { 'mainBranchName': 'master' } } }%%
gitGraph
checkout master
commit id: "Upstream 5.6 release" tag: "5.6"
branch upstream/latest
checkout upstream/latest
commit id: "New upstream version 5.6" tag: "upstream/5.6"
branch debian/latest
checkout debian/latest
commit id: "Initial Debian packaging"
commit id: "Additional change 1"
commit id: "Additional change 2"
commit id: "Additional change 3"
At this point, the repository is structured according to DEP-14 conventions, ensuring a clear separation between upstream and Debian packaging changes, but there are no Debian changes yet. Next, add the Salsa repository as a new remote called origin, the same as the default remote name in git.
shell
git remote add origin git@salsa.debian.org:otto/entr-demo.git
git push --set-upstream origin debian/latest
This is an important preparation step to later be able to create a Merge Request on Salsa that targets the debian/latest branch, which does not yet have any debian/ directory.

Launch a Debian Sid (unstable) container to run builds in To ensure that all packaging tools are of the latest versions, run everything inside a fresh Sid container. This has two benefits: you are guaranteed to have the most up-to-date toolchain, and your host system stays clean without getting polluted by various extra packages. Additionally, this approach works even if your host system is not Debian/Ubuntu.
shell
cd ..
podman run --interactive --tty --rm --shm-size=1G --cap-add SYS_PTRACE \
 --env='DEB*' --volume=$PWD:/tmp/test --workdir=/tmp/test debian:sid bash
Note that the container should be started from the parent directory of the git repository, not inside it. The --volume parameter will bind-mount the current directory inside the container. Thus all files created and modified are on the host system, and will persist after the container shuts down. Once inside the container, install the basic dependencies:
shell
apt update -q && apt install -q --yes git-buildpackage dpkg-dev dh-make

Automate creating the debian/ files with dh-make To create the files needed for the actual Debian packaging, use dh_make:
shell
# dh_make --packagename entr_5.6 --single --createorig
Maintainer Name : Otto Kekäläinen
Email-Address : otto@debian.org
Date : Sat, 15 Feb 2025 01:17:51 +0000
Package Name : entr
Version : 5.6
License : blank
Package Type : single
Are the details correct? [Y/n/q]

Done. Please edit the files in the debian/ subdirectory now.
Due to how dh_make works, the package name and version need to be written as a single underscore separated string. In this case, you should choose --single to specify that the package type is a single binary package. Other options would be --library for library packages (see libgda5 sources as an example) or --indep (see dns-root-data sources as an example). The --createorig will create a mock upstream release tarball (entr_5.6.orig.tar.xz) from the current release directory, which is necessary due to historical reasons and how dh_make worked before git repositories became common and Debian source packages were based off upstream release tarballs (e.g. *.tar.gz). At this stage, a debian/ directory has been created with template files, and you can start modifying the files and iterating towards actual working packaging.
shell
git add debian/
git commit -a -m "Initial Debian packaging"

Review the files The full list of files after the above steps with dh_make would be:
 -- entr
   -- LICENSE
   -- Makefile.bsd
   -- Makefile.linux
   -- Makefile.linux-compat
   -- Makefile.macos
   -- NEWS
   -- README.md
   -- configure
   -- data.h
   -- debian
     -- README.Debian
     -- README.source
     -- changelog
     -- control
     -- copyright
     -- gbp.conf
     -- entr-docs.docs
     -- entr.cron.d.ex
     -- entr.doc-base.ex
     -- manpage.1.ex
     -- manpage.md.ex
     -- manpage.sgml.ex
     -- manpage.xml.ex
     -- postinst.ex
     -- postrm.ex
     -- preinst.ex
     -- prerm.ex
     -- rules
     -- salsa-ci.yml.ex
     -- source
       -- format
     -- upstream
       -- metadata.ex
     -- watch.ex
   -- entr.1
   -- entr.c
   -- missing
     -- compat.h
     -- kqueue_inotify.c
     -- strlcpy.c
     -- sys
     -- event.h
   -- status.c
   -- status.h
   -- system_test.sh
 -- entr_5.6.orig.tar.xz
You can browse these files in the demo repository. The mandatory files in the debian/ directory are:
  • changelog,
  • control,
  • copyright,
  • and rules.
All the other files have been created for convenience so the packager has template files to work from. The files with the suffix .ex are example files that won t have any effect until their content is adjusted and the suffix removed. For detailed explanations of the purpose of each file in the debian/ subdirectory, see the following resources:
  • The Debian Policy Manual: Describes the structure of the operating system, the package archive and requirements for packages to be included in the Debian archive.
  • The Developer s Reference: A collection of best practices and process descriptions Debian packagers are expected to follow while interacting with one another.
  • Debhelper man pages: Detailed information of how the Debian package build system works, and how the contents of the various files in debian/ affect the end result.
As Entr, the package used in this example, is a real package that already exists in the Debian archive, you may want to browse the actual Debian packaging source at https://salsa.debian.org/debian/entr/-/tree/debian/latest/debian for reference. Most of these files have standardized formatting conventions to make collaboration easier. To automatically format the files following the most popular conventions, simply run wrap-and-sort -vast or debputy reformat --style=black.
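To give a concrete feel for the two most central files, here is a sketch of what they typically contain for a simple C program such as Entr; the field values are illustrative and not copied from the real entr package. A minimal debian/rules is just the standard debhelper boilerplate (note that the recipe line must be indented with a tab):
make
#!/usr/bin/make -f
%:
	dh $@
And debian/control declares the source package and the binary packages built from it:
debian
Source: entr
Section: utils
Priority: optional
Maintainer: Your Name <you@example.org>
Build-Depends: debhelper-compat (= 13)
Standards-Version: 4.7.0
Rules-Requires-Root: no
Homepage: https://eradman.com/entrproject/

Package: entr
Architecture: any
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: run arbitrary commands when files change
 (Long description omitted in this sketch.)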

Identify build dependencies The most common reason for builds to fail is missing dependencies. The easiest way to identify which Debian package ships the required dependency is using apt-file. If, for example, a build fails complaining that pcre2posix.h cannot be found or that libpcre2-posix.so is missing, you can use these commands:
shell
$ apt install -q --yes apt-file && apt-file update
$ apt-file search pcre2posix.h
libpcre2-dev: /usr/include/pcre2posix.h
$ apt-file search libpcre2-posix.so
libpcre2-dev: /usr/lib/x86_64-linux-gnu/libpcre2-posix.so
libpcre2-posix3: /usr/lib/x86_64-linux-gnu/libpcre2-posix.so.3
libpcre2-posix3: /usr/lib/x86_64-linux-gnu/libpcre2-posix.so.3.0.6
The output above implies that the debian/control should be extended to define a Build-Depends: libpcre2-dev relationship. There is also dpkg-depcheck that uses strace to trace the files the build process tries to access, and lists what Debian packages those files belong to. Example usage:
shell
dpkg-depcheck -b debian/rules build

Build the Debian sources to generate the .deb package After the first pass of refining the contents of the files in debian/, test the build by running dpkg-buildpackage inside the container:
shell
dpkg-buildpackage -uc -us -b
The options -uc -us will skip signing the resulting Debian source package and other build artifacts. The -b option will skip creating a source package and only build the (binary) *.deb packages. The output is very verbose and gives a large amount of context about what is happening during the build to make debugging build failures easier. In the build log of entr you will see for example the line dh binary --buildsystem=makefile. This and other dh commands can also be run manually if there is a need to quickly repeat only a part of the build while debugging build failures. To see what files were generated or modified by the build simply run git status --ignored:
shell
$ git status --ignored
On branch debian/latest

Untracked files:
 (use "git add <file>..." to include in what will be committed)
 debian/debhelper-build-stamp
 debian/entr.debhelper.log
 debian/entr.substvars
 debian/files

Ignored files:
 (use "git add -f <file>..." to include in what will be committed)
 Makefile
 compat.c
 compat.o
 debian/.debhelper/
 debian/entr/
 entr
 entr.o
 status.o
Re-running dpkg-buildpackage will include running the command dh clean, which assuming it is configured correctly in the debian/rules file will reset the source directory to the original pristine state. The same can of course also be done with regular git commands git reset --hard; git clean -fdx. To avoid accidentally committing unnecessary build artifacts in git, a debian/.gitignore can be useful and it would typically include all four files listed as untracked above. After a successful build you would have the following files:
shell
 -- entr
   -- LICENSE
   -- Makefile -> Makefile.linux
   -- Makefile.bsd
   -- Makefile.linux
   -- Makefile.linux-compat
   -- Makefile.macos
   -- NEWS
   -- README.md
   -- compat.c
   -- compat.o
   -- configure
   -- data.h
   -- debian
     -- README.source.md
     -- changelog
     -- control
     -- copyright
     -- debhelper-build-stamp
     -- docs
     -- entr
       -- DEBIAN
         -- control
         -- md5sums
       -- usr
       -- bin
         -- entr
       -- share
       -- doc
         -- entr
         -- NEWS.gz
         -- README.md
         -- changelog.Debian.gz
         -- copyright
       -- man
       -- man1
       -- entr.1.gz
     -- entr.debhelper.log
     -- entr.substvars
     -- files
     -- gbp.conf
     -- patches
       -- PR149-expand-aliases-in-system-test-script.patch
       -- series
       -- system-test-skip-no-tty.patch
       -- system-test-with-system-binary.patch
     -- rules
     -- salsa-ci.yml
     -- source
       -- format
     -- tests
       -- control
     -- upstream
       -- metadata
       -- signing-key.asc
     -- watch
   -- entr
   -- entr.1
   -- entr.c
   -- entr.o
   -- missing
     -- compat.h
     -- kqueue_inotify.c
     -- strlcpy.c
     -- sys
     -- event.h
   -- status.c
   -- status.h
   -- status.o
   -- system_test.sh
 -- entr-dbgsym_5.6-1_amd64.deb
 -- entr_5.6-1.debian.tar.xz
 -- entr_5.6-1.dsc
 -- entr_5.6-1_amd64.buildinfo
 -- entr_5.6-1_amd64.changes
 -- entr_5.6-1_amd64.deb
 -- entr_5.6.orig.tar.xz
The contents of debian/entr are essentially what goes into the resulting entr_5.6-1_amd64.deb package. Familiarizing yourself with the majority of the files in the original upstream source as well as all the resulting build artifacts is time consuming, but it is a necessary investment to get high-quality Debian packages. There are also tools such as Debcraft that automate generating the build artifacts in separate output directories for each build, thus making it easy to compare the changes to correlate what change in the Debian packaging led to what change in the resulting build artifacts.

Re-run the initial import with git-buildpackage When upstreams publish releases as tarballs, they should also be imported for optimal software supply-chain security, in particular if upstream also publishes cryptographic signatures that can be used to verify the authenticity of the tarballs. To achieve this, the files debian/watch, debian/upstream/signing-key.asc, and debian/gbp.conf need to be present with the correct options. In the gbp.conf file, ensure you have the correct options based on the following (a sketch of such a gbp.conf follows the list):
  1. Does upstream release tarballs? If so, enforce pristine-tar = True.
  2. Does upstream sign the tarballs? If so, configure explicit signature checking with upstream-signatures = on.
  3. Does upstream have a git repository, and does it have release git tags? If so, configure the release git tag format, e.g. upstream-vcs-tag = %(version%~%.)s.
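For reference, a debian/gbp.conf implementing the choices above could look roughly like this (a sketch; the exact values depend on the upstream project in question):
ini
[DEFAULT]
debian-branch = debian/latest
upstream-branch = upstream/latest
pristine-tar = True
upstream-signatures = on
upstream-vcs-tag = %(version%~%.)s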
To validate that the above files are working correctly, run gbp import-orig with the current version explicitly defined:
shell
$ gbp import-orig --uscan --upstream-version 5.6
gbp:info: Launching uscan...
gpgv: Signature made 7. Aug 2024 07.43.27 PDT
gpgv: using RSA key 519151D83E83D40A232B4D615C418B8631BC7C26
gpgv: Good signature from "Eric Radman <ericshane@eradman.com>"
gbp:info: Using uscan downloaded tarball ../entr_5.6.orig.tar.gz
gbp:info: Importing '../entr_5.6.orig.tar.gz' to branch 'upstream/latest'...
gbp:info: Source package is entr
gbp:info: Upstream version is 5.6
gbp:info: Replacing upstream source on 'debian/latest'
gbp:info: Running Postimport hook
gbp:info: Successfully imported version 5.6 of ../entr_5.6.orig.tar.gz
As the original packaging was done based on the upstream release git tag, the above command will fetch the tarball release, create the pristine-tar branch, and store the tarball delta on it. This command will also attempt to create the tag upstream/5.6 on the upstream/latest branch.

Import new upstream versions in the future Forking the upstream git repository, creating the initial packaging, and creating the DEP-14 branch structure are all one-off work needed only when creating the initial packaging. Going forward, to import new upstream releases, one would simply run git fetch upstreamvcs; gbp import-orig --uscan, which fetches the upstream git tags, checks for new upstream tarballs, and automatically downloads, verifies, and imports the new version. See the galera-4-demo example in the Debian source packages in git explained post as a demo you can try running yourself and examine in detail. You can also try running gbp import-orig --uscan without specifying a version: it will notice that there is now Entr version 5.7 available, download it, and import it.

Build using git-buildpackage From this stage onwards you should build the package using gbp buildpackage, which will do a more comprehensive build.
shell
gbp buildpackage -uc -us
The git-buildpackage build also includes running Lintian to find potential Debian policy violations in the sources or in the resulting .deb binary packages. Many Debian Developers run lintian -EviIL +pedantic after every build to check that there are no new nags, and to validate that changes intended to fix previous Lintian nags were correct.

Open a Merge Request on Salsa for Debian packaging review Getting everything perfectly right takes a lot of effort, and may require reaching out to an experienced Debian Developer for review and guidance. Thus, you should aim to publish your initial packaging work on Salsa, Debian's GitLab instance, for review and feedback as early as possible. For somebody to be able to easily see what you have done, you should rename your debian/latest branch to another name, for example next/debian/latest, and open a Merge Request that targets the debian/latest branch on your Salsa fork, which still has only the unmodified upstream files. If you have followed the workflow in this post so far, you can simply run:
  1. git checkout -b next/debian/latest
  2. git push --set-upstream origin next/debian/latest
  3. Open in a browser the URL visible in the git remote response
  4. Write the Merge Request description in case the default text from your commit is not enough
  5. Mark the MR as Draft using the checkbox
  6. Publish the MR and request feedback
Once a Merge Request exists, discussion regarding what additional changes are needed can be conducted as MR comments. With an MR, you can easily iterate on the contents of next/debian/latest, rebase, force push, and request re-review as many times as you want. While at it, make sure that on the Settings > CI/CD page, the CI/CD configuration file setting has the value debian/salsa-ci.yml, so that the CI can run and give you immediate automated feedback. For an example of an initial packaging Merge Request, see https://salsa.debian.org/otto/entr-demo/-/merge_requests/1.
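The debian/salsa-ci.yml file itself is typically nothing more than an include of the shared Salsa CI pipeline; at the time of writing, something along these lines is common, but check the Salsa CI documentation for the currently recommended form:
yaml
include:
  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/recipes/debian.yml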

Open a Merge Request / Pull Request to fix upstream code Due to the high quality requirements in Debian, it is fairly common that while doing the initial Debian packaging of an open source project, issues are found that stem from the upstream source code. While it is possible to carry extra patches in Debian, it is not good practice to deviate too much from upstream code with custom Debian patches. Instead, the Debian packager should try to get the fixes applied directly upstream. Using git-buildpackage patch queues is the most convenient way to make modifications to the upstream source code so that they automatically convert into Debian patches (stored at debian/patches), and can also easily be submitted upstream as any regular git commit (and rebased and resubmitted many times over). First, decide if you want to work out of the upstream development branch and later cherry-pick to the Debian packaging branch, or work out of the Debian packaging branch and cherry-pick to an upstream branch. The example below starts from the upstream development branch and then cherry-picks the commit into the git-buildpackage patch queue:
shell
git checkout -b bugfix-branch master
nano entr.c
make
./entr # verify change works as expected
git commit -a -m "Commit title" -m "Commit body"
git push # submit upstream
gbp pq import --force --time-machine=10
git cherry-pick <commit id>
git commit --amend # extend commit message with DEP-3 metadata
gbp buildpackage -uc -us -b
./entr # verify change works as expected
gbp pq export --drop --commit
git commit --amend # Write commit message along lines "Add patch to .."
The example below starts by making the fix on a git-buildpackage patch queue branch, and then cherry-picking it onto the upstream development branch:
shell
gbp pq import --force --time-machine=10
nano entr.c
git commit -a -m "Commit title" -m "Commit body"
gbp buildpackage -uc -us -b
./entr # verify change works as expected
gbp pq export --drop --commit
git commit --amend # Write commit message along lines "Add patch to .."
git checkout -b bugfix-branch master
git cherry-pick <commit id>
git commit --amend # prepare commit message for upstream submission
git push # submit upstream
The key git-buildpackage commands to enter and exit the patch-queue mode are:
shell
gbp pq import --force --time-machine=10
gbp pq export --drop --commit
%%{init: { 'gitGraph': { 'mainBranchName': 'debian/latest' } } }%%
gitGraph
checkout debian/latest
commit id: "Initial packaging"
branch patch-queue/debian/latest
checkout patch-queue/debian/latest
commit id: "Delete debian/patches/..."
commit id: "Patch 1 title"
commit id: "Patch 2 title"
commit id: "Patch 3 title"
These can be run at any time, regardless of whether any debian/patches existed prior, whether existing patches applied cleanly or not, or whether there were old patch queue branches around. Note that the extra -b in gbp buildpackage -uc -us -b instructs it to build only binary packages, avoiding any nags from dpkg-source about modifications in the upstream sources while building in the patches-applied mode.

Programming-language specific dh-make alternatives As each programming language has its specific way of building the source code, and many other conventions regarding the file layout and more, Debian has multiple custom tools to create new Debian source packages for specific programming languages. Notably, Python does not have its own tool, but there is a dh_make --python option for Python support directly in dh_make itself. The list is not complete and many more tools exist. For some languages, there are even competing options, such as for Go, where in addition to dh-make-golang there is also Gophian. When learning Debian packaging, there is no need to learn these tools upfront. Being aware that they exist is enough, and one can learn them only if and when one starts to package a project in a new programming language.

The difference between source git repository vs source packages vs binary packages As seen in the earlier example, running gbp buildpackage on the Entr packaging repository above will result in several files:
entr_5.6-1_amd64.changes
entr_5.6-1_amd64.deb
entr_5.6-1.debian.tar.xz
entr_5.6-1.dsc
entr_5.6.orig.tar.gz
entr_5.6.orig.tar.gz.asc
The entr_5.6-1_amd64.deb is the binary package, which can be installed on a Debian/Ubuntu system. The rest of the files constitute the source package. To do a source-only build, run gbp buildpackage -S and note the files produced:
entr_5.6-1_source.changes
entr_5.6-1.debian.tar.xz
entr_5.6-1.dsc
entr_5.6.orig.tar.gz
entr_5.6.orig.tar.gz.asc
The source package files can be used to build the binary .deb for amd64, or any architecture that the package supports. It is important to grasp that the Debian source package is the preferred form to be able to build the binary packages on various Debian build systems, and the Debian source package is not the same thing as the Debian packaging git repository contents.
flowchart LR
git[Git repository<br>branch debian/latest] -->|gbp buildpackage -S| src[Source Package<br>.dsc + .tar.xz]
src -->|dpkg-buildpackage| bin[Binary Packages<br>.deb]
If the package is large and complex, the build could result in multiple binary packages. One set of package definition files in debian/ will however only ever result in a single source package.
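If the distinction between the artifact types still feels abstract, it can help to inspect them directly: dpkg-source unpacks a source package from its .dsc file, while dpkg-deb looks inside a binary package.
shell
# Unpack the source package (the .orig tarball must sit next to the .dsc)
dpkg-source -x entr_5.6-1.dsc entr-5.6-unpacked
# List the files the binary package would install on a system
dpkg-deb --contents entr_5.6-1_amd64.deb
# Show the binary package's control metadata
dpkg-deb --info entr_5.6-1_amd64.deb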

Option to repackage source packages with Files-Excluded lists in the debian/copyright file Some upstream projects may include binary files in their release, or other undesirable content that needs to be omitted from the source package in Debian. The easiest way to filter them out is by adding to the debian/copyright file a Files-Excluded field listing the undesired files. The debian/copyright file is read by uscan, which will repackage the upstream sources on-the-fly when importing new upstream releases. For a real-life example, see the debian/copyright file in the Godot package, which lists:
debian
Files-Excluded: platform/android/java/gradle/wrapper/gradle-wrapper.jar
The resulting repackaged upstream source tarball, as well as the upstream version component, will have an extra +ds to signify that it is not the true original upstream source but has been modified by Debian:
godot_4.3+ds.orig.tar.xz
godot_4.3+ds-1_amd64.deb

Creating one Debian source package from multiple upstream source packages is also possible In some rare cases the upstream project may be split across multiple git repositories, or the upstream release may consist of multiple components, each in their own separate tarball. Usually these are very large projects that get some benefits from releasing components separately. If in Debian these are deemed to go into a single source package, it is technically possible using the component system in git-buildpackage and uscan. For an example, see the gbp.conf and watch files in the node-cacache package. Using this type of structure should be a last resort, as it creates complexity and inter-dependencies that are bound to cause issues later on. It is usually better to work with upstream and champion universal best practices with clear releases and version schemes.

When not to start the Debian packaging repository as a fork of the upstream one Not all upstreams use Git for version control. It is by far the most popular, but there are still some that use e.g. Subversion or Mercurial. Who knows, maybe in the future some new version control systems will start to compete with Git. There are also projects that use Git in massive monorepos and with complex submodule setups that invalidate the basic assumptions required to map an upstream Git repository into a Debian packaging repository. In those cases one can't use a debian/latest branch on a clone of the upstream git repository as the starting point for the Debian packaging, but one must revert to the traditional way of starting from an upstream release tarball with gbp import-orig package-1.0.tar.gz.

Conclusion Created in August 1993, Debian is one of the oldest Linux distributions. In the 32 years since its inception, the .deb packaging format and the tooling to work with it have evolved several generations. In the past 10 years, more and more Debian Developers have converged on certain core practices, as evidenced by https://trends.debian.net/, but there is still a lot of variance in workflows even for identical tasks. Hopefully, you find this post useful in giving practical guidance on how exactly to do the most common things when packaging software for Debian. Happy packaging!

25 May 2025

Otto Kek l inen: New Debian package creation from upstream git repository

Featured image of post New Debian package creation from upstream git repositoryIn this post, I demonstrate the optimal workflow for creating new Debian packages in 2025, preserving the upstream git history. The motivation for this is to lower the barrier for sharing improvements to and from upstream, and to improve software provenance and supply-chain security by making it easy to inspect every change at any level using standard git tooling. Key elements of this workflow include: To make the instructions so concrete that anyone can repeat all the steps themselves on a real package, I demonstrate the steps by packaging the command-line tool Entr. It is written in C, has very few dependencies, and its final Debian source package structure is simple, yet exemplifies all the important parts that go into a complete Debian package:
  1. Creating a new packaging repository and publishing it under your personal namespace on salsa.debian.org.
  2. Using dh_make to create the initial Debian packaging.
  3. Posting the first draft of the Debian packaging as a Merge Request (MR) and using Salsa CI to verify Debian packaging quality.
  4. Running local builds efficiently and iterating on the packaging process.

Create new Debian packaging repository from the existing upstream project git repository First, create a new empty directory, then clone the upstream Git repository inside it:
shell
mkdir debian-entr
cd debian-entr
git clone --origin upstreamvcs --branch master \
 --single-branch https://github.com/eradman/entr.git
Using a clean directory makes it easier to inspect the build artifacts of a Debian package, which will be output in the parent directory of the Debian source directory. The extra parameters given to git clone lay the foundation for the Debian packaging git repository structure where the upstream git remote name is upstreamvcs. Only the upstream main branch is tracked to avoid cluttering git history with upstream development branches that are irrelevant for packaging in Debian. Next, enter the git repository directory and list the git tags. Pick the latest upstream release tag as the commit to start the branch upstream/latest. This latest refers to the upstream release, not the upstream development branch. Immediately after, branch off the debian/latest branch, which will have the actual Debian packaging files in the debian/ subdirectory.
shell
cd entr
git tag # shows the latest upstream release tag was '5.6'
git checkout -b upstream/latest 5.6
git checkout -b debian/latest
%%{init: { 'gitGraph': { 'mainBranchName': 'master' } } }%%
gitGraph
checkout master
commit id: "Upstream 5.6 release" tag: "5.6"
branch upstream/latest
checkout upstream/latest
commit id: "New upstream version 5.6" tag: "upstream/5.6"
branch debian/latest
checkout debian/latest
commit id: "Initial Debian packaging"
commit id: "Additional change 1"
commit id: "Additional change 2"
commit id: "Additional change 3"
At this point, the repository is structured according to DEP-14 conventions, ensuring a clear separation between upstream and Debian packaging changes, but there are no Debian changes yet. Next, add the Salsa repository as a new remote called origin, the same as the default remote name in git.
shell
git remote add origin git@salsa.debian.org:otto/entr-demo.git
git push --set-upstream origin debian/latest
This is an important preparation step to later be able to create a Merge Request on Salsa that targets the debian/latest branch, which does not yet have any debian/ directory.

Launch a Debian Sid (unstable) container to run builds in To ensure that all packaging tools are of the latest versions, run everything inside a fresh Sid container. This has two benefits: you are guaranteed to have the most up-to-date toolchain, and your host system stays clean without getting polluted by various extra packages. Additionally, this approach works even if your host system is not Debian/Ubuntu.
shell
cd ..
podman run --interactive --tty --rm --shm-size=1G --cap-add SYS_PTRACE \
 --env='DEB*' --volume=$PWD:/tmp/test --workdir=/tmp/test debian:sid bash
Note that the container should be started from the parent directory of the git repository, not inside it. The --volume parameter will bind-mount the current directory inside the container. Thus all files created and modified are on the host system, and will persist after the container shuts down. Once inside the container, install the basic dependencies:
shell
apt update -q && apt install -q --yes git-buildpackage dpkg-dev dh-make

Automate creating the debian/ files with dh-make To create the files needed for the actual Debian packaging, use dh_make:
shell
# dh_make --packagename entr_5.6 --single --createorig
Maintainer Name : Otto Kekäläinen
Email-Address : otto@debian.org
Date : Sat, 15 Feb 2025 01:17:51 +0000
Package Name : entr
Version : 5.6
License : blank
Package Type : single
Are the details correct? [Y/n/q]

Done. Please edit the files in the debian/ subdirectory now.
Due to how dh_make works, the package name and version need to be written as a single underscore-separated string. In this case, you should choose --single to specify that the package type is a single binary package. Other options would be --library for library packages (see libgda5 sources as an example) or --indep for architecture-independent packages (see dns-root-data sources as an example). The --createorig option will create a mock upstream release tarball (entr_5.6.orig.tar.xz) from the current release directory. This is necessary for historical reasons: dh_make was designed before basing packaging on git repositories became common, when Debian source packages were always based on upstream release tarballs (e.g. *.tar.gz). At this stage, a debian/ directory has been created with template files, and you can start modifying the files and iterating towards actual working packaging.
shell
git add debian/
git commit -a -m "Initial Debian packaging"

Review the files The full list of files after the above steps with dh_make would be:
 -- entr
   -- LICENSE
   -- Makefile.bsd
   -- Makefile.linux
   -- Makefile.linux-compat
   -- Makefile.macos
   -- NEWS
   -- README.md
   -- configure
   -- data.h
   -- debian
     -- README.Debian
     -- README.source
     -- changelog
     -- control
     -- copyright
     -- gbp.conf
     -- entr-docs.docs
     -- entr.cron.d.ex
     -- entr.doc-base.ex
     -- manpage.1.ex
     -- manpage.md.ex
     -- manpage.sgml.ex
     -- manpage.xml.ex
     -- postinst.ex
     -- postrm.ex
     -- preinst.ex
     -- prerm.ex
     -- rules
     -- salsa-ci.yml.ex
     -- source
       -- format
     -- upstream
       -- metadata.ex
     -- watch.ex
   -- entr.1
   -- entr.c
   -- missing
     -- compat.h
     -- kqueue_inotify.c
     -- strlcpy.c
     -- sys
     -- event.h
   -- status.c
   -- status.h
   -- system_test.sh
 -- entr_5.6.orig.tar.xz
You can browse these files in the demo repository. The mandatory files in the debian/ directory are:
  • changelog,
  • control,
  • copyright,
  • and rules.
All the other files have been created for convenience so the packager has template files to work from. The files with the suffix .ex are example files that won't have any effect until their content is adjusted and the suffix removed. For detailed explanations of the purpose of each file in the debian/ subdirectory, see the following resources:
  • The Debian Policy Manual: Describes the structure of the operating system, the package archive and requirements for packages to be included in the Debian archive.
  • The Developer's Reference: A collection of best practices and process descriptions Debian packagers are expected to follow while interacting with one another.
  • Debhelper man pages: Detailed information on how the Debian package build system works, and how the contents of the various files in debian/ affect the end result.
As Entr, the package used in this example, is a real package that already exists in the Debian archive, you may want to browse the actual Debian packaging source at https://salsa.debian.org/debian/entr/-/tree/debian/latest/debian for reference. Most of these files have standardized formatting conventions to make collaboration easier. To automatically format the files following the most popular conventions, simply run wrap-and-sort -vast or debputy reformat --style=black.
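To illustrate how small the mandatory files can be in a debhelper-based package, below is a minimal debian/rules sketch. It is the generic dh pattern, not necessarily byte-for-byte what dh_make generated for entr:
makefile
#!/usr/bin/make -f
# Delegate all build steps to debhelper; add override_dh_* targets
# later only if the defaults need adjusting.
%:
	dh $@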

Identify build dependencies The most common reason for builds to fail is missing dependencies. The easiest way to identify which Debian package ships the required dependency is using apt-file. If, for example, a build fails complaining that pcre2posix.h cannot be found or that libpcre2-posix.so is missing, you can use these commands:
shell
$ apt install -q --yes apt-file && apt-file update
$ apt-file search pcre2posix.h
libpcre2-dev: /usr/include/pcre2posix.h
$ apt-file search libpcre2-posix.so
libpcre2-dev: /usr/lib/x86_64-linux-gnu/libpcre2-posix.so
libpcre2-posix3: /usr/lib/x86_64-linux-gnu/libpcre2-posix.so.3
libpcre2-posix3: /usr/lib/x86_64-linux-gnu/libpcre2-posix.so.3.0.6
The output above implies that debian/control should be extended to define a Build-Depends: libpcre2-dev relationship (see the sketch below). There is also dpkg-depcheck that uses strace to trace the files the build process tries to access, and lists what Debian packages those files belong to. Example usage:
shell
dpkg-depcheck -b debian/rules build
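Based on that output, the fix would be to add the library's -dev package to Build-Depends in debian/control. Below is a minimal sketch of the relevant fields; it is purely illustrative, since entr itself does not actually use PCRE2:
debian
Source: entr
Build-Depends: debhelper-compat (= 13),
               libpcre2-dev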

Build the Debian sources to generate the .deb package After the first pass of refining the contents of the files in debian/, test the build by running dpkg-buildpackage inside the container:
shell
dpkg-buildpackage -uc -us -b
The options -uc -us will skip signing the resulting Debian source package and other build artifacts. The -b option will skip creating a source package and only build the (binary) *.deb packages. The output is very verbose and gives a large amount of context about what is happening during the build to make debugging build failures easier. In the build log of entr you will see for example the line dh binary --buildsystem=makefile. This and other dh commands can also be run manually if there is a need to quickly repeat only a part of the build while debugging build failures. To see what files were generated or modified by the build simply run git status --ignored:
shell
$ git status --ignored
On branch debian/latest

Untracked files:
 (use "git add <file>..." to include in what will be committed)
 debian/debhelper-build-stamp
 debian/entr.debhelper.log
 debian/entr.substvars
 debian/files

Ignored files:
 (use "git add -f <file>..." to include in what will be committed)
 Makefile
 compat.c
 compat.o
 debian/.debhelper/
 debian/entr/
 entr
 entr.o
 status.o
Re-running dpkg-buildpackage will include running the command dh clean, which, assuming it is configured correctly in the debian/rules file, will reset the source directory to the original pristine state. The same can of course also be done with regular git commands: git reset --hard; git clean -fdx. To avoid accidentally committing unnecessary build artifacts in git, a debian/.gitignore file can be useful; it would typically include all four files listed as untracked above. After a successful build you would have the following files:
shell
 -- entr
   -- LICENSE
   -- Makefile -> Makefile.linux
   -- Makefile.bsd
   -- Makefile.linux
   -- Makefile.linux-compat
   -- Makefile.macos
   -- NEWS
   -- README.md
   -- compat.c
   -- compat.o
   -- configure
   -- data.h
   -- debian
     -- README.source.md
     -- changelog
     -- control
     -- copyright
     -- debhelper-build-stamp
     -- docs
     -- entr
       -- DEBIAN
         -- control
         -- md5sums
        -- usr
          -- bin
            -- entr
          -- share
            -- doc
              -- entr
                -- NEWS.gz
                -- README.md
                -- changelog.Debian.gz
                -- copyright
            -- man
              -- man1
                -- entr.1.gz
     -- entr.debhelper.log
     -- entr.substvars
     -- files
     -- gbp.conf
     -- patches
       -- PR149-expand-aliases-in-system-test-script.patch
       -- series
       -- system-test-skip-no-tty.patch
       -- system-test-with-system-binary.patch
     -- rules
     -- salsa-ci.yml
     -- source
       -- format
     -- tests
       -- control
     -- upstream
       -- metadata
       -- signing-key.asc
     -- watch
   -- entr
   -- entr.1
   -- entr.c
   -- entr.o
   -- missing
     -- compat.h
     -- kqueue_inotify.c
     -- strlcpy.c
     -- sys
     -- event.h
   -- status.c
   -- status.h
   -- status.o
   -- system_test.sh
 -- entr-dbgsym_5.6-1_amd64.deb
 -- entr_5.6-1.debian.tar.xz
 -- entr_5.6-1.dsc
 -- entr_5.6-1_amd64.buildinfo
 -- entr_5.6-1_amd64.changes
 -- entr_5.6-1_amd64.deb
 -- entr_5.6.orig.tar.xz
The contents of debian/entr are essentially what goes into the resulting entr_5.6-1_amd64.deb package. Familiarizing yourself with the majority of the files in the original upstream source as well as all the resulting build artifacts is time consuming, but it is a necessary investment to get high-quality Debian packages. There are also tools such as Debcraft that automate generating the build artifacts in separate output directories for each build, thus making it easy to compare the changes to correlate what change in the Debian packaging led to what change in the resulting build artifacts.

Re-run the initial import with git-buildpackage When upstreams publish releases as tarballs, they should also be imported for optimal software supply-chain security, in particular if upstream also publishes cryptographic signatures that can be used to verify the authenticity of the tarballs. To achieve this, the files debian/watch, debian/upstream/signing-key.asc, and debian/gbp.conf need to be present with the correct options. In the gbp.conf file, ensure you have the correct options based on:
  1. Does upstream release tarballs? If so, enforce pristine-tar = True.
  2. Does upstream sign the tarballs? If so, configure explicit signature checking with upstream-signatures = on.
  3. Does upstream have a git repository, and does it have release git tags? If so, configure the release git tag format, e.g. upstream-vcs-tag = %(version%~%.)s.
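Putting the three points above together, a debian/gbp.conf could look roughly like the following sketch. The option names are real git-buildpackage options; the values shown are just one plausible combination for a project like Entr:
debian
[DEFAULT]
debian-branch = debian/latest
upstream-branch = upstream/latest
pristine-tar = True
upstream-signatures = on
upstream-vcs-tag = %(version%~%.)s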
To validate that the above files are working correctly, run gbp import-orig with the current version explicitly defined:
shell
$ gbp import-orig --uscan --upstream-version 5.6
gbp:info: Launching uscan...
gpgv: Signature made 7. Aug 2024 07.43.27 PDT
gpgv: using RSA key 519151D83E83D40A232B4D615C418B8631BC7C26
gpgv: Good signature from "Eric Radman <ericshane@eradman.com>"
gbp:info: Using uscan downloaded tarball ../entr_5.6.orig.tar.gz
gbp:info: Importing '../entr_5.6.orig.tar.gz' to branch 'upstream/latest'...
gbp:info: Source package is entr
gbp:info: Upstream version is 5.6
gbp:info: Replacing upstream source on 'debian/latest'
gbp:info: Running Postimport hook
gbp:info: Successfully imported version 5.6 of ../entr_5.6.orig.tar.gz
As the original packaging was done based on the upstream release git tag, the above command will fetch the tarball release, create the pristine-tar branch, and store the tarball delta on it. This command will also attempt to create the tag upstream/5.6 on the upstream/latest branch.
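A quick way to verify that the import did what was expected is to list the branches and tags git-buildpackage created, for example:
shell
git branch --list 'upstream/*' pristine-tar
git tag --list 'upstream/*'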

Import new upstream versions in the future Forking the upstream git repository, creating the initial packaging, and creating the DEP-14 branch structure are all one-off work needed only when creating the initial packaging. Going forward, to import new upstream releases, one would simply run git fetch upstreamvcs; gbp import-orig --uscan, which fetches the upstream git tags, checks for new upstream tarballs, and automatically downloads, verifies, and imports the new version. See the galera-4-demo example in the Debian source packages in git explained post as a demo you can try running yourself and examine in detail. You can also try running gbp import-orig --uscan without specifying a version. It will notice that Entr version 5.7 is now available, and fetch and import it.
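Spelled out as commands, the routine import of any future upstream release is just:
shell
git fetch upstreamvcs
gbp import-orig --uscan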

Build using git-buildpackage From this stage onwards you should build the package using gbp buildpackage, which will do a more comprehensive build.
shell
gbp buildpackage -uc -us
The git-buildpackage build also includes running Lintian to find potential Debian policy violations in the sources or in the resulting .deb binary packages. Many Debian Developers run lintian -EviIL +pedantic after every build to check that there are no new nags, and to validate that changes intended to fix previous Lintian nags were correct.

Open a Merge Request on Salsa for Debian packaging review Getting everything perfectly right takes a lot of effort, and may require reaching out to an experienced Debian Developer for review and guidance. Thus, you should aim to publish your initial packaging work on Salsa, Debian's GitLab instance, for review and feedback as early as possible. For somebody to be able to easily see what you have done, you should rename your debian/latest branch to another name, for example next/debian/latest, and open a Merge Request that targets the debian/latest branch on your Salsa fork, which still has only the unmodified upstream files. If you have followed the workflow in this post so far, you can simply run:
  1. git checkout -b next/debian/latest
  2. git push --set-upstream origin next/debian/latest
  3. Open in a browser the URL visible in the git remote response
  4. Write the Merge Request description in case the default text from your commit is not enough
  5. Mark the MR as Draft using the checkbox
  6. Publish the MR and request feedback
Once a Merge Request exists, discussion regarding what additional changes are needed can be conducted as MR comments. With an MR, you can easily iterate on the contents of next/debian/latest, rebase, force push, and request re-review as many times as you want. While at it, make sure that on the Settings > CI/CD page the CI/CD configuration file field is set to debian/salsa-ci.yml, so that the CI can run and give you immediate automated feedback. For an example of an initial packaging Merge Request, see https://salsa.debian.org/otto/entr-demo/-/merge_requests/1.
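The debian/salsa-ci.yml file itself is typically tiny. A common minimal version simply includes the shared Salsa CI pipeline; this is a sketch of the usual pattern, so check the salsa-ci-team/pipeline documentation for the currently recommended include:
yaml
---
include:
  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/recipes/debian.yml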

Open a Merge Request / Pull Request to fix upstream code Due to the high quality requirements in Debian, it is fairly common that while doing the initial Debian packaging of an open source project, issues are found that stem from the upstream source code. While it is possible to carry extra patches in Debian, it is not good practice to deviate too much from upstream code with custom Debian patches. Instead, the Debian packager should try to get the fixes applied directly upstream. Using git-buildpackage patch queues is the most convenient way to make modifications to the upstream source code so that they automatically convert into Debian patches (stored at debian/patches), and can also easily be submitted upstream as any regular git commit (and rebased and resubmitted many times over). First, decide if you want to work out of the upstream development branch and later cherry-pick to the Debian packaging branch, or work out of the Debian packaging branch and cherry-pick to an upstream branch. The example below starts from the upstream development branch and then cherry-picks the commit into the git-buildpackage patch queue:
shell
git checkout -b bugfix-branch master
nano entr.c
make
./entr # verify change works as expected
git commit -a -m "Commit title" -m "Commit body"
git push # submit upstream
gbp pq import --force --time-machine=10
git cherry-pick <commit id>
git commit --amend # extend commit message with DEP-3 metadata
gbp buildpackage -uc -us -b
./entr # verify change works as expected
gbp pq export --drop --commit
git commit --amend # Write commit message along lines "Add patch to .."
The example below starts by making the fix on a git-buildpackage patch queue branch, and then cherry-picking it onto the upstream development branch:
shell
gbp pq import --force --time-machine=10
nano entr.c
git commit -a -m "Commit title" -m "Commit body"
gbp buildpackage -uc -us -b
./entr # verify change works as expected
gbp pq export --drop --commit
git commit --amend # Write commit message along lines "Add patch to .."
git checkout -b bugfix-branch master
git cherry-pick <commit id>
git commit --amend # prepare commit message for upstream submission
git push # submit upstream
The key git-buildpackage commands to enter and exit the patch-queue mode are:
shell
gbp pq import --force --time-machine=10
gbp pq export --drop --commit
%%{init: { 'gitGraph': { 'mainBranchName': 'debian/latest' } } }%%
gitGraph
checkout debian/latest
commit id: "Initial packaging"
branch patch-queue/debian/latest
checkout patch-queue/debian/latest
commit id: "Delete debian/patches/..."
commit id: "Patch 1 title"
commit id: "Patch 2 title"
commit id: "Patch 3 title"
These can be run at any time, regardless of whether any debian/patches existed prior, whether existing patches applied cleanly or not, or whether there were old patch queue branches around. Note that the extra -b in gbp buildpackage -uc -us -b instructs it to build only binary packages, avoiding any nags from dpkg-source that there are modifications in the upstream sources while building in the patches-applied mode.

Programming-language specific dh-make alternatives As each programming language has its specific way of building the source code, and many other conventions regarding the file layout and more, Debian has multiple custom tools to create new Debian source packages for specific programming languages. Notably, Python does not have its own tool, but there is a dh_make --python option for Python support directly in dh_make itself. The list is not complete and many more tools exist. For some languages there are even competing options; for Go, for example, there is Gophian in addition to dh-make-golang. When learning Debian packaging, there is no need to learn these tools upfront. Being aware that they exist is enough, and one can learn them only if and when one starts to package a project in a new programming language.

The difference between source git repository vs source packages vs binary packages As seen in the earlier example, running gbp buildpackage on the Entr packaging repository above will result in several files:
entr_5.6-1_amd64.changes
entr_5.6-1_amd64.deb
entr_5.6-1.debian.tar.xz
entr_5.6-1.dsc
entr_5.6.orig.tar.gz
entr_5.6.orig.tar.gz.asc
The entr_5.6-1_amd64.deb is the binary package, which can be installed on a Debian/Ubuntu system. The rest of the files constitute the source package. To do a source-only build, run gbp buildpackage -S and note the files produced:
entr_5.6-1_source.changes
entr_5.6-1.debian.tar.xz
entr_5.6-1.dsc
entr_5.6.orig.tar.gz
entr_5.6.orig.tar.gz.asc
The source package files can be used to build the binary .deb for amd64, or any architecture that the package supports. It is important to grasp that the Debian source package is the preferred form to be able to build the binary packages on various Debian build systems, and the Debian source package is not the same thing as the Debian packaging git repository contents.
flowchart LR
git[Git repository<br>branch debian/latest] -->|gbp buildpackage -S| src[Source Package<br>.dsc + .tar.xz]
src -->|dpkg-buildpackage| bin[Binary Packages<br>.deb]
If the package is large and complex, the build could result in multiple binary packages. One set of package definition files in debian/ will however only ever result in a single source package.
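As an illustration of the flow in the diagram above, anyone holding only the source package files can reproduce the binary build without the git repository at all. A minimal sketch, assuming a suitable Debian build environment with the source package files in the current directory:
shell
dpkg-source -x entr_5.6-1.dsc   # unpacks the sources into entr-5.6/
cd entr-5.6
dpkg-buildpackage -uc -us -b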

Option to repackage source packages with Files-Excluded lists in the debian/copyright file Some upstream projects may include binary files in their release, or other undesirable content that needs to be omitted from the source package in Debian. The easiest way to filter them out is by adding to the debian/copyright file a Files-Excluded field listing the undesired files. The debian/copyright file is read by uscan, which will repackage the upstream sources on-the-fly when importing new upstream releases. For a real-life example, see the debian/copyright file in the Godot package, which lists:
debian
Files-Excluded: platform/android/java/gradle/wrapper/gradle-wrapper.jar
The resulting repackaged upstream source tarball, as well as the upstream version component, will have an extra +ds to signify that it is not the true original upstream source but has been modified by Debian:
godot_4.3+ds.orig.tar.xz
godot_4.3+ds-1_amd64.deb
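For uscan to apply such a suffix automatically when repackaging, the Files-Excluded field is typically paired with a repacksuffix option in debian/watch. A generic sketch follows; the URL and filename pattern are purely illustrative and not taken from the Godot package:
debian
version=4
opts="repacksuffix=+ds,dversionmangle=s/\+ds$//" \
  https://example.org/releases/ project-(\d[\d.]+)\.tar\.gz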

Creating one Debian source package from multiple upstream source packages also possible In some rare cases the upstream project may be split across multiple git repositories or the upstream release may consist of multiple components each in their own separate tarball. Usually these are very large projects that get some benefits from releasing components separately. If in Debian these are deemed to go into a single source package, it is technically possible using the component system in git-buildpackage and uscan. For an example see the gbp.conf and watch files in the node-cacache package. Using this type of structure should be a last resort, as it creates complexity and inter-dependencies that are bound to cause issues later on. It is usually better to work with upstream and champion universal best practices with clear releases and version schemes.
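To give a rough idea of what the component mechanism looks like, the sketch below shows the general shape of the relevant gbp.conf entry. The option name is a real git-buildpackage option, but the component name is purely illustrative; consult the node-cacache package mentioned above for an authoritative, working example of the matching debian/watch entries.
debian
# debian/gbp.conf (sketch): import a secondary tarball as component "docs".
# The corresponding debian/watch file needs a second entry with
# opts="component=docs" pointing at the extra tarball.
[import-orig]
component = ['docs']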

When not to start the Debian packaging repository as a fork of the upstream one Not all upstreams use Git for version control. It is by far the most popular, but there are still some that use e.g. Subversion or Mercurial. Who knows, maybe in the future some new version control systems will start to compete with Git. There are also projects that use Git in massive monorepos and with complex submodule setups that invalidate the basic assumptions required to map an upstream Git repository into a Debian packaging repository. In those cases one can't use a debian/latest branch on a clone of the upstream git repository as the starting point for the Debian packaging, but one must revert to the traditional way of starting from an upstream release tarball with gbp import-orig package-1.0.tar.gz.

Conclusion Created in August 1993, Debian is one of the oldest Linux distributions. In the 32 years since inception, the .deb packaging format and the tooling to work with it have evolved several generations. In the past 10 years, more and more Debian Developers have converged on certain core practices evidenced by https://trends.debian.net/, but there is still a lot of variance in workflows even for identical tasks. Hopefully, you find this post useful in giving practical guidance on how exactly to do the most common things when packaging software for Debian. Happy packaging!

22 May 2025

Simon Quigley: Bootstrapping and Bikeshedding

When you learn to write, one of the major pieces is writing an introduction. You could say quite a bit about the formal process and the way formal writing is typically structured. That being said, I'm not your typical writer. I didn't exactly plan to be in this spot, and honestly, it's just a hobby of mine. If you enjoy reading my posts, great. I appreciate it. My entire point, from the very beginning of this week, has been to bootstrap my own platform, on my own two feet. As you all know, rumors were going around, and I felt as if I wasn't in that great of a position to go out and say everything that I've said, in one long post. I've needed to break it apart, and work through some of my own notes again. Prove myself, rather than asking for it to be handed to me. Keep people guessing on some elements, to let the people who have been doing wrong firmly prove themselves. At least twelve people at this point have sent me undeniable proof. I don't want to do anything with it. I want to move on, and I want to write about technology, and other things I actually enjoy writing about. I really don't enjoy conflict, or writing about it, at all. I just know how to defend myself if I need to. If you think I actively want to make everyone mad for no good reason, you're honestly fooling yourself. I already know that I'm not going to agree with everything I write in a few years, or maybe even weeks. But I'll hold myself to the same standard I hold everyone else. If you say something and a few years pass, I'm not going to assume you still have the same opinion. I'll give you the opportunity to correct it. I'd ask you lend me the same courtesy. I do this because I enjoy it, not out of anger, sadness, or anything like that. A number of people have approached me now simply asking why I'm doing what I'm doing. I planned to write this blog post at the very end all along. I just haven't revealed the plans before publishing. [I wrote the outline for this more than a day ago.] My entire point here is to give the common person a voice. I know exactly what it's like to start from nothing, and work your way up to a comfortable spot. I didn't only do it once, I did it twice. This is my third time. I genuinely don't appreciate it when people silence other people's voices just because they don't like them. I wasn't raised that way, and that's not how I run my own projects. In fact, I'd say that if you silence opinions or values that don't exactly match yours, you're missing out on the variety of life. You can lead a horse to water, but you can't make them drink. If you still think there are issues on my end, you're being misinformed. Plain and simple. I'm not going to spend any more time on it, I actually want to start what I've always wanted to do for years, just never had the opportunity to. This is my last post for the week. I feel like I've bootstrapped enough where next week I can focus on another topic, as originally planned. Thanks for reading. I appreciate your support. Talk to you on Monday. If you have topic suggestions, feel free to leave them in the comments. Feel free to leave your hate mail too, so I know where I'm at. If you read Piaget, you know that silence is concerning with respect to socialization. I just took a look at the view count for the first time, and I'm already up to more than 2k views. In the first week. Without actually digging into the content I want to write about, yet. I don't mean to say that to brag, I'm just telling you for a fact that I'm not ranting into thin air.
I'm not even ranting at all. In fact, every single one of my posts has been written either calmly or with happiness, including this one. My entire point is this: debate me on the merits of the subject, not on me as a person. If you don't like me, that's okay. Go play elsewhere. But, I'm still going to keep writing. Next week, different metaphors. We're bootstrapped now.

21 May 2025

Simon Quigley: Fences and Values

Don't knock the fence down before you know why it's up. I repeat this phrase over and over again, yet the (metaphorical) Homeowner's Association still decides my fence is the wrong color. Well, now you get to know why the fence is up. If anyone's actually willing to challenge me on this level, I'd welcome it. The four ideas I'd like to discuss are this: quantum physics, Lutheranism, mental resilience, and psychology. I've been studying these topics intensely for the past decade as a passion project. I'm just going to let my thoughts flow, but I'd like to hear other opinions on this. Can the mysteries of the mind, the subatomic world, and faith converge to reveal deeper truths? When it comes to self-taught knowledge on analysis, I'm mostly learned on Freud, with some hints of Jung and Peterson. I've read much of the original source material, and watched countless presentations on it. This all being said, I'm both learned on Rothbard and Marx, so if there is a major flaw in the way of Freud is frowned upon, I'd genuinely like to know so I can update my research and juxtapose the two schools of thought. Alongside this, although probably not directly relevant, I'm learned on John Locke and transcendentalism. What I'd like to focus on here is this: the Id. The Id is the pleasure-seeking, instinctual part of the psyche. Jung further extends this into the idea of the shadow self, and Peterson maps the meanings of these texts into a combined work (at least in my rudimentary understanding). In my research, the Id represents the part of your psyche that deals with religious values. As an example, if you're an impulsive person, turning to a spiritual or religious outlet can be highly beneficial. I've been using references from the foundational text of the Judaeo-Christian value system this entire time; feel free to re-read my other blog posts (instead of claiming they don't exist). Let's tie this into quantum physics. This is the part where I'll struggle most. I've watched several movies about this, read several books, and even learned about it academically, but quantum physics is likely to be my weak spot here. I did some research, and here are the elements I'm looking for: uncertainty principle, wave-particle duality, quantum entanglement, and the observer effect. I already know about the cat in the box. And the Cat in the Hat, for that matter. I know about wave-particle duality from an incredibly intelligent high school physics teacher of mine. I know about the uncertainty principle purely in a colloquial sense. The remaining element I need to wrap my head around is quantum entanglement, but it feels like I'm almost there. These concepts do actually challenge the idea of pure free will. It's almost like we're coming full circle. Some theologians (including myself, if you can call me a self-taught one) do believe the idea of quantum indeterminacy can be a space where divine action may take place. You could also liken the unpredictable nature of the Id to quantum indeterminacy as well. These are ones to think about, because in all reality, they're subjective opinions. I do believe they're interconnected. In terms of Lutheranism, I'll be short on this one. Please do go read the full history behind Martin Luther and his turbulent relationship with Catholicism. I'm not a Bible thumper, and I actually think this is the first time I've mentioned religion publicly at all.
This being said, now I'm actually ready to defend the points on an academic level. The Id represents hidden psychological forces, quantum physics reveals subatomic mysteries, and Lutheranism emphasizes faith in the unseen God. Okay, so we have the baseline. Now, time for some mental resilience. When I think of mental resilience, the first people I think of are David Goggins and Jocko Willink. I've also enjoyed Dr. Andrew Huberman's podcast. The idea there is simple: if you understand exactly how to learn, you know your fundamentals well enough to draw them and explain them vividly on a whiteboard, and you can make it a habit, at that point you're ready to work on your mental resilience. Little by little, gradually, how far can you push the bar towards the ceiling? There are obviously limits. People sometimes get scared when I mention mental resilience, but obviously that's a bit of a catch-22. There are plenty of satirical videos out there, and of course, I don't believe in Goggins or Jocko wholeheartedly. They're just tools in the toolbox when times get tough. I wish you all well, and I hope this gets you thinking about those people who just insist there is no God or higher being, and think you're stupid for believing there is one. Those people obviously haven't read analysis, in my own opinion. Have a great night!

20 May 2025

Arturo Borrero González: Wikimedia Cloud VPS: IPv6 support

Dietmar Rabich, Cape Town (ZA), Sea Point, Nachtansicht, CC BY-SA 4.0 This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez. Wikimedia Cloud VPS is a service offered by the Wikimedia Foundation, built using OpenStack and managed by the Wikimedia Cloud Services team. It provides cloud computing resources for projects related to the Wikimedia movement, including virtual machines, databases, storage, Kubernetes, and DNS. A few weeks ago, in April 2025, we were finally able to introduce IPv6 to the cloud virtual network, enhancing the platform's scalability, security, and future-readiness. This is a major milestone, many years in the making, and serves as an excellent point to take a moment to reflect on the road that got us here. There were definitely a number of challenges that needed to be addressed before we could get into IPv6. This post covers the journey to this implementation. The Wikimedia Foundation was an early adopter of the OpenStack technology, and the original OpenStack deployment in the organization dates back to 2011. At that time, IPv6 support was still nascent and had limited implementation across various OpenStack components. In 2012, the Wikimedia cloud users formally requested IPv6 support. When Cloud VPS was originally deployed, we had set up the network following some of the upstream-recommended patterns. In order for us to be able to implement IPv6 in a way that aligned with our architectural goals and operational requirements, pretty much all the elements in this setup would need to change. First of all, we needed to migrate from nova-networks into Neutron, a migration effort that started in 2017. Neutron was the more modern component to implement software-defined networks in OpenStack. To facilitate this transition, we made the strategic decision to backport certain functionalities from nova-networks into Neutron, specifically the dmz_cidr mechanism and some egress NAT capabilities. Once in Neutron, we started to think about IPv6. In 2018 there was an initial attempt to decide on the network CIDR allocations that Wikimedia Cloud Services would have. This initiative encountered unforeseen challenges and was subsequently put on hold. We focused on removing the previously backported nova-networks patches from Neutron. Between 2020 and 2021, we initiated another significant network refresh. We were able to introduce the cloudgw project, as part of a larger effort to rework the Cloud VPS edge network. The new edge routers allowed us to drop all the custom backported patches we had in Neutron from the nova-networks era, unblocking further progress. It is worth mentioning that the cloudgw router would use nftables as its firewalling and NAT engine. A pivotal decision in 2022 was to expose the OpenStack APIs to the internet, which crucially enabled infrastructure management via OpenTofu. This was key in the IPv6 rollout, as will be explained later. Before this, management was limited to Horizon (the OpenStack graphical interface) or the command-line interface accessible only from internal control servers. Later, in 2023, following the OpenStack project's announcement of the deprecation of the neutron-linuxbridge-agent, we began to seriously consider migrating to the neutron-openvswitch-agent.
This transition would, in turn, simplify the enablement of tenant networks, a feature allowing each OpenStack project to define its own isolated network, rather than all virtual machines sharing a single flat network. Once we replaced neutron-linuxbridge-agent with neutron-openvswitch-agent, we were ready to migrate virtual machines to VXLAN. Demonstrating perseverance, we decided to execute the VXLAN migration in conjunction with the IPv6 rollout. We prepared and tested several things, including the rework of the edge routing to be based on BGP/OSPF instead of static routing. In 2024 we were ready for the initial attempt to deploy IPv6, which failed for unknown reasons. There was a full network outage and we immediately reverted the changes. This quick rollback was feasible due to our adoption of OpenTofu: deploying IPv6 had been reduced to a single code change within our repository. We started an investigation, corrected a few issues, and increased our network functional testing coverage before trying again. One of the problems we discovered was that Neutron would enable the enable_snat configuration flag for our main router when adding the new external IPv6 address. Finally, in April 2025, after many years in the making, IPv6 was successfully deployed. Compared to the network from 2011, the setup has changed substantially. Over time, the WMCS team has skillfully navigated numerous challenges to ensure our service offerings consistently meet high standards of quality and operational efficiency. Often engaging in multi-year planning strategies, we have enabled ourselves to set and achieve significant milestones. The successful IPv6 deployment stands as further testament to the team's dedication and hard work over the years. I believe we can confidently say that the 2025 Cloud VPS represents its most advanced and capable iteration to date. This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

19 May 2025

Melissa Wen: A Look at the Latest Linux KMS Color API Developments on AMD and Intel

This week, I reviewed the last available version of the Linux KMS Color API. Specifically, I explored the proposed API by Harry Wentland and Alex Hung (AMD), their implementation for the AMD display driver, and tracked the parallel efforts of Uma Shankar and Chaitanya Kumar Borah (Intel) in bringing this plane color management to life. With this API in place, compositors will be able to provide better HDR support and advanced color management for Linux users. To get a hands-on feel for the API's potential, I developed a fork of drm_info compatible with the new color properties. This allowed me to visualize the display hardware color management capabilities being exposed. If you're curious and want to peek behind the curtain, you can find my exploratory work on the drm_info/kms_color branch. The README there will guide you through the simple compilation and installation process. Note: You will need to update libdrm to match the proposed API. You can find an updated version in my personal repository here. To avoid potential conflicts with your official libdrm installation, you can compile and install it in a local directory. Then, use the following command: export LD_LIBRARY_PATH="/usr/local/lib/" In this post, I invite you to familiarize yourself with the new API that is about to be released. You can start doing as I did below: just deploy a custom kernel with the necessary patches and visualize the interface with the help of drm_info. Or, better yet, if you are a userspace developer, you can start developing use cases by experimenting with it. The more eyes the better.

KMS Color API on AMD The great news is that AMD's driver implementation for plane color operations is being developed right alongside their Linux KMS Color API proposal, so it's easy to apply to your kernel branch and check it out. You can find details of their progress in AMD's series. I just needed to compile a custom kernel with this series applied, intentionally leaving out the AMD_PRIVATE_COLOR flag. The AMD_PRIVATE_COLOR flag guards driver-specific color plane properties, which experimentally expose hardware capabilities while we don't have the generic KMS plane color management interface available. If you don't know or don't remember the details of AMD driver-specific color properties, you can learn more about this work in my blog posts [1] [2] [3]. As driver-specific color properties and KMS colorops are redundant, the driver only advertises one of them, as you can see in AMD workaround patch 24. So, with the custom kernel image ready, I installed it on a system powered by AMD DCN3 hardware (i.e. my Steam Deck). Using my custom drm_info, I could clearly see the Plane Color Pipeline with eight color operations as below:
 "COLOR_PIPELINE" (atomic): enum  Bypass, Color Pipeline 258  = Bypass
     Bypass
     Color Pipeline 258
         Color Operation 258
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D Curve
             "BYPASS" (atomic): range [0, 1] = 1
             "CURVE_1D_TYPE" (atomic): enum  sRGB EOTF, PQ 125 EOTF, BT.2020 Inverse OETF  = sRGB EOTF
         Color Operation 263
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = Multiplier
             "BYPASS" (atomic): range [0, 1] = 1
             "MULTIPLIER" (atomic): range [0, UINT64_MAX] = 0
         Color Operation 268
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 3x4 Matrix
             "BYPASS" (atomic): range [0, 1] = 1
             "DATA" (atomic): blob = 0
         Color Operation 273
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D Curve
             "BYPASS" (atomic): range [0, 1] = 1
             "CURVE_1D_TYPE" (atomic): enum  sRGB Inverse EOTF, PQ 125 Inverse EOTF, BT.2020 OETF  = sRGB Inverse EOTF
         Color Operation 278
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D LUT
             "BYPASS" (atomic): range [0, 1] = 1
             "SIZE" (atomic, immutable): range [0, UINT32_MAX] = 4096
             "LUT1D_INTERPOLATION" (immutable): enum  Linear  = Linear
             "DATA" (atomic): blob = 0
         Color Operation 285
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 3D LUT
             "BYPASS" (atomic): range [0, 1] = 1
             "SIZE" (atomic, immutable): range [0, UINT32_MAX] = 17
             "LUT3D_INTERPOLATION" (immutable): enum  Tetrahedral  = Tetrahedral
             "DATA" (atomic): blob = 0
         Color Operation 292
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D Curve
             "BYPASS" (atomic): range [0, 1] = 1
             "CURVE_1D_TYPE" (atomic): enum  sRGB EOTF, PQ 125 EOTF, BT.2020 Inverse OETF  = sRGB EOTF
         Color Operation 297
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, Multiplier, 3D LUT  = 1D LUT
             "BYPASS" (atomic): range [0, 1] = 1
             "SIZE" (atomic, immutable): range [0, UINT32_MAX] = 4096
             "LUT1D_INTERPOLATION" (immutable): enum  Linear  = Linear
             "DATA" (atomic): blob = 0
Note that Gamescope is currently using AMD driver-specific color properties implemented by me, Autumn Ashton and Harry Wentland. It doesn't use this KMS Color API, and therefore COLOR_PIPELINE is set to Bypass. Once the API is accepted upstream, all users of the driver-specific API (including Gamescope) should switch to the KMS generic API, as this will be the official plane color management interface of the Linux kernel.

KMS Color API on Intel On the Intel side, the driver implementation available upstream was built upon an earlier iteration of the API. This meant I had to apply a few tweaks to bring it in line with the latest specifications. You can explore their latest work here. For a simplified setup combining the V9 of the Linux Color API, Intel's contributions, and my necessary adjustments, check out my dedicated branch. I then compiled a kernel from this integrated branch and deployed it on a system featuring Intel TigerLake GT2 graphics. Running my custom drm_info revealed a Plane Color Pipeline with three color operations as follows:
 "COLOR_PIPELINE" (atomic): enum  Bypass, Color Pipeline 480  = Bypass
     Bypass
     Color Pipeline 480
         Color Operation 480
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, 1D LUT Mult Seg, 3x3 Matrix, Multiplier, 3D LUT  = 1D LUT Mult Seg
             "BYPASS" (atomic): range [0, 1] = 1
             "HW_CAPS" (atomic, immutable): blob = 484
             "DATA" (atomic): blob = 0
         Color Operation 487
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, 1D LUT Mult Seg, 3x3 Matrix, Multiplier, 3D LUT  = 3x3 Matrix
             "BYPASS" (atomic): range [0, 1] = 1
             "DATA" (atomic): blob = 0
         Color Operation 492
             "TYPE" (immutable): enum  1D Curve, 1D LUT, 3x4 Matrix, 1D LUT Mult Seg, 3x3 Matrix, Multiplier, 3D LUT  = 1D LUT Mult Seg
             "BYPASS" (atomic): range [0, 1] = 1
             "HW_CAPS" (atomic, immutable): blob = 496
             "DATA" (atomic): blob = 0
Observe that Intel's approach introduces additional properties like HW_CAPS at the color operation level, along with two new color operation types: 1D LUT with Multiple Segments and 3x3 Matrix. It's important to remember that this implementation is based on an earlier stage of the KMS Color API and is awaiting review.

A Shout-Out to Those Who Made This Happen I'm impressed by the solid implementation and clear direction of the V9 of the KMS Color API. It aligns with the many insightful discussions we've had over the past years. A huge thank you to Harry Wentland and Alex Hung for their dedication in bringing this to fruition! Beyond their efforts, I deeply appreciate Uma and Chaitanya's commitment to updating Intel's driver implementation to align with the freshest version of the KMS Color API. The collaborative spirit of the AMD and Intel developers in sharing their color pipeline work upstream is invaluable. We're now gaining a much clearer picture of the color capabilities embedded in modern display hardware, all thanks to their hard work, comprehensive documentation, and engaging discussions. Finally, thanks to all the userspace developers, color science experts, and kernel developers from various vendors who actively participate in the upstream discussions, meetings, workshops, each iteration of this API, and the crucial code review process. I'm happy to be part of the final stages of this long kernel journey, but I know that when it comes to colors, one step is completed only for new challenges to be unlocked. Looking forward to meeting you at this year's Linux Display Next hackfest, organized by AMD in Toronto, to further discuss HDR, advanced color management, and other display trends.

12 May 2025

Freexian Collaborators: Debian Contributions: DebConf 25 preparations, PyPA tools updates, Removing libcrypt-dev from build-essential and more! (by Anupa Ann Joseph)

Debian Contributions: 2025-04 Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

DebConf 25 Preparations, by Stefano Rivera and Santiago Ruano Rincón DebConf 25 preparations continue. In April, the bursary team reviewed and ranked bursary applications. Santiago Ruano Rincón examined the current state of the conference's finances, to see if we could allocate any more money to bursaries. Stefano Rivera supported the bursary team's work with infrastructure and advice and added some metrics to assist Santiago's budget review. Santiago was also involved in different parts of the organization, including Content team matters, such as reviewing the first proposals, preparing public information about the new Academic Track, or coordinating different aspects of the Day trip activities and the Conference Dinner.

PyPA tools updates, by Stefano Rivera Around the beginning of the freeze (in retrospect, definitely too late) Stefano looked at updating setuptools in the archive to 78.1.0. This brings support for more comprehensive license expressions (PEP-639), which people are expected to adopt upstream soon. While the reverse-autopkgtests all passed, it all came with some unexpected complications, and turned into a mini-transition. The new setuptools broke shebangs for scripts (pypa/setuptools#4952). It also required a bump of wheel to 0.46, and wheel 0.46 now has a dependency outside the standard library (it de-vendored packaging). This meant it was no longer suitable to distribute a standalone wheel.whl file to seed into new virtualenvs, as virtualenv does by default. The good news here is that setuptools doesn't need wheel any more: it has included its own implementation of the bdist_wheel command since 70.1. But the world hadn't adapted to take advantage of this yet. Stefano scrambled to get all of these issues resolved upstream and in Debian. We're now at the point where python3-wheel-whl is no longer needed in Debian unstable, and it should migrate to trixie.

Removing libcrypt-dev from build-essential, by Helmut Grohne The crypt function was originally part of glibc, but it got separated to libxcrypt. As a result, libc6-dev now depends on libcrypt-dev. This poses a cycle during architecture cross bootstrap. As the number of packages actually using crypt is relatively small, Helmut proposed removing the dependency. He analyzed an archive rebuild kindly performed by Santiago Vila (not affiliated with Freexian) and estimated the necessary changes. It looks like we may complete this with modifications to less than 300 source packages in the forky cycle. Half of the bugs have been filed at this time. They are tracked with libcrypt-* usertags.

Miscellaneous contributions
  • Carles uploaded a new version of simplemonitor.
  • Carles improved the documentation of salsa-ci-team/pipeline regarding piuparts arguments.
  • Carles closed an FTBFS on gcc-15 on qnetload.
  • Carles worked on Catalan translations using po-debconf-manager: reviewed 57 translations and created their merge requests in salsa, created 59 bug reports for packages that didn't merge in more than 30 days. Followed up on merge requests and comments in bug reports. Managed some translations manually for packages that are not in Salsa.
  • Lucas did some work on the DebConf Content and Bursary teams.
  • Lucas fixed multiple CVEs and bugs involving the upgrade from bookworm to trixie in ruby3.3.
  • Lucas fixed a CVE in valkey in unstable.
  • Stefano updated beautifulsoup4, python-authlib, python-html2text, python-packaging, python-pip, python-soupsieve, and unidecode.
  • Stefano packaged python-dependency-groups, a new vendored library in python-pip.
  • During an afternoon Bug Squashing Party in Montevideo, Santiago uploaded a couple of packages fixing RC bugs #1057226 and #1102487. The latter was a sponsored upload.
  • Thorsten uploaded new upstream versions of brlaser, ptouch-driver and sane-airscan to get the latest upstream bug fixes into Trixie.
  • Raphaël filed an upstream bug on zim for a graphical glitch that he has been experiencing.
  • Colin Watson upgraded openssh to 10.0p1 (also known as 10.0p2), and debugged various follow-up bugs. This included adding riscv64 support to vmdb2 in passing, and enabling native wtmpdb support so that wtmpdb last now reports the correct tty for SSH connections.
  • Colin fixed dput-ng's override option, which had never previously worked.
  • Colin fixed a security bug in debmirror.
  • Colin did his usual routine work on the Python team: 21 packages upgraded to new upstream versions, 8 CVEs fixed, and about 25 release-critical bugs fixed.
  • Helmut filed patches for 21 cross build failures.
  • Helmut uploaded a new version of debvm featuring a new tool debefivm-create to generate EFI-bootable disk images compatible with other tools such as libvirt or VirtualBox. Much of the work was prototyped in earlier months. This generalizes mmdebstrap-autopkgtest-build-qemu.
  • Helmut continued reporting undeclared file conflicts and suggested package removals from unstable.
  • Helmut proposed build profiles for libftdi1 and gnupg2 to deal with recently added dependencies in the architecture cross bootstrap package set.
  • Helmut managed the /usr-move transition. He worked on ensuring that systemd would comply with Debian's policy. Dumat continues to locate problems here and there, occasionally yielding discussion. He sent a patch for an upgrade problem in zutils.
  • Anupa worked with the Debian publicity team to publish Micronews and Bits posts.
  • Anupa worked with the DebConf 25 content team to review talk and event proposals for DebConf 25.
