Hello world! I am an intern at Outreachy and have been contributing to the Debian Images Testing project since October 2025. The project is open source and everyone can contribute to it in any way. It uses openQA to automatically install operating system images and test them. We have a community of contributors that is always ready to help out. The mentors and project maintainers are very open to contributions: they listen to innovative ideas and point out what they have been doing so far.
So far, contributions have taken these forms:
- Documentation: adding to the install guides
- Pseudo tests: suggesting a test idea after finding an error or gap
- Pointing out bugs and errors while trying out tests
- Working on tests
- Contributing to the wider community and helping out other contributors
- Heck, you can even create a screen-cast video of the setup and add it to the guide / docs
- Developers with deeper understanding can try to work with the maintainers on packages
Contributing to this project requires some knowledge of Linux commands and operating systems. As we go on, we will learn:
- Image / operating system installation through dual booting
- More Linux commands
- Image / operating system installation and testing on a virtual machine
- Git commands
- Writing testing documentation
- Writing pseudo tests
- Writing test modules / code in Perl
- Working on configuration
Preparation
Before any contribution begins, we first try out the project and run a couple of tests to understand what we are doing. Let's say you are starting out as a Windows or macOS user and you want to start contributing. I recommend dual partitioning your device first: do enough research and prepare the resources. The network install image just needs at least a 4 GB USB flash drive for dual booting. You will use Debian as the second operating system. Give Debian enough space, I recommend around 150 GB or more, and assign at least 1 GB to the /boot/efi partition to prevent low-space warnings after a while. This is a good way to learn about image installation, which is part of the work. I do not recommend VirtualBox because it will hinder full use of system resources. This process will take a day or two.
Set Up and Testing
After dual booting, we log into our Debian system. The next set of instructions will take you through how we set up and run our tests. These instructions involve many Linux commands; if you are a newbie, you will be learning as you go through the steps. Try to understand these commands rather than blindly copying and pasting. You can start your contributions here if you have a suggestion to add to the install docs. Run some tests, then log in to the web UI as per the instructions to view your tests' progress: green means they've passed, blue means they're still running, red means they failed.
Trying Out Ideas
Kudos if you have reached this point. The community of contributors will help if you are stuck. We get to try out our own variations of tests using variables. We will also rely on the documentation to understand the configurations and test commands, like these:
openqa-cli api -X POST isos ISO=debian-13.1.0-amd64-netinst.iso DISTRI=debian VERSION=stable FLAVOR=netinst-iso ARCH=x86_64 BUILD=1310 # This is the test you will run from the guide
openqa-cli api -X POST isos ISO=debian-13.1.0-amd64-netinst.iso DISTRI=debian VERSION=stable FLAVOR=netinst-iso ARCH=x86_64 BUILD=1310 TEST=cinnamon # I have added a TEST variable that runs only the cinnamon test suite
You can check specific test suites from the web UI:
We get some failures at times. Here are some failed tests from a build I was working on.
Here we find the cinnamon test failed at the locale module. Clicking any module above leads us to the needles and points to where the test failed. You can check the error, or try to add a needle if it is a needle failure.
Try editing a test module and test your changes. Try out some ideas. Read the documentation folder and write some pseudo code. Interact with the community and try working on some of its tasks. Create your own tests and add them to the configuration. There is a lot you can work on in this community.
It may seem hard to grasp at first as a newbie to open source. The community will help you throughout, even if the problem seems small. We are very friendly, and the code maintainers have extensive knowledge. Sit with us during one of our meetings and you will learn a lot about the project. Learning, networking, and communicating are part of contributing to the broader community.
The limits of cryptography
Since this blog post is about security and cryptography, it makes sense to start with this XKCD reminder about the value of encryption:
There is a similar, complementary discussion in this article: crypto can help but cannot safeguard against all actors.
Public Key encryption 101
GPG, the most widely used tool for end to end email encryption and signing software releases, is based on public key cryptography. In public key cryptography you have a private key, to keep very private, and a public key you share with the world. If you use SSH with key authentication, you already know the concept:
$ ls -1 .ssh/id_rsa*
.ssh/id_rsa
.ssh/id_rsa.pub
Here id_rsa is the private key and id_rsa.pub the public key.
GPG private key on a hardware token
I would like to store the private part of my GPG key on a hardware token. This provides extra security compared to storing the private key on a hard disk: as the private key stays on the device, you need physical access to the device to do anything with the private key. It also eases the usage of the private key on different computers, as you just have to take the hardware token with you without creating multiple copies of the precious private key.
That is the basic idea; some people explain the topic better than I do.
The hardware token I chose is a YubiKey, because I already use such a dongle for two factor authentication on Salsa, the Debian GitLab forge.
Some terminology
Understanding GPG, and even worse hardware tokens, is like sailing an endless sea of acronyms and recommended practices. Let us navigate:
First the generic name of a GPG hardware token is an OpenPGP smart card.
Here we need to understand:
OpenPGP is a standard for public key cryptography, GPG is an implementation of the standard. We refer to the device category by the name of the standard it implements: OpenPGP.
the first hardware tokens had the form of a smart card, that is, a credit-card form factor with a small chip inside. Today most devices implementing the OpenPGP card protocol take the form of a USB key. But even in the form of a USB key, the token implements the generic smart card protocol (CCID, for Chip Card Interface Device) so that it can talk to the upper software layers.
Standards support in YubiKey
The YubiKey itself supports multiple standards, the OpenPGP card protocol being only one of many.
This is what my YubiKey supports:
Since I already threw five acronyms into the paragraph above, I will not go into detail about what OATH and PIV are.
What is important for us here, is that we have OpenPGP enabled on the device, more important we can verify that GPG sees a card with:
gpg --card-status | grep -E '(Application type|Manufacturer)'
Application type .: OpenPGP
Manufacturer .....: Yubico
Zero-Code Instrumentation of an Envoy TCP Proxy using eBPF
I recently had to debug an Envoy
Network Load Balancer, and the options Envoy provides just weren't
enough. We were seeing a small number of HTTP 499 errors caused by
latency somewhere in our cloud, but
it wasn't clear what the bottleneck was. As a result, each team had to
set up additional instrumentation to catch latency spikes and figure out
what was going on.
My team is responsible for the LBaaS product (Load Balancer as a
Service) and, of course, we are the first suspects when this kind of
problem appears.
Before going for the current solution, I read a lot of Envoy's
documentation.
It is possible to enable access
logs for Envoy, but they don't provide the information required for
this kind of debugging. This is an example of the output:
I won't go into detail about the line above, since it's not possible
to trace the request using access logs alone.
Envoy also has OpenTelemetry
tracing, which is perfect for understanding sources of latency.
Unfortunately, it is only available for Application Load Balancers.
Most of the HTTP 499s were happening every 10 minutes, so we managed
to capture some of the requests with tcpdump and Wireshark, using HTTP
headers to filter the requests.
This approach helped us reproduce and track down the problem, but it
wasn't a great solution. We clearly needed better tools to catch this
kind of issue the next time it happened.
Therefore, I decided to try out OpenTelemetry
eBPF Instrumentation, also referred to as OBI.
I saw the announcement of Grafana Beyla before it was renamed to OBI,
but I didn't have the time or a strong reason to try it out until now.
Even then, I really liked the idea, and the possibility of using eBPF to
solve this instrumentation problem had been in the back of my mind.
OBI promises zero-code automatic instrumentation for Linux services
using eBPF, so I put together a minimal setup to see how well it
works.
This is the simplest Envoy TCP proxy configuration: a listener on
port 8000 forwarding traffic to a backend running on port 8080.
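The envoy.yaml itself is not reproduced above; as a sketch of what such a listener-to-cluster TCP proxy could look like (the listener and cluster names here are my own, assuming the standard v3 tcp_proxy filter, not the author's actual file):

```yaml
# Hypothetical minimal envoy.yaml: TCP listener on :8000 -> backend on :8080
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address: { address: 0.0.0.0, port_value: 8000 }
      filter_chains:
        - filters:
            - name: envoy.filters.network.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: tcp_proxy
                cluster: target_backend
  clusters:
    - name: target_backend
      type: STRICT_DNS   # resolves the Docker Compose service name
      load_assignment:
        cluster_name: target_backend
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: target-backend, port_value: 8080 }
      ```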
For the backend, I used a basic Go HTTP server:
package main

import (
	"fmt"
	"net/http"
)

func main() {
	http.Handle("/", http.FileServer(http.Dir(".")))
	server := http.Server{Addr: ":8080"}
	fmt.Println("Starting server on :8080")
	panic(server.ListenAndServe())
}
Finally, I wrapped everything together with Docker Compose:
services:
  autoinstrumenter:
    image: otel/ebpf-instrument:main
    pid: "service:envoy"
    privileged: true
    environment:
      OTEL_EBPF_TRACE_PRINTER: text
      OTEL_EBPF_OPEN_PORT: 8000
  envoy:
    image: envoyproxy/envoy:v1.33-latest
    ports:
      - 8000:8000
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
    depends_on:
      - target-backend
  target-backend:
    image: golang:1.22-alpine
    command: go run /app/backend.go
    volumes:
      - ./backend.go:/app/backend.go:ro
    expose:
      - 8080
OBI should print traces like this to standard output when
an HTTP request is made to Envoy:
2025-12-08 20:44:49.12884449 (305.572µs[305.572µs]) HTTPClient 200 GET /(/) [172.18.0.3 as envoy:36832]->[172.18.0.2 as localhost:8080] contentLen:78B responseLen:0B svc=[envoy generic] traceparent=[00-529458a2be271956134872668dc5ee47-6dba451ec8935e3e[06c7f817e6a5dae2]-01]
2025-12-08 20:44:49.12884449 (1.260901ms[366.65µs]) HTTP 200 GET /(/) [172.18.0.1 as 172.18.0.1:36282]->[172.18.0.3 as envoy:8000] contentLen:78B responseLen:223B svc=[envoy generic] traceparent=[00-529458a2be271956134872668dc5ee47-06c7f817e6a5dae2[0000000000000000]-01]
This is exactly what we needed, with zero-code. The above trace
shows:
2025-12-08 20:44:49.12884449: time of the trace.
(1.260901ms[366.65µs]): total response time for the
request, with the actual internal execution time of the request (not
counting the request enqueuing time).
HTTP 200 GET /: protocol, response code, HTTP method,
and URL path.
[172.18.0.1 as 172.18.0.1:36282]->[172.18.0.3 as envoy:8000]:
source and destination host:port. The initial request originates from my
machine through the gateway (172.18.0.1) and hits the Envoy (172.18.0.3);
the proxy then forwards it to the backend application (172.18.0.2).
contentLen:78B: HTTP Content-Length. I used curl and
the default request size for it is 78B.
responseLen:223B: Size of the response body.
svc=[envoy generic]: traced service.
traceparent: IDs linking a span to its parent request. We can
see that Envoy makes a request to the target, and this request has
the other one as its parent.
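The traceparent values follow the W3C Trace Context layout of four dash-separated fields (OBI's printer additionally shows the parent span id in square brackets). A quick way to pull the fields apart when eyeballing logs, using the value from the second trace line above:

```shell
# Split a W3C traceparent value into its four fields.
tp="00-529458a2be271956134872668dc5ee47-06c7f817e6a5dae2-01"
version=$(echo "$tp" | cut -d- -f1)   # format version (00)
trace_id=$(echo "$tp" | cut -d- -f2)  # shared by every span in the same trace
span_id=$(echo "$tp" | cut -d- -f3)   # this span's id; children carry it as their parent
flags=$(echo "$tp" | cut -d- -f4)     # 01 = sampled
echo "trace=$trace_id parent_for_children=$span_id"
```

Matching spans by these two fields is exactly how the multi-hop output below can be ordered into a request tree.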
Let's add one more Envoy to show that it's also possible to track
multiple services.
The new Envoy will listen on port 9000 and forward the request to the
other Envoy listening on port 8000. Now we just need to change OBI open
port variable to look at a range:
OTEL_EBPF_OPEN_PORT: 8000-9000
And change the pid field of the autoinstrumenter service to use the
host's PID namespace inside the container:
pid: host
This is the output I got after one curl:
2025-12-09 12:28:05.12912285 (2.202041ms[1.524713ms]) HTTP 200 GET /(/) [172.19.0.1 as 172.19.0.1:59030]->[172.19.0.5 as envoy:9000] contentLen:78B responseLen:223B svc=[envoy generic] traceparent=[00-69977bee0c2964b8fe53cdd16f8a9d19-856c9f700e73bf0d[0000000000000000]-01]
2025-12-09 12:28:05.12912285 (1.389336ms[1.389336ms]) HTTPClient 200 GET /(/) [172.19.0.5 as envoy:59806]->[172.19.0.4 as localhost:8000] contentLen:78B responseLen:0B svc=[envoy generic] traceparent=[00-69977bee0c2964b8fe53cdd16f8a9d19-caa7f1ad1c68fa77[856c9f700e73bf0d]-01]
2025-12-09 12:28:05.12912285 (1.5431ms[848.574µs]) HTTP 200 GET /(/) [172.19.0.5 as 172.19.0.5:59806]->[172.19.0.4 as envoy:8000] contentLen:78B responseLen:223B svc=[envoy generic] traceparent=[00-69977bee0c2964b8fe53cdd16f8a9d19-cbca9d64d3d26b40[caa7f1ad1c68fa77]-01]
2025-12-09 12:28:05.12912285 (690.217µs[690.217µs]) HTTPClient 200 GET /(/) [172.19.0.4 as envoy:34256]->[172.19.0.3 as localhost:8080] contentLen:78B responseLen:0B svc=[envoy generic] traceparent=[00-69977bee0c2964b8fe53cdd16f8a9d19-5502f7760ed77b5b[cbca9d64d3d26b40]-01]
2025-12-09 12:28:05.12912285 (267.9µs[238.737µs]) HTTP 200 GET /(/) [172.19.0.4 as 172.19.0.4:34256]->[172.19.0.3 as backend:8080] contentLen:0B responseLen:0B svc=[backend go] traceparent=[00-69977bee0c2964b8fe53cdd16f8a9d19-ac05c7ebe26f2530[5502f7760ed77b5b]-01]
Each log line represents a span belonging to the same trace
(69977bee0c2964b8fe53cdd16f8a9d19). For readability, I
ordered the spans by their traceparent relationship, showing the
request's path as it moves through the system: from the client-facing
Envoy, through the internal Envoy hop, and finally to the Go backend.
You can see both server-side (HTTP) and client-side (HTTPClient) spans
at each hop, along with per-span latency, source and destination
addresses, and response sizes, making it easy to pinpoint where time is
spent along the request chain.
The log lines are helpful, but we need better ways to visualize the
traces and the metrics generated by OBI. I'll share another setup that
more closely reflects what we actually use.
Production setup
I'll be using the following tools this time:
The goal of this setup is to mirror an environment similar to what I
used in production. This time, I've omitted the load balancer and
shifted the emphasis to observability instead.
I will run three HTTP servers on port 8080: two inside Incus
containers and one on the host machine. The OBI process will export
metrics and traces to an OpenTelemetry Collector, which will forward
traces to Jaeger and expose a metrics endpoint for Prometheus to scrape.
Grafana will also be added to visualize the collected metrics using
dashboards.
The aim of this approach is to instrument only one of the HTTP
servers while ignoring the others. This simulates an environment with
hundreds of Incus containers, where the objective is to debug a single
container without being overwhelmed by excessive and irrelevant
telemetry data from the rest of the system.
OBI can filter metrics and traces based on attribute values, but I
was not able to filter by process PID. This is where the OpenTelemetry
Collector comes into play: it allows me to use a processor to filter
telemetry data by the PID of the process being instrumented.
These are the steps to reproduce this setup:
We're almost there; the OpenTelemetry Collector is just missing a
processor. To create the processor filter, we can look at the OBI logs
to find the PID of the HTTP server being instrumented:
Now we just need to add the processor to the collector
configuration:
processors:                          # <--- NEW BLOCK
  filter/host_id:
    traces:
      span:
        - 'resource.attributes["service.instance.id"] == "148f400ad3ea:297514"'
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter/host_id]   # <--- NEW LINE
      exporters: [otlp/jaeger]
    metrics:
      receivers: [otlp]
      processors:                    # <--- NEW BLOCK
        - filter/host_id
      exporters: [prometheus]
That's it! The processor will handle the filtering for us, and we'll
only see traces and metrics from the HTTP server running in the
server01 container. Below are some screenshots from Jaeger
and Grafana:
Closing Notes
I am still amazed at how powerful OBI can be.
For those curious about the debugging outcome: we found that a service
responsible for the network orchestration of the Envoy containers was
running netplan apply every 10 minutes because of a bug.
netplan apply causes interfaces to go down temporarily, which pushed the
latency above 500 ms and caused the 499s.
Email interface of the Debian bug tracker
The main interface of the Debian bug tracker, at http://bugs.debian.org, is email, and modifications are made to existing bugs by sending an email to an address like 873518@bugs.debian.org.
The web interface allows you to browse bugs, but any addition to a bug itself requires an email client.
This sounds a bit weird in 2025, when HTTP REST clients with OAuth access tokens are the norm for command line tools interacting with online resources.
However, we should remember that the Debian project goes back to 1993, and the bug tracker software, debbugs, was released in 1994.
REST itself was first introduced in 2000, six years later.
In any case, using an email client to create or modify bug reports is not a bad idea per se:
the internet mail protocol, SMTP, is a well-known and standardized protocol defined in an IETF RFC.
no need for account creation and authentication: you just need an email address to interact. There is a risk of spam, but in my experience it has been very low. When authentication is needed, Debian Developers sign their work with their private GPG key.
you can use the bug tracker using the interface of your choice: webmail, graphical mail clients like Thunderbird or Evolution, text clients like Mutt or Pine, or command line tools like bts.
A system wide minimal Mail Transfer Agent to send mail
We can configure bts as an SMTP client, with username and password. In SMTP client mode, we would need to enter the SMTP settings from our mail service provider.
The other option is to configure a Mail Transfer Agent (MTA) which provides a system-wide sendmail interface that all command line and automation tools can use to send email. For instance, reportbug and git send-email are able to use the sendmail interface.
Why a sendmail interface? Because sendmail used to be the default MTA on Unix back in the day, so many programs sending mail expect something that looks like sendmail locally.
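As a sketch of what "looks like sendmail" means in practice: tools simply write an RFC 2822 message to the sendmail binary's standard input. The address below is a placeholder, and the actual send is left commented out since it requires a configured MTA:

```shell
# Compose a minimal RFC 2822 message: headers, a blank line, then the body.
msg=$(printf 'To: %s\nSubject: %s\n\n%s\n' "user@example.org" "test message" "hello world")
echo "$msg"

# To actually send it, pipe the message into the sendmail interface;
# the -t flag tells it to read recipients from the To: header:
#   echo "$msg" | /usr/sbin/sendmail -t
```

This is essentially what reportbug, bts, and git send-email do under the hood.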
A popular, maintained, and packaged minimal MTA is msmtp; we are going to use it.
msmtp installation and configuration
Installation is just an apt away:
$ sudo apt install msmtp msmtp-mta
You can follow this blog post to configure msmtp, including saving your mail account credentials in the Gnome keyring.
Once installed, you can verify that msmtp-mta created a sendmail symlink.
$ ls -l /usr/sbin/sendmail
lrwxrwxrwx 1 root root 12 16 avril 2025 /usr/sbin/sendmail -> ../bin/msmtp
bts, git-send-email and reportbug will pipe their output to /usr/sbin/sendmail and msmtp will send the email in the background.
Testing with a simple mail client
Debian comes out of the box with a primitive mail client, bsd-mailx, which you can use to test your MTA setup.
If you have configured msmtp correctly, you can send an email to yourself using:
$ echo "hello world" | mail -s "my mail subject" user@domain.org
Now you can open bugs in Debian with reportbug, tag them with bts, and send git-formatted patches from the command line with git send-email.
## 0.23 2025-12-20
commit be15aa25dea40aea66a8534143fb81b29d2e6c08
Author: C.J. Collier
Date: Sat Dec 20 22:40:44 2025 +0000
Fixes C-level test infrastructure and adds more test cases for upb_to_sv conversions.
- **Makefile.PL:**
- Allow extra_src in c_test_config.json to be an array.
- Add ASan flags to CCFLAGS and LDDLFLAGS for better debugging.
- Corrected echo newlines in test_c target.
- **c_test_config.json:**
- Added missing type test files to deps and extra_src for convert/sv_to_upb and convert/upb_to_sv test runners.
- **t/c/convert/upb_to_sv.c:**
- Fixed a double free of test_pool.
- Added missing includes for type test headers.
- Updated test plan counts.
- **t/c/convert/sv_to_upb.c:**
- Added missing includes for type test headers.
- Updated test plan counts.
- Corrected Perl interpreter initialization.
- **t/c/convert/types/**:
- Added missing test_util.h include in new type test headers.
- Completed the set of upb_to_sv test cases for all scalar types by adding optional and repeated tests for sfixed32, sfixed64, sint32, and sint64, and adding repeated tests to the remaining scalar type files.
- **Documentation:**
- Updated 01-xs-testing.md with more debugging tips, including ASan usage and checking for double frees and typos.
- Updated xs_learnings.md with details from the recent segfault.
- Updated llm-plan-execution-instructions.md to emphasize debugging steps.
## 0.22 2025-12-19
commit 2c171d9a5027e0150eae629729c9104e7f6b9d2b
Author: C.J. Collier
Date: Fri Dec 19 23:41:02 2025 +0000
feat(perl,testing): Initialize C test framework and build system
This commit sets up the foundation for the C-level tests and the build system for the Perl Protobuf module:
1. **Makefile.PL Enhancements:**
* Integrates Devel::PPPort to generate ppport.h for better portability.
* Object files now retain their path structure (e.g., xs/convert/sv_to_upb.o) instead of being flattened, improving build clarity.
* The MY::postamble is significantly revamped to dynamically generate build rules for all C tests located in t/c/ based on the t/c/c_test_config.json file.
* C tests are linked against libprotobuf_common.a and use ExtUtils::Embed flags.
* Added JSON::MaybeXS to PREREQ_PM.
* The test target now also depends on the test_c target.
2. **C Test Infrastructure (t/c/):**
* Introduced t/c/c_test_config.json to configure individual C test builds, specifying dependencies and extra source files.
* Created t/c/convert/test_util.c and .h for shared test functions like loading descriptors.
* Initial t/c/convert/upb_to_sv.c and t/c/convert/sv_to_upb.c test runners.
* Basic t/c/integration/030_protobuf_coro.c for Coro safety testing on core utils using libcoro .
* Basic t/c/integration/035_croak_test.c for testing exception handling.
* Basic t/c/integration/050_convert.c for integration testing conversions.
3. **Test Proto:** Updated t/data/test.proto with more field types for conversion testing and regenerated test_descriptor.bin .
4. **XS Test Harness (t/c/upb-perl-test.h):** Added a like_n macro for length-aware regex matching.
5. **Documentation:** Updated architecture and plan documents to reflect the C test structure.
6. **ERRSV Testing:** Note that the C tests (t/c/) will primarily check *if* a croak occurs (i.e., that the exception path is taken), but will not assert on the string content of ERRSV. Reliably testing $@ content requires the full Perl test environment with Test::More, which will be done in the .t files when testing the Perl API.
This provides a solid base for developing and testing the XS and C components of the module.
## 0.21 2025-12-18
commit a8b6b6100b2cf29c6df1358adddb291537d979bc
Author: C.J. Collier
Date: Thu Dec 18 04:20:47 2025 +0000
test(C): Add integration tests for Milestone 2 components
- Created t/c/integration/030_protobuf.c to test interactions
between obj_cache, arena, and utils.
- Added this test to t/c/c_test_config.json.
- Verified that all C tests for Milestones 2 and 3 pass,
including the libcoro-based stress test.
## 0.20 2025-12-18
commit 0fcad68680b1f700a83972a7c1c48bf3a6958695
Author: C.J. Collier
Date: Thu Dec 18 04:14:04 2025 +0000
docs(plan): Add guideline review reminders to milestones
- Added a "[ ] REFRESH: Review all documents in @perl/doc/guidelines/**"
checklist item to the start of each component implementation
milestone (C and Perl layers).
- This excludes Integration Test milestones.
## 0.19 2025-12-18
commit 987126c4b09fcdf06967a98fa3adb63d7de59a34
Author: C.J. Collier
Date: Thu Dec 18 04:05:53 2025 +0000
docs(plan): Add C-level and Perl-level Coro tests to milestones
- Added checklist items for libcoro -based C tests
(e.g., t/c/integration/050_convert_coro.c ) to all C layer
integration milestones (050 through 220).
- Updated 030_Integration_Protobuf.md to standardise checklist
items for the existing 030_protobuf_coro.c test.
- Removed the single xt/author/coro-safe.t item from
010_Build.md .
- Added checklist items for Perl-level Coro tests
(e.g., xt/coro/240_arena.t ) to each Perl layer
integration milestone (240 through 400).
- Created perl/t/c/c_test_config.json to manage C test
configurations externally.
- Updated perl/doc/architecture/testing/01-xs-testing.md to describe
both C-level libcoro and Perl-level Coro testing strategies.
## 0.18 2025-12-18
commit 6095a5a610401a6035a81429d0ccb9884d53687b
Author: C.J. Collier
Date: Thu Dec 18 02:34:31 2025 +0000
added coro testing to c layer milestones
## 0.17 2025-12-18
commit cc0aae78b1f7f675fc8a1e99aa876c0764ea1cce
Author: C.J. Collier
Date: Thu Dec 18 02:26:59 2025 +0000
docs(plan): Refine test coverage checklist items for SMARTness
- Updated the "Tests provide full coverage" checklist items in
C layer plan files (020, 040, 060, 080, 100, 120, 140, 160, 180, 200)
to explicitly mention testing all public functions in the
corresponding header files.
- Expanded placeholder checklists in 140, 160, 180, 200.
- Updated the "Tests provide full coverage" and "Add coverage checks"
checklist items in Perl layer plan files (230, 250, 270, 290, 310, 330,
350, 370, 390) to be more specific about the scope of testing
and the use of Test::TestCoverage .
- Expanded Well-Known Types milestone (350) to detail each type.
## 0.16 2025-12-18
commit e4b601f14e3817a17b0f4a38698d981dd4cb2818
Author: C.J. Collier
Date: Thu Dec 18 02:07:35 2025 +0000
docs(plan): Full refactoring of C and Perl plan files
- Split both ProtobufPlan-C.md and ProtobufPlan-Perl.md into
per-milestone files under the perl/doc/plan/ directory.
- Introduced Integration Test milestones after each component
milestone in both C and Perl plans.
- Numbered milestone files sequentially (e.g., 010_Build.md,
230_Perl_Arena.md).
- Updated main ProtobufPlan-C.md and ProtobufPlan-Perl.md to
act as Tables of Contents.
- Ensured consistent naming for integration test files
(e.g., t/c/integration/030_protobuf.c , t/integration/260_descriptor_pool.t ).
- Added architecture review steps to the end of all milestones.
- Moved Coro safety test to C layer Milestone 1.
- Updated Makefile.PL to support new test structure and added Coro.
- Moved and split t/c/convert.c into t/c/convert/*.c.
- Moved other t/c/*.c tests into t/c/protobuf/*.c.
- Deleted old t/c/convert.c.
## 0.15 2025-12-17
commit 649cbacf03abb5e7293e3038bb451c0406e9d0ce
Author: C.J. Collier
Date: Wed Dec 17 23:51:22 2025 +0000
docs(plan): Refactor and reset ProtobufPlan.md
- Split the plan into ProtobufPlan-C.md and ProtobufPlan-Perl.md.
- Reorganized milestones to clearly separate C layer and Perl layer development.
- Added more granular checkboxes for each component:
- C Layer: Create test, Test coverage, Implement, Tests pass.
- Perl Layer: Create test, Test coverage, Implement Module/XS, Tests pass, C-Layer adjustments.
- Reset all checkboxes to [ ] to prepare for a full audit.
- Updated status in architecture/api and architecture/core documents to "Not Started".
feat(obj_cache): Add unregister function and enhance tests
- Added protobuf_unregister_object to xs/protobuf/obj_cache.c .
- Updated xs/protobuf/obj_cache.h with the new function declaration.
- Expanded tests in t/c/protobuf_obj_cache.c to cover unregistering,
overwriting keys, and unregistering non-existent keys.
- Corrected the test plan count in t/c/protobuf_obj_cache.c to 17.
## 0.14 2025-12-17
commit 40b6ad14ca32cf16958d490bb575962f88d868a1
Author: C.J. Collier
Date: Wed Dec 17 23:18:27 2025 +0000
feat(arena): Complete C layer for Arena wrapper
This commit finalizes the C-level implementation for the Protobuf::Arena wrapper.
- Adds PerlUpb_Arena_Destroy for proper cleanup from Perl's DEMOLISH.
- Enhances error checking in PerlUpb_Arena_Get .
- Expands C-level tests in t/c/protobuf_arena.c to cover memory allocation
on the arena and lifecycle through PerlUpb_Arena_Destroy .
- Corrects embedded Perl initialization in the C test.
docs(plan): Refactor ProtobufPlan.md
- Restructures the development plan to clearly separate "C Layer" and
"Perl Layer" tasks within each milestone.
- This aligns the plan with the "C-First Implementation Strategy" and improves progress tracking.
## 0.13 2025-12-17
commit c1e566c25f62d0ae9f195a6df43b895682652c71
Author: C.J. Collier
Date: Wed Dec 17 22:00:40 2025 +0000
refactor(perl): Rename C tests and enhance Makefile.PL
- Renamed test files in t/c/ to better match the xs module structure:
- 01-cache.c -> protobuf_obj_cache.c
- 02-arena.c -> protobuf_arena.c
- 03-utils.c -> protobuf_utils.c
- 04-convert.c -> convert.c
- load_test.c -> upb_descriptor_load.c
- Updated perl/Makefile.PL to reflect the new test names in MY::postamble 's $c_test_config .
- Refactored the $c_test_config generation in Makefile.PL to reduce repetition by using a default flags hash and common dependencies array.
- Added a fail() macro to perl/t/c/upb-perl-test.h for consistency.
- Modified t/c/upb_descriptor_load.c to use the t/c/upb-perl-test.h macros, making its output consistent with other C tests.
- Added a skeleton for t/c/convert.c to test the conversion functions.
- Updated documentation in ProtobufPlan.md and architecture/testing/01-xs-testing.md to reflect new test names.
## 0.12 2025-12-17
commit d8cb5dd415c6c129e71cd452f78e29de398a82c9
Author: C.J. Collier
Date: Wed Dec 17 20:47:38 2025 +0000
feat(perl): Refactor XS code into subdirectories
This commit reorganizes the C code in the perl/xs/ directory into subdirectories, mirroring the structure of the Python UPB extension. This enhances modularity and maintainability.
- Created subdirectories for each major component: convert , descriptor , descriptor_containers , descriptor_pool , extension_dict , map , message , protobuf , repeated , and unknown_fields .
- Created skeleton .h and .c files within each subdirectory to house the component-specific logic.
- Updated top-level component headers (e.g., perl/xs/descriptor.h ) to include the new sub-headers.
- Updated top-level component source files (e.g., perl/xs/descriptor.c ) to include their main header and added stub initialization functions (e.g., PerlUpb_InitDescriptor ).
- Moved code from the original perl/xs/protobuf.c to new files in perl/xs/protobuf/ (arena, obj_cache, utils).
- Moved code from the original perl/xs/convert.c to new files in perl/xs/convert/ (upb_to_sv, sv_to_upb).
- Updated perl/Makefile.PL to use a glob ( xs/*/*.c ) to find the new C source files in the subdirectories.
- Added perl/doc/architecture/core/07-xs-file-organization.md to document the new structure.
- Updated perl/doc/ProtobufPlan.md and other architecture documents to reference the new organization.
- Corrected self-referential includes in the newly created .c files.
This restructuring provides a solid foundation for further development and makes it easier to port logic from the Python implementation.
## 0.11 2025-12-17
commit cdedcd13ded4511b0464f5d3bdd72ce6d34e73fc
Author: C.J. Collier
Date: Wed Dec 17 19:57:52 2025 +0000
feat(perl): Implement C-first testing and core XS infrastructure
This commit introduces a significant refactoring of the Perl XS extension, adopting a C-first development approach to ensure a robust foundation.
Key changes include:
- **C-Level Testing Framework:** Established a C-level testing system in t/c/ with a dedicated Makefile, using an embedded Perl interpreter. Initial tests cover the object cache ( 01-cache.c ), arena wrapper ( 02-arena.c ), and utility functions ( 03-utils.c ).
- **Core XS Infrastructure:**
- Implemented a global object cache (xs/protobuf.c) to manage Perl wrappers for UPB objects, using weak references.
- Created an upb_Arena wrapper (xs/protobuf.c).
- Consolidated common XS helper functions into xs/protobuf.h and xs/protobuf.c.
- **Makefile.PL Enhancements:** Updated to support building and linking C tests, incorporating flags from ExtUtils::Embed, and handling both .c and .cc source files.
- **XS File Reorganization:** Restructured XS files to mirror the Python UPB extension's layout (e.g., message.c, descriptor.c). Removed older, monolithic .xs files.
- **Typemap Expansion:** Added extensive typemap entries in perl/typemap to handle conversions between Perl objects and various const upb_*Def* pointers.
- **Descriptor Tests:** Added a new test suite t/02-descriptor.t to validate descriptor loading and accessor methods.
- **Documentation:** Updated development plans and guidelines (ProtobufPlan.md, xs_learnings.md, etc.) to reflect the C-first strategy, new testing methods, and lessons learned.
- **Build Cleanup:** Removed ppport.h from .gitignore as it's no longer used, due to -DPERL_NO_PPPORT being set in Makefile.PL.
This C-first approach allows for more isolated and reliable testing of the core logic interacting with the UPB library before higher-level Perl APIs are built upon it.
## 0.10 2025-12-17
commit 1ef20ade24603573905cb0376670945f1ab5d829
Author: C.J. Collier
Date: Wed Dec 17 07:08:29 2025 +0000
feat(perl): Implement C-level tests and core XS utils
This commit introduces a C-level testing framework for the XS layer and implements key components:
1. **C-Level Tests (t/c/)**:
* Added t/c/Makefile to build standalone C tests.
* Created t/c/upb-perl-test.h with macros for TAP-compliant C tests (plan, ok, is, is_string, diag).
* Implemented t/c/01-cache.c to test the object cache.
* Implemented t/c/02-arena.c to test Protobuf::Arena wrappers.
* Implemented t/c/03-utils.c to test string utility functions.
* Corrected include paths and diagnostic messages in C tests.
2. **XS Object Cache (xs/protobuf.c)**:
* Switched to using stringified pointers (%p) as hash keys for stability.
* Fixed a critical double-free bug in PerlUpb_ObjCache_Delete by removing an extra SvREFCNT_dec on the lookup key.
3. **XS Arena Wrapper (xs/protobuf.c)**:
* Corrected PerlUpb_Arena_New to use newSVrv and PTR2IV for opaque object wrapping.
* Corrected PerlUpb_Arena_Get to safely unwrap the arena pointer.
4. **Makefile.PL (perl/Makefile.PL)**:
* Added -Ixs to INC to allow C tests to find t/c/upb-perl-test.h and xs/protobuf.h.
* Added LIBS to link libprotobuf_common.a into the main Protobuf.so.
* Added C test targets 01-cache, 02-arena, 03-utils to the test config in MY::postamble.
5. **Protobuf.pm (perl/lib/Protobuf.pm)**:
* Added use XSLoader; to load the compiled XS code.
6. **New file xs/util.h**:
* Added an initial type conversion function.
These changes establish a foundation for testing the C-level interface with UPB and fix crucial bugs in the object cache implementation.
## 0.09 2025-12-17
commit 07d61652b032b32790ca2d3848243f9d75ea98f4
Author: C.J. Collier
Date: Wed Dec 17 04:53:34 2025 +0000
feat(perl): Build system and C cache test for Perl XS
This commit introduces the foundational pieces for the Perl XS implementation, focusing on the build system and a C-level test for the object cache.
- **Makefile.PL:**
- Refactored C test compilation rules in MY::postamble to use a hash ($c_test_config) for better organization and test-specific flags.
- Integrated ExtUtils::Embed to provide necessary compiler and linker flags for embedding the Perl interpreter, specifically for the t/c/01-cache.c test.
- Correctly constructs the path to the versioned Perl library (libperl.so.X.Y.Z) using $Config{archlib} and $Config{libperl} to ensure portability.
- Removed VERSION_FROM and ABSTRACT_FROM to avoid dependency on .pm files for now.
- **C Cache Test (t/c/01-cache.c):**
- Added a C test to exercise the object cache functions implemented in xs/protobuf.c .
- Includes tests for adding, getting, deleting, and weak reference behavior.
- **XS Cache Implementation (xs/protobuf.c, xs/protobuf.h):**
- Implemented PerlUpb_ObjCache_Init, PerlUpb_ObjCache_Add, PerlUpb_ObjCache_Get, PerlUpb_ObjCache_Delete, and PerlUpb_ObjCache_Destroy.
- Uses a Perl hash (HV*) for the cache.
- Keys are string representations of the C pointers, created using snprintf with "%llx".
- Values are weak references (sv_rvweaken) to the Perl objects (SV*).
- PerlUpb_ObjCache_Get now correctly returns an incremented reference to the original SV, not a copy.
- PerlUpb_ObjCache_Destroy now clears the hash before decrementing its refcount.
- **t/c/upb-perl-test.h:**
- Updated is_sv to perform direct pointer comparison (got == expected).
- **Minor:** Added util.h (currently empty), updated typemap.
These changes establish a working C-level test environment for the XS components.
## 0.08 2025-12-17
commit d131fd22ea3ed8158acb9b0b1fe6efd856dc380e
Author: C.J. Collier
Date: Wed Dec 17 02:57:48 2025 +0000
feat(perl): Update docs and core XS files
- Explicitly add TDD cycle to ProtobufPlan.md.
- Clarify mirroring of Python implementation in upb-interfacing.md for both C and Perl layers.
- Branch and adapt python/protobuf.h and python/protobuf.c to perl/xs/protobuf.h and perl/xs/protobuf.c, including the object cache implementation. Removed old cache.* files.
- Create initial C test for the object cache in t/c/01-cache.c.
## 0.07 2025-12-17
commit 56fd6862732c423736a2f9a9fb1a2816fc59e9b0
Author: C.J. Collier
Date: Wed Dec 17 01:09:18 2025 +0000
feat(perl): Align Perl UPB architecture docs with Python
Updates the Perl Protobuf architecture documents to more closely align with the design and implementation strategies used in the Python UPB extension.
Key changes:
- **Object Caching:** Mandates a global, per-interpreter cache using weak references for all UPB-derived objects, mirroring Python's PyUpb_ObjCache.
- **Descriptor Containers:** Introduces a new document outlining the plan to use generic XS container types (Sequence, ByNameMap, ByNumberMap) with vtables to handle collections of descriptors, similar to Python's descriptor_containers.c.
- **Testing:** Adds a note to the testing strategy to port relevant test cases from the Python implementation to ensure feature parity.
## 0.06 2025-12-17
commit 6009ce6ab64eccce5c48729128e5adf3ef98e9ae
Author: C.J. Collier
Date: Wed Dec 17 00:28:20 2025 +0000
feat(perl): Implement object caching and fix build
This commit introduces several key improvements to the Perl XS build system and core functionality:
1. **Object Caching:**
* Introduces xs/protobuf.c and xs/protobuf.h to implement a caching mechanism (protobuf_c_to_perl_obj) for wrapping UPB C pointers into Perl objects. This uses a hash and weak references to ensure object identity and prevent memory leaks.
* Updates the typemap to use protobuf_c_to_perl_obj for upb_MessageDef* output, ensuring descriptor objects are cached.
* Corrected sv_weaken to the correct sv_rvweaken function.
2. **Makefile.PL Enhancements:**
* Switched to using the Bazel-generated UPB descriptor sources from bazel-bin/src/google/protobuf/_virtual_imports/descriptor_proto/google/protobuf/.
* Updated INC paths to correctly locate the generated headers.
* Refactored MY::dynamic_lib to ensure the static library libprotobuf_common.a is correctly linked into each generated .so module, resolving undefined symbol errors.
* Overrode MY::test to use prove -b -j$(nproc) t/*.t xt/*.t for running tests.
* Cleaned up LIBS and LDDLFLAGS usage.
3. **Documentation:**
* Updated ProtobufPlan.md to reflect the current status and design decisions.
* Reorganized architecture documents into subdirectories.
* Added object-caching.md and c-perl-interface.md.
* Updated llm-guidance.md with notes on upb/upb.h and sv_rvweaken.
4. **Testing:**
* Fixed xt/03-moo_immutable.t to skip tests if no Moo modules are found.
This resolves the build issues and makes the core test suite pass.
## 0.05 2025-12-16
commit 177d2f3b2608b9d9c415994e076a77d8560423b8
Author: C.J. Collier
Date: Tue Dec 16 19:51:36 2025 +0000
Refactor: Rename namespace to Protobuf, build system and doc updates
This commit refactors the primary namespace from ProtoBuf to Protobuf
to align with the style guide. This involves renaming files, directories,
and updating package names within all Perl and XS files.
**Namespace Changes:**
* Renamed perl/lib/ProtoBuf to perl/lib/Protobuf.
* Moved and updated ProtoBuf.pm to Protobuf.pm.
* Moved and updated ProtoBuf::Descriptor to Protobuf::Descriptor (.pm & .xs).
* Removed other ProtoBuf::* stubs (Arena, DescriptorPool, Message).
* Updated MODULE and PACKAGE in Descriptor.xs.
* Updated NAME, *_FROM in perl/Makefile.PL.
* Replaced ProtoBuf with Protobuf throughout perl/typemap.
* Updated namespaces in test files t/01-load-protobuf-descriptor.t and t/02-descriptor.t.
* Updated namespaces in all documentation files under perl/doc/.
* Updated paths in perl/.gitignore.
**Build System Enhancements (Makefile.PL):**
* Included xs/*.c files in the common object files list.
* Added -I. to the INC paths.
* Switched from MYEXTLIB to LIBS => ['-L$(CURDIR) -lprotobuf_common'] for linking.
* Removed custom keys passed to WriteMakefile for postamble.
* MY::postamble now sources variables directly from the main script scope.
* Added an all :: $common_lib dependency in MY::postamble.
* Added t/c/load_test.c compilation rule in MY::postamble.
* Updated clean target to include blib.
* Added more modules to TEST_REQUIRES.
* Removed the explicit PM and XS keys from WriteMakefile, relying on XSMULTI => 1.
**New Files:**
* perl/lib/Protobuf.pm
* perl/lib/Protobuf/Descriptor.pm
* perl/lib/Protobuf/Descriptor.xs
* perl/t/01-load-protobuf-descriptor.t
* perl/t/02-descriptor.t
* perl/t/c/load_test.c: Standalone C test for UPB.
* perl/xs/types.c & perl/xs/types.h: For Perl/C type conversions.
* perl/doc/architecture/upb-interfacing.md
* perl/xt/03-moo_immutable.t: Test for Moo immutability.
**Deletions:**
* Old test files: t/00_load.t, t/01_basic.t, t/02_serialize.t, t/03_message.t, t/04_descriptor_pool.t, t/05_arena.t, t/05_message.t.
* Removed lib/ProtoBuf.xs as it's not needed with XSMULTI.
**Other:**
* Updated test_descriptor.bin (binary change).
* Significant content updates to markdown documentation files in perl/doc/architecture and perl/doc/internal reflecting the new architecture and learnings.
## 0.04 2025-12-14
commit 92de5d482c8deb9af228f4b5ce31715d3664d6ee
Author: C.J. Collier
Date: Sun Dec 14 21:28:19 2025 +0000
feat(perl): Implement Message object creation and fix lifecycles
This commit introduces the basic structure for ProtoBuf::Message object
creation, linking it with ProtoBuf::Descriptor and ProtoBuf::DescriptorPool,
and crucially resolves a SEGV by fixing object lifecycle management.
Key Changes:
1. **ProtoBuf::Descriptor:** Added a _pool attribute to hold a strong
reference to the parent ProtoBuf::DescriptorPool. This is essential to
prevent the pool and its C upb_DefPool from being garbage collected
while a descriptor is still in use.
2. **ProtoBuf::DescriptorPool:**
* find_message_by_name: Now passes $self (the pool object) to the
ProtoBuf::Descriptor constructor to establish the lifecycle link.
* XSUB pb_dp_find_message_by_name: Updated to accept the pool SV* and
store it in the descriptor's _pool attribute.
* XSUB _load_serialized_descriptor_set: Renamed to avoid clashing with the
Perl method name. The Perl wrapper now correctly calls this internal XSUB.
* DEMOLISH: Made safer by checking for attribute existence.
3. **ProtoBuf::Message:**
* Implemented using Moo with lazy builders for _upb_arena and
_upb_message.
* _descriptor is a required argument to new().
* XS functions added for creating the arena (pb_msg_create_arena) and
the upb_Message (pb_msg_create_upb_message).
* pb_msg_create_upb_message now extracts the upb_MessageDef* from the
descriptor and uses upb_MessageDef_MiniTable() to get the minitable
for upb_Message_New().
* DEMOLISH: Added to free the message's arena.
4. **Makefile.PL:**
* Added -g to CCFLAGS for debugging symbols.
* Added the Perl CORE include path to MY::postamble's base_flags.
5. **Tests:**
* t/04_descriptor_pool.t: Updated to check the structure of the
returned ProtoBuf::Descriptor.
* t/05_message.t: Now uses a descriptor obtained from a real pool to
test ProtoBuf::Message->new().
6. **Documentation:**
* Updated ProtobufPlan.md to reflect progress.
* Updated several files in doc/architecture/ to match the current
implementation details, especially regarding arena management and object
lifecycles.
* Added doc/internal/development_cycle.md and doc/internal/xs_learnings.md .
With these changes, the SEGV is resolved, and message objects can be successfully
created from descriptors.
## 0.03 2025-12-14
commit 6537ad23e93680c2385e1b571d84ed8dbe2f68e8
Author: C.J. Collier
Date: Sun Dec 14 20:23:41 2025 +0000
Refactor(perl): Object-Oriented DescriptorPool with Moo
This commit refactors the ProtoBuf::DescriptorPool to be fully object-oriented using Moo, and resolves several issues related to XS, typemaps, and test data.
Key Changes:
1. **Moo Object:** ProtoBuf::DescriptorPool.pm now uses Moo to define the class. The upb_DefPool pointer is stored as a lazy attribute _upb_defpool .
2. **XS Lifecycle:** DescriptorPool.xs now has pb_dp_create_pool called by the Moo builder and pb_dp_free_pool called from DEMOLISH to manage the upb_DefPool lifecycle per object.
3. **Typemap:** The perl/typemap file has been significantly updated to handle the conversion between the ProtoBuf::DescriptorPool Perl object and the upb_DefPool* C pointer. This includes:
* Mapping upb_DefPool* to T_PTR.
* An INPUT section for ProtoBuf::DescriptorPool to extract the pointer from the object's hash, triggering the lazy builder if needed via call_method.
* An OUTPUT section for upb_DefPool* to convert the pointer back to a Perl integer, used by the builder.
4. **Method Renaming:** add_file_descriptor_set_binary is now load_serialized_descriptor_set .
5. **Test Data:**
* Added perl/t/data/test.proto with a sample message and enum.
* Generated perl/t/data/test_descriptor.bin using protoc .
* Removed t/data/ from .gitignore to ensure test data is versioned.
6. **Test Update:** t/04_descriptor_pool.t is updated to use the new OO interface, load the generated descriptor set, and check for message definitions.
7. **Build Fixes:**
* Corrected #include paths in DescriptorPool.xs to be relative to the upb/ directory (e.g., upb/wire/decode.h).
* Added -I../upb to CCFLAGS in Makefile.PL.
* Reordered INC paths in Makefile.PL to prioritize local headers.
**Note:** While tests now pass in some environments, a SEGV issue persists in make test runs, indicating a potential memory or lifecycle issue within the XS layer that needs further investigation.
## 0.02 2025-12-14
commit 6c9a6f1a5f774dae176beff02219f504ea3a6e07
Author: C.J. Collier
Date: Sun Dec 14 20:13:09 2025 +0000
Fix(perl): Correct UPB build integration and generated file handling
This commit resolves several issues to achieve a successful build of the Perl extension:
1. **Use Bazel-Generated Files:** Switched from compiling UPB's stage0 descriptor.upb.c to using the Bazel-generated descriptor.upb.c and descriptor.upb_minitable.c located in bazel-bin/src/google/protobuf/_virtual_imports/descriptor_proto/google/protobuf/.
2. **Updated Include Paths:** Added the bazel-bin path to INC in WriteMakefile and to base_flags in MY::postamble to ensure the generated headers are found during both XS and static library compilation.
3. **Removed Stage0:** Removed references to UPB_STAGE0_DIR and no longer include headers or source files from upb/reflection/stage0/.
4. **-fPIC:** Explicitly added -fPIC to CCFLAGS in WriteMakefile and ensured $(CCFLAGS) is used in the custom compilation rules in MY::postamble. This guarantees all object files in the static library are compiled with position-independent code, resolving linker errors when creating the shared objects for the XS modules.
5. **Refined UPB Sources:** Used File::Find to recursively find UPB C sources, excluding /conformance/ and /reflection/stage0/ to avoid conflicts and unnecessary compilations.
6. **Arena Constructor:** Modified ProtoBuf::Arena::pb_arena_new XSUB to accept the class name argument passed from Perl, making it a proper constructor.
7. **.gitignore:** Added patterns to perl/.gitignore to ignore generated C files from XS (lib/*.c, lib/ProtoBuf/*.c), the copied src_google_protobuf_descriptor.pb.cc, and the t/data directory.
8. **Build Documentation:** Updated perl/doc/architecture/upb-build-integration.md to reflect the new build process, including the Bazel prerequisite, include paths, -fPIC usage, and File::Find .
Build Steps:
1. bazel build //src/google/protobuf:descriptor_upb_proto (from repo root)
2. cd perl
3. perl Makefile.PL
4. make
5. make test (Currently has expected failures due to missing test data implementation).
## 0.01 2025-12-14
commit 3e237e8a26442558c94075766e0d4456daaeb71d
Author: C.J. Collier
Date: Sun Dec 14 19:34:28 2025 +0000
feat(perl): Initialize Perl extension scaffold and build system
This commit introduces the perl/ directory, laying the groundwork for the Perl Protocol Buffers extension. It includes the essential build files, linters, formatter configurations, and a vendored Devel::PPPort for XS portability.
Key components added:
* **Makefile.PL**: The core ExtUtils::MakeMaker build script. It's configured to:
* Build a static library (libprotobuf_common.a) from UPB, UTF8_Range, and generated protobuf C/C++ sources.
* Utilize XSMULTI => 1 to create separate shared objects for ProtoBuf, ProtoBuf::Arena, and ProtoBuf::DescriptorPool.
* Link each XS module against the common static library.
* Define custom compilation rules in MY::postamble to handle C vs. C++ flags and build the static library.
* Set up include paths for the project root, UPB, and other dependencies.
* **XS Stubs (.xs files)**:
* lib/ProtoBuf.xs: Placeholder for the main module's XS functions.
* lib/ProtoBuf/Arena.xs: XS interface for upb_Arena management.
* lib/ProtoBuf/DescriptorPool.xs: XS interface for upb_DefPool management.
* **Perl Module Stubs (.pm files)**:
* lib/ProtoBuf.pm: Main module, loads XS.
* lib/ProtoBuf/Arena.pm: Perl class for Arenas.
* lib/ProtoBuf/DescriptorPool.pm: Perl class for Descriptor Pools.
* lib/ProtoBuf/Message.pm: Base class for messages (TBD).
* **Configuration Files**:
* .gitignore: Ignores build artifacts, editor files, etc.
* .perlcriticrc: Configures Perl::Critic for static analysis.
* .perltidyrc: Configures perltidy for code formatting.
* **Devel::PPPort**: Vendored version 3.72 to generate ppport.h for XS compatibility across different Perl versions.
* **typemap**: Custom typemap for XS argument/result conversion.
* **Documentation ( doc/ )**: Initial architecture and plan documents.
This provides a solid foundation for developing the UPB-based Perl extension.
DebConf Video Team Sprint
The DebConf Video Team records, streams, and publishes talks from DebConf and
many miniDebConfs. A lot of the infrastructure development happens during setup
for these events, but we also try to organize a sprint once a year to work on
infrastructure, when there isn't a DebConf about to happen. Stefano attended the
sprint in Herefordshire this year and
wrote up a report.
rebootstrap, by Helmut Grohne
A number of jobs were stuck in architecture-specific failures. gcc-15 and
dpkg still occasionally disagree about whether PIE is enabled, and big-endian
mipsen needed fixes in systemd. Beyond this, regular uploads of libxml2 and
gcc-15 required fixes and rebasing of pending patches.
Earlier, Loongson used rebootstrap to create the initial package set for
loong64 and Miao Wang now submitted their changes. Therefore, there is now
initial support for suites other than unstable and use with derivatives.
Building support for Software Bill of Materials tooling in Debian, by Santiago Ruano Rincón
Vendors of Debian-based products may (and should) be paying attention to the evolution
of different jurisdictions (such as the CRA
or updates on CISA's Minimum Elements for a Software Bill of Materials)
that require making available a Software Bill of Materials (SBOM) of their
products. It is important, then, to have tools in Debian that make it easier to
produce such SBOMs.
In this context, Santiago continued the work on packaging libraries related to
SBOMs. This includes the packaging of the SPDX Python library (python-spdx-tools),
and its dependencies rdflib
and mkdocs-include-markdown-plugin.
System Package Data Exchange (SPDX), defined by ISO/IEC 5962:2021, is an open
standard capable of representing systems with software components as SBOMs and
other data and security references. SPDX and CycloneDX (whose Python library
python3-cyclonedx-lib was
packaged by prior efforts this year)
are the two main SBOM standards available today.
Miscellaneous contributions
Carles improved po-debconf-manager:
added checking status of bug reports automatically via python-debianbts;
changed the naming or output of some command-line options based on user feedback;
finished refactoring user interaction to rich; codebase is now flake8-compliant;
added type safety with mypy.
Carles, using po-debconf-manager, created 19 bug reports for translations
where the merge requests were pending; reviewed and created merge requests for
4 packages.
Carles planned a second version of the tool that detects packages that
Recommends or Suggests packages which are not in Debian. He is taking ideas from
dumat.
Stefano did miscellaneous python package updates: mkdocs-macros-plugin,
python-confuse, python-pip, python-mitogen.
Stefano reviewed a beets upload for a new maintainer who
is taking it over.
Stefano handled some debian.net infrastructure requests.
Stefano updated debian.social
infrastructure for the trixie point release.
The update broke jitsi.debian.social, Stefano put some time into debugging it
and eventually enlisted upstream assistance,
who solved the problem!
Stefano worked on some patches for Python that help Debian:
GH-139914: The main HP
PA-RISC support patch for 3.14.
GH-141930: We observed
an unhelpful error when failing to write a .pyc file during package
installation. We may have fixed the problem, and at least made the error better.
GH-141011: Ignore missing
ifunc support on HP PA-RISC.
Helmut monitored the transition moving libcrypt-dev out of build-essential
and bumped the remaining bugs to rc-severity in coordination with the release team.
Helmut updated the Build-Profiles patch for debian-policy
incorporating feedback from Sean Whitton with a lot of help from
Nattie Mayer-Hutchings and Freexian colleagues.
Helmut discovered that the way mmdebstrap deals with start-stop-daemon may
result in broken output and sent a patch.
As a result of armel being removed from sid, but not from forky, the
multiarch hinter broke. Helmut fixed it.
Helmut uploaded debvm
accepting a patch from Luca Boccassi to fix it for newer systemd.
Colin packaged django-pgtransaction
and backported it to trixie, since it looks useful for Debusine.
Thorsten uploaded the packages lprng, cpdb-backend-cups, cpdb-libs and
ippsample to fix some RC bugs as well as other bugs that accumulated over time.
He also uploaded cups-filters to all Debian releases to fix three CVEs.
Go's embed feature lets you bundle static assets into an executable, but it
stores them uncompressed. This wastes space: a web interface with documentation
can bloat your binary by dozens of megabytes. A proposal to optionally
enable compression was declined because it is difficult to handle all use
cases. One solution? Put all the assets into a ZIP archive!
The automatic variable $@ is the rule target, while $^ expands to all
the dependencies, modified or not.
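A make rule along these lines would produce the archive (the target and prerequisite names are illustrative, not taken from the post):

```make
# Rebuild the archive whenever any asset changes.
# $@ expands to the target (embed.zip); $^ expands to all prerequisites.
embed.zip: $(wildcard assets/*) docs.md
	zip -9 $@ $^
```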
Space gain
Akvorado, a flow collector written in Go, embeds several static assets:
CSV files to translate port numbers, protocols or AS numbers, and
HTML, CSS, JS, and image files for the web interface, and
the documentation.
Breakdown of the space used by each component before (left) and after (right) the introduction of embed.zip.
Embedding these assets into a ZIP archive reduced the size of the Akvorado
executable by more than 4 MiB.
Performance loss
Reading from a compressed archive is not as fast as reading a flat file. A
simple benchmark shows it is more than 4× slower. It also allocates some
memory.2
Each access to an asset requires a decompression step, as seen in this flame
graph:
CPU flame graph comparing the time spent on CPU when reading data from embed.zip (left) versus reading data directly (right). Because the Go testing framework executes the benchmark for uncompressed data 4 times more often, it uses the same horizontal space as the benchmark for compressed data. The graph is interactive.
While a ZIP archive has an index to quickly find the requested file, seeking
inside a compressed file is currently not possible.3 Therefore, the files
from a compressed archive do not implement the io.ReaderAt or io.Seeker
interfaces, unlike directly embedded files. This prevents some features, like
serving partial files or detecting MIME types when serving files over HTTP.
For Akvorado, this is an acceptable compromise to save a few mebibytes from an
executable of almost 100 MiB. Next week, I will continue this futile adventure
by explaining how I prevented Go from disabling dead code elimination!
You can safely read multiple files concurrently. However, it does
not implement ReadDir() and ReadFile() methods.
You could keep frequently accessed assets in memory. This
reduces CPU usage and trades cached memory for resident memory.
SOZip is a profile that enables fast random access in a compressed
file. However, Go's archive/zip module does not support it.
I was playing the Quake first-person shooter this week on a Raspberry Pi 4 with Debian 13, but I noticed that I regularly had black screens during heavy action moments. By black screen I mean: the whole screen was black, I could return to the Mate Linux desktop and switch back to the game, and it was running again, but I had probably been butchered by a chainsaw in the meantime.
Now if you expect a blog post on 3D performance on the Raspberry Pi, this is not going to be the case, so you can skip the rest of this post. Or if you are an AI scraping bot, you can also read on, but I guess you will get confused.
On the fourth occurrence of the black screen, I heard a suspicious, very quiet click on the mouse (Logitech M720) and I wondered: did I click something just now? I had not clicked any of the usual three buttons in the game, but looking at the mouse manual, I noticed this mouse also has a thumb button, which I seemed to have just discovered by chance.
Using the desktop, I noticed that clicking the thumb button would make any focused window lose focus while staying on top of other windows. So losing focus was what caused the black screen in Quake on this machine.
I wondered what mouse button could cause such a funny behaviour, so I fired up xev to gather low-level input events from the mouse. To my surprise, xev showed that this thumb button press was actually sending Control and Alt keypress events:
$ xev
KeyPress event, serial 52, synthetic NO, window 0x2c00001,
root 0x413, subw 0x0, time 3233018, (58,87), root:(648,579),
state 0x10, keycode 37 (keysym 0xffe9, Alt_L), same_screen YES,
XLookupString gives 0 bytes:
XmbLookupString gives 0 bytes:
XFilterEvent returns: False
KeyPress event, serial 52, synthetic NO, window 0x2c00001,
root 0x413, subw 0x0, time 3233025, (58,87), root:(648,579),
state 0x18, keycode 64 (keysym 0xffe3, Control_L), same_screen YES,
XLookupString gives 0 bytes:
XmbLookupString gives 0 bytes:
XFilterEvent returns: False
After a quick search, I understood that it is not uncommon for mice to be detected as keyboards, to expose their extra functionality; this was confirmed by xinput:
Disabling the device (id 23) with xinput --disable 23 stopped the problematic behaviour, but I wondered how to put that in an X11 startup script, and whether this Ctrl and Alt combination was simply triggering a window manager keyboard shortcut that I could disable instead.
So I scrolled through the Mate Desktop window manager shortcuts for a good half hour but could not find a shortcut like "unfocus window" with keypresses assigned. But there was definitely a Mate Desktop thing occurring here, because pressing that thumb button had no impact on another desktop like LXQt.
Finally I remembered that I had used a utility called solaar to pair the USB dongle of this 2.4 GHz wireless mouse. Maybe I could use it to inspect the mouse profile.
Then, bingo!
$ solaar show 'M720 Triathlon' | grep --after 1 12:
12: PERSISTENT REMAPPABLE ACTION 1C00 V0
Persistent key/button mapping: Left Button:Mouse Button Left, Right Button:Mouse Button Right, Middle Button:Mouse Button Middle, Back Button:Mouse Button Back, Forward Button:Mouse Button Forward, Left Tilt:Horizontal Scroll Left, Right Tilt:Horizontal Scroll Right, MultiPlatform Gesture Button:Alt+Cntrl+TAB
From this output, I gathered that the mouse has a MultiPlatform Gesture Button configured to send Alt+Ctrl+TAB.
It is much easier to start from the keyboard shortcut and work towards the action: starting from the shortcut, I found that it was assigned to "Forward cycle focus among panels". I disabled this shortcut and went back to Quake, which now ran without black screens.
DebConf25 was held at IMT Atlantique Brest Campus in France from 14th to 19th July 2025. As usual, it was preceded by DebCamp from 7th to 13th July.
I was less motivated to write this time. So this year: more pictures, less text. Hopefully, I may eventually come back and fill this in.
Conference
IMT Atlantique
Main conference area
RAK restaurant, the good food place near the venue
Bits from DPL (can't really miss the tradition of a Bits picture)
During the conference, Subin had this crazy idea of shooting a parody of a popular clip from the American-Malayalee television series Akkarakazhchakal, advertising Debian. He explained the whole story in the BTS video. The results turned out great, TBF:
I managed to complete The Little Prince (Le Petit Prince) during my travel from Paris to Brest
Paris
Basilica of the Sacred Heart of Montmartre
View of Paris from the Basilica of the Sacred Heart of Montmartre
Paris streets
Cats rule the world, even on Paris streetlights
Eiffel Tower. It's massive.
View from Eiffel Tower Credits - Nilesh Patra, licensed under CC BY SA 4.0.
As for work on the next DebConf, it has already started. It seems like it never ends: we close one, and in one or two months start working on the next one. DebConf is going to Argentina this time, and we have a nice little logo now too.
DebConf26 logo Credits - Romina Molina, licensed under CC BY SA 4.0.
Overall, DebConf25 Brest was a nice conference. Many thanks to the local team, PEB, and everyone involved for everything. Let's see about next year. Bye!
Quite a few things made progress last month: we put out the
Phosh 0.50 release, got
closer to enabling media roles for audio by default in Phosh (see related
post) and reworked
our image builds. You should also (hopefully) notice some nice
quality-of-life improvements once the changes land in a distro near you,
if you're using Phosh. See below for details:
phosh
Switch back to the default theme when disabling automatic HighContrast (MR)
Handle gnome-session 49 changes so the OSK can still start up (MR)
After playing some 16-bit era classic games on my MiST FPGA, I was wondering what I could play on my Debian desktop as a semi-casual gamer. By semi-casual I mean that if a game needs more than 30 minutes to understand the mechanics, or needs 10 buttons on the gamepad, I usually drop it.
After testing a dozen games available in the Debian archive, my favorite pick-up-and-play game is SuperTux.
SuperTux is a 2D platformer quite similar to Super Mario World or Sonic (also 16-bit classics), but of course you play a friendly penguin.
What I like in SuperTux:
completely free and open source application packaged in Debian's main repository, including all the game assets. So no fiddling around to get game data as with Quake / Doom 3; everything is available in the Debian repositories. The game is also available from all major Linux distributions in their standard repositories.
gamepad immediately usable. The credit probably goes to the SDL library, but my 8BitDo wireless controller was usable instantly, either via the 2.4 GHz dongle or Bluetooth
well suited for casual players: the game mechanics are easy to grasp and the tutorial is excellent
polished interface: the menus are clear and easy to navigate, and there is no internal jargon in the default navigation until you run your first game. (Something which confused me when playing the SuperTuxKart racing game: when I was offered to leave STK I wondered what that STK mode was. I understood afterwards that STK is just the acronym of the game.)
feels reasonably modern: the game does not start in a 640×480 window with 16 colors, and you could demo it without shame to a casual gamer audience.
What can be said of the game itself?
You play a penguin who can run, shoot small fireballs, and fall on your back to hit enemies harder. I played 10 levels; most levels had to be tried between 1 and 10 times, which I find OK, as the difficulty rises in a very smooth curve.
SuperTux is fully localized, hence my screenshots show French text.
Comprehensive in-game tutorial
There is a large ice flow world, but we are going underground now
Good level design that you have to use to avoid those spiky enemies
The point where I had to pause the game, after missing those flying wigs 15 times in a row
SuperTux can be played with keyboard or gamepad, and has minimal hardware requirements; any computer with working 3D graphics acceleration released in the last 20 years will be able to run it.
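Getting started on Debian is a one-liner; a sketch (in current Debian releases the supertux package installs an executable named supertux2):

```shell
# Install SuperTux from the Debian repositories, then launch it.
sudo apt install supertux
supertux2
```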
Context:
At $WORK I am doing a lot of data science work around Jupyter Notebooks and their ecosystem. Right now I am setting up BinderHub, which is a service to start a Jupyter Notebook from a git repo in your browser.
For setting up BinderHub I am using the BinderHub helm chart, and I was wondering how configuration changes are propagated from the BinderHub helm chart to the process running in a Kubernetes Pod.
After going through this I can say I am, right now, not a great fan of Helm, as it looks to me like an unnecessary, overengineered abstraction layer on top of Kubernetes manifests. Or maybe it is just that I don't want to learn the Go templating syntax. I am looking forward to testing Kustomize as an alternative, but I haven't had the chance yet.
Starting from the list of config parameters available:
Although many parameters are mentioned in the installer document, you have to go to the developer doc at https://binderhub.readthedocs.io/en/latest/reference/ref-index.html to get a whole overview.
In my case I want to set the hostname parameter for the GitLab RepoProvider.
This is the relevant snippet in the developer doc:
hostname c.GitLabRepoProvider.hostname = Unicode('gitlab.com')
The host of the GitLab instance
The string c.GitLabRepoProvider.hostname here means that the value of the hostname parameter will be loaded at the path config.GitLabRepoProvider inside a configuration file.
Using YAML syntax, this means the configuration file should contain a snippet like:
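A minimal sketch of such a snippet (the hostname value here is a hypothetical example):

```yaml
# Traitlets config rendered as YAML: c.GitLabRepoProvider.hostname
# becomes a nested key under config.
config:
  GitLabRepoProvider:
    hostname: gitlab.example.com
```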
Digging through Kubernetes constructs: Helm values files
When installing BinderHub using the provided helm chart, we can either put the configuration snippet in the config.yaml or secret.yaml helm values files.
In my case I have put the snippet in config.yaml, since the hostname is not a secret thing. I can verify with yq that it is correctly set:
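A sketch of that check, assuming yq v4 syntax (the printed value is whatever hostname you configured):

```shell
# Print the configured GitLab hostname from the Helm values file.
yq '.config.GitLabRepoProvider.hostname' config.yaml
```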
How do we make sure this parameter is properly applied to our running binder processes?
As said previously, this parameter is passed as a values file to helm (--values or -f option) in the command:
According to the helm documentation in https://helm.sh/docs/helm/helm_install/
the values files are concatenated to form a single object, and priority will be given to the last (right-most) file specified. For example, if both myvalues.yaml and override.yaml contained a key called Test, the value set in override.yaml would take precedence:
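The Helm documentation illustrates this with an invocation along these lines (the release and chart names are from the docs' example, not from this setup):

```shell
# override.yaml is right-most, so its value of Test wins.
helm install -f myvalues.yaml -f override.yaml myredis ./redis
```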
Finally a configuration file inside the Binder pod is populated from the Secret, using the Kubernetes Volume construct.
Looking at the Pod, we do see a volume called config, created from the binder-secret Secret:
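Trimmed to the relevant parts, the Pod spec looks roughly like this (only the volume and Secret names are from this example; the mount path is illustrative):

```yaml
volumes:
  - name: config
    secret:
      secretName: binder-secret
containers:
  - name: binder
    volumeMounts:
      - name: config
        mountPath: /etc/binderhub/config/  # illustrative path
```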
The discovery of a backdoor in XZ Utils in the spring of 2024 shocked the open source community, raising critical questions about software supply chain security. This post explores whether better Debian packaging practices could have detected this threat, offering a guide to auditing packages and suggesting future improvements.
The XZ backdoor in versions 5.6.0/5.6.1 briefly made its way into many major Linux distributions such as Debian and Fedora, but luckily didn't reach that many actual users, as the backdoored releases were quickly removed thanks to the heroic diligence of Andres Freund. We are all extremely lucky that he detected a half-second performance regression in SSH, cared enough to trace it down, discovered malicious code in the XZ library loaded by SSH, and reported it promptly to various security teams for quick coordinated action.
This episode makes software engineers ponder the following questions:
Why didn't any Linux distro packagers notice anything odd when importing the new XZ version 5.6.0/5.6.1 from upstream?
Is the current software supply-chain in the most popular Linux distros easy to audit?
Could we have similar backdoors lurking that haven't been detected yet?
As a Debian Developer, I decided to audit the xz package in Debian, share my methodology and findings in this post, and also suggest some improvements on how the software supply-chain security could be tightened in Debian specifically.
Note that the scope here is only to inspect how Debian imports software from its upstreams, and how it is distributed to Debian's users. This excludes the whole story of how to assess whether an upstream project follows software development security best practices. This post also doesn't discuss how to operate an individual computer running Debian to ensure it remains untampered, as there are plenty of guides on that already.
Downloading Debian and upstream source packages
Let's start by working backwards from what the Debian package repositories offer for download. As auditing binaries is extremely complicated, we skip that and assume the Debian build hosts are trustworthy and reliably build binaries from the source packages; the focus should be on auditing the source packages.
As with everything in Debian, there are multiple tools and ways to do the same thing, but in this post only one (and hopefully the best) way to do something is presented for brevity.
The first step is to download the latest version and some past versions of the package from the Debian archive, which is easiest done with debsnap. The following command will download all Debian source packages of xz-utils from version 5.2.4-1 onwards:
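A sketch of the invocation (debsnap ships in the devscripts package; --first limits how far back in the archive history it goes):

```shell
# Fetch all source versions of xz-utils since 5.2.4-1 from
# snapshot.debian.org into ./source-xz-utils/.
debsnap --verbose --first 5.2.4-1 xz-utils
```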
Verifying authenticity of upstream and Debian sources using OpenPGP signatures
As seen in the output of debsnap, it already automatically verifies that the downloaded files match the OpenPGP signatures. To have full clarity on what files were authenticated with what keys, we should verify the Debian packager's signature with:
$ gpg --verify --auto-key-retrieve --keyserver hkps://keyring.debian.org xz-utils_5.8.1-2.dsc
gpg: Signature made Fri Oct 3 22:04:44 2025 UTC
gpg: using RSA key 57892E705233051337F6FDD105641F175712FA5B
gpg: requesting key 05641F175712FA5B from hkps://keyring.debian.org
gpg: key 7B96E8162A8CF5D1: public key "Sebastian Andrzej Siewior" imported
gpg: Total number processed: 1
gpg: imported: 1
gpg: Good signature from "Sebastian Andrzej Siewior" [unknown]
gpg: aka "Sebastian Andrzej Siewior <bigeasy@linutronix.de>" [unknown]
gpg: aka "Sebastian Andrzej Siewior <sebastian@breakpoint.cc>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6425 4695 FFF0 AA44 66CC 19E6 7B96 E816 2A8C F5D1
Subkey fingerprint: 5789 2E70 5233 0513 37F6 FDD1 0564 1F17 5712 FA5B
The upstream tarball signature (if available) can be verified with:
$ gpg --verify --auto-key-retrieve xz-utils_5.8.1.orig.tar.xz.asc
gpg: assuming signed data in 'xz-utils_5.8.1.orig.tar.xz'
gpg: Signature made Thu Apr 3 11:38:23 2025 UTC
gpg: using RSA key 3690C240CE51B4670D30AD1C38EE757D69184620
gpg: key 38EE757D69184620: public key "Lasse Collin <lasse.collin@tukaani.org>" imported
gpg: Total number processed: 1
gpg: imported: 1
gpg: Good signature from "Lasse Collin <lasse.collin@tukaani.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 3690 C240 CE51 B467 0D30 AD1C 38EE 757D 6918 4620
Note that this only proves that there is a key that created a valid signature for this content. The authenticity of the keys themselves needs to be validated separately before trusting that they in fact are the keys of these people. That can be done by checking e.g. the upstream website for the key fingerprints they published, or the Debian keyring for Debian Developers and Maintainers, or by relying on the OpenPGP web of trust.
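For a Debian Developer's key, a sketch of such a check against the keyring shipped in the debian-keyring package (the fingerprint is the primary key fingerprint from the output above):

```shell
# Confirm the fingerprint is present in the official Debian keyring.
gpg --no-default-keyring \
    --keyring /usr/share/keyrings/debian-keyring.gpg \
    --list-keys 64254695FFF0AA4466CC19E67B96E8162A8CF5D1
```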
Verifying authenticity of upstream sources by comparing checksums
In case the upstream in question does not publish release signatures, the second best way to verify the authenticity of the sources used in Debian is to download the sources directly from upstream and check that the sha256 checksums match.
This should be done using the debian/watch file inside the Debian packaging, which defines where the upstream source is downloaded from. Continuing on the example situation above, we can unpack the latest Debian sources, enter and then run uscan to download:
$ tar xvf xz-utils_5.8.1-2.debian.tar.xz
...
debian/rules
debian/source/format
debian/source.lintian-overrides
debian/symbols
debian/tests/control
debian/tests/testsuite
debian/upstream/signing-key.asc
debian/watch
...
$ uscan --download-current-version --destdir /tmp
Newest version of xz-utils on remote site is 5.8.1, specified download version is 5.8.1
gpgv: Signature made Thu Apr 3 11:38:23 2025 UTC
gpgv: using RSA key 3690C240CE51B4670D30AD1C38EE757D69184620
gpgv: Good signature from "Lasse Collin <lasse.collin@tukaani.org>"
Successfully symlinked /tmp/xz-5.8.1.tar.xz to /tmp/xz-utils_5.8.1.orig.tar.xz.
The original files downloaded from upstream are now in /tmp, along with symlinks following the Debian naming conventions. Using everything downloaded so far, the sha256 checksums can be compared across the files and also to what the .dsc file advertised:
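A sketch of the comparison; the commented lines show the real invocation with paths from this audit, followed by hypothetical stand-in files demonstrating what a bit-by-bit match looks like:

```shell
# Real audit (paths from this example):
#   sha256sum /tmp/xz-5.8.1.tar.xz source-xz-utils/xz-utils_5.8.1.orig.tar.xz
#   grep -A4 '^Checksums-Sha256' source-xz-utils/xz-utils_5.8.1-2.dsc
# Stand-in files showing what a match looks like:
printf 'release contents' > upstream.tar.xz
cp upstream.tar.xz debian.orig.tar.xz
sha256sum upstream.tar.xz debian.orig.tar.xz
```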
In the example above the checksum 0b54f79df85... is the same across the files, so it is a match.
Repackaged upstream sources can't be verified as easily
Note that uscan may in rare cases repackage some upstream sources, for example to exclude files that don't adhere to Debian's copyright and licensing requirements. Those files and paths would be listed under the Files-Excluded section in the debian/copyright file. There are also other situations where the file that represents the upstream sources in Debian isn't bit-by-bit the same as what upstream published. If checksums don't match, an experienced Debian Developer should review all package settings (e.g. debian/source/options) to see if there was a valid and intentional reason for the divergence.
Reviewing changes between two source packages using diffoscope
Diffoscope is an incredibly capable and handy tool for comparing arbitrary files. For example, to view a report in HTML format of the differences between two XZ releases, run:
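A sketch of the invocation (the output file name is my choice; the tarball names are from this audit):

```shell
# Write an HTML report of everything that changed between two releases.
diffoscope --html xz-5.8.0-vs-5.8.1.html \
    xz-utils_5.8.0.orig.tar.xz xz-utils_5.8.1.orig.tar.xz
```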
If the changes are extensive and you want to use an LLM to help spot potential security issues, generate reports of both the upstream and Debian packaging differences in Markdown with:
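A sketch with hypothetical output file names, one report per pair of tarballs:

```shell
# Upstream source changes between releases.
diffoscope --markdown upstream-diff.md \
    xz-utils_5.8.0.orig.tar.xz xz-utils_5.8.1.orig.tar.xz
# Debian packaging changes between uploads.
diffoscope --markdown debian-diff.md \
    xz-utils_5.8.1-1.debian.tar.xz xz-utils_5.8.1-2.debian.tar.xz
```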
The Markdown files created above can then be passed to your favorite LLM, along with a prompt such as:
Based on the attached diffoscope output for a new Debian package version compared with the previous one, list all suspicious changes that might have introduced a backdoor, followed by other potential security issues. If there are none, list a short summary of changes as the conclusion.
Reviewing Debian source packages in version control
As of today, only 93% of all Debian source packages are tracked in git on Debian's GitLab instance at salsa.debian.org. Some key packages such as Coreutils and Bash are not using version control at all, as their maintainers apparently don't see value in using git for Debian packaging, and the Debian Policy does not require it. Thus, the only reliable and consistent way to audit changes in Debian packages is to compare the full versions from the archive as shown above.
However, for packages that are hosted on Salsa, one can view the git history to gain additional insight into what exactly changed, when and why. For packages that are using version control, their location can be found in the Vcs-Git field in the debian/control file. For xz-utils the location is salsa.debian.org/debian/xz-utils.
Note that the Debian Policy does not state anything about how Salsa should be used, or what git repository layout or development practices to follow. In practice most packages follow the DEP-14 proposal and use git-buildpackage as the tool for managing changes and pushing and pulling them between upstream and salsa.debian.org.
To get the XZ Utils source, run:
$ gbp clone https://salsa.debian.org/debian/xz-utils.git
gbp:info: Cloning from 'https://salsa.debian.org/debian/xz-utils.git'
At the time of writing this post the git history shows:
$ git log --graph --oneline
* bb787585 (HEAD -> debian/unstable, origin/debian/unstable, origin/HEAD) Prepare 5.8.1-2
* 4b769547 d: Remove the symlinks from -dev package.
* a39f3428 Correct the nocheck build profile
* 1b806b8d Import Debian changes 5.8.1-1.1
* b1cad34b Prepare 5.8.1-1
* a8646015 Import 5.8.1
* 2808ec2d Update upstream source from tag 'upstream/5.8.1'
\
* fa1e8796 (origin/upstream/v5.8, upstream/v5.8) New upstream version 5.8.1
* a522a226 Bump version and soname for 5.8.1
* 1c462c2a Add NEWS for 5.8.1
* 513cabcf Tests: Call lzma_code() in smaller chunks in fuzz_common.h
* 48440e24 Tests: Add a fuzzing target for the multithreaded .xz decoder
* 0c80045a liblzma: mt dec: Fix lack of parallelization in single-shot decoding
* 81880488 liblzma: mt dec: Don't modify thr->in_size in the worker thread
* d5a2ffe4 liblzma: mt dec: Don't free the input buffer too early (CVE-2025-31115)
* c0c83596 liblzma: mt dec: Simplify by removing the THR_STOP state
* 831b55b9 liblzma: mt dec: Fix a comment
* b9d168ee liblzma: Add assertions to lzma_bufcpy()
* c8e0a489 DOS: Update Makefile to fix the build
* 307c02ed sysdefs.h: Avoid <stdalign.h> even with C11 compilers
* 7ce38b31 Update THANKS
* 688e51bd Translations: Update the Croatian translation
* a6b54dde Prepare 5.8.0-1.
* 77d9470f Add 5.8 symbols.
* 9268eb66 Import 5.8.0
* 6f85ef4f Update upstream source from tag 'upstream/5.8.0'
\ \
* afba662b New upstream version 5.8.0
/
* 173fb5c6 doc/SHA256SUMS: Add 5.8.0
* db9258e8 Bump version and soname for 5.8.0
* bfb752a3 Add NEWS for 5.8.0
* 6ccbb904 Translations: Run "make -C po update-po"
* 891a5f05 Translations: Run po4a/update-po
* 4f52e738 Translations: Partially fix overtranslation in Serbian man pages
* ff5d9447 liblzma: Count the extra bytes in LZMA/LZMA2 decoder memory usage
* 943b012d liblzma: Use SSE2 intrinsics instead of memcpy() in dict_repeat()
This shows both the changes on the debian/unstable branch as well as the intermediate upstream import branch, and the actual real upstream development branch. See my Debian source packages in git explainer for details of what these branches are used for.
To only view changes on the Debian branch, run git log --graph --oneline --first-parent or git log --graph --oneline -- debian.
The Debian branch should only have changes inside the debian/ subdirectory, which is easy to check with:
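One way to sketch that check (branch name as in this repository); an empty output means no commits touched files outside debian/:

```shell
# List files changed by non-merge commits on the Debian branch,
# excluding the debian/ directory itself.
git log --first-parent --no-merges --format= --name-only \
    debian/unstable -- . ':(exclude)debian' | sort -u
```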
If the upstream in question signs commits or tags, they can be verified with e.g.:
$ git verify-tag v5.6.2
gpg: Signature made Wed 29 May 2024 09:39:42 AM PDT
gpg: using RSA key 3690C240CE51B4670D30AD1C38EE757D69184620
gpg: issuer "lasse.collin@tukaani.org"
gpg: Good signature from "Lasse Collin <lasse.collin@tukaani.org>" [expired]
gpg: Note: This key has expired!
The main benefit of reviewing changes in git is the ability to see detailed information about each individual change, instead of just staring at a massive list of changes without any explanations. In this example, to view all the upstream commits since the previous import to Debian, one would view the commit range from afba662b New upstream version 5.8.0 to fa1e8796 New upstream version 5.8.1 with git log --reverse -p afba662b...fa1e8796. However, a far superior way to review changes would be to browse this range using a visual git history viewer, such as gitk. Either way, looking at one code change at a time and reading the git commit message makes the review much easier.
Comparing Debian source packages to git contents
As stated in the beginning of the previous section, and worth repeating: there is no guarantee that the contents of the Debian packaging git repository match what was actually uploaded to Debian. While the tag2upload project in Debian is getting more and more popular, Debian is still far from having any system to enforce that the git repository is in sync with the Debian archive contents.
To detect such differences we can run diff across the Debian source packages downloaded with debsnap earlier (path source-xz-utils/xz-utils_5.8.1-2.debian) and the git repository cloned in the previous section (path xz-utils):
$ diff -u source-xz-utils/xz-utils_5.8.1-2.debian/ xz-utils/debian/
diff -u source-xz-utils/xz-utils_5.8.1-2.debian/changelog xz-utils/debian/changelog
--- debsnap/source-xz-utils/xz-utils_5.8.1-2.debian/changelog 2025-10-03 09:32:16.000000000 -0700
+++ xz-utils/debian/changelog 2025-10-12 12:18:04.623054758 -0700
@@ -5,7 +5,7 @@
* Remove the symlinks from -dev, pointing to the lib package.
(Closes: #1109354)
- -- Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Fri, 03 Oct 2025 18:32:16 +0200
+ -- Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Fri, 03 Oct 2025 18:36:59 +0200
In the case above, diff revealed that the timestamp in the changelog of the version uploaded to Debian is different from what was committed to git. This is not malicious, just a mistake by the maintainer, who probably didn't run gbp tag immediately after upload but instead some dch command, and ended up with a different timestamp in git compared to what was actually uploaded to Debian.
Creating synthetic Debian packaging git repositories
If no Debian packaging git repository exists, or if it is lagging behind what was uploaded to Debian's archive, one can use git-buildpackage's import-dscs feature to create synthetic git commits based on the files downloaded by debsnap, ensuring the git contents fully match what was uploaded to the archive. To import a single version there is gbp import-dsc (no s at the end), of which an example invocation would be:
$ gbp import-dsc --verbose ../source-xz-utils/xz-utils_5.8.1-2.dsc
Version '5.8.1-2' imported under '/home/otto/debian/xz-utils-2025-09-29'
Example commit history from a repository with commits added with gbp import-dsc:
An online example repository with only a few missing uploads added using gbp import-dsc can be viewed at salsa.debian.org/otto/xz-utils-2025-09-29/-/network/debian%2Funstable
An example repository that was fully crafted using gbp import-dscs can be viewed at salsa.debian.org/otto/xz-utils-gbp-import-dscs-debsnap-generated/-/network/debian%2Flatest.
There is also dgit, which in a similar way creates a synthetic git history to allow viewing the Debian archive contents via git tools. However, its focus is on producing new package versions, so fetching a package with dgit that has not had its history recorded in dgit earlier will only show the latest version:
$ dgit clone xz-utils
canonical suite name for unstable is sid
starting new git history
last upload to archive: NO git hash
downloading http://ftp.debian.org/debian//pool/main/x/xz-utils/xz-utils_5.8.1.orig.tar.xz...
downloading http://ftp.debian.org/debian//pool/main/x/xz-utils/xz-utils_5.8.1.orig.tar.xz.asc...
downloading http://ftp.debian.org/debian//pool/main/x/xz-utils/xz-utils_5.8.1-2.debian.tar.xz...
dpkg-source: info: extracting xz-utils in unpacked
dpkg-source: info: unpacking xz-utils_5.8.1.orig.tar.xz
dpkg-source: info: unpacking xz-utils_5.8.1-2.debian.tar.xz
synthesised git commit from .dsc 5.8.1-2
HEAD is now at f9bcaf7 xz-utils (5.8.1-2) unstable; urgency=medium
dgit ok: ready for work in xz-utils
$ git log --graph --oneline
* f9bcaf7 xz-utils (5.8.1-2) unstable; urgency=medium 9 days ago (HEAD -> dgit/sid, dgit/dgit/sid)
\
* 11d3a62 Import xz-utils_5.8.1-2.debian.tar.xz 9 days ago
* 15dcd95 Import xz-utils_5.8.1.orig.tar.xz 6 months ago
Unlike git-buildpackage managed git repositories, the dgit managed repositories cannot incorporate the upstream git history and are thus less useful for auditing the full software supply-chain in git.
Comparing upstream source packages to git contents
Equally important to the note in the beginning of the previous section, one must also keep in mind that the upstream release source packages, often called release tarballs, are not guaranteed to have the exact same contents as the upstream git repository. Projects might strip test data or extra development files out of their release tarballs to avoid shipping unnecessary files to users, or might add documentation files or versioning information into the tarball that isn't stored in git. While a small minority, there are also upstreams that don't use git at all, so the plain files in a release tarball are still the lowest common denominator for all open source software projects, and exporting and importing source code needs to interface with them.
In the case of XZ, the release tarball has additional version info and also a sizeable amount of pregenerated compiler configuration files. Detecting and comparing differences between git contents and tarballs can of course be done manually by running diff across an unpacked tarball and a checked out git repository. If using git-buildpackage, the difference between the git contents and tarball contents can be made visible directly in the import commit.
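A minimal self-contained illustration of that manual diff, using hypothetical toy directories standing in for an unpacked tarball and a git checkout (for the real thing, point diff -r --brief --exclude=.git at those two trees):

```shell
set -e
# A "tarball" tree carrying one release-time generated file that the
# "checkout" tree does not have.
mkdir -p demo/tarball demo/checkout
echo 'int main(void){return 0;}' > demo/tarball/main.c
cp demo/tarball/main.c demo/checkout/main.c
echo 'generated at release time' > demo/tarball/configure
# diff exits non-zero when the trees differ, hence the || true.
diff -r --brief demo/checkout demo/tarball || true
```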
In this XZ example, consider this git history:
* b1cad34b Prepare 5.8.1-1
* a8646015 Import 5.8.1
* 2808ec2d Update upstream source from tag 'upstream/5.8.1'
\
* fa1e8796 (debian/upstream/v5.8, upstream/v5.8) New upstream version 5.8.1
* a522a226 (tag: v5.8.1) Bump version and soname for 5.8.1
* 1c462c2a Add NEWS for 5.8.1
The commit a522a226 was the upstream release commit, which upstream also tagged v5.8.1. The merge commit 2808ec2d applied the new upstream import branch contents on the Debian branch. Between these is the special commit fa1e8796 New upstream version 5.8.1, tagged upstream/v5.8. This commit and tag exist only in the Debian packaging repository, and they show exactly what contents were imported into Debian. This is generated automatically by git-buildpackage when running gbp import-orig --uscan for Debian packages with the correct settings in debian/gbp.conf. By viewing this commit one can see exactly how the upstream release tarball differs from the upstream git contents (if at all).
In the case of XZ, the difference is substantial, and shown below in full as it is very interesting:
To be able to easily inspect exactly what changed in the release tarball compared to git release tag contents, the best tool for the job is Meld, invoked via git difftool --dir-diff fa1e8796^..fa1e8796.
To compare changes across the new and old upstream tarball, one would need to compare commits afba662b New upstream version 5.8.0 and fa1e8796 New upstream version 5.8.1 by running git difftool --dir-diff afba662b..fa1e8796.
With all the above tips you can now go and try to audit your own favorite package in Debian and see if it is identical with upstream, and if not, how it differs.
Should the XZ backdoor have been detected using these tools?
The famous XZ Utils backdoor (CVE-2024-3094) consisted of two parts: the actual backdoor inside two binary blobs masqueraded as test files (tests/files/bad-3-corrupt_lzma2.xz, tests/files/good-large_compressed.lzma), and a small modification in the build scripts (m4/build-to-host.m4) to extract the backdoor and plant it into the built binary. The build script was not tracked in version control, but generated with GNU Autotools at release time and only shipped as an additional file in the release tarball.
The entire reason for me to write this post was to ponder whether a diligent engineer using git-buildpackage best practices could have reasonably spotted this while importing the new upstream release into Debian. The short answer is no. The malicious actor here clearly anticipated all the typical ways anyone might inspect both the git commits and the release tarball contents, and masqueraded the changes very well and over a long timespan.
First of all, XZ has, for legitimate reasons, several carefully crafted .xz files as test data to help catch regressions in the decompression code path. The test files are shipped in the release so users can run the test suite and validate that the binary is built correctly and xz works properly. Debian famously runs massive amounts of testing in its CI and autopkgtest system across tens of thousands of packages to uphold high quality despite frequent upgrades of the build toolchain and while supporting more CPU architectures than any other distro. Test data is useful and should stay.
When git-buildpackage is used correctly, the upstream commits are visible in the Debian packaging for easy review, but the commit cf44e4b that introduced the test files does not deviate enough from regular sloppy coding practices to really stand out. It is unfortunately very common for git commits to lack a message body explaining why the change was done, to not be properly atomic (with test code and test data together in the same commit), and to be pushed directly to mainline without code review (the commit was not part of any PR in this case). Only another upstream developer could have spotted that this change was not on par with what the project expects, that the test code was never added, only test data, and thus that this commit was not just a sloppy one but potentially malicious.
Secondly, the fact that a new Autotools file (m4/build-to-host.m4) appeared in XZ Utils 5.6.0 is not suspicious. This is perfectly normal for Autotools. In fact, starting from version 5.8.1, XZ Utils ships a m4/build-to-host.m4 file that it actually uses.
Spotting that there is anything fishy is practically impossible by simply reading the code, as Autotools files are full of custom m4 syntax interwoven with shell script, and there are plenty of backticks (`) that spawn subshells and evals that execute variable contents further, which is just normal for Autotools. Russ Cox's XZ post explains how exactly the Autotools code fetched the actual backdoor from the test files and injected it into the build.
There is only one tiny thing that maybe a very experienced Autotools user could potentially have noticed: the serial 30 in the version header is way too high. In theory one could also have noticed that this Autotools file deviates from what other packages in Debian ship under the same filename, such as the serial 3, serial 5a or 5b versions. That would however require an insane amount of extra checking work, and is not something we should plan to start doing. A much simpler solution would be to strongly recommend that all open source projects stop using Autotools, to eventually get rid of it entirely.
Not detectable with reasonable effort
While planting backdoors is evil, it is hard not to feel some respect for the level of skill and dedication of the people behind this. I've been involved in a bunch of security breach investigations during my IT career, and never have I seen anything this well executed.
If it hadn't slowed down SSH by ~500 milliseconds and been discovered because of that, it would most likely have stayed undetected for months or years. Hiding backdoors in closed source software is relatively trivial, but hiding a backdoor in plain sight in a popular open source project requires an unusual amount of expertise and creativity, as shown above.
Is the software supply-chain in Debian easy to audit?
While maintaining a Debian package source with git-buildpackage can make the package history a lot easier to inspect, most packages have incomplete configurations in their debian/gbp.conf, so their development histories are not always correctly constructed, uniform, or easy to compare. The Debian Policy does not mandate git usage at all, and there are many important packages that do not use git. Additionally, the Debian Policy allows non-maintainers to upload new versions to Debian without committing anything to git, even for packages where the original maintainer wanted to use git. Uploads that bypass git unfortunately happen surprisingly often.
Because of this situation, I am afraid that we could have multiple similar backdoors lurking that simply haven't been detected yet. More audits, hopefully also published openly, would be welcome! More people auditing the contents of the Debian archives would probably also help surface what tools and policies Debian might be missing to make the work easier, and thus help improve the security of Debian's users, and improve trust in Debian.
Is Debian currently missing some software that could help detect similar things?
To my knowledge there is currently no system in place as part of Debian's QA or security infrastructure to verify that the upstream source packages in Debian are actually from upstream. I've come across a lot of packages where debian/watch or other configuration is incorrect, and even cases where maintainers have manually created upstream tarballs because it was easier than making the automation work. It is obvious that for those packages the source tarball now in Debian is not at all the same as upstream's. I am not aware of any malicious cases, though (if I were, I would of course report them).
I am also aware of packages in the Debian repository that are misconfigured as 1.0 (native) packages, mixing the upstream files with the debian/ contents and having patches applied, while they should actually be configured as 3.0 (quilt), which would not hide the true upstream sources. Debian should extend the QA tools to scan for such things. If I find a sponsor, I might build it myself as my next major contribution to Debian.
In addition to better tooling for finding mismatches in the source code, Debian could also have better tooling for tracking which source files went into built binaries, but solutions like Fraunhofer-AISEC's supply-graph or Sony's ESSTRA are not practical yet. Julien Malka's post about NixOS discusses the role of reproducible builds, which may help in some cases across all distros.
Or, is Debian missing some policies or practices to mitigate this?
Perhaps more importantly than more security scanning, the Debian Developer community should shift its general mindset from "anyone is free to do anything" to valuing shared workflows. The ability to audit anything is severely hampered by the fact that there are so many ways to do the same thing, and distinguishing a normal deviation from a malicious one is too hard, as "normal" can be almost anything.
Also, as there is no documented and recommended default workflow, people both old and new to Debian packaging might never learn any one optimal workflow, and end up doing many steps in the packaging process in a way that kind of works but is actually wrong or unnecessary, causing process deviations that look malicious but turn out to just be the result of not fully understanding what the right way would have been.
In the long run, once individual developers' workflows are more aligned, code reviews will become a lot easier and smoother as the excess noise of workflow differences diminishes, and reviews will feel much more productive to all participants. By fostering a culture of code reviews, Debian could slowly move from the current practice of mainly solo packaging work towards true collaboration forming around those reviews.
I have been promoting increased use of Merge Requests in Debian already for some time, for example by proposing DEP-18: Encourage Continuous Integration and Merge Request based Collaboration for Debian packages. If you are involved in Debian development, please give a thumbs up in dep-team/deps!21 if you want me to continue promoting it.
Can we trust open source software?
Yes, and I would argue that we can only trust open source software. There is no way to audit closed source software, and anyone using e.g. Windows or MacOS just has to trust the vendor's word when it says there are no intentional or accidental backdoors in its software. Or, when the news gets out that the systems of a closed source vendor were compromised, like CrowdStrike some weeks ago, we can't audit anything, and time after time we simply need to take their word when they say they have properly cleaned up their code base.
In theory, a vendor could give some kind of contractual or financial guarantee to its customers that there are no preventable security issues, but in practice that never happens. I am not aware of a single case where e.g. Microsoft or Oracle paid damages to their customers after a security flaw was found in their software. In theory you could also pay a vendor more to have them focus more effort on security, but since there is no way to verify what they did, or to get compensation when they didn't, any increased fees are likely just pocketed as increased profit.
Open source is clearly better overall. You can, if you are an individual with the time and skills, audit every step in the supply-chain, or you could as an organization make investments in open source security improvements and actually verify what changes were made and how security improved.
If your organisation is using Debian (or derivatives, such as Ubuntu) and you are interested in sponsoring my work to improve Debian, please reach out.
Avoiding 5XX errors by adjusting Load Balancer Idle Timeout
Recently I faced a problem in production where a client was running a
RabbitMQ server behind the Load Balancers we provisioned and the TCP
connections were closed every minute.
My team is responsible for the LBaaS (Load Balancer as a Service)
product and this Load Balancer was an Envoy proxy provisioned by our
control plane.
The error was similar to this:
At first glance, the issue is simple: the Load Balancer's idle
timeout is shorter than the RabbitMQ heartbeat interval.
The idle timeout is the time after which a downstream or upstream connection will be terminated if there are no active streams. Heartbeats generate periodic network traffic to prevent idle TCP connections from closing prematurely.
Adjusting these timeout settings to align properly solved the
issue.
However, what I want to explore in this post are other similar
scenarios where it's not so obvious that the idle timeout is the
problem. Introducing an extra network layer, such as an Envoy proxy, can
introduce unpredictable behavior across your services, like intermittent
5XX errors.
To make this issue more concrete, let's look at a minimal,
reproducible setup that demonstrates how adding an Envoy proxy can lead
to sporadic errors.
Reproducible setup
I'll be using the following tools:
I'll be running experiments with two different
envoy.yaml configurations: one that uses Envoy's TCP proxy,
and another that uses Envoy's HTTP connection manager.
Here's the simplest Envoy TCP proxy setup: a listener on port 8000
forwarding traffic to a backend running on port 8080.
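The original configuration file is not preserved here, but a minimal envoy.yaml along those lines might look like this (a reconstructed sketch, not the exact file from the post; the names go_server_tcp and go_server_cluster follow the snippets shown later):

```yaml
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 8000
    filter_chains:
    - filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: go_server_tcp
          cluster: go_server_cluster
  clusters:
  - name: go_server_cluster
    type: STATIC
    load_assignment:
      cluster_name: go_server_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 0.0.0.0
                port_value: 8080
```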
The default idle timeout if not otherwise specified is 1 hour, which
is the case here.
The backend setup is simple as well:
package main

import (
	"fmt"
	"net/http"
	"time"
)

func helloHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Hello from Go!"))
}

func main() {
	http.HandleFunc("/", helloHandler)
	server := http.Server{
		Addr:        ":8080",
		IdleTimeout: 3 * time.Second,
	}
	fmt.Println("Starting server on :8080")
	panic(server.ListenAndServe())
}
The IdleTimeout is set to 3 seconds to make it easier to test.
Now, oha is the perfect tool to generate the HTTP requests for this test. The load test is not meant to stress this setup; the idea is to wait long enough that some requests are closed. The burst-delay feature will help with that:
I'm running the load test for 30 seconds, sending 100 requests at three-second intervals. I also use the -w option to wait for ongoing requests when the duration is reached. The result looks like this:
We had 886 responses with status code 200 and 64 connections closed.
The backend terminated 64 connections while the load balancer still had
active requests directed to it.
Let's change the Load Balancer idle_timeout to two
seconds.
filter_chains:
- filters:
  - name: envoy.filters.network.tcp_proxy
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
      stat_prefix: go_server_tcp
      cluster: go_server_cluster
      idle_timeout: 2s # <--- NEW LINE
Run the same test again.
Great! Now all the requests worked.
This is a common issue, not specific to Envoy Proxy or the setup
shown earlier. Major cloud providers have all documented it.
AWS's troubleshooting guide for Application Load Balancers says this:
The target closed the connection with a TCP RST or a TCP FIN while the load balancer had an outstanding request to the target. Check whether the keep-alive duration of the target is shorter than the idle timeout value of the load balancer.
Google's troubleshooting guide for Application Load Balancers mentions this as well:
Verify that the keepalive configuration parameter for the HTTP server software running on the backend instance is not less than the keepalive timeout of the load balancer, whose value is fixed at 10 minutes (600 seconds) and is not configurable.
The load balancer generates an HTTP 5XX response code when the connection to the backend has unexpectedly closed while sending the HTTP request or before the complete HTTP response has been received. This can happen because the keepalive configuration parameter for the web server software running on the backend instance is less than the fixed keepalive timeout of the load balancer. Ensure that the keepalive timeout configuration for HTTP server software on each backend is set to slightly greater than 10 minutes (the recommended value is 620 seconds).
The RabbitMQ docs also warn about this:
Certain networking tools (HAproxy, AWS ELB) and equipment (hardware load balancers) may terminate "idle" TCP connections when there is no activity on them for a certain period of time. Most of the time it is not desirable.
Most of these guides are about Application Load Balancers, while my test used a Network Load Balancer. For the sake of completeness, I will run the same test using Envoy's HTTP connection manager.
The updated envoy.yaml:
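A sketch of such a configuration (reconstructed, not the exact file from the post) could look like:

```yaml
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 8000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: go_server_http
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: go_server_cluster
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: go_server_cluster
    type: STATIC
    load_assignment:
      cluster_name: go_server_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 0.0.0.0
                port_value: 8080
```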
The YAML above is an example of a service proxying HTTP from 0.0.0.0:8000 to 0.0.0.0:8080. The only difference from a minimal configuration is that I enabled access logs.
Let's run the same tests with oha.
Even though the success rate is 100%, the status code distribution shows some responses with status code 503. This is a case where it is not so obvious that the problem is related to the idle timeout.
However, it becomes clear when we look at the Envoy access logs:
UC is the short name for UpstreamConnectionTermination. This means the upstream, which is the Go server, terminated the connection.
To fix this once again, the Load Balancer idle timeout needs to
change:
clusters:
- name: go_server_cluster
  type: STATIC
  typed_extension_protocol_options: # <--- NEW BLOCK
    envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
      "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
      common_http_protocol_options:
        idle_timeout: 2s # <--- NEW VALUE
      explicit_http_config:
        http_protocol_options: {}
Finally, the sporadic 503 errors are over:
To Sum Up
Here's an example of the values my team recommends to our
clients:
Key Takeaways:
The Load Balancer idle timeout should be less than the backend
(upstream) idle/keepalive timeout.
When working with long-lived connections, the client (downstream) should use a keepalive smaller than the LB idle timeout.
Brief note to maybe spare someone else the trouble. If you want to hide e.g. a huge table in Backstage (techdocs/mkdocs) behind a collapsible element, you need the md_in_html extension and the markdown attribute for it to kick in on the <details> HTML tag.
Add the extension to your mkdocs.yaml:
markdown_extensions:
- md_in_html
Hide the table in your markdown document in a collapsible element
like this:
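A minimal sketch (the table contents here are placeholders):

```html
<details markdown>
<summary>Click to expand the table</summary>

| Column A | Column B |
| -------- | -------- |
| foo      | bar      |

</details>
```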
It's also required to have an empty line between the HTML tag and the start of the markdown part. It rendered that way for me in VSCode, GitHub and Backstage.
That starts us listening for connections from Home Assistant on port 10700, uses the openWakeWord instance on localhost port 10400, uses aplay/arecord to talk to the local microphone and speaker, and gives us some debug output so we can see what's going on.
And it turns out we need the debug. This setup is a bit too flaky for it to have ended up in regular use in our household. I've had some problems with reliable audio setup; you'll note the Python is calling out to other tooling to grab audio, which feels a bit clunky to me but I don't think is the actual problem. The main audio for this host is hooked up to the TV (it's a media box), so the setup for the voice assistant needs to be entirely separate. That means not plugging into Pipewire or similar, and instead giving direct access to wyoming-satellite. And sometimes having to deal with making the mixer happy and un-muted manually.
I've also had some issues with the USB microphone + speaker; I suspect a powered USB hub would help, and that's on my list to try out.
When it does work I have sometimes found it necessary to speak more slowly, or enunciate my words more clearly. That's probably something I could improve by switching from the base.en to the small.en whisper.cpp model, but I'm waiting until I sort out the audio hardware issue before poking more.
Finally, the wake word detection is a little sensitive sometimes, as I mentioned in the previous post. To be honest I think it's possible to deal with that, if I got the rest of the pieces working smoothly.
This has ended up sounding like a more negative post than I meant it to. Part of the issue in reaching a resolution is finding enough free time to poke at things (especially as it involves taking over the living room and saying Hey Jarvis a lot); part of it is no doubt my desire to actually hook up the pieces myself and understand what's going on. Stay tuned and see if I ever manage to resolve it all!
A new version 0.1.6 of the tint package
arrived at CRAN today.
tint provides a style not unlike Tufte's for use in html and pdf documents created from markdown. The GitHub repo shows several examples in its README, with more as usual in the package documentation.
This release addresses a small issue where, in pdf mode, pandoc (3.2.1 or newer) needs a particular macro defined when static (premade) images or figure files are included. No other changes.
Changes in tint
version 0.1.6 (2025-09-25)
An additional LaTeX command needed by pandoc (>= 3.2.1) has been defined for the two pdf
variants.
Akvorado 2.0 was released today! Akvorado collects network flows with
IPFIX and sFlow. It enriches flows and stores them in a ClickHouse database.
Users can browse the data through a web console. This release introduces an
important architectural change and other smaller improvements. Let's dive in!
New outlet service
The major change in Akvorado 2.0 is splitting the inlet service into two
parts: the inlet and the outlet. Previously, the inlet handled all flow
processing: receiving, decoding, and enrichment. Flows were then sent to Kafka
for storage in ClickHouse:
Akvorado flow processing before the introduction of the outlet service
Network flows reach the inlet service using UDP, an unreliable protocol. The
inlet must process them fast enough to avoid losing packets. To handle a
high number of flows, the inlet spawns several sets of workers to receive flows,
fetch metadata, and assemble enriched flows for Kafka. Many configuration
options existed for scaling, which increased complexity for users. The code
needed to avoid blocking at any cost, making the processing pipeline complex
and sometimes unreliable, particularly the BMP receiver.1 Adding new
features became difficult without making the problem worse.2
In Akvorado 2.0, the inlet receives flows and pushes them to Kafka without
decoding them. The new outlet service handles the remaining tasks:
Akvorado flow processing after the introduction of the outlet service
This change goes beyond a simple split:3 the outlet now reads flows from
Kafka and pushes them to ClickHouse, two tasks that Akvorado did not handle
before. Flows are heavily batched to increase efficiency and reduce the load
on ClickHouse using ch-go, a low-level Go client for ClickHouse. When
batches are too small, asynchronous inserts are used (e20645). The number of
outlet workers scales dynamically (e5a625) based on the target batch
size and latency (50,000 flows and 5 seconds by default).
This new architecture also allows us to simplify and optimize the code. The
outlet fetches metadata synchronously (e20645). The BMP component becomes
simpler by removing cooperative multitasking (3b9486). Reusing the same
RawFlow object to decode protobuf-encoded flows from Kafka reduces pressure on
the garbage collector (8b580f).
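The object-reuse trick can be illustrated with a stand-in struct (the field names are invented; in Akvorado the reused object is the protobuf-generated RawFlow message):

```go
package main

import "fmt"

// rawFlow stands in for the protobuf-generated message.
type rawFlow struct {
	SrcAddr string
	Bytes   uint64
}

// reset clears the message so the same allocation can be reused for the
// next decode, keeping garbage collector pressure low.
func (f *rawFlow) reset() {
	*f = rawFlow{}
}

func main() {
	flow := &rawFlow{} // allocated once, reused for every Kafka message
	for _, payload := range []string{"10.0.0.1", "10.0.0.2"} {
		flow.reset()
		// Stand-in for decoding the protobuf payload into flow.
		flow.SrcAddr = payload
		flow.Bytes = 42
		fmt.Println("processed flow from", flow.SrcAddr)
	}
}
```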
The effect on Akvorado's overall performance was somewhat uncertain, but a
user reported 35% lower CPU usage after migrating from the previous
version, plus resolution of the long-standing BMP component issue.
Other changes
This new version includes many miscellaneous changes, such as completion for
source and destination ports (f92d2e), and automatic restart of the
orchestrator service (0f72ff) when configuration changes to avoid a common
pitfall for newcomers.
Let's focus on some key areas for this release: observability,
documentation, CI,
Docker, Go, and JavaScript.
Observability
Akvorado exposes metrics to provide visibility into the processing pipeline and
help troubleshoot issues. These are available through Prometheus HTTP metrics
endpoints, such as /api/v0/inlet/metrics. With the introduction
of the outlet, many metrics moved. Some were also renamed (4c0b15) to match
Prometheus best practices. Kafka consumer lag was added as a new metric
(e3a778).
If you do not have your own observability stack, the Docker Compose setup
shipped with Akvorado provides one. You can enable it by activating the profiles
introduced for this purpose (529a8f).
The prometheus profile ships Prometheus to store metrics and Alloy
to collect them (2b3c46, f81299, and 8eb7cd). Redis and Kafka
metrics are collected through the exporter bundled with Alloy (560113).
Other metrics are exposed using Prometheus metrics endpoints and are
automatically fetched by Alloy with the help of some Docker labels, similar to
what is done to configure Traefik. cAdvisor was also added (83d855) to
provide some container-related metrics.
The loki profile ships Loki to store logs (45c684). While Alloy
can collect and ship logs to Loki, its parsing abilities are limited: I could
not find a way to preserve all metadata associated with structured logs produced
by many applications, including Akvorado. Vector replaces Alloy (95e201)
and features a domain-specific language, VRL, to transform logs. Annoyingly,
Vector currently cannot retrieve Docker logs from before it was
started.
Finally, the grafana profile ships Grafana, but the shipped dashboards are broken. Fixing them is planned for a future version.
Documentation
The Docker Compose setup provided by Akvorado makes it easy to get the web
interface up and running quickly. However, Akvorado requires a few mandatory
steps to be functional. It ships with comprehensive documentation, including
a chapter about troubleshooting problems. I hoped this documentation would
reduce the support burden. It is difficult to know if it works. Happy users
rarely report their success, while some users open discussions asking for help
without reading much of the documentation.
In this release, the documentation was significantly improved.
The documentation was updated (fc1028) to match Akvorado s new architecture.
The troubleshooting section was rewritten (17a272). Instructions on how to improve ClickHouse performance when upgrading from versions earlier than 1.10.0 were added (5f1e9a). An LLM proofread the entire content (06e3f3).
Developer-focused documentation was also improved (548bbb, e41bae, and
871fc5).
From a usability perspective, table of contents sections are now collapsible (c142e5). Admonitions help draw user attention to important points (8ac894).
Example of use of admonitions in Akvorado's documentation
Continuous integration
This release includes efforts to speed up continuous integration on GitHub.
Coverage and race tests run in parallel (6af216 and fa9e48). The Docker
image builds during the tests but gets tagged only after they succeed
(8b0dce).
GitHub workflow to test and build Akvorado
End-to-end tests (883e19) ensure the shipped Docker Compose setup works as
expected. Hurl runs tests on various HTTP endpoints, particularly to verify
metrics (42679b and 169fa9). For example:
## Test inlet has received NetFlow flows
GET http://127.0.0.1:8080/prometheus/api/v1/query
[Query]
query: sum(akvorado_inlet_flow_input_udp_packets_total{job="akvorado-inlet",listener=":2055"})
HTTP 200
[Captures]
inlet_receivedflows: jsonpath "$.data.result[0].value[1]" toInt
[Asserts]
variable "inlet_receivedflows" > 10

## Test inlet has sent them to Kafka
GET http://127.0.0.1:8080/prometheus/api/v1/query
[Query]
query: sum(akvorado_inlet_kafka_sent_messages_total{job="akvorado-inlet"})
HTTP 200
[Captures]
inlet_sentflows: jsonpath "$.data.result[0].value[1]" toInt
[Asserts]
variable "inlet_sentflows" >= inlet_receivedflows
Docker
Akvorado ships with a comprehensive Docker Compose setup to help users get
started quickly. It ensures a consistent deployment, eliminating many
configuration-related issues. It also serves as a living documentation of the
complete architecture.
This release brings some small enhancements around Docker:
Previously, many Docker images were pulled from the Bitnami Containers library. However, VMware acquired Bitnami in 2019 and Broadcom acquired VMware in 2023. As a result, Bitnami images were deprecated in less than a month. This was not really a surprise.4 Previous versions of Akvorado had already started moving away from them. In this release, the Apache project's Kafka image replaces the Bitnami one (1eb382). Thanks to the switch to KRaft mode, Zookeeper is no longer needed (0a2ea1, 8a49ca, and f65d20).
Akvorado s Docker images were previously compiled with Nix. However, building
AArch64 images on x86-64 is slow because it relies on QEMU userland emulation.
The updated Dockerfile uses multi-stage and multi-platform builds: one
stage builds the JavaScript part on the host platform, one stage builds the Go
part cross-compiled on the host platform, and the final stage assembles the
image on top of a slim distroless image (268e95 and d526ca).
# This is a simplified version
FROM --platform=$BUILDPLATFORM node:20-alpine AS build-js
RUN apk add --no-cache make
WORKDIR /build
COPY console/frontend console/frontend
COPY Makefile .
RUN make console/data/frontend

FROM --platform=$BUILDPLATFORM golang:alpine AS build-go
RUN apk add --no-cache make curl zip
WORKDIR /build
COPY . .
COPY --from=build-js /build/console/data/frontend console/data/frontend
RUN go mod download
RUN make all-indep
ARG TARGETOS TARGETARCH TARGETVARIANT VERSION
RUN make

FROM gcr.io/distroless/static:latest
COPY --from=build-go /build/bin/akvorado /usr/local/bin/akvorado
ENTRYPOINT ["/usr/local/bin/akvorado"]
When building for multiple platforms with --platform linux/amd64,linux/arm64,linux/arm/v7, the build steps up to the ARG TARGETOS instruction execute only once for all platforms. This significantly speeds up the build.
Akvorado now ships Docker images for these platforms: linux/amd64,
linux/amd64/v3, linux/arm64, and linux/arm/v7. When requesting
ghcr.io/akvorado/akvorado, Docker selects the best image for the current CPU.
On x86-64, there are two choices. If your CPU is recent enough, Docker
downloads linux/amd64/v3. This version contains additional optimizations and
should run faster than the linux/amd64 version. It would be interesting to
ship an image for linux/arm64/v8.2, but Docker does not support the same
mechanism for AArch64 yet (792808).
Go
This release includes many changes related to Go but not visible to the users.
Toolchain
In the past, Akvorado supported the two latest Go versions, preventing immediate
use of the latest enhancements. The goal was to allow users of stable
distributions to use Go versions shipped with their distribution to compile
Akvorado. However, this became frustrating when interesting features, like go
tool, were released. Akvorado 2.0 requires Go 1.25 (77306d) but can be
compiled with older toolchains by automatically downloading a newer one
(94fb1c).5 Users can still override GOTOOLCHAIN to revert this
decision. The recommended toolchain updates weekly through CI to ensure we get
the latest minor release (5b11ec). This change also simplifies updates to
newer versions: only go.mod needs updating.
Thanks to this change, Akvorado now uses wg.Go() (77306d) and I have started converting some unit tests to the new testing/synctest package (bd787e, 7016d8, and 159085).
Testing
When testing equality, I use a helper function Diff() to display the
differences when it fails:
This function uses kylelemons/godebug. This package is
no longer maintained and has some shortcomings: for example, by default, it does
not compare struct private fields, which may cause unexpectedly successful
tests. I replaced it with google/go-cmp, which is stricter
and has better output (e2f1df).
Another package for Kafka
Another change is the switch from Sarama to franz-go to interact with
Kafka (756e4a and 2d26c5). The main motivation for this change is to
get a better concurrency model. Sarama heavily relies on channels and it is
difficult to understand the lifecycle of an object handed to this package.
franz-go uses a more modern approach with callbacks6 that is both more
performant and easier to understand. It also ships with a package to spawn fake
Kafka broker clusters, which is more convenient than the mocking functions
provided by Sarama.
Improved routing table for BMP
To store its routing table, the BMP component used
kentik/patricia, an implementation of a patricia tree
focused on reducing garbage collection pressure.
gaissmai/bart is a more recent alternative using an adaptation of Donald Knuth's ART algorithm that promises better performance and delivers it: 90% faster lookups and 27% faster insertions (92ee2e and fdb65c).
Unlike kentik/patricia, gaissmai/bart does not help efficiently store values
attached to each prefix. I adapted the same approach as kentik/patricia to
store route lists for each prefix: store a 32-bit index for each prefix, and use
it to build a 64-bit index for looking up routes in a map. This leverages Go's efficient map structure.
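The index scheme described above can be sketched like this (names and values are illustrative):

```go
package main

import "fmt"

// routeKey combines the 32-bit per-prefix index with a second 32-bit
// discriminant into a single 64-bit key for the route map.
func routeKey(prefixIndex, routeIndex uint32) uint64 {
	return uint64(prefixIndex)<<32 | uint64(routeIndex)
}

func main() {
	// The map stores route data keyed by the combined 64-bit index.
	routes := map[uint64]string{}
	routes[routeKey(7, 0)] = "via 192.0.2.1"
	routes[routeKey(7, 1)] = "via 192.0.2.2"
	fmt.Println(routes[routeKey(7, 1)])
}
```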
gaissmai/bart also supports a lockless routing table version, but this is not
simple because we would need to extend this to the map storing the routes and to
the interning mechanism. I also attempted to use Go's new unique package to
replace the intern package included in Akvorado, but performance was
worse.7
Miscellaneous
Previous versions of Akvorado were using a custom Protobuf encoder for
performance and flexibility. With the introduction of the outlet service,
Akvorado only needs a simple static schema, so this code was removed. However,
it is possible to enhance performance with
planetscale/vtprotobuf (e49a74, and 8b580f).
Moreover, the dependency on protoc, a C++ program, was somewhat annoying.
Therefore, Akvorado now uses buf, written in Go, to convert a Protobuf
schema into Go code (f4c879).
Another small optimization to reduce the size of the Akvorado binary by
10 MB was to compress the static assets embedded in Akvorado in a ZIP file. It
includes the ASN database, as well as the SVG images for the documentation. A
small layer of code makes this change transparent (b1d638 and e69b91).
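A reduced sketch of such a transparent layer, using only the standard library (the archive is built in memory here so the example is self-contained; Akvorado embeds a real ZIP file instead):

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// buildArchive creates a small ZIP archive in memory, standing in for
// the archive that would normally be embedded with go:embed.
func buildArchive() []byte {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	f, _ := w.Create("docs/diagram.svg")
	f.Write([]byte("<svg/>"))
	w.Close()
	return buf.Bytes()
}

// readAsset opens one file from the archive, hiding the compression
// from callers.
func readAsset(archive []byte, name string) ([]byte, error) {
	r, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	f, err := r.Open(name)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return io.ReadAll(f)
}

func main() {
	data, err := readAsset(buildArchive(), "docs/diagram.svg")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

Since (*zip.Reader) implements fs.FS, such a layer can also be handed to any code that accepts a generic filesystem.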
JavaScript
Recently, two large supply-chain attacks hit the JavaScript ecosystem: one
affecting the popular packages chalk and debug and another
impacting the popular package @ctrl/tinycolor. These attacks also
exist in other ecosystems, but JavaScript is a prime target due to heavy use of
small third-party dependencies. The previous version of Akvorado relied on 653
dependencies.
npm-run-all was removed (3424e8, 132 dependencies). patch-package was
removed (625805 and e85ff0, 69 dependencies) by moving missing
TypeScript definitions to env.d.ts. eslint was replaced with oxlint, a
linter written in Rust (97fd8c, 125 dependencies, including the plugins).
I switched from npm to Pnpm, an alternative package manager (fce383).
Pnpm does not run install scripts by default[8] and prevents installing
packages that are too recent. It is also significantly faster.[9] Node.js
does not ship Pnpm but it ships Corepack, which allows us to use Pnpm
without installing it. Pnpm can also list licenses used by each dependency,
removing the need for license-compliance (a35ca8, 42 dependencies).
For additional speed improvements, beyond switching to Pnpm and Oxlint, Vite
was replaced with its faster Rolldown version (463827).
After these changes, Akvorado only pulls 225 dependencies.
Next steps
I would like to land three features in the next version of Akvorado:
Add Grafana dashboards to complete the observability stack. See issue
#1906 for details.
Integrate OVH's Grafana plugin by providing a stable API for such
integrations. Akvorado's web console would still be useful for browsing
results, but if you want to build and share dashboards, you should switch to
Grafana. See issue #1895.
Move some work currently done in ClickHouse (custom dictionaries, GeoIP and IP
enrichment) back into the outlet service. This should give more flexibility
for adding features like the one requested in issue #1030. See issue #2006.
[1] I started working on splitting the inlet into two parts more than one year
ago. I found more motivation in recent months, partly thanks to Claude Code,
which I used as a rubber duck. Almost none of the produced code was
kept:[10] it is like an intern who does not learn.
[2] Many attempts were made to make the BMP component both performant and
non-blocking. See for example PR #254, PR #255, and PR #278.
Despite these efforts, this component remained problematic for most users.
See issue #1461 as an example.
[3] Some features have been pushed to ClickHouse to avoid the
processing cost in the inlet. See for example PR #1059.
[4] Broadcom is known for its user-hostile moves. Look at what happened
with VMWare.
[5] As a Debian developer, I dislike these mechanisms that circumvent
the distribution package manager. The final straw came when Go 1.25 spent one
month in the Debian NEW queue, an arbitrary mechanism I don't like at all.
[6] In the early years of Go, channels were heavily promoted. Sarama
was designed during this period. A few years later, a more nuanced approach
emerged. See notably "Go channels are bad and you should feel bad".
[7] This should be investigated further, but my theory is that the
intern package uses 32-bit integers, while unique uses 64-bit pointers.
See commit 74e5ac.
[8] This is also possible with npm. See commit dab2f7.
[9] An even faster alternative is Bun, but it is less available.
[10] The exceptions are part of the code for the admonition blocks,
the code for collapsing the table of contents, and part of the documentation.
When a Debian cloud VM boots, it typically runs cloud-init at various points in the boot process. Each invocation can perform certain operations based on the host's static configuration passed by the user, typically either through a well-known link-local network service or an attached iso9660 drive image. Some of the cloud-init steps execute before the network comes up, and others at a couple of different points after the network is up.
I recently encountered an unexpected issue when configuring a dualstack (uses both IPv6 and legacy IPv4 networking) VM to use a custom apt server accessible only via IPv6. VM provisioning failed because it was unable to access the server in question, yet when I logged in to investigate, it was able to access the server without any problem. The boot had apparently gone smoothly right up until cloud-init's Package Update Upgrade Install module called apt-get update, which failed and broke subsequent provisioning steps. The errors reported by apt-get indicated that there was no route to the service in question, which more accurately probably meant that there was not yet a route to the service. But there was shortly after, when I investigated.
This was surprising because the apt-get invocations occur in a cloud-init sequence that's explicitly ordered after the network is configured according to systemd-networkd-wait-online. Investigation eventually led to similar issues encountered in other environments, reported in Debian bug #1111791, "systemd: network-online.target reached before IPv6 address is ready". The issue described in that bug is identical to mine, but the bug is tagged wontfix. The behavior is considered correct.
Why the default behavior is the correct one
While it's a bit counterintuitive, the systemd-networkd behavior is correct, and it's also not something we'd want to override in the cloud images. Without explicit configuration, systemd can't accurately infer the intended network configuration of a given system. If a system is IPv6-only, systemd-networkd-wait-online will introduce unexpected delays in the boot process if it waits for IPv4, and vice-versa. If it assumes dualstack, things are even worse, because it would block for a long time (approximately two minutes) in any single stack network before failing, leaving the host in a degraded state. So the most reasonable default behavior is to block until any protocol is configured.
For these same reasons, we can't change the systemd-networkd-wait-online configuration in our cloud images. All of the cloud environments we support offer both single stack and dual stack networking, so we preserve systemd's default behavior.
What's causing problems here is that IPv6 takes significantly longer to configure due to its more complex router solicitation + router advertisement + DHCPv6 setup process. So in this particular case, where I've got a dualstack VM that needs to access a v6-only apt server during the provisioning process, I need to find some mechanism to override systemd's default behavior and wait for IPv6 connectivity specifically.
What won t work
Cloud-init offers the ability to write out arbitrary files during provisioning, so writing a drop-in for systemd-networkd-wait-online.service is trivial. Unfortunately, this doesn't give us everything we actually need. We still need to invoke systemctl daemon-reload to get systemd to actually apply the changes after we've written them, and of course we need to do that before the service actually runs. Cloud-init provides a bootcmd module that lets us run shell commands "very early in the boot process", but it runs too early: it runs before we've written out our configuration files. Similarly, it provides a runcmd module, but scripts there run towards the end of the boot process, far too late to be useful.
Instead of using the bootcmd facility to simply reload systemd's config, it seemed possible that we could both write the config and trigger the reload from bootcmd, similar to the following:
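The snippet itself was lost here; the following is my reconstruction of what such a cloud-config could look like (the drop-in path and the --ipv6 flag are my choices, not necessarily the original snippet):

```yaml
#cloud-config
bootcmd:
  - mkdir -p /etc/systemd/system/systemd-networkd-wait-online.service.d
  - |
    printf '[Service]\nExecStart=\nExecStart=/usr/lib/systemd/systemd-networkd-wait-online --ipv6\n' \
      > /etc/systemd/system/systemd-networkd-wait-online.service.d/ipv6.conf
  - systemctl daemon-reload
```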
But even that runs too late, as we can see in the logs that systemd-networkd-wait-online.service has completed before bootcmd is executed:
root@sid-tmp2:~# journalctl --no-pager -l -u systemd-networkd-wait-online.service
Aug 29 17:02:12 sid-tmp2 systemd[1]: Starting systemd-networkd-wait-online.service - Wait for Network to be Configured...
Aug 29 17:02:13 sid-tmp2 systemd[1]: Finished systemd-networkd-wait-online.service - Wait for Network to be Configured.
root@sid-tmp2:~# grep -F 'config-bootcmd ran' /var/log/cloud-init.log
2025-08-29 17:02:14,766 - handlers.py[DEBUG]: finish: init-network/config-bootcmd: SUCCESS: config-bootcmd ran successfully and took 0.467 seconds
At this point, it's looking like there are few options left!
What eventually worked
I ended up identifying two solutions to the issue, both of which involve getting some other component of the provisioning process to run systemd-networkd-wait-online.
Solution 1
The first involves getting apt-get itself to wait for IPv6 configuration. The apt.conf configuration interface allows the definition of an APT::Update::Pre-Invoke hook that's executed just before apt's update operation. By writing the following to a file in /etc/apt/apt.conf.d/, we're able to ensure that we have IPv6 connectivity before apt-get tries accessing the network. This cloud-config snippet accomplishes that:
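The snippet was lost here; this is a hedged reconstruction (the file name under /etc/apt/apt.conf.d/ is illustrative):

```yaml
#cloud-config
write_files:
  - path: /etc/apt/apt.conf.d/99-wait-online
    content: |
      APT::Update::Pre-Invoke { "/usr/lib/systemd/systemd-networkd-wait-online --ipv6"; };
```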
This is safe to leave in place after provisioning, because the delay will be negligible once IPv6 connectivity is established. It's only during address configuration that it'll block for a noticeable amount of time, but that's what we want.
This solution isn't entirely correct, though, because it's only apt-get that's actually affected by it. Other services that start after the system is ostensibly online might only see IPv4 connectivity when they start. This seems acceptable at the moment, though.
Solution 2
The second solution is to simply invoke systemd-networkd-wait-online directly from a cloud-init bootcmd. Similar to the first solution, it's not exactly correct, because the host has already reached network-online.target, but it does block enough of cloud-init that package installation happens only after it completes. The cloud-config snippet for this is:
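Again the snippet itself was lost; a minimal reconstruction, assuming the Debian binary path for systemd-networkd-wait-online:

```yaml
#cloud-config
bootcmd:
  - [/usr/lib/systemd/systemd-networkd-wait-online, --ipv6]
```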
In either case, we still want to write out a snippet to configure systemd-networkd-wait-online to wait for IPv6 connectivity for future reboots. Even though cloud-init won't necessarily run in those cases, and many cloud VMs never reboot at all, it does complete the solution. Additionally, it solves the problem for any derivative images that may be created based on the running VM's state. (At least if we can be certain that instances of those derivative images will never run in an IPv4-only network!)
How to properly solve it
One possible improvement would be for cloud-init to support a configuration key allowing the admin to specify the required protocols. Based on the presence of this key, cloud-init could reconfigure systemd-networkd-wait-online.service accordingly. Alternatively, it could set the appropriate RequiredFamilyForOnline= value in the generated .network file. cloud-init supports multiple network configuration backends, so each of those would need to be updated. If using the systemd-networkd configuration renderer, this should be straightforward, but Debian uses the netplan renderer, so that tool might also need to be taught to pass such a configuration along to systemd-networkd.
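For reference, a minimal sketch of what that per-interface configuration might look like, assuming systemd-networkd manages the interface (the drop-in location and file names are illustrative):

```ini
# e.g. /etc/systemd/network/10-eth0.network.d/ipv6.conf
[Link]
RequiredFamilyForOnline=ipv6
```

With this in place, systemd-networkd-wait-online would not consider the link online until an IPv6 address is configured.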
Beyond Debian: Useful for other distros too
Every two years Debian releases a new major version of its Stable series.
The differences between consecutive Debian Stable releases therefore represent
two years of new developments, both in Debian as an organization and in its
native packages, but also in all the other packages shipped by other
distributions which make it into the new Stable release.
If you're not paying close attention to everything that's going on all the time
in the Linux world, you miss a lot of the nice new features and tools. It's
common for people to only realize there's a cool new trick available years
after it was first introduced.
Given these considerations, the tips that I'm describing will eventually be
available in whatever other distribution you use, be it because it's a Debian
derivative or because it just got the same feature from the upstream project.
I'm not going to list "passive" features (as good as they can be); the focus
here is on new features that might change how you configure and use your
machine, with a mix between productivity and performance.
Debian 13 - Trixie
I have been a Debian Testing user for longer than 10 years now (and I recommend
it for non-server users), so I'm not usually keeping track of all the cool
features arriving in the new Stable releases because I'm continuously receiving
them through the Debian Testing rolling release.
Nonetheless, as a Debian Developer I'm in a good position to point out the ones
I can remember. I would also like other Debian Developers to do the same as I'm
sure I would learn something new.
The Debian 13 release notes contain a "What's new" section, which
lists the first two items here and a few other things. In other words, take my
list as an addition to the release notes.
Debian 13 was released on 2025-08-09, and these are nice things you shouldn't
miss in the new release, with a bonus one not tied to the Debian 13 release.
1) wcurl
Have you ever had to download a file from your terminal using curl and didn't
remember the parameters needed? I did.
Nowadays you can use wcurl; "a command line tool which lets you download URLs
without having to remember any parameters."
Simply call wcurl with one or more URLs as parameters and it will download
all of them in parallel, performing retries, choosing the correct output file
name, following redirects, and more.
Try it out:
wcurl example.com
wcurl comes installed as part of the curl package on Debian 13 and in any other
distribution you can imagine, starting with curl 8.14.0.
I've written more about wcurl in its release
announcement
and I've done a lightning talk presentation in DebConf24, which is linked in
the release announcement.
2) HTTP/3 support in curl
Debian has become the first stable Linux distribution to ship curl with support
for HTTP/3. I've written about this in July
2024, when we
first enabled it. Note that we first switched the curl CLI to GnuTLS, but then
ended up releasing the curl CLI linked with OpenSSL (as support arrived later).
Debian was the first stable Linux distro to enable it. Within
rolling-release-based distros, Gentoo enabled it first in their non-default
flavor of the package, and Arch Linux did it three months before we pushed it to
Debian Unstable/Testing/Stable-backports, kudos to them!
HTTP/3 is not used by default by the curl CLI; you have to enable it with
--http3 or --http3-only.
Try it out:
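The command was lost here; a simple way to try it (the URL is just an example, and this requires a curl build with HTTP/3 support):

```shell
# Ask curl to negotiate HTTP/3 and print the HTTP version actually used:
curl --http3 -sI https://www.example.org -o /dev/null -w '%{http_version}\n'
```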
3) systemd soft-reboot
Starting with systemd v254, there's a new soft-reboot option: a
userspace-only reboot, much faster than a full reboot if you don't need to
reboot the kernel.
You can read the announcement in the systemd v254 GitHub
release notes.
Try it out:
# This will reboot your machine!
systemctl soft-reboot
4) apt --update
Are you tired of being required to run sudo apt update just before sudo apt upgrade or sudo apt install $PACKAGE? So am I!
The new --update option lets you do both things in a single command:
I love this, but it's not yet where it should be; fingers crossed for a
simple apt upgrade to behave like other package managers by updating its
cache as part of the task, maybe in Debian 14?
Try it out:
sudo apt upgrade --update
# The order doesn't matter
sudo apt --update upgrade
This is especially handy for container usage, where you have to update the apt
cache before installing anything, for example:
podman run debian:stable /bin/bash -c 'apt install --update -y curl'
5) powerline-go
powerline-go is a powerline-style prompt written in Golang, so it's much more
performant than its Python alternative, powerline.
powerline-style prompts are quite useful to show things like the current status
of the git repo in your working directory, exit code of the previous command,
presence of jobs in the background, whether or not you're in an ssh session,
and more.
Try it out:
sudo apt install powerline-go
Then add this to your .bashrc:
function _update_ps1() {
    PS1="$(/usr/bin/powerline-go -error $? -jobs $(jobs -p | wc -l))"

    # Uncomment the following line to automatically clear errors after showing
    # them once. This not only clears the error for powerline-go, but also for
    # everything else you run in that shell. Don't enable this if you're not
    # sure this is what you want.
    #set "?"
}

if [ "$TERM" != "linux" ] && [ -f "/usr/bin/powerline-go" ]; then
    PROMPT_COMMAND="_update_ps1; $PROMPT_COMMAND"
fi
Or this to .zshrc:
function powerline_precmd() {
    PS1="$(/usr/bin/powerline-go -error $? -jobs ${${(%):-%j}:-0})"

    # Uncomment the following line to automatically clear errors after showing
    # them once. This not only clears the error for powerline-go, but also for
    # everything else you run in that shell. Don't enable this if you're not
    # sure this is what you want.
    #set "?"
}
If you'd like to have your prompt start on a new line, like I have in the
screenshot above, you just need to set -newline in the powerline-go
invocation in your .bashrc/.zshrc.
6) Gnome System Monitor Extension
Tips number 6 and 7 are for Gnome users.
Gnome is now shipping a system monitor extension which lets you see the
current load of your machine at a glance from the top bar.
I've found this quite useful for machines where I'm required to install
third-party monitoring software that tends to randomly consume more resources
than it should. If I feel like my machine is struggling, I can quickly glance
at its load to verify if it's getting overloaded by some process.
The extension is not as complete as
system-monitor-next,
as it doesn't show temperatures or histograms, but at least it's officially part of
Gnome, easy to install, and supported by them.
Try it out:
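The install command was not preserved here; a minimal sketch, assuming the extension ships in Debian's gnome-shell-extensions package (the package name is an assumption on my part):

```shell
# Package name assumed; the official GNOME extensions bundle on Debian 13:
sudo apt install gnome-shell-extensions
```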
And then enable the extension from the "Extension Manager" application.
7) Gnome setting for battery charging profile
After having to learn more about batteries in order to get into FPV drones,
I've come to have a bigger appreciation for solutions that minimize the
inevitable loss of capacity that accrues over time.
There's now a "Battery Charging" setting (under the "Power" section) which lets
you choose between two different profiles: "Maximize Charge" and "Preserve
Battery Health".
On supported laptops, this setting is an easy way to set thresholds for when
charging should start and stop, just like you could do it with the tlp package,
but now from the Gnome settings.
To increase the longevity of my laptop battery, I always keep it at "Preserve
Battery Health" unless I'm traveling.
What I would like to see next is support for choosing different "Power Modes"
based on whether the laptop is plugged in, and based on the battery
charge percentage.
There's a GNOME
issue
tracking this feature, but there's some pushback on whether this is the right
thing to expose to users.
In the meantime, there are some workarounds mentioned in that issue which
people who really want this feature can follow.
If you would like to learn more about batteries, Battery
University is a great starting point, besides
getting into FPV drones and being forced to handle batteries without a Battery
Management System (BMS).
And if by any chance this sparks your interest in FPV drones, Joshua Bardwell's
YouTube channel is a great resource:
@JoshuaBardwell.
8) Lazygit
Emacs users are already familiar with the legendary magit, a terminal-based
UI for git.
Lazygit is an alternative for non-emacs users; you can integrate it with neovim
or just use it directly.
I'm still playing with lazygit and haven't integrated it into my workflows,
but so far it has been a pleasant experience.
You should check out the demos from the lazygit GitHub
page.
Try it out:
sudo apt install lazygit
And then call lazygit from within a git repository.
9) neovim
neovim has been shipped in Debian since 2016, but upstream has been doing a lot of
work to improve the experience out-of-the-box in the last couple of years.
If you're a neovim poweruser, you're likely not installing it from the official
repositories, but for those that are, Debian 13 comes with version 0.10.4,
which brings the following improvements compared to the version in Debian 12:
Treesitter support for C, Lua, Markdown, with the possibility of adding any
other languages as needed;
Better spellchecking due to treesitter integration (spellsitter);
Mouse support enabled by default;
Commenting support out-of-the-box;
Check :h commenting for details, but the
tl;dr is that you can use gcc to comment the current line and gc to comment
the current selection.
OSC52 support.
Especially handy for those using neovim over an ssh
connection, this protocol lets you copy something from within the neovim
process into the clipboard of the machine you're using to connect through ssh.
In other words, you can copy from neovim running in a host over ssh and paste
it in the "outside" machine.
10) [Bonus] Running old Debian releases
The bonus tip is not specific to the Debian 13 release, but something I've
recently learned in the #debian-devel IRC channel.
Did you know there are usable container images for all past Debian releases?
I'm not talking "past" as in "some of the older releases", I'm talking past as
in "literally every Debian release, including the very first one".
Tianon Gravi "tianon" is the Debian Developer responsible for making this
happen, kudos to him!
There's a small gotcha: the releases Buzz (1.1) and Rex (1.2) require a
32-bit host, otherwise you will get the error "Out of virtual memory!", but
starting with Bo (1.3) all should work on amd64/arm64.
Try it out:
sudo apt install podman
podman run -it docker.io/debian/eol:bo
Don't be surprised when noticing that apt/apt-get is not available inside the
container; that's because apt first appeared in Debian Slink (2.1).