Search Results: "andi"

12 February 2024

Andrew Cater: Lessons from (and for) colleagues - and, implicitly, how NOT to get on

I have had excellent colleagues both at my day job and, especially, in Debian over the last thirty-odd years. Several have attempted to give me good advice - others have been exemplars. People retire: sadly, people die. What impression do you want to leave behind when you leave here? Belatedly, I've come to realise that obduracy, sheer bloody-mindedness, force of will and obstinacy will only get you so far. The following began very much as a tongue-in-cheek private memo to myself a good few years ago. I showed it to a colleague who suggested at the time that I should share it with a wider audience.


Personal conduct
  • Never argue with someone you believe to be arguing idiotically - a dispassionate bystander may have difficulty telling who's who.
  • You can't make yourself seem reasonable by behaving unreasonably
  • It does not matter how correct your point of view is if you get people's backs up
  • They may all be #####, @@@@@, %%%% and ******* - saying so out loud doesn't help improve matters and may make you seem intemperate.

Working with others
  • Be the change you want to be and behave the way you want others to behave in order to achieve the desired outcome.
  • You can demolish someone's argument constructively and add weight to good points rather than tearing down their ideas and hard work and being ultra-critical and negative - no-one likes to be told "You know - you've got a REALLY ugly baby there"
  • It's easier to work with someone than to work against them and have to apologise repeatedly.
  • Even when you're outstanding and superlative, even you had to learn it all once. Be generous to help others learn: you shouldn't have to teach too many times if you teach correctly once and take time in doing so.

Getting the message across
  • Stop: think: write: review: (peer review if necessary): publish.
  • Clarity is all: just because you understand it doesn't mean anyone else will.
  • It does not matter how correct your point of view is if you put it across badly.
  • If you're giving advice, make sure it is:
      • Considered
      • Constructive
      • Correct as far as you can (and)
      • Refers to other people who may be able to help
  • Say thank you promptly if someone helps you and be prepared to give full credit where credit's due.

Work is like that
  • You may not know all the answers or even have the whole picture - consult, take advice - LISTEN TO THE ADVICE
  • Sometimes the right answer is not the immediately correct answer
  • Corollary: Sometimes the right answer for the business is not your suggested/preferred outcome
  • Corollary: Just because you can do it like that in the real world doesn't mean that you can do it that way inside the business. [This realisation is INTENSELY frustrating but you have to learn to deal with it]
  • DON'T ALWAYS DO IT YOURSELF - Attempt to fix the system, sometimes allow the corporate monster to fail - then do it yourself and fix it. It is always easier and tempting to work round the system and Just Flaming Do It but it doesn't solve problems in the longer term and may create more problems and ill-feeling than it solves.

    [Worked out by Andy Cater for himself after many years of fighting the system as a misguided missile - though he will freely admit that he doesn't always follow them as often as he should :) ]

11 February 2024

Freexian Collaborators: Debian Contributions: Upcoming Improvements to Salsa CI, /usr-move, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Upcoming Improvements to Salsa CI, by Santiago Ruano Rincón

Santiago started picking up the work done by Outreachy intern Enock Kashada (a big thanks to him!) to solve some long-standing issues in Salsa CI. Currently, the first job in a Salsa CI pipeline is the extract-source job, used to produce a debianized source tree of the project. This job was introduced to make it possible to build the project on different architectures in the subsequent build jobs. However, the extract-source approach is sub-optimal: not only does it add some minutes to the execution time of the pipeline, but projects whose source tree is too large cannot use the pipeline at all. The debianized source tree is passed as an artifact to the build jobs, and for those large projects the size of the source tree exceeds Salsa's limits. This specific issue is documented as issue #195, and the proposed solution is to get rid of the extract-source job, relying on sbuild in the build job itself (see issue #296). Switching to sbuild would also help to improve the build source job, solving issues such as #187 and #298. The current work-in-progress is very preliminary, but it has already been possible to run the build (amd64), build-i386 and build-source jobs using sbuild with the unshare mode. The image on the right shows a pipeline that builds grep. All the test jobs use the artifacts of the new build job. There is a lot of remaining work, mainly making the integration with ccache work. This change could break some things, so it will also be important to test how the new pipeline works with complex projects. Also, thanks to Emmanuel Arias, we are proposing a Google Summer of Code 2024 project to improve Salsa CI. As part of the ongoing preparation for the GSoC 2024 project, Santiago has proposed a merge request to make it more efficient for contributors to test their changes on the Salsa CI pipeline.

/usr-move, by Helmut Grohne

In January, we sent most of the moving patches for the set of packages involved with debootstrap. Notably missing is glibc, which turns out to be harder than anticipated to handle via dumat, because it has Conflicts between different architectures, which dumat does not analyze. Patches for diversion mitigations have been updated so that they no longer exhibit any loss. The main change here is that packages which are being diverted now support the diverting packages in transitioning their diversions. We also supported a few packages that needed non-trivial changes, and dumat has been enhanced to better support derivatives such as Ubuntu.

Miscellaneous contributions
  • Python 3.12 migration trundles on. Stefano Rivera helped port several new packages to support 3.12.
  • Stefano updated the Sphinx configuration of the DebConf Video Team's documentation, which was broken by Sphinx 7.
  • Stefano published the videos from the Cambridge MiniDebConf to YouTube and PeerTube.
  • DebConf 24 planning has begun, and Stefano & Utkarsh have started work on this.
  • Utkarsh re-sponsored the upload of golang-github-prometheus-community-pgbouncer-exporter for Lena.
  • Colin Watson added Incus support to autopkgtest.
  • Colin discovered Perl::Critic and used it to tidy up some poor practices in several of his packages, including debconf.
  • Colin did some overdue debconf maintenance, mainly around tidying up error message handling in several places (1, 2, 3).
  • Colin figured out how to update the mirror size documentation in debmirror, last updated in 2010. It should now be much easier to keep it up to date regularly.
  • Colin issued a man-db buster update to clean up some irritations due to strict sandboxing.
  • Thorsten Alteholz adopted two more packages, magicfilter and ifhp, for the debian-printing team. Those packages are the last ones of the latest round of adoptions to preserve the old printing protocol within Debian. If you know of other packages that should be retained, please don't hesitate to contact Thorsten.
  • Enrico participated in /usr-merge discussions with Helmut.
  • Helmut sent patches for 16 cross build failures.
  • Helmut supported Matthias Klose (not affiliated with Freexian) with adding -for-host support to gcc-defaults.
  • Helmut uploaded dput-ng enabling dcut migrate and merging two MRs of Ben Hutchings.
  • Santiago took part in the discussions relating to the EU Cyber Resilience Act (CRA) and the Debian public statement that was published last year. He participated in a meeting with Members of the European Parliament (MEPs) Marcel Kolaja and Karen Melchior and their teams to clarify some points about the impact of the CRA on Debian and downstream projects, and the improvements in the latest version of the proposed regulation.

8 February 2024

Dirk Eddelbuettel: RcppArmadillo on CRAN: New Upstream, Interface Polish

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language, and is widely used by (currently) 1119 other packages on CRAN, downloaded 32.5 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 575 times according to Google Scholar. This release brings a new (stable) upstream (minor) release, Armadillo 12.8.0, prepared by Conrad two days ago. We, as usual, prepared a release candidate which we tested against the over 1100 CRAN packages using RcppArmadillo. This found no issues, which was confirmed by CRAN once we uploaded, and so it arrived as a new release today in a fully automated fashion. We also made a small change prompted by GitHub issue #400: a few internal header files that were cluttering the top level of the include directory have been moved to internal directories. The standard header is of course unaffected, and the set of full / light / lighter / lightest headers (matching what we did a while back in Rcpp) also continue to work as one expects. This change was also tested in a full reverse-dependency check in January but had not been released to CRAN yet. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version (2024-02-06)
  • Upgraded to Armadillo release 12.8.0 (Cortisol Injector)
    • Faster detection of symmetric expressions by pinv() and rank()
    • Expanded shift() to handle sparse matrices
    • Expanded conv_to for more flexible conversions between sparse and dense matrices
    • Added cbrt()
    • More compact representation of integers when saving matrices in CSV format
  • Five non-user facing top-level include files have been removed (#432 closing #400 and building on #395 and #396)

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

30 January 2024

Matthew Palmer: Why Certificate Lifecycle Automation Matters

If you've perused the ActivityPub feed of certificates whose keys are known to be compromised, and clicked on the Show More button to see the name of the certificate issuer, you may have noticed that some issuers seem to come up again and again. This might make sense; after all, if a CA is issuing a large volume of certificates, they'll be seen more often in a list of compromised certificates. In an attempt to see if there is anything that we can learn from this data, though, I did a bit of digging, and came up with some illuminating results.

The Procedure

I started off by finding all the unexpired certificates logged in Certificate Transparency (CT) logs whose key is listed in the pwnedkeys database as having been publicly disclosed. From this list of certificates, I removed duplicates by matching up issuer/serial number tuples, and then reduced the set by counting the number of unique certificates per issuer. This gave me a list of the issuers of these certificates, which looks a bit like this:
/C=BE/O=GlobalSign nv-sa/CN=AlphaSSL CA - SHA256 - G4
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Domain Validation Secure Server CA
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Organization Validation Secure Server CA
/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./CN=Go Daddy Secure Certificate Authority - G2
/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Secure Certificate Authority - G2
/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA
/C=BE/O=GlobalSign nv-sa/CN=GlobalSign GCC R3 DV TLS CA 2020
Rather than try to work with raw issuers (because, as Andrew Ayer says, The SSL Certificate Issuer Field is a Lie), I mapped these issuers to the organisations that manage them, and summed the counts for those grouped issuers together.
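
To make that procedure concrete, here is a rough Python sketch of the de-duplication and grouping steps; the certificate entries and the issuer-to-organisation mapping below are placeholders for illustration, not the real data set (which, as described above, comes from CT logs, pwnedkeys, and the CCADB report).

# Illustrative sketch (not the author's actual tooling): de-duplicate
# compromised certificates by (issuer, serial) and count them per issuer,
# then roll issuers up to the organisation that manages them.
from collections import Counter

# Hypothetical input: (issuer_dn, serial_number) pairs pulled from CT logs
# for certificates whose keys appear in the pwnedkeys database.
certs = [
    ("/C=BE/O=GlobalSign nv-sa/CN=AlphaSSL CA - SHA256 - G4", "04:aa:..."),
    ("/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA", "33:17:..."),
]

# Hypothetical issuer-to-organisation mapping (in reality derived from the
# CCADB AllCertificates report).
ISSUER_TO_ORG = {
    "/C=BE/O=GlobalSign nv-sa/CN=AlphaSSL CA - SHA256 - G4": "GlobalSign",
    "/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA": "ZeroSSL",
}

unique_certs = set(certs)                      # drop duplicate issuer/serial tuples
per_issuer = Counter(issuer for issuer, _ in unique_certs)
per_org = Counter()
for issuer, count in per_issuer.items():
    per_org[ISSUER_TO_ORG.get(issuer, issuer)] += count

for org, count in per_org.most_common():
    print(f"{org}: {count}")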

The Data
[Image: Lieutenant Commander Data from Star Trek: The Next Generation. Insert obligatory "not THAT data" comment here.]
The end result of this work is the following table, sorted by the count of certificates which have been compromised by exposing their private key:
Issuer                    Compromised Count
Sectigo                                 170
ISRG (Let's Encrypt)                    161
GoDaddy                                 141
DigiCert                                 81
GlobalSign                               46
Entrust                                   3
SSL.com                                   1
If you're familiar with the CA ecosystem, you'll probably recognise that the organisations with large numbers of compromised certificates are also those who issue a lot of certificates. So far, nothing particularly surprising, then. Let's look more closely at the relationships, though, to see if we can get more useful insights.

Volume Control

Using the issuance volume report from crt.sh, we can compare issuance volumes to compromise counts, to come up with a "compromise rate". I'm using the "Unexpired Precertificates" column from the issuance volume report, as I feel that's the number that best matches the certificate population I'm examining to find compromised certificates. To maintain parity with the previous table, this one is still sorted by the count of certificates that have been compromised.
Issuer                  Issuance Volume  Compromised Count  Compromise Rate
Sectigo                      88,323,068                170  1 in 519,547
ISRG (Let's Encrypt)        315,476,402                161  1 in 1,959,480
GoDaddy                      56,121,429                141  1 in 398,024
DigiCert                    144,713,475                 81  1 in 1,786,586
GlobalSign                    1,438,485                 46  1 in 31,271
Entrust                          23,166                  3  1 in 7,722
SSL.com                         171,816                  1  1 in 171,816
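
The compromise rate column is simple arithmetic: issuance volume divided by compromised count, floored. A quick sketch that reproduces the figures above from the numbers in the table:

# Reproduce the "1 in N" compromise rates from the table above.
volumes = {
    "Sectigo": (88_323_068, 170),
    "ISRG (Let's Encrypt)": (315_476_402, 161),
    "GoDaddy": (56_121_429, 141),
    "DigiCert": (144_713_475, 81),
    "GlobalSign": (1_438_485, 46),
    "Entrust": (23_166, 3),
    "SSL.com": (171_816, 1),
}

for issuer, (volume, compromised) in volumes.items():
    print(f"{issuer}: 1 in {volume // compromised:,}")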
If we now sort this table by compromise rate, we can see which organisations have the most (and least) leakiness going on from their customers:
Issuer                  Issuance Volume  Compromised Count  Compromise Rate
Entrust                          23,166                  3  1 in 7,722
GlobalSign                    1,438,485                 46  1 in 31,271
SSL.com                         171,816                  1  1 in 171,816
GoDaddy                      56,121,429                141  1 in 398,024
Sectigo                      88,323,068                170  1 in 519,547
DigiCert                    144,713,475                 81  1 in 1,786,586
ISRG (Let's Encrypt)        315,476,402                161  1 in 1,959,480
By grouping by order-of-magnitude in the compromise rate, we can identify three "bands":
  • The Super Leakers: Customers of Entrust and GlobalSign seem to love to lose control of their private keys. For Entrust, at least, though, the small volumes involved make the numbers somewhat untrustworthy. The three compromised certificates could very well belong to just one customer, for instance. I'm not aware of anything that GlobalSign does that would make them such an outlier, either, so I'm inclined to think they just got unlucky with one or two customers, but as CAs don't include customer IDs in the certificates they issue, it's not possible to say whether that's the actual cause or not.
  • The Regular Leakers: Customers of SSL.com, GoDaddy, and Sectigo all have compromise rates in the 1-in-hundreds-of-thousands range. Again, the low volumes of SSL.com make its numbers somewhat unreliable, but the other two organisations in this group have large enough numbers that we can rely on that data fairly well, I think.
  • The Low Leakers: Customers of DigiCert and Let's Encrypt are at least three times less likely than customers of the regular leakers to lose control of their private keys. Good for them!
Now we have some useful insights we can think about.

Why Is It So?
[Image: Professor Julius Sumner Miller. If you don't know who Professor Julius Sumner Miller is, I highly recommend finding out.]
All of the organisations on the list, with the exception of Let's Encrypt, are what one might term traditional CAs. To a first approximation, it's reasonable to assume that the vast majority of the customers of these traditional CAs probably manage their certificates the same way they have for the past two decades or more. That is, they generate a key and CSR, upload the CSR to the CA to get a certificate, then copy the cert and key somewhere. Since humans are handling the keys, there's a higher risk of the humans using either risky practices, or making a mistake, and exposing the private key to the world. Let's Encrypt, on the other hand, issues all of its certificates using the ACME (Automatic Certificate Management Environment) protocol, and all of the Let's Encrypt documentation encourages the use of software tools to generate keys, issue certificates, and install them for use. Given that Let's Encrypt has 161 compromised certificates currently in the wild, it's clear that the automation in use is far from perfect, but the significantly lower compromise rate suggests to me that lifecycle automation at least reduces the rate of key compromise, even though it doesn't eliminate it completely.

Explaining the Outlier

The difference in presumed issuance practices would seem to explain the significant difference in compromise rates between Let's Encrypt and the other organisations, if it weren't for one outlier. This is a largely traditional CA, with the manual-handling issues that implies, but with a compromise rate close to that of Let's Encrypt. We are, of course, talking about DigiCert. The thing about DigiCert that doesn't show up in the raw numbers from crt.sh is that DigiCert manages the issuance of certificates for several of the biggest hosted TLS providers, such as CloudFlare and AWS. When these services obtain a certificate from DigiCert on their customer's behalf, the private key is kept locked away, and no human can (we hope) get access to the private key. This is supported by the fact that no certificates identifiably issued to either CloudFlare or AWS appear in the set of certificates with compromised keys. When we ask for "all certificates issued by DigiCert", we get both the certificates issued to these big providers, which are very good at keeping their keys under control, as well as the certificates issued to everyone else, whose key handling practices may not be quite so stringent. It's possible, though not trivial, to account for certificates issued to these hosted TLS providers, because the certificates they use are issued from intermediates branded to those companies. With the crt.sh psql interface we can run this query to get the total number of unexpired precertificates issued to these managed services:
SELECT SUM(sub.NUM_ISSUED - sub.NUM_EXPIRED)
  FROM (
    SELECT ca.NAME,
           max(coalesce(coalesce(nullif(trim(cc.SUBORDINATE_CA_OWNER), ''), nullif(trim(cc.CA_OWNER), '')), cc.INCLUDED_CERTIFICATE_OWNER)) as OWNER,
           ca.NUM_ISSUED, ca.NUM_EXPIRED
      FROM ccadb_certificate cc, ca_certificate cac, ca
     WHERE cc.CERTIFICATE_ID = cac.CERTIFICATE_ID
       AND cac.CA_ID = ca.ID
  GROUP BY ca.NAME, ca.NUM_ISSUED, ca.NUM_EXPIRED
  ) sub
 WHERE (sub.NAME ILIKE '%Amazon%' OR sub.NAME ILIKE '%CloudFlare%') AND sub.OWNER = 'DigiCert';
The number I get from running that query is 104,316,112, which should be subtracted from DigiCert's total issuance figures to get a more accurate view of what DigiCert's regular customers do with their private keys. When I do this, the compromise rates table, sorted by the compromise rate, looks like this:
Issuer                  Issuance Volume  Compromised Count  Compromise Rate
Entrust                          23,166                  3  1 in 7,722
GlobalSign                    1,438,485                 46  1 in 31,271
SSL.com                         171,816                  1  1 in 171,816
GoDaddy                      56,121,429                141  1 in 398,024
"Regular" DigiCert           40,397,363                 81  1 in 498,732
Sectigo                      88,323,068                170  1 in 519,547
All DigiCert                144,713,475                 81  1 in 1,786,586
ISRG (Let's Encrypt)        315,476,402                161  1 in 1,959,480
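
The "Regular" DigiCert row is easy to sanity-check from the numbers quoted above:

# Re-run the adjustment described above: remove the certificates issued via
# DigiCert's hosted-TLS intermediates (the 104,316,112 from the crt.sh query)
# before computing the compromise rate for DigiCert's other customers.
total_digicert = 144_713_475
hosted_tls = 104_316_112
compromised = 81

regular_volume = total_digicert - hosted_tls       # 40,397,363
print(f"'Regular' DigiCert: 1 in {regular_volume // compromised:,}")  # 1 in 498,732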
In short, it appears that DigiCert's regular customers are just as likely as GoDaddy or Sectigo customers to expose their private keys.

What Does It All Mean?

The takeaway from all this is fairly straightforward, and not overly surprising, I believe.

The less humans have to do with certificate issuance, the less likely they are to compromise that certificate by exposing the private key. While it may not be surprising, it is nice to have some empirical evidence to back up the common wisdom. Fully-managed TLS providers, such as CloudFlare, AWS Certificate Manager, and whatever Azure's thing is called, are the platonic ideal of this principle: never give humans any opportunity to expose a private key. I'm not saying you should use one of these providers, but the security approach they have adopted appears to be the optimal one, and should be emulated universally. The ACME protocol is the next best, in that there are a variety of standardised tools widely available that allow humans to take themselves out of the loop, but it's still possible for humans to handle (and mistakenly expose) key material if they try hard enough. Legacy issuance methods, which either cannot be automated, or require custom, per-provider automation to be developed, appear to be at least four times less helpful to the goal of avoiding compromise of the private key associated with a certificate.

Humans Are, Of Course, The Problem
[Image: Bender, the robot from Futurama, asking if we'd like to kill all humans. No thanks, Bender, I'm busy tonight.]
This observation that if you don't let humans near keys, they don't get leaked is further supported by considering the biggest issuers by volume who have not issued any certificates whose keys have been compromised: Google Trust Services (fourth largest issuer overall, with 57,084,529 unexpired precertificates), and Microsoft Corporation (sixth largest issuer overall, with 22,852,468 unexpired precertificates). It appears that somewhere between most and basically all of the certificates these organisations issue are to customers of their public clouds, and my understanding is that the keys for these certificates are managed in the same manner as at CloudFlare and AWS: the keys are locked away where humans can't get to them. It should, of course, go without saying that if a human can never have access to a private key, it makes it rather difficult for a human to expose it. More broadly, if you are building something that handles sensitive or secret data, the more you can do to keep humans out of the loop, the better everything will be.

Your Support is Appreciated

If you'd like to see more analysis of how key compromise happens, and the lessons we can learn from examining billions of certificates, please show your support by buying me a refreshing beverage. Trawling CT logs is thirsty work.

Appendix: Methodology Limitations

In the interests of clarity, I feel it's important to describe ways in which my research might be flawed. Here are the things I know of that may have impacted the accuracy, and that I couldn't feasibly account for.
  • Time Periods: Because time never stops, there are likely to be some slight mismatches in the numbers obtained from the various data sources, because they weren't collected at exactly the same moment.
  • Issuer-to-Organisation Mapping: It's possible that the way I mapped issuers to organisations doesn't match exactly with how crt.sh does it, meaning that counts might be skewed. I tried to minimise that by using the same data source (the CCADB AllCertificates report) that I believe crt.sh uses for its mapping, but I cannot be certain of a perfect match.
  • Unwarranted Grouping: I've drawn some conclusions about the practices of the various organisations based on their general approach to certificate issuance. If a particular subordinate CA that I've grouped into the parent organisation is managed in some unusual way, that might cause my conclusions to be erroneous. I was able to fairly easily separate out CloudFlare, AWS, and Azure, but there are almost certainly others that I didn't spot, because hoo boy there are a lot of intermediate CAs out there.

28 January 2024

Niels Thykier: Annotating the Debian packaging directory

In my previous blog post Providing online reference documentation for debputy, I made a point about how debhelper documentation was suboptimal on account of being static rather than online. The thing is that debhelper is not alone in this problem space, even if it is a major contributor to the number of packaging files you have to know about. If we look at the "competition" here, such as Fedora and Arch Linux, they tend to only have one packaging file. While most Debian people will tell you a long list of cons about having one packaging file (such as Fedora's spec file being 3+ domain-specific languages "mashed" into one file), one major advantage is that there is only "the one packaging file". You only need to remember where to find the documentation for one file, which is great when you are running on wetware with limited storage capacity. It means that as a newbie, you can dedicate less mental effort to tracking multiple files and how they interact, and more to understanding the "one file" at hand.

I started by asking myself: how can we in Debian make the packaging stack more accessible to newcomers? Spoiler alert: I dug myself into a rabbit hole and ended up somewhere else than where I thought I was going. I started by wanting to scan the debian directory and annotate all files that I could with documentation links. The logic was that if debputy could do that for you, then you could spend your mental effort elsewhere. So I combined debputy's packager provided files detection with a static list of files, and I quickly had a good starting point for debputy-based packages.
Adding (non-static) dpkg and debhelper files to the mix

Now, I could have closed the topic here and said "Look, I did debputy files plus a couple of super common files". But I decided to take it a bit further. I added support for handling some dpkg files like packager provided files (such as debian/substvars and debian/symbols). But even then, we all know that debhelper is the big hurdle and a major part of the omission...

In another previous blog post (A new Debian package helper: debputy), I made a point about how debputy could list all auxiliary files while debhelper could not. This was exactly the kind of feature I would need here, if this work was to cover debhelper. Now, I also remarked in that blog post that I was not willing to maintain such a list. Also, I may have ranted about static documentation being unhelpful for debhelper as it excludes third-party provided tooling.

Fortunately, a recent update to dh_assistant had provided some basic plumbing for loading dh sequences. This meant that getting a list of all relevant commands for a source package was a lot easier than it used to be. Once you have a list of commands, it is possible to check all of them for dh's NOOP PROMISE hints. In these hints, a command can assert it does nothing if a given pkgfile is not present. This led to the new dh_assistant list-guessed-dh-config-files command that will list all declared pkgfiles and which helpers listed them.

With this combined feature set in place, debputy could call dh_assistant to get a list of pkgfiles, pretend they were packager provided files, and annotate those along with the manpage for the relevant debhelper command. The exciting thing about letting debputy resolve the pkgfiles is that debputy will resolve "named" files automatically (debhelper tools will only do so when --name is passed), so it is much more likely to detect named pkgfiles correctly too. Side note: I am going to ignore the elephant in the room for now, which is dh_installsystemd and its package@.service files and the widespread use of debian/foo.service where there is no package called foo. For the latter case, the "proper" name would be debian/package.foo.service.

With the new dh_assistant feature done and added to debputy, debputy could now detect the ubiquitous debian/install file. Excellent. But less great was that the very common debian/docs file was not. Turns out that dh_installdocs cannot be skipped by dh, so it cannot have NOOP PROMISE hints. Meh... Well, dh_assistant could learn about a new INTROSPECTABLE marker in addition to the NOOP PROMISE, and then I could sprinkle that into a few commands. Indeed that worked and meant that debian/postinst (etc.) are now also detectable. At this point, debputy would be able to identify a wide range of debhelper related configuration files in debian/ and at least associate each of them with one or more commands. Nice. Surely, this would be a good place to stop, right...?
Adding more metadata to the files

The debhelper detected files only had a command name and a manpage URI for that command. It would be nice if we could contextualize this a bit more. For example, is this file installed into the package as-is (like debian/pam), or is it a file list to be processed (like debian/install)? To make this distinction, I could add the most common debhelper file types to my static list and then merge the results together. Except, I do not want to maintain a full list in debputy. Fortunately, debputy has a quite extensible plugin infrastructure, so I added a new plugin feature to provide this kind of detail, and now I can outsource the problem! I split my definitions into two and placed the generic ones in the debputy-documentation plugin and moved the debhelper related ones to debhelper-documentation. Additionally, third-party dh addons could provide their own debputy plugin to add context to their configuration files.

So, this gave birth to file categories and configuration features, which describe each file on different fronts. As an example, debian/gbp.conf could be tagged as a maint-config to signal that it is not directly related to the package build but more of a tool or style preference file. On the other hand, debian/install and debian/debputy.manifest would both be tagged as a pkg-helper-config. Files like debian/pam were tagged as ppf-file for packager provided file, and so on.

I mentioned configuration features above, and those were added because I have had a beef with debhelper's "standard" configuration file format as read by filearray and filedoublearray. They are often considered simple to understand, but it is hard to know how a tool will actually read the file. As an example, consider the following:
  • Will the debhelper tool use filearray, filedoublearray or neither of them to read the file? This topic has about 2 bits of entropy.
  • Will the config file be executed if it is marked executable, assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bits of entropy depending on the context.
  • Will the config file be subject to glob expansion? This topic sounds like a boolean but is a complicated mess. The globs can be handled by debhelper as it parses the file for you, in which case the globs are applied to every token. However, this is not what dh_install does: there, the last token on each line is supposed to be a directory and therefore not subject to globs, so dh_install does the globbing itself afterwards, but only on part of the tokens. That is about 2 more bits of entropy. Actually, it gets worse...
    • If the file is executed, debhelper will refuse to expand globs in the output of the command, which was a deliberate design choice the original debhelper maintainer made when he introduced the feature in debhelper/8.9.12. Except the dh_install feature interacts with that design choice and does enable glob expansion in the tool output, because it does so manually after its filedoublearray call.
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these difference, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags including dh-executable-config and dh-glob-after-execute. Then, I added a datatable of these tags, so it would be easy for people to look up what they meant. Ok, this seems like a closed deal, right...?
Context, context, context

However, the dh-executable-config tag among others is only applicable in compat 9 or later. It does not seem newbie friendly if you are told that this feature exists, but then have to read in the extended description that it actually does not apply to your package. This problem seems fixable. Thanks to dh_assistant, it is easy to figure out which compat level the package is using, and then tweak some metadata to enable per-compat-level rules. With that, tags like dh-executable-config only appear for packages using compat 9 or later.

Also, debputy should be able to tell you where packager provided files like debian/pam are installed. We already have the logic for the packager provided files that debputy supports, and I am already using the debputy engine for detecting the files. If only the plugin provided metadata gave me the install pattern, debputy would be able to tell you where this file goes in the package. Indeed, a bit of tweaking later and setting install-pattern to usr/lib/pam.d/{name}, debputy presented me with the correct install path, with the package name replacing the {name} placeholder.

Now, I have been using debian/pam as an example, because debian/pam is installed into usr/lib/pam.d in compat 14. But in earlier compat levels, it was installed into etc/pam.d. Well, I already had an infrastructure for doing compat file tags. Off we go to add install-pattern to the compat level infrastructure, and now changing the compat level changes the path. Great. (Bug warning: The value is off-by-one in the current version of debhelper. This is fixed in git.)

Also, while we are in this install-pattern business, a number of debhelper config files cause files to be installed into a fixed directory. Like debian/docs, which causes files to be installed into /usr/share/doc/{package}. Surely, we can expand that as well and provide that bit of context too... and done. (Bug warning: The code currently does not account for the main documentation package context.)

It is a rather common pattern for people to have debian/foo.in files, because they want custom generation of debian/foo. Which means if you have debian/foo you get "Oh, let me tell you about debian/foo", but then you rename it to debian/foo.in and the result is "debian/foo.in is a total mystery to me!". That is suboptimal, so let's detect those as well, treat them as if they were the original file, but add a tag saying that they are a generated template and which file we suspect they generate.

Finally, if you use debputy, almost all of the standard debhelper commands are removed from the sequence, since debputy replaces them. It would be weird if these commands still contributed configuration files when they are not actually going to be invoked. This mostly happened naturally due to the way the underlying dh_assistant command works. However, any file mentioned by the debhelper-documentation plugin would still appear, unfortunately. So off I went to filter the list of known configuration files against the dh_ commands that dh_assistant thought would be used for this package.
Wrapping it up

I was several layers into this and had to dig myself out. I have ended up with a lot of data and metadata, but it was quite difficult for me to arrange the output in a user friendly manner. However, all this data did seem like it would be useful to any tool that wants to understand more about the package. So to get out of the rabbit hole, I have for now wrapped all of this into JSON, and we now have a debputy tool-support annotate-debian-directory command that might be useful for other tools. To try it out, you can run it on a package and inspect the JSON it emits. In another day, I will figure out how to structure this output so it is useful for non-machine consumers. Suggestions are welcome. :)
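
As a stand-in for the original demo, here is a minimal sketch of such an invocation, assuming the command is run from the root of an unpacked Debian source package and prints its JSON report on stdout (as described above):

# Minimal sketch: run the annotation command and pretty-print its JSON output.
import json
import subprocess

result = subprocess.run(
    ["debputy", "tool-support", "annotate-debian-directory"],
    check=True,
    capture_output=True,
    text=True,
)

annotations = json.loads(result.stdout)
# Show whatever debputy reported about the files in debian/.
print(json.dumps(annotations, indent=2))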
Limitations of the approach

As a closing remark, I should probably remind people that this feature relies heavily on declarative features. These include:
  • When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
  • When debhelper commands use NOOP promise hints, dh_assistant can "see" the config files listed in those hints, meaning the file will at least be detected. For the new introspectable hint and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!

25 January 2024

Dirk Eddelbuettel: qlcal 0.0.10 on CRAN: Calendar Updates

The tenth release of the qlcal package arrived at CRAN today. qlcal delivers the calendaring parts of QuantLib. It is provided (for the R package) as a set of included files, so the package is self-contained and does not depend on an external QuantLib library (which can be demanding to build). qlcal covers over sixty country / market calendars and can compute holiday lists, their complement (i.e. business day lists) and much more. Examples are in the README at the repository, the package page, and of course at the CRAN package page. This release synchronizes qlcal with the QuantLib release 1.33 and its updates to 2024 calendars.

Changes in version 0.0.10 (2024-01-24)
  • Synchronized with QuantLib 1.33

Courtesy of my CRANberries, there is a diffstat report for this release. See the project page and package documentation for more details, and more examples. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

22 January 2024

Paul Tagliamonte: Writing a simulator to check phased array beamforming

Interested in future updates? Follow me on mastodon. Posts about hz.tools will be tagged #hztools.

If you're on the Fediverse, I'd very much appreciate boosts on my toot!
While working on hz.tools, I started to move my beamforming code from 2-D (meaning, beamforming to some specific angle on the X-Y plane for waves on the X-Y plane) to 3-D. I'll have more to say about that once I get around to publishing the code, as soon as I'm sure it's not completely wrong, but in the meantime I decided to write a simple simulator to visually check the beamformer against the textbooks. The results were pretty rad, so I figured I'd throw together a post, since it's interesting all on its own outside of beamforming as a general topic. I figured I'd write this in Rust, since I've been using Rust as my primary language over at zoo, and it's a good chance to learn the language better.
This post has some large GIFs

It may take a little bit to load depending on your internet connection. Sorry about that; I'm not clever enough to do better without doing tons of complex engineering work. They may be choppy while they load or something. I tried to compress and ensmall them, so if they're loaded but fuzzy, click on them to load a slightly larger version.
This post won't cover the basics of how phased arrays work or the specifics of calculating the phase offsets for each antenna, but I'll dig into how I wrote a simple simulator and how I wound up checking my phase offsets to generate the renders below.

Assumptions

I didn't want to build a general purpose RF simulator, anything particularly generic, or something that would solve for any more than the things right in front of me. To do this as simply (and quickly; all this code took about a day to write, including the beamforming math) as possible, I had to reduce the amount of work in front of me. Given that I was concerned with visualizing what the antenna pattern would look like in 3-D given some antenna geometry, operating frequency and configured beam, I made the following assumptions:
  • All antennas are perfectly isotropic: they receive a signal that is exactly the same strength no matter what direction the signal originates from.
  • There's a single point-source isotropic emitter in the far-field (I modeled this as being 1 million meters, or 1000 kilometers, away) of the antenna system.
  • There is no noise, multipath, loss or distortion in the signal as it travels through space.
  • Antennas will never interfere with each other.

2-D Polar Plots

The last time I wrote something like this, I generated 2-D GIFs which show a radiation pattern, not unlike the polar plots you'd see on a microphone. These are handy because they let you visualize what the directionality of the antenna looks like, as well as in what directions emissions are captured, and in what directions emissions are nulled out. You can see these plots on spec sheets for antennas in both 2-D and 3-D form. Now, let's port the 2-D approach to 3-D and see how well it works out.

Writing the 3-D simulator

As an EM wave travels through free space, the place at which you sample the wave controls the phase you observe at each time-step. This means, assuming perfectly synchronized clocks, a transmitter and receiver exactly one RF wavelength apart will observe a signal in phase, but a transmitter and receiver half a wavelength apart will observe a signal 180 degrees out of phase. This means that if we take the distance between our point source and an antenna element and divide it by the wavelength, we can use the fractional part of the resulting number to determine the phase observed. If we multiply that number (in the range of 0 to just under 1) by tau, we can generate a complex number by taking the cos and sin of the multiplied phase (in the range of 0 to tau), assuming the transmitter is emitting a carrier wave at a static amplitude and all clocks are in perfect sync.
let observed_phases: Vec<Complex> = antennas
    .iter()
    .map(|antenna| {
        // Distance from the transmitter to this antenna element.
        let distance = (antenna - tx).magnitude();
        // Keep only the fractional part, then scale it to a phase angle.
        let distance = distance - (distance as i64 as f64);
        (distance / wavelength) * TAU
    })
    .map(|phase| Complex(phase.cos(), phase.sin()))
    .collect();
At this point, given some synthetic transmission point and each antenna, we know what the expected complex sample would be at each antenna. We can then adjust the phase of each antenna according to the beamforming phase offset configuration, and add up every sample in order to determine what sample the entire system would collectively produce.
let beamformed_phases: Vec<Complex> = ...;
let magnitude = beamformed_phases
    .iter()
    .zip(observed_phases.iter())
    .map(|(beamformed, observed)| observed * beamformed)
    .reduce(|acc, el| acc + el);
Armed with this information, it's straightforward to generate some number of (azimuth, elevation) points to sample, generate a transmission point far away in that direction, resolve what the resulting Complex sample would be, take its magnitude, and use that to create an (x, y, z) point at (azimuth, elevation, magnitude). The color attached to that point is based on its distance from (0, 0, 0). I opted to use the Life Aquatic table for this one. After this process is complete, I have a point cloud of ((x, y, z), (r, g, b)) points. I wrote a small program using kiss3d to render the point cloud using tons of small spheres, and write out the frames to a set of PNGs, which get compiled into a GIF. Now for the fun part: let's take a look at some radiation patterns!
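
For reference, the (azimuth, elevation, magnitude) to (x, y, z) mapping is ordinary spherical-to-Cartesian conversion. The small Python sketch below uses one common convention (azimuth in the x-y plane measured from +x, elevation up from that plane); the simulator's own axis conventions may differ.

# Sketch of mapping a sampled (azimuth, elevation, magnitude) triple to a
# point in 3-D space for the point cloud.
from math import cos, sin, radians

def to_cartesian(azimuth_deg: float, elevation_deg: float, magnitude: float):
    az = radians(azimuth_deg)
    el = radians(elevation_deg)
    x = magnitude * cos(el) * cos(az)
    y = magnitude * cos(el) * sin(az)
    z = magnitude * sin(el)
    return (x, y, z)

# A strong response directly "forward" at (0, 0) lands on the +x axis.
print(to_cartesian(0.0, 0.0, 1.0))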

1x4 Phased Array

The first configuration is a phased array where all the elements are in perfect alignment on the y and z axes, and separated by some offset on the x axis. This configuration can sweep 180 degrees (not the full 360), but can't be steered in elevation at all. Let's take a look at what this looks like for a well constructed 1x4 phased array. And now let's take a look at the renders as we play with the configuration of this array and make sure things look right. Our initial quarter-wavelength spacing is very effective and has some outstanding performance characteristics. Let's check to see that everything looks right as a first test. Nice. Looks perfect. When pointing forward at (0, 0), we'd expect to see a torus, which we do. As we sweep between 0 and 360, astute observers will notice the pattern is mirrored along the axis of the antennas; when the beam is facing forward at 0 degrees, it'll also receive at 180 degrees just as strongly. There's a small sidelobe that forms when it's configured along the array, but it also becomes the most directional, and the sidelobes remain fairly small.

Long compared to the wavelength (1λ)

Let's try again, but rather than spacing each antenna ¼ of a wavelength apart, let's see about spacing each antenna 1λ apart instead. The main lobe is a lot more narrow (not a bad thing!), but some significant sidelobes have formed (not ideal). This can cause a lot of confusion when doing things that require a lot of directional resolution, unless they're compensated for.

Going from (¼λ to 5λ)

The last model begs the question: what do things look like when you separate the antennas from each other but without moving the beam? Let's simulate moving our antennas but not adjusting the configured beam or operating frequency. Very cool. As the spacing becomes longer in relation to the operating frequency, we can see the sidelobes start to form out of the end of the antenna system.

2x2 Phased Array

The second configuration I want to try is a phased array where the elements are in perfect alignment on the z axis, and separated by a fixed offset in either the x or y axis from their neighbor, forming a square when viewed along the x/y axis. Let's take a look at what this looks like for a well constructed 2x2 phased array. Let's do the same as above and take a look at the renders as we play with the configuration of this array and see what things look like. This configuration should suppress the sidelobes and give us good performance, and even give us some amount of control in elevation while we're at it. Sweet. Heck yeah. The array is quite directional in the configured direction, and can even sweep a little bit in elevation, a definite improvement over the 1x4 above.

Long compared to the wavelength (1λ)

Let's do the same thing as the 1x4 and take a look at what happens when the distance between elements is long compared to the frequency of operation; say, 1λ apart. What happens to the sidelobes given this spacing when the frequency of operation is much different than the physical geometry? Mesmerising. This is my favorite render. The sidelobes are very fun to watch come in and out of existence. It looks absolutely other-worldly.

Going from (¼λ to 5λ)

Finally, for completeness' sake, what do things look like when you separate the antennas from each other just as we did with the 1x4? Let's simulate moving our antennas but not adjusting the configured beam or operating frequency. Very, very cool. The sidelobes wind up turning the very blobby cardioid into an electromagnetic dog toy. I think we've proven to ourselves that using a phased array much outside its designed frequency of operation seems like a real bad idea.

Future Work

Now that I have a system to test things out, I'm a bit more confident that my beamforming code is close to right! I'd love to push that code over the line and blog about it, since it's a really interesting topic on its own. Once I'm sure the code involved isn't full of lies, I'll put it up on the hztools org, and post about it here and on mastodon.

20 January 2024

Niels Thykier: Making debputy: Writing declarative parsing logic

In this blog post, I will cover how debputy parses its manifest and the conceptual improvements I made to make parsing of the manifest easier. All instructions to debputy are provided via the debian/debputy.manifest file, and said manifest is written in the YAML format. After the YAML parser has read the basic file structure, debputy does another pass over the data to extract the information from the basic structure. As an example, the following YAML file:
manifest-version: "0.1"
installations:
  - install:
      source: foo
      dest-dir: usr/bin
would be transformed by the YAML parser into a structure resembling:
  "manifest-version": "0.1",
  "installations": [
         "source": "foo",
         "dest-dir": "usr/bin",
This structure is then what debputy does a pass over to translate it into an even higher level format, where the "install" part is translated into an InstallRule.

In the original prototype of debputy, I would hand-write functions to extract the data that should be transformed into the internal in-memory high level format. However, it was quite tedious. Especially because I wanted to catch every possible error condition and report "You are missing the required field X at Y" rather than the opaque KeyError: X message that would have been the default. Beyond being tedious, it was also quite error prone. As an example, in debputy/0.1.4 I added support for the install rule and you should allegedly have been able to add a dest-dir: or an as: inside it. Except I screwed up the code and debputy was attempting to look up these keywords from a dict that could never have them.

Hand-writing these parsers was so annoying that it demotivated me from making manifest related changes to debputy simply because I did not want to code the parsing logic. When I had this realization, I figured I had to solve this problem better. While reflecting on this, I also considered that I eventually wanted plugins to be able to add vocabulary to the manifest. If the API was "provide a callback to extract the details of whatever the user provided here", then the result would be bad.
  1. Most plugins would probably throw KeyError: X or ValueError style errors for quite a while. Worst case, they would end up on my table because the user would have a hard time telling where debputy ends and where the plugin starts. "Best" case, I would teach debputy to say "This poor error message was brought to you by plugin foo. Go complain to them". Either way, it would be a bad user experience.
  2. This even assumes plugin providers would actually bother writing manifest parsing code. If it is that difficult, then just providing a custom file in debian/ might tempt plugin providers, and that would undermine the idea of having the manifest be the sole input for debputy.
So beyond me being unsatisfied with the current situation, it was also clear to me that I needed to come up with a better solution if I wanted externally provided plugins for debputy. To put a bit more perspective on what I expected from the end result:
  1. It had to cover as many parsing errors as possible. Any error case this code would handle for you would be an error where I could ensure a sufficient degree of detail and context for the user.
  2. It should be type-safe / provide typing support such that IDEs/mypy could help you when you work on the parsed result.
  3. It had to support "normalization" of the input, such as
           # User provides
           - install: "foo"
           # Which is normalized into:
           - install:
               source: "foo"
  4. It must be simple to tell debputy what input you expected.
At this point, I remembered that I had seen a Python (PyPI) package where you could give it a TypedDict and an arbitrary input (sadly, I do not remember the name). The package would then validate the said input against the TypedDict. If the match was successful, you would get the result back cast as the TypedDict. If the match was unsuccessful, the code would raise an error for you. Conceptually, this seemed to be a good starting point for where I wanted to be.

Then I looked a bit at the normalization requirement (point 3). What is really going on here is that you have two "schemas" for the input. One is what the programmer will see (the normalized form) and the other is what the user can input (the manifest form). The problem is providing an automatic normalization from the user input to the simplified programmer structure. To expand a bit on the following example:
# User provides
- install: "foo"
# Which is normalized into:
- install:
    source: "foo"
Given that install has the attributes source, sources, dest-dir, as, into, and when, how exactly would you automatically normalize "foo" (str) into source: "foo"? Even if the code filtered by "type" for these attributes, you would end up with at least source, dest-dir, and as as candidates. Turns out that TypedDict actually got this covered. But the Python package was not going in this direction, so I parked it there and started looking into doing my own.

At this point, I had a general idea of what I wanted. When defining an extension to the manifest, the plugin would provide debputy with one or two definitions of a TypedDict. The first one would be the "parsed" or "target" format, which would be the normalized form that the plugin provider wanted to work on. For this example, let's look at an earlier version of the install-examples rule:
# Example input matching this typed dict.
#       "source": ["foo"]
#       "into": ["pkg"]
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
In this form, install-examples has two attributes - both are lists of strings. On the flip side, what the user can input would look something like this:
# Example input matching this typed dict.
#       "source": "foo"
#       "into": "pkg"
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write "into: foo" in addition to "into: [foo]"
    into: Union[str, List[str]]
FullInstallExamplesManifestFormat = Union[
The idea was that the plugin provider would use these two definitions to tell debputy how to parse install-examples. Pseudo-registration code could look something like:
def _handler(
    normalized_form: InstallExamplesTargetFormat,
) -> InstallRule:
    ...  # Do something with the normalized form and return an InstallRule.
This was my conceptual target and while the current actual API ended up being slightly different, the core concept remains the same.
From concept to basic implementation

Building this code is kind of like swallowing an elephant. There was no way I would just sit down and write it from one end to the other, so the first prototype of this did not have all the features it has now. Spoiler warning: these next couple of sections will contain some Python typing details. When reading this, it might be helpful to know things such as Union[str, List[str]] being the Python type for either a str (string) or a List[str] (list of strings). If typing makes your head spin, these sections might be less interesting for you.

Building this required a lot of playing around with Python's introspection and typing APIs. My very first draft only had one "schema" (the normalized form) and had the following features:
  • Read TypedDict.__required_keys__ and TypedDict.__optional_keys__ to determine which attributes were present and which were required. This was used for reporting errors when the input did not match.
  • Read the types of the provided TypedDict, strip the Required / NotRequired markers and use basic isinstance checks based on the resulting type for str and List[str]. Again, used for reporting errors when the input did not match.
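
To make that idea concrete, here is a heavily simplified sketch of such a first-draft validator. This is not debputy's actual code; it only illustrates the __required_keys__ plus basic-isinstance approach described in the two points above.

from typing import List, NotRequired, TypedDict, get_args, get_origin, get_type_hints

class InstallExamplesTargetFormat(TypedDict):
    sources: List[str]
    into: NotRequired[List[str]]

def check_str_list(value: object) -> bool:
    return isinstance(value, list) and all(isinstance(v, str) for v in value)

def validate(data: dict, schema: type) -> None:
    hints = get_type_hints(schema)
    # Report missing required fields and unknown fields with a useful message.
    for key in schema.__required_keys__:
        if key not in data:
            raise ValueError(f'Missing required field "{key}"')
    for key, value in data.items():
        if key not in hints:
            raise ValueError(f'Unknown field "{key}"')
        target = hints[key]
        # Only two shapes were supported at this stage: str and List[str].
        if target is str:
            ok = isinstance(value, str)
        elif get_origin(target) is list and get_args(target) == (str,):
            ok = check_str_list(value)
        else:
            ok = True  # anything fancier was not handled by this sketch
        if not ok:
            raise ValueError(f'Field "{key}" has the wrong type')

validate({"sources": ["foo"], "into": ["pkg"]}, InstallExamplesTargetFormat)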
This prototype did not take long (I remember it being within a day) and worked surprisingly well, though with some poor error messages here and there. Now came the first challenge: adding the manifest format schema plus relevant normalization rules. The very first normalization I did was transforming into: Union[str, List[str]] into into: List[str]. At that time, source was not a separate attribute. Instead, sources was a Union[str, List[str]], so it was the only normalization I needed for all my use-cases at the time.

There are two problems when writing a normalization. The first is determining what the "source" type is, what the target type is, and how they relate. The second is providing a runtime rule for normalizing from the manifest format into the target format. Keeping it simple, the runtime normalizer for Union[str, List[str]] -> List[str] was written as:
def normalize_into_list(x: Union[str, List[str]]) -> List[str]:
    return x if isinstance(x, list) else [x]
This basic form basically works for all types (assuming none of the types will have List[List[...]]). The logic for determining when this rule is applicable is slightly more involved. My current code is about 100 lines of Python that would probably lose most casual readers. For the interested, you are looking for _union_narrowing. With this, when the manifest format had Union[str, List[str]] and the target format had List[str], the generated parser would silently map a string into a list of strings for the plugin provider. But with that in place, I had covered the basics of what I needed to get started. I was quite excited about this milestone of having my first keyword parsed without handwriting the parser logic (at the expense of writing a more generic parser-generator framework).
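
As an illustration only (the real _union_narrowing is far more general than this), the applicability check for this single rule could look roughly like the following sketch:

from typing import List, Union, get_args, get_origin

def narrows_to_list(manifest_type: object, target_type: object) -> bool:
    # The target must be a List[X] for some element type X.
    if get_origin(target_type) is not list:
        return False
    (element_type,) = get_args(target_type)
    # The manifest side must be a Union whose members are exactly
    # "X or List[X]", so wrapping a bare value in a list normalizes it.
    if get_origin(manifest_type) is not Union:
        return False
    members = get_args(manifest_type)
    return set(members) == {element_type, List[element_type]}

print(narrows_to_list(Union[str, List[str]], List[str]))  # True
print(narrows_to_list(Union[str, int], List[str]))        # False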
Adding the first parse hint With the basic implementation done, I looked at what to do next. As mentioned, at the time sources in the manifest format was Union[str, List[str]] and I considered splitting it into a source: str and a sources: List[str] on the manifest side while keeping the normalized form as sources: List[str]. I ended up committing to this change, which meant I had to solve the problem of getting my parser generator to understand the situation:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write "into: foo" in addition to "into: [foo]"
    into: Union[str, List[str]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) should have these files installed.
    into: NotRequired[List[str]]
There are two related problems to solve here:
  1. How will the parser generator understand that source should be normalized and then mapped into sources?
  2. Once that is solved, the parser generator has to understand that while source and sources are declared as NotRequired, they are part of an "exactly one of" rule (since sources in the target form is required). This mainly came down to extra bookkeeping and an extra layer of validation once the previous step was solved. (A minimal sketch of such a check is shown after this list.)
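As an illustration of what that extra validation layer has to enforce, here is a small, self-contained sketch (a hypothetical helper, not debputy's actual code):

from typing import List

def resolve_sources(parsed: dict) -> List[str]:
    # Enforce the "exactly one of" rule for the source / sources pair.
    has_source = "source" in parsed
    has_sources = "sources" in parsed
    if has_source == has_sources:
        raise ValueError('exactly one of "source" or "sources" must be provided')
    # Normalize the single-value spelling into the list-based target form.
    return [parsed["source"]] if has_source else list(parsed["sources"])

print(resolve_sources({"source": "foo.txt"}))    # ['foo.txt']
print(resolve_sources({"sources": ["a", "b"]}))  # ['a', 'b']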
While working on all of this type introspection for Python, I had noticed the Annotated[X, ...] type. It is basically a fake type that enables you to attach metadata to a type in the type system. A very random example:
# For all intents and purposes, "foo" is a string despite all the "Annotated" stuff.
foo: Annotated[str, "hello world"] = "my string here"
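For reference, the attached metadata can be read back with the standard typing introspection helpers (a small, self-contained illustration, not debputy code):

from typing import Annotated, get_args, get_type_hints

class Example:
    foo: Annotated[str, "hello world"] = "my string here"

# include_extras=True keeps the Annotated wrapper so the metadata is not stripped.
hints = get_type_hints(Example, include_extras=True)
base_type, *metadata = get_args(hints["foo"])
print(base_type)  # <class 'str'>
print(metadata)   # ['hello world']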
The exciting thing is that you can put arbitrary details into the type field and read them out again in your introspection code, which meant I could embed "parse hints" in the type. Some "quick" prototyping later (a day or so), I got the following to work:
# Map from
#        "source": "foo"  # (or "sources": ["foo"])
#        "into": "pkg"
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            # A parse hint goes here telling the parser generator to map the
            # single "source" attribute onto "sources"; the exact spelling of
            # the hint below is an assumption for illustration.
            DebputyParseHint.target_attribute("sources"),
        ]
    ]
    # We allow the user to write "into: foo" in addition to "into: [foo]"
    into: Union[str, List[str]]
# ... into
#        "source": ["foo"]
#        "into": ["pkg"]
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) should have these files installed.
    into: NotRequired[List[str]]
Without me (as a plugin provider) writing a line of code, I can have debputy rename or "merge" attributes from the manifest form into the normalized form. Obviously, this required me (as the debputy maintainer) to write a lot of code so that future me and other plugin providers would not have to write it.
High level typing At this point, basic normalization from one mapping form to another worked. But one thing irked me with these install rules: the into attribute was a list of strings when the parser handed it over to me, yet I needed to map it to the actual BinaryPackage objects (for technical reasons). While I felt I was careful with my manual mapping, I knew this was exactly the kind of case where a busy programmer would skip the "is this a known package name" check and some user would typo their package name, resulting in an opaque KeyError: foo. Side note: "some user" was me today, and I was super glad to see debputy tell me that I had typoed a package name (I would have been happier if I had remembered to use debputy check-manifest, so I did not have to wait through the upstream part of the build that happened before debhelper passed control to debputy...). I thought adding this feature would be simple enough. It basically needs two things:
  1. A conversion table where the parser generator can tell that BinaryPackage requires an input of str, plus a callback to map from str to BinaryPackage (see the sketch after this list). (That is probably a lie; I think the conversion table came later, but honestly I do not remember and I am not digging into the git history for this one.)
  2. At runtime, said callback needed access to the list of known packages, so it could resolve the provided string.
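A minimal sketch of what such a runtime callback could do, with invented names and a stand-in BinaryPackage type (not debputy's actual code):

from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class BinaryPackage:
    name: str

def resolve_binary_package(value: str, known_packages: Dict[str, BinaryPackage]) -> BinaryPackage:
    # Resolve a package name from the manifest against the known packages,
    # giving a clear error rather than an opaque KeyError on a typo.
    try:
        return known_packages[value]
    except KeyError:
        names = ", ".join(sorted(known_packages))
        raise ValueError(f"Unknown binary package {value!r}; known packages: {names}") from None

packages = {"foo": BinaryPackage("foo"), "foo-doc": BinaryPackage("foo-doc")}
print(resolve_binary_package("foo-doc", packages))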
It was not super difficult given the existing infrastructure, but it did take some hours of coding and debugging. Additionally, I added a parse hint to support making the into attribute conditional based on whether it was a single binary package. With this done, you could now write something like:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            # A parse hint goes here telling the parser generator to map the
            # single "source" attribute onto "sources"; the exact spelling of
            # the hint below is an assumption for illustration.
            DebputyParseHint.target_attribute("sources"),
        ]
    ]
    # We allow the user to write "into: foo" in addition to "into: [foo]"
    into: Union[BinaryPackage, List[BinaryPackage]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) should have these files installed.
    into: NotRequired[List[BinaryPackage]]
Code-wise, I still had to check for into being absent and provide a default for that case (that is still true in the current codebase - I will hopefully fix that eventually). But I now had less room for mistakes and a standardized error message when a package name is misspelled, which was a plus.
The added side-effect - Introspection A lovely side-effect of all the parsing logic being provided to debputy in a declarative form was that the generated parser snippets had fields containing all the expected attributes with their types, which attributes were required, etc. This meant that adding an introspection feature where you can ask debputy "What does an install rule look like?" was quite easy. The code base already knew all of this, so the "hard" part was resolving the input to the concrete rule and then rendering it to the user. I added this feature recently along with the ability to provide online documentation for parser rules. I covered that in more detail in my blog post Providing online reference documentation for debputy in case you are interested. :)
Wrapping it up This was a short insight into how debputy parses your input. With this declarative technique:
  • The parser engine handles most of the error reporting, meaning users get most of the errors in a standard format without the plugin provider having to spend any effort on it. There will be some effort in more complex cases, but the common cases are done for you.
  • It is easy to provide flexibility to users while avoiding having to write code to normalize the user input into a simplified programmer oriented format.
  • The parser handles mapping from basic types into higher forms for you. These days, we have high-level types like FileSystemMode (either an octal or a symbolic mode), different kinds of file system matches depending on whether globs should be performed, etc. These types include their own validation and parsing rules that debputy handles for you (a toy sketch of octal-vs-symbolic mode parsing follows after this list).
  • Introspection and support for providing online reference documentation. Also, debputy checks that the provided attribute documentation covers all the attributes in the manifest form. If you add a new attribute, debputy will remind you if you forget to document it as well. :)
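To make the FileSystemMode example concrete, here is a toy sketch of parsing a mode given either as octal or as a very small subset of symbolic notation. It is purely illustrative and far simpler than debputy's actual FileSystemMode handling:

import re

_WHO_SHIFT = {"u": 6, "g": 3, "o": 0}
_PERM_BITS = {"r": 4, "w": 2, "x": 1}

def parse_file_mode(value: str) -> int:
    # Octal form, e.g. "0644" or "755".
    if re.fullmatch(r"0?[0-7]{3}", value):
        return int(value, 8)
    # Tiny subset of symbolic form, e.g. "u=rw,g=r,o=r".
    mode = 0
    for clause in value.split(","):
        match = re.fullmatch(r"([ugo]+)=([rwx]*)", clause)
        if match is None:
            raise ValueError(f"cannot parse file mode clause {clause!r}")
        bits = sum(_PERM_BITS[perm] for perm in match.group(2))
        for who in match.group(1):
            mode |= bits << _WHO_SHIFT[who]
    return mode

print(oct(parse_file_mode("0644")))          # 0o644
print(oct(parse_file_mode("u=rw,g=r,o=r")))  # 0o644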
In this way everybody wins. Yes, writing this parser generator code was more enjoyable than writing the ad-hoc manual parsers it replaced. :)

11 January 2024

Reproducible Builds: Reproducible Builds in December 2023

Welcome to the December 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries (more).

Reproducible Builds: Increasing the Integrity of Software Supply Chains awarded IEEE Software Best Paper award In February 2022, we announced in these reports that a paper written by Chris Lamb and Stefano Zacchiroli, titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), was now available in the March/April 2022 issue of IEEE Software. This month, IEEE Software announced that this paper has won their Best Paper award for 2022.

Reproducibility to affect package migration policy in Debian In a post summarising the activities of the Debian Release Team at a recent in-person Debian event in Cambridge, UK, Paul Gevers announced a change to the way packages are migrated into the staging area for the next stable Debian release based on their reproducibility status:
The folks from the Reproducibility Project have come a long way since they started working on it 10 years ago, and we believe it's time for the next step in Debian. Several weeks ago, we enabled a migration policy in our migration software that checks for regression in reproducibility. At this moment, that is presented as just for info, but we intend to change that to delays in the not so distant future. We eventually want all packages to be reproducible. To stimulate maintainers to make their packages reproducible now, we'll soon start to apply a bounty [speedup] for reproducible builds, like we've done with passing autopkgtests for years. We'll reduce the bounty for successful autopkgtests at that moment in time.

Speranza: Usable, privacy-friendly software signing Kelsey Merrill, Karen Sollins, Santiago Torres-Arias and Zachary Newman have developed a new system called Speranza, which is aimed at reassuring software consumers that the product they are getting has not been tampered with and is coming directly from a source they trust. A write-up goes into some more detail:
"What we have done," explains Sollins, "is to develop, prove correct, and demonstrate the viability of an approach that allows the [software] maintainers to remain anonymous." Preserving anonymity is obviously important, given that almost everyone (software developers included) values their confidentiality. This new approach, Sollins adds, "simultaneously allows [software] users to have confidence that the maintainers are, in fact, legitimate maintainers and, furthermore, that the code being downloaded is, in fact, the correct code of that maintainer." [ ]
The corresponding paper is published on the arXiv preprint server in various formats, and the announcement has also been covered in MIT News.

Nondeterministic Git bundles Paul Baecher published an interesting blog post on Reproducible git bundles. For those who are not familiar with them, Git bundles are used for the offline transfer of Git objects without an active server sitting on the other side of a network connection. Anyway, Paul wrote about writing a backup system for his entire system, but:
I noticed that a small but fixed subset of [Git] repositories are getting backed up despite having no changes made. That is odd because I would think that repeated bundling of the same repository state should create the exact same bundle. However, [it] turns out that for some repositories, bundling is nondeterministic.
Paul goes on to describe his solution, which involves forcing git to be single-threaded to make the output deterministic. The article was also discussed on Hacker News.
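For readers who want to try this, forcing single-threaded packing might look something like the sketch below. This is an assumption about the invocation rather than Paul's own commands; the pack.threads setting limits git's delta-compression threads:

import subprocess

# Create a bundle of all refs with git's delta compression limited to a single
# thread, so repeated bundling of an unchanged repository yields identical output.
subprocess.run(
    ["git", "-c", "pack.threads=1", "bundle", "create", "backup.bundle", "--all"],
    check=True,
)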

Output from libxslt now deterministic libxslt is the XSLT C library developed for the GNOME project, where XSLT itself is an XML language to define transformations for XML files. This month, it was revealed that the result of the generate-id() XSLT function is now deterministic across multiple transformations, fixing many issues with reproducible builds. As the Git commit by Nick Wellnhofer describes:
Rework the generate-id() function to return deterministic values. We use
a simple incrementing counter and store ids in the 'psvi' member of
nodes which was freed up by previous commits. The presence of an id is
indicated by a new "source node" flag.
This fixes long-standing problems with reproducible builds, see
This also hardens security, as the old implementation leaked the
difference between a heap and a global pointer, see
The old implementation could also generate the same id for dynamically
created nodes which happened to reuse the same memory. Ids for namespace
nodes were completely broken. They now use the id of the parent element
together with the hex-encoded namespace prefix.

Community updates A number of improvements were made to our website this month, including Chris Lamb fixing the generate-draft script so it does not blow up if the input files have been corrupted today or even in the past [ ], Holger Levsen updating the Hamburg 2023 summit page to add a link to the farewell post [ ] and a picture of a Post-It note [ ], and Pol Dellaiera updating the paragraph about tar and the --clamp-mtime flag [ ]. On our mailing list this month, Bernhard M. Wiedemann posted an interesting summary of some of the reasons why packages are still not reproducible in 2023. diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including processing objdump symbol comment filter inputs as Python byte (and not str) instances [ ]. In addition, Vagrant Cascadian extended diffoscope support for GNU Guix [ ] and updated the version in that distribution to version 253 [ ].

Challenges of Producing Software Bill Of Materials for Java Musard Balliu, Benoit Baudry, Sofia Bobadilla, Mathias Ekstedt, Martin Monperrus, Javier Ron, Aman Sharma, Gabriel Skoglund, César Soto-Valero and Martin Wittlinger (!) of the KTH Royal Institute of Technology in Sweden have published an article in which they:
deep-dive into 6 tools and the accuracy of the SBOMs they produce for complex open-source Java projects. Our novel insights reveal some hard challenges regarding the accurate production and usage of software bills of materials.
The paper is available on arXiv.

Debian Non-Maintainer campaign As mentioned in previous reports, the Reproducible Builds team within Debian has been organising a series of online and offline sprints in order to clear the huge backlog of reproducible builds patches submitted by performing so-called NMUs (Non-Maintainer Uploads). During December, Vagrant Cascadian performed a number of such uploads, including: In addition, Holger Levsen performed three no-source-change NMUs in order to address the last packages without .buildinfo files in Debian trixie, specifically lorene (0.0.0~cvs20161116+dfsg-1.1), maria (1.3.5-4.2) and ruby-rinku (1.7.3-2.1).

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In December, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Fix matching packages for the R programming language. [ ][ ][ ]
    • Add a Certbot configuration for the Nginx web server. [ ]
    • Enable debugging for the create-meta-pkgs tool. [ ][ ]
  • Arch Linux-related changes
    • The asp tool has been deprecated in favour of pkgctl; thanks to dvzrv for the pointer. [ ]
    • Disable the Arch Linux builders for now. [ ]
    • Stop referring to the /trunk branch / subdirectory. [ ]
    • Use --protocol https when cloning repositories using the pkgctl tool. [ ]
  • Misc changes:
    • Install the python3-setuptools and swig packages, which are now needed to build OpenWrt. [ ]
    • Install pkg-config needed to build Coreboot artifacts. [ ]
    • Detect failures due to an issue where the fakeroot tool is implicitly required but not automatically installed. [ ]
    • Detect failures due to rename of the vmlinuz file. [ ]
    • Improve the grammar of an error message. [ ]
    • Document that the FreeBSD node has been updated to FreeBSD 14.0. [ ]
In addition, node maintenance was performed by Holger Levsen [ ] and Vagrant Cascadian [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

Matthias Klumpp: Wayland really breaks things Just for now?

This post is in part a response to an aspect of Nate's post "Does Wayland really break everything?", but also my reflection on discussing Wayland protocol additions, a unique pleasure that I have been involved with for the past months.1

Some facts Before I start I want to make a few things clear: The Linux desktop will be moving to Wayland2; this is a fact at this point (and has been for a while), and sticking to X11 makes no sense for future projects. From reading Wayland protocols and working with it at a much lower level than I ever wanted to, it is also very clear to me that Wayland is an exceptionally well-designed core protocol, and so are the additional extension protocols (xdg-shell & Co.). The modularity of Wayland is great, it gives it incredible flexibility and will for sure turn out to be good for the long-term viability of this project (and also provides a path to correct protocol issues in future, if one is found). In other words: Wayland is an amazing foundation to build on, and a lot of its design decisions make a lot of sense! The shift towards people seeing Linux more as an application developer platform, and taking PipeWire and XDG Portals into account when designing for Wayland, is also an amazing development and I love to see this; this holistic approach is something I always wanted! Furthermore, I think Wayland removes a lot of functionality that shouldn't exist in a modern compositor, and that's a good thing too! Some of X11's features and design decisions had clear drawbacks that we shouldn't replicate. I highly recommend reading Nate's blog post, it's very good and goes into more detail. And due to all of this, I firmly believe that any advancement in the Wayland space must come from within the project.

But! But! Of course there was a but coming  I think while developing Wayland-as-an-ecosystem we are now entrenched into narrow concepts of how a desktop should work. While discussing Wayland protocol additions, a lot of concepts clash, people from different desktops with different design philosophies debate the merits of those over and over again never reaching any conclusion (just as you will never get an answer out of humans whether sushi or pizza is the clearly superior food, or whether CSD or SSD is better). Some people want to use Wayland as a vehicle to force applications to submit to their desktop s design philosophies, others prefer the smallest and leanest protocol possible, other developers want the most elegant behavior possible. To be clear, I think those are all very valid approaches. But this also creates problems: By switching to Wayland compositors, we are already forcing a lot of porting work onto toolkit developers and application developers. This is annoying, but just work that has to be done. It becomes frustrating though if Wayland provides toolkits with absolutely no way to reach their goal in any reasonable way. For Nate s Photoshop analogy: Of course Linux does not break Photoshop, it is Adobe s responsibility to port it. But what if Linux was missing a crucial syscall that Photoshop needed for proper functionality and Adobe couldn t port it without that? In that case it becomes much less clear on who is to blame for Photoshop not being available. A lot of Wayland protocol work is focused on the environment and design, while applications and work to port them often is considered less. I think this happens because the overlap between application developers and developers of the desktop environments is not necessarily large, and the overlap with people willing to engage with Wayland upstream is even smaller. The combination of Windows developers porting apps to Linux and having involvement with toolkits or Wayland is pretty much nonexistent. So they have less of a voice.

A quick detour through the neuroscience research lab I have been involved with Freedesktop, GNOME and KDE for an incredibly long time now (more than a decade), but my actual job (besides consulting for Purism) is that of a PhD candidate in a neuroscience research lab (working on the morphology of biological neurons and its relation to behavior). I am mostly involved with three research groups in our institute, which is about 35 people. Most of us do all our data analysis on powerful servers which we connect to using RDP (with KDE Plasma as desktop). Since I joined, I have been pushing the envelope a bit to extend Linux usage to data acquisition and regular clients, and to have our data acquisition hardware interface well with it. Linux brings some unique advantages for use in research, besides the obvious one of having every step of your data management platform introspectable with no black boxes left, a goal I value very highly in research (but this would be its own blogpost). In terms of operating system usage though, most systems are still Windows-based. Windows is what companies develop for, and what people use by default and are familiar with. The choice of operating system is very strongly driven by application availability, and WSL being really good makes this somewhat worse, as it removes the need for people to switch to a real Linux system entirely if there is the occasional software requiring it. Yet, we have a lot more Linux users than before, and use it in many places where it makes sense. I also developed a novel data acquisition software that even runs on Linux-only and uses the abilities of the platform to its fullest extent. All of this resulted in me asking existing software and hardware vendors for Linux support a lot more often. Vendor-customer relationship in science is usually pretty good, and vendors do usually want to help out. Same for open source projects, especially if you offer to do Linux porting work for them But overall, the ease of use and availability of required applications and their usability rules supreme. Most people are not technically knowledgeable and just want to get their research done in the best way possible, getting the best results with the least amount of friction.
KDE/Linux usage at a control station for a particle accelerator at Adlershof Technology Park, Germany, for reference (by 25years of KDE)3

Back to the point The point of that story is this: GNOME, KDE, RHEL, Debian or Ubuntu: They all do not matter if the necessary applications are not available for them. And as soon as they are, the easiest-to-use solution wins. There are many facets of easiest : In many cases this is RHEL due to Red Hat support contracts being available, in many other cases it is Ubuntu due to its mindshare and ease of use. KDE Plasma is also frequently seen, as it is perceived a bit easier to onboard Windows users with it (among other benefits). Ultimately, it comes down to applications and 3rd-party support though. Here s a dirty secret: In many cases, porting an application to Linux is not that difficult. The thing that companies (and FLOSS projects too!) struggle with and will calculate the merits of carefully in advance is whether it is worth the support cost as well as continuous QA/testing. Their staff will have to do all of that work, and they could spend that time on other tasks after all. So if they learn that porting to Linux not only means added testing and support, but also means to choose between the legacy X11 display server that allows for 1:1 porting from Windows or the new Wayland compositors that do not support the same features they need, they will quickly consider it not worth the effort at all. I have seen this happen. Of course many apps use a cross-platform toolkit like Qt, which greatly simplifies porting. But this just moves the issue one layer down, as now the toolkit needs to abstract Windows, macOS and Wayland. And Wayland does not contain features to do certain things or does them very differently from e.g. Windows, so toolkits have no way to actually implement the existing functionality in a way that works on all platforms. So in Qt s documentation you will often find texts like works everywhere except for on Wayland compositors or mobile 4. Many missing bits or altered behavior are just papercuts, but those add up. And if users will have a worse experience, this will translate to more support work, or people not wanting to use the software on the respective platform.

What s missing?

Window positioning SDI applications with multiple windows are very popular in the scientific world. For data acquisition (for example with microscopes) we often have one monitor with control elements and one larger one with the recorded image. There is also other configurations where multiple signal modalities are acquired, and the experimenter aligns windows exactly in the way they want and expects the layout to be stored and to be loaded upon reopening the application. Even in the image from Adlershof Technology Park above you can see this style of UI design, at mega-scale. Being able to pop-out elements as windows from a single-window application to move them around freely is another frequently used paradigm, and immensely useful with these complex apps. It is important to note that this is not a legacy design, but in many cases an intentional choice these kinds of apps work incredibly well on larger screens or many screens and are very flexible (you can have any window configuration you want, and switch between them using the (usually) great window management abilities of your desktop). Of course, these apps will work terribly on tablets and small form factors, but that is not the purpose they were designed for and nobody would use them that way. I assumed for sure these features would be implemented at some point, but when it became clear that that would not happen, I created the ext-placement protocol which had some good discussion but was ultimately rejected from the xdg namespace. I then tried another solution based on feedback, which turned out not to work for most apps, and now proposed xdg-placement (v2) in an attempt to maybe still get some protocol done that we can agree on, exploring more options before pushing the existing protocol for inclusion into the ext Wayland protocol namespace. Meanwhile though, we can not port any application that needs this feature, while at the same time we are switching desktops and distributions to Wayland by default.

Window position restoration Similarly, a protocol to save & restore window positions was already proposed in 2018, 6 years ago now, but it has still not been agreed upon, and may not even help multiwindow apps in its current form. The absence of this protocol means that applications can not restore their former window positions, and the user has to move them to their previous place again and again. Meanwhile, toolkits can not adopt these protocols and applications can not use them and can not be ported to Wayland without introducing papercuts.

Window icons Similarly, individual windows can not set their own icons, and not-installed applications can not have an icon at all because there is no desktop-entry file to load the icon from and no icon in the theme for them. You would think this is a niche issue, but for applications that create many windows, providing icons for them so the user can find them is fairly important. Of course it's not the end of the world if every window has the same icon, but it's one of those papercuts that make the software slightly less user-friendly. Even applications with fewer windows like LibrePCB are affected, so much so that they'd rather run their app through Xwayland for now. I decided to address this after I was working on data analysis of image data in a Python virtualenv, where my code and the Python libraries used created lots of windows all with the default yellow W icon, making it impossible to distinguish them at a glance. This is xdg-toplevel-icon now, but of course it is an uphill battle where the very premise of needing this is questioned. So applications can not use it yet.

Limited window abilities requiring specialized protocols Firefox has a picture-in-picture feature, allowing it to pop out media from a mediaplayer as separate floating window so the user can watch the media while doing other things. On X11 this is easily realized, but on Wayland the restrictions posed on windows necessitate a different solution. The xdg-pip protocol was proposed for this specialized usecase, but it is also not merged yet. So this feature does not work as well on Wayland.

Automated GUI testing / accessibility / automation Automation of GUI tasks is a powerful feature, as is the ability to auto-test GUIs. This is being worked on, with libei and wlheadless-run (and stuff like ydotool exists too), but we're not fully there yet.

Wayland is frustrating for (some) application authors As you see, there are valid applications and valid use cases that can not yet be ported to Wayland with the same feature range they enjoyed on X11, Windows or macOS. So, from an application author's perspective, Wayland does break things quite significantly, because things that worked before can no longer work and Wayland (the whole stack) does not provide any avenue to achieve the same result. Wayland does break screen sharing, global hotkeys, gaming latency (via "no tearing") etc., however for all of these there are solutions available that application authors can port to. And most developers will gladly do that work, especially since the newer APIs are usually a lot better and more robust. But if you give application authors no path forward except "use Xwayland and be on emulation as a second-class citizen forever", it just results in very frustrated application developers. For some application developers, switching to a Wayland compositor is like buying a canvas from the Linux shop that forces your brush to only draw triangles. But maybe for your avant-garde art, you need to draw a circle. You can approximate one with triangles, but it will never be as good as the artwork of your friends who got their canvases from the Windows or macOS art supply shop and have more freedom to create their art.

Triangles are proven to be the best shape! If you are drawing circles you are creating bad art! Wayland, via its protocol limitations, forces a certain way to build application UX, often for the better, but also sometimes to the detriment of users and applications. The protocols are often fairly opinionated, a result of the lessons learned from X11. In any case though, it is the odd one out: Windows and macOS do not pose the same limitations (for better or worse!), and the effort to port to Wayland is orders of magnitude bigger, or sometimes (in case of the multiwindow UI paradigm) impossible to achieve to the same level of polish. Desktop environments of course have a design philosophy that they want to push, and want applications to integrate as much as possible (same as macOS and Windows!). However, there are many applications out there, and pushing a design via protocol limitations will likely just result in fewer apps.

The porting dilemma I spent probably way too much time looking into how to get applications cross-platform and running on Linux, often talking to vendors (FLOSS and proprietary) as well. Wayland limitations aren't the biggest issue by far, but they do start to come up now, especially in the scientific space with Ubuntu having switched to Wayland by default. For application authors there is often no way to address these issues. Many scientists do not even understand why their Python script that creates some GUIs suddenly behaves weirdly because Qt is now using the Wayland backend on Ubuntu instead of X11. They do not know the difference and also do not want to deal with these details; even though they may be programmers as well, the real goal is not to fiddle with the display server, but to get to a scientific result somehow. Another issue is portability layers like Wine which need to run Windows applications as-is on Wayland. Apparently Wine's Wayland driver has some heuristics to make window positioning work (and I am amazed by the work done on this!), but that can only go so far.

A way out? So, how would we actually solve this? Fundamentally, this excessively long blog post boils down to just one essential question: Do we want to force applications to submit to a UX paradigm unconditionally, potentially loosing out on application ports or keeping apps on X11 eternally, or do we want to throw them some rope to get as many applications ported over to Wayland, even through we might sacrifice some protocol purity? I think we really have to answer that to make the discussions on wayland-protocols a lot less grueling. This question can be answered at the wayland-protocols level, but even more so it must be answered by the individual desktops and compositors. If the answer for your environment turns out to be Yes, we want the Wayland protocol to be more opinionated and will not make any compromises for application portability , then your desktop/compositor should just immediately NACK protocols that add something like this and you simply shouldn t engage in the discussion, as you reject the very premise of the new protocol: That it has any merit to exist and is needed in the first place. In this case contributors to Wayland and application authors also know where you stand, and a lot of debate is skipped. Of course, if application authors want to support your environment, you are basically asking them now to rewrite their UI, which they may or may not do. But at least they know what to expect and how to target your environment. If the answer turns out to be We do want some portability , the next question obviously becomes where the line should be drawn and which changes are acceptable and which aren t. We can t blindly copy all X11 behavior, some porting work to Wayland is simply inevitable. Some written rules for that might be nice, but probably more importantly, if you agree fundamentally that there is an issue to be fixed, please engage in the discussions for the respective MRs! We for sure do not want to repeat X11 mistakes, and I am certain that we can implement protocols which provide the required functionality in a way that is a nice compromise in allowing applications a path forward into the Wayland future, while also being as good as possible and improving upon X11. For example, the toplevel-icon proposal is already a lot better than anything X11 ever had. Relaxing ACK requirements for the ext namespace is also a good proposed administrative change, as it allows some compositors to add features they want to support to the shared repository easier, while also not mandating them for others. In my opinion, it would allow for a lot less friction between the two different ideas of how Wayland protocol development should work. Some compositors could move forward and support more protocol extensions, while more restrictive compositors could support less things. Applications can detect supported protocols at launch and change their behavior accordingly (ideally even abstracted by toolkits). You may now say that a lot of apps are ported, so surely this issue can not be that bad. And yes, what Wayland provides today may be enough for 80-90% of all apps. But what I hope the detour into the research lab has done is convince you that this smaller percentage of apps matters. A lot. And that it may be worthwhile to support them. To end on a positive note: When it came to porting concrete apps over to Wayland, the only real showstoppers so far5 were the missing window-positioning and window-position-restore features. 
I encountered them when porting my own software, and I got the issue as feedback from colleagues and fellow engineers. In second place was UI testing and automation support, the window-icon issue was mentioned twice, but being a cosmetic issue it likely simply hurts people less and they can ignore it easier. What this means is that the majority of apps are already fine, and many others are very, very close! A Wayland future for everyone is within our grasp!  I will also bring my two protocol MRs to their conclusion for sure, because as application developers we need clarity on what the platform (either all desktops or even just a few) supports and will or will not support in future. And the only way to get something good done is by contribution and friendly discussion.

  1. Apologies for the clickbait-y title; it comes with the subject
  2. When I talk about Wayland I mean the combined set of display server protocols and accepted protocol extensions, unless otherwise clarified.
  3. I would have picked a picture from our lab, but that would have needed permission first
  4. Qt has awesome platform issues pages, like for macOS and Linux/X11, which help with porting efforts, but Qt doesn't even list Linux/Wayland as a supported platform. There is some information though, like window geometry peculiarities, which aren't particularly helpful when porting (but still essential to know).
  5. Besides issues with Nvidia hardware: CUDA for simulations and machine-learning is pretty much everywhere, so Nvidia cards are common, which still causes trouble on Wayland. It is improving though.

8 January 2024

Russ Allbery: Review: The Faithless

Review: The Faithless, by C.L. Clark
Series: Magic of the Lost #2
Publisher: Orbit
Copyright: March 2023
ISBN: 0-316-54283-0
Format: Kindle
Pages: 527
The Faithless is the second book in a political fantasy series that seems likely to be a trilogy. It is a direct sequel to The Unbroken, which you should read first. As usual, Orbit made it unnecessarily hard to get re-immersed in the world by refusing to provide memory aids for readers who read books as they come out instead of only when the series is complete, but this is not the fault of Clark or the book and you've heard me rant about this before. The Unbroken was set in Qaz l (not-Algeria). The Faithless, as readers of the first book might guess from the title, is set in Balladaire (not-France). This is the palace intrigue book. Princess Luca is fighting for her throne against her uncle, the regent. Touraine is trying to represent her people. Whether and to what extent those interests are aligned is much of the meat of this book. Normally I enjoy palace intrigue novels for the competence porn: watching someone navigate a complex political situation with skill and cunning, or upend the entire system by building unlikely coalitions or using unexpected routes to power. If you are similar, be warned that this is not what you're going to get. Touraine is a fish out of water with no idea how to navigate the Balladairan court, and does not magically become an expert in the course of this novel. Luca has the knowledge, but she's unsure, conflicted, and largely out-maneuvered. That means you will have to brace for some painful scenes of some of the worst people apparently getting what they want. Despite that, I could not put this down. It was infuriating, frustrating, and a much slower burn than I prefer, but the layers of complex motivations that Clark builds up provided a different sort of payoff. Two books in, the shape of this series is becoming clearer. This series is about empire and colonialism, but with considerably more complexity than fantasy normally brings to that topic. Power does not loosen its grasp easily, and it has numerous tools for subtle punishment after apparent upstart victories. Righteous causes rarely call banners to your side; instead, they create opportunities for other people to maneuver to their own advantage. Touraine has some amount of power now, but it's far from obvious how to use it. Her life's training tells her that exercising power will only cause trouble, and her enemies are more than happy to reinforce that message at every opportunity. Most notable to me is Clark's bitingly honest portrayal of the supposed allies within the colonial power. It is clear that Luca is attempting to take the most ethical actions as she defines them, but it's remarkable how those efforts inevitably imply that Touraine should help Luca now in exchange for Luca's tenuous and less-defined possible future aid. This is not even a lie; it may be an accurate summary of Balladairan politics. And yet, somehow what Balladaire needs always matters more than the needs of their abused colony. Underscoring this, Clark introduces another faction in the form of a populist movement against the Balladairan monarchy. The details of that setup in another fantasy novel would make them allies of the Qaz l. Here, as is so often the case in real life, a substantial portion of the populists are even more xenophobic and racist than the nobility. There are no easy alliances. The trump card that Qaz l holds is magic. They have it, and (for reasons explored in The Unbroken) Balladaire needs it, although that is a position held by Luca's faction and not by her uncle. 
But even Luca wants to reduce that magic to a manageable technology, like any other element of the Balladairan state. She wants to understand it, harness it, and bring it under local control. Touraine, trained by Balladaire and facing Balladairan political problems, has the same tendency. The magic, at least in this book, refuses not in the flashy, rebellious way that it would in most fantasy, but in a frustrating and incomprehensible lack of predictable or convenient rules. I think this will feel like a plot device to some readers, and that is to some extent true, but I think I see glimmers of Clark setting up a conflict of world views that will play out in the third book. I think some people are going to bounce off this book. It's frustrating, enraging, at times melodramatic, and does not offer the cathartic payoff typically offered in fantasy novels of this type. Usually these are things I would be complaining about as well. And yet, I found it satisfyingly challenging, engrossing, and memorable. I spent a lot of the book yelling "just kill him already" at the characters, but I think one of Clark's points is that overcoming colonial relationships requires a lot more than just killing one evil man. The characters profoundly fail to execute some clever and victorious strategy. Instead, as in the first book, they muddle through, making the best choice that they can see in each moment, making lots of mistakes, and paying heavy prices. It's realistic in a way that has nothing to do with blood or violence or grittiness. (Although I did appreciate having the thin thread of Pruett's story and its highly satisfying conclusion.) This is also a slow-burn romance, and there too I think opinions will differ. Touraine and Luca keep circling back to the same arguments and the same frustrations, and there were times that this felt repetitive. It also adds a lot of personal drama to the politics in a way that occasionally made me dubious. But here too, I think Clark is partly using the romance to illustrate the deeper political points. Luca is often insufferable, cruel and ambitious in ways she doesn't realize, and only vaguely able to understand the Qaz l perspective; in short, she's the pragmatic centrist reformer. I am dubious that her ethics would lead her to anything other than endless compromise without Touraine to push her. To Luca's credit, she also realizes that and wants to be a better person, but struggles to have the courage to act on it. Touraine both does and does not want to manipulate her; she wants Luca's help (and more), but it's not clear Luca will give it under acceptable terms, or even understand how much she's demanding. It's that foundational conflict that turns the romance into a slow burn by pushing them apart. Apparently I have more patience for this type of on-again, off-again relationship than one based on artificial miscommunication. The more I noticed the political subtext, the more engaging I found the romance on the surface. I picked this up because I'd read several books about black characters written by white authors, and while there was nothing that wrong with those books, the politics felt a little too reductionist and simplified. I wanted a book that was going to force me out of comfortable political assumptions. The Faithless did exactly what I was looking for, and I am definitely here for the rest of the series. In that sense, recommended, although do not go into this book hoping for adroit court maneuvering and competence porn. 
Followed by The Sovereign, which does not yet have a release date. Content warnings: Child death, attempted cultural genocide. Rating: 7 out of 10

2 January 2024

Valhalla's Things: Crescent Shawl

Posted on January 2, 2024
Tags: madeof:atoms
a woman wearing a shawl, seen from the back where it looks like a big dark grey triangle with a light grey border and another light grey border with a grid of holes. There is also a double line of holes in the center of the back, and two single ones towards the sides. One of the knitting projects I'm working on is a big bottom-up triangular shawl in less-than-fingering weight yarn (NM 1/15): it feels like a cloud should by all rights feel, and I have good expectations out of it, but it's taking forever and a day. And then one day last spring I started thinking in the general direction of top-down shawls, and decided I couldn't wait until I had finished the first one to see if I could design one. For my first attempt I used an odd ball of 50% wool 50% plastic I had in my stash and worked it on 12 mm tree trunks, and I quickly made something between a scarf and a shawl that got some use during the summer thunderstorms when temperatures got a bit lower, but not really cold. I was happy with the shape, not with the exact position of the increases, but I had ideas for improvements, so I just had to try another time. Digging through the stash I found four balls of Drops Alpaca in two shades of grey: I had bought it with the intent to test its durability in somewhat more demanding situations (such as gloves or even socks), but then the LYS1 no longer carries it, so I might as well use it for something a bit more one-off (and when I received the yarn it felt so soft that doing something for the upper body looked like a better idea anyway). I decided to start working in garter stitch with the darker colour, then some garter stitch in the lighter shade and to finish with yo / k2t lace, to make the shawl sort of fade out. The first half was worked relatively slowly through the summer, and then when I reached the colour change I suddenly picked up working on it and it was finished in a couple of weeks. the same shawl, worn before blocking: the garter stitch part
looks denser in a nice way, but the lace border is scrunched up.
Then I had doubts on whether I wanted to block it, since I liked the soft feel, but I decided to try it anyway: it didn't lose the feel, and the look is definitely better, even if it was my first attempt at blocking a shawl and I wasn't that good at it. the same shawl, blocked, worn and seen from the front, where it falls in wide falls from the shoulders between the arms and the body. I'm glad that I did it, however, as it's still soft and warm, but now also looks nicer. The pattern is of course online as #FreeSoftWear on my fiber craft patterns website.

  1. at least local to somebody: I can't get to a proper yarn shop by foot, so I've bought this yarn online from one that I could in theory reach on a day trip, but it has not happened yet.

27 December 2023

Bits from Debian: Statement about the EU Cyber Resilience Act

Debian Public Statement about the EU Cyber Resilience Act and the Product Liability Directive The European Union is currently preparing a regulation "on horizontal cybersecurity requirements for products with digital elements" known as the Cyber Resilience Act (CRA). It is currently in the final "trilogue" phase of the legislative process. The act includes a set of essential cybersecurity and vulnerability handling requirements for manufacturers. It will require products to be accompanied by information and instructions to the user. Manufacturers will need to perform risk assessments and produce technical documentation and, for critical components, have third-party audits conducted. Discovered security issues will have to be reported to European authorities within 25 hours (1). The CRA will be followed up by the Product Liability Directive (PLD) which will introduce compulsory liability for software. While a lot of these regulations seem reasonable, the Debian project believes that there are grave problems for Free Software projects attached to them. Therefore, the Debian project issues the following statement:
  1. Free Software has always been a gift, freely given to society, to take and to use as seen fit, for whatever purpose. Free Software has proven to be an asset in our digital age and the proposed EU Cyber Resilience Act is going to be detrimental to it. a. As the Debian Social Contract states, our goal is "make the best system we can, so that free works will be widely distributed and used." Imposing requirements such as those proposed in the act makes it legally perilous for others to redistribute our work and endangers our commitment to "provide an integrated system of high-quality materials with no legal restrictions that would prevent such uses of the system". (2) b. Knowing whether software is commercial or not isn't feasible, neither in Debian nor in most free software projects - we don't track people's employment status or history, nor do we check who finances upstream projects (the original projects that we integrate in our operating system). c. If upstream projects stop making available their code for fear of being in the scope of CRA and its financial consequences, system security will actually get worse rather than better. d. Having to get legal advice before giving a gift to society will discourage many developers, especially those without a company or other organisation supporting them.
  2. Debian is well known for its security track record through practices of responsible disclosure and coordination with upstream developers and other Free Software projects. We aim to live up to the commitment made in the Debian Social Contract: "We will not hide problems." (3) a.The Free Software community has developed a fine-tuned, tried-and-tested system of responsible disclosure in case of security issues which will be overturned by the mandatory reporting to European authorities within 24 hours (Art. 11 CRA). b. Debian spends a lot of volunteering time on security issues, provides quick security updates and works closely together with upstream projects and in coordination with other vendors. To protect its users, Debian regularly participates in limited embargos to coordinate fixes to security issues so that all other major Linux distributions can also have a complete fix when the vulnerability is disclosed. c. Security issue tracking and remediation is intentionally decentralized and distributed. The reporting of security issues to ENISA and the intended propagation to other authorities and national administrations would collect all software vulnerabilities in one place. This greatly increases the risk of leaking information about vulnerabilities to threat actors, representing a threat for all the users around the world, including European citizens. d. Activists use Debian (e.g. through derivatives such as Tails), among other reasons, to protect themselves from authoritarian governments; handing threat actors exploits they can use for oppression is against what Debian stands for. e. Developers and companies will downplay security issues because a "security" issue now comes with legal implications. Less clarity on what is truly a security issue will hurt users by leaving them vulnerable.
  3. While proprietary software is developed behind closed doors, Free Software development is done in the open, transparent for everyone. To retain parity with proprietary software the open development process needs to be entirely exempt from CRA requirements, just as the development of software in private is. A "making available on the market" can only be considered after development is finished and the software is released.
  4. Even if only "commercial activities" are in the scope of CRA, the Free Software community - and as a consequence, everybody - will lose a lot of small projects. CRA will force many small enterprises and most probably all self employed developers out of business because they simply cannot fulfill the requirements imposed by CRA. Debian and other Linux distributions depend on their work. If accepted as it is, CRA will undermine not only an established community but also a thriving market. CRA needs an exemption for small businesses and, at the very least, solo-entrepreneurs.

Information about the voting process: Debian uses the Condorcet method for voting. Simplistically, the plain Condorcet method can be stated like so: "Consider all possible two-way races between candidates. The Condorcet winner, if there is one, is the one candidate who can beat each other candidate in a two-way race with that candidate." The problem is that in complex elections, there may well be a circular relationship in which A beats B, B beats C, and C beats A. Most of the variations on Condorcet use various means of resolving the tie. Debian's variation is spelled out in the constitution, specifically A.5(3). (A small sketch of the plain pairwise rule follows below.) Sources: (1) CRA proposals and links & PLD proposals and links (2) Debian Social Contract No. 2, 3, and 4 (3) Debian Constitution
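As a toy illustration of the plain pairwise rule quoted above (not Debian's actual vote-counting code, which also has to resolve circular ties per the constitution), a minimal sketch could look like:

from itertools import combinations
from typing import Dict, List, Optional

def condorcet_winner(ballots: List[List[str]]) -> Optional[str]:
    # Each ballot ranks every candidate, most-preferred first.
    candidates = sorted({c for ballot in ballots for c in ballot})
    pairwise_wins: Dict[str, int] = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_over_b = sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))
        b_over_a = len(ballots) - a_over_b
        if a_over_b > b_over_a:
            pairwise_wins[a] += 1
        elif b_over_a > a_over_b:
            pairwise_wins[b] += 1
    # The Condorcet winner is the candidate who beats every other candidate head-to-head.
    needed = len(candidates) - 1
    return next((c for c, wins in pairwise_wins.items() if wins == needed), None)

print(condorcet_winner([["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]))  # A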

Russ Allbery: Review: A Study in Scarlet

Review: A Study in Scarlet, by Arthur Conan Doyle
Series: Sherlock Holmes #1
Publisher: AmazonClassics
Copyright: 1887
Printing: February 2018
ISBN: 1-5039-5525-7
Format: Kindle
Pages: 159
A Study in Scarlet is the short mystery novel (probably a novella, although I didn't count words) that introduced the world to Sherlock Holmes. I'm going to invoke the 100-year-rule and discuss the plot of this book rather freely on the grounds that even someone who (like me prior to a few days ago) has not yet read it is probably not that invested in avoiding all spoilers. If you do want to remain entirely unspoiled, exercise caution before reading on. I had somehow managed to avoid ever reading anything by Arthur Conan Doyle, not even a short story. I therefore couldn't be sure that some of the assertions I was making in my review of A Study in Honor were correct. Since A Study in Scarlet would be quick to read, I decided on a whim to do a bit of research and grab a free copy of the first Holmes novel. Holmes is such a part of English-speaking culture that I thought I had a pretty good idea of what to expect. This was largely true, but cultural osmosis had somehow not prepared me for the surprise Mormons. A Study in Scarlet establishes the basic parameters of a Holmes story: Dr. James Watson as narrator, the apartment he shares with Holmes at 221B Baker Street, the Baker Street Irregulars, Holmes's competition with police detectives, and his penchant for making leaps of logical deduction from subtle clues. The story opens with Watson meeting Holmes, agreeing to split the rent of a flat, and being baffled by the apparent randomness of Holmes's fields of study before Holmes reveals he's a consulting detective. The first case is a murder: a man is found dead in an abandoned house, without a mark on him although there are blood splatters on the walls and the word "RACHE" written in blood. Since my only prior exposure to Holmes was from cultural references and a few TV adaptations, there were a few things that surprised me. One is that Holmes is voluble and animated rather than aloof. Doyle is clearly going for passionate eccentric rather than calculating mastermind. Another is that he is intentionally and unabashedly ignorant on any topic not related to solving mysteries.
My surprise reached a climax, however, when I found incidentally that he was ignorant of the Copernican Theory and of the composition of the Solar System. That any civilized human being in this nineteenth century should not be aware that the earth travelled round the sun appeared to be to me such an extraordinary fact that I could hardly realize it. "You appear to be astonished," he said, smiling at my expression of surprise. "Now that I do know it I shall do my best to forget it." "To forget it!" "You see," he explained, "I consider that a man's brain originally is like a little empty attic, and you have to stock it with such furniture as you chose. A fool takes in all the lumber of every sort that he comes across, so that the knowledge which might be useful to him gets crowded out, or at best is jumbled up with a lot of other things so that he has a difficulty in laying his hands upon it. Now the skilful workman is very careful indeed as to what he takes into his brain-attic. He will have nothing but the tools which may help him in doing his work, but of these he has a large assortment, and all in the most perfect order. It is a mistake to think that that little room has elastic walls and can distend to any extent. Depend upon it there comes a time when for every addition of knowledge you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones."
This is directly contrary to my expectation that the best way to make leaps of deduction is to know something about a huge range of topics so that one can draw unexpected connections, particularly given the puzzle-box construction and odd details so beloved in classic mysteries. I'm now curious if Doyle stuck with this conception, and if there were any later mysteries that involved astronomy. Speaking of classic mysteries, A Study in Scarlet isn't quite one, although one can see the shape of the genre to come. Doyle does not "play fair" by the rules that have not yet been invented. Holmes at most points knows considerably more than the reader, including bits of evidence that are not described until Holmes describes them and research that Holmes does off-camera and only reveals when he wants to be dramatic. This is not the sort of story where the reader is encouraged to try to figure out the mystery before the detective. Rather, what Doyle seems to be aiming for, and what Watson attempts (unsuccessfully) as the reader surrogate, is slightly different: once Holmes makes one of his grand assertions, the reader is encouraged to guess what Holmes might have done to arrive at that conclusion. Doyle seems to want the reader to guess technique rather than outcome, while providing only vague clues in general descriptions of Holmes's behavior at a crime scene. The structure of this story is quite odd. The first part is roughly what you would expect: first-person narration from Watson, supposedly taken from his journals but not at all in the style of a journal and explicitly written for an audience. Part one concludes with Holmes capturing and dramatically announcing the name of the killer, who the reader has never heard of before. Part two then opens with... a western?
In the central portion of the great North American Continent there lies an arid and repulsive desert, which for many a long year served as a barrier against the advance of civilization. From the Sierra Nevada to Nebraska, and from the Yellowstone River in the north to the Colorado upon the south, is a region of desolation and silence. Nor is Nature always in one mood throughout the grim district. It comprises snow-capped and lofty mountains, and dark and gloomy valleys. There are swift-flowing rivers which dash through jagged cañons; and there are enormous plains, which in winter are white with snow, and in summer are grey with the saline alkali dust. They all preserve, however, the common characteristics of barrenness, inhospitality, and misery.
First, I have issues with the geography. That region contains some of the most beautiful areas on earth, and while a lot of that region is arid, describing it primarily as a repulsive desert is a bit much. Doyle's boundaries and distances are also confusing: the Yellowstone is a northeast-flowing river with its source in Wyoming, so the area between it and the Colorado does not extend to the Sierra Nevadas (or even to Utah), and it's not entirely clear to me that he realizes Nevada exists. This is probably what it's like for people who live anywhere else in the world when US authors write about their country. But second, there's no Holmes, no Watson, and not even the pretense of a transition from the detective novel that we were just reading. Doyle just launches into a random western with an omniscient narrator. It features a lean, grizzled man and an adorable child that he adopts and raises into a beautiful free spirit, who then falls in love with a wild gold-rush adventurer. This was written about 15 years before the first critically recognized western novel, so I can't blame Doyle for all the cliches here, but to a modern reader all of these characters are straight from central casting. Well, except for the villains, who are the Mormons. By that, I don't mean that the villains are Mormon. I mean Brigham Young is the on-page villain, plotting against the hero to force his adopted daughter into a Mormon harem (to use the word that Doyle uses repeatedly) and ruling Salt Lake City with an iron hand, border guards with passwords (?!), and secret police. This part of the book was wild. I was laughing out-loud at the sheer malevolent absurdity of the thirty-day countdown to marriage, which I doubt was the intended effect. We do eventually learn that this is the backstory of the murder, but we don't return to Watson and Holmes for multiple chapters. Which leads me to the other thing that surprised me: Doyle lays out this backstory, but then never has his characters comment directly on the morality of it, only the spectacle. Holmes cares only for the intellectual challenge (and for who gets credit), and Doyle sets things up so that the reader need not concern themselves with aftermath, punishment, or anything of that sort. I probably shouldn't have been surprised, since this does fit with the Holmes stereotype, but I'm used to modern fiction where there is usually at least some effort to pass judgment on the events of the story. Doyle draws very clear villains, but is utterly silent on whether the murder is justified. Given its status in the history of literature, I'm not sorry to have read this book, but I didn't particularly enjoy it. It is very much of its time: everyone's moral character is linked directly to their physical appearance, and Doyle uses the occasional racial stereotype without a second thought. Prevailing writing styles have changed, so the prose feels long-winded and breathless. The rivalry between Holmes and the police detectives is tedious and annoying. I also find it hard to read novels from before the general absorption of techniques of emotional realism and interiority into all genres. The characters in A Study in Scarlet felt more like cartoon characters than fully-realized human beings. I have no strong opinion about the objective merits of this book in the context of its time other than to note that the sudden inserted western felt very weird.
My understanding is that this is not considered one of the better Holmes stories, and Holmes gets some deeper characterization later on. Maybe I'll try another of Doyle's works someday, but for now my curiosity has been sated. Followed by The Sign of the Four. Rating: 4 out of 10

26 December 2023

Russ Allbery: Review: A Study in Honor

Review: A Study in Honor, by Claire O'Dell
Series: Janet Watson Chronicles #1
Publisher: Harper Voyager
Copyright: July 2018
ISBN: 0-06-269932-6
Format: Kindle
Pages: 295
A Study in Honor is a near-future science fiction novel by Claire O'Dell, a pen name for Beth Bernobich. You will see some assertions, including by the Lambda Literary Award judges, that it is a mystery novel. There is a mystery, but... well, more on that in a moment. Janet Watson was an Army surgeon in the Second US Civil War when New Confederacy troops overran the lines in Alton, Illinois. Watson lost her left arm to enemy fire. As this book opens, she is returning to Washington, D.C. with a medical discharge, PTSD, and a field replacement artificial arm scavenged from a dead soldier. It works, sort of, mostly, despite being mismatched to her arm and old in both technology and previous wear. It does not work well enough for her to resume her career as a surgeon. Watson's plan is to request a better artificial arm from the VA (the United States Department of Veterans Affairs, which among other things is responsible for the medical care of wounded veterans). That plan meets a wall of unyielding and uninterested bureaucracy. She has a pension, but it's barely enough for cheap lodging. A lifeline comes in the form of a chance encounter with a former assistant in the Army, who has a difficult friend looking to split the cost of an apartment. The name of that friend is Sara Holmes. At this point, you know what to expect. This is clearly one of the many respinnings of Arthur Conan Doyle. This time, the setting is in the future and Watson and Holmes are both black women, but the other elements of the setup are familiar: the immediate deduction that Watson came from the front, the shared rooms (2809 Q Street this time, sacrificing homage for the accuracy of a real address), Holmes's tendency to play an instrument (this time the piano), and even the title of this book, which is an obvious echo of the title of the first Holmes novel, A Study in Scarlet. Except that's not what you'll get. There are a lot of parallels and references here, but this is not a Holmes-style detective novel. First, it's only arguably a detective novel at all. There is a mystery, which starts with a patient Watson sees in her fallback job as a medical tech in the VA hospital and escalates to a physical attack, but that doesn't start until a third of the way into the book. It certainly is not solved through minute clues and leaps of deduction; instead, that part of the plot has the shape of a thriller rather than a classic mystery. There is a good argument that the thriller is the modern mystery novel, so I don't want to overstate my case, but I think someone who came to this book wanting a Doyle-style mystery would be disappointed. Second, the mystery is not the heart of this book. Watson is. She, like Doyle's Watson, is the first-person narrator, but she is far more present in the book. I have no idea how accurate O'Dell's portrayal of Watson's PTSD is, but it was certainly compelling and engrossing reading. Her fight for basic dignity and her rage at the surface respect and underlying disinterested hostility of the bureaucratic war machinery is what kept me turning the pages. The mystery plot is an outgrowth of that and felt more like a case study than the motivating thread of the plot. And third, Sara Holmes... well, I hesitate to say definitively that she's not Sherlock Holmes. There have been so many versions of Holmes over the years, even apart from the degree to which a black woman would necessarily not be like Doyle's character. But she did not remind me of Sherlock Holmes. 
She reminded me of a cross between James Bond and a high fae. This sounds like a criticism. It very much is not. I found this high elf spy character far more interesting than I have ever found Sherlock Holmes. But here again, if you came into this book hoping for a Holmes-style master detective, I fear you may be wrong-footed. The James Bond parts will be obvious when you get there and aren't the most interesting (and thankfully the misogyny is entirely absent). The part I found more fascinating is the way O'Dell sets Holmes apart by making her fae rather than insufferable. She projects effortless elegance, appears and disappears on a mysterious schedule of her own, thinks nothing of reading her roommate's diary, leaves meticulously arranged gifts, and even bargains with Watson for answers to precisely three questions. The reader does learn some mundane explanations for some of this behavior, but to be honest I found them somewhat of a letdown. Sara Holmes is at her best as a character when she tacks her own mysterious path through a rather grim world of exhausted war, penny-pinching bureaucracy, and despair, pursuing an unexplained agenda of her own while showing odd but unmistakable signs of friendship and care. This is not a romance, at least in this book. It is instead a slowly-developing friendship between two extremely different people, one that I thoroughly enjoyed. I do have a couple of caveats about this book. The first is that the future US in which it is set is almost pure Twitter doomcasting. Trump's election sparked a long slide into fascism, and when that was arrested by the election of a progressive candidate backed by a fragile coalition, Midwestern red states seceded to form the New Confederacy and start a second civil war that has dragged on for nearly eight years. It's a very specific mainstream liberal dystopian scenario that I've seen so many times it felt like a cliche even though I don't remember seeing it in a book before. This type of future projection of current fears is of course not new for science fiction; Cold War nuclear war novels are probably innumerable. But I had questions, such as how a sparsely-populated, largely non-industrial, and entirely landlocked set of breakaway states could maintain a war footing for eight years. Despite some hand-waving about covert support, those questions are not really answered here. The second problem is that the ending of this book kind of falls apart. The climax of the mystery investigation is unsatisfyingly straightforward, and the resulting revelation is a hoary cliche. Maybe I'm just complaining about the banality of evil, but if I'd been engrossed in this book for the thriller plot, I think I would have been annoyed. I wasn't, though; I was here for the characters, for Watson's PTSD and dogged determination, for Sara's strangeness, and particularly for the growing improbable friendship between two women with extremely different life experiences, emotions, and outlooks. That part was great, regardless of the ending. Do not pick this book up because you want a satisfying deductive mystery with bumbling police and a blizzard of apparently inconsequential clues. That is not at all what's happening here. But this was great on its own terms, and I will be reading the sequel shortly. Recommended, although if you are very online expect to do a bit of eye-rolling at the setting. Followed by The Hound of Justice, but the sequel is not required. This book reaches a satisfying conclusion of its own. Rating: 8 out of 10

25 December 2023

Sergio Talens-Oliag: GitLab CI/CD Tips: Automatic Versioning Using semantic-release

This post describes how I'm using semantic-release on gitlab-ci to manage versioning automatically for different kinds of projects following a simple workflow (a develop branch where changes are added or merged to test new versions, a temporary release/#.#.# branch to generate the release candidate versions and a main branch where the final versions are published).

What is semantic-release
It is a Node.js application designed to manage project versioning information on Git repositories using a continuous integration system (in this post we will use gitlab-ci).

How does it work
By default semantic-release uses semver for versioning (release versions use the format MAJOR.MINOR.PATCH) and commit messages are parsed to determine the next version number to publish. If after analyzing the commits the version number has to be changed, the command updates the files we tell it to (i.e. the package.json file for nodejs projects and possibly a changelog file), creates a new commit with the changed files, creates a tag with the new version and pushes the changes to the repository. When running on a CI/CD system we usually generate the artifacts related to a release (a package, a container image, etc.) from the tag, as it includes the right version number and usually has passed all the required tests (it is a good idea to run the tests again in any case, as someone could create a tag manually or we could run extra jobs when building the final assets; if those fail it is not a big issue anyway, version numbers are cheap and infinite, so we can skip releases if needed).
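As an illustration of building the final artifacts from the tag, a job can be restricted to version tags with a rule like the following minimal sketch (the job name and the build command are placeholders, not part of the original pipelines):
build_release_artifacts:
  stage: release
  rules:
    - if: '$CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+/'
  script:
    - echo "Building artifacts for $CI_COMMIT_TAG"
    - make release   # placeholder for the real packaging command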

Commit messages and versioning
The commit messages must follow a known format; the default module used to analyze them uses the angular git commit guidelines, but I prefer the conventional commits one, mainly because it's a lot easier to use when you want to update the MAJOR version. The commit message format used must be the following (a couple of example messages are shown after the format block):
<type>(optional scope): <description>
[optional body]
[optional footer(s)]
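For example, with the conventional commits format the following hypothetical messages would trigger a PATCH, a MINOR and a MAJOR bump respectively (the scopes and descriptions are made up):
fix(parser): handle empty configuration files
feat(cli): add a --dry-run option
feat(api)!: remove the deprecated v1 endpoints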
The system supports three types of branches: release, maintenance and pre-release, but for now I'm not using maintenance ones. The branches I use and their types are:
  • main as release branch (final versions are published from there)
  • develop as pre-release branch (used to publish development and testing versions with the format #.#.#-SNAPSHOT.#)
  • release/#.#.# as pre-release branches (they are created from develop to publish release candidate versions with the format #.#.#-rc.# and once they are merged with main they are deleted); an example of creating one is shown after this list
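For example, a release candidate branch for a hypothetical 1.4.0 version could be created from develop like this (the version number is only an example):
git switch develop
git pull
git switch -c release/1.4.0
git push -u origin release/1.4.0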
On the release branch (main) the version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last version change (it looks for tags on the same branch).
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last version change.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change.
On the pre-release branches (develop and release/#.#.#) the version and pre-release numbers are always calculated from the last published version available on the branch (i.e. if we published version 1.3.2 on main we need to have the commit with that tag on the develop or release/#.#.# branch so that the next version is calculated correctly). The version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last released version. In our example it was 1.3.2 and the version is updated to 2.0.0-SNAPSHOT.1 or 2.0.0-rc.1 depending on the branch.
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last released version. In our example the release was 1.3.2 and the version is updated to 1.4.0-SNAPSHOT.1 or 1.4.0-rc.1 depending on the branch.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change. In our example the release was 1.3.2 and the version is updated to 1.3.3-SNAPSHOT.1 or 1.3.3-rc.1 depending on the branch.
  4. The pre-release number is incremented if the MAJOR, MINOR and PATCH numbers are not going to be changed but there is a commit that would otherwise update the version (i.e. a fix on 1.3.3-SNAPSHOT.1 will set the version to 1.3.3-SNAPSHOT.2, a fix or feat on 1.4.0-rc.1 will set the version to 1.4.0-rc.2 and so on).

How do we manage its configuration
Although the system is designed to work with nodejs projects, it can be used with multiple programming languages and project types. For nodejs projects the usual place to put the configuration is the project's package.json, but I prefer to use the .releaserc file instead. As I use a common set of CI templates, instead of using a .releaserc on each project I generate it on the fly on the jobs that need it, replacing values related to the project type and the current branch on a template using the tmpl command (lately I use a branch of my own fork while I wait for some feedback from upstream, as you will see on the Dockerfile).

Container used to run it
As we run the command on a gitlab-ci job we use the image built from the following Dockerfile (an example of building and pushing it is shown after it):
# Semantic release image
FROM golang:alpine AS tmpl-builder
#RUN go install
RUN go install
FROM node:lts-alpine
COPY --from=tmpl-builder /go/bin/tmpl /usr/local/bin/tmpl
RUN apk update &&\
  apk upgrade &&\
  apk add curl git jq openssh-keygen yq zip &&\
  npm install --location=global semantic-release &&\
  rm -rf /var/cache/apk/*
CMD ["/bin/sh"]
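To make the image available to the runners it has to be built and pushed to a registry they can pull from; a minimal sketch, where the registry and image name are placeholders:
docker build -t registry.example.com/ci/semantic-release:latest .
docker push registry.example.com/ci/semantic-release:latest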

How and when is it executed
The job that runs semantic-release is executed when new commits are added to the develop, release/#.#.# or main branches (basically when something is merged or pushed) and after all tests have passed (we don't want to create a new version that does not compile or does not pass at least the unit tests). The job is something like the following:
  image: $SEMANTIC_RELEASE_IMAGE
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^(develop|main|release\/\d+.\d+.\d+)$/'
      when: always
  stage: release
  script:
    - echo "Loading"
    - . $ASSETS_DIR/
    - sr_gen_releaserc_json
    - git_push_setup
    - semantic-release
Where the SEMANTIC_RELEASE_IMAGE variable contains the URI of the image built using the Dockerfile above, and sr_gen_releaserc_json and git_push_setup are functions defined in the $ASSETS_DIR/ file:
  • The sr_gen_releaserc_json function generates the .releaserc.json file using the tmpl command.
  • The git_push_setup function configures git to allow pushing changes to the repository with the semantic-release command, optionally signing them with an SSH key.

The sr_gen_releaserc_json function
The code for the sr_gen_releaserc_json function is the following:
sr_gen_releaserc_json() {
  # Use nodejs as default project_type
  project_type="${PROJECT_TYPE:-nodejs}"
  # REGEX to match the rc_branch name (release/#.#.#)
  rc_branch_regex='^release\/[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*$'
  # PATHS on the local ASSETS_DIR
  assets_dir="${CI_PROJECT_DIR}/${ASSETS_DIR}"
  sr_local_plugin="${assets_dir}/local-plugin.cjs"
  releaserc_tmpl="${assets_dir}/releaserc.json.tmpl"
  pipeline_values_yaml="${assets_dir}/values_${project_type}_project.yaml"
  # Runtime values file built on the fly
  pipeline_runtime_values_yaml="releaserc_values.yaml"
  # Destination PATH
  releaserc_json=".releaserc.json"
  # Create an empty pipeline_values_yaml if missing
  test -f "$pipeline_values_yaml" || : >"$pipeline_values_yaml"
  # Create the pipeline_runtime_values_yaml file
  echo "branch: ${CI_COMMIT_BRANCH}" >"$pipeline_runtime_values_yaml"
  echo "gitlab_url: ${CI_SERVER_URL}" >>"$pipeline_runtime_values_yaml"
  # Add the rc_branch name if we are on an rc_branch
  if [ "$(echo "$CI_COMMIT_BRANCH" | sed -ne "/$rc_branch_regex/p")" ]; then
    echo "rc_branch: ${CI_COMMIT_BRANCH}" >>"$pipeline_runtime_values_yaml"
  elif [ "$(echo "$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME" |
      sed -ne "/$rc_branch_regex/p")" ]; then
    echo "rc_branch: ${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME}" \
      >>"$pipeline_runtime_values_yaml"
  fi
  echo "sr_local_plugin: ${sr_local_plugin}" >>"$pipeline_runtime_values_yaml"
  # Create the releaserc_json file
  tmpl -f "$pipeline_runtime_values_yaml" -f "$pipeline_values_yaml" \
    "$releaserc_tmpl" | jq . >"$releaserc_json"
  # Remove the pipeline_runtime_values_yaml file
  rm -f "$pipeline_runtime_values_yaml"
  # Print the releaserc_json file
  print_file_collapsed "$releaserc_json"
  # --*-- BEG: NOTE --*--
  # Rename the package.json to ignore it when calling semantic release.
  # The idea is that the local-plugin renames it back on the first step of the
  # semantic-release process.
  # --*-- END: NOTE --*--
  if [ -f "package.json" ]; then
    echo "Renaming 'package.json' to 'package.json_disabled'"
    mv "package.json" "package.json_disabled"
  fi
}
Almost all the variables used in the function are defined by gitlab except the ASSETS_DIR and PROJECT_TYPE; in the complete pipelines the ASSETS_DIR is defined in a common file included by all the pipelines and the project type is defined in the .gitlab-ci.yml file of each project. If you review the code you will see that the file processed by the tmpl command is named releaserc.json.tmpl; its contents are shown here (and a sketch of a generated .releaserc.json follows the template):
  "plugins": [
     - if .sr_local_plugin  
    "  .sr_local_plugin  ",
     - end  
        "preset": "conventionalcommits",
        "releaseRules": [
            "breaking": true, "release": "major"  ,
            "revert": true, "release": "patch"  ,
            "type": "feat", "release": "minor"  ,
            "type": "fix", "release": "patch"  ,
            "type": "perf", "release": "patch"  
     - if .replacements  
        "replacements":   .replacements   toJson    
     - end  
     - if eq .branch "main"  
        "changelogFile": "", "changelogTitle": "# Changelog"  
     - end  
        "assets":   if .assets   .assets   toJson   else  []  end  ,
        "message": "ci(release): v$ nextRelease.version \n\n$ nextRelease.notes "
        "gitlabUrl": "  .gitlab_url  ", "successComment": false  
  "branches": [
      "name": "develop", "prerelease": "SNAPSHOT"  ,
     - if .rc_branch  
      "name": "  .rc_branch  ", "prerelease": "rc"  ,
     - end  
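For reference, a generated .releaserc.json produced from a template along these lines might look roughly like the following sketch; the post does not list the plugin names, so the standard @semantic-release/commit-analyzer, @semantic-release/changelog, @semantic-release/git and @semantic-release/gitlab plugins are assumed here and the replacements entry is omitted:
{
  "plugins": [
    ["@semantic-release/commit-analyzer", { "preset": "conventionalcommits" }],
    ["@semantic-release/changelog", { "changelogTitle": "# Changelog" }],
    ["@semantic-release/git", { "assets": ["package.json"] }],
    ["@semantic-release/gitlab", { "successComment": false }]
  ],
  "branches": [
    "main",
    { "name": "develop", "prerelease": "SNAPSHOT" }
  ]
}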
The values used to process the template are defined in a file built on the fly (releaserc_values.yaml) that includes the following keys and values (an example of the generated file is shown after this list):
  • branch: the name of the current branch
  • gitlab_url: the URL of the gitlab server (the value is taken from the CI_SERVER_URL variable)
  • rc_branch: the name of the current rc branch; we only set the value if we are processing one, because semantic-release only allows one branch to match the rc prefix; if we used a wildcard (i.e. release/*) and users kept more than one release/#.#.# branch open at the same time, the calls to semantic-release would fail for sure.
  • sr_local_plugin: the path to the local plugin we use (shown later)
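A runtime values file generated for a pipeline running on the develop branch could therefore look like this (the server URL and plugin path are made-up examples):
branch: develop
gitlab_url: https://gitlab.example.com
sr_local_plugin: /builds/group/project/assets/local-plugin.cjs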
The template also uses a values_${project_type}_project.yaml file that includes settings specific to the project type; the one for nodejs is as follows:
replacements:
  - files:
      - "package.json"
    from: "\"version\": \".*\""
    to: "\"version\": \"${nextRelease.version}\""
assets:
  - ""
  - "package.json"
The replacements section is used to update the version field on the relevant files of the project (in our case the package.json file) and the assets section includes the files that will be committed to the repository when the release is published (looking at the template you can see that the changelog is only updated for the main branch; we do it this way because updating the file on other branches creates a merge nightmare and we are only interested in it for released versions anyway). The local plugin adds code to rename the package.json_disabled file to package.json if present and prints the last and next versions on the logs with a format that can be easily parsed using sed (an example is shown after the plugin code):
// Minimal plugin to:
// - rename the package.json_disabled file to package.json if present
// - log the semantic-release last & next versions
function verifyConditions(pluginConfig, context) {
  var fs = require('fs');
  if (fs.existsSync('package.json_disabled')) {
    fs.renameSync('package.json_disabled', 'package.json');
    context.logger.log("verifyConditions: renamed 'package.json_disabled' to 'package.json'");
  }
}
function analyzeCommits(pluginConfig, context) {
  if (context.lastRelease && context.lastRelease.version) {
    context.logger.log(`analyzeCommits: LAST_VERSION=${context.lastRelease.version}`);
  }
}
function verifyRelease(pluginConfig, context) {
  if (context.nextRelease && context.nextRelease.version) {
    context.logger.log(`verifyRelease: NEXT_VERSION=${context.nextRelease.version}`);
  }
}
module.exports = { verifyConditions, analyzeCommits, verifyRelease };
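Because the plugin prints the versions with fixed prefixes, a later step can recover them from the captured semantic-release output with something like the following (sr.log is a hypothetical file holding that output):
LAST_VERSION="$(sed -ne 's/^.*analyzeCommits: LAST_VERSION=//p' sr.log)"
NEXT_VERSION="$(sed -ne 's/^.*verifyRelease: NEXT_VERSION=//p' sr.log)"
echo "Last version: ${LAST_VERSION:-none}, next version: ${NEXT_VERSION:-none}"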

The git_push_setup function
The code for the git_push_setup function is the following:
git_push_setup() {
  # Update global credentials to allow git clone & push for all the group repos
  git config --global credential.helper store
  cat >"$HOME/.git-credentials" <<EOF
https://fake-user:${GITLAB_REPOSITORY_TOKEN}@${CI_SERVER_HOST}
EOF
  # Define user name, mail and signing key for semantic-release
  user_name="$SR_USER_NAME"
  user_email="$SR_USER_EMAIL"
  ssh_signing_key="$SSH_SIGNING_KEY"
  # Export git user variables
  export GIT_AUTHOR_NAME="$user_name"
  export GIT_AUTHOR_EMAIL="$user_email"
  export GIT_COMMITTER_NAME="$user_name"
  export GIT_COMMITTER_EMAIL="$user_email"
  # Sign commits with ssh if there is a SSH_SIGNING_KEY variable
  if [ "$ssh_signing_key" ]; then
    echo "Configuring GIT to sign commits with SSH"
    # Keyfile created on the fly from the variable contents
    ssh_keyfile="$HOME/.ssh-signing-key"
    : >"$ssh_keyfile"
    chmod 0400 "$ssh_keyfile"
    echo "$ssh_signing_key" | tr -d '\r' >"$ssh_keyfile"
    git config gpg.format ssh
    git config user.signingkey "$ssh_keyfile"
    git config commit.gpgsign true
  fi
}
The function assumes that the GITLAB_REPOSITORY_TOKEN variable (set on the CI/CD variables section of the project or group we want) contains a token with read_repository and write_repository permissions on all the projects where we are going to use this function. The SR_USER_NAME and SR_USER_EMAIL variables can be defined in a common file or in the CI/CD variables section of the project or group we want to work with. The script assumes that the optional SSH_SIGNING_KEY is exported as a CI/CD default value of type variable (that is why the keyfile is created on the fly) and git is configured to use it if the variable is not empty.
Warning: Keep in mind that the variables GITLAB_REPOSITORY_TOKEN and SSH_SIGNING_KEY contain secrets, so it is probably a good idea to make them protected (if you do that you have to make the develop, main and release/* branches protected too).
Warning: The semantic-release user has to be able to push to all the projects on those protected branches, so it is a good idea to create a dedicated user and add it as a MAINTAINER for the projects we want (the MAINTAINERS need to be able to push to the branches), or, if you are using a Gitlab instance with a Premium license, you can use the API to allow the semantic-release user to push to the protected branches without allowing it for any other user.

The semantic-release command
Once we have the .releaserc file and the git configuration ready we run the semantic-release command. If the branch we are working with has one or more commits that will increment the version, the tool does the following (note that the steps described are the ones executed if we use the configuration we have generated; a way to preview them locally is shown after the list):
  1. It detects the commits that will increment the version and calculates the next version number.
  2. Generates the release notes for the version.
  3. Applies the replacements defined in the configuration (in our example it updates the version field in the package.json file).
  4. Updates the changelog file, adding the release notes, if we are going to publish it (when we are on the main branch).
  5. Creates a commit if all or some of the files listed on the assets key have changed and uses the commit message we have defined, replacing the variables for their current values.
  6. Creates a tag with the new version number and the release notes.
  7. As we are using the gitlab plugin, after tagging it also creates a release on the project with the tag name and the release notes.
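To preview what these steps would do for the current branch without pushing or publishing anything, the tool can be run locally in dry-run mode (assuming the configuration and the required credentials are available in the environment):
npx semantic-release --dry-run --no-ci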

Notes about the git workflows and merges between branches
It is very important to remember that semantic-release looks at the commits of a given branch when calculating the next version to publish; that has two important implications:
  1. On pre-release branches we need to have the commit that includes the tag with the released version; if we don't have it, the next version is not calculated correctly.
  2. It is a bad idea to squash commits when merging a branch into another one; if we do that we will lose the information semantic-release needs to calculate the next version, and even if we use the right prefix for the squashed commit (fix, feat, ...) we miss all the messages that would otherwise go to the changelog.
To make sure that we have the right commits on the pre-release branches we should merge the main branch changes into the develop one after each release tag is created; in my pipelines the first job that processes a release tag creates a branch from the tag and an MR to merge it to develop. The important thing about that MR is that it must not be squashed; if we do that the tag commit will probably be lost, so we need to be careful. To merge the changes directly we can run the following code:
# Set the SR_TAG variable to the tag you want to process
# Fetch all the changes
git fetch --all --prune
# Switch to the main branch
git switch main
# Pull all the changes
git pull
# Switch to the development branch
git switch develop
# Pull all the changes
git pull
# Create followup branch from tag
git switch -c "followup/$SR_TAG" "$SR_TAG"
# Change files manually & commit the changed files
git commit -a --untracked-files=no -m "ci(followup): $SR_TAG to develop"
# Switch to the development branch
git switch develop
# Merge the followup branch into the development one using the --no-ff option
git merge --no-ff "followup/$SR_TAG"
# Remove the followup branch
git branch -d "followup/$SR_TAG"
# Push the changes
git push
If we can't push directly to develop we can create an MR by pushing the followup branch after committing the changes, but we have to make sure that we don't squash the commits when merging or it will not work as we want.
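In that case, after committing the changes on the followup branch we can push it and let GitLab open the merge request for us using push options; a minimal sketch (remember to disable squashing when accepting the MR):
git push -u origin "followup/$SR_TAG" \
  -o merge_request.create \
  -o merge_request.target=develop \
  -o merge_request.remove_source_branch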

Russ Allbery: Review: The Blackwing War

Review: The Blackwing War, by K.B. Spangler
Series: Deep Witches #1
Publisher: A Girl and Her Fed Books
Copyright: March 2021
ISBN: blackwing-war
Format: Kindle
Pages: 284
The Blackwing War is the first book of a projected space opera series. I previously reviewed Stoneskin, which was intended as a prelude to this series. In theory you can start here, but I would read Stoneskin first. Tembi is a Witch, which means she can ask the Deep to do things for her. At the start of the book, those things mostly involve disarming bombs. The galaxy is in the middle of a genocidal war between the well-equipped and all-but-officially supported Sagittarius Armed Forces, also known as the Blackwings, and the Sabenta resistance movement. To settle the galaxy, humans fiddled with their genes to adapt themselves to otherwise-hostile planets. The Blackwings take exception, in the tradition of racist humans throughout history, and think it's time to purify human bloodlines again. Both sides are using bombs. The Deep is the brilliant idea of this series. It seems to exist everywhere simultaneously, it's alive, it adores teleporting things, and it's basically a giant cosmic puppy. Humans are nearly incomprehensible to the Deep, and it's nearly incomprehensible to humans, but it somehow picks out specific humans who can (sort of) understand it and whom it gets attached to and somehow makes immortal. These are the Witches, and they have turned the Deep into the logistical backbone of human civilization. Essentially all commerce and travel is now done through Deep teleportation, requested by a Witch and coordinated by Lancaster, the Witches' governing council. The exception is war. Lancaster is strictly neutral; it does not take sides, even in the face of an ongoing genocide, and it refuses to transport military ships, any type of weapons, or even war refugees. Domino, Lancaster's cynically manipulative leader, is determined to protect its special privileges and position at all costs. Tembi is one of the quasi-leaders of a resistance against that position, but even they are reluctant to ask the Deep to take sides in a war. To them, the Deep is a living magical creature that they are exploiting, and which also tends to be a bundle of nerves. Using it as a weapon feels like a step too far. That's how the situation lies at the start of this book when, after a successful bomb defusing, the Deep whisks Tembi away to watch an unknown weapon blow up a moon. A lot of this book consists of Tembi unraveling a couple of mysteries, starting with the apparent experimental bomb and then expanding to include the apparent drugging and disappearance of her former classmate. The low-grade war gets worse throughout, leaving Tembi torn between the justifications for Lancaster's neutrality and her strong sense of basic morality. The moments when Tembi gets angry enough or impatient enough to take action are the best parts, but a lot of this book is quite grim. Do not expect all to be resolved in a happy ending. There is some catharsis, but The Blackwing War is also clearly setup for a longer series. Tembi is a great character and the Deep is even better. I thoroughly enjoyed reading about both of them, and Tembi's relationship with the Deep is a delight. Usually I get frustrated by baffling incomprehensibility as a plot device, but Spangler pulls it off as well as I've seen it done. But unfortunately, this book is firmly in the "gets worse before it will get better" part of the overall story arc, and the sequels have not yet appeared.
The Blackwing War ends on a cliffhanger that portends huge changes for the characters and the setting, and if I had the next book to rush into, I wouldn't mind the grimness as much. As is, it was a somewhat depressing reading experience despite its charms, and despite a somewhat optimistic ending (that I doubt will truly resolve anything). I think the world-building elements were a touch predictable, and I wish Spangler wouldn't have her characters keep trying to justify Domino's creepy, abusive, and manipulative actions. But the characters are so much fun, and the idea of the Deep as a character is such a delight, that I am hooked on this series regardless. Recommended, although I will (hopefully) be able to recommend it more heartily once at least one sequel has been published. Content warnings: genocide, racism, violent death. Rating: 7 out of 10

22 December 2023

Russ Allbery: Review: Wintersmith

Review: Wintersmith, by Terry Pratchett
Series: Discworld #35
Publisher: Clarion Books
Copyright: 2006
Printing: 2007
ISBN: 0-06-089033-9
Format: Mass market
Pages: 450
Wintersmith is the 35th Discworld novel and the 3rd Tiffany Aching novel. You could probably start here, since understanding the backstory isn't vital for following the plot, but I'm not sure why you would. Tiffany is now training with Miss Treason, a 113-year-old witch who is quite different in her approach from Miss Level, Tiffany's mentor in A Hat Full of Sky. Miss Level was the unassuming and constantly helpful glue that held the neighborhood together. Miss Treason is the judge; her neighbors are scared of her and proud of being scared of her, since that means they have a proper witch who can see into their heads and sort out their problems. On the surface, they're quite different; part of the story of this book is Tiffany learning to see the similarities. First, though, Miss Treason rushes Tiffany to a strange midnight Morris Dance, without any explanation. The Morris Dance usually celebrates the coming of spring and is at the center of a village party, so Tiffany is quite confused by seeing it danced on a dark and windy night in late autumn. But there is a hole in the dance where the Fool normally is, and Tiffany can't keep herself from joining it. This proves to be a mistake. That space was left for someone very different from Tiffany, and now she's entangled herself in deep magic that she doesn't understand. This is another Pratchett novel where the main storyline didn't do much for me. All the trouble stems from Miss Treason being maddeningly opaque, and while she did warn Tiffany, she did so in that way that guarantees a protagonist of a middle-grade novel will ignore it. The Wintersmith is a boring, one-note quasi-villain, and the plot mainly revolves around elemental powers being dumber than a sack of hammers. The one thing I will say about the main plot is that the magic Tiffany danced into is entangled with courtship and romance, Tiffany turns thirteen over the course of this book, and yet this is not weird and uncomfortable reading the way it would be in the hands of many other authors. Pratchett has a keen eye for the age range that he's targeting. The first awareness that there is such a thing as romance that might be relevant to oneself pairs nicely with the Wintersmith's utter confusion at how Tiffany's intrusion unbalanced his dance. This is a very specific age and experience that I think a lot of authors would shy away from, particularly with a female protagonist, and I thought Pratchett handled it adroitly. I personally found the Wintersmith's awkward courting tedious and annoying, but that's more about me than about the book. As with A Hat Full of Sky, though, everything other than the main plot was great. It is becoming obvious how much Tiffany and Granny Weatherwax have in common, and that Granny Weatherwax recognizes this and is training Tiffany herself. This is high-quality coming-of-age material, not in the traditional fantasy sense of chosen ones and map explorations, but in the sense of slowly-developing empathy and understanding of people who think differently than you do. Tiffany, like Granny Weatherwax, has very little patience with nonsense, and her irritation with stupidity is one of her best characteristics. But she's learning how to blunt it long enough to pay attention, and to understand how people she doesn't like can still be the right people for specific situations. I particularly loved how Granny carries on with a feud at the same time that Tiffany is learning to let go of one.
It's not a contradiction or hypocrisy; it's a sign that Tiffany is entitled to her judgments and feelings, but has to learn how to keep them in their place and not let them take over. One of the great things about the Tiffany Aching books is that the villages are also characters. We don't see that much of the individual people, but one of the things Tiffany is learning is how to see the interpersonal dynamics and patterns of village life. Somehow the feelings of irritation and exasperation fade once you understand people's motives and see more sides to their character. There is a lot more Nanny Ogg in this book than there has been in the last few, and that reminded me of how much I love her character. She has a completely different approach than Granny Weatherwax, but it's just as effective in different ways. She's also the perfect witch to have around when you've stumbled into a stylized love story that you don't want to be a part of, and yet find oddly fascinating. It says something about the skill of Pratchett's characterization that I could enjoy a book this much while having no interest in the main plot. The Witches have always been great characters, but somehow they're even better when seen through Tiffany's perspective. Good stuff; if you liked any of the other Tiffany Aching books, you will like this as well. Followed by Making Money in publication order. The next Tiffany Aching novel is I Shall Wear Midnight. Rating: 8 out of 10

21 December 2023

Russ Allbery: Review: The Box

Review: The Box, by Marc Levinson
Publisher: Princeton University Press
Copyright: 2006, 2008
Printing: 2008
ISBN: 0-691-13640-8
Format: Trade paperback
Pages: 278
The shipping container as we know it is only about 65 years old. Shipping things in containers is obviously much older; we've been doing that for longer than we've had ships. But the standardized metal box, set on a rail car or loaded with hundreds of its indistinguishable siblings into an enormous, specially-designed cargo ship, became economically significant only recently. Today it is one of the oft-overlooked foundations of global supply chains. The startlingly low cost of container shipping is part of why so much of what US consumers buy comes from Asia, and why most complex machinery is assembled in multiple countries from parts gathered from a dizzying variety of sources. Marc Levinson's The Box is a history of container shipping, from its (arguable) beginnings in the trailer bodies loaded on Pan-Atlantic Steamship Corporation's Ideal-X in 1956 to just-in-time international supply chains in the 2000s. It's a popular history that falls on the academic side, with a full index and 60 pages of citations and other notes. (Per my normal convention, those pages aren't included in the sidebar page count.) The Box is organized mostly chronologically, but Levinson takes extended detours into labor relations and container standardization at the appropriate points in the timeline. The book is very US-centric. Asian, European, and Australian shipping is discussed mostly in relation to trade with the US, and Africa is barely mentioned. I don't have the background to know whether this is historically correct for container shipping or is an artifact of Levinson's focus. Many single-item popular histories focus on something that involves obvious technological innovation (paint pigments) or deep cultural resonance (salt) or at least entertaining quirkiness (punctuation marks, resignation letters). Shipping containers are important but simple and boring. The least interesting chapter in The Box covers container standardization, in which a whole bunch of people had boring meetings, wrote some things down, discovered many of the things they wrote down were dumb, wrote more things down, met with different people to have more meetings, published a standard that partly reflected the fixations of that one guy who is always involved in standards discussions, and then saw that standard be promptly ignored by the major market players. You may be wondering if that describes the whole book. It doesn't, but not because of the shipping containers. The Box is interesting because the process of economic change is interesting, and container shipping is almost entirely about business processes rather than technology. Levinson starts the substance of the book with a description of shipping before standardized containers. This was the most effective, and probably the most informative, chapter. Beyond some vague ideas picked up via cultural osmosis, I had no idea how cargo shipping worked. Levinson gives the reader a memorable feel for the sheer amount of physical labor involved in loading and unloading a ship with mixed cargo (what's called "breakbulk" cargo to distinguish it from bulk cargo like coal or wheat that fills an entire hold). It's not just the effort of hauling barrels, bales, or boxes with cranes or raw muscle power, although that is significant. It's also the need to touch every piece of cargo to move it, inventory it, warehouse it, and then load it on a truck or train.
The idea of container shipping is widely attributed, including by Levinson, to Malcom McLean, a trucking magnate who became obsessed with the idea of what we now call intermodal transport: using the same container for goods on ships, railroads, and trucks so that the contents don't have to be unpacked and repacked at each transfer point. Levinson uses his career as an anchor for the story, from his acquisition of Pan-Atlantic Steamship Corporation to pursue his original idea (backed by private equity and debt, in a very modern twist), through his years running Sea-Land as the first successful major container shipper, and culminating in his disastrous attempted return to shipping by acquiring United States Lines. I am dubious of Great Man narratives in history books, and I think Levinson may be overselling McLean's role. Container shipping was an obvious idea that the industry had been talking about for decades. Even Levinson admits that, despite a few gestures at giving McLean central credit. Everyone involved in shipping understood that cargo handling was the most expensive and time-consuming part, and that if one could minimize cargo handling at the docks by loading and unloading full containers that didn't have to be opened, shipping costs would be much lower (and profits higher). The idea wasn't the hard part. McLean was the first person to pull it off at scale, thanks to some audacious economic risks and a willingness to throw sharp elbows and play politics, but it seems likely that someone else would have played that role if McLean hadn't existed. Container shipping didn't happen earlier because achieving that cost savings required a huge expenditure of capital and a major disruption of a transportation industry that wasn't interested in being disrupted. The ships had to be remodeled and eventually replaced; manufacturing had to change; railroad and trucking in theory had to change (in practice, intermodal transport, McLean's obsession, didn't happen at scale until much later); pricing had to be entirely reworked; logistical tracking of goods had to be done much differently; and significant amounts of extremely expensive equipment to load and unload heavy containers had to be designed, built, and installed. McLean's efforts proved the cost savings was real and compelling, but it still took two decades before the shipping industry reconstructed itself around containers. That interim period is where this history becomes a labor story, and that's where Levinson's biases become somewhat distracting. In the United States, loading and unloading of cargo ships was done by unionized longshoremen through a bizarre but complex and long-standing system of contract hiring. The cost savings of container shipping comes almost completely from the loss of work for longshoremen. It's a classic replacement of labor with capital; the work done by gangs of twenty or more longshoreman is instead done by a single crane operator at much higher speed and efficiency. The longshoreman unions therefore opposed containerization and launched numerous strikes and other labor actions to delay use of containers, force continued hiring that containers made unnecessary, or win buyouts and payoffs for current longshoremen. Levinson is trying to write a neutral history and occasionally shows some sympathy for longshoremen, but they still get the Luddite treatment in this book: the doomed reactionaries holding back progress.
Longshoremen had a vigorous and powerful union that won better working conditions structured in ways that look absurd to outsiders, such as requiring that ships hire twice as many men as necessary so that half of them could get paid while not working. The unions also had a reputation for corruption that Levinson stresses constantly, and theft of breakbulk cargo during loading and warehousing was common. One of the interesting selling points for containers was that lossage from theft during shipping apparently decreased dramatically. It's obvious that the surface demand of the longshoremen unions, that either containers not be used or that just as many manual laborers be hired for container shipping as for earlier breakbulk shipping, was impossible, and that the profession as it existed in the 1950s was doomed. But beneath those facts, and the smoke screen of Levinson's obvious distaste for their unions, is a real question about what society owes workers whose jobs are eliminated by major shifts in business practices. That question of fairness becomes more pointed when one realizes that this shift was massively subsidized by US federal and local governments. McLean's Sea-Land benefited from direct government funding and subsidized navy surplus ships, massive port construction in New Jersey with public funds, and a sweetheart logistics contract from the US military to supply troops fighting the Vietnam War that was so generous that the return voyage was free and every container Sea-Land picked up from Japanese ports was pure profit. The US shipping industry was heavily government-supported, particularly in the early days when the labor conflicts were starting. Levinson notes all of this, but never draws the contrast between the massive support for shipping corporations and the complete lack of formal support for longshoremen. There are hard ethical questions about what society owes displaced workers even in a pure capitalist industry transformation, and this was very far from pure capitalism. The US government bankrolled large parts of the growth of container shipping, but the only way that longshoremen could get part of that money was through strikes to force payouts from private shipping companies. There are interesting questions of social and ethical history here that would require careful disentangling of the tendency of any group to oppose disruptive change and fairness questions of who gets government support and who doesn't. They will have to wait for another book; Levinson never mentions them. There were some things about this book that annoyed me, but overall it's a solid work of popular history and deserves its fame. Levinson's account is easy to follow, specific without being tedious, and backed by voluminous notes. It's not the most compelling story on its own merits; you have to have some interest in logistics and economics to justify reading the entire saga. But it's the sort of history that gives one a sense of the fractal complexity of any area of human endeavor, and I usually find those worth reading. Recommended if you like this sort of thing. Rating: 7 out of 10

15 December 2023

Scarlett Gately Moore: KDE: Snaps, KDEneon, Debian and my future.

First I want to thank KDE for this wonderful write up on LinkedIn. It made my heart explode with happiness as I prepped for my interview on Monday. I didn't get the job (just the usual "we were very impressed with your experience, but we went with another candidate"). I think the cosmos is determined for me to hold out for the project; even though it is only part time work, it is work, and if I have learned nothing else this year, I have learned how to live on a very tight budget! It turns out many of the things we think we need, we don't. So with the hard work of Kevin Ottens (thank you!!!!), this should be finalized after the first of the year. This plan also allows me time for KDEneon and Debian work. I am happy with it and look forward to the coming year and the exciting things to come. My holiday plans are to help the Debian KDE team with the KF6 packaging, on-going KDEneon efforts, and continue to make sure the snaps Qt6 transition is as painless as possible. I will also be working on the qt6 kde-neon extension. In closing, despite my terrible luck with job-hunting, I am in an amazing community and I am truly grateful to each and every one of you. It has been a great year and I have added many new things to my skillset and I look forward to many more next year. As usual, it is that time of the month where I have not raised enough to pay my internet bill (the phone company taking an extra 200.00 didn't help). If you can spare any change (any amount helps!) please consider a donation. Thank you! I hope everyone has a wonderful <insert your holiday here>!!!!! ~ Scarlett