Search Results: "matt"

11 February 2025

Bálint Réczey: Supercharge Your Installs with apt-eatmydata: Because Who Needs Crash Safety Anyway?

APT eatmydata super cow powers
Tired of waiting for apt to finish installing packages? Wish there were a way to make your installations blazingly fast without caring about minor things like, oh, data integrity? Well, today is your lucky day!
I'm thrilled to introduce apt-eatmydata, now available for Debian and all supported Ubuntu releases!

What Is apt-eatmydata? If you've ever used libeatmydata, you know it's a nifty little hack that disables fsync() and friends, making package installations way faster by skipping unnecessary disk writes. Normally, you'd have to remember to wrap apt commands manually, like this:
eatmydata apt install texlive-full
But who has time for that? apt-eatmydata takes care of this automagically by integrating eatmydata seamlessly into apt itself! That means every package install is now turbocharged: no extra typing required.
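Under the hood, the trick is simply to make apt invoke dpkg through eatmydata. As a rough, hand-rolled sketch of the idea only (the actual apt-eatmydata package may use different file names and a different mechanism), you could point apt's Dir::Bin::dpkg setting at a tiny wrapper yourself:
cat > /usr/local/bin/dpkg-eatmydata <<'EOF'
#!/bin/sh
# illustrative wrapper: run dpkg under eatmydata
exec eatmydata /usr/bin/dpkg "$@"
EOF
chmod +x /usr/local/bin/dpkg-eatmydata
echo 'Dir::Bin::dpkg "/usr/local/bin/dpkg-eatmydata";' > /etc/apt/apt.conf.d/99-eatmydata-sketch
The packaged version saves you from maintaining such glue by hand.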

How to Get It

Debian If you're on Debian unstable/testing (or possibly soon in stable-backports), you can install it directly with:
sudo apt install apt-eatmydata

Ubuntu Ubuntu users already enjoy faster package installation thanks to zstd-compressed packages, and to shift into an even higher gear I've backported apt-eatmydata to all supported Ubuntu releases. Just add this PPA and install:
sudo add-apt-repository ppa:firebuild/apt-eatmydata
sudo apt install apt-eatmydata
And boom! Your apt install times get a serious upgrade. Let's run some tests:
# pre-download package to measure only the installation
$ sudo apt install -d linux-headers-6.8.0-53-lowlatency
...
# installation time is 9.35s without apt-eatmydata:
$ sudo time apt install linux-headers-6.8.0-53-lowlatency
...
2.30user 2.12system 0:09.35elapsed 47%CPU (0avgtext+0avgdata 174680maxresident)k
32inputs+1495216outputs (0major+196945minor)pagefaults 0swaps
$ sudo apt install apt-eatmydata
...
$ sudo apt purge linux-headers-6.8.0-53-lowlatency
# installation time is 3.17s with apt-eatmydata:
$ sudo time eatmydata apt install linux-headers-6.8.0-53-lowlatency
2.30user 0.88system 0:03.17elapsed 100%CPU (0avgtext+0avgdata 174692maxresident)k
0inputs+205664outputs (0major+198099minor)pagefaults 0swaps
apt-eatmydata just made installing Linux headers 3x faster!

But Wait, There's More!  If you're automating CI builds, there's even a GitHub Action to make your workflows faster, essentially doing what apt-eatmydata does and setting it up in less than a second! Check it out here:
 GitHub Marketplace: apt-eatmydata
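If the Action is not an option in your environment, a rough equivalent in any Debian- or Ubuntu-based CI job is just installing eatmydata and prefixing the heavy apt calls with it (package names as in the examples above):
sudo apt-get update
sudo apt-get install -y eatmydata
sudo eatmydata apt-get install -y texlive-full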

Should You Use It?  Warning: apt-eatmydata is not for all production environments. If your system crashes mid-install, you might end up with a broken package database. But for throwaway VMs, containers, and CI pipelines? It's an absolute game-changer. I use it on my laptop, too. So go forth and install recklessly fast!  If you run into any issues, feel free to file a bug or drop a comment. Happy hacking! (To accelerate your CI pipeline or local builds, check out Firebuild, which speeds up the builds, too!)

Freexian Collaborators: Debian Contributions: Python 3.13 as the default Python 3 version, Fixing qtpaths6 for cross compilation, sbuild support for Salsa CI, Rails 7 transition, DebConf preparations and more! (by Anupa Ann Joseph)

Debian Contributions: 2025-01 Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Python 3.13 is now the default Python 3 version in Debian, by Stefano Rivera and Colin Watson The Python 3.13-as-default transition has now completed. The next step is to remove Python 3.12 from the archive, which should be very straightforward; it just requires rebuilding C extension packages in no particular order. Stefano fixed some miscellaneous bugs blocking the completion of the 3.13-as-default transition.

Fixing qtpaths6 for cross compilation, by Helmut Grohne While Qt5 used to use qmake to query installation properties, Qt6 is moving more and more to CMake and, to ease that transition, it relies more on qtpaths. Since this tool is not naturally aware of the architecture it is called for, it tends to produce results for the build architecture. Therefore, more than 100 packages were picking up a multiarch directory for the build architecture during cross builds. In collaboration with the Qt/KDE team, and Sandro Knauß in particular (none affiliated with Freexian), we added an architecture-specific wrapper script in the same way qmake already has one for Qt5 and Qt6. The relevant CMake module has been updated to prefer the triplet-prefixed wrapper. As a result, most of the KDE packages now cross build on unstable, ready in time for the trixie release.

/usr-move, by Helmut Grohne In December, Emil Södergren reported that a live-build was not working for him, and in January, Colin Watson reported that the proposed mitigation for debian-installer-utils would practically fail. Both failures were attributable to a wrong understanding of implementation-defined behavior in dpkg-divert. As a result, all M18 mitigations had to be reviewed and many of them replaced. Many have been uploaded already and all instances have received updated patches. Even though dumat has been in operation for more than a year, it has seen recent changes. For one thing, analysis of architectures other than amd64 was requested. Chris Hofstaedler (not affiliated with Freexian) kindly provided computing resources for repeatedly running it on the larger set. Doing so revealed various cross-architecture undeclared file conflicts in gcc, glibc, and binutils-z80, but it also revealed a previously unknown /usr-move issue in rpi.rpi-common. On top of that, dumat produced false-positive diagnostics and wrongly associated Debian bugs in some cases, both of which have now been fixed. As a result, a supposedly fixed python3-sepolicy issue had to be reopened.

rebootstrap, by Helmut Grohne As much as we think of our base system as stable, it is changing a lot, and the architecture cross bootstrap tooling is very sensitive to such changes, requiring permanent maintenance. A problem that recently surfaced was that building a binutils cross toolchain would result in a binutils-for-host package that would not be practically installable, as it would depend on a binutils-common package that was not built. This turned into an examination of binutils-common and the realization that it actually differed across architectures even though it should not. Johannes Schauer Marin Rodrigues (not affiliated with Freexian) and Colin Watson kindly helped brainstorm possible solutions. Eventually, Helmut provided a patch to move the gprofng bits out of binutils-common. Independently, Matthias Klose (not affiliated with Freexian) split out binutils-gold into a separate source package. As a result, binutils-common is now identical across architectures and can be marked Multi-Arch: foreign, resolving the initial problem.

Salsa CI, by Santiago Ruano Rincón Santiago continued the work on sbuild support for Salsa CI that was mentioned in the previous month's report. The !568 merge request that created the new build image was merged, making it easier to test !569 with external projects. Santiago used a fork of the debusine repo to try the draft !569, and some issues were spotted and partly fixed. This is the last debusine pipeline run with the current !569: https://salsa.debian.org/santiago/debusine/-/pipelines/794233. One of the latest improvements relates to how to let projects customize the pipeline, in a way equivalent to what they currently do in the extract-source and build jobs. While this is work in progress, the results are rather promising. Next steps include deciding on introducing schroot support for bookworm, bookworm-security, and older releases, as is done on the official Debian buildds.

DebConf preparations, by Stefano Rivera and Santiago Ruano Rincón DebConf will be happening in Brest, France, in July. Santiago continued the DebConf 25 organization work, looking for catering providers. Both Stefano and Santiago have been reaching out to some potential sponsors. DebConf depends on sponsors to cover the organization cost; if your company depends on Debian, please consider sponsoring DebConf. Stefano has been winding up some of the finances from previous DebConfs, finalizing reimbursements to team members from DebConf 23 and handling some outstanding issues from DebConf 24. Stefano and the rest of the DebConf committee have been reviewing bids for DebConf 25, to select the next venue.

Ruby 3.3 is now the default Ruby interpreter, by Lucas Kanashiro Ruby 3.3 is about to become the default Ruby interpreter for Trixie. Many bugs were fixed by Lucas and the Debian Ruby team during the sprint held in Paris on Jan 27-31. The next step is to remove support for Ruby 3.1, which is the alternative Ruby interpreter for now. Thanks to the Debian Release team for all the support, especially Emilio Pozuelo Monfort.

Rails 7 transition, by Lucas Kanashiro Rails 6 has been shipped by Debian since Bullseye, and, as with any web framework, many issues (especially security-related ones) have been encountered, making it harder and harder to maintain. With that in mind, during the Debian Ruby team sprint last month, the transition to Rack 3 (an important dependency of Rails containing many breaking changes) was started in Debian unstable and is ongoing. Once it is done, the Rails 7 transition will take place, and Rails 7 should be shipped in Debian Trixie.

Miscellaneous contributions
  • Stefano improved a poor ImportError for users of the turtle module on Python 3 who haven't installed the python3-tk package.
  • Stefano updated several packages to new upstream releases.
  • Stefano added the Python extension to the re2 package, allowing for the use of the Google RE2 regular expression library as a direct replacement for the standard library re module.
  • Stefano started provisioning a new physical server for the debian.social infrastructure.
  • Carles improved simplemonitor (documentation on systemd integration, worked with upstream for fixing a bug).
  • Carles upgraded packages to new upstream versions: python-ring-doorbell and python-asyncclick.
  • Carles did po-debconf translations to Catalan: reviewed 44 packages and submitted translations to 90 packages (via salsa merge requests or bugtracker bugs).
  • Carles maintained po-debconf-manager with small fixes.
  • Raphaël worked on some outstanding DEP-14 merge requests and participated in the associated discussion. The discussions have been more contentious than anticipated, somewhat exacerbated by Otto's desire to conclude quickly while the required tool support is not yet there.
  • Raphaël, with the help of Philipp Kern from the DSA team, upgraded tracker.debian.org to use Django 4.2 (from bookworm-backports), which in turn enabled him to configure authentication via salsa.debian.org. It's now possible to log in to tracker.debian.org with your Salsa credentials!
  • Raphaël updated zim, a nice desktop wiki that is very handy for organizing your day-to-day digital life, to the latest upstream version (0.76).
  • Helmut sent patches for 10 cross build failures.
  • Helmut continued working on a tool for memory-based concurrency limit of builds.
  • Helmut NMUed libtool, opensysusers and virtualbox.
  • Enrico tried to support Helmut in working out tricky usrmerge situations.
  • Thorsten Alteholz uploaded a new upstream version of brlaser.
  • Colin Watson upgraded 33 Python packages to new upstream versions, including fixes for CVE-2024-42353, CVE-2024-47532, and CVE-2025-22153.
  • Emilio Pozuelo managed various transitions, and fixed various RC bugs (telepathy-glib, xorg, xserver-xorg-video-vesa, apitrace, mesa).
  • Anupa attended the monthly meeting of the Debian publicity team and shared the social media stats.
  • Anupa assisted Jean-Pierre Giraud in the point release announcement for Debian 12.9 and published the Micronews.
  • Anupa took part in multiple Debian publicity team discussions regarding our presence in social media platforms.

9 February 2025

Antoine Beaupré: A slow blogging year

Well, 2024 will be remembered, won't it? I guess 2025 already wants to make its mark too, but let's not worry about that right now, and instead let's talk about me. A little over a year ago, I was gloating over how I had such a great blogging year in 2022, and was considering 2023 to be average, then went on to gather more stats and traffic analysis... Then I said, and I quote:
I hope to write more next year. I've been thinking about a few posts I could write for work, about how things work behind the scenes at Tor, that could be informative for many people. We run a rather old setup, but things hold up pretty well for what we throw at it, and it's worth sharing that with the world...
What a load of bollocks.

A bad year for this blog 2024 was the second worst year ever in my blogging history, tied with 2009 at a measly 6 posts for the year:
anarcat@angela:anarc.at$ curl -sSL https://anarc.at/blog/ | grep 'href="\./' | grep -o 20[0-9][0-9] | sort | uniq -c | sort -nr | grep -v 2025 | tail -3
      6 2024
      6 2009
      3 2014
I did write about my work though, detailing the migration from Gitolite to GitLab we completed that year. But after August, total radio silence until now.

Loads of drafts It's not that I have nothing to say: I have no less than five drafts in my working tree here, not counting three actual drafts recorded in the Git repository here:
anarcat@angela:anarc.at$ git s blog
## main...origin/main
?? blog/bell-bot.md
?? blog/fish.md
?? blog/kensington.md
?? blog/nixos.md
?? blog/tmux.md
anarcat@angela:anarc.at$ git grep -l '\!tag draft'
blog/mobile-massive-gallery.md
blog/on-dying.mdwn
blog/secrets-recovery.md
I just don't have time to wrap those things up. I think part of me is disgusted by seeing my work stolen by large corporations to build proprietary large language models while my idols have been pushed to suicide for trying to share science with the world. Another part of me wants to make those things just right. The "tagged drafts" above are nothing more than a huge pile of chaotic links, far from being useful for anyone other than me, and even then. The on-dying article, in particular, is becoming my nemesis. I've been wanting to write that article for over 6 years now, I think. It's just too hard.

Writing elsewhere There's also the fact that I write for work already. A lot. Here are the top-10 contributors to our team's wiki:
anarcat@angela:help.torproject.org$ git shortlog --numbered --summary --group="format:%al" | head -10
  4272  anarcat
   423  jerome
   117  zen
   116  lelutin
   104  peter
    58  kez
    45  irl
    43  hiro
    18  gaba
    17  groente
... but that's a bit unfair, since I've been there half a decade. Here's the last year:
anarcat@angela:help.torproject.org$ git shortlog --since=2024-01-01 --numbered --summary --group="format:%al" | head -10
   827  anarcat
   117  zen
   116  lelutin
    91  jerome
    17  groente
    10  gaba
     8  micah
     7  kez
     5  jnewsome
     4  stephen.swift
So I still write the most commits! But to truly get a sense of the amount I wrote in there, we should count actual changes. Here it is by number of lines (from commandlinefu.com):
anarcat@angela:help.torproject.org$ git ls-files | xargs -n1 git blame --line-porcelain | sed -n 's/^author //p' | sort -f | uniq -ic | sort -nr | head -10
  99046 Antoine Beaupré
   6900 Zen Fu
   4784 Jérôme Charaoui
   1446 Gabriel Filion
   1146 Jerome Charaoui
    837 groente
    705 kez
    569 Gaba
    381 Matt Traudt
    237 Stephen Swift
That, of course, is the entire history of the git repo, again. We should take only the last year into account, and probably ignore the tails directory, as sneaky Zen Fu imported the entire docs from another wiki there...
anarcat@angela:help.torproject.org$ find [d-s]* -type f -mtime -365 | xargs -n1 git blame --line-porcelain 2>/dev/null | sed -n 's/^author //p' | sort -f | uniq -ic | sort -nr | head -10
  75037 Antoine Beaupré
   2932 Jérôme Charaoui
   1442 Gabriel Filion
   1400 Zen Fu
    929 Jerome Charaoui
    837 groente
    702 kez
    569 Gaba
    381 Matt Traudt
    237 Stephen Swift
Pretty good! 75k lines. But those are the files that were modified in the last year. If we go a little more nuts, we find that:
anarcat@angela:help.torproject.org$ git-count-words-range.py | sort -k6 -nr | head -10
parsing commits for words changes from command: git log '--since=1 year ago' '--format=%H %al'
anarcat 126116 - 36932 = 89184
zen 31774 - 5749 = 26025
groente 9732 - 607 = 9125
lelutin 10768 - 2578 = 8190
jerome 6236 - 2586 = 3650
gaba 3164 - 491 = 2673
stephen.swift 2443 - 673 = 1770
kez 1034 - 74 = 960
micah 772 - 250 = 522
weasel 410 - 0 = 410
I wrote 126,116 words in that wiki, only in the last year. I also deleted 37k words, so the final total is more like 89k words, but still: that's about forty (40!) articles of the average size (~2k) I wrote in 2022. (And yes, I did go nuts and write a new log parser, essentially from scratch, to figure out those word diffs. I did get the courage only after asking GPT-4o for an example first, I must admit.) Let's celebrate that again: I wrote 90 thousand words in that wiki in 2024. According to Wikipedia, a "novella" is 17,500 to 40,000 words, which would mean I wrote about a novella and a novel in the past year. But interestingly, if I look at the repository analytics, I certainly didn't write that much more in the past year. So that alone cannot explain the lull in my production here.

Arguments Another part of me is just tired of the bickering and arguing on the internet. I have at least two articles in there that I suspect are going to get me a lot of push-back (NixOS and Fish). I know how to deal with this: you need to write well, consider the controversy, spell it out, and defuse things before they happen. But that's hard work and, frankly, I don't really care that much about what people think anymore. I'm not writing here to convince people. I stopped evangelizing a long time ago. Now, I'm more into documenting, and teaching. And, while teaching, there's a two-way interaction: when you give out a speech or workshop, people can ask questions, or respond, and you all learn something. When you document, you quickly get told "where is this? I couldn't find it" or "I don't understand this" or "I tried that and it didn't work" or "wait, really? shouldn't we do X instead", and you learn. Here, it's static. It's my little soapbox where I scream in the void. The only thing people can do is scream back.

Collaboration So. Let's see if we can work together here. If you don't like something I say, disagree, or find something wrong or to be improved, instead of screaming on social media or ignoring me, try contributing back. This site here is backed by a git repository and I promise to read everything you send there, whether it is an issue or a merge request. I will, of course, still read comments sent by email or IRC or social media, but please, be kind. You can also, of course, follow the latest changes on the TPA wiki. If you want to catch up with the last year, some of the "novellas" I wrote include: (Well, no, you can't actually follow changes on a GitLab wiki. But we have a wiki-replica git repository where you can see the latest commits, and subscribe to the RSS feed.) See you there!

Antoine Beaupré: Qalculate hacks

This is going to be a controversial statement because some people are absolute nerds about this, but I need to say it. Qalculate is the best calculator that has ever been made. I am not going to try to convince you of this; I just wanted to put my bias out there before writing down those notes. I am a total fan. This page will collect my notes of cool hacks I do with Qalculate. Most examples are copy-pasted from the command-line interface (qalc(1)), but I typically use the graphical interface as it's slightly better at displaying complex formulas. Discoverability is obviously also better for the cornucopia of features this fantastic application ships.

Qalc commandline primer On Debian, Qalculate's CLI interface can be installed with:
apt install qalc
Then you start it with the qalc command, and end up on a prompt:
anarcat@angela:~$ qalc
> 
Then it's a normal calculator:
anarcat@angela:~$ qalc
> 1+1
  1 + 1 = 2
> 1/7
  1 / 7 ≈ 0.1429
> pi
  pi ≈ 3.142
> 
There's a bunch of variables to control display, approximation, and so on:
> set precision 6
> 1/7
  1 / 7 ≈ 0.142857
> set precision 20
> pi
  pi ≈ 3.1415926535897932385
When I need more, I typically browse around the menus. One big issue I have with Qalculate is that there are a lot of menus and features. I had to fiddle quite a bit to figure out that set precision command above. I might add more examples here as I find them.

Bandwidth estimates I often use the data units to estimate bandwidths. For example, here's what 1 megabit per second is over a month ("about 300 GiB"):
> 1 megabit/s * 30 day to gibibyte 
  (1 megabit/second) × (30 days) ≈ 301.7 GiB
Or, "how long will it take to download X", in this case, 1GiB over a 100 mbps link:
> 1GiB/(100 megabit/s)
  (1 gibibyte) / (100 megabits/second) ≈ 1 min + 25.90 s

Password entropy To calculate how much entropy (in bits) a given password structure has, you count the number of possibilities for each entry (say, [a-z] is 26 possibilities, "one word in an 8k dictionary" is 8000), take the base-2 logarithm, and multiply by the number of entries. For example, an alphabetic 14-character password is:
> log2(26*2)*14
  log₂(26 × 2) × 14 ≈ 79.81
... 80 bits of entropy. To get the equivalent in a Diceware password with an 8000-word dictionary, you would need:
> log2(8k)*x = 80
  (log₂(8 × 1000) × x) = 80
  x ≈ 6.170
... about 6 words, which gives you:
> log2(8k)*6
  log₂(8 × 1000) × 6 ≈ 77.79
78 bits of entropy.

Exchange rates You can convert between currencies!
> 1 EUR to USD
  1 EUR ≈ 1.038 USD
Even fake ones!
> 1 BTC to USD
  1 BTC ≈ 96712 USD
This relies on a database pulled from the internet (typically the European Central Bank rates; see the source). It will prompt you if it's too old:
It has been 256 days since the exchange rates last were updated.
Do you wish to update the exchange rates now? y
As a reader pointed out, you can set the refresh rate for currencies, as some countries will require way more frequent exchange rates. The graphical version has a little graphical indicator that, when you mouse over, tells you where the rate comes from.

Other conversions Here are other neat conversions extracted from my history
> teaspoon to ml
  teaspoon = 5 mL
> tablespoon to ml
  tablespoon = 15 mL
> 1 cup to ml 
  1 cup ≈ 236.6 mL
> 6 L/100km to mpg
  (6 liters) / (100 kilometers) ≈ 39.20 mpg
> 100 kph to mph
  100 kph ≈ 62.14 mph
> (108km - 72km) / 110km/h
  ((108 kilometers) − (72 kilometers)) / (110 kilometers/hour) ≈
  19 min + 38.18 s

Completion time estimates This is a more involved example I often do.

Background Say you have started a long running copy job and you don't have the luxury of having a pipe you can insert pv(1) into to get a nice progress bar. For example, rsync or cp -R can have that problem (but not tar!). (Yes, you can use --info=progress2 in rsync, but that estimate is incremental and therefore inaccurate unless you disable the incremental mode with --no-inc-recursive, but then you pay a huge up-front wait cost while the entire directory gets crawled.)

Extracting a process start time First step is to gather data. Find the process start time. If you were unfortunate enough to forget to run date --iso-8601=seconds before starting, you can get a similar timestamp with stat(1) on the process tree in /proc with:
$ stat /proc/11232
  File: /proc/11232
  Size: 0               Blocks: 0          IO Block: 1024   directory
Device: 0,21    Inode: 57021       Links: 9
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2025-02-07 15:50:25.287220819 -0500
Modify: 2025-02-07 15:50:25.287220819 -0500
Change: 2025-02-07 15:50:25.287220819 -0500
 Birth: -
So our start time is 2025-02-07 15:50:25; we shave off the nanoseconds there, as they're below our precision noise floor. If you're not dealing with an actual UNIX process, you need to figure out a start time: this can be a SQL query, a network request, whatever; exercise for the reader.
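If you prefer ps(1) over stat(1), the lstart format field reports the same start time (the output shown here is reconstructed from the stat output above, for the same process):
$ ps -o lstart= -p 11232
Fri Feb  7 15:50:25 2025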

Saving a variable This is optional, but for the sake of demonstration, let's save this as a variable:
> start="2025-02-07 15:50:25"
  save("2025-02-07T15:50:25"; start; Temporary; ; 1) =
  "2025-02-07T15:50:25"

Estimating data size Next, estimate your data size. That will vary wildly with the job you're running: this can be anything: number of files, documents being processed, rows to be destroyed in a database, whatever. In this case, rsync tells me how many bytes it has transferred so far:
# rsync -ASHaXx --info=progress2 /srv/ /srv-zfs/
2.968.252.503.968  94%    7,63MB/s    6:04:58  xfr#464440, ir-chk=1000/982266) 
Strip off the weird dots in there, because that will confuse qalculate, which will count this as:
  2.968252503968 bytes ≈ 2.968 B
Or, essentially, three bytes. We actually transferred almost 3TB here:
  2968252503968 bytes ≈ 2.968 TB
So let's use that. If you had the misfortune of making rsync silent, but were lucky enough to transfer entire partitions, you can use df (without -h! we want to be more precise here), in my case:
Filesystem              1K-blocks       Used  Available Use% Mounted on
/dev/mapper/vg_hdd-srv 7512681384 7258298036  179205040  98% /srv
tank/srv               7667173248 2870444032 4796729216  38% /srv-zfs
(Otherwise, of course, you use du -sh $DIRECTORY.)

Digression over bytes Those are 1 K bytes, which are actually (and rather unfortunately) Ki, or "kibibytes" (1024 bytes), not "kilobytes" (1000 bytes). Ugh.
> 2870444032 KiB
  2870444032 kibibytes ≈ 2.939 TB
> 2870444032 kB
  2870444032 kilobytes ≈ 2.870 TB
At this scale, those details matter quite a bit, we're talking about a 69GB (64GiB) difference here:
> 2870444032 KiB - 2870444032 kB
  (2870444032 kibibytes) − (2870444032 kilobytes) ≈ 68.89 GB
Anyways. Let's take 2968252503968 bytes as our current progress. Our entire dataset is 7258298064 KiB, as seen above.

Solving a cross-multiplication We have three out of four variables for our equation here, so we can already solve:
> (now-start)/x = (2996538438607 bytes)/(7258298064 KiB) to h
  ((actual − start) / x) = ((2996538438607 bytes) / (7258298064
  kibibytes))
  x ≈ 59.24 h
The entire transfer will take about 60 hours to complete! Note that's not the time left, that is the total time. To break this down step by step, we could calculate how long it has taken so far:
> now-start
  now − start ≈ 23 h + 53 min + 6.762 s
> now-start to s
  now − start ≈ 85987 s
... and do the cross-multiplication manually, it's basically:
x/(now-start) = (total/current)
so:
x = (total/current) * (now-start)
or, in Qalc:
> ((7258298064  kibibytes) / ( 2996538438607 bytes) ) *  85987 s
  ((7258298064 kibibytes) / (2996538438607 bytes)) × (85987 secondes) ≈
  2 d + 11 h + 14 min + 38.81 s
It's interesting it gives us different units here! Not sure why.

Now and built-in variables The now here is actually a built-in variable:
> now
  now ≈ "2025-02-08T22:25:25"
There is a bewildering list of such variables, for example:
> uptime
  uptime = 5 d + 6 h + 34 min + 12.11 s
> golden
  golden ≈ 1.618
> exact
  golden = (√(5) + 1) / 2

Computing dates In any case, yay! We know the transfer is going to take roughly 60 hours total, and we've already spent around 24h of that, so, we have 36h left. But I did that all in my head, we can ask more of Qalc yet! Let's make another variable, for that total estimated time:
> total=(now-start)/x = (2996538438607 bytes)/(7258298064 KiB)
  save(((now − start) / x) = ((2996538438607 bytes) / (7258298064
  kibibytes)); total; Temporary; ; 1) ≈
  2 d + 11 h + 14 min + 38.22 s
And we can plug that into another formula with our start time to figure out when we'll be done!
> start+total
  start + total ≈ "2025-02-10T03:28:52"
> start+total-now
  start + total − now ≈ 1 d + 11 h + 34 min + 48.52 s
> start+total-now to h
  start + total − now ≈ 35 h + 34 min + 32.01 s
That transfer has ~1d left, or 35h24m32s, and should complete around 4 in the morning on February 10th. But that's icing on top. I typically only do the cross-multiplication and calculate the remaining time in my head. I mostly did the last bit to show Qalculate could compute dates and time differences, as long as you use ISO timestamps. Although it can also convert to and from UNIX timestamps, it cannot parse arbitrary date strings (yet?).

Other functionality Qalculate can:
  • Plot graphs;
  • Use RPN input;
  • Do all sorts of algebraic, calculus, matrix, statistics, trigonometry functions (and more!);
  • ... and so much more!
I have a hard time finding things it cannot do. When I get there, I typically need to resort to writing code in Python or using a spreadsheet, and others will turn to more complete engines like Maple, Mathematica or R. But for daily use, Qalculate is just fantastic. And it's pink! Use it!

Further reading and installation This is just scratching the surface, the fine manual has more information, including more examples. There is also of course a qalc(1) manual page which also ships an excellent EXAMPLES section. Qalculate is packaged for over 30 Linux distributions, but also ships packages for Windows and macOS. There are third-party derivatives as well including a web version and an Android app.

2 February 2025

Bits from Debian: Bits from the DPL

Dear Debian community, this is bits from DPL for January.

Sovereign Tech Agency I was recently pointed to Technologies and Projects supported by the Sovereign Tech Agency, which is financed by the German Federal Ministry for Economic Affairs and Climate Action. It is a subsidiary of the Federal Agency for Disruptive Innovation, SPRIND GmbH. It is worth sending applications there for distinct projects, as that is their preferred method of funding. Distinguished developers can also apply for a fellowship position that pays up to 40hrs / week (32hrs when freelancing) for a year. This is especially open to maintainers of larger numbers of packages in Debian (or any other Linux distribution). There might be a chance that some of the Debian-related projects submitted to the Google Summer of Code that did not get funded could be retried with those foundations. As per the FAQ of the project: "The Sovereign Tech Agency focuses on securing and strengthening open and foundational digital technologies. These communities working on these are distributed all around the world, so we work with people, companies, and FOSS communities everywhere." Similar funding organizations include the Open Technology Fund and FLOSS/fund. If you have a Debian-related project that fits these funding programs, they might be interesting options. This list is by no means exhaustive, just some hints I've received and wanted to share. More suggestions for such opportunities are welcome.

Year of code reviews On the debian-devel mailing list, there was a long thread titled "Let's make 2025 a year when code reviews became common in Debian". It initially suggested something along the lines of: "Let's review MRs in Salsa." The discussion quickly expanded to include patches that have been sitting in the BTS for years, which deserve at least the same attention. One idea I'd like to emphasize is that associating BTS bugs with MRs could be very convenient. It's not only helpful for documentation but also the easiest way to apply patches. I'd like to emphasize that no matter what workflow we use (BTS, MRs, or a mix), it is crucial to uphold Debian's reputation for high quality. However, this reputation is at risk as more and more old issues accumulate. While Debian is known for its technical excellence, long-standing bugs and orphaned packages remain a challenge. If we don't address these, we risk weakening the high standards that Debian is valued for. Revisiting old issues and ensuring that unmaintained packages receive attention is especially important as we prepare for the Trixie release.

Debian Publicity Team will no longer post on X/Twitter The Press Team has my full support in its decision to stop posting on X. As per the Publicity delegation: the team once decided to join Twitter, but circumstances have since changed. The current Press delegates have the institutional authority to leave X, just as their predecessors had the authority to join. I appreciate that the team carefully considered the matter, reinforced by the arguments developed on the debian-publicity list, and communicated its reasoning openly. Kind regards, Andreas.

29 January 2025

Keith Packard: picolibc-i18n

Internationalization support in Picolibc There are two major internationalization APIs in the C library: locales and iconv. Iconv is an isolated component which only performs charset conversion in ways that don't interact with anything else in the library. Locales affect pretty much every API that deals with strings and cover charset conversion along with a huge range of localized information, from character classification to formatting of time, money, people's names, addresses and even standard paper sizes. Picolibc inherits its implementation of both of these from newlib. Given that embedded applications rarely need advanced functionality from either of these APIs, I hadn't spent much time exploring this space.

Newlib locale code When run on Cygwin, Newlib's locale support is quite complete as it leverages the underlying Windows locale support. Without Windows support, everything aside from charset conversion and character classification data is stubbed out at the bottom of the stack. Because the implementation can support full locale functionality, the implementation is designed for that, with large data structures and lots of code. Charset conversion and character classification data for locales is all built in; none of that can be loaded at runtime. There is support for all of the ISO-8859 charsets, three JIS variants, a bunch of Windows code pages and a few other single-byte encodings. One oddity in this code is that when using a JIS locale, wide characters are stored in EUC-JP rather than Unicode. Every other locale uses Unicode. This means APIs like wctype are implemented by mapping the JIS-encoded character to Unicode and then using the underlying Unicode character classification tables. One consequence of this is that there isn't any Unicode to JIS mapping provided, as it isn't necessary. When testing the charset conversion and Unicode character classification data, I found numerous minor errors and a couple of pretty significant ones. The JIS conversion code had the most serious issue I found; most of the conversions are in a 2d array which is manually indexed with the wrong value for the length of each row. This led to nearly every translated value being incorrect. The charset conversion tables and Unicode classification data are now generated using Python charset support and the standard Unicode data files. In addition, tests have been added which compare Picolibc to the system C library for every supported charset.

Newlib iconv code The iconv charset support is completely separate from the locale charset support, with a much wider range of supported targets. It also supports loading charset data from files at runtime, which reduces the size of application images. Because the iconv and locale implementations are completely separate, the charset support isn't the same. Iconv supports a lot more charsets, but it doesn't support all of those available to locales. For example, iconv has Big5 support which locale lacks. Conversely, locale has Shift-JIS support which iconv does not. There's also a difference in how charset names are mapped in the two APIs. The locale code has a small fixed set of aliases, which doesn't include things like US-ASCII or ANSI X3.4. In contrast, the iconv code has an extensive database of charset aliases which are compiled into the library. Picolibc has a few tests for the iconv API which verify charset names and perform some translations. Without an external reference, it's hard to know if the results are correct.
POSIX vs C internationalization In addition to including the iconv API, POSIX extends locale support in a couple of ways:
  1. Exposing locale objects via the newlocale, uselocale, duplocale and freelocale APIs.
  2. uselocale sets a per-thread locale, rather than the process-wide locale.
Goals for Picolibc internationalization support For charsets, supporting UTF-8 should cover the bulk of embedded application needs, and even that is probably more than what most applications require. Most (all?) compilers use Unicode for wide character and string constants. That means wchar_t needs to be Unicode in every locale. Aside from charset support, the rest of the locale infrastructure is heavily focused on creating human-consumable strings. I don't think it's a stretch to say that none of this is very useful these days, even for systems with sophisticated user interactions. For picolibc, the cost to provide any of this would be high. Having two completely separate charset conversion datasets makes for a confusing and error-prone experience for developers. Replacing iconv with code that leverages the existing locale support for translating between multi-byte and wide-character representations will save a bunch of source code and improve consistency. Embedded systems can be very sensitive to memory usage, both read-only and read-write. Applications not using internationalization capabilities shouldn't pay a heavy premium even when the library binary is built with support. For the most sensitive targets, the library should be configurable to remove unnecessary functionality. Picolibc needs to conform with at least the C language standard, and as much of POSIX as makes sense. Fortunately, the requirements for C are modest as it only includes a few locale-related APIs and doesn't include iconv. Finally, picolibc should test these APIs to make sure they conform with relevant standards, especially character set translation and character classification. The easiest way to do this is to reference another implementation of the same API and compare results.

Switching to Unicode for JIS wchar_t This involved ripping the JIS to Unicode translations out of all of the wide character APIs and inserting them into the translations between multi-byte and wide-char representations. The missing Unicode to JIS translation was kludged by iterating over all JIS code points until a matching Unicode value was found. That's an obvious place for a performance improvement, but at least it works.

Tiny locale This is a minimal implementation of locales which conforms with the C language standard while providing only charset translation and character classification data. It handles all of the existing charsets, but splits things into three levels:
  1. ASCII
  2. UTF-8
  3. Extended, including any or all of: a. ISO 8859 b. Windows code pages and other 8-bit encodings c. JIS (JIS, EUC-JP and Shift-JIS)
When built for ASCII-only, all of the locale support is short-circuited, except for error checking. In addition, support in printf and scanf for wide characters is removed by default (it can be re-enabled with the -Dio-wchar=true meson option). This offers the smallest code size. Because the wctype APIs (e.g. iswupper) are all locale-specific, this mode restricts them to ASCII-only, which means they become wrappers on top of the ctype APIs with added range checking.

When built for UTF-8, character classification for wide characters uses tables that provide the full Unicode range. Setlocale now selects between two locales, "C" and "C.UTF-8". Any locale name other than "C" selects the UTF-8 version. If the locale name contains "." or "-", then the rest of the locale name is taken to be a charset name and matched against the list of supported charsets. In this mode, only "us_ascii", "ascii" and "utf-8" are recognized. Because a single byte of a UTF-8 string with the high bit set is not a complete character, all of the ctype APIs in this mode can use the same implementation as the ASCII-only mode. This means the small ctype implementation is available. Calling setlocale(LC_ALL, "C.UTF-8") will allow the application to use the APIs which translate between multi-byte and wide characters to deal with UTF-8 encoded strings. In addition, scanf and printf can read and write UTF-8 strings into wchar_t strings. Locale names are converted into locale IDs, an enumeration which lists the available locales. Each ID implies a specific charset, as that's the only thing which differs between them. This means a locale can be encoded in a few bytes rather than an array of strings.

In terms of memory usage, applications not using locales and not using the wctype APIs should see only a small increase in code space. That's due to the wchar_t support added to printf and scanf, which need to translate between multi-byte and wide-character representations. There aren't any tables required, as ASCII and UTF-8 are directly convertible to Unicode. On ARM-v7m, the added code in printf and scanf adds up to about 1kB, and another 32 bytes of RAM are used. The big difference when enabling extended charset support is that all of the charset conversion and character classification operations become table-driven and dependent on the locale. Depending on the extended charsets supported, these can be quite large. With all of the extended charsets included, this adds an additional 30kB of code and static data and uses another 56 bytes of RAM. There are two known gaps in functionality compared with the newlib code:
  1. Locale strings that encode different locales for different categories. That's nominally required by POSIX as LC_ALL is supposed to return a string sufficient to restore the locale, but the only category which actually matters is LC_CTYPE.
  2. No nl_langinfo support. This would be fairly easy to add, returning appropriate constant values for each parameter.
Tiny locale was merged to picolibc main in this PR.

Tiny iconv Replacing the bulky newlib iconv code was far easier than swapping locale implementations. Essentially all that iconv does is compute two functions, one which maps from multi-byte to wide-char in one locale and another which maps from wide-char to multi-byte in another locale. Once the JIS locales were fixed to use Unicode, the new iconv implementation was straightforward. POSIX doesn't provide any _l version of mbrtowc or wcrtomb, so using standard C APIs would have been clunky. Instead, the implementation uses the internal APIs to compute the correct charset conversion functions. The entire implementation fits in under 200 lines of code. Tiny iconv is in process in this PR.

Future directions Right now, both of these new bits of code sit in the source tree parallel to the old versions. I'm not seeing any particular reason to keep the old versions around; they have provided a useful point of comparison in developing the new code, but I don't think they offer any compelling benefits going forward.

26 January 2025

Russ Allbery: Review: Dark Matters

Review: Dark Matters, by Michelle Diener
Series: Class 5 #4
Publisher: Eclipse
Copyright: October 2019
ISBN: 0-6454658-6-0
Format: Kindle
Pages: 307
Dark Matters is the fourth book in the science fiction semi-romance Class 5 series. There are spoilers for all of the previous books, and although enough is explained that you could make sense of the story starting here, I wouldn't recommend it. As with the other books in the series, it follows new protagonists, but the previous protagonists make an appearance. You will be unsurprised to hear that the Tecran kidnapped yet another Earth woman. The repetitiveness of the setup would be more annoying if the book took itself too seriously, but it doesn't, and so I mostly find it entertaining. I thought Diener was going to dodge the obvious series structure, but now I am wondering if we're going to end up with one woman per Class 5 ship after all. Lucy is not on a ship, however, Tecran or otherwise. She is a captive in a military research facility on the Tecran home world. The Tecran are in very deep trouble given the events of the previous book and have decided that Lucy's existence is a liability. Only the intervention of some sympathetic Tecran scientists she partly befriended during her captivity lets her escape the facility before it's destroyed. Now she's alone, on an alien world, being hunted by the military. It's not entirely the fault of this book that it didn't tell the story that I wanted to read. The setup for Dark Matters implies this book will see the arrival of consequences for the Tecran's blatant violations of the Sentient Beings Agreement. I was looking forward to a more political novel about how such consequences could be administered. This is the sort of problem that we struggle with in our politics: Collective punishment isn't acceptable, but there have to be consequences sufficient to ensure that a state doesn't repeat the outlawed behavior, and yet attempting to deliver those consequences feels like occupation and can set off worse social ruptures and even atrocities. I wasn't expecting that deep of political analysis of what is, after all, a lighthearted SF adventure series, but Diener has been willing to touch on hard problems. The ethics of violence has been an ongoing theme of the series. Alas for me, this is not what we get. The arriving cavalry, in the form of a Class 5 and the inevitable Grih hunk to serve as the love interest du jour, quickly become more interested in helping Lucy elude pursuers (or escape captors) than in the delicate political situation. The conflict between the local population is a significant story element, but only as backdrop. Instead, this reads like a thriller or an action movie, complete with alien predators and a cinematic set piece finale. The political conflict between the Tecran and the United Council does reach a conclusion of sorts, but it's not that satisfying. Perhaps some of the political fallout will happen in future books, but here Diener simplifies the morality of the story in the climax and dodges out of the tricky ethical and social challenge of how to punish a sovereign nation. One of the things I like about this series is that it takes moral indignation seriously, but now that Diener has raised the (correct) complication that people have strong motivations to find excuses for the actions of their own side, I hope she can find a believable political resolution that isn't simple brute force. This entry in the series wasn't bad, but it didn't grab me. 
Lucy was fine as a protagonist; her ability to manipulate the Tecran into making mistakes fits the longer time she's had to study them and keeps her distinct from the other protagonists. But the small bit of politics we do see is unsatisfying and conveniently simplistic, and this book mostly degenerates into generic action sequences. Bane, the Class 5 ship featured in this story, is great when he's active, and I continue to be entertained by the obsession the Class 5 ships have with Earth women, but he's sidelined for too much of the story. I felt like Diener focused on the least interesting part of the story setup. If you've read this far, there's nothing wrong with this entry. You'll probably want to keep reading. But it felt like a missed opportunity. Followed in publication order by Dark Ambitions, a novella that returns to Rose to tell a side story. The next novel is Dark Class, in which we'll presumably see the last kidnapped Earth woman. Rating: 6 out of 10

Otto Kekäläinen: 10 habits to help becoming a Debian Maintainer

Becoming a Debian maintainer is a journey that combines technical expertise, community collaboration, and continuous learning. In this post, I'll share 10 key habits that will both help you navigate the complexities of Debian packaging without getting lost, and also enable you to contribute more effectively to one of the world's largest open source projects.

1. Read and re-read the Debian Policy, the Developer's Reference and the git-buildpackage manual Anyone learning Debian packaging and aspiring to become a Debian maintainer is likely to wade through a lot of documentation, only to realize that much of it is outdated or sometimes outright incorrect. Therefore, it is important to learn right from the start which sources are the most reliable and truly worth reading and re-reading. I recommend these documents, in order of importance:
  • The Debian Policy Manual: Describes the structure of the operating system, the package archive, and requirements for packages to be included in the Debian archive.
  • The Developer's Reference: A collection of best practices and process descriptions Debian packagers are expected to follow while interacting with one another.
  • The git-buildpackage man pages: While the Policy focuses on the end result and is intentionally void of practical instructions on creating or maintaining Debian packages, the Developer's Reference goes into greater detail. However, it too lacks step-by-step instructions. For the exact commands, consult the man pages of git-buildpackage and its subcommands (e.g., gbp clone, gbp import-orig, gbp pq, gbp dch, gbp push). See also my post on Debian source package git branch and tags for easy-to-understand diagrams.
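To make those man pages concrete, a typical packaging round with git-buildpackage might look roughly like this (the repository URL and flags are illustrative; always check the man pages for the options that fit your workflow):
gbp clone https://salsa.debian.org/debian/hello.git && cd hello
gbp import-orig --uscan   # download and import the latest upstream release
gbp dch                   # update debian/changelog from the git history
gbp buildpackage          # build the package
gbp push                  # push the packaging branches and tags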

2. Make reading man pages a habit In addition to the above, try to make a habit of checking out the man page of every new tool you use to ensure you are using it as intended. The best place to read accurate and up-to-date documentation is manpages.debian.org. The manual pages are maintained alongside the tools by their developers, ensuring greater accuracy than any third-party documentation. If you are using a tool in the way the tool author documented, you can be confident you are doing the right thing, even if it wasn't explicitly mentioned in some third-party guide about Debian packaging best practices.

3. Read and write emails While members of the Debian community have many channels of communication, the mailing lists are by far the most prominent. Asking questions on the appropriate list is a good way to get current advice from other people doing Debian packaging. Staying subscribed to lists of interest is also a good way to read about new developments as they happen. Note that every post is public and archived permanently, so the discussions on the mailing lists also form a body of documentation that can later be searched and referred to. Regularly writing short and well-structured emails on the mailing lists is great practice for improving technical communication skills, a useful ability in general. For Debian specifically, being active on mailing lists helps build a reputation that can later attract collaborators and supporters for more complex initiatives.

4. Create and use an OpenPGP key Related to reputation and identity, OpenPGP keys play a central role in the Debian community. OpenPGP is used to varying degrees to sign git commits and tags, sign and encrypt email, and, most importantly, to sign Debian packages so their origin can be verified. The process of becoming a Debian Maintainer and eventually a Debian Developer culminates in getting your OpenPGP key included in the Debian keyring, which is used to control who can upload packages into the Debian archive. The earlier you create a key and start using it to build a reputation around that specific key signing your work, the better. Note that due to a recent schism in the OpenPGP standards working group, it is safest to create an OpenPGP key using GnuPG version 2.2.x (not 2.4.x), or using Sequoia-PGP.
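A minimal starting point with GnuPG, for example (interactive; the email address below is just a placeholder):
gpg --full-generate-key                            # accept the defaults, use your real name and address
gpg --armor --export you@example.org > mykey.asc   # export the public key to publish or share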

5. Integrate Salsa CI in all work One reason Debian remains popular, even 30 years after its inception, is its culture of maintaining high standards. For a newcomer, learning all the quality assurance tools such as Lintian, Piuparts, Adequate, various build variations, and reproducible builds may be overwhelming. However, these tasks are easier to manage thanks to Salsa CI, the continuous integration pipeline in Debian that runs tests on every commit at salsa.debian.org. The earlier you activate Salsa CI in the package repository you are working on, the faster you will achieve high quality in your package with fewer missteps. You can also further customize a package-specific salsa-ci.yml for more testing coverage. Example Salsa CI pipeline with customizations

6. Fork on Salsa and use draft Merge Requests to solicit feedback All modern Debian packages are hosted on salsa.debian.org. If you want to make a change to any package, it is easy to fork, make an initial attempt at the change, and publish it as a draft Merge Request (MR) on Salsa to solicit feedback. People might have surprising reasons to object to the change you propose, or they might need time to get used to the idea before agreeing to it. Also, some people might object to a vague idea out of suspicion but agree once they see the exact implementation. There may also be a surprising number of people supporting your idea, and if there is an MR, they have a place to show their support. Don't expect every Merge Request to be accepted. However, proposing an idea as running code in an MR is far more effective than raising the idea on a mailing list or in a bug report. Get into the habit of publishing plenty of merge requests to solicit feedback and drive discussions toward consensus.
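As a sketch of that workflow (repository, branch and fork names are illustrative, and the push options are GitLab features worth double-checking against the Salsa documentation):
gbp clone https://salsa.debian.org/debian/hello.git && cd hello
git checkout -b fix-typo
# ...edit and commit your change...
git remote add fork git@salsa.debian.org:yourlogin/hello.git
git push fork fix-typo -o merge_request.create -o merge_request.draft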

7. Use git rebase frequently Linear git history is much easier to read. The ease of reading git log and git blame output is vital in Debian, where packages often have updates from multiple people spanning many years, even decades. Debian packagers likely spend more time than the average software developer reading git history. Make sure you master git commands such as gitk --all, git citool --amend, git commit -a --fixup <commit id>, git rebase -i --autosquash <target branch>, git cherry-pick <commit id 1> <id 2> <id 3>, and git pull --rebase. If rebasing is not done on your initiative, rest assured others will ask you to do it. Thus, if the commands above are familiar, rebasing will be quick and easy for you.
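For instance, a common fixup round-trip that keeps history linear looks like this (the commit id and branch name are placeholders):
git commit -a --fixup 1a2b3c4              # attach a fix to an earlier commit
git rebase -i --autosquash debian/latest   # fold the fixup into that commit before pushing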

8. Reviews: give some, get some In open source, the larger a project becomes, the more it attracts contributions, and the bottleneck for its growth isn't how much code developers can create but how much code submissions can be properly reviewed. At the time of writing, the main Debian group on Salsa has over 800 open merge requests pending reviews and approvals. Feel free to read and comment on any merge request you find. You don't have to be a subject matter expert to provide valuable feedback. Even if you don't have specific feedback, your comment as another human acknowledging that you read the MR and found no issues is viewed positively by the author. Besides, if you spend enough time reviewing MRs in a specific domain, you will eventually become an expert in it. Code reviews are not just about providing feedback to the submitter; they are also great learning opportunities for the reviewer. As a rule of thumb, you should review at least twice as many merge requests as you submit yourself.

9. Improve Debian by improving upstream It is common that while packaging software for Debian, bugs are uncovered and patched in Debian. Do not forget to submit the fixes upstream, and add a Forwarded field to the file in debian/patches! As the person building and packaging something in Debian, you automatically become an authority on that software, and the upstream is likely glad to receive your improvements. While submitting patches upstream is a bit of work initially, getting improvements merged upstream eventually saves time for everyone and makes packaging in Debian easier, as there will be fewer patches to maintain with each new upstream release.
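A patch header following DEP-3 might look like this (every value below is made up purely for illustration):
Description: Fix crash when the configuration file is empty
Author: Jane Doe <jane@example.org>
Bug-Debian: https://bugs.debian.org/1000000
Forwarded: https://github.com/example/upstream/pull/42
Last-Update: 2025-01-26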

10. Don't hold any habits too firmly Last but not least: Once people learn a specific way of working, they tend to stick to it for decades. Learning how to create and maintain Debian packages requires significant effort, and people tend to stop learning once they feel they've reached a sufficient level. This tendency to get stuck in a local optimum is understandable and natural, but try to resist it. It is likely that better techniques will evolve over time, so stay humble and re-evaluate your beliefs and practices every few years. Mastering these habits takes time, but each small step brings you closer to making a meaningful impact on Debian. By staying curious, collaborative, and adaptable, you can ensure your contributions stand the test of time, just like Debian itself. Good luck on your journey toward becoming a Debian Maintainer!

20 January 2025

Divine Attah-Ohiemi: Progress Report: First Half of My Outreachy Internship

Hello everyone! I'm excited to share a progress report on my Outreachy internship with the Debian community. As I reach the halfway point of this journey, I want to reflect on what I've accomplished so far and outline my modified goals for the second half of the internship. In truth, there wasn't a strict timeline for my project, migrating Debian webpage content to Hugo, because the original repository contained thousands of pages. The initial goal was to develop a proof of concept for: Thanks to our daily standups, where we brainstorm and revise contributions, we've made significant progress. The wiki documentation discussing the technical decisions taken to meet these goals is currently in progress here. During the first half of my internship, I have improved and refined my skills in several areas. I learned new Markdown syntaxes, studied and utilized Apache's mod_rewrite, and am halfway through studying GNU Make to use Perl scripts for processing data for dynamic content. I recommend Managing Projects with GNU Make by Robert Mecklenburg; it's a great book for beginners! While I didn't get stuck on any particular goal, the most challenging aspect was adding Hugo aliases to help with Apache's multilingual content negotiation. The way the webwml repository generates multilingual content differs from debianhugo. For instance, in webwml, the structure looks like this: english/index.wml -> /index.en.html (with a symlink from index.html to index.en.html) and french/index.wml -> /index.fr.html. In contrast, debianhugo uses en/_index.md -> /index.html and fr/_index.md -> /fr/index.html. Apache's multilingual content negotiation checks for index.<user preferred lang code>.html in the current directory, which works well with webwml since all related translations are generated in the same directory. However, with debianhugo using subdirectories for languages other than English, we had to set up aliases in the front matter of every non-English page. For example, in fr/_index.md, we added this to the front matter:
...
aliases:
  - /index.fr.html
...
This setup allows Hugo to generate multilingual HTML files in the initial home directory solely for the purpose of setting up a 301 redirect to the same page in the language subdirectory. However, if the client sets their preferred language to English, Apache content negotiation tries to find /index.en.html. If it doesn't find it, it defaults to any other language-suffixed file, which can lead to unexpected behavior. For example, if English is set as the preferred language, accessing the site may serve /index.fr.html, which then redirects to /fr/index.html. This was a significant challenge, and you can see a demo of this hosted here. If I were to start the project over, I would document every decision as I made it in the wiki, no matter how rough the documentation turned out. Waiting until the midpoint of the project to document was not a good idea. As I move into the second half of my internship, the goals we've set include improving our project wiki documentation and continuing the migration process while enhancing the user experience of complicated sections. I'm looking forward to making even more progress and sharing my journey with you all. Happy coding!

14 January 2025

Dirk Eddelbuettel: RProtoBuf 0.4.23 on CRAN: Multiple Updates

A new maintenance release 0.4.23 of RProtoBuf arrived on CRAN earlier today, about one year after the previous update. RProtoBuf provides R with bindings for the Google Protocol Buffers ("ProtoBuf") data encoding and serialization library used and released by Google, and deployed very widely in numerous projects as a language- and operating-system-agnostic protocol. This release brings a number of contributed PRs which are truly appreciated. As the package dates back fifteen-plus years, some code corners had become crufty, which was addressed in several PRs, as were two updates for ongoing changes / new releases of ProtoBuf itself. I also made the usual changes one does to continuous integration, README badges and URLs, as well as correcting one issue the checkbashisms script complained about. The following section from the NEWS.Rd file has full details.

Changes in RProtoBuf version 0.4.23 (2022-12-13)
  • More robust tests using toTextFormat() (Xufei Tan in #99 addressing #98)
  • Various standard packaging updates to CI and badges (Dirk)
  • Improvements to string construction in error messages (Michael Chirico in #102 and #103)
  • Accommodate ProtoBuf 26.x and later (Matteo Gianella in #104)
  • Accommodate ProtoBuf 6.30.9 and later (Lev Kandel in #106)
  • Correct bashism issues in configure.ac (Dirk)

Thanks to my CRANberries, there is a diff to the previous release. The RProtoBuf page has copies of the (older) package vignette, the quick overview vignette, and the pre-print of our JSS paper. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

10 January 2025

Dirk Eddelbuettel: nanotime 0.3.11 on CRAN: Polish

Another minor update 0.3.11 for our nanotime package is now on CRAN. nanotime relies on the RcppCCTZ package (as well as the RcppDate package for additional C++ operations) and offers efficient high(er) resolution time parsing and formatting up to nanosecond resolution, using the bit64 package for the actual integer64 arithmetic. Initially implemented using the S3 system, it has benefitted greatly from a rigorous refactoring by Leonardo who not only rejigged nanotime internals in S4 but also added new S4 types for periods, intervals and durations. This release covers two corner cases. Michael sent in a PR avoiding a clang warning on complex types. We also fixed an issue that surfaced in a downstream package under sanitizer checks: R extends the notion of NA to types such as integer or character, which need special treatment in non-R library code that does not know about these values. We used to flag the (character) formatted values after calling the corresponding CCTZ function, but that leaves potentially undefined values (from R's NA values for int, say, cast to double). Now we flag them, set a transient safe value for the call, and inject the (character) representation "NA" after the call in those spots. The end result is the same, but without a possible slap on the wrist from sanitizer checks. The NEWS snippet below has the full details.

Changes in version 0.3.11 (2025-01-10)
  • Explicit Rcomplex assignment accommodates pickier compilers over newer R struct (Michael Chirico in #135 fixing #134)
  • When formatting, NA values are flagged before the CCTZ call to not trigger the sanitizer, and set to NA after the call (Dirk in #136)

Thanks to my CRANberries, there is a diffstat report for this release. More details and examples are at the nanotime page; code, issue tickets etc at the GitHub repository and all documentation is provided at the nanotime documentation site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub.

Sergio Talens-Oliag: Testing New User Tools

In recent weeks I've had some time to scratch my own itch on matters related to tools I use daily on my computer, namely the desktop / window manager and my text editor of choice. This post is a summary of what I tried, how it worked out and my short and medium-term plans related to them.

Desktop / WM On the desktop / window manager front I've been using Cinnamon on Debian and Ubuntu systems since Gnome 3 was published (I never liked version 3, so I decided to move to something similar to Gnome 2, including the keyboard shortcuts). In fact I've never been a fan of desktop environments: before Gnome I used OpenBox and IceWM because they were a lot faster than desktop systems on my hardware at the time, and I was using them only to place one or two windows on multiple workspaces, using mainly the keyboard for my interactions (well, except for the web browsers and the image manipulation programs). Although I was comfortable using Cinnamon, some years ago I tried to move to i3, a tiling window manager for X11 that looked like a good choice for me, but I didn't have much time to play with it and never used it enough to become productive with it (I didn't prepare a complete configuration nor had enough time to learn the new shortcuts, so I went back to Cinnamon and never tried again). Anyway, some weeks ago I updated my work machine OS (it was using Ubuntu 22.04 LTS and I updated it to the 24.04 LTS version) and the Cinnamon systray applet stopped working as it used to (in fact I still have to restart Cinnamon after starting a session to make it work) and, as I had some time, I decided to try a tiling window manager again, but this time I went for SwayWM, as it uses Wayland instead of X11.

Sway configuration On my ~/.config/sway/config I tuned some things:
  • Set fuzzel as the application launcher.
  • Installed manually the shikane application and created a configuration to be executed always when sway is started / reloaded (I adjusted my configuration with wdisplays and used shikanectl to save it).
  • Added support for running the xdg-desktop-portal-wlr service.
  • Enabled the swayidle command to lock the screen after some time of inactivity.
  • Adjusted the keyboard to use the es key map.
  • Added some keybindings to make my life easier, including the use of grim and swappy to take screenshots.
  • Configured waybar as the environment bar.
  • Added a shell script to start applications when sway is started (it uses swaymsg to execute background commands and the i3toolwait script to wait for the launched applications before continuing):
    #!/bin/sh
    # VARIABLES
    CHROMIUM_LOCAL_STATE="$HOME/.config/google-chrome/Local State"
    I3_TOOLWAIT="$HOME/.config/sway/scripts/i3-toolwait"
    # Functions
    chromium_profile_dir() {
      jq -r ".profile.info_cache | to_entries | map({(.value.name): .key}) | add | .\"$1\" // \"\"" "$CHROMIUM_LOCAL_STATE"
    }
    # MAIN
    IGZ_PROFILE_DIR="$(chromium_profile_dir "sergio.talens@intelygenz.com")"
    OURO_PROFILE_DIR="$(chromium_profile_dir "sergio.talens@nxr.global")"
    PERSONAL_PROFILE_DIR="$(chromium_profile_dir "stalens@gmail.com")"
    # Common programs
    swaymsg "exec nextcloud --background"
    swaymsg "exec nm-applet"
    # Run spotify on the first workspace (it is mapped to the laptop screen)
    swaymsg -q "workspace 1"
    ${I3_TOOLWAIT} "spotify"
    # Run tmux on the second workspace
    swaymsg -q "workspace 2"
    ${I3_TOOLWAIT} -- foot tmux a -dt sto
    wp_num="3"
    if [ "$OURO_PROFILE_DIR" ]; then
      swaymsg -q "workspace $wp_num"
      ${I3_TOOLWAIT} -m ouro-browser -- google-chrome --profile-directory="$OURO_PROFILE_DIR"
      wp_num="$((wp_num+1))"
    fi
    if [ "$IGZ_PROFILE_DIR" ]; then
      swaymsg -q "workspace $wp_num"
      ${I3_TOOLWAIT} -m igz-browser -- google-chrome --profile-directory="$IGZ_PROFILE_DIR"
      wp_num="$((wp_num+1))"
    fi
    if [ "$PERSONAL_PROFILE_DIR" ]; then
      swaymsg -q "workspace $wp_num"
      ${I3_TOOLWAIT} -m personal-browser -- google-chrome --profile-directory="$PERSONAL_PROFILE_DIR"
      wp_num="$((wp_num+1))"
    fi
    # Open the browser without setting the profile directory if none was found
    if [ "$wp_num" = "3" ]; then
      swaymsg -q "workspace $wp_num"
      ${I3_TOOLWAIT} google-chrome
      wp_num="$((wp_num+1))"
    fi
    swaymsg -q "workspace $wp_num"
    ${I3_TOOLWAIT} evolution
    wp_num="$((wp_num+1))"
    swaymsg -q "workspace $wp_num"
    ${I3_TOOLWAIT} slack
    wp_num="$((wp_num+1))"
    # Open a private browser and a console in the last workspace
    swaymsg -q "workspace $wp_num"
    ${I3_TOOLWAIT} -- google-chrome --incognito
    ${I3_TOOLWAIT} foot
    # Go back to the second workspace for keepassxc
    swaymsg "workspace 2"
    ${I3_TOOLWAIT} keepassxc

Conclusion After using Sway for some days I can confirm that it is a good choice for me, but some of the components needed to make it work as I want are too new and not available on the Ubuntu 24.04 LTS repositories, so I decided to go back to Cinnamon and try Sway again in the future. That said, I added more workspaces to my setup (now they are only available on the main monitor; the laptop screen is fixed while there is a big monitor connected), added some additional keyboard shortcuts and installed or updated some applets.

Text editor When I started using Linux many years ago I used vi/vim and emacs as my text editors (vi for plain text and emacs for programming and editing HTML/XML), but eventually I moved to vim as my main text editor and I've been using it since (well, I moved to neovim some time ago, although I kept my old vim configuration). To be fair I'm not as expert as I could be with vim, but I'm productive with it and it has many plugins that make my life easier on my machines, while keeping my ability to edit text and configurations on any system that has a vi-compatible editor installed. For work reasons I tried to use Visual Studio Code last year, but I've never really liked it and almost everything I do with it I can do with neovim (I even use Copilot with it). Besides, I'm a heavy terminal user (I use tmux locally and via ssh) and I like to be able to use my text editor in my shell sessions, and Code does not work like that. The only annoying thing about vim/neovim is its configuration (well, the problem is that I have a very old one and probably should spend some time fixing and updating it), but, as I said, it's been working well for me for a long time, so I never really had the motivation to do it. Anyway, after finishing my desktop tests I remembered that I had had the Helix editor installed for some time but had never tried it, so I decided to give it a try and see if it could be a good replacement for neovim on my environments (the only drawback is that, as it is not vi compatible, I would need to switch back to vi mode when working on remote systems, but I guess I could live with that). I ran the Helix tutorial and I liked it, so I decided to configure it, install the Language Servers I can probably take advantage of in my daily work on my personal and work machines, and see how it works.

Language server installations A lot of manual installation is needed to get the language servers working; what I did on my machines is more or less the following:
# AWK
sudo npm i -g 'awk-language-server@>=0.5.2'
# BASH
sudo apt-get install shellcheck shfmt
sudo npm i -g bash-language-server
# C/C++
sudo apt-get install clangd
# CSS, HTML, ESLint, JSON, SCSS
sudo npm i -g vscode-langservers-extracted
# Docker
sudo npm install -g dockerfile-language-server-nodejs
# Docker compose
sudo npm install -g @microsoft/compose-language-service
# Helm
app="helm_ls_linux_amd64"
url="$(
  curl -s https://api.github.com/repos/mrjosh/helm-ls/releases/latest |
    jq -r ".assets[] | select(.name == \"$app\") | .browser_download_url"
)"
curl -L "$url" --output /tmp/helm_ls
sudo install /tmp/helm_ls /usr/local/bin
rm /tmp/helm_ls
# Markdown
app="marksman-linux-x64"
url="$(
  curl -s https://api.github.com/repos/artempyanykh/marksman/releases/latest |
    jq -r ".assets[] | select(.name == \"$app\") | .browser_download_url"
)"
curl -L "$url" --output /tmp/marksman
sudo install /tmp/marksman /usr/local/bin
rm /tmp/marksman
# Python
sudo npm i -g pyright
# Rust
rustup component add rust-analyzer
# SQL
sudo npm i -g sql-language-server
# Terraform
sudo apt-get install terraform-ls
# TOML
cargo install taplo-cli --locked --features lsp
# YAML
sudo npm install --global yaml-language-server
# JavaScript, TypeScript
sudo npm install -g typescript-language-server typescript
sudo npm install -g --save-dev --save-exact @biomejs/biome

Helix configuration The Helix configuration is done in a couple of TOML files placed in the ~/.config/helix directory; the config.toml file I used is this one:
theme = "solarized_light"
[editor]
line-number = "relative"
mouse = false
[editor.statusline]
left = ["mode", "spinner"]
center = ["file-name"]
right = ["diagnostics", "selections", "position", "file-encoding", "file-line-ending", "file-type"]
separator = " "
mode.normal = "NORMAL"
mode.insert = "INSERT"
mode.select = "SELECT"
[editor.cursor-shape]
insert = "bar"
normal = "block"
select = "underline"
[editor.file-picker]
hidden = false
[editor.whitespace]
render = "all"
[editor.indent-guides]
render = true
character = " " # Some characters that work well: " ", " ", " ", " "
skip-levels = 1
And to configure the language servers I used the following language-servers.toml file:
[[language]]
name = "go"
auto-format = true
formatter = { command = "goimports" }
[[language]]
name = "javascript"
language-servers = [
  "typescript-language-server", # optional
  "vscode-eslint-language-server",
]
[language-server.rust-analyzer.config.check]
command = "clippy"
[language-server.sql-language-server]
command = "sql-language-server"
args = ["up", "--method", "stdio"]
[[language]]
name = "sql"
language-servers = [ "sql-language-server" ]
[[language]]
name = "hcl"
language-servers = [ "terraform-ls" ]
language-id = "terraform"
[[language]]
name = "tfvars"
language-servers = [ "terraform-ls" ]
language-id = "terraform-vars"
[language-server.terraform-ls]
command = "terraform-ls"
args = ["serve"]
[[language]]
name = "toml"
formatter = { command = "taplo", args = ["fmt", "-"] }
[[language]]
name = "typescript"
language-servers = [
  "typescript-language-server",
  "vscode-eslint-language-server",
]

Neovim configuration After a little while I noticed that I was going to need some time to get used to Helix, and the most interesting things for me were the easy configuration and the language server integrations; but as I am already comfortable with neovim and had just installed the language server support tools on my machines, I just needed to configure them for neovim and could keep using it for a while. As I said, my configuration is old; to configure neovim I have the following init.vim file in my ~/.config/nvim folder:
set runtimepath^=~/.vim runtimepath+=~/.vim/after
let &packpath=&runtimepath
source ~/.vim/vimrc
" load lua configuration
lua require('config')
With that configuration I keep my old vimrc (it is a little bit messy, but it works) and I use a Lua configuration file for the language servers and some additional neovim plugins in the ~/.config/nvim/lua/config.lua file:
-- -----------------------
-- BEG: LSP Configurations
-- -----------------------
-- AWK (awk_ls)
require'lspconfig'.awk_ls.setup{}
-- Bash (bashls)
require'lspconfig'.bashls.setup{}
-- C/C++ (clangd)
require'lspconfig'.clangd.setup{}
-- CSS (cssls)
require'lspconfig'.cssls.setup{}
-- Docker (dockerls)
require'lspconfig'.dockerls.setup{}
-- Docker Compose
require'lspconfig'.docker_compose_language_service.setup{}
-- Golang (gopls)
require'lspconfig'.gopls.setup{}
-- Helm (helm_ls)
require'lspconfig'.helm_ls.setup{}
-- Markdown
require'lspconfig'.marksman.setup{}
-- Python (pyright)
require'lspconfig'.pyright.setup{}
-- Rust (rust-analyzer)
require'lspconfig'.rust_analyzer.setup{}
-- SQL (sqlls)
require'lspconfig'.sqlls.setup{}
-- Terraform (terraformls)
require'lspconfig'.terraformls.setup{}
-- TOML (taplo)
require'lspconfig'.taplo.setup{}
-- Typescript (ts_ls)
require'lspconfig'.ts_ls.setup{}
-- YAML (yamlls)
require'lspconfig'.yamlls.setup{
  settings = {
    yaml = {
      customTags = { "!reference sequence" }
    }
  }
}
-- -----------------------
-- END: LSP Configurations
-- -----------------------
-- ---------------------------------
-- BEG: Autocompletion configuration
-- ---------------------------------
-- Ref: https://github.com/neovim/nvim-lspconfig/wiki/Autocompletion
--
-- Pre requisites:
--
--   # Packer
--   git clone --depth 1 https://github.com/wbthomason/packer.nvim \
--      ~/.local/share/nvim/site/pack/packer/start/packer.nvim
--
--   # Start nvim and run :PackerSync or :PackerUpdate
-- ---------------------------------
local use = require('packer').use
require('packer').startup(function()
  use 'wbthomason/packer.nvim' -- Packer, useful to avoid removing it with PackerSync / PackerUpdate
  use 'neovim/nvim-lspconfig' -- Collection of configurations for built-in LSP client
  use 'hrsh7th/nvim-cmp' -- Autocompletion plugin
  use 'hrsh7th/cmp-nvim-lsp' -- LSP source for nvim-cmp
  use 'saadparwaiz1/cmp_luasnip' -- Snippets source for nvim-cmp
  use 'L3MON4D3/LuaSnip' -- Snippets plugin
end)
-- Add additional capabilities supported by nvim-cmp
local capabilities = require("cmp_nvim_lsp").default_capabilities()
local lspconfig = require('lspconfig')
-- Enable some language servers with the additional completion capabilities offered by nvim-cmp
local servers = { 'clangd', 'rust_analyzer', 'pyright', 'ts_ls' }
for _, lsp in ipairs(servers) do
  lspconfig[lsp].setup {
    -- on_attach = my_custom_on_attach,
    capabilities = capabilities,
  }
end
-- luasnip setup
local luasnip = require 'luasnip'
-- nvim-cmp setup
local cmp = require 'cmp'
cmp.setup {
  snippet = {
    expand = function(args)
      luasnip.lsp_expand(args.body)
    end,
  },
  mapping = cmp.mapping.preset.insert({
    ['<C-u>'] = cmp.mapping.scroll_docs(-4), -- Up
    ['<C-d>'] = cmp.mapping.scroll_docs(4), -- Down
    -- C-b (back) C-f (forward) for snippet placeholder navigation.
    ['<C-Space>'] = cmp.mapping.complete(),
    ['<CR>'] = cmp.mapping.confirm {
      behavior = cmp.ConfirmBehavior.Replace,
      select = true,
    },
    ['<Tab>'] = cmp.mapping(function(fallback)
      if cmp.visible() then
        cmp.select_next_item()
      elseif luasnip.expand_or_jumpable() then
        luasnip.expand_or_jump()
      else
        fallback()
      end
    end, { 'i', 's' }),
    ['<S-Tab>'] = cmp.mapping(function(fallback)
      if cmp.visible() then
        cmp.select_prev_item()
      elseif luasnip.jumpable(-1) then
        luasnip.jump(-1)
      else
        fallback()
      end
    end, { 'i', 's' }),
  }),
  sources = {
    { name = 'nvim_lsp' },
    { name = 'luasnip' },
  },
}
-- ---------------------------------
-- END: Autocompletion configuration
-- ---------------------------------

Conclusion I guess I'll keep Helix installed and try it again on some of my personal projects to see if I can get used to it, but for now I'll stay with neovim as my main text editor and learn the shortcuts to use it with the language servers.

9 January 2025

Reproducible Builds: Reproducible Builds in December 2024

Welcome to the December 2024 report from the Reproducible Builds project! Our monthly reports outline what we ve been up to over the past month and highlight items of news from elsewhere in the world of software supply-chain security when relevant. As ever, however, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. reproduce.debian.net
  2. debian-repro-status
  3. On our mailing list
  4. Enhancing the Security of Software Supply Chains
  5. diffoscope
  6. Supply-chain attack in the Solana ecosystem
  7. Website updates
  8. Debian changes
  9. Other development news
  10. Upstream patches
  11. Reproducibility testing framework

reproduce.debian.net Last month saw the introduction of reproduce.debian.net. Announced at the recent Debian MiniDebConf in Toulouse, reproduce.debian.net is an instance of rebuilderd operated by the Reproducible Builds project. rebuilderd is our server designed to monitor the official package repositories of Linux distributions and attempt to reproduce the observed results there. This month, however, we are pleased to announce that not only does the service now produce graphs, the reproduce.debian.net homepage itself has become a start page of sorts, and the amd64.reproduce.debian.net and i386.reproduce.debian.net pages have emerged. The first of these rebuilds the amd64 architecture, naturally, but it is also building Debian packages that are marked with the "no architecture" label, all. The second builder is, however, only rebuilding the i386 architecture. Both of these services were also switched to reproduce the Debian trixie distribution instead of unstable, which started with 43% of the archive rebuilt, with 79.3% reproduced successfully. This is very much a work in progress, and we'll start reproducing Debian unstable soon. Our i386 hosts are very kindly sponsored by Infomaniak whilst the amd64 node is sponsored by OSUOSL, thank you! Indeed, we are looking for more workers for more Debian architectures; please contact us if you are able to help.

debian-repro-status Reproducible builds developer kpcyrd has published a client program for reproduce.debian.net (see above) that queries the status of the locally installed packages and rates the system with a percentage score. This tool works analogously to arch-repro-status for the Arch Linux Reproducible Builds setup. The tool was packaged for Debian and is currently available in Debian trixie: it can be installed with apt install debian-repro-status.
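A quick sketch of trying it out on a trixie system (the bare invocation is an assumption based on the description above; the actual options and output may differ):
sudo apt install debian-repro-status
# queries the locally installed packages and prints a percentage score
debian-repro-status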

On our mailing list On our mailing list this month:
  • Bernhard M. Wiedemann wrote a detailed post on his long journey towards a bit-reproducible Emacs package. In his interesting message, Bernhard goes into depth about the tools that they used and the lower-level technical details of, for instance, compatibility with the version of glibc within openSUSE.
  • Shivanand Kunijadar posed a question pertaining to the reproducibility issues with encrypted images. Shivanand explains that they must use a random IV for encryption with AES CBC. The resulting artifact is not reproducible due to the random IV used. The message resulted in a handful of replies, hopefully helpful!
  • User Danilo posted an interesting question related to their attempts at achieving reproducible builds for Threema Desktop 2.0. The question resulted in a number of replies attempting to find the right combination of compiler and linker flags (for example).
  • Longstanding contributor David A. Wheeler wrote to our list announcing the release of the Census III of Free and Open Source Software: Application Libraries report written by Frank Nagle, Kate Powell, Richie Zitomer and David himself. As David writes in his message, the report attempts to answer the question what is the most popular Free and Open Source Software (FOSS)? .
  • Lastly, kpcyrd followed-up to a post from September 2024 which mentioned their desire for someone to implement a hashset of allowed module hashes that is generated during the kernel build and then embedded in the kernel image , thus enabling a deterministic and reproducible build. However, they are now reporting that somebody implemented the hash-based allow list feature and submitted it to the Linux kernel mailing list . Like kpcyrd, we hope it gets merged.

Enhancing the Security of Software Supply Chains: Methods and Practices Mehdi Keshani of the Delft University of Technology in the Netherlands has published their thesis on Enhancing the Security of Software Supply Chains: Methods and Practices . Their introductory summary first begins with an outline of software supply chains and the importance of the Maven ecosystem before outlining the issues that it faces that threaten its security and effectiveness . To address these:
First, we propose an automated approach for library reproducibility to enhance library security during the deployment phase. We then develop a scalable call graph generation technique to support various use cases, such as method-level vulnerability analysis and change impact analysis, which help mitigate security challenges within the ecosystem. Utilizing the generated call graphs, we explore the impact of libraries on their users. Finally, through empirical research and mining techniques, we investigate the current state of the Maven ecosystem, identify harmful practices, and propose recommendations to address them.
A PDF of Mehdi s entire thesis is available to download.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 283 and 284 to Debian:
  • Update copyright years. [ ]
  • Update tests to support file 5.46. [ ][ ]
  • Simplify tests_quines.py::test_{differences,differences_deb} to simply use assert_diff and not mangle the test fixture. [ ]

Supply-chain attack in the Solana ecosystem A significant supply-chain attack impacted Solana, an ecosystem for decentralised applications running on a blockchain. Hackers targeted the @solana/web3.js JavaScript library and embedded malicious code that extracted private keys and drained funds from cryptocurrency wallets. According to some reports, about $160,000 worth of assets were stolen, not including SOL tokens and other crypto assets.

Website updates Similar to last month, there was a large number of changes made to our website this month, including:
  • Chris Lamb:
    • Make the landing page hero look nicer when the vertical height component of the viewport is restricted, not just the horizontal width.
    • Rename the Buy-in page to Why Reproducible Builds? [ ]
    • Removing the top black border. [ ][ ]
  • Holger Levsen:
  • hulkoba:
    • Remove the sidebar-type layout and move to a static navigation element. [ ][ ][ ][ ]
    • Create and merge a new Success stories page, which highlights the success stories of Reproducible Builds, showcasing real-world examples of projects shipping with verifiable, reproducible builds. These stories aim to enhance the technical resilience of the initiative by encouraging community involvement and inspiring new contributions. . [ ]
    • Further changes to the homepage. [ ]
    • Remove the translation icon from the navigation bar. [ ]
    • Remove unused CSS styles pertaining to the sidebar. [ ]
    • Add sponsors to the global footer. [ ]
    • Add extra space on large screens on the Who page. [ ]
    • Hide the side navigation on small screens on the Documentation pages. [ ]

Debian changes There were a significant number of reproducibility-related changes within Debian this month, including:
  • Santiago Vila uploaded version 0.11+nmu4 of the dh-buildinfo package. In this release, dh_buildinfo becomes a no-op, i.e. it no longer does anything beyond warning the developer that the dh-buildinfo package is now obsolete. In his upload, Santiago wrote that We still want packages to drop their [dependency] on dh-buildinfo, but now they will immediately benefit from this change after a simple rebuild.
  • Holger Levsen filed Debian bug #1091550 requesting a rebuild of a number of packages that were built with a very old version of dpkg.
  • Fay Stegerman contributed to an extensive thread on the debian-devel development mailing list on the topic of Supporting alternative zlib implementations . In particular, Fay wrote about her results experimenting whether zlib-ng produces identical results or not.
  • kpcyrd uploaded a new rust-rebuilderd-worker, rust-derp, rust-in-toto and debian-repro-status to Debian, which passed successfully through the so-called NEW queue.
  • Gioele Barabucci filed a number of bugs against the debrebuild component/script of the devscripts package, including:
    • #1089087: Address a spurious extra subdirectory in the build path.
    • #1089201: Extra zero bytes added to .dynstr when rebuilding CMake projects.
    • #1089088: Some binNMUs have a 1-second offset in some timestamps.
  • Gioele Barabucci also filed a bug against the dh-r package to report that the Recommends and Suggests fields are missing from rebuilt R packages. At the time of writing, this bug has no patch and needs some help to make over 350 binary packages reproducible.
  • Lastly, 8 reviews of Debian packages were added, 11 were updated and 11 were removed this month adding to our knowledge about identified issues.

Other development news In other ecosystem and distribution news:
  • Lastly, in openSUSE, Bernhard M. Wiedemann published another report for the distribution. There, Bernhard reports about the success of building R-B-OS , a partial fork of openSUSE with only 100% bit-reproducible packages. This effort was sponsored by the NLNet NGI0 initiative.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In November, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add a new i386.reproduce.debian.net rebuilder. [ ][ ][ ][ ][ ][ ]
    • Make a number of updates to the documentation. [ ][ ][ ][ ][ ]
    • Run i386.reproduce.debian.net on a public port to allow external workers. [ ]
    • Add a link to the /api/v0/pkgs/list endpoint. [ ]
    • Add support for a statistics page. [ ][ ][ ][ ][ ][ ]
    • Limit build logs to 20 MiB and diffoscope output to 10 MiB. [ ]
    • Improve the frontpage. [ ][ ]
    • Explain that we re testing arch:any and arch:all on the amd64 architecture, but only arch:any on i386. [ ]
  • Misc:
    • Remove code for testing Arch Linux, which has moved to reproduce.archlinux.org. [ ][ ]
    • Don't install dstat on Jenkins nodes anymore as it's been removed from Debian trixie. [ ]
    • Prepare the infom08-i386 node to become another rebuilder. [ ]
    • Add debug date output for benchmarking the reproducible_pool_buildinfos.sh script. [ ]
    • Install installation-birthday everywhere. [ ]
    • Temporarily disable automatic updates of pool links on buildinfos.debian.net. [ ]
    • Install Recommends by default on Jenkins nodes. [ ]
    • Rename rebuilder_stats.py to rebuilderd_stats.py. [ ]
    • r.d.n/stats: minor formatting changes. [ ]
    • Install files under /etc/cron.d/ with the correct permissions. [ ]
and Jochen Sprickerhof made the following changes: Lastly, Gioele Barabucci also classified packages affected by the 1-second offset issue filed as Debian bug #1089088 [ ][ ][ ][ ], Chris Hofstaedtler updated the URL for Grml's dpkg.selections file [ ], Roland Clobus updated the Jenkins log parser to parse warnings from diffoscope [ ] and Mattia Rizzolo banned a number of bots and crawlers from the service [ ][ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

3 January 2025

Taavi V n nen: Automatically updating reverse DNS entries for my Hetzner servers

Some parts of my infrastructure run on Hetzner dedicated servers. Hetzner's management console has an interface to update reverse DNS entries, and I wanted to automate that. Unfortunately there's no option to just delegate the zones to my own authoritative DNS servers. So I did the next best thing, which is updating the Hetzner-managed records with data from my own authoritative DNS servers.

Generating DNS zones the hard way The first step of automating DNS record provisioning is, well, figuring out which records need to be provisioned. I wanted to re-use my existing automation for generating the record data, instead of coming up with a new system for these records. The basic summary is that there's a Go program creatively named dnsgen that's in charge of generating zone file snippets from various sources (these include Netbox, Kubernetes, PuppetDB and my custom reverse web proxy setup). Those snippets are combined with Jinja templates to generate full zone files to be loaded to a hidden primary running Bind9 (like all other DNS servers I run). The zone files are then transferred to a fleet of internal authoritative servers as well as my public authoritative DNS server, which in turn transfers them to various other authoritative DNS servers (like ns-global and Traficom anycast) for redundancy. There are also a bunch of other smaller features, like using Bind views to serve different data to internal and external clients, and resolving external records during record generation time to be used on apex records that would use CNAME records if they could. (The latter is a workaround for Masto.host, the hosting provider we use for Wikis World, not having a stable IPv6 address.) Overall it's a really nice system, and I've spent quite a bit of time on it.

Updating records on Hetzner-managed space As mentioned above, Hetzner unfortunately does not support custom DNS servers for reverse records on IP space rented from them. But I wanted to use my existing, perfectly working DNS record generation setup. So the obvious answer is to (ab)use DNS zone file transfers. I quickly wrote a few hundred lines of Go to request the zone data and then use the Hetzner robot API to ensure the reverse entries are in sync. The main obstacle hit here was the Hetzner API somehow requiring an "update" call (instead of a "create" one) to create a new record, as the create endpoint was returning an HTTP 400 response no matter what. Once I sorted that out, the script started working fine and created the few dozen missing records. Finally I added a CronJob in my Kubernetes cluster to run the script once in a while. Overall this is a big improvement over doing things by hand and didn't require that much effort. The obvious next step would be to expand the script into a tiny DNS server capable of receiving zone update NOTIFYs to make the updates happen in real time. Unfortunately there's now no hiding of the records revealing my ugly hacks, er, clever networking solutions :(
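The general shape of that sync, reduced to a shell sketch rather than the author's actual Go tool, could look something like this; the zone and nameserver names are placeholders and the Hetzner call itself is left as a stub.
# pull PTR records for a reverse zone from the authoritative server via AXFR
zone="0.192.in-addr.arpa"
ns="ns.example.org"
dig @"$ns" "$zone" AXFR +noall +answer |
  awk '$4 == "PTR" { print $1, $5 }' |
  while read -r name target; do
    # here the real tool compares against the current Hetzner robot API state
    # and only issues an update for entries that differ (stubbed out here)
    echo "would sync $name -> $target"
  done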

2 January 2025

Matthew Garrett: The GPU, not the TPM, is the root of hardware DRM

As part of their "Defective by Design" anti-DRM campaign, the FSF recently made the following claim:
Today, most of the major streaming media platforms utilize the TPM to decrypt media streams, forcefully placing the decryption out of the user's control (from here).
This is part of an overall argument that Microsoft's insistence that only hardware with a TPM can run Windows 11 has the goal of aiding streaming companies in their attempt to ensure media can only be played in tightly constrained environments.

I'm going to be honest here and say that I don't know what Microsoft's actual motivation for requiring a TPM in Windows 11 is. I've been talking about TPM stuff for a long time. My job involves writing a lot of TPM code. I think having a TPM enables a number of worthwhile security features. Given the choice, I'd certainly pick a computer with a TPM. But in terms of whether it's of sufficient value to lock out Windows 11 on hardware with no TPM that would otherwise be able to run it? I'm not sure that's a worthwhile tradeoff.

What I can say is that the FSF's claim is just 100% wrong, and since this seems to be the sole basis of their overall claim about Microsoft's strategy here, the argument is pretty significantly undermined. I'm not aware of any streaming media platforms making use of TPMs in any way whatsoever. There is hardware DRM that the media companies use to restrict users, but it's not in the TPM - it's in the GPU.

Let's back up for a moment. There's multiple different DRM implementations, but the big three are Widevine (owned by Google, used on Android, Chromebooks, and some other embedded devices), Fairplay (Apple implementation, used for Mac and iOS), and Playready (Microsoft's implementation, used in Windows and some other hardware streaming devices and TVs). These generally implement several levels of functionality, depending on the capabilities of the device they're running on - this will range from all the DRM functionality being implemented in software up to the hardware path that will be discussed shortly. Streaming providers can choose what level of functionality and quality to provide based on the level implemented on the client device, and it's common for 4K and HDR content to be tied to hardware DRM. In any scenario, they stream encrypted content to the client and the DRM stack decrypts it before the compressed data can be decoded and played.

The "problem" with software DRM implementations is that the decrypted material is going to exist somewhere the OS can get at it at some point, making it possible for users to simply grab the decrypted stream, somewhat defeating the entire point. Vendors try to make this difficult by obfuscating their code as much as possible (and in some cases putting some of it in-kernel), but pretty much all software DRM is at least somewhat broken and copies of any new streaming media end up being available via Bittorrent pretty quickly after release. This is why higher quality media tends to be restricted to clients that implement hardware-based DRM.

The implementation of hardware-based DRM varies. On devices in the ARM world this is usually handled by performing the cryptography in a Trusted Execution Environment, or TEE. A TEE is an area where code can be executed without the OS having any insight into it at all, with ARM's TrustZone being an example of this. By putting the DRM code in TrustZone, the cryptography can be performed in RAM that the OS has no access to, making the scraping described earlier impossible. x86 has no well-specified TEE (Intel's SGX is an example, but is no longer implemented in consumer parts), so instead this tends to be handed off to the GPU. The exact details of this implementation are somewhat opaque - of the previously mentioned DRM implementations, only Playready does hardware DRM on x86, and I haven't found any public documentation of what drivers need to expose for this to work.

In any case, as part of the DRM handshake between the client and the streaming platform, encryption keys are negotiated with the key material being stored in the GPU or the TEE, inaccessible from the OS. Once decrypted, the material is decoded (again either on the GPU or in the TEE - even in implementations that use the TEE for the cryptography, the actual media decoding may happen on the GPU) and displayed. One key point is that the decoded video material is still stored in RAM that the OS has no access to, and the GPU composites it onto the outbound video stream (which is why if you take a screenshot of a browser playing a stream using hardware-based DRM you'll just see a black window - as far as the OS can see, there is only a black window there).

Now, TPMs are sometimes referred to as a TEE, and in a way they are. However, they're fixed function - you can't run arbitrary code on the TPM, you only have whatever functionality it provides. But TPMs do have the ability to decrypt data using keys that are tied to the TPM, so isn't this sufficient? Well, no. First, the TPM can't communicate with the GPU. The OS could push encrypted material to it, and it would get plaintext material back. But the entire point of this exercise was to avoid the decrypted version of the stream from ever being visible to the OS, so this would be pointless. And rather more fundamentally, TPMs are slow. I don't think there's a TPM on the market that could decrypt a 1080p stream in realtime, let alone a 4K one.

The FSF's focus on TPMs here is not only technically wrong, it's indicative of a failure to understand what's actually happening in the industry. While the FSF has been focusing on TPMs, GPU vendors have quietly deployed all of this technology without the FSF complaining at all. Microsoft has enthusiastically participated in making hardware DRM on Windows possible, and user freedoms have suffered as a result, but Playready hardware-based DRM works just fine on hardware that doesn't have a TPM and will continue to do so.


31 December 2024

Russ Allbery: Review: Metal from Heaven

Review: Metal from Heaven, by August Clarke
Publisher: Erewhon
Copyright: November 2024
ISBN: 1-64566-099-0
Format: Kindle
Pages: 443
Metal from Heaven is industrial-era secondary-world fantasy with a literary bent. It is a complete story in one book, and I would be very surprised by a sequel. Clarke previously wrote the Scapegracers young-adult trilogy, which got excellent reviews and a few award nominations, as H.A. Clarke. This is his first adult novel.
Know I adore you. Look out over the glow. The cities sundered, their machines inverted, mountains split and prairies blazing, that long foreseen Hereafter crowning fast. This calamity is a promise made to you. A prayer to you, and to your shadow which has become my second self, tucked behind my eye and growing in tandem with me, pressing outwards through the pupil, the smarter, truer, almost bursting reason for our wrath. Do not doubt me. Just look. Watch us rise as the sun comes up over the beauty. The future stains the bleakness so pink. When my violence subsides, we will have nothing, and be champions.
Marney Honeycutt is twelve years old, a factory worker, and lustertouched. She works in the Yann I. Chauncey Ichorite Foundry in Ignavia City, alongside her family and her best friend, shaping the magical metal ichorite into the valuable industrial products of a new age of commerce and industry. She is the oldest of the lustertouched, the children born to factory workers and poisoned by the metal. It has made her allergic, prone to fits at any contact with ichorite, but also able to exert a strange control over the metal if she's willing to pay the price of spasms and hallucinations for hours afterwards. As Metal from Heaven opens, the workers have declared a strike. Her older sister is the spokesperson, demanding shorter hours, safer working conditions, and an investigation into the health of the lustertouched children. Chauncey's response is to send enforcer snipers to kill the workers, including the entirety of her family.
The girl sang, "Unalone toward dawn we go, toward the glory of the new morning." An enforcer shot her in the belly, and when she did not fall, her head.
Marney survives, fleeing into the city, swearing an impossible personal revenge against Yann Chauncey. An act of charity gets her a ticket on a train into the countryside. The woman who bought her ticket is a bandit who is on the train to rob it. Marney's ability to control ichorite allows her to help the bandits in return, winning her a place with the Highwayman's Choir who have been preying on the shipments of the rich and powerful and then disappearing into the hills. The Choir's secret is that the agoraphobic and paranoid Baron of the Fingerbluffs is dead and has been for years. He was killed by his staff, Hereafterist idealists, who have turned his remote territory into an anarchist commune and haven for pirates and bandits. This becomes Marney's home and the Choir becomes her family, but she never forgets her oath of revenge or the childhood friend she left behind in the piles of bodies and to whom this story is narrated. First, Clarke's writing is absolutely gorgeous.
We scaled the viny mountain jags at Montrose Barony's legal edge, the place where land was and wasn't Ignavia, Royston, and Drustland alike. There was a border but it was diffuse and hallucinatory, even more so than most. On legal papers and state maps there were harsh lines that squashed topography and sanded down the mountains into even hills in planter's rows, but here among the jutting rocks and craggy heather, the ground was lineless.
The rhythm of it, the grasp of contrast and metaphor, the word choice! That climactic word "lineless," with its echo of limitless. So good. Second, this is the rarest of books: a political fantasy that takes class and religion seriously and uses them for more than plot drivers. This is not at all our world, and the technology level is somewhat ambiguous, but the parallels to the Gilded Age and Progressive Era are unmistakable. The Hereafterists that Marney joins are political anarchists, not in the sense of alternative governance structures and political theory sanitized for middle-class liberals, but in the sense of Emma Goldman and Peter Kropotkin. The society they have built in the Fingerbluffs is temporary, threatened, and contingent, but it is sincere and wildly popular among the people who already lived there. Even beyond politics, class is a tangible force in this book. Marney is a factory worker and the child of factory workers. She barely knows how to read and doesn't magically learn over the course of the book. She has friends who are clever in the sense rewarded by politics and nobility, who navigate bureaucracies and political nuance, but that is not Marney's world. When, towards the end of the book, she has to deal with a gathering of high-class women, the contrast is stark, and she navigates that gathering only by being entirely unexpected. Perhaps the best illustration of the subtlety of this is the terminology in the book for lesbian. Marney is a crawly, which is a slur thrown at people like her (and one of the rare fictional slurs that work exactly as the author intended) but is also simply what she calls herself. Whether or not it functions as a slur depends on context, and the context is never hard to understand. The high-class lesbians she meets later are Lunarists, and react to crawly as a vile and insulting word. They use language to separate themselves from both the insult and from the social class that uses it. Language is an indication of culture and manners and therefore of morality, unlike deeds, which admit endless justifications.
Conversation was fleeting. Perdita managed with whomever stood near her, chipper about every prettiness she saw, the flitting butterflies, the dappled light between the leaves, the lushness and the fragrance of untamed land, and her walking companions took turns sharing in her delight. It was infectious, how happy she was. She was going to slaughter millions. She was going to skip like this all the while.
The handling of religion is perhaps even better. Marney was raised a Tullian, which sits alongside two other fleshed-out fictional religions and sketches of several more. Tullians tend to be conservative and patriarchal, and Marney has a realistically complicated relationship with faith: sticking with some Tullian worship practices and gestures because they're part of who she is, feeling a kinship to other Tullians, discarding beliefs that don't fit her, and revising others. Every major religion has a Hereafterist spin or reinterpretation that upends or reverses the parts of the religion that were used to prop up the existing social order and brings it more in line with Hereafterist ideals. We see the Tullian Hereafterist variation in detail, and as someone who has studied a lot of methods of reinterpreting Christianity, I was impressed by how well Clarke invents both a belief system and its revisionist rewrite. This is exactly how religions work in human history, but one almost never sees this subtlety in fantasy novels. Marney's allergy to ichorite causes her internal dialogue to dissolve into hallucinatory synesthesia when she's manipulating or exposed to it. Since that's most of the book, substantial portions read like drug trips with growing body horror. I normally hate this type of narration, so it's a sign of just how good Clarke's writing is that I tolerated it and even enjoyed parts. It helps that the descriptions are irreverent and often surprising, full of unexpected metaphors and sudden turns. It's very hard not to quote paragraph after paragraph of this book. Clarke is also doing a lot with gender that I don't feel qualified to comment in detail on, but it would not surprise me to see this book in the Otherwise Award recommendation list. I can think of three significant male characters, all of whom are well-done, but every other major character is female by at least some gender definition. Within that group, though, is huge gender diversity of the complicated and personal type that doesn't force people into defined boxes. Marney's sexuality is similarly unclassified and sometimes surprising. My one complaint is that I thought the sex scenes (which, to warn, are often graphic) fell into the literary fiction trap of being described so closely and physically that it didn't feel like anyone involved was actually enjoying themselves. (This is almost certainly a matter of personal taste.) I had absolutely no idea how Clarke was going to end this book, and the last couple of chapters caught me by surprise. I'm still not sure what I think about the climax. It's not the ending that I wanted, but one of the merits of this book is that it never did what I thought I wanted and yet made me enjoy the journey anyway. It is, at least, a genre ending, not a literary ending: The reader gets a full explanation of what is going on, and the setting is not static the way that it so often is in literary fiction. The characters can change the world, for good or for ill. The story felt frustrating and incomplete when I first finished it, but I haven't stopped thinking about this book and I think I like the shape of it a bit more now. It was certainly unexpected, at least by me. Clarke names Dhalgren as one of their influences in the acknowledgments, and yes, Metal from Heaven is that kind of book. This is the first 2024 novel I've read that felt like the kind of book that should be on award shortlists. 
I'm not sure it was entirely successful, and there are parts of it that I didn't like or that weren't for me, but it's trying to do something different and challenging and uncomfortable, and I think it mostly worked. And the writing is so good.
She looked like a mythic princess from the old woodcuts, who ruled nature by force of goodness and faith and had no legal power.
Metal from Heaven is not going to be everyone's taste. If you do not like literary fantasy, there is a real chance that you will hate this. I am very glad that I read it, and also am going to take a significant break from difficult books before I tackle another one. But then I'm probably going to try the Scapegracers series, because Clarke is an author I want to follow. Content notes: Explicit sex, including sadomasochistic sex. Political violence, mostly by authorities. Murdered children, some body horror, and a lot of serious injuries and death. Rating: 8 out of 10

24 December 2024

Russ Allbery: Review: Number Go Up

Review: Number Go Up, by Zeke Faux
Publisher: Crown Currency
Copyright: 2023
Printing: 2024
ISBN: 0-593-44382-9
Format: Kindle
Pages: 373
Number Go Up is a cross between a history and a first-person account of investigative journalism around the cryptocurrency bubble and subsequent collapse in 2022. The edition I read has an afterward from June 2024 that brings the story up to date with Sam Bankman-Fried's trial and a few other events. Zeke Faux is a reporter for Bloomberg News and a fellow of New America. Last year, I read Michael Lewis's Going Infinite, a somewhat-sympathetic book-length profile of Sam Bankman-Fried that made a lot of people angry. One of the common refrains at the time was that people should read Number Go Up instead, and since I'm happy to read more about the absurdities of the cryptocurrency world, I finally got around to reading the other big crypto book of 2023. This is a good book, with some caveats that I am about to explain at absurd length. If you want a skeptical history of the cryptocurrency bubble, you should read it. People who think that it's somehow in competition with Michael Lewis's book or who think the two books disagree (including Faux himself) have profoundly missed the point of Going Infinite. I agree with Matt Levine: Both of these books are worth your time if this is the sort of thing you like reading about. But (much) more on Faux's disagreements with Lewis later. The frame of Number Go Up is Faux's quixotic quest to prove that Tether is a fraud. To review this book, I therefore need to briefly explain what Tether is. This is only the first of many extended digressions. One natural way to buy cryptocurrency would be to follow the same pattern as a stock brokerage account. You would deposit some amount of money into the account (or connect the brokerage account to your bank account), and then exchange money for cryptocurrency or vice versa, using bank transfers to put money in or take it out. However, there are several problems with this. One is that swapping cryptocurrency for money is awkward and sometimes expensive. Another is that holding people's investment money for them is usually highly regulated, partly for customer safety but also to prevent money laundering. These are often called KYC laws (Know Your Customer), and the regulation-hostile world of cryptocurrency didn't want to comply with them. Tether is a stablecoin, which means that the company behind Tether attempts to guarantee that one Tether is always worth exactly one US dollar. It is not a speculative investment like Bitcoin; it's a cryptocurrency substitute for dollars. People exchange dollars for Tether to get their money into the system and then settle all of their subsequent trades in Tether, only converting the Tether back to dollars when they want to take their money out of cryptocurrency entirely. In essence, Tether functions like the cash reserve in a brokerage account: Your Tether holdings are supposedly guaranteed to be equivalent to US dollars, you can withdraw them at any time, and because you can do so, you don't bother, instead leaving your money in the reserve account while you contemplate what new coin you want to buy. As with a bank, this system rests on the assurance that one can always exchange one Tether for one US dollar. The instant people stop believing this is true, people will scramble to get their money out of Tether, creating the equivalent of a bank run. 
Since Tether is not a regulated bank or broker and has no deposit insurance or strong legal protections, the primary defense against a run on Tether is Tether's promise that they hold enough liquid assets to be able to hand out dollars to everyone who wants to redeem Tether. (A secondary defense that I wish Faux had mentioned is that Tether limits redemptions to registered accounts redeeming more than $100,000, which is a tiny fraction of the people who hold Tether, but for most purposes this doesn't matter because that promise is sufficient to maintain the peg with the dollar.)

Faux's firmly-held belief throughout this book is that Tether is lying. He believes they do not have enough money to redeem all existing Tether coins, and that rather than backing every coin with very safe liquid assets, they are using the dollars deposited in the system to make illiquid and risky investments. Faux never finds the evidence that he's looking for, which makes this narrative choice feel strange. His theory was tested when there was a run on Tether following the collapse of the Terra stablecoin. Tether passed without apparent difficulty, redeeming $16B or about 20% of the outstanding Tether coins. This doesn't mean Faux is wrong; being able to redeem 20% of the outstanding tokens is very different from being able to redeem 100%, and Tether has been fined for lying about its reserves. But Tether is clearly more stable than Faux thought it was, which makes the main narrative of the book weirdly unsatisfying. If he admitted he might be wrong, I would give him credit for showing his work even if it didn't lead where he expected, but instead he pivots to focusing on Tether's role in money laundering without acknowledging that his original theory took a serious blow.

In Faux's pursuit of Tether, he wanders through most of the other elements of the cryptocurrency bubble, and that's the strength of this book. Rather than write Number Go Up as a traditional history, Faux chooses to closely follow his own thought processes and curiosity. This has the advantage of giving Faux an easy and natural narrative, something that non-fiction books of this type can struggle with, and it lets Faux show how confusing and off-putting the cryptocurrency world is to an outsider.

The best parts of this book were the parts unrelated to Tether. Faux provides an excellent summary of the Axie Infinity speculative bubble and even traveled to the Philippines to interview people who were directly affected. He then wandered through the bizarre world of NFTs, and his first-hand account of purchasing one (specifically a Mutant Ape) to get entrance to a party (which sounded like a miserable experience I would pay money to get out of) really drives home how sketchy and weird cryptocurrency-related software and markets can be. He also went to El Salvador to talk to people directly about the country's supposed embrace of Bitcoin, and there's no substitute for that type of reporting to show how exaggerated and dishonest the claims of cryptocurrency adoption are.

The disadvantage of this personal focus on Faux himself is that it sometimes feels tedious or sensationalized. I was much less interested in his unsuccessful attempts to interview the founder of Tether than Faux was, and while the digression into forced labor compounds in Cambodia devoted to pig butchering scams was informative (and horrific), I think Faux leaned too heavily on an indirect link to Tether.
His argument is that cryptocurrency enables a type of money laundering that is particularly well-suited to supporting scams, but both scams and this type of economic slavery existed before cryptocurrency and will exist afterwards. He did not make a very strong case that Tether was uniquely valuable as a money laundering service, as opposed to a currently useful tool that would be replaced with some other tool should it go away. This part of the book is essentially an argument that money laundering is bad because it enables crime, and sure, to an extent I agree. But if you're going to put this much emphasis on the evils of money laundering, I think you need to at least acknowledge that many people outside the United States do not want to give the US government, which is often openly hostile to them, veto power over their financial transactions. Faux does not.

The other big complaint I have with this book, and with a lot of other reporting on cryptocurrency, is that Faux is sloppy with the term "Ponzi scheme." This is going to sound like nit-picking, but I think this sloppiness matters because it may obscure an ongoing shift in cryptocurrency markets.

A Ponzi scheme is not any speculative bubble. It is a very specific type of fraud in which investors are promised improbably high returns at very low risk and with safe principal. These returns are paid out, not via investment in some underlying enterprise, but by taking the money from new investments and paying it to earlier investors. Ponzi schemes are doomed because satisfying their promises requires a constantly increasing flow of new investors. Since the population of the world is finite, all Ponzi schemes are mathematically guaranteed to eventually fail, often in a sudden death spiral of ever-increasing promises to lure new investors when the investment stream starts to dry up.

There are some Ponzi schemes in cryptocurrency, but most practices that are called Ponzi schemes are not. For example, Faux calls Axie Infinity a Ponzi scheme, but it was missing the critical elements of promised safe returns and fraudulently paying returns from the investments of later investors. It was simply a speculative bubble that people bought into on the assumption that its price would increase, and like any speculative bubble those who sold before the peak made money at the expense of those who bought at the peak.

The reason why this matters is that Ponzi schemes are a self-correcting problem. One can decry the damage caused when they collapse, but one can also feel the reassuring certainty that they will inevitably collapse and prove the skeptics correct. The same is not true of speculative assets in general. You may think that the lack of an underlying economic justification for prices means that a speculative bubble is guaranteed to collapse eventually, but in the famous words of Gary Shilling, "markets can remain irrational a lot longer than you and I can remain solvent." One of the people Faux interviews explains this distinction to him directly:
Rong explained that in a true Ponzi scheme, the organizer would have to handle the "fraud money." Instead, he gave the sneakers away and then only took a small cut of each trade. "The users are trading between each other. They are not going through me, right?" Rong said. Essentially, he was arguing that by downloading the Stepn app and walking to earn tokens, crypto bros were Ponzi'ing themselves.
Faux is openly contemptuous of this response, but it is technically correct. Stepn is not a Ponzi scheme; it's a speculative bubble. There are no guaranteed returns being paid out of later investments and no promise that your principal is safe. People are buying in at a price that you may consider irrational, but Stepn never promised you would get your money back, let alone make a profit, and therefore it doesn't have the exponential progression of a Ponzi scheme.

One can argue that this is a distinction without a moral difference, and personally I would agree, but it matters immensely if one is trying to analyze the future of cryptocurrencies. Schemes as transparently unstable as Stepn (which gives you coins for exercise and then tries to claim those coins have value through some vigorous hand-waving) are nearly as certain as Ponzi schemes to eventually collapse. But it's also possible to create a stable business around allowing large numbers of people to regularly lose money to small numbers of sophisticated players who are collecting all of the winnings. It's called a poker room at a casino, and no one thinks poker rooms are Ponzi schemes or are doomed to collapse, even though nearly everyone who plays poker will lose money.

This is the part of the story that I think Faux largely missed, and which Michael Lewis highlights in Going Infinite. FTX was a legitimate business that made money (a lot of money) off of trading fees, in much the same way that a casino makes money off of poker rooms. Lots of people want to bet on cryptocurrencies, similar to how lots of people want to play poker. Some of those people will win; most of those people will lose. The casino doesn't care. Its profit comes from taking a little bit of each pot, regardless of who wins. Bankman-Fried also speculated with customer funds, and therefore FTX collapsed, but there is no inherent reason why the core exchange business cannot be stable if people continue to want to speculate in cryptocurrencies. Perhaps people will get tired of this method of gambling, but poker has been going strong for 200 years.

It's also important to note that although trading fees are the most obvious way to be a profitable cryptocurrency casino, they're not the only way. Wall Street firms specialize in finding creative ways to take a cut of every financial transaction, and many of those methods are more sophisticated than fees. They are so good at this that buying and selling stock through trading apps like Robinhood is free. The money to run the brokerage platform comes from companies that are delighted to pay for the opportunity to handle stock trades by day traders with a phone app. This is not, as some conspiracy theories would have you believe, due to some sort of fraudulent price manipulation. It is because the average person with a Robinhood phone app is sufficiently unsophisticated that companies that have invested in complex financial modeling will make a steady profit taking the other side of their trades, mostly because of the spread (the difference between offered buy and sell prices).

Faux is so caught up in looking for Ponzi schemes and fraud that I think he misses this aspect of cryptocurrency's transformation. Wall Street trading firms aren't piling into cryptocurrency because they want to do securities fraud. They're entering this market because there seems to be persistent demand for this form of gambling, cryptocurrency markets reward complex financial engineering, and running a legal casino is a profitable business model.
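To make the arithmetic behind that distinction concrete, here is a minimal sketch (mine, not something from either book; the deposit size and promised return are made-up assumptions) of why a true Ponzi scheme requires an exponentially growing inflow of new money, while a speculative bubble or a fee-collecting casino carries no such schedule of promised payouts.

# Minimal illustrative sketch: a Ponzi operator promises a fixed return on every
# dollar deposited and pays it entirely out of new deposits (nothing is invested).
def required_inflow(initial_deposit=1_000_000, promised_return=0.10, periods=10):
    total_liabilities = initial_deposit
    inflows = []
    for _ in range(periods):
        payout = total_liabilities * promised_return  # owed to earlier investors this period
        inflows.append(payout)                        # must be covered entirely by new money
        total_liabilities += payout                   # the new deposits become liabilities in turn
    return inflows

for period, inflow in enumerate(required_inflow(), start=1):
    print(f"period {period:2d}: new money needed ${inflow:,.0f}")

The required inflow compounds by the promised return every period, which is the sense in which collapse is mathematically guaranteed once the pool of new investors dries up; nothing in a plain speculative bubble forces that progression.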
Michael Lewis appears as a character in this book, and Faux portrays him quite negatively. The root of this animosity appears to stem from a cryptocurrency conference in the Bahamas that Faux attended. Lewis interviewed Bankman-Fried on stage, and, from Faux's account, his questions were fawning and he praised cryptocurrencies in ways that Faux is certain he knew were untrue. From that point on, Faux treats Lewis as an apologist for the cryptocurrency industry and for Sam Bankman-Fried specifically.

I think this is a legitimate criticism of Lewis's methods of getting close to the people he wants to write about, but I think Faux also makes the common mistake of assuming Lewis is a muckraking reporter like himself. This has never been what Lewis is interested in. He writes about people he finds interesting and that he thinks a reader will also find interesting. One can legitimately accuse him of being credulous, but that's partly because he's not even trying to do the same thing Faux is doing. He's not trying to judge; he's trying to understand.

This shows when it comes to the parts of this book about Sam Bankman-Fried. Faux's default assumption is that everyone involved in cryptocurrency is knowingly committing fraud, and a lot of his research is looking for evidence to support the conclusion he had already reached. I don't think there's anything inherently wrong with that approach: Faux is largely, although not entirely, correct, and this type of hostile journalism is incredibly valuable for society at large. Upton Sinclair didn't start writing The Jungle with an open mind about the meat-packing industry. But where Faux and Lewis disagree on Bankman-Fried's motivations and intentions, I think Lewis has the much stronger argument.

Faux's position is that Bankman-Fried always intended to steal people's money through fraud, perhaps to fund his effective altruism donations, and his protestations that he made mistakes and misplaced funds are obvious lies. This is an appealing narrative if one is looking for a simple villain, but Faux's evidence in support of this is weak. He mostly argues through stereotype: Bankman-Fried was a physics major and a Jane Street trader and therefore could not possibly be the type of person to misplace large amounts of money or miscalculate risk. If he wanted to understand how that could be possible, he could have read Going Infinite. I find it completely credible that someone with what appears to be uncontrolled, severe ADHD could be adept at trading and calculating probabilities and yet also misplace millions of dollars of assets because he wasn't thinking about them and therefore they stopped existing.

Lewis made a lot of people angry by being somewhat sympathetic to someone few people wanted to be sympathetic towards, but Faux (and many others) are also misrepresenting his position. Lewis agrees that Bankman-Fried intentionally intermingled customer funds with his hedge fund and agrees that he lied about doing this. His only contention is that Bankman-Fried didn't do this to steal the money; instead, he invested customer money in risky bets that he thought would pay off. In support of this, Lewis made a prediction that was widely scoffed at, namely that much less of FTX's money was missing than was claimed, and that likely most or all of it would be found. And, well, Lewis was basically correct? The FTX bankruptcy is now expected to recover considerably more than the amount of money owed to creditors.
Faux argues that this is only because the bankruptcy clawed back assets and cryptocurrencies have gone up considerably since the FTX bankruptcy, and therefore that the lost money was just replaced by unexpected windfall profits on other investments, but I don't think this point is as strong as he thinks it is. Bankman-Fried lost money on some of what he did with customer funds, made money on other things, and if he'd been able to freeze withdrawals for the year that the bankruptcy froze them, it does appear most of the money would have been recoverable. This does not make what he did legal or morally right, but no one is arguing that, only that he didn't intentionally steal money for his own personal gain or for effective altruism donations. And on that point, I don't think Faux is giving Lewis's argument enough credit.

I have a lot of complaints about this book because I know far more about this topic than anyone probably should. I think Faux missed the plot in a couple of places, and I wish someone would write a book about where cryptocurrency markets are currently going. (Matt Levine's Money Stuff newsletter is quite good, but it's about all sorts of things other than cryptocurrency and isn't designed to tell a coherent story.) But if you know less about cryptocurrency and just want to hear the details of the run-up to the 2022 bubble, this is a great book for that. Faux is writing for people who are already skeptical and is not going to convince people who are cryptocurrency true believers, but that's fine. The details are largely correct (and extensively footnoted) and will satisfy most people's curiosity.

Lewis's Going Infinite is a better book, though. It's not the same type of book at all, and it will not give you the broader overview of the cryptocurrency world. But if you're curious about what was going through the head of someone at the center of all of this chaos, I think Lewis's analysis is much stronger than Faux's. I'm happy I read both books. Rating: 8 out of 10

21 December 2024

Joey Hess: aiming at December

I have been working all year on a solar upgrade aimed at December. Now here it is, midwinter, and my electric car is charging on a cloudy day from my offgrid solar fence. I lived happily enough with 1 kilowatt of solar that I installed in 2017. Meanwhile, solar panel prices came down massively, incentives increased and everything came together: This was the year. In the spring I started clearing forest trees that were leaning over the house, making both a firebreak and a solar field. In June I picked up a pallet of panels in a box truck.
a porch with a bunch of solar panels, stacked on edge leaning up against the wall. A black and white cat is sprawled in front of them.
In August I bought the EV and was able to charge it offgrid from my old solar system... a few miles per day on the most sunny days. In September and October I built a solar fence, of my own design.
Me standing in front of the solar fence, which is 10 panels long
For the past several weeks I have been installing additional solar panels on ballasted ground mounts full of gravel. At this point I'm halfway through installing my 30 panel upgrade. The design goal of my 12 kilowatt system is to produce 1 kilowatt of power all day on a cloudy day in midwinter, which allows swapping between major loads (EV charger, hot water heater, etc) on a cloudy day and running everything on a sunny day. So the size of the battery bank doesn't matter much. Batteries are getting cheaper fast too, but they are a wear item, so it's better to oversize the solar system and minimize the battery.

A lot of this is nonstandard and experimental. And that makes sense with the price of solar panels. It costs more to mount solar panels now than the panels are worth. And non-ideal panel orientation isn't a problem when the system is massively overpaneled.

I'm hoping to finish up the install before the end of winter. I have more trees to clear, more ballasted ground mounts to install, and need to come up with something even more experimental for a half dozen or so panels. Using solar panels as mounts for solar panels? Hanging them from trees? Soon the wan light will fade, time to head off to the solstice party to enjoy the long night, and a bonfire.
Solar fence with some ballasted ground mounts in front of it, late evening light. Old pole mounted solar panels in the foreground are from the 90's.
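A rough sketch of the sizing arithmetic behind that cloudy-day goal (my numbers, not Joey's, beyond the 12 kW array and the 1 kW target; the usable daylight hours are an assumption for illustration):

# Back-of-the-envelope sketch; only the 12 kW array and 1 kW cloudy-day target
# come from the post. The daylight hours are an illustrative assumption.
array_kw = 12.0             # nameplate capacity of the planned system
cloudy_day_target_kw = 1.0  # desired steady output on a cloudy midwinter day

implied_derate = cloudy_day_target_kw / array_kw
print(f"implied cloudy-day output: {implied_derate:.1%} of nameplate")

assumed_daylight_hours = 8  # assumed usable midwinter daylight
daily_kwh = cloudy_day_target_kw * assumed_daylight_hours
print(f"roughly {daily_kwh:.0f} kWh per cloudy day to swap between major loads")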

17 December 2024

Dirk Eddelbuettel: BH 1.87.0-1 on CRAN: New Upstream

Boost is a very large and comprehensive set of (peer-reviewed) libraries for the C++ programming language, containing well over one hundred individual libraries. The BH package provides a sizeable subset of header-only libraries for (easier, no linking required) use by R. It is fairly widely used: the (partial) CRAN mirror logs (aggregated from the cloud mirrors) show over 38.5 million package downloads.

Version 1.87.0 of Boost was released last week following the regular Boost release schedule of April, August and December releases. As before, we packaged it almost immediately and started testing following our annual update cycle, which strives to balance being close enough to upstream and not stressing CRAN and the user base too much. The reverse-depends check revealed six packages requiring changes or adjustments, and we opened issue #103 to coordinate these changes (just as we did in previous years). Our sincere thanks to Matt Fidler, who fixed two packages pretty much immediately. As I had not heard back from the other maintainers since filing the issue, I uploaded the package to CRAN suggesting that the coming winter break may be a good opportunity for the four other packages to catch up. CRAN concurred, and 1.87.0-1 is now available there. There are no other changes apart from cosmetics in the DESCRIPTION file. For once, we did not add any new Boost libraries. The short NEWS entry follows.

Changes in version 1.87.0-1 (2024-12-17)
  • Upgrade to Boost 1.87.0, patched as usual to comment-out diagnostic suppression messages per the request of CRAN
  • Switched to Authors@R

Via my CRANberries, there is a diffstat report relative to the previous release. Comments and suggestions about BH are welcome via the issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub.

12 December 2024

Matthew Garrett: When should we require that firmware be free?

The distinction between hardware and software has historically been relatively easy to understand - hardware is the physical object that software runs on. This is made more complicated by the existence of programmable logic like FPGAs, but by and large things tend to fall into fairly neat categories if we're drawing that distinction.

Conversations usually become more complicated when we introduce firmware, but should they? According to Wikipedia, "firmware is software that provides low-level control of computing device hardware", and basically anything that's generally described as firmware certainly fits into the "software" side of the above hardware/software binary. From a software freedom perspective, this seems like something where the obvious answer to "Should this be free" is "yes", but it's worth thinking about why the answer is yes - the goal of free software isn't freedom for freedom's sake; the freedoms embodied in the Free Software Definition (and by proxy the DFSG) are grounded in real world practicalities.

How do these line up for firmware? Firmware can fit into two main classes - it can be something that's responsible for initialisation of the hardware (such as, historically, BIOS, which is involved in initialisation and boot and then largely irrelevant for runtime[1]) or it can be something that makes the hardware work at runtime (wifi card firmware being an obvious example). The role of free software in the latter case feels fairly intuitive, since the interface and functionality the hardware offers to the operating system is frequently largely defined by the firmware running on it. Your wifi chipset is, these days, largely a software defined radio, and what you can do with it is determined by what the firmware it's running allows you to do. Sometimes those restrictions may be required by law, but other times they're simply because the people writing the firmware aren't interested in supporting a feature - they may see no reason to allow raw radio packets to be provided to the OS, for instance. We also shouldn't ignore the fact that sufficiently complicated firmware exposed to untrusted input (as is the case in most wifi scenarios) may contain exploitable vulnerabilities allowing attackers to gain arbitrary code execution on the wifi chipset - and potentially use that as a way to gain control of the host OS (see this writeup for an example). Vendors being in a unique position to update that firmware means users may never receive security updates, leaving them with a choice between discarding hardware that otherwise works perfectly or leaving themselves vulnerable to known security issues.

But even the cases where firmware does nothing other than initialise the hardware cause problems. A lot of hardware has functionality controlled by registers that can be locked during the boot process. Vendor firmware may choose to disable (or, rather, never to enable) functionality that may be beneficial to a user, and then lock out the ability to reconfigure the hardware later. Without any ability to modify that firmware, the user lacks the freedom to choose what functionality their hardware makes available to them. Again, the ability to inspect this firmware and modify it has a distinct benefit to the user.

So, from a practical perspective, I think there's a strong argument that users would benefit from most (if not all) firmware being free software, and I don't think that's an especially controversial argument. So I think this is less of a philosophical discussion, and more of a strategic one - is spending time focused on ensuring firmware is free worthwhile, and if so what's an appropriate way of achieving this?

I think there are two consistent ways to view this. One is to view free firmware as desirable but not necessary. This approach basically argues that code that's running on hardware that isn't the main CPU would benefit from being free, in the same way that code running on a remote network service would benefit from being free, but that this is much less important than ensuring that all the code running in the context of the OS on the primary CPU is free. The other is the maximalist position: not to compromise at all - all software on a system, whether it's running at boot or during runtime, and whether it's running on the primary CPU or any other component on the board, should be free.

Personally, I lean towards the former and think there's a reasonably coherent argument here. I think users would benefit from the ability to modify the code running on hardware that their OS talks to, in the same way that I think users would benefit from the ability to modify the code running on hardware the other side of a network link that their browser talks to. I also think that there's enough that remains to be done in terms of what's running on the host CPU that it's not worth having that fight yet. But I think the latter is absolutely intellectually consistent, and while I don't agree with it from a pragmatic perspective I think things would undeniably be better if we lived in that world.

This feels like a thing you'd expect the Free Software Foundation to have opinions on, and it does! There are two primarily relevant things - the Respects Your Freedom campaign focused on ensuring that certified hardware meets certain requirements (including around firmware), and the Free System Distribution Guidelines, which define a baseline for an OS to be considered free by the FSF (including requirements around firmware).

RYF requires that all software on a piece of hardware be free other than under one specific set of circumstances. If software runs on (a) a secondary processor, (b) within which software installation is not intended after the user obtains the product, then the software does not need to be free. (b) effectively means that the firmware has to be in ROM, since any runtime interface that allows the firmware to be loaded or updated is intended to allow software installation after the user obtains the product.

The Free System Distribution Guidelines require that all non-free firmware be removed from the OS before it can be considered free. The recommended mechanism to achieve this is via linux-libre, a project that produces tooling to remove anything that looks plausibly like a non-free firmware blob from the Linux source code, along with any incitement to the user to load firmware - including even removing suggestions to update CPU microcode in order to mitigate CPU vulnerabilities.

For hardware that requires non-free firmware to be loaded at runtime in order to work, linux-libre doesn't do anything to work around this - the hardware will simply not work. In this respect, linux-libre reduces the amount of non-free firmware running on a system in the same way that removing the hardware would. This presumably encourages users to purchase RYF compliant hardware.
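As a rough way of seeing which runtime-loaded firmware a given system actually depends on, one can ask each loaded kernel module which firmware files it declares. This is a sketch of mine, not part of linux-libre or any FSF tooling, and it assumes a Linux system with modinfo available:

# Sketch: list the firmware files declared by currently loaded kernel modules.
# Output varies by hardware; modules without a firmware field are skipped.
import subprocess

with open("/proc/modules") as f:
    modules = [line.split()[0] for line in f]

for module in sorted(modules):
    result = subprocess.run(["modinfo", "-F", "firmware", module],
                            capture_output=True, text=True)
    firmware_files = result.stdout.splitlines()
    if firmware_files:
        print(f"{module}: {', '.join(firmware_files)}")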

But does that actually improve things? RYF doesn't require that a piece of hardware have no non-free firmware; it simply requires that any non-free firmware be hidden from the user. CPU microcode is an instructive example here. At the time of writing, every laptop listed here has an Intel CPU. Every Intel CPU has microcode in ROM, typically an early revision that is known to have many bugs. The expectation is that this microcode is updated in the field by either the firmware or the OS at boot time - the updated version is loaded into RAM on the CPU, and vanishes if power is cut. The combination of RYF and linux-libre doesn't reduce the amount of non-free code running inside the CPU; it just means that the user (a) is more likely to hit since-fixed bugs (including security ones!), and (b) has less guidance on how to avoid them.
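The revision that actually ends up loaded is visible from the running kernel. A small sketch of mine, assuming an x86 Linux system where /proc/cpuinfo exposes a microcode field:

# Sketch: report the microcode revision the kernel believes is currently loaded.
# /proc/cpuinfo prints one "microcode" line per logical CPU on x86 systems.
revisions = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("microcode"):
            revisions.add(line.split(":")[1].strip())

print("loaded microcode revision(s):", ", ".join(sorted(revisions)) or "not exposed")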

As long as RYF permits hardware that makes use of non-free firmware, I think it hurts more than it helps. In many cases users aren't guided away from non-free firmware - instead it's hidden away from them, leaving them less aware that their freedom is constrained. Linux-libre goes further, refusing to even inform the user that the non-free firmware that their hardware depends on can be upgraded to improve their security.

Out of sight shouldn't mean out of mind. If non-free firmware is a threat to user freedom then allowing it to exist in ROM doesn't do anything to solve that problem. And if it isn't a threat to user freedom, then what's the point of requiring linux-libre for a Linux distribution to be considered free by the FSF? We seem to have ended up in the worst case scenario, where nothing is being done to actually replace any of the non-free firmware running on people's systems and where users may even end up with a reduced awareness that the non-free firmware even exists.

[1] Yes yes SMM

