Search Results: "hug"

2 December 2021

Paul Tagliamonte: Intro to PACKRAT (Part 0/5)

Hello! Welcome. I'm so thrilled you're here. Some of you may know this (as I've written about in the past), but if you're new to my RF travels, I've spent nights and weekends over the last two years doing some self-directed learning on how radios work. I've gone from a very basic understanding of wireless communications all the way through the process of learning about and implementing a set of libraries to modulate and demodulate data using my now formidable stash of SDRs. I've been implementing all of the RF processing code from first principles, purely based on other primitives I've written myself, to prove to myself that I understand each concept before moving on. I've just finished a large personal milestone: I was able to successfully send a cURL HTTP request through a network interface into my stack of libraries, through my own BPSK implementation, framed in my own artisanal hand-crafted Layer 2 framing scheme, demodulated by my code on the other end, and sent into a Linux network interface. The combination of the Layer 1 PHY and Layer 2 Data Link is something that I've been calling PACKRAT.
$ curl http://44.127.0.8:8000/
* Connected to 44.127.0.8 (44.127.0.8) port 8000 (#0)
> GET / HTTP/1.1
> Host: localhost:1313
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
* HTTP/1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Length: 236
<
[ASCII-art "PACKRAT" banner]
* Closing connection 0
In an effort to pay it forward, to thank my friends for their time walking me through huge chunks of this, and to thank those who publish their work, I'm now spending some time documenting how I was able to implement this protocol. I would never have gotten as far as I did without the incredible patience and kindness of friends spending time working with me, and educators publishing their hard work for the world to learn from. Please accept my deepest thanks and appreciation. The PACKRAT posts are written from the perspective of a novice radio engineer but experienced software engineer. I'll be leaving out a lot of the technical details on the software end and the specific software implementation, focusing exclusively on the general gist of the implementation in the radio-critical components. The idea is that this is intended to be a framework, a jumping-off point, for those who are interested in doing this themselves. I hope that this series of blog posts will come to be useful to those who embark on this incredibly rewarding journey after me. This is the first post in the series, and it will contain links to all the posts to follow. This is going to be the landing page I link others to; as I publish additional posts, I'll be updating the links on this page. The posts will also grow a tag, which you can check back on, or follow along with here.

Tau Tau (τ) is a much more natural expression of the mathematical constant used for circles, which I use rather than Pi (π). You may see me use Tau in code or text. Tau is the same as 2π, so if you see a Tau and don't know what to do, feel free to mentally or textually replace it with 2π. I just hate always writing 2π everywhere and only using π (or worse yet 2π/2) when I mean 1/2 of a circle (or, τ/2).

Pseudo-code Basically none of the code contained in this series is valid on its own. It's very loosely based on Go, and only meant to express concepts in terms of software. The examples in the posts shouldn't be taken on their own as working snippets to process IQ data, but rather be used to guide implementations to process the data in question. I'd love to invite all readers to play along at home with the examples, and try to work through the example data captures!

Captures Speaking of captures, I've included live on-the-air captures of PACKRAT packets, as transmitted from my implementation, in different parts of these posts. This means you can go through the process of building code to parse and receive PACKRAT packets, and then build a transmitter that is validated by your receiver. It's my hope folks will follow along at home and experiment with software to process RF data on their own!

Posts in this series

21 November 2021

Julian Andres Klode: APT Z3 Solver Basics

Z3 is a theorem prover developed at Microsoft Research and available as a dynamically linked C++ library in Debian-based distributions. While the library is a whopping 16 MB and the solver is a tad slow, its permissive licensing and the number of tactics offered give it huge potential for use in solving dependencies in a wide variety of applications. Z3 does not need normalized formulas, but offers higher-level abstractions like atmost, atleast, and implies, which we will use together with boolean variables to translate the dependency problem into a form Z3 understands. In this post, we'll see how we can apply Z3 to dependency resolution in APT. We'll only discuss the basics here; a future post will explore optimization criteria and recommends.

Translating the universe APT's package universe consists of 3 relevant things: packages (the tuple of name and architecture), versions (basically a .deb), and dependencies between versions. While we could translate our entire universe to Z3 problems, we instead construct a root set from packages that were manually installed and versions marked for installation, and then build the transitive root set from it by translating all versions reachable from the root set. For each package P in the transitive root set, we create a boolean literal P. We then translate each version P1, P2, and so on. Translating a version means building a boolean literal for it, e.g. P1, and then translating the dependencies as shown below. We now need to create two more clauses to satisfy the basic requirements for debs:
  1. If a version is installed, the package is installed; and vice versa. We can encode this requirement for P above as P == atleast({P1,P2}, 1).
  2. There can only be one version installed. We add an additional constraint of the form atmost({P1,P2}, 1).
We also encode the requirements of the operation.
  1. For each package P that is manually installed, add a constraint P.
  2. For each version V that is marked for install, add a constraint V.
  3. For each package P that is marked for removal, add a constraint !P.
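To make this concrete, here is a minimal sketch of the same encoding using Z3's Python bindings (illustration only, assuming the z3-solver package; the literal names follow the text above and are not APT code):
from z3 import Bool, Solver, AtLeast, AtMost, Not

s = Solver()
P, P1, P2 = Bool("P"), Bool("P1"), Bool("P2")

# 1. a package is installed iff at least one of its versions is
s.add(P == AtLeast(P1, P2, 1))
# 2. at most one version of a package may be installed
s.add(AtMost(P1, P2, 1))

# operation requirements: P is manually installed, P1 is marked for install;
# a package marked for removal would instead get s.add(Not(P))
s.add(P)
s.add(P1)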

Dependencies Packages in APT have dependencies of two basic forms: Depends and Conflicts, as well as variations like Breaks (identical to Conflicts in solving terms) and Recommends (soft Depends) - we'll ignore those for now. We'll discuss Conflicts in the next section. Let's take a basic dependency list: A Depends: X | Y, Z. To represent that dependency, we expand each name to a list of versions that can satisfy the dependency, for example X1 | X2 | Y1, Z1. Translating this dependency list to our Z3 solver, we create boolean variables X1, X2, Y1, Z1 and define two rules:
  1. A implies atleast({X1,X2,Y1}, 1)
  2. A implies atleast({Z1}, 1)
If there actually was nothing that satisfied the Z requirement, we'd have added a rule not A. It would be possible to simply not tell Z3 about the version at all as an optimization, but that adds more complexity, and the not A constraint should not cause too many problems.

Conflicts Conflicts cannot have 'or' in them. A dependency B Conflicts: X, Y means that at most one of B, X, and Y can be installed. We can directly encode this in Z3 by using the constraint atmost({B,X,Y}, 1). This is an optimized encoding of the constraint: we could have encoded each conflict in the form !B or !X, !B or !Y, and so on. Usually this leads to worse performance as it introduces additional clauses.

Complete example Let s assume we start with an empty install and want to install the package a below.
Package: a
Version: 1
Depends: c | b
Package: b
Version: 1
Package: b
Version: 2
Conflicts: x
Package: d
Version: 1
Package: x
Version: 1
The translation in Z3 rules looks like this:
  1. Package rules for a:
    1. a == atleast({a1}, 1) - package is installed iff one version is
    2. atmost({a1}, 1) - only one version may be installed
    3. a - a must be installed
  2. Dependency rules for a
    1. implies(a1, atleast({b2, b1}, 1)) - the translated dependency above. Note that c is gone; it's not reachable.
  3. Package rules for b:
    1. b == atleast({b1,b2}, 1) - package is installed iff one version is
    2. atmost({b1, b2}, 1) - only one version may be installed
  4. Dependencies for b (= 2):
    1. atmost({b2, x1}, 1) - the conflicts between x and b = 2 above
  5. Package rules for x:
    1. x == atleast({x1}, 1) - package is installed iff one version is
    2. atmost({x1}, 1) - only one version may be installed
The package d is not translated, as it is not reachable from the root set {a1}; the transitive root set is {a1,b1,b2,x1}.
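For reference, the complete example above can also be written out with Z3's Python bindings (again just an illustration, not APT code) and checked for satisfiability:
from z3 import Bool, Solver, AtLeast, AtMost, Implies, sat

a, a1 = Bool("a"), Bool("a1")
b, b1, b2 = Bool("b"), Bool("b1"), Bool("b2")
x, x1 = Bool("x"), Bool("x1")

s = Solver()
s.add(a == AtLeast(a1, 1), AtMost(a1, 1), a)       # package rules for a, and "install a"
s.add(Implies(a1, AtLeast(b2, b1, 1)))             # dependency of a1 (c is unreachable)
s.add(b == AtLeast(b1, b2, 1), AtMost(b1, b2, 1))  # package rules for b
s.add(AtMost(b2, x1, 1))                           # conflict between b (= 2) and x
s.add(x == AtLeast(x1, 1), AtMost(x1, 1))          # package rules for x

if s.check() == sat:
    print(s.model())  # one possible solution, e.g. a1 and one of b1/b2 true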

Next iteration: Optimization We have now constructed the basic set of rules that allows us to solve our dependency problems (equivalent to SAT); however, it might lead to suboptimal solutions where it removes automatically installed packages, or installs more packages than necessary, to name a few examples. In our next iteration, we have to look at introducing optimization: for example, having the minimum number of removals, the minimal number of changed packages, or satisfying as many recommends as possible. We will also look at the upgrade problem (upgrade as many packages as possible) and the autoremove problem (remove as many automatically installed packages as possible).

Antoine Beaupré: The last syncmaildir crash

My syncmaildir (SMD) setup failed me one too many times (previously, previously). In an attempt to migrate to an alternative mail synchronization tool, I looked into using my IMAP server again, and found out my mail spool was in a pretty bad shape. I'm comparing mbsync and offlineimap in the next post but this post talks about how I recovered the mail spool so that tools like those could correctly synchronise the mail spool again.

The latest crash On Monday, SMD just started failing with this error:
nov 15 16:12:19 angela systemd[2305]: Starting pull emails with syncmaildir...
nov 15 16:12:22 angela systemd[2305]: smd-pull.service: Succeeded.
nov 15 16:12:22 angela systemd[2305]: Finished pull emails with syncmaildir.
nov 15 16:14:08 angela systemd[2305]: Starting pull emails with syncmaildir...
nov 15 16:14:11 angela systemd[2305]: smd-pull.service: Main process exited, code=exited, status=1/FAILURE
nov 15 16:14:11 angela systemd[2305]: smd-pull.service: Failed with result 'exit-code'.
nov 15 16:14:11 angela systemd[2305]: Failed to start pull emails with syncmaildir.
nov 15 16:16:14 angela systemd[2305]: Starting pull emails with syncmaildir...
nov 15 16:16:17 angela smd-pull[27178]: smd-client: ERROR: Network error.
nov 15 16:16:17 angela smd-pull[27178]: smd-client: ERROR: Unable to get any data from the other endpoint.
nov 15 16:16:17 angela smd-pull[27178]: smd-client: ERROR: This problem may be transient, please retry.
nov 15 16:16:17 angela smd-pull[27178]: smd-client: ERROR: Hint: did you correctly setup the SERVERNAME variable
nov 15 16:16:17 angela smd-pull[27178]: smd-client: ERROR: on your client? Did you add an entry for it in your ssh
nov 15 16:16:17 angela smd-pull[27178]: smd-client: ERROR: configuration file?
nov 15 16:16:17 angela smd-pull[27178]: smd-client: ERROR: Network error
nov 15 16:16:17 angela smd-pull[27188]: register: smd-client@localhost: TAGS: error::context(handshake) probable-cause(network) human-intervention(avoidable) suggested-actions(retry)
nov 15 16:16:17 angela systemd[2305]: smd-pull.service: Main process exited, code=exited, status=1/FAILURE
nov 15 16:16:17 angela systemd[2305]: smd-pull.service: Failed with result 'exit-code'.
nov 15 16:16:17 angela systemd[2305]: Failed to start pull emails with syncmaildir.
What is frustrating is that there's actually no network error here. Running the command by hand I did see a different message, but now I have lost it in my backlog. It had something to do with a filename being too long, and I gave up debugging after a while. This happened suddenly too, which added to the confusion. In a fit of rage I started this blog post and began experimenting with alternatives, which led me down a lot of rabbit holes. Reviewing my previous mail crash documentation, it seems most solutions involve talking to an IMAP server, so I figured I would just do that. Wanting to try something new, I gave isync (AKA mbsync) a try. Oh dear, I did not expect how much trouble just talking to my IMAP server would be, which wasn't isync's fault, for what that's worth. It was the primary tool I used to debug things, and served me well in that regard.

Mailbox corruption The first thing I found out is that certain messages in the IMAP spool were corrupted. mbsync would stop on a FETCH command and Dovecot would give me those errors on the server side.

"wrong W value"
nov 16 15:31:27 marcos dovecot[3621800]: imap(anarcat)<3630489><wAmSzO3QZtfAqAB1>: Error: Mailbox junk: Maildir filename has wrong W value, renamed the file from /home/anarcat/Maildir/.junk/cur/1454623938.M101164P22216.marcos,S=2495,W=2578:2,S to /home/anarcat/Maildir/.junk/cur/1454623938.M101164P22216.marcos,S=2495:2,S
nov 16 15:31:27 marcos dovecot[3621800]: imap(anarcat)<3630489><wAmSzO3QZtfAqAB1>: Error: Mailbox junk: Deleting corrupted cache record uid=1582: UID 1582: Broken virtual size in mailbox junk: read(/home/anarcat/Maildir/.junk/cur/1454623938.M101164P22216.marcos,S=2495,W=2578:2,S): FETCH BODY[] got too little data: 2540 vs 2578
At least this first error was automatically healed by Dovecot (by renaming the file without the W= flag). The problem is that the FETCH command fails and mbsync exits noisily. So you need to constantly restart mbsync with a silly command like:
while ! mbsync -a; do sleep 1; done

"cached message size larger than expected"
nov 16 13:53:08 marcos dovecot[3520770]: imap(anarcat)<3594402><M5JHb+zQ3NLAqAB1>: Error: Mailbox Sent: UID=19288: read(/home/anarcat/Maildir/.Sent/cur/1224790447.M898726P9811V000000000000FE06I00794FB1_0.marvin,S=2588:2,S) failed: Cached message size larger than expected (2588 > 2482, box=Sent, UID=19288) (read reason=mail stream)
nov 16 13:53:08 marcos dovecot[3520770]: imap(anarcat)<3594402><M5JHb+zQ3NLAqAB1>: Error: Mailbox Sent: Deleting corrupted cache record uid=19288: UID 19288: Broken physical size in mailbox Sent: read(/home/anarcat/Maildir/.Sent/cur/1224790447.M898726P9811V000000000000FE06I00794FB1_0.marvin,S=2588:2,S) failed: Cached message size larger than expected (2588 > 2482, box=Sent, UID=19288)
nov 16 13:53:08 marcos dovecot[3520770]: imap(anarcat)<3594402><M5JHb+zQ3NLAqAB1>: Error: Mailbox Sent: UID=19288: read(/home/anarcat/Maildir/.Sent/cur/1224790447.M898726P9811V000000000000FE06I00794FB1_0.marvin,S=2588:2,S) failed: Cached message size larger than expected (2588 > 2482, box=Sent, UID=19288) (read reason=)
nov 16 13:53:08 marcos dovecot[3520770]: imap-login: Panic: epoll_ctl(del, 7) failed: Bad file descriptor
This second problem is much harder to fix, because Dovecot does not recover automatically. This is Dovecot complaining that the cached size (the S= field, but also present in Dovecot's metadata files) doesn't match the file size. I wonder if at least some of those messages were corrupted in the OfflineIMAP to syncmaildir migration, because part of that procedure is to run the strip_header script to remove content from the emails. That could easily have broken things, since the files do not also get renamed.

Workaround So I read a lot of the Dovecot documentation on the maildir format, and wrote an extensive fix script for those two errors. The script worked and mbsync was able to sync the entire mail spool. And no, rebuilding the index files didn't work. I also tried doveadm force-resync -u anarcat, which didn't do anything. In the end I also had to do this, because the wrong cache values were also stored elsewhere.
service dovecot stop ; find -name 'dovecot*' -delete; service dovecot start
This would have totally broken any existing clients, but thankfully I'm starting from scratch (except maybe webmail, but I'm hoping it will self-heal as well, assuming it only has a cache and not a full replica of the mail spool).
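The fix script mentioned above is not reproduced here; as a rough illustration only (an assumed approach, not the actual script), the idea amounts to renaming Maildir files whose recorded S= size does not match the file on disk, dropping the S=/W= fields so Dovecot recomputes them, much like Dovecot's own handling of the "wrong W value" case:
import re
from pathlib import Path

# Illustration only: drop mismatched S=/W= size fields from Maildir
# filenames so Dovecot recomputes them.
maildir = Path.home() / "Maildir"
for path in maildir.glob(".*/cur/*"):
    m = re.match(r"^(?P<base>[^,:]+)(?P<sizes>(?:,[SW]=\d+)+)(?P<flags>:2,.*)?$", path.name)
    if not m:
        continue
    s_field = re.search(r",S=(\d+)", m.group("sizes"))
    if s_field and int(s_field.group(1)) != path.stat().st_size:
        path.rename(path.with_name(m.group("base") + (m.group("flags") or "")))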

Incoherence between Maildir and IMAP Unfortunately, the first mbsync was incomplete as it was missing about 15,000 mails:
anarcat@angela:~(main)$ find Maildir -type f -type f -a \! -name '.*' | wc -l
384836
anarcat@angela:~(main)$ find Maildir-mbsync/ -type f -a \! -name '.*' | wc -l
369221
As it turns out, mbsync was not at fault here either: this was yet more mail spool corruption. It's actually 26 folders (out of 205) with inconsistent sizes, which can be found with:
for folder in * .[^.]* ; do
  printf "%s\t%d\n" $folder $(find "$folder" -type f -a \! -name '.*' | wc -l );
done
The special \! -name '.*' bit is to ignore the mbsync metadata, which creates .uidvalidity and .mbsyncstate in every folder. That only ignores about 200 files, but since they are spread around all the folders, they were making it impossible to review where the problem was. Here is what the diff looks like:
--- Maildir-list    2021-11-17 20:42:36.504246752 -0500
+++ Maildir-mbsync-list 2021-11-17 20:18:07.731806601 -0500
@@ -6,16 +6,15 @@
[...]
 .Archives  1
 .Archives.2010 3553
-.Archives.2011 3583
-.Archives.2012 12593
+.Archives.2011 3582
+.Archives.2012 620
 .Archives.2013 8576
 .Archives.2014 11057
-.Archives.2015 8173
+.Archives.2015 8165
 .Archives.2016 54
 .band  34
 .bitbuck   1
@@ -38,13 +37,12 @@
 .couchsurfers  2
-cur    11285
+cur    11280
 .current   130
 .cv    2
 .debbug    262
-.debian    37544
-drafts 1
-.Drafts    4
+.debian    37533
+.Drafts    2
 .drone 241
 .drupal    188
 .drupal-devel  303
[...]

Misfiled messages It's a bit all over the place, but we can already notice some huge differences between mailboxes, for example in the Archives folders. As it turns out, at least 12,000 of those missing mails were actually misfiled: instead of being in the Maildir/.Archives.2012/cur/ folder, they were directly in Maildir/.Archives.2012/. This is something that doesn't matter for SMD (and possibly for notmuch? actually, it does matter: notmuch suddenly found 12,000 new mails) but that definitely matters to Dovecot and therefore mbsync... After moving those files around, we still have 4,000 messages missing:
anarcat@angela:~(main)$ find Maildir-mbsync/ -type f -a \! -name '.*' | wc -l
381196
anarcat@angela:~(main)$ find Maildir/ -type f -a \! -name '.*' | wc -l
385053
The problem is that those 4,000 missing mails are harder to track. Take, for example, .Archives.2011, which has a single message missing, out of 3,582. And the files are not identical: the checksums don't match after going through the IMAP transport, so we can't use a tool like hashdeep to compare the trees and find why any single file is missing.

"register" folder One big chunk of the 4,000, however, is a special folder called register in my spool, which I am syncing separately (see Securing registration email for details on that setup). That actually covers 3,700 of those messages, so I actually have a more modest 300 messages to figure out, after (easily!) configuring mbsync to sync that folder separately:
 @@ -30,9 +33,29 @@ Slave :anarcat-local:
  # Exclude everything under the internal [Gmail] folder, except the interesting folders
  #Patterns * ![Gmail]* "[Gmail]/Sent Mail" "[Gmail]/Starred" "[Gmail]/All Mail"
  # Or include everything
 -Patterns *
 +#Patterns *
 +Patterns * !register  !.register
  # Automatically create missing mailboxes, both locally and on the server
  #Create Both
  Create slave
  # Sync the movement of messages between folders and deletions, add after making sure the sync works
  #Expunge Both
 +
 +IMAPAccount anarcat-register
 +Host imap.anarc.at
 +User register
 +PassCmd "pass imap.anarc.at-register"
 +SSLType IMAPS
 +CertificateFile /etc/ssl/certs/ca-certificates.crt
 +
 +IMAPStore anarcat-register-remote
 +Account anarcat-register
 +
 +MaildirStore anarcat-register-local
 +SubFolders Maildir++
 +Inbox ~/Maildir-mbsync/.register/
 +
 +Channel anarcat-register
 +Master :anarcat-register-remote:
 +Slave :anarcat-register-local:
 +Create slave

"tmp" folders and empty messages After syncing the "register" messages, I end up with the measly little 160 emails out of sync:
anarcat@angela:~(main)$ find Maildir-mbsync/ -type f -a \! -name '.*' | wc -l
384900
anarcat@angela:~(main)$ find Maildir/ -type f -a \! -name '.*' | wc -l
385059
Argh. After more digging, I have found 131 mails in the tmp/ directories of the client's mail spool. Mysterious! On the server side, it's even more files, and not the same ones. Possibly those were mails left there during a failed delivery of some sort, or during a power failure or some sort of crash? Who knows. It could be another race condition in SMD if it runs while mail is being delivered in tmp/... The first thing to do with those is to clean up a bunch of empty files (21 on angela):
find .[^.]*/tmp -type f -empty -delete
As it turns out, they are all duplicates, in the sense that notmuch can easily find a copy of files with the same message ID in its database. In other words, this hairy command returns nothing:
find .[^.]*/tmp -type f | while read path; do
  msgid=$(grep -m 1 -i ^message-id "$path" | sed 's/Message-ID: //i;s/[<>]//g');
  if notmuch count --exclude=false "id:$msgid" | grep -q 0; then
    echo "$path <$msgid> not in notmuch" ;
  fi;
done
... which is good. Or, to put it another way, this is safe:
find .[^.]*/tmp -type f -delete
Poof! 314 mails cleaned on the server side. Interestingly, SMD doesn't pick up on those changes at all and still sees files in tmp/ directories on the client side, so we need to operate the same twisted logic there.

notmuch to the rescue again After cleaning that on the client, we get:
anarcat@angela:~(main)$ find Maildir/ -type f -a \! -name '.*' | wc -l
384928
anarcat@angela:~(main)$ find Maildir-mbsync/ -type f -a \! -name '.*' | wc -l
384901
Ha! 27 mails difference. Those are the really sticky, unclear ones. I was hoping a full sync might clear that up, but after deleting the entire directory and starting from scratch, I end up with:
anarcat@angela:~(main)$ find Maildir -type f -type f -a \! -name '.*' | wc -l
385034
anarcat@angela:~(main)$ find Maildir-mbsync -type f -type f -a \! -name '.*' | wc -l
384993
That is: even more messages missing (now 37). Sigh. Thankfully, this is something notmuch can help with: it can index all files by Message-ID (which I learned is case-insensitive, yay) and tell us which messages don't make it through. Considering the corruption I found in the mail spool, I wouldn't be the least surprised those messages are just skipped by the IMAP server. Unfortunately, there's nothing on the Dovecot server logs that would explain the discrepancy. Here again, notmuch comes to the rescue. We can list all message IDs to figure out that discrepancy:
notmuch search --exclude=false --output=messages '*' | pv -s 18M | sort > Maildir-msgids
notmuch --config=.notmuch-config-mbsync search --exclude=false --output=messages '*' | pv -s 18M | sort > Maildir-mbsync-msgids
And then we can see how many messages notmuch thinks are missing:
$ wc -l *msgids
372723 Maildir-mbsync-msgids
372752 Maildir-msgids
That's 29 messages. Oddly, it doesn't exactly match the find output:
anarcat@angela:~(main)$ find Maildir-mbsync -type f -type f -a \! -name '.*' | wc -l
385204
anarcat@angela:~(main)$ find Maildir -type f -type f -a \! -name '.*' | wc -l
385241
That is 10 more messages. Ugh. But actually, I know what those are: more misfiled messages (in a .folder/draft/ directory, bizarrely), so the totals actually match. In the notmuch output, there's a lot of stuff like this:
id:notmuch-sha1-fb880d673e24f5dae71b6b4d825d4a0d5d01cde4
Those are messages without a valid Message-ID. Notmuch (presumably) constructs one based on the file's checksum. Because the files differ between the IMAP server and the local mail spool (which is unfortunate, but possibly inevitable), those do not match. There are exactly the same number of those on both sides, so I'll go ahead and assume those are all accounted for. What remains is:
anarcat@angela:~(main)$ diff -u Maildir-mbsync-msgids Maildir-msgids | grep '^\-[^-]' | grep -v sha1 | wc -l
2
anarcat@angela:~(main)$ diff -u Maildir-mbsync-msgids Maildir-msgids | grep '^\+[^+]' | grep -v sha1 | wc -l
21
anarcat@angela:~(main)$ 
i.e. 21 missing from mbsync, and, surprisingly, 2 missing from the original mail spool. Further inspection also showed they were all messages with some sort of "corruption": no body and only headers. I am not sure that is a legal email format in the first place. Since they were mostly spam or administrative emails ("You have been unsubscribed from mailing list..."), it seems fairly harmless to ignore those.

Conclusion As we'll see in the next article, SMD has stellar performance. But that comes at a huge cost: it accesses the mail storage directly. This can create (and has created) significant problems on the mail server. It's unclear exactly why those things happen, but Dovecot expects a particular storage format for its files, and it seems unwise to bypass that. In the future, I'll try to remember to avoid that, especially since mechanisms like SMD require special server access (SSH) which, in the long term, I am not sure I want to maintain or expect. In other words, just talking with an IMAP server opens up a lot more hosting possibilities than setting up a custom synchronisation protocol over SSH. It's also safer and more reliable, as we have seen. Thankfully, I've been able to recover from all the errors I could find, but it could have gone differently and it would have been possible for SMD to permanently corrupt a significant part of my mail archives. In the end, however, the last straw was just another weird bug which, ironically, SMD mysteriously recovered from on its own while I was writing this documentation and migrating away from it. In any case, I recommend SMD users start looking for alternatives. The project has been archived upstream, and the Debian package has been orphaned. I have seen significant mailbox corruption, including entire mail spool destruction, mostly due to incorrect locking code. I have filed a release-critical bug in Debian to make sure it doesn't ship with Debian bookworm. Alternatives like mbsync provide fast and reliable transport, including over SSH. See the next article for further discussion of the alternatives.

14 November 2021

Russ Allbery: Review: The Last Graduate

Review: The Last Graduate, by Naomi Novik
Series: The Scholomance #2
Publisher: Del Rey
Copyright: 2021
ISBN: 0-593-12887-7
Format: Kindle
Pages: 388
This is a direct sequel to A Deadly Education, by which I mean it starts in the same minute at which A Deadly Education ends (and let me say how grateful I am for a sequel that doesn't drop days, months, or years between books). You do not want to read this series out of order. This book is also very difficult to review without spoiling either it or the previous book, so please bear with me if I'm elliptical in my ravings. Because The Last Graduate is so good. So good, not only as a piece of writing, but as a combination of two of my favorite tropes in fiction, one of which I can't talk about because of spoilers. I adored this book in a way that is not entirely rational. I will attempt a review below anyway, but if you liked the first book, just stop reading here and go read the second one. It's more of everything I loved in the first book except even better, it did some things I was expecting and some things I didn't expect at all, and it's just so ridiculously good. Just be aware that it has another final-line cliffhanger. The third book is coming in (hopefully) 2022. Novik handles the cliffhanger at the end of the previous book beautifully, which is worth noting because there were so many ways in which it could have gone poorly. One of the best things about this series is Novik's skill at writing El's relationship with her mother, even though her mother has not appeared in the series so far. El argues with her mother's voice in her head, tells stories about her, wonders what her mother would think of her classmates (or in some cases knows exactly what her mother would think of her classmates), and sometimes makes the explicit decision to not be her mother. The relationship has the sort of messy complexity, shared history, and underlying respect that many people experience in life but that I've rarely seen portrayed this well in a fantasy novel. Novik's presentation of that relationship works because El's voice is so strong. Within fifteen minutes of starting The Last Graduate, I was already muttering "I love this book" to myself, mostly because of how much I enjoy El's sarcastic, self-deprecating internal commentary. Novik strikes a balance between self-awareness, snark, humor, and real character growth that rivals Murderbot in its effectiveness of first-person perspective. It carries the story over a few weak points, such as a romance that didn't do much for me. Even when I didn't care about part of the plot, I cared about El's opinion of the plot and what it said about El's growing understanding of how to navigate the world. A Deadly Education was scene and character establishment. El insisted on being herself and following her own morals and social rules, and through that found some allies. The Last Graduate gives El enough breathing space to make more nuanced decisions. This is the part of growing up where one realizes the limitations of one's knee-jerk reactions and innate moral judgment. It's also when it becomes hard to trust success that is entirely outside of one's previous experience. El was not a kid who had friends, so she doesn't know what to do with them now that she has them. She's barely able to convince herself that they are friends. This is one of the two fictional tropes I mentioned, the one that I can talk about (at least briefly) without major spoilers. I have such a soft spot for stubborn, sarcastic, principled characters who refuse to play by the social rules that they think are required to make friends and who then find friends who like them for themselves. 
The moment when they start realizing this has happened and have no idea how to deal with it or how to be a person who has friends is one I will happily read over and over again. I enjoyed this book from the beginning, but there were two points when it grabbed my heart and I was all in. The first one is a huge spoiler that I can't talk about. The second was this paragraph:
[She] came round to me and put her arm around my waist and said under her breath, "Hey, she can be taught," with a tease in her voice that wobbled a little, and when I looked at her, her eyes were bright and wet, and I put my arm around her shoulders and hugged her.
You'll know it when you get there. The Last Graduate also gives the characters other than El and Orion more room, which is part of how it handles the chosen one trope. It's been obvious since early in the first book that Orion is a sort of chosen one, and it becomes obvious to the reader that El may be as well. But Novik doesn't let the plot focus only on them; instead, she uses that trope to look at how alliances and collective action happen, and how no one can carry the weight by themselves. As El learns more and gains power, she also becomes less central to the plot resolution and has to learn how to be less self-reliant. This is not a book where one character is trained to save the world. It's a book where she manages to enlist the support of a kick-ass project manager and becomes part of a team. Middle books of a trilogy are notoriously challenging. Often they're travel books: the first book sets up a problem, the second book moves the characters both physically and emotionally into a position to solve the problem, and the third book is the payoff. Travel books often sag. They can feel obligatory but somewhat boring, like a chore on the way to the third-book climax. The Last Graduate is not a travel book; it is, instead, a pivot book, which is my favorite form of trilogy. It's a book that rewrites the problem the first book set up, both resolving it and expanding the scope beyond what the reader had expected. This is immensely satisfying when done well, and Novik does it extremely well. This is not a flawless book. There are some pacing hiccups, there is a romance angle that didn't work for me (although it does arrive at some character insights that I thought were spot on), and although I think Novik is doing something interesting with the trope, there is a lot of chosen one power escalation happening here. It's not the sort of book that I can claim is perfectly written. Instead, it's the sort of book that uses some of my favorite plot elements and emotional beats in such an effective way and with such a memorable character that I do not have it in me to care about any of the flaws. Your mileage may therefore vary, but I would be happy to read books like this until the end of time. As mentioned above, The Last Graduate ends on another cliffhanger. This time I was worried that Novik might have ended the series there, since there's enough of an internal climax that I could imagine some literary fiction (which often seems allergic to endings) would have stopped here. Thankfully, Novik's web site says this is not the case. The next year is going to be a difficult wait. The third book of this series is going to be incredibly difficult to write, and I hope Novik is up to the challenge she's made for herself. But she handled the transition between the first and second book so well, and this book is so good that I have a lot of hope. If the third book is half as good as I'm hoping, this is going to be one of my favorite fantasy series of all time. Followed by an as-yet-untitled third book. Rating: 10 out of 10

10 November 2021

Neil Williams: LetsEncrypt with Apache, Gunicorn and Debian Bullseye

This took me far too long to identify and debug, so I'm going to write it up here for my own reference and to possibly help others.
Background Upgrading an old codebase from Python2 on Buster to Python3 ready for Bullseye and from Django1 to Django2 (prepared for Django3). Everything is fine at this stage - the Django test server is happy with HTTP and it gives enough support to do the actual code changes to get to Python3. All well and good so far. The main purpose of this particular code was to support payments, so a chunk of the testing cannot be done without HTTPS, which is where things got awkward. This particular service needs HTTPS using LetsEncrypt and Apache2. To support Django, I typically use Gunicorn. All of this works with HTTP. Moving to HTTPS was easy to test using the default-ssl virtual host that comes with Apache2 in Debian. It's a static page and it worked well with https. The problems all start when trying to use this known-working HTTPS config with the other Apache virtual host to add support for the gunicorn proxy.
Apache reverse proxy AH00898 Error during SSL Handshake with remote server
Investigating Now that I know why this happened, it's easier to see what was happening. At the time, I was swamped in a plethora of options and permutations between the Django HTTPS options and the Apache VirtualHost SSL and proxy commands. Going through all of those took up huge amounts of time, all in the wrong area. In previous configurations using packages in Buster, gunicorn could simply run on http://localhost:8000 and Apache would proxy that as https. In the versions in Bullseye, this no longer works, and it is the handover from https in Apache to http in the proxy that is failing. Apache is using HTTPS because the LetsEncrypt certificates, created using dehydrated, are specified in the VirtualHost configuration. To fix the handshake error, the proxy server needs to know about the certificates created by dehydrated as well.
Gunicorn needs the certificates The clue is in the gunicorn help:
--keyfile FILE        SSL key file [None]
--certfile FILE       SSL certificate file [None]
The final part of the puzzle is that the certificates created by dehydrated are in a private location:
drwx------ 2 root root /var/lib/dehydrated/certs/
To test gunicorn, this will mean using sudo, but that's just a step towards running gunicorn as a systemd service (when access to the certs will not be a problem). Starting gunicorn using these options shows the proxy now being available at https://localhost:8000, which is a subtle but very important change.
Environment=LOGLEVEL=DEBUG WORKERS=4 LOGFILE=/var/log/gunicorn/site.log
ExecStart=/usr/bin/gunicorn3 site.wsgi --log-level $LOGLEVEL --log-file $LOGFILE --workers $WORKERS \
--certfile /var/lib/dehydrated/certs/site/cert.pem \
--keyfile /var/lib/dehydrated/certs/site/privkey.pem
The specified locations are symbolic links created by dehydrated to cope with regular updates of the certificates using cron.

29 September 2021

Ian Jackson: Rust for the Polyglot Programmer

Rust is definitely in the news. I'm definitely on the bandwagon. (To me it feels like I've been wanting something like Rust for many years.) There're a huge number of intro tutorials, and of course there's the Rust Book. A friend observed to me, though, that while there's a lot of "write your first simple Rust program" there's a dearth of material aimed at the programmer who already knows a dozen diverse languages, and is familiar with computer architecture, basic type theory, and so on. Or indeed, for the impatient and confident reader more generally. I thought I would have a go. Rust for the Polyglot Programmer is the result. Compared to much other information about Rust, Rust for the Polyglot Programmer is: After reading Rust for the Polyglot Programmer, you won't know everything you need to know to use Rust for any project, but should know where to find it. Thanks are due to Simon Tatham, Mark Wooding, Daniel Silverstone, and others, for encouragement, and helpful reviews including important corrections. Particular thanks to Mark Wooding for wrestling pandoc and LaTeX into producing a pretty good-looking PDF. Remaining errors are, of course, mine. Comments are welcome of course, via the Dreamwidth comments or Salsa issue or MR. (If you're making a contribution, please indicate your agreement with the Developer Certificate of Origin.)
edited 2021-09-29 16:58 UTC to fix Salsa link target, and 17:01 and 17:21 for minor grammar fixes



12 September 2021

Russ Allbery: DocKnot 5.00

This release is the culmination of a project that I've been wanting to do for two years, but nearly all the work was done in the past week. That experience made me rethink some of my skepticism, but I'll get to that part of the story later. In March of 1999, I got tired of writing HTML by hand and wrote a small program called spin that implemented a macro language that translated into HTML. This makes it one of the oldest programs for which I have a continuous development history, predating podlators by three months. I think only News::Gateway (now very dormant) and Term::ANSIColor (still under active development but very stable) are older, as long as I'm not counting orphaned packages like newsyslog. I've used spin continuously ever since. It's grown features and an ecosystem of somewhat hackish scripts to do web publishing things I've wanted over the years: journal entries like this one, book reviews, a simple gallery (with some now-unfortunate decisions about maximum image size), RSS feeds, and translation of lots of different input files into HTML. But the core program itself, in all those years, has been one single Perl script written mostly in my Perl coding style from the early 2000s before I read Perl Best Practices. My web site is long overdue for an overhaul. Just to name a couple of obvious problems, it looks like trash on mobile browsers, and I'm using URL syntax from the early days of the web that, while it prompts some nostalgia for tildes, means all the URLs are annoyingly long and embed useless information such as the fact each page is written in HTML. Its internals also use a lot of ad hoc microformats (a bit of RFC 2822 here, a text-based format with significant indentation there, a weird space-separated database) and are supported by programs that extract meaning from human-written pages and perform automated updates to them rather than having a clear separation between structure and data. This will be a very large project, but it's the sort of quixotic personal project that I enjoy. Maintaining my own idiosyncratic static site generator is almost certainly not an efficient use of my time compared to, say, converting everything to Hugo. But I have 3,428 pages (currently) written in the thread macro language, plus numerous customizations that cater to my personal taste and interests, and, most importantly, I like having a highly customized system that I know exactly how to automate. The blocker has been that I didn't want to work on spin as it existed. It badly needed a structural overhaul and modernization, and even more badly needed a test suite since every release involved tedious manual testing by poring over diffs between generations of the web site. And that was enough work to be intimidating, so I kept putting it off. I've separately been vaguely aware that I have been spending too much time reading Twitter (specifically) and the news (in general). It would be one thing if I were taking in that information to do something productive about it, but I haven't been. It's just doomscrolling. I've been thinking about taking a break for a while but it kept not sticking, so I decided to make a concerted effort this week. It took about four days to stop wanting to check Twitter and forcing myself to go do something else productive or at least play a game instead. Then I managed to get started on my giant refactoring project, and holy shit, Twitter has been bad for my attention span! I haven't been able to sustain this level of concentration for hours at a time in years.
Twitter's not the only thing to blame (there are a few other stressors that I've fixed in the past couple of years), but it's obviously a huge part. Anyway, this long personal ramble is prelude to the first release of DocKnot that includes my static site generator. This is not yet the full tooling from my old web tools page; specifically, it's missing faq2html, cl2xhtml, and cvs2xhtml. (faq2html will get similar modernization treatment, cvs2xhtml will probably be rewritten in Perl since I have some old, obsolete scripts that may live in CVS forever, and I may retire cl2xhtml since I've stopped using the GNU ChangeLog format entirely.) But DocKnot now contains the core of my site generation system, including the thread macro language, POD conversion (by way of Pod::Thread), and RSS feeds. Will anyone else ever use this? I have no idea; realistically, probably not. If you were starting from scratch, I'm sure you'd be better off with one of the larger and more mature static site generators that's not the idiosyncratic personal project of one individual. It is packaged for Debian because it's part of the tool chain for generating files (specifically README.md) that are included in every package I maintain, and thus is part of the transitive closure of Debian main, but I'm not sure anyone will install it from there for any other purpose. But for once making something for someone else isn't the point. This is my quirky, individual way to maintain web sites that originated in an older era of the web and that I plan to keep up-to-date (I'm long overdue to figure out what they did to HTML after abandoning the XHTML approach) because it brings me joy to do things this way. In addition to adding the static site generator, this release also has the regular sorts of bug fixes and minor improvements: better formatting of software pages for software that's packaged for Debian, not assuming every package has a TODO file, and ignoring Autoconf 2.71 backup files when generating distribution tarballs. You can get the latest version of DocKnot from CPAN as App-DocKnot, or from its distribution page. I know I haven't yet updated my web tools page to reflect this move, or changed the URL in the footer of all of my pages. This transition will be a process over the next few months and will probably prompt several more minor releases.

9 September 2021

Bits from Debian: DebConf21 online closes

DebConf21 group photo On Saturday 28 August 2021, the annual Debian Developers and Contributors Conference came to a close. DebConf21 has been held online for the second time, due to the coronavirus (COVID-19) disease pandemic. All of the sessions have been streamed, with a variety of ways of participating: via IRC messaging, online collaborative text documents, and video conferencing meeting rooms. With 740 registered attendees from more than 15 different countries and a total of over 70 event talks, discussion sessions, Birds of a Feather (BoF) gatherings and other activities, DebConf21 was a large success. The setup made for former online events (involving Jitsi, OBS, Voctomix, SReview, nginx, Etherpad, and a web-based frontend for Voctomix) has been improved and used for DebConf21 successfully. All components of the video infrastructure are free software, and configured through the Video Team's public ansible repository. The DebConf21 schedule included a wide variety of events, grouped in several tracks. The talks have been streamed using two rooms, and several of these activities have been held in different languages: Telugu, Portuguese, Malayalam, Kannada, Hindi, Marathi and English, allowing a more diverse audience to enjoy and participate. Between talks, the video stream has been showing the usual sponsors on the loop, but also some additional clips including photos from previous DebConfs, fun facts about Debian and short shout-out videos sent by attendees to communicate with their Debian friends. The Debian publicity team did the usual live coverage to encourage participation with micronews announcing the different events. The DebConf team also provided several mobile options to follow the schedule. For those who were not able to participate, most of the talks and sessions are already available through the Debian meetings archive website, and the remaining ones will appear in the following days. The DebConf21 website will remain active for archival purposes and will continue to offer links to the presentations and videos of talks and events. Next year, DebConf22 is planned to be held in Prizren, Kosovo, in July 2022. DebConf is committed to a safe and welcoming environment for all participants. During the conference, several teams (Front Desk, Welcome team and Community team) have been available to help so participants get their best experience in the conference, and find solutions to any issue that may arise. See the web page about the Code of Conduct on the DebConf21 website for more details on this. Debian thanks the commitment of numerous sponsors to support DebConf21, particularly our Platinum Sponsors: Lenovo, Infomaniak, Roche, Amazon Web Services (AWS) and Google. About Debian The Debian Project was founded in 1993 by Ian Murdock to be a truly free community project. Since then the project has grown to be one of the largest and most influential open source projects. Thousands of volunteers from all over the world work together to create and maintain Debian software. Available in 70 languages, and supporting a huge range of computer types, Debian calls itself the universal operating system. About DebConf DebConf is the Debian Project's developer conference. In addition to a full schedule of technical, social and policy talks, DebConf provides an opportunity for developers, contributors and other interested people to meet in person and work together more closely.
It has taken place annually since 2000 in locations as varied as Scotland, Argentina, and Bosnia and Herzegovina. More information about DebConf is available from https://debconf.org/. About Lenovo As a global technology leader manufacturing a wide portfolio of connected products, including smartphones, tablets, PCs and workstations as well as AR/VR devices, smart home/office and data center solutions, Lenovo understands how critical open systems and platforms are to a connected world. About Infomaniak Infomaniak is Switzerland's largest web-hosting company, also offering backup and storage services, solutions for event organizers, live-streaming and video on demand services. It wholly owns its datacenters and all elements critical to the functioning of the services and products provided by the company (both software and hardware). About Roche Roche is a major international pharmaceutical provider and research company dedicated to personalized healthcare. More than 100.000 employees worldwide work towards solving some of the greatest challenges for humanity using science and technology. Roche is strongly involved in publicly funded collaborative research projects with other industrial and academic partners and have supported DebConf since 2017. About Amazon Web Services (AWS) Amazon Web Services (AWS) is one of the world's most comprehensive and broadly adopted cloud platform, offering over 175 fully featured services from data centers globally (in 77 Availability Zones within 24 geographic regions). AWS customers include the fastest-growing startups, largest enterprises and leading government agencies. About Google Google is one of the largest technology companies in the world, providing a wide range of Internet-related services and products such as online advertising technologies, search, cloud computing, software, and hardware. Google has been supporting Debian by sponsoring DebConf for more than ten years, and is also a Debian partner sponsoring parts of Salsa's continuous integration infrastructure within Google Cloud Platform. Contact Information For further information, please visit the DebConf21 web page at https://debconf21.debconf.org/ or send mail to press@debian.org.

5 September 2021

Antoine Beaupré: Automating major Debian upgrades

It's major upgrade time again! The Debian project just published the Debian 11 "bullseye" release, and it's pretty awesome! This made me realize that I have never written here about my peculiar upgrade process, and figured it was worth bringing that up to a wider audience. My upgrade process also has a notable changes section which includes major version changes (e.g. Inkscape 1.0!), new packages (e.g. podman!) and important behavior changes (e.g. driverless scanning and printing!). I'm particularly interested to hear about any significant change I might have missed. If you know of a cool new package that shipped with bullseye and that I forgot, do let me know! But that's for the cool new stuff. We need to talk about the problems with Debian major upgrades.

Background I have been maintaining detailed upgrade guides, on my wiki, starting with the jessie release, but I have actually written such guides for Koumbit.org as far back as Debian squeeze in 2011 (another worker wrote the older Debian lenny upgrade guide in 2009). Koumbit, since then, has kept maintaining those guides all the way to the latest bullseye upgrade, through 7 major releases! Over the years, those guides evolved from a quick "cheat-sheet" format copied from the release notes into a more or less "scripted" form that I currently use. Each guide has a procedure made of a few steps that can be basically copy-pasted to batch-upgrade a host (or multiple hosts in parallel) as quickly as possible. There is also the predict-os script which allows you to keep track of progress of the upgrades in a Puppet cluster.

Limitations of the official procedure In comparison with my procedure, the official upgrade guide is mostly designed to upgrade a single machine, typically a workstation, with a rather slow and exhaustive process. The PDF version of the upgrade guide is 14 pages long! This, obviously, does not work when you have tens or hundreds of machines to upgrade. Debian upgrades are notorious for being extremely reliable, but we have a lot of packages, and there are always corner cases where the upgrade will just fail because of a bug specific to your environment. Those will only be fixed after some back and forth in the community (and that's assuming users report those bugs, which is not always the case). There's no obvious way to deploy "hot fixes" in this context, at least not without fixing the package and publishing it on an unofficial Debian archive while the official ones catch up. This is slow and difficult. Or some packages require manual labor. Examples of this are the PostgreSQL or Ganeti packages which require you to upgrade your clusters by hand, while the old and new packages live side by side. Debian packages bring you far in the upgrade process, but sometimes not all the way. Which means every Debian install needs to be manually upgraded and inspected when a new release comes out. That's slow and error prone and we can do better.

How to automate major upgrades I have a proposal to automate this. It's been mostly dormant in the Debian wiki for 5 years now. Fundamentally, this is a hard problem: Debian gets installed in so many different environments, from workstations to physical servers to virtual machines, embedded systems and so on, that it's extremely hard to come up with a "one size fits all" system. The (manual) procedure I'm using is mostly targeting servers, but I'm also using it on workstations. And I'll note that it's specific to my home setup: I have a different procedure at work, although it has a lot of common code. To automate this, I would factor out that common code with hooks where you could easily inject special code like "you need to upgrade ferm first", "you need an extra reboot here", or "this is how you finish the PostgreSQL upgrade". With Debian getting closer to a 2 year release cycle, and with the previous release being supported basically only one year after the new stable comes out, I feel more and more strongly that this needs better automation. So I'm thinking that I should write a prototype for this. Ubuntu has do-release-upgrade, but that is too Ubuntu-specific to be reused. An attempt at collaborating on this has been mostly met with silence from Ubuntu's side as well. I'm thinking of using something like Fabric, Mitogen, or Transilience: anything that will allow me to write simple, portable Python code that can run transparently on a local machine (for single-system upgrades, possibly with a GUI frontend) or on remote servers (for large clusters of servers, maybe with canaries and grouping using Cumin). I'll note that Koumbit started experimenting with Puppet Bolt in the bullseye upgrade process, but that feels too site-specific to be useful more broadly.
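To give an idea of the shape this could take, here is a deliberately naive sketch using Fabric (illustration only: the hostnames are made up, the commands are simplified, and the hook mechanism is exactly the kind of thing that would need fleshing out):
from fabric import Connection

def upgrade(host, pre_hooks=()):
    """Very rough sketch of a buster-to-bullseye upgrade of one host."""
    c = Connection(host)
    for hook in pre_hooks:
        hook(c)  # e.g. "upgrade ferm first" on the hosts that need it
    # naive sources.list rewrite; the real thing needs more care
    # (e.g. the bullseye-security suite rename)
    c.sudo("sed -i 's/buster/bullseye/g' /etc/apt/sources.list")
    c.sudo("apt-get update")
    c.sudo("DEBIAN_FRONTEND=noninteractive apt-get -y full-upgrade")
    c.sudo("reboot", warn=True)

for host in ("server1.example.com", "server2.example.com"):
    upgrade(host)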

Trade-offs I am not sure where this stands in the XKCD time trade-off evaluation, because the table doesn't actually cover the time frequency of Debian releases (which is basically "biennial") or the amount of time the upgrade would take across a cluster (which varies a lot, but which I estimate to be between one and six hours per machine). Assuming I have 80 machines to upgrade, that is 80 to 480 hours (between ~3 and 20 days) of work! It's unclear how much work such an automated system would shave off, however. Assuming things are an order of magnitude faster (say I upgrade 10 machines at a time), I would shave off between 3 and 18 days of work, which implies I might allow myself to spend a minimum of 5 days working on such a project.

The other option: never upgrade Before people mention those: I am aware of containers, Kubernetes, and other deployment mechanisms. Indeed, those may be a long-term solution, but we can't afford to migrate everything over to containers right now: that is a huge migration and a total paradigm shift. At that point, whatever is left might not even be Debian in the first place. And besides, if you run Kubernetes, you still need to run some OS underneath and upgrade that, so that problem never completely disappears. Still, maybe that's the final answer: never upgrade. For some stateless machines like DNS replicas or load balancers, that might make a lot of sense as there's no or little data to carry to the new host. But this implies a seamless and fast provisioning process, and we don't have that either: at my work, installing a machine takes about as long as upgrading it, and that's after a significant amount of work automating that process, partly by writing my own Debian installer with Fabric (!).

What is your process? I'm curious to hear what people think of those ideas. It strikes me as really odd that no one has really tackled that problem yet, considering how many clusters of Debian machines are out there. Surely people are upgrading those, and not following that slow step by step guide, right? I suspect everyone is doing the same thing: we all have our little copy-paste script we batch onto multiple machines, sometimes in parallel. That is what the Debian.org sysadmins are doing as well. There must be a better way. What is yours?

My upgrades so far So far, I have upgraded 2 out of my 3 home machines running buster -- the others have been installed directly in bullseye -- with only my main, old, messy server left. Upgrades have been pretty painless so far (see another report, for example), much better than the previous buster upgrade. Obviously, for my personal use, automating this is pointless. Work-side, however, is another story: we have over 80 boxes to upgrade there and that will take a while. The last upgrade cycle, to buster, took about two years to complete, so we might be done by the time the next release (12, "bookworm") comes out, but that's actually a full year after "buster" becomes EOL, so it's actually too late... At least I fixed the installers so that the new machines we create all ship with bullseye, so we stopped accumulating new buster hosts...
Thanks to lelutin and pabs for reviewing a draft of this post.

Reproducible Builds: Reproducible Builds in August 2021

Welcome to the latest report from the Reproducible Builds project. In this post, we round up the important things that happened in the world of reproducible builds in August 2021. As always, if you are interested in contributing to the project, please visit the Contribute page on our website.
There were a large number of talks related to reproducible builds at DebConf21 this year, the 21st annual conference of the Debian Linux distribution (full schedule):
PackagingCon (@PackagingCon) is a new conference for developers of package management software, as well as their related communities and stakeholders. The virtual event, which is scheduled to take place on the 9th and 10th November 2021, has a mission "to bring different ecosystems together: from Python's pip to Rust's cargo to Julia's Pkg, from Debian apt over Nix to conda and mamba, and from vcpkg to Spack we hope to have many different approaches to package management at the conference". A number of people from the reproducible builds community are planning on attending this new conference, and some may even present. Tickets start at $20 USD.
As reported in our May report, the president of the United States signed an executive order outlining policies aimed to improve the cybersecurity in the US. The executive order comes after a number of highly-publicised security problems such as a ransomware attack that affected an oil pipeline between Texas and New York and the SolarWinds hack that affected a large number of US federal agencies. As a followup this month, a detailed fact sheet was released announcing a number of large-scale initiatives that will undoubtedly be related to software supply chain security and, as a result, reproducible builds.
Lastly, we ran another productive meeting on IRC in August (original announcement) which ran for just short of two hours. A full set of notes from the meeting is available.

Software development kpcyrd announced an interesting new project this month called "I probably didn't backdoor this", which is an attempt to be:
a practical attempt at shipping a program and having reasonably solid evidence there's probably no backdoor. All source code is annotated and there are instructions explaining how to use reproducible builds to rebuild the artifacts distributed in this repository from source. The idea is shifting the burden of proof from "you need to prove there's a backdoor" to "we need to prove there's probably no backdoor". This repository is less about code (we're going to try to keep code at a minimum actually) and instead contains technical writing that explains why these controls are effective and how to verify them. You are very welcome to adopt the techniques used here in your projects. ( )
As the project's README goes on to mention: "the techniques used to rebuild the binary artifacts are only possible because the builds for this project are reproducible". This was also announced on our mailing list this month in a thread titled i-probably-didnt-backdoor-this: Reproducible Builds for upstreams. kpcyrd also wrote a detailed blog post about the problems surrounding Linux distributions (such as Alpine and Arch Linux) that distribute compiled Python bytecode in the form of .pyc files generated during the build process.
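As an aside, one reason such .pyc files tend to be unreproducible is that their header embeds the source file's modification time by default. A minimal, purely illustrative sketch of the hash-based alternative introduced by PEP 552 is below; the file names are hypothetical and this is not necessarily the approach discussed in kpcyrd's post:

import py_compile

# The default "timestamp" invalidation mode stores the source mtime and size
# in the .pyc header, so otherwise identical builds can differ byte-for-byte.
# Hash-based invalidation (Python 3.7+) stores a hash of the source instead.
py_compile.compile(
    "example.py",                              # hypothetical source file
    cfile="example.cpython-39.pyc",            # hypothetical output path
    invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
)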

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made a number of changes, including releasing version 180, version 181 and version 182, as well as the following changes:
  • New features:
    • Add support for extracting the signing block from Android APKs. [ ]
    • If we specify a suffix for a temporary file or directory within the code, ensure it starts with an underscore (ie. _ ) to make the generated filenames more human-readable. [ ]
    • Don't include short GCC lines that differ on a single prefix byte either. These are distracting, not very useful and are simply the strings(1) command's idea of the build ID, which is displayed elsewhere in the diff. [ ][ ]
    • Don't include specific .debug-like lines in the ELF-related output, as it is invariably a duplicate of the debug ID that exists better in the readelf(1) differences for this file. [ ]
  • Bug fixes:
    • Add a special case to SquashFS image extraction to not fail if we aren't the superuser. [ ]
    • Only use java -jar /path/to/apksigner.jar if we have an apksigner.jar as newer versions of apksigner in Debian use a shell wrapper script which will be rejected if passed directly to the JVM. [ ]
    • Reduce the maximum line length for calculating Wagner-Fischer, improving the speed of output generation a lot. [ ]
    • Don't require apksigner in order to compare .apk files using apktool. [ ]
    • Update calls (and tests) for the new version of odt2txt. [ ]
  • Output improvements:
    • Mention in the output if the apksigner tool is missing. [ ]
    • Profile diffoscope.diff.linediff and specialize. [ ][ ]
  • Logging improvements:
    • Format debug-level messages related to ELF sections using the diffoscope.utils.format_class. [ ]
    • Print the size of generated reports in the logs (if possible). [ ]
    • Include profiling information in --debug output if --profile is not set. [ ]
  • Codebase improvements:
    • Clarify a comment about the HUGE_TOOLS Python dictionary. [ ]
    • We can pass -f to apktool to avoid creating a strangely-named subdirectory. [ ]
    • Drop an unused File import. [ ]
    • Update the supported & minimum version of Black. [ ]
    • We don't use the logging variable in a specific place, so alias it to an underscore (ie. _ ) instead. [ ]
    • Update some various copyright years. [ ]
    • Clarify a comment. [ ]
  • Test improvements:
    • Update a test to check specific contents of SquashFS listings, otherwise it fails depending on the test system's user ID to username passwd(5) mapping. [ ]
    • Assign seen and expected values to local variables to improve contextual information in failed tests. [ ]
    • Don't print an orphan newline when the source code formatting test passes. [ ]

In addition, Santiago Torres Arias added support for Squashfs version 4.5 [ ] and Felix C. Stegerman suggested a number of small improvements to the output of the new APK signing block [ ]. Lastly, Chris Lamb uploaded python-libarchive-c version 3.1-1 to Debian experimental for the new 3.x branch; python-libarchive-c is used by diffoscope.

Distribution work In Debian, 68 reviews of packages were added, 33 were updated and 10 were removed this month, adding to our knowledge about identified issues. Two new issue types have been identified too: nondeterministic_ordering_in_todo_items_collected_by_doxygen and kodi_package_captures_build_path_in_source_filename_hash. kpcyrd published another monthly report on their work on reproducible builds within the Alpine and Arch Linux distributions, specifically mentioning rebuilderd, one of the components powering reproducible.archlinux.org. The report also touches on binary transparency, an important component for supply chain security. The @GuixHPC account on Twitter posted an infographic on what fraction of GNU Guix packages are bit-for-bit reproducible. Finally, Bernhard M. Wiedemann posted his monthly reproducible builds status report for openSUSE.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Elsewhere, it was discovered that when supporting various new language features and APIs for Android apps, the resulting APK files that are generated now vary wildly from build to build (example diffoscope output). Happily, it appears that a patch has been committed to the relevant source tree. This was also discussed on our mailing list this month in a thread titled Android desugaring and reproducible builds started by Marcus Hoffmann.

Website and documentation There were quite a few changes to the Reproducible Builds website and documentation this month, including:
  • Felix C. Stegerman:
    • Update the website self-build process to not use the buster-backports suite now that Debian Bullseye is the stable release. [ ]
  • Holger Levsen:
    • Add a new page documenting various package rebuilder solutions. [ ]
    • Add some historical talks and slides from DebConf20. [ ][ ]
    • Various improvements to the history page. [ ][ ][ ]
    • Rename the "Comparison protocol" documentation category to "Verification". [ ]
    • Update links to F-Droid documentation. [ ]
  • Ian Muchina:
    • Increase the font size of titles and de-emphasize event details on the talk page. [ ]
    • Rename the README file to README.md to improve the user experience when browsing the Git repository in a web browser. [ ]
  • Mattia Rizzolo:
    • Drop a position:fixed CSS statement that was interacting negatively with some width settings. [ ]
    • Fix the sizing of the elements inside the side navigation bar. [ ]
    • Show gold level sponsors and above in the sidebar. [ ]
    • Updated the documentation within reprotest to mention how ldconfig conflicts with the kernel variation. [ ]
  • Roland Clobus:
    • Added a ticket number for the issue with the live Cinnamon image and diffoscope. [ ]

Testing framework The Reproducible Builds project runs a testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Debian-related changes:
      • Make a large number of changes to support the new Debian bookworm release, including adding it to the dashboard [ ], starting to schedule tests [ ], adding suitable Apache redirects [ ] etc. [ ][ ][ ][ ][ ]
      • Make the first build use LANG=C.UTF-8 to match the official Debian build servers. [ ]
      • Only test Debian Live images once a week. [ ]
      • Upgrade all nodes to use Debian Bullseye [ ] [ ]
      • Update README documentation for the Debian Bullseye release. [ ]
    • Other changes:
      • Only include rsync output if the $DEBUG variable is enabled. [ ]
      • Don't try to install mock, a tool used to build Fedora packages some time ago. [ ]
      • Drop an unused function. [ ]
      • Various documentation improvements. [ ][ ]
      • Improve the node health check to detect zombie jobs. [ ]
  • Jessica Clarke (FreeBSD-related changes):
    • Update the location and branch name for the main FreeBSD Git repository. [ ]
    • Correctly ignore the source tarball when comparing build results. [ ]
    • Drop an outdated version number from the documentation. [ ]
  • Mattia Rizzolo:
    • Block F-Droid jobs from running whilst the setup is running. [ ]
    • Enable debugging for the rsync job related to Debian Live images. [ ]
    • Pass BUILD_TAG and BUILD_URL environment for the Debian Live jobs. [ ]
    • Refactor the master_wrapper script to use a Bash array for the parameters. [ ]
    • Prefer YAML s safe_load() function over the unsafe variant. [ ]
    • Use the correct variable in the Apache config to match possible existing files on disk. [ ]
    • Stop issuing HTTP 301 redirects for things that are not actually permanent. [ ]
  • Roland Clobus (Debian live image generation):
    • Increase the diffoscope timeout from 120 to 240 minutes; the Cinnamon image should now be able to finish. [ ]
    • Use the new snapshot service. [ ]
    • Make a number of improvements to artifact handling, such as moving the artifacts to the Jenkins host [ ] and correctly cleaning them up at the right time. [ ][ ][ ]
    • Where possible, link to the Jenkins build URL that created the artifacts. [ ][ ]
    • Only allow one job to run at the same time. [ ]
  • Vagrant Cascadian:
    • Temporarily disable armhf nodes for DebConf21. [ ][ ]

Lastly, if you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can get in touch with us via:

3 September 2021

Reproducible Builds (diffoscope): diffoscope 183 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 183. This version includes the following changes:
[ Chris Lamb ]
* Add support for extracting Android signing blocks.
  (Closes: reproducible-builds/diffoscope#246)
* Format debug messages for elf sections using our
  diffoscope.utils.format_class utility.
* Clarify a comment about the HUGE_TOOLS dict in diffoscope.external_tools.
[ Felix C. Stegerman ]
* Clarify output around APK Signing Blocks and remove an accidental duplicate
  "0x" prefix.
You can find out more by visiting the project homepage.

30 August 2021

Andrew Cater: Oh, my goodness, where's the fantastic barbeque [OMGWTFBBQ 2021]

I'm guessing the last glasses will be through the dishwasher (again) and Pepper the dog can settle down without having to cope with so many people. For those who don't know - Steve and his wife Jo (Sledge and Randombird) hold a barbeque in their garden every August Bank Holiday weekend [UK Bank Holiday on the last Monday in August]. The barbeque is not small - it's the dominating feature in the suburban garden: brick built, with a dedication stone, lights, electricity. The garden is small, generally made smaller by forty or so Debian friends and allies standing and sitting around. People are talking, arguing, hugging people they've not seen for (literal) years and putting the world to rights. This is Debian's central point - with large quantities of meat and salads, an amount of beer/alcohol and "Cambridge gin" and general goodwill. This year was more than usually atmospheric because for some of us it was the first time with a large group of people in a while. Side conversations abound: for me it was learning something about the high energy particle physics community, how to precision build helicopters, fly quadcopters and precision 3D print anything, the maths of Isy counting crochet stitches to sew together randomly sized squares ... and, of course, obligatory things like how random is random and what's good enough entropy. And a few sessions of the game of our leader.
This is also a place for stuff to get done: I was unashamedly using this to upgrade the storage in my laptop while there were sensible engineers around. A corner of the table, a RattusRattus, and it was quickly sorted - then a discussion around the internals of Thinkpads as he took his apart. Then getting a full install - Gb Ethernet to the Debian mirror in the cupboard six feet away is faster bandwidth than a jumbo jet full of tapes. Then getting mail to work again - it's handy when the mailserver owner is next to you, having come in from the garden to help - and finally IRC. And not just me: "You need a GPG key signed - there's three DPLs here, there's a release manager - but you've just missed one of the DAMs." plus an in-depth GPG how-to session on the other side of the table. I was the luckiest one with the most comfortable bed in the house overnight but I couldn't stay for last night. Thanks once again to all involved but especially Steve and Jo who do this for the love of it, and the fun, and the community and the family. Oh, and thanks to Lenovo - not just for being a platinum sponsor of Debconf but also for providing the official laptop of this and most Debian occasions.

Thomas Goirand: developers-reference needs love

During Debconf, Holger, who's one of the developers-reference maintainers, gave a quick presentation explaining that the developers-reference needs some love. Indeed, it has gathered dust, and a refresh would be very welcome. Holger pointed at the list of bugs:
https://bugs.debian.org/src:developers-reference After having a quick look into that list, after Holger's Debconf presentation, I wrote to him on IRC: <zigo> Many of the bugs you refered are indeed easily actionable, if all of us just try to help for one bug, that'd be a huge improvement of that doc. Then, as I was waiting for the closing ceremony of Debconf, I thought I shouldn't just say it, but actually do something about it. I decided to address https://bugs.debian.org/793633 as I thought it was easy. In just a few minutes, I was able to do a first patch, as seen here: https://salsa.debian.org/debian/developers-reference/-/merge_requests/27 I wrote about it on IRC, and a few people helped with rephrasing what was there (thanks to Fil for correcting my English mistakes, and others for the content). Today, which is 2 days after the MR was opened, I decided that was enough time to gather comments and merged it. So we now have a brand new shiny chapter about Backports and how to handle them. I'm sure that new part is perfectible, so do not hesitate, and do patch what I just wrote if you feel you can do better. If I'm writing this blog post, it is not to promote myself. The goal is to promote the developers-reference manual and push others in Debian to do the same. Please do what Holger suggested, and what I just did: contribute to the document by addressing just one of the currently opened bugs. If all DDs do it, we'll get a much nicer document, and help others to contribute to Debian. This is going to take less than 30 minutes of your time, and it is very much ok if you do this only once. It is really easy: just clone https://salsa.debian.org/debian/developers-reference/ and write a patch. If you're a DD, you can even merge your patch yourself once you're satisfied with it.

23 August 2021

Pavit Kaur: GSoC 2021: Final Evaluation

Project: Incremental Improvements to Debian's CI platform
Project Link: https://summerofcode.withgoogle.com/projects/#6433686825730048
Code Repository: https://salsa.debian.org/ci-team/debci
Mentors: Antonio Terceiro and Paul Gevers

About the Project Debian Continuous Integration is Debian's CI platform. It runs tests on the packages published in the Debian archive, and today is used to control migration of packages from unstable, Debian's development area, to testing, the area of the archive where the next Debian release is being prepared. This makes it a crucial part of Debian's infrastructure. The web platform shows the results of all the tests executed. Debian CI provides developers both an API and a GUI Self-Service section to request tests for packages and get information on test history. This project involves implementing incremental improvements to the platform, making it easier to use and maintain.

Deliverables of the project:
  • Migrating Logins to Salsa
  • Adding support for testing security uploads and Debian LTS

Work done

Migrating Logins to Salsa
  • Originally, Debian CI used Debian SSO client certificates for logging in, but since that was deprecated, logins were migrated to Salsa, Debian's GitLab instance. This is implemented with the help of OmniAuth, the Ruby authentication framework.
  • Another thing fixed as part of this task was a limitation in the existing database structure: usernames were stored directly in the test requestor field, so that relationship was normalized with a proper foreign key to the users table.
Merge Requests:
Adding support for testing security uploads and Debian LTS In this task, work was done to enable private tests in Debian CI, adding support for testing security uploads and Debian LTS. It includes tasks ranging from adding the private field to the API and the Self-Service section when requesting tests, to adding an option to publish those tests when ready through both interfaces. Merge Requests:
Others I worked on some issues to add usability improvements for the web interface:

Learning Takeaways I learned a lot throughout the entire journey. Some of the things to mention are:
  • Writing tests: I learned about using Rspec for writing tests in ruby and understood how writing tests is an integral part of coding any project.
  • Using good coding practices: Through the reviews of my merge requests done by my mentor Antonio, I came to know about good coding practices which help in keeping the code both clean and concise.

Acknowledgement I owe huge thanks to my mentors Antonio Terceiro and Paul Gevers for their constant support and guidance throughout the entire duration. It is thanks to them that I was able to get started with the code-base of the project in the first place. And while working on the project, they were extremely responsive to all my queries. My merge requests were all thoroughly reviewed, which enabled me to learn more and work more efficiently. It was a complete pleasure interacting with them on weekly meet calls. To sum up, my GSoC journey was awesome and I had a fun and productive summer with Debian.

What's next? As part of a continuation of my project task Adding support for testing security uploads and Debian LTS, I will be working on Debian CI to facilitate the process of connecting the embargoed archive. It's the end of my GSoC journey but certainly not the end of my journey with Debian or Open Source. I plan to stay an active contributor to Debian and get involved with more Open Source projects as well.

Extras The link to my Salsa Profile for all my work activities can be found here.
To know more about my journey, my other GSoC blog posts can be found below:

17 August 2021

Russ Allbery: Review: Black Sun

Review: Black Sun, by Rebecca Roanhorse
Series: Between Earth and Sky #1
Publisher: Saga Press
Copyright: October 2020
ISBN: 1-5344-3769-X
Format: Kindle
Pages: 454
Serapio has been crafted and trained to be the vessel for a god. He grew up in Obregi land, far from his ancestral home, but he will return to Tova at the appropriate time and carry the hopes of the Carrion Crow clan with him. Xiala is a ship captain, a woman, and a Teek. That means she's a target. Teek have magic, which makes them uncanny and dangerous. They're also said to carry that magic in their bones, which makes them valuable in ways that are not pleasant for the Teek. Running afoul of the moral codes of Cuecola is therefore even more dangerous to her than it would be to others, which is why she accepts a bargain to run errands for a local lord for twelve years, paid at the end of that time with ownership of a ship and crew. The first task: ferry a strange man to the city of Tova. Meanwhile, in Tova, the priestess Naranpa has clawed her way to the top of the Sky Made hierarchy from an inauspicious beginning in the poor district of Coyote's Maw. She's ruthlessly separated herself from her despised beginnings and focused her attention on calming Tova in advance of the convergence, a rare astronomical alignment at the same time as the winter solstice. But Carrion Crow holds a deep-seated grudge at their slaughter by the priesthood during the Night of Knives, and Naranpa's position atop the religious order that partly rules Tova's fractious politics is more precarious than she thinks. I am delighted that more fantasy is drawing on mythologies and histories other than the genre default of western European. It's long overdue for numerous reasons and a trend to be rewarded. But do authors writing fantasy in English who reach for Mesoamerican cultures have to gleefully embrace the excuse to add more torture? I'm developing an aversion to this setting (which I do not want to do!) because every book seems to feature human sacrifice, dismemberment, or some other horror show. Roanhorse at least does not fill the book with that (there's lingering child abuse but nothing as sickening as the first chapter), but that makes the authorial choice to make the torture one's first impression of this book even odder. Our introduction to Serapio is a scene that I would have preferred to have never read, and I don't think it even adds much to the plot. Huge warnings for people who don't want to read about a mother torturing her son, or about eyes in that context. Once past that introduction, Black Sun settles into a two-thread fantasy, one following Xiala and Serapio's sea voyage and the other following Naranpa and the political machinations in Tova. Both the magic systems and the political systems are different enough to be refreshing, and there are a few bits of world-building I enjoyed (a city built on top of rocks separated by deep canyons and connected with bridges, giant intelligent riding crows, everything about the Teek). My problem was that I didn't care what happened to any of the characters. Naranpa spends most of the book dithering and whining despite a backstory that should have promised more dynamic and decisive responses. The other character from Tova introduced somewhat later in the book is clearly "character whose story will appear in the next volume"; here, he's just station-keeping and representing the status quo. And while it's realistic given the plot that Serapio is an abused sociopath, that didn't mean I enjoyed reading his viewpoint or his childhood abuse. 
Xiala is the best character in the book by far and I was warming to the careful work she has to do to win over an unknown crew, but apparently Roanhorse was not interested in that. Instead, the focus of Xiala's characterization turns to a bad-boy romance that did absolutely nothing for me. This will be a matter of personal taste; I know this is a plot feature for many readers. But it had me rolling my eyes and turning the pages to get to something more interesting (which, sadly, was not forthcoming). It also plays heavily on magical disabled person cliches, like the blind man being the best fighter anyone has met. I did not enjoy this book very much, but there were some neat bits of world-building and I could see why other people might disagree. What pushed me into actively recommending against it (at least for now) is the publishing structure. This is the first book of a trilogy, so one can expect the major plot to not be resolved by the end of the book. But part of the contract with the reader when publishing a book series is that each volume should reach some sense of closure and catharsis. There will be cliffhangers and unanswered questions, but there should also be enough plot lines that are satisfactorily resolved to warrant publishing a book as a separate novel. There is none of that here. This is the first half (or third) of a novel. It introduces a bunch of plot lines, pulls them together, describes an intermediate crisis, and then simply stops. Not a single plot line is resolved. This is made worse by the fact this series (presumably, as I have only seen the first book) has a U-shaped plot: everything gets worse and worse until some point of crisis, and then presumably the protagonists will get their shit together and things will start to improve. I have soured on U-shaped plots since the first half of the story often feels like a tedious grind (eat your vegetables and then you can have dessert), but it's made much worse by cutting the book off at the bottom of the U. You get a volume, like Black Sun, that's all setup and horror and collapse, with no payoff or optimism. After two tries, I have concluded that Roanhorse is not for me. This is clearly a me problem rather than a Roanhorse problem, given how many other people love both Black Sun and her Sixth World series, but this is the second book of hers where I mildly enjoyed the world building but didn't care about any of the characters. Ah well, tastes will differ. Even if you get along with Roanhorse, though, I recommend against starting this book until the second half of it is published (currently scheduled for 2022). As it stands, it's a wholly unsatisfying reading experience. Followed by the not-yet-published Fevered Star. Rating: 4 out of 10

14 August 2021

Andy Simpkins: Debian Bullseye Released

Wow. It is 21:49 in the evening here (I am with isy and sledge in Cambridge) and image testing has completed! The images are being signed, and sledge is running through the final steps to push them out to our servers, and from there out onto the mirror and torrent networks to be available for public download.

We have had help testing installation images from the regular team: amacater and schweer. With schweer, as ever, covering the edu images. Thank you. This release we were joined by bitin, who kindly ran through a couple of tests of the default netinst image with both UEFI and BIOS based VMs, before joining a release party. Moving onto the live images, linux-fan once again spent time testing i386 images on vintage hardware. Getting a desktop environment to work from a live image on a Pentium (4?) machine with 1GiB RAM is getting harder: the number of desktops that will run in such an environment gets smaller all the time. Again highvoltage was around to run tests on some of the images.

liz contributed for the first time - indeed, she raised her first bug report as well. I hope that you had fun - thanks for joining us today. smcv also joined in the testing fun - a long time DD, is this the first time you have run through image smoke tests on release day? Thanks! Many thanks to everyone taking time to test installation and live images. Of course, building and testing images doesn't happen in isolation. There is a huge team that puts together and releases the project that is Debian.
On a release day there are many teams working flat out: dsa, ftp, publicity, release, web, and ourselves as the cd/images team.
But that is just activity on a release day
There are all the other teams that are needed to produce the distribution, who work tirelessly day in, day out. Curating the 1,152,960,944 lines of code in Debian bullseye are more than 6,208 people!!
Some of the contributors are shown in https://contributors.debian.org/contributors/flat
THANK YOU In the 15 minutes it has taken me to compile this post (many thanks to cnote and jmw for facts and figures they published on debian micronews), the last of the image release process has been completed by sledge and that's it. Installation images for Debian 11 Bullseye are out in the big wide world, joining the official archives that became available at 10:35 this morning.

1 August 2021

Russ Allbery: Review: Fugitive Telemetry

Review: Fugitive Telemetry, by Martha Wells
Series: Murderbot Diaries #6
Publisher: Tordotcom
Copyright: April 2021
ISBN: 1-250-76538-2
Format: Kindle
Pages: 167
Fugitive Telemetry is the fifth Murderbot novella. It is not a sequel to the (as yet) lone novel, Network Effect. Instead, it takes place between Exit Strategy and Network Effect, filling in more of the backstory of the novel. You should not read it before Exit Strategy, but I believe it and Network Effect could be read in any order. A human has been murdered on Preservation Station. That is not a thing that happens on Preservation Station, which is normally a peaceful place whose crime is limited to intoxication-related stupidity. Murderbot's first worry, and the first worry of his humans, is that this may be one of their enemies getting into position to target them. That risk at least makes the murder worth investigating, rather than leaving it solely to Station Security. The problem from Murderbot's perspective is that there is an effective and efficient way of doing such an investigation, which starts with hacking into the security systems to get necessary investigative data and may end with the silent disposal of dead bodies of enemy agents. But this is Preservation Station, not the Corporation Rim, and Murderbot agreed to not do things like casually compromise all the station security systems or murder people who are security threats.
There was a big huge deal about it, and Security was all "but what if it takes over the station's systems and kills everybody" and Pin-Lee told them "if it wanted to do that it would have done it by now," which in hindsight was probably not the best response.
Worse, Murderbot's human wants it to work collaboratively with Station Security. That is a challenge, given that Security has a lot of reasons not to trust SecUnits, and Murderbot has a lot of reasons not to trust a security organization (not to mention considers them largely incompetent). Also, the surveillance systems are totally inadequate compared to the Corporation Rim for various financial and civil rights reasons that are doubtless wonderful except in situations where someone has been murdered. But hopefully the humans won't get in the way too much. This is one of those books (well, novellas) that I finished a while back but then stalled out on reviewing. I think that's because I don't have that much to say about it. Network Effect pushed the world-building and Murderbot's personal storyline forward significantly, but Fugitive Telemetry doesn't pick up those threads. Instead, this is another novella in much the same vein as the first four. If you, like me, are eager to see where Wells takes the story after the events of the novel, this is somewhat disappointing. But if you enjoyed the novellas, this is more of what you enjoyed: snarky comments about humanity, competence porn, Murderbot getting pulled into problems somewhat against its will and then trying to sort them out, and the occasional touching moment of emotional connection that Murderbot escapes from as quickly as possible. It's quite enjoyable, helped considerably by Wells's wise choice to not make the supporting human characters idiots. Collaboration is not Murderbot's strength; it is certain the investigation will be an endless series of frustrations and annoyances given the level of suspicion Station Security starts with. But some humans (and some SecUnits) are capable of re-evaluating their conclusions when given new evidence, and watching that happen is part of the fun of this novella. What this novella is missing is the overarching plot structure of the rest of the series, since where this story sits chronologically doesn't leave much room for advancing or even deepening the plot arc. It therefore feels incidental: delightful while I was reading it, probably missable if you have to, and not something I spent time thinking about after I finished it. If you liked the Murderbot novellas up until now, you will want to read this one. If you haven't started the series yet, this is not a place to start. If you want something more like the Network Effect novel, or a story where Murderbot makes significant decisions about its future, the wait continues. Rating: 8 out of 10

23 July 2021

Bits from Debian: New Debian Developers and Maintainers (May and June 2021)

The following contributors got their Debian Developer accounts in the last two months: The following contributors were added as Debian Maintainers in the last two months: Congratulations!

16 July 2021

Russell Coker: Thoughts about RAM and Storage Changes

My first Linux system in 1992 was a 386 with 4MB of RAM and a 120MB hard drive which (for some reason I forgot) was only supported by Linux for about 90MB. My first hard drive was 70MB and could do 500KB/s for contiguous IO, my first Linux hard drive was probably a bit faster, maybe 1MB/s. My current Linux workstation has 64G of RAM and 2*1TB NVMe devices that can sustain about 1.1GB/s. The laptop I'm using right now has 8GB of RAM and a 180GB SSD that can do 380MB/s. My laptop has 2000* the RAM of my first Linux system and maybe 400* the contiguous IO speed. Currently I don't even run a VM with less than 4GB of RAM, NB I'm not saying that smaller VMs aren't useful, merely that I don't happen to be using them now. Modern AMD64 CPUs support 2MB "huge pages". As a proportion of system RAM, if I used 2MB pages everywhere they would be a smaller portion of system RAM than the 4KB pages on my first Linux system! I am not suggesting using 2MB pages for general systems. For my workstations the majority of processes are using less than 10MB of resident memory, and given the different uses for memory mapped shared objects, memory mapped file IO, malloc(), stack, heap, etc there would be a lot of inefficiency having 2MB as the limit for all allocation. But as systems worked with 4MB of RAM or less and 4K pages it would surely work to have only 2MB pages with 64GB or more of RAM. Back in the 90s it seemed ridiculous to me to have 256 byte pages on a 68030 CPU, but 4K pages on a modern AMD64 system is even more ridiculous. Apparently AMD64 supports 1GB pages on some CPUs; that seems ridiculously large, but when run on a system with 1TB of RAM that's comparable to 4K pages on my first Linux system. Currently AWS offers 24TB EC2 instances and the Google Cloud Project offers 12TB virtual machines. It might even make sense to have the entire OS using 1GB pages for some usage scenarios on such systems, wasting tens of GB of RAM to save TLB thrashing might be a good trade-off. My personal laptop has 2000* the RAM of my first Linux system and maybe 400* the contiguous IO speed. An employer recently assigned me a Thinkpad Carbon X1 Gen6 with an NVMe device that could sustain 5GB/s until the CPU overheated, that's 5000* the contiguous IO speed of my first Linux hard drive. My first hard drive had a 28ms average access time and my first Linux hard drive probably was a little better, let's call it 20ms for the sake of discussion. It's generally quoted that access times for NVMe are at best 10us, that's 2000* better than my first Linux hard drive. As seek times are the main factor for swap performance, a laptop with 8GB of RAM and a fast NVMe device could be expected to give adequate performance with 2000* the swap of my first Linux system. For the work laptop in question I had 8G of swap and my personal laptop has 6G of swap, which is somewhat comparable to the 4MB of swap on my first Linux system in that swap is about equal to RAM size, so I guess my personal laptop is performing better than it can be expected to. These are just some idle thoughts about hardware changes over the years. Don't take it as advice for purchasing hardware and don't take it too seriously in general. Also when writing comments don't restrict yourself to being overly serious, feel free to run the numbers on what systems with petabytes of Optane might be like, speculate on what NUMA systems in laptops might be like, etc. Go wild.
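As a quick sanity check on the proportion claim above, here is the arithmetic spelled out, using only the figures from the post:

first_ram  = 4 * 2**20    # 4MB of RAM in 1992
first_page = 4 * 2**10    # 4KB pages
now_ram    = 64 * 2**30   # 64GB workstation
huge_page  = 2 * 2**20    # 2MB huge pages

print(first_page / first_ram)   # 1/1024  ~= 0.00098
print(huge_page / now_ram)      # 1/32768 ~= 0.00003

So a single 2MB page on a 64GB machine is roughly a 32 times smaller fraction of total RAM than a single 4KB page was on the 4MB machine.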

Jamie McClelland: From Ikiwiki to Hugo

Back in the days of Etch, I converted this blog from Drupal to ikiwiki. I remember being very excited about this brand new concept of static web sites derived from content stored in a version control system. And now, over a decade later, I've moved to hugo. I feel some loyalty to ikiwiki and Joey Hess for opening my eyes to the static web site concept. But ultimately I grew tired of splitting my time and energy between learning ikiwiki and hugo, which has been my tool of choice for new projects. When I started getting strange emails that I suspect had something to do with spammers filling out ikiwiki's commenting registration system, I chose to invest my time in switching to hugo over debugging and really understanding how ikiwiki handles user registration. I carefully reviewed anarcat's blog on converting from ikiwiki to hugo and learned about a lot of ikiwiki features I am not using. Wow, it's times like these that I'm glad I keep it really simple. Based on the various ikiwiki2hugo python scripts I studied, I eventually wrote a far simpler one tailored to my needs. Also, in what could only be called a desperate act of procrastination combined with a touch of self-hatred (it's been a rough week) I rejected all the commenting options available to me and chose to implement my own in PHP. What?!?! Why would anyone do such a thing? I refer you to my previous sentence about desperate procrastination. And also I know it's fashionable to hate PHP, but honestly, as the first programming language I learned, there is something comforting and familiar about it. And, on a more objective level, I can deploy it easily to just about any hosting provider in the world. I don't have to maintain a unicorn service or a nodejs service and make special configuration entries in my web configuration. All I have to do is upload the php files and I'm done. Well, I'm sure I'll regret this decision. Special thanks to Alexander Bilz for the anatole hugo theme. I chose it via a nearly random click to avoid the rabbit hole of choosing a theme. And, by luck, it has turned out quite well. I only had to override the commento partial theme page to hijack it for my own commenting system's use.
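For readers curious what such a conversion involves, here is a rough, hypothetical sketch of the kind of script this describes. This is not the author's actual converter; the directory layout and the handful of ikiwiki directives handled are assumptions.

# Minimal sketch: turn ikiwiki .mdwn posts into hugo content files with
# YAML front matter, handling only [[!meta ...]] and [[!tag ...]] directives.
import re
from pathlib import Path

SRC = Path("posts")           # ikiwiki .mdwn sources (hypothetical layout)
DST = Path("content/posts")   # hugo content directory

META_RE = re.compile(r'\[\[!meta (\w+)="([^"]*)"\]\]')
TAG_RE = re.compile(r'\[\[!tag ([^\]]+)\]\]')

for src in SRC.glob("*.mdwn"):
    text = src.read_text()
    meta = dict(META_RE.findall(text))
    m = TAG_RE.search(text)
    tags = m.group(1).split() if m else []
    # strip the directives we converted; leave the markdown body alone
    body = TAG_RE.sub("", META_RE.sub("", text)).lstrip()
    front = ["---",
             f'title: "{meta.get("title", src.stem)}"',
             f'date: {meta.get("date", "1970-01-01")}',
             f'tags: [{", ".join(tags)}]',
             "---", ""]
    out = DST / (src.stem + ".md")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("\n".join(front) + body)

A real conversion would also need to deal with wikilinks, inline directives and whatever other ikiwiki features the site uses, which is exactly why keeping the original site simple pays off.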
