Search Results: "casi"

26 April 2025

John Goerzen: Memoirs of the Early Internet

The Internet is an amazing place, and occasionally you can find things on the web that have somehow lingered online for decades longer than you might expect. Today I'll take you on a tour of some parts of the early Internet.

The Internet, of course, is a network of networks, and part of its early (and continuing) promise was to provide a common protocol that all sorts of networks can use to interoperate with each other. In the early days, UUCP was one of the main ways universities linked with each other, and eventually UUCP and the Internet sort of merged (but that's a long story).

Let's start with some Usenet maps, which were an early way to document the UUCP modem links between universities. Start with this PDF. The first page is a Usenet map (which at the time mostly flowed over UUCP) from April of 1981. Notice that ucbvax, a VAX system at Berkeley, was central to the map. ucbvax continued to be a central node for UUCP for more than a decade; on page 5 of that PDF, you'll see that it asks for a Path from a major node (e.g., ucbvax, decvax, harpo, duke). Pre-Internet email addresses used a path; e.g., mark@ucbvax was duke!decvax!ucbvax!mark to someone. You had to specify the route from your system to the recipient on your email To line. If you gave out your email address on a business card, you would start it from a major node like ucbvax, and the assumption was that everyone would know how to get from their system to the major node.

On August 19, 1994, ucbvax was finally turned off. TCP/IP had driven UUCP further into obscurity; by then, it was mostly used by people without a dedicated Internet connection to get on the Internet, rather than as an entire communication network of its own. A few days later, Cliff Frost posted a memoir of ucbvax, an obscure bit of Internet lore that is fun to read.

UUCP was ad-hoc, and by 1984 there was an effort to make a machine-parsable map to help automate routing on UUCP. This was called the pathalias project, and there was a paper about it. The Linux network administration guide even includes a section on pathalias.

Because UUCP mainly flowed over phone lines, long-distance fees made it quite expensive. In 1985, the Stargate Project was formed, with the idea of distributing Usenet by satellite. The satellite link was short-lived, but the effort eventually morphed into UUNET. It was initially a non-profit, but eventually became a commercial backbone provider, and later an ISP. Over a long series of acquisitions, UUNET is now part of Verizon. An article in ;login: is another description of this history. IAPS has an Internet in 1990 article, which includes both pathalias data and an interesting map of domain names to UUCP paths.

As I was pondering what interesting things a person could do with NNCPNET Internet email, I stumbled across a page on getting FTP files via e-mail. Yes, that used to be a thing! I remember ftpmail@decwrl.dec.com. It turns out that page is from a copy of EFF's (Extended) Guide to the Internet from 1994. Wow, what a treasure! It has entries such as A Slice of Life in my Virtual Community, libraries with telnet access, Gopher, A Statement of Principle by Bruce Sterling, and I could go on. You can also get it as a PDF from the Internet Archive.

UUCP is still included with modern Linux and BSD distributions. It was part of how I experienced the PC and Internet revolution in rural America. It lacks modern security, but NNCP is to UUCP what ssh is to telnet.

20 April 2025

Russ Allbery: Review: Up the Down Staircase

Review: Up the Down Staircase, by Bel Kaufman
Publisher: Vintage Books
Copyright: 1964, 1991, 2019
Printing: 2019
ISBN: 0-525-56566-3
Format: Kindle
Pages: 360
Up the Down Staircase is a novel (in an unconventional format, which I'll describe in a moment) about the experiences of a new teacher in a fictional New York City high school. It was a massive best-seller in the 1960s, including a 1967 movie, but seems to have dropped out of the public discussion. I read it from the library sometime in the late 1980s or early 1990s and have thought about it periodically ever since. It was Bel Kaufman's first novel. Sylvia Barrett is a new graduate with a master's degree in English, where she specialized in Chaucer. As Up the Down Staircase opens, it is her first day as an English teacher in Calvin Coolidge High School. As she says in a letter to a college friend:
What I really had in mind was to do a little teaching. "And gladly wolde he lerne, and gladly teche" like Chaucer's Clerke of Oxenford. I had come eager to share all I know and feel; to imbue the young with a love for their language and literature; to instruct and to inspire. What happened in real life (when I had asked why they were taking English, a boy said: "To help us in real life") was something else again, and even if I could describe it, you would think I am exaggerating.
She instead encounters chaos and bureaucracy, broken windows and mindless regulations, a librarian who is so protective of her books that she doesn't let any students touch them, a school guidance counselor who thinks she's Freud, and a principal whose sole interaction with the school is to occasionally float through on a cushion of cliches, dispensing utterly useless wisdom only to vanish again.
I want to take this opportunity to extend a warm welcome to all faculty and staff, and the sincere hope that you have returned from a healthful and fruitful summer vacation with renewed vim and vigor, ready to gird your loins and tackle the many important and vital tasks that lie ahead undaunted. Thank you for your help and cooperation in the past and future. Maxwell E. Clarke
Principal
In practice, the school is run by James J. McHare, Clarke's administrative assistant, who signs his messages JJ McH, Adm. Asst. and who Sylvia immediately starts calling Admiral Ass. McHare is a micro-managing control freak who spends the book desperately attempting to impose order over school procedures, the teachers, and the students, with very little success. The title of the book comes from one of his detention slips:
Please admit bearer to class Detained by me for going Up the Down staircase and subsequent insolence. JJ McH
The conceit of this book is that, except for the first and last chapters, it consists only of memos, letters, notes, circulars, and other paper detritus, often said to come from Sylvia's wastepaper basket. Sylvia serves as the first-person narrator through her long letters to her college friend, and through shorter but more frequent exchanges via intraschool memo with Beatrice Schachter, another English teacher at the same school, but much of the book lies outside her narration. The reader has to piece together what's happening from the discarded paper of a dysfunctional institution.

Amid the bureaucratic and personal communications, there are frequent chapters with notes from the students, usually from the suggestion box that Sylvia establishes early in the book. These start as chaotic glimpses of often-misspelled wariness or open hostility, but over the course of Up the Down Staircase, some of the students become characters with fragmentary but still visible story arcs. This remains confusing throughout the novel (there are too many students to keep them entirely straight, and several of them use pseudonyms for the suggestion box), but it's the sort of confusion that feels like an intentional authorial choice. It mirrors the difficulty a teacher has in piecing together and remembering the stories of individual students in overstuffed classrooms, even if (like Sylvia and unlike several of her colleagues) the teacher is trying to pay attention.

At the start, Up the Down Staircase reads as mostly-disconnected humor. There is a strong "kids say the darnedest things" vibe, which didn't entirely work for me, but the send-up of chaotic bureaucracy is both more sophisticated and more entertaining. It has the "laugh so that you don't cry" absurdity of a system with insufficient resources, entirely absent management, and colleagues who have let their quirks take over their personalities. Sylvia alternates between incredulity and stubbornness, and I think this book is at its best when it shows the small acts of practical defiance that one uses to carve out space and coherence from mismanaged bureaucracy.

But this book is not just a collection of humorous anecdotes about teaching high school. Sylvia is sincere in her desire to teach, which crystallizes around, but is not limited to, a quixotic attempt to reach one delinquent that everyone else in the school has written off. She slowly finds her footing, she has a few breakthroughs in reaching her students, and the book slowly turns into an earnest portrayal of an attempt to make the system work despite its obvious unfitness for purpose. This part of the book is hard to review. Parts of it worked brilliantly; I could feel myself both adjusting my expectations alongside Sylvia to something less idealistic and also celebrating the rare breakthrough with her. Parts of it were weirdly uncomfortable in ways that I'm not sure I enjoyed. That includes Sylvia's climactic conversation with the boy she's been trying to reach, which was weirdly charged and ambiguous in a way that felt like the author's reach exceeding their grasp.

One thing that didn't help my enjoyment is Sylvia's relationship with Paul Barringer, another of the English teachers and a frustrated novelist and poet. Everyone who works at the school has found their own way to cope with the stress and chaos, and many of the ways that seem humorous turn out to have a deeper logic and even heroism. Paul's, however, is to retreat into indifference and alcohol.
He is a believable character who works with Kaufman's themes, but he's also entirely unlikable. I never understood why Sylvia tolerated that creepy asshole, let alone kept having lunch with him. It is clear from the plot of the book that Kaufman at least partially understands Paul's deficiencies, but that did not help me enjoy reading about him.

This is a great example of a book that tried to do something unusual and risky and didn't entirely pull it off. I like books that take a risk, and sometimes Up the Down Staircase is very funny or suddenly insightful in a way that I'm not sure Kaufman could have reached with a more traditional novel. It takes a hard look at what it means to try to make a system work when it's clearly broken and you can't change it, and the way all of the characters arrive at different answers that are much deeper than their initial impressions was subtle and effective. It's the sort of book that sticks in your head, as shown by the fact I bought it on a whim to re-read some 35 years after I first read it. But it's not consistently great. Some parts of it drag, the characters are frustratingly hard to keep track of, and the emotional climax points are odd and unsatisfying, at least to me. I'm not sure whether to recommend it or not, but it's certainly unusual. I'm glad I read it again, but I probably won't re-read it for another 35 years, at least.

If you are considering getting this book, be aware that it has a lot of drawings and several hand-written letters. The publisher of the edition I read did a reasonably good job formatting this for an ebook, but some of the pages, particularly the hand-written letters, were extremely hard to read on a Kindle. Consider paper, or at least reading on a tablet or computer screen, if you don't want to have to puzzle over low-resolution images.

The 1991 trade paperback had a new introduction by the author, reproduced in the edition I read as an afterword (which is a better choice than an introduction). It is a long and fascinating essay from Kaufman about her experience with the reaction to this book, culminating in a passionate plea for supporting public schools and public school teachers. Kaufman's personal account adds a lot of depth to the story; I highly recommend it.

Content note: Self-harm, plus several scenes that are closely adjacent to student-teacher relationships. Kaufman deals frankly with the problems of mostly-poor high school kids, including sexuality, so be warned that this is not the humorous romp that it might appear at first glance. A couple of the scenes made me uncomfortable; there isn't anything explicit, but the emotional overtones can be pretty disturbing.

Rating: 7 out of 10

31 March 2025

Russ Allbery: Review: Ghostdrift

Review: Ghostdrift, by Suzanne Palmer
Series: Finder Chronicles #4
Publisher: DAW
Copyright: May 2024
ISBN: 0-7564-1888-7
Format: Kindle
Pages: 378
Ghostdrift is a science fiction adventure and the fourth (and possibly final) book of the Finder Chronicles. You should definitely read this series in order and not start here, even though the plot of this book would stand alone. Following The Scavenger Door, in which he made enemies even more dramatically than he had in the previous books, Fergus Ferguson has retired to the beach on Coralla to become a tea master and take care of his cat. It's a relaxing, idyllic life and a much-needed total reset. Also, he's bored. The arrival of his alien friend Qai, in some kind of trouble and searching for him, is a complex balance between relief and disappointment. Bas Belos is one of the most notorious pirates of the Barrens. He has someone he wants Fergus to find: his twin sister, who disappeared ten years ago. Fergus has an unmatched reputation for finding things, so Belos kidnapped Qai's partner to coerce her into finding Fergus. It's not an auspicious beginning to a relationship, and Qai was ready to fight once they got her partner back, but Belos makes Fergus an offer of payment that, startlingly, is enough for him to take the job mostly voluntarily. Ghostdrift feels a bit like a return to Finder. Fergus is once again alone among strangers, on an assignment that he's mostly not discussing with others, piecing together clues and navigating tricky social dynamics. I missed his friends, particularly Ignatio, and while there are a few moments with AI ships, they play less of a role. But Fergus is so very good at what he does, and Palmer is so very good at writing it. This continues to be competence porn at its best. Belos's crew thinks Fergus is a pirate recruited from a prison colony, and he quietly sets out to win their trust with a careful balance of self-deprecation and unflappable skill, helped considerably by the hidden gift he acquired in Finder. The character development is subtle, but this feels like a Fergus who understands friendship and other people at a deeper and more satisfying level than the Fergus we first met three books ago. Palmer has a real talent for supporting characters and Ghostdrift is no exception. Belos's crew are criminals and murderers, and Palmer does remind the reader of that occasionally, but they're also humans with complex goals and relationships. Belos has earned their loyalty by being loyal and competent in a rough world where those attributes are rare. The morality of this story reminds me of infiltrating a gang: the existence of the gang is not a good thing, and the things they do are often indefensible, but they are an understandable reaction to a corrupt social system. The cops (in this case, the Alliance) are nearly as bad, as we've learned over the past couple of books, and considerably more insufferable. Fergus balances the ethical complexity in a way that I found satisfyingly nuanced, while quietly insisting on his own moral lines. There is a deep science fiction plot here, possibly the most complex of the series so far. The disappearance of Belos's sister is the tip of an iceberg that leads to novel astrophysics, dangerous aliens, mysterious ruins, and an extended period on a remote and wreck-strewn planet. I groaned a bit when the characters ended up on the planet, since treks across primitive alien terrain with jury-rigged technology are one of my least favorite science fiction tropes, but I need not have worried. 
Palmer knows what she's doing; the pace of the plot does slow a bit at first, but it quickly picks up again, adding enough new setting and plot complications that I never had a chance to be bored by alien plants. It helps that we get another batch of excellent supporting characters for Fergus to observe and win over.

This series is such great science fiction. Each book becomes my new favorite, and Ghostdrift is no exception. The skeleton of its plot is a satisfying science fiction mystery with multiple competing factions, hints of fascinating galactic politics, complicated technological puzzles, and a sense of wonder that reminds me of reading Larry Niven's Known Space series. But the characters are so much better and more memorable than classic SF; compared to Fergus, Niven's Louis Wu barely exists and is readily forgotten as soon as the story is over. Fergus starts as a quiet problem-solver, but so much character depth unfolds over the course of this series. The ending of this book was delightfully consistent with everything we've learned about Fergus, but also the sort of ending that it's hard to imagine the Fergus from Finder knowing how to want.

Ghostdrift, like each of the books in this series, reaches a satisfying stand-alone conclusion, but there is no reason within the story for this to be the last of the series. The author's acknowledgments, however, say that this is the end. I admit to being disappointed, since I want to read more about Fergus and there are numerous loose ends that could be explored. More importantly, though, I hope Palmer will write more novels in any universe of her choosing so that I can buy and read them. This is fantastic stuff. This review comes too late for the Hugo nominating deadline, but I hope Palmer gets a Best Series nomination for the Finder Chronicles as a whole. She deserves it.

Rating: 9 out of 10

24 March 2025

Simon Josefsson: Reproducible Software Releases

Around a year ago I discussed two concerns with software release archives (tarball artifacts) that could be improved to increase confidence in the supply-chain security of software releases. Repeating the goals for simplicity: release a minimal git-archive style source tarball that users can build from, and make the traditional make dist tarball artifact reproducible. While implementing these ideas for a small project was accomplished within weeks (see my announcement of Libntlm version 1.8), addressing this in complex projects uncovered concerns with tools that had to be addressed, and things stalled for many months pending that work. I had the notion that these two goals were easy and shouldn't be hard to accomplish. I still believe that, but have had to realize that improving tooling to support these goals takes time. It seems clear that these concepts are not universally agreed on and implemented generally. I'm now happy to recap some of the work that led to releases of libtasn1 v4.20.0, inetutils v2.6, libidn2 v2.3.8, and libidn v1.43. These releases all achieve these goals. I am working on a bunch more projects to support these ideas too. What have the obstacles been so far to making this happen? It may help others who are in the same process of addressing these concerns to have a high-level introduction to the issues I encountered. Source code for the projects above is available, and anyone can look at the solutions to learn how the problems are addressed. First, let's look at the problems we need to solve to make git-archive style tarballs usable:

Version Handling To build usable binaries from a minimal tarball, the build needs to know which version number it is for. Traditionally this information was stored inside configure.ac in git. However, I use gnulib's git-version-gen to infer the version number from the git tag or git commit instead. The git tag information is not available in a git-archive tarball. My solution to this was to make use of the export-subst feature of the .gitattributes file. I store the file .tarball-version-git in git containing the magic cookie like this:
$Format:%(describe)$
With this, git-archive will replace the cookie with a useful version identifier on export; see the libtasn1 patch to achieve this. To make use of this information, the git-version-gen script was enhanced to read this information; see the gnulib patch. This is invoked by ./configure to figure out which version number the package is for.
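The git side of this is just a one-line export-subst attribute; a minimal sketch of the .gitattributes entry, assuming the file name used above:
# .gitattributes: expand $Format:...$ placeholders in this file when running git-archive
.tarball-version-git export-subst
As a quick check, git archive HEAD | tar -xOf - .tarball-version-git should then print an expanded version string such as v4.20.0-3-gabcdef1 (a made-up example) instead of the literal cookie.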

Translations We want translations to be included in the minimal source tarball for it to be buildable. Traditionally these files are retrieved by the maintainer from the Translation Project when running ./bootstrap, however there are two problems with this. The first one is that there is no strong authentication or versioning information on this data; the tools just download and place whatever wget downloaded into your source tree (printf-style injection attack, anyone?). We could improve this (e.g., publish GnuPG-signed translation messages with clear versioning), however I did not work on that further. The reason is that I want to support offline builds of packages. Downloading random things from the Internet during builds does not work when building a Debian package, for example. The Translation Project could solve this by making a monthly tarball with their translations available, for distributors to pick up and provide as a separate package that could be used as a build dependency. However that is not how these tools and projects are designed. Instead I reverted to storing translations in git, something that I did for most projects back when I was using CVS 20 years ago. Hooking this into the ./bootstrap and gettext workflow can be tricky (ideas for improvement most welcome!), but I used a simple approach: store all downloaded po/*.po files directly as po/*.po.in and make the ./bootstrap tool move them in place, see the libidn2 commit followed by the actual make update-po commit with all the translations, where one essential step is:
# Prime po/*.po from fall-back copy stored in git.
for poin in po/*.po.in; do
    po=$(echo $poin | sed 's/\.in//')
    test -f $po || cp -v $poin $po
done
ls po/*.po | sed 's|.*/||; s|\.po$||' > po/LINGUAS

Fetching vendor files like gnulib Most build dependencies are in the shape of "you need a C compiler". However, some come in the shape of source-code files intended to be "vendored", and gnulib is a huge repository of such files. The latter is a problem when building from a minimal git archive. It is possible to consider translation files as a class of vendor files, since they need to be copied verbatim into the project build directory for things to work. The same goes for *.m4 macros from the GNU Autoconf Archive. However, I'm not confident that the solution for all vendor files must be the same. For translation files and for Autoconf Archive macros, I have decided to put these files into git and merge them manually occasionally. For gnulib files, in some projects like OATH Toolkit I also store all gnulib files in git, which effectively resolves this concern. (Incidentally, the reason for doing so was originally that running ./bootstrap took forever since there are five gnulib instances used, which is no longer the case since gnulib-tool was rewritten in Python.) For most projects, however, I rely on ./bootstrap to fetch a gnulib git clone when building. I like this model, however it doesn't work offline. One way to resolve this is to make the gnulib git repository available for offline use, and I've made some effort to make this happen via a Gnulib Git Bundle and have explained how to implement this approach for Debian packaging. I don't think that is sufficient as a generic solution though; it is mostly applicable to building old releases that use old gnulib files. It won't work when building from CI/CD pipelines, for example, where I have settled on a crude way of fetching and unpacking a particular gnulib snapshot, see this Libntlm patch. This is much faster than working with git submodules and cloning gnulib during ./bootstrap. Essentially this is doing:
GNULIB_REVISION=$(. bootstrap.conf >&2; echo $GNULIB_REVISION)
wget -nv https://gitlab.com/libidn/gnulib-mirror/-/archive/$GNULIB_REVISION/gnulib-mirror-$GNULIB_REVISION.tar.gz
gzip -cd gnulib-mirror-$GNULIB_REVISION.tar.gz | tar xf -
rm -fv gnulib-mirror-$GNULIB_REVISION.tar.gz
export GNULIB_SRCDIR=$PWD/gnulib-mirror-$GNULIB_REVISION
./bootstrap --no-git
./configure
make

Test the git-archive tarball This goes without saying, but if you don't test that building from a git-archive style tarball works, you are likely to regress at some point. Use CI/CD techniques to continuously test that a minimal git-archive tarball leads to a usable build.
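A rough sketch of what such a CI check can look like, assuming the ./bootstrap --no-git plus GNULIB_SRCDIR setup from the earlier snippet (PROJECT and v1.2.3 are placeholders, and the archive follows the -src naming convention recommended below):
# Create a minimal git-archive style tarball and try to build from it.
git archive --format=tar.gz --prefix=PROJECT-v1.2.3/ -o PROJECT-v1.2.3-src.tar.gz v1.2.3
tar xfz PROJECT-v1.2.3-src.tar.gz
cd PROJECT-v1.2.3
./bootstrap --no-git    # assumes GNULIB_SRCDIR points at a gnulib checkout, as above
./configure
make check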

Mission Accomplished So that wasn't hard, was it? You should now be able to publish a minimal git-archive tarball, and users should be able to build your project from it. I recommend naming these archives PROJECT-vX.Y.Z-src.tar.gz, replacing PROJECT with your project name and X.Y.Z with your version number. The archive should have only one sub-directory named PROJECT-vX.Y.Z/ containing all the source-code files. This differentiates it from traditional PROJECT-X.Y.Z.tar.gz tarballs in that it embeds the git tag (which typically starts with v) and contains a wildcard-friendly -src substring. Alas, there is no consistency around this naming pattern, and GitLab, GitHub, Codeberg etc. all seem to use their own slightly incompatible variant. Let's go on to see what is needed to achieve reproducible make dist source tarballs. This is the release artifact that most users use, and they often contain lots of generated files and vendor files. These files are included to make it easy to build for the user. What are the challenges to make these reproducible?

Build dependencies causing different generated content The first part is to realize that if you use tool X with version A to generate a file that goes into the tarball, version B of that tool may produce different outputs. This is a generic concern and it cannot be solved. We want our build tools to evolve and produce better outputs over time. What can be addressed is to avoid needless differences. For example, many tools store timestamps and versioning information in the generated files. This causes needless differences, which makes audits harder. I have worked on some of these, like Autoconf Archive timestamps, but solving all of these examples will take a long time, and some upstreams are reluctant to incorporate these changes. My approach meanwhile is to build things using similar environments, and compare the outputs for differences. I've found that the various closely related forks of GNU/Linux distributions are useful for this. Trisquel 11 is based on Ubuntu 22.04, and building my projects using both and comparing the differences gives me only the relevant differences to improve. This can be extended to compare AlmaLinux with RockyLinux (for both versions 8 and 9), Devuan 5 against Debian 12, PureOS 10 with Debian 11, and so on.
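One way to inspect the remaining differences between two such builds is diffoscope, the Reproducible Builds project's content-aware diff tool; a minimal sketch, with hypothetical directory and tarball names:
# Build the same release in two closely related environments first,
# then compare checksums and drill into any differences.
sha256sum trisquel11/libfoo-1.0.tar.gz ubuntu2204/libfoo-1.0.tar.gz
diffoscope trisquel11/libfoo-1.0.tar.gz ubuntu2204/libfoo-1.0.tar.gz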

Timestamps Sometimes tools store timestamps in files in a way that is harder to fix. Two notable examples of this are *.po translation files and Texinfo manuals. For translation files, I have resolved this by making sure the files use a predictable POT-Creation-Date timestamp, and I set it to the modification timestamp of the NEWS file in the repository (which I in turn set from the latest git commit, as shown in the next snippet) like this:
dist-hook: po-CreationDate-to-mtime-NEWS
.PHONY: po-CreationDate-to-mtime-NEWS
po-CreationDate-to-mtime-NEWS: mtime-NEWS-to-git-HEAD
  $(AM_V_GEN)for p in $(distdir)/po/*.po $(distdir)/po/$(PACKAGE).pot; do \
    if test -f "$$p"; then \
      $(SED) -e 's,POT-Creation-Date: .*\\n",POT-Creation-Date: '"$$(env LC_ALL=C TZ=UTC0 stat --format=%y $(srcdir)/NEWS | cut -c1-16,31-)"'\\n",' < $$p > $$p.tmp && \
      if cmp $$p $$p.tmp > /dev/null; then \
        rm -f $$p.tmp; \
      else \
        mv $$p.tmp $$p; \
      fi \
    fi \
  done
Similarly, I set a predictable modification time of the Texinfo source file like this:
dist-hook: mtime-NEWS-to-git-HEAD
.PHONY: mtime-NEWS-to-git-HEAD
mtime-NEWS-to-git-HEAD:
  $(AM_V_GEN)if test -e $(srcdir)/.git \
                && command -v git > /dev/null; then \
    touch -m -t "$$(git log -1 --format=%cd \
      --date=format-local:%Y%m%d%H%M.%S)" $(srcdir)/NEWS; \
  fi
However, I've realized that this needs to happen earlier and probably has to be run at ./configure time, because the doc/version.texi file is generated on the first build, before running make dist, and for some reason the file is not rebuilt at release time. The Automake Texinfo integration is a bit inflexible about providing hooks to extend the dependency tracking. The method to address these differences isn't really important, and they change over time depending on preferences. What is important is that the differences are eliminated.

ChangeLog Traditionally ChangeLog files were manually prepared, and still are for some projects. I maintain git2cl, but recently I've settled on gnulib's gitlog-to-changelog because doing so avoids another build dependency (although the output formatting is different and arguably worse for my git commit style). So the ChangeLog files are generated from git history. This means a shallow clone will not produce the same ChangeLog file, depending on how deep it was cloned. For Libntlm I simply disabled use of a generated ChangeLog because I wanted to support an even more extreme form of reproducibility: I wanted to be able to reproduce the full make dist source archives from a minimal git-archive source archive. However, for other projects I've settled on a middle ground. I realized that for git describe to produce reproducible outputs, the shallow clone needs to include the last release tag. So it felt acceptable to assume that the clone is not minimal, but instead has some but not all of the history. I settled on the following recipe to produce ChangeLogs covering all changes since the last release.
dist-hook: gen-ChangeLog
.PHONY: gen-ChangeLog
gen-ChangeLog:
  $(AM_V_GEN)if test -e $(srcdir)/.git; then			\
    LC_ALL=en_US.UTF-8 TZ=UTC0					\
    $(top_srcdir)/build-aux/gitlog-to-changelog			\
       --srcdir=$(srcdir) --					\
       v$(PREV_VERSION)~.. > $(distdir)/cl-t &&			\
         printf '\n\nSee the source repo for older entries\n'	\
         >> $(distdir)/cl-t &&					\
         rm -f $(distdir)/ChangeLog &&				\
         mv $(distdir)/cl-t $(distdir)/ChangeLog;  		\
  fi
I'm undecided about the usefulness of generated ChangeLog files within make dist archives. Before we have stable and secure archival of git repositories widely implemented, I can see some utility of this in case we lose all copies of the upstream git repositories. I can sympathize with the idea that ChangeLog files died when we started to generate them from git logs: the files no longer serve any purpose, and we can ask people to go look at the git log instead of reading these generated non-source files.

Long-term reproducible trusted build environment Distributions come and go, and old releases of them go out of support and often stop working. Which build environment should I choose to build the official release archives? To my knowledge only Guix offers a reliable way to re-create an older build environment (guix time-machine) that has bootstrappable properties for additional confidence. However, I had two difficult problems here. The first one was that I needed Guix container images that were usable in GitLab CI/CD pipelines, and this side-tracked me for a while. The second one delayed my effort for many months, and I was inclined to give up. Libidn distributes a C# implementation. Some of the C# source code files included in the release tarball are generated. By what? You guessed it, by a C# program, with the source code included in the distribution. This means nobody could reproduce the source tarball of Libidn without trusting someone else's C# compiler binaries, which were built from binaries of earlier releases, chaining back into something that nobody ever attempts to build any more and that likely fails to build due to bit-rot. I had two basic choices: either remove the C# implementation from Libidn (which may be a good idea for other reasons, since the C and C# are unrelated implementations) or build the source tarball on some binary-only distribution like Trisquel. Neither felt appealing to me, but a late Christmas gift of a reproducible Mono came to Guix, which resolved this.
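For reference, the Guix mechanism mentioned above is guix time-machine; a rough sketch of pinning a build environment at release time and recreating it later (the channels file name and the hello package are stand-ins):
# At release time: record the exact Guix revision and channels in use.
guix describe -f channels > channels-at-release.scm
# Later: re-enter an equivalent containerized environment from that pin.
guix time-machine -C channels-at-release.scm -- shell -C -D hello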

Embedded images in Texinfo manual For Libidn, one section of the manual has an image illustrating some concepts. The PNG, PDF and EPS outputs were generated via fig2dev from a *.fig file (hello 1985!) that I had stored in git. Over time, I had also started to store the generated outputs because of build issues. At some point, it was possible to post-process the PDF outputs with grep to remove some timestamps, however with compression this is no longer possible, and actually the grep command I used resulted in a 0-byte output file. So my embedded binaries in git were no longer reproducible. I first set out to fix this by post-processing things properly, however I then realized that the *.fig file is not really easy to work with in a modern world. I wanted to create an image from some text-file description of the image. Eventually, via the Guix manual on guix graph, I came to re-discover the graphviz language and the tool called dot (hello 1993!). All well then? Oh no, the PDF output embeds timestamps. Binary editing of PDFs no longer works through simple grep, remember? I was back where I started, and after some (soul- and web-) searching I discovered that Ghostscript (hello 1988!) pdfmarks could be used to modify things here. Cooperating with Automake's Texinfo rules related to make dist proved once again a worthy challenge, and eventually I ended up with a Makefile.am snippet to build images that could be condensed into:
info_TEXINFOS = libidn.texi
libidn_TEXINFOS += libidn-components.png
imagesdir = $(infodir)
images_DATA = libidn-components.png
EXTRA_DIST += components.dot
DISTCLEANFILES = \
  libidn-components.eps libidn-components.png libidn-components.pdf
libidn-components.eps: $(srcdir)/components.dot
  $(AM_V_GEN)$(DOT) -Nfontsize=9 -Teps < $< > $@.tmp
  $(AM_V_at)! grep %%CreationDate $@.tmp
  $(AM_V_at)mv $@.tmp $@
libidn-components.pdf: $(srcdir)/components.dot
  $(AM_V_GEN)$(DOT) -Nfontsize=9 -Tpdf < $< > $@.tmp
# A simple sed on CreationDate is no longer possible due to compression.
# 'exiftool -CreateDate' is alternative to 'gs', but adds ~4kb to file.
# Ghostscript add <1kb.  Why can't 'dot' avoid setting CreationDate?
  $(AM_V_at)printf '[ /ModDate ()\n  /CreationDate ()\n  /DOCINFO pdfmark\n' > pdfmarks
  $(AM_V_at)$(GS) -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=$@.tmp2 $@.tmp pdfmarks
  $(AM_V_at)rm -f $@.tmp pdfmarks
  $(AM_V_at)mv $@.tmp2 $@
libidn-components.png: $(srcdir)/components.dot
  $(AM_V_GEN)$(DOT) -Nfontsize=9 -Tpng < $< > $@.tmp
  $(AM_V_at)mv $@.tmp $@
pdf-recursive: libidn-components.pdf
dvi-recursive: libidn-components.eps
ps-recursive: libidn-components.eps
info-recursive: $(top_srcdir)/.version libidn-components.png
Surely this can be improved, but I'm not yet certain which way forward is best. I like having a text representation as the source of the image. I'm sad that the new image size is ~48kb compared to the old image size of ~1kb. I tried using exiftool -CreateDate as an alternative to Ghostscript, but using it to remove the timestamp added ~4kb to the file size, and naturally I was appalled by this ignorance of impending doom.

Test reproducibility of tarball Again, you need to continuously test the properties you desire. This means building your project twice using different environments and comparing the results. I've settled on a small GitLab CI/CD pipeline job that performs a bit-by-bit comparison of generated make dist archives. It also performs a bit-by-bit comparison of generated git-archive artifacts. See the Libidn2 .gitlab-ci.yml 0-compare job, which essentially is:
0-compare:
  image: alpine:latest
  stage: repro
  needs: [ B-AlmaLinux8, B-AlmaLinux9, B-RockyLinux8, B-RockyLinux9, B-Trisquel11, B-Ubuntu2204, B-PureOS10, B-Debian11, B-Devuan5, B-Debian12, B-gcc, B-clang, B-Guix, R-Guix, R-Debian12, R-Ubuntu2404, S-Trisquel10, S-Ubuntu2004 ]
  script:
  - cd out
  - sha256sum */*.tar.* */*/*.tar.* | sort | grep    -- -src.tar.
  - sha256sum */*.tar.* */*/*.tar.* | sort | grep -v -- -src.tar.
  - sha256sum */*.tar.* */*/*.tar.* | sort | uniq -c -w64 | sort -rn
  - sha256sum */*.tar.* */*/*.tar.* | grep    -- -src.tar. | sort | uniq -c -w64 | grep -v '^      1 '
  - sha256sum */*.tar.* */*/*.tar.* | grep -v -- -src.tar. | sort | uniq -c -w64 | grep -v '^      1 '
# Confirm modern git-archive tarball reproducibility
  - cmp b-almalinux8/src/*.tar.gz b-almalinux9/src/*.tar.gz
  - cmp b-almalinux8/src/*.tar.gz b-rockylinux8/src/*.tar.gz
  - cmp b-almalinux8/src/*.tar.gz b-rockylinux9/src/*.tar.gz
  - cmp b-almalinux8/src/*.tar.gz b-debian12/src/*.tar.gz
  - cmp b-almalinux8/src/*.tar.gz b-devuan5/src/*.tar.gz
  - cmp b-almalinux8/src/*.tar.gz r-guix/src/*.tar.gz
  - cmp b-almalinux8/src/*.tar.gz r-debian12/src/*.tar.gz
  - cmp b-almalinux8/src/*.tar.gz r-ubuntu2404/src/*v2.*.tar.gz
# Confirm old git-archive (export-subst but long git describe) tarball reproducibility
  - cmp b-trisquel11/src/*.tar.gz b-ubuntu2204/src/*.tar.gz
# Confirm really old git-archive (no export-subst) tarball reproducibility
  - cmp b-debian11/src/*.tar.gz b-pureos10/src/*.tar.gz
# Confirm 'make dist' generated tarball reproducibility
  - cmp b-almalinux8/*.tar.gz b-rockylinux8/*.tar.gz
  - cmp b-almalinux9/*.tar.gz b-rockylinux9/*.tar.gz
  - cmp b-pureos10/*.tar.gz b-debian11/*.tar.gz
  - cmp b-devuan5/*.tar.gz b-debian12/*.tar.gz
  - cmp b-trisquel11/*.tar.gz b-ubuntu2204/*.tar.gz
  - cmp b-guix/*.tar.gz r-guix/*.tar.gz
# Confirm 'make dist' from git-archive tarball reproducibility
  - cmp s-trisquel10/*.tar.gz s-ubuntu2004/*.tar.gz
Notice that I discovered that git archive outputs differ over time too, which is natural but a bit of a nuisance. The output of the job is illuminating in the way that all SHA256 checksums of generated tarballs are included, for example the libidn2 v2.3.8 job log:
$ sha256sum */*.tar.* */*/*.tar.* | sort | grep -v -- -src.tar.
368488b6cc8697a0a937b9eb307a014396dd17d3feba3881e6911d549732a293  b-trisquel11/libidn2-2.3.8.tar.gz
368488b6cc8697a0a937b9eb307a014396dd17d3feba3881e6911d549732a293  b-ubuntu2204/libidn2-2.3.8.tar.gz
59db2d045fdc5639c98592d236403daa24d33d7c8db0986686b2a3056dfe0ded  b-debian11/libidn2-2.3.8.tar.gz
59db2d045fdc5639c98592d236403daa24d33d7c8db0986686b2a3056dfe0ded  b-pureos10/libidn2-2.3.8.tar.gz
5bd521d5ecd75f4b0ab0fc6d95d444944ef44a84cad859c9fb01363d3ce48bb8  s-trisquel10/libidn2-2.3.8.tar.gz
5bd521d5ecd75f4b0ab0fc6d95d444944ef44a84cad859c9fb01363d3ce48bb8  s-ubuntu2004/libidn2-2.3.8.tar.gz
7f1dcdea3772a34b7a9f22d6ae6361cdcbe5513e3b6485d40100b8565c9b961a  b-almalinux8/libidn2-2.3.8.tar.gz
7f1dcdea3772a34b7a9f22d6ae6361cdcbe5513e3b6485d40100b8565c9b961a  b-rockylinux8/libidn2-2.3.8.tar.gz
8031278157ce43b5813f36cf8dd6baf0d9a7f88324ced796765dcd5cd96ccc06  b-clang/libidn2-2.3.8.tar.gz
8031278157ce43b5813f36cf8dd6baf0d9a7f88324ced796765dcd5cd96ccc06  b-debian12/libidn2-2.3.8.tar.gz
8031278157ce43b5813f36cf8dd6baf0d9a7f88324ced796765dcd5cd96ccc06  b-devuan5/libidn2-2.3.8.tar.gz
8031278157ce43b5813f36cf8dd6baf0d9a7f88324ced796765dcd5cd96ccc06  b-gcc/libidn2-2.3.8.tar.gz
8031278157ce43b5813f36cf8dd6baf0d9a7f88324ced796765dcd5cd96ccc06  r-debian12/libidn2-2.3.8.tar.gz
acf5cbb295e0693e4394a56c71600421059f9c9bf45ccf8a7e305c995630b32b  r-ubuntu2404/libidn2-2.3.8.tar.gz
cbdb75c38100e9267670b916f41878b6dbc35f9c6cbe60d50f458b40df64fcf1  b-almalinux9/libidn2-2.3.8.tar.gz
cbdb75c38100e9267670b916f41878b6dbc35f9c6cbe60d50f458b40df64fcf1  b-rockylinux9/libidn2-2.3.8.tar.gz
f557911bf6171621e1f72ff35f5b1825bb35b52ed45325dcdee931e5d3c0787a  b-guix/libidn2-2.3.8.tar.gz
f557911bf6171621e1f72ff35f5b1825bb35b52ed45325dcdee931e5d3c0787a  r-guix/libidn2-2.3.8.tar.gz
I'm sure I have forgotten or suppressed some challenges (sprinkling LANG=C TZ=UTC0 helps) related to these goals, but my hope is that this discussion of solutions will inspire you to implement these concepts for your software project too. Please share your thoughts and additional insights in a comment below. Enjoy Happy Hacking in the course of practicing this!

2 March 2025

Jonathan McDowell: RIP: Steve Langasek

[I'd like to stop writing posts like this. I've been trying to work out what to say now for nearly 2 months (writing the mail to -private to tell the Debian project about his death is one of the hardest things I've had to write, and I bottled out and wrote something that was mostly just factual, because it wasn't the place), and I've decided I just have to accept this won't be the post I want it to be, but posted is better than languishing in drafts.]

Last weekend I was in Portland, for the Celebration of Life of my friend Steve, who sadly passed away at the start of the year. It wasn't entirely unexpected, but that doesn't make it any easier. I've struggled to work out what to say about Steve. I've seen many touching comments from others in Debian about their work with him, but what that's mostly brought home to me is that while I met Steve through Debian, he was first and foremost my friend rather than someone I worked with in Debian. And so everything I have to say is more about that friendship (and thus feels a bit self-centred).

My first memory of Steve is getting lost with him in Porto Alegre, Brazil, during DebConf4. We'd decided to walk to a local mall to meet up with some other folk (I can't recall how they were getting there, but it wasn't walking), ended up deep in conversation (ISTR it was about shared library transitions), and then it took a bit longer than we expected. I don't know how that managed to cement a friendship (neither of us saw it as the near death experience others feared we'd had), but it did. Unlike others I never texted Steve much; we'd occasionally chat on IRC, but nothing major. That didn't seem to matter when we actually saw each other in person though, we just picked up like we'd seen each other the previous week.

DebConf became a recurring theme of when we'd see each other. Even outside DebConf we went places together. The first time I went somewhere in the US that wasn't the Bay Area, it was to Portland to see Steve. He, and his family, came to visit me in Belfast a couple of times, and I did a road trip from Dublin to Cork with him. He took me to a volcano.

Steve saw injustice in the world and actually tried to do something about it. I still have a copy of the US constitution sitting on my desk that he gave me. He made me want to be a better person. The world is a worse place without him in it, and while I am better for having known him, I am sadder for the fact he's gone.

9 February 2025

Dave Hibberd: Radio Activity 10-16 Feb 2025

It's been quite the week of radio-related nonsense for me, where I've been channelling my time and brainspace for radio into activity on air and system refinements, not working on Debian.

POTA, Antennas and why do my toys not work? Having had my interest piqued by Ian at mastodon.radio, I looked online and spotted a couple of parks within stumbling distance of my house, that's good news! It looks like the list has been refactored and expanded since I last looked at it, so there are now more entities to activate and explore. My concerns about antennas noted last week rumbled on. There was a second strand to this concern too; my end fed 64:1 (or 49:1?!) transformer from MM0OPX sits in my mind as not having worked very well in Spain last year, and I want to get to the bottom of why. As with most things in my life, it's probably a me problem. I came up with a cunning plan - firstly, buy a new mast to replace the one I broke a few weeks back on Cat Law. Secondly, buy a couple of new connectors and some heatshrink to reterminate my cable that I'm sure is broken. Spending more money on a problem never hurt anyone, right? Come Wednesday, the new toys arrived and I figured combining everything into one convenient night time walk and radio was a good plan. So I walked out to the nearest park with my LoRa APRS doofer going to see what happens: APRS-Map. After circling a bit to find somewhere suitable (there appear to be construction works in the park!) I set up my gear in 2°C with frost on the ground, called CQ, spotted and got nothing on either the end fed half wave or the cheap vertical. As it was too late for 20m, I tried 40 and a bit of 80 using the inbuilt tuner, but wasn't heard by stations I called or when calling independently. I packed everything up and lora-doofered my way home, mildly deflated.

Try it at home It still didn't sit with me that the end fed wasn't working, so come Friday night I set it up in the back garden/woods behind the house to try and diagnose why it wasn't working. Up it went, I worked some Irish stations pretty effortlessly, and down everything came. No complaints - the only things I did differently were to have the feedpoint a little higher and check my power, limiting it to 10W. The G90 can do 20W, and I wonder if running at that was saturating the core in the 64:1. At some point in the evening I stepped in some dog's shit too, and spent some time cleaning my boots outside to avoid further tramping the smell through the house. Win some, lose some.

Take it to the Hills On Friday, some of the other GM-ES SOTA-ists had been out for an activity day. On account of me being busy with work, I couldn't go outside to play, but I figured a weekend of activity was on the books.

Saturday - A day above the clouds On Saturday I took myself up Tap O' Noth, a favourite of mine for some reason, and Lord Arthur's Hill. Before I hit the hills, I took myself to the hackerspace and printed myself a K6ARK winder and a guy ring for the mast, cut string, tied it together and wound the string on to the winder. I also took time to buzz out my wonky coax and it showed great continuity. Hmm, that can be continued later. I didn't quite get to crimping the radial network of the AliExpress whip with a 12mm stud crimp, but that can also be put on the TODO list.

Tap O' Noth Once finally out, the weather was a bit cloudy with passing snow showers, but in between the showers I was above the clouds and the air was clear: After a mild struggle on 2m, I set up the end fed on the first hill and got to work from the old hill fort: The end fed worked flawlessly. Exactly as promised, switching between 7MHz, 14MHz, 21MHz and 28MHz without a tuner was perfect, I chased hills on all the bands, and had a great time. Apart from 40m, where there was absolutely no space due to a contest. That wasn't such a fun time! My fingers were bitterly cold, so on went the big gloves for the descent and I felt like I was warm by the time I made it back to the car. It worked so well, in fact, that I took the 1/4 wave cheap vertical out of my bag and decided to brave it on the next activation.

Lord Arthur's Hill GM5ALX has posted a .gpx to sotlas which is shorter than the other ascent, but much sharper - I figured this would be a fun new way to try up the hill! It takes you right through the heart of the Littlewood Park estate, and I felt a bit uncomfortable walking straight past the estate cottages, especially when there were vehicles moving and active work happening. Presumably this is where Lord Arthur lived, at the foot of his hill. I cut through the woods to the west of the cottages, disturbing some deer and many, many pheasants, but I met the path fairly quickly. From there it was a 2km walk, 300m vertical ascent. Short and sharp! At the top, I was treated to a view of the hill I had activated only an hour or so before, which is a view that always makes me smile: To get some height for the feedpoint, I wrapped the coax around my winder a couple of turns and trapped it with the elastic while draping the coax over the trig. This bought me some more height and I felt clever because of it. Maybe a pole would be easier? From here, I worked inter-G on 40m and had a wee pile-up, eventually working 15 or so European stations on 20m. Pleased with that! I had been considering a third hill, but home was the call in the failing light. Back to the car I walked to find my key didn't have any battery, so out came the Audi app and I used the Internet of Things to unlock my car. The modern world is bizarre.

Sunday - Cloudy Head // Head in the Clouds Sunday started off migrainey, so I stayed within the confines of my house until I felt safe driving! After some back and forth in my cloudy head, I opted for the easier option of Ladylea Hill as I wasn't feeling up for major physical exertion. It was a long drive, after which I felt more wonky, but I hit the path eventually - I run to Hibby Standard Time, a few hours to a few days behind the rest of GM/ES. I was ready to bail if my head didn't improve, but it turns out fresh cold air, silence and bloodflow helped. Ladylea Hill was incredibly quiet, a feature I really appreciated. It feels incredibly remote, with a long winding drive down Glenbuchat, which still has ice on the surface of the lochs and standing water. A brooding summit crowned with grey cloud in fantastic scenery that only revealed itself upon the clouds blowing through: I set up at the cairn and picked up 30 contacts overall, split between 40m and 20m, with some inter-G on 40 and a couple of continental surprises. 20 had longer skip today, so I saw Spain, Finland, Slovenia and Poland. On teardown, I managed to snap the top segment of my brand new mast with my cold, clumsy fingers, but thankfully SOTAbeams stock replacements. More money at the problem, again. Back to the car, no app needed, and homeward bound as the light faded. At the end of the weekend, I find myself finally over 100 activator points and over 400 chaser points. Somehow I've collected more points this year already than last year; the winter bonuses really do stack up!

Addendum - OSMAnd & Open Street Map I've been using OSMAnd on my iPhone quite extensively recently; I think offline mapping is super important if you're going out to get mildly lost in the hills. On more than one occasion, I have confidently set off in the wrong direction in the mist, and maps have saved my bacon! As you can download .gpx files, it's great to have them on the device and available for guidance in case you get lost, coupled with an offline map. Plus, as I drive around I love to have the dark red of a hill I've walked appear on the map in my car dash or in my hand: This weekend I discovered it's possible to have height maps for nice 3D maps and contours marked on the map - you just need to download some additions for the maps. This is a really nice feature; it makes maps more pretty and more useful when you're in the middle of nowhere. Open Street Map also has designators for SOTA summits here and similar for POTA here. GM5ALX has set to adding the summits around Scotland here. While the benefits aren't immediately obvious, it allows developers of mapping applications access to more data at no extra cost, really. It helps add depth to an already rich set of information, and allows us as radio amateurs to do more interesting things with maps and not be shackled to Apple/Google. Because it's open data, we can also fix things we find wrong as users. I like to fix road surfaces after I've been cycling as that will feed forward to route planning through Komoot and data on my Wahoo too, which can be modified with OSM maps. In the future, it's possible to have an OSMAnd plugin highlighting local SOTA summits or mimicking features of sotl.as but offline. It's cool to be able to put open technologies to use like this in the field and really is the convergence point of all my favourite things!

9 January 2025

Reproducible Builds: Reproducible Builds in December 2024

Welcome to the December 2024 report from the Reproducible Builds project! Our monthly reports outline what we've been up to over the past month and highlight items of news from elsewhere in the world of software supply-chain security when relevant. As ever, however, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. reproduce.debian.net
  2. debian-repro-status
  3. On our mailing list
  4. Enhancing the Security of Software Supply Chains
  5. diffoscope
  6. Supply-chain attack in the Solana ecosystem
  7. Website updates
  8. Debian changes
  9. Other development news
  10. Upstream patches
  11. Reproducibility testing framework

reproduce.debian.net Last month saw the introduction of reproduce.debian.net. Announced at the recent Debian MiniDebConf in Toulouse, reproduce.debian.net is an instance of rebuilderd operated by the Reproducible Builds project. rebuilderd is our server designed to monitor the official package repositories of Linux distributions and attempt to reproduce the observed results there. This month, however, we are pleased to announce that not only does the service now produce graphs, but the reproduce.debian.net homepage itself has become a start page of sorts, and the amd64.reproduce.debian.net and i386.reproduce.debian.net pages have emerged. The first of these rebuilds the amd64 architecture, naturally, but it is also building Debian packages that are marked with the no-architecture label, all. The second builder is, however, only rebuilding the i386 architecture. Both of these services were also switched to reproduce the Debian trixie distribution instead of unstable, which started with 43% of the archive rebuilt, with 79.3% reproduced successfully. This is very much a work in progress, and we'll start reproducing Debian unstable soon. Our i386 hosts are very kindly sponsored by Infomaniak, whilst the amd64 node is sponsored by OSUOSL; thank you! Indeed, we are looking for more workers for more Debian architectures; please contact us if you are able to help.

debian-repro-status Reproducible builds developer kpcyrd has published a client program for reproduce.debian.net (see above) that queries the status of the locally installed packages and rates the system with a percentage score. This tool works analogously to arch-repro-status for the Arch Linux Reproducible Builds setup. The tool was packaged for Debian and is currently available in Debian trixie: it can be installed with apt install debian-repro-status.
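A quick sketch of using it on a trixie system (the package presumably ships a binary of the same name, and the exact output format may vary):
# Install the client and print a reproducibility score for the locally installed packages.
sudo apt install debian-repro-status
debian-repro-status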

On our mailing list On our mailing list this month:
  • Bernhard M. Wiedemann wrote a detailed post on his long journey towards a bit-reproducible Emacs package. In his interesting message, Bernhard goes into depth about the tools that they used and the lower-level technical details of, for instance, compatibility with the version of glibc within openSUSE.
  • Shivanand Kunijadar posed a question pertaining to the reproducibility issues with encrypted images. Shivanand explains that they must use a random IV for encryption with AES CBC. The resulting artifact is not reproducible due to the random IV used. The message resulted in a handful of replies, hopefully helpful!
  • User Danilo posted an interesting question related to their attempts to achieve reproducible builds for Threema Desktop 2.0. The question resulted in a number of replies attempting to find the right combination of compiler and linker flags (for example).
  • Longstanding contributor David A. Wheeler wrote to our list announcing the release of the Census III of Free and Open Source Software: Application Libraries report, written by Frank Nagle, Kate Powell, Richie Zitomer and David himself. As David writes in his message, the report attempts to answer the question "what is the most popular Free and Open Source Software (FOSS)?".
  • Lastly, kpcyrd followed up on a post from September 2024 which mentioned their desire for someone to implement a "hashset of allowed module hashes that is generated during the kernel build and then embedded in the kernel image", thus enabling a deterministic and reproducible build. However, they are now reporting that "somebody implemented the hash-based allow list feature and submitted it to the Linux kernel mailing list". Like kpcyrd, we hope it gets merged.

Enhancing the Security of Software Supply Chains: Methods and Practices Mehdi Keshani of the Delft University of Technology in the Netherlands has published their thesis on "Enhancing the Security of Software Supply Chains: Methods and Practices". Their introductory summary first begins with an outline of software supply chains and the importance of the Maven ecosystem before outlining the issues that it faces "that threaten its security and effectiveness". To address these:
First, we propose an automated approach for library reproducibility to enhance library security during the deployment phase. We then develop a scalable call graph generation technique to support various use cases, such as method-level vulnerability analysis and change impact analysis, which help mitigate security challenges within the ecosystem. Utilizing the generated call graphs, we explore the impact of libraries on their users. Finally, through empirical research and mining techniques, we investigate the current state of the Maven ecosystem, identify harmful practices, and propose recommendations to address them.
A PDF of Mehdi's entire thesis is available to download.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 283 and 284 to Debian:
  • Update copyright years. [ ]
  • Update tests to support file 5.46. [ ][ ]
  • Simplify tests_quines.py::test_{differences,differences_deb} to simply use assert_diff and not mangle the test fixture. [ ]

Supply-chain attack in the Solana ecosystem A significant supply-chain attack impacted Solana, an ecosystem for decentralised applications running on a blockchain. Hackers targeted the @solana/web3.js JavaScript library and embedded malicious code that extracted private keys and drained funds from cryptocurrency wallets. According to some reports, about $160,000 worth of assets were stolen, not including SOL tokens and other crypto assets.

Website updates Similar to last month, there was a large number of changes made to our website this month, including:
  • Chris Lamb:
    • Make the landing page hero look nicer when the vertical height component of the viewport is restricted, not just the horizontal width.
    • Rename the "Buy-in" page to "Why Reproducible Builds?". [ ]
    • Remove the top black border. [ ][ ]
  • Holger Levsen:
  • hulkoba:
    • Remove the sidebar-type layout and move to a static navigation element. [ ][ ][ ][ ]
    • Create and merge a new Success stories page which "highlights the success stories of Reproducible Builds, showcasing real-world examples of projects shipping with verifiable, reproducible builds. These stories aim to enhance the technical resilience of the initiative by encouraging community involvement and inspiring new contributions." [ ]
    • Further changes to the homepage. [ ]
    • Remove the translation icon from the navigation bar. [ ]
    • Remove unused CSS styles pertaining to the sidebar. [ ]
    • Add sponsors to the global footer. [ ]
    • Add extra space on large screens on the Who page. [ ]
    • Hide the side navigation on small screens on the Documentation pages. [ ]

Debian changes There were a significant number of reproducibility-related changes within Debian this month, including:
  • Santiago Vila uploaded version 0.11+nmu4 of the dh-buildinfo package. In this release, dh_buildinfo becomes a no-op, i.e. it no longer does anything beyond warning the developer that the dh-buildinfo package is now obsolete. In his upload, Santiago wrote that "We still want packages to drop their [dependency] on dh-buildinfo, but now they will immediately benefit from this change after a simple rebuild."
  • Holger Levsen filed Debian bug #1091550 requesting a rebuild of a number of packages that were built with a very old version of dpkg.
  • Fay Stegerman contributed to an extensive thread on the debian-devel development mailing list on the topic of "Supporting alternative zlib implementations". In particular, Fay wrote about her results from experimenting with whether zlib-ng produces identical results or not.
  • kpcyrd uploaded a new rust-rebuilderd-worker, rust-derp, rust-in-toto and debian-repro-status to Debian, which passed successfully through the so-called NEW queue.
  • Gioele Barabucci filed a number of bugs against the debrebuild component/script of the devscripts package, including:
    • #1089087: Address a spurious extra subdirectory in the build path.
    • #1089201: Extra zero bytes added to .dynstr when rebuilding CMake projects.
    • #1089088: Some binNMUs have a 1-second offset in some timestamps.
  • Gioele Barabucci also filed a bug against the dh-r package to report that the Recommends and Suggests fields are missing from rebuilt R packages. At the time of writing, this bug has no patch and needs some help to make over 350 binary packages reproducible.
  • Lastly, 8 reviews of Debian packages were added, 11 were updated and 11 were removed this month, adding to our knowledge about identified issues.

Other development news In other ecosystem and distribution news:
  • In openSUSE, Bernhard M. Wiedemann published another report for the distribution. There, Bernhard reports on the success of building "R-B-OS", a partial fork of openSUSE with only 100% bit-reproducible packages. This effort was sponsored by the NLNet NGI0 initiative.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In November, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add a new i386.reproduce.debian.net rebuilder. [ ][ ][ ][ ][ ][ ]
    • Make a number of updates to the documentation. [ ][ ][ ][ ][ ]
    • Run i386.reproduce.debian.net on a public port to allow external workers. [ ]
    • Add a link to the /api/v0/pkgs/list endpoint. [ ]
    • Add support for a statistics page. [ ][ ][ ][ ][ ][ ]
    • Limit build logs to 20 MiB and diffoscope output to 10 MiB. [ ]
    • Improve the frontpage. [ ][ ]
    • Explain that we're testing arch:any and arch:all on the amd64 architecture, but only arch:any on i386. [ ]
  • Misc:
    • Remove code for testing Arch Linux, which has moved to reproduce.archlinux.org. [ ][ ]
    • Don't install dstat on Jenkins nodes anymore as it's been removed from Debian trixie. [ ]
    • Prepare the infom08-i386 node to become another rebuilder. [ ]
    • Add debug date output for benchmarking the reproducible_pool_buildinfos.sh script. [ ]
    • Install installation-birthday everywhere. [ ]
    • Temporarily disable automatic updates of pool links on buildinfos.debian.net. [ ]
    • Install Recommends by default on Jenkins nodes. [ ]
    • Rename rebuilder_stats.py to rebuilderd_stats.py. [ ]
    • r.d.n/stats: minor formatting changes. [ ]
    • Install files under /etc/cron.d/ with the correct permissions. [ ]
In addition, Jochen Sprickerhof made a number of changes. Lastly, Gioele Barabucci classified packages affected by the 1-second offset issue filed as Debian bug #1089088 [ ][ ][ ][ ], Chris Hofstaedtler updated the URL for Grml's dpkg.selections file [ ], Roland Clobus updated the Jenkins log parser to parse warnings from diffoscope [ ] and Mattia Rizzolo banned a number of bots and crawlers from the service [ ][ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 January 2025

John Goerzen: Censorship Is Complicated: What Internet History Says about Meta/Facebook

In light of this week's announcement by Meta (Facebook, Instagram, Threads, etc), I have been pondering this question: Why am I, a person that has long been a staunch advocate of free speech and encryption, leery of sites that talk about being free speech-oriented? And, more to the point, why am I, a person that has been censored by Facebook for mentioning the Open Source social network Mastodon, not cheering a "lighter touch"? The answers are complicated, and take me back to the early days of social networking. Yes, I mean the 1980s and 1990s. Before digital communications, there were barriers to reaching a lot of people. Especially money. This led to a sort of self-censorship: it may be legal to write certain things, but would a newspaper publish a letter to the editor containing expletives? Probably not. As digital communications started to happen, suddenly people could have their own communities. Not just free from the same kinds of monetary pressures, but free from outside oversight (parents, teachers, peers, community, etc.). When you have a community that the majority of people lack the equipment to access, and wouldn't understand how to access even if they had the equipment, you have a place where self-expression can be unleashed. And, as J. C. Herz covers in what is now an unintentional history (her book Surfing on the Internet was published in 1995), self-expression WAS unleashed. She enjoyed the wit and expression of everything from odd corners of Usenet to the text-based open world of MOOs and MUDs. She even talks about groups dedicated to insults (flaming) in positive terms. But as I've seen time and again, if there are absolutely no rules, then whenever a group gets big enough (more than a few dozen people, say) there are troublemakers that ruin it for everyone. Maybe it's trolling, maybe it's vicious attacks, you name it; it will arrive and it will be poisonous. I remember the debates within the Debian community about this. Debian is one of the pillars of the Internet today, a nonprofit project with free speech in its DNA. And yet there were inevitably the poisonous people. Debian took too long to learn that allowing those people to run rampant was causing more harm than good, because having a well-worn Delete key and a tolerance for insults became a requirement for being a Debian developer, and that drove away people that had no desire to deal with such things. (I should note that Debian strikes a much better balance today.) But in reality, there were never absolutely no rules. If you joined a BBS, you used it at the whim of the owner (the sysop or system operator). The sysop may be a 16-yr-old running it from their bedroom, or a retired programmer, but in any case they were letting you use their resources for free and they could kick you off for any or no reason at all. So if you caused trouble, or perhaps insulted their cat, you're banned. But, in all but the smallest towns, there were other options you could try. On the other hand, sysops enjoyed having people call their BBSs and didn't want to drive everyone off, so there was a natural balance at play. As networks like Fidonet developed, a sort of uneasy approach kicked in: don't be excessively annoying, and don't be easily annoyed. Like it or not, it seemed to generally work. A BBS that repeatedly failed to deal with troublemakers could risk removal from Fidonet. On the more institutional Usenet, you generally got access through your university (or, in a few cases, employer). 
Most universities didn't really even know they were running a Usenet server, and you were generally left alone. Until you did something that annoyed somebody enough that they tracked down the phone number for your dean, in which case real-world consequences would kick in. A site may face the Usenet Death Penalty (delinking from the network) if they repeatedly failed to prevent malicious content from flowing through their site. Some BBSs let people from minority communities such as LGBTQ+ thrive in a place of peace from tormentors. A lot of them let people be themselves in a way they couldn't be in "real life". And yes, some harbored trolls and flamers. The point I am trying to make here is that each BBS, or Usenet site, set their own policies about what their own users could do. These had to be harmonized to a certain extent with the global community, but in a certain sense, with BBSs especially, you could just use a different one if you didn't like what the vibe was at a certain place. That this free speech ethos survived was never inevitable. There were many attempts to regulate the Internet, and it was thanks to the advocacy of groups like the EFF that we have things like strong encryption and a degree of freedom online. With the rise of the very large platforms (and here I mean CompuServe and AOL at first, and then Facebook, Twitter, and the like later) the low-friction option of just choosing a different place started to decline. You could participate on a Fidonet forum from any of thousands of BBSs, but you could only participate in an AOL forum from AOL. The same goes for Facebook, Twitter, and so forth. Not only that, but as social media became conceived of as very large sites, it became impossible for a person with enough skill, funds, and time to just start a site themselves. Instead of needing a few thousand dollars of equipment, you'd need tens or hundreds of millions of dollars of equipment and employees. All that means you can't really run Facebook as a nonprofit. It is a business. It should be absolutely clear to everyone that Facebook's mission is not the one they say it is: "[to] give people the power to build community and bring the world closer together." If that was their goal, they wouldn't be creating AI users and AI spam and all the rest. Zuck isn't showing courage; he's sucking up to Trump, and those that will pay the price are those that always do: women and minorities. Really, the point of any large social network isn't to build community. It's to make the owners their next billion. They do that by convincing people to look at ads on their site. Zuck is as much a windsock as anyone else; he will adjust policies in whichever direction he thinks the wind is blowing so as to let him keep putting ads in front of eyeballs, and stomp all over principles, even free speech, doing it. Don't expect anything different from any large commercial social network either. Bluesky is going to follow the same trajectory as all the others. The problem with a one-size-fits-all content policy is that the world isn't that kind of place. For instance, I am a pacifist. There is a place for a group where pacifists can hang out with each other, free from the noise of the debate about pacifism. And there is a place for the debate. Forcing everyone that signs up for the conversation to sign up for the debate is harmful. Preventing the debate is often also harmful. One company can't square this circle. Beyond that, the fact that we care so much about one company is a problem on two levels. 
First, it indicates how susceptible people are to misinformation and such. I don't have much to offer on that point. Secondly, it indicates that we are too centralized. We have a solution there: Mastodon. Mastodon is a modern, open source, decentralized social network. You can join any instance, easily migrate your account from one server to another, and so forth. You pick an instance that suits you. There are thousands of others you can choose from. Some aggressively defederate with instances known to harbor poisonous people; some don't. And, to harken back to the BBS era, if you have some time, some skill, and a few bucks, you can run your own Mastodon instance. Personally, I still visit Facebook on occasion because some people I care about are mainly there. But it is such a terrible experience that I rarely do. Meta is becoming irrelevant to me. They are on a path to becoming irrelevant to many more as well. Maybe this is the moment to go "shrug, this sucks" and try something better. (And when you do, feel free to say hi to me at @jgoerzen@floss.social on Mastodon.)

5 January 2025

Enrico Zini: ncdu on files to back up

I use borg and restic to back up files in my system. Sometimes I run a huge download or clone a large git repo and forget to mark it with CACHEDIR.TAG, and it gets picked up, slowing the backup process and wasting backup space uselessly. I would like to occasionally audit the system to have an idea of what is a candidate for backup. ncdu would be great for this, but it doesn't know about backup exclusion filters. Let's teach it then. Here's a script that simulates a backup and feeds the results to ncdu:
#!/usr/bin/python3
import argparse
import os
import sys
import time
import stat
import json
import subprocess
import tempfile
from pathlib import Path
from typing import Any
FILTER_ARGS = [
    "--one-file-system",
    "--exclude-caches",
    "--exclude",
    "*/.cache",
]
BACKUP_PATHS = [
    "/home",
]
class Dir:
    """
    Dispatch borg output into a hierarchical directory structure.
    borg prints a flat file list, ncdu needs a hierarchical JSON.
    """
    def __init__(self, path: Path, name: str):
        self.path = path
        self.name = name
        self.subdirs: dict[str, "Dir"] = {}
        self.files: list[str] = []
    def print(self, indent: str = "") -> None:
        for name, subdir in self.subdirs.items():
            print(f"{indent}{name}: /")
            subdir.print(indent + " ")
        for name in self.files:
            print(f"{indent}{name}")
    def add(self, parts: tuple[str, ...]) -> None:
        if len(parts) == 1:
            self.files.append(parts[0])
            return
        subdir = self.subdirs.get(parts[0])
        if subdir is None:
            subdir = Dir(self.path / parts[0], parts[0])
            self.subdirs[parts[0]] = subdir
        subdir.add(parts[1:])
    def to_data(self) -> list[Any]:
        res: list[Any] = []
        st = self.path.stat()
        res.append(self.collect_stat(self.name, st))
        for name, subdir in self.subdirs.items():
            res.append(subdir.to_data())
        dir_fd = os.open(self.path, os.O_DIRECTORY)
        try:
            for name in self.files:
                try:
                    st = os.lstat(name, dir_fd=dir_fd)
                except FileNotFoundError:
                    print(
                        "Possibly broken encoding:",
                        self.path,
                        repr(name),
                        file=sys.stderr,
                    )
                    continue
                if stat.S_ISDIR(st.st_mode):
                    continue
                res.append(self.collect_stat(name, st))
        finally:
            os.close(dir_fd)
        return res
    def collect_stat(self, fname: str, st) -> dict[str, Any]:
        res = {
            "name": fname,
            "ino": st.st_ino,
            "asize": st.st_size,
            "dsize": st.st_blocks * 512,
        }
        if stat.S_ISDIR(st.st_mode):
            res["dev"] = st.st_dev
        return res
class Scanner:
    def __init__(self) -> None:
        self.root = Dir(Path("/"), "/")
        self.data = None
    def scan(self) -> None:
        with tempfile.TemporaryDirectory() as tmpdir_name:
            mock_backup_dir = Path(tmpdir_name) / "backup"
            subprocess.run(
                ["borg", "init", mock_backup_dir.as_posix(), "--encryption", "none"],
                cwd=Path.home(),
                check=True,
            )
            proc = subprocess.Popen(
                [
                    "borg",
                    "create",
                    "--list",
                    "--dry-run",
                ]
                + FILTER_ARGS
                + [
                    f"{mock_backup_dir}::test",
                ]
                + BACKUP_PATHS,
                cwd=Path.home(),
                stderr=subprocess.PIPE,
            )
            assert proc.stderr is not None
            for line in proc.stderr:
                match line[0:2]:
                    case b"- ":
                        path = Path(line[2:].strip().decode())
                    case b"x ":
                        continue
                    case _:
                        raise RuntimeError(f"Unparsable borg output: {line!r}")
                if path.parts[0] != "/":
                    raise RuntimeError(f"Unsupported path: {path.parts!r}")
                self.root.add(path.parts[1:])
    def to_json(self) -> list[Any]:
        return [
            1,
            0,
            {
                "progname": "backup-ncdu",
                "progver": "0.1",
                "timestamp": int(time.time()),
            },
            self.root.to_data(),
        ]
    def export(self):
        return json.dumps(self.to_json()).encode()
def main():
    parser = argparse.ArgumentParser(
        description="Run ncdu to estimate sizes of files to backup."
    )
    parser.parse_args()
    scanner = Scanner()
    scanner.scan()
    # scanner.root.print()
    res = subprocess.run(["ncdu", "-f-"], input=scanner.export())
    sys.exit(res.returncode)
if __name__ == "__main__":
    main()

30 December 2024

Russ Allbery: Review: House in Hiding

Review: House in Hiding, by Jenny Schwartz
Series: Uncertain Sanctuary #2
Publisher: Jenny Schwartz
Copyright: October 2020
Printing: September 2024
ASIN: B0DBX6GP8Z
Format: Kindle
Pages: 196
House in Hiding is the second book of a self-published space fantasy trilogy that started with The House That Walked Between Worlds. I read it as part of the Uncertain Sanctuary omnibus, which is reflected in the sidebar metadata. At the end of the previous book, Kira had gathered a motley crew for her house and discovered that she had drawn the attention of some rather significant galactic powers. Now, with the help of her new (hopefully) friends, she has to decide what role she's going to play in the galaxy. Or she can dither a lot, ruminate repeatedly on the same topics, and flail about randomly. That's also an option. This is slightly unfair. By the second half of the book, the series plot is beginning to cohere around two major problems: what is happening to the magic flows in the universe, and who killed Kira's parents. But apparently there was a limit on my enjoyment for the chaos in Kira's chaotic decisiveness I praised in my review of the last book, and I hit that limit around the middle of this book. I am interested in the questions of ethics, responsibility, and public image that this series is raising. I'm just not convinced that Schwartz is going to provide satisfying answers. One thing I do appreciate about this book is that it acknowledges that politics exist and that taking powerful people at face value is a bad idea. You would think that this would be a low bar, and yet it's depressing how many fantasy novels signal the trustworthiness of a character via some variation of "I looked into his eyes and shook his hand," or at least expect readers to be surprised by the inevitable betrayals. Schwartz does not make that mistake; after getting a call from a powerful player in galactic politics, the characters take apart everything that was said while assuming it could be attempted manipulation, which is the correct initial response. My problem comes after that. I like reading about competent characters with a plan, and these are absurdly powerful but very naive characters with no plan. This is realistic for the situation Kira has been thrust into, but it's not that entertaining to read about. I think the root of my problem is that there are some fundamental storytelling problems here that Schwartz is struggling to fix. The basic theory of story says that you need a protagonist, a setting, a conflict, and a plot. Schwartz has a good protagonist, one great supporting character and several adequate ones, and an enjoyably weird setting. I think she's working her way up to having a plot, although usually it's best for the plot to show up before the middle book of the series. What she doesn't have is a meaningful conflict. It's not entirely clear to either the reader or to Kira why Kira cares about what's happening. You would not think this would be a problem given that Kira's parents were murdered before the start of the first book. That's a classic conflict that's driven more books than I think anyone could count. It's not what Kira has cared about up to this point, however; she got away from Earth and has shown no sign of wanting to go back or identify the people who killed her parents, perhaps because she mostly blames herself. Instead, she's stumbling across other problems in the universe that other people would like her to care about. She occasionally feels like she ought to care about them because they involve her new friends or because she wants to be a good person, but they have very little dramatic oomph. 
"I'm a sorcerer and vaguely want the universe to be a better place" turns out to not work that well as a source of dramatic tension. This lack of conflict is somewhat fascinating because it's so different than most fantasy novels. If Schwartz were more aware of how oddly disconnected her protagonist is from the story conflict, I think there could be a thoughtful, if odd, psychological novel in here about one's ethical responsibilities if one suddenly had vast power and no strong attachments to the world. Kira does gesture occasionally in that direction, but there's no real meat to her musings. Instead, her lack of motivation is solved through one of the hoariest tropes in fiction: children in danger. I really want to like this series, and I still love the House, but this book was not good. The romance that I was delighted to not be subjected to in the first book appears to be starting (sigh), the political maneuvering that happens here is only mildly interesting and not believably competent, and the book concludes in Kira making an egregiously and blatantly stupid mistake that should have resulted in one of her friends asking her what the hell she was doing. Some setup happens, and it seems likely that the final book will have a clear conflict and plot, but this middle book was a disappointing mess. These books are fast to read and lightly entertaining between other things, and the House still has me invested enough in this universe that I'll read the last book in the omnibus. Be warned, though, that the middle book is more a collection of anecdotes than a story, and there's only so much of Kira showing off her power I can take without a conflict and a plot. Followed by The House That Fought. Rating: 5 out of 10

29 December 2024

Russ Allbery: Review: The Last Hour Between Worlds

Review: The Last Hour Between Worlds, by Melissa Caruso
Series: The Echo Archives #1
Publisher: Orbit
Copyright: November 2024
ISBN: 0-316-30364-X
Format: Kindle
Pages: 388
The Last Hour Between Worlds is urban, somewhat political high fantasy with strong fae vibes. It is the first book of a series, but it stands alone quite well. Kembral Thorne is a Hound, a member of the guild that serves as guards, investigators, and protectors. Kembral's specialty is Echo retrieval: rescues of people and animals who have fallen through a weak spot in reality into one of the strange, dangerous, and malleable layers called Echoes. Kem once rescued a dog from six layers down, an almost unheard-of feat. Kem is also a new single mother, which means her past two months have been spent in a sleep-deprived haze revolving exclusively around her much-beloved infant. Dona Marjorie Swift's year-turning party is the first time she's been out without Emmi since she gave birth, and she's only there because her sister took the child and practically shoved her out the door. Now, she's desperately trying to remember how to be social and normal, which is not made easier by the unexpected presence of Rika at the party. Rika Nonesuch is not a Hound. She's a Cat, a member of the guild of thieves and occasional assassins. They are the nemesis of the Hounds, but in a stylized and formalized way in which certain courtesies are expected. (The politics of this don't really make sense; you just have to go with it.) Kem has complicated feelings about Rika's grace, banter, and intoxicating perfume, feelings that she thought might be reciprocated until Rika drugged her during an apparent date and left her buried under a pile of garbage. She was not expecting Rika to be at this party and is definitely not ready to have a conversation with her. This emotional turmoil is rudely interrupted by the death of nearly everyone at the party via an Echo poison, the appearance of a dark figure driving a black sword into someone, and the descent of the entire party into an Echo. This was one of those books that kept getting better the farther into the book I read. I was a bit leery at first because the publisher's blurb made it sound more like horror than I prefer, but this is more the disturbing strangeness of fae creatures than the sort of gruesomeness, disgust, or body horror that I find off-putting. Most importantly, the point of this book is not to torture the characters or scare the reader. It's instead structured a bit like a murder mystery, but one whose resolution requires working out obscure fantasy rules and hidden political agendas. One of the currencies in the world of Echos is blood, but another is emotion, revelation, and the stories that bring both, and Caruso focuses the story more on that aspect than on horrifying imagery.
Rika frowned. "Resolve it? How?" "I have no idea." I couldn't keep my frustration from leaking through. "Might be that we have to delve deep into our own hearts to confront the unhealed wounds we've carried with us in secret. Might be that we have to say their names backward, or just close our eyes and they'll go away. Echoes never make any damned sense." Rika made a face. "We'd better not have to confront our unhealed wounds, or I'm leaving you to die."
All of The Last Hour Between Worlds is told in the first person from Kem's perspective, but Rika is the best character in this book. Kem is a rather straightforward, dogged, stubborn protector; Rika is complicated, selfish, conflicted, and considerably more dynamic. The first obvious twist in her background I spotted so long before Kem found out that it was a bit frustrating, but there were multiple satisfying twists after that. As advertised in the blurb, there's a sapphic romance angle here, but it's the sort that comes from a complicated friendship and a lot of mutual respect rather than love at first sight. Some of their relationship conflict is driven by misunderstanding, but the misunderstanding happens before the novel begins, which means the reader doesn't have to sit through the bit where one yells at the characters for being stupid. It helps that the characters have something concrete to do, and that driving plot problem is multi-layered and satisfying. Each time the party falls through a layer of reality, it's mostly reset to the start of the book, but the word "mostly" is hiding a lot of subtlety. Given the clock at the start of each chapter and the blurb (if one read it), the reader can make a good guess that the plot problem will not be fully resolved until the characters fall quite deep into the Echoes, but the story never felt repetitive the way that some time loop stories can. As the characters gain more understanding, the problems change, the players change, and they have to make several excursions into the surrounding world. This is the sort of fantasy that feels a bit like science fiction. You're thrown into a world with a different culture and different rules that are foreign to the reader and natural to the characters. Part of the fun of reading is figuring out the rules, history, and backstory while watching the characters try to solve the puzzles they're faced with. The writing is good but not great. Characterization was good enough for a story primarily focused on action and puzzle-solving, but it was a bit lacking in subtlety. I think Caruso's strengths showed most in the world design, particularly the magic system and the rules followed by the Echo creatures. The excursions outside of the somewhat-protected house struck a balance between eeriness and comprehensibility that reminded me of T. Kingfisher or Sandman. The human politics were unfortunately less successful and rested on some tired centrist cliches. Thankfully, this was not the main point of the story. I should also warn that there is a lot of talk about babies. Kem's entire identity at the start of the novel, to the point of incessant monologue, is "new mother." This is not a perspective we get very often in fantasy, and Kem eventually finds a steadier balance between her bond with her daughter and the other parts of her life. I think some readers will feel very seen. But Caruso leans hard into maternal bonding. So hard. If you don't want to read about someone who is deliriously obsessed with their new child, you may want to skip this one. Right after I finished this book, I thought it was amazing. Now that I've had a few days to think about it, the lack of subtlety and the facile human politics brought it down a notch. 
I'm a science fiction reader at heart, so I loved the slow revelation of mechanics; the reader starts the story by knowing that Kem can "blink step" but not knowing what that means, and by the end of the story one not only knows but has opinions about its limitations, political implications, and interactions with other forms of magic. The Echo worlds are treated similarly, and this type of world-building is my jam. But the cost is that the human characters, particularly the supporting cast, don't get the same focus and therefore are a bit straightforward and obvious. The subplot with Dona Vandelle was particularly annoying. Ah well. Kem and Rika's relationship did work, and it's the center of the book. If you like fantasy mechanics but are a bit leery of fae stories because they feel too symbolic or arbitrary, give this a try. It's the most satisfyingly constructed fae story that I've read in a long time. It's not great literary fiction, but it's also not trying to be; it's a puzzle adventure, and a well-executed one. Recommended, and I will definitely be reading the sequel. Content notes: Lots of violent death and other physical damage, creepy dream worlds with implied but not explicit horror, and rather a lot of blood. Followed by The Last Soul Among Wolves, not yet published at the time I wrote this review. Rating: 8 out of 10

28 December 2024

Enrico Zini: Disable spellchecker popup on Android

On Android, there's a spellchecker popup that occasionally appears over the keyboard, getting in the way in a very annoying fashion. See for example this unanswered question with screenshots. It looks like a feature of the keyboard, but it's not, and so I looked and I looked and I could not find how to turn it off. The answer is to look for how to disable the spellchecker in the keyboard section of the Android system settings, not in the Android keyboard app settings. See for example this answer on stackexchange.

27 December 2024

Wouter Verhelst: Writing an extensible JSON-based DSL with Moose

At work, I've been maintaining a perl script that needs to run a number of steps as part of a release workflow. Initially, that script was very simple, but over time it has grown to do a number of things. And then some of those things did not need to be run all the time. And then we wanted to do this one exceptional thing for this one case. And so on; eventually the script became a big mess of configuration options and unreadable flow, and so I decided that I wanted it to be more configurable. I sat down and spent some time on this, and eventually came up with what I now realize is a domain-specific language (DSL) in JSON, implemented by creating objects in Moose, extensible by writing more object classes. Let me explain how it works. In order to explain, however, I need to explain some perl and Moose basics first. If you already know all that, you can safely skip ahead past the "Preliminaries" section that's next.

Preliminaries

Moose object creation, references. In Moose, creating a class is done something like this:
package Foo;
use v5.40;
use Moose;
has 'attribute' => (
    is  => 'ro',
    isa => 'Str',
    required => 1
);
sub say_something {
    my $self = shift;
    say "Hello there, our attribute is " . $self->attribute;
}
The above is a class that has a single attribute called attribute. To create an object, you use the Moose constructor on the class, and pass it the attributes you want:
use v5.40;
use Foo;
my $foo = Foo->new(attribute => "foo");
$foo->say_something;
(output: Hello there, our attribute is foo) This creates a new object with the attribute attribute set to foo. The attribute accessor is a method generated by Moose, which functions both as a getter and a setter (though in this particular case we made the attribute "ro", meaning read-only, so while it can be set at object creation time it cannot be changed by the setter anymore). So yay, an object. And it has methods, things that we set ourselves. Basic OO, all that. One of the peculiarities of perl is its concept of "lists". Not to be confused with the lists of python -- a concept that is called "arrays" in perl and is somewhat different -- in perl, lists are enumerations of values. They can be used as initializers for arrays or hashes, and they are used as arguments to subroutines. Lists cannot be nested; whenever a hash or array is passed in a list, the list is "flattened", that is, it becomes one big list. This means that the below script is functionally equivalent to the above script that uses our "Foo" object:
use v5.40;
use Foo;
my %args;
$args{attribute} = "foo";
my $foo = Foo->new(%args);
$foo->say_something;
(output: Hello there, our attribute is foo) This creates a hash %args wherein we set the attributes that we want to pass to our constructor. We set one attribute in %args, the one called attribute, and then use %args and rely on list flattening to create the object with the same attribute set (list flattening turns a hash into a list of key-value pairs). Perl also has a concept of "references". These are scalar values that point to other values; the other value can be a hash, a list, or another scalar. There is syntax to create a non-scalar value at assignment time, called anonymous references, which is useful when one wants to remember non-scoped values. By default, references are not flattened, and this is what allows you to create multidimensional values in perl; however, it is possible to request list flattening by dereferencing the reference. The below example, again functionally equivalent to the previous two examples, demonstrates this:
use v5.40;
use Foo;
my $args = {};
$args->{attribute} = "foo";
my $foo = Foo->new(%$args);
$foo->say_something;
(output: Hello there, our attribute is foo) This creates a scalar $args, which is a reference to an anonymous hash. Then, we set the key attribute of that anonymous hash to foo (note the use of the arrow operator here, which is used to indicate that we want to dereference a reference to a hash), and create the object using that reference, requesting hash dereferencing and flattening by using a double sigil, %$. As a side note, objects in perl are references too, hence the fact that we have to use the dereferencing arrow to access the attributes and methods of Moose objects. Moose attributes don't have to be strings or even simple scalars. They can also be references to hashes or arrays, or even other objects:
package Bar;
use v5.40;
use Moose;
extends 'Foo';
has 'hash_attribute' => (
    is => 'ro',
    isa => 'HashRef[Str]',
    predicate => 'has_hash_attribute',
);
has 'object_attribute' => (
    is => 'ro',
    isa => 'Foo',
    predicate => 'has_object_attribute',
);
sub say_something {
    my $self = shift;
    if($self->has_object_attribute) {
        $self->object_attribute->say_something;
    }
    $self->SUPER::say_something unless $self->has_hash_attribute;
    say "We have a hash attribute!"
}
This creates a subclass of Foo called Bar that has a hash attribute called hash_attribute, and an object attribute called object_attribute. Both of them are references; one to a hash, the other to an object. The hash ref is further limited in that it requires that each value in the hash must be a string (this is optional but can occasionally be useful), and the object ref in that it must refer to an object of the class Foo, or any of its subclasses. The predicates used here are extra subroutines that Moose provides if you ask for them, and which allow you to see if an object's attribute has a value or not. The example script would use an object like this:
use v5.40;
use Bar;
my $foo = Foo->new(attribute => "foo");
my $bar = Bar->new(object_attribute => $foo, attribute => "bar");
$bar->say_something;
(output: Hello there, our attribute is foo) This example also shows object inheritance, and methods implemented in child classes. Okay, that's it for perl and Moose basics. On to...

Moose Coercion Moose has a concept of "value coercion". Value coercion allows you to tell Moose that if it sees one thing but expects another, it should convert it using a passed subroutine before assigning the value. That sounds a bit dense without an example, so let me show you how it works. Reimagining the Bar package, we could use coercion to eliminate one object creation step from the creation of a Bar object:
package Bar;
use v5.40;
use Moose;
use Moose::Util::TypeConstraints;
extends "Foo";
coerce "Foo",
    from "HashRef",
    via { Foo->new(%$_) };
has 'hash_attribute' => (
    is => 'ro',
    isa => 'HashRef',
    predicate => 'has_hash_attribute',
);
has 'object_attribute' => (
    is => 'ro',
    isa => 'Foo',
    coerce => 1,
    predicate => 'has_object_attribute',
);
sub say_something {
    my $self = shift;
    if($self->has_object_attribute) {
        $self->object_attribute->say_something;
    }
    $self->SUPER::say_something unless $self->has_hash_attribute;
    say "We have a hash attribute!"
}
Okay, let's unpack that a bit. First, we add the Moose::Util::TypeConstraints module to our package. This is required to declare coercions. Then, we declare a coercion to tell Moose how to convert a HashRef to a Foo object: by using the Foo constructor on a flattened list created from the hashref that it is given. Then, we update the definition of the object_attribute to say that it should use coercions. This is not the default, because going through the list of coercions to find the right one has a performance penalty, so if the coercion is not requested then we do not do it. This allows us to simplify declarations. With the updated Bar class, we can simplify our example script to this:
use v5.40;
use Bar;
my $bar = Bar->new(attribute => "bar", object_attribute => { attribute => "foo" });
$bar->say_something
(output: Hello there, our attribute is foo) Here, the coercion kicks in because the value object_attribute, which is supposed to be an object of class Foo, is instead a hash ref. Without the coercion, this would produce an error message saying that the type of the object_attribute attribute is not a Foo object. With the coercion, however, the value that we pass to object_attribute is passed to a Foo constructor using list flattening, and then the resulting Foo object is assigned to the object_attribute attribute. Coercion works for more complicated things, too; for instance, you can use coercion to coerce an array of hashes into an array of objects, by creating a subtype first:
package MyCoercions;
use v5.40;
use Moose;
use Moose::Util::TypeConstraints;
use Foo;
subtype "ArrayOfFoo", as "ArrayRef[Foo]";
subtype "ArrayOfHashes", as "ArrayRef[HashRef]";
coerce "ArrayOfFoo", from "ArrayOfHashes", via { [ map { Foo->create(%$_) } @{$_} ] };
Ick. That's a bit more complex. What happens here is that we use the map function to iterate over a list of values. The given list of values is @{$_}, which is perl for "dereference the default value as an array reference, and flatten the list of values in that array reference". So the ArrayRef of HashRefs is dereferenced and flattened, and each HashRef in the ArrayRef is passed to the map function. The map function then takes each hash ref in turn and passes it to the block of code that it is also given. In this case, that block is { Foo->create(%$_) }. In other words, we invoke the create factory method with the flattened hashref as an argument. This returns an object of the correct implementation (assuming our hash ref has a type attribute set), and with all attributes of the object set to the correct value. That value is then returned from the block (this could be made more explicit with a return call, but that is optional, perl defaults a return value to the rvalue of the last expression in a block). The map function then returns a list of all the created objects, which we capture in an anonymous array ref (the [] square brackets), i.e., an ArrayRef of Foo objects, satisfying the Moose requirement of ArrayRef[Foo]. Usually, I tend to put my coercions in a special-purpose package. Although it is not strictly required by Moose, I find that it is useful to do this, because Moose does not allow a coercion to be defined if a coercion for the same type had already been done in a different package. And while it is theoretically possible to make sure you only ever declare a coercion once in your entire codebase, I find that doing so is easier to remember if you put all your coercions in a specific package. Okay, now you understand Moose object coercion! On to...

Dynamic module loading Perl allows loading modules at runtime. In the most simple case, you just use require inside a stringy eval:
my $module = "Foo";
eval "require $module";
This loads "Foo" at runtime. Obviously, the $module string could be a computed value, it does not have to be hardcoded. There are some obvious downsides to doing things this way, mostly in the fact that a computed value can basically be anything and so without proper checks this can quickly become an arbitrary code vulnerability. As such, there are a number of distributions on CPAN to help you with the low-level stuff of figuring out what the possible modules are, and how to load them. For the purposes of my script, I used Module::Pluggable. Its API is fairly simple and straightforward:
package Foo;
use v5.40;
use Moose;
use Module::Pluggable require => 1;
has 'attribute' => (
    is => 'ro',
    isa => 'Str',
);
has 'type' => (
    is => 'ro',
    isa => 'Str',
    required => 1,
);
sub handles_type {
    return 0;
}
sub create {
    my $class = shift;
    my %data = @_;
    foreach my $impl($class->plugins) {
        if($impl->can("handles_type") && $impl->handles_type($data{type})) {
            return $impl->new(%data);
        }
    }
    die "could not find a plugin for type " . $data{type};
}
sub say_something {
    my $self = shift;
    say "Hello there, I am a " . $self->type;
}
The new concept here is the plugins class method, which is added by Module::Pluggable, and which searches perl's library paths for all modules that are in our namespace. The namespace is configurable, but by default it is the name of our module; so in the above example, if there were a package "Foo::Bar" which
  • has a subroutine handles_type
  • that returns a truthy value when passed the value of the type key in a hash that is passed to the create subroutine,
  • then the create subroutine creates a new object with the passed key/value pairs used as attribute initializers.
Let's implement a Foo::Bar package:
package Foo::Bar;
use v5.40;
use Moose;
extends 'Foo';
has 'type' => (
    is => 'ro',
    isa => 'Str',
    required => 1,
);
has 'serves_drinks' => (
    is => 'ro',
    isa => 'Bool',
    default => 0,
);
sub handles_type {
    my $class = shift;
    my $type = shift;
    return $type eq "bar";
}
sub say_something {
    my $self = shift;
    $self->SUPER::say_something;
    say "I serve drinks!" if $self->serves_drinks;
}
We can now indirectly use the Foo::Bar package in our script:
use v5.40;
use Foo;
my $obj = Foo->create(type => "bar", serves_drinks => 1);
$obj->say_something;
output:
Hello there, I am a bar.
I serve drinks!
Okay, now you understand all the bits and pieces that are needed to understand how I created the DSL engine. On to...

Putting it all together We're actually quite close already. The create factory method in the last version of our Foo package allows us to decide at run time which module to instantiate an object of, and to load that module at run time. We can use coercion and list flattening to turn a reference to a hash into an object of the correct type. We haven't looked yet at how to turn a JSON data structure into a hash, but that bit is actually ridiculously trivial:
use JSON::MaybeXS;
my $data = decode_json($json_string);
Tada, now $data is a reference to a deserialized version of the JSON string: if the JSON string contained an object, $data is a hashref; if the JSON string contained an array, $data is an arrayref, etc. So, in other words, to create an extensible JSON-based DSL that is implemented by Moose objects, all we need to do is create a system that
  • takes hash refs to set arguments
  • has factory methods to create objects, which
    • uses Module::Pluggable to find the available object classes, and
    • uses the type attribute to figure out which object class to use to create the object
  • uses coercion to convert hash refs into objects using these factory methods
In practice, we could have a JSON file with the following structure:
{
    "description": "do stuff",
    "actions": [
        {
            "type": "bar",
            "serves_drinks": true
        },
        {
            "type": "bar",
            "serves_drinks": false
        }
    ]
}
... and then we could have a Moose object definition like this:
package MyDSL;
use v5.40;
use Moose;
use MyCoercions;
has "description" => (
    is => 'ro',
    isa => 'Str',
);
has 'actions' => (
    is => 'ro',
    isa => 'ArrayOfFoo',
    coerce => 1,
    required => 1,
);
sub say_something {
    my $self = shift;
    say "Hello there, I am described as " . $self->description . " and I am performing my actions: ";
    foreach my $action (@{$self->actions}) {
        $action->say_something;
    }
}
Now, we can write a script that loads this JSON file and create a new object using the flattened arguments:
use v5.40;
use MyDSL;
use JSON::MaybeXS;
my $input_file_name = shift;
my $args = do {
    local $/ = undef;
    open my $input_fh, "<", $input_file_name or die "could not open file";
    <$input_fh>;
};
$args = decode_json($args);
my $dsl = MyDSL->new(%$args);
$dsl->say_something
Output:
Hello there, I am described as do stuff and I am performing my actions:
Hello there, I am a bar
I serve drinks!
Hello there, I am a bar
In some more detail, this will:
  • Read the JSON file and deserialize it;
  • Pass the object keys in the JSON file as arguments to a constructor of the MyDSL class;
  • The MyDSL class then uses those arguments to set its attributes, using Moose coercion to convert the "actions" array of hashes into an array of Foo::Bar objects.
  • Perform the say_something method on the MyDSL object
Once this is written, extending the scheme to also support a "quux" type simply requires writing a Foo::Quux class, making sure it has a method handles_type that returns a truthy value when called with quux as the argument, and installing it into the perl library path. This is rather easy to do. It can even be extended deeper, too; if the quux type requires a list of arguments rather than just a single argument, it could itself also have an array attribute with relevant coercions. These coercions could then be used to convert the list of arguments into an array of objects of the correct type, using the same schema as above. The actual DSL is of course somewhat more complex, and also actually does something useful, in contrast to the DSL that we define here which just says things. Creating an object that actually performs some action when required is left as an exercise to the reader.
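To make that last point concrete, here is a minimal sketch of what such a Foo::Quux class could look like. It is not part of the original script, and the serves_snacks attribute is an invented placeholder; the class simply mirrors the Foo::Bar example from earlier:
package Foo::Quux;
use v5.40;
use Moose;
extends 'Foo';
# Invented attribute, purely for illustration.
has 'serves_snacks' => (
    is => 'ro',
    isa => 'Bool',
    default => 0,
);
sub handles_type {
    my $class = shift;
    my $type = shift;
    # Claim the "quux" type so that Foo->create dispatches to this class.
    return $type eq "quux";
}
sub say_something {
    my $self = shift;
    $self->SUPER::say_something;
    say "I serve snacks!" if $self->serves_snacks;
}
Once a file like this is installed under the Foo:: namespace somewhere on the perl library path, Module::Pluggable should pick it up automatically, and an action with "type": "quux" in the JSON input would be dispatched to it without any change to the existing code.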

24 December 2024

Russ Allbery: Review: Number Go Up

Review: Number Go Up, by Zeke Faux
Publisher: Crown Currency
Copyright: 2023
Printing: 2024
ISBN: 0-593-44382-9
Format: Kindle
Pages: 373
Number Go Up is a cross between a history and a first-person account of investigative journalism around the cryptocurrency bubble and subsequent collapse in 2022. The edition I read has an afterward from June 2024 that brings the story up to date with Sam Bankman-Fried's trial and a few other events. Zeke Faux is a reporter for Bloomberg News and a fellow of New America. Last year, I read Michael Lewis's Going Infinite, a somewhat-sympathetic book-length profile of Sam Bankman-Fried that made a lot of people angry. One of the common refrains at the time was that people should read Number Go Up instead, and since I'm happy to read more about the absurdities of the cryptocurrency world, I finally got around to reading the other big crypto book of 2023. This is a good book, with some caveats that I am about to explain at absurd length. If you want a skeptical history of the cryptocurrency bubble, you should read it. People who think that it's somehow in competition with Michael Lewis's book or who think the two books disagree (including Faux himself) have profoundly missed the point of Going Infinite. I agree with Matt Levine: Both of these books are worth your time if this is the sort of thing you like reading about. But (much) more on Faux's disagreements with Lewis later. The frame of Number Go Up is Faux's quixotic quest to prove that Tether is a fraud. To review this book, I therefore need to briefly explain what Tether is. This is only the first of many extended digressions. One natural way to buy cryptocurrency would be to follow the same pattern as a stock brokerage account. You would deposit some amount of money into the account (or connect the brokerage account to your bank account), and then exchange money for cryptocurrency or vice versa, using bank transfers to put money in or take it out. However, there are several problems with this. One is that swapping cryptocurrency for money is awkward and sometimes expensive. Another is that holding people's investment money for them is usually highly regulated, partly for customer safety but also to prevent money laundering. These are often called KYC laws (Know Your Customer), and the regulation-hostile world of cryptocurrency didn't want to comply with them. Tether is a stablecoin, which means that the company behind Tether attempts to guarantee that one Tether is always worth exactly one US dollar. It is not a speculative investment like Bitcoin; it's a cryptocurrency substitute for dollars. People exchange dollars for Tether to get their money into the system and then settle all of their subsequent trades in Tether, only converting the Tether back to dollars when they want to take their money out of cryptocurrency entirely. In essence, Tether functions like the cash reserve in a brokerage account: Your Tether holdings are supposedly guaranteed to be equivalent to US dollars, you can withdraw them at any time, and because you can do so, you don't bother, instead leaving your money in the reserve account while you contemplate what new coin you want to buy. As with a bank, this system rests on the assurance that one can always exchange one Tether for one US dollar. The instant people stop believing this is true, people will scramble to get their money out of Tether, creating the equivalent of a bank run. 
Since Tether is not a regulated bank or broker and has no deposit insurance or strong legal protections, the primary defense against a run on Tether is Tether's promise that they hold enough liquid assets to be able to hand out dollars to everyone who wants to redeem Tether. (A secondary defense that I wish Faux had mentioned is that Tether limits redemptions to registered accounts redeeming more than $100,000, which is a tiny fraction of the people who hold Tether, but for most purposes this doesn't matter because that promise is sufficient to maintain the peg with the dollar.) Faux's firmly-held belief throughout this book is that Tether is lying. He believes they do not have enough money to redeem all existing Tether coins, and that rather than backing every coin with very safe liquid assets, they are using the dollars deposited in the system to make illiquid and risky investments. Faux never finds the evidence that he's looking for, which makes this narrative choice feel strange. His theory was tested when there was a run on Tether following the collapse of the Terra stablecoin. Tether passed without apparent difficulty, redeeming $16B or about 20% of the outstanding Tether coins. This doesn't mean Faux is wrong; being able to redeem 20% of the outstanding tokens is very different from being able to redeem 100%, and Tether has been fined for lying about its reserves. But Tether is clearly more stable than Faux thought it was, which makes the main narrative of the book weirdly unsatisfying. If he admitted he might be wrong, I would give him credit for showing his work even if it didn't lead where he expected, but instead he pivots to focusing on Tether's role in money laundering without acknowledging that his original theory took a serious blow. In Faux's pursuit of Tether, he wanders through most of the other elements of the cryptocurrency bubble, and that's the strength of this book. Rather than write Number Go Up as a traditional history, Faux chooses to closely follow his own thought processes and curiosity. This has the advantage of giving Faux an easy and natural narrative, something that non-fiction books of this type can struggle with, and it lets Faux show how confusing and off-putting the cryptocurrency world is to an outsider. The best parts of this book were the parts unrelated to Tether. Faux provides an excellent summary of the Axie Infinity speculative bubble and even traveled to the Philippines to interview people who were directly affected. He then wandered through the bizarre world of NFTs, and his first-hand account of purchasing one (specifically a Mutant Ape) to get entrance to a party (which sounded like a miserable experience I would pay money to get out of) really drives home how sketchy and weird cryptocurrency-related software and markets can be. He also went to El Salvador to talk to people directly about the country's supposed embrace of Bitcoin, and there's no substitute for that type of reporting to show how exaggerated and dishonest the claims of cryptocurrency adoption are. The disadvantage of this personal focus on Faux himself is that it sometimes feels tedious or sensationalized. I was much less interested in his unsuccessful attempts to interview the founder of Tether than Faux was, and while the digression into forced labor compounds in Cambodia devoted to pig butchering scams was informative (and horrific), I think Faux leaned too heavily on an indirect link to Tether. 
His argument is that cryptocurrency enables a type of money laundering that is particularly well-suited to supporting scams, but both scams and this type of economic slavery existed before cryptocurrency and will exist afterwards. He did not make a very strong case that Tether was uniquely valuable as a money laundering service, as opposed to a currently useful tool that would be replaced with some other tool should it go away. This part of the book is essentially an argument that money laundering is bad because it enables crime, and sure, to an extent I agree. But if you're going to put this much emphasis on the evils of money laundering, I think you need to at least acknowledge that many people outside the United States do not want to give the US government, which is often openly hostile to them, veto power over their financial transactions. Faux does not. The other big complaint I have with this book, and with a lot of other reporting on cryptocurrency, is that Faux is sloppy with the term "Ponzi scheme." This is going to sound like nit-picking, but I think this sloppiness matters because it may obscure an ongoing shift in cryptocurrency markets. A Ponzi scheme is not any speculative bubble. It is a very specific type of fraud in which investors are promised improbably high returns at very low risk and with safe principal. These returns are paid out, not via investment in some underlying enterprise, but by taking the money from new investments and paying it to earlier investors. Ponzi schemes are doomed because satisfying their promises requires a constantly increasing flow of new investors. Since the population of the world is finite, all Ponzi schemes are mathematically guaranteed to eventually fail, often in a sudden death spiral of ever-increasing promises to lure new investors when the investment stream starts to dry up. There are some Ponzi schemes in cryptocurrency, but most practices that are called Ponzi schemes are not. For example, Faux calls Axie Infinity a Ponzi scheme, but it was missing the critical elements of promised safe returns and fraudulently paying returns from the investments of later investors. It was simply a speculative bubble that people bought into on the assumption that its price would increase, and like any speculative bubble those who sold before the peak made money at the expense of those who bought at the peak. The reason why this matters is that Ponzi schemes are a self-correcting problem. One can decry the damage caused when they collapse, but one can also feel the reassuring certainty that they will inevitably collapse and prove the skeptics correct. The same is not true of speculative assets in general. You may think that the lack of an underlying economic justification for prices means that a speculative bubble is guaranteed to collapse eventually, but in the famous words of Gary Shilling, "markets can remain irrational a lot longer than you and I can remain solvent." One of the people Faux interviews explains this distinction to him directly:
Rong explained that in a true Ponzi scheme, the organizer would have to handle the "fraud money." Instead, he gave the sneakers away and then only took a small cut of each trade. "The users are trading between each other. They are not going through me, right?" Rong said. Essentially, he was arguing that by downloading the Stepn app and walking to earn tokens, crypto bros were Ponzi'ing themselves.
Faux is openly contemptuous of this response, but it is technically correct. Stepn is not a Ponzi scheme; it's a speculative bubble. There are no guaranteed returns being paid out of later investments and no promise that your principal is safe. People are buying in at a price that you may consider irrational, but Stepn never promised you would get your money back, let alone make a profit, and therefore it doesn't have the exponential progression of a Ponzi scheme. One can argue that this is a distinction without a moral difference, and personally I would agree, but it matters immensely if one is trying to analyze the future of cryptocurrencies. Schemes as transparently unstable as Stepn (which gives you coins for exercise and then tries to claim those coins have value through some vigorous hand-waving) are nearly as certain as Ponzi schemes to eventually collapse. But it's also possible to create a stable business around allowing large numbers of people to regularly lose money to small numbers of sophisticated players who are collecting all of the winnings. It's called a poker room at a casino, and no one thinks poker rooms are Ponzi schemes or are doomed to collapse, even though nearly everyone who plays poker will lose money. This is the part of the story that I think Faux largely missed, and which Michael Lewis highlights in Going Infinite. FTX was a legitimate business that made money (a lot of money) off of trading fees, in much the same way that a casino makes money off of poker rooms. Lots of people want to bet on cryptocurrencies, similar to how lots of people want to play poker. Some of those people will win; most of those people will lose. The casino doesn't care. Its profit comes from taking a little bit of each pot, regardless of who wins. Bankman-Fried also speculated with customer funds, and therefore FTX collapsed, but there is no inherent reason why the core exchange business cannot be stable if people continue to want to speculate in cryptocurrencies. Perhaps people will get tired of this method of gambling, but poker has been going strong for 200 years. It's also important to note that although trading fees are the most obvious way to be a profitable cryptocurrency casino, they're not the only way. Wall Street firms specialize in finding creative ways to take a cut of every financial transaction, and many of those methods are more sophisticated than fees. They are so good at this that buying and selling stock through trading apps like Robinhood is free. The money to run the brokerage platform comes from companies that are delighted to pay for the opportunity to handle stock trades by day traders with a phone app. This is not, as some conspiracy theories would have you believe, due to some sort of fraudulent price manipulation. It is because the average person with a Robinhood phone app is sufficiently unsophisticated that companies that have invested in complex financial modeling will make a steady profit taking the other side of their trades, mostly because of the spread (the difference between offered buy and sell prices). Faux is so caught up in looking for Ponzi schemes and fraud that I think he misses this aspect of cryptocurrency's transformation. Wall Street trading firms aren't piling into cryptocurrency because they want to do securities fraud. They're entering this market because there seems to be persistent demand for this form of gambling, cryptocurrency markets reward complex financial engineering, and running a legal casino is a profitable business model. 
Michael Lewis appears as a character in this book, and Faux portrays him quite negatively. The root of this animosity appears to stem from a cryptocurrency conference in the Bahamas that Faux attended. Lewis interviewed Bankman-Fried on stage, and, from Faux's account, his questions were fawning and he praised cryptocurrencies in ways that Faux is certain he knew were untrue. From that point on, Faux treats Lewis as an apologist for the cryptocurrency industry and for Sam Bankman-Fried specifically. I think this is a legitimate criticism of Lewis's methods of getting close to the people he wants to write about, but I think Faux also makes the common mistake of assuming Lewis is a muckraking reporter like himself. This has never been what Lewis is interested in. He writes about people he finds interesting and that he thinks a reader will also find interesting. One can legitimately accuse him of being credulous, but that's partly because he's not even trying to do the same thing Faux is doing. He's not trying to judge; he's trying to understand. This shows when it comes to the parts of this book about Sam Bankman-Fried. Faux's default assumption is that everyone involved in cryptocurrency is knowingly doing fraud, and a lot of his research is looking for evidence to support the conclusion he had already reached. I don't think there's anything inherently wrong with that approach: Faux is largely, although not entirely, correct, and this type of hostile journalism is incredibly valuable for society at large. Upton Sinclair didn't start writing The Jungle with an open mind about the meat-packing industry. But where Faux and Lewis disagree on Bankman-Fried's motivations and intentions, I think Lewis has the much stronger argument. Faux's position is that Bankman-Fried always intended to steal people's money through fraud, perhaps to fund his effective altruism donations, and his protestations that he made mistakes and misplaced funds are obvious lies. This is an appealing narrative if one is looking for a simple villain, but Faux's evidence in support of this is weak. He mostly argues through stereotype: Bankman-Fried was a physics major and a Jane Street trader and therefore could not possibly be the type of person to misplace large amounts of money or miscalculate risk. If he wants to understand how that could be possible, he could read Going Infinite? I find it completely credible that someone with what appears to be uncontrolled, severe ADHD could be adept at trading and calculating probabilities and yet also misplace millions of dollars of assets because he wasn't thinking about them and therefore they stopped existing. Lewis made a lot of people angry by being somewhat sympathetic to someone few people wanted to be sympathetic towards, but Faux (and many others) are also misrepresenting his position. Lewis agrees that Bankman-Fried intentionally intermingled customer funds with his hedge fund and agrees that he lied about doing this. His only contention is that Bankman-Fried didn't do this to steal the money; instead, he invested customer money in risky bets that he thought would pay off. In support of this, Lewis made a prediction that was widely scoffed at, namely that much less of FTX's money was missing than was claimed, and that likely most or all of it would be found. And, well, Lewis was basically correct? The FTX bankruptcy is now expected to recover considerably more than the amount of money owed to creditors. 
Faux argues that this is only because the bankruptcy clawed back assets and cryptocurrencies have gone up considerably since the FTX bankruptcy, and therefore that the lost money was just replaced by unexpected windfall profits on other investments, but I don't think this point is as strong as he thinks it is. Bankman-Fried lost money on some of what he did with customer funds, made money on other things, and if he'd been able to freeze withdrawals for the year that the bankruptcy froze them, it does appear most of the money would have been recoverable. This does not make what he did legal or morally right, but no one is arguing that, only that he didn't intentionally steal money for his own personal gain or for effective altruism donations. And on that point, I don't think Faux is giving Lewis's argument enough credit. I have a lot of complaints about this book because I know far more about this topic than anyone should probably know. I think Faux missed the plot in a couple of places, and I wish someone would write a book about where cryptocurrency markets are currently going. (Matt Levine's Money Stuff newsletter is quite good, but it's about all sorts of things other than cryptocurrency and isn't designed to tell a coherent story.) But if you know less about cryptocurrency and just want to hear the details of the run-up to the 2022 bubble, this is a great book for that. Faux is writing for people who are already skeptical and is not going to convince people who are cryptocurrency true believers, but that's fine. The details are largely correct (and extensively footnoted) and will satisfy most people's curiosity. Lewis's Going Infinite is a better book, though. It's not the same type of book at all, and it will not give you the broader overview of the cryptocurrency world. But if you're curious about what was going through the head of someone at the center of all of this chaos, I think Lewis's analysis is much stronger than Faux's. I'm happy I read both books. Rating: 8 out of 10

17 December 2024

Russ Allbery: Review: Iris Kelly Doesn't Date

Review: Iris Kelly Doesn't Date, by Ashley Herring Blake
Series: Bright Falls #3
Publisher: Berkley Romance
Copyright: October 2023
ISBN: 0-593-55058-7
Format: Kindle
Pages: 381
Iris Kelly Doesn't Date is a sapphic romance novel (probably a romantic comedy, although I'm bad at romance subgenres). It is the third book in the Bright Falls series. In the romance style, it has a new set of protagonists, but the protagonists of the previous books appear as supporting characters and reading this will spoil the previous books. Among the friend group we were introduced to in Delilah Green Doesn't Care, Iris was the irrepressible loudmouth. She's bad at secrets, good at saying whatever is on her mind, and has zero desire to either get married or have children. After one of the side plots of Astrid Parker Doesn't Fail, she has sworn off dating entirely. Iris is also now a romance novelist. Her paper store didn't get enough foot traffic to justify staying open, so she switched her planner business to online only and wrote a romance novel that was good enough to get a two-book deal. Now she needs to write a second book and she has absolutely nothing. Her own avoidance of romantic situations is not helping, but neither is her meddling family who are convinced her choices about marriage and family can be overturned with sufficient pestering. She desperately needs to shake up her life, get out of her creative rut, and do something new. Failing that, she'll settle for meeting someone in a bar and having some fun. Stevie is a barista and actress living in Portland. Six months ago, she broke up with Adri, her creative partner, girlfriend of six years, and the first person with whom she had a serious relationship. More precisely, Adri broke up with her. They're still friends, truly, even though that friendship is being seriously strained by Adri dating Vanessa, another member of their small and close-knit friend group. Stevie has occasionally-crippling anxiety, not much luck in finding real acting roles in Portland, and a desperate desire to not make waves. Ren, the fourth member of their friend group, thinks Stevie needs a new relationship, or at least a fling. That's how Stevie, with Ren as backup and encouragement, ends up at the same bar with Iris. The resulting dance and conversation was rather fun for both Stevie and Iris. The attempted one-night stand afterwards was a disaster due to Stevie's anxiety, and neither of them expected to see the other again. Stevie therefore felt safe pretending they'd hit it off to get her friends off her back. When Iris's continued restlessness lands her in an audition for Adri's fundraiser play that she also talked Stevie into performing in, this turns into a full-blown fake dating trope. These books continue to be impossible to put down. I'm not sure what Blake is doing to make the pacing so perfect, but as with the previous books of the series I found this utterly compulsive reading. I started it in the afternoon, took a break in the evening for a few hours, and then finished it at 2am. I wasn't sure if a book focused on Iris would work as well, but I need not have worried. Iris Kelly Doesn't Date is both more dramatic and more trope-centered than the earlier books, but Blake handles that in a way that fits Iris's personality and wasn't annoying even to a reader like me, who has an aversion to many types of relationship drama. The secret is Stevie, and specifically having the other protagonist be someone with severe anxiety.
"No" was never a very easy word for Stevie when it came to Adri, when it came to anyone, really. She could handle the little stuff (do you want a soda, have you seen this movie, do you like onions on your pizza) but the big stuff, the stuff that caused disappointed expressions and down-turned mouths... yeah, she sucked at that part. Her anxiety would flare, and she'd spend the next week convinced her friends hated her, she'd die alone and miserable, and wasn't worth a damn to anyone. Then, when said friend or family member eventually got ahold of her to tell her that, no, of course they didn't hate her, why in the world would she think that, her anxiety would crest once again, convincing her that she was terrible at understanding people and could never trust her own brain to make heads or tails of any social situation.
This is a spot-on description of a particular type of anxiety, but also this is the perfect protagonist to pair with Iris. Throughout the series, Iris has always been the ride-or-die friend, the person who may have no idea how to help but who will show up anyway and at least try to distract you. Stevie's anxiety makes Iris feel protective, which reveals one of the best sides of Iris's personality, and then the protectiveness plays off against Iris's own relationship issues and tendency to avoid taking anything too seriously. It's one of those relationships that starts a bit one-sided and then becomes mutually supporting once Stevie gets her feet under her. That's a relationship pattern I really enjoy reading about. As with the rest of the series, the friendship dynamics are great. Here we get to see two friend groups at work: Iris's, which we've seen in the previous two volumes and which expanded interestingly in Astrid Parker Doesn't Fail, and Stevie's, which is new. I liked all of these people, even Adri in her own way (although she's the hardest to like). The previous happily-ever-afters do get a bit awkward here, but Blake tries to make that part of the plot and also avoids most of the problem of somewhat-boring romantic bliss by spreading the friendship connections a bit wider. Stevie's friend group formed at orientation at Reed College, and that let me put my finger on another property of this series: essentially all of the characters are from a very specific social class. They're nearly all arts people (bookstore owner, photographer, interior decorator, actress, writer, director), they've mostly gone to college, and while most of them don't have lots of money, there's always at least one person in each friend group with significant wealth. Jordan, from the previous book, is a bit of an exception since she works in a trade (a carpenter), but she still acts like someone from that same social class. It's a bit like reading Jane Austen novels and realizing that the protagonists are drawn from a very specific and very narrow portion of society. This is not a complaint, to be clear; I have no objections to reading about a very specific social class. But if one has already read lots of books about this class of people, I could see that diminishing the appeal of this series a bit. There are a lot of assumptions baked into the story that aren't really questioned, such as the ubiquity of therapists. (I don't know how Stevie affords one on a barista salary.) There are also some small things in the terminology (therapy speak, for example) and in the specific type of earnestness with which the books attempt to be diverse on most axes other than social class that I suspect may grate a bit for some readers. If that's you, this is your warning. There is a third-act breakup here, just like the previous volumes. There is also a defense of the emotional punch of third-act breakups in romance novels in the book itself, put into Iris's internal monologue, so I suspect that's the author's answer to critics like myself who don't like the trope. I was less frustrated by this one because it fit the drama level of the protagonists, but I'll also know to expect a third-act breakup in any Blake novel I read in the future. But, all that said, the summary once again is that I loved this book and could not put it down. Iris is dramatic and occasionally self-destructive but has a core of earnest empathy that makes her easy to like. 
She's exactly the sort of extrovert who is soothing to introverts rather than draining because she carries the extrovert load of social situations. Stevie is adorably earnest and thoughtful beneath her anxiety. The two of them are wildly different and yet remarkably good together, and I loved reading their story. Highly recommended, along with the whole series. Start with Delilah Green Doesn't Care; if you like that, you're in for a treat. Content note: This book is also rather sex-forward and pretty explicit in the sex scenes, maybe a touch more than Astrid Parker Doesn't Fail. If that is or is not your thing in romance novels, be aware going in. Rating: 9 out of 10

9 December 2024

Gunnar Wolf: Some tips for those who still administer Drupal7-based sites

A bit of history: Drupal at my workplace (and in Debian) My main day-to-day responsibility in my workplace is, and has been for 20 years, to take care of the network infrastructure for UNAM's Economics Research Institute. One of the most visible parts of this responsibility is to ensure we have a working Web presence, and that it caters to the needs of our academic community. I joined the Institute in January 2005. Back then, our designer pushed static versions of our webpage, built completely on her computer. This was standard practice at the time, and it lasted through some redesigns, but I soon started advocating for the adoption of a Content Management System. After evaluating some alternatives, I recommended adopting Drupal. It took us quite a while to make the change: even though I clearly recall starting work toward adopting it as early as 2006, according to the Internet Archive, we switched to a Drupal-backed site around June 2010. We started using it somewhere in version 6's lifecycle. As for my Debian work, by late 2012 I started getting involved in the maintenance of the drupal7 package, and by April 2013 I became its primary maintainer. I kept the drupal7 package up to date in Debian until 2018; the supported build methods for Drupal 8 are not compatible with Debian (mainly, bundling third-party libraries and updating them without coordination with the rest of the ecosystem), so towards the end of 2016, I announced I would not package Drupal 8 for Debian. By March 2016, we had migrated our main page to Drupal 7. By then, we already had several other sites for our academics' projects, but my narrative follows our main Web site. I did manage to migrate several Drupal 6 (D6) sites to Drupal 7 (D7); it was quite an involved process, never transparent to the user, and we did get backlash from many of our users over long downtimes (or partial downtimes, with sites only half-available). For our main site, we took the opportunity to do a complete redesign and deployed a fully new site. You might note that March 2016 is after the release of D8 (November 2015). I don't recall many of the specifics of this decision, but if I'm not mistaken, building the new site was a months-long process, not only for the technical work of setting it up, but for the legwork of getting all of the needed information from the different areas that need to be represented in the Institute. Not only that: Drupal sites often include tens of contributed themes and modules; the technological shift the project underwent between its 7 and 8 releases was very deep, and modules took a long time to become available for the new release, if they ever did (many themes and modules were outright dumped). Naturally, the Drupal Foundation wanted to evolve and deprecate the old codebase. But the pain of migrating from D7 to D8 is too big, and many sites have remained on version 7. Eight years after D8's release, almost 40% of Drupal installs are for version 7, and a similar proportion runs a currently-supported release (10 or 11). And while the Drupal Foundation did a great job of providing very-long-term support for D7, I understand the burden is becoming too much, so close to a year ago (and after pushing the D7 end-of-life date back several times) they finally announced that support will end this upcoming January 5.

Drupal 7 must go! I found the following usage graphs quite interesting: the usage statistics for all Drupal versions follow a very positive slope, peaking around 2014 during the best years of D7, and somewhat stagnating afterwards, staying since 2015 at the 25,000-28,000 sites mark (I'm very tempted to copy the graphs, but builtwith's terms of use are very clear in not allowing it). There is a sharp drop in the last year, which I attribute to the people who are leaving D7 for other technologies after its end-of-life announcement. This becomes clearer looking only at D7's usage statistics: D7 peaks at 15,000 installs in 2016, stays there for close to 5 years, and has a sharp drop to under 7,500 sites in the span of one year. D8 has a more regular rise, peak, and fall, peaking at ~8,500 between 2020 and 2021, and has been down to close to 2,500 for some months already; D9 has a very brief peak of almost 9,000 sites in 2023 and is now close to half of that. Currently, the Drupal king appears to be D10, still on a positive slope and with over 9,000 sites. Drupal 11 is still just a blip on builtwith's radar, with 3 registered sites as of September 2024 :- After writing this last paragraph, I came across the statistics found on the Drupal webpage; the methodology for acquiring its data is completely different: while builtwith's methodology is their trade secret, you can read more about how Drupal's data is gathered (and agree or disagree with it), but at least you have a page detailing 12 years so far of reported data, producing the following graph (which can be shared under the CC BY-SA license): Drupal usage statistics by version, 2013-2024. This graph is disaggregated into minor versions, and I don't want to come up with yet another graph for it, but it supports (most of) the narrative I presented above, although I do miss the recent drop builtwith reported in D7's numbers!

And what about Backdrop? During the D8 release cycle, a group of Drupal developers were not happy with the depth of the architectural changes being adopted, particularly the transition to the Symfony PHP component framework, and forked the D7 codebase to create the Backdrop CMS: a modern version of Drupal that doesn't drop the known and tested architecture it had. The Backdrop developers keep working closely with the Drupal community, and although its usage numbers are far smaller than Drupal's, the project seems to be sustainable and lively. Of course, compared to the numbers I presented in the previous section, you can see that Backdrop's numbers in builtwith are way, way lower. I have found it to be a very warm and welcoming community, eager to receive new members. And, thanks to its contributed D2B Migrate module, I found it is quite easy to migrate a live site from Drupal 7 to Backdrop.

Migration by playbook! So... Well, I'm an academic. And (if it's not obvious to you after reading this far), one of the things I must do in my job is to write. So I decided to write an article inviting my colleagues to consider Backdrop for their D7 sites, in Cuadernos Técnicos Universitarios de la DGTIC, a young journal in our university for showcasing technical academic work. And now that my article has been accepted and published, I'm happy to share it with you (if you can read Spanish, of course). But anyway: given that I have several sites to migrate, and that I'm trying to get my colleagues to follow suit, I decided to automate the migration by writing an Ansible playbook to do the heavy lifting. Of course, the playbook's users will probably need to tweak it a bit to their own needs. I'm also far from an Ansible expert, so I'm sure there is ample room for improvement in my style. But it works. Quite well, I must add.

But with this size of database... I did stumble across a big pebble, though. I am working on the migration of one of my users' sites, and found that its database is huge. I checked the mysqldump output, and it got me close to 3GB of data. And given that D2B_migrate is meant to work via a Web interface (my playbook works around that by using a client I wrote with Perl's WWW::Mechanize), I repeatedly stumbled over PHP's maximum POST size, maximum upload size, and maximum memory size. I asked for help on Backdrop's Zulip chat site, and my attention was steered away from fixing PHP towards something more obvious: why is the database so large? So I took a quick look at the database (or rather: my first look was at the database server's filesystem usage). MariaDB stores each table as a separate file on disk, so I looked for the nine largest tables:
# ls -lhS | head
total 3.8G
-rw-rw---- 1 mysql mysql 2.4G Dec 10 12:09 accesslog.ibd
-rw-rw---- 1 mysql mysql 224M Dec  2 16:43 search_index.ibd
-rw-rw---- 1 mysql mysql 220M Dec 10 12:09 watchdog.ibd
-rw-rw---- 1 mysql mysql 148M Dec  6 14:45 cache_field.ibd
-rw-rw---- 1 mysql mysql  92M Dec  9 05:08 aggregator_item.ibd
-rw-rw---- 1 mysql mysql  80M Dec 10 12:15 cache_path.ibd
-rw-rw---- 1 mysql mysql  72M Dec  2 16:39 search_dataset.ibd
-rw-rw---- 1 mysql mysql  68M Dec  2 13:16 field_revision_field_idea_principal_articulo.ibd
-rw-rw---- 1 mysql mysql  60M Dec  9 13:19 cache_menu.ibd
A single table, the access log, is over 2.4GB long. The next several tables are logs, search indexes, and caches; I can perfectly live without their data in our new site! But I don't want to touch the slightest bit of this site until I'm satisfied with the migration process, so I found a way to exclude those tables in a non-destructive way: given that D2B_migrate works with mysqldump output, and given that the dump locks each table before writing out its data and unlocks it once that table is done, I can just do the following:
$ perl -e '$output = 1; while (<>) { $output = 0 if /^LOCK TABLES `(accesslog|search_index|watchdog|cache_field|cache_path)`/; $output = 1 if /^UNLOCK TABLES/; print if $output }' < /tmp/d7_backup.sql > /tmp/d7_backup.eviscerated.sql; ls -hl /tmp/d7_backup.sql /tmp/d7_backup.eviscerated.sql
-rw-rw-r-- 1 gwolf gwolf 216M Dec 10 12:22 /tmp/d7_backup.eviscerated.sql
-rw------- 1 gwolf gwolf 2.1G Dec  6 18:14 /tmp/d7_backup.sql
Five seconds later, I'm done! The database is now a tenth of its size, and D2B_migrate is happy to take it. And I'm a big step closer to ending my reliance on (this bit of) legacy code for my highly visible sites.
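A side note, not part of the original workflow: if you are generating the dump yourself rather than trimming an existing one, mysqldump can skip the bulky tables at dump time with its --ignore-table option (one flag per table). The following sketch assumes the Drupal 7 database is simply named drupal7; the optional second pass with --no-data re-adds the (now empty) table definitions so the schema stays complete:

$ mysqldump drupal7 \
    --ignore-table=drupal7.accesslog \
    --ignore-table=drupal7.search_index \
    --ignore-table=drupal7.watchdog \
    --ignore-table=drupal7.cache_field \
    --ignore-table=drupal7.cache_path > /tmp/d7_backup.trimmed.sql
$ mysqldump --no-data drupal7 \
    accesslog search_index watchdog cache_field cache_path >> /tmp/d7_backup.trimmed.sql

Either way, the result is a dump without the multi-gigabyte tables, which is all D2B_migrate needs to ingest.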

8 December 2024

Russ Allbery: Review: Why Buildings Fall Down

Review: Why Buildings Fall Down, by Matthys Levy & Mario Salvadori
Illustrator: Kevin Woest
Publisher: W.W. Norton
Copyright: 1992
Printing: 1994
ISBN: 0-393-31152-X
Format: Trade paperback
Pages: 314
Why Buildings Fall Down is a non-fiction survey of the causes of structure collapses, along with some related topics. It is a sequel of sorts to Why Buildings Stand Up by Mario Salvadori, which I have not read. Salvadori was, at the time of writing, Professor Emeritus of Architecture at Columbia University (he died in 1997). Levy is an award-winning architectural engineer, and both authors were principals at the structural engineering firm Weidlinger Associates. There is a revised and updated 2002 edition, but this review is of the original 1992 edition. This is one of those reviews that comes with a small snapshot of how my brain works. I got fascinated by the analysis of the collapse of Champlain Towers South in Surfside, Florida in 2021, thanks largely to a random YouTube series on the tiny channel of a structural engineer. Somewhere in there (I don't remember where, possibly from that channel, possibly not) I saw a recommendation for this book and grabbed a used copy in 2022 with the intent of reading it while my interest was piqued. The book arrived, I didn't read it right away, I got distracted by other things, and it migrated to my shelves and sat there until I picked it up on an "I haven't read nonfiction in a while" whim. Two years is a pretty short time frame for a book to sit on my shelf waiting for me to notice it again. The number of books that have been doing that for several decades is, uh, not small. Why Buildings Fall Down is a non-technical survey of structure failures. These are mostly buildings, but also include dams, bridges, and other structures. It's divided into 18 fairly short chapters, and the discussion of each disaster is brisk and to the point. Most of the structures discussed are relatively recent, but the authors talk about the Meidum Pyramid, the Parthenon (in the chapter on intentional destruction by humans), and the Pavia Civic Tower (in the chapter about building death from old age). If you are someone who has already been down the structural failure rabbit hole, you will find chapters on the expected disasters like the Tacoma Narrows Bridge collapse and the Hyatt Regency walkway collapse, but there are a lot of incidents here, including a short but interesting discussion of the Leaning Tower of Pisa in the chapter on problems caused by soil properties. What you're going to get, in other words, is a tour of ways in which structures can fail, which is precisely what was promised by the title. This wasn't quite what I was expecting, but now I'm not sure why I was expecting something different. There is no real unifying theme here; sometimes the failure was an oversight, sometimes it was a bad design, sometimes it was a last-minute change, and sometimes it was something unanticipated. There are a lot of factors involved in structure design and any of them can fail. The closest there is to a common pattern is a lack of redundancy and sufficient safety factors, but that lack of redundancy was generally not deliberate and therefore this is not a guide to preventing a collapse. The result is a book that feels a bit like a grab-bag of structural trivia that is individually interesting but only occasionally memorable. The writing style I suspect will be a matter of taste, but once I got used to it, I rather enjoyed it. In a co-written book, it's hard to separate the voices of the authors, but Salvadori wrote most of the chapter on the law in the first person and he's clearly a character. 
(That chapter is largely the story of two trials he testified in, which, from his account, involved him verbally fencing with lawyers who attempted to claim his degrees from the University of Rome didn't count as real degrees.) If this translates to his speaking style, I suspect he was a popular lecturer at Columbia. The explanations of the structural failures are concise and relatively clear, although even with Kevin Woest's diagrams, it's hard to capture the stresses and movement in a written description. (I've found from watching YouTube videos that animations, or even annotations drawn while someone is talking, help a lot.) The framing discussion, well, sometimes that is bombastic in a way that I found amusing:
But we, children of a different era, do not want our lives to be enclosed, to be shielded from the mystery. We are eager to participate in it, to gather with our brothers and sisters in a community of thought that will lift us above the mundane. We need to be together in sorrow and in joy. Thus we rarely build monolithic monuments. Instead, we build domes.
It helps that passages like this are always short and thus don't wear out their welcome. My favorite line in the whole book is a throwaway sentence in a discussion of building failures due to explosions:
With a similar approach, it can be estimated that the chance of an explosion like that at Forty-fifth Street was at most one in thirty million, and probably much less. But this is why life is dangerous and always ends in death.
Going hard, structural engineering book! It's often appealing to learn about things from their failures because the failures are inherently more dramatic and thus more interesting, but if you were hoping for an introduction to structural engineering, this is probably not the book you want. There is an excellent and surprisingly engaging appendix that covers the basics of structural analysis in 45 pages, but you would probably be better off with Why Buildings Stand Up or another architecture or structural engineering textbook (or maybe a video course). The problem with learning by failure case study is that all the case studies tend to blend together, despite the authors' engaging prose, and nearly every collapse introduces a new structural element with new properties and new failure modes and only the briefest of explanations. This book might make you a slightly more informed consumer of the news, but for most readers I suspect it will be a collection of forgettable trivia told in an occasionally entertaining style. I think the book I wanted to read was something that went deeper into the process of forensic engineering, not just the outcomes. It's interesting to know what the cause of a failure was, but I'm more interested in how one goes about investigating a failure. What is the process, how do you organize the investigation, and how does the legal system around engineering failures work? There are tidbits and asides here, but this book is primarily focused on the structural analysis and elides most of the work done to arrive at those conclusions. That said, I was entertained. Why Buildings Fall Down is a bit dated (the opening chapter on airplanes hitting buildings reads much differently now than when it was written in 1992, and I'm sure it was updated in the 2002 edition), but it succeeds in being clear without being soulless or sounding like a textbook. I appreciate an occasional rant about nuclear weapons in a book about architecture. I'm not sure I really recommend this, but I had a good time with it. Also, I'm now looking for opportunities to say "this is why life is dangerous and always ends in death," so there is that. Rating: 6 out of 10

1 December 2024

Russ Allbery: Review: Unexploded Remnants

Review: Unexploded Remnants, by Elaine Gallagher
Publisher: Tordotcom
Copyright: 2024
ISBN: 1-250-32522-6
Format: Kindle
Pages: 111
Unexploded Remnants is a science fiction adventure novella. The protagonist and world background would support an episodic series, but as of this writing it stands alone. It is Elaine Gallagher's first professional publication. Alice is the last survivor of Earth: an explorer, information trader, and occasional associate of the Archive. She scouts interesting places, looks for inconsistencies in the stories the galactic civilizations tell themselves, and pokes around ruins for treasure. As this story opens, she finds a supposedly broken computer core in the Alta Sidoie bazaar that is definitely not what the trader thinks it is. Very shortly thereafter, she's being hunted by a clan of dangerous Delosi while trying to decide what to do with a possibly malevolent AI with frightening intrusion abilities. This is one of those stories where all the individual pieces sounded great, but the way they were assembled didn't click for me. Unusually, I'm not entirely sure why. Often it's the characters, but I liked Alice well enough. The Lewis Carroll allusions were there but not overdone, her computer agent Bugs is a little too much of a Warner Brothers cartoon but still interesting, and the world building has plenty of interesting hooks. I certainly can't complain about the pacing: the plot moves briskly along to a somewhat predictable but still adequate conclusion. The writing is smooth and competent, and the world is memorable enough that I'm still thinking about it. And yet, I never connected with this story. I think it may be because both Alice and the tight third-person narrator tend towards breezy confidence and matter-of-fact descriptions. Alice does, at times, get scared or angry, but I never felt those emotions. They were just events that were described to me. There wasn't an emotional hook, a place where the character grabbed me, and so it felt like everything was happening at an odd remove. The advantage of this approach is that there are no overwrought emotional meltdowns or brooding angstful protagonists, just an adventure story about a competent and thoughtful character, but I think I wanted a bit more emotional involvement than I got. The world background is the best part and feels like it could be part of a larger series. The Milky Way is connected by an old, vast, and only partly understood network of teleportation portals, which had cut off Earth for unknown reasons and then just as mysteriously reactivated when Alice, then Andrew, drunkenly poked at a standing stone while muttering an old prayer in Gaelic. The Archive spent a year sorting out her intellectual diseases (capitalism was particularly alarming) and giving her a fresh start with a new body. Humanity subsequently destroyed itself in a paroxysm of reactionary violence, leaving Alice a free agent, one of a kind in a galaxy of dizzying variety and forgotten history. Gallagher makes great use of the weirdness of the portal network to create a Star Wars style of universe: the focus is more on the diversity of the planets and alien species than on a coherent unifying structure. The settings of this book are not prone to Planet of the Hats problems. They instead have the contrasts that one would get if one dropped portals near current or former Earth population centers and then took a random walk through them (or, in other words, what playing GeoGuessr on a world map feels like). I liked this effect, but I have to admit that it also added to that sense of sliding off the surface of the story. 
The place descriptions were great bits of atmosphere, but I never cared about them. There isn't enough emotional coherence to make them memorable. One of the more notable quirks of this story is the description of ideologies and prejudices as viral memes that can be cataloged, cured, and deployed like weapons. This is a theme of the world-building as well: this society, or at least the Archive-affiliated parts of it, classifies some patterns of thought as potentially dangerous but treatable contagious diseases. I'm not going to object too much to this as a bit of background and characterization in a fairly short novella stuffed with a lot of other world-building and plot, but there was something about treating ethical systems like diseases that bugged me in much the same way that medicalization of neurodiversity bugs me. I think some people will find that sense of moral clarity relaxing and others will find it vaguely irritating, and I seem to have ended up in the second group. Overall, I would classify this as an interesting not-quite-success. It felt like a side story in a larger universe, like a story that would work better if I already knew Alice from other novels and had an established emotional connection with her. As is, I would not really recommend it, but there are enough good pieces here that I would be interested to see what Gallagher does next. Rating: 6 out of 10

29 November 2024

Russ Allbery: Review: The Duke Who Didn't

Review: The Duke Who Didn't, by Courtney Milan
Series: Wedgeford Trials #1
Publisher: Femtopress
Copyright: September 2020
ASIN: B08G4QC3JC
Format: Kindle
Pages: 334
The Duke Who Didn't is a Victorian romance novel, the first of a loosely-connected trilogy in the romance sense of switching protagonists between books. It's self-published, but by Courtney Milan, so the quality of the editing and publishing is about as high as you will see for a self-published novel. Chloe Fong has a goal: to make her father's sauce the success that it should be. His previous version of the recipe was stolen by White and Whistler and is now wildly popular as Pure English Sauce. His current version is much better. In a few days, tourists will come from all over England to the annual festival of the Wedgeford Trials, and this will be Chloe's opportunity to give the sauce a proper debut and marketing push. There is only the small matter of making enough sauce and coming up with a good name. Chloe is very busy and absolutely does not have time for nonsense. Particularly nonsense in the form of Jeremy Yu. Jeremy started coming to the Wedgeford Trials at the age of twelve. He was obviously from money and society, obviously enough that the villagers gave him the nickname Posh Jim after his participation in the central game of the trials. Exactly how wealthy and exactly which society, however, is something that he never quite explained, at first because he was having too much fun and then because he felt he'd waited too long. The village of Wedgeford was thriving under the benevolent neglect of its absent duke and uncollected taxes, and no one who loved it had any desire for that to change. Including Jeremy, the absent duke in question. Jeremy had been in love with Chloe for years, but the last time he came to the Trials, Chloe told him to stop pursuing her unless he could be serious. That was three years and three Trials ago, and Chloe was certain Jeremy had made his choice by his absence. But Jeremy never forgot her, and despite his utter failure to become a more serious person, he is determined to convince her that he is serious about her. And also determined to finally reveal his identity without breaking everything he loves about the village. Somehow. I have mentioned in other reviews that I mostly read sapphic instead of heterosexual romance because the gender roles in heterosexual romance are much more likely to irritate me. It occurred to me that I was probably being unfair to the heterosexual romance genre, I hadn't read nearly widely enough to draw any real conclusions, and I needed to find better examples. I've followed Courtney Milan occasionally on social media (for reasons unrelated to her novels) for long enough to know that she was unlikely to go for gender essentialism, and I'd been meaning to try one of her books for a while. Hence this novel. It is indeed not gender-essentialist. Neither Chloe nor Jeremy fit into obvious gender boxes. Chloe is the motivating force in the novel and many of their interactions were utterly charming. But, despite that, the gender roles still annoyed me in ways that are entirely not the fault of this book. I'm not sure I can even put a finger on something specific. It's a low-grade, pervasive feeling that men do one type of thing and women do a different type of thing, and even if these characters don't stick to that closely, it saturates the vibes. (Admittedly, a Victorian romance was probably not the best choice when I knew this was my biggest problem with genre heterosexual romance. It was just what I had on hand.) 
The conceit of the Wedgeford Trials series is that the small village of Wedgeford in England, through historical accident, ended up with an unusually large number of residents with Chinese ancestry. This is what I would call a "believable outlier": there was not such a village so far as I know, but there could well have been. At the least, there were way more people with non-English ancestry, including east Asian ancestry, in Victorian England than modern readers might think. There is quite a lot in this novel about family history, cultural traditions, immigration, and colonialism that I'm wholly unqualified to comment on but that was fascinating to read about and seemed (as one would expect from Milan) adroitly written. As for the rest of the story, The Duke Who Didn't is absolutely full of banter. If your idea of a good time with a romance novel is teasing, word play, mock irritation, and endless verbal fencing as a way to avoid directly confronting difficult topics, you will be in heaven. Jeremy is one of those people who is way too much in his own head and has turned his problems into a giant ball of anxiety, but who is good at being the class clown, and therefore leans heavily on banter and making people laugh (or blush) as a way of avoiding whatever he's anxious about. I thought the characterization was quite good, but I admit I still got a bit tired of it. 350 pages is a lot of banter, particularly when the characters have some serious communication problems they need to resolve, and to fully enjoy this book you have to have a lot of patience for Jeremy's near-pathological inability to be forthright with Chloe. Chloe's most charming characteristic is that she makes lists, particularly to-do lists. Her ideal days proceed as an orderly process of crossing things off of lists, and her way to approach any problem is to make a list. This is a great hook, and extremely relatable, but if you're going to talk this much about her lists, I want to see the lists! Chloe is all about details; show me the details! This book does not contain anywhere close to enough of Chloe's lists. I'm not sure there was a single list in this book that the reader both got to see the details of and that made it to more than three items. I think Chloe would agree that it's pointless to talk about the concept of lists; one needs to commit oneself to making an actual list. This book I would unquestioningly classify as romantic comedy (which given my utter lack of familiarity with romance subgenres probably means that it isn't). Jeremy's standard interaction style with anyone is self-deprecating humor, and Chloe is the sort of character who is extremely serious in ways that strike other people as funny. Towards the end of the book, there is a hilarious self-aware subversion of a major romance novel trope that even I caught, despite my general lack of familiarity with the genre. The eventual resolution of Jeremy's problem of hidden identity caught me by surprise in that way where I should have seen it all along, and was both beautifully handled and quite entertaining. All the pieces are here for a great time, and I think a lot of people would love this book. Somehow, it still wasn't quite my thing; I thoroughly enjoyed parts of it, but I don't find myself eager to read another. I'm kind of annoyed at myself that it didn't pull me in, since if I'd liked this I know where to find lots more like it. But ah well. 
If you like banter-heavy heterosexual romance that is very self-aware about its genre without devolving into metafiction, this is at least worth a try. Followed in the romance series way by The Marquis Who Mustn't, but this is a complete story with a satisfying ending. Rating: 7 out of 10

20 November 2024

Russell Coker: Solving Spam and Phishing for Corporations

Centralisation and Corporations An advantage of a medium to large company is that it permits specialisation. For example, I'm currently working in the IT department of a medium sized company, and because we have standardised hardware (Dell Latitude and Precision laptops, Dell Precision Tower workstations, and Dell PowerEdge servers) and I am involved in fixing all Linux compatibility issues on that hardware, I can fix most problems in a small fraction of the time it would take on a random computer. There is scope for a lot of debate about the extent to which companies should standardise and centralise things. But for computer problems, which can escalate quickly from minor to serious if not approached in the correct manner, it's clear that a good deal of centralisation is appropriate. Among people doing technical computer work such as programming, a large portion are computer hobbyists who like to fiddle with computers. But if the support system is run well, even they will appreciate having computers just work most of the time and, for a large portion of the failures, having someone immediately recognise the problem, like the issues with NVidia drivers that I have documented so that first line support can implement workarounds without the need for a lengthy investigation. A big problem with email on the modern Internet is the prevalence of Phishing scams. The current corporate approach to this is to send out test Phishing email to people and then force computer security training on everyone who clicks on it. One problem with this is that attackers only need to fool one person on one occasion, and when you have hundreds of people doing something on rare occasions that's not part of their core work, they will periodically get it wrong. When every test Phishing run finds several people who need extra training, it seems obvious to me that this isn't a solution that's working well. I will concede that the majority of people who click on the test Phishing email would probably realise their mistake if asked to enter the password for the corporate email system, but I think it's still clear that this isn't a great solution. Let's imagine, for the sake of discussion, that everyone in a company was 100% accurate at identifying Phishing email and other scam email. If that were the case, would the problem be solved? I believe that even in that hypothetical case it would not be a solved problem, due to the wasted time and concentration. People can spend minutes determining if a single email is legitimate. On many occasions I have had relatives and clients forward me email because they are unsure if it's valid; it's great that they seek expert advice when they are unsure about things, but it would be better if they didn't have to go to that effort. What we ideally want to do is centralise the anti-Phishing and anti-spam work to a small group of people who are actually good at it and who can recognise patterns by seeing larger quantities of spam. When a spam or Phishing message is sent to 600 people in a company you don't want 600 people to individually consider it; you want one person to recognise it and delete/block all 600. If 600 people each spend one minute considering the matter then that's 10 work hours wasted!

The Rationale for Human Filtering For personal email, human filtering usually isn't viable because people want privacy.
But corporate email isn't private: it's expected that the company can read it under certain circumstances (in most jurisdictions), and having email open in public areas of the office where colleagues might see it is expected. You can visit gmail.com on your lunch break to read personal email, but every company policy (and common sense) says not to have actually private correspondence on company systems. The amount of time spent by reception staff in sorting out such email would be less than that taken by individuals. When someone sends a spam to everyone in the company, instead of 500 people each spending a couple of minutes working out whether it's legit, you have one person who's good at recognising spam (because it's their job) who clicks on a "remove mail from this sender from all mailboxes" button, and 500 messages are deleted and the sender is blocked. Delaying email would be a concern. It's standard practice for CEOs (and C*Os at larger companies) to have a PA receive their email and forward the messages that need their attention. So human vetting of email can work without unreasonable delays. If we had someone checking all email for the entire company, email to the senior people would probably never get noticeably delayed, and while people like me would get their mail delayed on occasion, people doing technical work generally don't have notifications turned on for email because it's a distraction and a fast response isn't needed. There are a few senders where a fast response is required, which is mostly corporations sending a "click this link within 10 minutes to confirm your password change" email. Setting up rules for all such senders that are relevant to work wouldn't be difficult to do.

How to Solve This Spam and Phishing became serious problems over 20 years ago, and we have had 20 years of evolution of email filtering which still hasn't solved the problem. The vast majority of email addresses in use are run by major managed service providers, and they haven't managed to filter out spam/phishing mail effectively, so I think we should assume that it's not going to be solved by filtering. There is talk about what AI technology might do for filtering spam/phishing, but that same technology can produce better crafted hostile email to avoid filters. An additional complication for corporate email filtering is that some criteria that are used to filter personal email don't apply to corporate mail. If someone sends email to me personally about millions of dollars then it's obviously not legit. If someone sends email to a company then it could be legit. Companies routinely have people emailing potential clients about how their products can save millions of dollars, and companies routinely make purchases over a million dollars. This is not a problem that's impossible to solve; it's just an extra difficulty that reduces the efficiency of filters. It seems to me that the best solution to the problem involves having all mail filtered by a human. A company could configure their mail server to not accept direct external mail for any employee's address. Then people could email files to colleagues etc without any restriction, but spam and phishing wouldn't be a problem. The issue is how to manage inbound mail. One possibility is to have addresses of the form it+russell.coker@example.com (for me as an employee in the IT department), and you would have a team of people who would read those mailboxes and forward mail to the right people if it seemed legit.
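For what it's worth, the it+russell.coker style of address is plain sub-addressing ("plus addressing"), which most mail servers already support. None of this is prescribed in the post, but as an illustration of how little server-side machinery the scheme needs: on a Postfix system it comes down to one setting, after which mail for it+anything is delivered to the it account with the part after the "+" available for folder routing and filtering:

# postconf -e 'recipient_delimiter = +'
# postfix reload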
Having addresses like it+russell.coker means that all mail to the IT department would be received into folders of the same account and could be filtered by someone with a suitable security level, without requiring any special configuration of the mail server. So the person who reads the it mailbox would have a folder named russell.coker receiving mail addressed to me. The system could be configured to automate the processing of mail from known good addresses (and even domains), so they could just put in a rule saying that when Dell sends DMARC authenticated mail to it+$USER it gets immediately directed to $USER. This is the sort of thing that can be automated in the email client (mail filtering is becoming a common feature in MUAs). For a FOSS implementation of such things, the server side (including extracting account data from a directory to determine which department a user is in) would be about a day's work, and then an option would be to modify a webmail program to have extra functionality for approving senders and sending change requests to the server to automatically direct future mail from the same sender. As an aside, I have previously worked on a project that had a modified version of the Horde webmail system to do this sort of thing for challenge-response email and to add certain automated messages to the allow-list.

The Change

One of the first things to do is to configure the system to add every recipient of an outbound message to the allow list for receiving a reply. Having a script go through the sent-mail folders of all accounts and add the recipients to the allow lists would be easy and would catch the common cases (a rough sketch of such a script is included below). But even with the sent-mail folders processed, going from a working system without such measures to a system like this will take some time for the initial work of adding addresses to the allow lists, particularly for domain-wide additions of all the sites that send password confirmation messages. You would need rules to direct inbound mail from the old addresses to the new style, and then a huge amount of mail would need to be categorised. If you have 600 employees and the average amount of time taken on the first day is 10 minutes per user, then that's 100 hours of work, about 12 work days. If you had everyone from the IT department, reception, and the executive assistants working on it, that would be viable. After about a week there wouldn't be much work involved in maintaining it, and after that it would be a net win for the company.

The Benefits

If the average employee spends one minute a day dealing with spam and phishing email, then with 600 employees that's 10 hours of wasted time per day, effectively wasting one employee's work! I'm sure that's the low end of the range; 5 minutes average per day doesn't seem unreasonable, especially when people are unsure about phishing email and send it to Slack so that multiple employees spend time analysing it. So you could have the equivalent of 5 employees' time being wasted by hostile email, while avoiding that would take a fraction of the time of a few people, adding up to less than an hour of total work per day. Then there's the training time for phishing mail: instead of having every employee spend half an hour doing email security training every few months (that's 300 hours, or 7.5 working weeks, every time you do it), you just train the few experts. In addition to saving time there are significant security benefits to having experts deal with possibly hostile email. Someone who deals with a lot of phishing email is much less likely to be tricked.
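Returning to the allow-list bootstrap mentioned under The Change, the sent-mail scan could be a fairly small script. The sketch below is Python and makes several assumptions purely for illustration (one Maildir per account under /var/mail, a .Sent folder, and plain-text allow-list files under /etc/mail/allowlists); a real deployment would have to match whatever mail store and filtering system is actually in use.

    #!/usr/bin/env python3
    # Rough sketch: harvest recipient addresses from every account's sent-mail
    # folder and write them to per-user allow lists.
    import mailbox
    import os
    from email.utils import getaddresses

    MAIL_ROOT = "/var/mail"             # assumed: one Maildir per account
    SENT_FOLDER = ".Sent"               # assumed sent-mail folder name
    ALLOW_DIR = "/etc/mail/allowlists"  # assumed destination for allow lists

    def harvest(account: str) -> set[str]:
        """Collect every To/Cc/Bcc address from one account's sent mail."""
        sent_path = os.path.join(MAIL_ROOT, account, SENT_FOLDER)
        addresses: set[str] = set()
        if not os.path.isdir(sent_path):
            return addresses
        for msg in mailbox.Maildir(sent_path, create=False):
            fields = msg.get_all("To", []) + msg.get_all("Cc", []) + msg.get_all("Bcc", [])
            for _name, addr in getaddresses(fields):
                if "@" in addr:
                    addresses.add(addr.lower())
        return addresses

    def main() -> None:
        os.makedirs(ALLOW_DIR, exist_ok=True)
        for account in os.listdir(MAIL_ROOT):
            addrs = harvest(account)
            if addrs:
                with open(os.path.join(ALLOW_DIR, account), "w") as out:
                    out.write("\n".join(sorted(addrs)) + "\n")

    if __name__ == "__main__":
        main()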
Will They Do It?

They probably won't do it any time soon. I don't think it's expensive enough for companies yet. Maybe government agencies already have equivalent measures in place, but for regular corporations it's probably regarded as too difficult to change anything and the costs aren't obvious. For 30 years I have been unsuccessful in suggesting that managers spend slightly more on computer hardware to save significant amounts of worker time.

