Search Results: "stw"

12 July 2023

Reproducible Builds: Reproducible Builds in June 2023

Welcome to the June 2023 report from the Reproducible Builds project In our reports, we outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.

We are very happy to announce the upcoming Reproducible Builds Summit which set to take place from October 31st November 2nd 2023, in the vibrant city of Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. If you re interesting in joining us this year, please make sure to read the event page] which has more details about the event and location. (You may also be interested in attending PackagingCon 2023 held a few days before in Berlin.)
This month, Vagrant Cascadian will present at FOSSY 2023 on the topic of Breaking the Chains of Trusting Trust:
Corrupted build environments can deliver compromised cryptographically signed binaries. Several exploits in critical supply chains have been demonstrated in recent years, proving that this is not just theoretical. The most well secured build environments are still single points of failure when they fail. [ ] This talk will focus on the state of the art from several angles in related Free and Open Source Software projects, what works, current challenges and future plans for building trustworthy toolchains you do not need to trust.
Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, FOSSY aims to be a community-focused event: Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you . More information on the event is available on the FOSSY 2023 website, including the full programme schedule.
Marcel Fourn , Dominik Wermke, William Enck, Sascha Fahl and Yasemin Acar recently published an academic paper in the 44th IEEE Symposium on Security and Privacy titled It s like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security . The abstract reads as follows:
The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created.
However, in contrast to other papers that touch on some theoretical aspect of reproducible builds, the authors paper takes a different approach. Starting with the observation that much of the software industry believes R-Bs are too far out of reach for most projects and conjoining that with a goal of to help identify a path for R-Bs to become a commonplace property , the paper has a different methodology:
We conducted a series of 24 semi-structured expert interviews with participants from the project, and iterated on our questions with the reproducible builds community. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We identify experiences that help and hinder adoption, which heavily include communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.
A PDF of the paper is now available, as is an entry on the CISPA Helmholtz Center for Information Security website and an entry under the TeamUSEC Human-Centered Security research group.
On our mailing list this month:
The antagonist is David Schwartz, who correctly says There are dozens of complex reasons why what seems to be the same sequence of operations might produce different end results, but goes on to say I totally disagree with your general viewpoint that compilers must provide for reproducability [sic]. Dwight Tovey and I (Larry Doolittle) argue for reproducible builds. I assert Any program especially a mission-critical program like a compiler that cannot reproduce a result at will is broken. Also it s commonplace to take a binary from the net, and check to see if it was trojaned by attempting to recreate it from source.

Lastly, there were a few changes to our website this month too, including Bernhard M. Wiedemann adding a simplified Rust example to our documentation about the SOURCE_DATE_EPOCH environment variable [ ], Chris Lamb made it easier to parse our summit announcement at a glance [ ], Mattia Rizzolo added the summit announcement at a glance [ ] itself [ ][ ][ ] and Rahul Bajaj added a taxonomy of variations in build environments [ ].

Distribution work 27 reviews of Debian packages were added, 40 were updated and 8 were removed this month adding to our knowledge about identified issues. A new randomness_in_documentation_generated_by_mkdocs toolchain issue was added by Chris Lamb [ ], and the deterministic flag on the paths_vary_due_to_usrmerge issue as we are not currently testing usrmerge issues [ ] issues.
Roland Clobus posted his 18th update of the status of reproducible Debian ISO images on our mailing list. Roland reported that all major desktops build reproducibly with bullseye, bookworm, trixie and sid , but he also mentioned amongst many changes that not only are the non-free images being built (and are reproducible) but that the live images are generated officially by Debian itself. [ ]
Jan-Benedict Glaw noticed a problem when building NetBSD for the VAX architecture. Noting that Reproducible builds [are] probably not as reproducible as we thought , Jan-Benedict goes on to describe that when two builds from different source directories won t produce the same result and adds various notes about sub-optimal handling of the CFLAGS environment variable. [ ]
F-Droid added 21 new reproducible apps in June, resulting in a new record of 145 reproducible apps in total. [ ]. (This page now sports missing data for March May 2023.) F-Droid contributors also reported an issue with broken resources in APKs making some builds unreproducible. [ ]
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE

Upstream patches

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:
  • Additions to a (relatively) new Documented Jenkins Maintenance (djm) script to automatically shrink a cache & save a backup of old data [ ], automatically split out previous months data from logfiles into specially-named files [ ], prevent concurrent remote logfile fetches by using a lock file [ ] and to add/remove various debugging statements [ ].
  • Updates to the automated system health checks to, for example, to correctly detect new kernel warnings due to a wording change [ ] and to explicitly observe which old/unused kernels should be removed [ ]. This was related to an improvement so that various kernel issues on Ubuntu-based nodes are automatically fixed. [ ]
Holger and Vagrant Cascadian updated all thirty-five hosts running Debian on the amd64, armhf, and i386 architectures to Debian bookworm, with the exception of the Jenkins host itself which will be upgraded after the release of Debian 12.1. In addition, Mattia Rizzolo updated the email configuration for the domain to correctly accept incoming mails from [ ] as well as to set up DomainKeys Identified Mail (DKIM) signing [ ]. And working together with Holger, Mattia also updated the Jenkins configuration to start testing Debian trixie which resulted in stopped testing Debian buster. And, finally, Jan-Benedict Glaw contributed patches for improved NetBSD testing.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

11 July 2023

Matthew Garrett: Roots of Trust are difficult

The phrase "Root of Trust" turns up at various points in discussions about verified boot and measured boot, and to a first approximation nobody is able to give you a coherent explanation of what it means[1]. The Trusted Computing Group has a fairly wordy definition, but (a) it's a lot of words and (b) I don't like it, so instead I'm going to start by defining a root of trust as "A thing that has to be trustworthy for anything else on your computer to be trustworthy".

(An aside: when I say "trustworthy", it is very easy to interpret this in a cynical manner and assume that "trust" means "trusted by someone I do not necessarily trust to act in my best interest". I want to be absolutely clear that when I say "trustworthy" I mean "trusted by the owner of the computer", and that as far as I'm concerned selling devices that do not allow the owner to define what's trusted is an extremely bad thing in the general case)

Let's take an example. In verified boot, a cryptographic signature of a component is verified before it's allowed to boot. A straightforward implementation of a verified boot implementation has the firmware verify the signature on the bootloader or kernel before executing it. In this scenario, the firmware is the root of trust - it's the first thing that makes a determination about whether something should be allowed to run or not[2]. As long as the firmware behaves correctly, and as long as there aren't any vulnerabilities in our boot chain, we know that we booted an OS that was signed with a key we trust.

But what guarantees that the firmware behaves correctly? What if someone replaces our firmware with firmware that trusts different keys, or hot-patches the OS as it's booting it? We can't just ask the firmware whether it's trustworthy - trustworthy firmware will say yes, but the thing about malicious firmware is that it can just lie to us (either directly, or by modifying the OS components it boots to lie instead). This is probably not sufficiently trustworthy!

Ok, so let's have the firmware be verified before it's executed. On Intel this is "Boot Guard", on AMD this is "Platform Secure Boot", everywhere else it's just "Secure Boot". Code on the CPU (either in ROM or signed with a key controlled by the CPU vendor) verifies the firmware[3] before executing it. Now the CPU itself is the root of trust, and, well, that seems reasonable - we have to place trust in the CPU, otherwise we can't actually do computing. We can now say with a reasonable degree of confidence (again, in the absence of vulnerabilities) that we booted an OS that we trusted. Hurrah!

Except. How do we know that the CPU actually did that verification? CPUs are generally manufactured without verification being enabled - different system vendors use different signing keys, so those keys can't be installed in the CPU at CPU manufacture time, and vendors need to do code development without signing everything so you can't require that keys be installed before a CPU will work. So, out of the box, a new CPU will boot anything without doing verification[4], and development units will frequently have no verification.

As a device owner, how do you tell whether or not your CPU has this verification enabled? Well, you could ask the CPU, but if you're doing that on a device that booted a compromised OS then maybe it's just hotpatching your OS so when you do that you just get RET_TRUST_ME_BRO even if the CPU is desperately waving its arms around trying to warn you it's a trap. This is, unfortunately, a problem that's basically impossible to solve using verified boot alone - if any component in the chain fails to enforce verification, the trust you're placing in the chain is misplaced and you are going to have a bad day.

So how do we solve it? The answer is that we can't simply ask the OS, we need a mechanism to query the root of trust itself. There's a few ways to do that, but fundamentally they depend on the ability of the root of trust to provide proof of what happened. This requires that the root of trust be able to sign (or cause to be signed) an "attestation" of the system state, a cryptographically verifiable representation of the security-critical configuration and code. The most common form of this is called "measured boot" or "trusted boot", and involves generating a "measurement" of each boot component or configuration (generally a cryptographic hash of it), and storing that measurement somewhere. The important thing is that it must not be possible for the running OS (or any pre-OS component) to arbitrarily modify these measurements, since otherwise a compromised environment could simply go back and rewrite history. One frequently used solution to this is to segregate the storage of the measurements (and the attestation of them) into a separate hardware component that can't be directly manipulated by the OS, such as a Trusted Platform Module. Each part of the boot chain measures relevant security configuration and the next component before executing it and sends that measurement to the TPM, and later the TPM can provide a signed attestation of the measurements it was given. So, an SoC that implements verified boot should create a measurement telling us whether verification is enabled - and, critically, should also create a measurement if it isn't. This is important because failing to measure the disabled state leaves us with the same problem as before; someone can replace the mutable firmware code with code that creates a fake measurement asserting that verified boot was enabled, and if we trust that we're going to have a bad time.

(Of course, simply measuring the fact that verified boot was enabled isn't enough - what if someone replaces the CPU with one that has verified boot enabled, but trusts keys under their control? We also need to measure the keys that were used in order to ensure that the device trusted only the keys we expected, otherwise again we're going to have a bad time)

So, an effective root of trust needs to:

1) Create a measurement of its verified boot policy before running any mutable code
2) Include the trusted signing key in that measurement
3) Actually perform that verification before executing any mutable code

and from then on we're in the hands of the verified code actually being trustworthy, and it's probably written in C so that's almost certainly false, but let's not try to solve every problem today.

Does anything do this today? As far as I can tell, Intel's Boot Guard implementation does. Based on publicly available documentation I can't find any evidence that AMD's Platform Secure Boot does (it does the verification, but it doesn't measure the policy beforehand, so it seems spoofable), but I could be wrong there. I haven't found any general purpose non-x86 parts that do, but this is in the realm of things that SoC vendors seem to believe is some sort of value-add that can only be documented under NDAs, so please do prove me wrong. And then there are add-on solutions like Titan, where we delegate the initial measurement and validation to a separate piece of hardware that measures the firmware as the CPU reads it, rather than requiring that the CPU do it.

But, overall, the situation isn't great. On many platforms there's simply no way to prove that you booted the code you expected to boot. People have designed elaborate security implementations that can be bypassed in a number of ways.

[1] In this respect it is extremely similar to "Zero Trust"
[2] This is a bit of an oversimplification - once we get into dynamic roots of trust like Intel's TXT this story gets more complicated, but let's stick to the simple case today
[3] I'm kind of using "firmware" in an x86ish manner here, so for embedded devices just think of "firmware" as "the first code executed out of flash and signed by someone other than the SoC vendor"
[4] In the Intel case this isn't strictly true, since the keys are stored in the motherboard chipset rather than the CPU, and so taking a board with Boot Guard enabled and swapping out the CPU won't disable Boot Guard because the CPU reads the configuration from the chipset. But many mobile Intel parts have the chipset in the same package as the CPU, so in theory swapping out that entire package would disable Boot Guard. I am not good enough at soldering to demonstrate that.

comment count unavailable comments

2 June 2023

Matt Brown: Calling time on DNSSEC: The costs exceed the benefits

I m calling time on DNSSEC. Last week, prompted by a change in my DNS hosting setup, I began removing it from the few personal zones I had signed. Then this Monday the .nz ccTLD experienced a multi-day availability incident triggered by the annual DNSSEC key rotation process. This incident broke several of my unsigned zones, which led me to say very unkind things about DNSSEC on Mastodon and now I feel compelled to more completely explain my thinking: For almost all domains and use-cases, the costs and risks of deploying DNSSEC outweigh the benefits it provides. Don t bother signing your zones. The .nz incident, while topical, is not the motivation or the trigger for this conclusion. Had it been a novel incident, it would still have been annoying, but novel incidents are how we learn so I have a small tolerance for them. The problem with DNSSEC is precisely that this incident was not novel, just the latest in a long and growing list. It s a clear pattern. DNSSEC is complex and risky to deploy. Choosing to sign your zone will almost inevitably mean that you will experience lower availability for your domain over time than if you leave it unsigned. Even if you have a team of DNS experts maintaining your zone and DNS infrastructure, the risk of routine operational tasks triggering a loss of availability (unrelated to any attempted attacks that DNSSEC may thwart) is very high - almost guaranteed to occur. Worse, because of the nature of DNS and DNSSEC these incidents will tend to be prolonged and out of your control to remediate in a timely fashion. The only benefit you get in return for accepting this almost certain reduction in availability is trust in the integrity of the DNS data a subset of your users (those who validate DNSSEC) receive. Trusted DNS data that is then used to communicate across an untrusted network layer. An untrusted network layer which you are almost certainly protecting with TLS which provides a more comprehensive and trustworthy set of security guarantees than DNSSEC is capable of, and provides those guarantees to all your users regardless of whether they are validating DNSSEC or not. In summary, in our modern world where TLS is ubiquitous, DNSSEC provides only a thin layer of redundant protection on top of the comprehensive guarantees provided by TLS, but adds significant operational complexity, cost and a high likelihood of lowered availability. In an ideal world, where the deployment cost of DNSSEC and the risk of DNSSEC-induced outages were both low, it would absolutely be desirable to have that redundancy in our layers of protection. In the real world, given the DNSSEC protocol we have today, the choice to avoid its complexity and rely on TLS alone is not at all painful or risky to make as the operator of an online service. In fact, it s the prudent choice that will result in better overall security outcomes for your users. Ignore DNSSEC and invest the time and resources you would have spent deploying it improving your TLS key and certificate management. Ironically, the one use-case where I think a valid counter-argument for this position can be made is TLDs (including ccTLDs such as .nz). Despite its many failings, DNSSEC is an Internet Standard, and as infrastructure providers, TLDs have an obligation to enable its use. Unfortunately this means that everyone has to bear the costs, complexities and availability risks that DNSSEC burdens these operators with. We can t avoid that fact, but we can avoid creating further costs, complexities and risks by choosing not to deploy DNSSEC on the rest of our non-TLD zones.

But DNSSEC will save us from the evil CA ecosystem! Historically, the strongest motivation for DNSSEC has not been the direct security benefits themselves (which as explained above are minimal compared to what TLS provides), but in the new capabilities and use-cases that could be enabled if DNS were able to provide integrity and trusted data to applications. Specifically, the promise of DNS-based Authentication of Named Entities (DANE) is that with DNSSEC we can be free of the X.509 certificate authority ecosystem and along with it the expensive certificate issuance racket and dubious trust properties that have long been its most distinguishing features. Ten years ago this was an extremely compelling proposition with significant potential to improve the Internet. That potential has gone unfulfilled. Instead of maturing as deployments progressed and associated operational experience was gained, DNSSEC has been beset by the discovery of issue after issue. Each of these has necessitated further changes and additions to the protocol, increasing complexity and deployment cost. For many zones, including significant zones like (where I led the attempt to evaluate and deploy DNSSEC in the mid 2010s), it is simply infeasible to deploy the protocol at all, let alone in a reliable and dependable manner. While DNSSEC maturation and deployment has been languishing, the TLS ecosystem has been steadily and impressively improving. Thanks to the efforts of many individuals and companies, although still founded on the use of a set of root certificate authorities, the TLS and CA ecosystem today features transparency, validation and multi-party accountability that comprehensively build trust in the ability to depend and rely upon the security guarantees that TLS provides. When you use TLS today, you benefit from:
  • Free/cheap issuance from a number of different certificate authorities.
  • Regular, automated issuance/renewal via the ACME protocol.
  • Visibility into who has issued certificates for your domain and when through Certificate Transparency logs.
  • Confidence that certificates issued without certificate transparency (and therefore lacking an SCT) will not be accepted by the leading modern browsers.
  • The use of modern cryptographic protocols as a baseline, with a plausible and compelling story for how these can be steadily and promptly updated over time.
DNSSEC with DANE can match the TLS ecosystem on the first benefit (up front price) and perhaps makes the second benefit moot, but has no ability to match any of the other transparency and accountability measures that today s TLS ecosystem offers. If your ZSK is stolen, or a parent zone is compromised or coerced, validly signed TLSA records for a forged certificate can be produced and spoofed to users under attack with minimal chances of detection. Finally, in terms of overall trust in the roots of the system, the CA/Browser forum requirements continue to improve the accountability and transparency of TLS certificate authorities, significantly reducing the ability for any single actor (say a nefarious government) to subvert the system. The DNS root has a well established transparent multi-party system for establishing trust in the DNSSEC root itself, but at the TLD level, almost intentionally thanks to the hierarchical nature of DNS, DNSSEC has multiple single points of control (or coercion) which exist outside of any formal system of transparency or accountability. We ve moved from DANE being a potential improvement in security over TLS when it was first proposed, to being a definite regression from what TLS provides today. That s not to say that TLS is perfect, but given where we re at, we ll get a better security return from further investment and improvements in the TLS ecosystem than we will from trying to fix DNSSEC.

But TLS is not ubiquitous for non-HTTP applications The arguments above are most compelling when applied to the web-based HTTP-oriented ecosystem which has driven most of the TLS improvements we ve seen to date. Non-HTTP protocols are lagging in adoption of many of the improvements and best practices TLS has on the web. Some claim this need to provide a solution for non-HTTP, non-web applications provides a motivation to continue pushing DNSSEC deployment. I disagree, I think it provides a motivation to instead double-down on moving those applications to TLS. TLS as the new TCP. The problem is that costs of deploying and operating DNSSEC are largely fixed regardless of how many protocols you are intending to protect with it, and worse, the negative side-effects of DNSSEC deployment can and will easily spill over to affect zones and protocols that don t want or need DNSSEC s protection. To justify continued DNSSEC deployment and operation in this context means using a smaller set of benefits (just for the non-HTTP applications) to justify the already high costs of deploying DNSSEC itself, plus the cost of the risk that DNSSEC poses to the reliability to your websites. I don t see how that equation can ever balance, particularly when you evaluate it against the much lower costs of just turning on TLS for the rest of your non-HTTP protocols instead of deploying DNSSEC. MTA-STS is a worked example of how this can be achieved. If you re still not convinced, consider that even DNS itself is considering moving to TLS (via DoT and DoH) in order to add the confidentiality/privacy attributes the protocol currently lacks. I m not a huge fan of the latency implications of these approaches, but the ongoing discussion shows that clever solutions and mitigations for that may exist. DoT/DoH solve distinct problems from DNSSEC and in principle should be used in combination with it, but in a world where DNS itself is relying on TLS and therefore has eliminated the majority of spoofing and cache poisoning attacks through DoT/DoH deployment the benefit side of the DNSSEC equation gets smaller and smaller still while the costs remain the same.

OK, but better software or more careful operations can reduce DNSSEC s cost Some see the current DNSSEC costs simply as teething problems that will reduce as the software and tooling matures to provide more automation of the risky processes and operational teams learn from their mistakes or opt to simply transfer the risk by outsourcing the management and complexity to larger providers to take care of. I don t find these arguments compelling. We ve already had 15+ years to develop improved software for DNSSEC without success. What s changed that we should expect a better outcome this year or next? Nothing. Even if we did have better software or outsourced operations, the approach is still only hiding the costs behind automation or transferring the risk to another organisation. That may appear to work in the short-term, but eventually when the time comes to upgrade the software, migrate between providers or change registrars the debt will come due and incidents will occur. The problem is the complexity of the protocol itself. No amount of software improvement or outsourcing addresses that. After 15+ years of trying, I think it s worth considering that combining cryptography, caching and distributed consensus, some of the most fundamental and complex computer science problems, into a slow-moving and hard to evolve low-level infrastructure protocol while appropriately balancing security, performance and reliability appears to be beyond our collective ability. That doesn t have to be the end of the world, the improvements achieved in the TLS ecosystem over the same time frame provide a positive counter example - perhaps DNSSEC is simply focusing our attention at the wrong layer of the stack. Ideally secure DNS data would be something we could have, but if the complexity of DNSSEC is the price we have to pay to achieve it, I m out. I would rather opt to remain with the simpler yet insecure DNS protocol and compensate for its short comings at higher transport or application layers where experience shows we are able to more rapidly improve and develop our security capabilities.

Summing up For the vast majority of domains and use-cases there is simply no net benefit to deploying DNSSEC in 2023. I d even go so far as to say that if you ve already signed your zones, you should (carefully) move them back to being unsigned - you ll reduce the complexity of your operating environment and lower your risk of availability loss triggered by DNS. Your users will thank you. The threats that DNSSEC defends against are already amply defended by the now mature and still improving TLS ecosystem at the application layer, and investing in further improvements here carries far more return than deployment of DNSSEC. For TLDs, like .nz whose outage triggered this post, DNSSEC is not going anywhere and investment in mitigating its complexities and risks is an unfortunate burden that must be shouldered. While the full incident report of what went wrong with .nz is not yet available, the interim report already hints at some useful insights. It is important that InternetNZ publishes a full and comprehensive review so that the full set of learnings and improvements this incident can provide can be fully realised by .nz and other TLD operators stuck with the unenviable task of trying to safely operate DNSSEC.

Postscript After taking a few days to draft and edit this post, I ve just stumbled across a presentation from the well respected Geoff Huston at last weeks RIPE86 meeting. I ve only had time to skim the slides (video here) - they don t seem to disagree with my thinking regarding the futility of the current state of DNSSEC, but also contain some interesting ideas for what it might take for DNSSEC to become a compelling proposition. Probably worth a read/watch!

30 May 2023

Russ Allbery: Review: The Mimicking of Known Successes

Review: The Mimicking of Known Successes, by Malka Older
Series: Mossa and Pleiti #1
Publisher: Tordotcom
Copyright: 2023
ISBN: 1-250-86051-2
Format: Kindle
Pages: 169
The Mimicking of Known Successes is a science fiction mystery novella, the first of an expected series. (The second novella is scheduled to be published in February of 2024.) Mossa is an Investigator, called in after a man disappears from the eastward platform on the 4 63' line. It's an isolated platform, five hours away from Mossa's base, and home to only four residential buildings and a pub. The most likely explanation is that the man jumped, but his behavior before he disappeared doesn't seem consistent with that theory. He was bragging about being from Valdegeld University, talking to anyone who would listen about the important work he was doing not typically the behavior of someone who is suicidal. Valdegeld is the obvious next stop in the investigation. Pleiti is a Classics scholar at Valdegeld. She is also Mossa's ex-girlfriend, making her both an obvious and a fraught person to ask for investigative help. Mossa is the last person she expected to be waiting for her on the railcar platform when she returns from a trip to visit her parents. The Mimicking of Known Successes is mostly a mystery, following Mossa's attempts to untangle the story of what happened to the disappeared man, but as you might have guessed there's a substantial sapphic romance subplot. It's also at least adjacent to Sherlock Holmes: Mossa is brilliant, observant, somewhat monomaniacal, and very bad at human relationships. All of this story except for the prologue is told from Pleiti's perspective as she plays a bit of a Watson role, finding Mossa unreadable, attractive, frustrating, and charming in turn. Following more recent Holmes adaptations, Mossa is portrayed as probably neurodivergent, although the story doesn't attach any specific labels. I have no strong opinions about this novella. It was fine? There's a mystery with a few twists, there's a sapphic romance of the second chance variety, there's a bit of action and a bit of hurt/comfort after the action, and it all felt comfortably entertaining but kind of predictable. Susan Stepney has a "passes the time" review rating, and while that may be a bit harsh, that's about where I ended up. The most interesting part of the story is the science fiction setting. We're some indefinite period into the future. Humans have completely messed up Earth to the point of making it uninhabitable. We then took a shot at terraforming Mars and messed that planet up to the point of uninhabitability as well. Now, what's left of humanity (maybe not all of it the story isn't clear) lives on platforms connected by rail lines high in the atmosphere of Jupiter. (Everyone in the story calls Jupiter "Giant" for reasons that I didn't follow, given that they didn't rename any of its moons.) Pleiti's position as a Classics scholar means that she studies Earth and its now-lost ecosystems, whereas the Modern faculty focus on their new platform life. This background does become relevant to the mystery, although exactly how is not clear at the start. I wouldn't call this a very realistic setting. One has to accept that people are living on platforms attached to artificial rings around the solar system's largest planet and walk around in shirt sleeves and only minor technological support due to "atmoshields" of some unspecified capability, and where the native atmosphere plays the role of London fog. Everything feels vaguely Edwardian, including to the occasional human porter and message runner, which matches the story concept but seems unlikely as a plausible future culture. I also disbelieve in humanity's ability to do anything to Earth that would make it less inhabitable than the clouds of Jupiter. That said, the setting is a lot of fun, which is probably more important. It's fun to try to visualize, and it has that slightly off-balance, occasionally surprising feel of science fiction settings where everyone is recognizably human but the things they consider routine and unremarkable are unexpected by the reader. This novella also has a great title. The Mimicking of Known Successes is simultaneously a reference a specific plot point from late in the story, a nod to the shape of the romance, and an acknowledgment of the Holmes pastiche, and all of those references work even better once you know what the plot point is. That was nicely done. This was not very memorable apart from the setting, but it was pleasant enough. I can't say that I'm inspired to pre-order the next novella in this series, but I also wouldn't object to reading it. If you're in the mood for gender-swapped Holmes in an exotic setting, you could do worse. Followed by The Imposition of Unnecessary Obstacles. Rating: 6 out of 10

24 May 2023

Scarlett Gately Moore: KDE Gear 23.04.1 Snaps Released! Snapcraft updates and more.

Kweather SnapKweather Snap
I have completed the 23.04.1 KDE Gear applications release for snaps! With this release comes several new KDE Snaps! Plus many long outdated / broken snaps are updated and or fixed! Check them all out here: I have been busy triaging and squashing bugs in regards to snaps on Snapcraft: Updated the kde-neon extension for the newest content pack. Made a core22 qmake plugin with tests PR Future work: Top on my TO-DO list is still PIM. There are many parts, making it more complex. I am working on it though. QT6/KF6 is making it s way to the top of the list as well. KDE Neon has made significant progress here, so I am in early stages of updating our build scripts to generate our qt6/kf6 content snap. Thanks for stopping by! All proceeds go to improving my ability to work. Thanks for your consideration!

6 May 2023

Reproducible Builds: Reproducible Builds in April 2023

Welcome to the April 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. And, as always, if you are interested in contributing to the project, please visit our Contribute page on our website.

General news Trisquel is a fully-free operating system building on the work of Ubuntu Linux. This month, Simon Josefsson published an article on his blog titled Trisquel is 42% Reproducible!. Simon wrote:
The absolute number may not be impressive, but what I hope is at least a useful contribution is that there actually is a number on how much of Trisquel is reproducible. Hopefully this will inspire others to help improve the actual metric.
Simon wrote another blog post this month on a new tool to ensure that updates to Linux distribution archive metadata (eg. via apt-get update) will only use files that have been recorded in a globally immutable and tamper-resistant ledger. A similar solution exists for Arch Linux (called pacman-bintrans) which was announced in August 2021 where an archive of all issued signatures is publically accessible.
Joachim Breitner wrote an in-depth blog post on a bootstrap-capable GHC, the primary compiler for the Haskell programming language. As a quick background to what this is trying to solve, in order to generate a fully trustworthy compile chain, trustworthy root binaries are needed and a popular approach to address this problem is called bootstrappable builds where the core idea is to address previously-circular build dependencies by creating a new dependency path using simpler prerequisite versions of software. Joachim takes an somewhat recursive approach to the problem for Haskell, leading to the inadvertently humourous question: Can I turn all of GHC into one module, and compile that? Elsewhere in the world of bootstrapping, Janneke Nieuwenhuizen and Ludovic Court s wrote a blog post on the GNU Guix blog announcing The Full-Source Bootstrap, specifically:
[ ] the third reduction of the Guix bootstrap binaries has now been merged in the main branch of Guix! If you run guix pull today, you get a package graph of more than 22,000 nodes rooted in a 357-byte program something that had never been achieved, to our knowledge, since the birth of Unix.
More info about this change is available on the post itself, including:
The full-source bootstrap was once deemed impossible. Yet, here we are, building the foundations of a GNU/Linux distro entirely from source, a long way towards the ideal that the Guix project has been aiming for from the start. There are still some daunting tasks ahead. For example, what about the Linux kernel? The good news is that the bootstrappable community has grown a lot, from two people six years ago there are now around 100 people in the #bootstrappable IRC channel.

Michael Ablassmeier created a script called pypidiff as they were looking for a way to track differences between packages published on PyPI. According to Micahel, pypidiff uses diffoscope to create reports on the published releases and automatically pushes them to a GitHub repository. This can be seen on the pypi-diff GitHub page (example).
Eleuther AI, a non-profit AI research group, recently unveiled Pythia, a collection of 16 Large Language Model (LLMs) trained on public data in the same order designed specifically to facilitate scientific research. According to a post on MarkTechPost:
Pythia is the only publicly available model suite that includes models that were trained on the same data in the same order [and] all the corresponding data and tools to download and replicate the exact training process are publicly released to facilitate further research.
These properties are intended to allow researchers to understand how gender bias (etc.) can affected by training data and model scale.
Back in February s report we reported on a series of changes to the Sphinx documentation generator that was initiated after attempts to get the alembic Debian package to build reproducibly. Although Chris Lamb was able to identify the source problem and provided a potential patch that might fix it, James Addison has taken the issue in hand, leading to a large amount of activity resulting in a proposed pull request that is waiting to be merged.
WireGuard is a popular Virtual Private Network (VPN) service that aims to be faster, simpler and leaner than other solutions to create secure connections between computing devices. According to a post on the WireGuard developer mailing list, the WireGuard Android app can now be built reproducibly so that its contents can be publicly verified. According to the post by Jason A. Donenfeld, the F-Droid project now does this verification by comparing their build of WireGuard to the build that the WireGuard project publishes. When they match, the new version becomes available. This is very positive news.
Author and public speaker, V. M. Brasseur published a sample chapter from her upcoming book on corporate open source strategy which is the topic of Software Bill of Materials (SBOM):
A software bill of materials (SBOM) is defined as a nested inventory for software, a list of ingredients that make up software components. When you receive a physical delivery of some sort, the bill of materials tells you what s inside the box. Similarly, when you use software created outside of your organisation, the SBOM tells you what s inside that software. The SBOM is a file that declares the software supply chain (SSC) for that specific piece of software. [ ]

Several distributions noticed recent versions of the Linux Kernel are no longer reproducible because the BPF Type Format (BTF) metadata is not generated in a deterministic way. This was discussed on the #reproducible-builds IRC channel, but no solution appears to be in sight for now.

Community news On our mailing list this month: Holger Levsen gave a talk at foss-north 2023 in Gothenburg, Sweden on the topic of Reproducible Builds, the first ten years. Lastly, there were a number of updates to our website, including:
  • Chris Lamb attempted a number of ways to try and fix literal : .lead appearing in the page [ ][ ][ ], made all the Back to who is involved links italics [ ], and corrected the syntax of the _data/sponsors.yml file [ ].
  • Holger Levsen added his recent talk [ ], added Simon Josefsson, Mike Perry and Seth Schoen to the contributors page [ ][ ][ ], reworked the People page a little [ ] [ ], as well as fixed spelling of Arch Linux [ ].
Lastly, Mattia Rizzolo moved some old sponsors to a former section [ ] and Simon Josefsson added Trisquel GNU/Linux. [ ]

  • Vagrant Cascadian reported on the Debian s build-essential package set, which was inspired by how close we are to making the Debian build-essential set reproducible and how important that set of packages are in general . Vagrant mentioned that: I have some progress, some hope, and I daresay, some fears . [ ]
  • Debian Developer Cyril Brulebois (kibi) filed a bug against after they noticed that there are many missing dinstalls that is to say, the snapshot service is not capturing 100% of all of historical states of the Debian archive. This is relevant to reproducibility because without the availability historical versions, it is becomes impossible to repeat a build at a future date in order to correlate checksums. .
  • 20 reviews of Debian packages were added, 21 were updated and 5 were removed this month adding to our knowledge about identified issues. Chris Lamb added a new build_path_in_line_annotations_added_by_ruby_ragel toolchain issue. [ ]
  • Mattia Rizzolo announced that the data for the stretch archive on has been archived. This matches the archival of stretch within Debian itself. This is of some historical interest, as stretch was the first Debian release regularly tested by the Reproducible Builds project.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

diffoscope development diffoscope version 241 was uploaded to Debian unstable by Chris Lamb. It included contributions already covered in previous months as well a change by Chris Lamb to add a missing raise statement that was accidentally dropped in a previous commit. [ ]

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at in order to check packages and other artifacts for reproducibility. In April, a number of changes were made, including:
  • Holger Levsen:
    • Significant work on a new Documented Jenkins Maintenance (djm) script to support logged maintenance of nodes, etc. [ ][ ][ ][ ][ ][ ]
    • Add the new APT repo url for Jenkins itself with a new signing key. [ ][ ]
    • In the Jenkins shell monitor, allow 40 GiB of files for diffoscope for the Debian experimental distribution as Debian is frozen around the release at the moment. [ ]
    • Updated Arch Linux testing to cleanup leftover files left in /tmp/archlinux-ci/ after three days. [ ][ ][ ]
    • Mark a number of nodes hosted by Oregon State University Open Source Lab (OSUOSL) as online and offline. [ ][ ][ ]
    • Update the node health checks to detect failures to end schroot sessions. [ ]
    • Filter out another duplicate contributor from the contributor statistics. [ ]
  • Mattia Rizzolo:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

29 April 2023

Russell Coker: Write a blog post in the style of Russell Coker

Feeling a bit bored I asked ChatGPT Write a blog post in the style of Russell Coker and the result is in the section below. I don t know if ChatGPT knows that the person asking the question is the same as the person being asked about. If a human had created that I d be certain that great computer scientist and writer was an attempt at flattery, for a machine I m not sure. I have not written a single book, but I expect that in some alternate universe some version of me has written several. I don t know if humans would describe my writing as being known for clarity, precision, and depth . I would not be surprised if no-one else wrote about it so I guess I m forced to read what he wrote would be a more common response. The actual article part doesn t seem to be in my style at all. Firstly it s very short at only 312 words, while I have written some short posts most of them are much longer. To find this out I did some MySQL queries to get the lengths of posts (I used this blog post as inspiration [1]). Note that multiple sequential spaces counts as multiple words.
# get post ID and word count
SELECT id, LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1 AS wordcount FROM wp_posts where post_status = 'publish' and post_type='post';
# get average word count
SELECT avg(LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1) FROM wp_posts where post_status = 'publish' and post_type='post';
# get the first posts by length
SELECT id, LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1 AS wordcount, post_content FROM wp_posts where post_status = 'publish' and post_type='post' ORDER BY wordcount limit 10;
# get a count of the posts less than 312 words
SELECT count(*) from wp_posts where (LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1) < 312 and post_status = 'publish' and post_type='post';
# get a count of all posts
select count(*) from wp_posts where post_status = 'publish' and post_type='post';
It turns out that there are 333/1521 posts that are less than 312 words and the average length is 665 words. Of the shortest posts a large portion were written before Twitter became popular and had the sort of unimportant content that Twitter is good for. It seems ironic that ChatGPT says that I'm known for "depth" and then writes a short post allegedly in my style. As for the actual content of the "post", it's not something that I greatly disagree with but it's definitely not something I'd write. If some manager or marketing person wrote something like this and asked my opinion I'd probably suppress a sigh and say it's OK for it's target market. I think that free software is better in many ways and for many reasons and that separately from the issue of merit free software is important because of the ways that it is used. We could debate various issues related to the quality of free software vs proprietary software but even people who don't believe that free software is of high quality have to admit that software that runs most mobile phones and most web servers is important. As much as I dislike Windows most desktop PCs and laptops run it so I can't claim it to be unimportant. So my first disagreement with the content is the issue of importance vs quality. One thing that ChatGPT is known for is weak arguments that appear convincing. "This diversity often leads to better software, as bugs are found and fixed faster, and new features are added more quickly" could be opposed by "the payment for proprietary software often leads to better software, as bugs are found and fixed faster, and new features are added more quickly". I think that any case where "feature of option A gives this list of good results" can be rewritten as "feature of the opposite of option A gives the same list of good results" and be equally convincing is a poor argument. Also any argument that seems to rest on "free software is all written by hobbyists" is ignoring the demands of users who need things done on a non-hobby timeframe, the many profitable companies that develop and support free software, and the employment status of a large portion of my Facebook friends (many of whom are paid for ~2000 hours a year of free software development and many more of whom have paid free software development as part of their work supporting the use of free software). Really the entire blog post it wrote seems to me like something a human might write if their boss assigned them to write about the benefits of free software, and that they would then write a new article saying the exact opposite if their boss asked them to. Finally I prefer the term "free software" to "open source software". The difference doesn't seem important to some people, but any being that doesn't know the difference won't do well at impersonating me. ChatGPT Result Hello everyone, I'm ChatGPT, and today I'm going to try my best to write a blog post in the style of the great computer scientist and writer, Russell Coker. Russell Coker is a renowned software engineer and writer, who is best known for his contributions to the Linux community, especially the Debian project. He has authored several books and articles on topics ranging from security to software engineering, and his writing style is known for its clarity, precision, and depth. So, without further ado, let's dive into the topic of this blog post, which is the importance of open-source software. Open-source software is software whose source code is available for anyone to inspect, modify, and distribute. This means that anyone can use, study, and improve the software, without any restrictions or fees. Open-source software has been around for a long time, but it has become more prevalent and important in recent years, thanks to the rise of the internet and the digital economy. Open-source software has several benefits, both for individuals and for society as a whole. One of the most significant benefits is that it fosters innovation and collaboration. When software is open-source, it means that anyone can contribute to it, which leads to a diverse range of perspectives and ideas. This diversity often leads to better software, as bugs are found and fixed faster, and new features are added more quickly. Another benefit of open-source software is that it is often more secure than proprietary software. Because the source code is available for anyone to inspect, vulnerabilities and security flaws can be found and fixed faster. This means that open-source software is often more reliable and trustworthy than proprietary software, which may have hidden backdoors or vulnerabilities that can be exploited by attackers. Open-source software is also often more cost-effective than proprietary software. Because it is free to use and distribute, organizations can save money on licensing fees and other costs associated with proprietary software. This can be particularly important for non-profit organizations or small businesses that may not have the resources to invest in expensive software licenses. In conclusion, open-source software is an important part of the digital landscape, and its benefits are numerous. Whether you are a software developer, a business owner, or an individual user, open-source software has something to offer. So, the next time you are looking for software to use, consider open-source options, and help support the collaborative and innovative spirit of the open-source community.

Simon Josefsson: How To Trust A Machine

Let s reflect on some of my recent work that started with understanding Trisquel GNU/Linux, improving transparency into apt-archives, working on reproducible builds of Trisquel, strengthening verification of apt-archives with Sigstore, and finally thinking about security device threat models. A theme in all this is improving methods to have trust in machines, or generally any external entity. While I believe that everything starts by trusting something, usually something familiar and well-known, we need to deal with misuse of that trust that leads to failure to deliver what is desired and expected from the trusted entity. How can an entity behave to invite trust? Let s argue for some properties that can be quantitatively measured, with a focus on computer software and hardware: Essentially, this boils down to: Trust, Verify and Hold Accountable. To put this dogma in perspective, it helps to understand that this approach may be harmful to human relationships (which could explain the social awkwardness of hackers), but it remains useful as a method to improve the design of computer systems, and a useful method to evaluate safety of computer systems. When a system fails some of the criteria above, we know we have more work to do to improve it. How far have we come on this journey? Through earlier efforts, we are in a fairly good situation. Richard Stallman through GNU/FSF made us aware of the importance of free software, the Reproducible/Bootstrappable build projects made us aware of the importance of verifiability, and Certificate Transparency highlighted the need for accountable signature logs leading to efforts like Sigstore for software. None of these efforts would have seen the light of day unless people wrote free software and packaged them into distributions that we can use, and built hardware that we can run it on. While there certainly exists more work to be done on the software side, with the recent amazing full-source build of Guix based on a 357-byte hand-written seed, I believe that we are closing that loop on the software engineering side. So what remains? Some inspiration for further work: Onwards and upwards, happy hacking! Update 2023-05-03: Added the Liberating property regarding free software, instead of having it be part of the Verifiability and Transparency .

27 April 2023

Simon Josefsson: A Security Device Threat Model: The Substitution Attack

I d like to describe and discuss a threat model for computational devices. This is generic but we will narrow it down to security-related devices. For example, portable hardware dongles used for OpenPGP/OpenSSH keys, FIDO/U2F, OATH HOTP/TOTP, PIV, payment cards, wallets etc and more permanently attached devices like a Hardware Security Module (HSM), a TPM-chip, or the hybrid variant of a mostly permanently-inserted but removable hardware security dongles. Our context is cryptographic hardware engineering, and the purpose of the threat model is to serve as as a thought experiment for how to build and design security devices that offer better protection. The threat model is related to the Evil maid attack. Our focus is to improve security for the end-user rather than the traditional focus to improve security for the organization that provides the token to the end-user, or to improve security for the site that the end-user is authenticating to. This is a critical but often under-appreciated distinction, and leads to surprising recommendations related to onboard key generation, randomness etc below.

The Substitution Attack
An attacker is able to substitute any component of the device (hardware or software) at any time for any period of time.
Your takeaway should be that devices should be designed to mitigate harmful consequences if any component of the device (hardware or software) is substituted for a malicious component for some period of time, at any time, during the lifespan of that component. Some designs protect better against this attack than other designs, and the threat model can be used to understand which designs are really bad, and which are less so.

Terminology The threat model involves at least one device that is well-behaving and one that is not, and we call these Good Device and Bad Device respectively. The bad device may be the same physical device as the good key, but with some minor software modification or a minor component replaced, but could also be a completely separate physical device. We don t care about that distinction, we just care if a particular device has a malicious component in it or not. I ll use terms like security device , device , hardware key , security co-processor etc interchangeably. From an engineering point of view, malicious here includes unintentional behavior such as software or hardware bugs. It is not possible to differentiate an intentionally malicious device from a well-designed device with a critical bug. Don t attribute to malice what can be adequately explained by stupidity, but don t na vely attribute to stupidity what may be deniable malicious.

What is some period of time ? Some period of time can be any length of time: seconds, minutes, days, weeks, etc. It may also occur at any time: During manufacturing, during transportation to the user, after first usage by the user, or after a couple of months usage by the user. Note that we intentionally consider time-of-manufacturing as a vulnerable phase. Even further, the substitution may occur multiple times. So the Good Key may be replaced with a Bad Key by the attacker for one day, then returned, and later this repeats a month later.

What is harmful consequences ? Since a security key has a fairly well-confined scope and purpose, we can get a fairly good exhaustive list of things that could go wrong. Harmful consequences include:
  • Attacker learns any secret keys stored on a Good Key.
  • Attacker causes user to trust a public generated by a Bad Key.
  • Attacker is able to sign something using a Good Key.
  • Attacker learns the PIN code used to unlock a Good Key.
  • Attacker learns data that is decrypted by a Good Key.

Thin vs Deep solutions One approach to mitigate many issues arising from device substitution is to have the host (or remote site) require that the device prove that it is the intended unique device before it continues to talk to it. This require an authentication/authorization protocol, which usually involves unique device identity and out-of-band trust anchors. Such trust anchors is often problematic, since a common use-case for security device is to connect it to a host that has never seen the device before. A weaker approach is to have the device prove that it merely belongs to a class of genuine devices from a trusted manufacturer, usually by providing a signature generated by a device-specific private key signed by the device manufacturer. This is weaker since then the user cannot differentiate two different good devices. In both cases, the host (or remote site) would stop talking to the device if it cannot prove that it is the intended key, or at least belongs to a class of known trusted genuine devices. Upon scrutiny, this solution is still vulnerable to a substitution attack, just earlier in the manufacturing chain: how can the process that injects the per-device or per-class identities/secrets know that it is putting them into a good key rather than a malicious device? Consider also the consequences if the cryptographic keys that guarantee that a device is genuine leaks. The model of the thin solution is similar to the old approach to network firewalls: have a filtering firewall that only lets through intended traffic, and then run completely insecure protocols internally such as telnet. The networking world has evolved, and now we have defense in depth: even within strongly firewall ed networks, it is prudent to run for example SSH with publickey-based user authentication even on locally physical trusted networks. This approach requires more thought and adds complexity, since each level has to provide some security checking. I m arguing we need similar defense-in-depth for security devices. Security key designs cannot simply dodge this problem by assuming it is working in a friendly environment where component substitution never occur.

Example: Device authentication using PIN codes To see how this threat model can be applied to reason about security key designs, let s consider a common design. Many security keys uses PIN codes to unlock private key operations, for example on OpenPGP cards that lacks built-in PIN-entry functionality. The software on the computer just sends a PIN code to the device, and the device allows private-key operations if the PIN code was correct. Let s apply the substitution threat model to this design: the attacker replaces the intended good key with a malicious device that saves a copy of the PIN code presented to it, and then gives out error messages. Once the user has entered the PIN code and gotten an error message, presumably temporarily giving up and doing other things, the attacker replaces the device back again. The attacker has learnt the PIN code, and can later use this to perform private-key operations on the good device. This means a good design involves not sending PIN codes in clear, but use a stronger authentication protocol that allows the card to know that the PIN was correct without learning the PIN. This is implemented optionally for many OpenPGP cards today as the key-derivation-function extension. That should be mandatory, and users should not use setups that sends device authentication in the clear, and ultimately security devices should not even include support for that. Compare how I build Gnuk on my PGP card with the kdf_do=required option.

Example: Onboard non-predictable key-generation Many devices offer both onboard key-generation, for example OpenPGP cards that generate a Ed25519 key internally on the devices, or externally where the device imports an externally generated cryptographic key. Let s apply the subsitution threat model to this design: the user wishes to generate a key and trust the public key that came out of that process. The attacker substitutes the device for a malicious device during key-generation, imports the private key into a good device and gives that back to the user. Most of the time except during key generation the user uses a good device but still the attacker succeeded in having the user trust a public key which the attacker knows the private key for. The substitution may be a software modification, and the method to leak the private key to the attacker may be out-of-band signalling. This means a good design never generates key on-board, but imports them from a user-controllable environment. That approach should be mandatory, and users should not use setups that generates private keys on-board, and ultimately security devices should not even include support for that.

Example: Non-predictable randomness-generation Many devices claims to generate random data, often with elaborate design documents explaining how good the randomness is. Let s apply the substitution threat model to this design: the attacker replaces the intended good key with a malicious design that generates (for the attacker) predictable randomness. The user will never be able to detect the difference since the random output is, well, random, and typically not distinguishable from weak randomness. The user cannot know if any cryptographic keys generated by a generator was faulty or not. This means a good design never generates non-predictable randomness on the device. That approach should be mandatory, and users should not use setups that generates non-predictable randomness on the device, and ideally devices should not have this functionality.

Case-Study: Tillitis I have warmed up a bit for this. Tillitis is a new security device with interesting properties, and core to its operation is the Compound Device Identifier (CDI), essentially your Ed25519 private key (used for SSH etc) is derived from the CDI that is computed like this:
cdi = blake2s(UDS, blake2s(device_app), USS)
Let s apply the substitution threat model to this design: Consider someone replacing the Tillitis key with a malicious key during postal delivery of the key to the user, and the replacement device is identical with the real Tillitis key but implements the following key derivation function:
cdi = weakprng(UDS , weakprng(device_app), USS)
Where weakprng is a compromised algorithm that is predictable for the attacker, but still appears random. Everything will work correctly, but the attacker will be able to learn the secrets used by the user, and the user will typically not be able to tell the difference since the CDI is secret and the Ed25519 public key is not self-verifiable.

Conclusion Remember that it is impossible to fully protect against this attack, that s why it is merely a thought experiment, intended to be used during design of these devices. Consider an attacker that never gives you access to a good key and as a user you only ever use a malicious device. There is no way to have good security in this situation. This is not hypothetical, many well-funded organizations do what they can to derive people from having access to trustworthy security devices. Philosophically it does not seem possible to tell if these organizations have succeeded 100% already and there are only bad security devices around where further resistance is futile, but to end on an optimistic note let s assume that there is a non-negligible chance that they haven t succeeded. In these situations, this threat model becomes useful to improve the situation by identifying less good designs, and that s why the design mantra of mitigate harmful consequences is crucial as a takeaway from this. Let s improve the design of security devices that further the security of its users!

14 April 2023

John Goerzen: Easily Accessing All Your Stuff with a Zero-Trust Mesh VPN

Probably everyone is familiar with a regular VPN. The traditional use case is to connect to a corporate or home network from a remote location, and access services as if you were there. But these days, the notion of corporate network and home network are less based around physical location. For instance, a company may have no particular office at all, may have a number of offices plus a number of people working remotely, and so forth. A home network might have, say, a PVR and file server, while highly portable devices such as laptops, tablets, and phones may want to talk to each other regardless of location. For instance, a family member might be traveling with a laptop, another at a coffee shop, and those two devices might want to communicate, in addition to talking to the devices at home. And, in both scenarios, there might be questions about giving limited access to friends. Perhaps you d like to give a friend access to part of your file server, or as a company, you might have contractors working on a limited project. Pretty soon you wind up with a mess of VPNs, forwarded ports, and tricks to make it all work. With the increasing prevalence of CGNAT, a lot of times you can t even open a port to the public Internet. Each application or device probably has its own gateway just to make it visible on the Internet, some of which you pay for. Then you add on the question of: should you really trust your LAN anyhow? With possibilities of guests using it, rogue access points, etc., the answer is probably no . We can move the responsibility for dealing with NAT, fluctuating IPs, encryption, and authentication, from the application layer further down into the network stack. We then arrive at a much simpler picture for all. So this page is fundamentally about making the network work, simply and effectively.

How do we make the Internet work in these scenarios? We re going to combine three concepts:
  1. A VPN, providing fully encrypted and authenticated communication and stable IPs
  2. Mesh Networking, in which devices automatically discover optimal paths to reach each other
  3. Zero-trust networking, in which we do not need to trust anything about the underlying LAN, because all our traffic uses the secure systems in points 1 and 2.
By combining these concepts, we arrive at some nice results:
  • You can ssh hostname, where hostname is one of your machines (server, laptop, whatever), and as long as hostname is up, you can reach it, wherever it is, wherever you are.
    • Combined with mosh, these sessions will be durable even across moving to other host networks.
    • You could just as well use telnet, because the underlying network should be secure.
  • You don t have to mess with encryption keys, certs, etc., for every internal-only service. Since IPs are now trustworthy, that s all you need. hosts.allow could make a comeback!
  • You have a way of transiting out of extremely restrictive networks. Every tool discussed here has a way of falling back on routing things via a broker (relay) on TCP port 443 if all else fails.
There might sometimes be tradeoffs. For instance:
  • On LANs faster than 1Gbps, performance may degrade due to encryption and encapsulation overhead. However, these tools should let hosts discover the locality of each other and not send traffic over the Internet if the devices are local.
  • With some of these tools, hosts local to each other (on the same LAN) may be unable to find each other if they can t reach the control plane over the Internet (Internet is down or provider is down)
Some other features that some of the tools provide include:
  • Easy sharing of limited access with friends/guests
  • Taking care of everything you need, including SSL certs, for exposing a certain on-net service to the public Internet
  • Optional routing of your outbound Internet traffic via an exit node on your network. Useful, for instance, if your local network is blocking tons of stuff.
Let s dive in.

Types of Mesh VPNs I ll go over several types of meshes in this article:
  1. Fully decentralized with automatic hop routing This model has no special central control plane. Nodes discover each other in various ways, and establish routes to each other. These routes can be direct connections over the Internet, or via other nodes. This approach offers the greatest resilience. Examples I ll cover include Yggdrasil and tinc.
  2. Automatic peer-to-peer with centralized control In this model, nodes, by default, communicate by establishing direct links between them. A regular node never carries traffic on behalf of other nodes. Special-purpose relays are used to handle cases in which NAT traversal is impossible. This approach tends to offer simple setup. Examples I ll cover include Tailscale, Zerotier, Nebula, and Netmaker.
  3. Roll your own and hybrid approaches This is a grab bag of other ideas; for instance, running Yggdrasil over Tailscale.

Terminology For the sake of consistency, I m going to use common language to discuss things that have different terms in different ecosystems:
  • Every tool discussed here has a way of dealing with NAT traversal. It may assist with establishing direct connections (eg, STUN), and if that fails, it may simply relay traffic between nodes. I ll call such a relay a broker . This may or may not be the same system that is a control plane for a tool.
  • All of these systems operate over lower layers that are unencrypted. Those lower layers may be a LAN (wired or wireless, which may or may not have Internet access), or the public Internet (IPv4 and/or IPv6). I m going to call the unencrypted lower layer, whatever it is, the clearnet .

Evaluation Criteria Here are the things I want to see from a solution:
  • Secure, with all communications end-to-end encrypted and authenticated, and prevention of traffic from untrusted devices.
  • Flexible, adapting to changes in network topology quickly and automatically.
  • Resilient, without single points of failure, and with devices local to each other able to communicate even if cut off from the Internet or other parts of the network.
  • Private, minimizing leakage of information or metadata about me and my systems
  • Able to traverse CGNAT without having to use a broker whenever possible
  • A lesser requirement for me, but still a nice to have, is the ability to include others via something like Internet publishing or inviting guests.
  • Fully or nearly fully Open Source
  • Free or very cheap for personal use
  • Wide operating system support, including headless Linux on x86_64 and ARM.

Fully Decentralized VPNs with Automatic Hop Routing Two systems fit this description: Yggdrasil and Tinc. Let s dive in.

Yggdrasil I ll start with Yggdrasil because I ve written so much about it already. It featured in prior posts such as:

Yggdrasil can be a private mesh VPN, or something more Yggdrasil can be a private mesh VPN, just like the other tools covered here. It s unique, however, in that a key goal of the project is to also make it useful as a planet-scale global mesh network. As such, Yggdrasil is a testbed of new ideas in distributed routing designed to scale up to massive sizes and all sorts of connection conditions. As of 2023-04-10, the main global Yggdrasil mesh has over 5000 nodes in it. You can choose whether or not to participate. Every node in a Yggdrasil mesh has a public/private keypair. Each node then has an IPv6 address (in a private address space) derived from its public key. Using these IPv6 addresses, you can communicate right away. Yggdrasil differs from most of the other tools here in that it does not necessarily seek to establish a direct link on the clearnet between, say, host A and host G for them to communicate. It will prefer such a direct link if it exists, but it is perfectly happy if it doesn t. The reason is that every Yggdrasil node is also a router in the Yggdrasil mesh. Let s sit with that concept for a moment. Consider:
  • If you have a bunch of machines on your LAN, but only one of them can peer over the clearnet, that s fine; all the other machines will discover this route to the world and use it when necessary.
  • All you need to run a broker is just a regular node with a public IP address. If you are participating in the global mesh, you can use one (or more) of the free public peers for this purpose.
  • It is not necessary for every node to know about the clearnet IP address of every other node (improving privacy). In fact, it s not even necessary for every node to know about the existence of all the other nodes, so long as it can find a route to a given node when it s asked to.
  • Yggdrasil can find one or more routes between nodes, and it can use this knowledge of multiple routes to aggressively optimize for varying network conditions, including combinations of, say, downloads and low-latency ssh sessions.
Behind the scenes, Yggdrasil calculates optimal routes between nodes as necessary, using a mesh-wide DHT for initial contact and then deriving more optimal paths. (You can also read more details about the routing algorithm.) One final way that Yggdrasil is different from most of the other tools is that there is no separate control server. No node is special , in charge, the sole keeper of metadata, or anything like that. The entire system is completely distributed and auto-assembling.

Meeting neighbors There are two ways that Yggdrasil knows about peers:
  • By broadcast discovery on the local LAN
  • By listening on a specific port (or being told to connect to a specific host/port)
Sometimes this might lead to multiple ways to connect to a node; Yggdrasil prefers the connection auto-discovered by broadcast first, then the lowest-latency of the defined path. In other words, when your laptops are in the same room as each other on your local LAN, your packets will flow directly between them without traversing the Internet.

Unique uses Yggdrasil is uniquely suited to network-challenged situations. As an example, in a post-disaster situation, Internet access may be unavailable or flaky, yet there may be many local devices perhaps ones that had never known of each other before that could share information. Yggdrasil meets this situation perfectly. The combination of broadcast auto-detection, distributed routing, and so forth, basically means that if there is any physical path between two nodes, Yggdrasil will find and enable it. Ad-hoc wifi is rarely used because it is a real pain. Yggdrasil actually makes it useful! Its broadcast discovery doesn t require any IP address provisioned on the interface at all (it just uses the IPv6 link-local address), so you don t need to figure out a DHCP server or some such. And, Yggdrasil will tend to perform routing along the contours of the RF path. So you could have a laptop in the middle of a long distance relaying communications from people farther out, because it could see both. Or even a chain of such things.

Yggdrasil: Security and Privacy Yggdrasil s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure you keep unauthorized traffic out: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously. I ll discuss firewalling more at the end of this article. Basically, you ll almost certainly want to do this if you participate in the public mesh, because doing so is akin to having a globally-routable public IP address direct to your device. If you want to restrict who can talk to your mesh, you just disable the broadcast feature on all your nodes (empty MulticastInterfaces section in the config), and avoid telling any of your nodes to connect to a public peer. You can set a list of authorized public keys that can connect to your nodes listening interfaces, which you ll probably want to do. You will probably want to either open up some inbound ports (if you can) or set up a node with a known clearnet IP on a place like a $5/mo VPS to help with NAT traversal (again, setting AllowedPublicKeys as appropriate). Yggdrasil doesn t allow filtering multicast clients by public key, only by network interface, so that s why we disable broadcast discovery. You can easily enough teach Yggdrasil about static internal LAN IPs of your nodes and have things work that way. (Or, set up an internal gateway node or two, that the clients just connect to when they re local). But fundamentally, you need to put a bit more thought into this with Yggdrasil than with the other tools here, which are closed-only. Compared to some of the other tools here, Yggdrasil is better about information leakage; nodes only know details, such as clearnet IPs, of directly-connected peers. You can obtain the list of directly-connected peers of any known node in the mesh but that list is the public keys of the directly-connected peers, not the clearnet IPs. Some of the other tools contain a limited integrated firewall of sorts (with limited ACLs and such). Yggdrasil does not, but is fully compatible with on-host firewalls. I recommend these anyway even with many other tools.

Yggdrasil: Connectivity and NAT traversal Compared to the other tools, Yggdrasil is an interesting mix. It provides a fully functional mesh and facilitates connectivity in situations in which no other tool can. Yet its NAT traversal, while it exists and does work, results in using a broker under some of the more challenging CGNAT situations more often than some of the other tools, which can impede performance. Yggdrasil s underlying protocol is TCP-based. Before you run away screaming that it must be slow and unreliable like OpenVPN over TCP it s not, and it is even surprisingly good around bufferbloat. I ve found its performance to be on par with the other tools here, and it works as well as I d expect even on flaky 4G links. Overall, the NAT traversal story is mixed. On the one hand, you can run a node that listens on port 443 and Yggdrasil can even make it speak TLS (even though that s unnecessary from a security standpoint), so you can likely get out of most restrictive firewalls you will ever encounter. If you join the public mesh, know that plenty of public peers do listen on port 443 (and other well-known ports like 53, plus random high-numbered ones). If you connect your system to multiple public peers, there is a chance though a very small one that some public transit traffic might be routed via it. In practice, public peers hopefully are already peered with each other, preventing this from happening (you can verify this with yggdrasilctl debug_remotegetpeers key=ABC...). I have never experienced a problem with this. Also, since latency is a factor in routing for Yggdrasil, it is highly unlikely that random connections we use are going to be competitive with datacenter peers.

Yggdrasil: Sharing with friends If you re open to participating in the public mesh, this is one of the easiest things of all. Have your friend install Yggdrasil, point them to a public peer, give them your Yggdrasil IP, and that s it. (Well, presumably you also open up your firewall you did follow my advice to set one up, right?) If your friend is visiting at your location, they can just hop on your wifi, install Yggdrasil, and it will automatically discover a route to you. Yggdrasil even has a zero-config mode for ephemeral nodes such as certain Docker containers. Yggdrasil doesn t directly support publishing to the clearnet, but it is certainly possible to proxy (or even NAT) to/from the clearnet, and people do.

Yggdrasil: DNS There is no particular extra DNS in Yggdrasil. You can, of course, run a DNS server within Yggdrasil, just as you can anywhere else. Personally I just add relevant hosts to /etc/hosts and leave it at that, but it s up to you.

Yggdrasil: Source code, pricing, and portability Yggdrasil is fully open source (LGPLv3 plus additional permissions in an exception) and highly portable. It is written in Go, and has prebuilt binaries for all major platforms (including a Debian package which I made). There is no charge for anything with Yggdrasil. Listed public peers are free and run by volunteers. You can run your own peers if you like; they can be public and unlisted, public and listed (just submit a PR to get it listed), or private (accepting connections only from certain nodes keys). A peer in this case is just a node with a known clearnet IP address. Yggdrasil encourages use in other projects. For instance, NNCP integrates a Yggdrasil node for easy communication with other NNCP nodes.

Yggdrasil conclusions Yggdrasil is tops in reliability (having no single point of failure) and flexibility. It will maintain opportunistic connections between peers even if the Internet is down. The unique added feature of being able to be part of a global mesh is a nice one. The tradeoffs include being more prone to need to use a broker in restrictive CGNAT environments. Some other tools have clients that override the OS DNS resolver to also provide resolution of hostnames of member nodes; Yggdrasil doesn t, though you can certainly run your own DNS infrastructure over Yggdrasil (or, for that matter, let public DNS servers provide Yggdrasil answers if you wish). There is also a need to pay more attention to firewalling or maintaining separation from the public mesh. However, as I explain below, many other options have potential impacts if the control plane, or your account for it, are compromised, meaning you ought to firewall those, too. Still, it may be a more immediate concern with Yggdrasil. Although Yggdrasil is listed as experimental, I have been using it for over a year and have found it to be rock-solid. They did change how mesh IPs were calculated when moving from 0.3 to 0.4, causing a global renumbering, so just be aware that this is a possibility while it is experimental.

tinc tinc is the oldest tool on this list; version 1.0 came out in 2003! You can think of tinc as something akin to an older Yggdrasil without the public option. I will be discussing tinc 1.0.36, the latest stable version, which came out in 2019. The development branch, 1.1, has been going since 2011 and had its latest release in 2021. The last commit to the Github repo was in June 2022. Tinc is the only tool here to support both tun and tap style interfaces. I go into the difference more in the Zerotier review below. Tinc actually provides a better tap implementation than Zerotier, with various sane options for broadcasts, but I still think the call for an Ethernet, as opposed to IP, VPN is small. To configure tinc, you generate a per-host configuration and then distribute it to every tinc node. It contains a host s public key. Therefore, adding a host to the mesh means distributing its key everywhere; de-authorizing it means removing its key everywhere. This makes it rather unwieldy. tinc can do LAN broadcast discovery and mesh routing, but generally speaking you must manually teach it where to connect initially. Somewhat confusingly, the examples all mention listing a public address for a node. This doesn t make sense for a laptop, and I suspect you d just omit it. I think that address is used for something akin to a Yggdrasil peer with a clearnet IP. Unlike all of the other tools described here, tinc has no tool to inspect the running state of the mesh. Some of the properties of tinc made it clear I was unlikely to adopt it, so this review wasn t as thorough as that of Yggdrasil.

tinc: Security and Privacy As mentioned above, every host in the tinc mesh is authenticated based on its public key. However, to be more precise, this key is validated only at the point it connects to its next hop peer. (To be sure, this is also the same as how the list of allowed pubkeys works in Yggdrasil.) Since IPs in tinc are not derived from their key, and any host can assign itself whatever mesh IP it likes, this implies that a compromised host could impersonate another. It is unclear whether packets are end-to-end encrypted when using a tinc node as a router. The fact that they can be routed at the kernel level by the tun interface implies that they may not be.

tinc: Connectivity and NAT traversal I was unable to find much information about NAT traversal in tinc, other than that it does support it. tinc can run over UDP or TCP and auto-detects which to use, preferring UDP.

tinc: Sharing with friends tinc has no special support for this, and the difficulty of configuration makes it unlikely you d do this with tinc.

tinc: Source code, pricing, and portability tinc is fully open source (GPLv2). It is written in C and generally portable. It supports some very old operating systems. Mobile support is iffy. tinc does not seem to be very actively maintained.

tinc conclusions I haven t mentioned performance in my other reviews (see the section at the end of this post). But, it is so poor as to only run about 300Mbps on my 2.5Gbps network. That s 1/3 the speed of Yggdrasil or Tailscale. Combine that with the unwieldiness of adding hosts and some uncertainties in security, and I m not going to be using tinc.

Automatic Peer-to-Peer Mesh VPNs with centralized control These tend to be the options that are frequently discussed. Let s talk about the options.

Tailscale Tailscale is a popular choice in this type of VPN. To use Tailscale, you first sign up on Then, you install the tailscale client on each machine. On first run, it prints a URL for you to click on to authorize the client to your mesh ( tailnet ). Tailscale assigns a mesh IP to each system. The Tailscale client lets the Tailscale control plane gather IP information about each node, including all detectable public and private clearnet IPs. When you attempt to contact a node via Tailscale, the client will fetch the known contact information from the control plane and attempt to establish a link. If it can contact over the local LAN, it will (it doesn t have broadcast autodetection like Yggdrasil; the information must come from the control plane). Otherwise, it will try various NAT traversal options. If all else fails, it will use a broker to relay traffic; Tailscale calls a broker a DERP relay server. Unlike Yggdrasil, a Tailscale node never relays traffic for another; all connections are either direct P2P or via a broker. Tailscale, like several others, is based around Wireguard; though wireguard-go rather than the in-kernel Wireguard. Tailscale has a number of somewhat unique features in this space:
  • Funnel, which lets you expose ports on your system to the public Internet via the VPN.
  • Exit nodes, which automate the process of routing your public Internet traffic over some other node in the network. This is possible with every tool mentioned here, but Tailscale makes switching it on or off a couple of quick commands away.
  • Node sharing, which lets you share a subset of your network with guests
  • A fantastic set of documentation, easily the best of the bunch.
Funnel, in particular, is interesting. With a couple of tailscale serve -style commands, you can expose a directory tree (or a development webserver) to the world. Tailscale gives you a public hostname, obtains a cert for it, and proxies inbound traffic to you. This is subject to some unspecified bandwidth limits, and you can only choose from three public ports, so it s not really a production solution but as a quick and easy way to demonstrate something cool to a friend, it s a neat feature.

Tailscale: Security and Privacy With Tailscale, as with the other tools in this category, one of the main threats to consider is the control plane. What are the consequences of a compromise of Tailscale s control plane, or of the credentials you use to access it? Let s begin with the credentials used to access it. Tailscale operates no identity system itself, instead relying on third parties. For individuals, this means Google, Github, or Microsoft accounts; Okta and other SAML and similar identity providers are also supported, but this runs into complexity and expense that most individuals aren t wanting to take on. Unfortunately, all three of those types of accounts often have saved auth tokens in a browser. Personally I would rather have a separate, very secure, login. If a person does compromise your account or the Tailscale servers themselves, they can t directly eavesdrop on your traffic because it is end-to-end encrypted. However, assuming an attacker obtains access to your account, they could:
  • Tamper with your Tailscale ACLs, permitting new actions
  • Add new nodes to the network
  • Forcibly remove nodes from the network
  • Enable or disable optional features
Of note is that they cannot just commandeer an existing IP. I would say the riskiest possibility here is that could add new nodes to the mesh. Because they could also tamper with your ACLs, they could then proceed to attempt to access all your internal services. They could even turn on service collection and have Tailscale tell them what and where all the services are. Therefore, as with other tools, I recommend a local firewall on each machine with Tailscale. More on that below. Tailscale has a new alpha feature called tailnet lock which helps with this problem. It requires existing nodes in the mesh to sign a request for a new node to join. Although this doesn t address ACL tampering and some of the other things, it does represent a significant help with the most significant concern. However, tailnet lock is in alpha, only available on the Enterprise plan, and has a waitlist, so I have been unable to test it. Any Tailscale node can request the IP addresses belonging to any other Tailscale node. The Tailscale control plane captures, and exposes to you, this information about every node in your network: the OS hostname, IP addresses and port numbers, operating system, creation date, last seen timestamp, and NAT traversal parameters. You can optionally enable service data capture as well, which sends data about open ports on each node to the control plane. Tailscale likes to highlight their key expiry and rotation feature. By default, all keys expire after 180 days, and traffic to and from the expired node will be interrupted until they are renewed (basically, you re-login with your provider and do a renew operation). Unfortunately, the only mention I can see of warning of impeding expiration is in the Windows client, and even there you need to edit a registry key to get the warning more than the default 24 hours in advance. In short, it seems likely to cut off communications when it s most important. You can disable key expiry on a per-node basis in the admin console web interface, and I mostly do, due to not wanting to lose connectivity at an inopportune time.

Tailscale: Connectivity and NAT traversal When thinking about reliability, the primary consideration here is being able to reach the Tailscale control plane. While it is possible in limited circumstances to reach nodes without the Tailscale control plane, it is a fairly brittle setup and notably will not survive a client restart. So if you use Tailscale to reach other nodes on your LAN, that won t work unless your Internet is up and the control plane is reachable. Assuming your Internet is up and Tailscale s infrastructure is up, there is little to be concerned with. Your own comfort level with cloud providers and your Internet should guide you here. Tailscale wrote a fantastic article about NAT traversal and they, predictably, do very well with it. Tailscale prefers UDP but falls back to TCP if needed. Broker (DERP) servers step in as a last resort, and Tailscale clients automatically select the best ones. I m not aware of anything that is more successful with NAT traversal than Tailscale. This maximizes the situations in which a direct P2P connection can be used without a broker. I have found Tailscale to be a bit slow to notice changes in network topography compared to Yggdrasil, and sometimes needs a kick in the form of restarting the client process to re-establish communications after a network change. However, it s possible (maybe even probable) that if I d waited a bit longer, it would have sorted this all out.

Tailscale: Sharing with friends I touched on the funnel feature earlier. The sharing feature lets you give an invite to an outsider. By default, a person accepting a share can make only outgoing connections to the network they re invited to, and cannot receive incoming connections from that network this makes sense. When sharing an exit node, you get a checkbox that lets you share access to the exit node as well. Of course, the person accepting the share needs to install the Tailnet client. The combination of funnel and sharing make Tailscale the best for ad-hoc sharing.

Tailscale: DNS Tailscale s DNS is called MagicDNS. It runs as a layer atop your standard DNS taking over /etc/resolv.conf on Linux and provides resolution of mesh hostnames and some other features. This is a concept that is pretty slick. It also is a bit flaky on Linux; dueling programs want to write to /etc/resolv.conf. I can t really say this is entirely Tailscale s fault; they document the problem and some workarounds. I would love to be able to add custom records to this service; for instance, to override the public IP for a service to use the in-mesh IP. Unfortunately, that s not yet possible. However, MagicDNS can query existing nameservers for certain domains in a split DNS setup.

Tailscale: Source code, pricing, and portability Tailscale is almost fully open source and the client is highly portable. The client is open source (BSD 3-clause) on open source platforms, and closed source on closed source platforms. The DERP servers are open source. The coordination server is closed source, although there is an open source coordination server called Headscale (also BSD 3-clause) made available with Tailscale s blessing and informal support. It supports most, but not all, features in the Tailscale coordination server. Tailscale s pricing (which does not apply when using Headscale) provides a free plan for 1 user with up to 20 devices. A Personal Pro plan expands that to 100 devices for $48 per year - not a bad deal at $4/mo. A Community on Github plan also exists, and then there are more business-oriented plans as well. See the pricing page for details. As a small note, I appreciated Tailscale s install script. It properly added Tailscale s apt key in a way that it can only be used to authenticate the Tailscale repo, rather than as a systemwide authenticator. This is a nice touch and speaks well of their developers.

Tailscale conclusions Tailscale is tops in sharing and has a broad feature set and excellent documentation. Like other solutions with a centralized control plane, device communications can stop working if the control plane is unreachable, and the threat model of the control plane should be carefully considered.

Zerotier Zerotier is a close competitor to Tailscale, and is similar to it in a lot of ways. So rather than duplicate all of the Tailscale information here, I m mainly going to describe how it differs from Tailscale. The primary difference between the two is that Zerotier emulates an Ethernet network via a Linux tap interface, while Tailscale emulates a TCP/IP network via a Linux tun interface. However, Zerotier has a number of things that make it be a somewhat imperfect Ethernet emulator. For one, it has a problem with broadcast amplification; the machine sending the broadcast sends it to all the other nodes that should receive it (up to a set maximum). I wouldn t want to have a lot of programs broadcasting on a slow link. While in theory this could let you run Netware or DECNet across Zerotier, I m not really convinced there s much call for that these days, and Zerotier is clearly IP-focused as it allocates IP addresses and such anyhow. Zerotier provides special support for emulated ARP (IPv4) and NDP (IPv6). While you could theoretically run Zerotier as a bridge, this eliminates the zero trust principle, and Tailscale supports subnet routers, which provide much of the same feature set anyhow. A somewhat obscure feature, but possibly useful, is Zerotier s built-in support for multipath WAN for the public interface. This actually lets you do a somewhat basic kind of channel bonding for WAN.

Zerotier: Security and Privacy The picture here is similar to Tailscale, with the difference that you can create a Zerotier-local account rather than relying on cloud authentication. I was unable to find as much detail about Zerotier as I could about Tailscale - notably I couldn t find anything about how sticky an IP address is. However, the configuration screen lets me delete a node and assign additional arbitrary IPs within a subnet to other nodes, so I think the assumption here is that if your Zerotier account (or the Zerotier control plane) is compromised, an attacker could remove a legit device, add a malicious one, and assign the previous IP of the legit device to the malicious one. I m not sure how to mitigate against that risk, as firewalling specific IPs is ineffective if an attacker can simply take them over. Zerotier also lacks anything akin to Tailnet Lock. For this reason, I didn t proceed much further in my Zerotier evaluation.

Zerotier: Connectivity and NAT traversal Like Tailscale, Zerotier has NAT traversal with STUN. However, it looks like it s more limited than Tailscale s, and in particular is incompatible with double NAT that is often seen these days. Zerotier operates brokers ( root servers ) that can do relaying, including TCP relaying. So you should be able to connect even from hostile networks, but you are less likely to form a P2P connection than with Tailscale.

Zerotier: Sharing with friends I was unable to find any special features relating to this in the Zerotier documentation. Therefore, it would be at the same level as Yggdrasil: possible, maybe even not too difficult, but without any specific help.

Zerotier: DNS Unlike Tailscale, Zerotier does not support automatically adding DNS entries for your hosts. Therefore, your options are approximately the same as Yggdrasil, though with the added option of pushing configuration pointing to your own non-Zerotier DNS servers to the client.

Zerotier: Source code, pricing, and portability The client ZeroTier One is available on Github under a custom business source license which prevents you from using it in certain settings. This license would preclude it being included in Debian. Their library, libzt, is available under the same license. The pricing page mentions a community edition for self hosting, but the documentation is sparse and it was difficult to understand what its feature set really is. The free plan lets you have 1 user with up to 25 devices. Paid plans are also available.

Zerotier conclusions Frankly I don t see much reason to use Zerotier. The virtual Ethernet model seems to be a weird hybrid that doesn t bring much value. I m concerned about the implications of a compromise of a user account or the control plane, and it lacks a lot of Tailscale features (MagicDNS and sharing). The only thing it may offer in particular is multipath WAN, but that s esoteric enough and also solvable at other layers that it doesn t seem all that compelling to me. Add to that the strange license and, to me anyhow, I don t see much reason to bother with it.

Netmaker Netmaker is one of the projects that is making noise these days. Netmaker is the only one here that is a wrapper around in-kernel Wireguard, which can make a performance difference when talking to peers on a 1Gbps or faster link. Also, unlike other tools, it has an ingress gateway feature that lets people that don t have the Netmaker client, but do have Wireguard, participate in the VPN. I believe I also saw a reference somewhere to nodes as routers as with Yggdrasil, but I m failing to dig it up now. The project is in a bit of an early state; you can sign up for an upcoming closed beta with a SaaS host, but really you are generally pointed to self-hosting using the code in the github repo. There are community and enterprise editions, but it s not clear how to actually choose. The server has a bunch of components: binary, CoreDNS, database, and web server. It also requires elevated privileges on the host, in addition to a container engine. Contrast that to the single binary that some others provide. It looks like releases are frequent, but sometimes break things, and have a somewhat more laborious upgrade processes than most. I don t want to spend a lot of time managing my mesh. So because of the heavy needs of the server, the upgrades being labor-intensive, it taking over iptables and such on the server, I didn t proceed with a more in-depth evaluation of Netmaker. It has a lot of promise, but for me, it doesn t seem to be in a state that will meet my needs yet.

Nebula Nebula is an interesting mesh project that originated within Slack, seems to still be primarily sponsored by Slack, but is also being developed by Defined Networking (though their product looks early right now). Unlike the other tools in this section, Nebula doesn t have a web interface at all. Defined Networking looks likely to provide something of a SaaS service, but for now, you will need to run a broker ( lighthouse ) yourself; perhaps on a $5/mo VPS. Due to the poor firewall traversal properties, I didn t do a full evaluation of Nebula, but it still has a very interesting design.

Nebula: Security and Privacy Since Nebula lacks a traditional control plane, the root of trust in Nebula is a CA (certificate authority). The documentation gives this example of setting it up:
./nebula-cert sign -name "lighthouse1" -ip ""
./nebula-cert sign -name "laptop" -ip "" -groups "laptop,home,ssh"
./nebula-cert sign -name "server1" -ip "" -groups "servers"
./nebula-cert sign -name "host3" -ip ""
So the cert contains your IP, hostname, and group allocation. Each host in the mesh gets your CA certificate, and the per-host cert and key generated from each of these steps. This leads to a really nice security model. Your CA is the gatekeeper to what is trusted in your mesh. You can even have it airgapped or something to make it exceptionally difficult to breach the perimeter. Nebula contains an integrated firewall. Because the ability to keep out unwanted nodes is so strong, I would say this may be the one mesh VPN you might consider using without bothering with an additional on-host firewall. You can define static mappings from a Nebula mesh IP to a clearnet IP. I haven t found information on this, but theoretically if NAT traversal isn t required, these static mappings may allow Nebula nodes to reach each other even if Internet is down. I don t know if this is truly the case, however.

Nebula: Connectivity and NAT traversal This is a weak point of Nebula. Nebula sends all traffic over a single UDP port; there is no provision for using TCP. This is an issue at certain hotel and other public networks which open only TCP egress ports 80 and 443. I couldn t find a lot of detail on what Nebula s NAT traversal is capable of, but according to a certain Github issue, this has been a sore spot for years and isn t as capable as Tailscale. You can designate nodes in Nebula as brokers (relays). The concept is the same as Yggdrasil, but it s less versatile. You have to manually designate what relay to use. It s unclear to me what happens if different nodes designate different relays. Keep in mind that this always happens over a UDP port.

Nebula: Sharing with friends There is no particular support here.

Nebula: DNS Nebula has experimental DNS support. In contrast with Tailscale, which has an internal DNS server on every node, Nebula only runs a DNS server on a lighthouse. This means that it can t forward requests to a DNS server that s upstream for your laptop s particular current location. Actually, Nebula s DNS server doesn t forward at all. It also doesn t resolve its own name. The Nebula documentation makes reference to using multiple lighthouses, which you may want to do for DNS redundancy or performance, but it s unclear to me if this would make each lighthouse form a complete picture of the network.

Nebula: Source code, pricing, and portability Nebula is fully open source (MIT). It consists of a single Go binary and configuration. It is fairly portable.

Nebula conclusions I am attracted to Nebula s unique security model. I would probably be more seriously considering it if not for the lack of support for TCP and poor general NAT traversal properties. Its datacenter connectivity heritage does show through.

Roll your own and hybrid Here is a grab bag of ideas:

Running Yggdrasil over Tailscale One possibility would be to use Tailscale for its superior NAT traversal, then allow Yggdrasil to run over it. (You will need a firewall to prevent Tailscale from trying to run over Yggdrasil at the same time!) This creates a closed network with all the benefits of Yggdrasil, yet getting the NAT traversal from Tailscale. Drawbacks might be the overhead of the double encryption and double encapsulation. A good Yggdrasil peer may wind up being faster than this anyhow.

Public VPN provider for NAT traversal A public VPN provider such as Mullvad will often offer incoming port forwarding and nodes in many cities. This could be an attractive way to solve a bunch of NAT traversal problems: just use one of those services to get you an incoming port, and run whatever you like over that. Be aware that a number of public VPN clients have a kill switch to prevent any traffic from egressing without using the VPN; see, for instance, Mullvad s. You ll need to disable this if you are running a mesh atop it.


Combining with local firewalls For most of these tools, I recommend using a local firewal in conjunction with them. I have been using firehol and find it to be quite nice. This means you don t have to trust the mesh, the control plane, or whatever. The catch is that you do need your mesh VPN to provide strong association between IP address and node. Most, but not all, do.

Performance I tested some of these for performance using iperf3 on a 2.5Gbps LAN. Here are the results. All speeds are in Mbps.
Tool iperf3 (default) iperf3 -P 10 iperf3 -R
Direct (no VPN) 2406 2406 2764
Wireguard (kernel) 1515 1566 2027
Yggdrasil 892 1126 1105
Tailscale 950 1034 1085
Tinc 296 300 277
You can see that Wireguard was significantly faster than the other options. Tailscale and Yggdrasil were roughly comparable, and Tinc was terrible.

IP collisions When you are communicating over a network such as these, you need to trust that the IP address you are communicating with belongs to the system you think it does. This protects against two malicious actor scenarios:
  1. Someone compromises one machine on your mesh and reconfigures it to impersonate a more important one
  2. Someone connects an unauthorized system to the mesh, taking over a trusted IP, and uses the privileges of the trusted IP to access resources
To summarize the state of play as highlighted in the reviews above:
  • Yggdrasil derives IPv6 addresses from a public key
  • tinc allows any node to set any IP
  • Tailscale IPs aren t user-assignable, but the assignment algorithm is unknown
  • Zerotier allows any IP to be allocated to any node at the control plane
  • I don t know what Netmaker does
  • Nebula IPs are baked into the cert and signed by the CA, but I haven t verified the enforcement algorithm
So this discussion really only applies to Yggdrasil and Tailscale. tinc and Zerotier lack detailed IP security, while Nebula expects IP allocations to be handled outside of the tool and baked into the certs (therefore enforcing rigidity at that level). So the question for Yggdrasil and Tailscale is: how easy is it to commandeer a trusted IP? Yggdrasil has a brief discussion of this. In short, Yggdrasil offers you both a dedicated IP and a rarely-used /64 prefix which you can delegate to other machines on your LAN. Obviously by taking the dedicated IP, a lot more bits are available for the hash of the node s public key, making collisions technically impractical, if not outright impossible. However, if you use the /64 prefix, a collision may be more possible. Yggdrasil s hashing algorithm includes some optimizations to make this more difficult. Yggdrasil includes a genkeys tool that uses more CPU cycles to generate keys that are maximally difficult to collide with. Tailscale doesn t document their IP assignment algorithm, but I think it is safe to say that the larger subnet you use, the better. If you try to use a /24 for your mesh, it is certainly conceivable that an attacker could remove your trusted node, then just manually add the 240 or so machines it would take to get that IP reassigned. It might be a good idea to use a purely IPv6 mesh with Tailscale to minimize this problem as well. So, I think the risk is low in the default configurations of both Yggdrasil and Tailscale (certainly lower than with tinc or Zerotier). You can drive the risk even lower with both.

Final thoughts For my own purposes, I suspect I will remain with Yggdrasil in some fashion. Maybe I will just take the small performance hit that using a relay node implies. Or perhaps I will get clever and use an incoming VPN port forward or go over Tailscale. Tailscale was the other option that seemed most interesting. However, living in a region with Internet that goes down more often than I d like, I would like to just be able to send as much traffic over a mesh as possible, trusting that if the LAN is up, the mesh is up. I have one thing that really benefits from performance in excess of Yggdrasil or Tailscale: NFS. That s between two machines that never leave my LAN, so I will probably just set up a direct Wireguard link between them. Heck of a lot easier than trying to do Kerberos! Finally, I wrote this intending to be useful. I dealt with a lot of complexity and under-documentation, so it s possible I got something wrong somewhere. Please let me know if you find any errors.
This blog post is a copy of a page on my website. That page may be periodically updated.

1 April 2023

Paul Wise: FLOSS Activities March 2023

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.




  • Debian QA services: disabled updating jessie as it was removed
  • Debian IRC: rescued #debian-s390x from inactive person
  • Debian servers: repair a /etc git repo
  • Debian wiki: unblock IP addresses, approve accounts

  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The gensim, sptag, purple-discord, harmony work was sponsored. All other work was done on a volunteer basis.

28 March 2023

Shirish Agarwal: Three Ancient Civilizations, Tesla, IMU and other things.

Three Ancient Civilizations I have been reading books for a long time but somehow I don t know how or when I realized that there are three medieval civilizations that time and again seem to fascinate the authors, either American or European. The three civilizations that do get mentioned every now and then are Egyptian, Greece and Roman and I have no clue as to why. Another two civilizations closely follow them, Mesopotian and Sumerian Civilizations. Why most of the authors irrespective of genre are mystified by these 5 civilizations is beyond me but they also conjure lot of imagination. Now if I was on the right side of 40, maybe 22-25 onwards and had the means or the opportunity or both, I would have gone and tried to learn as much as I can about these various civilizations. There is still a lot of enigma attached and it seems that the official explanations of how these various civilizations ended seems too good to be true or whatever. One of the more interesting points has been how Greece mythology were subverted and flipped to make Roman mythology. Apparently, many Greek gods have a Roman aspect and the qualities are opposite to the Greek aspect. I haven t learned the reason why it is so, yet. Apart from the Iziko Natural Museum in South Africa which did have its share of wonders (the whole South African experience was surreal) nowhere I have witnessed stuff from the above other than in Africa. The whole thing seemed just surreal. I could go on but as I don t know what is real and what is mythology when it comes to various cultures including whatever little I experienced in Africa.

Recent History Having said the above, I also find that many Indians somehow do not either know or are not interested in understanding recent history. I am talking of the time between 1800s and 1950 s. British apologist Mr. William Dalrymple in his book Anarchy has shared how the East Indian Company looted India and gave less than half back to the British. So where did the rest of the money go ? It basically went to the tax free havens around UK. That tale is very much similar as to how the Axis gold was used to have an entity called Switzerland from scratch. There have been books as well as couple of miniseries or two that document that fact. But for me this is not the real story, this is more of a side dish per-se. The real story perhaps is how the EU peace project was born. Because of Hitler s rise and the way World War 2 happened, the Allies knew that they couldn t subjugate Germany through another humiliation as they had after World War 1, who knows another Hitler might come. So while they did fine the Germans, they also helped them via the Marshall plan. Germany also realized that it had been warring with other European countries for over a thousand years as well as quite a few other countries. The first sort of treaty immediately in that regard was the Western Union Alliance . While on the face of it, it was a defense treaty between sovereign powers. Interestingly, as can be seen UK at that point in time wanted more countries to be part of the Treaty of Dunkirk. This was quickly followed 2 years later by the Treaty of London which made way for the Council of Europe to be formed. The reason I am sharing is because a lot of Indians whom I meet on SM do not know that UK was instrumental front and center for the formation of European Union (EU). Also they perhaps don t realize that after World War 2, UK was greatly diminished both financially and militarily. That is the reason it gave back its erstwhile colonies including India and other countries. If UK hadn t become a part of EU they would have been called the poor man of Europe as they had been called few hundred years ago. In fact, after Brexit, UK has been the only nation that has fallen on the dire times as much as it has. They have practically no food, supermarkets running out of veggies and whatnot. electricity sky-high as they don t want to curtail profits of the gas companies which in turn donate money to Tory coffers. Sadly, most of my brethen do not know that hence I have had to share it. The EU became a power in its own right due to Gravity model of trade. The U.S. is attempting the same thing and calls it near off-shoring.

Tesla, Investor Day Presentation, Toyota and Free Software The above brings it nicely about the Tesla Investor Day that happened couple of weeks back. The biggest news though was broken just 24 hours earlier when the governor of Monterrey] shared how Giga New Mexico would be happening and shared some site photographs. There have been bits of news on that off-and-on since then. Toyota meanwhile has been putting lot of anti-EV posters and whatnot. In fact, India seems to be mirroring Japan in a lot of ways, the only difference is Japan is still superior economically than India. I do sincerely hope that at least Japan can get out of its lost decades. I have asked privately if any of the Japanese translators would be willing to translate the anti-EV poster so we may all know.
Anti-EV poster by Toyota

Urinal outside UK Embassy About a week back few people chanting Khalistan entered the Indian Embassy and showed the flags. This was in the UK. Now in a retaliatory move against a friendly nation, we want to make a Urinal. after making a security downgrade for the Embassy. The pettiness being shown by Indian Govt. especially when it has very few friends. There are also plans to do the same for the U.S. and other embassies as well. I do not know when we will get common sense.

Holi Just a few weeks back, Holi happened. It used to be a sweet and innocent festival. But from the last few years, I have been hearing and seeing sexual harassment on the rise. In fact, saw quite a few lewd posts written to women on Holi or verge of Holi and also videos of the same. One of the most shameful incidents occurred with a Japanese tourist. She was not only sexually molested but also terrorized that if she were to report then she could be raped and murdered. She promptly left Delhi. Once it was reported in mass media, Delhi Police tried to show it was doing something. FWIW, Delhi s crime against women stats have been at record high and conviction at record low.

Ecommerce Rules being changed, logistics gonna be tough. Just today there have been changes in Ecommerce rules (again) and this is gonna be a pain for almost all companies big and small with the exception of Adani and Ambani. Almost all players including the Tatas have called them out. Of course, all such laws have been passed without debate. In such a scenario, small startups like these cannot hope to grow their business.

Inertial Measurement Unit or Full Body Tracking with cheap hardware. Apparently, a whole host of companies are looking at 3-D tracking using cheap hardware apparently known as IMU sensors. Sooner than later cheap 3-D glasses and IMU sensors should explode the 3-D market worldwide. There is a huge potential and upside to it and will probably overtake smartphones as well. But then the danger will be of our thoughts, ideas, nightmares etc. to be shared without our consent. The more we tap into a virtual world, what stops anybody from tapping into our brain and practically stealing our identity in more than one way. I dunno if there are any Debian people or FSF projects working on the above. Even Laws can only do so much, until and unless there are alternative places and ways it would be difficult to say the least

9 February 2023

Jonathan McDowell: Building a read-only Debian root setup: Part 2

This is the second part of how I build a read-only root setup for my router. You might want to read part 1 first, which covers the initial boot and general overview of how I tie the pieces together. This post will describe how I build the squashfs image that forms the main filesystem. Most of the build is driven from a script, make-router, which I ll dissect below. It s highly tailored to my needs, and this is a fairly lengthy post, but hopefully the steps I describe prove useful to anyone trying to do something similar.
Breakdown of make-router
# Either rb3011 (arm) or rb5009 (arm64)
if [ "x$ HOSTNAME " == "xrb3011" ]; then
elif [ "x$ HOSTNAME " == "xrb5009" ]; then
	echo "Unknown host: $ HOSTNAME "
	exit 1

It s a bash script, and I allow building for either my RB3011 or RB5009, which means a different architecture (32 vs 64 bit). I run this script on my Pi 4 which means I don t have to mess about with QemuUserEmulation.
BASE_DIR=$(dirname $0)
IMAGE_FILE=$(mktemp --tmpdir router.$ ARCH .XXXXXXXXXX.img)
MOUNT_POINT=$(mktemp -p /mnt -d router.$ ARCH .XXXXXXXXXX)
# Build and mount an ext4 image file to put the root file system in
dd if=/dev/zero bs=1 count=0 seek=1G of=$ IMAGE_FILE 
mkfs -t ext4 $ IMAGE_FILE 
mount -o loop $ IMAGE_FILE  $ MOUNT_POINT 

I build the image in a loopback ext4 file on tmpfs (my Pi4 is the 8G model), which makes things a bit faster.
# Add dpkg excludes
mkdir -p $ MOUNT_POINT /etc/dpkg/dpkg.cfg.d/
cat <<EOF > $ MOUNT_POINT /etc/dpkg/dpkg.cfg.d/path-excludes
# Exclude docs
# Only locale we want is English
# No man pages

Create a dpkg excludes config to drop docs, man pages and most locales before we even start the bootstrap.
# Setup fstab + mtab
echo "# Empty fstab as root is pre-mounted" > $ MOUNT_POINT /etc/fstab
ln -s ../proc/self/mounts $ MOUNT_POINT /etc/mtab
# Setup hostname
echo $ HOSTNAME  > $ MOUNT_POINT /etc/hostname
# Add the root SSH keys
mkdir -p $ MOUNT_POINT /root/.ssh/
cat <<EOF > $ MOUNT_POINT /root/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAv8NkUeVdsVdegS+JT9qwFwiHEgcC9sBwnv6RjpH6I4d3im4LOaPOatzneMTZlH8Gird+H4nzluciBr63hxmcFjZVW7dl6mxlNX2t/wKvV0loxtEmHMoI7VMCnrWD0PyvwJ8qqNu9cANoYriZRhRCsBi27qPNvI741zEpXN8QQs7D3sfe4GSft9yQplfJkSldN+2qJHvd0AHKxRdD+XTxv1Ot26+ZoF3MJ9MqtK+FS+fD9/ESLxMlOpHD7ltvCRol3u7YoaUo2HJ+u31l0uwPZTqkPNS9fkmeCYEE0oXlwvUTLIbMnLbc7NKiLgniG8XaT0RYHtOnoc2l2UnTvH5qsQ==
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDQb9+qFemcwKhey3+eTh5lxp+3sgZXW2HQQEZMt9hPvVXk+MiiNMx9WUzxPJnwXqlmmVdKsq+AvjA0i505Pp8fIj5DdUBpSqpLghmzpnGuob7SSwXYj+352hjD52UC4S0KMKbIaUpklADgsCbtzhYYc4WoO8F7kK63tS5qa1XSZwwRwPbYOWBcNocfr9oXCVWD9ismO8Y0l75G6EyW8UmwYAohDaV83pvJxQerYyYXBGZGY8FNjqVoOGMRBTUcLj/QTo0CDQvMtsEoWeCd0xKLZ3gjiH3UrknkaPra557/TWymQ8Oh15aPFTr5FvKgAlmZaaM0tP71SOGmx7GpCsP4jZD1Xj/7QMTAkLXb+Ou6yUOVM9J4qebdnmF2RGbf1bwo7xSIX6gAYaYgdnppuxqZX1wyAy+A2Hie4tUjMHKJ6OoFwBsV1sl+3FobrPn6IuulRCzsq2aLqLey+PHxuNAYdSKo7nIDB3qCCPwHlDK52WooSuuMidX4ujTUw7LDTia9FxAawudblxbrvfTbg3DsiDBAOAIdBV37HOAKu3VmvYSPyqT80DEy8KFmUpCEau59DID9VERkG6PWPVMiQnqgW2Agn1miOBZeIQV8PFjenAySxjzrNfb4VY/i/kK9nIhXn92CAu4nl6D+VUlw+IpQ8PZlWlvVxAtLonpjxr9OTw== noodles@yubikey
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC0I8UHj4IpfqUcGE4cTvLB0d2xmATSUzqtxW6ZhGbZxvQDKJesVW6HunrJ4NFTQuQJYgOXY/o82qBpkEKqaJMEFHTCjcaj3M6DIaxpiRfQfs0nhtzDB6zPiZn9Suxb0s5Qr4sTWd6iI9da72z3hp9QHNAu4vpa4MSNE+al3UfUisUf4l8TaBYKwQcduCE0z2n2FTi3QzmlkOgH4MgyqBBEaqx1tq7Zcln0P0TYZXFtrxVyoqBBIoIEqYxmFIQP887W50wQka95dBGqjtV+d8IbrQ4pB55qTxMd91L+F8n8A6nhQe7DckjS0Xdla52b9RXNXoobhtvx9K2prisagsHT noodles@cup
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBK6iGog3WbNhrmrkglNjVO8/B6m7mN6q1tMm1sXjLxQa+F86ETTLiXNeFQVKCHYrk8f7hK0d2uxwgj6Ixy9k0Cw= noodles@sevai

Setup fstab, the hostname and SSH keys for root.
# Bootstrap our install
debootstrap \
	--arch=$ ARCH  \
	--include=collectd-core,conntrack,dnsmasq,ethtool,iperf3,kexec-tools,mosquitto,mtd-utils,mtr-tiny,ppp,tcpdump,rng-tools5,ssh,watchdog,wget \
	--exclude=dmidecode,isc-dhcp-client,isc-dhcp-common,makedev,nano \
	bullseye $ MOUNT_POINT

Actually do the debootstrap step, including a bunch of extra packages that we want.
# Install mqtt-arp
cp $ BASE_DIR /debs/mqtt-arp_1_$ ARCH .deb $ MOUNT_POINT /tmp
chroot $ MOUNT_POINT  dpkg -i /tmp/mqtt-arp_1_$ ARCH .deb
rm $ MOUNT_POINT /tmp/mqtt-arp_1_$ ARCH .deb
# Frob the mqtt-arp config so it starts after mosquitto
sed -i -e 's/After=.*/After=mosquitto.service/' $ MOUNT_POINT /lib/systemd/system/mqtt-arp.service

I haven t uploaded mqtt-arp to Debian, so I install a locally built package, and ensure it starts after mosquitto (the MQTT broker), given they re running on the same host.
# Frob watchdog so it starts earlier than multi-user
sed -i -e 's/After=.*/' $ MOUNT_POINT /lib/systemd/system/watchdog.service
# Make sure the watchdog is poking the device file
sed -i -e 's/^#watchdog-device/watchdog-device/' $ MOUNT_POINT /etc/watchdog.conf

watchdog timeouts were particularly an issue on the RB3011, where the default timeout didn t give enough time to reach multiuser mode before it would reset the router. Not helpful, so alter the config to start it earlier (and make sure it s configured to actually kick the device file).
# Clean up docs + locales
rm -r $ MOUNT_POINT /usr/share/doc/*
rm -r $ MOUNT_POINT /usr/share/man/*
for dir in $ MOUNT_POINT /usr/share/locale/*/; do
	if [ "$ dir " != "$ MOUNT_POINT /usr/share/locale/en/" ]; then
		rm -r $ dir 

Clean up any docs etc that ended up installed.
# Set root password to root
echo "root:root"   chroot $ MOUNT_POINT  chpasswd

The only login method is ssh key to the root account though I suppose this allows for someone to execute a privilege escalation from a daemon user so I should probably randomise this. Does need to be known though so it s possible to login via the serial console for debugging.
# Add security to sources.list + update
echo "deb bullseye-security main" >> $ MOUNT_POINT /etc/apt/sources.list
chroot $ MOUNT_POINT  apt update
chroot $ MOUNT_POINT  apt -y full-upgrade
chroot $ MOUNT_POINT  apt clean
# Cleanup the APT lists
rm $ MOUNT_POINT /var/lib/apt/lists/www.*
rm $ MOUNT_POINT /var/lib/apt/lists/security.*

Pull in any security updates, then clean out the APT lists rather than polluting the image with them.
# Disable the daily APT timer
rm $ MOUNT_POINT /etc/systemd/system/
# Disable daily dpkg backup
cat <<EOF > $ MOUNT_POINT /etc/cron.daily/dpkg
# Don't do the daily dpkg backup
exit 0
# We don't want a persistent systemd journal
rmdir $ MOUNT_POINT /var/log/journal

None of these make sense on a router.
# Enable nftables
ln -s /lib/systemd/system/nftables.service \
	$ MOUNT_POINT /etc/systemd/system/

Ensure we have firewalling enabled automatically.
# Add systemd-coredump + systemd-timesync user / group
echo "systemd-timesync:x:998:" >> $ MOUNT_POINT /etc/group
echo "systemd-coredump:x:999:" >> $ MOUNT_POINT /etc/group
echo "systemd-timesync:!*::" >> $ MOUNT_POINT /etc/gshadow
echo "systemd-coredump:!*::" >> $ MOUNT_POINT /etc/gshadow
echo "systemd-timesync:x:998:998:systemd Time Synchronization:/:/usr/sbin/nologin" >> $ MOUNT_POINT /etc/passwd
echo "systemd-coredump:x:999:999:systemd Core Dumper:/:/usr/sbin/nologin" >> $ MOUNT_POINT /etc/passwd
echo "systemd-timesync:!*:47358::::::" >> $ MOUNT_POINT /etc/shadow
echo "systemd-coredump:!*:47358::::::" >> $ MOUNT_POINT /etc/shadow
# Create /etc/.pwd.lock, otherwise it'll end up in the overlay
touch $ MOUNT_POINT /etc/.pwd.lock
chmod 600 $ MOUNT_POINT /etc/.pwd.lock

Create a number of users that will otherwise get created at boot, and a lock file that will otherwise get created anyway.
# Copy config files
cp --recursive --preserve=mode,timestamps $ BASE_DIR /etc/* $ MOUNT_POINT /etc/
cp --recursive --preserve=mode,timestamps $ BASE_DIR /etc-$ ARCH /* $ MOUNT_POINT /etc/
chroot $ MOUNT_POINT  chown mosquitto /etc/mosquitto/mosquitto.users
chroot $ MOUNT_POINT  chown mosquitto /etc/ssl/mqtt.home.key

There are config files that are easier to replace wholesale, some of which are specific to the hardware (e.g. related to network interfaces). See below for some more details.
# Build symlinks into flash for boot / modules
ln -s /mnt/flash/lib/modules $ MOUNT_POINT /lib/modules
rmdir $ MOUNT_POINT /boot
ln -s /mnt/flash/boot $ MOUNT_POINT /boot

The kernel + its modules live outside the squashfs image, on the USB flash drive that the image lives on. That makes for easier kernel upgrades.
# Put our git revision into os-release
echo -n "GIT_VERSION=" >> $ MOUNT_POINT /etc/os-release
(cd $ BASE_DIR  ; git describe --tags) >> $ MOUNT_POINT /etc/os-release

Always helpful to be able to check the image itself for what it was built from.
# Add some stuff to root's .bashrc
cat << EOF >> $ MOUNT_POINT /root/.bashrc
alias ls='ls -F --color=auto'
eval "\$(dircolors)"
case "\$TERM" in
xterm* rxvt*)
	PS1="\\[\\e]0;\\u@\\h: \\w\a\\]\$PS1"

Just some niceties for when I do end up logging in.
# Build the squashfs
mksquashfs $ MOUNT_POINT  /tmp/router.$ ARCH .squashfs \
	-comp xz

Actually build the squashfs image.
# Save the installed package list off
chroot $ MOUNT_POINT  dpkg --get-selections > /tmp/wip-installed-packages

Save off the installed package list. This was particularly useful when trying to replicate the existing router setup and making sure I had all the important packages installed. It doesn t really serve a purpose now.
In terms of the config files I copy into /etc, shared across both routers are the following:
Breakdown of shared config
  • apt config (disable recommends, periodic updates):
    • apt/apt.conf.d/10periodic, apt/apt.conf.d/local-recommends
  • Adding a default, empty, locale:
    • default/locale
    • dnsmasq.conf, dnsmasq.d/dhcp-ranges, dnsmasq.d/static-ips
    • hosts, resolv.conf
  • Enabling IP forwarding:
    • sysctl.conf
  • Logs related:
    • logrotate.conf, rsyslog.conf
  • MQTT related:
    • mosquitto/mosquitto.users, mosquitto/conf.d/ssl.conf, mosquitto/conf.d/users.conf, mosquitto/mosquitto.acl, mosquitto/mosquitto.conf
    • mqtt-arp.conf
    • ssl/lets-encrypt-r3.crt, ssl/mqtt.home.key, ssl/mqtt.home.crt
  • PPP configuration:
    • ppp/ip-up.d/0000usepeerdns, ppp/ipv6-up.d/defaultroute, ppp/pap-secrets, ppp/chap-secrets
    • network/interfaces.d/pppoe-wan
The router specific config is mostly related to networking:
Breakdown of router specific config
  • Firewalling:
    • nftables.conf
  • Interfaces:
    • dnsmasq.d/interfaces
    • network/interfaces.d/eth0, network/interfaces.d/p1, network/interfaces.d/p2, network/interfaces.d/p7, network/interfaces.d/p8
  • PPP config (network interface piece):
    • ppp/peers/aquiss
  • SSH keys:
    • ssh/ssh_host_ecdsa_key, ssh/ssh_host_ed25519_key, ssh/ssh_host_rsa_key, ssh/, ssh/, ssh/
  • Monitoring:
    • collectd/collectd.conf, collectd/collectd.conf.d/network.conf

23 January 2023

Matthew Garrett: Build security with the assumption it will be used against your friends

Working in information security means building controls, developing technologies that ensure that sensitive material can only be accessed by people that you trust. It also means categorising people into "trustworthy" and "untrustworthy", and trying to come up with a reasonable way to apply that such that people can do their jobs without all your secrets being available to just anyone in the company who wants to sell them to a competitor. It means ensuring that accounts who you consider to be threats shouldn't be able to do any damage, because if someone compromises an internal account you need to be able to shut them down quickly.

And like pretty much any security control, this can be used for both good and bad. The technologies you develop to monitor users to identify compromised accounts can also be used to compromise legitimate users who management don't like. The infrastructure you build to push updates to users can also be used to push browser extensions that interfere with labour organisation efforts. In many cases there's no technical barrier between something you've developed to flag compromised accounts and the same technology being used to flag users who are unhappy with certain aspects of management.

If you're asked to build technology that lets you make this sort of decision, think about whether that's what you want to be doing. Think about who can compel you to use it in ways other than how it was intended. Consider whether that's something you want on your conscience. And then think about whether you can meet those requirements in a different way. If they can simply compel one junior engineer to alter configuration, that's very different to an implementation that requires sign-offs from multiple senior developers. Make sure that all such policy changes have to be clearly documented, including not just who signed off on it but who asked them to. Build infrastructure that creates a record of who decided to fuck over your coworkers, rather than just blaming whoever committed the config update. The blame trail should never terminate in the person who was told to do something or get fired - the blame trail should clearly indicate who ordered them to do that.

But most importantly: build security features as if they'll be used against you.

comment count unavailable comments

10 January 2023

Matthew Garrett: Integrating Linux with Okta Device Trust

I've written about bearer tokens and how much pain they cause me before, but sadly wishing for a better world doesn't make it happen so I'm making do with what's available. Okta has a feature called Device Trust which allows to you configure access control policies that prevent people obtaining tokens unless they're using a trusted device. This doesn't actually bind the tokens to the hardware in any way, so if a device is compromised or if a user is untrustworthy this doesn't prevent the token ending up on an unmonitored system with no security policies. But it's an incremental improvement, other than the fact that for desktop it's only supported on Windows and MacOS, which really doesn't line up well with my interests.

Obviously there's nothing fundamentally magic about these platforms, so it seemed fairly likely that it would be possible to make this work elsewhere. I spent a while staring at the implementation using Charles Proxy and the Chrome developer tools network tab and had worked out a lot, and then Okta published a paper describing a lot of what I'd just laboriously figured out. But it did also help clear up some points of confusion and clarified some design choices. I'm not going to give a full description of the details (with luck there'll be code shared for that before too long), but here's an outline of how all of this works. Also, to be clear, I'm only going to talk about the desktop support here - mobile is a bunch of related but distinct things that I haven't looked at in detail yet.

Okta's Device Trust (as officially supported) relies on Okta Verify, a local agent. When initially installed, Verify authenticates as the user, obtains a token with a scope that allows it to manage devices, and then registers the user's computer as an additional MFA factor. This involves it generating a JWT that embeds a number of custom claims about the device and its state, including things like the serial number. This JWT is signed with a locally generated (and hardware-backed, using a TPM or Secure Enclave) key, which allows Okta to determine that any future updates from a device claiming the same identity are genuinely from the same device (you could construct an update with a spoofed serial number, but you can't copy the key out of a TPM so you can't sign it appropriately). This is sufficient to get a device registered with Okta, at which point it can be used with Fastpass, Okta's hardware-backed MFA mechanism.

As outlined in the aforementioned deep dive paper, Fastpass is implemented via multiple mechanisms. I'm going to focus on the loopback one, since it's the one that has the strongest security properties. In this mode, Verify listens on one of a list of 10 or so ports on localhost. When you hit the Okta signin widget, choosing Fastpass triggers the widget into hitting each of these ports in turn until it finds one that speaks Fastpass and then submits a challenge to it (along with the URL that's making the request). Verify then constructs a response that includes the challenge and signs it with the hardware-backed key, along with information about whether this was done automatically or whether it included forcing the user to prove their presence. Verify then submits this back to Okta, and if that checks out Okta completes the authentication.

Doing this via loopback from the browser has a bunch of nice properties, primarily around the browser providing information about which site triggered the request. This means the Verify agent can make a decision about whether to submit something there (ie, if a fake login widget requests your creds, the agent will ignore it), and also allows the issued token to be cross-checked against the site that requested it (eg, if requests a token that's valid for, that's a red flag). It's not quite at the same level as a hardware WebAuthn token, but it has many of the anti-phishing properties.

But none of this actually validates the device identity! The entire registration process is up to the client, and clients are in a position to lie. Someone could simply reimplement Verify to lie about, say, a device serial number when registering, and there'd be no proof to the contrary. Thankfully there's another level to this to provide stronger assurances. Okta allows you to provide a CA root[1]. When Okta issues a Fastpass challenge to a device the challenge includes a list of the trusted CAs. If a client has a certificate that chains back to that, it can embed an additional JWT in the auth JWT, this one containing the certificate and signed with the certificate's private key. This binds the CA-issued identity to the Fastpass validation, and causes the device to start appearing as "Managed" in the Okta device management UI. At that point you can configure policy to restrict various apps to managed devices, ensuring that users are only able to get tokens if they're using a device you've previously issued a certificate to.

I've managed to get Linux tooling working with this, though there's still a few drawbacks. The main issue is that the API only allows you to register devices that declare themselves as Windows or MacOS, followed by the login system sniffing browser user agent and only offering Fastpass if you're on one of the officially supported platforms. This can be worked around with an extension that spoofs user agent specifically on the login page, but that's still going to result in devices being logged as a non-Linux OS which makes interpreting the logs more difficult. There's also no ability to choose which bits of device state you log: there's a couple of existing integrations, and otherwise a fixed set of parameters that are reported. It'd be lovely to be able to log arbitrary material and make policy decisions based on that.

This also doesn't help with ChromeOS. There's no real way to automatically launch something that's bound to localhost (you could probably make this work using Crostini but there's no way to launch a Crostini app at login), and access to hardware-backed keys is kind of a complicated topic in ChromeOS for privacy reasons. I haven't tried this yet, but I think using an enterprise force-installed extension and the chrome.enterprise.platformKeys API to obtain a device identity cert and then intercepting requests to the appropriate port range on localhost ought to be enough to do that? But I've literally never written any Javascript so I don't know. Okta supports falling back from the loopback protocol to calling a custom URI scheme, but once you allow that you're also losing a bunch of the phishing protection, so I'd prefer not to take that approach.

Like I said, none of this prevents exfiltration of bearer tokens once they've been issued, and there's still a lot of ecosystem work to do there. But ensuring that tokens can't be issued to unmanaged machines in the first place is still a step forwards, and with luck we'll be able to make use of this on Linux systems without relying on proprietary client-side tooling.

(Time taken to code this implementation: about two days, and under 1000 lines of new code. Time taken to figure out what the fuck to write: rather a lot longer)

[1] There's also support for having Okta issue certificates, but then you're kind of back to the "How do I know this is my device" situation

comment count unavailable comments

8 January 2023

Russ Allbery: Review: Postwar

Review: Postwar, by Tony Judt
Publisher: Penguin Books
Copyright: 2005
ISBN: 1-4406-2476-3
Format: Kindle
Pages: 835
Tony Judt (1948 2010) was a British-American historian and Erich Maria Remarque Professor in European Studies at New York University. Postwar is his magnum opus, a history of Europe from 1945 to 2005. A book described as a history of Europe could be anything from a textbook to a political analysis, so the first useful question to ask is what sort of history. That's a somewhat difficult question to answer. Postwar mentions a great deal of conventional history, including important political movements and changes of government, but despite a stated topic that would suit a survey textbook, it doesn't provide that sort of list of facts and dates. Judt expects the reader to already be familiar with the broad outlines of modern European history. However, Postwar is also not a specialty history and avoids diving too deep into any one area. Trends in art, philosophy, and economics are all mentioned to set a broader context, but still only at the level of a general survey. My best description is that Postwar is a comprehensive social and political history that attempts to focus less on specific events and more on larger trends of thought. Judt grounds his narrative in concrete, factual events, but the emphasis is on how those living in Europe, at each point in history, thought of their society, their politics, and their place in both. Most of the space goes to exploring those nuances of thought and day-to-day life. In the US university context, I'd place this book as an intermediate-level course in modern European history, after the survey course that provides students with a basic framework but before graduate-level specializations in specific topics. If you have not had a solid basic education in European history (and my guess is that most people from the US have not), Judt will provide the necessary signposts, but you should expect to need to look up the signposts you don't recognize. I, as the dubious beneficiary of a US high school history education now many decades in the past, frequently resorted to Wikipedia for additional background. Postwar uses a simple chronological structure in four parts: the immediate post-war years and the beginning of the Cold War (1945 1953), the era of rapidly growing western European prosperity (1953 1971), the years of recession and increased turmoil leading up to the collapse of communism (1971 1989), and the aftermath of the collapse of communism and the rise of the European Union (1989 2005). Each part is divided into four to eight long chapters that trace a particular theme. Judt usually starts with the overview of a theme and then follows the local manifestations of it on a spiral through European countries in whatever order seems appropriate. For the bulk of the book that covers the era of the Cold War, when experiences were drastically different inside or outside the Soviet bloc, he usually separates western and eastern Europe into alternating chapters. Reviewing this sort of book is tricky because so much will depend on how well you already know the topic. My interest in history is strictly amateur and I tend to avoid modern history (usually I find it too depressing), so for me this book was remedial, filling in large knowledge gaps that I ideally shouldn't have had. Postwar was a runner up for the Pulitzer Prize for General Non-Fiction, so I think I'm safe saying you won't go far wrong reading it, but here's the necessary disclaimer that the rest of my reactions may not be useful if you're better versed in modern European history than I was. (This would not be difficult.) That said, I found Postwar invaluable because of its big-picture focus. The events and dates are easy enough to find on the Internet; what was missing for me in understanding Europe was the intent and social structures created by and causing those events. For example, from early in the book:
On one thing, however, all were agreed resisters and politicians alike: "planning". The disasters of the inter-war decades the missed opportunities after 1918, the great depression that followed the stock-market crash of 1929, the waste of unemployment, the inequalities, injustices and inefficiencies of laissez-faire capitalism that had led so many into authoritarian temptation, the brazen indifference of an arrogant ruling elite and the incompetence of an inadequate political class all seemed to be connected by the utter failure to organize society better. If democracy was to work, if it was to recover its appeal, it would have to be planned.
It's one thing to be familiar with the basic economic and political arguments between degrees of free market and planned economies. It's quite another to understand how the appeal of one approach or the discredit of another stems from recent historical experience, and that's what a good history can provide. Judt does not hesitate to draw these sorts of conclusions, and I'm sure some of them are controversial. But while he's opinionated, he's rarely ideological, and he offers no grand explanations. His discussion of the Yugoslav Wars stands out as an example: he mentions various theories of blame (a fraught local ethnic history, the decision by others to not intervene until the situation was truly dire), but largely discards them. Judt's offered explanation is that local politicians saw an opportunity to gain power by inflaming ethnic animosity, and a large portion of the population participated in this process, either passively or eagerly. Other explanations are both unnecessarily complex and too willing to deprive Yugoslavs of agency. I found this refreshingly blunt. When is more complex analysis a way to diffuse responsibility and cling to an ideological fantasy that the right foreign policy would have resolved a problem? A few personal grumblings do creep in, particularly in the chapters on the 1970s (and I think it's not a coincidence that this matches Judt's own young adulthood, a time when one is prone to forming a lot of opinions). There is a brief but stinging criticism of postmodernism in scholarship, which I thought was justified but probably incomplete, and a decidedly grumpy dismissal of punk music, which I thought was less fair. But these are brief asides that don't detract from the overall work. Indeed, they, along with the occasional wry asides ("respecting long-established European practice, no one asked the Poles for their views [on Poland's new frontiers]") add a lot of character. Insofar as this book has a thesis, it's in the implications of the title: Europe only exited the postwar period at the end of the 20th century. Political stability through exhaustion, the overwhelming urgency of economic recovery, and the degree to which the Iron Curtain and the Cold War froze eastern Europe in amber meant that full European recovery from World War II was drawn out and at times suspended. It's only after 1989 and its subsequent upheavals that European politics were able to move beyond postwar concerns. Some of that movement was a reemergence of earlier European politics of nations and ethnic conflict. But, new on the scene, was a sense of identity as Europeans, one that western Europe circled warily and eastern Europe saw as the only realistic path forward.
What binds Europeans together, even when they are deeply critical of some aspect or other of its practical workings, is what it has become conventional to call in disjunctive but revealing contrast with "the American way of life" the "European model of society".
Judt also gave me a new appreciation of how traumatic people find the assignment of fault, and how difficult it is to wrestle with guilt without providing open invitations to political backlash. People will go to great lengths to not feel guilty, and pressing the point runs a substantial risk of creating popular support for ideological movements that are willing to lie to their followers. The book's most memorable treatment of this observation is in the epilogue, which traces popular European attitudes towards the history of the Holocaust through the whole time period. The largest problem with this book is that it is dense and very long. I'm a fairly fast reader, but this was the only book I read through most of my holiday vacation and it still took a full week into the new year to finish it. By the end, I admit I was somewhat exhausted and ready to be finished with European history for a while (although the epilogue is very much worth waiting for). If you, unlike me, can read a book slowly among other things, that may be a good tactic. But despite feeling like this was a slog at times, I'm very glad that I read it. I'm not sure if someone with a firmer grounding in European history would have gotten as much out of it, but I, at least, needed something this comprehensive to wrap my mind around the timeline and fill in some embarrassing gaps. Judt is not the most entertaining writer (although he has his moments), and this is not the sort of popular history that goes out of its way to draw you in, but I found it approachable and clear. If you're looking for a solid survey of modern European history with this type of high-level focus, recommended. Rating: 8 out of 10

30 December 2022

Chris Lamb: Favourite books of 2022: Non-fiction

In my three most recent posts, I went over the memoirs and biographies, classics and fiction books that I enjoyed the most in 2022. But in the last of my book-related posts for 2022, I'll be going over my favourite works of non-fiction. Books that just missed the cut here include Adam Hochschild's King Leopold's Ghost (1998) on the role of Leopold II of Belgium in the Congo Free State, Johann Hari's Stolen Focus (2022) (a personal memoir on relating to how technology is increasingly fragmenting our attention), Amia Srinivasan's The Right to Sex (2021) (a misleadingly named set of philosophic essays on feminism), Dana Heller et al.'s The Selling of 9/11: How a National Tragedy Became a Commodity (2005), John Berger's mindbending Ways of Seeing (1972) and Louise Richardson's What Terrorists Want (2006).

The Great War and Modern Memory (1975)
Wartime: Understanding and Behavior in the Second World War (1989) Paul Fussell Rather than describe the battles, weapons, geopolitics or big personalities of the two World Wars, Paul Fussell's The Great War and Modern Memory & Wartime are focused instead on how the two wars have been remembered by their everyday participants. Drawing on the memoirs and memories of soldiers and civilians along with a brief comparison with the actual events that shaped them, Fussell's two books are a compassionate, insightful and moving piece of analysis. Fussell primarily sets himself against the admixture of nostalgia and trauma that obscures the origins and unimaginable experience of participating in these wars; two wars that were, in his view, a "perceptual and rhetorical scandal from which total recovery is unlikely." He takes particular aim at the dishonesty of hindsight:
For the past fifty years, the Allied war has been sanitised and romanticised almost beyond recognition by the sentimental, the loony patriotic, the ignorant and the bloodthirsty. I have tried to balance the scales. [And] in unbombed America especially, the meaning of the war [seems] inaccessible.
The author does not engage in any of the customary rose-tinted view of war, yet he remains understanding and compassionate towards those who try to locate a reason within what was quite often senseless barbarism. If anything, his despondency and pessimism about the Second World War (the war that Fussell himself fought in) shines through quite acutely, and this is especially the case in what he chooses to quote from others:
"It was common [ ] throughout the [Okinawa] campaign for replacements to get hit before we even knew their names. They came up confused, frightened, and hopeful, got wounded or killed, and went right back to the rear on the route by which they had come, shocked, bleeding, or stiff. They were forlorn figures coming up to the meat grinder and going right back out of it like homeless waifs, unknown and faceless to us, like unread books on a shelf."
It would take a rather heartless reader to fail to be sobered by this final simile, and an even colder one to view Fussell's citation of such an emotive anecdote to be manipulative. Still, stories and cruel ironies like this one infuse this often-angry book, but it is not without astute and shrewd analysis as well, especially on the many qualitative differences between the two conflicts that simply cannot be captured by facts and figures alone. For example:
A measure of the psychological distance of the Second [World] War from the First is the rarity, in 1914 1918, of drinking and drunkenness poems.
Indeed so. In fact, what makes Fussell's project so compelling and perhaps even unique is that he uses these non-quantitive measures to try and take stock of what happened. After all, this was a war conducted by humans, not the abstract school of statistics. And what is the value of a list of armaments destroyed by such-and-such a regiment when compared with truly consequential insights into both how the war affected, say, the psychology of postwar literature ("Prolonged trench warfare, whether enacted or remembered, fosters paranoid melodrama, which I take to be a primary mode in modern writing."), the specific words adopted by combatants ("It is a truism of military propaganda that monosyllabic enemies are easier to despise than others") as well as the very grammar of interaction:
The Field Service Post Card [in WW1] has the honour of being the first widespread exemplary of that kind of document which uniquely characterises the modern world: the "Form". [And] as the first widely known example of dehumanised, automated communication, the post card popularised a mode of rhetoric indispensable to the conduct of later wars fought by great faceless conscripted armies.
And this wouldn't be a book review without argument-ending observations that:
Indicative of the German wartime conception [of victory] would be Hitler and Speer's elaborate plans for the ultimate reconstruction of Berlin, which made no provision for a library.
Our myths about the two world wars possess an undisputed power, in part because they contain an essential truth the atrocities committed by Germany and its allies were not merely extreme or revolting, but their full dimensions (embodied in the Holocaust and the Holodomor) remain essentially inaccessible within our current ideological framework. Yet the two wars are better understood as an abyss in which we were all dragged into the depths of moral depravity, rather than a battle pitched by the forces of light against the forces of darkness. Fussell is one of the few observers that can truly accept and understand this truth and is still able to speak to us cogently on the topic from the vantage point of experience. The Second World War which looms so large in our contemporary understanding of the modern world (see below) may have been necessary and unavoidable, but Fussell convinces his reader that it was morally complicated "beyond the power of any literary or philosophic analysis to suggest," and that the only way to maintain a na ve belief in the myth that these wars were a Manichaean fight between good and evil is to overlook reality. There are many texts on the two World Wars that can either stir the intellect or move the emotions, but Fussell's two books do both. A uniquely perceptive and intelligent commentary; outstanding.

Longitude (1995) Dava Sobel Since Man first decided to sail the oceans, knowing one's location has always been critical. Yet doing so reliably used to be a serious problem if you didn't know where you were, you are far more likely to die and/or lose your valuable cargo. But whilst finding one's latitude (ie. your north south position) had effectively been solved by the beginning of the 17th century, finding one's (east west) longitude was far from trustworthy in comparison. This book first published in 1995 is therefore something of an anachronism. As in, we readily use the GPS facilities of our phones today without hesitation, so we find it difficult to imagine a reality in which knowing something fundamental like your own location is essentially unthinkable. It became clear in the 18th century, though, that in order to accurately determine one's longitude, what you actually needed was an accurate clock. In Longitude, therefore, we read of the remarkable story of John Harrison and his quest to create a timepiece that would not only keep time during a long sea voyage but would survive the rough ocean conditions as well. Self-educated and a carpenter by trade, Harrison made a number of important breakthroughs in keeping accurate time at sea, and Longitude describes his novel breakthroughs in a way that is both engaging and without talking down to the reader. Still, this book covers much more than that, including the development of accurate longitude going hand-in-hand with advancements in cartography as well as in scientific experiments to determine the speed of light: experiments that led to the formulation of quantum mechanics. It also outlines the work being done by Harrison's competitors. 'Competitors' is indeed the correct word here, as Parliament offered a huge prize to whoever could create such a device, and the ramifications of this tremendous financial incentive are an essential part of this story. For the most part, though, Longitude sticks to the story of Harrison and his evolving obsession with his creating the perfect timepiece. Indeed, one reason that Longitude is so resonant with readers is that many of the tropes of the archetypical 'English inventor' are embedded within Harrison himself. That is to say, here is a self-made man pushing against the establishment of the time, with his groundbreaking ideas being underappreciated in his life, or dishonestly purloined by his intellectual inferiors. At the level of allegory, then, I am minded to interpret this portrait of Harrison as a symbolic distillation of postwar Britain a nation acutely embarrassed by the loss of the Empire that is now repositioning itself as a resourceful but plucky underdog; a country that, with a combination of the brains of boffins and a healthy dose of charisma and PR, can still keep up with the big boys. (It is this same search for postimperial meaning I find in the fiction of John le Carr , and, far more famously, in the James Bond franchise.) All of this is left to the reader, of course, as what makes Longitute singularly compelling is its gentle manner and tone. Indeed, at times it was as if the doyenne of sci-fi Ursula K. LeGuin had a sideline in popular non-fiction. I realise it's a mark of critical distinction to downgrade the importance of popular science in favour of erudite academic texts, but Latitude is ample evidence that so-called 'pop' science need not be patronising or reductive at all.

Closed Chambers: The Rise, Fall, and Future of the Modern Supreme Court (1998) Edward Lazarus After the landmark decision by the U.S. Supreme Court in *Dobbs v. Jackson Women's Health Organization that ended the Constitutional right to abortion conferred by Roe v Wade, I prioritised a few books in the queue about the judicial branch of the United States. One of these books was Closed Chambers, which attempts to assay, according to its subtitle, "The Rise, Fall and Future of the Modern Supreme Court". This book is not merely simply a learned guide to the history and functioning of the Court (although it is completely creditable in this respect); it's actually an 'insider' view of the workings of the institution as Lazurus was a clerk for Justice Harry Blackmun during the October term of 1988. Lazarus has therefore combined his experience as a clerk and his personal reflections (along with a substantial body of subsequent research) in order to communicate the collapse in comity between the Justices. Part of this book is therefore a pure history of the Court, detailing its important nineteenth-century judgements (such as Dred Scott which ruled that the Constitution did not consider Blacks to be citizens; and Plessy v. Ferguson which failed to find protection in the Constitution against racial segregation laws), as well as many twentieth-century cases that touch on the rather technical principle of substantive due process. Other layers of Lazurus' book are explicitly opinionated, however, and they capture the author's assessment of the Court's actions in the past and present [1998] day. Given the role in which he served at the Court, particular attention is given by Lazarus to the function of its clerks. These are revealed as being far more than the mere amanuenses they were hitherto believed to be. Indeed, the book is potentially unique in its the claim that the clerks have played a pivotal role in the deliberations, machinations and eventual rulings of the Court. By implication, then, the clerks have plaedy a crucial role in the internal controversies that surround many of the high-profile Supreme Court decisions decisions that, to the outsider at least, are presented as disinterested interpretations of Constitution of the United States. This is of especial importance given that, to Lazarus, "for all the attention we now pay to it, the Court remains shrouded in confusion and misunderstanding." Throughout his book, Lazarus complicates the commonplace view that the Court is divided into two simple right vs. left political factions, and instead documents an ever-evolving series of loosely held but strongly felt series of cabals, quid pro quo exchanges, outright equivocation and pure personal prejudices. (The age and concomitant illnesses of the Justices also appears to have a not insignificant effect on the Court's rulings as well.) In other words, Closed Chambers is not a book that will be read in a typical civics class in America, and the only time the book resorts to the customary breathless rhetoric about the US federal government is in its opening chapter:
The Court itself, a Greek-style temple commanding the crest of Capitol Hill, loomed above them in the dim light of the storm. Set atop a broad marble plaza and thirty-six steps, the Court stands in splendid isolation appropriate to its place at the pinnacle of the national judiciary, one of the three independent and "coequal" branches of American government. Once dubbed the Ivory Tower by architecture critics, the Court has a Corinthian colonnade and massive twenty-foot-high bronze doors that guard the single most powerful judicial institution in the Western world. Lights still shone in several offices to the right of the Court's entrance, and [ ]
Et cetera, et cetera. But, of course, this encomium to the inherent 'nobility' of the Supreme Court is quickly revealed to be a narrative foil, as Lazarus soon razes this dangerously na ve conception to the ground:
[The] institution is [now] broken into unyielding factions that have largely given up on a meaningful exchange of their respective views or, for that matter, a meaningful explication or defense of their own views. It is of Justices who in many important cases resort to transparently deceitful and hypocritical arguments and factual distortions as they discard judicial philosophy and consistent interpretation in favor of bottom-line results. This is a Court so badly splintered, yet so intent on lawmaking, that shifting 5-4 majorities, or even mere pluralities, rewrite whole swaths of constitutional law on the authority of a single, often idiosyncratic vote. It is also a Court where Justices yield great and excessive power to immature, ideologically driven clerks, who in turn use that power to manipulate their bosses and the institution they ostensibly serve.
Lazurus does not put forward a single, overarching thesis, but in the final chapters, he does suggest a potential future for the Court:
In the short run, the cure for what ails the Court lies solely with the Justices. It is their duty, under the shield of life tenure, to recognize the pathologies affecting their work and to restore the vitality of American constitutionalism. Ultimately, though, the long-term health of the Court depends on our own resolve on whom [we] select to join that institution.
Back in 1998, Lazurus might have had room for this qualified optimism. But from the vantage point of 2022, it appears that the "resolve" of the United States citizenry was not muscular enough to meet his challenge. After all, Lazurus was writing before Bush v. Gore in 2000, which arrogated to the judicial branch the ability to decide a presidential election; the disillusionment of Barack Obama's failure to nominate a replacement for Scalia; and many other missteps in the Court as well. All of which have now been compounded by the Trump administration's appointment of three Republican-friendly justices to the Court, including hypocritically appointing Justice Barrett a mere 38 days before the 2020 election. And, of course, the leaking and ruling in Dobbs v. Jackson, the true extent of which has not been yet. Not of a bit of this is Lazarus' fault, of course, but the Court's recent decisions (as well as the liberal hagiographies of 'RBG') most perforce affect one's reading of the concluding chapters. The other slight defect of Closed Chambers is that, whilst it often implies the importance of the federal and state courts within the judiciary, it only briefly positions the Supreme Court's decisions in relation to what was happening in the House, Senate and White House at the time. This seems to be increasingly relevant as time goes on: after all, it seems fairly clear even to this Brit that relying on an activist Supreme Court to enact progressive laws must be interpreted as a failure of the legislative branch to overcome the perennial problems of the filibuster, culture wars and partisan bickering. Nevertheless, Lazarus' book is in equal parts ambitious, opinionated, scholarly and dare I admit it? wonderfully gossipy. By juxtaposing history, memoir, and analysis, Closed Chambers combines an exacting evaluation of the Court's decisions with a lively portrait of the intellectual and emotional intensity that has grown within the Supreme Court's pseudo-monastic environment all while it struggles with the most impactful legal issues of the day. This book is an excellent and well-written achievement that will likely never be repeated, and a must-read for anyone interested in this ever-increasingly important branch of the US government.

Crashed: How a Decade of Financial Crises Changed the World (2018)
Shutdown: How Covid Shook the World's Economy (2021) Adam Tooze The economic historian Adam Tooze has often been labelled as an unlikely celebrity, but in the fourteen years since the global financial crisis of 2008, a growing audience has been looking for answers about the various failures of the modern economy. Tooze, a professor of history at New York's Columbia University, has written much that is penetrative and thought-provoking on this topic, and as a result, he has generated something of a cult following amongst economists, historians and the online left. I actually read two Tooze books this year. The first, Crashed (2018), catalogues the scale of government intervention required to prop up global finance after the 2008 financial crisis, and it characterises the different ways that countries around the world failed to live up to the situation, such as doing far too little, or taking action far too late. The connections between the high-risk subprime loans, credit default swaps and the resulting liquidity crisis in the US in late 2008 is fairly well known today in part thanks to films such as Adam McKay's 2015 The Big Short and much improved economic literacy in media reportage. But Crashed makes the implicit claim that, whilst the specific and structural origins of the 2008 crisis are worth scrutinising in exacting detail, it is the reaction of states in the months and years after the crash that has been overlooked as a result. After all, this is a reaction that has not only shaped a new economic order, it has created one that does not fit any conventional idea about the way the world 'ought' to be run. Tooze connects the original American banking crisis to the (multiple) European debt crises with a larger crisis of liberalism. Indeed, Tooze somehow manages to cover all these topics and more, weaving in Trump, Brexit and Russia's 2014 annexation of Crimea, as well as the evolving role of China in the post-2008 economic order. Where Crashed focused on the constellation of consequences that followed the events of 2008, Shutdown is a clear and comprehensive account of the way the world responded to the economic impact of Covid-19. The figures are often jaw-dropping: soon after the disease spread around the world, 95% of the world's economies contracted simultaneously, and at one point, the global economy shrunk by approximately 20%. Tooze's keen and sobering analysis of what happened is made all the more remarkable by the fact that it came out whilst the pandemic was still unfolding. In fact, this leads quickly to one of the book's few flaws: by being published so quickly, Shutdown prematurely over-praises China's 'zero Covid' policy, and these remarks will make a reader today squirm in their chair. Still, despite the regularity of these references (after all, mentioning China is very useful when one is directly comparing economic figures in early 2021, for examples), these are actually minor blemishes on the book's overall thesis. That is to say, Crashed is not merely a retelling of what happened in such-and-such a country during the pandemic; it offers in effect a prediction about what might be coming next. Whilst the economic responses to Covid averted what could easily have been another Great Depression (and thus showed it had learned some lessons from 2008), it had only done so by truly discarding the economic rule book. The by-product of inverting this set of written and unwritten conventions that have governed the world for the past 50 years, this 'Washington consensus' if you well, has yet to be fully felt. Of course, there are many parallels between these two books by Tooze. Both the liquidity crisis outlined in Crashed and the economic response to Covid in Shutdown exposed the fact that one of the central tenets of the modern economy ie. that financial markets can be trusted to regulate themselves was entirely untrue, and likely was false from the very beginning. And whilst Adam Tooze does not offer a singular piercing insight (conveying a sense of rigorous mastery instead), he may as well be asking whether we're simply going to lurch along from one crisis to the next, relying on the technocrats in power to fix problems when everything blows up again. The answer may very well be yes.

Looking for the Good War: American Amnesia and the Violent Pursuit of Happiness (2021) Elizabeth D. Samet Elizabeth D. Samet's Looking for the Good War answers the following question what would be the result if you asked a professor of English to disentangle the complex mythology we have about WW2 in the context of the recent US exit of Afghanistan? Samet's book acts as a twenty-first-century update of a kind to Paul Fussell's two books (reviewed above), as well as a deeper meditation on the idea that each new war is seen through the lens of the previous one. Indeed, like The Great War and Modern Memory (1975) and Wartime (1989), Samet's book is a perceptive work of demystification, but whilst Fussell seems to have been inspired by his own traumatic war experience, Samet is not only informed by her teaching West Point military cadets but by the physical and ontological wars that have occurred during her own life as well. A more scholarly and dispassionate text is the result of Samet's relative distance from armed combat, but it doesn't mean Looking for the Good War lacks energy or inspiration. Samet shares John Adams' belief that no political project can entirely shed the innate corruptions of power and ambition and so it is crucial to analyse and re-analyse the role of WW2 in contemporary American life. She is surely correct that the Second World War has been universally elevated as a special, 'good' war. Even those with exceptionally giddy minds seem to treat WW2 as hallowed:
It is nevertheless telling that one of the few occasions to which Trump responded with any kind of restraint while he was in office was the 75th anniversary of D-Day in 2019.
What is the source of this restraint, and what has nurtured its growth in the eight decades since WW2 began? Samet posits several reasons for this, including the fact that almost all of the media about the Second World War is not only suffused with symbolism and nostalgia but, less obviously, it has been made by people who have no experience of the events that they depict. Take Stephen Ambrose, author of Steven Spielberg's Band of Brothers miniseries: "I was 10 years old when the war ended," Samet quotes of Ambrose. "I thought the returning veterans were giants who had saved the world from barbarism. I still think so. I remain a hero worshiper." If Looking for the Good War has a primary thesis, then, it is that childhood hero worship is no basis for a system of government, let alone a crusading foreign policy. There is a straight line (to quote this book's subtitle) from the "American Amnesia" that obscures the reality of war to the "Violent Pursuit of Happiness." Samet's book doesn't merely just provide a modern appendix to Fussell's two works, however, as it adds further layers and dimensions he overlooked. For example, Samet provides some excellent insight on the role of Western, gangster and superhero movies, and she is especially good when looking at noir films as a kind of kaleidoscopic response to the Second World War:
Noir is a world ruled by bad decisions but also by bad timing. Chance, which plays such a pivotal role in war, bleeds into this world, too.
Samet rightfully weaves the role of women into the narrative as well. Women in film noir are often celebrated as 'independent' and sassy, correctly reflecting their newly-found independence gained during WW2. But these 'liberated' roles are not exactly a ringing endorsement of this independence: the 'femme fatale' and the 'tart', etc., reflect a kind of conditional freedom permitted to women by a post-War culture which is still wedded to an outmoded honour culture. In effect, far from being novel and subversive, these roles for women actually underwrote the ambient cultural disapproval of women's presence in the workforce. Samet later connects this highly-conditional independence with the liberation of Afghan women, which:
is inarguably one of the more palatable outcomes of our invasion, and the protection of women's rights has been invoked on the right and the left as an argument for staying the course in Afghanistan. How easily consequence is becoming justification. How flattering it will be one day to reimagine it as original objective.
Samet has ensured her book has a predominantly US angle as well, for she ends her book with a chapter on the pseudohistorical Lost Cause of the Civil War. The legacy of the Civil War is still visible in the physical phenomena of Confederate statues, but it also exists in deep-rooted racial injustice that has been shrouded in euphemism and other psychological devices for over 150 years. Samet believes that a key part of what drives the American mythology about the Second World War is the way in which it subconsciously cleanses the horrors of brother-on-brother murder that were seen in the Civil War. This is a book that is not only of interest to historians of the Second World War; it is a work for anyone who wishes to understand almost any American historical event, social issue, politician or movie that has appeared since the end of WW2. That is for better or worse everyone on earth.

27 December 2022

Chris Lamb: Favourite books of 2022: Fiction

This post marks the beginning my yearly roundups of the favourite books and movies that I read and watched in 2022 that I plan to publish over the next few days. Just as I did for 2020 and 2021, I won't reveal precisely how many books I read in the last year. I didn't get through as many books as I did in 2021, though, but that's partly due to reading a significant number of long nineteenth-century novels in particular, a fair number of those books that American writer Henry James once referred to as "large, loose, baggy monsters." However, in today's post I'll be looking at my favourite books that are typically filed under fiction, with 'classic' fiction following tomorrow. Works that just missed the cut here include John O'Brien's Leaving Las Vegas, Colson Whitehead's Sag Harbor and possibly The Name of the Rose by Umberto Eco, or Elif Batuman's The Idiot. I also feel obliged to mention (or is that show off?) that I also read the 1,079-page Infinite Jest by David Foster Wallace, but I can't say it was a favourite, let alone recommend others unless they are in the market for a good-quality under-monitor stand.

Mona (2021) Pola Oloixarac Mona is the story of a young woman who has just been nominated for the 'most important literary award in Europe'. Mona sees the nomination as a chance to escape her substance abuse on a Californian campus and so speedily decamps to the small village in the depths of Sweden where the nominees must convene for a week before the overall winner is announced. Mona didn't disappear merely to avoid pharmacological misadventures, though, but also to avoid the growing realisation that she is being treated as something of an anthropological curiosity at her university: a female writer of colour treasured for her flourish of exotic diversity that reflects well upon her department. But Mona is now stuck in the company of her literary competitors who all have now gathered from around the world in order to do what writers do: harbour private resentments, exchange empty flattery, embody the selfsame racialised stereotypes that Mona left the United States to avoid, stab rivals in the back, drink too much, and, of course, go to bed together. But as I read Mona, I slowly started to realise that something else is going on. Why does Mona keep finding traces of violence on her body, the origins of which she cannot or refuses to remember? There is something eerily defensive about her behaviour and sardonic demeanour in general as well. A genre-bending and mind-expanding novel unfolded itself, and, without getting into spoiler territory, Mona concludes with such a surprising ending that, according to Adam Thirlwell:
Perhaps we need to rethink what is meant by a gimmick. If a gimmick is anything that we want to reject as extra or excessive or ill-fitting, then it may be important to ask what inhibitions or arbitrary conventions have made it seem like excess, and to revel in the exorbitant fictional constructions it produces. [...]
Mona is a savage satire of the literary world, but it's also a very disturbing exploration of trauma and violence. The success of the book comes in equal measure from the author's commitment to both ideas, but also from the way the psychological damage component creeps up on you. And, as implied above, the last ten pages are quite literally out of this world.

My Brilliant Friend (2011)
The Story of a New Name (2012)
Those Who Leave and Those Who Stay (2013)
The Story of the Lost Child (2014) Elena Ferrante Elena Ferrante's Neopolitan Quartet follows two girls, both brilliant in their own way. Our protagonist-narrator is Elena, a studious girl from the lower rungs of the middle class of Naples who is inspired to be more by her childhood friend, Lila. Lila is, in turn, far more restricted by her poverty and class, but can transcend it at times through her fiery nature, which also brands her as somewhat unique within their inward-looking community. The four books follow the two girls from the perspective of Elena as they grow up together in post-war Italy, where they drift in-and-out of each other's lives due to the vicissitudes of change and the consequences of choice. All the time this is unfolding, however, the narrative is very always slightly charged by the background knowledge revealed on the very first page that Lila will, many years later, disappear from Elena's life. Whilst the quartet has the formal properties of a bildungsroman, its subject and conception are almost entirely different. In particular, the books are driven far more by character and incident than spectacular adventures in picturesque Italy. In fact, quite the opposite takes place: these are four books where ordinary-seeming occurrences take on an unexpected radiance against a background of poverty, ignorance, violence and other threats, often bringing to mind the films of the Italian neorealism movement. Brilliantly rendered from beginning to end, Ferrante has a seemingly studious eye for interpreting interactions and the psychology of adolescence and friendship. Some utterances indeed, perhaps even some glances are dissected at length over multiple pages, something that Vittorio De Sica's classic Bicycle Thieves (1948) could never do. Potential readers should not take any notice of the saccharine cover illustrations on most editions of the books. The quartet could even win an award for the most misleading artwork, potentially rivalling even Vladimir Nabokov's Lolita. I wouldn't be at all surprised if it is revealed that the drippy illustrations and syrupy blurbs ("a rich, intense and generous-hearted story ") turn out to be part of a larger metatextual game that Ferrante is playing with her readers. This idiosyncratic view of mine is partially supported by the fact that each of the four books has been given a misleading title, the true ambiguity of which often only becomes clear as each of the four books comes into sharper focus. Readers of the quartet often fall into debating which is the best of the four. I've heard from more than one reader that one has 'too much Italian politics' and another doesn't have enough 'classic' Lina moments. The first book then possesses the twin advantages of both establishing the environs and finishing with a breathtaking ending that is both satisfying and a cliffhanger as well but does this make it 'the best'? I prefer to liken the quartet more like the different seasons of The Wire (2002-2008) where, personal favourites and preferences aside, although each season is undoubtedly unique, it would take a certain kind of narrow-minded view of art to make the claim that, say, series one of The Wire is 'the best' or that the season that focuses on the Baltimore docks 'is boring'. Not to sound like a neo-Wagnerian, but each of them adds to final result in its own. That is to say, both The Wire and the Neopolitan Quartet achieve the rare feat of making the magisterial simultaneously intimate.

Out There: Stories (2022) Kate Folk Out There is a riveting collection of disturbing short stories by first-time author Kate Fork. The title story first appeared in the New Yorker in early 2020 imagines a near-future setting where a group of uncannily handsome artificial men called 'blots' have arrived on the San Francisco dating scene with the secret mission of sleeping with women, before stealing their personal data from their laptops and phones and then (quite literally) evaporating into thin air. Folk's satirical style is not at all didactic, so it rarely feels like she is making her points in a pedantic manner. But it's clear that the narrator of Out There is recounting her frustration with online dating. in a way that will resonate with anyone who s spent time with dating apps or indeed the contemporary hyper-centralised platform-based internet in general. Part social satire, part ghost story and part comic tales, the blurring of the lines between these factors is only one of the things that makes these stories so compelling. But whilst Folk constructs crazy scenarios and intentionally strange worlds, she also manages to also populate them with characters that feel real and genuinely sympathetic. Indeed, I challenge you not to feel some empathy for the 'blot' in the companion story Big Sur which concludes the collection, and it complicates any primary-coloured view of the dating world of consisting entirely of predatory men. And all of this is leavened with a few stories that are just plain surreal. I don't know what the deal is with Dating a Somnambulist (available online on Hobart Pulp), but I know that I like it.

Solaris (1961) Stanislaw Lem When Kelvin arrives at the planet Solaris to study the strange ocean that covers its surface, instead of finding an entirely physical scientific phenomenon, he soon discovers a previously unconscious memory embodied in the physical manifestation of a long-dead lover. The other scientists on the space station slowly reveal that they are also plagued with their own repressed corporeal memories. Many theories are put forward as to why all this is occuring, including the idea that Solaris is a massive brain that creates these incarnate memories. Yet if that is the case, the planet's purpose in doing so is entirely unknown, forcing the scientists to shift focus and wonder whether they can truly understand the universe without first understanding what lies within their own minds and in their desires. This would be an interesting outline for any good science fiction book, but one of the great strengths of Solaris is not only that it withholds from the reader why the planet is doing anything it does, but the book is so forcefully didactic in its dislike of the hubris, destructiveness and colonial thinking that can accompany scientific exploration. In one of its most vitriolic passages, Lem's own anger might be reaching out to the reader:
We are humanitarian and chivalrous; we don t want to enslave other races, we simply want to bequeath them our values and take over their heritage in exchange. We think of ourselves as the Knights of the Holy Contact. This is another lie. We are only seeking Man. We have no need of other worlds. We need mirrors. We don t know what to do with other worlds. A single world, our own, suffices us; but we can t accept it for what it is. We are searching for an ideal image of our own world: we go in quest of a planet, of a civilisation superior to our own, but developed on the basis of a prototype of our primaeval past. At the same time, there is something inside us that we don t like to face up to, from which we try to protect ourselves, but which nevertheless remains since we don t leave Earth in a state of primal innocence. We arrive here as we are in reality, and when the page is turned, and that reality is revealed to us that part of our reality that we would prefer to pass over in silence then we don t like it anymore.
An overwhelming preoccupation with this idea infuses Solaris, and it turns out to be a common theme in a lot of Lem's work of this period, such as in his 1959 'anti-police procedural' The Investigation. Perhaps it not a dislike of exploration in general or the modern scientific method in particular, but rather a savage critique of the arrogance and self-assuredness that accompanies most forms of scientific positivism, or at least pursuits that cloak themselves under the guise of being a laudatory 'scientific' pursuit:
Man has gone out to explore other worlds and other civilizations without having explored his own labyrinth of dark passages and secret chambers and without finding what lies behind doorways that he himself has sealed.
I doubt I need to cite specific instances of contemporary scientific pursuits that might meet Lem's punishing eye today, and the fact that his critique works both in 2022 and 1961 perhaps tells us more about the human condition than we'd care to know. Another striking thing about Solaris isn't just the specific Star Trek and Stargate SG-1 episodes that I retrospectively realised were purloined from the book, but that almost the entire register of Star Trek: The Next Generation in particular seems to be rehearsed here. That is to say, TNG presents itself as hard and fact-based 'sci-fi' on the surface, but, at its core, there are often human, existential and sometimes quite enormously emotionally devastating human themes being discussed such as memory, loss and grief. To take one example from many, the painful memories that the planet Solaris physically materialises in effect asks us to seriously consider what it actually is taking place when we 'love' another person: is it merely another 'mirror' of ourselves? (And, if that is the case, is that... bad?) It would be ahistorical to claim that all popular science fiction today can be found rehearsed in Solaris, but perhaps it isn't too much of a stretch:
[Solaris] renders unnecessary any more alien stories. Nothing further can be said on this topic ...] Possibly, it can be said that when one feels the urge for such a thing, one should simply reread Solaris and learn its lessons again. Kim Stanley Robinson [...]
I could go on praising this book for quite some time; perhaps by discussing the extreme framing devices used within the book at one point, the book diverges into a lengthy bibliography of fictional books-within-the-book, each encapsulating a different theory about what the mechanics and/or function of Solaris is, thereby demonstrating that 'Solaris studies' as it is called within the world of the book has been going on for years with no tangible results, which actually leads to extreme embarrassment and then a deliberate and willful blindness to the 'Solaris problem' on the part of the book's scientific community. But I'll leave it all here before this review gets too long... Highly recommended, and a likely reread in 2023.

Brokeback Mountain (1997) Annie Proulx Brokeback Mountain began as a short story by American author Annie Proulx which appeared in the New Yorker in 1997, although it is now more famous for the 2005 film adaptation directed by Taiwanese filmmaker Ang Lee. Both versions follow two young men who are hired for the summer to look after sheep at a range under the 'Brokeback' mountain in Wyoming. Unexpectedly, however, they form an intense emotional and sexual attachment, yet life intervenes and demands they part ways at the end of the summer. Over the next twenty years, though, as their individual lives play out with marriages, children and jobs, they continue reuniting for brief albeit secret liaisons on camping trips in remote settings. There's no feigned shyness or self-importance in Brokeback Mountain, just a close, compassionate and brutally honest observation of a doomed relationship and a bone-deep feeling for the hardscrabble life in the post-War West. To my mind, very few books have captured so acutely the desolation of a frustrated and repressed passion, as well as the particular flavour of undirected anger that can accompany this kind of yearning. That the original novella does all this in such a beautiful way (and without the crutch of the Wyoming landscape to look at ) is a tribute to Proulx's skills as a writer. Indeed, even without the devasting emotional undertones, Proulx's descriptions of the mountains and scree of the West is likely worth the read alone.

Luster (2020) Raven Leilani Edie is a young Black woman living in New York whose life seems to be spiralling out of control. She isn't good at making friends, her career is going nowhere, and she has no close family to speak of as well. She is, thus, your typical NYC millennial today, albeit seen through a lens of Blackness that complicates any reductive view of her privilege or minority status. A representative paragraph might communicate the simmering tone:
Before I start work, I browse through some photos of friends who are doing better than me, then an article on a black teenager who was killed on 115th for holding a weapon later identified as a showerhead, then an article on a black woman who was killed on the Grand Concourse for holding a weapon later identified as a cell phone, then I drown myself in the comments section and do some online shopping, by which I mean I put four dresses in my cart as a strictly theoretical exercise and then let the page expire.
She starts a sort-of affair with an older white man who has an affluent lifestyle in nearby New Jersey. Eric or so he claims has agreed upon an 'open relationship' with his wife, but Edie is far too inappropriate and disinhibited to respect any boundaries that Eric sets for her, and so Edie soon becomes deeply entangled in Eric's family life. It soon turns out that Eric and his wife have a twelve-year-old adopted daughter, Akila, who is also wait for it Black. Akila has been with Eric's family for two years now and they aren t exactly coping well together. They don t even know how to help her to manage her own hair, let alone deal with structural racism. Yet despite how dark the book's general demeanour is, there are faint glimmers of redemption here and there. Realistic almost to the end, Edie might finally realise what s important in her life, but it would be a stretch to say that she achieves them by the final page. Although the book is full of acerbic remarks on almost any topic (Dogs: "We made them needy and physically unfit. They used to be wolves, now they are pugs with asthma."), it is the comments on contemporary race relations that are most critically insightful. Indeed, unsentimental, incisive and funny, Luster had much of what I like in Colson Whitehead's books at times, but I can't remember a book so frantically fast-paced as this since the Booker-prize winning The Sellout by Paul Beatty or Sam Tallent's Running the Light.

13 December 2022

Matthew Garrett: Trying to remove the need to trust cloud providers

First up: what I'm covering here is probably not relevant for most people. That's ok! Different situations have different threat models, and if what I'm talking about here doesn't feel like you have to worry about it, that's great! Your life is easier as a result. But I have worked in situations where we had to care about some of the scenarios I'm going to describe here, and the technologies I'm going to talk about here solve a bunch of these problems.

So. You run a typical VM in the cloud. Who has access to that VM? Well, firstly, anyone who has the ability to log into the host machine with administrative capabilities. With enough effort, perhaps also anyone who has physical access to the host machine. But the hypervisor also has the ability to inspect what's running inside a VM, so anyone with the ability to install a backdoor into the hypervisor could theoretically target you. And who's to say the cloud platform launched the correct image in the first place? The control plane could have introduced a backdoor into your image and run that instead. Or the javascript running in the web UI that you used to configure the instance could have selected a different image without telling you. Anyone with the ability to get a (cleverly obfuscated) backdoor introduced into quite a lot of code could achieve that. Obviously you'd hope that everyone working for a cloud provider is honest, and you'd also hope that their security policies are good and that all code is well reviewed before being committed. But when you have several thousand people working on various components of a cloud platform, there's always the potential for something to slip up.

Let's imagine a large enterprise with a whole bunch of laptops used by developers. If someone has the ability to push a new package to one of those laptops, they're in a good position to obtain credentials belonging to the user of that laptop. That means anyone with that ability effectively has the ability to obtain arbitrary other privileges - they just need to target someone with the privilege they want. You can largely mitigate this by ensuring that the group of people able to do this is as small as possible, and put technical barriers in place to prevent them from pushing new packages unilaterally.

Now imagine this in the cloud scenario. Anyone able to interfere with the control plane (either directly or by getting code accepted that alters its behaviour) is in a position to obtain credentials belonging to anyone running in that cloud. That's probably a much larger set of people than have the ability to push stuff to laptops, but they have much the same level of power. You'll obviously have a whole bunch of processes and policies and oversights to make it difficult for a compromised user to do such a thing, but if you're a high enough profile target it's a plausible scenario.

How can we avoid this? The easiest way is to take the people able to interfere with the control plane out of the loop. The hypervisor knows what it booted, and if there's a mechanism for the VM to pass that information to a user in a trusted way, you'll be able to detect the control plane handing over the wrong image. This can be achieved using trusted boot. The hypervisor-provided firmware performs a "measurement" (basically a cryptographic hash of some data) of what it's booting, storing that information in a virtualised TPM. This TPM can later provide a signed copy of the measurements on demand. A remote system can look at these measurements and determine whether the system is trustworthy - if a modified image had been provided, the measurements would be different. As long as the hypervisor is trustworthy, it doesn't matter whether or not the control plane is - you can detect whether you were given the correct OS image, and you can build your trust on top of that.

(Of course, this depends on you being able to verify the key used to sign those measurements. On real hardware the TPM has a certificate that chains back to the manufacturer and uniquely identifies the TPM. On cloud platforms you typically have to retrieve the public key via the metadata channel, which means you're trusting the control plane to give you information about the hypervisor in order to verify what the control plane gave to the hypervisor. This is suboptimal, even though realistically the number of moving parts in that part of the control plane is much smaller than the number involved in provisioning the instance in the first place, so an attacker managing to compromise both is less realistic. Still, AWS doesn't even give you that, which does make it all rather more complicated)

Ok, so we can (largely) decouple our trust in the VM from having to trust the control plane. But we're still relying on the hypervisor to provide those attestations. What if the hypervisor isn't trustworthy? This sounds somewhat ridiculous (if you can't run a trusted app on top of an untrusted OS, how can you run a trusted OS on top of an untrusted hypervisor?), but AMD actually have a solution for that. SEV ("Secure Encrypted Virtualisation") is a technology where (handwavily) an encryption key is generated when a new VM is created, and the memory belonging to that VM is encrypted with that key. The hypervisor has no access to that encryption key, and any access to memory initiated by the hypervisor will only see the encrypted content. This means that nobody with the ability to tamper with the hypervisor can see what's going on inside the OS (and also means that nobody with physical access can either, so that's another threat dealt with).

But how do we know that the hypervisor set this up, and how do we know that the correct image was booted? SEV has support for a "Launch attestation", a CPU generated signed statement that it booted the current VM with SEV enabled. But it goes further than that! The attestation includes a measurement of what was booted, which means we don't need to trust the hypervisor at all - the CPU itself will tell us what image we were given. Perfect.

Except, well. There's a few problems. AWS just doesn't have any VMs that implement SEV yet (there are bare metal instances that do, but obviously you're building your own infrastructure to make that work). Google only seem to provide the launch measurement via the logging service - and they only include the parsed out data, not the original measurement. So, we still have to trust (a subset of) the control plane. Azure provides it via a separate attestation service, but again it doesn't seem to provide the raw attestation and so you're still trusting the attestation service. For the newest generation of SEV, SEV-SNP, this is less of a big deal because the guest can provide its own attestation. But Google doesn't offer SEV-SNP hardware yet, and the driver you need for this only shipped in Linux 5.19 and Azure's SEV Ubuntu images only offer up to 5.15 at the moment, so making use of that means you're putting your own image together at the moment.

And there's one other kind of major problem. A normal VM image provides a bootloader and a kernel and a filesystem. That bootloader needs to run on something. That "something" is typically hypervisor-provided "firmware" - for instance, OVMF. This probably has some level of cloud vendor patching, and they probably don't ship the source for it. You're just having to trust that the firmware is trustworthy, and we're talking about trying to avoid placing trust in the cloud provider. Azure has a private beta allowing users to upload images that include their own firmware, meaning that all the code you trust (outside the CPU itself) can be provided by the user, and once that's GA it ought to be possible to boot Azure VMs without having to trust any Microsoft-provided code.

Well, mostly. As AMD admit, SEV isn't guaranteed to be resistant to certain microarchitectural attacks. This is still much more restrictive than the status quo where the hypervisor could just read arbitrary content out of the VM whenever it wanted to, but it's still not ideal. Which, to be fair, is where we are with CPUs in general.

(Thanks to Leonard Cohnen who gave me a bunch of excellent pointers on this stuff while I was digging through it yesterday)

comment count unavailable comments

20 September 2022

Matthew Garrett: Handling WebAuthn over remote SSH connections

Being able to SSH into remote machines and do work there is great. Using hardware security tokens for 2FA is also great. But trying to use them both at the same time doesn't work super well, because if you hit a WebAuthn request on the remote machine it doesn't matter how much you mash your token - it's not going to work.

But could it?

The SSH agent protocol abstracts key management out of SSH itself and into a separate process. When you run "ssh-add .ssh/id_rsa", that key is being loaded into the SSH agent. When SSH wants to use that key to authenticate to a remote system, it asks the SSH agent to perform the cryptographic signatures on its behalf. SSH also supports forwarding the SSH agent protocol over SSH itself, so if you SSH into a remote system then remote clients can also access your keys - this allows you to bounce through one remote system into another without having to copy your keys to those remote systems.

More recently, SSH gained the ability to store SSH keys on hardware tokens such as Yubikeys. If configured appropriately, this means that even if you forward your agent to a remote site, that site can't do anything with your keys unless you physically touch the token. But out of the box, this is only useful for SSH keys - you can't do anything else with this support.

Well, that's what I thought, at least. And then I looked at the code and realised that SSH is communicating with the security tokens using the same library that a browser would, except it ensures that any signature request starts with the string "ssh:" (which a genuine WebAuthn request never will). This constraint can actually be disabled by passing -O no-restrict-websafe to ssh-agent, except that was broken until this weekend. But let's assume there's a glorious future where that patch gets backported everywhere, and see what we can do with it.

First we need to load the key into the security token. For this I ended up hacking up the Go SSH agent support. Annoyingly it doesn't seem to be possible to make calls to the agent without going via one of the exported methods here, so I don't think this logic can be implemented without modifying the agent module itself. But this is basically as simple as adding another key message type that looks something like:
type ecdsaSkKeyMsg struct  
       Type        string  sshtype:"17 25" 
       Curve       string
       PubKeyBytes []byte
       RpId        string
       Flags       uint8
       KeyHandle   []byte
       Reserved    []byte
       Comments    string
       Constraints []byte  ssh:"rest" 
Where Type is ssh.KeyAlgoSKECDSA256, Curve is "nistp256", RpId is the identity of the relying party (eg, ""), Flags is 0x1 if you want the user to have to touch the key, KeyHandle is the hardware token's representation of the key (basically an opaque blob that's sufficient for the token to regenerate the keypair - this is generally stored by the remote site and handed back to you when it wants you to authenticate). The other fields can be ignored, other than PubKeyBytes, which is supposed to be the public half of the keypair.

This causes an obvious problem. We have an opaque blob that represents a keypair. We don't have the public key. And OpenSSH verifies that PubKeyByes is a legitimate ecdsa public key before it'll load the key. Fortunately it only verifies that it's a legitimate ecdsa public key, and does nothing to verify that it's related to the private key in any way. So, just generate a new ECDSA key (ecdsa.GenerateKey(elliptic.P256(), rand.Reader)) and marshal it ( elliptic.Marshal(ecKey.Curve, ecKey.X, ecKey.Y)) and we're good. Pass that struct to ssh.Marshal() and then make an agent call.

Now you can use the standard agent interfaces to trigger a signature event. You want to pass the raw challenge (not the hash of the challenge!) - the SSH code will do the hashing itself. If you're using agent forwarding this will be forwarded from the remote system to your local one, and your security token should start blinking - touch it and you'll get back an ssh.Signature blob. ssh.Unmarshal() the Blob member to a struct like
type ecSig struct  
        R *big.Int
        S *big.Int
and then ssh.Unmarshal the Rest member to
type authData struct  
        Flags    uint8
        SigCount uint32
The signature needs to be converted back to a DER-encoded ASN.1 structure (eg,
var b cryptobyte.Builder
b.AddASN1(asn1.SEQUENCE, func(b *cryptobyte.Builder)  
signatureDER, _ := b.Bytes()
, and then you need to construct the Authenticator Data structure. For this, take the RpId used earlier and generate the sha256. Append the one byte Flags variable, and then convert SigCount to big endian and append those 4 bytes. You should now have a 37 byte structure. This needs to be CBOR encoded (I used and just called cbor.Marshal(data, cbor.EncOptions )).

Now base64 encode the sha256 of the challenge data, the DER-encoded signature and the CBOR-encoded authenticator data and you've got everything you need to provide to the remote site to satisfy the challenge.

There are alternative approaches - you can use USB/IP to forward the hardware token directly to the remote system. But that means you can't use it locally, so it's less than ideal. Or you could implement a proxy that communicates with the key locally and have that tunneled through to the remote host, but at that point you're just reinventing ssh-agent.

And you should bear in mind that the default behaviour of blocking this sort of request is for a good reason! If someone is able to compromise a remote system that you're SSHed into, they can potentially trick you into hitting the key to sign a request they've made on behalf of an arbitrary site. Obviously they could do the same without any of this if they've compromised your local system, but there is some additional risk to this. It would be nice to have sensible MAC policies that default-denied access to the SSH agent socket and only allowed trustworthy binaries to do so, or maybe have some sort of reasonable flatpak-style portal to gate access. For my threat model I think it's a worthwhile security tradeoff, but you should evaluate that carefully yourself.

Anyway. Now to figure out whether there's a reasonable way to get browsers to work with this.

comment count unavailable comments