Search Results: "aph"

9 January 2025

Reproducible Builds: Reproducible Builds in December 2024

Welcome to the December 2024 report from the Reproducible Builds project! Our monthly reports outline what we ve been up to over the past month and highlight items of news from elsewhere in the world of software supply-chain security when relevant. As ever, however, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. reproduce.debian.net
  2. debian-repro-status
  3. On our mailing list
  4. Enhancing the Security of Software Supply Chains
  5. diffoscope
  6. Supply-chain attack in the Solana ecosystem
  7. Website updates
  8. Debian changes
  9. Other development news
  10. Upstream patches
  11. Reproducibility testing framework

reproduce.debian.net Last month saw the introduction of reproduce.debian.net. Announced at the recent Debian MiniDebConf in Toulouse, reproduce.debian.net is an instance of rebuilderd operated by the Reproducible Builds project. rebuilderd is our server designed monitor the official package repositories of Linux distributions and attempts to reproduce the observed results there. This month, however, we are pleased to announce that not only does the service now produce graphs, the reproduce.debian.net homepage itself has become a start page of sorts, and the amd64.reproduce.debian.net and i386.reproduce.debian.net pages have emerged. The first of these rebuilds the amd64 architecture, naturally, but it also is building Debian packages that are marked with the no architecture label, all. The second builder is, however, only rebuilding the i386 architecture. Both of these services were also switched to reproduce the Debian trixie distribution instead of unstable, which started with 43% of the archive rebuild with 79.3% reproduced successfully. This is very much a work in progress, and we ll start reproducing Debian unstable soon. Our i386 hosts are very kindly sponsored by Infomaniak whilst the amd64 node is sponsored by OSUOSL thank you! Indeed, we are looking for more workers for more Debian architectures; please contact us if you are able to help.

debian-repro-status Reproducible builds developer kpcyrd has published a client program for reproduce.debian.net (see above) that queries the status of the locally installed packages and rates the system with a percentage score. This tool works analogously to arch-repro-status for the Arch Linux Reproducible Builds setup. The tool was packaged for Debian and is currently available in Debian trixie: it can be installed with apt install debian-repro-status.

On our mailing list On our mailing list this month:
  • Bernhard M. Wiedemann wrote a detailed post on his long journey towards a bit-reproducible Emacs package. In his interesting message, Bernhard goes into depth about the tools that they used and the lower-level technical details of, for instance, compatibility with the version for glibc within openSUSE.
  • Shivanand Kunijadar posed a question pertaining to the reproducibility issues with encrypted images. Shivanand explains that they must use a random IV for encryption with AES CBC. The resulting artifact is not reproducible due to the random IV used. The message resulted in a handful of replies, hopefully helpful!
  • User Danilo posted an in interesting question related to their attempts in trying to achieve reproducible builds for Threema Desktop 2.0. The question resulted in a number of replies attempting to find the right combination of compiler and linker flags (for example).
  • Longstanding contributor David A. Wheeler wrote to our list announcing the release of the Census III of Free and Open Source Software: Application Libraries report written by Frank Nagle, Kate Powell, Richie Zitomer and David himself. As David writes in his message, the report attempts to answer the question what is the most popular Free and Open Source Software (FOSS)? .
  • Lastly, kpcyrd followed-up to a post from September 2024 which mentioned their desire for someone to implement a hashset of allowed module hashes that is generated during the kernel build and then embedded in the kernel image , thus enabling a deterministic and reproducible build. However, they are now reporting that somebody implemented the hash-based allow list feature and submitted it to the Linux kernel mailing list . Like kpcyrd, we hope it gets merged.

Enhancing the Security of Software Supply Chains: Methods and Practices Mehdi Keshani of the Delft University of Technology in the Netherlands has published their thesis on Enhancing the Security of Software Supply Chains: Methods and Practices . Their introductory summary first begins with an outline of software supply chains and the importance of the Maven ecosystem before outlining the issues that it faces that threaten its security and effectiveness . To address these:
First, we propose an automated approach for library reproducibility to enhance library security during the deployment phase. We then develop a scalable call graph generation technique to support various use cases, such as method-level vulnerability analysis and change impact analysis, which help mitigate security challenges within the ecosystem. Utilizing the generated call graphs, we explore the impact of libraries on their users. Finally, through empirical research and mining techniques, we investigate the current state of the Maven ecosystem, identify harmful practices, and propose recommendations to address them.
A PDF of Mehdi s entire thesis is available to download.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 283 and 284 to Debian:
  • Update copyright years. [ ]
  • Update tests to support file 5.46. [ ][ ]
  • Simplify tests_quines.py::test_ differences,differences_deb to simply use assert_diff and not mangle the test fixture. [ ]

Supply-chain attack in the Solana ecosystem A significant supply-chain attack impacted Solana, an ecosystem for decentralised applications running on a blockchain. Hackers targeted the @solana/web3.js JavaScript library and embedded malicious code that extracted private keys and drained funds from cryptocurrency wallets. According to some reports, about $160,000 worth of assets were stolen, not including SOL tokens and other crypto assets.

Website updates Similar to last month, there was a large number of changes made to our website this month, including:
  • Chris Lamb:
    • Make the landing page hero look nicer when the vertical height component of the viewport is restricted, not just the horizontal width.
    • Rename the Buy-in page to Why Reproducible Builds? [ ]
    • Removing the top black border. [ ][ ]
  • Holger Levsen:
  • hulkoba:
    • Remove the sidebar-type layout and move to a static navigation element. [ ][ ][ ][ ]
    • Create and merge a new Success stories page, which highlights the success stories of Reproducible Builds, showcasing real-world examples of projects shipping with verifiable, reproducible builds. These stories aim to enhance the technical resilience of the initiative by encouraging community involvement and inspiring new contributions. . [ ]
    • Further changes to the homepage. [ ]
    • Remove the translation icon from the navigation bar. [ ]
    • Remove unused CSS styles pertaining to the sidebar. [ ]
    • Add sponsors to the global footer. [ ]
    • Add extra space on large screens on the Who page. [ ]
    • Hide the side navigation on small screens on the Documentation pages. [ ]

Debian changes There were a significant number of reproducibility-related changes within Debian this month, including:
  • Santiago Vila uploaded version 0.11+nmu4 of the dh-buildinfo package. In this release, the dh_buildinfo becomes a no-op ie. it no longer does anything beyond warning the developer that the dh-buildinfo package is now obsolete. In his upload, Santiago wrote that We still want packages to drop their [dependency] on dh-buildinfo, but now they will immediately benefit from this change after a simple rebuild.
  • Holger Levsen filed Debian bug #1091550 requesting a rebuild of a number of packages that were built with a very old version of dpkg.
  • Fay Stegerman contributed to an extensive thread on the debian-devel development mailing list on the topic of Supporting alternative zlib implementations . In particular, Fay wrote about her results experimenting whether zlib-ng produces identical results or not.
  • kpcyrd uploaded a new rust-rebuilderd-worker, rust-derp, rust-in-toto and debian-repro-status to Debian, which passed successfully through the so-called NEW queue.
  • Gioele Barabucci filed a number of bugs against the debrebuild component/script of the devscripts package, including:
    • #1089087: Address a spurious extra subdirectory in the build path.
    • #1089201: Extra zero bytes added to .dynstr when rebuilding CMake projects.
    • #1089088: Some binNMUs have a 1-second offset in some timestamps.
  • Gioele Barabucci also filed a bug against the dh-r package to report that the Recommends and Suggests fields are missing from rebuilt R packages. At the time of writing, this bug has no patch and needs some help to make over 350 binary packages reproducible.
  • Lastly, 8 reviews of Debian packages were added, 11 were updated and 11 were removed this month adding to our knowledge about identified issues.

Other development news In other ecosystem and distribution news:
  • Lastly, in openSUSE, Bernhard M. Wiedemann published another report for the distribution. There, Bernhard reports about the success of building R-B-OS , a partial fork of openSUSE with only 100% bit-reproducible packages. This effort was sponsored by the NLNet NGI0 initiative.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In November, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add a new i386.reproduce.debian.net rebuilder. [ ][ ][ ][ ][ ][ ]
    • Make a number of updates to the documentation. [ ][ ][ ][ ][ ]
    • Run i386.reproduce.debian.net run on a public port to allow external workers. [ ]
    • Add a link to the /api/v0/pkgs/list endpoint. [ ]
    • Add support for a statistics page. [ ][ ][ ][ ][ ][ ]
    • Limit build logs to 20 MiB and diffoscope output to 10 MiB. [ ]
    • Improve the frontpage. [ ][ ]
    • Explain that we re testing arch:any and arch:all on the amd64 architecture, but only arch:any on i386. [ ]
  • Misc:
    • Remove code for testing Arch Linux, which has moved to reproduce.archlinux.org. [ ][ ]
    • Don t install dstat on Jenkins nodes anymore as its been removed from Debian trixie. [ ]
    • Prepare the infom08-i386 node to become another rebuilder. [ ]
    • Add debug date output for benchmarking the reproducible_pool_buildinfos.sh script. [ ]
    • Install installation-birthday everywhere. [ ]
    • Temporarily disable automatic updates of pool links on buildinfos.debian.net. [ ]
    • Install Recommends by default on Jenkins nodes. [ ]
    • Rename rebuilder_stats.py to rebuilderd_stats.py. [ ]
    • r.d.n/stats: minor formatting changes. [ ]
    • Install files under /etc/cron.d/ with the correct permissions. [ ]
and Jochen Sprickerhof made the following changes: Lastly, Gioele Barabucci also classified packages affected by 1-second offset issue filed as Debian bug #1089088 [ ][ ][ ][ ], Chris Hofstaedtler updated the URL for Grml s dpkg.selections file [ ], Roland Clobus updated the Jenkins log parser to parse warnings from diffoscope [ ] and Mattia Rizzolo banned a number of bots and crawlers from the service [ ][ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Freexian Collaborators: Debian Contributions: Tracker.debian.org updates, Salsa CI improvements, Coinstallable build-essential, Python 3.13 transition, Ruby 3.3 transition and more! (by Anupa Ann Joseph, Stefano Rivera)

Debian Contributions: 2024-12 Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Tracker.debian.org updates, by Rapha l Hertzog Profiting from end-of-year vacations, Rapha l prepared for tracker.debian.org to be upgraded to Debian 12 bookworm by getting rid of the remnants of python3-django-jsonfield in the code (it was superseded by a Django-native field). Thanks to Philipp Kern from the Debian System Administrators team, the upgrade happened on December 23rd. Rapha l also improved distro-tracker to better deal with invalid Maintainer fields which recently caused multiples issues in the regular data updates (#1089985, MR 105). While working on this, he filed #1089648 asking dpkg tools to error out early when maintainers make such mistakes. Finally he provided feedback to multiple issues and merge requests (MR 106, issues #21, #76, #77), there seems to be a surge of interest in distro-tracker lately. It would be nice if those new contributors could stick around and help out with the significant backlog of issues (in the Debian BTS, in Salsa).

Salsa CI improvements, by Santiago Ruano Rinc n Given that the Debian buildd network now relies on sbuild using the unshare backend, and that Salsa CI s reproducibility testing needs to be reworked (#399), Santiago resumed the work for moving the build job to use sbuild. There was some related work a few months ago that was focused on sbuild with the schroot and the sudo backends, but those attempts were stalled for different reasons, including discussions around the convenience of the move (#296). However, using sbuild and unshare avoids all of the drawbacks that have been identified so far. Santiago is preparing two merge requests: !568 to introduce a new build image, and !569 that moves all the extract-source related tasks to the build job. As mentioned in the previous reports, this change will make it possible for more projects to use the pipeline to build the packages (See #195). Additional advantages of this change include a more optimal way to test if a package builds twice in a row: instead of actually building it twice, the Salsa CI pipeline will configure sbuild to check if the clean target of debian/rules correctly restores the source tree, saving some CPU cycles by avoiding one build. Also, the images related to Ubuntu won t be needed anymore, since the build job will create chroots for different distributions and vendors from a single common build image. This will save space in the container registry. More changes are to come, especially those related to handling projects that customize the pipeline and make use of the extract-source job.

Coinstallable build-essential, by Helmut Grohne Building on the gcc-for-host work of last December, a notable patch turning build-essential Multi-Arch: same became feasible. Whilst the change is small, its implications and foundations are not. We still install crossbuild-essential-$ARCH for cross building and due to a britney2 limitation, we cannot have it depend on the host s C library. As a result, there are workarounds in place for sbuild and pbuilder. In turning build-essential Multi-Arch: same, we may actually express these dependencies directly as we install build-essential:$ARCH instead. The crossbuild-essential-$ARCH packages will continue to be available as transitional dummy packages.

Python 3.13 transition, by Colin Watson and Stefano Rivera Building on last month s work, Colin, Stefano, and other members of the Debian Python team fixed 3.13 compatibility bugs in many more packages, allowing 3.13 to now be a supported but non-default version in testing. The next stage will be to switch to it as the default version, which will start soon. Stefano did some test-rebuilds of packages that only build for the default Python 3 version, to find issues that will block the transition. The default version transition typically shakes out some more issues in applications that (unlike libraries) only test with the default Python version. Colin also fixed Sphinx 8.0 compatibility issues in many packages, which otherwise threatened to get in the way of this transition.

Ruby 3.3 transition, by Lucas Kanashiro The Debian Ruby team decided to ship Ruby 3.3 in the next Debian release, and Lucas took the lead of the interpreter transition with the assistance of the rest of the team. In order to understand the impact of the new interpreter in the ruby ecosystem, ruby-defaults was uploaded to experimental adding ruby3.3 as an alternative interpreter, and a mass rebuild of reverse dependencies was done here. Initially, a couple of hundred packages were failing to build, after many rounds of rebuilds, adjustments, and many uploads we are down to 30 package build failures, of those, 21 packages were asked to be removed from testing and for the other 9, bugs were filled. All the information to track this transition can be found here. Now, we are waiting for PHP 8.4 to finish to avoid any collision. Once it is done the Ruby 3.3 transition will start in unstable.

Miscellaneous contributions
  • Enrico Zini redesigned the way nm.debian.org stores historical audit logs and personal data backups.
  • Carles Pina submitted a new package (python-firebase-messaging) and prepared updates for python3-ring-doorbell.
  • Carles Pina developed further po-debconf-manager: better state transition, fixed bugs, automated assigning translators and reviewers on edit, updating po header files automatically, fixed bugs, etc.
  • Carles Pina reviewed, submitted and followed up the debconf templates translation (more than 20 packages) and translated some packages (about 5).
  • Santiago continued to work on DebConf 25 organization related tasks, including handling the logo survey and results. Stefano spent time on DebConf 25 too.
  • Santiago continued the exploratory work about linux livepatching with Emmanuel Arias. Santiago and Emmanuel found a challenge since kpatch won t fully support linux in trixie and newer, so they are exploring alternatives such as klp-build.
  • Helmut maintained the /usr-move transition filing bugs in e.g. bubblewrap, e2fsprogs, libvpd-2.2-3, and pam-tmpdir and corresponding on related issues such as kexec-tools and live-build. The removal of the usrmerge package unfortunately broke debootstrap and was quickly reverted. Continued fallout is expected and will continue until trixie is released.
  • Helmut sent patches for 10 cross build failures and worked with Sandro Knau on stuck Qt/KDE patches related to cross building.
  • Helmut continued to maintain rebootstrap removing the need to build gnu-efi in the process.
  • Helmut collaborated with Emanuele Rocca and Jochen Sprickerhof on an interesting adventure in diagnosing why gcc would FTBFS in recent sbuild.
  • Helmut proposed supporting build concurrency limits in coreutils s nproc. As it turns out nproc is not a good place for this functionality.
  • Colin worked with Sandro Tosi and Andrej Shadura to finish resolving the multipart vs. python-multipart name conflict, as mentioned last month.
  • Colin upgraded 48 Python packages to new upstream versions, fixing four CVEs and a number of compatibility bugs with recent Python versions.
  • Colin issued an openssh bookworm update with a number of fixes that had accumulated over the last year, especially fixing GSS-API key exchange which had been quite broken in bookworm.
  • Stefano fixed a minor bug in debian-reimbursements that was disallowing combination PDFs containing JAL tickets, encoded in UTF-16.
  • Stefano uploaded a stable update to PyPy3 in bookworm, catching up with security issues resolved in cPython.
  • Stefano fixed a regression in the eventlet from his Python 3.13 porting patch.
  • Stefano continued discussing a forwarded patch (renaming the sysconfigdata module) with cPython upstream, ending in a decision to drop the patch from Debian. This will need some continued work.
  • Anupa participated in the Debian Publicity team meeting in December, which discussed the team activities done in 2024 and projects for 2025.

7 January 2025

Thorsten Alteholz: My Debian Activities in December 2024

Debian LTS This was my hundred-twenty-sixth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. I worked on updates for ffmpeg and haproxy in all releases. Along the way I marked more CVEs as not-affected than I had to fix. So finally there was no upload needed for haproxy anymore. Unfortunately testing ffmpeg was not as easy, as the recommended just look whether mpv can play random videos is not really satisfying. So the upload will happen only in January. I also wonder whether fixing glewlwyd is really worth the effort, as the software is already EOL upstream. Debian ELTS This month was the seventy-seventhth ELTS month. During my allocated time I worked on ffmpeg, haproxy, amanda and kmail-account-wizzard. Like LTS, all CVEs of haproxy and some of ffmpeg could be marked as not-affected and testing of the other packages was/is not really straight forward. So the final upload will only happen in January as well. Debian Printing Unfortunately I didn t found any time to work on this topic. Debian Matomo Thanks a lot to William Desportes for all fixes of my bad PHP packaging. Debian Astro This month I uploaded new packages or new upstream or bugfix versions of: I again sponsored an upload of calceph. Debian IoT This month I uploaded new upstream or bugfix versions of: Debian Mobcom This month I uploaded new packages or new upstream or bugfix versions of: misc This month I uploaded new upstream or bugfix versions of: I also sponsored uploads of emacs-lsp-docker, emacs-dape, emacs-oauth2, gpgmngr, libjs-jush. FTP master This month I accepted 330 and rejected 13 packages. The overall number of packages that got accepted was 335.

4 January 2025

Scarlett Gately Moore: KDE: Snap hotfixes and updates

Fixed okular pdf printing https://bugs.kde.org/show_bug.cgi?id=498065 Fixed kwave recording https://bugs.kde.org/show_bug.cgi?id=442085 please run sudo snap connect kwave:audio-record :audio-record until auto-connect gets approved here: https://forum.snapcraft.io/t/kde-auto-connect-our-two-recording-apps/44419 New qt6 snaps in edge until 24.12.1 release
I have begun the process of moving to core24 currently in edge until 24.12.1 release. Some major improvements come with core24!
Tokodon is our wonderful Mastadon client I hate asking but I am unemployable with this broken arm fiasco and 6 hours a day hospital runs for treatment. If you could spare anything it would be appreciated! https://gofund.me/573cc38e

3 January 2025

Bits from Debian: Bits from the DPL

Dear Debian community, this is bits from DPL for December. Happy New Year 2025! Wishing everyone health, productivity, and a successful Debian release later in this year. Strict ownership of packages I'm glad my last bits sparked discussions about barriers between packages and contributors, summarized temporarily in some post on the debian-devel list. As one participant aptly put it, we need a way to visibly say, "I'll do the job until someone else steps up". Based on my experience with the Bug of the Day initiative, simplifying the process for engaging with packages would significantly help. Currently we have
  1. NMU The Developers Reference outlines several preconditions for NMUs, explicitly stating, "Fixing cosmetic issues or changing the packaging style in NMUs is discouraged." This makes NMUs unsuitable for addressing package smells. However, I've seen NMUs used for tasks like switching to source format 3.0 or bumping the debhelper compat level. While it's technically possible to file a bug and then address it in an NMU, the process inherently limits the NMUer's flexibility to reduce package smells.
  2. Package Salvaging This is another approach for working on someone else's packages, aligning with the process we often follow in the Bug of the Day initiative. The criteria for selecting packages typically indicate that the maintainer either lacks time to address open bugs, has lost interest, or is generally MIA.
Both options have drawbacks, so I'd welcome continued discussion on criteria for lowering the barriers to moving packages to Salsa and modernizing their packaging. These steps could enhance Debian overall and are generally welcomed by active maintainers. The discussion also highlighted that packages on Salsa are often maintained collaboratively, fostering the team-oriented atmosphere already established in several Debian teams. Salsa Continuous Integration As part of the ongoing discussion about package maintenance, I'm considering the suggestion to switch from the current opt-in model for Salsa CI to an opt-out approach. While I fully agree that human verification is necessary when the pipeline is activated, I believe the current option to enable CI is less visible than it should be. I'd welcome a more straightforward approach to improve access to better testing for what we push to Salsa. Number of packages not on Salsa In my campaign, I stated that I aimed to reduce the number of packages maintained outside Salsa to below 2,000. As of March 28, 2024, the count was 2,368. As of this writing, the count stands at 1,928 [1], so I consider this promise fulfilled. My thanks go out to everyone who contributed to this effort. Moving forward, I'd like to set a more ambitious goal for the remainder of my term and hope we can reduce the number to below 1,800. [1] UDD query: SELECT DISTINCT count(*) FROM sources WHERE release = 'sid' and vcs_url not like '%salsa%' ; Past and future events Talk at MRI Together In early December, I gave a short online talk, primarily focusing on my work with the Debian Med team. I also used my position as DPL to advocate for attracting more users and developers from the scientific research community. FOSSASIA I originally planned to attend FOSDEM this year. However, given the strong Debian presence there and the need for better representation at the FOSSASIA Summit, I decided to prioritize the latter. This aligns with my goal of improving geographic diversity. I also look forward to opportunities for inter-distribution discussions. Debian team sprints Debian Ruby Sprint I approved the budget for the Debian Ruby Sprint, scheduled for January 2025 in Paris. If you're interested in contributing to the Ruby team, whether in person or online, consider reaching out to them. I'm sure any helping hand would be appreciated. Debian Med sprint There will also be a Debian Med sprint in Berlin in mid-February. As usual, you don't need to be an expert in biology or medicine basic bug squashing skills are enough to contribute and enjoy the friendly atmosphere the Debian Med team fosters at their sprints. For those working in biology and medicine, we typically offer packaging support. Anyone interested in spending a weekend focused on impactful scientific work with Debian is warmly invited. Again all the best for 2025
Andreas.

2 January 2025

Matthew Garrett: The GPU, not the TPM, is the root of hardware DRM

As part of their "Defective by Design" anti-DRM campaign, the FSF recently made the following claim:
Today, most of the major streaming media platforms utilize the TPM to decrypt media streams, forcefully placing the decryption out of the user's control (from here).
This is part of an overall argument that Microsoft's insistence that only hardware with a TPM can run Windows 11 is with the goal of aiding streaming companies in their attempt to ensure media can only be played in tightly constrained environments.

I'm going to be honest here and say that I don't know what Microsoft's actual motivation for requiring a TPM in Windows 11 is. I've been talking about TPM stuff for a long time. My job involves writing a lot of TPM code. I think having a TPM enables a number of worthwhile security features. Given the choice, I'd certainly pick a computer with a TPM. But in terms of whether it's of sufficient value to lock out Windows 11 on hardware with no TPM that would otherwise be able to run it? I'm not sure that's a worthwhile tradeoff.

What I can say is that the FSF's claim is just 100% wrong, and since this seems to be the sole basis of their overall claim about Microsoft's strategy here, the argument is pretty significantly undermined. I'm not aware of any streaming media platforms making use of TPMs in any way whatsoever. There is hardware DRM that the media companies use to restrict users, but it's not in the TPM - it's in the GPU.

Let's back up for a moment. There's multiple different DRM implementations, but the big three are Widevine (owned by Google, used on Android, Chromebooks, and some other embedded devices), Fairplay (Apple implementation, used for Mac and iOS), and Playready (Microsoft's implementation, used in Windows and some other hardware streaming devices and TVs). These generally implement several levels of functionality, depending on the capabilities of the device they're running on - this will range from all the DRM functionality being implemented in software up to the hardware path that will be discussed shortly. Streaming providers can choose what level of functionality and quality to provide based on the level implemented on the client device, and it's common for 4K and HDR content to be tied to hardware DRM. In any scenario, they stream encrypted content to the client and the DRM stack decrypts it before the compressed data can be decoded and played.

The "problem" with software DRM implementations is that the decrypted material is going to exist somewhere the OS can get at it at some point, making it possible for users to simply grab the decrypted stream, somewhat defeating the entire point. Vendors try to make this difficult by obfuscating their code as much as possible (and in some cases putting some of it in-kernel), but pretty much all software DRM is at least somewhat broken and copies of any new streaming media end up being available via Bittorrent pretty quickly after release. This is why higher quality media tends to be restricted to clients that implement hardware-based DRM.

The implementation of hardware-based DRM varies. On devices in the ARM world this is usually handled by performing the cryptography in a Trusted Execution Environment, or TEE. A TEE is an area where code can be executed without the OS having any insight into it at all, with ARM's TrustZone being an example of this. By putting the DRM code in TrustZone, the cryptography can be performed in RAM that the OS has no access to, making the scraping described earlier impossible. x86 has no well-specified TEE (Intel's SGX is an example, but is no longer implemented in consumer parts), so instead this tends to be handed off to the GPU. The exact details of this implementation are somewhat opaque - of the previously mentioned DRM implementations, only Playready does hardware DRM on x86, and I haven't found any public documentation of what drivers need to expose for this to work.

In any case, as part of the DRM handshake between the client and the streaming platform, encryption keys are negotiated with the key material being stored in the GPU or the TEE, inaccessible from the OS. Once decrypted, the material is decoded (again either on the GPU or in the TEE - even in implementations that use the TEE for the cryptography, the actual media decoding may happen on the GPU) and displayed. One key point is that the decoded video material is still stored in RAM that the OS has no access to, and the GPU composites it onto the outbound video stream (which is why if you take a screenshot of a browser playing a stream using hardware-based DRM you'll just see a black window - as far as the OS can see, there is only a black window there).

Now, TPMs are sometimes referred to as a TEE, and in a way they are. However, they're fixed function - you can't run arbitrary code on the TPM, you only have whatever functionality it provides. But TPMs do have the ability to decrypt data using keys that are tied to the TPM, so isn't this sufficient? Well, no. First, the TPM can't communicate with the GPU. The OS could push encrypted material to it, and it would get plaintext material back. But the entire point of this exercise was to avoid the decrypted version of the stream from ever being visible to the OS, so this would be pointless. And rather more fundamentally, TPMs are slow. I don't think there's a TPM on the market that could decrypt a 1080p stream in realtime, let alone a 4K one.

The FSF's focus on TPMs here is not only technically wrong, it's indicative of a failure to understand what's actually happening in the industry. While the FSF has been focusing on TPMs, GPU vendors have quietly deployed all of this technology without the FSF complaining at all. Microsoft has enthusiastically participated in making hardware DRM on Windows possible, and user freedoms have suffered as a result, but Playready hardware-based DRM works just fine on hardware that doesn't have a TPM and will continue to do so.

comment count unavailable comments

31 December 2024

Russ Allbery: Review: Metal from Heaven

Review: Metal from Heaven, by August Clarke
Publisher: Erewhon
Copyright: November 2024
ISBN: 1-64566-099-0
Format: Kindle
Pages: 443
Metal from Heaven is industrial-era secondary-world fantasy with a literary bent. It is a complete story in one book, and I would be very surprised by a sequel. Clarke previously wrote the Scapegracers young-adult trilogy, which got excellent reviews and a few award nominations, as H.A. Clarke. This is his first adult novel.
Know I adore you. Look out over the glow. The cities sundered, their machines inverted, mountains split and prairies blazing, that long foreseen Hereafter crowning fast. This calamity is a promise made to you. A prayer to you, and to your shadow which has become my second self, tucked behind my eye and growing in tandem with me, pressing outwards through the pupil, the smarter, truer, almost bursting reason for our wrath. Do not doubt me. Just look. Watch us rise as the sun comes up over the beauty. The future stains the bleakness so pink. When my violence subsides, we will have nothing, and be champions.
Marney Honeycutt is twelve years old, a factory worker, and lustertouched. She works in the Yann I. Chauncey Ichorite Foundry in Ignavia City, alongside her family and her best friend, shaping the magical metal ichorite into the valuable industrial products of a new age of commerce and industry. She is the oldest of the lustertouched, the children born to factory workers and poisoned by the metal. It has made her allergic, prone to fits at any contact with ichorite, but also able to exert a strange control over the metal if she's willing to pay the price of spasms and hallucinations for hours afterwards. As Metal from Heaven opens, the workers have declared a strike. Her older sister is the spokesperson, demanding shorter hours, safer working conditions, and an investigation into the health of the lustertouched children. Chauncey's response is to send enforcer snipers to kill the workers, including the entirety of her family.
The girl sang, "Unalone toward dawn we go, toward the glory of the new morning." An enforcer shot her in the belly, and when she did not fall, her head.
Marney survives, fleeing into the city, swearing an impossible personal revenge against Yann Chauncey. An act of charity gets her a ticket on a train into the countryside. The woman who bought her ticket is a bandit who is on the train to rob it. Marney's ability to control ichorite allows her to help the bandits in return, winning her a place with the Highwayman's Choir who have been preying on the shipments of the rich and powerful and then disappearing into the hills. The Choir's secret is that the agoraphobic and paranoid Baron of the Fingerbluffs is dead and has been for years. He was killed by his staff, Hereafterist idealists, who have turned his remote territory into an anarchist commune and haven for pirates and bandits. This becomes Marney's home and the Choir becomes her family, but she never forgets her oath of revenge or the childhood friend she left behind in the piles of bodies and to whom this story is narrated. First, Clarke's writing is absolutely gorgeous.
We scaled the viny mountain jags at Montrose Barony's legal edge, the place where land was and wasn't Ignavia, Royston, and Drustland alike. There was a border but it was diffuse and hallucinatory, even more so than most. On legal papers and state maps there were harsh lines that squashed topography and sanded down the mountains into even hills in planter's rows, but here among the jutting rocks and craggy heather, the ground was lineless.
The rhythm of it, the grasp of contrast and metaphor, the word choice! That climactic word "lineless," with its echo of limitless. So good. Second, this is the rarest of books: a political fantasy that takes class and religion seriously and uses them for more than plot drivers. This is not at all our world, and the technology level is somewhat ambiguous, but the parallels to the Gilded Age and Progressive Era are unmistakable. The Hereafterists that Marney joins are political anarchists, not in the sense of alternative governance structures and political theory sanitized for middle-class liberals, but in the sense of Emma Goldman and Peter Kropotkin. The society they have built in the Fingerbluffs is temporary, threatened, and contingent, but it is sincere and wildly popular among the people who already lived there. Even beyond politics, class is a tangible force in this book. Marney is a factory worker and the child of factory workers. She barely knows how to read and doesn't magically learn over the course of the book. She has friends who are clever in the sense rewarded by politics and nobility, who navigate bureaucracies and political nuance, but that is not Marney's world. When, towards the end of the book, she has to deal with a gathering of high-class women, the contrast is stark, and she navigates that gathering only by being entirely unexpected. Perhaps the best illustration of the subtlety of this is the terminology in the book for lesbian. Marney is a crawly, which is a slur thrown at people like her (and one of the rare fictional slurs that work exactly as the author intended) but is also simply what she calls herself. Whether or not it functions as a slur depends on context, and the context is never hard to understand. The high-class lesbians she meets later are Lunarists, and react to crawly as a vile and insulting word. They use language to separate themselves from both the insult and from the social class that uses it. Language is an indication of culture and manners and therefore of morality, unlike deeds, which admit endless justifications.
Conversation was fleeting. Perdita managed with whomever stood near her, chipper about every prettiness she saw, the flitting butterflies, the dappled light between the leaves, the lushness and the fragrance of untamed land, and her walking companions took turns sharing in her delight. It was infectious, how happy she was. She was going to slaughter millions. She was going to skip like this all the while.
The handling of religion is perhaps even better. Marney was raised a Tullian, which sits alongside two other fleshed-out fictional religions and sketches of several more. Tullians tend to be conservative and patriarchal, and Marney has a realistically complicated relationship with faith: sticking with some Tullian worship practices and gestures because they're part of who she is, feeling a kinship to other Tullians, discarding beliefs that don't fit her, and revising others. Every major religion has a Hereafterist spin or reinterpretation that upends or reverses the parts of the religion that were used to prop up the existing social order and brings it more in line with Hereafterist ideals. We see the Tullian Hereafterist variation in detail, and as someone who has studied a lot of methods of reinterpreting Christianity, I was impressed by how well Clarke invents both a belief system and its revisionist rewrite. This is exactly how religions work in human history, but one almost never sees this subtlety in fantasy novels. Marney's allergy to ichorite causes her internal dialogue to dissolve into hallucinatory synesthesia when she's manipulating or exposed to it. Since that's most of the book, substantial portions read like drug trips with growing body horror. I normally hate this type of narration, so it's a sign of just how good Clarke's writing is that I tolerated it and even enjoyed parts. It helps that the descriptions are irreverent and often surprising, full of unexpected metaphors and sudden turns. It's very hard not to quote paragraph after paragraph of this book. Clarke is also doing a lot with gender that I don't feel qualified to comment in detail on, but it would not surprise me to see this book in the Otherwise Award recommendation list. I can think of three significant male characters, all of whom are well-done, but every other major character is female by at least some gender definition. Within that group, though, is huge gender diversity of the complicated and personal type that doesn't force people into defined boxes. Marney's sexuality is similarly unclassified and sometimes surprising. My one complaint is that I thought the sex scenes (which, to warn, are often graphic) fell into the literary fiction trap of being described so closely and physically that it didn't feel like anyone involved was actually enjoying themselves. (This is almost certainly a matter of personal taste.) I had absolutely no idea how Clarke was going to end this book, and the last couple of chapters caught me by surprise. I'm still not sure what I think about the climax. It's not the ending that I wanted, but one of the merits of this book is that it never did what I thought I wanted and yet made me enjoy the journey anyway. It is, at least, a genre ending, not a literary ending: The reader gets a full explanation of what is going on, and the setting is not static the way that it so often is in literary fiction. The characters can change the world, for good or for ill. The story felt frustrating and incomplete when I first finished it, but I haven't stopped thinking about this book and I think I like the shape of it a bit more now. It was certainly unexpected, at least by me. Clarke names Dhalgren as one of their influences in the acknowledgments, and yes, Metal from Heaven is that kind of book. This is the first 2024 novel I've read that felt like the kind of book that should be on award shortlists. I'm not sure it was entirely successful, and there are parts of it that I didn't like or that weren't for me, but it's trying to do something different and challenging and uncomfortable, and I think it mostly worked. And the writing is so good.
She looked like a mythic princess from the old woodcuts, who ruled nature by force of goodness and faith and had no legal power.
Metal from Heaven is not going to be everyone's taste. If you do not like literary fantasy, there is a real chance that you will hate this. I am very glad that I read it, and also am going to take a significant break from difficult books before I tackle another one. But then I'm probably going to try the Scapegracers series, because Clarke is an author I want to follow. Content notes: Explicit sex, including sadomasochistic sex. Political violence, mostly by authorities. Murdered children, some body horror, and a lot of serious injuries and death. Rating: 8 out of 10

23 December 2024

Simon Josefsson: OpenSSH and Git on a Post-Quantum SPHINCS+

Are you aware that Git commits and tags may be signed using OpenSSH? Git signatures may be used to improve integrity and authentication of our software supply-chain. Popular signature algorithms include Ed25519, ECDSA and RSA. Did you consider that these algorithms may not be safe if someone builds a post-quantum computer? As you may recall, I have earlier blogged about the efficient post-quantum key agreement mechanism called Streamlined NTRU Prime and its use in SSH and I have attempted to promote the conservatively designed Classic McEliece in a similar way, although it remains to be adopted. What post-quantum signature algorithms are available? There is an effort by NIST to standardize post-quantum algorithms, and they have a category for signature algorithms. According to wikipedia, after round three the selected algorithms are CRYSTALS-Dilithium, FALCON and SPHINCS+. Of these, SPHINCS+ appears to be a conservative choice suitable for long-term digital signatures. Can we get this to work? Recall that Git uses the ssh-keygen tool from OpenSSH to perform signing and verification. To refresh your memory, let s study the commands that Git uses under the hood for Ed25519. First generate a Ed25519 private key:
jas@kaka:~$ ssh-keygen -t ed25519 -f my_ed25519_key -P ""
Generating public/private ed25519 key pair.
Your identification has been saved in my_ed25519_key
Your public key has been saved in my_ed25519_key.pub
The key fingerprint is:
SHA256:fDa5+jmC2+/aiLhWeWA3IV8Wj6yMNTSuRzqUZlIGlXQ jas@kaka
The key's randomart image is:
+--[ED25519 256]--+
     .+=.E ..      
      oo=.ooo      
     . =o=+o .     
      =oO+o .      
      .=+S.=       
       oo.o o      
      . o  .       
     ...o.+..      
    .o.o.=**.      
+----[SHA256]-----+
jas@kaka:~$ cat my_ed25519_key
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACAWP/aZ8hzN0WNRMSpjzbgW1tJXNd2v6/dnbKaQt7iIBQAAAJCeDotOng6L
TgAAAAtzc2gtZWQyNTUxOQAAACAWP/aZ8hzN0WNRMSpjzbgW1tJXNd2v6/dnbKaQt7iIBQ
AAAEBFRvzgcD3YItl9AMmVK4xDKj8NTg4h2Sluj0/x7aSPlhY/9pnyHM3RY1ExKmPNuBbW
0lc13a/r92dsppC3uIgFAAAACGphc0BrYWthAQIDBAU=
-----END OPENSSH PRIVATE KEY-----
jas@kaka:~$ cat my_ed25519_key.pub 
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBY/9pnyHM3RY1ExKmPNuBbW0lc13a/r92dsppC3uIgF jas@kaka
jas@kaka:~$ 
Then let s sign something with this key:
jas@kaka:~$ echo "Hello world!" > msg
jas@kaka:~$ ssh-keygen -Y sign -f my_ed25519_key -n my-namespace msg
Signing file msg
Write signature to msg.sig
jas@kaka:~$ cat msg.sig 
-----BEGIN SSH SIGNATURE-----
U1NIU0lHAAAAAQAAADMAAAALc3NoLWVkMjU1MTkAAAAgFj/2mfIczdFjUTEqY824FtbSVz
Xdr+v3Z2ymkLe4iAUAAAAMbXktbmFtZXNwYWNlAAAAAAAAAAZzaGE1MTIAAABTAAAAC3Nz
aC1lZDI1NTE5AAAAQLmWsq05tqOOZIJqjxy5ZP/YRFoaX30lfIllmfyoeM5lpVnxJ3ZxU8
SF0KodDr8Rtukg2N3Xo80NGvZOzbG/9Aw=
-----END SSH SIGNATURE-----
jas@kaka:~$
Now let s create a list of trusted public-keys and associated identities:
jas@kaka:~$ echo 'my.name@example.org ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBY/9pnyHM3RY1ExKmPNuBbW0lc13a/r92dsppC3uIgF' > allowed-signers
jas@kaka:~$ 
Then let s verify the message we just signed:
jas@kaka:~$ cat msg   ssh-keygen -Y verify -f allowed-signers -I my.name@example.org -n my-namespace -s msg.sig
Good "my-namespace" signature for my.name@example.org with ED25519 key SHA256:fDa5+jmC2+/aiLhWeWA3IV8Wj6yMNTSuRzqUZlIGlXQ
jas@kaka:~$ 
I have implemented support for SPHINCS+ in OpenSSH. This is early work, but I wanted to announce it to get discussion of some of the details going and to make people aware of it. What would a better way to demonstrate SPHINCS+ support in OpenSSH than by validating the Git commit that implements it using itself? Here is how to proceed, first get a suitable development environment up and running. I m using a Debian container launched in a protected environment using podman.
jas@kaka:~$ podman run -it --rm debian:stable
Then install the necessary build dependencies for OpenSSH.
# apt-get update 
# apt-get install git build-essential autoconf libz-dev libssl-dev
Now clone my OpenSSH branch with the SPHINCS+ implentation and build it. You may browse the commit on GitHub first if you are curious.
# cd
# git clone https://github.com/jas4711/openssh-portable.git -b sphincsp
# cd openssh-portable
# autoreconf -fvi
# ./configure
# make
Configure a Git allowed signers list with my SPHINCS+ public key (make sure to keep the public key on one line with the whitespace being one ASCII SPC character):
# mkdir -pv ~/.ssh
# echo 'simon@josefsson.org ssh-sphincsplus@openssh.com AAAAG3NzaC1zcGhpbmNzcGx1c0BvcGVuc3NoLmNvbQAAAECI6eacTxjB36xcPtP0ZyxJNIGCN350GluLD5h0KjKDsZLNmNaPSFH2ynWyKZKOF5eRPIMMKSCIV75y+KP9d6w3' > ~/.ssh/allowed_signers
# git config gpg.ssh.allowedSignersFile ~/.ssh/allowed_signers
Then verify the commit using the newly built ssh-keygen binary:
# PATH=$PWD:$PATH
# git log -1 --show-signature
commit ce0b590071e2dc845373734655192241a4ace94b (HEAD -> sphincsp, origin/sphincsp)
Good "git" signature for simon@josefsson.org with SPHINCSPLUS key SHA256:rkAa0fX0lQf/7V7QmuJHSI44L/PAPPsdWpis4nML7EQ
Author: Simon Josefsson <simon@josefsson.org>
Date:   Tue Dec 3 18:44:25 2024 +0100
    Add SPHINCS+.
# git verify-commit ce0b590071e2dc845373734655192241a4ace94b
Good "git" signature for simon@josefsson.org with SPHINCSPLUS key SHA256:rkAa0fX0lQf/7V7QmuJHSI44L/PAPPsdWpis4nML7EQ
# 
Yay! So what are some considerations? SPHINCS+ comes in many different variants. First it comes with three security levels approximately matching 128/192/256 bit symmetric key strengths. Second choice is between the SHA2-256, SHAKE256 (SHA-3) and Haraka hash algorithms. Final choice is between a robust and a simple variant with different security and performance characteristics. To get going, I picked the sphincss256sha256robust SPHINCS+ implementation from SUPERCOP 20241022. There is a good size comparison table in the sphincsplus implementation, if you want to consider alternative variants. SPHINCS+ public-keys are really small, as you can see in the allowed signers file. This is really good because they are handled by humans and often by cut n paste. What about private keys? They are slightly longer than Ed25519 private keys but shorter than typical RSA private keys.
# ssh-keygen -t sphincsplus -f my_sphincsplus_key -P ""
Generating public/private sphincsplus key pair.
Your identification has been saved in my_sphincsplus_key
Your public key has been saved in my_sphincsplus_key.pub
The key fingerprint is:
SHA256:4rNfXdmLo/ySQiWYzsBhZIvgLu9sQQz7upG8clKziBg root@ad600ff56253
The key's randomart image is:
+[SPHINCSPLUS 256-+
  .  .o            
 o . oo.           
  = .o.. o         
 o o  o o . .   o  
 .+    = S o   o . 
 Eo=  . + . . .. . 
 =*.+  o . . oo .  
 B+=    o o.o. .   
 o*o   ... .oo.    
+----[SHA256]-----+
# cat my_sphincsplus_key.pub 
ssh-sphincsplus@openssh.com AAAAG3NzaC1zcGhpbmNzcGx1c0BvcGVuc3NoLmNvbQAAAEAltAX1VhZ8pdW9FgC+NdM6QfLxVXVaf1v2yW4v+tk2Oj5lxmVgZftfT37GOMOlK9iBm9SQHZZVYZddkEJ9F1D7 root@ad600ff56253
# cat my_sphincsplus_key 
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAYwAAABtzc2gtc3
BoaW5jc3BsdXNAb3BlbnNzaC5jb20AAABAJbQF9VYWfKXVvRYAvjXTOkHy8VV1Wn9b9slu
L/rZNjo+ZcZlYGX7X09+xjjDpSvYgZvUkB2WVWGXXZBCfRdQ+wAAAQidiIwanYiMGgAAAB
tzc2gtc3BoaW5jc3BsdXNAb3BlbnNzaC5jb20AAABAJbQF9VYWfKXVvRYAvjXTOkHy8VV1
Wn9b9sluL/rZNjo+ZcZlYGX7X09+xjjDpSvYgZvUkB2WVWGXXZBCfRdQ+wAAAIAbwBxEhA
NYzITN6VeCMqUyvw/59JM+WOLXBlRbu3R8qS7ljc4qFVWUtmhy8B3t9e4jrhdO6w0n5I4l
mnLnBi2hJbQF9VYWfKXVvRYAvjXTOkHy8VV1Wn9b9sluL/rZNjo+ZcZlYGX7X09+xjjDpS
vYgZvUkB2WVWGXXZBCfRdQ+wAAABFyb290QGFkNjAwZmY1NjI1MwECAwQ=
-----END OPENSSH PRIVATE KEY-----
# 
Signature size? Now here is the challenge, for this variant the size is around 29kb or close to 600 lines of base64 data:
# git cat-file -p ce0b590071e2dc845373734655192241a4ace94b   head -10
tree ede42093e7d5acd37fde02065a4a19ac1f418703
parent 826483d51a9fee60703298bbf839d9ce37943474
author Simon Josefsson <simon@josefsson.org> 1733247865 +0100
committer Simon Josefsson <simon@josefsson.org> 1734907869 +0100
gpgsig -----BEGIN SSH SIGNATURE-----
 U1NIU0lHAAAAAQAAAGMAAAAbc3NoLXNwaGluY3NwbHVzQG9wZW5zc2guY29tAAAAQIjp5p
 xPGMHfrFw+0/RnLEk0gYI3fnQaW4sPmHQqMoOxks2Y1o9IUfbKdbIpko4Xl5E8gwwpIIhX
 vnL4o/13rDcAAAADZ2l0AAAAAAAAAAZzaGE1MTIAAHSDAAAAG3NzaC1zcGhpbmNzcGx1c0
 BvcGVuc3NoLmNvbQAAdGDHlobgfgkKKQBo3UHmnEnNXczCMNdzJmeYJau67QM6xZcAU+d+
 2mvhbksm5D34m75DWEngzBb3usJTqWJeeDdplHHRe3BKVCQ05LHqRYzcSdN6eoeZqoOBvR
# git cat-file -p ce0b590071e2dc845373734655192241a4ace94b   tail -5 
 ChvXUk4jfiNp85RDZ1kljVecfdB2/6CHFRtxrKHJRDiIavYjucgHF1bjz0fqaOSGa90UYL
 RZjZ0OhdHOQjNP5QErlIOcZeqcnwi0+RtCJ1D1wH2psuXIQEyr1mCA==
 -----END SSH SIGNATURE-----
Add SPHINCS+.
# git cat-file -p ce0b590071e2dc845373734655192241a4ace94b   wc -l
579
# 
What about performance? Verification is really fast:
# time git verify-commit ce0b590071e2dc845373734655192241a4ace94b
Good "git" signature for simon@josefsson.org with SPHINCSPLUS key SHA256:rkAa0fX0lQf/7V7QmuJHSI44L/PAPPsdWpis4nML7EQ
real	0m0.010s
user	0m0.005s
sys	0m0.005s
# 
On this machine, verifying an Ed25519 signature is a couple of times slower, and needs around 0.07 seconds. Signing is slower, it takes a bit over 2 seconds on my laptop.
# echo "Hello world!" > msg
# time ssh-keygen -Y sign -f my_sphincsplus_key -n my-namespace msg
Signing file msg
Write signature to msg.sig
real	0m2.226s
user	0m2.226s
sys	0m0.000s
# echo 'my.name@example.org ssh-sphincsplus@openssh.com AAAAG3NzaC1zcGhpbmNzcGx1c0BvcGVuc3NoLmNvbQAAAEAltAX1VhZ8pdW9FgC+NdM6QfLxVXVaf1v2yW4v+tk2Oj5lxmVgZftfT37GOMOlK9iBm9SQHZZVYZddkEJ9F1D7' > allowed-signers
# cat msg   ssh-keygen -Y verify -f allowed-signers -I my.name@example.org -n my-namespace -s msg.sig
Good "my-namespace" signature for my.name@example.org with SPHINCSPLUS key SHA256:4rNfXdmLo/ySQiWYzsBhZIvgLu9sQQz7upG8clKziBg
# 
Welcome to our new world of Post-Quantum safe digital signatures of Git commits, and Happy Hacking!

20 December 2024

Noah Meyerhans: Local Development VM Management

A coworker asked recently about how people use VMs locally for dev work, so I figured I d take a few minutes to write up a bit about what I do. There are many use cases for local virtual machines in software development and testing. They re self-contained, meaning you can make a mess of them without impacting your day-to-day computing environment. They can run different distributions, kernels, and even entirely different operating systems from the one you use regularly. Etc. They re also cheaper than cloud services and provide finer grained control over the resources. I figured I d share a little bit about how I manage different virtual machines in case anybody finds this useful. This is what works for me, but it won t necessarily work for you, or maybe you ve already got something better. I ve found it to be easy to work with, light weight, and is easy to evolve my needs change.

Use short-lived VMs Rather than keep a long-lived development VM around that you customize over time, I recommend automating the common customizations and provisioning new VMs regularly. If I m working on reproducing a bug or testing a change prior to submitting it upstream, I ll do this work in a VM and delete the VM when when I m done. When provisioning VMs this frequently, though, walking through the installation process for every new VM is tedious and a waste of time. Since most of my work is done in Debian, so I start with images generated daily by the cloud team. These images are available for multiple releases and architectures. The nocloud variant boots to a root prompt and can be useful directly, or the generic images can be used for cloud-init based customization.

Automating image preparation This makefile lets me do something like make image and get a new qcow2 image with the latest build of a given Debian release (sid by default, with others available by specifying DIST).
DATESTAMP=$(shell date +"%Y-%m-%d")
FLAVOR?=generic
ARCH?=$(shell dpkg --print-architecture)
DIST?=sid
RELEASE=$(DIST)
URL_PATH=https://cloud.debian.org/images/cloud/$(DIST)/daily/latest/
ifeq ($(DIST),trixie)
RELEASE=13
endif
ifeq ($(DIST),bookworm)
RELEASE=12
endif
ifeq ($(DIST),bullseye)
RELEASE=11
endif
debian-$(DIST)-$(FLAVOR)-$(ARCH)-daily.tar.xz:
curl --fail --connect-timeout 20 -LO \
$(URL_PATH)/debian-$(RELEASE)-$(FLAVOR)-$(ARCH)-daily.tar.xz
$(DIST)-$(FLAVOR)-$(DATESTAMP).qcow2: debian-$(RELEASE)-$(FLAVOR)-$(ARCH)-daily.tar.xz
tar xvf debian-$(RELEASE)-$(FLAVOR)-$(ARCH)-daily.tar.xz
qemu-img convert -O qcow2 disk.raw $@
rm -f disk.raw
qemu-img resize $@ 20g
qemu-img snapshot -c untouched $@
image: $(DIST)-$(FLAVOR)-$(DATESTAMP).qcow2
.PHONY: image

Customize the VM environment with cloud-init While the nocloud images can be useful, I typically find that I want to apply the same modifications to each new VM I launch, and they don t provide facilities for automating this. The generic images, on the other hand, run cloud-init by default. Using cloud-init, I can create my user account, point apt at local mirrors, install my preferred tools, ensure the root filesystem is resized to make full use of the backing storage, etc. The cloud-init configuration on the generic images will read from a local config drive, which can contain an ISO9660 (cdrom) filesystem image. This image can be generated from a subdirectory containing the various cloud-init input files using the following make syntax:
IMDS_FILES=$(shell find seedconfig -path '*/.git/*' \
-prune -o -type f -name '*.in.json' -print) \
seedconfig/openstack/latest/user_data
seed.iso: $(IMDS_FILES)
genisoimage -V config-2 -o $@ -J -R -m '*~' -m '.git' seedconfig
With the image in place, the VM can be created with
 qemu-system-x86_64 -machine q35,accel=kvm
-cpu host -m 4g -drive file=$ img ,index=0,if=virtio,media=disk
-drive file=seed.iso,media=cdrom,format=raw,index=2,if=virtio
-nic user -nographic
This invokes qemu with the root volume and ISO image attached as disks, uses an emulated q35 machine with the host s CPU and KVM acceleration, the userspace network stack, and a serial console. The first time the VM boots, cloud-init will apply the configuration from the cloud-config available in the ISO9660 filesystem.

Alternatives to cloud-init virt-customize is another tool accomplishing the same type of customization. I use cloud-init because it works directly with cloud providers in addition to local VM images. You could also use something like ansible.

Variations I have a variant of this that uses a bridged network, which I ll write more about later. The bridge is nice because it s more featureful, with full support for IPv6, etc, but it needs a bit more infrastructure in place. It also can be helpful to use 9p or virtfs to share filesystem state between the host the VM. I don t tend to rely on these, and will instead use rsync or TRAMP for moving files around. Containers are also useful, of course, and there are plenty of times when the full isolation of a VM is not worth the overhead.

17 December 2024

Gunnar Wolf: The science of detecting LLM-generated text

This post is a review for Computing Reviews for The science of detecting LLM-generated text , a article published in Communications of the ACM
While artificial intelligence (AI) applications for natural language processing (NLP) are no longer something new or unexpected, nobody can deny the revolution and hype that started, in late 2022, with the announcement of the first public version of ChatGPT. By then, synthetic translation was well established and regularly used, many chatbots had started attending users requests on different websites, voice recognition personal assistants such as Alexa and Siri had been widely deployed, and complaints of news sites filling their space with AI-generated articles were already commonplace. However, the ease of prompting ChatGPT or other large language models (LLMs) and getting extensive answers its text generation quality is so high that it is often hard to discern whether a given text was written by an LLM or by a human has sparked significant concern in many different fields. This article was written to present and compare the current approaches to detecting human- or LLM-authorship in texts. The article presents several different ways LLM-generated text can be detected. The first, and main, taxonomy followed by the authors is whether the detection can be done aided by the LLM s own functions ( white-box detection ) or only by evaluating the generated text via a public application programming interface (API) ( black-box detection ). For black-box detection, the authors suggest training a classifier to discern the origin of a given text. Although this works at first, this task is doomed from its onset to be highly vulnerable to new LLMs generating text that will not follow the same patterns, and thus will probably evade recognition. The authors report that human evaluators find human-authored text to be more emotional and less objective, and use grammar to indicate the tone of the sentiment that should be used when reading the text a trait that has not been picked up by LLMs yet. Human-authored text also tends to have higher sentence-level coherence, with less term repetition in a given paragraph. The frequency distribution for more and less common words is much more homogeneous in LLM-generated texts than in human-written ones. White-box detection includes strategies whereby the LLMs will cooperate in identifying themselves in ways that are not obvious to the casual reader. This can include watermarking, be it rule based or neural based; in this case, both processes become a case of steganography, as the involvement of a LLM is explicitly hidden and spread through the full generated text, aiming at having a low detectability and high recoverability even when parts of the text are edited. The article closes by listing the authors concerns about all of the above-mentioned technologies. Detecting an LLM, be it with or without the collaboration of the LLM s designers, is more of an art than a science, and methods deemed as robust today will not last forever. We also cannot assume that LLMs will continue to be dominated by the same core players; LLM technology has been deeply studied, and good LLM engines are available as free/open-source software, so users needing to do so can readily modify their behavior. This article presents itself as merely a survey of methods available today, while also acknowledging the rapid progress in the field. It is timely and interesting, and easy to follow for the informed reader coming from a different subfield.

Russ Allbery: Review: Iris Kelly Doesn't Date

Review: Iris Kelly Doesn't Date, by Ashley Herring Blake
Series: Bright Falls #3
Publisher: Berkley Romance
Copyright: October 2023
ISBN: 0-593-55058-7
Format: Kindle
Pages: 381
Iris Kelly Doesn't Date is a sapphic romance novel (probably a romantic comedy, although I'm bad at romance subgenres). It is the third book in the Bright Falls series. In the romance style, it has a new set of protagonists, but the protagonists of the previous books appear as supporting characters and reading this will spoil the previous books. Among the friend group we were introduced to in Delilah Green Doesn't Care, Iris was the irrepressible loudmouth. She's bad at secrets, good at saying whatever is on her mind, and has zero desire to either get married or have children. After one of the side plots of Astrid Parker Doesn't Fail, she has sworn off dating entirely. Iris is also now a romance novelist. Her paper store didn't get enough foot traffic to justify staying open, so she switched her planner business to online only and wrote a romance novel that was good enough to get a two-book deal. Now she needs to write a second book and she has absolutely nothing. Her own avoidance of romantic situations is not helping, but neither is her meddling family who are convinced her choices about marriage and family can be overturned with sufficient pestering. She desperately needs to shake up her life, get out of her creative rut, and do something new. Failing that, she'll settle for meeting someone in a bar and having some fun. Stevie is a barista and actress living in Portland. Six months ago, she broke up with Adri, her creative partner, girlfriend of six years, and the first person with whom she had a serious relationship. More precisely, Adri broke up with her. They're still friends, truly, even though that friendship is being seriously strained by Adri dating Vanessa, another member of their small and close-knit friend group. Stevie has occasionally-crippling anxiety, not much luck in finding real acting roles in Portland, and a desperate desire to not make waves. Ren, the fourth member of their friend group, thinks Stevie needs a new relationship, or at least a fling. That's how Stevie, with Ren as backup and encouragement, ends up at the same bar with Iris. The resulting dance and conversation was rather fun for both Stevie and Iris. The attempted one-night stand afterwards was a disaster due to Stevie's anxiety, and neither of them expected to see the other again. Stevie therefore felt safe pretending they'd hit it off to get her friends off her back. When Iris's continued restlessness lands her in an audition for Adri's fundraiser play that she also talked Stevie into performing in, this turns into a full-blown fake dating trope. These books continue to be impossible to put down. I'm not sure what Blake is doing to make the pacing so perfect, but as with the previous books of the series I found this utterly compulsive reading. I started it in the afternoon, took a break in the evening for a few hours, and then finished it at 2am. I wasn't sure if a book focused on Iris would work as well, but I need not have worried. Iris Kelly Doesn't Date is both more dramatic and more trope-centered than the earlier books, but Blake handles that in a way that fits Iris's personality and wasn't annoying even to a reader like me, who has an aversion to many types of relationship drama. The secret is Stevie, and specifically having the other protagonist be someone with severe anxiety.
No was never a very easy word for Stevie when it came to Adri, when it came to anyone, really. She could handle the little stuff do you want a soda, have you seen this movie, do you like onions on your pizza but the big stuff, the stuff that caused disappointed expressions and down-turned mouths... yeah, she sucked at that part. Her anxiety would flare, and she'd spend the next week convinced her friends hated her, she'd die alone and miserable, and wasn't worth a damn to anyone. Then, when said friend or family member eventually got ahold of her to tell her that, no, of course they didn't hate her, why in the world would she think that, her anxiety would crest once again, convincing her that she was terrible at understanding people and could never trust her own brain to make heads or tails of any social situation.
This is a spot-on description of a particular type of anxiety, but also this is the perfect protagonist to pair with Iris. Throughout the series, Iris has always been the ride-or-die friend, the person who may have no idea how to help but who will show up anyway and at least try to distract you. Stevie's anxiety makes Iris feel protective, which reveals one of the best sides of Iris's personality, and then the protectiveness plays off against Iris's own relationship issues and tendency to avoid taking anything too seriously. It's one of those relationships that starts a bit one-sided and then becomes mutually supporting once Stevie gets her feet under her. That's a relationship pattern I really enjoy reading about. As with the rest of the series, the friendship dynamics are great. Here we get to see two friend groups at work: Iris's, which we've seen in the previous two volumes and which expanded interestingly in Astrid Parker Doesn't Fail, and Stevie's, which is new. I liked all of these people, even Adri in her own way (although she's the hardest to like). The previous happily-ever-afters do get a bit awkward here, but Blake tries to make that part of the plot and also avoids most of the problem of somewhat-boring romantic bliss by spreading the friendship connections a bit wider. Stevie's friend group formed at orientation at Reed College, and that let me put my finger on another property of this series: essentially all of the characters are from a very specific social class. They're nearly all arts people (bookstore owner, photographer, interior decorator, actress, writer, director), they've mostly gone to college, and while most of them don't have lots of money, there's always at least one person in each friend group with significant wealth. Jordan, from the previous book, is a bit of an exception since she works in a trade (a carpenter), but she still acts like someone from that same social class. It's a bit like reading Jane Austen novels and realizing that the protagonists are drawn from a very specific and very narrow portion of society. This is not a complaint, to be clear; I have no objections to reading about a very specific social class. But if one has already read lots of books about this class of people, I could see that diminishing the appeal of this series a bit. There are a lot of assumptions baked into the story that aren't really questioned, such as the ubiquity of therapists. (I don't know how Stevie affords one on a barista salary.) There are also some small things in the terminology (therapy speak, for example) and in the specific type of earnestness with which the books attempt to be diverse on most axes other than social class that I suspect may grate a bit for some readers. If that's you, this is your warning. There is a third-act breakup here, just like the previous volumes. There is also a defense of the emotional punch of third-act breakups in romance novels in the book itself, put into Iris's internal monologue, so I suspect that's the author's answer to critics like myself who don't like the trope. I was less frustrated by this one because it fit the drama level of the protagonists, but I'll also know to expect a third-act breakup in any Blake novel I read in the future. But, all that said, the summary once again is that I loved this book and could not put it down. Iris is dramatic and occasionally self-destructive but has a core of earnest empathy that makes her easy to like. She's exactly the sort of extrovert who is soothing to introverts rather than draining because she carries the extrovert load of social situations. Stevie is adorably earnest and thoughtful beneath her anxiety. They two of them are wildly different and yet remarkably good together, and I loved reading their story. Highly recommended, along with the whole series. Start with Delilah Green Doesn't Care; if you like that, you're in for a treat. Content note: This book is also rather sex-forward and pretty explicit in the sex scenes, maybe a touch more than Astrid Parker Doesn't Fail. If that is or is not your thing in romance novels, be aware going in. Rating: 9 out of 10

16 December 2024

Dirk Eddelbuettel: #45: Some r-ci Updates

market monitor Welcome to post 45 in the $R^4 series! We introduced r-ci here in post #32 here nearly four years ago. It has found pretty widespread use and adoption, and we received a few kind words then (in the linked issue) and also more recently (in a follow-up comment) from which we merrily quote:
[ ] almost 3 years later on and I have had zero problems with this CI setup. For people who want reliable R software, resources like these are invaluable.
And while we followed up with post #41 about r2u for simple continuous integration, we may not have posted when we based r-ci on r2u (for the obvious Linux usage case). So let s make time now for a (comparitively smaller) update, and an update usage examples. We made two changes in the last few days. One is a (obvious in hindsight) simplification. Given that the bootstrap step was always executed, and needed no parameters, we pulled it into a new aggregated setup simply called r-ci that includes it so that it can be omitted as a step in the yaml file. Second, we recently needed Fortran on macOS too, and realized it was not installed by default so we just added that too. With that a real and used example is now as simple as the screenshot to the left (and hence one paragraph shorter). The trained eye will no doubt observe that there is nothing specific to a given repo. And that is basically the key feature: we can simply copy this file around and get fast and easy and reliable CI by taking advantage of the underlying robustness of r2u solving all dependency automagically and reliably. The option to enable macOS is also solid and compelling as the GitHub runners are fast (but more expensive in how the count against the limit of minutes so again a tradeoff to make), as is the option to run coverage if one so desires. Some of my repos do too. Take a look at the r-ci website which has more examples for the other supported CI servics it can used with, and feel free to ask questions as issue in the repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub. Please report excessive re-aggregation in third-party for-profit settings.

15 December 2024

Russell Coker: Hisense 65U80G 65 Inch 8K ULED Android TV (2021)

The Aim I just bought a Hisense 65U80G 65 Inch 8K ULED Android TV (2021 model) for $1,568 including delivery. I got that deal by googling refurbished 8K TVs and finding the cheapest one I could buy. Amazon and eBay didn t have any good prices on second hand 8K TVs and new ones start at $3,000 on special. I didn t assess how Hisense compares to other TVs, as far as I could determine there was only one model of 8K TV on sale in Australia in the price range I was prepared to pay. So I won t review how this TV compares to other models but how refurbished TVs compare to other display options. I bought this because the highest resolution monitor in my price range is 5120*2160 [1]. While I could get a 5128*2880 monitor for around $1,500 paying 3* the money for 33% more pixels is bad value for money. Getting 4* the pixels for under 3* the price is good value even when it s a TV with the lower display quality that involves. Before buying this TV I read this blog post by Daniel Lawrence about using an 8K TV as a primary monitor [2]. While he has an interesting setup with a 65 TV on a large desk it s not what I plan to do at this time. My Plans for Use I don t plan to make it a main monitor. While 5120*2160 isn t as good as I like on my desk it s bearable and the quality of the display is high. High resolution isn t needed for all tasks, for example I m writing this blog post on my laptop while watching a movie on the 8K TV. One thing I d like to do with the 8K TV when I get it working as a monitor is to share the screen for team programming projects. I don t have any specific plans other than team coding projects at the moment. But it will be interesting to experiment with it when I get it working. Technical Issues with High Resolution Monitors Hardware Needed A lot of the graphic hardware out there don t support resolutions higher than 5120*2880. It seems that most laptops don t support resolutions higher than that and higher resolutions than 4K are difficult. Only quite recent and high end video cards will do 8K. Apparently the RTX 2080 is one of the oldest ones that does and that s $400 on ebay. Strangely the GPU chipset spec pages don t list the maximum resolution and there s the additional complication that the other chips might not support the resolutions that the GPU itself can support. As an aside I don t use NVidia cards for regular workstations due to reliability problems. But they are good for ML work and for special purpose systems. Interface Versions To do 8K video it seems that you need HDMI 2.1 (or maybe 2.0 with 4:2:0 chroma subsampling) or DisplayPort 1.3 for 30Hz with 24bit color and 2.0 for higher refresh rates. But using a particular version of the interface doesn t require supporting all the resolutions that it might support. This TV has HDMI 2.1 inputs, I ve bought an adaptor cable that does DisplayPort 1.4 to HDMI 2.1 at 8K resolution. So I need a video card that does DisplayPort 1.4 or HDMI 2.1 output. That doesn t mean that the card will work, but it could work. It s a pity that no-one has made a USB-C video controller that has a basic frame-buffer supporting 8K and the minimal GPU capabilities. The consensus of opinion is that no games will run well at 8K at this time so anyone using 8K resolution doesn t need GPU power unless it s for ML stuff. I m thinking of making a system that can be used as a ML server and X/Wayland server so a GPU with a decent amount of RAM and compute power would be good. I m not particularly interested in spending $1,500+ to get a GPU that can drive a $1,568 TV. I m looking into getting a RTX A2000 with 12G of RAM which should be adequate for ML experiments and can handle 8K@60Hz output. I ve ordered a DisplayPort to HDMI converter cable so if I get a DisplayPort card it will work. Software Support When I first got started with 4K monitors I had significant problems in adjusting the UI to be usable. The support for scaling software is much better now than it was then and 8K 65 has a lower DPI than 4K 32 . So I hope this won t be an issue. Progress So Far My first Hisense 8K TV stopped working properly. It would change to a mostly white screen after being used for some time. The screen would change in ways that correlate to changes in what should appear, but not in a way that was usable. It was just a different pattern of white blobs when I changed to a menu view not anything that allowed using it. I presume that this was the problem that drove a need for refurbishment as when I first got the TV it was still signed in to Google accounts for YouTube and to NetFlix. Best Buy Electrical was good about providing a quick replacement, they took away the old TV and delivered a new one on the same visit and it s now working well. I ve obtained a NVidia card that can allegedly do 8K output and a combination of cables that might be able to carry an 8K signal. Now I just need to get the NVidia drivers to not cause a kernel panic to get things to work.

13 December 2024

Emanuele Rocca: Murder Mystery: GCC Builds Failing After sbuild Refactoring

This is the story of an investigation conducted by Jochen Sprickerhof, Helmut Grohne, and myself. It was true teamwork, and we would have not reached the bottom of the issue working individually. We think you will find it as interesting and fun as we did, so here is a brief writeup. A few of the steps mentioned here took several days, others just a few minutes. What is described as a natural progression of events did not always look very obvious at the moment at all.
Let us go through the Six Stages of Debugging together.

Stage 1: That cannot happen
Official Debian GCC builds start failing on multiple architectures in late November.
The build error happens on the build servers when running the testuite, but we know this cannot happen. GCC builds are not meant to fail in case of testsuite failures! Return codes are not making the build fail, make is being called with -k, it just cannot happen.
A lot of the GCC tests are always failing in fact, and an extensive log of the results is posted to the debian-gcc mailing list, but the packages always build fine regardless.
On the build daemons, build failures take several hours.

Stage 2: That does not happen on my machine
Building on my machine running Bookworm is just fine. The Build Daemons run Bookworm and use a Sid chroot for the build environment, just like I am. Same kernel.
The only obvious difference between my setup and the Debian buildds is that I am using sbuild 0.85.0 from bookworm, and the buildds have 0.86.3~bpo12+1 from bookworm-backports. Trying again with 0.86.3~bpo12+1, the build fails on my system too. The build daemons were updated to the bookworm-backports version of sbuild at some point in late November. Ha.

Stage 3: That should not happen
There are quite a few sbuild versions in between 0.85.0 and 0.86.3~bpo12+1, but looking at recent sbuild bugs shows that sbuild 0.86.0 was breaking "quite a number of packages". Indeed, with 0.86.0 the build still fails. Trying the version immediately before, 0.85.11, the build finishes correctly. This took more time than it sounds, one run including the tests takes several hours. We need a way to shorten this somehow.
The Debian packaging of GCC allows to specify which languages you may want to skip, and by default it builds Ada, Go, C, C++, D, Fortran, Objective C, Objective C++, M2, and Rust. When running the tests sequentially, the build logs stop roughly around the tests of a runtime library for D, libphobos. So can we still reproduce the failure by skipping everything except for D? With DEB_BUILD_OPTIONS=nolang=ada,go,c,c++,fortran,objc,obj-c++,m2,rust the build still fails, and it fails faster than before. Several minutes, not hours. This is progress, and time to file a bug. The report contains massive spoilers, so no link. :-)

Stage 4: Why does that happen?
Something is causing the build to end prematurely. It s not the OOM killer, and the kernel does not have anything useful to say in the logs. Can it be that the D language tests are sending signals to some process, and that is what s killing make ? We start tracing signals sent with bpftrace by writing the following script, signals.bt:
tracepoint:signal:signal_generate  
    printf("%s PID %d (%s) sent signal %d to PID %d\n", comm, pid, args->sig, args->pid);
 
And executing it with sudo bpftrace signals.bt.
The build takes its sweet time, and it fails. Looking at the trace output there s a suspicious process.exe terminating stuff.
process.exe (PID: 2868133) sent signal 15 to PID 711826
That looks interesting, but we have no clue what PID 711826 may be. Let s change the script a bit, and trace signals received as well.
tracepoint:signal:signal_generate  
    printf("PID %d (%s) sent signal %d to %d\n", pid, comm, args->sig, args->pid);
 
tracepoint:signal:signal_deliver  
    printf("PID %d (%s) received signal %d\n", pid, comm, args->sig);
 
The working version of sbuild was using dumb-init, whereas the new one features a little init in perl. We patch the current version of sbuild by making it use dumb-init instead, and trace two builds: one with the perl init, one with dumb-init.
Here are the signals observed when building with dumb-init.
PID 3590011 (process.exe) sent signal 2 to 3590014
PID 3590014 (sleep) received signal 9
PID 3590011 (process.exe) sent signal 15 to 3590063
PID 3590063 (std.process tem) received signal 9
PID 3590011 (process.exe) sent signal 9 to 3590065
PID 3590065 (std.process tem) received signal 9
And this is what happens with the new init in perl:
PID 3589274 (process.exe) sent signal 2 to 3589291
PID 3589291 (sleep) received signal 9
PID 3589274 (process.exe) sent signal 15 to 3589338
PID 3589338 (std.process tem) received signal 9
PID 3589274 (process.exe) sent signal 9 to 3589340
PID 3589340 (std.process tem) received signal 9
PID 3589274 (process.exe) sent signal 15 to 3589341
PID 3589274 (process.exe) sent signal 15 to 3589323
PID 3589274 (process.exe) sent signal 15 to 3589320
PID 3589274 (process.exe) sent signal 15 to 3589274
PID 3589274 (process.exe) received signal 9
PID 3589341 (sleep) received signal 9
PID 3589273 (sbuild-usernsex) sent signal 9 to 3589320
PID 3589273 (sbuild-usernsex) sent signal 9 to 3589323
There are a few additional SIGTERM being sent when using the perl init, that s helpful. At this point we are fairly convinced that process.exe is worth additional inspection. The source code of process.d shows something interesting:
1221 @system unittest
1222  
[...]
1247     auto pid = spawnProcess(["sleep", "10000"],
[...]
1260     // kill the spawned process with SIGINT
1261     // and send its return code
1262     spawn((shared Pid pid)  
1263         auto p = cast() pid;
1264         kill(p, SIGINT);
So yes, there s our sleep and the SIGINT (signal 2) right in the unit tests of process.d, just like we have observed in the bpftrace output.
Can we study the behavior of process.exe in isolation, separatedly from the build? Indeed we can. Let s take the executable from a failed build, and try running it under /usr/libexec/sbuild-usernsexec.
First, we prepare a chroot inside a suitable user namespace:
unshare --map-auto --setuid 0 --setgid 0 mkdir /tmp/rootfs
cd /tmp/rootfs
cat /home/ema/.cache/sbuild/unstable-arm64.tar   unshare --map-auto --setuid 0 --setgid 0 tar xf  -
unshare --map-auto --setuid 0 --setgid 0 mkdir /tmp/rootfs/whatever
unshare --map-auto --setuid 0 --setgid 0 cp process.exe /tmp/rootfs/
Now we can run process.exe on its own using the perl init, and trace signals at will:
/usr/libexec/sbuild-usernsexec --pivotroot --nonet u:0:100000:65536  g:0:100000:65536 /tmp/rootfs ema /whatever -- /process.exe
We can compare the behavior of the perl init vis-a-vis the one using dumb-init in milliseconds instead of minutes.

Stage 5: Oh, I see.
Why does process.exe send more SIGTERMs when using the perl init is now the big question. We have a simple reproducer, so this is where using strace becomes possible.
sudo strace --user ema --follow-forks -o sbuild-dumb-init.strace ./sbuild-usernsexec-dumb-init --pivotroot --nonet u:0:100000:65536  g:0:100000:65536 /tmp/dumbroot ema /whatever -- /process.exe
We start comparing the strace output of dumb-init with that of perl-init, looking in particular for different calls to kill.
Here is what process.exe does under dumb-init:
3593883 kill(-2, SIGTERM)               = -1 ESRCH (No such process)
No such process. Under perl-init instead:
3593777 kill(-2, SIGTERM <unfinished ...>
The process is there under perl-init!
That is a kill with negative pid. From the kill(2) man page:
If pid is less than -1, then sig is sent to every process in the process group whose ID is -pid.
It would have been very useful to see this kill with negative pid in the output of bpftrace, why didn t we? The tracepoint used, tracepoint:signal:signal_generate, shows when signals are actually being sent, and not the syscall being called. To confirm, one can trace tracepoint:syscalls:sys_enter_kill and see the negative PIDs, for example:
PID 312719 (bash) sent signal 2 to -312728
The obvious question at this point is: why is there no process group 2 when using dumb-init?

Stage 6: How did that ever work?
We know that process.exe sends a SIGTERM to every process in the process group with ID 2. To find out what this process group may be, we spawn a shell with dumb-init and observe under /proc PIDs 1, 16, and 17. With perl-init we have 1, 2, and 17. When running dumb-init, there are a few forks before launching the program, explaining the difference. Looking at /proc/2/cmdline we see that it s bash, ie. the program we are running under perl-init. When building a package, that is dpkg-buildpackage itself.
The test is accidentally killing its own process group.
Now where does this -2 come from in the test?
2363     // Special values for _processID.
2364     enum invalid = -1, terminated = -2;
Oh. -2 is used as a special value for PID, meaning "terminated". And there s a call to kill() later on:
2694     do   s = tryWait(pid);   while (!s.terminated);
[...]
2697     assertThrown!ProcessException(kill(pid));
What sets pid to terminated you ask?
Here is tryWait:
2568 auto tryWait(Pid pid) @safe
2569  
2570     import std.typecons : Tuple;
2571     assert(pid !is null, "Called tryWait on a null Pid.");
2572     auto code = pid.performWait(false);
And performWait:
2306         _processID = terminated;
The solution, dear reader, is not to kill.
PS: the bug report with spoilers for those interested is #1089007.

12 December 2024

Matthew Garrett: Android privacy improvements break key attestation

Sometimes you want to restrict access to something to a specific set of devices - for instance, you might want your corporate VPN to only be reachable from devices owned by your company. You can't really trust a device that self attests to its identity, for instance by reporting its MAC address or serial number, for a couple of reasons:
If we want a high degree of confidence that the device we're talking to really is the device it claims to be, we need something that's much harder to spoof. For devices with a TPM this is the TPM itself. Every TPM has an Endorsement Key (EK) that's associated with a certificate that chains back to the TPM manufacturer. By verifying that certificate path and having the TPM prove that it's in posession of the private half of the EK, we know that we're communicating with a genuine TPM[1].

Android has a broadly equivalent thing called ID Attestation. Android devices can generate a signed attestation that they have certain characteristics and identifiers, and this can be chained back to the manufacturer. Obviously providing signed proof of the device identifier is kind of problematic from a privacy perspective, so the short version[2] is that only apps installed using a corporate account rather than a normal user account are able to do this.

But that's still not ideal - the device identifiers involved included the IMEI and serial number of the device, and those could potentially be used to correlate devices across privacy boundaries since they're static[3] identifiers that are the same both inside a corporate work profile and in the normal user profile, and also remains static if you move between different employers and use the same phone[4]. So, since Android 12, ID Attestation includes an "Enterprise Specific ID" or ESID. The ESID is based on a hash of device-specific data plus the enterprise that the corporate work profile is associated with. If a device is enrolled with the same enterprise then this ID will remain static, if it's enrolled with a different enterprise it'll change, and it just doesn't exist outside the work profile at all. The other device identifiers are no longer exposed.

But device ID verification isn't enough to solve the underlying problem here. When we receive a device ID attestation we know that someone at the far end has posession of a device with that ID, but we don't know that that device is where the packets are originating. If our VPN simply has an API that asks for an attestation from a trusted device before routing packets, we could pass that on to said trusted device and then simply forward the attestation to the VPN server[5]. We need some way to prove that the the device trying to authenticate is actually that device.

The answer to this is key provenance attestation. If we can prove that an encryption key was generated on a trusted device, and that the private half of that key is stored in hardware and can't be exported, then using that key to establish a connection proves that we're actually communicating with a trusted device. TPMs are able to do this using the attestation keys generated in the Credential Activation process, giving us proof that a specific keypair was generated on a TPM that we've previously established is trusted.

Android again has an equivalent called Key Attestation. This doesn't quite work the same way as the TPM process - rather than being tied back to the same unique cryptographic identity, Android key attestation chains back through a separate cryptographic certificate chain but contains a statement about the device identity - including the IMEI and serial number. By comparing those to the values in the device ID attestation we know that the key is associated with a trusted device and we can now establish trust in that key.

"But Matthew", those of you who've been paying close attention may be saying, "Didn't Android 12 remove the IMEI and serial number from the device ID attestation?" And, well, congratulations, you were apparently paying more attention than Google. The key attestation no longer contains enough information to tie back to the device ID attestation, making it impossible to prove that a hardware-backed key is associated with a specific device ID attestation and its enterprise enrollment.

I don't think this was any sort of deliberate breakage, and it's probably more an example of shipping the org chart - my understanding is that device ID attestation and key attestation are implemented by different parts of the Android organisation and the impact of the ESID change (something that appears to be a legitimate improvement in privacy!) on key attestation was probably just not realised. But it's still a pain.

[1] Those of you paying attention may realise that what we're doing here is proving the identity of the TPM, not the identity of device it's associated with. Typically the TPM identity won't vary over the lifetime of the device, so having a one-time binding of those two identities (such as when a device is initially being provisioned) is sufficient. There's actually a spec for distributing Platform Certificates that allows device manufacturers to bind these together during manufacturing, but I last worked on those a few years back and don't know what the current state of the art there is

[2] Android has a bewildering array of different profile mechanisms, some of which are apparently deprecated, and I can never remember how any of this works, so you're not getting the long version

[3] Nominally, anyway. Cough.

[4] I wholeheartedly encourage people not to put work accounts on their personal phones, but I am a filthy hypocrite here

[5] Obviously if we have the ability to ask for attestation from a trusted device, we have access to a trusted device. Why not simply use the trusted device? The answer there may be that we've compromised one and want to do as little as possible on it in order to reduce the probability of triggering any sort of endpoint detection agent, or it may be because we want to run on a device with different security properties than those enforced on the trusted device.

comment count unavailable comments

10 December 2024

Russ Allbery: 2024 book haul

I haven't made one of these posts since... last year? Good lord. I've therefore already read and reviewed a lot of these books. Kemi Ashing-Giwa The Splinter in the Sky (sff)
Moniquill Blackgoose To Shape a Dragon's Breath (sff)
Ashley Herring Blake Delilah Green Doesn't Care (romance)
Ashley Herring Blake Astrid Parker Doesn't Fail (romance)
Ashley Herring Blake Iris Kelly Doesn't Date (romance)
Molly J. Bragg Scatter (sff)
Sarah Rees Breenan Long Live Evil (sff)
Michelle Browne And the Stars Will Sing (sff)
Steven Brust Lyorn (sff)
Miles Cameron Beyond the Fringe (sff)
Miles Cameron Deep Black (sff)
Haley Cass Those Who Wait (romance)
Sylvie Cathrall A Letter to the Luminous Deep (sff)
Ta-Nehisi Coates The Message (non-fiction)
Julie E. Czerneda To Each This World (sff)
Brigid Delaney Reasons Not to Worry (non-fiction)
Mar Delaney Moose Madness (sff)
Jerusalem Demsas On the Housing Crisis (non-fiction)
Michelle Diener Dark Horse (sff)
Michelle Diener Dark Deeds (sff)
Michelle Diener Dark Minds (sff)
Michelle Diener Dark Matters (sff)
Elaine Gallagher Unexploded Remnants (sff)
Bethany Jacobs These Burning Stars (sff)
Bethany Jacobs On Vicious Worlds (sff)
Micaiah Johnson Those Beyond the Wall (sff)
T. Kingfisher Paladin's Faith (sff)
T.J. Klune Somewhere Beyond the Sea (sff)
Mark Lawrence The Book That Wouldn't Burn (sff)
Mark Lawrence The Book That Broke the World (sff)
Mark Lawrence Overdue (sff)
Mark Lawrence Returns (sff collection)
Malinda Lo Last Night at the Telegraph Club (historical)
Jessie Mihalik Hunt the Stars (sff)
Samantha Mills The Wings Upon Her Back (sff)
Lyda Morehouse Welcome to Boy.net (sff)
Cal Newport Slow Productivity (non-fiction)
Naomi Novik Buried Deep and Other Stories (sff collection)
Claire O'Dell The Hound of Justice (sff)
Keanu Reeves & China Mi ville The Book of Elsewhere (sff)
Kit Rocha Beyond Temptation (sff)
Kit Rocha Beyond Jealousy (sff)
Kit Rocha Beyond Solitude (sff)
Kit Rocha Beyond Addiction (sff)
Kit Rocha Beyond Possession (sff)
Kit Rocha Beyond Innocence (sff)
Kit Rocha Beyond Ruin (sff)
Kit Rocha Beyond Ecstasy (sff)
Kit Rocha Beyond Surrender (sff)
Kit Rocha Consort of Fire (sff)
Geoff Ryman HIM (sff)
Melissa Scott Finders (sff)
Rob Wilkins Terry Pratchett: A Life with Footnotes (non-fiction)
Gabrielle Zevin Tomorrow, and Tomorrow, and Tomorrow (mainstream)
That's a lot of books, although I think I've already read maybe a third of them? Which is better than I usually do.

9 December 2024

Gunnar Wolf: Some tips for those who still administer Drupal7-based sites

A bit of history: Drupal at my workplace (and in Debian) My main day-to-day responsibility in my workplace is, and has been for 20 years, to take care of the network infrastructure for UNAM s Economics Research Institute. One of the most visible parts of this responsibility is to ensure we have a working Web presence, and that it caters for the needs of our academic community. I joined the Institute in January 2005. Back then, our designer pushed static versions of our webpage, completely built in her computer. This was standard practice at the time, and lasted through some redesigns, but I soon started advocating for the adoption of a Content Management System. After evaluating some alternatives, I recommended adopting Drupal. It took us quite a bit to do the change: even though I clearly recall starting work toward adopting it as early as 2006, according to the Internet Archive, we switched to a Drupal-backed site around June 2010. We started using it somewhere in the version 6 s lifecycle. As for my Debian work, by late 2012 I started getting involved in the maintenance of the drupal7 package, and by April 2013 I became its primary maintainer. I kept the drupal7 package up to date in Debian until 2018; the supported build methods for Drupal 8 are not compatible with Debian (mainly, bundling third-party libraries and updating them without coordination with the rest of the ecosystem), so towards the end of 2016, I announced I would not package Drupal 8 for Debian. By March 2016, we migrated our main page to Drupal 7. By then, we already had several other sites for our academics projects, but my narrative follows our main Web site. I did manage to migrate several Drupal 6 (D6) sites to Drupal 7 (D7); it was quite involved process, never transparent to the user, and we did have the backlash of long downtimes (or partial downtimes, with sites half-available only) with many of our users. For our main site, we took the opportunity to do a complete redesign and deployed a fully new site. You might note that March 2016 is after the release of D8 (November 2015). I don t recall many of the specifics for this decision, but if I m not mistaken, building the new site was a several months long process not only for the technical work of setting it up, but for the legwork of getting all of the needed information from the different areas that need to be represented in the Institute. Not only that: Drupal sites often include tens of contributed themes and modules; the technological shift the project underwent between its 7 and 8 releases was too deep, and modules took a long time (if at all many themes and modules were outright dumped) to become available for the new release. Naturally, the Drupal Foundation wanted to evolve and deprecate the old codebase. But the pain to migrate from D7 to D8 is too big, and many sites have remained under version 7 Eight years after D8 s release, almost 40% of Drupal installs are for version 7, and a similar proportion runs a currently-supported release (10 or 11). And while the Drupal Foundation made a great job at providing very-long-term support for D7, I understand the burden is becoming too much, so close to a year ago (and after pushing several times the D7, they finally announced support will finish this upcoming January 5.

Drupal 7 must go! I found the following usage graphs quite interesting: the usage statistics for all Drupal versions follows a very positive slope, peaking around 2014 during the best years of D7, and somewhat stagnating afterwards, staying since 2015 at the 25000 28000 sites mark (I m very tempted to copy the graphs, but builtwith s terms of use are very clear in not allowing it). There is a sharp drop in the last year I attribute it to the people that are leaving D7 for other technologies after its end-of-life announcement. This becomes clearer looking only at D7 s usage statistics: D7 peaks at 15000 installs in 2016 stays there for close to 5 years, and has a sharp drop to under 7500 sites in the span of one year. D8 has a more regular rise, peak and fall peaking at ~8500 between 2020 and 2021, and down to close to 2500 for some months already; D9 has a very brief peak of almost 9000 sites in 2023 and is now close to half of it. Currently, the Drupal king appears to be D10, still on a positive slope and with over 9000 sites. Drupal 11 is still just a blip in builtwith s radar, with 3 registered sites as of September 2024 :- After writing this last paragraph, I came across the statistics found in the Drupal webpage; the methodology for acquiring its data is completely different: while builtwith s methodology is their trade secret, you can read more about how Drupal s data is gathered (and agree or disagree with it , but at least you have a page detailing 12 years so far of reported data, producing the following graph (which can be shared under the CC BY-SA license ): Drupal usage statistics by version 2013 2024 This graph is disgregated into minor versions, and I don t want to come up with yet another graph for it but it supports (most of) the narrative I presented above although I do miss the recent drop builtwith reported in D7 s numbers!

And what about Backdrop? During the D8 release cycle, a group of Drupal developers were not happy with the depth of the architectural changes that were being adopted, particularly the transition to the Symfony PHP component framework, and forked the D7 codebase to create the Backdrop CMS, a modern version of Drupal, without dropping the known and tested architecture it had. The Backdrop developers keep working closely together with the Drupal community, and although its usage numbers are way smaller than Drupal s, seems to be sustainable and lively. Of course, as I presented their numbers in the previous section, you can see Backdrop s numbers in builtwith are way, way lower. I have found it to be a very warm and welcoming community, eager to receive new members. And, thanks to its contributed D2B Migrate module, I found it is quite easy to migrate a live site from Drupal 7 to Backdrop.

Migration by playbook! So Well, I m an academic. And (if it s not obvious to you after reading so far ), one of the things I must do in my job is to write. So I decided to write an article to invite my colleagues to consider Backdrop for their D7 sites in Cuadernos T cnicos Universitarios de la DGTIC, a young journal in our university for showcasing technical academical work. And now that my article got accepted and published, I m happy to share it with you of course, if you can read Spanish But anyway Given I have several sites to migrate, and that I m trying to get my colleagues to follow suite, I decided to automatize the migration by writing an Ansible playbook to do the heavy lifting. Of course, the playbook s users will probably need to tweak it a bit to their personal needs. I m also far from an Ansible expert, so I m sure there is ample room fo improvement in my style. But it works. Quite well, I must add.

But with this size of database I did stumble across a big pebble, though. I am working on the migration of one of my users sites, and found that its database is huge. I checked the mysqldump output, and it got me close to 3GB of data. And given the D2B_migrate is meant to work via a Web interface (my playbook works around it by using a client I wrote with Perl s WWW::Mechanize), I repeatedly stumbled with PHP s maximum POST size, maximum upload size, maximum memory size I asked for help in Backdrop s Zulip chat site, and my attention was taken off fixing PHP to something more obvious: Why is the database so large? So I took a quick look at the database (or rather: my first look was at the database server s filesystem usage). MariaDB stores each table as a separate file on disk, so I looked for the nine largest tables:
# ls -lhS head
total 3.8G
-rw-rw---- 1 mysql mysql 2.4G Dec 10 12:09 accesslog.ibd
-rw-rw---- 1 mysql mysql 224M Dec  2 16:43 search_index.ibd
-rw-rw---- 1 mysql mysql 220M Dec 10 12:09 watchdog.ibd
-rw-rw---- 1 mysql mysql 148M Dec  6 14:45 cache_field.ibd
-rw-rw---- 1 mysql mysql  92M Dec  9 05:08 aggregator_item.ibd
-rw-rw---- 1 mysql mysql  80M Dec 10 12:15 cache_path.ibd
-rw-rw---- 1 mysql mysql  72M Dec  2 16:39 search_dataset.ibd
-rw-rw---- 1 mysql mysql  68M Dec  2 13:16 field_revision_field_idea_principal_articulo.ibd
-rw-rw---- 1 mysql mysql  60M Dec  9 13:19 cache_menu.ibd
A single table, the access log, is over 2.4GB long. The three following tables are, cache tables. I can perfectly live without their data in our new site! But I don t want to touch the slightest bit of this site until I m satisfied with the migration process, so I found a way to exclude those tables in a non-destructive way: given D2B_migrate works with a mysqldump output, and given that mysqldump locks each table before starting to modify it and unlocks it after its job is done, I can just do the following:
$ perl -e '$output = 1; while (<>)   $output=0 if /^LOCK TABLES  (accesslog search_index watchdog cache_field cache_path) /; $output=1 if /^UNLOCK TABLES/; print if $output ' < /tmp/d7_backup.sql  > /tmp/d7_backup.eviscerated.sql; ls -hl /tmp/d7_backup.sql /tmp/d7_backup.eviscerated.sql
-rw-rw-r-- 1 gwolf gwolf 216M Dec 10 12:22 /tmp/d7_backup.eviscerated.sql
-rw------- 1 gwolf gwolf 2.1G Dec  6 18:14 /tmp/d7_backup.sql
Five seconds later, I m done! The database is now a tenth of its size, and D2B_migrate is happy to take it. And I m a big step closer to finishing my reliance on (this bit of) legacy code for my highly-visible sites

Thorsten Alteholz: My Debian Activities in November 2024

Debian LTS This was my hundred-twenty-fifth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded or worked on: I also continued to work on a fix for glewlwyd, which is more difficult than expected. Besides I started to work on ffmpeg and haproxy. Last but not least I did a week of FD this month and attended the monthly LTS/ELTS meeting. Debian ELTS This month was the seventy-sixth ELTS month. During my allocated time I uploaded or worked on: I also started to work on a fix for kmail-account-wizzard. Unfortunately preparing a testing environment takes some time and I did not finish testing this month. Besides I started to work on ffmpeg and haproxy. Last but not least I did a week of FD this month and attended the monthly LTS/ELTS meeting. Debian Printing Unfortunately I didn t found any time to work on this topic. Debian Matomo Unfortunately I didn t found any time to work on this topic. Debian Astro This month I uploaded new packages or new upstream or bugfix versions of: I also sponsored an upload of calceph. Debian IoT This month I uploaded new upstream or bugfix versions of: Debian Mobcom This month I uploaded new packages or new upstream or bugfix versions of: misc This month I uploaded new upstream or bugfix versions of: I also did some NMU of opensta, kdrill, glosstex, irsim, pagetools, afnix, cpm, to fix some RC bugs. FTP master This month I accepted 266 and rejected 16 packages. The overall number of packages that got accepted was 269.

7 December 2024

Russ Allbery: Review: Dark Minds

Review: Dark Minds, by Michelle Diener
Series: Class 5 #3
Publisher: Eclipse
Copyright: 2016
ISBN: 0-6454658-5-2
Format: Kindle
Pages: 328
Dark Minds is the third book in the self-published Class 5 science-fiction romance series. It is a sequel to Dark Deeds and should not be read out of order, but as with the other books of the series, it follows new protagonists. Imogen, like another Earth woman before her, was kidnapped by a Class 5 for experimentation by the Tecran. She was subsequently transferred to a research facility and has been held in cages and secret military bases for over three months. As the story opens, she had been imprisoned in a Tecran transport for a couple of weeks when the transport is taken by the Krik who kill all of the Tecran crew. Next stop: another Class 5. That same Class 5 is where the Krik are taking Camlar Kalor and his team. Cam is a Grih investigator from the United Council, sent to look into reports of a second Earth woman rescued from a Garmman ship. (The series reader will recognize this as the plot of Dark Deeds.) Now he's trapped with the crews of a bunch of other random ships in the cargo hold of a Class 5 with Krik instead of Tecran roaming the halls, apparently at the behest of the ship's banned AI, and a mysterious third Earth woman already appears to be befriending it. Imogen is the woman Fiona saw signs of in Dark Deeds. She's had a rougher time than the protagonists of the previous two books in this series, and she's been dropped into a less stable situation. The Class 5 she's brought to at the start of the story is far more suspicious (with quite a lot of cause) and somewhat more hostile than the AIs we've encountered previously. The rest of the story formula is roughly the same, though: hunky Grih officer with a personality completely indistinguishable from the hunky Grih officers in the previous two books, an AI with a sketchy concept of morality that desperately needs an ally, the Grih obsession with singing, the eventual discovery of useful armor and weaponry that Imogen can use to surprise people, and more political maneuvering over the Sentient Beings Agreement. This entry in the series mostly abandons the Grih shock and horror at how badly the Earth women have been treated. This makes sense, given how dangerous Earth women have proven over the course of this series, and I like that Diener is changing the political dynamics as the story develops. I do sometimes miss that appalled anger of Dark Horse, but Dark Minds focuses more on the politics and corridor fighting of a tense multi-sided stand-off. I found the action more gripping in Dark Minds than in Dark Deeds, and I liked Imogen more as a character than Fiona. She doesn't have the delightfully calm competence of Rose from Dark Horse, but she's a bit more hardened, a bit more canny, and is better at taking control of situations. I also like that Diener avoids simplistically pairing Earth women off with Class 5s. The series plot is progressing faster than I had expected, and that gives this book a somewhat different shape than the previous ones. Cam is probably the least interesting of the men in this series so far and appears to exist solely to take up the man-shaped hole in the plot. This is not a great series for gender roles; thankfully, the romance is a small part of the plot and largely ignorable. The story is about the women and the AIs, and all of the women and most of the AIs of the previous books make an appearance. It's clear they're forming an alliance whether the Grih like it or not, and that part of the story was very satisfying. Up to this book, this series had been all feel-good happy endings. I will risk the small spoiler and warn that this is not true to the same degree here, so you may not want to read this one if you want something entirely fluffy, light, and positive (inasmuch as a series involving off-screen experimentation on humans can be fluffy, light, and positive). That caught me by surprise in a way I didn't entirely like, and I wish Diener had stuck with the entirely positive tone. Other than that, though, this was fun, light, readable entertainment. It's not going to win any literary awards, it's formulaic, the male protagonist comes from central casting, and the emphasis by paragraph break is still a bit grating in places, but I will probably pick up the next book when I'm in the mood for something light. Dark Minds is an improvement over book two, which bodes well for the rest of the series. Followed by Dark Matters. Rating: 7 out of 10

5 December 2024

Reproducible Builds: Reproducible Builds in November 2024

Welcome to the November 2024 report from the Reproducible Builds project! Our monthly reports outline what we ve been up to over the past month and highlight items of news from elsewhere in the world of software supply-chain security where relevant. As ever, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. Reproducible Builds mourns the passing of Lunar
  2. Introducing reproduce.debian.net
  3. New landing page design
  4. SBOMs for Python packages
  5. Debian updates
  6. Reproducible builds by default in Maven 4
  7. PyPI now supports digital attestations
  8. Dependency Challenges in OSS Package Registries
  9. Zig programming language demonstrated reproducible
  10. Website updates
  11. Upstream patches
  12. Misc development news
  13. Reproducibility testing framework

Reproducible Builds mourns the passing of Lunar The Reproducible Builds community sadly announced it has lost its founding member, Lunar. J r my Bobbio aka Lunar passed away on Friday November 8th in palliative care in Rennes, France. Lunar was instrumental in starting the Reproducible Builds project in 2013 as a loose initiative within the Debian project. He was the author of our earliest status reports and many of our key tools in use today are based on his design. Lunar s creativity, insight and kindness were often noted. You can view our full tribute elsewhere on our website. He will be greatly missed.

Introducing reproduce.debian.net In happier news, this month saw the introduction of reproduce.debian.net. Announced at the recent Debian MiniDebConf in Toulouse, reproduce.debian.net is an instance of rebuilderd operated by the Reproducible Builds project. rebuilderd is our server designed monitor the official package repositories of Linux distributions and attempts to reproduce the observed results there. In November, reproduce.debian.net began rebuilding Debian unstable on the amd64 architecture, but throughout the MiniDebConf, it had attempted to rebuild 66% of the official archive. From this, it could be determined that it is currently possible to bit-for-bit reproduce and corroborate approximately 78% of the actual binaries distributed by Debian that is, using the .buildinfo files hosted by Debian itself. reproduce.debian.net also contains instructions how to setup one s own rebuilderd instance, and we very much invite everyone with a machine to spare to setup their own version and to share the results. Whilst rebuilderd is still in development, it has been used to reproduce Arch Linux since 2019. We are especially looking for installations targeting Debian architectures other than i386 and amd64.

New landing page design As part of a very productive partnership with the Sovereign Tech Fund and Neighbourhoodie, we are pleased to unveil our new homepage/landing page. We are very happy with our collaboration with both STF and Neighbourhoodie (including many changes not directly related to the website), and look forward to working with them in the future.

SBOMs for Python packages The Python Software Foundation has announced a new cross-functional project for SBOMs and Python packages . Seth Michael Larson writes that the project is specifically looking to solve these issues :
  • Enable Python users that require SBOM documents (likely due to regulations like CRA or SSDF) to self-serve using existing SBOM generation tools.
  • Solve the phantom dependency problem, where non-Python software is bundled in Python packages but not recorded in any metadata. This makes the job of software composition analysis (SCA) tools difficult or impossible.
  • Make the adoption work by relevant projects such as build backends, auditwheel-esque tools, as minimal as possible. Empower users who are interested in having better SBOM data for the Python projects they are using to be able to contribute engineering time towards that goal.
A GitHub repository for the initiative is available, and there are a number of queries, comments and remarks on Seth s Discourse forum post.

Debian updates There was significant development within Debian this month. Firstly, at the recent MiniDebConf in Toulouse, France, Holger Levsen gave a Debian-specific talk on rebuilding packages distributed from ftp.debian.org that is to say, how to reproduce the results from the official Debian build servers: Holger described the talk as follows:
For more than ten years, the Reproducible Builds project has worked towards reproducible builds of many projects, and for ten years now we have build Debian packages twice with maximal variations applied to see if they can be build reproducible still. Since about a month, we ve also been rebuilding trying to exactly match the builds being distributed via ftp.debian.org. This talk will describe the setup and the lessons learned so far, and why the results currently are what they are (spoiler: they are less than 30% reproducible), and what we can do to fix that.
The Debian Project Leader, Andreas Tille, was present at the talk and remarked later in his Bits from the DPL update that:
It might be unfair to single out a specific talk from Toulouse, but I d like to highlight the one on reproducible builds. Beyond its technical focus, the talk also addressed the recent loss of Lunar, whom we mourn deeply. It served as a tribute to Lunar s contributions and legacy. Personally, I ve encountered packages maintained by Lunar and bugs he had filed. I believe that taking over his packages and addressing the bugs he reported is a meaningful way to honor his memory and acknowledge the value of his work.
Holger s slides and video in .webm format are available.
Next, rebuilderd is the server to monitor package repositories of Linux distributions and attempt to reproduce the observed results. This month, version 0.21.0 released, most notably with improved support for binNMUs by Jochen Sprickerhof and updating the rebuilderd-debian.sh integration to the latest debrebuild version by Holger Levsen. There has also been significant work to get the rebuilderd package into the Debian archive, in particular, both rust-rebuilderd-common version 0.20.0-1 and rust-rust-lzma version 0.6.0-1 were packaged by kpcyrd and uploaded by Holger Levsen. Related to this, Holger Levsen submitted three additional issues against rebuilderd as well:
  • rebuildctl should be more verbose when encountering issues. [ ]
  • Please add an option to used randomised queues. [ ]
  • Scheduling and re-scheduling multiple packages at once. [ ]
and lastly, Jochen Sprickerhof submitted one an issue requested that rebuilderd downloads the source package in addition to the .buildinfo file [ ] and kpcyrd also submitted and fixed an issue surrounding dependencies and clarifying the license [ ]
Separate to this, back in 2018, Chris Lamb filed a bug report against the sphinx-gallery package as it generates unreproducible content in various ways. This month, however, Dmitry Shachnev finally closed the bug, listing the multiple sub-issues that were part of the problem and how they were resolved.
Elsewhere, Roland Clobus posted to our mailing list this month, asking for input on a bug in Debian s ca-certificates-java package. The issue is that the Java key management tools embed timestamps in its output, and this output ends up in the /etc/ssl/certs/java/cacerts file on the generated ISO images. A discussion resulted from Roland s post suggesting some short- and medium-term solutions to the problem.
Holger Levsen uploaded some packages with reproducibility-related changes:
Lastly, 12 reviews of Debian packages were added, 5 were updated and 21 were removed this month adding to our knowledge about identified issues in Debian.

Reproducible builds by default in Maven 4 On our mailing list this month, Herv Boutemy reported the latest release of Maven (4.0.0-beta-5) has reproducible builds enabled by default. In his mailing list post, Herv mentions that this story started during our Reproducible Builds summit in Hamburg , where he created the upstream issue that builds on a multi-year effort to have Maven builds configured for reproducibility.

PyPI now supports digital attestations Elsewhere in the Python ecosystem and as reported on LWN and elsewhere, the Python Package Index (PyPI) has announced that it has finalised support for PEP 740 ( Index support for digital attestations ). Trail of Bits, who performed much of the development work, has an in-depth blog post about the work and its adoption, as well as what is left undone:
One thing is notably missing from all of this work: downstream verification. [ ] This isn t an acceptable end state (cryptographic attestations have defensive properties only insofar as they re actually verified), so we re looking into ways to bring verification to individual installing clients. In particular, we re currently working on a plugin architecture for pip that will enable users to load verification logic directly into their pip install flows.
There was an in-depth discussion on LWN s announcement page, as well as on Hacker News.

Dependency Challenges in OSS Package Registries At BENEVOL, the Belgium-Netherlands Software Evolution workshop in Namur, Belgium, Tom Mens and Alexandre Decan presented their paper, An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries . The abstract of their paper is as follows:
While open-source software has enabled significant levels of reuse to speed up software development, it has also given rise to the dreadful dependency hell that all software practitioners face on a regular basis. This article provides a catalogue of dependency-related challenges that come with relying on OSS packages or libraries. The catalogue is based on the scientific literature on empirical research that has been conducted to understand, quantify and overcome these challenges. [ ]
A PDF of the paper is available online.

Zig programming language demonstrated reproducible Motiejus Jak ty posted an interesting and practical blog post on his successful attempt to reproduce the Zig programming language without using the pre-compiled binaries checked into the repository, and despite the circular dependency inherent in its bootstrapping process. As a summary, Motiejus concludes that:
I can now confidently say (and you can also check, you don t need to trust me) that there is nothing hiding in zig1.wasm [the checked-in binary] that hasn t been checked-in as a source file.
The full post is full of practical details, and includes a few open questions.

Website updates Notwithstanding the significant change to the landing page (screenshot above), there were an enormous number of changes made to our website this month. This included:
  • Alex Feyerke and Mariano Gim nez:
    • Dramatically overhaul the website s landing page with new benefit cards tailored to the expected visitors to our website and a reworking of the visual hierarchy and design. [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
  • Bernhard M. Wiedemann:
    • Update the System images page to document the e2fsprogs approach. [ ]
  • Chris Lamb:
  • FC (Fay) Stegerman:
    • Replace more inline markdown with HTML on the Success stories page. [ ]
    • Add some links, fix some other links and correct some spelling errors on the Tools page. [ ]
  • Holger Levsen:
    • Add a historical presentation ( Reproducible builds everywhere eg. in Debian, OpenWrt and LEDE ) from October 2016. [ ]
    • Add jochensp and Oejet to the list of known contributors. [ ][ ]
  • Julia Kr ger:
  • Ninette Adhikari & hulkoba:
    • Add/rework the list of success stories into a new page that clearly shows milestones in Reproducible Builds. [ ][ ][ ][ ][ ][ ]
  • Philip Rinn:
    • Import 47 historical weekly reports. [ ]
  • hulkoba:
    • Add alt text to almost all images (!). [ ][ ]
    • Fix a number of links on the Talks . [ ][ ]
    • Avoid so-called ghost buttons by not using <button> elements as links, as the affordance of a <button> implies an action with (potentially) a side effect. [ ][ ]
    • Center the sponsor logos on the homepage. [ ]
    • Move publications and generate them instead from a data.yml file with an improved layout. [ ][ ]
    • Make a large number of small but impactful stylisting changes. [ ][ ][ ][ ]
    • Expand the Tools to include a number of missing tools, fix some styling issues and fix a number of stale/broken links. [ ][ ][ ][ ][ ][ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Misc development news

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In November, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related changes:
    • Create and introduce a new reproduce.debian.net service and subdomain [ ]
    • Make a large number of documentation changes relevant to rebuilderd. [ ][ ][ ][ ][ ]
    • Explain a temporary workaround for a specific issue in rebuilderd. [ ]
    • Setup another rebuilderd instance on the o4 node and update installation documentation to match. [ ][ ]
    • Make a number of helpful/cosmetic changes to the interface, such as clarifying terms and adding links. [ ][ ][ ][ ][ ]
    • Deploy configuration to the /opt and /var directories. [ ][ ]
    • Add an infancy (or alpha ) disclaimer. [ ][ ]
    • Add more notes to the temporary rebuilderd documentation. [ ]
    • Commit an nginx configuration file for reproduce.debian.net s Stats page. [ ]
    • Commit a rebuilder-worker.conf configuration for the o5 node. [ ]
  • Debian-related changes:
    • Grant jspricke and jochensp access to the o5 node. [ ][ ]
    • Build the qemu package with the nocheck build flag. [ ]
  • Misc changes:
    • Adapt the update_jdn.sh script for new Debian trixie systems. [ ]
    • Stop installing the PostgreSQL database engine on the o4 and o5 nodes. [ ]
    • Prevent accidental reboots of the o4 node because of a long-running job owned by josch. [ ][ ]
In addition, Mattia Rizzolo addressed a number of issues with reproduce.debian.net [ ][ ][ ][ ]. And lastly, both Holger Levsen [ ][ ][ ][ ] and Vagrant Cascadian [ ][ ][ ][ ] performed node maintenance.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Next.

Previous.