Search Results: "aph"

21 January 2025

Ravi Dwivedi: The Arduous Luxembourg Visa Process

In 2024, I was sponsored by The Document Foundation (TDF) to attend the LibreOffice annual conference in Luxembourg from the 10th to the 12th of October. Being an Indian passport holder, I needed a visa to visit Luxembourg. However, due to my Kenya trip coming up in September, I ran into a dilemma: whether to apply before or after the Kenya trip. To obtain a visa, I needed to submit my application with VFS Global (and not with the Luxembourg embassy directly). Therefore, I checked the VFS website for information on processing time, which says:
As a rule, the processing time of an admissible Schengen visa application should not exceed 15 calendar days (from the date the application is received at the Embassy).
It also mentions:
If the application is received less than 15 calendar days before the intended travel date, the Embassy can deem your application inadmissible. If so, your visa application will not be processed by the Embassy and the application will be sent back to VFS along with the passport.
If I applied for the Luxembourg visa before my trip, I would run the risk of not getting my passport back in time, and therefore missing my Kenya flight. On the other hand, if I waited until after returning from Kenya, I would run afoul of the aforementioned 15 working days needed by the embassy to process my application. I had previously applied for a Schengen visa for Austria, which was completed in 7 working days. My friends who had been to France told me they got their visa decision within a week. So, I compared Luxembourg's application numbers with those of other Schengen countries. In 2023, Luxembourg received 3,090 applications from India, while Austria received 39,558, Italy received 52,332 and France received 176,237. Since Luxembourg receives far fewer applications, I expected the process to be quick.
Therefore, I submitted my visa application with VFS Global in Delhi on the 5th of August, giving the embassy a month with 18 working days before my Kenya trip. However, I didn't mention my Kenya trip in the Luxembourg visa application. For reference, here is a list of documents I submitted:
I submitted flight reservations instead of flight tickets. This is because, in case of a visa rejection, I would have lost a significant amount of money if I had booked confirmed flight tickets. The embassy also recommends the same. After the submission of documents, my fingerprints were taken. The expenses for the visa application were as follows:
Service Description    Amount (INR)
Visa Fee                      8,114
VFS Global Fee                1,763
Courier                         800
Total                        10,677
Going by the emails sent by VFS, my application reached the Luxembourg embassy the next day. Fast-forward to the 27th of August, the 14th day of my visa application. I had already booked my flight ticket to Nairobi for the 4th of September, but my passport was still with the Luxembourg embassy, and I hadn't heard back. In addition, I had also obtained Kenya's eTA and got vaccinated for Yellow Fever, a requirement to travel to Kenya. In order to check on my application status, I gave the embassy a phone call, but missed their calling window, which was easy to miss since it was only 1 hour - 12:00 to 1:00 PM. So, I dropped them an email explaining my situation. At this point, I was already wondering whether to cancel the Kenya trip or the Luxembourg one, if I had to choose.
After not getting a response to my email, I called them again the next day. The embassy told me they would look into it and asked me to send my flight tickets over email. One week to go before my flight now. I followed up with the embassy on the 30th by a phone call, and the person who picked up the call told me that my request had already been forwarded to the concerned department and was under process. They asked me to follow up on Monday, 2nd September. During the visa process, I was in touch with three other Indian attendees.[1] In the meantime, I got to know that all of them had applied for a Luxembourg visa by the end of the month of August.
Back to our story: over the next two days, the embassy was closed for the weekend. I began weighing my options. On one hand, I could cancel the Kenya trip and hope that Luxembourg went through. Even then, Luxembourg wasn't guaranteed as the visa could get rejected, so I might have ended up missing both trips. On the other hand, I could cancel the Luxembourg visa application and at least be sure of going to Kenya. However, I thought it would make Luxembourg very unlikely because it didn't leave 15 working days for the embassy to process my visa after returning from Kenya. I also badly wanted to attend the LibreOffice conference because I couldn't make it two years ago. Therefore, I chose not to cancel my Luxembourg visa application. I checked with my travel agent and learned that I could cancel my Nairobi flight before September 4th for a cancellation fee of approximately 7,000 INR.
On the 2nd of September, I was a bit frustrated because I hadn't heard anything from the embassy regarding my request. Therefore, I called the embassy again. They assured me that they would arrange a call for me from the concerned department that day, which I did receive later that evening. During the call, they offered to return my passport via VFS the next day and asked me to resubmit it after returning from Kenya. I immediately accepted the offer and was overjoyed, as it would enable me to take my flight to Nairobi without canceling my Luxembourg visa application. However, I didn't have the offer in writing, so it wasn't clear to me how I would collect my passport from VFS. I received it in writing the next day, while I was on my way to VFS, in the form of an email from the embassy, which read:
Dear Mr. Dwivedi, We acknowledge the receipt of your email. As you requested, we are returning your passport exceptionally through VFS, you can collect it directly from VFS Delhi Center between 14:00-17:00 hrs, 03 Sep 2024. Kindly bring the printout of this email along with your VFS deposit receipt and Original ID proof. Once you are back from your trip, you can redeposit the passport with VFS Luxembourg for our processing. With best regards,
Consular Section GRAND DUCHY OF LUXEMBOURG
Embassy in New Delhi
I took a printout of the email and submitted it to VFS to get my passport. This seemed like a miracle - just when I had lost all hope of making it to my Kenya flight and was mentally preparing myself to miss it, I got my passport back exceptionally and now I had to mentally prepare again for Kenya. I had never before heard of an embassy returning a passport before completing the visa process. The next day, I took my flight to Nairobi as planned. In case you are interested, I have written two blog posts on my Kenya trip - one on the OpenStreetMap conference in Nairobi and the other on my travel experience in Kenya.
After returning from Kenya, I resubmitted my passport on the 17th of September. Fast-forward to the 25th of September; I hadn't heard anything from the embassy about my application process. So, I checked with TDF to see whether the embassy had reached out to them. They told me they had confirmed my participation and my hotel booking to the visa authorities on the 19th of September (6 days earlier). I was wondering what was taking so long after the verification.
On the 1st of October, I received a phone call from the Luxembourg embassy, which turned out to be a surprise interview. They asked me about my work, my income, how I came to know about the conference, whether I had been to Europe before, etc. The call lasted around 10 minutes. At this point, my travel date - 8th of October - was just two working days away, as the 2nd of October was off due to Gandhi Jayanti and the 5th and 6th of October fell on a weekend, leaving only the 3rd and the 4th. I am not sure why the embassy saved this for the last moment, even though I had submitted my application two months earlier. I also got to know that one of the other Indian attendees missed the call due to being in their college lab, where they were not allowed to take phone calls. Therefore, I recommend that the embassy agree on a time slot for the interview call beforehand.
Visa decisions for all the above-mentioned Indian attendees were sent by the embassy on the 4th of October, and I received mine on the 5th. For my travel date of 8th October, this was literally the last moment the embassy could send my visa. The parcel contained my passport and a letter. The visa was attached to a page in the passport. I was happy that my visa had been approved. However, the timing made my task challenging. The enclosed letter stated:
Subject: Your Visa Application for Luxembourg
Dear Applicant, We would like to inform you that a Schengen visa has been granted for the 8-day duration from 08/10/2024 to 30/10/2024 for conference purposes in Luxembourg. You are requested to report back to the Embassy of Luxembourg in New Delhi through an email (email address redacted) after your return with the following documents:
  • Immigration Stamps (Entry and Exit of Schengen Area)
  • Restaurant Bills
  • Shopping/Hotel/Accommodation bills
Failure to report to the Embassy after your return will be taken into consideration for any further visa applications.
I understand the embassy wanting to ensure my entry and exit from the Schengen area during the visa validity period, but found the demand for sending shopping bills excessive. Further, not everyone was as lucky as I was, as it took a couple of days for one of the Indian attendees to receive their visa, delaying their plan. Another attendee had to send their father to the VFS center to collect their visa in time, rather than wait for the courier to arrive at their home.
Foreign travel is complicated, especially for the citizens of countries whose passports and currencies are weak. Embassies issuing visas a day before the travel date doesn't help. For starters, a last-minute visa does not give enough time for obtaining a forex card, as banks ask for the visa. Further, getting foreign currency (Euros in our case) in cash with a good exchange rate becomes difficult. As an example, for the Kenya trip, I had to get US Dollars at the airport due to the plan being finalized at the last moment, worsening the exchange rate. Back to the current case, the flight prices went up significantly compared to September, almost doubling. The choice of airlines also narrowed, as most of the flights were booked by the time I received my visa. With all that said, I think it was still better than an arbitrary rejection.
Credits: Contrapunctus, Badri, Fletcher, Benson, and Anirudh for helping with the draft of this post.

  1. Thanks to Sophie, our point of contact for the conference, for putting me in touch with them.

19 January 2025

Petter Reinholdtsen: 121 packages in Debian mapped to hardware for automatic recommendation

For some years now, I have been working on an automatic hardware-based package recommendation system for Debian and other Linux distributions. The isenkram system I started on back in 2013 now consists of two subsystems, one locating firmware files using the information provided by apt-file, and one matching hardware to packages using information provided by AppStream. The former is very similar to the mechanism implemented in debian-installer to pick the right firmware packages to install. This post is about the latter system. Thanks to steady progress and good help from both other Debian and upstream developers, I am happy to report that the Isenkram system is now able to recommend 121 packages using information provided via AppStream. The mapping is done using modalias information provided by the kernel, the same information used by udev when creating device files, and the kernel when deciding which kernel modules to load. To get all the modalias identifiers relevant for your machine, you can run the following command on the command line:
find /sys/devices -name modalias -print0 | xargs -0 sort -u
The modalias identifiers can look something like this:
acpi:PNP0000
cpu:type:x86,ven0000fam0006mod003F:feature:,0000,0001,0002,0003,0004,0005,0006,0007,0008,0009,000B,000C,000D,000E,000F,0010,0011,0013,0015,0016,0017,0018,0019,001A,001B,001C,001D,001F,002B,0034,003A,003B,003D,0068,006B,006C,006D,006F,0070,0072,0074,0075,0076,0078,0079,007C,0080,0081,0082,0083,0084,0085,0086,0087,0088,0089,008B,008C,008D,008E,008F,0091,0092,0093,0094,0095,0096,0097,0098,0099,009A,009B,009C,009D,009E,00C0,00C5,00E1,00E3,00EB,00ED,00F0,00F1,00F3,00F5,00F6,00F9,00FA,00FB,00FD,00FF,0100,0101,0102,0103,0111,0120,0121,0123,0125,0127,0128,0129,012A,012C,012D,0140,0160,0161,0165,016C,017B,01C0,01C1,01C2,01C4,01C5,01C6,01F9,024A,025A,025B,025C,025F,0282
dmi:bvnDellInc.:bvr2.18.1:bd08/14/2023:br2.18:svnDellInc.:pnPowerEdgeR730:pvr:rvnDellInc.:rn0H21J3:rvrA09:cvnDellInc.:ct23:cvr:skuSKU=NotProvided
pci:v00008086d00008D3Bsv00001028sd00000600bc07sc80i00
platform:serial8250
scsi:t-0x05
usb:v413CpA001d0000dc09dsc00dp00ic09isc00ip00in00
The entries above are a selection of the complete set available on a Dell PowerEdge R730 machine I have access to, to give an idea about the various styles of hardware identifiers presented in the modalias format. When looking up relevant packages in a Debian Testing installation on the same R730, I get this list of packages proposed:
% sudo isenkram-lookup
firmware-bnx2x
firmware-nvidia-graphics
firmware-qlogic
megactl
wsl
%
The list consists of firmware packages requested by kernel modules, as well as packages with programs to get the status from the RAID controller and to maintain the LAN console. When the edac-utils package providing tools to check the ECC RAM status enters testing in a few days, it will also show up as a proposal from isenkram. In addition, once the mfiutil package we uploaded in October gets past the NEW processing, isenkram will also propose a tool to configure the RAID controller. Another example is the trusty old Lenovo Thinkpad X230, which has hardware handled by several packages in the archive. This is running on Debian Stable:
% isenkram-lookup 
beignet-opencl-icd
bluez
cheese
ethtool
firmware-iwlwifi
firmware-misc-nonfree
fprintd
fprintd-demo
gkrellm-thinkbat
hdapsd
libpam-fprintd
pidgin-blinklight
thinkfan
tlp
tp-smapi-dkms
tpb
%
Here the proposal consists of software to handle the camera, bluetooth, network card, wifi card, GPU, fan, fingerprint reader and acceleration sensor on the machine. Here is the complete set of packages currently providing hardware mapping via AppStream in Debian Unstable: air-quality-sensor, alsa-firmware-loaders, antpm, array-info, avarice, avrdude, bmusb-v4l2proxy, brltty, calibre, colorhug-client, concordance-common, consolekit, dahdi-firmware-nonfree, dahdi-linux, edac-utils, eegdev-plugins-free, ekeyd, elogind, firmware-amd-graphics, firmware-ath9k-htc, firmware-atheros, firmware-b43-installer, firmware-b43legacy-installer, firmware-bnx2, firmware-bnx2x, firmware-brcm80211, firmware-carl9170, firmware-cavium, firmware-intel-graphics, firmware-intel-misc, firmware-ipw2x00, firmware-ivtv, firmware-iwlwifi, firmware-libertas, firmware-linux-free, firmware-mediatek, firmware-misc-nonfree, firmware-myricom, firmware-netronome, firmware-netxen, firmware-nvidia-graphics, firmware-qcom-soc, firmware-qlogic, firmware-realtek, firmware-ti-connectivity, fpga-icestorm, g810-led, galileo, garmin-forerunner-tools, gkrellm-thinkbat, goldencheetah, gpsman, gpstrans, gqrx-sdr, i8kutils, imsprog, ledger-wallets-udev, libairspy0, libam7xxx0.1, libbladerf2, libgphoto2-6t64, libhamlib-utils, libm2k0.9.0, libmirisdr4, libnxt, libopenxr1-monado, libosmosdr0, librem5-flash-image, librtlsdr0, libticables2-8, libx52pro0, libykpers-1-1, libyubikey-udev, limesuite, linuxcnc-uspace, lomoco, madwimax, media-player-info, megactl, mixxx, mkgmap, msi-keyboard, mu-editor, mustang-plug, nbc, nitrokey-app, nqc, ola, openfpgaloader, openocd, openrazer-driver-dkms, pcmciautils, pcscd, pidgin-blinklight, ponyprog, printer-driver-splix, python-yubico-tools, python3-btchip, qlcplus, rosegarden, scdaemon, sispmctl, solaar, spectools, sunxi-tools, t2n, thinkfan, tlp, tp-smapi-dkms, trezor, tucnak, ubertooth, usbrelay, uuu, viking, w1retap, wsl, xawtv, xinput-calibrator, xserver-xorg-input-wacom and xtrx-dkms.
In addition to these, there are several with patches pending in the Debian bug tracking system, and even more where no one has written patches yet. Good candidates for the latter are packages with udev rules but no AppStream hardware information. The isenkram system consists of two packages, isenkram-cli with the command line tools, and isenkram with a GUI background process. The latter will listen for dbus events from udev emitted when new hardware becomes available (like when inserting a USB dongle or discovering a new bluetooth device), look up the modalias entry for this piece of hardware in AppStream (and in a hard-coded list of mappings from isenkram, which I am currently working hard to move to AppStream), and pop up a dialog proposing to install any not already installed packages supporting this hardware. It works very well today when inserting the LEGO Mindstorms RCX, NXT and EV3 controllers. :)
If you want to make sure more hardware related packages get recommended, please help out fixing the remaining packages in Debian to provide AppStream metadata with hardware mappings. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
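As a rough illustration of the matching step described in this post, the following Python sketch checks modalias strings against glob patterns. The package names and patterns in it are hypothetical, and the real isenkram lookup goes through AppStream metadata rather than a hard-coded dictionary; it only shows the idea, assuming shell-style glob matching of modalias strings.

```python
from fnmatch import fnmatchcase

# Hypothetical package -> modalias glob patterns, for illustration only.
# These are NOT real Debian packages or real AppStream entries.
PACKAGE_PATTERNS = {
    "example-usb-tool": ["usb:v413CpA001d*"],
    "example-raid-tool": ["pci:v00001000d0000005D*"],
    "example-serial-tool": ["platform:serial8250"],
}


def recommend(modaliases, package_patterns):
    """Return sorted package names that have a pattern matching any modalias."""
    hits = set()
    for package, patterns in package_patterns.items():
        for pattern in patterns:
            # fnmatchcase does case-sensitive shell-style glob matching.
            if any(fnmatchcase(alias, pattern) for alias in modaliases):
                hits.add(package)
                break
    return sorted(hits)


# A few of the modalias strings quoted earlier in the post.
machine = [
    "acpi:PNP0000",
    "pci:v00008086d00008D3Bsv00001028sd00000600bc07sc80i00",
    "platform:serial8250",
    "usb:v413CpA001d0000dc09dsc00dp00ic09isc00ip00in00",
]

print(recommend(machine, PACKAGE_PATTERNS))
```

Only the USB and platform patterns match this machine; the hypothetical PCI pattern names a different vendor and device, so that package is not proposed.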

17 January 2025

C.J. Collier: Security concerns regarding OpenSSH mac sha1 in Debian

What is HMAC?
HMAC stands for Hash-Based Message Authentication Code. It's a specific way to use a cryptographic hash function (like SHA-1, SHA-256, etc.) along with a secret key to produce a unique fingerprint of some data. This fingerprint allows someone else with the same key to verify that the data hasn't been tampered with.
How HMAC Works
Keyed Hashing: The core idea is to incorporate the secret key into the hashing process. This is done in a specific way to prevent clever attacks that might try to bypass the security.
Inner and Outer Hashing: HMAC uses two rounds of hashing. First, the message and a modified version of the key are hashed together. Then, the result of that hash, along with another modified version of the key, is hashed again. This two-step process adds an extra layer of protection.
HMAC in OpenSSH
OpenSSH uses HMAC to ensure the integrity of messages sent back and forth during an SSH session. This prevents an attacker from subtly modifying data in transit.
HMAC-SHA1 with OpenSSH: Is it Weak?
SHA-1 itself is considered cryptographically broken. This means that with enough computing power, it's possible to find collisions (two different messages that produce the same hash). However, HMAC-SHA1 is generally still considered secure for most purposes. This is because exploiting weaknesses in SHA-1 to break HMAC-SHA1 is much more difficult than just finding collisions in SHA-1.
Should you use it?
While HMAC-SHA1 might still be okay for now, it's best practice to move to stronger alternatives like HMAC-SHA256 or HMAC-SHA512. OpenSSH supports these, and they provide a greater margin of safety against future attacks.
In Summary
HMAC is a powerful tool for ensuring data integrity. Even though SHA-1 has weaknesses, HMAC-SHA1 in OpenSSH is likely still safe for most users. However, to be on the safe side and prepare for the future, switching to HMAC-SHA256 or HMAC-SHA512 is recommended.
Following are instructions for creating Dataproc clusters with sha1 mac support removed:
I can appreciate an excess of caution, and I can offer you some code to produce Dataproc instances which do not allow HMAC authentication using sha1. Place code similar to this in a startup script or an initialization action that you reference when creating a cluster with gcloud dataproc clusters create:
#!/bin/bash
# remove mac specification from sshd configuration
sed -i -e 's/^macs.*$//' /etc/ssh/sshd_config
# place a new mac specification at the end of the service configuration
ssh -Q mac | perl -e \
  '@mac=grep{ chomp; ! /sha1/ } <STDIN>; print("macs ", join(",",@mac), $/)' >> /etc/ssh/sshd_config
# reload the new ssh service configuration
systemctl reload ssh.service
If this code is hosted on GCS, you can refer to it with
--initialization-actions=CLOUD_STORAGE_URI,[...]
or
--metadata startup-script-url=CLOUD_STORAGE_URI,[...]
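
The inner and outer hashing described earlier in this post can be sketched in a few lines of Python. This is only an illustration of the general HMAC construction, checked against the standard library's hmac module; it is not how OpenSSH implements it, and the key and message values are made up.

```python
import hashlib
import hmac


def hmac_manual(key: bytes, message: bytes, hash_name: str = "sha256") -> bytes:
    """Compute HMAC by hand: an inner hash followed by an outer hash."""
    block_size = hashlib.new(hash_name).block_size
    # Keys longer than one hash block are hashed first, then zero-padded.
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_size, b"\x00")
    # Two "modified versions" of the key: XOR with fixed pad bytes.
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    # Inner round: hash the first modified key together with the message.
    inner = hashlib.new(hash_name, inner_key + message).digest()
    # Outer round: hash the second modified key together with the inner result.
    return hashlib.new(hash_name, outer_key + inner).digest()


key = b"example shared secret"      # made-up value
message = b"example packet payload"  # made-up value

for name in ("sha1", "sha256", "sha512"):
    expected = hmac.new(key, message, name).digest()
    assert hmac.compare_digest(hmac_manual(key, message, name), expected)
    print(name, expected.hex())
```

The same construction works with any of the three hash functions; only the underlying hash changes, which is why moving from HMAC-SHA1 to HMAC-SHA256 or HMAC-SHA512 is a configuration change rather than a protocol change.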

14 January 2025

Louis-Philippe Véronneau: Montreal Subway Foot Traffic Data, 2024 edition

Another year of data from Société de Transport de Montréal, Montreal's transit agency! A few highlights this year:
  1. The closure of the Saint-Michel station had a drastic impact on D'Iberville, the station closest to it.
  2. The opening of the Royalmount shopping center nearly doubled the traffic of the De La Savane station.
  3. The Montreal subway continues to grow, but has not yet recovered from the pandemic. Berri-UQAM station (the largest one) is still below 1 million entries per quarter compared to its pre-pandemic record.
By clicking on a subway station, you'll be redirected to a graph of the station's foot traffic.

12 January 2025

Bastian Venthur: Investigating the popularity of Python build backends over time (II)

Last year, I analyzed the popularity of build backends used in pyproject.toml files over time. This post is the update for 2024.
Analysis
Like last year, I'm using Tom Forbes' fantastic dataset containing information about every file within every release uploaded to PyPI. To get the current dataset, I followed the same process as in last year's analysis, so I won't repeat all the details here. Instead, I'll highlight the main steps:
Downloading all the parquet files took roughly a week due to GitHub's rate limiting. Tom suggested leveraging the Git v2 protocol to fetch the data directly. This approach could bypass rate limiting and complete the download of all pyproject.toml files in just 20 minutes(!). However, I couldn't find sufficient documentation that would help me to implement this method, so this will have to wait until next year's analysis. Once all the data is downloaded, I perform some preprocessing:
Results
I modified the plots a bit from last year to make them easier to read. Most notably, I binned the data into quarters to make the plots less noisy, and secondly, I stopped stacking the relative distribution plots to make the percentages directly readable. The first plot shows the absolute number of uploads (in thousands) by quarter and build backend.
[Figure: Absolute distribution of build backends by quarter]
The second plot shows the relative distribution of build backends by quarter.
[Figure: Relative distribution of build backends by quarter]
In 2024, we observe that:
The script for downloading and analyzing the data is available in my GitHub repository. If someone has insights or examples on implementing the Git v2 protocol to download the pyproject.toml file given the repository URL and its hash, I'd love to hear from you!

9 January 2025

Reproducible Builds: Reproducible Builds in December 2024

Welcome to the December 2024 report from the Reproducible Builds project! Our monthly reports outline what we've been up to over the past month and highlight items of news from elsewhere in the world of software supply-chain security when relevant. As ever, however, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. reproduce.debian.net
  2. debian-repro-status
  3. On our mailing list
  4. Enhancing the Security of Software Supply Chains
  5. diffoscope
  6. Supply-chain attack in the Solana ecosystem
  7. Website updates
  8. Debian changes
  9. Other development news
  10. Upstream patches
  11. Reproducibility testing framework

reproduce.debian.net
Last month saw the introduction of reproduce.debian.net. Announced at the recent Debian MiniDebConf in Toulouse, reproduce.debian.net is an instance of rebuilderd operated by the Reproducible Builds project. rebuilderd is our server designed to monitor the official package repositories of Linux distributions and attempt to reproduce the observed results there. This month, however, we are pleased to announce that not only does the service now produce graphs, the reproduce.debian.net homepage itself has become a start page of sorts, and the amd64.reproduce.debian.net and i386.reproduce.debian.net pages have emerged. The first of these rebuilds the amd64 architecture, naturally, but it is also building Debian packages that are marked with the "no architecture" label, all. The second builder is, however, only rebuilding the i386 architecture. Both of these services were also switched to reproduce the Debian trixie distribution instead of unstable, which started with 43% of the archive rebuilt and 79.3% reproduced successfully. This is very much a work in progress, and we'll start reproducing Debian unstable soon. Our i386 hosts are very kindly sponsored by Infomaniak whilst the amd64 node is sponsored by OSUOSL. Thank you! Indeed, we are looking for more workers for more Debian architectures; please contact us if you are able to help.

debian-repro-status
Reproducible builds developer kpcyrd has published a client program for reproduce.debian.net (see above) that queries the status of the locally installed packages and rates the system with a percentage score. This tool works analogously to arch-repro-status for the Arch Linux Reproducible Builds setup. The tool was packaged for Debian and is currently available in Debian trixie: it can be installed with apt install debian-repro-status.

On our mailing list
On our mailing list this month:
  • Bernhard M. Wiedemann wrote a detailed post on his long journey towards a bit-reproducible Emacs package. In his interesting message, Bernhard goes into depth about the tools that they used and the lower-level technical details of, for instance, compatibility with the version for glibc within openSUSE.
  • Shivanand Kunijadar posed a question pertaining to the reproducibility issues with encrypted images. Shivanand explains that they must use a random IV for encryption with AES CBC. The resulting artifact is not reproducible due to the random IV used. The message resulted in a handful of replies, hopefully helpful!
  • User Danilo posted an interesting question related to their attempts to achieve reproducible builds for Threema Desktop 2.0. The question resulted in a number of replies attempting to find the right combination of compiler and linker flags (for example).
  • Longstanding contributor David A. Wheeler wrote to our list announcing the release of the Census III of Free and Open Source Software: Application Libraries report written by Frank Nagle, Kate Powell, Richie Zitomer and David himself. As David writes in his message, the report attempts to answer the question "what is the most popular Free and Open Source Software (FOSS)?".
  • Lastly, kpcyrd followed up on a post from September 2024 which mentioned their desire for someone to implement "a hashset of allowed module hashes that is generated during the kernel build and then embedded in the kernel image", thus enabling a deterministic and reproducible build. However, they are now reporting that "somebody implemented the hash-based allow list feature and submitted it to the Linux kernel mailing list". Like kpcyrd, we hope it gets merged.

Enhancing the Security of Software Supply Chains: Methods and Practices
Mehdi Keshani of the Delft University of Technology in the Netherlands has published their thesis on "Enhancing the Security of Software Supply Chains: Methods and Practices". Their introductory summary begins with an outline of software supply chains and the importance of the Maven ecosystem before outlining the issues that it faces "that threaten its security and effectiveness". To address these:
First, we propose an automated approach for library reproducibility to enhance library security during the deployment phase. We then develop a scalable call graph generation technique to support various use cases, such as method-level vulnerability analysis and change impact analysis, which help mitigate security challenges within the ecosystem. Utilizing the generated call graphs, we explore the impact of libraries on their users. Finally, through empirical research and mining techniques, we investigate the current state of the Maven ecosystem, identify harmful practices, and propose recommendations to address them.
A PDF of Mehdi's entire thesis is available to download.

diffoscope
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 283 and 284 to Debian:
  • Update copyright years.
  • Update tests to support file 5.46.
  • Simplify tests_quines.py::test_{differences,differences_deb} to simply use assert_diff and not mangle the test fixture.

Supply-chain attack in the Solana ecosystem
A significant supply-chain attack impacted Solana, an ecosystem for decentralised applications running on a blockchain. Hackers targeted the @solana/web3.js JavaScript library and embedded malicious code that extracted private keys and drained funds from cryptocurrency wallets. According to some reports, about $160,000 worth of assets were stolen, not including SOL tokens and other crypto assets.

Website updates
Similar to last month, there was a large number of changes made to our website this month, including:
  • Chris Lamb:
    • Make the landing page hero look nicer when the vertical height component of the viewport is restricted, not just the horizontal width.
    • Rename the Buy-in page to Why Reproducible Builds?
    • Remove the top black border.
  • Holger Levsen:
  • hulkoba:
    • Remove the sidebar-type layout and move to a static navigation element.
    • Create and merge a new Success stories page, which "highlights the success stories of Reproducible Builds, showcasing real-world examples of projects shipping with verifiable, reproducible builds. These stories aim to enhance the technical resilience of the initiative by encouraging community involvement and inspiring new contributions."
    • Further changes to the homepage.
    • Remove the translation icon from the navigation bar.
    • Remove unused CSS styles pertaining to the sidebar.
    • Add sponsors to the global footer.
    • Add extra space on large screens on the Who page.
    • Hide the side navigation on small screens on the Documentation pages.

Debian changes
There were a significant number of reproducibility-related changes within Debian this month, including:
  • Santiago Vila uploaded version 0.11+nmu4 of the dh-buildinfo package. In this release, dh_buildinfo becomes a no-op, i.e. it no longer does anything beyond warning the developer that the dh-buildinfo package is now obsolete. In his upload, Santiago wrote that "We still want packages to drop their [dependency] on dh-buildinfo, but now they will immediately benefit from this change after a simple rebuild."
  • Holger Levsen filed Debian bug #1091550 requesting a rebuild of a number of packages that were built with a very old version of dpkg.
  • Fay Stegerman contributed to an extensive thread on the debian-devel development mailing list on the topic of "Supporting alternative zlib implementations". In particular, Fay wrote about the results of her experiments on whether zlib-ng produces identical results or not.
  • kpcyrd uploaded a new rust-rebuilderd-worker, rust-derp, rust-in-toto and debian-repro-status to Debian, which passed successfully through the so-called NEW queue.
  • Gioele Barabucci filed a number of bugs against the debrebuild component/script of the devscripts package, including:
    • #1089087: Address a spurious extra subdirectory in the build path.
    • #1089201: Extra zero bytes added to .dynstr when rebuilding CMake projects.
    • #1089088: Some binNMUs have a 1-second offset in some timestamps.
  • Gioele Barabucci also filed a bug against the dh-r package to report that the Recommends and Suggests fields are missing from rebuilt R packages. At the time of writing, this bug has no patch and needs some help to make over 350 binary packages reproducible.
  • Lastly, 8 reviews of Debian packages were added, 11 were updated and 11 were removed this month, adding to our knowledge about identified issues.
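
To illustrate why the zlib discussion mentioned in the list above matters for reproducibility: the DEFLATE format allows many different valid encodings of the same input, so two compressors can both be correct and still emit different bytes. The following is a minimal sketch using Python's standard zlib module; the two compression levels are only an assumed stand-in for two different implementations (zlib-ng itself is not used here), and the sample data is invented.

```python
import zlib

# Repetitive sample input so that the compressor has real choices to make.
data = b"reproducible builds need bit-for-bit identical output\n" * 200

# Two compression levels stand in for two different zlib implementations
# (an assumption for illustration only; zlib-ng is not involved here).
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

# Both streams are valid and decode to exactly the same bytes...
same_content = zlib.decompress(fast) == zlib.decompress(best) == data

# ...but the compressed bytes themselves differ, which is what breaks a
# bit-for-bit comparison of artifacts that embed compressed data.
same_bytes = fast == best

print(same_content, same_bytes)
```

Any artifact that embeds such a compressed stream therefore depends on the exact compressor and settings used to produce it, even though the decompressed content is identical.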

Other development news
In other ecosystem and distribution news:
  • Lastly, in openSUSE, Bernhard M. Wiedemann published another report for the distribution. There, Bernhard reports on the success of building R-B-OS, a partial fork of openSUSE with only 100% bit-reproducible packages. This effort was sponsored by the NLNet NGI0 initiative.

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In November, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add a new i386.reproduce.debian.net rebuilder.
    • Make a number of updates to the documentation.
    • Run i386.reproduce.debian.net on a public port to allow external workers.
    • Add a link to the /api/v0/pkgs/list endpoint.
    • Add support for a statistics page.
    • Limit build logs to 20 MiB and diffoscope output to 10 MiB.
    • Improve the frontpage.
    • Explain that we're testing arch:any and arch:all on the amd64 architecture, but only arch:any on i386.
  • Misc:
    • Remove code for testing Arch Linux, which has moved to reproduce.archlinux.org.
    • Don't install dstat on Jenkins nodes anymore as it's been removed from Debian trixie.
    • Prepare the infom08-i386 node to become another rebuilder.
    • Add debug date output for benchmarking the reproducible_pool_buildinfos.sh script.
    • Install installation-birthday everywhere.
    • Temporarily disable automatic updates of pool links on buildinfos.debian.net.
    • Install Recommends by default on Jenkins nodes.
    • Rename rebuilder_stats.py to rebuilderd_stats.py.
    • r.d.n/stats: minor formatting changes.
    • Install files under /etc/cron.d/ with the correct permissions.
and Jochen Sprickerhof made the following changes:
Lastly, Gioele Barabucci also classified packages affected by the 1-second offset issue filed as Debian bug #1089088, Chris Hofstaedtler updated the URL for Grml's dpkg.selections file, Roland Clobus updated the Jenkins log parser to parse warnings from diffoscope and Mattia Rizzolo banned a number of bots and crawlers from the service.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Freexian Collaborators: Debian Contributions: Tracker.debian.org updates, Salsa CI improvements, Coinstallable build-essential, Python 3.13 transition, Ruby 3.3 transition and more! (by Anupa Ann Joseph, Stefano Rivera)

Debian Contributions: 2024-12
Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Tracker.debian.org updates, by Raphaël Hertzog
Profiting from end-of-year vacations, Raphaël prepared for tracker.debian.org to be upgraded to Debian 12 bookworm by getting rid of the remnants of python3-django-jsonfield in the code (it was superseded by a Django-native field). Thanks to Philipp Kern from the Debian System Administrators team, the upgrade happened on December 23rd. Raphaël also improved distro-tracker to better deal with invalid Maintainer fields, which recently caused multiple issues in the regular data updates (#1089985, MR 105). While working on this, he filed #1089648 asking dpkg tools to error out early when maintainers make such mistakes. Finally, he provided feedback on multiple issues and merge requests (MR 106, issues #21, #76, #77); there seems to be a surge of interest in distro-tracker lately. It would be nice if those new contributors could stick around and help out with the significant backlog of issues (in the Debian BTS, in Salsa).

Salsa CI improvements, by Santiago Ruano Rincón
Given that the Debian buildd network now relies on sbuild using the unshare backend, and that Salsa CI's reproducibility testing needs to be reworked (#399), Santiago resumed the work of moving the build job to use sbuild. There was some related work a few months ago that was focused on sbuild with the schroot and the sudo backends, but those attempts were stalled for different reasons, including discussions around the convenience of the move (#296). However, using sbuild and unshare avoids all of the drawbacks that have been identified so far. Santiago is preparing two merge requests: !568 to introduce a new build image, and !569 that moves all the extract-source related tasks to the build job. As mentioned in the previous reports, this change will make it possible for more projects to use the pipeline to build the packages (see #195). Additional advantages of this change include a more optimal way to test if a package builds twice in a row: instead of actually building it twice, the Salsa CI pipeline will configure sbuild to check if the clean target of debian/rules correctly restores the source tree, saving some CPU cycles by avoiding one build. Also, the images related to Ubuntu won't be needed anymore, since the build job will create chroots for different distributions and vendors from a single common build image. This will save space in the container registry. More changes are to come, especially those related to handling projects that customize the pipeline and make use of the extract-source job.

Coinstallable build-essential, by Helmut Grohne
Building on the gcc-for-host work of last December, a notable patch turning build-essential Multi-Arch: same became feasible. Whilst the change is small, its implications and foundations are not. We still install crossbuild-essential-$ARCH for cross building and due to a britney2 limitation, we cannot have it depend on the host's C library. As a result, there are workarounds in place for sbuild and pbuilder. In turning build-essential Multi-Arch: same, we may actually express these dependencies directly as we install build-essential:$ARCH instead. The crossbuild-essential-$ARCH packages will continue to be available as transitional dummy packages.

Python 3.13 transition, by Colin Watson and Stefano Rivera
Building on last month's work, Colin, Stefano, and other members of the Debian Python team fixed 3.13 compatibility bugs in many more packages, allowing 3.13 to now be a supported but non-default version in testing. The next stage will be to switch to it as the default version, which will start soon. Stefano did some test-rebuilds of packages that only build for the default Python 3 version, to find issues that will block the transition. The default version transition typically shakes out some more issues in applications that (unlike libraries) only test with the default Python version. Colin also fixed Sphinx 8.0 compatibility issues in many packages, which otherwise threatened to get in the way of this transition.

Ruby 3.3 transition, by Lucas Kanashiro
The Debian Ruby team decided to ship Ruby 3.3 in the next Debian release, and Lucas took the lead of the interpreter transition with the assistance of the rest of the team. In order to understand the impact of the new interpreter in the ruby ecosystem, ruby-defaults was uploaded to experimental adding ruby3.3 as an alternative interpreter, and a mass rebuild of reverse dependencies was done here. Initially, a couple of hundred packages were failing to build. After many rounds of rebuilds, adjustments, and many uploads, we are down to 30 package build failures; of those, 21 packages were asked to be removed from testing and, for the other 9, bugs were filed. All the information to track this transition can be found here. Now, we are waiting for the PHP 8.4 transition to finish to avoid any collision. Once it is done, the Ruby 3.3 transition will start in unstable.

Miscellaneous contributions
  • Enrico Zini redesigned the way nm.debian.org stores historical audit logs and personal data backups.
  • Carles Pina submitted a new package (python-firebase-messaging) and prepared updates for python3-ring-doorbell.
  • Carles Pina further developed po-debconf-manager: better state transitions, automated assignment of translators and reviewers on edit, automatic updating of po header files, bug fixes, etc.
  • Carles Pina reviewed, submitted and followed up the debconf templates translation (more than 20 packages) and translated some packages (about 5).
  • Santiago continued to work on DebConf 25 organization related tasks, including handling the logo survey and results. Stefano spent time on DebConf 25 too.
  • Santiago continued the exploratory work about linux livepatching with Emmanuel Arias. Santiago and Emmanuel found a challenge since kpatch won't fully support linux in trixie and newer, so they are exploring alternatives such as klp-build.
  • Helmut maintained the /usr-move transition, filing bugs in e.g. bubblewrap, e2fsprogs, libvpd-2.2-3, and pam-tmpdir and corresponding on related issues such as kexec-tools and live-build. The removal of the usrmerge package unfortunately broke debootstrap and was quickly reverted. Continued fallout is expected and will continue until trixie is released.
  • Helmut sent patches for 10 cross build failures and worked with Sandro Knauß on stuck Qt/KDE patches related to cross building.
  • Helmut continued to maintain rebootstrap, removing the need to build gnu-efi in the process.
  • Helmut collaborated with Emanuele Rocca and Jochen Sprickerhof on an interesting adventure in diagnosing why gcc would FTBFS in recent sbuild.
  • Helmut proposed supporting build concurrency limits in coreutils's nproc. As it turns out, nproc is not a good place for this functionality.
  • Colin worked with Sandro Tosi and Andrej Shadura to finish resolving the multipart vs. python-multipart name conflict, as mentioned last month.
  • Colin upgraded 48 Python packages to new upstream versions, fixing four CVEs and a number of compatibility bugs with recent Python versions.
  • Colin issued an openssh bookworm update with a number of fixes that had accumulated over the last year, especially fixing GSS-API key exchange which had been quite broken in bookworm.
  • Stefano fixed a minor bug in debian-reimbursements that was disallowing combination PDFs containing JAL tickets, encoded in UTF-16.
  • Stefano uploaded a stable update to PyPy3 in bookworm, catching up with security issues resolved in cPython.
  • Stefano fixed a regression in eventlet from his Python 3.13 porting patch.
  • Stefano continued discussing a forwarded patch (renaming the sysconfigdata module) with cPython upstream, ending in a decision to drop the patch from Debian. This will need some continued work.
  • Anupa participated in the Debian Publicity team meeting in December, which discussed the team activities done in 2024 and projects for 2025.

7 January 2025

Thorsten Alteholz: My Debian Activities in December 2024

Debian LTS
This was my hundred-twenty-sixth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. I worked on updates for ffmpeg and haproxy in all releases. Along the way I marked more CVEs as not-affected than I had to fix. So finally there was no upload needed for haproxy anymore. Unfortunately testing ffmpeg was not as easy, as the recommended "just look whether mpv can play random videos" is not really satisfying. So the upload will happen only in January. I also wonder whether fixing glewlwyd is really worth the effort, as the software is already EOL upstream.

Debian ELTS
This month was the seventy-seventh ELTS month. During my allocated time I worked on ffmpeg, haproxy, amanda and kmail-account-wizard. Like LTS, all CVEs of haproxy and some of ffmpeg could be marked as not-affected, and testing of the other packages was/is not really straightforward. So the final upload will only happen in January as well.

Debian Printing
Unfortunately I didn't find any time to work on this topic.

Debian Matomo
Thanks a lot to William Desportes for all fixes of my bad PHP packaging.

Debian Astro
This month I uploaded new packages or new upstream or bugfix versions of:
I again sponsored an upload of calceph.

Debian IoT
This month I uploaded new upstream or bugfix versions of:

Debian Mobcom
This month I uploaded new packages or new upstream or bugfix versions of:

misc
This month I uploaded new upstream or bugfix versions of:
I also sponsored uploads of emacs-lsp-docker, emacs-dape, emacs-oauth2, gpgmngr, libjs-jush.

FTP master
This month I accepted 330 and rejected 13 packages. The overall number of packages that got accepted was 335.

4 January 2025

Scarlett Gately Moore: KDE: Snap hotfixes and updates

Fixed okular pdf printing: https://bugs.kde.org/show_bug.cgi?id=498065
Fixed kwave recording: https://bugs.kde.org/show_bug.cgi?id=442085 (please run sudo snap connect kwave:audio-record :audio-record until auto-connect gets approved here: https://forum.snapcraft.io/t/kde-auto-connect-our-two-recording-apps/44419)
New qt6 snaps in edge until 24.12.1 release
I have begun the process of moving to core24, currently in edge until the 24.12.1 release. Some major improvements come with core24!
Tokodon is our wonderful Mastodon client.
I hate asking but I am unemployable with this broken arm fiasco and 6 hours a day hospital runs for treatment. If you could spare anything it would be appreciated! https://gofund.me/573cc38e

3 January 2025

Bits from Debian: Bits from the DPL

Dear Debian community, this is bits from DPL for December. Happy New Year 2025! Wishing everyone health, productivity, and a successful Debian release later in this year.

Strict ownership of packages
I'm glad my last bits sparked discussions about barriers between packages and contributors, summarized temporarily in some post on the debian-devel list. As one participant aptly put it, we need a way to visibly say, "I'll do the job until someone else steps up". Based on my experience with the Bug of the Day initiative, simplifying the process for engaging with packages would significantly help. Currently we have
  1. NMU: The Developers Reference outlines several preconditions for NMUs, explicitly stating, "Fixing cosmetic issues or changing the packaging style in NMUs is discouraged." This makes NMUs unsuitable for addressing package smells. However, I've seen NMUs used for tasks like switching to source format 3.0 or bumping the debhelper compat level. While it's technically possible to file a bug and then address it in an NMU, the process inherently limits the NMUer's flexibility to reduce package smells.
  2. Package Salvaging: This is another approach for working on someone else's packages, aligning with the process we often follow in the Bug of the Day initiative. The criteria for selecting packages typically indicate that the maintainer either lacks time to address open bugs, has lost interest, or is generally MIA.
Both options have drawbacks, so I'd welcome continued discussion on criteria for lowering the barriers to moving packages to Salsa and modernizing their packaging. These steps could enhance Debian overall and are generally welcomed by active maintainers. The discussion also highlighted that packages on Salsa are often maintained collaboratively, fostering the team-oriented atmosphere already established in several Debian teams.

Salsa Continuous Integration
As part of the ongoing discussion about package maintenance, I'm considering the suggestion to switch from the current opt-in model for Salsa CI to an opt-out approach. While I fully agree that human verification is necessary when the pipeline is activated, I believe the current option to enable CI is less visible than it should be. I'd welcome a more straightforward approach to improve access to better testing for what we push to Salsa.

Number of packages not on Salsa
In my campaign, I stated that I aimed to reduce the number of packages maintained outside Salsa to below 2,000. As of March 28, 2024, the count was 2,368. As of this writing, the count stands at 1,928 [1], so I consider this promise fulfilled. My thanks go out to everyone who contributed to this effort. Moving forward, I'd like to set a more ambitious goal for the remainder of my term and hope we can reduce the number to below 1,800.
[1] UDD query: SELECT DISTINCT count(*) FROM sources WHERE release = 'sid' and vcs_url not like '%salsa%' ;

Past and future events

Talk at MRI Together
In early December, I gave a short online talk, primarily focusing on my work with the Debian Med team. I also used my position as DPL to advocate for attracting more users and developers from the scientific research community.

FOSSASIA
I originally planned to attend FOSDEM this year. However, given the strong Debian presence there and the need for better representation at the FOSSASIA Summit, I decided to prioritize the latter. This aligns with my goal of improving geographic diversity. I also look forward to opportunities for inter-distribution discussions.

Debian team sprints

Debian Ruby Sprint
I approved the budget for the Debian Ruby Sprint, scheduled for January 2025 in Paris. If you're interested in contributing to the Ruby team, whether in person or online, consider reaching out to them. I'm sure any helping hand would be appreciated.

Debian Med sprint
There will also be a Debian Med sprint in Berlin in mid-February. As usual, you don't need to be an expert in biology or medicine; basic bug squashing skills are enough to contribute and enjoy the friendly atmosphere the Debian Med team fosters at their sprints. For those working in biology and medicine, we typically offer packaging support. Anyone interested in spending a weekend focused on impactful scientific work with Debian is warmly invited.

Again, all the best for 2025
Andreas.
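
The UDD query quoted in footnote [1] of the post above can be tried against a toy table to see what it counts. This is only a sketch: the real UDD is a PostgreSQL database, whereas this example uses Python's built-in sqlite3 module, and the table contents (package names and URLs) are invented for illustration.

```python
import sqlite3

# Toy stand-in for UDD's "sources" table; all rows here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sources (source TEXT, release TEXT, vcs_url TEXT)")
conn.executemany(
    "INSERT INTO sources VALUES (?, ?, ?)",
    [
        ("pkg-a", "sid", "https://salsa.debian.org/debian/pkg-a"),
        ("pkg-b", "sid", "https://github.com/example/pkg-b"),
        ("pkg-c", "sid", "https://example.org/git/pkg-c"),
        ("pkg-d", "bookworm", "https://example.org/git/pkg-d"),
        ("pkg-e", "sid", None),  # no VCS URL recorded at all
    ],
)

# Same shape as the query in footnote [1].
query = (
    "SELECT DISTINCT count(*) FROM sources "
    "WHERE release = 'sid' AND vcs_url NOT LIKE '%salsa%'"
)
(not_on_salsa,) = conn.execute(query).fetchone()
print(not_on_salsa)
```

In this toy data only pkg-b and pkg-c are counted. Under standard SQL semantics a NULL vcs_url does not satisfy NOT LIKE, so pkg-e is left out of the count in this sketch.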

2 January 2025

Matthew Garrett: The GPU, not the TPM, is the root of hardware DRM

As part of their "Defective by Design" anti-DRM campaign, the FSF recently made the following claim:
Today, most of the major streaming media platforms utilize the TPM to decrypt media streams, forcefully placing the decryption out of the user's control (from here).
This is part of an overall argument that Microsoft's insistence that only hardware with a TPM can run Windows 11 is with the goal of aiding streaming companies in their attempt to ensure media can only be played in tightly constrained environments.

I'm going to be honest here and say that I don't know what Microsoft's actual motivation for requiring a TPM in Windows 11 is. I've been talking about TPM stuff for a long time. My job involves writing a lot of TPM code. I think having a TPM enables a number of worthwhile security features. Given the choice, I'd certainly pick a computer with a TPM. But in terms of whether it's of sufficient value to lock out Windows 11 on hardware with no TPM that would otherwise be able to run it? I'm not sure that's a worthwhile tradeoff.

What I can say is that the FSF's claim is just 100% wrong, and since this seems to be the sole basis of their overall claim about Microsoft's strategy here, the argument is pretty significantly undermined. I'm not aware of any streaming media platforms making use of TPMs in any way whatsoever. There is hardware DRM that the media companies use to restrict users, but it's not in the TPM - it's in the GPU.

Let's back up for a moment. There's multiple different DRM implementations, but the big three are Widevine (owned by Google, used on Android, Chromebooks, and some other embedded devices), Fairplay (Apple implementation, used for Mac and iOS), and Playready (Microsoft's implementation, used in Windows and some other hardware streaming devices and TVs). These generally implement several levels of functionality, depending on the capabilities of the device they're running on - this will range from all the DRM functionality being implemented in software up to the hardware path that will be discussed shortly. Streaming providers can choose what level of functionality and quality to provide based on the level implemented on the client device, and it's common for 4K and HDR content to be tied to hardware DRM. In any scenario, they stream encrypted content to the client and the DRM stack decrypts it before the compressed data can be decoded and played.

The "problem" with software DRM implementations is that the decrypted material is going to exist somewhere the OS can get at it at some point, making it possible for users to simply grab the decrypted stream, somewhat defeating the entire point. Vendors try to make this difficult by obfuscating their code as much as possible (and in some cases putting some of it in-kernel), but pretty much all software DRM is at least somewhat broken and copies of any new streaming media end up being available via Bittorrent pretty quickly after release. This is why higher quality media tends to be restricted to clients that implement hardware-based DRM.

The implementation of hardware-based DRM varies. On devices in the ARM world this is usually handled by performing the cryptography in a Trusted Execution Environment, or TEE. A TEE is an area where code can be executed without the OS having any insight into it at all, with ARM's TrustZone being an example of this. By putting the DRM code in TrustZone, the cryptography can be performed in RAM that the OS has no access to, making the scraping described earlier impossible. x86 has no well-specified TEE (Intel's SGX is an example, but is no longer implemented in consumer parts), so instead this tends to be handed off to the GPU. The exact details of this implementation are somewhat opaque - of the previously mentioned DRM implementations, only Playready does hardware DRM on x86, and I haven't found any public documentation of what drivers need to expose for this to work.

In any case, as part of the DRM handshake between the client and the streaming platform, encryption keys are negotiated with the key material being stored in the GPU or the TEE, inaccessible from the OS. Once decrypted, the material is decoded (again either on the GPU or in the TEE - even in implementations that use the TEE for the cryptography, the actual media decoding may happen on the GPU) and displayed. One key point is that the decoded video material is still stored in RAM that the OS has no access to, and the GPU composites it onto the outbound video stream (which is why if you take a screenshot of a browser playing a stream using hardware-based DRM you'll just see a black window - as far as the OS can see, there is only a black window there).

Now, TPMs are sometimes referred to as a TEE, and in a way they are. However, they're fixed function - you can't run arbitrary code on the TPM, you only have whatever functionality it provides. But TPMs do have the ability to decrypt data using keys that are tied to the TPM, so isn't this sufficient? Well, no. First, the TPM can't communicate with the GPU. The OS could push encrypted material to it, and it would get plaintext material back. But the entire point of this exercise was to prevent the decrypted version of the stream from ever being visible to the OS, so this would be pointless. And rather more fundamentally, TPMs are slow. I don't think there's a TPM on the market that could decrypt a 1080p stream in realtime, let alone a 4K one.

The FSF's focus on TPMs here is not only technically wrong, it's indicative of a failure to understand what's actually happening in the industry. While the FSF has been focusing on TPMs, GPU vendors have quietly deployed all of this technology without the FSF complaining at all. Microsoft has enthusiastically participated in making hardware DRM on Windows possible, and user freedoms have suffered as a result, but Playready hardware-based DRM works just fine on hardware that doesn't have a TPM and will continue to do so.


31 December 2024

Russ Allbery: Review: Metal from Heaven

Review: Metal from Heaven, by August Clarke
Publisher: Erewhon
Copyright: November 2024
ISBN: 1-64566-099-0
Format: Kindle
Pages: 443
Metal from Heaven is industrial-era secondary-world fantasy with a literary bent. It is a complete story in one book, and I would be very surprised by a sequel. Clarke previously wrote the Scapegracers young-adult trilogy, which got excellent reviews and a few award nominations, as H.A. Clarke. This is his first adult novel.
Know I adore you. Look out over the glow. The cities sundered, their machines inverted, mountains split and prairies blazing, that long foreseen Hereafter crowning fast. This calamity is a promise made to you. A prayer to you, and to your shadow which has become my second self, tucked behind my eye and growing in tandem with me, pressing outwards through the pupil, the smarter, truer, almost bursting reason for our wrath. Do not doubt me. Just look. Watch us rise as the sun comes up over the beauty. The future stains the bleakness so pink. When my violence subsides, we will have nothing, and be champions.
Marney Honeycutt is twelve years old, a factory worker, and lustertouched. She works in the Yann I. Chauncey Ichorite Foundry in Ignavia City, alongside her family and her best friend, shaping the magical metal ichorite into the valuable industrial products of a new age of commerce and industry. She is the oldest of the lustertouched, the children born to factory workers and poisoned by the metal. It has made her allergic, prone to fits at any contact with ichorite, but also able to exert a strange control over the metal if she's willing to pay the price of spasms and hallucinations for hours afterwards. As Metal from Heaven opens, the workers have declared a strike. Her older sister is the spokesperson, demanding shorter hours, safer working conditions, and an investigation into the health of the lustertouched children. Chauncey's response is to send enforcer snipers to kill the workers, including the entirety of her family.
The girl sang, "Unalone toward dawn we go, toward the glory of the new morning." An enforcer shot her in the belly, and when she did not fall, her head.
Marney survives, fleeing into the city, swearing an impossible personal revenge against Yann Chauncey. An act of charity gets her a ticket on a train into the countryside. The woman who bought her ticket is a bandit who is on the train to rob it. Marney's ability to control ichorite allows her to help the bandits in return, winning her a place with the Highwayman's Choir who have been preying on the shipments of the rich and powerful and then disappearing into the hills. The Choir's secret is that the agoraphobic and paranoid Baron of the Fingerbluffs is dead and has been for years. He was killed by his staff, Hereafterist idealists, who have turned his remote territory into an anarchist commune and haven for pirates and bandits. This becomes Marney's home and the Choir becomes her family, but she never forgets her oath of revenge or the childhood friend she left behind in the piles of bodies and to whom this story is narrated.
First, Clarke's writing is absolutely gorgeous.
We scaled the viny mountain jags at Montrose Barony's legal edge, the place where land was and wasn't Ignavia, Royston, and Drustland alike. There was a border but it was diffuse and hallucinatory, even more so than most. On legal papers and state maps there were harsh lines that squashed topography and sanded down the mountains into even hills in planter's rows, but here among the jutting rocks and craggy heather, the ground was lineless.
The rhythm of it, the grasp of contrast and metaphor, the word choice! That climactic word "lineless," with its echo of limitless. So good.
Second, this is the rarest of books: a political fantasy that takes class and religion seriously and uses them for more than plot drivers. This is not at all our world, and the technology level is somewhat ambiguous, but the parallels to the Gilded Age and Progressive Era are unmistakable. The Hereafterists that Marney joins are political anarchists, not in the sense of alternative governance structures and political theory sanitized for middle-class liberals, but in the sense of Emma Goldman and Peter Kropotkin. The society they have built in the Fingerbluffs is temporary, threatened, and contingent, but it is sincere and wildly popular among the people who already lived there.
Even beyond politics, class is a tangible force in this book. Marney is a factory worker and the child of factory workers. She barely knows how to read and doesn't magically learn over the course of the book. She has friends who are clever in the sense rewarded by politics and nobility, who navigate bureaucracies and political nuance, but that is not Marney's world. When, towards the end of the book, she has to deal with a gathering of high-class women, the contrast is stark, and she navigates that gathering only by being entirely unexpected.
Perhaps the best illustration of the subtlety of this is the terminology in the book for lesbian. Marney is a crawly, which is a slur thrown at people like her (and one of the rare fictional slurs that work exactly as the author intended) but is also simply what she calls herself. Whether or not it functions as a slur depends on context, and the context is never hard to understand. The high-class lesbians she meets later are Lunarists, and react to crawly as a vile and insulting word. They use language to separate themselves from both the insult and from the social class that uses it. Language is an indication of culture and manners and therefore of morality, unlike deeds, which admit endless justifications.
Conversation was fleeting. Perdita managed with whomever stood near her, chipper about every prettiness she saw, the flitting butterflies, the dappled light between the leaves, the lushness and the fragrance of untamed land, and her walking companions took turns sharing in her delight. It was infectious, how happy she was. She was going to slaughter millions. She was going to skip like this all the while.
The handling of religion is perhaps even better. Marney was raised a Tullian, which sits alongside two other fleshed-out fictional religions and sketches of several more. Tullians tend to be conservative and patriarchal, and Marney has a realistically complicated relationship with faith: sticking with some Tullian worship practices and gestures because they're part of who she is, feeling a kinship to other Tullians, discarding beliefs that don't fit her, and revising others. Every major religion has a Hereafterist spin or reinterpretation that upends or reverses the parts of the religion that were used to prop up the existing social order and brings it more in line with Hereafterist ideals. We see the Tullian Hereafterist variation in detail, and as someone who has studied a lot of methods of reinterpreting Christianity, I was impressed by how well Clarke invents both a belief system and its revisionist rewrite. This is exactly how religions work in human history, but one almost never sees this subtlety in fantasy novels.
Marney's allergy to ichorite causes her internal dialogue to dissolve into hallucinatory synesthesia when she's manipulating or exposed to it. Since that's most of the book, substantial portions read like drug trips with growing body horror. I normally hate this type of narration, so it's a sign of just how good Clarke's writing is that I tolerated it and even enjoyed parts. It helps that the descriptions are irreverent and often surprising, full of unexpected metaphors and sudden turns. It's very hard not to quote paragraph after paragraph of this book.
Clarke is also doing a lot with gender that I don't feel qualified to comment in detail on, but it would not surprise me to see this book in the Otherwise Award recommendation list. I can think of three significant male characters, all of whom are well-done, but every other major character is female by at least some gender definition. Within that group, though, is huge gender diversity of the complicated and personal type that doesn't force people into defined boxes. Marney's sexuality is similarly unclassified and sometimes surprising. My one complaint is that I thought the sex scenes (which, to warn, are often graphic) fell into the literary fiction trap of being described so closely and physically that it didn't feel like anyone involved was actually enjoying themselves. (This is almost certainly a matter of personal taste.)
I had absolutely no idea how Clarke was going to end this book, and the last couple of chapters caught me by surprise. I'm still not sure what I think about the climax. It's not the ending that I wanted, but one of the merits of this book is that it never did what I thought I wanted and yet made me enjoy the journey anyway. It is, at least, a genre ending, not a literary ending: The reader gets a full explanation of what is going on, and the setting is not static the way that it so often is in literary fiction. The characters can change the world, for good or for ill. The story felt frustrating and incomplete when I first finished it, but I haven't stopped thinking about this book and I think I like the shape of it a bit more now. It was certainly unexpected, at least by me.
Clarke names Dhalgren as one of their influences in the acknowledgments, and yes, Metal from Heaven is that kind of book. This is the first 2024 novel I've read that felt like the kind of book that should be on award shortlists. I'm not sure it was entirely successful, and there are parts of it that I didn't like or that weren't for me, but it's trying to do something different and challenging and uncomfortable, and I think it mostly worked. And the writing is so good.
She looked like a mythic princess from the old woodcuts, who ruled nature by force of goodness and faith and had no legal power.
Metal from Heaven is not going to be everyone's taste. If you do not like literary fantasy, there is a real chance that you will hate this. I am very glad that I read it, and also am going to take a significant break from difficult books before I tackle another one. But then I'm probably going to try the Scapegracers series, because Clarke is an author I want to follow. Content notes: Explicit sex, including sadomasochistic sex. Political violence, mostly by authorities. Murdered children, some body horror, and a lot of serious injuries and death. Rating: 8 out of 10

23 December 2024

Simon Josefsson: OpenSSH and Git on a Post-Quantum SPHINCS+

Are you aware that Git commits and tags may be signed using OpenSSH? Git signatures may be used to improve integrity and authentication of our software supply-chain. Popular signature algorithms include Ed25519, ECDSA and RSA. Did you consider that these algorithms may not be safe if someone builds a post-quantum computer? As you may recall, I have earlier blogged about the efficient post-quantum key agreement mechanism called Streamlined NTRU Prime and its use in SSH and I have attempted to promote the conservatively designed Classic McEliece in a similar way, although it remains to be adopted. What post-quantum signature algorithms are available? There is an effort by NIST to standardize post-quantum algorithms, and they have a category for signature algorithms. According to Wikipedia, after round three the selected algorithms are CRYSTALS-Dilithium, FALCON and SPHINCS+. Of these, SPHINCS+ appears to be a conservative choice suitable for long-term digital signatures. Can we get this to work? Recall that Git uses the ssh-keygen tool from OpenSSH to perform signing and verification. To refresh your memory, let's study the commands that Git uses under the hood for Ed25519. First generate an Ed25519 private key:
jas@kaka:~$ ssh-keygen -t ed25519 -f my_ed25519_key -P ""
Generating public/private ed25519 key pair.
Your identification has been saved in my_ed25519_key
Your public key has been saved in my_ed25519_key.pub
The key fingerprint is:
SHA256:fDa5+jmC2+/aiLhWeWA3IV8Wj6yMNTSuRzqUZlIGlXQ jas@kaka
The key's randomart image is:
+--[ED25519 256]--+
     .+=.E ..      
      oo=.ooo      
     . =o=+o .     
      =oO+o .      
      .=+S.=       
       oo.o o      
      . o  .       
     ...o.+..      
    .o.o.=**.      
+----[SHA256]-----+
jas@kaka:~$ cat my_ed25519_key
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACAWP/aZ8hzN0WNRMSpjzbgW1tJXNd2v6/dnbKaQt7iIBQAAAJCeDotOng6L
TgAAAAtzc2gtZWQyNTUxOQAAACAWP/aZ8hzN0WNRMSpjzbgW1tJXNd2v6/dnbKaQt7iIBQ
AAAEBFRvzgcD3YItl9AMmVK4xDKj8NTg4h2Sluj0/x7aSPlhY/9pnyHM3RY1ExKmPNuBbW
0lc13a/r92dsppC3uIgFAAAACGphc0BrYWthAQIDBAU=
-----END OPENSSH PRIVATE KEY-----
jas@kaka:~$ cat my_ed25519_key.pub 
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBY/9pnyHM3RY1ExKmPNuBbW0lc13a/r92dsppC3uIgF jas@kaka
jas@kaka:~$ 
Then let's sign something with this key:
jas@kaka:~$ echo "Hello world!" > msg
jas@kaka:~$ ssh-keygen -Y sign -f my_ed25519_key -n my-namespace msg
Signing file msg
Write signature to msg.sig
jas@kaka:~$ cat msg.sig 
-----BEGIN SSH SIGNATURE-----
U1NIU0lHAAAAAQAAADMAAAALc3NoLWVkMjU1MTkAAAAgFj/2mfIczdFjUTEqY824FtbSVz
Xdr+v3Z2ymkLe4iAUAAAAMbXktbmFtZXNwYWNlAAAAAAAAAAZzaGE1MTIAAABTAAAAC3Nz
aC1lZDI1NTE5AAAAQLmWsq05tqOOZIJqjxy5ZP/YRFoaX30lfIllmfyoeM5lpVnxJ3ZxU8
SF0KodDr8Rtukg2N3Xo80NGvZOzbG/9Aw=
-----END SSH SIGNATURE-----
jas@kaka:~$
Now let's create a list of trusted public-keys and associated identities:
jas@kaka:~$ echo 'my.name@example.org ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBY/9pnyHM3RY1ExKmPNuBbW0lc13a/r92dsppC3uIgF' > allowed-signers
jas@kaka:~$ 
Then let's verify the message we just signed:
jas@kaka:~$ cat msg | ssh-keygen -Y verify -f allowed-signers -I my.name@example.org -n my-namespace -s msg.sig
Good "my-namespace" signature for my.name@example.org with ED25519 key SHA256:fDa5+jmC2+/aiLhWeWA3IV8Wj6yMNTSuRzqUZlIGlXQ
jas@kaka:~$ 
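For everyday use, Git can drive these ssh-keygen invocations itself once it is told to use the SSH signature format. The following is a minimal sketch, assuming Git 2.34 or later; the key path and the allowed signers path are hypothetical placeholders, and the throwaway repository exists only so that the settings do not touch your real configuration.

```shell
#!/bin/sh
set -e

# Work in a throwaway repository so that no real configuration is modified.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Tell Git to produce and check SSH signatures instead of OpenPGP ones.
git config gpg.format ssh
# Hypothetical key location: point this at your own public key file.
git config user.signingkey "$HOME/.ssh/id_ed25519.pub"
# Same role as the allowed-signers file used with "ssh-keygen -Y verify".
git config gpg.ssh.allowedSignersFile "$HOME/.ssh/allowed_signers"

# With a real key in place, signing and checking would then look like:
#   git commit -S -m "signed commit"
#   git log -1 --show-signature
git config --get gpg.format
```

Git uses the namespace "git" for these signatures, which is why the verification output later in this post reads Good "git" signature rather than "my-namespace".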
I have implemented support for SPHINCS+ in OpenSSH. This is early work, but I wanted to announce it to get discussion of some of the details going and to make people aware of it. What better way to demonstrate SPHINCS+ support in OpenSSH than by validating the Git commit that implements it using itself? Here is how to proceed: first get a suitable development environment up and running. I'm using a Debian container launched in a protected environment using podman.
jas@kaka:~$ podman run -it --rm debian:stable
Then install the necessary build dependencies for OpenSSH.
# apt-get update 
# apt-get install git build-essential autoconf libz-dev libssl-dev
Now clone my OpenSSH branch with the SPHINCS+ implementation and build it. You may browse the commit on GitHub first if you are curious.
# cd
# git clone https://github.com/jas4711/openssh-portable.git -b sphincsp
# cd openssh-portable
# autoreconf -fvi
# ./configure
# make
Configure a Git allowed signers list with my SPHINCS+ public key (make sure to keep the public key on one line with the whitespace being one ASCII SPC character):
# mkdir -pv ~/.ssh
# echo 'simon@josefsson.org ssh-sphincsplus@openssh.com AAAAG3NzaC1zcGhpbmNzcGx1c0BvcGVuc3NoLmNvbQAAAECI6eacTxjB36xcPtP0ZyxJNIGCN350GluLD5h0KjKDsZLNmNaPSFH2ynWyKZKOF5eRPIMMKSCIV75y+KP9d6w3' > ~/.ssh/allowed_signers
# git config gpg.ssh.allowedSignersFile ~/.ssh/allowed_signers
Then verify the commit using the newly built ssh-keygen binary:
# PATH=$PWD:$PATH
# git log -1 --show-signature
commit ce0b590071e2dc845373734655192241a4ace94b (HEAD -> sphincsp, origin/sphincsp)
Good "git" signature for simon@josefsson.org with SPHINCSPLUS key SHA256:rkAa0fX0lQf/7V7QmuJHSI44L/PAPPsdWpis4nML7EQ
Author: Simon Josefsson <simon@josefsson.org>
Date:   Tue Dec 3 18:44:25 2024 +0100
    Add SPHINCS+.
# git verify-commit ce0b590071e2dc845373734655192241a4ace94b
Good "git" signature for simon@josefsson.org with SPHINCSPLUS key SHA256:rkAa0fX0lQf/7V7QmuJHSI44L/PAPPsdWpis4nML7EQ
# 
Yay! So what are some considerations? SPHINCS+ comes in many different variants. First it comes with three security levels approximately matching 128/192/256 bit symmetric key strengths. Second choice is between the SHA2-256, SHAKE256 (SHA-3) and Haraka hash algorithms. Final choice is between a robust and a simple variant with different security and performance characteristics. To get going, I picked the sphincss256sha256robust SPHINCS+ implementation from SUPERCOP 20241022. There is a good size comparison table in the sphincsplus implementation, if you want to consider alternative variants. SPHINCS+ public-keys are really small, as you can see in the allowed signers file. This is really good because they are handled by humans and often by cut'n'paste. What about private keys? They are slightly longer than Ed25519 private keys but shorter than typical RSA private keys.
# ssh-keygen -t sphincsplus -f my_sphincsplus_key -P ""
Generating public/private sphincsplus key pair.
Your identification has been saved in my_sphincsplus_key
Your public key has been saved in my_sphincsplus_key.pub
The key fingerprint is:
SHA256:4rNfXdmLo/ySQiWYzsBhZIvgLu9sQQz7upG8clKziBg root@ad600ff56253
The key's randomart image is:
+[SPHINCSPLUS 256-+
  .  .o            
 o . oo.           
  = .o.. o         
 o o  o o . .   o  
 .+    = S o   o . 
 Eo=  . + . . .. . 
 =*.+  o . . oo .  
 B+=    o o.o. .   
 o*o   ... .oo.    
+----[SHA256]-----+
# cat my_sphincsplus_key.pub 
ssh-sphincsplus@openssh.com AAAAG3NzaC1zcGhpbmNzcGx1c0BvcGVuc3NoLmNvbQAAAEAltAX1VhZ8pdW9FgC+NdM6QfLxVXVaf1v2yW4v+tk2Oj5lxmVgZftfT37GOMOlK9iBm9SQHZZVYZddkEJ9F1D7 root@ad600ff56253
# cat my_sphincsplus_key 
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAYwAAABtzc2gtc3
BoaW5jc3BsdXNAb3BlbnNzaC5jb20AAABAJbQF9VYWfKXVvRYAvjXTOkHy8VV1Wn9b9slu
L/rZNjo+ZcZlYGX7X09+xjjDpSvYgZvUkB2WVWGXXZBCfRdQ+wAAAQidiIwanYiMGgAAAB
tzc2gtc3BoaW5jc3BsdXNAb3BlbnNzaC5jb20AAABAJbQF9VYWfKXVvRYAvjXTOkHy8VV1
Wn9b9sluL/rZNjo+ZcZlYGX7X09+xjjDpSvYgZvUkB2WVWGXXZBCfRdQ+wAAAIAbwBxEhA
NYzITN6VeCMqUyvw/59JM+WOLXBlRbu3R8qS7ljc4qFVWUtmhy8B3t9e4jrhdO6w0n5I4l
mnLnBi2hJbQF9VYWfKXVvRYAvjXTOkHy8VV1Wn9b9sluL/rZNjo+ZcZlYGX7X09+xjjDpS
vYgZvUkB2WVWGXXZBCfRdQ+wAAABFyb290QGFkNjAwZmY1NjI1MwECAwQ=
-----END OPENSSH PRIVATE KEY-----
# 
Signature size? Now here is the challenge: for this variant the size is around 29 kB, or close to 600 lines of base64 data:
# git cat-file -p ce0b590071e2dc845373734655192241a4ace94b | head -10
tree ede42093e7d5acd37fde02065a4a19ac1f418703
parent 826483d51a9fee60703298bbf839d9ce37943474
author Simon Josefsson <simon@josefsson.org> 1733247865 +0100
committer Simon Josefsson <simon@josefsson.org> 1734907869 +0100
gpgsig -----BEGIN SSH SIGNATURE-----
 U1NIU0lHAAAAAQAAAGMAAAAbc3NoLXNwaGluY3NwbHVzQG9wZW5zc2guY29tAAAAQIjp5p
 xPGMHfrFw+0/RnLEk0gYI3fnQaW4sPmHQqMoOxks2Y1o9IUfbKdbIpko4Xl5E8gwwpIIhX
 vnL4o/13rDcAAAADZ2l0AAAAAAAAAAZzaGE1MTIAAHSDAAAAG3NzaC1zcGhpbmNzcGx1c0
 BvcGVuc3NoLmNvbQAAdGDHlobgfgkKKQBo3UHmnEnNXczCMNdzJmeYJau67QM6xZcAU+d+
 2mvhbksm5D34m75DWEngzBb3usJTqWJeeDdplHHRe3BKVCQ05LHqRYzcSdN6eoeZqoOBvR
# git cat-file -p ce0b590071e2dc845373734655192241a4ace94b | tail -5
 ChvXUk4jfiNp85RDZ1kljVecfdB2/6CHFRtxrKHJRDiIavYjucgHF1bjz0fqaOSGa90UYL
 RZjZ0OhdHOQjNP5QErlIOcZeqcnwi0+RtCJ1D1wH2psuXIQEyr1mCA==
 -----END SSH SIGNATURE-----
Add SPHINCS+.
# git cat-file -p ce0b590071e2dc845373734655192241a4ace94b | wc -l
579
# 
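The 579 lines can be reproduced with a little arithmetic. This is a sketch under stated assumptions: the field layout follows my reading of OpenSSH's PROTOCOL.sshsig document, the 29792-byte raw signature size is the figure I know for the 256-bit "small" SPHINCS+ parameter set, and the 70-character base64 line width is simply what the armored output above shows.

```shell
#!/bin/sh
# All sizes are in bytes; every SSH "string" field has a 4-byte length prefix.
keytype_len=27   # length of "ssh-sphincsplus@openssh.com"
pk_len=64        # raw public key
sig_len=29792    # raw signature (assumed 256s parameter set)

pubkey_blob=$((4 + keytype_len + 4 + pk_len))
sig_blob=$((4 + keytype_len + 4 + sig_len))

# SSHSIG blob: magic "SSHSIG", version, public key, namespace "git",
# empty reserved field, hash name "sha512", signature.
sshsig_bytes=$((6 + 4 + (4 + pubkey_blob) + (4 + 3) + (4 + 0) + (4 + 6) + (4 + sig_blob)))

b64_chars=$(( (sshsig_bytes + 2) / 3 * 4 ))   # base64 with padding
b64_lines=$(( (b64_chars + 69) / 70 ))        # wrapped at 70 characters

# tree/parent/author/committer + BEGIN line + data + END line + blank + message
commit_lines=$((4 + 1 + b64_lines + 1 + 1 + 1))

echo "signature blob: $sshsig_bytes bytes"    # 29965
echo "base64 lines:   $b64_lines"             # 571
echo "commit lines:   $commit_lines"          # 579
```

Almost all of the commit object is signature data, so repositories with many signed commits will grow noticeably with this variant.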
What about performance? Verification is really fast:
# time git verify-commit ce0b590071e2dc845373734655192241a4ace94b
Good "git" signature for simon@josefsson.org with SPHINCSPLUS key SHA256:rkAa0fX0lQf/7V7QmuJHSI44L/PAPPsdWpis4nML7EQ
real	0m0.010s
user	0m0.005s
sys	0m0.005s
# 
On this machine, verifying an Ed25519 signature is a couple of times slower, and needs around 0.07 seconds. Signing is slower: it takes a bit over 2 seconds on my laptop.
# echo "Hello world!" > msg
# time ssh-keygen -Y sign -f my_sphincsplus_key -n my-namespace msg
Signing file msg
Write signature to msg.sig
real	0m2.226s
user	0m2.226s
sys	0m0.000s
# echo 'my.name@example.org ssh-sphincsplus@openssh.com AAAAG3NzaC1zcGhpbmNzcGx1c0BvcGVuc3NoLmNvbQAAAEAltAX1VhZ8pdW9FgC+NdM6QfLxVXVaf1v2yW4v+tk2Oj5lxmVgZftfT37GOMOlK9iBm9SQHZZVYZddkEJ9F1D7' > allowed-signers
# cat msg | ssh-keygen -Y verify -f allowed-signers -I my.name@example.org -n my-namespace -s msg.sig
Good "my-namespace" signature for my.name@example.org with SPHINCSPLUS key SHA256:4rNfXdmLo/ySQiWYzsBhZIvgLu9sQQz7upG8clKziBg
# 
Welcome to our new world of Post-Quantum safe digital signatures of Git commits, and Happy Hacking!

20 December 2024

Noah Meyerhans: Local Development VM Management

A coworker asked recently about how people use VMs locally for dev work, so I figured I'd take a few minutes to write up a bit about what I do. There are many use cases for local virtual machines in software development and testing. They're self-contained, meaning you can make a mess of them without impacting your day-to-day computing environment. They can run different distributions, kernels, and even entirely different operating systems from the one you use regularly. Etc. They're also cheaper than cloud services and provide finer grained control over the resources. I figured I'd share a little bit about how I manage different virtual machines in case anybody finds this useful. This is what works for me, but it won't necessarily work for you, or maybe you've already got something better. I've found it to be easy to work with, lightweight, and easy to evolve as my needs change.

Use short-lived VMs
Rather than keep a long-lived development VM around that you customize over time, I recommend automating the common customizations and provisioning new VMs regularly. If I'm working on reproducing a bug or testing a change prior to submitting it upstream, I'll do this work in a VM and delete the VM when I'm done. When provisioning VMs this frequently, though, walking through the installation process for every new VM is tedious and a waste of time. Since most of my work is done in Debian, I start with images generated daily by the cloud team. These images are available for multiple releases and architectures. The nocloud variant boots to a root prompt and can be useful directly, or the generic images can be used for cloud-init based customization.

Automating image preparation
This makefile lets me do something like make image and get a new qcow2 image with the latest build of a given Debian release (sid by default, with others available by specifying DIST).
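# Build a dated qcow2 image from the latest Debian daily cloud image.
DATESTAMP=$(shell date +"%Y-%m-%d")
FLAVOR?=generic
ARCH?=$(shell dpkg --print-architecture)
DIST?=sid
# Image file names use the release number for stable releases and the
# codename for sid, so DIST is mapped to RELEASE below.
RELEASE=$(DIST)
URL_PATH=https://cloud.debian.org/images/cloud/$(DIST)/daily/latest
ifeq ($(DIST),trixie)
RELEASE=13
endif
ifeq ($(DIST),bookworm)
RELEASE=12
endif
ifeq ($(DIST),bullseye)
RELEASE=11
endif
# Download the daily tarball. The target is named with $(RELEASE) so that it
# matches both the downloaded file and the prerequisite of the qcow2 rule.
debian-$(RELEASE)-$(FLAVOR)-$(ARCH)-daily.tar.xz:
	curl --fail --connect-timeout 20 -LO \
		$(URL_PATH)/debian-$(RELEASE)-$(FLAVOR)-$(ARCH)-daily.tar.xz
# Unpack, convert to qcow2, grow to 20 GB, and keep a pristine snapshot.
$(DIST)-$(FLAVOR)-$(DATESTAMP).qcow2: debian-$(RELEASE)-$(FLAVOR)-$(ARCH)-daily.tar.xz
	tar xvf debian-$(RELEASE)-$(FLAVOR)-$(ARCH)-daily.tar.xz
	qemu-img convert -O qcow2 disk.raw $@
	rm -f disk.raw
	qemu-img resize $@ 20g
	qemu-img snapshot -c untouched $@
image: $(DIST)-$(FLAVOR)-$(DATESTAMP).qcow2
.PHONY: image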
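To make the codename-to-release mapping in the makefile easier to follow, here is the same logic as a small shell sketch. The function names release_for and image_url are hypothetical helpers introduced only for illustration, and the amd64 default stands in for the dpkg --print-architecture call.

```shell
#!/bin/sh
# Map a Debian codename to the release label used in daily image file names.
release_for() {
    case "$1" in
        trixie)   echo 13 ;;
        bookworm) echo 12 ;;
        bullseye) echo 11 ;;
        *)        echo "$1" ;;   # sid and anything unmapped keep the codename
    esac
}

# Print the download URL for a given DIST, FLAVOR and ARCH.
image_url() {
    dist=${1:-sid}
    flavor=${2:-generic}
    arch=${3:-amd64}
    echo "https://cloud.debian.org/images/cloud/${dist}/daily/latest/debian-$(release_for "$dist")-${flavor}-${arch}-daily.tar.xz"
}

image_url bookworm nocloud arm64
```

With the makefile itself, the equivalent selection would be something like make image DIST=bookworm FLAVOR=nocloud, since both variables are set with ?= and can be overridden on the command line.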

Customize the VM environment with cloud-init
While the nocloud images can be useful, I typically find that I want to apply the same modifications to each new VM I launch, and they don't provide facilities for automating this. The generic images, on the other hand, run cloud-init by default. Using cloud-init, I can create my user account, point apt at local mirrors, install my preferred tools, ensure the root filesystem is resized to make full use of the backing storage, etc. The cloud-init configuration on the generic images will read from a local config drive, which can contain an ISO9660 (cdrom) filesystem image. This image can be generated from a subdirectory containing the various cloud-init input files using the following make syntax:
IMDS_FILES=$(shell find seedconfig -path '*/.git/*' \
    -prune -o -type f -name '*.in.json' -print) \
    seedconfig/openstack/latest/user_data
seed.iso: $(IMDS_FILES)
	genisoimage -V config-2 -o $@ -J -R -m '*~' -m '.git' seedconfig
With the image in place, the VM can be created with
 qemu-system-x86_64 -machine q35,accel=kvm \
    -cpu host -m 4g -drive file=${img},index=0,if=virtio,media=disk \
    -drive file=seed.iso,media=cdrom,format=raw,index=2,if=virtio \
    -nic user -nographic
This invokes qemu with the root volume and ISO image attached as disks, uses an emulated q35 machine with the host's CPU and KVM acceleration, the userspace network stack, and a serial console. The first time the VM boots, cloud-init will apply the configuration from the cloud-config available in the ISO9660 filesystem.
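For readers who have not written cloud-init input before, the following sketch creates a user_data file covering the customizations listed above (user account, apt mirror, packages, root filesystem growth). It is not my actual configuration: the user name, key, and mirror URL are hypothetical placeholders, and the config drive may also need metadata files that are not shown here.

```shell
#!/bin/sh
set -e

# Directory layout expected by the seed.iso rule above.
mkdir -p seedconfig/openstack/latest

cat > seedconfig/openstack/latest/user_data <<'EOF'
#cloud-config
# Hypothetical account: replace the name and key with your own.
users:
  - name: devuser
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...replace-with-your-public-key
# Hypothetical local mirror.
apt:
  primary:
    - arches: [default]
      uri: http://mirror.example.internal/debian
package_update: true
packages:
  - git
  - build-essential
# Grow the root partition and filesystem to fill the resized disk.
growpart:
  mode: auto
  devices: ['/']
resize_rootfs: true
EOF
```

The file must start with the #cloud-config line, otherwise cloud-init will not treat it as cloud-config data.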

Alternatives to cloud-init
virt-customize is another tool accomplishing the same type of customization. I use cloud-init because it works directly with cloud providers in addition to local VM images. You could also use something like ansible.

Variations
I have a variant of this that uses a bridged network, which I'll write more about later. The bridge is nice because it's more featureful, with full support for IPv6, etc, but it needs a bit more infrastructure in place. It also can be helpful to use 9p or virtfs to share filesystem state between the host and the VM. I don't tend to rely on these, and will instead use rsync or TRAMP for moving files around. Containers are also useful, of course, and there are plenty of times when the full isolation of a VM is not worth the overhead.

17 December 2024

Gunnar Wolf: The science of detecting LLM-generated text

This post is a review for Computing Reviews for "The science of detecting LLM-generated text", an article published in Communications of the ACM.
While artificial intelligence (AI) applications for natural language processing (NLP) are no longer something new or unexpected, nobody can deny the revolution and hype that started, in late 2022, with the announcement of the first public version of ChatGPT. By then, synthetic translation was well established and regularly used, many chatbots had started attending users' requests on different websites, voice recognition personal assistants such as Alexa and Siri had been widely deployed, and complaints of news sites filling their space with AI-generated articles were already commonplace. However, the ease of prompting ChatGPT or other large language models (LLMs) and getting extensive answers (its text generation quality is so high that it is often hard to discern whether a given text was written by an LLM or by a human) has sparked significant concern in many different fields. This article was written to present and compare the current approaches to detecting human- or LLM-authorship in texts. The article presents several different ways LLM-generated text can be detected. The first, and main, taxonomy followed by the authors is whether the detection can be done aided by the LLM's own functions ("white-box detection") or only by evaluating the generated text via a public application programming interface (API) ("black-box detection"). For black-box detection, the authors suggest training a classifier to discern the origin of a given text. Although this works at first, this task is doomed from its onset to be highly vulnerable to new LLMs generating text that will not follow the same patterns, and thus will probably evade recognition. The authors report that human evaluators find human-authored text to be more emotional and less objective, and use grammar to indicate the tone of the sentiment that should be used when reading the text, a trait that has not been picked up by LLMs yet.
Human-authored text also tends to have higher sentence-level coherence, with less term repetition in a given paragraph. The frequency distribution for more and less common words is much more homogeneous in LLM-generated texts than in human-written ones. White-box detection includes strategies whereby the LLMs will cooperate in identifying themselves in ways that are not obvious to the casual reader. This can include watermarking, be it rule based or neural based; in this case, both processes become a case of steganography, as the involvement of an LLM is explicitly hidden and spread through the full generated text, aiming at having a low detectability and high recoverability even when parts of the text are edited. The article closes by listing the authors' concerns about all of the above-mentioned technologies. Detecting an LLM, be it with or without the collaboration of the LLM's designers, is more of an art than a science, and methods deemed as robust today will not last forever. We also cannot assume that LLMs will continue to be dominated by the same core players; LLM technology has been deeply studied, and good LLM engines are available as free/open-source software, so users needing to do so can readily modify their behavior. This article presents itself as merely a survey of methods available today, while also acknowledging the rapid progress in the field. It is timely and interesting, and easy to follow for the informed reader coming from a different subfield.

Russ Allbery: Review: Iris Kelly Doesn't Date

Review: Iris Kelly Doesn't Date, by Ashley Herring Blake
Series: Bright Falls #3
Publisher: Berkley Romance
Copyright: October 2023
ISBN: 0-593-55058-7
Format: Kindle
Pages: 381
Iris Kelly Doesn't Date is a sapphic romance novel (probably a romantic comedy, although I'm bad at romance subgenres). It is the third book in the Bright Falls series. In the romance style, it has a new set of protagonists, but the protagonists of the previous books appear as supporting characters and reading this will spoil the previous books. Among the friend group we were introduced to in Delilah Green Doesn't Care, Iris was the irrepressible loudmouth. She's bad at secrets, good at saying whatever is on her mind, and has zero desire to either get married or have children. After one of the side plots of Astrid Parker Doesn't Fail, she has sworn off dating entirely. Iris is also now a romance novelist. Her paper store didn't get enough foot traffic to justify staying open, so she switched her planner business to online only and wrote a romance novel that was good enough to get a two-book deal. Now she needs to write a second book and she has absolutely nothing. Her own avoidance of romantic situations is not helping, but neither is her meddling family who are convinced her choices about marriage and family can be overturned with sufficient pestering. She desperately needs to shake up her life, get out of her creative rut, and do something new. Failing that, she'll settle for meeting someone in a bar and having some fun. Stevie is a barista and actress living in Portland. Six months ago, she broke up with Adri, her creative partner, girlfriend of six years, and the first person with whom she had a serious relationship. More precisely, Adri broke up with her. They're still friends, truly, even though that friendship is being seriously strained by Adri dating Vanessa, another member of their small and close-knit friend group. Stevie has occasionally-crippling anxiety, not much luck in finding real acting roles in Portland, and a desperate desire to not make waves. 
Ren, the fourth member of their friend group, thinks Stevie needs a new relationship, or at least a fling. That's how Stevie, with Ren as backup and encouragement, ends up at the same bar with Iris. The resulting dance and conversation was rather fun for both Stevie and Iris. The attempted one-night stand afterwards was a disaster due to Stevie's anxiety, and neither of them expected to see the other again. Stevie therefore felt safe pretending they'd hit it off to get her friends off her back. When Iris's continued restlessness lands her in an audition for Adri's fundraiser play that she also talked Stevie into performing in, this turns into a full-blown fake dating trope. These books continue to be impossible to put down. I'm not sure what Blake is doing to make the pacing so perfect, but as with the previous books of the series I found this utterly compulsive reading. I started it in the afternoon, took a break in the evening for a few hours, and then finished it at 2am. I wasn't sure if a book focused on Iris would work as well, but I need not have worried. Iris Kelly Doesn't Date is both more dramatic and more trope-centered than the earlier books, but Blake handles that in a way that fits Iris's personality and wasn't annoying even to a reader like me, who has an aversion to many types of relationship drama. The secret is Stevie, and specifically having the other protagonist be someone with severe anxiety.
"No" was never a very easy word for Stevie when it came to Adri, when it came to anyone, really. She could handle the little stuff (do you want a soda, have you seen this movie, do you like onions on your pizza) but the big stuff, the stuff that caused disappointed expressions and down-turned mouths... yeah, she sucked at that part. Her anxiety would flare, and she'd spend the next week convinced her friends hated her, she'd die alone and miserable, and wasn't worth a damn to anyone. Then, when said friend or family member eventually got ahold of her to tell her that, no, of course they didn't hate her, why in the world would she think that, her anxiety would crest once again, convincing her that she was terrible at understanding people and could never trust her own brain to make heads or tails of any social situation.
This is a spot-on description of a particular type of anxiety, but also this is the perfect protagonist to pair with Iris. Throughout the series, Iris has always been the ride-or-die friend, the person who may have no idea how to help but who will show up anyway and at least try to distract you. Stevie's anxiety makes Iris feel protective, which reveals one of the best sides of Iris's personality, and then the protectiveness plays off against Iris's own relationship issues and tendency to avoid taking anything too seriously. It's one of those relationships that starts a bit one-sided and then becomes mutually supporting once Stevie gets her feet under her. That's a relationship pattern I really enjoy reading about. As with the rest of the series, the friendship dynamics are great. Here we get to see two friend groups at work: Iris's, which we've seen in the previous two volumes and which expanded interestingly in Astrid Parker Doesn't Fail, and Stevie's, which is new. I liked all of these people, even Adri in her own way (although she's the hardest to like). The previous happily-ever-afters do get a bit awkward here, but Blake tries to make that part of the plot and also avoids most of the problem of somewhat-boring romantic bliss by spreading the friendship connections a bit wider. Stevie's friend group formed at orientation at Reed College, and that let me put my finger on another property of this series: essentially all of the characters are from a very specific social class. They're nearly all arts people (bookstore owner, photographer, interior decorator, actress, writer, director), they've mostly gone to college, and while most of them don't have lots of money, there's always at least one person in each friend group with significant wealth. Jordan, from the previous book, is a bit of an exception since she works in a trade (a carpenter), but she still acts like someone from that same social class. 
It's a bit like reading Jane Austen novels and realizing that the protagonists are drawn from a very specific and very narrow portion of society. This is not a complaint, to be clear; I have no objections to reading about a very specific social class. But if one has already read lots of books about this class of people, I could see that diminishing the appeal of this series a bit. There are a lot of assumptions baked into the story that aren't really questioned, such as the ubiquity of therapists. (I don't know how Stevie affords one on a barista salary.) There are also some small things in the terminology (therapy speak, for example) and in the specific type of earnestness with which the books attempt to be diverse on most axes other than social class that I suspect may grate a bit for some readers. If that's you, this is your warning. There is a third-act breakup here, just like the previous volumes. There is also a defense of the emotional punch of third-act breakups in romance novels in the book itself, put into Iris's internal monologue, so I suspect that's the author's answer to critics like myself who don't like the trope. I was less frustrated by this one because it fit the drama level of the protagonists, but I'll also know to expect a third-act breakup in any Blake novel I read in the future. But, all that said, the summary once again is that I loved this book and could not put it down. Iris is dramatic and occasionally self-destructive but has a core of earnest empathy that makes her easy to like. She's exactly the sort of extrovert who is soothing to introverts rather than draining because she carries the extrovert load of social situations. Stevie is adorably earnest and thoughtful beneath her anxiety. The two of them are wildly different and yet remarkably good together, and I loved reading their story. Highly recommended, along with the whole series. Start with Delilah Green Doesn't Care; if you like that, you're in for a treat.
Content note: This book is also rather sex-forward and pretty explicit in the sex scenes, maybe a touch more than Astrid Parker Doesn't Fail. If that is or is not your thing in romance novels, be aware going in. Rating: 9 out of 10

16 December 2024

Dirk Eddelbuettel: #45: Some r-ci Updates

Welcome to post 45 in the R^4 series! We introduced r-ci in post #32 nearly four years ago. It has found pretty widespread use and adoption, and we received a few kind words then (in the linked issue) and also more recently (in a follow-up comment) from which we merrily quote:
[ ] almost 3 years later on and I have had zero problems with this CI setup. For people who want reliable R software, resources like these are invaluable.
And while we followed up with post #41 about r2u for simple continuous integration, we may not have posted when we based r-ci on r2u (for the obvious Linux usage case). So let's make time now for a (comparatively smaller) update, and updated usage examples. We made two changes in the last few days. One is an (obvious in hindsight) simplification. Given that the bootstrap step was always executed, and needed no parameters, we pulled it into a new aggregated setup simply called r-ci that includes it so that it can be omitted as a step in the yaml file. Second, we recently needed Fortran on macOS too, and realized it was not installed by default so we just added that too. With that a real and used example is now as simple as the screenshot to the left (and hence one paragraph shorter). The trained eye will no doubt observe that there is nothing specific to a given repo. And that is basically the key feature: we can simply copy this file around and get fast and easy and reliable CI by taking advantage of the underlying robustness of r2u solving all dependencies automagically and reliably. The option to enable macOS is also solid and compelling as the GitHub runners are fast (but more expensive in how they count against the limit of minutes, so again a tradeoff to make), as is the option to run coverage if one so desires. Some of my repos do too. Take a look at the r-ci website which has more examples for the other supported CI services it can be used with, and feel free to ask questions as issues in the repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub. Please report excessive re-aggregation in third-party for-profit settings.

15 December 2024

Russell Coker: Hisense 65U80G 65 Inch 8K ULED Android TV (2021)

The Aim

I just bought a Hisense 65U80G 65 Inch 8K ULED Android TV (2021 model) for $1,568 including delivery. I got that deal by googling refurbished 8K TVs and finding the cheapest one I could buy. Amazon and eBay didn't have any good prices on second hand 8K TVs and new ones start at $3,000 on special. I didn't assess how Hisense compares to other TVs; as far as I could determine there was only one model of 8K TV on sale in Australia in the price range I was prepared to pay. So I won't review how this TV compares to other models but how refurbished TVs compare to other display options.

I bought this because the highest resolution monitor in my price range is 5120*2160 [1]. While I could get a 5120*2880 monitor for around $1,500, paying 3* the money for 33% more pixels is bad value for money. Getting 4* the pixels for under 3* the price is good value even when it's a TV with the lower display quality that involves.

Before buying this TV I read this blog post by Daniel Lawrence about using an 8K TV as a primary monitor [2]. While he has an interesting setup with a 65" TV on a large desk, it's not what I plan to do at this time.

My Plans for Use

I don't plan to make it a main monitor. While 5120*2160 isn't as good as I like on my desk, it's bearable and the quality of the display is high. High resolution isn't needed for all tasks, for example I'm writing this blog post on my laptop while watching a movie on the 8K TV.

One thing I'd like to do with the 8K TV when I get it working as a monitor is to share the screen for team programming projects. I don't have any specific plans other than team coding projects at the moment. But it will be interesting to experiment with it when I get it working.

Technical Issues with High Resolution Monitors

Hardware Needed

A lot of the graphics hardware out there doesn't support resolutions higher than 5120*2880. It seems that most laptops don't support resolutions higher than that and higher resolutions than 4K are difficult. Only quite recent and high end video cards will do 8K. Apparently the RTX 2080 is one of the oldest ones that does and that's $400 on eBay. Strangely the GPU chipset spec pages don't list the maximum resolution and there's the additional complication that the other chips might not support the resolutions that the GPU itself can support. As an aside I don't use NVidia cards for regular workstations due to reliability problems. But they are good for ML work and for special purpose systems.

Interface Versions

To do 8K video it seems that you need HDMI 2.1 (or maybe 2.0 with 4:2:0 chroma subsampling) or DisplayPort 1.3 for 30Hz with 24bit color and 2.0 for higher refresh rates. But using a particular version of the interface doesn't require supporting all the resolutions that it might support. This TV has HDMI 2.1 inputs, and I've bought an adaptor cable that does DisplayPort 1.4 to HDMI 2.1 at 8K resolution. So I need a video card that does DisplayPort 1.4 or HDMI 2.1 output. That doesn't mean that the card will work, but it could work.

It's a pity that no-one has made a USB-C video controller that has a basic frame-buffer supporting 8K and the minimal GPU capabilities. The consensus of opinion is that no games will run well at 8K at this time so anyone using 8K resolution doesn't need GPU power unless it's for ML stuff. I'm thinking of making a system that can be used as a ML server and X/Wayland server so a GPU with a decent amount of RAM and compute power would be good. I'm not particularly interested in spending $1,500+ to get a GPU that can drive a $1,568 TV. I'm looking into getting a RTX A2000 with 12G of RAM which should be adequate for ML experiments and can handle 8K@60Hz output. I've ordered a DisplayPort to HDMI converter cable so if I get a DisplayPort card it will work.

Software Support

When I first got started with 4K monitors I had significant problems in adjusting the UI to be usable. Software support for scaling is much better now than it was then and 8K 65" has a lower DPI than 4K 32". So I hope this won't be an issue.

Progress So Far

My first Hisense 8K TV stopped working properly. It would change to a mostly white screen after being used for some time. The screen would change in ways that correlate to changes in what should appear, but not in a way that was usable. It was just a different pattern of white blobs when I changed to a menu view, not anything that allowed using it. I presume that this was the problem that drove a need for refurbishment as when I first got the TV it was still signed in to Google accounts for YouTube and to Netflix. Best Buy Electrical was good about providing a quick replacement, they took away the old TV and delivered a new one on the same visit and it's now working well.

I've obtained a NVidia card that can allegedly do 8K output and a combination of cables that might be able to carry an 8K signal. Now I just need to get the NVidia drivers to not cause a kernel panic to get things to work.
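The pixel density comparison made under Software Support can be checked with a little arithmetic. The sketch below assumes the usual 16:9 panel resolutions, 7680*4320 for 8K and 3840*2160 for 4K (neither figure is stated in the post), and measures density along the diagonal.

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density along the diagonal of a flat panel."""
    return math.hypot(width_px, height_px) / diagonal_in


# Assumed resolutions: 8K = 7680x4320, 4K = 3840x2160.
ppi_8k_65 = pixels_per_inch(7680, 4320, 65)
ppi_4k_32 = pixels_per_inch(3840, 2160, 32)

print(f'8K at 65": {ppi_8k_65:.1f} PPI')
print(f'4K at 32": {ppi_4k_32:.1f} PPI')
```

Under those assumptions the 65" 8K panel comes out slightly less dense than a 32" 4K monitor, so scaling settings that work on the latter should be a reasonable starting point.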

13 December 2024

Emanuele Rocca: Murder Mystery: GCC Builds Failing After sbuild Refactoring

This is the story of an investigation conducted by Jochen Sprickerhof, Helmut Grohne, and myself. It was true teamwork, and we would not have reached the bottom of the issue working individually. We think you will find it as interesting and fun as we did, so here is a brief writeup. A few of the steps mentioned here took several days, others just a few minutes. What is described as a natural progression of events did not always look obvious at the time.
Let us go through the Six Stages of Debugging together.

Stage 1: That cannot happen
Official Debian GCC builds start failing on multiple architectures in late November.
The build error happens on the build servers when running the testsuite, but we know this cannot happen. GCC builds are not meant to fail in case of testsuite failures! Return codes are not making the build fail, make is being called with -k, it just cannot happen.
A lot of the GCC tests are always failing in fact, and an extensive log of the results is posted to the debian-gcc mailing list, but the packages always build fine regardless.
On the build daemons, build failures take several hours.

Stage 2: That does not happen on my machine
Building on my machine running Bookworm is just fine. The Build Daemons run Bookworm and use a Sid chroot for the build environment, just like I am. Same kernel.
The only obvious difference between my setup and the Debian buildds is that I am using sbuild 0.85.0 from bookworm, and the buildds have 0.86.3~bpo12+1 from bookworm-backports. Trying again with 0.86.3~bpo12+1, the build fails on my system too. The build daemons were updated to the bookworm-backports version of sbuild at some point in late November. Ha.

Stage 3: That should not happen
There are quite a few sbuild versions in between 0.85.0 and 0.86.3~bpo12+1, but looking at recent sbuild bugs shows that sbuild 0.86.0 was breaking "quite a number of packages". Indeed, with 0.86.0 the build still fails. Trying the version immediately before, 0.85.11, the build finishes correctly. This took more time than it sounds, one run including the tests takes several hours. We need a way to shorten this somehow.
The Debian packaging of GCC allows specifying which languages you may want to skip, and by default it builds Ada, Go, C, C++, D, Fortran, Objective C, Objective C++, M2, and Rust. When running the tests sequentially, the build logs stop roughly around the tests of a runtime library for D, libphobos. So can we still reproduce the failure by skipping everything except for D? With DEB_BUILD_OPTIONS=nolang=ada,go,c,c++,fortran,objc,obj-c++,m2,rust the build still fails, and it fails faster than before. Several minutes, not hours. This is progress, and time to file a bug. The report contains massive spoilers, so no link. :-)

Stage 4: Why does that happen?
Something is causing the build to end prematurely. It's not the OOM killer, and the kernel does not have anything useful to say in the logs. Can it be that the D language tests are sending signals to some process, and that is what's killing make? We start tracing signals sent with bpftrace by writing the following script, signals.bt:
tracepoint:signal:signal_generate {
    printf("%s (PID: %d) sent signal %d to PID %d\n", comm, pid, args->sig, args->pid);
}
And executing it with sudo bpftrace signals.bt.
The build takes its sweet time, and it fails. Looking at the trace output there's a suspicious process.exe terminating stuff.
process.exe (PID: 2868133) sent signal 15 to PID 711826
That looks interesting, but we have no clue what PID 711826 may be. Let's change the script a bit, and trace signals received as well.
tracepoint:signal:signal_generate {
    printf("PID %d (%s) sent signal %d to %d\n", pid, comm, args->sig, args->pid);
}

tracepoint:signal:signal_deliver {
    printf("PID %d (%s) received signal %d\n", pid, comm, args->sig);
}
The working version of sbuild was using dumb-init, whereas the new one features a little init in perl. We patch the current version of sbuild by making it use dumb-init instead, and trace two builds: one with the perl init, one with dumb-init.
Here are the signals observed when building with dumb-init.
PID 3590011 (process.exe) sent signal 2 to 3590014
PID 3590014 (sleep) received signal 9
PID 3590011 (process.exe) sent signal 15 to 3590063
PID 3590063 (std.process tem) received signal 9
PID 3590011 (process.exe) sent signal 9 to 3590065
PID 3590065 (std.process tem) received signal 9
And this is what happens with the new init in perl:
PID 3589274 (process.exe) sent signal 2 to 3589291
PID 3589291 (sleep) received signal 9
PID 3589274 (process.exe) sent signal 15 to 3589338
PID 3589338 (std.process tem) received signal 9
PID 3589274 (process.exe) sent signal 9 to 3589340
PID 3589340 (std.process tem) received signal 9
PID 3589274 (process.exe) sent signal 15 to 3589341
PID 3589274 (process.exe) sent signal 15 to 3589323
PID 3589274 (process.exe) sent signal 15 to 3589320
PID 3589274 (process.exe) sent signal 15 to 3589274
PID 3589274 (process.exe) received signal 9
PID 3589341 (sleep) received signal 9
PID 3589273 (sbuild-usernsex) sent signal 9 to 3589320
PID 3589273 (sbuild-usernsex) sent signal 9 to 3589323
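The two traces above can also be compared mechanically rather than by eye. The following is a minimal sketch, assuming the line format printed by the second bpftrace script; the helper name `sent_signals` is ours, and the trace text is copied verbatim from the output shown above.

```python
import re
from collections import Counter

# Matches the "sent" lines only; comm may contain spaces ("std.process tem").
SENT = re.compile(r"PID (\d+) \((.*?)\) sent signal (\d+) to (-?\d+)")


def sent_signals(trace: str) -> Counter:
    """Count (sender comm, signal number) pairs among the 'sent signal' lines."""
    counts = Counter()
    for line in trace.splitlines():
        m = SENT.match(line.strip())
        if m:
            counts[(m.group(2), int(m.group(3)))] += 1
    return counts


DUMB_INIT = """\
PID 3590011 (process.exe) sent signal 2 to 3590014
PID 3590014 (sleep) received signal 9
PID 3590011 (process.exe) sent signal 15 to 3590063
PID 3590063 (std.process tem) received signal 9
PID 3590011 (process.exe) sent signal 9 to 3590065
PID 3590065 (std.process tem) received signal 9
"""

PERL_INIT = """\
PID 3589274 (process.exe) sent signal 2 to 3589291
PID 3589291 (sleep) received signal 9
PID 3589274 (process.exe) sent signal 15 to 3589338
PID 3589338 (std.process tem) received signal 9
PID 3589274 (process.exe) sent signal 9 to 3589340
PID 3589340 (std.process tem) received signal 9
PID 3589274 (process.exe) sent signal 15 to 3589341
PID 3589274 (process.exe) sent signal 15 to 3589323
PID 3589274 (process.exe) sent signal 15 to 3589320
PID 3589274 (process.exe) sent signal 15 to 3589274
PID 3589274 (process.exe) received signal 9
PID 3589341 (sleep) received signal 9
PID 3589273 (sbuild-usernsex) sent signal 9 to 3589320
PID 3589273 (sbuild-usernsex) sent signal 9 to 3589323
"""

# Counter subtraction keeps only what the perl init trace has in excess.
extra = sent_signals(PERL_INIT) - sent_signals(DUMB_INIT)
for (comm, sig), n in sorted(extra.items()):
    print(f"{comm}: {n} extra signal {sig}")
```

Keying on the sender's command name and the signal number, instead of PIDs, makes the two runs comparable even though every PID differs between them.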
There are a few additional SIGTERMs being sent when using the perl init, that's helpful. At this point we are fairly convinced that process.exe is worth additional inspection. The source code of process.d shows something interesting:
1221 @system unittest
1222 {
[...]
1247     auto pid = spawnProcess(["sleep", "10000"],
[...]
1260     // kill the spawned process with SIGINT
1261     // and send its return code
1262     spawn((shared Pid pid) {
1263         auto p = cast() pid;
1264         kill(p, SIGINT);
So yes, there's our sleep and the SIGINT (signal 2) right in the unit tests of process.d, just like we have observed in the bpftrace output.
Can we study the behavior of process.exe in isolation, separately from the build? Indeed we can. Let's take the executable from a failed build, and try running it under /usr/libexec/sbuild-usernsexec.
First, we prepare a chroot inside a suitable user namespace:
unshare --map-auto --setuid 0 --setgid 0 mkdir /tmp/rootfs
cd /tmp/rootfs
cat /home/ema/.cache/sbuild/unstable-arm64.tar | unshare --map-auto --setuid 0 --setgid 0 tar xf -
unshare --map-auto --setuid 0 --setgid 0 mkdir /tmp/rootfs/whatever
unshare --map-auto --setuid 0 --setgid 0 cp process.exe /tmp/rootfs/
Now we can run process.exe on its own using the perl init, and trace signals at will:
/usr/libexec/sbuild-usernsexec --pivotroot --nonet u:0:100000:65536  g:0:100000:65536 /tmp/rootfs ema /whatever -- /process.exe
We can compare the behavior of the perl init vis-a-vis the one using dumb-init in milliseconds instead of minutes.

Stage 5: Oh, I see.
Why process.exe sends more SIGTERMs when using the perl init is now the big question. We have a simple reproducer, so this is where using strace becomes possible.
sudo strace --user ema --follow-forks -o sbuild-dumb-init.strace ./sbuild-usernsexec-dumb-init --pivotroot --nonet u:0:100000:65536  g:0:100000:65536 /tmp/dumbroot ema /whatever -- /process.exe
We start comparing the strace output of dumb-init with that of perl-init, looking in particular for different calls to kill.
Here is what process.exe does under dumb-init:
3593883 kill(-2, SIGTERM)               = -1 ESRCH (No such process)
No such process. Under perl-init instead:
3593777 kill(-2, SIGTERM <unfinished ...>
The process is there under perl-init!
That is a kill with negative pid. From the kill(2) man page:
If pid is less than -1, then sig is sent to every process in the process group whose ID is -pid.
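To make the man page rule concrete, here is a small sketch that classifies the pid argument the way kill(2) documents it. The function name `kill_target` and the wording of the labels are ours; nothing is actually signalled.

```python
from typing import Optional, Tuple


def kill_target(pid: int) -> Tuple[str, Optional[int]]:
    """Describe who receives the signal for a given kill(2) pid argument."""
    if pid > 0:
        return ("process", pid)
    if pid == 0:
        return ("caller's process group", None)
    if pid == -1:
        return ("every process the caller may signal", None)
    # pid < -1: the signal goes to the whole process group with ID -pid.
    return ("process group", -pid)


print(kill_target(-2))
print(kill_target(711826))
```

Applied to the strace output above, kill(-2, SIGTERM) therefore targets process group 2, whatever happens to live there.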
It would have been very useful to see this kill with negative pid in the output of bpftrace, why didn't we? The tracepoint used, tracepoint:signal:signal_generate, shows when signals are actually being sent, and not the syscall being called. To confirm, one can trace tracepoint:syscalls:sys_enter_kill and see the negative PIDs, for example:
PID 312719 (bash) sent signal 2 to -312728
The obvious question at this point is: why is there no process group 2 when using dumb-init?

Stage 6: How did that ever work?
We know that process.exe sends a SIGTERM to every process in the process group with ID 2. To find out what this process group may be, we spawn a shell with dumb-init and observe under /proc PIDs 1, 16, and 17. With perl-init we have 1, 2, and 17. When running dumb-init, there are a few forks before launching the program, explaining the difference. Looking at /proc/2/cmdline we see that it's bash, i.e. the program we are running under perl-init. When building a package, that is dpkg-buildpackage itself.
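Besides cmdline, /proc/<pid>/stat records the process group of each PID as its fifth field, which is a quick way to check which processes share a group. The sketch below is our own helper, not something used in the investigation, and the sample lines are made up for illustration.

```python
def pgrp_from_stat(stat_line: str) -> int:
    """Extract the process group ID from the contents of /proc/<pid>/stat."""
    # The comm field is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the last ')' rather than on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # Fields after comm: state, ppid, pgrp, session, ...
    return int(after_comm[2])


def pgrp_of(pid: int) -> int:
    """Read the process group of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return pgrp_from_stat(f.read())


# Hypothetical stat contents, truncated after the session field.
print(pgrp_from_stat("2 (bash) S 1 2 2"))
print(pgrp_from_stat("17 (std.process tem) S 2 2 2"))
```

On a Linux system, pgrp_of(os.getpid()) should agree with os.getpgrp().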
The test is accidentally killing its own process group.
Now where does this -2 come from in the test?
2363     // Special values for _processID.
2364     enum invalid = -1, terminated = -2;
Oh. -2 is used as a special value for PID, meaning "terminated". And there's a call to kill() later on:
2694     do { s = tryWait(pid); } while (!s.terminated);
[...]
2697     assertThrown!ProcessException(kill(pid));
What sets pid to terminated you ask?
Here is tryWait:
2568 auto tryWait(Pid pid) @safe
2569 {
2570     import std.typecons : Tuple;
2571     assert(pid !is null, "Called tryWait on a null Pid.");
2572     auto code = pid.performWait(false);
And performWait:
2306         _processID = terminated;
The solution, dear reader, is not to kill.
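The failure mode is easy to model outside of D. The sketch below is a Python stand-in for the logic in the excerpts above, not the actual Phobos code and not the fix that was applied: the class and function names are ours, and no signal is really sent. It shows how a sentinel stored in the PID field turns into a process group target, and one way a guard could refuse to pass it on.

```python
import signal

INVALID = -1      # mirrors `invalid` in the excerpt
TERMINATED = -2   # mirrors `terminated` in the excerpt


class ProcessError(Exception):
    """Hypothetical stand-in for D's ProcessException."""


class Pid:
    def __init__(self, process_id: int) -> None:
        self.process_id = process_id

    def perform_wait(self) -> None:
        # Model only: pretend the child has exited and mark it terminated,
        # as performWait does with _processID.
        self.process_id = TERMINATED


def kill_args_unchecked(pid: Pid, sig: int = signal.SIGTERM) -> tuple:
    """Arguments that would reach kill(2) if the stored ID is used as-is."""
    return (pid.process_id, int(sig))


def kill_args_checked(pid: Pid, sig: int = signal.SIGTERM) -> tuple:
    """Same, but refuse sentinel values instead of handing them to kill(2)."""
    if pid.process_id in (INVALID, TERMINATED):
        raise ProcessError("Pid is invalid or already terminated")
    return (pid.process_id, int(sig))


p = Pid(3590063)
p.perform_wait()
print(kill_args_unchecked(p))  # the sentinel is now the kill target
```

With the unchecked variant the test only "works" as long as process group 2 does not exist, which is exactly the accident that dumb-init's extra forks provided.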
PS: the bug report with spoilers for those interested is #1089007.

12 December 2024

Matthew Garrett: Android privacy improvements break key attestation

Sometimes you want to restrict access to something to a specific set of devices - for instance, you might want your corporate VPN to only be reachable from devices owned by your company. You can't really trust a device that self attests to its identity, for instance by reporting its MAC address or serial number, for a couple of reasons:
If we want a high degree of confidence that the device we're talking to really is the device it claims to be, we need something that's much harder to spoof. For devices with a TPM this is the TPM itself. Every TPM has an Endorsement Key (EK) that's associated with a certificate that chains back to the TPM manufacturer. By verifying that certificate path and having the TPM prove that it's in possession of the private half of the EK, we know that we're communicating with a genuine TPM[1].

Android has a broadly equivalent thing called ID Attestation. Android devices can generate a signed attestation that they have certain characteristics and identifiers, and this can be chained back to the manufacturer. Obviously providing signed proof of the device identifier is kind of problematic from a privacy perspective, so the short version[2] is that only apps installed using a corporate account rather than a normal user account are able to do this.

But that's still not ideal - the device identifiers involved included the IMEI and serial number of the device, and those could potentially be used to correlate devices across privacy boundaries since they're static[3] identifiers that are the same both inside a corporate work profile and in the normal user profile, and also remain static if you move between different employers and use the same phone[4]. So, since Android 12, ID Attestation includes an "Enterprise Specific ID" or ESID. The ESID is based on a hash of device-specific data plus the enterprise that the corporate work profile is associated with. If a device is enrolled with the same enterprise then this ID will remain static, if it's enrolled with a different enterprise it'll change, and it just doesn't exist outside the work profile at all. The other device identifiers are no longer exposed.
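The properties described for the ESID follow from hashing the two inputs together. The sketch below is purely illustrative: the function name, the use of SHA-256, the separator byte and the input values are all assumptions, and Android's real derivation is not described here.

```python
import hashlib


def enterprise_specific_id(device_data: bytes, enterprise_id: str) -> str:
    """Toy ESID: a hash over device-specific data plus the enterprise."""
    h = hashlib.sha256()
    h.update(device_data)
    h.update(b"\x00")  # separator so the two inputs cannot run together
    h.update(enterprise_id.encode("utf-8"))
    return h.hexdigest()


device = b"hypothetical-device-specific-data"
first = enterprise_specific_id(device, "employer-a")
again = enterprise_specific_id(device, "employer-a")
other = enterprise_specific_id(device, "employer-b")

print(first == again)  # same device, same enterprise: stable
print(first == other)  # same device, different enterprise: changes
```

Because the enterprise is an input to the hash, two employers cannot correlate the same phone by comparing IDs, which is the privacy gain over a static IMEI or serial number.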

But device ID verification isn't enough to solve the underlying problem here. When we receive a device ID attestation we know that someone at the far end has possession of a device with that ID, but we don't know that that device is where the packets are originating. If our VPN simply has an API that asks for an attestation from a trusted device before routing packets, we could pass that on to said trusted device and then simply forward the attestation to the VPN server[5]. We need some way to prove that the device trying to authenticate is actually that device.

The answer to this is key provenance attestation. If we can prove that an encryption key was generated on a trusted device, and that the private half of that key is stored in hardware and can't be exported, then using that key to establish a connection proves that we're actually communicating with a trusted device. TPMs are able to do this using the attestation keys generated in the Credential Activation process, giving us proof that a specific keypair was generated on a TPM that we've previously established is trusted.

Android again has an equivalent called Key Attestation. This doesn't quite work the same way as the TPM process - rather than being tied back to the same unique cryptographic identity, Android key attestation chains back through a separate cryptographic certificate chain but contains a statement about the device identity - including the IMEI and serial number. By comparing those to the values in the device ID attestation we know that the key is associated with a trusted device and we can now establish trust in that key.

"But Matthew", those of you who've been paying close attention may be saying, "Didn't Android 12 remove the IMEI and serial number from the device ID attestation?" And, well, congratulations, you were apparently paying more attention than Google. The key attestation no longer contains enough information to tie back to the device ID attestation, making it impossible to prove that a hardware-backed key is associated with a specific device ID attestation and its enterprise enrollment.

I don't think this was any sort of deliberate breakage, and it's probably more an example of shipping the org chart - my understanding is that device ID attestation and key attestation are implemented by different parts of the Android organisation and the impact of the ESID change (something that appears to be a legitimate improvement in privacy!) on key attestation was probably just not realised. But it's still a pain.

[1] Those of you paying attention may realise that what we're doing here is proving the identity of the TPM, not the identity of the device it's associated with. Typically the TPM identity won't vary over the lifetime of the device, so having a one-time binding of those two identities (such as when a device is initially being provisioned) is sufficient. There's actually a spec for distributing Platform Certificates that allows device manufacturers to bind these together during manufacturing, but I last worked on those a few years back and don't know what the current state of the art there is.

[2] Android has a bewildering array of different profile mechanisms, some of which are apparently deprecated, and I can never remember how any of this works, so you're not getting the long version

[3] Nominally, anyway. Cough.

[4] I wholeheartedly encourage people not to put work accounts on their personal phones, but I am a filthy hypocrite here

[5] Obviously if we have the ability to ask for attestation from a trusted device, we have access to a trusted device. Why not simply use the trusted device? The answer there may be that we've compromised one and want to do as little as possible on it in order to reduce the probability of triggering any sort of endpoint detection agent, or it may be because we want to run on a device with different security properties than those enforced on the trusted device.
