Search Results: "pm"

19 May 2022

Agathe Porte: Status update, May 2022

Boing, time for another status update.
Debian work I have finally found how to make my fonts-creep2 package work on my Debian machines. The solution was to not use the TTF file that contains the Bitmap glyphs, but instead generate an OTB file, which is an OpenType format for Bitmap fonts. Creep2 font used in htop command This means that I can close the fonts-creep ITP bug altogether and rely on this fonts-creep2 package instead. Hopefully it will be reviewed and uploaded soon by a certified Debian Developer. This font is too small for daily usage, but imagine the quantity of data you could display on an auxiliary screen with poor resolution (and poor pixel density eventually). Here is a meme I created for the occasion: Hide the pain Harold meme. First: Package software and its gazillion dependencies. Second: Popcon says I'm the only user. Checks out.
Rust work I have obsoleted my most popular Rust crate, gladis. Screenshot of the Gladis Github README Indeed, the GTK folks have managed to develop a similar solution named CompositeTemplate, that is available in both gtk3-macros and gtk4-macros crates. I did not investigate from how long this has been available before I created this crate. Hopefully it did not exist before I developed it. I have learnt a lot about Rust crates development with this crate, and managed to put in place a semi-automated release flow that I will surely use in other future crates. See ya.

18 May 2022

Reproducible Builds: Supporter spotlight: Jan Nieuwenhuizen on Bootstrappable Builds, GNU Mes and GNU Guix

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about our project and the work that we do. This is the fourth instalment in a series featuring the projects, companies and individuals who support the Reproducible Builds project. We started this series by featuring the Civil Infrastructure Platform project and followed this up with a post about the Ford Foundation as well as a recent ones about ARDC and the Google Open Source Security Team (GOSST). Today, however, we will be talking with Jan Nieuwenhuizen about Bootstrappable Builds, GNU Mes and GNU Guix.
Chris Lamb: Hi Jan, thanks for taking the time to talk with us today. First, could you briefly tell me about yourself? Jan: Thanks for the chat; it s been a while! Well, I ve always been trying to find something new and interesting that is just asking to be created but is mostly being overlooked. That s how I came to work on GNU Guix and create GNU Mes to address the bootstrapping problem that we have in free software. It s also why I have been working on releasing Dezyne, a programming language and set of tools to specify and formally verify concurrent software systems as free software. Briefly summarised, compilers are often written in the language they are compiling. This creates a chicken-and-egg problem which leads users and distributors to rely on opaque, pre-built binaries of those compilers that they use to build newer versions of the compiler. To gain trust in our computing platforms, we need to be able to tell how each part was produced from source, and opaque binaries are a threat to user security and user freedom since they are not auditable. The goal of bootstrappability (and the bootstrappable.org project in particular) is to minimise the amount of these bootstrap binaries. Anyway, after studying Physics at Eindhoven University of Technology (TU/e), I worked for digicash.com, a startup trying to create a digital and anonymous payment system sadly, however, a traditional account-based system won. Separate to this, as there was no software (either free or proprietary) to automatically create beautiful music notation, together with Han-Wen Nienhuys, I created GNU LilyPond. Ten years ago, I took the initiative to co-found a democratic school in Eindhoven based on the principles of sociocracy. And last Christmas I finally went vegan, after being mostly vegetarian for about 20 years!
Chris: For those who have not heard of it before, what is GNU Guix? What are the key differences between Guix and other Linux distributions? Jan: GNU Guix is both a package manager and a full-fledged GNU/Linux distribution. In both forms, it provides state-of-the-art package management features such as transactional upgrades and package roll-backs, hermetical-sealed build environments, unprivileged package management as well as per-user profiles. One obvious difference is that Guix forgoes the usual Filesystem Hierarchy Standard (ie. /usr, /lib, etc.), but there are other significant differences too, such as Guix being scriptable using Guile/Scheme, as well as Guix s dedication and focus on free software.
Chris: How does GNU Guix relate to GNU Mes? Or, rather, what problem is Mes attempting to solve? Jan: GNU Mes was created to address the security concerns that arise from bootstrapping an operating system such as Guix. Even if this process entirely involves free software (i.e. the source code is, at least, available), this commonly uses large and unauditable binary blobs. Mes is a Scheme interpreter written in a simple subset of C and a C compiler written in Scheme, and it comes with a small, bootstrappable C library. Twice, the Mes bootstrap has halved the size of opaque binaries that were needed to bootstrap GNU Guix. These reductions were achieved by first replacing GNU Binutils, GNU GCC and the GNU C Library with Mes, and then replacing Unix utilities such as awk, bash, coreutils, grep sed, etc., by Gash and Gash-Utils. The final goal of Mes is to help create a full-source bootstrap for any interested UNIX-like operating system.
Chris: What is the current status of Mes? Jan: Mes supports all that is needed from R5RS and GNU Guile to run MesCC with Nyacc, the C parser written for Guile, for 32-bit x86 and ARM. The next step for Mes would be more compatible with Guile, e.g., have guile-module support and support running Gash and Gash Utils. In working to create a full-source bootstrap, I have disregarded the kernel and Guix build system for now, but otherwise, all packages should be built from source, and obviously, no binary blobs should go in. We still need a Guile binary to execute some scripts, and it will take at least another one to two years to remove that binary. I m using the 80/20 approach, cutting corners initially to get something working and useful early. Another metric would be how many architectures we have. We are quite a way with ARM, tinycc now works, but there are still problems with GCC and Glibc. RISC-V is coming, too, which could be another metric. Someone has looked into picking up NixOS this summer. How many distros do anything about reproducibility or bootstrappability? The bootstrappability community is so small that we don t need metrics, sadly. The number of bytes of binary seed is a nice metric, but running the whole thing on a full-fledged Linux system is tough to put into a metric. Also, it is worth noting that I m developing on a modern Intel machine (ie. a platform with a management engine), that s another key component that doesn t have metrics.
Chris: From your perspective as a Mes/Guix user and developer, what does reproducibility mean to you? Are there any related projects? Jan: From my perspective, I m more into the problem of bootstrapping, and reproducibility is a prerequisite for bootstrappability. Reproducibility clearly takes a lot of effort to achieve, however. It s relatively easy to install some Linux distribution and be happy, but if you look at communities that really care about security, they are investing in reproducibility and other ways of improving the security of their supply chain. Projects I believe are complementary to Guix and Mes include NixOS, Debian and on the hardware side the RISC-V platform shares many of our core principles and goals.
Chris: Well, what are these principles and goals? Jan: Reproducibility and bootstrappability often feel like the next step in the frontier of free software. If you have all the sources and you can t reproduce a binary, that just doesn t feel right anymore. We should start to desire (and demand) transparent, elegant and auditable software stacks. To a certain extent, that s always been a low-level intent since the beginning of free software, but something clearly got lost along the way. On the other hand, if you look at the NPM or Rust ecosystems, we see a world where people directly install binaries. As they are not as supportive of copyleft as the rest of the free software community, you can see that movement and people in our area are doing more as a response to that so that what we have continues to remain free, and to prevent us from falling asleep and waking up in a couple of years and see, for example, Rust in the Linux kernel and (more importantly) we require big binary blobs to use our systems. It s an excellent time to advance right now, so we should get a foothold in and ensure we don t lose any more.
Chris: What would be your ultimate reproducibility goal? And what would the key steps or milestones be to reach that? Jan: The ultimate goal would be to have a system built with open hardware, with all software on it fully bootstrapped from its source. This bootstrap path should be easy to replicate and straightforward to inspect and audit. All fully reproducible, of course! In essence, we would have solved the supply chain security problem. Our biggest challenge is ignorance. There is much unawareness about the importance of what we are doing. As it is rather technical and doesn t really affect everyday computer use, that is not surprising. This unawareness can be a great force driving us in the opposite direction. Think of Rust being allowed in the Linux kernel, or Python being required to build a recent GNU C library (glibc). Also, the fact that companies like Google/Apple still want to play us vs them , not willing to to support GPL software. Not ready yet to truly support user freedom. Take the infamous log4j bug everyone is using open source these days, but nobody wants to take responsibility and help develop or nurture the community. Not ecosystem , as that s how it s being approached right now: live and let live/die: see what happens without taking any responsibility. We are growing and we are strong and we can do a lot but if we have to work against those powers, it can become problematic. So, let s spread our great message and get more people involved!
Chris: What has been your biggest win? Jan: From a technical point of view, the full-source bootstrap has have been our biggest win. A talk by Carl Dong at the 2019 Breaking Bitcoin conference stated that connecting Jeremiah Orian s Stage0 project to Mes would be the holy grail of bootstrapping, and we recently managed to achieve just that: in other words, starting from hex0, 357-byte binary, we can now build the entire Guix system. This past year we have not made significant visible progress, however, as our funding was unfortunately not there. The Stage0 project has advanced in RISC-V. A month ago, though, I secured NLnet funding for another year, and thanks to NLnet, Ekaitz Zarraga and Timothy Sample will work on GNU Mes and the Guix bootstrap as well. Separate to this, the bootstrappable community has grown a lot from two people it was six years ago: there are now currently over 100 people in the #bootstrappable IRC channel, for example. The enlarged community is possibly an even more important win going forward.
Chris: How realistic is a 100% bootstrappable toolchain? And from someone who has been working in this area for a while, is solving Trusting Trust) actually feasible in reality? Jan: Two answers: Yes and no, it really depends on your definition. One great thing is that the whole Stage0 project can also run on the Knight virtual machine, a hardware platform that was designed, I think, in the 1970s. I believe we can and must do better than we are doing today, and that there s a lot of value in it. The core issue is not the trust; we can probably all trust each other. On the other hand, we don t want to trust each other or even ourselves. I am not, personally, going to inspect my RISC-V laptop, and other people create the hardware and do probably not want to inspect the software. The answer comes back to being conscientious and doing what is right. Inserting GCC as a binary blob is not right. I think we can do better, and that s what I d like to do. The security angle is interesting, but I don t like the paranoid part of that; I like the beauty of what we are creating together and stepwise improving on that.
Chris: Thanks for taking the time to talk to us today. If someone wanted to get in touch or learn more about GNU Guix or Mes, where might someone go? Jan: Sure! First, check out: I m also on Twitter (@janneke_gnu) and on octodon.social (@janneke@octodon.social).
Chris: Thanks for taking the time to talk to us today. Jan: No problem. :)


For more information about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

Louis-Philippe V ronneau: Clojure Team 2022 Sprint Report

This is the report for the Debian Clojure Team remote sprint that took place on May 13-14th. Looking at my previous blog entries, this was my first Debian sprint since July 2020! Crazy how fast time flies... Many thanks to those who participated, namely: Sadly, Utkarsh Gupta although having planned on participating ended up not being able to and worked on DebConf Bursary paperwork instead. rlb Rob mostly worked on creating a dh-clojure tool to help make packaging Clojure libraries easier. At the moment, most of the packaging is done manually, by invoking build tools by hand. Having a tool to automate many of the steps required to build Clojure packages would go a long way in making them more uniform. His work (although still very much a WIP) can be found here: https://salsa.debian.org/rlb/dh-clojure/ ehashman Elana: lavamind It was J r me's first time working on Clojure packages, and things went great! During the sprint, he: allentiak Leandro joined us on Saturday, since he couldn't get off work on Friday. He mostly continued working on replacing our in-house scripts for /usr/bin/clojure by upstream's, a task he had already started during GSoC 2021. Sadly, none of us were familiar with Debian's mechanism for alternatives. If you (yes you, dear reader) are familiar with it, I'm sure he would warmly welcome feedback on his development branch. pollo As for me, I: Overall, it was quite a productive sprint! Thanks to Debian for sponsoring our food during the sprint. It was nice to be able to concentrate on fixing things instead of making food :) Here's a bonus picture of the nice sushi platter I ended up getting for dinner on Saturday night: Picture of a sushi platter

16 May 2022

Matthew Garrett: Can we fix bearer tokens?

Last month I wrote about how bearer tokens are just awful, and a week later Github announced that someone had managed to exfiltrate bearer tokens from Heroku that gave them access to, well, a lot of Github repositories. This has inevitably resulted in a whole bunch of discussion about a number of things, but people seem to be largely ignoring the fundamental issue that maybe we just shouldn't have magical blobs that grant you access to basically everything even if you've copied them from a legitimate holder to Honest John's Totally Legitimate API Consumer.

To make it clearer what the problem is here, let's use an analogy. You have a safety deposit box. To gain access to it, you simply need to be able to open it with a key you were given. Anyone who turns up with the key can open the box and do whatever they want with the contents. Unfortunately, the key is extremely easy to copy - anyone who is able to get hold of your keyring for a moment is in a position to duplicate it, and then they have access to the box. Wouldn't it be better if something could be done to ensure that whoever showed up with a working key was someone who was actually authorised to have that key?

To achieve that we need some way to verify the identity of the person holding the key. In the physical world we have a range of ways to achieve this, from simply checking whether someone has a piece of ID that associates them with the safety deposit box all the way up to invasive biometric measurements that supposedly verify that they're definitely the same person. But computers don't have passports or fingerprints, so we need another way to identify them.

When you open a browser and try to connect to your bank, the bank's website provides a TLS certificate that lets your browser know that you're talking to your bank instead of someone pretending to be your bank. The spec allows this to be a bi-directional transaction - you can also prove your identity to the remote website. This is referred to as "mutual TLS", or mTLS, and a successful mTLS transaction ends up with both ends knowing who they're talking to, as long as they have a reason to trust the certificate they were presented with.

That's actually a pretty big constraint! We have a reasonable model for the server - it's something that's issued by a trusted third party and it's tied to the DNS name for the server in question. Clients don't tend to have stable DNS identity, and that makes the entire thing sort of awkward. But, thankfully, maybe we don't need to? We don't need the client to be able to prove its identity to arbitrary third party sites here - we just need the client to be able to prove it's a legitimate holder of whichever bearer token it's presenting to that site. And that's a much easier problem.

Here's the simple solution - clients generate a TLS cert. This can be self-signed, because all we want to do here is be able to verify whether the machine talking to us is the same one that had a token issued to it. The client contacts a service that's going to give it a bearer token. The service requests mTLS auth without being picky about the certificate that's presented. The service embeds a hash of that certificate in the token before handing it back to the client. Whenever the client presents that token to any other service, the service ensures that the mTLS cert the client presented matches the hash in the bearer token. Copy the token without copying the mTLS certificate and the token gets rejected. Hurrah hurrah hats for everyone.

Well except for the obvious problem that if you're in a position to exfiltrate the bearer tokens you can probably just steal the client certificates and keys as well, and now you can pretend to be the original client and this is not adding much additional security. Fortunately pretty much everything we care about has the ability to store the private half of an asymmetric key in hardware (TPMs on Linux and Windows systems, the Secure Enclave on Macs and iPhones, either a piece of magical hardware or Trustzone on Android) in a way that avoids anyone being able to just steal the key.

How do we know that the key is actually in hardware? Here's the fun bit - it doesn't matter. If you're issuing a bearer token to a system then you're already asserting that the system is trusted. If the system is lying to you about whether or not the key it's presenting is hardware-backed then you've already lost. If it lied and the system is later compromised then sure all your apes get stolen, but maybe don't run systems that lie and avoid that situation as a result?

Anyway. This is covered in RFC 8705 so why aren't we all doing this already? From the client side, the largest generic issue is that TPMs are astonishingly slow in comparison to doing a TLS handshake on the CPU. RSA signing operations on TPMs can take around half a second, which doesn't sound too bad, except your browser is probably establishing multiple TLS connections to subdomains on the site it's connecting to and performance is going to tank. Fixing this involves doing whatever's necessary to convince the browser to pipe everything over a single TLS connection, and that's just not really where the web is right at the moment. Using EC keys instead helps a lot (~0.1 seconds per signature on modern TPMs), but it's still going to be a bottleneck.

The other problem, of course, is that ecosystem support for hardware-backed certificates is just awful. Windows lets you stick them into the standard platform certificate store, but the docs for this are hidden in a random PDF in a Github repo. Macs require you to do some weird bridging between the Secure Enclave API and the keychain API. Linux? Well, the standard answer is to do PKCS#11, and I have literally never met anybody who likes PKCS#11 and I have spent a bunch of time in standards meetings with the sort of people you might expect to like PKCS#11 and even they don't like it. It turns out that loading a bunch of random C bullshit that has strong feelings about function pointers into your security critical process is not necessarily something that is going to improve your quality of life, so instead you should use something like this and just have enough C to bridge to a language that isn't secretly plotting to kill your pets the moment you turn your back.

And, uh, obviously none of this matters at all unless people actually support it. Github has no support at all for validating the identity of whoever holds a bearer token. Most issuers of bearer tokens have no support for embedding holder identity into the token. This is not good! As of last week, all three of the big cloud providers support virtualised TPMs in their VMs - we should be running CI on systems that can do that, and tying any issued tokens to the VMs that are supposed to be making use of them.

So sure this isn't trivial. But it's also not impossible, and making this stuff work would improve the security of, well, everything. We literally have the technology to prevent attacks like Github suffered. What do we have to do to get people to actually start working on implementing that?

comment count unavailable comments

15 May 2022

Dirk Eddelbuettel: RcppArmadillo 0.11.1.1.0 on CRAN: Updates

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has syntax deliberately close to Matlab and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 978 other packages on CRAN, downloaded over 24 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 469 times according to Google Scholar. This release brings a first new upstream fix in the new release series 11.*. In particular, treatment of ill-conditioned matrices is further strengthened. We once again tested this very rigorously via three different RC releases each of which got a full reverse-dependencies run (for which results are always logged here). A minor issue with old g++ compilers was found once 11.1.0 was tagged to this upstream release is now 11.1.1. Also fixed is an OpenMP setup issue where Justin Silverman noticed that we did not propagate the -fopenmp setting correctly. The full set of changes (since the last CRAN release 0.11.0.0.0) follows.

Changes in RcppArmadillo version 0.11.1.1.0 (2022-05-15)
  • Upgraded to Armadillo release 11.1.1 (Angry Kitchen Appliance)
    • added inv_opts::no_ugly option to inv() and inv_sympd() to disallow inverses of poorly conditioned matrices
    • more efficient handling of rank-deficient matrices via inv_opts::allow_approx option in inv() and inv_sympd()
    • better detection of rank deficient matrices by solve()
    • faster handling of symmetric and diagonal matrices by cond()
  • The configure script again propagates the'found' case again, thanks to Justin Silverman for the heads-up and suggested fix (Dirk and Justin in #376 and #377 fixing #375).

Changes in RcppArmadillo version 0.11.0.1.0 (2022-04-14)
  • Upgraded to Armadillo release 11.0.1 (Creme Brulee)
    • fix miscompilation of inv() and inv_sympd() functions when using inv_opts::allow_approx and inv_opts::tiny options

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

13 May 2022

Antoine Beaupr : BTRFS notes

I'm not a fan of BTRFS. This page serves as a reminder of why, but also a cheat sheet to figure out basic tasks in a BTRFS environment because those are not obvious to me, even after repeatedly having to deal with them. Content warning: there might be mentions of ZFS.

Stability concerns I'm worried about BTRFS stability, which has been historically ... changing. RAID-5 and RAID-6 are still marked unstable, for example. It's kind of a lucky guess whether your current kernel will behave properly with your planned workload. For example, in Linux 4.9, RAID-1 and RAID-10 were marked as "mostly OK" with a note that says:
Needs to be able to create two copies always. Can get stuck in irreversible read-only mode if only one copy can be made.
Even as of now, RAID-1 and RAID-10 has this note:
The simple redundancy RAID levels utilize different mirrors in a way that does not achieve the maximum performance. The logic can be improved so the reads will spread over the mirrors evenly or based on device congestion.
Granted, that's not a stability concern anymore, just performance. A reviewer of a draft of this article actually claimed that BTRFS only reads from one of the drives, which hopefully is inaccurate, but goes to show how confusing all this is. There are other warnings in the Debian wiki that are quite scary. Even the legendary Arch wiki has a warning on top of their BTRFS page, still. Even if those issues are now fixed, it can be hard to tell when they were fixed. There is a changelog by feature but it explicitly warns that it doesn't know "which kernel version it is considered mature enough for production use", so it's also useless for this. It would have been much better if BTRFS was released into the world only when those bugs were being completely fixed. Or that, at least, features were announced when they were stable, not just "we merged to mainline, good luck". Even now, we get mixed messages even in the official BTRFS documentation which says "The Btrfs code base is stable" (main page) while at the same time clearly stating unstable parts in the status page (currently RAID56). There are much harsher BTRFS critics than me out there so I will stop here, but let's just say that I feel a little uncomfortable trusting server data with full RAID arrays to BTRFS. But surely, for a workstation, things should just work smoothly... Right? Well, let's see the snags I hit.

My BTRFS test setup Before I go any further, I should probably clarify how I am testing BTRFS in the first place. The reason I tried BTRFS is that I was ... let's just say "strongly encouraged" by the LWN editors to install Fedora for the terminal emulators series. That, in turn, meant the setup was done with BTRFS, because that was somewhat the default in Fedora 27 (or did I want to experiment? I don't remember, it's been too long already). So Fedora was setup on my 1TB HDD and, with encryption, the partition table looks like this:
NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
 sda1                   8:1    0   200M  0 part  /boot/efi
 sda2                   8:2    0     1G  0 part  /boot
 sda3                   8:3    0   7,8G  0 part  
   fedora_swap        253:5    0   7.8G  0 crypt [SWAP]
 sda4                   8:4    0 922,5G  0 part  
   fedora_crypt       253:4    0 922,5G  0 crypt /
(This might not entirely be accurate: I rebuilt this from the Debian side of things.) This is pretty straightforward, except for the swap partition: normally, I just treat swap like any other logical volume and create it in a logical volume. This is now just speculation, but I bet it was setup this way because "swap" support was only added in BTRFS 5.0. I fully expect BTRFS experts to yell at me now because this is an old setup and BTRFS is so much better now, but that's exactly the point here. That setup is not that old (2018? old? really?), and migrating to a new partition scheme isn't exactly practical right now. But let's move on to more practical considerations.

No builtin encryption BTRFS aims at replacing the entire mdadm, LVM, and ext4 stack with a single entity, and adding new features like deduplication, checksums and so on. Yet there is one feature it is critically missing: encryption. See, my typical stack is actually mdadm, LUKS, and then LVM and ext4. This is convenient because I have only a single volume to decrypt. If I were to use BTRFS on servers, I'd need to have one LUKS volume per-disk. For a simple RAID-1 array, that's not too bad: one extra key. But for large RAID-10 arrays, this gets really unwieldy. The obvious BTRFS alternative, ZFS, supports encryption out of the box and mixes it above the disks so you only have one passphrase to enter. The main downside of ZFS encryption is that it happens above the "pool" level so you can typically see filesystem names (and possibly snapshots, depending on how it is built), which is not the case with a more traditional stack.

Subvolumes, filesystems, and devices I find BTRFS's architecture to be utterly confusing. In the traditional LVM stack (which is itself kind of confusing if you're new to that stuff), you have those layers:
  • disks: let's say /dev/nvme0n1 and nvme1n1
  • RAID arrays with mdadm: let's say the above disks are joined in a RAID-1 array in /dev/md1
  • volume groups or VG with LVM: the above RAID device (technically a "physical volume" or PV) is assigned into a VG, let's call it vg_tbbuild05 (multiple PVs can be added to a single VG which is why there is that abstraction)
  • LVM logical volumes: out of that volume group actually "virtual partitions" or "logical volumes" are created, that is where your filesystem lives
  • filesystem, typically with ext4: that's your normal filesystem, which treats the logical volume as just another block device
A typical server setup would look like this:
NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1                   259:0    0   1.7T  0 disk  
 nvme0n1p1               259:1    0     8M  0 part  
 nvme0n1p2               259:2    0   512M  0 part  
   md0                     9:0    0   511M  0 raid1 /boot
 nvme0n1p3               259:3    0   1.7T  0 part  
   md1                     9:1    0   1.7T  0 raid1 
     crypt_dev_md1       253:0    0   1.7T  0 crypt 
       vg_tbbuild05-root 253:1    0    30G  0 lvm   /
       vg_tbbuild05-swap 253:2    0 125.7G  0 lvm   [SWAP]
       vg_tbbuild05-srv  253:3    0   1.5T  0 lvm   /srv
 nvme0n1p4               259:4    0     1M  0 part
I stripped the other nvme1n1 disk because it's basically the same. Now, if we look at my BTRFS-enabled workstation, which doesn't even have RAID, we have the following:
  • disk: /dev/sda with, again, /dev/sda4 being where BTRFS lives
  • filesystem: fedora_crypt, which is, confusingly, kind of like a volume group. it's where everything lives. i think.
  • subvolumes: home, root, /, etc. those are actually the things that get mounted. you'd think you'd mount a filesystem, but no, you mount a subvolume. that is backwards.
It looks something like this to lsblk:
NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
 sda1                   8:1    0   200M  0 part  /boot/efi
 sda2                   8:2    0     1G  0 part  /boot
 sda3                   8:3    0   7,8G  0 part  [SWAP]
 sda4                   8:4    0 922,5G  0 part  
   fedora_crypt       253:4    0 922,5G  0 crypt /srv
Notice how we don't see all the BTRFS volumes here? Maybe it's because I'm mounting this from the Debian side, but lsblk definitely gets confused here. I frankly don't quite understand what's going on, even after repeatedly looking around the rather dismal documentation. But that's what I gather from the following commands:
root@curie:/home/anarcat# btrfs filesystem show
Label: 'fedora'  uuid: 5abb9def-c725-44ef-a45e-d72657803f37
    Total devices 1 FS bytes used 883.29GiB
    devid    1 size 922.47GiB used 916.47GiB path /dev/mapper/fedora_crypt
root@curie:/home/anarcat# btrfs subvolume list /srv
ID 257 gen 108092 top level 5 path home
ID 258 gen 108094 top level 5 path root
ID 263 gen 108020 top level 258 path root/var/lib/machines
I only got to that point through trial and error. Notice how I use an existing mountpoint to list the related subvolumes. If I try to use the filesystem path, the one that's listed in filesystem show, I fail:
root@curie:/home/anarcat# btrfs subvolume list /dev/mapper/fedora_crypt 
ERROR: not a btrfs filesystem: /dev/mapper/fedora_crypt
ERROR: can't access '/dev/mapper/fedora_crypt'
Maybe I just need to use the label? Nope:
root@curie:/home/anarcat# btrfs subvolume list fedora
ERROR: cannot access 'fedora': No such file or directory
ERROR: can't access 'fedora'
This is really confusing. I don't even know if I understand this right, and I've been staring at this all afternoon. Hopefully, the lazyweb will correct me eventually. (As an aside, why are they called "subvolumes"? If something is a "sub" of "something else", that "something else" must exist right? But no, BTRFS doesn't have "volumes", it only has "subvolumes". Go figure. Presumably the filesystem still holds "files" though, at least empirically it doesn't seem like it lost anything so far. In any case, at least I can refer to this section in the future, the next time I fumble around the btrfs commandline, as I surely will. I will possibly even update this section as I get better at it, or based on my reader's judicious feedback.

Mounting BTRFS subvolumes So how did I even get to that point? I have this in my /etc/fstab, on the Debian side of things:
UUID=5abb9def-c725-44ef-a45e-d72657803f37   /srv    btrfs  defaults 0   2
This thankfully ignores all the subvolume nonsense because it relies on the UUID. mount tells me that's actually the "root" (? /?) subvolume:
root@curie:/home/anarcat# mount   grep /srv
/dev/mapper/fedora_crypt on /srv type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)
Let's see if I can mount the other volumes I have on there. Remember that subvolume list showed I had home, root, and var/lib/machines. Let's try root:
mount -o subvol=root /dev/mapper/fedora_crypt /mnt
Interestingly, root is not the same as /, it's a different subvolume! It seems to be the Fedora root (/, really) filesystem. No idea what is happening here. I also have a home subvolume, let's mount it too, for good measure:
mount -o subvol=home /dev/mapper/fedora_crypt /mnt/home
Note that lsblk doesn't notice those two new mountpoints, and that's normal: it only lists block devices and subvolumes (rather inconveniently, I'd say) do not show up as devices:
root@curie:/home/anarcat# lsblk 
NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
 sda1                   8:1    0   200M  0 part  
 sda2                   8:2    0     1G  0 part  
 sda3                   8:3    0   7,8G  0 part  
 sda4                   8:4    0 922,5G  0 part  
   fedora_crypt       253:4    0 922,5G  0 crypt /srv
This is really, really confusing. Maybe I did something wrong in the setup. Maybe it's because I'm mounting it from outside Fedora. Either way, it just doesn't feel right.

No disk usage per volume If you want to see what's taking up space in one of those subvolumes, tough luck:
root@curie:/home/anarcat# df -h  /srv /mnt /mnt/home
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/fedora_crypt  923G  886G   31G  97% /srv
/dev/mapper/fedora_crypt  923G  886G   31G  97% /mnt
/dev/mapper/fedora_crypt  923G  886G   31G  97% /mnt/home
(Notice, in passing, that it looks like the same filesystem is mounted in different places. In that sense, you'd expect /srv and /mnt (and /mnt/home?!) to be exactly the same, but no: they are entirely different directory structures, which I will not call "filesystems" here because everyone's head will explode in sparks of confusion.) Yes, disk space is shared (that's the Size and Avail columns, makes sense). But nope, no cookie for you: they all have the same Used columns, so you need to actually walk the entire filesystem to figure out what each disk takes. (For future reference, that's basically:
root@curie:/home/anarcat# time du -schx /mnt/home /mnt /srv
124M    /mnt/home
7.5G    /mnt
875G    /srv
883G    total
real    2m49.080s
user    0m3.664s
sys 0m19.013s
And yes, that was painfully slow.) ZFS actually has some oddities in that regard, but at least it tells me how much disk each volume (and snapshot) takes:
root@tubman:~# time df -t zfs -h
Filesystem         Size  Used Avail Use% Mounted on
rpool/ROOT/debian  3.5T  1.4G  3.5T   1% /
rpool/var/tmp      3.5T  384K  3.5T   1% /var/tmp
rpool/var/spool    3.5T  256K  3.5T   1% /var/spool
rpool/var/log      3.5T  2.0G  3.5T   1% /var/log
rpool/home/root    3.5T  2.2G  3.5T   1% /root
rpool/home         3.5T  256K  3.5T   1% /home
rpool/srv          3.5T   80G  3.5T   3% /srv
rpool/var/cache    3.5T  114M  3.5T   1% /var/cache
bpool/BOOT/debian  571M   90M  481M  16% /boot
real    0m0.003s
user    0m0.002s
sys 0m0.000s
That's 56360 times faster, by the way. But yes, that's not fair: those in the know will know there's a different command to do what df does with BTRFS filesystems, the btrfs filesystem usage command:
root@curie:/home/anarcat# time btrfs filesystem usage /srv
Overall:
    Device size:         922.47GiB
    Device allocated:        916.47GiB
    Device unallocated:        6.00GiB
    Device missing:          0.00B
    Used:            884.97GiB
    Free (estimated):         30.84GiB  (min: 27.84GiB)
    Free (statfs, df):        30.84GiB
    Data ratio:               1.00
    Metadata ratio:           2.00
    Global reserve:      512.00MiB  (used: 0.00B)
    Multiple profiles:              no
Data,single: Size:906.45GiB, Used:881.61GiB (97.26%)
   /dev/mapper/fedora_crypt  906.45GiB
Metadata,DUP: Size:5.00GiB, Used:1.68GiB (33.58%)
   /dev/mapper/fedora_crypt   10.00GiB
System,DUP: Size:8.00MiB, Used:128.00KiB (1.56%)
   /dev/mapper/fedora_crypt   16.00MiB
Unallocated:
   /dev/mapper/fedora_crypt    6.00GiB
real    0m0,004s
user    0m0,000s
sys 0m0,004s
Almost as fast as ZFS's df! Good job. But wait. That doesn't actually tell me usage per subvolume. Notice it's filesystem usage, not subvolume usage, which unhelpfully refuses to exist. That command only shows that one "filesystem" internal statistics that are pretty opaque.. You can also appreciate that it's wasting 6GB of "unallocated" disk space there: I probably did something Very Wrong and should be punished by Hacker News. I also wonder why it has 1.68GB of "metadata" used... At this point, I just really want to throw that thing out of the window and restart from scratch. I don't really feel like learning the BTRFS internals, as they seem oblique and completely bizarre to me. It feels a little like the state of PHP now: it's actually pretty solid, but built upon so many layers of cruft that I still feel it corrupts my brain every time I have to deal with it (needle or haystack first? anyone?)...

Conclusion I find BTRFS utterly confusing and I'm worried about its reliability. I think a lot of work is needed on usability and coherence before I even consider running this anywhere else than a lab, and that's really too bad, because there are really nice features in BTRFS that would greatly help my workflow. (I want to use filesystem snapshots as high-performance, high frequency backups.) So now I'm experimenting with OpenZFS. It's so much simpler, just works, and it's rock solid. After this 8 minute read, I had a good understanding of how ZFS worked. Here's the 30 seconds overview:
  • vdev: a RAID array
  • vpool: a volume group of vdevs
  • datasets: normal filesystems (or block device, if you want to use another filesystem on top of ZFS)
There's also other special volumes like caches and logs that you can (really easily, compared to LVM caching) use to tweak your setup. You might also want to look at recordsize or ashift to tweak the filesystem to fit better your workload (or deal with drives lying about their sector size, I'm looking at you Samsung), but that's it. Running ZFS on Linux currently involves building kernel modules from scratch on every host, which I think is pretty bad. But I was able to setup a ZFS-only server using this excellent documentation without too much problem. I'm hoping some day the copyright issues are resolved and we can at least ship binary packages, but the politics (e.g. convincing Debian that is the right thing to do) and the logistics (e.g. DKMS auto-builders? is that even a thing? how about signed DKMS packages? fun-fun-fun!) seem really impractical. Who knows, maybe hell will freeze over (again) and Oracle will fix the CDDL. I personally think that we should just completely ignore this problem (which wasn't even supposed to be a problem) and ship binary packages directly, but I'm a pragmatic and do not always fit well with the free software fundamentalists. All of this to say that, short term, we don't have a reliable, advanced filesystem/logical disk manager in Linux. And that's really too bad.

Arturo Borrero Gonz lez: Toolforge GridEngine Debian 10 Buster migration

Toolforge logo, a circle with an anvil in the middle This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez. In accordance with our operating system upgrade policy, we should migrate our servers to Debian Buster. As discussed in the previous post, one of the most important and successful services provided by the Wikimedia Cloud Services team at the Wikimedia Foundation is Toolforge. Toolforge is a platform that allows users and developers to run and use a variety of applications with the ultimate goal of helping the Wikimedia mission from the technical side. As you may know already, all Wikimedia Foundation servers are powered by Debian, and this includes Toolforge and Cloud VPS. The Debian Project mostly follows a two year cadence for releases, and Toolforge has been using Debian Stretch for some years now, which nowadays is considered old-old-stable . In accordance with our operating system upgrade policy, we should migrate our servers to Debian Buster. Toolforge s two different backend engines, Kubernetes and Grid Engine, are impacted by this upgrade policy. Grid Engine is notably tied to the underlying Debian release, and the execution environment offered to tools running in the grid is limited to what the Debian archive contains for a given release. This is unlike in Kubernetes, where tool developers can leverage container images and decouple the runtime environment selection from the base operating system. Since the Toolforge grid original conception, we have been doing the same operation over and over again: We ve done this type of migration several times before. The last few ones were Ubuntu Precise to Ubuntu Trusty and Ubuntu Trusty to Debian Stretch. But this time around we had some special angles to consider.

So, you are upgrading the Debian release
  • You are migrating to Debian 11 Bullseye, no?
  • No, we re migrating to Debian 10 Buster
  • Wait, but Debian 11 Bullseye exists!
  • Yes, we know! Let me explain
We re migrating the grid from Debian 9 Stretch to Debian 10 Buster, but perhaps we should be migrating from Debian 9 Stretch to Debian 11 Bullseye directly. This is a legitimate concern, and we discussed it in September 2021. A timeline showing Debian versions since 2014 Back then, our reasoning was that skipping to Debian 11 Bullseye would be more difficult for our users, especially because greater jump in version numbers for the underlying runtimes. Additionally, all the migration work started before Debian 11 Bullseye was released. Our original intention was for the migration to be completed before the release. For a couple of reasons the project was delayed, and when it was time to restart the project we decided to continue with the original idea. We had some work done to get Debian 10 Buster working correctly with the grid, and supporting Debian 11 Bullseye would require an additional effort. We didn t even check if Grid Engine could be installed in the latest Debian release. For the grid, in general, the engineering effort to do a N+1 upgrade is lower than doing a N+2 upgrade. If we had tried a N+2 upgrade directly, things would have been much slower and difficult for us, and for our users. In that sense, our conclusion was to not skip Debian 10 Buster.

We no longer want to run Grid Engine In a previous blog post we shared information about our desired future for Grid Engine in Toolforge. Our intention is to discontinue our usage of this technology.

No grid? What about my tools? Toolforge logo, a circle with an anvil in the middle Traditionally there have been two main workflows or use cases that were supported in the grid, but not in our Kubernetes backend:
  • Running jobs, long-running bots and other scheduled tasks.
  • Mixing runtime environments (for example, a nodejs app that runs some python code).
The good news is that work to handle the continuity of such use cases has already started. This takes the form of two main efforts:
  • The Toolforge buildpacks project to support arbitrary runtime environments.
  • The Toolforge Jobs Framework to support jobs, scheduled tasks, etc.
In particular, the Toolforge Jobs Framework has been available for a while in an open beta phase. We did some initial design and implementation, then deployed it in Toolforge for some users to try it and report bugs, report missing features, etc. These are complex, and feature-rich projects, and they deserve a dedicated blog post. More information on each will be shared in the future. For now, it is worth noting that both initiatives have some degree of development already. The conclusion Knowing all the moving parts, we were faced with a few hard questions when deciding how to approach the Debian 9 Stretch deprecation:
  • Should we not upgrade the grid, and focus on Kubernetes instead? Let Debian 9 Stretch be the last supported version on the grid?
  • What is the impact of these decisions on the technical community? What is best for our users?
The choices we made are already known in the community. A couple of weeks ago we announced the Debian 9 Stretch Grid Engine deprecation. In parallel to this migration, we decided to promote the new Toolforge Jobs Framework, even if it s still in beta phase. This new option should help users to future-proof their tool, and reduce maintenance effort. An early migration to Kubernetes now will avoid any more future grid problems. We truly hope that Debian 10 Buster is the last version we have for the grid, but as they say, hope is not a good strategy when it comes to engineering. What we will do is to work really hard in bringing Toolforge to the service level we want, and that means to keep developing and enabling more Kubernetes-based functionalities. Stay tuned for more upcoming blog posts with additional information about Toolforge. This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

10 May 2022

Melissa Wen: Multiple syncobjs support for V3D(V) (Part 1)

As you may already know, we at Igalia have been working on several improvements to the 3D rendering drivers of Broadcom Videocore GPU, found in Raspberry Pi 4 devices. One of our recent works focused on improving V3D(V) drivers adherence to Vulkan submission and synchronization framework. We had to cross various layers from the Linux Graphics stack to add support for multiple syncobjs to V3D(V), from the Linux/DRM kernel to the Vulkan driver. We have delivered bug fixes, a generic gate to extend job submission interfaces, and a more direct sync mapping of the Vulkan framework. These changes did not impact the performance of the tested games and brought greater precision to the synchronization mechanisms. Ultimately, support for multiple syncobjs opened the door to new features and other improvements to the V3DV submission framework.

DRM Syncobjs But, first, what are DRM sync objs?
* DRM synchronization objects (syncobj, see struct &drm_syncobj) provide a
* container for a synchronization primitive which can be used by userspace
* to explicitly synchronize GPU commands, can be shared between userspace
* processes, and can be shared between different DRM drivers.
* Their primary use-case is to implement Vulkan fences and semaphores.
[...]
* At it's core, a syncobj is simply a wrapper around a pointer to a struct
* &dma_fence which may be NULL.
And Jason Ekstrand well-summarized dma_fence features in a talk at the Linux Plumbers Conference 2021:
A struct that represents a (potentially future) event:
  • Has a boolean signaled state
  • Has a bunch of useful utility helpers/concepts, such as refcount, callback wait mechanisms, etc.
Provides two guarantees:
  • One-shot: once signaled, it will be signaled forever
  • Finite-time: once exposed, is guaranteed signal in a reasonable amount of time

What does multiple semaphores support mean for Raspberry Pi 4 GPU drivers? For our main purpose, the multiple syncobjs support means that V3DV can submit jobs with more than one wait and signal semaphore. In the kernel space, wait semaphores become explicit job dependencies to wait on before executing the job. Signal semaphores (or post dependencies), in turn, work as fences to be signaled when the job completes its execution, unlocking following jobs that depend on its completion. The multisync support development comprised of many decision-making points and steps summarized as follow:
  • added to the v3d kernel-driver capabilities to handle multiple syncobj;
  • exposed multisync capabilities to the userspace through a generic extension; and
  • reworked synchronization mechanisms of the V3DV driver to benefit from this feature
  • enabled simulator to work with multiple semaphores
  • tested on Vulkan games to verify the correctness and possible performance enhancements.
We decided to refactor parts of the V3D(V) submission design in kernel-space and userspace during this development. We improved job scheduling on V3D-kernel and the V3DV job submission design. We also delivered more accurate synchronizing mechanisms and further updates in the Broadcom Vulkan driver running on Raspberry Pi 4. Therefore, we summarize here changes in the kernel space, describing the previous state of the driver, taking decisions, side improvements, and fixes.

From single to multiple binary in/out syncobjs: Initially, V3D was very limited in the numbers of syncobjs per job submission. V3D job interfaces (CL, CSD, and TFU) only supported one syncobj (in_sync) to be added as an execution dependency and one syncobj (out_sync) to be signaled when a submission completes. Except for CL submission, which accepts two in_syncs: one for binner and another for render job, it didn t change the limited options. Meanwhile in the userspace, the V3DV driver followed alternative paths to meet Vulkan s synchronization and submission framework. It needed to handle multiple wait and signal semaphores, but the V3D kernel-driver interface only accepts one in_sync and one out_sync. In short, V3DV had to fit multiple semaphores into one when submitting every GPU job.

Generic ioctl extension The first decision was how to extend the V3D interface to accept multiple in and out syncobjs. We could extend each ioctl with two entries of syncobj arrays and two entries for their counters. We could create new ioctls with multiple in/out syncobj. But after examining other drivers solutions to extend their submission s interface, we decided to extend V3D ioctls (v3d_cl_submit_ioctl, v3d_csd_submit_ioctl, v3d_tfu_submit_ioctl) by a generic ioctl extension. I found a curious commit message when I was examining how other developers handled the issue in the past:
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Mar 22 09:23:22 2019 +0000
    drm/i915: Introduce the i915_user_extension_method
    
    An idea for extending uABI inspired by Vulkan's extension chains.
    Instead of expanding the data struct for each ioctl every time we need
    to add a new feature, define an extension chain instead. As we add
    optional interfaces to control the ioctl, we define a new extension
    struct that can be linked into the ioctl data only when required by the
    user. The key advantage being able to ignore large control structs for
    optional interfaces/extensions, while being able to process them in a
    consistent manner.
    
    In comparison to other extensible ioctls, the key difference is the
    use of a linked chain of extension structs vs an array of tagged
    pointers. For example,
    
    struct drm_amdgpu_cs_chunk  
    	__u32		chunk_id;
        __u32		length_dw;
        __u64		chunk_data;
     ;
[...]
So, inspired by amdgpu_cs_chunk and i915_user_extension, we opted to extend the V3D interface through a generic interface. After applying some suggestions from Iago Toral (Igalia) and Daniel Vetter, we reached the following struct:
struct drm_v3d_extension  
	__u64 next;
	__u32 id;
#define DRM_V3D_EXT_ID_MULTI_SYNC		0x01
	__u32 flags; /* mbz */
 ;
This generic extension has an id to identify the feature/extension we are adding to an ioctl (that maps the related struct type), a pointer to the next extension, and flags (if needed). Whenever we need to extend the V3D interface again for another specific feature, we subclass this generic extension into the specific one instead of extending ioctls indefinitely.

Multisync extension For the multiple syncobjs extension, we define a multi_sync extension struct that subclasses the generic extension struct. It has arrays of in and out syncobjs, the respective number of elements in each of them, and a wait_stage value used in CL submissions to determine which job needs to wait for syncobjs before running.
struct drm_v3d_multi_sync  
	struct drm_v3d_extension base;
	/* Array of wait and signal semaphores */
	__u64 in_syncs;
	__u64 out_syncs;
	/* Number of entries */
	__u32 in_sync_count;
	__u32 out_sync_count;
	/* set the stage (v3d_queue) to sync */
	__u32 wait_stage;
	__u32 pad; /* mbz */
 ;
And if a multisync extension is defined, the V3D driver ignores the previous interface of single in/out syncobjs. Once we had the interface to support multiple in/out syncobjs, v3d kernel-driver needed to handle it. As V3D uses the DRM scheduler for job executions, changing from single syncobj to multiples is quite straightforward. V3D copies from userspace the in syncobjs and uses drm_syncobj_find_fence()+ drm_sched_job_add_dependency() to add all in_syncs (wait semaphores) as job dependencies, i.e. syncobjs to be checked by the scheduler before running the job. On CL submissions, we have the bin and render jobs, so V3D follows the value of wait_stage to determine which job depends on those in_syncs to start its execution. When V3D defines the last job in a submission, it replaces dma_fence of out_syncs with the done_fence from this last job. It uses drm_syncobj_find() + drm_syncobj_replace_fence() to do that. Therefore, when a job completes its execution and signals done_fence, all out_syncs are signaled too.

Other improvements to v3d kernel driver This work also made possible some improvements in the original implementation. Following Iago s suggestions, we refactored the job s initialization code to allocate memory and initialize a job in one go. With this, we started to clean up resources more cohesively, clearly distinguishing cleanups in case of failure from job completion. We also fixed the resource cleanup when a job is aborted before the DRM scheduler arms it - at that point, drm_sched_job_arm() had recently been introduced to job initialization. Finally, we prepared the semaphore interface to implement timeline syncobjs in the future.

Going Up The patchset that adds multiple syncobjs support and improvements to V3D is available here and comprises four patches:
  • drm/v3d: decouple adding job dependencies steps from job init
  • drm/v3d: alloc and init job in one shot
  • drm/v3d: add generic ioctl extension
  • drm/v3d: add multiple syncobjs support
After extending the V3D kernel interface to accept multiple syncobjs, we worked on V3DV to benefit from V3D multisync capabilities. In the next post, I will describe a little of this work.

3 May 2022

Gunnar Wolf: Using a RPi as a display adapter

Almost ten months ago, I mentioned on this blog I bought an ARM laptop, which is now my main machine while away from home a Lenovo Yoga C630 13Q50. Yes, yes, I am still not as much away from home as I used to before, as this pandemic is still somewhat of a thing, but I do move more. My main activity in the outside world with my laptop is teaching. I teach twice a week, and well, having a display for my slides and for showing examples in the terminal and such is a must. However, as I said back in August, one of the hardware support issues for this machine is:
No HDMI support via the USB-C displayport. While I don t expect
to go to conferences or even classes in the next several months,
I hope this can be fixed before I do. It s a potential important
issue for me.
It has sadly not yet been solved While many things have improved since kernel 5.12 (the first I used), the Device Tree does not yet hint at where external video might sit. So, I went to the obvious: Many people carry different kinds of video adaptors I carry a slightly bulky one: A RPi3 For two months already (time flies!), I had an ugly contraption where the RPi3 connected via Ethernet and displayed a VNC client, and my laptop had a VNC server. Oh, but did I mention My laptop works so much better with Wayland than with Xorg that I switched, and am now a happy user of the Sway compositor (a drop-in replacement for the i3 window manager). It is built over WLRoots, which is a great and (relatively) simple project, but will thankfully not carry some of Gnome or KDE s ideas not even those I d rather have. So it took a bit of searching; I was very happy to find WayVNC, a VNC server for wlroot-sbased Wayland compositors. I launched a second Wayland, to be able to have my main session undisturbed and present only a window from it. Only that VNC is slow and laggy, and sometimes awkward. So I kept searching for something better. And something better is, happily, what I was finally able to do! In the laptop, I am using wf-recorder to grab an area of the screen and funnel it into a V4L2 loopback device (which allows it to be used as a camera, solving the main issue with grabbing parts of a Wayland screen):
/usr/bin/wf-recorder -g '0,32 960x540' -t --muxer=v4l2 --codec=rawvideo --pixelformat=yuv420p --file=/dev/video10
(yes, my V4L2Loopback device is set to /dev/video10). You will note I m grabbing a 960 540 rectangle, which is the top of my screen (1920x1080) minus the Waybar. I think I ll increase it to 960 720, as the projector to which I connect the Raspberry has a 4 3 output. After this is sent to /dev/video10, I tell ffmpeg to send it via RTP to the fixed address of the Raspberry:
/usr/bin/ffmpeg -i /dev/video10 -an -f rtp -sdp_file /tmp/video.sdp rtp://10.0.0.100:7000/
Yes, some uglier things happen here. You will note /tmp/video.sdp is created in the laptop itself; this file describes the stream s metadata so it can be used from the client side. I cheated and copied it over to the Raspberry, doing an ugly hardcode along the way:
user@raspi:~ $ cat video.sdp
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 10.0.0.100
t=0 0
a=tool:libavformat 58.76.100
m=video 7000 RTP/AVP 96
b=AS:200
a=rtpmap:96 MP4V-ES/90000
a=fmtp:96 profile-level-id=1
People familiar with RTP will scold me: How come I m streaming to the unicast client address? I should do it to an address in the 224.0.0.0 239.0.0.0 range. And it worked, sometimes. I switched over to 10.0.0.100 because it works, basically always Finally, upon bootup, I have configured NoDM to start a session with the user user, and dropped the following in my user s .xsession:
setterm -blank 0 -powersave off -powerdown 0
xset s off
xset -dpms
xset s noblank
mplayer -msglevel all=1 -fs /home/usuario/video.sdp
Anyway, as a result, my students are able to much better follow the pace of my presentation, and I m able to do some tricks better (particularly when it requires quick reaction times, as often happens when dealing with concurrency and such issues). Oh, and of course in case it s of interest to anybody, knowing that SD cards are all but reliable in the long run, I wrote a vmdb2 recipe to build the images. You can grab it here; it requires some local files to be present to be built some are the ones I copied over above, and the other ones are surely of no interest to you (such as my public ssh key or such :-] ) What am I still missing? (read: Can you help me with some ideas? ) Of course, this is a blog post published to brag about my stuff, but also to serve me as persistent memory in case I need to recreate this

28 April 2022

Rapha&#235;l Hertzog: Freexian s report about Debian Long Term Support, March 2022

A Debian LTS logo
Every month we review the work funded by Freexian s Debian LTS offering. Please find the report for March below. Debian project funding Learn more about the rationale behind this initiative in this article. Debian LTS contributors In March, 11 contributors were paid to work on Debian LTS, their reports are available below. If you re interested in participating in the LTS or ELTS teams, we welcome participation from the Debian community. Simply get in touch with Jeremiah or Rapha l if you are if you are interested in participating. Evolution of the situation In March we released 42 DLAs. The security tracker currently lists 81 packages with a known CVE and the dla-needed.txt file has 52 packages needing an update. We re glad to welcome a few new sponsors such as lectricit de France (Gold sponsor), Telecats BV and Soliton Systems. Thanks to our sponsors Sponsors that joined recently are in bold.

27 April 2022

Antoine Beaupr : building Debian packages under qemu with sbuild

I've been using sbuild for a while to build my Debian packages, mainly because it's what is used by the Debian autobuilders, but also because it's pretty powerful and efficient. Configuring it just right, however, can be a challenge. In my quick Debian development guide, I had a few pointers on how to configure sbuild with the normal schroot setup, but today I finished a qemu based configuration.

Why I want to use qemu mainly because it provides better isolation than a chroot. I sponsor packages sometimes and while I typically audit the source code before building, it still feels like the extra protection shouldn't hurt. I also like the idea of unifying my existing virtual machine setup with my build setup. My current VM is kind of all over the place: libvirt, vagrant, GNOME Boxes, etc?). I've been slowly converging over libvirt however, and most solutions I use right now rely on qemu under the hood, certainly not chroots... I could also have decided to go with containers like LXC, LXD, Docker (with conbuilder, whalebuilder, docker-buildpackage), systemd-nspawn (with debspawn), unshare (with schroot --chroot-mode=unshare), or whatever: I didn't feel those offer the level of isolation that is provided by qemu. The main downside of this approach is that it is (obviously) slower than native builds. But on modern hardware, that cost should be minimal.

How Basically, you need this:
sudo mkdir -p /srv/sbuild/qemu/
sudo apt install sbuild-qemu
sudo sbuild-qemu-create -o /srv/sbuild/qemu/unstable.img unstable https://deb.debian.org/debian
Then to make this used by default, add this to ~/.sbuildrc:
# run autopkgtest inside the schroot
$run_autopkgtest = 1;
# tell sbuild to use autopkgtest as a chroot
$chroot_mode = 'autopkgtest';
# tell autopkgtest to use qemu
$autopkgtest_virt_server = 'qemu';
# tell autopkgtest-virt-qemu the path to the image
# use --debug there to show what autopkgtest is doing
$autopkgtest_virt_server_options = [ '--', '/srv/sbuild/qemu/%r-%a.img' ];
# tell plain autopkgtest to use qemu, and the right image
$autopkgtest_opts = [ '--', 'qemu', '/srv/sbuild/qemu/%r-%a.img' ];
# no need to cleanup the chroot after build, we run in a completely clean VM
$purge_build_deps = 'never';
# no need for sudo
$autopkgtest_root_args = '';
Note that the above will use the default autopkgtest (1GB, one core) and qemu (128MB, one core) configuration, which might be a little low on resources. You probably want to be explicit about this, with something like this:
# extra parameters to pass to qemu
# --enable-kvm is not necessary, detected on the fly by autopkgtest
my @_qemu_options = ['--ram-size=4096', '--cpus=2'];
# tell autopkgtest-virt-qemu the path to the image
# use --debug there to show what autopkgtest is doing
$autopkgtest_virt_server_options = [ @_qemu_options, '--', '/srv/sbuild/qemu/%r-%a.img' ];
$autopkgtest_opts = [ '--', 'qemu', @qemu_options, '/srv/sbuild/qemu/%r-%a.img'];
This configuration will:
  1. create a virtual machine image in /srv/sbuild/qemu for unstable
  2. tell sbuild to use that image to create a temporary VM to build the packages
  3. tell sbuild to run autopkgtest (which should really be default)
  4. tell autopkgtest to use qemu for builds and for tests
Note that the VM created by sbuild-qemu-create have an unlocked root account with an empty password.

Other useful tasks
  • enter the VM to make test, changes will be discarded (thanks Nick Brown for the sbuild-qemu-boot tip!):
     sbuild-qemu-boot /srv/sbuild/qemu/unstable-amd64.img
    
    That program is shipped only with bookworm and later, an equivalent command is:
     qemu-system-x86_64 -snapshot -enable-kvm -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -m 2048 -nographic /srv/sbuild/qemu/unstable-amd64.img
    
    The key argument here is -snapshot.
  • enter the VM to make permanent changes, which will not be discarded:
     sudo sbuild-qemu-boot --readwrite /srv/sbuild/qemu/unstable-amd64.img
    
    Equivalent command:
     sudo qemu-system-x86_64 -enable-kvm -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -m 2048 -nographic /srv/sbuild/qemu/unstable-amd64.img
    
  • update the VM (thanks lavamind):
     sudo sbuild-qemu-update /srv/sbuild/qemu/unstable-amd64.img
    
  • build in a specific VM regardless of the suite specified in the changelog (e.g. UNRELEASED, bookworm-backports, bookworm-security, etc):
     sbuild --autopkgtest-virt-server-opts="-- qemu /var/lib/sbuild/qemu/bookworm-amd64.img"
    
    Note that you'd also need to pass --autopkgtest-opts if you want autopkgtest to run in the correct VM as well:
     sbuild --autopkgtest-opts="-- qemu /var/lib/sbuild/qemu/unstable.img" --autopkgtest-virt-server-opts="-- qemu /var/lib/sbuild/qemu/bookworm-amd64.img"
    
    You might also need parameters like --ram-size if you customized it above.
And yes, this is all quite complicated and could be streamlined a little, but that's what you get when you have years of legacy and just want to get stuff done. It seems to me autopkgtest-virt-qemu should have a magic flag starts a shell for you, but it doesn't look like that's a thing. When that program starts, it just says ok and sits there. Maybe because the authors consider the above to be simple enough (see also bug #911977 for a discussion of this problem).

Live access to a running test When autopkgtest starts a VM, it uses this funky qemu commandline:
qemu-system-x86_64 -m 4096 -smp 2 -nographic -net nic,model=virtio -net user,hostfwd=tcp:127.0.0.1:10022-:22 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -monitor unix:/tmp/autopkgtest-qemu.w1mlh54b/monitor,server,nowait -serial unix:/tmp/autopkgtest-qemu.w1mlh54b/ttyS0,server,nowait -serial unix:/tmp/autopkgtest-qemu.w1mlh54b/ttyS1,server,nowait -virtfs local,id=autopkgtest,path=/tmp/autopkgtest-qemu.w1mlh54b/shared,security_model=none,mount_tag=autopkgtest -drive index=0,file=/tmp/autopkgtest-qemu.w1mlh54b/overlay.img,cache=unsafe,if=virtio,discard=unmap,format=qcow2 -enable-kvm -cpu kvm64,+vmx,+lahf_lm
... which is a typical qemu commandline, I'm sorry to say. That gives us a VM with those settings (paths are relative to a temporary directory, /tmp/autopkgtest-qemu.w1mlh54b/ in the above example):
  • the shared/ directory is, well, shared with the VM
  • port 10022 is forward to the VM's port 22, presumably for SSH, but not SSH server is started by default
  • the ttyS1 and ttyS2 UNIX sockets are mapped to the first two serial ports (use nc -U to talk with those)
  • the monitor UNIX socket is a qemu control socket (see the QEMU monitor documentation, also nc -U)
In other words, it's possible to access the VM with:
nc -U /tmp/autopkgtest-qemu.w1mlh54b/ttyS2
The nc socket interface is ... not great, but it works well enough. And you can probably fire up an SSHd to get a better shell if you feel like it.

Nitty-gritty details no one cares about

Fixing hang in sbuild cleanup I'm having a hard time making heads or tails of this, but please bear with me. In sbuild + schroot, there's this notion that we don't really need to cleanup after ourselves inside the schroot, as the schroot will just be delted anyways. This behavior seems to be handled by the internal "Session Purged" parameter. At least in lib/Sbuild/Build.pm, we can see this:
my $is_cloned_session = (defined ($session->get('Session Purged')) &&
             $session->get('Session Purged') == 1) ? 1 : 0;
[...]
if ($is_cloned_session)  
$self->log("Not cleaning session: cloned chroot in use\n");
  else  
if ($purge_build_deps)  
    # Removing dependencies
    $resolver->uninstall_deps();
  else  
    $self->log("Not removing build depends: as requested\n");
 
 
The schroot builder defines that parameter as:
    $self->set('Session Purged', $info-> 'Session Purged' );
... which is ... a little confusing to me. $info is:
my $info = $self->get('Chroots')->get_info($schroot_session);
... so I presume that depends on whether the schroot was correctly cleaned up? I stopped digging there... ChrootUnshare.pm is way more explicit:
$self->set('Session Purged', 1);
I wonder if we should do something like this with the autopkgtest backend. I guess people might technically use it with something else than qemu, but qemu is the typical use case of the autopkgtest backend, in my experience. Or at least certainly with things that cleanup after themselves. Right? For some reason, before I added this line to my configuration:
$purge_build_deps = 'never';
... the "Cleanup" step would just completely hang. It was quite bizarre.

Disgression on the diversity of VM-like things There are a lot of different virtualization solutions one can use (e.g. Xen, KVM, Docker or Virtualbox). I have also found libguestfs to be useful to operate on virtual images in various ways. Libvirt and Vagrant are also useful wrappers on top of the above systems. There are particularly a lot of different tools which use Docker, Virtual machines or some sort of isolation stronger than chroot to build packages. Here are some of the alternatives I am aware of: Take, for example, Whalebuilder, which uses Docker to build packages instead of pbuilder or sbuild. Docker provides more isolation than a simple chroot: in whalebuilder, packages are built without network access and inside a virtualized environment. Keep in mind there are limitations to Docker's security and that pbuilder and sbuild do build under a different user which will limit the security issues with building untrusted packages. On the upside, some of things are being fixed: whalebuilder is now an official Debian package (whalebuilder) and has added the feature of passing custom arguments to dpkg-buildpackage. None of those solutions (except the autopkgtest/qemu backend) are implemented as a sbuild plugin, which would greatly reduce their complexity. I was previously using Qemu directly to run virtual machines, and had to create VMs by hand with various tools. This didn't work so well so I switched to using Vagrant as a de-facto standard to build development environment machines, but I'm returning to Qemu because it uses a similar backend as KVM and can be used to host longer-running virtual machines through libvirt. The great thing now is that autopkgtest has good support for qemu and sbuild has bridged the gap and can use it as a build backend. I originally had found those bugs in that setup, but all of them are now fixed:
  • #911977: sbuild: how do we correctly guess the VM name in autopkgtest?
  • #911979: sbuild: fails on chown in autopkgtest-qemu backend
  • #911963: autopkgtest qemu build fails with proxy_cmd: parameter not set
  • #911981: autopkgtest: qemu server warns about missing CPU features
So we have unification! It's possible to run your virtual machines and Debian builds using a single VM image backend storage, which is no small feat, in my humble opinion. See the sbuild-qemu blog post for the annoucement Now I just need to figure out how to merge Vagrant, GNOME Boxes, and libvirt together, which should be a matter of placing images in the right place... right? See also hosting.

pbuilder vs sbuild I was previously using pbuilder and switched in 2017 to sbuild. AskUbuntu.com has a good comparative between pbuilder and sbuild that shows they are pretty similar. The big advantage of sbuild is that it is the tool in use on the buildds and it's written in Perl instead of shell. My concerns about switching were POLA (I'm used to pbuilder), the fact that pbuilder runs as a separate user (works with sbuild as well now, if the _apt user is present), and setting up COW semantics in sbuild (can't just plug cowbuilder there, need to configure overlayfs or aufs, which was non-trivial in Debian jessie). Ubuntu folks, again, have more documentation there. Debian also has extensive documentation, especially about how to configure overlays. I was ultimately convinced by stapelberg's post on the topic which shows how much simpler sbuild really is...

Who Thanks lavamind for the introduction to the sbuild-qemu package.

Russ Allbery: Review: Sorceress of Darshiva

Review: Sorceress of Darshiva, by David Eddings
Series: The Malloreon #4
Publisher: Del Rey
Copyright: December 1989
Printing: November 1990
ISBN: 0-345-36935-1
Format: Mass market
Pages: 371
This is the fourth book of the Malloreon, the sequel series to the Belgariad. Eddings as usual helpfully summarizes the plot of previous books (the one thing about his writing that I wish more authors would copy), this time by having various important people around the world briefed on current events. That said, you don't want to start reading here (although you might wish you could). This is such a weird book. One could argue that not much happens in the Belgariad other than map exploration and collecting a party, but the party collection involves meddling in local problems to extract each new party member. It's a bit of a random sequence of events, but things clearly happen. The Malloreon starts off with a similar structure, including an explicit task to create a new party of adventurers to save the world, but most of the party is already gathered at the start of the series since they carry over from the previous series. There is a ton of map exploration, but it's within the territory of the bad guys from the previous series. Rather than local meddling and acquiring new characters, the story is therefore chasing Zandramas (the big bad of the series) and books of prophecy. This could still be an effective plot trigger but for another decision of Eddings that becomes obvious in Demon Lord of Karanda (the third book): the second continent of this world, unlike the Kingdoms of Hats world-building of the Belgariad, is mostly uniform. There are large cities, tons of commercial activity, and a fairly effective and well-run empire, with only a little local variation. In some ways it's a welcome break from Eddings's previous characterization by stereotype, but there isn't much in the way of local happenings for the characters to get entangled in. Even more oddly, this continental empire, which the previous series set up as the mysterious and evil adversaries of the west akin to Sauron's domain in Lord of the Rings, is not mysterious to some of the party at all. Silk, the Drasnian spy who is a major character in both series, has apparently been running a vast trading empire in Mallorea. Not only has he been there before, he has houses and factors and local employees in every major city and keeps being distracted from the plot by his cutthroat capitalist business shenanigans. It's as if the characters ventured into the heart of the evil empire and found themselves in the entirely normal city next door, complete with restaurant recommendations from one of their traveling companions. I think this is an intentional subversion of the normal fantasy plot by Eddings, and I kind of like it. We have met the evil empire, and they're more normal than most of the good guys, and both unaware and entirely uninterested in being the evil empire. But in terms of dramatic plot structure, it is such an odd choice. Combined with the heroes being so absurdly powerful that they have no reason to take most dangers seriously (and indeed do not), it makes this book remarkably anticlimactic and weirdly lacking in drama. And yet I kind of enjoyed reading it? It's oddly quiet and comfortable reading. Nothing bad happens, nor seems very likely to happen. The primary plot tension is Belgarath trying to figure out the plot of the series by tracking down prophecies in which the plot is written down with all of the dramatic tension of an irritated rare book collector. In the middle of the plot, the characters take a detour to investigate an alchemist who is apparently immortal, featuring a university on Melcena that could have come straight out of a Discworld novel, because investigating people who spontaneously discover magic is of arguably equal importance to saving the world. Given how much the plot is both on rails and clearly willing to wait for the protagonists to catch up, it's hard to argue with them. It felt like a side quest in a video game. I continue to find the way that Eddings uses prophecy in this series to be highly amusing, although there aren't nearly enough moments of the prophecy giving Garion stage direction. The basic concept of two competing prophecies that are active characters in the world attempting to create their own sequence of events is one that would support a better series than this one. It's a shame that Zandramas, the main villain, is rather uninteresting despite being female in a highly sexist society, highly competent, a different type of shapeshifter (I won't say more to avoid spoilers for earlier books), and the anchor of the other prophecy. It's good material, but Eddings uses it very poorly, on top of making the weird decision to have her talk like an extra in a Shakespeare play. This book was astonishingly pointless. I think the only significant plot advancement besides map movement is picking up a new party member (who was rather predictable), and the plot is so completely on rails that the characters are commenting about the brand of railroad ties that Eddings used. Ce'Nedra continues to be spectacularly irritating. It's not, by any stretch of the imagination, a good book, and yet for some reason I enjoyed it more than the other books of the series so far. Chalk one up for brain candy when one is in the right mood, I guess. Followed by The Seeress of Kell, the epic (?) conclusion. Rating: 6 out of 10

Russell Coker: PIN for Login

Windows 10 added a new PIN login method, which is an optional login method instead of an Internet based password through Microsoft or a Domain password through Active Directory. Here is a web page explaining some of the technology (don t watch the YouTube video) [1]. There are three issues here, whether a PIN is any good in concept, whether the specifics of how it works are any good, and whether we can copy any useful ideas for Linux. Is a PIN Any Good? A PIN in concept is a shorter password. I think that less secure methods of screen unlocking (fingerprint, face unlock, and a PIN) can be reasonably used in less hostile environments. For example if you go to the bathroom or to get a drink in a relatively secure environment like a typical home or office you don t need to enter a long password afterwards. Having a short password that works for short time periods of screen locking and a long password for longer times could be a viable option. It could also be an option to allow short passwords when the device is in a certain area (determined by GPS or Wifi connection). Android devices have in the past had options to disable passwords when at home. Is the Windows 10 PIN Any Good? The Windows 10 PIN is based on TPM security which can provide real benefits, but this is more of a failure of Windows local passwords in not using the TPM than a benefit for the PIN. When you login to a Windows 10 system you will be given a choice of PIN or the configured password (local password or AD password). As a general rule providing a user a choice of ways to login is bad for security as an attacker can use whichever option is least secure. The configuration options for Windows 10 allow either group policy in AD or the registry to determine whether PIN login is allowed but doesn t have any control over when the PIN can be used which seems like a major limitation to me. The claim that the PIN is more secure than a password would only make sense if it was a viable option to disable the local password or AD domain password and only use the PIN. That s unreasonably difficult for home users and usually impossible for people on machines with corporate management. Ideas For Linux I think it would be good to have separate options for short term and long term screen locks. This could be implemented by having a screen locking program use two different PAM configurations for unlocking after short term and long term lock periods. Having local passwords based on the TPM might be useful. But if you have the root filesystem encrypted via the TPM using systemd-cryptoenroll it probably doesn t gain you a lot. One benefit of the TPM is limiting the number of incorrect attempts at guessing the password in hardware, the default is allowing 32 wrong attempts and then one every 10 minutes. Trying to do that in software would allow 32 guesses and then a hardware reset which could average at something like 32 guesses per minute instead of 32 guesses per 320 minutes. Maybe something like fail2ban could help with this (a similar algorithm but for password authentication guesses instead of network access). Having a local login method to use when there is no Internet access and network authentication can t work could be useful. But if the local login method is easier then an attacker could disrupt Internet access to force a less secure login method. Is there a good federated authentication system for Linux? Something to provide comparable functionality to AD but with distributed operation as a possibility?

26 April 2022

Reproducible Builds: Supporter spotlight: Google Open Source Security Team (GOSST)

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about our project and the work that we do. This is the fourth instalment in a series featuring the projects, companies and individuals who support the Reproducible Builds project. If you are a supporter of the Reproducible Builds project (of whatever size) and would like to be featured here, please let get in touch with us at contact@reproducible-builds.org. We started this series by featuring the Civil Infrastructure Platform project and followed this up with a post about the Ford Foundation as well as a recent one about ARDC. However, today, we ll be talking with Meder Kydyraliev of the Google Open Source Security Team (GOSST).
Chris Lamb: Hi Meder, thanks for taking the time to talk to us today. So, for someone who has not heard of the Google Open Source Security Team (GOSST) before, could you tell us what your team is about? Meder: Of course. The Google Open Source Security Team (or GOSST ) was created in 2020 to work with the open source community at large, with the goal of making the open source software that everyone relies on more secure.
Chris: What kinds of activities is the GOSST involved in? Meder: The range of initiatives that the team is involved in recognizes the diversity of the ecosystem and unique challenges that projects face on their security journey. For example, our sponsorship of sos.dev ensures that developers are rewarded for security improvements to open source projects, whilst the long term work on improving Linux kernel security tackles specific kernel-related vulnerability classes. Many of the projects GOSST is involved with aim to make it easier for developers to improve security through automated assessment (Scorecards and Allstar) and vulnerability discovery tools (OSS-Fuzz, ClusterFuzzLite, FuzzIntrospector), in addition to contributing to infrastructure to make adopting certain best practices easier. Two great examples of best practice efforts are Sigstore for artifact signing and OSV for automated vulnerability management.
Chris: The usage of open source software has exploded in the last decade, but supply-chain hygiene and best practices has seemingly not kept up. How does GOSST see this issue and what approaches is it taking to ensure that past and future systems are resilient? Meder: Every open source ecosystem is a complex environment and that awareness informs our approaches in this space. There are, of course, no silver bullets , and long-lasting and material supply-chain improvements requires infrastructure and tools that will make lives of open source developers easier, all whilst improving the state of the wider software supply chain. As part of a broader effort, we created the Supply-chain Levels for Software Artifacts framework that has been used internally at Google to protect production workloads. This framework describes the best practices for source code and binary artifact integrity, and we are engaging with the community on its refinement and adoption. Here, package managers (such as PyPI, Maven Central, Debian, etc.) are an essential link in the software supply chain due to their near-universal adoption; users do not download and compile their own software anymore. GOSST is starting to work with package managers to explore ways to collaborate together on improving the state of the supply chain and helping package maintainers and application developers do better all with the understanding that many open source projects are developed in spare time as a hobby! Solutions like this, which are the result of collaboration between GOSST and GitHub, are very encouraging as they demonstrate a way to materially strengthen software supply chain security with readily available tools, while also improving development workflows. For GOSST, the problem of supply chain security also covers vulnerability management and solutions to make it easier for everyone to discover known vulnerabilities in open source packages in a scalable and automated way. This has been difficult in the past due to lack of consistently high-quality data in an easily-consumable format. To address this, we re working on infrastructure (OSV.dev) to make vulnerability data more easily accessible as well as a widely adopted and automation friendly data format.
Chris: How does the Reproducible Builds effort help GOSST achieve its goals? Meder: Build reproducibility has a lot of attributes that are attractive as part of generally good build hygiene . As an example, hermeticity, one of requirements to meet SLSA level 4, makes it much easier to reason about the dependencies of a piece of software. This is an enormous benefit during vulnerability or supply chain incident response. On a higher level, however, we always think about reproducibility from the viewpoint of a user and the threats that reproducibility protects them from. Here, a lot of progress has been made, of course, but a lot of work remains to make reproducibility part of everyone s software consumption practices.
Chris: So if someone wanted to know more about GOSST or follow the team s work, where might they go to look? Meder: We post regular updates on Google s Security Blog and on the Linux hardening mailing list. We also welcome community participation in the projects we work on! See any of the projects linked above or OpenSSF s GitHub projects page for a list of efforts you can contribute to directly if you want to get involved in strengthening the open source ecosystem.
Chris: Thanks for taking the time to talk to us today. Meder: No problem. :)


For more information about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

22 April 2022

Andrej Shadura: To England by train (part 1)

This post was written in August 2021. Just as I was going to publish it, I received an email from BB stating that due to a railway strike in Germany my night train would be cancelled. Since the rest of the trip has already been booked well in advance, I had to take a plane to Charleroi and a bus to Brussels to catch my Eurostar. Ultimately, I ended up publishing it in April 2022, just as I m about to leave for a fully train-powered trip to the UK once again. Before the pandemic started, I planned to use the last months of my then-expiring UK visa and go to England by train. I ve completed two train long journeys by that time already, to Brussels and to Belarus and Ukraine, but this would be something quite different, as I wanted to have multiple stops on my way, use night trains where it made sense, and to go through the Channel Tunnel. The Channel Tunnel fascinated me since my childhood; I first read about it in the Soviet Science and Life magazine ( ) when I was seven. I ve never had the chance to use it though, since to board any train going though it I d first need to get to France, Belgium or the Netherlands, making it significantly more expensive than the cheap 30 Ryanair flights to Stansted. As the coronavirus spread across the world, all of my travel plans along with plans for a sabbatical had to be cancelled. During 2020, I only managed to go on two weekend trips to Prague and Budapest, followed by a two-weeks holiday on Crete (we returned just a couple of weeks before the infection numbers rose and lockdowns started). I do realise that a lot of people couldn t even have this much because the situation in their countries was much worse we were lucky to have had at least some travel. Fast forward to August 2021, I m fully vaccinated, I once again have a UK visa for five years, and the UK finally recognises the EU vaccination passports yay! I can finally go to Devon to see my mother and sister again. By train, of course. Compared to my original plan, this journey will be different: about the same or even more expensive than I originally planned, but shorter and with fewer stops on the way. My original plan would be to take multiple trains from Bratislava to France or Belgium and complete this segment of the trip in about three days, enjoying my stay in a couple of cities on the way. Instead, I m taking a direct NightJet from Vienna to Brussels, not stopping anywhere on the way.
Map: train route from Bratislava to BrusselsTrain route from Bratislava to Brussels
Since I was booking my trip just two weeks ahead, the price of the ticket is not that I hoped for, but much higher: 109 for the ticket itself and 60 for the berth (advance bookings could be about twice as cheap). Next, to London! Eurostar is still on a very much reduced schedule, running one train only from Amsterdam through Brussels and Lille to London each day. This means, of course, higher ticket prices (I paid about 100 for the ticket) and longer waiting time in Brussels my sleeper arrives about 10 am, but the Eurostar train is scheduled to depart at 3 pm.
Map: train route from Brussels to LondonTrain route from Brussels to London
The train makes a stop in Lille, which I initially suspected to be risky as at the time when I booked my tickets, as at the time France was on the amber plus list for the UK, requiring a quarantine upon arrival. However, Eurostar announced that they will assign travellers from Lille to a different carriage to avoid other passengers having to go to quarantine, but recently France was taken off the amber plus list. The train fare system in the UK is something I don t quite understand, as sometimes split tickets are cheaper, sometimes they re expensive, sometimes prices for the same service at different times can be vastly different, off-peak tickets don t say what exactly off-peak means (very few people in the UK are asked were able to tell me when exactly off-peak hours are). Curiously, transfers between train stations using London Underground services can be included into railway tickets, but some last mile connection like Exeter to Honiton cannot (but this used to be possible). Both GWR.com and TrainLine refused to sell me a single ticket from London to Honiton through Exeter, insisting I split the ticket at Exeter St Davids or take the slower South Western train to Honiton via Salisbury and Yeovil. I ended up buying a 57 ticket from Paddington to Exeter St Davids with the first segment being the London Underground from St Pancras, and a separate 7.70 ticket to Honiton.
Map: arrival at St Pancras, the underground, departure from PaddingtonChange of trains in London
Map: arrival at Exeter St Davids, change to a South Western train, arrival at HonitonChange of trains in Exeter
Date Station Arrival Departure Train
12.8 Bratislava-Petr alka 18:15 REX 7756
Wien Hbf 19:15 19:53 NJ 50490
13.8 Bruxelles-Midi 9:55 15:06 EST 9145
London St Pancras 16:03 16:34 TfL
London Paddington 16:49 17:04 GWR 59231
Exeter St Davids 19:19 19:25 SWR 52706
Honiton 19:54
Unfortunately, due to the price of the tickets, I m taking a 15 Ryanair flight back Update after the journey Since I flew to Charleroi instead of comfortably sleeping in a night train, I had to put up with inconveniences of airports, including cumbersome connections to the nearby cities. The only reasonable way of getting from Charleroi to Brussels is an overcrowded bus which takes almost an hour to arrive. I used to take this bus when I tried to save money on my way to FOSDEM, and I must admit it s not something I missed. Boarding the Eurostar train went fine, my vaccination passport and Covid test wasn t really checked, just glanced at. The waiting room was a bit of a disappointment, with bars closed and vending machines broken. Since it was underground, I couldn t even see the trains until the very last moment when we were finally allowed on the platform. The train itself, while comfortable, disappointed me with the bistro carriage: standing only, instant coffee, poor selection of food and drinks. I m glad I bought some food at Carrefour at the Midi station! When I arrived in Exeter, I soon found out why the system refused to sell me a through ticket: 6 minutes is not enough to change trains at Exeter St Davids! Or, it might have been if I took the right footbridge but I took the one which led into a very talkative (and slow!) lift. I ended up running to the train just as it closed the doors and departed, leaving me tin Exeter for an hour until. I used this chance and walked to Exeter Central, and had a pint in a conveniently located pub around the corner. P.S. The maps in this and other posts were created using uMap; the map data come from OpenStreetMap. The train route visualisation was generated with help of the Raildar.fr OSRM instance.

19 April 2022

Bits from Debian: ITP Prizren Platinum Sponsor of DebConf22

itplogo We are very pleased to announce that ITP - Innovation and Training Park Prizren has committed to supporting DebConf22 as a Platinum sponsor. Also, ITP Prizren will host the Conference for all 15 days! The ITP Prizren intends to be a changing and boosting element in the area of ICT, agro-food and creatives industries, through the creation and management of a favourable environment and efficient services for SMEs, exploiting different kinds of innovations that can contribute to Kosovo to improve its level of development in industry and research, bringing benefits to the economy and society of the country as a whole. ITP Prizren is a focal point in the Balkan region for innovation, business and skills development, and a source of innovative and successful ideas. With this commitment as Platinum Sponsor, ITP Prizren is contributing to make possible our annual conference, and directly supporting the progress of Debian and Free Software, helping to strengthen the community that continues to collaborate on Debian projects throughout the rest of the year. Thank you very much ITP Prizren, for your support of DebConf22! Become a sponsor too! DebConf22 will take place from July 17th to 24th, 2022 at the Innovation and Training Park (ITP) in Prizren, Kosovo, and will be preceded by DebCamp, from July 10th to 16th. And DebConf22 is still accepting sponsors! Interested companies and organizations may contact the DebConf team through sponsors@debconf.org, and visit the DebConf22 website at https://debconf22.debconf.org/sponsors/become-a-sponsor. DebConf22 banner open registration

14 April 2022

Reproducible Builds: Supporter spotlight: Amateur Radio Digital Communications (ARDC)

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about the project and the work that we do. This is the third instalment in a series featuring the projects, companies and individuals who support the Reproducible Builds project. If you are a supporter of the Reproducible Builds project (of whatever size) and would like to be featured here, please let get in touch with us at contact@reproducible-builds.org. We started this series by featuring the Civil Infrastructure Platform project and followed this up with a post about the Ford Foundation. Today, however, we ll be talking with Dan Romanchik, Communications Manager at Amateur Radio Digital Communications (ARDC).
Chris Lamb: Hey Dan, it s nice to meet you! So, for someone who has not heard of Amateur Radio Digital Communications (ARDC) before, could you tell us what your foundation is about? Dan: Sure! ARDC s mission is to support, promote, and enhance experimentation, education, development, open access, and innovation in amateur radio, digital communication, and information and communication science and technology. We fulfill that mission in two ways:
  1. We administer an allocation of IP addresses that we call 44Net. These IP addresses (in the 44.0.0.0/8 IP range) can only be used for amateur radio applications and experimentation.
  2. We make grants to organizations whose work aligns with our mission. This includes amateur radio clubs as well as other amateur radio-related organizations and activities. Additionally, we support scholarship programs for people who either have an amateur radio license or are pursuing careers in technology, STEM education and open-source software development projects that fit our mission, such as Reproducible Builds.

Chris: How might you relate the importance of amateur radio and similar technologies to someone who is non-technical? Dan: Amateur radio is important in a number of ways. First of all, amateur radio is a public service. In fact, the legal name for amateur radio is the Amateur Radio Service, and one of the primary reasons that amateur radio exists is to provide emergency and public service communications. All over the world, amateur radio operators are prepared to step up and provide emergency communications when disaster strikes or to provide communications for events such as marathons or bicycle tours. Second, amateur radio is important because it helps advance the state of the art. By experimenting with different circuits and communications techniques, amateurs have made significant contributions to communications science and technology. Third, amateur radio plays a big part in technical education. It enables students to experiment with wireless technologies and electronics in ways that aren t possible without a license. Amateur radio has historically been a gateway for young people interested in pursuing a career in engineering or science, such as network or electrical engineering. Fourth and this point is a little less obvious than the first three amateur radio is a way to enhance international goodwill and community. Radio knows no boundaries, of course, and amateurs are therefore ambassadors for their country, reaching out to all around the world. Beyond amateur radio, ARDC also supports and promotes research and innovation in the broader field of digital communication and information and communication science and technology. Information and communication technology plays a big part in our lives, be it for business, education, or personal communications. For example, think of the impact that cell phones have had on our culture. The challenge is that much of this work is proprietary and owned by large corporations. By focusing on open source work in this area, we help open the door to innovation outside of the corporate landscape, which is important to overall technological resiliency.
Chris: Could you briefly outline the history of ARDC? Dan: Nearly forty years ago, a group of visionary hams saw the future possibilities of what was to become the internet and requested an address allocation from the Internet Assigned Numbers Authority (IANA). That allocation included more than sixteen million IPv4 addresses, 44.0.0.0 through 44.255.255.255. These addresses have been used exclusively for amateur radio applications and experimentation with digital communications techniques ever since. In 2011, the informal group of hams administering these addresses incorporated as a nonprofit corporation, Amateur Radio Digital Communications (ARDC). ARDC is recognized by IANA, ARIN and the other Internet Registries as the sole owner of these addresses, which are also known as AMPRNet or 44Net. Over the years, ARDC has assigned addresses to thousands of hams on a long-term loan (essentially acting as a zero-cost lease), allowing them to experiment with digital communications technology. Using these IP addresses, hams have carried out some very interesting and worthwhile research projects and developed practical applications, including TCP/IP connectivity via radio links, digital voice, telemetry and repeater linking. Even so, the amateur radio community never used much more than half the available addresses, and today, less than one third of the address space is assigned and in use. This is one of the reasons that ARDC, in 2019, decided to sell one quarter of the address space (or approximately 4 million IP addresses) and establish an endowment with the proceeds. This endowment now funds ARDC s a suite of grants, including scholarships, research projects, and of course amateur radio projects. Initially, ARDC was restricted to awarding grants to organizations in the United States, but is now able to provide funds to organizations around the world.
Chris: How does the Reproducible Builds effort help ARDC achieve its goals? Dan: Our aspirational goals include: We think that the Reproducible Builds efforts in helping to ensure the safety and security of open source software closely align with those goals.
Chris: Are there any specific success stories that ARDC is particularly proud of? Dan: We are really proud of our grant to the Hoopa Valley Tribe in California. With a population of nearly 2,100, their reservation is the largest in California. Like everywhere else, the COVID-19 pandemic hit the reservation hard, and the lack of broadband internet access meant that 130 children on the reservation were unable to attend school remotely. The ARDC grant allowed the tribe to address the immediate broadband needs in the Hoopa Valley, as well as encourage the use of amateur radio and other two-way communications on the reservation. The first nation was able to deploy a network that provides broadband access to approximately 90% of the residents in the valley. And, in addition to bringing remote education to those 130 children, the Hoopa now use the network for remote medical monitoring and consultation, adult education, and other applications. Other successes include our grants to:
Chris: ARDC supports a number of other existing projects and initiatives, not all of them in the open source world. How much do you feel being a part of the broader free culture movement helps you achieve your aims? Dan: In general, we find it challenging that most digital communications technology is proprietary and closed-source. It s part of our mission to fund open source alternatives. Without them, we are solely reliant, as a society, on corporate interests for our digital communication needs. It makes us vulnerable and it puts us at risk of increased surveillance. Thus, ARDC supports open source software wherever possible, and our grantees must make a commitment to share their work under an open source license or otherwise make it as freely available as possible.
Chris: Thanks so much for taking the time to talk to us today. Now, if someone wanted to know more about ARDC or to get involved, where might they go to look? To learn more about ARDC in general, please visit our website at https://www.ampr.org. To learn more about 44Net, go to https://wiki.ampr.org/wiki/Main_Page. And, finally, to learn more about our grants program, go to https://www.ampr.org/apply/

For more about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

13 April 2022

Antoine Beaupr : Tuning my wifi radios

After listening to an episode of the 2.5 admins podcast, I realized there was some sort of low-hanging fruit I could pick to better tune my WiFi at home. You see, I'm kind of a fraud in WiFi: I only started a WiFi mesh in Montreal (now defunct), I don't really know how any of that stuff works. So I was surprised to hear one of the podcast host say "it's all about airtime" and "you want to reduce the power on your access points" (APs). It seemed like sound advice: better bandwidth means less time on air, means less collisions, less latency, and less power also means less collisions. Worth a try, right?

Frequency So the first thing I looked at was WifiAnalyzer to see if I had any optimisation I could do there. Normally, I try to avoid having nearby APs on the same frequency to avoid collisions, but who knows, maybe I had messed that up. And turns out I did! Both APs were on "auto" for 5GHz, which typically means "do nothing or worse". 5GHz is really interesting, because, in theory, there are LOTS of channels to pick from, it goes up to 196!! And both my APs were on 36, what gives? So the first thing I did was to set it to channel 100, as there was that long gap in WifiAnalyzer where no other AP was. But that just broke 5GHz on the AP. The OpenWRT GUI (luci) would just say "wireless not associated" and the ESSID wouldn't show up in a scan anymore. At first, I thought this was a problem with OpenWRT or my hardware, but I could reproduce the problem with both my APs: a TP-Link Archer A7 v5 and a Turris Omnia (see also my review). As it turns out, that's because that range of the WiFi band interferes with trivial things like satellites and radar, which make the actually very useful radar maps look like useless christmas trees. So those channels require DFS to operate. DFS works by first listening on the frequency for a certain amount of time (1-2 minute, but could be as high as 10) to see if there's something else transmitting at all. So typically, that means they just don't operate at all in those bands, especially if you're near any major city which generally means you are near a weather radar that will transmit on that band. In the system logs, if you have such a problem, you might see this:
Apr  9 22:17:39 octavia hostapd: wlan0: DFS-CAC-START freq=5500 chan=100 sec_chan=1, width=0, seg0=102, seg1=0, cac_time=60s
Apr  9 22:17:39 octavia hostapd: DFS start_dfs_cac() failed, -1
... and/or this:
Sat Apr  9 18:05:03 2022 daemon.notice hostapd: Channel 100 (primary) not allowed for AP mode, flags: 0x10095b NO-IR RADAR
Sat Apr  9 18:05:03 2022 daemon.warn hostapd: wlan0: IEEE 802.11 Configured channel (100) not found from the channel list of current mode (2) IEEE 802.11a
Sat Apr  9 18:05:03 2022 daemon.warn hostapd: wlan0: IEEE 802.11 Hardware does not support configured channel
Here, it clearly says RADAR (in all caps too, which means it's really important). NO-IR is also important, I'm not sure what it means but it could be that you're not allowed to transmit in that band because of other local regulations. There might be a way to workaround those by changing the "region" in the Luci GUI, but I didn't mess with that, because I figured that other devices will have that already configured. So using a forbidden channel might make it more difficult for clients to connect (although it's possible this is enforced only on the AP side). In any case, 5GHz is promising, but in reality, you only get from channel 36 (5.170GHz) to 48 (5.250GHz), inclusively. Fast counters will notice that is exactly 80MHz, which means that if an AP is configured for that hungry, all-powerful 80MHz, it will effectively take up all 5GHz channels at once. This, in other words, is as bad as 2.4GHz, where you also have only two 40MHz channels. (Really, what did you expect: this is an unregulated frequency controlled by commercial interests...) So the first thing I did was to switch to 40MHz. This gives me two distinct channels in 5GHz at no noticeable bandwidth cost. (In fact, I couldn't find hard data on what the bandwidth ends up being on those frequencies, but I could still get 400Mbps which is fine for my use case.)

Power The next thing I did was to fiddle with power. By default, both radios were configured to transmit as much power as they needed to reach clients, which means that if a client gets farther away, it would boost its transmit power which, in turns, would mean the client would still connect to instead of failing and properly roaming to the other AP. The higher power also means more interference with neighbors and other APs, although that matters less if they are on different channels. On 5GHz, power was about 20dBm (100 mW) -- and more on the Turris! -- when I first looked, so I tried to lower it drastically to 5dBm (3mW) just for kicks. That didn't work so well, so I bumped it back up to 14 dBm (25 mW) and that seems to work well: clients hit about -80dBm when they get far enough from the AP, which gets close to the noise floor (and where the neighbor APs are), which is exactly what I want. On 2.4GHz, I lowered it down even further, to 10 dBm (10mW) since it's better at going through wells, I figured it would need less power. And anyways, I rather people use the 5GHz APs, so maybe that will act as an encouragement to switch. I was still able to connect correctly to the APs at that power as well.

Other tweaks I disabled the "Allow legacy 802.11b rates" setting in the 5GHz configuration. According to this discussion:
Checking the "Allow b rates" affects what the AP will transmit. In particular it will send most overhead packets including beacons, probe responses, and authentication / authorization as the slow, noisy, 1 Mb DSSS signal. That is bad for you and your neighbors. Do not check that box. The default really should be unchecked.
This, in particular, "will make the AP unusable to distant clients, which again is a good thing for public wifi in general". So I just unchecked that box and I feel happier now. I didn't make tests to see the effect separately however, so this is mostly just a guess.

10 April 2022

Dirk Eddelbuettel: RcppRedis 0.2.1: Maintenance

A month after the major release 0.2.0 bringing pub/sub and other goodies to our RcppRedis package, a new version 0.2.1 arrived on CRAN yesterday. RcppRedis is one of several packages connecting R to the fabulous Redis in-memory datastructure store (and much more). RcppRedis does not pretend to be feature complete, but it may do some things faster than the other interfaces, and also offers an optional coupling with MessagePack binary (de)serialization via RcppMsgPack. The package has carried production loads for several years now. This release updated the rredis suggestion by adding an Additional_repositories entry as Bryan decided to retire the rredis package. You can still install it via install.packages("rredis") by setting the addtional repo, for example repos=c("https://ghrr.github.io/drat", getOption("repos")) as documented in package and at our ghrr drat repo. The detailed changes list follows.

Changes in version 0.2.1 (2022-04-09)
  • The rredis package can be installed via the repo listed in Additional_repositories; the pubsub.R test file makes rredis optional and conditional; all demos now note that the optional rredis package is installable via the drat listed in Additional_repositories.
  • The fallback-compilation of hiredis has been forced to override compilation flags because CRAN knows better than upstream.
  • The GLOBEX pub/sub example has small updates.

Courtesy of CRANberries, there is also a diffstat report for this release. More information is on the RcppRedis page. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

8 April 2022

Reproducible Builds: Reproducible Builds in March 2022

Welcome to the March 2022 report from the Reproducible Builds project! In our monthly reports we outline the most important things that we have been up to over the past month.
The in-toto project was accepted as an incubating project within the Cloud Native Computing Foundation (CNCF). in-toto is a framework that protects the software supply chain by collecting and verifying relevant data. It does so by enabling libraries to collect information about software supply chain actions and then allowing software users and/or project managers to publish policies about software supply chain practices that can be verified before deploying or installing software. CNCF foundations hosts a number of critical components of the global technology infrastructure under the auspices of the Linux Foundation. (View full announcement.)
Herv Boutemy posted to our mailing list with an announcement that the Java Reproducible Central has hit the milestone of 500 fully reproduced builds of upstream projects . Indeed, at the time of writing, according to the nightly rebuild results, 530 releases were found to be fully reproducible, with 100% reproducible artifacts.
GitBOM is relatively new project to enable build tools trace every source file that is incorporated into build artifacts. As an experiment and/or proof-of-concept, the GitBOM developers are rebuilding Debian to generate side-channel build metadata for versions of Debian that have already been released. This only works because Debian is (partial) reproducible, so one can be sure that that, if the case where build artifacts are identical, any metadata generated during these instrumented builds applies to the binaries that were built and released in the past. More information on their approach is available in README file in the bomsh repository.
Ludovic Courtes has published an academic paper discussing how the performance requirements of high-performance computing are not (as usually assumed) at odds with reproducible builds. The received wisdom is that vendor-specific libraries and platform-specific CPU extensions have resulted in a culture of local recompilation to ensure the best performance, rendering the property of reproducibility unobtainable or even meaningless. In his paper, Ludovic explains how Guix has:
[ ] implemented what we call package multi-versioning for C/C++ software that lacks function multi-versioning and run-time dispatch [ ]. It is another way to ensure that users do not have to trade reproducibility for performance. (full PDF)

Kit Martin posted to the FOSSA blog a post titled The Three Pillars of Reproducible Builds. Inspired by the shock of infiltrated or intentionally broken NPM packages, supply chain attacks, long-unnoticed backdoors , the post goes on to outline the high-level steps that lead to a reproducible build:
It is one thing to talk about reproducible builds and how they strengthen software supply chain security, but it s quite another to effectively configure a reproducible build. Concrete steps for specific languages are a far larger topic than can be covered in a single blog post, but today we ll be talking about some guiding principles when designing reproducible builds. [ ]
The article was discussed on Hacker News.
Finally, Bernhard M. Wiedemann noticed that the GNU Helloworld project varies depending on whether it is being built during a full moon! (Reddit announcement, openSUSE bug report)

Events There will be an in-person Debian Reunion in Hamburg, Germany later this year, taking place from 23 30 May. Although this is a Debian event, there will be some folks from the broader Reproducible Builds community and, of course, everyone is welcome. Please see the event page on the Debian wiki for more information. Bernhard M. Wiedemann posted to our mailing list about a meetup for Reproducible Builds folks at the openSUSE conference in Nuremberg, Germany. It was also recently announced that DebConf22 will take place this year as an in-person conference in Prizren, Kosovo. The pre-conference meeting (or Debcamp ) will take place from 10 16 July, and the main talks, workshops, etc. will take place from 17 24 July.

Misc news Holger Levsen updated the Reproducible Builds website to improve the documentation for the SOURCE_DATE_EPOCH environment variable, both by expanding parts of the existing text [ ][ ] as well as clarifying meaning by removing text in other places [ ]. In addition, Chris Lamb added a Twitter Card to our website s metadata too [ ][ ][ ]. On our mailing list this month:

Distribution work In Debian this month:
  • Johannes Schauer Marin Rodrigues posted to the debian-devel list mentioning that he exploited the property of reproducibility within Debian to demonstrate that automatically converting a large number of packages to a new internal source version did not change the resulting packages. The proposed change could therefore be applied without causing breakage:
So now we have 364 source packages for which we have a patch and for which we can show that this patch does not change the build output. Do you agree that with those two properties, the advantages of the 3.0 (quilt) format are sufficient such that the change shall be implemented at least for those 364? [ ]
In openSUSE, Bernhard M. Wiedemann posted his usual monthly reproducible builds status report.

Tooling diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 207, 208 and 209 to Debian unstable, as well as made the following changes to the code itself:
  • Update minimum version of Black to prevent test failure on Ubuntu jammy. [ ]
  • Updated the R test fixture for the 4.2.x series of the R programming language. [ ]
Brent Spillner also worked on adding graceful handling for UNIX sockets and named pipes to diffoscope. [ ][ ][ ]. Vagrant Cascadian also updated the diffoscope package in GNU Guix. [ ][ ] reprotest is the Reproducible Build s project end-user tool to build the same source code twice in widely different environments and checking whether the binaries produced by the builds have any differences. This month, Santiago Ruano Rinc n added a new --append-build-command option [ ], which was subsequently uploaded to Debian unstable by Holger Levsen.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Replace a local copy of the dsa-check-running-kernel script with a packaged version. [ ]
    • Don t hide the status of offline hosts in the Jenkins shell monitor. [ ]
    • Detect undefined service problems in the node health check. [ ]
    • Update the sources.lst file for our mail server as its still running Debian buster. [ ]
    • Add our mail server to our node inventory so it is included in the Jenkins maintenance processes. [ ]
    • Remove the debsecan package everywhere; it got installed accidentally via the Recommends relation. [ ]
    • Document the usage of the osuosl174 host. [ ]
Regular node maintenance was also performed by Holger Levsen [ ], Vagrant Cascadian [ ][ ][ ] and Mattia Rizzolo.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Next.

Previous.