Search Results: "ecki"

31 August 2024

Andrew Cater: Debian release weekend - media team update 202408311900 UTC

We're doing fairly well: the Debian release team have been working really hard on a double point release today, including the final release for Bullseye, 11.11, as it moves to LTS.

12.7 Bookworm install media are finishing tests - it's been quite a long day so far. For 11.11 we're part way through media tests.

We've been joined by a lot of enthusiastic folk from Cape Town who've been a great help. Always nice to see old friends and new people join us on IRC - and they've just joined us for a short video call.

This has gone well: two release day media checking and bug-squashing groups on two continents is excellent.

Dear Cape Town - feel free to join us for the next time and we'll hold the video call open for longer. If we don't see any of you here in Cambridge for mini-Debconf, we'll meet up in Brest for Debconf 25.


28 July 2024

Marco d'Itri: An interesting architecture-dependent autopkgtest regression

More than two years after I last uploaded the purity-off Debian package, its autopkgtest (the Debian distribution-wide continuous integration system) started failing on arm64, and only on this architecture. The failing test is very simple: it prints a long stream of "y" or "n" characters to purity(6)'s standard input and then checks the output for the expected result. While investigating the live autopkgtest system, I figured out a few things. I did not have time to verify this, but I have noticed that the 25-year-old purity(6) program calls TIOCGWINSZ to determine the screen length, and then uses the results in the answer buffer without checking if the ioctl(2) call returned an error, which it obviously does in this case, because standard input is not a console but a pipe. So my theory is that paging is enabled because the undefined result of the ioctl has changed, and only on this architecture. Since I do not want to fix purity(6) right now, I have implemented the workaround of printing many more "y" characters as input.

17 July 2024

Kalyani Kenekar: Securing Your Website: Installing and Configuring Nginx with SSL


The Initial Encounter: I recently started working with Nginx to explore how to configure a so-called server block. It's quite different from how things are done in Apache, but there are a ton of good websites out there which explain the different steps and options quite well. I also realized quickly that I needed to be able to configure my Nginx setups so that content is delivered over HTTPS, with automatic redirection from HTTP URLs.
  • Let's install Nginx

Installing Nginx
$ sudo apt update
$ sudo apt install nginx

Checking your Web Server
  • We can now check whether the nginx service is active or inactive:
$ sudo systemctl status nginx
Output
  nginx.service - A high performance web server and a reverse proxy server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2024-02-12 09:59:20 UTC; 3h ago
       Docs: man:nginx(8)
   Main PID: 2887 (nginx)
      Tasks: 2 (limit: 1132)
     Memory: 4.2M
        CPU: 81ms
     CGroup: /system.slice/nginx.service
              2887 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
              2890 nginx: worker process
  • Now we have successfully installed Nginx and it is in a running state.
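  • As an optional sanity check (not part of the original steps), you can also fetch the default page headers locally with curl, if it is installed.
$ curl -I http://localhost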

How To Secure Nginx with Let's Encrypt on Debian 12
  • In this documentation, you will use Certbot to obtain a free SSL certificate for Nginx on Debian 12 and set up your certificate.

Step 1 Installing Certbot
$ sudo apt install certbot python3-certbot-nginx
  • Certbot is now ready to use, but in order for it to automatically configure SSL for Nginx, we need to verify some of Nginx's configuration.

Step 2 Confirming Nginx's Configuration
  • Certbot needs to be able to find the correct server block in your Nginx configuration for it to be able to automatically configure SSL. Specifically, it does this by looking for a server_name directive that matches the domain you request a certificate for. To check, open the configuration file for your domain using nano or your favorite text editor.
$ sudo vi /etc/nginx/sites-available/example.com
server {
    listen 80;
    root /var/www/html/;
    index index.html;
    server_name example.com;
    location / {
        try_files $uri $uri/ =404;
    }
    location /test.html {
        try_files $uri $uri/ =404;
        auth_basic "admin area";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
  • Fill in the above data according to your project, then save the file, quit your editor, and verify the syntax of your configuration edits.
$ sudo nginx -t
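  • If the syntax is valid, this typically prints something like the following (the exact configuration path may differ):
Output
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful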

Step 3 Obtaining an SSL Certificate
  • Certbot provides a variety of ways to obtain SSL certificates through plugins. The Nginx plugin will take care of reconfiguring Nginx and reloading the config whenever necessary. To use this plugin, type the following command line.
$ sudo certbot --nginx -d example.com
  • The configuration will be updated, and Nginx will reload to pick up the new settings. certbot will wrap up with a message telling you the process was successful and where your certificates are stored.
Output
IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/example.com/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/example.com/privkey.pem
   Your cert will expire on 2024-05-12. To obtain a new or tweaked
   version of this certificate in the future, simply run certbot again
   with the "certonly" option. To non-interactively renew *all* of
   your certificates, run "certbot renew"
 - If you like Certbot, please consider supporting our work by:
   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le
  • Your certificates are downloaded, installed, and loaded. Next, check the syntax of your configuration again.
$ sudo nginx -t
  • If you get an error, reopen the server block file and check for any typos or missing characters. Once your configuration file's syntax is correct, reload Nginx to load the new configuration.
$ sudo systemctl reload nginx
  • Try reloading your website using https:// and notice your browser's security indicator. It should indicate that the site is properly secured, usually with a lock icon.
Your website is now secured with SSL.
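  • As an optional extra step (not part of the original walkthrough), you can also confirm that automatic renewal will work; on Debian 12 the certbot package ships a systemd timer for this.
$ sudo certbot renew --dry-run
$ systemctl list-timers | grep certbot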

Gunnar Wolf: Script for weather reporting in Waybar

While I was living in Argentina, we (my family) found ourselves checking weather forecasts almost constantly; weather there can be quite unexpected, much more so than here in Mexico. So it took me a bit of tinkering to come up with a couple of simple scripts to show the weather forecast as part of my Waybar setup. I haven't cared to share them with anybody, as I believe them to be quite trivial and quite dirty. But today, Víctor was asking for some slightly-related things, so here I go. Please do remember I warned: Dirty. Forecast I am using OpenWeather's open API. I had to register to get an APPID, and it allows me up to 1,000 API calls per day, more than plenty for my uses, even if I am logged in at my desktops at three different computers (not an uncommon situation). Having that, I set up a file named /etc/get_weather, that currently reads:
# Home, Mexico City
LAT=19.3364
LONG=-99.1819
# # Home, Paraná, Argentina
# LAT=-31.7208
# LONG=-60.5317
# # PKNU, Busan, South Korea
# LAT=35.1339
# LONG=129.1055
APPID=SomeLongRandomStringIAmNotSharing
Then, I have a simple script, /usr/local/bin/get_weather, that fetches the current weather and the forecast, and stores them as /run/weather.json and /run/forecast.json:
#!/usr/bin/bash
CONF_FILE=/etc/get_weather
if [ -e "$CONF_FILE" ]; then
    . "$CONF_FILE"
else
    echo "Configuration file $CONF_FILE not found"
    exit 1
fi
if [ -z "$LAT" -o -z "$LONG" -o -z "$APPID" ]; then
    echo "Configuration file must declare latitude (LAT), longitude (LONG) "
    echo "and app ID (APPID)."
    exit 1
fi
CURRENT=/run/weather.json
FORECAST=/run/forecast.json
wget -q "https://api.openweathermap.org/data/2.5/weather?lat=$ LAT &lon=$ LONG &units=metric&appid=$ APPID " -O "$ CURRENT "
wget -q "https://api.openweathermap.org/data/2.5/forecast?lat=$ LAT &lon=$ LONG &units=metric&appid=$ APPID " -O "$ FORECAST "
This script is called by the corresponding systemd service unit, found at /etc/systemd/system/get_weather.service:
[Unit]
Description=Get the current weather
[Service]
Type=oneshot
ExecStart=/usr/local/bin/get_weather
And it is run every 15 minutes via the following systemd timer unit, /etc/systemd/system/get_weather.timer:
[Unit]
Description=Get the current weather every 15 minutes
[Timer]
OnCalendar=*:00/15:00
Unit=get_weather.service
[Install]
WantedBy=multi-user.target
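Assuming both unit files are saved under /etc/systemd/system as shown, the timer is activated the usual systemd way:
$ sudo systemctl daemon-reload
$ sudo systemctl enable --now get_weather.timer
$ systemctl list-timers get_weather.timer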
(yes, it runs even if I'm not logged in, wasting some of my free API calls but within reason) Then, I declare a "custom/weather" module in the desired position of my ~/.config/waybar/waybar.config, and define it as:
"custom/weather":  
    "exec": "while true;do /home/gwolf/bin/parse_weather.rb;sleep 10; done",
"return-type": "json",
 ,
This script basically morphs a generic weather JSON description into another set of JSON bits that display my weather in the way I prefer to have it displayed as:
#!/usr/bin/ruby
require 'json'
Sources = {:weather => '/run/weather.json',
           :forecast => '/run/forecast.json'
          }
Icons = {'01d' => ' ', # d = day
         '01n' => ' ', # n = night
         '02d' => ' ',
         '02n' => ' ',
         '03d' => ' ',
         '03n' => ' ',
         '04d' => ' ',
         '04n' => ' ',
         '09d' => ' ',
         '10n' => ' ',
         '10d' => ' ',
         '13d' => ' ',
         '50d' => ' '
        }
ret = {'text': nil, 'tooltip': nil, 'class': 'weather', 'percentage': 100}
# Current weather report: Main text of the module
begin
  weather = JSON.parse(open(Sources[:weather],'r').read)
  loc_name = weather['name']
  icon = Icons[weather['weather'][0]['icon']] || ' ' + weather['weather'][0]['icon'] + weather['weather'][0]['main']
  temp = weather['main']['temp']
  sens = weather['main']['feels_like']
  hum = weather['main']['humidity']
  wind_vel = weather['wind']['speed']
  wind_dir = weather['wind']['deg']
  portions = {}
  portions[:loc] = loc_name
  portions[:temp] = '%s  %2.2f C (%2.2f)' % [icon, temp, sens]
  portions[:hum] = '  %2d%%' % hum
  portions[:wind] = ' %2.2fm/s %d ' % [wind_vel, wind_dir]
  ret['text'] = [:loc, :temp, :hum, :wind].map {|p| portions[p] }.join(' ')
rescue => err
  ret['text'] = 'Could not process weather file (%s - %s: %s)' % [Sources[:weather], err.class, err.to_s]
end
# Weather prevision for the following hours/days
begin
  cast = []
  forecast = JSON.parse(open(Sources[:forecast], 'r').read)
  min = ''
  max = ''
  day=Time.now.strftime('%Y.%m.%d')
  by_day = {}
  forecast['list'].each_with_index do |f,i|
    by_day[day] ||= []
    time = Time.at(f['dt'])
    time_lbl = '%02d:%02d' % [time.hour, time.min]
    icon = Icons[f['weather'][0]['icon']] || ' ' + f['weather'][0]['icon'] + f['weather'][0]['main']
    by_day[day] << f['main']['temp']
    if time.hour == 0
      min = '%2.2f' % by_day[day].min
      max = '%2.2f' % by_day[day].max
      cast << '          min: <b>%s C</b> max: <b>%s C</b>' % [min, max]
      day = time.strftime('%Y.%m.%d')
      cast << '        <b>%04d.%02d.%02d</b>  ' %
              [time.year, time.month, time.day]
    end
    cast << '%s   %2.2f C    %2d%%   %s %s' % [time_lbl,
                                                f['main']['temp'],
                                                f['main']['humidity'],
                                                icon,
                                                f['weather'][0]['description']
                                               ]
  end
  cast << '          min: <b>%s</b> C max: <b>%s C</b>' % [min, max]
  ret['tooltip'] = cast.join("\n")
  
rescue => err
  ret['tooltip'] = 'Could not process forecast file (%s - %s: %s)' % [Sources[:forecast], err.class, err.to_s]
end
# Print out the result for Waybar to process
puts ret.to_json
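If everything works, the script prints a single JSON line for Waybar to consume; the values below are invented for illustration, but the keys match the ret hash above:
{"text": "Ciudad Universitaria  24.50 C (23.80)  45%  3.10m/s 120", "tooltip": "...", "class": "weather", "percentage": 100}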
The end result? Nothing too stunning, but definitely something I find useful and even nicely laid out: Screenshot. Do note that it seems OpenWeather will return the name of the closest available meteorological station with (most?) recent data; for my home, I often get Ciudad Universitaria, but sometimes Coyoacán or even San Ángel Inn.

13 July 2024

Russ Allbery: podlators v6.0.1

This is a small bug-fix release to remove use of autodie from the build system for the module. podlators is included in Perl core, and at the point when it is built during the core build, the prerequisites of the autodie module are not yet met, so the module is not available. This release reverts to explicit error checking in all the files used by the build system. Thanks to James E Keenan for the report and the analysis. You can get the latest version from CPAN or the podlators distribution page.

4 July 2024

Arturo Borrero González: Wikimedia Toolforge: migrating Kubernetes from PodSecurityPolicy to Kyverno

Le château de Valère et le Haut de Cry en juillet 2022. Christian David, CC BY-SA 4.0, via Wikimedia Commons. This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez. Summary: this article shares the experience and learnings of migrating away from Kubernetes PodSecurityPolicy into Kyverno in the Wikimedia Toolforge platform. Wikimedia Toolforge is a Platform-as-a-Service, built with Kubernetes, and maintained by the Wikimedia Cloud Services team (WMCS). It is completely free and open, and we welcome anyone to use it to build and host tools (bots, webservices, scheduled jobs, etc) in support of Wikimedia projects. We provide a set of platform-specific services, command line interfaces, and shortcuts to help in the task of setting up webservices, jobs, and stuff like building container images, or using databases. Using these interfaces makes the underlying Kubernetes system pretty much invisible to users. We also allow direct access to the Kubernetes API, and some advanced users do directly interact with it. Each account has a Kubernetes namespace where they can freely deploy their workloads. We have a number of controls in place to ensure performance, stability, and fairness of the system, including quotas, RBAC permissions, and up until recently PodSecurityPolicies (PSP). At the time of this writing, we had around 3,500 Toolforge tool accounts in the system. We adopted PSP early, in 2019, as a way to make sure Pods had the correct runtime configuration. We needed Pods to stay within the safe boundaries of a set of pre-defined parameters. Back when we adopted PSP there was already the option to use 3rd party agents, like OpenPolicyAgent Gatekeeper, but we decided not to invest in them, and went with a native, built-in mechanism instead. In 2021 it was announced that the PSP mechanism would be deprecated, and removed in Kubernetes 1.25. Even though we had been warned years in advance, we did not prioritize the migration of PSP until we were in Kubernetes 1.24, and blocked, unable to upgrade forward without taking action. The WMCS team explored different alternatives for this migration, but eventually we decided to go with Kyverno as a replacement for PSP. And so, with that decision, began the journey described in this blog post. First, we needed a source code refactor for one of the key components of our Toolforge Kubernetes: maintain-kubeusers. This custom piece of software, which we built in-house, contains the logic to fetch accounts from LDAP and do the necessary instrumentation on Kubernetes to accommodate each one: create namespace, RBAC, quota, a kubeconfig file, etc. With the refactor, we introduced a proper reconciliation loop, in a way that the software would have a notion of what needs to be done for each account, what would be missing, what to delete, upgrade, and so on. This would allow us to easily deploy new resources for each account, or iterate on their definitions. The initial version of the refactor had a number of problems, though. For one, the new version of maintain-kubeusers was doing more filesystem interaction than the previous version, resulting in a slow reconciliation loop over all the accounts. We used NFS as the underlying storage system for Toolforge, and it could be very slow, for reasons beyond the scope of this blog post. This was corrected in the next few days after the initial refactor rollout. A side note with an implementation detail: we stored a configmap on each account namespace with the state of each resource.
Storing more state on this configmap was our solution to avoid additional NFS latency. I initially estimated this refactor would take me a week to complete, but unfortunately it took me around three weeks instead. Previous to the refactor, there were several manual steps and cleanups required when updating the definition of a resource. The process is now automated, more robust, performant, efficient and clean. So in my opinion it was worth it, even if it took more time than expected. Then, we worked on the Kyverno policies themselves. Because we had a very particular PSP setting, in order to ease the transition, we tried to replicate their semantics on a 1:1 basis as much as possible. This involved things like transparent mutation of Pod resources, then validation. Additionally, we had one different PSP definition for each account, so we decided to create one different Kyverno namespaced policy resource for each account namespace (remember, we had 3.5k accounts). We created a Kyverno policy template that we would then render and inject for each account. For developing and testing all this (maintain-kubeusers and the Kyverno bits), we had a project called lima-kilo, which was a local Kubernetes setup replicating production Toolforge. This was used by each engineer on their laptop as a common development environment. We had planned the migration from PSP to Kyverno policies in stages, like this:
  1. update our internal template generators to make Pod security settings explicit
  2. introduce Kyverno policies in Audit mode
  3. see how the cluster would behave with them, and if we had any offending resources reported by the new policies, and correct them
  4. modify Kyverno policies and set them in Enforce mode
  5. drop PSP
In stage 1, we updated things like the toolforge-jobs-framework and tools-webservice. In stage 2, when we deployed the 3.5k Kyverno policy resources, our production cluster died almost immediately. Surprise. All the monitoring went red, the Kubernetes apiserver became unresponsive, and we were unable to perform any administrative actions in the Kubernetes control plane, or even the underlying virtual machines. All Toolforge users were impacted. This was a full-scale outage that required the energy of the whole WMCS team to recover from. We temporarily disabled Kyverno until we could learn what had occurred. This incident happened despite having tested before in lima-kilo and in another pre-production cluster we had, called Toolsbeta. But we had not tested that many policy resources. Clearly, this was something scale-related. After the incident, I went on and created 3.5k Kyverno policy resources on lima-kilo, and indeed I was able to reproduce the outage. We took a number of measures, corrected a few errors in our infrastructure, reached out to the Kyverno upstream developers asking for advice, and in the end made several adjustments to accommodate the setup to our needs. I have to admit, I was briefly tempted to drop Kyverno, and even stop pursuing using an external policy agent entirely, and write our own custom admission controller, out of concerns over the performance of this architecture. However, after applying all the measures listed above, the system became very stable, so we decided to move forward. The second attempt at deploying it all went through just fine. No outage this time. When we were in stage 4 we detected another bug. We had been following the Kubernetes upstream documentation for setting securityContext to the right values. In particular, we were enforcing procMount to be set to the default value, which per the docs was DefaultProcMount. However, that string is the name of the internal variable in the source code, whereas the actual default value is the string Default. This caused pods to be rightfully rejected by Kyverno while we figured out the problem. I sent a patch upstream to fix this problem. We finally had everything in place, reached stage 5, and were able to disable PSP. We unloaded the PSP controller from the Kubernetes apiserver, and deleted every individual PSP definition. Everything was very smooth in this last step of the migration. This whole PSP project, including the maintain-kubeusers refactor, the outage, and all the different migration stages, took roughly three months to complete. For me there are a number of valuable lessons to learn from this project. For one, scale is something to consider, and test, when evaluating a new architecture or software component. Not doing so can lead to service outages, or unexpectedly poor performance. This is in the first chapter of the SRE handbook, but we got a reminder the hard way. This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

3 July 2024

Sahil Dhiman: RTI to NPL Regarding Their NTP Infrastructure

I became interested in Network Time Protocol (NTP) last year after learning how fundamental this protocol is to the functioning of the global Internet. NTP helps synchronize clocks on devices over the Internet, which is essential for secure browsing, timestamping, keeping everyone in sync or just checking what time it is. Computers usually have a hardware real-time clock (RTC) but that deviates over time, so an occasional sync over NTP is required to keep the time accurate. Many network and IoT devices don't have a hardware RTC, so they rely even more on NTP. Accurate time keeping starts with reference clocks like atomic clocks, GPS etc. Multiple government standards agencies host these reference clocks, which are regarded as Stratum 0. Stratum 1 servers are known as primary servers, and directly connect to Stratum 0 clocks for time. Stratum 1 servers then distribute time to Stratum 2 and further down the hierarchy. Computers typically connect to one or more Stratum 1/2/3 servers to get their time. Someone has to host these public Stratum 1, 2, 3 NTP servers. That's what the NTP Pool, a global effort by volunteers, does. They provide NTP servers for the public to use. As of today, there are 4700+ servers in the pool which are free to use for anyone. Now let's come to the reason for writing this post. The Indian Computer Emergency Response Team (CERT-In) in April 2022 released a set of cybersecurity directions which set the alarm bells ringing. The Internet Society (and almost everyone else) wrote about it. And then there was this specific section about NTP:
All service providers, intermediaries, data centres, body corporate and Government organisations shall connect to the Network Time Protocol (NTP) Server of National Informatics Centre (NIC) or National Physical Laboratory (NPL) or with NTP servers traceable to these NTP servers, for synchronisation of all their ICT systems clocks. Entities having ICT infrastructure spanning multiple geographies may also use accurate and standard time source other than NPL and NIC, however it is to be ensured that their time source shall not deviate from NPL and NIC.
CSIR-National Physical Laboratory (NPL) is the official timekeeper for India and hosts the only public Stratum 1 clock in India, according to the NTP Pool website. So I was naturally curious to know what kind of infrastructure they're running for NTP. India has a Right to Information (RTI) Act which, like the Freedom of Information Act (FOIA) in the United States, gives citizens the right to request information from governmental entities, to which they have to respond in under 30 days. So last year, I filed two sets of RTI (one after the first reply came) inquiring about NPL's public NTP server setup. The first RTI had some generic questions:
First RTI.
This gave a vague idea about the setup, so I sat down and came up with some specific questions in the next RTI.
Second RTI.
Feel free to draw your own conclusions from it now. Bear in mind these were filed last year, so things might have changed. Do let me know if you have more information about it. Update (07/07/2024): Found an article from Medianama about Indian government time servers with information sourced through RTI.

28 June 2024

Matthew Palmer: Checking for Compromised Private Keys has Never Been Easier

As regular readers would know, since I never stop banging on about it, I run Pwnedkeys, a service which finds and collates private keys which have been disclosed or are otherwise compromised. Until now, the only way to check if a key is compromised has been to use the Pwnedkeys API, which is not necessarily trivial for everyone. Starting today, that's changing. The next phase of Pwnedkeys is to start offering more user-friendly tools for checking whether keys being used are compromised. These will typically be web-based or command-line tools intended to answer the question "is the key in this (certificate, CSR, authorized_keys file, TLS connection, email, etc) known to Pwnedkeys to have been compromised?".

Opening the Toolbox Available right now are the first web-based key checking tools in this arsenal. These tools allow you to:
  1. Check the key in a PEM-format X509 data structure (such as a CSR or certificate);
  2. Check the keys in an authorized_keys file you upload; and
  3. Check the SSH keys used by a user at any one of a number of widely-used code-hosting sites.
Further planned tools include live checking of the certificates presented in TLS connections (for HTTPS, etc), SSH host keys, command-line utilities for checking local authorized_keys files, and many other goodies.

If You Are Intrigued By My Ideas and wish to subscribe to my newsletter, now you can! I'm not going to be blogging every little update to Pwnedkeys, because that would probably get a bit tedious for readers who aren't as intrigued by compromised keys as I am. Instead, I'll be posting every little update in the Pwnedkeys newsletter. So, if you want to keep up-to-date with the latest and greatest news and information, subscribe to the newsletter.

Supporting Pwnedkeys All this work I'm doing on my own time, and I'm paying for the infrastructure from my own pocket. If you've got a few dollars to spare, I'd really appreciate it if you bought me a refreshing beverage. It helps keep the lights on here at Pwnedkeys Global HQ.

8 June 2024

Thorsten Alteholz: My Debian Activities in May 2024

FTP master This month I accepted 347 and rejected 49 packages. The overall number of packages that got accepted was 348.

Debian LTS This was my hundred-nineteenth month doing some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded or worked on several packages. I also continued to work on tiff and last but not least did a week of FD and attended the monthly LTS/ELTS meeting. Unfortunately I used lots of time to debug an issue with nghttp2. Please see my odyssey below. Debian ELTS This month was the seventieth ELTS month. During my allocated time I uploaded several packages. For some tests I installed the new nghttp2 package on my Stretch VM and started the daemon. Unfortunately I got an unexpected error from getaddrinfo() about ai_socktype not supported. The daemon was configured to listen on lo, the device was available, but the error remained. I was pretty sure that my patch was not the reason for this and indeed the unpatched version showed this error as well. I didn't want to release an untested package, so nghttp2 had to start at least! Therefore I built a minimal example to reproduce the issue. getaddrinfo() failed for hints.ai_socktype=SOCK_STREAM and a numerical IP address. Having no hints at all or localhost instead of 127.0.0.1 made the error disappear (as a remark: localhost resolves to 127.0.0.1, the IPv6 variant is ip6-localhost). I could see that in nghttp2 as well. Configuring it with localhost let the error vanish but the daemon still exited due to other reasons. After some time of debugging, I added another network interface to my VM and configured it with a dummy IPv4 address. Voila, everything worked as expected. According to Wikipedia, IPv6 was ratified as a standard in 2017 and Stretch was also released in 2017. No wonder that an IPv6-only VM had problems back then and these problems survived to the present. I also continued to work on an update for tiff in Jessie and Stretch, did a week of FD and attended the LTS/ELTS meeting. Debian Printing This month I uploaded new upstream or bugfix versions of several packages. This work is generously funded by Freexian! Debian Astro This month I uploaded a new upstream or bugfix version of a package. Debian IoT This month I uploaded new upstream or bugfix versions of several packages. Debian Mobcom Due to more and more problems with time_t, I removed osmo-iuh and all dependencies from armel, armhf and i386, sorry. If there is really anybody using this software on 32-bit architectures, don't hesitate to get in touch. It is official now, the GSoC student working on the Mobcom packages is Nathan Doris. He already finished the hardest part of the job and I could upload the latest version of libosmocore. I really enjoy working with him and look forward to a pleasant SoC :-). misc This month I uploaded new upstream or bugfix versions of several packages. Did I already mention that I love lists with topics I can work on? I print out such lists and enjoy checking off one after the other. At the end of May, Helmut told me that I am a bit lazy and gave me such a list with all my packages that have one or the other issue with /usr-move. Most of the uploads above are packages on that list and I could check off a lot :-).

Reproducible Builds: Reproducible Builds in May 2024

Welcome to the May 2024 report from the Reproducible Builds project! In these reports, we try to outline what we have been up to over the past month and highlight news items in software supply-chain security more broadly. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. A peek into build provenance for Homebrew
  2. Distribution news
  3. Mailing list news
  4. Miscellaneous news
  5. Two new academic papers
  6. diffoscope
  7. Website updates
  8. Upstream patches
  9. Reproducibility testing framework


A peek into build provenance for Homebrew Joe Sweeney and William Woodruff on the Trail of Bits blog wrote an extensive post about build provenance for Homebrew, the third-party package manager for MacOS. Their post details how each bottle (i.e. each release):
[ ] built by Homebrew will come with a cryptographically verifiable statement binding the bottle's content to the specific workflow and other build-time metadata that produced it. [ ] In effect, this injects greater transparency into the Homebrew build process, and diminishes the threat posed by a compromised or malicious insider by making it impossible to trick ordinary users into installing non-CI-built bottles.
The post also briefly touches on future work, including work on source provenance:
Homebrew's formulae already hash-pin their source artifacts, but we can go a step further and additionally assert that source artifacts are produced by the repository (or other signing identity) that's latent in their URL or otherwise embedded into the formula specification.

Distribution news In Debian this month, Johannes Schauer Marin Rodrigues (aka josch) noticed that the Debian binary package bash version 5.2.15-2+b3 was uploaded to the archive twice. Once to bookworm and once to sid but with differing content. This is a problem for reproducible builds in Debian due to its assumption that the package name, version and architecture triplet is unique. However, josch highlighted that
This example with bash is especially problematic since bash is Essential:yes, so there will now be a large portion of .buildinfo files where it is not possible to figure out with which of the two differing bash packages the sources were compiled.
In response to this, Holger Levsen performed an analysis of all .buildinfo files and found that this needs almost 1,500 binNMUs to fix the fallout from this bug. Elsewhere in Debian, Vagrant Cascadian posted about a Non-Maintainer Upload (NMU) sprint to take place during early June, and it was announced that there is now a #debian-snapshot IRC channel on OFTC to discuss the creation of a new source code archiving service to, perhaps, replace snapshot.debian.org. Lastly, 11 reviews of Debian packages were added, 15 were updated and 48 were removed this month adding to our extensive knowledge about identified issues. A number of issue types have been updated by Chris Lamb as well. [ ][ ]
Elsewhere in the world of distributions, deep within a larger announcement from Colin Percival about the release of version 14.1-BETA2, it was mentioned that the FreeBSD kernels are now built reproducibly.
In Fedora, however, the change proposal mentioned in our report for April 2024 was approved, so, per the ReproduciblePackageBuilds wiki page, the add-determinism tool is now running in new builds for Fedora 41 (rawhide). The add-determinism tool is a Rust program which, as its name suggests, adds determinism to files that are given as input by attempting to standardize metadata contained in binary or source files to ensure consistency and clamping to $SOURCE_DATE_EPOCH in all instances. This is essentially the Fedora version of Debian's strip-nondeterminism. However, strip-nondeterminism is written in Perl, and Fedora did not want to pull Perl into the buildroot for every package. The add-determinism tool eliminates many causes of non-determinism and work is ongoing to extend the scope of packages it can operate on.

Mailing list news On our mailing list this month, regular contributor kpcyrd wrote to the list with an update on their source code indexing project, whatsrc.org. The whatsrc.org project, which was launched last month in response to the XZ Utils backdoor, now contains and indexes almost 250,000 unique source code archives. In their post, kpcyrd gives an example of its intended purpose, noting that it has shown that whilst there seems to be consensus about [the] source code for zsh 5.9 in various Linux distributions, it does not align with the contents of the zsh Git repository. Holger Levsen also posted to the list with a pre-announcement of sorts for the 2024 Reproducible Builds summit. In particular:
[Whilst] the dates and location are not fixed yet, however if you don't help us with finding a suitable location soon, it is very likely that we'll meet again in Hamburg in the 2nd half of September 2024 [ ].
Lastly, Frederic-Emmanuel Picca wrote to the list asking for help understanding the non-reproducible status of the Debian silx package and received replies from both Vagrant Cascadian and Chris Lamb.

Miscellaneous news strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month strip-nondeterminism version 1.14.0-1 was uploaded to Debian unstable by Chris Lamb chiefly to incorporate a change from Alex Muntada to avoid a dependency on Sub::Override to perform monkey-patching and break circular dependencies related to debhelper [ ]. Elsewhere in our tooling, Jelle van der Waa modified reprotest because the pipes module will be removed in Python version 3.13 [ ].
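For readers unfamiliar with strip-nondeterminism, a typical invocation simply names the build artifact to normalise in place (the file path here is purely illustrative):
$ strip-nondeterminism path/to/build/libfoo.jar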
It was also noticed that a new blog post by Daniel Stenberg detailing How to verify a Curl release mentions the SOURCE_DATE_EPOCH environment variable. This is because:
The [curl] release tools document also contains another key component: the exact time stamp at which the release was done using integer second resolution. In order to generate a correct tarball clone, you need to also generate the new version using the old version's timestamp. Because the modification date of all files in the produced tarball will be set to this timestamp.
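As a generic illustration of the same idea (this is not the curl release tooling itself), clamping file timestamps to SOURCE_DATE_EPOCH when creating a tarball with GNU tar can look like this:
$ export SOURCE_DATE_EPOCH=1715299200
$ tar --sort=name --mtime="@${SOURCE_DATE_EPOCH}" --owner=0 --group=0 --numeric-owner -cf release.tar src/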

Furthermore, Fay Stegerman filed a bug against the Signal messenger app for Android to report that their reproducible builds cannot, in fact, be reproduced. However, Fay is quick to note that she has:
found zero evidence of any kind of compromise. Some differences are yet unexplained but everything I found seems to be benign. I am disappointed that Reproducible Builds have been broken for months but I have zero reason to doubt Signal s security in any way.

Lastly, it was observed that there was a concise and diagrammatic overview of supply chain threats on the SLSA website.

Two new academic papers Two new scholarly papers were published this month. Firstly, Mathieu Acher, Benoît Combemale, Georges Aaron Randrianaina and Jean-Marc Jézéquel of the University of Rennes published Embracing Deep Variability For Reproducibility & Replicability. The authors describe their approach as follows:
In this short [vision] paper we delve into the application of software engineering techniques, specifically variability management, to systematically identify and explicit points of variability that may give rise to reproducibility issues (e.g., language, libraries, compiler, virtual machine, OS, environment variables, etc.). The primary objectives are: i) gaining insights into the variability layers and their possible interactions, ii) capturing and documenting configurations for the sake of reproducibility, and iii) exploring diverse configurations to replicate, and hence validate and ensure the robustness of results. By adopting these methodologies, we aim to address the complexities associated with reproducibility and replicability in modern software systems and environments, facilitating a more comprehensive and nuanced perspective on these critical aspects.
(A PDF of this article is available.)
Secondly, Ludovic Courtès, Timothy Sample, Simon Tournier and Stefano Zacchiroli have collaborated to publish a paper on Source Code Archiving to the Rescue of Reproducible Deployment. Their paper was motivated because:
The ability to verify research results and to experiment with methodologies are core tenets of science. As research results are increasingly the outcome of computational processes, software plays a central role. GNU Guix is a software deployment tool that supports reproducible software deployment, making it a foundation for computational research workflows. To achieve reproducibility, we must first ensure the source code of software packages Guix deploys remains available.
(A PDF of this article is also available.)

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 266, 267, 268 and 269 to Debian, making the following changes:
  • New features:
    • Use xz --list to supplement output when comparing .xz archives; essential when metadata differs. (#1069329)
    • Include xz --verbose --verbose (ie. double) output. (#1069329)
    • Strip the first line from the xz --list output. [ ]
    • Only include xz --list --verbose output if the xz has no other differences. [ ]
    • Actually append the xz --list after the container differences, as it simplifies a lot. [ ]
  • Testing improvements:
    • Allow Debian testing to fail right now. [ ]
    • Drop apktool from Build-Depends; we can still test APK functionality via autopkgtests. (#1071410)
    • Add a versioned dependency for at least version 5.4.5 for the xz tests as they fail under (at least) version 5.2.8. (#374)
    • Fix tests for 7zip 24.05. [ ][ ]
    • Fix all tests after addition of xz --list. [ ][ ]
  • Misc:
    • Update copyright years. [ ]
In addition, James Addison fixed an issue where the HTML output showed only the first difference in a file, while the text output shows all differences [ ][ ][ ], Sergei Trofimovich amended the 7zip version test for older 7z versions that include the string [64] [ ][ ] and Vagrant Cascadian relaxed the versioned dependency to allow version 5.4.1 for the xz tests [ ] and proposed updates to guix for versions 267, 268 and pushed version 269 to Guix. Furthermore, Eli Schwartz updated the diffoscope.org website in order to explain how to install diffoscope on Gentoo [ ].

Website updates There were a number of improvements made to our website this month, including Chris Lamb making the print CSS stylesheet nicer [ ]. Fay Stegerman made a number of updates to the page about the SOURCE_DATE_EPOCH environment variable [ ][ ][ ] and Holger Levsen added some of their presentations to the Resources page. Furthermore, IOhannes zmölnig stipulated support for SOURCE_DATE_EPOCH in clang version 16.0.0+ [ ], Jan Zerebecki expanded the Formal definition page and fixed a number of typos on the Buy-in page [ ] and Simon Josefsson fixed the link to Trisquel GNU/Linux on the Projects page [ ].

Upstream patches This month, we wrote a number of patches to fix specific reproducibility issues, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In May, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Enable the rebuilder-snapshot API on osuosl4. [ ]
    • Schedule the i386 architecture a bit more often. [ ]
    • Adapt cleanup_nodes.sh to the new way of running our build services. [ ]
    • Add 8 more workers for the i386 architecture. [ ]
    • Update configuration now that the infom07 and infom08 nodes have been reinstalled as real i386 systems. [ ]
    • Make diffoscope timeouts more visible on the #debian-reproducible-changes IRC channel. [ ]
    • Mark the cbxi4a-armhf node as down. [ ][ ]
    • Install the hdmi2usb-mode-switch package only on Debian bookworm and earlier [ ] and the haskell-platform package only on Debian bullseye [ ].
  • Misc:
    • Install the ntpdate utility as we need it later. [ ]
    • Document the progress on the i386 architecture nodes at Infomaniak. [ ]
    • Drop an outdated and unnoticed notice. [ ]
    • Add live_setup_schroot to the list of so-called zombie jobs. [ ]
In addition, Mattia Rizzolo reinstalled the infom07 and infom08 nodes [ ] and Vagrant Cascadian marked the cbxi4a node as online [ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

31 May 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.4.0 on CRAN: Upstream Bugfix

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1151 other packages on CRAN, downloaded 34.6 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 584 times according to Google Scholar. Conrad released a new upstream bugfix yesterday (to improve views of sparse matrices). We uploaded it yesterday too but it once again took a day for the hard-working CRAN maintainers to concur that the two NOTEs from reverse-dependency checking over 1100 packages were in fact false positives. And so it appeared on CRAN earlier today. We also increased the versioned dependency on Rcpp to match the use of optional entry-point headers Rcpp/Light, Rcpp/Lighter and Rcpp/Lightest. No other changes were made. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.4.0 (2024-05-30)
  • Upgraded to Armadillo release 12.8.4 (Cortisol Injector)
    • Faster handling of sparse submatrix views
  • Update versioned Depends on Rcpp to 1.0.8 or later to match use of Light/Lighter/Lightest headers.

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

14 May 2024

Evgeni Golov: Using Packit to build RPMs for projects that depend on or vendor your code

I am a huge fan of Packit as it allows us to provide RPMs to our users and testers directly from a pull-request, thus massively tightening the feedback loop and involving people who otherwise might not be able to apply the changes (for whatever reason) and "quickly test" something out. It's also a great way to validate that a change actually builds in a production environment, where no unnecessary development and test dependencies are installed. You can also run tests of the built packages on Testing Farm and automate pushing releases into Fedora/CentOS Stream, but this is neither a (plain) Packit advertisement post, nor is that functionality something I can talk about with a certain level of experience. Adam recently asked why we don't have Packit builds for our Puppet modules and my first answer was: "well, puppet-* doesn't produce a thing we ship directly, so nobody dared to do it". My second answer was that I had blogged how to test a Puppet module PR with Packit, but I totally agree that the process was a tad cumbersome and could be improved. Now some madman did it and we all get to hear his story! ;-) What is the problem anyway? The Foreman Installer is a bit of Ruby code1 that provides a CLI to puppet apply based on a set of Puppet modules. As the Puppet modules can also be used outside the installer and have their own lifecycle, they live in separate git repositories and their releases get uploaded to the Puppet Forge. Users however do not want to (and should not have to) install the modules themselves. So we have to ship the modules inside the foreman-installer package. Packaging 25 modules for two packaging systems (we support Enterprise Linux and Debian/Ubuntu) seems like a lot of work. Especially if you consider that the main foreman-installer package would need to be rebuilt after each module change as it contains generated files based on the modules which are too expensive to generate at runtime. So instead we ship the modules inside the foreman-installer source release, thus vendoring those modules into the installer release. To do so we use librarian-puppet with a Puppetfile and either a Puppetfile.lock for stable releases or by letting librarian-puppet fetch latest for nightly snapshots. This works beautifully for changes that land in the development and release branches of our repositories - regardless if it's foreman-installer.git or any of the puppet-*.git ones. It also works nicely for pull-requests against foreman-installer.git. But because the puppet-* repositories do not map to packages, we assumed it wouldn't work well for pull-requests against those. How can we solve this? Well, the "obvious" solution is to build the foreman-installer package via Packit also for pull-requests against the puppet-* repositories. However, as usual, the devil is in the details. Packit by default clones the repository of the pull-request and tries to create a source tarball from that using git archive. As this might be too simple for many projects, one can define a custom create-archive action that runs after the pull-request has been cloned and produces the tarball instead. We already use that in the Packit configuration for foreman-installer to run the pkg:generate_source rake target which executes librarian-puppet for us. But now the pull-request is against one of the Puppet modules, so Packit will clone that, not the installer. We gotta clone foreman-installer on our own. And then point librarian-puppet at the pull-request. Fun.
Cloning is relatively simple, call git clone -- sorry Packit/Copr infrastructure. But the Puppet module pull-request? One can use :git => 'https://git.example.com/repo.git' in the Puppetfile to fetch a git repository. In fact, that's what we already do for our nightly snapshots. It also supports :ref => 'some_branch_or_tag_name', if the remote HEAD is not what you want. My brain first went "I know this! GitHub has this magic refs/pull/1/head and refs/pull/1/merge refs you can checkout to get the contents of the pull-request without bothering to add a remote for the source of the pull-request". Well, this requires knowing the ID of the pull-request and Packit does not expose that in the environment variables available during create-archive. Wait, but we already have a checkout. Can we just say :git => '../.git'? Cloning a .git folder is totally possible after all.
[Librarian]     --> fatal: repository '../.git' does not exist
Could not checkout ../.git: fatal: repository '../.git' does not exist
Seems librarian disagrees. Damn. (Yes, I checked, the path exists.) Does it maybe just not like relative paths?! Yepp, using an absolute path absolutely works! For some reason it ends up checking out the default HEAD of the "real" (GitHub) remote, not of ../. Luckily this can be fixed by explicitly passing :ref => 'origin/HEAD', which resolves to the branch Packit created for the pull-request. Now we just need to put all of that together and remember to execute all commands from inside the foreman-installer checkout as that is where all our vendoring recipes etc live. Putting it all together Let's look at the diff between the packit.yaml for foreman-installer and the one I've proposed for puppet-pulpcore:
--- a/foreman-installer/.packit.yaml    2024-05-14 21:45:26.545260798 +0200
+++ b/puppet-pulpcore/.packit.yaml  2024-05-14 21:44:47.834162418 +0200
@@ -18,13 +18,15 @@
 actions:
   post-upstream-clone:
     - "wget https://raw.githubusercontent.com/theforeman/foreman-packaging/rpm/develop/packages/foreman/foreman-installer/foreman-installer.spec -O foreman-installer.spec"
+    - "git clone https://github.com/theforeman/foreman-installer"
+    - "sed -i '/theforeman.pulpcore/ s@:git.*@:git => \"# __dir__ /../.git\", :ref => \"origin/HEAD\"@' foreman-installer/Puppetfile"
   get-current-version:
-    - "sed 's/-develop//' VERSION"
+    - "sed 's/-develop//' foreman-installer/VERSION"
   create-archive:
-    - bundle config set --local path vendor/bundle
-    - bundle config set --local without development:test
-    - bundle install
-    - bundle exec rake pkg:generate_source
+    - bash -c "cd foreman-installer && bundle config set --local path vendor/bundle"
+    - bash -c "cd foreman-installer && bundle config set --local without development:test"
+    - bash -c "cd foreman-installer && bundle install"
+    - bash -c "cd foreman-installer && bundle exec rake pkg:generate_source"
  1. It clones foreman-installer (in post-upstream-clone, as that felt more natural after some thinking)
  2. It adjusts the Puppetfile to use #{__dir__}/../.git as the Git repository, abusing the fact that a Puppetfile is really just a Ruby script (sorry Ben!) and knows the __dir__ it lives in (see the sketch after this list)
  3. It fetches the version from the foreman-installer checkout, so it's sort-of reasonable
  4. It performs all building inside the foreman-installer checkout
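For illustration, after the sed call the relevant Puppetfile entry would end up looking roughly like this (the module name and quoting are assumed here, not taken from the actual repository):
mod 'theforeman/pulpcore', :git => "#{__dir__}/../.git", :ref => 'origin/HEAD'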
Can this be used in other scenarios? I hope so! Vendoring is not unheard of. And testing your "consumers" (dependents? naming is hard) is good style anyway!

  1. three Ruby modules in a trench coat, so to say

Julian Andres Klode: The new APT 3.0 solver

APT 2.9.3 introduces the first iteration of the new solver, codenamed solver3 and now available with the solver 3.0 option. The new solver works fundamentally differently from the old one.

How does it work? Solver3 is a fully backtracking dependency solving algorithm that defers choices to as late as possible. It starts with an empty set of packages, then adds the manually installed packages, and then installs packages automatically as necessary to satisfy the dependencies. Deferring the choices is implemented in multiple ways: First, all install requests recursively mark dependencies with a single solution for install, and any packages that are being rejected due to conflicts or user requests will cause their reverse dependencies to be transitively marked as rejected, provided their or-group cannot be solved by a different package. Second, any dependency with more than one choice is pushed to a priority queue that is ordered by the number of possible solutions, such that we resolve a|b before a|b|c. Not just by the number of solutions, though. One important point to note is that optional dependencies, that is, Recommends, always sort after mandatory dependencies. Do note on that: Recommended packages do not nest in backtracking - dependencies of a Recommended package themselves are not optional, so they will have to be resolved before the next Recommended package is seen in the queue. Another important step in deferring choices is extracting the common dependencies of a package across its versions and then installing them before we even decide which of its versions we want to install - one of the dependencies might cycle back to a specific version after all. Decisions about packages are recorded at a certain decision level; if we reach a conflict we backtrack to the previous decision level, mark the decision we made (install X) in the inverse (DO NOT INSTALL X), reset all the state of all decisions made at the higher level, and restore any dependencies that are no longer resolved to the work queue.

Comparison to SAT solver design. If you have studied SAT solver design, you'll find that essentially this is a DPLL solver without pure literal elimination. A pure literal elimination phase would not work for a package manager: First, negative pure literals (packages that everything conflicts with) do not exist, and positive pure literals (packages nothing conflicts with) we do not want to mark for install - we want to install as little as possible (well, subject to policy). As part of the solving phase, we also construct an implication graph, albeit a partial one: The first package installing another package is marked as the reason (A -> B), the same thing for conflicts (not A -> not B). Once we have added the ability to have multiple parents in the implication graph, it stands to reason that we can also implement the much more advanced method of conflict-driven clause learning, where we do not jump back to the previous decision level but exactly to the decision level that caused the conflict. This would massively speed up backtracking.

What changes can you expect in behavior? The most striking difference to the classic APT solver is that solver3 always keeps manually installed packages around; it never offers to remove them. We will relax that in a future iteration so that it can replace packages with new ones, that is, if your package is no longer available in the repository (obsolete), but there is one that Conflicts+Replaces+Provides it, solver3 will be allowed to install that and remove the other. Implementing that policy is rather trivial: We just need to queue the obsolete replacement as a dependency to solve, rather than mark the obsolete package for install. Another critical difference is the change in the autoremove behavior: The new solver currently only knows the strongest dependency chain to each package, and hence it will not keep around any packages that are only reachable via weaker chains. A common example is when gcc-<version> packages accumulate on your system over the years. They all have Provides: c-compiler and the libtool Depends: gcc | c-compiler is enough to keep them around.

New features The new option --no-strict-pinning instructs the solver to consider all versions of a package and not just the candidate version. For example, you could use apt install foo=2.0 --no-strict-pinning to install version 2.0 of foo and upgrade - or downgrade - packages as needed to satisfy the foo=2.0 dependencies. This mostly comes in handy in use cases involving Debian experimental or the Ubuntu proposed pockets, where you want to install a package from there, but try to satisfy from the normal release as much as possible. The implication graph building allows us to implement an apt why command, which, while not as nicely detailed as aptitude's, at least tells you the exact reason why a package is installed. It will only show the strongest dependency chain at first of course, since that is what we record.
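A minimal sketch of trying this out from the command line (assuming an APT new enough to accept a --solver 3.0 option for selecting the new solver, and with foo as a placeholder package name):
$ sudo apt --solver 3.0 install foo
$ sudo apt --solver 3.0 --no-strict-pinning install foo=2.0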

What is left to do? At the moment, error information is not stored across backtracking in any way, but we generally will want to show you the first conflict we reach as it is the most natural one, or all conflicts; currently you get the last conflict, which may not be particularly useful. Likewise, errors are currently just rendered as implication graphs of the form [not] A -> [not] B -> ..., and we need to put in some work to present those nicely. The test suite is not passing yet; I haven't really started working on it. A challenge is that most packages in the test suite are manually installed as they are mocked, and the solver now doesn't remove those. We plan to implement the replacement logic such that foo can be replaced by a foo2 that Conflicts/Replaces/Provides foo without needing to be automatically installed. Improving the backtracking to non-chronological conflict-driven clause learning would vastly enhance our backtracking performance - not that it seems to be an issue right now in my limited testing (mostly noble 64-bit-time_t upgrades). A lot of the complexity you would normally have is not there, because the manually installed packages and the resulting unit propagation (single-solution Depends, and reverse dependencies for Conflicts) already ground us fairly firmly in what changes we can actually make. Once all the stuff has landed, we need to start rolling it out and gathering feedback. On Ubuntu I'd like automated feedback on regressions (running solver3 in parallel, checking if the result is worse and then submitting an error to the error tracker); on Debian this could just be a role email address to send solver dumps to. At the same time, we can also roll this out incrementally. Like phased updates in Ubuntu, we can roll out the new solver as the default to 10%, 20%, 50% of users before going to the full 100%. This will allow us to capture regressions early and fix them.
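On the phased-rollout point, one common way such rollouts are implemented is to hash a stable per-machine identifier into a 0-99 bucket and compare it against the current phase percentage; a hypothetical sketch, not necessarily how APT or Ubuntu will do it:
import hashlib

def in_phase(machine_id, feature, phased_percentage):
    """Deterministically assign a machine to a 0-99 bucket for a feature
    and enable the feature once the rollout percentage covers it."""
    digest = hashlib.sha256(f"{feature}:{machine_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < phased_percentage

print(in_phase("example-machine-id", "solver3-default", 20))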

10 May 2024

Reproducible Builds: Reproducible Builds in April 2024

Welcome to the April 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. New backseat-signed tool to validate distributions' source inputs
  2. NixOS is not reproducible
  3. Certificate vulnerabilities in F-Droid's fdroidserver
  4. Website updates
  5. Reproducible Builds and Insights from an Independent Verifier for Arch Linux
  6. libntlm now releasing minimal source-only tarballs
  7. Distribution work
  8. Mailing list news
  9. diffoscope
  10. Upstream patches
  11. reprotest
  12. Reproducibility testing framework

New backseat-signed tool to validate distributions' source inputs kpcyrd announced a new tool called backseat-signed, after:
I figured out a somewhat straight-forward way to check if a given git archive output is cryptographically claimed to be the source input of a given binary package in either Arch Linux or Debian (or both).
Elaborating more in their announcement post, kpcyrd writes:
I believe this to be the reproducible source tarball thing some people have been asking about. As explained in the README, I believe reproducing autotools-generated tarballs isn't worth everybody's time and instead a distribution that claims to build from source should operate on VCS snapshots instead of tarballs with 25k lines of pre-generated shell-script.
Indeed, many distributions' packages already build from VCS snapshots, and this trend is likely to accelerate in response to the xz incident. The announcement led to a lengthy discussion on our mailing list, as well as a shorter followup thread from kpcyrd about bootstrapping Autotools projects.

NixOS is not reproducible Morten Linderud posted a post on his blog this month, provocatively titled "NixOS is not reproducible". Although quickly admitting that his title is indeed "clickbait", Morten goes on to clarify the precise guarantees and promises that NixOS provides its users. Later in the post, Morten mentions that he was motivated to write it because:
I have heavily invested my free-time on this topic since 2017, and met some of the accomplishments we have had with "Doesn't NixOS solve this?" for just as long, and I thought it would be of people's interest to clarify[.]

Certificate vulnerabilities in F-Droid's fdroidserver In early April, Fay Stegerman announced a certificate pinning bypass vulnerability and Proof of Concept (PoC) in the F-Droid fdroidserver (the tools for managing builds, indexes, updates, and deployments for F-Droid repositories) to the oss-security mailing list.
We observed that embedding a v1 (JAR) signature file in an APK with minSdk >= 24 will be ignored by Android/apksigner, which only checks v2/v3 in that case. However, since fdroidserver checks v1 first, regardless of minSdk, and does not verify the signature, it will accept a fake certificate and see an incorrect certificate fingerprint. [ ] We also realised that the above mentioned discrepancy between apksigner and androguard (which fdroidserver uses to extract the v2/v3 certificates) can be abused here as well. [ ]
Later on in the month, Fay followed up with a second post detailing a third vulnerability and a script that could be used to scan for potentially affected .apk files and mentioned that, whilst upstream had acknowledged the vulnerability, they had not yet applied any ameliorating fixes.
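A sketch of the ordering pitfall described above, with made-up helper values rather than fdroidserver's or apksigner's actual API: the safe behaviour is to only trust the v1 (JAR) signature when Android itself would consult it, i.e. when minSdk < 24.
def effective_certificate(min_sdk, v1_cert, v2_v3_cert):
    """Return the certificate the Android platform will actually enforce.

    The vulnerability class above comes from trusting v1 signature data
    even when minSdk >= 24, where Android only checks v2/v3.
    """
    if min_sdk >= 24:
        return v2_v3_cert  # v1 data must be ignored entirely
    return v1_cert if v1_cert is not None else v2_v3_cert

# A spoofed v1 certificate must not win for a minSdk >= 24 APK:
print(effective_certificate(24, v1_cert="attacker", v2_v3_cert="developer"))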

Website updates There were a number of improvements made to our website this month, including Chris Lamb updating the archive page to recommend -X and unzipping with TZ=UTC [ ] and adding Maven, Gradle, JDK and Groovy examples to the SOURCE_DATE_EPOCH page [ ]. In addition Jan Zerebecki added a new /contribute/opensuse/ page [ ] and Sertonix fixed the automatic RSS feed detection [ ][ ].
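As a small aside on the SOURCE_DATE_EPOCH convention referenced above (a sketch based on the published specification, not on the website's new examples): build tools clamp any embedded timestamps to that value so that later rebuilds produce identical output.
import os, time

def build_timestamp():
    """Use SOURCE_DATE_EPOCH when set, clamping newer times to it, so
    generated archives do not embed the wall-clock build time."""
    now = int(time.time())
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    return min(now, int(epoch)) if epoch is not None else now

print(build_timestamp())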

Reproducible Builds and Insights from an Independent Verifier for Arch Linux Joshua Drexel, Esther Hänggi and Iyán Méndez Veiga of the School of Computer Science and Information Technology, Hochschule Luzern (HSLU) in Switzerland published a paper this month entitled Reproducible Builds and Insights from an Independent Verifier for Arch Linux. The paper establishes the context as follows:
Supply chain attacks have emerged as a prominent cybersecurity threat in recent years. Reproducible and bootstrappable builds have the potential to reduce such attacks significantly. In combination with independent, exhaustive and periodic source code audits, these measures can effectively eradicate compromises in the building process. In this paper we introduce both concepts, we analyze the achievements over the last ten years and explain the remaining challenges.
What is more, the paper aims to:
contribute to the reproducible builds effort by setting up a rebuilder and verifier instance to test the reproducibility of Arch Linux packages. Using the results from this instance, we uncover an unnoticed and security-relevant packaging issue affecting 16 packages related to Certbot [ ].
A PDF of the paper is available.

libntlm now releasing minimal source-only tarballs Simon Josefsson wrote on his blog this month that, going forward, the libntlm project will now be releasing what they call "minimal source-only tarballs":
The XZUtils incident illustrate that tarballs with files that are not included in the git archive offer an opportunity to disguise malicious backdoors. [The] risk of hiding malware is not the only motivation to publish signed minimal source-only tarballs. With pre-generated content in tarballs, there is a risk that GNU/Linux distributions [ship] generated files coming from the tarball into the binary *.deb or *.rpm package file. Typically the person packaging the upstream project never realized that some installed artifacts was not re-built[.]
Simon's post goes into further detail about how this was achieved, and describes some potential caveats and counters some expected responses as well. A shorter version can be found in the announcement for the 1.8 release of libntlm.

Distribution work In Debian this month, Helmut Grohne filed a bug suggesting the removal of dh-buildinfo, a tool to generate and distribute .buildinfo-like files within binary packages. Note that this is distinct from the .buildinfo generation performed by dpkg-genbuildinfo. By contrast, the entirely optional dh-buildinfo generated a debian/buildinfo file that would be shipped within binary packages as /usr/share/doc/package/buildinfo_$arch.gz. Adrian Bunk recently asked about including source hashes in Debian's .buildinfo files, which prompted Guillem Jover to refresh some old patches to dpkg to make this possible, which revealed some quirks Vagrant Cascadian discovered when testing. In addition, 21 reviews of Debian packages were added, 22 were updated and 16 were removed this month, adding to our knowledge about identified issues. A number of issue types have been added, such as the new random_temporary_filenames_embedded_by_mesonpy and timestamps_added_by_librime toolchain issues. In openSUSE, it was announced that their Factory distribution enabled bit-by-bit reproducible builds for almost all parts of the package. Previously, more parts needed to be ignored when comparing package files, but now only the signature needs to be deleted. In addition, Bernhard M. Wiedemann published theunreproduciblepackage as a proper .rpm package, which allows better testing of tools intended to debug reproducibility. Furthermore, it was announced that Bernhard's work on a 100% reproducible openSUSE-based distribution will be funded by NLnet. He also posted another monthly report for his reproducibility work in openSUSE. In GNU Guix, Janneke Nieuwenhuizen submitted a patch set for creating a reproducible source tarball for Guix. That is to say, ensuring that make dist is reproducible when run from Git. [ ] Lastly, in Fedora, a new wiki page was created to propose a change to the distribution. Titled "Changes/ReproduciblePackageBuilds", the page summarises itself as a proposal whereby "a post-build cleanup is integrated into the RPM build process so that common causes of build irreproducibility in packages are removed, making most of Fedora packages reproducible".

Mailing list news On our mailing list this month:
  • Continuing a thread started in March 2024 about the Arch Linux minimal container now being 100% reproducible, John Gilmore followed up with a post about the practical and philosophical distinctions of local vs. remote storage of the various artifacts needed to build packages.
  • Chris Lamb asked the list which conferences readers are attending these days: "After peak Covid and other industry-wide changes, conferences are no longer the must-attend events they previously were, especially in the area of software supply-chain security. In rough, practical terms, it seems harder to justify conference travel today than it did in mid-2019." The thread generated a number of responses which would be of interest to anyone planning travel in Q3 and Q4 of 2024.
  • James Addison wrote to the list about a quirk in Git related to its core.autocrlf functionality, thus helpfully passing on a slightly off-topic and perhaps not of direct relevance to anyone on the list today note that might still be the kind of issue that is useful to be aware of if-and-when puzzling over unexpected git content / checksum issues (situations that I do expect people on this list encounter from time-to-time) .

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 263, 264 and 265 to Debian and made the following additional changes:
  • Don't crash on invalid .zip files, even if we encounter their badness halfway through the file and not at the time of their initial opening. [ ]
  • Prevent odt2txt tests from always being skipped due to an (impossibly) new version requirement. [ ]
  • Avoid parens-in-parens in test skipping messages. [ ]
  • Ensure that tests with >=-style version constraints actually print the tool name. [ ]
In addition, Fay Stegerman fixed a crash when there are (invalid) duplicate entries in .zip files (which was originally reported as Debian bug #1068705). [ ] Fay also added a user-visible note to a diff when there are duplicate entries in ZIP files [ ]. Lastly, Vagrant Cascadian added an external tool pointer for the zipdetails tool under GNU Guix [ ] and proposed updates to diffoscope in Guix as well [ ], which were merged as [264] [265], fixed a regression in test coverage and increased the verbosity of the test suite [ ].
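For context on those duplicate-entry fixes, spotting such (invalid) duplicates is straightforward with Python's standard zipfile module; this sketch is illustrative and unrelated to diffoscope's own implementation:
import collections, zipfile

def duplicate_zip_entries(path):
    """Return member names that appear more than once in a ZIP archive."""
    with zipfile.ZipFile(path) as zf:
        counts = collections.Counter(zf.namelist())
    return [name for name, count in counts.items() if count > 1]

# Hypothetical usage: duplicate_zip_entries("suspect.apk")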

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

reprotest reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, reprotest version 0.7.27 was uploaded to Debian unstable by Vagrant Cascadian, who made the following additional changes:
  • Enable specific number of CPUs using --vary=num_cpus.cpus=X. [ ]
  • Consistently use 398 days for time variation, rather than choosing randomly each time. [ ]
  • Disable builds of arch:any packages. [ ]
  • Update the description for the build_path.path option in README.rst. [ ]
  • Update escape sequences for compatibility with Python 3.12. (#1068853). [ ]
  • Remove the generic upstream signing-key [ ] and update the packages signing key with the currently active team members [ ].
  • Update the packaging Standards-Version to 4.7.0. [ ]
In addition, Holger Levsen fixed some spelling errors detected by the spellintian tool [ ] and Vagrant Cascadian updated reprotest in GNU Guix to 0.7.27.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In April, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Adjust for changed internal IP addresses at Codethink. [ ]
    • Automatically cleanup failed diffoscope user services if there are too many failures. [ ][ ]
    • Configure two new nodes at infomaniak.cloud. [ ][ ]
    • Schedule Debian experimental even less. [ ][ ]
  • Breakage detection:
    • Exclude currently building packages from breakage detection. [ ]
    • Be more noisy if diffoscope crashes. [ ]
    • Health check: provide clickable URLs in jenkins job log for failed pkg builds due to diffoscope crashes. [ ]
    • Limit graph to about the last 100 days of breakages only. [ ]
    • Fix all found files with bad permissions. [ ]
    • Prepare dealing with diffoscope timeouts. [ ]
    • Detect more cases of failure to debootstrap base system. [ ]
    • Include timestamps of failed job runs. [ ]
  • Documentation updates:
    • Document how to access arm64 nodes at Codethink. [ ]
    • Document how to use infomaniak.cloud. [ ]
    • Drop notes about long stalled LeMaker HiKey960 boards sponsored by HPE and hosted at ETH. [ ]
    • Mention osuosl4 and osuosl5 and explain their usage. [ ]
    • Mention that some packages are built differently. [ ][ ]
    • Improve language in a comment. [ ]
    • Add more notes how to query resource usage from infomaniak.cloud. [ ]
  • Node maintenance:
    • Add ionos4 and ionos14 to THANKS. [ ][ ][ ][ ][ ]
    • Deprecate Squid on ionos1 and ionos10. [ ]
    • Drop obsolete script to powercycle arm64 architecture nodes. [ ]
    • Update system_health_check for new proxy nodes. [ ]
  • Misc changes:
    • Make the update_jdn.sh script more robust. [ ][ ]
    • Update my SSH public key. [ ]
In addition, Mattia Rizzolo added some new host details. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

9 May 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.3.0 on CRAN: Upstream Bugfix

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1144 other packages on CRAN, downloaded 34.2 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 583 times according to Google Scholar. Conrad released a new upstream bugfix yesterday (for a corner case with fftw3). We uploaded it yesterday too but it took a day for the hard-working CRAN maintainers to concur that the one (!) NOTE from reverse-dependency checking over 1100 packages was in fact a false positive. And so it appeared on CRAN (very) early this morning. We also made a change removing a long-redundant setter for C++11 mode via the plugin. No other changes were made. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.3.0 (2024-05-07)
  • Upgraded to Armadillo release 12.8.3 (Cortisol Injector)
    • Fix issue in fft() and fft2() in multi-threaded contexts with FFTW3 enabled
  • No longer set C++11 for the Rcpp plugin as this standard has been the default in R for a very long time now.

Courtesy of my CRANberries, there is a diffstat report relative to the previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc. should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

3 May 2024

Colin Watson: Playing with rich

One of the things I do as a side project for Freexian is to work on various bits of business automation: accounting tools, programs to help contributors report their hours, invoicing, that kind of thing. While it's not quite my usual beat, this makes quite a good side project as the tools involved are mostly rather sensible and easy to deal with (Python, git, ledger, that sort of thing) and it's the kind of thing where I can dip into it for a day or so a week and feel like I'm making useful contributions. The logic can be quite complex, but there's very little friction in the tools themselves. A recent case where I did run into some friction in the tools was with some commands that need to present small amounts of tabular data on the terminal, using OSC 8 hyperlinks if the terminal supports them: think customer-related information with some links to issues. One of my colleagues had previously done this using a hack on top of texttable, which was perfectly fine as far as it went. However, now I wanted to be able to add multiple links in a single table cell in some cases, and that was really going to stretch the limits of that approach: working out the width of the displayed text in the cell was going to take an annoying amount of bookkeeping. I started looking around to see whether any other approaches might be easier, without too much effort (remember that "a day or so a week" bit above). ansiwrap looked somewhat promising, but it isn't currently packaged in Debian, and it would have still left me with the problem of figuring out how to integrate it into texttable, which looked like it would be quite complicated. Then I remembered that I'd heard good things about rich, and thought I'd take a look. rich turned out to be exactly what I wanted. Instead of something like this based on the texttable hack above:
import shutil
from pyxian.texttable import UrlTable
termsize = shutil.get_terminal_size((80, 25))
table = UrlTable(max_width=termsize.columns)
table.set_deco(UrlTable.HEADER)
table.set_cols_align(["l"])
table.set_cols_dtype(["u"])
table.add_row(["Issue"])
table.add_row([(issue_url, f"#{issue_id}")])
print(table.draw())
now I can do this instead:
import rich
from rich import box
from rich.table import Table
table = Table(box=box.SIMPLE)
table.add_column("Issue")
table.add_row(f"[link= issue_url ]# issue_id [/link]")
rich.print(table)
While this is a little shorter, the real bonus is that I can now just put multiple [link] tags in a single string, and it all just works. No ceremony. In fact, once the relevant bits of code passed type-checking (since the real code is a bit more complex than the samples above), it worked first time. It's a pleasure to work with a library like that. It looks like I've only barely scratched the surface of rich, but I expect I'll reach for it more often now.
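For instance, the multiple-links-in-one-cell case that motivated the switch is just string concatenation of rich's [link] markup; a self-contained sketch with made-up issue numbers and URLs:
import rich
from rich import box
from rich.table import Table

table = Table(box=box.SIMPLE)
table.add_column("Issues")
# Two hyperlinks in a single cell; rich handles the width bookkeeping.
links = ", ".join(
    f"[link={url}]#{issue_id}[/link]"
    for issue_id, url in [(101, "https://example.org/101"),
                          (102, "https://example.org/102")]
)
table.add_row(links)
rich.print(table)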

1 May 2024

Antoine Beaupr : Tor migrates from Gitolite/GitWeb to GitLab

Note: I've been awfully silent here for the past ... (checks notes) oh dear, 3 months! But that's not because I've been idle, quite the contrary, I've been very busy but just didn't have time to write about anything. So I've taken it upon myself to write something about my work this week, and published this post on the Tor blog which I copy here for a broader audience. Let me know if you like this or not.
Tor has finally completed a long migration from legacy Git infrastructure (Gitolite and GitWeb) to our self-hosted GitLab server. Git repository addresses have therefore changed. Many of you probably have made the switch already, but if not, you will need to change:
https://git.torproject.org/
to:
https://gitlab.torproject.org/
In your Git configuration. The GitWeb front page is now an archived listing of all the repositories before the migration. Inactive git repositories were archived in GitLab legacy/gitolite namespace and the gitweb.torproject.org and git.torproject.org web sites now redirect to GitLab. Best effort was made to reproduce the original gitolite repositories faithfully and also avoid duplicating too much data in the migration. But it's possible that some data present in Gitolite has not migrated to GitLab. User repositories are particularly at risk, because they were massively migrated, and they were "re-forked" from their upstreams, to avoid wasting disk space. If a user had a project with a matching name it was assumed to have the right data, which might be inaccurate. The two virtual machines responsible for the legacy service (cupani for git-rw.torproject.org and vineale for git.torproject.org and gitweb.torproject.org) have been shutdown. Their disks will remain for 3 months (until the end of July 2024) and their backups for another year after that (until the end of July 2025), after which point all the data from those hosts will be destroyed, with only the GitLab archives remaining. The rest of this article expands on how this was done and what kind of problems we faced during the migration.

Where is the code? Normally, nothing should be lost. All repositories in gitolite have been either explicitly migrated by their owners, forcibly migrated by the sysadmin team (TPA), or explicitly destroyed at their owner's request. An exhaustive rewrite map translates gitolite projects to GitLab projects. Some of those projects actually redirect to their parent in cases of empty repositories that were obvious forks. Destroyed repositories redirect to the GitLab front page. Because the migration happened progressively, it's technically possible that commits pushed to gitolite were lost after the migration. We took great care to avoid that scenario. First, we adopted a proposal (TPA-RFC-36) in June 2023 to announce the transition. Then, in March 2024, we locked down all repositories from any further changes. Around that time, only a handful of repositories had changes made after the adoption date, and we examined each repository carefully to make sure nothing was lost. Still, we built a diff of all the changes in the git references that archivists can peruse to check for data loss. It's large (6MiB+) because a lot of repositories were migrated before the mass migration and then kept evolving in GitLab. Many other repositories were rebuilt in GitLab from parent to rebuild a fork relationship which added extra references to those clones. A note to amateur archivists out there, it's probably too late for one last crawl now. The Git repositories now all redirect to GitLab and are effectively unavailable in their original form. That said, the GitWeb site was crawled into the Internet Archive in February 2024, so at least some copy of it is available in the Wayback Machine. At that point, however, many developers had already migrated their projects to GitLab, so the copies there were already possibly out of date compared with the repositories in GitLab. Software Heritage also has a copy of all repositories hosted on Gitolite since June 2023 and have continuously kept mirroring the repositories, where they will be kept hopefully in eternity. There's an issue where the main website can't find the repositories when you search for gitweb.torproject.org, instead search for git.torproject.org. In any case, if you believe data is missing, please do let us know by opening an issue with TPA.

Why? This is an old project in the making. The first discussion about migrating from gitolite to GitLab started in 2020 (almost 4 years ago). But going further back, the first GitLab experiment was in 2016, almost a decade ago. The current GitLab server dates from 2019, replacing Trac for issue tracking in 2020. It was originally supposed to host only mirrors for merge requests and issue trackers but, naturally, one thing led to another and eventually, GitLab had grown a container registry, continuous integration (CI) runners, GitLab Pages, and, of course, hosted most Git repositories. There were hesitations at moving to GitLab for code hosting. We had discussions about the increased attack surface and ways to mitigate that, but, ultimately, it seems the issues were not that serious and the community embraced GitLab. TPA actually migrated its most critical repositories out of shared hosting entirely, into specific servers (e.g. the Puppet Git repository is just on the Puppet server now), leveraging Git's decentralized nature and removing an entire attack surface from our infrastructure. Some of those repositories are mirrored back into GitLab, but the authoritative copy is not on GitLab. In any case, the proposal to migrate from Gitolite to GitLab was effectively just formalizing a fait accompli.

How to migrate from Gitolite / cgit to GitLab The progressive migration was a challenge. If you intend to migrate between hosting platforms, we strongly recommend making a "flag day" during which you migrate all repositories at once. This ensures a smoother transition and avoids elaborate rewrite rules. When Gitolite access was shut down, we had repositories on both GitLab and Gitolite, without a clear relationship between the two. A priori, the plan then was to import all the remaining Gitolite repositories into the legacy/gitolite namespace, but that seemed wasteful, particularly for large repositories like Tor Browser which uses nearly a gigabyte of disk space. So we took special care to avoid duplicating repositories. When the mass migration started, only 71 of the 538 Gitolite repositories were Migrated to GitLab in the gitolite.conf file. So, given that we had hundreds of repositories to migrate, we developed some automation to "save time". We already automate similar ad-hoc tasks with Fabric, so we used that framework here as well. (Our normal configuration management tool is Puppet, which is a poor fit here.) So a relatively large amount of Python code was produced to basically do the following (a simplified sketch of the decision flow follows the list):
  1. check if all on-disk repositories are listed in gitolite.conf (and vice versa) and either add missing repositories or delete them from disk if garbage
  2. for each repository in gitolite.conf, if its category is marked Migrated to GitLab, skip, otherwise;
  3. find a matching GitLab project by name, prompt the user for multiple matches
  4. if a match is found, redirect if the repository is non-empty
    • we have GitLab projects that look like the real thing, but are only present to host migrated Trac issues
    • in such cases we cloned the Gitolite project locally and pushed to the existing repository instead
  5. otherwise, a new repository is created in the legacy/gitolite namespace, using the "import" mechanism in GitLab to automatically import the repository from Gitolite, creating redirections and updating gitolite.conf to document the change
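A heavily simplified, hypothetical Python sketch of the decision flow in steps 2-5 above; the real logic lives in the public fabric-tasks repository and is considerably more involved, so the helper callables here (find_gitlab_project, redirect, import_to_legacy) are stand-ins only:
def migrate_one(name, category, find_gitlab_project, redirect, import_to_legacy):
    """Decide what to do with a single gitolite repository."""
    if category == "Migrated to GitLab":
        return "skip"
    match = find_gitlab_project(name)           # may prompt on multiple matches
    if match is not None and not match.get("empty", False):
        redirect(name, match["path"])           # non-empty match: just redirect
        return "redirected"
    import_to_legacy(name, f"legacy/gitolite/{name}")
    return "imported"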
User repositories (those under the user/ directory in Gitolite) were handled specially. First, the existing redirection map was checked to see if a similarly named project was migrated (so that, e.g. user/dgoulet/tor is properly treated as a fork of tpo/core/tor). Then the parent project was forked in GitLab and the Gitolite project force-pushed to the fork. This allows us to show the fork relationship in GitLab and, more importantly, benefit from the "pool" feature in GitLab which deduplicates disk usage between forks. Sometimes, we found no such relationships. Then we simply imported multiple repositories with similar names in the legacy/gitolite namespace, sometimes creating forks between user repositories, on a first-come-first-served basis from the gitolite.conf order. The code used in this migration is now available publicly. We encourage other groups planning to migrate from Gitolite/GitWeb to GitLab to use (and contribute to) our fabric-tasks repository, even though it does have its fair share of hard-coded assertions. The main entry point is the gitolite.mass-repos-migration task. A typical migration job looked like:
anarcat@angela:fabric-tasks$ fab -H cupani.torproject.org gitolite.mass-repos-migration 
[...]
INFO: skipping project project/help/infra in category Migrated to GitLab
INFO: skipping project project/help/wiki in category Migrated to GitLab
INFO: skipping project project/jenkins/jobs in category Migrated to GitLab
INFO: skipping project project/jenkins/tools in category Migrated to GitLab
INFO: searching for projects matching fastlane
INFO: Successfully connected to https://gitlab.torproject.org
import gitolite project project/tor-browser/fastlane into gitlab legacy/gitolite/project/tor-browser/fastlane with desc 'Tor Browser app store and deployment configuration for Fastlane'? [Y/n] 
INFO: importing gitolite project project/tor-browser/fastlane into gitlab legacy/gitolite/project/tor-browser/fastlane with desc 'Tor Browser app store and deployment configuration for Fastlane'
INFO: building a new connect to cupani
INFO: defaulting name to fastlane
INFO: importing project into GitLab
INFO: Successfully connected to https://gitlab.torproject.org
INFO: loading group legacy/gitolite/project/tor-browser
INFO: archiving project
INFO: creating repository fastlane (fastlane) in namespace legacy/gitolite/project/tor-browser from https://git.torproject.org/project/tor-browser/fastlane into https://gitlab.torproject.org/legacy/gitolite/project/tor-browser/fastlane
INFO: migrating Gitolite repository project/tor-browser/fastlane to GitLab project legacy/gitolite/project/tor-browser/fastlane
INFO: uploading 399 bytes to /srv/git.torproject.org/repositories/project/tor-browser/fastlane.git/hooks/pre-receive
INFO: making /srv/git.torproject.org/repositories/project/tor-browser/fastlane.git/hooks/pre-receive executable
INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project project/tor-browser/fastlane to category Migrated to GitLab
INFO: skipping project project/bridges/bridgedb-admin in category Migrated to GitLab
[...]
In the above, you can see migrated repositories skipped then the fastlane project being archived into GitLab. Another example with a later version of the script, processing only user repositories and showing the interactive prompt and a force-push into a fork:
$ fab -H cupani.torproject.org  gitolite.mass-repos-migration --include 'user/.*' --exclude '.*tor-?browser.*'
INFO: skipping project user/aagbsn/bridgedb in category Migrated to GitLab
[...]
INFO: skipping project user/phw/atlas in category Migrated to GitLab
INFO: processing project user/phw/obfsproxy (Philipp's obfsproxy repository) in category Users' development repositories (Attic)
INFO: Successfully connected to https://gitlab.torproject.org
INFO: user repository detected, trying to find fork phw/obfsproxy
WARNING: no existing fork found, entering user fork subroutine
INFO: found 6 GitLab projects matching 'obfsproxy' (https://gitweb.torproject.org/user/phw/obfsproxy.git)
0 legacy/gitolite/debian/obfsproxy
1 legacy/gitolite/debian/obfsproxy-legacy
2 legacy/gitolite/user/asn/obfsproxy
3 legacy/gitolite/user/ioerror/obfsproxy
4 tpo/anti-censorship/pluggable-transports/obfsproxy
5 tpo/anti-censorship/pluggable-transports/obfsproxy-legacy
select parent to fork from, or enter to abort: ^G4
INFO: repository is not empty: in-pack: 2104, packs: 1, size-pack: 414
fork project tpo/anti-censorship/pluggable-transports/obfsproxy into legacy/gitolite/user/phw/obfsproxy^G [Y/n] 
INFO: loading project tpo/anti-censorship/pluggable-transports/obfsproxy
INFO: forking project user/phw/obfsproxy into namespace legacy/gitolite/user/phw
INFO: waiting for fork to complete...
INFO: fork status: started, sleeping...
INFO: fork finished
INFO: cloning and force pushing from user/phw/obfsproxy to legacy/gitolite/user/phw/obfsproxy
INFO: deleting branch protection: <class 'gitlab.v4.objects.branches.ProjectProtectedBranch'> => {'id': 2723, 'name': 'master', 'push_access_levels': [{'id': 2864, 'access_level': 40, 'access_level_description': 'Maintainers', 'deploy_key_id': None}], 'merge_access_levels': [{'id': 2753, 'access_level': 40, 'access_level_description': 'Maintainers'}], 'allow_force_push': False}
INFO: cloning repository git-rw.torproject.org:/srv/git.torproject.org/repositories/user/phw/obfsproxy.git in /tmp/tmp6orvjggy/user/phw/obfsproxy
Cloning into bare repository '/tmp/tmp6orvjggy/user/phw/obfsproxy'...
INFO: pushing to GitLab: https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
remote: 
remote: To create a merge request for bug_10887, visit:        
remote:   https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy/-/merge_requests/new?merge_request%5Bsource_branch%5D=bug_10887        
remote: 
[...]
To ssh://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
 + 2bf9d09...a8e54d5 master -> master (forced update)
 * [new branch]      bug_10887 -> bug_10887
[...]
INFO: migrating repo
INFO: migrating Gitolite repository https://gitweb.torproject.org/user/phw/obfsproxy.git to GitLab project https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project user/phw/obfsproxy to category Migrated to GitLab
INFO: processing project user/phw/scramblesuit (Philipp's ScrambleSuit repository) in category Users' development repositories (Attic)
INFO: user repository detected, trying to find fork phw/scramblesuit
WARNING: no existing fork found, entering user fork subroutine
WARNING: no matching gitlab project found for user/phw/scramblesuit
INFO: user fork subroutine failed, resuming normal procedure
INFO: searching for projects matching scramblesuit
import gitolite project user/phw/scramblesuit into gitlab legacy/gitolite/user/phw/scramblesuit with desc 'Philipp's ScrambleSuit repository'?^G [Y/n] 
INFO: checking if remote repo https://git.torproject.org/user/phw/scramblesuit exists
INFO: importing gitolite project user/phw/scramblesuit into gitlab legacy/gitolite/user/phw/scramblesuit with desc 'Philipp's ScrambleSuit repository'
INFO: importing project into GitLab
INFO: Successfully connected to https://gitlab.torproject.org
INFO: loading group legacy/gitolite/user/phw
INFO: creating repository scramblesuit (scramblesuit) in namespace legacy/gitolite/user/phw from https://git.torproject.org/user/phw/scramblesuit into https://gitlab.torproject.org/legacy/gitolite/user/phw/scramblesuit
INFO: archiving project
INFO: migrating Gitolite repository https://gitweb.torproject.org/user/phw/scramblesuit.git to GitLab project https://gitlab.torproject.org/legacy/gitolite/user/phw/scramblesuit
INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project user/phw/scramblesuit to category Migrated to GitLab
[...]
Acute eyes will notice the bell used as a notification mechanism as well in this transcript. A lot of the code is now useless for us, but some, like "commit and push" or is-repo-empty live on in the git module and, of course, the gitlab module has grown some legs along the way. We've also found fun bugs, like a file descriptor exhaustion in bash, among other oddities. The retirement milestone and issue 41215 has a detailed log of the migration, for those curious. This was a challenging project, but it feels nice to have this behind us. This gets rid of 2 of the 4 remaining machines running Debian "old-old-stable", which moves a bit further ahead in our late bullseye upgrades milestone. Full transparency: we tested GPT-3.5, GPT-4, and other large language models to see if they could answer the question "write a set of rewrite rules to redirect GitWeb to GitLab". This has become a standard LLM test for your faithful writer to figure out how good a LLM is at technical responses. None of them gave an accurate, complete, and functional response, for the record. The actual rewrite rules as of this writing follow, for humans that actually like working answers provided by expert humans instead of artificial intelligence which currently seem to be, glorified, mansplaining interns.

git.torproject.org rewrite rules Those rules are relatively simple in that they rewrite a single URL to its equivalent GitLab counterpart in a 1:1 fashion. It relies on the rewrite map mentioned above, of course.
RewriteEngine on
# this RewriteMap connects the gitweb projects to their GitLab
# equivalent
RewriteMap gitolite2gitlab "txt:/etc/apache2/gitolite2gitlab.txt"
# if this becomes a performance bottleneck, convert to a DBM map with:
#
#  $ httxt2dbm -i mapfile.txt -o mapfile.map
#
# and:
#
# RewriteMap mapname "dbm:/etc/apache/mapfile.map"
#
# according to reports lavamind found online, we hit such a
# performance bottleneck only around millions of entries, which is not our case
# those two rules can go away once all the projects are
# migrated to GitLab
#
# this matches the request URI so we can check the RewriteMap
# for a match next
#
# WARNING: this won't match URLs without .git in them, which
# *do* work now. one possibility would be to match the request
# URI (without query string!) with:
#
# /git/(.*)(.git)?/(((branches|hooks|info|objects/).*)|git-.*|upload-pack|receive-pack|HEAD|config|description)?.
#
# I haven't been able to figure out the actual structure of
# those URLs, so it's really hard to figure out the boundaries
# of the project name here. I stopped after pouring around the
# http-backend.c code in git
# itself. https://www.git-scm.com/docs/http-protocol is also
# kind of incomplete and unsatisfying.
RewriteCond %{REQUEST_URI} ^/(git/)?(.*).git/.*$
# this makes the RewriteRule match only if there's a match in
# the rewrite map
RewriteCond ${gitolite2gitlab:%2|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(git/)?(.*).git/(.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$2}.git/$3 [R=302,L]
# Fallback everything else to GitLab
RewriteRule (.*) https://gitlab.torproject.org [R=302,L]

gitweb.torproject.org rewrite rules Those are the vastly more complicated GitWeb to GitLab rewrite rules. Note that we say "GitWeb" but we were actually not running GitWeb but cgit, as the former didn't actually scale for us.
RewriteEngine on
# this RewriteMap connects the gitweb projects to their GitLab
# equivalent
RewriteMap gitolite2gitlab "txt:/etc/apache2/gitolite2gitlab.txt"
# special rule to process targets of the old spec.tpo site and
# bring them to the right redirect on the new spec.tpo site. that should turn, for example:
#
# https://gitweb.torproject.org/torspec.git/tree/address-spec.txt
#
# into:
#
# https://spec.torproject.org/address-spec
RewriteRule ^/torspec.git/tree/(.*).txt$ https://spec.torproject.org/$1 [R=302]
# list of endpoints taken from cgit's cmd.c
# those two RewriteCond are necessary because we don't move
# all repositories at once. once the migration is completed,
# they can be removed.
#
# and yes, they are copied all over the place below
#
# create a match for the project name to check if the project
# has been moved to GitLab
RewriteCond %{REQUEST_URI} ^/(.*).git(/.*)?$
# this makes the RewriteRule match only if there's a match in
# the rewrite map
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
# main project page, like summary below
RewriteRule ^/(.*).git/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]
# summary
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/summary/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]
# about
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/about/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]
# commit
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))id=([^&]*)(&.*)?$"
RewriteRule ^/(.*).git/commit/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%2 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/commit/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L]
# diff, incomplete because can diff arbitrary refs and files in cgit but not in GitLab, hard to parse
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/diff/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1 [R=302,L,QSD]
# patch
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/patch/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1.patch [R=302,L,QSD]
# rawdiff, incomplete because can show only one file diff, which GitLab cannot
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/rawdiff/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1.diff [R=302,L,QSD]
# log
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/log/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/%1 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/log/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/log(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD$2 [R=302,L]
# atom
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/atom/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/%1 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/atom/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L,QSD]
# refs, incomplete because two pages in GitLab, defaulting to "tags"
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/refs/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tags [R=302,L]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/tag/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tags/%1 [R=302,L,QSD]
# tree
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/tree(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/%1$2 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/tree(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/HEAD$2 [R=302,L]
# /-/tree has no good default in GitLab, revert to HEAD which is a good
# approximation (we can't assume "master" here anymore)
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/tree/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/HEAD [R=302,L]
# plain
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/plain(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/raw/%1$2 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/plain(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/raw/HEAD$2 [R=302,L]
# blame: disabled
#RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
#RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
#RewriteCond %{QUERY_STRING} h=([^&]*)
#RewriteRule ^/(.*).git/blame(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/blame/%1$2 [R=302,L,QSD]
# same default as tree above
#RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
#RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
#RewriteRule ^/(.*).git/blame(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/blame/HEAD/$2 [R=302,L]
# stats
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/stats/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/graphs/HEAD [R=302,L]
# still TODO:
# repolist: once migration is complete
#
# cannot be done:
# atom: needs a feed token, user must be logged in
# blob: no direct equivalent
# info: not working on main cgit website?
# ls_cache: not working, irrelevant?
# objects: undocumented?
# snapshot: pattern too hard to match on cgit's side
# special case, we keep a copy of the main index on the archive
RewriteRule ^/?$ https://archive.torproject.org/websites/gitweb.torproject.org.html [R=302,L]
# Fallback: everything else to GitLab
RewriteRule .* https://gitlab.torproject.org [R=302,L]
The reference copy of those is available in our (currently private) Puppet git repository.

25 April 2024

Jonathan McDowell: Sorting out backup internet #3: failover

With local recursive DNS and a 5G modem in place the next thing was to work on some sort of automatic failover when the primary FTTP connection failed. My wife works from home too and I sometimes travel so I wanted to make sure things didn't require me to be around to kick them into switching the link in use. First, let's talk about what I didn't do. One choice to try and ensure as seamless a failover as possible would be to get a VM somewhere out there. I'd then run Wireguard tunnels over both the FTTP + 5G links to the VM, and run some sort of routing protocol (RIP, OSPF?) over the links. Set preferences such that the FTTP is preferred, NAT v4 to the VM IP, and choose somewhere that gave me a v6 range I could just use directly. This has the advantage that I'm actively checking link quality to the outside world, rather than just to the next hop. It also means, if the failover detection is fast enough, that existing sessions stay up rather than needing to be re-established. The downsides are increased complexity, adding another point of potential failure (the VM + provider), the impact on connection quality (even with a decent endpoint it's an extra hop and latency), and finally the increased cost involved. I can cope with having to reconnect my SSH sessions in the event of a failure, and I'd rather be sure I can make full use of the FTTP connection, so I didn't go this route. I chose to rely on local link failure detection to provide the signal for failover, and a set of policy routing on top of that to make things a bit more seamless. Local link failure turns out to be fairly easy. My FTTP is a PPPoE configuration, so in /etc/ppp/peers/aquiss I have:
lcp-echo-interval 1
lcp-echo-failure 5
lcp-echo-adaptive
Which gives me a failover of ~ 5s if the link goes down. I'm operating the 5G modem in bridge rather than router mode, which means I get the actual IP from the 5G network via DHCP. The DHCP lease the modem hands out is under a minute, and in the event of a network failure it only hands out a 192.168.254.x IP to talk to its web interface. As the 5G modem is the last resort path I choose not to do anything special with this, but the information is at least there if I need it. To allow both interfaces to be up and the FTTP to be preferred I'm simply using route metrics. For the PPP configuration that's:
defaultroute-metric 100
and for the 5G modem I have:
iface sfp.31 inet dhcp
    metric 1000
    vlan-raw-device sfp
There's a wrinkle in that pppd will not replace an existing default route, so I've created /etc/ppp/ip-up.d/default-route to ensure it's added:
#!/bin/bash
[ "$PPP_IFACE" = "pppoe-wan" ]   exit 0
# Ensure we add a default route; pppd will not do so if we have
# a lower pref route out the 5G modem
ip route add default dev pppoe-wan metric 100   true
Additionally, in /etc/dhcp/dhclient.conf I've disabled asking for any server details (DNS, NTP, etc) - I have internal setups for the servers I want, and don't want to be trying to select things over the 5G link by default. However, what I do want is to be able to access the 5G modem web interface and explicitly route some traffic out that link (e.g. so I can add it to my smokeping tests). For that I need some source based routing. First step, add a 5g table to /etc/iproute2/rt_tables:
16  5g
Then I ended up with the following in /etc/dhcp/dhclient-exit-hooks.d/modem-interface-route, which is more complex than I'd like but seems to do what I want:
#!/bin/sh
case "$reason" in
    BOUND|RENEW|REBIND|REBOOT)
        # Check if we've actually changed IP address
        if [ -z "$old_ip_address" ] ||
           [ "$old_ip_address" != "$new_ip_address" ] ||
           [ "$reason" = "BOUND" ] || [ "$reason" = "REBOOT" ]; then
            if [ ! -z "$old_ip_address" ]; then
                ip rule del from $old_ip_address lookup 5g
            fi
            ip rule add from $new_ip_address lookup 5g
            ip route add default dev sfp.31 table 5g || true
            ip route add 192.168.254.1 dev sfp.31 2>/dev/null || true
        fi
    ;;
    EXPIRE)
        if [ ! -z "$old_ip_address" ]; then
            ip rule del from $old_ip_address lookup 5g
        fi
    ;;
    *)
    ;;
esac
What does all that aim to do? We want to ensure traffic directed to the 5G WAN address goes out the 5G modem, so I can SSH into it even when the main link is up. So we add a rule directing traffic from that IP to hit the 5g routing table, and a default route in that table which uses the 5G link. There's no configuration for the FTTP connection in that table, so if the 5G link is down the traffic gets dropped, which is what we want. We also configure 192.168.254.1 to go out the link to the modem, as that's where the web interface lives. I also have a curl callout (curl --interface sfp.31 to ensure it goes out the 5G link) after the routes are configured to set dynamic DNS with Mythic Beasts, which helps with knowing where to connect back to. I seem to see IP address changes on the 5G link every couple of days at least. Additionally, I have an entry in the interfaces configuration carving out the top set of the netblock my smokeping server is in:
    up ip rule add from 192.0.2.224/27 lookup 5g
My smokeping /etc/smokeping/config.d/Probes file then looks like:
*** Probes ***
+ FPing
binary = /usr/bin/fping
++ FPingNormal
++ FPing5G
sourceaddress = 192.0.2.225
+ FPing6
binary = /usr/bin/fping
which allows me to use probe = FPing5G for targets to test them over the 5G link. That mostly covers the functionality I want for a backup link. There's one piece that isn't quite solved, however: IPv6, which can wait for another post.

11 April 2024

Reproducible Builds: Reproducible Builds in March 2024

Welcome to the March 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Arch Linux minimal container userland now 100% reproducible
  2. Validating Debian's build infrastructure after the XZ backdoor
  3. Making Fedora Linux (more) reproducible
  4. Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management
  5. Software and source code identification with GNU Guix and reproducible builds
  6. Two new Rust-based tools for post-processing determinism
  7. Distribution work
  8. Mailing list highlights
  9. Website updates
  10. Delta chat clients now reproducible
  11. diffoscope updates
  12. Upstream patches
  13. Reproducibility testing framework

Arch Linux minimal container userland now 100% reproducible In remarkable news, Reproducible Builds developer kpcyrd reported that the Arch Linux minimal container userland is now 100% reproducible after work by developers dvzv and Foxboron on the one remaining package. This represents a "real world", widely-used Linux distribution being reproducible. Their post, which kpcyrd suffixed with the question "now what?", continues on to outline some potential next steps, including validating whether the container image itself could be reproduced bit-for-bit. The post, which was itself a followup for an Arch Linux update earlier in the month, generated a significant number of replies.

Validating Debian's build infrastructure after the XZ backdoor From our mailing list this month, Vagrant Cascadian wrote about being asked about trying to perform concrete reproducibility checks for recent Debian security updates, in an attempt to gain some confidence about Debian's build infrastructure given that they performed builds in environments running the high-profile XZ vulnerability. Vagrant reports (with some caveats):
So far, I have not found any reproducibility issues; everything I tested I was able to get to build bit-for-bit identical with what is in the Debian archive.
That is to say, reproducibility testing permitted Vagrant and Debian to claim with some confidence that builds performed when this vulnerable version of XZ was installed were not interfered with.

Making Fedora Linux (more) reproducible In March, Davide Cavalca gave a talk at the 2024 Southern California Linux Expo (aka SCALE 21x) about the ongoing effort to make the Fedora Linux distribution reproducible. Documented in more detail on Fedora's website, the talk touched on topics such as the specifics of implementing reproducible builds in Fedora, the challenges encountered, the current status and what's coming next. (YouTube video)

Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management
Julien Malka published a brief but interesting paper in the HAL open archive on Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management:
Functional package managers (FPMs) and reproducible builds (R-B) are technologies and methodologies that are conceptually very different from the traditional software deployment model, and that have promising properties for software supply chain security. This thesis aims to evaluate the impact of FPMs and R-B on the security of the software supply chain and propose improvements to the FPM model to further improve trust in the open source supply chain. (PDF)
Julien's paper poses a number of research questions on how the model of distributions such as GNU Guix and NixOS can be leveraged to 'further improve the safety of the software supply chain', etc.

Software and source code identification with GNU Guix and reproducible builds
In a long line of commendably detailed blog posts, Ludovic Courtès, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier have together published two interesting posts on the GNU Guix blog this month. In early March, the four of them wrote about software and source code identification and how that might be performed using Guix, rhetorically posing the questions: 'What does it take to identify software?' and 'How can we tell what software is running on a machine to determine, for example, what security vulnerabilities might affect it?' Later in the month, Ludovic Courtès wrote a solo post describing adventures on the quest for long-term reproducible deployment. Ludovic's post touches on GNU Guix's aim to support 'time travel', the ability to reliably (and reproducibly) revert to an earlier point in time, employing the iconic image of Harold Lloyd hanging off the clock in Safety Last! (1925) to poetically illustrate both the slapstick nature of current modern technology and the gymnastics required to navigate hazards of our own making.

Two new Rust-based tools for post-processing determinism
Zbigniew Jędrzejewski-Szmek announced add-determinism, a work-in-progress reimplementation of the Reproducible Builds project's own strip-nondeterminism tool in the Rust programming language, intended to be used as a post-processor in RPM-based distributions such as Fedora. In addition, Yossi Kreinin published a blog post titled 'refix: fast, debuggable, reproducible builds' that describes a tool that post-processes binaries in such a way that they are still debuggable with gdb, etc. Yossi's post details the motivation and techniques behind the (fast) performance of the tool.

Distribution work
In Debian this month, since the testing framework no longer varies the build path, James Addison performed a bulk downgrade of the severity of the related issues from normal to wishlist. In addition, 28 reviews of Debian packages were added, 38 were updated and 23 were removed this month, adding to our ever-growing knowledge about identified issues. As part of this effort, a number of issue types were updated, including Chris Lamb adding a new ocaml_include_directories toolchain issue [ ], and James Addison adding a new filesystem_order_in_java_jar_manifest_mf_include_resource issue [ ] and updating the random_uuid_in_notebooks_generated_by_nbsphinx issue to reference a relevant discussion thread [ ]. In addition, Roland Clobus posted his 24th status update of reproducible Debian ISO images. Roland highlights that the images for Debian unstable often cannot be generated due to changes in that distribution related to the 64-bit time_t transition. Lastly, Bernhard M. Wiedemann posted another monthly update for his reproducibility work in openSUSE.

Mailing list highlights
Elsewhere on our mailing list this month:

Website updates
A number of improvements were made to our website this month, including:
  • Pol Dellaiera noticed the frequent need to correctly cite the website itself in academic work. To facilitate easier citation across multiple formats, Pol contributed a Citation File Format (CFF) file. As a result, an export in BibTeX format is now available in the Academic Publications section. Pol encourages community contributions to further refine the CITATION.cff file. Pol also added a substantial new section to the 'buy-in' page documenting the role of Software Bills of Materials (SBOMs) and ephemeral development environments. [ ][ ]
  • Bernhard M. Wiedemann added a new commandments page to the documentation [ ][ ] and fixed some incorrect YAML elsewhere on the site [ ].
  • Chris Lamb added three recent academic papers to the publications page of the website. [ ]
  • Mattia Rizzolo and Holger Levsen collaborated to add Infomaniak as a sponsor of amd64 virtual machines. [ ][ ][ ]
  • Roland Clobus updated the stable outputs page, dropping version numbers from Python documentation pages [ ] and noting that Python's set data structure is also affected by the PYTHONHASHSEED functionality. [ ]

Delta chat clients now reproducible
Delta Chat, an open source messaging application that can work over email, announced this month that the Rust-based core library underlying the Delta Chat application is now reproducible.

diffoscope
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded versions 259, 260 and 261 to Debian, and made the following additional changes:
  • New features:
    • Add support for the zipdetails tool from the Perl distribution. Thanks to Fay Stegerman and Larry Doolittle et al. for the pointer and thread about this tool. [ ]
  • Bug fixes:
    • Don't identify Redis database dumps as GNU R database files based simply on their filename. [ ]
    • Add a missing call to File.recognizes so we actually perform the filename check for GNU R data files. [ ]
    • Don't crash if we encounter an .rdb file without an equivalent .rdx file. (#1066991)
    • Correctly check for 7z being available and not lz4 when testing 7z. [ ]
    • Prevent a traceback when comparing a contentful .pyc file with an empty one. [ ]
  • Testsuite improvements:
    • Fix .epub tests after supporting the new zipdetails tool. [ ]
    • Don't use parentheses within test skipping messages, as PyTest adds its own parentheses. [ ]
    • Factor out Python version checking in test_zip.py. [ ]
    • Skip some Zip-related tests under Python 3.10.14, as a potential regression may have been backported to the 3.10.x series. [ ]
    • Actually test 7z support in the test_7z set of tests, not the lz4 functionality. (Closes: reproducible-builds/diffoscope#359). [ ]
In addition, Fay Stegerman updated diffoscope's monkey patch for supporting the unusual Mozilla ZIP file format after Python's zipfile module changed to detect potentially insecure overlapping entries within .zip files. (#362) Chris Lamb also updated the trydiffoscope command-line client, dropping a build-dependency on the deprecated python3-distutils package to fix Debian bug #1065988 [ ], taking a moment to also refresh the packaging to the latest Debian standards [ ]. Finally, Vagrant Cascadian submitted an update for diffoscope version 260 in GNU Guix. [ ]
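For anyone who has not used the tool before, the basic invocation is simply pointing it at the two artifacts to compare (the file names below are placeholders): a text report is printed by default, while --html writes a standalone HTML report instead.
# Text report to standard output
diffoscope package_1.0-1_amd64.deb package_1.0-1_rebuild_amd64.deb
# Standalone HTML report for easier browsing
diffoscope --html report.html package_1.0-1_amd64.deb package_1.0-1_rebuild_amd64.deb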

Upstream patches
This month, we wrote a large number of patches, including:
Bernhard M. Wiedemann used reproducibility-tooling to detect and fix packages that added changes in their %check section, thus failing when built with the --no-checks option. Only half of all openSUSE packages were tested so far, but a large number of bugs were filed, including ones against caddy, exiv2, gnome-disk-utility, grisbi, gsl, itinerary, kosmindoormap, libQuotient, med-tools, plasma6-disks, pspp, python-pypuppetdb, python-urlextract, rsync, vagrant-libvirt and xsimd. Similarly, Jean-Pierre De Jesus DIAZ employed reproducible builds techniques in order to test a proposed refactor of the ath9k-htc-firmware package. As the change produced bit-for-bit identical binaries to the previously shipped pre-built binaries:
I don't have the hardware to test this firmware, but the build produces the same hashes for the firmware so it's safe to say that the firmware should keep working.

Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Sleep less after a so-called 404 package state has occurred. [ ]
    • Schedule package builds more often. [ ][ ]
    • Regenerate all our HTML indexes every hour, but only every 12h for the released suites. [ ]
    • Create and update unstable and experimental base systems on armhf again. [ ][ ]
    • Don t reschedule so many depwait packages due to the current size of the i386 architecture queue. [ ]
    • Redefine our scheduling thresholds and amounts. [ ]
    • Schedule untested packages with a higher priority, otherwise slow architectures cannot keep up with the experimental distribution growing. [ ]
    • Only create the stats_buildinfo.png graph once per day. [ ][ ]
    • Reproducible Debian dashboard: refactoring, update several more static stats only every 12h. [ ]
    • Document how to use systemctl with new systemd-based services. [ ]
    • Temporarily disable armhf and i386 continuous integration tests in order to get some stability back. [ ]
    • Use the deb.debian.org CDN everywhere. [ ]
    • Remove the rsyslog logging facility on bookworm systems. [ ]
    • Add zst to the list of packages which are false-positive diskspace issues. [ ]
    • Detect failures to bootstrap Debian base systems. [ ]
  • Arch Linux-related changes:
    • Temporarily disable builds because the pacman package manager is broken. [ ][ ]
    • Split reproducible_html_live_status and split the 'scheduling timing'. [ ][ ][ ]
    • Improve handling when database is locked. [ ][ ]
  • Misc changes:
    • Show failed services that require manual cleanup. [ ][ ]
    • Integrate two new Infomaniak nodes. [ ][ ][ ][ ]
    • Improve IRC notifications for artifacts. [ ]
    • Run diffoscope in different systemd slices. [ ]
    • Run the node health check more often, as it can now repair some issues. [ ][ ]
    • Also include the string 'Bot' in the userAgent for Git. (Re: #929013). [ ]
    • Document increased tmpfs size on our OUSL nodes. [ ]
    • Disable memory account for the reproducible_build service. [ ][ ]
    • Allow 10 times as many open files for the Jenkins service. [ ]
    • Set OOMPolicy=continue and OOMScoreAdjust=-1000 for both the Jenkins and the reproducible_build service. [ ]
Mattia Rizzolo also made the following changes:
  • Debian-related changes:
    • Define a systemd slice to group all relevant services. [ ][ ]
    • Add a bunch of quotes in scripts to assuage the shellcheck tool. [ ]
    • Add stats on how many packages have been built today so far. [ ]
    • Instruct systemd-run to handle diffoscope s exit codes specially. [ ]
    • Prefer the pgrep tool over grepping the output of ps. [ ]
    • Re-enable a couple of i386 and armhf architecture builders. [ ][ ]
    • Fix some stylistic issues flagged by the Python flake8 tool. [ ]
    • Cease scheduling Debian unstable and experimental on the armhf architecture due to the time_t transition. [ ]
    • Start a few more i386 & armhf workers. [ ][ ][ ]
    • Temporarily skip pbuilder updates in the unstable distribution, but only on the armhf architecture. [ ]
  • Other changes:
    • Perform some large-scale refactoring on how the systemd service operates. [ ][ ]
    • Move the list of workers into a separate file so it s accessible to a number of scripts. [ ]
    • Refactor the powercycle_x86_nodes.py script to use the new IONOS API and its new Python bindings. [ ]
    • Also fix nph-logwatch after the worker changes. [ ]
    • Do not install the stunnel tool anymore; it shouldn't be needed by anything anymore. [ ]
    • Move temporary directories related to Arch Linux into a single directory for clarity. [ ]
    • Update the arm64 architecture host keys. [ ]
    • Use a common Postfix configuration. [ ]
The following changes were also made by:
  • Jan-Benedict Glaw:
    • Initial work to clean up a messy NetBSD-related script. [ ][ ]
  • Roland Clobus:
    • Show the installer log if the installer fails to build. [ ]
    • Avoid the minus character (i.e. -) in a variable in order to allow for tags in openQA. [ ]
    • Update the schedule of Debian live image builds. [ ]
  • Vagrant Cascadian:
    • Maintenance on the virt* nodes is completed so bring them back online. [ ]
    • Use the fully qualified domain name in configuration. [ ]
Node maintenance was also performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

29 March 2024

Reproducible Builds (diffoscope): diffoscope 262 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 262. This version includes the following changes:
[ Chris Lamb ]
* Factor out Python version checking in test_zip.py. (Re: #362)
* Also skip some zip tests under 3.10.14 as well; a potential regression may
  have been backported to the 3.10.x series. The underlying cause is still to
  be investigated. (Re: #362)
You can find out more by visiting the project homepage.
