Search Results: "js"

20 April 2024

Bastian Venthur: Help needed: creating a WSDL file to interact with debbugs

I am upstream and Debian package maintainer of python-debianbts, a Python library for querying Debian's Bug Tracking System (BTS). python-debianbts is used by reportbug, the standard tool for reporting bugs in Debian, and is therefore the glue between reportbug and the BTS. debbugs, the software that powers Debian's BTS, provides a SOAP interface for querying the BTS. Unfortunately, SOAP is not a very popular protocol anymore, and I'm facing the second migration to another underlying SOAP library, as they keep becoming unmaintained over time. Zeep, the library I'm currently considering, requires a WSDL file in order to work with a SOAP service; however, debbugs does not provide one. Since I'm not familiar with WSDL, I need help from someone who can create a WSDL file for debbugs, so I can migrate python-debianbts away from pysimplesoap to zeep.

How did we get here?

Back in the olden days, reportbug queried the BTS by parsing its HTML output. While this worked, it tightly coupled the user-facing presentation of the BTS with critical functionality of the bug reporting tool. The setup was fragile, prone to breakage, and did not allow changing anything in the BTS frontend for fear of breaking reportbug itself. In 2007, I started to work on reportbug-ng, a user-friendly alternative to reportbug targeted at users not comfortable using the command line. Early on, I decided to use the BTS' SOAP interface instead of parsing HTML like reportbug did. In 2008, I extracted the code that dealt with the BTS into a separate Python library, and after some collaboration with the reportbug maintainers, reportbug adopted python-debianbts in 2011 and has used it ever since. In 2015, I was working on porting python-debianbts to Python 3. During that process, it turned out that its major dependency, SoapPy, had been pretty much unmaintained for years and was blocking the Python 3 transition. Thanks to the help of Gaetano Guerriero, who ported python-debianbts to pysimplesoap, the migration was unblocked and could proceed. In 2024, almost ten years later, pysimplesoap seems to be unmaintained as well, and I have to look for alternatives again. The most promising one right now seems to be zeep. Unfortunately, zeep requires a WSDL file to work with a SOAP service, which debbugs does not provide.

How can you help?

reportbug (and thus python-debianbts) is used by thousands of users, and I have a certain responsibility to keep things working properly. Since I simply don't know enough about WSDL to create such a file for debbugs myself, I'm looking for someone who can help me with this task. If you're familiar with SOAP, WSDL and optionally debbugs, please get in touch with me. I don't speak Perl, so I'm not really able to read debbugs' code, but I do know some things about the SOAP requests and replies from my work on python-debianbts, so I'm sure we can work something out. There is a WSDL file for a debbugs version used by GNU, but I don't think it's official, and it currently does not work with zeep. It may be a good starting point, though.

The future of debbugs' API

While we can probably continue to support debbugs' SOAP interface for a while, I don't think it is very sustainable in the long run. A simpler, well-documented REST API that returns JSON seems more appropriate nowadays. The queries and replies that debbugs currently supports are simple enough to design a REST API with JSON around them. The benefit would be less complex libraries on the client side and probably easier maintainability on the server side as well. debbugs' maintainer seemed to be in agreement with this idea back in 2018. I created an attempt to define a new API (HTML render), but somehow we got stuck and no progress has been made since then. I'm still happy to help shape such an API for debbugs, but I can't really implement anything in debbugs itself, as it is written in Perl, which I'm not familiar with.
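For context, this is roughly how python-debianbts is used to query the BTS today. This is a minimal sketch; the function and attribute names reflect my reading of the current API, so treat them as an assumption rather than a reference:

import debianbts as bts

# Look up the numbers of all bugs filed against a package...
bug_numbers = bts.get_bugs(package='python-debianbts')

# ...and fetch the full status records via the SOAP interface.
for bug in bts.get_status(bug_numbers):
    print(bug.bug_num, bug.severity, bug.subject)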

18 April 2024

Jonathan McDowell: Sorting out backup internet #2: 5G modem

Having set up recursive DNS, it was time to actually sort out a backup internet connection. I live in a Virgin Media area, but I still haven't forgiven them for my terrible Virgin experiences when moving here. Plus it involves a bigger contractual commitment. There are no altnets locally (though I'm watching youfibre, who have already rolled out in a few Belfast exchanges), so I decided to go for a 5G modem. That gives some flexibility, and is a bit easier to get up and running. I started by purchasing a ZTE MC7010. This had the advantage of being reasonably cheap off eBay, not having any wifi functionality I would just have to disable (it's going to be plugged into the same router the FTTP connection terminates on), being outdoor mountable should I decide to go that way, and, finally, being powered via PoE. For now this device sits on the window sill in my study, which is at the top of the house. I printed a table stand for it which mostly does the job (though not as well with a normal, rather than flat, network cable). The router lives downstairs, so I've extended a dedicated VLAN through the study switch, down to the core switch and out to the router. The PoE study switch can only do GigE, not 2.5Gb/s, but at present that's far from the limiting factor on the speed of the connection. The device is 3 branded, and, as it happens, I've ended up with a 3 SIM in it. Up until recently my personal phone was with them, but they've kicked me off Go Roam, so I've moved. Going with 3 for the backup connection provides some slight extra measure of resiliency; we now have devices on all 4 major UK networks in the house. The SIM is a preloaded data-only SIM good for a year; I don't expect to use all of the data allowance, but I didn't want to have to worry about unexpected excess charges. Performance turns out to be disappointing; I ended up locking the device to 4G, as the 5G signal is marginal and leaving it enabled results in constant switching between 4G + 5G and significant extra latency. The smokeping graph below shows a brief period where I removed the 4G lock and allowed 5G: Smokeping 4G vs 5G graph (There's a handy zte.js script to allow doing this from the device web interface.) I get about 10Mb/s sustained downloads out of it. EE/Vodafone did not lead to significantly better results, so for now I'm accepting it is what it is. I tried relocating the device to another part of the house (a little tricky while still providing switch-based PoE, but I have an injector), without much improvement. Equally, pinning the 4G to certain bands provided a short-term improvement (I got up to 40-50Mb/s sustained), but not reliably so. speedtest.net results This is disappointing, but if it turns out to be a problem I can look at mounting it externally. I also assume that as 5G is gradually rolled out further things will naturally improve, but that might be wishful thinking on my part. Rather than wait until my main link had a problem, I decided to try a day working over the 5G connection. I spend a lot of my time either in browser-based apps or accessing remote systems via SSH, so I'm reasonably sensitive to a jittery or otherwise flaky connection. I picked a day that I did not have any meetings planned, but as it happened I ended up with an ad-hoc video call arranged. I'm pleased to say that it all worked just fine; definitely noticeably slower than the FTTP connection (to be expected), but all workable, and even the video call was fine (at least from my end).
Looking at the traffic graph shows the expected ~10Mb/s peak (actually a little higher, and looking at the FTTP stats for previous days, not out of keeping with what we see there), and you can just about see the ~3Mb/s symmetric use by the video call at 2pm: 4G traffic during the work day The test run also helped iron out the fact that the content filter was still enabled on the SIM, but that was easily resolved. Up next, vaguely automatic failover.

12 April 2024

Freexian Collaborators: Debian Contributions: SSO Authentication for jitsi.debian.social, /usr-move updates, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services. P.S. We've completed over a year of writing these blogs. If you have any suggestions on how to make them better, what you'd like us to cover, or any other opinions/reviews you might have, please let us know by dropping us an email. We'd be happy to hear your thoughts. :)

SSO Authentication for jitsi.debian.social, by Stefano Rivera

Debian.social's Jitsi instance has been getting some abuse from (non-Debian) people sharing sexually explicit content on the service. After playing whack-a-mole with this for a month, and shutting the instance off for another month, we opened it up again, and the abuse immediately restarted. Stefano sat down and wrote an SSO implementation that hooks into Jitsi's existing JWT SSO support. This requires everyone using jitsi.debian.social to have a Salsa account. With only a little bit of effort, we could change this in the future to only require an account to open a room, and allow guests to join the call.

/usr-move, by Helmut Grohne

The biggest task this month was sending mitigation patches for all of the /usr-move issues arising from package renames due to the 2038 transition. As a result, we can now say that every affected package in unstable can either be converted with dh-sequence-movetousr or has an open bug report. The package set relevant to debootstrap (except for the set that has to be uploaded concurrently) has been moved to /usr and is awaiting migration. The move of coreutils happened to affect piuparts, which hard-codes the location of /bin/sync and received multiple updates as a result.

Miscellaneous contributions
  • Stefano Rivera uploaded a stable release update to python3.11 for bookworm, fixing a use-after-free crash.
  • Stefano uploaded a new version of python-html2text, and updated python3-defaults to build with it.
  • In support of Python 3.12, Stefano dropped distutils as a Build-Dependency from a few packages, and uploaded a complex set of patches to python-mitogen.
  • Stefano landed some merge requests to clean up dead code in dh-python, removed the flit plugin, and uploaded it.
  • Stefano uploaded new upstream versions of twisted, hatchling, python-flexmock, python-authlib, python-mitogen, python-pipx, and xonsh.
  • Stefano requested removal of a few packages supporting the Opsis HDMI2USB hardware that the DebConf Video team used to use for HDMI capture, as they are not being maintained upstream and have started to FTBFS with recent sdcc changes.
  • As DebConf 24 is getting ready to open registration, Stefano spent some time fixing bugs in the website caused by infrastructure updates.
  • Stefano reviewed all the DebConf 23 travel reimbursements, filing requests for more information from SPI where our records mismatched.
  • Stefano spun up a Wafer website for the Berlin 2024 mini DebConf.
  • Roberto C. Sánchez worked on facilitating the transfer of upstream maintenance responsibility for the dormant Shorewall project to a new team led by the current maintainer of the Shorewall packages in Debian.
  • Colin Watson fixed build failures in celery-haystack-ng, db1-compat, jsonpickle, libsdl-perl, kali, knews, openssh-ssh1, python-json-log-formatter, python-typing-extensions, trn4, vigor, and wcwidth. Some of these were related to the 64-bit time_t transition, since that involved enabling -Werror=implicit-function-declaration.
  • Colin fixed an off-by-one error in neovim, which was already causing a build failure in Ubuntu and would eventually have caused a build failure in Debian with stricter toolchain settings.
  • Colin added an sshd@.service template to openssh to help newer systemd versions make containers and VMs SSH-accessible over AF_VSOCK sockets.
  • Following the xz-utils backdoor, Colin spent some time testing and discussing OpenSSH upstream's proposed inline systemd notification patch, since the current implementation via libsystemd was part of the attack vector used by that backdoor.
  • Utkarsh reviewed and sponsored some Go packages for Lena Voytek and Rajudev.
  • Utkarsh also helped Mitchell Dzurick with the adoption of the pyparted package.
  • Helmut sent 10 patches for cross build failures.
  • Helmut partially fixed architecture cross bootstrap tooling to deal with changes in linux-libc-dev and the recent gcc-for-host changes, and also fixed a 64-bit time_t FTBFS in libtextwrap.
  • Thorsten Alteholz uploaded several packages from debian-printing: cjet, lprng, rlpr and epson-inkjet-printer-escpr were affected by the newly enabled compiler switch -Werror=implicit-function-declaration. Besides fixing these serious bugs, Thorsten also worked on other bugs and fixed a few of them.
  • Carles updated simplemonitor and python-ring-doorbell packages with new upstream versions.
  • Santiago is still working on the Salsa CI MRs to adapt the build jobs so they can rely on sbuild. Current work includes adapting the images used by the build job, implementing basic sbuild support in the related jobs, and adjusting the support for experimental and *-backports releases.
    Additionally, Santiago reviewed some MRs, such as "Make timeout action explicit in the logs" and the subsequent "Implement conditional timeout verbosity", and the batch of MRs included in https://salsa.debian.org/salsa-ci-team/pipeline/-/merge_requests/482.
  • Santiago also reviewed applications for the "Improving Salsa CI in Debian" GSoC 2024 project. We received applications from four very talented candidates. The selection process is currently ongoing. A huge thanks to all of them!
  • As part of the DebConf 24 organization, Santiago has taken part in the Content team discussions.

8 April 2024

Bastian Blank: Python dataclasses for Deb822 format

Python includes some helpful support for classes that are designed to just hold some data and not much more: data classes. It uses plain Python type definitions to specify what fields you can have, plus some further information for every field. This will then generate some useful methods for you, like __init__ and __repr__, and more on request. But given that those type definitions are available to other code, a lot more can be done. Several separate packages exist that work on data classes; for example, you can have data validation from JSON with dacite. But Debian likes a pretty strange format usually called Deb822, which is in fact derived from the RFC 822 format of e-mail messages. Those files include single messages with a well-known format. So I'd like to introduce some Deb822 format support for Python data classes. For now the code resides in the Debian Cloud tool.

Usage

Setup

It uses the standard data classes support and several helper functions. You also need to enable support for postponed evaluation of annotations.
from __future__ import annotations
from dataclasses import dataclass
from dataclasses_deb822 import read_deb822, field_deb822
Class definition start

Data classes are just normal classes, just with a decorator.
@dataclass
class Package:
Field definitions

You need to specify the exact key to be used for this field.
    package: str = field_deb822('Package')
    version: str = field_deb822('Version')
    arch: str = field_deb822('Architecture')
Default values are also supported.
    multi_arch: Optional[str] = field_deb822(
        'Multi-Arch',
        default=None,
    )
Reading files
for p in read_deb822(Package, sys.stdin, ignore_unknown=True):
    print(p)
Full example
from __future__ import annotations
from dataclasses import dataclass
from debian_cloud_images.utils.dataclasses_deb822 import read_deb822, field_deb822
from typing import Optional
import sys
@dataclass
class Package:
    package: str = field_deb822('Package')
    version: str = field_deb822('Version')
    arch: str = field_deb822('Architecture')
    multi_arch: Optional[str] = field_deb822(
        'Multi-Arch',
        default=None,
    )
for p in read_deb822(Package, sys.stdin, ignore_unknown=True):
    print(p)
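The helper functions themselves live in debian-cloud-images and are not shown in the post. As a rough idea of how they could work, here is a minimal sketch built on python-debian's Deb822 parser; the function bodies are my assumption, not the actual implementation:

from dataclasses import field, fields
from debian.deb822 import Deb822

def field_deb822(key, **kwargs):
    # Like dataclasses.field(), but records the Deb822 key in the metadata.
    return field(metadata={'deb822': key}, **kwargs)

def read_deb822(cls, fp, ignore_unknown=False):
    # Map Deb822 keys (e.g. 'Multi-Arch') to dataclass field names.
    keys = {f.metadata['deb822']: f.name
            for f in fields(cls) if 'deb822' in f.metadata}
    for stanza in Deb822.iter_paragraphs(fp):
        kwargs = {}
        for key, value in stanza.items():
            if key in keys:
                kwargs[keys[key]] = value
            elif not ignore_unknown:
                raise ValueError(f'unknown key: {key}')
        yield cls(**kwargs)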
Known limitations

3 April 2024

Guido Günther: Free Software Activities March 2024

A short status update of what happened on my side last month. I spent quite a bit of time reviewing new code (thanks!) as well as on maintenance to keep things going, but we also have some improvements in Phosh, Phoc, phosh-mobile-settings, phosh-osk-stub, gmobile, Livi, squeekboard, GNOME Calls and libsoup. If you want to support my work, see donations.

1 April 2024

Colin Watson: Free software activity in March 2024

My Debian contributions this month were all sponsored by Freexian.

9 March 2024

Reproducible Builds: Reproducible Builds in February 2024

Welcome to the February 2024 report from the Reproducible Builds project! In our reports, we try to outline what we have been up to over the past month as well as mentioning some of the important things happening in software supply-chain security.

Reproducible Builds at FOSDEM 2024

Core Reproducible Builds developer Holger Levsen presented at the main track at FOSDEM on Saturday 3rd February this year in Brussels, Belgium. That wasn't the only talk related to Reproducible Builds, however; please see our comprehensive FOSDEM 2024 news post for the full details and links.

Maintainer Perspectives on Open Source Software Security

Bernhard M. Wiedemann spotted that a recent report entitled Maintainer Perspectives on Open Source Software Security, written by Stephen Hendrick and Ashwin Ramaswami of the Linux Foundation, sports an infographic which mentions that "56% of [polled] projects support reproducible builds".

Mailing list highlights

From our mailing list this month:

Distribution work

In Debian this month, 5 reviews of Debian packages were added, 22 were updated and 8 were removed, adding to Debian's knowledge about identified issues. A number of issue types were updated as well. [ ][ ][ ][ ] In addition, Roland Clobus posted his 23rd update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that "all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours)" and that there will likely be further work at a MiniDebCamp in Hamburg. Furthermore, Roland also responded in depth to a query about a previous report.
Fedora developer Zbigniew Jędrzejewski-Szmek announced a work-in-progress script called fedora-repro-build that attempts to reproduce an existing package within a koji build environment. Although the project's README file lists a number of fields that "will always or almost always vary" and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.
Jelle van der Waa introduced a new linter rule for Arch Linux packages in order to detect cache files leftover by the Sphinx documentation generator which are unreproducible by nature and should not be packaged. At the time of writing, 7 packages in the Arch repository are affected by this.
Elsewhere, Bernhard M. Wiedemann posted another monthly update for his work elsewhere in openSUSE.

diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded versions 256, 257 and 258 to Debian and made the following additional changes:
  • Use a deterministic name instead of trusting gpg's use-embedded-filenames. Many thanks to Daniel Kahn Gillmor <dkg@debian.org> for reporting this issue and providing feedback. [ ][ ]
  • Don't error out with a traceback if we encounter struct.unpack-related errors when parsing Python .pyc files. (#1064973). [ ]
  • Don't try to compare rdb_expected_diff on non-GNU systems, as %p formatting can vary, especially with respect to macOS. [ ]
  • Fix compatibility with pytest 8.0. [ ]
  • Temporarily fix support for Python 3.11.8. [ ]
  • Use the 7zip package (over p7zip-full) after a Debian package transition. (#1063559). [ ]
  • Bump the minimum Black source code reformatter requirement to 24.1.1+. [ ]
  • Expand an older changelog entry with a CVE reference. [ ]
  • Make test_zip black clean. [ ]
In addition, James Addison contributed a patch to correctly parse the headers from diff(1) [ ][ ], thanks! And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to versions 255, 256, and 258, and updated trydiffoscope to 67.0.6.

reprotest

reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes, including:
  • Create a (working) proof of concept for enabling a specific number of CPUs. [ ][ ]
  • Consistently use 398 days for time variation rather than choosing randomly and update README.rst to match. [ ][ ]
  • Support a new --vary=build_path.path option. [ ][ ][ ][ ]

Website updates

A number of improvements were made to our website this month, including:

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In February, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Temporarily disable upgrading/bootstrapping Debian unstable and experimental as they are currently broken. [ ][ ]
    • Use the 64-bit amd64 kernel on all i386 nodes; no more 686 PAE kernels. [ ]
    • Add an Erlang package set. [ ]
  • Other changes:
    • Grant Jan-Benedict Glaw shell access to the Jenkins node. [ ]
    • Enable debugging for NetBSD reproducibility testing. [ ]
    • Use /usr/bin/du --apparent-size in the Jenkins shell monitor. [ ]
    • Revert "reproducible nodes: mark osuosl2 as down". [ ]
    • Thanks again to Codethink, for they have doubled the RAM on our arm64 nodes. [ ]
    • Only set /proc/$pid/oom_score_adj to -1000 if it has not already been done. [ ]
    • Add the openwrt-target-tegra and jtx tasks to the list of zombie jobs. [ ][ ]
Vagrant Cascadian also made the following changes:
  • Overhaul the handling of OpenSSH configuration files after updating from Debian bookworm. [ ][ ][ ]
  • Add two new armhf architecture build nodes, virt32z and virt64z, and insert them into the Munin monitoring. [ ][ ] [ ][ ]
In addition, Alexander Couzens updated the OpenWrt configuration in order to replace the tegra target with mpc85xx [ ], Jan-Benedict Glaw updated the NetBSD build script to use a separate $TMPDIR to mitigate out of space issues on a tmpfs-backed /tmp [ ] and Zheng Junjie added a link to the GNU Guix tests [ ]. Lastly, node maintenance was performed by Holger Levsen [ ][ ][ ][ ][ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

7 March 2024

Petter Reinholdtsen: Plain text accounting file from your bitcoin transactions

A while back I wrote a small script to extract the Bitcoin transactions in a wallet in the ledger plain text accounting format. Over the last few days I spent some time getting it to work better with more special cases. In case it can be useful for others, here is a copy:
#!/usr/bin/python3
#  -*- coding: utf-8 -*-
#  Copyright (c) 2023-2024 Petter Reinholdtsen
from decimal import Decimal
import json
import subprocess
import time
import numpy
def format_float(num):
    return numpy.format_float_positional(num, trim='-')
accounts = {
    u'amount' : 'Assets:BTC:main',
}
addresses = {
    '' : 'Assets:bankkonto',
    '' : 'Assets:bankkonto',
}
def exec_json(cmd):
    proc = subprocess.Popen(cmd,stdout=subprocess.PIPE)
    j = json.loads(proc.communicate()[0], parse_float=Decimal)
    return j
def list_txs():
    # get all transactions for all accounts / addresses
    c = 0
    txs = []
    txidfee = {}
    limit=100000
    cmd = ['bitcoin-cli', 'listtransactions', '*', str(limit)]
    if True:
        txs.extend(exec_json(cmd))
    else:
        # Useful for debugging
        with open('transactions.json') as f:
            txs.extend(json.load(f, parse_float=Decimal))
    #print txs
    for tx in sorted(txs, key=lambda a: a['time']):
#        print tx['category']
        if 'abandoned' in tx and tx['abandoned']:
            continue
        if 'confirmations' in tx and 0 >= tx['confirmations']:
            continue
        when = time.strftime('%Y-%m-%d %H:%M', time.localtime(tx['time']))
        if 'message' in tx:
            desc = tx['message']
        elif 'comment' in tx:
            desc = tx['comment']
        elif 'label' in tx:
            desc = tx['label']
        else:
            desc = 'n/a'
        print("%s %s" % (when, desc))
        if 'address' in tx:
            print("  ; to bitcoin address %s" % tx['address'])
        else:
            print("  ; missing address in transaction, txid=%s" % tx['txid'])
        print(f"  ; amount={tx['amount']}")
        if 'fee' in tx:
            print(f"  ; fee={tx['fee']}")
        for f in accounts.keys():
            if f in tx and Decimal(0) != tx[f]:
                amount = tx[f]
                print("  %-20s   %s BTC" % (accounts[f], format_float(amount)))
        if 'fee' in tx and Decimal(0) != tx['fee']:
            # Make sure to list fee used in several transactions only once.
            if 'fee' in tx and tx['txid'] in txidfee \
               and tx['fee'] == txidfee[tx['txid']]:
                pass  # fee for this txid was already booked above
            else:
                fee = tx['fee']
                print("  %-20s   %s BTC" % (accounts['amount'], format_float(fee)))
                print("  %-20s   %s BTC" % ('Expences:BTC-fee', format_float(-fee)))
                txidfee[tx['txid']] = tx['fee']
        if 'address' in tx and tx['address'] in addresses:
            print("  %s" % addresses[tx['address']])
        else:
            if 'generate' == tx['category']:
                print("  Income:BTC-mining")
            else:
                if amount < Decimal(0):
                    print(f"  Assets:unknown:sent:update-script-addr-{tx['address']}")
                else:
                    print(f"  Assets:unknown:received:update-script-addr-{tx['address']}")
        print()
        c = c + 1
    print("# Found %d transactions" % c)
    if limit == c:
        print(f"# Warning: Limit {limit} reached, consider increasing limit.")
def main():
    list_txs()
main()
It is more of a proof of concept, and I do not expect it to handle all edge cases, but it worked for me, and perhaps you can find it useful too. To get a more interesting result, it is useful to map the accounts sent to or received from to accounting accounts, using the addresses hash. As these will be very context dependent, I leave out my list to allow each user to fill out their own list of accounts. Out of the box, 'ledger reg BTC:main' should be able to show the amount of BTC present in the wallet at any given time in the past. For other and more valuable analyses, an account plan needs to be set up in the addresses hash. Here is an example transaction:
2024-03-07 17:00 Donated to good cause
    Assets:BTC:main                           -0.1 BTC
    Assets:BTC:main                       -0.00001 BTC
    Expences:BTC-fee                       0.00001 BTC
    Expences:donations                         0.1 BTC
It needs a running Bitcoin Core daemon, as it connects to it using bitcoin-cli listtransactions * 100000 to extract the transactions listed in the wallet. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
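For reference, a filled-out addresses hash could look something like this; the addresses and the second account name are made-up placeholders:

addresses = {
    # Map wallet/counterparty bitcoin addresses to ledger accounts.
    '1ExampleBankTransferAddress111111' : 'Assets:bankkonto',
    '1ExampleDonationAddress2222222222' : 'Expences:donations',
}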

3 March 2024

Iustin Pop: New corydalis 2024.9.0 release!

Obligatory and misused quote: "It's not dead, Jim!" I've kind of dropped the ball lately on organising my own photo collection, but February was a pretty good month and I managed to write some more code for Corydalis, ending up with the aforementioned new release. The release is not a big one, but I did manage to solve one thing that was annoying me greatly: the lack of ability to play videos inline in one of the two picture viewing modes (my preferred mode, in fact). Now, whether you're browsing through pictures or looking at pictures one-by-one, you can in both cases play videos easily and, to some extent, "as it should be". No user docs for that yet (I actually need to split the manual into user/admin/developer parts). I did some more internal cleanups, and I've enabled building release zips (since that's how GitHub Actions creates artifacts), which means it should be 10% easier to test this. The remaining 90% is configuring it and pointing it to picture folders and and and, so this is definitely not plug-and-play. The diff summary between 2023.44.0 and 2024.9.0 is: 56 files changed, 1412 insertions(+), 700 deletions(-). Which is not bad, but also not too much. The biggest churn was, as expected, in the viewer (due to the aforementioned video playing). The "scary" part is that the TypeScript code is now at 7.9% (plus a tiny bit more JS, which I can't convert yet due to lack of type definitions upstream). I say "scary" in quotes, because I would actually like to know TypeScript better, but there is no time. The new release can be seen in action on demo.corydalis.io, and as always, just after release I found two minor issues. Well, there will be future releases. For now, I've made an open-source package release, which I didn't do in a while, so I'm happy. See you!

8 February 2024

Sven Hoexter: Use GitHub CLI to List all Repository Secrets

Write it down before I forget about it again:
for x in $(gh api graphql --paginate -f query='query($endCursor:String) { organization(login:"myorg") {
    repositories(first: 100, after: $endCursor, isArchived:false) {
        pageInfo {
            hasNextPage
            endCursor
        }
        nodes {
            name
        }
    }
} }' --jq '.data.organization.repositories.nodes[].name'); do
    secrets=$(gh secret list --json name --jq '.[].name' -R "myorg/${x}" | tr '\n' ',')
    if ! [ -z "${secrets}" ]; then
        echo "${x},${secrets}"
    fi
done
This requests a list of all non-archived repositories in a GitHub org and queries the repository secrets for each of them. If we find some, we output the repo name and the secrets in a comma-separated list. Not real CSV, but good enough for further processing. I have to admit it's kinda beautiful what you can do with the gh CLI by now. Sadly it seems the secrets are not yet available via GraphQL (or I missed it in the docs), so I just use the gh CLI to do the REST calls.

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch

John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform "used by titans like Google, Meta, Boeing, and Lockheed Martin":
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies... the list goes on. In short, it was bad. Quite bad.
The attack pivoted on PyTorch's use of self-hosted runners, as well as submitting a pull request to address a trivial typo in the project's README file, to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool

On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called archlinux-userland-fs-cmp, the tool is "supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt". Crucially, however, "at no point is any file from the mounted filesystem eval'd or otherwise executed. Parsers are written in a memory safe language." More information about the tool can be found in the announcement message, as well as on the tool's homepage. A GIF of the tool in action is also available.

Issues with our SOURCE_DATE_EPOCH code?

Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I'm not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.
Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.

Distribution updates

In Debian this month, Roland Clobus posted another detailed update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that "all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours)". Additionally, 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month, adding to our knowledge about identified issues. Elsewhere, Bernhard posted another monthly update for his work in openSUSE.

Community updates

A number of improvements were made to our website, including Bernhard M. Wiedemann fixing a number of typos of the term "nondeterministic" [ ] and Jan Zerebecki adding a substantial and highly welcome section to our page about SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds [ ].
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, such as uploading versions 254 and 255 to Debian, but focused on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well as considerable work from Vekhir to fix compatibility between various subtly incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Reduce the number of arm64 architecture workers from 24 to 16. [ ]
    • Use diffoscope from the Debian release being tested again. [ ]
    • Improve the handling when killing unwanted processes [ ][ ][ ] and be more verbose about it, too [ ].
    • Don't mark a job as failed if a process marked as to-be-killed is already gone. [ ]
    • Display the architecture of builds that have been running for more than 48 hours. [ ]
    • Reboot arm64 nodes when they hit an OOM (out of memory) state. [ ]
  • Package rescheduling changes:
    • Reduce IRC notifications to 1 when rescheduling due to package status changes. [ ]
    • Correctly set SUDO_USER when rescheduling packages. [ ]
    • Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible). [ ]
  • OpenWrt-related changes:
    • Install the python3-dev and python3-pyelftools packages as they are now needed for the sunxi target. [ ][ ]
    • Also install the libpam0g-dev which is needed by some OpenWrt hardware targets. [ ]
  • Misc:
    • As it's January, set the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
    • Fix a large (!) number of spelling mistakes in various scripts. [ ][ ][ ]
    • Prevent Squid and Systemd processes from being killed by the kernel s OOM killer. [ ]
    • Install the iptables tool everywhere, else our custom rc.local script fails. [ ]
    • Cleanup the /srv/workspace/pbuilder directory on boot. [ ]
    • Automatically restart Squid if it fails. [ ]
    • Limit the execution of chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
Significant amounts of node maintenance were performed by Holger Levsen (e.g. [ ][ ][ ][ ][ ][ ][ ] etc.) and Vagrant Cascadian (e.g. [ ][ ][ ][ ][ ][ ][ ][ ]). Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ] Elsewhere in our infrastructure, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]

Upstream patches

The Reproducible Builds project tries to detect, dissect and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the mm-common package in Debian; this was quickly fixed, however. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

28 January 2024

Niels Thykier: Annotating the Debian packaging directory

In my previous blog post Providing online reference documentation for debputy, I made a point about how debhelper documentation was suboptimal on account of being static rather than online. The thing is that debhelper is not alone in this problem space, even if it is a major contributor to the number of packaging files you have to know about. If we look at the "competition" here, such as Fedora and Arch Linux, they tend to only have one packaging file. While most Debian people will tell you a long list of cons about having one packaging file (such as Fedora's spec file being 3+ domain-specific languages "mashed" into one file), one major advantage is that there is only "the one packaging file". You only need to remember where to find the documentation for one file, which is great when you are running on wetware with limited storage capacity. That means, as a newbie, you can dedicate less mental effort to tracking multiple files and how they interact, and more to understanding the "one file" at hand. I started by asking myself: how can we in Debian make the packaging stack more accessible to newcomers? Spoiler alert: I dug myself into a rabbit hole and ended up somewhere other than where I thought I was going. I started by wanting to scan the debian directory and annotate all files that I could with documentation links. The logic was that if debputy could do that for you, then you could spend your mental effort elsewhere. So I combined debputy's packager-provided files detection with a static list of files, and I quickly had a good starting point for debputy-based packages.
Adding (non-static) dpkg and debhelper files to the mix

Now, I could have closed the topic here and said "Look, I did debputy files plus a couple of super common files". But I decided to take it a bit further. I added support for handling some dpkg files like packager-provided files (such as debian/substvars and debian/symbols). But even then, we all know that debhelper is the big hurdle and a major part of the omission... In another previous blog post (A new Debian package helper: debputy), I made a point about how debputy could list all auxiliary files while debhelper could not. This was exactly the kind of feature that I would need here, if this feature was to cover debhelper. Now, I also remarked in that blog post that I was not willing to maintain such a list. Also, I may have ranted about static documentation being unhelpful for debhelper, as it excludes third-party provided tooling. Fortunately, a recent update to dh_assistant had provided some basic plumbing for loading dh sequences. This meant that getting a list of all relevant commands for a source package was a lot easier than it used to be. Once you have a list of commands, it is possible to check all of them for dh's NOOP PROMISE hints. In these hints, a command can assert that it does nothing if a given pkgfile is not present. This led to the new dh_assistant list-guessed-dh-config-files command that will list all declared pkgfiles and which helpers listed them. With this combined feature set in place, debputy could call dh_assistant to get a list of pkgfiles, pretend they were packager-provided files, and annotate those along with the manpage for the relevant debhelper command. The exciting thing about letting debputy resolve the pkgfiles is that debputy will resolve "named" files automatically (debhelper tools will only do so when --name is passed), so it is much more likely to detect named pkgfiles correctly too. Side note: I am going to ignore the elephant in the room for now, which is dh_installsystemd and its package@.service files and the widespread use of debian/foo.service where there is no package called foo. For the latter case, the "proper" name would be debian/pkg.foo.service. With the new dh_assistant feature done and added to debputy, debputy could now detect the ubiquitous debian/install file. Excellent. But less great was that the very common debian/docs file was not detected. It turns out that dh_installdocs cannot be skipped by dh, so it cannot have NOOP PROMISE hints. Meh... Well, dh_assistant could learn about a new INTROSPECTABLE marker in addition to the NOOP PROMISE, and then I could sprinkle that into a few commands. Indeed, that worked, and it means that debian/postinst (etc.) are now also detectable. At this point, debputy would be able to identify a wide range of debhelper-related configuration files in debian/ and at least associate each of them with one or more commands. Nice. Surely, this would be a good place to stop, right...?
Adding more metadata to the files

The debhelper-detected files only had a command name and a manpage URI to that command. It would be nice if we could contextualize this a bit more. Like, is this file installed into the package as-is (like debian/pam), or is it a file list to be processed (like debian/install)? To make this distinction, I could add the most common debhelper file types to my static list and then merge the results together. Except, I do not want to maintain a full list in debputy. Fortunately, debputy has a quite extensible plugin infrastructure, so I added a new plugin feature to provide this kind of detail, and now I can outsource the problem! I split my definitions into two and placed the generic ones in the debputy-documentation plugin, and moved the debhelper-related ones to debhelper-documentation. Additionally, third-party dh addons could provide their own debputy plugin to add context to their configuration files. So this gave birth to file categories and configuration features, which describe each file on different fronts. As an example, debian/gbp.conf could be tagged as a maint-config to signal that it is not directly related to the package build but more of a tool or style preference file. On the other hand, debian/install and debian/debputy.manifest would both be tagged as a pkg-helper-config. Files like debian/pam were tagged as ppf-file for packager-provided file, and so on. I mentioned configuration features above, and those were added because I have had a beef with debhelper's "standard" configuration file format as read by filearray and filedoublearray. They are often considered simple to understand, but it is hard to know how a tool will actually read the file. As an example, consider the following:
  • Will the debhelper tool use filearray, filedoublearray or neither of them to read the file? This topic has about 2 bits of entropy.
  • Will the config file be executed if it is marked executable, assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bits of entropy depending on the context.
  • Will the config file be subject to glob expansions? This topic sounds like a boolean but is a complicated mess. The globs can be handled by debhelper as it parses the file for you; in this case, the globs are applied to every token. However, this is not what dh_install does: there, the last token on each line is supposed to be a directory and therefore not subject to globs, so dh_install does the globbing itself afterwards, but only on part of the tokens. That is about 2 more bits of entropy. Actually, it gets worse...
    • If the file is executed, debhelper will refuse to expand globs in the output of the command, a deliberate design choice the original debhelper maintainer made when he introduced the feature in debhelper/8.9.12. Except, dh_install's own feature interacts with that design choice and does enable glob expansion in the tool output, because it does the globbing manually after its filedoublearray call.
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these differences, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags, including dh-executable-config and dh-glob-after-execute. Then I added a datatable of these tags, so it would be easy for people to look up what they mean. Ok, this seems like a closed deal, right...?
Context, context, context

However, the dh-executable-config tag, among others, is only applicable in compat 9 or later. It does not seem newbie friendly if you are told that this feature exists, but then have to read in the extended description that it actually does not apply to your package. This problem seems fixable. Thanks to dh_assistant, it is easy to figure out which compat level the package is using, and then tweak some metadata to enable per-compat-level rules. With that, tags like dh-executable-config only appear for packages using compat 9 or later. Also, debputy should be able to tell you where packager-provided files like debian/pam are installed. We already have the logic for the packager-provided files that debputy supports, and I am already using the debputy engine for detecting the files. If only the plugin-provided metadata gave me the install pattern, debputy would be able to tell you where this file goes in the package. Indeed, a bit of tweaking later, after setting install-pattern to usr/lib/pam.d/{name}, debputy presented me with the correct install path, with the package name replacing the {name} placeholder. Now, I have been using debian/pam as an example because debian/pam is installed into usr/lib/pam.d in compat 14, but in earlier compat levels it was installed into etc/pam.d. Well, I already had an infrastructure for doing compat file tags. Off we go to add install-pattern to the compat level infrastructure, and now changing the compat level changes the path. Great. (Bug warning: the value is off-by-one in the current version of debhelper. This is fixed in git.) Also, while we are in this install-pattern business, a number of debhelper config files cause files to be installed into a fixed directory. Like debian/docs, which causes files to be installed into /usr/share/doc/{package}. Surely, we can expand that as well and provide that bit of context too... and done. (Bug warning: the code currently does not account for the main documentation package context.) It is a rather common pattern for people to have debian/foo.in files, because they want custom generation of debian/foo. Which means if you have debian/foo you get "Oh, let me tell you about debian/foo", but then you rename it to debian/foo.in and the result is "debian/foo.in is a total mystery to me!". That is suboptimal, so let's detect those as well, as if they were the original file, but add a tag saying that they are a generator template along with which file we suspect they generate. Finally, if you use debputy, almost all of the standard debhelper commands are removed from the sequence, since debputy replaces them. It would be weird if these commands still contributed configuration files when they are not actually going to be invoked. This mostly happened naturally due to the way the underlying dh_assistant command works. However, any file mentioned by the debhelper-documentation plugin would unfortunately still appear. So off I went to filter the list of known configuration files against the dh_ commands that dh_assistant thought would be used for this package.
Wrapping it up

I was several layers into this and had to dig myself out. I have ended up with a lot of data and metadata, but it was quite difficult for me to arrange the output in a user-friendly manner. However, all this data did seem like it would be useful to any tool that wants to understand more about the package. So to get out of the rabbit hole, I have for now wrapped all of this into JSON, and now we have a debputy tool-support annotate-debian-directory command that might be useful for other tools. To try it out, you can run the demo below. Another day, I will figure out how to structure this output so that it is useful for non-machine consumers. Suggestions are welcome. :)
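This is a guessed invocation, run from the root of an unpacked source package; the command name comes from the post, and the json.tool pipe is only there for readable output:

cd some-unpacked-source-package/
debputy tool-support annotate-debian-directory | python3 -m json.tool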
Limitations of the approach

As a closing remark, I should probably remind people that this feature relies heavily on declarative features. These include:
  • When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
  • When debhelper commands use NOOP PROMISE hints, dh_assistant can "see" the config files listed in those hints, meaning the file will at least be detected. For the new INTROSPECTABLE hint and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!

26 January 2024

Bastian Venthur: Investigating popularity of Python build backends over time

Inspired by a Mastodon post by Françoise Conil, who investigated the current popularity of build backends used in pyproject.toml files, I wanted to investigate how the popularity of build backends used in pyproject.toml files has evolved over the years since the introduction of PEP 517 in 2015.

Getting the data

Tom Forbes provides a huge dataset that contains information about every file within every release uploaded to PyPI. To get the current dataset, we can use:
curl -L --remote-name-all $(curl -L "https://github.com/pypi-data/data/raw/main/links/dataset.txt")
This will download approximately 30GB of parquet files, providing detailed information about each file included in a PyPI upload, including:
  1. project name, version and release date
  2. file path, size and line count
  3. hash of the file
The dataset does not contain the actual files themselves, though; more on that in a moment.

Querying the dataset using duckdb

We can now use duckdb to query the parquet files directly. Let's look at the schema first:
describe select * from '*.parquet';
 
┌─────────────────┬─────────────┬─────────┐
│   column_name   │ column_type │  null   │
│     varchar     │   varchar   │ varchar │
├─────────────────┼─────────────┼─────────┤
│ project_name    │ VARCHAR     │ YES     │
│ project_version │ VARCHAR     │ YES     │
│ project_release │ VARCHAR     │ YES     │
│ uploaded_on     │ TIMESTAMP   │ YES     │
│ path            │ VARCHAR     │ YES     │
│ archive_path    │ VARCHAR     │ YES     │
│ size            │ UBIGINT     │ YES     │
│ hash            │ BLOB        │ YES     │
│ skip_reason     │ VARCHAR     │ YES     │
│ lines           │ UBIGINT     │ YES     │
│ repository      │ UINTEGER    │ YES     │
├─────────────────┴─────────────┴─────────┤
│ 11 rows                       6 columns │
└─────────────────────────────────────────┘
From all files mentioned in the dataset, we only care about pyproject.toml files that are in the project s root directory. Since we ll still have to download the actual files, we need to get the path and the repository to construct the corresponding URL to the mirror that contains all files in a bunch of huge git repositories. Some files are not available on the mirrors; to skip these, we only take files where the skip_reason is empty. We also care about the timestamp of the upload (uploaded_on) and the hash to avoid processing identical files twice:
select
    path,
    hash,
    uploaded_on,
    repository
from '*.parquet'
where
    skip_reason == '' and
    lower(string_split(path, '/')[-1]) == 'pyproject.toml' and
    len(string_split(path, '/')) == 5
order by uploaded_on desc
This query runs for a few minutes on my laptop and returns ~1.2M rows.

Getting the actual files

Using the repository and path, we can now construct a URL from which we can fetch the actual file for further processing:
url = f"https://raw.githubusercontent.com/pypi-data/pypi-mirror-{repository}/code/{path}"
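Spelled out, the fetch-and-parse step might look like this minimal sketch (the helper name is mine, error handling is omitted, and tomllib requires Python 3.11+):

import tomllib
import urllib.request

def get_build_backend(repository, path):
    # Fetch one pyproject.toml from the pypi-data mirror and return the
    # declared build backend, or None if the file does not declare one.
    url = f"https://raw.githubusercontent.com/pypi-data/pypi-mirror-{repository}/code/{path}"
    with urllib.request.urlopen(url) as response:
        data = tomllib.loads(response.read().decode())
    return data.get("build-system", {}).get("build-backend")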
Using this, we can download the individual pyproject.toml files and parse them to read the build-backend into a dictionary mapping the file hash to the build backend. Downloads on GitHub are rate-limited, so downloading 1.2M files will take a couple of days. By skipping files with a hash we've already processed, we can avoid downloading the same file more than once, cutting the required downloads by circa 50%.

Results

Assuming the data is complete and my analysis is sound, these are the findings. There is a surprising number of build backends in use, but the overall number of uploads per build backend decreases quickly, with a long tail of single uploads:
>>> results.backend.value_counts()
backend
setuptools        701550
poetry            380830
hatchling          56917
flit               36223
pdm                11437
maturin             9796
jupyter             1707
mesonpy              625
scikit               556
                   ...
postry                 1
tree                   1
setuptoos              1
neuron                 1
avalon                 1
maturimaturinn         1
jsonpath               1
ha                     1
pyo3                   1
Name: count, Length: 73, dtype: int64
We pick only the top 4 build backends and group the remaining ones (including PDM and Maturin) into "other", so they are accounted for as well. The following plot shows the relative distribution of build backends over time. Each bin represents a time span of 28 days; I chose 28 days to reduce visual clutter. Within each bin, the height of the bars corresponds to the relative proportion of uploads during that time interval (a rough sketch of this binning is given at the end of this post): Relative distribution of build backends over time

Looking at the right side of the plot, we see the current distribution. It confirms Françoise's findings about the current popularity of build backends. Between 2018 and 2020, the graph exhibits significant fluctuations, due to the relatively low number of uploads utilizing pyproject.toml files. During that early period, Flit started as the most popular build backend, but was eventually displaced by Setuptools and Poetry. Between 2020 and 2022, the overall usage of pyproject.toml files increased significantly; by the end of 2022, the share of Setuptools peaked at 70%. After 2020, other build backends experienced a gradual rise in popularity. Among these, Hatch emerged as a notable contender, steadily gaining traction and ultimately stabilizing at 10%. We can also look into the absolute distribution of build backends over time: Absolute distribution of build backends over time

The plot shows that Setuptools has the strongest growth trajectory, surpassing all other build backends. Poetry and Hatch are growing at a comparable rate, but since Hatch started roughly 4 years after Poetry, it is lagging behind in popularity. Despite not being among the most widely used backends anymore, Flit maintains a steady and consistent growth pattern, indicating its enduring relevance in the Python packaging landscape. The script for downloading and analyzing the data can be found in my GitHub repository. It contains the results of the duckdb query (so you don't have to download the full dataset) and the pickled dictionary mapping the file hashes to the build backends, saving you days of downloading and analyzing the pyproject.toml files yourself.
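For illustration, here is a rough sketch of the 28-day binning and relative-share computation described above, assuming a pandas DataFrame df with an uploaded_on timestamp column and a backend column; the variable names and plotting details are mine, not from the original script:

import pandas as pd
import matplotlib.pyplot as plt

# Keep the top 4 backends and fold everything else into 'other'.
top = df['backend'].value_counts().head(4).index
df['backend'] = df['backend'].where(df['backend'].isin(top), 'other')

# Count uploads per backend in 28-day bins, then normalize each bin so
# that bar heights show relative proportions.
counts = (df.groupby([pd.Grouper(key='uploaded_on', freq='28D'), 'backend'])
            .size()
            .unstack(fill_value=0))
shares = counts.div(counts.sum(axis=1), axis=0)

shares.plot.bar(stacked=True, width=1.0)
plt.show()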

25 January 2024

Joachim Breitner: GHC Steering Committee Retrospective

After seven years of service as member and secretary on the GHC Steering Committee, I have resigned from that role. So this is a good time to look back and retrace the formation of the GHC proposal process and committee. In my memory, I helped define and shape the proposal process, optimizing it for effectiveness and throughput, but memory can be misleading, and judging from the paper trail in my email archives, this was indeed mostly Ben Gamari's and Richard Eisenberg's achievement: Already in Summer of 2016, Ben Gamari set up the ghc-proposals Github repository with a sketch of a process and sent out a call for nominations on the GHC user's mailing list, which I replied to. The Simons picked the first set of members, and in the fall of 2016 we discussed the committee's by-laws and procedures. As so often, Richard was an influential shaping force here.

Three ingredients

For example, it was he who suggested that for each proposal we have one committee member act as the "Shepherd", overseeing the discussion. I believe this was one ingredient of the process's effectiveness: there is always one person in charge, and thus we avoid the delays incurred when any one of a non-singleton set of volunteers has to do the next step (and everyone hopes someone else does it). The next ingredient was that we do not usually require a vote among all members (again, not easy with volunteers with limited bandwidth and occasional phases of absence). Instead, the shepherd makes a recommendation (accept/reject), and if the other committee members do not complain, this silence is taken as consent, and we come to a decision. It seems this idea can also be traced back to Richard, who suggested that once a decision is requested, the shepherd "generates consensus. If consensus is elusive, then we vote." At the end of the year we agreed and wrote down these rules, created the mailing list for our internal, but publicly archived committee discussions, and began accepting proposals, starting with Adam Gundry's OverloadedRecordFields.

At that point, there was no secretary role yet, so how did I become one? It seems that in February 2017 I started to clean up and refine the process documentation, fixing bugs in the process (like requiring authors to set Github labels when they don't even have permissions to do that). This in particular meant that someone from the committee had to manually handle submissions and so on, and by the aforementioned principle that at every step there ought to be exactly one person in charge, the role of a secretary followed naturally. In the email in which I described that role I wrote:
Simon already shoved me towards picking up the secretary hat, to reduce load on Ben.
So when I merged the updated process documentation, I already listed myself as "secretary". It wasn't just Simon's shoving that put me into the role, though. I dug out my original self-nomination email to Ben, and among other things I wrote:
I also hope that there is going to be clear responsibilities and a clear workflow among the committee. E.g. someone (possibly rotating), maybe called the secretary, who is in charge of having an initial look at proposals and then assigning it to a member who shepherds the proposal.
So it is hardly a surprise that I became secretary, when it was dear to my heart to have a smooth continuous process here. I am rather content with the result: these three ingredients (single secretary, per-proposal shepherds, silence-is-consent) helped the committee to be effective throughout its existence, even as every once in a while individual members dropped out.

Ulterior motivation

I must admit, however, there was an ulterior motivation behind me grabbing the secretary role: yes, I did want the committee to succeed, and I did want authors to receive timely, good and decisive feedback on their proposals, but I did not really want to have to do that part. I am, in fact, a lousy proposal reviewer. I am too generous when reading proposals, and more likely to mentally fill gaps in a specification than to spot them, always optimistically assuming that the authors surely know what they are doing, rather than critically assessing the impact, the implementation cost and the interaction with other language features. And, maybe more importantly: why should I know which changes are good and which are not so good in the long run? Clearly, the authors cared enough about a proposal to put it forward, so there is some need, and I do believe that Haskell should stay an evolving and innovating language, but how does this help me decide about this or that particular feature? I even, during the formation of the committee, explicitly asked that we write down some guidance on "Vision and Guideline": do we want to foster change and innovation, or be selective gatekeepers? Should we accept features that are proven to be useful, or should we accept features so that they can prove to be useful? This discussion, however, did not lead to a concrete result, and the assessment of proposals relied on the sum of each member's personal preference, expertise and gut feeling. I am not saying that this was a mistake: it is hard to come up with a general guideline here, and even harder to find one that does justice to each individual proposal.

So the secret motivation for me to grab the secretary post was that I could contribute without having to judge proposals. Being secretary allowed me to assign most proposals to others to shepherd, and only once in a while take care of a proposal myself, when it seemed to be very straightforward. Sneaky, ain't it?

7 Years later

For years to come I happily played secretary: when an author finished their proposal and public discussion ebbed down, they would ping me on GitHub, I would pick a suitable shepherd among the committee and ask them to judge the proposal. Eventually, the committee would come to a conclusion, usually by implicit consent, sometimes by voting, and I'd merge the pull request and update the metadata thereon. Every few months I'd summarize the current state of affairs to the committee (what happened since the last update, which proposals are currently on our plate), and once per year gathered the data for Simon Peyton Jones's annual GHC Status Report. Sometimes some members needed a nudge or two to act. Some would eventually step down, and I'd send around a call for nominations and, when the nominations came in, distribute them off-list among the committee and tally the votes.

Initially, that was exciting. For a long while it was a pleasant and rewarding routine. Eventually, it became a mere chore. I noticed that I didn't quite care so much anymore about some of the discussions, and there was a decent amount of navel-gazing, meta-discussions and some wrangling about claims of authority that was probably useful and necessary, but wasn't particularly fun. I also began to notice weaknesses in the processes that I helped shape: we could really use some more automation for showing proposal statuses, notifying people when they have to act, and nudging them when they don't. The whole silence-is-assent approach is good for throughput, but not necessarily great for quality, and maybe the committee members need to be pushed more firmly to engage with each proposal. Like GHC itself, the committee processes deserve continuous refinement and refactoring, and since I could not muster the motivation to change my now well-trod secretarial ways, it was time for me to step down. Luckily, Adam Gundry volunteered to take over, and that makes me feel much less bad for quitting. Thanks for that!

And although for my day job I am now enjoying a language that has many of the things out of the box that for Haskell are still only language extensions or even just future proposals (dependent types, BlockArguments, do notation with (← foo) expressions and Unicode), I'm still around, hosting the Haskell Interlude Podcast, writing on this blog and hanging out at ZuriHac etc.

22 January 2024

Dirk Eddelbuettel: RProtoBuf 0.4.22 on CRAN: Updated Windows Support!

A new maintenance release 0.4.22 of RProtoBuf arrived on CRAN earlier today. RProtoBuf provides R with bindings for the Google Protocol Buffers ("ProtoBuf") data encoding and serialization library used and released by Google, and deployed very widely in numerous projects as a language and operating-system agnostic protocol. This release matches the recent 0.4.21 release which enabled use of the package with newer ProtoBuf releases. Tomas has been updating the Windows / rtools side of things, and supplied us with a simple PR that will enable building with those updated versions once finalised. The following section from the NEWS.Rd file has full details.

Changes in RProtoBuf version 0.4.22 (2022-12-13)
  • Apply patch by Tomas Kalibera to support updated rtools to build with newer ProtoBuf releases on Windows

Thanks to my CRANberries, there is a diff to the previous release. The RProtoBuf page has copies of the (older) package vignette, the quick overview vignette, and the pre-print of our JSS paper. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 January 2024

Colin Watson: Task management

Now that I'm freelancing, I need to actually track my time, which is something I've had the luxury of not having to do before. That meant something of a rethink of the way I've been keeping track of my to-do list. Up to now that was a combination of things like the bug lists for the projects I'm working on at the moment, whatever task tracking system Canonical was using at the moment (Jira when I left), and a giant flat text file in which I recorded logbook-style notes of what I'd done each day plus a few extra notes at the bottom to remind myself of particularly urgent tasks. I could have started manually adding times to each logbook entry, but ugh, let's not. In general, I had the following goals (which were a bit reminiscent of my address book). I didn't do an elaborate evaluation of multiple options, because I'm not trying to come up with the best possible solution for a client here. Also, there are a bazillion to-do list trackers out there and if I tried to evaluate them all I'd never do anything else. I just wanted something that works well enough for me. Since it came up on Mastodon: a bunch of people swear by Org mode, which I know can do at least some of this sort of thing. However, I don't use Emacs and don't plan to use Emacs. nvim-orgmode does have some support for time tracking, but when I've tried vim-based versions of Org mode in the past I've found they haven't really fitted my brain very well.

Taskwarrior and Timewarrior

One of the other Freexian collaborators mentioned Taskwarrior and Timewarrior, so I had a look at those. The basic idea of Taskwarrior is that you have a task command that tracks each task as a blob of JSON and provides subcommands to let you add, modify, and remove tasks with a minimum of friction. task add adds a task, and you can add metadata like project:Personal (I always make sure every task has a project, for ease of filtering). Just running task shows you a task list sorted by Taskwarrior's idea of urgency, with an ID for each task, and there are various other reports with different filtering and verbosity. task <id> annotate lets you attach more information to a task. task <id> done marks it as done. So far so good, so a redacted version of my to-do list looks like this:
$ task ls
ID A Project     Tags                 Description
17   Freexian                         Add Incus support to autopkgtest [2]
 7   Columbiform                      Figure out Lloyds online banking [1]
 2   Debian                           Fix troffcvt for groff 1.23.0 [1]
11   Personal                         Replace living room curtain rail
Once I got comfortable with it, this was already a big improvement. I haven't bothered to learn all the filtering gadgets yet, but it was easy enough to see that I could do something like task all project:Personal and it'd show me both pending and completed tasks in that project, and that all the data was stored in ~/.task - though I have to say that there are enough reporting bells and whistles that I haven't needed to poke around manually. In combination with the regular backups that I do anyway (you do too, right?), this gave me enough confidence to abandon my previous text-file logbook approach. Next was time tracking. Timewarrior integrates with Taskwarrior, albeit in an only semi-packaged way, and it was easy enough to set that up. Now I can do:
$ task 25 start
Starting task 00a9516f 'Write blog post about task tracking'.
Started 1 task.
Note: '"Write blog post about task tracking"' is a new tag.
Tracking Columbiform "Write blog post about task tracking"
  Started 2024-01-10T11:28:38
  Current                  38
  Total               0:00:00
You have more urgent tasks.
Project 'Columbiform' is 25% complete (3 of 4 tasks remaining).
When I stop work on something, I do task active to find the ID, then task <id> stop. Timewarrior does the tedious stopwatch business for me, and I can manually enter times if I forget to start/stop a task. Then the really useful bit: I can do something like timew summary :month <name-of-client> and it tells me how much to bill that client for this month. Perfect. I also started using VIT to simplify the day-to-day flow a little, which means I'm normally just using one or two keystrokes rather than typing longer commands. That isn't really necessary from my point of view, but it does save some time.

Android integration

I left Android integration for a bit later since it wasn't essential. When I got round to it, I have to say that it felt a bit clumsy, but it did eventually work. The first step was to set up a taskserver. Most of the setup procedure was OK, but I wanted to use Let's Encrypt to minimize the amount of messing around with CAs I had to do. Getting this to work involved hitting things with sticks a bit, and there's still a local CA involved for client certificates. What I ended up with was a certbot setup with the webroot authenticator and a custom deploy hook as follows (with cert_name replaced by a DNS name in my house domain):
#! /bin/sh
set -eu
cert_name=taskd.example.org
found=false
for domain in $RENEWED_DOMAINS; do
    case "$domain" in
        $cert_name)
            found=:
            ;;
    esac
done
$found || exit 0
install -m 644 "/etc/letsencrypt/live/$cert_name/fullchain.pem" \
    /var/lib/taskd/pki/fullchain.pem
install -m 640 -g Debian-taskd "/etc/letsencrypt/live/$cert_name/privkey.pem" \
    /var/lib/taskd/pki/privkey.pem
systemctl restart taskd.service
I could then set this in /etc/taskd/config (server.crl.pem and ca.cert.pem were generated using the documented taskserver setup procedure):
server.key=/var/lib/taskd/pki/privkey.pem
server.cert=/var/lib/taskd/pki/fullchain.pem
server.crl=/var/lib/taskd/pki/server.crl.pem
ca.cert=/var/lib/taskd/pki/ca.cert.pem
Then I could set taskd.ca on my laptop to /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt and otherwise follow the client setup instructions, run task sync init to get things started, and then task sync every so often to sync changes between my laptop and the taskserver. I used TaskWarrior Mobile as the client. I have to say I wouldn't want to use that client as my primary task tracking interface: the setup procedure is clunky even beyond the necessity of copying a client certificate around, it expects you to give it a .taskrc rather than having a proper settings interface for that, and it only seems to let you add a task if you specify a due date for it. It also lacks Timewarrior integration, so I can only really use it when I don't care about time tracking, e.g. personal tasks. But that's really all I need, so it meets my minimum requirements.

Next?

Considering this is literally the first thing I tried, I have to say I'm pretty happy with it. There are a bunch of optional extras I haven't tried yet, but in general it kind of has the vim nature for me: if I need something it's very likely to exist or easy enough to build, but the features I don't use don't get in my way. I wouldn't recommend any of this to somebody who didn't already spend most of their time in a terminal - but I do. I'm glad people have gone to all the effort to build this so I didn't have to.

12 January 2024

Freexian Collaborators: Monthly report about Debian Long Term Support, December 2023 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors

In December, 18 contributors have been paid to work on Debian LTS; their reports are available:
  • Abhijith PA did 7.0h (out of 7.0h assigned and 7.0h from previous period), thus carrying over 7.0h to the next month.
  • Adrian Bunk did 16.0h (out of 26.25h assigned and 8.75h from previous period), thus carrying over 19.0h to the next month.
  • Bastien Roucariès did 16.0h (out of 16.0h assigned and 4.0h from previous period), thus carrying over 4.0h to the next month.
  • Ben Hutchings did 8.0h (out of 7.25h assigned and 16.75h from previous period), thus carrying over 16.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Emilio Pozuelo Monfort did 8.0h (out of 26.75h assigned and 8.25h from previous period), thus carrying over 27.0h to the next month.
  • Guilhem Moulin did 25.0h (out of 18.0h assigned and 7.0h from previous period).
  • Holger Levsen did 5.5h (out of 5.5h assigned).
  • Jochen Sprickerhof did 0.0h (out of 0h assigned and 10.0h from previous period), thus carrying over 10.0h to the next month.
  • Lee Garrett did 0.0h (out of 25.75h assigned and 9.25h from previous period), thus carrying over 35.0h to the next month.
  • Markus Koschany did 35.0h (out of 35.0h assigned).
  • Roberto C. Sánchez did 9.5h (out of 5.5h assigned and 6.5h from previous period), thus carrying over 2.5h to the next month.
  • Santiago Ruano Rincón did 8.255h (out of 3.26h assigned and 12.745h from previous period), thus carrying over 7.75h to the next month.
  • Sean Whitton did 4.25h (out of 3.25h assigned and 6.75h from previous period), thus carrying over 5.75h to the next month.
  • Sylvain Beucler did 16.5h (out of 21.25h assigned and 13.75h from previous period), thus carrying over 18.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 10.25h (out of 12.0h assigned), thus carrying over 1.75h to the next month.
  • Utkarsh Gupta did 18.75h (out of 11.25h assigned and 13.5h from previous period), thus carrying over 6.0h to the next month.

Evolution of the situation

In December, we have released 29 DLAs. A particularly notable update in December was prepared by LTS contributor Santiago Ruano Rincón for the openssh package. The update produced DLA-3694-1 and included a fix for the Terrapin Attack (CVE-2023-48795), which was a rather serious flaw in the SSH protocol itself. The package bluez was the subject of another notable update by LTS contributor Chris Lamb, which resulted in DLA-3689-1 to address an insecure default configuration which allowed attackers to inject keyboard commands over Bluetooth without first authenticating.

The LTS team continues its efforts to have a positive impact beyond the boundaries of LTS. Several contributors worked on packages, preparing LTS updates, but also preparing patches or full updates which were uploaded to the unstable, stable, and oldstable distributions, including: Guilhem Moulin's update of tinyxml (uploads to LTS and unstable and patches submitted to the security team for stable and oldstable); Guilhem Moulin's update of xerces-c (uploads to LTS and unstable and patches submitted to the security team for oldstable); Thorsten Alteholz's update of libde265 (uploads to LTS and stable and additional patches submitted to the maintainer for stable and oldstable); Thorsten Alteholz's update of cjson (upload to LTS and patches submitted to the maintainer for stable and oldstable); and Tobias Frost's update of opendkim (sponsored a maintainer-prepared upload to LTS and additionally prepared updates for stable and oldstable).

Going beyond Debian and looking to the broader community, LTS contributor Bastien Roucariès was contacted by SUSE concerning an update he had prepared for zbar. He was able to assist by coordinating with the former organization of the original zbar author to secure for SUSE access to information concerning the exploits. This has enabled another distribution to benefit from the work done in support of LTS and from Bastien's assistance in coordinating the access to information.

Finally, LTS contributor Santiago Ruano Rincón continued work relating to how updates for packages in statically-linked language ecosystems (e.g., Go, Rust, and others) are handled. The work is presently focused on more accurately and reliably identifying which packages are impacted in a given update scenario to enable notifications to be published so that users will be made aware of these situations as they occur. As the work continues, it will eventually result in improvements to Debian infrastructure so that the LTS team and Security team are able to manage updates of this nature in a more consistent way.

Thanks to our sponsors

Sponsors that joined recently are in bold.

10 January 2024

Dirk Eddelbuettel: Rcpp 1.0.12 on CRAN: New Maintenance / Update Release

The Rcpp Core Team is once again thrilled to announce a new release 1.0.12 of the Rcpp package. It arrived on CRAN early today, and has since been uploaded to Debian as well. Windows and macOS builds should appear at CRAN in the next few days, as will builds in different Linux distributions, and of course r2u should catch up tomorrow. The release was uploaded yesterday, and its reverse dependencies were run overnight. Rcpp always gets flagged no matter what because of the grandfathered .Call(symbol), but we had not a single change to worse among over 2700 reverse dependencies! This release continues with the six-month January-July cycle started with release 1.0.5 in July 2020. As a reminder, we do of course make interim snapshot dev or rc releases available via the Rcpp drat repo and strongly encourage their use and testing; I run my systems with these versions, which tend to work just as well, and are also fully tested against all reverse dependencies. Rcpp has long established itself as the most popular way of enhancing R with C or C++ code. Right now, 2791 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 254 in BioConductor. On CRAN, 13.8% of all packages depend (directly) on Rcpp, and 59.9% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 78.1 million times. The two published papers (also included in the package as preprint vignettes) have, respectively, 1766 (JSS, 2011) and 292 (TAS, 2018) citations, while the book (Springer useR!, 2013) has another 617. This release is incremental as usual, generally preserving existing capabilities faithfully while smoothing out corners and / or extending slightly, sometimes in response to changing and tightened demands from CRAN or R standards. The full list below details all changes, their respective PRs and, if applicable, issue tickets. Big thanks from all of us to all contributors!

Changes in Rcpp release version 1.0.12 (2024-01-08)
  • Changes in Rcpp API:
    • Missing header includes as spotted by some recent tools were added in two places (Michael Chirico in #1272 closing #1271).
    • Casts to avoid integer overflow in matrix row/col selections have been added (Aaron Lun #1281).
    • Three print format corrections uncovered by R-devel were applied with thanks to Tomas Kalibera (Dirk in #1285).
    • A print format correction was applied to the RcppExports glue code (Dirk in #1288 fixing #1287).
    • The upcoming OBJSXP addition to R 4.4.0 is supported in the type2name mapper (Dirk and Iñaki in #1293).
  • Changes in Rcpp Attributes:
    • Generated interface code from base R that fails under LTO is now corrected (Iñaki in #1274 fixing a StackOverflow issue).
  • Changes in Rcpp Documentation:
    • The caption for the third figure in the introductory vignette has been corrected (Dirk in #1277 fixing #1276).
    • A small formatting issue was corrected in an Rd file as noticed by R-devel (Dirk in #1282).
    • The Rcpp FAQ vignette has been updated (Dirk in #1284).
    • The Rcpp.bib file has been refreshed to current package versions.
  • Changes in Rcpp Deployment:
    • The RcppExports file for an included test package has been updated (Dirk in #1289).

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues). If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

8 January 2024

Thorsten Alteholz: My Debian Activities in December 2023

FTP master

This month I accepted 235 and rejected 13 packages. The overall number of packages that got accepted was 249. I also handled lots of RM bugs and almost stopped the increase in packages this month :-). Please be aware, if you don't want your package to be removed, take care of it and keep it in good shape!

Debian LTS

This was my hundred-fourteenth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded: This month was rather calm and no unexpected things happened. The web team now automatically creates all webpages from data found in the security tracker, so I could deactivate my web-dla script again, which created the webpages from the contents of the announcement mailing list. Last but not least I also did two weeks of frontdesk duties.

Debian ELTS

This month was the sixty-fifth ELTS month. During my allocated time I uploaded: Last but not least I also did two weeks of frontdesk duties.

Debian Printing

This month I uploaded a package to fix bugs: This work is generously funded by Freexian!

Debian Astro

This month I uploaded a package to fix bugs:

Other stuff

This month I uploaded new upstream versions of packages, did a source upload for the transition or uploaded it to fix one or the other issue:

25 December 2023

Sergio Talens-Oliag: GitLab CI/CD Tips: Automatic Versioning Using semantic-release

This post describes how I'm using semantic-release on gitlab-ci to manage versioning automatically for different kinds of projects following a simple workflow (a develop branch where changes are added or merged to test new versions, a temporary release/#.#.# branch to generate the release candidate versions and a main branch where the final versions are published).

What is semantic-release

It is a Node.js application designed to manage project versioning information on Git repositories using a Continuous Integration system (in this post we will use gitlab-ci).

How does it work

By default semantic-release uses semver for versioning (release versions use the format MAJOR.MINOR.PATCH) and commit messages are parsed to determine the next version number to publish. If after analyzing the commits the version number has to be changed, the command updates the files we tell it to (i.e. the package.json file for nodejs projects and possibly a CHANGELOG.md file), creates a new commit with the changed files, creates a tag with the new version and pushes the changes to the repository. When running on a CI/CD system we usually generate the artifacts related to a release (a package, a container image, etc.) from the tag, as it includes the right version number and usually has passed all the required tests (it is a good idea to run the tests again in any case, as someone could create a tag manually or we could run extra jobs when building the final assets; if they fail it is not a big issue anyway, version numbers are cheap and infinite, so we can skip releases if needed).

Commit messages and versioning

The commit messages must follow a known format; the default module used to analyze them uses the angular git commit guidelines, but I prefer the conventional commits one, mainly because it's a lot easier to use when you want to update the MAJOR version. The commit message format used must be:
<type>(optional scope): <description>
[optional body]
[optional footer(s)]
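For example, a hypothetical commit message following this format (the scope, body and footers are all optional) could look like this:

feat(parser): add support for YAML configuration files

Accept .yaml and .yml files in addition to JSON.

Closes #123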
The system supports three types of branches: release, maintenance and pre-release, but for now I'm not using maintenance ones. The branches I use and their types are:
  • main as release branch (final versions are published from there)
  • develop as pre release branch (used to publish development and testing versions with the format #.#.#-SNAPSHOT.#)
  • release/#.#.# as pre release branches (they are created from develop to publish release candidate versions with the format #.#.#-rc.# and once they are merged with main they are deleted)
On the release branch (main) the version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last version change (it looks for tags on the same branch).
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last version change.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change.
On the pre release branches (develop and release/#.#.#) the version and pre release numbers are always calculated from the last published version available on the branch (i.e. if we published version 1.3.2 on main we need to have the commit with that tag on the develop or release/#.#.# branch to calculate the next version correctly). The version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last released version. In our example it was 1.3.2 and the version is updated to 2.0.0-SNAPSHOT.1 or 2.0.0-rc.1 depending on the branch.
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last released version. In our example the release was 1.3.2 and the version is updated to 1.4.0-SNAPSHOT.1 or 1.4.0-rc.1 depending on the branch.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change. In our example the release was 1.3.2 and the version is updated to 1.3.3-SNAPSHOT.1 or 1.3.3-rc.1 depending on the branch.
  4. The pre release number is incremented if the MAJOR, MINOR and PATCH numbers are not going to be changed but there is a commit that would otherwise update the version (i.e. a fix on 1.3.3-SNAPSHOT.1 will set the version to 1.3.3-SNAPSHOT.2, a fix or feat on 1.4.0-rc.1 will set the version to 1.4.0-rc.2 and so on).
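To make the rules more concrete, here is a small sketch of the version-bump logic in Python; this is my own illustration of the rules above, not semantic-release's actual code, and it assumes commits are passed in as full conventional-commit messages:

def next_version(last, commits):
    # last: (major, minor, patch) of the last released version
    major, minor, patch = last
    headers = [c.splitlines()[0] for c in commits if c]
    if any("BREAKING CHANGE:" in c for c in commits) or \
            any(h.split(":")[0].endswith("!") for h in headers):
        return (major + 1, 0, 0)
    if any(h.startswith("feat") for h in headers):
        return (major, minor + 1, 0)
    if any(h.startswith("fix") for h in headers):
        return (major, minor, patch + 1)
    return None  # no commit requires a new release

# e.g. next_version((1, 3, 2), ["feat(ui): new dialog"]) returns (1, 4, 0),
# which becomes 1.4.0 on main or 1.4.0-rc.1 on a release/#.#.# branch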

How do we manage its configuration

Although the system is designed to work with nodejs projects, it can be used with multiple programming languages and project types. For nodejs projects the usual place to put the configuration is the project's package.json, but I prefer to use the .releaserc file instead. As I use a common set of CI templates, instead of using a .releaserc on each project I generate it on the fly on the jobs that need it, replacing values related to the project type and the current branch on a template using the tmpl command (lately I use a branch of my own fork while I wait for some feedback from upstream, as you will see on the Dockerfile).

Container used to run it

As we run the command on a gitlab-ci job we use the image built from the following Dockerfile:
Dockerfile
# Semantic release image
FROM golang:alpine AS tmpl-builder
#RUN go install github.com/krakozaure/tmpl@v0.4.0
RUN go install github.com/sto/tmpl@v0.4.0-sto.2
FROM node:lts-alpine
COPY --from=tmpl-builder /go/bin/tmpl /usr/local/bin/tmpl
RUN apk update &&\
  apk upgrade &&\
  apk add curl git jq openssh-keygen yq zip &&\
  npm install --location=global\
    conventional-changelog-conventionalcommits@6.1.0\
    @qiwi/multi-semantic-release@7.0.0\
    semantic-release@21.0.7\
    @semantic-release/changelog@6.0.3\
    semantic-release-export-data@1.0.1\
    @semantic-release/git@10.0.1\
    @semantic-release/gitlab@9.5.1\
    @semantic-release/release-notes-generator@11.0.4\
    semantic-release-replace-plugin@1.2.7\
    semver@7.5.4\
  &&\
  rm -rf /var/cache/apk/*
CMD ["/bin/sh"]

How and when is it executed

The job that runs semantic-release is executed when new commits are added to the develop, release/#.#.# or main branches (basically when something is merged or pushed) and after all tests have passed (we don't want to create a new version that does not compile or does not pass at least the unit tests). The job is something like the following:
semantic_release:
  image: $SEMANTIC_RELEASE_IMAGE
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^(develop|main|release\/\d+.\d+.\d+)$/'
      when: always
  stage: release
  before_script:
    - echo "Loading scripts.sh"
    - . $ASSETS_DIR/scripts.sh
  script:
    - sr_gen_releaserc_json
    - git_push_setup
    - semantic-release
Where the SEMANTIC_RELEASE_IMAGE variable contains the URI of the image built using the Dockerfile above and the sr_gen_releaserc_json and git_push_setup are functions defined on the $ASSETS_DIR/scripts.sh file:
  • The sr_gen_releaserc_json function generates the .releaserc.json file using the tmpl command.
  • The git_push_setup function configures git to allow pushing changes to the repository with the semantic-release command, optionally signing them with a SSH key.

The sr_gen_releaserc_json function

The code for the sr_gen_releaserc_json function is the following:
sr_gen_releaserc_json() {
  # Use nodejs as default project_type
  project_type="${PROJECT_TYPE:-nodejs}"
  # REGEX to match the rc_branch name
  rc_branch_regex='^release\/[0-9]\+\.[0-9]\+\.[0-9]\+$'
  # PATHS on the local ASSETS_DIR
  assets_dir="${CI_PROJECT_DIR}/${ASSETS_DIR}"
  sr_local_plugin="${assets_dir}/local-plugin.cjs"
  releaserc_tmpl="${assets_dir}/releaserc.json.tmpl"
  pipeline_runtime_values_yaml="/tmp/releaserc_values.yaml"
  pipeline_values_yaml="${assets_dir}/values_${project_type}_project.yaml"
  # Destination PATH
  releaserc_json=".releaserc.json"
  # Create an empty pipeline_values_yaml if missing
  test -f "$pipeline_values_yaml" || : >"$pipeline_values_yaml"
  # Create the pipeline_runtime_values_yaml file
  echo "branch: ${CI_COMMIT_BRANCH}" >"$pipeline_runtime_values_yaml"
  echo "gitlab_url: ${CI_SERVER_URL}" >>"$pipeline_runtime_values_yaml"
  # Add the rc_branch name if we are on an rc_branch
  if [ "$(echo "$CI_COMMIT_BRANCH" | sed -ne "/$rc_branch_regex/{p}")" ]; then
    echo "rc_branch: ${CI_COMMIT_BRANCH}" >>"$pipeline_runtime_values_yaml"
  elif [ "$(echo "$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME" |
      sed -ne "/$rc_branch_regex/{p}")" ]; then
    echo "rc_branch: ${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME}" \
      >>"$pipeline_runtime_values_yaml"
  fi
  echo "sr_local_plugin: ${sr_local_plugin}" >>"$pipeline_runtime_values_yaml"
  # Create the releaserc_json file
  tmpl -f "$pipeline_runtime_values_yaml" -f "$pipeline_values_yaml" \
    "$releaserc_tmpl" | jq . >"$releaserc_json"
  # Remove the pipeline_runtime_values_yaml file
  rm -f "$pipeline_runtime_values_yaml"
  # Print the releaserc_json file
  print_file_collapsed "$releaserc_json"
  # --*-- BEG: NOTE --*--
  # Rename the package.json to ignore it when calling semantic release.
  # The idea is that the local-plugin renames it back on the first step of the
  # semantic-release process.
  # --*-- END: NOTE --*--
  if [ -f "package.json" ]; then
    echo "Renaming 'package.json' to 'package.json_disabled'"
    mv "package.json" "package.json_disabled"
  fi
}
Almost all the variables used on the function are defined by gitlab except the ASSETS_DIR and PROJECT_TYPE; in the complete pipelines the ASSETS_DIR is defined on a common file included by all the pipelines and the project type is defined on the .gitlab-ci.yml file of each project. If you review the code you will see that the file processed by the tmpl command is named releaserc.json.tmpl; its contents are shown here:
{
  "plugins": [
    {{- if .sr_local_plugin }}
    "{{ .sr_local_plugin }}",
    {{- end }}
    [
      "@semantic-release/commit-analyzer",
      {
        "preset": "conventionalcommits",
        "releaseRules": [
          { "breaking": true, "release": "major" },
          { "revert": true, "release": "patch" },
          { "type": "feat", "release": "minor" },
          { "type": "fix", "release": "patch" },
          { "type": "perf", "release": "patch" }
        ]
      }
    ],
    {{- if .replacements }}
    [
      "semantic-release-replace-plugin",
      { "replacements": {{ .replacements | toJson }} }
    ],
    {{- end }}
    "@semantic-release/release-notes-generator",
    {{- if eq .branch "main" }}
    [
      "@semantic-release/changelog",
      { "changelogFile": "CHANGELOG.md", "changelogTitle": "# Changelog" }
    ],
    {{- end }}
    [
      "@semantic-release/git",
      {
        "assets": {{ if .assets }}{{ .assets | toJson }}{{ else }}[]{{ end }},
        "message": "ci(release): v${nextRelease.version}\n\n${nextRelease.notes}"
      }
    ],
    [
      "@semantic-release/gitlab",
      { "gitlabUrl": "{{ .gitlab_url }}", "successComment": false }
    ]
  ],
  "branches": [
    { "name": "develop", "prerelease": "SNAPSHOT" },
    {{- if .rc_branch }}
    { "name": "{{ .rc_branch }}", "prerelease": "rc" },
    {{- end }}
    "main"
  ]
}
The values used to process the template are defined on a file built on the fly (releaserc_values.yaml) that includes the following keys and values:
  • branch: the name of the current branch
  • gitlab_url: the URL of the gitlab server (the value is taken from the CI_SERVER_URL variable)
  • rc_branch: the name of the current rc branch; we only set the value if we are processing one, because semantic-release only allows one branch to match the rc prefix, and if we use a wildcard (i.e. release/*) while users keep more than one release/#.#.# branch open at the same time, the calls to semantic-release will fail for sure.
  • sr_local_plugin: the path to the local plugin we use (shown later)
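As an illustration, the generated releaserc_values.yaml for a job running on a hypothetical release/1.4.0 branch could look like this (the URL and paths are made up for the example):

branch: release/1.4.0
gitlab_url: https://gitlab.example.com
rc_branch: release/1.4.0
sr_local_plugin: /builds/my-group/my-project/assets/local-plugin.cjs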
The template also uses a values_${project_type}_project.yaml file that includes settings specific to the project type; the one for nodejs is as follows:
replacements:
  - files:
      - "package.json"
    from: "\"version\": \".*\""
    to: "\"version\": \"$ nextRelease.version \""
assets:
  - "CHANGELOG.md"
  - "package.json"
The replacements section is used to update the version field on the relevant files of the project (in our case the package.json file) and the assets section includes the files that will be committed to the repository when the release is published (looking at the template you can see that the CHANGELOG.md is only updated for the main branch; we do it this way because if we update the file on other branches it creates a merge nightmare, and we are only interested in it for released versions anyway). The local plugin adds code to rename the package.json_disabled file to package.json if present and prints the last and next versions on the logs with a format that can be easily parsed using sed:
local-plugin.cjs
// Minimal plugin to:
// - rename the package.json_disabled file to package.json if present
// - log the semantic-release last & next versions
function verifyConditions(pluginConfig, context) {
  var fs = require('fs');
  if (fs.existsSync('package.json_disabled')) {
    fs.renameSync('package.json_disabled', 'package.json');
    context.logger.log(`verifyConditions: renamed 'package.json_disabled' to 'package.json'`);
  }
}
function analyzeCommits(pluginConfig, context) {
  if (context.lastRelease && context.lastRelease.version) {
    context.logger.log(`analyzeCommits: LAST_VERSION=${context.lastRelease.version}`);
  }
}
function verifyRelease(pluginConfig, context) {
  if (context.nextRelease && context.nextRelease.version) {
    context.logger.log(`verifyRelease: NEXT_VERSION=${context.nextRelease.version}`);
  }
}
module.exports = {
  verifyConditions,
  analyzeCommits,
  verifyRelease
};

The git_push_setup function

The code for the git_push_setup function is the following:
git_push_setup() {
  # Update global credentials to allow git clone & push for all the group repos
  git config --global credential.helper store
  cat >"$HOME/.git-credentials" <<EOF
https://fake-user:${GITLAB_REPOSITORY_TOKEN}@gitlab.com
EOF
  # Define user name, mail and signing key for semantic-release
  user_name="$SR_USER_NAME"
  user_email="$SR_USER_EMAIL"
  ssh_signing_key="$SSH_SIGNING_KEY"
  # Export git user variables
  export GIT_AUTHOR_NAME="$user_name"
  export GIT_AUTHOR_EMAIL="$user_email"
  export GIT_COMMITTER_NAME="$user_name"
  export GIT_COMMITTER_EMAIL="$user_email"
  # Sign commits with ssh if there is a SSH_SIGNING_KEY variable
  if [ "$ssh_signing_key" ]; then
    echo "Configuring GIT to sign commits with SSH"
    ssh_keyfile="/tmp/.ssh-id"
    : >"$ssh_keyfile"
    chmod 0400 "$ssh_keyfile"
    echo "$ssh_signing_key" | tr -d '\r' >"$ssh_keyfile"
    git config gpg.format ssh
    git config user.signingkey "$ssh_keyfile"
    git config commit.gpgsign true
  fi
}
The function assumes that the GITLAB_REPOSITORY_TOKEN variable (set on the CI/CD variables section of the project or group we want) contains a token with read_repository and write_repository permissions on all the projects where we are going to use this function. The SR_USER_NAME and SR_USER_EMAIL variables can be defined on a common file or the CI/CD variables section of the project or group we want to work with, and the script assumes that the optional SSH_SIGNING_KEY is exported as a CI/CD default value of type variable (that is why the keyfile is created on the fly); git is configured to use it if the variable is not empty.
Warning: Keep in mind that the variables GITLAB_REPOSITORY_TOKEN and SSH_SIGNING_KEY contain secrets, so it is probably a good idea to make them protected (if you do that you have to make the develop, main and release/* branches protected too).
Warning: The semantic-release user has to be able to push to all the projects on those protected branches; it is a good idea to create a dedicated user and add it as a MAINTAINER for the projects we want (the MAINTAINERs need to be able to push to the branches), or, if you are using a GitLab instance with a Premium license, you can use the API to allow the semantic-release user to push to the protected branches without allowing it for any other user.

The semantic-release command

Once we have the .releaserc file and the git configuration ready we run the semantic-release command. If the branch we are working with has one or more commits that will increment the version, the tool does the following (note that the steps described are the ones executed with the configuration we have generated):
  1. It detects the commits that will increment the version and calculates the next version number.
  2. Generates the release notes for the version.
  3. Applies the replacements defined on the configuration (in our example updates the version field on the package.json file).
  4. Updates the CHANGELOG.md file adding the release notes if we are going to publish the file (when we are on the main branch).
  5. Creates a commit if all or some of the files listed on the assets key have changed and uses the commit message we have defined, replacing the variables for their current values.
  6. Creates a tag with the new version number and the release notes.
  7. As we are using the gitlab plugin, after tagging it also creates a release on the project with the tag name and the release notes.

Notes about the git workflows and merges between branches

It is very important to remember that semantic-release looks at the commits of a given branch when calculating the next version to publish; that has two important implications:
  1. On pre release branches we need to have the commit that includes the tag with the released version; if we don't have it, the next version is not calculated correctly.
  2. It is a bad idea to squash commits when merging a branch into another one; if we do that we will lose the information semantic-release needs to calculate the next version, and even if we use the right prefix for the squashed commit (fix, feat, ...) we miss all the messages that would otherwise go to the CHANGELOG.md file.
To make sure that we have the right commits on the pre release branches we should merge the main branch changes into the develop one after each release tag is created; in my pipelines the first job that processes a release tag creates a branch from the tag and an MR to merge it to develop. The important thing about that MR is that it must not be squashed; if we do that the tag commit will probably be lost, so we need to be careful. To merge the changes directly we can run the following code:
# Set the SR_TAG variable to the tag you want to process
SR_TAG="v1.3.2"
# Fetch all the changes
git fetch --all --prune
# Switch to the main branch
git switch main
# Pull all the changes
git pull
# Switch to the development branch
git switch develop
# Pull all the changes
git pull
# Create followup branch from tag
git switch -c "followup/$SR_TAG" "$SR_TAG"
# Change files manually & commit the changed files
git commit -a --untracked-files=no -m "ci(followup): $SR_TAG to develop"
# Switch to the development branch
git switch develop
# Merge the followup branch into the development one using the --no-ff option
git merge --no-ff "followup/$SR_TAG"
# Remove the followup branch
git branch -d "followup/$SR_TAG"
# Push the changes
git push
If we can't push directly to develop we can create an MR pushing the followup branch after committing the changes, but we have to make sure that we don't squash the commits when merging or it will not work as we want.
