Search Results: "jelmer"

24 March 2024

Niels Thykier: debputy v0.1.21

Earlier today, I released debputy version 0.1.21 to Debian unstable. In this blog post, I will highlight some of the new features.
Package boilerplate reduction with automatic relationship substvar Last month, I started a discussion on rethinking how we do relationship substvars such as ${misc:Depends}. These generally end up being boilerplate runes in the form of Depends: ${misc:Depends}, ${shlibs:Depends}, where you as the packager have to remember exactly which runes apply to your package. My proposed solution was to apply these substvars automatically, and this feature has now been implemented in debputy. It is also combined with the feature where essential packages should use Pre-Depends by default for dpkg-shlibdeps related dependencies.

I am quite excited about this feature, because I noticed with libcleri that we are now down to 3-5 fields for defining a simple library package - especially since most C library packages are trivial enough that debputy can auto-derive them to be Multi-Arch: same. As an example, the libcleric1 package is down to 3 fields (Package, Architecture, Description), with Section and Priority being inherited from the Source stanza. I have submitted an MR to showcase the boilerplate reduction at https://salsa.debian.org/siridb-team/libcleri/-/merge_requests/3. The removal of libcleric1 (= ${binary:Version}) in that MR relies on another existing feature, where debputy can auto-derive a dependency between an arch:any -dev package and the library package based on the .so symlink for the shared library. The arch:any restriction comes from the fact that arch:all and arch:any packages are not built together, so debputy cannot reliably see across the package boundaries during the build (and therefore refuses to do so at all).

Packages that have already migrated to debputy can use debputy migrate-from-dh to detect any unnecessary relationship substitution variables in case you want to clean up. The removal of Multi-Arch: same and intra-source dependencies must be done manually, and should only be done once you have validated that it is safe and sane to do. I was willing to do it for the show-case MR, but I am less confident that I would bother with these for existing packages in general.

Note: I summarized the discussion of the automatic relationship substvar feature earlier this month in https://lists.debian.org/debian-devel/2024/03/msg00030.html for those who want more details.

PS: The automatic relationship substvars feature will also appear in debhelper as a part of compat 14.
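For illustration, a binary package stanza at that level of reduction would look something like the following sketch (the description here is a placeholder, not the real libcleric1 text):

Package: libcleric1
Architecture: any
Description: <one-line description of the library>

The Depends line with its substvar runes is gone entirely, debputy derives Multi-Arch: same and the dpkg-shlibdeps related dependencies during the build, and Section/Priority come from the Source stanza.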
Language Server (LSP) and Linting I have long been frustrated by our poor editor support for Debian packaging files. To this end, I started working on a Language Server (LSP) feature in debputy that would cover some of our standard Debian packaging files. This release includes the first version of said language server, which covers the following files:
  • debian/control
  • debian/copyright (the machine readable variant)
  • debian/changelog (mostly just spelling)
  • debian/rules
  • debian/debputy.manifest (syntax checks only; use debputy check-manifest for the full validation for now)
Most of the effort has been spent on the Deb822-based files such as debian/control, which come with diagnostics, quickfixes, spellchecking (but only for relevant fields!), and completion suggestions. Since not everyone has an LSP-capable editor, and because sometimes you just want diagnostics without having to open each file in an editor, there is also a batch version of the diagnostics via debputy lint. Please see debputy(1) for how debputy lint compares with lintian, if you are curious about which tool to use at what time.

To help you get started, there is now a debputy lsp editor-config command that can provide you with the relevant editor config glue. At the moment, emacs (via eglot) and vim with vim-youcompleteme are supported. For those that followed the previous blog posts on writing the language server, I would like to point out that the command line for running the language server has changed to debputy lsp server, and you no longer have to tell it which format it is. I have decided to make the language server a "polyglot" server for now, which I will hopefully not regret... Time will tell. :)

Anyhow, to get started, you will want:
$ apt satisfy 'dh-debputy (>= 0.1.21~), python3-pygls'
# Optionally, for spellchecking
$ apt install python3-hunspell hunspell-en-us
# For emacs integration
$ apt install elpa-dpkg-dev-el markdown-mode-el
# For vim integration via vim-youcompleteme
$ apt install vim-youcompleteme
Specifically for emacs, I also learned two things after the upload. First, you can auto-activate eglot via eglot-ensure. This feature interacts badly with imenu on debian/changelog for reasons I do not understand (causing a several-second start-up delay until something times out), but it works fine for the other formats. Oddly enough, opening a changelog file and then activating eglot does not trigger this issue at all. In the next version, the editor config for emacs will auto-activate eglot on all files except debian/changelog. The second thing is that if you install elpa-markdown-mode, emacs will accept and process markdown in the hover documentation provided by the language server. Accordingly, the editor config for emacs will also mention this package from the next version on.

Finally, on a related note, Jelmer and I have been looking at moving some of this logic into a new package called debpkg-metadata. The point being to support easier reuse of linting- and LSP-related metadata - like pulling a list of known fields for debian/control, or sharing logic between lintian-brush and debputy.
Minimal integration mode for Rules-Requires-Root One of the original motivators for starting debputy was to be able to get rid of fakeroot in our build process. While this is possible, debputy currently does not support most of the complex packaging features, such as maintscripts and debconf. Unfortunately, the kind of packages that need fakeroot for static ownership tend to also require very complex packaging features. To bridge this gap, the new version of debputy supports a very minimal integration with dh via the dh-sequence-zz-debputy-rrr sequence. This integration mode keeps the vast majority of the debhelper sequence in place, meaning most dh add-ons will continue to work with dh-sequence-zz-debputy-rrr. The sequence only replaces the following commands:
  • dh_fixperms
  • dh_gencontrol
  • dh_md5sums
  • dh_builddeb
The installations feature of the manifest will be disabled in this integration mode, to avoid feature interactions with debhelper tools that expect debian/<pkg> to contain the materialized package. On a related note, the debputy migrate-from-dh command now supports a --migration-target option, so you can choose the desired level of integration without code changes. The command will attempt to auto-detect the desired integration from existing package features, such as a build-dependency on a relevant dh sequence, so you do not have to remember this new option every time once the migration has started. :)
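For reference, a sketch of how the sequence gets pulled in - dh sequences are activated via a build dependency, and the debhelper-compat version here is illustrative:

Build-Depends: debhelper-compat (= 13),
               dh-sequence-zz-debputy-rrr

With that in place, debian/rules stays a plain dh invocation; only the four commands listed above are replaced.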

1 November 2023

Paul Wise: FLOSS Activities October 2023

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Debian wiki: RecentChanges for the month
  • Debian BTS usertags: changes for the month
  • Debian screenshots:

Administration
  • Debian IRC: rescue obsolete/unused #debian-wiki channel
  • Debian servers: rescue data from an old DebConf server
  • Debian wiki: approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The SWH, golang-ginkgo, DBD-ODBC, sqliteodbc work was sponsored. All other work was done on a volunteer basis.

10 September 2023

Jelmer Vernooij: Transcontinental Race No 9

After cycling the Northcape 4000 (from Italy to northern Norway) last year, I signed up for the Transcontinental race this year. The Transcontinental is a bikepacking race across Europe, self-routed (but with some mandatory checkpoints), unsupported, and with a distance of usually somewhere around 4000 km. The cut-off time is 15 days, with the winner usually taking 7-10 days. This year, the route went from Belgium to Thessaloniki in Greece, with control points in northern Italy, Slovenia, Albania and Meteora (Greece). The event was great - it was well organised and communication was a lot better than at the Northcape. It did feel very different from the Northcape, though, being a proper race. Participants are not allowed to draft off each other or help each other, though a quick chat here or there as you pass people is possible, or when you're both stopped at a shop or control point.
My experience The route was beautiful - the first bit through France was a bit monotonous, but the views in the Alps especially were amazing. Like with other long events, the first day or two can be hard, but once you get into the rhythm of things it's a lot easier.

From early on, I lost a lot of time. We started in the rain, and I ran several flats in a row, just 4 hours in. In addition to that, the thread on my pump had worn so it wouldn't fit on some of my spare tubes, and my tubes were all TPU - which are hard to patch. So at 3 AM I found myself by the side of an N-road in France without any usable tubes to put in my rear wheel. I ended up walking 20 km to the nearest town with a bike shop, where they fortunately had good old butyl tubes and a working pump. Overall, this cost me about 12 hours in total.

In addition to that, my time management wasn't great. On previous rides, I'd usually gotten about 8 hours of sleep per night while staying in hotels. On the Transcontinental I had meant to get less sleep but still stay in hotels most nights, but I found that not all hotels accommodated that well - especially with a bike. So I ended up getting more sleep than I had intended, and spending more time off the bike than I had planned - close to 11 or 12 hours per day. I hadn't scheduled much time off work after the finish either, so arriving in Greece late wasn't really an option.

And then, on an early morning in Croatia (about 2000 km in), in heavy fog, I rode into a kerb at 35 km/h, bending the rim of my front wheel (but fortunately not coming off my bike). While I probably would have been able to continue with a replacement wheel (and mailed the broken one home), that would have taken another day to sort out, and I almost certainly wouldn't have been able to source a new dynamo wheel in Croatia - which would have made night-time riding a lot harder. So I decided to scratch and take the train home from Zagreb.

Overall, I really enjoyed the event and I think I've learned some useful lessons. I'll probably try again next year.

2 June 2023

Jelmer Vernooij: Porting Python projects to Rust

I've recently been working on porting some of my Python code to Rust, both for performance reasons and because of the strong typing in the language. As a fan of Haskell, I also just really enjoy using the language. Porting any large project to a new language can be a challenge. There is a temptation to do a rewrite from the ground up in idiomatic Rust, using all the fancy new features of the language.
Porting in one go However, this is a bit of a trap:
  • It blocks other work. It can take a long time to finish the rewrite, during which time there is no good place to make other bug fixes/feature changes. If you make the change in the Python branch, then you may also have to patch the in-progress Rust fork.
  • No immediate return on investment. While the rewrite is happening, all of the investment in it is sunk costs.
  • Throughout the process, you can only run the tests for subsystems that have already been ported. It's common to find subtle bugs later in code that was ported early.
  • Understanding existing code, porting it and making it idiomatic Rust all at the same time takes more time and leads to more post-facto debugging.
Iterative porting Instead, we've found that it works much better to take an iterative approach. One of the hidden gems of Rust is the excellent PyO3 crate, which allows creating Python bindings for Rust code in a way that is several times less verbose and less painful than C or SWIG. Because of Rust's strong ownership model, it's also really hard to muck up e.g. reference counts when creating Python bindings for Rust code.

We port individual functions or classes to Rust one at a time, starting with functionality that doesn't have dependencies on other Python code and gradually working our way up the call stack. Each subsystem of the code is converted to two matching Rust crates: one with a port of the code to pure Rust, and one with Python bindings for the Rust code. Generally, multiple Python modules end up being a single pair of Rust crates.

The signatures for the pure Rust code follow Rust conventions, but the business logic is mostly ported as-is (just in Rust syntax), and the signatures of the Python bindings match those of the original Python code. This then allows running the original Python tests to verify that the code still behaves the same way. Changes can also immediately land on the main branch.

A subsequent step is usually to refactor the Rust code to be more idiomatic - all the while keeping the tests passing. There is also the potential to e.g. switch to using external Rust crates (with perhaps subtly different behaviour), or drop functionality altogether. At some point, we will also port the tests from Python to Rust, and potentially drop the Python bindings - once all the callers have been converted to Rust.
Example For example, imagine I have a Python module janitor/mail_filter.py with this function:
def parse_plain_text_body(text):
   lines = text.splitlines()
   for i, line in enumerate(lines):
       if line == 'Reply to this email directly or view it on GitHub:':
           return lines[i + 1].split('#')[0]
       if (line == 'For more details, see:'
               and lines[i + 1].startswith('https://code.launchpad.net/')):
           return lines[i + 1]
       try:
           (field, value) = line.split(':', 1)
       except ValueError:
           continue
       if field.lower() == 'merge request url':
           return value.strip()
   return None
Porting this naively to Rust (in a crate I've called mailfilter), it might look something like this:
pub fn parse_plain_text_body(text: &str) -> Option<String> {
    let lines: Vec<&str> = text.lines().collect();
    for (i, line) in lines.iter().enumerate() {
        if line == &"Reply to this email directly or view it on GitHub:" {
            return Some(lines[i + 1].split('#').next().unwrap().to_string());
        }
        if line == &"For more details, see:"
            && lines[i + 1].starts_with("https://code.launchpad.net/")
        {
            return Some(lines[i + 1].to_string());
        }
        if let Some((field, value)) = line.split_once(':') {
            if field.to_lowercase() == "merge request url" {
                return Some(value.trim().to_string());
            }
        }
    }
    None
}
Bindings are created in a crate called mailfilter-py, which looks like this:
use pyo3::prelude::*;

#[pyfunction]
fn parse_plain_text_body(text: &str) -> Option<String> {
    janitor_mail_filter::parse_plain_text_body(text)
}

#[pymodule]
pub fn _mail_filter(py: Python, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(parse_plain_text_body, m)?)?;
    Ok(())
}
The metadata for the crates is what you'd expect. mailfilter-py uses PyO3 and depends on mailfilter.
[package]
name = "mailfilter-py"
version = "0.0.0"
authors = ["Jelmer Vernooij <jelmer@jelmer.uk>"]
edition = "2018"

[lib]
crate-type = ["cdylib"]

[dependencies]
janitor-mail-filter = { path = "../mailfilter" }
pyo3 = { version = ">=0.14", features = ["extension-module"] }
I use python-setuptools-rust to get the Python ecosystem to build the Python bindings. Here is what setup.py looks like:
#!/usr/bin/python3
from setuptools import setup
from setuptools_rust import RustExtension, Binding

setup(
    rust_extensions=[RustExtension(
        "janitor._mailfilter", "crates/mailfilter-py/Cargo.toml",
        binding=Binding.PyO3)],
)
And of course, setuptools-rust needs to be listed as a setup requirement in pyproject.toml or setup.cfg. After that, we can replace the original Python code with a simple import and verify that the tests still run:
from ._mailfilter import parse_plain_text_body
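As a quick illustration of why this workflow is pleasant, a test along these lines (hypothetical - not from the janitor test suite) passes unchanged against both the pure-Python and the Rust-backed implementation:

# Hypothetical test: the Rust-backed binding behaves exactly like the
# original Python implementation for a GitHub notification body.
from janitor._mailfilter import parse_plain_text_body

def test_github_reply():
    body = (
        "Reply to this email directly or view it on GitHub:\n"
        "https://github.com/example/repo/pull/1#issuecomment-2\n"
    )
    assert parse_plain_text_body(body) == "https://github.com/example/repo/pull/1"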
Of course, not all bindings are as simple as this. Iterators in particular are more complicated, as is code that has a loose idea of ownership in Python. But I've found that the time investment is usually well worth the ability to land changes on the development head early and often. I'd be curious to hear if people have had success with other approaches to porting Python code to Rust. If you do, please leave a comment.

8 March 2023

Jelmer Vernooij: The Kali Janitor

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. Kali Linux has been running their own instance of the Janitor for the last year, under the kali-bot user on GitLab. Their web site has some excellent documentation explaining how the bot works. Both projects share some common components - the core janitor codebase, Silver-Platter and the various codemods (lintian-brush and deb-new-upstream). The site and some of the review logic are different for Kali. The Kali bot has several campaigns:

The last campaign doesn't exist in the Debian janitor; it pulls in new changes from packages that have been imported from other distributions.

For more information about the Janitor's lintian-fixes efforts, see the landing page.

25 February 2023

Jelmer Vernooij: Silver Platter Batch Mode

Background Silver-Platter makes it easier to publish automated changes to repositories. However, in its default mode, the only option for reviewing changes before publishing them is to run in dry-run mode. This can be quite cumbersome if you have a lot of repositories. A new batch mode now makes it possible to generate a large number of changes against different repositories using a script, review and optionally alter the diffs, and then publish them all (and potentially refresh them later if conflicts appear).
Example running pyupgrade I'm using the pyupgrade example recipe that comes with silver-platter.
 ---
 name: pyupgrade
 command: 'pyupgrade --exit-zero-even-if-changed $(find -name "test_*.py")'
 mode: propose
 merge-request:
   commit-message: Upgrade Python code to a modern version
And a list of candidate repositories to process in candidates.yaml.
 ---
 - url: https://github.com/jelmer/dulwich
 - url: https://github.com/jelmer/xandikos
With these in place, the updated repositories can be created:
 $ svp batch generate --recipe=pyupgrade.yaml --candidates=candidates.yaml pyupgrade
The intermediate results This will create a directory called pyupgrade, with a clone of each of the repositories.
$ ls pyupgrade
batch.yaml  dulwich  xandikos
$ cd pyupgrade/dulwich
$ git log
commit 931f9ffb26e9143c56f20e0b85e6ddb0a8eee2eb (HEAD -> master)
Author: Jelmer Vernooij <jelmer@jelmer.uk>
Date:   Sat Feb 25 22:28:12 2023 +0000
Run pyupgrade
diff --git a/dulwich/tests/compat/test_client.py b/dulwich/tests/compat/test_client.py
index 02ab6c0a..9b0661ed 100644
--- a/dulwich/tests/compat/test_client.py
+++ b/dulwich/tests/compat/test_client.py
@@ -628,7 +628,7 @@ class HTTPGitServer(http.server.HTTPServer):
         self.server_name = "localhost"
     def get_url(self):
-        return "http://{}:{}/".format(self.server_name, self.server_port)
+        return f"http://{self.server_name}:{self.server_port}/"
 class DulwichHttpClientTest(CompatTestCase, DulwichClientTestBase):
...
There is also a file called batch.yaml that describes the pending changes:
name: pyupgrade
work:
- url: https://github.com/dulwich/dulwich
  name: dulwich
  description: Upgrade to modern Python statements
  commit-message: Run pyupgrade
  mode: propose
- url: https://github.com/jelmer/xandikos
  name: xandikos
  description: Upgrade to modern Python statements
  commit-message: Run pyupgrade
  mode: propose
recipe: ../pyupgrade.yaml
At this point the changes can be reviewed, and batch.yaml edited as the user sees fit - they can remove entries that don't appear to be correct, edit the metadata for the merge requests, etc. It's also possible to make changes to the clones. Once you're happy, publish the results:
$ svp batch publish pyupgrade
This will publish all the changes, using the mode and parameters specified in batch.yaml. batch.yaml is automatically stripped of any entries in work that have fully landed, i.e. where the pull request has been merged or where the changes were pushed to the origin. To check up on the status of your changes, run svp batch status:
$ svp batch status pyupgrade
To refresh any merge proposals that may have become out of date, simply run publish again:
svp batch publish pyupgrade

28 November 2022

Jelmer Vernooij: Detecting Package Transitions

Larger transitions in Debian are usually announced on e.g. debian-devel, but it's harder to track the current status of all transitions. Having done a lot of QA uploads recently, I have on occasion uploaded packages involved in a transition. This can be unhelpful for the people handling the transition, but there's also often not much point in uploading if your uploads are going to get stuck. Talking to one of the release managers at a recent BSP, it was great to find out that the release team actually publishes a data dump with which packages are involved in which transitions. Here's the script I use to find out about the transitions the package in my current working directory is involved in:
#!/usr/bin/python3
from urllib.request import urlopen
import sys

from debian.deb822 import Deb822
import yaml

with open('debian/control', 'r') as f:
    package = Deb822(f)['Source']

with urlopen("https://release.debian.org/transitions/export/packages.yaml") as f:
    data = yaml.safe_load(f)

def find_transitions(data, package):
    for entry in data:
        if entry['name'] != package:
            continue
        return dict(entry['list'])
    return {}

transitions = find_transitions(data, package)
print(transitions)
sys.exit(1 if 'ongoing' in transitions.values() else 0)
In practice, the output looks something like this:
$ debcheckout bctoolbox
git clone https://salsa.debian.org/pkg-voip-team/linphone-stack/bctoolbox.git bctoolbox ...
Cloning into 'bctoolbox'...
...
$ cd bctoolbox
$ in-transition.py
{'auto-upperlimit-libbctoolbox1': 'ongoing'}

28 September 2022

Jelmer Vernooij: Northcape 4000

This summer, I signed up to participate in the Northcape 4000 <https://www.northcape4000.com/>, an annual 4000 km bike ride between Rovereto (in northern Italy) and the northernmost point of Europe, the North Cape. The Northcape event has been held for several years, and while it always ends on the North Cape, the route there varies. Last year's route went through the Baltics, but this year's was perhaps as direct as possible - taking us through Italy, Austria, Switzerland, Germany, the Czech Republic, Germany again, Sweden, Finland and finally Norway. The ride is unsupported, meaning you have to find your own food and accommodation and can only avail yourself of resupply and sleeping options on the route that are available to everybody else as well. The event is not meant to be a race (unlike the Transcontinental, which starts on the same day), so there is a minimum time to finish it in (10 days) and a maximum (21 days). Unfortunately, this meant skipping some other events I'd wanted to attend (DebConf, MCH).

4 January 2022

Jelmer Vernooij: Personal Streaming Audio Server

For a while now, I've been looking for a good way to stream music from my home music collection on my phone. There are quite a few options for music servers that support streaming. However, Android apps that can stream music from one of those servers tend to be unmaintained, clunky or slow (or more than one of those). It is possible to use something that runs in a web browser, but that means no offline caching - which can be quite convenient in spots without connectivity, such as the Underground or other random bits of London with poor cell coverage.
Server Most music servers today support some form of the Subsonic API. I've tried a couple, with mixed results:
  • supysonic; Python. Slow. Ran into some issues with subsonic clients. No real web UI.
  • gonic; Go. Works well & fast enough. Minimal web UI, i.e. no ability to play music from a browser.
  • airsonic; Java. Last in a chain of (abandoned) forks. More effort to get to work, and resource intensive.
Eventually, I settled on Navidrome. It's got a couple of things going for it:
  • Good subsonic implementation that worked with all the Android apps I used it with.
  • Great Web UI for use in a browser
I run Navidrome in Kubernetes. It's surprisingly easy to get going. Here's the deployment I'm using:
apiVersion: apps/v1
kind: Deployment
metadata:
 name: navidrome
spec:
 replicas: 1
 selector:
   matchLabels:
     app: navidrome
 template:
   metadata:
     labels:
       app: navidrome
   spec:
     containers:
       - name: navidrome
         image: deluan/navidrome:latest
         imagePullPolicy: Always
         resources:
           limits:
             cpu: ".5"
             memory: "2Gi"
           requests:
             cpu: "0.1"
             memory: "10M"
         ports:
           - containerPort: 4533
         volumeMounts:
           - name: navidrome-data-volume
             mountPath: /data
           - name: navidrome-music-volume
             mountPath: /music
         env:
           - name: ND_SCANSCHEDULE
             value: 1h
           - name: ND_LOGLEVEL
             value: info
           - name: ND_SESSIONTIMEOUT
             value: 24h
           - name: ND_BASEURL
             value: /navidrome
         livenessProbe:
            httpGet:
              path: /navidrome/app
              port: 4533
            initialDelaySeconds: 30
            periodSeconds: 3
            timeoutSeconds: 90
     volumes:
        - name: navidrome-data-volume
          hostPath:
           path: /srv/navidrome
           type: Directory
        - name: navidrome-music-volume
          hostPath:
            path: /srv/media/music
            type: Directory
---
apiVersion: v1
kind: Service
metadata:
  name: navidrome
spec:
  ports:
    - port: 4533
      name: web
  selector:
    app: navidrome
  type: ClusterIP
At the moment, this deployment is still tied to the machine with my music on it, since it relies on hostPath volumes, but I'm planning to move that to Ceph in the future. I then expose this service on /navidrome on my private domain (here replaced with example.com) using an Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: navidrome
spec:
  ingressClassName: nginx
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          service:
            name: navidrome
            port:
              name: web
        path: /navidrome(/|$)(.*)
        pathType: Prefix
Client On the desktop, I usually just use Navidrome's web interface. Clementine's support for Subsonic is also okay. sublime-music is meant to be a music player specifically for Subsonic, but I've not really found it stable enough for day-to-day usage. There are various Android clients for Subsonic, but I've only really considered the open source ones that are hosted on F-Droid. Most of those are abandoned, but D-Sub works pretty well - as does my preferred option, Subtracks.

Jonathan McDowell: Upgrading from a CC2531 to a CC2538 Zigbee coordinator

Previously I set up a CC2531 as a Zigbee coordinator for my home automation. This has turned out to be a good move, with the 4-gang wireless switch being particularly useful. However, the range of the CC2531 is fairly poor; it has a simple PCB antenna. It's also a very basic device. I set about trying to improve the range and scalability, and settled upon a CC2538 + CC2592 device, which features an MMCX antenna connector. This device also has the advantage that it's ARM based, which I'm hopeful means I might be able to build some firmware myself using a standard GCC toolchain. For now I fetched the JetHome firmware from https://github.com/jethome-ru/zigbee-firmware/tree/master/ti/coordinator/cc2538_cc2592 (JH_2538_2592_ZNP_UART_20211222.hex) - while it's possible to do USB directly with the CC2538, my board doesn't have those bits, so going the external USB UART route is easier. The device had some existing firmware on it, so I needed to erase this to force a drop into the boot loader. That meant soldering up the JTAG pins and hooking it up to my Bus Pirate for OpenOCD goodness.
OpenOCD config
source [find interface/buspirate.cfg]
buspirate_port /dev/ttyUSB1
buspirate_mode normal
buspirate_vreg 1
buspirate_pullup 0
transport select jtag
source [find target/cc2538.cfg]
Steps to erase
$ telnet localhost 4444
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
> mww 0x400D300C 0x7F800
> mww 0x400D3008 0x0205
> shutdown
shutdown command invoked
Connection closed by foreign host.
At that point I can switch to the UART connection (on PA0 + PA1) and flash using cc2538-bsl:
$ git clone https://github.com/JelmerT/cc2538-bsl.git
$ cc2538-bsl/cc2538-bsl.py -p /dev/ttyUSB1 -e -w -v ~/JH_2538_2592_ZNP_UART_20211222.hex
Opening port /dev/ttyUSB1, baud 500000
Reading data from /home/noodles/JH_2538_2592_ZNP_UART_20211222.hex
Firmware file: Intel Hex
Connecting to target...
CC2538 PG2.0: 512KB Flash, 32KB SRAM, CCFG at 0x0027FFD4
Primary IEEE Address: 00:12:4B:00:22:22:22:22
    Performing mass erase
Erasing 524288 bytes starting at address 0x00200000
    Erase done
Writing 524256 bytes starting at address 0x00200000
Write 232 bytes at 0x0027FEF88
    Write done
Verifying by comparing CRC32 calculations.
    Verified (match: 0x74f2b0a1)
I then wanted to migrate from the old device to the new one without having to re-pair everything. So I shut down Home Assistant and backed up the CC2531 network information using zigpy-znp (which is already installed for Home Assistant):
python3 -m zigpy_znp.tools.network_backup /dev/zigbee > cc2531-network.json
I copied the backup to cc2538-network.json and modified the coordinator_ieee to be the new device's MAC address (rather than end up with 2 devices claiming the same MAC if/when I reuse the CC2531).
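That edit is small enough to sketch in Python - the IEEE address here is an example value, and the exact address format should match whatever zigpy-znp wrote into the backup:

import json

# Point the backup at the new coordinator's IEEE address so the old
# CC2531 can be reused later without two devices claiming the same MAC.
with open('cc2531-network.json') as f:
    backup = json.load(f)
backup['coordinator_ieee'] = '00:12:4b:00:22:22:22:22'  # example address
with open('cc2538-network.json', 'w') as f:
    json.dump(backup, f, indent=2)

With the edited backup in place, I restored it onto the new device: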
python3 -m zigpy_znp.tools.network_restore --input cc2538-network.json /dev/ttyUSB1
The old CC2531 needed to be unplugged first, otherwise I got a RuntimeError: Network formation refused, RF environment is likely too noisy. Temporarily unscrew the antenna or shield the coordinator with metal until a network is formed. error. After that I updated my udev rules to map the CC2538 to /dev/zigbee and restarted Home Assistant. To my surprise it came up and detected the existing devices without any extra effort on my part. However, that resulted in 2 coordinators being shown in the visualisation, with the old one turning up as unk_manufacturer. Fixing that involved editing /etc/homeassistant/.storage/core.device_registry and removing the entry which had the old MAC address, removing the device entry in /etc/homeassistant/.storage/zha.storage for the old MAC, and then finally firing up sqlite to modify the Zigbee database:
$ sqlite3 /etc/homeassistant/zigbee.db
SQLite version 3.34.1 2021-01-20 14:10:07
Enter ".help" for usage hints.
sqlite> DELETE FROM devices_v6 WHERE ieee = '00:12:4b:00:11:11:11:11';
sqlite> DELETE FROM endpoints_v6 WHERE ieee = '00:12:4b:00:11:11:11:11';
sqlite> DELETE FROM in_clusters_v6 WHERE ieee = '00:12:4b:00:11:11:11:11';
sqlite> DELETE FROM neighbors_v6 WHERE ieee = '00:12:4b:00:11:11:11:11' OR device_ieee = '00:12:4b:00:11:11:11:11';
sqlite> DELETE FROM node_descriptors_v6 WHERE ieee = '00:12:4b:00:11:11:11:11';
sqlite> DELETE FROM out_clusters_v6 WHERE ieee = '00:12:4b:00:11:11:11:11';
sqlite> .quit
So far it all seems a bit happier than with the CC2531; I've been able to pair a light bulb that was previously detected but would not integrate, which suggests the range is improved. (This post is another in the set of things I should write down so I can just grep my own website when I forget what I did to do foo.)

6 September 2021

Jelmer Vernooij: Web Hooks for the Janitor

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. As covered in my post from last week, the Janitor now regularly tries to import new upstream git snapshots or upstream releases into packages in Sid.

Moving parts There are about 30,000 packages in sid, and it usually takes a couple of weeks for the janitor to cycle through all of them. Generally speaking, there are up to three moving targets for each package:
  • The packaging repository; vcswatch regularly scans this for changes, and notifies the janitor when a repository has changed. For salsa repositories it is instantly notified through a web hook
  • The upstream release tarballs; the QA watch service regularly polls these, and the janitor scans for changes in the UDD tables with watch data (used for fresh-releases)
  • The upstream repository; there is no service in Debian that watches this at the moment (used for fresh-snapshots)
When the janitor notices that one of these three targets has changed, it prioritizes processing of a package - this means that a push to a packaging repository on salsa usually leads to a build being kicked off within 10 minutes. New upstream releases are usually noticed by QA watch within a day or so and then lead to a build. New commits in upstream repositories don't get noticed today. Note that there are no guarantees; the scheduler tries to be clever and not e.g. rebuild the same package over and over again if it's constantly changing and takes a long time to build. Packages without priority are processed with a scoring system that takes into account perceived value (based on e.g. popcon), cost (based on wall-time duration of previous builds) and likelihood of success (whether recent builds were successful, and how frequently the repositories involved change).
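A toy version of such a scoring function - illustrative only, not the janitor's actual scheduler code - might look like this:

# Toy scoring heuristic: perceived value up, build cost down, weighted
# by an estimate of how likely the next build is to succeed.
def score(popcon_installs: int, avg_build_seconds: float,
          recent_successes: int, recent_attempts: int) -> float:
    value = popcon_installs                     # perceived value
    cost = max(avg_build_seconds, 1.0)          # wall-time cost of previous builds
    # Laplace smoothing so brand-new packages are not starved
    success_rate = (recent_successes + 1) / (recent_attempts + 2)
    return value * success_rate / cost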
webhooks for upstream repositories At the moment there is no service in Debian (yet - perhaps this is something that vcswatch or a sibling service could also do?) that scans upstream repositories for changes. However, if you maintain an upstream package, you can use a webhook to notify the janitor that commits have been made to your repository, and it will create a new package in fresh-snapshots. Webhooks from the hosting site software listed below are currently supported. You can simply use the URL https://janitor.debian.net/ as the target for hooks. There is no need to specify a secret, and the hook can use either a JSON or form encoded payload. The endpoint should tell you whether it understood a webhook request, and whether it took any action. It's fine to submit webhooks for repositories that the janitor does not (yet) know about.
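You can also hit the endpoint by hand; a minimal sketch in Python - the payload shape below mimics a generic push hook and is an assumption on my part, since the endpoint is built to accept whatever the supported hosters send:

import json
from urllib.request import Request, urlopen

# Tell the Janitor that an upstream repository has new commits.
payload = {"repository": {"url": "https://github.com/dulwich/dulwich"}}
req = Request(
    "https://janitor.debian.net/",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    print(resp.status, resp.read().decode("utf-8"))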
GitHub For GitHub, you can do so in the Webhooks section of the Settings tab. Fill the form as shown below and click on Add webhook:
GitLab On GitLab instances, you can find the Webhooks tab under the Settings menu for each repository (under the gear symbol). Fill the form in as shown below and click Add Webhook:
Launchpad For Launchpad, go to the repository (for Git) web view and click Manage Webhooks. From there, you can add a new webhook; fill the form in as shown below and click Add Webhook:

25 August 2021

Jelmer Vernooij: Thousands of Debian packages updated from their upstream Git repository

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. Linux distributions like Debian fulfill an important function in the FOSS ecosystem - they are system integrators that take existing free and open source software projects and adapt them where necessary to work well together. They also make it possible for users to install more software in an easy and consistent way and with some degree of quality control and review. One of the consequences of this model is that the distribution package often lags behind upstream releases. This is especially true for distributions that have tighter integration and standardization (such as Debian), and often new upstream code is only imported irregularly because it is a manual process - both updating the package, but also making sure that it still works together well with the rest of the system. The process of importing a new upstream version used to be (well, back when I started working on Debian packages) fairly manual, and went something like this:

Ecosystem Improvements However, there have been developments over the last decade that make it easier to import new upstream releases into Debian packages.
Uscan and debian QA watch Uscan and debian/watch have been around for a while and make it possible to find upstream tarballs. A debian/watch file usually looks something like this:
version=4
http://somesite.com/dir/filenamewithversion.tar.gz
The QA watch service regularly polls all watch locations in the archive and makes the information available, so it's possible to know which packages have changed without downloading each one of them.
Git Git is fairly ubiquitous nowadays, and most upstream projects and packages in Debian use it. There are still exceptions that do not use any version control system or that use a different control system, but they are becoming increasingly rare. [1]
debian/upstream/metadata DEP-12 specifies a file format with metadata about the upstream project that a package was based on. Particularly relevant for our case is the fact that it has fields for the location of the upstream version control repository. debian/upstream/metadata files look something like this:
---
Repository: https://www.dulwich.io/code/dulwich/
Repository-Browse: https://www.dulwich.io/code/dulwich/
While DEP-12 is still a draft, it has already been widely adopted - there are about 10000 packages in Debian that ship a debian/upstream/metadata file with Repository information.
Autopkgtest The Autopkgtest standard and associated tooling provide a way to run a defined set of tests against an installed package. This makes it possible to verify that a package is working correctly as part of the system as a whole. ci.debian.net regularly runs these tests against Debian packages to detect regressions.
Vcs-Git headers The Vcs-Git headers in debian/control are the equivalent of the Repository field in debian/upstream/metadata, but for the packaging repositories (as opposed to the upstream ones). They've been around for a while and are widely adopted, as can be seen from zack's stats. The vcswatch service that regularly polls packaging repositories to see whether they have changed makes it a lot easier to consume this information in a usable way.
Debhelper adoption Over the last couple of years, Debian has slowly been converging on a single build tool - debhelper's dh interface. Being able to rely on a single build tool makes it easier to write code to update packaging when upstream changes require it.
Debhelper DWIM Debhelper (and its helpers) can increasingly figure out how to do the Right Thing in many cases without being explicitly configured. This makes packaging less effort, but also means that it's less likely that importing a new upstream version will require updates to the packaging. With all of these improvements in place, it actually becomes feasible in a lot of situations to update a Debian package to a new upstream version automatically. Of course, this requires that all of this information is available, so it won't work for all packages. In some cases, the packaging for the older upstream version might not apply to the newer upstream version. The Janitor has attempted to import a new upstream Git snapshot and a new upstream release for every package in the archive where a debian/watch file or debian/upstream/metadata file is present. These are the steps it uses (see the sketch after this list):
  • Find new upstream version
    • If release, use debian/watch - or maybe tagged in upstream repository
    • If snapshot, use debian/upstream/metadata's Repository field
    • If neither is available, use guess-upstream-metadata from upstream-ontologist to guess the upstream Repository
  • Merge upstream version into packaging repository, possibly importing tarballs using pristine-tar
  • Update the changelog file to mention the new upstream version
  • Run some checks to ensure there are no unintentional changes, e.g.:
    • Scan diff between old and new for surprising license changes
      • Today, abort if there are any - in the future, maybe update debian/copyright
    • Check for obvious compatibility breaks - e.g. sonames changing
  • Attempt to update the packaging to reflect upstream changes
    • Refresh patches
  • Attempt to build the package with deb-fix-build, to deal with any missing dependencies
  • Run the autopkgtests with deb-fix-build to deal with missing dependencies, and abort if any tests fail
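Condensed into code, the flow looks roughly like the following sketch - every helper function here is a hypothetical stand-in for a Janitor component, not a real API:

# High-level sketch of the new-upstream import pipeline described above.
def import_new_upstream(pkg: str, snapshot: bool = False) -> str:
    version = find_new_upstream_version(pkg, snapshot)  # watch file / metadata / guess
    if version is None:
        return "nothing-to-do"
    merge_upstream(pkg, version)        # possibly importing tarballs via pristine-tar
    update_changelog(pkg, version)
    if surprising_license_changes(pkg) or compatibility_breaks(pkg):
        return "aborted"                # e.g. sonames changing
    refresh_patches(pkg)
    build_with_deb_fix_build(pkg)       # resolves missing build dependencies
    if not run_autopkgtests(pkg):
        return "aborted"
    return "updated"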
Results When run over all packages in unstable (sid), this process works for a surprising number of them.
Fresh Releases For fresh-releases (aka imports of upstream releases), processing all packages maintained in Git for which QA watch reports new releases (about 11,000), about 2300 packages were updated and about 4000 were unchanged.
Fresh Snapshots For fresh-snapshots (aka imports of the latest Git commit from upstream), processing all packages maintained in Git (about 26,000), 5100 packages were updated and 2100 had nothing to do, i.e. no upstream commits since the last Debian upload. As can be seen, this works for a surprising fraction of packages. It's possible to get the numbers up even higher, by improving the tooling, the autopkgtests and the metadata that is provided by packages.
Using these packages All the packages that have been built can be accessed from the Janitor APT repository. More information can be found at https://janitor.debian.net/fresh, but in short - run:
echo deb "[arch=amd64 signed-by=/usr/share/keyrings/debian-janitor-archive-keyring.gpg]" \
    https://janitor.debian.net/ fresh-snapshots main | sudo tee /etc/apt/sources.list.d/fresh-snapshots.list
echo deb "[arch=amd64 signed-by=/usr/share/keyrings/debian-janitor-archive-keyring.gpg]" \
    https://janitor.debian.net/ fresh-releases main | sudo tee /etc/apt/sources.list.d/fresh-releases.list
sudo curl -o /usr/share/keyrings/debian-janitor-archive-keyring.gpg https://janitor.debian.net/pgp_keys
apt update
And then you can install packages from the fresh-snapshots (upstream git snapshots) or fresh-releases suites on a case-by-case basis by running something like:
apt install -t fresh-snapshots r-cran-roxygen2
Most packages are updated based on information provided by vcswatch and QA watch, but it's also possible for upstream repositories to call a web hook to trigger a refresh of a package. These packages were built against unstable, but should in almost all cases also work for testing.
Caveats Of course, since these packages are built automatically without human supervision, it's likely that some of them will have bugs that would otherwise have been caught by the maintainer.
[1] I'm not saying that a monoculture is great here, but it does help distributions.

27 June 2021

Louis-Philippe Véronneau: Writing QA Scripts for Debian Teams

Since I joined the Debian Python Team, I have had a lot of fun working on different QA issues. Although I'm still a Perl illiterate [1], I've for example contributed to a few Lintian tags. There are multiple ways to make mass QA changes to team-managed packages. Projects like the Debian Janitor are more than fantastic: they make for a robust, thorough and automated way to fix QA issues in the archive, and I don't have enough good words to describe the amazing work of Jelmer Vernooij on the toolsuite the Janitor uses. But with robustness comes complexity. The Janitor is currently based on 10 different subtools (silver-platter, ognibuild, lintian-brush, ...) and if you want to use it to fix a bug, you first need to make sure there's a Lintian tag that flags the issue you're working on. Then you need to write a lintian-brush fixer to fix said issue. Sadly, sometimes writing a new Lintian tag to flag a trivial change is not the appropriate course of action and only creates clutter. All this to say that, until now, I was missing a "quick and somewhat dirty [2]" way to make simple one-off changes to a bunch of packages. 200 lines of Python later, I'm happy to report I have a simple way to replace the old Clojure Team email in d/control with the new one for all of our packages. Even better, although this script doesn't aim to be a versatile tool like the Janitor is, most of the functions can be reused for other similar one-off scripts. Many thanks to Felix Lechner for showing me the very handy Lintian Query JSON interface!

  1. I don't really enjoy coding in Perl, but it makes up so much of the current Debian infrastructure that I wish I did. I keep telling myself I should buy an "Introduction to Perl" book...
  2. A quick and dirty way to make those changes would've been to write a shell script, but one of my 2021 resolutions is to use Python for all my scripting needs.

14 May 2021

Jelmer Vernooij: Ognibuild

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. The FOSS world uses a wide variety of different build tools; given a git repository or tarball, it can be hard to figure out how to build and install a piece of software. Humans will generally know what build tool a project is using when they check out a project from git, or they can read the README. And even then, the answer may not always be straightforward to everybody. For automation, there is no obvious place to figure out how to build or install a project.

Debian For Debian packages, Debian maintainers will generally have determined what the appropriate tools to invoke are, and added appropriate invocations to debian/rules. This is really nice when rebuilding all of Debian - one can just invoke debian/rules - a consistent interface - and it will in turn invoke the right tools to build the package, meeting a long list of requirements. With newer versions of debhelper and most common build systems, debhelper can figure a lot of this out automatically - the maintainer just has to add the appropriate build and run time dependencies. However, debhelper needs to be consistent in its behaviour per compat level - otherwise builds might start failing with different versions of debhelper, when the autodetection logic is changed. debhelper can also only do the right thing if all the necessary dependencies are present. Finally, debhelper only functions in the context of a Debian package.
Ognibuild Ognibuild is a new tool that figures out the build system in use by an upstream project, as well as the other dependencies it needs. This information can then be used to invoke said build system, or to e.g. add missing build dependencies to a Debian package. Ognibuild uses a variety of techniques to work out what the dependencies for an upstream package are:
  • Extracting dependencies and other requirements declared in build system metadata (e.g. setup.py)
  • Attempting builds and parsing build logs for missing dependencies (repeating until the build succeeds), calling out to buildlog-consultant
Once it is determined which dependencies are missing, they can be resolved in a variety of ways. Apt can be invoked to install missing dependencies on Debian systems (optionally in a chroot), or ecosystem-specific tools can be used to do so (e.g. pypi or cpan). Instead of installing packages, the tool can also simply inform the user about the missing packages and the commands to install them, or update a Debian package appropriately (this is what deb-fix-build does). The target audience of ognibuild is people who need to (possibly from automation) build a variety of projects from different ecosystems, or users who are looking to just install a project from source. Developers who are just hacking on e.g. a Python project are better off directly invoking the ecosystem-native tools rather than a wrapper like ognibuild.
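The resolution step can be pictured as a small dispatch; this sketch assumes a dep object that knows its apt package name and its native install command, which is not ognibuild's real interface:

import os
import subprocess

def resolve(dep, mode: str = "auto") -> None:
    """Install a missing dependency via apt or an ecosystem-native tool."""
    if mode == "auto":
        # Use apt when we are able to invoke it, otherwise fall back to
        # the ecosystem-native tool (pip, cpan, ...).
        mode = "apt" if os.geteuid() == 0 else "native"
    if mode == "apt":
        subprocess.run(["apt", "install", "-y", dep.apt_package()], check=True)
    else:
        subprocess.run(dep.native_install_command(), check=True)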
Supported ecosystems (Partially) supported ecosystems currently include:
  • Combinations of make and autoconf, automake or CMake
  • Python, including fetching packages from pypi
  • Perl, including fetching packages from cpan
  • Haskell, including fetching from hackage
  • Ninja/Meson
  • Maven
  • Rust, including fetching packages from crates.io
  • PHP Pear
  • R, including fetching packages from CRAN and Bioconductor
For a full list, see the README.
Usage Ognibuild provides a couple of top-level subcommands that will seem familiar to anybody who has used a couple of other build systems:
  • ogni clean - remove build artifacts
  • ogni dist - create a dist tarball
  • ogni build - build the project in the current directory
  • ogni test - run the test suite
  • ogni install - install the project somewhere
  • ogni info - display project information including discovered build system and dependencies
  • ogni exec - run an arbitrary command but attempt to resolve issues like missing dependencies
These tools all take a couple of common options:
--resolve=apt|auto|native Specifies how to resolve any missing dependencies:
  • apt: install the appropriate dependency using apt
  • native: install dependencies using native tools like pip or cpan
  • auto: invoke either apt or native package install, depending on whether the current user is allowed to invoke apt
--schroot=name Run inside of a schroot.
--explain Do not make any changes, but tell the user which native or apt packages they could install. There are also subcommand-specific options, e.g. to install to a specific directory or restrict which tests are run.
Examples
Creating a dist tarball
% git clone https://github.com/dulwich/dulwich
% cd dulwich
% ogni --schroot=unstable-amd64-sbuild dist
 
Writing dulwich-0.20.21/setup.cfg
creating dist
Creating tar archive
removing 'dulwich-0.20.21' (and everything under it)
Found new tarball dulwich-0.20.21.tar.gz in /var/run/schroot/mount/unstable-amd64-sbuild-974d32d7-6f10-4e77-8622-b6a091857e85/build/tmpucazj7j7/package/dist.
Installing ldb from source, resolving dependencies using apt
% wget https://download.samba.org/pub/ldb/ldb-2.3.0.tar.gz
% tar xvfz ldb-2.3.0.tar.gz
% cd ldb-2.3.0
% ogni install --prefix=/tmp/ldb
 
+ install /tmp/ldb/include/ldb.h (from include/ldb.h)
 
Waf: Leaving directory '/tmp/ldb-2.3.0/bin/default'
'install' finished successfully (11.395s)
Running all tests from XML::LibXML::LazyBuilder
% wget https://cpan.metacpan.org/authors/id/T/TO/TORU/XML-LibXML-LazyBuilder-0.08.tar.gz
% tar xvfz XML-LibXML-LazyBuilder-0.08.tar.gz
% cd XML-LibXML-LazyBuilder-0.08
% ogni test
 
Current Status ognibuild is still in its early stages, but works well enough that it can detect and invoke the build system for most of the upstream projects packaged in Debian. If there are build systems that it currently lacks support for, or other issues, then I'd welcome any bug reports.

11 April 2021

Jelmer Vernooij: The upstream ontologist

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. The upstream ontologist is a project that extracts metadata about upstream projects in a consistent format. It does this with a combination of heuristics and reading ecosystem-specific metadata files, such as Python's setup.py and Rust's Cargo.toml, as well as e.g. scanning README files.

Supported Data Sources It will extract information from a wide variety of sources, including:
Supported Fields Fields that it currently provides include:
  • Homepage: homepage URL
  • Name: name of the upstream project
  • Contact: contact address of some sort of the upstream (e-mail, mailing list URL)
  • Repository: VCS URL
  • Repository-Browse: Web URL for viewing the VCS
  • Bug-Database: Bug database URL (for web viewing, generally)
  • Bug-Submit: URL to use to submit new bugs (either on the web or an e-mail address)
  • Screenshots: List of URLs with screenshots
  • Archive: Archive used - e.g. SourceForge
  • Security-Contact: e-mail or URL with instructions for reporting security issues
  • Documentation: Link to documentation on the web:
  • Wiki: Wiki URL
  • Summary: one-line description of the project
  • Description: longer description of the project
  • License: Single-line license description (e.g. "GPL 2.0") as declared in the metadata [1]
  • Copyright: List of copyright holders
  • Version: Current upstream version
  • Security-MD: URL to markdown file with security policy
All data fields have a certainty associated with them (certain, confident, likely or possible), which gets set depending on how the data was derived or where it was found. If multiple possible values were found for a specific field, then the value with the highest certainty is taken.
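The "highest certainty wins" rule is easy to picture in code - an illustrative sketch, not the upstream-ontologist's actual API:

# Pick the candidate value with the highest certainty for one field.
CERTAINTY_ORDER = ["possible", "likely", "confident", "certain"]

def best_value(candidates):
    """candidates: list of (value, certainty) pairs for a single field."""
    return max(candidates, key=lambda vc: CERTAINTY_ORDER.index(vc[1]))[0]

print(best_value([("https://example.com/repo", "possible"),
                  ("https://www.dulwich.io/code/", "certain")]))
# -> https://www.dulwich.io/code/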
Interface The ontologist provides a high-level Python API as well as two command-line tools that can write output in two different formats. For example, running guess-upstream-metadata on dulwich:
 % guess-upstream-metadata
 <string>:2: (INFO/1) Duplicate implicit target name: "contributing".
 Name: dulwich
 Repository: https://www.dulwich.io/code/
 X-Security-MD: https://github.com/dulwich/dulwich/tree/HEAD/SECURITY.md
 X-Version: 0.20.21
 Bug-Database: https://github.com/dulwich/dulwich/issues
 X-Summary: Python Git Library
 X-Description:  
   This is the Dulwich project.
   It aims to provide an interface to git repos (both local and remote) that
   doesn't call out to git directly but instead uses pure Python.
 X-License: Apache License, version 2 or GNU General Public License, version 2 or later.
 Bug-Submit: https://github.com/dulwich/dulwich/issues/new
Lintian-Brush lintian-brush can update DEP-12-style debian/upstream/metadata files that hold information about the upstream project that is packaged, as well as the Homepage in the debian/control file, based on information provided by the upstream ontologist. By default, it only imports data with the highest certainty - you can override this by specifying the uncertain command-line flag.
[1] Obviously this won't be able to describe the full licensing situation for many projects. Projects like scancode-toolkit are more appropriate for that.

6 April 2021

Jelmer Vernooij: Automatic Fixing of Debian Build Dependencies

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. In my last blog post, I introduced the buildlog consultant - a tool that can identify many reasons why a Debian build failed. For example, here's a fragment of a build log where the Build-Depends lack python3-setuptools:
 849  dpkg-buildpackage: info: host architecture amd64
 850   fakeroot debian/rules clean
 851  dh clean --with python3,sphinxdoc --buildsystem=pybuild
 852     dh_auto_clean -O--buildsystem=pybuild
 853  I: pybuild base:232: python3.9 setup.py clean
 854  Traceback (most recent call last):
 855    File "/<<PKGBUILDDIR>>/setup.py", line 2, in <module>
 856      from setuptools import setup
 857  ModuleNotFoundError: No module named 'setuptools'
 858  E: pybuild pybuild:353: clean: plugin distutils failed with: exit code=1: python3.9 setup.py clean
The buildlog consultant can identify the ModuleNotFoundError on line 857 as the key line, and interprets it:

 % analyse-sbuild-log --json ~/build.log
 {
   "stage": "build",
   "section": "Build",
   "lineno": 857,
   "kind": "missing-python-module",
   "details": {"module": "setuptools", "python_version": 3, "minimum_version": null}
 }
Automatically acting on buildlog problems A common reason why Debian builds fail is missing dependencies, or incorrect versions of dependencies, declared in the package's Build-Depends. Based on the output of the buildlog consultant, it is possible in many cases to determine what dependency needs to be added to Build-Depends. In the example given above, we can use apt-file to look for the package that contains the path /usr/lib/python3/dist-packages/setuptools/__init__.py - and voilà, we find python3-setuptools:
 % apt-file search /usr/lib/python3/dist-packages/setuptools/__init__.py
 python3-setuptools: /usr/lib/python3/dist-packages/setuptools/__init__.py
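That lookup is straightforward to script. The following is my own minimal sketch of this step (not the ognibuild code), and it assumes apt-file is installed with an up-to-date cache:

 import subprocess

 def find_package_for_module(module: str, python_version: int = 3) -> str | None:
     # Map a missing Python module to the Debian package shipping it,
     # by searching the apt contents database with apt-file.
     path = f"/usr/lib/python{python_version}/dist-packages/{module}/__init__.py"
     result = subprocess.run(["apt-file", "search", path],
                             capture_output=True, text=True)
     for line in result.stdout.splitlines():
         package, _, found = line.partition(": ")
         if found.strip() == path:
             return package
     return None

 print(find_package_for_module("setuptools"))  # python3-setuptools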
The deb-fix-build command automates these steps (a rough sketch in code follows the list):
  1. It builds the package using sbuild; if the package builds successfully, it just exits successfully
  2. It tries to identify the problem by looking through the build log; if it can't, or if it's a problem it has seen before (but apparently failed to resolve), then it exits with a non-zero exit code
  3. It tries to find a dependency that can address the problem
  4. It updates Build-Depends in debian/control or Depends in debian/tests/control
  5. Go to step 1
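Put together, the loop looks roughly like the sketch below. This is an illustration under stated assumptions rather than the actual ognibuild implementation: identify_problem and resolve_to_package are simplified stand-ins for the buildlog consultant and the apt-file lookup shown above.

 import re
 import subprocess

 def identify_problem(log: str) -> str | None:
     # Stand-in for the buildlog consultant: recognises one error class.
     m = re.search(r"ModuleNotFoundError: No module named '([^']+)'", log)
     return m.group(1) if m else None

 def resolve_to_package(module: str) -> str:
     # Stand-in for the apt-file lookup sketched earlier.
     return f"python3-{module.replace('_', '-')}"

 def add_build_dependency(package: str) -> None:
     # The real tool edits Build-Depends in debian/control and commits.
     print(f"Adding build dependency: {package}")

 def fix_build(max_attempts: int = 10) -> bool:
     seen: set[str] = set()
     for _ in range(max_attempts):
         build = subprocess.run(
             ["sbuild", "--no-clean-source", "-A", "-s", "-v"],
             capture_output=True, text=True)
         if build.returncode == 0:
             return True  # step 1: the build succeeded
         problem = identify_problem(build.stdout + build.stderr)
         if problem is None or problem in seen:
             return False  # step 2: unknown or previously seen problem
         seen.add(problem)
         add_build_dependency(resolve_to_package(problem))  # steps 3 and 4
     return False  # step 5 is the loop itself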
This takes away the tedious manual process of building a package, discovering that a dependency is missing, updating Build-Depends and trying again. For example, when I ran deb-fix-build while packaging saneyaml, the output looks something like this:
 % deb-fix-build
 Using output directory /tmp/tmpyz0nkgqq
 Using sbuild chroot unstable-amd64-sbuild
 Using fixers:  
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('setuptools_scm', python_version=3, minimum_version='4')
 Using apt-file to search apt contents
 Adding build dependency: python3-setuptools-scm (>= 4)
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('toml', python_version=3, minimum_version=None)
 Adding build dependency: python3-toml
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Built 0.5.2-1 - changes files at ['saneyaml_0.5.2-1_amd64.changes'].
And in our Git repository, we see these changes as well:
% git log -p
 commit 5a1715f4c7273b042818fc75702f2284034c7277 (HEAD -> master)
 Author: Jelmer Vernooij <jelmer@jelmer.uk>
 Date:   Sun Apr 4 02:35:56 2021 +0100
     Add missing build dependency on python3-toml.
 diff --git a/debian/control b/debian/control
 index 5b854dc..3b27b73 100644
 --- a/debian/control
 +++ b/debian/control
 @@ -1,6 +1,6 @@
  Rules-Requires-Root: no
  Standards-Version: 4.5.1
 -Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
 +Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4), python3-toml
  Testsuite: autopkgtest-pkg-python
  Source: python-saneyaml
  Priority: optional
 commit f03047da80fcd8468ee231fbc4cf8488d7a0acd1
 Author: Jelmer Vernooij <jelmer@jelmer.uk>
 Date:   Sun Apr 4 02:35:34 2021 +0100
     Add missing build dependency on python3-setuptools-scm (>= 4).
 diff --git a/debian/control b/debian/control
 index a476cc2..5b854dc 100644
 --- a/debian/control
 +++ b/debian/control
 @@ -1,6 +1,6 @@
  Rules-Requires-Root: no
  Standards-Version: 4.5.1
 -Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel
 +Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
  Testsuite: autopkgtest-pkg-python
  Source: python-saneyaml
  Priority: optional
Using deb-fix-build You can run deb-fix-build by installing the ognibuild package from unstable. The only requirements for using it are that:
  • The package is maintained in Git
  • An sbuild schroot is available for use
Caveats deb-fix-build is fairly easy to understand, and if it doesn't work then you're no worse off than you were without it; you'll just have to add your own Build-Depends. That said, there are a couple of things to keep in mind:
  • At the moment, it doesn't distinguish between general, Arch or Indep Build-Depends.
  • It can only add dependencies for things that are actually in the archive.
  • Sometimes there are multiple packages that can provide a file, command or Python package; it tries to find the right one with heuristics, but doesn't always get it right.

5 April 2021

Jelmer Vernooij: The Buildlog Consultant

Reading build logs Build logs for Debian packages can be quite long and difficult for a human to read. Anybody who has looked at these logs trying to figure out why a build failed will have spent time scrolling through them and skimming for certain phrases (lines starting with "error:", for example). In many cases, you can spot the problem in the last 10 or 20 lines of output, but it's also quite common that the error is somewhere at the beginning of many pages of error output.
The buildlog consultant The buildlog consultant project attempts to aid in this process by parsing sbuild and non-Debian (e.g. the output of make) build logs and trying to identify the key line that explains why a build failed. It can then either display this specific line, or a fragment of the log surrounding the key line.
Classification In addition to finding the key line explaining the failure, it can also classify and parse the error in many cases, returning a result code and some metadata. For example, in a failed build of gnss-sdr that produced 2119 lines of output, the reason for the failure is that log4cpp is missing, which is reported on line 641:
 634  -- Required GNU Radio Component: ANALOG missing!
 635  -- Could NOT find GNURADIO (missing: GNURADIO_RUNTIME_FOUND)
 636  -- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE)
 637  -- Could NOT find LOG4CPP (missing: LOG4CPP_INCLUDE_DIRS
 638  LOG4CPP_LIBRARIES)
 639  CMake Error at CMakeLists.txt:593 (message):
 640
 641    *** Log4cpp is required to build gnss-sdr
 642
 643  -- Configuring incomplete, errors occurred!
 644  See also "/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/CMakeFiles/
 645  CMakeOutput.log".
 646  See also "/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/CMakeFiles/
 647  CMakeError.log".
In this case, the buildlog consultant can both figure out which line was problematic and what the problem was:
 % analyse-sbuild-log build.log
 Failed stage: build
 Section: build
 Failed line: 641:
   *** Log4cpp is required to build gnss-sdr
 Error: Missing dependency: Log4cpp
Or, if you'd like to do something else with the output, use JSON output:
 % analyse-sbuild-log --json build.log
 {"stage": "build", "section": "Build", "lineno": 641, "kind": "missing-dependency", "details": {"name": "Log4cpp"}}
How it works The consultant does some structured parsing (most notably, it can parse the sections from an sbuild log), but otherwise it is a large set of carefully crafted regular expressions and heuristics. It doesn't always find the problem, but it has proven to be fairly accurate. It is constantly improved as part of the Debian Janitor project, which exposes it to a wide variety of errors. You can see the classification and error detection in action on the result codes page of the Janitor.
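To make that concrete, here is a toy version of the idea (my own sketch, not the buildlog consultant's code): a table of regular expressions scanned from the end of the log, producing a classification similar to the JSON shown earlier.

 import re

 # Each pattern maps a known error line to a result kind plus details.
 PATTERNS = [
     (re.compile(r"ModuleNotFoundError: No module named '([^']+)'"),
      lambda m: {"kind": "missing-python-module",
                 "details": {"module": m.group(1)}}),
     (re.compile(r"\*\*\* (\w+) is required to build"),
      lambda m: {"kind": "missing-dependency",
                 "details": {"name": m.group(1)}}),
 ]

 def classify(lines):
     # Scan backwards: the key line is usually near the end of the log.
     for lineno in range(len(lines), 0, -1):
         for pattern, make_result in PATTERNS:
             m = pattern.search(lines[lineno - 1])
             if m:
                 return {"lineno": lineno, **make_result(m)}
     return None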
Using the buildlog consultant You can get the buildlog consultant from either pip or Debian unstable (package: python3-buildlog-consultant). The buildlog consultant comes with two scripts, analyse-build-log and analyse-sbuild-log, for analysing build logs and sbuild logs respectively.

25 March 2021

Jelmer Vernooij: Lintian-Brush

With Debian packages now widely being maintained in Git repositories, there has been an uptick in the number of bulk changes made to Debian packages. Several maintainers are running commands over many packages (e.g. all packages owned by a specific team) to fix common issues in packages. Examples of changes being made include:
  • Updating the Vcs-Git and Vcs-Browser URLs after migrating from alioth to salsa
  • Stripping trailing whitespace in various control files
  • Updating e.g. homepage URLs to use https rather than http
Most of these can be fixed with simple sed or perl one-liners. Some of these scripts are publicly available.
Lintian-Brush Lintian-Brush is both a simple wrapper around a set of these kinds of scripts and a repository for such scripts, with the goal of making it easy for any Debian maintainer to run them. The lintian-brush command-line tool is a simple wrapper that runs a set of "fixer scripts", and for each one:
  • Reverts the changes made by the script if it failed with an error
  • Commits the changes to the VCS with an appropriate commit message
  • Adds a changelog entry (if desired)
The tool also provides some basic infrastructure for testing that these scripts do what they should, and e.g. don't have unintended side-effects. The idea is that it should be safe, quick and unobtrusive to run lintian-brush, and get it to opportunistically fix lintian issues and to leave the source tree alone when it can't.
Example For example, running lintian-brush on the package talloc fixes two minor lintian issues:
 % debcheckout talloc
 declared git repository at https://salsa.debian.org/samba-team/talloc.git
 git clone https://salsa.debian.org/samba-team/talloc.git talloc ...
 Cloning into 'talloc'...
 remote: Enumerating objects: 2702, done.
 remote: Counting objects: 100% (2702/2702), done.
 remote: Compressing objects: 100% (996/996), done.
 remote: Total 2702 (delta 1627), reused 2601 (delta 1550)
 Receiving objects: 100% (2702/2702), 1.70 MiB | 565.00 KiB/s, done.
 Resolving deltas: 100% (1627/1627), done.
 % cd talloc
 talloc% lintian-brush
 Lintian tags fixed: {'insecure-copyright-format-uri', 'public-upstream-key-not-minimal'}
 % git log
 commit 0ea35f4bb76f6bca3132a9506189ef7531e5c680 (HEAD -> master)
 Author: Jelmer Vernooij <jelmer@debian.org>
 Date:   Tue Dec 4 16:42:35 2018 +0000
     Re-export upstream signing key without extra signatures.
     Fixes lintian: public-upstream-key-not-minimal
     See https://lintian.debian.org/tags/public-upstream-key-not-minimal.html for more details.
  debian/changelog                    1 +
  debian/upstream/signing-key.asc   102 +++++++++++++++---------------------------------------------------------------------------------------
  2 files changed, 16 insertions(+), 87 deletions(-)
 commit feebce3147df561aa51a385c53d8759b4520c67f
 Author: Jelmer Vernooij <jelmer@debian.org>
 Date:   Tue Dec 4 16:42:28 2018 +0000
     Use secure copyright file specification URI.
     Fixes lintian: insecure-copyright-format-uri
     See https://lintian.debian.org/tags/insecure-copyright-format-uri.html for more details.
  debian/changelog   3 +++
  debian/copyright   2 +-
  2 files changed, 4 insertions(+), 1 deletion(-)
Script Interface A fixer script is run in the root directory of a package, where it can make the changes it deems necessary, and write a summary of what it has done for the changelog (and commit message) to standard output. If a fixer cannot provide any improvements, it can simply leave the working tree untouched; lintian-brush will not create any commits for it or update the changelog. If it exits with a non-zero exit code, then it is assumed to have failed to run, and its changes are reset rather than committed. In addition, tests can be added for fixers by providing various before and after source package trees, to verify that a fixer script makes the expected changes. For more details, see the documentation on writing new fixers.
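As an illustration of this contract (my own example, not one of the bundled fixers), a fixer that strips trailing whitespace from debian/control could look like this:

 #!/usr/bin/env python3
 # Example fixer following the contract described above: edit files in
 # the package tree, print a changelog-style summary to stdout if
 # anything changed, and exit non-zero only on failure.
 from pathlib import Path

 control = Path("debian/control")
 original = control.read_text()
 fixed = "\n".join(line.rstrip() for line in original.splitlines()) + "\n"
 if fixed != original:
     control.write_text(fixed)
     print("Strip trailing whitespace from debian/control.")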
Availability lintian-brush is currently available in unstable and testing. See man lintian-brush(1) for an explanation of the command-line options. Fixer scripts are included that can fix (some instances of) 34 lintian tags. Feedback would be great if you try lintian-brush: please file bugs in the BTS, or propose pull requests with new fixers on salsa.

27 November 2020

Reproducible Builds (diffoscope): diffoscope 162 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 162. This version includes the following changes:
[ Chris Lamb ]
* Don't depend on radare2 in the Debian autopkgtests as it will not be in
  bullseye due to security considerations (#950372). (Closes: #975313)
* Avoid "Command  s p a c e d o u t  failed" messages when creating an
  artificial CalledProcessError instance in our generic from_operation
  feeder creator.
* Overhaul long and short descriptions.
* Use the operation's full name so that "command failed" messages include
  its arguments.
* Add a missing comma in a comment.
[ Jelmer Vernooij ]
* Add missing space to the error message when only one argument is passed to
  diffoscope.
[ Holger Levsen ]
* Update Standards-Version to 4.5.1.
[ Mattia Rizzolo ]
* Split the diffoscope package into a diffoscope-minimal package that
  excludes the larger packages from Recommends. (Closes: #975261)
* Drop support for Python 3.6.
You can find out more by visiting the project homepage.

24 October 2020

Jelmer Vernooij: Debian Janitor: Hosters used by Debian packages

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. The Janitor knows how to talk to different hosting platforms. For each hosting platform, it needs to support the platform-specific API for creating and managing merge proposals. For each hoster it also needs to have credentials. At the moment, it supports the GitHub API, Launchpad API and GitLab API. Both GitHub and Launchpad have only a single instance; the GitLab instances it supports are gitlab.com and salsa.debian.org. This provides coverage for the vast majority of Debian packages that can be accessed using Git. More than 75% of all packages are available on salsa, although in some cases the Vcs-Git header has not yet been updated. Of the other 25%, the majority either do not declare where they are hosted using a Vcs-* header (10.5%), or have not yet migrated from alioth to another hosting platform (9.7%). A further 2.3% are hosted somewhere on GitHub (2%), Launchpad (0.18%) or GitLab.com (0.15%), in many cases in the same repository as the upstream code. The remaining 1.6% are hosted on many other hosts, primarily people's personal servers (which usually don't have an API for creating pull requests).

[Chart: Packages per hoster]

Outdated Vcs-* headers It is possible that the 20% of packages that do not have a Vcs-* header, or whose Vcs-* header still claims they are on alioth, are actually hosted elsewhere. However, it is hard to know where they are until a version with an updated Vcs-Git header is uploaded. The Janitor primarily relies on vcswatch to find the correct locations of repositories. vcswatch looks at Vcs-* headers but has its own heuristics as well. For about 2,000 packages (6%) that still have Vcs-* headers pointing to alioth, vcswatch successfully finds their new home on salsa.
Merge Proposals by Hoster These proportions are also visible in the number of pull requests created by the Janitor on various hosters. The vast majority so far has been created on Salsa.
Hoster               Open   Merged & Applied   Closed
github.com             92                168        5
gitlab.com             12                  3        0
code.launchpad.net     24                 51        1
salsa.debian.org    1,360              5,657      126
Merge Proposal statistics In this graph, open means that the pull request has been created but likely nobody has looked at it yet. Merged means that the pull request has been marked as merged on the hoster, and applied means that the changes have ended up in the packaging branch but via a different route (e.g. cherry-picked or manually applied). Closed means that the pull request was closed without the changes being incorporated. Note that this excludes ~5,600 direct pushes, all of which were to salsa-hosted repositories.

For more information about the Janitor's lintian-fixes efforts, see the landing page.
