
1 July 2024

Russell Coker: VoLTE in Australia

Introduction

In Australia the 3G mobile frequencies are to be reused, so the telcos are in the process of shutting down their 3G services. That means everyone has to use VoLTE (Voice over LTE) for phone calls, including emergency calls. The shutdown time varies by telco: Kogan Mobile (one of the better services, with good value for money, which generally works well) shut down their 3G service in January. Aldi Mobile (another of the good services, slightly more expensive but with free calls to most first-world countries included, and using the largest phone network) will shut theirs down at the end of August. For background there's a FOSDEM talk about OpenSIPS with VoLTE and VoNR [1]; it's more complex than you want to know. Also VoNR (Voice over New Radio) is the standard for 5G voice; it's different from VoLTE and has a fallback to VoLTE. Another good lecture for background information is the FOSDEM talk on VoLTE at the handset end [2].

The PinePhonePro

In October 2023 I tried using my PinePhonePro as my main phone, but that only lasted a few days due to problems with calls and poor battery life [3]. Since then I went back to the Huawei Mate 10 Pro that I bought refurbished in June 2019 for $389. That has been my main phone for 5 years now, giving a cost of $1.50 per week. I had tried using a Huawei Nova 7i running Android without Google Play as an experiment, but that failed; I do many things that need Android apps [4]. I followed the PinePhone wiki to get my PinePhonePro working with VoLTE [5]. That worked fine for me; the only differences from the instructions were that I had to use the device /dev/ttyUSB3 and that the modem kept resetting itself during the process, and when that happened I had to kill minicom and start again. After changing the setting and saving it, the PinePhonePro seemed to work well with VoLTE on a Kogan Mobile SIM (so definitely not using 3G).
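For reference, the wiki procedure boils down to sending AT commands to the modem over a serial port with minicom. The following is only a rough sketch of the sort of Quectel EG25-G commands involved; the profile name and the /dev/ttyUSB device vary (I needed /dev/ttyUSB3), so treat the wiki [5] as authoritative:

# in minicom attached to the modem's AT port (/dev/ttyUSB3 for me)
AT+QMBNCFG="List"                      # list available carrier profiles
AT+QMBNCFG="Select","ROW_Commercial"   # select a generic profile (name may differ)
AT+QCFG="ims",1                        # enable IMS, which VoLTE runs over
AT+CFUN=1,1                            # restart the modem to apply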
One issue I have found is that Plasma Mobile (my preferred FOSS phone GUI) appears to have a library issue that results in polling every 14ms even when the screen is locked [6]. If you have a few processes doing that (which even the most lightly used Plasma system will have), it really hurts battery use. The maintainer has quite reasonably deferred action on this bug report given the KDE 6 transition. Later in the Trixie development cycle I hope to get this issue resolved; I don't expect it to suddenly make battery life good, but it might make battery life acceptable. I am now idly considering carrying around my PinePhonePro in a powered-off state for situations where I might need to do high security operations (root logins to servers or online banking) but for which carrying a laptop isn't convenient. It will do well for the "turn on, do 30 minutes of work that needs security, then turn off" scenario.

Huawei Mate 10 Pro and Redmi 9A

The Huawei Mate 10 Pro has been my main phone for 5 years and it has worked well, so it would be ideal if it could do VoLTE, as the PinePhonePro isn't ready yet. All the web pages I've seen about the Mate 10 Pro say that it will either allow upgrading to a VoLTE configuration if run with the right SIM, or only support it with the right SIM. I did a test with a Chinese SIM, which gave an option of turning on VoLTE but didn't allow any firmware updates, and the VoLTE option went away when I put an Australian SIM in. Some forum comments had led me to believe that it would either permanently enable VoLTE or allow upgrading the firmware to one that enables VoLTE if I used a Chinese SIM, but that is not the case. I didn't expect a high probability of success, but I had to give it a go as it's a nice phone. I did some tests on a Redmi 9A (a terrible phone that has really bad latency on the UI in spite of having reasonably good hardware). The one I tested on didn't have VoLTE enabled when I got it; to test that I used the code *#*#4636#*#* in the dialler to get the menu of SIM information, and it showed that VoLTE was not provisioned. I then had to update to the latest release of Android for that phone and enter *#*#86583#*#* in the dialler to enable VoLTE; the message displayed after entering that magic number must end in "DISABLE". I get the impression that the code in question makes the phone skip checking certain aspects of whether the carrier is good for VoLTE and just do it. So apparently Kogan Mobile somehow gives the Redmi 9A the impression that VoLTE isn't supported, but if the phone just goes ahead and connects, it will work. I don't plan to use a Redmi 9A myself as it's too slow, but I added it to my collection to offer to anyone else I know who needs a phone with VoLTE and doesn't use the phone seriously, or to someone who needs a known-good phone for testing things.

Samsung Galaxy Note 9

I got some Samsung Galaxy Note 9 phones to run Droidian as an experiment [7]. But Droidian dropped support for the Note 9, and I couldn't figure out how to enable VoLTE via Droidian, which was very annoying after I had spent $109 on a test phone and $215 on a phone for real use (I have no plans to try Droidian again at this time). I tried installing LineageOS on one Note 9 [8], which was much easier than expected (especially after previously installing Droidian). But VoLTE wasn't an option. According to Reddit, LineageOS doesn't support VoLTE on Samsung devices; you can use a Magisk module or a VoLTE enabler module, but those aren't supported by LineageOS either [9]. I downloaded an original image for the Note 9 from SamMobile.com [10]. That image booted past the orange stage (where if you have problems then your phone is probably permanently useless) but didn't boot into the OS. A friend helped me out with that, and it turned out that the Heimdall flash tool on Linux didn't do something it needed to do and that Odin on Windows was required. After using Odin everything was fine, and I have a Note 9 with VoLTE running the latest Samsung firmware, which is security patch level 1st July 2022!!! So I have a choice between using a Note 9 for data and SMS while running a current version of LineageOS with all security fixes, or running a Samsung image with no security updates for 2 years which supports phone calls. Based on this I have to recommend Pixel as the phone of choice; it has a decent level of support from Google and long-term support from LineageOS. According to the LineageOS web site you can run the current version of Lineage on the original Pixel phone from 2016! Of course getting VoLTE to work on it might be another saga, but it would probably be easier to do with LineageOS on a Pixel than on a Samsung phone.
Conclusion

The operation of the Note 9 is decent for me now, apart from the potential security issues. The same goes for selling one of the phones. The PinePhonePro still has potential to become my daily driver at some future time, if I and others can optimise power use. Also a complicating factor is that I want both Jabber and Matrix to be actually instant IM systems, not IM with a 5 minute delay, so suspend mode isn't a good option. Pixel phones will be a much higher priority when looking at phones to buy in future. The older Pixel phones go for as little as $100 on eBay and can still run the latest LineageOS. VoLTE seems needlessly complicated.

Niels Thykier: Debian packaging with style black

When I started working on the language server for debputy, one of several reasons was automatically applying a formatting style, such that you would not have to remember to manually reformat the file. One of the problems with supporting automatic formatting is that no one agrees on the "one true style". To make this concrete, Johannes Schauer Marin Rodrigues did the numbers on which wrap-and-sort options are most common in https://bugs.debian.org/895570#46. Unsurprisingly, we end up with 14-15 different styles with various degrees of popularity. To make matters worse, wrap-and-sort does not provide a way to declare "this package uses options -sat". So that begged the question: how would debputy know which style it should use when it was going to reformat a file? After a couple of false starts, Christian Hofstaedtler mentioned that we could just have a field in debian/control for supporting a "per-package" setting, in response to my concern about adding a new "per-package" config file. At first, I was not happy with it, because how would you specify all of these options in a field (in a decent manner)? But then I realized that I do not want all these styles and that I could start simpler. The Python code formatter black is quite successful despite not having a lot of personalized style options. In fact, black makes a statement out of not allowing a lot of different styles. Combining that, the result was X-Style: black (to be added to the Source stanza of debian/control), with every possible reference to the black tool for how styling would work. Namely, you outsource the style management to the tool (debputy) and then spend your focus on something other than discussing styles. As with black, this packaging formatting style is going to be opinionated and it will evolve over time. At the starting point, it is similar to wrap-and-sort -sat for the deb822 files (debputy does not reformat other files at the moment). But as mentioned, it will likely evolve and possibly diverge from wrap-and-sort over time. The choice of the starting point was based on the numbers posted by Johannes in #895570. It was not my personal favorite, but it seemed to have a majority and is also close to the one suggested by the salsa pipeline maintainers. The delta being -kb, which I had originally but removed in 0.1.34 at the request of Otto Kekäläinen after reviewing the numbers from Johannes one more time. To facilitate this new change, I uploaded debputy/0.1.30 (a while back) to Debian unstable with the following changes:
  • Support for the X-Style: black header.
  • When a style is defined, the debputy lsp server command will now automatically reformat deb822 files on save (if the editor supports it) or on explicit "reformat file" request from the editor (usually indirectly from the user).
  • A new debputy reformat subcommand that will reformat the files, when a style is defined.
  • A new pre-commit hook repo to run debputy lint and debputy reformat. These hooks are available from https://salsa.debian.org/debian/debputy-pre-commit-hooks version v0.1 and can be used with the pre-commit tool (from the package of the same name); a configuration sketch follows below.
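As an illustration, a .pre-commit-config.yaml using that repo might look like the following (the hook ids here are my guess; check the repo's documentation for the real ones):

repos:
  - repo: https://salsa.debian.org/debian/debputy-pre-commit-hooks
    rev: v0.1
    hooks:
      - id: debputy-lint       # runs debputy lint on the changed files
      - id: debputy-reformat   # runs debputy reformat before commit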
The obvious omission is a salsa-pipeline feature for this. Otto has put that on his personal todo list, and I am looking forward to that.
Beyond black

Another thing I dislike about our existing style tooling is that if you run wrap-and-sort without any arguments, you have a higher probability of "trashing" the style of the current package than of getting the desired result. Part of this is because wrap-and-sort's defaults are out of sync with actual usage (which is basically what https://bugs.debian.org/895570 is about). But I see another problem. The wrap-and-sort tool explicitly defined options to tweak the style but provided maintainers no way to record their preference in any machine readable way. The net result is that we have tons of diverging styles and that you (as a user of wrap-and-sort) have to manually tell wrap-and-sort which style you want every time you run the tool. In my opinion that is not playing to the strengths of either human or machine. Rather, it is playing to the weaknesses of the human, if anything at all. But the salsa-CI pipeline people also ran into this issue and decided to work around this deficiency. To use wrap-and-sort in the salsa-CI pipeline, you have to set a variable to activate the job and another variable with the actual options you want. The salsa-CI pipeline is quite machine readable and wrap-and-sort is widely used, so I had debputy reformat also check for the salsa-CI variables as a fallback. This fallback also works for the editor mode (debputy lsp server), so you might not even have to run debputy reformat. :) This was a deliberate trade-off. While I do not want all of us to have all these options, I also want Debian packaging to be less painful and have fewer paper cuts. Having debputy go to extra lengths to meet wrap-and-sort users where they are came out as the better solution for me. A nice side-effect of this trade-off is that debputy reformat is now a good tool for drive-by contributors. You can safely run debputy reformat on any package, and either it will apply the styling or it will back out and inform you that no obvious style was detected. In the latter case, you would have to fall back to manually deducing the style and applying it.
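For illustration, the fallback keys on the same variables the salsa-CI pipeline uses, so a debian/salsa-ci.yml along these lines (the -sat value is just an example style) would be picked up by debputy reformat as well:

variables:
  SALSA_CI_DISABLE_WRAP_AND_SORT: 'no'   # activate the wrap-and-sort job
  SALSA_CI_WRAP_AND_SORT_ARGS: '-sat'    # the style your package uses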
Differences to wrap-and-sort

The debputy reformat command has some limitations or known differences relative to wrap-and-sort. Notably, neither debputy reformat nor debputy lsp server will invoke wrap-and-sort. Instead, debputy has its own reformatting engine that provides similar features. One reason for not running wrap-and-sort is that I want debputy reformat to match the style that debputy lsp server will give you; that way, you get a consistent style across all debputy commands. Another reason is that it is important to me that reformatting is safe and does not change semantics. This leads to two regrettable known differences from the wrap-and-sort behavior, due to safety, in addition to one scope limitation in debputy:
  1. debputy will ignore requests to sort the stanzas when the "keep first" option is disabled (-b with --no-keep-first). This combination is unsafe reformatting. I feel it was a mistake for wrap-and-sort to ever allow this, but at least it is no longer the default (-b is now -bk by default). This will be less of a problem in debhelper-compat 15, since the concept of "main package" will disappear and all multi-binary source packages will be required to use debian/package.install rather than debian/install.
  2. debputy will not reorder the contents of debhelper packaging files such as debian/install. This is also a (theoretically) unsafe thing to do. While the average package will not experience issues with this, there are rare corner cases where the re-ordering can affect the end result. I happen to know this because I ran into issues when trying to optimize dh_install in a way that assumed the order did not matter. Stuff broke, and there is now special-case code in dh_install to back out of that optimization when that happens.
  3. debputy has a limited list of wrap-and-sort options it understands. Some options may cause debputy to back out and disable reformatting entirely with a remark that it cannot apply that style. If you run into a case of this, feel free to file a feature request to support it. I will not promise to support everything, but if it is safe and trivially doable with the engine already, then I probably will.
As stated, where debputy cannot implement a wrap-and-sort style fully, it will currently either implement a safe subset, if one can be identified, or back out of the formatting entirely when it cannot. In all cases, debputy will not break formatting that is correct; it may just fail to correct one aspect of the wrap-and-sort style if you happen to get it wrong. It is also important to remember that the prerequisite for debputy applying any wrap-and-sort style is that you have set the salsa-CI pipeline variables to trigger wrap-and-sort in the salsa-CI pipeline. So there is still a CI check before the merge that runs wrap-and-sort in its full glory, providing the final safety net for you.
Just give me a style

In conclusion, if you, like me, are more interested in getting a consistent style than in discussing what that style should be, you can now get that with X-Style: black. You can also have your custom wrap-and-sort style be picked up automatically for drive-by contributors.
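For reference, the field itself is just one line in the Source stanza of debian/control; everything else in this sketch is a made-up example package:

Source: hello-example
Maintainer: Jane Doe <jdoe@example.org>
Build-Depends: debhelper-compat (= 13)
X-Style: black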
$ apt satisfy 'dh-debputy (>= 0.1.30), python3-lsprotocol'
# Add  X-Style: black  to  debian/control  for "just give me a style"
#
# OR, if there is a specific  wrap-and-sort  style for you then set
# SALSA_CI_DISABLE_WRAP_AND_SORT=no plus set relevant options in
# SALSA_CI_WRAP_AND_SORT_ARGS in debian/salsa-ci.yml (or .gitlab-ci.yml)
$ debputy reformat
It is sadly not yet in the salsa-ci pipeline. Otto is looking into that, and hopefully we will have it soon. :) And if you find yourself often doing archive-wide contributions and are tired of having to reverse engineer package formatting styles, consider using debputy reformat or debputy lsp server. If you use debputy in this way, please consider providing feedback on what would help you.

30 June 2024

Joachim Breitner: Do surprises get larger?

The setup

Imagine you are living on a riverbank. Every now and then, the river swells and you have high water. The first few times this may come as a surprise, but soon you learn that such floods are a recurring occurrence at that river, and you make suitable preparations. Let's say you feel well-prepared against any flood that is no higher than the highest one observed so far. The more floods you have seen, the higher that mark is, and the better prepared you are. But of course, eventually a higher flood will occur that surprises you. Of course such new record floods happen more and more rarely as you have seen more of them. I was wondering though: by how much do the new records exceed the previous high mark? Does this excess decrease or increase over time? A priori both could be the case. When the high mark is already rather high, maybe new record floods will just barely pass that mark? Or maybe, simply because new records are such rare events, when they do occur, they can be surprisingly bad? This post is a leisurely mathematical investigation of this question, which of course isn't restricted to high waters; it could be anything that produces a measurement repeatedly and (mostly) independently: weather events, sport results, dice rolls. The answer of course depends on the distribution of results: how likely each possible result is.

Dice are simple

With dice rolls the answer is rather simple. Let our measurement be how often you can roll a die until it shows a 6. This simple game we can repeat many times, keeping track of our record. Let's say the record happens to be 7 rolls. If in the next run we roll the die 7 times and it still does not show a 6, then we know that we have broken the record, and every further roll increases by how much we beat the old record. But note that how often we will now roll the die is completely independent of what happened before! So for this game the answer is: the excess with which the record is broken is always the same. Mathematically speaking, this is because the distribution of rolls until the die shows a 6 is memoryless. Such distributions are rather special; it's essentially just the example we gave (a geometric distribution) or its continuous analogue (the exponential distribution, for example the time until a radioactive particle decays).
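A quick simulation illustrates the memorylessness (a sketch in Python; the sample sizes are arbitrary). It plays many independent record sequences of the roll-until-six game and averages the excess by which each record value was beaten; the averages come out roughly constant (around 6, the mean of the geometric distribution), whatever the old record was:

import random
from collections import defaultdict

def rolls_until_six(p=1/6):
    # Number of rolls of a fair die until the first six (geometric).
    n = 1
    while random.random() >= p:
        n += 1
    return n

excess_by_record = defaultdict(list)
for _ in range(10_000):              # independent record sequences
    record = rolls_until_six()
    for _ in range(200):             # further attempts per sequence
        x = rolls_until_six()
        if x > record:
            excess_by_record[record].append(x - record)
            record = x

for a in sorted(excess_by_record):
    samples = excess_by_record[a]
    if len(samples) >= 100:          # only well-sampled record values
        print(a, sum(samples) / len(samples))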

Mathematical formulation

With this out of the way, let us look at some other distributions, and for that, introduce some mathematical notation. Let X be a random variable with probability density function φ(x) and cumulative distribution function Φ(x), and let a be the previous record. We are interested in the behavior of

Y(a) = (X − a | X > a),

i.e. by how much X exceeds a under the condition that it did exceed a. How does Y change as a increases? In particular, how does the expected value of the excess, e(a) = E(Y(a)), change?

Uniform distribution

If X is uniformly distributed between, say, 0 and 1, then a new record will appear uniformly distributed between a and 1, and as that range gets smaller, the excess must get smaller as well. More precisely,

e(a) = E(X − a | X > a) = E(X | X > a) − a = (1 − a)/2.

This not very interesting linear function is plotted in blue in this diagram:
The expected record surpass for the uniform distribution
The orange line, with the logarithmic scale on the right, tries to convey how unlikely it is to surpass the record value a: it shows how many attempts we expect before the record is broken. This can be calculated as n(a) = 1/(1 − Φ(a)).

Normal distribution

For the normal distribution (with mean 0 and standard deviation 1, to keep things simple), we can look up the expected value of the one-sided truncated normal distribution and obtain

e(a) = E(X | X > a) − a = φ(a)/(1 − Φ(a)) − a.

Now is this growing or shrinking? We can plot it and have a quick look:
The expected record surpass for the normal distribution
Indeed it is, too, a decreasing function! (As a sanity check we can see that e(0) = √(2/π), which is the expected value of the half-normal distribution, as it should be.)

Could it be any different?

This settles my question: it seems that each new surprisingly high water will tend to be less surprising than the previous one, assuming high waters are uniformly or normally distributed, which is unlikely to actually be the case, so this may not be too helpful. This does raise the question, though, whether there are probability distributions for which e(a) is increasing. I can try to construct one, and because it's a bit easier, I'll consider a discrete distribution on the positive natural numbers, and look at g(0) = E(X) and g(1) = E(X − 1 | X > 1). What does it take for g(1) > g(0)? Using E(X) = p + (1 − p)·E(X | X > 1) for p = P(X = 1), we find that in order to have g(1) > g(0), we need E(X) > 1/p (indeed, g(1) = (E(X) − p)/(1 − p) − 1, and g(1) > E(X) rearranges to p·E(X) > 1). This is plausible because we get equality when E(X) = 1/p, as is precisely the case for the geometric distribution. And it is also plausible that it helps if p is large (so that the next first record is likely just 1) and if, nevertheless, E(X) is large (so that if we do get an outcome other than 1, it's much larger). Starting from the geometric distribution, where P(X > n | X ≥ n) = p_n = p (the probability of again not rolling a six) is constant, it seems that if these p_n are increasing, we get the desired behavior. So let p_1 < p_2 < … < p_n < … be an increasing sequence of probabilities, and define X so that P(X = n) = p_1 ⋯ p_{n−1} · (1 − p_n) (imagine the die wears off: the more often you roll it, the less likely it shows a 6). Then for this variation of the game, every new record tends to exceed the previous one by more than earlier records did. As the p_n increase, we get a flatter, longer tail in the probability distribution.
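To check this construction numerically, here is a small Python sketch with one concrete choice of increasing p_n, namely p_n = 1 − 1.5/(n + 1) (chosen so the mean stays finite); the sampled mean excess grows with the old record, though the heavy tail makes the estimates noisy:

import random
from collections import defaultdict

def wearing_die(c=1.5):
    # Rolls until a 'six', where the chance of NOT stopping at roll n
    # is p_n = 1 - c/(n + 1), increasing in n: the die wears off.
    n = 1
    while random.random() < 1 - c / (n + 1):
        n += 1
    return n

excess_by_record = defaultdict(list)
for _ in range(20_000):
    record = wearing_die()
    for _ in range(100):
        x = wearing_die()
        if x > record:
            excess_by_record[record].append(x - record)
            record = x

for a in sorted(excess_by_record):
    samples = excess_by_record[a]
    if len(samples) >= 200:          # skip poorly sampled record values
        print(a, sum(samples) / len(samples))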

Gamma distribution

To get a nice plot, I'll take the intuition from this and turn to continuous distributions. The Wikipedia page for the exponential distribution says it is a special case of the gamma distribution, which has an additional shape parameter k, and it seems that this could be used to give the probability distribution a longer tail. Let's play around with scale θ = 2 and k = 0.5, 1 and 1.5:
The expected record surpass for the gamma distribution
  • For k = 1 (dotted) this should just be the exponential distribution, and we see that e(a) is flat, as predicted earlier.
  • For larger k (dashed) the graph does not look much different from the one for the normal distribution; not a surprise, as for k → ∞ the gamma distribution turns into the normal distribution.
  • For smaller k (solid) we get the desired effect: e(a) is increasing. This means that new records tend to break records more impressively.
The orange line shows that this comes at a cost: for a given old record a, new records are harder to come by with smaller k.

Conclusion

As usual, it all depends on the distribution. Otherwise, not much; it's late.

19 June 2024

Sahil Dhiman: First Iteration of My Free Software Mirror

As I'm gearing towards setting up a Free Software download mirror in India, it occurred to me that I haven't chronicled the work and motivation behind setting up the original mirror in the first place. Also, it seems like it would be good to document stuff here for observing the progression, as the mirror is going multi-country now. Right now, my existing mirror, i.e. mirrors.de.sahilister.net (formerly mirrors.sahilister.in), is hosted in Germany and serves traffic for Termux, NomadBSD, Blender, BlendOS and GIMP. For a while in between, it hosted the OSMC project mirror as well. To first explain what a Free Software download mirror is, I'll quote myself from my work blog:
As most Free Software doesn't have commercial backing and requires heavy downloads, the concept of software download mirrors helps take the traffic load off the primary server, leading to geographical redundancy, higher availability and faster downloads in general.
So whenever someone wants to download a particular (mirrored) software and clicks download, upstream redirects the download to one of the mirror servers that is geographically (or by other parameters) near the user, leading to faster downloads and load sharing amongst all mirrors. Since the time I got into Linux and servers, I always wanted to help the community somehow, and mirroring seemed to be the most obvious thing. India seems to be a country which has traditionally seen a small number of public download mirrors. IITB, TIFR, and some other public institutions used to host them for popular Linux and Free Software, but they seem to be diminishing these days. In the last months of 2021, I started using Termux and saw that it had only a few mirrors (back then). I tried getting a high capacity, high bandwidth node on a budget, but that was hard in India in 2021-22. So after much deliberation, I decided to go where it's available and chose a German hosting provider, with the thought of adding an India node when conditions are favorable (thankfully that happened, and the India node is live too now). Termux required only 29 GB of storage, so I went ahead and started mirroring it. I raised this issue in Termux's GitHub repository in January 2022. This blog post chronicles the start of the mirror. Termux has high request counts from a mirror point of view. Each Termux client usually checks every mirror in the selected group for availability before randomly selecting one for download (the only other case is when the client has explicitly selected a single mirror using termux-repo-change). The mirror started getting thousands of requests daily due to this, but only a small percentage would actually select my mirror, so download traffic was lower. A similar thing happened with OSMC too (which I started mirroring later). With this start, I began exploring various projects that would benefit from additional mirrors. Public information from the Academic Computer Club in Umeå's mirror and Freedif's mirror stats helped to figure out storage and bandwidth requirements for potential projects. Fun fact: the Academic Computer Club in Umeå (which runs one of the prominent Debian, Ubuntu etc. mirrors) now has a 200 Gbit/s uplink to the internet through SUNET. Later, I migrated to a different provider for better speeds and added a LibreSpeed test on the mirror server. Those were fun times. Between OSMC, Termux and LibreSpeed, I was getting almost 1.2 million hits/day on the server at its peak, crossing a TB/day of traffic for the first time. Next came Blender, which took the longest time to set up, around 9-10 months. Blender had a push-trigger requirement for rsync from upstream that took quite some back and forth. It now contributes the largest amount of traffic on the mirror. On release days, the mirror does more than 3 TB/day; on normal days it hovers around 2 TB/day. The GIMP project is the latest addition. At one point, the mirror traffic touched 4.97 TB/day. That's when I decided to drop the LibreSpeed server to focus solely on mirroring for now, keeping the bandwidth allotment for serving downloads only. The mirror's project selection grew organically. I used to reach out to many projects discussing the need for additional mirrors. Some projects outright denied the mirroring request, as Germany already has good academic mirrors boasting 20-25 Gbit/s speeds from the FTP era, which seems fair. Finding the niche was essential to only add software which would truly benefit from additional capacity.
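For the curious, the sync side of such a mirror is typically just a periodic rsync job per project. A sketch of the sort of thing that runs from cron (the host, module and paths here are made up for illustration, not the actual setup):

#!/bin/sh
# Pull the upstream tree; delete removed files only after the
# transfer so clients never see a half-updated mirror.
rsync --archive --delete-delay --delay-updates --timeout=600 \
      rsync://rsync.upstream-project.example/project/ \
      /srv/mirror/project/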
There were months when nothing much would happen with the mirror; rsync would continue to update it while nginx kept serving the traffic. Nowadays, the mirror pushes around 70 TB/month. I occasionally check logs and vnstat, add new security stuff here and there, and pay the bills. It now sometimes saturates the Gigabit link and goes beyond it, peaking around 1.42 Gbit/s (the hosting provider seems to be upping their game). The plan is to upgrade the link to better speeds.
Yearly traffic stats (through vnstat -y)
On the way, I learned quite a few things.

GeoIP map of clients from yesterday's access logs (generated from IPinfo.io)
In hindsight, the statistics look amazing: hundreds of TBs of traffic served from the mirror, month after month. That does show that there's still an appetite for public mirrors in the age of commercially donated CDNs and GitHub. The world could have done with one less mirror, but it saved some time and lessened the burden for others, while providing redundancy and traffic localization with one additional mirror. And it's fun for someone like me who's into the infrastructure that powers the Internet. Now, I'll try focusing on and expanding the India mirror, which has itself started pushing almost half a TB/day. Long live Free Software and public download mirrors.

26 May 2024

Russell Coker: USB-A vs USB-C

USB-A is the original socket for USB at the PC end. There are 2 variants of it: the first is for USB 1.1 to USB 2, and the second is for USB 3, which adds extra pins in a plug and socket compatible manner; you can plug a USB-A device into a USB-A socket without worrying about the speeds of each end, as long as you don't need USB 3 speeds. The differences between USB-A and USB-C are:
  1. USB-C has the same form factor as Thunderbolt and the Thunderbolt protocol can run over it if both ends support it.
  2. USB-C generally supports higher power modes for charging (like 130W for Dell laptops, monitors, and plugpacks), but there's no technical reason why USB-A couldn't do it. You can buy chargers that do 60W over USB-A, which could power one of our laptops via a USB-A to USB-C cable. So high power USB-A is theoretically possible, but generally you won't see it.
  3. USB-C has DisplayPort alternate mode which means using some of the wires for DisplayPort.
  4. USB-C sockets are more likely than USB-A sockets to support the highest speeds, for super speed etc. This is not a difference in the standards, just a choice made by manufacturers.
While USB-C tends to support higher power delivery modes in actual implementations, for connecting to a PC the PC end seems to only support lower power modes regardless of port. I think it would be really good if workstations could connect to monitors via USB-C and provide power, DisplayPort, keyboard, mouse, etc over the same connection. But unfortunately the PC and monitor ends don't appear to support such things. If you don't need any of the benefits in the list above (i.e. you are using USB for almost anything other than connecting a laptop to a dock/monitor/charger), then USB-A will do the job just as well as USB-C. The choice of which type to use should be based on price and which ports are available. E.g. my laptop has 2*USB-C ports and 2*USB-A, so given that one USB-C port is almost always used for the monitor or for charging, I don't really want to use USB-C for anything else, to avoid running out of ports. When buying USB devices you can't always predict which systems you will need to connect them to. Currently there are a lot of systems without USB-C that are working well and have no need to be replaced. I haven't yet seen a system where the majority of ports are USB-C, but that will probably happen in the next few years. Maybe in 2027 there will be PCs on sale with only two USB-A sockets, forcing people who don't want to use a USB hub to save both of them for keyboard and mouse. Currently USB-C keyboards and mice are available on AliExpress, but they are expensive and I haven't seen them in Australian stores. Most computer users don't wear out keyboards or mice, so a lot of USB-A keyboards and mice will be in service for a long time. As an aside, there are still many PCs with PS/2 keyboard and mouse ports in service, so these things don't go away for a long time. There is one corner case where USB-C is convenient: when you want to connect a mass storage device for system recovery or emergency backup, want high speed, and don't want to spend time figuring out which of the ports are super speed (which can be difficult at the back of a PC with poor lighting). With USB-C you can expect a speed of at least 5Gbit/s and don't have to worry about accidentally connecting to a USB 2 port, as is the situation with USB-A. For my own use, the only times that I prefer USB-C over USB-A are for devices to connect to phones. Eventually I'll get a laptop that only has USB-C ports and this will change, but even then adaptors are possible. For someone who doesn't know the details of how things work, it's not unreasonable to just buy the newest stuff and assume it's better, as it usually is. But hopefully blog posts like this can help people make more informed decisions.

20 May 2024

Russell Coker: Respect and Children

I attended the school Yarra Valley Grammar (then Yarra Valley Anglican School, which I will refer to as "YV") and completed year 12 in 1990. The school is currently in the news for a spreadsheet some boys made rating girls, where "unrapeable" was one of the ratings. The school's PR team are now making claims like "Respect for each other is in the DNA of this school". I'd like to know when this DNA change allegedly occurred, because respect definitely wasn't in the school DNA in 1990! Before I go any further I have to note that if the school threatens legal action against me for this post, it will be clear evidence that they don't believe in respect. The actions of that school have wronged me, several of my friends, many people who aren't friends but whom I wish hadn't had to suffer and whose suffering I wish I hadn't had to witness, and presumably countless others that I didn't witness. If they had any decency they would not consider legal action, but I have learned that as an institution they have no decency, so I have to note that they should read the Wikipedia page about the Streisand Effect [1] and keep it in mind before deciding on a course of action. I think it is possible to create a school where most kids enjoy being there and enjoy learning, where hardly any students find it a negative experience and almost no-one finds it traumatic. But it is not possible to do that with the way schools tend to be run. When I was at high school there was a general culture that minor sex crimes committed by boys against boys weren't a problem; this probably applied to all high schools. Things like ripping a boy's pants off (known as "dakking") were considered a big joke. If you accept that ripping the pants off an unwilling boy is a good thing (as was the case when I was at school), then that leads to thinking that describing girls as "unrapeable" is acceptable. The Wikipedia page for Pantsing [2] has a reference for this issue being raised as a serious problem by the British Secretary of State for Education and Skills Alan Johnson in 2007. So this has continued to be a widespread problem around the world. Has YV become better than other schools in dealing with it, or are dakking and wedgies as well accepted now as they were when I attended? There is talk about schools preparing kids for the workforce, but grabbing someone's underpants without consent will result in instant dismissal from almost all employment. There should be more tolerance for making mistakes at school than at work, but schools shouldn't tolerate what would be serious crimes in other contexts. For work environments there have been significant changes to what is accepted, so it doesn't seem unreasonable to expect that schools can have a similar change in culture. One would hope that spending 6 years wondering who's going to grab your underpants next would teach boys the importance of consent and some sympathy for victims of other forms of sexual assault. But that doesn't seem to happen; apparently it's often the opposite. When I was young, Autism wasn't diagnosed for anyone who was capable of having a normal life. Teachers noticed that I wasn't like other kids; some were nice, but some encouraged other boys to attack me as a form of corporal punishment by proxy, not as a punishment for doing anything wrong (detentions were adequate for that) but for being different. The lesson kids will take from that sort of thing is that if you are in a position of power you can mistreat other people and get away with it.
There was a girl in my year level at YV who would probably be diagnosed as Autistic by today's standards; the way I witnessed her being treated was considerably worse than what was described in the recent news reports, and it is quite likely that worse things have been done recently which haven't made the news yet. If this issue is declared to be over after 4 boys were expelled, then I'll count that as evidence of a cover-up. These things don't happen in a vacuum; there's a culture that permits and encourages it. The word "respect" has different meanings: it can mean "treat a superior as the master" or "treat someone as a human being". The phrase "if you treat me with respect I'll treat you with respect" usually means "if you treat me as the boss then I'll treat you as a human being". The distinction is very important when discussing respect in schools. If teachers are considered the ultimate bosses whose behaviour can never be questioned, then many boys won't need much help from Andrew Tate in developing the belief that they should be the boss of girls in the same way. Do any schools have a process for having students review teachers? Does YV have an ombudsman to take reports of misbehaving teachers, in the way that corporations typically have an ombudsman to take reports about bad managers? Any time you have people whose behaviour is beyond scrutiny or oversight, you will inevitably have bad people apply for jobs; then bad things will happen and it will create a culture of bad behaviour. If teachers can treat kids badly then kids will treat other kids badly, and this generally ends with girls being treated badly by boys. My experience at YV was that kids barely had the status of people. It seemed that the school operated more as a caretaker of the property of parents than as an organisation that cares for people. The current YV website has a Whistleblower policy [3] that has only one occurrence of the word "student", and that is about issues that endanger the health or safety of students. Students are the people most vulnerable to reprisal for complaining, and not being listed as an eligible whistleblower shows their status. The web site also has a flowchart for complaints and grievances [4] which doesn't describe any policy for a complaint to be initiated by a student. One would hope that parents would advocate for their children, but that often isn't the case. When discussing the possibility of boys being bullied at school with parents, I've had them say things like "my son wouldn't be so weak that he would be bullied"; no boy will tell his parents about being bullied if that's their attitude! I imagine that there are similar but different issues of parents victim-blaming when their daughter is bullied (presumably substituting "immoral" for "weak"), but I don't have direct knowledge of the topic. The experience of many kids is being disrespected by their parents, the school system, and often siblings too. A school can't solve all the world's problems, but it can ideally be a refuge for kids who have problems at home. When I was at school the culture in the country and the school was homophobic. One teacher, when discussing issues such as how students could tell him if they had psychological problems and no-one else to talk to, said things like "the Village People make really good music", which was the only time any teacher said anything like "It's OK to be gay" (the Village People were the gayest pop group at the time). A lot of the bullying at school had a sexual component to it.
In addition to the wedgies and dakking (which, while not happening often, were things you had to constantly be aware of), I routinely avoided PE classes where a shower was necessary because of a thug who hung around by the showers and looked hungrily at my penis; I don't know if he had a particular liking for mine or if he stared at everyone that way. Flashing and perving was quite common in change rooms. Presumably such accepted boy-boy sexual misbehaviour led to boys mistreating girls. I currently work for a company that actively tells its employees about the possibility of free psychological assistance. Any employee can phone a psychologist to discuss problems (whether or not they are work related) free of charge and without their manager or colleagues knowing. The company is billed and is only given a breakdown of the number of people who used the service and roughly what the issue was (work stress, family, friends, grief, etc). When something noteworthy happens, employees are given reminders about this, such as "if you need help after seeing a homeless man try to steal a laptop from the office then feel free to call the assistance program". Do schools offer something similar? With the school fees paid to a school like YV, they should be able to afford plenty of psychologist time. Every day I was at YV I saw something considerably worse than laptop theft; most days something was done to me. The problems with schools are part of larger problems with society. About half of the adults in Australia still support the Liberal party in spite of their support of Christian Porter, Cardinal Pell, and Bruce Lehrmann. It's not logical to expect such parents to discourage their sons from mistreating girls or to encourage their daughters to complain when they are mistreated. The Anglican church has recently changed its policy to suggesting that victims of sexual abuse can contact the police instead of or in addition to the church; previously they had encouraged victims to only contact the church, which facilitated cover-ups. One would hope that schools associated with the Anglican church have also changed their practices towards such things. I approve of the "respect is in our DNA" concept; it's like Google's former slogan of "Don't be evil", which is something that they can be bound to. Here's a list of questions that could be asked of schools (not just YV but all schools) by journalists when reporting on such things:
  1. Do you have a policy of not trying to silence past students who have been treated badly?
  2. Do you take all sexual assaults seriously including wedgies and dakking?
  3. Do you take all violence at school seriously? Even if there s no blood? Even if the victim says they don t want to make an issue of it?
  4. What are your procedures to deal with misbehaviour from teachers? Do the students all know how to file complaints? Do they know that they can file a complaint if they aren t the victim?
  5. Does the school have policies against homophobia and transphobia and are they enforced?
  6. Does the school offer free psychological assistance to students and staff who need it? NB This only applies to private schools like YV that have huge amounts of money, public schools can t afford that.
  7. Are serious incidents investigated by people who are independent of the school and who don t have a vested interest in keeping things quiet?
  8. Do you encourage students to seek external help from organisations like the ones on the resources list of the Grace Tame Foundation [5]? Having your own list of recommended external organisations would be good too.
Counter Arguments

I've had practice debating such things; here are some responses to common counter arguments.

Conclusion

I don't think that YV is necessarily worse than other schools, although I'm sure that representatives of other private schools are now working to assure parents of students and prospective students that they are. I don't think that all the people who were employed as teachers there when I attended were bad people; some of them were nice people who were competent teachers. But a few good people can't turn around a bad system. I will note that when I attended, all the sports teachers were decent people; it was the only department I could say such things about. But sport involves situations that can lead to a bad result; issues started at other times and places can lead to violence or harassment in PE classes regardless of how good the teachers are. Teachers who know that there are problems need to be able to raise issues with the administration. When a teacher quits teaching to join the clergy and another teacher describes it as "a loss for the clergy but a gain for YV", it raises the question of why the bad teacher in question couldn't have been encouraged to leave earlier. A significant portion of the population will do whatever is permitted. If you say "no teacher would ever bully a student so we don't need to look out for that", then some teacher will do exactly that. I hope that this will lead to changes both in YV and in other schools. But if they declare this issue as resolved after expelling 4 students, then something similar or worse will happen again. At least now students know that when this sort of thing happens they can send evidence to journalists to get some action.

14 May 2024

Julian Andres Klode: The new APT 3.0 solver

APT 2.9.3 introduces the first iteration of the new solver, codenamed solver3, now available via the --solver 3.0 option. The new solver works fundamentally differently from the old one.

How does it work?

Solver3 is a fully backtracking dependency solving algorithm that defers choices to as late as possible. It starts with an empty set of packages, then adds the manually installed packages, and then installs packages automatically as necessary to satisfy the dependencies. Deferring the choices is implemented in multiple ways: First, all install requests recursively mark dependencies with a single solution for install, and any packages that are being rejected due to conflicts or user requests will cause their reverse dependencies to be transitively marked as rejected, provided their or group cannot be solved by a different package. Second, any dependency with more than one choice is pushed to a priority queue that is ordered by the number of possible solutions, such that we resolve a|b before a|b|c. Not just by the number of solutions, though: one important point to note is that optional dependencies, that is Recommends, always sort after mandatory dependencies. Do note, however, that Recommends do not nest in backtracking: dependencies of a Recommended package are not themselves optional, so they will have to be resolved before the next Recommended package is seen in the queue. Another important step in deferring choices is extracting the common dependencies of a package across its versions and then installing them before we even decide which of its versions we want to install; one of the dependencies might cycle back to a specific version after all. Decisions are recorded at a certain decision level; if we reach a conflict we backtrack to the previous decision level, mark the decision we made (install X) in the inverse (do not install X), reset all decisions made at higher levels, and restore any dependencies that are no longer resolved to the work queue.
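To make the shape of that concrete, here is a deliberately tiny Python sketch of the same idea (a toy model, not APT's actual code): unresolved OR-groups sit in a queue ordered by number of alternatives, each choice opens a decision level, and a conflict makes us back out and try the next alternative:

# DEPENDS maps a package to a list of OR-groups (lists of alternatives);
# CONFLICTS holds unordered pairs that cannot be co-installed.
DEPENDS = {
    "editor": [["gtk", "qt"]],    # editor Depends: gtk | qt
    "gtk":    [["libold"]],       # gtk Depends: libold
}
CONFLICTS = {frozenset(("libold", "editor"))}

def solve(installed, work):
    # Drop OR-groups already satisfied by something installed.
    work = [g for g in work if not any(alt in installed for alt in g)]
    if not work:
        return installed                    # everything resolved
    work.sort(key=len)                      # fewest alternatives first
    group, rest = work[0], work[1:]
    for alt in group:                       # one decision level per group
        if all(frozenset((alt, p)) not in CONFLICTS for p in installed):
            result = solve(installed | {alt}, rest + DEPENDS.get(alt, []))
            if result is not None:
                return result               # this choice worked out
    return None                             # all alternatives failed: backtrack

print(sorted(solve({"editor"}, DEPENDS["editor"])))
# gtk is tried first, its dependency libold conflicts with editor,
# so the solver backtracks and settles on qt: ['editor', 'qt']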

Comparison to SAT solver design

If you have studied SAT solver design, you'll find that this is essentially a DPLL solver without pure literal elimination. A pure literal elimination phase would not work for a package manager: first, negative pure literals (packages that everything conflicts with) do not exist, and positive pure literals (packages nothing conflicts with) we do not want to mark for install; we want to install as little as possible (well, subject to policy). As part of the solving phase, we also construct an implication graph, albeit a partial one: the first package installing another package is marked as the reason (A -> B), and the same goes for conflicts (not A -> not B). Once we have added the ability to have multiple parents in the implication graph, it stands to reason that we can also implement the much more advanced method of conflict-driven clause learning, where we do not jump back to the previous decision level but exactly to the decision level that caused the conflict. This would massively speed up backtracking.

What changes can you expect in behavior?

The most striking difference to the classic APT solver is that solver3 always keeps manually installed packages around; it never offers to remove them. We will relax that in a future iteration so that it can replace packages with new ones: that is, if your package is no longer available in the repository (obsolete), but there is one that Conflicts+Replaces+Provides it, solver3 will be allowed to install that one and remove the other. Implementing that policy is rather trivial: we just need to queue the obsolete package's replacement as a dependency to solve, rather than mark the obsolete package for install. Another critical difference is the change in autoremove behavior: the new solver currently only knows the strongest dependency chain to each package, and hence it will not keep around any packages that are only reachable via weaker chains. A common example is when gcc-<version> packages accumulate on your system over the years. They all have Provides: c-compiler, and the libtool Depends: gcc | c-compiler is enough to keep them around.
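For concreteness, the Conflicts+Replaces+Provides pattern mentioned above looks like this in a binary package stanza (package names invented for the example):

Package: foo2
Conflicts: foo
Replaces: foo
Provides: foo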

New features

The new option --no-strict-pinning instructs the solver to consider all versions of a package and not just the candidate version. For example, you could use apt install foo=2.0 --no-strict-pinning to install version 2.0 of foo and upgrade, or downgrade, packages as needed to satisfy foo=2.0's dependencies. This mostly comes in handy in use cases involving Debian experimental or the Ubuntu proposed pockets, where you want to install a package from there but satisfy as much as possible from the normal release. The implication graph building allows us to implement an apt why command that, while not as nicely detailed as aptitude's, at least tells you the exact reason why a package is installed. It will only show the strongest dependency chain at first, of course, since that is what we record.

What is left to do?

At the moment, error information is not stored across backtracking in any way, but we generally will want to show you the first conflict we reach, as it is the most natural one, or all conflicts. Currently you get the last conflict, which may not be particularly useful. Likewise, errors are currently just rendered as implication graphs of the form [not] A -> [not] B -> ..., and we need to put in some work to present those nicely. The test suite is not passing yet; I haven't really started working on it. A challenge is that most packages in the test suite are manually installed as they are mocked, and the solver now doesn't remove those. We plan to implement the replacement logic such that foo can be replaced by a foo2 that Conflicts/Replaces/Provides foo without needing to be automatically installed. Improving the backtracking to non-chronological conflict-driven clause learning would vastly enhance our backtracking performance. Not that it seems to be an issue right now in my limited testing (mostly noble 64-bit-time_t upgrades). A lot of the complexity you would normally have is not there, because the manually installed packages and the resulting unit propagation (single-solution Depends, and reverse dependencies for Conflicts) already ground us fairly firmly in what changes we can actually make. Once all the stuff has landed, we need to start rolling it out and gathering feedback. On Ubuntu I'd like automated feedback on regressions (running solver3 in parallel, checking if the result is worse, and then submitting an error to the error tracker); on Debian this could just be a role email address to send solver dumps to. At the same time, we can also roll this out incrementally: like phased updates in Ubuntu, we can roll out the new solver as the default to 10%, 20%, 50% of users before going to the full 100%. This will allow us to capture regressions early and fix them.

Freexian Collaborators: Monthly report about Debian Long Term Support, April 2024 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors

In April, 19 contributors were paid to work on Debian LTS. Their reports are available:
  • Abhijith PA did 0.5h (out of 0.0h assigned and 14.0h from previous period), thus carrying over 13.5h to the next month.
  • Adrian Bunk did 35.75h (out of 17.25h assigned and 40.5h from previous period), thus carrying over 22.0h to the next month.
  • Bastien Roucariès did 25.0h (out of 25.0h assigned).
  • Ben Hutchings did 24.0h (out of 9.0h assigned and 15.0h from previous period).
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 10.0h (out of 10.0h assigned).
  • Emilio Pozuelo Monfort did 46.0h (out of 12.0h assigned and 34.0h from previous period).
  • Guilhem Moulin did 14.75h (out of 20.0h assigned), thus carrying over 5.25h to the next month.
  • Lee Garrett did 51.25h (out of 0.0h assigned and 60.0h from previous period), thus carrying over 8.75h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Ola Lundqvist did 22.5h (out of 19.5h assigned and 4.5h from previous period), thus carrying over 1.5h to the next month.
  • Roberto C. Sánchez did 11.0h (out of 9.25h assigned and 2.75h from previous period), thus carrying over 1.0h to the next month.
  • Santiago Ruano Rincón did 20.0h (out of 20.0h assigned).
  • Sean Whitton did 9.5h (out of 4.5h assigned and 5.5h from previous period), thus carrying over 0.5h to the next month.
  • Stefano Rivera did 1.5h (out of 0.0h assigned and 10.0h from previous period), thus carrying over 8.5h to the next month.
  • Sylvain Beucler did 12.5h (out of 22.75h assigned and 35.0h from previous period), thus carrying over 45.25h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 10.0h (out of 12.0h assigned), thus carrying over 2.0h to the next month.
  • Utkarsh Gupta did 3.25h (out of 28.5h assigned and 29.25h from previous period), thus carrying over 54.5h to the next month.

Evolution of the situation

In April, we released 28 DLAs. During the month of April, there was one particularly notable security update made in LTS. Guilhem Moulin prepared DLA-3782-1 for util-linux (part of the set of base packages and containing a number of important system utilities) in order to address a possible information disclosure vulnerability. Additionally, several contributors prepared updates for oldstable (bullseye), stable (bookworm), and unstable (sid), including:
  • ruby-rack: prepared for oldstable, stable, and unstable by Adrian Bunk
  • wpa: prepared for oldstable, stable, and unstable by Bastien Roucariès
  • zookeeper: prepared for stable by Bastien Roucariès
  • libjson-smart: prepared for unstable by Bastien Roucariès
  • ansible: prepared for stable and unstable, including autopkgtest fixes to increase future supportability, by Lee Garrett
  • wordpress: prepared for oldstable and stable by Markus Koschany
  • emacs and org-mode: prepared for oldstable and stable by Sean Whitton
  • qtbase-opensource-src: prepared for oldstable and stable by Thorsten Alteholz
  • libjwt: prepared for oldstable by Thorsten Alteholz
  • libmicrohttpd: prepared for oldstable by Thorsten Alteholz
These fixes were in addition to corresponding updates in LTS. Another item to highlight in this month's report is an update to the distro-info-data database by Stefano Rivera. This update ensures that Debian buster systems have the latest available information concerning the end-of-life dates and other related information for all releases of Debian and Ubuntu. As announced on the debian-lts-announce mailing list, it is worth pointing out that we are getting close to the end of support of Debian 10 as LTS. After June 30th, no new security updates will be made available on security.debian.org. However, Freexian and its team of paid Debian contributors will continue to maintain Debian 10 going forward for the customers of the Extended LTS offer. If you still have Debian 10 servers to keep secure, it's time to subscribe!

Thanks to our sponsors Sponsors that joined recently are in bold.

28 April 2024

Russell Coker: USB PSUs

I just bought a new USB PSU from AliExpress [1]. I got this to reduce the clutter in my bedroom: I charge my laptop, PineTime, and a few phones at the same time, and a single PSU with lots of ports makes that easier. I also bought a couple of really short USB-C cables, as it's been proven by both real life tests and mathematical modelling that shorter cables get tangled less. This power supply is based on Gallium Nitride (GaN) [2] technology, which makes it efficient and cool. One thing I only learned about after that purchase is the new USB PPS standard (see the USB Wikipedia page for details [3]). The PPS (Programmable Power Supply) standard allows (quoting Wikipedia) "a voltage range of 3.3 to 21 V in 20 mV steps, and a current specified in 50 mA steps, to facilitate constant-voltage and constant-current charging". What this means in practice (when phones support it, which for me will probably be 2029 or something) is that the phone could receive power exactly matching the voltage needed for the battery and not have to do any voltage conversion inside the phone. Phones are designed to stop charging at a certain temperature; this probably doesn't concern people in places like Northern Europe, but in Australia it can be an issue. Removing the heat dissipated by inefficiencies in the voltage conversion circuitry means the phone will be cooler when charging and can charge at a higher rate. There is a Certified USB Fast Charger logo for chargers which do this, but it seems that at the moment they just include PPS in the feature list. So I highly recommend that GaN and PPS be on the feature list for your next USB PSU, but failing that the 240W PSU I bought for $36 was a good deal.
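As a back-of-the-envelope illustration of those step sizes, converting a requested voltage and current into the 20 mV and 50 mA units that PPS messages use is simple arithmetic. This is a hypothetical Rust helper for illustration only, not code from any real USB PD stack:
fn pps_units(volts: f64, amps: f64) -> (u16, u16) {
    // PPS encodes the requested output voltage in 20 mV units and the
    // operating current in 50 mA units, so a charger request carries
    // these two small integers rather than raw millivolt values.
    let v_units = (volts * 1000.0 / 20.0).round() as u16;
    let i_units = (amps * 1000.0 / 50.0).round() as u16;
    (v_units, i_units)
}
fn main() {
    // Asking for 8.4 V at 2.25 A yields 420 voltage units and 45 current units.
    let (v, i) = pps_units(8.4, 2.25);
    println!("voltage field: {v}, current field: {i}");
}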

18 April 2024

Jonathan McDowell: Sorting out backup internet #2: 5G modem

Having set up recursive DNS it was time to actually sort out a backup internet connection. I live in a Virgin Media area, but I still haven't forgiven them for my terrible Virgin experiences when moving here. Plus it involves a bigger contractual commitment. There are no altnets locally (though I'm watching youfibre, who have already rolled out in a few Belfast exchanges), so I decided to go for a 5G modem. That gives some flexibility, and is a bit easier to get up and running. I started by purchasing a ZTE MC7010. This had the advantage of being reasonably cheap off eBay, not having any wifi functionality I would just have to disable (it's going to be plugged into the same router the FTTP connection terminates on), being outdoor mountable should I decide to go that way, and, finally, being powered via PoE. For now this device sits on the window sill in my study, which is at the top of the house. I printed a table stand for it which mostly does the job (though not as well with a normal, rather than flat, network cable). The router lives downstairs, so I've extended a dedicated VLAN through the study switch, down to the core switch and out to the router. The PoE study switch can only do GigE, not 2.5Gb/s, but at present that's far from the limiting factor on the speed of the connection. The device is 3 branded, and, as it happens, I've ended up with a 3 SIM in it. Up until recently my personal phone was with them, but they've kicked me off Go Roam, so I've moved. Going with 3 for the backup connection provides some slight extra measure of resiliency; we now have devices on all 4 major UK networks in the house. The SIM is a preloaded data-only SIM good for a year; I don't expect to use all of the data allowance, but I didn't want to have to worry about unexpected excess charges. Performance turns out to be disappointing; I end up locking the device to 4G, as the 5G signal is marginal and leaving it enabled results in constant switching between 4G + 5G and significant extra latency. The smokeping graph below shows a brief period where I removed the 4G lock and allowed 5G: [Smokeping 4G vs 5G graph] (There's a handy zte.js script to allow doing this from the device web interface.) I get about 10Mb/s sustained downloads out of it. EE/Vodafone did not lead to significantly better results, so for now I'm accepting it is what it is. I tried relocating the device to another part of the house (a little tricky while still providing switch-based PoE, but I have an injector), without much improvement. Equally, pinning the 4G to certain bands provided a short-term improvement (I got up to 40-50Mb/s sustained), but not reliably so. [speedtest.net results] This is disappointing, but if it turns out to be a problem I can look at mounting it externally. I also assume as 5G is gradually rolled out further things will naturally improve, but that might be wishful thinking on my part. Rather than wait until my main link had a problem I decided to try a day working over the 5G connection. I spend a lot of my time either in browser-based apps or accessing remote systems via SSH, so I'm reasonably sensitive to a jittery or otherwise flaky connection. I picked a day that I did not have any meetings planned, but as it happened I ended up with an adhoc video call arranged. I'm pleased to say that it all worked just fine; definitely noticeable as slower than the FTTP connection (to be expected), but all workable and even the video call was fine (at least from my end).
Looking at the traffic graph shows the expected ~10Mb/s peak (actually a little higher, and looking at the FTTP stats for previous days not out of keeping with what we see there), and you can just about see the ~3Mb/s symmetric use by the video call at 2pm: [4G traffic during the work day] The test run also helped iron out the fact that the content filter was still enabled on the SIM, but that was easily resolved. Up next, vaguely automatic failover.

13 April 2024

Paul Tagliamonte: Domo Arigato, Mr. debugfs

Years ago, at what I think I remember was DebConf 15, I hacked for a while on debhelper to write build-ids to debian binary control files, so that the build-id (more specifically, the ELF note .note.gnu.build-id) wound up in the Debian apt archive metadata. I've always thought this was super cool, and seeing as how Michael Stapelberg blogged some great pointers around the ecosystem, including the fancy new debuginfod service, and the find-dbgsym-packages helper, which uses these same headers, I don't think I'm the only one. At work I've been using a lot of rust, specifically, async rust using tokio. To try and work on my style, and to dig deeper into the how and why of the decisions made in these frameworks, I've decided to hack up a project that I've wanted to do ever since 2015: write a debug filesystem. Let's get to it.

Back to the Future Time to admit something. I really love Plan 9. It's just so good. So many ideas from Plan 9 are just so prescient, and everything just feels right. Not just right like, feels good like, correct. The bit that I've always liked the most is 9p, the network protocol for serving a filesystem over a network. This leads to all sorts of fun programs, like the Plan 9 ftp client being a 9p server: you mount the ftp server and access files like any other files. It's kinda like if fuse were more fully a part of how the operating system worked, but fuse is all running client-side. With 9p there's a single client, and different servers that you can connect to, which may be backed by a hard drive, remote resources over something like SFTP, FTP, HTTP or even purely synthetic. The interesting (maybe sad?) part here is that 9p wound up outliving Plan 9 in terms of adoption: 9p is in all sorts of places folks don't usually expect. For instance, the Windows Subsystem for Linux uses the 9p protocol to share files between Windows and Linux. ChromeOS uses it to share files with Crostini, and qemu uses 9p (virtio-p9) to share files between guest and host. If you're noticing a pattern here, you'd be right; for some reason 9p is the go-to protocol to exchange files between hypervisor and guest. Why? I have no idea, except maybe due to being designed well, simple to implement, and it's a lot easier to validate the data being shared and validate security boundaries. Simplicity has its value. As a result, there's a lot of lingering 9p support kicking around. Turns out Linux can even handle mounting 9p filesystems out of the box. This means that I can deploy a filesystem to my LAN or my localhost by running a process on top of a computer that needs nothing special, and mount it over the network on an unmodified machine, unlike fuse, where you'd need client-specific software to run in order to mount the directory. For instance, let's mount a 9p filesystem running on my localhost machine, serving requests on 127.0.0.1:564 (tcp) that goes by the name mountpointname, to /mnt.
$ mount -t 9p \
-o trans=tcp,port=564,version=9p2000.u,aname=mountpointname \
127.0.0.1 \
/mnt
Linux will mount away, and attach to the filesystem as the root user, and by default, attach to that mountpoint again for each local user that attempts to use it. Nifty, right? I think so. The server is able to keep track of per-user access and authorization along with the host OS.

WHEREIN I STYX WITH IT Since I wanted to push myself a bit more with rust and tokio specifically, I opted to implement the whole stack myself, without third party libraries on the critical path where I could avoid it. The 9p protocol (sometimes called Styx, the original name for it) is incredibly simple. It's a series of client-to-server requests, each of which receives a server-to-client response. These are, respectively, T messages, which transmit a request to the server, which trigger an R message in response (Reply messages). These messages are TLV payloads with a very straightforward structure; so straightforward, in fact, that I was able to implement a working server off nothing more than a handful of man pages. Later on, after the basics worked, I found a more complete spec page that contains more information about the unix-specific variant that I opted to use (9P2000.u rather than 9P2000) due to the level of Linux-specific support for the 9P2000.u variant over the 9P2000 protocol.
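To make the "incredibly simple" claim concrete: every 9p message, T or R, starts with the same three framing fields, which a server can peel off before dispatching on the message type. This is a minimal illustrative sketch (not the arigato API), assuming 9P2000-style framing with little-endian integers:
// Every 9p message is framed as size[4] type[1] tag[2] followed by the
// type-specific body. The size field is a little-endian u32 that counts
// the entire message, including the size field itself; the tag pairs an
// R (reply) message with the T (transmit) message it answers.
struct Header {
    size: u32,
    mtype: u8, // e.g. in 9P2000, 100 is Tversion and 101 is Rversion
    tag: u16,
}
fn parse_header(buf: &[u8]) -> Option<Header> {
    if buf.len() < 7 {
        return None; // not enough bytes for a complete header yet
    }
    Some(Header {
        size: u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]),
        mtype: buf[4],
        tag: u16::from_le_bytes([buf[5], buf[6]]),
    })
}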

MR ROBOTO The backend stack over at zoo is rust and tokio running i/o for an HTTP and WebRTC server. I figured I'd pick something fairly similar to write my filesystem with, since 9P can be implemented on basically anything with I/O. That means tokio tcp server bits, which construct and use a 9p server, which has an idiomatic Rusty API that partially abstracts the raw R and T messages, but not so much as to cause issues with hiding implementation possibilities. At each abstraction level, there's an escape hatch allowing someone to implement any of the layers if required. I called this framework arigato, which can be found over on docs.rs and crates.io.
/// Simplified version of the arigato File trait; this isn't actually
/// the same trait; there's some small cosmetic differences. The
/// actual trait can be found at:
///
/// https://docs.rs/arigato/latest/arigato/server/trait.File.html
trait File {
    /// OpenFile is the type returned by this File via an Open call.
    type OpenFile: OpenFile;

    /// Return the 9p Qid for this file. A file is the same if the Qid is
    /// the same. A Qid contains information about the mode of the file,
    /// version of the file, and a unique 64 bit identifier.
    fn qid(&self) -> Qid;

    /// Construct the 9p Stat struct with metadata about a file.
    async fn stat(&self) -> FileResult<Stat>;

    /// Attempt to update the file metadata.
    async fn wstat(&mut self, s: &Stat) -> FileResult<()>;

    /// Traverse the filesystem tree.
    async fn walk(&self, path: &[&str]) -> FileResult<(Option<Self>, Vec<Self>)>;

    /// Request that a file's reference be removed from the file tree.
    async fn unlink(&mut self) -> FileResult<()>;

    /// Create a file at a specific location in the file tree.
    async fn create(
        &mut self,
        name: &str,
        perm: u16,
        ty: FileType,
        mode: OpenMode,
        extension: &str,
    ) -> FileResult<Self>;

    /// Open the File, returning a handle to the open file, which handles
    /// file i/o. This is split into a second type since it is genuinely
    /// unrelated -- and the fact that a file is Open or Closed can be
    /// handled by the arigato server for us.
    async fn open(&mut self, mode: OpenMode) -> FileResult<Self::OpenFile>;
}
/// Simplified version of the arigato OpenFile trait; this isn't actually
/// the same trait; there's some small cosmetic differences. The
/// actual trait can be found at:
///
/// https://docs.rs/arigato/latest/arigato/server/trait.OpenFile.html
trait OpenFile {
    /// iounit to report for this file. The iounit reported is used for Read
    /// or Write operations to signal, if non-zero, the maximum size that is
    /// guaranteed to be transferred atomically.
    fn iounit(&self) -> u32;

    /// Read some number of bytes up to buf.len() from the provided
    /// offset of the underlying file. The number of bytes read is
    /// returned.
    async fn read_at(
        &mut self,
        buf: &mut [u8],
        offset: u64,
    ) -> FileResult<u32>;

    /// Write some number of bytes up to buf.len() from the provided
    /// offset of the underlying file. The number of bytes written
    /// is returned.
    async fn write_at(
        &mut self,
        buf: &mut [u8],
        offset: u64,
    ) -> FileResult<u32>;
}

Thanks, decade ago paultag! Let's do it! Let's use arigato to implement a 9p filesystem we'll call debugfs that will serve all the debug files shipped according to the Packages metadata from the apt archive. We'll fetch the Packages file and construct a filesystem based on the reported Build-Id entries. For those who don't know much about how an apt repo works, here's the 2-second crash course on what we're doing. The first thing is to fetch the Packages file, which is specific to a binary architecture (such as amd64, arm64 or riscv64). That architecture is specific to a component (such as main, contrib or non-free). That component is specific to a suite, such as stable, unstable or any of its aliases (bullseye, bookworm, etc). Let's take a look at the Packages.xz file for the unstable-debug suite, main component, for all amd64 binaries.
$ curl \
https://deb.debian.org/debian-debug/dists/unstable-debug/main/binary-amd64/Packages.xz \
  | unxz
This will return the Debian-style rfc2822-like headers, which are an export of the metadata contained inside each .deb file, which apt (or other tools that can use the apt repo format) uses to fetch information about debs. Let's take a look at the debug headers for the netlabel-tools package in unstable, which is a package named netlabel-tools-dbgsym in unstable-debug.
Package: netlabel-tools-dbgsym
Source: netlabel-tools (0.30.0-1)
Version: 0.30.0-1+b1
Installed-Size: 79
Maintainer: Paul Tagliamonte <paultag@debian.org>
Architecture: amd64
Depends: netlabel-tools (= 0.30.0-1+b1)
Description: debug symbols for netlabel-tools
Auto-Built-Package: debug-symbols
Build-Ids: e59f81f6573dadd5d95a6e4474d9388ab2777e2a
Description-md5: a0e587a0cf730c88a4010f78562e6db7
Section: debug
Priority: optional
Filename: pool/main/n/netlabel-tools/netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
Size: 62776
SHA256: 0e9bdb087617f0350995a84fb9aa84541bc4df45c6cd717f2157aa83711d0c60
So here, we can parse the package headers in the Packages.xz file, and store, for each Build-Id, the Filename where we can fetch the .deb from. Each .deb contains a number of files, but we're only really interested in the files inside the .deb located at or under /usr/lib/debug/.build-id/, which you can find in debugfs under rfc822.rs. It's crude, and very single-purpose, but I'm feeling a bit lazy.
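As a rough illustration of that index-building step (a sketch assuming a decompressed Packages file as input, not the actual rfc822.rs code):
// Scan the RFC822-style stanzas of a Packages file and record, for each
// Build-Id, the pool Filename the corresponding .deb can be fetched from.
// Multi-line continuation fields are ignored for brevity.
use std::collections::HashMap;

fn index_build_ids(packages: &str) -> HashMap<String, String> {
    let mut map = HashMap::new();
    // Stanzas are separated by blank lines; each is a run of "Key: value" fields.
    for stanza in packages.split("\n\n") {
        let mut build_ids: Vec<&str> = Vec::new();
        let mut filename: Option<&str> = None;
        for line in stanza.lines() {
            if let Some(v) = line.strip_prefix("Build-Ids: ") {
                // A package may list several space-separated build-ids.
                build_ids.extend(v.split_whitespace());
            } else if let Some(v) = line.strip_prefix("Filename: ") {
                filename = Some(v);
            }
        }
        if let Some(f) = filename {
            for id in &build_ids {
                map.insert(id.to_string(), f.to_string());
            }
        }
    }
    map
}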

Who needs dpkg?! For folks who haven't seen it yet, a .deb file is a special type of .ar file, that contains (usually) three files inside: debian-binary, control.tar.xz and data.tar.xz. The core of an .ar file is a fixed-size (60 byte) entry header, followed by the specified number of bytes of data.
[8 byte .ar file magic]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
...
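As a minimal sketch of what decoding one of those 60-byte headers involves (illustrative only, not the actual ar.rs code that follows), each header is a run of fixed-width ASCII fields:
// Field layout of a 60-byte ar entry header: name[0..16], mtime[16..28],
// uid[28..34], gid[34..40], mode[40..48], size[48..58], magic "`\n"[58..60].
// Returns the size of the data that follows the header.
fn entry_size(header: &[u8; 60]) -> Option<u64> {
    if &header[58..60] != b"`\n" {
        return None; // corrupt or misaligned header
    }
    // The size field is decimal ASCII, right-padded with spaces.
    let size = std::str::from_utf8(&header[48..58]).ok()?;
    size.trim_end().parse::<u64>().ok()
}
(A full parser also has to account for entry data being padded to an even length when walking from one header to the next.)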
First up was to implement a basic ar parser in ar.rs. Before we get into using it to parse a deb, as a quick diversion, let's break apart a .deb file by hand, something that is a bit of a rite of passage (or at least it used to be? I'm getting old) during the Debian nm (new member) process, to take a look at where exactly the .debug file lives inside the .deb file.
$ ar x netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ ls
control.tar.xz debian-binary
data.tar.xz netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ tar --list -f data.tar.xz | grep '.debug$'
./usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
Since we know quite a bit about the structure of a .deb file, and I had to implement support from scratch anyway, I opted to implement a (very!) basic debfile parser using HTTP Range requests. HTTP Range requests, if supported by the server (denoted by an accept-ranges: bytes HTTP header in response to an HTTP HEAD request to that file), mean that we can add a header such as range: bytes=8-68 to specifically request that the returned GET body be the byte range provided (in the above case, the bytes starting from byte offset 8 until byte offset 68). This means we can fetch just the ar file entry headers from the .deb file until we get to the file inside the .deb we are interested in (in our case, the data.tar.xz file), at which point we can request the body of that file with a final range request. I wound up writing a struct to handle a read_at-style API surface in hrange.rs, which we can pair with ar.rs above and start to find our data in the .deb remotely without downloading and unpacking the .deb at all. After we have the body of the data.tar.xz coming back through the HTTP response, we get to pipe it through an xz decompressor (this kinda sucked in Rust, since a tokio AsyncRead is not the same as an http Body response, is not the same as std::io::Read, is not the same as an async (or sync) Iterator, is not the same as what the xz2 crate expects; leading me to read blocks of data to a buffer and stuff them through the decoder by looping over the buffer for each lzma2 packet in a loop), and a tarfile parser (similarly troublesome). From there we get to iterate over all entries in the tarfile, stopping when we reach our file of interest. Since we can't seek, but gdb needs to, we'll pull it out of the stream into a Cursor<Vec<u8>> in-memory and pass a handle to it back to the user. From here on out it's a matter of gluing together a File traited struct in debugfs, and serving the filesystem over TCP using arigato. Done deal!
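For illustration, here is a hedged sketch of that read_at-style primitive over HTTP Range requests (the real one lives in hrange.rs; this simplified blocking version assumes the reqwest crate and a server that honours Range headers):
use reqwest::blocking::Client;
use reqwest::header::RANGE;

// Fetch exactly the bytes [offset, offset + len) of a remote file.
fn read_at(client: &Client, url: &str, offset: u64, len: u64) -> reqwest::Result<Vec<u8>> {
    // Range headers use inclusive bounds: "bytes=8-67" is 60 bytes at offset 8.
    let range = format!("bytes={}-{}", offset, offset + len - 1);
    // A server that honours the range replies 206 Partial Content with
    // only the requested slice as the body.
    let resp = client.get(url).header(RANGE, range).send()?;
    Ok(resp.bytes()?.to_vec())
}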

A quick diversion about compression I was originally hoping to avoid transferring the whole tar file over the network (and therefore also reading the whole debug file into ram, which objectively sucks), but quickly hit issues with figuring out a way around seeking around an xz file. What's interesting is that xz has a great primitive to solve this specific problem (specifically, use a block size that allows you to seek to the block as close to your desired seek position just before it, only discarding at most block size - 1 bytes), but data.tar.xz files generated by dpkg appear to have a single mega-huge block for the whole file. I don't know why I would have expected any different, in retrospect. That means that this now devolves into the base case of "How do I seek around an lzma2 compressed data stream?", which is a lot more complex of a question. Thankfully, the notoriously brilliant tianon was nice enough to introduce me to Jon Johnson, who did something super similar: adapted a technique to seek inside a compressed gzip file, which lets his service oci.dag.dev seek through Docker container images super fast, based on some prior work such as soci-snapshotter, gztool, and zran.c. He also pulled this party trick off for apk based distros over at apk.dag.dev, which seems apropos. Jon was nice enough to publish a lot of his work on this specifically in a central place under the name targz on his GitHub, which has been a ton of fun to read through. The gist is that, by dumping the decompressor's state (window of previous bytes, in-memory data derived from the last N-1 bytes) at specific checkpoints, along with the compressed data stream offset in bytes and the decompressed offset in bytes, one can seek to that checkpoint in the compressed stream and pick up where you left off, creating a similar block mechanism against the wishes of gzip. It means you'd need to do an O(n) run over the file, but every request after that will be sped up according to the number of checkpoints you've taken. Given the complexity of xz and lzma2, I don't think this is possible for me at the moment, especially given most of the files I'll be requesting will not be loaded from again, especially when I can just cache the debug header by Build-Id. I want to implement this (because I'm generally curious and Jon has a way of getting someone excited about compression schemes, which is not a sentence I thought I'd ever say out loud), but for now I'm going to move on without this optimization. Such a shame, since it kills a lot of the work that went into seeking around the .deb file in the first place, given the debian-binary and control.tar.gz members are so small.
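To make the checkpoint idea concrete, here is a small sketch of what such a seek point would have to record and how one might be chosen (hypothetical types for illustration, not code from targz or this project):
// One recoverable position in the compressed stream: the offsets on both
// sides of the decompressor, plus the window of recently-emitted bytes the
// decompressor needs in order to resume mid-stream.
struct Checkpoint {
    compressed_offset: u64,   // byte offset into the compressed stream
    decompressed_offset: u64, // byte offset into the decompressed output
    window: Vec<u8>,          // decompressor state: the last N-1 output bytes
}

// To serve a read at decompressed offset `pos`: restore the last checkpoint
// at or before `pos`, seek the compressed stream to its compressed_offset,
// then decompress forward, discarding bytes until `pos` is reached.
// Assumes `checkpoints` is sorted by decompressed_offset, ascending.
fn best_checkpoint(checkpoints: &[Checkpoint], pos: u64) -> Option<&Checkpoint> {
    checkpoints.iter().rev().find(|c| c.decompressed_offset <= pos)
}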

The Good First, the good news, right? It works! That's pretty cool. I'm positive my younger self would be amused and happy to see this working; as is current day paultag. Let's take debugfs out for a spin! First, we need to mount the filesystem. It even works on an entirely unmodified, stock Debian box on my LAN, which is huge. Let's take it for a spin:
$ mount \
-t 9p \
-o trans=tcp,version=9p2000.u,aname=unstable-debug \
192.168.0.2 \
/usr/lib/debug/.build-id/
And, let's prove to ourselves that this actually mounted before we go trying to use it:
$ mount | grep build-id
192.168.0.2 on /usr/lib/debug/.build-id type 9p (rw,relatime,aname=unstable-debug,access=user,trans=tcp,version=9p2000.u,port=564)
Slick. We've got an open connection to the server, where our host will keep a connection alive as root, attached to the filesystem provided in aname. Let's take a look at it.
$ ls /usr/lib/debug/.build-id/
00 0d 1a 27 34 41 4e 5b 68 75 82 8E 9b a8 b5 c2 CE db e7 f3
01 0e 1b 28 35 42 4f 5c 69 76 83 8f 9c a9 b6 c3 cf dc E7 f4
02 0f 1c 29 36 43 50 5d 6a 77 84 90 9d aa b7 c4 d0 dd e8 f5
03 10 1d 2a 37 44 51 5e 6b 78 85 91 9e ab b8 c5 d1 de e9 f6
04 11 1e 2b 38 45 52 5f 6c 79 86 92 9f ac b9 c6 d2 df ea f7
05 12 1f 2c 39 46 53 60 6d 7a 87 93 a0 ad ba c7 d3 e0 eb f8
06 13 20 2d 3a 47 54 61 6e 7b 88 94 a1 ae bb c8 d4 e1 ec f9
07 14 21 2e 3b 48 55 62 6f 7c 89 95 a2 af bc c9 d5 e2 ed fa
08 15 22 2f 3c 49 56 63 70 7d 8a 96 a3 b0 bd ca d6 e3 ee fb
09 16 23 30 3d 4a 57 64 71 7e 8b 97 a4 b1 be cb d7 e4 ef fc
0a 17 24 31 3e 4b 58 65 72 7f 8c 98 a5 b2 bf cc d8 E4 f0 fd
0b 18 25 32 3f 4c 59 66 73 80 8d 99 a6 b3 c0 cd d9 e5 f1 fe
0c 19 26 33 40 4d 5a 67 74 81 8e 9a a7 b4 c1 ce da e6 f2 ff
Outstanding. Let's try using gdb to debug a binary that was provided by the Debian archive, and see if it'll load the ELF by build-id from the right .deb in the unstable-debug suite:
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
Yes! Yes it will!
$ file /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
/usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, BuildID[sha1]=e59f81f6573dadd5d95a6e4474d9388ab2777e2a, for GNU/Linux 3.2.0, with debug_info, not stripped

The Bad Linux's support for 9p is mainline, which is great, but it's not robust. Network issues or server restarts will wedge the mountpoint (Linux can't reconnect when the tcp connection breaks), and things that work fine on local filesystems get translated in a way that causes a lot of network chatter: for instance, just due to the way the syscalls are translated, doing an ls will result in a stat call for each file in the directory, even though Linux had just received a stat entry for every file while it was resolving directory names. On top of that, Linux will serialize all I/O with the server, so there are no concurrent requests for file information, writes, or reads pending at the same time to the server; and read and write throughput will degrade as latency increases due to increasing round-trip time, even though there are offsets included in the read and write calls. It works well enough, but is frustrating to run up against, since there's not a lot you can do server-side to help with this beyond implementing the 9P2000.L variant (which, maybe, is worth it).

The Ugly Unfortunately, we don't know the file size(s) until we've actually opened the underlying tar file and found the correct member, so for most files, we don't know the real size to report when getting a stat. We can't parse the tarfiles for every stat call, since that'd make ls even slower (bummer). The only hiccup is that when I report a filesize of zero, gdb throws a bit of a fit; let's try with a size of 0 to start:
$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 0 Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
warning: Discarding section .note.gnu.build-id which has a section size (24) larger than the file size [in module /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug]
[...]
This obviously won't work, since gdb will throw away all our hard work because of stat's output, and neither will loading the real size of the underlying file. That only leaves us with hardcoding a file size and hoping nothing else breaks significantly as a result. Let's try it again:
$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 954M Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
Much better. I mean, terrible but better. Better for now, anyway.

Kilroy was here Do I think this is a particularly good idea? I mean; kinda. I'm probably going to make some fun 9p arigato-based filesystems for use around my LAN, but I don't think I'll be moving to use debugfs until I can figure out how to ensure the connection is more resilient to changing networks and server restarts, and until the i/o performance issues are fixed. I think it was a useful exercise and is a pretty great hack, but I don't think this'll be shipping anywhere anytime soon. Along with me publishing this post, I've pushed up all my repos; so you should be able to play along at home! There's a lot more work to be done on arigato; but it does handshake and successfully export a working 9P2000.u filesystem. Check it out on my github at arigato, debugfs and also on crates.io and docs.rs. At least I can say I was here and I got it working after all these years.

11 April 2024

Reproducible Builds: Reproducible Builds in March 2024

Welcome to the March 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Arch Linux minimal container userland now 100% reproducible
  2. Validating Debian s build infrastructure after the XZ backdoor
  3. Making Fedora Linux (more) reproducible
  4. Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management
  5. Software and source code identification with GNU Guix and reproducible builds
  6. Two new Rust-based tools for post-processing determinism
  7. Distribution work
  8. Mailing list highlights
  9. Website updates
  10. Delta chat clients now reproducible
  11. diffoscope updates
  12. Upstream patches
  13. Reproducibility testing framework

Arch Linux minimal container userland now 100% reproducible In remarkable news, Reproducible Builds developer kpcyrd reported that the Arch Linux minimal container userland is now 100% reproducible after work by developers dvzv and Foxboron on the one remaining package. This represents a "real world", widely-used Linux distribution being reproducible. Their post, which kpcyrd suffixed with the question "now what?", continues on to outline some potential next steps, including validating whether the container image itself could be reproduced bit-for-bit. The post, which was itself a followup to an Arch Linux update earlier in the month, generated a significant number of replies.

Validating Debian's build infrastructure after the XZ backdoor From our mailing list this month, Vagrant Cascadian wrote about being asked about trying to perform concrete reproducibility checks for recent Debian security updates, in an attempt to gain some confidence about Debian's build infrastructure given that they performed builds in environments running the high-profile XZ vulnerability. Vagrant reports (with some caveats):
So far, I have not found any reproducibility issues; everything I tested I was able to get to build bit-for-bit identical with what is in the Debian archive.
That is to say, reproducibility testing permitted Vagrant and Debian to claim with some confidence that builds performed when this vulnerable version of XZ was installed were not interfered with.

Making Fedora Linux (more) reproducible In March, Davide Cavalca gave a talk at the 2024 Southern California Linux Expo (aka SCALE 21x) about the ongoing effort to make the Fedora Linux distribution reproducible. Documented in more detail on Fedora's website, the talk touched on topics such as the specifics of implementing reproducible builds in Fedora, the challenges encountered, the current status and what's coming next. (YouTube video)

Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management Julien Malka published a brief but interesting paper in the HAL open archive on Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management:
Functional package managers (FPMs) and reproducible builds (R-B) are technologies and methodologies that are conceptually very different from the traditional software deployment model, and that have promising properties for software supply chain security. This thesis aims to evaluate the impact of FPMs and R-B on the security of the software supply chain and propose improvements to the FPM model to further improve trust in the open source supply chain. PDF
Julien's paper poses a number of research questions on how the model of distributions such as GNU Guix and NixOS can be leveraged to "further improve the safety of the software supply chain", etc.

Software and source code identification with GNU Guix and reproducible builds In a long line of commendably detailed blog posts, Ludovic Courtès, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier have together published two interesting posts on the GNU Guix blog this month. In early March, they wrote about software and source code identification and how that might be performed using Guix, rhetorically posing the questions: "What does it take to identify software? How can we tell what software is running on a machine to determine, for example, what security vulnerabilities might affect it?" Later in the month, Ludovic Courtès wrote a solo post describing adventures on the quest for long-term reproducible deployment. Ludovic's post touches on GNU Guix's aim to support "time travel", the ability to reliably (and reproducibly) revert to an earlier point in time, employing the iconic image of Harold Lloyd hanging off the clock in Safety Last! (1925) to poetically illustrate both the slapstick nature of current modern technology and the gymnastics required to navigate hazards of our own making.

Two new Rust-based tools for post-processing determinism Zbigniew Jędrzejewski-Szmek announced add-determinism, a work-in-progress reimplementation of the Reproducible Builds project's own strip-nondeterminism tool in the Rust programming language, intended to be used as a post-processor in RPM-based distributions such as Fedora. In addition, Yossi Kreinin published a blog post titled "refix: fast, debuggable, reproducible builds" that describes a tool that post-processes binaries in such a way that they are still debuggable with gdb, etc. Yossi's post details the motivation and techniques behind the (fast) performance of the tool.

Distribution work In Debian this month, since the testing framework no longer varies the build path, James Addison performed a bulk downgrade of the bug severity for issues filed with a level of normal to a new level of wishlist. In addition, 28 reviews of Debian packages were added, 38 were updated and 23 were removed this month, adding to our ever-growing knowledge about identified issues. As part of this effort, a number of issue types were updated, including Chris Lamb adding a new ocaml_include_directories toolchain issue [ ] and James Addison adding a new filesystem_order_in_java_jar_manifest_mf_include_resource issue [ ] and updating the random_uuid_in_notebooks_generated_by_nbsphinx issue to reference a relevant discussion thread [ ]. In addition, Roland Clobus posted his 24th status update of reproducible Debian ISO images. Roland highlights that the images for Debian unstable often cannot be generated due to changes in that distribution related to the 64-bit time_t transition. Lastly, Bernhard M. Wiedemann posted another monthly update for his reproducibility work in openSUSE.

Mailing list highlights Elsewhere on our mailing list this month:

Website updates A number of improvements were made to our website this month, including:
  • Pol Dellaiera noticed the frequent need to correctly cite the website itself in academic work. To facilitate easier citation across multiple formats, Pol contributed a Citation File Format (CFF) file. As a result, an export in BibTeX format is now available in the Academic Publications section. Pol encourages community contributions to further refine the CITATION.cff file. Pol also added a substantial new section to the buy-in page documenting the role of Software Bills of Materials (SBOMs) and ephemeral development environments. [ ][ ]
  • Bernhard M. Wiedemann added a new commandments page to the documentation [ ][ ] and fixed some incorrect YAML elsewhere on the site [ ].
  • Chris Lamb added three recent academic papers to the publications page of the website. [ ]
  • Mattia Rizzolo and Holger Levsen collaborated to add Infomaniak as a sponsor of amd64 virtual machines. [ ][ ][ ]
  • Roland Clobus updated the stable outputs page, dropping version numbers from Python documentation pages [ ] and noting that Python s set data structure is also affected by the PYTHONHASHSEED functionality. [ ]

Delta chat clients now reproducible Delta Chat, an open source messaging application that can work over email, announced this month that the Rust-based core library underlying the Delta Chat application is now reproducible.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 259, 260 and 261 to Debian and made the following additional changes:
  • New features:
    • Add support for the zipdetails tool from the Perl distribution. Thanks to Fay Stegerman and Larry Doolittle et al. for the pointer and thread about this tool. [ ]
  • Bug fixes:
    • Don't identify Redis database dumps as GNU R database files based simply on their filename. [ ]
    • Add a missing call to File.recognizes so we actually perform the filename check for GNU R data files. [ ]
    • Don't crash if we encounter an .rdb file without an equivalent .rdx file. (#1066991)
    • Correctly check for 7z being available and not lz4 when testing 7z. [ ]
    • Prevent a traceback when comparing a contentful .pyc file with an empty one. [ ]
  • Testsuite improvements:
    • Fix .epub tests after supporting the new zipdetails tool. [ ]
    • Don't use parentheses within test skipping messages, as PyTest adds its own parentheses. [ ]
    • Factor out Python version checking in test_zip.py. [ ]
    • Skip some Zip-related tests under Python 3.10.14, as a potential regression may have been backported to the 3.10.x series. [ ]
    • Actually test 7z support in the test_7z set of tests, not the lz4 functionality. (Closes: reproducible-builds/diffoscope#359). [ ]
In addition, Fay Stegerman updated diffoscope's monkey patch for supporting the unusual Mozilla ZIP file format after Python's zipfile module changed to detect potentially insecure overlapping entries within .zip files. (#362) Chris Lamb also updated the trydiffoscope command line client, dropping a build-dependency on the deprecated python3-distutils package to fix Debian bug #1065988 [ ], taking a moment to also refresh the packaging to the latest Debian standards [ ]. Finally, Vagrant Cascadian submitted an update for diffoscope version 260 in GNU Guix. [ ]

Upstream patches This month, we wrote a large number of patches, including: Bernhard M. Wiedemann used reproducibility-tooling to detect and fix packages that added changes in their %check section, thus failing when built with the --no-checks option. Only half of all openSUSE packages were tested so far, but a large number of bugs were filed, including ones against caddy, exiv2, gnome-disk-utility, grisbi, gsl, itinerary, kosmindoormap, libQuotient, med-tools, plasma6-disks, pspp, python-pypuppetdb, python-urlextract, rsync, vagrant-libvirt and xsimd. Similarly, Jean-Pierre De Jesus DIAZ employed reproducible builds techniques in order to test a proposed refactor of the ath9k-htc-firmware package. As the change produced bit-for-bit identical binaries to the previously shipped pre-built binaries:
I don't have the hardware to test this firmware, but the build produces the same hashes for the firmware so it's safe to say that the firmware should keep working.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Sleep less after a so-called 404 package state has occurred. [ ]
    • Schedule package builds more often. [ ][ ]
    • Regenerate all our HTML indexes every hour, but only every 12h for the released suites. [ ]
    • Create and update unstable and experimental base systems on armhf again. [ ][ ]
    • Don't reschedule so many depwait packages due to the current size of the i386 architecture queue. [ ]
    • Redefine our scheduling thresholds and amounts. [ ]
    • Schedule untested packages with a higher priority, otherwise slow architectures cannot keep up with the experimental distribution growing. [ ]
    • Only create the stats_buildinfo.png graph once per day. [ ][ ]
    • Reproducible Debian dashboard: refactoring, update several more static stats only every 12h. [ ]
    • Document how to use systemctl with new systemd-based services. [ ]
    • Temporarily disable armhf and i386 continuous integration tests in order to get some stability back. [ ]
    • Use the deb.debian.org CDN everywhere. [ ]
    • Remove the rsyslog logging facility on bookworm systems. [ ]
    • Add zst to the list of packages which are false-positive diskspace issues. [ ]
    • Detect failures to bootstrap Debian base systems. [ ]
  • Arch Linux-related changes:
    • Temporarily disable builds because the pacman package manager is broken. [ ][ ]
    • Split reproducible_html_live_status and split the scheduling timing. [ ][ ][ ]
    • Improve handling when database is locked. [ ][ ]
  • Misc changes:
    • Show failed services that require manual cleanup. [ ][ ]
    • Integrate two new Infomaniak nodes. [ ][ ][ ][ ]
    • Improve IRC notifications for artifacts. [ ]
    • Run diffoscope in different systemd slices. [ ]
    • Run the node health check more often, as it can now repair some issues. [ ][ ]
    • Also include the string Bot in the userAgent for Git. (Re: #929013). [ ]
    • Document increased tmpfs size on our OUSL nodes. [ ]
    • Disable memory account for the reproducible_build service. [ ][ ]
    • Allow 10 times as many open files for the Jenkins service. [ ]
    • Set OOMPolicy=continue and OOMScoreAdjust=-1000 for both the Jenkins and the reproducible_build service. [ ]
Mattia Rizzolo also made the following changes:
  • Debian-related changes:
    • Define a systemd slice to group all relevant services. [ ][ ]
    • Add a bunch of quotes in scripts to assuage the shellcheck tool. [ ]
    • Add stats on how many packages have been built today so far. [ ]
    • Instruct systemd-run to handle diffoscope's exit codes specially. [ ]
    • Prefer the pgrep tool over grepping the output of ps. [ ]
    • Re-enable a couple of i386 and armhf architecture builders. [ ][ ]
    • Fix some stylistic issues flagged by the Python flake8 tool. [ ]
    • Cease scheduling Debian unstable and experimental on the armhf architecture due to the time_t transition. [ ]
    • Start a few more i386 & armhf workers. [ ][ ][ ]
    • Temporarily skip pbuilder updates in the unstable distribution, but only on the armhf architecture. [ ]
  • Other changes:
    • Perform some large-scale refactoring on how the systemd service operates. [ ][ ]
    • Move the list of workers into a separate file so it's accessible to a number of scripts. [ ]
    • Refactor the powercycle_x86_nodes.py script to use the new IONOS API and its new Python bindings. [ ]
    • Also fix nph-logwatch after the worker changes. [ ]
    • Do not install the stunnel tool anymore, it shouldn't be needed by anything anymore. [ ]
    • Move temporary directories related to Arch Linux into a single directory for clarity. [ ]
    • Update the arm64 architecture host keys. [ ]
    • Use a common Postfix configuration. [ ]
The following changes were also made by:
  • Jan-Benedict Glaw:
    • Initial work to clean up a messy NetBSD-related script. [ ][ ]
  • Roland Clobus:
    • Show the installer log if the installer fails to build. [ ]
    • Avoid the minus character (i.e. -) in a variable in order to allow for tags in openQA. [ ]
    • Update the schedule of Debian live image builds. [ ]
  • Vagrant Cascadian:
    • Maintenance on the virt* nodes is completed so bring them back online. [ ]
    • Use the fully qualified domain name in configuration. [ ]
Node maintenance was also performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 April 2024

Bits from Debian: apt install dpl-candidate: Andreas Tille

The Debian Project Developers will shortly vote for a new Debian Project Leader known as the DPL. The Project Leader is the official representative of The Debian Project tasked with managing the overall project, its vision, direction, and finances. The DPL is also responsible for the selection of Delegates, defining areas of responsibility within the project, the coordination of Developers, and making decisions required for the project. Our outgoing and present DPL Jonathan Carter served 4 terms, from 2020 through 2024. Jonathan shared his last Bits from the DPL post to Debian recently and his hopes for the future of Debian. Recently, we sat with the two present candidates for the DPL position asking questions to find out who they really are in a series of interviews about their platforms, visions for Debian, lives, and even their favorite text editors. The interviews were conducted by disaster2life (Yashraj Moghe) and made available from video and audio transcriptions: Voting for the position starts on April 6, 2024. Editors' note: This is our official return to Debian interviews, readers should stay tuned for more upcoming interviews with Developers and other important figures in Debian as part of our "Meet your Debian Developer" series. We used the following tools and services: Turboscribe.ai for the transcription from the audio and video files, IRC: Oftc.net for communication, Jitsi meet for interviews, and Open Broadcaster Software (OBS) for editing and video. While we encountered many technical difficulties in the return to this process, we are still able and proud to present the transcripts of the interviews edited only in a few areas for readability. 2024 Debian Project Leader Candidate: Andreas Tille Andreas' Interview Who are you? Tell us a little about yourself. [Andreas]:
How am I? Well, I'm, as I wrote in my platform, I'm a proud grandfather doing a lot of free software stuff, doing a lot of sports, have some goals in mind which I like to do and hopefully for the best of Debian.
And How are you today? [Andreas]:
How I'm doing today? Well, actually I have some headaches but it's fine for the interview. So, usually I feel very good. Spring was coming here and today it's raining and I plan to do a bicycle tour tomorrow and hope that I do not get really sick but yeah, for the interview it's fine.
What do you do in Debian? Could you mention your story here? [Andreas]:
Yeah, well, I started with Debian kind of by accident because I wanted to have some package salvaged which is called WordNet. It's a monolingual dictionary and I did not really plan to do more than maybe 10 packages or so. I had some kind of training with xTeddy which is totally unimportant, a cute teddy you can put on your desktop. So, and then well, more or less I thought how can I make Debian attractive for my employer which is a medical institute and so on. It could make sense to package bioinformatics and medicine software and it somehow evolved in a direction I did neither expect it nor wanted to do, that I'm currently the most busy uploader in Debian, created several teams around it. Debian Med is very well known from me. I created the Blends team to create teams and techniques around what we are doing which was Debian GIS, Debian Edu, Debian Science and so on and I also created the packaging team for R, for the statistics package R which is technically based and not topic based. All these blends are covering a certain topic and R is just needed by lots of these blends. So, yeah, and to cope with all this I have written a script which is routing an update to manage all these uploads more or less automatically. So, I think I had one day where I uploaded 21 new packages but it's just automatically generated, right? So, it's on one day more than I ever planned to do.
What is the first thing you think of when you think of Debian? Editors' note: The question was misunderstood as the worst thing you think of when you think of Debian [Andreas]:
The worst thing I think about Debian, it's complicated. I think today on Debian board I was asked about the technical progress I want to make and in my opinion we need to standardize things inside Debian. For instance, bringing all the packages to salsa, follow some common standards, some common workflow which is extremely helpful. As I said, if I'm that productive with my own packages we can adopt this in general, at least in most cases I think. I made a lot of good experience by the support of well-formed teams. Well-formed teams are those teams where people support each other, help each other. For instance, how to say, I'm a physicist by profession so I'm not an IT expert. I can tell apart what works and what not but I'm not an expert in those packages. I do and the amount of packages is so high that I do not even understand all the techniques they are covering like Go, Rust and something like this. And I also don't speak Java and I had a problem once in the middle of the night and I've sent the email to the list and was a Java problem and I woke up in the morning and it was solved. This is what I call a team. I don't call a team some common repository that is used by random people for different packages also but it's working together, don't hesitate to solve other people's problems and permit people to get active. This is what I call a team and this is also something I observed in, it's hard to give a percentage, in a lot of other teams but we have other people who do not even understand the concept of the team. Why is working together make some advantage and this is also a tough thing. I [would] like to tackle in my term if I get elected to form solid teams using the common workflow. This is one thing. The other thing is that we have a lot of good people in our infrastructure like FTP masters, DSA and so on. I have the feeling they have a lot of work and are working more or less on their limits, and I like to talk to them [to ask] what kind of change we could do to move that limits or move their personal health to the better side.
The DPL term lasts for a year, What would you do during that you couldn't do now? [Andreas]:
Yeah, well this is basically what I said are my main issues. I need to admit I have no really clear imagination what kind of tasks will come to me as a DPL because all these financial issues and law issues possible and issues [that] people who are not really friendly to Debian might create. I'm afraid these things might occupy a lot of time and I can't say much about this because I simply don't know.
What are three key terms about you and your candidacy? [Andreas]:
As I said, I like to work on standards, I'd like to make Debian try [to get it right so] that people don't get overworked, and the third key point is to be inviting to newcomers, to everybody who wants to come. Yeah, I also mentioned in my term this diversity issue, geographical and from gender point of view. This may be the three points I consider most important.
Preferred text editor? [Andreas]:
Yeah, my preferred one? Ah, well, I have no preferred text editor. I'm using the Midnight Commander very frequently which has an internal editor which is convenient for small text. For other things, I usually use VI but I also use Emacs from time to time. So, no, I have no preferred text editor. Whatever works nicely for me.
What is the importance of the community in the Debian Project? How would you like to see it evolving over the next few years? [Andreas]:
Yeah, I think the community is extremely important. So, I was on a lot of DebConfs. I think it's not really 20 but 17 or 18 DebCons and I really enjoyed these events every year because I met so many friends and met so many interesting people that it's really enriching my life and those who I never met in person but have read interesting things and yeah, Debian community makes really a part of my life.
And how do you think it should evolve specifically? [Andreas]:
Yeah, for instance, last year in Kochi, it became even clearer to me that the geographical diversity is a really strong point. Just discussing with some women from India who is afraid about not coming next year to Busan because there's a problem with Shanghai and so on. I'm not really sure how we can solve this but I think this is a problem at least I wish to tackle and yeah, this is an interesting point, the geographical diversity and I'm running the so-called mentoring of the month. This is a small project to attract newcomers for the Debian Med team which has the focus on medical packages and I learned that we had always men applying for this and so I said, okay, I dropped the constraint of medical packages. Any topic is fine, I teach you packaging but it must be someone who does not consider himself a man. I got only two applicants, no, actually, I got one applicant and one response which was kind of strange if I'm hunting for women or so. I did not understand but I got one response and interestingly, it was for me one of the least expected counters. It was from Iran and I met a very nice woman, very open, very skilled and gifted and did a good job or have even lose contact today and maybe we need more actively approach groups that are underrepresented. I don't know if what's a good means which I did but at least I tried and so I try to think about these kind of things.
What part of Debian has made you smile? What part of the project has kept you going all through the years? [Andreas]:
Well, the card game which is called Mao on the DebConf made me smile all the time. I admit I joined only two or three times even if I really love this kind of games but I was occupied by other stuff so this made me really smile. I also think the first online DebConf in 2020 made me smile because we had this kind of short video sequences and I tried to make a funny video sequence about every DebConf I attended before. This is really funny moments but yeah, it's not only smile but yeah. One thing maybe it's totally unconnected to Debian but I learned personally something in Debian that we have a do-ocracy and you can do things which you think that are right if not going in between someone else, right? So respect everybody else but otherwise you can do so. And in 2020 I also started to take trees which are growing widely in my garden and plant them into the woods because in our woods a lot of trees are dying and so I just do something because I can. I have the resource to do something, take the small tree and bring it into the woods because it does not harm anybody. I asked the forester if it is okay, yes, yes, okay. So everybody can do so but I think the idea to do something like this came also because of the free software idea. You have the resources, you have the computer, you can do something and you do something productive, right? And when thinking about this I think it was also my Debian work. Meanwhile I have planted more than 3,000 trees so it's not a small number but yeah, I enjoy this.
What part of Debian would you have some criticisms for? [Andreas]:
Yeah, it's basically the same as I said before. We need more standards to work together. I do not want to repeat this but this is what I think, yeah.
What field in Free Software generally do you think requires the most work to be put into it? What do you think is Debian's part in the field? [Andreas]:
It's also in general, the thing is the fact that I'm maintaining packages which are usually as modern software is maintained in Git, which is fine but we have some software which is at SourceForge, we have software laying around somewhere, we have software where Debian somehow became Upstream because nobody is caring anymore and free software is very different in several things, ways and well, I in principle like freedom of choice which is the basic of all our work. Sometimes this freedom goes in the way of productivity because everybody is free to re-implement. You asked me for the most favorite editor. In principle one really good working editor would be great to have and would work and we have maybe 500 in Debian or so, I don't know. I could imagine if people would concentrate and say five instead of 500 editors, we could get more productive, right? But I know this will not happen, right? But I think this is one thing which goes in the way of making things smooth and productive and we could have more manpower to replace one person who's [having] children, doing some other stuff and can't continue working on something and maybe this is a problem I will not solve, definitely not, but which I see.
What do you think is Debian's part in the field? [Andreas]:
Yeah, well, okay, we can bring together different Upstreams, so we are building some packages and have some general overview about similar things and can say, oh, you are doing this and some other person is doing more or less the same, do you want to join each other or so, but this is kind of a channel we have to our Upstreams which is probably not very successful. It starts with code copies of some libraries which are changed a little bit, which is fine license-wise, but not so helpful for different things and so I've tried to convince those Upstreams to forward their patches to the original one, but for this and I think we could do some kind of, yeah, [find] someone who brings Upstream together or to make them stop their forking stuff, but it costs a lot of energy and we probably don't have this and it's also not realistic that we can really help with this problem.
Do you have any questions for me? [Andreas]:
I enjoyed the interview, I enjoyed seeing you again after half a year or so. Actually I've seen you in the eating room or at the cheese and wine party or so, but I don't remember that we really got to talk together. But yeah, people around, yeah, for sure. Yeah.

Bits from Debian: apt install dpl-candidate: Sruthi Chandran

The Debian Project Developers will shortly vote for a new Debian Project Leader, known as the DPL. The DPL is the official representative of The Debian Project, tasked with managing the overall project, its vision, direction, and finances. The DPL is also responsible for the selection of Delegates, defining areas of responsibility within the project, the coordination of Developers, and making decisions required for the project. Our outgoing and present DPL Jonathan Carter served 4 terms, from 2020 through 2024. Jonathan recently shared his last Bits from the DPL post to Debian and his hopes for the future of Debian. Recently, we sat with the two present candidates for the DPL position, asking questions to find out who they really are, in a series of interviews about their platforms, visions for Debian, lives, and even their favorite text editors. The interviews were conducted by disaster2life (Yashraj Moghe) and made available as video and audio transcriptions. Voting for the position starts on April 6, 2024. Editors' note: This is our official return to Debian interviews; readers should stay tuned for more upcoming interviews with Developers and other important figures in Debian as part of our "Meet your Debian Developer" series. We used the following tools and services: Turboscribe.ai for the transcription from the audio and video files, IRC (OFTC.net) for communication, Jitsi Meet for interviews, and Open Broadcaster Software (OBS) for editing and video. While we encountered many technical difficulties in the return to this process, we are still able and proud to present the transcripts of the interviews, edited only in a few areas for readability.
2024 Debian Project Leader Candidate: Sruthi Chandran
Sruthi's interview
Hi Sruthi, so for the first question, who are you and could you tell us a little bit about yourself? [Sruthi]:
Whenever I am answering the question "who am I", I usually say that I am a librarian turned free software enthusiast and Debian Developer. So I had no technical background; I was introduced to free software through my husband, and then I learned Debian packaging, and eventually I became a Debian Developer. So I always give my example to people who say "I am not technically inclined, I don't have a technical background, so I can't contribute to free software". So yeah, that's how I refer to myself.
For the next question, could you tell me what do you do in Debian, and could you mention your story up until here today? [Sruthi]:
Okay, so let me start from my initial days in Debian. My first contribution was a Tibetan font: we went to a Tibetan place and they were saying they didn't have a font in Linux. So that's how I started contributing. Then I moved on to Ruby packages, then I have some JavaScript and Go packages, all dependencies of GitLab. So I was involved with maintaining GitLab for some time; now I'm not very active there. But yeah, GitLab was the main package I was contributing to, from 2016 to maybe like 2020 or something. Later I have come [over to] packaging. Now I am part of some of the teams, delegated teams, like the community team and outreach team, as well as the DebConf committee. And the biggest of my activities in Debian, I would say, is organizing DebConf 2023. It was a great experience, and yeah, that's my story in Debian.
So what are three key terms about you and your candidacy? [Sruthi]:
Okay, let me first think about it. For candidacy, I can start with diversity is one point I started expressing from the first time I contested for DPL. But to be honest, that's the main point I want to bring.
[Yashraj]:
So, for diversity, if you could break down your thoughts on diversity and make them [part of] your three points, including diversity.
[Sruthi]:
So, in addition to that: eventually, when starting, it was just diversity. Now I have, like, a bit more ideas, like community: I want to be a leader for the Debian community. I don't know, maybe people may not agree, but I would say I want to be a leader of the Debian community rather than of the Debian operating system. I connect to community more. And a third point I would say...
The term of a DPL lasts for a year. So what do you think you would try to do during that term that you can't do from your position now? [Sruthi]:
Okay. So I, like, I am very happy with the structure of Debian and how things work in Debian. Like you can do almost a lot of things, like almost all things without being a DPL. Whatever change you want to bring about or whatever you want to do, you can do without being a DPL. Anyone, like every DD has the same rights. Only things I feel [the] DPL has hold on are mainly the budget or the funding part, which like, that's where they do the decision making part. And then comes like, and one advantage of DPL driving some idea is that somehow people tend to listen to that with more, like, tend to give more attention to what DPL is saying rather than a normal DD. So I wanted to, like, I have answered some of the questions on how to, how I plan to do the financial budgeting part, how I want to handle, like, and the other thing is using the extra attention that I get as a DPL, I would like to obviously start with the diversity aspect in Debian. And yeah, like, I, what I want to do is not, like, be a leader and say, like, take Debian to one direction where I want to go, but I would rather take suggestions and inputs from the whole community and go about with that. So yes, that's what I would say.
And taking a less serious question now, what is your preferred text editor? [Sruthi]:
Vim.
[Yashraj]:
Vim, wholeheartedly team Vim?
[Sruthi]:
Yes.
[Yashraj]:
Great. Well, this was made in Vim, all the text for this.
[Sruthi]:
So, like, since you mentioned extra data, I'll give my example, like, it's just a fun note, when I started contributing to Debian, as I mentioned, I didn't have any knowledge about free software, like Debian, and I was not used to even using Linux. So, and I didn't have experience with these text editors. So, when I started contributing, I used to do the editing part using gedit. So, that's how I started. Eventually, I moved to Nano, and once I reached Vim, I didn't move on.
Team Vim. Next question: what do you think is the importance of the Debian project in the world today? And where would you like to see it in 10 years, like 10 years into the future? [Sruthi]:
Okay. So, Debian, as we all know, is referred to as the universal operating system, and, like, it is said for a reason: we have hundreds and hundreds of Linux distributions based on Debian. So, I believe Debian, even now, has a good influence on at least the Linux ecosystem. What we implement in Debian is going to affect, like, a very good percentage of people using Linux. So, yes, I think Debian is one of the leading Linux distributions. And I think in 10 years, we should be able to reach a position, like, even now, even these many years after having Linux, we face a lot of problems with newer and newer hardware coming up, and installing on them is a big problem. Like, firmwares and all those things are getting more and more complicated. It should be getting simpler, but it's getting more and more complicated. So, one thing I would imagine, like, I don't know if we will ever reach there, but I would imagine that eventually, with Debian, we should be able to have at least a few of the hardware developers or hardware producers have Debian pre-installed and those kinds of things. I'm not saying it's not available at all right now; what I'm saying is that it should become prominent enough to be opted for as, like, the default distro.
What part of Debian has made you smile, and what part of the project has kept you going all through these years? [Sruthi]:
Okay. So, I started to contribute in 2016, and I was part of the team doing GitLab packaging, and we did have a lot of training workshops and those kinds of things within India. And I had interacted with some of the Indian DDs, but I never had a lot of interaction, even through chat or mail, with DDs from the rest of the world. And the 2019 DebConf changed my whole perspective about Debian. Before that, I was interested in free software, I was doing the technical stuff and all, but after DebConf my whole idea, my focus, changed to the community. The Debian community is a very welcoming, very interesting community to be with. And so, I believe that the 2019 DebConf was a turning point for me. From 2019, my focus has been on how to support the community; I moved to the community part of Debian from there. Then in 2020 I became part of the community team, and I started being part of other teams. So, I would say the Debian community is the one aspect of Debian that keeps me held on to the Debian ecosystem as a whole.
Continuing to speak about Debian, what is the first thing that comes to your mind when you think of Debian: the word, the community, what's the first thing? [Sruthi]:
I think I may sound like a broken record or something.
[Yashraj]:
No, no.
[Sruthi]:
Again, I would say the Debian community: it's the people who make Debian that make Debian special. Apart from that, one part of Debian that makes me very happy is how the governing system of Debian works, the Debian constitution and all those things; it's a very unique thing for Debian. And when people say it's difficult to work without a proper establishment, or without somebody deciding everything for you, Debian has been proving for quite a long time now that it's possible. So that's one thing I believe is a unique point, and I am very proud of that.
What areas do you think Debian is failing in, and how can it (that standing) be improved? [Sruthi]:
So, I think where Debian is failing now is getting new people into Debian. I don't remember the exact number, but I remember hearing someone mention that the average age of a Debian Developer is above 40 or 45 or something; the exact age, I don't remember. But Debian is getting old: the people in Debian are getting old, and we are not getting enough new people into Debian. And it's very important to have new people coming up, otherwise, eventually, after a few years, we won't have enough people to take the project forward. So, yeah, I believe that is where we need to work. We are making some efforts, like being part of GSoC or Outreachy, and having maybe other events, like local events. We used to have a lot of Debian packaging workshops in India, and I think in Brazil and all, the local communities are doing those kinds of things. But we are not very successful in retaining the people who maybe come and try out things; we are not very good at retaining the people who come. So, we need to work on those things. Right now, I don't have a solid answer for that. But one thing I was thinking about is having a Debian-specific outreach project, wherein the focus will be on Debian. Usually what happens in GSoC and Outreachy is that people come, do the contributions, and they go back; they don't have that connection with the Debian community or the Debian project. So, what I envision with this Debian-specific outreach is that we have some part of the internship, even before starting the internship, where we have some sessions with the people in Debian, getting them introduced to the Debian philosophy, the Debian community, and how Debian works; we focus on that, and then we move on to the technical internship parts. So, I believe this could do some good: when you have people you can connect to, you tend to stay back in a project more. Right now, we have so many technical things to do; the choice for a college student is endless. So, if they are to stay back for something, maybe for Debian, I would say we need to have them connected to the Debian project before we go into the technical parts, because there are other places as well where they can go and do the technical part. So, that's what I was saying: focused outreach projects is one thing. That's just one; that's not enough. We need more ideas to have more new people come up. And I'm very happy with the DebConf thing: we tend to get more and more people from the places where we have a DebConf. Brazil is an example: after the DebConf, they have quite a good improvement in Debian contributors. And I think in India also it gave a good result; we have more people contributing and staying back and those things. So, yeah, these were the things I would say we can do to improve.
For the final question: what field in free software generally do you think requires the most work to be put into it? And what do you think is Debian's part in that field? [Sruthi]:
Okay. Right now, what comes to my mind is the free software licenses part. We have a lot of free software licenses, and there are non-free software licenses. But currently, I feel free software is having a big problem in enforcing these licenses. There may be big corporations or some people who take up the code and may not follow, for example, the GPL licenses. We don't know how much free software is used in the bigger things. Yeah, I agree, there are a lot of corporations who are afraid to touch free software. But there would be a good amount of free software, free work, that gets converted into proprietary things, violating the free software licenses. And we do have SFLC, SFC, etc., but still we do not have the ability to go and trace and enforce the licenses. So, enforcing those licenses, and bringing forward people who are violating the licenses, and those kinds of things, is challenging, because one thing is it takes time, and most importantly, money is required for the legal stuff. And people who make small software, or maybe big, may not have the kind of time and money to have these things enforced. So, that's a big challenge free software is facing, especially in our current scenario. I feel we need to find ways to get it sorted; I don't have an answer right now what to do. But this is a challenge I felt. And Debian's part in that: yeah, as I said, I don't have a solution for that, but the DFSG and Debian sticking to free software licenses is good support, I think.
So, that was the final question. Do you have anything else you want to mention for anyone watching this? [Sruthi]:
Not really. I am happy, like, I think I was able to answer the questions. And yeah, to whoever is watching: I won't say I'm the best DPL candidate and you can't have a better one or something. I stand for a reason, and if you believe in that, or in the Debian community and Debian diversity and those kinds of things, I hope you would want to vote for me. That's it. And I'll make it very clear: I'm not doing a technical leadership part here, so I can't convince people who want technical leadership to vote for me. But people who connect with me, I hope they vote for me.

25 March 2024

Jonathan Dowland: a bug a day

I recently became a maintainer of/committer to IkiWiki, the software that powers my site. I also took over maintenance of the Debian package. Last week I cut a new upstream point release, 3.20200202.4, and a corresponding Debian package upload, consisting only of a handful of low-hanging-fruit patches from other people, largely to exercise both processes. I've been discussing IkiWiki's maintenance situation with some other users for a couple of years now. I've also weighed up the pros and cons of moving to a different static-site generator (a term that describes what IkiWiki is, but was actually coined more recently). It turns out IkiWiki is exceptionally flexible and powerful: I estimate the cost of moving to something modern(er) and fashionable such as Jekyll, Hugo or Hakyll as unreasonably high, in part because they are surprisingly rigid and inflexible in some key places. Like most mature software, IkiWiki has a bug backlog. Over the past couple of weeks, as a sort-of "palate cleanser" around work pieces, I've tried to triage one IkiWiki bug per day: either upstream or in the Debian Bug Tracker. This is a really lightweight task: it can be as simple as "find a bug reported in Debian, copy it upstream, tag it upstream, mark it forwarded", perhaps taking 5-10 minutes (a sketch of that step follows below). Often I'll stumble across something that has already been fixed but not recorded as such as I go. Despite this minimal level of work, I'm quite satisfied with the cumulative progress. It's notable to me how much my perspective has shifted by becoming a maintainer: I'm considering everything through a different lens to that of being just one user. Eventually I will put some time aside to scratch some of my own itches (html5 by default; support dark mode; duckduckgo plugin; use the details tag...) but for now this minimal exercise is of broader use.
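For readers unfamiliar with the mechanics of that triage step, here is a minimal sketch of it using the bts tool from the devscripts package; the bug number and upstream URL are illustrative placeholders, not real reports:
# Record that a (hypothetical) Debian bug has been forwarded to a
# (hypothetical) upstream tracker entry, then tag it as upstream.
bts forwarded 123456 https://ikiwiki.info/bugs/some_bug/
bts tags 123456 + upstream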

21 March 2024

Ravi Dwivedi: Thailand Trip

This post is the second and final part of my Malaysia-Thailand trip. Feel free to check out the Malaysia part here if you haven't already. Kuala Lumpur to Bangkok is around 1500 km by road, so I took a Malaysian Airlines flight to travel to Bangkok. The flight staff at Kuala Lumpur only asked me for a return/onward flight, and Thailand immigration asked a few questions but did not check any documents (obviously they checked and stamped my passport ;)). The currency of Thailand is the Thai baht, and 1 Thai baht = 2.5 Indian Rupees. Thailand time is 1.5 hours ahead of Indian time (for example, if it is 12 noon in India, it will be 13:30 in Thailand). I landed in Bangkok at around 3 PM local time. Fletcher was in Bangkok at that time, leaving for Pattaya, and we had booked the same hostel, so I took a bus to Pattaya from the airport. The next bus for which tickets were available was at 7 PM, so I took tickets for that one. The bus ticket cost 143 Thai baht. I didn't buy a SIM at the airport, thinking there must be better deals in the city. As a consequence, there was no way to contact Fletcher through the internet, although I had a few minutes of calls remaining out of my international roaming pack.
A welcome sign at Bangkok's Suvarnabhumi airport.
Bus from Suvarnabhumi Airport to Jomtien Beach in Pattaya.
Our accommodation was near Jomtien beach, so I got off at the last stop, as the bus terminates at Jomtien beach. Then I decided to walk towards my accommodation. I was using OsmAnd for navigation. However, the place was not marked on OpenStreetMap, and it turned out I missed the street my hostel was on and walked around 1 km further, as I was chasing a similarly named but incorrect hostel on OpenStreetMap. Then I asked for help from two men sitting at a café. One of them said he would help me find the street my hostel was on, so I walked with him, and he told me he had lived in Thailand for many years but was from Kuwait. He also gave me valuable information, like telling me about the shared hail-and-ride songthaews which run along Jomtien Second Road and charge 10 baht for any distance on their route. This tip significantly reduced our expenses. Further, he suggested 7-Eleven shops for buying a local SIM. Like Malaysia, Thailand has 24/7 7-Eleven convenience stores, a lot of them not even 100 m apart. The Kuwaiti person dropped me at the address where my hostel was. I tried searching for a person in charge of the hostel, and soon I realized there was no reception. After asking locals for help for some time, I bumped into Fletcher, who had also come to this address and was searching for the same. After finding a friend, I breathed a sigh of relief. Adjacent to the property there was a hairdresser's shop. We went there and asked about the property. The woman called the owner, who told us the passcodes required to go inside. Our accommodation was in a room on the second floor, which required us to enter a passcode to open the door. We entered the passcode and entered the room. So, we stayed at this hostel which had no reception; due to this, it took 2 hours to find our room and get in. It reminded me of a difficult experience I had in Albania, where Akshat and I were not able to find our apartment on one of the hottest days and the owner didn't know our language. Traveling from the place where the bus dropped me to the hostel, I saw streets filled with bars and massage parlors, which was expected. Prostitutes were everywhere. We went out at night towards the beach and also roamed around 7-Elevens to buy a SIM card for myself. I got a SIM with 7 days of unlimited internet for 399 baht. It turns out that the rates for SIM cards at the airport were not so different from inside the city.
Road near Jomtien beach in Pattaya
Photo of a songthaew in Pattaya. There are shared songthaews which run along Jomtien Second Road and charge 10 baht to go anywhere on the route.
Jomtien Beach in Pattaya.
In terms of speaking English, locals didn't know English at all in both Pattaya and Bangkok. I normally don't expect locals to know English in a non-English speaking country, but the fact that Bangkok is one of the cities most visited by tourists made me expect locals to know some English. Talking to locals is an integral part of travel for me, which I couldn't do a lot of in Thailand. This aspect is much more important for me than going to touristy places. So, we were in Pattaya. The next morning, Fletcher and I went to the Tiger Park using a shared songthaew. After that, we planned to visit the Pattaya Floating Market, which is near the Tiger Park, but we felt the ticket prices were higher than it was worth. Fletcher had to leave for Bangkok that day. I suggested he go to Suvarnabhumi Airport from the Jomtien beach bus terminal (this was the route I took on my last day, in the opposite direction) to avoid traffic congestion inside Bangkok, as he could follow up with the metro once he reached the airport. From the floating market, we were walking in sweltering heat to reach Jomtien beach. I tried asking for a lift and eventually succeeded, as a scooty stopped, and surprisingly the person gave a ride to both of us. He was from Delhi, so maybe that's the reason he stopped for us. Then we took a songthaew to the bus terminal and, after having lunch, Fletcher left for Bangkok.
A welcome sign at Pattaya Floating market.
This Korean Vegetasty noodles pack was yummy and was available at many 7-Eleven stores.
The next day I went to Bangkok, but Fletcher had already left for Kuala Lumpur. Here I had booked a private room in a hotel (instead of a hostel) for four nights, mainly because of my luggage. This cost 5600 INR for four nights. It was 2 km from the metro station, which I walked both ways. In Bangkok, I visited Sukhumvit and Siam by metro. Going to some areas requires crossing the Chao Phraya river; for this, I took the Chao Phraya Express Boat to go to places like Khao San Road and Wat Arun. I would recommend taking the boat ride as it had very good views. In Bangkok, I met a person from Pakistan staying in my hotel, so here also I got some company. But by the time I met him, my days were almost over. So, we went to a random restaurant selling Indian food, where we ate a paneer dish with naan; the restaurant person was from Myanmar.
Wat Arun temple stamps your hand upon entry
Wat Arun temple
Khao San Road
A food stall at Khao San Road
Chao Phraya Express Boat
For eating, I mainly relied on fruits and convenience stores. Bananas were very tasty; this was the first time I saw banana flesh being yellow. Mangoes were delicious, and pineapples were smaller and flavorful. I also ate rose apple, which I had never had before. I had chhole kulche once in Sukhumvit; that was a little expensive as it cost 164 baht. I also used to buy premix coffee packets from 7-Eleven convenience stores and prepare them inside the stores.
Banana with yellow flesh
Fruits at a stall in Bangkok
Trimmed pineapples from Thailand.
Corn in Bangkok.
A board showing coffee menu at a 7-Eleven store along with rates in Pattaya.
In this section of 7-Eleven, you can buy a premix coffee and mix it with hot water provided at the store to prepare it.
My booking from Bangkok to Delhi was on an Air India flight, and they were serving alcohol on the flight. I chose red wine, and this was my first time having alcohol on a flight.
Red wine being served in Air India

Notes
  • In this whole trip spanning two weeks, I did not pay for drinking water (except once in Pattaya, where it was 9 baht) or toilets. Bangkok and Kuala Lumpur have plenty of malls where you should find a free-of-cost toilet nearby. For drinking water, I relied mainly on my accommodation providing refillable water for my bottle.
  • Thailand seemed more expensive than Malaysia on average, though Malaysia had discounted prices due to Chinese New Year.
  • I liked Pattaya more than Bangkok, maybe because Pattaya has a beach and Bangkok doesn't. Pattaya seemed more lively, and I could meet and talk to a few people there, as opposed to Bangkok.
  • The Chao Phraya River express boat costs 150 baht for a one-day pass, with which you can hop on and off any boat.

18 March 2024

Simon Josefsson: Apt archive mirrors in Git-LFS

My effort to improve transparency and confidence of public apt archives continues. I started to work on this in Apt Archive Transparency, in which I mention the debdistget project in passing. Debdistget is responsible for mirroring index files for some public apt archives. I've realized that having a publicly auditable and preserved mirror of the apt repositories is central to being able to do apt transparency work, so the debdistget project has become more central to my project than I thought. Currently I track Trisquel, PureOS, Gnuinos and their upstreams Ubuntu, Debian and Devuan. Debdistget downloads Release/Package/Sources files and stores them in a git repository published on GitLab. Due to size constraints, it uses two repositories: one for the Release/InRelease files (which are small) and one that also includes the Package/Sources files (which are large). See for example the repository for Trisquel release files and the Trisquel package/sources files. Repositories for all distributions can be found in debdistutils' archives GitLab sub-group. The reason for splitting into two repositories was that the git repository for the combined files became large, and some of my use-cases only needed the release files. Currently the repositories with packages (which contain a couple of months' worth of data now) are 9GB for Ubuntu, 2.5GB for Trisquel/Debian/PureOS, 970MB for Devuan and 450MB for Gnuinos. The repository size is correlated to the size of the archive (for the initial import) plus the frequency and size of updates. Ubuntu's use of Apt Phased Updates (which triggers a higher churn of Packages file modifications) appears to be the primary reason for its larger size. Working with large Git repositories is inefficient, and the GitLab CI/CD jobs generate quite some network traffic downloading the git repository over and over again. The heaviest user is the debdistdiff project, which downloads all distribution package repositories to do diff operations on the package lists between distributions. The daily job takes around 80 minutes to run, with the majority of the time spent on downloading the archives. Yes, I know I could look into runner-side caching, but I dislike the complexity caused by caching. Fortunately not all use-cases require the package files. The debdistcanary project only needs the Release/InRelease files, in order to commit signatures to the Sigstore and Sigsum transparency logs. These jobs still run fairly quickly, but watching the repository size grow worries me. Currently these repositories are at Debian 440MB, PureOS 130MB, Ubuntu/Devuan 90MB, Trisquel 12MB, Gnuinos 2MB. Here I believe the main size correlation is update frequency, and Debian is large because I track the volatile unstable. So I hit a scalability end with my first approach. A couple of months ago I solved this by discarding and resetting these archival repositories. The GitLab CI/CD jobs were fast again and all was well. However this meant discarding precious historic information. A couple of days ago I was reaching the limits of practicality again, and started to explore ways to fix this. I like having data stored in git (it allows easy integration with software integrity tools such as GnuPG and Sigstore, and the git log provides a kind of temporal ordering of data), so it felt like giving up on nice properties to use a traditional database with an on-disk approach. So I started to learn about Git-LFS, and understanding that it was able to handle multi-GB worth of data looked promising.
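To make the mirroring idea concrete, here is a minimal sketch of such an index-mirroring job; the archive URL, suite and directory layout are illustrative assumptions, not debdistget's actual configuration, and the mirror/ directory is assumed to already be a git repository:
# Fetch the small index files for one suite and commit them to git.
# BASE, SUITE and the mirror/ layout are hypothetical examples.
set -eu
BASE=https://deb.debian.org/debian
SUITE=bookworm
mkdir -p mirror/dists/$SUITE
for f in InRelease Release Release.gpg; do
  curl -fsS "$BASE/dists/$SUITE/$f" -o "mirror/dists/$SUITE/$f"
done
cd mirror
git add -A
git commit -m "Update $SUITE index files $(date -u +%F)" || true  # no-op when nothing changed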
Fairly quickly I scripted up a GitLab CI/CD job that incrementally updates the Release/Package/Sources files in a git repository that uses Git-LFS to store all the files. The repository size is now at Ubuntu 650kb, Debian 300kb, Trisquel 50kb, Devuan 250kb, PureOS 172kb and Gnuinos 17kb. As can be expected, jobs are quick to clone the git archives: debdistdiff pipelines went from a run-time of 80 minutes down to 10 minutes, which more reasonably correlates with the archive size and CPU run-time. The LFS storage size for those repositories is at Ubuntu 15GB, Debian 8GB, Trisquel 1.7GB, Devuan 1.1GB, PureOS/Gnuinos 420MB. This is for a couple of days' worth of data. It seems native Git is better at compressing/deduplicating data than Git-LFS is: the combined size for Ubuntu is already 15GB for a couple of days' data, compared to 8GB for a couple of months' worth of data with pure Git. This may be a sub-optimal implementation of Git-LFS in GitLab, but it does worry me that this new approach will be difficult to scale too. At some level the difference is understandable: Git-LFS probably stores two different Packages files of around 90MB each for Trisquel as two 90MB files, whereas native Git would store one compressed version of the 90MB file and one relatively small patch to turn the old file into the next one. So the Git-LFS approach surprisingly scales less well for overall storage size. Still, the original repository is much smaller, and you usually don't have to pull all LFS files anyway, so it is a net win. Throughout this work, I kept thinking about how my approach relates to Debian's snapshot service. Ultimately what I would want is a combination of these two services. To have a good foundation to do transparency work I would want to have a collection of all Release/Packages/Sources files ever published, and ultimately also the source code and binaries. While it makes sense to start on the latest stable releases of distributions, this effort should scale backwards in time as well. For reproducing binaries from source code, I need to be able to securely find earlier versions of binary packages used for rebuilds, so I need to import all the Release/Packages/Sources packages from snapshot into my repositories. The latency to retrieve files from that server is high, so I haven't been able to find an efficient/parallelized way to download the files. If I'm able to finish this, I would have confidence that my new Git-LFS based approach to storing these files will scale over many years to come. This remains to be seen; perhaps the repository has to be split up per release or per architecture or similar. Another factor is storage costs. While the git repository size for a Git-LFS based repository with files from several years may be possible to sustain, the Git-LFS storage size surely won't be. It seems GitLab charges the same for files in repositories and in Git-LFS, around $500 per 100GB per year. It may be possible to set up a separate Git-LFS backend not hosted at GitLab to serve the LFS files. Does anyone know of a suitable server implementation for this? I had a quick look at the Git-LFS implementation list and it seems the closest reasonable approach would be to set up the Gitea-clone Forgejo as a self-hosted server. Perhaps a cloud storage approach à la S3 is the way to go? The cost to host this on GitLab will be manageable for up to ~1TB ($5000/year), but scaling it to storing, say, 500TB of data would mean a yearly fee of $2.5M, which seems like poor value for the money.
I realized that ultimately I would want a git repository locally with the entire content of all apt archives, including their binary and source packages, ever published. The storage requirements for a service like snapshot (~300TB of data?) are today not prohibitively expensive: 20TB disks are $500 a piece, so a storage enclosure with 36 disks would be around $18,000 for 720TB, and using RAID1 means 360TB, which is a good start. While I have heard about ~TB-sized Git-LFS repositories, would Git-LFS scale to 1PB? Perhaps the size of a git repository with multi-millions of Git-LFS pointer files will become unmanageable? To get started on this approach, I decided to import a mirror of Debian's bookworm for amd64 into a Git-LFS repository. That is around 175GB, so reasonably cheap to host even on GitLab ($1000/year for 200GB). Having this repository publicly available will make it possible to write software that uses this approach (e.g., porting debdistreproduce), to find out if this is useful and if it could scale. Distributing the apt repository via Git-LFS would also enable other interesting ideas for protecting the data. Consider configuring apt to use a local file:// URL to this git repository, and verifying the git checkout using some method similar to Guix's approach to trusting git content or Sigstore's gitsign (a sketch of such an apt configuration follows after the commands below). A naive push of the 175GB archive in a single git commit ran into pack size limitations: remote: fatal: pack exceeds maximum allowed size (4.88 GiB). However, breaking up the commit into smaller commits for parts of the archive made it possible to push the entire archive. Here are the commands to create this repository: git init
git lfs install
git lfs track 'dists/**' 'pool/**'
git add .gitattributes
git commit -m"Add Git-LFS track attributes." .gitattributes
time debmirror --method=rsync --host ftp.se.debian.org --root :debian --arch=amd64 --source --dist=bookworm,bookworm-updates --section=main --verbose --diff=none --keyring /usr/share/keyrings/debian-archive-keyring.gpg --ignore .git .
git add dists project
git commit -m"Add." -a
git remote add origin git@gitlab.com:debdistutils/archives/debian/mirror.git
git push --set-upstream origin --all
for d in pool/*/; do
echo $d;
time git add $d;
git commit -m"Add $d." -a
git push
done
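As a follow-up to the file:// idea mentioned above, this is a hedged sketch of what pointing apt at such a checkout could look like; /srv/debian-mirror is an assumed path to the checked-out repository, and the LFS objects have to be materialized first:
# Fetch the actual dists/pool content behind the Git-LFS pointers.
git -C /srv/debian-mirror lfs pull
# Point apt at the local checkout; path and filename are hypothetical examples.
echo 'deb [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] file:///srv/debian-mirror bookworm main' > /etc/apt/sources.list.d/local-mirror.list
apt update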
The resulting repository size is around 27MB, with Git-LFS object storage around 174GB. I think this approach would scale to handle all architectures for one release, but working with a single git repository for all releases for all architectures may lead to a too-large git repository (>1GB). So maybe one repository per release? These repositories could also be split up on a subset of pool/ files, or there could be one repository per release per architecture, or for sources. Finally, I have concerns about using SHA1 for identifying objects. It seems both Git and Debian's snapshot service are currently using SHA1. For Git there is the SHA-256 transition, and it seems GitLab is working on support for SHA256-based repositories. For serious long-term deployment of these concepts, it would be nice to go for SHA256 identifiers directly. Git-LFS already uses SHA256, but Git internally uses SHA1, as does the Debian snapshot service. What do you think? Happy Hacking!

13 March 2024

Russell Coker: The Shape of Computers

Introduction There have been many experiments with the sizes of computers, some of which have stayed around and some of which have gone away. The trend has been to make computers smaller; the early computers had buildings for them. Recently, for some classes of computers, they have started becoming as small as could be reasonably desired. For example, phones are thin enough that they can blow away in a strong breeze, smart watches are much the same size as the old fashioned watches they replace, and NUC type computers are as small as they need to be given the size of monitors etc that they connect to. This means that further development in the size and shape of computers will largely be determined by human factors. I think we need to consider how computers might be developed to better suit humans and how to write free software to make such computers usable without being constrained by corporate interests. Those of us who are involved in developing OSs and applications need to consider how to adjust to the changes and ideally anticipate changes. While we can't anticipate the details of future devices, we can easily predict general trends such as being smaller, higher resolution, etc. Desktop/Laptop PCs When home computers first came out it was standard to have the keyboard in the main box, the Apple ][ being the most well known example. This has lost popularity due to the demand for multiple options for a light keyboard that can be moved for convenience, combined with multiple options for the box part. But it still pops up occasionally, such as the Raspberry Pi 400 [1], which succeeds due to the computer part being small and light. I think this type of computer will remain a niche product. It could be used in an "add a screen to make a laptop" model as opposed to the "add a keyboard to a tablet to make a laptop" model, but a tablet without a keyboard is more useful than a non-server PC without a display. The PC as a box with connections for keyboard, display, etc has a long future ahead of it. But the sizes will probably decrease (they should have stopped making PC cases to fit CD/DVD drives at least 10 years ago). The NUC size is a useful option, and I think that DVD drives will stop being used for software soon, which will allow a range of smaller form factors. The regular laptop is something that will remain useful, but tablets with detachable keyboards could take a lot of that market. Full functionality for all tasks requires a keyboard because at the moment text editing with a touch screen is an unsolved problem in computer science [2]. The Lenovo Thinkpad X1 Fold [3] and related Lenovo products are very interesting. Advances in materials allow laptops to be thinner and lighter, which leaves the screen size as a major limitation to portability. There is a conflict between desiring a large screen to see lots of content and wanting a small size to carry, and making a device foldable is an obvious solution that has recently become possible. Making a foldable laptop drives a desire for not having a permanently attached keyboard, which then makes a touch screen keyboard a requirement. So this means that user interfaces for PCs have to be adapted to work well on touch screens. The Think line seems to be continuing the history of innovation that it had when owned by IBM. There is also a range of other laptops that have two regular screens, so they are essentially the same as the Thinkpad X1 Fold but with two separate screens instead of one folding one; prices are as low as $600US.
I think that the typical interfaces for desktop PCs (e.g. MS-Windows and KDE) don't work well for small devices and touch devices, and the Android interface generally isn't a good match for desktop systems. We need to invent more options for this. This is not a criticism of KDE, I use it every day and it works well, but it's designed for use cases that don't match new hardware that is on sale. As an aside, it would be nice if Lenovo gave samples of their newest gear to people who make significant contributions to GUIs. Give a few Thinkpad Fold devices to KDE people, a few to GNOME people, and a few others to people involved in Wayland development, and see how that promotes software development and future sales. We also need to adopt features from laptops and phones into desktop PCs. When voice recognition software was first released in the 90s it was for desktop PCs; it didn't take off largely because it wasn't very accurate (none of them recognised my voice). Now voice recognition in phones is very accurate, and it's very common for desktop PCs to have a webcam or headset with a microphone, so it's time for this to be re-visited. GPS support in laptops is obviously useful and can work via Wifi location, via a USB GPS device, or via wwan mobile phone hardware (even if not used for wwan networking). Another possibility is using the same software interfaces as used for GPS on laptops for a static definition of location for a desktop PC or server. The Interesting New Things Watch Like The wrist-watch [4] has been a standard format for easy access to data when on the go since its military use at the end of the 19th century, when the practical benefits beat the supposed femininity of the watch. So it seems most likely that they will continue to be in widespread use in computerised form for the foreseeable future. For comparison, smart phones have been in widespread use as pocket watches for about 10 years. The question is how will watch computers end up? Will we have Dick Tracy style watch phones that you speak into? Will it be the current smart watch functionality of using the watch to answer a call which goes to a bluetooth headset? Will smart watches end up taking over the functionality of the calculator watch [5] which was popular in the 80s? With today's technology you could easily have a fully capable PC strapped to your forearm; would that be useful? Phone Like Folding phones (originally popularised as Star Trek Tricorders) seem likely to have a long future ahead of them. Engineering technology has only recently developed to the stage of allowing them to work the way people would hope them to work (a folding screen with no gaps). Phones and tablets with multiple folds are coming out now [6]. This will allow phones to take much of the market share that tablets used to have, while tablets and laptops merge at the high end. I've previously written about Convergence between phones and desktop computers [7]; the increased capabilities of phones add to the case for Convergence. Folding phones also provide new possibilities for the OS. The Oppo OnePlus Open and the Google Pixel Fold both have a UI based around using the two halves of the folding screen for separate data at some times. I think that the current user interfaces for desktop PCs don't properly take advantage of multiple monitors, and the possibilities raised by folding phones only add to the lack.
My pet peeve with multiple monitor setups is when they don't make it obvious which monitor has keyboard focus, so you send a CTRL-W or ALT-F4 to the wrong screen by mistake; it's a problem that also happens on a single screen but is worse with multiple screens. There are rumours of phones described as "three fold" (where three means the number of segments, with two folds between them); it will be interesting to see how that goes. Will phones go the same way as PCs in terms of having a separation between the compute bit and the input device? It's quite possible to have a compute device in the phone form factor inside a secure pocket which talks via Bluetooth to another device with a display and speakers. Then you could change your phone between a phone-size display and a tablet-sized display easily, and when using your phone a thief would not be able to easily steal the compute bit (which has passwords etc). Could the watch part of the phone (strapped to your wrist and difficult to steal) be the active part and have a tablet-size device as an external display? There are already announcements of smart watches with up to 1GB of RAM (same as the Samsung Galaxy S3); that's enough for a lot of phone functionality. The Rabbit R1 [8] and the Humane AI Pin [9] have some interesting possibilities for AI speech interfaces. Could that take over some of the current phone use? It seems that visually impaired people have been doing badly in the trend towards touch screen phones, so a voice interface phone would be a good option for them. As an aside, I hope some people are working on AI stuff for FOSS devices. Laptop Like One interesting PC variant I just discovered is the Higole 2 Pro portable battery operated Windows PC with a 5.5" touch screen [10]. It looks too thick to fit in the same pockets as current phones but is still very portable. The version with a built in battery is $AU423, which is in the usual price range for low end laptops and tablets. I don't think this is the future of computing, but it is something that is usable today while we wait for foldable devices to take over. The recent release of the Apple Vision Pro [11] has driven interest in 3D and head mounted computers. I think this could be a useful peripheral for a laptop or phone, but it won't be part of a primary computing environment. In 2011 I wrote about the possibility of using augmented reality technology for providing a desktop computing environment [12]. I wonder how a Vision Pro would work for that on a train or passenger jet. Another interesting thing that's on offer is a laptop with a 7" touch screen beside the keyboard [13]. It seems that someone just looked at what parts are available cheaply in China (due to being parts of more popular devices) and what could fit together. I think a keyboard should be central to the monitor for serious typing, but there may be useful corner cases where typing isn't that common and a touch-screen display is of use. Developing a range of strange hardware and then seeing which ones get adopted is a good thing, and an advantage of Ali Express and Temu. Useful Hardware for Developing These Things I recently bought a second hand Thinkpad X1 Yoga Gen3 for $359 which has stylus support [14], and it's generally a great little laptop in every other way. There's a common failure case of that model where touch support for fingers breaks but the stylus still works, which allows it to be used for testing touch screen functionality while making it cheap.
The PineTime is a nice smart watch from Pine64 which is designed to be open [15]. I am quite happy with it but haven't done much with it yet (apart from wearing it every day and getting alerts etc from Android). At $50 when delivered to Australia it's significantly more expensive than most smart watches with similar features, but still a lot cheaper than the high end ones. Also the Raspberry Pi Watch [16] is interesting too. The PinePhonePro is an OK phone made to open standards, but its hardware isn't as good as Android phones released in the same year [17]. I've got some useful stuff done on mine, but the battery life is a major issue and the screen resolution is low. The Librem 5 phone from Purism has a better hardware design for security, with switches to disable functionality [18], but it's even slower than the PinePhonePro. These are good devices for test and development, but not ones that many people would be excited to use every day. Wwan hardware (for accessing the phone network) in M.2 form factor can be obtained for free if you have access to old/broken laptops. Such devices start at about $35 if you want to buy one. USB GPS devices also start at about $35, so probably not worth getting if you can get a wwan device that does GPS as well. What We Must Do Debian appears to have some voice input software in the pocketsphinx package but no documentation on how it's to be used. This would be a good thing to document; I spent 15 mins looking at it and couldn't get it going (see the hedged sketch at the end of this post). To take advantage of the hardware features in phones we need software support, and we ideally don't want free software to lag too far behind proprietary software, which IMHO means the typical Android setup for phones/tablets. Support for changing screen resolution is already there, as is support for touch screens. Support for adapting the GUI to changed screen size is something that needs to be done; even today's hardware of connecting a small laptop to an external monitor doesn't have the ideal functionality for changing the UI. There also seem to be some limitations in touch screen support with multiple screens; I haven't investigated this properly yet, but it definitely doesn't work in an expected manner in Ubuntu 22.04 and I haven't yet tested the combinations on Debian/Unstable. ML is becoming a big thing and it has some interesting use cases for small devices, where a smart device can compensate for limited input options. There's a lot of work that needs to be done in this area, and we are limited by the fact that we can't just rip off the work of other people for use as training data in the way that corporations do. Security is more important for devices that are at high risk of theft. The vast majority of free software installations are way behind Android in terms of security, and we need to address that. I have some ideas for improvement, but there is always a conflict between security and usability; while Android is usable for its own special apps, it's not usable in an "I want to run applications that use any files from any other applications in any way I want" sense. My post about Sandboxing Phone apps is relevant for people who are interested in this [19]. We also need to extend security models to cope with things like "OK Google" type functionality, which has the potential to be a bug, and the emerging class of LLM based attacks. I will write more posts about these things. Please write comments mentioning FOSS hardware and software projects that address these issues, and also documentation for such things.
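On the pocketsphinx note above, here is a hedged sketch of getting a first recognition result. The exact command depends on which pocketsphinx version is packaged: older releases shipped a pocketsphinx_continuous binary, while the 5.x series replaced it with a single pocketsphinx CLI, so treat these as starting points rather than documented Debian usage:
# Older pocketsphinx (0.8 era): live microphone recognition, logging silenced.
pocketsphinx_continuous -inmic yes -logfn /dev/null
# pocketsphinx 5.x: the equivalent subcommand, which prints hypotheses as JSON.
pocketsphinx live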

25 February 2024

Russ Allbery: Review: The Fund

Review: The Fund, by Rob Copeland
Publisher: St. Martin's Press
Copyright: 2023
ISBN: 1-250-27694-2
Format: Kindle
Pages: 310
I first became aware of Ray Dalio when either he or his publisher plastered advertisements for The Principles all over the San Francisco 4th and King Caltrain station. If I recall correctly, there were also constant radio commercials; it was a whole thing in 2017. My brain is very good at tuning out advertisements, so my only thought at the time was "some business guy wrote a self-help book." I think I vaguely assumed he was a CEO of some traditional business, since that's usually who writes heavily marketed books like this. I did not connect him with hedge funds or Bridgewater, which I have a bad habit of confusing with Blackwater. The Principles turns out to be more of a laundered cult manual than a self-help book. And therein lies a story. Rob Copeland is currently with The New York Times, but for many years he was the hedge fund reporter for The Wall Street Journal. He covered, among other things, Bridgewater Associates, the enormous hedge fund founded by Ray Dalio. The Fund is a biography of Ray Dalio and a history of Bridgewater from its founding as a vehicle for Dalio's advising business until 2022 when Dalio, after multiple false starts and title shuffles, finally retired from running the company. (Maybe. Based on the history recounted here, it wouldn't surprise me if he was back at the helm by the time you read this.) It is one of the wildest, creepiest, and most abusive business histories that I have ever read. It's probably worth mentioning, as Copeland does explicitly, that Ray Dalio and Bridgewater hate this book and claim it's a pack of lies. Copeland includes some of their denials (and many non-denials that sound as good as confirmations to me) in footnotes that I found increasingly amusing.
A lawyer for Dalio said he "treated all employees equally, giving people at all levels the same respect and extending them the same perks."
Uh-huh. Anyway, I personally know nothing about Bridgewater other than what I learned here and the occasional mention in Matt Levine's newsletter (which is where I got the recommendation for this book). I have no independent information whether anything Copeland describes here is true, but Copeland provides the typical extensive list of notes and sourcing one expects in a book like this, and Levine's comments indicated it's generally consistent with Bridgewater's industry reputation. I think this book is true, but since the clear implication is that the world's largest hedge fund was primarily a deranged cult whose employees mostly spied on and rated each other rather than doing any real investment work, I also have questions, not all of which Copeland answers to my satisfaction. But more on that later. The center of this book is the Principles. These were an ever-changing list of rules and maxims for how people should conduct themselves within Bridgewater. Per Copeland, although Dalio later published a book by that name, the version of the Principles that made it into the book was sanitized and significantly edited down from the version used inside the company. Dalio was constantly adding new ones and sometimes changing them, but the common theme was radical, confrontational "honesty": never being silent about problems, confronting people directly about anything that they did wrong, and telling people all of their faults so that they could "know themselves better." If this sounds like textbook abusive behavior, you have the right idea. This part Dalio admits to openly, describing Bridgewater as a firm that isn't for everyone but that achieves great results because of this culture. But the uncomfortably confrontational vibes are only the tip of the iceberg of dysfunction. Here are just a few of the ways this played out according to Copeland:
  • In one of the common and all-too-disturbing connections between Wall Street finance and the United States' dysfunctional government, James Comey (yes, that James Comey) ran internal security for Bridgewater for three years, meaning that he was the one who pulled evidence from surveillance cameras for Dalio to use to confront employees during his trials.
  • In case the cult vibes weren't strong enough already, Bridgewater developed its own idiosyncratic language worthy of Scientology. The trials were called "probings," firing someone was called "sorting" them, and rating them was called "dotting," among many other Bridgewater-specific terms. Needless to say, no one ever probed Dalio himself.
  • You will also be completely unsurprised to learn that Copeland documents instances of sexual harassment and discrimination at Bridgewater, including some by Dalio himself, although that seems to be a relatively small part of the overall dysfunction. Dalio was happy to publicly humiliate anyone regardless of gender.
If you're like me, at this point you're probably wondering how Bridgewater continued operating for so long in this environment. (Per Copeland, since Dalio's retirement in 2022, Bridgewater has drastically reduced the cult-like behaviors, deleted its archive of probings, and de-emphasized the Principles.) It was not actually a religious cult; it was a hedge fund that has to provide investment services to huge, sophisticated clients, and by all accounts it's a very successful one. Why did this bizarre nightmare of a workplace not interfere with Bridgewater's business? This, I think, is the weakest part of this book.
Copeland makes a few gestures at answering this question, but none of them are very satisfying. First, it's clear from Copeland's account that almost none of the employees of Bridgewater had any control over Bridgewater's investments. Nearly everyone was working on other parts of the business (sales, investor relations) or on cult-related obsessions. Investment decisions (largely incorporated into algorithms) were made by a tiny core of people and often by Dalio himself. Bridgewater also appears to not trade frequently, unlike some other hedge funds, meaning that they probably stay clear of the more labor-intensive high-frequency parts of the business. Second, Bridgewater took off as a hedge fund just before the hedge fund boom in the 1990s. It transformed from Dalio's personal consulting business and investment newsletter to a hedge fund in 1990 (with an earlier investment from the World Bank in 1987), and the 1990s were a very good decade for hedge funds. Bridgewater, in part due to Dalio's connections and effective marketing via his newsletter, became one of the largest hedge funds in the world, which gave it a sort of institutional momentum. No one was questioned for putting money into Bridgewater even in years when it did poorly compared to its rivals. Third, Dalio used the tried and true method of getting free publicity from the financial press: constantly predict an upcoming downturn, and aggressively take credit whenever you were right. From nearly the start of his career, Dalio predicted economic downturns year after year. Bridgewater did very well in the 2000 to 2003 downturn, and again during the 2008 financial crisis. Dalio aggressively takes credit for predicting both of those downturns and positioning Bridgewater correctly going into them. This is correct; what he avoids mentioning is that he also predicted downturns in every other year, the majority of which never happened. These points together create a bit of an answer, but they don't feel like the whole picture and Copeland doesn't connect the pieces. It seems possible that Dalio may simply be good at investing; he reads obsessively and clearly enjoys thinking about markets, and being an abusive cult leader doesn't take up all of his time. It's also true that to some extent hedge funds are semi-free money machines, in that once you have a sufficient quantity of money and political connections you gain access to investment opportunities and mechanisms that are very likely to make money and that the typical investor simply cannot access. Dalio is clearly good at making personal connections, and invested a lot of effort into forming close ties with tricky clients such as pools of Chinese money. Perhaps the most compelling explanation isn't mentioned directly in this book but instead comes from Matt Levine. Bridgewater touts its algorithmic trading over humans making individual trades, and there is some reason to believe that consistently applying an algorithm without regard to human emotion is a solid trading strategy in at least some investment areas. Levine has asked in his newsletter, tongue firmly in cheek, whether the bizarre cult-like behavior and constant infighting is a strategy to distract all the humans and keep them from messing with the algorithm and thus making bad decisions. Copeland leaves this question unsettled. Instead, one comes away from this book with a clear vision of the most dysfunctional workplace I have ever heard of, and an endless litany of bizarre events each more astonishing than the last. 
If you like watching train wrecks, this is the book for you. The only drawback is that, unlike other entries in this genre such as Bad Blood or Billion Dollar Loser, Bridgewater is a wildly successful company, so you don't get the schadenfreude of watching a house of cards collapse. You do, however, get a helpful mental model to apply to the next person who tries to talk to you about "radical honesty" and "idea meritocracy."

The flaw in this book is that the existence of an organization like Bridgewater points to systematic flaws in how our society works, which Copeland is largely uninterested in interrogating. "How could this have happened?" is a rather large question to leave unanswered. The sheer outrageousness of Dalio's behavior also gets a bit tiring by the end of the book, once you've seen the patterns and are hearing about the fourth variation. But this is still an astonishing book, and a worthy entry in the genre of capitalism disasters.

Rating: 7 out of 10

18 February 2024

Russell Coker: Release Years

In 2008 I wrote about the idea of having a scheduled release for Debian and other distributions, as Mark Shuttleworth had proposed [1]. I still believe that Mark's original idea for synchronised release dates of Linux distributions (or at least synchronised feature sets) is a good one, but unfortunately it didn't take off.

Having been using Ubuntu a bit recently, I've found its version numbering system to be really good. Ubuntu 16.04 was released in April 2016 and its support ended 5 years later, in about April 2021, so any commonly available computer from 2020 should run it well and versions of applications released in about 2017 should run on it. If I have to support a Debian 10 (Buster) system I need to start with a web search to discover when it was released (July 2019). That suggests that applications packaged for Ubuntu 18.04 are likely to run on it. If we had the same numbering system for Debian and Ubuntu then it would be easier to compare versions (the sketch below illustrates the difference). Debian 19.06 would be obviously similar to Ubuntu 18.04, or we could plan for the future and call it Debian 2019.

Then it would be ideal if hardware vendors did the same thing (as car manufacturers have been doing for a long time). Which versions of Ubuntu and Debian would run well on a Dell PowerEdge R750? It takes a little searching to discover that the R750 was released in 2021, but if it were called a PowerEdge 2021R70 then it would be quite obvious that Ubuntu 2022.04 would run well on it and that Ubuntu 2020.04 probably has a kernel update with all the hardware supported.

One of the benefits for the car industry of naming model years is that it drives the purchase of new cars. A 2015 car probably isn't going to impress anyone and you will know that it is missing some of the features of more recent models. It would be easier to get management to sign off on replacing old computers if they had 2015 on the front; trying to estimate the hidden costs of support and the lost productivity of using old computers is hard, but saying that it's a 2015 model and way out of date is easy.

There is a history of using dates as software versions. The Reference Policy for SE Linux [2] (which is used for Debian) has releases based on date. During the Debian development process I upload policy to Debian based on the upstream Git and use the same version numbering scheme, which is more convenient than the append-git-date-to-last-full-release system that some maintainers are forced to use. Users can get an idea of how much the Debian/Unstable policy has diverged from the last full release by looking at the dates. Also an observer might see the short gap between the release dates of SE Linux policy and Debian release freeze dates as an indication that I beg the upstream maintainers to make a new release just before each Debian freeze, which is exactly what I do.

When I took over the Portslave [3] program I made releases based on date because there were several forks with different version numbering schemes, so my options were to just choose a higher number (which is OK initially but doesn't scale if there are more forks) or to use a date and let people know that the most recent date is the most recent version. The fact that the latest release of Portslave is version 2010.04.19 shows that I have not been maintaining it in recent years (due to lack of hardware as well as lack of interest), so if someone wants to take over the project I would be happy to help them do so!
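To make the comparability point concrete, here is a minimal Python sketch (mine, not from the original post) contrasting the two schemes: with year.month version numbers the release date can be read straight out of the version string, while Debian's plain numeric versions need an external lookup table. The Debian release dates in the table are real; the release_date function is purely illustrative.

    # Minimal sketch: year.month versions are self-describing,
    # numeric versions need a lookup table to recover the release date.

    DEBIAN_RELEASE_DATES = {
        "9": "2017.06",   # Stretch
        "10": "2019.07",  # Buster
        "11": "2021.08",  # Bullseye
        "12": "2023.06",  # Bookworm
    }

    def release_date(distro: str, version: str) -> str:
        """Return the year.month release date for a distro version."""
        if distro == "ubuntu":
            # Ubuntu versions like "18.04" already encode year and month
            # (assuming a 20xx release year).
            year, month = version.split(".")
            return f"20{year}.{month}"
        if distro == "debian":
            # Debian's numeric versions carry no date, so consult a table.
            return DEBIAN_RELEASE_DATES[version]
        raise ValueError(f"unknown distro: {distro}")

    print(release_date("debian", "10"))     # 2019.07
    print(release_date("ubuntu", "18.04"))  # 2018.04
    # Comparing contemporaries becomes a simple string comparison:
    assert release_date("debian", "10") > release_date("ubuntu", "18.04")

If Debian used year.month numbering the lookup table would disappear entirely, which is the point being made here.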
I don't expect people at Dell and other hardware vendors to take much notice of my ideas (I have tweeted them photographic evidence of a problem without getting a good response). But hopefully this will start a discussion in the free software community.
