Search Results: "aph"

12 July 2024

Reproducible Builds: Reproducible Builds in June 2024

Welcome to the June 2024 report from the Reproducible Builds project! In our reports, we outline what we ve been up to over the past month and highlight news items in software supply-chain security more broadly. As always, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Next Reproducible Builds Summit dates announced
  2. GNU Guix patch review session for reproducibility
  3. New reproducibility-related academic papers
  4. Misc development news
  5. Website updates
  6. Reproducibility testing framework


Next Reproducible Builds Summit dates announced We are very pleased to announce the upcoming Reproducible Builds Summit, set to take place from September 17th 19th 2024 in Hamburg, Germany. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. If you re interesting in joining us this year, please make sure to read the event page which has more details about the event and location. We are very much looking forward to seeing many readers of these reports there.

GNU Guix patch review session for reproducibility Vagrant Cascadian will be holding a Reproducible Builds session as part of the monthly Guix patch review series on July 11th at 17:00 UTC. These online events are intended to encourage everyone everyone becoming a patch reviewer and the goal of reviewing patches is to help Guix project accept contributions while maintaining our quality standards and learning how to do patch reviews together in a friendly hacking session.

Development news In Debian this month, 4 reviews of Debian packages were added, 11 were updated and 14 were removed this month adding to our knowledge about identified issues. Only one issue types was updated, though, explaining that we don t vary the build path anymore.
On our mailing list this month, Bernhard M. Wiedemann wrote that whilst he had previously collected issues that introduce non-determinism he has now moved on to discuss about mitigations , in the sense of how can we avoid whole categories of problem without patching an infinite number of individual packages . In addition, Janneke Nieuwenhuizen announced the release of two versions of GNU Mes. [ ][ ]
In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.
In NixOS, with the 24.05 release out, it was again validated that our minimal ISO is reproducible by building it on a virtual machine with no access to the binary cache.
What s more, we continued to write patches in order to fix specific reproducibility issues, including Bernhard M. Wiedemann writing three patches (for qutebrowser, samba and systemd), Chris Lamb filing Debian bug #1074214 against the fastfetch package and Arnout Engelen proposing fixes to refind and for the Scala compiler [ .
Lastly, diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded two versions (270 and 271) to Debian, and made the following changes as well:
  • Drop Build-Depends on liblz4-tool in order to fix Debian bug #1072575. [ ]
  • Update tests to support zipdetails version 4.004 that is shipped with Perl 5.40. [ ]

Website updates There were a number of improvements made to our website this month, including Akihiro Suda very helpfully making the <h4> elements more distinguishable from the <h3> level [ ][ ] as well as adding a guide for Dockerfile reproducibility [ ]. In addition Fay Stegerman added two tools, apksigcopier and reproducible-apk-tools, to our Tools page.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:
  • Marking the virt(32 64)c-armhf nodes as down. [ ]
  • Granting a developer access to the osuosl4 node in order to debug a regression on the ppc64el architecture. [ ]
  • Granting a developer access to the osuosl4 node. [ ][ ]
In addition, Mattia Rizzolo re-aligned the /etc/default/jenkins file with changes performed upstream [ ] and changed how configuration files are handled on the rb-mail1 host. [ ], whilst Vagrant Cascadian documented the failure of the virt32c and virt64c nodes after initial investigation [ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 July 2024

Thorsten Alteholz: My Debian Activities in June 2024

FTP master This month I accepted 270 and rejected 23 packages. The overall number of packages that got accepted was 279.

Debian LTS This was my hundred-twentieth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded or worked on: This month handling of the CVE of cups was a bit messy. After lifting the embargo of the CVE, a published patch did not work with all possible combinations of the configuration. In other words, in cases of having only one local domain socket configured, the cupsd did not start and failed with a strange error. Anyway, upstream published a new set of patches, which made cups work again. Unfortunately this happended just before the latest point release for Bullseye and Bookworm, so that the new packages did not make it into the release, but stopped in the corresponding p-u-queues: stable-p-u and old-p-u. I also continued to work on tiff and last but not least did a week of FD and attended the monthly LTS/ELTS meeting. Debian ELTS This month was the seventy-first ELTS month. During my allocated time I tried to upload a new version of cups for Jessie and Stretch. Unfortunately this was stopped due to an autopkgtest error, which I could not reproduce yet. I also wanted to finally upload a fixed version of exim4. Unfortunately this was stopped due to lots of CI-jobs for Buster. Updates for Buster are now also availble from ELTS, so some stuff had to prepared before the actual switch end of June. Additionally everything was delayed due to a crash of the CI worker. All in all this month was rather ill-fated. At least the exim4 upload will happen/already happened in July. I also continued to work on an update for libvirt, did a week of FD and attended the LTS/ELTS meeting. Debian Printing This month I uploaded new upstream or bugfix versions of: This work is generously funded by Freexian! Debian Astro This month I uploaded a new upstream or bugfix version of: All of those uploads are somehow related to /usr-move. Debian IoT This month I uploaded new upstream or bugfix versions of: Debian Mobcom The following packages have been prepared by the GSoC student Nathan: misc This month I uploaded new upstream or bugfix versions of: Here as well all uploads are somehow related to /usr-move

Russ Allbery: Review: Beyond Control

Review: Beyond Control, by Kit Rocha
Series: Beyond #2
Publisher: Kit Rocha
Copyright: December 2013
ASIN: B00GIA4GN8
Format: Kindle
Pages: 364
Beyond Control is science fiction erotica (dystopian erotic romance, per the marketing) and a direct sequel to Beyond Shame. These books shift protagonists with each volume and enough of the world background is explained that you could start here, but there are significant spoilers for the previous book. I read this book as part of the Beyond Series Bundle (Books 1-3), which is what the sidebar information is for. This is one of those reviews that I write because I'm stubborn about reviewing all the books I read, not because it's likely to be useful to anyone. There are also considerably more spoilers for the shape of the story than I normally include, so be warned. The Beyond series is erotica. Specifically, so far, consensual BDSM erotica with bisexuality but otherwise typical gender stereotypes. The authors (Kit Rocha is a pen name for Donna Herren and Bree Bridges) are women, so it's more female gaze than male gaze, but by erotica I don't mean romance with an above-average number of steamy scenes. I mean it felt like half the book by page count was descriptions of sex. This review is rather pointless because, one, I'm not going to review the sex that's the main point of the book, and two, I skimmed all the sex and read it for the story because I'm weird. Beyond Shame got me interested in these absurdly horny people and their post-apocalyptic survival struggles in the outskirts of a city run by a religious surveillance state, and I wanted to find out what happened next. Besides, this book promised to focus on my favorite character from the first novel, Lex, and I wanted to read more about her. Beyond Control uses a series pattern that I understand is common in romance but which is not often seen in SFF (my usual genre): each book focuses on a new couple adjacent to the previous couple, while the happily ever after of the previous couple plays out in the background. In this case, it also teases the protagonists of the next book. I can see why romance uses this structure: it's an excuse to provide satisfying interludes for the reader. In between Lex and Dallas's current relationship problems, one gets to enjoy how well everything worked out for Noelle and how much she's grown. In Beyond Shame, Lex was the sort-of partner of Dallas O'Kane, the leader of the street gang that is running Sector Four. (Picture a circle surrounding the rich-people-only city of Eden. That circle is divided into eight wedge-shaped sectors, which provide heavy industries, black-market pleasures, and slums for agricultural workers.) Dallas is an intensely possessive, personally charismatic semi-dictator who cultivates the image of a dangerous barbarian to everyone outside and most of the people inside Sector Four. Since he's supposed to be one of the good guys, this is more image than reality, but it's not entirely disconnected from reality. This book is about Lex and Dallas forming an actual relationship, instead of the fraught and complicated thing they had in the first book. I was hoping that this would involve Dallas becoming less of an asshole. It unfortunately does not, although some of what I attributed to malice may be adequately explained by stupidity. I'm not sure that's an improvement. Lex is great, just like she was in the first book. It's obvious by this point in the series that she does most of the emotional labor of keeping the gang running, and her support is central to Dallas's success. Like most of the people in this story, she has a nasty and abusive background that she's still dealing with in various ways. Dallas's possessiveness is intensely appealing to her, but she wants that possessiveness on different terms than Dallas may be willing to offer, or is even aware of. Lex was, I thought, exceptionally clear about what she wanted out of this relationship. Dallas thinks this is entirely about sex, and is, in general, dumber than a sack of hammers. That means fights. Also orgies, but, well, hopefully you knew what you were getting into if you picked up this book. I know, I know, it's erotica, that's the whole point, but these people have a truly absurd amount of sex. Eden puts birth control in the water supply, which is a neat way to simplify some of the in-story consequences of erotica. They must be putting aphrodisiacs in the water supply as well. There was a lot of sector politics in this book that I found way more interesting than it had any right to be. I really like most of these people, even Dallas when he manages to get his three brain cells connected for more than a few minutes. The events of the first book have a lot of significant fallout, Lex continues being a badass, the social dynamics between the women are very well-done (and pass the Bechdel test yet again even though this is mostly traditional-gender-role erotica), and if Dallas had managed to understand what he did wrong at a deeper-than-emotional level, I would have rather enjoyed the non-erotica story parts. Alas. I therefore wouldn't recommend this book even if I were willing to offer any recommendations about erotica (which I'm not). I was hoping it was going somewhere more rewarding than it did. But I still kind of want to read another one? I am weirdly fascinated with the lives of these people. The next book is about Six, who has the potential to turn into the sort of snarky, cynical character I love reading about. And it's not that hard to skim over the orgies. Maybe Dallas will get one additional brain cell per book? Followed by Beyond Pain. Rating: 5 out of 10

7 July 2024

Russ Allbery: Review: Welcome to Boy.Net

Review: Welcome to Boy.Net, by Lyda Morehouse
Series: Earth's Shadow #1
Publisher: Wizard's Tower Press
Copyright: April 2024
ISBN: 1-913892-71-9
Format: Kindle
Pages: 355
Welcome to Boy.Net is a science fiction novel with cyberpunk vibes, the first of a possible series. Earth is a largely abandoned wasteland. Humanity has survived in the rest of the solar system and spread from Earth's moon to the outer planets. Mars is the power in the inner system, obsessed with all things Earth and effectively run by the Earth Nations' Peacekeeping Force, the ENForcers. An ENForcer soldier is raised in a creche from an early age, implanted with cybernetic wetware and nanite enhancements, and extensively trained to be an elite fighting unit. As befits a proper military, every ENForcer is, of course, male. The ENForcers thought Lucia Del Toro was a good, obedient soldier. They also thought she was a man. They were wrong about those and many other things. After her role in an atrocity that named her the Scourge of New Shanghai, she went AWOL and stole her command ship. Now she and her partner/girlfriend Hawk, a computer hacker from Luna, make a living with bounty hunting jobs in the outer system. The ENForcers rarely cross the asteroid belt; the United Miners see to that. The appearance of an F-class ENForcer battle cruiser in Jupiter orbit is a very unpleasant surprise. Lucia and Hawk hope it has nothing to do with them. That hope is dashed when ENForcers turn up in the middle of their next job: a bounty to retrieve an AI eye. I first found Lyda Morehouse via her AngeLINK cyberpunk series, the last of which was published in 2011. Since then, she's been writing paranormal romance and urban fantasy as Tate Hallaway. This return to science fiction is an adventure with trickster hackers, throwback anime-based cowboy bars, tense confrontations with fascist thugs, and unexpected mutual aid, but its core is a cyberpunk look at the people who are unwilling or unable to follow the rules of social conformity. Gender conformity, specifically. Once you understand what this book is about, Welcome to Boy.Net is a great title, but I'm not sure it serves its purpose as a marketing tool. This is not the book that I would have expected from that title in isolation, and I'm a bit worried that people who would like it might pass it by. Inside the story, Boy.Net is the slang term for the cybernetic network that links all ENForcers. If this were the derogatory term used by people outside the ENForcers, I could see it, but it's what the ENForcers themselves call it. That left me with a few suspension of disbelief problems, since the sort of macho assholes who are this obsessed with male gender conformance usually consider "boys" to be derogatory and wouldn't call their military cybernetic network something that sounds that belittling, even as a joke. It would be named after some sort of Orwellian reference to freedom, or something related to violence, dominance, brutality, or some other "traditional male" virtue. But although this term didn't work for me as world-building, it's a beautiful touch thematically. What Morehouse is doing here is the sort of concretized metaphor that science fiction is so good at: an element of world-building that is both an analogy for something the reader is familiar with and is also a concrete piece of world background that follows believable rules and can be manipulated by the characters. Boy.Net is trying to reconnect to Lucia against her will. If it succeeds, it will treat the body modifications she's made as damage and try to reverse all of them, attempting to convert her back to the model of an ENForcer. But it is also a sharp metaphor for how gender roles are enforced in our world: a child assigned male is connected to a pervasive network of gender expectations and is programmed, shaped, and monitored to match the social role of a boy. Even if they reject those expectations, the gender role keeps trying to reconnect and convert them back. I really enjoyed Morehouse's handling of the gender dynamics. It's an important part of the plot, but it's not the only thing going on or the only thing the characters think about. Lucia is occasionally caught by surprise by well-described gender euphoria, but mostly gender is something other people keep trying to impose on her because they're obsessed with forcing social conformity. The rest of the book is a fun romp with a few memorable characters and a couple of great moments with unexpected allies. Hawk and Lucia have an imperfect but low drama relationship that features a great combination of insight and the occasional misunderstanding. It's the kind of believable human relationship that I don't see very much in science fiction, written with the comfortable assurance of an author with over a dozen books under her belt. Some of the supporting characters are also excellent, including a non-binary deaf hacker that I wish had been a bit more central to the story. This is not the greatest science fiction novel I've read, but it was entertaining throughout and kept me turning the pages. Recommended if you want some solar-system cyberpunk in your life. Welcome to Boy.Net reaches a conclusion of sorts, but there's an obvious hook for a sequel and a lot of room left for more stories. I hope enough people buy this book so that I can read it. Rating: 7 out of 10

6 July 2024

Dirk Eddelbuettel: binb 0.0.7 on CRAN: Maintenance

The seventh release of the binb package, and first in four years, is now on CRAN. binb regroups four rather nice themes for writing LaTeX Beamer presentations much more easily in (R)Markdown. As a teaser, a quick demo combining all four themes is available; documentation and examples are in the package. This release contains a CRAN-requested fix for a just-released new pandoc version which deals differently with overflowing bounding boxes from graphics; an added new LaTeX command is needed. We also polished continuous integration and related internals a few times but this does not contain directly user-facing changes in the package.

Changes in binb version 0.0.7 (2024-07-06)
  • Several rounds of small updates to ge continuous integration setup.
  • An additional LaTeX command needed by pandoc (>= 3.2.1) has been added.

CRANberries provides a summary of changes to the previous version. For questions or comments, please use the issue tracker at the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

3 July 2024

Sahil Dhiman: RTI to NPL Regarding Their NTP Infrastructure

I became interested in Network Time Protocol (NTP) last year after learning how fundamental this protocol is to the functioning of the global Internet. NTP helps synchronize clocks on devices over the Internet, which is essential for secure browsing, timestamping, keeping everyone in sync or just checking what time it is. Computers usually have a hardware real-time clock (RTC) but that deviates over time, so an occasional sync over NTP is required to keep the time accurate. Many network and IoT devices don t have hardware RTC so have even more reliance on NTP. Accurate time keeping starts with reference clocks like atomic clocks, GPS etc. Multiple government standard agencies host these reference clocks, which are regarded as Stratum 0. Stratum 1 servers are known as primary servers, and directly connect to Stratum 0 clocks for time. Stratum 1 servers then distribute time to Stratum 2 and further down the hierarchy. Computers typically connects to one or more Stratum 1/2/3 servers to get their time. Someone has to host these public Stratum 1,2,3 NTP servers. That s what NTP pool, a global effort by volunteers does. They provide NTP servers for the public to use. As of today, there are 4700+ servers in the pool which are free to use for anyone. Now let s come to the reason for writing this post. Indian Computer Emergency Response Team (CERT-In) in April 2022 released a set of cybersecurity directions which set the alarm bells ringing. Internet Society (and almost everyone else) wrote about it. And then there was this specific section about NTP:
All service providers, intermediaries, data centres, body corporate and Government organisations shall connect to the Network Time Protocol (NTP) Server of National Informatics Centre (NIC) or National Physical Laboratory (NPL) or with NTP servers traceable to these NTP servers, for synchronisation of all their ICT systems clocks. Entities having ICT infrastructure spanning multiple geographies may also use accurate and standard time source other than NPL and NIC, however it is to be ensured that their time source shall not deviate from NPL and NIC.
CSIR-National Physical Laboratory (NPL) is the official timekeeper for India and hosts the only public Stratum 1 clock in India, according to NTP pool website. So I was naturally curious to know what kind of infrastructure they re running for NTP. India has a Right to Information (RTI) Act which, like the Freedom of Information Act (FOIA) in the United States, gives citizens rights to request information from governmental entities, to which they have to respond in under 30 days. So last year, I filed two sets of RTI (one after the first reply came) inquiring about NPL s public NTP server setup. The first RTI had some generic questions: RTI 1
First RTI. Click to enlarge
This gave a vague idea about the setup, so I sat down and came with some specific questions in the next RTI. RTI 2
Second RTI. Click to enlarge
Feel free to make your conclusions from it now. Bear in mind these were filled last year so things might have changed. Do let me know if you have more information about it. Update (07/07/2024): Found an article from Medianama about Indian government time servers with information sourced through RTI.

30 June 2024

Joachim Breitner: Do surprises get larger?

The setup Imagine you are living on a riverbank. Every now and then, the river swells and you have high water. The first few times this may come as a surprise, but soon you learn that such floods are a recurring occurrence at that river, and you make suitable preparation. Let s say you feel well-prepared against any flood that is no higher than the highest one observed so far. The more floods you have seen, the higher that mark is, and the better prepared you are. But of course, eventually a higher flood will occur that surprises you. Of course such new record floods are happening rarer and rarer as you have seen more of them. I was wondering though: By how much do the new records exceed the previous high mark? Does this excess decrease or increase over time? A priori both could be. When the high mark is already rather high, maybe new record floods will just barley pass that mark? Or maybe, simply because new records are so rare events, when they do occur, they can be surprisingly bad? This post is a leisurely mathematical investigating of this question, which of course isn t restricted to high waters; it could be anything that produces a measurement repeatedly and (mostly) independently weather events, sport results, dice rolls. The answer of course depends on the distribution of results: How likely is each possible results.

Dice are simple With dice rolls the answer is rather simple. Let our measurement be how often you can roll a die until it shows a 6. This simple game we can repeat many times, and keep track of our record. Let s say the record happens to be 7 rolls. If in the next run we roll the die 7 times, and it still does not show a 6, then we know that we have broken the record, and every further roll increases by how much we beat the old record. But note that how often we will now roll the die is completely independent of what happened before! So for this game the answer is: The excess with which the record is broken is always the same. Mathematically speaking this is because the distribution of rolls until the die shows a 6 is memoryless. Such distributions are rather special, its essentially just the example we gave (a geometric distribution), or its continuous analogue (the exponential distributions, for example the time until a radioactive particle decays).

Mathematical formulation With this out of the way, let us look at some other distributions, and for that, introduce some mathematical notations. Let X be a random variable with probability density function (x) and cumulative distribution function (x), and a be the previous record. We are interested in the behavior of Y(a) = X a X > x i.e. by how much X exceeds a under the condition that it did exceed a. How does Y change as a increases? In particular, how does the expected value of the excess e(a) = E(Y(a)) change?

Uniform distribution If X is uniformly distributed between, say, 0 and 1, then a new record will appear uniformly distributed between a and 1, and as that range gets smaller, the excess must get smaller as well. More precisely, e(a) = E(X a X > a) = E(X X > a) a = (1 a)/2 This not very interesting linear line is plotted in blue in this diagram:
The expected record surpass for the uniform distribution The expected record surpass for the uniform distribution
The orange line with the logarithmic scale on the right tries to convey how unlikely it is to surpass the record value a: it shows how many attempts we expect before the record is broken. This can be calculated by n(a) = 1/(1 (a)).

Normal distribution For the normal distribution (with median 0 and standard derivation 1, to keep things simple), we can look up the expected value of the one-sided truncated normal distribution and obtain e(a) = E(X X > a) a = (a)/(1 (a)) a Now is this growing or shrinking? We can plot this an have a quick look:
The expected record surpass for the normal distribution The expected record surpass for the normal distribution
Indeed it is, too, a decreasing function! (As a sanity check we can see that e(0) = (2/ ), which is the expected value of the half-normal distribution, as it should.)

Could it be any different? This settles my question: It seems that each new surprisingly high water will tend to be less surprising than the previously assuming high waters were uniformly or normally distributed, which is unlikely to be helpful. This does raise the question, though, if there are probability distributions for which e(a) is be increasing? I can try to construct one, and because it s a bit easier, I ll consider a discrete distribution on the positive natural numbers, and consider at g(0) = E(X) and g(1) = E(X 1 X > 1). What does it take for g(1) > g(0)? Using E(X) = p + (1 p)E(X X > 1) for p = P(X = 1) we find that in order to have g(1) > g(0), we need E(X) > 1/p. This is plausible because we get equality when E(X) = 1/p, as it precisely the case for the geometric distribution. And it is also plausible that it helps if p is large (so that the next first record is likely just 1) and if, nevertheless, E(X) is large (so that if we do get an outcome other than 1, it s much larger). Starting with the geometric distribution, where P(X > n X n) = pn = p (the probability of again not rolling a six) is constant, it seems that these pn is increasing, we get the desired behavior. So let p1 < p2 < pn < be an increasing sequence of probabilities, and define X so that P(X = n) = p1 pn 1 (1 pn) (imagine the die wears off and the more often you roll it, the less likely it shows a 6). Then for this variation of the game, every new record tends to exceed the previous more than previous records. As the p increase, we get a flatter long end in the probability distribution.

Gamma distribution To get a nice plot, I ll take the intuition from this and turn to continuous distributions. The Wikipedia page for the exponential distribution says it is a special case of the gamma distribution, which has an additional shape parameter , and it seems that it could influence the shape of the distribution to be and make the probability distribution have a longer end. Let s play around with = 2 and = 0.5, 1 and 1.5:
The expected record surpass for the gamma distribution The expected record surpass for the gamma distribution
  • For = 1 (dotted) this should just be the exponential distribution, and we see that e(a) is flat, as predicted earlier.
  • For larger (dashed) the graph does not look much different from the one for the normal distribution not a surprise, as for , the gamma distribution turns into the normal distribution.
  • For smaller (solid) we get the desired effect: e(a) is increasing. This means that new records tend to break records more impressively.
The orange line shows that this comes at a cost: for a given old record a, new records are harder to come by with smaller .

Conclusion As usual, it all depends on the distribution. Otherwise, not much, it s late.

23 June 2024

Vincent Bernat: Why content providers need IPv6

IPv4 is an expensive resource. However, many content providers are still IPv4-only. The most common reason is that IPv4 is here to stay and IPv6 is an additional complexity.1 This mindset may seem selfish, but there are compelling reasons for a content provider to enable IPv6, even when they have enough IPv4 addresses available for their needs.

Disclaimer It s been a while since this article has been in my drafts. I started it when I was working at Shadow, a content provider, while I now work for Free, an internet service provider.

Why ISPs need IPv6? Providing a public IPv4 address to each customer is quite costly when each IP address costs US$40 on the market. For fixed access, some consumer ISPs are still providing one IPv4 address per customer.2 Other ISPs provide, by default, an IPv4 address shared among several customers. For mobile access, most ISPs distribute a shared IPv4 address. There are several methods to share an IPv4 address:3
NAT44
The customer device is given a private IPv4 address, which is translated to a public one by a service provider device. This device needs to maintain a state for each translation.
464XLAT and DS-Lite
The customer device translates the private IPv4 address to an IPv6 address or encapsulates IPv4 traffic in IPv6 packets. The provider device then translates the IPv6 address to a public IPv4 address. It still needs to maintain a state for the NAT64 translation.
Lightweight IPv4 over IPv6, MAP-E, and MAP-T
The customer device encapsulates IPv4 in IPv6 packets or performs a stateless NAT46 translation. The provider device uses a binding table or an algorithmic rule to map IPv6 tunnels to IPv4 addresses and ports. It does not need to maintain a state.
Solutions to share an IPv4 address
Solutions to share an IPv4 address across several customers. Some of them require the ISP to keep state, some don't.
All these solutions require a translation device in the ISP s network. This device represents a non-negligible cost in terms of money and reliability. As half of the top 1000 websites support IPv6 and the biggest players can deliver most of their traffic using IPv6,4 ISPs have a clear path to reduce the cost of translation devices: provide IPv6 by default to their customers.

Why content providers need IPv6? Content providers should expose their services over IPv6 primarily to avoid going through the ISP s translation devices. This doesn t help users who don t have IPv6 or users with a non-shared IPv4 address, but it provides a better service for all the others. Why would the service be better delivered over IPv6 than over IPv4 when a translation device is in the path? There are two main reasons for that:5
  1. Translation devices introduce additional latency due to their geographical placement inside the network: it is easier and cheaper to only install these devices at a few points in the network instead of putting them close to the users.
  2. Translation devices are an additional point of failure in the path between the user and the content. They can become overloaded or malfunction. Moreover, as they are not used for the five most visited websites, which serve their traffic over IPv6, the ISPs may not be incentivized to ensure they perform as well as the native IPv6 path.
Looking at Google statistics, half of the users reach Google over IPv6. Moreover, their latency is lower.6 In the US, all the nationwide mobile providers have IPv6 enabled. For France, we can refer to the annual ARCEP report: in 2022, 72% of fixed users and 60% of mobile users had IPv6 enabled, with projections of 94% and 88% for 2025. Starting from this projection, since all mobile users go through a network translation device, content providers can deliver a better service for 88% of them by exposing their services over IPv6. If we exclude Orange, which has 40% of the market share on consumer fixed access, enabling IPv6 should positively impact more than 55% of fixed access users.
In conclusion, content providers aiming for the best user experience should expose their services over IPv6. By avoiding translation devices, they can ensure fast and reliable content delivery. This is crucial for latency-sensitive applications, like live streaming, but also for websites in competitive markets, where even slight delays can lead to user disengagement.

  1. A way to limit this complexity is to build IPv6 services and only provide IPv4 through reverse proxies at the edge.
  2. In France, this includes non-profit ISPs, like FDN and Milkywan. Additionally, Orange, the previously state-owned telecom provider, supplies non-shared IPv4 addresses. Free also provides a dedicated IPv4 address for customers connected to the point-to-point FTTH access.
  3. I use the term NAT instead of the more correct term NAPT. Feel free to do a mental substitution. If you are curious, check RFC 2663. For a survey of the IPv6 transition technologies enumerated here, have a look at RFC 9313.
  4. For AS 12322, Google, Netflix, and Meta are delivering 85% of their traffic over IPv6. Also, more than half of our traffic is delivered over IPv6.
  5. An additional reason is for fighting abuse: blacklisting an IPv4 address may impact unrelated users who share the same IPv4 as the culprits.
  6. IPv6 may not be the sole reason the latency is lower: users with IPv6 generally have a better connection.

Sahil Dhiman: How I Write Blogs - June 2024 Edition

I wrote about my blog writing methodology back April 2021. My writing method has undergone a significant shift now, so thought about writing this update. New blog topics are added to my note-taking app quite frequently now. Occasionally going through the list, I merge topics, change order to prioritize certain topics or purely drop ideas which seems not worth a write-up. Due to this, I have the liberty to work on blogs according to mood. Writing the last one was tiring, so I chose to work on an easy one, i.e. this blog now. Topic decided, everything starts on etherpad now. Etherpad has this nice font and sync feature, which helps me write from any device of choice. Actual writing usually happens in the morning, right after I wake up. For most topics, I quickly jot down pointers and keep on expanding them over the course of multiple days at a leisurely pace. Though, sometime it adds too many pieces in the puzzle and takes additional time to put everything in flow. New pointer addition keeps on happening along with writing. Nowadays, pictures too dot my blog, which I rarely use to do earlier. I have come to believe on less usage of external links. These breaks the flow of readers. If someone is sufficiently motivated to learn more about something, finding useful sources isn t. As the first draft comes into being, I run it through LanguageTool for spelling corrections (which typically are many) and fixing grammatical issues. Post that, for the first time I read the complete write-up in one go for formation of coherent storyline, moving paragraphs around for form a structure , adding explainers wherever something new or unexplained is introduced, removing elaborate sentences, making amends wherever required and moving paragraphs around for forming structure. Another round of LanguageTool follows. All set now, I try to space out my final read before publishing, which helps find additional mistakes or loopholes. When everything is set, I do hugo to generate the site and rsync everything to the web server. A final git sync closes the publication part. After a day or two, I come back to read the blog on the website. This entails another round finding and fixing trivial mistakes. After this, it s set for good. Nowadays, in addition to being on my blog, everything is syndicated on Planet FSCI and Planet Debian, which has given it more visibility. As someone who s into infrastructure and Internet as a lot, I do pay attention to logs on my server, but as a disconnected exercise to if the blog is being read or not. More hits on the blog doesn t translate to any gratification for me, at least for writers point of view. Occasionally, people do mention my blog, which does flatter me. Four years and nearly a hundred posts later, I still wonder how I kept on writing for this long.

19 June 2024

Sahil Dhiman: First Iteration of My Free Software Mirror

As I m gearing towards setting up a Free Software download mirror in India, it occurred to me that I haven t chronicled the work and motivation behind setting up the original mirror in the first place. Also, seems like it would be good to document stuff here for observing the progression, as the mirror is going multi-country now. Right now, my existing mirror i.e., mirrors.de.sahilister.net (was mirrors.sahilister.in), is hosted in Germany and serves traffic for Termux, NomadBSD, Blender, BlendOS and GIMP. For a while in between, it hosted OSMC project mirror as well. To explain what is a Free Software download mirror thing is first, I ll quote myself from work blog -
As most Free Software doesn t have commercial backing and require heavy downloads, the concept of software download mirrors helps take the traffic load off of the primary server, leading to geographical redundancy, higher availability and faster download in general.
So whenever someone wants to download a particular (mirrored) software and click download, upstream redirects the download to one of the mirror server which is geographical (or in other parameters) nearby to the user, leading to faster downloads and load sharing amongst all mirrors. Since the time I got into Linux and servers, I always wanted to help the community somehow, and mirroring seemed to be the most obvious thing. India seems to be a country which has traditionally seen less number of public download mirrors. IITB, TiFR, and some of the public institutions used to host them for popular Linux and Free Softwares, but they seem to be diminishing these days. In the last months of 2021, I started using Termux and saw that it had only a few mirrors (back then). I tried getting a high capacity, high bandwidth node in budget but it was hard in India in 2021-22. So after much deliberation, I decided to go where it s available and chose a German hosting provider with the thought of adding India node when conditions are favorable (thankfully that happened, and India node is live too now.). Termux required only 29 GB of storage, so went ahead and started mirroring it. I raised this issue in Termux s GitHub repository in January 2022. This blog post chronicles the start of the mirror. Termux has high request counts from a mirror point of view. Each Termux client, usually checks every mirror in selected group for availability before randomly selecting one for download (only other case is when client has explicitly selected a single mirror using termux-repo-change). The mirror started getting thousands of requests daily due to this but only a small percentage would actually get my mirror in selection, so download traffic was lower. Similar thing happened with OSMC too (which I started mirroring later). With this start, I started exploring various project that would be benefit from additional mirrors. Public information from Academic Computer Club in Ume s mirror and Freedif s mirror stats helped to figure out storage and bandwidth requirements for potential projects. Fun fact, Academic Computer Club in Ume (which is one of the prominent Debian, Ubuntu etc.) mirror, now has 200 Gbits/s uplink to the internet through SUNET. Later, I migrated to a different provider for better speeds and added LibreSpeed test on the mirror server. Those were fun times. Between OSMC, Termux and LibreSpeed, I was getting almost 1.2 millions hits/day on the server at its peak, crossing for the first time a TB/day traffic number. Next came Blender, which took the longest time to set up of around 9 10 months. Blender had a push-trigger requirement for rsync from upstream that took quite some back and forth. It now contributes the most amount of traffic on the mirror. On release days, mirror does more than 3 TB/day and normal days, it hovers around 2 TB/day. Gimp project is the latest addition. At one time, the mirror traffic touched 4.97 TB/day traffic number. That s when I decided on dropping LibreSpeed server to solely focus on mirroring for now, keeping the bandwidth allotment for serving downloads only. The mirror projects selection grew organically. I used to reach out many projects discussing the need of for additional mirrors. Some projects outright denied mirroring request as Germany already has a good academic mirrors boosting 20-25 Gbits/s speeds from FTP era, which seems fair. Finding the niche was essential to only add softwares, which would truly benefit from additional capacity. There were months when nothing much would happen with the mirror, rsync would continue to update the mirror while nginx would keep on serving the traffic. Nowadays, the mirror pushes around 70 TB/month. I occasionally check logs, vnstat, add new security stuff here and there and pay the bills. It now saturates the Gigabit link sometimes and goes beyond that, peaking around 1.42 Gbits/s (the hosting provider seems to be upping their game). The plan is to upgrade the link to better speeds. vnstat yearly
Yearly traffic stats (through vnstat -y )
On the way, learned quite a few things like - GeoIP Map of Clients from Yesterday Access Logs
GeoIP Map of Clients from Yesterday's Access Logs. Click to enlarge
Generated from IPinfo.io
In hindsight, the statistics look amazing, hundreds of TBs of traffic served from the mirror, month after month. That does show that there s still an appetite for public mirrors in time of commercially donated CDNs and GitHub. The world could have done with one less mirror, but it saved some time, lessened the burden for others, while providing redundancy and traffic localization with one additional mirror. And it s fun for someone like me who s into infrastructure that powers the Internet. Now, I ll try focusing and expanding the India mirror, which in itself started pushing almost half a TB/day. Long live Free Software and public download mirrors.

8 June 2024

Thorsten Alteholz: My Debian Activities in May 2024

FTP master This month I accepted 347 and rejected 49 packages. The overall number of packages that got accepted was 348.

Debian LTS This was my hundred-nineteenth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded or worked on: I also continued to work on tiff and last but not least did a week of FD and attended the monthly LTS/ELTS meeting. Unfortunately I used lots of time to debug an issue with nghttp2. Please see my odyssey below. Debian ELTS This month was the seventieth ELTS month. During my allocated time I uploaded: For some tests I installed the new nghttp2 package on my Stretch VM and started the daemon. Unfortunately I got an unexpected error from getaddrinfo() about ai_socktype not supported. The daemon was configured to listen on lo, the device was available, but the error remained. I was pretty sure that my patch was not the reason for this and indeed the unpatched version showed this error as well. I didn t want to release an untested package, so nghttp2 had to start at least! Therefore I built a minimal example to reproduce the issue. getaddrinfo() failed for hints.ai_socktype=SOCK_STREAM and a numerical IP address. Having no hints at all or localhost instead of 127.0.0.1 made the error disappear (as a remark: localhost resolves to 127.0.0.1, the ipv6 variant is ip6-localhost ). I could see that in nghttp2 as well. Configuring it with localhost let the error vanish but the daemon still exited due to other reasons. After some time of debugging, I added another network interface to my VM and configured it with a dummy IPv4 address. Voila, everything worked as expected. According to Wikipedia, IPv6 was ratified as standard in 2017 and Stretch was also released in 2017. No wonder that a IPv6-only-VM had problems back then and these problems survived to the present. I also continued to work on an update for tiff in Jessie and Stretch, did a week of FD and attended the LTS/ELTS meeting. Debian Printing This month I uploaded new upstream or bugfix versions of: This work is generously funded by Freexian! Debian Astro This month I uploaded a new upstream or bugfix version of: Debian IoT This month I uploaded new upstream or bugfix versions of: Debian Mobcom Due to more and more problems with time_t, I removed osmo-iuh and all dependencies from armel, armhf and i386, sorry. If there is really anybody using this software on 32-bit architectures don t hesitate to get in touch. It is official now, the GSoC student working on the Mobcom packages is Nathan Doris. He already finished the hardest part of the job and I could upload the latest version of libosmocore. I really enjoy working with him and look forward to a pleasant SoC :-). misc This month I uploaded new upstream or bugfix versions of: Did I already mention that I love lists with topics I can work on. I print out such lists and enjoy checking off one after the other. End of May Helmut told me that I am a bit lazy and gave me such a list with all my packages that have one or the other issue with /usr-move. Most of the uploads above are packages on that list and I could check off a lot :-).

Reproducible Builds: Reproducible Builds in May 2024

Welcome to the May 2024 report from the Reproducible Builds project! In these reports, we try to outline what we have been up to over the past month and highlight news items in software supply-chain security more broadly. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. A peek into build provenance for Homebrew
  2. Distribution news
  3. Mailing list news
  4. Miscellaneous news
  5. Two new academic papers
  6. diffoscope
  7. Website updates
  8. Upstream patches
  9. Reproducibility testing framework


A peek into build provenance for Homebrew Joe Sweeney and William Woodruff on the Trail of Bits blog wrote an extensive post about build provenance for Homebrew, the third-party package manager for MacOS. Their post details how each bottle (i.e. each release):
[ ] built by Homebrew will come with a cryptographically verifiable statement binding the bottle s content to the specific workflow and other build-time metadata that produced it. [ ] In effect, this injects greater transparency into the Homebrew build process, and diminishes the threat posed by a compromised or malicious insider by making it impossible to trick ordinary users into installing non-CI-built bottles.
The post also briefly touches on future work, including work on source provenance:
Homebrew s formulae already hash-pin their source artifacts, but we can go a step further and additionally assert that source artifacts are produced by the repository (or other signing identity) that s latent in their URL or otherwise embedded into the formula specification.

Distribution news In Debian this month, Johannes Schauer Marin Rodrigues (aka josch) noticed that the Debian binary package bash version 5.2.15-2+b3 was uploaded to the archive twice. Once to bookworm and once to sid but with differing content. This is problem for reproducible builds in Debian due its assumption that the package name, version and architecture triplet is unique. However, josch highlighted that
This example with bash is especially problematic since bash is Essential:yes, so there will now be a large portion of .buildinfo files where it is not possible to figure out with which of the two differing bash packages the sources were compiled.
In response to this, Holger Levsen performed an analysis of all .buildinfo files and found that this needs almost 1,500 binNMUs to fix the fallout from this bug. Elsewhere in Debian, Vagrant Cascadian posted about a Non-Maintainer Upload (NMU) sprint to take place during early June, and it was announced that there is now a #debian-snapshot IRC channel on OFTC to discuss the creation of a new source code archiving service to, perhaps, replace snapshot.debian.org. Lastly, 11 reviews of Debian packages were added, 15 were updated and 48 were removed this month adding to our extensive knowledge about identified issues. A number of issue types have been updated by Chris Lamb as well. [ ][ ]
Elsewhere in the world of distributions, deep within a larger announcement from Colin Percival about the release of version 14.1-BETA2, it was mentioned that the FreeBSD kernels are now built reproducibly.
In Fedora, however, the change proposal mentioned in our report for April 2024 was approved, so, per the ReproduciblePackageBuilds wiki page, the add-determinism tool is now running in new builds for Fedora 41 ( rawhide ). The add-determinism tool is a Rust program which, as its name suggests, adds determinism to files that are given as input by attempting to standardize metadata contained in binary or source files to ensure consistency and clamping to $SOURCE_DATE_EPOCH in all instances . This is essentially the Fedora version of Debian s strip-nondeterminism. However, strip-nondeterminism is written in Perl, and Fedora did not want to pull Perl in the buildroot for every package. The add-determinism tool eliminates many causes of non-determinism and work is ongoing to continue the scope of packages it can operate on.

Mailing list news On our mailing list this month, regular contributor kpcyrd wrote to the list with an update on their source code indexing project, whatsrc.org. The whatsrc.org project, which was launched last month in response to the XZ Utils backdoor, now contains and indexes almost 250,000 unique source code archives. In their post, kpcyrd gives an example of its intended purpose, noting that it shown that whilst there seems to be consensus about [the] source code for zsh 5.9 in various Linux distributions, it does not align with the contents of the zsh Git repository . Holger Levsen also posted to the list with a pre-announcement of sorts for the 2024 Reproducible Builds summit. In particular:
[Whilst] the dates and location are not fixed yet, however if you don help us with finding a suitable location soon, it is very likely that we ll meet again in Hamburg in the 2nd half of September 2024 [ ].
Lastly, Frederic-Emmanuel Picca wrote to the list asking for help understanding the non-reproducible status of the Debian silx package and received replies from both Vagrant Cascadian and Chris Lamb.

Miscellaneous news strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month strip-nondeterminism version 1.14.0-1 was uploaded to Debian unstable by Chris Lamb chiefly to incorporate a change from Alex Muntada to avoid a dependency on Sub::Override to perform monkey-patching and break circular dependencies related to debhelper [ ]. Elsewhere in our tooling, Jelle van der Waa modified reprotest because the pipes module will be removed in Python version 3.13 [ ].
It was also noticed that a new blog post by Daniel Stenberg detailing How to verify a Curl release mentions the SOURCE_DATE_EPOCH environment variable. This is because:
The [curl] release tools document also contains another key component: the exact time stamp at which the release was done using integer second resolution. In order to generate a correct tarball clone, you need to also generate the new version using the old version s timestamp. Because the modification date of all files in the produced tarball will be set to this timestamp.

Furthermore, Fay Stegerman filed a bug against the Signal messenger app for Android to report that their reproducible builds cannot, in fact, be reproduced. However, Fay is quick to note that she has:
found zero evidence of any kind of compromise. Some differences are yet unexplained but everything I found seems to be benign. I am disappointed that Reproducible Builds have been broken for months but I have zero reason to doubt Signal s security in any way.

Lastly, it was observed that there was a concise and diagrammatic overview of supply chain threats on the SLSA website.

Two new academic papers Two new scholarly papers were published this month. Firstly, Mathieu Acher, Beno t Combemale, Georges Aaron Randrianaina and Jean-Marc J z quel of University of Rennes on Embracing Deep Variability For Reproducibility & Replicability. The authors describe their approach as follows:
In this short [vision] paper we delve into the application of software engineering techniques, specifically variability management, to systematically identify and explicit points of variability that may give rise to reproducibility issues (e.g., language, libraries, compiler, virtual machine, OS, environment variables, etc.). The primary objectives are: i) gaining insights into the variability layers and their possible interactions, ii) capturing and documenting configurations for the sake of reproducibility, and iii) exploring diverse configurations to replicate, and hence validate and ensure the robustness of results. By adopting these methodologies, we aim to address the complexities associated with reproducibility and replicability in modern software systems and environments, facilitating a more comprehensive and nuanced perspective on these critical aspects.
(A PDF of this article is available.)
Secondly, Ludovic Court s, Timothy Sample, Simon Tournier and Stefano Zacchiroli have collaborated to publish a paper on Source Code Archiving to the Rescue of Reproducible Deployment. Their paper was motivated because:
The ability to verify research results and to experiment with methodologies are core tenets of science. As research results are increasingly the outcome of computational processes, software plays a central role. GNU Guix is a software deployment tool that supports reproducible software deployment, making it a foundation for computational research workflows. To achieve reproducibility, we must first ensure the source code of software packages Guix deploys remains available.
(A PDF of this article is also available.)

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 266, 267, 268 and 269 to Debian, making the following changes:
  • New features:
    • Use xz --list to supplement output when comparing .xz archives; essential when metadata differs. (#1069329)
    • Include xz --verbose --verbose (ie. double) output. (#1069329)
    • Strip the first line from the xz --list output. [ ]
    • Only include xz --list --verbose output if the xz has no other differences. [ ]
    • Actually append the xz --list after the container differences, as it simplifies a lot. [ ]
  • Testing improvements:
    • Allow Debian testing to fail right now. [ ]
    • Drop apktool from Build-Depends; we can still test APK functionality via autopkgtests. (#1071410)
    • Add a versioned dependency for at least version 5.4.5 for the xz tests as they fail under (at least) version 5.2.8. (#374)
    • Fix tests for 7zip 24.05. [ ][ ]
    • Fix all tests after additon of xz --list. [ ][ ]
  • Misc:
    • Update copyright years. [ ]
In addition, James Addison fixed an issue where the HTML output showed only the first difference in a file, while the text output shows all differences [ ][ ][ ], Sergei Trofimovich amended the 7zip version test for older 7z versions that include the string [64] [ ][ ] and Vagrant Cascadian relaxed the versioned dependency to allow version 5.4.1 for the xz tests [ ] and proposed updates to guix for versions 267, 268 and pushed version 269 to Guix. Furthermore, Eli Schwartz updated the diffoscope.org website in order to explain how to install diffoscope on Gentoo [ ].

Website updates There were a number of improvements made to our website this month, including Chris Lamb making the print CSS stylesheet nicer [ ]. Fay Stegerman made a number of updates to the page about the SOURCE_DATE_EPOCH environment variable [ ][ ][ ] and Holger Levsen added some of their presentations to the Resources page. Furthermore, IOhannes zm lnig stipulated support for SOURCE_DATE_EPOCH in clang version 16.0.0+ [ ], Jan Zerebecki expanded the Formal definition page and fixed a number of typos on the Buy-in page [ ] and Simon Josefsson fixed the link to Trisquel GNU/Linux on the Projects page [ ].

Upstream patches This month, we wrote a number of patches to fix specific reproducibility issues, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In May, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Enable the rebuilder-snapshot API on osuosl4. [ ]
    • Schedule the i386 architecture a bit more often. [ ]
    • Adapt cleanup_nodes.sh to the new way of running our build services. [ ]
    • Add 8 more workers for the i386 architecture. [ ]
    • Update configuration now that the infom07 and infom08 nodes have been reinstalled as real i386 systems. [ ]
    • Make diffoscope timeouts more visible on the #debian-reproducible-changes IRC channel. [ ]
    • Mark the cbxi4a-armhf node as down. [ ][ ]
    • Only install the hdmi2usb-mode-switch package only on Debian bookworm and earlier [ ] and only install the haskell-platform package on Debian bullseye [ ].
  • Misc:
    • Install the ntpdate utility as we need it later. [ ]
    • Document the progress on the i386 architecture nodes at Infomaniak. [ ]
    • Drop an outdated and unnoticed notice. [ ]
    • Add live_setup_schroot to the list of so-called zombie jobs. [ ]
In addition, Mattia Rizzolo reinstalled the infom07 and infom08 nodes [ ] and Vagrant Cascadian marked the cbxi4a node as online [ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 June 2024

Alberto Garc a: More ways to install software in SteamOS: Distrobox and Nix

Introduction In my previous post I talked about how to use systemd-sysext to add software to the Steam Deck without modifying the root filesystem. In this post I will give a brief overview of two additional methods. Distrobox distrobox is a tool that uses containers to create a mutable environment on top of your OS. Distrobox running in SteamOS With distrobox you can open a terminal with your favorite Linux distro inside, with full access to the package manager and the ability to install additional software. Containers created by distrobox are integrated with the system so apps running inside have normal access to the user s home directory and the Wayland/X11 session. Since these containers are not stored in the root filesystem they can survive an OS update and continue to work fine. For this reason they are particularly suited to systems with an immutable root filesystem such as Silverblue, Endless OS or SteamOS. Starting from SteamOS 3.5 the system comes with distrobox (and podman) preinstalled and it can be used right out of the box without having to do any previous setup. For example, in order to create a Debian bookworm container simply open a terminal and run this:
$ distrobox create -i debian:bookworm debbox
Here debian:bookworm is the image that this container is created from (debian is the name and bookworm is the tag, see the list of supported tags here) and debbox is the name that is given to this new container. Once the container is created you can enter it:
$ distrobox enter debbox
Or from the Debian entry in the desktop menu -> Lost & Found. Once inside the container you can run your Debian commands normally:
$ sudo apt update
$ sudo apt install vim-gtk3
Nix Nix is a package manager for Linux and other Unix-like systems. It has the property that it can be installed alongside the official package manager of any distribution, allowing the user to add software without affecting the rest of the system. Nix running in SteamOS Nix installs everything under the /nix directory, and packages are made available to the user through a new entry in the PATH and a ~/.nix-profile symlink stored in the home directory. Nix is more things, including the basis of the NixOS operating system. Explaning Nix in more detail is beyond the scope of this blog post, but for SteamOS users these are perhaps its most interesting properties: The only thing that Nix needs from SteamOS is help to set up the /nix directory so its contents are not stored in the root filesystem. This is already happening starting from SteamOS 3.5 so you can install Nix right away in single-user mode:
$ sudo chown deck:deck /nix
$ wget https://nixos.org/nix/install
$ sh ./install --no-daemon
This installs Nix and adds a line to ~/.bash_profile to set up the necessary environment variables. After that you can log in again and start using it. Here s a very simple example (refer to the official documentation for more details):
# Install and run Midnight Commander
$ nix-env -iA nixpkgs.mc
$ mc
# List installed packages
$ nix-env -q
mc-4.8.31
nix-2.21.1
# Uninstall Midnight Commander
$ nix-env -e mc-4.8.31
What we have seen so far is how to install Nix in single-user mode, which is the simplest one and probably good enough for a single-user machine like the Steam Deck. The Nix project however recommends a multi-user installation, see here for the reasons. Unfortunately the official multi-user installer does not work out of the box on the Steam Deck yet, but if you want to go the multi-user way you can use the Determinate Systems installer: https://github.com/DeterminateSystems/nix-installer Conclusion Distrobox and Nix are useful tools and they give SteamOS users the ability to add additional software to the system without having to modify the base operating system. While for graphical applications the recommended way to install third-party software is still Flatpak, Distrobox and Nix give the user additional flexibility and are particularly useful for installing command-line utilities and other system tools.

4 June 2024

Dirk Eddelbuettel: ulid 0.4.0 on CRAN: Extended to Milliseconds

A new version of the ulid package is now on CRAN. The packages provides universally (unique) lexicographically (sortable) identifiers see the spec at GitHub for details on those which offer sorting which uuids lack. The R package provides access via the standard C++ library, had been put together by Bob Rudis and is now maintained by me. Mark Heckmann noticed that a ulid round trip of generating and unmarshalling swallowed subsecond informationm and posted on a well-known site I no longer go to. Duncan Murdoch was kind enough to open an issue to make me aware, and in it included the nice minimally complete verifiable example by Mark. It turns out that this issue was known, documented upstream in two issues and fixed in fork by the authors of those issues, Chris Bove. It replaces time_t as the value of record (constrained at the second resolution) with a proper std::chrono object which offers milliseconds (and much more, yay Modern C++). So I switched the two main files of library to his, and updated the wrapper code to interface from POSIXct to std::chrono object. And with that we are in business. The original example of five ulids create 100 millisecond part, then unmarshalled and here printed as a data.table as data.frame by default truncates to seconds:
> library(ulid)
> gen_ulid <- \(sleep) replicate(5,  Sys.sleep(sleep); generate() )
> u <- gen_ulid(.1)
> df <- unmarshal(u)
> data.table::data.table(df)
                        ts              rnd
                    <POSc>           <char>
1: 2024-05-30 16:38:28.588 CSQAJBPNX75R0G5A
2: 2024-05-30 16:38:28.688 XZX0TREDHD6PC1YR
3: 2024-05-30 16:38:28.789 0YK9GKZVTED27QMK
4: 2024-05-30 16:38:28.890 SC3M3G6KGPH7S50S
5: 2024-05-30 16:38:28.990 TSKCBWJ3TEKCPBY0
>
We updated the documentation accordingly, and added some new tests as well. The NEWS entry for this release follows.

Changes in version 0.4.0 (2024-06-03)
  • Switch two functions to fork by Chris Bove using std::chrono instead of time_t for consistent millisecond resolution (#3 fixing #2)
  • Updated documentation showing consistent millisecond functionality
  • Added unit tests for millisecond functionality

Courtesy of my CRANberries, there is also a diffstat report for this release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

29 May 2024

Antoine Beaupr : Playing with fonts again

I am getting increasingly frustrated by Fira Mono's lack of italic support so I am looking at alternative fonts again.

Commit Mono This time I seem to be settling on either Commit Mono or Space Mono. For now I'm using Commit Mono because it's a little more compressed than Fira and does have a italic version. I don't like how Space Mono's parenthesis (()) is "squarish", it feels visually ambiguous with the square brackets ([]), a big no-no for my primary use case (code). So here I am using a new font, again. It required changing a bunch of configuration files in my home directory (which is in a private repository, sorry) and Emacs configuration (thankfully that's public!). One gotcha is I realized I didn't actually have a global font configuration in Emacs, as some Faces define their own font family, which overrides the frame defaults. This is what it looks like, before:
A dark terminal showing the test sheet in Fira Mono Fira Mono
After:
A dark terminal showing the test sheet in Fira Mono Commit Mono
(Notice how those screenshots are not sharp? I'm surprised too. The originals look sharp on my display, I suspect this is something to do with the Wayland transition. I've tried with both grim and flameshot, for what its worth.) They are pretty similar! Commit Mono feels a bit more vertically compressed maybe too much so, actually -- the line height feels too low. But it's heavily customizable so that's something that's relatively easy to fix, if it's really a problem. Its weight is also a little heavier and wider than Fira which I find a little distracting right now, but maybe I'll get used to it. All characters seem properly distinguishable, although, if I'd really want to nitpick I'd say the and are too different, with the latter (REGISTERED SIGN) being way too small, basically unreadable here. Since I see this sign approximately never, it probably doesn't matter at all. I like how the ampersand (&) is more traditional, although I'll miss the exotic one Fira produced... I like how the back quotes ( , GRAVE ACCENT) drop down low, nicely aligned with the apostrophe. As I mentioned before, I like how the bar on the "f" aligns with the other top of letters, something in Fira mono that really annoys me now that I've noticed it (it's not aligned!).

A UTF-8 test file Here's the test sheet I've made up to test various characters. I could have sworn I had a good one like this lying around somewhere but couldn't find it so here it is, I guess.
US keyboard coverage:
abcdefghijklmnopqrstuvwxyz 1234567890-=[]\;',./
ABCDEFGHIJKLMNOPQRSTUVWXYZ~!@#$%^&*()_+ :"<>?
latin1 coverage:  
EURO SIGN, TRADE MARK SIGN:  
ambiguity test:
e coC0ODQ iI71lL! 
b6G&0B83  []() /\. 
zs$S52Z%   '" 
all characters in a sentence, uppercase:
the quick fox jumps over the lazy dog
THE QUICK FOX JUMPS OVER THE LAZY DOG
same, in french:
Portez ce vieux whisky au juge blond qui fume.
d s no l, o  un z phyr ha  me v t de gla ons w rmiens, je d ne
d exquis r tis de b uf au kir,   l a  d ge m r, &c tera.
D S NO L, O  UN Z PHYR HA  ME V T DE GLA ONS W RMIENS, JE D NE
D EXQUIS R TIS DE B UF AU KIR,   L A  D GE M R, &C TERA.
Ligatures test:
-<< -< -<- <-- <--- <<- <- -> ->> --> ---> ->- >- >>-
=<< =< =<= <== <=== <<= <= => =>> ==> ===> =>= >= >>=
<-> <--> <---> <----> <=> <==> <===> <====> :: ::: __
<~~ </ </> /> ~~> == != /= ~= <> === !== !=== =/= =!=
<: := *= *+ <* <*> *> <  < >  > <. <.> .> +* =* =: :>
(* *) /* */ [   ]     ++ +++ \/ /\  - -  <!-- <!---
Box drawing alignment tests:
                                                                    
                                 
                           
                                                   
                                       
                                                 
                               
                                
Dashes alignment test:
HYPHEN-MINUS, MINUS SIGN, EN, EM DASH, HORIZONTAL BAR, LOW LINE
--------------------------------------------------
 
 
 
 
__________________________________________________
Update: here is another such sample sheet, it's pretty good and has support for more languages while being still relatively small. So there you have it, got completely nerd swiped by typography again. Now I can go back to writing a too-long proposal again. Sources and inspiration for the above:
  • the unicode(1) command, to lookup individual characters to disambiguate, for example, - (U+002D HYPHEN-MINUS, the minus sign next to zero on US keyboards) and (U+2212 MINUS SIGN, a math symbol)
  • searchable list of characters and their names - roughly equivalent to the unicode(1) command, but in one page, amazingly the /usr/share/unicode database doesn't have any one file like this
  • bits/UTF-8-Unicode-Test-Documents - full list of UTF-8 characters
  • UTF-8 encoded plain text file - nice examples of edge cases, curly quotes example and box drawing alignment test which, incidentally, showed me I needed specific faces customisation in Emacs to get the Markdown code areas to display properly, also the idea of comparing various dashes
  • sample sentences in many languages - unused, "Sentences that contain all letters commonly used in a language"
  • UTF-8 sampler - unused, similar

Other fonts In my previous blog post about fonts, I had a list of alternative fonts, but it seems people are not digging through this, so I figured I would redo the list here to preempt "but have you tried Jetbrains mono" kind of comments. My requirements are:
  • no ligatures: yes, in the previous post, I wanted ligatures but I have changed my mind. after testing this, I find them distracting, confusing, and they often break the monospace nature of the display
  • monospace: this is to display code
  • italics: often used when writing Markdown, where I do make use of italics... Emacs falls back to underlining text when lacking italics which is hard to read
  • free-ish, ultimately should be packaged in Debian
Here is the list of alternatives I have considered in the past and why I'm not using them:
  • agave: recommended by tarzeau, not sure I like the lowercase a, a bit too exotic, packaged as fonts-agave
  • Cascadia code: optional ligatures, multilingual, not liking the alignment, ambiguous parenthesis (look too much like square brackets), new default for Windows Terminal and Visual Studio, packaged as fonts-cascadia-code
  • Fira Code: ligatures, was using Fira Mono from which it is derived, lacking italics except for forks, interestingly, Fira Code succeeds the alignment test but Fira Mono fails to show the X signs properly! packaged as fonts-firacode
  • Hack: no ligatures, very similar to Fira, italics, good alternative, fails the X test in box alignment, packaged as fonts-hack
  • Hermit: no ligatures, smaller, alignment issues in box drawing and dashes, packaged as fonts-hermit somehow part of cool-retro-term
  • IBM Plex: irritating website, replaces Helvetica as the IBM corporate font, no ligatures by default, italics, proportional alternatives, serifs and sans, multiple languages, partial failure in box alignment test (X signs), fancy curly braces contrast perhaps too much with the rest of the font, packaged in Debian as fonts-ibm-plex
  • Inconsolata: no ligatures, maybe italics? more compressed than others, feels a little out of balance because of that, packaged in Debian as fonts-inconsolata
  • Intel One Mono: nice legibility, no ligatures, alignment issues in box drawing, not packaged in Debian
  • Iosevka: optional ligatures, italics, multilingual, good legibility, has a proportional option, serifs and sans, line height issue in box drawing, fails dash test, not in Debian
  • Jetbrains Mono: (mandatory?) ligatures, good coverage, originally rumored to be not DFSG-free (Debian Free Software Guidelines) but ultimately packaged in Debian as fonts-jetbrains-mono
  • Monoid: optional ligatures, feels much "thinner" than Jetbrains, not liking alignment or spacing on that one, ambiguous 2Z, problems rendering box drawing, packaged as fonts-monoid
  • Mononoki: no ligatures, looks good, good alternative, suggested by the Debian fonts team as part of fonts-recommended, problems rendering box drawing, em dash bigger than en dash, packaged as fonts-mononoki
  • Source Code Pro: italics, looks good, but dash metrics look whacky, not in Debian
  • spleen: bitmap font, old school, spacing issue in box drawing test, packaged as fonts-spleen
  • sudo: personal project, no ligatures, zero originally not dotted, relied on metrics for legibility, spacing issue in box drawing, not in Debian
So, if I get tired of Commit Mono, I might probably try, in order:
  1. Hack
  2. Jetbrains Mono
  3. IBM Plex Mono
Iosevka, Monoki and Intel One Mono are also good options, but have alignment problems. Iosevka is particularly disappointing as the EM DASH metrics are just completely wrong (much too wide). This was tested using the Programming fonts site which has all the above fonts, which cannot be said of Font Squirrel or Google Fonts, amazingly. Other such tools:

25 May 2024

Gunnar Wolf: How computers make books from graphics rendering, search algorithms, and functional programming to indexing and typesetting

This post is a review for Computing Reviews for How computers make books from graphics rendering, search algorithms, and functional programming to indexing and typesetting , a book published in Manning
If we look at the age-old process of creating books, how many different areas can a computer help us with? And how can each of them be used to teach computer science (CS) fundamentals to a nontechnical audience? This is the premise of John Whitington s enticing book and the result is quite amazing. The book immediately drew my attention when looking at the titles available for review. After all, my initiation into computing as a kid was learning the LaTeX typesetting system while my father worked on his first book on scientific language and typography [1]. Whitington picks 11 different technical aspects of book production, from how dots of ink are transferred to a white page and how they are made into controllable, recognizable shapes, all the way to forming beautiful typefaces and the nuances of properly addressing white-space to present aesthetically pleasing paragraphs, building it all into specific formats aimed at different ends. But if we dig beyond just the chapter titles, we will find a very interesting book on CS that, without ever using technical language or notation, presents aspects as varied as anti-aliasing, vector and raster images, character sets such as ASCII and Unicode, an introduction to programming, input methods for different writing systems, efficient encoding (compression) methods, both for text and images, lossless and lossy, and recursion and dithering methods. To my absolute surprise, while the author thankfully spared the reader the syntax usually associated with LISP-related languages, the programming examples clearly stem from the LISP school, presenting solutions based on tail recursion. Of course, it is no match for Donald Knuth s classic book on this same topic [2], but could very well be a primer for readers to approach it. The book is light and easy to read, and keeps a very informal, nontechnical tone throughout. My only complaint relates to reading it in PDF format; the topic of this book, and the care with which the images were provided by the author, warrant high resolution. The included images are not only decorative but an integral part of the book. Maybe this is specific to my review copy, but all of the raster images were in very low resolution. This book is quite different from what readers may usually expect, as it introduces several significant topics in the field. CS professors will enjoy it, of course, but also readers with a humanities background, students new to the field, or even those who are just interested in learning a bit more.

References
  1. S nchez y G ndara, A.; Magari os Lamas, F.; Wolf, K. B., Manual de lenguaje y tipograf a cient fica en castellano. Trillas, Mexico City, Mexico, 1986, https://www.fis.unam.mx/~bwolf/manual.html
  2. Knuth, D. E. Digital typographyCSLI Lecture Notes: CSLI Lecture Notes. CSLI Publications, Stanford, CA, 1999, https://www-cs-faculty.stanford.edu/~knuth/dt.html

22 May 2024

Evgeni Golov: Upgrading CentOS Stream 8 to CentOS Stream 9 using Leapp

Warning to the Planet Debian readers: the following post might shock you, if you're used to Debian's smooth upgrades using only the package manager. Leapp?! Contrary to distributions like Debian and Fedora, RHEL can't be upgraded using the package manager alone. Instead there is a tool called Leapp that takes care of orchestrating the update and also includes a set of checks whether a system can be upgraded at all. Have a look at the RHEL documentation about upgrading if you want more details on the process itself. You might have noticed that the title of this post says "CentOS Stream" but here I am talking about RHEL. This is mostly because Leapp was originally written with RHEL in mind. Upgrading CentOS 7 to EL8 When people started pondering upgrading their CentOS 7 installations, AlmaLinux started the ELevate project to allow upgrading CentOS 7 to CentOS Stream 8 but also to AlmaLinux 8, Rocky 8 or Oracle Linux 8. ELevate was essentially Leapp with patches to allow working on CentOS, which has different package signature keys, different OS release versioning, etc. Sadly these patches were never merged back into Leapp. Making Leapp work with CentOS Stream 8 (and other distributions) At some point I noticed that things weren't moving and EL8 to EL9 upgrades were coming closer (and I had my own systems that I wanted to be able to upgrade in place). Annoyed-Evgeni-Development is best development? Not sure, but it produced a set of patches that allowed some movement: However, this is not yet the end of the story. At least convert dot-less CentOS versions to X.999 is open, and another followup would be needed if we go that route. But I don't expect this to be merged soon, as the patch is technically wrong - yet it makes things mostly work. The big problem here is that CentOS Stream doesn't have X.Y versioning, just X as it's a constant stream with no point releases. Leapp however relies on X.Y versioning to know which package changes it needs to perform. Pretending CentOS Stream 8 is "RHEL" 8.999 works if you assume that Stream is always ahead of RHEL. This is however a CentOS only problem. I still need to properly test that, but I'd expect things to work fine with upstream Leapp on AlmaLinux/Rocky if you feed it the right signature and repository data. Actually upgrading CentOS Stream 8 to CentOS Stream 9 using Leapp Like I've already teased in my HPE rant, I've actually used that code to upgrade virt01.conova.theforeman.org to CentOS Stream 9. I've also used it to upgrade a server at home that's responsible for running important containers like Home Assistant and UniFi. So it's absolutely battle tested and production grade! It's also hungry for kittens. As mentioned above, you can't just use upstream Leapp, but I have a Copr: evgeni/leapp.
# dnf copr enable evgeni/leapp
# dnf install leapp leapp-upgrade-el8toel9
Apart from the software, we'll also need to tell it which repositories to use for the upgrade.
# vim /etc/leapp/files/leapp_upgrade_repositories.repo
[c9-baseos]
name=CentOS Stream $releasever - BaseOS
metalink=https://mirrors.centos.org/metalink?repo=centos-baseos-9-stream&arch=$basearch&protocol=https,http
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
gpgcheck=1
repo_gpgcheck=0
metadata_expire=6h
countme=1
enabled=1
[c9-appstream]
name=CentOS Stream $releasever - AppStream
metalink=https://mirrors.centos.org/metalink?repo=centos-appstream-9-stream&arch=$basearch&protocol=https,http
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
gpgcheck=1
repo_gpgcheck=0
metadata_expire=6h
countme=1
enabled=1
Depending on the setup and installed packages, more repositories might be needed. Just make sure that the $stream substitution is not used as Leapp doesn't override that and you'd end up with CentOS Stream 8 repos again. Once all that is in place, we can call leapp preupgrade and let it analyze the system. Ideally, the output will look like this:
# leapp preupgrade
 
============================================================
                      REPORT OVERVIEW                       
============================================================
Reports summary:
    Errors:                      0
    Inhibitors:                  0
    HIGH severity reports:       0
    MEDIUM severity reports:     0
    LOW severity reports:        3
    INFO severity reports:       3
Before continuing consult the full report:
    A report has been generated at /var/log/leapp/leapp-report.json
    A report has been generated at /var/log/leapp/leapp-report.txt
============================================================
                   END OF REPORT OVERVIEW                   
============================================================
But trust me, it won't ;-) As mentioned above, Leapp analyzes the system before the upgrade. Some checks can completely inhibit the upgrade, while others will just be logged as "you better should have a look". Firewalld Configuration AllowZoneDrifting Is Unsupported EL7 and EL8 shipped with AllowZoneDrifting=yes, but since EL9 this is not supported anymore. As this can potentially break the networking of the system, the upgrade gets inhibited. Newest installed kernel not in use Admit it, you also don't reboot into every new kernel available! Well, Leapp won't let that pass and inhibits the upgrade. Cannot perform the VDO check of block devices In EL8 there are two ways to manage VDO: using the dedicated vdo tool and via LVM. If your system uses LVM (it should!) but not VDO, you probably don't have the vdo package installed. But then Leapp can't check if your LVM devices really aren't VDO without the vdo tooling and will inhibit the upgrade. So you gotta install vdo for it to find out that you don't use VDO LUKS encrypted partition detected Yeah. Sorry. Using LUKS? Straight into the inhibit corner! But hey, if you don't use LUKS for / you can probably get away by deleting the inhibitwhenluks actor. That worked for me, but remember the kittens! Really upgrading CentOS Stream 8 to CentOS Stream 9 using Leapp The headings are getting silly, huh? Anyway, once leapp preupgrade is happy and doesn't throw any inhibitors anymore, the actual (real?) upgrade can be done by calling leapp upgrade. This will download all necessary packages and create an intermediate initramfs that contains all the things needed for the upgrade and ask you to reboot. Once booted, the upgrade itself takes somewhere between 5 and 10 minutes. Then another minute or 5 to relabel your disks with the new SELinux policy. And three reboots (into the upgrade initramfs, into SELinux relabel, into real OS) of a ProLiant DL325 - 5 minutes each? And then for good measure another one, to flip SELinux from permissive to enforcing. Are we done yet? Nope. There are a few post-upgrade tasks you get to do yourself. Yes, the switching of SELinux back to enforcing is one of them. Please don't forget it. Using the system after the upgrade A customer once said "We're not running those systems for the sake of running systems, but for the sake of running some application ontop of them". This is very true. libvirt doesn't support Spice/QXL In EL9, support for Spice/QXL was dropped, so if you try to boot a VM using it, libvirt will nicely error out with
Error starting domain: unsupported configuration: domain configuration does not support video model 'qxl'
Interestingly, because multiple parts of the VM are invalid, you can't edit it in virt-manager (at least the one in Fedora 39) as removing/fixing one part requires applying the new configuration which is still invalid. So virsh edit <vm> it is! Look for entries like
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <graphics type='spice' autoport='yes'>
      <listen type='address'/>
    </graphics>
    <audio id='1' type='spice'/>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <redirdev bus='usb' type='spicevmc'> 
      <address type='usb' bus='0' port='2'/> 
    </redirdev> 
    <redirdev bus='usb' type='spicevmc'> 
      <address type='usb' bus='0' port='3'/> 
    </redirdev>
and either just delete the or (better) replace them with VNC/cirrus
    <graphics type='vnc' port='-1' autoport='yes'>
      <listen type='address'/>
    </graphics>
    <audio id='1' type='none'/>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
Podman needs re-login to private registries One of the machines I've updated runs Podman and pulls containers from GitHub which are marked as private. To do so, I have a personal access token that I've used to login to ghcr.io. After the CentOS Stream 9 upgrade (which included an upgrade to Podman 5), pulls stopped working with authentication/permission errors. No idea what exactly happened, but a simple podman login fixed this issue quickly.
$ echo ghp_token   podman login ghcr.io -u <user> --password-stdin
shim has an el8 tag One of the documented post-upgrade tasks is to verify that no EL8 packages are installed, and to remove those if there are any. However, when you do this, you'll notice that the shim-x64 package has an EL8 version: shim-x64-15-15.el8_2.x86_64. That's because the same build is used in both CentOS Stream 8 and CentOS Stream 9. Confusing, but should really not be uninstalled if you want the machine to boot ;-) Are we done yet? Yes! That's it. Enjoy your CentOS Stream 9!

16 May 2024

John Goerzen: Review of Reputable, Functional, and Secure Email Service

I last reviewed email services in 2019. That review focused a lot of attention on privacy. At the time, I selected mailbox.org as my provider, and have been using them for these 5 years since. However, both their service and their support have gone significantly downhill since, so it is time for me to look at other options. Here I am focusing strongly on email. Some of the providers mentioned here provide other services (IM, video calls, groupware, etc.), and to the extent they do, I am ignoring them.

What Matters in 2024
I want to start off by acknowledging that what you need in email probably depends on your circumstances and the country in which you live. For me, I begin by naming that the largest threat most of us face isn t from state actors but from criminals: hackers, ransomware gangs, etc. It is important to take as many steps as possible to secure one s account against that. Privacy and security are both part of the mix. I still value privacy but I am acknowledging, as Migadu does, that Email as we know it and encryption are incompatible. Although some of these services strongly protect parts of the conversation, the reality is that most people will be emailing people using plain old email services which don t. For stronger security, something like Signal would be needed. (I wrote about Signal in 2021 also.) Interestingly, OpenPGP support seems to be something of a standard feature in the providers I reviewed by this point. All or almost all of them provide integration with browser-based encryption as well as server-side encryption if you prefer that. Although mailbox.org can automatically PGP-encrypt every message that arrives in plaintext, for general use, this is unwieldy; there isn t good tooling for searching mailboxes where every message is encrypted, etc. So I never enabled that feature at Mailbox. I still value security and privacy, but a pragmatic approach addresses the most pressing threats first.

My criteria
The basic requirements for an email service include:
  1. Ability to use my own domains
  2. Strong privacy policy
  3. Ability for me to use my own IMAP and SMTP clients on both desktop and mobile
  4. It must be extremely reliable
  5. It must not be free
  6. It must have excellent support for those rare occasions when it is needed
  7. Support for basic aliases
Why do I say it must not be free? Because if someone is providing a service with the quality I m talking about here, and not charging for it, it implies something is fishy: either they are unscrupulous, are financially unstable, or the product is something else like ads. I am not aware of any provider that matches the other criteria with a free account anyhow. These providers range from about $30 to $90 per year, so cheaper than a Netflix subscription. Immediately, this rules out several options:
  • Proton doesn t let me use my own clients on mobile (their bridge is desktop-only)
  • Tuta also doesn t let me use my own clients
  • Posteo doesn t let me use my own domain
  • mxroute.com lacks a strong privacy policy, and its policy has numerous causes for concern (for instance, If you repeatedly send email to invalid/unroutable recipients, they may be published on our GitHub )
I will have a bit more to say about a couple of these providers below. There are some additional criteria that are strongly desired but not absolutely required:
  1. Ability to set individual access passwords for every device/app
  2. Support for two-factor authentication (2FA/TFA/TOTP) for web-based access
  3. Support for basics in filtering: ability to filter on envelope recipient (so if I get BCC d, I can still filter), and ability to execute more than one action on filter match (eg, deliver to two folders, or deliver to a folder and forward to someone else)
IMAP and SMTP don t really support 2FA, so by setting individual passwords for every device, you can at least limit the blast radius and cut off a specific device if something is (or might be) compromised.

The candidates
I considered these providers: Startmail, Mailfence, Runbox, Fastmail, Kolab, Mailbox.org, and Migadu. I ll review each, and highlight the pricing of the plan I would most likely use. Each provider offers multiple plans; some may be more expensive and some may be cheaper than the one I reviewed. I included a link to each provider s full pricing information so you can compare for your needs. I set up trials with each of these (except Mailbox.org, with which I already had a paid account). It so happend that I had actual questions for support for each one, which gave me an opportunity to see how support responded. I did not fabricate questions, and would not have contacted support if I didn t have real ones. (This means that I asked different questions of each provider, because they were the REAL questions I had.) I ll jump to the spoiler right now: I eventually chose Migadu, with Fastmail and Mailfence as close seconds. I looked for providers myself, and also solicited recommendations in a Mastodon thread.

Mailbox.org
I begin with Mailbox, as it was my top choice in 2019 and the incumbent. Until this year, I had been quite happy with it. I had cause to reach their support less than once a year on average, and each time they replied the same day or next day. Now, however, they are failing on reliability and on support. Their spam filter has become overly aggressive. It has blocked quite a bit of legitimate mail. When contacting their support about a prior issue earlier this year, they initially took 4 days to reply, and then 6 days to reply after that. Ouch. They had me disable some spam settings. It didn t really help. I continue to lose mail. I don t know how much, because they block a lot of it before it even hits the spam folder. One of my friends texted to say mail was dropping. I raised a new ticket with mailbox, which took them 5 days to reply to. Their reply was unhelpful. As the Internet is not a static system, unforeseen events can always occur. Well yes, that s true, and I get it, false positives exist with email. But this was from an ISP s mail system with an address that had been established for years, and it was part of a larger pattern of rejecting quite a bit of legit mail. And every interaction with them recently hasn t resulted in them actually doing anything to resolve anything. It s just a paragraph or two of reply that does nothing and helps nothing. When I complained that it took 5 days to reply, they said We have not been able to reply sooner as we are currently experiencing a high volume of customer enquiries. Even though their SLA for my account is a not-great 48 business hour turnaround, they still missed it and their reason is we re busy. I finally asked what RBL had caught the blocked email, since when I checked, the sender wasn t on any RBL. Mailbox s reply: they only keep their logs for 7 days, so next time I should contact them within 7 days. Which, of course, I DID; it was them that kept delaying. Ugh! It s like they ve become a cable company. Even worse is how they have been blocking mail from GrapheneOS s discussion form. See their thread about it. In short, Graphene s mail server has a clean reputation and Mailbox has no problem with it. But because one of Graphene s IPv6 webservers has an IPv6 allocation of a size Mailbox doesn t like, they drop mail. It s ridiculous, and Mailbox was dismissive of this well-known and well-regarded Open Source project. So if the likes of GrapheneOS can t get good faith effort to deliver their mail, what chance does an individual like me have? I m sorry, but I m literally paying you to deliver email for me and provide good support. If you can t do either of those, you don t get to push that problem down onto me. Hire appropriate staff. On the technical side, they support aliases, my own clients, and have a reasonable privacy policy. Their 2FA support exists for the web interface (though weirdly not the support site), though it is somewhat weird. They do not support app passwords. A somewhat unique feature is the @secure.mailbox.org domain. If you try to receive mail at that address, mailbox.org will block it unless it uses TLS. Same for sending. This isn t E2EE, but it does at least require things not be in plaintext for the last hop to Mailbox. Verdict: not recommended due to poor reliability and support. Mailbox.Org summary:
  • Website: https://mailbox.org/en/
  • Reliability: iffy due to over-aggressive spam filtering
  • Support: Poor; takes 4-6 days for a reply and replies are unhelpful
  • Individual access passwords: No
  • 2FA: Yes, but with a PIN instead of a password as the other factor
  • Filtering: Full SIEVE feature set and GUI editor
  • Spam settings: greylisting on/off, reject some/all spam, etc. But they re insufficient to address Mailbox s overzealousness, which support says I cannot workaround within the interface.
  • Server storage location: Germany
  • Plan as reviewed: standard [pricing link]
    • Cost per year: EUR 30 (about $33)
    • Mail storage included: 10GB
    • Limits on send/receive volume: none
    • Aliases: 50 on your domain name, 25 on mailbox.org
    • Additional mailboxes: Available; each one at the same fee as the primary mailbox

Startmail
I really wanted to like Startmail. Its vault is an interesting idea and should contribute to the security and privacy of an account. They clearly care about privacy. It falls down in filtering. They have no way to filter on envelope recipient (BCC or similar). Their support confirmed this to me and that s a showstopper. Startmail support was also as slow as Mailbox, taking 5 days to respond to me. Two showstoppers right there. Verdict: Not recommended due to slow support responsiveness and weak filtering. Startmail summary:
  • Website: https://www.startmail.com/
  • Reliability: Seems to be fine
  • Support: Mediocre; Took 5 days for a reply, but the reply was helpful
  • Individual app access passwords: Yes
  • 2FA: Yes
  • Filtering: Poor; cannot filter on envelope recipient, and can t build filters with multiple actions
  • Spam settings: None
  • Server storage location: The Netherlands
  • Plan as reviewed: Custom domain (trial was Personal), [pricing link]
    • Cost per year: $70
    • Mail storage included: 20GB
    • Limits on send/receive volume: none
    • Aliases: unlimited, with lots of features: can set expiration, etc.
    • Additional mailboxes: not available

Kolab
Kolab Now is mainly positioned as a full groupware service, but they do have a email-only option which I investigated. There isn t much documentation about it compared to other providers, and also not much in the way of settings. You can turn greylisting on or off. And . that s it. It has a full suite of filtering options. They set an X-Envelope-To header which you can use with the arbitrary header match to do the right thing even for BCC situations. Filters can have multiple conditions and multiple actions. It is SIEVE-based and you can download your SIEVE definitions. If you enable 2FA, you disable IMAP and SMTP; not great. Verdict: Not an impressive enough email featureset to justify going with it. Kolab Now summary:
  • Website: https://kolabnow.com/
  • Reliability: Seems to be fine
  • Support: Fine responsiveness (next day)
  • Invidiaul app passwords: no
  • 2FA: Yes, but if you enable it, they disable IMAP and SMTP
  • Filtering: Excellent
  • Spam settings: Only greylisting on/off
  • Server storage location: Switzerland; they have lots of details on their setup
  • Plan as reviewed: Just email [pricing link]
    • Cost per year: CHF 60, about $66
    • Mail storage included: 5GB
    • Limitations on send/receive volume: None
    • Aliases: Yes. Not sure if there are limits.
    • Additional mailboxes: Yes if you set up a group account. Flexible pricing based on user count is not documented anywhere I could find.

Mailfence
Mailfence is another option, somewhat similar to Startmail but without the unique vault. I had some questions about filters, and support was quite responsive, responding in a couple of hours. Some of their copy on their website is a bit misleading, but support clarified when I asked them. They do not offer encryption at rest (like most of the entries here). Mailfence s filtering system is the kind I d like to see. It allows multiple conditions and multiple actions for each rule, and has some unique actions as well (notify by SMS or XMPP). Support says that Recipients matches envelope recipients. However, one ommission is that I can t match on arbitrary headers; only the canned list of headers they provide. They have only two spam settings:
  • spam filter on/off
  • whitelist
Given some recent complaints about their spam filter being overly aggressive, I find this lack of control somewhat concerning. (However, I discount complaints about people begging for more features in free accounts; free won t provide the kind of service I m looking for with any provider.) There are generally just very few settings for email as well. Verdict: Response and helpful support, filtering has the right structure but lacks arbitrary header match. Could be a good option. Mailfence summary:
  • Website: https://mailfence.com/
  • Reliability: Seems to be fine
  • Support: Excellent responsiveness and helpful replies (after some initial confusion about my question of greylisting)
  • Individual app access passwords: No. You can set a per-service password (eg, an IMAP password), but those will be shared with all devices speaking that protocol.
  • 2FA: Yes
  • Filtering: Good; only misses the ability to filter on arbitrary headers
  • Spam settings: Very few
  • Server storage location: Belgium
  • Plan as reviewed: Entry [pricing link]
    • Cost per year: $42
    • Mail storage included: 10GB, with a maximum of 50,000 messages
    • Limits on send/receive volume: none
    • Aliases: 50. Aliases can t be deleted once created (there may be an exeption to this for aliases on your own domain rather than mailfence.com)
    • Additional mailboxes: Their page on this is a bit confusing, and the pricing page lacks the information promised. It looks like you can pay the same $42/year for additional mailboxes, with a limit of up to 2 additional paid mailboxes and 2 additional free mailboxes tied to the account.

Runbox
This one came recommended in a Mastodon thread. I had some questions about it, and support response was fantastic I heard from two people that were co-founders of the company! Even within hours, on a weekend. Incredible! This kind of response was only surpassed by Migadu. I initially wrote to Runbox with questions about the incoming and outgoing message limits, which I hadn t seen elsewhere, as well as the bandwidth limit. They said the bandwidth limit is no longer enforced on paid accounts. The incoming and outgoing limits are enforced, and all email (even spam) counts towards the limit. Notably the outgoing limit is per recipient, so if you send 10 messages to your 50-recipient family group, that s the limit. However, they also indicated a willingness to reset the limit if something happens. Unfortunately, hitting the limit results in a hard bounce (SMTP 5xx) rather than a temporary failure (SMTP 4xx) so it can result in lost mail. This means I d be worried about some attack or other weirdness causing me to lose mail. Their filter is a pain point. Here are the challenges:
  • You can t directly match on a BCC recipient. Support advised to use a headers match, which will search for something anywhere in the headers. This works and is probably good enough since this data is in the Received: headers, but it is a little more imprecise.
  • They only have a contains , not an equals operator. So, for instance, a pattern searching for test@example.com would also match newtest@example.com . Support advised to put the email address in angle brackets to avoid this. That will work mostly. Angle brackets aren t always required in headers.
  • There is no way to have multiple actions on the filter (there is just no way to file an incoming message into two folders). This was the ultimate showstopper for me.
Support advised they are planning to upgrade the filter system in the future, but these are the limitations today. Verdict: A good option if you don t need much from the filtering system. Lots of privacy emphasis. Runbox summary:
  • Website: https://runbox.com/
  • Reliability: Seems to be fine, except returning 5xx codes if per-day limits are exceeded
  • Support: Excellent responsiveness and replies from founders
  • Individual app passwords: Yes
  • 2FA: Yes
  • Filtering: Poor
  • Spam settings: Very few
  • Server storage location: Norway
  • Plan as reviewed: Mini [pricing link]
    • Cost per year: $35
    • Mail storage included: 10GB
    • Limited on send/receive volume: Receive 5000 messages/day, Send 500 recipients/day
    • Aliases: 100 on runbox.com; unlimited on your own domain
    • Additional mailboxes: $15/yr each, also with 10GB non-shared storage per mailbox

Fastmail
Fastmail came recommended to me by a friend I ve known for decades. Here s the thing about Fastmail, compared to all the services listed above: It all just works. Everything. Filtering, spam prevention, it is all there, all feature-complete, and all just does the right thing as you d hope. Their filtering system has a canned dropdown for To/Cc/Bcc , it supports multiple conditions and multiple actions, and just does the right thing. (Delivering to multiple folders is a little cumbersome but possible.) It has a particularly strong feature set around administering multiple accounts, including things like whether users can prevent admins from reading their mail. The not-so-great part of the picture is around privacy. Fastmail is based in Australia, where the government has extensive power around spying on data, even to the point of forcing companies to add wiretap capabilities. Fastmail s privacy policy states user data may be held in Australia, USA, India, and Netherlands. By default, they share data with unidentified spam companies , though you can disable this in settings. On the other hand, they do make a good effort towards privacy. I contacted support with some questions and got back a helpful response in three hours. However, one of the questions was about in which countries my particular data would be stored, and the support response said they would have to get back to me on that. It s been several days and no word back. Verdict: A featureful option that just works , with a lot of features for managing family accounts and the like, but lacking in the privacy area. Fastmail summary:
  • Website: https://www.fastmail.com/
  • Reliability: Seems to be fine
  • Support: Good response time on most questions; dropped the ball on one tha trequired research
  • Individual app access passwords: Yes
  • 2FA: Yes
  • Filtering: Excellent
  • Spam settings: Can set filter aggressiveness, decide whether to share spam data with spam-fighting companies , configure how to handle backscatter spam, and evaluate the personal learning filter.
  • Server storage locations: Australia, USA, India, and The Netherlands. Legal jurisdiction is Australia.
  • Plan as reviewed: Individual [pricing link]
    • Cost per year: $60
    • Mail storage included: 50GB
    • Limits on send/receive volume: 300/hour
    • Aliases: Unlimited from what I can see
    • Additional mailboxes: No; requires a different plan for that

Migadu
Migadu was a service I d never heard of, but came recommended to me on Mastodon. I listed Migadu last because it is a class of its own compared to all the other options. Every other service is basically a webmail interface with a few extra settings tacked on. Migadu has a full-featured email admin console in addition. By that I mean you can:
  • View usage graphs (incoming, outgoing, storage) over time
  • Manage DNS (if you want Migadu to run your nameservers)
  • Manage multiple domains, and cross-domain relationships with mailboxes
  • View a limited set of logs
  • Configure accounts, reset their passwords if needed/authorized, etc.
  • Configure email address rewrite rules with wildcards and so forth
Basically, if you were the sort of person that ran your own mail servers back in the day, here is Migadu giving you most of that functionality. Effectively you have a web interface to do all the useful stuff, and they handle the boring and annoying bits. This is a really attractive model. Migadu support has been fantastic. They are quick to respond, and went above and beyond. I pointed out that their X-Envelope-To header, which is needed for filtering by BCC, wasn t being added on emails I sent myself. They replied 5 hours later indicating they had added the feature to add X-Envelope-To even for internal mails! Wow! I am impressed. With Migadu, you buy a pool of resources: storage space and incoming/outgoing traffic. What you do within that pool is up to you. You can set up users ( mailboxes ), aliases, domains, whatever you like. It all just shares the pool. You can restrict users further so that an individual user has access to only a subset of the pool resources. I was initially concerned about Migadu s daily send/receive message count limits, but in visiting with support and reading the documentation, what really comes out is that Migadu is a service with a personal touch. Hitting the incoming traffic limit will cause a SMTP temporary fail (4xx) response so you won t lose legit mail and support will work with you if it s a problem for legit uses. In other words, restrictions are soft and they are interpreted reasonably. One interesting thing about Migadu is that they do not offer accounts under their domain. That is, you MUST bring your own domain. That s pretty easy and cheap, of course. It also puts you in a position of power, because it is easy to migrate email from one provider to another if you own the domain. Filtering is done via SIEVE. There is a GUI editor which lets you accomplish most things, though it has an odd blind spot where you can t file a message into multiple folders. However, you can edit a SIEVE ruleset directly and you get the full SIEVE featureset, which is extensive (and does support filing a message into multiple folders). I note that the SIEVE :envelope match doesn t work, but Migadu adds an X-Envelope-To header which is just as good. I particularly love a company that tells you all the reasons you might not want to use them. Migadu s pro/con list is an honest drawbacks list (of course, their homepage highlights all the features!). Verdict: Fantastically powerful, excellent support, and good privacy. I chose this one. Migadu summary:
  • Website: https://migadu.com/
  • Reliability: Excellent
  • Support: Fantastic. Good response times and they added a feature (or fixed a bug?) a few hours after I requested it.
  • Individual access passwords: Yes. Create identities to support them.
  • 2FA: Yes, on both the admin interface and the webmail interface
  • Filtering: Excellent, based on SIEVE. GUI editor doesn t support multiple actions when filing into a folder, but full SIEVE functionality is exposed.
  • Spam settings:
    • On the domain level, filter aggressiveness, Greylisting on/off, black and white lists
    • On the mailbox level, filter aggressiveness, black and whitelists, action to take with spam; compatible with filters.
  • Server storage location: France; legal jurisdiction Switzerland
  • Plan as reviewed: mini [pricing link]
    • Cost per year: $90
    • Mail storage included: 30GB ( soft quota)
    • Limits on send/receive volume: 1000 messgaes in/day, 100 messages out/day ( soft quotas)
    • Aliases: Unlimited on an unlimited number of domains
    • Additional mailboxes: Unlimited and free; uses pooled quotas, but individual quotas can be set

Others
Here are a few others that I didn t think worthy of getting a trial:
  • mxroute was recommended by several. Lots of concerning things in their policy, such as:
    • if you repeatedly send mail to unroutable recipients, they may publish the addresses on Github
    • they will terminate your account if they think you are rude or want to contest a charge
    • they reserve the right to cancel your service at any time for any (or no) reason.
  • Proton keeps coming up, and I will not consider it so long as I am locked into their client on mobile.
  • Skiff comes up sometimes, but they were acquired by Notion.
  • Disroot comes up; this discussion highlights a number of reasons why I avoid them. Their Terms of Service (ToS) is inconsistent with a general-purpose email account (I guess for targeting nonprofits and activists, that could make sense). Particularly laughable is that they claim to be friends of Open Source, but then would take down your account if you upload copyrighted material. News flash: in order for an Open Source license to be meaningful, the underlying work is copyrighted. It is perfectly legal to upload copyrighted material when you wrote it or have the license to do so!

Conclusions
There are a lot of good options for email hosting today, and in particular I appreciate the excellent personal support from companies like Migadu and Runbox. Support small businesses!

14 May 2024

Julian Andres Klode: The new APT 3.0 solver

APT 2.9.3 introduces the first iteration of the new solver codenamed solver3, and now available with the solver 3.0 option. The new solver works fundamentally different from the old one.

How does it work? Solver3 is a fully backtracking dependency solving algorithm that defers choices to as late as possible. It starts with an empty set of packages, then adds the manually installed packages, and then installs packages automatically as necessary to satisfy the dependencies. Deferring the choices is implemented multiple ways: First, all install requests recursively mark dependencies with a single solution for install, and any packages that are being rejected due to conflicts or user requests will cause their reverse dependencies to be transitively marked as rejected, provided their or group cannot be solved by a different package. Second, any dependency with more than one choice is pushed to a priority queue that is ordered by the number of possible solutions, such that we resolve a b before a b c. Not just by the number of solutions, though. One important point to note is that optional dependencies, that is, Recommends, are always sorting after mandatory dependencies. Do note on that: Recommended packages do not nest in backtracking - dependencies of a Recommended package themselves are not optional, so they will have to be resolved before the next Recommended package is seen in the queue. Another important step in deferring choices is extracting the common dependencies of a package across its version and then installing them before we even decide which of its versions we want to install - one of the dependencies might cycle back to a specific version after all. Decisions about package levels are recorded at a certain decision level, if we reach a conflict we backtrack to the previous decision level, mark the decision we made (install X) in the inverse (DO NOT INSTALL X), reset all the state all decisions made at the higher level, and restore any dependencies that are no longer resolved to the work queue.

Comparison to SAT solver design. If you have studied SAT solver design, you ll find that essentially this is a DPLL solver without pure literal elimination. A pure literal eliminitation phase would not work for a package manager: First negative pure literals (packages that everything conflicts with) do not exist, and positive pure literals (packages nothing conflicts with) we do not want to mark for install - we want to install as little as possible (well subject, to policy). As part of the solving phase, we also construct an implication graph, albeit a partial one: The first package installing another package is marked as the reason (A -> B), the same thing for conflicts (not A -> not B). Once we have added the ability to have multiple parents in the implication graph, it stands to reason that we can also implement the much more advanced method of conflict-driven clause learning; where we do not jump back to the previous decision level but exactly to the decision level that caused the conflict. This would massively speed up backtracking.

What changes can you expect in behavior? The most striking difference to the classic APT solver is that solver3 always keeps manually installed packages around, it never offers to remove them. We will relax that in a future iteration so that it can replace packages with new ones, that is, if your package is no longer available in the repository (obsolete), but there is one that Conflicts+Replaces+Provides it, solver3 will be allowed to install that and remove the other. Implementing that policy is rather trivial: We just need to queue obsolete replacement as a dependency to solve, rather than mark the obsolete package for install. Another critical difference is the change in the autoremove behavior: The new solver currently only knows the strongest dependency chain to each package, and hence it will not keep around any packages that are only reachable via weaker chains. A common example is when gcc-<version> packages accumulate on your system over the years. They all have Provides: c-compiler and the libtool Depends: gcc c-compiler is enough to keep them around.

New features The new option --no-strict-pinning instructs the solver to consider all versions of a package and not just the candidate version. For example, you could use apt install foo=2.0 --no-strict-pinning to install version 2.0 of foo and upgrade - or downgrade - packages as needed to satisfy foo=2.0 dependencies. This mostly comes in handy in use cases involving Debian experimental or the Ubuntu proposed pockets, where you want to install a package from there, but try to satisfy from the normal release as much as possible. The implication graph building allows us to implement an apt why command, that while not as nicely detailed as aptitude, at least tells you the exact reason why a package is installed. It will only show the strongest dependency chain at first of course, since that is what we record.

What is left to do? At the moment, error information is not stored across backtracking in any way, but we generally will want to show you the first conflict we reach as it is the most natural one; or all conflicts. Currently you get the last conflict which may not be particularly useful. Likewise, errors currently are just rendered as implication graphs of the form [not] A -> [not] B -> ..., and we need to put in some work to present those nicely. The test suite is not passing yet, I haven t really started working on it. A challenge is that most packages in the test suite are manually installed as they are mocked, and the solver now doesn t remove those. We plan to implement the replacement logic such that foo can be replaced by foo2 Conflicts/Replaces/Provides foo without needing to be automatically installed. Improving the backtracking to be non-chronological conflict-driven clause learning would vastly enhance our backtracking performance. Not that it seems to be an issue right now in my limited testing (mostly noble 64-bit-time_t upgrades). A lot of that complexity you have normally is not there because the manually installed packages and resulting unit propagation (single-solution Depends/Reverse-Depends for Conflicts) already ground us fairly far in what changes we can actually make. Once all the stuff has landed, we need to start rolling it out and gather feedback. On Ubuntu I d like automated feedback on regressions (running solver3 in parallel, checking if result is worse and then submitting an error to the error tracker), on Debian this could just be a role email address to send solver dumps to. At the same time, we can also incrementally start rolling this out. Like phased updates in Ubuntu, we can also roll out the new solver as the default to 10%, 20%, 50% of users before going to the full 100%. This will allow us to capture regressions early and fix them.

11 May 2024

Sven Hoexter: xdg and mime types - stuff I would've loved to know a week ago

Learned a few things about xdg and mimetype registration in the last week that could be helpful to have condensed in a single place. No Need to Ship a Mailcap Mime File If you already ship a .desktop file (that is what ends up in /usr/share/applications/) which has a MimeType declared, there is no need to also ship a mailcap file (that is what ends up in /usr/lib/mime/packages/). Some triggers will do the conversion work for you. See also Debian Policy 4.9. Reverse DNS Naming Convention for .desktop Files Seems to be a closely guarded secret, maybe mainly known inside the Gnome world, but it's in the spec. Also not very widely known inside Debian if I look at my local system as not very representative sample. Your hicolor Theme App Icon can be a Mime Type Icon as Well In case you didn't know the hicolor icon theme is the default fallback theme. Many of us already install application icons e.g. in /usr/share/icons/hicolor/48x48/apps/ which is used in conjunction with the Icon field in the .desktop file to locate the application icon. Now the next step, and there it seems quite of few us miss out, is to create a symlink to also provide a mime type icon, so it's displayed in graphical file managers for the application data files. The schema here is simple: Take the MimeType e.g. application/x-vymand replace the / with a - and use that as file name in e.g. /usr/share/icons/hicolor/48x48/mimetypes/. In the vym case that is /usr/share/icons/hicolor/48x48/mimetypes/application-x-vym.png. If you have one use a scalable .svg file instead of .png. This seems to be an area where Debian lacks a bit of tooling to automatically convert application icons to all the different sizes and install it in all the appropriate places. What is already there is a trigger to run gtk-update-icon-cache when you install new icons into one of the icon theme folder so they're picked up. No Priority or Order in .desktop Files Likely something that hapens on all my fresh installations: Libreoffice is installed and xdg-open starts to open pdf files with Libreoffice instead of evince. Now I've to figure out again to run xdg-mime default org.gnome.Evince.desktop application/pdf to change that (at least for my user). Background here is that the desktop file spec explicitly mandates "Priority for applications is handled external to the .desktop files.". That's why we got in addition to all of that mimeapps.list files. And now, after running the xdg-mime command from above, we've a ~/.config/mimeapps.list defining
[Default Applications]
application/pdf=org.gnome.Evince.desktop
Debian as whole seems to be not very keen on shipping something like a sensible default mimeapps.list outside of desktop environment specific ones. A quick search gave me just
$ apt-file search mimeapps.list
cinnamon-desktop-data: /usr/share/applications/x-cinnamon-mimeapps.list
gdm3: /usr/share/gdm/greeter/applications/mimeapps.list
gnome-session-common: /usr/share/applications/gnome-mimeapps.list
plasma-workspace: /usr/share/applications/kde-mimeapps.list
sxmo-utils: /usr/share/applications/mimeapps.list
sxmo-utils: /usr/share/sxmo/xdg/mimeapps.list
While it's a bit anoying to run into that pdf vs Libreoffice thing every now and then, it's maybe better to not have long controversial threads about default pdf viewer, like the ones we already had about the default MTA choices. ;) And while we're at it: everyone using Libreoffice should give a virtual hug to rene@ for taming that beast since 2010 and OpenOffice.org before.

Next.

Previous.