Search Results: "will"

8 February 2026

Dirk Eddelbuettel: chronometre: A new package (pair) demo for R and Python

Both R and Python make it reasonably easy to work with compiled extensions. But how to access objects in one environment from the other and share state or (non-trivial) objects remains trickier. Recently (and while r-forge was resting so we opened GitHub Discussions) a question was asked concerning R and Python object pointer exchange. This lead to a pretty decent discussion including arrow interchange demos (pretty ideal if dealing with data.frame-alike objects), but once the focus is on more library-specific objects from a given (C or C++, say) library it is less clear what to do, or how involved it may get. R has external pointers, and these make it feasible to instantiate the same object in Python. To demonstrate, I created a pair of (minimal) packages wrapping a lovely (small) class from the excellent spdlog library by Gabi Melman, and more specifically in an adapted-for-R version (to avoid some R CMD check nags) in my RcppSpdlog package. It is essentially a nicer/fancier C++ version of the tic() and tic() timing scheme. When an object is instantiated, it starts the clock and when we accessing it later it prints the time elapsed in microsecond resolution. In Modern C++ this takes little more than keeping an internal chrono object. Which makes for a nice, small, yet specific object to pass to Python. So the R side of the package pair instantiates such an object, and accesses its address. For different reasons, sending a raw pointer across does not work so well, but a string with the address printed works fabulously (and is a paradigm used around other packages so we did not invent this). Over on the Python side of the package pair, we then take this string representation and pass it to a little bit of pybind11 code to instantiate a new object. This can of course also expose functionality such as the show time elapsed feature, either formatted or just numerically, of interest here. And that is all that there is! Now this can be done from R as well thanks to reticulate as the demo() (also shown on the package README.md) shows:
> library(chronometre)
> demo("chronometre", ask=FALSE)


        demo(chronometre)
        ---- ~~~~~~~~~~~

> #!/usr/bin/env r
> 
> stopifnot("Demo requires 'reticulate'" = requireNamespace("reticulate", quietly=TRUE))

> stopifnot("Demo requires 'RcppSpdlog'" = requireNamespace("RcppSpdlog", quietly=TRUE))

> stopifnot("Demo requires 'xptr'" = requireNamespace("xptr", quietly=TRUE))

> library(reticulate)

> ## reticulate and Python in general these days really want a venv so we will use one,
> ## the default value is a location used locally; if needed create one
> ## check for existing virtualenv to use, or else set one up
> venvdir <- Sys.getenv("CHRONOMETRE_VENV", "/opt/venv/chronometre")

> if (dir.exists(venvdir))  
+ >     use_virtualenv(venvdir, required = TRUE)
+ >   else  
+ >     ## create a virtual environment, but make it temporary
+ >     Sys.setenv(RETICULATE_VIRTUALENV_ROOT=tempdir())
+ >     virtualenv_create("r-reticulate-env")
+ >     virtualenv_install("r-reticulate-env", packages = c("chronometre"))
+ >     use_virtualenv("r-reticulate-env", required = TRUE)
+ >  


> sw <- RcppSpdlog::get_stopwatch()                   # we use a C++ struct as example

> Sys.sleep(0.5)                                      # imagine doing some code here

> print(sw)                                           # stopwatch shows elapsed time
0.501220 

> xptr::is_xptr(sw)                                   # this is an external pointer in R
[1] TRUE

> xptr::xptr_address(sw)                              # get address, format is "0x...."
[1] "0x58adb5918510"

> sw2 <- xptr::new_xptr(xptr::xptr_address(sw))       # cloned (!!) but unclassed

> attr(sw2, "class") <- c("stopwatch", "externalptr") # class it .. and then use it!

> print(sw2)                                          #  xptr  allows us close and use
0.501597 

> sw3 <- ch$Stopwatch(  xptr::xptr_address(sw) )      # new Python object via string ctor

> print(sw3$elapsed())                                # shows output via Python I/O
datetime.timedelta(microseconds=502013)

> cat(sw3$count(), "\n")                              # shows double
0.502657 

> print(sw)                                           # object still works in R
0.502721 
> 
The same object, instantiated in R is used in Python and thereafter again in R. While this object here is minimal in features, the concept of passing a pointer is universal. We could use it for any interesting object that R can access and Python too can instantiate. Obviously, there be dragons as we pass pointers so one may want to ascertain that headers from corresponding compatible versions are used etc but principle is unaffected and should just work. Both parts of this pair of packages are now at the corresponding repositories: PyPi and CRAN. As I commonly do here on package (change) announcements, I include the (minimal so far) set of high-level changes for the R package.

Changes in version 0.0.2 (2026-02-05)
  • Removed replaced unconditional virtualenv use in demo given preceding conditional block
  • Updated README.md with badges and an updated demo

Changes in version 0.0.1 (2026-01-25)
  • Initial version and CRAN upload

Questions, suggestions, bug reports, are welcome at either the (now awoken from the R-Forge slumber) Rcpp mailing list or the newer Rcpp Discussions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Louis-Philippe V ronneau: Montreal Subway Foot Traffic Data, 2025 edition

Another year of data from Soci t de Transport de Montr al, Montreal's transit agency! A few highlights this year:
  1. Although the Saint-Michel station closed for emergency repairs in November 2024, traffic never bounced back to its pre-closure levels and is still stuck somewhere around 2022 Q2 levels. I wonder if this could be caused by the roadwork on Jean-Talon for the new Blue Line stations making it harder for folks in Montreal-Nord to reach the station by bus.
  2. The effects of the opening of the Royalmount shopping center has had a durable impact on the traffic at the De la Savane station. I reported on this last year, but it seems this wasn't just a fad.
  3. With the completion of the Deux-Montagnes branch of the R seau express m tropolitain (REM, a light-rail, above the surface transit network still in construction), the transfer stations to the Montreal subway have seen major traffic increases. The douard-Montpetit station has nearly reached its previous all-time record of 2015 and the McGill station has recovered from the general slump all the other stations have had in 2025.
  4. The Assomption station, which used to have one of the lowest number of riders of the subway network, has had a tremendous growth in the past few years. This is mostly explained by the many high-rise projects that were built around the station since the end of the COVID-19 pandemic.
  5. Although still affected by a very high seasonality, the Jean-Drapeau station broke its previous record of 2019, a testament of the continued attraction power of the various summer festivals taking place on the Sainte-H l ne et Notre-Dame islands.
More generally, it seems the Montreal subway has had a pretty bad year. Traffic had been slowly climbing back since the COVID-19 pandemic, but this is the first year since 2020 such a sharp decline can be witnessed. Even major stations like Jean-Talon or Lionel-Groulx are on a downward trend and it is pretty worrisome. As for causes, a few things come to mind. First of all, as the number of Montrealers commuting to work by bike continues to rise1, a modal shift from public transit to active mobility is to be expected. As local experts put it, this is not uncommon and has been seen in other cities before. Another important factor that certainly turned people away from the subway this year has been the impacts of the continued housing crisis in Montreal. As more and more people get kicked out of their apartments, many have been seeking refuge in the subway stations to find shelter. Sadly, this also brought a unprecedented wave of incivilities. As riders' sense of security sharply decreased, the STM eventually resorted to banning unhoused people from sheltering in the subway. This decision did bring back some peace to the network, but one can posit damage had already been done and many casual riders are still avoiding the subway for this reason. Finally, the weekslong STM worker's strike in Q4 had an important impact on general traffic, as it severely reduced the opening hours of the subway. As for the previous item, once people find alternative ways to get around, it's always harder to bring them back. Hopefully, my 2026 report will be a more cheerful one... By clicking on a subway station, you'll be redirected to a graph of the station's foot traffic. Licences

  1. Mostly thanks to major improvements to the cycling network and the BIXI bikesharing program.

6 February 2026

Reproducible Builds: Reproducible Builds in January 2026

Welcome to the first monthly report in 2026 from the Reproducible Builds project! These reports outline what we ve been up to over the past month, highlighting items of news from elsewhere in the increasingly-important area of software supply-chain security. As ever, if you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website.

  1. Flathub now testing for reproducibility
  2. Reproducibility identifying projects that will fail to build in 2038
  3. Distribution work
  4. Tool development
  5. Two new academic papers
  6. Upstream patches

Flathub now testing for reproducibility Flathub, the primary repository/app store for Flatpak-based applications, has begun checking for build reproducibility. According to a recent blog post:
We have started testing binary reproducibility of x86_64 builds targeting the stable repository. This is possible thanks to flathub-repro-checker, a tool doing the necessary legwork to recreate the build environment and compare the result of the rebuild with what is published on Flathub. While these tests have been running for a while now, we have recently restarted them from scratch after enabling S3 storage for diffoscope artifacts.
The test results and status is available on their reproducible builds page.

Reproducibility identifying software projects that will fail to build in 2038 Longtime Reproducible Builds developer Bernhard M. Wiedemann posted on Reddit on Y2K38 commemoration day T-12 that is to say, twelve years to the day before the UNIX Epoch will no longer fit into a signed 32-bit integer variable on 19th January 2038. Bernhard s comment succinctly outlines the problem as well as notes some of the potential remedies, as well as links to a discussion with the GCC developers regarding adding warnings for int time_t conversions . At the time of publication, Bernard s topic had generated 50 comments in response.

Distribution work Conda is language-agnostic package manager which was originally developed to help Python data scientists and is now a popular package manager for Python and R. conda-forge, a community-led infrastructure for Conda recently revamped their dashboards to rebuild packages straight to track reproducibility. There have been changes over the past two years to make the conda-forge build tooling fully reproducible by embedding the lockfile of the entire build environment inside the packages.
In Debian this month:
In NixOS this month, it was announced that the GNU Guix Full Source Bootstrap was ported to NixOS as part of Wire Jansen bachelor s thesis (PDF). At the time of publication, this change has landed in NiX stdev distribution.
Lastly, Bernhard M. Wiedemann posted another openSUSE monthly update for his work there.

Tool development diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions, 310 and 311 to Debian.
  • Fix test compatibility with u-boot-tools version 2026-01. [ ]
  • Drop the implied Rules-Requires-Root: no entry in debian/control. [ ]
  • Bump Standards-Version to 4.7.3. [ ]
  • Reference the Debian ocaml package instead of ocaml-nox. (#1125094)
  • Apply a patch by Jelle van der Waa to adjust a test fixture match new lines. [ ]
  • Also the drop implied Priority: optional from debian/control. [ ]

In addition, Holger Levsen uploaded two versions of disorderfs, first updating the package from FUSE 2 to FUSE 3 as described in last months report, as well as updating the packaging to the latest Debian standards. A second upload (0.6.2-1) was subsequently made, with Holger adding instructions on how to add the upstream release to our release archive and incorporating changes by Roland Clobus to set _FILE_OFFSET_BITS on 32-bit platforms, fixing a build failure on 32-bit systems. Vagrant Cascadian updated diffoscope in GNU Guix to version 311-2-ge4ec97f7 and disorderfs to 0.6.2.

Two new academic papers Julien Malka, Stefano Zacchiroli and Th o Zimmermann of T l com Paris in-house research laboratory, the Information Processing and Communications Laboratory (LTCI) published a paper this month titled Docker Does Not Guarantee Reproducibility:
[ ] While Docker is frequently cited in the literature as a tool that enables reproducibility in theory, the extent of its guarantees and limitations in practice remains under-explored. In this work, we address this gap through two complementary approaches. First, we conduct a systematic literature review to examine how Docker is framed in scientific discourse on reproducibility and to identify documented best practices for writing Dockerfiles enabling reproducible image building. Then, we perform a large-scale empirical study of 5,298 Docker builds collected from GitHub workflows. By rebuilding these images and comparing the results with their historical counterparts, we assess the real reproducibility of Docker images and evaluate the effectiveness of the best practices identified in the literature.
A PDF of their paper is available online.
Quentin Guilloteau, Antoine Waehren and Florina M. Ciorba of the University of Basel in Sweden also published a Docker-related paper, theirs called Longitudinal Study of the Software Environments Produced by Dockerfiles from Research Artifacts:
The reproducibility crisis has affected all scientific disciplines, including computer science (CS). To address this issue, the CS community has established artifact evaluation processes at conferences and in journals to evaluate the reproducibility of the results shared in publications. Authors are therefore required to share their artifacts with reviewers, including code, data, and the software environment necessary to reproduce the results. One method for sharing the software environment proposed by conferences and journals is to utilize container technologies such as Docker and Apptainer. However, these tools rely on non-reproducible tools, resulting in non-reproducible containers. In this paper, we present a tool and methodology to evaluate variations over time in software environments of container images derived from research artifacts. We also present initial results on a small set of Dockerfiles from the Euro-Par 2024 conference.
A PDF of their paper is available online.

Miscellaneous news On our mailing list this month: Lastly, kpcyrd added a Rust section to the Stable order for outputs page on our website. [ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Birger Schacht: Status update, January 2026

January was a slow month, I only did three uploads to Debian unstable: I was very happy to see the new dfsg-new-queue and that there are more hands now processing the NEW queue. I also finally got one of the packages accepted that I uploaded after the Trixie release: wayback which I uploaded last August. There has been another release since then, I ll try to upload that in the next few days. There was a bug report for carl asking for Windows support. carl used the xdg create for looking up the XDG directories, but xdg does not support windows systems (and it seems this will not change) The reporter also provided a PR to replace the dependency with the directories crate which more system agnostic. I adapted the PR a bit and merged it and released version 0.6.0 of carl. At my dayjob I refactored django-grouper. django-grouper is a package we use to find duplicate objects in our data. Our users often work with datasets of thousands of historical persons, places and institutions and in projects that run over years and ingest data from multiple sources, it happens that entries are created several times. I wrote the initial app in 2024, but was never really happy about the approach I used back then. It was based on this blog post that describes how to group spreadsheet text cells. It uses sklearns TfidfVectorizer with a custom analyzer and the library sparse_dot_topn for creating the matrix. All in all the module to calculate the clusters was 80 lines and with sparse_dot_topn it pulled in a rather niche Python library. I was pretty sure that this functionality could also be implemented with basic sklearn functionality and it was: we are now using DictVectorizer because in a Django app we are working with objects that can be mapped to dicts anyway. And for clustering the data, the app now uses the DBSCAN algorithm (with the manhattan distance as metric). The module is now only half the size and the whole app lost one dependency! I released those changes as version 0.3.0 of the app. At the end of January together with friends I went to Brussels to attend FOSDEM. We took the night train but there were a couple of broken down trains so the ride took 26 hours instead of one night. It is a good thing we had a one day buffer and FOSDEM only started on Saturday. As usual there were too many talks to visit, so I ll have to watch some of the recordings in the next few weeks. Some examples of talks I found interesting so far:

3 February 2026

Valhalla's Things: A Day Off

Posted on February 3, 2026
Tags: madeof:atoms, madeof:bits
Today I had a day off. Some of it went great. Some less so. I woke up, went out to pay our tribute to NotOurCat, and it was snowing! yay! And I had a day off, so if it had snowed enough that shovelling was needed, I had time to do it (it didn t, it started to rain soon afterwards, but still, YAY snow!). Then I had breakfast, with the fruit rye bread I had baked yesterday, and I treated myself to some of the strong Irish tea I have left, instead of the milder ones I want to finish before buying more of the Irish. And then, I bought myself a fancy new expensive fountain pen. One that costs 16 ! more than three times as much as my usual ones! I hope it will work as well, but I m quite confident it should. I ll find out when it arrives from Germany (together with a few ink samples that will result in a future blog post with some SCIENCE). I decided to try and use bank transfers instead of my visa debit card when buying from online shops that give the option to do so: it s a tiny bit more effort, but it means I m paying 0.25 to my bank1 rather than the seller having to pay some unknown amount to an US based payment provider. Unluckily, the fountain pen website offered a huge number of payment methods, but not bank transfers. sigh. And then, I could start working a bit on the connecting wires for the LED strips for our living room: I soldered two pieces, six wires each (it s one RGB strip, 4 pins, and a warm white one requiring two more), then did a bit of tests, including writing some micropython code to add a test mode that lights up each colour in sequence, and the morning was almost gone. For some reason this project, as simple as it is, is taking forever. But it is showing progress. There was a break, when the postman delivered a package of chemicals 2 for a future project or two. There will be blog posts! After lunch I spent some time finishing eyelets on the outfit I wanted to wear this evening, as I had not been able to finish it during fosdem. This one will result in two blog posts! Meanwhile, in the morning I didn t remember the name of the program I used to load software on micropython boards such as the one that will control the LED strips (that s thonny), and while searching for it in the documentation, I found that there is also a command line program I can use, mpremote, and that s a much better fit for my preferences! I mentioned it in an xmpp room full of nerds, and one of them mentioned that he could try it on his Inkplate, when he had time, and I was nerd-sniped into trying it on mine, which had been sitting unused showing the temperatures in our old house on the last day it spent there and needs to be updated for the sensors in the new house. And that lead to the writing of some notes on how to set it up from the command line (good), and to the opening on one upstream issue (bad), because I have an old model, and the board-specific library isn t working. at all. And that s when I realized that it was 17:00, I still had to cook the bread I had been working on since yesterday evening (ciabatta, one of my favourites, but it needs almost one hour in the oven), the outfit I wanted to wear in the evening was still not wearable, the table needed cleaning and some panicking was due. Thankfully, my mother was cooking dinner, so I didn t have to do that too. I turned the oven on, sewed the shoulder seams of the bodice while spraying water on the bread every 5 minutes, and then while it was cooking on its own, started to attach a closure to the skirt, decided that a safety pin was a perfectly reasonable closure for the first day an outfit is worn, took care of the table, took care of the bread, used some twine to close the bodice, because I still haven t worked out what to use for laces, realized my bodkin is still misplaced, used a long and sharp and big needle meant for sewing mattresses instead of a bodkin, managed not to stab myself, and less than half an hour late we could have dinner. There was bread, there was Swedish crispbread, there were spreads (tuna, and beans), and vegetables, and then there was the cake that caused my mother to panic when she added her last honey to the milk and it curdled (my SO and I tried it, it had no odd taste, we decided it could be used) and it was good, although I had to get a second slice just to be 100% sure of it. And now I m exhausted, and I ve only done half of the things I had planned to do, but I d still say I ve had quite a good day.

  1. Banca Etica, so one that avoids any investment in weapons and a number of other problematic things.
  2. not food grade, except for one, but kitchen-safe.

2 February 2026

Hellen Chemtai: Career Growth Through Open Source: A Personal Journey

Hello world  ! I am an intern at Outreachy working with the Debian OpenQA team on images testing. We get to know what career opportunities awaits us when we work on open source projects. In open source, we are constantly learning. The community has different sets of skills and a large network of people.

So, how did I start off in this internship

I entered the community with the these skills:

  • MERN (Mongo DB, Express JS , React JS and Node JS) for web development
  • Linux and Shell Scripting for some administrative purposes
  • Containerization using Google Cloud
  • Operating Systems
  • A learning passion for Open Source I contributed to some open source work in the past but it was in terms of documentation and bug hunting

I was a newbie at OpenQA but, I had a month to learn and contribute. Time is a critical resource but so is understanding what you are doing. I followed the installations instructions given but whenever I got errors, I had to research why I got the errors. I took time to understand errors I was solving then continued on with the tasks I wanted to do. I communicated my logic and understanding while working on the task and awaited reviews and feedback. Within a span of two months I had learned a lot by practicing and testing.

The skills I gained

As of today, I gained these technical skills from my work with Debian OpenQA team.

  • Perl the tests that we run are written in this language
  • Ansible configuration ansible configurations and settings for the machines the team runs
  • Git this is needed for code versioning and diving tasks into different branches
  • Linux shell scripting and working with the Debian Operating system
  • Virtual Machines and Operating Systems I constantly view how different operating systems are booted and run on virtual machines during testing
  • Testing I keep watch of needles and ensure the tests work as required
  • Debian I use a Debian system to run my Virtual Machines
  • OpenQA the tool that is used to automate testing of Images

With open source comes the need of constant communication. The community is diverse and the team is usually on different time zones. These are some of the soft / social skills I gained when working with the team

  • Communication this is essential especially in taking tasks with confidence, talking about issues encountered and stating the progress of the tasks
  • Interpersonal skills this is for general communication within the community
  • Flexibility we have to adapt to changes because we are a community of different people with different skills

With these skills and the willingness to learn , open source is a great area to focus on . Aside from your career you will extend your network. My interests are set on open source and Linux in general. Working with a wider network has really skilled me up and I will continue learning. Working with the Debian OpenQA team has been very great. The team is great at communication and I learn every day. The knowledge I gain from the team is helping me build a great career in open source.

1 February 2026

Benjamin Mako Hill: What do people do when they edit Wikipedia through Tor?

Note: I have not published blog posts about my academic papers over the past few years. To ensure that my blog contains a more comprehensive record of my published papers and to surface these for folks who missed them, I will be periodically (re)publishing blog posts about some older published projects. This post is closely based on a previously published post by Kaylea Champion on the Community Data Science Blog.

Many individuals use Tor to reduce their visibility to widespread internet surveillance.
One popular approach to protecting our privacy online is to use the Tor network. Tor protects users from being identified by their IP address, which can be tied to a physical location. However, if you d like to contribute to Wikipedia using Tor, you ll run into a problem. Although most IP addresses can edit without an account, Tor users are blocked from editing.
Tor users attempting to contribute to Wikipedia are shown a screen that informs them that they are not allowed to edit Wikipedia.
Other research by my team has shown that Wikipedia s attempt to block Tor is imperfect and that some people have been able to edit despite the ban. As part of this work, we built a dataset of more than 11,000 contributions to Wikipedia via Tor and used quantitative analysis to show that contributions from Tor were of about the same quality as contributions from other new editors and other contributors without accounts. Of course, given the unusual circumstances Tor-based contributors face, we wondered whether a deeper look at the content of their edits might reveal more about their motives and the kinds of contributions they seek to make. Kaylea Champion (then a student, now faculty at UW Bothell) led a qualitative investigation to explore these questions. Given the challenges of studying anonymity seekers, we designed a novel forensic qualitative approach inspired by techniques common in computer security and criminal investigation. We applied this new technique to a sample of 500 editing sessions and categorized each session based on what the editor seemed to be intending to do. Most of the contributions we found fell into one of the two following categories: Although these were most of what we observed, we also found evidence of several types of contributor intent:
An exploratory mapping of our themes in terms of the value a type of contribution represents to the Wikipedia community and the importance of anonymity in facilitating it. Anonymity-protecting tools play a critical role in facilitating contributions on the right side of the figure, while edits on the left are more likely to occur even when anonymity is impossible. Contributions toward the top reflect valuable forms of participation in Wikipedia, while edits at the bottom reflect damage.
In all, these themes led us to reflect on how the risks individuals face when contributing to online communities sometimes diverge from the risks the communities face in accepting their work. Expressing minoritized perspectives, maintaining community standards even when you may be targeted by the rulebreaker, highlighting injustice, or acting as a whistleblower can be very risky for an individual, and may not be possible without privacy protections. Of course, in platforms seeking to support the public good, such knowledge and accountability may be crucial.

This work was published as a paper at CSCW: Kaylea Champion, Nora McDonald, Stephanie Bankes, Joseph Zhang, Rachel Greenstadt, Andrea Forte, and Benjamin Mako Hill. 2019. A Forensic Qualitative Analysis of Contributions to Wikipedia from Anonymity Seeking Users. Proceedings of the ACM on Human-Computer Interaction. 3, CSCW, Article 53 (November 2019), 26 pages. https://doi.org/10.1145/3359155

This project was conducted by Kaylea Champion, Nora McDonald, Stephanie Bankes, Joseph Zhang, Rachel Greenstadt, Andrea Forte, and Benjamin Mako Hill. This work was supported by the National Science Foundation (awards CNS-1703736 and CNS-1703049) and included the work of two undergraduates supported through an NSF REU supplement.

Russ Allbery: Review: Paladin's Faith

Review: Paladin's Faith, by T. Kingfisher
Series: The Saint of Steel #4
Publisher: Red Wombat Studio
Copyright: 2023
ISBN: 1-61450-614-0
Format: Kindle
Pages: 515
Paladin's Faith is the fourth book in T. Kingfisher's loosely connected series of fantasy novels about the berserker former paladins of the Saint of Steel. You could read this as a standalone, but there are numerous (spoilery) references to the previous books in the series. Marguerite, who was central to the plot of the first book in the series, Paladin's Grace, is a spy with a problem. An internal power struggle in the Red Sail, the organization that she's been working for, has left her a target. She has a plan for how to break their power sufficiently that they will hopefully leave her alone, but to pull it off she's going to need help. As the story opens, she is working to acquire that help in a very Marguerite sort of way: breaking into the office of Bishop Beartongue of the Temple of the White Rat. The Red Sail, the powerful merchant organization Marguerite worked for, makes their money in the salt trade. Marguerite has learned that someone invented a cheap and reproducible way to extract salt from sea water, thus making the salt trade irrelevant. The Red Sail wants to ensure that invention never sees the light of day, and has forced the artificer into hiding. Marguerite doesn't know where they are, but she knows where she can find out: the Court of Smoke, where the artificer has a patron.
Having grown up in Anuket City, Marguerite was familiar with many clockwork creations, not to mention all the ways that they could go horribly wrong. (Ninety-nine times out of a hundred, it was an explosion. The hundredth time, it ran amok and stabbed innocent bystanders, and the artificer would be left standing there saying, "But I had to put blades on it, or how would it rake the leaves?" while the gutters filled up with blood.)
All Marguerite needs to put her plan into motion is some bodyguards so that she's not constantly distracted and anxious about being assassinated. Readers of this series will be unsurprised to learn that the bodyguards she asks Beartongue for are paladins, including a large broody male one with serious self-esteem problems. This is, like the other books in this series, a slow-burn romance with infuriating communication problems and a male protagonist who would do well to seek out a sack of hammers as a mentor. However, it has two things going for it that most books in this series do not: a long and complex plot to which the romance takes a back seat, and Marguerite, who is not particularly interested in playing along with the expected romance developments. There are also two main paladins in this story, not just one, and the other is one of the two female paladins of the Saint of Steel and rather more entertaining than Shane. I generally like court intrigue stories, which is what fills most of this book. Marguerite is an experienced operative, so the reader gets some solid competence porn, and the paladins are fish out of water but are also unexpectedly dangerous, which adds both comedy and satisfying table-turning. I thoroughly enjoyed the maneuvering and the culture clashes. Marguerite is very good at what she does, knows it, and is entirely uninterested in other people's opinions about that, which short-circuits a lot of Shane's most annoying behavior and keeps the story from devolving into mopey angst like some of the books in this series have done. The end of this book takes the plot in a different direction that adds significantly to the world-building, but also has a (thankfully short) depths of despair segment that I endured rather than enjoyed. I am not really in the mood for bleak hopelessness in my fiction at the moment, even if the reader is fairly sure it will be temporary. But apart from that, I thoroughly enjoyed this book from beginning to end. When we finally meet the artificer, they are an absolute delight in that way that Kingfisher is so good at. The whole story is infused with the sense of determined and competent people refusing to stop trying to fix problems. As usual, the romance was not for me and I think the book would have been better without it, but it's less central to the plot and therefore annoyed me less than any of the books in this series so far. My one major complaint is the lack of gnoles, but we get some new and intriguing world-building to make up for it, along with a setup for a fifth book that I am now extremely curious about. By this point in the series, you probably know if you like the general formula. Compared to the previous book, Paladin's Hope, I thought Paladin's Faith was much stronger and more interesting, but it's clearly of the same type. If, like me, you like the plots but not the romance, the plot here is more substantial. You will have to decide if that makes up for a romance in the typical T. Kingfisher configuration. Personally, I enjoyed this quite a bit, except for the short bleak part, and I'm back to eagerly awaiting the next book in the series. Rating: 8 out of 10

31 January 2026

Michael Prokop: apt, SHA-1 keys + 2026-02-01

You might have seen Policy will reject signature within a year warnings in apt(-get) update runs like this:
root@424812bd4556:/# apt update
Get:1 http://foo.example.org/debian demo InRelease [4229 B]
Hit:2 http://deb.debian.org/debian trixie InRelease
Hit:3 http://deb.debian.org/debian trixie-updates InRelease
Hit:4 http://deb.debian.org/debian-security trixie-security InRelease
Get:5 http://foo.example.org/debian demo/main amd64 Packages [1097 B]
Fetched 5326 B in 0s (43.2 kB/s)
All packages are up to date.
Warning: http://foo.example.org/debian/dists/demo/InRelease: Policy will reject signature within a year, see --audit for details
root@424812bd4556:/# apt --audit update
Hit:1 http://foo.example.org/debian demo InRelease
Hit:2 http://deb.debian.org/debian trixie InRelease
Hit:3 http://deb.debian.org/debian trixie-updates InRelease
Hit:4 http://deb.debian.org/debian-security trixie-security InRelease
All packages are up to date.    
Warning:  http://foo.example.org/debian/dists/demo/InRelease: Policy will reject signature within a year, see --audit for details
Audit:  http://foo.example.org/debian/dists/demo/InRelease: Sub-process /usr/bin/sqv returned an error code (1), error message is:
   Signing key on 54321ABCD6789ABCD0123ABCD124567ABCD89123 is not bound:
              No binding signature at time 2024-06-19T10:33:47Z
     because: Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance
     because: SHA1 is not considered secure since 2026-02-01T00:00:00Z
Audit: The sources.list(5) entry for 'http://foo.example.org/debian' should be upgraded to deb822 .sources
Audit: Missing Signed-By in the sources.list(5) entry for 'http://foo.example.org/debian'
Audit: Consider migrating all sources.list(5) entries to the deb822 .sources format
Audit: The deb822 .sources format supports both embedded as well as external OpenPGP keys
Audit: See apt-secure(8) for best practices in configuring repository signing.
Audit: Some sources can be modernized. Run 'apt modernize-sources' to do so.
If you ignored this for the last year, I would like to tell you that 2026-02-01 is not that far away (hello from the past if you re reading this because you re already affected). Let s simulate the future:
root@424812bd4556:/# apt --update -y install faketime
[...]
root@424812bd4556:/# export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/faketime/libfaketime.so.1 FAKETIME="2026-08-29 23:42:11" 
root@424812bd4556:/# date
Sat Aug 29 23:42:11 UTC 2026
root@424812bd4556:/# apt update
Get:1 http://foo.example.org/debian demo InRelease [4229 B]
Hit:2 http://deb.debian.org/debian trixie InRelease                                 
Err:1 http://foo.example.org/debian demo InRelease
  Sub-process /usr/bin/sqv returned an error code (1), error message is: Signing key on 54321ABCD6789ABCD0123ABCD124567ABCD89123 is not bound:            No binding signature at time 2024-06-19T10:33:47Z   because: Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance   because: SHA1 is not considered secure since 2026-02-01T00:00:00Z
[...]
Warning: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. OpenPGP signature verification failed: http://foo.example.org/debian demo InRelease: Sub-process /usr/bin/sqv returned an error code (1), error message is: Signing key on 54321ABCD6789ABCD0123ABCD124567ABCD89123 is not bound:            No binding signature at time 2024-06-19T10:33:47Z   because: Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance   because: SHA1 is not considered secure since 2026-02-01T00:00:00Z
[...]
root@424812bd4556:/# echo $?
100
Now, the proper solution would have been to fix the signing key underneath (via e.g. sq cert lint &dash&dashfix &dash&dashcert-file $PRIVAT_KEY_FILE > $PRIVAT_KEY_FILE-fixed). If you don t have access to the according private key (e.g. when using an upstream repository that has been ignoring this issue), you re out of luck for a proper fix. But there s a workaround for the apt situation (related see apt commit 0989275c2f7afb7a5f7698a096664a1035118ebf):
root@424812bd4556:/# cat /usr/share/apt/default-sequoia.config
# Default APT Sequoia configuration. To overwrite, consider copying this
# to /etc/crypto-policies/back-ends/apt-sequoia.config and modify the
# desired values.
[asymmetric_algorithms]
dsa2048 = 2024-02-01
dsa3072 = 2024-02-01
dsa4096 = 2024-02-01
brainpoolp256 = 2028-02-01
brainpoolp384 = 2028-02-01
brainpoolp512 = 2028-02-01
rsa2048  = 2030-02-01
[hash_algorithms]
sha1.second_preimage_resistance = 2026-02-01    # Extend the expiry for legacy repositories
sha224 = 2026-02-01
[packets]
signature.v3 = 2026-02-01   # Extend the expiry
Adjust this according to your needs:
root@424812bd4556:/# mkdir -p /etc/crypto-policies/back-ends/
root@424812bd4556:/# cp /usr/share/apt/default-sequoia.config /etc/crypto-policies/back-ends/apt-sequoia.config
root@424812bd4556:/# $EDITOR /etc/crypto-policies/back-ends/apt-sequoia.config
root@424812bd4556:/# cat /etc/crypto-policies/back-ends/apt-sequoia.config
# APT Sequoia override configuration
[asymmetric_algorithms]
dsa2048 = 2024-02-01
dsa3072 = 2024-02-01
dsa4096 = 2024-02-01
brainpoolp256 = 2028-02-01
brainpoolp384 = 2028-02-01
brainpoolp512 = 2028-02-01
rsa2048  = 2030-02-01
[hash_algorithms]
sha1.second_preimage_resistance = 2026-09-01    # Extend the expiry for legacy repositories
sha224 = 2026-09-01
[packets]
signature.v3 = 2026-02-01   # Extend the expiry
Then we re back into the original situation, being a warning instead of an error:
root@424812bd4556:/# apt update
Hit:1 http://deb.debian.org/debian trixie InRelease
Get:2 http://foo.example.org/debian demo InRelease [4229 B]
Hit:3 http://deb.debian.org/debian trixie-updates InRelease
Hit:4 http://deb.debian.org/debian-security trixie-security InRelease
Warning: http://foo.example.org/debian/dists/demo/InRelease: Policy will reject signature within a year, see --audit for details
[..]
Please note that this is a workaround, and not a proper solution.

Russ Allbery: Review: Dragon Pearl

Review: Dragon Pearl, by Yoon Ha Lee
Series: Thousand Worlds #1
Publisher: Rick Riordan Presents
Copyright: 2019
ISBN: 1-368-01519-0
Format: Kindle
Pages: 315
Dragon Pearl is a middle-grade space fantasy based on Korean mythology and the first book of a series. Min is a fourteen-year-old girl living on the barely-terraformed world of Jinju with her extended family. Her older brother Jun passed the entrance exams for the Academy and left to join the Thousand Worlds Space Forces, and Min is counting the years until she can do the same. Those plans are thrown into turmoil when an official investigator appears at their door claiming that Jun deserted to search for the Dragon Pearl. A series of impulsive fourteen-year-old decisions lead to Min heading for a spaceport alone, determined to find her brother and prove his innocence. This would be a rather improbable quest for a young girl, but Min is a gumiho, one of the supernaturals who live in the Thousand Worlds alongside non-magical humans. Unlike the more respectable dragons, tigers, goblins, and shamans, gumiho are viewed with suspicion and distrust because their powers are useful for deception. They are natural shapeshifters who can copy the shapes of others, and their Charm ability lets them influence people's thoughts and create temporary illusions of objects such as ID cards. It will take all of Min's powers, and some rather lucky coincidences, to infiltrate the Space Forces and determine what happened to her brother. It's common for reviews of this book to open with a caution that this is a middle-grade adventure novel and you should not expect a story like Ninefox Gambit. I will be boring and repeat that caution. Dragon Pearl has a single first-person viewpoint and a very linear and straightforward plot. Adult readers are unlikely to be surprised by plot twists; the fun is the world-building and seeing how Min manages to work around plot obstacles. The world-building is enjoyable but not very rigorous. Min uses and abuses Charm with the creative intensity of a Dungeons & Dragons min-maxer. Each individual event makes sense given the implication that Min is unusually powerful, but I'm dubious about the surrounding society and lack of protections against Charm given what Min is able to do. Min does say that gumiho are rare and many people think they're extinct, which is a bit of a fig leaf, but you'll need to bring your urban fantasy suspension of disbelief skills to this one. I did like that the world-building conceit went more than skin deep and influenced every part of the world. There are ghosts who are critical to the plot. Terraforming is done through magic, hence the quest for the Dragon Pearl and the miserable state of Min's home planet due to its loss. Medical treatment involves the body's meridians, as does engineering: The starships have meridians similar to those of humans, and engineers partly merge with those meridians to adjust them. This is not the sort of book that tries to build rigorous scientific theories or explain them to the reader, and I'm not sure everything would hang together if you poked at it too hard, but Min isn't interested in doing that poking and the story doesn't try to justify itself. It's mostly a vibe, but it's a vibe that I enjoyed and that is rather different than other space fantasy I've read. The characters were okay but never quite clicked for me, in part because proper character exploration would have required Min take a detour from her quest to find her brother and that was not going to happen. The reader gets occasional glimpses of a military SF cadet story and a friendship on false premises story, but neither have time to breathe because Min drops any entanglement that gets in the way of her quest. She's almost amoral in a way that I found believable but not quite aligned with my reading mood. I also felt a bit wrong-footed by how her friendships developed; saying too much more would be a spoiler, but I was expecting more human connection than I got. I think my primary disappointment with this book was something I knew going in, not in any way its fault, and part of the reason why I'd put off reading it: This is pitched at young teenagers and didn't have quite enough plot and characterization complexity to satisfy me. It's a linear, somewhat episodic adventure story with some neat world-building, and it therefore glides over the spots where an adult novel would have added political and factional complexity. That is exactly as advertised, so it's up to you whether that's the book you're in the mood for. One warning: The text of this book opens with an introduction by Rick Riordan that is just fluff marketing and that spoils the first few chapters of the book. It is unmarked as such at the beginning and tricked me into thinking it was the start of the book proper, and then deeply annoyed me. If you do read this book, I recommend skipping the utterly pointless introduction and going straight to chapter one. Followed by Tiger Honor. Rating: 6 out of 10

30 January 2026

Joey Hess: the local weather

Snow coming. I'm tuned into the local 24 hour slop weather stream. AI generated, narrated, up to the minute radar and forecast graphics. People popping up on the live weather map with questions "snow soon?" (They pay for the privilege.) LLM generating reply that riffs on their name. Tuned to keep the urgency up, something is always happening somewhere, scanners are pulling the police reports, live webcam description models add verisimilitude to the description of the morning commute. Weather is happening. In the subtext, climate change is happening. Weather is a growth industry. The guy up in Kentucky coal country who put this thing together is building an empire. He started as just another local news greenscreener. Dropped out and went twitch weather stream. Hyping up tornado days and dicy snow forecasts. Nowcasting, hyper individualized, interacting with chat. Now he's automated it all. On big days when he's getting real views, the bot breaks into his live streams, gives him a break. Only a few thousand watching this morning yet. Perfect 2026 grade slop. Details never quite right, but close enough to keep on in the background all day. Nobody expects a perfect forecast after all, and it's fed from the National Weather Center discussion too. We still fund those guys? Why bother when a bot can do it? He knows why he's big in these states, these rural areas. Understands the target audience. Airbrushed AI aesthetics are ok with them, receive no pushback. Flying more under the radar coastally, but weather is big there and getting bigger. The local weather will come for us all. 6 inches of snow covering some ground mount solar panels with a vertical solar panel fence behind them free of snow except cute little caps (Not fiction FYI.)

Utkarsh Gupta: FOSS Activites in January 2026

Here s my monthly but brief update about the activities I ve done in the FOSS world.

Debian
Whilst I didn t get a chance to do much, here are still a few things that I worked on:
  • A few discussions with the new DFSG team, et al.
  • Assited a few folks in getting their patches submitted via Salsa.
  • Reviewing pyenv MR for Ujjwal.
  • Mentoring for newcomers.
  • Moderation of -project mailing list.

Ubuntu
I joined Canonical to work on Ubuntu full-time back in February 2021. Whilst I can t give a full, detailed list of things I did, here s a quick TL;DR of what I did:
  • Successfully released Resolute Snapshot 3!
    • This one was also done without the ISO tracker and cdimage access.
    • We also worked very hard to build and promote all the image in due time.
  • Worked further on the whole artifact signing story for cdimage.
  • Assisted a bunch of folks with my Archive Admin and Release team hats to:
    • Helped in EOL ing Plucky.
    • Starting to help with the upcoming 24.04.4 release.
  • With that, the mid-cycle sprints are around the corner, so quite busy preparing for that.

Debian (E)LTS
This month I have worked 59 hours on Debian Long Term Support (LTS) and on its sister Extended LTS project and did the following things:

Released Security Updates

Work in Progress
  • knot-resolver: Affected by CVE-2023-26249, CVE-2023-46317, and CVE-2022-40188, leading to Denial of Service.
  • ruby-rack: There were multiple vulnerabilities reported in Rack, leading to DoS (memory exhaustion) and proxy bypass.
    • [ELTS]: After completing the work for LTS myself, Bastien picked it up for ELTS and reached out about an upstream regression and we ve been doing some exchanges. Bastien has done most of the work backporting the patches but needs a review and help backporting CVE-2025-61771. Haven t made much progress since last month and will carry it over.
  • node-lodash: Affected by CVE-2025-13465, lrototype pollution in baseUnset function.
    • [stable]: The patch for trixie and bookworm are ready but haven t been uploaded yet as I d like for the unstable upload to settle a bit before I proceed with stable uploads.
    • [LTS]: The bullseye upload will follow once the stable uploads are in and ACK d by the SRMs.
  • xrdp: Affected by CVE-2025-68670, leading to a stack-based buffer overflow.

Other Activities
  • [ELTS] Helped Bastien Roucaries debug a tomcat9 regression for buster.
    • I spent quite a lot of time trying to help Bastien (with Markus and Santiago involved via mail thread) by reproducing the regression that the user(s) reported.
    • I also helped suggest a path forward by vendoring everything, which I was then requested to also help perform.
    • Whilst doing that, I noticed circular dependency hellhole and suggested another path forward by backporting bnd and its dependencies as separate NEW packages.
    • Bastien liked the idea and is going to work on that but preferred to revert the update to remedy the immediate regressions reported. I further helped him in reviewing his update. This conversation happened on #debian-elts IRC channel.
  • [LTS] Assisted Ben Hutchings with his question about the next possible steps with a plausible libvirt regression caused by the Linux kernel update. This was a thread on debian-lts@ mailing list.
  • [LTS] Attended the monthly LTS meeting on IRC. Summary here.
  • [E/LTS] Monitored discussions on mailing lists, IRC, and all the documentation updates.

Until next time.
:wq for today.

29 January 2026

C.J. Collier: Part 3: Building the Keystone Dataproc Custom Images for Secure Boot & GPUs

Part
3: Building the Keystone Dataproc Custom Images for Secure Boot &
GPUs In Part 1, we established a secure, proxy-only network. In Part 2, we
explored the enhanced install_gpu_driver.sh initialization
action. Now, in Part 3, we ll focus on using the LLC-Technologies-Collier/custom-images
repository (branch proxy-exercise-2025-11) to build the
actual custom Dataproc images embedded with NVIDIA drivers signed for
Secure Boot, all within our proxied environment.

Why Custom Images? To run NVIDIA GPUs on Shielded VMs with Secure Boot enabled, the
NVIDIA kernel modules must be signed with a key trusted by the VM s EFI
firmware. Since standard Dataproc images don t include these
custom-signed modules, we need to build our own. This process also
allows us to pre-install a full stack of GPU-accelerated software.

The
custom-images Toolkit
(examples/secure-boot) The examples/secure-boot directory within the
custom-images repository contains the necessary scripts and
configurations, refined through significant development to handle proxy
and Secure Boot challenges. Key Components & Development Insights:
  • env.json: The central configuration
    file (as used in Part 1) for project, network, proxy, and bucket
    details. This became the single source of truth to avoid configuration
    drift.
  • create-key-pair.sh: Manages the Secure
    Boot signing keys (PK, KEK, DB) in Google Secret Manager, essential for
    the module signing.
  • build-and-run-podman.sh: Orchestrates
    the image build process in an isolated Podman container. This was
    introduced to standardize the build environment and encapsulate
    dependencies, simplifying what the user needs to install locally.
  • pre-init.sh: Sets up the build
    environment within the container and calls
    generate_custom_image.py. It crucially passes metadata
    derived from env.json (like proxy settings and Secure Boot
    key secret names) to the temporary build VM.
  • generate_custom_image.py: The core
    Python script that automates GCE VM creation, runs the customization
    script, and creates the final GCE image.
  • gce-proxy-setup.sh: This script from
    startup_script/ is vital. It s injected into the temporary
    build VM and runs first to configure the OS, package
    managers (apt, dnf), tools (curl, wget, GPG), Conda, and Java to use the
    proxy settings passed in the metadata. This ensures the entire build
    process is proxy-aware.
  • install_gpu_driver.sh: Used as the
    --customization-script within the build VM. As detailed in
    Part 2, this script handles the driver/CUDA/ML stack installation and
    signing, now able to function correctly due to the proxy setup by
    gce-proxy-setup.sh.
Layered Image Strategy: The pre-init.sh script employs a layered approach:
  1. secure-boot Image: Base image with
    Secure Boot certificates injected.
  2. tf Image: Based on
    secure-boot, this image runs the full
    install_gpu_driver.sh within the proxy-configured build VM
    to install NVIDIA drivers, CUDA, ML libraries (TensorFlow, PyTorch,
    RAPIDS), and sign the modules. This is the primary target image for our
    use case.
(Note: secure-proxy and proxy-tf layers
were experiments, but the -tf image combined with runtime
metadata emerged as the most effective solution for 2.2-debian12). Build Steps:
  1. Clone Repos & Configure
    env.json: Ensure you have the
    custom-images and cloud-dataproc repos and a
    complete env.json as described in Part 1.
  2. Run the Build:
    bash # Example: Build a 2.2-debian12 based image set # Run from the custom-images repository root bash examples/secure-boot/build-and-run-podman.sh 2.2-debian12
    This command will build the layered images, leveraging the proxy
    settings from env.json via the metadata injected into the
    build VM. Note the final image name produced (e.g.,
    dataproc-2-2-deb12-YYYYMMDD-HHMMSS-tf).

Conclusion of Part 3 Through an iterative process, we ve developed a robust workflow
within the custom-images repository to build Secure
Boot-compatible GPU images in a proxy-only environment. The key was
isolating the build in Podman, ensuring the build VM is fully
proxy-aware using gce-proxy-setup.sh, and leveraging the
enhanced install_gpu_driver.sh from Part 2. In Part 4, we ll bring it all together, deploying a Dataproc cluster
using this custom -tf image within the secure network, and
verifying the end-to-end functionality.

28 January 2026

C.J. Collier: Dataproc GPUs, Secure Boot, & Proxies

Part
1: Building a Secure Network Foundation for Dataproc with GPUs &
SWP Welcome to the first post in our series on running GPU-accelerated
Dataproc workloads in secure, enterprise-grade environments. Many
organizations need to operate within VPCs that have no direct internet
egress, instead routing all traffic through a Secure Web Proxy (SWP).
Additionally, security mandates often require the use of Shielded VMs
with Secure Boot enabled. This series will show you how to meet these
requirements for your Dataproc GPU clusters. In this post, we ll focus on laying the network foundation using
tools from the LLC-Technologies-Collier/cloud-dataproc
repository (branch proxy-sync-2026-01).

The Challenge: Network
Isolation & Control Before we can even think about custom images or GPU drivers, we need
a network environment that:
  1. Prevents direct internet access from Dataproc cluster nodes.
  2. Forces all egress traffic through a manageable and auditable
    SWP.
  3. Provides the necessary connectivity for Dataproc to function and for
    us to build images later.
  4. Supports Secure Boot for all VMs.

The Toolkit:
LLC-Technologies-Collier/cloud-dataproc To make setting up and tearing down these complex network
environments repeatable and consistent, we ve developed a set of bash
scripts within the gcloud directory of the
cloud-dataproc repository. These scripts handle the
creation of VPCs, subnets, firewall rules, service accounts, and the
Secure Web Proxy itself. Key Script:
gcloud/bin/create-dpgce-private This script is the cornerstone for creating the private, proxied
environment. It automates:
  • VPC and Subnet creation (for the cluster, SWP, and management).
  • Setup of Certificate Authority Service and Certificate Manager for
    SWP TLS interception.
  • Deployment of the SWP Gateway instance.
  • Configuration of a Gateway Security Policy to control egress.
  • Creation of necessary firewall rules.
  • Result: Cluster nodes in this VPC have NO default
    internet route and MUST use the SWP.
Configuration via env.json We use a single env.json file to drive the
configuration. This file will also be used by the
custom-images scripts in Part 3. This env.json
should reside in your custom-images repository clone, and
you ll symlink it into the cloud-dataproc/gcloud
directory. Running the Setup:
# Assuming you have cloud-dataproc and custom-images cloned side-by-side
# And your env.json is in the custom-images root
cd cloud-dataproc/gcloud
# Symlink to the env.json in custom-images
ln -sf ../../custom-images/env.json env.json
# Run the creation script, but don't create a cluster yet
bash bin/create-dpgce-private --no-create-cluster
cd ../../custom-images

Node
Configuration: The Metadata Startup Script for Runtime For the Dataproc cluster nodes to function correctly in this proxied
environment, they need to be configured to use the SWP on boot. We
achieve this using a GCE metadata startup script. The script startup_script/gce-proxy-setup.sh (from the
custom-images repository) is designed to be run on each
cluster node at boot. It reads metadata like http-proxy and
http-proxy-pem-uri (which our cluster creation scripts in
Part 4 will pass) to configure the OS environment, package managers, and
other tools to use the SWP. Upload this script to your GCS bucket:
# Run from the custom-images repository root
gsutil cp startup_script/gce-proxy-setup.sh gs://$(jq -r .BUCKET env.json)/custom-image-deps/
This script is essential for the runtime behavior of the
cluster nodes.

Conclusion of Part 1 With the cloud-dataproc scripts, we ve laid the
groundwork by provisioning a secure VPC with controlled egress through
an SWP. We ve also prepared the essential node-level proxy configuration
script (gce-proxy-setup.sh) in GCS, ready to be used by our
clusters. Stay tuned for Part 2, where we ll dive into the
install_gpu_driver.sh initialization action from the
LLC-Technologies-Collier/initialization-actions repository
(branch gpu-202601) and how it s been adapted to install
all GPU-related software through the proxy during the image build
process.

27 January 2026

Elana Hashman: A beginner's guide to improving your digital security

In 2017, I led a series of workshops aimed at teaching beginners a better understanding of encryption, how the internet works, and their digital security. Nearly a decade later, there is still a great need to share reliable resources and guides on improving these skills. I have worked professionally in computer security one way or another for well over a decade, at many major technology companies and in many open source software projects. There are many inaccurate and unreliable resources out there on this subject, put together by well-meaning people without a background in security, which can lead to sharing misinformation, exaggeration and fearmongering. I hope that I can offer you a trusted, curated list of high impact things that you can do right now, using whichever vetted guide you prefer. In addition, I also include how long it should take, why you should do each task, and any limitations. This guide is aimed at improving your personal security, and does not apply to your work-owned devices. Always assume your company can monitor all of your messages and activities on work devices. What can I do to improve my security right away? I put together this list in order of effort, easiest tasks first. You should be able to complete many of the low effort tasks in a single hour. The medium to high effort tasks are very much worth doing, but may take you a few days or even weeks to complete them. Low effort (<15 minutes) Upgrade your software to the latest versions Why? I don't know anyone who hasn't complained about software updates breaking features, introducing bugs, and causing headaches. If it ain't broke, why upgrade, right? Well, alongside all of those annoying bugs and breaking changes, software updates also include security fixes, which will protect your device from being exploited by bad actors. Security issues can be found in software at any time, even software that's been available for many years and thought to be secure. You want to install these as soon as they are available. Recommendation: Turn on automatic upgrades and always keep your devices as up-to-date as possible. If you have some software you know will not work if you upgrade it, at least be sure to upgrade your laptop and phone operating system (iOS, Android, Windows, etc.) and web browser (Chrome, Safari, Firefox, etc.). Do not use devices that do not receive security support (e.g. old Android or iPhones). Guides: Limitations: This will prevent someone from exploiting known security issues on your devices, but it won't help if your device was already compromised. If this is a concern, doing a factory reset, upgrade, and turning on automatic upgrades may help. This also won't protect against all types of attacks, but it is a necessary foundation. Use Signal Why? Signal is a trusted, vetted, secure messaging application that allows you to send end-to-end encrypted messages and make video/phone calls. This means that only you and your intended recipient can decrypt the messages and someone cannot intercept and read your messages, in contrast to texting (SMS) and other insecure forms of messaging. Other applications advertise themselves as end-to-end encrypted, but Signal provides the strongest protections. Recommendation: I recommend installing the Signal app and using it! My mom loves that she can video call me on Wi-Fi on my Android phone. It also supports group chats. I use it as a secure alternative to texting (SMS) and other chat platforms. I also like Signal's "disappearing messages" feature which I enable by default because it automatically deletes messages after a certain period of time. This avoids your messages taking up too much storage. Guides: Limitations: Signal is only able to protect your messages in transit. If someone has access to your phone or the phone of the person you sent messages to, they will still be able to read them. As a rule of thumb, if you don't want someone to read something, don't write it down! Meet in person or make an encrypted phone call where you will not be overheard. If you are talking to someone you don't know, assume your messages are as public as posting on social media. Set passwords and turn on device encryption Why? Passwords ensure that someone else can't unlock your device without your consent or knowledge. They also are required to turn on device encryption, which protects your information on your device from being accessed when it is locked. Biometric (fingerprint or face ID) locking provides some privacy, but your fingerprint or face ID can be used against your wishes, whereas if you are the only person who knows your password, only you can use it. Recommendation: Always set passwords and have device encryption enabled in order to protect your personal privacy. It may be convenient to allow kids or family members access to an unlocked device, but anyone else can access it, too! Use strong passwords that cannot be guessed avoid using names, birthdays, phone numbers, addresses, or other public information. Using a password manager will make creating and managing passwords even easier. Disable biometric unlock, or at least know how to disable it. Most devices will enable disk encryption by default, but you should double-check. Guides: Limitations: If your device is unlocked, the password and encryption will provide no protections; the device must be locked for this to protect your privacy. It is possible, though unlikely, for someone to gain remote access to your device (for example through malware or stalkerware), which would bypass these protections. Some forensic tools are also sophisticated enough to work with physical access to a device that is turned on and locked, but not a device that is turned off/freshly powered on and encrypted. If you lose your password or disk encryption key, you may lose access to your device. For this reason, Windows and Apple laptops can make a cloud backup of your disk encryption key. However, a cloud backup can potentially be disclosed to law enforcement. Install an ad blocker Why? Online ad networks are often exploited to spread malware to unsuspecting visitors. If you've ever visited a regular website and suddenly seen an urgent, flashing pop-up claiming your device was hacked, it is often due to a bad ad. Blocking ads provides an additional layer of protection against these kinds of attacks. Recommendation: I recommend everyone uses an ad blocker at all times. Not only are ads annoying and disruptive, but they can even result in your devices being compromised! Guides: Limitations: Sometimes the use of ad blockers can break functionality on websites, which can be annoying, but you can temporarily disable them to fix the problem. These may not be able to block all ads or all tracking, but they make browsing the web much more pleasant and lower risk! Some people might also be concerned that blocking ads might impact the revenue of their favourite websites or creators. In this case, I recommend either donating directly or sharing the site with a wider audience, but keep using the ad blocker for your safety. Enable HTTPS-Only Mode Why? The "S" in "HTTPS" stands for "secure". This feature, which can be enabled on your web browser, ensures that every time you visit a website, your connection is always end-to-end encrypted (just like when you use Signal!) This ensures that someone can't intercept what you search for, what pages on websites you visit, and any information you or the website share such as your banking details. Recommendation: I recommend enabling this for everyone, though with improvements in web browser security and adoption of HTTPS over the years, your devices will often do this by default! There is a small risk you will encounter some websites that do not support HTTPS, usually older sites. Guides: Limitations: HTTPS protects the information on your connection to a website. It does not hide or protect the fact that you visited that website, only the information you accessed. If the website is malicious, HTTPS does not provide any protection. In certain settings, like when you use a work-managed computer that was set up for you, it can still be possible for your IT Department to see what you are browsing, even over an HTTPS connection, because they have administrator access to your computer and the network. Medium to high effort (1+ hours) These tasks require more effort but are worth the investment. Set up a password manager Why? It is not possible for a person to remember a unique password for every single website and app that they use. I have, as of writing, 556 passwords stored in my password manager. Password managers do three important things very well:
  1. They generate secure passwords with ease. You don't need to worry about getting your digits and special characters just right; the app will do it for you, and generate long, secure passwords.
  2. They remember all your passwords for you, and you just need to remember one password to access all of them. The most common reason people's accounts get hacked online is because they used the same password across multiple websites, and one of the websites had all their passwords leaked. When you use a unique password on every website, it doesn't matter if your password gets leaked!
  3. They autofill passwords based on the website you're visiting. This is important because it helps prevent you from getting phished. If you're tricked into visiting an evil lookalike site, your password manager will refuse to fill the password.
Recommendation: These benefits are extremely important, and setting up a password manager is often one of the most impactful things you can do for your digital security. However, they take time to get used to, and migrating all of your passwords into the app (and immediately changing them!) can take a few minutes at a time... over weeks. I recommend you prioritize the most important sites, such as your email accounts, banking/financial sites, and cellphone provider. This process will feel like a lot of work, but you will get to enjoy the benefits of never having to remember new passwords and the autofill functionality for websites. My recommended password manager is 1Password, but it stores passwords in the cloud and costs money. There are some good free options as well if cost is a concern. You can also use web browser- or OS-based password managers, but I do not prefer these. Guides: Limitations: Many people are concerned about the risk of using a password manager causing all of their passwords to be compromised. For this reason, it's very important to use a vetted, reputable password manager that has passed audits, such as 1Password or Bitwarden. It is also extremely important to choose a strong password to unlock your password manager. 1Password makes this easier by generating a secret to strengthen your unlock password, but I recommend using a long, memorable password in any case. Another risk is that if you forget your password manager's password, you will lose access to all your passwords. This is why I recommend 1Password, which has you set up an Emergency Kit to recover access to your account. Set up two-factor authentication (2FA) for your accounts Why? If your password is compromised in a website leak or due to a phishing attack, two-factor authentication will require a second piece of information to log in and potentially thwart the intruder. This provides you with an extra layer of security on your accounts. Recommendation: You don't necessarily need to enable 2FA on every account, but prioritize enabling it on your most important accounts (email, banking, cellphone, etc.) There are typically a few different kinds: email-based (which is why your email account's security is so important), text message or SMS-based (which is why your cell phone account's security is so important), app-based, and hardware token-based. Email and text message 2FA are fine for most accounts. You may want to enable app- or hardware token-based 2FA for your most sensitive accounts. Guides: Limitations: The major limitation is that if you lose access to 2FA, you can be locked out of an account. This can happen if you're travelling abroad and can't access your usual cellphone number, if you break your phone and you don't have a backup of your authenticator app, or if you lose your hardware-based token. For this reason, many websites will provide you with "backup tokens" you can print them out and store them in a secure location or use your password manager. I also recommend if you use an app, you choose one that will allow you to make secure backups, such as Ente. You are also limited by the types of 2FA a website supports; many don't support app- or hardware token-based 2FA. Remove your information from data brokers Why? This is a problem that mostly affects people in the US. It surprises many people that information from their credit reports and other public records is scraped and available (for free or at a low cost) online through "data broker" websites. I have shocked friends who didn't believe this was an issue by searching for their full names and within 5 minutes being able to show them their birthday, home address, and phone number. This is a serious privacy problem! Recommendation: Opt out of any and all data broker websites to remove this information from the internet. This is especially important if you are at risk of being stalked or harassed. Guides: Limitations: It can take time for your information to be removed once you opt out, and unfortunately search engines may have cached your information for a while longer. This is also not a one-and-done process. New data brokers are constantly popping up and some may not properly honour your opt out, so you will need to check on a regular basis (perhaps once or twice a year) to make sure your data has been properly scrubbed. This also cannot prevent someone from directly searching public records to find your information, but that requires much more effort. "Recommended security measures" I think beginners should avoid We've covered a lot of tasks you should do, but I also think it's important to cover what not to do. I see many of these tools recommended to security beginners, and I think that's a mistake. For each tool, I will explain my reasoning around why I don't think you should use it, and the scenarios in which it might make sense to use. "Secure email" What is it? Many email providers, such as Proton Mail, advertise themselves as providing secure email. They are often recommended as a "more secure" alternative to typical email providers such as GMail. What's the problem? Email is fundamentally insecure by design. The email specification (RFC-3207) states that any publicly available email server MUST NOT require the use of end-to-end encryption in transit. Email providers can of course provide additional security by encrypting their copies of your email, and providing you access to your email by HTTPS, but the messages themselves can always be sent without encryption. Some platforms such as Proton Mail advertise end-to-end encrypted emails so long as you email another Proton user. This is not truly email, but their own internal encrypted messaging platform that follows the email format. What should I do instead? Use Signal to send encrypted messages. NEVER assume the contents of an email are secure. Who should use it? I don't believe there are any major advantages to using a service such as this one. Even if you pay for a more "secure" email provider, the majority of your emails will still be delivered to people who don't. Additionally, while I don't use or necessarily recommend their service, Google offers an Advanced Protection Program for people who may be targeted by state-level actors. PGP/GPG Encryption What is it? PGP ("Pretty Good Privacy") and GPG ("GNU Privacy Guard") are encryption and cryptographic signing software. They are often recommended to encrypt messages or email. What's the problem? GPG is decades old and its usability has always been terrible. It is extremely easy to accidentally send a message that you thought was encrypted without encryption! The problems with PGP/GPG have been extensively documented. What should I do instead? Use Signal to send encrypted messages. Again, NEVER use email for sensitive information. Who should use it? Software developers who contribute to projects where there is a requirement to use GPG should continue to use it until an adequate alternative is available. Everyone else should live their lives in PGP-free bliss. Installing a "secure" operating system (OS) on your phone What is it? There are a number of self-installed operating systems for Android phones, such as GrapheneOS, that advertise as being "more secure" than using the version of the Android operating system provided by your phone manufacturer. They often remove core Google APIs and services to allow you to "de-Google" your phone. What's the problem? These projects are relatively niche, and don't have nearly enough resourcing to be able to respond to the high levels of security pressure Android experiences (such as against the forensic tools I mentioned earlier). You may suddenly lose security support with no notice, as with CalyxOS. You need a high level of technical know-how and a lot of spare time to maintain your device with a custom operating system, which is not a reasonable expectation for the average person. By stripping all Google APIs such as Google Play Services, some useful apps can no longer function. And some law enforcement organizations have gone as far as accusing people who install GrapheneOS on Pixel phones to be engaging in criminal activity. What should I do instead? For the best security on an Android device, use a phone manufactured by Google or Samsung (smaller manufacturers are more unreliable), or consider buying an iPhone. Make sure your device is receiving security updates and up-to-date. Who should use it? These projects are great for tech enthusiasts who are interested in contributing to and developing them further. They can be used to give new life to old phones that are not receiving security or software updates. They are also great for people with an interest in free and open source software and digital autonomy. But these tools are not a good choice for a general audience, nor do they provide more practical security than using an up-to-date Google or Samsung Android phone. Virtual Private Network (VPN) Services What is it? A virtual private network or VPN service can provide you with a secure tunnel from your device to the location that the VPN operates. This means that if I am using my phone in Seattle connected to a VPN in Amsterdam, if I access a website, it appears to the website that my phone is located in Amsterdam. What's the problem? VPN services are frequently advertised as providing security or protection from nefarious bad actors, or helping protect your privacy. These benefits are often far overstated, and there are predatory VPN providers that can actually be harmful. It costs money and resources to provide a VPN, so free VPN services are especially suspect. When you use a VPN, the VPN provider knows the websites you are visiting in order to provide you with the service. Free VPN providers may sell this data in order to cover the cost of providing the service, leaving you with less security and privacy. The average person does not have the knowledge to be able to determine if a VPN service is trustworthy or not. VPNs also don't provide any additional encryption benefits if you are already using HTTPS. They may provide a small amount of privacy benefit if you are connected to an untrusted network with an attacker. What should I do instead? Always use HTTPS to access websites. Don't connect to untrusted internet providers for example, use cellphone network data instead of a sketchy Wi-Fi access point. Your local neighbourhood coffee shop is probably fine. Who should use it? There are three main use cases for VPNs. The first is to bypass geographic restrictions. A VPN will cause all of your web traffic to appear to be coming from another location. If you live in an area that has local internet censorship policies, you can use a VPN to access the internet from a location that lacks such policies. The second is if you know your internet service provider is actively hostile or malicious. A trusted VPN will protect the visibility of all your traffic, including which websites you visit, from your internet service provider, and the only thing they will be able to see is that you are accessing a VPN. The third use case is to access a network that isn't connected to the public internet, such as a corporate intranet. I strongly discourage the use of VPNs for "general-purpose security." Tor What is it? Tor, "The Onion Router", is a free and open source software project that provides anonymous networking. Unlike with a VPN, where the VPN provider knows who you are and what websites you are requesting, Tor's architecture makes it extremely difficult to determine who sent a request. What's the problem? Tor is difficult to set up properly; similar to PGP-encrypted email, it is possible to accidentally not be connected to Tor and not know the difference. This usability has improved over the years, but Tor is still not a good tool for beginners to use. Due to the way Tor works, it is also extremely slow. If you have used cable or fiber internet, get ready to go back to dialup speeds. Tor also doesn't provide perfect privacy and without a strong understanding of its limitations, it can be possible to deanonymize someone despite using it. Additionally, many websites are able to detect connections from the Tor network and block them. What should I do instead? If you want to use Tor to bypass censorship, it is often better to use a trusted VPN provider, particularly if you need high bandwidth (e.g. for streaming). If you want to use Tor to access a website anonymously, Tor itself might not be enough to protect you. For example, if you need to provide an email address or personal information, you can decline to provide accurate information and use a masked email address. A friend of mine once used the alias "Nunya Biznes" Who should use it? Tor should only be used by people who are experienced users of security tools and understand its strengths and limitations. Tor also is best used on a purpose-built system, such as Tor Browser or Freedom of the Press Foundation's SecureDrop. I want to learn more! I hope you've found this guide to be a useful starting point. I always welcome folks reaching out to me with questions, though I might take a little bit of time to respond. You can always email me. If there's enough interest, I might cover the following topics in a future post: Stay safe out there!

26 January 2026

Otto Kek l inen: Ubuntu Pro subscription - should you pay to use Linux?

Featured image of post Ubuntu Pro subscription - should you pay to use Linux?Ubuntu Pro is a subscription offering for Ubuntu users who want to pay for the assurance of getting quick and high-quality security updates for Ubuntu. I tested it out to see how it works in practice, and to evaluate how well it works as a commercial open source service model for Linux. Anyone running Ubuntu can subscribe to it at ubuntu.com/pro/subscribe by selecting the setup type Desktops for the price of $25 per year (+applicable taxes) for Enterprise users. There is also a free version for personal use. Once you have an account, you can find your activation token at ubuntu.com/pro/dashboard, and use it to activate Ubuntu Pro on your desktop or laptop Ubuntu machine by running sudo pro attach <token>:
$ sudo pro attach aabbcc112233aabbcc112233
Enabling default service esm-apps
Updating package lists
Ubuntu Pro: ESM Apps enabled
Enabling default service esm-infra
Updating package lists
Ubuntu Pro: ESM Infra enabled
Enabling default service livepatch
Installing canonical-livepatch snap
Canonical livepatch enabled.
Unable to determine current instance-id
This machine is now attached to 'Ubuntu Pro Desktop'
You can at any time confirm the Ubuntu Pro status by running:
$ sudo pro status --all
SERVICE ENTITLED STATUS DESCRIPTION
anbox-cloud yes disabled Scalable Android in the cloud
cc-eal yes n/a Common Criteria EAL2 Provisioning Packages
esm-apps yes enabled Expanded Security Maintenance for Applications
esm-infra yes enabled Expanded Security Maintenance for Infrastructure
fips yes n/a NIST-certified FIPS crypto packages
fips-preview yes n/a Preview of FIPS crypto packages undergoing certification with NIST
fips-updates yes disabled FIPS compliant crypto packages with stable security updates
landscape yes enabled Management and administration tool for Ubuntu
livepatch yes disabled Canonical Livepatch service
realtime-kernel yes disabled Ubuntu kernel with PREEMPT_RT patches integrated
  generic yes disabled Generic version of the RT kernel (default)
  intel-iotg yes n/a RT kernel optimized for Intel IOTG platform
  raspi yes n/a 24.04 Real-time kernel optimised for Raspberry Pi
ros yes n/a Security Updates for the Robot Operating System
ros-updates yes n/a All Updates for the Robot Operating System
usg yes disabled Security compliance and audit tools
Enable services with: pro enable <service>
Account: Otto Kekalainen
Subscription: Ubuntu Pro Desktop
Valid until: Thu Mar 3 08:08:38 2026 PDT
Technical support level: essential
For a regular desktop/laptop user the most relevant service is the esm-apps, which delivers extended security updates to many applications typically used on desktop systems. Another relevant command to confirm the current subscription status is:
$ sudo pro security-status
2828 packages installed:
2143 packages from Ubuntu Main/Restricted repository
660 packages from Ubuntu Universe/Multiverse repository
13 packages from third parties
12 packages no longer available for download
To get more information about the packages, run
pro security-status --help
for a list of available options.
This machine is receiving security patching for Ubuntu Main/Restricted
repository until 2029.
This machine is attached to an Ubuntu Pro subscription.
Ubuntu Pro with 'esm-infra' enabled provides security updates for
Main/Restricted packages until 2034.
Ubuntu Pro with 'esm-apps' enabled provides security updates for
Universe/Multiverse packages until 2034. You have received 26 security
updates.
This confirms the scope of the security support. You can even run sudo pro security-status --esm-apps to get a detailed breakdown of the installed software packages in scope for Expanded Security Maintenance (ESM).

Experiences from using Ubuntu Pro for over a year Personally I have been using it on two laptop systems for well over a year now and everything seems to have worked well. I see apt is downloading software updates from https://esm.ubuntu.com/apps/ubuntu, but other than that there aren t any notable signs of Ubuntu Pro being in use. That is a good thing after all one is paying for assurance that everything works with minimum disruptions, so the system that enables smooth sailing should stay in the background and not make too much noise of itself.

Using Landscape to manage multiple Ubuntu laptops Landscape portal reports showing security update status and resource utilization Landscape.canonical.com is a fleet management system that shows information like security update status and resource utilization for the computers you administer. Ubuntu Pro attached systems under one s account are not automatically visible in Landscape, but have to be enrolled in it. To enroll an Ubuntu Pro attached desktop/laptop to Landscape, first install the required package with sudo apt install landscape-client and then run sudo landscape-config --account-name <account name> to start the configuration wizard. You can find your account name in the Landscape portal. On the last wizard question Request a new registration for this computer now? [y/N] hit y to accept. If successful, the new computer will be visible on the Landscape portal page Pending computers , from where you can click to accept it. Landscape portal page showing pending computer registration If I had a large fleet of computers, Landscape might come in useful. Also it is obvious Landscape is intended primarily for managing server systems. For example, the default alarm trigger on systems being offline, which is common for laptops and desktop computers, is an alert-worthy thing only on server systems. It is good to know that Landscape exists, but on desktop systems I would probably skip it, and only stick to the security updates offered by Ubuntu Pro without using Landscape.

Landscape is evolving The screenshots above are from the current Landscape portal which I have been using so far. Recently Canonical has also launched a new web portal, with a fresh look: New Landscape dashboard with fresh look This shows Canonical is actively investing in the service and it is likely going to sit at the center of their business model for years to come.

Other offerings by Canonical for individual users Canonical, the company behind the world s most popular desktop Linux distribution Ubuntu, has been offering various commercial support services for corporate customers since the company launched back in 2005, but there haven t been any offerings available to individual users since Ubuntu One, with file syncing, a music store and more, was wound down back in 2014. Canonical and the other major Linux companies, Red Hat and SUSE, have always been very enterprise-oriented, presumably because achieving economies of scale is much easier when maintaining standardized corporate environments compared to dealing with a wide range of custom configurations that individual consumer customers might have. I remember some years ago Canonical offered desktop support under the Ubuntu Advantage product name, but the minimum subscription was for 5 desktop systems, which typically isn t an option for a regular home consumer. I am glad to see Ubuntu Pro is now available and I honestly hope people using Ubuntu will opt into it. The more customers it has, the more it incentivizes Canonical to develop and maintain features that are important for desktop and home users.

Pay for Linux because you can, not because you have to Open source is a great software development model for rapid innovation and adoption, but I don t think the business models in the space are yet quite mature. Users who get long-term value should participate more in funding open source maintenance work. While some donation platforms like GitHub Sponsors, OpenCollective and the like have gained popularity in recent years, none of them seem to generate recurring revenue comparable to the scale of how popular open source software is now in 2026. I welcome more paid schemes, such as Ubuntu Pro, as I believe it is beneficial for the whole ecosystem. I also expect more service providers to enter this space and experiment with different open source business models and various forms of decentralized funding. Linux and open source are primarily free as in speech, but as a side effect license fees are hard to enforce and many use Linux without paying for it. The more people, corporations and even countries rely on it to stay sovereign in the information society, the more users should think about how they want to use Linux and who they want to pay to maintain it and other critical parts of the open source ecosystem.

24 January 2026

Gunnar Wolf: Finally some light for those who care about Debian on the Raspberry Pi

Finally, some light at the end of the tunnel! As I have said in this blog and elsewhere, after putting quite a bit of work into generating the Debian Raspberry Pi images between late 2018 and 2023, I had to recognize I don t have the time and energy to properly care for it. I even registered a GSoC project for it. I mentored Kurva Prashanth, who did good work on the vmdb2 scripts we use for the image generation but in the end, was unable to push them to be built in Debian infrastructure. Maybe a different approach was needed! While I adopted the images as they were conceived by Michael Stapelberg, sometimes it s easier to start from scratch and build a fresh approach. So, I m not yet pointing at a stable, proven release, but to a good promise. And I hope I m not being pushy by making this public: in the #debian-raspberrypi channel, waldi has shared the images he has created with the Debian Cloud Team s infrastructure. So, right now, the images built so far support Raspberry Pi families 4 and 5 (notably, not the 500 computer I have, due to a missing Device Tree, but I ll try to help figure that bit out Anyway, p400/500/500+ systems are not that usual). Work is underway to get the 3B+ to boot (some hackery is needed, as it only understands MBR partition schemes, so creating a hybrid image seems to be needed). Debian Cloud images for Raspberries Sadly, I don t think the effort will be extended to cover older, 32-bit-only systems (RPi 0, 1 and 2). Anyway, as this effort stabilizes, I will phase out my (stale!) work on raspi.debian.net, and will redirect it to point at the new images.

Comments Andrea Pappacoda tachi@d.o 2026-01-26 17:39:14 GMT+1 Are there any particular caveats compared to using the regular Raspberry Pi OS? Are they documented anywhere? Gunnar Wolf gwolf.blog@gwolf.org 2026-01-26 11:02:29 GMT-6 Well, the Raspberry Pi OS includes quite a bit of software that s not packaged in Debian for various reasons some of it because it s non-free demo-ware, some of it because it s RPiOS-specific configuration, some of it I don t care, I like running Debian wherever possible Andrea Pappacoda tachi@d.o 2026-01-26 18:20:24 GMT+1 Thanks for the reply! Yeah, sorry, I should ve been more specific. I also just care about the Debian part. But: are there any hardware issues or unsupported stuff, like booting from an SSD (which I m currently doing)? Gunnar Wolf gwolf.blog@gwolf.org 2026-01-26 12:16:29 GMT-6 That s beyond my knowledge Although I can tell you that:
  • Raspberry Pi OS has hardware support as soon as their new boards hit the market. The ability to even boot a board can take over a year for the mainline Linux kernel (at least, it has, both in the cases of the 4 and the 5 families).
  • Also, sometimes some bits of hardware are not discovered by the Linux kernels even if the general family boots because they are not declared in the right place of the Device Tree (i.e. the wireless network interface in the 02W is in a different address than in the 3B+, or the 500 does not fully boot while the 5B now does). Usually it is a matter of just declaring stuff in the right place, but it s not a skill many of us have.
  • Also, many RPi hats ship with their own Device Tree overlays, and they cannot always be loaded on top of mainline kernels.
Andrea Pappacoda tachi@d.o 2026-01-26 19:31:55 GMT+1
That s beyond my knowledge Although I can tell you that: Raspberry Pi OS has hardware support as soon as their new boards hit the market. The ability to even boot a board can take over a year for the mainline Linux kernel (at least, it has, both in the cases of the 4 and the 5 families).
Yeah, unfortunately I m aware of that I ve also been trying to boot OpenBSD on my rpi5 out of curiosity, but been blocked by my somewhat unusual setup involving an NVMe SSD as the boot drive :/
Also, sometimes some bits of hardware are not discovered by the Linux kernels even if the general family boots because they are not declared in the right place of the Device Tree (i.e. the wireless network interface in the 02W is in a different address than in the 3B+, or the 500 does not fully boot while the 5B now does). Usually it is a matter of just declaring stuff in the right place, but it s not a skill many of us have.
At some point in my life I had started reading a bit about device trees and stuff, but got distracted by other stuff before I could develop any familiarity with it. So I don t have the skills either :)
Also, many RPi hats ship with their own Device Tree overlays, and they cannot always be loaded on top of mainline kernels.
I m definitely not happy to hear this! Guess I ll have to try, and maybe report back once some page for these new builds materializes.

21 January 2026

Evgeni Golov: Validating cloud-init configs without being root

Somehow this whole DevOps thing is all about generating the wildest things from some (usually equally wild) template. And today we're gonna generate YAML from ERB, what could possibly go wrong?! Well, actually, quite a lot, so one wants to validate the generated result before using it to break systems at scale. The YAML we generate is a cloud-init cloud-config, and while checking that we generated a valid YAML document is easy (and we were already doing that), it would be much better if we could check that cloud-init can actually use it. Enter cloud-init schema, or so I thought. Turns out running cloud-init schema is rather broken without root privileges, as it tries to load a ton of information from the running system. This seems like a bug (or multiple), as the data should not be required for the validation of the schema itself. I've not found a way to disable that behavior. Luckily, I know Python. Enter evgeni-knows-better-and-can-write-python:
#!/usr/bin/env python3
import sys
from cloudinit.config.schema import get_schema, validate_cloudconfig_file, SchemaValidationError
try:
    valid = validate_cloudconfig_file(config_path=sys.argv[1], schema=get_schema())
    if not valid:
        raise RuntimeError("Schema is not valid")
except (SchemaValidationError, RuntimeError) as e:
    print(e)
    sys.exit(1)
The canonical1 version if this lives in the Foreman git repo, so go there if you think this will ever receive any updates. The hardest part was to understand thevalidate_cloudconfig_file API, as it will sometimes raise an SchemaValidationError, sometimes a RuntimeError and sometimes just return False. No idea why. But the above just turns it into a couple of printed lines and a non zero exit code, unless of course there are no problems, then you get peaceful silence.

19 January 2026

Russell Coker: Furilabs FLX1s

The Aim I have just got a Furilabs FLX1s [1] which is a phone running a modified version of Debian. I want to have a phone that runs all apps that I control and can observe and debug. Android is very good for what it does and there are security focused forks of Android which have a lot of potential, but for my use a Debian phone is what I want. The FLX1s is not going to be my ideal phone, I am evaluating it for use as a daily-driver until a phone that meets my ideal criteria is built. In this post I aim to provide information to potential users about what it can do, how it does it, and how to get the basic functions working. I also evaluate how well it meets my usage criteria. I am not anywhere near an average user. I don t think an average user would ever even see one unless a more technical relative showed one to them. So while this phone could be used by an average user I am not evaluating it on that basis. But of course the features of the GUI that make a phone usable for an average user will allow a developer to rapidly get past the beginning stages and into more complex stuff. Features The Furilabs FLX1s [1] is a phone that is designed to run FuriOS which is a slightly modified version of Debian. The purpose of this is to run Debian instead of Android on a phone. It has switches to disable camera, phone communication, and microphone (similar to the Librem 5) but the one to disable phone communication doesn t turn off Wifi, the only other phone I know of with such switches is the Purism Librem 5. It has a 720*1600 display which is only slightly better than the 720*1440 display in the Librem 5 and PinePhone Pro. This doesn t compare well to the OnePlus 6 from early 2018 with 2280*1080 or the Note9 from late 2018 with 2960*1440 which are both phones that I ve run Debian on. The current price is $US499 which isn t that good when compared to the latest Google Pixel series, a Pixel 10 costs $US649 and has a 2424*1080 display and it also has 12G of RAM while the FLX1s only has 8G. Another annoying thing is how rounded the corners are, it seems that round corners that cut off the content are a standard practice nowadays, in my collection of phones the latest one I found with hard right angles on the display was a Huawei Mate 10 Pro which was released in 2017. The corners are rounder than the Note 9, this annoys me because the screen is not high resolution by today s standards so losing the corners matters. The default installation is Phosh (the GNOME shell for phones) and it is very well configured. Based on my experience with older phone users I think I could give a phone with this configuration to a relative in the 70+ age range who has minimal computer knowledge and they would be happy with it. Additionally I could set it up to allow ssh login and instead of going through the phone support thing of trying to describe every GUI setting to click on based on a web page describing menus for the version of Android they are running I could just ssh in and run diff on the .config directory to find out what they changed. Furilabs have done a very good job of setting up the default configuration, while Debian developers deserve a lot of credit for packaging the apps the Furilabs people have chosen a good set of default apps to install to get it going and appear to have made some noteworthy changes to some of them. Droidian The OS is based on Android drivers (using the same techniques as Droidian [2]) and the storage device has the huge number of partitions you expect from Android as well as a 110G Ext4 filesystem for the main OS. The first issue with the Droidian approach of using an Android kernel and containers for user space code to deal with drivers is that it doesn t work that well. There are 3 D state processes (uninterrupteable sleep which usually means a kernel bug if the process remains in that state) after booting and doing nothing special. My tests running Droidian on the Note 9 also had D state processes, in this case they are D state kernel threads (I can t remember if the Note 9 had regular processes or kernel threads stuck in D state). It is possible for a system to have full functionality in spite of some kernel threads in D state but generally it s a symptom of things not working as well as you would hope. The design of Droidian is inherently fragile. You use a kernel and user space code from Android and then use Debian for the rest. You can t do everything the Android way (with the full OS updates etc) and you also can t do everything the Debian way. The TOW Boot functionality in the PinePhone Pro is really handy for recovery [3], it allows the internal storage to be accessed as a USB mass storage device. The full Android setup with ADB has some OK options for recovery, but part Android and part Debian has less options. While it probably is technically possible to do the same things in regard to OS repair and reinstall the fact that it s different from most other devices means that fixes can t be done in the same way. Applications GUI The system uses Phosh and Phoc, the GNOME system for handheld devices. It s a very different UI from Android, I prefer Android but it is usable with Phosh. IM Chatty works well for Jabber (XMPP) in my tests. It supports Matrix which I didn t test because I don t desire the same program doing Matrix and Jabber and because Matrix is a heavy protocol which establishes new security keys for each login so I don t want to keep logging in on new applications. Chatty also does SMS but I couldn t test that without the SIM caddy. I use Nheko for Matrix which has worked very well for me on desktops and laptops running Debian. Email I am currently using Geary for email. It works reasonably well but is lacking proper management of folders, so I can t just subscribe to the important email on my phone so that bandwidth isn t wasted on less important email (there is a GNOME gitlab issue about this see the Debian Wiki page about Mobile apps [4]). Music Music playing isn t a noteworthy thing for a desktop or laptop, but a good music player is important for phone use. The Lollypop music player generally does everything you expect along with support for all the encoding formats including FLAC0 a major limitation of most Android music players seems to be lack of support for some of the common encoding formats. Lollypop has it s controls for pause/play and going forward and backward one track on the lock screen. Maps The installed map program is gnome-maps which works reasonably well. It gets directions via the Graphhopper API [5]. One thing we really need is a FOSS replacement for Graphhopper in GNOME Maps. Delivery and Unboxing I received my FLX1s on the 13th of Jan [1]. I had paid for it on the 16th of Oct but hadn t received the email with the confirmation link so the order had been put on hold. But after I contacted support about that on the 5th of Jan they rapidly got it to me which was good. They also gave me a free case and screen protector to apologise, I don t usually use screen protectors but in this case it might be useful as the edges of the case don t even extend 0.5mm above the screen. So if it falls face down the case won t help much. When I got it there was an open space at the bottom where the caddy for SIMs is supposed to be. So I couldn t immediately test VoLTE functionality. The contact form on their web site wasn t working when I tried to report that and the email for support was bouncing. Bluetooth As a test of Bluetooth I connected it to my Nissan LEAF which worked well for playing music and I connected it to several Bluetooth headphones. My Thinkpad running Debian/Trixie doesn t connect to the LEAF and to headphones which have worked on previous laptops running Debian and Ubuntu. A friend s laptop running Debian/Trixie also wouldn t connect to the LEAF so I suspect a bug in Trixie, I need to spend more time investigating this. Wifi Currently 5GHz wifi doesn t work, this is a software bug that the Furilabs people are working on. 2.4GHz wifi works fine. I haven t tested running a hotspot due to being unable to get 4G working as they haven t yet shipped me the SIM caddy. Docking This phone doesn t support DP Alt-mode or Thunderbolt docking so it can t drive an external monitor. This is disappointing, Samsung phones and tablets have supported such things since long before USB-C was invented. Samsung DeX is quite handy for Android devices and that type feature is much more useful on a device running Debian than on an Android device. Camera The camera works reasonably well on the FLX1s. Until recently for the Librem 5 the camera didn t work and the camera on my PinePhone Pro currently doesn t work. Here are samples of the regular camera and the selfie camera on the FLX1s and the Note 9. I think this shows that the camera is pretty decent. The selfie looks better and the front camera is worse for the relatively close photo of a laptop screen taking photos of computer screens is an important part of my work but I can probably work around that. I wasn t assessing this camera t find out if it s great, just to find out if I have the sorts of problems I had before and it just worked. The Samsung Galaxy Note series of phones has always had decent specs including good cameras. Even though the Note 9 is old comparing to it is a respectable performance. The lighting was poor for all photos. FLX1s
Note 9
Power Use In 93 minutes having the PinePhone Pro, Librem 5, and FLX1s online with open ssh sessions from my workstation the PinePhone Pro went from 100% battery to 26%, the Librem 5 went from 95% to 69%, and the FLX1s went from 100% to 99%. The battery discharge rate of them was reported as 3.0W, 2.6W, and 0.39W respectively. Based on having a 16.7Wh battery 93 minutes of use should have been close to 4% battery use, but in any case all measurements make it clear that the FLX1s will have a much longer battery life. Including the measurement of just putting my fingers on the phones and feeling the temperature (FLX1s felt cool and the others felt hot). The PinePhone Pro and the Librem 5 have an optional Caffeine mode which I enabled for this test, without that enabled the phone goes into a sleep state and disconnects from Wifi. So those phones would use much less power with caffeine mode enabled, but they also couldn t get fast response to notifications etc. I found the option to enable a Caffeine mode switch on the FLX1s but the power use was reported as being the same both with and without it. Charging One problem I found with my phone is that in every case it takes 22 seconds to negotiate power. Even when using straight USB charging (no BC or PD) it doesn t draw any current for 22 seconds. When I connect it it will stay at 5V and varying between 0W and 0.1W (current rounded off to zero) for 22 seconds or so and then start charging. After the 22 second display the phone will make the tick sound indicating that it s charging and the power meter will measure that it s drawing some current. I added the table from my previous post about phone charging speed [6] with an extra row for the FLX1s. For charging from my PC USB ports the results were the worst ever, the port that does BC did not work at all it was looping trying to negotiate after a 22 second negotiation delay the port would turn off. The non-BC port gave only 2.4W which matches the 2.5W given by the spec for a High-power device which is what that port is designed to give. In a discussion on the Purism forum about the Librem5 charging speed one of their engineers told me that the reason why their phone would draw 2A from that port was because the cable was identifying itself as a USB-C port not a High-power device port. But for some reason out of the 7 phones I tested the FLX1s and the One Plus 6 are the only ones to limit themselves to what the port is apparently supposed to do. Also the One Plus 6 charges slowly on every power supply so I don t know if it is obeying the spec or just sucking. On a cheap AliExpress charger the FLX1s gets 5.9V and on a USB battery it gets 5.8V. Out of all 42 combinations of device and charger I tested these were the only ones to involve more than 5.1V but less than 9V. I welcome comments suggesting an explanation. The case that I received has a hole for the USB-C connector that isn t wide enough for the plastic surrounds on most of my USB-C cables (including the Dell dock). Also to make a connection requires a fairly deep insertion (deeper than the One Plus 6 or the Note 9). So without adjustment I have to take the case off to charge it. It s no big deal to adjust the hole (I have done it with other cases) but it s an annoyance.
Phone Top z640 Bottom Z640 Monitor Ali Charger Dell Dock Battery Best Worst
FLX1s FAIL 5.0V 0.49A 2.4W 4.8V 1.9A 9.0W 5.9V 1.8A 11W 4.8V 2.1A 10W 5.8V 2.1A 12W 5.8V 2.1A 12W 5.0V 0.49A 2.4W
Note9 4.8V 1.0A 5.2W 4.8V 1.6A 7.5W 4.9V 2.0A 9.5W 5.1V 1.9A 9.7W 4.8V 2.1A 10W 5.1V 2.1A 10W 5.1V 2.1A 10W 4.8V 1.0A 5.2W
Pixel 7 pro 4.9V 0.80A 4.2W 4.8V 1.2A 5.9W 9.1V 1.3A 12W 9.1V 1.2A 11W 4.9V 1.8A 8.7W 9.0V 1.3A 12W 9.1V 1.3A 12W 4.9V 0.80A 4.2W
Pixel 8 4.7V 1.2A 5.4W 4.7V 1.5A 7.2W 8.9V 2.1A 19W 9.1V 2.7A 24W 4.8V 2.3A 11.0W 9.1V 2.6A 24W 9.1V 2.7A 24W 4.7V 1.2A 5.4W
PPP 4.7V 1.2A 6.0W 4.8V 1.3A 6.8W 4.9V 1.4A 6.6W 5.0V 1.2A 5.8W 4.9V 1.4A 5.9W 5.1V 1.2A 6.3W 4.8V 1.3A 6.8W 5.0V 1.2A 5.8W
Librem 5 4.4V 1.5A 6.7W 4.6V 2.0A 9.2W 4.8V 2.4A 11.2W 12V 0.48A 5.8W 5.0V 0.56A 2.7W 5.1V 2.0A 10W 4.8V 2.4A 11.2W 5.0V 0.56A 2.7W
OnePlus6 5.0V 0.51A 2.5W 5.0V 0.50A 2.5W 5.0V 0.81A 4.0W 5.0V 0.75A 3.7W 5.0V 0.77A 3.7W 5.0V 0.77A 3.9W 5.0V 0.81A 4.0W 5.0V 0.50A 2.5W
Best 4.4V 1.5A 6.7W 4.6V 2.0A 9.2W 8.9V 2.1A 19W 9.1V 2.7A 24W 4.8V 2.3A 11.0W 9.1V 2.6A 24W
Conclusion The Furilabs support people are friendly and enthusiastic but my customer experience wasn t ideal. It was good that they could quickly respond to my missing order status and the missing SIM caddy (which I still haven t received but believe is in the mail) but it would be better if such things just didn t happen. The phone is quite user friendly and could be used by a novice. I paid $US577 for the FLX1s which is $AU863 by today s exchange rates. For comparison I could get a refurbished Pixel 9 Pro Fold for $891 from Kogan (the major Australian mail-order company for technology) or a refurbished Pixel 9 Pro XL for $842. The Pixel 9 series has security support until 2031 which is probably longer than you can expect a phone to be used without being broken. So a phone with a much higher resolution screen that s only one generation behind the latest high end phones and is refurbished will cost less. For a brand new phone a Pixel 8 Pro which has security updates until 2030 costs $874 and a Pixel 9A which has security updates until 2032 costs $861. Doing what the Furilabs people have done is not a small project. It s a significant amount of work and the prices of their products need to cover that. I m not saying that the prices are bad, just that economies of scale and the large quantity of older stock makes the older Google products quite good value for money. The new Pixel phones of the latest models are unreasonably expensive. The Pixel 10 is selling new from Google for $AU1,149 which I consider a ridiculous price that I would not pay given the market for used phones etc. If I had a choice of $1,149 or a feature phone I d pay $1,149. But the FLX1s for $863 is a much better option for me. If all I had to choose from was a new Pixel 10 or a FLX1s for my parents I d get them the FLX1s. For a FOSS developer a FLX1s could be a mobile test and development system which could be lent to a relative when their main phone breaks and the replacement is on order. It seems to be fit for use as a commodity phone. Note that I give this review on the assumption that SMS and VoLTE will just work, I haven t tested them yet. The UI on the FLX1s is functional and easy enough for a new user while allowing an advanced user to do the things they desire. I prefer the Android style and the Plasma Mobile style is closer to Android than Phosh is, but changing it is something I can do later. Generally I think that the differences between UIs matter more when on a desktop environment that could be used for more complex tasks than on a phone which limits what can be done by the size of the screen. I am comparing the FLX1s to Android phones on the basis of what technology is available. But most people who would consider buying this phone will compare it to the PinePhone Pro and the Librem 5 as they have similar uses. The FLX1s beats both those phones handily in terms of battery life and of having everything just work. But it has the most non free software of the three and the people who want the $2000 Librem 5 that s entirely made in the US won t want the FLX1s. This isn t the destination for Debian based phones, but it s a good step on the way to it and I don t think I ll regret this purchase.

Vincent Bernat: RAID 5 with mixed-capacity disks on Linux

Standard RAID solutions waste space when disks have different sizes. Linux software RAID with LVM uses the full capacity of each disk and lets you grow storage by replacing one or two disks at a time. We start with four disks of equal size:
$ lsblk -Mo NAME,TYPE,SIZE
NAME TYPE  SIZE
vda  disk  101M
vdb  disk  101M
vdc  disk  101M
vdd  disk  101M
We create one partition on each of them:
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vda
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vdb
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vdc
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vdd
$ lsblk -Mo NAME,TYPE,SIZE
NAME   TYPE  SIZE
vda    disk  101M
 vda1 part  100M
vdb    disk  101M
 vdb1 part  100M
vdc    disk  101M
 vdc1 part  100M
vdd    disk  101M
 vdd1 part  100M
We set up a RAID 5 device by assembling the four partitions:1
$ mdadm --create /dev/md0 --level=raid5 --bitmap=internal --raid-devices=4 \
>   /dev/vda1 /dev/vdb1 /dev/vdc1 /dev/vdd1
$ lsblk -Mo NAME,TYPE,SIZE
    NAME          TYPE    SIZE
    vda           disk    101M
   vda1        part    100M
    vdb           disk    101M
   vdb1        part    100M
    vdc           disk    101M
   vdc1        part    100M
    vdd           disk    101M
   vdd1        part    100M
  md0           raid5 292.5M
$ cat /proc/mdstat
md0 : active raid5 vdd1[4] vdc1[2] vdb1[1] vda1[0]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We use LVM to create logical volumes on top of the RAID 5 device.
$ pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created.
$ vgcreate data /dev/md0
  Volume group "data" successfully created
$ lvcreate -L 100m -n bits data
  Logical volume "bits" created.
$ lvcreate -L 100m -n pieces data
  Logical volume "pieces" created.
$ mkfs.ext4 -q /dev/data/bits
$ mkfs.ext4 -q /dev/data/pieces
$ lsblk -Mo NAME,TYPE,SIZE
    NAME          TYPE    SIZE
    vda           disk    101M
   vda1        part    100M
    vdb           disk    101M
   vdb1        part    100M
    vdc           disk    101M
   vdc1        part    100M
    vdd           disk    101M
   vdd1        part    100M
  md0           raid5 292.5M
     data-bits   lvm     100M
     data-pieces lvm     100M
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   1   2   0 wz--n- 288.00m 88.00m
This gives us the following setup:
One RAID 5 device built from four partitions from four disks of equal capacity. The RAID device is part of an LVM volume group with two logical volumes.
RAID 5 setup with disks of equal capacity
We replace /dev/vda with a bigger disk. We add it back to the RAID 5 array after copying the partitions from /dev/vdb:
$ cat /proc/mdstat
md0 : active (auto-read-only) raid5 vdb1[1] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vda /dev/vdb
$ sgdisk --randomize-guids /dev/vda
$ mdadm --manage /dev/md0 --add /dev/vda1
$ cat /proc/mdstat
md0 : active raid5 vda1[5] vdb1[1] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We do not use the additional capacity: this setup would not survive the loss of /dev/vda because we have no spare capacity. We need a second disk replacement, like /dev/vdb:
$ cat /proc/mdstat
md0 : active (auto-read-only) raid5 vda1[5] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vdb /dev/vdc
$ sgdisk --randomize-guids /dev/vdb
$ mdadm --manage /dev/md0 --add /dev/vdb1
$ cat /proc/mdstat
md0 : active raid5 vdb1[6] vda1[5] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We create a new RAID 1 array by using the free space on /dev/vda and /dev/vdb:
$ sgdisk --new=0:0:0 -t 0:fd00 /dev/vda
$ sgdisk --new=0:0:0 -t 0:fd00 /dev/vdb
$ mdadm --create /dev/md1 --level=raid1 --bitmap=internal --raid-devices=2 \
>   /dev/vda2 /dev/vdb2
$ cat /proc/mdstat
md1 : active raid1 vdb2[1] vda2[0]
      101312 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active raid5 vdb1[6] vda1[5] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We add /dev/md1 to the volume group:
$ pvcreate /dev/md1
  Physical volume "/dev/md1" successfully created.
$ vgextend data /dev/md1
  Volume group "data" successfully extended
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   2   2   0 wz--n- 384.00m 184.00m
$  lsblk -Mo NAME,TYPE,SIZE
       NAME          TYPE    SIZE
       vda           disk    201M
      vda1        part    100M
     vda2        part    100M
       vdb           disk    201M
      vdb1        part    100M
     vdb2        part    100M
  md1           raid1  98.9M
       vdc           disk    101M
      vdc1        part    100M
       vdd           disk    101M
      vdd1        part    100M
     md0           raid5 292.5M
        data-bits   lvm     100M
        data-pieces lvm     100M
This gives us the following setup:2
One RAID 5 device built from four partitions and one RAID 1 device built from two partitions. The two last disks are smaller. The two RAID devices are part of a single LVM volume group.
Setup mixing both RAID 1 and RAID 5
We extend our capacity further by replacing /dev/vdc:
$ cat /proc/mdstat
md1 : active (auto-read-only) raid1 vda2[0] vdb2[1]
      101312 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active (auto-read-only) raid5 vda1[5] vdd1[4] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UU_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vdc /dev/vdb
$ sgdisk --randomize-guids /dev/vdc
$ mdadm --manage /dev/md0 --add /dev/vdc1
$ cat /proc/mdstat
md1 : active (auto-read-only) raid1 vda2[0] vdb2[1]
      101312 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active raid5 vdc1[7] vda1[5] vdd1[4] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
Then, we convert /dev/md1 from RAID 1 to RAID 5:
$ mdadm --grow /dev/md1 --level=5 --raid-devices=3 --add /dev/vdc2
mdadm: level of /dev/md1 changed to raid5
mdadm: added /dev/vdc2
$ cat /proc/mdstat
md1 : active raid5 vdc2[2] vda2[0] vdb2[1]
      202624 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active raid5 vdc1[7] vda1[5] vdd1[4] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ pvresize /dev/md1
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   2   2   0 wz--n- 482.00m 282.00m
This gives us the following layout:
Two RAID 5 devices built from four disks of different sizes. The last disk is smaller and contains only one partition, while the others have two partitions: one for /dev/md0 and one for /dev/md1. The two RAID devices are part of a single LVM volume group.
RAID 5 setup with mixed-capacity disks using partitions and LVM
We further extend our capacity by replacing /dev/vdd:
$ cat /proc/mdstat
md0 : active (auto-read-only) raid5 vda1[5] vdc1[7] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md1 : active (auto-read-only) raid5 vda2[0] vdc2[2] vdb2[1]
      202624 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vdd /dev/vdc
$ sgdisk --randomize-guids /dev/vdd
$ mdadm --manage /dev/md0 --add /dev/vdd1
$ cat /proc/mdstat
md0 : active raid5 vdd1[4] vda1[5] vdc1[7] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md1 : active (auto-read-only) raid5 vda2[0] vdc2[2] vdb2[1]
      202624 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We grow the second RAID 5 array:
$ mdadm --grow /dev/md1 --raid-devices=4 --add /dev/vdd2
mdadm: added /dev/vdd2
$ cat /proc/mdstat
md0 : active raid5 vdd1[4] vda1[5] vdc1[7] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md1 : active raid5 vdd2[3] vda2[0] vdc2[2] vdb2[1]
      303936 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ pvresize /dev/md1
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   2   2   0 wz--n- 580.00m 380.00m
$ lsblk -Mo NAME,TYPE,SIZE
       NAME          TYPE    SIZE
       vda           disk    201M
      vda1        part    100M
     vda2        part    100M
       vdb           disk    201M
      vdb1        part    100M
     vdb2        part    100M
       vdc           disk    201M
      vdc1        part    100M
     vdc2        part    100M
       vdd           disk    301M
      vdd1        part    100M
      vdd2        part    100M
     md0           raid5 292.5M
        data-bits   lvm     100M
        data-pieces lvm     100M
  md1           raid5 296.8M
You can continue by replacing each disk one by one using the same steps.

  1. Write-intent bitmaps speed up recovery of the RAID array after a power failure by marking unsynchronized regions as dirty. They have an impact on performance, but I did not measure it myself.
  2. In the lsblk output, /dev/md1 appears unused because the logical volumes do not use any space from it yet. Once you create more logical volumes or extend them, lsblk will reflect the usage.

Next.