Search Results: "ltd"

10 July 2023

Shirish Agarwal: PLIO, Mum, Debconf, Pressure Cooker, RISC-V,

PLIO I have been looking for an image viewer that can view images via modification date by default. The newer, the better. Alas, most of the image viewers do not do that. Even feh somehow fails. What I need is default listing of images as thumbnails by modification date. I put it up on Unix Stackexchange couple of years ago. Somebody shared ristretto but that just gives listing and doesn t give the way I want it. To be more illustrative, maybe this may serve as a guide to what I mean.
There is an RFP for it. While playing with it, I also discovered another benefit of the viewer, a sort of side-benefit, it tells you if any images have gone corrupt or whatever and you get that info. on the CLI so you can try viewing that image with the path using another viewer or viewers before deleting them. One of the issues is there doesn t seem to be a magnify option by default. While the documentation says use the ^ key to maximize it, it doesn t maximize. Took me a while to find it as that isn t a key that I use most of the time. Ironically, that is the key used on the mobile quite a bit. Anyways, so that needs to be fixed. Sadly, it doesn t have creation date or modification date sort, although the documentation does say it does (at least the modification date) but it doesn t show at my end. I also got Warning: UNKNOWN command detected! but that doesn t tell me enough as to what the issue is. Hopefully the developer will fix the issues and it will become part of Debian as many such projects are. Compiling was dead easy even with gcc-12 once I got freeimage-dev.

Mum s first death anniversary I do not know where the year went by or how. The day went in a sort of suspended animation. The only thing I did was eat and sleep that day, didn t feel like doing anything. Old memories, even dreams of fighting with her only to realize in the dream itself it s fake, she isn t there anymore  Something that can never be fixed

Debconf Kochi I should have shared it few days ago but somehow slipped my mind. While it s too late for most people to ask for bursary for Debconf Kochi, if you are anywhere near Kochi in the month of September between the dates. September 3 to September 17 nearby Infopark, Kochi you could walk in and talk to people. This would be for people who either have an interest in free software, FOSS or Debian specific. For those who may not know, while Debian is a Linux Distribution having ports to other kernels as well as well as hardware. While I may not be able to provide the list of all the various flavors as well as hardware, can say it is quite a bit. For e.g. there is a port to RISC-V that was done few years back (2018). Why that is needed will be shared below. There is always something new to look forward in a Debconf.

Pressure Cooker and Potatoes This was asked to me in the last Debconf (2016) by few people. So as people are coming to India, it probably is a good time to sort of reignite the topic :). So a Pressure Cooker boils your veggies and whatnot while still preserving the nutrients. While there are quite a number of brands I would suggest either Prestige or Hawkins, I have had good experience with both. There are also some new pressure cookers that have come that are somewhat in the design of the Thai Wok. So if that is something that you are either comfortable with or looking for, you could look at that. One of the things that you have to be sort of aware of and be most particular is the pressure safety valve. Just putting up pressure cooker safety valve in your favorite search-engine should show you different makes and whatnot. While they are relatively cheap, you need to see it is not cracked, used or whatever. The other thing is the Pressure Cooker whistle as well. The easiest thing to cook are mashed potatoes in a pressure cooker. A pressure Cooker comes in Litres, from 1 Ltr. to 20 Ltr. The larger ones are obviously for hotels or whatnot. General rule of using Pressure cooker is have water 1/4th, whatever vegetable or non-veg you want to boil 1/2 and let the remaining part for the steam. Now the easiest thing to do is have wash the potatoes and put 1/4th water of the pressure cooker. Then put 1/2 or less or little bit more of the veggies, in this instance just Potatoes. You can put salt to or that can be done later. The taste will be different. Also, there are various salts so won t really go into it as spices is a rabbit hole. Anyways, after making sure that there is enough space for the steam to be built, Put the handle on the cooker and basically wait for 5-10 minutes for the pressure to be built. You will hear a whistling sound, wait for around 5 minutes or a bit more (depends on many factors, kind of potatoes, weather etc.) and then just let it cool off naturally. After 5-10 minutes or a bit more, the pressure will be off. Your mashed potatoes are ready for either consumption or for further processing. I am assuming gas, induction cooking will have its own temperature, have no idea about it, hence not sharing that. Pressure Cooker, first put on the heaviest settings, once it starts to whistle, put it on medium for 5-10 minutes and then let it cool off. The first time I had tried that, I burned the cooker. You understand things via trial and error.

Poha recipe This is a nice low-cost healthy and fulfilling breakfast called Poha that can be made anytime and requires at the most 10-15 minutes to prepare with minimal fuss. The main ingredient is Poha or flattened rice. So how is it prepared. I won t go into the details of quantity as that is upto how hungry people are. There are various kinds of flattened rice available in the market, what you are looking for is called thick Poha or zhad Poha (in Marathi). The first step is the trickiest. What do you want to do is put water on Poha but not to let it be soggy. There is an accessory similar to tea filter but forgot the name, it basically drains all the extra moisture and you want Poha to be a bit fluffy and not soggy. The Poha should breathe for about 5 minutes before being cooked. To cook, use a heavy bottomed skillet, put some oil in it, depends on what oil you like, again lot of variations, you can use ground nut or whatever oil you prefer. Then use single mustard seeds to check temperature of the oil. Once the mustard seeds starts to pop, it means it s ready for things. So put mustard seeds in, finely chopped onion, finely chopped coriander leaves, a little bit of lemon juice, if you want potatoes, then potatoes too. Be aware that Potatoes will soak oil like anything, so if you are going to have potatoes than the oil should be a bit more. Some people love curry leaves, others don t. I like them quite a bit, it gives a slightly different taste. So the order is
  1. Oil
  2. Mustard seeds (1-2 teaspoon)
  3. Curry leaves 5-10
  4. Onion (2-3 medium onions finely chopped, onion can also be used as garnish.)
  5. Potatoes (2-3 medium ones, mashed)
  6. Small green chillies or 1-2 Red chillies (if you want)
  7. Coriander Leaves (one bunch finely chopped)
  8. Peanuts (half a glass)
Make sure that you are stirring them quite a bit. On a good warm skillet, this should hardly take 5 minutes. Once the onions are slighly brown, you are ready to put Poha in. So put the poha, add turmeric, salt, and sugar. Again depends on number of people. If I made for myself and mum, usually did 1 teaspoon of salt, not even one fourth of turmeric, just a hint, it is for the color, 1 to 2 teapoons of sugar and mix them all well at medium flame. Poha used to be two or three glasses. If you don t want potato, you can fry them a bit separately and garnish with it, along with coriander, coconut and whatnot. In Kerala, there is possibility that people might have it one day or all days. It serves as a snack at anytime, breakfast, lunch, tea time or even dinner if people don t want to be heavy. The first few times I did, I did manage to do everything wrong. So, if things go wrong, let it be. After a while, you will find your own place. And again, this is just one way, I m sure this can be made as elaborate a meal as you want. This is just something you can do if you don t want noodles or are bored with it. The timing is similar. While I don t claim to be an expert in cooking in anyway or form, if people have questions feel free to ask. If you are single or two people, 2 Ltr. Pressure cooker is enough for most Indians, Westerners may take a slightly bit larger Pressure Cooker, maybe a 3 Ltr. one may be good for you. Happy Cooking and Experimenting  I have had the pleasure to have Poha in many ways. One of my favorite ones is when people have actually put tadka on top of Poha. You do everything else but in a slight reverse order. The tadka has all the spices mixed and is concentrated and is put on top of Poha and then mixed. Done right, it tastes out of this world. For those who might not have had the Indian culinary experience, most of which is actually borrowed from the Mughals, you are in for a treat. One of the other things I would suggest to people is to ask people where there can get five types of rice. This is a specialty of South India and a sort of street food. I know where you can get it Hyderabad, Bangalore, Chennai but not in Kerala, although am dead sure there is, just somehow have missed it. If asked, am sure the Kerala team should be able to guide. That s all for now, feeling hungry, having dinner as have been sharing about cooking.

RISC-V There has been lot of conversations about how India could be in the microprocesor spacee. The x86 and x86-64 is all tied up in Intel and AMD so that s a no go area. Let me elaborate a bit why I say that. While most of the people know that IBM was the first producers of transistors as well as microprocessors. Coincidentally, AMD and Intel story are similar in some aspects but not in others. For a long time Intel was a market leader and by hook or crook it remained a market leader. One of the more interesting companies in the 1980s was Cyrix which sold lot of low-end microprocessors. A lot of that technology also went into Via which became a sort of spiritual successor of Cyrix. It is because of Cyrix and Via that Intel was forced to launch the Celeron model of microprocessors.

Lawsuits, European Regulation For those who have been there in the 1990s may have heard the term Wintel that basically meant Microsoft Windows and Intel and they had a sort of monopoly power. While the Americans were sorta ok with it, the Europeans were not and time and time again they forced both Microsoft as well as Intel to provide alternatives. The pushback from the regulators was so great that Intel funded AMD to remain solvent for few years. The successes that we see today from AMD is Lisa Su s but there is a whole lot of history as well as bad blood between the two companies. Lot of lawsuits and whatnot. Lot of cross-licensing agreements between them as well. So for any new country it would need lot of cash just for licensing all the patents there are and it s just not feasible for any newcomer to come in this market as they would have to fork the cash for the design apart from manufacturing fab.

ARM Most of the mobiles today sport an ARM processor. At one time it meant Advanced RISC Machines but now goes by Arm Ltd. Arm itself licenses its designs and while there are lot of customers, you are depending on ARM and they can change any of the conditions anytime they want. You are also hoping that ARM does not steal your design or do anything else with it. And while people trust ARM, it is still a risk if you are a company.

RISC and Shakti There is not much to say about RISC other than this article at Register. While India does have large ambitions, executing it is far trickier than most people believe as well as complex and highly capital intensive. The RISC way could be a game-changer provided India moves deftly ahead. FWIW, Debian did a RISC port in 2018. From what I can tell, you can install it on a VM/QEMU and do stuff. And while RISC has its own niches, you never know what happens next.One can speculate a lot and there is certainly a lot of momentum behind RISC. From what little experience I have had, where India has failed time and time again, whether in software or hardware is support. Support is the key, unless that is not fixed, it will remain a dream  On a slightly sad note, Foxconn is withdrawing from the joint venture it had with Vedanta.

27 June 2023

Wouter Verhelst: The future of the eID on RHEL

Since before I got involved in the eID back in 2014, we have provided official packages of the eID for Red Hat Enterprise Linux. Since RHEL itself requires a license, we did this, first, by using buildbot and mock on a Fedora VM to set up a CentOS chroot in which to build the RPM package. Later this was migrated to using GitLab CI and to using docker rather than VMs, in an effort to save some resources. Even later still, when Red Hat made CentOS no longer be a downstream of RHEL, we migrated from building in a CentOS chroot to building in a Rocky chroot, so that we could continue providing RHEL-compatible packages. Now, as it seems that Red Hat is determined to make that impossible too, I investigated switching to actually building inside a RHEL chroot rather than a derivative one. Let's just say that might be a challenge...
[root@b09b7eb7821d ~]# mock --dnf --isolation=simple --verbose -r rhel-9-x86_64 --rebuild eid-mw-5.1.11-0.v5.1.11.fc38.src.rpm --resultdir /root --define "revision v5.1.11"
ERROR: /etc/pki/entitlement is not a directory is subscription-manager installed?
Okay, so let's fix that.
[root@b09b7eb7821d ~]# dnf install -y subscription-manager
(...)
Complete!
[root@b09b7eb7821d ~]# mock --dnf --isolation=simple --verbose -r rhel-9-x86_64 --rebuild eid-mw-5.1.11-0.v5.1.11.fc38.src.rpm --resultdir /root --define "revision v5.1.11"
ERROR: No key found in /etc/pki/entitlement directory.  It means this machine is not subscribed.  Please use 
  1. subscription-manager register
  2. subscription-manager list --all --available (available pool IDs)
  3. subscription-manager attach --pool <POOL_ID>
If you don't have Red Hat subscription yet, consider getting subscription:
  https://access.redhat.com/solutions/253273
You can have a free developer subscription:
  https://developers.redhat.com/faq/
Okay... let's fix that too, then.
[root@b09b7eb7821d ~]# subscription-manager register
subscription-manager is disabled when running inside a container. Please refer to your host system for subscription management.
Wut.
[root@b09b7eb7821d ~]# exit
wouter@pc220518:~$ apt-cache search subscription-manager
wouter@pc220518:~$
As I thought, yes. Having to reinstall the docker host machine with Fedora just so I can build Red Hat chroots seems like a somewhat excessive requirement, which I don't think we'll be doing that any time soon. We'll see what the future brings, I guess.

14 June 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, May 2023 (by Roberto C. S nchez)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In May, 18 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 6.0h (out of 6.0h assigned and 8.0h from previous period), thus carrying over 8.0h to the next month.
  • Anton Gladky did 6.0h (out of 8.0h assigned and 7.0h from previous period), thus carrying over 9.0h to the next month.
  • Bastien Roucari s did 17.0h (out of 17.0h assigned and 3.0h from previous period), thus carrying over 3.0h to the next month.
  • Ben Hutchings did 17.0h (out of 16.0h assigned and 8.0h from previous period), thus carrying over 7.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 0.0h (out of 0h assigned and 12.0h from previous period), thus carrying over 12.0h to the next month.
  • Dominik George did 0.0h (out of 0h assigned and 20.34h from previous period), thus carrying over 20.34h to the next month.
  • Emilio Pozuelo Monfort did 32.0h (out of 18.5h assigned and 16.0h from previous period), thus carrying over 2.5h to the next month.
  • Guilhem Moulin did 20.0h (out of 8.5h assigned and 11.5h from previous period).
  • Holger Levsen did 0.0h (out of 0h assigned and 10.0h from previous period), thus carrying over 10.0h to the next month.
  • Lee Garrett did 0.0h (out of 0h assigned and 40.5h from previous period), thus carrying over 40.5h to the next month.
  • Markus Koschany did 34.5h (out of 34.5h assigned).
  • Roberto C. S nchez did 18.25h (out of 20.5h assigned and 11.5h from previous period), thus carrying over 13.75h to the next month.
  • Scarlett Moore did 20.0h (out of 20.0h assigned).
  • Sylvain Beucler did 34.5h (out of 29.0h assigned and 5.5h from previous period).
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 16.0h (out of 15.0h assigned and 1.0h from previous period).
  • Utkarsh Gupta did 5.5h (out of 5.0h assigned and 26.0h from previous period), thus carrying over 25.5h to the next month.

Evolution of the situation In May, we have released 34 DLAs. Several of the DLAs constituted notable security updates to LTS during the month of May. Of particular note were the linux (4.19) and linux-5.10 packages, both of which addressed a considerable number of CVEs. Additionally, the postgresql-11 package was updated by synchronizing it with the 11.20 release from upstream. Notable non-security updates were made to the distro-info-data database and the timezone database. The distro-info-data package was updated with the final expected release date of Debian 12, made aware of Debian 14 and Ubuntu 23.10, and was updated with the latest EOL dates for Ubuntu releases. The tzdata and libdatetime-timezone-perl packages were updated with the 2023c timezone database. The changes in these packages ensure that in addition to the latest security updates LTS users also have the latest information concerning Debian and Ubuntu support windows, as well as the latest timezone data for accurate worldwide timekeeping. LTS contributor Anton implemented an improvement to the Debian Security Tracker Unfixed vulnerabilities in unstable without a filed bug view, allowing for more effective management of CVEs which do not yet have a corresponding bug entry in the Debian BTS. LTS contributor Sylvain concluded an audit of obsolete packages still supported in LTS to ensure that new CVEs are properly associated. In this case, a package being obsolete means that it is no longer associated with a Debian release for which the Debian Security Team has direct responsibility. When this occurs, it is the responsibility of the LTS team to ensure that incoming CVEs are properly associated to packages which exist only in LTS. Finally, LTS contributors also contributed several updates to packages in unstable/testing/stable to fix CVEs. This helps package maintainers, addresses CVEs in current and future Debian releases, and ensures that the CVEs do not remain open for an extended period of time only for the LTS team to be required to deal with them much later in the future.

Thanks to our sponsors Sponsors that joined recently are in bold.

16 May 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, April 2023 (by Roberto C. S nchez)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In April, 18 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 6.0h (out of 0h assigned and 14.0h from previous period), thus carrying over 8.0h to the next month.
  • Adrian Bunk did 18.0h (out of 16.5h assigned and 24.0h from previous period), thus carrying over 22.5h to the next month.
  • Anton Gladky did 8.0h (out of 9.5h assigned and 5.5h from previous period), thus carrying over 7.0h to the next month.
  • Bastien Roucari s did 17.0h (out of 17.0h assigned and 3.0h from previous period), thus carrying over 3.0h to the next month.
  • Ben Hutchings did 16.0h (out of 12.0h assigned and 12.0h from previous period), thus carrying over 8.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Dominik George did 0.0h (out of 0h assigned and 20.34h from previous period), thus carrying over 20.34h to the next month.
  • Emilio Pozuelo Monfort did 4.5h (out of 11.0h assigned and 9.5h from previous period), thus carrying over 16.0h to the next month.
  • Guilhem Moulin did 8.5h (out of 8.0h assigned and 12.0h from previous period), thus carrying over 11.5h to the next month.
  • Helmut Grohne did 5.0h (out of 2.5h assigned and 7.5h from previous period), thus carrying over 5.0h to the next month.
  • Lee Garrett did 0.0h (out of 31.5h assigned and 9.0h from previous period), thus carrying over 40.5h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Ola Lundqvist did 12.5h (out of 0h assigned and 24.0h from previous period), thus carrying over 11.5h to the next month.
  • Roberto C. S nchez did 8.5h (out of 4.75h assigned and 15.25h from previous period), thus carrying over 11.5h to the next month.
  • Stefano Rivera did 1.0h (out of 0h assigned and 28.0h from previous period), thus carrying over 27.0h to the next month.
  • Sylvain Beucler did 35.0h (out of 40.5h assigned), thus carrying over 5.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 15.0h (out of 15.0h assigned and 1.0h from previous period), thus carrying over 1.0h to the next month.
  • Utkarsh Gupta did 3.5h (out of 11.0h assigned and 18.5h from previous period), thus carrying over 26.0h to the next month.

Evolution of the situation In April, we have released 35 DLAs. The LTS team would like to welcome our newest sponsor, Institut Camille Jordan, a French research lab. Thanks to the support of the many LTS sponsors, the entire Debian community benefits from direct security updates, as well as indirect improvements and collaboration with other members of the Debian community. As part of improving the efficiency of our work and the quality of the security updates we produce, the LTS has continued improving our workflow. Improvements include more consistent tagging of release versions in Git and broader use of continuous integration (CI) to ensure packages are tested thoroughly and consistently. Sponsors and users can rest assured that we work continuously to maintain and improve the already high quality of the work that we do.

Thanks to our sponsors Sponsors that joined recently are in bold.

5 May 2023

Shirish Agarwal: CAT-6, AMD 5600G, Dealerships closing down, TRAI-caller and privacy.

CAT-6 patch cord & ONU Few months back I was offered a fibre service. Most of the service offering has been using Chinese infrastructure including the ONU (Optical Network Unit). Wikipedia doesn t have a good page on ONU hence had to rely on third-party sites. FS (a name I don t really know) has some (good basic info. on ONU and how it s part and parcel of the whole infrastructure. I also got an ONT (Optical Network Terminal) but it seems to be very basic and mostly dumb. I used the old CAT-6 cable ( a decade old) to connect them and it worked for couple of months. Had to change it, first went to know if a higher cable solution offered themselves. CAT-7 is there but not backward compatible. CAT-8 is the next higher version but apparently it s expensive and also not easily bought. I did quite a few tests on CAT-6 and the ONU and it conks out at best 1 mbps which is still far better than what I am used to. CAT-8 are either not available or simply too expensive for home applications atm. A good summary of CAT-8 and what they stand for can be found here. The networking part is hopeless as most consumer facing CPU s and motherboards don t even offer 10 mbps, so asking anything more is just overkill without any benefit. Which does bring me to the next question, something that I may do in a few months or a year down the road. Just to clarify they may say it is 100 mbps or even 1 Gbps but that s plain wrong.

AMD APU, Asus Motherboard & Dealerships I had been thinking of an AMD APU, could wait a while but sooner or later would have to get one. I got quoted an AMD Ryzen 3 3200G with an Asus A320 Motherboard for around 14k which kinda looked steep to me. Quite a few hardware dealers whom I had traded, consulted over years simply shut down. While there are new people, it s much more harder now to make relationships (due to deafness) rather than before. The easiest to share which was also online was pcpartpicker.com that had an Indian domain now no longer available. The number of offline brick and mortar PC business has also closed quite a bit. There are a few new ones but it takes time and the big guys have made more of a killing. I was shocked quite a bit. Came home and browsed a bit and was hit by this. Both AMD and Intel PC business has taken a beating. AMD a bit more as Intel still holds part of the business segment as traditionally been theirs. There have been proofs and allegations of bribing in the past (do remember the EU Antitrust case against Intel for monopoly) but Intel s own cutting corners with the Spectre and Meltdown flaws hasn t helped its case, nor the suits themselves. AMD on the other hand under expertise of Lisa Su has simply grown strength by strength. Inflation and Profiteering by other big companies has made the outlook for both AMD and Intel a bit lackluster. AMD is supposed to show Zen5 chips in a few days time and the rumor mill has been ongoing. Correction Not few days but 2025. Personally, I would be happy with maybe a Ryzen 5600G with an Asus motherboard. My main motive whenever I buy an APU is not to hit beyond 65 TDP. It s kinda middle of the road. As far as what I could read this year and next year we could have AM4+ or something like those updates, AM5 APU s, CPU s and boards are slated to be launched in 2025. I did see pcpricetracker and it does give idea of various APU prices although have to say pcpartpicker was much intuitive to work with than the above. I just had my system cleaned couple of months so touchwood I should be able to use it for another couple of years or more before I have to get one of these APU s and do hope they are worth it. My idea is to use that not only for testing various softwares but also delve a bit into VR if that s possible. I did read a bit about deafness and VR as well. A good summary can be found here. I am hopeful that there may be few people in the community who may look and respond to that. It s crucial.

TRAI-caller, Privacy 101& Element. While most of us in Debian and FOSS communities do engage in privacy, lots of times it s frustrating. I m always looking for videos that seek to share that view why Privacy is needed by individuals and why Governments and other parties hate it. There are a couple of basic Youtube Videos that does explain the same quite practically.
Now why am I sharing the above. It isn t that people do not privacy and how we hold it dear. I share it because GOI just today blocked Element. While it may be trivial for us to workaround the issues, it does tell what GOI is doing. And it still acts as if surprised why it s press ranking is going to pits. Even our Women Wrestlers have been protesting for a week to just file an FIR (First Information Report) . And these are women who have got medals for the country. More than half of these organizations, specifically the women wrestling team don t have POSH which is a mandatory body supposed to be in every organization. POSH stands for Prevention of Sexual Harassment at Workplace. The gentleman concerned is a known rowdy/Goon hence it took almost a week of protest to do the needful  I do try not to report because right now every other day we see somewhere or the other the Govt. curtailing our rights and most people are mute  Signing out, till later

20 March 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, February 2023 (by LTS Team)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In February, 15 contributors have been paid to work on Debian LTS, their reports are available:
  • Adrian Bunk did 22.0h (out of 32.25h assigned), thus carrying over 10.25h to the next month.
  • Anton Gladky did 9.75h (out of 11.5h assigned and 3.5h from previous period), thus carrying over 5.25h to the next month.
  • Ben Hutchings did 8.0h (out of 8.0h assigned and 16.0h from previous period), thus carrying over 16.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Emilio Pozuelo Monfort did 26.25h (out of 0h assigned and 35.0h from previous period), thus carrying over 8.75h to the next month.
  • Guilhem Moulin did 20.0h (out of 20.0h assigned).
  • Helmut Grohne did 5.0h (out of 5.0h assigned and 5.0h from previous period), thus carrying over 5.0h to the next month.
  • Lee Garrett did 26.75h (out of 19.75h assigned and 12.5h from previous period), thus carrying over 5.5h to the next month.
  • Markus Koschany did 32.25h (out of 32.25h assigned).
  • Ola Lundqvist did 11.5h (out of 12.5h assigned and 11.5h from previous period), thus carrying over 12.5h to the next month.
  • Roberto C. S nchez did 5.0h (out of 9.5h assigned and 22.5h from previous period), thus carrying over 27.0h to the next month.
  • Sylvain Beucler did 32.0h (out of 17.25h assigned and 15.0h from previous period), thus carrying over 0.25h to the next month.
  • Thorsten Alteholz did 8.0h (out of 14.0h assigned), thus carrying over 6.0h to the next month.
  • Tobias Frost did 16.0h (out of 16.0h assigned).
  • Utkarsh Gupta did 24.25h (out of 49.25h assigned), thus carrying over 8.0h to the next month.

Evolution of the situation In February, we have released 44 DLAs, which resolved 156 CVEs. We are glad to welcome some new contributors who will hopefully help us fix CVEs in the supported release even faster. However, we also experienced some setbacks as a few sponsors have stopped (or decreased) their support. If your company ever hesitated to sponsor Debian LTS, now might be a good time to join to ensure that we can continue this important work without having to scale down on the number of packages that we are able to support.

Thanks to our sponsors Sponsors that joined recently are in bold.

21 February 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, January 2023 (by Anton Gladky)

Like each month, have a look at the work funded by Freexian s Debian LTS offering. This is the first monthly report in 2023.

Debian LTS contributors In January, 17 contributors have been paid to work on Debian LTS. which is possibly the highest number of active contributors per month! Their reports are available:
  • Abhijith PA did 0.0h (out of 3.0h assigned and 11.0h from previous period), thus carrying over 14.0h to the next month.
  • Adrian Bunk did 26.25h (out of 26.25h assigned).
  • Anton Gladky did 11.5h (out of 8.0h assigned and 7.0h from previous period), thus carrying over 3.5h to the next month.
  • Ben Hutchings did 8.0h (out of 24.0h assigned), thus carrying over 16.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Emilio Pozuelo Monfort did 8.0h (out of 0h assigned and 43.0h from previous period), thus carrying over 35.0h to the next month.
  • Guilhem Moulin did 20.0h (out of 17.5h assigned and 2.5h from previous period).
  • Helmut Grohne did 10.0h (out of 15.0h assigned), thus carrying over 5.0h to the next month.
  • Lee Garrett did 7.5h (out of 20.0h assigned), thus carrying over 12.5h to the next month.
  • Markus Koschany did 26.25h (out of 26.25h assigned).
  • Ola Lundqvist did 4.5h (out of 10.0h assigned and 6.0h from previous period), thus carrying over 11.5h to the next month.
  • Roberto C. S nchez did 3.75h (out of 18.75h assigned and 7.5h from previous period), thus carrying over 22.5h to the next month.
  • Stefano Rivera did 4.5h (out of 0h assigned and 32.5h from previous period), thus carrying over 28.0h to the next month.
  • Sylvain Beucler did 23.5h (out of 0h assigned and 38.5h from previous period), thus carrying over 15.0h to the next month.
  • Thorsten Alteholz did 14.0h (out of 10.0h assigned and 4.0h from previous period).
  • Tobias Frost did 19.0h (out of 19.0h assigned).
  • Utkarsh Gupta did 43.25h (out of 26.25h assigned and 17.0h from previous period).

Evolution of the situation Furthermore, we released 46 DLAs in January, which resolved 146 CVEs. We are working diligently to reduce the number of packages listed in dla-needed.txt, and currently, we have 55 packages listed. We are constantly growing and seeking new contributors. If you are a Debian Developer and want to join the LTS team, please contact us.

Thanks to our sponsors Sponsors that joined recently are in bold.

16 January 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, December 2022 (by Anton Gladky)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In December, 17 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 3.0h (out of 0h assigned and 14.0h from previous period), thus carrying over 11.0h to the next month.
  • Anton Gladky did 8.0h (out of 6.0h assigned and 9.0h from previous period), thus carrying over 7.0h to the next month.
  • Ben Hutchings did 24.0h (out of 9.0h assigned and 15.0h from previous period).
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Dominik George did 0.0h (out of 10.0h assigned and 14.0h from previous period), thus carrying over 24.0h to the next month.
  • Emilio Pozuelo Monfort did 8.0h in December, 8.0h in November (out of 1.5h assigned and 49.5h from previous period), thus carrying over 43.0h to the next month.
  • Enrico Zini did 0.0h (out of 0h assigned and 8.0h from previous period), thus carrying over 8.0h to the next month.
  • Guilhem Moulin did 17.5h (out of 20.0h assigned), thus carrying over 2.5h to the next month.
  • Helmut Grohne did 15.0h (out of 15.0h assigned, 2.5h were taken from the extra-budget and worked on).
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Ola Lundqvist did 10.0h (out of 7.5h assigned and 8.5h from previous period), thus carrying over 6.0h to the next month.
  • Roberto C. S nchez did 24.5h (out of 20.25h assigned and 11.75h from previous period), thus carrying over 7.5h to the next month.
  • Stefano Rivera did 2.5h (out of 20.5h assigned and 14.5h from previous period), thus carrying over 32.5h to the next month.
  • Sylvain Beucler did 20.5h (out of 37.0h assigned and 22.0h from previous period), thus carrying over 38.5h to the next month.
  • Thorsten Alteholz did 10.0h (out of 14.0h assigned), thus carrying over 4.0h to the next month.
  • Tobias Frost did 16.0h (out of 16.0h assigned).
  • Utkarsh Gupta did 51.5h (out of 42.5h assigned and 9.0h from previous period).

Evolution of the situation In December, we have released 47 DLAs, closing 232 CVEs. In the same year, in total we released 394 DLAs, closing 1450 CVEs. We are constantly growing and seeking new contributors. If you are a Debian Developer and want to join the LTS team, please contact us.

Thanks to our sponsors Sponsors that joined recently are in bold.

18 December 2022

Freexian Collaborators: Monthly report about Debian Long Term Support, November 2022 (by Anton Gladky)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In November, 15 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 0.0h (out of 14.0h assigned), thus carrying over 14.0h to the next month.
  • Anton Gladky did 6.0h (out of 15.0h assigned), thus carrying over 9.0h to the next month.
  • Ben Hutchings did 9.0h (out of 24.0h assigned), thus carrying over 15.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Dominik George did 10.0h (out of 0h assigned and 24.0h from previous period), thus carrying over 14.0h to the next month.
  • Emilio Pozuelo Monfort did 0.0h (out of 38.0h assigned and 19.5h from previous period), thus carrying over 57.5h to the next month.
  • Enrico Zini did 0.0h (out of 0h assigned and 8.0h from previous period), thus carrying over 8.0h to the next month.
  • Helmut Grohne did 17.5h (out of 20.0h assigned).
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Ola Lundqvist did 7.5h (out of 11.0h assigned and 5.0h from previous period), thus carrying over 8.5h to the next month.
  • Roberto C. S nchez did 20.25h (out of 0.75h assigned and 31.25h from previous period), thus carrying over 11.75h to the next month.
  • Stefano Rivera did 2.5h (out of 0h assigned and 17.0h from previous period), thus carrying over 14.5h to the next month.
  • Sylvain Beucler did 35.5h (out of 23.0h assigned and 34.5h from previous period), thus carrying over 22.0h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Utkarsh Gupta did 41.0h (out of 32.5h assigned and 25.0h from previous period), thus carrying over 16.5h to the next month.

Evolution of the situation In November, we released 43 DLAs, fixing 183 CVEs. We currently have 63 packages in dla-needed.txt that are waiting for updates, which is 19 fewer than the previous month. We re excited to announce that two Debian Developers Tobias Frost and Guilhem Moulin, have completed the on-boarding process and will begin contributing to LTS as of December 2022. Welcome aboard!

Thanks to our sponsors Sponsors that joined recently are in bold.

19 November 2022

Freexian Collaborators: Monthly report about Debian Long Term Support, October 2022 (by Rapha l Hertzog)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In October, 15 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 14.0h (out of 2.0h assigned and 12.0h from previous period).
  • Anton Gladky did 20.0h (out of 19.0h assigned and 1.0h from previous period).
  • Ben Hutchings did 9.0h (out of 0h assigned and 9.0h from previous period).
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Dominik George did 0.0h (out of 0h assigned and 24.0h from previous period), thus carrying over 24.0h to the next month.
  • Emilio Pozuelo Monfort did 40.5h (out of 58.0h assigned and 2.0h from previous period), thus carrying over 19.5h to the next month.
  • Enrico Zini did 0.0h (out of 0h assigned and 8.0h from previous period), thus carrying over 8.0h to the next month.
  • Helmut Grohne did 15.0h (out of 15.0h assigned).
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Ola Lundqvist did 7.0h (out of 12.0h assigned), thus carrying over 5.0h to the next month.
  • Roberto C. S nchez did 0.75h (out of 1.0h assigned and 31.0h from previous period), thus carrying over 31.25h to the next month.
  • Stefano Rivera did 12.5h (out of 9.0h assigned and 26.0h from previous period), thus carrying over 22.5h to the next month.
  • Sylvain Beucler did 25.5h (out of 31.5h assigned and 28.5h from previous period), thus carrying over 34.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Utkarsh Gupta did 35.0h (out of 38.0h assigned and 22.0h from previous period), thus carrying over 25.0h to the next month.

Evolution of the situation In October, we have released 42 DLAs, closing 106 CVEs. At the moment we have 82 packages in dla-needed.txt, waiting for update. We are continuously working on updating our infrastructure, trying to document all of our changes in the git-repo. Most of packages there are having continuous integration (CI) pipelines.

Thanks to our sponsors Sponsors that joined recently are in bold.

16 November 2022

Antoine Beaupr : A ZFS migration

In my tubman setup, I started using ZFS on an old server I had lying around. The machine is really old though (2011!) and it "feels" pretty slow. I want to see how much of that is ZFS and how much is the machine. Synthetic benchmarks show that ZFS may be slower than mdadm in RAID-10 or RAID-6 configuration, so I want to confirm that on a live workload: my workstation. Plus, I want easy, regular, high performance backups (with send/receive snapshots) and there's no way I'm going to use BTRFS because I find it too confusing and unreliable. So off we go.

Installation Since this is a conversion (and not a new install), our procedure is slightly different than the official documentation but otherwise it's pretty much in the same spirit: we're going to use ZFS for everything, including the root filesystem. So, install the required packages, on the current system:
apt install --yes gdisk zfs-dkms zfs zfs-initramfs zfsutils-linux
We also tell DKMS that we need to rebuild the initrd when upgrading:
echo REMAKE_INITRD=yes > /etc/dkms/zfs.conf

Partitioning This is going to partition /dev/sdc with:
  • 1MB MBR / BIOS legacy boot
  • 512MB EFI boot
  • 1GB bpool, unencrypted pool for /boot
  • rest of the disk for zpool, the rest of the data
     sgdisk --zap-all /dev/sdc
     sgdisk -a1 -n1:24K:+1000K -t1:EF02 /dev/sdc
     sgdisk     -n2:1M:+512M   -t2:EF00 /dev/sdc
     sgdisk     -n3:0:+1G      -t3:BF01 /dev/sdc
     sgdisk     -n4:0:0        -t4:BF00 /dev/sdc
    
That will look something like this:
    root@curie:/home/anarcat# sgdisk -p /dev/sdc
    Disk /dev/sdc: 1953525168 sectors, 931.5 GiB
    Model: ESD-S1C         
    Sector size (logical/physical): 512/512 bytes
    Disk identifier (GUID): [REDACTED]
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 1953525134
    Partitions will be aligned on 16-sector boundaries
    Total free space is 14 sectors (7.0 KiB)
    Number  Start (sector)    End (sector)  Size       Code  Name
       1              48            2047   1000.0 KiB  EF02  
       2            2048         1050623   512.0 MiB   EF00  
       3         1050624         3147775   1024.0 MiB  BF01  
       4         3147776      1953525134   930.0 GiB   BF00
Unfortunately, we can't be sure of the sector size here, because the USB controller is probably lying to us about it. Normally, this smartctl command should tell us the sector size as well:
root@curie:~# smartctl -i /dev/sdb -qnoserial
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-14-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black Mobile
Device Model:     WDC WD10JPLX-00MBPT0
Firmware Version: 01.01H01
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue May 17 13:33:04 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Above is the example of the builtin HDD drive. But the SSD device enclosed in that USB controller doesn't support SMART commands, so we can't trust that it really has 512 bytes sectors. This matters because we need to tweak the ashift value correctly. We're going to go ahead the SSD drive has the common 4KB settings, which means ashift=12. Note here that we are not creating a separate partition for swap. Swap on ZFS volumes (AKA "swap on ZVOL") can trigger lockups and that issue is still not fixed upstream. Ubuntu recommends using a separate partition for swap instead. But since this is "just" a workstation, we're betting that we will not suffer from this problem, after hearing a report from another Debian developer running this setup on their workstation successfully. We do not recommend this setup though. In fact, if I were to redo this partition scheme, I would probably use LUKS encryption and setup a dedicated swap partition, as I had problems with ZFS encryption as well.

Creating pools ZFS pools are somewhat like "volume groups" if you are familiar with LVM, except they obviously also do things like RAID-10. (Even though LVM can technically also do RAID, people typically use mdadm instead.) In any case, the guide suggests creating two different pools here: one, in cleartext, for boot, and a separate, encrypted one, for the rest. Technically, the boot partition is required because the Grub bootloader only supports readonly ZFS pools, from what I understand. But I'm a little out of my depth here and just following the guide.

Boot pool creation This creates the boot pool in readonly mode with features that grub supports:
    zpool create \
        -o cachefile=/etc/zfs/zpool.cache \
        -o ashift=12 -d \
        -o feature@async_destroy=enabled \
        -o feature@bookmarks=enabled \
        -o feature@embedded_data=enabled \
        -o feature@empty_bpobj=enabled \
        -o feature@enabled_txg=enabled \
        -o feature@extensible_dataset=enabled \
        -o feature@filesystem_limits=enabled \
        -o feature@hole_birth=enabled \
        -o feature@large_blocks=enabled \
        -o feature@lz4_compress=enabled \
        -o feature@spacemap_histogram=enabled \
        -o feature@zpool_checkpoint=enabled \
        -O acltype=posixacl -O canmount=off \
        -O compression=lz4 \
        -O devices=off -O normalization=formD -O relatime=on -O xattr=sa \
        -O mountpoint=/boot -R /mnt \
        bpool /dev/sdc3
I haven't investigated all those settings and just trust the upstream guide on the above.

Main pool creation This is a more typical pool creation.
    zpool create \
        -o ashift=12 \
        -O encryption=on -O keylocation=prompt -O keyformat=passphrase \
        -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
        -O compression=zstd \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/ -R /mnt \
        rpool /dev/sdc4
Breaking this down:
  • -o ashift=12: mentioned above, 4k sector size
  • -O encryption=on -O keylocation=prompt -O keyformat=passphrase: encryption, prompt for a password, default algorithm is aes-256-gcm, explicit in the guide, made implicit here
  • -O acltype=posixacl -O xattr=sa: enable ACLs, with better performance (not enabled by default)
  • -O dnodesize=auto: related to extended attributes, less compatibility with other implementations
  • -O compression=zstd: enable zstd compression, can be disabled/enabled by dataset to with zfs set compression=off rpool/example
  • -O relatime=on: classic atime optimisation, another that could be used on a busy server is atime=off
  • -O canmount=off: do not make the pool mount automatically with mount -a?
  • -O mountpoint=/ -R /mnt: mount pool on / in the future, but /mnt for now
Those settings are all available in zfsprops(8). Other flags are defined in zpool-create(8). The reasoning behind them is also explained in the upstream guide and some also in [the Debian wiki][]. Those flags were actually not used:
  • -O normalization=formD: normalize file names on comparisons (not storage), implies utf8only=on, which is a bad idea (and effectively meant my first sync failed to copy some files, including this folder from a supysonic checkout). and this cannot be changed after the filesystem is created. bad, bad, bad.
[the Debian wiki]: https://wiki.debian.org/ZFS#Advanced_Topics

Side note about single-disk pools Also note that we're living dangerously here: single-disk ZFS pools are rumoured to be more dangerous than not running ZFS at all. The choice quote from this article is:
[...] any error can be detected, but cannot be corrected. This sounds like an acceptable compromise, but its actually not. The reason its not is that ZFS' metadata cannot be allowed to be corrupted. If it is it is likely the zpool will be impossible to mount (and will probably crash the system once the corruption is found). So a couple of bad sectors in the right place will mean that all data on the zpool will be lost. Not some, all. Also there's no ZFS recovery tools, so you cannot recover any data on the drives.
Compared with (say) ext4, where a single disk error can recovered, this is pretty bad. But we are ready to live with this with the idea that we'll have hourly offline snapshots that we can easily recover from. It's trade-off. Also, we're running this on a NVMe/M.2 drive which typically just blinks out of existence completely, and doesn't "bit rot" the way a HDD would. Also, the FreeBSD handbook quick start doesn't have any warnings about their first example, which is with a single disk. So I am reassured at least.

Creating mount points Next we create the actual filesystems, known as "datasets" which are the things that get mounted on mountpoint and hold the actual files.
  • this creates two containers, for ROOT and BOOT
     zfs create -o canmount=off -o mountpoint=none rpool/ROOT &&
     zfs create -o canmount=off -o mountpoint=none bpool/BOOT
    
    Note that it's unclear to me why those datasets are necessary, but they seem common practice, also used in this FreeBSD example. The OpenZFS guide mentions the Solaris upgrades and Ubuntu's zsys that use that container for upgrades and rollbacks. This blog post seems to explain a bit the layout behind the installer.
  • this creates the actual boot and root filesystems:
     zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/debian &&
     zfs mount rpool/ROOT/debian &&
     zfs create -o mountpoint=/boot bpool/BOOT/debian
    
    I guess the debian name here is because we could technically have multiple operating systems with the same underlying datasets.
  • then the main datasets:
     zfs create                                 rpool/home &&
     zfs create -o mountpoint=/root             rpool/home/root &&
     chmod 700 /mnt/root &&
     zfs create                                 rpool/var
    
  • exclude temporary files from snapshots:
     zfs create -o com.sun:auto-snapshot=false  rpool/var/cache &&
     zfs create -o com.sun:auto-snapshot=false  rpool/var/tmp &&
     chmod 1777 /mnt/var/tmp
    
  • and skip automatic snapshots in Docker:
     zfs create -o canmount=off                 rpool/var/lib &&
     zfs create -o com.sun:auto-snapshot=false  rpool/var/lib/docker
    
    Notice here a peculiarity: we must create rpool/var/lib to create rpool/var/lib/docker otherwise we get this error:
     cannot create 'rpool/var/lib/docker': parent does not exist
    
    ... and no, just creating /mnt/var/lib doesn't fix that problem. In fact, it makes things even more confusing because an existing directory shadows a mountpoint, which is the opposite of how things normally work. Also note that you will probably need to change storage driver in Docker, see the zfs-driver documentation for details but, basically, I did:
    echo '  "storage-driver": "zfs"  ' > /etc/docker/daemon.json
    
    Note that podman has the same problem (and similar solution):
    printf '[storage]\ndriver = "zfs"\n' > /etc/containers/storage.conf
    
  • make a tmpfs for /run:
     mkdir /mnt/run &&
     mount -t tmpfs tmpfs /mnt/run &&
     mkdir /mnt/run/lock
    
We don't create a /srv, as that's the HDD stuff. Also mount the EFI partition:
mkfs.fat -F 32 /dev/sdc2 &&
mount /dev/sdc2 /mnt/boot/efi/
At this point, everything should be mounted in /mnt. It should look like this:
root@curie:~# LANG=C df -h -t zfs -t vfat
Filesystem            Size  Used Avail Use% Mounted on
rpool/ROOT/debian     899G  384K  899G   1% /mnt
bpool/BOOT/debian     832M  123M  709M  15% /mnt/boot
rpool/home            899G  256K  899G   1% /mnt/home
rpool/home/root       899G  256K  899G   1% /mnt/root
rpool/var             899G  384K  899G   1% /mnt/var
rpool/var/cache       899G  256K  899G   1% /mnt/var/cache
rpool/var/tmp         899G  256K  899G   1% /mnt/var/tmp
rpool/var/lib/docker  899G  256K  899G   1% /mnt/var/lib/docker
/dev/sdc2             511M  4.0K  511M   1% /mnt/boot/efi
Now that we have everything setup and mounted, let's copy all files over.

Copying files This is a list of all the mounted filesystems
for fs in /boot/ /boot/efi/ / /home/; do
    echo "syncing $fs to /mnt$fs..." && 
    rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
done
You can check that the list is correct with:
mount -l -t ext4,btrfs,vfat   awk ' print $3 '
Note that we skip /srv as it's on a different disk. On the first run, we had:
root@curie:~# for fs in /boot/ /boot/efi/ / /home/; do
        echo "syncing $fs to /mnt$fs..." && 
        rsync -aSHAXx --info=progress2 $fs /mnt$fs
    done
syncing /boot/ to /mnt/boot/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/299)  
syncing /boot/efi/ to /mnt/boot/efi/...
     16,831,437 100%  184.14MB/s    0:00:00 (xfr#101, to-chk=0/110)
syncing / to /mnt/...
 28,019,293,280  94%   47.63MB/s    0:09:21 (xfr#703710, ir-chk=6748/839220)rsync: [generator] delete_file: rmdir(var/lib/docker) failed: Device or resource busy (16)
could not make way for new symlink: var/lib/docker
 34,081,267,990  98%   50.71MB/s    0:10:40 (xfr#736577, to-chk=0/867732)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
syncing /home/ to /mnt/home/...
rsync: [sender] readlink_stat("/home/anarcat/.fuse") failed: Permission denied (13)
 24,456,268,098  98%   68.03MB/s    0:05:42 (xfr#159867, ir-chk=6875/172377) 
file has vanished: "/home/anarcat/.cache/mozilla/firefox/s2hwvqbu.quantum/cache2/entries/B3AB0CDA9C4454B3C1197E5A22669DF8EE849D90"
199,762,528,125  93%   74.82MB/s    0:42:26 (xfr#1437846, ir-chk=1018/1983979)rsync: [generator] recv_generator: mkdir "/mnt/home/anarcat/dist/supysonic/tests/assets/\#346" failed: Invalid or incomplete multibyte or wide character (84)
*** Skipping any contents from this failed directory ***
315,384,723,978  96%   76.82MB/s    1:05:15 (xfr#2256473, to-chk=0/2993950)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Note the failure to transfer that supysonic file? It turns out they had a weird filename in their source tree, since then removed, but still it showed how the utf8only feature might not be such a bad idea. At this point, the procedure was restarted all the way back to "Creating pools", after unmounting all ZFS filesystems (umount /mnt/run /mnt/boot/efi && umount -t zfs -a) and destroying the pool, which, surprisingly, doesn't require any confirmation (zpool destroy rpool). The second run was cleaner:
root@curie:~# for fs in /boot/ /boot/efi/ / /home/; do
        echo "syncing $fs to /mnt$fs..." && 
        rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
    done
syncing /boot/ to /mnt/boot/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/299)  
syncing /boot/efi/ to /mnt/boot/efi/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/110)  
syncing / to /mnt/...
 28,019,033,070  97%   42.03MB/s    0:10:35 (xfr#703671, ir-chk=1093/833515)rsync: [generator] delete_file: rmdir(var/lib/docker) failed: Device or resource busy (16)
could not make way for new symlink: var/lib/docker
 34,081,807,102  98%   44.84MB/s    0:12:04 (xfr#736580, to-chk=0/867723)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
syncing /home/ to /mnt/home/...
rsync: [sender] readlink_stat("/home/anarcat/.fuse") failed: Permission denied (13)
IO error encountered -- skipping file deletion
 24,043,086,450  96%   62.03MB/s    0:06:09 (xfr#151819, ir-chk=15117/172571)
file has vanished: "/home/anarcat/.cache/mozilla/firefox/s2hwvqbu.quantum/cache2/entries/4C1FDBFEA976FF924D062FB990B24B897A77B84B"
315,423,626,507  96%   67.09MB/s    1:14:43 (xfr#2256845, to-chk=0/2994364)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Also note the transfer speed: we seem capped at 76MB/s, or 608Mbit/s. This is not as fast as I was expecting: the USB connection seems to be at around 5Gbps:
anarcat@curie:~$ lsusb -tv   head -4
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
     __ Port 1: Dev 4, If 0, Class=Mass Storage, Driver=uas, 5000M
        ID 0b05:1932 ASUSTek Computer, Inc.
So it shouldn't cap at that speed. It's possible the USB adapter is failing to give me the full speed though. It's not the M.2 SSD drive either, as that has a ~500MB/s bandwidth, acccording to its spec. At this point, we're about ready to do the final configuration. We drop to single user mode and do the rest of the procedure. That used to be shutdown now, but it seems like the systemd switch broke that, so now you can reboot into grub and pick the "recovery" option. Alternatively, you might try systemctl rescue, as I found out. I also wanted to copy the drive over to another new NVMe drive, but that failed: it looks like the USB controller I have doesn't work with older, non-NVME drives.

Boot configuration Now we need to enter the new system to rebuild the boot loader and initrd and so on. First, we bind mounts and chroot into the ZFS disk:
mount --rbind /dev  /mnt/dev &&
mount --rbind /proc /mnt/proc &&
mount --rbind /sys  /mnt/sys &&
chroot /mnt /bin/bash
Next we add an extra service that imports the bpool on boot, to make sure it survives a zpool.cache destruction:
cat > /etc/systemd/system/zfs-import-bpool.service <<EOF
[Unit]
DefaultDependencies=no
Before=zfs-import-scan.service
Before=zfs-import-cache.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/zpool import -N -o cachefile=none bpool
# Work-around to preserve zpool cache:
ExecStartPre=-/bin/mv /etc/zfs/zpool.cache /etc/zfs/preboot_zpool.cache
ExecStartPost=-/bin/mv /etc/zfs/preboot_zpool.cache /etc/zfs/zpool.cache
[Install]
WantedBy=zfs-import.target
EOF
Enable the service:
systemctl enable zfs-import-bpool.service
I had to trim down /etc/fstab and /etc/crypttab to only contain references to the legacy filesystems (/srv is still BTRFS!). If we don't already have a tmpfs defined in /etc/fstab:
ln -s /usr/share/systemd/tmp.mount /etc/systemd/system/ &&
systemctl enable tmp.mount
Rebuild boot loader with support for ZFS, but also to workaround GRUB's missing zpool-features support:
grub-probe /boot   grep -q zfs &&
update-initramfs -c -k all &&
sed -i 's,GRUB_CMDLINE_LINUX.*,GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/debian",' /etc/default/grub &&
update-grub
For good measure, make sure the right disk is configured here, for example you might want to tag both drives in a RAID array:
dpkg-reconfigure grub-pc
Install grub to EFI while you're there:
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=debian --recheck --no-floppy
Filesystem mount ordering. The rationale here in the OpenZFS guide is a little strange, but I don't dare ignore that.
mkdir /etc/zfs/zfs-list.cache
touch /etc/zfs/zfs-list.cache/bpool
touch /etc/zfs/zfs-list.cache/rpool
zed -F &
Verify that zed updated the cache by making sure these are not empty:
cat /etc/zfs/zfs-list.cache/bpool
cat /etc/zfs/zfs-list.cache/rpool
Once the files have data, stop zed:
fg
Press Ctrl-C.
Fix the paths to eliminate /mnt:
sed -Ei "s /mnt/? / " /etc/zfs/zfs-list.cache/*
Snapshot initial install:
zfs snapshot bpool/BOOT/debian@install
zfs snapshot rpool/ROOT/debian@install
Exit chroot:
exit

Finalizing One last sync was done in rescue mode:
for fs in /boot/ /boot/efi/ / /home/; do
    echo "syncing $fs to /mnt$fs..." && 
    rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
done
Then we unmount all filesystems:
mount   grep -v zfs   tac   awk '/\/mnt/  print $3 '   xargs -i  umount -lf  
zpool export -a
Reboot, swap the drives, and boot in ZFS. Hurray!

Benchmarks This is a test that was ran in single-user mode using fio and the Ars Technica recommended tests, which are:
  • Single 4KiB random write process:
     fio --name=randwrite4k1x --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
    
  • 16 parallel 64KiB random write processes:
     fio --name=randwrite64k16x --ioengine=posixaio --rw=randwrite --bs=64k --size=256m --numjobs=16 --iodepth=16 --runtime=60 --time_based --end_fsync=1
    
  • Single 1MiB random write process:
     fio --name=randwrite1m1x --ioengine=posixaio --rw=randwrite --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
    
Strangely, that's not exactly what the author, Jim Salter, did in his actual test bench used in the ZFS benchmarking article. The first thing is there's no read test at all, which is already pretty strange. But also it doesn't include stuff like dropping caches or repeating results. So here's my variation, which i called fio-ars-bench.sh for now. It just batches a bunch of fio tests, one by one, 60 seconds each. It should take about 12 minutes to run, as there are 3 pair of tests, read/write, with and without async. My bias, before building, running and analysing those results is that ZFS should outperform the traditional stack on writes, but possibly not on reads. It's also possible it outperforms it on both, because it's a newer drive. A new test might be possible with a new external USB drive as well, although I doubt I will find the time to do this.

Results All tests were done on WD blue SN550 drives, which claims to be able to push 2400MB/s read and 1750MB/s write. An extra drive was bought to move the LVM setup from a WDC WDS500G1B0B-00AS40 SSD, a WD blue M.2 2280 SSD that was at least 5 years old, spec'd at 560MB/s read, 530MB/s write. Benchmarks were done on the M.2 SSD drive but discarded so that the drive difference is not a factor in the test. In practice, I'm going to assume we'll never reach those numbers because we're not actually NVMe (this is an old workstation!) so the bottleneck isn't the disk itself. For our purposes, it might still give us useful results.

Rescue test, LUKS/LVM/ext4 Those tests were performed with everything shutdown, after either entering the system in rescue mode, or by reaching that target with:
systemctl rescue
The network might have been started before or after the test as well:
systemctl start systemd-networkd
So it should be fairly reliable as basically nothing else is running. Raw numbers, from the ?job-curie-lvm.log, converted to MiB/s and manually merged:
test read I/O read IOPS write I/O write IOPS
rand4k4g1x 39.27 10052 212.15 54310
rand4k4g1x--fsync=1 39.29 10057 2.73 699
rand64k256m16x 1297.00 20751 1068.57 17097
rand64k256m16x--fsync=1 1290.90 20654 353.82 5661
rand1m16g1x 315.15 315 563.77 563
rand1m16g1x--fsync=1 345.88 345 157.01 157
Peaks are at about 20k IOPS and ~1.3GiB/s read, 1GiB/s write in the 64KB blocks with 16 jobs. Slowest is the random 4k block sync write at an abysmal 3MB/s and 700 IOPS The 1MB read/write tests have lower IOPS, but that is expected.

Rescue test, ZFS This test was also performed in rescue mode. Raw numbers, from the ?job-curie-zfs.log, converted to MiB/s and manually merged:
test read I/O read IOPS write I/O write IOPS
rand4k4g1x 77.20 19763 27.13 6944
rand4k4g1x--fsync=1 76.16 19495 6.53 1673
rand64k256m16x 1882.40 30118 70.58 1129
rand64k256m16x--fsync=1 1865.13 29842 71.98 1151
rand1m16g1x 921.62 921 102.21 102
rand1m16g1x--fsync=1 908.37 908 64.30 64
Peaks are at 1.8GiB/s read, also in the 64k job like above, but much faster. The write is, as expected, much slower at 70MiB/s (compared to 1GiB/s!), but it should be noted the sync write doesn't degrade performance compared to async writes (although it's still below the LVM 300MB/s).

Conclusions Really, ZFS has trouble performing in all write conditions. The random 4k sync write test is the only place where ZFS outperforms LVM in writes, and barely (7MiB/s vs 3MiB/s). Everywhere else, writes are much slower, sometimes by an order of magnitude. And before some ZFS zealot jumps in talking about the SLOG or some other cache that could be added to improved performance, I'll remind you that those numbers are on a bare bones NVMe drive, pretty much as fast storage as you can find on this machine. Adding another NVMe drive as a cache probably will not improve write performance here. Still, those are very different results than the tests performed by Salter which shows ZFS beating traditional configurations in all categories but uncached 4k reads (not writes!). That said, those tests are very different from the tests I performed here, where I test writes on a single disk, not a RAID array, which might explain the discrepancy. Also, note that neither LVM or ZFS manage to reach the 2400MB/s read and 1750MB/s write performance specification. ZFS does manage to reach 82% of the read performance (1973MB/s) and LVM 64% of the write performance (1120MB/s). LVM hits 57% of the read performance and ZFS hits barely 6% of the write performance. Overall, I'm a bit disappointed in the ZFS write performance here, I must say. Maybe I need to tweak the record size or some other ZFS voodoo, but I'll note that I didn't have to do any such configuration on the other side to kick ZFS in the pants...

Real world experience This section document not synthetic backups, but actual real world workloads, comparing before and after I switched my workstation to ZFS.

Docker performance I had the feeling that running some git hook (which was firing a Docker container) was "slower" somehow. It seems that, at runtime, ZFS backends are significant slower than their overlayfs/ext4 equivalent:
May 16 14:42:52 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87\x2dinit-merged.mount: Succeeded.
May 16 14:42:52 curie systemd[5161]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87\x2dinit-merged.mount: Succeeded.
May 16 14:42:52 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:53 curie dockerd[1723]: time="2022-05-16T14:42:53.087219426-04:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10 pid=151170
May 16 14:42:53 curie systemd[1]: Started libcontainer container af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10.
May 16 14:42:54 curie systemd[1]: docker-af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10.scope: Succeeded.
May 16 14:42:54 curie dockerd[1723]: time="2022-05-16T14:42:54.047297800-04:00" level=info msg="shim disconnected" id=af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10
May 16 14:42:54 curie dockerd[998]: time="2022-05-16T14:42:54.051365015-04:00" level=info msg="ignoring event" container=af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
May 16 14:42:54 curie systemd[2444]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[5161]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[2444]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:54 curie systemd[5161]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:54 curie systemd[1]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
Translating this:
  • container setup: ~1 second
  • container runtime: ~1 second
  • container teardown: ~1 second
  • total runtime: 2-3 seconds
Obviously, those timestamps are not quite accurate enough to make precise measurements... After I switched to ZFS:
mai 30 15:31:39 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf\x2dinit.mount: Succeeded. 
mai 30 15:31:39 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf\x2dinit.mount: Succeeded. 
mai 30 15:31:40 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:40 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:41 curie dockerd[3199]: time="2022-05-30T15:31:41.551403693-04:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 pid=141080 
mai 30 15:31:41 curie systemd[1]: run-docker-runtime\x2drunc-moby-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142-runc.ZVcjvl.mount: Succeeded. 
mai 30 15:31:41 curie systemd[5287]: run-docker-runtime\x2drunc-moby-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142-runc.ZVcjvl.mount: Succeeded. 
mai 30 15:31:41 curie systemd[1]: Started libcontainer container 42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142. 
mai 30 15:31:45 curie systemd[1]: docker-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142.scope: Succeeded. 
mai 30 15:31:45 curie dockerd[3199]: time="2022-05-30T15:31:45.883019128-04:00" level=info msg="shim disconnected" id=42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 
mai 30 15:31:45 curie dockerd[1726]: time="2022-05-30T15:31:45.883064491-04:00" level=info msg="ignoring event" container=42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" 
mai 30 15:31:45 curie systemd[1]: run-docker-netns-e45f5cf5f465.mount: Succeeded. 
mai 30 15:31:45 curie systemd[5287]: run-docker-netns-e45f5cf5f465.mount: Succeeded. 
mai 30 15:31:45 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:45 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded.
That's double or triple the run time, from 2 seconds to 6 seconds. Most of the time is spent in run time, inside the container. Here's the breakdown:
  • container setup: ~2 seconds
  • container run: ~4 seconds
  • container teardown: ~1 second
  • total run time: about ~6-7 seconds
That's a two- to three-fold increase! Clearly something is going on here that I should tweak. It's possible that code path is less optimized in Docker. I also worry about podman, but apparently it also supports ZFS backends. Possibly it would perform better, but at this stage I wouldn't have a good comparison: maybe it would have performed better on non-ZFS as well...

Interactivity While doing the offsite backups (below), the system became somewhat "sluggish". I felt everything was slow, and I estimate it introduced ~50ms latency in any input device. Arguably, those are all USB and the external drive was connected through USB, but I suspect the ZFS drivers are not as well tuned with the scheduler as the regular filesystem drivers...

Recovery procedures For test purposes, I unmounted all systems during the procedure:
umount /mnt/boot/efi /mnt/boot/run
umount -a -t zfs
zpool export -a
And disconnected the drive, to see how I would recover this system from another Linux system in case of a total motherboard failure. To import an existing pool, plug the device, then import the pool with an alternate root, so it doesn't mount over your existing filesystems, then you mount the root filesystem and all the others:
zpool import -l -a -R /mnt &&
zfs mount rpool/ROOT/debian &&
zfs mount -a &&
mount /dev/sdc2 /mnt/boot/efi &&
mount -t tmpfs tmpfs /mnt/run &&
mkdir /mnt/run/lock

Offsite backup Part of the goal of using ZFS is to simplify and harden backups. I wanted to experiment with shorter recovery times specifically both point in time recovery objective and recovery time objective and faster incremental backups. This is, therefore, part of my backup services. This section documents how an external NVMe enclosure was setup in a pool to mirror the datasets from my workstation. The final setup should include syncoid copying datasets to the backup server regularly, but I haven't finished that configuration yet.

Partitioning The above partitioning procedure used sgdisk, but I couldn't figure out how to do this with sgdisk, so this uses sfdisk to dump the partition from the first disk to an external, identical drive:
sfdisk -d /dev/nvme0n1   sfdisk --no-reread /dev/sda --force

Pool creation This is similar to the main pool creation, except we tweaked a few bits after changing the upstream procedure:
zpool create \
        -o cachefile=/etc/zfs/zpool.cache \
        -o ashift=12 -d \
        -o feature@async_destroy=enabled \
        -o feature@bookmarks=enabled \
        -o feature@embedded_data=enabled \
        -o feature@empty_bpobj=enabled \
        -o feature@enabled_txg=enabled \
        -o feature@extensible_dataset=enabled \
        -o feature@filesystem_limits=enabled \
        -o feature@hole_birth=enabled \
        -o feature@large_blocks=enabled \
        -o feature@lz4_compress=enabled \
        -o feature@spacemap_histogram=enabled \
        -o feature@zpool_checkpoint=enabled \
        -O acltype=posixacl -O xattr=sa \
        -O compression=lz4 \
        -O devices=off \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/boot -R /mnt \
        bpool-tubman /dev/sdb3
The change from the main boot pool are: Main pool creation is:
zpool create \
        -o ashift=12 \
        -O encryption=on -O keylocation=prompt -O keyformat=passphrase \
        -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
        -O compression=zstd \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/ -R /mnt \
        rpool-tubman /dev/sdb4

First sync I used syncoid to copy all pools over to the external device. syncoid is a thing that's part of the sanoid project which is specifically designed to sync snapshots between pool, typically over SSH links but it can also operate locally. The sanoid command had a --readonly argument to simulate changes, but syncoid didn't so I tried to fix that with an upstream PR. It seems it would be better to do this by hand, but this was much easier. The full first sync was:
root@curie:/home/anarcat# ./bin/syncoid -r  bpool bpool-tubman
CRITICAL ERROR: Target bpool-tubman exists but has no snapshots matching with bpool!
                Replication to target would require destroying existing
                target. Cowardly refusing to destroy your existing target.
          NOTE: Target bpool-tubman dataset is < 64MB used - did you mistakenly run
                 zfs create bpool-tubman  on the target? ZFS initial
                replication must be to a NON EXISTENT DATASET, which will
                then be CREATED BY the initial replication process.
INFO: Sending oldest full snapshot bpool/BOOT@test (~ 42 KB) to new target filesystem:
44.2KiB 0:00:00 [4.19MiB/s] [========================================================================================================================] 103%            
INFO: Updating new target filesystem with incremental bpool/BOOT@test ... syncoid_curie_2022-05-30:12:50:39 (~ 4 KB):
2.13KiB 0:00:00 [ 114KiB/s] [===============================================================>                                                         ] 53%            
INFO: Sending oldest full snapshot bpool/BOOT/debian@install (~ 126.0 MB) to new target filesystem:
 126MiB 0:00:00 [ 308MiB/s] [=======================================================================================================================>] 100%            
INFO: Updating new target filesystem with incremental bpool/BOOT/debian@install ... syncoid_curie_2022-05-30:12:50:39 (~ 113.4 MB):
 113MiB 0:00:00 [ 315MiB/s] [=======================================================================================================================>] 100%
root@curie:/home/anarcat# ./bin/syncoid -r  rpool rpool-tubman
CRITICAL ERROR: Target rpool-tubman exists but has no snapshots matching with rpool!
                Replication to target would require destroying existing
                target. Cowardly refusing to destroy your existing target.
          NOTE: Target rpool-tubman dataset is < 64MB used - did you mistakenly run
                 zfs create rpool-tubman  on the target? ZFS initial
                replication must be to a NON EXISTENT DATASET, which will
                then be CREATED BY the initial replication process.
INFO: Sending oldest full snapshot rpool/ROOT@syncoid_curie_2022-05-30:12:50:51 (~ 69 KB) to new target filesystem:
44.2KiB 0:00:00 [2.44MiB/s] [===========================================================================>                                             ] 63%            
INFO: Sending oldest full snapshot rpool/ROOT/debian@install (~ 25.9 GB) to new target filesystem:
25.9GiB 0:03:33 [ 124MiB/s] [=======================================================================================================================>] 100%            
INFO: Updating new target filesystem with incremental rpool/ROOT/debian@install ... syncoid_curie_2022-05-30:12:50:52 (~ 3.9 GB):
3.92GiB 0:00:33 [ 119MiB/s] [======================================================================================================================>  ] 99%            
INFO: Sending oldest full snapshot rpool/home@syncoid_curie_2022-05-30:12:55:04 (~ 276.8 GB) to new target filesystem:
 277GiB 0:27:13 [ 174MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/home/root@syncoid_curie_2022-05-30:13:22:19 (~ 2.2 GB) to new target filesystem:
2.22GiB 0:00:25 [90.2MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var@syncoid_curie_2022-05-30:13:22:47 (~ 5.6 GB) to new target filesystem:
5.56GiB 0:00:32 [ 176MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/cache@syncoid_curie_2022-05-30:13:23:22 (~ 627.3 MB) to new target filesystem:
 627MiB 0:00:03 [ 169MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/lib@syncoid_curie_2022-05-30:13:23:28 (~ 69 KB) to new target filesystem:
44.2KiB 0:00:00 [1.40MiB/s] [===========================================================================>                                             ] 63%            
INFO: Sending oldest full snapshot rpool/var/lib/docker@syncoid_curie_2022-05-30:13:23:28 (~ 442.6 MB) to new target filesystem:
 443MiB 0:00:04 [ 103MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254 (~ 6.3 MB) to new target filesystem:
6.49MiB 0:00:00 [12.9MiB/s] [========================================================================================================================] 102%            
INFO: Updating new target filesystem with incremental rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254 ... syncoid_curie_2022-05-30:13:23:34 (~ 4 KB):
1.52KiB 0:00:00 [27.6KiB/s] [============================================>                                                                            ] 38%            
INFO: Sending oldest full snapshot rpool/var/lib/flatpak@syncoid_curie_2022-05-30:13:23:36 (~ 2.0 GB) to new target filesystem:
2.00GiB 0:00:17 [ 115MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/tmp@syncoid_curie_2022-05-30:13:23:55 (~ 57.0 MB) to new target filesystem:
61.8MiB 0:00:01 [45.0MiB/s] [========================================================================================================================] 108%            
INFO: Clone is recreated on target rpool-tubman/var/lib/docker/ed71ddd563a779ba6fb37b3b1d0cc2c11eca9b594e77b4b234867ebcb162b205 based on rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254
INFO: Sending oldest full snapshot rpool/var/lib/docker/ed71ddd563a779ba6fb37b3b1d0cc2c11eca9b594e77b4b234867ebcb162b205@syncoid_curie_2022-05-30:13:23:58 (~ 218.6 MB) to new target filesystem:
 219MiB 0:00:01 [ 151MiB/s] [=======================================================================================================================>] 100%
Funny how the CRITICAL ERROR doesn't actually stop syncoid and it just carries on merrily doing when it's telling you it's "cowardly refusing to destroy your existing target"... Maybe that's because my pull request broke something though... During the transfer, the computer was very sluggish: everything feels like it has ~30-50ms latency extra:
anarcat@curie:sanoid$ LANG=C top -b  -n 1   head -20
top - 13:07:05 up 6 days,  4:01,  1 user,  load average: 16.13, 16.55, 11.83
Tasks: 606 total,   6 running, 598 sleeping,   0 stopped,   2 zombie
%Cpu(s): 18.8 us, 72.5 sy,  1.2 ni,  5.0 id,  1.2 wa,  0.0 hi,  1.2 si,  0.0 st
MiB Mem :  15898.4 total,   1387.6 free,  13170.0 used,   1340.8 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   1319.8 avail Mem 
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     70 root      20   0       0      0      0 S  83.3   0.0   6:12.67 kswapd0
4024878 root      20   0  282644  96432  10288 S  44.4   0.6   0:11.43 puppet
3896136 root      20   0   35328  16528     48 S  22.2   0.1   2:08.04 mbuffer
3896135 root      20   0   10328    776    168 R  16.7   0.0   1:22.93 zfs
3896138 root      20   0   10588    788    156 R  16.7   0.0   1:49.30 zfs
    350 root       0 -20       0      0      0 R  11.1   0.0   1:03.53 z_rd_int
    351 root       0 -20       0      0      0 S  11.1   0.0   1:04.15 z_rd_int
3896137 root      20   0    4384    352    244 R  11.1   0.0   0:44.73 pv
4034094 anarcat   30  10   20028  13960   2428 S  11.1   0.1   0:00.70 mbsync
4036539 anarcat   20   0    9604   3464   2408 R  11.1   0.0   0:00.04 top
    352 root       0 -20       0      0      0 S   5.6   0.0   1:03.64 z_rd_int
    353 root       0 -20       0      0      0 S   5.6   0.0   1:03.64 z_rd_int
    354 root       0 -20       0      0      0 S   5.6   0.0   1:04.01 z_rd_int
I wonder how much of that is due to syncoid, particularly because I often saw mbuffer and pv in there which are not strictly necessary to do those kind of operations, as far as I understand. Once that's done, export the pools to disconnect the drive:
zpool export bpool-tubman
zpool export rpool-tubman

Raw disk benchmark Copied the 512GB SSD/M.2 device to another 1024GB NVMe/M.2 device:
anarcat@curie:~$ sudo dd if=/dev/sdb of=/dev/sdc bs=4M status=progress conv=fdatasync
499944259584 octets (500 GB, 466 GiB) copi s, 1713 s, 292 MB/s
119235+1 enregistrements lus
119235+1 enregistrements  crits
500107862016 octets (500 GB, 466 GiB) copi s, 1719,93 s, 291 MB/s
... while both over USB, whoohoo 300MB/s!

Monitoring ZFS should be monitoring your pools regularly. Normally, the [[!debman zed]] daemon monitors all ZFS events. It is the thing that will report when a scrub failed, for example. See this configuration guide. Scrubs should be regularly scheduled to ensure consistency of the pool. This can be done in newer zfsutils-linux versions (bullseye-backports or bookworm) with one of those, depending on the desired frequency:
systemctl enable zfs-scrub-weekly@rpool.timer --now
systemctl enable zfs-scrub-monthly@rpool.timer --now
When the scrub runs, if it finds anything it will send an event which will get picked up by the zed daemon which will then send a notification, see below for an example. TODO: deploy on curie, if possible (probably not because no RAID) TODO: this should be in Puppet

Scrub warning example So what happens when problems are found? Here's an example of how I dealt with an error I received. After setting up another server (tubman) with ZFS, I eventually ended up getting a warning from the ZFS toolchain.
Date: Sun, 09 Oct 2022 00:58:08 -0400
From: root <root@anarc.at>
To: root@anarc.at
Subject: ZFS scrub_finish event for rpool on tubman
ZFS has finished a scrub:
   eid: 39536
 class: scrub_finish
  host: tubman
  time: 2022-10-09 00:58:07-0400
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:33:57 with 0 errors on Sun Oct  9 00:58:07 2022
config:
        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb4    ONLINE       0     1     0
            sdc4    ONLINE       0     0     0
        cache
          sda3      ONLINE       0     0     0
errors: No known data errors
This, in itself, is a little worrisome. But it helpfully links to this more detailed documentation (and props up there: the link still works) which explains this is a "minor" problem (something that could be included in the report). In this case, this happened on a server setup on 2021-04-28, but the disks and server hardware are much older. The server itself (marcos v1) was built around 2011, over 10 years ago now. The hard drive in question is:
root@tubman:~# smartctl -i -qnoserial /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-15-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 11 11:02:32 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Some more SMART stats:
root@tubman:~# smartctl -a -qnoserial /dev/sdb   grep -e  Head_Flying_Hours -e Power_On_Hours -e Total_LBA -e 'Sector Sizes'
Sector Sizes:     512 bytes logical, 4096 bytes physical
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       12464 (206 202 0)
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10966h+55m+23.757s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       21107792664
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3201579750
That's over a year of power on, which shouldn't be so bad. It has written about 10TB of data (21107792664 LBAs * 512 byte/LBA), which is about two full writes. According to its specification, this device is supposed to support 55 TB/year of writes, so we're far below spec. Note that are still far from the "non-recoverable read error per bits" spec (1 per 10E15), as we've basically read 13E12 bits (3201579750 LBAs * 512 byte/LBA = 13E12 bits). It's likely this disk was made in 2018, so it is in its fourth year. Interestingly, /dev/sdc is also a Seagate drive, but of a different series:
root@tubman:~# smartctl -qnoserial  -i /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-15-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 11 11:21:35 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
It has seen much more reads than the other disk which is also interesting:
root@tubman:~# smartctl -a -qnoserial /dev/sdc   grep -e  Head_Flying_Hours -e Power_On_Hours -e Total_LBA -e 'Sector Sizes'
Sector Sizes:     512 bytes logical, 4096 bytes physical
  9 Power_On_Hours          0x0032   059   059   000    Old_age   Always       -       36240
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       33994h+10m+52.118s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       30730174438
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       51894566538
That's 4 years of Head_Flying_Hours, and over 4 years (4 years and 48 days) of Power_On_Hours. The copyright date on that drive's specs goes back to 2016, so it's a much older drive. SMART self-test succeeded.

Remaining issues
  • TODO: move send/receive backups to offsite host, see also zfs for alternatives to syncoid/sanoid there
  • TODO: setup backup cron job (or timer?)
  • TODO: swap still not setup on curie, see zfs
  • TODO: document this somewhere: bpool and rpool are both pools and datasets. that's pretty confusing, but also very useful because it allows for pool-wide recursive snapshots, which are used for the backup system

fio improvements I really want to improve my experience with fio. Right now, I'm just cargo-culting stuff from other folks and I don't really like it. stressant is a good example of my struggles, in the sense that it doesn't really work that well for disk tests. I would love to have just a single .fio job file that lists multiple jobs to run serially. For example, this file describes the above workload pretty well:
[global]
# cargo-culting Salter
fallocate=none
ioengine=posixaio
runtime=60
time_based=1
end_fsync=1
stonewall=1
group_reporting=1
# no need to drop caches, done by default
# invalidate=1
# Single 4KiB random read/write process
[randread-4k-4g-1x]
rw=randread
bs=4k
size=4g
numjobs=1
iodepth=1
[randwrite-4k-4g-1x]
rw=randwrite
bs=4k
size=4g
numjobs=1
iodepth=1
# 16 parallel 64KiB random read/write processes:
[randread-64k-256m-16x]
rw=randread
bs=64k
size=256m
numjobs=16
iodepth=16
[randwrite-64k-256m-16x]
rw=randwrite
bs=64k
size=256m
numjobs=16
iodepth=16
# Single 1MiB random read/write process
[randread-1m-16g-1x]
rw=randread
bs=1m
size=16g
numjobs=1
iodepth=1
[randwrite-1m-16g-1x]
rw=randwrite
bs=1m
size=16g
numjobs=1
iodepth=1
... except the jobs are actually started in parallel, even though they are stonewall'd, as far as I can tell by the reports. I sent a mail to the fio mailing list for clarification. It looks like the jobs are started in parallel, but actual (correctly) run serially. It seems like this might just be a matter of reporting the right timestamps in the end, although it does feel like starting all the processes (even if not doing any work yet) could skew the results.

Hangs during procedure During the procedure, it happened a few times where any ZFS command would completely hang. It seems that using an external USB drive to sync stuff didn't work so well: sometimes it would reconnect under a different device (from sdc to sdd, for example), and this would greatly confuse ZFS. Here, for example, is sdd reappearing out of the blue:
May 19 11:22:53 curie kernel: [  699.820301] scsi host4: uas
May 19 11:22:53 curie kernel: [  699.820544] usb 2-1: authorized to connect
May 19 11:22:53 curie kernel: [  699.922433] scsi 4:0:0:0: Direct-Access     ROG      ESD-S1C          0    PQ: 0 ANSI: 6
May 19 11:22:53 curie kernel: [  699.923235] sd 4:0:0:0: Attached scsi generic sg2 type 0
May 19 11:22:53 curie kernel: [  699.923676] sd 4:0:0:0: [sdd] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
May 19 11:22:53 curie kernel: [  699.923788] sd 4:0:0:0: [sdd] Write Protect is off
May 19 11:22:53 curie kernel: [  699.923949] sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 19 11:22:53 curie kernel: [  699.924149] sd 4:0:0:0: [sdd] Optimal transfer size 33553920 bytes
May 19 11:22:53 curie kernel: [  699.961602]  sdd: sdd1 sdd2 sdd3 sdd4
May 19 11:22:53 curie kernel: [  699.996083] sd 4:0:0:0: [sdd] Attached SCSI disk
Next time I run a ZFS command (say zpool list), the command completely hangs (D state) and this comes up in the logs:
May 19 11:34:21 curie kernel: [ 1387.914843] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=71344128 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914859] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=205565952 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914874] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=272789504 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914906] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=270336 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.914932] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=1073225728 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.914948] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=1073487872 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.915165] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=272793600 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.915183] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=339853312 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.915648] WARNING: Pool 'bpool' has encountered an uncorrectable I/O failure and has been suspended.
May 19 11:34:21 curie kernel: [ 1387.915648] 
May 19 11:37:25 curie kernel: [ 1571.558614] task:txg_sync        state:D stack:    0 pid:  997 ppid:     2 flags:0x00004000
May 19 11:37:25 curie kernel: [ 1571.558623] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.558640]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.558650]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.558670]  schedule_timeout+0x8b/0x140
May 19 11:37:25 curie kernel: [ 1571.558675]  ? __next_timer_interrupt+0x110/0x110
May 19 11:37:25 curie kernel: [ 1571.558678]  io_schedule_timeout+0x4c/0x80
May 19 11:37:25 curie kernel: [ 1571.558689]  __cv_timedwait_common+0x12b/0x160 [spl]
May 19 11:37:25 curie kernel: [ 1571.558694]  ? add_wait_queue_exclusive+0x70/0x70
May 19 11:37:25 curie kernel: [ 1571.558702]  __cv_timedwait_io+0x15/0x20 [spl]
May 19 11:37:25 curie kernel: [ 1571.558816]  zio_wait+0x129/0x2b0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.558929]  dsl_pool_sync+0x461/0x4f0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559032]  spa_sync+0x575/0xfa0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559138]  ? spa_txg_history_init_io+0x101/0x110 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559245]  txg_sync_thread+0x2e0/0x4a0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559354]  ? txg_fini+0x240/0x240 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559366]  thread_generic_wrapper+0x6f/0x80 [spl]
May 19 11:37:25 curie kernel: [ 1571.559376]  ? __thread_exit+0x20/0x20 [spl]
May 19 11:37:25 curie kernel: [ 1571.559379]  kthread+0x11b/0x140
May 19 11:37:25 curie kernel: [ 1571.559382]  ? __kthread_bind_mask+0x60/0x60
May 19 11:37:25 curie kernel: [ 1571.559386]  ret_from_fork+0x22/0x30
May 19 11:37:25 curie kernel: [ 1571.559401] task:zed             state:D stack:    0 pid: 1564 ppid:     1 flags:0x00000000
May 19 11:37:25 curie kernel: [ 1571.559404] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.559409]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.559412]  ? __kmalloc_node+0x141/0x2b0
May 19 11:37:25 curie kernel: [ 1571.559417]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.559420]  schedule_preempt_disabled+0xa/0x10
May 19 11:37:25 curie kernel: [ 1571.559424]  __mutex_lock.constprop.0+0x133/0x460
May 19 11:37:25 curie kernel: [ 1571.559435]  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
May 19 11:37:25 curie kernel: [ 1571.559537]  spa_all_configs+0x41/0x120 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559644]  zfs_ioc_pool_configs+0x17/0x70 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559752]  zfsdev_ioctl_common+0x697/0x870 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559758]  ? _copy_from_user+0x28/0x60
May 19 11:37:25 curie kernel: [ 1571.559860]  zfsdev_ioctl+0x53/0xe0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559866]  __x64_sys_ioctl+0x83/0xb0
May 19 11:37:25 curie kernel: [ 1571.559869]  do_syscall_64+0x33/0x80
May 19 11:37:25 curie kernel: [ 1571.559873]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 19 11:37:25 curie kernel: [ 1571.559876] RIP: 0033:0x7fcf0ef32cc7
May 19 11:37:25 curie kernel: [ 1571.559878] RSP: 002b:00007fcf0e181618 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 19 11:37:25 curie kernel: [ 1571.559881] RAX: ffffffffffffffda RBX: 000055b212f972a0 RCX: 00007fcf0ef32cc7
May 19 11:37:25 curie kernel: [ 1571.559883] RDX: 00007fcf0e181640 RSI: 0000000000005a04 RDI: 000000000000000b
May 19 11:37:25 curie kernel: [ 1571.559885] RBP: 00007fcf0e184c30 R08: 00007fcf08016810 R09: 00007fcf08000080
May 19 11:37:25 curie kernel: [ 1571.559886] R10: 0000000000080000 R11: 0000000000000246 R12: 000055b212f972a0
May 19 11:37:25 curie kernel: [ 1571.559888] R13: 0000000000000000 R14: 00007fcf0e181640 R15: 0000000000000000
May 19 11:37:25 curie kernel: [ 1571.559980] task:zpool           state:D stack:    0 pid:11815 ppid:  3816 flags:0x00004000
May 19 11:37:25 curie kernel: [ 1571.559983] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.559988]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.559992]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.559995]  io_schedule+0x42/0x70
May 19 11:37:25 curie kernel: [ 1571.560004]  cv_wait_common+0xac/0x130 [spl]
May 19 11:37:25 curie kernel: [ 1571.560008]  ? add_wait_queue_exclusive+0x70/0x70
May 19 11:37:25 curie kernel: [ 1571.560118]  txg_wait_synced_impl+0xc9/0x110 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560223]  txg_wait_synced+0xc/0x40 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560325]  spa_export_common+0x4cd/0x590 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560430]  ? zfs_log_history+0x9c/0xf0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560537]  zfsdev_ioctl_common+0x697/0x870 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560543]  ? _copy_from_user+0x28/0x60
May 19 11:37:25 curie kernel: [ 1571.560644]  zfsdev_ioctl+0x53/0xe0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560649]  __x64_sys_ioctl+0x83/0xb0
May 19 11:37:25 curie kernel: [ 1571.560653]  do_syscall_64+0x33/0x80
May 19 11:37:25 curie kernel: [ 1571.560656]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 19 11:37:25 curie kernel: [ 1571.560659] RIP: 0033:0x7fdc23be2cc7
May 19 11:37:25 curie kernel: [ 1571.560661] RSP: 002b:00007ffc8c792478 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 19 11:37:25 curie kernel: [ 1571.560664] RAX: ffffffffffffffda RBX: 000055942ca49e20 RCX: 00007fdc23be2cc7
May 19 11:37:25 curie kernel: [ 1571.560666] RDX: 00007ffc8c792490 RSI: 0000000000005a03 RDI: 0000000000000003
May 19 11:37:25 curie kernel: [ 1571.560667] RBP: 00007ffc8c795e80 R08: 00000000ffffffff R09: 00007ffc8c792310
May 19 11:37:25 curie kernel: [ 1571.560669] R10: 000055942ca49e30 R11: 0000000000000246 R12: 00007ffc8c792490
May 19 11:37:25 curie kernel: [ 1571.560671] R13: 000055942ca49e30 R14: 000055942aed2c20 R15: 00007ffc8c795a40
Here's another example, where you see the USB controller bleeping out and back into existence:
mai 19 11:38:39 curie kernel: usb 2-1: USB disconnect, device number 2
mai 19 11:38:39 curie kernel: sd 4:0:0:0: [sdd] Synchronizing SCSI cache
mai 19 11:38:39 curie kernel: sd 4:0:0:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
mai 19 11:39:25 curie kernel: INFO: task zed:1564 blocked for more than 241 seconds.
mai 19 11:39:25 curie kernel:       Tainted: P          IOE     5.10.0-14-amd64 #1 Debian 5.10.113-1
mai 19 11:39:25 curie kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mai 19 11:39:25 curie kernel: task:zed             state:D stack:    0 pid: 1564 ppid:     1 flags:0x00000000
mai 19 11:39:25 curie kernel: Call Trace:
mai 19 11:39:25 curie kernel:  __schedule+0x282/0x870
mai 19 11:39:25 curie kernel:  ? __kmalloc_node+0x141/0x2b0
mai 19 11:39:25 curie kernel:  schedule+0x46/0xb0
mai 19 11:39:25 curie kernel:  schedule_preempt_disabled+0xa/0x10
mai 19 11:39:25 curie kernel:  __mutex_lock.constprop.0+0x133/0x460
mai 19 11:39:25 curie kernel:  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
mai 19 11:39:25 curie kernel:  spa_all_configs+0x41/0x120 [zfs]
mai 19 11:39:25 curie kernel:  zfs_ioc_pool_configs+0x17/0x70 [zfs]
mai 19 11:39:25 curie kernel:  zfsdev_ioctl_common+0x697/0x870 [zfs]
mai 19 11:39:25 curie kernel:  ? _copy_from_user+0x28/0x60
mai 19 11:39:25 curie kernel:  zfsdev_ioctl+0x53/0xe0 [zfs]
mai 19 11:39:25 curie kernel:  __x64_sys_ioctl+0x83/0xb0
mai 19 11:39:25 curie kernel:  do_syscall_64+0x33/0x80
mai 19 11:39:25 curie kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
mai 19 11:39:25 curie kernel: RIP: 0033:0x7fcf0ef32cc7
mai 19 11:39:25 curie kernel: RSP: 002b:00007fcf0e181618 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
mai 19 11:39:25 curie kernel: RAX: ffffffffffffffda RBX: 000055b212f972a0 RCX: 00007fcf0ef32cc7
mai 19 11:39:25 curie kernel: RDX: 00007fcf0e181640 RSI: 0000000000005a04 RDI: 000000000000000b
mai 19 11:39:25 curie kernel: RBP: 00007fcf0e184c30 R08: 00007fcf08016810 R09: 00007fcf08000080
mai 19 11:39:25 curie kernel: R10: 0000000000080000 R11: 0000000000000246 R12: 000055b212f972a0
mai 19 11:39:25 curie kernel: R13: 0000000000000000 R14: 00007fcf0e181640 R15: 0000000000000000
mai 19 11:39:25 curie kernel: INFO: task zpool:11815 blocked for more than 241 seconds.
mai 19 11:39:25 curie kernel:       Tainted: P          IOE     5.10.0-14-amd64 #1 Debian 5.10.113-1
mai 19 11:39:25 curie kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mai 19 11:39:25 curie kernel: task:zpool           state:D stack:    0 pid:11815 ppid:  2621 flags:0x00004004
mai 19 11:39:25 curie kernel: Call Trace:
mai 19 11:39:25 curie kernel:  __schedule+0x282/0x870
mai 19 11:39:25 curie kernel:  schedule+0x46/0xb0
mai 19 11:39:25 curie kernel:  io_schedule+0x42/0x70
mai 19 11:39:25 curie kernel:  cv_wait_common+0xac/0x130 [spl]
mai 19 11:39:25 curie kernel:  ? add_wait_queue_exclusive+0x70/0x70
mai 19 11:39:25 curie kernel:  txg_wait_synced_impl+0xc9/0x110 [zfs]
mai 19 11:39:25 curie kernel:  txg_wait_synced+0xc/0x40 [zfs]
mai 19 11:39:25 curie kernel:  spa_export_common+0x4cd/0x590 [zfs]
mai 19 11:39:25 curie kernel:  ? zfs_log_history+0x9c/0xf0 [zfs]
mai 19 11:39:25 curie kernel:  zfsdev_ioctl_common+0x697/0x870 [zfs]
mai 19 11:39:25 curie kernel:  ? _copy_from_user+0x28/0x60
mai 19 11:39:25 curie kernel:  zfsdev_ioctl+0x53/0xe0 [zfs]
mai 19 11:39:25 curie kernel:  __x64_sys_ioctl+0x83/0xb0
mai 19 11:39:25 curie kernel:  do_syscall_64+0x33/0x80
mai 19 11:39:25 curie kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
mai 19 11:39:25 curie kernel: RIP: 0033:0x7fdc23be2cc7
mai 19 11:39:25 curie kernel: RSP: 002b:00007ffc8c792478 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
mai 19 11:39:25 curie kernel: RAX: ffffffffffffffda RBX: 000055942ca49e20 RCX: 00007fdc23be2cc7
mai 19 11:39:25 curie kernel: RDX: 00007ffc8c792490 RSI: 0000000000005a03 RDI: 0000000000000003
mai 19 11:39:25 curie kernel: RBP: 00007ffc8c795e80 R08: 00000000ffffffff R09: 00007ffc8c792310
mai 19 11:39:25 curie kernel: R10: 000055942ca49e30 R11: 0000000000000246 R12: 00007ffc8c792490
mai 19 11:39:25 curie kernel: R13: 000055942ca49e30 R14: 000055942aed2c20 R15: 00007ffc8c795a40
I understand those are rather extreme conditions: I would fully expect the pool to stop working if the underlying drives disappear. What doesn't seem acceptable is that a command would completely hang like this.

References See the zfs documentation for more information about ZFS, and tubman for another installation and migration procedure.

29 September 2022

Antoine Beaupr : Detecting manual (and optimizing large) package installs in Puppet

Well this is a mouthful. I recently worked on a neat hack called puppet-package-check. It is designed to warn about manually installed packages, to make sure "everything is in Puppet". But it turns out it can (probably?) dramatically decrease the bootstrap time of Puppet bootstrap when it needs to install a large number of packages.

Detecting manual packages On a cleanly filed workstation, it looks like this:
root@emma:/home/anarcat/bin# ./puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
0 unmanaged packages found
A messy workstation will look like this:
root@curie:/home/anarcat/bin# ./puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
288 unmanaged packages found
apparmor-utils beignet-opencl-icd bridge-utils clustershell cups-pk-helper davfs2 dconf-cli dconf-editor dconf-gsettings-backend ddccontrol ddrescueview debmake debootstrap decopy dict-devil dict-freedict-eng-fra dict-freedict-eng-spa dict-freedict-fra-eng dict-freedict-spa-eng diffoscope dnsdiag dropbear-initramfs ebtables efibootmgr elpa-lua-mode entr eog evince figlet file file-roller fio flac flex font-manager fonts-cantarell fonts-inconsolata fonts-ipafont-gothic fonts-ipafont-mincho fonts-liberation fonts-monoid fonts-monoid-tight fonts-noto fonts-powerline fonts-symbola freeipmi freetype2-demos ftp fwupd-amd64-signed gallery-dl gcc-arm-linux-gnueabihf gcolor3 gcp gdisk gdm3 gdu gedit gedit-plugins gettext-base git-debrebase gnome-boxes gnote gnupg2 golang-any golang-docker-credential-helpers golang-golang-x-tools grub-efi-amd64-signed gsettings-desktop-schemas gsfonts gstreamer1.0-libav gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-ugly gstreamer1.0-pulseaudio gtypist gvfs-backends hackrf hashcat html2text httpie httping hugo humanfriendly iamerican-huge ibus ibus-gtk3 ibus-libpinyin ibus-pinyin im-config imediff img2pdf imv initramfs-tools input-utils installation-birthday internetarchive ipmitool iptables iptraf-ng jackd2 jupyter jupyter-nbextension-jupyter-js-widgets jupyter-qtconsole k3b kbtin kdialog keditbookmarks keepassxc kexec-tools keyboard-configuration kfind konsole krb5-locales kwin-x11 leiningen lightdm lintian linux-image-amd64 linux-perf lmodern lsb-base lvm2 lynx lz4json magic-wormhole mailscripts mailutils manuskript mat2 mate-notification-daemon mate-themes mime-support mktorrent mp3splt mpdris2 msitools mtp-tools mtree-netbsd mupdf nautilus nautilus-sendto ncal nd ndisc6 neomutt net-tools nethogs nghttp2-client nocache npm2deb ntfs-3g ntpdate nvme-cli nwipe obs-studio okular-extra-backends openstack-clients openstack-pkg-tools paprefs pass-extension-audit pcmanfm pdf-presenter-console pdf2svg percol pipenv playerctl plymouth plymouth-themes popularity-contest progress prometheus-node-exporter psensor pubpaste pulseaudio python3-ldap qjackctl qpdfview qrencode r-cran-ggplot2 r-cran-reshape2 rake restic rhash rpl rpm2cpio rs ruby ruby-dev ruby-feedparser ruby-magic ruby-mocha ruby-ronn rygel-playbin rygel-tracker s-tui sanoid saytime scrcpy scrcpy-server screenfetch scrot sdate sddm seahorse shim-signed sigil smartmontools smem smplayer sng sound-juicer sound-theme-freedesktop spectre-meltdown-checker sq ssh-audit sshuttle stress-ng strongswan strongswan-swanctl syncthing system-config-printer system-config-printer-common system-config-printer-udev systemd-bootchart systemd-container tardiff task-desktop task-english task-ssh-server tasksel tellico texinfo texlive-fonts-extra texlive-lang-cyrillic texlive-lang-french texlive-lang-german texlive-lang-italian texlive-xetex tftp-hpa thunar-archive-plugin tidy tikzit tint2 tintin++ tipa tpm2-tools traceroute tree trocla ucf udisks2 unifont unrar-free upower usbguard uuid-runtime vagrant-cachier vagrant-libvirt virt-manager vmtouch vorbis-tools w3m wamerican wamerican-huge wfrench whipper whohas wireshark xapian-tools xclip xdg-user-dirs-gtk xlax xmlto xsensors xserver-xorg xsltproc xxd xz-utils yubioath-desktop zathura zathura-pdf-poppler zenity zfs-dkms zfs-initramfs zfsutils-linux zip zlib1g zlib1g-dev
157 old: apparmor-utils clustershell davfs2 dconf-cli dconf-editor ddccontrol ddrescueview decopy dnsdiag ebtables efibootmgr elpa-lua-mode entr figlet file-roller fio flac flex font-manager freetype2-demos ftp gallery-dl gcc-arm-linux-gnueabihf gcolor3 gcp gdu gedit git-debrebase gnote golang-docker-credential-helpers golang-golang-x-tools gtypist hackrf hashcat html2text httpie httping hugo humanfriendly iamerican-huge ibus ibus-pinyin imediff input-utils internetarchive ipmitool iptraf-ng jackd2 jupyter-qtconsole k3b kbtin kdialog keditbookmarks keepassxc kexec-tools kfind konsole leiningen lightdm lynx lz4json magic-wormhole manuskript mat2 mate-notification-daemon mktorrent mp3splt msitools mtp-tools mtree-netbsd nautilus nautilus-sendto nd ndisc6 neomutt net-tools nethogs nghttp2-client nocache ntpdate nwipe obs-studio openstack-pkg-tools paprefs pass-extension-audit pcmanfm pdf-presenter-console pdf2svg percol pipenv playerctl qjackctl qpdfview qrencode r-cran-ggplot2 r-cran-reshape2 rake restic rhash rpl rpm2cpio rs ruby-feedparser ruby-magic ruby-mocha ruby-ronn s-tui saytime scrcpy screenfetch scrot sdate seahorse shim-signed sigil smem smplayer sng sound-juicer spectre-meltdown-checker sq ssh-audit sshuttle stress-ng system-config-printer system-config-printer-common tardiff tasksel tellico texlive-lang-cyrillic texlive-lang-french tftp-hpa tikzit tint2 tintin++ tpm2-tools traceroute tree unrar-free vagrant-cachier vagrant-libvirt vmtouch vorbis-tools w3m wamerican wamerican-huge wfrench whipper whohas xdg-user-dirs-gtk xlax xmlto xsensors xxd yubioath-desktop zenity zip
131 new: beignet-opencl-icd bridge-utils cups-pk-helper dconf-gsettings-backend debmake debootstrap dict-devil dict-freedict-eng-fra dict-freedict-eng-spa dict-freedict-fra-eng dict-freedict-spa-eng diffoscope dropbear-initramfs eog evince file fonts-cantarell fonts-inconsolata fonts-ipafont-gothic fonts-ipafont-mincho fonts-liberation fonts-monoid fonts-monoid-tight fonts-noto fonts-powerline fonts-symbola freeipmi fwupd-amd64-signed gdisk gdm3 gedit-plugins gettext-base gnome-boxes gnupg2 golang-any grub-efi-amd64-signed gsettings-desktop-schemas gsfonts gstreamer1.0-libav gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-ugly gstreamer1.0-pulseaudio gvfs-backends ibus-gtk3 ibus-libpinyin im-config img2pdf imv initramfs-tools installation-birthday iptables jupyter jupyter-nbextension-jupyter-js-widgets keyboard-configuration krb5-locales kwin-x11 lintian linux-image-amd64 linux-perf lmodern lsb-base lvm2 mailscripts mailutils mate-themes mime-support mpdris2 mupdf ncal npm2deb ntfs-3g nvme-cli okular-extra-backends openstack-clients plymouth plymouth-themes popularity-contest progress prometheus-node-exporter psensor pubpaste pulseaudio python3-ldap ruby ruby-dev rygel-playbin rygel-tracker sanoid scrcpy-server sddm smartmontools sound-theme-freedesktop strongswan strongswan-swanctl syncthing system-config-printer-udev systemd-bootchart systemd-container task-desktop task-english task-ssh-server texinfo texlive-fonts-extra texlive-lang-german texlive-lang-italian texlive-xetex thunar-archive-plugin tidy tipa trocla ucf udisks2 unifont upower usbguard uuid-runtime virt-manager wireshark xapian-tools xclip xserver-xorg xsltproc xz-utils zathura zathura-pdf-poppler zfs-dkms zfs-initramfs zfsutils-linux zlib1g zlib1g-dev
Yuck! That's a lot of shit to go through. Notice how the packages get sorted between "old" and "new" packages. This is because popcon is used as a tool to mark which packages are "old". If you have unmanaged packages, the "old" ones are likely things that you can uninstall, for example. If you don't have popcon installed, you'll also get this warning:
popcon stats not available: [Errno 2] No such file or directory: '/var/log/popularity-contest'
The error can otherwise be safely ignored, but you won't get "help" prioritizing the packages to add to your manifests. Note that the tool ignores packages that were "marked" (see apt-mark(8)) as automatically installed. This implies that you might have to do a little bit of cleanup the first time you run this, as Debian doesn't necessarily mark all of those packages correctly on first install. For example, here's how it looks like on a clean install, after Puppet ran:
root@angela:/home/anarcat# ./bin/puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
127 unmanaged packages found
ca-certificates console-setup cryptsetup-initramfs dbus file gcc-12-base gettext-base grub-common grub-efi-amd64 i3lock initramfs-tools iw keyboard-configuration krb5-locales laptop-detect libacl1 libapparmor1 libapt-pkg6.0 libargon2-1 libattr1 libaudit-common libaudit1 libblkid1 libbpf0 libbsd0 libbz2-1.0 libc6 libcap-ng0 libcap2 libcap2-bin libcom-err2 libcrypt1 libcryptsetup12 libdb5.3 libdebconfclient0 libdevmapper1.02.1 libedit2 libelf1 libext2fs2 libfdisk1 libffi8 libgcc-s1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libgssapi-krb5-2 libhogweed6 libidn2-0 libip4tc2 libiw30 libjansson4 libjson-c5 libk5crypto3 libkeyutils1 libkmod2 libkrb5-3 libkrb5support0 liblocale-gettext-perl liblockfile-bin liblz4-1 liblzma5 libmd0 libmnl0 libmount1 libncurses6 libncursesw6 libnettle8 libnewt0.52 libnftables1 libnftnl11 libnl-3-200 libnl-genl-3-200 libnl-route-3-200 libnss-systemd libp11-kit0 libpam-systemd libpam0g libpcre2-8-0 libpcre3 libpcsclite1 libpopt0 libprocps8 libreadline8 libselinux1 libsemanage-common libsemanage2 libsepol2 libslang2 libsmartcols1 libss2 libssl1.1 libssl3 libstdc++6 libsystemd-shared libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl libtext-wrapi18n-perl libtinfo6 libtirpc-common libtirpc3 libudev1 libunistring2 libuuid1 libxtables12 libxxhash0 libzstd1 linux-image-amd64 logsave lsb-base lvm2 media-types mlocate ncurses-term pass-extension-otp puppet python3-reportbug shim-signed tasksel ucf usr-is-merged util-linux-extra wpasupplicant xorg zlib1g
popcon stats not available: [Errno 2] No such file or directory: '/var/log/popularity-contest'
Normally, there should be unmanaged packages here. But because of the way Debian is installed, a lot of libraries and some core packages are marked as manually installed, and are of course not managed through Puppet. There are two solutions to this problem:
  • really manage everything in Puppet (argh)
  • mark packages as automatically installed
I typically chose the second path and mark a ton of stuff as automatic. Then either they will be auto-removed, or will stop being listed. In the above scenario, one could mark all libraries as automatically installed with:
apt-mark auto $(./bin/puppet-package-check   grep -o 'lib[^ ]*')
... but if you trust that most of that stuff is actually garbage that you don't really want installed anyways, you could just mark it all as automatically installed:
apt-mark auto $(./bin/puppet-package-check)
In my case, that ended up keeping basically all libraries (because of course they're installed for some reason) and auto-removing this:
dh-dkms discover-data dkms libdiscover2 libjsoncpp25 libssl1.1 linux-headers-amd64 mlocate pass-extension-otp pass-otp plocate x11-apps x11-session-utils xinit xorg
You'll notice xorg in there: yep, that's bad. Not what I wanted. But for some reason, on other workstations, I did not actually have xorg installed. Turns out having xserver-xorg is enough, and that one has dependencies. So now I guess I just learned to stop worrying and live without X(org).

Optimizing large package installs But that, of course, is not all. Why make things simple when you can have an unreadable title that is trying to be both syntactically correct and click-baity enough to flatter my vain ego? Right. One of the challenges in bootstrapping Puppet with large package lists is that it's slow. Puppet lists packages as individual resources and will basically run apt install $PKG on every package in the manifest, one at a time. While the overhead of apt is generally small, when you add things like apt-listbugs, apt-listchanges, needrestart, triggers and so on, it can take forever setting up a new host. So for initial installs, it can actually makes sense to skip the queue and just install everything in one big batch. And because the above tool inspects the packages installed by Puppet, you can run it against a catalog and have a full lists of all the packages Puppet would install, even before I even had Puppet running. So when reinstalling my laptop, I basically did this:
apt install puppet-agent/experimental
puppet agent --test --noop
apt install $(./puppet-package-check --debug \
    2>&1   grep ^puppet\ packages 
      sed 's/puppet packages://;s/ /\n/g'
      grep -v -e onionshare -e golint -e git-sizer -e github-backup -e hledger -e xsane -e audacity -e chirp -e elpa-flycheck -e elpa-lsp-ui -e yubikey-manager -e git-annex -e hopenpgp-tools -e puppet
) puppet-agent/experimental
That massive grep was because there are currently a lot of packages missing from bookworm. Those are all packages that I have in my catalog but that still haven't made it to bookworm. Sad, I know. I eventually worked around that by adding bullseye sources so that the Puppet manifest actually ran. The point here is that this improves the Puppet run time a lot. All packages get installed at once, and you get a nice progress bar. Then you actually run Puppet to deploy configurations and all the other goodies:
puppet agent --test
I wish I could tell you how much faster that ran. I don't know, and I will not go through a full reinstall just to please your curiosity. The only hard number I have is that it installed 444 packages (which exploded in 10,191 packages with dependencies) in a mere 10 minutes. That might also be with the packages already downloaded. In any case, I have that gut feeling it's faster, so you'll have to just trust my gut. It is, after all, much more important than you might think.

Similar work The blueprint system is something similar to this:
It figures out what you ve done manually, stores it locally in a Git repository, generates code that s able to recreate your efforts, and helps you deploy those changes to production
That tool has unfortunately been abandoned for a decade at this point. Also note that the AutoRemove::RecommendsImportant and AutoRemove::SuggestsImportant are relevant here. If it is set to true (the default), a package will not be removed if it is (respectively) a Recommends or Suggests of another package (as opposed to the normal Depends). In other words, if you want to also auto-remove packages that are only Suggests, you would, for example, add this to apt.conf:
AutoRemove::SuggestsImportant false;
Paul Wise has tried to make the Debian installer and debootstrap properly mark packages as automatically installed in the past, but his bug reports were rejected. The other suggestions in this section are also from Paul, thanks!

31 August 2022

Rapha&#235;l Hertzog: Freexian s report about Debian Long Term Support, July 2022

A Debian LTS logo
Like each month, have a look at the work funded by Freexian s Debian LTS offering. Debian project funding No any major updates on running projects.
Two 1, 2 projects are in the pipeline now.
Tryton project is in a review phase. Gradle projects is still fighting in work. In July, we put aside 2389 EUR to fund Debian projects. We re looking forward to receive more projects from various Debian teams! Learn more about the rationale behind this initiative in this article. Debian LTS contributors In July, 14 contributors have been paid to work on Debian LTS, their reports are available: Evolution of the situation In July, we have released 3 DLAs. July was the period, when the Debian Stretch had already ELTS status, but Debian Buster was still in the hands of security team. Many member of LTS used this time to update internal infrastructure, documentation and some internal tickets. Now we are ready to take the next release in our hands: Buster! Thanks to our sponsors Sponsors that joined recently are in bold.

26 July 2022

Rapha&#235;l Hertzog: Freexian s report about Debian Long Term Support, June 2022

A Debian LTS logo
Like each month, have a look at the work funded by Freexian s Debian LTS offering. Debian project funding No any major updates on running projects.
Two 1, 2 projects are in the pipeline now.
Tryton project is in a review phase. Gradle projects is still fighting in work. In June, we put aside 2254 EUR to fund Debian projects. We re looking forward to receive more projects from various Debian teams! Learn more about the rationale behind this initiative in this article. Debian LTS contributors In June, 15 contributors have been paid to work on Debian LTS, their reports are available: Evolution of the situation In June we released 27 DLAs.

This is a special month, where we have two releases (stretch and jessie) as ELTS and NO release as LTS. Buster is still handled by the security team and will probably be given in LTS hands at the beginning of the August. During this month we are updating the infrastructure, documentation and improve our internal processes to switch to a new release.
Many developers have just returned back from Debconf22, hold in Prizren, Kosovo! Many (E)LTS members could meet face-to-face and discuss some technical and social topics! Also LTS BoF took place, where the project was introduced (link to video).
Thanks to our sponsors Sponsors that joined recently are in bold. We are pleased to welcome Alter Way where their support of Debian is publicly acknowledged at the higher level, see this French quote of Alterway s CEO.

23 June 2022

Rapha&#235;l Hertzog: Freexian s report about Debian Long Term Support, May 2022

A Debian LTS logo
Like each month, have a look at the work funded by Freexian s Debian LTS offering. Debian project funding Two [1, 2] projects are in the pipeline now. Tryton project is in a final phase. Gradle projects is fighting with technical difficulties. In May, we put aside 2233 EUR to fund Debian projects. We re looking forward to receive more projects from various Debian teams! Learn more about the rationale behind this initiative in this article. Debian LTS contributors In May, 14 contributors have been paid to work on Debian LTS, their reports are available: Evolution of the situation In May we released 49 DLAs. The security tracker currently lists 71 packages with a known CVE and the dla-needed.txt file has 65 packages needing an update. The number of paid contributors increased significantly, we are pleased to welcome our latest team members: Andreas R nnquist, Dominik George, Enrico Zini and Stefano Rivera. It is worth pointing out that we are getting close to the end of the LTS period for Debian 9. After June 30th, no new security updates will be made available on security.debian.org. We are preparing to overtake Debian 10 Buster for the next two years and to make this process as smooth as possible. But Freexian and its team of paid Debian contributors will continue to maintain Debian 9 going forward for the customers of the Extended LTS offer. If you have Debian 9 servers to keep secure, it s time to subscribe! You might not have noticed, but Freexian formalized a mission statement where we explain that our purpose is to help improve Debian. For this, we want to fund work time for the Debian developers that recently joined Freexian as collaborators. The Extended LTS and the PHP LTS offers are built following a model that will help us to achieve this if we manage to have enough customers for those offers. So consider subscribing: you help your organization but you also help Debian! Thanks to our sponsors Sponsors that joined recently are in bold.

3 June 2022

Rapha&#235;l Hertzog: Freexian s report about Debian Long Term Support, April 2022

A Debian LTS logo
Like each month, have a look at the work funded by Freexian s Debian LTS offering. Debian project funding Two projects are currently in the pipeline: Gradle enterprise and Tryton update. Progress is quite slow on the Gradle one, there are technical difficulties. The tryton one was stalled because the developer had not enough time but seems to progress smoothly in the last weeks. In April, we put aside 2635 EUR to fund Debian projects. We re looking forward to receive more projects from various Debian teams! Learn more about the rationale behind this initiative in this article. Debian LTS contributors In April, 11 contributors have been paid to work on Debian LTS, their reports are available: Evolution of the situation In April we released 30 DLAs and we were glad to welcome a new customer with Alter Way. The security tracker currently lists 72 packages with a known CVE and the dla-needed.txt file has 71 packages needing an update. It is worth pointing out that we are getting close to the end of the LTS period for Debian 9. After June 30th, no new security updates will be made available on security.debian.org. But Freexian and its team of paid Debian contributors will continue to maintain Debian 9 going forward for the customers of the Extended LTS offer. If you have Debian 9 servers to keep secure, it s time to subscribe! You might not have noticed, but Freexian formalized a mission statement where we explain that our purpose is to help improve Debian. For this, we want to fund work time for the Debian developers that recently joined Freexian as collaborators. The Extended LTS and the PHP LTS offers are built following a model that will help us to achieve this if we manage to have enough customers for those offers. So consider subscribing: you help your organization but you also help Debian! Thanks to our sponsors Sponsors that joined recently are in bold.

28 April 2022

Rapha&#235;l Hertzog: Freexian s report about Debian Long Term Support, March 2022

A Debian LTS logo
Every month we review the work funded by Freexian s Debian LTS offering. Please find the report for March below. Debian project funding Learn more about the rationale behind this initiative in this article. Debian LTS contributors In March, 11 contributors were paid to work on Debian LTS, their reports are available below. If you re interested in participating in the LTS or ELTS teams, we welcome participation from the Debian community. Simply get in touch with Jeremiah or Rapha l if you are if you are interested in participating. Evolution of the situation In March we released 42 DLAs. The security tracker currently lists 81 packages with a known CVE and the dla-needed.txt file has 52 packages needing an update. We re glad to welcome a few new sponsors such as lectricit de France (Gold sponsor), Telecats BV and Soliton Systems. Thanks to our sponsors Sponsors that joined recently are in bold.

21 April 2022

Andy Simpkins: Firmware and Debian

There has been a flurry of activity on the Debian mailing lists ever since Steve McIntyre raised the issue of including non-free firmware as part of official Debian installation images. Firstly I should point out that I am in complete agreement with Steve s proposal to include non-free firmware as part of an installation image. Likewise I think that we should have a separate archive section for firmware. Because without doing so it will soon become almost impossible to install onto any new hardware. However, as always the issue is more nuanced than a first glance would suggest. Lets start by defining what is firmware? Firmware is any software that runs outside the orchestration of the operating system. Typically firmware will be executed on a processor(s) separate from the processor(s) running the OS, but this does not need to be the case. As Debian we are content that our systems can operate using fully free and open source software and firmware. We can install our OS without needing any non-free firmware. This is an illusion! Each and every PC platform contains non-free firmware It may be possible to run free firmware on some Graphics controllers, Wi-Fi chip-sets, or Ethernet cards and we can (and perhaps should) choose to spend our money on systems where this is the case. When installing a new system we might still be forced to hold our nose and install with non-free firmware on the peripheral before we are able to upgrade it to FLOSS firmware later if this is exists or is even possible to do so. However after the installation we are running a full FLOSS system in terms of software and firmware. We all (almost without exception) are running propitiatory firmware whether we like it or not. Even after carefully selecting graphics and network hardware with FLOSS firmware options we still haven t escaped from non-free-firmware. Other peripherals contain firmware too each keyboard, disk (SSDs and Spinning rust). Even your USB memory stick that you use to contain the Debian installation image contains a microcontroller and hence also contains firmware that runs on it.
  1. Much of this firmware can not even be updated.
  2. Some can be updated, but is stored in FLASH ROM and the hardware vendor has defeated all programming methods (possibly circumnavigated with a hardware mod).
  3. Some of it can be updated but requires external device programmers (and often the programming connections are a series of test points dotted around the board and not on a header in order to make programming as difficult as possible).
  4. Sometimes the firmware can be updated from within the host operating system (i.e. Debian)
  5. Sometimes, as Steve pointed out in his post, the hardware vendor has enough firmware on a peripheral to perform basic functions perhaps enough to install the OS, but requires additional firmware to enable specific feature (i.e. higher screen resolutions, hardware accelerated functions etc.)
  6. Finally some vendors don t even bother with any non-volatile storage beyond a basic boot loader and firmware must be loaded before the device can be used in any mode.
What about the motherboard? If we are lucky we might be able to run a FLOSS implementation of the UEFI subsystem (edk2/tianocore for example), indeed the non AMD64/i386 platforms based around ARM, MIPS architectures are often the most free when it comes to firmware. What about the microcode on the processor? Personally I wasn t aware that that this was updatable firmware until the Spectre and Meltdown classes of vulnerabilities arose a few years back. So back to Debian images including non-free firmware. This is specifically to address the last two use cases mentioned above, i.e. where firmware needs to be loaded to achieve a minimum functioning of a device. Although it could also include motherboard support, and microcode as well. As far as I can tell the proposal exists for several reasons: #1 Because some freely distributable firmware is required for more and more devices, in order to install Debian, or because whilst Debian can be installed a desktop environment can not be started or fully function #2 Because frankly it is less work to produce, test and maintain fewer installation images As someone who performs tests on our images, this clearly gets my vote :-) and perhaps most important of all.. #3 Because our least experienced users, and new users will download an official image and give up if things don t just work TM Steve s proposal option 5 would address theses issues and I fully support it. I would love to see separate repositories for firmware and firmware-none free. Additionally to accompany firmware non-free I would like to have information on what the firmware actually does. Can I run my hardware without it, what function(s) are limited without the firmware, better yet is there a FLOSS equivalent that I can load instead? Is this something that we can present in Debian installer? I would love not to require non-free firmware, but if I can t, I would love if DI would enable a user to make an informed choice as to what, if any, firmware is installed. Should we be requesting (requiring?) this information for any non-free firmware image that we carry in the archive? Finally lets consider firmware in the wider general case, not just the case where we need to load firmware from within Debian each and every boot. Personally I am annoyed whenever a hardware manufacturer has gone out of their way to prevent firmware updates. Lets face it software contains bugs, and we can assume that the software making up a firmware image will as well. Critical (security) vulnerabilities found in firmware, especially if this runs on the same processor(s) as the OS can impact on the wider system, not just the device itself. This will mean that, without updatable firmware, the hardware itself should be withdrawn from use whilst it would otherwise still function. By preventing firmware updates vendors are forcing early obsolescence in the hardware they sell, perhaps good for their bottom line, but certainly no good for users or the environment. Here I can practice what I preach. As an Electronic Engineer / Systems architect I have been beating the drum for In System Updatable firmware for ALL programmable devices in a system, be it a simple peripheral or a deeply embedded system. I can honestly say that over the last 20 years (yes I have been banging this particular drum for that long) I have had 100% success in arguing this case commercially. Having device programmers in R&D departments is one thing, but that is additional cost for production, and field service. Needing custom programming headers or even a bed of nails fixture to connect your target device to a programmer is more trouble than it is worth. Finally, the ability to update firmware in the field means that you can launch your product on schedule, make a sale and ship to a customer even if the first thing that you need to do is download an update. Offering that to any project manager will make you very popular indeed. So what if this firmware is non-free? As long as the firmware resides in non-volatile media without needing the OS to interact with it, we as a project don t need to carry it in our archives. And we as principled individuals can vote with our feet and wallets by choosing to purchase devices that have free firmware. But where that isn t an option, I ll take updatable but non-free firmware over non-free firmware that can not be updated any day of the week. Sure, the manufacture can choose to no longer support the firmware, and it is shocking how soon this happens often in the consumer market, the manufacture has withdrawn support for a product before it even reaches the end user (In which case we should boycott that manufacture in future until they either change their ways of go bust). But again if firmware can be updated in system that would at least allow the possibility of open firmware to arise. Indeed the only commercial case I have seen to argue against updatable firmware has been either for DRM, in which case good lets get rid of both, or for RF licence compliance, and even then it is tenuous because in this case the manufacture wants ISP for its own use right up until a device is shipped out the door, typically achived by blowing one time programmable fuse links .

1 April 2022

Paul Wise: FLOSS Activities March 2022

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Spam: reported 3 Debian bug reports and 53 Debian mailing list posts
  • Debian wiki: RecentChanges for the month
  • Debian BTS usertags: changes for the month
  • Debian screenshots:

Administration
  • Debian servers: investigate wiki mail delivery issue, restart backup director
  • Debian wiki: unblock IP addresses, approve accounts

Communication
  • Forward python-plac test failure issue upstream
  • Participate in Debian Project Leader election discussions
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The oci-python-sdk and plac work was sponsored. All other work was done on a volunteer basis.

17 March 2022

Rapha&#235;l Hertzog: Freexian s report about Debian Long Term Support, February 2022

A Debian LTS logo
Every month we review the work funded by Freexian s Debian LTS offering. Please find the report for February below. Debian project funding Debian LTS contributors In February, 12 contributors were paid to work on Debian LTS, their reports are available below. If you re interested in participating in the LTS or ELTS teams, we welcome participation from the Debian community. Simply get in touch with Jeremiah or Rapha l if you are if you are interested in participating. Evolution of the situation In February we released 24 DLAs. The security tracker currently lists 61 packages with a known CVE and the dla-needed.txt file has 26 packages needing an update. You can find out more about the Debian LTS project via the following video:
Thanks to our sponsors Sponsors that joined recently are in bold.

Next.

Previous.