Search Results: "mbr"

28 December 2022

Chris Lamb: Favourite books of 2022: Classics

As a follow-up to yesterday's post detailing my favourite works of fiction from 2022, today I'll be listing my favourite fictional works that are typically filed under classics. Books that just missed the cut here include: E. M. Forster's A Room with a View (1908) and his later A Passage to India (1913), both gently nudged out by Forster's superb Howard's End (see below). Giuseppe Tomasi di Lampedusa's The Leopard (1958) also just missed out on a write-up here, but I can definitely recommend it to anyone interested in reading a modern Italian classic.

War and Peace (1867) Leo Tolstoy It's strange to think that there is almost no point in reviewing this novel: who hasn't heard of War and Peace? What more could possibly be said about it now? Still, when I was growing up, War and Peace was always the stereotypical example of the 'impossible book', and even start it was, at best, a pointless task, and an act of hubris at worst. And so there surely exists a parallel universe in which I never have and will never will read the book... Nevertheless, let us try to set the scene. Book nine of the novel opens as follows:
On the twelfth of June, 1812, the forces of Western Europe crossed the Russian frontier and war began; that is, an event took place opposed to human reason and to human nature. Millions of men perpetrated against one another such innumerable crimes, frauds, treacheries, thefts, forgeries, issues of false money, burglaries, incendiarisms and murders as in whole centuries are not recorded in the annals of all the law courts of the world, but which those who committed them did not at the time regard as being crimes. What produced this extraordinary occurrence? What were its causes? [ ] The more we try to explain such events in history reasonably, the more unreasonable and incomprehensible they become to us.
Set against the backdrop of the Napoleonic Wars and Napoleon's invasion of Russia, War and Peace follows the lives and fates of three aristocratic families: The Rostovs, The Bolkonskys and the Bezukhov's. These characters find themselves situated athwart (or against) history, and all this time, Napoleon is marching ever closer to Moscow. Still, Napoleon himself is essentially just a kind of wallpaper for a diverse set of personal stories touching on love, jealousy, hatred, retribution, naivety, nationalism, stupidity and much much more. As Elif Batuman wrote earlier this year, "the whole premise of the book was that you couldn t explain war without recourse to domesticity and interpersonal relations." The result is that Tolstoy has woven an incredibly intricate web that connects the war, noble families and the everyday Russian people to a degree that is surprising for a book started in 1865. Tolstoy's characters are probably timeless (especially the picaresque adventures and constantly changing thoughts Pierre Bezukhov), and the reader who has any social experience will immediately recognise characters' thoughts and actions. Some of this is at a 'micro' interpersonal level: for instance, take this example from the elegant party that opens the novel:
Each visitor performed the ceremony of greeting this old aunt whom not one of them knew, not one of them wanted to know, and not one of them cared about. The aunt spoke to each of them in the same words, about their health and her own and the health of Her Majesty, who, thank God, was better today. And each visitor, though politeness prevented his showing impatience, left the old woman with a sense of relief at having performed a vexatious duty and did not return to her the whole evening.
But then, some of the focus of the observations are at the 'macro' level of the entire continent. This section about cities that feel themselves in danger might suffice as an example:
At the approach of danger, there are always two voices that speak with equal power in the human soul: one very reasonably tells a man to consider the nature of the danger and the means of escaping it; the other, still more reasonably, says that it is too depressing and painful to think of the danger, since it is not in man s power to foresee everything and avert the general course of events, and it is therefore better to disregard what is painful till it comes and to think about what is pleasant. In solitude, a man generally listens to the first voice, but in society to the second.
And finally, in his lengthy epilogues, Tolstoy offers us a dissertation on the behaviour of large organisations, much of it through engagingly witty analogies. These epilogues actually turn out to be an oblique and sarcastic commentary on the idiocy of governments and the madness of war in general. Indeed, the thorough dismantling of the 'great man' theory of history is a common theme throughout the book:
During the whole of that period [of 1812], Napoleon, who seems to us to have been the leader of all these movements as the figurehead of a ship may seem to a savage to guide the vessel acted like a child who, holding a couple of strings inside a carriage, thinks he is driving it. [ ] Why do [we] all speak of a military genius ? Is a man a genius who can order bread to be brought up at the right time and say who is to go to the right and who to the left? It is only because military men are invested with pomp and power and crowds of sychophants flatter power, attributing to it qualities of genius it does not possess.
Unlike some other readers, I especially enjoyed these diversions into the accounting and workings of history, as well as our narrow-minded way of trying to 'explain' things in a singular way:
When an apple has ripened and falls, why does it fall? Because of its attraction to the earth, because its stalk withers, because it is dried by the sun, because it grows heavier, because the wind shakes it, or because the boy standing below wants to eat it? Nothing is the cause. All this is only the coincidence of conditions in which all vital organic and elemental events occur. And the botanist who finds that the apple falls because the cellular tissue decays and so forth is equally right with the child who stands under the tree and says the apple fell because he wanted to eat it and prayed for it.
Given all of these serious asides, I was also not expecting this book to be quite so funny. At the risk of boring the reader with citations, take this sarcastic remark about the ineptness of medicine men:
After his liberation, [Pierre] fell ill and was laid up for three months. He had what the doctors termed 'bilious fever.' But despite the fact that the doctors treated him, bled him and gave him medicines to drink he recovered.
There is actually a multitude of remarks that are not entirely complimentary towards Russian medical practice, but they are usually deployed with an eye to the human element involved rather than simply to the detriment of a doctor's reputation "How would the count have borne his dearly loved daughter s illness had he not known that it was costing him a thousand rubles?" Other elements of note include some stunning set literary pieces, such as when Prince Andrei encounters a gnarly oak tree under two different circumstances in his life, and when Nat sha's 'Russian' soul is awakened by the strains of a folk song on the balalaika. Still, despite all of these micro- and macro-level happenings, for a long time I felt that something else was going on in War and Peace. It was difficult to put into words precisely what it was until I came across this passage by E. M. Forster:
After one has read War and Peace for a bit, great chords begin to sound, and we cannot say exactly what struck them. They do not arise from the story [and] they do not come from the episodes nor yet from the characters. They come from the immense area of Russia, over which episodes and characters have been scattered, from the sum-total of bridges and frozen rivers, forests, roads, gardens and fields, which accumulate grandeur and sonority after we have passed them. Many novelists have the feeling for place, [but] very few have the sense of space, and the possession of it ranks high in Tolstoy s divine equipment. Space is the lord of War and Peace, not time.
'Space' indeed. Yes, potential readers should note the novel's great length, but the 365 chapters are actually remarkably short, so the sensation of reading it is not in the least overwhelming. And more importantly, once you become familiar with its large cast of characters, it is really not a difficult book to follow, especially when compared to the other Russian classics. My only regret is that it has taken me so long to read this magnificent novel and that I might find it hard to find time to re-read it within the next few years.

Coming Up for Air (1939) George Orwell It wouldn't be a roundup of mine without at least one entry from George Orwell, and, this year, that place is occupied by a book I hadn't haven't read in almost two decades Still, the George Bowling of Coming Up for Air is a middle-aged insurance salesman who lives in a distinctly average English suburban row house with his nuclear family. One day, after winning some money on a bet, he goes back to the village where he grew up in order to fish in a pool he remembers from thirty years before. Less important than the plot, however, is both the well-observed remarks and scathing criticisms that Bowling has of the town he has returned to, combined with an ominous sense of foreboding before the Second World War breaks out. At several times throughout the book, George's placid thoughts about his beloved carp pool are replaced by racing, anxious thoughts that overwhelm his inner peace:
War is coming. In 1941, they say. And there'll be plenty of broken crockery, and little houses ripped open like packing-cases, and the guts of the chartered accountant's clerk plastered over the piano that he's buying on the never-never. But what does that kind of thing matter, anyway? I'll tell you what my stay in Lower Binfield had taught me, and it was this. IT'S ALL GOING TO HAPPEN. All the things you've got at the back of your mind, the things you're terrified of, the things that you tell yourself are just a nightmare or only happen in foreign countries. The bombs, the food-queues, the rubber truncheons, the barbed wire, the coloured shirts, the slogans, the enormous faces, the machine-guns squirting out of bedroom windows. It's all going to happen. I know it - at any rate, I knew it then. There's no escape. Fight against it if you like, or look the other way and pretend not to notice, or grab your spanner and rush out to do a bit of face-smashing along with the others. But there's no way out. It's just something that's got to happen.
Already we can hear psychological madness that underpinned the Second World War. Indeed, there is no great story in Coming Up For Air, no wonderfully empathetic characters and no revelations or catharsis, so it is impressive that I was held by the descriptions, observations and nostalgic remembrances about life in modern Lower Binfield, its residents, and how it has changed over the years. It turns out, of course, that George's beloved pool has been filled in with rubbish, and the village has been perverted by modernity beyond recognition. And to cap it off, the principal event of George's holiday in Lower Binfield is an accidental bombing by the British Royal Air Force. Orwell is always good at descriptions of awful food, and this book is no exception:
The frankfurter had a rubber skin, of course, and my temporary teeth weren't much of a fit. I had to do a kind of sawing movement before I could get my teeth through the skin. And then suddenly pop! The thing burst in my mouth like a rotten pear. A sort of horrible soft stuff was oozing all over my tongue. But the taste! For a moment I just couldn't believe it. Then I rolled my tongue around it again and had another try. It was fish! A sausage, a thing calling itself a frankfurter, filled with fish! I got up and walked straight out without touching my coffee. God knows what that might have tasted of.
Many other tell-tale elements of Orwell's fictional writing are in attendance in this book as well, albeit worked out somewhat less successfully than elsewhere in his oeuvre. For example, the idea of a physical ailment also serving as a metaphor is present in George's false teeth, embodying his constant preoccupation with his ageing. (Readers may recall Winston Smith's varicose ulcer representing his repressed humanity in Nineteen Eighty-Four). And, of course, we have a prematurely middle-aged protagonist who almost but not quite resembles Orwell himself. Given this and a few other niggles (such as almost all the women being of the typical Orwell 'nagging wife' type), it is not exactly Orwell's magnum opus. But it remains a fascinating historical snapshot of the feeling felt by a vast number of people just prior to the Second World War breaking out, as well as a captivating insight into how the process of nostalgia functions and operates.

Howards End (1910) E. M. Forster Howards End begins with the following sentence:
One may as well begin with Helen s letters to her sister.
In fact, "one may as well begin with" my own assumptions about this book instead. I was actually primed to consider Howards End a much more 'Victorian' book: I had just finished Virginia Woolf's Mrs Dalloway and had found her 1925 book at once rather 'modern' but also very much constrained by its time. I must have then unconsciously surmised that a book written 15 years before would be even more inscrutable, and, with its Victorian social mores added on as well, Howards End would probably not undress itself so readily in front of the reader. No doubt there were also the usual expectations about 'the classics' as well. So imagine my surprise when I realised just how inordinately affable and witty Howards End turned out to be. It doesn't have that Wildean shine of humour, of course, but it's a couple of fields over in the English countryside, perhaps abutting the more mordant social satires of the earlier George Orwell novels (see Coming Up for Air above). But now let us return to the story itself. Howards End explores class warfare, conflict and the English character through a tale of three quite different families at the beginning of the twentieth century: the rich Wilcoxes; the gentle & idealistic Schlegels; and the lower-middle class Basts. As the Bloomsbury Group Schlegel sisters desperately try to help the Basts and educate the rich but close-minded Wilcoxes, the three families are drawn ever closer and closer together. Although the whole story does, I suppose, revolve around the house in the title (which is based on the Forster's own childhood home), Howards End is perhaps best described as a comedy of manners or a novel that shows up the hypocrisy of people and society. In fact, it is surprising how little of the story actually takes place in the eponymous house, with the overwhelming majority of the first half of the book taking place in London. But it is perhaps more illuminating to remark that the Howards End of the book is a house that the Wilcoxes who own it at the start of the novel do not really need or want. What I particularly liked about Howards End is how the main character's ideals alter as they age, and subsequently how they find their lives changing in different ways. Some of them find themselves better off at the end, others worse. And whilst it is also surprisingly funny, it still manages to trade in heavier social topics as well. This is apparent in the fact that, although the characters themselves are primarily in charge of their own destinies, their choices are still constrained by the changing world and shifting sense of morality around them. This shouldn't be too surprising: after all, Forster's novel was published just four years before the Great War, a distinctly uncertain time. Not for nothing did Virginia Woolf herself later observe that "on or about December 1910, human character changed" and that "all human relations have shifted: those between masters and servants, husbands and wives, parents and children." This process can undoubtedly be seen rehearsed throughout Forster's Howards End, and it's a credit to the author to be able to capture it so early on, if not even before it was widespread throughout Western Europe. I was also particularly taken by Forster's fertile use of simile. An extremely apposite example can be found in the description Tibby Schlegel gives of his fellow Cambridge undergraduates. Here, Timmy doesn't want to besmirch his lofty idealisation of them with any banal specificities, and wishes that the idea of them remain as ideal Platonic forms instead. Or, as Forster puts it, to Timmy it is if they are "pictures that must not walk out of their frames." Wilde, at his most weakest, is 'just' style, but Forster often deploys his flair for a deeper effect. Indeed, when you get to the end of this section mentioning picture frames, you realise Forster has actually just smuggled into the story a failed attempt on Tibby's part to engineer an anonymous homosexual encounter with another undergraduate. It is a credit to Forster's sleight-of-hand that you don't quite notice what has just happened underneath you and that the books' reticence to honestly describe what has happened is thus structually analogus Tibby's reluctance to admit his desires to himself. Another layer to the character of Tibby (and the novel as a whole) is thereby introduced without the imposition of clumsy literary scaffolding. In a similar vein, I felt very clever noticing the arch reference to Debussy's Pr lude l'apr s-midi d'un faune until I realised I just fell into the trap Forster set for the reader in that I had become even more like Tibby in his pseudo-scholarly views on classical music. Finally, I enjoyed that each chapter commences with an ironic and self-conscious bon mot about society which is only slightly overblown for effect. Particularly amusing are the ironic asides on "women" that run through the book, ventriloquising the narrow-minded views of people like the Wilcoxes. The omniscient and amiable narrator of the book also recalls those ironically distant voiceovers from various French New Wave films at times, yet Forster's narrator seems to have bigger concerns in his mordant asides: Forster seems to encourage some sympathy for all of the characters even the more contemptible ones at their worst moments. Highly recommended, as are Forster's A Room with a View (1908) and his slightly later A Passage to India (1913).

The Good Soldier (1915) Ford Madox Ford The Good Soldier starts off fairly simply as the narrator's account of his and his wife's relationship with some old friends, including the eponymous 'Good Soldier' of the book's title. It's an experience to read the beginning of this novel, as, like any account of endless praise of someone you've never met or care about, the pages of approving remarks about them appear to be intended to wash over you. Yet as the chapters of The Good Soldier go by, the account of the other characters in the book gets darker and darker. Although the author himself is uncritical of others' actions, your own critical faculties are slowgrly brought into play, and you gradully begin to question the narrator's retelling of events. Our narrator is an unreliable narrator in the strict sense of the term, but with the caveat that he is at least is telling us everything we need to know to come to our own conclusions. As the book unfolds further, the narrator's compromised credibility seems to infuse every element of the novel even the 'Good' of the book's title starts to seem like a minor dishonesty, perhaps serving as the inspiration for the irony embedded in the title of The 'Great' Gatsby. Much more effectively, however, the narrator's fixations, distractions and manner of speaking feel very much part of his dissimulation. It sometimes feels like he is unconsciously skirting over the crucial elements in his tale, exactly like one does in real life when recounting a story containing incriminating ingredients. Indeed, just how much the narrator is conscious of his own concealment is just one part of what makes this such an interesting book: Ford Madox Ford has gifted us with enough ambiguity that it is also possible that even the narrator cannot find it within himself to understand the events of the story he is narrating. It was initially hard to believe that such a carefully crafted analysis of a small group of characters could have been written so long ago, and despite being fairly easy to read, The Good Soldier is an almost infinitely subtle book even the jokes are of the subtle kind and will likely get a re-read within the next few years.

Anna Karenina (1878) Leo Tolstoy There are many similar themes running through War and Peace (reviewed above) and Anna Karenina. Unrequited love; a young man struggling to find a purpose in life; a loving family; an overwhelming love of nature and countless fascinating observations about the minuti of Russian society. Indeed, rather than primarily being about the eponymous Anna, Anna Karenina provides a vast panorama of contemporary life in Russia and of humanity in general. Nevertheless, our Anna is a sophisticated woman who abandons her empty existence as the wife of government official Alexei Karenin, a colourless man who has little personality of his own, and she turns to a certain Count Vronsky in order to fulfil her passionate nature. Needless to say, this results in tragic consequences as their (admittedly somewhat qualified) desire to live together crashes against the rocks of reality and Russian society. Parallel to Anna's narrative, though, Konstantin Levin serves as the novel's alter-protagonist. In contrast to Anna, Levin is a socially awkward individual who straddles many schools of thought within Russia at the time: he is neither a free-thinker (nor heavy-drinker) like his brother Nikolai, and neither is he a bookish intellectual like his half-brother Serge. In short, Levin is his own man, and it is generally agreed by commentators that he is Tolstoy's surrogate within the novel. Levin tends to come to his own version of an idea, and he would rather find his own way than adopt any prefabricated view, even if confusion and muddle is the eventual result. In a roughly isomorphic fashion then, he resembles Anna in this particular sense, whose story is a counterpart to Levin's in their respective searches for happiness and self-actualisation. Whilst many of the passionate and exciting passages are told on Anna's side of the story (I'm thinking horse race in particular, as thrilling as anything in cinema ), many of the broader political thoughts about the nature of the working classes are expressed on Levin's side instead. These are stirring and engaging in their own way, though, such as when he joins his peasants to mow the field and seems to enter the nineteenth-century version of 'flow':
The longer Levin mowed, the more often he felt those moments of oblivion during which it was no longer his arms that swung the scythe, but the scythe itself that lent motion to his whole body, full of life and conscious of itself, and, as if by magic, without a thought of it, the work got rightly and neatly done on its own. These were the most blissful moments.
Overall, Tolstoy poses no didactic moral message towards any of the characters in Anna Karenina, and merely invites us to watch rather than judge. (Still, there is a hilarious section that is scathing of contemporary classical music, presaging many of the ideas found in Tolstoy's 1897 What is Art?). In addition, just like the earlier War and Peace, the novel is run through with a number of uncannily accurate observations about daily life:
Anna smiled, as one smiles at the weaknesses of people one loves, and, putting her arm under his, accompanied him to the door of the study.
... as well as the usual sprinkling of Tolstoy's sardonic humour ("No one is pleased with his fortune, but everyone is pleased with his wit."). Fyodor Dostoyevsky, the other titan of Russian literature, once described Anna Karenina as a "flawless work of art," and if you re only going to read one Tolstoy novel in your life, it should probably be this one.

23 December 2022

Scarlett Gately Moore: Debian uploads, Core22 KDE snap content pack and more!

I have been quite busy! I have been working on several projects so my cover image is a lovely sunset where I live. Debian: I have updated and uploaded several packages and working on more. KDE Snaps: I have reworked the CI to now do Core22 snaps! They will publish to the beta channel until we get them tested. First snap completed is the ever important KDE Frameworks / QT content snap + SDK! Applications will start after I tackle the kde-neon extention in snapcraft. GUI-Testing: I have begun learning/writing some GUI tests using python and https://invent.kde.org/sdk/selenium-webdriver-at-spi/, inspired by one of my favorite people, Harald. See https://apachelog.wordpress.com/2022/12/14/selenium-at-spi-gui-testing/ for more info and I hope to get these in repos near you soon! In closing, I am still seeking employment/sponsor amidst this terrible layoff season. If anyone knows of anyone with my diverse skill set please let me know. In the meantime if you can spare anything to keep the lights on I would be ever so grateful. Patreon: https://www.patreon.com/sgmoore Cash App: $ScarlettMoore0903 Stripe: https://buy.stripe.com/28o16y3PHcISfaE8ww Thank you, I want to wish everyone a very merry < insert your holiday here > !!!

16 November 2022

Antoine Beaupr : A ZFS migration

In my tubman setup, I started using ZFS on an old server I had lying around. The machine is really old though (2011!) and it "feels" pretty slow. I want to see how much of that is ZFS and how much is the machine. Synthetic benchmarks show that ZFS may be slower than mdadm in RAID-10 or RAID-6 configuration, so I want to confirm that on a live workload: my workstation. Plus, I want easy, regular, high performance backups (with send/receive snapshots) and there's no way I'm going to use BTRFS because I find it too confusing and unreliable. So off we go.

Installation Since this is a conversion (and not a new install), our procedure is slightly different than the official documentation but otherwise it's pretty much in the same spirit: we're going to use ZFS for everything, including the root filesystem. So, install the required packages, on the current system:
apt install --yes gdisk zfs-dkms zfs zfs-initramfs zfsutils-linux
We also tell DKMS that we need to rebuild the initrd when upgrading:
echo REMAKE_INITRD=yes > /etc/dkms/zfs.conf

Partitioning This is going to partition /dev/sdc with:
  • 1MB MBR / BIOS legacy boot
  • 512MB EFI boot
  • 1GB bpool, unencrypted pool for /boot
  • rest of the disk for zpool, the rest of the data
     sgdisk --zap-all /dev/sdc
     sgdisk -a1 -n1:24K:+1000K -t1:EF02 /dev/sdc
     sgdisk     -n2:1M:+512M   -t2:EF00 /dev/sdc
     sgdisk     -n3:0:+1G      -t3:BF01 /dev/sdc
     sgdisk     -n4:0:0        -t4:BF00 /dev/sdc
    
That will look something like this:
    root@curie:/home/anarcat# sgdisk -p /dev/sdc
    Disk /dev/sdc: 1953525168 sectors, 931.5 GiB
    Model: ESD-S1C         
    Sector size (logical/physical): 512/512 bytes
    Disk identifier (GUID): [REDACTED]
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 1953525134
    Partitions will be aligned on 16-sector boundaries
    Total free space is 14 sectors (7.0 KiB)
    Number  Start (sector)    End (sector)  Size       Code  Name
       1              48            2047   1000.0 KiB  EF02  
       2            2048         1050623   512.0 MiB   EF00  
       3         1050624         3147775   1024.0 MiB  BF01  
       4         3147776      1953525134   930.0 GiB   BF00
Unfortunately, we can't be sure of the sector size here, because the USB controller is probably lying to us about it. Normally, this smartctl command should tell us the sector size as well:
root@curie:~# smartctl -i /dev/sdb -qnoserial
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-14-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black Mobile
Device Model:     WDC WD10JPLX-00MBPT0
Firmware Version: 01.01H01
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue May 17 13:33:04 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Above is the example of the builtin HDD drive. But the SSD device enclosed in that USB controller doesn't support SMART commands, so we can't trust that it really has 512 bytes sectors. This matters because we need to tweak the ashift value correctly. We're going to go ahead the SSD drive has the common 4KB settings, which means ashift=12. Note here that we are not creating a separate partition for swap. Swap on ZFS volumes (AKA "swap on ZVOL") can trigger lockups and that issue is still not fixed upstream. Ubuntu recommends using a separate partition for swap instead. But since this is "just" a workstation, we're betting that we will not suffer from this problem, after hearing a report from another Debian developer running this setup on their workstation successfully. We do not recommend this setup though. In fact, if I were to redo this partition scheme, I would probably use LUKS encryption and setup a dedicated swap partition, as I had problems with ZFS encryption as well.

Creating pools ZFS pools are somewhat like "volume groups" if you are familiar with LVM, except they obviously also do things like RAID-10. (Even though LVM can technically also do RAID, people typically use mdadm instead.) In any case, the guide suggests creating two different pools here: one, in cleartext, for boot, and a separate, encrypted one, for the rest. Technically, the boot partition is required because the Grub bootloader only supports readonly ZFS pools, from what I understand. But I'm a little out of my depth here and just following the guide.

Boot pool creation This creates the boot pool in readonly mode with features that grub supports:
    zpool create \
        -o cachefile=/etc/zfs/zpool.cache \
        -o ashift=12 -d \
        -o feature@async_destroy=enabled \
        -o feature@bookmarks=enabled \
        -o feature@embedded_data=enabled \
        -o feature@empty_bpobj=enabled \
        -o feature@enabled_txg=enabled \
        -o feature@extensible_dataset=enabled \
        -o feature@filesystem_limits=enabled \
        -o feature@hole_birth=enabled \
        -o feature@large_blocks=enabled \
        -o feature@lz4_compress=enabled \
        -o feature@spacemap_histogram=enabled \
        -o feature@zpool_checkpoint=enabled \
        -O acltype=posixacl -O canmount=off \
        -O compression=lz4 \
        -O devices=off -O normalization=formD -O relatime=on -O xattr=sa \
        -O mountpoint=/boot -R /mnt \
        bpool /dev/sdc3
I haven't investigated all those settings and just trust the upstream guide on the above.

Main pool creation This is a more typical pool creation.
    zpool create \
        -o ashift=12 \
        -O encryption=on -O keylocation=prompt -O keyformat=passphrase \
        -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
        -O compression=zstd \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/ -R /mnt \
        rpool /dev/sdc4
Breaking this down:
  • -o ashift=12: mentioned above, 4k sector size
  • -O encryption=on -O keylocation=prompt -O keyformat=passphrase: encryption, prompt for a password, default algorithm is aes-256-gcm, explicit in the guide, made implicit here
  • -O acltype=posixacl -O xattr=sa: enable ACLs, with better performance (not enabled by default)
  • -O dnodesize=auto: related to extended attributes, less compatibility with other implementations
  • -O compression=zstd: enable zstd compression, can be disabled/enabled by dataset to with zfs set compression=off rpool/example
  • -O relatime=on: classic atime optimisation, another that could be used on a busy server is atime=off
  • -O canmount=off: do not make the pool mount automatically with mount -a?
  • -O mountpoint=/ -R /mnt: mount pool on / in the future, but /mnt for now
Those settings are all available in zfsprops(8). Other flags are defined in zpool-create(8). The reasoning behind them is also explained in the upstream guide and some also in [the Debian wiki][]. Those flags were actually not used:
  • -O normalization=formD: normalize file names on comparisons (not storage), implies utf8only=on, which is a bad idea (and effectively meant my first sync failed to copy some files, including this folder from a supysonic checkout). and this cannot be changed after the filesystem is created. bad, bad, bad.
[the Debian wiki]: https://wiki.debian.org/ZFS#Advanced_Topics

Side note about single-disk pools Also note that we're living dangerously here: single-disk ZFS pools are rumoured to be more dangerous than not running ZFS at all. The choice quote from this article is:
[...] any error can be detected, but cannot be corrected. This sounds like an acceptable compromise, but its actually not. The reason its not is that ZFS' metadata cannot be allowed to be corrupted. If it is it is likely the zpool will be impossible to mount (and will probably crash the system once the corruption is found). So a couple of bad sectors in the right place will mean that all data on the zpool will be lost. Not some, all. Also there's no ZFS recovery tools, so you cannot recover any data on the drives.
Compared with (say) ext4, where a single disk error can recovered, this is pretty bad. But we are ready to live with this with the idea that we'll have hourly offline snapshots that we can easily recover from. It's trade-off. Also, we're running this on a NVMe/M.2 drive which typically just blinks out of existence completely, and doesn't "bit rot" the way a HDD would. Also, the FreeBSD handbook quick start doesn't have any warnings about their first example, which is with a single disk. So I am reassured at least.

Creating mount points Next we create the actual filesystems, known as "datasets" which are the things that get mounted on mountpoint and hold the actual files.
  • this creates two containers, for ROOT and BOOT
     zfs create -o canmount=off -o mountpoint=none rpool/ROOT &&
     zfs create -o canmount=off -o mountpoint=none bpool/BOOT
    
    Note that it's unclear to me why those datasets are necessary, but they seem common practice, also used in this FreeBSD example. The OpenZFS guide mentions the Solaris upgrades and Ubuntu's zsys that use that container for upgrades and rollbacks. This blog post seems to explain a bit the layout behind the installer.
  • this creates the actual boot and root filesystems:
     zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/debian &&
     zfs mount rpool/ROOT/debian &&
     zfs create -o mountpoint=/boot bpool/BOOT/debian
    
    I guess the debian name here is because we could technically have multiple operating systems with the same underlying datasets.
  • then the main datasets:
     zfs create                                 rpool/home &&
     zfs create -o mountpoint=/root             rpool/home/root &&
     chmod 700 /mnt/root &&
     zfs create                                 rpool/var
    
  • exclude temporary files from snapshots:
     zfs create -o com.sun:auto-snapshot=false  rpool/var/cache &&
     zfs create -o com.sun:auto-snapshot=false  rpool/var/tmp &&
     chmod 1777 /mnt/var/tmp
    
  • and skip automatic snapshots in Docker:
     zfs create -o canmount=off                 rpool/var/lib &&
     zfs create -o com.sun:auto-snapshot=false  rpool/var/lib/docker
    
    Notice here a peculiarity: we must create rpool/var/lib to create rpool/var/lib/docker otherwise we get this error:
     cannot create 'rpool/var/lib/docker': parent does not exist
    
    ... and no, just creating /mnt/var/lib doesn't fix that problem. In fact, it makes things even more confusing because an existing directory shadows a mountpoint, which is the opposite of how things normally work. Also note that you will probably need to change storage driver in Docker, see the zfs-driver documentation for details but, basically, I did:
    echo '  "storage-driver": "zfs"  ' > /etc/docker/daemon.json
    
    Note that podman has the same problem (and similar solution):
    printf '[storage]\ndriver = "zfs"\n' > /etc/containers/storage.conf
    
  • make a tmpfs for /run:
     mkdir /mnt/run &&
     mount -t tmpfs tmpfs /mnt/run &&
     mkdir /mnt/run/lock
    
We don't create a /srv, as that's the HDD stuff. Also mount the EFI partition:
mkfs.fat -F 32 /dev/sdc2 &&
mount /dev/sdc2 /mnt/boot/efi/
At this point, everything should be mounted in /mnt. It should look like this:
root@curie:~# LANG=C df -h -t zfs -t vfat
Filesystem            Size  Used Avail Use% Mounted on
rpool/ROOT/debian     899G  384K  899G   1% /mnt
bpool/BOOT/debian     832M  123M  709M  15% /mnt/boot
rpool/home            899G  256K  899G   1% /mnt/home
rpool/home/root       899G  256K  899G   1% /mnt/root
rpool/var             899G  384K  899G   1% /mnt/var
rpool/var/cache       899G  256K  899G   1% /mnt/var/cache
rpool/var/tmp         899G  256K  899G   1% /mnt/var/tmp
rpool/var/lib/docker  899G  256K  899G   1% /mnt/var/lib/docker
/dev/sdc2             511M  4.0K  511M   1% /mnt/boot/efi
Now that we have everything setup and mounted, let's copy all files over.

Copying files This is a list of all the mounted filesystems
for fs in /boot/ /boot/efi/ / /home/; do
    echo "syncing $fs to /mnt$fs..." && 
    rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
done
You can check that the list is correct with:
mount -l -t ext4,btrfs,vfat   awk ' print $3 '
Note that we skip /srv as it's on a different disk. On the first run, we had:
root@curie:~# for fs in /boot/ /boot/efi/ / /home/; do
        echo "syncing $fs to /mnt$fs..." && 
        rsync -aSHAXx --info=progress2 $fs /mnt$fs
    done
syncing /boot/ to /mnt/boot/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/299)  
syncing /boot/efi/ to /mnt/boot/efi/...
     16,831,437 100%  184.14MB/s    0:00:00 (xfr#101, to-chk=0/110)
syncing / to /mnt/...
 28,019,293,280  94%   47.63MB/s    0:09:21 (xfr#703710, ir-chk=6748/839220)rsync: [generator] delete_file: rmdir(var/lib/docker) failed: Device or resource busy (16)
could not make way for new symlink: var/lib/docker
 34,081,267,990  98%   50.71MB/s    0:10:40 (xfr#736577, to-chk=0/867732)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
syncing /home/ to /mnt/home/...
rsync: [sender] readlink_stat("/home/anarcat/.fuse") failed: Permission denied (13)
 24,456,268,098  98%   68.03MB/s    0:05:42 (xfr#159867, ir-chk=6875/172377) 
file has vanished: "/home/anarcat/.cache/mozilla/firefox/s2hwvqbu.quantum/cache2/entries/B3AB0CDA9C4454B3C1197E5A22669DF8EE849D90"
199,762,528,125  93%   74.82MB/s    0:42:26 (xfr#1437846, ir-chk=1018/1983979)rsync: [generator] recv_generator: mkdir "/mnt/home/anarcat/dist/supysonic/tests/assets/\#346" failed: Invalid or incomplete multibyte or wide character (84)
*** Skipping any contents from this failed directory ***
315,384,723,978  96%   76.82MB/s    1:05:15 (xfr#2256473, to-chk=0/2993950)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Note the failure to transfer that supysonic file? It turns out they had a weird filename in their source tree, since then removed, but still it showed how the utf8only feature might not be such a bad idea. At this point, the procedure was restarted all the way back to "Creating pools", after unmounting all ZFS filesystems (umount /mnt/run /mnt/boot/efi && umount -t zfs -a) and destroying the pool, which, surprisingly, doesn't require any confirmation (zpool destroy rpool). The second run was cleaner:
root@curie:~# for fs in /boot/ /boot/efi/ / /home/; do
        echo "syncing $fs to /mnt$fs..." && 
        rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
    done
syncing /boot/ to /mnt/boot/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/299)  
syncing /boot/efi/ to /mnt/boot/efi/...
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/110)  
syncing / to /mnt/...
 28,019,033,070  97%   42.03MB/s    0:10:35 (xfr#703671, ir-chk=1093/833515)rsync: [generator] delete_file: rmdir(var/lib/docker) failed: Device or resource busy (16)
could not make way for new symlink: var/lib/docker
 34,081,807,102  98%   44.84MB/s    0:12:04 (xfr#736580, to-chk=0/867723)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
syncing /home/ to /mnt/home/...
rsync: [sender] readlink_stat("/home/anarcat/.fuse") failed: Permission denied (13)
IO error encountered -- skipping file deletion
 24,043,086,450  96%   62.03MB/s    0:06:09 (xfr#151819, ir-chk=15117/172571)
file has vanished: "/home/anarcat/.cache/mozilla/firefox/s2hwvqbu.quantum/cache2/entries/4C1FDBFEA976FF924D062FB990B24B897A77B84B"
315,423,626,507  96%   67.09MB/s    1:14:43 (xfr#2256845, to-chk=0/2994364)    
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Also note the transfer speed: we seem capped at 76MB/s, or 608Mbit/s. This is not as fast as I was expecting: the USB connection seems to be at around 5Gbps:
anarcat@curie:~$ lsusb -tv   head -4
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
     __ Port 1: Dev 4, If 0, Class=Mass Storage, Driver=uas, 5000M
        ID 0b05:1932 ASUSTek Computer, Inc.
So it shouldn't cap at that speed. It's possible the USB adapter is failing to give me the full speed though. It's not the M.2 SSD drive either, as that has a ~500MB/s bandwidth, acccording to its spec. At this point, we're about ready to do the final configuration. We drop to single user mode and do the rest of the procedure. That used to be shutdown now, but it seems like the systemd switch broke that, so now you can reboot into grub and pick the "recovery" option. Alternatively, you might try systemctl rescue, as I found out. I also wanted to copy the drive over to another new NVMe drive, but that failed: it looks like the USB controller I have doesn't work with older, non-NVME drives.

Boot configuration Now we need to enter the new system to rebuild the boot loader and initrd and so on. First, we bind mounts and chroot into the ZFS disk:
mount --rbind /dev  /mnt/dev &&
mount --rbind /proc /mnt/proc &&
mount --rbind /sys  /mnt/sys &&
chroot /mnt /bin/bash
Next we add an extra service that imports the bpool on boot, to make sure it survives a zpool.cache destruction:
cat > /etc/systemd/system/zfs-import-bpool.service <<EOF
[Unit]
DefaultDependencies=no
Before=zfs-import-scan.service
Before=zfs-import-cache.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/zpool import -N -o cachefile=none bpool
# Work-around to preserve zpool cache:
ExecStartPre=-/bin/mv /etc/zfs/zpool.cache /etc/zfs/preboot_zpool.cache
ExecStartPost=-/bin/mv /etc/zfs/preboot_zpool.cache /etc/zfs/zpool.cache
[Install]
WantedBy=zfs-import.target
EOF
Enable the service:
systemctl enable zfs-import-bpool.service
I had to trim down /etc/fstab and /etc/crypttab to only contain references to the legacy filesystems (/srv is still BTRFS!). If we don't already have a tmpfs defined in /etc/fstab:
ln -s /usr/share/systemd/tmp.mount /etc/systemd/system/ &&
systemctl enable tmp.mount
Rebuild boot loader with support for ZFS, but also to workaround GRUB's missing zpool-features support:
grub-probe /boot   grep -q zfs &&
update-initramfs -c -k all &&
sed -i 's,GRUB_CMDLINE_LINUX.*,GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/debian",' /etc/default/grub &&
update-grub
For good measure, make sure the right disk is configured here, for example you might want to tag both drives in a RAID array:
dpkg-reconfigure grub-pc
Install grub to EFI while you're there:
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=debian --recheck --no-floppy
Filesystem mount ordering. The rationale here in the OpenZFS guide is a little strange, but I don't dare ignore that.
mkdir /etc/zfs/zfs-list.cache
touch /etc/zfs/zfs-list.cache/bpool
touch /etc/zfs/zfs-list.cache/rpool
zed -F &
Verify that zed updated the cache by making sure these are not empty:
cat /etc/zfs/zfs-list.cache/bpool
cat /etc/zfs/zfs-list.cache/rpool
Once the files have data, stop zed:
fg
Press Ctrl-C.
Fix the paths to eliminate /mnt:
sed -Ei "s /mnt/? / " /etc/zfs/zfs-list.cache/*
Snapshot initial install:
zfs snapshot bpool/BOOT/debian@install
zfs snapshot rpool/ROOT/debian@install
Exit chroot:
exit

Finalizing One last sync was done in rescue mode:
for fs in /boot/ /boot/efi/ / /home/; do
    echo "syncing $fs to /mnt$fs..." && 
    rsync -aSHAXx --info=progress2 --delete $fs /mnt$fs
done
Then we unmount all filesystems:
mount   grep -v zfs   tac   awk '/\/mnt/  print $3 '   xargs -i  umount -lf  
zpool export -a
Reboot, swap the drives, and boot in ZFS. Hurray!

Benchmarks This is a test that was ran in single-user mode using fio and the Ars Technica recommended tests, which are:
  • Single 4KiB random write process:
     fio --name=randwrite4k1x --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
    
  • 16 parallel 64KiB random write processes:
     fio --name=randwrite64k16x --ioengine=posixaio --rw=randwrite --bs=64k --size=256m --numjobs=16 --iodepth=16 --runtime=60 --time_based --end_fsync=1
    
  • Single 1MiB random write process:
     fio --name=randwrite1m1x --ioengine=posixaio --rw=randwrite --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
    
Strangely, that's not exactly what the author, Jim Salter, did in his actual test bench used in the ZFS benchmarking article. The first thing is there's no read test at all, which is already pretty strange. But also it doesn't include stuff like dropping caches or repeating results. So here's my variation, which i called fio-ars-bench.sh for now. It just batches a bunch of fio tests, one by one, 60 seconds each. It should take about 12 minutes to run, as there are 3 pair of tests, read/write, with and without async. My bias, before building, running and analysing those results is that ZFS should outperform the traditional stack on writes, but possibly not on reads. It's also possible it outperforms it on both, because it's a newer drive. A new test might be possible with a new external USB drive as well, although I doubt I will find the time to do this.

Results All tests were done on WD blue SN550 drives, which claims to be able to push 2400MB/s read and 1750MB/s write. An extra drive was bought to move the LVM setup from a WDC WDS500G1B0B-00AS40 SSD, a WD blue M.2 2280 SSD that was at least 5 years old, spec'd at 560MB/s read, 530MB/s write. Benchmarks were done on the M.2 SSD drive but discarded so that the drive difference is not a factor in the test. In practice, I'm going to assume we'll never reach those numbers because we're not actually NVMe (this is an old workstation!) so the bottleneck isn't the disk itself. For our purposes, it might still give us useful results.

Rescue test, LUKS/LVM/ext4 Those tests were performed with everything shutdown, after either entering the system in rescue mode, or by reaching that target with:
systemctl rescue
The network might have been started before or after the test as well:
systemctl start systemd-networkd
So it should be fairly reliable as basically nothing else is running. Raw numbers, from the ?job-curie-lvm.log, converted to MiB/s and manually merged:
test read I/O read IOPS write I/O write IOPS
rand4k4g1x 39.27 10052 212.15 54310
rand4k4g1x--fsync=1 39.29 10057 2.73 699
rand64k256m16x 1297.00 20751 1068.57 17097
rand64k256m16x--fsync=1 1290.90 20654 353.82 5661
rand1m16g1x 315.15 315 563.77 563
rand1m16g1x--fsync=1 345.88 345 157.01 157
Peaks are at about 20k IOPS and ~1.3GiB/s read, 1GiB/s write in the 64KB blocks with 16 jobs. Slowest is the random 4k block sync write at an abysmal 3MB/s and 700 IOPS The 1MB read/write tests have lower IOPS, but that is expected.

Rescue test, ZFS This test was also performed in rescue mode. Raw numbers, from the ?job-curie-zfs.log, converted to MiB/s and manually merged:
test read I/O read IOPS write I/O write IOPS
rand4k4g1x 77.20 19763 27.13 6944
rand4k4g1x--fsync=1 76.16 19495 6.53 1673
rand64k256m16x 1882.40 30118 70.58 1129
rand64k256m16x--fsync=1 1865.13 29842 71.98 1151
rand1m16g1x 921.62 921 102.21 102
rand1m16g1x--fsync=1 908.37 908 64.30 64
Peaks are at 1.8GiB/s read, also in the 64k job like above, but much faster. The write is, as expected, much slower at 70MiB/s (compared to 1GiB/s!), but it should be noted the sync write doesn't degrade performance compared to async writes (although it's still below the LVM 300MB/s).

Conclusions Really, ZFS has trouble performing in all write conditions. The random 4k sync write test is the only place where ZFS outperforms LVM in writes, and barely (7MiB/s vs 3MiB/s). Everywhere else, writes are much slower, sometimes by an order of magnitude. And before some ZFS zealot jumps in talking about the SLOG or some other cache that could be added to improved performance, I'll remind you that those numbers are on a bare bones NVMe drive, pretty much as fast storage as you can find on this machine. Adding another NVMe drive as a cache probably will not improve write performance here. Still, those are very different results than the tests performed by Salter which shows ZFS beating traditional configurations in all categories but uncached 4k reads (not writes!). That said, those tests are very different from the tests I performed here, where I test writes on a single disk, not a RAID array, which might explain the discrepancy. Also, note that neither LVM or ZFS manage to reach the 2400MB/s read and 1750MB/s write performance specification. ZFS does manage to reach 82% of the read performance (1973MB/s) and LVM 64% of the write performance (1120MB/s). LVM hits 57% of the read performance and ZFS hits barely 6% of the write performance. Overall, I'm a bit disappointed in the ZFS write performance here, I must say. Maybe I need to tweak the record size or some other ZFS voodoo, but I'll note that I didn't have to do any such configuration on the other side to kick ZFS in the pants...

Real world experience This section document not synthetic backups, but actual real world workloads, comparing before and after I switched my workstation to ZFS.

Docker performance I had the feeling that running some git hook (which was firing a Docker container) was "slower" somehow. It seems that, at runtime, ZFS backends are significant slower than their overlayfs/ext4 equivalent:
May 16 14:42:52 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87\x2dinit-merged.mount: Succeeded.
May 16 14:42:52 curie systemd[5161]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87\x2dinit-merged.mount: Succeeded.
May 16 14:42:52 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:53 curie dockerd[1723]: time="2022-05-16T14:42:53.087219426-04:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10 pid=151170
May 16 14:42:53 curie systemd[1]: Started libcontainer container af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10.
May 16 14:42:54 curie systemd[1]: docker-af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10.scope: Succeeded.
May 16 14:42:54 curie dockerd[1723]: time="2022-05-16T14:42:54.047297800-04:00" level=info msg="shim disconnected" id=af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10
May 16 14:42:54 curie dockerd[998]: time="2022-05-16T14:42:54.051365015-04:00" level=info msg="ignoring event" container=af22586fba07014a4d10ab19da10cf280db7a43cad804d6c1e9f2682f12b5f10 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
May 16 14:42:54 curie systemd[2444]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[5161]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[2444]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:54 curie systemd[5161]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
May 16 14:42:54 curie systemd[1]: run-docker-netns-f5453c87c879.mount: Succeeded.
May 16 14:42:54 curie systemd[1]: home-docker-overlay2-17e4d24228decc2d2d493efc401dbfb7ac29739da0e46775e122078d9daf3e87-merged.mount: Succeeded.
Translating this:
  • container setup: ~1 second
  • container runtime: ~1 second
  • container teardown: ~1 second
  • total runtime: 2-3 seconds
Obviously, those timestamps are not quite accurate enough to make precise measurements... After I switched to ZFS:
mai 30 15:31:39 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf\x2dinit.mount: Succeeded. 
mai 30 15:31:39 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf\x2dinit.mount: Succeeded. 
mai 30 15:31:40 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:40 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:41 curie dockerd[3199]: time="2022-05-30T15:31:41.551403693-04:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 pid=141080 
mai 30 15:31:41 curie systemd[1]: run-docker-runtime\x2drunc-moby-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142-runc.ZVcjvl.mount: Succeeded. 
mai 30 15:31:41 curie systemd[5287]: run-docker-runtime\x2drunc-moby-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142-runc.ZVcjvl.mount: Succeeded. 
mai 30 15:31:41 curie systemd[1]: Started libcontainer container 42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142. 
mai 30 15:31:45 curie systemd[1]: docker-42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142.scope: Succeeded. 
mai 30 15:31:45 curie dockerd[3199]: time="2022-05-30T15:31:45.883019128-04:00" level=info msg="shim disconnected" id=42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 
mai 30 15:31:45 curie dockerd[1726]: time="2022-05-30T15:31:45.883064491-04:00" level=info msg="ignoring event" container=42a1a1ed5912a7227148e997f442e7ab2e5cc3558aa3471548223c5888c9b142 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" 
mai 30 15:31:45 curie systemd[1]: run-docker-netns-e45f5cf5f465.mount: Succeeded. 
mai 30 15:31:45 curie systemd[5287]: run-docker-netns-e45f5cf5f465.mount: Succeeded. 
mai 30 15:31:45 curie systemd[1]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded. 
mai 30 15:31:45 curie systemd[5287]: var-lib-docker-zfs-graph-41ce08fb7a1d3a9c101694b82722f5621c0b4819bd1d9f070933fd1e00543cdf.mount: Succeeded.
That's double or triple the run time, from 2 seconds to 6 seconds. Most of the time is spent in run time, inside the container. Here's the breakdown:
  • container setup: ~2 seconds
  • container run: ~4 seconds
  • container teardown: ~1 second
  • total run time: about ~6-7 seconds
That's a two- to three-fold increase! Clearly something is going on here that I should tweak. It's possible that code path is less optimized in Docker. I also worry about podman, but apparently it also supports ZFS backends. Possibly it would perform better, but at this stage I wouldn't have a good comparison: maybe it would have performed better on non-ZFS as well...

Interactivity While doing the offsite backups (below), the system became somewhat "sluggish". I felt everything was slow, and I estimate it introduced ~50ms latency in any input device. Arguably, those are all USB and the external drive was connected through USB, but I suspect the ZFS drivers are not as well tuned with the scheduler as the regular filesystem drivers...

Recovery procedures For test purposes, I unmounted all systems during the procedure:
umount /mnt/boot/efi /mnt/boot/run
umount -a -t zfs
zpool export -a
And disconnected the drive, to see how I would recover this system from another Linux system in case of a total motherboard failure. To import an existing pool, plug the device, then import the pool with an alternate root, so it doesn't mount over your existing filesystems, then you mount the root filesystem and all the others:
zpool import -l -a -R /mnt &&
zfs mount rpool/ROOT/debian &&
zfs mount -a &&
mount /dev/sdc2 /mnt/boot/efi &&
mount -t tmpfs tmpfs /mnt/run &&
mkdir /mnt/run/lock

Offsite backup Part of the goal of using ZFS is to simplify and harden backups. I wanted to experiment with shorter recovery times specifically both point in time recovery objective and recovery time objective and faster incremental backups. This is, therefore, part of my backup services. This section documents how an external NVMe enclosure was setup in a pool to mirror the datasets from my workstation. The final setup should include syncoid copying datasets to the backup server regularly, but I haven't finished that configuration yet.

Partitioning The above partitioning procedure used sgdisk, but I couldn't figure out how to do this with sgdisk, so this uses sfdisk to dump the partition from the first disk to an external, identical drive:
sfdisk -d /dev/nvme0n1   sfdisk --no-reread /dev/sda --force

Pool creation This is similar to the main pool creation, except we tweaked a few bits after changing the upstream procedure:
zpool create \
        -o cachefile=/etc/zfs/zpool.cache \
        -o ashift=12 -d \
        -o feature@async_destroy=enabled \
        -o feature@bookmarks=enabled \
        -o feature@embedded_data=enabled \
        -o feature@empty_bpobj=enabled \
        -o feature@enabled_txg=enabled \
        -o feature@extensible_dataset=enabled \
        -o feature@filesystem_limits=enabled \
        -o feature@hole_birth=enabled \
        -o feature@large_blocks=enabled \
        -o feature@lz4_compress=enabled \
        -o feature@spacemap_histogram=enabled \
        -o feature@zpool_checkpoint=enabled \
        -O acltype=posixacl -O xattr=sa \
        -O compression=lz4 \
        -O devices=off \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/boot -R /mnt \
        bpool-tubman /dev/sdb3
The change from the main boot pool are: Main pool creation is:
zpool create \
        -o ashift=12 \
        -O encryption=on -O keylocation=prompt -O keyformat=passphrase \
        -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
        -O compression=zstd \
        -O relatime=on \
        -O canmount=off \
        -O mountpoint=/ -R /mnt \
        rpool-tubman /dev/sdb4

First sync I used syncoid to copy all pools over to the external device. syncoid is a thing that's part of the sanoid project which is specifically designed to sync snapshots between pool, typically over SSH links but it can also operate locally. The sanoid command had a --readonly argument to simulate changes, but syncoid didn't so I tried to fix that with an upstream PR. It seems it would be better to do this by hand, but this was much easier. The full first sync was:
root@curie:/home/anarcat# ./bin/syncoid -r  bpool bpool-tubman
CRITICAL ERROR: Target bpool-tubman exists but has no snapshots matching with bpool!
                Replication to target would require destroying existing
                target. Cowardly refusing to destroy your existing target.
          NOTE: Target bpool-tubman dataset is < 64MB used - did you mistakenly run
                 zfs create bpool-tubman  on the target? ZFS initial
                replication must be to a NON EXISTENT DATASET, which will
                then be CREATED BY the initial replication process.
INFO: Sending oldest full snapshot bpool/BOOT@test (~ 42 KB) to new target filesystem:
44.2KiB 0:00:00 [4.19MiB/s] [========================================================================================================================] 103%            
INFO: Updating new target filesystem with incremental bpool/BOOT@test ... syncoid_curie_2022-05-30:12:50:39 (~ 4 KB):
2.13KiB 0:00:00 [ 114KiB/s] [===============================================================>                                                         ] 53%            
INFO: Sending oldest full snapshot bpool/BOOT/debian@install (~ 126.0 MB) to new target filesystem:
 126MiB 0:00:00 [ 308MiB/s] [=======================================================================================================================>] 100%            
INFO: Updating new target filesystem with incremental bpool/BOOT/debian@install ... syncoid_curie_2022-05-30:12:50:39 (~ 113.4 MB):
 113MiB 0:00:00 [ 315MiB/s] [=======================================================================================================================>] 100%
root@curie:/home/anarcat# ./bin/syncoid -r  rpool rpool-tubman
CRITICAL ERROR: Target rpool-tubman exists but has no snapshots matching with rpool!
                Replication to target would require destroying existing
                target. Cowardly refusing to destroy your existing target.
          NOTE: Target rpool-tubman dataset is < 64MB used - did you mistakenly run
                 zfs create rpool-tubman  on the target? ZFS initial
                replication must be to a NON EXISTENT DATASET, which will
                then be CREATED BY the initial replication process.
INFO: Sending oldest full snapshot rpool/ROOT@syncoid_curie_2022-05-30:12:50:51 (~ 69 KB) to new target filesystem:
44.2KiB 0:00:00 [2.44MiB/s] [===========================================================================>                                             ] 63%            
INFO: Sending oldest full snapshot rpool/ROOT/debian@install (~ 25.9 GB) to new target filesystem:
25.9GiB 0:03:33 [ 124MiB/s] [=======================================================================================================================>] 100%            
INFO: Updating new target filesystem with incremental rpool/ROOT/debian@install ... syncoid_curie_2022-05-30:12:50:52 (~ 3.9 GB):
3.92GiB 0:00:33 [ 119MiB/s] [======================================================================================================================>  ] 99%            
INFO: Sending oldest full snapshot rpool/home@syncoid_curie_2022-05-30:12:55:04 (~ 276.8 GB) to new target filesystem:
 277GiB 0:27:13 [ 174MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/home/root@syncoid_curie_2022-05-30:13:22:19 (~ 2.2 GB) to new target filesystem:
2.22GiB 0:00:25 [90.2MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var@syncoid_curie_2022-05-30:13:22:47 (~ 5.6 GB) to new target filesystem:
5.56GiB 0:00:32 [ 176MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/cache@syncoid_curie_2022-05-30:13:23:22 (~ 627.3 MB) to new target filesystem:
 627MiB 0:00:03 [ 169MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/lib@syncoid_curie_2022-05-30:13:23:28 (~ 69 KB) to new target filesystem:
44.2KiB 0:00:00 [1.40MiB/s] [===========================================================================>                                             ] 63%            
INFO: Sending oldest full snapshot rpool/var/lib/docker@syncoid_curie_2022-05-30:13:23:28 (~ 442.6 MB) to new target filesystem:
 443MiB 0:00:04 [ 103MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254 (~ 6.3 MB) to new target filesystem:
6.49MiB 0:00:00 [12.9MiB/s] [========================================================================================================================] 102%            
INFO: Updating new target filesystem with incremental rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254 ... syncoid_curie_2022-05-30:13:23:34 (~ 4 KB):
1.52KiB 0:00:00 [27.6KiB/s] [============================================>                                                                            ] 38%            
INFO: Sending oldest full snapshot rpool/var/lib/flatpak@syncoid_curie_2022-05-30:13:23:36 (~ 2.0 GB) to new target filesystem:
2.00GiB 0:00:17 [ 115MiB/s] [=======================================================================================================================>] 100%            
INFO: Sending oldest full snapshot rpool/var/tmp@syncoid_curie_2022-05-30:13:23:55 (~ 57.0 MB) to new target filesystem:
61.8MiB 0:00:01 [45.0MiB/s] [========================================================================================================================] 108%            
INFO: Clone is recreated on target rpool-tubman/var/lib/docker/ed71ddd563a779ba6fb37b3b1d0cc2c11eca9b594e77b4b234867ebcb162b205 based on rpool/var/lib/docker/05c0de7fabbea60500eaa495d0d82038249f6faa63b12914737c4d71520e62c5@266253254
INFO: Sending oldest full snapshot rpool/var/lib/docker/ed71ddd563a779ba6fb37b3b1d0cc2c11eca9b594e77b4b234867ebcb162b205@syncoid_curie_2022-05-30:13:23:58 (~ 218.6 MB) to new target filesystem:
 219MiB 0:00:01 [ 151MiB/s] [=======================================================================================================================>] 100%
Funny how the CRITICAL ERROR doesn't actually stop syncoid and it just carries on merrily doing when it's telling you it's "cowardly refusing to destroy your existing target"... Maybe that's because my pull request broke something though... During the transfer, the computer was very sluggish: everything feels like it has ~30-50ms latency extra:
anarcat@curie:sanoid$ LANG=C top -b  -n 1   head -20
top - 13:07:05 up 6 days,  4:01,  1 user,  load average: 16.13, 16.55, 11.83
Tasks: 606 total,   6 running, 598 sleeping,   0 stopped,   2 zombie
%Cpu(s): 18.8 us, 72.5 sy,  1.2 ni,  5.0 id,  1.2 wa,  0.0 hi,  1.2 si,  0.0 st
MiB Mem :  15898.4 total,   1387.6 free,  13170.0 used,   1340.8 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   1319.8 avail Mem 
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     70 root      20   0       0      0      0 S  83.3   0.0   6:12.67 kswapd0
4024878 root      20   0  282644  96432  10288 S  44.4   0.6   0:11.43 puppet
3896136 root      20   0   35328  16528     48 S  22.2   0.1   2:08.04 mbuffer
3896135 root      20   0   10328    776    168 R  16.7   0.0   1:22.93 zfs
3896138 root      20   0   10588    788    156 R  16.7   0.0   1:49.30 zfs
    350 root       0 -20       0      0      0 R  11.1   0.0   1:03.53 z_rd_int
    351 root       0 -20       0      0      0 S  11.1   0.0   1:04.15 z_rd_int
3896137 root      20   0    4384    352    244 R  11.1   0.0   0:44.73 pv
4034094 anarcat   30  10   20028  13960   2428 S  11.1   0.1   0:00.70 mbsync
4036539 anarcat   20   0    9604   3464   2408 R  11.1   0.0   0:00.04 top
    352 root       0 -20       0      0      0 S   5.6   0.0   1:03.64 z_rd_int
    353 root       0 -20       0      0      0 S   5.6   0.0   1:03.64 z_rd_int
    354 root       0 -20       0      0      0 S   5.6   0.0   1:04.01 z_rd_int
I wonder how much of that is due to syncoid, particularly because I often saw mbuffer and pv in there which are not strictly necessary to do those kind of operations, as far as I understand. Once that's done, export the pools to disconnect the drive:
zpool export bpool-tubman
zpool export rpool-tubman

Raw disk benchmark Copied the 512GB SSD/M.2 device to another 1024GB NVMe/M.2 device:
anarcat@curie:~$ sudo dd if=/dev/sdb of=/dev/sdc bs=4M status=progress conv=fdatasync
499944259584 octets (500 GB, 466 GiB) copi s, 1713 s, 292 MB/s
119235+1 enregistrements lus
119235+1 enregistrements  crits
500107862016 octets (500 GB, 466 GiB) copi s, 1719,93 s, 291 MB/s
... while both over USB, whoohoo 300MB/s!

Monitoring ZFS should be monitoring your pools regularly. Normally, the [[!debman zed]] daemon monitors all ZFS events. It is the thing that will report when a scrub failed, for example. See this configuration guide. Scrubs should be regularly scheduled to ensure consistency of the pool. This can be done in newer zfsutils-linux versions (bullseye-backports or bookworm) with one of those, depending on the desired frequency:
systemctl enable zfs-scrub-weekly@rpool.timer --now
systemctl enable zfs-scrub-monthly@rpool.timer --now
When the scrub runs, if it finds anything it will send an event which will get picked up by the zed daemon which will then send a notification, see below for an example. TODO: deploy on curie, if possible (probably not because no RAID) TODO: this should be in Puppet

Scrub warning example So what happens when problems are found? Here's an example of how I dealt with an error I received. After setting up another server (tubman) with ZFS, I eventually ended up getting a warning from the ZFS toolchain.
Date: Sun, 09 Oct 2022 00:58:08 -0400
From: root <root@anarc.at>
To: root@anarc.at
Subject: ZFS scrub_finish event for rpool on tubman
ZFS has finished a scrub:
   eid: 39536
 class: scrub_finish
  host: tubman
  time: 2022-10-09 00:58:07-0400
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:33:57 with 0 errors on Sun Oct  9 00:58:07 2022
config:
        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb4    ONLINE       0     1     0
            sdc4    ONLINE       0     0     0
        cache
          sda3      ONLINE       0     0     0
errors: No known data errors
This, in itself, is a little worrisome. But it helpfully links to this more detailed documentation (and props up there: the link still works) which explains this is a "minor" problem (something that could be included in the report). In this case, this happened on a server setup on 2021-04-28, but the disks and server hardware are much older. The server itself (marcos v1) was built around 2011, over 10 years ago now. The hard drive in question is:
root@tubman:~# smartctl -i -qnoserial /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-15-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 11 11:02:32 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Some more SMART stats:
root@tubman:~# smartctl -a -qnoserial /dev/sdb   grep -e  Head_Flying_Hours -e Power_On_Hours -e Total_LBA -e 'Sector Sizes'
Sector Sizes:     512 bytes logical, 4096 bytes physical
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       12464 (206 202 0)
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10966h+55m+23.757s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       21107792664
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3201579750
That's over a year of power on, which shouldn't be so bad. It has written about 10TB of data (21107792664 LBAs * 512 byte/LBA), which is about two full writes. According to its specification, this device is supposed to support 55 TB/year of writes, so we're far below spec. Note that are still far from the "non-recoverable read error per bits" spec (1 per 10E15), as we've basically read 13E12 bits (3201579750 LBAs * 512 byte/LBA = 13E12 bits). It's likely this disk was made in 2018, so it is in its fourth year. Interestingly, /dev/sdc is also a Seagate drive, but of a different series:
root@tubman:~# smartctl -qnoserial  -i /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-15-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 11 11:21:35 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
It has seen much more reads than the other disk which is also interesting:
root@tubman:~# smartctl -a -qnoserial /dev/sdc   grep -e  Head_Flying_Hours -e Power_On_Hours -e Total_LBA -e 'Sector Sizes'
Sector Sizes:     512 bytes logical, 4096 bytes physical
  9 Power_On_Hours          0x0032   059   059   000    Old_age   Always       -       36240
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       33994h+10m+52.118s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       30730174438
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       51894566538
That's 4 years of Head_Flying_Hours, and over 4 years (4 years and 48 days) of Power_On_Hours. The copyright date on that drive's specs goes back to 2016, so it's a much older drive. SMART self-test succeeded.

Remaining issues
  • TODO: move send/receive backups to offsite host, see also zfs for alternatives to syncoid/sanoid there
  • TODO: setup backup cron job (or timer?)
  • TODO: swap still not setup on curie, see zfs
  • TODO: document this somewhere: bpool and rpool are both pools and datasets. that's pretty confusing, but also very useful because it allows for pool-wide recursive snapshots, which are used for the backup system

fio improvements I really want to improve my experience with fio. Right now, I'm just cargo-culting stuff from other folks and I don't really like it. stressant is a good example of my struggles, in the sense that it doesn't really work that well for disk tests. I would love to have just a single .fio job file that lists multiple jobs to run serially. For example, this file describes the above workload pretty well:
[global]
# cargo-culting Salter
fallocate=none
ioengine=posixaio
runtime=60
time_based=1
end_fsync=1
stonewall=1
group_reporting=1
# no need to drop caches, done by default
# invalidate=1
# Single 4KiB random read/write process
[randread-4k-4g-1x]
rw=randread
bs=4k
size=4g
numjobs=1
iodepth=1
[randwrite-4k-4g-1x]
rw=randwrite
bs=4k
size=4g
numjobs=1
iodepth=1
# 16 parallel 64KiB random read/write processes:
[randread-64k-256m-16x]
rw=randread
bs=64k
size=256m
numjobs=16
iodepth=16
[randwrite-64k-256m-16x]
rw=randwrite
bs=64k
size=256m
numjobs=16
iodepth=16
# Single 1MiB random read/write process
[randread-1m-16g-1x]
rw=randread
bs=1m
size=16g
numjobs=1
iodepth=1
[randwrite-1m-16g-1x]
rw=randwrite
bs=1m
size=16g
numjobs=1
iodepth=1
... except the jobs are actually started in parallel, even though they are stonewall'd, as far as I can tell by the reports. I sent a mail to the fio mailing list for clarification. It looks like the jobs are started in parallel, but actual (correctly) run serially. It seems like this might just be a matter of reporting the right timestamps in the end, although it does feel like starting all the processes (even if not doing any work yet) could skew the results.

Hangs during procedure During the procedure, it happened a few times where any ZFS command would completely hang. It seems that using an external USB drive to sync stuff didn't work so well: sometimes it would reconnect under a different device (from sdc to sdd, for example), and this would greatly confuse ZFS. Here, for example, is sdd reappearing out of the blue:
May 19 11:22:53 curie kernel: [  699.820301] scsi host4: uas
May 19 11:22:53 curie kernel: [  699.820544] usb 2-1: authorized to connect
May 19 11:22:53 curie kernel: [  699.922433] scsi 4:0:0:0: Direct-Access     ROG      ESD-S1C          0    PQ: 0 ANSI: 6
May 19 11:22:53 curie kernel: [  699.923235] sd 4:0:0:0: Attached scsi generic sg2 type 0
May 19 11:22:53 curie kernel: [  699.923676] sd 4:0:0:0: [sdd] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
May 19 11:22:53 curie kernel: [  699.923788] sd 4:0:0:0: [sdd] Write Protect is off
May 19 11:22:53 curie kernel: [  699.923949] sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 19 11:22:53 curie kernel: [  699.924149] sd 4:0:0:0: [sdd] Optimal transfer size 33553920 bytes
May 19 11:22:53 curie kernel: [  699.961602]  sdd: sdd1 sdd2 sdd3 sdd4
May 19 11:22:53 curie kernel: [  699.996083] sd 4:0:0:0: [sdd] Attached SCSI disk
Next time I run a ZFS command (say zpool list), the command completely hangs (D state) and this comes up in the logs:
May 19 11:34:21 curie kernel: [ 1387.914843] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=71344128 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914859] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=205565952 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914874] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=272789504 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.914906] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=270336 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.914932] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=1073225728 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.914948] zio pool=bpool vdev=/dev/sdc3 error=5 type=1 offset=1073487872 size=8192 flags=b08c1
May 19 11:34:21 curie kernel: [ 1387.915165] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=272793600 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.915183] zio pool=bpool vdev=/dev/sdc3 error=5 type=2 offset=339853312 size=4096 flags=184880
May 19 11:34:21 curie kernel: [ 1387.915648] WARNING: Pool 'bpool' has encountered an uncorrectable I/O failure and has been suspended.
May 19 11:34:21 curie kernel: [ 1387.915648] 
May 19 11:37:25 curie kernel: [ 1571.558614] task:txg_sync        state:D stack:    0 pid:  997 ppid:     2 flags:0x00004000
May 19 11:37:25 curie kernel: [ 1571.558623] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.558640]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.558650]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.558670]  schedule_timeout+0x8b/0x140
May 19 11:37:25 curie kernel: [ 1571.558675]  ? __next_timer_interrupt+0x110/0x110
May 19 11:37:25 curie kernel: [ 1571.558678]  io_schedule_timeout+0x4c/0x80
May 19 11:37:25 curie kernel: [ 1571.558689]  __cv_timedwait_common+0x12b/0x160 [spl]
May 19 11:37:25 curie kernel: [ 1571.558694]  ? add_wait_queue_exclusive+0x70/0x70
May 19 11:37:25 curie kernel: [ 1571.558702]  __cv_timedwait_io+0x15/0x20 [spl]
May 19 11:37:25 curie kernel: [ 1571.558816]  zio_wait+0x129/0x2b0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.558929]  dsl_pool_sync+0x461/0x4f0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559032]  spa_sync+0x575/0xfa0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559138]  ? spa_txg_history_init_io+0x101/0x110 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559245]  txg_sync_thread+0x2e0/0x4a0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559354]  ? txg_fini+0x240/0x240 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559366]  thread_generic_wrapper+0x6f/0x80 [spl]
May 19 11:37:25 curie kernel: [ 1571.559376]  ? __thread_exit+0x20/0x20 [spl]
May 19 11:37:25 curie kernel: [ 1571.559379]  kthread+0x11b/0x140
May 19 11:37:25 curie kernel: [ 1571.559382]  ? __kthread_bind_mask+0x60/0x60
May 19 11:37:25 curie kernel: [ 1571.559386]  ret_from_fork+0x22/0x30
May 19 11:37:25 curie kernel: [ 1571.559401] task:zed             state:D stack:    0 pid: 1564 ppid:     1 flags:0x00000000
May 19 11:37:25 curie kernel: [ 1571.559404] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.559409]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.559412]  ? __kmalloc_node+0x141/0x2b0
May 19 11:37:25 curie kernel: [ 1571.559417]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.559420]  schedule_preempt_disabled+0xa/0x10
May 19 11:37:25 curie kernel: [ 1571.559424]  __mutex_lock.constprop.0+0x133/0x460
May 19 11:37:25 curie kernel: [ 1571.559435]  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
May 19 11:37:25 curie kernel: [ 1571.559537]  spa_all_configs+0x41/0x120 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559644]  zfs_ioc_pool_configs+0x17/0x70 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559752]  zfsdev_ioctl_common+0x697/0x870 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559758]  ? _copy_from_user+0x28/0x60
May 19 11:37:25 curie kernel: [ 1571.559860]  zfsdev_ioctl+0x53/0xe0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.559866]  __x64_sys_ioctl+0x83/0xb0
May 19 11:37:25 curie kernel: [ 1571.559869]  do_syscall_64+0x33/0x80
May 19 11:37:25 curie kernel: [ 1571.559873]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 19 11:37:25 curie kernel: [ 1571.559876] RIP: 0033:0x7fcf0ef32cc7
May 19 11:37:25 curie kernel: [ 1571.559878] RSP: 002b:00007fcf0e181618 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 19 11:37:25 curie kernel: [ 1571.559881] RAX: ffffffffffffffda RBX: 000055b212f972a0 RCX: 00007fcf0ef32cc7
May 19 11:37:25 curie kernel: [ 1571.559883] RDX: 00007fcf0e181640 RSI: 0000000000005a04 RDI: 000000000000000b
May 19 11:37:25 curie kernel: [ 1571.559885] RBP: 00007fcf0e184c30 R08: 00007fcf08016810 R09: 00007fcf08000080
May 19 11:37:25 curie kernel: [ 1571.559886] R10: 0000000000080000 R11: 0000000000000246 R12: 000055b212f972a0
May 19 11:37:25 curie kernel: [ 1571.559888] R13: 0000000000000000 R14: 00007fcf0e181640 R15: 0000000000000000
May 19 11:37:25 curie kernel: [ 1571.559980] task:zpool           state:D stack:    0 pid:11815 ppid:  3816 flags:0x00004000
May 19 11:37:25 curie kernel: [ 1571.559983] Call Trace:
May 19 11:37:25 curie kernel: [ 1571.559988]  __schedule+0x282/0x870
May 19 11:37:25 curie kernel: [ 1571.559992]  schedule+0x46/0xb0
May 19 11:37:25 curie kernel: [ 1571.559995]  io_schedule+0x42/0x70
May 19 11:37:25 curie kernel: [ 1571.560004]  cv_wait_common+0xac/0x130 [spl]
May 19 11:37:25 curie kernel: [ 1571.560008]  ? add_wait_queue_exclusive+0x70/0x70
May 19 11:37:25 curie kernel: [ 1571.560118]  txg_wait_synced_impl+0xc9/0x110 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560223]  txg_wait_synced+0xc/0x40 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560325]  spa_export_common+0x4cd/0x590 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560430]  ? zfs_log_history+0x9c/0xf0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560537]  zfsdev_ioctl_common+0x697/0x870 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560543]  ? _copy_from_user+0x28/0x60
May 19 11:37:25 curie kernel: [ 1571.560644]  zfsdev_ioctl+0x53/0xe0 [zfs]
May 19 11:37:25 curie kernel: [ 1571.560649]  __x64_sys_ioctl+0x83/0xb0
May 19 11:37:25 curie kernel: [ 1571.560653]  do_syscall_64+0x33/0x80
May 19 11:37:25 curie kernel: [ 1571.560656]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 19 11:37:25 curie kernel: [ 1571.560659] RIP: 0033:0x7fdc23be2cc7
May 19 11:37:25 curie kernel: [ 1571.560661] RSP: 002b:00007ffc8c792478 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 19 11:37:25 curie kernel: [ 1571.560664] RAX: ffffffffffffffda RBX: 000055942ca49e20 RCX: 00007fdc23be2cc7
May 19 11:37:25 curie kernel: [ 1571.560666] RDX: 00007ffc8c792490 RSI: 0000000000005a03 RDI: 0000000000000003
May 19 11:37:25 curie kernel: [ 1571.560667] RBP: 00007ffc8c795e80 R08: 00000000ffffffff R09: 00007ffc8c792310
May 19 11:37:25 curie kernel: [ 1571.560669] R10: 000055942ca49e30 R11: 0000000000000246 R12: 00007ffc8c792490
May 19 11:37:25 curie kernel: [ 1571.560671] R13: 000055942ca49e30 R14: 000055942aed2c20 R15: 00007ffc8c795a40
Here's another example, where you see the USB controller bleeping out and back into existence:
mai 19 11:38:39 curie kernel: usb 2-1: USB disconnect, device number 2
mai 19 11:38:39 curie kernel: sd 4:0:0:0: [sdd] Synchronizing SCSI cache
mai 19 11:38:39 curie kernel: sd 4:0:0:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
mai 19 11:39:25 curie kernel: INFO: task zed:1564 blocked for more than 241 seconds.
mai 19 11:39:25 curie kernel:       Tainted: P          IOE     5.10.0-14-amd64 #1 Debian 5.10.113-1
mai 19 11:39:25 curie kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mai 19 11:39:25 curie kernel: task:zed             state:D stack:    0 pid: 1564 ppid:     1 flags:0x00000000
mai 19 11:39:25 curie kernel: Call Trace:
mai 19 11:39:25 curie kernel:  __schedule+0x282/0x870
mai 19 11:39:25 curie kernel:  ? __kmalloc_node+0x141/0x2b0
mai 19 11:39:25 curie kernel:  schedule+0x46/0xb0
mai 19 11:39:25 curie kernel:  schedule_preempt_disabled+0xa/0x10
mai 19 11:39:25 curie kernel:  __mutex_lock.constprop.0+0x133/0x460
mai 19 11:39:25 curie kernel:  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
mai 19 11:39:25 curie kernel:  spa_all_configs+0x41/0x120 [zfs]
mai 19 11:39:25 curie kernel:  zfs_ioc_pool_configs+0x17/0x70 [zfs]
mai 19 11:39:25 curie kernel:  zfsdev_ioctl_common+0x697/0x870 [zfs]
mai 19 11:39:25 curie kernel:  ? _copy_from_user+0x28/0x60
mai 19 11:39:25 curie kernel:  zfsdev_ioctl+0x53/0xe0 [zfs]
mai 19 11:39:25 curie kernel:  __x64_sys_ioctl+0x83/0xb0
mai 19 11:39:25 curie kernel:  do_syscall_64+0x33/0x80
mai 19 11:39:25 curie kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
mai 19 11:39:25 curie kernel: RIP: 0033:0x7fcf0ef32cc7
mai 19 11:39:25 curie kernel: RSP: 002b:00007fcf0e181618 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
mai 19 11:39:25 curie kernel: RAX: ffffffffffffffda RBX: 000055b212f972a0 RCX: 00007fcf0ef32cc7
mai 19 11:39:25 curie kernel: RDX: 00007fcf0e181640 RSI: 0000000000005a04 RDI: 000000000000000b
mai 19 11:39:25 curie kernel: RBP: 00007fcf0e184c30 R08: 00007fcf08016810 R09: 00007fcf08000080
mai 19 11:39:25 curie kernel: R10: 0000000000080000 R11: 0000000000000246 R12: 000055b212f972a0
mai 19 11:39:25 curie kernel: R13: 0000000000000000 R14: 00007fcf0e181640 R15: 0000000000000000
mai 19 11:39:25 curie kernel: INFO: task zpool:11815 blocked for more than 241 seconds.
mai 19 11:39:25 curie kernel:       Tainted: P          IOE     5.10.0-14-amd64 #1 Debian 5.10.113-1
mai 19 11:39:25 curie kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mai 19 11:39:25 curie kernel: task:zpool           state:D stack:    0 pid:11815 ppid:  2621 flags:0x00004004
mai 19 11:39:25 curie kernel: Call Trace:
mai 19 11:39:25 curie kernel:  __schedule+0x282/0x870
mai 19 11:39:25 curie kernel:  schedule+0x46/0xb0
mai 19 11:39:25 curie kernel:  io_schedule+0x42/0x70
mai 19 11:39:25 curie kernel:  cv_wait_common+0xac/0x130 [spl]
mai 19 11:39:25 curie kernel:  ? add_wait_queue_exclusive+0x70/0x70
mai 19 11:39:25 curie kernel:  txg_wait_synced_impl+0xc9/0x110 [zfs]
mai 19 11:39:25 curie kernel:  txg_wait_synced+0xc/0x40 [zfs]
mai 19 11:39:25 curie kernel:  spa_export_common+0x4cd/0x590 [zfs]
mai 19 11:39:25 curie kernel:  ? zfs_log_history+0x9c/0xf0 [zfs]
mai 19 11:39:25 curie kernel:  zfsdev_ioctl_common+0x697/0x870 [zfs]
mai 19 11:39:25 curie kernel:  ? _copy_from_user+0x28/0x60
mai 19 11:39:25 curie kernel:  zfsdev_ioctl+0x53/0xe0 [zfs]
mai 19 11:39:25 curie kernel:  __x64_sys_ioctl+0x83/0xb0
mai 19 11:39:25 curie kernel:  do_syscall_64+0x33/0x80
mai 19 11:39:25 curie kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
mai 19 11:39:25 curie kernel: RIP: 0033:0x7fdc23be2cc7
mai 19 11:39:25 curie kernel: RSP: 002b:00007ffc8c792478 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
mai 19 11:39:25 curie kernel: RAX: ffffffffffffffda RBX: 000055942ca49e20 RCX: 00007fdc23be2cc7
mai 19 11:39:25 curie kernel: RDX: 00007ffc8c792490 RSI: 0000000000005a03 RDI: 0000000000000003
mai 19 11:39:25 curie kernel: RBP: 00007ffc8c795e80 R08: 00000000ffffffff R09: 00007ffc8c792310
mai 19 11:39:25 curie kernel: R10: 000055942ca49e30 R11: 0000000000000246 R12: 00007ffc8c792490
mai 19 11:39:25 curie kernel: R13: 000055942ca49e30 R14: 000055942aed2c20 R15: 00007ffc8c795a40
I understand those are rather extreme conditions: I would fully expect the pool to stop working if the underlying drives disappear. What doesn't seem acceptable is that a command would completely hang like this.

References See the zfs documentation for more information about ZFS, and tubman for another installation and migration procedure.

12 November 2022

Debian Brasil: Participa o da comunidade Debian no Latinoware 2022

De 2 a 4 de novembro de 2022 aconteceu a 19 edi o do Latinoware - Congresso Latino-americano de Software Livre e Tecnologias Abertas, em Foz do Igua u. Ap s 2 anos acontecendo de forma online devido a pandemia do COVID-19, o evento voltou a ser presencial e sentimos que a comunidade Debian Brasil deveria estar presente. Nossa ltima participa o no Latinoware foi em 2016 A organiza o do Latinoware cedeu para a comunidade Debian Brasil um estande para que pud ssemos ter contato com as pessoas que visitavam a rea aberta de exposi es e assim divulgarmos o projeto Debian. Durante os 3 dias do evento, o estande foi organizado por mim (Paulo Henrique Santana) como Desenvolvedor Debian, e pelo Leonardo Rodrigues como contribuidor Debian. Infelizmente o Daniel Lenharo teve um imprevisto de ltima hora e n o pode ir para Foz do Igua u (sentimos sua falta l !). Latinoware 2022 estande 1 V rias pessoas visitaram o estande e aquelas mais iniciantes (principalmente estudantes) que n o conheciam o Debian, perguntavam do que se tratava o nosso grupo e a gente explicava v rios conceitos como o que Software Livre, distribui o GNU/Linux e o Debian propriamente dito. Tamb m recebemos pessoas da comunidade de Software Livre brasileira e de outros pa ses da Am rica Latina que j utilizavam uma distribui o GNU/Linux e claro, muitas pessoas que j utilizavam Debian. Tivemos algumas visitas especiais como do Jon maddog Hall, do Desenvolvedor Debian Emeritus Ot vio Salvador, do Desenvolvedor Debian Eriberto Mota, e dos Mantenedores Debian Guilherme de Paula Segundo e Paulo Kretcheu. Latinoware 2022 estande 4 Foto da esquerda pra direita: Leonardo, Paulo, Eriberto e Ot vio. Latinoware 2022 estande 5 Foto da esquerda pra direita: Paulo, Fabian (Argentina) e Leonardo. Al m de conversarmos bastante, distribu mos adesivos do Debian que foram produzidos alguns meses atr s com o patroc nio do Debian para serem distribu dos na DebConf22(e que haviam sobrado), e vendemos v rias camisetas do Debian produzidas pela comunidade Curitiba Livre. Latinoware 2022 estande 2 Latinoware 2022 estande 3 Tamb m tivemos 3 palestras inseridas na programa o oficial do Latinoware. Eu fiz as palestras: como tornar um(a) contribuidor(a) do Debian fazendo tradu es e como os SysAdmins de uma empresa global usam Debian . E o Leonardo fez a palestra: vantagens da telefonia Open Source nas empresas . Latinoware 2022 estande 6 Foto Paulo na palestra. Agradecemos a organiza o do Latinoware por receber mais uma vez a comunidade Debian e gentilmente ceder os espa os para a nossa participa o, e parabenizamos a todas as pessoas envolvidas na organiza o pelo sucesso desse importante evento para a nossa comunidade. Esperamos estar presentes novamente em 2023. Agracemos tamb m ao Jonathan Carter por aprovar o suporte financeiro do Debian para a nossa participa o no Latinoware. Vers o em ingl s

11 November 2022

Debian Brasil: Participa o da comunidade Debian no Latinoware 2022

De 2 a 4 de novembro de 2022 aconteceu a 19 edi o do Latinoware - Congresso Latino-americano de Software Livre e Tecnologias Abertas, em Foz do Igua u. Ap s 2 anos acontecendo de forma online devido a pandemia do COVID-19, o evento voltou a ser presencial e sentimos que a comunidade Debian Brasil deveria estar presente. Nossa ltima participa o no Latinoware foi em 2016 A organiza o do Latinoware cedeu para a comunidade Debian Brasil um estande para que pud ssemos ter contato com as pessoas que visitavam a rea aberta de exposi es e assim divulgarmos o projeto Debian. Durante os 3 dias do evento, o estande foi organizado por mim (Paulo Henrique Santana) como Desenvolvedor Debian, e pelo Leonardo Rodrigues como contribuidor Debian. Infelizmente o Daniel Lenharo teve um imprevisto de ltima hora e n o pode ir para Foz do Igua u (sentimos sua falta l !). Latinoware 2022 estande 1 V rias pessoas visitaram o estande e aquelas mais iniciantes (principalmente estudantes) que n o conheciam o Debian, perguntavam do que se tratava o nosso grupo e a gente explicava v rios conceitos como o que Software Livre, distribui o GNU/Linux e o Debian propriamente dito. Tamb m recebemos pessoas da comunidade de Software Livre brasileira e de outros pa ses da Am rica Latina que j utilizavam uma distribui o GNU/Linux e claro, muitas pessoas que j utilizavam Debian. Tivemos algumas visitas especiais como do Jon maddog Hall, do Desenvolvedor Debian Emeritus Ot vio Salvador, do Desenvolvedor Debian Eriberto Mota, e dos Mantenedores Debian Guilherme de Paula Segundo e Paulo Kretcheu. Latinoware 2022 estande 4 Foto da esquerda pra direita: Leonardo, Paulo, Eriberto e Ot vio. Latinoware 2022 estande 5 Foto da esquerda pra direita: Paulo, Fabian (Argentina) e Leonardo. Al m de conversarmos bastante, distribu mos adesivos do Debian que foram produzidos alguns meses atr s com o patroc nio do Debian para serem distribu dos na DebConf22(e que haviam sobrado), e vendemos v rias camisetas do Debian produzidas pela comunidade Curitiba Livre. Latinoware 2022 estande 2 Latinoware 2022 estande 3 Tamb m tivemos 3 palestras inseridas na programa o oficial do Latinoware. Eu fiz as palestras: como tornar um(a) contribuidor(a) do Debian fazendo tradu es e como os SysAdmins de uma empresa global usam Debian . E o Leonardo fez a palestra: vantagens da telefonia Open Source nas empresas . Latinoware 2022 estande 6 Foto Paulo na palestra. Agradecemos a organiza o do Latinoware por receber mais uma vez a comunidade Debian e gentilmente ceder os espa os para a nossa participa o, e parabenizamos a todas as pessoas envolvidas na organiza o pelo sucesso desse importante evento para a nossa comunidade. Esperamos estar presentes novamente em 2023. Agracemos tamb m ao Jonathan Carter por aprovar o suporte financeiro do Debian para a nossa participa o no Latinoware. Vers o em ingl s

9 November 2022

Debian Brasil: Brasileiros(as) Mantenedores(as) e Desenvolvedores(as) Debian a partir de julho de 2015

Desde de setembro de 2015, o time de publicidade do Projeto Debian passou a publicar a cada dois meses listas com os nomes dos(as) novos(as) Desenvolvedores(as) Debian (DD - do ingl s Debian Developer) e Mantenedores(as) Debian (DM - do ingl s Debian Maintainer). Estamos aproveitando estas listas para publicar abaixo os nomes dos(as) brasileiros(as) que se tornaram Desenvolvedores(as) e Mantenedores(as) Debian a partir de julho de 2015. Desenvolvedores(as) Debian / Debian Developers / DDs: Marcos Talau Fabio Augusto De Muzio Tobich Gabriel F. T. Gomes Thiago Andrade Marques M rcio de Souza Oliveira Paulo Henrique de Lima Santana Samuel Henrique S rgio Durigan J nior Daniel Lenharo de Souza Giovani Augusto Ferreira Adriano Rafael Gomes Breno Leit o Lucas Kanashiro Herbert Parentes Fortes Neto Mantenedores(as) Debian / Debian Maintainers / DMs: Guilherme de Paula Xavier Segundo David da Silva Polverari Paulo Roberto Alves de Oliveira Sergio Almeida Cipriano Junior Francisco Vilmar Cardoso Ruviaro William Grzybowski Tiago Ilieve
Observa es:
  1. Esta lista ser atualizada quando o time de publicidade do Debian publicar novas listas com DMs e DDs e tiver brasileiros.
  2. Para ver a lista completa de Mantenedores(as) e Desenvolvedores(as) Debian, inclusive outros(as) brasileiros(as) antes de julho de 2015 acesse: https://nm.debian.org/public/people

Debian Brasil: Brasileiros(as) Mantenedores(as) e Desenvolvedores(as) Debian a partir de julho de 2015

Desde de setembro de 2015, o time de publicidade do Projeto Debian passou a publicar a cada dois meses listas com os nomes dos(as) novos(as) Desenvolvedores(as) Debian (DD - do ingl s Debian Developer) e Mantenedores(as) Debian (DM - do ingl s Debian Maintainer). Estamos aproveitando estas listas para publicar abaixo os nomes dos(as) brasileiros(as) que se tornaram Desenvolvedores(as) e Mantenedores(as) Debian a partir de julho de 2015. Desenvolvedores(as) Debian / Debian Developers / DDs: Marcos Talau Fabio Augusto De Muzio Tobich Gabriel F. T. Gomes Thiago Andrade Marques M rcio de Souza Oliveira Paulo Henrique de Lima Santana Samuel Henrique S rgio Durigan J nior Daniel Lenharo de Souza Giovani Augusto Ferreira Adriano Rafael Gomes Breno Leit o Lucas Kanashiro Herbert Parentes Fortes Neto Mantenedores(as) Debian / Debian Maintainers / DMs: Guilherme de Paula Xavier Segundo David da Silva Polverari Paulo Roberto Alves de Oliveira Sergio Almeida Cipriano Junior Francisco Vilmar Cardoso Ruviaro William Grzybowski Tiago Ilieve
Observa es:
  1. Esta lista ser atualizada quando o time de publicidade do Debian publicar novas listas com DMs e DDs e tiver brasileiros.
  2. Para ver a lista completa de Mantenedores(as) e Desenvolvedores(as) Debian, inclusive outros(as) brasileiros(as) antes de julho de 2015 acesse: https://nm.debian.org/public/people

27 October 2022

Emmanuel Kasper: Convert a root filesystem to a bootable disk image

The year is 2022, and it is still that complicated to install GRUB2 externally onto a disk image. But using the wonders of libguestfs, you can create a bootable diskimage using a qemu VM abstraction very easily. The steps here imply we want to create a disk with a single partition containing the root filesystem. Create an empty disk image, partition it
$ truncate --size 40G target.img
$ virt-format --add target.img --partition=mbr --filesystem=ext4
copy the root file system into a partition
cd path/to/root/fs
sudo tar --numeric-owner -cvf - .   guestfish --rw --add ../target.img --mount /dev/sda1:/ -- tar-in - /
install grub using guestfish
$ guestfish --add target.img --inspector
and in the guestfish prompt:
>> command 'grub-install /dev/sda'
>> command 'update-grub'
# also make sure init can mount our root partition
>> write /etc/fstab '/dev/sda1 / ext4 defaults 0 1'
>> exit
test boot the disk image
$ kvm -m 1024 -drive file=target.img,format=raw

25 October 2022

Arturo Borrero Gonz lez: Netfilter Workshop 2022 summary

Netfilter logo This is my report from the Netfilter Workshop 2022. The event was held on 2022-10-20/2022-10-21 in Seville, and the venue was the offices of Zevenet. We started on Thursday with Pablo Neira (head of the project) giving a short welcome / opening speech. The previous iteration of this event was in virtual fashion in 2020, two years ago. In the year 2021 we were unable to meet either in person or online. This year, the number of participants was just eight people, and this allowed the setup to be a bit more informal. We had kind of an un-conference style meeting, in which whoever had something prepared just went ahead and opened a topic for debate. In the opening speech, Pablo did a quick recap on the legal problems the Netfilter project had a few years ago, a topic that was settled for good some months ago, in January 2022. There were no news in this front, which was definitely a good thing. Moving into the technical topics, the workshop proper, Pablo started to comment on the recent developments to instrument a way to perform inner matching for tunnel protocols. The current implementation supports VXLAN, IPIP, GRE and GENEVE. Using nftables you can match packet headers that are encapsulated inside these protocols. He mentioned the design and the goals, that was to have a kernel space setup that allows adding more protocols by just patching userspace. In that sense, more tunnel protocols will be supported soon, such as IP6IP, UDP, and ESP. Pablo requested our opinion on whether if nftables should generate the matching dependencies. For example, if a given tunnel is UDP-based, a dependency match should be there otherwise the rule won t work as expected. The agreement was to assist the user in the setup when possible and if not, print clear error messages. By the way, this inner thing is pure stateless packet filtering. Doing inner-conntracking is an open topic that will be worked on in the future. Pablo continued with the next topic: nftables automatic ruleset optimizations. The times of linear ruleset evaluation are over, but some people have a hard time understanding / creating rulesets that leverage maps, sets, and concatenations. This is where the ruleset optimizations kick in: it can transform a given ruleset to be more optimal by using such advanced data structures. This is purely about optimizing the ruleset, not about validating the usefulness of it, which could be another interesting project. There were a couple of problems mentioned, however. The ruleset optimizer can be slow, O(n!) in worst case. And the user needs to use nested syntax. More improvements to come in the future. Next was Stefano Brivio s turn (Red Hat engineer). He had been involved lately in a couple of migrations to nftables, in particular libvirt and KubeVirt. We were pointed to https://libvirt.org/firewall.html, and Stefano walked us through the 3 or 4 different virtual networks that libvirt can create. He evaluated some options to generate efficient rulesets in nftables to instrument such networks, and commented on a couple of ideas: having a null matcher in nftables set expression. Or perhaps having kind of subsets, something similar to a view in a SQL database. The room spent quite a bit of time debating how the nft_lookup API could be extended to support such new search operations. We also discussed if having intermediate facilities such as firewalld could provide the abstraction levels that could make developers more comfortable. Using firewalld also may have the advantage that coordination between different system components writing ruleset to nftables is handled by firewalld itself and developers are freed of the responsibility of doing it right. Next was Fernando F. Mancera (Red Hat engineer). He wanted to improve error reporting when deleting table/chain/rules with nftables. In general, there are some inconsistencies on how tables can be deleted (or flushed). And there seems to be no correct way to make a single table go away with all its content in a single command. The room agreed in that the commands destroy table and delete table should be defined consistently, with the following meanings: This topic diverted into another: how to reload/replace a ruleset but keep stateful information (such as counters). Next was Phil Sutter (Netfilter coreteam member and Red Hat engineer). He was interested in discussing options to make iptables-nft backward compatible. The use case he brought was simple: What happens if a container running iptables 1.8.7 creates a ruleset with features not supported by 1.8.6. A later container running 1.8.6 may fail to operate. Phil s first approach was to attach additional metadata into rules to assist older iptables-nft in decoding and printing the ruleset. But in general, there are no obvious or easy solutions to this problem. Some people are mixing different tooling version, and there is no way all cases can be predicted/covered. iptables-nft already refuses to work in some of the most basic failure scenarios. An other way to approach the issue could be to introduce some kind of support to print raw expressions in iptables-nft, like -m nft xyz. Which feels ugly, but may work. We also explored playing with the semantics of release version numbers. And another idea: store strings in the nft rule userdata area with the equivalent matching information for older iptables-nft. In fact, what Phil may have been looking for is not backwards but forward compatibility. Phil was undecided which path to follow, but perhaps the most common-sense approach is to fall back to a major release version bump (2.x.y) and declaring compatibility breakage with older iptables 1.x.y. That was pretty much it for the first day. We had dinner together and went to sleep for the next day. The room The second day was opened by Florian Westphal (Netfilter coreteam member and Red Hat engineer). Florian has been trying to improve nftables performance in kernels with RETPOLINE mitigations enabled. He commented that several workarounds have been collected over the years to avoid the performance penalty of such mitigations. The basic strategy is to avoid function indirect calls in the kernel. Florian also described how BPF programs work around this more effectively. And actually, Florian tried translating nf_hook_slow() to BPF. Some preliminary benchmarks results were showed, with about 2% performance improvement in MB/s and PPS. The flowtable infrastructure is specially benefited from this approach. The software flowtable infrastructure already offers a 5x performance improvement with regards the classic forwarding path, and the change being researched by Florian would be an addition on top of that. We then moved into discussing the meeting Florian had with Alexei in Zurich. My personal opinion was that Netfilter offers interesting user-facing interfaces and semantics that BPF does not. Whereas BPF may be more performant in certain scenarios. The idea of both things going hand in hand may feel natural for some people. Others also shared my view, but no particular agreement was reached in this topic. Florian will probably continue exploring options on that front. The next topic was opened by Fernando. He wanted to discuss Netfilter involvement in Google Summer of Code and Outreachy. Pablo had some personal stuff going on last year that prevented him from engaging in such projects. After all, GSoC is not fundamental or a priority for Netfilter. Also, Pablo mentioned the lack of support from others in the project for mentoring activities. There was no particular decision made here. Netfilter may be present again in such initiatives in the future, perhaps under the umbrella of other organizations. Again, Fernando proposed the next topic: nftables JSON support. Fernando shared his plan of going over all features and introduce programmatic tests from them. He also mentioned that the nftables wiki was incomplete and couldn t be used as a reference for missing tests. Phil suggested running the nftables python test-suite in JSON mode, which should complain about missing features. The py test suite should cover pretty much all statements and variations on how the nftables expression are invoked. Next, Phil commented on nftables xtables support. This is, supporting legacy xtables extensions in nftables. The most prominent problem was that some translations had some corner cases that resulted in a listed ruleset that couldn t be fed back into the kernel. Also, iptables-to-nftables translations can be sloppy, and the resulting rule won t work in some cases. In general, nft list ruleset nft -f may fail in rulesets created by iptables-nft and there is no trivial way to solve it. Phil also commented on potential iptables-tests.py speed-ups. Running the test suite may take very long time depending on the hardware. Phil will try to re-architect it, so it runs faster. Some alternatives had been explored, including collecting all rules into a single iptables-restore run, instead of hundreds of individual iptables calls. Next topic was about documentation on the nftables wiki. Phil is interested in having all nftables code-flows documented, and presented some improvements in that front. We are trying to organize all developer-oriented docs on a mediawiki portal, but the extension was not active yet. Since I worked at the Wikimedia Foundation, all the room stared at me, so at the end I kind of committed to exploring and enabling the mediawiki portal extension. Note to self: is this perhaps https://www.mediawiki.org/wiki/Portals ? Next presentation was by Pablo. He had a list of assorted topics for quick review and comment. Following this, a new topic was introduced by Stefano. He wanted to talk about nft_set_pipapo, documentation, what to do next, etc. He did a nice explanation of how the pipapo algorithm works for element inserts, lookups, and deletion. The source code is pretty well documented, by the way. He showed performance measurements of different data types being stored in the structure. After some lengthly debate on how to introduce changes without breaking usage for users, he declared some action items: writing more docs, addressing problems with non-atomic set reloads and a potential rework of nft_rbtree. After that, the next topic was kubernetes & netfilter , also by Stefano. Actually, this topic was very similar to what we already discussed regarding libvirt. Developers want to reduce packet matching effort, but also often don t leverage nftables most performant features, like sets, maps or concatenations. Some Red Hat developers are already working on replacing everything with native nftables & firewalld integrations. But some rules generators are very bad. Kubernetes (kube-proxy) is a known case. Developers simply won t learn how to code better ruleset generators. There was a good question floating around: What are people missing on first encounter with nftables? The Netfilter project doesn t have a training or marketing department or something like that. We cannot force-educate developers on how to use nftables in the right way. Perhaps we need to create a set of dedicated guidelines, or best practices, in the wiki for app developers that rely on nftables. Jozsef Kadlecsik (Netfilter coreteam) supported this idea, and suggested going beyond: such documents should be written exclusively from the nftables point of view: stop approaching the docs as a comparison to the old iptables semantics. Related to that last topic, next was Laura Garc a (Zevenet engineer, and venue host). She shared the same information as she presented in the Kubernetes network SIG in August 2020. She walked us through nftlb and kube-nftlb, a proof-of-concept replacement for kube-proxy based on nftlb that can outperform it. For whatever reason, kube-nftlb wasn t adopted by the upstream kubernetes community. She also covered latest changes to nftlb and some missing features, such as integration with nftables egress. nftlb is being extended to be a full proxy service and a more robust overall solution for service abstractions. In a nutshell, nftlb uses a templated ruleset and only adds elements to sets, which is exactly the right usage of the nftables framework. Some other projects should follow its example. The performance numbers are impressive, and from the early days it was clear that it was outperforming classical LVS-DSR by 10x. I used this opportunity to bring a topic that I wanted to discuss. I ve seen some SRE coworkers talking about katran as a replacement for traditional LVS setups. This software is a XDP/BPF based solution for load balancing. I was puzzled about what this software had to offer versus, for example, nftlb or any other nftables-based solutions. I commented on the highlighs of katran, and we discussed the nftables equivalents. nftlb is a simple daemon which does everything using a JSON-enabled REST API. It is already packaged into Debian, ready to use, whereas katran feels more like a collection of steps that you need to run in a certain order to get it working. All the hashing, caching, HA without state sharing, and backend weight selection features of katran are already present in nftlb. To work on a pure L3/ToR datacenter network setting, katran uses IPIP encapsulation. They can t just mangle the MAC address as in traditional DSR because the backend server is on a different L3 domain. It turns out nftables has a nft_tunnel expression that can do this encapsulation for complete feature parity. It is only available in the kernel, but it can be made available easily on the userspace utility too. Also, we discussed some limitations of katran, for example, inability to handle IP fragmentation, IP options, and potentially others not documented anywhere. This seems to be common with XDP/BPF programs, because handling all possible network scenarios would over-complicate the BPF programs, and at that point you are probably better off by using the normal Linux network stack and nftables. In summary, we agreed that nftlb can pretty much offer the same as katran, in a more flexible way. Group photo Finally, after many interesting debates over two days, the workshop ended. We all agreed on the need for extending it to 3 days next time, since 2 days feel too intense and too short for all the topics worth discussing. That s all on my side! I really enjoyed this Netfilter workshop round.

Russ Allbery: Review: A Spaceship Repair Girl Supposedly Named Rachel

Review: A Spaceship Repair Girl Supposedly Named Rachel, by Richard Roberts
Publisher: Mystique Press
Copyright: 2022
ISBN: 1-63789-763-4
Format: Kindle
Pages: 353
Rachel had snuck out of the house to sit on the hill, to write and draw in rare peace and quiet, when a bus fell out of the sky like a meteor and plowed into the ground in front of her. This is quickly followed by a baffling encounter with a seven-foot-tall man with a blunderbuss, two misunderstandings and a storytelling lie, and a hurried invitation to get into the bus and escape before they're both infected by math. That's how Rachel discovers that she's able to make on-the-fly repairs to bicycle-powered spaceships, and how she ends up at the Lighthouse of Ceres. The title comes from Rachel's initial hesitation to give her name, which propagates through the book to everyone she meets as certainty that Rachel isn't really her name. I enjoyed this running gag way more than I expected to. I don't read enough young adult and middle-grade books to be entirely clear on the boundaries, but this felt very middle-grade. It has a headlong plot, larger-than-life characters, excitingly imaginative scenery (such as a giant space lighthouse dwarfing the asteroid that it's attached to), a focus on friendship, and no romance. This is, to be clear, not a complaint. But it's a different feel than my normal fare, and there were a few places where I was going one direction and the book went another. The conceit of this book is that Earth is unique in the solar system in being stifled by the horrific weight of math, which infects anyone who visits and makes the routine wonders of other planets impossible. Other planets have their own styles and mythos (Saturn is full of pirates, the inhabitants of Venus are space bunnies with names like Passionfruit Nectar Ecstasy), but throughout the rest of the solar system, belief, style, and story logic reign supreme. That means Rachel's wild imagination and reflexive reliance on tall tales makes her surprisingly powerful. The first wild story she tells, to the man who crashed on earth, shapes most of the story. She had written in her sketchbook that it was the property of the Witch Queen of Eloquent Verbosity and Grandiose Ornamentation, and when challenged on it, says that she stole it to cure her partner. Much to her surprise, everyone outside of Earth takes this completely seriously. Also much to her surprise, her habit of sketching spaceships and imaginative devices makes her a natural spaceship mechanic, a skill in high demand. Some of the story is set on Ceres, a refuge for misfits with hearts of gold. That's where Rachel meets Wrench, a kobold who is by far my favorite character of the book and the one relationship that I thought had profound emotional depth. Rachel's other adventures are set off by the pirate girl Violet (she's literally purple), who is the sort of plot-provoking character that I think only works in middle-grade fiction. By normal standards, Violet's total lack of respect for other people's boundaries or consent would make her more of a villain. Here, while it often annoys Rachel, it's clear that both Rachel and the book take Violet's steamroller personality in good fun, more like the gentle coercion between neighborhood friends trying to pull each other into games. I still got rather tired of Violet, though, which caused me a few problems around the middle of the book. There's a bit of found family here (some of it quite touching), a lot of adventures, a lot of delightful spaceship repair, and even some more serious plot involving the actual Witch Queen of Charon. There is a bit of a plot arc to give some structure to the adventures, but this is not the book to read if you're looking for complex plotting or depth. I thought the story fell apart a bit at the tail end, with a conflict that felt like it was supposed to be metaphorical and then never resolved for me into something concrete. I was expecting Rachel to eventually have to do more introspection and more direct wrestling with her identity, but the resolution felt a bit superficial and unsatisfying. Reading this as an adult, I found it odd but fun. I wanted more from the ending, and I was surprised that Roberts does not do more to explain to the reader why Rachel does not regret leaving Earth and her family behind. It feels like something Rachel will have to confront eventually, but this is not the book for it. Instead we get some great friendships (some of which I agreed with wholeheartedly, and some of which I found annoying) and an imaginative, chaotic universe that Rachel takes to like a fish to water. The parts of the story focused on her surprising competence (and her delight in her own competence) were my favorites. The book this most reminds me of is Norton Juster's The Phantom Tollbooth. It is, to be clear, nowhere near as good as The Phantom Tollbooth, which is a very high bar, and it's not as focused on puns. But it has the same sense of internal logic and the same tendency to put far more weight on belief and stories than our world does, and to embrace the resulting chaos. I'm not sure this will be anyone's favorite book (although I'm also not the target age), but I enjoyed reading it. It was a great change of pace after Nona the Ninth. Recommended if you're in the mood for some space fantasy that doesn't take itself seriously. Rating: 7 out of 10

20 October 2022

Scarlett Gately Moore: KDE Gear Snaps round 2

As a continuation of https://www.scarlettgatelymoore.dev/new-kde-gear-snaps-in-the-works/ Todays releases, tested on both amd64 and arm64, are: This week has also been a busy week gardening snap bugs in bugs.kde.org. They are all over the place  I am trying to sort out getting them there own section. I have assigned all snap bugs I have found to myself and requested that this is default. If you have bugs, please report them at bugs.kde.org , for now under neon / Snaps. More coming next week!

9 October 2022

Sergio Talens-Oliag: Shared networking for Virtual Machines and Containers

This entry explains how I have configured a linux bridge, dnsmasq and iptables to be able to run and communicate different virtualization systems and containers on laptops running Debian GNU/Linux. I ve used different variations of this setup for a long time with VirtualBox and KVM for the Virtual Machines and Linux-VServer, OpenVZ, LXC and lately Docker or Podman for the Containers.

Required packagesI m running Debian Sid with systemd and network-manager to configure the WiFi and Ethernet interfaces, but for the bridge I use bridge-utils with ifupdown (as I said this setup is old, I guess ifupdow2 and ifupdown-ng will work too). To start and stop the DNS and DHCP services and add NAT rules when the bridge is brought up or down I execute a script that uses:
  • ip from iproute2 to get the network information,
  • dnsmasq to provide the DNS and DHCP services (currently only the dnsmasq-base package is needed and it is recommended by network-manager, so it is probably installed),
  • iptables to configure NAT (for now docker kind of forces me to keep using iptables, but at some point I d like to move to nftables).
To make sure you have everything installed you can run the following command:
sudo apt install bridge-utils dnsmasq-base ifupdown iproute2 iptables

Bridge configurationThe bridge configuration for ifupdow is available on the file /etc/network/interfaces.d/vmbr0:
# Virtual servers NAT Bridge
auto vmbr0
iface vmbr0 inet static
    address         10.0.4.1
    network         10.0.4.0
    netmask         255.255.255.0
    broadcast       10.0.4.255
    bridge_ports    none
    bridge_maxwait  0
    up              /usr/local/sbin/vmbridge $ IFACE  start nat
    pre-down        /usr/local/sbin/vmbridge $ IFACE  stop nat
Warning: To use a separate file with ifupdown make sure that /etc/network/interfaces contains the line:
source /etc/network/interfaces.d/*
or add its contents to /etc/network/interfaces directly, if you prefer.
This configuration creates a bridge with the address 10.0.4.1 and assumes that the machines connected to it will use the 10.0.4.0/24 network; you can change the network address if you want, as long as you use a private range and it does not collide with networks used in your Virtual Machines all should be OK. The vmbridge script is used to start the dnsmasq server and setup the NAT rules when the interface is brought up and remove the firewall rules and stop the dnsmasq server when it is brought down.

The vmbridge scriptThe vmbridge script launches an instance of dnsmasq that binds to the bridge interface (vmbr0 in our case) that is used as DNS and DHCP server. The DNS server reads the /etc/hosts file to publish local DNS names and forwards all the other requests to the the dnsmasq server launched by NetworkManager that is listening on the loopback interface. As this server already does catching we disable it for our server, with the added advantage that, if we change networks, new requests go to the new resolvers because the DNS server handled by NetworkManager gets restarted and flushes its cache (this is useful if we connect to a new network that has internal DNS servers that are configured to do split DNS for internal services; if we use this model all requests get the internal address as soon as the DNS server is queried again). The DHCP server is configured to provide IPs to unknown hosts for a sub range of the addresses on the bridge network and use fixed IPs if the /etc/ethers file has a MAC with a matching hostname on the /etc/hosts file. To make things work with old DHCP clients the script also adds checksums to the DHCP packets using iptables (when the interface is not linked to a physical device the kernel does not add checksums, but we can fix it adding a rule on the mangle table). If we want external connectivity we can pass the nat argument and then the script creates a MASQUERADE rule for the bridge network and enables IP forwarding. The script source code is the following:
/usr/local/sbin/vmbridge
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
LOCAL_DOMAIN="vmnet"
MIN_IP_LEASE="192"
MAX_IP_LEASE="223"
# ---------
# FUNCTIONS
# ---------
get_net()  
  NET="$(
    ip a ls "$ BRIDGE " 2>/dev/null   sed -ne 's/^.*inet \(.*\) brd.*$/\1/p'
  )"
  [ "$NET" ]   return 1
 
checksum_fix_start()  
  iptables -t mangle -A POSTROUTING -o "$ BRIDGE " -p udp --dport 68 \
    -j CHECKSUM --checksum-fill 2>/dev/null   true
 
checksum_fix_stop()  
  iptables -t mangle -D POSTROUTING -o "$ BRIDGE " -p udp --dport 68 \
    -j CHECKSUM --checksum-fill 2>/dev/null   true
 
nat_start()  
  [ "$NAT" = "yes" ]   return 0
  # Configure NAT
  iptables -t nat -A POSTROUTING -s "$ NET " ! -d "$ NET " -j MASQUERADE
  # Enable forwarding (just in case)
  echo 1 >/proc/sys/net/ipv4/ip_forward
 
nat_stop()  
  [ "$NAT" = "yes" ]   return 0
  iptables -t nat -D POSTROUTING -s "$ NET " ! -d "$ NET " \
    -j MASQUERADE 2>/dev/null   true
 
do_start()  
  # Bridge address
  _addr="$ NET%%/* "
  # DNS leases (between .MIN_IP_LEASE and .MAX_IP_LEASE)
  _dhcp_range="$ _addr%.* .$ MIN_IP_LEASE ,$ _addr%.* .$ MAX_IP_LEASE "
  # Bridge mtu
  _mtu="$(
    ip link show dev "$ BRIDGE "  
      sed -n -e '/mtu/   s/^.*mtu \([0-9]\+\).*$/\1/p  '
  )"
  # Compute extra dnsmasq options
  dnsmasq_extra_opts=""
  # Disable gateway when not using NAT
  if [ "$NAT" != "yes" ]; then
    dnsmasq_extra_opts="$dnsmasq_extra_opts --dhcp-option=3"
  fi
  # Adjust MTU size if needed
  if [ -n "$_mtu" ] && [ "$_mtu" -ne "1500" ]; then
    dnsmasq_extra_opts="$dnsmasq_extra_opts --dhcp-option=26,$_mtu"
  fi
  # shellcheck disable=SC2086
  dnsmasq --bind-interfaces \
    --cache-size="0" \
    --conf-file="/dev/null" \
    --dhcp-authoritative \
    --dhcp-leasefile="/var/lib/misc/dnsmasq.$ BRIDGE .leases" \
    --dhcp-no-override \
    --dhcp-range "$ _dhcp_range " \
    --domain="$ LOCAL_DOMAIN " \
    --except-interface="lo" \
    --expand-hosts \
    --interface="$ BRIDGE " \
    --listen-address "$ _addr " \
    --no-resolv \
    --pid-file="$ PIDF " \
    --read-ethers \
    --server="127.0.0.1" \
    $dnsmasq_extra_opts
  checksum_fix_start
  nat_start
 
do_stop()  
  nat_stop
  checksum_fix_stop
  if [ -f "$ PIDF " ]; then
    kill "$(cat "$ PIDF ")"   true
    rm -f "$ PIDF "
  fi
 
do_status()  
  if [ -f "$ PIDF " ] && kill -HUP "$(cat "$ PIDF ")"; then
    echo "dnsmasq RUNNING"
  else
    echo "dnsmasq NOT running"
  fi
 
do_reload()  
  [ -f "$ PIDF " ] && kill -HUP "$(cat "$ PIDF ")"
 
usage()  
  echo "Uso: $0 BRIDGE (start stop [nat]) status reload"
  exit 1
 
# ----
# MAIN
# ----
[ "$#" -ge "2" ]   usage
BRIDGE="$1"
OPTION="$2"
shift 2
NAT="no"
for arg in "$@"; do
  case "$arg" in
  nat) NAT="yes" ;;
  *) echo "Unknown arg '$arg'" && exit 1 ;;
  esac
done
PIDF="/var/run/vmbridge-$ BRIDGE -dnsmasq.pid"
case "$OPTION" in
start) get_net && do_start ;;
stop) get_net && do_stop ;;
status) do_status ;;
reload) get_net && do_reload ;;
*) echo "Unknown command '$OPTION'" && exit 1 ;;
esac
# vim: ts=2:sw=2:et:ai:sts=2

NetworkManager ConfigurationThe default /etc/NetworkManager/NetworkManager.conf file has the following contents:
[main]
plugins=ifupdown,keyfile
[ifupdown]
managed=false
Which means that it will leave interfaces managed by ifupdown alone and, by default, will send the connection DNS configuration to systemd-resolved if it is installed. As we want to use dnsmasq for DNS resolution, but we don t want NetworkManager to modify our /etc/resolv.conf we are going to add the following file (/etc/NetworkManager/conf.d/dnsmasq.conf) to our system:
/etc/NetworkManager/conf.d/dnsmasq.conf
[main]
dns=dnsmasq
rc-manager=unmanaged
and restart the NetworkManager service:
$ sudo systemctl restart NetworkManager.service
From now on the NetworkManager will start a dnsmasq service that queries the servers provided by the DHCP servers we connect to on 127.0.0.1:53 but will not touch our /etc/resolv.conf file.

Configuring systemd-resolvedIf we start using our own name server but our system has systemd-resolved installed we will no longer need or use the DNS stub; programs using it will use our dnsmasq server directly now, but we keep running systemd-resolved for the host programs that use its native api or access it through /etc/nsswitch.conf (when libnss-resolve is installed). To disable the stub we add a /etc/systemd/resolved.conf.d/disable-stub.conf file to our machine with the following content:
# Disable the DNS Stub Listener, we use our own dnsmasq
[Resolve]
DNSStubListener=no
and restart the systemd-resolved to make sure that the stub is stopped:
$ sudo systemctl restart systemd-resolved.service

Adjusting /etc/resolv.confFirst we remove the existing /etc/resolv.conf file (it does not matter if it is a link or a regular file) and then create a new one that contains at least the following line (we can add a search line if is useful for us):
nameserver 10.0.4.1
From now on we will be using the dnsmasq server launched when we bring up the vmbr0 for multiple systems:
  • as our main DNS server from the host (if we use the standard /etc/nsswitch.conf and libnss-resolve is installed it is queried first, but the systemd-resolved uses it as forwarder by default if needed),
  • as the DNS server of the Virtual Machines or containers that use DHCP for network configuration and attach their virtual interfaces to our bridge,
  • as the DNS server of docker containers that get the DNS information from /etc/resolv.conf (if we have entries that use loopback addresses the containers that don t use the host network tend to fail, as those addresses inside the running containers are not linked to the loopback device of the host).

TestingAfter all the configuration files and scripts are in place we just need to bring up the bridge interface and check that everything works:
$ # Bring interface up
$ sudo ifup vmbr0
$ # Check that it is available
$ ip a ls dev vmbr0
4: vmbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
          group default qlen 1000
    link/ether 0a:b8:ef:b8:07:6c brd ff:ff:ff:ff:ff:ff
    inet 10.0.4.1/24 brd 10.0.4.255 scope global vmbr0
       valid_lft forever preferred_lft forever
$ # View the listening ports used by our dnsmasq servers
$ sudo ss -tulpan   grep dnsmasq
udp UNCONN 0 0  127.0.0.1:53     0.0.0.0:* users:(("dnsmasq",pid=1733930,fd=4))
udp UNCONN 0 0  10.0.4.1:53      0.0.0.0:* users:(("dnsmasq",pid=1705267,fd=6))
udp UNCONN 0 0  0.0.0.0%vmbr0:67 0.0.0.0:* users:(("dnsmasq",pid=1705267,fd=4))
tcp LISTEN 0 32 10.0.4.1:53      0.0.0.0:* users:(("dnsmasq",pid=1705267,fd=7))
tcp LISTEN 0 32 127.0.0.1:53     0.0.0.0:* users:(("dnsmasq",pid=1733930,fd=5))
$ # Verify that the DNS server works on the vmbr0 address
$ host www.debian.org 10.0.4.1
Name: 10.0.4.1
Address: 10.0.4.1#53
Aliases:
www.debian.org has address 130.89.148.77
www.debian.org has IPv6 address 2001:67c:2564:a119::77

Managing running systemsIf we want to update DNS entries and/or MAC addresses we can edit the /etc/hosts and /etc/ethers files and reload the dnsmasq configuration using the vmbridge script:
$ sudo /usr/local/sbin/vmbridge vmbr0 reload
That call sends a signal to the running dnsmasq server and it reloads the files; after that we can refresh the DHCP addresses from the client machines or start using the new DNS names immediately.

3 October 2022

Marco d'Itri: Debian bookworm on a Lenovo T14s Gen3 AMD

I recently upgraded my laptop to a Lenovo T14s Gen3 AMD and I am happy to report that it works just fine with Debian/unstable using a 5.19 kernel. The only issue is that some firmware files are still missing and I had to install them manually. Updates are needed for the firmware-amd-graphics package (#1019847) for the Radeon 680M GPU (AMD Rembrandt) and for the firmware-atheros package (#1021157) for the Qualcomm NFA725A Wi-Fi card (which is actually reported as a NFA765). s2idle (AKA "modern suspend") works too, and a ~10 seconds delay on resume has been removed by setting iommu=pt on the kernel command line. For improved energy efficiency it is recommended to switch from the acpi_cpufreq CPU frequency scaling driver to amd_pstate. Please note that so far it is not loaded automatically. As expected, fwupdmgr can update the system BIOS and the firmware of the NVMe device. Everybody should do it immediately, because there are major suspend bugs with BIOS releases earlier than 1.25.

17 September 2022

Russ Allbery: Effective altruism and the control trap

William MacAskill has been on a book tour for What We Owe to the Future, which has put effective altruism back in the news. That plus the decision by GiveWell to remove GiveDirectly from their top charity list got me thinking about charity again. I think effective altruism, by embracing long-termism, is falling into an ethical trap, and I'm going to start heavily discounting their recommendations for donations. Background Some background first for people who have no idea what I'm talking about. Effective altruism is the idea that we should hold charities accountable for effectiveness. It's not sufficient to have an appealing mission. A charity should demonstrate that the money they spend accomplishes the goals they claimed it would. There is a lot of debate around defining "effective," but as a basic principle, this is sound. Mainstream charity evaluators such as Charity Navigator measure overhead and (arguable) waste, but they don't ask whether the on-the-ground work of the charity has a positive effect proportional to the resources it's expending. This is a good question to ask. GiveWell is a charity research organization that directs money for donors based on effective altruism principles. It's one of the central organizations in effective altruism. GiveDirectly is a charity that directly transfers money from donors to poor people. It doesn't attempt to build infrastructure, buy specific things, or fund programs. It identifies poor people and gives them cash with no strings attached. Long-termism is part of the debate over what "effectiveness" means. It says we should value impact on future generations more highly than we tend to do. (In other words, we should have a much smaller future discount rate.) A sloppy but intuitive expression of long-termism is that (hopefully) there will be far more humans living in the future than are living today, and therefore a "greatest good for the greatest number" moral philosophy argues that we should invest significant resources into making the long-term future brighter. This has obvious appeal to those of us who are concerned about the long-term impacts of climate change, for example. There is a lot of overlap between the communities of effective altruism, long-termism, and "rationalism." One way this becomes apparent is that all three communities have a tendency to obsess over the risks of sentient AI taking over the world. I'm going to come back to that. Psychology of control GiveWell, early on, discovered that GiveDirectly was measurably more effective than most charities. Giving money directly to poor people without telling them how to spend it produced more benefits for those people and their surrounding society than nearly all international aid charities. GiveDirectly then became the baseline for GiveWell's evaluations, and GiveWell started looking for ways to be more effective than that. There is some logic to thinking more effectiveness is possible. Some problems are poorly addressed by markets and too large for individual spending. Health care infrastructure is an obvious example. That said, there's also a psychological reason to look for other charities. Part of the appeal of charity is picking a cause that supports your values (whether that be raw effectiveness or something else). Your opinions and expertise are valued alongside your money. In some cases, this may be objectively true. But in all cases, it's more flattering to the ego than giving poor people cash. At that point, the argument was over how to address immediate and objectively measurable human problems. The innovation of effective altruism is to tie charitable giving to a research feedback cycle. You measure the world, see if it is improving, and adjust your funding accordingly. Impact is measured by its effects on actual people. Effective altruism was somewhat suspicious of talking directly to individuals and preferred "objective" statistical measures, but the point was to remain in contact with physical reality. Enter long-termism: what if you could get more value for your money by addressing problems that would affect vast numbers of future people, instead of the smaller number of people who happen to be alive today? Rather than looking at the merits of that argument, look at its psychology. Real people are messy. They do things you don't approve of. They have opinions that don't fit your models. They're hard to "objectively" measure. But people who haven't been born yet are much tidier. They're comfortably theoretical; instead of having to go to a strange place with unfamiliar food and languages to talk to people who aren't like you, you can think hard about future trends in the comfort of your home. You control how your theoretical future people are defined, so the results of your analysis will align with your philosophical and ideological beliefs. Problems affecting future humans are still extrapolations of problems visible today in the world, though. They're constrained by observations of real human societies, despite the layer of projection and extrapolation. We can do better: what if the most serious problem facing humanity is the possible future development of rogue AI? Here's a problem that no one can observe or measure because it's never happened. It is purely theoretical, and thus under the control of the smart philosopher or rich western donor. We don't know if a rogue AI is possible, what it would be like, how one might arise, or what we could do about it, but we can convince ourselves that all those things can be calculated with some probability bar through the power of pure logic. Now we have escaped the uncomfortable psychological tension of effective altruism and returned to the familiar world in which the rich donor can define both the problem and the solution. Effectiveness is once again what we say it is. William MacAskill, one of the originators of effective altruism, now constantly talks about the threat of rogue AI. In a way, it's quite sad. Where to give money? The mindset of long-termism is bad for the human brain. It whispers to you that you're smarter than other people, that you know what's really important, and that you should retain control of more resources because you'll spend them more wisely than others. It's the opposite of intellectual humility. A government funding agency should take some risks on theoretical solutions to real problems, and maybe a few on theoretical solutions to theoretical problems (although an order of magnitude less). I don't think this is a useful way for an individual donor to think. So, if I think effective altruism is abandoning the one good idea it had and turning back into psychological support for the egos of philosophers and rich donors, where does this leave my charitable donations? To their credit, GiveWell so far seems uninterested in shifting from concrete to theoretical problems. However, they believe they can do better by picking projects than giving people money, and they're committing to that by dropping GiveDirectly (while still praising them). They may be right. But I'm increasingly suspicious of the level of control donors want to retain. It's too easy to trick oneself into thinking you know better than the people directly affected. I have two goals when I donate money. One is to make the world a better, kinder place. The other is to redistribute wealth. I have more of something than I need, and it should go to someone who does need it. The net effect should be to make the world fairer and more equal. The first goal argues for effective altruism principles: where can I give money to have the most impact on making the world better? The second goal argues for giving across an inequality gradient. I should find the people who are struggling the most and transfer as many resources to them as I can. This is Peter Singer's classic argument for giving money to the global poor. I think one can sometimes do better than transferring money, but doing so requires a deep understanding of the infrastructure and economies of scale that are being used as leverage. The more distant one is from a society, the more dubious I think one should be of one's ability to evaluate that, and the more wary one should be of retaining any control over how resources are used. Therefore, I'm pulling my recurring donation to GiveWell. Half of it is going to go to GiveDirectly, because I think it is an effective way of redistributing wealth while giving up control. The other half is going to my local foodbank, because they have a straightforward analysis of how they can take advantage of economy of scale, and because I have more tools available (such as local news) to understand what problem they're solving and if they're doing so effectively. I don't know that those are the best choices. There are a lot of good ones. But I do feel strongly that the best charity comes from embracing the idea that I do not have special wisdom, other people know more about what they need than I do, and deploying my ego and logic from the comfort of my home is not helpful. Find someone who needs something you have an excess of. Give it to them. Treat them as equals. Don't retain control. You won't go far wrong.

11 September 2022

Andrew Cater: 202209110020 - Debian release day(s) - Cambridge - post 4

RattusRattus, Isy, smcv have all just left after a very long day. Steve is finishing up the final stages. The mayhem has quietened, the network cables are coiled, pretty much everything is tidied away. A new experience for two of us - I just hope it hasn't put them off too much.The IRC channels are quiet and we can put this one to bed after a good day's work well done.

10 September 2022

Andrew Cater: 202209102213 - Debian release day - Cambridge - post 3

Working a bit more slowly - coming to the end of the process. I've been wrestling with a couple of annoying old laptops and creating mayhem. The others are almost through the process - it's been a very long day, almost 12 hours now.As ever, it's good to be with people who appreciate this work - I'm also being menaced by a dog that wants fuss all the time. It certainly makes a difference to have fast connectivity and even faster remarks backwards and forwards.


Andrew Cater: 202209101602 Debian release day - Cambridge - post 2

Definitely settling into a rhythm - we've been joined by smcv in person (and bittin on line). Bullseye testing is now well beyond the standard image testing into the live images.Buster images are gradually being built so there's the added confusion of two sets of wiki editing, two sets of potential edit conflicts ...So six people in a small-ish sitting room, several with multiple laptops running several checks at once. It's all good, as ever.Dining room table has nine machines on it, three packet switches are fairly well full ...

Andrew Cater: 202209101115 Debian release day - Cambridge - Bullseye and Buster testing starting

And I'm over here with the Debian images/media release team in Cambridge.First time together in Cambridge for a long time: several of the usual suspects - RattusRattus, Sledge, Isy and myself. Also in the room are Kartik and egw - I think this is their first time.Chat is now physically in Sledge's sitting room as well as on IRC. The first couple of images are trickling in and tests are starting for Bullseye.
This is going to be a very long day - we've got full tests for Bullseye (Debian 11) and Buster (Debian 10) so double duty. This should be the last release for Buster since this has now passed to LTS.

8 September 2022

Antoine Beaupr : Complaint about Canada's phone cartel

I have just filed a complaint with the CRTC about my phone provider's outrageous fees. This is a copy of the complaint.
I am traveling to Europe, specifically to Ireland, for a 6 days for a work meeting. I thought I could use my phone there. So I looked at my phone provider's services in Europe, and found the "Fido roaming" services: https://www.fido.ca/mobility/roaming The fees, at the time of writing, at fifteen (15!) dollars PER DAY to get access to my regular phone service (not unlimited!!). If I do not use that "roaming" service, the fees are: That is absolutely outrageous. Any random phone plan in Europe will be cheaper than this, by at least one order of magnitude. Just to take any example: https://www.tescomobile.ie/sim-only-plans.aspx Those fine folks offer a one-time, prepaid plan for 15 for 28 days which includes: I think it's absolutely scandalous that telecommunications providers in Canada can charge so much money, especially since the most prohibitive fee (the "non-prepaid" plans) are automatically charged if I happen to forget to remove my sim card or put my phone in "airplane mode". As advised, I have called customer service at Fido for advice on how to handle this situation. They have confirmed those are the only plans available for travelers and could not accommodate me otherwise. I have notified them I was in the process of filing this complaint. I believe that Canada has become the technological dunce of the world, and I blame the CRTC for its lack of regulation in that matter. You should not allow those companies to grow into such a cartel that they can do such price-fixing as they wish. I haven't investigated Fido's competitors, but I will bet at least one of my hats that they do not offer better service. I attach a screenshot of the Fido page showing those outrageous fees.
I have no illusions about this having any effect. I thought of filing such a complain after the Rogers outage as well, but felt I had less of a standing there because I wasn't affected that much (e.g. I didn't have a life-threatening situation myself). This, however, was ridiculous and frustrating enough to trigger this outrage. We'll see how it goes...
"We will respond to you within 10 working days."

Response from CRTC They did respond within 10 days. Here is the full response:
Dear Antoine Beaupr : Thank you for contacting us about your mobile telephone international roaming service plan rates concern with Fido Solutions Inc. (Fido). In Canada, mobile telephone service is offered on a competitive basis. Therefore, the Canadian Radio-television and Telecommunications Commission (CRTC) is not involved in Fido's terms of service (including international roaming service plan rates), billing and marketing practices, quality of service issues and customer relations. If you haven't already done so, we encourage you to escalate your concern to a manager if you believe the answer you have received from Fido's customer service is not satisfactory. Based on the information that you have provided, this may also appear to be a Competition Bureau matter. The Competition Bureau is responsible for administering and enforcing the Competition Act, and deals with issues such as false or misleading representations, deceptive marketing practices and collusion. You can reach the Competition Bureau by calling 1-800-348-5358 (toll-free), by TTY (for deaf and hard of hearing people) by calling 1-866-694-8389 (toll-free). For more contact information, please visit http://www.competitionbureau.gc.ca/eic/site/cb-bc.nsf/eng/00157.html When consumers are not satisfied with the service they are offered, we encourage them to compare the products and services of other providers in their area and look for a company that can better match their needs. The following tool helps to show choices of providers in your area: https://crtc.gc.ca/eng/comm/fourprov.htm Thank you for sharing your concern with us.
In other words, complain with Fido, or change providers. Don't complain to us, we don't manage the telcos, they self-regulate. Great job, CRTC. This is going great. This is exactly why we're one of the most expensive countries on the planet for cell phone service.

Live chat with Fido Interestingly, the day after I received that response from the CRTC, I received this email from Fido, while traveling:
Date: Tue, 13 Sep 2022 10:10:00 -0400 From: Fido DONOTREPLY@fido.ca To: REDACTED Subject: Courriel d avis d itin rance Fido Roaming Welcome Confirmation Fido Date : 13 septembre 2022
Num ro de compte : [redacted] Bonjour
Antoine Beaupr ! Nous vous crivons pour vous indiquer qu au moins un utilisateur inscrit votre compte s est r cemment connect un r seau en itin rance.
Vous trouverez ci-dessous le message texte de bienvenue en itin rance envoy l utilisateur (ou aux utilisateurs), qui contenait les tarifs d itin rance
applicables. Message texte de bienvenue en itin rance Destinataire : REDACTED Date et heure : 2022-09-13 / 10:10:00
Allo, ici Fido : Bienvenue destination! Vous tes inscrit Fido Nomade alors utilisez vos donn es, parlez et textez comme vous le faites la
maison. Depuis le 1 mars 2022 le tarif cette destination pour 15 $/jour (+ taxes) et valide tous les jours jusqu' 23 h 59 HE, peu importe le fuseau
horaire dans lequel vous vous trouvez. Bon voyage! Des questions? Consultez fido.ca/m/itinerance ou composez +15149333436 (sans frais). Besoin d aide?
  • PLANIFIEZ UN VOYAGE AVEC Fido NomadeMC
    D couvrez nos options d itin rance et restez en contact l tranger sans vous soucier de votre
    facture.
D tails
  • G rez votre compte
    G rez vos produits et services Fido la maison ou sur la route gr ce Mon
    Compte.
D tails Ce courriel est produit automatiquement; veuillez ne pas y r pondre. Ce courriel (et toute pi ce jointe) est confidentiel. Si vous n tes pas le destinataire,
veuillez supprimer le pr sent message et en d truire toute copie. FIDO SOLUTIONS 800, RUE DE LA GAUCHETI RE OUEST
BUREAU 4000 MONTR AL (QU BEC) H5A 1K3
Fido
I found that message utterly confusing (and yes, I can read french). Basically, it says that some user (presumably me!) connected to the network with roaming. I did just disabled airplane mode on my phone to debug a Syncthing bug but had not enabled roaming. So this message seemed to say that I would be charged 15$ (per DAY!) for roaming from now on. Confused, I tried their live chat to try to clarify things, worried I would get charged even more for calling tech support on *611. This is a transcript of the chat:
F: Hi! What are we doing today? Type in your question or choose from the options below: * Track my Equipment Order * View Bill Online * Payment Options * iPhone 14 Pre-Order A: i received a message about roaming while abroad but i did not enable roaming on my phone, will i be charged anyways? F: I think I know what you re asking for. Select the topic that best matches your request or try rephrasing your question. A: no F: Thank you, this will help us to improve! Would you like to chat with a specialist? Chat with a specialist I'll get a specialist to help you with this. It appears that you're not signed in. Your session may have timed out. To save time and identify your account details, please sign in to My Account.
  • Sign in
  • I'm not able to sign in
Have any questions specific to your Fido account? To service you faster, please identify yourself by completing the form below. A: Personal info Form submitted F: Thank you! I'll connect you with the next available specialist. Your chat is being transferred to a Live Chat agent. Thanks for your patience. We are here to assist you and we kindly ask that our team members be treated with respect and dignity. Please note that abuse directed towards any Consumer Care Specialist will not be tolerated and will result in the termination of your conversation with us. All of our agents are with other customers at the moment. Your chat is in a priority sequence and someone will be with you as soon as possible. Thanks! Thanks for continuing to hold. An agent will be with you as soon as possible. Thank you for your continued patience. We re getting more Live Chat requests than usual so it s taking longer to answer. Your chat is still in a priority sequence and will be answered as soon as an agent becomes available. Thank you so much for your patience we're sorry for the wait. Your chat is still in a priority sequence and will be answered as soon as possible. Hi, I'm [REDACTED] from Fido in [REDACTED]. May I have your name please? A: hi i am antoine, nice to meet you sorry to use the live chat, but it's not clear to me i can safely use my phone to call support, because i am in ireland and i'm worried i'll get charged for the call F: Thank You Antoine , I see you waited to speak with me today, thank you for your patience.Apart from having to wait, how are you today? A: i am good thank you
[... delay ...]
A: should i restate my question? F: Yes please what is the concern you have? A: i have received an email from fido saying i someone used my phone for roaming it's in french (which is fine), but that's the gist of it i am traveling to ireland for a week i do not want to use fido's services here... i have set the phon eto airplane mode for most of my time here F: The SMS just says what will be the charges if you used any services. A: but today i have mistakenly turned that off and did not turn on roaming well it's not a SMS, it's an email F: Yes take out the sim and keep it safe.Turun off or On for roaming you cant do it as it is part of plan. A: wat F: if you used any service you will be charged if you not used any service you will not be charged. A: you are saying i need to physically take the SIM out of the phone? i guess i will have a fun conversation with your management once i return from this trip not that i can do that now, given that, you know, i nee dto take the sim out of this phone fun times F: Yes that is better as most of the customer end up using some kind of service and get charged for roaming. A: well that is completely outrageous roaming is off on the phone i shouldn't get charged for roaming, since roaming is off on the phone i also don't get why i cannot be clearly told whether i will be charged or not the message i have received says i will be charged if i use the service and you seem to say i could accidentally do that easily can you tell me if i have indeed used service sthat will incur an extra charge? are incoming text messages free? F: I understand but it is on you if you used some data SMS or voice mail you can get charged as you used some services.And we cant check anything for now you have to wait for next bill. and incoming SMS are free rest all service comes under roaming. That is the reason I suggested take out the sim from phone and keep it safe or always keep the phone or airplane mode. A: okay can you confirm whether or not i can call fido by voice for support? i mean for free F: So use your Fido sim and call on +1-514-925-4590 on this number it will be free from out side Canada from Fido sim. A: that is quite counter-intuitive, but i guess i will trust you on that thank you, i think that will be all F: Perfect, Again, my name is [REDACTED] and it s been my pleasure to help you today. Thank you for being a part of the Fido family and have a great day! A: you too
So, in other words:
  1. they can't tell me if I've actually been roaming
  2. they can't tell me how much it's going to cost me
  3. I should remove the SIM card from my phone (!?) or turn on airplane mode, but the former is safer
  4. I can call Fido support, but not on the usual *611, and instead on that long-distance-looking phone number, and yes, that means turning off airplane mode and putting the SIM card in, which contradicts step 3
Also notice how the phone number from the live chat (+1-514-925-4590) is different than the one provided in the email (15149333436). So who knows what would have happened if I would have called the latter. The former is mentioned in their contact page. I guess the next step is to call Fido over the phone and talk to a manager, which is what the CRTC told me to do in the first place... I ended up talking with a manager (another 1h phone call) and they confirmed there is no other package available at Fido for this. At best they can provide me with a credit if I mistakenly use the roaming by accident to refund me, but that's it. The manager also confirmed that I cannot know if I have actually used any data before reading the bill, which is issued on the 15th of every month, but only available... three days later, at which point I'll be back home anyways. Fantastic.

28 August 2022

Andrew Cater: Debian Barbeque, Cambridge 2022

And here we are: second day of the barbeque in Cambridge. Lots of food - as always - some alcohol, some soft drinks, coffee.Lots of good friends, and banter and good natured argument. For a couple of folk, it's their first time here - but most people have known each other for years. Lots of reminiscing, some crochet from two of us. Multiple technical discussions weaving and overlapping
Not just meat and vegetarian options for food: a fresh loaf, gingerbread of various sorts, fresh Belgian-style waffles.I''m in the front room: four of us silently on laptops, one on a phone. Sounds of a loud game of Mao from the garden - all very normal for this time of year.Thanks to Jo and Steve, to all the cooks and folk sorting things out. One more night and I'll have done my first full BBQ here. Diet and slimming - what diet?

Next.

Previous.