Search Results: "Wouter Verhelst"

30 August 2022

Wouter Verhelst: Not currently uploading

A notorious ex-DD decided to post garbage on his site in which he links my name to the suicide of Frans Pop, and mentions that my GPG key is currently disabled in the Debian keyring, along with some manufactured screenshots of the Debian NM site that allegedly show I'm no longer a DD. I'm not going to link to the post -- he deserves to be ridiculed, not given attention. Just to set the record straight, however: Frans Pop was my friend. I never treated him with anything but respect. I do not know why he chose to take his own life, but I grieved for him for a long time. It saddens me that Mr. Notorious believes it a good idea to drag Frans' name through the mud like this, but then, one can hardly expect anything else from him by this point. Although his post is mostly garbage, there is one bit of information that is correct, and that is that my GPG key is currently no longer in the Debian keyring. Nothing sinister is going on here, however; the simple fact of the matter is that I misplaced my OpenPGP key card, which means there is a (very very slight) chance that a malicious actor (like, perhaps, Mr. Notorious) would get access to my GPG key and abuse that to upload packages to Debian. Obviously we can't have that -- certainly not from him -- so for that reason, I asked the Debian keyring maintainers to please disable my key in the Debian keyring. I've ordered new cards; as soon as they arrive I'll generate a new key and perform the necessary steps to get my new key into the Debian keyring again. Given that shipping key cards to South Africa takes a while, this has taken longer than I would have initially hoped, but I'm hoping at this point that by about halfway September this hurdle will have been taken, meaning, I will be able to exercise my rights as a Debian Developer again. As for Mr. Notorious, one can only hope he will get the psychiatric help he very obviously needs, sooner rather than later, because right now he appears to be more like a goat yelling in the desert. Ah well.

22 August 2022

Wouter Verhelst: Remote notification

Sometimes, it's useful to get a notification that a command has finished doing something you were waiting for:
make my-large-program && notify-send "compile finished" "success" || notify-send "compile finished" "failure"
This will send a notification message with the title "compile finished", and a body of "success" or "failure" depending on whether the command completed successfully, and allows you to minimize (or otherwise hide) the terminal window while you do something else, which can be a very useful thing to do. It works great when you're running something on your own machine, but what if you're running it remotely? There might be something easy to do, but I whipped up a bit of Perl instead:
#!/usr/bin/perl -w
use strict;
use warnings;
use Glib::Object::Introspection;
Glib::Object::Introspection->setup(
    basename => "Notify",
    version => "0.7",
    package => "Gtk3::Notify",
);
use Mojolicious::Lite -signatures;
Gtk3::Notify->init();
get '/notify' => sub ($c) {
    # Fall back to placeholder text when a query parameter is missing
    my $msg = $c->param("message");
    if(!defined($msg)) {
        $msg = "message";
    }
    my $title = $c->param("title");
    if(!defined($title)) {
        $title = "title";
    }
    app->log->debug("Sending notification '$msg' with title '$title'");
    my $n = Gtk3::Notify::Notification->new($title, $msg, "");
    $n->show;
    $c->render(text => "OK");
};
app->start;
This requires the packages libglib-object-introspection-perl, gir1.2-notify-0.7, and libmojolicious-perl to be installed, and can then be started like so:
./remote-notify daemon -l http://0.0.0.0:3000/
(assuming you did what I did and saved the above as "remote-notify") Once you've done that, you can just curl a notification message to yourself:
curl 'http://localhost:3000/notify?title=test&message=test+body'
Doing this via localhost is rather silly (much better to use notify-send for that), but it becomes much more interesting if you're going to run this to your laptop from a remote system. An obvious TODO would be to add in some form of security, but that's left as an exercise to the reader...
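To tie the two halves of this post together, here is a minimal sketch of a wrapper that runs a command on the remote system and then reports the result to the daemon above. The function name notify_run and the NOTIFY_URL variable are made up for this example. It also assumes the daemon is reachable from the remote system at that URL, for instance because you logged in with an SSH reverse tunnel (ssh -R 3000:localhost:3000 remote-host).

```shell
#!/bin/sh
# Assumption: the remote-notify daemon from this post answers at this URL
# (for example through "ssh -R 3000:localhost:3000 remote-host").
NOTIFY_URL="${NOTIFY_URL:-http://localhost:3000/notify}"

# notify_run COMMAND [ARGS...]  (hypothetical helper name)
# Runs the command, then sends "success" or "failure" to the daemon,
# and finally returns the exit status of the command itself.
notify_run() {
    if "$@"; then
        rc=0
        result="success"
    else
        rc=$?
        result="failure"
    fi
    # -G makes curl append the --data-urlencode pairs to the URL as a
    # URL-encoded query string and issue a GET request.
    curl -fsS -G "$NOTIFY_URL" \
        --data-urlencode "title=$1 finished" \
        --data-urlencode "message=$result" >/dev/null
    return "$rc"
}
```

With that loaded into the remote shell, something like "notify_run make my-large-program" should pop up a notification on the laptop. With the tunnel approach the daemon could probably listen on 127.0.0.1 only rather than on 0.0.0.0, which would go some way towards the security TODO, but that part is untested.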

12 August 2022

Wouter Verhelst: Upgrading a Windows 10 VM to Windows 11

I run Debian on my laptop (obviously); but occasionally, for $DAYJOB, I have some work to do on Windows. In order to do so, I have had a Windows 10 VM in my libvirt configuration that I can use. A while ago, Microsoft issued Windows 11. I recently found out that all the components for running Windows 11 inside a libvirt VM are available, and so I set out to upgrade my VM from Windows 10 to Windows 11. This wasn't as easy as I thought, so here's a bit of a writeup of all the things I ran into, and how I fixed them. Windows 11 has a number of hardware requirements that aren't necessary for Windows 10. There are a number of them, but the most important three are:
  • A modern enough processor
  • A TPM 2.0 module
  • Secure boot
So let's see about all three.

A modern enough processor

If your processor isn't modern enough to run Windows 11, then you can probably forget about it (unless you want to use qemu JIT compilation -- I dunno, probably not going to work, and also not worth it if it were). If it is, all you need is the "host-passthrough" setting in libvirt, which I've been using for a long time now. Since my laptop is less than two months old, that's not a problem for me.

A TPM 2.0 module

My Windows 10 VM did not have a TPM configured, because it wasn't needed. Luckily, a quick web search told me that enabling that is not hard. All you need to do is:
  • Install the swtpm and swtpm-tools packages
  • Add the TPM module, by adding the following XML snippet to your VM configuration:
    <devices>
      <tpm model='tpm-tis'>
        <backend type='emulator' version='2.0'/>
      </tpm>
    </devices>
    
    Alternatively, if you prefer the graphical interface, click on the "Add hardware" button in the VM properties, choose the TPM, set it to Emulated, model TIS, and set its version to 2.0.
You're done! Well, with this part, anyway. Read on.
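If you'd rather double-check from a terminal that the TPM ended up in the configuration, something like the following might do. It is only a sketch: the helper name has_tpm2 is invented for this example, and it assumes the type attribute comes before the version attribute on the backend element, as in the snippet above.

```shell
# has_tpm2 (hypothetical helper): reads libvirt domain XML on stdin and
# succeeds if it contains an emulated TPM backend with version 2.0.
# Accepts both single- and double-quoted attribute values.
has_tpm2() {
    grep -Eq "<backend[^>]*type=['\"]emulator['\"][^>]*version=['\"]2\.0['\"]"
}

# Possible use, with "w10" being the name of the VM in this post:
#   sudo apt install swtpm swtpm-tools
#   virsh dumpxml w10 | has_tpm2 && echo "TPM 2.0 is configured"
```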

Secure boot

Here is where it gets interesting. My Windows 10 VM was old enough that it was configured for the older i440fx chipset. This one is limited to PCI and IDE, unlike the more modern q35 chipset (which supports PCIe and SATA, and does not support IDE nor SATA in IDE mode). There is a UEFI/Secure Boot-capable BIOS for qemu, but it apparently requires the q35 chipset. Fun fact (which I found out the hard way): Windows stores where its boot partition is somewhere. If you change the hard drive controller from an IDE one to a SATA one, you will get a BSOD at startup. In order to fix that, you need a recovery drive. To create the virtual USB disk, go to the VM properties, click "Add hardware", choose "Storage", choose the USB bus, and then under "Advanced options", select the "Removable" option, so it shows up as a USB stick in the VM. Note: this takes a while to do (took about an hour on my system), and your virtual USB drive needs to be 16G or larger (I used the libvirt default of 20G). There is no possibility, using the buttons in the virt-manager GUI, to convert the machine from i440fx to q35. However, that doesn't mean it's not possible to do so. I found that the easiest way is to use the direct XML editing capabilities in the virt-manager interface; if you edit the XML in an editor it will produce error messages if something doesn't look right and tell you to go and fix it, whereas the virt-manager GUI will actually fix things itself in some cases (and will produce helpful error messages if not). What I did was:
  • Take backups of everything. No, really. If you fuck up, you'll have to start from scratch. I'm not responsible if you do.
  • Go to the Edit->Preferences option in the VM manager, then on the "General" tab, choose "Enable XML editing"
  • Open the Windows VM properties, and in the "Overview" section, go to the "XML" tab.
  • Change the value of the machine attribute of the domain.os.type element, so that it says pc-q35-7.0.
  • Search for the domain.devices.controller element that has pci in its type attribute and pci-root in its model one, and set the model attribute to pcie-root instead.
  • Find all domain.devices.disk.target elements, setting their dev=hdX to dev=sdX, and bus="ide" to bus="sata"
  • Find the USB controller (domain.devices.controller with type="usb"), and set its model to qemu-xhci. You may also want to add ports="15" if you didn't have that yet.
  • Perhaps also add a few PCIe root ports:
    <controller type="pci" index="1" model="pcie-root-port"/>
    <controller type="pci" index="2" model="pcie-root-port"/>
    <controller type="pci" index="3" model="pcie-root-port"/>
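For those who prefer to prepare the changes outside of virt-manager, the purely textual substitutions from the list above can be sketched with sed on a copy of the XML. This is a different technique from the virt-manager editing described in this post, and it is not expected to yield a working definition on its own: it does not touch the USB controller, does not add the root ports, and leaves any stale address elements for libvirt or virt-manager to complain about. The helper name to_q35 is invented, and the patterns assume single-quoted attributes as printed by "virsh dumpxml".

```shell
# to_q35 (hypothetical helper): filters libvirt domain XML from stdin to stdout.
to_q35() {
    # 1. machine type of domain.os.type -> pc-q35-7.0
    # 2. PCI root controller model      -> pcie-root
    # 3. disk targets: dev=hdX -> dev=sdX, bus=ide -> bus=sata
    sed -e "s/machine='pc-i440fx-[^']*'/machine='pc-q35-7.0'/" \
        -e "s/model='pci-root'/model='pcie-root'/" \
        -e "/<target /s/dev='hd\([a-z]\)'/dev='sd\1'/" \
        -e "/<target /s/bus='ide'/bus='sata'/"
}

# Possible use (file names are examples only):
#   virsh dumpxml w10 > w10-backup.xml
#   to_q35 < w10-backup.xml > w10-q35.xml
#   diff -u w10-backup.xml w10-q35.xml
```

Reviewing the diff before pasting anything into the XML tab seems wise, since a substitution that is too broad would be hard to spot afterwards.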
    
I figured out most of this by starting the process for creating a new VM, on the last page of the wizard that pops up selecting the "Modify configuration before installation" option, going to the "XML" tab on the "Overview" section of the new window that shows up, and then comparing that against what my current VM had. Also, it took me a while to get this right, so I might have forgotten something. If virt-manager gives you an error when you hit the Apply button, compare notes against the VM that you're in the process of creating, and copy/paste things from there to the old VM to make the errors go away. As long as you don't remove configuration that is critical for things to start, this shouldn't break matters permanently (but hey, use your backups if you do break things -- you have backups, right?) OK, cool, so now we have a Windows VM that is... unable to boot. Remember what I said about Windows storing where the controller is? Yeah, there you go. Boot from the virtual USB disk that you created above, and select the "Fix the boot" option in the menu. That will fix it. Ha ha, only kidding. Of course it doesn't. I honestly can't tell you everything that I fiddled with, but I think the bit that eventually fixed it was where I chose "safe mode", which caused the system to do a hiccup, a regular reboot, and then suddenly everything was working again. Meh. Don't throw the virtual USB disk away yet, you'll still need it. Anyway, once you have it booting again, you will now have a machine that theoretically supports Secure Boot, but you're still running off an MBR partition. I found a procedure on how to convert things from MBR to GPT that was written almost 10 years ago, but surprisingly it still works, except for the bit where the procedure suggests you use diskmgmt.msc (for one thing, that was renamed; and for another, it can't touch the partition table of the system disk either). 
The last step in that procedure says to restart your computer!, which is fine, except at this point you obviously need to switch over to the TianoCore firmware, otherwise you're trying to read a UEFI boot configuration on a system that only supports MBR booting, which obviously won't work. In order to do that, you need to add a loader element to the domain.os element of your libvirt configuration:
<loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
When you do this, you'll note that virt-manager automatically adds an nvram element. That's fine, let it. I figured this out by looking at the documentation for enabling Secure Boot in a VM on the Debian wiki, and using the same trick as for how to switch chipsets that I explained above. Okay, yay, so now secure boot is enabled, and we can install Windows 11! All good? Well, almost. I found that once I enabled secure boot, my display reverted to a 1024x768 screen. This turned out to be because I was using older unsigned drivers, and since we're using Secure Boot, that's no longer allowed, which means Windows reverts to the default VGA driver, and that only supports the 1024x768 resolution. Yeah, I know. The solution is to download the virtio-win ISO from one of the links in the virtio-win github project, connecting it to the VM, going to Device manager, selecting the display controller, clicking on the "Update driver" button, telling the system that you have the driver on your computer, browsing to the CD-ROM drive, clicking the "include subdirectories" option, and then telling Windows to do its thing. While there, it might be good to do the same thing for unrecognized devices in the device manager, if any. So, all I have to do next is to get used to the completely different user interface of Windows 11. Sigh. Oh, and to rename the "w10" VM to "w11", or some such. Maybe.

23 July 2022

Wouter Verhelst: Planet Grep now running PtLink

Almost 2 decades ago, Planet Debian was created using the "planetplanet" RSS aggregator. A short while later, I created Planet Grep using the same software. Over the years, the blog aggregator landscape has changed a bit. First of all, planetplanet was abandoned, forked into Planet Venus, and then abandoned again. Second, the world of blogging (aka the "blogosphere") has largely disappeared, and the more modern world uses things like "Social Networks", etc, making blogs less relevant these days. A blog aggregator community site is still useful, however, and so I've never taken Planet Grep down, even though over the years the number of blogs carried on Planet Grep has been shrinking. In the past almost 20 years, I've just run Planet Grep on my personal server, upgrading its Debian release from whichever was the most recent stable release in 2005 to buster, never encountering any problems. That all changed when I did the upgrade to Debian bullseye, however. Planet Venus is a Python 2 application, which was never updated to Python 3. Since Debian bullseye drops support for much of Python 2, focusing only on Python 3 (in accordance with python upstream's policy on the matter), that means I have had to run Planet Venus from inside a VM for a while now, which works as a short-term solution but not as a long-term one. Although there are other implementations of blog aggregation software out there, I wanted to stick with something (mostly) similar. Additionally, I have been wanting to add functionality to it to also pull stuff from Social Networks, where possible (and legal, since some of these have... scary Terms Of Use documents). So, as of today, Planet Grep is no longer powered by Planet Venus, but instead by PtLink. Rather than Python, it was written in Perl (a language with which I am more familiar), and there are plans for me to extend things in ways that have little to do with blog aggregation anymore... 
There are a few other Planets out there that also use Planet Venus at this point -- Planet Debian and Planet FSFE are two that I'm currently already aware of, but I'm sure there might be more, too. At this point, PtLink is not yet on feature parity with Planet Venus -- as shown by the fact that it can't yet build either Planet Debian or Planet FSFE successfully. But I'm not stopping my development here, and hopefully I'll have something that successfully builds both of those soon, too. As a side note, PtLink is not intended to be bug compatible with Planet Venus. For one example, the configuration for Planet Grep contains an entry for Frederic Descamps, but somehow Planet Venus failed to fetch his feed. With the switch to PtLink, that seems fixed, and now some entries from Frederic seem to appear. I'm not going to be "fixing" that feature... but of course there might be other issues that will appear. If that's the case, let me know. If you're reading this post through Planet Grep, consider this a public service announcement for the possibility (hopefully a remote one) of minor issues.

20 May 2022

Wouter Verhelst: Faster tar

I have a new laptop. The new one is a Dell Latitude 5521, whereas the old one was a Dell Latitude 5590. As both the old and the new laptops are owned by the people who pay my paycheck, I'm supposed to copy all my data off the old laptop and then return it to the IT department. A simple way of doing this (and what I'd usually use) is to just rsync the home directory (and other relevant locations) to the new machine. However, for various reasons I didn't want to do that this time around; for one, my home directory on the old laptop is a bit of a mess, and a new laptop is an ideal moment in time to clean that up. If I were to just rsync over the new home directory, then, well. So instead, I'm creating a tar ball. The first attempt was quite slow:
tar cvpzf wouter@new-laptop:old-laptop.tar.gz /home /var /etc
The problem here is that the default compression algorithm, gzip, is quite slow, especially if you use the default non-parallel implementation. So we tried something else:
tar cvpf wouter@new-laptop:old-laptop.tar.gz -Ipigz /home /var /etc
Better, but not quite great yet. The old laptop now has bursts of maxing out CPU, but it doesn't even come close to maxing out the gigabit network cable between the two. Tar can compress to the LZ4 algorithm. That algorithm doesn't compress very well, but it's the best algorithm if "speed" is the most important consideration. So I could do that:
tar cvpf wouter@new-laptop:old-laptop.tar.gz -Ilz4 /home /var /etc
The trouble with that, however, is that the tarball will then be quite big. So why not use the CPU power of the new laptop?
tar cvpf - /home /var /etc | ssh new-laptop "pigz > old-laptop.tar.gz"
Yeah, that's much faster. Except, now the network speed becomes the limiting factor. We can do better.
tar cvpf - -Ilz4 /home /var /etc | ssh new-laptop "lz4 -d | pigz > old-laptop.tar.gz"
This uses about 70% of the link speed, just over one core on the old laptop, and 60% of CPU time on the new laptop. After also adding a bit of --exclude="*cache*", to avoid files we don't care about, things go quite quickly now: somewhere between 200 and 250G (uncompressed) was transferred into a 74G file, in 20 minutes. My first attempt hadn't even done 10G after an hour!
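The final pipeline, including the exclude, could be wrapped up roughly as follows. This is a sketch rather than the exact command that was run: the function and variable names are invented, lz4 is run as a separate pipeline stage instead of through tar's -I option (which should amount to the same thing), and the output file name must not contain spaces.

```shell
#!/bin/sh
# All names below are illustrative; override them from the environment if needed.
REMOTE="${REMOTE:-ssh new-laptop}"   # command that runs a shell string on the receiving side
FAST="${FAST:-lz4}"                  # cheap compressor, run on the old laptop
UNFAST="${UNFAST:-lz4 -d}"           # its decompressor, run on the new laptop
STRONG="${STRONG:-pigz}"             # heavier, parallel compressor on the new laptop
OUT="${OUT:-old-laptop.tar.gz}"      # archive name on the receiving side (no spaces)

# stream_backup PATH...  (hypothetical helper)
# Lightly compress on the sending side to save network bandwidth, then
# recompress properly on the receiving side where CPU time is plentiful.
stream_backup() {
    tar cvpf - --exclude='*cache*' "$@" \
        | $FAST \
        | $REMOTE "$UNFAST | $STRONG > $OUT"
}

# Possible use: stream_backup /home /var /etc
```

Keeping the compressors in variables makes it easy to try other combinations when either the network or one of the CPUs turns out to be the bottleneck.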

17 January 2022

Wouter Verhelst: Different types of Backups

In my previous post, I explained how I recently set up backups for my home server to be synced using Amazon's services. I received a (correct) comment on that by Iustin Pop which pointed out that while it is reasonably cheap to upload data into Amazon's offering, the reverse -- extracting data -- is not as cheap. He is right, in that extracting data from S3 Glacier Deep Archive costs over an order of magnitude more than it costs to store it there on a monthly basis -- in my case, I expect to have to pay somewhere in the vicinity of 300-400 USD for a full restore. However, I do not consider this to be a major problem, as these backups are only to fulfill the rarer of the two types of backup cases. There are two reasons why you should have backups. The first is the most common one: "oops, I shouldn't have deleted that file". This happens reasonably often; people will occasionally delete or edit a file that they did not mean to, and then they will want to recover their data. At my first job, a significant part of my job was to handle recovery requests from users who had accidentally deleted a file that they still needed. Ideally, backups to handle this type of situation are easily accessible to end users, and are performed reasonably frequently. A system that automatically creates and deletes filesystem snapshots (such as the zfsnap script for ZFS snapshots, which I use on my server) works well. The crucial bit here is to ensure that it is easier to copy an older version of a file than it is to start again from scratch -- if a user must file a support request that may or may not be answered within a day or so, it is likely they will not do so for a file they were working on for only half a day, which means they lose half a day of work in such a case. 
If, on the other hand, they can just go into the snapshots directory themselves and it takes them all of two minutes to copy their file, then they will also do that for files they only created half an hour ago, so they don't even lose half an hour of work and can get right back to it. This means that backup strategies to mitigate the "oops I lost a file" case ideally do not involve off-site file storage, and instead are performed online. The second case is the much rarer one, but (when required) has the much bigger impact: "oops the building burned down". Variants of this can involve things like lightning strikes, thieves, earthquakes, and the like; in all cases, the point is that you want to be able to recover all your files, even if every piece of equipment you own is no longer usable. That being the case, you will first need to replace that equipment, which is not going to be cheap, and it is also not going to be an overnight thing. In order to still be useful after you lost all your equipment, these backups must also be stored off-site, and should preferably be offline backups, too. Since replacing your equipment is going to cost you time and money, it's fine if restoring the backups is going to take a while -- you can't really restore from backup any time soon anyway. And since you will lose a number of days of content that you can't create when you can only fall back on your off-site backups, it's fine if you also lose a few days of content that you will have to re-create. All in all, the two types of backups have opposing requirements: "oops I lost a file" backups should be performed often and should be easily available; "oops I lost my building" backups should not be easily available, and are ideally done less often, so you don't pay a high amount of money for storage of your off-sites. In my opinion, if you have good "lost my file" backups, then it's also fine if the recovery of your backups is a bit more expensive. 
You don't expect to have to ever pay for these; you may end up with a situation where you don't have a choice, and then you'll be happy that the choice is there, but as long as you can reasonably pay for the worst case scenario of a full restore, it's not a case you should be worried about much. As such, and given that a full restore from Amazon Storage Gateway is going to be somewhere between 300 and 400 USD for my case -- a price I can afford, although it's not something I want to pay every day -- I don't think it's a major issue that extracting data is significantly more expensive than uploading data. But of course, this is something everyone should consider for themselves...
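To make the "oops, I shouldn't have deleted that file" case a bit more concrete: ZFS exposes snapshots under a .zfs/snapshot directory at the root of each dataset, so finding the most recent surviving copy of a file can be sketched as below. The helper name is invented, and the sketch assumes that snapshot names sort chronologically (for instance because they start with a timestamp), which may or may not match your snapshot tool's naming.

```shell
# latest_snapshot_copy MOUNTPOINT RELATIVE_PATH  (hypothetical helper)
# Prints the path of RELATIVE_PATH inside the last snapshot (in glob sort
# order) that still contains it; fails if no snapshot has the file.
latest_snapshot_copy() {
    found=""
    for candidate in "$1"/.zfs/snapshot/*/"$2"; do
        # Globs expand in sorted order, so the last existing match wins.
        if [ -e "$candidate" ]; then
            found="$candidate"
        fi
    done
    [ -n "$found" ] || return 1
    printf '%s\n' "$found"
}

# Possible use (paths are examples only):
#   cp "$(latest_snapshot_copy /home wouter/report.txt)" /home/wouter/report.txt
```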

16 January 2022

Wouter Verhelst: Backing up my home server with Bacula and Amazon Storage Gateway

I have a home server. Initially conceived and sized so I could digitize my (rather sizeable) DVD collection, I started using it for other things; I added a few play VMs on it, started using it as a destination for the deja-dup-based backups of my laptop and the time machine-based ones of the various macs in the house, and used it as the primary location of all the photos I've taken with my cameras over the years (currently taking up somewhere around 500G) as well as those that were taken at our wedding (another 100G). To add to that, I've copied the data that my wife had on various older laptops and external hard drives onto this home server as well, so that we don't lose the data should something happen to one or more of these bits of older hardware. Needless to say, the server was running full, so a few months ago I replaced the 4x2T hard drives that I originally put in the server with 4x6T ones, and there was much rejoicing. But then I started considering what I was doing. Originally, the intent was for the server to contain DVD rips of my collection; if I were to lose the server, I could always re-rip the collection and recover that way (unless something happened that caused me to lose both at the same time, of course, but I consider that sufficiently unlikely that I don't want to worry about it). Much of the new data on the server, however, cannot be recovered like that; if the server dies, I lose my photos forever, with no way of recovering them. Obviously that can't be okay. So I started looking at options to create backups of my data, preferably in ways that make it easily doable for me to automate the backups -- because backups that have to be initiated are backups that will be forgotten, and backups that are forgotten are backups that don't exist. So let's not try that. 
When I was still self-employed in Belgium and running a consultancy business, I sold a number of lower-end tape libraries for which I then configured bacula, and I preferred a solution that would be similar to that without costing an arm and a leg. I did have a look at a few second-hand tape libraries, but even second hand these are still way outside what I can budget for this kind of thing, so that was out too. After looking at a few solutions that seemed very hackish and would require quite a bit of handholding (which I don't think is a good idea), I remembered that a few years ago, I had a look at the Amazon Storage Gateway for a customer. This gateway provides a virtual tape library with 10 drives and 3200 slots (half of which are import/export slots) over iSCSI. The idea is that you install the VM on a local machine, you connect it to your Amazon account, you connect your backup software to it over iSCSI, and then it syncs the data that you write to Amazon S3, with the ability to archive data to S3 Glacier or S3 Glacier Deep Archive. I didn't end up using it at the time because it required a VMWare virtualization infrastructure (which I'm not interested in), but I found out that these days, they also provide VM images for Linux KVM-based virtual machines (amongst others), so that changes things significantly. After making a few calculations, I figured out that for the amount of data that I would need to back up, I would require a monthly budget of somewhere between 10 and 20 USD if the bulk of the data would be on S3 Glacier Deep Archive. This is well within my means, so I gave it a try. The VM's technical requirements state that you need to assign four vCPUs and 16GiB of RAM, which just so happens to be the exact amount of RAM and CPU that my physical home server has. Obviously we can't do that. 
I tried getting away with 4GiB and 2 vCPUs, but that didn't work; the backup failed out after about 500G out of 2T had been written, due to the VM running out of resources. On the VM's console I found complaints that it required more memory, and I saw it mention something in the vicinity of 7GiB instead, so I decided to try again, this time with 8GiB of RAM rather than 4. This worked, and the backup was successful. As far as bacula is concerned, the tape library is just a (very big...) normal tape library, and I got data throughput of about 30M/s while the VM's upload buffer hadn't run full yet, with things slowing down to pretty much my Internet line speed when it had. With those speeds, Bacula finished the backup successfully in "1 day 6 hours 43 mins 45 secs", although the storage gateway was still uploading things to S3 Glacier for a few hours after that. All in all, this seems like a viable backup solution for large(r) amounts of data, although I haven't yet tried to perform a restore.

28 November 2021

Wouter Verhelst: GR procedures and timelines

A vote has been proposed in Debian to change the formal procedure in Debian by which General Resolutions (our name for "votes") are proposed. The original proposal is based on a text by Russ Allbery, which changes a number of rules to be less ambiguous and, frankly, less weird. One thing Russ' proposal does, however, which I am absolutely not in agreement with, is to add an absolutely hard time limit after three weeks. That is, in the proposed procedure, the discussion time will be two weeks initially (unless the Debian Project Leader chooses to reduce it, which they can do by up to one week), and it will be extended if more options are added to the ballot; but after three weeks, no matter where the discussion stands, the discussion period ends and Russ' proposed procedure forces us to go to a vote, unless all proposers of ballot options agree to withdraw their option. I believe this is a big mistake. I think any procedure we come up with should allow for the possibility that we may end up with a situation where everyone agrees that extending the discussion time a short time is a good idea, without necessarily resetting the whole discussion time to another two weeks (modulo a decision by the DPL). At the same time, any procedure we come up with should try to avoid the possibility of process abuse by people who would rather delay a vote ad infinitum than see it voted upon. A hard time limit certainly does that; but I believe it causes more problems than it solves. I think instead that it is necessary for any procedure to allow for the discussion time to be extended as long as a strong enough consensus exists that this would be beneficial. As such, I have proposed an amendment to Russ' proposal (a full version of my proposed constitution can be seen on salsa) that hopefully solves these issues in a novel way: it allows anyone to request an extension to the discussion time, which then needs to be sponsored according to the same rules as a new ballot option. 
If the time extension is successfully created, those who supported the extension can then also no longer propose any new ones. Additionally, after 4 weeks, the proposed procedure allows anyone to object, so that 4 weeks is probably the practical limit -- although the possibility exists if enough support exists to extend the discussion time (or not enough to end it). The full rules involve slightly more than that (I don't like to put too much formal language in a blog post), but they're not too complicated, I think. That proposal has received a number of seconds, but after a week it hasn't yet reached the constitutional requirement for the option to be on the ballot. So, I guess this is a public request for more support to my proposal. If you're a Debian Developer and you agree with me that my proposed procedure is better than the alternative, please step forward and let yourself be heard. Thanks!

27 September 2021

Wouter Verhelst: SReview::Video is now Media::Convert

SReview, the video review and transcode tool that I originally wrote for FOSDEM 2017 but which has since been used for debconfs and minidebconfs as well, has long had a sizeable component for inspecting media files with ffprobe, and generating ffmpeg command lines to convert media files from one format to another. This component, SReview::Video (plus a number of supporting modules), is really not tied very much to the SReview webinterface or the transcoding backend. That is, the webinterface and the transcoding backend obviously use the ffmpeg handling library, but they don't provide any services that SReview::Video could not live without. It did use the configuration API that I wrote for SReview, but disentangling that turned out to be very easy. As I think SReview::Video is actually an easy to use, flexible API, I decided to refactor it into Media::Convert, and have just uploaded the latter to CPAN itself. The intent is to refactor the SReview webinterface and transcoding backend so that they will also use Media::Convert instead of SReview::Video in the near future -- otherwise I would end up maintaining everything twice, and then what's the point. This hasn't happened yet, but it will soon (this shouldn't be too difficult after all). Unfortunately Media::Convert doesn't currently install cleanly from CPAN, since I made it depend on Alien::ffmpeg which currently doesn't work (I'm in communication with the Alien::ffmpeg maintainer in order to get that resolved), so if you want to try it out you'll have to do a few steps manually. I'll upload it to Debian soon, too.

27 May 2021

Wouter Verhelst: SReview and pandemics

The pandemic was a bit of a mess for most FLOSS conferences. The two conferences that I help organize -- FOSDEM and DebConf -- are no exception. In both conferences, I do essentially the same work: as a member of both video teams, I manage the postprocessing of the video recordings of all the talks that happened at the respective conference(s). I do this by way of SReview, the online video review and transcode system that I wrote, which essentially crowdsources the manual work that needs to be done, and automates as much as possible of the workflow. The original version of SReview consisted of a database, a (very basic) Mojolicious-based webinterface, and a bunch of perl scripts which would build and execute ffmpeg command lines using string interpolation. As a quick hack that I needed to get working while writing it in my spare time in half a year, that approach was workable and resulted in successful postprocessing after FOSDEM 2017, and a significant improvement in time from the previous years. However, I did not end development with that, and since then I've replaced the string interpolation by an object oriented API for generating ffmpeg command lines, as well as modularized the webinterface. Additionally, I've had help reworking the user interface into a system that is somewhat easier to use than my original interface, and have slowly but surely added more features to the system so as to make it more flexible, as well as support more types of environments for the system to run in. One of the major issues that still remains with SReview is that the administrator's interface is pretty terrible. I had been planning on revamping that for 2020, but then massive amounts of people got sick, travel was banned, and both the conferences that I work on were converted to an online-only conference. 
These have some very specific requirements; e.g., both conferences allowed people to upload a prerecorded version of their talk, rather than doing the talk live; since preprocessing a video is, technically, very similar to postprocessing it, I adapted SReview to allow people to upload a video file that it would then validate (in terms of length, codec, and apparent resolution). This seems easy to do, but I decided to implement this functionality so that it would also allow future use for in-person conferences, where occasionally a speaker requests that modifications be made to the video file in a way that SReview is unable to do. This made it marginally more involved, but at least will mean that a feature which I had planned to implement some years down the line is now already implemented. The new feature works quite well, and I'm happy I've implemented it in the way that I have. In order for the "upload" processing and the "post-event" processing to not be confused, however, I decided to import the conference schedules twice: once as the conference itself, and once as a shadow version of that conference for the prerecordings. That way, I could track the progress through the system of the prerecording completely separately from the progress of the postprocessing of the video (which adds opening/closing credits, and transcodes to multiple variants of the same video). Schedule parsing was something that had not been implemented in a generic way yet, however; since that made doubling the schedule in that way rather complex, I decided to bite the bullet and (finally) implement schedule parsing in a generic way. Currently, schedule parsers exist for two formats (Pentabarf XML and the Wafer variant of that same format which is almost, but not quite, entirely the same). The API for that is quite flexible, and I'm happy with the way things have been implemented there. 
I've also implemented a set of "virtual" parsers, which allow mangling the schedule in various ways (by either filtering out talks that we don't want, or by generating the shadow version of the schedule that I talked about earlier). While the SReview settings have reasonable defaults, occasionally the output of SReview is not entirely acceptable, due to more complicated matters that then result in encoding artifacts. As a result, the DebConf video team has been doing a final review step, completely outside of SReview, to ensure that such encoding artifacts don't exist. That seemed suboptimal, so recently I've been working on integrating that into SReview as well. First tests have been run, and seem to be acceptable, but there's still a few loose ends to be finalized. As part of this, I've also reworked the way comments could be entered into the system. Previously the presence of a comment would signal that the video has some problems that an administrator needed to look at. Unfortunately, that was causing some confusion, with some people even thinking it's a good place to enter a "thank you for your work" style of comment... which it obviously isn't. Turning it into a "comment log" system instead fixes that, and also allows for better two-way communication between administrators and reviewers. Hopefully that'll improve things in that area as well. Finally, the audio normalization in SReview -- for which I've long used bs1770gain -- is having problems. First of all, bs1770gain will sometimes alter the timing of the video or audio file that it's passed, which is very problematic if I want to process it further. There is an ffmpeg loudnorm filter which implements the same algorithm, so that should make things easier to use. Secondly, the author of bs1770gain is a strange character that I'd rather not be involved with. Before I knew about the loudnorm filter I didn't really have a choice, but now I can just rip bs1770gain out and replace it by the loudnorm filter. 
That will fix various other bugs in SReview, too, because SReview relies on behaviour that isn't actually there (but which I didn't know at the time when I wrote it). All in all, the past year-and-a-bit has seen a lot of development for SReview, with multiple features being added and a number of long-standing problems being fixed. Now if only the pandemic would subside, allowing the whole "let's do everything online only" wave to cool down a bit, so that I can finally make time to implement the admin interface...
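The "virtual" schedule parsers described above (one that filters out unwanted talks, one that produces the shadow copy of a schedule for prerecordings) can be pictured as wrappers around a stream of talks. The following Python sketch is only an illustration under assumptions: SReview itself is written in Perl, and the `Talk` record, the `filtered` and `shadowed` functions, and the " (prerecordings)" suffix are all hypothetical names, not SReview's real schedule API.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Talk:
    """Hypothetical minimal talk record as a real parser might emit it."""
    event: str
    title: str
    room: str


def filtered(talks: Iterable[Talk], keep: Callable[[Talk], bool]) -> Iterator[Talk]:
    """Virtual parser: pass through only the talks that `keep` accepts."""
    for talk in talks:
        if keep(talk):
            yield talk


def shadowed(talks: Iterable[Talk], suffix: str = " (prerecordings)") -> Iterator[Talk]:
    """Virtual parser: emit a copy of each talk under a shadow event name."""
    for talk in talks:
        yield replace(talk, event=talk.event + suffix)


schedule = [
    Talk("DebConf", "Opening", "Main"),
    Talk("DebConf", "Ad-hoc BoF", "Small"),
]
main_room_only = list(filtered(schedule, lambda talk: talk.room == "Main"))
shadow_schedule = list(shadowed(schedule))
print(main_room_only)
print(shadow_schedule)
```

Because each wrapper takes and returns a stream of talks, wrappers can be stacked on top of any concrete parser, which is what makes importing the same schedule twice (once as itself, once as a shadow) cheap.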

Wouter Verhelst: Freenode

Bye, Freenode
I have been on Freenode for about 20 years, since my earliest involvement with Debian in about 2001. When Debian moved to OFTC for its IRC presence way back in 2006, I hung around on Freenode somewhat since FOSDEM's IRC channels were still there, as well as for a number of other channels that I was on at the time (not anymore though). This is now over and done with. What's happening with Freenode is a shitstorm -- one that could easily have been fixed if one particular person had stepped down a few days ago, but by now is a lost cause. At any rate, I'm now lurking, mostly for FOSDEM channels, on libera.chat, under my usual nick, as well as on OFTC.

22 March 2021

Wouter Verhelst: Twenty years of Debian

Ten years ago, I reflected on the fact that -- by that time -- I had been in Debian for just over ten years. This year, in early February, I've passed the twenty year milestone. As I'm turning 43 this year, I will have been in Debian for half my life in about three years. Scary thought, that. In the past ten years, not much has changed, and yet at the same time, much has. I became involved in the Debian video team; I stepped down from the m68k port; and my organizing of the Debian devroom at FOSDEM resulted in me eventually joining the FOSDEM orga team, where I eventually ended up also doing video. As part of my video work, I wrote SReview, for which in these COVID-19 times in much of my spare time I have had to write new code and/or fix bugs. I was a candidate for the position of DPL one more time, without being elected. I was also a candidate for the technical committee a few times, also without success. I also added a few packages to the list of packages that I maintain for Debian; most obviously this includes SReview, but there's also things like extrepo and policy-rcd-declarative, both fairly recent packages that I hope will improve Debian as a whole in the longer term. On a more personal level, at one debconf I met a wonderful girl with whom I have now just celebrated my first wedding anniversary. Before that could happen, I had to move to South Africa two years ago. Moving is an involved process at any one time; moving to a different continent altogether is even more so. As it would have been complicated and involved to remain a business owner of a Belgian business while living 9500km away from the country, I sold my shares to my (now ex) business partner; it turned the page of a 15-year chapter of my life, something I could not do without feelings one way or the other. The things I do in Debian have changed over the past twenty years. 
I've been the maintainer of the second-highest number of packages in the project when I maintained the Linux Gazette packages; I've been an m68k porter; I've been an AM, and briefly even an NM frontdesk member; I've been a DPL candidate three times, and a TC candidate twice. At the turn of my first decade of being a Debian Developer, I noted that people started to recognize my name, and that I started to be one of the Debian Developers who had been with the project longer than most. This has, obviously, not changed. New in the "I'm getting old" department is the fact that during the last Debconf, I noticed for the first time that there was a speaker who had been alive for less long than I had been a Debian Developer. I'm assuming these types of things will continue happening in the next decade, and that the future will bring more of these kinds of changes that will make me feel older as I and the project mature more. I'm looking forward to it. Here's to you, Debian; may you continue to influence my life, in good ways and in bad (but hopefully mostly good), as well as continue to inspire me to improve the world, as you have over the past twenty years!

17 January 2021

Wouter Verhelst: SReview 0.6

... isn't ready yet, but it's getting there. I had planned to release a new version of SReview, my online video review and transcoding system that I wrote originally for FOSDEM but is being used for DebConf, too, after it was set up and running properly for FOSDEM 2020. However, things got a bit busy (both in my personal life and in the world at large), so it fell a bit by the wayside. I've now also been working on things a bit more, in preparation for an improved administrator's interface, and have started implementing a REST API to deal with talks etc through HTTP calls. This seems to be coming along nicely, thanks to OpenAPI and the Mojolicious plugin for parsing that. I can now design the API nicely, and autogenerate client side libraries to call them. While at it, because libmojolicious-plugin-openapi-perl isn't available in Debian 10 "buster", I moved the docker containers over from stable to testing. This revealed that both bs1770gain and inkscape changed their command line incompatibly, resulting in me having to work around those incompatibilities. The good news is that I managed to do so in a way that keeps running SReview on Debian 10 viable, provided one installs Mojolicious::Plugin::OpenAPI from CPAN rather than from a Debian package. Or installs a backport of that package, of course. Or, heck, uses the Docker containers in a kubernetes environment or some such -- I'd love to see someone use that in production. Anyway, I'm still finishing the API, and the implementation of that API and the test suite that ensures the API works correctly, but progress is happening; and as soon as things seem to be working properly, I'll do a release of SReview 0.6, and will upload that to Debian. Hopefully that'll be soon.

Wouter Verhelst: Dear Google

... Why do you have to be so effing difficult about a YouTube API project that is used for a single event per year? FOSDEM creates 600+ videos on a yearly basis. There is no way I am going to manually upload 600+ videos through your webinterface, so we use the API you provide, using a script written by Stefano Rivera. This script grabs video filenames and metadata from a YAML file, and then uses your APIs to upload said videos with said metadata. It works quite well. I run it from cron, and it uploads files until the quota is exhausted, then waits until the next time the cron job runs. It runs so well, that the first time we used it, we could upload 50+ videos on a daily basis, and so the uploads were done as soon as all the videos were created, which was a few months after the event. Cool! The second time we used the script, it did not work at all. We asked one of our key note speakers who happened to be some hotshot at your company, to help us out. He contacted the YouTube people, and whatever had been broken was quickly fixed, so yay, uploads worked again. I found out later that this is actually a normal thing if you don't use your API quota for 90 days or more. Because it's happened to us every bloody year. For the 2020 event, rather than going through back channels (which happened to be unavailable this edition), I tried to use your normal ways of unblocking the API project. This involves creating a screencast of a bloody command line script and describing various things that don't apply to FOSDEM and ghaah shoot me now so meh, I created a new API project instead, and had the uploads go through that. Doing so gives me a limited quota that only allows about 5 or 6 videos per day, but that's fine, it gives people subscribed to our channel the time to actually watch all the videos while they're being uploaded, rather than being presented with a boatload of videos that they can never watch in a day. Also it doesn't overload subscribers, so yay. 
About three months ago, I started uploading videos. Since then, every day, the "fosdemtalks" channel on YouTube has published five or six videos. Given that, imagine my surprise when I found this in my mailbox this morning... [Image: Google lies, claiming that my YouTube API project isn't being used for 90 days and informing me that it will be disabled] This is an outright lie, Google. The project was created 90 days ago, yes, that's correct. It has been used every day since then to upload videos. I guess that means I'll have to deal with your broken automatic content filters to try and get stuff unblocked... ... or I could just give up and not do this anymore. After all, all the FOSDEM content is available on our public video host, too.
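The upload loop described above (run from cron, upload until the quota is exhausted, pick up again on the next run) can be sketched in a few lines of Python. This is not Stefano Rivera's script and it makes no real YouTube API calls: the `QuotaExceeded` exception, the `upload` callable, the in-memory `done` set and the fake uploader are all assumptions made for illustration, and a plain list of dictionaries stands in for the YAML file.

```python
from typing import Callable, Dict, List, Set


class QuotaExceeded(Exception):
    """Raised by the (hypothetical) uploader once the API quota is used up."""


def upload_pending(videos: List[Dict[str, str]],
                   done: Set[str],
                   upload: Callable[[Dict[str, str]], None]) -> int:
    """Upload videos not yet in `done`; stop quietly when the quota runs out."""
    uploaded = 0
    for video in videos:
        if video["filename"] in done:
            continue  # already handled by an earlier run
        try:
            upload(video)
        except QuotaExceeded:
            break  # leave the rest for the next cron run
        done.add(video["filename"])
        uploaded += 1
    return uploaded


def make_fake_uploader(limit: int) -> Callable[[Dict[str, str]], None]:
    """Stand-in for a real API client: accepts `limit` uploads, then refuses."""
    calls = 0

    def upload(video: Dict[str, str]) -> None:
        nonlocal calls
        if calls >= limit:
            raise QuotaExceeded(video["filename"])
        calls += 1

    return upload


videos = [{"filename": f"talk{i}.webm", "title": f"Talk {i}"} for i in range(3)]
done: Set[str] = set()
first_run = upload_pending(videos, done, make_fake_uploader(limit=2))
second_run = upload_pending(videos, done, make_fake_uploader(limit=2))
print(first_run, second_run)
```

In a real cron job the `done` set would have to be stored somewhere persistent between runs; how the actual script tracks that is not stated here.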

Wouter Verhelst: Software available through Extrepo

Just over 7 months ago, I blogged about extrepo, my answer to the "how do you safely install software on Debian without downloading random scripts off the Internet and running them as root" question. I also held a talk during the recent "MiniDebConf Online" that was held, well, online. The most important part of extrepo is "what can you install through it". If the number of available repositories is too low, there's really no reason to use it. So, I thought, let's look what we have after 7 months... To cut to the chase, there's a bunch of interesting content there, although not all of it has a "main" policy. Each of these can be enabled by installing extrepo, and then running extrepo enable <reponame>, where <reponame> is the name of the repository. Note that the list is not exhaustive, but I intend to show that even though we're nowhere near complete, extrepo is already quite useful in its current state:

Free software
  • The debian_official, debian_backports, and debian_experimental repositories contain Debian's official, backports, and experimental repositories, respectively. These shouldn't have to be managed through extrepo, but then again it might be useful for someone, so I decided to just add them anyway. The config here uses the deb.debian.org alias for CDN-backed package mirrors.
  • The belgium_eid repository contains the Belgian eID software. Obviously this is added, since I'm upstream for eID, and as such it was a large motivating factor for me to actually write extrepo in the first place.
  • elastic: the elasticsearch software.
  • Some repositories, such as dovecot, winehq and bareos, contain upstream versions of their respective software. These repositories contain software that is available in Debian, too; but their upstreams package their most recent release independently, and some people might prefer to run those instead.
  • The sury, fai, and postgresql repositories, as well as a number of repositories such as openstack_rocky, openstack_train, haproxy-1.5 and haproxy-2.0 (there are more) contain more recent versions of software packaged in Debian already by the same maintainer of that package repository. For the sury repository, that is PHP; for the others, the name should give it away. The difference between these repositories and the ones above is that it is the official Debian maintainer for the same software who maintains the repository, which is not the case for the others.
  • The vscodium repository contains the unencumbered version of Microsoft's Visual Studio Code; i.e., the codium version of Visual Studio Code is to code as the chromium browser is to chrome: it is a build of the same software, but without the non-free bits that make code not entirely Free Software.
  • While Debian ships with at least two browsers (Firefox and Chromium), additional browsers are available through extrepo, too. The iridiumbrowser repository contains a Chromium-based browser that focuses on privacy.
  • Speaking of privacy, perhaps you might want to try out the torproject repository.
  • For those who want to do Cloud Computing on Debian in ways that aren't covered by Openstack, there is a kubernetes repository that contains the Kubernetes stack, as well as the google_cloud one containing the Google Cloud SDK.

Non-free software
While these are available to be installed through extrepo, please note that non-free and contrib repositories are disabled by default. In order to use these repositories, you must first enable the corresponding policies; this can be accomplished through /etc/extrepo/config.yaml.
  • In case you don't care about freedom and want the official build of Visual Studio Code, the vscode repository contains it.
  • While we're on the subject of Microsoft, there's also Microsoft Teams available in the msteams repository. And, hey, skype.
  • For those who are not satisfied with the free browsers in Debian or any of the free repositories, there's opera and google_chrome.
  • The docker-ce repository contains the official build of Docker CE. While this is the free "community edition" that should have free licenses, I could not find a licensing statement anywhere, and therefore I'm not 100% sure whether this repository is actually free software. For that reason, it is currently marked as a non-free one. Merge Requests for rectifying that from someone with more information on the actual licensing situation of Docker CE would be welcome...
  • For gamers, there's Valve's steam repository.
Again, the above lists are not meant to be exhaustive. Special thanks go out to Russ Allbery, Kim Alvefur, Vincent Bernat, Nick Black, Arnaud Ferraris, Thorsten Glaser, Thomas Goirand, Juri Grabowski, Paolo Greppi, and Josh Triplett, for helping me build the current list of repositories. Is your favourite repository not listed? Create a configuration based on template.yaml, and file a merge request!

Wouter Verhelst: On Statements, Facts, Hypotheses, Science, Religion, and Opinions

The other day, we went to a designer's fashion shop whose owner was rather adamant that he was never ever going to wear a face mask, and that he didn't believe the COVID-19 thing was real. When I argued for the opposing position, he pretty much dismissed what I said out of hand, claiming that "the hospitals are empty dude" and "it's all a lie". When I told him that this really isn't true, he went like "well, that's just your opinion". Well, no -- certain things are facts, not opinions. Even if you don't believe that this disease kills people, the idea that this is a matter of opinion is missing the ball by so much that I was pretty much stunned by the level of ignorance. His whole demeanor pissed me off rather quickly. While I disagree with the position that it should be your decision whether or not to wear a mask, it's certainly possible to have that opinion. However, whether or not people need to go to hospitals is not an opinion -- it's something else entirely. After calming down, the encounter got me thinking, and made me focus on something I'd been thinking about before but hadn't fully formulated: the fact that some people in this world seem to misunderstand the nature of what it is to do science, and end up, under the claim of being "sceptical", with various nonsense things -- see scientology, flat earth societies, conspiracy theories, and whathaveyou. So, here's something that might (but probably won't) help some people figure stuff out. Even if it doesn't, it's been bothering me and I want to write it down so it won't bother me again. If you know all this stuff, it might be boring and you might want to skip this post. Otherwise, take a deep breath and read on... Statements are things people say. They can be true or false; "the sun is blue" is an example of a statement that is trivially false. "The sun produces light" is another one that is trivially true. 
"The sun produces light through a process that includes hydrogen fusion" is another statement, one that is a bit more difficult to prove true or false. Another example is "Wouter Verhelst does not have a favourite color". That happens to be a true statement, but it's fairly difficult for anyone that isn't me (or any one of the other Wouters Verhelst out there) to validate as true. While statements can be true or false, combining statements without more context is not always possible. As an example, the statement "Wouter Verhelst is a Debian Developer" is a true statement, as is the statement "Wouter Verhelst is a professional Volleybal player"; but the statement "Wouter Verhelst is a professional Volleybal player and a Debian Developer" is not, because while I am a Debian Developer, I am not a professional Volleybal player -- I just happen to share a name with someone who is. A statement is never a fact, but it can describe a fact. When a statement is a true statement, either because we trivially know what it states to be true or because we have performed an experiment that proved beyond any possible doubt that the statement is true, then what the statement describes is a fact. For example, "Red is a color" is a statement that describes a fact (because, yes, red is definitely a color, that is a fact). Such statements are called statements of fact. There are other possible statements. "Grass is purple" is a statement, but it is not a statement of fact; because as everyone knows, grass is (usually) green. A statement can also describe an opinion. "The Porsche 911 is a nice car" is a statement of opinion. It is one I happen to agree with, but it is certainly valid for someone else to make a statement that conflicts with this position, and there is nothing wrong with that. As the saying goes, "opinions are like assholes: everyone has one". Statements describing opinions are known as statements of opinion. 
The differentiating factor between facts and opinions is that facts are universally true, whereas opinions only hold for the people who state the opinion and anyone who agrees with them. Sometimes it's difficult or even impossible to determine whether a statement is true or not. The statement "The numbers that win the South African Powerball lottery on the 31st of July 2020 are 2, 3, 5, 19, 35, and powerball 14" is not a statement of fact, because at the time of writing, the 31st of July 2020 is in the future, which at this point gives it a 1 in 24,435,180 chance to be true. However, that does not make it a statement of opinion; it is not my opinion that the above numbers will win the South African powerball; instead, it is my guess that those numbers will be correct. Another word for "guess" is hypothesis: a hypothesis is a statement that may be universally true or universally false, but for which the truth -- or its lack thereof -- cannot currently be proven beyond doubt. On Saturday, August 1st, 2020 the above statement about the South African Powerball may become a statement of fact; most likely however, it will instead become a false statement. An unproven hypothesis may be expressed as a matter of belief. The statement "There is a God who rules the heavens and the Earth" cannot currently (or ever) be proven beyond doubt to be either true or false, which by definition makes it a hypothesis; however, for matters of religion this is entirely unimportant, as for believers the belief that the statement is correct is all that matters, whereas for nonbelievers the truth of that statement is not at all relevant. A belief is not an opinion; an opinion is not a belief. Scientists do not deal with unproven hypotheses, except insofar that they attempt to prove, through direct observation of nature (either out in the field or in a controlled laboratory setting) that the hypothesis is, in fact, a statement of fact. 
This makes unprovable hypotheses unscientific -- but that does not mean that they are false, or even that they are uninteresting statements. Unscientific statements are merely statements that science cannot either prove or disprove, and that therefore lie outside of the realm of what science deals with. Given that background, I have always found the so-called "conflict" between science and religion to be a non-sequitur. Religion deals in one type of statements; science deals in another. They do not overlap, since a statement can either be proven or it cannot, and religious statements by their very nature focus on unprovable belief rather than universal truth. Sure, the range of things that science has figured out the facts about has grown over time, which implies that religious statements have sometimes been proven false; but is it heresy to say that "animals exist that can run 120 kph" if that is the truth, even if such animals don't exist in, say, Rome? Something very similar can be said about conspiracy theories. Yes, it is possible to hypothesize that NASA did not send men to the moon, and that all the proof contrary to that statement was somehow fabricated. However, by its very nature such a hypothesis cannot be proven or disproven (because the statement states that all proof was fabricated), which therefore implies that it is an unscientific statement. It is good to be sceptical about what is being said to you. People can have various ideas about how the world works, but only one of those ideas -- one of the possible hypotheses -- can be true. As long as a hypothesis remains unproven, scientists love to be sceptical themselves. In fact, if you can somehow prove beyond doubt that a scientific hypothesis is false, scientists will love you -- it means they now know something more about the world and that they'll have to come up with something else, which is a lot of fun. 
When a scientific experiment or observation proves that a certain hypothesis is true, then this probably turns the hypothesis into a statement of fact. That is, it is of course possible that there's a flaw in the proof, or that the experiment failed (but that the failure was somehow missed), or that no observation of a particular event happened when a scientist tried to observe something, but that this was only because the scientist missed it. If you can show that any of those possibilities hold for a scientific proof, then you'll have turned a statement of fact back into a hypothesis, or even (depending on the exact nature of the flaw) into a false statement. There's more. It's human nature to want to be rich and famous, sometimes no matter what the cost. As such, there have been scientists who have falsified experimental results, or who have claimed to have observed something when this was not the case. For that reason, a scientific paper that gets written after an experiment turned a hypothesis into fact describes not only the results of the experiment and the observed behavior, but also the methodology: the way in which the experiment was run, with enough details so that anyone can retry the experiment. Sometimes that may mean spending a large amount of money just to be able to run the experiment (most people don't have an LHC in their backyard, say), and in some cases some of the required materials won't be available (the latter is especially true for, e.g., certain chemical experiments that involve highly explosive things); but the information is always there, and if you spend enough time and money reading through the available papers, you will be able to independently prove the hypothesis yourself. 
Scientists tend to do just that; when the results of a new experiment are published, they will try to rerun the experiment, partially because they want to see things with their own eyes; but partially also because if they can find fault in the experiment or the observed behavior, they'll have reason to write a paper of their own, which will make them a bit more rich and famous. I guess you could say that there are three types of people who deal with statements: scientists, who deal with provable hypotheses and statements of fact (but who have no use for unprovable hypotheses and statements of opinion); religious people and conspiracy theorists, who deal with unprovable hypotheses (where the religious people deal with these to serve a larger cause, while conspiracy theorists only care about the unprovable hypotheses); and politicians, who should care about proven statements of fact and produce statements of opinion, but who usually attempt the reverse of those two these days :-/ Anyway...

5 October 2020

Reproducible Builds: Reproducible Builds in September 2020

Welcome to the September 2020 report from the Reproducible Builds project. In our monthly reports, we attempt to summarise the things that we have been up to over the past month, but if you are interested in contributing to the project, please visit our main website. This month, the Reproducible Builds project was pleased to announce a donation from Amateur Radio Digital Communications (ARDC) in support of its goals. ARDC's contribution will propel the Reproducible Builds project's efforts in ensuring the future health, security and sustainability of our increasingly digital society. Amateur Radio Digital Communications (ARDC) is a non-profit which was formed to further research and experimentation with digital communications using radio, with a goal of advancing the state of the art of amateur radio and to educate radio operators in these techniques. You can view the full announcement as well as more information about ARDC on their website.
In August's report, we announced that Jennifer Helsby (redshiftzero) launched a new reproduciblewheels.com website to address the lack of reproducibility of Python wheels. This month, Kushal Das posted a brief follow-up to provide an update on reproducible sources as well. The Threema privacy and security-oriented messaging application announced that within the next months, their apps will become fully open source, supporting reproducible builds:
This is to say that anyone will be able to independently review Threema's security and verify that the published source code corresponds to the downloaded app.
You can view the full announcement on Threema's website.

Events
Sadly, due to the unprecedented events in 2020, there will be no in-person Reproducible Builds event this year. However, the Reproducible Builds project intends to resume meeting regularly on IRC, starting on Monday, October 12th at 18:00 UTC (full announcement). The cadence of these meetings will probably be every two weeks, although this will be discussed and decided on at the first meeting. (An editable agenda is available.) On 18th September, Bernhard M. Wiedemann gave a presentation in German titled "Wie reproducible builds Software sicherer machen" ("How reproducible builds make software more secure") at the Internet Security Digital Days 2020 conference. (View video.) On Saturday 10th October, Morten Linderud will give a talk at Arch Conf Online 2020 on The State of Reproducible Builds in the Arch Linux distribution:
The previous year has seen great progress in Arch Linux to get reproducible builds in the hands of the users and developers. In this talk we will explore the current tooling that allows users to reproduce packages, the rebuilder software that has been written to check packages and the current issues in this space.
During the Reproducible Builds summit in Marrakesh, GNU Guix, NixOS and Debian were able to produce a bit-for-bit identical binary when building GNU Mes, despite using three different major versions of GCC. Since the summit, additional work resulted in a bit-for-bit identical Mes binary using tcc and this month, a fuller update was posted by the individuals involved.
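"Bit-for-bit identical" means that two independently built files contain exactly the same bytes. A simple way to check that claim for a pair of build outputs is to compare cryptographic digests of both files. The Python sketch below is a generic illustration using only the standard library; it is not the tooling used by the projects mentioned above, and the file names in the demonstration are made up.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bit_for_bit_identical(first: Path, second: Path) -> bool:
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(first) == sha256_of(second)


# Demonstration with throwaway files standing in for two build outputs.
with tempfile.TemporaryDirectory() as workdir:
    build_a = Path(workdir) / "build-a.bin"
    build_b = Path(workdir) / "build-b.bin"
    build_c = Path(workdir) / "build-c.bin"
    build_a.write_bytes(b"same bytes")
    build_b.write_bytes(b"same bytes")
    build_c.write_bytes(b"same bytes, plus an embedded build path")
    same = bit_for_bit_identical(build_a, build_b)
    different = bit_for_bit_identical(build_a, build_c)

print(same, different)
```

A digest comparison only answers whether two builds differ; finding out why they differ is the job of a tool such as diffoscope.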

Development work
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

Debian
Chris Lamb uploaded a number of Debian packages to address reproducibility issues that he had previously provided patches for, including cfingerd (#831021), grap (#870573), splint (#924003) & schroot (#902804). Last month, an issue was identified where a large number of Debian .buildinfo build certificates had been tainted on the official Debian build servers, as these environments had files underneath the /usr/local/sbin directory to prevent the execution of system services during package builds. However, this month, Aurelien Jarno and Wouter Verhelst fixed this issue in varying ways, resulting in a special policy-rcd-declarative-deny-all package. Building on Chris Lamb's previous work on reproducible builds for Debian .ISO images, Roland Clobus announced his work in progress on making the Debian Live images reproducible. Lucas Nussbaum performed an archive-wide rebuild of packages to test enabling the reproducible=+fixfilepath Debian build flag by default. Enabling the fixfilepath feature will likely fix reproducibility issues in an estimated 500-700 packages. The test revealed only 33 packages (out of 30,000 in the archive) that fail to build with fixfilepath. Many of those will be fixed when the default LLVM/Clang version is upgraded. 79 reviews of Debian packages were added, 23 were updated and 17 were removed this month, adding to our knowledge about identified issues. Chris Lamb added and categorised a number of new issue types, including packages that capture their build path via quicktest.h and absolute build directories in documentation generated by Doxygen, etc. Lastly, Lukas Puehringer uploaded a new version of in-toto to Debian, which was sponsored by Holger Levsen.

diffoscope

diffoscope is our in-depth and content-aware diff utility that can not only locate and diagnose reproducibility issues but also provide human-readable diffs of all kinds. In September, Chris Lamb made the following changes to diffoscope, including preparing and uploading versions 159 and 160 to Debian:
  • New features:
    • Show ordering differences only in strings(1) output by applying the ordering check to all differences across the codebase.
  • Bug fixes:
    • Mark some PGP tests as requiring pgpdump, and check that the associated binary is actually installed before attempting to run it. (#969753)
    • Don't raise exceptions when cleaning up after guestfs cleanup failure.
    • Ensure we check FALLBACK_FILE_EXTENSION_SUFFIX, otherwise we run pgpdump against all files that are recognised by file(1) as data.
  • Codebase improvements:
    • Add some documentation for the EXTERNAL_TOOLS dictionary.
    • Abstract out a variable we use a couple of times.
  • diffoscope.org website improvements:
    • Make the (long) demonstration GIF less prominent on the page.
In addition, Paul Spooren added support for automatically deploying Docker images.
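
The pgpdump fix listed above follows a common pattern: probe for an external binary before trying to run it. The following is a minimal Python sketch of that pattern only, not diffoscope's actual code; the helper names `tool_available` and `run_if_available` are hypothetical.

```python
import shutil
import subprocess


def tool_available(name):
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


def run_if_available(command):
    """Run `command` (a list of arguments) only if its binary is installed.

    Returns the captured standard output, or None when the tool is missing,
    so that callers can skip the step instead of crashing.
    """
    if not tool_available(command[0]):
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout
```

A test suite could use the same check to skip, rather than fail, tests whose external tool is not installed.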

Website and documentation

This month, a number of updates were made to the main Reproducible Builds website and related documentation. Chris Lamb made a number of changes; in addition, Holger Levsen re-added the documentation link to the top-level navigation and documented that the jekyll-polyglot package is required. Lastly, diffoscope.org and reproducible-builds.org were transferred to Software Freedom Conservancy. Many thanks to Brett Smith from Conservancy, Jérémy Bobbio (lunar) and Holger Levsen for their help with transferring and to Mattia Rizzolo for initiating this.

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of these patches. Bernhard M. Wiedemann also reported issues in git2-rs, pyftpdlib, python-nbclient, python-pyzmq & python-sidpy.

Testing framework

The Reproducible Builds project operates a Jenkins-based testing framework to power tests.reproducible-builds.org. This month, Holger Levsen made the following changes:
  • Debian:
    • Shorten the subject of "nodes have gone offline" notification emails.
    • Also track bugs that have been usertagged with usrmerge.
    • Drop abort-related codepaths as that functionality has been removed from Jenkins.
    • Update the frequency we update base images and status pages.
  • Status summary view page:
    • Add support for monitoring systemctl status and the number of diffoscope processes.
    • Show the total number of nodes and colourise critical disk space situations.
    • Improve the visuals with respect to vertical space.
  • Debian rebuilder prototype:
    • Resume building random packages and update the frequency at which packages are rebuilt.
    • Use --no-respect-build-path parameter until sbuild 0.81 is available.
    • Treat the inability to locate some packages as a debrebuild problem, and not as an issue with the rebuilder itself.
  • Arch Linux:
    • Update various components to be compatible with Arch Linux's move to the xz compression format.
    • Allow scheduling of old packages to catch up on the backlog.
    • Improve formatting on the summary page.
    • Update HTML pages once every hour, not every 30 minutes.
    • Use the Ubuntu (!) GPG keyserver to validate packages.
  • System health checks:
    • Highlight important bad conditions in colour.
    • Add support for detecting more problems, including Jenkins shutdown issues, failure to upgrade Arch Linux packages, kernels with wrong permissions, etc.
  • Misc:
    • Delete old schroot sessions after 2 days, not 3.
    • Use sudo to clean up diffoscope schroot sessions.
In addition, stefan0xC fixed a query for unknown results in the handling of Arch Linux packages and Mattia Rizzolo updated the template that notifies maintainers by email of their newly-unreproducible packages to ensure that it did not get caught in junk/spam folders. Finally, build node maintenance was performed by Holger Levsen, Mattia Rizzolo and Vagrant Cascadian.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website.

27 July 2020

Wouter Verhelst: giphy.gif

16 July 2020

Louis-Philippe Véronneau: DebConf Videoteam Sprint Report -- DebConf20@Home

DebConf20 starts in about 5 weeks, and as always, the DebConf Videoteam is working hard to make sure it'll be a success. As such, we held a sprint from July 9th to 13th to work on our new infrastructure. A remote sprint certainly ain't as fun as an in-person one, but we nonetheless managed to enjoy ourselves.

Many thanks to those who participated. We also wish to extend our thanks to Thomas Goirand and Infomaniak for providing us with virtual machines to experiment on and host the video infrastructure for DebConf20.

Advice for presenters

For DebConf20, we strongly encourage presenters to record their talks in advance and send us the resulting video. We understand this is more work, but we think it'll make for a more agreeable conference for everyone. Video conferencing is still pretty wonky and there is nothing worse than a talk ruined by a flaky internet connection or hardware failures. As such, if you are giving a talk at DebConf this year, we are asking you to read and follow our guide on how to record your presentation. Fear not: we are not getting rid of the Q&A period at the end of talks. Attendees will ask their questions either on IRC or on a collaborative pad and the Talkmeister will relay them to the speaker once the pre-recorded video has finished playing.

New infrastructure, who dis?

Organising a virtual DebConf implies migrating from our battle-tested on-premise workflow to a completely new remote one. One of the major changes this means for us is the addition of Jitsi Meet to our infrastructure. We normally have 3 different video sources in a room: two cameras and a slides grabber. With the new online workflow, directors will be able to play pre-recorded videos as a source, will get a feed from a Jitsi room and will see the audience questions as a third source. This might seem simple at first, but is in fact a very major change to our workflow and required a lot of work to implement.
               == On-premise ==                                          == Online ==

              Camera 1                                                 Jitsi
                 |                                                       |
                 v                 ---> Frontend                         v                 ---> Frontend
                 |                 |                                     |                 |
    Slides -> Voctomix -> Backend -+--> Frontend         Questions -> Voctomix -> Backend -+--> Frontend
                 |                 |                                     |                 |
                 ^                 ---> Frontend                         ^                 ---> Frontend
                 |                                                       |
              Camera 2                                           Pre-recorded video
In our tests, playing back pre-recorded videos to voctomix worked well, but was sometimes unreliable due to inconsistent encoding settings. Presenters will thus upload their pre-recorded talks to SReview so we can make sure there aren't any obvious errors. Videos will then be re-encoded to ensure a consistent encoding and to normalise audio levels. This process will also let us stitch the Q&As at the end of the pre-recorded videos more easily prior to publication.

Reducing the stream latency

One of the pitfalls of the streaming infrastructure we have been using since 2016 is high video latency. In a worst-case scenario, remote attendees could get up to 45 seconds of latency, making participation in events like BoFs arduous. In preparation for DebConf20, we added a new way to stream our talks: RTMP. Attendees will thus have the option of using either an HLS stream with higher latency or an RTMP stream with lower latency. Here is a comparative table that can help you decide between the two protocols:
  • HLS pros:
    • Can be watched from a browser
    • Auto-selects a stream encoding
    • Single URL to remember
  • HLS cons:
    • Higher latency (up to 45s)
  • RTMP pros:
    • Lower latency (~5s)
  • RTMP cons:
    • Requires a dedicated video player (VLC, mpv)
    • Specific URLs for each encoding setting
Live mixing from home with VoctoWeb

Since DebConf16, we have been using voctomix, a live video mixer developed by the CCC VOC. voctomix is conveniently divided in two: voctocore is the backend server while voctogui is a GTK+ UI frontend directors can use to live-mix. Although voctogui can connect to a remote server, it was primarily designed to run either on the same machine as voctocore or on the same LAN. Trying to use voctogui from a machine at home to connect to a voctocore running in a datacenter proved unreliable, especially for high-latency and low-bandwidth connections.

Inspired by the setup FOSDEM uses, we instead decided to go with a web frontend for voctocore. We initially used FOSDEM's code as a proof of concept, but quickly reimplemented it in Python, a language we are more familiar with as a team. Compared to the FOSDEM PHP implementation, voctoweb implements A / B source selection (akin to voctogui) as well as audio control, two very useful features. In the following screen captures, you can see the old PHP UI on the left and the new shiny Python one on the right.

[Screen captures: the old PHP voctoweb (left) and the new Python3 voctoweb (right)]

Voctoweb is still under development and is likely to change quite a bit until DebConf20. Still, the current version seems to work well enough to be used in production if you ever need to.

Python GeoIP redirector

We run multiple geographically-distributed streaming frontend servers to minimize the load on our streaming backend and to reduce overall latency. Although users can connect to the frontends directly, we typically point them to live.debconf.org and redirect connections to the nearest server. Sadly, 6 months ago MaxMind decided to change the licence on their GeoLite2 database and left us scrambling. To fix this annoying issue, Stefano Rivera wrote a Python program that uses the new database and reworked our ansible frontend server role.
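
To illustrate the "redirect to the nearest server" idea, here is a minimal Python sketch. It is not Stefano Rivera's program: it assumes the client's coordinates have already been resolved (the real redirector would look them up in the MaxMind database), and the frontend hostnames, their coordinates and the function names are all hypothetical.

```python
import math

# Hypothetical frontends: hostname -> (latitude, longitude) in degrees.
FRONTENDS = {
    "eu.example.org": (50.1, 8.7),
    "us.example.org": (40.7, -74.0),
    "asia.example.org": (1.35, 103.8),
}


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_frontend(client_location, frontends=FRONTENDS):
    """Return the hostname of the frontend closest to the client."""
    return min(frontends,
               key=lambda name: great_circle_km(client_location, frontends[name]))
```

With these made-up locations, a client located in Paris, `nearest_frontend((48.85, 2.35))`, would be sent to `eu.example.org`. Geographic distance is only a proxy for network latency, which is one reason to treat this as a sketch rather than a production policy.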
Since the new database cannot be redistributed freely, you'll have to get a (free) license key from MaxMind if you want to use this role.

Ansible & CI improvements

Infrastructure as code is a living process and needs constant care to fix bugs, follow changes in DSL and to implement new features. All that to say a large part of the sprint was spent making our ansible roles and continuous integration setup more reliable, less buggy and more featureful. All in all, we merged 26 separate ansible-related merge requests during the sprint! As always, if you are good with ansible and wish to help, we accept merge requests on our ansible repository :)

29 May 2020

Gunnar Wolf: Heads up Online MiniDebConf is Online

I know most Debian people know about this already... But in case you don't follow the usual Debian communications channels, this might interest you! Given most of the world is still under COVID-19 restrictions, and that we want to work on Debian, given there is no certainty as to what the future holds in store for us... Our DPL, fearless as they always are, had the bold initiative to make this weekend into the first-ever miniDebConf Online (MDCO)!

So, we are already halfway through DebCamp (which means you can come and hang out with us in the debian.social DebCamp Jitsi lounge, where some impromptu presentations might happen, or not). Starting tomorrow morning (11AM UTC), we will have a quite interesting set of talks. I am reproducing the schedule here:

Saturday 2020.05.30
Time (UTC) | Speaker | Talk
11:00 - 11:10 | MDCO team members | Hello + Welcome
11:30 - 11:50 | Wouter Verhelst | Extrepo
12:00 - 12:45 | JP Mengual | Debian France, trust european organization
13:00 - 13:20 | Arnaud Ferraris | Bringing Debian to mobile phones, one package at a time
13:30 - 15:00 | Lunch Break | A chance for the teams to catch some air
15:00 - 15:45 | JP Mengual | The community team, United Nations Organizations of Debian?
16:00 - 16:45 | Christoph Biedl | Clevis and tang - overcoming the disk unlocking problem
17:00 - 17:45 | Antonio Terceiro | I'm a programmer, how can I help Debian

Sunday 2020.05.31
Time (UTC) | Speaker | Talk
11:00 - 11:45 | Andreas Tille | The effect of Covid-19 on the Debian Med project
12:00 - 12:45 | Paul Gevers | BoF: running autopkgtest for your package
13:00 - 13:20 | Ben Hutchings | debplate: Build many binary packages with templates
13:30 - 15:00 | Lunch break | A chance for the teams to catch some air
15:00 - 15:45 | Holger Levsen | Reproducing bullseye in practice
16:00 - 16:45 | Jonathan Carter | Striving towards excellence
17:00 - 17:45 | Delib* | Organizing Peer-to-Peer Debian Facilitation Training
18:00 - 18:15 | MDCO team members | Closing
* subject to confirmation

Timezone

Remember this is an online event, meant for all of the world! Yes, the chosen times seem quite Europe-centric (but they are mostly a function of the times the talk submitters requested). Talks are 11:00-18:00 UTC, which means 06:00-13:00 Mexico (GMT-5), 20:00-03:00 Japan (GMT+9), 04:00-11:00 Western Canada/USA/Mexico (GMT-7) and the rest of the world, somewhere in between. (No, this was clearly not optimized for our dear usual beer team. Sorry! I guess we need you to be fully awake at beertime!)
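
For anyone in another timezone, the conversion above is easy to redo. The following Python sketch uses fixed UTC offsets, as the post does, and takes the Saturday date from the schedule; the helper name `local_window` is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def local_window(start_utc, end_utc, offset_hours):
    """Format a UTC start/end pair as "HH:MM-HH:MM" at a fixed UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return "-".join(t.astimezone(tz).strftime("%H:%M") for t in (start_utc, end_utc))


# Talk hours on Saturday 2020-05-30, as stated in the post.
start = datetime(2020, 5, 30, 11, 0, tzinfo=timezone.utc)
end = datetime(2020, 5, 30, 18, 0, tzinfo=timezone.utc)

for label, offset in [("Mexico (GMT-5)", -5), ("Japan (GMT+9)", 9), ("GMT-7", -7)]:
    print(label, local_window(start, end, offset))
```

This prints 06:00-13:00, 20:00-03:00 and 04:00-11:00 respectively, matching the figures above; pass your own offset to `local_window` to get your local window.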

[update] Connecting!

Of course, I didn't make it clear at first how to connect to the Online miniDebConf, silly me!
  • The video streams are available at: https://video.debconf.org/
  • Suggested: tune in to the #minidebconf-online IRC channel in OFTC.
That should be it. Hope to see you there! (Stay home, stay safe)
