
29 September 2022

Antoine Beaupré: Detecting manual (and optimizing large) package installs in Puppet

Well this is a mouthful. I recently worked on a neat hack called puppet-package-check. It is designed to warn about manually installed packages, to make sure "everything is in Puppet". But it turns out it can (probably?) dramatically decrease Puppet's bootstrap time when it needs to install a large number of packages.
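Conceptually, the check boils down to comparing the Package resources Puppet knows about with what apt considers manually installed. A minimal sketch of that idea (not the actual tool; the cached catalog path and the use of jq are assumptions that depend on your Puppet packaging) could look like this:
# Sketch only, not the real puppet-package-check: diff the Package resources in
# the cached Puppet catalog against apt's list of manually installed packages.
CATALOG="/var/lib/puppet/client_data/catalog/$(hostname -f).json"  # path depends on your vardir
jq -r '.resources[] | select(.type == "Package") | .title' "$CATALOG" | sort -u >/tmp/puppet-pkgs
apt-mark showmanual | sort -u >/tmp/manual-pkgs
# packages apt thinks were installed by hand that Puppet does not manage:
comm -13 /tmp/puppet-pkgs /tmp/manual-pkgs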

Detecting manual packages
On a cleanly filed workstation, it looks like this:
root@emma:/home/anarcat/bin# ./puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
0 unmanaged packages found
A messy workstation will look like this:
root@curie:/home/anarcat/bin# ./puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
288 unmanaged packages found
apparmor-utils beignet-opencl-icd bridge-utils clustershell cups-pk-helper davfs2 dconf-cli dconf-editor dconf-gsettings-backend ddccontrol ddrescueview debmake debootstrap decopy dict-devil dict-freedict-eng-fra dict-freedict-eng-spa dict-freedict-fra-eng dict-freedict-spa-eng diffoscope dnsdiag dropbear-initramfs ebtables efibootmgr elpa-lua-mode entr eog evince figlet file file-roller fio flac flex font-manager fonts-cantarell fonts-inconsolata fonts-ipafont-gothic fonts-ipafont-mincho fonts-liberation fonts-monoid fonts-monoid-tight fonts-noto fonts-powerline fonts-symbola freeipmi freetype2-demos ftp fwupd-amd64-signed gallery-dl gcc-arm-linux-gnueabihf gcolor3 gcp gdisk gdm3 gdu gedit gedit-plugins gettext-base git-debrebase gnome-boxes gnote gnupg2 golang-any golang-docker-credential-helpers golang-golang-x-tools grub-efi-amd64-signed gsettings-desktop-schemas gsfonts gstreamer1.0-libav gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-ugly gstreamer1.0-pulseaudio gtypist gvfs-backends hackrf hashcat html2text httpie httping hugo humanfriendly iamerican-huge ibus ibus-gtk3 ibus-libpinyin ibus-pinyin im-config imediff img2pdf imv initramfs-tools input-utils installation-birthday internetarchive ipmitool iptables iptraf-ng jackd2 jupyter jupyter-nbextension-jupyter-js-widgets jupyter-qtconsole k3b kbtin kdialog keditbookmarks keepassxc kexec-tools keyboard-configuration kfind konsole krb5-locales kwin-x11 leiningen lightdm lintian linux-image-amd64 linux-perf lmodern lsb-base lvm2 lynx lz4json magic-wormhole mailscripts mailutils manuskript mat2 mate-notification-daemon mate-themes mime-support mktorrent mp3splt mpdris2 msitools mtp-tools mtree-netbsd mupdf nautilus nautilus-sendto ncal nd ndisc6 neomutt net-tools nethogs nghttp2-client nocache npm2deb ntfs-3g ntpdate nvme-cli nwipe obs-studio okular-extra-backends openstack-clients openstack-pkg-tools paprefs pass-extension-audit pcmanfm pdf-presenter-console pdf2svg percol pipenv playerctl plymouth plymouth-themes popularity-contest progress prometheus-node-exporter psensor pubpaste pulseaudio python3-ldap qjackctl qpdfview qrencode r-cran-ggplot2 r-cran-reshape2 rake restic rhash rpl rpm2cpio rs ruby ruby-dev ruby-feedparser ruby-magic ruby-mocha ruby-ronn rygel-playbin rygel-tracker s-tui sanoid saytime scrcpy scrcpy-server screenfetch scrot sdate sddm seahorse shim-signed sigil smartmontools smem smplayer sng sound-juicer sound-theme-freedesktop spectre-meltdown-checker sq ssh-audit sshuttle stress-ng strongswan strongswan-swanctl syncthing system-config-printer system-config-printer-common system-config-printer-udev systemd-bootchart systemd-container tardiff task-desktop task-english task-ssh-server tasksel tellico texinfo texlive-fonts-extra texlive-lang-cyrillic texlive-lang-french texlive-lang-german texlive-lang-italian texlive-xetex tftp-hpa thunar-archive-plugin tidy tikzit tint2 tintin++ tipa tpm2-tools traceroute tree trocla ucf udisks2 unifont unrar-free upower usbguard uuid-runtime vagrant-cachier vagrant-libvirt virt-manager vmtouch vorbis-tools w3m wamerican wamerican-huge wfrench whipper whohas wireshark xapian-tools xclip xdg-user-dirs-gtk xlax xmlto xsensors xserver-xorg xsltproc xxd xz-utils yubioath-desktop zathura zathura-pdf-poppler zenity zfs-dkms zfs-initramfs zfsutils-linux zip zlib1g zlib1g-dev
157 old: apparmor-utils clustershell davfs2 dconf-cli dconf-editor ddccontrol ddrescueview decopy dnsdiag ebtables efibootmgr elpa-lua-mode entr figlet file-roller fio flac flex font-manager freetype2-demos ftp gallery-dl gcc-arm-linux-gnueabihf gcolor3 gcp gdu gedit git-debrebase gnote golang-docker-credential-helpers golang-golang-x-tools gtypist hackrf hashcat html2text httpie httping hugo humanfriendly iamerican-huge ibus ibus-pinyin imediff input-utils internetarchive ipmitool iptraf-ng jackd2 jupyter-qtconsole k3b kbtin kdialog keditbookmarks keepassxc kexec-tools kfind konsole leiningen lightdm lynx lz4json magic-wormhole manuskript mat2 mate-notification-daemon mktorrent mp3splt msitools mtp-tools mtree-netbsd nautilus nautilus-sendto nd ndisc6 neomutt net-tools nethogs nghttp2-client nocache ntpdate nwipe obs-studio openstack-pkg-tools paprefs pass-extension-audit pcmanfm pdf-presenter-console pdf2svg percol pipenv playerctl qjackctl qpdfview qrencode r-cran-ggplot2 r-cran-reshape2 rake restic rhash rpl rpm2cpio rs ruby-feedparser ruby-magic ruby-mocha ruby-ronn s-tui saytime scrcpy screenfetch scrot sdate seahorse shim-signed sigil smem smplayer sng sound-juicer spectre-meltdown-checker sq ssh-audit sshuttle stress-ng system-config-printer system-config-printer-common tardiff tasksel tellico texlive-lang-cyrillic texlive-lang-french tftp-hpa tikzit tint2 tintin++ tpm2-tools traceroute tree unrar-free vagrant-cachier vagrant-libvirt vmtouch vorbis-tools w3m wamerican wamerican-huge wfrench whipper whohas xdg-user-dirs-gtk xlax xmlto xsensors xxd yubioath-desktop zenity zip
131 new: beignet-opencl-icd bridge-utils cups-pk-helper dconf-gsettings-backend debmake debootstrap dict-devil dict-freedict-eng-fra dict-freedict-eng-spa dict-freedict-fra-eng dict-freedict-spa-eng diffoscope dropbear-initramfs eog evince file fonts-cantarell fonts-inconsolata fonts-ipafont-gothic fonts-ipafont-mincho fonts-liberation fonts-monoid fonts-monoid-tight fonts-noto fonts-powerline fonts-symbola freeipmi fwupd-amd64-signed gdisk gdm3 gedit-plugins gettext-base gnome-boxes gnupg2 golang-any grub-efi-amd64-signed gsettings-desktop-schemas gsfonts gstreamer1.0-libav gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-ugly gstreamer1.0-pulseaudio gvfs-backends ibus-gtk3 ibus-libpinyin im-config img2pdf imv initramfs-tools installation-birthday iptables jupyter jupyter-nbextension-jupyter-js-widgets keyboard-configuration krb5-locales kwin-x11 lintian linux-image-amd64 linux-perf lmodern lsb-base lvm2 mailscripts mailutils mate-themes mime-support mpdris2 mupdf ncal npm2deb ntfs-3g nvme-cli okular-extra-backends openstack-clients plymouth plymouth-themes popularity-contest progress prometheus-node-exporter psensor pubpaste pulseaudio python3-ldap ruby ruby-dev rygel-playbin rygel-tracker sanoid scrcpy-server sddm smartmontools sound-theme-freedesktop strongswan strongswan-swanctl syncthing system-config-printer-udev systemd-bootchart systemd-container task-desktop task-english task-ssh-server texinfo texlive-fonts-extra texlive-lang-german texlive-lang-italian texlive-xetex thunar-archive-plugin tidy tipa trocla ucf udisks2 unifont upower usbguard uuid-runtime virt-manager wireshark xapian-tools xclip xserver-xorg xsltproc xz-utils zathura zathura-pdf-poppler zfs-dkms zfs-initramfs zfsutils-linux zlib1g zlib1g-dev
Yuck! That's a lot of shit to go through. Notice how the packages get sorted between "old" and "new" packages. This is because popcon is used as a tool to mark which packages are "old". If you have unmanaged packages, the "old" ones are likely things that you can uninstall, for example. If you don't have popcon installed, you'll also get this warning:
popcon stats not available: [Errno 2] No such file or directory: '/var/log/popularity-contest'
The error can otherwise be safely ignored, but you won't get "help" prioritizing the packages to add to your manifests. Note that the tool ignores packages that were "marked" (see apt-mark(8)) as automatically installed. This implies that you might have to do a little bit of cleanup the first time you run this, as Debian doesn't necessarily mark all of those packages correctly on first install. For example, here's what it looks like on a clean install, after Puppet ran:
root@angela:/home/anarcat# ./bin/puppet-package-check -v
listing puppet packages...
listing apt packages...
loading apt cache...
127 unmanaged packages found
ca-certificates console-setup cryptsetup-initramfs dbus file gcc-12-base gettext-base grub-common grub-efi-amd64 i3lock initramfs-tools iw keyboard-configuration krb5-locales laptop-detect libacl1 libapparmor1 libapt-pkg6.0 libargon2-1 libattr1 libaudit-common libaudit1 libblkid1 libbpf0 libbsd0 libbz2-1.0 libc6 libcap-ng0 libcap2 libcap2-bin libcom-err2 libcrypt1 libcryptsetup12 libdb5.3 libdebconfclient0 libdevmapper1.02.1 libedit2 libelf1 libext2fs2 libfdisk1 libffi8 libgcc-s1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libgssapi-krb5-2 libhogweed6 libidn2-0 libip4tc2 libiw30 libjansson4 libjson-c5 libk5crypto3 libkeyutils1 libkmod2 libkrb5-3 libkrb5support0 liblocale-gettext-perl liblockfile-bin liblz4-1 liblzma5 libmd0 libmnl0 libmount1 libncurses6 libncursesw6 libnettle8 libnewt0.52 libnftables1 libnftnl11 libnl-3-200 libnl-genl-3-200 libnl-route-3-200 libnss-systemd libp11-kit0 libpam-systemd libpam0g libpcre2-8-0 libpcre3 libpcsclite1 libpopt0 libprocps8 libreadline8 libselinux1 libsemanage-common libsemanage2 libsepol2 libslang2 libsmartcols1 libss2 libssl1.1 libssl3 libstdc++6 libsystemd-shared libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl libtext-wrapi18n-perl libtinfo6 libtirpc-common libtirpc3 libudev1 libunistring2 libuuid1 libxtables12 libxxhash0 libzstd1 linux-image-amd64 logsave lsb-base lvm2 media-types mlocate ncurses-term pass-extension-otp puppet python3-reportbug shim-signed tasksel ucf usr-is-merged util-linux-extra wpasupplicant xorg zlib1g
popcon stats not available: [Errno 2] No such file or directory: '/var/log/popularity-contest'
Normally, there should be no unmanaged packages here. But because of the way Debian is installed, a lot of libraries and some core packages are marked as manually installed, and are of course not managed through Puppet. There are two solutions to this problem:
  • really manage everything in Puppet (argh)
  • mark packages as automatically installed
I typically choose the second path and mark a ton of stuff as automatic. Then they either get auto-removed or simply stop being listed. In the above scenario, one could mark all libraries as automatically installed with:
apt-mark auto $(./bin/puppet-package-check | grep -o 'lib[^ ]*')
... but if you trust that most of that stuff is actually garbage that you don't really want installed anyways, you could just mark it all as automatically installed:
apt-mark auto $(./bin/puppet-package-check)
In my case, that ended up keeping basically all libraries (because of course they're installed for some reason) and auto-removing this:
dh-dkms discover-data dkms libdiscover2 libjsoncpp25 libssl1.1 linux-headers-amd64 mlocate pass-extension-otp pass-otp plocate x11-apps x11-session-utils xinit xorg
You'll notice xorg in there: yep, that's bad. Not what I wanted. But for some reason, on other workstations, I did not actually have xorg installed. Turns out having xserver-xorg is enough, and that one has dependencies. So now I guess I just learned to stop worrying and live without X(org).

Optimizing large package installs
But that, of course, is not all. Why make things simple when you can have an unreadable title that is trying to be both syntactically correct and click-baity enough to flatter my vain ego? Right. One of the challenges in bootstrapping Puppet with large package lists is that it's slow. Puppet lists packages as individual resources and will basically run apt install $PKG on every package in the manifest, one at a time. While the overhead of apt is generally small, when you add things like apt-listbugs, apt-listchanges, needrestart, triggers and so on, it can take forever to set up a new host. So for initial installs, it can actually make sense to skip the queue and just install everything in one big batch. And because the above tool inspects the packages installed by Puppet, you can run it against a catalog and get a full list of all the packages Puppet would install, even before Puppet is running. So when reinstalling my laptop, I basically did this:
apt install puppet-agent/experimental
puppet agent --test --noop
apt install $(./puppet-package-check --debug \
    2>&1 | grep ^puppet\ packages |
      sed 's/puppet packages://;s/ /\n/g' |
      grep -v -e onionshare -e golint -e git-sizer -e github-backup -e hledger -e xsane -e audacity -e chirp -e elpa-flycheck -e elpa-lsp-ui -e yubikey-manager -e git-annex -e hopenpgp-tools -e puppet
) puppet-agent/experimental
That massive grep was because there are currently a lot of packages missing from bookworm. Those are all packages that I have in my catalog but that still haven't made it to bookworm. Sad, I know. I eventually worked around that by adding bullseye sources so that the Puppet manifest actually ran. The point here is that this improves the Puppet run time a lot. All packages get installed at once, and you get a nice progress bar. Then you actually run Puppet to deploy configurations and all the other goodies:
puppet agent --test
I wish I could tell you how much faster that ran. I don't know, and I will not go through a full reinstall just to please your curiosity. The only hard number I have is that it installed 444 packages (which exploded into 10,191 packages with dependencies) in a mere 10 minutes. That might also be with the packages already downloaded. In any case, I have that gut feeling it's faster, so you'll have to just trust my gut. It is, after all, much more important than you might think.

25 September 2022

Sergio Talens-Oliag: Kubernetes Static Content Server

This post describes how I've put together a simple static content server for kubernetes clusters using a Pod with a persistent volume and multiple containers: an sftp server to manage contents, a web server to publish them with optional access control and another one to run scripts which need access to the volume filesystem. The sftp server runs using MySecureShell, the web server is nginx and the script runner uses the webhook tool to publish endpoints to call them (the calls will come from other Pods that run backend servers or are executed from Jobs or CronJobs).

History
The system was developed because we had a NodeJS API with endpoints to upload files and store them on S3 compatible services that were later accessed via HTTPS, but the requirements changed and we needed to be able to publish folders instead of individual files using their original names and apply access restrictions using our API. Thinking about our requirements, the use of a regular filesystem to keep the files and folders was a good option, as uploading and serving files is simple. For the upload I decided to use the sftp protocol, mainly because I already had an sftp container image based on mysecureshell prepared; once we settled on that we added sftp support to the API server and configured it to upload the files to our server instead of using S3 buckets. To publish the files we added an nginx container configured to work as a reverse proxy that uses the ngx_http_auth_request_module to validate access to the files (the sub request is configurable; in our deployment we have configured it to call our API to check if the user can access a given URL). Finally we added a third container when we needed to execute some tasks directly on the filesystem (using kubectl exec with the existing containers did not seem a good idea, as that is not supported by CronJob objects, for example). The solution we found to avoid the NIH Syndrome (i.e. writing our own tool) was to use the webhook tool to provide the endpoints to call the scripts; for now we have three (example calls are sketched after the list):
  • one to get the disk usage of a PATH,
  • one to hardlink all the files that are identical on the filesystem,
  • one to copy files and folders from S3 buckets to our filesystem.
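For reference, these endpoints end up being plain HTTP calls from other Pods; assuming the Service name and port defined later in this post (scs-svc on port 9000), the calls look roughly like this:
$ curl -s "http://scs-svc:9000/hooks/du?path=test"
$ curl -s "http://scs-svc:9000/hooks/hardlink"
$ curl -s -H "Content-Type: application/json" --data @s3sync.json \
  "http://scs-svc:9000/hooks/s3sync"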

Container definitions

mysecureshell
The mysecureshell container can be used to provide an sftp service with multiple users (although the files are owned by the same UID and GID) using standalone containers (launched with docker or podman) or in an orchestration system like kubernetes, as we are going to do here. The image is generated using the following Dockerfile:
ARG ALPINE_VERSION=3.16.2
FROM alpine:$ALPINE_VERSION as builder
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
RUN apk update &&\
 apk add --no-cache alpine-sdk git musl-dev &&\
 git clone https://github.com/sto/mysecureshell.git &&\
 cd mysecureshell &&\
 ./configure --prefix=/usr --sysconfdir=/etc --mandir=/usr/share/man\
 --localstatedir=/var --with-shutfile=/var/lib/misc/sftp.shut --with-debug=2 &&\
 make all && make install &&\
 rm -rf /var/cache/apk/*
FROM alpine:$ALPINE_VERSION
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
COPY --from=builder /usr/bin/mysecureshell /usr/bin/mysecureshell
COPY --from=builder /usr/bin/sftp-* /usr/bin/
RUN apk update &&\
 apk add --no-cache openssh shadow pwgen &&\
 sed -i -e "s|^.*\(AuthorizedKeysFile\).*$|\1 /etc/ssh/auth_keys/%u|"\
 /etc/ssh/sshd_config &&\
 mkdir /etc/ssh/auth_keys &&\
 cat /dev/null > /etc/motd &&\
 add-shell '/usr/bin/mysecureshell' &&\
 rm -rf /var/cache/apk/*
COPY bin/* /usr/local/bin/
COPY etc/sftp_config /etc/ssh/
COPY entrypoint.sh /
EXPOSE 22
VOLUME /sftp
ENTRYPOINT ["/entrypoint.sh"]
CMD ["server"]
The /etc/sftp_config file is used to configure the mysecureshell server to have all the user homes under /sftp/data, to only allow users to see the files under their home directories as if they were at the root of the server, and to close idle connections after 5m of inactivity:
etc/sftp_config
# Default mysecureshell configuration
<Default>
   # All users will have access to their home directory under /sftp/data
   Home /sftp/data/$USER
   # Log to a file inside /sftp/logs/ (only works when the directory exists)
   LogFile /sftp/logs/mysecureshell.log
   # Force users to stay in their home directory
   StayAtHome true
   # Hide Home PATH, it will be shown as /
   VirtualChroot true
   # Hide real file/directory owner (just change displayed permissions)
   DirFakeUser true
   # Hide real file/directory group (just change displayed permissions)
   DirFakeGroup true
   # We do not want users to keep forever their idle connection
   IdleTimeOut 5m
</Default>
# vim: ts=2:sw=2:et
The entrypoint.sh script is responsible for preparing the container for the users included in the /secrets/user_pass.txt file (it creates the users with their HOME directories under /sftp/data and a /bin/false shell, and creates the key files from /secrets/user_keys.txt if available). The script expects a couple of environment variables:
  • SFTP_UID: UID used to run the daemon and for all the files, it has to be different than 0 (all the files managed by this daemon are going to be owned by the same user and group, even if the remote users are different).
  • SFTP_GID: GID used to run the daemon and for all the files, it has to be different than 0.
And can use the SSH_PORT and SSH_PARAMS values if present. It also requires the following files (they can be mounted as secrets in kubernetes):
  • /secrets/host_keys.txt: Text file containing the ssh server keys in mime format; the file is processed using the reformime utility (the one included on busybox) and can be generated using the gen-host-keys script included on the container (it uses ssh-keygen and makemime).
  • /secrets/user_pass.txt: Text file containing lines of the form username:password_in_clear_text (only the users included on this file are available on the sftp server, in fact in our deployment we use only the scs user for everything).
And optionally can use another one:
  • /secrets/user_keys.txt: Text file that contains lines of the form username:public_ssh_ed25519_or_rsa_key; the public keys are installed on the server and can be used to log into the sftp server if the username exists on the user_pass.txt file.
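Before looking at the entrypoint itself, note that the image can be tested standalone with docker by mounting the secrets and the data volume by hand (a sketch; the host paths and the published port are arbitrary):
$ docker run --rm -p 2022:22 \
  -e SFTP_UID=2020 -e SFTP_GID=2020 \
  -v "$PWD/secrets:/secrets:ro" \
  -v "$PWD/data:/sftp" \
  stodh/mysecureshell:latest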
The contents of the entrypoint.sh script are:
entrypoint.sh
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
# Expects SFTP_UID & SFTP_GID on the environment and uses the value of the
# SSH_PORT & SSH_PARAMS variables if present
# SSH_PARAMS
SSH_PARAMS="-D -e -p ${SSH_PORT:=22} ${SSH_PARAMS}"
# Fixed values
# DIRECTORIES
HOME_DIR="/sftp/data"
CONF_FILES_DIR="/secrets"
AUTH_KEYS_PATH="/etc/ssh/auth_keys"
# FILES
HOST_KEYS="$CONF_FILES_DIR/host_keys.txt"
USER_KEYS="$CONF_FILES_DIR/user_keys.txt"
USER_PASS="$CONF_FILES_DIR/user_pass.txt"
USER_SHELL_CMD="/usr/bin/mysecureshell"
# TYPES
HOST_KEY_TYPES="dsa ecdsa ed25519 rsa"
# ---------
# FUNCTIONS
# ---------
# Validate HOST_KEYS, USER_PASS, SFTP_UID and SFTP_GID
_check_environment() {
  # Check the ssh server keys ... we don't boot if we don't have them
  if [ ! -f "$HOST_KEYS" ]; then
    cat <<EOF
We need the host keys on the '$HOST_KEYS' file to proceed.
Call the 'gen-host-keys' script to create and export them on a mime file.
EOF
    exit 1
  fi
  # Check that we have users ... if we don't we can't continue
  if [ ! -f "$USER_PASS" ]; then
    cat <<EOF
We need at least the '$USER_PASS' file to provision users.
Call the 'gen-users-tar' script to create a tar file to create an archive that
contains public and private keys for users, a 'user_keys.txt' with the public
keys of the users and a 'user_pass.txt' file with random passwords for them 
(pass the list of usernames to it).
EOF
    exit 1
  fi
  # Check SFTP_UID
  if [ -z "$SFTP_UID" ]; then
    echo "The 'SFTP_UID' can't be empty, pass a 'UID'."
    exit 1
  fi
  if [ "$SFTP_UID" -eq "0" ]; then
    echo "The 'SFTP_UID' can't be 0, use a different 'UID'"
    exit 1
  fi
  # Check SFTP_GID
  if [ -z "$SFTP_GID" ]; then
    echo "The 'SFTP_GID' can't be empty, pass a 'GID'."
    exit 1
  fi
  if [ "$SFTP_GID" -eq "0" ]; then
    echo "The 'SFTP_GID' can't be 0, use a different 'GID'"
    exit 1
  fi
}
# Adjust ssh host keys
_setup_host_keys() {
  opwd="$(pwd)"
  tmpdir="$(mktemp -d)"
  cd "$tmpdir"
  ret="0"
  reformime <"$HOST_KEYS" || ret="1"
  for kt in $HOST_KEY_TYPES; do
    key="ssh_host_${kt}_key"
    pub="ssh_host_${kt}_key.pub"
    if [ ! -f "$key" ]; then
      echo "Missing '$key' file"
      ret="1"
    fi
    if [ ! -f "$pub" ]; then
      echo "Missing '$pub' file"
      ret="1"
    fi
    if [ "$ret" -ne "0" ]; then
      continue
    fi
    cat "$key" >"/etc/ssh/$key"
    chmod 0600 "/etc/ssh/$key"
    chown root:root "/etc/ssh/$key"
    cat "$pub" >"/etc/ssh/$pub"
    chmod 0600 "/etc/ssh/$pub"
    chown root:root "/etc/ssh/$pub"
  done
  cd "$opwd"
  rm -rf "$tmpdir"
  return "$ret"
}
# Create users
_setup_user_pass() {
  opwd="$(pwd)"
  tmpdir="$(mktemp -d)"
  cd "$tmpdir"
  ret="0"
  [ -d "$HOME_DIR" ] || mkdir "$HOME_DIR"
  # Make sure the data dir can be managed by the sftp user
  chown "$SFTP_UID:$SFTP_GID" "$HOME_DIR"
  # Allow the user (and root) to create directories inside the $HOME_DIR, if
  # we don't allow it the directory creation fails on EFS (AWS)
  chmod 0755 "$HOME_DIR"
  # Create users
  echo "sftp:sftp:$SFTP_UID:$SFTP_GID:::/bin/false" >"newusers.txt"
  sed -n "/^[^#]/ { s/:/ /p }" "$USER_PASS" | while read -r _u _p; do
    echo "$_u:$_p:$SFTP_UID:$SFTP_GID::$HOME_DIR/$_u:$USER_SHELL_CMD"
  done >>"newusers.txt"
  newusers --badnames newusers.txt
  # Disable write permission on the directory to forbid remote sftp users to
  # remove their own root dir (they have already done it); we adjust that
  # here to avoid issues with EFS (see before)
  chmod 0555 "$HOME_DIR"
  # Clean up the tmpdir
  cd "$opwd"
  rm -rf "$tmpdir"
  return "$ret"
}
# Adjust user keys
_setup_user_keys() {
  if [ -f "$USER_KEYS" ]; then
    sed -n "/^[^#]/ { s/:/ /p }" "$USER_KEYS" | while read -r _u _k; do
      echo "$_k" >>"$AUTH_KEYS_PATH/$_u"
    done
  fi
}
# Main function
exec_sshd() {
  _check_environment
  _setup_host_keys
  _setup_user_pass
  _setup_user_keys
  echo "Running: /usr/sbin/sshd $SSH_PARAMS"
  # shellcheck disable=SC2086
  exec /usr/sbin/sshd -D $SSH_PARAMS
}
# ----
# MAIN
# ----
case "$1" in
"server") exec_sshd ;;
*) exec "$@" ;;
esac
# vim: ts=2:sw=2:et
The container also includes a couple of auxiliary scripts; the first one can be used to generate the host_keys.txt file as follows:
$ docker run --rm stodh/mysecureshell gen-host-keys > host_keys.txt
Where the script is as simple as:
bin/gen-host-keys
#!/bin/sh
set -e
# Generate new host keys
ssh-keygen -A >/dev/null
# Replace hostname
sed -i -e 's/@.*$/@mysecureshell/' /etc/ssh/ssh_host_*_key.pub
# Print in mime format (stdout)
makemime /etc/ssh/ssh_host_*
# vim: ts=2:sw=2:et
And there is another script to generate a .tar file that contains auth data for the list of usernames passed to it (the file contains a user_pass.txt file with random passwords for the users, public and private ssh keys for them and the user_keys.txt file that matches the generated keys). To generate a tar file for the user scs we can execute the following:
$ docker run --rm stodh/mysecureshell gen-users-tar scs > /tmp/scs-users.tar
To see the contents and the text inside the user_pass.txt file we can do:
$ tar tvf /tmp/scs-users.tar
-rw-r--r-- root/root        21 2022-09-11 15:55 user_pass.txt
-rw-r--r-- root/root       822 2022-09-11 15:55 user_keys.txt
-rw------- root/root       387 2022-09-11 15:55 id_ed25519-scs
-rw-r--r-- root/root        85 2022-09-11 15:55 id_ed25519-scs.pub
-rw------- root/root      3357 2022-09-11 15:55 id_rsa-scs
-rw------- root/root      3243 2022-09-11 15:55 id_rsa-scs.pem
-rw-r--r-- root/root       729 2022-09-11 15:55 id_rsa-scs.pub
$ tar xfO /tmp/scs-users.tar user_pass.txt
scs:20JertRSX2Eaar4x
The source of the script is:
bin/gen-users-tar
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
USER_KEYS_FILE="user_keys.txt"
USER_PASS_FILE="user_pass.txt"
# ---------
# MAIN CODE
# ---------
# Generate user passwords and keys, return 1 if no username is received
if [ "$#" -eq "0" ]; then
  return 1
fi
opwd="$(pwd)"
tmpdir="$(mktemp -d)"
cd "$tmpdir"
for u in "$@"; do
  ssh-keygen -q -a 100 -t ed25519 -f "id_ed25519-$u" -C "$u" -N ""
  ssh-keygen -q -a 100 -b 4096 -t rsa -f "id_rsa-$u" -C "$u" -N ""
  # Legacy RSA private key format
  cp -a "id_rsa-$u" "id_rsa-$u.pem"
  ssh-keygen -q -p -m pem -f "id_rsa-$u.pem" -N "" -P "" >/dev/null
  chmod 0600 "id_rsa-$u.pem"
  echo "$u:$(pwgen -s 16 1)" >>"$USER_PASS_FILE"
  echo "$u:$(cat "id_ed25519-$u.pub")" >>"$USER_KEYS_FILE"
  echo "$u:$(cat "id_rsa-$u.pub")" >>"$USER_KEYS_FILE"
done
tar cf - "$USER_PASS_FILE" "$USER_KEYS_FILE" id_* 2>/dev/null
cd "$opwd"
rm -rf "$tmpdir"
# vim: ts=2:sw=2:et

nginx-scs
The nginx-scs container is generated using the following Dockerfile:
ARG NGINX_VERSION=1.23.1
FROM nginx:$NGINX_VERSION
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
RUN rm -f /docker-entrypoint.d/*
COPY docker-entrypoint.d/* /docker-entrypoint.d/
Basically we are removing the existing docker-entrypoint.d scripts from the standard image and adding a new one that configures the web server as we want using a couple of environment variables:
  • AUTH_REQUEST_URI: URL to use for the auth_request; if the variable is not found in the environment, auth_request is not used.
  • HTML_ROOT: Base directory of the web server; if not passed, the default /usr/share/nginx/html is used.
Note that if we don't pass the variables, everything works as if we were using the original nginx image. The contents of the configuration script are:
docker-entrypoint.d/10-update-default-conf.sh
#!/bin/sh
# Replace the default.conf nginx file by our own version.
set -e
if [ -z "$HTML_ROOT" ]; then
  HTML_ROOT="/usr/share/nginx/html"
fi
if [ "$AUTH_REQUEST_URI" ]; then
  cat >/etc/nginx/conf.d/default.conf <<EOF
server {
  listen       80;
  server_name  localhost;
  location / {
    auth_request /.auth;
    root  $HTML_ROOT;
    index index.html index.htm;
  }
  location /.auth {
    internal;
    proxy_pass $AUTH_REQUEST_URI;
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
    proxy_set_header X-Original-URI \$request_uri;
  }
  error_page   500 502 503 504  /50x.html;
  location = /50x.html {
    root /usr/share/nginx/html;
  }
}
EOF
else
  cat >/etc/nginx/conf.d/default.conf <<EOF
server {
  listen       80;
  server_name  localhost;
  location / {
    root  $HTML_ROOT;
    index index.html index.htm;
  }
  error_page   500 502 503 504  /50x.html;
  location = /50x.html {
    root /usr/share/nginx/html;
  }
}
EOF
fi
# vim: ts=2:sw=2:et
As we will see later the idea is to use the /sftp/data or /sftp/data/scs folder as the root of the web published by this container and create an Ingress object to provide access to it outside of our kubernetes cluster.
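To try the image outside the cluster, it can be run locally with docker, pointing HTML_ROOT at a mounted directory and skipping the auth_request (a sketch; the host path and port are arbitrary):
$ docker run --rm -p 8080:80 \
  -e HTML_ROOT=/html \
  -v "$PWD/public:/html:ro" \
  stodh/nginx-scs:latest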

webhook-scs
The webhook-scs container is generated using the following Dockerfile:
ARG ALPINE_VERSION=3.16.2
ARG GOLANG_VERSION=alpine3.16
FROM golang:$GOLANG_VERSION AS builder
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
ENV WEBHOOK_VERSION 2.8.0
ENV WEBHOOK_PR 549
ENV S3FS_VERSION v1.91
WORKDIR /go/src/github.com/adnanh/webhook
RUN apk update &&\
 apk add --no-cache -t build-deps curl libc-dev gcc libgcc patch
RUN curl -L --silent -o webhook.tar.gz\
 https://github.com/adnanh/webhook/archive/${WEBHOOK_VERSION}.tar.gz &&\
 tar xzf webhook.tar.gz --strip 1 &&\
 curl -L --silent -o ${WEBHOOK_PR}.patch\
 https://patch-diff.githubusercontent.com/raw/adnanh/webhook/pull/${WEBHOOK_PR}.patch &&\
 patch -p1 < ${WEBHOOK_PR}.patch &&\
 go get -d && \
 go build -o /usr/local/bin/webhook
WORKDIR /src/s3fs-fuse
RUN apk update &&\
 apk add ca-certificates build-base alpine-sdk libcurl automake autoconf\
 libxml2-dev libressl-dev mailcap fuse-dev curl-dev
RUN curl -L --silent -o s3fs.tar.gz\
 https://github.com/s3fs-fuse/s3fs-fuse/archive/refs/tags/$S3FS_VERSION.tar.gz &&\
 tar xzf s3fs.tar.gz --strip 1 &&\
 ./autogen.sh &&\
 ./configure --prefix=/usr/local &&\
 make -j && \
 make install
FROM alpine:$ALPINE_VERSION
LABEL maintainer="Sergio Talens-Oliag <sto@mixinet.net>"
WORKDIR /webhook
RUN apk update &&\
 apk add --no-cache ca-certificates mailcap fuse libxml2 libcurl libgcc\
 libstdc++ rsync util-linux-misc &&\
 rm -rf /var/cache/apk/*
COPY --from=builder /usr/local/bin/webhook /usr/local/bin/webhook
COPY --from=builder /usr/local/bin/s3fs /usr/local/bin/s3fs
COPY entrypoint.sh /
COPY hooks/* ./hooks/
EXPOSE 9000
ENTRYPOINT ["/entrypoint.sh"]
CMD ["server"]
Again, we use a multi-stage build because in production we wanted to support functionality that is not yet in the official releases (streaming the command output as a response instead of waiting until the execution ends); this time we build the image applying the patch included in this pull request against a released version of the source instead of creating a fork. The entrypoint.sh script is used to generate the webhook configuration file for the existing hooks using environment variables (basically the WEBHOOK_WORKDIR and the *_TOKEN variables) and launch the webhook service:
entrypoint.sh
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
WEBHOOK_BIN="${WEBHOOK_BIN:-/webhook/hooks}"
WEBHOOK_YML="${WEBHOOK_YML:-/webhook/scs.yml}"
WEBHOOK_OPTS="${WEBHOOK_OPTS:--verbose}"
# ---------
# FUNCTIONS
# ---------
print_du_yml() {
  cat <<EOF
- id: du
  execute-command: '$WEBHOOK_BIN/du.sh'
  command-working-directory: '$WORKDIR'
  response-headers:
  - name: 'Content-Type'
    value: 'application/json'
  http-methods: ['GET']
  include-command-output-in-response: true
  include-command-output-in-response-on-error: true
  pass-arguments-to-command:
  - source: 'url'
    name: 'path'
  pass-environment-to-command:
  - source: 'string'
    envname: 'OUTPUT_FORMAT'
    name: 'json'
EOF
}
print_hardlink_yml() {
  cat <<EOF
- id: hardlink
  execute-command: '$WEBHOOK_BIN/hardlink.sh'
  command-working-directory: '$WORKDIR'
  http-methods: ['GET']
  include-command-output-in-response: true
  include-command-output-in-response-on-error: true
EOF
}
print_s3sync_yml() {
  cat <<EOF
- id: s3sync
  execute-command: '$WEBHOOK_BIN/s3sync.sh'
  command-working-directory: '$WORKDIR'
  http-methods: ['POST']
  include-command-output-in-response: true
  include-command-output-in-response-on-error: true
  pass-environment-to-command:
  - source: 'payload'
    envname: 'AWS_KEY'
    name: 'aws.key'
  - source: 'payload'
    envname: 'AWS_SECRET_KEY'
    name: 'aws.secret_key'
  - source: 'payload'
    envname: 'S3_BUCKET'
    name: 's3.bucket'
  - source: 'payload'
    envname: 'S3_REGION'
    name: 's3.region'
  - source: 'payload'
    envname: 'S3_PATH'
    name: 's3.path'
  - source: 'payload'
    envname: 'SCS_PATH'
    name: 'scs.path'
  stream-command-output: true
EOF
}
print_token_yml() {
  if [ "$1" ]; then
    cat << EOF
  trigger-rule:
    match:
      type: 'value'
      value: '$1'
      parameter:
        source: 'header'
        name: 'X-Webhook-Token'
EOF
  fi
}
exec_webhook() {
  # Validate WORKDIR
  if [ -z "$WEBHOOK_WORKDIR" ]; then
    echo "Must define the WEBHOOK_WORKDIR variable!" >&2
    exit 1
  fi
  WORKDIR="$(realpath "$WEBHOOK_WORKDIR" 2>/dev/null)" || true
  if [ ! -d "$WORKDIR" ]; then
    echo "The WEBHOOK_WORKDIR '$WEBHOOK_WORKDIR' is not a directory!" >&2
    exit 1
  fi
  # Get TOKENS: if the DU_TOKEN or HARDLINK_TOKEN is defined that is used, if
  # not the COMMON_TOKEN is used, and in any other case no token is checked
  # (that is the default)
  DU_TOKEN="${DU_TOKEN:-$COMMON_TOKEN}"
  HARDLINK_TOKEN="${HARDLINK_TOKEN:-$COMMON_TOKEN}"
  S3_TOKEN="${S3_TOKEN:-$COMMON_TOKEN}"
  # Create webhook configuration
  {
    print_du_yml
    print_token_yml "$DU_TOKEN"
    echo ""
    print_hardlink_yml
    print_token_yml "$HARDLINK_TOKEN"
    echo ""
    print_s3sync_yml
    print_token_yml "$S3_TOKEN"
  } >"$WEBHOOK_YML"
  # Run the webhook command
  # shellcheck disable=SC2086
  exec webhook -hooks "$WEBHOOK_YML" $WEBHOOK_OPTS
}
# ----
# MAIN
# ----
case "$1" in
"server") exec_webhook ;;
*) exec "$@" ;;
esac
The entrypoint.sh script generates the configuration file for the webhook server by calling functions that print a yaml section for each hook and optionally adds rules to validate access to them, comparing the value of an X-Webhook-Token header against predefined values. The expected token values are taken from environment variables: we can define a token variable for each hook (DU_TOKEN, HARDLINK_TOKEN or S3_TOKEN) and a fallback value (COMMON_TOKEN); if no token variable is defined for a hook, no check is done and everybody can call it (an example call with the header is sketched after the list). The Hook Definition documentation explains the options you can use for each hook; the ones we have right now do the following:
  • du: runs on the $WORKDIR directory, passes as first argument to the script the value of the path query parameter and sets the variable OUTPUT_FORMAT to the fixed value json (we use that to print the output of the script in JSON format instead of text).
  • hardlink: runs on the $WORKDIR directory and takes no parameters.
  • s3sync: runs on the $WORKDIR directory and sets a lot of environment variables from values read from the JSON encoded payload sent by the caller (all the values must be sent by the caller even if they are assigned an empty value, if they are missing the hook fails without calling the script); we also set the stream-command-output value to true to make the script show its output as it is working (we patched the webhook source to be able to use this option).
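When a token is configured for a hook, the caller just adds the matching header to the request; for example (the token value here is obviously made up):
$ curl -s -H "X-Webhook-Token: my-secret-token" \
  "http://scs-svc:9000/hooks/du?path=test"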

The du hook script
The du hook script code checks if the argument passed is a directory, computes its size using the du command and prints the results in text format or as a JSON dictionary:
hooks/du.sh
#!/bin/sh
set -e
# Script to print disk usage for a PATH inside the scs folder
# ---------
# FUNCTIONS
# ---------
print_error() {
  if [ "$OUTPUT_FORMAT" = "json" ]; then
    echo "{ \"error\":\"$*\" }"
  else
    echo "$*" >&2
  fi
  exit 1
}
usage() {
  if [ "$OUTPUT_FORMAT" = "json" ]; then
    echo "{ \"error\":\"Pass arguments as '?path=XXX'\" }"
  else
    echo "Usage: $(basename "$0") PATH" >&2
  fi
  exit 1
}
# ----
# MAIN
# ----
if [ "$#" -eq "0" ] || [ -z "$1" ]; then
  usage
fi
if [ "$1" = "." ]; then
  DU_PATH="./"
else
  DU_PATH="$(find . -name "$1" -mindepth 1 -maxdepth 1)" || true
fi
if [ -z "$DU_PATH" ] || [ ! -d "$DU_PATH/." ]; then
  print_error "The provided PATH ('$1') is not a directory"
fi
# Print disk usage in bytes for the given PATH
OUTPUT="$(du -b -s "$DU_PATH")"
if [ "$OUTPUT_FORMAT" = "json" ]; then
  # Format output as { "path":"PATH","bytes":"BYTES" }
  echo "$OUTPUT" |
    sed -e "s%^\(.*\)\t.*/\(.*\)$%{ \"path\":\"\2\",\"bytes\":\"\1\" }%" |
    tr -d '\n'
else
  # Print du output as is
  echo "$OUTPUT"
fi
# vim: ts=2:sw=2:et:ai:sts=2

The s3sync hook script
The s3sync hook script uses the s3fs tool to mount a bucket and synchronise data between a folder inside the bucket and a directory on the filesystem using rsync; all values needed to execute the task are taken from environment variables:
hooks/s3sync.sh
#!/bin/ash
set -euo pipefail
set -o errexit
set -o errtrace
# Functions
finish() {
  ret="$1"
  echo ""
  echo "Script exit code: $ret"
  exit "$ret"
}
# Check variables
if [ -z "$AWS_KEY" ] || [ -z "$AWS_SECRET_KEY" ] || [ -z "$S3_BUCKET" ] ||
  [ -z "$S3_PATH" ] || [ -z "$SCS_PATH" ]; then
  [ "$AWS_KEY" ] || echo "Set the AWS_KEY environment variable"
  [ "$AWS_SECRET_KEY" ] || echo "Set the AWS_SECRET_KEY environment variable"
  [ "$S3_BUCKET" ] || echo "Set the S3_BUCKET environment variable"
  [ "$S3_PATH" ] || echo "Set the S3_PATH environment variable"
  [ "$SCS_PATH" ] || echo "Set the SCS_PATH environment variable"
  finish 1
fi
if [ "$S3_REGION" ] && [ "$S3_REGION" != "us-east-1" ]; then
  EP_URL="endpoint=$S3_REGION,url=https://s3.$S3_REGION.amazonaws.com"
else
  EP_URL="endpoint=us-east-1"
fi
# Prepare working directory
WORK_DIR="$(mktemp -p "$HOME" -d)"
MNT_POINT="$WORK_DIR/s3data"
PASSWD_S3FS="$WORK_DIR/.passwd-s3fs"
# Check the mountpoint
if [ ! -d "$MNT_POINT" ]; then
  mkdir -p "$MNT_POINT"
elif mountpoint "$MNT_POINT"; then
  echo "There is already something mounted on '$MNT_POINT', aborting!"
  finish 1
fi
# Create password file
touch "$PASSWD_S3FS"
chmod 0400 "$PASSWD_S3FS"
echo "$AWS_KEY:$AWS_SECRET_KEY" >"$PASSWD_S3FS"
# Mount s3 bucket as a filesystem
s3fs -o dbglevel=info,retries=5 -o "$EP_URL" -o "passwd_file=$PASSWD_S3FS" \
  "$S3_BUCKET" "$MNT_POINT"
echo "Mounted bucket '$S3_BUCKET' on '$MNT_POINT'"
# Remove the password file, just in case
rm -f "$PASSWD_S3FS"
# Check source PATH
ret="0"
SRC_PATH="$MNT_POINT/$S3_PATH"
if [ ! -d "$SRC_PATH" ]; then
  echo "The S3_PATH '$S3_PATH' can't be found!"
  ret=1
fi
# Compute SCS_UID & SCS_GID (by default based on the working directory owner)
SCS_UID="${SCS_UID:=$(stat -c "%u" "." 2>/dev/null)}" || true
SCS_GID="${SCS_GID:=$(stat -c "%g" "." 2>/dev/null)}" || true
# Check destination PATH
DST_PATH="./$SCS_PATH"
if [ "$ret" -eq "0" ] && [ -d "$DST_PATH" ]; then
  mkdir -p "$DST_PATH" || ret="$?"
fi
# Copy using rsync
if [ "$ret" -eq "0" ]; then
  rsync -rlptv --chown="$SCS_UID:$SCS_GID" --delete --stats \
    "$SRC_PATH/" "$DST_PATH/" || ret="$?"
fi
# Unmount the S3 bucket
umount -f "$MNT_POINT"
echo "Called umount for '$MNT_POINT'"
# Remove mount point dir
rmdir "$MNT_POINT"
# Remove WORK_DIR
rmdir "$WORK_DIR"
# We are done
finish "$ret"
# vim: ts=2:sw=2:et:ai:sts=2

Deployment objects
The system is deployed as a StatefulSet with one replica. Our production deployment is done on AWS and to be able to scale we use EFS for our PersistentVolume; the idea is that the volume has no size limit, its AccessMode can be set to ReadWriteMany and we can mount it from multiple instances of the Pod without issues, even if they are in different availability zones. For development we use k3d and we are also able to scale the StatefulSet for testing because we use a ReadWriteOnce PVC, but it points to a hostPath that is backed by a folder that is mounted on all the compute nodes, so in reality Pods on different k3d nodes use the same folder on the host.

secrets.yaml
The secrets file contains the files used by the mysecureshell container, which can be generated using kubernetes pods as follows (we are only creating the scs user):
$ kubectl run "mysecureshell" --restart='Never' --quiet --rm --stdin \
  --image "stodh/mysecureshell:latest" -- gen-host-keys >"./host_keys.txt"
$ kubectl run "mysecureshell" --restart='Never' --quiet --rm --stdin \
  --image "stodh/mysecureshell:latest" -- gen-users-tar scs >"./users.tar"
Once we have the files we can generate the secrets.yaml file as follows:
$ tar xf ./users.tar user_keys.txt user_pass.txt
$ kubectl --dry-run=client -o yaml create secret generic "scs-secrets" \
  --from-file="host_keys.txt=host_keys.txt" \
  --from-file="user_keys.txt=user_keys.txt" \
  --from-file="user_pass.txt=user_pass.txt" > ./secrets.yaml
The resulting secrets.yaml will look like the following file (the base64 would match the content of the files, of course):
secrets.yaml
apiVersion: v1
data:
  host_keys.txt: TWlt...
  user_keys.txt: c2Nz...
  user_pass.txt: c2Nz...
kind: Secret
metadata:
  creationTimestamp: null
  name: scs-secrets

pvc.yaml
The persistent volume claim for a simple deployment (one with only one instance of the statefulSet) can be as simple as this:
pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scs-pvc
  labels:
    app.kubernetes.io/name: scs
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
In this definition we don't set the storageClassName, so the default one is used.

Volumes in our development environment (k3d)
In our development deployment we create the following PersistentVolume as required by the Local Persistence Volume Static Provisioner (note that the /volumes/scs-pv directory has to be created by hand; in our k3d system we mount the same host directory on the /volumes path of all the nodes and create the scs-pv directory by hand before deploying the persistent volume):
k3d-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: scs-pv
  labels:
    app.kubernetes.io/name: scs
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  claimRef:
    name: scs-pvc
  storageClassName: local-storage
  local:
    path: /volumes/scs-pv
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
          - k3s
And to make sure that everything works as expected we update the PVC definition to add the right storageClassName:
k3d-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scs-pvc
  labels:
    app.kubernetes.io/name: scs
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: local-storage
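For completeness, this is roughly how such a k3d cluster can be created so that all nodes share the same host folder under /volumes (a sketch; the cluster name, agent count and host path are arbitrary):
$ mkdir -p volumes/scs-pv
$ k3d cluster create scs-demo --agents 2 \
  --volume "$PWD/volumes:/volumes@all"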

Volumes in our production environment (aws)
In the production deployment we don't create the PersistentVolume (we are using the aws-efs-csi-driver, which supports Dynamic Provisioning) but we add the storageClassName (we set it to the one mapped to the EFS driver, i.e. efs-sc) and set ReadWriteMany as the accessMode:
efs-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scs-pvc
  labels:
    app.kubernetes.io/name: scs
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 8Gi
  storageClassName: efs-sc

statefulset.yaml
The definition of the statefulSet is as follows:
statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: scs
  labels:
    app.kubernetes.io/name: scs
spec:
  serviceName: scs
  replicas: 1
  selector:
    matchLabels:
      app: scs
  template:
    metadata:
      labels:
        app: scs
    spec:
      containers:
      - name: nginx
        image: stodh/nginx-scs:latest
        ports:
        - containerPort: 80
          name: http
        env:
        - name: AUTH_REQUEST_URI
          value: ""
        - name: HTML_ROOT
          value: /sftp/data
        volumeMounts:
        - mountPath: /sftp
          name: scs-datadir
      - name: mysecureshell
        image: stodh/mysecureshell:latest
        ports:
        - containerPort: 22
          name: ssh
        securityContext:
          capabilities:
            add:
            - IPC_OWNER
        env:
        - name: SFTP_UID
          value: '2020'
        - name: SFTP_GID
          value: '2020'
        volumeMounts:
        - mountPath: /secrets
          name: scs-file-secrets
          readOnly: true
        - mountPath: /sftp
          name: scs-datadir
      - name: webhook
        image: stodh/webhook-scs:latest
        securityContext:
          privileged: true
        ports:
        - containerPort: 9000
          name: webhook-http
        env:
        - name: WEBHOOK_WORKDIR
          value: /sftp/data/scs
        volumeMounts:
        - name: devfuse
          mountPath: /dev/fuse
        - mountPath: /sftp
          name: scs-datadir
      volumes:
      - name: devfuse
        hostPath:
          path: /dev/fuse
      - name: scs-file-secrets
        secret:
          secretName: scs-secrets
      - name: scs-datadir
        persistentVolumeClaim:
          claimName: scs-pvc
Notes about the containers:
  • nginx: As this is an example the web server is not using an AUTH_REQUEST_URI and uses the /sftp/data directory as the root of the web (to get to the files uploaded for the scs user we will need to use /scs/ as a prefix on the URLs).
  • mysecureshell: We are adding the IPC_OWNER capability to the container to be able to use some of the sftp-* commands inside it, but they are not really needed, so adding the capability is optional.
  • webhook: We are launching this container in privileged mode to be able to use s3fs-fuse, as it will not work otherwise for now (see this kubernetes issue); if the functionality is not needed the container can be executed with regular privileges; besides, as we are not enabling public access to this service we don't define *_TOKEN variables (if required the values should be read from a Secret object).
Notes about the volumes:
  • the devfuse volume is only needed if we plan to use the s3fs command on the webhook container, if not we can remove the volume definition and its mounts.

service.yaml
To be able to access the different services on the statefulset we publish the relevant ports using the following Service object:
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: scs-svc
  labels:
    app.kubernetes.io/name: scs
spec:
  ports:
  - name: ssh
    port: 22
    protocol: TCP
    targetPort: 22
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  - name: webhook-http
    port: 9000
    protocol: TCP
    targetPort: 9000
  selector:
    app: scs

ingress.yaml
To download the scs files from the outside we can add an Ingress object like the following (the definition is for testing using the localhost name):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: scs-ingress
  labels:
    app.kubernetes.io/name: scs
spec:
  ingressClassName: nginx
  rules:
  - host: 'localhost'
    http:
      paths:
      - path: /scs
        pathType: Prefix
        backend:
          service:
            name: scs-svc
            port:
              number: 80

Deployment
To deploy the statefulSet we create a namespace and apply the object definitions shown before:
$ kubectl create namespace scs-demo
namespace/scs-demo created
$ kubectl -n scs-demo apply -f secrets.yaml
secret/scs-secrets created
$ kubectl -n scs-demo apply -f pvc.yaml
persistentvolumeclaim/scs-pvc created
$ kubectl -n scs-demo apply -f statefulset.yaml
statefulset.apps/scs created
$ kubectl -n scs-demo apply -f service.yaml
service/scs-svc created
$ kubectl -n scs-demo apply -f ingress.yaml
ingress.networking.k8s.io/scs-ingress created
Once the objects are deployed we can check that all is working using kubectl:
$ kubectl  -n scs-demo get all,secrets,ingress
NAME        READY   STATUS    RESTARTS   AGE
pod/scs-0   3/3     Running   0          24s
NAME            TYPE       CLUSTER-IP  EXTERNAL-IP  PORT(S)                  AGE
service/scs-svc ClusterIP  10.43.0.47  <none>       22/TCP,80/TCP,9000/TCP   21s

NAME                   READY   AGE
statefulset.apps/scs   1/1     24s
NAME                         TYPE                                  DATA   AGE
secret/default-token-mwcd7   kubernetes.io/service-account-token   3      53s
secret/scs-secrets           Opaque                                3      39s
NAME                                   CLASS  HOSTS      ADDRESS     PORTS   AGE
ingress.networking.k8s.io/scs-ingress  nginx  localhost  172.21.0.5  80      17s
At this point we are ready to use the system.

Usage examples

File uploads
As previously mentioned, in our system the idea is to use the sftp server from other Pods, but to test the system we are going to do a kubectl port-forward and connect to the server using our host client and the password we have generated (it is in the user_pass.txt file, inside the users.tar archive):
$ kubectl -n scs-demo port-forward service/scs-svc 2020:22 &
Forwarding from 127.0.0.1:2020 -> 22
Forwarding from [::1]:2020 -> 22
$ PF_PID=$!
$ sftp -P 2020 scs@127.0.0.1                                                 1
Handling connection for 2020
The authenticity of host '[127.0.0.1]:2020 ([127.0.0.1]:2020)' can't be \
  established.
ED25519 key fingerprint is SHA256:eHNwCnyLcSSuVXXiLKeGraw0FT/4Bb/yjfqTstt+088.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[127.0.0.1]:2020' (ED25519) to the list of known \
  hosts.
scs@127.0.0.1's password: **********
Connected to 127.0.0.1.
sftp> ls -la
drwxr-xr-x    2 sftp     sftp         4096 Sep 25 14:47 .
dr-xr-xr-x    3 sftp     sftp         4096 Sep 25 14:36 ..
sftp> !date -R > /tmp/date.txt                                               2
sftp> put /tmp/date.txt .
Uploading /tmp/date.txt to /date.txt
date.txt                                      100%   32    27.8KB/s   00:00
sftp> ls -l
-rw-r--r--    1 sftp     sftp           32 Sep 25 15:21 date.txt
sftp> ln date.txt date.txt.1                                                 3
sftp> ls -l
-rw-r--r--    2 sftp     sftp           32 Sep 25 15:21 date.txt
-rw-r--r--    2 sftp     sftp           32 Sep 25 15:21 date.txt.1
sftp> put /tmp/date.txt date.txt.2                                           4
Uploading /tmp/date.txt to /date.txt.2
date.txt                                      100%   32    27.8KB/s   00:00
sftp> ls -l                                                                  5
-rw-r--r--    2 sftp     sftp           32 Sep 25 15:21 date.txt
-rw-r--r--    2 sftp     sftp           32 Sep 25 15:21 date.txt.1
-rw-r--r--    1 sftp     sftp           32 Sep 25 15:21 date.txt.2
sftp> exit
$ kill "$PF_PID"
[1]  + terminated  kubectl -n scs-demo port-forward service/scs-svc 2020:22
  1. We connect to the sftp service on the forwarded port with the scs user.
  2. We put a file we have created on the host on the directory.
  3. We do a hard link of the uploaded file.
  4. We put a second copy of the file we created locally.
  5. On the file list we can see that the first two files have two hard links.

File retrievals
If our ingress is configured right we can download the date.txt file from the URL http://localhost/scs/date.txt:
$ curl -s http://localhost/scs/date.txt
Sun, 25 Sep 2022 17:21:51 +0200

Use of the webhook container
To finish this post we are going to show how we can call the hooks directly, from a CronJob and from a Job.

Direct script call (du)
In our deployment the direct calls are done from other Pods; to simulate it we are going to do a port-forward and call the script with an existing PATH (the root directory) and a bad one:
$ kubectl -n scs-demo port-forward service/scs-svc 9000:9000 >/dev/null &
$ PF_PID=$!
$ JSON="$(curl -s "http://localhost:9000/hooks/du?path=.")"
$ echo $JSON
{ "path":"","bytes":"4160" }
$ JSON="$(curl -s "http://localhost:9000/hooks/du?path=foo")"
$ echo $JSON
{ "error":"The provided PATH ('foo') is not a directory" }
$ kill $PF_PID
As we only have files on the base directory we print the disk usage of the . PATH and the output is in json format because we export OUTPUT_FORMAT with the value json on the webhook configuration.

Jobs (s3sync)
The following Job can be used to synchronise the contents of a directory in an S3 bucket with the SCS Filesystem:
job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: s3sync
  labels:
    cronjob: 's3sync'
spec:
  template:
    metadata:
      labels:
        cronjob: 's3sync'
    spec:
      containers:
      - name: s3sync-job
        image: alpine:latest
        command: 
        - "wget"
        - "-q"
        - "--header"
        - "Content-Type: application/json"
        - "--post-file"
        - "/secrets/s3sync.json"
        - "-O-"
        - "http://scs-svc:9000/hooks/s3sync"
        volumeMounts:
        - mountPath: /secrets
          name: job-secrets
          readOnly: true
      restartPolicy: Never
      volumes:
      - name: job-secrets
        secret:
          secretName: webhook-job-secrets
The file with parameters for the script must be something like this:
s3sync.json
{
  "aws": {
    "key": "********************",
    "secret_key": "****************************************"
  },
  "s3": {
    "region": "eu-north-1",
    "bucket": "blogops-test",
    "path": "test"
  },
  "scs": {
    "path": "test"
  }
}
Once we have both files we can run the Job as follows:
$ kubectl -n scs-demo create secret generic webhook-job-secrets \            1
  --from-file="s3sync.json=s3sync.json"
secret/webhook-job-secrets created
$ kubectl -n scs-demo apply -f webhook-job.yaml                              2
job.batch/s3sync created
$ kubectl -n scs-demo get pods -l "cronjob=s3sync"                           3
NAME           READY   STATUS      RESTARTS   AGE
s3sync-zx2cj   0/1     Completed   0          12s
$ kubectl -n scs-demo logs s3sync-zx2cj                                      4
Mounted bucket 's3fs-test' on '/root/tmp.jiOjaF/s3data'
sending incremental file list
created directory ./test
./
kyso.png
Number of files: 2 (reg: 1, dir: 1)
Number of created files: 2 (reg: 1, dir: 1)
Number of deleted files: 0
Number of regular files transferred: 1
Total file size: 15,075 bytes
Total transferred file size: 15,075 bytes
Literal data: 15,075 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.147 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 15,183
Total bytes received: 74
sent 15,183 bytes  received 74 bytes  30,514.00 bytes/sec
total size is 15,075  speedup is 0.99
Called umount for '/root/tmp.jiOjaF/s3data'
Script exit code: 0
$ kubectl -n scs-demo delete -f webhook-job.yaml                             5
job.batch "s3sync" deleted
$ kubectl -n scs-demo delete secrets webhook-job-secrets                     6
secret "webhook-job-secrets" deleted
  1. Here we create the webhook-job-secrets secret that contains the s3sync.json file.
  2. This command runs the job.
  3. Checking the label cronjob=s3sync we get the Pods executed by the job.
  4. Here we print the logs of the completed job.
  5. Once we are finished we remove the Job.
  6. And also the secret.
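Since the webhook calls are plain HTTP requests, the same call can also be scheduled with a CronJob; one quick way to get a starting manifest is to let kubectl generate it (a sketch: the schedule is arbitrary and the secret volume with the s3sync.json file still has to be added to the generated YAML by hand):
$ kubectl -n scs-demo create cronjob s3sync --schedule="0 3 * * *" \
    --image=alpine:latest --dry-run=client -o yaml -- \
    wget -q --header "Content-Type: application/json" \
    --post-file /secrets/s3sync.json -O- http://scs-svc:9000/hooks/s3sync \
    > webhook-cronjob.yaml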

Final remarks
This post has been longer than I expected, but I believe it can be useful for someone; in any case, next time I'll try to explain something shorter or will split it into multiple entries.

Shirish Agarwal: Rama II, Arthur C. Clarke, Aliens

Rama II
This would be more of a short post about the current book I am reading. Now people who have seen Arrival would probably be more at home. People who have also seen Avatar would also be familiar with the theme or concept I am sharing about. Now before I go into detail, it seems that Arthur C. Clarke wanted to use a powerful god or mythological character for the name, and that is somehow how the RAMA series started. Now the first book in the series explores an extraterrestrial spaceship that earth people see/connect with. The spaceship is going somewhere and is doing an Earth flyby, so humans don't have much time to explore the spaceship and it is difficult to figure out how the spaceship worked. The spaceship is around 40 km long. They don't meet any living Ramans but mostly automated systems and something called biots. As I'm still reading it, I can't really say what happens next. Although in Rama or Rama I, the powers that be want to destroy it while in the end they don't. Whether they could have destroyed it or not would be a whole other argument. What people need to realize is that the book is a giant What IF scenario.

Aliens If there were any intelligent life in the Universe, I don t think they will take the pain of visiting Earth. And the reasons are far more mundane than anything else. Look at how we treat each other. One of the largest democracies on Earth, The U.S. has been so divided. While the progressives have made some good policies, the Republicans are into political stunts, consider the political stunt of sending Refugees to Martha s Vineyard. The ex-president also made a statement that he can declassify anything just by thinking about it. Now understand this, a refugee is a legal migrant whose papers would be looked into by the American Govt. and till the time he/she/their application is approved or declined they can work, have a house, or do whatever to support themselves. There is a huge difference between having refugee status and being an undocumented migrant. And it isn t as if the Republicans don t know this, they did it because they thought they will be able to get away with it. Both the above episodes don t throw us in a good light. If we treat others like the above, how can we expect to be treated? And refugees always have a hard time, not just in the U.S, , the UK you name it. The UK just some months ago announced a controversial deal where they will send Refugees to Rwanda while their refugee application is accepted or denied, most of them would be denied. The Indian Government is more of the same. A friend, a casual acquaintance Nishant Shah shared the same issues as I had shared a few weeks back even though he s an NRI. So, it seems we are incapable of helping ourselves as well as helping others. On top of it, we have the temerity of using the word alien for them. Now, just for a moment, imagine you are an intelligent life form. An intelligent life-form that could coax energy from the stars, why would you come to Earth, where the people at large have already destroyed more than half of the atmosphere and still arguing about it with the other half. On top of it, we see a list of authoritarian figures like Putin, Xi Jinping whose whole idea is to hold on to power for as long as they can, damn the consequences. Mr. Modi is no different, he is the dumbest of the lot and that s saying something. Most of the projects made by him are in disarray, Pune Metro, my city giving an example. And this is when Pune was the first applicant to apply for a Metro. Just like the UK, India too has tanked the economy under his guidance. Every time they come closer to target dates, the targets are put far into the future, for e.g. now they have said 2040 for a good economy. And just like in other countries, he has some following even though he has a record of failure in every sector of the economy, education, and defense, the list is endless. There isn t a single accomplishment by him other than screwing with other religions. Most of my countrymen also don t really care or have a bother to see how the economy grows and how exports play a crucial part otherwise they would be more alert. Also, just like the UK, India too gave tax cuts to the wealthy, most people don t understand how economies function and the PM doesn t care. The media too is subservient and because nobody asks the questions, nobody seems to be accountable :(.

Religion There is another aspect that also has been to the fore, just like in medieval times, I see a great fervor for religion happening here, especially since the pandemic and people are much more insecure than ever before. Before, I used to think that insecurity and religious appeal only happen in the uneducated, and I was wrong. I have friends who are highly educated and yet still are blinded by religion. In many such cases or situations, I find their faith to be a sham. If you have faith, then there shouldn t be any room for doubt or insecurity. And if you are not in doubt or insecure, you won t need to talk about your religion. The difference between the two is that a person is satiated himself/herself/themselves with thirst and hunger. That person would be in a relaxed mode while the other person would continue to create drama as there is no peace in their heart. Another fact is none of the major religions, whether it is Christianity, Islam, Buddhism or even Hinduism has allowed for the existence of extraterrestrials. We have already labeled them as aliens even before meeting them & just our imagination. And more often than not, we end up killing them. There are and have been scores of movies that have explored the idea. Independence day, Aliens, Arrival, the list goes on and on. And because our religions have never thought about the idea of ET s and how they will affect us, if ET s do come, all the religions and religious practices would panic and die. That is the possibility why even the 1947 Roswell Incident has been covered up . If the above was not enough, the bombing of Hiroshima and Nagasaki by the Americans would always be a black mark against humanity. From the alien perspective, if you look at the technology that they have vis-a-vis what we have, they will probably think of us as spoilt babies and they wouldn t be wrong. Spoilt babies with nuclear weapons are not exactly a healthy mix

Earth To add to our fragile ego, we didn t even leave earth even though we have made sure we exploit it as much as we can. We even made the anthropocentric or homocentric view that makes man the apex animal and to top it we have this weird idea that extraterrestrials come here or will invade for water. A species that knows how to get energy out of stars but cannot make a little of H2O. The idea belies logic and again has been done to death. Why we as humans are so insecure even though we have been given so much I fail to understand. I have shared on numerous times the Kardeshev Scale on this blog itself. The above are some of the reasons why Arthur C. Clarke s works are so controversial and this is when I haven t even read the whole book. It forces us to ask questions that we normally would never think about. And I have to repeat that when these books were published for the first time, they were new ideas. All the movies, from Stanley Kubrick s 2001: Space Odyssey, Aliens, Arrival, and Avatar, somewhere or the other reference some aspect of this work. It is highly possible that I may read and re-read the book couple of times before beginning the next one. There is also quite a bit of human drama, but then that is to be expected. I have to admit I did have some nice dreams after reading just the first few pages, imagining being given the opportunity to experience an Extraterrestrial spaceship that is beyond our wildest dreams. While the Governments may try to cover up or something, the ones who get to experience that spacecraft would be unimaginable. And if they were able to share the pictures or a Livestream, it would be nothing short of amazing. For those who want to, there is a lot going on with the New James Webb Telescope. I am sure it would give rise to more questions than answers.

23 September 2022

Gunnar Wolf: 6237415

Years ago, it was customary that some of us stated publicly the way we think in times of Debian General Resolutions (GRs). And even if we didn't, vote lists were open (except when voting for people, i.e. when electing a DPL), so if interested we could understand what our different peers thought. This is the first vote, though, where a Debian vote is protected under voting secrecy. I think it is sad we chose that path, as I liken a GR vote more to a voting process within a general assembly of a cooperative than to a countrywide one; I feel that understanding who is behind each posture helps us better understand the project as a whole. But anyway, I'm digressing. Even though I remained quiet during much of the discussion period (I was preparing and attending a conference), I am very much interested in this vote: I am the maintainer for the Raspberry Pi firmware, and am a seconder for two of the options. Many people know me for being quite inflexible in my interpretation of what should be considered Free Software, and I'm proud of it. But still, I believe it to be fundamental for Debian to be able to run on the hardware most users have. So my vote was as follows:
[6] Choice 1: Only one installer, including non-free firmware
[2] Choice 2: Recommend installer containing non-free firmware
[3] Choice 3: Allow presenting non-free installers alongside the free one
[7] Choice 4: Installer with non-free software is not part of Debian
[4] Choice 5: Change SC for non-free firmware in installer, one installer
[1] Choice 6: Change SC for non-free firmware in installer, keep both installers
[5] Choice 7: None Of The Above
For people reading this who are not into Debian's voting processes: Debian uses the cloneproof Schwartz sequential dropping Condorcet method, which means we don't only choose our favorite option (which could lead to suboptimal strategic voting outcomes), but rank all the options according to our preferences. To read this vote, we should first locate the position of None of the above, which for my ballot is #5. Let me reorder the ballot according to my preferences:
[1] Choice 6: Change SC for non-free firmware in installer, keep both installers
[2] Choice 2: Recommend installer containing non-free firmware
[3] Choice 3: Allow presenting non-free installers alongside the free one
[4] Choice 5: Change SC for non-free firmware in installer, one installer
[5] Choice 7: None Of The Above
[6] Choice 1: Only one installer, including non-free firmware
[7] Choice 4: Installer with non-free software is not part of Debian
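Purely as an illustration (this is not part of Debian's voting tooling), the same reordering can be reproduced with a plain sort, since the bracketed number at the start of each line is the rank; ballot.txt here is a hypothetical file holding the lines above:
$ sort ballot.txt
For single-digit ranks a lexical sort and a numeric sort give the same order, so no extra flags are needed.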
That is, I don't agree with Steve McIntyre's original proposal, Choice 1, either (even though I seconded it; this means I think it's very important to have this vote, and as a first proposal it's better than the status quo; maybe it's contradictory that I prefer it to the status quo but ranked it below NotA, but more on that when I present Choice 5). My least favorite option is Choice 4, presented by Simon Josefsson, which represents the status quo: I don't want Debian to be left without an installer that can run on most modern hardware with a reasonably good user experience (i.e. network support or the ability to boot at all!). Slightly above my acceptability threshold, I ranked Choice 5, presented by Russ Allbery. Debian's voting and its constitution rub each other in interesting ways, so the Project Secretary has to run the votes as they are presented, but he has interpreted Choice 1 to be incompatible with the Social Contract (as there would no longer be a DFSG-free installer available), and if it wins, it could lead him to having to declare the vote invalid. I don't want that to happen, and that's why I ranked Choice 1 below None of the above.
[update/note] Several people have asked me to back up the claim that the Secretary said so. I can refer to four mails: 2022.08.29, 2022.08.30, 2022.09.02, 2022.09.04.
Other than that, Choice 6 (proposed by Holger Levsen), Choice 2 (proposed by me) and Choice 3 (proposed by Bart Martens) are very much similar; the main difference is that Choice 6 includes a modification to the Social Contract expressing that:
The Debian official media may include firmware that is otherwise not
part of the Debian system to enable use of Debian with hardware that
requires such firmware.
I believe Choices 2 and 3 to be mostly the same, with Choice 2 being more verbose in explaining the reasoning than Choice 3. Oh! And there are always some more bits to the discussion. For example, given that they hold modifications to the Social Contract, both Choice 5 and Choice 6 need a 3:1 supermajority to be valid. So, let's wait until the beginning of October to get the results, and to implement the changes they will (or will not?) allow. If you are a Debian Project Member, please vote!

17 September 2022

Shirish Agarwal: Books and Indian Tourism

Fiction A few days ago somebody asked me, and I think it is a question often put to perhaps all fiction readers, why we like fiction. First of all, reading in itself is said to be food for the soul, because whenever you write or read anything you don't just read it, you also visualize it. And that visualization is and would be far greater than any attempt in cinema, as there are no budget constraints and it takes no more than a minute to visualize a scenario if the writer is any good. You just close your eyes and in a moment you are transported to a different world. This is also what is known as world building, something fantasy writers are especially gifted in. Also, with the whole idea of parallel Universes being a reality, it is just so much fertile land for imagination that I just cannot believe it hasn't been worked to death to date. And you do need a lot of patience to make a world, to make characters, to make characters a bit eccentric one way or the other. And you have to know how to fit it into three, five, or however many acts you want. And then, of course, there are readers like us who dream and add more color to the story than the author did, as we take his, her, or their story and weave countless stories depending on where we are and who we are. What people need to understand is that not just readers want escapism, but writers too want to escape from the human condition, and they find solace in whatever they write. The well-known example of J.R.R. Tolkien is always there: how he must have felt each day, coming home after the war, to somehow find the strength and just dream away, transporting himself to a world of hobbits, elves, and other mysterious beings. It surely must have taken away a lot of pain that he would otherwise have felt. There are many others. What also happens now and then is that authors believe in their own intelligence so much that they commit crimes, but that's par for the course.

Dean Koontz, Odd Apocalypse Currently, I am reading the above title. It is perhaps one of the first horror titles I have read that is this much fun. The hero has a sense of wit, humor, and sarcasm so sharp you could cut butter with it. And that is par for the wordplay happening every second paragraph, and I'm just 100 pages into the 500-page novel. Now, while I haven't read the whole book and I'm just speculating, what if at the end we realize that the hero all along was, or is, the villain? Sadly, we don't have many such twisted stories, and that too is perhaps because most writers used to work with black-and-white rather than grey characters. From all my reading, and even watching web series and whatnot, it is only the Europeans who seem to have a taste for exploring grey characters and giving twists at the end that people cannot anticipate. Even their heroes or heroines are grey characters, and they can really take you for a ride. It is also perhaps how we humans are: neither black nor white but more greyish. Having grey characters also frees the author quite a bit, as she doesn't have to use so-called tropes and can just let the characters lead themselves.

Indian Book publishing Industry I do know Bengali stories have a lot of grey characters, but sadly most of the good works are still in Bengali and not widely published, compared to, say, European or American authors. While there is huge potential in the Indian publishing market for English books, and there is also hunger, getting good and cheap publishers is the issue. Just recently the SAGE publishing division shut down, and this does not augur well for the Indian market. In the past few years, I and other readers have seen some very good publishing houses quit India for one reason or the other. GST has also made the sector more expensive. The only thing that works now, and has for some time, is the seconds and thirds market. For e.g., I just bought today about 15-20 books @ INR 125/- each, a kind of belated present for myself. That would be, at the most, 2 USD or 2 Euros per book. I bet even a burger costs more than that, but again, India being a price-sensitive market, at these prices secondhand books sell. And these are all my favorite authors: Lee Child, Tom Clancy, Dean Koontz, and so on and so forth. I also saw a lot of fantasy books, but they would have to wait for another day.

Tourism in India for Debconf 23 I had shared a while back that I would write a bit about tourism as Debconf or Annual Debian Conference will happen in India next year around this time. I was supposed to write it in the FAQ but couldn t find a place or a corner where I could write it. There are actually two things that people need to be aware of. The one thing that people need to be very aware of is food poisoning or Delhi Belly. This is a far too common sight that I have witnessed especially with westerners when they come to visit India. I am somewhat shocked that it hasn t been shared in the FAQ but then perhaps we cannot cover all the bases therein. I did find this interesting article and would recommend the suggestions given in it wholeheartedly. I would suggest people coming to India to buy and have purifying water tablets with them if they decide to stay back and explore India. Now the problem with tourism is, that one can have as much tourism as one wants. One of the unique ways I found some westerners having the time of their life is buying an Indian Rickshaw or Tuk-Tuk and traveling with it. A few years ago, when I was more adventourous-spirited I was able to meet a few of them. There is also the Race with Rickshaws that happens in Rajasthan and you get to see about 10 odd cities in and around Rajasthan state and get to see the vibrancy in the North. If somebody really wants to explore India, then I would suggest getting down to Goa, specifically, South Goa, meeting with the hippie crowd, and getting one of the hippie guidebooks to India. Most people forget that the Hippies came to India in the 1960s and many of them just never left. Tap water in Pune is ok, have seen and experienced the same in Himachal, Garwhal, and Uttarakhand, although it has been a few years since I have been to those places. North-East is a place I have yet to venture into. India does have a lot of beauty but most people are not clean-conscious so if you go to common tourist destinations, you will find a lot of garbage. Most cities in India do give you an option of homestays and some even offer food, so if you are on a budget as well as wanna experience life with an Indian family, that could be something you could look into. So you can see and share about India with different eyes. There is casteism, racism, and all that. Generally speaking, you would see it wielded a lot more in your face in North India than in South India where it is there but far more subtle. About food, what has been shared in the India BOF. Have to say, it doesn t even scratch the surface. If you stay with an Indian family, there is probably a much better chance of exploring the variety of food that India has to offer. From the western perspective, we tend to overcook stuff and make food with Masalas but that s the way most people like it. People who have had hot sauces or whatnot would probably find India much easier to adjust to as tastes might be similar to some extent. If you want to socialize with young people, while discos are an option, meetup.com also is a good place. You can share your passions and many people have taken to it with gusto. We also have been hosting Comiccons in India, but I haven t had the opportunity to attend them so far. India has a rich oral culture reach going back a few thousand years, but many of those who are practicing those reside more in villages rather than in cities. 
And while there have been attempts in the past to record them, most of those have come to naught as money runs out as there is no commercial viability to such projects, but that probably is for another day. In the end, what I have shared is barely a drop in the ocean that is India. Come, have fun, explore, enjoy and invigorate yourself and others

9 September 2022

Reproducible Builds: Reproducible Builds in August 2022

Welcome to the August 2022 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Community news As announced last month, registration is currently open for our in-person summit this year, which is due to be held between November 1st and November 3rd. The event will take place in Venice (Italy). Very soon we intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees. Please see the announcement email for information about how to register.
The US National Security Agency (NSA), Cybersecurity and Infrastructure Security Agency (CISA) and the Office of the Director of National Intelligence (ODNI) have released a document called Securing the Software Supply Chain: Recommended Practices Guide for Developers (PDF) as part of their Enduring Security Framework (ESF) work. The document expressly recommends having reproducible builds as part of advanced recommended mitigations, along with hermetic builds. Page 31 (page 35 in the PDF) says:
Reproducible builds provide additional protection and validation against attempts to compromise build systems. They ensure the binary products of each build system match: i.e., they are built from the same source, regardless of variable metadata such as the order of input files, timestamps, locales, and paths. Reproducible builds are those where re-running the build steps with identical input artifacts results in bit-for-bit identical output. Builds that cannot meet this must provide a justification why the build cannot be made reproducible.
The full press release is available online.
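As a concrete illustration of the bit-for-bit identical requirement quoted above, a reproducibility check ultimately boils down to comparing independently produced artifacts; a minimal sketch with purely hypothetical file names:
$ sha256sum first-build/hello_1.0_amd64.deb second-build/hello_1.0_amd64.deb
$ cmp first-build/hello_1.0_amd64.deb second-build/hello_1.0_amd64.deb && echo reproducible
The cmp call exits non-zero if the two files differ by even a single bit, which is exactly the property the guide is asking for.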
On our mailing list this month, Marc Prud'hommeaux posted a feature request for diffoscope which additionally outlines a project called The App Fair, an autonomous distribution network of free and open-source macOS and iOS applications, where validated apps are then signed and submitted for publication.
Author/blogger Cory Doctorow published a provocative blog post this month titled Your computer is tormented by a wicked god. Touching on Ken Thompson's famous talk Reflections on Trusting Trust, the early goals of Secure Computing and UEFI firmware interfaces:
This is the core of a two-decade-old debate among security people, and it's one that the benevolent God faction has consistently had the upper hand in. They're the curated computing advocates who insist that preventing you from choosing an alternative app store or side-loading a program is for your own good, because if it's possible for you to override the manufacturer's wishes, then malicious software may impersonate you to do so, or you might be tricked into doing so. [...] This benevolent dictatorship model only works so long as the dictator is both perfectly benevolent and perfectly competent. We know the dictators aren't always benevolent. [...] But even if you trust a dictator's benevolence, you can't trust in their perfection. Everyone makes mistakes. Benevolent dictator computing works well, but fails badly. Designing a computer that intentionally can't be fully controlled by its owner is a nightmare, because that is a computer that, once compromised, can attack its owner with impunity.

Lastly, Chengyu HAN updated the Reproducible Builds website to correct an incorrect Git command. [ ]

Debian In Debian this month, the essential and required package sets became 100% reproducible in Debian bookworm on the amd64 and arm64 architectures. These two subsets of the full Debian archive refer to Debian package priority levels as described in the 2.5 Priorities section of the Debian Policy; there is no canonical minimal installation package set in Debian due to its diverse methods of installation. As it happens, these package sets are not reproducible on the i386 architecture because the ncurses package on that architecture is not yet reproducible, and the sed package currently fails to build from source on armhf too. The full list of reproducible packages within these package sets can be viewed within our QA system, such as on the page of required packages in amd64 and the list of essential packages on arm64, both for Debian bullseye.
It recently has become very easy to install reproducible Debian Docker containers using podman on Debian bullseye:
$ sudo apt install podman
$ podman run --rm -it debian:bullseye bash
The (pre-built) image used is itself built using debuerrotype, as explained on docker.debian.net. This page also details how to build the image yourself and what checksums are expected if you do so.
Related to this, it has also become straightforward to reproducibly bootstrap Debian using mmdebstrap, a replacement for the usual debootstrap tool to create Debian root filesystems:
$ SOURCE_DATE_EPOCH=$(date --utc --date=2022-08-29 +%s) mmdebstrap unstable > unstable.tar
This works for (at least) Debian unstable, bullseye and bookworm, and is tested automatically by a number of QA jobs set up by Holger Levsen (unstable, bookworm and bullseye)
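To convince yourself that this bootstrap really is reproducible, one rough sketch (not the project's official QA job) is simply to run the same invocation twice and compare the resulting tarballs:
$ SOURCE_DATE_EPOCH=$(date --utc --date=2022-08-29 +%s) mmdebstrap unstable > unstable-1.tar
$ SOURCE_DATE_EPOCH=$(date --utc --date=2022-08-29 +%s) mmdebstrap unstable > unstable-2.tar
$ sha256sum unstable-1.tar unstable-2.tar
If everything is in order, the two checksums match.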
Work has also taken place to ensure that the canonical debootstrap and cdebootstrap tools are also capable of bootstrapping Debian reproducibly, although it currently requires a few extra steps:
  1. Clamping the modification time of files that are newer than $SOURCE_DATE_EPOCH so that they are no newer than $SOURCE_DATE_EPOCH.
  2. Deleting a few files. For debootstrap, this requires the deletion of /etc/machine-id, /var/cache/ldconfig/aux-cache, /var/log/dpkg.log, /var/log/alternatives.log and /var/log/bootstrap.log, and for cdebootstrap we also need to delete the /var/log/apt/history.log and /var/log/apt/term.log files as well.
This process works at least for unstable, bullseye and bookworm and is now being tested automatically by a number of QA jobs set up by Holger Levsen [ ][ ][ ][ ][ ][ ]. As part of this work, Holger filed two bugs to request a better initialisation of the /etc/machine-id file in both debootstrap [ ] and cdebootstrap [ ].
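A rough sketch of those two post-processing steps, for the debootstrap case, might look like the following ($CHROOT and $SOURCE_DATE_EPOCH are assumed to be set already; the exact file list follows the text above and may differ between versions):
$ find "$CHROOT" -newermt "@$SOURCE_DATE_EPOCH" -print0 | \
    xargs -0r touch --no-dereference --date="@$SOURCE_DATE_EPOCH"
$ rm -f "$CHROOT"/etc/machine-id \
        "$CHROOT"/var/cache/ldconfig/aux-cache \
        "$CHROOT"/var/log/dpkg.log \
        "$CHROOT"/var/log/alternatives.log \
        "$CHROOT"/var/log/bootstrap.log
The find call clamps every file newer than the chosen epoch, and the rm removes the files that unavoidably differ between runs.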
Elsewhere in Debian, 131 reviews of Debian packages were added, 20 were updated and 27 were removed this month, adding to our extensive knowledge about identified issues. Chris Lamb added a number of issue types, including: randomness_in_browserify_output [ ], haskell_abi_hash_differences [ ], nondeterministic_ids_in_html_output_generated_by_python_sphinx_panels [ ]. Lastly, Mattia Rizzolo removed the deterministic flag from the captures_kernel_variant flag [ ].

Other distributions Vagrant Cascadian posted an update of the status of Reproducible Builds in GNU Guix, writing that:
Ignoring the pesky unknown packages, it is more like ~93% reproducible and ~7% unreproducible... that feels a bit better to me! These numbers wander around over time, mostly due to packages moving back into an "unknown" state while the build farms catch up with each other... although the above numbers seem to have been pretty consistent over the last few days.
The post itself contains a lot more details, including a brief discussion of tooling. Elsewhere in GNU Guix, however, Vagrant updated a number of packages such as itpp [ ], perl-class-methodmaker [ ], libnet [ ], directfb [ ] and mm-common [ ], as well as updated the version of reprotest to 0.7.21 [ ]. In openSUSE, Bernhard M. Wiedemann published his usual openSUSE monthly report.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 220 and 221 to Debian, as well as made the following changes:
  • Update external_tools.py to reflect changes to xxd and the vim-common package. [ ]
  • Depend on the dedicated xxd package now, not the vim-common package. [ ]
  • Don't crash if we can open a PDF file using the PyPDF library, but cannot subsequently parse the annotations within. [ ]
In addition, Vagrant Cascadian updated diffoscope in GNU Guix, first to version 220 [ ] and later to 221 [ ].
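For anyone who has not used it before, a typical invocation simply takes the two artifacts to compare (the file names below are placeholders, not taken from this month's uploads):
$ diffoscope build-1/foo.deb build-2/foo.deb
$ diffoscope --html report.html build-1/foo.deb build-2/foo.deb
The first form prints a textual diff on standard output; the second writes a self-contained HTML report.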

Community news The Reproducible Builds project aims to fix as many currently-unreproducible packages as possible as well as to send all of our patches upstream wherever appropriate. This month we created a number of patches, including:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, Holger Levsen made the following changes:
  • Debian-related changes:
    • Temporarily add Debian unstable deb-src lines to enable test builds for a Non-maintainer Upload (NMU) campaign targeting 708 sources without .buildinfo files found in Debian unstable, including 475 in bookworm. [ ][ ]
    • Correctly deal with the Debian Edu packages not being installable. [ ]
    • Finally, stop scheduling stretch. [ ]
    • Make sure all Ubuntu nodes have the linux-image-generic kernel package installed. [ ]
  • Health checks & view:
    • Detect SSH login problems. [ ]
    • Only report the first uninstallable package set. [ ]
    • Show new bootstrap jobs [ ] and debian-live jobs [ ] in the job health view.
    • Fix regular expression to detect various zombie jobs. [ ]
  • New jobs:
    • Add a new job to test reproducibility of mmdebstrap bootstrapping tool. [ ][ ][ ][ ]
    • Run our new mmdebstrap job remotely [ ][ ]
    • Improve the output of the mmdebstrap job. [ ][ ][ ]
    • Adjust the mmdebstrap script to additionally support debootstrap as well. [ ][ ][ ]
    • Work around mmdebstrap and debootstrap keeping logfiles within their artifacts. [ ][ ][ ]
    • Add support for testing cdebootstrap too and add such a job for unstable. [ ][ ][ ]
    • Use a reproducible value for SOURCE_DATE_EPOCH for all our new bootstrap jobs. [ ]
  • Misc changes:
    • Send the create_meta_pkg_sets notification to #debian-reproducible-changes instead of #debian-reproducible. [ ]
In addition, Roland Clobus re-enabled the tests for live-build images [ ] and added a feature where the build would retry instead of give up when the archive was synced whilst building an ISO [ ], and Vagrant Cascadian added logging to report the current target of the /bin/sh symlink [ ].

Contact As ever, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 September 2022

Antoine Beaupré: Complaint about Canada's phone cartel

I have just filed a complaint with the CRTC about my phone provider's outrageous fees. This is a copy of the complaint.
I am traveling to Europe, specifically to Ireland, for 6 days for a work meeting. I thought I could use my phone there. So I looked at my phone provider's services in Europe, and found the "Fido roaming" services: https://www.fido.ca/mobility/roaming The fees, at the time of writing, are fifteen (15!) dollars PER DAY to get access to my regular phone service (not unlimited!!). If I do not use that "roaming" service, the fees are:
  • 2$/min
  • 0.75$/text
  • 10$/20MB
That is absolutely outrageous. Any random phone plan in Europe will be cheaper than this, by at least one order of magnitude. Just to take any example: https://www.tescomobile.ie/sim-only-plans.aspx Those fine folks offer a one-time, prepaid plan for €15 for 28 days which includes:
  • unlimited data
  • 1000 minutes
  • 500 text messages
  • 12GB data elsewhere in Europe
I think it's absolutely scandalous that telecommunications providers in Canada can charge so much money, especially since the most prohibitive fees (the "non-prepaid" plans) are automatically charged if I happen to forget to remove my SIM card or put my phone in "airplane mode". As advised, I have called customer service at Fido for advice on how to handle this situation. They have confirmed those are the only plans available for travelers and could not accommodate me otherwise. I have notified them I was in the process of filing this complaint. I believe that Canada has become the technological dunce of the world, and I blame the CRTC for its lack of regulation in that matter. You should not allow those companies to grow into such a cartel that they can do such price-fixing as they wish. I haven't investigated Fido's competitors, but I will bet at least one of my hats that they do not offer better service. I attach a screenshot of the Fido page showing those outrageous fees.
I have no illusions about this having any effect. I thought of filing such a complaint after the Rogers outage as well, but felt I had less of a standing there because I wasn't affected that much (e.g. I didn't have a life-threatening situation myself). This, however, was ridiculous and frustrating enough to trigger this outrage. We'll see how it goes...
"We will respond to you within 10 working days."

Response from CRTC They did respond within 10 days. Here is the full response:
Dear Antoine Beaupré: Thank you for contacting us about your mobile telephone international roaming service plan rates concern with Fido Solutions Inc. (Fido). In Canada, mobile telephone service is offered on a competitive basis. Therefore, the Canadian Radio-television and Telecommunications Commission (CRTC) is not involved in Fido's terms of service (including international roaming service plan rates), billing and marketing practices, quality of service issues and customer relations. If you haven't already done so, we encourage you to escalate your concern to a manager if you believe the answer you have received from Fido's customer service is not satisfactory. Based on the information that you have provided, this may also appear to be a Competition Bureau matter. The Competition Bureau is responsible for administering and enforcing the Competition Act, and deals with issues such as false or misleading representations, deceptive marketing practices and collusion. You can reach the Competition Bureau by calling 1-800-348-5358 (toll-free), by TTY (for deaf and hard of hearing people) by calling 1-866-694-8389 (toll-free). For more contact information, please visit http://www.competitionbureau.gc.ca/eic/site/cb-bc.nsf/eng/00157.html When consumers are not satisfied with the service they are offered, we encourage them to compare the products and services of other providers in their area and look for a company that can better match their needs. The following tool helps to show choices of providers in your area: https://crtc.gc.ca/eng/comm/fourprov.htm Thank you for sharing your concern with us.
In other words, complain with Fido, or change providers. Don't complain to us, we don't manage the telcos, they self-regulate. Great job, CRTC. This is going great. This is exactly why we're one of the most expensive countries on the planet for cell phone service.

Live chat with Fido Interestingly, the day after I received that response from the CRTC, I received this email from Fido, while traveling:
Date: Tue, 13 Sep 2022 10:10:00 -0400 From: Fido DONOTREPLY@fido.ca To: REDACTED Subject: Roaming notice email / Fido Roaming Welcome Confirmation Fido Date: 13 September 2022
Account number: [redacted] Hello
Antoine Beaupré! We are writing to let you know that at least one user registered on your account recently connected to a network while roaming.
Below you will find the roaming welcome text message sent to that user (or users), which contained the applicable roaming
rates. Roaming welcome text message Recipient: REDACTED Date and time: 2022-09-13 / 10:10:00
Hi, this is Fido: Welcome to your destination! You are enrolled in Fido Nomade (roaming), so use your data, talk and text just like you do at
home. Since March 1, 2022 the rate at this destination is $15/day (+ taxes), valid every day until 11:59 p.m. ET, regardless of the time
zone you are in. Have a good trip! Questions? See fido.ca/m/itinerance or dial +15149333436 (toll-free). Need help?
  • PLAN A TRIP WITH Fido Nomade
    Discover our roaming options and stay in touch abroad without worrying about your
    bill.
Details
  • Manage your account
    Manage your Fido products and services at home or on the road with My
    Account.
Details This email is generated automatically; please do not reply to it. This email (and any attachment) is confidential. If you are not the intended recipient,
please delete this message and destroy all copies. FIDO SOLUTIONS 800, RUE DE LA GAUCHETIÈRE OUEST
BUREAU 4000 MONTRÉAL (QUÉBEC) H5A 1K3
Fido
I found that message (originally in French, which I can read fine) utterly confusing. Basically, it says that some user (presumably me!) connected to the network with roaming. I had just disabled airplane mode on my phone to debug a Syncthing bug but had not enabled roaming. So this message seemed to say that I would be charged 15$ (per DAY!) for roaming from now on. Confused, I tried their live chat to try to clarify things, worried I would get charged even more for calling tech support on *611. This is a transcript of the chat:
F: Hi! What are we doing today? Type in your question or choose from the options below: * Track my Equipment Order * View Bill Online * Payment Options * iPhone 14 Pre-Order A: i received a message about roaming while abroad but i did not enable roaming on my phone, will i be charged anyways? F: I think I know what you re asking for. Select the topic that best matches your request or try rephrasing your question. A: no F: Thank you, this will help us to improve! Would you like to chat with a specialist? Chat with a specialist I'll get a specialist to help you with this. It appears that you're not signed in. Your session may have timed out. To save time and identify your account details, please sign in to My Account.
  • Sign in
  • I'm not able to sign in
Have any questions specific to your Fido account? To service you faster, please identify yourself by completing the form below. A: Personal info Form submitted F: Thank you! I'll connect you with the next available specialist. Your chat is being transferred to a Live Chat agent. Thanks for your patience. We are here to assist you and we kindly ask that our team members be treated with respect and dignity. Please note that abuse directed towards any Consumer Care Specialist will not be tolerated and will result in the termination of your conversation with us. All of our agents are with other customers at the moment. Your chat is in a priority sequence and someone will be with you as soon as possible. Thanks! Thanks for continuing to hold. An agent will be with you as soon as possible. Thank you for your continued patience. We re getting more Live Chat requests than usual so it s taking longer to answer. Your chat is still in a priority sequence and will be answered as soon as an agent becomes available. Thank you so much for your patience we're sorry for the wait. Your chat is still in a priority sequence and will be answered as soon as possible. Hi, I'm [REDACTED] from Fido in [REDACTED]. May I have your name please? A: hi i am antoine, nice to meet you sorry to use the live chat, but it's not clear to me i can safely use my phone to call support, because i am in ireland and i'm worried i'll get charged for the call F: Thank You Antoine , I see you waited to speak with me today, thank you for your patience.Apart from having to wait, how are you today? A: i am good thank you
[... delay ...]
A: should i restate my question? F: Yes please what is the concern you have? A: i have received an email from fido saying i someone used my phone for roaming it's in french (which is fine), but that's the gist of it i am traveling to ireland for a week i do not want to use fido's services here... i have set the phon eto airplane mode for most of my time here F: The SMS just says what will be the charges if you used any services. A: but today i have mistakenly turned that off and did not turn on roaming well it's not a SMS, it's an email F: Yes take out the sim and keep it safe.Turun off or On for roaming you cant do it as it is part of plan. A: wat F: if you used any service you will be charged if you not used any service you will not be charged. A: you are saying i need to physically take the SIM out of the phone? i guess i will have a fun conversation with your management once i return from this trip not that i can do that now, given that, you know, i nee dto take the sim out of this phone fun times F: Yes that is better as most of the customer end up using some kind of service and get charged for roaming. A: well that is completely outrageous roaming is off on the phone i shouldn't get charged for roaming, since roaming is off on the phone i also don't get why i cannot be clearly told whether i will be charged or not the message i have received says i will be charged if i use the service and you seem to say i could accidentally do that easily can you tell me if i have indeed used service sthat will incur an extra charge? are incoming text messages free? F: I understand but it is on you if you used some data SMS or voice mail you can get charged as you used some services.And we cant check anything for now you have to wait for next bill. and incoming SMS are free rest all service comes under roaming. That is the reason I suggested take out the sim from phone and keep it safe or always keep the phone or airplane mode. A: okay can you confirm whether or not i can call fido by voice for support? i mean for free F: So use your Fido sim and call on +1-514-925-4590 on this number it will be free from out side Canada from Fido sim. A: that is quite counter-intuitive, but i guess i will trust you on that thank you, i think that will be all F: Perfect, Again, my name is [REDACTED] and it s been my pleasure to help you today. Thank you for being a part of the Fido family and have a great day! A: you too
So, in other words:
  1. they can't tell me if I've actually been roaming
  2. they can't tell me how much it's going to cost me
  3. I should remove the SIM card from my phone (!?) or turn on airplane mode, but the former is safer
  4. I can call Fido support, but not on the usual *611, and instead on that long-distance-looking phone number, and yes, that means turning off airplane mode and putting the SIM card in, which contradicts step 3
Also notice how the phone number from the live chat (+1-514-925-4590) is different from the one provided in the email (15149333436). So who knows what would have happened if I had called the latter. The former is mentioned in their contact page. I guess the next step is to call Fido over the phone and talk to a manager, which is what the CRTC told me to do in the first place... I ended up talking with a manager (another 1h phone call) and they confirmed there is no other package available at Fido for this. At best they can provide me with a credit if I mistakenly use the roaming by accident to refund me, but that's it. The manager also confirmed that I cannot know if I have actually used any data before reading the bill, which is issued on the 15th of every month, but only available... three days later, at which point I'll be back home anyways. Fantastic.

6 September 2022

Jonathan Dowland: Borg corrupted hints file

I've been using Borg backup for a couple of years and it has seemingly worked very well for me. One difference I really appreciate from my previous arrangement (rdiff-backup) is the freedom to move large files or file hierarchies around (including between different filesystems) without provoking large backup incrementals. About a week ago I had my first real problem with Borg: backups started to fail with the following complaints:
Creating archive at "/backup/borg::{hostname}-home-jon-{now:%Y-%m-%dT%H:%M:%S.%f}"
segment 61916 not found, but listed in compaction data
[ further, similar lines ]
Local Exception
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4690, in main
    exit_code = archiver.run(args)
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4622, in run
    return set_ec(func(args))
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 177, in wrapper
    return method(self, args, repository=repository, **kwargs)
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 595, in do_create
    create_inner(archive, cache)
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 560, in create_inner
    archive.save(comment=args.comment, timestamp=args.timestamp)
  File "/usr/lib/python3/dist-packages/borg/archive.py", line 530, in save
    self.repository.commit()
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 475, in commit
    self.compact_segments()
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 835, in compact_segments
    assert segments[segment] == 0, 'Corrupted segment reference count - corrupted index or hints'
AssertionError: Corrupted segment reference count - corrupted index or hints
At about the same time I had managed to fill the backup host's root filesystem. I thought the two issues must be related. Although all the files Borg is backing up, and the backup repository it writes to, are located on different partitions, Borg's client side does maintain some caching in /root/.cache/borg. My first idea was that this must have been corrupted by an aborted write, but zapping it did not cure the above problem. It occurred to me that I run Borg via the convenience wrapper Borgmatic, and it was possible that was failing, but after a short investigation I ruled that out. Various attempts at running borg check or borg check --repair didn't help either. The underlying filesystem (XFS) passed a filesystem check. There weren't any obvious complaints about IO errors from the kernel or anything reported in the HDD's SMART data. What did work, in the end, was removing the file matching $BORG_REPO/*hint* and trying again. Although this file is read/written on the backup partition, it seems filling the root partition caused Borg (1.1.16-3) to corrupt it. Everything seems fine following that. I have recently started trying to semi-automatically verify backups on a monthly basis, on a machine independent from the NAS; all the tests I have written so far have passed.
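For the record, the workaround amounts to something like the following sketch (BORG_REPO here is a hypothetical path standing in for the real repository; moving the hints file aside rather than deleting it keeps a copy around just in case):
$ export BORG_REPO=/backup/borg
$ mkdir -p /root/borg-salvage
$ mv "$BORG_REPO"/hints.* /root/borg-salvage/
$ borg check "$BORG_REPO"
Borg should regenerate the hints file on the next repository access, and the final borg check verifies that the repository is healthy afterwards.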

3 September 2022

James Valleroy: File sharing with bepasty

One of the apps running on my FreedomBox that I use frequently is bepasty. bepasty is essentially a self-hosted, free software pastebin. It allows you to paste text, or upload any type of file. You can also set an expiration date for when the file or text will automatically be deleted. If you are uploading multiple related files, you can organize them into a list. bepasty does not have user accounts. Instead, it has shared passwords, where each password is linked to a set of permissions. There are five permissions: Read, List, Create, Delete, and Admin. (The meanings are mostly straightforward, except for Admin, which means the ability to lock and unlock files.) This allows very fine-grained control. For example, if you want someone to be able to upload files to your bepasty, but not view or download anything, then you can generate a password with only the Create permission, and give this password to the person who will be uploading files. To simplify the initial setup in FreedomBox, we generate three passwords by default: one for viewers (List and Read), one for editors (List, Read, Create, and Delete), and one for admins (all permissions). In addition, when no password has been provided, the Read (but not List) permission is granted by default. This allows files to be easily shared by sending just their URLs (no password required). The URLs contain some random characters, so they are not easy to guess. I mostly use bepasty for moving files between systems, whether it's a physical machine or VPS, or a VM or container that I will use only briefly. Especially in the latter case, it's nice that I don't need to do any extra setup (such as copying SSH keys) before I copy my files over. The bepasty package is available in Debian stable (with a newer version in stable-backports and testing). The many use-cases that it provides, and the well-maintained Debian packaging, made it a compelling choice for integration into FreedomBox, which has included bepasty for one-click installation since version 20.14.

2 September 2022

John Goerzen: Dead USB Drives Are Fine: Building a Reliable Sneakernet

OK, you're probably thinking: John, you talk a lot about things like Gopher and personal radios, and now you want to talk about building a reliable network out of USB drives? Well, yes. In fact, I've already done it.

What is sneakernet? Normally, sneakernet is a sort of tongue-in-cheek reference to using disconnected storage to transport data or messages. By disconnected storage I mean anything like CD-ROMs, hard drives, SD cards, USB drives, and so forth. There are times when loading up 12TB on a device and driving it across town is just faster and easier than using the Internet for the same. And sometimes you need to get data to places that have no Internet at all. Another reason for sneakernet is security. For instance, if your backup system is online, and the systems being backed up are online, then it could become possible for an attacker to destroy both your primary copy of data and your backups. Or, you might use a dedicated computer with no network connection to do GnuPG (GPG) signing.

What about reliable sneakernet, then? TCP is often considered a reliable protocol. That means that the sending side is generally able to tell if its message was properly received. As with most reliable protocols, we have these components:
  1. After transmitting a piece of data, the sender retains it.
  2. After receiving a piece of data, the receiver sends an acknowledgment (ACK) back to the sender.
  3. Upon receiving the acknowledgment, the sender removes its buffered copy of the data.
  4. If no acknowledgment is received at the sender, it retransmits the data, in case it gets lost in transit.
  5. It reorders any packets that arrive out of order, so that the recipient's data stream is ordered correctly.
Now, a lot of the things I just mentioned for sneakernet are legendarily unreliable. USB drives fail, CD-ROMs get scratched, hard drives get banged up. Think about putting these things in a bicycle bag or airline luggage. Some of them are going to fail. You might think, well, I'll just copy files to a USB drive instead of moving them, and once I get them onto the destination machine, I'll delete them from the source. Congratulations! You are a human retransmit algorithm! We should be able to automate this! And we can.

Enter NNCP NNCP is one of those things that almost defies explanation. It is a toolkit for building asynchronous networks. It can use as a carrier: a pipe, a TCP network connection, a mounted filesystem (specifically intended for cases like this), and much more. It also supports multi-hop asynchronous routing and asynchronous meshing, but these are beyond the scope of this particular article. NNCP's transports that involve live communication between two hops already had all the hallmarks of being reliable; there was a positive ACK and retransmit. As of version 8.7.0, NNCP's ACKs themselves can also be asynchronous, meaning that every NNCP transport can now be reliable. Yes, that's right: your ACKs can flow over tapes and USB drives if you want them to. I use this for archiving and backups. If you aren't already familiar with NNCP, you might take a look at my NNCP page. I also have a lot of blog posts about NNCP. Those pages describe the basics of NNCP: the packet (the unit of transmission in NNCP, which can be tiny or many TB), the end-to-end encryption, and so forth. The new command we will now be interested in is nncp-ack.

The Basic Idea Here are the basic steps to processing this stuff with NNCP:
  1. First, we use nncp-xfer -rx to process incoming packets from the USB (or other media) device. This moves them into the NNCP inbound queue, deleting them from the media device, and verifies the packet integrity.
  2. We use nncp-ack -node $NODE to create ACK packets responding to the packets we just loaded into the rx queue. It writes a list of generated ACKs onto fd 4, which we save off for later use.
  3. We run nncp-toss -seen to process the incoming queue. The use of -seen causes NNCP to remember the hashes of packets seen before, so a duplicate of an already-seen packet will not be processed twice. This command also processes incoming ACKs for packets we've sent out previously; if they pass verification, the relevant packets are removed from the local machine's tx queue.
  4. Now, we use nncp-xfer -keep -tx -mkdir -node $NODE to send outgoing packets to a given node by writing them to a given directory on the media device. -keep causes them to remain in the outgoing queue.
  5. Finally, we use the list of generated ACK packets saved off in step 2 above. That list is passed to nncp-rm -node $NODE -pkt < $FILE to remove those specific packets from the outbound queue. The reason is that there will never be an ACK of an ACK packet (that would create an infinite loop), so if we don't delete them in this manner, they would hang around forever.
You can see these steps follow the same basic outline on upstream's nncp-ack page. One thing to keep in mind: if anything else is running nncp-toss, there is a chance of a race condition between steps 1 and 2 (if nncp-toss gets to it first, an ACK might not get generated). This would sort itself out eventually, presumably, as the sender would retransmit and it would be ACKed later.
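One simple way to reduce that race, which is my own suggestion rather than something from the NNCP documentation, is to serialize the media-processing script against whatever else invokes nncp-toss, for example with flock (the lock file and script name below are hypothetical):
$ flock /var/lock/nncp-sneakernet.lock /usr/local/bin/process-media.sh
Any cron job that runs nncp-toss can take the same lock, so the two never interleave.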

Further ideas NNCP guarantees the integrity of packets, but not ordering between packets; if you need that, you might look into my Filespooler program. It is designed to work with NNCP and can provide ordered processing.

An example script Here is a script you might try for this sort of thing. It may have more logic than you need (really, you just need the steps above), but hopefully it is clear.
#!/bin/bash
set -eo pipefail
MEDIABASE="/media/$USER"
# The local node name
NODENAME="$(hostname)"
# All nodes.  NODENAME should be in this list.
ALLNODES="node1 node2 node3"
RUNNNCP=""
# If you need to sudo, use something like RUNNNCP="sudo -Hu nncp"
NNCPPATH="/usr/local/nncp/bin"
ACKPATH="$(mktemp -d)"
# Process incoming packets.
#
# Parameters: $1 - the path to scan.  Must contain a directory
# named "nncp".
procrxpath () {
    while [ -n "$1" ]; do
        BASEPATH="$1/nncp"
        shift
        if ! [ -d "$BASEPATH" ]; then
            echo "$BASEPATH doesn't exist; skipping"
            continue
        fi
        echo " *** Incoming: processing $BASEPATH"
        TMPDIR="$(mktemp -d)"
        # This rsync and the one below can help with
        # certain permission issues from weird foreign
        # media.  You could just eliminate it and
        # always use $BASEPATH instead of $TMPDIR below.
        rsync -rt "$BASEPATH/" "$TMPDIR/"
        # You may need these next two lines if using sudo as above.
        # chgrp -R nncp "$TMPDIR"
        # chmod -R g+rwX "$TMPDIR"
        echo "     Running nncp-xfer -rx"
        $RUNNNCP $NNCPPATH/nncp-xfer -progress -rx "$TMPDIR"
        for NODE in $ALLNODES; do
                if [ "$NODE" != "$NODENAME" ]; then
                        echo "     Running nncp-ack for $NODE"
                        # Now, we generate ACK packets for each node we will
                        # process.  nncp-ack writes a list of the created
                        # ACK packets to fd 4.  We'll use them later.
                        # If using sudo, add -C 5 after $RUNNNCP.
                        $RUNNNCP $NNCPPATH/nncp-ack -progress -node "$NODE" \
                           4>> "$ACKPATH/$NODE"
                fi
        done
        rsync --delete -rt "$TMPDIR/" "$BASEPATH/"
        rm -fr "$TMPDIR"
    done
}
proctxpath () {
    while [ -n "$1" ]; do
        BASEPATH="$1/nncp"
        shift
        if ! [ -d "$BASEPATH" ]; then
            echo "$BASEPATH doesn't exist; skipping"
            continue
        fi
        echo " *** Outgoing: processing $BASEPATH"
        TMPDIR="$(mktemp -d)"
        rsync -rt "$BASEPATH/" "$TMPDIR/"
        # You may need these two lines if using sudo:
        # chgrp -R nncp "$TMPDIR"
        # chmod -R g+rwX "$TMPDIR"
        for DESTHOST in $ALLNODES; do
            if [ "$DESTHOST" = "$NODENAME" ]; then
                continue
            fi
            # Copy outgoing packets to this node, but keep them in the outgoing
            # queue with -keep.
            $RUNNNCP $NNCPPATH/nncp-xfer -keep -tx -mkdir -node "$DESTHOST" -progress "$TMPDIR"
            # Here is the key: that list of ACK packets we made above - now we delete them.
            # There will never be an ACK for an ACK, so they'd keep sending forever
            # if we didn't do this.
            if [ -f "$ACKPATH/$DESTHOST" ]; then
                echo "nncp-rm for node $DESTHOST"
                $RUNNNCP $NNCPPATH/nncp-rm -debug -node "$DESTHOST" -pkt < "$ACKPATH/$DESTHOST"
            fi
        done
        rsync --delete -rt "$TMPDIR/" "$BASEPATH/"
        rm -rf "$TMPDIR"
        # We only want to write stuff once.
        return 0
    done
}
procrxpath "$MEDIABASE"/*
echo " *** Initial tossing..."
# We make sure to use -seen to rule out duplicates.
$RUNNNCP $NNCPPATH/nncp-toss -progress -seen
proctxpath "$MEDIABASE"/*
echo "You can unmount devices now."
echo "Done."

This post is also available on my website, where it may be periodically updated.

1 September 2022

Shirish Agarwal: Culture, Books, Friends

Culture Just before I start, I would like to point out that this post may or would probably be NSFW. Again, what is SFW (Safe at Work) and NSFW that so much depends on culture and perception of culture from wherever we are or wherever we take birth? But still, to be on the safe side I have put it as NSFW. Now there have been a few statements and ideas that gave me a pause. This will be a sort of chaotic blog post as I am in such a phase today. For e.g. while I do not know which culture or which country this comes from, somebody shared that in some cultures one can talk/comment May your poop be easy and with a straight face. I dunno which culture is this but if somebody asked me that I would just die from laughing or maybe poop there itself. While I can understand if it is a constipated person, but a whole culture? Until and unless their DNA is really screwed, I don t think so but then what do I know? I do know that we shit when we have extreme reactions of either joy or fear. And IIRC, this comes from mammal response when they were in dangerous situations and we got the same as humans evolved. I would really be interested to know which culture is that. I did come to know that the Japanese do wish that you may not experience hard work or something to that effect while ironically they themselves are becoming extinct due to hard work and not enough relaxation, toxic workplace is common in Japan according to social scientists and population experts. Another term that I couldn t figure out is The Florida Man Strikes again and this term is usually used when somebody does something stupid or something weird. While it is exclusively used in the American context, I am curious to know how that came about. Why does Florida have such people or is it an exaggeration? I have heard the term e.g. What happens in Vegas, stays in Vegas . Think it is also called Sin city although why just Vegas is beyond me?

Omicron-8712 Blood pressure machine I felt so stupid. I found another site or e-commerce site called Wellness Forever. They had the blood pressure machine I wanted, an Omron-8172. I bought it online and they delivered the same within half an hour. Amazon took six days and in the end, didn t deliver it at all. I tried taking measurements from it yesterday. I have yet to figure out what it all means but I did get measurements of 109 SYS, 88 DIA and Pulse is 72. As far as the pulse is concerned, guess that is normal, the others just don t know. If only I had known this couple of months ago. I was able to register the product as well as download and use the Omron Connect app. For roughly INR 2.5k you have a sort of health monitoring system. It isn t Star Trek Tricorder in any shape or form but it will have to do while the tricorder gets invented. And while we are on the subject let s not forget Elizabeth Holmes and the scam called Theranos. It really is something to see How Elizabeth Holmes modeled so much of herself on Steve Jobs mimicking how he left college/education halfway. A part of me is sad that Theranos is not real. Joe Scott just a few days ago shared some perspectives on the same just a few days ago. The idea in itself is pretty seductive, to say the least, and that is the reason the scam went on for more than a decade and perhaps would have been longer if some people hadn t gotten the truth out. I do see potentially, something like that coming on as A.I. takes a bigger role in automating testing. Half a decade to a decade from now, who knows if there is an algorithm that is able to do what is needed? If such a product were to come to the marketplace at a decent price, it would revolutionize medicine, especially in countries like India, South Africa, and all sorts of remote places. Especially, with all sorts of off-grid technologies coming and maturing in the marketplace. Before I forget, there is a game called Cell on Android that tells or shares about the evolution of life on earth. It also shares credence to the idea that life has come 6 times on Earth and has been destroyed multiple times by asteroids. It is in the idle sort of game format, so you can see the humble beginnings from the primordial soup to various kinds of cells and bacteria to finally a mammal. This is where I am and a long way to go.

Indian Bureaucracy One of the few things that Britishers gave to India, is the bureaucracy and the bureaucracy tests us in myriad ways. It would be full 2 months on 5th September and I haven t yet got a death certificate. And I need that for a sundry number of things. The same goes for a disability certificate. What is and was interesting is my trip to the local big hospital called Sassoon Hospital. My mum had shared incidents that occurred in the 1950s when she and the family had come to Pune. According to her, when she was alive, while Sassoon was the place to be, it was big and chaotic and you never knew where you are going. That was in 1950, I had the same experience in 2022. The term/adage the more things change, the more they remain the same seems to be held true for Sassoon Hospital. Btw, those of you who think the Devil exists, he is totally a fallacy. There is a popular myth that the devil comes to deal that he/she/they come to deal with you when somebody close to you passes, I was waiting desperately for him when mum passed. Any deal that he/she/they would have offered me I would have gladly taken, but all my wait was all for nothing. While I believe evil exists, that is manifested by humans and nobody else. The whole idea and story of the devil is just to control young children and nothing beyond that

Debconf 2023, friends, JPEGOptim, and EV s Quite a number of friends had gone to Albania this year as India won the right to host Debconf for the year 2023. While I did lurk on the Debconf orga IRC channel, I m not sure how helpful I would be currently. One news that warmed my heart is some people would be coming to India to check the site way before and make sure things go smoothly. Nothing like having more eyes (in this case bodies) to throw at a problem and hopefully it will be sorted. While I have not been working for the last couple of years, one of the things that I had to do and have been doing is moving a lot of stuff online. This is in part due to the Government s own intention of having everything on the cloud. One of the things I probably may have shared it more than enough times is that the storage most of these sites give is like the 1990s. I tried jpegoptim and while it works, it degrades the quality of the image quite a bit. The whole thing seems backward, especially as newer and newer smartphones are capturing more data per picture (megapixel resolution), case in point Samsung Galaxy A04 that is being introduced. But this is not only about newer phones, even my earlier phone, Samsung J-5/500 which I bought in 2016 took images at 5 MB. So it is not a new issue but a continuous issue. And almost all Govt. sites have the upper band fixed at 1 MB. But this is not limited to Govt. sites alone, most sites in India are somewhat frozen in the 1990s. And it isn t as if resources for designing web pages using HTML5, CSS3, Javascript, Python, or Java aren t available. If worse comes to worst, one can even use amp to make his, her or their point. But this is if they want to do stuff. I would be sharing a few photos with commentary, there are still places where I can put photos apart from social media

Friends Last week, Saturday suddenly all the friends decided to show up. I have no clue one way or the other why but am glad they showed up.
Mahendra, Akshat, Shirish and Sagar Sukhose (Mangesh's friend) at Bal Gandharva.
Electric scooter, as shared by Akshat, seen in Albania.
Somebody making a real-life replica of Wall Street on F.C. Road (commercial, all glass).
Ganesh idol near my house.
Wearing new clothes.
I will have to be a bit rapid about what I am sharing above so here goes nothing

1. The first picture shows Mahendra, Akshat, me, and Sagar Sukhose (Mangesh s friend). The picture was taken by Mangesh Diwate. We talked quite a bit of various things that could be done in Debian. A few of the things that I shared were (bringing more stuff from BSD to Debian, I am sure there s still quite a lot of security software that could be advantageous to have in Debian.) The best person to talk to or guide about this would undoubtedly be Paul Wise or as he is affectionally called Pabs. He is one of the shy ones and yet knows so much about how things work. The one and only time I met him is 2016. The other thing that we talked about is porting Debian to one of the phones. This has been done in the past and done by a Puneitie some 4-5 years back. While I don t recollect the gentleman s name, I remember that the porting was done on a Motorola phone as that was the easiest to do. He had tried some other mobile but that didn t work. Making Debian available on phone is hard work. Just to have an idea, I went to the xda developers forum and found out that while M51 has been added, my specific phone model is not there. A Samsung Galaxy M52G Android (samsung; SM-M526B; lahaina; arm64-v8a) v12 . You look at the chat and you understand how difficult the process might be. One of the other ideas that Akshat pitched was Debian Astro, this is something that is close to the heart of many, including me. I also proposed to have some kind of web app or something where we can find and share about the various astronomy and related projects done by various agencies. While there is a NASA app, nothing comes close to JSR and that site just shares stuff, no speculation. There are so many projects taken or being done by the EU, JAXA, ISRO, and even middle-east countries are trying but other than people who are following some of the developments, we hear almost nothing. Even the Chinese have made some long strides but most people know nothing about the same. And it s sad to know that those developments are not being known, shared, or even speculated about as much as say NASA or SpaceX is. How do we go about it and how do we get people to contribute or ask questions around it would be interesting. 2. The second picture was something that was shared by Akshat. Akshat was sharing how in Albania people are moving on these electric scooters . I dunno if that is the right word for it or what. I had heard from a couple of friends who had gone to Vietnam a few years ago how most people in Vietnam had modified their scooters and they were snaking lines of electric wires charging scooters. I have no clue whether they were closer to Vespa or something like above. In India, the Govt. is in partnership with the oil, gas, and coal mafia just as it was in Australia (the new Govt. in Australia is making changes) the same thing is here. With the humongous profits that the oil sector provides the petro states and others, Corruption is bound to happen. We talk and that s the extent of things. 3. The third picture is from a nearby area called F.C. Road or Fergusson College Road. The area has come up quite sharply (commercially) in the last few years. Apparently, Mr. Kushal is making a real-life replica of Wall Street which would be given to commercial tenants. Right now the real estate market is tight in India, we will know how things pan out in the next few years. 4. Number four is an image of a Ganesh idol near my house. There is a 10-day festival of the elephant god that people used to celebrate every year. 
For the last couple of years because of the pandemic, people were unable to celebrate the festival as it is meant to celebrate. This time some people are going overboard while others are cautious and rightfully so. 5. Last and not least, one of the things that people do at this celebration is to have new clothes, so I shared a photo of a gentleman who had bought and was wearing new clothes. While most countries around the world are similar, Latin America is very similar to India in many ways, perhaps Gunnar can share. especially about religious activities. The elephant god is known for his penchant for sweets and that can be seen from his rounded stomach, that is also how he is celebrated. He is known to make problems disappear or that is supposed to be his thing. We do have something like 4 billion gods, so each one has to be given some work or quality to justify the same

30 August 2022

John Goerzen: The PC & Internet Revolution in Rural America

Inspired by several others (such as Alex Schroeder's post and Szczeżuja's prompt), as well as a desire to get this down for my kids, I figure it's time to write a bit about living through the PC and Internet revolution where I did: outside a tiny town in rural Kansas. And, as I've been back in that same area for the past 15 years, I reflect some on the challenges that continue to play out. Although the stories from the others were primarily about getting online, I want to start by setting some background. Those of you that didn't grow up in the same era as I did probably never realized that a typical business PC setup might cost $10,000 in today's dollars, for instance. So let me start with the background.

Nothing was easy This story begins in the 1980s. Somewhere around my Kindergarten year of school, around 1985, my parents bought a TRS-80 Color Computer 2 (aka CoCo II). It had 64K of RAM and used a TV for display and sound. This got you the computer. It didn't get you a disk drive or anything else, and no joysticks (required by a number of games). So whenever the system powered down, or it hung and you had to power cycle it (a frequent event), you'd lose whatever you were doing and would have to re-enter the program, literally by typing it in. The floppy drive for the CoCo II cost more than the computer, and it was quite common for people to buy the computer first and then the floppy drive later when they'd saved up the money for it. I particularly want to mention that computers then didn't come with a modem. That would be like buying a laptop or a tablet without wifi today. A modem, which I'll talk about in a bit, was another expensive accessory. To cobble together a system in the 80s that was capable of talking to others, with persistent storage (floppy or hard drive), screen, keyboard, and modem, would be quite expensive. Adjusted for inflation, if you're talking a PC-style device (a clone of the IBM PC that ran DOS), this would easily be more expensive than the MacBook Pros of today. Few people back in the 80s had a computer at home. And the portion of those that had even the capability to get online in a meaningful way was even smaller. Eventually my parents bought a PC clone with 640K RAM and dual floppy drives. This was primarily used for my mom's work, but I did my best to take it over whenever possible. It ran DOS and, despite its monochrome screen, was generally a more capable machine than the CoCo II. For instance, it supported lowercase. (I'm not even kidding; the CoCo II pretty much didn't.) A while later, they purchased a 32MB hard drive for it; what luxury! Just getting a machine to work wasn't easy. Say you'd bought a PC, and then bought a hard drive, and a modem. You didn't just plug in the hard drive and it would work. You would have to fight it every step of the way. The BIOS and DOS partition tables of the day used a cylinder/head/sector method of addressing the drive, and various parts of those addresses had too few bits to work with the big drives of the day (above 20MB). So you would have to lie to the BIOS and fdisk in various ways, and sort of work out how to do it for each drive. For each peripheral (serial port, sound card in later years, etc.), you'd have to set jumpers for DMA and IRQs, hoping not to conflict with anything already in the system. Perhaps you can now start to see why USB and PCI were so welcomed.
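To give a feel for the kind of ceilings those too-few bits imposed, here is some back-of-the-envelope shell arithmetic (my own illustration, not from the original post; the exact limits varied by DOS and BIOS version):
# 16-bit sector counts in early DOS partitioning: at most 2^16 sectors of 512 bytes
echo $(( 65536 * 512 ))           # 33554432 bytes, the old ~32MB partition ceiling
# The later BIOS/IDE CHS mismatch: 1024 cylinders x 16 heads x 63 sectors x 512 bytes
echo $(( 1024 * 16 * 63 * 512 ))  # 528482304 bytes, the famous ~504MB limit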

Sharing and finding resources Despite the two computers in our home, it wasn t as if software written on one machine just ran on another. A lot of software for PC clones assumed a CGA color display. The monochrome HGC in our PC wasn t particularly compatible. You could find a TSR program to emulate the CGA on the HGC, but it wasn t particularly stable, and there s only so much you can do when a program that assumes color displays on a monitor that can only show black, dark amber, or light amber. So I d periodically get to use other computers most commonly at an office in the evening when it wasn t being used. There were some local computer clubs that my dad took me to periodically. Software was swapped back then; disks copied, shareware exchanged, and so forth. For me, at least, there was no online to download software from, and selling software over the Internet wasn t a thing at all.

Three Different Worlds There were sort of three different worlds of computing experience in the 80s:
  1. Home users. Initially using a wide variety of software from Apple, Commodore, Tandy/RadioShack, etc., but eventually coming to be mostly dominated by IBM PC clones
  2. Small and mid-sized business users. Some of them had larger minicomputers or small mainframes, but most that I had contact with by the early 90s were standardized on DOS-based PCs. More advanced ones had a network running Netware, most commonly. Networking hardware and software was generally too expensive for home users to use in the early days.
  3. Universities and large institutions. These are the places that had the mainframes, the earliest implementations of TCP/IP, the earliest users of UUCP, and so forth.
The difference between the home computing experience and the large institution experience were vast. Not only in terms of dollars the large institution hardware could easily cost anywhere from tens of thousands to millions of dollars but also in terms of sheer resources required (large rooms, enormous power circuits, support staff, etc). Nothing was in common between them; not operating systems, not software, not experience. I was never much aware of the third category until the differences started to collapse in the mid-90s, and even then I only was exposed to it once the collapse was well underway. You might say to me, Well, Google certainly isn t running what I m running at home! And, yes of course, it s different. But fundamentally, most large datacenters are running on x86_64 hardware, with Linux as the operating system, and a TCP/IP network. It s a different scale, obviously, but at a fundamental level, the hardware and operating system stack are pretty similar to what you can readily run at home. Back in the 80s and 90s, this wasn t the case. TCP/IP wasn t even available for DOS or Windows until much later, and when it was, it was a clunky beast that was difficult. One of the things Kevin Driscoll highlights in his book called Modem World see my short post about it is that the history of the Internet we usually receive is focused on case 3: the large institutions. In reality, the Internet was and is literally a network of networks. Gateways to and from Internet existed from all three kinds of users for years, and while TCP/IP ultimately won the battle of the internetworking protocol, the other two streams of users also shaped the Internet as we now know it. Like many, I had no access to the large institution networks, but as I ve been reflecting on my experiences, I ve found a new appreciation for the way that those of us that grew up with primarily home PCs shaped the evolution of today s online world also.

An Era of Scarcity I should take a moment to comment about the cost of software back then. A newspaper article from 1985 comments that WordPerfect, then the most powerful word processing program, sold for $495 (or $219 if you could score a mail order discount). That's $1360/$600 in 2022 money. Other popular software, such as Lotus 1-2-3, was up there as well. If you were to buy a new PC clone in the mid to late 80s, it would often cost $2000 in 1980s dollars. Now add a printer: a low-end dot matrix for $300, or a laser for $1500 or even more. A modem: another $300. So the basic system would be $3600, or $9900 in 2022 dollars. If you wanted a nice printer, you're now pushing well over $10,000 in 2022 dollars. You start to see one barrier here, and also why things like shareware and piracy (if it was indeed even recognized as such) were common in those days. So you can see that going from a home computer setup (TRS-80, Commodore C64, Apple ][, etc.) to a business-class PC setup was an order of magnitude increase in cost. From there to the high-end minis/mainframes was another order of magnitude (at least!) increase. Eventually there was price pressure on the higher end and things all got better, which is probably why the non-DOS PCs lasted until the early 90s.
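For what it's worth, one plausible way to arrive at that $3600 figure (my own reconstruction, not itemized in the original; the Lotus price here is an assumption) is the hardware plus the two flagship applications:
# Hypothetical itemization of the ~$3600 mid-80s system quoted above
echo $(( 2000 + 300 + 300 + 495 + 495 ))   # PC clone + dot matrix + modem + WordPerfect + Lotus = 3590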

Increasing Capabilities My first exposure to computers in school was in the 4th grade, when I would have been about 9. There was a single Apple ][ machine in that room. I primarily remember playing Oregon Trail on it. The next year, the school added a computer lab. Remember, this is a small rural area, so each graduating class might have about 25 people in it; this lab was shared by everyone in the K-8 building. It was full of some flavor of IBM PS/2 machines running DOS and Netware. There was a dedicated computer teacher too, though I think she was a regular teacher that was given somewhat minimal training on computers. We were going to learn typing that year, but I did so well on the very first typing program that we soon worked out that I could do programming instead. I started going to school early these machines were far more powerful than the XT at home and worked on programming projects there. Eventually my parents bought me a Gateway 486SX/25 with a VGA monitor and hard drive. Wow! This was a whole different world. It may have come with Windows 3.0 or 3.1 on it, but I mainly remember running OS/2 on that machine. More on that below.

Programming That CoCo II came with a BASIC interpreter in ROM. It came with a large manual, which served as a BASIC tutorial as well. The BASIC interpreter was also the shell, so literally you could not use the computer without at least a bit of BASIC. Once I had access to a DOS machine, it also had a basic interpreter: GW-BASIC. There was a fair bit of software written in BASIC at the time, but most of the more advanced software wasn t. I wondered how these .EXE and .COM programs were written. I could find vague references to DEBUG.EXE, assemblers, and such. But it wasn t until I got a copy of Turbo Pascal that I was able to do that sort of thing myself. Eventually I got Borland C++ and taught myself C as well. A few years later, I wanted to try writing GUI programs for Windows, and bought Watcom C++ much cheaper than the competition, and it could target Windows, DOS (and I think even OS/2). Notice that, aside from BASIC, none of this was free, and none of it was bundled. You couldn t just download a C compiler, or Python interpreter, or whatnot back then. You had to pay for the ability to write any kind of serious code on the computer you already owned.

The Microsoft Domination Microsoft came to dominate the PC landscape, and then even the computing landscape as a whole. IBM very quickly lost control over the hardware side of PCs as Compaq and others made clones, but Microsoft has managed in varying degrees even to this day to keep a stranglehold on the software, and especially the operating system, side. Yes, there was occasional talk of things like DR-DOS, but by and large the dominant platform came to be the PC, and if you had a PC, you ran DOS (and later Windows) from Microsoft. For awhile, it looked like IBM was going to challenge Microsoft on the operating system front; they had OS/2, and when I switched to it sometime around the version 2.1 era in 1993, it was unquestionably more advanced technically than the consumer-grade Windows from Microsoft at the time. It had Internet support baked in, could run most DOS and Windows programs, and had introduced a replacement for the by-then terrible FAT filesystem: HPFS, in 1988. Microsoft wouldn t introduce a better filesystem for its consumer operating systems until Windows XP in 2001, 13 years later. But more on that story later.

Free Software, Shareware, and Commercial Software I ve covered the high cost of software already. Obviously $500 software wasn t going to sell in the home market. So what did we have? Mainly, these things:
  1. Public domain software. It was free to use, and if implemented in BASIC, probably had source code with it too.
  2. Shareware
  3. Commercial software (some of it from small publishers was a lot cheaper than $500)
Let s talk about shareware. The idea with shareware was that a company would release a useful program, sometimes limited. You were encouraged to register , or pay for, it if you liked it and used it. And, regardless of whether you registered it or not, were told please copy! Sometimes shareware was fully functional, and registering it got you nothing more than printed manuals and an easy conscience (guilt trips for not registering weren t necessarily very subtle). Sometimes unregistered shareware would have a nag screen a delay of a few seconds while they told you to register. Sometimes they d be limited in some way; you d get more features if you registered. With games, it was popular to have a trilogy, and release the first episode inevitably ending with a cliffhanger as shareware, and the subsequent episodes would require registration. In any event, a lot of software people used in the 80s and 90s was shareware. Also pirated commercial software, though in the earlier days of computing, I think some people didn t even know the difference. Notice what s missing: Free Software / FLOSS in the Richard Stallman sense of the word. Stallman lived in the big institution world after all, he worked at MIT and what he was doing with the Free Software Foundation and GNU project beginning in 1983 never really filtered into the DOS/Windows world at the time. I had no awareness of it even existing until into the 90s, when I first started getting some hints of it as a port of gcc became available for OS/2. The Internet was what really brought this home, but I m getting ahead of myself. I want to say again: FLOSS never really entered the DOS and Windows 3.x ecosystems. You d see it make a few inroads here and there in later versions of Windows, and moreso now that Microsoft has been sort of forced to accept it, but still, reflect on its legacy. What is the software market like in Windows compared to Linux, even today? Now it is, finally, time to talk about connectivity!

Getting On-Line What does it even mean to get on line? Certainly not connecting to a wifi access point. The answer is, unsurprisingly, complex. But for everyone except the large institutional users, it begins with a telephone.

The telephone system By the 80s, there was one communication network that already reached into nearly every home in America: the phone system. Virtually every household (note I don t say every person) was uniquely identified by a 10-digit phone number. You could, at least in theory, call up virtually any other phone in the country and be connected in less than a minute. But I ve got to talk about cost. The way things worked in the USA, you paid a monthly fee for a phone line. Included in that monthly fee was unlimited local calling. What is a local call? That was an extremely complex question. Generally it meant, roughly, calling within your city. But of course, as you deal with things like suburbs and cities growing into each other (eg, the Dallas-Ft. Worth metroplex), things got complicated fast. But let s just say for simplicity you could call others in your city. What about calling people not in your city? That was long distance , and you paid often hugely by the minute for it. Long distance rates were difficult to figure out, but were generally most expensive during business hours and cheapest at night or on weekends. Prices eventually started to come down when competition was introduced for long distance carriers, but even then you often were stuck with a single carrier for long distance calls outside your city but within your state. Anyhow, let s just leave it at this: local calls were virtually free, and long distance calls were extremely expensive.

Getting a modem I remember getting a modem that ran at either 1200bps or 2400bps. Either way, quite slow; you could often read even plain text faster than the modem could display it. But what was a modem? A modem hooked up to a computer with a serial cable, and to the phone system. By the time I got one, modems could automatically dial and answer. You would send a command like ATDT5551212 and it would dial 555-1212. Modems had speakers, because often things wouldn't work right, and the telephone system was oriented around speech, so you could hear what was happening. You'd hear it wait for dial tone, then dial, then hopefully the remote end would ring, a modem there would answer, you'd hear the screeching of a handshake, and eventually your terminal would say CONNECT 2400. Now your computer was bridged to the other; anything going out your serial port was encoded as sound by your modem and decoded at the other end, and vice-versa. But what, exactly, was the other end? It might have been another person at their computer. Turn on local echo, and you can see what they did. Maybe you'd send files to each other. But in my case, the answer was different: PC Magazine.
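For the curious, driving a Hayes-compatible modem by hand looks roughly like this from a Unix shell (a sketch of my own; the device path /dev/ttyS0 is an assumption, and a terminal program normally handles all of this for you):
stty -F /dev/ttyS0 2400 raw -echo      # set the serial line to 2400bps, raw mode
cat /dev/ttyS0 &                       # print whatever the modem sends back: OK, CONNECT 2400, ...
printf 'ATZ\r' > /dev/ttyS0            # reset the modem
sleep 1
printf 'ATDT5551212\r' > /dev/ttyS0    # dial 555-1212, as described above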

PC Magazine and CompuServe Starting around 1986 (so I would have been about 6 years old), I got to read PC Magazine. My dad would bring home copies that were being discarded at his office for me to read, and I think eventually bought me a subscription directly. This was not just a standard magazine; it ran something like 350-400 pages an issue, and came out every other week. This thing was a monster. It had reviews of hardware and software, descriptions of upcoming technologies, pages and pages of ads (that often had some degree of being informative to them). And they had sections on programming. Many issues would talk about BASIC or Pascal programming, and there'd be a utility in most issues. What do I mean by a "utility in most issues"? Did they include a floppy disk with software? No, of course not. There was a literal program listing printed in the magazine. If you wanted the utility, you had to type it in. And a lot of them were written in assembler, so you had to have an assembler. An assembler, of course, was not free and I didn't have one. Or maybe they wrote it in Microsoft C, and I had Borland C, and (of course) they weren't compatible. Sometimes they would list the program sort of in binary: line after line of a BASIC program, with lines like 64, 193, 253, 0, 53, 0, 87 that you would type in for hours, hopefully correctly. Running the BASIC program would, if you got it correct, emit a .COM file that you could then run. They did have a rudimentary checksum system built in, but it wasn't even a CRC, so something like swapping two numbers you'd never notice, except when the program would mysteriously hang. Eventually they teamed up with CompuServe to offer a limited slice of CompuServe for the purpose of downloading PC Magazine utilities. This was called PC MagNet. I am foggy on the details, but I believe that for a time you could connect to the limited PC MagNet part of CompuServe for free (after the cost of the long-distance call, that is) rather than paying for CompuServe itself (because, OF COURSE, that also charged you by the minute). So in the early days, I would get special permission from my parents to place a long distance call, and after some nerve-wracking minutes in which we were aware every minute was racking up charges, I could navigate the menus, download what I wanted, and log off immediately. I still, incidentally, mourn what PC Magazine became. As with computing generally, it followed the mass market. It lost its deep technical chops, cut its programming columns, stopped talking about things like how SCSI worked, and so forth. By the time it stopped printing in 2009, it was no longer a square-bound 400-page behemoth, but rather looked more like a copy of Newsweek, but with less depth.
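The weakness of that kind of checksum is easy to demonstrate (my own illustration in shell, using the sample bytes above; PC Magazine's actual scheme may have differed): a plain additive checksum cannot catch two transposed values, because addition is commutative.
# The same bytes with two values swapped produce an identical byte-sum checksum.
echo $(( (64 + 193 + 253 + 0 + 53 + 0 + 87) % 256 ))   # prints 138
echo $(( (64 + 253 + 193 + 0 + 53 + 0 + 87) % 256 ))   # two values swapped: still 138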

Continuing with CompuServe CompuServe was a much larger service than just PC MagNet. Eventually, our family got a subscription. It was still an expensive and scarce resource; I d call it only after hours when the long-distance rates were cheapest. Everyone had a numerical username separated by commas; mine was 71510,1421. CompuServe had forums, and files. Eventually I would use TapCIS to queue up things I wanted to do offline, to minimize phone usage online. CompuServe eventually added a gateway to the Internet. For the sum of somewhere around $1 a message, you could send or receive an email from someone with an Internet email address! I remember the thrill of one time, as a kid of probably 11 years, sending a message to one of the editors of PC Magazine and getting a kind, if brief, reply back! But inevitably I had

The Godzilla Phone Bill Yes, one month I became lax in tracking my time online. I ran up my parents phone bill. I don t remember how high, but I remember it was hundreds of dollars, a hefty sum at the time. As I watched Jason Scott s BBS Documentary, I realized how common an experience this was. I think this was the end of CompuServe for me for awhile.

Toll-Free Numbers I lived near a town with a population of 500. Not even IN town, but near town. The calling area included another town with a population of maybe 1500, so all told, there were maybe 2000 people total I could talk to with a local call though far fewer numbers, because remember, telephones were allocated by the household. There was, as far as I know, zero modems that were a local call (aside from one that belonged to a friend I met in around 1992). So basically everything was long-distance. But there was a special feature of the telephone network: toll-free numbers. Normally when calling long-distance, you, the caller, paid the bill. But with a toll-free number, beginning with 1-800, the recipient paid the bill. These numbers almost inevitably belonged to corporations that wanted to make it easy for people to call. Sales and ordering lines, for instance. Some of these companies started to set up modems on toll-free numbers. There were few of these, but they existed, so of course I had to try them! One of them was a company called PennyWise that sold office supplies. They had a toll-free line you could call with a modem to order stuff. Yes, online ordering before the web! I loved office supplies. And, because I lived far from a big city, if the local K-Mart didn t have it, I probably couldn t get it. Of course, the interface was entirely text, but you could search for products and place orders with the modem. I had loads of fun exploring the system, and actually ordered things from them and probably actually saved money doing so. With the first order they shipped a monster full-color catalog. That thing must have been 500 pages, like the Sears catalogs of the day. Every item had a part number, which streamlined ordering through the modem.

Inbound FAXes By the 90s, a number of modems became able to send and receive FAXes as well. For those that don t know, a FAX machine was essentially a special modem. It would scan a page and digitally transmit it over the phone system, where it would at least in the early days be printed out in real time (because the machines didn t have the memory to store an entire page as an image). Eventually, PC modems integrated FAX capabilities. There still wasn t anything useful I could do locally, but there were ways I could get other companies to FAX something to me. I remember two of them. One was for US Robotics. They had an on demand FAX system. You d call up a toll-free number, which was an automated IVR system. You could navigate through it and select various documents of interest to you: spec sheets and the like. You d key in your FAX number, hang up, and US Robotics would call YOU and FAX you the documents you wanted. Yes! I was talking to a computer (of a sorts) at no cost to me! The New York Times also ran a service for awhile called TimesFax. Every day, they would FAX out a page or two of summaries of the day s top stories. This was pretty cool in an era in which I had no other way to access anything from the New York Times. I managed to sign up for TimesFax I have no idea how, anymore and for awhile I would get a daily FAX of their top stories. When my family got its first laser printer, I could them even print these FAXes complete with the gothic New York Times masthead. Wow! (OK, so technically I could print it on a dot-matrix printer also, but graphics on a 9-pin dot matrix is a kind of pain that is a whole other article.)

My own phone line Remember how I discussed that phone lines were allocated per household? This was a problem for a lot of reasons:
  1. Anybody that tried to call my family while I was using my modem would get a busy signal (unable to complete the call)
  2. If anybody in the house picked up the phone while I was using it, that would degrade the quality of the ongoing call and either mess up or disconnect the call in progress. In many cases, that could cancel a file transfer (which wasn t necessarily easy or possible to resume), prompting howls of annoyance from me.
  3. Generally we all had to work around each other
So eventually I found various small jobs and used the money I made to pay for my own phone line and my own long distance costs. Eventually I upgraded to a 28.8Kbps US Robotics Courier modem even! Yes, you heard it right: I got a job and a bank account so I could have a phone line and a faster modem. Uh, isn t that why every teenager gets a job? Now my local friend and I could call each other freely at least on my end (I can t remember if he had his own phone line too). We could exchange files using HS/Link, which had the added benefit of allowing split-screen chat even while a file transfer is in progress. I m sure we spent hours chatting to each other keyboard-to-keyboard while sharing files with each other.

Technology in Schools By this point in the story, we re in the late 80s and early 90s. I m still using PC-style OSs at home; OS/2 in the later years of this period, DOS or maybe a bit of Windows in the earlier years. I mentioned that they let me work on programming at school starting in 5th grade. It was soon apparent that I knew more about computers than anybody on staff, and I started getting pulled out of class to help teachers or administrators with vexing school problems. This continued until I graduated from high school, incidentally often to my enjoyment, and the annoyance of one particular teacher who, I must say, I was fine with annoying in this way. That s not to say that there was institutional support for what I was doing. It was, after all, a small school. Larger schools might have introduced BASIC or maybe Logo in high school. But I had already taught myself BASIC, Pascal, and C by the time I was somewhere around 12 years old. So I wouldn t have had any use for that anyhow. There were programming contests occasionally held in the area. Schools would send teams. My school didn t really send anybody, but I went as an individual. One of them was run by a local college (but for jr. high or high school students. Years later, I met one of the professors that ran it. He remembered me, and that day, better than I did. The programming contest had problems one could solve in BASIC or Logo. I knew nothing about what to expect going into it, but I had lugged my computer and screen along, and asked him, Can I write my solutions in C? He was, apparently, stunned, but said sure, go for it. I took first place that day, leading to some rather confused teams from much larger schools. The Netware network that the school had was, as these generally were, itself isolated. There was no link to the Internet or anything like it. Several schools across three local counties eventually invested in a fiber-optic network linking them together. This built a larger, but still closed, network. Its primary purpose was to allow students to be exposed to a wider variety of classes at high schools. Participating schools had an ITV room , outfitted with cameras and mics. So students at any school could take classes offered over ITV at other schools. For instance, only my school taught German classes, so people at any of those participating schools could take German. It was an early Zoom room. But alongside the TV signal, there was enough bandwidth to run some Netware frames. By about 1995 or so, this let one of the schools purchase some CD-ROM software that was made available on a file server and could be accessed by any participating school. Nice! But Netware was mainly about file and printer sharing; there wasn t even a facility like email, at least not on our deployment.

BBSs My last hop before the Internet was the BBS. A BBS was a computer program, usually ran by a hobbyist like me, on a computer with a modem connected. Callers would call it up, and they d interact with the BBS. Most BBSs had discussion groups like forums and file areas. Some also had games. I, of course, continued to have that most vexing of problems: they were all long-distance. There were some ways to help with that, chiefly QWK and BlueWave. These, somewhat like TapCIS in the CompuServe days, let me download new message posts for reading offline, and queue up my own messages to send later. QWK and BlueWave didn t help with file downloading, though.

BBSs get networked BBSs were an interesting thing. You d call up one, and inevitably somewhere in the file area would be a BBS list. Download the BBS list and you ve suddenly got a list of phone numbers to try calling. All of them were long distance, of course. You d try calling them at random and have a success rate of maybe 20%. The other 80% would be defunct; you might get the dreaded this number is no longer in service or the even more dreaded angry human answering the phone (and of course a modem can t talk to a human, so they d just get silence for probably the nth time that week). The phone company cared nothing about BBSs and recycled their numbers just as fast as any others. To talk to various people, or participate in certain discussion groups, you d have to call specific BBSs. That s annoying enough in the general case, but even more so for someone paying long distance for it all, because it takes a few minutes to establish a connection to a BBS: handshaking, logging in, menu navigation, etc. But BBSs started talking to each other. The earliest successful such effort was FidoNet, and for the duration of the BBS era, it remained by far the largest. FidoNet was analogous to the UUCP that the institutional users had, but ran on the much cheaper PC hardware. Basically, BBSs that participated in FidoNet would relay email, forum posts, and files between themselves overnight. Eventually, as with UUCP, by hopping through this network, messages could reach around the globe, and forums could have worldwide participation asynchronously, long before they could link to each other directly via the Internet. It was almost entirely volunteer-run.

Running my own BBS At age 13, I eventually chose to set up my own BBS. It ran on my single phone line, so of course when I was dialing up something else, nobody could dial up me. Not that this was a huge problem; in my town of 500, I probably had a good 1 or 2 regular callers in the beginning. In the PC era, there was a big difference between a server and a client. Server-class software was expensive and rare. Maybe in later years you had an email client, but an email server would be completely unavailable to you as a home user. But with a BBS, I could effectively run a server. I even ran serial lines in our house so that the BBS could be connected from other rooms! Since I was running OS/2, the BBS didn t tie up the computer; I could continue using it for other things. FidoNet had an Internet email gateway. This one, unlike CompuServe s, was free. Once I had a BBS on FidoNet, you could reach me from the Internet using the FidoNet address. This didn t support attachments, but then email of the day didn t really, either. Various others outside Kansas ran FidoNet distribution points. I believe one of them was mgmtsys; my memory is quite vague, but I think they offered a direct gateway and I would call them to pick up Internet mail via FidoNet protocols, but I m not at all certain of this.

Pros and Cons of the Non-Microsoft World As mentioned, Microsoft was and is the dominant operating system vendor for PCs. But I left that world in 1993, and here, nearly 30 years later, have never really returned. I got an operating system with more technical capabilities than the DOS and Windows of the day, but the tradeoff was a much smaller software ecosystem. OS/2 could run DOS programs, but it ran OS/2 programs a lot better. So if I were to run a BBS, I wanted one that had a native OS/2 version limiting me to a small fraction of available BBS server software. On the other hand, as a fully 32-bit operating system, there started to be OS/2 ports of certain software with a Unix heritage; most notably for me at the time, gcc. At some point, I eventually came across the RMS essays and started to be hooked.

Internet: The Hunt Begins I certainly was aware that the Internet was out there and interesting. But the first problem was: how the heck do I get connected to the Internet?

Computer labs There was one place that tended to have Internet access: colleges and universities. In 7th grade, I participated in a program that resulted in me being invited to visit Duke University, and in 8th grade, I participated in National History Day, resulting in a trip to visit the University of Maryland. I probably sought out computer labs at both of those. My most distinct memory was finding my way into a computer lab at one of those universities, and it was full of NeXT workstations. I had never seen or used NeXT before, and had no idea how to operate it. I had brought a box of floppy disks, unaware that the DOS disks probably weren t compatible with NeXT. Closer to home, a small college had a computer lab that I could also visit. I would go there in summer or when it wasn t used with my stack of floppies. I remember downloading disk images of FLOSS operating systems: FreeBSD, Slackware, or Debian, at the time. The hash marks from the DOS-based FTP client would creep across the screen as the 1.44MB disk images would slowly download. telnet was also available on those machines, so I could telnet to things like public-access Archie servers and libraries though not Gopher. Still, FTP and telnet access opened up a lot, and I learned quite a bit in those years.

Continuing the Journey At some point, I got a copy of the Whole Internet User s Guide and Catalog, published in 1994. I still have it. If it hadn t already figured it out by then, I certainly became aware from it that Unix was the dominant operating system on the Internet. The examples in Whole Internet covered FTP, telnet, gopher all assuming the user somehow got to a Unix prompt. The web was introduced about 300 pages in; clearly viewed as something that wasn t page 1 material. And it covered the command-line www client before introducing the graphical Mosaic. Even then, though, the book highlighted Mosaic s utility as a front-end for Gopher and FTP, and even the ability to launch telnet sessions by clicking on links. But having a copy of the book didn t equate to having any way to run Mosaic. The machines in the computer lab I mentioned above all ran DOS and were incapable of running a graphical browser. I had no SLIP or PPP (both ways to run Internet traffic over a modem) connectivity at home. In short, the Web was something for the large institutional users at the time.

CD-ROMs As CD-ROMs came out, with their huge (for the day) 650MB capacity, various companies started collecting software that could be downloaded on the Internet and selling it on CD-ROM. The two most popular ones were Walnut Creek CD-ROM and Infomagic. One could buy extensive Shareware and gaming collections, and then even entire Linux and BSD distributions. Although not exactly an Internet service per se, it was a way of bringing what may ordinarily only be accessible to institutional users into the home computer realm.

Free Software Jumps In As I mentioned, by the mid 90s, I had come across RMS s writings about free software most probably his 1992 essay Why Software Should Be Free. (Please note, this is not a commentary on the more recently-revealed issues surrounding RMS, but rather his writings and work as I encountered them in the 90s.) The notion of a Free operating system not just in cost but in openness was incredibly appealing. Not only could I tinker with it to a much greater extent due to having source for everything, but it included so much software that I d otherwise have to pay for. Compilers! Interpreters! Editors! Terminal emulators! And, especially, server software of all sorts. There d be no way I could afford or run Netware, but with a Free Unixy operating system, I could do all that. My interest was obviously piqued. Add to that the fact that I could actually participate and contribute I was about to become hooked on something that I ve stayed hooked on for decades. But then the question was: which Free operating system? Eventually I chose FreeBSD to begin with; that would have been sometime in 1995. I don t recall the exact reasons for that. I remember downloading Slackware install floppies, and probably the fact that Debian wasn t yet at 1.0 scared me off for a time. FreeBSD s fantastic Handbook far better than anything I could find for Linux at the time was no doubt also a factor.

The de Raadt Factor Why not NetBSD or OpenBSD? The short answer is Theo de Raadt. Somewhere in this time, when I was somewhere between 14 and 16 years old, I asked some questions comparing NetBSD to the other two free BSDs. This was on a NetBSD mailing list, but for some reason Theo saw it and got a flame war going, which CC d me. Now keep in mind that even if NetBSD had a web presence at the time, it would have been minimal, and I would have not all that unusually for the time had no way to access it. I was certainly not aware of the, shall we say, acrimony between Theo and NetBSD. While I had certainly seen an online flamewar before, this took on a different and more disturbing tone; months later, Theo randomly emailed me under the subject SLIME saying that I was, well, SLIME . I seem to recall periodic emails from him thereafter reminding me that he hates me and that he had blocked me. (Disclaimer: I have poor email archives from this period, so the full details are lost to me, but I believe I am accurately conveying these events from over 25 years ago) This was a surprise, and an unpleasant one. I was trying to learn, and while it is possible I didn t understand some aspect or other of netiquette (or Theo s personal hatred of NetBSD) at the time, still that is not a reason to flame a 16-year-old (though he would have had no way to know my age). This didn t leave any kind of scar, but did leave a lasting impression; to this day, I am particularly concerned with how FLOSS projects handle poisonous people. Debian, for instance, has come a long way in this over the years, and even Linus Torvalds has turned over a new leaf. I don t know if Theo has. In any case, I didn t use NetBSD then. I did try it periodically in the years since, but never found it compelling enough to justify a large switch from Debian. I never tried OpenBSD for various reasons, but one of them was that I didn t want to join a community that tolerates behavior such as Theo s from its leader.

Moving to FreeBSD Moving from OS/2 to FreeBSD was final. That is, I didn t have enough hard drive space to keep both. I also didn t have the backup capacity to back up OS/2 completely. My BBS, which ran Virtual BBS (and at some point also AdeptXBBS) was deleted and reincarnated in a different form. My BBS was a member of both FidoNet and VirtualNet; the latter was specific to VBBS, and had to be dropped. I believe I may have also had to drop the FidoNet link for a time. This was the biggest change of computing in my life to that point. The earlier experiences hadn t literally destroyed what came before. OS/2 could still run my DOS programs. Its command shell was quite DOS-like. It ran Windows programs. I was going to throw all that away and leap into the unknown. I wish I had saved a copy of my BBS; I would love to see the messages I exchanged back then, or see its menu screens again. I have little memory of what it looked like. But other than that, I have no regrets. Pursuing Free, Unixy operating systems brought me a lot of enjoyment and a good career. That s not to say it was easy. All the problems of not being in the Microsoft ecosystem were magnified under FreeBSD and Linux. In a day before EDID, monitor timings had to be calculated manually and you risked destroying your monitor if you got them wrong. Word processing and spreadsheet software was pretty much not there for FreeBSD or Linux at the time; I was therefore forced to learn LaTeX and actually appreciated that. Software like PageMaker or CorelDraw was certainly nowhere to be found for those free operating systems either. But I got a ton of new capabilities. I mentioned the BBS didn t shut down, and indeed it didn t. I ran what was surely a supremely unique oddity: a free, dialin Unix shell server in the middle of a small town in Kansas. I m sure I provided things such as pine for email and some help text and maybe even printouts for how to use it. The set of callers slowly grew over the time period, in fact. And then I got UUCP.

Enter UUCP Even throughout all this, there was no local Internet provider and things were still long distance. I had Internet email access via assorted routes, but they were all strange. And I wanted access to Usenet. In 1995, it happened. The local ISP I mentioned offered UUCP access. Though I couldn't afford the dialup shell (or later, SLIP/PPP) that they offered due to long-distance costs, UUCP's very efficient batched transfers looked doable. I believe I established that link when I was 15, so in 1995. I worked to register my domain, complete.org, as well. At the time, the process was a bit lengthy and involved downloading a text file form, filling it out in a precise way, sending it to InterNIC, and probably mailing them a check. Well, I did that, and in September of 1995, complete.org became mine. I set up sendmail on my local system, as well as INN to handle the limited Usenet newsfeed I requested from the ISP. I even ran Majordomo to host some mailing lists, including some that were surprisingly high-traffic for a few-times-a-day long-distance modem UUCP link! The modem client programs for FreeBSD were somewhat less advanced than for OS/2, but I believe I wound up using Minicom or Seyon to continue to dial out to BBSs and, I believe, continue to use Learning Link. So all the while I was setting up my local BBS, I continued to have access to the text Internet, consisting chiefly of Gopher for me.
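To give an idea of what that batching looked like in practice, a Taylor UUCP site would typically queue mail and news locally all day and poll its upstream from cron during the cheap overnight hours (a sketch of my own, not the author's actual setup; the site name "isp" and the path are assumptions, and a real config also needs sys/port/dial entries):
# Hypothetical crontab entry: one long-distance call at 2am, when rates were
# lowest, to exchange the day's queued mail and news batches with the upstream.
0 2 * * * /usr/sbin/uucico -s isp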

Switching to Debian I switched to Debian sometime in 1995 or 1996, and have been using Debian as my primary OS ever since. I continued to offer shell access, but added the WorldVU Atlantis menuing BBS system. This provided a return to a more BBS-like interface (by default; shell was still an option) as well as some BBS door games such as LoRD and TradeWars 2002, running under DOS emulation. I also continued to run INN, and ran ifgate to allow FidoNet echomail to be presented in INN as Usenet-like newsgroups, and netmail to be gated to Unix email. This worked pretty well. The BBS continued to grow in those days, peaking at about two dozen total user accounts, and maybe a dozen regular users.

Dial-up access availability I believe it was in 1996 that dial up PPP access finally became available in my small town. What a thrill! FINALLY! I could now FTP, use Gopher, telnet, and the web all from home. Of course, it was at modem speeds, but still. (Strangely, I have a memory of accessing the Web using WebExplorer from OS/2. I don t know exactly why; it s possible that by this time, I had upgraded to a 486 DX2/66 and was able to reinstall OS/2 on the old 25MHz 486, or maybe something was wrong with the timeline from my memories from 25 years ago above. Or perhaps I made the occasional long-distance call somewhere before I ditched OS/2.) Gopher sites still existed at this point, and I could access them using Netscape Navigator which likely became my standard Gopher client at that point. I don t recall using UMN text-mode gopher client locally at that time, though it s certainly possible I did.

The city Starting when I was 15, I took computer science classes at Wichita State University. The first one was a class in the summer of 1995 on C++. I remember being worried about being good enough for it I was, after all, just after my HS freshman year and had never taken the prerequisite C class. I loved it and got an A! By 1996, I was taking more classes. In 1996 or 1997 I stayed in Wichita during the day due to having more than one class. So, what would I do then but enjoy the computer lab? The CS dept. had two of them: one that had NCD X terminals connected to a pair of SunOS servers, and another one running Windows. I spent most of the time in the Unix lab with the NCDs; I d use Netscape or pine, write code, enjoy the University s fast Internet connection, and so forth. In 1997 I had graduated high school and that summer I moved to Wichita to attend college. As was so often the case, I shut down the BBS at that time. It would be 5 years until I again dealt with Internet at home in a rural community. By the time I moved to my apartment in Wichita, I had stopped using OS/2 entirely. I have no memory of ever having OS/2 there. Along the way, I had bought a Pentium 166, and then the most expensive piece of computing equipment I have ever owned: a DEC Alpha, which, of course, ran Linux.

ISDN I must have used dialup PPP for a time, but I eventually got a job working for the ISP I had used for UUCP, and then PPP. While there, I got a 128Kbps ISDN line installed in my apartment, and they gave me a discount on the service for it. That was around 3x the speed of a modem, and crucially was always on and gave me a public IP. No longer did I have to use UUCP; now I got to host my own things! By at least 1998, I was running a web server on www.complete.org, and I had an FTP server going as well.

Even Bigger Cities In 1999 I moved to Dallas, and there got my first broadband connection: an ADSL link at, I think, 1.5Mbps! Now that was something! But it had some reliability problems. I eventually put together a server and had it hosted at an acquaintance's place who had SDSL in his apartment. Within a couple of years, I had switched to various kinds of proper hosting for it, but that is a whole other article. In Indianapolis, I got a cable modem for the first time, with even tighter speeds but prohibitions on running servers on it. Yuck.

Challenges Being non-Microsoft continued to have challenges. Until the advent of Firefox, a web browser was one of the biggest. While Netscape supported Linux on i386, it didn t support Linux on Alpha. I hobbled along with various attempts at emulators, old versions of Mosaic, and so forth. And, until StarOffice was open-sourced as Open Office, reading Microsoft file formats was also a challenge, though WordPerfect was briefly available for Linux. Over the years, I have become used to the Linux ecosystem. Perhaps I use Gimp instead of Photoshop and digikam instead of well, whatever somebody would use on Windows. But I get ZFS, and containers, and so much that isn t available there. Yes, I know Apple never went away and is a thing, but for most of the time period I discuss in this article, at least after the rise of DOS, it was niche compared to the PC market.

Back to Kansas In 2002, I moved back to Kansas, to a rural home near a different small town in the county next to where I grew up. Over there, it was back to dialup at home, but I had faster access at work. I didn't much care for this, and thus began a 20+-year effort to get broadband in the country. At first, I got a wireless link, which worked well enough in the winter, but had serious problems in the summer when the trees leafed out. Eventually DSL became available locally; it was highly unreliable, but still, it was something. Then I moved back to the community I grew up in, a few miles from where I grew up. Again I got DSL, a bit better this time. But after some years, being at the end of the run of DSL meant I had poor speeds and reliability problems. I eventually switched to various wireless ISPs, which continues to the present day; while people in cities can get Gbps service, I can get, at best, about 50Mbps. Long-distance fees are gone, but the speed disparity remains.

Concluding Reflections I am glad I grew up where I did; the strong community has a lot of advantages I don t have room to discuss here. In a number of very real senses, having no local services made things a lot more difficult than they otherwise would have been. However, perhaps I could say that I also learned a lot through the need to come up with inventive solutions to those challenges. To this day, I think a lot about computing in remote environments: partially because I live in one, and partially because I enjoy visiting places that are remote enough that they have no Internet, phone, or cell service whatsoever. I have written articles like Tools for Communicating Offline and in Difficult Circumstances based on my own personal experience. I instinctively think about making protocols robust in the face of various kinds of connectivity failures because I experience various kinds of connectivity failures myself.

(Almost) Everything Lives On In 2002, Gopher turned 10 years old. It had probably been about 9 or 10 years since I had first used Gopher, which was the first way I got on live Internet from my house. It was hard to believe. By that point, I had an always-on Internet link at home and at work. I had my Alpha, and probably also at least PCMCIA Ethernet for a laptop (many laptops had modems by the 90s also). Despite its popularity in the early 90s, less than 10 years after it came on the scene and started to unify the Internet, it was mostly forgotten. And it was at that moment that I decided to try to resurrect it. The University of Minnesota finally released it under an Open Source license. I wrote the first new gopher server in years, pygopherd, and introduced gopher to Debian. Gopher lives on; there are now quite a few Gopher clients and servers out there, newly started post-2002. The Gemini protocol can be thought of as something akin to Gopher 2.0, and it too has a small but blossoming ecosystem. Archie, the old FTP search tool, is dead though. Same for WAIS and a number of the other pre-web search tools. But still, even FTP lives on today. And BBSs? Well, they didn t go away either. Jason Scott s fabulous BBS documentary looks back at the history of the BBS, while Back to the BBS from last year talks about the modern BBS scene. FidoNet somehow is still alive and kicking. UUCP still has its place and has inspired a whole string of successors. Some, like NNCP, are clearly direct descendents of UUCP. Filespooler lives in that ecosystem, and you can even see UUCP concepts in projects as far afield as Syncthing and Meshtastic. Usenet still exists, and you can now run Usenet over NNCP just as I ran Usenet over UUCP back in the day (which you can still do as well). Telnet, of course, has been largely supplanted by ssh, but the concept is more popular now than ever, as Linux has made ssh be available on everything from Raspberry Pi to Android. And I still run a Gopher server, looking pretty much like it did in 2002. This post also has a permanent home on my website, where it may be periodically updated.

29 August 2022

Emmanuel Kasper: Moving blog from blogger.com to wordpress.com

I switched from blogger.com, the Google blog platform, to the hosted wordpress.com of Automattic, the main authors of the WordPress blog engine.
I thus gain: I lose:
  • free CNAME redirect using my own domain name
  • a bit of advertising-free space. The blog at wordpress.com has a prominent header indicating I am using the free plan, but I am OK so far with that.
What stays the same:
  • Blogger and WordPress.com both offer tag-based RSS feed exports, so I decided to keep, for Debian Planet, a feed containing only the posts related to free, libre and open source software.
I was not ready to make the jump to a self-hosted static blog generator, as I still wanted to have the possibility to comment, without me having to host the comment subsystem. On the personal side, I also intend to pause Twitter activity, as I notice current microblogging platforms tend to mostly contain flame wars, self-promotion, or shared links I could find anyway with a good feed reader.

26 August 2022

Antoine Beaupré: How to nationalize the internet in Canada

Rogers had a catastrophic failure in July 2022. It affected emergency services (as in: people couldn't call 911, but also some 911 services themselves failed), hospitals (which couldn't access prescriptions), banks and payment systems (as payment terminals stopped working), and regular users as well. The outage lasted almost a full day, and Rogers took days to give any technical explanation for the outage, and even when they did, details were sparse. So far the only detailed account is from outside actors like Cloudflare, which seems to point at an internal BGP failure. Its impact on the economy has yet to be measured, but it probably cost millions of dollars in wasted time and possibly led to life-threatening situations. Apart from holding Rogers (criminally?) responsible for this, what should be done in the future to avoid such problems? It's not the first time something like this has happened: it happened to Bell Canada as well. The Rogers outage is also strangely similar to the Facebook outage last year, but, to its credit, Facebook did post a fairly detailed explanation only a day later. The internet is designed to be decentralised, and having large companies like Rogers hold so much power is a crucial mistake that should be reverted. The question is how. Some critics were quick to point out that we need more ISP diversity and competition, but I think that's missing the point. Others have suggested that the internet should be a public good or even straight out nationalized. I believe the solution to the problem of large, private, centralised telcos and ISPs is to replace them with smaller, public, decentralised service providers. The only way to ensure that works is to make sure that public money ends up creating infrastructure controlled by the public, which means treating ISPs as a public utility. This has been implemented elsewhere: it works, it's cheaper, and provides better service.

A modest proposal Global wireless services (like phone services) and home internet inevitably grow into monopolies. They are public utilities, just like water, power, railways, and roads. The question of how they should be managed is therefore inherently political, yet people don't seem to question the idea that only the market (i.e. "competition") can solve this problem. I disagree. 10 years ago (in French), I suggested we, in Québec, should nationalize large telcos and internet service providers. I no longer believe this is a realistic approach: most of those companies have crap copper-based networks (at least for the last mile), yet are worth billions of dollars. It would be prohibitive, and a waste, to buy them out. Back then, I called this idea "Réseau-Québec", a reference to the already nationalized power company, Hydro-Québec. (This idea, incidentally, made it into the plan of a political party.) Now, I think we should instead build our own, public internet. Start setting up municipal internet services, fiber to the home in all cities, progressively. Then interconnect cities with fiber, and build peering agreements with other providers. This also includes a bid on wireless spectrum to start competing with phone providers as well. And while that sounds really ambitious, I think it's possible to take this one step at a time.

Municipal broadband In many parts of the world, municipal broadband is an elegant solution to the problem, with solutions ranging from Stockholm's city-owned fiber network (dark fiber, layer 1) to Utah's UTOPIA network (fiber to the premises, layer 2) and municipal wireless networks like Guifi.net which connects about 40,000 nodes in Catalonia. A good first step would be for cities to start providing broadband services to its residents, directly. Cities normally own sewage and water systems that interconnect most residences and therefore have direct physical access everywhere. In Montr al, in particular, there is an ongoing project to replace a lot of old lead-based plumbing which would give an opportunity to lay down a wired fiber network across the city. This is a wild guess, but I suspect this would be much less expensive than one would think. Some people agree with me and quote this as low as 1000$ per household. There is about 800,000 households in the city of Montr al, so we're talking about a 800 million dollars investment here, to connect every household in Montr al with fiber and incidentally a quarter of the province's population. And this is not an up-front cost: this can be built progressively, with expenses amortized over many years. (We should not, however, connect Montr al first: it's used as an example here because it's a large number of households to connect.) Such a network should be built with a redundant topology. I leave it as an open question whether we should adopt Stockholm's more minimalist approach or provide direct IP connectivity. I would tend to favor the latter, because then you can immediately start to offer the service to households and generate revenues to compensate for the capital expenditures. Given the ridiculous profit margins telcos currently have 8 billion $CAD net income for BCE (2019), 2 billion $CAD for Rogers (2020) I also believe this would actually turn into a profitable revenue stream for the city, the same way Hydro-Qu bec is more and more considered as a revenue stream for the state. (I personally believe that's actually wrong and we should treat those resources as human rights and not money cows, but I digress. The point is: this is not a cost point, it's a revenue.) The other major challenge here is that the city will need competent engineers to drive this project forward. But this is not different from the way other public utilities run: we have electrical engineers at Hydro, sewer and water engineers at the city, this is just another profession. If anything, the computing science sector might be more at fault than the city here in its failure to provide competent and accountable engineers to society... Right now, most of the network in Canada is copper: we are hitting the limits of that technology with DSL, and while cable has some life left to it (DOCSIS 4.0 does 4Gbps), that is nowhere near the capacity of fiber. Take the town of Chattanooga, Tennessee: in 2010, the city-owned ISP EPB finished deploying a fiber network to the entire town and provided gigabit internet to everyone. Now, 12 years later, they are using this same network to provide the mind-boggling speed of 25 gigabit to the home. To give you an idea, Chattanooga is roughly the size and density of Sherbrooke.

Provincial public internet As part of building a municipal network, the question of getting access to "the internet" will immediately come up. Naturally, this will first be solved by using already existing commercial providers to hook up residents to the rest of the global network. But eventually, networks should inter-connect: Montr al should connect with Laval, and then Trois-Rivi res, then Qu bec City. This will require long haul fiber runs, but those links are not actually that expensive, and many of those already exist as a public resource at RISQ and CANARIE, which cross-connects universities and colleges across the province and the country. Those networks might not have the capacity to cover the needs of the entire province right now, but that is a router upgrade away, thanks to the amazing capacity of fiber. There are two crucial mistakes to avoid at this point. First, the network needs to remain decentralised. Long haul links should be IP links with BGP sessions, and each city (or MRC) should have its own independent network, to avoid Rogers-class catastrophic failures. Second, skill needs to remain in-house: RISQ has already made that mistake, to a certain extent, by selling its neutral datacenter. Tellingly, MetroOptic, probably the largest commercial dark fiber provider in the province, now operates the QIX, the second largest "public" internet exchange in Canada. Still, we have a lot of infrastructure we can leverage here. If RISQ or CANARIE cannot be up to the task, Hydro-Qu bec has power lines running into every house in the province, with high voltage power lines running hundreds of kilometers far north. The logistics of long distance maintenance are already solved by that institution. In fact, Hydro already has fiber all over the province, but it is a private network, separate from the internet for security reasons (and that should probably remain so). But this only shows they already have the expertise to lay down fiber: they would just need to lay down a parallel network to the existing one. In that architecture, Hydro would be a "dark fiber" provider.

International public internet None of the above solves the problem for the entire population of Québec, which is notoriously dispersed, with an area three times the size of France, but with only an eighth of its population (8 million vs 67). More specifically, Canada was originally a French colony, a land violently stolen from native people who have lived here for thousands of years. Some of those people now live in reservations, sometimes far from urban centers (but definitely not always). So the idea of leveraging the Hydro-Québec infrastructure doesn't always work to solve this, because while Hydro will happily flood a traditional hunting territory for an electric dam, they don't bother running power lines to the village they forcibly moved, powering it instead with noisy and polluting diesel generators. So before giving me fiber to the home, we should give power (and potable water, for that matter), to those communities first. So we need to discuss international connectivity. (How else could we consider those communities than peer nations anyways?) Québec has virtually zero international links. Even in Montréal, which likes to style itself a major player in gaming, AI, and technology, most peering goes through either Toronto or New York. That's a problem that we must fix, regardless of the other problems stated here. Looking at the submarine cable map, we see very few international links actually landing in Canada. There is the Greenland connect which connects Newfoundland to Iceland through Greenland. There's the EXA which lands in Ireland, the UK and the US, and Google has the Topaz link on the west coast. That's about it, and none of those land anywhere near any major urban center in Québec. We should have a cable running from France up to Saint-Félicien. There should be a cable from Vancouver to China. Heck, there should be a fiber cable running all the way from the end of the Great Lakes through Québec, then up around the northern passage and back down to British Columbia. Those cables are expensive, and the idea might sound ludicrous, but Russia is actually planning such a project for 2026. The US has cables running all the way up (and around!) Alaska, neatly bypassing all of Canada in the process. We just look ridiculous on that map. (Addendum: I somehow forgot to talk about Teleglobe here: it was founded as a publicly owned company in 1950, growing international phone and (later) data links all over the world. It was privatized by the conservatives in 1984, along with rails and other "crown corporations". So that's one major risk to any effort to make public utilities work properly: some government might be elected and promptly sell it out to its friends for peanuts.)

Wireless networks I know most people will have rolled their eyes so far back their heads have exploded. But I'm not done yet. I want wireless too. And by wireless, I don't mean a bunch of geeks setting up OpenWRT routers on rooftops. I tried that, and while it was fun and educational, it didn't scale. A public networking utility wouldn't be complete without providing cellular phone service. This involves bidding for frequencies at the federal level, and deploying a rather large amount of infrastructure, but it could be a later phase, when the engineers and politicians have proven their worth. At least part of the Rogers fiasco would have been averted if such a decentralized network backend existed. One might even want to argue that a separate institution should be setup to provide phone services, independently from the regular wired networking, if only for reliability. Because remember here: the problem we're trying to solve is not just technical, it's about political boundaries, centralisation, and automation. If everything is ran by this one organisation again, we will have failed. However, I must admit that phone services is where my ideas fall a little short. I can't help but think it's also an accessible goal maybe starting with a virtual operator but it seems slightly less so than the others, especially considering how closed the phone ecosystem is.

Counter points In debating these ideas while writing this article, the following objections came up.

I don't want the state to control my internet One legitimate concern I have about the idea of the state running the internet is the potential it would have to censor or control the content running over the wires. But I don't think there is necessarily a direct relationship between resource ownership and control of content. Sure, China has strong censorship in place, partly implemented through state-controlled businesses. But Russia also has strong censorship in place, based on regulatory tools: they force private service providers to install back-doors in their networks to control content and surveil their users. Besides, the USA have been doing warrantless wiretapping since at least 2003 (and yes, that's 10 years before the Snowden revelations) so a commercial internet is no assurance that we have a free internet. Quite the contrary in fact: if anything, the commercial internet goes hand in hand with the neo-colonial internet, just like businesses did in the "good old colonial days". Large media companies are the primary censors of content here. In Canada, the media cartel requested the first site-blocking order in 2018. The plaintiffs (including Qu becor, Rogers, and Bell Canada) are both content providers and internet service providers, an obvious conflict of interest. Nevertheless, there are some strong arguments against having a centralised, state-owned monopoly on internet service providers. FDN makes a good point on this. But this is not what I am suggesting: at the provincial level, the network would be purely physical, and regional entities (which could include private companies) would peer over that physical network, ensuring decentralization. Delegating the management of that infrastructure to an independent non-profit or cooperative (but owned by the state) would also ensure some level of independence.

Isn't the government incompetent and corrupt? Also known as "private enterprise is better skilled at handling this, the state can't do anything right". I don't think this is a "fait accompli". If anything, I have found publicly run utilities to be spectacularly reliable here. I rarely have trouble with sewage, water, or power, and keep in mind I live in a city where we receive about 2 meters of snow a year, which tends to create lots of trouble with power lines. Unless there's a major weather event, power just runs here. I think the same can happen with an internet service provider. But it would certainly need to have higher standards than what we're used to, because frankly the Internet is kind of janky.

A single monopoly will be less reliable I actually agree with that, but that is not what I am proposing anyways. Current commercial or non-profit entities will be free to offer their services on top of the public network. And besides, the current "ha! diversity is great" approach is exactly what we have now, and it's not working. The pretense that we can have competition over a single network is what led the US into the ridiculous situation where they also pretend to have competition over the power utility market. This led to massive forest fires in California and major power outages in Texas. It doesn't work.

Wouldn't this create an isolated network? One theory is that this new network would be so hostile to incumbent telcos and ISPs that they would simply refuse to network with the public utility. And while it is true that the telcos currently do also act as a kind of "tier one" provider in some places, I strongly feel this is also a problem that needs to be solved, regardless of ownership of networking infrastructure. Right now, telcos often hold both ends of the stick: they are the gateway to users, the "last mile", but they also provide peering to the larger internet in some locations. In at least one datacenter in downtown Montr al, I've seen traffic go through Bell Canada that was not directly targeted at Bell customers. So in effect, they are in a position of charging twice for the same traffic, and that's not only ridiculous, it should just be plain illegal. And besides, this is not a big problem: there are other providers out there. As bad as the market is in Qu bec, there is still some diversity in Tier one providers that could allow for some exits to the wider network (e.g. yes, Cogent is here too).

What about Google and Facebook? Nationalization of other service providers like Google and Facebook is out of scope of this discussion. That said, I am not sure the state should get into the business of organising the web or providing content services however, but I will point out it already does do some of that through its own websites. It should probably keep itself to this, and also consider providing normal services for people who don't or can't access the internet. (And I would also be ready to argue that Google and Facebook already act as extensions of the state: certainly if Facebook didn't exist, the CIA or the NSA would like to create it at this point. And Google has lucrative business with the US department of defense.)

What does not work So we've seen one thing that could work. Maybe it's too expensive. Maybe the political will isn't there. Maybe it will fail. We don't know yet. But we know what does not work, and it's what we've been doing ever since the internet has gone commercial.

Subsidies The absurd price we pay for data does not actually mean everyone gets high speed internet at home. Large swathes of the Qu bec countryside don't get broadband at all, and it can be difficult or expensive, even in large urban centers like Montr al, to get high speed internet. That is despite having a series of subsidies that all avoided investing in our own infrastructure. We had the "fonds de l'autoroute de l'information", "information highway fund" (site dead since 2003, archive.org link) and "branchez les familles", "connecting families" (site dead since 2003, archive.org link) which subsidized the development of a copper network. In 2014, more of the same: the federal government poured hundreds of millions of dollars into a program called connecting Canadians to connect 280 000 households to "high speed internet". And now, the federal and provincial governments are proudly announcing that "everyone is now connected to high speed internet", after pouring more than 1.1 billion dollars to connect, guess what, another 380 000 homes, right in time for the provincial election. Of course, technically, the deadline won't actually be met until 2023. Qu bec is a big area to cover, and you can guess what happens next: the telcos threw up their hand and said some areas just can't be connected. (Or they connect their CEO but not the poor folks across the lake.) The story then takes the predictable twist of giving more money out to billionaires, subsidizing now Musk's Starlink system to connect those remote areas. To give a concrete example: a friend who lives about 1000km away from Montr al, 4km from a small, 2500 habitant village, has recently got symmetric 100 mbps fiber at home from Telus, thanks to those subsidies. But I can't get that service in Montr al at all, presumably because Telus and Bell colluded to split that market. Bell doesn't provide me with such a service either: they tell me they have "fiber to my neighborhood", and only offer me a 25/10 mbps ADSL service. (There is Vid otron offering 400mbps, but that's copper cable, again a dead technology, and asymmetric.)

Conclusion Remember Chattanooga? Back in 2010, they funded the development of a fiber network, and now they have deployed a network roughly a thousand times faster than what we have just funded with a billion dollars. In 2010, I was paying Bell Canada 60$/mth for 20mbps and a 125GB cap, and now, I'm still (indirectly) paying Bell for roughly the same speed (25mbps). Back then, Bell was throttling their competitors networks until 2009, when they were forced by the CRTC to stop throttling. Both Bell and Vid otron still explicitly forbid you from running your own servers at home, Vid otron charges prohibitive prices which make it near impossible for resellers to sell uncapped services. Those companies are not spurring innovation: they are blocking it. We have spent all this money for the private sector to build us a private internet, over decades, without any assurance of quality, equity or reliability. And while in some locations, ISPs did deploy fiber to the home, they certainly didn't upgrade their entire network to follow suit, and even less allowed resellers to compete on that network. In 10 years, when 100mbps will be laughable, I bet those service providers will again punt the ball in the public courtyard and tell us they don't have the money to upgrade everyone's equipment. We got screwed. It's time to try something new.

Updates There was a discussion about this article on Hacker News which was surprisingly productive. Trigger warning: Hacker News is kind of right-wing, in case you didn't know. Since this article was written, at least two more major acquisitions happened, just in Qu bec: In the latter case, vMedia was explicitly saying it couldn't grow because of "lack of access to capital". So basically, we have given those companies a billion dollars, and they are not using that very money to buy out their competition. At least we could have given that money to small players to even out the playing field. But this is not how that works at all. Also, in a bizarre twist, an "analyst" believes the acquisition is likely to help Rogers acquire Shaw. Also, since this article was written, the Washington Post published a review of a book bringing similar ideas: Internet for the People The Fight for Our Digital Future, by Ben Tarnoff, at Verso books. It's short, but even more ambitious than what I am suggesting in this article, arguing that all big tech companies should be broken up and better regulated:
He pulls from Ethan Zuckerman s idea of a web that is plural in purpose that just as pool halls, libraries and churches each have different norms, purposes and designs, so too should different places on the internet. To achieve this, Tarnoff wants governments to pass laws that would make the big platforms unprofitable and, in their place, fund small-scale, local experiments in social media design. Instead of having platforms ruled by engagement-maximizing algorithms, Tarnoff imagines public platforms run by local librarians that include content from public media.
(Links mine: the Washington Post obviously prefers to not link to the real web, and instead doesn't link to Zuckerman's site at all and suggests Amazon for the book, in a cynical example.) And in another example of how the private sector has failed us, there was recently a fluke in the AMBER alert system where the entire province was warned about a loose shooter in Saint-Elzéar, except the people in the town, because they have spotty cell phone coverage. In other words, millions of people received a strongly toned, "life-threatening", alert for a city sometimes hours away, except the people most vulnerable to the alert. Not missing a beat, the CAQ party is promising more of the same medicine again and giving more money to telcos to fix the problem, suggesting to spend three billion dollars in private infrastructure.

23 August 2022

Ian Jackson: prefork-interp - automatic startup time amortisation for all manner of scripts

The problem I had - Mason, so, sadly, FastCGI Since the update to current Debian stable, the website for YARRG (a play-aid for Puzzle Pirates which I wrote some years ago) started to occasionally return "Internal Server Error", apparently due to bug(s) in some FastCGI libraries. I was using FastCGI because the website is written in Mason, a Perl web framework, and I found that Mason CGI calls were slow. I'm using CGI - yes, trad CGI - via userv-cgi. Running Mason this way would compile the template for each HTTP request just when it was rendered, and then throw the compiled version away. The more modern approach of an application server doesn't scale well to a system which has many web applications, most of which are very small. The admin overhead of maintaining a daemon, and corresponding webserver config, for each such service would be prohibitive, even with some kind of autoprovisioning setup. FastCGI has an interpreter wrapper which seemed like it ought to solve this problem, but it's quite inconvenient, and often flaky. I decided I could do better, and set out to eliminate FastCGI from my setup. The result seems to be a success; once I'd done all the hard work of writing prefork-interp, I found the result very straightforward to deploy.

prefork-interp prefork-interp is a small C program which wraps a script, plus a scripting language library to cooperate with the wrapper program. Together they achieve the following:
  • Startup cost of the script (loading modules it uses, precomputations, loading and processing of data files, etc.) is paid once, and reused for subsequent invocations of the same script.
  • Minimal intervention to the script source code:
    • one new library to import
    • one new call to make from that library, right after the script initialisation is complete
    • change to the #! line.
  • The new "initialisation complete" call turns the program into a little server (a daemon), and then returns once for each actual invocation, each time in a fresh grandchild process.
Features:
  • Seamless reloading on changes to the script source code (automatic, and configurable).
  • Concurrency limiting.
  • Options for distinguishing different configurations of the same script so that they get a server each.
  • You can run the same script standalone, as a one-off execution, as well as under prefork-interp.
  • Currently, a script-side library is provided for Perl. I'm pretty sure Python would be fairly straightforward.
Important properties not always satisfied by competing approaches:
  • Error output (stderr) and exit status from both phases of the script code execution faithfully reproduced to the calling context. Environment, arguments, and stdin/stdout/stderr descriptors, passed through to each invocation.
  • No polling, other than a long-term idle timeout, so good on laptops (or phones).
  • Automatic lifetime management of the per-script server, including startup and cleanup. No integration needed with system startup machinery: No explicit management of daemons, init scripts, systemd units, cron jobs, etc.
  • Useable right away without fuss for CGI programs but also for other kinds of program invocation.
  • (I believe) reliable handling of unusual states arising from failed invocations or races.
Swans paddling furiously The implementation is much more complicated than the (apparent) interface. I won't go into all the details here (there are some terrifying diagrams in the source code if you really want), but some highlights:

We use an AF_UNIX socket (hopefully in /run/user/UID, but in ~ if not) for rendezvous. We can try to connect without locking, but we must protect the socket with a separate lockfile to avoid two concurrent restart attempts.

We want stderr from the script setup (pre-initialisation) to be delivered to the caller, so the script ought to inherit our stderr and then will need to replace it later. Twice, in fact, because the daemonic server process can't have a stderr.

When a script is restarted for any reason, any old socket will be removed. We want the old server process to detect that and quit. (If it hung about, it would wait for the idle timeout; if this happened a lot - eg, a constantly changing set of services - we might end up running out of pids or something.) Spotting the socket disappearing, without polling, involves use of a library capable of using inotify (or the equivalent elsewhere). Choosing a C library to do this is not so hard, but portable interfaces to this functionality can be hard to find in scripting languages, and also we don't want every language binding to have to reimplement these checks. So for this purpose there's a little watcher process, and associated IPC.

When an invoking instance of prefork-interp is killed, we must arrange for the executing service instance to stop reading from its stdin (and, ideally, writing its stdout). Otherwise it's stealing input from prefork-interp's successors (maybe the user's shell)! Cleanup ought not to depend on positive actions by failing processes, so each element of the system has to detect failures of its peers by means such as EOF on sockets/pipes.

Obtaining prefork-interp I put this new tool in my chiark-utils package, which is a collection of useful miscellany. It's available from git. Currently I make releases by uploading to Debian, where prefork-interp has just hit Debian unstable, in chiark-utils 7.0.0.

Support for other scripting languages I would love Python to be supported. If any pythonistas reading this think you might like to help out, please get in touch. The specification for the protocol, and what the script library needs to do, is documented in the source code.

Future plans for chiark-utils chiark-utils as a whole is in need of some tidying up of its build system and packaging. I intend to try to do some reorganisation. Currently I think it would be better to organise the source tree more strictly, with a directory for each included facility, rather than grouping compiled code and scripts together. The Debian binary packages should be reorganised more fully according to their dependencies, so that installing a program will ensure that it works. I should probably move the official git repo from my own git+gitweb to a forge (so we can have MRs and issues and so on). And there should be a lot more testing, including Debian autopkgtests.
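To make the rendezvous and fork-per-invocation flow described above a little more concrete, here is a minimal, generic sketch in Python (standard library only). It illustrates the general technique (try to connect to a per-script AF_UNIX socket; if that fails, take a lockfile, start a little server that pays the startup cost once, and then fork a fresh child per invocation), but it is emphatically not prefork-interp's actual protocol, file layout, or API: the paths and the trivial request handling are invented, and the hard parts described above (the stderr juggling, the idle timeout, the inotify watcher) are left out.

# Generic illustration only; not prefork-interp's real protocol or layout.
import fcntl, os, signal, socket

SOCK = "/tmp/prefork-demo.sock"   # hypothetical rendezvous socket
LOCK = "/tmp/prefork-demo.lock"   # separate lockfile guarding restarts

def expensive_startup():
    # stands in for module loading, data files, precomputation, ...
    return b"hello from the preforked server\n"

def serve(listener, reply):
    # the "little server": a fresh child handles each invocation
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)   # don't leave zombies
    while True:
        conn, _ = listener.accept()
        if os.fork() == 0:
            conn.sendall(reply)
            os._exit(0)
        conn.close()

def try_connect():
    s = socket.socket(socket.AF_UNIX)
    try:
        s.connect(SOCK)
        return s
    except OSError:                       # no server yet, or a stale socket
        s.close()
        return None

def connect_or_start():
    client = try_connect()                # fast path: server already running
    if client:
        return client
    with open(LOCK, "w") as lockfh:       # slow path: serialise restarts
        fcntl.flock(lockfh, fcntl.LOCK_EX)
        client = try_connect()            # another caller may have won the race
        if client:
            return client
        try:
            os.unlink(SOCK)               # remove any stale socket
        except FileNotFoundError:
            pass
        listener = socket.socket(socket.AF_UNIX)
        listener.bind(SOCK)
        listener.listen(16)
        if os.fork() == 0:                # become the server; a real daemon would
            lockfh.close()                # also detach, manage stderr, time out...
            serve(listener, expensive_startup())
        listener.close()                  # parent keeps only the client side
    return try_connect()                  # the socket is already listening

if __name__ == "__main__":
    print(connect_or_start().recv(64).decode(), end="")

Run twice in a row, the second invocation takes the fast path and reuses the server started by the first; prefork-interp layers all the reliability machinery described in the paragraphs above on top of this basic pattern.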
edited 2022-08-23 10:30 +01:00 to improve the formatting



21 August 2022

Iustin Pop: Note to self: Don't forget Qemu's discard option

This is just a short note to myself, and to anyone who might run VMs via home-grown scripts (or systemd units). I expect modern VM managers to do this automatically, but for myself, I have just a few hacked-together scripts. By default, QEMU (at least as of version 7.0) does not honour/pass discard requests from block devices to the underlying storage. This is a sane default (like LVM's default setting), but with long-lived VMs it can lead to lots of wasted disk space. I keep my VMs on SSDs, where space is limited for me, so savings here are important. Older Debian versions did not trim automatically, but nowadays they do (which is why this is worth enabling for all VMs), so all you need is to pass:
  • discard=unmap on the drive definition to activate the pass-through.
  • optionally, detect-zeroes=unmap, but I don't know how useful this is, as in, how often zeroes are written.
And the next trim should save lots of disk space. It doesn't matter much if you use raw or qcow2, both will know to unmap the unused disk, leading to less disk space used. This part seems to me safe security-wise, as long as you trust the host. If you have pass-through to the actual hardware, it will also do proper discard at the SSD level (with the potential security issues arising from that). I'm happy with the freed-up disk space. Note: If you have (like I do) Windows VMs as well, using paravirt block devices, make sure the driver is recent enough. One interesting behaviour from Windows: it looks like the default cluster size is quite high (64K), which with many small files will lead to significant overhead. But, either I misunderstand, or Windows actually knows how to unmap the unused part of a cluster (although it takes a while). So in the end, the backing file for the VM (19G) is smaller than the disk used as reported in Windows (23-24G), but higher than the size on disk for all the files (17.2G). Seems legit, and it still boots. Most Linux file systems have much smaller block sizes (usually 4K), so this is not a problem for them.
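As a concrete, purely illustrative example for anyone else with hand-rolled scripts: the image path, memory size and the other flags below are made-up placeholders; only the discard= and detect-zeroes= suboptions of -drive are the point here.

qemu-system-x86_64 -enable-kvm -m 2048 \
  -drive file=/var/lib/vms/example.qcow2,format=qcow2,if=virtio,discard=unmap,detect-zeroes=unmap

With that in place, the trims issued inside the guest (on recent Debian, presumably the automatic trimming the note above refers to) are passed down to the qcow2 or raw image, and the space shows up as freed on the host after the next trim.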

15 August 2022

John Goerzen: The Joy of Easy Personal Radio: FRS, GMRS, and Motorola DLR/DTR

Most of us carry cell phones with us almost everywhere we go. So much so that we often forget not just the usefulness, but even the joy, of having our own radios. For instance:
  • When traveling to national parks or other wilderness areas, family and friends can keep in touch even where there is no cell coverage.
  • It is a lot faster to just push a button and start talking than it is to unlock a phone, open the phone app, select a person, wait for the call to connect, wait for the other person to answer, etc. I m heading back. OK. Boom, 5 seconds, done. A phone user wouldn t have even dialed in that time.
  • A whole group of people can be on the same channel.
  • You can often buy a radio for less than the monthly cost of a cell plan.
From my own experience, as a person and a family that enjoys visiting wilderness areas, having radio communication is great. I have also heard from others that they re also very useful on cruise ships (I ve never been on one so I can t attest to that). There is also a sheer satisfaction in not needing anybody else s infrastructure, not paying any sort of monthly fee, and setting up the radios ourselves.

How these services fit in This article is primarily about handheld radios that can be used by anybody. I laid out some of their advantages above. Before continuing, I should point out some of the other services you may consider:
  • Cell phones, obviously. Due to the impressive infrastructure you pay for each month (many towers in high locations), in areas of cell coverage, you have the ability to connect to so many other phones around the world. With radios like the ones discussed here, your range will likely be a few miles.
  • Amateur Radio has often been a decade or more ahead of what you see in these easy personal radio devices. You can unquestionably get amateur radio devices with many more features and better performance. However, generally speaking, each person that transmits on an amateur radio band must be licensed. Getting an amateur radio license isn t difficult, but it does involve passing a test and some time studying for the exam. So it isn t something you can count on random friends or family members being able to do. That said, I have resources on Getting Started With Amateur Radio and it s not as hard as you might think! There are also a lot of reasons to use amateur radio if you want to go down that path.
  • Satellite messengers such as the Garmin Inreach or Zoleo can send SMS-like messages across anywhere in the globe with a clear view of the sky. They also often have SOS features. While these are useful safety equipment, it can take many minutes for a message to be sent and received it s not like an interactive SMS conversation and there are places where local radios will have better signal. Notably, satellite messengers are almost useless indoors and can have trouble in areas without a clear view of the sky, such as dense forests, valleys, etc.
  • My earlier Roundup of secure messengers with off-the-grid capabilities (distributed/mesh messengers) highlighted a number of other options as well, for text-only communication. For instance:
    • For very short-range service, Briar can form a mesh over Bluetooth from cell phones or over Tor, if Internet access is available.
    • Dedicated short-message mesh networks like Meshtastic or Beartooth have no voice capability, but share GPS locations and short text messages over their own local mesh. Generally they need to pair with a cell phone (even if that phone has no cell service) for most functionality.
  • Yggdrasil can do something similar over ad-hoc Wifi, but it is a lower-level protocol and you d need some sort of messaging to run atop it.
This article is primarily about the USA, though these concepts, if not the specific implementation, apply in many other areas as well.

The landscape of easy personal radios The oldest personal radio service in the US is Citizens Band (CB). Because it uses a lower frequency band than others, handheld radios are larger, heavier, and less efficient. It is mostly used in vehicles or other installations where size isn t an issue. The FRS/GMRS services mostly share a set of frequencies. The Family Radio Service is unlicensed (you don t have to get a license to use it) and radios are plentiful and cheap. When you get a blister pack or little radios for maybe $50 for a pair or less, they re probably FRS. FRS was expanded by the FCC in 2017, and now most FRS channels can run up to 2 watts of power (with channels 8-14 still limited to 0.5W). FRS radios are pretty much always handheld. GMRS runs on mostly the same frequencies as FRS. GMRS lets you run up to 5W on some channels, up to 50W on others, and operate repeaters. GMRS also permits limited occasional digital data bursts; three manufacturers currently use this to exchange GPS data or text messages. To use GMRS, you must purchase a GMRS license; it costs $35 for a person and their immediate family and is good for 10 years. No exam is required. GMRS radios can transmit on FRS frequencies using the GMRS authorization. The extra power of GMRS gets you extra distance. While only the best handheld GMRS radios can put out 5W of power, some mobile (car) or home radios can put out the full 50W, and use more capable exterior antennas too. There is also the MURS band, which offers very few channels and also very few devices. It is not in wide use, probably for good reason. Finally, some radios use some other unlicensed bands. The Motorola DTR and DLR series I will talk about operate in the 900MHz ISM band. Regulations there limit them to a maximum power of 1W, but as you will see, due to some other optimizations, their range is often quite similar to a 5W GMRS handheld. All of these radios share something in common: your radio can either transmit, or receive, but not both simultaneously. They all have a PTT (push-to-talk) button that you push and hold while you are transmitting, and at all other times, they act as receivers. You ll learn that doubling is a thing where 2 or more people attempt to transmit at the same time. To listeners, the result is often garbled. To the transmitters, they may not even be aware they did it since, after all, they were transmitting. Usually it will be clear pretty quickly as people don t get responses or responses say it was garbled. Only the digital Motorola DLR/DTR series detects and prevents this situation.

FRS and GMRS radios As mentioned, the FRS/GMRS radios are generally the most popular, and quite inexpensive. Those that can emit 2W will have pretty decent range; 5W even better (assuming a decent antenna), though the 5W ones will require a GMRS license. For the most part, there isn t much that differentiates one FRS radio from another, or (with a few more exceptions) one GMRS handheld from another. Do not believe the manufacturers claims of 50 mile range or whatever; more on range below. FRS and GMRS radios use FM. GMRS radios are permitted to use a wider bandwidth than FRS radios, but in general, FRS and GMRS users can communicate with each other from any brand of radio to any other brand of radio, assuming they are using basic voice services. Some FRS and GMRS radios can receive the NOAA weather radio. That s nice for wilderness use. Nicer ones can monitor it for alert tones, even when you re tuned to a different channel. The very nicest on this as far as I know, only the Garmin Rino series will receive and process SAME codes to only trigger alerts for your specific location. GMRS (but not FRS) also permits 1-second digital data bursts at periodic intervals. There are now three radio series that take advantage of this: the Garmin Rino, the Motorola T800, and BTech GMRS-PRO. Garmin s radios are among the priciest of GMRS handhelds out there; the top-of-the-line Rino will set you back $650. The cheapest is $350, but does not contain a replaceable battery, which should be an instant rejection of a device like this. So, for $550, you can get the middle-of-the-road Rino. It features a sophisticated GPS system with Garmin trail maps and such, plus a 5W GMRS radio with GPS data sharing and a very limited (13-character) text messaging system. It does have a Bluetooth link to a cell phone, which can provide a link to trail maps and the like, and limited functionality for the radio. The Rino is also large and heavy (due to its large map-capable screen). Many consider it to be somewhat dated technology; for instance, other ways to have offline maps now exist (such as my Garmin Fenix 6 Pro, which has those maps on a watch!). It is bulky enough to likely be left at home in many situations. The Motorola T800 doesn t have much to talk about compared to the other two. Both of those platforms are a number of years old. The newest entrant in this space, from budget radio maker Baofeng, is the BTech GMRS-PRO, which came out just a couple of weeks ago. Its screen, though lacking built-in maps, does still have a GPS digital link similar to Garmin s, and can show you a heading and distance to other GMRS-PRO users. It too is a 5W unit, and has a ton of advanced features that are rare in GMRS: ability to pair a Bluetooth headset to it directly (though the Garmin Rino supports Bluetooth, it doesn t support this), ability to use the phone app as a speaker/mic for the radio, longer text messages than the Garmin Rino, etc. The GMRS-PRO sold out within a few days of its announcement, and I am presently waiting for mine to arrive to review. At $140 and with a more modern radio implementation, for people that don t need the trail maps and the like, it makes a compelling alternative to Garmin for outdoor use. Garmin documents when GPS beacons are sent out: generally, when you begin a transmission, or when another radio asks for your position. I couldn t find similar documentation from Motorola or BTech, but I believe FCC regulations mean that the picture would be similar with them. 
In other words, none of these devices is continuously, automatically, transmitting position updates. However, you can request a position update from another radio. It should be noted that, while voice communication is compatible across FRS/GMRS, data communication is not. Garmin, Motorola, and BTech all have different data protocols that are incompatible with radios from other manufacturers. FRS/GMRS radios often advertise privacy codes. These do nothing to protect your privacy; see more under the privacy section below.

Motorola DLR and DTR series Although they can be used for similar purposes, and I do, these radios are unique from the others in this article in several ways:
  • Their sales and marketing is targeted at businesses rather than consumers
  • They use digital encoding of audio, rather than analog FM or AM
  • They use FHSS (Frequency-Hopping Spread Spectrum) rather than a set frequency
  • They operate on the 900MHz ISM band, rather than a 460MHz UHF band (or a lower band yet for MURS and CB)
  • The DLR series is quite small, smaller than many GMRS radios.
I don t have space to go into a lot of radio theory in this article, but I ll briefly expand on some of this. First, FHSS. A FHSS radio hops from frequency to frequency many times per second, following some preset hopping algorithm that is part of the radio. Although it complicates the radio design, it has some advantages; it tends to allow more users to share a band, and if one particular frequency has a conflict with something else, it will be for a brief fraction of a second and may not even be noticeable. Digital encoding generally increases the quality of the audio, and keeps the quality high even in degraded signal conditions where analog radios would experience static or a quieter voice. However, you also lose that sort of audible feedback that your signal is getting weak. When you get too far away, the digital signal drops off a cliff . Often, either you have a crystal-clear signal or you have no signal at all. Motorola s radios leverage these features to build a unique radio. Not only can you talk to a group, but you can select a particular person to talk to with a private conversation, and so forth. DTR radios can send text messages to each other (but only preset canned ones, not arbitrary ones). Channels are more like configurations; they can include various arbitrary groupings of radios. Deconfliction with other users is established via hopsets rather than frequencies; that is, the algorithm that it uses to hop from frequency to frequency. There is a 4-digit PIN in the DLR radios, and newer DTR radios, that makes privacy very easy to set up and maintain. As far as I am aware, no scanner can monitor DLR/DTR signals. Though they technically aren t encrypted, cracking a DLR/DTR conversation would require cracking Motorola s firmware, and the chances of this happening in your geographical proximity seem vanishingly small. I will write more below on comparing the range of these to GMRS radios, but in a nutshell, it compares well, despite the fact that the 900MHz band restrictions allow Motorola only 1W of power output with these radios. There are three current lines of Motorola DLR/DTR radios:
  • The Motorola DLR1020 and DLR1060 radios. These have no screen; the 1020 has two channels (configurations) while the 1060 supports 6. They are small and compact and great pocketable just work radios.
  • The Motorola DTR600 and DTR700 radios. These are larger, with a larger antenna (that should theoretically provide greater range) and have a small color screen. They support more channels and more features (eg, short messages, etc).
  • The Motorola Curve (aka DLR110). Compared to the DLR1060, it adds limited WiFi capabilities that are primarily useful in certain business environments. See this thread for more. These features are unlikely to be useful in the environments we re talking about here.
These radios are fairly expensive new, but DLRs can be readily found at around $60 on eBay. (DTRs for about $250.) They are quite rugged. Be aware when purchasing that some radios sold on eBay may not include a correct battery and charger. (Not necessarily a problem; Motorola batteries are easy to find online, and as with any used battery, the life of a used one may not be great.) For more advanced configuration, the Motorola CPS cable works with both radios (it plugs into the charging cradle) and is used with the programming software to configure them in more detail. The older Motorola DTR650, DTR550, and older radios are compatible with the newer DLR and DTR series, if you program the newer ones carefully. The older ones don't support PINs and have a less friendly way of providing privacy, but they do work also. However, for most, I think the newer ones will be friendlier; but if you find a deal on the older ones, hey, why not? This thread on the MyGMRS forums has tons of useful information on the DLR/DTR radios. Check it out for a lot more detail. One interesting feature of these radios is that they are aware if there are conflicting users on the channel, and even if anybody is hearing your transmission. If your transmission is not being heard by at least one radio, you will get an audible (and visual, on the DTR) indication that your transmission failed. One thing that pleasantly surprised me is just how tiny the Motorola DLR is. The whole thing with antenna is like a small candy bar, and thinner. My phone is slightly taller, much wider, and only a little thinner than the Motorola DLR. Seriously, it's more pocketable than most smartphones. The DTR is of a size more commonly associated with radios, though still on the smaller side. Some of the lowest-power FRS radios might get down to that size, but to get equivalent range, you need a 5W GMRS unit, which will be much bulkier. Being targeted at business users, the DLR/DTR don't include NOAA weather radio or GPS.
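To make the hopset idea from the FHSS discussion above a bit more concrete, here is a toy model in Python. It is not Motorola's actual hopping algorithm, and the channel grid and PINs are invented; it only illustrates why two radios sharing a hopset/PIN stay in sync while a radio on a different hopset rarely lands on the same frequency at the same moment.

# Toy model only: real FHSS radios follow regulatory dwell-time and
# channel-usage rules, and Motorola's real algorithm is not public.
import random

CHANNELS = [902.0 + 0.5 * i for i in range(50)]   # made-up 900 MHz channel grid

def hop_sequence(hopset_pin, hops=8):
    rng = random.Random(hopset_pin)               # same PIN -> same sequence
    return [rng.choice(CHANNELS) for _ in range(hops)]

alice = hop_sequence("1234")
bob   = hop_sequence("1234")      # shares alice's hopset: stays in sync
carol = hop_sequence("9876")      # different hopset: mostly misses them

print(alice == bob)                               # True
print(sum(a == c for a, c in zip(alice, carol)))  # overlaps are rare

The same intuition is behind the advice in the privacy section further down: changing the PIN away from the default effectively puts you on a different hopping sequence from anyone who left theirs alone.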

Power These radios tend to be powered by:
  • NiMH rechargeable battery packs
  • AA/AAA batteries
  • Lithium Ion batteries
Most of the cheap FRS/GMRS radios have a NiMH rechargeable battery pack and a terrible charge controller that will tend to overcharge, and thus prematurely destroy, the NiMH packs. This happened long ago in my GMRS radios, and now I use Eneloop NiMH AAs in them (charged separately by a proper charger). The BTech, Garmin, and Motorola DLR/DTR radios all use Li-Ion batteries. These have the advantage of being more efficient batteries, though you can't necessarily just swap in AAs in a pinch.
Pay attention to your charging options; if you are backpacking, for instance, you may want something that can charge from solar-powered USB or battery banks. The Motorola DLR/DTR radios need to sit in a charging cradle, but the cradle is powered by a Micro USB cable. The BTech GMRS-PRO is charged via USB-C. I don't know about the Garmin Rino or others. Garmin offers an optional AA battery pack for the Rino. BTech doesn't (yet) for the GMRS-PRO, but they do for some other models, and have stated accessories for the GMRS-PRO are coming. I don't have information about the T800. This is not an option for the DLR/DTR.

Meshtastic I'll briefly mention Meshtastic. It uses a low-power LoRa system. It can't handle voice transmissions; only data. On its own, it can transmit and receive automatic GPS updates from other Meshtastic devices, which you can view on its small screen. It forms a mesh, so each node can relay messages for others. It is also the only unit in this roundup that uses true encryption, and its battery lasts about a week, compared to the solid day you can expect out of the best of the others here. When paired with a cell phone, Meshtastic can also send and receive short text messages.
Meshtastic uses much less power than even the cheapest of the FRS radios discussed here. It can still achieve respectable range because it uses LoRa, which can trade bandwidth for power or range. It can take a second or two to transmit a 50-character text message. Still, the GMRS or Motorola radios discussed here will have more than double the point-to-point range of a Meshtastic device. And, if you intend to take advantage of the text messaging features, keep in mind that you must now take two electronic devices with you and maintain a charge for them both.

Privacy The privacy picture on these is interesting.

Cell phone privacy Cell phones are difficult for individuals to eavesdrop on, but a sophisticated adversary probably could, as could an unsophisticated adversary with any manner of malware. Privacy on modern smartphones is a huge area of trouble, and it is safe to say that data brokers and many apps probably know at least your location and contact list, if not also the content of your messages, though end-to-end encrypted apps such as Signal can certainly help. See Tools for Communicating Offline and in Difficult Circumstances for more details.

GMRS privacy GMRS radios are unencrypted and public. Anyone in range with another GMRS radio, or a scanner, can listen to your conversations even if you have a privacy code set. The privacy code does not actually protect your privacy; rather, it keeps your radio from playing conversations from others using the same channel, for your convenience. However, note the "in range" limitation. An eavesdropper would generally need to be within a few miles of you.

Motorola DLR/DTR privacy As touched on above, while these also aren't encrypted, as far as I am aware, no tools exist to eavesdrop on DLR/DTR conversations. Change the PIN away from the default 0000, ideally to something that doesn't end in 0 (to pick a different hopset), and you have pretty decent privacy right there. Decent doesn't mean perfect; it is certainly possible that sophisticated adversaries or state agencies could decode DLR/DTR traffic, since it is unencrypted. As a practical matter, though, the lack of consumer equipment that can decode this makes it, as I say, "pretty decent".

Meshtastic Meshtastic uses strong AES encryption. But as messaging features require a paired phone, the privacy implications of a phone also apply here.

Range I tested my best 5W GMRS radios, as well as a Motorola DTR600 talking to a DLR1060. (I also tried two DLR1060s talking to each other; there was no change in range.) I took a radio with me in the car, and had another sitting on my table indoors. Those of you familiar with radios will probably recognize that being in a car and being indoors both attenuate (reduce the strength of) the signal significantly. I drove around in a part of Kansas with gentle rolling hills.
Both the GMRS and the DLR/DTR had a range of about 2-3 miles. There were times when each was able to pull out a signal when the other was not. The DLR/DTR series was significantly better while the vehicle was in motion. In weaker signal conditions, the GMRS radios were susceptible to significant picket fencing (static caused by variation in the signal strength when passing things like trees), to the point of being inaudible or losing the signal entirely. The DLR/DTR remained perfectly clear there. I was able to find some spots where, while parked, the GMRS radios had a weak but audible signal but the DLR/DTR had none. However, in all those cases, the distance to GMRS dropping out as well was small. Basically, no radios penetrate the ground, and the valleys were a problem for them all.
Differences may play out in other ways in other environments as well: for instance, dense urban environments, heavy woods, indoor buildings, etc. GMRS radios can be used with repeaters, or have a rooftop antenna mounted on a car, both of which could significantly extend range and both of which are rare. The DLR/DTR series are said to be exceptionally good in indoor environments; Motorola rates them for penetrating 20 floors, for instance. Reports on the MyGMRS forums state that they are able to cover an entire cruise ship, while the metal and concrete in them poses a big problem for GMRS radios. Different outdoor landscapes may favor one or the other also.
Some of the cheapest FRS radios max out at about 0.5W or even less. This is probably only a little better than yelling distance in many cases. A lot of manufacturers obscure transmit power and use outlandish claims of range instead; don't believe those. Find the power output. A 2W FRS transmitter will be more credible range-wise, and the 5W GMRS transmitters I tested better yet. Note that even GMRS radios are restricted to 0.5W on channels 8-14. The Motorola DLR/DTR radios get about the same range with 1W as a GMRS radio does with 5W. The lower power output allows the DLR to be much smaller and lighter than a 5W GMRS radio for similar performance.

Overall conclusions Of course, what you use may depend on your needs. I'd generally say:
  • For basic use, the high quality, good range, reasonable used price, and very small size of the Motorola DLR would make it a good all-arounder. Give one to each person (or kid) for use at the mall or amusement park, take them with you to concerts and festivals, etc.
  • Between vehicles, the Motorola DLR/DTR have a clear range advantage over the GMRS radios for vehicles in motion, though the GPS features of the more advanced GMRS radios may be more useful here.
  • For wilderness hiking and the like, GMRS radios that have GPS, maps, and NOAA weather radio reception may prove compelling and worth the extra bulk. More flexible power options may also be useful.
  • Low-end FRS radios can be found very cheap; around $20-$30 new for the lowest end, though their low power output and questionable charging circuits may limit their utility where it really counts.
  • If you just can't move away from cell phones, try the Zoleo app, which can provide some radio-like features.
  • A satellite communicator is still good backup safety gear for the wilderness.

Postscript: A final plug for amateur radio My 10-year-old Kenwood TH-D71A already has features none of these others have. For instance, its support for APRS and ability to act as a digipeater for APRS means that TH-D71As can form an automatic mesh between them, each one repeating new GPS positions or text messages to the others. Traditional APRS doesn't perform well in weak signal situations; however, more modern digital systems like D-Star and DMR also support APRS over more modern codecs and provide all sorts of other advantages as well (though not FHSS). My conclusions above assume a person is not going to go the amateur radio route for whatever reason. If you can get those in your group to get their license (the Technician class is all you need), a whole world of excellent options opens to you.

Appendix: The Trisquare eXRS Prior to 2012, a small company named Trisquare made an FHSS radio they called the eXRS that operated on the 900MHz band like Motorola's DLR/DTR does. Trisquare aimed at consumers and their radios were cheaper than the Motorola DLR/DTR. However, that is where the similarities end. Trisquare used analog voice transmission, even though it used FHSS.
Also, there is a problem that can arise with FHSS systems: synchronization. The receiver must hop frequencies in exactly the same order at exactly the same time as the sender. Motorola has clearly done a lot of engineering around this, and I have never encountered a synchronization problem in my DLR/DTR testing, not even once. eXRS, on the other hand, had frequent synchronization problems, which manifested themselves in weak signal conditions and sometimes with doubling. When that would happen, everyone would have to be quiet for a minute or two to give all the radios a chance to time out and reset to the start of the hop sequence. In addition, the eXRS hardware wasn't great, and was susceptible to hardware failure.
There are some that still view eXRS as a legendary device and hoard them. You can still find them used on eBay. When eXRS came out in 2007, it was indeed nice technology for the day, ahead of its time in some ways. I used and loved the eXRS radios back then; powerful GMRS wasn't all that common. But compared to today's technology, eXRS has inferior range to both GMRS and Motorola DLR/DTR (from my recollection, about a third to half of what I get with today's GMRS and DLR/DTR), is prone to finicky synchronization issues when signals are weak, and isn't made very robustly. I therefore don't recommend the eBay eXRS units. Don't assume that the eXRS weaknesses extend to Motorola DLR/DTR; the DLR/DTR radios are done well and don't suffer from the same problems.
Note: This article has a long-term home on my website, where it may be updated from time to time.

14 August 2022

Sergio Durigan Junior: Debuginfod is coming to Ubuntu

These past couple of months I have been working to bring debuginfod to Ubuntu. I thought it would be a good idea to make this post and explain a little bit about what the service is and how I'm planning to deploy it. A quick recap: what's debuginfod? Here's a good summary of what debuginfod is:
debuginfod is a new-ish project whose purpose is to serve
ELF/DWARF/source-code information over HTTP.  It is developed under the
elfutils umbrella.  You can find more information about it here:
  https://sourceware.org/elfutils/Debuginfod.html
In a nutshell, by using a debuginfod service you will not need to
install debuginfo (a.k.a. dbgsym) files anymore; the symbols will be
served to GDB (or any other debuginfo consumer that supports debuginfod)
over the network.  Ultimately, this makes the debugging experience much
smoother (I myself never remember the full URL of our debuginfo
repository when I need it).
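To make that concrete, here is a minimal sketch of what using it looks like from the client side. The URL is Debian's public instance (mentioned below), and the binary name is just a placeholder:
  # Point debuginfod-aware tools (GDB, elfutils, ...) at a server.
  export DEBUGINFOD_URLS="https://debuginfod.debian.net"

  # Then debug as usual; symbols and sources are fetched over HTTP on demand.
  gdb /usr/bin/some-program
  (gdb) run
No dbgsym packages need to be installed beforehand; the first time GDB needs a symbol it downloads and caches it locally.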
If you follow the Debian project, you might know that I run their debuginfod service. In fact, the excerpt above was taken from the announcement I made last year, letting the Debian community know that the service was available.
First stage With more and more GNU/Linux distributions offering a debuginfod service to their users, I strongly believe that Ubuntu cannot afford to stay out of this "party" anymore. Fortunately, I have a manager who not only agrees with me but also turned the right knobs in order to make this project one of my priorities for this development cycle. The deployment of this service will be made in stages. The first one, whose results are due to be announced in the upcoming weeks, encompasses indexing and serving all of the available debug symbols from the official Ubuntu repository. In other words, the service will serve everything from main, universe and multiverse, from every supported Ubuntu release out there. This initial (a.k.a. "alpha") stage will also allow us to have an estimate of how much the service is used, so that we can better determine the resources allocated to it.
More down the road This is just the beginning. In the following cycles, I will be working on a few interesting projects to expand the scope of the service and make it even more useful for the broader Ubuntu community. To give you an idea, here is what is on my plate:
  • Working on the problem of indexing and serving source code as well. This is an interesting problem and I already have some ideas, but it's also challenging and may unfold into more sub-projects. The good news is that a solution for this problem will also be beneficial to Debian.
  • Working with the snap developers to come up with a way to index and serve debug symbols for snaps as well.
  • Improve the integration of the service into Ubuntu. In fact, I have already started working on this by making elfutils (actually, libdebuginfod) install a customized shell snippet to automatically set up access to Ubuntu's debuginfod instance (sketched below).
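That snippet hasn't shipped yet, so the following is only a sketch of the idea; the file path and the URL are assumptions on my part, not the final implementation:
  # /etc/profile.d/debuginfod.sh (hypothetical path and URL)
  # Prepend Ubuntu's instance to any URLs the user already configured.
  export DEBUGINFOD_URLS="https://debuginfod.ubuntu.com${DEBUGINFOD_URLS:+ $DEBUGINFOD_URLS}"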
As you can see, there's a lot to do. I am happy to be working on this project, and I hope it will be helpful and useful for the Ubuntu community.

12 August 2022

Wouter Verhelst: Upgrading a Windows 10 VM to Windows 11

I run Debian on my laptop (obviously); but occasionally, for $DAYJOB, I have some work to do on Windows. In order to do so, I have had a Windows 10 VM in my libvirt configuration that I can use. A while ago, Microsoft issued Windows 11. I recently found out that all the components for running Windows 11 inside a libvirt VM are available, and so I set out to upgrade my VM from Windows 10 to Windows 11. This wasn't as easy as I thought, so here's a bit of a writeup of all the things I ran up against, and how I fixed them.
Windows 11 has a number of hardware requirements that aren't necessary for Windows 10; the most important three are:
  • Secure Boot is required (Windows 10 would still boot on a machine without Secure Boot, although buying hardware without at least support for that hasn't been possible for several years now)
  • A v2.0 TPM module (Windows 10 didn't need any TPM)
  • A modern enough processor.
So let's see about all three.

A modern enough processor If your processor isn't modern enough to run Windows 11, then you can probably forget about it (unless you want to use qemu JIT compilation -- I dunno, probably not going to work, and also not worth it if it were). If it is, all you need is the "host-passthrough" setting in libvirt, which I've been using for a long time now. Since my laptop is less than two months old, that's not a problem for me.
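For reference, host-passthrough is a single attribute on the cpu element of the domain XML; a minimal sketch, with the rest of the cpu element (topology, features) left as virt-manager generated it:
<cpu mode="host-passthrough"/>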

A TPM 2.0 module My Windows 10 VM did not have a TPM configured, because it wasn't needed. Luckily, a quick web search told me that enabling that is not hard. All you need to do is:
  • Install the swtpm and swtpm-tools packages
  • Add the TPM module, by adding the following XML snippet to your VM configuration:
    <devices>
      <tpm model='tpm-tis'>
        <backend type='emulator' version='2.0'/>
      </tpm>
    </devices>
    
    Alternatively, if you prefer the graphical interface, click on the "Add hardware" button in the VM properties, choose the TPM, set it to Emulated, model TIS, and set its version to 2.0.
You're done! Well, with this part, anyway. Read on.

Secure boot Here is where it gets interesting. My Windows 10 VM was old enough that it was configured for the older i440fx chipset. This one is limited to PCI and IDE, unlike the more modern q35 chipset (which supports PCIe and SATA, and does not support IDE nor SATA in IDE mode). There is a UEFI/Secure Boot-capable BIOS for qemu, but it apparently requires the q35 chipset.
Fun fact (which I found out the hard way): Windows stores where its boot partition is somewhere. If you change the hard drive controller from an IDE one to a SATA one, you will get a BSOD at startup. In order to fix that, you need a recovery drive. To create the virtual USB disk, go to the VM properties, click "Add hardware", choose "Storage", choose the USB bus, and then under "Advanced options", select the "Removable" option, so it shows up as a USB stick in the VM. Note: this takes a while to do (took about an hour on my system), and your virtual USB drive needs to be 16G or larger (I used the libvirt default of 20G).
There is no possibility, using the buttons in the virt-manager GUI, to convert the machine from i440fx to q35. However, that doesn't mean it's not possible to do so. I found that the easiest way is to use the direct XML editing capabilities in the virt-manager interface; if you edit the XML in an editor it will produce error messages if something doesn't look right and tell you to go and fix it, whereas the virt-manager GUI will actually fix things itself in some cases (and will produce helpful error messages if not). What I did was (a condensed sketch of the resulting XML follows the list):
  • Take backups of everything. No, really. If you fuck up, you'll have to start from scratch. I'm not responsible if you do.
  • Go to the Edit->Preferences option in the VM manager, then on the "General" tab, choose "Enable XML editing"
  • Open the Windows VM properties, and in the "Overview" section, go to the "XML" tab.
  • Change the value of the machine attribute of the domain.os.type element, so that it says pc-q35-7.0.
  • Search for the domain.devices.controller element that has pci in its type attribute and pci-root in its model one, and set the model attribute to pcie-root instead.
  • Find all domain.devices.disk.target elements, setting their dev=hdX to dev=sdX, and bus="ide" to bus="sata"
  • Find the USB controller (domain.devices.controller with type="usb") and set its model to qemu-xhci. You may also want to add ports="15" if you didn't have that yet.
  • Perhaps also add a few PCIe root ports:
    <controller type="pci" index="1" model="pcie-root-port"/>
    <controller type="pci" index="2" model="pcie-root-port"/>
    <controller type="pci" index="3" model="pcie-root-port"/>
    
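To make the list above a bit more concrete, here is a condensed sketch of how the relevant bits of the domain XML end up looking after the conversion. This is illustrative only: indices, addresses and all the other elements of a real configuration are omitted, and virt-manager will fill in more detail on its own.
<domain type="kvm">
  <os>
    <type arch="x86_64" machine="pc-q35-7.0">hvm</type>
  </os>
  <devices>
    <controller type="pci" model="pcie-root"/>
    <controller type="usb" model="qemu-xhci" ports="15"/>
    <disk type="file" device="disk">
      <!-- was dev="hda" bus="ide" on the i440fx machine -->
      <target dev="sda" bus="sata"/>
    </disk>
  </devices>
</domain>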
I figured out most of this by starting the process for creating a new VM, on the last page of the wizard that pops up selecting the "Modify configuration before installation" option, going to the "XML" tab on the "Overview" section of the new window that shows up, and then comparing that against what my current VM had. Also, it took me a while to get this right, so I might have forgotten something. If virt-manager gives you an error when you hit the Apply button, compare notes against the VM that you're in the process of creating, and copy/paste things from there to the old VM to make the errors go away. As long as you don't remove configuration that is critical for things to start, this shouldn't break matters permanently (but hey, use your backups if you do break -- you have backups, right?)
OK, cool, so now we have a Windows VM that is... unable to boot. Remember what I said about Windows storing where the controller is? Yeah, there you go. Boot from the virtual USB disk that you created above, and select the "Fix the boot" option in the menu. That will fix it. Ha ha, only kidding. Of course it doesn't. I honestly can't tell you everything that I fiddled with, but I think the bit that eventually fixed it was where I chose "safe mode", which caused the system to do a hiccup, a regular reboot, and then suddenly everything was working again. Meh. Don't throw the virtual USB disk away yet, you'll still need it.
Anyway, once you have it booting again, you will now have a machine that theoretically supports Secure Boot, but you're still running off an MBR partition. I found a procedure on how to convert things from MBR to GPT that was written almost 10 years ago, but surprisingly it still works, except for the bit where the procedure suggests you use diskmgmt.msc (for one thing, that was renamed; and for another, it can't touch the partition table of the system disk either). The last step in that procedure says to "restart your computer!", which is fine, except at this point you obviously need to switch over to the TianoCore firmware, otherwise you're trying to read a UEFI boot configuration on a system that only supports MBR booting, which obviously won't work. In order to do that, you need to add a loader element to the domain.os element of your libvirt configuration:
<loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
When you do this, you'll note that virt-manager automatically adds an nvram element. That's fine, let it. I figured this out by looking at the documentation for enabling Secure Boot in a VM on the Debian wiki, and using the same trick as for how to switch chipsets that I explained above.
Okay, yay, so now secure boot is enabled, and we can install Windows 11! All good? Well, almost. I found that once I enabled secure boot, my display reverted to a 1024x768 screen. This turned out to be because I was using older unsigned drivers, and since we're using Secure Boot, that's no longer allowed, which means Windows reverts to the default VGA driver, and that only supports the 1024x768 resolution. Yeah, I know.
The solution is to download the virtio-win ISO from one of the links in the virtio-win github project, connecting it to the VM, going to Device manager, selecting the display controller, clicking on the "Update driver" button, telling the system that you have the driver on your computer, browsing to the CD-ROM drive, clicking the "include subdirectories" option, and then telling Windows to do its thing. While there, it might be good to do the same thing for unrecognized devices in the device manager, if any.
So, all I have to do next is to get used to the completely different user interface of Windows 11. Sigh. Oh, and to rename the "w10" VM to "w11", or some such. Maybe.

Guido Günther: On a road to Prizren with a Free Software Phone

Since people are sometimes slightly surprised that you can go on a multi-week trip with a smartphone running only free software, I wanted to share some impressions from my recent trip to Prizren/Kosovo to attend Debconf 22 using a Librem 5. It's a mix of things that happened and bits that got improved to hopefully make things more fun to use. And, yes, there won't be any big surprises in this read, like being stranded without the ability to make phone calls, because there weren't, and there shouldn't be.
After two online versions, Debconf 22 (the annual Debian Conference) took place in Prizren, Kosovo this year, and I sure wanted to go. Looking for options, I settled on a train trip to Vienna, to meet there with friends and continue the trip via bus to Zagreb, then switching to a final 11h direct bus to Prizren.
When preparing for the trip and making sure my Librem 5 phone has all the needed documents, I noticed that there would be quite some PDFs to show until I arrive in Kosovo: train ticket, bus ticket, hotel reservation, and so on. While that works (unlocking the phone, opening the file browser, navigating to the folder with the PDFs and showing the document via evince), it looked like a lot of steps to repeat. Can't we have that information on the Phone Shell's lockscreen?
This was a good opportunity to see if the upcoming plugin infrastructure for the lock screen (initially meant to allow for a plugin to show upcoming events) was flexible enough, so I used some leisure time on the train to poke at this, and just before I reached Vienna I was able to use it for the first time. It was the very last check of that ticket, and it was also a bit of cheating, since I didn't present the ticket on the phone itself but from phosh (the phone's graphical shell) running on my laptop. But still.
PDF barcode on phosh's lockscreen
List of tickets on phosh's lockscreen
This was possible since phosh is written in GTK, so I could just leverage evince's EvView. Unfortunately the hotel check-in didn't want to see any documents. For the next day I moved the code over to the Librem 5 and (being a bit nervous as the queue to get on the bus was quite long) could happily check into the Flixbus by presenting the barcode to the barcode reader via the Librem 5's lockscreen.
When switching to the bus to Prizren I didn't get to use that feature again, as we bought the tickets at a counter, but we got a nice krem banana after entering the bus (they're not filled with jelly, but krem: a real Kosovo must-eat!). Although it was a rather long trip, we had frequent breaks, and I'd certainly take the same route again. Here's a photo of Prizren taken on the Librem 5 without any additional postprocessing:
Prizren
What about seeing the conference schedule on the phone? Confy (a conference schedule viewer using GTK and libhandy) to the rescue:
Confy with Debconf's schedule
Since Debian's confy maintainer was around too, confy saw a bunch of improvements over the conference. For getting around, Puremaps (an application to display maps and show routing instructions) was very helpful, here geolocating me in Prizren via GPS:
Puremaps
Puremaps currently isn't packaged in Debian but there's work ongoing to fix that (I used the flatpak for the moment). We got ourselves sim cards for the local phone network. For some reason mine wouldn't work (other sim cards from the same operator worked in my phone but this one just wouldn't).
So we went to the sim card shop and the guy there was perfectly able to operate the Librem 5 without further explanation (including making calls, sending USSD codes to query balance, ...). The sim card problem turned out to be a problem on the operator side, and after a couple of days they got it working.
We had nice, sunny weather about all the time. That made me switch between high contrast mode (to read things in bright sunlight) and normal mode (e.g. in conference rooms) on the phone quite often. Thankfully we have an ambient light sensor in the phone, so we can make that automatic.
Phosh in HighContrast
See here for a video.
Jathan kicked off a DebianOnMobile sprint during the conference where we were able to improve several aspects of mobile support in Debian, and on Friday I had the chance to give a talk about the state of Debian on smartphones. pdf-presenter-console is a great tool for this, as it can display the current slide together with additional notes. I needed some hacks to make it fit the phone screen but hopefully we figure out a way to have this by default.
Debconf talk
Pdf presenter console on a phone
I had two great weeks in Prizren. Many thanks to the organizers of Debconf 22 - I really enjoyed the conference.
