Search Results: "bofh"

24 March 2024

Marco d'Itri: CISPE's call for new regulations on VMware

A few days ago CISPE, a trade association of European cloud providers, published a press release complaining about the new VMware licensing scheme and asking for regulators and legislators to intervene. But VMware does not have a monopoly on virtualization software: I think that asking regulators to interfere is unnecessary and unwise, unless, of course, they wish to question the entire foundations of copyright. Which, on the other hand, could be an intriguing position that I would support... I believe that over-reliance on a single supplier is a typical enterprise risk: in the past decade some companies have invested in developing their own virtualization infrastructure using free software, while others have decided to rely entirely on a single proprietary software vendor. My only big concern is that many public sector organizations will continue to use VMware and pay the huge fees designed by Broadcom to extract the maximum amount of money from their customers. However, it is ultimately the citizens who pay these bills, and blaming the evil US corporation is a great way to avoid taking responsibility for these choices.

"Several CISPE members have stated that without the ability to license and use VMware products they will quickly go bankrupt and out of business."

Insert here the Jeremy Clarkson "Oh no! Anyway..." meme.

11 February 2024

Marco d'Itri: Extending access to the systemd RuntimeDirectory with a POSIX ACL

inn2 uses ephemeral UNIX domain sockets in /run/news/ to communicate with the ctlinnd program. Since the directory is only writeable by the "news" user, other unprivileged users are not able to use the command. I solved this by extending the inn2.service systemd unit with a drop-in file which uses setfacl to give access to my user "md" to the RuntimeDirectory created by systemd. This is the content of /etc/systemd/system/inn2.service.d/md-ctlinnd.conf:

[Service]
# innd will change the permissions of /run/news/ when started: without
# creating it now with mode 0775 then that will change the ACL mask.
RuntimeDirectoryMode=0775
# allow user md to run ctlinnd(8), which creates sockets in /run/news/
ExecStartPost=/usr/bin/setfacl --modify user:md:rwx $RUNTIME_DIRECTORY

The non-obvious issue here is that the innd daemon on startup will change the directory permissions in a way which sets a more restrictive (non group-writeable) ACL mask, and this would make the newly created user ACL ineffective. The solution is to create the directory group-writeable from the start. (Beware: this creates a trivial privilege escalation from md to news.)
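
The interaction between the mask and the named-user entry can be checked by hand: the effective permissions of the entry are its own permissions with every bit that is missing from the mask removed. A minimal sketch in plain shell, where acl_effective is a hypothetical helper and not part of the acl tools:

```shell
# acl_effective ENTRY MASK
# Both arguments are three-character permission strings such as "rwx" or "r-x".
# Prints the effective permissions: a bit survives only if it is set in both.
acl_effective() {
    entry=$1
    mask=$2
    result=""
    for pos in 1 2 3; do
        e=$(printf '%s' "$entry" | cut -c "$pos")
        m=$(printf '%s' "$mask" | cut -c "$pos")
        if [ "$e" != "-" ] && [ "$e" = "$m" ]; then
            result="$result$e"
        else
            result="$result-"
        fi
    done
    printf '%s\n' "$result"
}

# Directory left at mode 0755 (mask r-x): user:md:rwx loses write access.
acl_effective rwx r-x
# Directory created with mode 0775 (mask rwx): the entry is fully effective.
acl_effective rwx rwx
```

On a live system, getfacl /run/news reports the same reduction in its "#effective:" comments.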

10 June 2023

Marco d'Itri: On having a track record in operating systems development

Now that Debian 12 has been released with proprietary firmwares on the official media, non-optional merged-/usr and systemd adopted by everybody, I want to take a moment to list, not without some pride, a few things that I was right about over the last 20 years:

Distribution of proprietary firmwares (#33, #40, #114)
udev
systemd (#454)
merged-/usr

Accepting the obvious solution about firmwares took 18 years. My work on the merged-/usr transition started in 2014, and the first discussions about replacing sysvinit are from 2011. The general adoption of udev (and dynamic device names, and persistent network interface names...) took less time in comparison and no large-scale flame wars, since people could enable it at their own pace. But it required countless little debates in the Debian Bug Tracking System: I still remember the people insisting that they would never use this newfangled dynamic /dev/, or complaining about their beloved /dev/cdrom symbolic link and persistent network interface names. So follow me for more rants about inevitable technologies.

9 April 2023

Marco d'Itri: Installing Debian 12 on a Banana Pi M5

I recently bought a Banana Pi BPI-M5, which uses the Amlogic S905X3 SoC: these are my notes about installing Debian on it. While this SoC is supported by the upstream U-Boot it is not supported by the Debian U-Boot package, so debian-installer does not work. Do not be fooled by seeing the DTB file for this exact board being distributed with debian-installer: all DTB files are, and it does not mean that the board is supposed to work. As I documented in #1033504, the Debian kernels are currently missing some patches needed to support the SD card reader. I started by downloading an Armbian Banana Pi image and booted it from an SD card. From there I partitioned the eMMC, which always appears as /dev/mmcblk1:

parted /dev/mmcblk1
(parted) mklabel msdos
(parted) mkpart primary ext4 4194304B -1
(parted) align-check optimal 1
mkfs.ext4 /dev/mmcblk1p1

Make sure to leave enough space before the first partition, or else U-Boot will overwrite it: as is common for many ARM SoCs, U-Boot lives somewhere in the gap between the MBR and the first partition. I looked at Armbian's /usr/lib/u-boot/platform_install.sh and installed U-Boot by manually copying it to the eMMC:

dd if=/usr/lib/linux-u-boot-edge-bananapim5_22.08.6_arm64/u-boot.bin of=/dev/mmcblk1 bs=1 count=442
dd if=/usr/lib/linux-u-boot-edge-bananapim5_22.08.6_arm64/u-boot.bin of=/dev/mmcblk1 bs=512 skip=1 seek=1
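
The two dd invocations can be rehearsed on scratch files before touching a real device. The sketch below uses made-up stand-in files, and adds conv=notrunc only because the target here is a regular file rather than a block device. It shows that the first step stops short of the MBR partition table, which starts at byte offset 446, while the second step fills the gap from the second sector onwards:

```shell
# Rehearse the two-step copy on scratch files instead of a real eMMC.
# Both files are hypothetical stand-ins created in a temporary directory.
work=$(mktemp -d)
disk="$work/disk.img"
uboot="$work/u-boot.bin"

# A 4 MiB "disk" filled with 0xff bytes and a 64 KiB "U-Boot" filled with zeros.
dd if=/dev/zero bs=512 count=8192 2>/dev/null | LC_ALL=C tr '\000' '\377' > "$disk"
dd if=/dev/zero of="$uboot" bs=512 count=128 2>/dev/null

# Step 1: only the first 442 bytes of sector 0.
dd if="$uboot" of="$disk" bs=1 count=442 conv=notrunc 2>/dev/null
# Step 2: everything from the second sector onwards.
dd if="$uboot" of="$disk" bs=512 skip=1 seek=1 conv=notrunc 2>/dev/null

# dump_bytes OFFSET COUNT: hex dump of a byte range of the scratch disk.
dump_bytes() {
    dd if="$disk" bs=1 skip="$1" count="$2" 2>/dev/null | od -An -v -tx1 | tr -d ' \n'
}

echo "byte 0:        $(dump_bytes 0 1)"    # replaced by the image
echo "bytes 446-449: $(dump_bytes 446 4)"  # start of the partition table, untouched
echo "byte 512:      $(dump_bytes 512 1)"  # second sector, replaced by the image
```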

Beware: Armbian's U-Boot 2022.10 is buggy, so I had to use an older image. I did not want to install a new system, so I copied over my old Cubieboard install:

mount /dev/mmcblk1p1 /mnt/
rsync -xaHSAX --delete --numeric-ids root@old-server:/ /mnt/ --exclude='/tmp/*' --exclude='/var/tmp/*'

Since the Cubieboard has a 32 bit CPU and the Banana Pi requires an arm64 kernel I enabled the architecture and installed a new kernel:

dpkg --add-architecture arm64
apt update
apt install linux-image-arm64
apt purge linux-image-6.1.0-6-armmp linux-image-armmp

At some point I will cross-grade the entire system. Even if ttyS0 exists it is not the serial console, which appears as ttyAML0 instead. Nowadays systemd automatically starts a getty if the serial console is enabled on the kernel command line, so I just had to disable the old manually-configured getty:

systemctl disable serial-getty@ttyS0.service

I wanted to have a fully working flash-kernel, so I used Armbian's boot.scr as a template to create /etc/flash-kernel/bootscript/bootscr.meson and then added a custom entry for the Banana Pi to /etc/flash-kernel/db:

Machine: Banana Pi BPI-M5
Kernel-Flavors: arm64
DTB-Id: amlogic/meson-sm1-bananapi-m5.dtb
U-Boot-Initrd-Address: 0x0
Boot-Initrd-Path: /boot/uInitrd
Boot-Initrd-Path-Version: yes
Boot-Script-Path: /boot/boot.scr
U-Boot-Script-Name: bootscr.meson
Required-Packages: u-boot-tools

All things considered, I would not recommend Amlogic-based boards to Debian users, since there are many other SoCs with better support.

15 February 2023

Marco d'Itri: I replaced grub with systemd-boot

To be able to investigate and work on the measured boot features I have switched from grub to systemd-boot (sd-boot). This initial step is optional, but it is useful because this way /etc/kernel/cmdline will become the new place where the kernel command line can be configured:

. /etc/default/grub
echo "root=/dev/mapper/root $GRUB_CMDLINE_LINUX $GRUB_CMDLINE_LINUX_DEFAULT" > /etc/kernel/cmdline
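
The same derivation can be tried without touching /etc/kernel/cmdline. This is only a sketch: build_cmdline is a hypothetical helper and the defaults file is made-up sample content. It also squeezes the double space that appears when one of the two grub variables is empty, which would mangle a quoted parameter containing repeated spaces:

```shell
# build_cmdline GRUB_DEFAULTS ROOT_DEVICE
# Prints the kernel command line derived from a grub defaults file.
build_cmdline() {
    (
        GRUB_CMDLINE_LINUX=""
        GRUB_CMDLINE_LINUX_DEFAULT=""
        # source the file in a subshell so that nothing leaks into the caller
        . "$1"
        printf 'root=%s %s %s\n' "$2" "$GRUB_CMDLINE_LINUX" "$GRUB_CMDLINE_LINUX_DEFAULT" |
            tr -s ' ' | sed 's/ *$//'
    )
}

# Example with a throwaway defaults file (hypothetical content).
sample=$(mktemp)
cat > "$sample" <<'END'
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX=""
END
build_cmdline "$sample" /dev/mapper/root
```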

Do not forget to set the correct root file system there, because initramfs-tools does not support discovering it at boot time using the Discoverable Partitions Specification. The installation has been automated since systemd version 252.6-1, so installing the package has the effect of installing sd-boot in the ESP, enabling it in the UEFI boot sequence and then creating boot loader entries for the kernels already installed on the system:

apt install systemd-boot

If needed, it can be manually installed again just by running bootctl install. I like to show the boot menu by default, at least until I am more familiar with sd-boot:

bootctl set-timeout 4

Since other UEFI binaries can be easily chainloaded, I am also going to keep around grub for a while, just to be sure:

cat <<END > /boot/efi/loader/entries/grub.conf
title Grub
linux /EFI/debian/grubx64.efi
END

At this point sd-boot works, but I still had to enable secure boot. So far sd-boot has not been signed with a Debian key known to the shim bootloader, so I needed to create a Machine Owner Key (MOK), enroll it in UEFI and then use it to sign everything. I dislike the complexity of mokutil and the other related programs, so after removing it and the boot shim I have decided to use sbctl instead. With it I easily created new keys, enrolled them in the EFI key store and then signed everything:

sbctl create-keys
sbctl enroll-keys
for file in /boot/efi/*/*/linux /boot/efi/EFI/*/*.efi; do
  sbctl sign -s "$file"
done

Since there is no sbctl package yet, I also need to make sure that kernels installed in the future will be automatically signed, so I have created a trivial script in /etc/kernel/install.d/ which automatically runs sbctl sign -s or sbctl remove-file. The Debian wiki SecureBoot page documents how to do this with mokutil and sbsigntool, but I think that sbctl is much friendlier. Since I am not using the boot shim, I also had to set DisableShimForSecureBoot=true in /etc/fwupd/uefi_capsule.conf to make firmware updates work automatically. As a bonus, I have also added to the boot menu the excellent Debian-based GRML live distribution. Since sd-boot is not capable of loopback-mounting CD-ROM images like grub, I first had to extract the kernel and initramfs and copy them to the ESP:

mount -o loop /boot/grml/grml64-full_2022.11.iso /mnt/
mkdir /boot/efi/grml/
cp /mnt/boot/grml64full/* /boot/efi/grml/
umount /mnt/
cat <<END > /boot/efi/loader/entries/grml.conf
title GRML
linux /grml/vmlinuz
initrd /grml/initrd.img
options boot=live bootid=grml64full202211 findiso=/grml/grml64-full_2022.11.iso live-media-path=/live/grml64-full net.ifnames=0 
END
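
The script in /etc/kernel/install.d/ mentioned earlier is not shown in the post. A minimal sketch of what such a kernel-install plugin could look like follows, assuming that it runs after the plugin which copies the kernel to ENTRY-DIR/linux. The file name, the sbctl_hook function and the SBCTL override are all made up for illustration:

```shell
# Sketch of a kernel-install plugin, e.g. a hypothetical
# /etc/kernel/install.d/95-sbctl.install. kernel-install calls each plugin as:
#   PLUGIN COMMAND KERNEL-VERSION ENTRY-DIR [KERNEL-IMAGE ...]
sbctl_hook() {
    command=$1
    entry_dir=$3
    sbctl=${SBCTL:-sbctl}    # overridable, e.g. SBCTL=echo for a dry run

    case "$command" in
        add)
            # sign the kernel copied to the ESP and save it in the sbctl database
            "$sbctl" sign -s "$entry_dir/linux"
            ;;
        remove)
            # drop the file from the sbctl database once the kernel is gone
            "$sbctl" remove-file "$entry_dir/linux"
            ;;
    esac
}

# Dry run with a made-up entry directory:
SBCTL=echo sbctl_hook add 6.1.0-5-amd64 /boot/efi/MACHINE-ID/6.1.0-5-amd64
```

Installed as a real plugin, the file would end with a call such as sbctl_hook "$@".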

As expected, after a reboot bootctl reports the new security features:

System:
      Firmware: UEFI 2.70 (Lenovo 0.4496)
 Firmware Arch: x64
   Secure Boot: enabled (user)
  TPM2 Support: yes
  Boot into FW: supported
Current Boot Loader:
      Product: systemd-boot 252.5-2
     Features:   Boot counting
                 Menu timeout control
                 One-shot menu timeout control
                 Default entry control
                 One-shot entry control
                 Support for XBOOTLDR partition
                 Support for passing random seed to OS
                 Load drop-in drivers
                 Support Type #1 sort-key field
                 Support @saved pseudo-entry
                 Support Type #1 devicetree field
                 Boot loader sets ESP information
          ESP: /dev/disk/by-partuuid/1b767f8e-70fa-5a48-b444-cfe5c272d66e
         File:  /EFI/systemd/systemd-bootx64.efi
...

Relevant documentation:

systemd-boot(7)

systemd-cryptenroll(1)

sbctl

Boot Loader Specification

Unified Kernel Image (UKI)

27 December 2022

Ian Wienand: Redirecting webfinger requests with Apache

If you have a personal domain, it is nice if you can redirect webfinger requests so you can be easily found via your email. This is hardly a new idea, but the recent growth of Mastodon has made this more prominent. I wanted to redirect webfinger endpoints to a Mastodon host I am using, but only for my email and using only standard Apache rewrites. Below, replace xxx@yyy\.com with your email and zzz.social with the account to be redirected to. There are a couple of tricks in being able to inspect the query-string and quoting, but the end result that works for me is

RewriteEngine On
RewriteMap lc int:tolower
RewriteMap unescape int:unescape

RewriteCond %{REQUEST_URI} ^/\.well-known/webfinger$
RewriteCond ${lc:${unescape:%{QUERY_STRING}}} (?:^|&)resource=acct:xxx@yyy\.com(?:$|&)
RewriteRule ^(.*)$ https://zzz.social/.well-known/webfinger?resource=acct:xxx@zzz.social [L,R=302]

RewriteCond %{REQUEST_URI} ^/\.well-known/host-meta$
RewriteCond ${lc:${unescape:%{QUERY_STRING}}} (?:^|&)resource=acct:xxx@yyy\.com(?:$|&)
RewriteRule ^(.*)$ https://zzz.social/.well-known/host-meta?resource=acct:xxx@zzz.social [L,R=302]

RewriteCond %{REQUEST_URI} ^/\.well-known/nodeinfo$
RewriteCond ${lc:${unescape:%{QUERY_STRING}}} (?:^|&)resource=acct:xxx@yyy\.com(?:$|&)
RewriteRule ^(.*)$ https://zzz.social/.well-known/nodeinfo?resource=acct:xxx@zzz.social [L,R=302]

c.f. https://blog.bofh.it/debian/id_464
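
To see which query strings the condition on the query string is meant to accept, the match can be approximated in plain shell. This is only a rough sketch: webfinger_matches is a hypothetical helper, and it decodes just the two escapes that matter here (%40 and %3A) instead of doing full URL unescaping like the Apache map:

```shell
# webfinger_matches QUERY_STRING ACCOUNT
# Succeeds if the query string asks for acct:ACCOUNT (ACCOUNT in lower case).
webfinger_matches() {
    # decode "@" and ":" and then lower-case, like the unescape and lc maps
    query=$(printf '%s' "$1" | sed -e 's/%40/@/g' -e 's/%3[Aa]/:/g' | tr '[:upper:]' '[:lower:]')
    # escape the dots so that they only match literal dots
    account=$(printf '%s' "$2" | sed 's/\./\\./g')
    printf '%s\n' "$query" | grep -Eq "(^|&)resource=acct:${account}(\$|&)"
}

webfinger_matches 'resource=acct%3AXXX%40yyy.com' 'xxx@yyy.com' && echo "redirected"
webfinger_matches 'resource=acct:someone@yyy.com' 'xxx@yyy.com' || echo "not redirected"
```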

6 November 2022

Marco d'Itri: Using a custom domain as the Mastodon identity

I just did again the usual web search, and I have verified that Mastodon still does not support managing multiple domains on the same instance, and that there is still no way to migrate an account to a different instance without losing all posts (and more?). As much as I like the idea of a federated social network, open standards and so on, I do not think that it would be wise for me to spend time developing a social network identity on somebody else's instance which could disappear at any time. I have managed my own email server since the '90s, but I do not feel that the system administration effort required to maintain a private Mastodon instance would be justified at this point: there is not even a Debian package! Mastodon either needs to become much simpler to maintain or become much more socially important, and so far it is neither. Also, it would be wasteful to use so many computing resources for a single-user instance. While it is not ideal, for the time being I compromised by redirecting WebFinger requests for md@linux.it using this Apache configuration:

<Location /.well-known/host-meta>
  Header set Access-Control-Allow-Origin: "*"
  Header set Content-Type: "application/xrd+json; charset=utf-8"
  Header set Cache-Control: "max-age=86400"
</Location>
<Location /.well-known/webfinger>
  Header set Access-Control-Allow-Origin: "*"
  Header set Content-Type: "application/jrd+json; charset=utf-8"
  Header set Cache-Control: "max-age=86400"
</Location>

# WebFinger (https://www.rfc-editor.org/rfc/rfc7033)
RewriteMap lc int:tolower
RewriteMap unescape int:unescape

RewriteCond %{REQUEST_URI} ^/\.well-known/webfinger$
RewriteCond ${lc:${unescape:%{QUERY_STRING}}} (?:^|&)resource=acct:([^&]+)@linux\.it(?:$|&)
RewriteRule .+ /home/soci/%1/public_html/webfinger.json [L,QSD]

# answer 404 to requests missing "acct:" or for domains != linux.it
RewriteCond %{REQUEST_URI} ^/\.well-known/webfinger$
RewriteCond ${unescape:%{QUERY_STRING}} (?:^|&)resource=
RewriteRule .+ - [L,R=404]

# answer 400 to requests without the resource parameter
RewriteCond %{REQUEST_URI} ^/\.well-known/webfinger$
RewriteRule .+ - [L,R=400]
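
The post does not show the content of webfinger.json. The sketch below generates a plausible one, modelled on what Mastodon instances usually answer. The make_webfinger helper, the instance name and the user name are made up, and the simplest way to obtain the real document is probably to save the answer of the instance's own /.well-known/webfinger endpoint:

```shell
# make_webfinger INSTANCE USER
# Prints a JRD document describing USER on the Mastodon host INSTANCE.
make_webfinger() {
    cat <<END
{
  "subject": "acct:$2@$1",
  "aliases": [
    "https://$1/@$2",
    "https://$1/users/$2"
  ],
  "links": [
    {
      "rel": "http://webfinger.net/rel/profile-page",
      "type": "text/html",
      "href": "https://$1/@$2"
    },
    {
      "rel": "self",
      "type": "application/activity+json",
      "href": "https://$1/users/$2"
    }
  ]
}
END
}

# Hypothetical instance and user name:
make_webfinger mastodon.example someuser
```

The output would then be saved as public_html/webfinger.json in the home directory of the matching user.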

3 October 2022

Marco d'Itri: Debian bookworm on a Lenovo T14s Gen3 AMD

I recently upgraded my laptop to a Lenovo T14s Gen3 AMD and I am happy to report that it works just fine with Debian/unstable using a 5.19 kernel. The only issue is that some firmware files are still missing and I had to install them manually. Updates are needed for the firmware-amd-graphics package (#1019847) for the Radeon 680M GPU (AMD Rembrandt) and for the firmware-atheros package (#1021157) for the Qualcomm NFA725A Wi-Fi card (which is actually reported as a NFA765). s2idle (AKA "modern suspend") works too, and a ~10 seconds delay on resume has been removed by setting iommu=pt on the kernel command line. For improved energy efficiency it is recommended to switch from the acpi_cpufreq CPU frequency scaling driver to amd_pstate. Please note that so far it is not loaded automatically. As expected, fwupdmgr can update the system BIOS and the firmware of the NVMe device. Everybody should do it immediately, because there are major suspend bugs with BIOS releases earlier than 1.25.

25 July 2021

Marco d'Itri: Run an Ansible playbook in a chroot

Running a playbook in a remote chroot or container is not supported by Ansible, but I have invented a good workaround to do it anyway. The first step is to install Mitogen for Ansible (ansible-mitogen in Debian) and then configure ansible.cfg to use it:

[defaults]
strategy = mitogen_linear

But everybody should use Mitogen anyway, because it makes Ansible much faster. The trick to have Ansible operate in a chroot is to make it call a wrapper script instead of Python. The wrapper can be created manually or by another playbook, e.g.:

vars:
  - fsroot: /mnt

tasks:
  - name: Create the chroot wrapper
    copy:
      dest: "/usr/local/sbin/chroot_{{ inventory_hostname_short }}"
      mode: 0755
      content: |
        #!/bin/sh -e
        exec chroot {{ fsroot }} /usr/bin/python3 "$@"

  - name: Continue with stage 2 inside the chroot
    debug:
      msg:
        - "Please run:"
        - "ansible-playbook therealplaybook.yaml -l {{ inventory_hostname }} -e ansible_python_interpreter=/usr/local/sbin/chroot_{{ inventory_hostname_short }}"

This works thanks to Mitogen, which funnels all remote tasks inside that single call to Python. It would not work with standard Ansible, because it copies files to the remote system with SFTP and would do it outside of the chroot. The same principle can also be applied to containers by changing the wrapper script, e.g.:

#!/bin/sh -e
exec systemd-run --quiet --pipe --machine={{ container_name }} --service-type=exec /usr/bin/python3 "$@"

Once the wrapper has been installed you can run the real playbook by setting the ansible_python_interpreter variable, either on the command line, in the inventory or anywhere else that variables can be defined:

ansible-playbook therealplaybook.yaml -l {{ inventory_hostname }} -e ansible_python_interpreter=/usr/local/sbin/chroot_{{ inventory_hostname_short }}
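
For the manual route, the chroot wrapper can be written with a few lines of shell on the remote system. A minimal sketch, where make_chroot_wrapper is a hypothetical helper and the scratch destination stands in for the real path under /usr/local/sbin/:

```shell
# make_chroot_wrapper DEST FSROOT
# Writes a wrapper script that runs Python inside the chroot at FSROOT.
make_chroot_wrapper() {
    cat > "$1" <<END
#!/bin/sh -e
exec chroot $2 /usr/bin/python3 "\$@"
END
    chmod 0755 "$1"
}

# Example: write the wrapper to a scratch file and show it.
wrapper=$(mktemp)
make_chroot_wrapper "$wrapper" /mnt
cat "$wrapper"
```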

19 May 2021

Marco d'Itri: My resignation from freenode

As it is now known, the freenode IRC network has been taken over by a Trumpian wannabe korean royalty bitcoins millionaire. To make a long story short, the former freenode head of staff secretly "sold" the network to this person even if it was not hers to sell, and our lawyers have advised us that there is not much that we can do about it without some of us risking financial ruin. Fuck you Christel, lilo's life work did not deserve this. What you knew as freenode after 12:00 UTC of May 19 will be managed by different people. As I have no desire to volunteer under the new regime, this marks the end of my involvement with freenode. It had started in 1999 when I encouraged the good parts of #linux-it to leave ircnet, and soon after I became senior staff. Even if I have not been very active recently, at this point I was the longest-serving freenode staff member and now I expect that I will hold this record forever. The people that I have met on IRC, on freenode and other networks, have been and still are a very important part of my life, second only to the ones that I have known thanks to Usenet. I am not fine, but I know that the communities which I have been a part of are not defined by a domain name and will regroup somewhere else. The current freenode staff members have resigned with me, these are some of their farewell messages:
amdj

edk

emilsp

Fuchs

jess

JonathanD

kline

niko

mniip

Swant
Together we have created Libera.Chat, a new IRC network based on the same principles of the old freenode.

26 October 2020

Marco d'Itri: RPKI validation with FORT Validator

This article documents how to install FORT Validator (an RPKI relying party software which also implements the RPKI to Router protocol in a single daemon) on Debian 10 to provide RPKI validation to routers. If you are using testing or unstable then you can just skip the part about apt pinnings. The packages in bullseye (Debian testing) can be installed as is on Debian stable with no need to rebuild them, by configuring an appropriate pinning for apt:

cat <<END > /etc/apt/sources.list.d/bullseye.list
deb http://deb.debian.org/debian/ bullseye main
END

cat <<END > /etc/apt/preferences.d/pin-rpki
# by default do not install anything from bullseye
Package: *
Pin: release bullseye
Pin-Priority: 100

Package: fort-validator rpki-trust-anchors
Pin: release bullseye
Pin-Priority: 990
END

apt update

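
The two priorities used in the pinning fall into different ranges of the table in apt_preferences(5). The lookup below paraphrases that table, with pin_meaning being a hypothetical helper. The packages of Debian stable keep the default priority of 500, so they win over everything from bullseye except the packages explicitly pinned at 990:

```shell
# pin_meaning PRIORITY
# Prints how apt treats a version with the given pin priority.
pin_meaning() {
    p=$1
    if   [ "$p" -ge 1000 ]; then echo "install even if this is a downgrade"
    elif [ "$p" -ge 990 ];  then echo "install even if not from the target release, unless the installed version is newer"
    elif [ "$p" -ge 500 ];  then echo "install unless the target release has a version or the installed version is newer"
    elif [ "$p" -ge 100 ];  then echo "install unless another distribution has a version or the installed version is newer"
    elif [ "$p" -gt 0 ];    then echo "install only if no version is installed"
    elif [ "$p" -lt 0 ];    then echo "never install"
    else                         echo "undefined"
    fi
}

pin_meaning 100   # everything else from bullseye
pin_meaning 990   # the RPKI packages
```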
Before starting, make sure that curl (or wget) and the web PKI certificates are installed:
apt install curl ca-certificates
If you already know about the legal issues related to the ARIN TAL then you may instruct the package to automatically install it. If you skip this step then you will be asked at installation time about it, either way is fine.

echo 'rpki-trust-anchors rpki-trust-anchors/get_arin_tal boolean true' \
  | debconf-set-selections

Install the package as usual:
apt install fort-validator
You may also install rpki-client and gortr on Debian 10, or maybe cfrpki and gortr. I have also tried packaging Routinator 3000 for Debian, but this effort is currently on hold because the Rust ecosystem is broken and hostile to the good packaging practices of Linux distributions.

Marco d'Itri: RPKI validation with OpenBSD's rpki-client and Cloudflare's gortr

This article documents how to install rpki-client (an RPKI relying party software, the actual validator) and gortr (which implements the RPKI to Router protocol) on Debian 10 to provide RPKI validation to routers. If you are using testing or unstable then you can just skip the part about apt pinnings. The packages in bullseye (Debian testing) can be installed as is on Debian stable with no need to rebuild them, by configuring an appropriate pinning for apt:

cat <<END > /etc/apt/sources.list.d/bullseye.list
deb http://deb.debian.org/debian/ bullseye main
END

cat <<END > /etc/apt/preferences.d/pin-rpki
# by default do not install anything from bullseye
Package: *
Pin: release bullseye
Pin-Priority: 100

Package: gortr rpki-client rpki-trust-anchors
Pin: release bullseye
Pin-Priority: 990
END

apt update

Before starting, make sure that curl (or wget) and the web PKI certificates are installed:
apt install curl ca-certificates
If you already know about the legal issues related to the ARIN TAL then you may instruct the package to automatically install it. If you skip this step then you will be asked at installation time about it, either way is fine.

echo 'rpki-trust-anchors rpki-trust-anchors/get_arin_tal boolean true' \
  | debconf-set-selections

Install the packages as usual:
apt install rpki-client gortr
And then configure rpki-client to generate its output in the JSON format needed by gortr:
echo 'OPTIONS=-j' > /etc/default/rpki-client
You may manually start the service unit to immediately generate the data instead of waiting for the next timer run:
systemctl start rpki-client &
gortr too needs to be configured to use the JSON data generated by rpki-client:
echo 'GORTR_ARGS=-bind :323 -verify=false -checktime=false -cache /var/lib/rpki-client/json' > /etc/default/gortr
And then it needs to be restarted to use the new configuration:
systemctl restart gortr
You may also install FORT Validator on Debian 10, or maybe cfrpki with gortr. I have also tried packaging Routinator 3000 for Debian, but this effort is currently on hold because the Rust ecosystem is broken and hostile to the packaging practices of Linux distributions.

6 August 2020

Joey Hess: Mr Process's wild ride

When a unix process is running in a directory, and that directory gets renamed, the process is taken on a ride to a new location in the filesystem. Suddenly, any "../" paths it might be using point to new and unexpected locations. This can be a source of interesting behavior, and also of security holes. Suppose root is poking around in ~user/foo/bar/ and decides to vim ../../etc/conffile. If the user notices this process is running, they can mv ~/foo/bar /tmp and when vim saves the file, it will write to /tmp/bar/../../etc/conffile AKA /etc/conffile. (Vim does warn that the file has changed while it was being edited. Other editors may not. Or root may be feeling especially BoFH and decide to overwrite the user's changes to their file. Or the rename could perhaps be carefully timed to avoid vim's overwrite protection.) Or, suppose root, in the same place, decides to archive ../../etc with tar, and then delete it:
tar cf etc.tar ../../etc; rm -rf ../../etc
Now the user has some time to take root's shell on a ride, before the rm starts ... and make it delete all of /etc! Anyone know if this class of security hole has a name?
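
The ride can be reproduced harmlessly inside a scratch directory. In this sketch the directory layout is a made-up miniature of the one described above, and cat stands in for the editor:

```shell
# Reproduce the ride inside a scratch directory; nothing outside it is touched.
base=$(mktemp -d)
mkdir -p "$base/home/user/foo/bar" "$base/home/user/etc" "$base/etc" "$base/tmp"
echo "user's file" > "$base/home/user/etc/conffile"
echo "system file" > "$base/etc/conffile"

# "root" is working in ~user/foo/bar and refers to ../../etc/conffile
cd "$base/home/user/foo/bar"
before=$(cat ../../etc/conffile)

# the user renames the directory from under the running shell
mv "$base/home/user/foo/bar" "$base/tmp/"

# same relative path, same process, different file
after=$(cat ../../etc/conffile)

echo "before the rename: $before"
echo "after the rename:  $after"
cd "$base"
```

The relative path is resolved by the kernel from the real location of the current directory, which is why the second cat reads a different file even though the shell never changed directory.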

27 October 2015

Marco d'Itri: Per-process netfilter rules

This article documents how the traffic of specific Linux processes can be subjected to a custom firewall or routing configuration, thanks to the magic of cgroups. We will use the Network classifier cgroup, which allows tagging the packets sent by specific processes. To create the cgroup which will be used to identify the processes I added something like this to /etc/rc.local:

mkdir /sys/fs/cgroup/net_cls/unlocator
/bin/echo 42 > /sys/fs/cgroup/net_cls/unlocator/net_cls.classid
chown md: /sys/fs/cgroup/net_cls/unlocator/tasks

The tasks file, which controls the membership of processes in a cgroup, is made writeable by my user: this way I can add new processes without becoming root. 42 is the arbitrary class identifier that the kernel will associate with the packets generated by the member processes. A command like systemd-cgls /sys/fs/cgroup/net_cls/ can be used to explore which processes are in which cgroup. I use a simple shell wrapper to start a shell or a new program as members of this cgroup:

#!/bin/sh -e
CGROUP_NAME=unlocator

if [ ! -d /sys/fs/cgroup/net_cls/$CGROUP_NAME/ ]; then
  echo "The $CGROUP_NAME net_cls cgroup does not exist!" >&2
  exit 1
fi

/bin/echo $$ > /sys/fs/cgroup/net_cls/$CGROUP_NAME/tasks

if [ $# = 0 ]; then
  exec ${SHELL:-/bin/sh}
fi

exec "$@"

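
The class identifier written to net_cls.classid is a 32 bit value of the form 0xAAAABBBB, where AAAA is the major number and BBBB the minor number of a traffic control handle. The small sketch below, with classid being a hypothetical helper, shows that the arbitrary 42 used here corresponds to the handle 0:42:

```shell
# classid MAJOR MINOR
# Prints the decimal net_cls.classid value for the handle MAJOR:MINOR.
classid() {
    echo $(( ($1 << 16) | $2 ))
}

classid 0 42
classid 1 1
```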
My first goal is to use a special name server for the DNS queries of some processes, thanks to a second dnsmasq process which acts as a caching forwarder. /etc/dnsmasq2.conf:

port=5354
listen-address=127.0.0.1
bind-interfaces
no-dhcp-interface=*
no-hosts
no-resolv
server=185.37.37.37
server=185.37.37.185

/etc/systemd/system/dnsmasq2.service:

[Unit]
Description=dnsmasq - Second instance
Requires=network.target

[Service]
ExecStartPre=/usr/sbin/dnsmasq --test
ExecStart=/usr/sbin/dnsmasq --keep-in-foreground --conf-file=/etc/dnsmasq2.conf
ExecReload=/bin/kill -HUP $MAINPID
PIDFile=/run/dnsmasq/dnsmasq.pid

[Install]
WantedBy=multi-user.target

Do not forget to enable the new service:

systemctl enable dnsmasq2
systemctl start dnsmasq2

Since the cgroup match extension is not yet available in a released version of iptables, you will first need to build and install it manually:

git clone git://git.netfilter.org/iptables.git
cd iptables
./autogen.sh
./configure
make -k
sudo cp extensions/libxt_cgroup.so /lib/xtables/
sudo chmod -x /lib/xtables/libxt_cgroup.so

The netfilter configuration required is very simple: all DNS traffic from the marked processes is redirected to the port of the local dnsmasq2:

iptables -t nat -A OUTPUT -m cgroup --cgroup 42 -p udp --dport 53 -j REDIRECT --to-ports 5354
iptables -t nat -A OUTPUT -m cgroup --cgroup 42 -p tcp --dport 53 -j REDIRECT --to-ports 5354

For related reasons, I also need to disable IPv6 for these processes:
ip6tables -A OUTPUT -m cgroup --cgroup 42 -j REJECT
I use a different cgroup to force some programs to use my office VPN by first setting a netfilter packet mark on their traffic:
iptables -t mangle -A OUTPUT -m cgroup --cgroup 43 -j MARK --set-mark 43
The packet mark is then used to policy-route this traffic using a dedicated VRF, i.e. routing table 43:
ip rule add fwmark 43 table 43
This VPN VRF just contains a default route for the VPN interface:
ip route add default dev tun0 table 43
Depending on your local configuration it may be a good idea to also add to the VPN VRF the routes of your local interfaces:

ip route show scope link proto kernel \
  | xargs -I ROUTE ip route add ROUTE table 43

Since the source address selection happens before the traffic is diverted to the VPN, we also need to source-NAT to the VPN address the marked packets:
iptables -t nat -A POSTROUTING -m mark --mark 43 --out-interface tun0 -j MASQUERADE

4 November 2014

Marco d'Itri: My position on the "init system coupling" General Resolution

I first want to clarify for the people not intimately involved with Debian that the GR-2014-003 vote is not about choosing the default init system or deciding if sysvinit should still be supported: its outcome will not stop systemd from being Debian's default init system and will not prevent any interested developers from supporting sysvinit. Some non-developers have recently threatened to "fork Debian" if this GR does not pass, apparently without understanding the concept well: Debian welcomes forks and I think that having more users working on free software would be great no matter which init system they favour.

The goal of Ian Jackson's proposal is to force the maintainers who want to use the superior features of systemd in their packages to spend their time on making them work with sysvinit as well. This is antisocial and also hard to reconcile with the Debian Constitution, which states:

"2.1.1 Nothing in this constitution imposes an obligation on anyone to do work for the Project. A person who does not want to do a task which has been delegated or assigned to them does not need to do it. [...]"

As it has been patiently explained by many other people, this proposal is unrealistic: if the maintainers of some packages were not interested in working on support for sysvinit and nobody else submitted patches then we would probably still have to release them as is even if formally declared unsuitable for a release. On the other hand, if somebody is interested in working on sysvinit support then there is no need for a GR forcing them to do it. The most elegant outcome of this GR would be a victory of choice 4 ("please do not waste everybody's time with pointless general resolutions"), but Ian Jackson has been clear enough in explaining how he sees the future of this debate:

"If my GR passes we will only have to have this conversation if those who are outvoted do not respect the project's collective decision. If my GR fails I expect a series of bitter rearguard battles over individual systemd dependencies."

There are no significant practical differences between choices 2 "support alternative init systems as much as possible" and 3 "packages may require specific init systems if maintainers decide", but the latter is more explicit in supporting the technical decisions of maintainers and upstream developers. This is why I think that we need a stronger outcome to prevent discussing this over and over, no matter how each one of us feels about working personally on sysvinit support in the future. I will vote for choices 3, 2, 4, 1.

14 October 2014

Marco d'Itri: The Italian peering ecosystem

I published the slides of my talk "An introduction to peering in Italy - Interconnections among the Italian networks" that I presented today at the MIX-IT (the Milano internet exchange) technical meeting.

3 October 2014

Marco d'Itri: 15 years of whois

Exactly 15 years ago I uploaded to Debian the first release of my whois client. At the end of 1999 the United States Government forced Network Solutions, at the time the only registrar for the .com, .net and .org top level domains, to split their functions in a registry and a registrar and to allow competing registrars to operate. Since then, two whois queries are needed to access the data for a domain in a TLD operating with a thin registry model: first one to the registry to find out which registrar was used to register the domain, and then one to the registrar to actually get the data. Being as lazy as I am I thought that this was unacceptable, so I implemented a whois client that would know which whois server to query for all TLDs and then automatically follow the referrals to the registrars. But the initial reason for writing this program was to replace the simplistic BSD-derived whois client that was shipped with Debian with one that would know which server to query for IP addresses and autonomous system numbers, a useful feature in a time when people still used to manually report all their spam to the originating ISPs. Over the years I have spent countless hours searching for the right servers for the domains of far away countries (something that has often been incredibly instructive) and now the program database is usually more up to date than the official IANA one. One of my goals for this program has always been wide portability, so I am happy that over the years it was adopted by other Linux distributions, made available by third parties to all common variants of UNIX and even to systems as alien as Windows and OS/2. Now that whois is 15 years old I am happy to announce that I have recently achieved complete world domination and that all Linux distributions use it as their default whois client.
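
The referral step of the thin registry model can be illustrated with a few lines of shell. This is not how the whois client itself is implemented: referral_server is a hypothetical helper and the registry answer is made up, with the field name being the one commonly seen in thin registry answers:

```shell
# referral_server: reads a registry whois answer on stdin and prints the
# whois server of the registrar, if the answer names one.
referral_server() {
    sed -n 's/^[[:space:]]*Registrar WHOIS Server:[[:space:]]*//p' | head -n 1
}

# Made-up answer from a thin registry, for illustration only.
sample='   Domain Name: EXAMPLE.COM
   Registrar WHOIS Server: whois.registrar.example
   Registrar URL: http://www.registrar.example'

printf '%s\n' "$sample" | referral_server
```

The second query would then be sent to the server printed by the helper.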

29 September 2014

Marco d'Itri: CVE-2014-6271 fix for Debian woody, sarge, etch and lenny

Very old Debian releases like woody (3.0), sarge (3.1), etch (4.0) and lenny (5.0) are not supported anymore by the Debian Security Team and do not get security updates. Since some of our customers still have servers running these versions, I have built bash packages with the fix for CVE-2014-6271 (the "shellshock" bug) and Florian Weimer's patch which restricts the parsing of shell functions to specially named variables: http://ftp.linux.it/pub/People/md/bash/ This work has been sponsored by my employer Seeweb, a hosting, cloud infrastructure and colocation provider.
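After installing a fixed package it can be reassuring to check the result. The following is a sketch built around the widely circulated one-line test for CVE-2014-6271; the function name check_shellshock is made up for illustration, and the check covers only this CVE, not the related bugs found later.

```shell
#!/bin/sh
# Report whether the given bash binary runs code appended to an
# environment variable which looks like a function definition.
# Caveat: a missing binary produces no output and so is also
# reported as "not vulnerable".
check_shellshock() {
  shell="${1:-bash}"
  out=$(env x='() { :;}; echo VULNERABLE' "$shell" -c 'echo test' 2>/dev/null)
  case "$out" in
    *VULNERABLE*) echo "vulnerable" ;;
    *) echo "not vulnerable" ;;
  esac
}

check_shellshock bash
```

A vulnerable bash executes the trailing echo while importing the variable, so the marker appears in the output before "test".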

1 April 2014

Marco d'Itri: Real out of band connectivity with network namespaces

This post explains how to configure on a Linux server a second and totally independent network interface with its own connectivity. It can be useful to access the server when the regular connectivity is broken. This is possible thanks to network namespaces, a virtualization feature available in recent kernels. We need to create a simple script to be run at boot time which will create and configure the namespace. First, move into the new namespace the network interface which will be dedicated to it:
ip netns add oob
ip link set eth2 netns oob
And then configure it as usual with iproute, by executing it in the new namespace with ip netns exec:
ip netns exec oob ip link set lo up
ip netns exec oob ip link set eth2 up
ip netns exec oob ip addr add 192.168.1.2/24 dev eth2
ip netns exec oob ip route add default via 192.168.1.1
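The commands so far could be collected into a single boot-time script. This is only a sketch: the variables NS, IFACE, ADDR, GATEWAY and DRY_RUN and the helper names are illustrative and not part of the original setup. It defaults to printing the commands instead of running them, so it is safe to try without root.

```shell
#!/bin/sh -e
# Illustrative settings, matching the values used in this post.
NS=${NS:-oob}
IFACE=${IFACE:-eth2}
ADDR=${ADDR:-192.168.1.2/24}
GATEWAY=${GATEWAY:-192.168.1.1}

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = 0 ]; then
    "$@"
  else
    echo "$*"
  fi
}

setup_oob() {
  # create the namespace and move the dedicated interface into it
  run ip netns add "$NS"
  run ip link set "$IFACE" netns "$NS"
  # configure the interface from inside the namespace
  run ip netns exec "$NS" ip link set lo up
  run ip netns exec "$NS" ip link set "$IFACE" up
  run ip netns exec "$NS" ip addr add "$ADDR" dev "$IFACE"
  run ip netns exec "$NS" ip route add default via "$GATEWAY"
}

setup_oob
```

Running it with DRY_RUN=0 as root performs the real configuration; the daemon and firewall steps can be appended in the same way.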
The interface must be configured manually because ifupdown does not support namespaces yet, and it would use the same /run/network/ifstate file which tracks the interfaces of the main namespace (this is also a good argument in favour of something persistent like Network Manager...). Now we can start any daemon in the namespace, just make sure that they will not interfere with the on-disk state of other instances:
ip netns exec oob /usr/sbin/sshd -o PidFile=/run/sshd-oob.pid
Netfilter is virtualized as well, so we can load a firewall configuration which will be applied only to the new namespace:
ip netns exec oob iptables-restore < /etc/network/firewall-oob-v4
As documented in ip-netns(8), ip netns exec will also create a mount namespace and bind mount in it the files in /etc/netns/$NAMESPACE/: this is very useful since some details of the configuration, like the name server IP, will be different in the new namespace:
mkdir -p /etc/netns/oob/
echo 'nameserver 8.8.8.8' > /etc/netns/oob/resolv.conf
If we connect to the second SSH daemon, it will create a shell in the second namespace. To enter the main one, i.e. the one used by PID 1, we can use a simple script like:
#!/bin/sh -e
exec nsenter --net --mount --target 1 "$@"
To reach the out of band namespace from the main one we can use instead:
#!/bin/sh -e
exec nsenter --net --mount --target $(cat /var/run/sshd-oob.pid) "$@"
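When jumping between namespaces it is easy to lose track of where a shell is running. A possible check, sketched here with a made-up function name, compares the namespace links which Linux exposes in /proc/PID/ns/net; reading the link of PID 1 normally requires root.

```shell
#!/bin/sh
# Succeed if the two processes (PIDs, or "self") share a network
# namespace. Return 2 if either link cannot be read.
same_netns() {
  a=$(readlink "/proc/$1/ns/net") || return 2
  b=$(readlink "/proc/$2/ns/net") || return 2
  [ "$a" = "$b" ]
}

if same_netns self 1; then
  echo "main namespace"
else
  echo "other namespace (or not enough privileges to tell)"
fi
```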
Scripts like these can also be used in fun ssh configurations like:
Host 10.2.1.*
  ProxyCommand ssh -q -a -x -N -T server-oob.example.net 'nsenter-main nc %h %p'

20 February 2014

Marco d'Itri: Automatically unlocking xscreensaver in some locations

When I am at home I do not want to be bothered by the screensaver locking my laptop. To solve this, I use a custom PAM configuration which checks if I am authenticated to the local access point. Add this to the top of /etc/pam.d/xscreensaver:
auth sufficient pam_exec.so quiet /usr/local/sbin/pam_auth_xscreensaver
And then use a script like this one to decide when you want the display to be automatically unlocked:
#!/bin/sh -e
# return the ESSID of this interface
current_essid() {
  /sbin/iwconfig $1 | sed -nre '/ESSID/s/.*ESSID:"([^"]+)".*/\1/p'
}
# automatically unlock only for these users
case "$PAM_USER" in
  "")
    echo "This program must be run by pam_exec.so!"
    exit 1
    ;;
  md)
    ;;
  *)
    exit 1
    ;;
esac
CURRENT_ESSID=$(current_essid wlan0)
# automatically unlock when connected to these networks
case "$CURRENT_ESSID" in
  MYOWNESSID)
    exit 0
    ;;
esac
exit 6
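The sed expression which extracts the ESSID can be tried without a wireless interface by feeding it a line shaped like iwconfig output. In this sketch the function name parse_essid and the sample line are made up for illustration.

```shell
#!/bin/sh
# Extract the ESSID from iwconfig-style output read on standard input.
# "sed -E" is the portable spelling of the "-r" flag used in the script.
parse_essid() {
  sed -nE '/ESSID/s/.*ESSID:"([^"]+)".*/\1/p'
}

# SAMPLE is a made-up line imitating iwconfig output.
SAMPLE='wlan0     IEEE 802.11abgn  ESSID:"MYOWNESSID"  '
printf '%s\n' "$SAMPLE" | parse_essid
```

An interface which is not associated reports the ESSID without quotes, so the expression prints nothing and the PAM script falls through to its final exit.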
