Search Results: "fs"

19 January 2026

Dirk Eddelbuettel: RApiDatetime 0.0.11 on CRAN: Micro-Maintenance

A new (micro) maintenance release of our RApiDatetime package is now on CRAN, coming only a good week after the 0.0.10 release which itself had a two-year gap to its predecessor release. RApiDatetime provides a number of entry points for C-level functions of the R API for Date and Datetime calculations. The functions asPOSIXlt and asPOSIXct convert between long and compact datetime representation, formatPOSIXlt and Rstrptime convert to and from character strings, and POSIXlt2D and D2POSIXlt convert between Date and POSIXlt datetime. Lastly, asDatePOSIXct converts to a date type. All these functions are rather useful, but were not previously exported by R for C-level use by other packages, which this package aims to change. This release adds a single PROTECT (and UNPROTECT) around one variable as the rchk container and service by Tomas now flagged this. Which is somewhat peculiar, as this is old code also borrowed from R itself, but there was no point arguing so I just added it. Details of the release follow based on the NEWS file.

Changes in RApiDatetime version 0.0.11 (2026-01-19)
  • Add PROTECT (and UNPROTECT) to appease rchk

Courtesy of my CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Francesco Paolo Lovergine: A Terramaster NAS with Debian, take two.

After experimenting at home, the very first professional-grade NAS from Terramaster arrived at work, too, with 12 HDD bays and possibly a pair of M.2 NVMe cards. In this case, I again installed a plain Debian distribution, but HDD monitoring required some configuration adjustments to run smartd properly.

A decent approach to data safety is to run regularly scheduled short and long SMART tests on all disks to detect potential damage. Running such tests on all disks at once isn't ideal, so I set up a script to create a staggered configuration and test multiple groups of disks at different times. Note that it is mandatory to read the devices at each reboot because their names and order can change.

Of course, the same principle (short/long tests at regular intervals along the week) can be applied to a simpler configuration, as in the case of my home NAS with a pair of RAID1 devices. What follows is a simple script to create a staggered smartd.conf at boot time:
#!/bin/bash
#
# Save this as /usr/local/bin/create-smartd-conf.sh
#
# Dynamically generate smartd.conf with staggered SMART test scheduling
# at boot time based on discovered ATA devices
# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE.
# PLEASE SEE THE smartd.conf MAN PAGE FOR DETAILS
#
#   -d TYPE Set the device type: ata, scsi[+TYPE], nvme[,NSID],
#           sat[,auto][,N][+TYPE], usbcypress[,X], usbjmicron[,p][,x][,N],
#           usbprolific, usbsunplus, sntasmedia, sntjmicron[,NSID], sntrealtek,
#           ... (platform specific)
#   -T TYPE Set the tolerance to one of: normal, permissive
#   -o VAL  Enable/disable automatic offline tests (on/off)
#   -S VAL  Enable/disable attribute autosave (on/off)
#   -n MODE No check if: never, sleep[,N][,q], standby[,N][,q], idle[,N][,q]
#   -H      Monitor SMART Health Status, report if failed
#   -s REG  Do Self-Test at time(s) given by regular expression REG
#   -l TYPE Monitor SMART log or self-test status:
#           error, selftest, xerror, offlinests[,ns], selfteststs[,ns]
#   -l scterc,R,W  Set SCT Error Recovery Control
#   -e      Change device setting: aam,[N off], apm,[N off], dsn,[on off],
#           lookahead,[on off], security-freeze, standby,[N off], wcache,[on off]
#   -f      Monitor 'Usage' Attributes, report failures
#   -m ADD  Send email warning to address ADD
#   -M TYPE Modify email warning behavior (see man page)
#   -p      Report changes in 'Prefailure' Attributes
#   -u      Report changes in 'Usage' Attributes
#   -t      Equivalent to -p and -u Directives
#   -r ID   Also report Raw values of Attribute ID with -p, -u or -t
#   -R ID   Track changes in Attribute ID Raw value with -p, -u or -t
#   -i ID   Ignore Attribute ID for -f Directive
#   -I ID   Ignore Attribute ID for -p, -u or -t Directive
#   -C ID[+] Monitor [increases of] Current Pending Sectors in Attribute ID
#   -U ID[+] Monitor [increases of] Offline Uncorrectable Sectors in Attribute ID
#   -W D,I,C Monitor Temperature D)ifference, I)nformal limit, C)ritical limit
#   -v N,ST Modifies labeling of Attribute N (see man page)
#   -P TYPE Drive-specific presets: use, ignore, show, showall
#   -a      Default: -H -f -t -l error -l selftest -l selfteststs -C 197 -U 198
#   -F TYPE Use firmware bug workaround:
#           none, nologdir, samsung, samsung2, samsung3, xerrorlba
#   -c i=N  Set interval between disk checks to N seconds
#    #      Comment: text after a hash sign is ignored
#    \      Line continuation character
# Attribute ID is a decimal integer 1 <= ID <= 255
# except for -C and -U, where ID = 0 turns them off.
set -euo pipefail
# Test schedule configuration
BASE_SCHEDULE="L/../../6"  # Long test on Saturdays
TEST_HOURS=(01 03 05 07)   # 4 time slots: 1am, 3am, 5am, 7am
DEVICES_PER_GROUP=3
main() {
    # Get array of device names (e.g., sda, sdb, sdc)
    mapfile -t devices < <(ls -l /dev/disk/by-id/ | grep ata | awk '{ print $11 }' | grep sd | cut -d/ -f3 | sort -u)
    if [[ ${#devices[@]} -eq 0 ]]; then
        exit 1
    fi
    # Start building config file
    cat << EOF
# smartd.conf - Auto-generated at boot
# Generated: $(date '+%Y-%m-%d %H:%M:%S')
#
# Staggered SMART test scheduling to avoid concurrent disk load
# Long tests run on Saturdays at different times per group
#
EOF
    # Process devices into groups
    local group=0
    local count_in_group=0
    for i in "${!devices[@]}"; do
        local dev="${devices[$i]}"
        local hour="${TEST_HOURS[$group]}"
        # Add group header at start of each group
        if [[ $count_in_group -eq 0 ]]; then
            echo ""
            echo "# Group $((group + 1)) - Tests at ${hour}:00 on Saturdays"
        fi
        # Add device entry: long test on Saturday at $hour, short test daily
        # twelve hours later (10# forces base-10 so "07" is not read as octal)
        #echo "/dev/${dev} -a -o on -S on -s (${BASE_SCHEDULE}/${hour}) -m root"
        echo "/dev/${dev} -a -o on -S on -s (L/../../6/${hour}) -s (S/../.././$(( (10#$hour + 12) % 24 ))) -m root"
        # Move to next group when current group is full
        count_in_group=$((count_in_group + 1))
        if [[ $count_in_group -ge $DEVICES_PER_GROUP ]]; then
            count_in_group=0
            group=$(( (group + 1) % ${#TEST_HOURS[@]} ))
        fi
    done
}
main "$@"
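For illustration, with twelve discovered disks the script would emit groups of entries along these lines (device names and exact output are a sketch of what the logic above produces, not captured from a real run):

```
# Group 1 - Tests at 01:00 on Saturdays
/dev/sda -a -o on -S on -s (L/../../6/01) -s (S/../.././13) -m root
/dev/sdb -a -o on -S on -s (L/../../6/01) -s (S/../.././13) -m root
/dev/sdc -a -o on -S on -s (L/../../6/01) -s (S/../.././13) -m root

# Group 2 - Tests at 03:00 on Saturdays
/dev/sdd -a -o on -S on -s (L/../../6/03) -s (S/../.././15) -m root
```

Each group runs its long test on Saturday at its own hour and a daily short test twelve hours later, so no two groups load all disks at the same time.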
To run such a script at boot, add a unit file to the systemd configuration.
sudo systemctl edit --force --full regenerate-smartd-conf.service
sudo systemctl enable regenerate-smartd-conf.service
Where the unit service is the following:
[Unit]
Description=Generate smartd.conf with staggered SMART test scheduling
# Wait for all local filesystems and udev device detection
After=local-fs.target systemd-udev-settle.service
Before=smartd.service
Wants=systemd-udev-settle.service
DefaultDependencies=no
[Service]
Type=oneshot
# Only generate the config file, don't touch smartd here
ExecStart=/bin/bash -c '/usr/local/bin/create-smartd-conf.sh > /etc/smartd.conf'
StandardOutput=journal
StandardError=journal
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target

Vincent Bernat: RAID 5 with mixed-capacity disks on Linux

Standard RAID solutions waste space when disks have different sizes. Linux software RAID with LVM uses the full capacity of each disk and lets you grow storage by replacing one or two disks at a time. We start with four disks of equal size:
$ lsblk -Mo NAME,TYPE,SIZE
NAME TYPE  SIZE
vda  disk  101M
vdb  disk  101M
vdc  disk  101M
vdd  disk  101M
We create one partition on each of them:
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vda
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vdb
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vdc
$ sgdisk --zap-all --new=0:0:0 -t 0:fd00 /dev/vdd
$ lsblk -Mo NAME,TYPE,SIZE
NAME   TYPE  SIZE
vda    disk  101M
└─vda1 part  100M
vdb    disk  101M
└─vdb1 part  100M
vdc    disk  101M
└─vdc1 part  100M
vdd    disk  101M
└─vdd1 part  100M
We set up a RAID 5 device by assembling the four partitions:1
$ mdadm --create /dev/md0 --level=raid5 --bitmap=internal --raid-devices=4 \
>   /dev/vda1 /dev/vdb1 /dev/vdc1 /dev/vdd1
$ lsblk -Mo NAME,TYPE,SIZE
    NAME          TYPE    SIZE
    vda           disk    101M
   vda1        part    100M
    vdb           disk    101M
   vdb1        part    100M
    vdc           disk    101M
   vdc1        part    100M
    vdd           disk    101M
   vdd1        part    100M
  md0           raid5 292.5M
$ cat /proc/mdstat
md0 : active raid5 vdd1[4] vdc1[2] vdb1[1] vda1[0]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
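The reported md0 size is consistent with RAID 5 arithmetic: one member's worth of space holds parity, so gross data capacity is (members - 1) times the member size, minus a little metadata overhead. A quick sanity check with the numbers from the listing above:

```shell
# RAID 5 usable capacity: (n - 1) members hold data, one member's worth is parity.
members=4
member_mib=100
echo "$(( (members - 1) * member_mib )) MiB"
```

This prints "300 MiB", close to the 292.5M lsblk reports once the md superblock and write-intent bitmap are subtracted.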
We use LVM to create logical volumes on top of the RAID 5 device.
$ pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created.
$ vgcreate data /dev/md0
  Volume group "data" successfully created
$ lvcreate -L 100m -n bits data
  Logical volume "bits" created.
$ lvcreate -L 100m -n pieces data
  Logical volume "pieces" created.
$ mkfs.ext4 -q /dev/data/bits
$ mkfs.ext4 -q /dev/data/pieces
$ lsblk -Mo NAME,TYPE,SIZE
    NAME          TYPE    SIZE
    vda           disk    101M
   vda1        part    100M
    vdb           disk    101M
   vdb1        part    100M
    vdc           disk    101M
   vdc1        part    100M
    vdd           disk    101M
   vdd1        part    100M
  md0           raid5 292.5M
     data-bits   lvm     100M
     data-pieces lvm     100M
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   1   2   0 wz--n- 288.00m 88.00m
This gives us the following setup:
One RAID 5 device built from four partitions from four disks of equal capacity. The RAID device is part of an LVM volume group with two logical volumes.
RAID 5 setup with disks of equal capacity
We replace /dev/vda with a bigger disk. We add it back to the RAID 5 array after copying the partitions from /dev/vdb:
$ cat /proc/mdstat
md0 : active (auto-read-only) raid5 vdb1[1] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vda /dev/vdb
$ sgdisk --randomize-guids /dev/vda
$ mdadm --manage /dev/md0 --add /dev/vda1
$ cat /proc/mdstat
md0 : active raid5 vda1[5] vdb1[1] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We do not yet use the additional capacity: data stored there would not survive the loss of /dev/vda, as no other disk has spare capacity to provide redundancy. We need a second disk replacement, like /dev/vdb:
$ cat /proc/mdstat
md0 : active (auto-read-only) raid5 vda1[5] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vdb /dev/vdc
$ sgdisk --randomize-guids /dev/vdb
$ mdadm --manage /dev/md0 --add /dev/vdb1
$ cat /proc/mdstat
md0 : active raid5 vdb1[6] vda1[5] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We create a new RAID 1 array by using the free space on /dev/vda and /dev/vdb:
$ sgdisk --new=0:0:0 -t 0:fd00 /dev/vda
$ sgdisk --new=0:0:0 -t 0:fd00 /dev/vdb
$ mdadm --create /dev/md1 --level=raid1 --bitmap=internal --raid-devices=2 \
>   /dev/vda2 /dev/vdb2
$ cat /proc/mdstat
md1 : active raid1 vdb2[1] vda2[0]
      101312 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active raid5 vdb1[6] vda1[5] vdd1[4] vdc1[2]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We add /dev/md1 to the volume group:
$ pvcreate /dev/md1
  Physical volume "/dev/md1" successfully created.
$ vgextend data /dev/md1
  Volume group "data" successfully extended
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   2   2   0 wz--n- 384.00m 184.00m
$  lsblk -Mo NAME,TYPE,SIZE
       NAME          TYPE    SIZE
       vda           disk    201M
      vda1        part    100M
     vda2        part    100M
       vdb           disk    201M
      vdb1        part    100M
     vdb2        part    100M
  md1           raid1  98.9M
       vdc           disk    101M
      vdc1        part    100M
       vdd           disk    101M
      vdd1        part    100M
     md0           raid5 292.5M
        data-bits   lvm     100M
        data-pieces lvm     100M
This gives us the following setup:2
One RAID 5 device built from four partitions and one RAID 1 device built from two partitions. The two last disks are smaller. The two RAID devices are part of a single LVM volume group.
Setup mixing both RAID 1 and RAID 5
We extend our capacity further by replacing /dev/vdc:
$ cat /proc/mdstat
md1 : active (auto-read-only) raid1 vda2[0] vdb2[1]
      101312 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active (auto-read-only) raid5 vda1[5] vdd1[4] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UU_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vdc /dev/vdb
$ sgdisk --randomize-guids /dev/vdc
$ mdadm --manage /dev/md0 --add /dev/vdc1
$ cat /proc/mdstat
md1 : active (auto-read-only) raid1 vda2[0] vdb2[1]
      101312 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active raid5 vdc1[7] vda1[5] vdd1[4] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
Then, we convert /dev/md1 from RAID 1 to RAID 5:
$ mdadm --grow /dev/md1 --level=5 --raid-devices=3 --add /dev/vdc2
mdadm: level of /dev/md1 changed to raid5
mdadm: added /dev/vdc2
$ cat /proc/mdstat
md1 : active raid5 vdc2[2] vda2[0] vdb2[1]
      202624 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active raid5 vdc1[7] vda1[5] vdd1[4] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ pvresize /dev/md1
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   2   2   0 wz--n- 482.00m 282.00m
This gives us the following layout:
Two RAID 5 devices built from four disks of different sizes. The last disk is smaller and contains only one partition, while the others have two partitions: one for /dev/md0 and one for /dev/md1. The two RAID devices are part of a single LVM volume group.
RAID 5 setup with mixed-capacity disks using partitions and LVM
We further extend our capacity by replacing /dev/vdd:
$ cat /proc/mdstat
md0 : active (auto-read-only) raid5 vda1[5] vdc1[7] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md1 : active (auto-read-only) raid5 vda2[0] vdc2[2] vdb2[1]
      202624 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ sgdisk --replicate=/dev/vdd /dev/vdc
$ sgdisk --randomize-guids /dev/vdd
$ mdadm --manage /dev/md0 --add /dev/vdd1
$ cat /proc/mdstat
md0 : active raid5 vdd1[4] vda1[5] vdc1[7] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md1 : active (auto-read-only) raid5 vda2[0] vdc2[2] vdb2[1]
      202624 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
We grow the second RAID 5 array:
$ mdadm --grow /dev/md1 --raid-devices=4 --add /dev/vdd2
mdadm: added /dev/vdd2
$ cat /proc/mdstat
md0 : active raid5 vdd1[4] vda1[5] vdc1[7] vdb1[6]
      299520 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md1 : active raid5 vdd2[3] vda2[0] vdc2[2] vdb2[1]
      303936 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
$ pvresize /dev/md1
$ vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   2   2   0 wz--n- 580.00m 380.00m
$ lsblk -Mo NAME,TYPE,SIZE
       NAME          TYPE    SIZE
       vda           disk    201M
      vda1        part    100M
     vda2        part    100M
       vdb           disk    201M
      vdb1        part    100M
     vdb2        part    100M
       vdc           disk    201M
      vdc1        part    100M
     vdc2        part    100M
       vdd           disk    301M
      vdd1        part    100M
      vdd2        part    100M
     md0           raid5 292.5M
        data-bits   lvm     100M
        data-pieces lvm     100M
  md1           raid5 296.8M
You can continue by replacing each disk one by one using the same steps.
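The free space accumulated in the volume group can then be used to grow a logical volume and its filesystem online. A minimal sketch, assuming the volume names from this walkthrough and an ext4 filesystem (the post itself stops at growing the volume group):

```shell
# Grow the "bits" logical volume by 100 MiB from the VG's free space,
# then grow the ext4 filesystem to fill it (ext4 supports online resize).
lvextend -L +100m /dev/data/bits
resize2fs /dev/data/bits
```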

  1. Write-intent bitmaps speed up recovery of the RAID array after a power failure by marking unsynchronized regions as dirty. They have an impact on performance, but I did not measure it myself.
  2. In the lsblk output, /dev/md1 appears unused because the logical volumes do not use any space from it yet. Once you create more logical volumes or extend them, lsblk will reflect the usage.

Dima Kogan: mrcal 2.5 released!

mrcal 2.5 is out: the release notes. Once again, this is mostly a bug-fix release en route to the big new features coming in 3.0. One cool thing is that these tools have now matured enough to no longer be considered experimental: they have been used with great success in lots of contexts across many different projects and organizations. Some of the work is new, and not yet fully polished, documented, and tested, but it works. In mrcal 2.5, most of the implementation of some big new features is written and committed, but it is still incomplete and only lightly tested and documented; this will be completed in mrcal 3.0. mrcal is quite good already, and will be even better in the future. Try it today!

17 January 2026

Simon Josefsson: Backup of S3 Objects Using rsnapshot

I've been using rsnapshot to take backups of around 10 servers and laptops for well over 15 years, and it is a remarkably reliable tool that has proven itself many times. Rsnapshot uses rsync over SSH and maintains a temporal hard-link file pool. Once rsnapshot is configured and running on the backup server, you get a hardlink farm with directories like this for the remote server:
/backup/serverA.domain/.sync/foo
/backup/serverA.domain/daily.0/foo
/backup/serverA.domain/daily.1/foo
/backup/serverA.domain/daily.2/foo
...
/backup/serverA.domain/daily.6/foo
/backup/serverA.domain/weekly.0/foo
/backup/serverA.domain/weekly.1/foo
...
/backup/serverA.domain/monthly.0/foo
/backup/serverA.domain/monthly.1/foo
...
/backup/serverA.domain/yearly.0/foo
I can browse and rescue files easily, going back in time when needed. The rsnapshot project README explains more, and there is a long rsnapshot HOWTO, although I usually find the rsnapshot man page the easiest to digest.

I have stored multi-TB Git-LFS data on GitLab.com for some time. The yearly renewal is coming up, and the price for Git-LFS storage on GitLab.com is now excessive (~$10.000/year). I have reworked my work-flow and finally migrated debdistget to only store Git-LFS stubs on GitLab.com and push the real files to S3 object storage. The cost for this is barely measurable; I have yet to run into the 25/month warning threshold.

But how do you back up stuff stored in S3? For some time, my S3 backup solution has been to run the minio-client mirror command to download all S3 objects to my laptop, and rely on rsnapshot to keep backups of this. While 4TB NVMe drives are relatively cheap, I have felt for quite some time that this disk and network churn on my laptop is unsatisfactory. What is a better approach?

I find S3 hosting sites fairly unreliable by design. Only a couple of clicks in your web browser and you have dropped 100TB of data. Or someone else who steals your plaintext-equivalent cookie has. Thus, I haven't really felt comfortable using any S3-based backup option. I prefer to self-host, although continuously running a mirror job is not sufficient: if I accidentally drop the entire S3 object store, my mirror run will remove all files locally too. The rsnapshot approach that allows going back in time and having data on self-managed servers feels superior to me.

What if we could use rsnapshot with an S3 client instead of rsync? Someone else asked about this several years ago, and the suggestion was to use the FUSE-based s3fs, which sounded unreliable to me. After some experimentation, working around some hard-coded assumptions in the rsnapshot implementation, I came up with a small configuration pattern and a wrapper tool to implement what I desired.
Here is my configuration snippet:
cmd_rsync    /backup/s3/s3rsync
rsync_short_args    -Q
rsync_long_args    --json --remove
lockfile    /backup/s3/rsnapshot.pid
snapshot_root    /backup/s3
backup    s3:://hetzner/debdistget-gnuinos    ./debdistget-gnuinos
backup    s3:://hetzner/debdistget-tacos  ./debdistget-tacos
backup    s3:://hetzner/debdistget-diffos ./debdistget-diffos
backup    s3:://hetzner/debdistget-pureos ./debdistget-pureos
backup    s3:://hetzner/debdistget-kali   ./debdistget-kali
backup    s3:://hetzner/debdistget-devuan ./debdistget-devuan
backup    s3:://hetzner/debdistget-trisquel   ./debdistget-trisquel
backup    s3:://hetzner/debdistget-debian ./debdistget-debian
The idea is to save a backup of a couple of S3 buckets under /backup/s3/. I have some scripts that take a complete rsnapshot.conf file and append my per-directory configuration so that this becomes a complete configuration. If you are curious how I roll this, backup-all invokes backup-one, appending my rsnapshot.conf template with the snippet above. The s3rsync wrapper script is the essential hack that converts rsnapshot's rsync parameters into something that talks S3; the script is as follows:
#!/bin/sh
set -eu
S3ARG=
for ARG in "$@"; do
    case $ARG in
    s3:://*) S3ARG="$S3ARG $(echo $ARG | sed -e 's,s3:://,,')";;
    -Q*) ;;
    *) S3ARG="$S3ARG $ARG";;
    esac
done
echo /backup/s3/mc mirror $S3ARG
exec /backup/s3/mc mirror $S3ARG
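To see what the wrapper does with the arguments rsnapshot passes, here is a self-contained sketch of the same rewriting loop run on a sample argument list (the bucket and destination path are made up; the `s3:://` pseudo-scheme and options mirror the configuration above):

```shell
#!/bin/sh
# Reproduce the s3rsync argument rewriting on a sample rsnapshot invocation:
# drop rsync short options (-Q*), strip the s3::// pseudo-scheme from bucket
# URLs, and pass everything else through to "mc mirror".
set -eu
S3ARG=
for ARG in -Qv --json --remove s3:://hetzner/example-bucket /backup/s3/.sync/example-bucket; do
    case $ARG in
    s3:://*) S3ARG="$S3ARG $(echo "$ARG" | sed -e 's,s3:://,,')";;
    -Q*) ;;
    *) S3ARG="$S3ARG $ARG";;
    esac
done
echo "mc mirror$S3ARG"
```

Running this prints `mc mirror --json --remove hetzner/example-bucket /backup/s3/.sync/example-bucket`, which is exactly the shape of the mc commands in the log below.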
It uses the minio-client tool. I first tried s3cmd, but its sync command reads all files to compute MD5 checksums every time you invoke it, which is very slow. The mc mirror command is blazingly fast since it only compares mtimes, just like rsync or git. First you need to store credentials for your S3 bucket. These are stored in plaintext in ~/.mc/config.json, which I find to be sloppy security practice, but I don't know of any better way to do this. Replace AKEY and SKEY with your access token and secret token from your S3 provider:
/backup/s3/mc alias set hetzner AKEY SKEY
If I invoke a sync job for a fully synced up directory the output looks like this:
root@hamster /backup# /run/current-system/profile/bin/rsnapshot -c /backup/s3/rsnapshot.conf -V sync
Setting locale to POSIX "C"
echo 1443 > /backup/s3/rsnapshot.pid 
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-gnuinos \
    /backup/s3/.sync//debdistget-gnuinos 
/backup/s3/mc mirror --json --remove hetzner/debdistget-gnuinos /backup/s3/.sync//debdistget-gnuinos
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-tacos \
    /backup/s3/.sync//debdistget-tacos 
/backup/s3/mc mirror --json --remove hetzner/debdistget-tacos /backup/s3/.sync//debdistget-tacos
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-diffos \
    /backup/s3/.sync//debdistget-diffos 
/backup/s3/mc mirror --json --remove hetzner/debdistget-diffos /backup/s3/.sync//debdistget-diffos
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-pureos \
    /backup/s3/.sync//debdistget-pureos 
/backup/s3/mc mirror --json --remove hetzner/debdistget-pureos /backup/s3/.sync//debdistget-pureos
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-kali \
    /backup/s3/.sync//debdistget-kali 
/backup/s3/mc mirror --json --remove hetzner/debdistget-kali /backup/s3/.sync//debdistget-kali
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-devuan \
    /backup/s3/.sync//debdistget-devuan 
/backup/s3/mc mirror --json --remove hetzner/debdistget-devuan /backup/s3/.sync//debdistget-devuan
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-trisquel \
    /backup/s3/.sync//debdistget-trisquel 
/backup/s3/mc mirror --json --remove hetzner/debdistget-trisquel /backup/s3/.sync//debdistget-trisquel
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
/backup/s3/s3rsync -Qv --json --remove s3:://hetzner/debdistget-debian \
    /backup/s3/.sync//debdistget-debian 
/backup/s3/mc mirror --json --remove hetzner/debdistget-debian /backup/s3/.sync//debdistget-debian
{"status":"success","total":0,"transferred":0,"duration":0,"speed":0}
touch /backup/s3/.sync/ 
rm -f /backup/s3/rsnapshot.pid 
/run/current-system/profile/bin/logger -p user.info -t rsnapshot[1443] \
    /run/current-system/profile/bin/rsnapshot -c /backup/s3/rsnapshot.conf \
    -V sync: completed successfully 
root@hamster /backup# 
You can tell from the paths that this machine runs Guix. This was my first production use of the Guix System, and the machine has been running since 2015 (with the occasional new hard drive). Before, I used rsnapshot on Debian, but some stable release of Debian dropped the rsnapshot package, paving the way for me to test Guix in production on a non-Internet-exposed machine. Unfortunately, mc is not packaged in Guix, so you will have to install it manually from the MinIO Client GitHub page. Running the daily rotation looks like this:
root@hamster /backup# /run/current-system/profile/bin/rsnapshot -c /backup/s3/rsnapshot.conf -V daily
Setting locale to POSIX "C"
echo 1549 > /backup/s3/rsnapshot.pid 
mv /backup/s3/daily.5/ /backup/s3/daily.6/ 
mv /backup/s3/daily.4/ /backup/s3/daily.5/ 
mv /backup/s3/daily.3/ /backup/s3/daily.4/ 
mv /backup/s3/daily.2/ /backup/s3/daily.3/ 
mv /backup/s3/daily.1/ /backup/s3/daily.2/ 
mv /backup/s3/daily.0/ /backup/s3/daily.1/ 
/run/current-system/profile/bin/cp -al /backup/s3/.sync /backup/s3/daily.0 
rm -f /backup/s3/rsnapshot.pid 
/run/current-system/profile/bin/logger -p user.info -t rsnapshot[1549] \
    /run/current-system/profile/bin/rsnapshot -c /backup/s3/rsnapshot.conf \
    -V daily: completed successfully 
root@hamster /backup# 
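The sync and rotation runs shown above are typically driven from cron; a minimal sketch, with the paths from this post and an illustrative schedule (the post does not specify its own timing):

```
# m h dom mon dow  command
30 2 * * *  /run/current-system/profile/bin/rsnapshot -c /backup/s3/rsnapshot.conf sync
0  3 * * *  /run/current-system/profile/bin/rsnapshot -c /backup/s3/rsnapshot.conf daily
0  4 * * 1  /run/current-system/profile/bin/rsnapshot -c /backup/s3/rsnapshot.conf weekly
```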
Hopefully you will feel inspired to take backups of your S3 buckets now!

16 January 2026

Dirk Eddelbuettel: RcppSpdlog 0.0.26 on CRAN: Another Microfix

Version 0.0.26 of RcppSpdlog arrived on CRAN moments ago, and will be uploaded to Debian and built for r2u shortly. The (nice) documentation site has been refreshed too. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want that was written by Gabi Melman, and also includes fmt by Victor Zverovich. You can learn more at the nice package documentation site. Brian Ripley noticed an infelicity when building under C++20, which he is testing hard and fast. Sadly, this came a day late for yesterday's upload of release 0.0.25 with another trivial fix, so another incremental release was called for. We had already accommodated C++20 and its use of std::format (in lieu of the included fmt::format) but had not turned it on unconditionally. We do so now, but offer an opt-out for those who prefer the previous build type. The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.26 (2026-01-16)
  • Under C++20 or later, switch to using std::format to avoid a compiler nag that CRAN now complains about

Courtesy of my CRANberries, there is also a diffstat report detailing changes. More detailed information is on the RcppSpdlog page, or the package documentation site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Freexian Collaborators: Monthly report about Debian Long Term Support, December 2025 (by Santiago Ruano Rincón)

The Debian LTS Team, funded by [Freexian's Debian LTS offering](https://www.freexian.com/lts/debian/), is pleased to report its activities for December.

Activity summary

During the month of December, 18 contributors were paid to work on Debian LTS (links to individual contributor reports are located below). The team released 41 DLAs fixing 252 CVEs. The team currently focuses on preparing security updates for Debian 11 "bullseye", but also contributes updates for Debian 12 "bookworm", Debian 13 "trixie", and even Debian unstable. Notable security updates:
  • libsoup2.4 (DLA-4398-1), prepared by Andreas Henrikson, fixing several vulnerabilities.
  • glib2.0 (DLA-4412-1), published by Emilio Pozuelo Monfort, addressing multiple issues.
  • lasso (DLA-4397-1), prepared by Sylvain Beucler, addressing multiple issues, including a critical remote code execution (RCE) vulnerability (CVE-2025-47151)
  • roundcube (DLA 4415-1), prepared by Guilhem Moulin, fixing a cross-site scripting (XSS) vulnerability (CVE-2025-68461) and an information disclosure vulnerability (CVE-2025-68460)
  • mediawiki (DLA 4428-1), published by Guilhem, fixing multiple vulnerabilities that could lead to information disclosure, denial of service, or privilege escalation.
  • While the DLA has not been published yet, Charles Henrique Melara proposed upstream fixes for seven CVEs in ffmpeg: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21275.
  • python-apt (DLA 4408-1), prepared by Utkarsh Gupta, in coordination with the Debian Security Team and Julian Andres Klode, apt's maintainer.
  • libpng1.6 (DLA-4396-1), published by Tobias Frost, completing the work started the previous month.
Notable non-security updates:
  • tzdata (DLA-4403-1), prepared by Emilio, including the latest changes to the leap second list and its expiry date, which was set for the end of December.
Contributions from outside the LTS Team:
  • Christoph Berg, co-maintainer of PostgreSQL in Debian, prepared a postgresql-13 update, released as DLA-4420-1
The LTS Team has also contributed with updates to the latest Debian releases:

Individual Debian LTS contributor reports

Thanks to our sponsors

Sponsors that joined recently are in bold.

15 January 2026

Dirk Eddelbuettel: RcppSpdlog 0.0.25 on CRAN: Microfix

Version 0.0.25 of RcppSpdlog arrived on CRAN right now, and will be uploaded to Debian and built for r2u shortly, along with a minimal refresh of the documentation site. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want that was written by Gabi Melman, and also includes fmt by Victor Zverovich. You can learn more at the nice package documentation site. This release fixes a minuscule cosmetic issue from the previous release a week ago. We rely on two #defines that R sets to signal to spdlog that we are building in the R context (which matters for the R-specific logging sink, and picks up something Gabi added upon my suggestion at the very start of this package). But I now use the same #defines in Rcpp to check that we are building with R; in this case they wrongly suggest R headers have already been included, so Rcpp (incorrectly) nags about that. The solution is to add two #undef directives and proceed as normal (with Rcpp controlling and taking care of the R header inclusion too), and that is what we do here. All good now, no nags from a false positive. The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.25 (2026-01-15)
  • Ensure the #define signaling an R build (needed with spdlog) is unset before including R headers so as not to falsely trigger a message from Rcpp

Courtesy of my CRANberries, there is also a diffstat report detailing changes. More detailed information is on the RcppSpdlog page, or the package documentation site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

14 January 2026

Dirk Eddelbuettel: RcppSimdJson 0.1.15 on CRAN: New Upstream, Some Maintenance

A brand new release 0.1.15 of the RcppSimdJson package is now on CRAN. RcppSimdJson wraps the fantastic and genuinely impressive simdjson library by Daniel Lemire and collaborators. Via very clever algorithmic engineering to obtain largely branch-free code, coupled with modern C++ and newer compiler instructions, it parses gigabytes of JSON per second which is quite mindboggling. The best-case performance is faster than CPU speed as use of parallel SIMD instructions and careful branch avoidance can lead to less than one cpu cycle per byte parsed; see the video of the talk by Daniel Lemire at QCon. This version updates to the current 4.2.4 upstream release. It also updates the RcppExports.cpp file with the glue between C++ and R. We want to move away from using Rf_error() (as Rcpp::stop() is generally preferable). Packages (such as this one) that declare an interface have an actual Rf_error() call generated in RcppExports.cpp, which current Rcpp code generation now handles. Long story short, a minor internal reason. The short NEWS entry for this release follows.

Changes in version 0.1.15 (2026-01-14)
  • simdjson was upgraded to version 4.2.4 (Dirk in #97)
  • RcppExports.cpp was regenerated to aid an Rcpp transition
  • Standard maintenance updates for continuous integration and URLs

Courtesy of my CRANberries, there is also a diffstat report for this release. For questions, suggestions, or issues please use the issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub.

Dirk Eddelbuettel: gunsales 0.1.3 on CRAN: Maintenance

An update to the gunsales package is now on CRAN. As in the last update nine years ago (!!), changes are mostly internal. An upcoming dplyr change requires a switch from the old and soon-to-be-removed underscored verb form; that was kindly addressed in an incoming pull request. We also updated the CI scripts a few times during this period as needed, switched to using Authors@R, and refreshed and updated a number of URL references. Courtesy of my CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

13 January 2026

Simon Josefsson: Debian Libre Live 13.3.0 is released!

Following up on my initial announcement about Debian Libre Live I am happy to report on continued progress and the release of Debian Libre Live version 13.3.0. Since both this and the previous 13.2.0 release are based on the stable Debian trixie release, there really aren't a lot of major changes but instead incremental minor progress for the installation process. Repeated installations have a tendency to reveal bugs, and we have resolved the apt sources list confusion for Calamares-based installations and a couple of other nits. This release is more polished and we are not aware of any known remaining issues with it (unlike earlier versions, which were released with known problems), although we conservatively regard the project as still in beta. A Debian Libre Live logo is needed before marking this as stable, any graphically talented takers? (Please base it on the Debian SVG upstream logo image.) We provide GNOME, KDE, and XFCE desktop images, as well as a text-only standard image, which match the regular Debian Live images with non-free software on them, but also provide a slim variant which is merely 750MB compared to the 1.9GB standard image. The slim image can still start the Debian installer, and can still boot into a minimal live text-based system. The GNOME, KDE and XFCE desktop images feature the Calamares installer, and we have performed testing on a variety of machines. The standard and slim images do not have an installer available from the running live system, but all images support a boot menu entry to start the installer. With this release we also extend our arm64 support to two tested platforms. The current list of successfully installed and supported systems now includes the following hardware: This is a very limited set of machines, but the diversity in CPUs and architectures should hopefully reflect well on a wide variety of commonly available machines.
Several of these machines are crippled (usually GPU or WiFi) without adding non-free software; complain at your hardware vendor and adapt your use-cases and future purchases. The images are as follows, with SHA256SUM checksums and GnuPG signatures on the 13.3.0 release page. Curious how the images were made? Fear not, for the Debian Libre Live project README has documentation, the run.sh script is short, and the .gitlab-ci.yml CI/CD Pipeline definition file is brief. Happy Libre OS hacking!

Freexian Collaborators: Debian Contributions: dh-python development, Python 3.14 and Ruby 3.4 transitions, Surviving scraper traffic in Debian CI and more! (by Anupa Ann Joseph)

Debian Contributions: 2025-12 Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

dh-python development, by Stefano Rivera In Debian we build our Python packages with the help of a debhelper-compatible tool, dh-python. Before starting the 3.14 transition (that would rebuild many packages) we landed some updates to dh-python to fix bugs and add features. This started a month of attention on dh-python, iterating through several bug fixes, and a couple of unfortunate regressions. dh-python is used by almost all packages containing Python (over 5000). Most of these are very simple, but some are complex and use dh-python in unexpected ways. It's hard to prevent almost any change (including obvious bug fixes) from causing some unexpected knock-on behaviour. There is a fair amount of complexity in dh-python, and some rather clever code, which can make it tricky to work on. All of this means that good QA is important. Stefano spent some time adding type annotations and specialized types to make it easier to see what the code is doing and catch mistakes. This has already made work on dh-python easier. Now that Debusine has built-in repositories and debdiff support, Stefano could quickly test the effects of changes on many other packages. After each big change, he could upload dh-python to a repository, rebuild e.g. 50 Python packages with it, and see what differences appeared in the output. Reviewing the diffs is still a manual process, but can be improved. Stefano did a small test on what it would take to replace direct setuptools setup.py calls with PEP-517 (pyproject-style) builds. There is more work to do here.

Python 3.14 transition, by Stefano Rivera (et al.) In December the transition to add Python 3.14 as a supported version started in Debian unstable. To do this, we update the list of supported versions in python3-defaults, and then start rebuilding modules with C extensions from the leaves inwards. This had already been tested in a PPA and Ubuntu, so many of the biggest blocking compatibility issues with 3.14 had already been found and fixed. But there are always new issues to discover. Thanks to a number of people in the Debian Python team, we got through the first bit of the transition fairly quickly. There are still a number of open bugs that need attention and many failed tests blocking migration to testing. Python 3.14.1 was released just after we started the transition, and very soon after, a follow-up 3.14.2 release came out to address a regression. We ran into another regression in Python 3.14.2.

Ruby 3.4 transition, by Lucas Kanashiro (et al.) The Debian Ruby team just started the preparation to move the default Ruby interpreter version to 3.4. At the moment, the ruby3.4 source package is already available in experimental, and ruby-defaults has added support for Ruby 3.4. Lucas rebuilt all reverse dependencies against this new version of the interpreter and published the results here. Lucas also reached out to some stakeholders to coordinate the work. Next steps are: 1) announcing the results to the whole team and asking for help to fix packages failing to build against the new interpreter; 2) filing bugs against packages FTBFSing against Ruby 3.4 which are not fixed yet; 3) once we have a low number of build failures against Ruby 3.4, asking the Debian Release team to start the transition in unstable.

Surviving scraper traffic in Debian CI, by Antonio Terceiro Like most of the open web, Debian Continuous Integration has been struggling for a while to keep up with the insatiable hunger of data scrapers everywhere. Solving this involved a lot of trial and error; the final result seems to be stable, and consists of two parts. First, all Debian CI data pages, except the direct links to test log files (such as those provided by the Release Team's testing migration excuses), now require users to be authenticated before being accessed. This means that the Debian CI data is no longer publicly browseable, which is a bit sad. However, this is where we are now. Additionally, there is now a fail2ban-powered firewall-level access limitation for clients that display an abusive access pattern. This went through several iterations, with some of them unfortunately blocking legitimate Debian contributors, but the current state seems to strike a good balance between blocking scrapers and not blocking real users. Please get in touch with the team on the #debci OFTC channel if you are affected by this.
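A fail2ban setup along these lines can be sketched as a jail definition. The jail name, log path and thresholds below are purely illustrative assumptions, not Debian CI's actual configuration, and a matching filter definition (describing what an "abusive" log line looks like) would also be needed:

```shell
# Write a hypothetical fail2ban jail that bans clients showing an
# abusive access pattern: more than 300 requests within 60 seconds
# leads to a one-hour firewall-level ban. All values are made up.
cat > ci-scrapers.local <<'EOF'
[ci-scrapers]
enabled  = true
port     = http,https
logpath  = /var/log/nginx/access.log
findtime = 60
maxretry = 300
bantime  = 3600
EOF
cat ci-scrapers.local
```

On a real host this file would live under /etc/fail2ban/jail.d/, and tuning findtime/maxretry is exactly the kind of iteration the post describes to avoid banning legitimate users.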

A hybrid dependency solver for crossqa.debian.net, by Helmut Grohne crossqa.debian.net continuously cross builds packages from the Debian archive. Like Debian's native build infrastructure, it uses dose-builddebcheck to determine whether a package's dependencies can be satisfied before attempting a build. About one third of Debian's packages fail this check, so understanding the reasons is key to improving cross building. Unfortunately, dose-builddebcheck stops after reporting the first problem and does not display additional ones. To address this, a greedy solver implemented in Python now examines each build-dependency individually and can report multiple causes. dose-builddebcheck is still used as a fall-back when the greedy solver does not identify any problems. The report for bazel-bootstrap is a lengthy example.

rebootstrap, by Helmut Grohne Due to the changes suggested by Loongson earlier, rebootstrap now adds debhelper to its final installability test and builds a few more packages required for installing it. It also now uses a variant of build-essential that has been marked Multi-Arch: same (see foundational work from last year). This in turn made the use of a non-default GCC version more difficult and required additional work to support gcc-16 from experimental. Ongoing archive changes temporarily regressed building fribidi and dash. libselinux and groff have received patches for architecture-specific changes and libverto has been NMUed to remove the glib2.0 dependency.

Miscellaneous contributions
  • Stefano did some administrative work on debian.social and debian.net instances and Debian reimbursements.
  • Stefano did routine updates of python-authlib, python-mitogen, xdot.
  • Stefano spent several hours discussing Debian's Python package layout with the PyPA upstream community. Debian has ended up with a very different on-disk installed Python layout than other distributions, and this continues to cause some frustration in many communities that have to carry special workarounds to handle it. This ended up impacting cross builds, as Helmut discovered.
  • Raphaël set up Debusine workflows for the various backports repositories on debusine.debian.net.
  • Zulip is not yet in Debian (RFP in #800052), but Raphaël helped on the French translation as he is experimenting with that discussion platform.
  • Antonio performed several routine Salsa maintenance tasks, including fixing salsa-nm-sync, the service that synchronizes project members' data from LDAP to Salsa, which had been broken since salsa.debian.org was upgraded to trixie.
  • Antonio deployed a new amd64 worker host for Debian CI.
  • Antonio did several DebConf technical and administrative bits, including adding support for custom check-in/check-out dates in the MiniDebConf registration module and publishing a call for bids for DebConf27.
  • Carles reviewed and submitted 14 Catalan translations using po-debconf-manager.
  • Carles improved po-debconf-manager: added a delete-package command, show-information now uses properly formatted output (YAML), and it now also attaches the translation to bug reports for which a merge request has been open too long.
  • Carles investigated why some packages appeared in po-debconf-manager but not in the Debian l10n list. It turns out that some packages had debian/po/templates.pot (appearing in po-debconf-manager) but not the POTFILES.in file as expected. He created a script to find out which packages were in this or a similar situation and reported bugs.
  • Carles tested and documented how to set up voices (mbrola and festival) when using the Orca speech synthesizer. He commented on a few issues and possible improvements in the debian-accessibility list.
  • Helmut sent patches for 48 cross build failures and initiated discussions on how to deal with two non-trivial matters. Besides Python mentioned above, CMake introduced a cmake_pkg_config builtin which is not aware of the host architecture. He also forwarded a Meson patch upstream.
  • Thorsten uploaded a new upstream version of cups to fix a nasty bug that was introduced by the latest security update.
  • Along with many other Python 3.14 fixes, Colin fixed a tricky segfault in python-confluent-kafka after a helpful debugging hint from upstream.
  • Colin upstreamed an improved version of an OpenSSH patch we've been carrying since 2008 to fix misleading verbose output from scp.
  • Colin used Debusine to coordinate transitions for astroid and pygments, and wrote up the astroid case on his blog.
  • Emilio helped with various transitions, and provided a build fix for opencv for the ffmpeg 8 transition.
  • Emilio tested the GNOME updates for trixie proposed updates (gnome-shell, mutter, glib2.0).
  • Santiago helped to review the status of how to test different build profiles in parallel on the same pipeline, using the test-build-profiles job. This means, for example, simultaneously testing build profiles such as nocheck and nodoc for the same git tree. Finally, Santiago provided MR !685 to fix the documentation.
  • Anupa prepared a bits post for the Outreachy interns announcement along with Tássia Camões Araújo and worked on publicity team tasks.

12 January 2026

Louis-Philippe Véronneau: Reducing the size of initramfs kernel images

In the past few years, the size of the kernel images in Debian has been steadily growing. I don't see this as a problem per se, but it has been causing me trouble, as my /boot partition has become too small to accommodate two kernel images at the same time. Since I'm running Debian Unstable on my personal systems and keep them updated with unattended-upgrade, this meant each (frequent) kernel upgrade triggered an error like this one:
update-initramfs: failed for /boot/initrd.img-6.17.11+deb14-amd64 with 1.
dpkg: error processing package initramfs-tools (--configure):
 installed initramfs-tools package post-installation script subprocess returned
 error exit status 1
Errors were encountered while processing:
 initramfs-tools
E: Sub-process /usr/bin/dpkg returned an error code (1)
This would in turn break the automated upgrade process and require me to manually delete the currently running kernel (which works, but isn't great) to complete the upgrade. The "obvious" solution would have been to increase the size of my /boot partition to something larger than the default 456M. Since my systems use full-disk encryption and LVM, this isn't trivial and would have required me to play Tetris and swap files back and forth using another drive. Another solution proposed by anarcat was to migrate to systemd-boot (I'm still using grub), use Unified Kernel Images (UKI) and merge the /boot and /boot/efi partitions. Since I already have a bunch of configurations using grub and I am not too keen on systemd taking over all the things on my computer, I was somewhat reluctant. As my computers are all configured by Puppet, I could of course have done a complete system reinstallation, but again, this was somewhat more involved than what I wanted it to be. After looking online for a while, I finally stumbled on this blog post by Neil Brown detailing how to shrink the size of the initramfs images. With MODULES=dep my images shrunk from 188M to 41M, fixing my issue. Thanks Neil! I was somewhat worried removing kernel modules would break something on my systems, but so far, I only had to manually load the i2c_dev module, as I need it to manage my home monitor's brightness using ddcutil.
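The MODULES=dep change boils down to one line in the initramfs-tools configuration. The snippet below operates on a scratch copy so it is safe to run anywhere; on a real Debian system you would instead edit /etc/initramfs-tools/initramfs.conf (or drop a file into /etc/initramfs-tools/conf.d/) and then rebuild the image:

```shell
# Demonstrate the configuration change on a local copy rather than the
# live /etc/initramfs-tools/initramfs.conf (file name here is made up).
conf=initramfs.conf.demo
printf 'MODULES=most\n' > "$conf"               # Debian default: bundle most modules
sed -i 's/^MODULES=most$/MODULES=dep/' "$conf"  # only modules this machine actually needs
cat "$conf"                                     # -> MODULES=dep
# On the real system, follow up with: update-initramfs -u
# and be ready to modprobe anything that turns out to be missing
# (such as the i2c_dev module mentioned above).
```

With MODULES=dep, mkinitramfs inspects the running system and includes only the modules needed to bring it up, which is what produces the dramatic size reduction described in the post.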

Dirk Eddelbuettel: RcppAnnoy 0.0.23 on CRAN: Several Updates

A new release, now at version 0.0.23, of RcppAnnoy has arrived on CRAN, just a little short of two years since the previous release. RcppAnnoy is the Rcpp-based R integration of the nifty Annoy library by Erik Bernhardsson. Annoy is a small and lightweight C++ template header library for very fast approximate nearest neighbours, originally developed to drive the Spotify music discovery algorithm. It had all the buzzwords already a decade ago: it is one of the algorithms behind (drum roll…) vector search as it finds approximate matches very quickly and also allows one to persist the data. This release contains three contributed pull requests covering a new metric, a new demo and quieter compilation, some changes to documentation and, last but not least, general polish including letting the vignette now use the Rcpp::asis builder. Details of the release follow based on the NEWS file.

Changes in version 0.0.23 (2026-01-12)
  • Add dot product distance metrics (Benjamin James in #78)
  • Apply small polish to the documentation (Dirk closing #79)
  • A new demo() has been added (Samuel Granjeaud in #79)
  • Switch to Authors@R in DESCRIPTION
  • Several updates to continuous integration and README.md
  • Small enhancements to package help files
  • Updates to vignettes and references
  • Vignette now uses Rcpp::asis builder (Dirk in #80)
  • Switch one macro to a function to avoid a compiler nag (Amos Elberg in #81)

Courtesy of my CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

11 January 2026

Dirk Eddelbuettel: RApiDatetime 0.0.10 on CRAN: Maintenance

A new maintenance release of our RApiDatetime package is now on CRAN, coming just about two years after the previous maintenance release. RApiDatetime provides a number of entry points for C-level functions of the R API for Date and Datetime calculations. The functions asPOSIXlt and asPOSIXct convert between long and compact datetime representation, formatPOSIXlt and Rstrptime convert to and from character strings, and POSIXlt2D and D2POSIXlt convert between Date and POSIXlt datetime. Lastly, asDatePOSIXct converts to a date type. All these functions are rather useful, but were not previously exported by R for C-level use by other packages. Which this package aims to change. This release avoids the use of two C-level calls which are now outlawed under R-devel, and makes a number of other smaller maintenance updates. Just like the previous release, we are at OS_type: unix meaning there will not be any Windows builds at CRAN. If you would like that to change, and ideally can work on the Windows portion, do not hesitate to get in touch. Details of the release follow based on the NEWS file.

Changes in RApiDatetime version 0.0.10 (2026-01-11)
  • Minor maintenance for continuous integration files, README.md
  • Switch to Authors@R in DESCRIPTION
  • Use Rf_setAttrib with R 4.5.0 or later

Courtesy of my CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

9 January 2026

Simon Josefsson: Debian Taco Towards a GitSecDevOps Debian

One of my holiday projects was to understand and gain more trust in how Debian binaries are built, and as the holidays are coming to an end, I'd like to introduce a new research project called Debian Taco. I apparently need more holidays, because there is still more work to be done here, so at the end I'll summarize some pending work. Debian Taco, or TacOS, is a GitSecDevOps rebuild of Debian GNU/Linux. The Debian Taco project publishes rebuilt binary packages, package repository metadata (InRelease, Packages, etc), container images, cloud images and live images. All packages are built from pristine source packages in the Debian archive. Debian Taco does not modify any Debian source code nor add or remove any packages found in Debian. No servers are involved! Everything is built in GitLab pipelines and results are published through modern GitDevOps mechanisms like GitLab Pages and S3 object storage. You can fork the individual projects below on GitLab.com and you will have your own Debian-derived OS available for tweaking. (Of course, at some level, servers are always involved, so this claim is a bit of hyperbole.)

Goals The goal of TacOS is to be bit-by-bit identical with official Debian GNU/Linux, and until that has been completed, to publish diffoscope output with differences. The idea is to further categorize all artifact differences into one of the following categories: 1) An obvious bug in Debian. For example, if a package does not build reproducibly. 2) An obvious bug in TacOS. For example, if our build environment does not manage to build a package. 3) Something else. This would be input for further research and consideration. This category also includes things where it isn't obvious whether it is a bug in Debian or in TacOS. Known examples: 3A) Packages in TacOS are rebuilt from the latest available source code, not from the (potentially) older versions that were used to build the Debian packages. This could lead to differences in the packages. These differences may be useful to analyze to identify supply-chain attacks. See some discussion about idempotent rebuilds. Our packages are all built from source code, unless we have not yet managed to build something. In the latter situation, Debian Taco falls back and uses the official Debian artifact. This allows an incremental publication of Debian Taco that is still 100% complete without requiring that everything be rebuilt instantly. The goal is that everything should be rebuilt, and until that has been completed, to publish a list of artifacts that we use verbatim from Debian.
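The fallback rule described above (publish the rebuilt artifact when it matches, otherwise fall back to the official Debian one) can be sketched as a simple bit-by-bit comparison. The file names and contents below are hypothetical stand-ins; the real project compares actual .deb files and uses diffoscope to produce the difference reports:

```shell
# Stand-in artifacts; in reality these would be the official .deb from
# the Debian archive and the one rebuilt in the GitLab pipeline.
printf 'same-bytes' > official.deb
printf 'same-bytes' > rebuilt.deb

# cmp -s exits 0 only if the two files are byte-for-byte identical.
if cmp -s official.deb rebuilt.deb; then
    echo "bit-by-bit identical: publish the rebuilt artifact"
else
    echo "differs: fall back to the official Debian artifact"
fi
```

Because the check is purely bit-level, any difference at all (even a timestamp) routes the package into the "use the Debian artifact verbatim" list, which is what keeps the archive 100% complete during the incremental rebuild.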

Debian Taco Archive The Debian Taco Archive project generates and publishes the package archive (dists/tacos-trixie/InRelease, dists/tacos-trixie/main/binary-amd64/Packages.gz, pool/* etc), similar to what is published at https://deb.debian.org/debian/. The output of the Debian Taco Archive is available from https://debdistutils.gitlab.io/tacos/archive/.

Debian Taco Container Images The Debian Taco Container Images project provides container images of Debian Taco for trixie, forky and sid on the amd64, arm64, ppc64el and riscv64 architectures. These images allow quick and simple interactive use of Debian Taco, and make it easy to deploy for container orchestration frameworks.

Debian Taco Cloud Images The Debian Taco Cloud Images project provides cloud images of Debian Taco for trixie, forky and sid on the amd64, arm64, ppc64el and riscv64 architectures. Launch and install Debian Taco for your cloud environment!

Debian Taco Live Images The Debian Taco Live Images project provides live images of Debian Taco for trixie, forky and sid on the amd64 and arm64 architectures. These images allow running Debian Taco on physical hardware (or virtual machines), and even installing it for permanent use.

Debian Taco Build Images and Packages Packages are built using debdistbuild, which was introduced in a blog post about Build Debian in a GitLab Pipeline. The first step is to prepare build images, which is done by the Debian Taco Build Images project. They are similar to the Debian Taco containers but have build-essential and debdistbuild installed on them. Debdistbuild is launched in a per-architecture per-suite CI/CD project. Currently only trixie-amd64 is available. That project has built some essential early packages like base-files, debian-archive-keyring and hostname. They are stored in Git LFS backed by S3 object storage. These packages were all built reproducibly. So this means Debian Taco is still 100% bit-by-bit identical to Debian, except for the renaming. I've yet to launch a more massive wide-scale package rebuild until some outstanding issues have been resolved. I earlier rebuilt around 7000 packages from Trixie on amd64, so I know that the method easily scales.

Remaining work Where are the diffoscope outputs and the list of package differences? For another holiday! Clearly this is an important remaining work item. Another important outstanding issue is how to orchestrate launching the build of all packages. Clearly a list of packages is needed, and some trigger mechanism to understand when new packages are added to Debian. One goal was to build packages from the tag2upload browse.dgit.debian.org archive before checking the Debian Archive. This ought to be really simple to implement, but other matters came first.

GitLab or Codeberg? Everything is written using basic POSIX /bin/sh shell scripts. Debian Taco uses the GitLab CI/CD Pipeline mechanism together with a Hetzner S3 object storage to serve packages. The scripts have only weak reliance on GitLab-specific principles, and were designed with the intention to support other platforms. I believe reliance on a particular CI/CD platform is a limitation, so I'd like to explore shipping Debian Taco through a Forgejo-based architecture, possibly via Codeberg as soon as I manage to deploy reliable Forgejo runners. The important aspects that are required are: 1) Pipelines that can build and publish web sites similar to GitLab Pages. Codeberg has a pipeline mechanism. I've successfully used Codeberg Pages to publish the OATH Toolkit homepage. Glueing this together seems feasible. 2) Container Registry. It seems Forgejo supports a Container Registry but I've not worked with it at Codeberg to understand if there are any limitations. 3) Package Registry. The Debian Taco live images are uploaded into a package registry, because they are too big to be served through GitLab Pages. It may be converted to using a Pages mechanism, or possibly Release Artifacts if multi-GB artifacts are supported on other platforms. I hope to continue this work and explain more details in a series of posts, stay tuned!

8 January 2026

Reproducible Builds: Reproducible Builds in December 2025

Welcome to the December 2025 report from the Reproducible Builds project! Our monthly reports outline what we've been up to over the past month, highlighting items of news from elsewhere in the increasingly-important area of software supply-chain security. As ever, if you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website.

  1. New orig-check service to validate Debian upstream tarballs
  2. Distribution work
  3. disorderfs updated to FUSE 3
  4. Mailing list updates
  5. Three new academic papers published
  6. Website updates
  7. Upstream patches

New orig-check service to validate Debian upstream tarballs This month, Debian Developer Lucas Nussbaum announced the orig-check service, which attempts to automatically reproduce the generation of upstream tarballs (i.e. the original source component of a Debian source package), comparing the result to the upstream tarball actually shipped with Debian. As of the time of writing, it is possible for a Debian developer to upload a source archive that does not actually correspond to upstream's version. Whilst this is not inherently malicious (it typically indicates some tooling/process issue), the very possibility that a maintainer's version may differ potentially permits a maintainer to make (malicious) changes that would be misattributed to upstream. This service therefore nicely complements the whatsrc.org service, which was reported in our reports for both April and August. The orig-check service is dedicated to Lunar, who sadly passed away a year ago.
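The comparison at the heart of such a check can be sketched with GNU tar. The tree and tarball names below are hypothetical, and real orig tarballs also involve compression (which introduces its own timestamp pitfalls); the point is that with pinned metadata, two independent generations of the archive come out identical:

```shell
# Build an "upstream" and a "rebuilt" tarball from the same pristine
# tree, pinning the metadata that normally varies between runs
# (file ordering, owner, mtime), then compare the results byte-for-byte.
mkdir -p demo/pkg-1.0
printf 'hello\n' > demo/pkg-1.0/README
for out in upstream.tar rebuilt.tar; do
    tar --sort=name --owner=0 --group=0 --numeric-owner \
        --mtime='2026-01-01 00:00Z' -C demo -cf "$out" pkg-1.0
done
cmp upstream.tar rebuilt.tar && echo "orig tarball reproduced bit-for-bit"
```

Any mismatch at this stage is exactly the signal orig-check is after: the shipped archive was not generated from the pristine upstream content in a reproducible way.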

Distribution work In Arch Linux this month, Robin Candau and Mark Hegreberg worked on making the Arch Linux WSL image bit-for-bit reproducible. Robin also shared some implementation details and future related work on our mailing list. Continuing a series reported in these reports for March, April and July 2025 (etc.), Simon Josefsson has published another interesting article this month, itself a followup to a post Simon published in December 2024 regarding GNU Guix Container Images that are hosted on GitLab. In Debian this month, Micha Lenk posted to the debian-backports-announce mailing list with the news that the Backports archive will now discard binaries generated and uploaded by maintainers: "The benefit is that all binary packages [will] get built by the Debian buildds before we distribute them within the archive." Felix Moessbauer of Siemens then filed a bug in the Debian bug tracker to signal their intention to package debsbom, a software bill of materials (SBOM) generator for distributions based on Debian. This generated a discussion on the bug inquiring about the output format as well as a question about how these SBOMs might be distributed. Holger Levsen merged a number of significant changes written by Alper Nebi Yasak to the Debian Installer in order to improve its reproducibility. As noted in Alper's merge request: "These are the reproducibility fixes I looked into before the bookworm release, but was a bit afraid to send as it's just before the release, because things like the xorriso conversion change the content of the files to try to make them reproducible." In addition, 76 reviews of Debian packages were added, 8 were updated and 27 were removed this month, adding to our knowledge about identified issues. A new different_package_content_when_built_with_nocheck issue type was added by Holger Levsen.
[…] Arnout Engelen posted to our mailing list reporting that they successfully reproduced the NixOS minimal installation ISO for the 25.11 release without relying on a pre-compiled package archive, with more details on their blog. Lastly, Bernhard M. Wiedemann posted another openSUSE monthly update for his work there.

disorderfs updated to FUSE 3 disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into system calls to reliably flush out reproducibility issues. This month, however, Roland Clobus upgraded disorderfs from FUSE 2 to FUSE 3 after its package automatically got removed from Debian testing. Some tests in Debian currently require disorderfs to make the Debian live images reproducible, although disorderfs is not a Debian-specific tool.

Mailing list updates On our mailing list this month:
  • Luca Di Maio announced stampdalf, a filesystem timestamp preservation tool that wraps arbitrary commands and ensures filesystem timestamp reproducibility:
    stampdalf allows you to run any command that modifies files in a directory tree, then automatically resets all timestamps back to their original values. Any new files created during command execution are set to [the UNIX epoch] or a custom timestamp via SOURCE_DATE_EPOCH.
    The project's GitHub page helpfully reveals that the project is pronounced "stamp-dalf (stamp like time-stamp, dalf like Gandalf the wizard)" as it's "a wizard of time and stamps".
  • Lastly, Reproducible Builds developer cen1 posted to our list announcing that early/experimental/alpha support for FreeBSD was added to rebuilderd. In their post, cen1 reports that the initial builds are in progress and "look quite decent". cen1 also interestingly notes that "since the upstream is currently not technically reproducible I had to relax the bit-for-bit identical requirement of rebuilderd […] I consider the pkg to be reproducible if the tar is content-identical (via diffoscope), ignoring timestamps and some of the manifest files."
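The record-run-restore idea behind stampdalf can be sketched in a few lines. This is an illustrative Python re-implementation of the concept, not stampdalf's actual code; the function name is made up, and only the behaviour described in the announcement (restore prior mtimes, stamp new files with SOURCE_DATE_EPOCH or a custom value) is assumed:

```python
import os

def run_with_stable_mtimes(tree, command, epoch=None):
    """Record mtimes under `tree`, run `command`, then restore them.

    Files that existed before get their original mtime back; files the
    command created get `epoch`, defaulting to SOURCE_DATE_EPOCH or 0.
    Sketch of the stampdalf idea only -- not its implementation.
    """
    if epoch is None:
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    before = {}
    for root, _, files in os.walk(tree):
        for f in files:
            p = os.path.join(root, f)
            before[p] = os.stat(p).st_mtime
    command()  # arbitrary tree-modifying work
    for root, _, files in os.walk(tree):
        for f in files:
            p = os.path.join(root, f)
            t = before.get(p, epoch)   # new file -> fixed epoch
            os.utime(p, (t, t))
```

Wrapping a build step this way keeps timestamp churn out of the resulting artifacts without touching the build tool itself.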

Three new academic papers published Yogya Gamage and Benoit Baudry of Université de Montréal, Canada, together with Deepika Tiwari and Martin Monperrus of the KTH Royal Institute of Technology, Sweden, published a paper on The Design Space of Lockfiles Across Package Managers:
Most package managers also generate a lockfile, which records the exact set of resolved dependency versions. Lockfiles are used to reduce build times; to verify the integrity of resolved packages; and to support build reproducibility across environments and time. Despite these beneficial features, developers often struggle with their maintenance, usage, and interpretation. In this study, we unveil the major challenges related to lockfiles, such that future researchers and engineers can address them. [ ]
A PDF of their paper is available online. Benoit Baudry also posted an announcement to our mailing list, which generated a number of replies.
Betul Gokkaya, Leonardo Aniello and Basel Halak of the University of Southampton then published a paper, A taxonomy of attacks, mitigations and risk assessment strategies within the software supply chain:
While existing studies primarily focus on software supply chain attacks prevention and detection methods, there is a need for a broad overview of attacks and comprehensive risk assessment for software supply chain security. This study conducts a systematic literature review to fill this gap. By analyzing 96 papers published between 2015-2023, we identified 19 distinct SSC attacks, including 6 novel attacks highlighted in recent studies. Additionally, we developed 25 specific security controls and established a precisely mapped taxonomy that transparently links each control to one or more specific attacks. […]
A PDF of the paper is available online via the article's canonical page.
Aman Sharma and Martin Monperrus of the KTH Royal Institute of Technology, Sweden, along with Benoit Baudry of Université de Montréal, Canada, published a paper this month on Causes and Canonicalization of Unreproducible Builds in Java. The abstract of the paper is as follows:
[Achieving] reproducibility at scale remains difficult, especially in Java, due to a range of non-deterministic factors and caveats in the build process. In this work, we focus on reproducibility in Java-based software, archetypal of enterprise applications. We introduce a conceptual framework for reproducible builds, we analyze a large dataset from Reproducible Central, and we develop a novel taxonomy of six root causes of unreproducibility. […]
A PDF of the paper is available online.

Website updates Once again, there were a number of improvements made to our website this month including:

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Dirk Eddelbuettel: RcppCCTZ 0.2.14 on CRAN: New Upstream, Small Edits

A new release 0.2.14 of RcppCCTZ is now on CRAN, in Debian, and built for r2u. RcppCCTZ uses Rcpp to bring CCTZ to R. CCTZ is a C++ library for translating between absolute and civil times using the rules of a time zone. In fact, it is two libraries: one for dealing with civil time, i.e. human-readable dates and times, and one for converting between absolute and civil times via time zones. And while CCTZ is made by Google(rs), it is not an official Google product. The RcppCCTZ page has a few usage examples and details. This package was the first CRAN package to use CCTZ; by now several other packages (four the last time we counted) include its sources too. Not ideal, but beyond our control. This version updates to a new upstream release, and brings some small local edits. CRAN and R-devel stumbled over us still mentioning C++11 in SystemRequirements (yes, this package is old enough for that to have mattered once). As that is a false positive (the package compiles well under any recent standard), we removed the mention. The key changes since the last CRAN release are summarised below.
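The civil-versus-absolute distinction CCTZ is built around can be illustrated in Python's standard library, which (like CCTZ) resolves named zones via the IANA tzdata rules. This is a conceptual sketch only, not CCTZ's API, and it assumes a platform where `zoneinfo` can find the tz database:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # stdlib since Python 3.9; needs tzdata

# Civil time: human-readable wall-clock fields in a named zone.
civil = datetime(2026, 1, 8, 9, 30, tzinfo=ZoneInfo("America/Chicago"))

# Absolute time: the unique instant on the UTC timeline.
absolute = civil.astimezone(timezone.utc)  # 15:30 UTC (CST is UTC-6)

# Converting back through the zone's rules recovers the civil form.
assert absolute.astimezone(ZoneInfo("America/Chicago")) == civil
```

The point is that the mapping between the two representations is owned by the zone's rules (including DST transitions), which is exactly the bookkeeping CCTZ's two libraries split between them.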

Changes in version 0.2.14 (2026-01-08)
  • Synchronized with upstream CCTZ (Dirk in #46).
  • Explicitly enumerate files to be compiled in src/Makevars* (Dirk in #47).

Courtesy of my CRANberries, there is a diffstat report relative to the previous version. More details are at the RcppCCTZ page; code, issue tickets etc. at the GitHub repository.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Dirk Eddelbuettel: RcppSpdlog 0.0.24 on CRAN: New Upstream

Version 0.0.24 of RcppSpdlog arrived on CRAN today, has been uploaded to Debian, and has also been built for r2u. This follows an upstream release on Sunday which we incorporated immediately, but CRAN was still closed for the winter break until yesterday. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want, written by Gabi Melman; it also includes fmt by Victor Zverovich. You can learn more at the nice package documentation site. This release updates the code to version 1.17.0 of spdlog, which was released yesterday morning, and includes version 12.1.0 of fmt. No other changes besides tweaks to the documentation site (which was updated to use altdoc last release) have been made. The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.24 (2026-01-07)
  • Upgraded to upstream release spdlog 1.17.0 (including fmt 12.1.0)

Courtesy of my CRANberries, there is also a diffstat report detailing changes. More detailed information is on the RcppSpdlog page, or the package documentation site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

5 January 2026

Colin Watson: Free software activity in December 2025

About 95% of my Debian contributions this month were sponsored by Freexian. You can also support my work directly via Liberapay or GitHub Sponsors.

Python packaging I upgraded these packages to new upstream versions:

Python 3.14 is now a supported version in unstable, and we're working to get that into testing. As usual this is a pretty arduous effort because it requires going round and fixing lots of odds and ends across the whole ecosystem. We can deal with a fair number of problems by keeping up with upstream (see above), but there tends to be a long tail of packages whose upstreams are less active and where we need to chase them, or where problems only show up in Debian for one reason or another. I spent a lot of time working on this:

Fixes for pytest 9:

I filed lintian: Report Python egg-info files/directories to help us track the migration to pybuild-plugin-pyproject. I did some work on dh-python: Normalize names in pydist lookups and pyproject plugin: Support headers (the latter of which allowed converting python-persistent and zope.proxy to pybuild-plugin-pyproject, although it needed a follow-up fix). I fixed or helped to fix several other build/test failures:

Other bugs:

Other bits and pieces

Code reviews
