
13 February 2026

Erich Schubert: Dogfood Generative AI

Current AI companies ignore licenses such as the GPL and often train on anything they can scrape. This is not acceptable. The AI companies also ignore web conventions: they deep-link images from your web sites (even adding ?utm_source=chatgpt.com to image URIs; I suggest that you return 403 on these requests) but do not direct visitors to your site. You do not get a reliable way of opting out of generative AI training or use. For example, the only way to prevent your content from being used in Google AI Overviews is to use data-nosnippet and cripple the snippet preview in Google. AI browsers such as Comet and Atlas do not identify as such, but rather pretend to be standard Chromium; there is no way to ban such AI use on your web site. Generative AI overall is flooding the internet with garbage. It has been estimated that a third of the content uploaded to YouTube is by now AI generated. This includes the same veteran-story crap in thousands of variants as well as brainrot content (which at least does not pretend to be authentic), some of which is among the most viewed recent uploads. Hence, these platforms even benefit from the AI slop. And don't blame the creators: as long as you can earn a decent amount of money from such content, people will generate brainrot content. If you have recently tried to find honest reviews of products you considered buying, you will have noticed thousands of sites with AI-generated fake product reviews, all financed by Amazon PartnerNet commissions, often with hilarious nonsense such as recommending sewing thread with German instructions as a tool for repairing a sewing machine. On Amazon itself, there are plenty of AI-generated product reviews; the use of emoji is a strong hint. And if you leave a negative product review, there is a chance they offer you a refund to get rid of it. The majority of SPAM that gets through my filters is by now sent via Gmail and Amazon SES.
Partially because of GenAI, StackOverflow, which used to be one of the most valuable programming resources, is pretty much dead. (While a lot of people complain about moderation, famous moderator Shog9 from the early SO days suggested that a change in Google's ranking is also to blame: it began favoring new content over the existing answered questions, causing more and more duplicates to be posted because people no longer found the existing good answers.) In January 2026, there were around 3400 questions and 6000 answers posted, fewer than in the first month of SO, August 2008 (before the official launch). Many open-source projects are suffering in many ways; e.g., false bug reports caused curl to stop its bug bounty program. Wikipedia is also suffering badly from GenAI. Science is also flooded with poor AI-generated papers, often reviewed with help from AI. This is largely due to bad incentives: to graduate, you are expected to publish many papers at certain A conferences, such as NeurIPS. At these conferences the number of submissions is growing insanely, and the review quality plummets. All too often, the references in these papers are hallucinated, too; and libraries complain that they receive more and more requests to locate literature that does not appear to exist. However, the worst effect (at least to me as an educator) is the noskilling effect (a rather novel term derived from deskilling; I have only seen it in this article by Weßels and Maibaum). Instead of acquiring skills (writing, reading, summarizing, programming) by practising, too many people now outsource all this to AI, leading to them not learning the basics necessary to advance to a higher skill level. In my impression, this effect is dramatic. It is even worse than deskilling, as it does not mean losing an advanced skill that you apparently can replace, but often means not acquiring basic skills in the first place.
And the earlier pupils start using generative AI, the fewer skills they acquire.

Dogfood the AI

Let's dogfood the AI. Here's an outline:
  1. Get a list of programming topics, e.g., get a list of algorithms from Wikidata, get a StackOverflow data dump.
  2. Generate flawed code examples for the algorithms / programming questions, maybe generate blog posts, too.
    You do not need a high-quality model for this. Use something you can run locally or access for free.
  3. Date everything back in time, remove typical indications of AI use.
  4. Upload to GitHub, because Microsoft will feed this to OpenAI.
Here is an example prompt that you can use:
You are a university educator, preparing homework assignments in debugging.
The programming language used is {lang}.
The students are tasked to find bugs in given code.
Do not just call existing implementations from libraries, but implement the algorithm from scratch.
Make sure there are two mistakes in the code that need to be discovered by the students.
Do NOT repeat instructions. Do NOT add small-talk. Do NOT provide a solution.
The code may have (misleading) comments, but must NOT mention the bugs.
If you do not know how to implement the algorithm, output an empty response.
Output only the code for the assignment! Do not use markdown.
Begin with a code comment that indicates the algorithm name and idea.
If you indicate a bug, always use a comment with the keyword BUG.
Generate a {lang} implementation (with bugs) of: {n} ({desc})
Remember to remove the BUG comments! If you pick a slightly less common programming language (by quantity of available code, say Go or Rust), you have higher chances that this gets into the training data. If many of us do this, we can feed GenAI its own garbage. If we generate thousands of bad code examples, this will poison their training data, and may eventually lead to an effect known as model collapse. In the long run, we need to get back to an internet for people, not an internet for bots. Some kind of "internet 2.0", but I do not have a clear vision on how to keep AI out: if AI can train on it, they will. And someone will copy and paste the AI-generated crap back into whatever system we build. Hence I don't think technology is the answer here, but human networks of trust.
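The post-processing in steps 2 and 3 can be sketched in a few lines. This is a hypothetical helper (the function name and the comment styles handled are my assumptions, not from the post) that strips the BUG marker comments from generated code before upload:

```python
import re


def strip_bug_comments(code: str) -> str:
    """Remove comments containing the BUG keyword from generated code.

    Handles #-style (Python) and //-style (Go, Rust) comments; a sketch
    only, it will misfire on strings that happen to contain "//...BUG".
    """
    cleaned = []
    for line in code.splitlines():
        stripped = line.strip()
        # Drop whole-line comments that flag a bug.
        if stripped.startswith(("#", "//")) and "BUG" in stripped:
            continue
        # Drop trailing comments that flag a bug, keeping the code itself.
        cleaned.append(re.sub(r"\s*(#|//).*BUG.*$", "", line))
    return "\n".join(cleaned)
```

For example, `strip_bug_comments("x = 1  # BUG: off by one")` returns `"x = 1"`.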

12 February 2026

Dirk Eddelbuettel: RcppSpdlog 0.0.27 on CRAN: C++20 Accommodations

Version 0.0.27 of RcppSpdlog arrived on CRAN moments ago, and will be uploaded to Debian and built for r2u shortly. The (nice) documentation site will be refreshed too. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want, written by Gabi Melman; it also includes fmt by Victor Zverovich. You can learn more at the nice package documentation site. Brian Ripley has now turned C++20 on as a default for R-devel (aka R 4.6.0-to-be), and this turned up misbehavior in packages using RcppSpdlog, such as our spdl wrapper (offering a nicer interface from both R and C++), when relying on std::format. So for now we turned this off and remain with fmt::format from the fmt library while we investigate further. The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.27 (2026-02-11)
  • Under C++20 or later, keep relying on fmt::format until issues experienced using std::format can be identified and resolved

Courtesy of my CRANberries, there is also a diffstat report detailing changes. More detailed information is on the RcppSpdlog page, or the package documentation site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Freexian Collaborators: Debian Contributions: cross building, rebootstrap updates, Refresh of the patch tagging guidelines and more! (by Anupa Ann Joseph)

Debian Contributions: 2026-01 Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

cross building, by Helmut Grohne In version 1.10.1, Meson merged a patch to make it call the correct g-ir-scanner by default, thanks to Eli Schwartz. This problem affected more than 130 source packages. Helmut retried building them all and filed 69 patches as a result. A significant portion of those packages require another Meson change to call the correct vapigen. Another notable change is converting gnu-efi to multiarch, which ended up requiring changes to a number of other packages. Since Aurelien dropped the libcrypt-dev dependency from libc6-dev, this transition is now mostly complete and has resulted in most of the Perl ecosystem correctly expressing the perl-xs-dev dependencies needed for cross building. It is these infrastructure changes affecting several client packages that this work targets. As a result of this continued work, about 66% of Debian's source packages now have satisfiable cross Build-Depends in unstable and about 10000 (55%) actually can be cross built. There are now more than 500 open bug reports affecting more than 2000 packages, most of which carry patches.

rebootstrap, by Helmut Grohne Maintaining architecture cross-bootstrap requires continued effort for adapting to archive changes such as glib2.0 dropping a build profile or an e2fsprogs FTBFS. Beyond those generic problems, architecture-specific problems with e.g. musl-linux-any or sparc may arise. While all these changes move things forward on the surface, the bootstrap tooling has become a growing pile of patches. Helmut managed to upstream two changes to glibc for reducing its Build-Depends in the stage2 build profile and thanks Aurelien Jarno.

Refresh of the patch tagging guidelines, by Raphaël Hertzog Debian Enhancement Proposal #3 (DEP-3) is named Patch Tagging Guidelines and standardizes meta-information that Debian contributors can put in patches included in Debian source packages. With the feedback received over the years, and with the change in the package management landscape, the need to refresh those guidelines became evident. As the initial driver of that DEP, I spent a good day reviewing all the feedback (which I had kept in a folder) and producing a new version of the document. The changes aim to give more weight to the syntax that is compatible with git format-patch's output, and also to clarify the expected uses and meanings of a couple of fields, including an algorithm that parsers should follow to determine the state of the patch. After the announcement of the new draft on debian-devel, the revised DEP-3 received a significant number of comments that I still have to process.

Miscellaneous contributions
  • Helmut uploaded debvm making it work with unstable as a target distribution again.
  • Helmut modernized the code base backing dedup.debian.net significantly expanding the support for type checking.
  • Helmut fixed the multiarch hinter once more given feedback from Fabian Grünbichler.
  • Helmut worked on migrating the rocblas package to forky.
  • Raphaël fixed RC bug #1111812 in publican and did some maintenance for tracker.debian.org.
  • Carles added support in the festival Debian package for systemd socket activation and systemd service and socket units. He adapted the patch for upstream and created a merge request (also fixing a macOS build-system error while working on it), updated the Orca wiki documentation regarding festival, and discussed a 2007 bug/feature in festival which allowed having a local shell, noting that the new systemd socket activation uses the same code path.
  • Carles using po-debconf-manager worked on Catalan translations: 7 reviewed and sent; 5 follow ups, 5 deleted packages.
  • Carles made some po-debconf-manager changes: it now attaches the translation file on follow-ups, and bullseye compatibility issues were fixed.
  • Carles reviewed a new Catalan apt translation.
  • Carles investigated and reported a lxhotkey bug and sent a patch for the abcde package.
  • Carles made minor updates for Debian Wiki for different pages (lxde for dead keys, Ripping with abcde troubleshooting, VirtualBox troubleshooting).
  • Stefano renamed build-details.json in Python 3.14 to fix multiarch coinstallability.
  • Stefano audited the tooling and ignore lists for checking the contents of the python3.X-minimal packages, finding and fixing some issues in the process.
  • Stefano made a few uploads of python3-defaults and dh-python in support of Python 3.14-as-default in Ubuntu. Also investigated the risk of ignoring byte-compilation failures by default, and started down the road of implementing this.
  • Stefano did some sysadmin work on debian.social infrastructure.
  • Stefano and Santiago worked on preparations for DebConf 26. Especially to help the local team on opening the registration, and reviewing the budget to be presented for approval.
  • Stefano uploaded routine updates of python-virtualenv and python-flexmock.
  • Antonio collaborated with DSA on enabling a new proxy for salsa to prevent scrapers from taking the service down.
  • Antonio did miscellaneous salsa administrative tasks.
  • Antonio fixed a few Ruby packages towards the Ruby 3.4 transition.
  • Antonio started work on planned improvements to the DebConf registration system.
  • Santiago prepared unstable updates for the latest upstream versions of knot-dns and knot-resolver, the authoritative DNS server and DNS resolver software developed by CZ.NIC. It is worth highlighting that, given the separation of functionality compared to other implementations, knot-dns and knot-resolver are also less complex software, which results in advantages in terms of security: only three CVEs have been reported for knot-dns since 2011.
  • Santiago made some routine reviews of merge requests proposed for the Salsa CI pipeline, e.g. a proposal to fix how sbuild chooses the chroot when building a package for experimental.
  • Colin fixed lots of Python packages to handle Python 3.14 and to avoid using the deprecated pkg_resources module.
  • Colin added forky support to the images used in Salsa CI pipelines.
  • Colin began working on getting a release candidate of groff 1.24.0 (the first upstream release since mid-2023, so a very large set of changes) into experimental.
  • Lucas kept working on the preparation for Ruby 3.4 transition. Some packages fixed (support build against Ruby 3.3 and 3.4): ruby-rbpdf, jekyll, origami-pdf, ruby-kdl, ruby-twitter, ruby-twitter-text, ruby-globalid.
  • Lucas supported some potential mentors in the Google Summer of Code 26 program to submit their projects.
  • Anupa worked on the point release announcements for Debian 12.13 and 13.3 from the Debian publicity team side.
  • Anupa attended the publicity team meeting to discuss the team activities and to plan an online sprint in February.
  • Anupa attended meetings with the Debian India team to plan and coordinate the MiniDebConf Kanpur and sent out related Micronews.
  • Emilio coordinated various transitions and helped get rid of llvm-toolchain-17 from sid.

10 February 2026

Freexian Collaborators: Writing a new worker task for Debusine (by Carles Pina i Estany)

Debusine is a tool designed for Debian developers and Operating System developers in general. You can try out Debusine on debusine.debian.net, and follow its development on salsa.debian.org. This post describes how to write a new worker task for Debusine. It can be used to add tasks to a self-hosted Debusine instance, or to submit to the Debusine project new tasks to add new capabilities to Debusine. Tasks are the lower-level pieces of Debusine workflows. Examples of tasks are Sbuild, Lintian, Debdiff (see the available tasks). This post will document the steps to write a new basic worker task. The example will add a worker task that runs reprotest and creates an artifact of the new type ReprotestArtifact with the reprotest log. Tasks are usually used by workflows. Workflows solve high-level goals by creating and orchestrating different tasks (e.g. a Sbuild workflow would create different Sbuild tasks, one for each architecture).

Overview of tasks A task usually does the following:
  • It receives structured data defining its input artifacts and configuration
  • Input artifacts are downloaded
  • A process is run by the worker (e.g. lintian, debdiff, etc.). In this blog post, it will run reprotest
  • The output (files, logs, exit code, etc.) is analyzed, artifacts and relations might be generated, and the work request is marked as completed, either with Success or Failure
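The lifecycle above can be illustrated with a toy skeleton. This is not Debusine's actual base class: the method names follow the tutorial below, but everything else (the driver function, the string results) is a simplified assumption made only to show the order in which the worker drives a task:

```python
from pathlib import Path


class ToyTask:
    """Minimal stand-in for a Debusine worker task (illustration only)."""

    def __init__(self) -> None:
        self.calls: list[str] = []  # record the order of lifecycle steps

    def fetch_input(self, destination: Path) -> bool:
        self.calls.append("fetch_input")  # download input artifacts
        return True

    def configure_for_execution(self, download_directory: Path) -> bool:
        self.calls.append("configure")  # install tools, locate targets
        return True

    def _cmdline(self) -> list[str]:
        self.calls.append("cmdline")  # build the command to run
        return ["true"]

    def task_result(self, returncode: int) -> str:
        self.calls.append("task_result")  # evaluate the outcome
        return "SUCCESS" if returncode == 0 else "FAILURE"

    def upload_artifacts(self, exec_directory: Path) -> None:
        self.calls.append("upload_artifacts")  # publish outputs


def run(task: ToyTask, workdir: Path) -> str:
    """Drive one task through the (assumed) lifecycle order."""
    task.fetch_input(workdir)
    task.configure_for_execution(workdir)
    task._cmdline()
    result = task.task_result(returncode=0)
    task.upload_artifacts(workdir)
    return result
```

Running `run(ToyTask(), Path("/tmp"))` returns `"SUCCESS"` and records the five steps in order.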
If you want to follow the tutorial and add the Reprotest task, your Debusine development instance should have at least one worker, one user, a debusine client set up, and permissions for the client to create tasks. All of this can be set up following the steps in the Contribute section of the documentation. This blog post shows a functional Reprotest task; it is not currently part of Debusine, and its implementation is simplified (no error handling, unit tests, specific view, or docs, and some shortcuts in the environment preparation). At some point we might add a debrebuild task to Debusine, based on buildinfo files, that uses snapshot.debian.org to recreate the binary packages.

Defining the inputs of the task The input of the reprotest task will be a source artifact (a Debian source package). We model the input with pydantic in debusine/tasks/models.py:
class ReprotestData(BaseTaskDataWithExecutor):
   """Data for Reprotest task."""

   source_artifact: LookupSingle

class ReprotestDynamicData(BaseDynamicTaskDataWithExecutor):
   """Reprotest dynamic data."""

   source_artifact_id: int | None = None
ReprotestData is what the user inputs. A LookupSingle is a lookup that resolves to a single artifact. We would also have configuration for the desired variations to test, but we have left that out of this example for simplicity; configuring variations is left as an exercise for the reader. Since ReprotestData is a subclass of BaseTaskDataWithExecutor, it also contains environment, where the user can specify the environment in which the task will run. The environment is an artifact with a Debian image. ReprotestDynamicData holds the resolution of all lookups; these can be seen in the Internals tab of the work request view.
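Concretely, the task data a user submits for this task might look like the following (the artifact IDs are hypothetical values, and how you submit it depends on your client setup):

```yaml
source_artifact: 123   # LookupSingle resolving to a debian:source-package artifact (hypothetical ID)
environment: 456       # environment artifact with a Debian image (hypothetical ID)
```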

Add the new Reprotest artifact data class In order for the reprotest task to create a new Artifact of the type DebianReprotest with the log and output metadata: add the new category to ArtifactCategory in debusine/artifacts/models.py:
    REPROTEST = "debian:reprotest"
In the same file add the DebianReprotest class:
class DebianReprotest(ArtifactData):
   """Data for debian:reprotest artifacts."""

   reproducible: bool | None = None

   def get_label(self) -> str:
       """Return a short human-readable label for the artifact."""
       return "reprotest analysis"
It could also include the package name or version. In order to have the category listed in the work request output artifacts table, edit the file debusine/db/models/artifacts.py: In ARTIFACT_CATEGORY_ICON_NAMES add ArtifactCategory.REPROTEST: "folder", and in ARTIFACT_CATEGORY_SHORT_NAMES add ArtifactCategory.REPROTEST: "reprotest",.

Create the new Task class In debusine/tasks/ create a new file reprotest.py.
reprotest.py
# Copyright © The Debusine Developers
# See the AUTHORS file at the top-level directory of this distribution
#
# This file is part of Debusine. It is subject to the license terms
# in the LICENSE file found in the top-level directory of this
# distribution. No part of Debusine, including this file, may be copied,
# modified, propagated, or distributed except according to the terms
# contained in the LICENSE file.

"""Task to use reprotest in debusine."""

from pathlib import Path
from typing import Any

from debusine import utils
from debusine.artifacts.local_artifact import ReprotestArtifact
from debusine.artifacts.models import (
    ArtifactCategory,
    CollectionCategory,
    DebianSourcePackage,
    DebianUpload,
    WorkRequestResults,
    get_source_package_name,
    get_source_package_version,
)
from debusine.client.models import RelationType
from debusine.tasks import BaseTaskWithExecutor, RunCommandTask
from debusine.tasks.models import ReprotestData, ReprotestDynamicData
from debusine.tasks.server import TaskDatabaseInterface


class Reprotest(
    RunCommandTask[ReprotestData, ReprotestDynamicData],
    BaseTaskWithExecutor[ReprotestData, ReprotestDynamicData],
):
    """Task to use reprotest in debusine."""

    TASK_VERSION = 1

    CAPTURE_OUTPUT_FILENAME = "reprotest.log"

    def __init__(
        self,
        task_data: dict[str, Any],
        dynamic_task_data: dict[str, Any] | None = None,
    ) -> None:
        """Initialize object."""
        super().__init__(task_data, dynamic_task_data)

        self._reprotest_target: Path | None = None

    def build_dynamic_data(
        self, task_database: TaskDatabaseInterface
    ) -> ReprotestDynamicData:
        """Compute and return ReprotestDynamicData."""
        input_source_artifact = task_database.lookup_single_artifact(
            self.data.source_artifact
        )

        assert input_source_artifact is not None
        self.ensure_artifact_categories(
            configuration_key="input.source_artifact",
            category=input_source_artifact.category,
            expected=(
                ArtifactCategory.SOURCE_PACKAGE,
                ArtifactCategory.UPLOAD,
            ),
        )
        assert isinstance(
            input_source_artifact.data, (DebianSourcePackage, DebianUpload)
        )
        subject = get_source_package_name(input_source_artifact.data)
        version = get_source_package_version(input_source_artifact.data)

        assert self.data.environment is not None

        environment = self.get_environment(
            task_database,
            self.data.environment,
            default_category=CollectionCategory.ENVIRONMENTS,
        )

        return ReprotestDynamicData(
            source_artifact_id=input_source_artifact.id,
            subject=subject,
            parameter_summary=f"{subject}_{version}",
            environment_id=environment.id,
        )

    def get_input_artifacts_ids(self) -> list[int]:
        """Return the list of input artifact IDs used by this task."""
        if not self.dynamic_data:
            return []

        return [
            self.dynamic_data.source_artifact_id,
            self.dynamic_data.environment_id,
        ]

    def fetch_input(self, destination: Path) -> bool:
        """Download the required artifacts."""
        assert self.dynamic_data

        artifact_id = self.dynamic_data.source_artifact_id
        assert artifact_id is not None
        self.fetch_artifact(artifact_id, destination)

        return True

    def configure_for_execution(self, download_directory: Path) -> bool:
        """
        Find a .dsc in download_directory.

        Install reprotest and other utilities used in _cmdline.
        Set self._reprotest_target to it.

        :param download_directory: where to search the files
        :return: True if valid files were found
        """
        self._prepare_executor_instance()

        if self.executor_instance is None:
            raise AssertionError("self.executor_instance cannot be None")

        self.run_executor_command(
            ["apt-get", "update"],
            log_filename="install.log",
            run_as_root=True,
            check=True,
        )
        self.run_executor_command(
            [
                "apt-get",
                "--yes",
                "--no-install-recommends",
                "install",
                "reprotest",
                "dpkg-dev",
                "devscripts",
                "equivs",
                "sudo",
            ],
            log_filename="install.log",
            run_as_root=True,
        )

        self._reprotest_target = utils.find_file_suffixes(
            download_directory, [".dsc"]
        )
        return True

    def _cmdline(self) -> list[str]:
        """
        Build the reprotest command line.

        Use configuration of self.data and self._reprotest_target.
        """
        target = self._reprotest_target
        assert target is not None

        cmd = [
            "bash",
            "-c",
            f"TMPDIR=/tmp ; cd /tmp ; dpkg-source -x {target} package/; "
            "cd package/ ; mk-build-deps ; apt-get install --yes ./*.deb ; "
            "rm *.deb ; "
            "reprotest --vary=-time,-user_group,-fileordering,-domain_host .",
        ]

        return cmd

    @staticmethod
    def _cmdline_as_root() -> bool:
        r"""apt-get install --yes ./\*.deb must be run as root."""
        return True

    def task_result(
        self,
        returncode: int | None,
        execute_directory: Path,  # noqa: U100
    ) -> WorkRequestResults:
        """
        Evaluate task output and return success.

        For a successful run of reprotest:
        -must have the output file
        -exit code is 0

        :return: WorkRequestResults.SUCCESS or WorkRequestResults.FAILURE.
        """
        reprotest_file = execute_directory / self.CAPTURE_OUTPUT_FILENAME

        if reprotest_file.exists() and returncode == 0:
            return WorkRequestResults.SUCCESS

        return WorkRequestResults.FAILURE

    def upload_artifacts(
        self, exec_directory: Path, *, execution_result: WorkRequestResults
    ) -> None:
        """Upload the ReprotestArtifact with the files and relationships."""
        if not self.debusine:
            raise AssertionError("self.debusine not set")

        assert self.dynamic_data is not None
        assert self.dynamic_data.parameter_summary is not None

        reprotest_artifact = ReprotestArtifact.create(
            reprotest_output=exec_directory / self.CAPTURE_OUTPUT_FILENAME,
            reproducible=execution_result == WorkRequestResults.SUCCESS,
            package=self.dynamic_data.parameter_summary,
        )

        uploaded = self.debusine.upload_artifact(
            reprotest_artifact,
            workspace=self.workspace_name,
            work_request=self.work_request_id,
        )

        assert self.dynamic_data is not None
        assert self.dynamic_data.source_artifact_id is not None
        self.debusine.relation_create(
            uploaded.id,
            self.dynamic_data.source_artifact_id,
            RelationType.RELATES_TO,
        )
Below are the main methods with some basic explanation. In order for Debusine to discover the task, add "Reprotest" to the __all__ list in the file debusine/tasks/__init__.py. Let's explain the different methods of the Reprotest class:

build_dynamic_data method The worker has no access to Debusine's database. Lookups are all resolved before the task gets dispatched to a worker, so all the worker has to do is download the specified input artifacts. The build_dynamic_data method looks up the artifact, asserts that it has a valid category, extracts the package name and version, and gets the environment in which the task will be executed. The environment is needed to run the task (reprotest will run in a container using e.g. unshare or incus).
    def build_dynamic_data(
        self, task_database: TaskDatabaseInterface
    ) -> ReprotestDynamicData:
        """Compute and return ReprotestDynamicData."""
        input_source_artifact = task_database.lookup_single_artifact(
            self.data.source_artifact
        )

        assert input_source_artifact is not None
        self.ensure_artifact_categories(
            configuration_key="input.source_artifact",
            category=input_source_artifact.category,
            expected=(
                ArtifactCategory.SOURCE_PACKAGE,
                ArtifactCategory.UPLOAD,
            ),
        )
        assert isinstance(
            input_source_artifact.data, (DebianSourcePackage, DebianUpload)
        )
        subject = get_source_package_name(input_source_artifact.data)
        version = get_source_package_version(input_source_artifact.data)

        assert self.data.environment is not None

        environment = self.get_environment(
            task_database,
            self.data.environment,
            default_category=CollectionCategory.ENVIRONMENTS,
        )

        return ReprotestDynamicData(
            source_artifact_id=input_source_artifact.id,
            subject=subject,
            parameter_summary=f"{subject}_{version}",
            environment_id=environment.id,
        )

get_input_artifacts_ids method Used to list the task's input artifacts in the web UI.
   def get_input_artifacts_ids(self) -> list[int]:
       """Return the list of input artifact IDs used by this task."""
       if not self.dynamic_data:
           return []

       assert self.dynamic_data.source_artifact_id is not None
       return [self.dynamic_data.source_artifact_id]

fetch_input method Download the required artifacts on the worker.
    def fetch_input(self, destination: Path) -> bool:
        """Download the required artifacts."""
        assert self.dynamic_data

        artifact_id = self.dynamic_data.source_artifact_id
        assert artifact_id is not None
        self.fetch_artifact(artifact_id, destination)

        return True

configure_for_execution method Install the packages needed by the task and set _reprotest_target, which is used to build the task s command line.
   def configure_for_execution(self, download_directory: Path) -> bool:
       """
       Find a .dsc in download_directory.

       Install reprotest and other utilities used in _cmdline.
       Set self._reprotest_target to it.

       :param download_directory: where to search the files
       :return: True if valid files were found
       """
       self._prepare_executor_instance()

       if self.executor_instance is None:
           raise AssertionError("self.executor_instance cannot be None")

       self.run_executor_command(
           ["apt-get", "update"],
           log_filename="install.log",
           run_as_root=True,
           check=True,
       )
       self.run_executor_command(
           [
               "apt-get",
               "--yes",
               "--no-install-recommends",
               "install",
               "reprotest",
               "dpkg-dev",
               "devscripts",
               "equivs",
               "sudo",
           ],
           log_filename="install.log",
           run_as_root=True,
       )

       self._reprotest_target = utils.find_file_suffixes(
           download_directory, [".dsc"]
       )
       return True

_cmdline method Return the command line to run the task. In this case, and to keep the example simple, we will run reprotest directly in the worker's executor VM/container, without giving it an isolated virtual server. So this command installs the build dependencies required by the package (so reprotest can build it) and runs reprotest itself.
   def _cmdline(self) -> list[str]:
       """
       Build the reprotest command line.

       Use configuration of self.data and self._reprotest_target.
       """
       target = self._reprotest_target
       assert target is not None

       cmd = [
           "bash",
           "-c",
           f"TMPDIR=/tmp ; cd /tmp ; dpkg-source -x {target} package/; "
           "cd package/ ; mk-build-deps ; apt-get install --yes ./*.deb ; "
           "rm *.deb ; "
           "reprotest --vary=-time,-user_group,-fileordering,-domain_host .",
       ]

       return cmd
Some reprotest variations are disabled to keep the example simple, both in the set of packages to install and in the reprotest features used.

_cmdline_as_root method Since the execution needs to install packages, the command is run as root (in the container):
   @staticmethod
   def _cmdline_as_root() -> bool:
       r"""apt-get install --yes ./\*.deb must be run as root."""
       return True

task_result method The task succeeded if a log was generated and the return code is 0.
    def task_result(
        self,
        returncode: int | None,
        execute_directory: Path,  # noqa: U100
    ) -> WorkRequestResults:
        """
        Evaluate task output and return success.

        For a successful run of reprotest:
        - must have the output file
        - exit code is 0

        :return: WorkRequestResults.SUCCESS or WorkRequestResults.FAILURE.
        """
        reprotest_file = execute_directory / self.CAPTURE_OUTPUT_FILENAME

        if reprotest_file.exists() and returncode == 0:
            return WorkRequestResults.SUCCESS

        return WorkRequestResults.FAILURE

upload_artifacts method Create the ReprotestArtifact with the log and the reproducible boolean, upload it, and then add a relation between the ReprotestArtifact and the source package:
    def upload_artifacts(
        self, exec_directory: Path, *, execution_result: WorkRequestResults
    ) -> None:
        """Upload the ReprotestArtifact with the files and relationships."""
        if not self.debusine:
            raise AssertionError("self.debusine not set")

        assert self.dynamic_data is not None
        assert self.dynamic_data.parameter_summary is not None

        reprotest_artifact = ReprotestArtifact.create(
            reprotest_output=exec_directory / self.CAPTURE_OUTPUT_FILENAME,
            reproducible=execution_result == WorkRequestResults.SUCCESS,
            package=self.dynamic_data.parameter_summary,
        )

        uploaded = self.debusine.upload_artifact(
            reprotest_artifact,
            workspace=self.workspace_name,
            work_request=self.work_request_id,
        )

        assert self.dynamic_data is not None
        assert self.dynamic_data.source_artifact_id is not None
        self.debusine.relation_create(
            uploaded.id,
            self.dynamic_data.source_artifact_id,
            RelationType.RELATES_TO,
        )

Execution example To run this task in a local Debusine (see steps to have it ready with an environment, permissions and users created) you can do:
$ python3 -m debusine.client artifact import-debian -w System http://deb.debian.org/debian/pool/main/h/hello/hello_2.10-5.dsc
(get the artifact ID from the output of that command) The artifact can be seen at http://$DEBUSINE/debusine/System/artifact/$ARTIFACT_ID/. Then create a reprotest.yaml:
$ cat <<EOF > reprotest.yaml
source_artifact: $ARTIFACT_ID
environment: "debian/match:codename=bookworm"
EOF
Instead of debian/match:codename=bookworm, an environment artifact ID could also be used. Finally, create the work request to run the task:
$ python3 -m debusine.client create-work-request -w System reprotest --data reprotest.yaml
Using the Debusine web UI you can see the work request, which should go to Running status, then Completed with Success or Failure (depending on whether reprotest could reproduce the build). The Output tab shows an artifact of type debian:reprotest with one file: the log. The Metadata tab of the artifact shows its Data: the package name and reproducible (true or false).

What is left to do? This was a simple example of creating a task. Other things that could be done:
  • unit tests
  • documentation
  • configurable variations
  • running reprotest directly on the worker host, using the executor environment as a reprotest virtual server
  • in this specific example, the command line might be doing too many things; some of them could perhaps be handled by other parts of the task, such as prepare_environment.
  • integrate it in a workflow so it's easier to use (e.g. part of QaWorkflow)
  • extract more from the log than just pass/fail
  • display the output in a more useful way (implement an artifact specialized view)

8 February 2026

Colin Watson: Free software activity in January 2026

About 80% of my Debian contributions this month were sponsored by Freexian, as well as one direct donation via GitHub Sponsors (thanks!). If you appreciate this sort of work and are at a company that uses Debian, have a look to see whether you can pay for any of Freexian's services; as well as the direct benefits, that revenue stream helps to keep Debian development sustainable for me and several other lovely people. You can also support my work directly via Liberapay or GitHub Sponsors. Python packaging: new upstream versions; fixes for Python 3.14; fixes for pytest 9; porting away from the deprecated pkg_resources; other build/test failures (I investigated several more build failures and suggested removing the packages in question); other bugs. Other bits and pieces: Alejandro Colomar reported that man(1) ignored the MANWIDTH environment variable in some circumstances. I investigated this and fixed it upstream. I contributed an ubuntu-dev-tools patch to stop recommending sudo. I added forky support to the images used in Salsa CI pipelines. I began working on getting a release candidate of groff 1.24.0 into experimental, though I haven't finished that yet. I worked on some lower-priority security updates for OpenSSH. Code reviews.

Dirk Eddelbuettel: chronometre: A new package (pair) demo for R and Python

Both R and Python make it reasonably easy to work with compiled extensions. But how to access objects in one environment from the other and share state or (non-trivial) objects remains trickier. Recently (and while r-forge was resting, so we opened GitHub Discussions) a question was asked concerning R and Python object pointer exchange. This led to a pretty decent discussion including arrow interchange demos (pretty ideal if dealing with data.frame-alike objects), but once the focus is on more library-specific objects from a given (C or C++, say) library it is less clear what to do, or how involved it may get. R has external pointers, and these make it feasible to instantiate the same object in Python. To demonstrate, I created a pair of (minimal) packages wrapping a lovely (small) class from the excellent spdlog library by Gabi Melman, more specifically in an adapted-for-R version (to avoid some R CMD check nags) in my RcppSpdlog package. It is essentially a nicer/fancier C++ version of the tic() and toc() timing scheme. When an object is instantiated, it starts the clock, and when we access it later it prints the time elapsed at microsecond resolution. In modern C++ this takes little more than keeping an internal chrono object. Which makes for a nice, small, yet specific object to pass to Python. So the R side of the package pair instantiates such an object and accesses its address. For different reasons, sending a raw pointer across does not work so well, but a string with the printed address works fabulously (and is a paradigm used by other packages, so we did not invent this). Over on the Python side of the package pair, we then take this string representation and pass it to a little bit of pybind11 code to instantiate a new object. This can of course also expose functionality such as the show-time-elapsed feature, either formatted or just numerically, of interest here. And that is all there is to it!
Now this can be done from R as well thanks to reticulate as the demo() (also shown on the package README.md) shows:
> library(chronometre)
> demo("chronometre", ask=FALSE)


        demo(chronometre)
        ---- ~~~~~~~~~~~

> #!/usr/bin/env r
> 
> stopifnot("Demo requires 'reticulate'" = requireNamespace("reticulate", quietly=TRUE))

> stopifnot("Demo requires 'RcppSpdlog'" = requireNamespace("RcppSpdlog", quietly=TRUE))

> stopifnot("Demo requires 'xptr'" = requireNamespace("xptr", quietly=TRUE))

> library(reticulate)

> ## reticulate and Python in general these days really want a venv so we will use one,
> ## the default value is a location used locally; if needed create one
> ## check for existing virtualenv to use, or else set one up
> venvdir <- Sys.getenv("CHRONOMETRE_VENV", "/opt/venv/chronometre")

> if (dir.exists(venvdir)) {
+     use_virtualenv(venvdir, required = TRUE)
+ } else {
+     ## create a virtual environment, but make it temporary
+     Sys.setenv(RETICULATE_VIRTUALENV_ROOT=tempdir())
+     virtualenv_create("r-reticulate-env")
+     virtualenv_install("r-reticulate-env", packages = c("chronometre"))
+     use_virtualenv("r-reticulate-env", required = TRUE)
+ }


> sw <- RcppSpdlog::get_stopwatch()                   # we use a C++ struct as example

> Sys.sleep(0.5)                                      # imagine doing some code here

> print(sw)                                           # stopwatch shows elapsed time
0.501220 

> xptr::is_xptr(sw)                                   # this is an external pointer in R
[1] TRUE

> xptr::xptr_address(sw)                              # get address, format is "0x...."
[1] "0x58adb5918510"

> sw2 <- xptr::new_xptr(xptr::xptr_address(sw))       # cloned (!!) but unclassed

> attr(sw2, "class") <- c("stopwatch", "externalptr") # class it .. and then use it!

> print(sw2)                                          # 'xptr' allows us to clone and use
0.501597 

> sw3 <- ch$Stopwatch(  xptr::xptr_address(sw) )      # new Python object via string ctor

> print(sw3$elapsed())                                # shows output via Python I/O
datetime.timedelta(microseconds=502013)

> cat(sw3$count(), "\n")                              # shows double
0.502657 

> print(sw)                                           # object still works in R
0.502721 
> 
The same object, instantiated in R, is used in Python and thereafter again in R. While this object here is minimal in features, the concept of passing a pointer is universal. We could use it for any interesting object that R can access and Python too can instantiate. Obviously, there be dragons as we pass pointers, so one may want to ascertain that headers from corresponding compatible versions are used, etc., but the principle is unaffected and should just work. Both parts of this pair of packages are now at the corresponding repositories: PyPI and CRAN. As I commonly do here for package (change) announcements, I include the (minimal so far) set of high-level changes for the R package.
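The pass-the-address-as-a-string pattern can be illustrated outside R as well; here is a purely illustrative Python sketch using ctypes (nothing here is part of chronometre):

```python
import ctypes

# allocate a C double and take its address as a hex string, much like
# xptr::xptr_address() prints the address of an R external pointer
buf = ctypes.c_double(0.501220)
addr_str = hex(ctypes.addressof(buf))

# "the other side": parse the string back into an integer address and
# reinterpret the memory at that address as the same C type
addr = int(addr_str, 16)
same = ctypes.c_double.from_address(addr)
assert same.value == buf.value  # two handles, one underlying object
```

As in the R/Python case, the original object must stay alive and both sides must agree exactly on the type; that is the "dragons" caveat.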

Changes in version 0.0.2 (2026-02-05)
  • Removed the (now replaced) unconditional virtualenv use in the demo, given the preceding conditional block
  • Updated README.md with badges and an updated demo

Changes in version 0.0.1 (2026-01-25)
  • Initial version and CRAN upload

Questions, suggestions, bug reports, are welcome at either the (now awoken from the R-Forge slumber) Rcpp mailing list or the newer Rcpp Discussions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Vincent Bernat: Fragments of an adolescent web

I have unearthed a few old articles typed during my adolescence, between 1996 and 1998. Unremarkable at the time, these pages now compose, three decades later, the chronicle of a vanished era.1 The word blog does not exist yet. Wikipedia remains to come. Google has not been born. AltaVista reigns over searches, while already struggling to embrace the nascent immensity of the web.2 To meet someone, you had to agree in advance and prepare your route on paper maps. The web is taking off. The CSS specification has just emerged; HTML tables still serve for page layout. Cookies and advertising banners are making their appearance. Pages are adorned with music and videos, forcing browsers to arm themselves with plugins. Netscape Navigator sits on 86% of the territory, but Windows 95 now bundles Internet Explorer to quickly catch up. Facing this offensive, Netscape open-sources its browser. France falls behind. Outside universities, Internet access remains expensive and laborious. Minitel still reigns, offering the phone directory, train tickets, remote shopping. This was not yet possible with the Internet: buying a CD online was a pipe dream. Encryption suffers from inappropriate regulation: the DES algorithm is capped at 40 bits and cracked in a few seconds. These pages bear the trace of the web's adolescence. Thirty years have passed. The same battles continue: data selling, advertising, monopolies.

  1. Most articles linked here are not translated from French to English.
  2. I recently noticed that Google no longer fully indexes my blog. For example, it is no longer possible to find the article on lan o. I assume this is a consequence of the explosion of AI-generated content or a change in priorities for Google.

Louis-Philippe Véronneau: Montreal Subway Foot Traffic Data, 2025 edition

Another year of data from the Société de Transport de Montréal, Montreal's transit agency! A few highlights this year:
  1. Although the Saint-Michel station closed for emergency repairs in November 2024, traffic never bounced back to its pre-closure levels and is still stuck somewhere around 2022 Q2 levels. I wonder if this could be caused by the roadwork on Jean-Talon for the new Blue Line stations making it harder for folks in Montreal-Nord to reach the station by bus.
  2. The effects of the opening of the Royalmount shopping center has had a durable impact on the traffic at the De la Savane station. I reported on this last year, but it seems this wasn't just a fad.
  3. With the completion of the Deux-Montagnes branch of the Réseau express métropolitain (REM, a light-rail, above-the-surface transit network still under construction), the transfer stations to the Montreal subway have seen major traffic increases. The Édouard-Montpetit station has nearly reached its previous all-time record of 2015 and the McGill station has recovered from the general slump all the other stations have had in 2025.
  4. The Assomption station, which used to have one of the lowest number of riders of the subway network, has had a tremendous growth in the past few years. This is mostly explained by the many high-rise projects that were built around the station since the end of the COVID-19 pandemic.
  5. Although still affected by a very high seasonality, the Jean-Drapeau station broke its previous record of 2019, a testament to the continued attraction power of the various summer festivals taking place on the Sainte-Hélène and Notre-Dame islands.
More generally, it seems the Montreal subway has had a pretty bad year. Traffic had been slowly climbing back since the COVID-19 pandemic, but this is the first year since 2020 such a sharp decline can be witnessed. Even major stations like Jean-Talon or Lionel-Groulx are on a downward trend and it is pretty worrisome. As for causes, a few things come to mind. First of all, as the number of Montrealers commuting to work by bike continues to rise1, a modal shift from public transit to active mobility is to be expected. As local experts put it, this is not uncommon and has been seen in other cities before. Another important factor that certainly turned people away from the subway this year has been the impacts of the continued housing crisis in Montreal. As more and more people get kicked out of their apartments, many have been seeking refuge in the subway stations to find shelter. Sadly, this also brought an unprecedented wave of incivilities. As riders' sense of security sharply decreased, the STM eventually resorted to banning unhoused people from sheltering in the subway. This decision did bring back some peace to the network, but one can posit the damage had already been done and many casual riders are still avoiding the subway for this reason. Finally, the weeks-long STM workers' strike in Q4 had an important impact on general traffic, as it severely reduced the opening hours of the subway. As for the previous item, once people find alternative ways to get around, it's always harder to bring them back. Hopefully, my 2026 report will be a more cheerful one... By clicking on a subway station, you'll be redirected to a graph of the station's foot traffic. Licences

  1. Mostly thanks to major improvements to the cycling network and the BIXI bikesharing program.

6 February 2026

Reproducible Builds: Reproducible Builds in January 2026

Welcome to the first monthly report in 2026 from the Reproducible Builds project! These reports outline what we've been up to over the past month, highlighting items of news from elsewhere in the increasingly-important area of software supply-chain security. As ever, if you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website.

  1. Flathub now testing for reproducibility
  2. Reproducibility identifying projects that will fail to build in 2038
  3. Distribution work
  4. Tool development
  5. Two new academic papers
  6. Upstream patches

Flathub now testing for reproducibility Flathub, the primary repository/app store for Flatpak-based applications, has begun checking for build reproducibility. According to a recent blog post:
We have started testing binary reproducibility of x86_64 builds targeting the stable repository. This is possible thanks to flathub-repro-checker, a tool doing the necessary legwork to recreate the build environment and compare the result of the rebuild with what is published on Flathub. While these tests have been running for a while now, we have recently restarted them from scratch after enabling S3 storage for diffoscope artifacts.
The test results and status are available on their reproducible builds page.

Reproducibility identifying software projects that will fail to build in 2038 Longtime Reproducible Builds developer Bernhard M. Wiedemann posted on Reddit on "Y2K38 commemoration day T-12", that is to say, twelve years to the day before the UNIX Epoch will no longer fit into a signed 32-bit integer variable on 19th January 2038. Bernhard's comment succinctly outlines the problem, notes some of the potential remedies, and links to a discussion with the GCC developers regarding adding warnings for int/time_t conversions. At the time of publication, Bernhard's topic had generated 50 comments in response.
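The date itself is easy to verify from the definition of a signed 32-bit time_t; a quick Python check:

```python
from datetime import datetime, timezone

# the largest value a signed 32-bit time_t can hold
INT32_MAX = 2**31 - 1

# one second after this moment the counter overflows
print(datetime.fromtimestamp(INT32_MAX, tz=timezone.utc))
# 2038-01-19 03:14:07+00:00
```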

Distribution work Conda is a language-agnostic package manager which was originally developed to help Python data scientists and is now a popular package manager for Python and R. conda-forge, a community-led packaging infrastructure for Conda, recently revamped the dashboards that rebuild packages in order to track reproducibility. There have been changes over the past two years to make the conda-forge build tooling fully reproducible by embedding the lockfile of the entire build environment inside the packages.
In Debian this month:
In NixOS this month, it was announced that the GNU Guix Full Source Bootstrap was ported to NixOS as part of Wire Jansen's bachelor's thesis (PDF). At the time of publication, this change has landed in the NixOS development distribution.
Lastly, Bernhard M. Wiedemann posted another openSUSE monthly update for his work there.

Tool development diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 310 and 311 to Debian.
  • Fix test compatibility with u-boot-tools version 2026-01. [ ]
  • Drop the implied Rules-Requires-Root: no entry in debian/control. [ ]
  • Bump Standards-Version to 4.7.3. [ ]
  • Reference the Debian ocaml package instead of ocaml-nox. (#1125094)
  • Apply a patch by Jelle van der Waa to adjust a test fixture to match new lines. [ ]
  • Also drop the implied Priority: optional from debian/control. [ ]

In addition, Holger Levsen uploaded two versions of disorderfs, first updating the package from FUSE 2 to FUSE 3 as described in last month's report, as well as updating the packaging to the latest Debian standards. A second upload (0.6.2-1) was subsequently made, with Holger adding instructions on how to add the upstream release to our release archive and incorporating changes by Roland Clobus to set _FILE_OFFSET_BITS on 32-bit platforms, fixing a build failure on 32-bit systems. Vagrant Cascadian updated diffoscope in GNU Guix to version 311-2-ge4ec97f7 and disorderfs to 0.6.2.

Two new academic papers Julien Malka, Stefano Zacchiroli and Théo Zimmermann of Télécom Paris' in-house research laboratory, the Information Processing and Communications Laboratory (LTCI), published a paper this month titled Docker Does Not Guarantee Reproducibility:
[ ] While Docker is frequently cited in the literature as a tool that enables reproducibility in theory, the extent of its guarantees and limitations in practice remains under-explored. In this work, we address this gap through two complementary approaches. First, we conduct a systematic literature review to examine how Docker is framed in scientific discourse on reproducibility and to identify documented best practices for writing Dockerfiles enabling reproducible image building. Then, we perform a large-scale empirical study of 5,298 Docker builds collected from GitHub workflows. By rebuilding these images and comparing the results with their historical counterparts, we assess the real reproducibility of Docker images and evaluate the effectiveness of the best practices identified in the literature.
A PDF of their paper is available online.
Quentin Guilloteau, Antoine Waehren and Florina M. Ciorba of the University of Basel in Switzerland also published a Docker-related paper, theirs called Longitudinal Study of the Software Environments Produced by Dockerfiles from Research Artifacts:
The reproducibility crisis has affected all scientific disciplines, including computer science (CS). To address this issue, the CS community has established artifact evaluation processes at conferences and in journals to evaluate the reproducibility of the results shared in publications. Authors are therefore required to share their artifacts with reviewers, including code, data, and the software environment necessary to reproduce the results. One method for sharing the software environment proposed by conferences and journals is to utilize container technologies such as Docker and Apptainer. However, these tools rely on non-reproducible tools, resulting in non-reproducible containers. In this paper, we present a tool and methodology to evaluate variations over time in software environments of container images derived from research artifacts. We also present initial results on a small set of Dockerfiles from the Euro-Par 2024 conference.
A PDF of their paper is available online.

Miscellaneous news On our mailing list this month: Lastly, kpcyrd added a Rust section to the Stable order for outputs page on our website. [ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Birger Schacht: Status update, January 2026

January was a slow month; I only did three uploads to Debian unstable. I was very happy to see the new dfsg-new-queue and that there are more hands now processing the NEW queue. I also finally got one of the packages accepted that I uploaded after the Trixie release: wayback, which I uploaded last August. There has been another release since then; I'll try to upload that in the next few days. There was a bug report for carl asking for Windows support. carl used the xdg crate for looking up the XDG directories, but xdg does not support Windows systems (and it seems this will not change). The reporter also provided a PR to replace the dependency with the directories crate, which is more system-agnostic. I adapted the PR a bit, merged it, and released version 0.6.0 of carl. At my dayjob I refactored django-grouper. django-grouper is a package we use to find duplicate objects in our data. Our users often work with datasets of thousands of historical persons, places and institutions, and in projects that run over years and ingest data from multiple sources, it happens that entries are created several times. I wrote the initial app in 2024, but was never really happy about the approach I used back then. It was based on this blog post that describes how to group spreadsheet text cells. It used sklearn's TfidfVectorizer with a custom analyzer and the library sparse_dot_topn for creating the matrix. All in all, the module to calculate the clusters was 80 lines, and with sparse_dot_topn it pulled in a rather niche Python library. I was pretty sure that this functionality could also be implemented with basic sklearn functionality, and it was: we are now using DictVectorizer, because in a Django app we are working with objects that can be mapped to dicts anyway. And for clustering the data, the app now uses the DBSCAN algorithm (with the Manhattan distance as metric). The module is now only half the size and the whole app lost one dependency!
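The pipeline just described can be sketched as follows (a minimal illustration with toy data; the eps and min_samples values are ours, not django-grouper's actual settings):

```python
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction import DictVectorizer

# Django objects can be mapped to dicts, e.g. via queryset.values()
records = [
    {"name": "Wien", "type": "place"},
    {"name": "Wien", "type": "place"},  # a duplicate entry
    {"name": "Graz", "type": "place"},
]

# one-hot encode the dicts, then cluster with DBSCAN using the
# Manhattan distance as the metric
X = DictVectorizer(sparse=False).fit_transform(records)
labels = DBSCAN(eps=0.5, min_samples=1, metric="manhattan").fit_predict(X)
print(labels)  # duplicate records end up with the same cluster label
```

With min_samples=1 every record belongs to some cluster, so exact duplicates always group together while distinct records land in their own clusters.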
I released those changes as version 0.3.0 of the app. At the end of January, together with friends, I went to Brussels to attend FOSDEM. We took the night train, but there were a couple of broken-down trains so the ride took 26 hours instead of one night. It is a good thing we had a one-day buffer and FOSDEM only started on Saturday. As usual there were too many talks to visit, so I'll have to watch some of the recordings in the next few weeks. Some examples of talks I found interesting so far:

5 February 2026

Dirk Eddelbuettel: rfoaas 2.3.3: Limited Rebirth

rfoaas greed example The original FOAAS site provided a rather wide variety of REST access points, but it sadly is no more (while the old repo is still there). A newer replacement site FOASS is up and running, but with a somewhat reduced offering. (For example, the two accessors shown in the screenshot are no more. C'est la vie.) Recognising that perfect may once again be the enemy of (somewhat) good (enough), we have rejigged the rfoaas package in a new release 2.3.3. (The preceding version number 2.3.2 corresponded to the upstream version, indicating which API release we matched. Now we just went +0.0.1, but there is no longer a correspondence to the service version at FOASS.) Accessor functions for each of the now-available access points are provided, and the random sampling accessor getRandomFO() now picks from that set. My CRANberries service provides a comparison to the previous release. Questions, comments etc. should go to the GitHub issue tracker. More background information is on the project page as well as on the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

4 February 2026

Dirk Eddelbuettel: littler 0.3.23 on CRAN: More Features (and Fixes)

max-heap image The twenty-third release of littler as a CRAN package landed on CRAN just now, following the now twenty-year history (!!) of a package (initially not on CRAN) started by Jeff in 2006 and joined by me a few weeks later. littler is the first command-line interface for R, as it predates Rscript. It allows for piping as well as for shebang scripting via #!, uses command-line arguments more consistently, and still starts faster. It also always loaded the methods package, which Rscript only began to do in later years. littler lives on Linux and Unix, has its difficulties on macOS due to some braindeadedness there (whoever thought case-insensitive filesystems as a default were a good idea?) and simply does not exist on Windows (yet the build system could be extended; see RInside for an existence proof, and volunteers are welcome!). See the FAQ vignette on how to add it to your PATH. A few examples are highlighted at the GitHub repo, as well as in the examples vignette. This release, the first in about eleven months, once again brings two new helper scripts, and enhances six existing ones. The release was triggered because it finally became clear why installGitHub.r ignored r2u when available: we forced the type argument to source (so thanks to Iñaki for spotting this). One change was once again contributed by Michael, which is again greatly appreciated. The full change description follows.

Changes in littler version 0.3.23 (2026-02-03)
  • Changes in examples scripts
    • A new script busybees.r aggregates deadlined packages by maintainer
    • Several small updates have been made to the (mostly internal) 'r2u.r' script
    • The deadliners.r script has refined treatment for screen width
    • The install2.r script has new options --quiet and --verbose as proposed by Zivan Karaman
    • The rcc.r script passes build-args to 'rcmdcheck' to compact vignettes and save data
    • The installRub.r script now defaults to 'noble' and is more tolerant of inputs
    • The installRub.r script deals correctly with empty utils::osVersion thanks to Michael Chirico
    • New script checkPackageUrls.r inspired by how CRAN checks (with thanks to Kurt Hornik for the hint)
    • The installGithub.r script now adjusts to bspm and takes advantage of r2u binaries for its build dependencies
  • Changes in package
    • Environment variables (read at build time) can use double quotes
    • Continuous integration scripts received a minor update

My CRANberries service provides a comparison to the previous release. Full details for the littler release are provided as usual at the ChangeLog page, and also on the package docs website. The code is available via the GitHub repo, from tarballs and now of course also from its CRAN page and via install.packages("littler"). Binary packages are available directly in Debian as well as (in a day or two) Ubuntu binaries at CRAN thanks to the tireless Michael Rutter. Comments and suggestions are welcome at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

2 February 2026

Isoken Ibizugbe: How Open Source Contributions Define Your Career Path

Hi there, I'm more than halfway through (8 weeks) my Outreachy internship with Debian, working on the openQA project to test Live Images.

My journey into tech began as a software engineering trainee, during which I built a foundation in Bash scripting, C programming, and Python. Later, I worked for a startup as a Product Manager. As is common in startups, I wore many hats, but I found myself drawn most to the Quality Assurance team. Testing user flows and edge-case scenarios sparked my curiosity, and that curiosity is exactly what led me to the Debian Live Image testing project.

From Manual to Automated

In my previous roles, I was accustomed to manual testing, simulating user actions one by one. While effective, I quickly realized it could be a bottleneck in fast-paced environments. This internship has been a masterclass in removing that bottleneck. I've learned that automating repetitive actions makes life (and engineering) much easier. Life's too short for manual testing.

One of my proudest technical wins so far has been creating synergy across desktop environments. I proposed a solution to group common applications so we could use a single Perl script to handle tests for multiple desktop environments. With my mentor's guidance, we implemented this using symbolic links, which significantly reduced code redundancy.

Expanding My Technical Toolkit

Over the last 8 weeks, my technical toolkit has expanded significantly:

  • Perl & openQA: I've learnt to write Perl for automation within the openQA framework, and I've successfully automated apps_startstop tests for Cinnamon and LXQt
  • Technical Documentation: I authored a contributor guide. This required paying close attention to detail, ensuring that new contributors can have faster reviews and merged contributions
  • Ansible: I am learning that testing doesn t happen in a vacuum. To ensure new tests are truly automated, they must be integrated into the system s configuration.

Working on this project has shaped my perspective on where I fit in the tech ecosystem. In Open Source, my resume isn't just a PDF; it's a public trail of merged code, technical proposals, and collaborative discussions.

As my mentor recently pointed out, this is my proof-of-work. It's a transparent record of what I am capable of and where my interests lie.

Finally, I've grown as a team player. Working with a global team across different time zones has taught me the importance of asynchronous communication and respecting everyone's time. Whether I am proposing new logic or documenting a process, I am learning that open communication is just as vital as clean code.

1 February 2026

Benjamin Mako Hill: What do people do when they edit Wikipedia through Tor?

Note: I have not published blog posts about my academic papers over the past few years. To ensure that my blog contains a more comprehensive record of my published papers and to surface these for folks who missed them, I will be periodically (re)publishing blog posts about some older published projects. This post is closely based on a previously published post by Kaylea Champion on the Community Data Science Blog.

Many individuals use Tor, a popular approach to protecting privacy online, to reduce their visibility to widespread internet surveillance. Tor protects users from being identified by their IP address, which can be tied to a physical location. However, if you'd like to contribute to Wikipedia using Tor, you'll run into a problem: although most IP addresses can edit without an account, Tor users are blocked from editing.
Tor users attempting to contribute to Wikipedia are shown a screen that informs them that they are not allowed to edit Wikipedia.
Other research by my team has shown that Wikipedia's attempt to block Tor is imperfect and that some people have been able to edit despite the ban. As part of this work, we built a dataset of more than 11,000 contributions to Wikipedia via Tor and used quantitative analysis to show that contributions from Tor were of about the same quality as contributions from other new editors and other contributors without accounts. Of course, given the unusual circumstances Tor-based contributors face, we wondered whether a deeper look at the content of their edits might reveal more about their motives and the kinds of contributions they seek to make. Kaylea Champion (then a student, now faculty at UW Bothell) led a qualitative investigation to explore these questions. Given the challenges of studying anonymity seekers, we designed a novel forensic qualitative approach inspired by techniques common in computer security and criminal investigation. We applied this new technique to a sample of 500 editing sessions and categorized each session based on what the editor seemed to be intending to do. Most of the contributions we found fell into two broad categories, but beyond those we also found evidence of several other types of contributor intent.
An exploratory mapping of our themes in terms of the value a type of contribution represents to the Wikipedia community and the importance of anonymity in facilitating it. Anonymity-protecting tools play a critical role in facilitating contributions on the right side of the figure, while edits on the left are more likely to occur even when anonymity is impossible. Contributions toward the top reflect valuable forms of participation in Wikipedia, while edits at the bottom reflect damage.
In all, these themes led us to reflect on how the risks individuals face when contributing to online communities sometimes diverge from the risks the communities face in accepting their work. Expressing minoritized perspectives, maintaining community standards even when you may be targeted by the rulebreaker, highlighting injustice, or acting as a whistleblower can be very risky for an individual, and may not be possible without privacy protections. Of course, in platforms seeking to support the public good, such knowledge and accountability may be crucial.

This work was published as a paper at CSCW: Kaylea Champion, Nora McDonald, Stephanie Bankes, Joseph Zhang, Rachel Greenstadt, Andrea Forte, and Benjamin Mako Hill. 2019. A Forensic Qualitative Analysis of Contributions to Wikipedia from Anonymity Seeking Users. Proceedings of the ACM on Human-Computer Interaction. 3, CSCW, Article 53 (November 2019), 26 pages. https://doi.org/10.1145/3359155

This project was conducted by Kaylea Champion, Nora McDonald, Stephanie Bankes, Joseph Zhang, Rachel Greenstadt, Andrea Forte, and Benjamin Mako Hill. This work was supported by the National Science Foundation (awards CNS-1703736 and CNS-1703049) and included the work of two undergraduates supported through an NSF REU supplement.

Russ Allbery: Review: Paladin's Faith

Review: Paladin's Faith, by T. Kingfisher
Series: The Saint of Steel #4
Publisher: Red Wombat Studio
Copyright: 2023
ISBN: 1-61450-614-0
Format: Kindle
Pages: 515
Paladin's Faith is the fourth book in T. Kingfisher's loosely connected series of fantasy novels about the berserker former paladins of the Saint of Steel. You could read this as a standalone, but there are numerous (spoilery) references to the previous books in the series. Marguerite, who was central to the plot of the first book in the series, Paladin's Grace, is a spy with a problem. An internal power struggle in the Red Sail, the organization that she's been working for, has left her a target. She has a plan for how to break their power sufficiently that they will hopefully leave her alone, but to pull it off she's going to need help. As the story opens, she is working to acquire that help in a very Marguerite sort of way: breaking into the office of Bishop Beartongue of the Temple of the White Rat. The Red Sail, the powerful merchant organization Marguerite worked for, makes their money in the salt trade. Marguerite has learned that someone invented a cheap and reproducible way to extract salt from sea water, thus making the salt trade irrelevant. The Red Sail wants to ensure that invention never sees the light of day, and has forced the artificer into hiding. Marguerite doesn't know where they are, but she knows where she can find out: the Court of Smoke, where the artificer has a patron.
Having grown up in Anuket City, Marguerite was familiar with many clockwork creations, not to mention all the ways that they could go horribly wrong. (Ninety-nine times out of a hundred, it was an explosion. The hundredth time, it ran amok and stabbed innocent bystanders, and the artificer would be left standing there saying, "But I had to put blades on it, or how would it rake the leaves?" while the gutters filled up with blood.)
All Marguerite needs to put her plan into motion is some bodyguards so that she's not constantly distracted and anxious about being assassinated. Readers of this series will be unsurprised to learn that the bodyguards she asks Beartongue for are paladins, including a large broody male one with serious self-esteem problems. This is, like the other books in this series, a slow-burn romance with infuriating communication problems and a male protagonist who would do well to seek out a sack of hammers as a mentor. However, it has two things going for it that most books in this series do not: a long and complex plot to which the romance takes a back seat, and Marguerite, who is not particularly interested in playing along with the expected romance developments. There are also two main paladins in this story, not just one, and the other is one of the two female paladins of the Saint of Steel and rather more entertaining than Shane. I generally like court intrigue stories, which is what fills most of this book. Marguerite is an experienced operative, so the reader gets some solid competence porn, and the paladins are fish out of water but are also unexpectedly dangerous, which adds both comedy and satisfying table-turning. I thoroughly enjoyed the maneuvering and the culture clashes. Marguerite is very good at what she does, knows it, and is entirely uninterested in other people's opinions about that, which short-circuits a lot of Shane's most annoying behavior and keeps the story from devolving into mopey angst like some of the books in this series have done. The end of this book takes the plot in a different direction that adds significantly to the world-building, but also has a (thankfully short) depths of despair segment that I endured rather than enjoyed. I am not really in the mood for bleak hopelessness in my fiction at the moment, even if the reader is fairly sure it will be temporary. But apart from that, I thoroughly enjoyed this book from beginning to end. 
When we finally meet the artificer, they are an absolute delight in that way that Kingfisher is so good at. The whole story is infused with the sense of determined and competent people refusing to stop trying to fix problems. As usual, the romance was not for me and I think the book would have been better without it, but it's less central to the plot and therefore annoyed me less than any of the books in this series so far. My one major complaint is the lack of gnoles, but we get some new and intriguing world-building to make up for it, along with a setup for a fifth book that I am now extremely curious about. By this point in the series, you probably know if you like the general formula. Compared to the previous book, Paladin's Hope, I thought Paladin's Faith was much stronger and more interesting, but it's clearly of the same type. If, like me, you like the plots but not the romance, the plot here is more substantial. You will have to decide if that makes up for a romance in the typical T. Kingfisher configuration. Personally, I enjoyed this quite a bit, except for the short bleak part, and I'm back to eagerly awaiting the next book in the series. Rating: 8 out of 10

31 January 2026

Michael Prokop: apt, SHA-1 keys + 2026-02-01

You might have seen "Policy will reject signature within a year" warnings in apt(-get) update runs like this:
root@424812bd4556:/# apt update
Get:1 http://foo.example.org/debian demo InRelease [4229 B]
Hit:2 http://deb.debian.org/debian trixie InRelease
Hit:3 http://deb.debian.org/debian trixie-updates InRelease
Hit:4 http://deb.debian.org/debian-security trixie-security InRelease
Get:5 http://foo.example.org/debian demo/main amd64 Packages [1097 B]
Fetched 5326 B in 0s (43.2 kB/s)
All packages are up to date.
Warning: http://foo.example.org/debian/dists/demo/InRelease: Policy will reject signature within a year, see --audit for details
root@424812bd4556:/# apt --audit update
Hit:1 http://foo.example.org/debian demo InRelease
Hit:2 http://deb.debian.org/debian trixie InRelease
Hit:3 http://deb.debian.org/debian trixie-updates InRelease
Hit:4 http://deb.debian.org/debian-security trixie-security InRelease
All packages are up to date.    
Warning:  http://foo.example.org/debian/dists/demo/InRelease: Policy will reject signature within a year, see --audit for details
Audit:  http://foo.example.org/debian/dists/demo/InRelease: Sub-process /usr/bin/sqv returned an error code (1), error message is:
   Signing key on 54321ABCD6789ABCD0123ABCD124567ABCD89123 is not bound:
              No binding signature at time 2024-06-19T10:33:47Z
     because: Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance
     because: SHA1 is not considered secure since 2026-02-01T00:00:00Z
Audit: The sources.list(5) entry for 'http://foo.example.org/debian' should be upgraded to deb822 .sources
Audit: Missing Signed-By in the sources.list(5) entry for 'http://foo.example.org/debian'
Audit: Consider migrating all sources.list(5) entries to the deb822 .sources format
Audit: The deb822 .sources format supports both embedded as well as external OpenPGP keys
Audit: See apt-secure(8) for best practices in configuring repository signing.
Audit: Some sources can be modernized. Run 'apt modernize-sources' to do so.
If you ignored this for the last year, I would like to tell you that 2026-02-01 is not that far away (hello from the past if you're reading this because you're already affected). Let's simulate the future:
root@424812bd4556:/# apt --update -y install faketime
[...]
root@424812bd4556:/# export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/faketime/libfaketime.so.1 FAKETIME="2026-08-29 23:42:11" 
root@424812bd4556:/# date
Sat Aug 29 23:42:11 UTC 2026
root@424812bd4556:/# apt update
Get:1 http://foo.example.org/debian demo InRelease [4229 B]
Hit:2 http://deb.debian.org/debian trixie InRelease                                 
Err:1 http://foo.example.org/debian demo InRelease
  Sub-process /usr/bin/sqv returned an error code (1), error message is: Signing key on 54321ABCD6789ABCD0123ABCD124567ABCD89123 is not bound:            No binding signature at time 2024-06-19T10:33:47Z   because: Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance   because: SHA1 is not considered secure since 2026-02-01T00:00:00Z
[...]
Warning: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. OpenPGP signature verification failed: http://foo.example.org/debian demo InRelease: Sub-process /usr/bin/sqv returned an error code (1), error message is: Signing key on 54321ABCD6789ABCD0123ABCD124567ABCD89123 is not bound:            No binding signature at time 2024-06-19T10:33:47Z   because: Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance   because: SHA1 is not considered secure since 2026-02-01T00:00:00Z
[...]
root@424812bd4556:/# echo $?
100
Now, the proper solution would have been to fix the signing key underneath (via e.g. sq cert lint --fix --cert-file $PRIVATE_KEY_FILE > $PRIVATE_KEY_FILE-fixed). If you don't have access to the corresponding private key (e.g. when using an upstream repository that has been ignoring this issue), you're out of luck for a proper fix. But there's a workaround on the apt side (see the related apt commit 0989275c2f7afb7a5f7698a096664a1035118ebf):
root@424812bd4556:/# cat /usr/share/apt/default-sequoia.config
# Default APT Sequoia configuration. To overwrite, consider copying this
# to /etc/crypto-policies/back-ends/apt-sequoia.config and modify the
# desired values.
[asymmetric_algorithms]
dsa2048 = 2024-02-01
dsa3072 = 2024-02-01
dsa4096 = 2024-02-01
brainpoolp256 = 2028-02-01
brainpoolp384 = 2028-02-01
brainpoolp512 = 2028-02-01
rsa2048  = 2030-02-01
[hash_algorithms]
sha1.second_preimage_resistance = 2026-02-01    # Extend the expiry for legacy repositories
sha224 = 2026-02-01
[packets]
signature.v3 = 2026-02-01   # Extend the expiry
Adjust this according to your needs:
root@424812bd4556:/# mkdir -p /etc/crypto-policies/back-ends/
root@424812bd4556:/# cp /usr/share/apt/default-sequoia.config /etc/crypto-policies/back-ends/apt-sequoia.config
root@424812bd4556:/# $EDITOR /etc/crypto-policies/back-ends/apt-sequoia.config
root@424812bd4556:/# cat /etc/crypto-policies/back-ends/apt-sequoia.config
# APT Sequoia override configuration
[asymmetric_algorithms]
dsa2048 = 2024-02-01
dsa3072 = 2024-02-01
dsa4096 = 2024-02-01
brainpoolp256 = 2028-02-01
brainpoolp384 = 2028-02-01
brainpoolp512 = 2028-02-01
rsa2048  = 2030-02-01
[hash_algorithms]
sha1.second_preimage_resistance = 2026-09-01    # Extend the expiry for legacy repositories
sha224 = 2026-09-01
[packets]
signature.v3 = 2026-02-01   # Extend the expiry
Then we're back to the original situation, with a warning instead of an error:
root@424812bd4556:/# apt update
Hit:1 http://deb.debian.org/debian trixie InRelease
Get:2 http://foo.example.org/debian demo InRelease [4229 B]
Hit:3 http://deb.debian.org/debian trixie-updates InRelease
Hit:4 http://deb.debian.org/debian-security trixie-security InRelease
Warning: http://foo.example.org/debian/dists/demo/InRelease: Policy will reject signature within a year, see --audit for details
[..]
Please note that this is a workaround, and not a proper solution.

29 January 2026

C.J. Collier: Part 3: Building the Keystone Dataproc Custom Images for Secure Boot & GPUs

In Part 1, we established a secure, proxy-only network. In Part 2, we explored the enhanced install_gpu_driver.sh initialization action. Now, in Part 3, we'll focus on using the LLC-Technologies-Collier/custom-images repository (branch proxy-exercise-2025-11) to build the actual custom Dataproc images embedded with NVIDIA drivers signed for Secure Boot, all within our proxied environment.

Why Custom Images?

To run NVIDIA GPUs on Shielded VMs with Secure Boot enabled, the NVIDIA kernel modules must be signed with a key trusted by the VM's EFI firmware. Since standard Dataproc images don't include these custom-signed modules, we need to build our own. This process also allows us to pre-install a full stack of GPU-accelerated software.
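As a rough illustration of the signing step this implies (not the project's actual script; the key names and paths below are made up), Secure Boot module signing typically means generating a key pair, enrolling the public half in the firmware's trust database, and signing each module:

```shell
# Generate a signing key pair. The public (DER) half must be enrolled in /
# trusted by the VM's EFI firmware, e.g. via the image's DB variable.
openssl req -new -x509 -newkey rsa:2048 -nodes \
    -keyout MOK.priv -outform DER -out MOK.der \
    -days 365 -subj "/CN=Dataproc Secure Boot signing key/"

# Sign the NVIDIA kernel module with the kernel's sign-file helper.
# (Path and module name vary by kernel version; shown for illustration only.)
# /usr/src/linux-headers-$(uname -r)/scripts/sign-file \
#     sha256 MOK.priv MOK.der nvidia.ko
```

In the custom-images workflow, the equivalent key material is kept in Google Secret Manager (see create-key-pair.sh below) rather than on local disk.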

The custom-images Toolkit (examples/secure-boot)

The examples/secure-boot directory within the custom-images repository contains the necessary scripts and configurations, refined through significant development to handle proxy and Secure Boot challenges.

Key Components & Development Insights:
  • env.json: The central configuration
    file (as used in Part 1) for project, network, proxy, and bucket
    details. This became the single source of truth to avoid configuration
    drift.
  • create-key-pair.sh: Manages the Secure
    Boot signing keys (PK, KEK, DB) in Google Secret Manager, essential for
    the module signing.
  • build-and-run-podman.sh: Orchestrates
    the image build process in an isolated Podman container. This was
    introduced to standardize the build environment and encapsulate
    dependencies, simplifying what the user needs to install locally.
  • pre-init.sh: Sets up the build
    environment within the container and calls
    generate_custom_image.py. It crucially passes metadata
    derived from env.json (like proxy settings and Secure Boot
    key secret names) to the temporary build VM.
  • generate_custom_image.py: The core
    Python script that automates GCE VM creation, runs the customization
    script, and creates the final GCE image.
  • gce-proxy-setup.sh: This script from
    startup_script/ is vital. It's injected into the temporary
    build VM and runs first to configure the OS, package
    managers (apt, dnf), tools (curl, wget, GPG), Conda, and Java to use the
    proxy settings passed in the metadata. This ensures the entire build
    process is proxy-aware.
  • install_gpu_driver.sh: Used as the
    --customization-script within the build VM. As detailed in
    Part 2, this script handles the driver/CUDA/ML stack installation and
    signing, now able to function correctly due to the proxy setup by
    gce-proxy-setup.sh.
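Taken together, the env.json these scripts consume might look something like the following. All key names and values here are invented for illustration; consult the repository's own documentation for the actual schema:

```json
{
  "PROJECT_ID": "my-gcp-project",
  "REGION": "us-central1",
  "SUBNET": "proxy-only-subnet",
  "HTTP_PROXY": "http://proxy.internal:3128",
  "BUCKET": "gs://my-dataproc-custom-images",
  "SECRET_NAME": "efi-db-priv"
}
```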
Layered Image Strategy: The pre-init.sh script employs a layered approach:
  1. secure-boot Image: Base image with
    Secure Boot certificates injected.
  2. tf Image: Based on
    secure-boot, this image runs the full
    install_gpu_driver.sh within the proxy-configured build VM
    to install NVIDIA drivers, CUDA, ML libraries (TensorFlow, PyTorch,
    RAPIDS), and sign the modules. This is the primary target image for our
    use case.
(Note: secure-proxy and proxy-tf layers were experiments, but the -tf image combined with runtime metadata emerged as the most effective solution for 2.2-debian12.)

Build Steps:
  1. Clone Repos & Configure
    env.json: Ensure you have the
    custom-images and cloud-dataproc repos and a
    complete env.json as described in Part 1.
  2. Run the Build:
    # Example: Build a 2.2-debian12 based image set
    # Run from the custom-images repository root
    bash examples/secure-boot/build-and-run-podman.sh 2.2-debian12
    This command will build the layered images, leveraging the proxy
    settings from env.json via the metadata injected into the
    build VM. Note the final image name produced (e.g.,
    dataproc-2-2-deb12-YYYYMMDD-HHMMSS-tf).

Conclusion of Part 3

Through an iterative process, we've developed a robust workflow within the custom-images repository to build Secure Boot-compatible GPU images in a proxy-only environment. The key was isolating the build in Podman, ensuring the build VM is fully proxy-aware using gce-proxy-setup.sh, and leveraging the enhanced install_gpu_driver.sh from Part 2.

In Part 4, we'll bring it all together, deploying a Dataproc cluster using this custom -tf image within the secure network, and verifying the end-to-end functionality.

Bits from Debian: New Debian Developers and Maintainers (November and December 2025)

The following contributors got their Debian Developer accounts in the last two months: The following contributors were added as Debian Maintainers in the last two months: Congratulations!

27 January 2026

Elana Hashman: A beginner's guide to improving your digital security

In 2017, I led a series of workshops aimed at teaching beginners a better understanding of encryption, how the internet works, and their digital security. Nearly a decade later, there is still a great need to share reliable resources and guides on improving these skills. I have worked professionally in computer security one way or another for well over a decade, at many major technology companies and in many open source software projects. There are many inaccurate and unreliable resources out there on this subject, put together by well-meaning people without a background in security, which can lead to sharing misinformation, exaggeration and fearmongering. I hope that I can offer you a trusted, curated list of high impact things that you can do right now, using whichever vetted guide you prefer. In addition, I also include how long it should take, why you should do each task, and any limitations.

This guide is aimed at improving your personal security, and does not apply to your work-owned devices. Always assume your company can monitor all of your messages and activities on work devices.

What can I do to improve my security right away?

I put together this list in order of effort, easiest tasks first. You should be able to complete many of the low effort tasks in a single hour. The medium to high effort tasks are very much worth doing, but may take you a few days or even weeks to complete them.

Low effort (<15 minutes)

Upgrade your software to the latest versions

Why? I don't know anyone who hasn't complained about software updates breaking features, introducing bugs, and causing headaches. If it ain't broke, why upgrade, right? Well, alongside all of those annoying bugs and breaking changes, software updates also include security fixes, which will protect your device from being exploited by bad actors. Security issues can be found in software at any time, even software that's been available for many years and thought to be secure.
You want to install these as soon as they are available.

Recommendation: Turn on automatic upgrades and always keep your devices as up-to-date as possible. If you have some software you know will not work if you upgrade it, at least be sure to upgrade your laptop and phone operating system (iOS, Android, Windows, etc.) and web browser (Chrome, Safari, Firefox, etc.). Do not use devices that do not receive security support (e.g. old Android or iPhones).

Guides:

Limitations: This will prevent someone from exploiting known security issues on your devices, but it won't help if your device was already compromised. If this is a concern, doing a factory reset, upgrade, and turning on automatic upgrades may help. This also won't protect against all types of attacks, but it is a necessary foundation.

Use Signal

Why? Signal is a trusted, vetted, secure messaging application that allows you to send end-to-end encrypted messages and make video/phone calls. This means that only you and your intended recipient can decrypt the messages and someone cannot intercept and read your messages, in contrast to texting (SMS) and other insecure forms of messaging. Other applications advertise themselves as end-to-end encrypted, but Signal provides the strongest protections.

Recommendation: I recommend installing the Signal app and using it! My mom loves that she can video call me on Wi-Fi on my Android phone. It also supports group chats. I use it as a secure alternative to texting (SMS) and other chat platforms. I also like Signal's "disappearing messages" feature, which I enable by default because it automatically deletes messages after a certain period of time. This avoids your messages taking up too much storage.

Guides:

Limitations: Signal is only able to protect your messages in transit. If someone has access to your phone or the phone of the person you sent messages to, they will still be able to read them.
As a rule of thumb, if you don't want someone to read something, don't write it down! Meet in person or make an encrypted phone call where you will not be overheard. If you are talking to someone you don't know, assume your messages are as public as posting on social media.

Set passwords and turn on device encryption

Why? Passwords ensure that someone else can't unlock your device without your consent or knowledge. They also are required to turn on device encryption, which protects your information on your device from being accessed when it is locked. Biometric (fingerprint or face ID) locking provides some privacy, but your fingerprint or face ID can be used against your wishes, whereas if you are the only person who knows your password, only you can use it.

Recommendation: Always set passwords and have device encryption enabled in order to protect your personal privacy. It may be convenient to allow kids or family members access to an unlocked device, but anyone else can access it, too! Use strong passwords that cannot be guessed: avoid using names, birthdays, phone numbers, addresses, or other public information. Using a password manager will make creating and managing passwords even easier. Disable biometric unlock, or at least know how to disable it. Most devices will enable disk encryption by default, but you should double-check.

Guides:

Limitations: If your device is unlocked, the password and encryption will provide no protections; the device must be locked for this to protect your privacy. It is possible, though unlikely, for someone to gain remote access to your device (for example through malware or stalkerware), which would bypass these protections. Some forensic tools are also sophisticated enough to work with physical access to a device that is turned on and locked, but not a device that is turned off/freshly powered on and encrypted. If you lose your password or disk encryption key, you may lose access to your device.
For this reason, Windows and Apple laptops can make a cloud backup of your disk encryption key. However, a cloud backup can potentially be disclosed to law enforcement.

Install an ad blocker

Why? Online ad networks are often exploited to spread malware to unsuspecting visitors. If you've ever visited a regular website and suddenly seen an urgent, flashing pop-up claiming your device was hacked, it is often due to a bad ad. Blocking ads provides an additional layer of protection against these kinds of attacks.

Recommendation: I recommend everyone uses an ad blocker at all times. Not only are ads annoying and disruptive, but they can even result in your devices being compromised!

Guides:

Limitations: Sometimes the use of ad blockers can break functionality on websites, which can be annoying, but you can temporarily disable them to fix the problem. These may not be able to block all ads or all tracking, but they make browsing the web much more pleasant and lower risk! Some people might also be concerned that blocking ads might impact the revenue of their favourite websites or creators. In this case, I recommend either donating directly or sharing the site with a wider audience, but keep using the ad blocker for your safety.

Enable HTTPS-Only Mode

Why? The "S" in "HTTPS" stands for "secure". This feature, which can be enabled on your web browser, ensures that every time you visit a website, your connection is always end-to-end encrypted (just like when you use Signal!). This ensures that someone can't intercept what you search for, what pages on websites you visit, and any information you or the website share, such as your banking details.

Recommendation: I recommend enabling this for everyone, though with improvements in web browser security and adoption of HTTPS over the years, your devices will often do this by default! There is a small risk you will encounter some websites that do not support HTTPS, usually older sites.
Guides:

Limitations: HTTPS protects the information on your connection to a website. It does not hide or protect the fact that you visited that website, only the information you accessed. If the website is malicious, HTTPS does not provide any protection. In certain settings, like when you use a work-managed computer that was set up for you, it can still be possible for your IT department to see what you are browsing, even over an HTTPS connection, because they have administrator access to your computer and the network.

Medium to high effort (1+ hours)

These tasks require more effort but are worth the investment.

Set up a password manager

Why? It is not possible for a person to remember a unique password for every single website and app that they use. I have, as of writing, 556 passwords stored in my password manager. Password managers do three important things very well:
  1. They generate secure passwords with ease. You don't need to worry about getting your digits and special characters just right; the app will do it for you, and generate long, secure passwords.
  2. They remember all your passwords for you, and you just need to remember one password to access all of them. The most common reason people's accounts get hacked online is because they used the same password across multiple websites, and one of the websites had all their passwords leaked. When you use a unique password on every website, it doesn't matter if your password gets leaked!
  3. They autofill passwords based on the website you're visiting. This is important because it helps prevent you from getting phished. If you're tricked into visiting an evil lookalike site, your password manager will refuse to fill the password.
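To illustrate point 1, generating a long random password is conceptually simple. This is a bare-bones sketch of the idea, not a replacement for a real password manager's vetted generator:

```shell
# Draw random bytes from the OS CSPRNG and keep only characters from a
# chosen alphabet; 24 characters of this set is far beyond guessability.
generate_password() {
  length="${1:-24}"
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
  echo
}

generate_password 24
```

A real password manager does the same thing with larger symbol sets and, crucially, also stores the result encrypted so you never have to remember it.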
Recommendation: These benefits are extremely important, and setting up a password manager is often one of the most impactful things you can do for your digital security. However, they take time to get used to, and migrating all of your passwords into the app (and immediately changing them!) can take a few minutes at a time... over weeks. I recommend you prioritize the most important sites, such as your email accounts, banking/financial sites, and cellphone provider. This process will feel like a lot of work, but you will get to enjoy the benefits of never having to remember new passwords and the autofill functionality for websites. My recommended password manager is 1Password, but it stores passwords in the cloud and costs money. There are some good free options as well if cost is a concern. You can also use web browser- or OS-based password managers, but I do not prefer these.

Guides:

Limitations: Many people are concerned about the risk of using a password manager causing all of their passwords to be compromised. For this reason, it's very important to use a vetted, reputable password manager that has passed audits, such as 1Password or Bitwarden. It is also extremely important to choose a strong password to unlock your password manager. 1Password makes this easier by generating a secret to strengthen your unlock password, but I recommend using a long, memorable password in any case. Another risk is that if you forget your password manager's password, you will lose access to all your passwords. This is why I recommend 1Password, which has you set up an Emergency Kit to recover access to your account.

Set up two-factor authentication (2FA) for your accounts

Why? If your password is compromised in a website leak or due to a phishing attack, two-factor authentication will require a second piece of information to log in and potentially thwart the intruder. This provides you with an extra layer of security on your accounts.
Recommendation: You don't necessarily need to enable 2FA on every account, but prioritize enabling it on your most important accounts (email, banking, cellphone, etc.). There are typically a few different kinds: email-based (which is why your email account's security is so important), text message or SMS-based (which is why your cellphone account's security is so important), app-based, and hardware token-based. Email and text message 2FA are fine for most accounts. You may want to enable app- or hardware token-based 2FA for your most sensitive accounts.

Guides:

Limitations: The major limitation is that if you lose access to your second factor, you can be locked out of an account. This can happen if you're travelling abroad and can't access your usual cellphone number, if you break your phone and don't have a backup of your authenticator app, or if you lose your hardware token. For this reason, many websites will provide you with "backup tokens": print them out and store them in a secure location, or keep them in your password manager. If you use an authenticator app, I recommend choosing one that allows you to make secure backups, such as Ente. You are also limited by the types of 2FA a website supports; many don't support app- or hardware token-based 2FA.

Remove your information from data brokers

Why? This is a problem that mostly affects people in the US. It surprises many people that information from their credit reports and other public records is scraped and made available (for free or at a low cost) online through "data broker" websites. I have shocked friends who didn't believe this was an issue by searching for their full names and, within 5 minutes, showing them their birthday, home address, and phone number. This is a serious privacy problem!

Recommendation: Opt out of any and all data broker websites to remove this information from the internet. This is especially important if you are at risk of being stalked or harassed.
Guides:

Limitations: It can take time for your information to be removed once you opt out, and unfortunately search engines may have cached your information for a while longer. This is also not a one-and-done process: new data brokers are constantly popping up, and some may not properly honour your opt-out, so you will need to check on a regular basis (perhaps once or twice a year) to make sure your data has been properly scrubbed. This also cannot prevent someone from directly searching public records to find your information, but that requires much more effort.

"Recommended security measures" I think beginners should avoid

We've covered a lot of tasks you should do, but I also think it's important to cover what not to do. I see many of these tools recommended to security beginners, and I think that's a mistake. For each tool, I will explain why I don't think you should use it, and the scenarios in which it might make sense.

"Secure email"

What is it? Many email providers, such as Proton Mail, advertise themselves as providing secure email. They are often recommended as a "more secure" alternative to typical email providers such as Gmail.

What's the problem? Email is fundamentally insecure by design. The email specification (RFC 3207) states that a publicly referenced email server MUST NOT require the use of transport encryption (STARTTLS) in order to accept mail. Email providers can of course provide additional security by encrypting their copies of your email and providing you access to your email over HTTPS, but the messages themselves can always be sent without encryption. Some platforms such as Proton Mail advertise end-to-end encrypted emails so long as you email another Proton user. This is not truly email, but their own internal encrypted messaging platform that follows the email format.

What should I do instead? Use Signal to send encrypted messages. NEVER assume the contents of an email are secure.

Who should use it?
I don't believe there are any major advantages to using a service such as this one. Even if you pay for a more "secure" email provider, the majority of your emails will still be delivered to people who don't. Additionally, while I don't use or necessarily recommend their service, Google offers an Advanced Protection Program for people who may be targeted by state-level actors.

PGP/GPG Encryption

What is it? PGP ("Pretty Good Privacy") and GPG ("GNU Privacy Guard") are encryption and cryptographic signing software. They are often recommended to encrypt messages or email.

What's the problem? GPG is decades old and its usability has always been terrible. It is extremely easy to accidentally send a message that you thought was encrypted without encryption! The problems with PGP/GPG have been extensively documented.

What should I do instead? Use Signal to send encrypted messages. Again, NEVER use email for sensitive information.

Who should use it? Software developers who contribute to projects where there is a requirement to use GPG should continue to use it until an adequate alternative is available. Everyone else should live their lives in PGP-free bliss.

Installing a "secure" operating system (OS) on your phone

What is it? There are a number of self-installed operating systems for Android phones, such as GrapheneOS, that advertise themselves as being "more secure" than the version of Android provided by your phone manufacturer. They often remove core Google APIs and services to allow you to "de-Google" your phone.

What's the problem? These projects are relatively niche and don't have nearly enough resourcing to respond to the high levels of security pressure Android experiences (such as against the forensic tools I mentioned earlier). You may suddenly lose security support with no notice, as happened with CalyxOS.
You need a high level of technical know-how and a lot of spare time to maintain a device with a custom operating system, which is not a reasonable expectation for the average person. By stripping Google APIs such as Google Play Services, some useful apps can no longer function. And some law enforcement organizations have gone as far as accusing people who install GrapheneOS on Pixel phones of engaging in criminal activity.

What should I do instead? For the best security on an Android device, use a phone manufactured by Google or Samsung (smaller manufacturers are less reliable), or consider buying an iPhone. Make sure your device is up to date and receiving security updates.

Who should use it? These projects are great for tech enthusiasts who are interested in contributing to and developing them further. They can give new life to old phones that are no longer receiving security or software updates. They are also great for people with an interest in free and open source software and digital autonomy. But these tools are not a good choice for a general audience, nor do they provide more practical security than an up-to-date Google or Samsung Android phone.

Virtual Private Network (VPN) Services

What is it? A virtual private network (VPN) service provides a secure tunnel from your device to the location where the VPN operates. This means that if I am in Seattle with my phone connected to a VPN in Amsterdam and I access a website, it appears to the website that my phone is located in Amsterdam.

What's the problem? VPN services are frequently advertised as providing security or protection from nefarious actors, or as helping protect your privacy. These benefits are often far overstated, and there are predatory VPN providers that can actually be harmful. It costs money and resources to provide a VPN, so free VPN services are especially suspect.
When you use a VPN, the VPN provider knows the websites you are visiting in order to provide you with the service. Free VPN providers may sell this data to cover the cost of the service, leaving you with less security and privacy. The average person does not have the knowledge to determine whether a VPN service is trustworthy. VPNs also don't provide any additional encryption benefits if you are already using HTTPS. They may provide a small privacy benefit if you are connected to an untrusted network with an attacker on it.

What should I do instead? Always use HTTPS to access websites. Don't connect to untrusted networks; for example, use cellphone data instead of a sketchy Wi-Fi access point. Your local neighbourhood coffee shop is probably fine.

Who should use it? There are three main use cases for VPNs. The first is to bypass geographic restrictions: a VPN makes all of your web traffic appear to come from another location, so if you live in an area with local internet censorship policies, you can use a VPN to access the internet from a location without such policies. The second is if you know your internet service provider is actively hostile or malicious: a trusted VPN hides all your traffic, including which websites you visit, from your internet service provider, and the only thing they can see is that you are accessing a VPN. The third is to access a network that isn't connected to the public internet, such as a corporate intranet. I strongly discourage the use of VPNs for "general-purpose security."

Tor

What is it? Tor, "The Onion Router", is a free and open source software project that provides anonymous networking. Unlike with a VPN, where the VPN provider knows who you are and what websites you are requesting, Tor's architecture makes it extremely difficult to determine who sent a request.

What's the problem?
Tor is difficult to set up properly; as with PGP-encrypted email, it is possible to accidentally not be connected to Tor and not know the difference. Its usability has improved over the years, but Tor is still not a good tool for beginners. Due to the way Tor works, it is also extremely slow: if you are used to cable or fiber internet, get ready to go back to dialup speeds. Tor also doesn't provide perfect privacy, and without a strong understanding of its limitations, it can be possible to deanonymize someone despite their using it. Additionally, many websites detect connections from the Tor network and block them.

What should I do instead? If you want to bypass censorship, it is often better to use a trusted VPN provider, particularly if you need high bandwidth (e.g. for streaming). If you want to access a website anonymously, Tor itself might not be enough to protect you. For example, if you need to provide an email address or personal information, you can decline to provide accurate information and use a masked email address. A friend of mine once used the alias "Nunya Biznes".

Who should use it? Tor should only be used by people who are experienced users of security tools and understand its strengths and limitations. Tor is best used through a purpose-built system, such as Tor Browser or Freedom of the Press Foundation's SecureDrop.

I want to learn more!

I hope you've found this guide to be a useful starting point. I always welcome folks reaching out to me with questions, though I might take a little bit of time to respond. You can always email me. If there's enough interest, I might cover the following topics in a future post: Stay safe out there!

24 January 2026

Gunnar Wolf: Finally some light for those who care about Debian on the Raspberry Pi

Finally, some light at the end of the tunnel! As I have said in this blog and elsewhere, after putting quite a bit of work into generating the Debian Raspberry Pi images between late 2018 and 2023, I had to recognize I don't have the time and energy to properly care for them. I even registered a GSoC project for it. I mentored Kurva Prashanth, who did good work on the vmdb2 scripts we use for the image generation, but in the end was unable to push them to be built in Debian infrastructure. Maybe a different approach was needed! While I adopted the images as they were conceived by Michael Stapelberg, sometimes it's easier to start from scratch and build a fresh approach. So, I'm not yet pointing at a stable, proven release, but at a good promise. And I hope I'm not being pushy by making this public: in the #debian-raspberrypi channel, waldi has shared the images he has created with the Debian Cloud Team's infrastructure. Right now, the images built so far support the Raspberry Pi 4 and 5 families (notably, not the 500 computer I have, due to a missing Device Tree, but I'll try to help figure that bit out. Anyway, 400/500/500+ systems are not that usual). Work is underway to get the 3B+ to boot (some hackery is needed, as it only understands MBR partition schemes, so creating a hybrid image seems to be needed).

Debian Cloud images for Raspberries

Sadly, I don't think the effort will be extended to cover older, 32-bit-only systems (RPi 0, 1 and 2). Anyway, as this effort stabilizes, I will phase out my (stale!) work on raspi.debian.net and will redirect it to point at the new images.

Comments

Andrea Pappacoda tachi@d.o 2026-01-26 17:39:14 GMT+1
Are there any particular caveats compared to using the regular Raspberry Pi OS? Are they documented anywhere?

Gunnar Wolf gwolf.blog@gwolf.org 2026-01-26 11:02:29 GMT-6
Well, the Raspberry Pi OS includes quite a bit of software that's not packaged in Debian for various reasons: some of it because it's non-free demo-ware, some of it because it's RPiOS-specific configuration, some of it I don't care about. I like running Debian wherever possible.

Andrea Pappacoda tachi@d.o 2026-01-26 18:20:24 GMT+1
Thanks for the reply! Yeah, sorry, I should've been more specific. I also just care about the Debian part. But: are there any hardware issues or unsupported stuff, like booting from an SSD (which I'm currently doing)?

Gunnar Wolf gwolf.blog@gwolf.org 2026-01-26 12:16:29 GMT-6
That's beyond my knowledge. Although I can tell you that:
  • Raspberry Pi OS has hardware support as soon as their new boards hit the market. The ability to even boot a board can take over a year to reach the mainline Linux kernel (at least it has, for both the 4 and the 5 families).
  • Also, sometimes some bits of hardware are not discovered by the Linux kernel even if the general family boots, because they are not declared in the right place of the Device Tree (e.g. the wireless network interface in the 02W is at a different address than in the 3B+, or the 500 does not fully boot while the 5B now does). Usually it is a matter of just declaring stuff in the right place, but it's not a skill many of us have.
  • Also, many RPi hats ship with their own Device Tree overlays, and they cannot always be loaded on top of mainline kernels.
Andrea Pappacoda tachi@d.o 2026-01-26 19:31:55 GMT+1
That's beyond my knowledge. Although I can tell you that: Raspberry Pi OS has hardware support as soon as their new boards hit the market. The ability to even boot a board can take over a year to reach the mainline Linux kernel (at least it has, for both the 4 and the 5 families).
Yeah, unfortunately I'm aware of that. I've also been trying to boot OpenBSD on my rpi5 out of curiosity, but have been blocked by my somewhat unusual setup involving an NVMe SSD as the boot drive :/
Also, sometimes some bits of hardware are not discovered by the Linux kernel even if the general family boots, because they are not declared in the right place of the Device Tree (e.g. the wireless network interface in the 02W is at a different address than in the 3B+, or the 500 does not fully boot while the 5B now does). Usually it is a matter of just declaring stuff in the right place, but it's not a skill many of us have.
At some point in my life I had started reading a bit about device trees, but got distracted by other things before I could develop any familiarity with them. So I don't have the skills either :)
Also, many RPi hats ship with their own Device Tree overlays, and they cannot always be loaded on top of mainline kernels.
I'm definitely not happy to hear this! Guess I'll have to try, and maybe report back once some page for these new builds materializes.
