A long, long time ago
I have a few pictures on this blog, mostly in earlier years, because even with
small pictures, the git repository soon became 80MiB. This is not much in
absolute terms, but the actual Markdown/Haskell/CSS/HTML total size is tiny
compared to the pictures, PDFs and fonts. I realised I needed a better solution
probably about ten years ago, and that I should investigate
git-annex. Then time passed, and I heard
about git-lfs, so I thought that's the way forward.
Now, I recently got interested again in doing something about this repository,
and started researching.
Detour: git-lfs
I was sure that git-lfs, being supported by large providers, would be the
modern solution. But to my surprise, git-lfs is very server-centric, which in
hindsight makes sense, but for a home setup it's not very good. Maybe I
misunderstood, but git-lfs is more a protocol/method for a forge to store
files, rather than an end-user solution. But then you need to back up those files
separately (together with the rest of the forge), or implement another way of
safeguarding them.
Further details, such as the fact that it keeps two copies of the files (one in
the actual checked-out tree, one in internal storage), mean it's not a good
solution. Well, for my blog yes, but not in general. Then posts on Reddit with
horror stories, people being locked out of GitHub due to quota, as an example, or
this Stack Overflow post
about git-lfs constraining how one uses git, convinced me that's not what I
want. To each their own, but not for me: I might want to push this blog's repo to
GitHub, but I definitely wouldn't want in that case to pay for GitHub storage
for my blog images (which are copies, not originals). And yes, even in 2025,
those quotas are real GitHub
limits, and
I agree with GitHub: storage and large bandwidth can't be free.
Back to the future: git-annex
So back to git-annex. I thought it was going to be a simple thing, but oh boy,
was I wrong. It took me half a week of continuous (well, in free time) reading
and discussions with LLMs to understand a bit how it works. I think, honestly,
it's a bit too complex, which is why the workflows
page lists seven (!) levels of
workflow complexity, from fully-managed to fully-manual. IMHO, respect to the
author for the awesome tool, but if you need a web app to help you manage git,
it hints that the tool is too complex.
I made the mistake of running git annex sync once, only to realise it actually
starts pushing to my upstream repo and creating new branches and whatnot, so
after enough reading, I settled on workflow 6/7, since I don't want another tool
to manage my git history. Maybe I'm an outlier here, but everything automatic
is a bit too much for me.
Once you do manage to understand how git-annex works (on the surface, at least), it
is a pretty cool thing. It uses a git-annex git branch to store
metainformation, and that is relatively clean. If you do run git annex sync,
it creates some extra branches, which I don't like, but meh.
Trick question: what is a remote?
One of the most confusing things about git-annex was understanding its remote
concept. I thought a remote is a place where you replicate your data. But no,
that's a special remote. A normal remote is a git remote, but one which is
expected to be git/ssh/with command-line access. So if you have a git+ssh
remote, git-annex will not only try to push its above-mentioned branch, but
also copy the files. If such a remote is on a forge that doesn't support
git-annex, then it will complain and get confused.
Of course, if you read the extensive docs, you just run
git config remote.<name>.annex-ignore true, and it will understand that it should not
sync to it.
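For example, assuming the forge remote is named origin (any remote name works the same way):
# tell git-annex not to store or sync annexed file contents to this remote
git config remote.origin.annex-ignore true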
But, aside from this case, git-annex expects that all checkouts and clones of
the repository hold both metadata and data. And if you do any annex commands in
them, all other clones will know about them! This can be unexpected, and you
find people complaining about it, but nowadays there's a solution:
git clone dir && cd dir
git config annex.private true
git annex init "temp copy"
This is important. Any leaf git clone must be followed by that annex.private true
config, especially on CI/CD machines. Honestly, I don't understand why
by default clones should be official data stores, but it is what it is.
I settled on not making any of my checkouts "stable", but only the actual
storage places. Except those are not git repositories, but just git-annex
storage things, i.e., special remotes.
Is it confusing enough yet?
Special remotes
The special remotes, as said, are what I expected to be the normal git-annex
remotes, i.e. places where the data is stored. But well, they exist, and while
I'm only using a couple of simple ones, there is a large number of
them. Among the interesting
ones: git-lfs, a
remote that allows also storing the git repository itself
(git-remote-annex),
although I'm a bit confused about this one, and most of the common storage
providers via the rclone
remote.
Plus, all of the special remotes support encryption, so this is a really neat
way to store your files across a large number of things, and handle replication,
number of copies, from which copy to retrieve, etc. as you wish.
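As a concrete illustration, here is roughly how an encrypted rsync special remote plus a minimum-copies policy could be set up; the remote name, host and path are made up for this sketch:
# create an encrypted rsync special remote (hypothetical host and path)
git annex initremote mybackup type=rsync rsyncurl=backup.example.com:/srv/annex encryption=shared
# require at least two copies of every annexed file
git annex numcopies 2
# replicate annexed file contents to the new remote
git annex copy --to mybackup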
And many other features
git-annex has tons of other features, so to some extent, the sky's the limit.
Automatic selection of what to add to the annex vs plain git, encryption handling,
number of copies, clusters, computed files, etc. etc. etc. I still think it's
cool but too complex, though!
Uses
Aside from my blog, of course.
I've seen blog posts/comments about people using git-annex to track/store their
photo collection, and I could see very well how remote encrypted repos on any
of the services supported by rclone could be an N+2 copy or so. For me, tracking
photos would be a bit too tedious, but it could maybe work after more research.
A more practical thing would probably be replicating my local movie collection
(all legal, to be clear) better than just running rsync from time to time, and
tracking the large files in it via git-annex. That's an exercise for another
day, though, once I get more mileage with it - my blog pictures are copies, so I
don't care much if they get lost, but the movies are primary online copies, and I
don't want to re-dump the discs. Anyway, for later.
Migrating to git-annex
Migrating here means ending in a state where all large files are in git-annex,
and the plain git repo is small. Just moving the files to git-annex at the
current head doesn't remove them from history, so your git repository is still
large; it won't grow in the future, but remains at the old size (and contains the
large files in its history).
In my mind, a nice migration would be: run a custom command, and all the history
is migrated to git-annex, so I can go back in time and still use git-annex.
I naively expected this would be easy and already available, only to find
comments on the git-annex site with unsure git-filter-branch calls and some
web discussions. This is the
discussion
on the git-annex website, but it didn't make me confident it would do the right
thing.
But that discussion is now 8 years old. Surely in 2025, with git-filter-repo,
it's easier? And, maybe I'm missing something, but it is not. Not from the point
of view of plain git, that's easy, but because of interacting with git-annex, which
stores its data in git itself, so doing this properly across successive steps of
a repo (when replaying the commits) is, I think, not well-defined behaviour.
So I was stuck here for a few days, until I got an epiphany: as I'm going to
rewrite the repository, of course I'm keeping a copy of it from before
git-annex. If so, I don't need the history, back in time, to be correct in the
sense of being able to retrieve the binary files too. It just needs to be
correct from the point of view of the actual Markdown and Haskell files that
represent the meat of the blog.
This simplified the problem a lot. At first, I wanted to just skip these files,
but this could also drop commits (git-filter-repo, by default, drops commits
if they're empty), and removing the files loses information - when they were
added, what the paths were, etc. So instead I came up with a rather clever idea,
if I might say so: since git-annex replaces files with symlinks already, just
replace the files with symlinks in the whole history, except symlinks that
are dangling (to represent the fact that the files are missing). One could also use
empty files, but empty files are more "valid" in a sense than dangling symlinks,
hence why I settled on those.
Doing this with git-filter-repo is easy, in newer versions, with the
new --file-info-callback. Here is the simple code I used:
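A minimal sketch of such a callback follows; this is not the exact snippet used, and it assumes the callback receives filename, mode and blob_id plus a value helper providing insert_file_with_contents (per recent git-filter-repo documentation), and that the large files can be matched by extension:
git filter-repo --force --file-info-callback '
  # Replace picture/PDF blobs with dangling symlinks throughout history.
  # The symlink target hints at why the file is missing.
  if filename.endswith((b".png", b".jpg", b".jpeg", b".pdf")):
      blob_id = value.insert_file_with_contents(b"annex-rewritten/" + filename)
      mode = b"120000"  # symlink mode; the blob contents become the link target
  return (filename, mode, blob_id)
'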
This goes and replaces files with a symlink to nowhere, but the symlink should
explain why it's dangling. Later renames or moves of the files then work
"naturally", as the rename/mv doesn't care about file contents. Then, when the
filtering is done:
copy the (binary) files from the original repository
since they're named the same, and in the same places, git sees a type change
then simply run git annex add on those files
For me it was easy, as all such files were in a few directories, so just copying
those directories back, a few git annex add commands, and done.
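Concretely, the copy-back step looks roughly like this (the directory names are illustrative):
# copy the binary files back from the pre-rewrite copy of the repository
cp -r ../blog-before-rewrite/images .
# git now sees a type change (symlink to regular file); annex and commit them
git annex add images/
git commit -m "Re-add large files via git-annex"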
Of course, then adding a few rsync remotes, git annex copy --to, and the
repository was ready.
Well, I also found a bug in my own Hakyll setup: on a fresh clone, when the
large files are just dangling symlinks, the builder doesn't complain, it just
ignores the images. Will have to fix that.
Other resources
This is a blog that I read at the beginning, and I found it very useful as an
intro: https://switowski.com/blog/git-annex/. It didn't help me understand how
it works under the covers, but it is well written. The author does use the
sync command though, which is too magic for me, but also agrees about its
complexity.
The proof is in the pudding
And now, for the actual first image to be added that never lived in the old
plain git repository. It's not full-res/full-size; it's cropped a bit at the
bottom.
Earlier in the year, I went to Paris for a very brief work trip, and I walked
around a bit; it was more beautiful than what I remembered from way, way back. So
a somewhat random selection of a picture, but here it is:
Un bateau sur la Seine
Enjoy!
Welcome to our 5th report from the Reproducible Builds project in 2025! Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. If you are interested in contributing to the Reproducible Builds project, please do visit the Contribute page on our website.
In this report:
Security audit of Reproducible Builds tools published
The Open Technology Fund's (OTF) security partner Security Research Labs recently conducted an audit of some specific parts of tools developed by Reproducible Builds. This form of security audit, sometimes called a whitebox audit, is a form of testing in which auditors have complete knowledge of the item being tested. The auditors assessed the various codebases for resilience against hacking, with key areas including differential report formats in diffoscope, common client web attacks, command injection, privilege management, hidden modifications in the build process and attack vectors that might enable denial of service.
The audit focused on three core Reproducible Builds tools: diffoscope, a Python application that unpacks archives of files and directories and transforms their binary formats into human-readable form in order to compare them; strip-nondeterminism, a Perl program that improves reproducibility by stripping out non-deterministic information such as timestamps or other elements introduced during packaging; and reprotest, a Python application that builds source code multiple times in various environments in order to test reproducibility.
OTF's announcement contains more of an overview of the audit, and the full 24-page report is available in PDF form as well.
[Colleagues] approached me to talk about a reproducibility issue they'd been having with some R code. They'd been running simulations that rely on generating samples from a multivariate normal distribution, and despite doing the prudent thing and using set.seed() to control the state of the random number generator (RNG), the results were not computationally reproducible. The same code, executed on different machines, would produce different random numbers. The numbers weren't just a little bit different in the way that we've all wearily learned to expect when you try to force computers to do mathematics. They were painfully, brutally, catastrophically, irreproducibly different. Somewhere, somehow, something broke.
[We] present attestable builds, a new paradigm to provide strong source-to-binary correspondence in software artifacts. We tackle the challenge of opaque build pipelines that disconnect the trust between source code, which can be understood and audited, and the final binary artifact, which is difficult to inspect. Our system uses modern trusted execution environments (TEEs) and sandboxed build containers to provide strong guarantees that a given artifact was correctly built from a specific source code snapshot. As such it complements existing approaches like reproducible builds which typically require time-intensive modifications to existing build configurations and dependencies, and require independent parties to continuously build and verify artifacts.
The authors compare attestable builds with reproducible builds by noting that an attestable build "requires only minimal changes to an existing project, and offers nearly instantaneous verification of the correspondence between a given binary and the source code and build pipeline used to construct it", and proceed by determining that "the overhead (42 seconds start-up latency and 14% increase in build duration) is small in comparison to the overall build time".
Timo Pohl, Pavel Novák, Marc Ohm and Michael Meier have published a paper called Towards Reproducibility for Software Packages in Scripting Language Ecosystems. The authors note that past research into Reproducible Builds has focused primarily on compiled languages and their ecosystems, with a further emphasis on Linux distribution packages:
However, the popular scripting language ecosystems potentially face unique issues given the systematic difference in distributed artifacts. This Systemization of Knowledge (SoK) [paper] provides an overview of existing research, aiming to highlight future directions, as well as chances to transfer existing knowledge from compiled language ecosystems. To that end, we work out key aspects in current research, systematize identified challenges for software reproducibility, and map them between the ecosystems.
Ultimately, the three authors find that the literature is "sparse", focusing on few individual problems and ecosystems, and they therefore identify space for more critical research.
Distribution work
In Debian this month:
Ian Jackson filed a bug against the debian-policy package in order to delve into an issue affecting Debian's support for cross-architecture compilation, multiple-architecture systems, the reproducible-builds SOURCE_DATE_EPOCH environment variable and the ability to recompile already-uploaded packages to Debian with a new/updated toolchain (binNMUs). Ian identifies a specific case in the libopts25-dev package, involving a manual page that had interesting downstream effects, potentially affecting backup systems. The bug generated a large number of replies, some of which have references to similar or overlapping issues, such as this one from 2016/2017.
There is now a Reproducibility Status link for each app on f-droid.org, listed on every app's page. Our verification server shows a positive or negative result based on its build results, where a positive result means our rebuilder reproduced the same APK file and a negative one means it did not. The IzzyOnDroid repository has developed a more elaborate system of badges which displays a badge for each rebuilder. Additionally, there is a sketch of a five-level graph to represent some aspects of which processes were run.
Hans compares the approach with projects such as Arch Linux and Debian that provide developer-facing tools to give feedback about reproducible builds, but do not display information about reproducible builds in the user-facing interfaces like the package management GUIs.
Arnout Engelen of the NixOS project has been working on reproducing the minimal installation ISO image. This month, Arnout has successfully reproduced the build of the minimal image for the 25.05 release without relying on the binary cache. Work on also reproducing the graphical installer image is ongoing.
In openSUSE news, Bernhard M. Wiedemann posted another monthly update for their work there.
Lastly in Fedora news, Jelle van der Waa opened issues tracking reproducibility issues in Haskell documentation, Qt6 recording the host kernel, and R packages recording the current date. The R packages can be made reproducible with packaging changes in Fedora.
diffoscope & disorderfs
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 295, 296 and 297 to Debian:
Don't rely on the zipdetails --walk argument being available, and only add that argument on newer versions after we test for that. []
Review and merge support for NuGet packages from Omair Majid. []
Update copyright years. []
Merge support for an lzma comparator from Will Hollywood. [][]
Chris also merged an impressive changeset from Siva Mahadevan to make disorderfs more portable, especially on FreeBSD. disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues []. This was then uploaded to Debian as version 0.6.0-1.
Lastly, Vagrant Cascadian updated diffoscope in GNU Guix to version 296 [][] and 297 [][], and disorderfs to version 0.6.0 [][].
Website updates
Once again, there were a number of improvements made to our website this month including:
Chris Lamb:
Merged four or five suggestions from Guillem Jover for the GNU Autotools examples on the SOURCE_DATE_EPOCH example page []
Incorporated a number of fixes for the JavaScript SOURCE_DATE_EPOCH snippet from Sebastian Davis, which did not handle non-integer values correctly. []
Removed the JavaScript example that uses a fixed timezone on the SOURCE_DATE_EPOCH page. []
Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility.
However, Holger Levsen posted to our mailing list this month in order to bring a wider awareness to funding issues faced by the Oregon State University (OSU) Open Source Lab (OSL). As mentioned in OSL's public post, "recent changes in university funding makes our current funding model no longer sustainable [and that] unless we secure $250,000 in committed funds, the OSL will shut down later this year". As Holger notes in his post to our mailing list, the Reproducible Builds project relies on hardware nodes hosted there. Nevertheless, Lance Albertson of OSL posted an update to the funding situation later in the month with broadly positive news.
Separate to this, there were various changes to the Jenkins setup this month, which is used as the backend driver for both tests.reproducible-builds.org and reproduce.debian.net, including:
Migrating the central jenkins.debian.net server from AMD Opteron to Intel Haswell CPUs. Thanks to IONOS for hosting this server since 2012.
After testing it for almost ten years, the i386 architecture has been dropped from tests.reproducible-builds.org. This is because, with the upcoming release of Debian trixie, i386 is no longer supported as a regular architecture: there will be no official kernel and no Debian installer for i386 systems. As a result, a large number of nodes hosted by Infomaniak have been retooled from i386 to amd64.
Another node, ionos17-amd64.debian.net, which is used for verifying packages for all.reproduce.debian.net (hosted by IONOS), has had its memory increased from 40GB to 64GB and the number of cores doubled to 32. In addition, two nodes generously hosted by OSUOSL have had their memory doubled to 16GB.
Lastly, we have been granted access to more riscv64 architecture boards, so now we have seven such nodes, all with 16GB memory and 4 cores that are verifying packages for riscv64.reproduce.debian.net. Many thanks to PLCT Lab, ISCAS for providing those.
Outside of this, a number of smaller changes were also made by Holger Levsen:
Disable testing of the i386 architecture. [][][][][]
Document the current disk usage. [][]
Address some image placement now that we only test three architectures. []
Keep track of build performance. []
Misc:
Fix a (harmless) typo in the multiarch_versionskew script. []
In addition, Jochen Sprickerhof made a series of changes related to reproduce.debian.net:
Add out of memory detection to the statistics page. []
Reverse the sorting order on the statistics page. [][][][]
Improve the spacing between statistics groups. []
Update a (hard-coded) line number in error message detection pertaining to a debrebuild line number. []
Support Debian unstable in the rebuilder-debian.sh script. []
Rely on rebuildctl to sync only arch-specific packages. [][]
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. This month, we wrote a large number of such patches, including:
0xFFFF: Use SOURCE_DATE_EPOCH for date in manual pages.
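As an illustration of the general technique (not the actual 0xFFFF patch), a build rule can derive the man-page date from SOURCE_DATE_EPOCH rather than the wall clock; the file names here are hypothetical:
# use SOURCE_DATE_EPOCH for a stable man-page date, falling back to "now"
BUILD_DATE=$(date -u -d "@${SOURCE_DATE_EPOCH:-$(date +%s)}" +%Y-%m-%d)
sed "s/@DATE@/${BUILD_DATE}/" mytool.1.in > mytool.1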
Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:
Another short status update of what happened on my side last
month. Larger blocks besides the Phosh 0.47 release are on screen
keyboard and cell broadcast improvements, work on separate volume
streams, the switch of phoc to wlroots 0.19.0 and effort to make
Phosh work on Debian's upcoming stable release (Trixie) out of the
box. Trixie will ship with Phosh 0.46, if you want to try out 0.47
you can fetch it from Debian's experimental suite.
See below for details on the above and more:
phosh
Track volume control based on media role priority (MR)
Standardize audio stream roles (MR). Otherwise we'll have a hard time
with e.g. WirePlumber's role-based policy linking, as apps might use all kinds of types.
Reviews
This is not code by me but reviews of other people's code. The list is
(as usual) slightly incomplete. Thanks for the contributions!
In this post, I demonstrate the optimal workflow for creating new Debian packages in 2025, preserving the upstream git history. The motivation for this is to lower the barrier for sharing improvements to and from upstream, and to improve software provenance and supply-chain security by making it easy to inspect every change at any level using standard git tooling.
Key elements of this workflow include:
Using a Git fork/clone of the upstream repository as the starting point for creating Debian packaging repositories.
Consistent use of the same git-buildpackage commands, with all package-specific options in gbp.conf.
Pristine-tar and upstream signatures for supply-chain security.
Use of Files-Excluded in the debian/copyright file to filter out unwanted files in Debian.
Patch queues to easily rebase and cherry-pick changes across Debian and upstream branches.
Efficient use of Salsa, Debian's GitLab instance, for both automated feedback from CI systems and human feedback from peer reviews.
To make the instructions so concrete that anyone can repeat all the steps themselves on a real package, I demonstrate the steps by packaging the command-line tool Entr. It is written in C, has very few dependencies, and its final Debian source package structure is simple, yet exemplifies all the important parts that go into a complete Debian package:
Creating a new packaging repository and publishing it under your personal namespace on salsa.debian.org.
Using dh_make to create the initial Debian packaging.
Posting the first draft of the Debian packaging as a Merge Request (MR) and using Salsa CI to verify Debian packaging quality.
Running local builds efficiently and iterating on the packaging process.
Create new Debian packaging repository from the existing upstream project git repository
First, create a new empty directory, then clone the upstream Git repository inside it:
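A sketch of that step; the upstream URL and the development branch name (master) are assumptions about entr's upstream, and the directory name is arbitrary:
mkdir debian-entr && cd debian-entr
git clone --origin upstreamvcs --branch master \
    --single-branch https://github.com/eradman/entr.git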
Using a clean directory makes it easier to inspect the build artifacts of a Debian package, which will be output in the parent directory of the Debian source directory.
The extra parameters given to git clone lay the foundation for the Debian packaging git repository structure where the upstream git remote name is upstreamvcs. Only the upstream main branch is tracked to avoid cluttering git history with upstream development branches that are irrelevant for packaging in Debian.
Next, enter the git repository directory and list the git tags. Pick the latest upstream release tag as the commit to start the branch upstream/latest. This latest refers to the upstream release, not the upstream development branch. Immediately after, branch off the debian/latest branch, which will have the actual Debian packaging files in the debian/ subdirectory.
cd entr
git tag # shows the latest upstream release tag was '5.6'
git checkout -b upstream/latest 5.6
git checkout -b debian/latest
At this point, the repository is structured according to DEP-14 conventions, ensuring a clear separation between upstream and Debian packaging changes, but there are no Debian changes yet. Next, add the Salsa repository as a new remote called origin, the same as the default remote name in git.
This is an important preparation step to later be able to create a Merge Request on Salsa that targets the debian/latest branch, which does not yet have any debian/ directory.
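A sketch of adding that remote; replace the placeholder with your own Salsa namespace:
git remote add origin git@salsa.debian.org:<your-username>/entr.git
# publish the branches so a later Merge Request can target debian/latest
git push --set-upstream origin upstream/latest debian/latest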
Launch a Debian Sid (unstable) container to run builds in
To ensure that all packaging tools are of the latest versions, run everything inside a fresh Sid container. This has two benefits: you are guaranteed to have the most up-to-date toolchain, and your host system stays clean without getting polluted by various extra packages. Additionally, this approach works even if your host system is not Debian/Ubuntu.
cd ..
podman run --interactive --tty --rm --shm-size=1G --cap-add SYS_PTRACE \
--env='DEB*' --volume=$PWD:/tmp/test --workdir=/tmp/test debian:sid bash
Note that the container should be started from the parent directory of the git repository, not inside it. The --volume parameter will loop-mount the current directory inside the container. Thus all files created and modified are on the host system, and will persist after the container shuts down.
Once inside the container, install the basic dependencies:
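One plausible set of packages for this workflow (the exact list is an assumption and can be extended as needed):
apt update -q && apt install -q --yes git-buildpackage dpkg-dev dh-make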
Automate creating the debian/ files with dh-make
To create the files needed for the actual Debian packaging, use dh_make:
# dh_make --packagename entr_5.6 --single --createorig
Maintainer Name : Otto Kekäläinen
Email-Address : otto@debian.org
Date : Sat, 15 Feb 2025 01:17:51 +0000
Package Name : entr
Version : 5.6
License : blank
Package Type : single
Are the details correct? [Y/n/q]
Done. Please edit the files in the debian/ subdirectory now.
Due to how dh_make works, the package name and version need to be written as a single underscore separated string. In this case, you should choose --single to specify that the package type is a single binary package. Other options would be --library for library packages (see libgda5 sources as an example) or --indep (see dns-root-data sources as an example). The --createorig will create a mock upstream release tarball (entr_5.6.orig.tar.xz) from the current release directory, which is necessary due to historical reasons and how dh_make worked before git repositories became common and Debian source packages were based off upstream release tarballs (e.g. *.tar.gz).
At this stage, a debian/ directory has been created with template files, and you can start modifying the files and iterating towards actual working packaging.
git add debian/
git commit -a -m "Initial Debian packaging"
Review the files
The full list of files after the above steps with dh_make would be:
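An indicative listing (not the exact dh_make output, which varies by version) would be along these lines:
debian/README.Debian
debian/README.source
debian/changelog
debian/control
debian/copyright
debian/rules
debian/source/format
debian/manpage.1.ex
debian/postinst.ex
debian/postrm.ex
debian/preinst.ex
debian/prerm.ex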
You can browse these files in the demo repository.
The mandatory files in the debian/ directory are:
changelog,
control,
copyright,
and rules.
All the other files have been created for convenience so the packager has template files to work from. The files with the suffix .ex are example files that won't have any effect until their content is adjusted and the suffix removed.
For detailed explanations of the purpose of each file in the debian/ subdirectory, see the following resources:
The Debian Policy Manual: Describes the structure of the operating system, the package archive and requirements for packages to be included in the Debian archive.
The Developer's Reference: A collection of best practices and process descriptions Debian packagers are expected to follow while interacting with one another.
Debhelper man pages: Detailed information of how the Debian package build system works, and how the contents of the various files in debian/ affect the end result.
As Entr, the package used in this example, is a real package that already exists in the Debian archive, you may want to browse the actual Debian packaging source at https://salsa.debian.org/debian/entr/-/tree/debian/latest/debian for reference.
Most of these files have standardized formatting conventions to make collaboration easier. To automatically format the files following the most popular conventions, simply run wrap-and-sort -vast or debputy reformat --style=black.
Identify build dependencies
The most common reason for builds to fail is missing dependencies. The easiest way to identify which Debian package ships the required dependency is using apt-file. If, for example, a build fails complaining that pcre2posix.h cannot be found or that libpcre2-posix.so is missing, you can use these commands:
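For example (the output lines are indicative of what apt-file typically prints):
apt-file update
apt-file search pcre2posix.h
# libpcre2-dev: /usr/include/pcre2posix.h
apt-file search libpcre2-posix.so
# libpcre2-dev: /usr/lib/x86_64-linux-gnu/libpcre2-posix.so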
The output above implies that the debian/control should be extended to define a Build-Depends: libpcre2-dev relationship.
There is also dpkg-depcheck that uses strace to trace the files the build process tries to access, and lists what Debian packages those files belong to. Example usage:
dpkg-depcheck -b debian/rules build
Build the Debian sources to generate the .deb package
After the first pass of refining the contents of the files in debian/, test the build by running dpkg-buildpackage inside the container:
dpkg-buildpackage -uc -us -b
The options -uc -us will skip signing the resulting Debian source package and other build artifacts. The -b option will skip creating a source package and only build the (binary) *.deb packages.
The output is very verbose and gives a large amount of context about what is happening during the build to make debugging build failures easier. In the build log of entr you will see for example the line dh binary --buildsystem=makefile. This and other dh commands can also be run manually if there is a need to quickly repeat only a part of the build while debugging build failures.
To see what files were generated or modified by the build simply run git status --ignored:
$ git status --ignored
On branch debian/latest
Untracked files:
(use "git add <file>..." to include in what will be committed)
debian/debhelper-build-stamp
debian/entr.debhelper.log
debian/entr.substvars
debian/files
Ignored files:
(use "git add -f <file>..." to include in what will be committed)
Makefile
compat.c
compat.o
debian/.debhelper/
debian/entr/
entr
entr.o
status.o
Re-running dpkg-buildpackage will include running the command dh clean, which, assuming it is configured correctly in the debian/rules file, will reset the source directory to its original pristine state. The same can of course also be done with regular git commands: git reset --hard; git clean -fdx. To avoid accidentally committing unnecessary build artifacts in git, a debian/.gitignore can be useful, and it would typically include all four files listed as untracked above.
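For example, a debian/.gitignore covering those four untracked files could contain:
debhelper-build-stamp
entr.debhelper.log
entr.substvars
files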
After a successful build you would have the following files:
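Roughly the following, when building only binary packages with -b (exact names depend on version and architecture):
debian/entr/          (staging directory for the binary package contents)
../entr_5.6-1_amd64.deb
../entr-dbgsym_5.6-1_amd64.deb
../entr_5.6-1_amd64.buildinfo
../entr_5.6-1_amd64.changes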
The contents of debian/entr are essentially what goes into the resulting entr_5.6-1_amd64.deb package. Familiarizing yourself with the majority of the files in the original upstream source as well as all the resulting build artifacts is time consuming, but it is a necessary investment to get high-quality Debian packages.
There are also tools such as Debcraft that automate generating the build artifacts in separate output directories for each build, thus making it easy to compare the changes to correlate what change in the Debian packaging led to what change in the resulting build artifacts.
Re-run the initial import with git-buildpackage
When upstreams publish releases as tarballs, they should also be imported for optimal software supply-chain security, in particular if upstream also publishes cryptographic signatures that can be used to verify the authenticity of the tarballs.
To achieve this, the files debian/watch, debian/upstream/signing-key.asc, and debian/gbp.conf need to be present with the correct options. In the gbp.conf file, ensure you have the correct options based on:
Does upstream release tarballs? If so, enforce pristine-tar = True.
Does upstream sign the tarballs? If so, configure explicit signature checking with upstream-signatures = on.
Does upstream have a git repository, and does it have release git tags? If so, configure the release git tag format, e.g. upstream-vcs-tag = %(version%~%.)s.
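Putting the options above together, a minimal gbp.conf sketch could look like this (the branch names follow the DEP-14 layout used in this post; treat the values as an example rather than a canonical configuration):
[DEFAULT]
debian-branch = debian/latest
upstream-branch = upstream/latest
pristine-tar = True
upstream-signatures = on
upstream-vcs-tag = %(version%~%.)s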
To validate that the above files are working correctly, run gbp import-orig with the current version explicitly defined:
$ gbp import-orig --uscan --upstream-version 5.6
gbp:info: Launching uscan...
gpgv: Signature made 7. Aug 2024 07.43.27 PDT
gpgv: using RSA key 519151D83E83D40A232B4D615C418B8631BC7C26
gpgv: Good signature from "Eric Radman <ericshane@eradman.com>"
gbp:info: Using uscan downloaded tarball ../entr_5.6.orig.tar.gz
gbp:info: Importing '../entr_5.6.orig.tar.gz' to branch 'upstream/latest'...
gbp:info: Source package is entr
gbp:info: Upstream version is 5.6
gbp:info: Replacing upstream source on 'debian/latest'
gbp:info: Running Postimport hook
gbp:info: Successfully imported version 5.6 of ../entr_5.6.orig.tar.gz
As the original packaging was done based on the upstream release git tag, the above command will fetch the tarball release, create the pristine-tar branch, and store the tarball delta on it. This command will also attempt to create the tag upstream/5.6 on the upstream/latest branch.
Import new upstream versions in the future
Forking the upstream git repository, creating the initial packaging, and creating the DEP-14 branch structure are all one-off work needed only when creating the initial packaging.
Going forward, to import new upstream releases, one would simply run git fetch upstreamvcs; gbp import-orig --uscan, which fetches the upstream git tags, checks for new upstream tarballs, and automatically downloads, verifies, and imports the new version. See the galera-4-demo example in the Debian source packages in git explained post as a demo you can try running yourself and examine in detail.
You can also try running gbp import-orig --uscan without specifying a version. It will notice that Entr version 5.7 is now available, fetch it, and import it.
Build using git-buildpackage
From this stage onwards you should build the package using gbp buildpackage, which will do a more comprehensive build.
gbp buildpackage -uc -us
The git-buildpackage build also includes running Lintian to find potential Debian policy violations in the sources or in the resulting .deb binary packages. Many Debian Developers run lintian -EviIL +pedantic after every build to check that there are no new nags, and to validate that changes intended to address previous Lintian nags were correct.
Open a Merge Request on Salsa for Debian packaging review
Getting everything perfectly right takes a lot of effort, and may require reaching out to an experienced Debian Developer for review and guidance. Thus, you should aim to publish your initial packaging work on Salsa, Debian's GitLab instance, for review and feedback as early as possible.
For somebody to be able to easily see what you have done, you should rename your debian/latest branch to another name, for example next/debian/latest, and open a Merge Request that targets the debian/latest branch on your Salsa fork, which still has only the unmodified upstream files.
If you have followed the workflow in this post so far, you can simply run:
git checkout -b next/debian/latest
git push --set-upstream origin next/debian/latest
Open in a browser the URL visible in the git remote response
Write the Merge Request description in case the default text from your commit is not enough
Mark the MR as Draft using the checkbox
Publish the MR and request feedback
Once a Merge Request exists, discussion regarding what additional changes are needed can be conducted as MR comments. With an MR, you can easily iterate on the contents of next/debian/latest, rebase, force push, and request re-review as many times as you want.
While at it, make sure the Settings > CI/CD page has, under "CI/CD configuration file", the value debian/salsa-ci.yml so that the CI can run and give you immediate automated feedback.
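A minimal debian/salsa-ci.yml typically just includes the Salsa CI team's pipeline; the exact include path below is an assumption to verify against their current documentation:
include:
  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/recipes/debian.yml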
For an example of an initial packaging Merge Request, see https://salsa.debian.org/otto/entr-demo/-/merge_requests/1.
Open a Merge Request / Pull Request to fix upstream code
Due to the high quality requirements in Debian, it is fairly common that while doing the initial Debian packaging of an open source project, issues are found that stem from the upstream source code. While it is possible to carry extra patches in Debian, it is not good practice to deviate too much from upstream code with custom Debian patches. Instead, the Debian packager should try to get the fixes applied directly upstream.
Using git-buildpackage patch queues is the most convenient way to make modifications to the upstream source code so that they automatically convert into Debian patches (stored at debian/patches), and can also easily be submitted upstream as any regular git commit (and rebased and resubmitted many times over).
First, decide if you want to work out of the upstream development branch and later cherry-pick to the Debian packaging branch, or work out of the Debian packaging branch and cherry-pick to an upstream branch.
The example below starts from the upstream development branch and then cherry-picks the commit into the git-buildpackage patch queue:
git checkout -b bugfix-branch master
nano entr.c
make
./entr # verify change works as expected
git commit -a -m "Commit title" -m "Commit body"
git push # submit upstream
gbp pq import --force --time-machine=10
git cherry-pick <commit id>
git commit --amend # extend commit message with DEP-3 metadata
gbp buildpackage -uc -us -b
./entr # verify change works as expected
gbp pq export --drop --commit
git commit --amend # Write commit message along lines "Add patch to .."
The example below starts by making the fix on a git-buildpackage patch queue branch, and then cherry-picking it onto the upstream development branch:
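A sketch of that reverse order, mirroring the commands above (details may vary per project):
gbp pq import --force --time-machine=10
nano entr.c
git commit -a -m "Commit title" -m "Commit body"
gbp buildpackage -uc -us -b
./entr # verify change works as expected
gbp pq export --drop --commit
git commit --amend # Write commit message along lines "Add patch to .."
git checkout master # switch to the upstream development branch
git cherry-pick <commit id>
git push # submit upstream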
These can be run at any time, regardless of whether any debian/patches existed prior, whether existing patches applied cleanly or not, or whether there were old patch queue branches around. Note that the extra -b in gbp buildpackage -uc -us -b instructs it to build only binary packages, avoiding any nags from dpkg-source about there being modifications in the upstream sources while building in the patches-applied mode.
Programming-language specific dh-make alternatives
As each programming language has its specific way of building the source code, and many other conventions regarding the file layout and more, Debian has multiple custom tools to create new Debian source packages for specific programming languages.
Notably, Python does not have its own tool, but there is a dh_make --python option for Python support directly in dh_make itself. The list is not complete and many more tools exist. For some languages there are even competing options; for Go, for example, there is Gophian in addition to dh-make-golang.
When learning Debian packaging, there is no need to learn these tools upfront. Being aware that they exist is enough, and one can learn them only if and when one starts to package a project in a new programming language.
The difference between source git repository vs source packages vs binary packages
As seen in the earlier example, running gbp buildpackage on the Entr packaging repository above will result in several files:
The entr_5.6-1_amd64.deb is the binary package, which can be installed on a Debian/Ubuntu system. The rest of the files constitute the source package. To do a source-only build, run gbp buildpackage -S and note the files produced:
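Typically something like the following (exact names depend on the version and whether upstream signatures are available):
../entr_5.6-1.dsc
../entr_5.6.orig.tar.gz
../entr_5.6.orig.tar.gz.asc
../entr_5.6-1.debian.tar.xz
../entr_5.6-1_source.buildinfo
../entr_5.6-1_source.changes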
The source package files can be used to build the binary .deb for amd64, or any architecture that the package supports. It is important to grasp that the Debian source package is the preferred form to be able to build the binary packages on various Debian build systems, and the Debian source package is not the same thing as the Debian packaging git repository contents.
If the package is large and complex, the build could result in multiple binary packages. One set of package definition files in debian/ will however only ever result in a single source package.
Option to repackage source packages with Files-Excluded lists in the debian/copyright file
Some upstream projects may include binary files in their release, or other undesirable content that needs to be omitted from the source package in Debian. The easiest way to filter them out is by adding to the debian/copyright file a Files-Excluded field listing the undesired files. The debian/copyright file is read by uscan, which will repackage the upstream sources on-the-fly when importing new upstream releases.
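A generic illustration of the field (the paths are made up, and this is not taken from any particular package):
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: example
Files-Excluded:
    thirdparty/bundled-library/*
    docs/*.pdf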
For a real-life example, see the debian/copyright files in the Godot package that lists:
The resulting repackaged upstream source tarball, as well as the upstream version component, will have an extra +ds to signify that it is not the true original upstream source but has been modified by Debian:
godot_4.3+ds.orig.tar.xz
godot_4.3+ds-1_amd64.deb
Creating one Debian source package from multiple upstream source packages is also possible
In some rare cases the upstream project may be split across multiple git repositories or the upstream release may consist of multiple components each in their own separate tarball. Usually these are very large projects that get some benefits from releasing components separately. If in Debian these are deemed to go into a single source package, it is technically possible using the component system in git-buildpackage and uscan. For an example see the gbp.conf and watch files in the node-cacache package.
Using this type of structure should be a last resort, as it creates complexity and inter-dependencies that are bound to cause issues later on. It is usually better to work with upstream and champion universal best practices with clear releases and version schemes.
When not to start the Debian packaging repository as a fork of the upstream one
Not all upstreams use Git for version control. It is by far the most popular, but there are still some that use e.g. Subversion or Mercurial. Who knows, maybe in the future some new version control systems will start to compete with Git. There are also projects that use Git in massive monorepos and with complex submodule setups that invalidate the basic assumptions required to map an upstream Git repository into a Debian packaging repository.
In those cases one can't use a debian/latest branch on a clone of the upstream git repository as the starting point for the Debian packaging; instead, one must revert to the traditional way of starting from an upstream release tarball with gbp import-orig package-1.0.tar.gz.
Conclusion
Created in August 1993, Debian is one of the oldest Linux distributions. In the 32 years since its inception, the .deb packaging format and the tooling to work with it have evolved several generations. In the past 10 years, more and more Debian Developers have converged on certain core practices, as evidenced by https://trends.debian.net/, but there is still a lot of variance in workflows even for identical tasks. Hopefully, you find this post useful in giving practical guidance on how exactly to do the most common things when packaging software for Debian.
Happy packaging!
In this post, I demonstrate the optimal workflow for creating new Debian packages in 2025, preserving the upstream git history. The motivation for this is to lower the barrier for sharing improvements to and from upstream, and to improve software provenance and supply-chain security by making it easy to inspect every change at any level using standard git tooling.
Key elements of this workflow include:
Using a Git fork/clone of the upstream repository as the starting point for creating Debian packaging repositories.
Consistent use of the same git-buildpackage commands, with all package-specific options in gbp.conf.
Pristine-tar and upstream signatures for supply-chain security.
Use of Files-Excluded in the debian/copyright file to filter out unwanted files in Debian.
Patch queues to easily rebase and cherry-pick changes across Debian and upstream branches.
Efficient use of Salsa, Debian s GitLab instance, for both automated feedback from CI systems and human feedback from peer reviews.
To make the instructions so concrete that anyone can repeat all the steps themselves on a real package, I demonstrate the steps by packaging the command-line tool Entr. It is written in C, has very few dependencies, and its final Debian source package structure is simple, yet exemplifies all the important parts that go into a complete Debian package:
Creating a new packaging repository and publishing it under your personal namespace on salsa.debian.org.
Using dh_make to create the initial Debian packaging.
Posting the first draft of the Debian packaging as a Merge Request (MR) and using Salsa CI to verify Debian packaging quality.
Running local builds efficiently and iterating on the packaging process.
Create new Debian packaging repository from the existing upstream project git repository
First, create a new empty directory, then clone the upstream Git repository inside it:
Using a clean directory makes it easier to inspect the build artifacts of a Debian package, which will be output in the parent directory of the Debian source directory.
The extra parameters given to git clone lay the foundation for the Debian packaging git repository structure where the upstream git remote name is upstreamvcs. Only the upstream main branch is tracked to avoid cluttering git history with upstream development branches that are irrelevant for packaging in Debian.
Next, enter the git repository directory and list the git tags. Pick the latest upstream release tag as the commit to start the branch upstream/latest. This latest refers to the upstream release, not the upstream development branch. Immediately after, branch off the debian/latest branch, which will have the actual Debian packaging files in the debian/ subdirectory.
shellcd entr
git tag # shows the latest upstream release tag was '5.6'
git checkout -b upstream/latest 5.6
git checkout -b debian/latest
cd entr
git tag # shows the latest upstream release tag was '5.6'git checkout -b upstream/latest 5.6
git checkout -b debian/latest
At this point, the repository is structured according to DEP-14 conventions, ensuring a clear separation between upstream and Debian packaging changes, but there are no Debian changes yet. Next, add the Salsa repository as a new remote which called origin, the same as the default remote name in git.
This is an important preparation step to later be able to create a Merge Request on Salsa that targets the debian/latest branch, which does not yet have any debian/ directory.
Launch a Debian Sid (unstable) container to run builds in
To ensure that all packaging tools are of the latest versions, run everything inside a fresh Sid container. This has two benefits: you are guaranteed to have the most up-to-date toolchain, and your host system stays clean without getting polluted by various extra packages. Additionally, this approach works even if your host system is not Debian/Ubuntu.
cd ..
podman run --interactive --tty --rm --shm-size=1G --cap-add SYS_PTRACE \
--env='DEB*' --volume=$PWD:/tmp/test --workdir=/tmp/test debian:sid bash
Note that the container should be started from the parent directory of the git repository, not inside it. The --volume parameter will loop-mount the current directory inside the container. Thus all files created and modified are on the host system, and will persist after the container shuts down.
Once inside the container, install the basic dependencies:
Automate creating the debian/ files with dh-make
To create the files needed for the actual Debian packaging, use dh_make:
shell# dh_make --packagename entr_5.6 --single --createorig
Maintainer Name : Otto Kek l inen
Email-Address : otto@debian.org
Date : Sat, 15 Feb 2025 01:17:51 +0000
Package Name : entr
Version : 5.6
License : blank
Package Type : single
Are the details correct? [Y/n/q]
Done. Please edit the files in the debian/ subdirectory now.
# dh_make --packagename entr_5.6 --single --createorigMaintainer Name : Otto Kek l inen
Email-Address : otto@debian.org
Date : Sat, 15 Feb 2025 01:17:51 +0000
Package Name : entr
Version : 5.6
License : blank
Package Type : single
Are the details correct? [Y/n/q]Done. Please edit the files in the debian/ subdirectory now.
Due to how dh_make works, the package name and version need to be written as a single underscore separated string. In this case, you should choose --single to specify that the package type is a single binary package. Other options would be --library for library packages (see libgda5 sources as an example) or --indep (see dns-root-data sources as an example). The --createorig will create a mock upstream release tarball (entr_5.6.orig.tar.xz) from the current release directory, which is necessary due to historical reasons and how dh_make worked before git repositories became common and Debian source packages were based off upstream release tarballs (e.g. *.tar.gz).
At this stage, a debian/ directory has been created with template files, and you can start modifying the files and iterating towards actual working packaging.
shellgit add debian/
git commit -a -m "Initial Debian packaging"
git add debian/
git commit -a -m "Initial Debian packaging"
Review the files
The full list of files after the above steps with dh_make would be:
You can browse these files in the demo repository.
The mandatory files in the debian/ directory are:
changelog,
control,
copyright,
and rules.
All the other files have been created for convenience so the packager has template files to work from. The files with the suffix .ex are example files that won t have any effect until their content is adjusted and the suffix removed.
For detailed explanations of the purpose of each file in the debian/ subdirectory, see the following resources:
The Debian Policy Manual: Describes the structure of the operating system, the package archive and requirements for packages to be included in the Debian archive.
The Developer s Reference: A collection of best practices and process descriptions Debian packagers are expected to follow while interacting with one another.
Debhelper man pages: Detailed information of how the Debian package build system works, and how the contents of the various files in debian/ affect the end result.
As Entr, the package used in this example, is a real package that already exists in the Debian archive, you may want to browse the actual Debian packaging source at https://salsa.debian.org/debian/entr/-/tree/debian/latest/debian for reference.
Most of these files have standardized formatting conventions to make collaboration easier. To automatically format the files following the most popular conventions, simply run wrap-and-sort -vast or debputy reformat --style=black.
Identify build dependencies
The most common reason for builds to fail is missing dependencies. The easiest way to identify which Debian package ships the required dependency is using apt-file. If, for example, a build fails complaining that pcre2posix.h cannot be found or that libcre2-posix.so is missing, you can use these commands:
The output above implies that the debian/control should be extended to define a Build-Depends: libpcre2-dev relationship.
There is also dpkg-depcheck that uses strace to trace the files the build process tries to access, and lists what Debian packages those files belong to. Example usage:
shelldpkg-depcheck -b debian/rules build
dpkg-depcheck -b debian/rules build
Build the Debian sources to generate the .deb package
After the first pass of refining the contents of the files in debian/, test the build by running dpkg-buildpackage inside the container:
shelldpkg-buildpackage -uc -us -b
dpkg-buildpackage -uc -us -b
The options -uc -us will skip signing the resulting Debian source package and other build artifacts. The -b option will skip creating a source package and only build the (binary) *.deb packages.
The output is very verbose and gives a large amount of context about what is happening during the build to make debugging build failures easier. In the build log of entr you will see for example the line dh binary --buildsystem=makefile. This and other dh commands can also be run manually if there is a need to quickly repeat only a part of the build while debugging build failures.
To see what files were generated or modified by the build simply run git status --ignored:
shell$ git status --ignored
On branch debian/latest
Untracked files:
(use "git add <file>..." to include in what will be committed)
debian/debhelper-build-stamp
debian/entr.debhelper.log
debian/entr.substvars
debian/files
Ignored files:
(use "git add -f <file>..." to include in what will be committed)
Makefile
compat.c
compat.o
debian/.debhelper/
debian/entr/
entr
entr.o
status.o
Re-running dpkg-buildpackage will include running the command dh clean, which, assuming it is configured correctly in the debian/rules file, will reset the source directory to its original pristine state. The same can of course also be done with the regular git commands git reset --hard; git clean -fdx. To avoid accidentally committing unnecessary build artifacts in git, a debian/.gitignore can be useful; it would typically include all four files listed as untracked above.
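For instance, a debian/.gitignore covering exactly the four untracked files listed above would be:
debian/debhelper-build-stamp
debian/entr.debhelper.log
debian/entr.substvars
debian/files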
After a successful build you would have the following files:
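The original listing is not reproduced here, but for a binary-only build of entr 5.6-1 on amd64 the new files would typically be along these lines (exact names may differ):
../entr_5.6-1_amd64.deb            # the binary package itself
../entr-dbgsym_5.6-1_amd64.deb     # debug symbols, usually generated automatically
../entr_5.6-1_amd64.buildinfo
../entr_5.6-1_amd64.changes
debian/entr/                       # staging tree whose contents become the .deb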
The contents of debian/entr are essentially what goes into the resulting entr_5.6-1_amd64.deb package. Familiarizing yourself with the majority of the files in the original upstream source as well as all the resulting build artifacts is time consuming, but it is a necessary investment to get high-quality Debian packages.
There are also tools such as Debcraft that automate generating the build artifacts in separate output directories for each build, thus making it easy to compare the changes to correlate what change in the Debian packaging led to what change in the resulting build artifacts.
Re-run the initial import with git-buildpackage
When upstreams publish releases as tarballs, they should also be imported for optimal software supply-chain security, in particular if upstream also publishes cryptographic signatures that can be used to verify the authenticity of the tarballs.
To achieve this, the files debian/watch, debian/upstream/signing-key.asc, and debian/gbp.conf need to be present with the correct options. In the gbp.conf file, ensure you have the correct options based on the following (a combined example is shown after the list):
Does upstream release tarballs? If so, enforce pristine-tar = True.
Does upstream sign the tarballs? If so, configure explicit signature checking with upstream-signatures = on.
Does upstream have a git repository, and does it have release git tags? If so, configure the release git tag format, e.g. upstream-vcs-tag = %(version%~%.)s.
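Putting the three options together, a minimal debian/gbp.conf could look like the sketch below; the branch names follow the DEP-14 layout used elsewhere in this post, and all values are assumptions that must match the actual upstream project:
[DEFAULT]
debian-branch = debian/latest
upstream-branch = upstream/latest
pristine-tar = True
upstream-signatures = on
upstream-vcs-tag = %(version%~%.)s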
To validate that the above files are working correctly, run gbp import-orig with the current version explicitly defined:
$ gbp import-orig --uscan --upstream-version 5.6
gbp:info: Launching uscan...
gpgv: Signature made 7. Aug 2024 07.43.27 PDT
gpgv: using RSA key 519151D83E83D40A232B4D615C418B8631BC7C26
gpgv: Good signature from "Eric Radman <ericshane@eradman.com>"
gbp:info: Using uscan downloaded tarball ../entr_5.6.orig.tar.gz
gbp:info: Importing '../entr_5.6.orig.tar.gz' to branch 'upstream/latest'...
gbp:info: Source package is entr
gbp:info: Upstream version is 5.6
gbp:info: Replacing upstream source on 'debian/latest'
gbp:info: Running Postimport hook
gbp:info: Successfully imported version 5.6 of ../entr_5.6.orig.tar.gz
As the original packaging was done based on the upstream release git tag, the above command will fetch the tarball release, create the pristine-tar branch, and store the tarball delta on it. This command will also attempt to create the tag upstream/5.6 on the upstream/latest branch.
Import new upstream versions in the future
Forking the upstream git repository, creating the initial packaging, and creating the DEP-14 branch structure are all one-off work needed only when creating the initial packaging.
Going forward, to import new upstream releases, one would simply run git fetch upstreamvcs; gbp import-orig --uscan, which fetches the upstream git tags, checks for new upstream tarballs, and automatically downloads, verifies, and imports the new version. See the galera-4-demo example in the Debian source packages in git explained post as a demo you can try running yourself and examine in detail.
You can also try running gbp import-orig --uscan without specifying a version. It will notice that a newer Entr version (5.7) is available, download it, and import it.
Build using git-buildpackage
From this stage onwards you should build the package using gbp buildpackage, which will do a more comprehensive build.
gbp buildpackage -uc -us
The git-buildpackage build also includes running Lintian to find potential Debian policy violations in the sources or in the resulting .deb binary packages. Many Debian Developers run lintian -EviIL +pedantic after every build to check that there are no new nags, and to validate that changes intended to address previous Lintian nags actually worked.
Open a Merge Request on Salsa for Debian packaging review
Getting everything perfectly right takes a lot of effort, and may require reaching out to experienced Debian Developers for review and guidance. Thus, you should aim to publish your initial packaging work on Salsa, Debian's GitLab instance, for review and feedback as early as possible.
For somebody to be able to easily see what you have done, you should rename your debian/latest branch to another name, for example next/debian/latest, and open a Merge Request that targets the debian/latest branch on your Salsa fork, which still has only the unmodified upstream files.
If you have followed the workflow in this post so far, you can simply run:
git checkout -b next/debian/latest
git push --set-upstream origin next/debian/latest
Open in a browser the URL visible in the git remote response
Write the Merge Request description in case the default text from your commit is not enough
Mark the MR as Draft using the checkbox
Publish the MR and request feedback
Once a Merge Request exists, discussion regarding what additional changes are needed can be conducted as MR comments. With an MR, you can easily iterate on the contents of next/debian/latest, rebase, force push, and request re-review as many times as you want.
While at it, make sure the Settings > CI/CD page has under CI/CD configuration file the value debian/salsa-ci.yml so that the CI can run and give you immediate automated feedback.
For an example of an initial packaging Merge Request, see https://salsa.debian.org/otto/entr-demo/-/merge_requests/1.
Open a Merge Request / Pull Request to fix upstream code
Due to the high quality requirements in Debian, it is fairly common that while doing the initial Debian packaging of an open source project, issues are found that stem from the upstream source code. While it is possible to carry extra patches in Debian, it is not good practice to deviate too much from upstream code with custom Debian patches. Instead, the Debian packager should try to get the fixes applied directly upstream.
Using git-buildpackage patch queues is the most convenient way to make modifications to the upstream source code so that they automatically convert into Debian patches (stored at debian/patches), and can also easily be submitted upstream as any regular git commit (and rebased and resubmitted many times over).
First, decide if you want to work out of the upstream development branch and later cherry-pick to the Debian packaging branch, or work out of the Debian packaging branch and cherry-pick to an upstream branch.
The example below starts from the upstream development branch and then cherry-picks the commit into the git-buildpackage patch queue:
git checkout -b bugfix-branch master
nano entr.c
make
./entr # verify change works as expected
git commit -a -m "Commit title" -m "Commit body"
git push # submit upstream
gbp pq import --force --time-machine=10
git cherry-pick <commit id>
git commit --amend # extend commit message with DEP-3 metadata
gbp buildpackage -uc -us -b
./entr # verify change works as expected
gbp pq export --drop --commit
git commit --amend # Write commit message along lines "Add patch to .."
The example below starts by making the fix on a git-buildpackage patch queue branch, and then cherry-picking it onto the upstream development branch:
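The original command listing is not reproduced here; a sketch of that reverse flow, reusing the same commands as above, could look roughly like this:
gbp pq import --force --time-machine=10   # create or refresh the patch-queue branch
nano entr.c
make
./entr                                    # verify the change works as expected
git commit -a -m "Commit title" -m "Commit body"
git commit --amend                        # extend the commit message with DEP-3 metadata
gbp pq export --drop --commit             # store the fix as a file under debian/patches
git commit --amend                        # Write commit message along lines "Add patch to .."
gbp buildpackage -uc -us -b
./entr                                    # verify the packaged build works as expected
git checkout -b bugfix-branch master      # now take the same fix to the upstream branch
git cherry-pick <commit id>               # id of the commit made on the patch-queue branch
git push                                  # submit upstream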
These commands can be run at any time, regardless of whether any debian/patches existed before, whether existing patches applied cleanly, or whether there were old patch queue branches around. Note that the extra -b in gbp buildpackage -uc -us -b instructs it to build only binary packages, avoiding any nags from dpkg-source about modifications in the upstream sources while building in patches-applied mode.
Programming-language specific dh-make alternatives
As each programming language has its specific way of building the source code, and many other conventions regarding the file layout and more, Debian has multiple custom tools to create new Debian source packages for specific programming languages.
Notably, Python does not have its own tool, but there is a dh_make --python option for Python support directly in dh_make itself. The list is not complete and many more tools exist. For some languages there are even competing options; for Go, for example, there is Gophian in addition to dh-make-golang.
When learning Debian packaging, there is no need to learn these tools upfront. Being aware that they exist is enough, and one can learn them only if and when one starts to package a project in a new programming language.
The difference between the source git repository, source packages and binary packages
As seen in the earlier example, running gbp buildpackage on the Entr packaging repository above will result in several files:
The entr_5.6-1_amd64.deb is the binary package, which can be installed on a Debian/Ubuntu system. The rest of the files constitute the source package. To do a source-only build, run gbp buildpackage -S and note the files produced:
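The listing is omitted above; a source-only build of this package would leave roughly the following files in the parent directory (names illustrative):
entr_5.6-1.dsc                 # source package control file
entr_5.6.orig.tar.gz           # upstream tarball
entr_5.6.orig.tar.gz.asc       # upstream signature, when available
entr_5.6-1.debian.tar.xz       # the debian/ directory packed as a tarball
entr_5.6-1_source.buildinfo
entr_5.6-1_source.changes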
The source package files can be used to build the binary .deb for amd64, or any architecture that the package supports. It is important to grasp that the Debian source package is the preferred form to be able to build the binary packages on various Debian build systems, and the Debian source package is not the same thing as the Debian packaging git repository contents.
If the package is large and complex, the build could result in multiple binary packages. One set of package definition files in debian/ will however only ever result in a single source package.
Option to repackage source packages with Files-Excluded lists in the debian/copyright file
Some upstream projects may include binary files in their release, or other undesirable content that needs to be omitted from the source package in Debian. The easiest way to filter them out is by adding to the debian/copyright file a Files-Excluded field listing the undesired files. The debian/copyright file is read by uscan, which will repackage the upstream sources on-the-fly when importing new upstream releases.
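Before looking at a real example, here is a generic sketch of the field; the paths are invented for illustration and the field lives in the header paragraph of debian/copyright:
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: example
Source: https://example.org/example
Files-Excluded:
    thirdparty/bundled-lib/*
    docs/*.pdf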
For a real-life example, see the debian/copyright files in the Godot package that lists:
The resulting repackaged upstream source tarball, as well as the upstream version component, will have an extra +ds to signify that it is not the true original upstream source but has been modified by Debian:
godot_4.3+ds.orig.tar.xz
godot_4.3+ds-1_amd64.deb
Creating one Debian source package from multiple upstream source packages is also possible
In some rare cases the upstream project may be split across multiple git repositories or the upstream release may consist of multiple components each in their own separate tarball. Usually these are very large projects that get some benefits from releasing components separately. If in Debian these are deemed to go into a single source package, it is technically possible using the component system in git-buildpackage and uscan. For an example see the gbp.conf and watch files in the node-cacache package.
Using this type of structure should be a last resort, as it creates complexity and inter-dependencies that are bound to cause issues later on. It is usually better to work with upstream and champion universal best practices with clear releases and version schemes.
When not to start the Debian packaging repository as a fork of the upstream one
Not all upstreams use Git for version control. It is by far the most popular, but there are still some that use e.g. Subversion or Mercurial. Who knows, maybe in the future some new version control system will start to compete with Git. There are also projects that use Git in massive monorepos and with complex submodule setups that invalidate the basic assumptions required to map an upstream Git repository into a Debian packaging repository.
In those cases one can't use a debian/latest branch on a clone of the upstream git repository as the starting point for the Debian packaging, but must revert to the traditional way of starting from an upstream release tarball with gbp import-orig package-1.0.tar.gz.
Conclusion
Created in August 1993, Debian is one of the oldest Linux distributions. In the 32 years since inception, the .deb packaging format and the tooling to work with it have evolved several generations. In the past 10 years, more and more Debian Developers have converged on certain core practices evidenced by https://trends.debian.net/, but there is still a lot of variance in workflows even for identical tasks. Hopefully, you find this post useful in giving practical guidance on how exactly to do the most common things when packaging software for Debian.
Happy packaging!
As you may recall from previous posts and elsewhere I have been busy writing a new solver for APT.
Today I want to share some of the latest changes in how to approach solving.
The idea for the solver was that manually installed packages are always protected from removal; in terms of SAT solving, they are facts. Automatically installed packages become optional unit clauses. Optional clauses are solved after manual ones, and they don't partake in normal unit propagation.
This worked fine, say you had
A # install request for A
B # manually installed, keep it
A depends on: conflicts-B C
Installing A on a system with B installed resulted in C being installed, as the solver was not allowed to install the conflicts-B package while B was installed.
However, I also introduced a mode to allow removing manually installed packages, and that's where it broke down: now, instead of B being a fact, our clauses looked like:
A # install request for A
A depends on: conflicts-B C
Optional: B # try to keep B installed
As a result, we installed conflicts-B and removed B; the steps the solver takes are:
A is a fact, mark it
A depends on: conflicts-B C is the strongest clause, try to install conflicts-B
We unit propagate that conflicts-B conflicts with B, so we mark not B
Optional: B is reached, but it is not satisfiable; ignore it because it's optional.
This isn't correct: just because we allow removing manually installed packages doesn't mean that we should remove manually installed packages if we don't need to.
Fixing this turns out to be surprisingly easy. In addition to adding our optional (soft) clauses, let's first assume all of them!
But to explain how this works, we first need to explain some terminology:
The solver operates on a stack of decisions
enqueue means a fact is being added at the current decision level, and enqueued for propagation
assume bumps the decision level, and then enqueues the assumed variable
propagate looks at all the facts and sees if any clause becomes unit, and then enqueues it
unit is when a clause has a single literal left to assign
To illustrate this in pseudo Python code:
We introduce all our facts, and if they conflict, we are unsat:
for fact in facts:
    enqueue(fact)
    if not propagate():
        return False
For each optional literal, we register a soft clause and assume it. If the assumption fails,
we ignore it. If it succeeds, but propagation fails, we undo the assumption.
for optionalLiteral in optionalLiterals:
    registerClause(SoftClause([optionalLiteral]))
    if assume(optionalLiteral) and not propagate():
        undo()
Finally we enter the main solver loop:
while True:
    if not propagate():
        if not backtrack():
            return False
    elif <all clauses are satisfied>:
        return True
    elif it := find("best unassigned literal satisfying a hard clause"):
        assume(it)
    elif it := find("best literal satisfying a soft clause"):
        assume(it)
The key point to note is that the main loop will undo the assumptions in order; so
if you assume A,B,C and B is not possible, we will have also undone C. But since
C is also enqueued as a soft clause, we will then later find it again:
Solve finds a conflict, backtracks, and sets not C: State=[Assume(A),Assume(B),not(C)]
Solve finds a conflict, backtracks, and sets not B: State=[Assume(A),not(B)] C is no longer assumed either
Solve, assume C as it satisfies SoftClause([C]) as next best literal: State=[Assume(A),not(B),Assume(C)]
All clauses are satisfied, solution is A, not B, and C.
This is not (correct) MaxSAT, because we actually do not guarantee that we satisfy as many soft clauses as possible. Consider you have the following clauses:
Optional: A
Optional: B
Optional: C
B Conflicts with A
C Conflicts with A
There are two possible results here:
A: If we assume A first, we are unable to satisfy B or C.
B, C: If we assume either B or C first, A is unsat.
The question to ponder, though, is whether we actually need a global maximum, or whether a local maximum is satisfactory in practice for a dependency solver.
If you look at it, a naive MaxSAT solver needs to run the SAT solver 2**n times for n soft clauses, whereas our heuristic only needs n runs.
For dependency solving, we do not seem to have a strong need for a global maximum:
There are various other preferences between our literals, say priorities;
and empirically, from evaluating hundreds of regressions without the initial assumptions,
I can say that the assumptions do fix those cases and the result is correct.
Further improvements exist, though, and we can look into them if they are needed, such as:
Use a better heuristic:
If we assume 1 clause and solve, and we cause 2 or more clauses to become unsatisfiable,
then that clause is a local minimum and can be skipped.
This is a more common heuristical MaxSAT solver.
This gives us a better local maximum, but not a global one.
This is more or less what the Smart package manager did,
except that in Smart, all packages were optional, and the entire solution was scored.
It calculated a basic solution without optimization and then toggled each variable and saw if the score improved.
Implement an actual search for a global maximum:
This involves reading the literature.
There are various versions of this, for example:
Find unsatisfiable cores and use those to guide relaxation of clauses.
A bounds-based search, where we translate sum(satisfied clauses) > k into SAT, and then search in one of the following ways:
from 0 upward
from n downward
perform a binary search on [0, k] satisfied clauses.
Actually, we do not even need to translate the sum constraints into CNF, because we can just add a specialized new type of constraint to our code.
I actually released last week; I haven't had time to blog, but today is my birthday and I am taking some time to myself! This release came with a major bugfix. As it turns out, our applications were very crashy on non-KDE platforms, including Ubuntu proper, and unfortunately they had been for years without my knowing. Developers were closing the bug reports as invalid because users couldn't provide a stacktrace. I have now convinced most developers to assign snap bugs to the Snap platform so I at least get a chance to try and fix them. So with that said, if you tried our snaps in the past and gave up in frustration, please do try them again! I also spent some time cleaning up our snaps so that only current releases are in the store, as rumor has it snapcrafters will be responsible for any security issues. With 200+ snaps that I maintain, that is a lot of responsibility. We'll see if I can pull it off.
Life!
My last surgery was a success! I am finally healing and out of a sling for the first time in almost a year. I have also lined up a good amount of web work for next month and hopefully beyond. I have decided to drop the piece work for donations and will only accept per project proposals for open source work. I will continue to maintain KDE snaps for as long as time allows. A big thank you to everyone that has donated over the last year to fund my survival during this broken arm fiasco. I truly appreciate it!
With that said, if you want to drop me a donation for my work, birthday or well-being until I get paid for the aforementioned web work please do so here:
In my home state of Wisconsin, there is an incredibly popular gas station called Kwik Trip. (Not to be confused with Quik Trip.) It is legitimately one of the best gas stations I've ever been to, and I'm a frequent customer.
What makes it that great? Well, everything about it. The store is clean, the lights work, the staff are always friendly (and encourage you to come back next time), there's usually bakery on sale (it just depends on the location, etc.), and the list goes on. There's even a light-switch in the bathroom of a large number of locations that you can flip if a janitor needs to attend to things. It actually does set off an alarm in the back room.
A dear friend of mine from Wisconsin once told me something along the lines of: it's inaccurate to call Kwik Trip a gas station, because in all reality, it's a five star restaurant. (M., I hope you're well.) In my own opinion, they have an espresso machine. That's what really matters. ;)
I mentioned the discount bakery. In reality, it's a pretty great system. To my limited understanding, the bakery that is older than standard but younger than expiry is set to half price and put towards the front of the store. In my personal experience, the vast majority of the time, the quality is still amazing. In fact, even if it isn't, the people working at Kwik Trip seem to genuinely enjoy their job.
When you're looking at that discount rack of bakery, what do you choose? A personal favorite of mine is the banana nut bread with frosting on top. (To the non-Americans: yes, it does taste like it's homemade; it doesn't taste like something made in a factory.)
Everyone chooses different bakery items. And honestly, there could be different discount items out depending on the time. You take what you can get, but you still have your own preferences. You like a specific type of donut (custard-filled, or maybe jelly-filled). Frosting, sprinkles: there are so many ways to make different bakery items. It's not only art, it's kind of a science too.
Is there a Kwik Trip that you've called a gas station instead of a five star restaurant? Do you also want to tell people about your gas station? Do you only pick certain bakery items off the discount rack, or maybe ignore it completely? (And yes, there would be good reason to ignore the bakery in favor of the Hot Spot; I'd consider that acceptable in my personal opinion.) Remember, sometimes you just have to like donuts.
Have a sweet day. :)
As a follow-up to my post about the use of argocd-autopilot, I'm going to deploy various applications to the cluster using Argo CD from the same repository we used in the previous post.
For our examples we are going to test a solution to the problem we had when we updated a ConfigMap used by the
argocd-server (the resource was updated but the application Pod was not because there was no change on the
argocd-server deployment); our original fix was to kill the pod manually, but the manual operation is something we
want to avoid.
The solution proposed in the helm documentation for this kind of issue is to add annotations to the Deployments with values that are a hash of the ConfigMaps or Secrets used by them; this way, if a file is updated the annotation is also updated, and when the Deployment changes are applied a rollout of the pods is triggered.
In this post we will install a couple of controllers and an application to show how we can handle Secrets with argocd and solve the issue with updates to ConfigMaps and Secrets. To do it we will execute the following tasks:
Deploy the Reloader controller to our cluster. It is a tool that watches
changes in ConfigMaps and Secrets and does rolling upgrades on the Pods that use them from Deployment,
StatefulSet, DaemonSet or DeploymentConfig objects when they are updated (by default we have to add some
annotations to the objects to make things work).
Deploy a simple application that can use ConfigMaps and Secrets and test that the Reloader controller does its
job when we add or update a ConfigMap.
Install the Sealed Secrets controller to manage secrets inside our
cluster, use it to add a secret to our sample application and see that the application is reloaded automatically.
Creating the test project for argocd-autopilot
As we did our installation using argocd-autopilot we will use its structure to manage the applications.
The first thing to do is to create a project (we will name it test) as follows:
argocd-autopilot project create test
INFO cloning git repository: https://forgejo.mixinet.net/blogops/argocd.git
Enumerating objects: 18, done.
Counting objects: 100% (18/18), done.
Compressing objects: 100% (16/16), done.
Total 18 (delta 1), reused 0 (delta 0), pack-reused 0
INFO using revision: "", installation path: "/"
INFO pushing new project manifest to repo
INFO project created: 'test'
Now that the test project is available we will use it on our argocd-autopilot invocations when creating
applications.
Installing the reloader controller
To add the reloader application to the test project as a kustomize application and deploy it on the tools
namespace with argocd-autopilot we do the following:
argocd-autopilot app create reloader \
--app 'github.com/stakater/Reloader/deployments/kubernetes/?ref=v1.4.2' \
--project test --type kustomize --dest-namespace tools
INFO cloning git repository: https://forgejo.mixinet.net/blogops/argocd.git
Enumerating objects: 19, done.
Counting objects: 100% (19/19), done.
Compressing objects: 100% (18/18), done.
Total 19 (delta 2), reused 0 (delta 0), pack-reused 0
INFO using revision: "", installation path: "/"
INFO created 'application namespace' file at '/bootstrap/cluster-resources/in-cluster/tools-ns.yaml'
INFO committing changes to gitops repo...
INFO installed application: reloader
That command creates four files on the argocd repository:
One to create the tools namespace: bootstrap/cluster-resources/in-cluster/tools-ns.yaml
The kustomization.yaml file for the test project (by default it includes the same configuration used on the
base definition, but we could make other changes if needed): apps/reloader/overlays/test/kustomization.yaml
The config.json file used to define the application on argocd for the test project (it points to the folder
that includes the previous kustomization.yaml file): apps/reloader/overlays/test/config.json
We can check that the application is working using the argocd command line application:
argocd app get argocd/test-reloader -o tree
Name: argocd/test-reloader
Project: test
Server: https://kubernetes.default.svc
Namespace: tools
URL: https://argocd.lo.mixinet.net:8443/applications/test-reloader
Source:
- Repo: https://forgejo.mixinet.net/blogops/argocd.git
Target:
Path: apps/reloader/overlays/test
SyncWindow: Sync Allowed
Sync Policy: Automated (Prune)
Sync Status: Synced to (2893b56)
Health Status: Healthy
KIND/NAME STATUS HEALTH MESSAGE
ClusterRole/reloader-reloader-role Synced
ClusterRoleBinding/reloader-reloader-role-binding Synced
ServiceAccount/reloader-reloader Synced serviceaccount/reloader-reloader created
Deployment/reloader-reloader Synced Healthy deployment.apps/reloader-reloader created
ReplicaSet/reloader-reloader-5b6dcc7b6f Healthy
Pod/reloader-reloader-5b6dcc7b6f-vwjcx Healthy
Adding flags to the reloader server
The runtime configuration flags for the reloader server are described in the project README.md file. In our case we want to adjust three values:
We want to enable the option to reload a workload when a ConfigMap or Secret is created,
We want to enable the option to reload a workload when a ConfigMap or Secret is deleted,
We want to use the annotations strategy for reloads, as it is the recommended mode of operation when using argocd.
To pass them we edit the apps/reloader/overlays/test/kustomization.yaml file to patch the pod container template, the
text added is the following:
patches:
# Add flags to reload workloads when ConfigMaps or Secrets are created or deleted
- target:
    kind: Deployment
    name: reloader-reloader
  patch: |-
    - op: add
      path: /spec/template/spec/containers/0/args
      value:
        - '--reload-on-create=true'
        - '--reload-on-delete=true'
        - '--reload-strategy=annotations'
After committing and pushing the updated file the system launches the application with the new options.
The dummyhttp application
To do a quick test we are going to deploy the dummyhttp web server using an image generated using the following Dockerfile:
# Image to run the dummyhttp application <https://github.com/svenstaro/dummyhttp>
# This arg could be passed by the container build command (used with mirrors)
ARG OCI_REGISTRY_PREFIX
# Latest tested version of alpine
FROM ${OCI_REGISTRY_PREFIX}alpine:3.21.3
# Tool versions
ARG DUMMYHTTP_VERS=1.1.1
# Download binary
RUN ARCH="$(apk --print-arch)" && \
    VERS="$DUMMYHTTP_VERS" && \
    URL="https://github.com/svenstaro/dummyhttp/releases/download/v$VERS/dummyhttp-$VERS-$ARCH-unknown-linux-musl" && \
    wget "$URL" -O /tmp/dummyhttp && \
    install /tmp/dummyhttp /usr/local/bin && \
    rm -f /tmp/dummyhttp
# Set the entrypoint to /usr/local/bin/dummyhttp
ENTRYPOINT [ "/usr/local/bin/dummyhttp" ]
The kustomize base application is available on a monorepo that contains the following files:
A Deployment definition that uses the previous image but uses /bin/sh -c as its entrypoint (command in the
k8s Pod terminology) and passes as its argument a string that runs the eval command to be able to expand
environment variables passed to the pod (the definition includes two optional variables, one taken from a
ConfigMap and another one from a Secret):
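The manifest itself is not included above; a minimal sketch of such a Deployment is shown below. The image reference, variable names and key names are assumptions, and plain /bin/sh -c expansion is used here instead of the eval trick the original definition relies on:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dummyhttp
spec:
  replicas: 1
  selector:
    matchLabels: { app: dummyhttp }
  template:
    metadata:
      labels: { app: dummyhttp }
    spec:
      containers:
        - name: dummyhttp
          image: registry.example.org/dummyhttp:1.1.1   # assumed image reference
          command: ["/bin/sh", "-c"]
          # the shell expands the optional variables before starting dummyhttp
          args:
            - 'exec dummyhttp -b "{ \"c\": \"${CM_VALUE:-}\", \"s\": \"${SECRET_VALUE:-}\" }"'
          env:
            - name: CM_VALUE
              valueFrom:
                configMapKeyRef:
                  name: dummyhttp-configmap
                  key: value          # assumed key name
                  optional: true
            - name: SECRET_VALUE
              valueFrom:
                secretKeyRef:
                  name: dummyhttp-secret
                  key: value          # assumed key name
                  optional: true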
Deploying the dummyhttp application from argocd
We could create the dummyhttp application using the argocd-autopilot command as we've done in the reloader case, but we are going to do it manually to show how simple it is.
First we've created the apps/dummyhttp/base/kustomization.yaml file to include the application from the previous
repository:
And finally we add the apps/dummyhttp/overlays/test/config.json file to configure the application as the
ApplicationSet defined by argocd-autopilot expects:
Patching the application
Now we will add patches to the apps/dummyhttp/overlays/test/kustomization.yaml file:
One to add annotations for reloader (one to enable it and another one to set the roll out strategy to restart to
avoid touching the deployments, as that can generate issues with argocd).
Another to change the ingress hostname (not really needed, but something quite reasonable for a specific project).
After committing and pushing the changes we can use the argocd cli to check the status of the application:
argocd app get argocd/test-dummyhttp -o tree
Name: argocd/test-dummyhttp
Project: test
Server: https://kubernetes.default.svc
Namespace: default
URL: https://argocd.lo.mixinet.net:8443/applications/test-dummyhttp
Source:
- Repo: https://forgejo.mixinet.net/blogops/argocd.git
Target:
Path: apps/dummyhttp/overlays/test
SyncWindow: Sync Allowed
Sync Policy: Automated (Prune)
Sync Status: Synced to (fbc6031)
Health Status: Healthy
KIND/NAME STATUS HEALTH MESSAGE
Deployment/dummyhttp Synced Healthy deployment.apps/dummyhttp configured
ReplicaSet/dummyhttp-55569589bc Healthy
Pod/dummyhttp-55569589bc-qhnfk Healthy
Ingress/dummyhttp Synced Healthy ingress.networking.k8s.io/dummyhttp configured
Service/dummyhttp Synced Healthy service/dummyhttp unchanged
Endpoints/dummyhttp
EndpointSlice/dummyhttp-x57bl
As we can see, the Deployment and Ingress were updated, but the Service is unchanged.
To validate that the ingress is using the new hostname we can use curl:
curl -s https://dummyhttp.lo.mixinet.net:8443/
404 page not found
curl -s https://test-dummyhttp.lo.mixinet.net:8443/
"c": "", "s": ""
Adding a ConfigMap
Now that the system is adjusted to reload the application when the ConfigMap or Secret is created, deleted or
updated we are ready to add one file and see how the system reacts.
We modify the apps/dummyhttp/overlays/test/kustomization.yaml file to create the ConfigMap using the
configMapGenerator as follows:
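The change itself is not shown above; the generator section could look roughly like this (the key name value is an assumption, the literal matches the output shown below, and the name suffix hash is disabled so the ConfigMap keeps the plain dummyhttp-configmap name):
configMapGenerator:
  - name: dummyhttp-configmap
    behavior: create
    literals:
      - value=Default Test Value
generatorOptions:
  disableNameSuffixHash: true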
After committing and pushing the changes we can see that the ConfigMap is available, the pod has been deleted and
started again and the curl output includes the new value:
kubectl get configmaps,pods
NAME                            DATA   AGE
configmap/dummyhttp-configmap   1      11s
configmap/kube-root-ca.crt      1      4d7h

NAME                             READY   STATUS        RESTARTS   AGE
pod/dummyhttp-779c96c44b-pjq4d   1/1     Running       0          11s
pod/dummyhttp-fc964557f-jvpkx    1/1     Terminating   0          2m42s
curl -s https://test-dummyhttp.lo.mixinet.net:8443 | jq -M .
{
  "c": "Default Test Value",
  "s": ""
}
Using helm with argocd-autopilot
Right now there is no direct support in argocd-autopilot to manage applications using helm (see the issue
#38 on the project), but we want to use a chart in our
next example.
There are multiple ways to add the support, but the simplest one that allows us to keep using argocd-autopilot is to
use kustomize applications that call helm as described
here.
The only thing needed before being able to use the approach is to add the kustomize.buildOptions flag to the argocd-cm ConfigMap on the bootstrap/argo-cd/kustomization.yaml file; its contents are now as follows:
bootstrap/argo-cd/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
configMapGenerator:
- behavior: merge
  literals:
  # Enable helm usage from kustomize (see https://github.com/argoproj/argo-cd/issues/2789#issuecomment-960271294)
  - kustomize.buildOptions="--enable-helm"
  - |
    repository.credentials=- passwordSecret:
        key: git_token
        name: autopilot-secret
      url: https://forgejo.mixinet.net/
      usernameSecret:
        key: git_username
        name: autopilot-secret
  name: argocd-cm
# Disable TLS for the Argo Server (see https://argo-cd.readthedocs.io/en/stable/operator-manual/ingress/#traefik-v30)
- behavior: merge
  literals:
  - "server.insecure=true"
  name: argocd-cmd-params-cm
kind: Kustomization
namespace: argocd
resources:
- github.com/argoproj-labs/argocd-autopilot/manifests/base?ref=v0.4.19
- ingress_route.yaml
On the following section we will explain how the application is defined to make things work.
Installing the sealed-secrets controller
To manage secrets in our cluster we are going to use the sealed-secrets controller and to install it we are going to use its chart.
As we mentioned in the previous section, the idea is to create a kustomize application and use that to deploy the chart, but we are going to create the files manually, as we are not going to import the base kustomization files from a remote repository.
As there is no clear way to override helm Chart values using
overlays we are going to use a generator to create the helm configuration from an external resource and include it
from our overlays (the idea has been taken from this repository,
which was referenced from a comment
on the kustomize issue #38 mentioned earlier).
The sealed-secrets application
We have created the following files and folders manually:
apps/sealed-secrets/
├── helm
│   ├── chart.yaml
│   └── kustomization.yaml
└── overlays
    └── test
        ├── config.json
        ├── kustomization.yaml
        └── values.yaml
The helm folder contains the generator template that will be included from our overlays.
The kustomization.yaml includes the chart.yaml as a resource:
apps/sealed-secrets/helm/kustomization.yaml
And the chart.yaml file defines the HelmChartInflationGenerator:
apps/sealed-secrets/helm/chart.yaml
apiVersion: builtin
kind: HelmChartInflationGenerator
metadata:
  name: sealed-secrets
releaseName: sealed-secrets
name: sealed-secrets
namespace: kube-system
repo: https://bitnami-labs.github.io/sealed-secrets
version: 2.17.2
includeCRDs: true
# Add common values to all argo-cd projects inline
valuesInline:
  fullnameOverride: sealed-secrets-controller
# Load a values.yaml file from the same directory that uses this generator
valuesFile: values.yaml
For this chart the template adjusts the namespace to kube-system and adds the fullnameOverride on the
valuesInline key because we want to use those settings on all the projects (they are the values expected by the
kubeseal command line application, so we adjust them to avoid the need to add additional parameters to it).
We adjust the global values inline to be able to use the valuesFile from our overlays; as we are using a generator, the path is relative to the folder that contains the kustomization.yaml file that calls it. In our case we will need to have a values.yaml file in each overlay folder (if we don't want to override any values for a project we can create an empty file, but it has to exist).
Finally, our overlay folder contains three files, a kustomization.yaml file that includes the generator from the
helm folder, the values.yaml file needed by the chart and the config.json file used by argocd-autopilot to
install the application.
The kustomization.yaml file contents are:
apps/sealed-secrets/overlays/test/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Uncomment if you want to add additional resources using kustomize
#resources:
#- ../../base
generators:
- ../../helm
The values.yaml file enables the ingress for the application and adjusts its hostname:
apps/sealed-secrets/overlays/test/values.yaml
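The file is not reproduced above; assuming the chart follows the usual ingress value layout, it would be something along these lines (the hostname is an assumption, reused in the examples below):
ingress:
  enabled: true
  hostname: sealed-secrets.lo.mixinet.net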
Once we commit and push the files the sealed-secrets application is installed in our cluster; we can check it by fetching the public certificate used by it:
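The original command is not shown; one way to do that check, assuming the kubeseal command line tool is installed locally, is:
kubeseal --fetch-cert    # prints the PEM-encoded public certificate of the controller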
That invocation needs to have access to the cluster to do its job and in our case it works because we modified the chart
to use the kube-system namespace and set the controller name to sealed-secrets-controller as the tool expects.
If we need to create the secrets without credentials we can connect to the ingress address we added to retrieve the
public key:
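The commands are not shown above; a sketch of creating and sealing a secret against the certificate URL published on the ingress (the hostname, key name and literal value are assumptions; the output file name matches the one used later in this post):
kubectl create secret generic dummyhttp-secret --dry-run=client \
  --from-literal=value="Hidden value" -o yaml > /tmp/dummyhttp-secret.yaml
kubeseal --cert https://sealed-secrets.lo.mixinet.net:8443/v1/cert.pem \
  -o yaml < /tmp/dummyhttp-secret.yaml > /tmp/dummyhttp-sealed-secret.yaml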
Or, if we don't have access to the ingress address, we can save the certificate to a file and use it instead of the URL.
The sealed version of the secret looks like this:
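As a sketch of its shape (the encrypted blob and the key name are placeholders; the object names match the resources shown below):
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: dummyhttp-secret
  namespace: default
spec:
  encryptedData:
    value: AgA...                  # long base64 blob produced by kubeseal
  template:
    metadata:
      name: dummyhttp-secret
      namespace: default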
This file can be deployed to the cluster to create the secret (in our case we will add it to the argocd application),
but before doing that we are going to check the output of our dummyhttp service and get the list of Secrets and
SealedSecrets in the default namespace:
curl -s https://test-dummyhttp.lo.mixinet.net:8443 | jq -M .
{
  "c": "Default Test Value",
  "s": ""
}
kubectl get sealedsecrets,secrets
No resources found in default namespace.
Now we add the SealedSecret to the dummyhttp application by copying the file and adding it to the kustomization.yaml file:
Once we commit and push the files Argo CD creates the SealedSecret and the controller generates the Secret:
kubectl apply -f /tmp/dummyhttp-sealed-secret.yaml
sealedsecret.bitnami.com/dummyhttp-secret created
kubectl get sealedsecrets,secrets
NAME STATUS SYNCED AGE
sealedsecret.bitnami.com/dummyhttp-secret True 3s
NAME TYPE DATA AGE
secret/dummyhttp-secret Opaque 1 3s
If we check the command output we can see the new value of the secret:
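The output is not included above; it would look like the earlier jq call, with the s field now carrying the (hypothetical) secret value:
curl -s https://test-dummyhttp.lo.mixinet.net:8443 | jq -M .
{
  "c": "Default Test Value",
  "s": "Hidden value"
}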
Using sealed-secrets in production clusters
If you plan to use sealed-secrets, look into its documentation to understand how it manages the private keys and how to back things up, and keep in mind that, as the documentation explains, you can rotate your sealed version of the secrets, but that doesn't change the actual secrets.
If you want to rotate your secrets you have to update them and commit the sealed version of the updates (as the
controller also rotates the encryption keys your new sealed version will also be using a newer key, so you will be doing
both things at the same time).
Final remarks
In this post we have seen how to deploy applications using the argocd-autopilot model, including the use of helm charts inside kustomize applications, and how to install and use the sealed-secrets controller.
It has been interesting and I've learnt a lot about argocd in the process, but I believe that if I ever want to use it in production I will also review the native helm support in argocd using a separate repository to manage the applications, at least to be able to compare it to the model explained here.
I've long said that the main tools in the Open Source security space, OpenSSL and GnuPG (gpg), are broken and only a complete re-write will solve this. And that is still pending as nobody came forward with the funding. It's not a sexy topic, so it has to get really bad before it'll get better.
Gpg has a UI that is close to useless.
That won't substantially change with more bolted-on improvements.
Now Robert J. Hansen and Daniel Kahn Gillmor had somebody add ~50k signatures (read 1, 2, 3, 4 for the gory details) to their keys and - oops - they say that breaks gpg.
But does it?
I downloaded Robert J. Hansen's key off the SKS-Keyserver network.
It's a nice 45MB file when de-ascii-armored (gpg --dearmor broken_key.asc ; mv broken_key.asc.gpg broken_key.gpg).
Now a friendly:
$ /usr/bin/time -v gpg --no-default-keyring --keyring ./broken_key.gpg --batch --quiet --edit-key 0x1DCBDC01B44427C7 clean save quit
pub  rsa3072/0x1DCBDC01B44427C7
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: SC
     Vertrauen: unbekannt     Gültigkeit: unbekannt
sub  ed25519/0xA83CAE94D3DC3873
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: S
sub  cv25519/0xAA24CC81B8AED08B
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: E
sub  rsa3072/0xDC0F82625FA6AADE
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: E
[ unbekannt ] (1). Robert J. Hansen <rjh@sixdemonbag.org>
[ unbekannt ] (2)  Robert J. Hansen <rob@enigmail.net>
[ unbekannt ] (3)  Robert J. Hansen <rob@hansen.engineering>

User-ID "Robert J. Hansen <rjh@sixdemonbag.org>": 49705 Signaturen entfernt
User-ID "Robert J. Hansen <rob@enigmail.net>": 49704 Signaturen entfernt
User-ID "Robert J. Hansen <rob@hansen.engineering>": 49701 Signaturen entfernt

pub  rsa3072/0x1DCBDC01B44427C7
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: SC
     Vertrauen: unbekannt     Gültigkeit: unbekannt
sub  ed25519/0xA83CAE94D3DC3873
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: S
sub  cv25519/0xAA24CC81B8AED08B
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: E
sub  rsa3072/0xDC0F82625FA6AADE
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: E
[ unbekannt ] (1). Robert J. Hansen <rjh@sixdemonbag.org>
[ unbekannt ] (2)  Robert J. Hansen <rob@enigmail.net>
[ unbekannt ] (3)  Robert J. Hansen <rob@hansen.engineering>

Command being timed: "gpg --no-default-keyring --keyring ./broken_key.gpg --batch --quiet --edit-key 0x1DCBDC01B44427C7 clean save quit"
User time (seconds): 3911.14
System time (seconds): 2442.87
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:45:56
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 107660
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 26630
Voluntary context switches: 43
Involuntary context switches: 59439
Swaps: 0
File system inputs: 112
File system outputs: 48
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
And the result is a nicely useable 3835 byte file of the clean public key.
If you supply a keyring instead of --no-default-keyring it will also keep the non-self signatures that are useful for you (as you apparently know the signing party).
So it does not break gpg. It does break things that call gpg at runtime and not asynchronously. I heard Enigmail is affected, quelle surprise.
Now the main problem here is the runtime. 1h45min is just ridiculous. As Filippo Valsorda puts it:
Someone added a few thousand entries to a list that lets anyone append to it.
GnuPG, software supposed to defeat state actors, suddenly takes minutes to process entries.
How big is that list you ask? 17 MiB. Not GiB, 17 MiB. Like a large picture.
https://dev.gnupg.org/T4592
If I were a gpg / SKS keyserver developer, I'd
speed this up so the edit-key run above completes in less than 10 s (just getting rid of the lseek/read dance and deferring all time-based decisions should get close)
(ideally) make the drop-sig import-filter syntax useful (date-ranges, non-reciprocal signatures, ...)
clean affected keys on the SKS keyservers (needs coordination of sysops, drop servers from unreachable people)
only accept new keys and new signatures on keys extending the strong set (rather small change to the existing codebase)
That way another key can only be added to the keyserver network if it contains at least one signature from a previously known strong-set key.
Attacking the keyserver network would become at least non-trivial. And the web-of-trust thing may make sense again.
Updates
09.07.2019
GnuPG 2.2.17 has been released with another set of quickly bolted together fixes:
gpg: Ignore all key-signatures received from keyservers. This
change is required to mitigate a DoS due to keys flooded with
faked key-signatures. The old behaviour can be achieved by adding
keyserver-options no-self-sigs-only,no-import-clean
to your gpg.conf. [#4607]
gpg: If an imported keyblocks is too large to be stored in the
keybox (pubring.kbx) do not error out but fallback to an import
using the options "self-sigs-only,import-clean". [#4591]
gpg: New command --locate-external-key which can be used to
refresh keys from the Web Key Directory or via other methods
configured with --auto-key-locate.
gpg: New import option "self-sigs-only".
gpg: In --auto-key-retrieve prefer WKD over keyservers. [#4595]
dirmngr: Support the "openpgpkey" subdomain feature from
draft-koch-openpgp-webkey-service-07. [#4590].
dirmngr: Add an exception for the "openpgpkey" subdomain to the
CSRF protection. [#4603]
dirmngr: Fix endless loop due to http errors 503 and 504. [#4600]
dirmngr: Fix TLS bug during redirection of HKP requests. [#4566]
gpgconf: Fix a race condition when killing components. [#4577]
Bug T4607 shows that these changes are all but well thought-out.
They introduce artificial limits, like 64kB for WKD-distributed keys or 5MB for local signature imports (Bug T4591) which weaken the web-of-trust further.
I recommend to not run gpg 2.2.17 in production environments without extensive testing as these limits and the unverified network traffic may bite you. Do validate your upgrade with valid and broken keys that have segments (packet groups) surpassing the above mentioned limits. You may be surprised what gpg does. On the upside: you can now refresh keys (sans signatures) via WKD. So if your buddies still believe in limiting their subkey validities, you can more easily update them bypassing the SKS keyserver network. NB: I have not tested that functionality. So test before deploying.
10.08.2019
Christopher Wellons (skeeto) has released his pgp-poisoner tool. It is a go program that can add thousands of malicious signatures to a GNUpg key per second. He comments "[pgp-poisoner is] proof that such attacks are very easy to pull off. It doesn't take a nation-state actor to break the PGP ecosystem, just one person and couple evenings studying RFC 4880. This system is not robust." He also hints at the next likely attack vector, public subkeys can be bound to a primary key of choice.
After thinking about multi-stage Debian rebuilds I wanted to implement the idea. Recall my illustration:
Earlier I rebuilt all packages that make up the difference between Ubuntu and Trisquel. It turned out to be a 42% bit-by-bit identical similarity. To check the generality of my approach, I rebuilt the difference between Debian and Devuan too. That was the debdistreproduce project. It only had to orchestrate building up to around 500 packages for each distribution and per architecture.
Differential reproducible rebuilds don't give you the full picture: they ignore the packages shared between the distributions, which make up over 90% of the packages. So I felt a desire to do full archive rebuilds. The motivation is that in order to trust Trisquel binary packages, I need to trust Ubuntu binary packages (because they make up 90% of the Trisquel packages), and many of those Ubuntu binaries are derived from Debian source packages. How to approach all of this? Last year I created the debdistrebuild project, and did top-50 popcon package rebuilds of Debian bullseye, bookworm, trixie, and Ubuntu noble and jammy, on a mix of amd64 and arm64. The amount of reproducibility was lower. Primarily the differences were caused by using different build inputs.
Last year I spent (too much) time creating a mirror of snapshot.debian.org, to be able to have older packages available for use as build inputs. I have two copies hosted at different datacentres for reliability and archival safety. At the time, snapshot.d.o had serious rate-limiting making it pretty unusable for massive rebuild usage or even basic downloads. Watching the multi-month download complete last year had a meditating effect. The completion of my snapshot download coincided with me realizing something about the nature of rebuilding packages. Let me give a recap of the idempotent rebuilds idea below, because it motivated my work to build all of Debian from a GitLab pipeline.
One purpose for my effort is to be able to trust the binaries that I use on my laptop. I believe that without building binaries from source code, there is no practically feasible way to trust binaries. To trust any binary you receive, you can de-assemble the bits and audit the assembler instructions for the CPU you will execute it on. Doing that on an OS-wide level is impractical. A more practical approach is to audit the source code, and then confirm that the binary is 100% bit-by-bit identical to one that you can build yourself (from the same source) on your own trusted toolchain. This is similar to a reproducible build.
My initial goal with debdistrebuild was to get to 100% bit-by-bit identical rebuilds, and then I would have trustworthy binaries. Or so I thought. This also appears to be the goal of reproduce.debian.net. They want to reproduce the official Debian binaries. That is a worthy and important goal. They achieve this by building packages using the build inputs that were used to build the binaries. The build inputs are earlier versions of Debian packages (not necessarily from any public Debian release), archived at snapshot.debian.org.
I realized that these rebuilds would not be sufficient for me: they don't solve the problem of how to trust the toolchain. Let's assume the reproduce.debian.net effort succeeds and is able to 100% bit-by-bit identically reproduce the official Debian binaries. Which appears to be within reach. To have trusted binaries we would only have to audit the source code for the latest version of the packages AND audit the tool chain used. There is no escaping from auditing all the source code; that is what I think we all would prefer to focus on, to be able to improve upstream source code.
The trouble is about auditing the tool chain. With the Reproduce.debian.net approach, that is a recursive problem back to really ancient Debian packages, some of them which may no longer build or work, or even be legally distributable. Auditing all those old packages is a LARGER effort than auditing all current packages! Doing auditing of old packages is of less use to making contributions: those releases are old, and chances are any improvements have already been implemented and released. Or that improvements are no longer applicable because the projects evolved since the earlier version.
See where this is going now? I reached the conclusion that reproducing official binaries using the same build inputs is not what I'm interested in. I want to be able to build the binaries that I use from source using a toolchain that I can also build from source. And preferably all of this using the latest version of all packages, so that I can contribute and send patches for them, to improve matters.
The toolchain that Reproduce.Debian.Net is using is not trustworthy unless all those ancient packages are audited or rebuilt bit-by-bit identically, and I don't see any practical way forward to achieve that goal. Nor have I seen anyone working on that problem. It is possible to do, though, but I think there are simpler ways to achieve the same goal.
My approach to reach trusted binaries on my laptop appears to be a three-step effort:
Encourage an idempotently rebuildable Debian archive, i.e., a Debian archive that can be 100% bit-by-bit identically rebuilt using Debian itself.
Construct a smaller number of binary *.deb packages based on Guix binaries that when used as build inputs (potentially iteratively) leads to 100% bit-by-bit identical packages as in step 1.
How to go about achieving this? Today's Debian build architecture is something that lacks transparency and end-user control. The build environment and signing keys are managed by, or influenced by, unidentified people following undocumented (or at least not public) security procedures, under unknown legal jurisdictions. I always wondered why none of the Debian derivatives have adopted a modern GitDevOps-style approach as a method to improve binary build transparency; maybe I missed some project?
If you want to contribute to some GitHub or GitLab project, you click the Fork button and get a CI/CD pipeline running which rebuilds artifacts for the project. This makes it easy for people to contribute, and you get good QA control because the entire chain up until the artifact release is produced and tested. At least in theory. Many projects are behind on this, but it seems like a useful goal for all projects. This is also liberating: all users are able to reproduce artifacts. There is no longer any magic involved in preparing release artifacts. As we've seen with many software supply-chain security incidents over the past years, where the magic is involved is a good place to introduce malicious code.
To allow me to continue with my experiment, I thought the simplest way forward was to setup a GitDevOps-centric and user-controllable way to build the entire Debian archive. Let me introduce the debdistbuild project.
Debdistbuild is a re-usable GitLab CI/CD pipeline, similar to the Salsa CI pipeline. It provides one build job definition and one deploy job definition. The pipeline can run on GitLab.org Shared Runners or you can set up your own runners, like my GitLab riscv64 runner setup. I have concerns about relying on GitLab (both as software and as a service), but my ideas are easy to transfer to some other GitDevSecOps setup such as Codeberg.org. Self-hosting GitLab, including self-hosted runners, is common today, and Debian relies increasingly on Salsa for this. All of the build infrastructure could be hosted on Salsa eventually.
The build job is simple. From within an official Debian container image build packages using dpkg-buildpackage essentially by invoking the following commands.
sed -i 's/ deb$/ deb deb-src/' /etc/apt/sources.list.d/*.sources
apt-get -o Acquire::Check-Valid-Until=false update
apt-get dist-upgrade -q -y
apt-get install -q -y --no-install-recommends build-essential fakeroot
env DEBIAN_FRONTEND=noninteractive \
apt-get build-dep -y --only-source $PACKAGE=$VERSION
useradd -m build
DDB_BUILDDIR=/build/reproducible-path
chgrp build $DDB_BUILDDIR
chmod g+w $DDB_BUILDDIR
su build -c "apt-get source --only-source $PACKAGE=$VERSION" > ../$PACKAGE_$VERSION.build
cd $DDB_BUILDDIR
su build -c "dpkg-buildpackage"
cd ..
mkdir out
mv -v $(find $DDB_BUILDDIR -maxdepth 1 -type f) out/
The deploy job is also simple. It commit artifacts to a Git project using Git-LFS to handle large objects, essentially something like this:
if ! grep -q '^pool/**' .gitattributes; then
    git lfs track 'pool/**'
    git add .gitattributes
    git commit -m"Track pool/* with Git-LFS." .gitattributes
fi
# Derive the pool sub-directory from the package name (lib* packages use four characters)
POOLDIR=$(if test "$(echo "$PACKAGE" | cut -c1-3)" = "lib"; then C=4; else C=1; fi; echo "$PACKAGE" | cut -c1-$C)
mkdir -pv pool/main/$POOLDIR/
rm -rfv pool/main/$POOLDIR/$PACKAGE
mv -v out pool/main/$POOLDIR/$PACKAGE
git add pool
git commit -m"Add $PACKAGE." -m "$CI_JOB_URL" -m "$VERSION" -a
if test "${DDB_GIT_TOKEN:-}" = ""; then
    echo "SKIP: Skipping git push due to missing DDB_GIT_TOKEN (see README)."
else
    git push -o ci.skip
fi
That's it! The actual implementation is a bit longer, but the major differences are in log and error handling.
You may review the source code of the base Debdistbuild pipeline definition, the base Debdistbuild script and the rc.d/-style scripts implementing the build.d/ process and the deploy.d/ commands.
There was one complication related to artifact size. GitLab.org job artifacts are limited to 1GB. Several packages in Debian produce artifacts larger than this. What to do? GitLab supports up to 5GB for files stored in its package registry, but this limit is too close for my comfort, having seen some multi-GB artifacts already. I made the build job optionally upload artifacts to an S3 bucket using a SHA256-hashed file hierarchy. I'm using Hetzner Object Storage but there are many S3 providers around, including self-hosting options. This hierarchy is compatible with the Git-LFS .git/lfs/objects/ hierarchy, and it is easy to set up a separate Git-LFS object URL to allow Git-LFS object downloads from the S3 bucket. In this mode, only Git-LFS stubs are pushed to the git repository. It should have no trouble handling the large number of files, since I have earlier experience with Apt mirrors in Git-LFS.
To speed up job execution, and to guarantee a stable build environment, instead of installing build-essential packages on every build job execution, I prepare some build container images. The project responsible for this is tentatively called stage-N-containers. Right now it creates containers suitable for rolling builds of trixie on amd64, arm64, and riscv64, and a container intended for use as stage-0, based on the 20250407 docker images of bookworm on amd64 and arm64 using the snapshot.d.o 20250407 archive. Or actually, I'm using snapshot-cloudflare.d.o because of download speed and reliability. I would have preferred to use my own snapshot mirror with Hetzner bandwidth, alas the Debian snapshot team has concerns about me publishing the list of (SHA1 hash) filenames publicly and I haven't bothered to set up non-public access.
Debdistbuild has built around 2,500 packages for bookworm on amd64 and bookworm on arm64. To confirm the generality of my approach, it also builds trixie on amd64, trixie on arm64 and trixie on riscv64. The riscv64 builds all run on my own hosted runners. For amd64 and arm64 my own runners are only used for large packages where the GitLab.com shared runners run into the 3 hour time limit.
What's next in this venture? Some ideas include:
Optimize the stage-N build process by identifying the transitive closure of build dependencies from some initial set of packages.
Create a build orchestrator that launches pipelines based on the previous list of packages, as necessary to fill the archive with the needed packages. Currently I'm using a basic /bin/sh for loop around curl to trigger GitLab CI/CD pipelines, with package names derived from https://popcon.debian.org/ (see the sketch after this list).
Create and publish a dists/ sub-directory, so that it is possible to use the newly built packages in the stage-1 build phase.
Produce diffoscope-style differences of built packages, both stage0 against official binaries and between stage0 and stage1.
Create the stage-1 build containers and stage-1 archive.
Review build failures. On amd64 and arm64 the list is small (below 10 out of ~5000 builds), but on riscv64 there is some icache-related problem that affects the Java JVM and triggers build failures.
Provide GitLab pipeline based builds of the Debian docker container images, cloud-images, debian-live CD and debian-installer ISOs.
Provide integration with Sigstore and Sigsum for signing of Debian binaries with transparency-safe properties.
Implement a simple replacement for dpkg and apt using /bin/sh for use during bootstrapping when neither packaging tools are available.
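For reference, the pipeline-trigger loop mentioned in the orchestrator item above is roughly of the following shape; the project ID, token variable and package list file are placeholders of my own for illustration, not the actual values used.

for p in $(cat popcon-packages.txt); do
  curl --silent --request POST \
    --form "token=$TRIGGER_TOKEN" \
    --form "ref=main" \
    --form "variables[PACKAGE]=$p" \
    "https://gitlab.com/api/v4/projects/$PROJECT_ID/trigger/pipeline"
done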
I was just released from the hospital after a 3 day stay for my (hopefully) last surgery. There was concern with massive blood loss and low heart rate. I have stabilized and have come home. Unfortunately, they had to prescribe many medications this round and they are extremely expensive and used up all my funds. I need gas money to get to my post-op doctor's appointments, and food would be cool. I would appreciate any help, even just a dollar!
I am already back to work, and continued work on the crashy KDE snaps in a non-KDE environment. (This also affects anyone using kde-neon extensions, such as FreeCAD.) I hope to have a fix in the next day or so.
Fixed kate bug https://bugs.kde.org/show_bug.cgi?id=503285
Thanks for stopping by.
I host my own GitLab CI/CD runners, and find that having coverage on the riscv64 CPU architecture is useful for testing things. The HiFive Premier P550 seems to be a common hardware choice. The P550 is possible to purchase online. You also need a (mini-)ATX chassis, power supply (~500W is more than sufficient), PCI-to-M2 converter and a NVMe storage device. Total cost per machine was around $8k/€8k for me. Assembly was simple: bolt everything, connect ATX power, connect cables for the front panel, USB and audio. Be sure to toggle the physical power switch on the P550 before you close the box. The front-panel power button will start your machine. There is a P550 user manual available.
Below I will guide you to install the GitLab Runner on the pre-installed Ubuntu 24.04 that ships with the P550, and configure it to use Podman in rootless mode and without the --privileged flag, without any additional capabilities like SYS_ADMIN. Presumably you want to migrate to some other OS instead; hey Trisquel 13 riscv64, I'm waiting for you! I wouldn't recommend using this machine for anything sensitive: there is an awful lot of non-free and/or vendor-specific software installed, and the hardware itself is young. I am not aware of any riscv64 hardware that can run a libre OS; all of them appear to require non-free blobs and usually a non-mainline kernel.
Log in on the console using username 'ubuntu' and password 'ubuntu'. You will be asked to change the password, so do that.
Start a terminal, gain root with sudo -i and change the hostname: echo jas-p550-01 > /etc/hostname
Connect ethernet and run: apt-get update && apt-get dist-upgrade -u.
If your system doesn't have a valid MAC address (they show as MAC 8c:00:00:00:00:00 if you run 'ip a'), you can fix this to avoid collisions if you install multiple P550s on the same network. Connect the Debug USB-C connector on the back to one of the host's USB-A slots. Use minicom (use Ctrl-A X to exit) to talk to it.
apt-get install minicom
minicom -o -D /dev/ttyUSB3

#cmd: ifconfig
inet 192.168.0.2 netmask: 255.255.240.0 gatway 192.168.0.1
SOM_Mac0: 8c:00:00:00:00:00
SOM_Mac1: 8c:00:00:00:00:00
MCU_Mac: 8c:00:00:00:00:00

#cmd: setmac 0 CA:FE:42:17:23:00
The MAC setting will be valid after rebooting the carrier board!!!
MAC[0] addr set to CA:FE:42:17:23:00(ca:fe:42:17:23:0)

#cmd: setmac 1 CA:FE:42:17:23:01
The MAC setting will be valid after rebooting the carrier board!!!
MAC[1] addr set to CA:FE:42:17:23:01(ca:fe:42:17:23:1)

#cmd: setmac 2 CA:FE:42:17:23:02
The MAC setting will be valid after rebooting the carrier board!!!
MAC[2] addr set to CA:FE:42:17:23:02(ca:fe:42:17:23:2)

#cmd:
For reference, if you wish to interact with the MCU you may do that via OpenOCD and telnet, like the following (as root on the P550). You need to have the Debug USB-C connected to a USB-A host port.
Now with a reasonable setup ready, let's install the GitLab Runner. The following is adapted from gitlab-runner's official installation documentation. The normal installation flow doesn't work because they don't publish riscv64 apt repositories, so you will have to perform upgrades manually.
Remember the NVMe device? Let's not forget to use it, to avoid wear and tear of the internal MMC root disk. Do this now, before any files appear in /home/gitlab-runner, or you will have to move them manually.
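A minimal sketch of the idea, assuming the NVMe device shows up as /dev/nvme0n1 (adjust to your hardware; this is not the post's verbatim recipe):

mkfs.ext4 /dev/nvme0n1                       # format the NVMe device
mkdir -p /home/gitlab-runner                 # future home of the runner user
echo '/dev/nvme0n1 /home/gitlab-runner ext4 defaults,noatime 0 2' >> /etc/fstab
mount /home/gitlab-runner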
Next install gitlab-runner and configure it. Replace the token glrt-REPLACEME below with the registration token you get from your GitLab project's Settings -> CI/CD -> Runners -> New project runner. I used the tags riscv64 and a runner description of the hostname.
You need to run some commands as the gitlab-runner user, but unfortunately some interaction between sudo/su and pam_systemd makes this harder than it should be. So you have to set up SSH for the user and login via SSH to run the commands. Does anyone know of a better way to do this?
# on the p550:
cp -a /root/.ssh/ /home/gitlab-runner/
chown -R gitlab-runner:gitlab-runner /home/gitlab-runner/.ssh/

# on your laptop:
ssh gitlab-runner@jas-p550-01
systemctl --user --now enable podman.socket
systemctl --user --now start podman.socket
loginctl enable-linger gitlab-runner
systemctl status --user podman.socket
We modify /etc/gitlab-runner/config.toml as follows; replace 997 with the user id shown by the systemctl status output above. See the feature flags documentation for more details.
Note that unlike the documentation I do not add the privileged = true parameter here. I will come back to this later.
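For reference, a hypothetical sketch of what the relevant parts of such a config.toml might look like; the runner name, image and the feature flag shown here are assumptions for illustration, not the verbatim configuration:

[[runners]]
  name = "jas-p550-01"
  url = "https://gitlab.com"
  token = "glrt-REPLACEME"
  executor = "docker"
  [runners.docker]
    # talk to the rootless podman socket of uid 997 instead of a docker daemon
    host = "unix:///run/user/997/podman/podman.sock"
    image = "debian:stable"
    # note: no privileged = true here
  [runners.feature_flags]
    FF_NETWORK_PER_BUILD = true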
Restart the system to confirm that pushing a .gitlab-ci.yml with a job that uses the riscv64 tag like the following works properly.
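A minimal example of such a job; the image and script lines are just illustrations:

test-riscv64:
  tags:
    - riscv64
  image: debian:stable
  script:
    - uname -m
    - cat /etc/os-release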
At this point, things were working fine and I was running many successful builds. Now starts the fun part with operational aspects!
I had a problem when running buildah to build a new container from within a job, and noticed that aardvark-dns was crashing. You can use the Debian aardvark-dns binary instead.
wget http://ftp.de.debian.org/debian/pool/main/a/aardvark-dns/aardvark-dns_1.14.0-3_riscv64.deb
echo 'df33117b6069ac84d3e97dba2c59ba53775207dbaa1b123c3f87b3f312d2f87a  aardvark-dns_1.14.0-3_riscv64.deb' | sha256sum -c
mkdir t
cd t
dpkg -x ../aardvark-dns_1.14.0-3_riscv64.deb .
mv /usr/lib/podman/aardvark-dns /usr/lib/podman/aardvark-dns.ubuntu
mv usr/lib/podman/aardvark-dns /usr/lib/podman/aardvark-dns.debian
My setup uses podman in rootless mode without passing the privileged parameter or any add-cap parameters to add non-default capabilities. This is sufficient for most builds. However, if you try to create a container using buildah from within a job, you may see errors like this:
Writing manifest to image destination
Error: mounting new container: mounting build container "8bf1ec03d967eae87095906d8544f51309363ddf28c60462d16d73a0a7279ce1": creating overlay mount to /var/lib/containers/storage/overlay/23785e20a8bac468dbf028bf524274c91fbd70dae195a6cdb10241c345346e6f/merged, mount_data="lowerdir=/var/lib/containers/storage/overlay/l/I3TWYVYTRZ4KVYCT6FJKHR3WHW,upperdir=/var/lib/containers/storage/overlay/23785e20a8bac468dbf028bf524274c91fbd70dae195a6cdb10241c345346e6f/diff,workdir=/var/lib/containers/storage/overlay/23785e20a8bac468dbf028bf524274c91fbd70dae195a6cdb10241c345346e6f/work,volatile": using mount program /usr/bin/fuse-overlayfs: unknown argument ignored: lazytime
fuse: device not found, try 'modprobe fuse' first
fuse-overlayfs: cannot mount: No such file or directory
: exit status 1
Can we do better? After some experimentation, and reading open issues with suggested capabilities and configuration snippets, I ended up with the following configuration. It runs podman in rootless mode (as the gitlab-runner user) without --privileged, but adds the CAP_SYS_ADMIN capability and exposes the /dev/fuse device. Still, this is running as a non-root user on the machine, so I think it is an improvement compared to using --privileged and also compared to running podman as root.
Still, I worry about the security properties of such a setup, so I only enable these settings for a separately configured runner instance that I use when I need this docker-in-docker (oh, I mean buildah-in-podman) functionality. I found one article discussing Rootless Podman without the privileged flag that suggests isolation=chroot, but I have yet to make this work. Suggestions for improvement are welcome.
Happy Riscv64 Building!
Update 2025-05-05: I was able to make it work without the SYS_ADMIN capability too, with a GitLab /etc/gitlab-runner/config.toml like the following:
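The actual config is not reproduced here; a hypothetical sketch of what it presumably looks like, combining rootless podman, chroot isolation for buildah and the /dev/fuse device, with neither privileged nor cap_add:

[[runners]]
  name = "jas-p550-01-capless"
  url = "https://gitlab.com"
  token = "glrt-REPLACEME"
  executor = "docker"
  environment = ["BUILDAH_ISOLATION=chroot"]
  [runners.docker]
    host = "unix:///run/user/997/podman/podman.sock"
    image = "debian:stable"
    devices = ["/dev/fuse"]
    # no privileged = true and no cap_add = ["SYS_ADMIN"]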
I've updated the blog title to add the word capability-less as well. I've confirmed that the same recipe works on podman on a ppc64el platform too. The remaining loopholes are escaping from the chroot into the non-root gitlab-runner user, and escalating that privilege to root. The /dev/fuse and sub-uid/gid may be privilege escalation vectors here; otherwise I believe you've found a serious software security issue rather than a configuration mistake.
Back when I set up my home automation I ended up with one piece that used an external service: Amazon Alexa. I'd rather not have done this, but voice control is extremely convenient, both for us and for guests. Since then Home Assistant has done a lot of work in developing the capability of a local voice assistant - 2023 was their Year of Voice. I've had brief looks at this in the past, but never quite had the time to dig into setting it up, and was put off by the fact a lot of the setup instructions were just "Download our prebuilt components". While I admire the efforts to get Home Assistant fully packaged for Debian I accept that's a tricky proposition, and settle for running it in a venv on a Debian stable container. Voice requires a lot more binary components, and I want to have voice satellites in more than one location, so I set about trying to understand a bit better what I was deploying, and actually building the binary bits myself.
This is the start of a write-up of that. I'll break it into a bunch of posts, trying to cover one bit in each, because otherwise this will get massive. Let's start with some requirements:
All local processing; no call-outs to external services
Ability to have multiple voice satellites in the house
A desire to do wake word detection on the satellites, to avoid lots of network audio traffic all the time
As clean an install on a Debian stable based system as possible
Binaries built locally
No need for a GPU
My house server is an AMD Ryzen 7 5700G, so my expectation was that I'd have enough local processing power to be able to do this. That turned out to be a valid assumption - speech to text really has come a long way in recent years. I'm still running Home Assistant 2024.3.3 - the last one that supports (but complains about) Python 3.11. Trixie has started the freeze process, so once it releases I'll look at updating the HA install. For now what I have has turned out to be Good Enough, but I know there have been improvements upstream I'm missing.
Finally, before I get into the details, I should point out that if you just want to get started with a voice assistant on Home Assistant and don't care about what's under the hood, there are a bunch of more user-friendly details on Home Assistant's site itself, and they have pre-built images you can just deploy.
My first step was sorting out a voice satellite. This is the device that actually has a microphone and speaker and communicates with the main Home Assistant setup. I'd seen the post about a $13 voice assistant, and as a result had an ATOM Echo sitting on my desk I hadn't got around to setting up.
Here I skipped delving into exactly what's going on under the hood, even though I'm compiling locally. This is a constrained embedded device and, while I'm familiar with the ESP32 IDF build system, I just accepted that using ESPHome and letting it do its thing was the quickest way to get up and running. It is possible to do this all via the web with a pre-built image, but I wanted to change the wake word to "Hey Jarvis" rather than the default "Okay Nabu", and that was a good reason to bother doing a local build. We'll get into actually building a voice satellite on Debian in later posts.
I started with the default upstream assistant config and tweaked it a little for my setup:
diff of my configuration tweaks
(I note that the current upstream config has moved on a bit since I first did this, but I double-checked that the above instructions still work at the time of writing. I end up pinning ESPHome to the right version below due to that.)
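For illustration, a minimal sketch of the kind of tweaks involved; this is not my actual diff, and the exact keys depend on the upstream config version:

api:
  encryption:
    key: "REPLACE_WITH_BASE64_KEY"   # e.g. from: dd if=/dev/urandom bs=32 count=1 | base64

micro_wake_word:
  models:
    - model: hey_jarvis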
It turns out to be fairly easy to set up ESPHome in a venv and get it to build + flash the image for you:
Instructions for building + flashing ESPHome to ATOM Echo
noodles@sevai:~$ python3 -m venv esphome-atom-echo
noodles@sevai:~$ . esphome-atom-echo/bin/activate
(esphome-atom-echo) noodles@sevai:~$ cd esphome-atom-echo/
(esphome-atom-echo) noodles@sevai:~/esphome-atom-echo$ pip install esphome==2024.12.4
Collecting esphome==2024.12.4
Using cached esphome-2024.12.4-py3-none-any.whl (4.1 MB)
Successfully installed FontTools-4.57.0 PyYAML-6.0.2 appdirs-1.4.4 attrs-25.3.0 bottle-0.13.2 defcon-0.12.1 esphome-2024.12.4 esphome-dashboard-20241217.1 freetype-py-2.5.1 fs-2.4.16 gflanguages-0.7.3 glyphsLib-6.10.1 glyphsets-1.0.0 openstep-plist-0.5.0 pillow-10.4.0 platformio-6.1.16 protobuf-3.20.3 puremagic-1.27 ufoLib2-0.17.1 unicodedata2-16.0.0
(esphome-atom-echo) noodles@sevai:~/esphome-atom-echo$ esphome compile assistant.yaml
INFO ESPHome 2024.12.4
INFO Reading configuration assistant.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
INFO Updating https://github.com/jesserockz/esphome-components.git@None
Linking .pioenvs/study-atom-echo/firmware.elf
/home/noodles/.platformio/packages/toolchain-xtensa-esp32@8.4.0+2021r2-patch5/bin/../lib/gcc/xtensa-esp32-elf/8.4.0/../../../../xtensa-esp32-elf/bin/ld: missing --end-group; added as last command line option
RAM: [= ] 10.6% (used 34632 bytes from 327680 bytes)
Flash: [======== ] 79.8% (used 1463813 bytes from 1835008 bytes)
Building .pioenvs/study-atom-echo/firmware.bin
Creating esp32 image...
Successfully created esp32 image.
esp32_create_combined_bin([".pioenvs/study-atom-echo/firmware.bin"], [".pioenvs/study-atom-echo/firmware.elf"])
Wrote 0x176fb0 bytes to file /home/noodles/esphome-atom-echo/.esphome/build/study-atom-echo/.pioenvs/study-atom-echo/firmware.factory.bin, ready to flash to offset 0x0
esp32_copy_ota_bin([".pioenvs/study-atom-echo/firmware.bin"], [".pioenvs/study-atom-echo/firmware.elf"])
==================================================================================== [SUCCESS] Took 130.57 seconds ====================================================================================
INFO Successfully compiled program.
(esphome-atom-echo) noodles@sevai:~/esphome-atom-echo$ esphome upload --device /dev/serial/by-id/usb-Hades2001_M5stack_9552AF8367-if00-port0 assistant.yaml
INFO ESPHome 2024.12.4
INFO Reading configuration assistant.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
INFO Updating https://github.com/jesserockz/esphome-components.git@None
INFO Upload with baud rate 460800 failed. Trying again with baud rate 115200.
esptool.py v4.7.0
Serial port /dev/serial/by-id/usb-Hades2001_M5stack_9552AF8367-if00-port0
Connecting....
Chip is ESP32-PICO-D4 (revision v1.1)
Features: WiFi, BT, Dual Core, 240MHz, Embedded Flash, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: 64:b7:08:8a:1b:c0
Uploading stub...
Running stub...
Stub running...
Configuring flash size...
Auto-detected Flash size: 4MB
Flash will be erased from 0x00010000 to 0x00176fff...
Flash will be erased from 0x00001000 to 0x00007fff...
Flash will be erased from 0x00008000 to 0x00008fff...
Flash will be erased from 0x00009000 to 0x0000afff...
Compressed 1470384 bytes to 914252...
Wrote 1470384 bytes (914252 compressed) at 0x00010000 in 82.0 seconds (effective 143.5 kbit/s)...
Hash of data verified.
Compressed 25632 bytes to 16088...
Wrote 25632 bytes (16088 compressed) at 0x00001000 in 1.8 seconds (effective 113.1 kbit/s)...
Hash of data verified.
Compressed 3072 bytes to 134...
Wrote 3072 bytes (134 compressed) at 0x00008000 in 0.1 seconds (effective 383.7 kbit/s)...
Hash of data verified.
Compressed 8192 bytes to 31...
Wrote 8192 bytes (31 compressed) at 0x00009000 in 0.1 seconds (effective 813.5 kbit/s)...
Hash of data verified.
Leaving...
Hard resetting via RTS pin...
INFO Successfully uploaded program.
And then you can watch it boot (this is mine already configured up in Home Assistant):
Watching the ATOM Echo boot
$ picocom --quiet --imap lfcrlf --baud 115200 /dev/serial/by-id/usb-Hades2001_M5stack_9552AF8367-if00-port0
I (29) boot: ESP-IDF 4.4.8 2nd stage bootloader
I (29) boot: compile time 17:31:08
I (29) boot: Multicore bootloader
I (32) boot: chip revision: v1.1
I (36) boot.esp32: SPI Speed : 40MHz
I (40) boot.esp32: SPI Mode : DIO
I (45) boot.esp32: SPI Flash Size : 4MB
I (49) boot: Enabling RNG early entropy source...
I (55) boot: Partition Table:
I (58) boot: ## Label Usage Type ST Offset Length
I (66) boot: 0 otadata OTA data 01 00 00009000 00002000
I (73) boot: 1 phy_init RF data 01 01 0000b000 00001000
I (81) boot: 2 app0 OTA app 00 10 00010000 001c0000
I (88) boot: 3 app1 OTA app 00 11 001d0000 001c0000
I (96) boot: 4 nvs WiFi data 01 02 00390000 0006d000
I (103) boot: End of partition table
I (107) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=58974h (362868) map
I (247) esp_image: segment 1: paddr=0006899c vaddr=3ffb0000 size=03400h ( 13312) load
I (253) esp_image: segment 2: paddr=0006bda4 vaddr=40080000 size=04274h ( 17012) load
I (260) esp_image: segment 3: paddr=00070020 vaddr=400d0020 size=f5cb8h (1006776) map
I (626) esp_image: segment 4: paddr=00165ce0 vaddr=40084274 size=112ach ( 70316) load
I (665) boot: Loaded app from partition at offset 0x10000
I (665) boot: Disabling RNG early entropy source...
I (677) cpu_start: Multicore app
I (677) cpu_start: Pro cpu up.
I (677) cpu_start: Starting app cpu, entry point is 0x400825c8
I (0) cpu_start: App cpu up.
I (695) cpu_start: Pro cpu start user code
I (695) cpu_start: cpu freq: 160000000
I (695) cpu_start: Application information:
I (700) cpu_start: Project name: study-atom-echo
I (705) cpu_start: App version: 2024.12.4
I (710) cpu_start: Compile time: Apr 18 2025 17:29:39
I (716) cpu_start: ELF file SHA256: 1db4989a56c6c930...
I (722) cpu_start: ESP-IDF: 4.4.8
I (727) cpu_start: Min chip rev: v0.0
I (732) cpu_start: Max chip rev: v3.99
I (737) cpu_start: Chip rev: v1.1
I (742) heap_init: Initializing. RAM available for dynamic allocation:
I (749) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (755) heap_init: At 3FFB8748 len 000278B8 (158 KiB): DRAM
I (761) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (767) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (774) heap_init: At 40095520 len 0000AAE0 (42 KiB): IRAM
I (781) spi_flash: detected chip: gd
I (784) spi_flash: flash io: dio
I (790) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
[I][logger:171]: Log initialized
[C][safe_mode:079]: There have been 0 suspected unsuccessful boot attempts
[D][esp32.preferences:114]: Saving 1 preferences to flash...
[D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[I][app:029]: Running through setup()...
[C][esp32_rmt_led_strip:021]: Setting up ESP32 LED Strip...
[D][template.select:014]: Setting up Template Select
[D][template.select:023]: State from initial (could not load stored index): On device
[D][select:015]: 'Wake word engine location': Sending state On device (index 1)
[D][esp-idf:000]: I (100) gpio: GPIO[39] InputEn: 1 OutputEn: 0 OpenDrain: 0 Pullup: 0 Pulldown: 0 Intr:0
[D][binary_sensor:034]: 'Button': Sending initial state OFF
[C][light:021]: Setting up light 'M5Stack Atom Echo 8a1bc0'...
[D][light:036]: 'M5Stack Atom Echo 8a1bc0' Setting:
[D][light:041]: Color mode: RGB
[D][template.switch:046]: Restored state ON
[D][switch:012]: 'Use listen light' Turning ON.
[D][switch:055]: 'Use listen light': Sending state ON
[D][light:036]: 'M5Stack Atom Echo 8a1bc0' Setting:
[D][light:047]: State: ON
[D][light:051]: Brightness: 60%
[D][light:059]: Red: 100%, Green: 89%, Blue: 71%
[D][template.switch:046]: Restored state OFF
[D][switch:016]: 'timer_ringing' Turning OFF.
[D][switch:055]: 'timer_ringing': Sending state OFF
[C][i2s_audio:028]: Setting up I2S Audio...
[C][i2s_audio.microphone:018]: Setting up I2S Audio Microphone...
[C][i2s_audio.speaker:096]: Setting up I2S Audio Speaker...
[C][wifi:048]: Setting up WiFi...
[D][esp-idf:000]: I (206) wifi:
[D][esp-idf:000]: wifi driver task: 3ffc8544, prio:23, stack:6656, core=0
[D][esp-idf:000]:
[D][esp-idf:000][wifi]: I (1238) system_api: Base MAC address is not set
[D][esp-idf:000][wifi]: I (1239) system_api: read default base MAC address from EFUSE
[D][esp-idf:000][wifi]: I (1274) wifi:
[D][esp-idf:000][wifi]: wifi firmware version: ff661c3
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1274) wifi:
[D][esp-idf:000][wifi]: wifi certification version: v7.0
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1286) wifi:
[D][esp-idf:000][wifi]: config NVS flash: enabled
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1297) wifi:
[D][esp-idf:000][wifi]: config nano formating: disabled
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1317) wifi:
[D][esp-idf:000][wifi]: Init data frame dynamic rx buffer num: 32
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1338) wifi:
[D][esp-idf:000][wifi]: Init static rx mgmt buffer num: 5
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1348) wifi:
[D][esp-idf:000][wifi]: Init management short buffer num: 32
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1368) wifi:
[D][esp-idf:000][wifi]: Init dynamic tx buffer num: 32
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1389) wifi:
[D][esp-idf:000][wifi]: Init static rx buffer size: 1600
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1399) wifi:
[D][esp-idf:000][wifi]: Init static rx buffer num: 10
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1419) wifi:
[D][esp-idf:000][wifi]: Init dynamic rx buffer num: 32
[D][esp-idf:000][wifi]:
[D][esp-idf:000]: I (1441) wifi_init: rx ba win: 6
[D][esp-idf:000]: I (1441) wifi_init: tcpip mbox: 32
[D][esp-idf:000]: I (1450) wifi_init: udp mbox: 6
[D][esp-idf:000]: I (1450) wifi_init: tcp mbox: 6
[D][esp-idf:000]: I (1460) wifi_init: tcp tx win: 5760
[D][esp-idf:000]: I (1471) wifi_init: tcp rx win: 5760
[D][esp-idf:000]: I (1481) wifi_init: tcp mss: 1440
[D][esp-idf:000]: I (1481) wifi_init: WiFi IRAM OP enabled
[D][esp-idf:000]: I (1491) wifi_init: WiFi RX IRAM OP enabled
[C][wifi:061]: Starting WiFi...
[C][wifi:062]: Local MAC: 64:B7:08:8A:1B:C0
[D][esp-idf:000][wifi]: I (1513) phy_init: phy_version 4791,2c4672b,Dec 20 2023,16:06:06
[D][esp-idf:000][wifi]: I (1599) wifi:
[D][esp-idf:000][wifi]: mode : sta (64:b7:08:8a:1b:c0)
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1600) wifi:
[D][esp-idf:000][wifi]: enable tsf
[D][esp-idf:000][wifi]:
[D][esp-idf:000][wifi]: I (1605) wifi:
[D][esp-idf:000][wifi]: Set ps type: 1
[D][esp-idf:000][wifi]:
[D][wifi:482]: Starting scan...
[D][esp32.preferences:114]: Saving 1 preferences to flash...
[D][esp32.preferences:143]: Saving 1 preferences to flash: 1 cached, 0 written, 0 failed
[W][micro_wake_word:151]: Wake word detection can't start as the component hasn't been setup yet
[D][esp-idf:000][wifi]: I (1646) wifi:
[D][esp-idf:000][wifi]: Set ps type: 1
[D][esp-idf:000][wifi]:
[W][component:157]: Component wifi set Warning flag: scanning for networks
[I][wifi:617]: WiFi Connected!
[D][wifi:626]: Disabling AP...
[C][api:026]: Setting up Home Assistant API server...
[C][micro_wake_word:062]: Setting up microWakeWord...
[C][micro_wake_word:069]: Micro Wake Word initialized
[I][app:062]: setup() finished successfully!
[W][component:170]: Component wifi cleared Warning flag
[W][component:157]: Component api set Warning flag: unspecified
[I][app:100]: ESPHome version 2024.12.4 compiled on Apr 18 2025, 17:29:39
[C][logger:185]: Logger:
[C][logger:186]: Level: DEBUG
[C][logger:188]: Log Baud Rate: 115200
[C][logger:189]: Hardware UART: UART0
[C][esp32_rmt_led_strip:187]: ESP32 RMT LED Strip:
[C][esp32_rmt_led_strip:188]: Pin: 27
[C][esp32_rmt_led_strip:189]: Channel: 0
[C][esp32_rmt_led_strip:214]: RGB Order: GRB
[C][esp32_rmt_led_strip:215]: Max refresh rate: 0
[C][esp32_rmt_led_strip:216]: Number of LEDs: 1
[C][template.select:065]: Template Select 'Wake word engine location'
[C][template.select:066]: Update Interval: 60.0s
[C][template.select:069]: Optimistic: YES
[C][template.select:070]: Initial Option: On device
[C][template.select:071]: Restore Value: YES
[C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Button'
[C][gpio.binary_sensor:016]: Pin: GPIO39
[C][light:092]: Light 'M5Stack Atom Echo 8a1bc0'
[C][light:094]: Default Transition Length: 0.0s
[C][light:095]: Gamma Correct: 2.80
[C][template.switch:068]: Template Switch 'Use listen light'
[C][template.switch:091]: Restore Mode: restore defaults to ON
[C][template.switch:057]: Optimistic: YES
[C][template.switch:068]: Template Switch 'timer_ringing'
[C][template.switch:091]: Restore Mode: always OFF
[C][template.switch:057]: Optimistic: YES
[C][factory_reset.button:011]: Factory Reset Button 'Factory reset'
[C][factory_reset.button:011]: Icon: 'mdi:restart-alert'
[C][captive_portal:089]: Captive Portal:
[C][mdns:116]: mDNS:
[C][mdns:117]: Hostname: study-atom-echo-8a1bc0
[C][esphome.ota:073]: Over-The-Air updates:
[C][esphome.ota:074]: Address: study-atom-echo.local:3232
[C][esphome.ota:075]: Version: 2
[C][esphome.ota:078]: Password configured
[C][safe_mode:018]: Safe Mode:
[C][safe_mode:020]: Boot considered successful after 60 seconds
[C][safe_mode:021]: Invoke after 10 boot attempts
[C][safe_mode:023]: Remain in safe mode for 300 seconds
[C][api:140]: API Server:
[C][api:141]: Address: study-atom-echo.local:6053
[C][api:143]: Using noise encryption: YES
[C][micro_wake_word:051]: microWakeWord:
[C][micro_wake_word:052]: models:
[C][micro_wake_word:015]: - Wake Word: Hey Jarvis
[C][micro_wake_word:016]: Probability cutoff: 0.970
[C][micro_wake_word:017]: Sliding window size: 5
[C][micro_wake_word:021]: - VAD Model
[C][micro_wake_word:022]: Probability cutoff: 0.500
[C][micro_wake_word:023]: Sliding window size: 5
[D][api:103]: Accepted 192.168.39.6
[W][component:170]: Component api cleared Warning flag
[W][component:237]: Component api took a long time for an operation (58 ms).
[W][component:238]: Components should block for at most 30 ms.
[D][api.connection:1446]: Home Assistant 2024.3.3 (192.168.39.6): Connected successfully
[D][ring_buffer:034]: Created ring buffer with size 2048
[D][micro_wake_word:399]: Resetting buffers and probabilities
[D][micro_wake_word:195]: State changed from IDLE to START_MICROPHONE
[D][micro_wake_word:107]: Starting Microphone
[D][micro_wake_word:195]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[D][esp-idf:000]: I (11279) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[D][micro_wake_word:195]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
That's enough to get a voice satellite that can be configured up in Home Assistant; you'll need the ESPHome Integration added, then for the noise_psk key you use the same string as I have under api/encryption/key in my diff above (obviously do your own; I used dd if=/dev/urandom bs=32 count=1 | base64 to generate mine).
If you're like me and a compulsive VLANer and firewaller even within your own network, then you need to allow Home Assistant to connect on TCP port 6053 to the ATOM Echo, and also allow access to/from UDP port 6055 on the Echo (it'll send audio from that port to Home Assistant, then receive back audio to the same port).
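As a rough illustration (not my actual rules), assuming an nftables router that already has an inet filter table with a forward chain, where HA is the Home Assistant host and ECHO is the ATOM Echo:

HA=192.168.39.6; ECHO=192.168.40.10    # placeholder addresses
nft add rule inet filter forward ip saddr $HA ip daddr $ECHO tcp dport 6053 accept
nft add rule inet filter forward ip saddr $ECHO ip daddr $HA udp sport 6055 accept
nft add rule inet filter forward ip saddr $HA ip daddr $ECHO udp dport 6055 accept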
At this point you can now shout "Hey Jarvis, what time is it?" at the Echo, and the white light will turn flashing blue (indicating it's heard the wake word), which means we're ready to teach Home Assistant how to do something with the incoming audio.
Remember the XZ Utils backdoor? One factor that enabled the attack was poor auditing of the release tarballs for differences compared to the Git version controlled source code. This proved to be a useful place to distribute malicious data.
The differences between release tarballs and upstream Git sources are typically vendored and generated files. Lots of them. Auditing all source tarballs in a distribution for similar issues is hard and boring work for humans. Wouldn't it be better if that human auditing time could be spent auditing the actual source code stored in upstream version control instead? That's where auditing time would help the most.
Are there better ways to address the concern about differences between version control sources and tarball artifacts? Let s consider some approaches:
Stop publishing (or at least stop building from) source tarballs that differ from version control sources.
Create recipes for how to derive the published source tarballs from version control sources. Verify that independently from upstream.
While I like the properties of the first solution, and have made efforts to support that approach, I don't think normal source tarballs are going away any time soon. I am concerned that it may not even be a desirable complete solution to this problem. We may need tarballs with pre-generated content in them for various reasons that aren't entirely clear to us today.
So let's consider the second approach. It could help while waiting for more experience with the first approach, to see if there are any fundamental problems with it.
How do you know that the XZ release tarballs were actually derived from its version control sources? The same for Gzip? Coreutils? Tar? Sed? Bash? GCC? We don't know this! I am not aware of any automated or collaborative effort to perform this independent confirmation. Nor am I aware of anyone attempting to do this on a regular basis. We would want to be able to do this in the year 2042 too. I think the best way to reach that is to do the verification continuously in a pipeline, fixing bugs as time passes. The current state of the art seems to be that people audit the differences manually and hope to find something. I suspect many package maintainers ignore the problem and take the release source tarballs and trust upstream about this.
We can do better.
I have launched a project to set up a GitLab pipeline that invokes per-release scripts to rebuild that release artifact from git sources. Currently it only contains recipes for projects that I released myself: releases which were done in a controlled way, with considerable care to make reproducing the tarballs possible. The project homepage is here:
https://gitlab.com/debdistutils/verify-reproducible-releases
The project is able to reproduce the release tarballs for Libtasn1 v4.20.0, InetUtils v2.6, Libidn2 v2.3.8, Libidn v1.43, and GNU SASL v2.2.2. You can see this in a recent successful pipeline. All of those releases were prepared using Guix, and I'm hoping the Guix time-machine will make it possible to keep re-generating these tarballs for many years to come.
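Roughly, a per-release recipe boils down to checking out the release tag, regenerating the tarball, and comparing checksums against the official artifact. The following is only a sketch of that shape for a gnulib-based project such as Libidn2, not one of the project's actual scripts (which additionally pin the build environment, e.g. via Guix):

set -eu
git clone https://gitlab.com/libidn/libidn2.git
cd libidn2
git checkout v2.3.8
./bootstrap                       # regenerate autotools/gnulib files
./configure
make dist                         # produce libidn2-2.3.8.tar.gz
sha256sum libidn2-2.3.8.tar.gz    # compare against the official release tarball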
I spent some time trying to reproduce the current XZ release tarball for version 5.8.1. That would have been a nice example, wouldn't it? First I had to somehow mimic upstream's build environment. The XZ release tarball contains GNU Libtool files that are identified with version 2.5.4.1-baa1-dirty. I initially assumed this was due to the maintainer having installed libtool from git locally (after making some modifications) and made the XZ release using it. Later I learned that it may actually be coming from ArchLinux, which ships with this particular libtool version. It seems weird for a distribution to use libtool built from a non-release tag, and furthermore to apply patches to it, but things are what they are. I made some effort to set up an ArchLinux build environment, however the now-current Gettext version in ArchLinux seems to be more recent than the one that was used to prepare the XZ release. I don't know enough ArchLinux to set up an environment corresponding to an earlier version of ArchLinux, which would be required to finish this. I gave up; maybe the XZ release wasn't prepared on ArchLinux after all. Actually XZ became a good example for this writeup anyway: while you would think this should be trivial, the fact is that it isn't! (There is another aspect here: fingerprinting the versions used to prepare release tarballs allows you to infer what kind of OS maintainers are using to make releases on, which is interesting on its own.)
I made some small attempts to reproduce the tarball for GNU Shepherd version 1.0.4 too, but I still haven't managed to complete it.
Do you want a supply-chain challenge for the Easter weekend? Pick some well-known software and try to re-create the official release tarballs from the corresponding Git checkout. Is anyone able to reproduce anything these days? Bonus points for wrapping it up as a merge request to my project.
Happy Supply-Chain Security Hacking!
In addition to all the regular testing I am testing our snaps in a non-KDE environment, and so far it is not looking good in Xubuntu. We have kernel/glibc crashes on startup for some and on file open for others. I am working on a hopeful fix.
Next week I will have (I hope) my final surgery. If you can spare any change to help bring me over the finish line, I will be forever grateful.
Nextcloud is an open-source software suite that
enables you to set up and manage your own cloud storage and collaboration
platform. It offers a range of features similar to popular cloud services
like Google Drive or Dropbox but with the added benefit of complete control
over your data and the server where it's hosted.
I wanted to have a look at Nextcloud and the steps to set up my own instance
with a PostgreSQL-based database together with NGinx as the webserver to
serve the WebUI. Before doing a full productive setup I wanted to play
around locally with all the needed steps and worked everything out within
a KVM machine.
While doing this I wrote down some notes, mostly to document for myself what
I need to do to get a Nextcloud installation running and usable. So this
manual describes how to set up a Nextcloud installation on Debian 12 Bookworm
based on NGinx and PostgreSQL.
Nextcloud Installation
Install PHP and PHP extensions for Nextcloud
Nextcloud is basically a PHP application, so we need to install PHP packages
to get it working in the end. The following steps are based on the upstream
documentation about how to install your own
Nextcloud instance.
Installing the virtual package php on a Debian Bookworm system
would pull in the depending meta package php8.2. This package itself would
then also pull in the package libapache2-mod-php8.2 as a dependency, which
in turn would pull in the apache2 webserver as a depending package. This
is something I didn't want to have, as I want to use the NGinx that is already
installed on the system instead.
To get this we need to explicitly exclude the package libapache2-mod-php8.2
from the list of packages which we want to install. To achieve this we have
to append a hyphen - at the end of the package name, so we need to use
libapache2-mod-php8.2- within the package list; that tells apt to
ignore this package as a dependency. I ended up with this call to get all
needed dependencies installed.
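The exact extension list is something to adjust to your own needs (the selection below is an assumption, not the original call); a command along these lines installs PHP-FPM plus common Nextcloud extensions while excluding the Apache module via the trailing hyphen:

$ sudo apt install php php-fpm php-pgsql php-xml php-gd php-curl php-mbstring \
    php-zip php-intl php-bcmath php-gmp libapache2-mod-php8.2-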
To make these settings effective, restart the php-fpm service
$ sudo systemctl restart php8.2-fpm
Install PostgreSQL, Create a database and user
This manual assumes we will use a PostgreSQL server on localhost; if you
have a server instance on some remote site you can skip the installation
step here.
$ sudo apt install postgresql postgresql-contrib postgresql-client
Check the version after installation (optional step):
$ sudo -i -u postgres
$ psql --version
This output will be seen:
psql (15.12 (Debian 15.12-0+deb12u2))
Exit the PSQL shell by using the command \q.
postgres=# \q
Exit the CLI of the postgres user:
postgres@host:~$ exit
Create a PostgreSQL Database and User:
Create a new PostgreSQL user (Use a strong password!):
$ sudo -u postgres psql -c "CREATE USER nextcloud_user PASSWORD '1234';"
Create new database and grant access:
$ sudo -u postgres psql -c "CREATE DATABASE nextcloud_db WITH OWNER nextcloud_user ENCODING=UTF8;"
(Optional) Check if we can now connect to the database server and the
database in detail (you will get asked for the password of the database
user!). If this is not working it makes no sense to proceed further! We need to
fix the access first!
$ psql -h localhost -U nextcloud_user -d nextcloud_db
or
$ psql -h 127.0.0.1 -U nextcloud_user -d nextcloud_db
Log out from postgres shell using the command \q.
Download and install Nextcloud
Use the following command to download the latest version of Nextcloud:
$ wget https://download.nextcloud.com/server/releases/latest.zip
Extract file into the folder /var/www/html with the following command:
$ sudo unzip latest.zip -d /var/www/html
Change ownership of the /var/www/html/nextcloud directory to www-data.
$ sudo chown -R www-data:www-data /var/www/html/nextcloud
Configure NGinx for Nextcloud to use a certificate
In case you want to use a self-signed certificate, e.g. if you are just playing around
and setting up Nextcloud locally for testing purposes, you can do the following.
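A self-signed key pair matching the paths used in the NGinx configuration below can be created, for example, like this:

$ sudo openssl req -x509 -nodes -days 365 -newkey rsa:4096 \
    -subj "/CN=nextcloud.local" \
    -keyout /etc/ssl/private/nextcloud.key \
    -out /etc/ssl/certs/nextcloud.crt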
If you want or need to use the service of Let's Encrypt (or similar), skip
the step above and create your required key data by using this command:
$ sudo certbot --nginx -d nextcloud.your-domain.com
You will need to adjust the path to the key and certificate in the next
step!
Change the NGinx configuration:
$ sudo vi /etc/nginx/sites-available/nextcloud.conf
Add the following snippet into the file and save it.
# /etc/nginx/sites-available/nextcloud.conf
upstream php-handler {
    #server 127.0.0.1:9000;
    server unix:/run/php/php8.2-fpm.sock;
}

# Set the immutable cache control options only for assets with a cache
# busting v argument
map $arg_v $asset_immutable {
    "" "";
    default ", immutable";
}

server {
    listen 80;
    listen [::]:80;
    # Adjust this to the correct server name!
    server_name nextcloud.local;

    # Prevent NGinx HTTP Server Detection
    server_tokens off;

    # Enforce HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    # Adjust this to the correct server name!
    server_name nextcloud.local;

    # Path to the root of your installation
    root /var/www/html/nextcloud;

    # Use Mozilla's guidelines for SSL/TLS settings
    # https://mozilla.github.io/server-side-tls/ssl-config-generator/
    # Adjust the usage and paths of the correct key data! E.g. if you want to use Let's Encrypt key material!
    ssl_certificate /etc/ssl/certs/nextcloud.crt;
    ssl_certificate_key /etc/ssl/private/nextcloud.key;
    # ssl_certificate /etc/letsencrypt/live/nextcloud.your-domain.com/fullchain.pem;
    # ssl_certificate_key /etc/letsencrypt/live/nextcloud.your-domain.com/privkey.pem;

    # Prevent NGinx HTTP Server Detection
    server_tokens off;

    # HSTS settings
    # WARNING: Only add the preload option once you read about
    # the consequences in https://hstspreload.org/. This option
    # will add the domain to a hardcoded list that is shipped
    # in all major browsers and getting removed from this list
    # could take several months.
    #add_header Strict-Transport-Security "max-age=15768000; includeSubDomains; preload" always;

    # set max upload size and increase upload timeout:
    client_max_body_size 512M;
    client_body_timeout 300s;
    fastcgi_buffers 64 4K;

    # Enable gzip but do not remove ETag headers
    gzip on;
    gzip_vary on;
    gzip_comp_level 4;
    gzip_min_length 256;
    gzip_proxied expired no-cache no-store private no_last_modified no_etag auth;
    gzip_types application/atom+xml text/javascript application/javascript application/json application/ld+json application/manifest+json application/rss+xml application/vnd.geo+json application/vnd.ms-fontobject application/wasm application/x-font-ttf application/x-web-app-manifest+json application/xhtml+xml application/xml font/opentype image/bmp image/svg+xml image/x-icon text/cache-manifest text/css text/plain text/vcard text/vnd.rim.location.xloc text/vtt text/x-component text/x-cross-domain-policy;

    # Pagespeed is not supported by Nextcloud, so if your server is built
    # with the ngx_pagespeed module, uncomment this line to disable it.
    #pagespeed off;

    # The settings allows you to optimize the HTTP2 bandwidth.
    # See https://blog.cloudflare.com/delivering-http-2-upload-speed-improvements/
    # for tuning hints
    client_body_buffer_size 512k;

    # HTTP response headers borrowed from Nextcloud .htaccess
    add_header Referrer-Policy "no-referrer" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Permitted-Cross-Domain-Policies "none" always;
    add_header X-Robots-Tag "noindex, nofollow" always;
    add_header X-XSS-Protection "1; mode=block" always;

    # Remove X-Powered-By, which is an information leak
    fastcgi_hide_header X-Powered-By;

    # Set .mjs and .wasm MIME types
    # Either include it in the default mime.types list
    # and include that list explicitly or add the file extension
    # only for Nextcloud like below:
    include mime.types;
    types {
        text/javascript js mjs;
        application/wasm wasm;
    }

    # Specify how to handle directories -- specifying /index.php$request_uri
    # here as the fallback means that NGinx always exhibits the desired behaviour
    # when a client requests a path that corresponds to a directory that exists
    # on the server. In particular, if that directory contains an index.php file,
    # that file is correctly served; if it doesn't, then the request is passed to
    # the front-end controller. This consistent behaviour means that we don't need
    # to specify custom rules for certain paths (e.g. images and other assets,
    # /updater, /ocs-provider), and thus
    # try_files $uri $uri/ /index.php$request_uri
    # always provides the desired behaviour.
    index index.php index.html /index.php$request_uri;

    # Rule borrowed from .htaccess to handle Microsoft DAV clients
    location = / {
        if ( $http_user_agent ~ ^DavClnt ) {
            return 302 /remote.php/webdav/$is_args$args;
        }
    }

    location = /robots.txt {
        allow all;
        log_not_found off;
        access_log off;
    }

    # Make a regex exception for /.well-known so that clients can still
    # access it despite the existence of the regex rule
    # location ~ /(\.|autotest|...) which would otherwise handle requests
    # for /.well-known.
    location ^~ /.well-known {
        # The rules in this block are an adaptation of the rules
        # in .htaccess that concern /.well-known.

        location = /.well-known/carddav { return 301 /remote.php/dav/; }
        location = /.well-known/caldav  { return 301 /remote.php/dav/; }

        location /.well-known/acme-challenge  { try_files $uri $uri/ =404; }
        location /.well-known/pki-validation  { try_files $uri $uri/ =404; }

        # Let Nextcloud's API for /.well-known URIs handle all other
        # requests by passing them to the front-end controller.
        return 301 /index.php$request_uri;
    }

    # Rules borrowed from .htaccess to hide certain paths from clients
    location ~ ^/(?:build|tests|config|lib|3rdparty|templates|data)(?:$|/)  { return 404; }
    location ~ ^/(?:\.|autotest|occ|issue|indie|db_|console)                { return 404; }

    # Ensure this block, which passes PHP files to the PHP process, is above the blocks
    # which handle static assets (as seen below). If this block is not declared first,
    # then NGinx will encounter an infinite rewriting loop when it prepends /index.php
    # to the URI, resulting in a HTTP 500 error response.
    location ~ \.php(?:$|/) {
        # Required for legacy support
        rewrite ^/(?!index|remote|public|cron|core\/ajax\/update|status|ocs\/v[12]|updater\/.+|ocs-provider\/.+|.+\/richdocumentscode(_arm64)?\/proxy) /index.php$request_uri;

        fastcgi_split_path_info ^(.+?\.php)(/.*)$;
        set $path_info $fastcgi_path_info;

        try_files $fastcgi_script_name =404;

        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param PATH_INFO $path_info;
        fastcgi_param HTTPS on;

        fastcgi_param modHeadersAvailable true;         # Avoid sending the security headers twice
        fastcgi_param front_controller_active true;     # Enable pretty urls
        fastcgi_pass php-handler;

        fastcgi_intercept_errors on;
        fastcgi_request_buffering off;

        fastcgi_max_temp_file_size 0;
    }

    # Serve static files
    location ~ \.(?:css|js|mjs|svg|gif|png|jpg|ico|wasm|tflite|map|ogg|flac)$ {
        try_files $uri /index.php$request_uri;
        # HTTP response headers borrowed from Nextcloud .htaccess
        add_header Cache-Control "public, max-age=15778463$asset_immutable";
        add_header Referrer-Policy "no-referrer" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header X-Permitted-Cross-Domain-Policies "none" always;
        add_header X-Robots-Tag "noindex, nofollow" always;
        add_header X-XSS-Protection "1; mode=block" always;
        access_log off;     # Optional: Don't log access to assets
    }

    location ~ \.woff2?$ {
        try_files $uri /index.php$request_uri;
        expires 7d;         # Cache-Control policy borrowed from .htaccess
        access_log off;     # Optional: Don't log access to assets
    }

    # Rule borrowed from .htaccess
    location /remote {
        return 301 /remote.php$request_uri;
    }

    location / {
        try_files $uri $uri/ /index.php$request_uri;
    }
}
Symlink the configuration from sites-available to sites-enabled.
$ ln -s /etc/nginx/sites-available/nextcloud.conf /etc/nginx/sites-enabled/
Restart NGinx and access the URI in the browser.
Go through the installation of Nextcloud.
The user entered in the installation dialog (e.g. administrator or similar)
will be granted administrative access rights in Nextcloud!
To adjust the database connection details you have to edit the file
$install_folder/config/config.php.
That means, for the example within this post, you would need to modify
/var/www/html/nextcloud/config/config.php to control or change the
database connection.
---%<---
'dbname' => 'nextcloud_db',
'dbhost' => 'localhost',   # (or your remote PostgreSQL server address if you have one)
'dbport' => '',
'dbtableprefix' => 'oc_',
'dbuser' => 'nextcloud_user',
'dbpassword' => '1234',    # (the password you set for the database user)
--->%---
After the installation and setup of the Nextcloud PHP application there are
more steps to be done. Have a look in the WebUI to see which additional steps
you will need to take, like creating a cronjob or tuning some more PHP
configuration.
If you've done everything correctly you should see a login page similar to
this:
Optional other steps for more enhanced configuration modifications
Move the data folder to somewhere else
The data folder is the root folder for all user content. By default it is
located in $install_folder/data, so in our case here it is in
/var/www/html/nextcloud/data.
Move the data directory outside the web server document root.
$ sudo mv /var/www/html/nextcloud/data /var/nextcloud_data
Ensure correct access permissions (mostly not needed if you just moved the folder):
$ sudo chown -R www-data:www-data /var/nextcloud_data
$ sudo chown -R www-data:www-data /var/www/html/nextcloud/
Update the Nextcloud configuration:
Open the config/config.php file of your Nextcloud installation.
$ sudo vi /var/www/html/nextcloud/config/config.php
Update the datadirectory parameter to point to the new location of your data directory.
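For example, after the move above the relevant entry would look like this:
---%<---
'datadirectory' => '/var/nextcloud_data',
--->%---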
Make the installation available for multiple FQDNs on the same server
Adjust the Nextcloud configuration to listen and accept requests for
different domain names. Configure and adjust the key trusted_domains
accordingly.
$ sudo vi /var/www/html/nextcloud/config/config.php
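For example, a hypothetical entry listing two domain names (adjust to your actual FQDNs):
---%<---
'trusted_domains' =>
  array (
    0 => 'nextcloud.local',
    1 => 'nextcloud.your-domain.com',
  ),
--->%---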
Create and adjust the needed site configurations for the webserver.
Restart the NGinx unit.
An error message about .ocdata might occur
.ocdata is not found inside the data directory
Create file using touch and set necessary permissions.
$ sudo touch /var/nextcloud_data/.ocdata
$ sudo chown -R www-data:www-data /var/nextcloud_data/
The password for the administrator user is unknown
Log in to your server:
SSH into the server where your PostgreSQL database is hosted.
Switch to the PostgreSQL user:
$ sudo -i -u postgres
Access the PostgreSQL command line
psql
List the databases: (If you're unsure which database is being used by Nextcloud, you can list all the databases with the \l command.)
\l
Switch to the Nextcloud database:
Switch to the specific database that Nextcloud is using.
\c nextcloud_db
Reset the password for the Nextcloud database user:
ALTER USER nextcloud_user WITH PASSWORD 'new_password';
Exit the PostgreSQL command line:
\q
Verify Database Configuration:
Check the database connection details in the config.php file to ensure they are correct.
sudo vi /var/www/html/nextcloud/config/config.php
Replace nextcloud_db, nextcloud_user, and your_password with your actual database name, user, and password.
---%<---
'dbname' => 'nextcloud_db',
'dbhost' => 'localhost',   # (or your PostgreSQL server address)
'dbport' => '',
'dbtableprefix' => 'oc_',
'dbuser' => 'nextcloud_user',
'dbpassword' => '1234',    # (the password you set for nextcloud_user)
--->%---
Restart NGinx and access the UI through the browser.
Welcome to the third report in 2025 from the Reproducible Builds project. Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. As usual, however, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website.
Table of contents:
Debian bookworm live images now fully reproducible from their binary packages
Roland Clobus announced on our mailing list this month that all the major desktop variants (ie. Gnome, KDE, etc.) can be reproducibly created for Debian bullseye, bookworm and trixie from their (pre-compiled) binary packages.
Building reproducible Debian live images does not require building from reproducible source code, but this is still a remarkable achievement. Some large proportion of the binary packages that comprise these live images can be (and were) built reproducibly, but live image generation works at a higher level. (By contrast, full or end-to-end reproducibility of a bootable OS image will, in time, require both the compile-the-packages and the build-the-bootable-image stages to be reproducible.)
Nevertheless, in response, Roland's announcement generated significant congratulations as well as some discussion regarding the finer points of the terms employed: a full outline of the replies can be found here.
The news was also picked up by Linux Weekly News (LWN) as well as on Hacker News.
LWN: Fedora change aims for 99% package reproducibility
Linux Weekly News (LWN) contributor Joe Brockmeier has published a detailed round-up on how Fedora change aims for 99% package reproducibility. The article opens by mentioning that although Debian has been working toward reproducible builds for more than a decade, the Fedora project has now:
progressed far enough that the project is now considering a change proposal for the Fedora 43 development cycle, expected to be released in October, with a goal of making 99% of Fedora s package builds reproducible. So far, reaction to the proposal seems favorable and focused primarily on how to achieve the goal with minimal pain for packagers rather than whether to attempt it.
Over the last few releases, we [Fedora] changed our build infrastructure to make package builds reproducible. This is enough to reach 90%. The remaining issues need to be fixed in individual packages. After this Change, package builds are expected to be reproducible. Bugs will be filed against packages when an irreproducibility is detected. The goal is to have no fewer than 99% of package builds reproducible.
Python adopts PEP standard for specifying package dependencies
Python developer Brett Cannon reported on Fosstodon that PEP 751 was recently accepted. This design document has the purpose of describing a file format to record Python dependencies for installation reproducibility. As the abstract of the proposal writes:
This PEP proposes a new file format for specifying dependencies to enable reproducible installation in a Python environment. The format is designed to be human-readable and machine-generated. Installers consuming the file should be able to calculate what to install without the need for dependency resolution at install-time.
The PEP, which itself supersedes PEP 665, mentions that there are at least five well-known solutions to this problem in the community .
OSS Rebuild real-time validation and tooling improvements
OSS Rebuild aims to automate rebuilding upstream language packages (e.g. from PyPI, crates.io, npm registries) and publish signed attestations and build definitions for public use.
OSS Rebuild is now attempting rebuilds as packages are published, shortening the time to validating rebuilds and publishing attestations.
Aman Sharma contributed classifiers and fixes for common sources of non-determinism in JAR packages.
Improvements were also made to some of the core tools in the project:
timewarp for simulating the registry responses from sometime in the past.
proxy for transparent interception and logging of network activity.
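Underneath the attestations, the core validation idea is conceptually simple: rebuild the artifact yourself and compare its digest against what the registry serves. Here is a toy sketch of that comparison using PyPI's public JSON API; the package name and rebuilt wheel path are hypothetical, and OSS Rebuild's real pipeline and attestation format are considerably more involved:

```python
import hashlib
import json
import urllib.request
from pathlib import Path

# Hypothetical release and a locally rebuilt wheel for it.
name, version = "example-package", "1.0.0"
rebuilt_wheel = Path("rebuild/example_package-1.0.0-py3-none-any.whl")

# PyPI's JSON API publishes the SHA-256 digest of every uploaded file.
url = f"https://pypi.org/pypi/{name}/{version}/json"
with urllib.request.urlopen(url) as response:
    release = json.load(response)

published = {f["filename"]: f["digests"]["sha256"] for f in release["urls"]}
local = hashlib.sha256(rebuilt_wheel.read_bytes()).hexdigest()

if published.get(rebuilt_wheel.name) == local:
    print("Rebuilt wheel matches the published artifact bit for bit.")
else:
    print("Mismatch (or unknown filename): the rebuild did not reproduce the upload.")
```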
SimpleX Chat server components now reproducible
SimpleX Chat is a privacy-oriented decentralised messaging platform that eliminates user identifiers and metadata, offers end-to-end encryption and has a unique approach to decentralised identity. Starting from version 6.3, SimpleX has implemented reproducible builds for its server components. This advancement allows anyone to verify that the binaries distributed by SimpleX match the source code, improving transparency and trustworthiness.
Three new scholarly papers
Aman Sharma of the KTH Royal Institute of Technology in Stockholm, Sweden, published a paper on Build and Runtime Integrity for Java (PDF). The paper's abstract notes that "Software Supply Chain attacks are increasingly threatening the security of software systems" and goes on to compare build- and run-time integrity:
Build-time integrity ensures that the software artifact creation process, from source code to compiled binaries, remains untampered. Runtime integrity, on the other hand, guarantees that the executing application loads and runs only trusted code, preventing dynamic injection of malicious components.
The recently mandated software bill of materials (SBOM) is intended to help mitigate software supply-chain risk. We discuss extensions that would enable an SBOM to serve as a basis for making trust assessments thus also serving as a proactive defense.
A full PDF of the paper is available.
Lastly, congratulations to Giacomo Benedetti of the University of Genoa for publishing their PhD thesis. Titled Improving Transparency, Trust, and Automation in the Software Supply Chain, Giacomo's thesis:
addresses three critical aspects of the software supply chain to enhance security: transparency, trust, and automation. First, it investigates transparency as a mechanism to empower developers with accurate and complete insights into the software components integrated into their applications. To this end, the thesis introduces SUNSET and PIP-SBOM, leveraging modeling and SBOMs (Software Bill of Materials) as foundational tools for transparency and security. Second, it examines software trust, focusing on the effectiveness of reproducible builds in major ecosystems and proposing solutions to bolster their adoption. Finally, it emphasizes the role of automation in modern software management, particularly in ensuring user safety and application reliability. This includes developing a tool for automated security testing of GitHub Actions and analyzing the permission models of prominent platforms like GitHub, GitLab, and BitBucket.
Debian developer Simon Josefsson published two reproducibility-related blog posts this month. The first was on the topic of Reproducible Software Releases, which discusses some techniques and gotchas that can be encountered when generating reproducible source packages, i.e. ensuring that the source code archives that open-source software projects release can be reproduced by others. Simon's second post builds on his earlier experiments with reproducing parts of Trisquel/Debian. Titled On Binary Distribution Rebuilds, it discusses potential methods to bootstrap a binary distribution like Debian from some other bootstrappable environment like Guix.
Jochen Sprickerhof uploaded sbuild version 0.88.5 with a change relevant to reproducible builds: specifically, the build_as_root_when_needed functionality still supports older versions of dpkg(1). []
The IzzyOnDroid Android APK repository reached another milestone in March, crossing the 40% coverage mark: specifically, more than 42% of the apps in the repository are now reproducible.
Thanks to funding by NLnet/Mobifree, the project was also able to put more time into their tooling. For instance, developers can now easily run their own verification builder in "less than 5 minutes". This currently supports Debian-based systems, but support for RPM-based systems is incoming. Future work is in the pipeline, including documentation, guidelines and helpers for debugging.
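Verification builds of Android apps typically rebuild the APK from source and compare it with the published one while ignoring the signature block, since only the upstream developer holds the signing key. Below is a toy sketch of that comparison; the file names are hypothetical and the real IzzyOnDroid tooling does considerably more (handling signature schemes, reporting diffs, and so on):

```python
import hashlib
import zipfile

def content_digests(apk_path: str) -> dict[str, str]:
    """Map every APK entry outside META-INF/ to its SHA-256 digest."""
    digests = {}
    with zipfile.ZipFile(apk_path) as apk:
        for info in apk.infolist():
            if info.filename.startswith("META-INF/"):
                continue  # signing material legitimately differs between builds
            digests[info.filename] = hashlib.sha256(apk.read(info)).hexdigest()
    return digests

# Hypothetical inputs: the upstream release and a local verification build.
upstream = content_digests("app-release-upstream.apk")
local = content_digests("app-release-local.apk")

if upstream == local:
    print("All non-signature entries match: the APK verifies as reproducible.")
else:
    differing = sorted(
        name for name in upstream.keys() | local.keys()
        if upstream.get(name) != local.get(name)
    )
    print(f"{len(differing)} entries differ, for example: {differing[:5]}")
```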
Fedora developer Zbigniew Jędrzejewski-Szmek announced a work-in-progress script called fedora-repro-build which attempts to reproduce an existing package within a Koji build environment. Although the project's README file lists a number of fields that will always or almost always vary (and there is a non-zero number of other known issues), this is an excellent first step towards full Fedora reproducibility (see above for more information).
Lastly, in openSUSE news, Bernhard M. Wiedemann posted another monthly update for his work there.
[What] would it take to compromise an entire Linux distribution directly through their public infrastructure? Is it possible to perform such a compromise as simple security researchers with no available resources but time?
diffoscope & strip-nondeterminism
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 290, 291, 292 and 293 to Debian:
Bug fixes:
file(1) version 5.46 now returns "XHTML document" for .xhtml files such as those found nested within our .epub tests. []
Also consider .aar files as APK files, at least for the sake of diffoscope. []
Require the new, upcoming, version of file(1) and update our quine-related testcase. []
Codebase improvements:
Ensure all calls to our_check_output in the ELF comparator have the potential CalledProcessError exception caught. [][]
Correct an import masking issue. []
Add a missing subprocess import. []
Reformat openssl.py. []
Update copyright years. [][][]
In addition, Ivan Trubach contributed a change to ignore the st_size metadata entry for directories as it is essentially arbitrary and introduces unnecessary or even spurious changes. []
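For anyone who has never driven diffoscope, it can be run as a normal command-line tool and used as a CI gate; here is a small sketch with hypothetical input files, writing an HTML report of any differences found:

```python
import subprocess

# Hypothetical inputs: two builds of the same package that should be identical.
result = subprocess.run(
    ["diffoscope", "--html", "report.html",
     "build-1/hello_1.0_amd64.deb", "build-2/hello_1.0_amd64.deb"],
    check=False,
)

# diffoscope exits with 0 when the inputs are identical and non-zero when it
# finds differences, so the return code can gate a CI job.
if result.returncode == 0:
    print("No differences found.")
else:
    print("Differences found; see report.html for the content-aware breakdown.")
```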
Website updates
Once again, there were a number of improvements made to our website this month, including:
Hervé Boutemy updated the JVM documentation to clarify that the target is rebuild attestation. []
Lastly, Holger Levsen added Julien Malka and Zbigniew Jędrzejewski-Szmek to our Involved people page [][] as well as replaced suggestions to follow us on Twitter/X with suggestions to follow us on Mastodon instead [][].
Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, a number of changes were made by Holger Levsen, including:
And finally, node maintenance was performed by Holger Levsen [][][] and Mattia Rizzolo [][].
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:
Icy morning Witch Wells Az
Life:
Last week we were enjoying springtime; this week winter has made a comeback! Good news on the broken arm front: the infection is gone, so they can finally deal with the break itself again. I will have a less invasive surgery on April 25th to pull the bones back into place so they can properly knit back together! If you can spare any change, please consider a donation to my continued healing and recovery, or just support my work.
Kubuntu:
While testing the Beta I came across some crashy apps (namely PIM) due to AppArmor. I have uploaded fixed profiles for kmail, akregator, akonadiconsole, konqueror and tellico.
KDE Snaps:
Added sctp support in Qt https://invent.kde.org/neon/snap-packaging/kde-qt6-core-sdk/-/commit/bbcb1dc39044b930ab718c8ffabfa20ccd2b0f75
This will allow me to finish a pyside6 snap and fix the FreeCAD build.
Changed build type to Release in the kf6-core24-sdk which will reduce the size of kf6-core24 significantly.
Fixed a few startup errors in kf5-core24 and kf6-core24 snapcraft-desktop-integration.
Soumyadeep fixed wayland icons in https://invent.kde.org/neon/snap-packaging/kf6-core-sdk/-/merge_requests/3
KDE Applications 25.03.90 RC released to candidate (I know it says 24.12.3; the version won't be updated until the 25.04.0 release)
Kasts core24 fixed in candidate
Kate now core24 with Breeze theme! candidate
Neochat: Fixed missing QML and 25.04 dependencies in candidate
Kdenlive now with Glaxnimate animations! candidate
Digikam 8.6.0 now with scanner support in stable
Kstars 3.7.6 released to stable for realz, removed store-rejected plugs.
Thanks for stopping by!