Search Results: "lex"

15 April 2024

Andreas R nnquist: Status update for Allegro packaging in Debian

I have mailed to a Debian bug on allegro4.4 describing my reasoning
regarding the allegro libraries in short, allegro4.4 is pretty much
dead upstream, and my interest was basically to keep alex4 (which is
cool) in Debian, but since it migrated to non-free, my interest in
allegro4.4 has waned. So, if anybody would like to still see allegro4.4
in Debian, please step up now and help out. Since it is dead upstream,
my reasoning is that it is better to remove it from Debian if no
maintainer who wants to help steps up. Previously Tobias Hansen has helped out, but now it is 8 (!) years
since his last upload of either package.(Please don t interpret this as
judgement, I am very happy for the help he has provided and all the
work he has done on the packages). Allegro5 is another deal still active upstream, and I have kept it up
to date in Debian, and while I have held the latest upload a short while
because of the time_t transition, it will come sooner or later There
I am also waiting on a final decision on this bug from upstream. Other than
that allegro 5 is in a very good state, and I will keep maintaining it
as long as I can. But help would of course be appreciated on allegro5
too.

13 April 2024

Paul Tagliamonte: Domo Arigato, Mr. debugfs

Years ago, at what I think I remember was DebConf 15, I hacked for a while on debhelper to write build-ids to debian binary control files, so that the build-id (more specifically, the ELF note .note.gnu.build-id) wound up in the Debian apt archive metadata. I ve always thought this was super cool, and seeing as how Michael Stapelberg blogged some great pointers around the ecosystem, including the fancy new debuginfod service, and the find-dbgsym-packages helper, which uses these same headers, I don t think I m the only one. At work I ve been using a lot of rust, specifically, async rust using tokio. To try and work on my style, and to dig deeper into the how and why of the decisions made in these frameworks, I ve decided to hack up a project that I ve wanted to do ever since 2015 write a debug filesystem. Let s get to it.

Back to the Future Time to admit something. I really love Plan 9. It s just so good. So many ideas from Plan 9 are just so prescient, and everything just feels right. Not just right like, feels good like, correct. The bit that I ve always liked the most is 9p, the network protocol for serving a filesystem over a network. This leads to all sorts of fun programs, like the Plan 9 ftp client being a 9p server you mount the ftp server and access files like any other files. It s kinda like if fuse were more fully a part of how the operating system worked, but fuse is all running client-side. With 9p there s a single client, and different servers that you can connect to, which may be backed by a hard drive, remote resources over something like SFTP, FTP, HTTP or even purely synthetic. The interesting (maybe sad?) part here is that 9p wound up outliving Plan 9 in terms of adoption 9p is in all sorts of places folks don t usually expect. For instance, the Windows Subsystem for Linux uses the 9p protocol to share files between Windows and Linux. ChromeOS uses it to share files with Crostini, and qemu uses 9p (virtio-p9) to share files between guest and host. If you re noticing a pattern here, you d be right; for some reason 9p is the go-to protocol to exchange files between hypervisor and guest. Why? I have no idea, except maybe due to being designed well, simple to implement, and it s a lot easier to validate the data being shared and validate security boundaries. Simplicity has its value. As a result, there s a lot of lingering 9p support kicking around. Turns out Linux can even handle mounting 9p filesystems out of the box. This means that I can deploy a filesystem to my LAN or my localhost by running a process on top of a computer that needs nothing special, and mount it over the network on an unmodified machine unlike fuse, where you d need client-specific software to run in order to mount the directory. For instance, let s mount a 9p filesystem running on my localhost machine, serving requests on 127.0.0.1:564 (tcp) that goes by the name mountpointname to /mnt.
$ mount -t 9p \
-o trans=tcp,port=564,version=9p2000.u,aname=mountpointname \
127.0.0.1 \
/mnt
Linux will mount away, and attach to the filesystem as the root user, and by default, attach to that mountpoint again for each local user that attempts to use it. Nifty, right? I think so. The server is able to keep track of per-user access and authorization along with the host OS.

WHEREIN I STYX WITH IT Since I wanted to push myself a bit more with rust and tokio specifically, I opted to implement the whole stack myself, without third party libraries on the critical path where I could avoid it. The 9p protocol (sometimes called Styx, the original name for it) is incredibly simple. It s a series of client to server requests, which receive a server to client response. These are, respectively, T messages, which transmit a request to the server, which trigger an R message in response (Reply messages). These messages are TLV payload with a very straight forward structure so straight forward, in fact, that I was able to implement a working server off nothing more than a handful of man pages. Later on after the basics worked, I found a more complete spec page that contains more information about the unix specific variant that I opted to use (9P2000.u rather than 9P2000) due to the level of Linux specific support for the 9P2000.u variant over the 9P2000 protocol.

MR ROBOTO The backend stack over at zoo is rust and tokio running i/o for an HTTP and WebRTC server. I figured I d pick something fairly similar to write my filesystem with, since 9P can be implemented on basically anything with I/O. That means tokio tcp server bits, which construct and use a 9p server, which has an idiomatic Rusty API that partially abstracts the raw R and T messages, but not so much as to cause issues with hiding implementation possibilities. At each abstraction level, there s an escape hatch allowing someone to implement any of the layers if required. I called this framework arigato which can be found over on docs.rs and crates.io.
/// Simplified version of the arigato File trait; this isn't actually
/// the same trait; there's some small cosmetic differences. The
/// actual trait can be found at:
///
/// https://docs.rs/arigato/latest/arigato/server/trait.File.html
trait File  
/// OpenFile is the type returned by this File via an Open call.
 type OpenFile: OpenFile;
/// Return the 9p Qid for this file. A file is the same if the Qid is
 /// the same. A Qid contains information about the mode of the file,
 /// version of the file, and a unique 64 bit identifier.
 fn qid(&self) -> Qid;
/// Construct the 9p Stat struct with metadata about a file.
 async fn stat(&self) -> FileResult<Stat>;
/// Attempt to update the file metadata.
 async fn wstat(&mut self, s: &Stat) -> FileResult<()>;
/// Traverse the filesystem tree.
 async fn walk(&self, path: &[&str]) -> FileResult<(Option<Self>, Vec<Self>)>;
/// Request that a file's reference be removed from the file tree.
 async fn unlink(&mut self) -> FileResult<()>;
/// Create a file at a specific location in the file tree.
 async fn create(
&mut self,
name: &str,
perm: u16,
ty: FileType,
mode: OpenMode,
extension: &str,
) -> FileResult<Self>;
/// Open the File, returning a handle to the open file, which handles
 /// file i/o. This is split into a second type since it is genuinely
 /// unrelated -- and the fact that a file is Open or Closed can be
 /// handled by the  arigato  server for us.
 async fn open(&mut self, mode: OpenMode) -> FileResult<Self::OpenFile>;
 
/// Simplified version of the arigato OpenFile trait; this isn't actually
/// the same trait; there's some small cosmetic differences. The
/// actual trait can be found at:
///
/// https://docs.rs/arigato/latest/arigato/server/trait.OpenFile.html
trait OpenFile  
/// iounit to report for this file. The iounit reported is used for Read
 /// or Write operations to signal, if non-zero, the maximum size that is
 /// guaranteed to be transferred atomically.
 fn iounit(&self) -> u32;
/// Read some number of bytes up to  buf.len()  from the provided
 ///  offset  of the underlying file. The number of bytes read is
 /// returned.
 async fn read_at(
&mut self,
buf: &mut [u8],
offset: u64,
) -> FileResult<u32>;
/// Write some number of bytes up to  buf.len()  from the provided
 ///  offset  of the underlying file. The number of bytes written
 /// is returned.
 fn write_at(
&mut self,
buf: &mut [u8],
offset: u64,
) -> FileResult<u32>;
 

Thanks, decade ago paultag! Let s do it! Let s use arigato to implement a 9p filesystem we ll call debugfs that will serve all the debug files shipped according to the Packages metadata from the apt archive. We ll fetch the Packages file and construct a filesystem based on the reported Build-Id entries. For those who don t know much about how an apt repo works, here s the 2-second crash course on what we re doing. The first is to fetch the Packages file, which is specific to a binary architecture (such as amd64, arm64 or riscv64). That architecture is specific to a component (such as main, contrib or non-free). That component is specific to a suite, such as stable, unstable or any of its aliases (bullseye, bookworm, etc). Let s take a look at the Packages.xz file for the unstable-debug suite, main component, for all amd64 binaries.
$ curl \
https://deb.debian.org/debian-debug/dists/unstable-debug/main/binary-amd64/Packages.xz \
  unxz
This will return the Debian-style rfc2822-like headers, which is an export of the metadata contained inside each .deb file which apt (or other tools that can use the apt repo format) use to fetch information about debs. Let s take a look at the debug headers for the netlabel-tools package in unstable which is a package named netlabel-tools-dbgsym in unstable-debug.
Package: netlabel-tools-dbgsym
Source: netlabel-tools (0.30.0-1)
Version: 0.30.0-1+b1
Installed-Size: 79
Maintainer: Paul Tagliamonte <paultag@debian.org>
Architecture: amd64
Depends: netlabel-tools (= 0.30.0-1+b1)
Description: debug symbols for netlabel-tools
Auto-Built-Package: debug-symbols
Build-Ids: e59f81f6573dadd5d95a6e4474d9388ab2777e2a
Description-md5: a0e587a0cf730c88a4010f78562e6db7
Section: debug
Priority: optional
Filename: pool/main/n/netlabel-tools/netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
Size: 62776
SHA256: 0e9bdb087617f0350995a84fb9aa84541bc4df45c6cd717f2157aa83711d0c60
So here, we can parse the package headers in the Packages.xz file, and store, for each Build-Id, the Filename where we can fetch the .deb at. Each .deb contains a number of files but we re only really interested in the files inside the .deb located at or under /usr/lib/debug/.build-id/, which you can find in debugfs under rfc822.rs. It s crude, and very single-purpose, but I m feeling a bit lazy.

Who needs dpkg?! For folks who haven t seen it yet, a .deb file is a special type of .ar file, that contains (usually) three files inside debian-binary, control.tar.xz and data.tar.xz. The core of an .ar file is a fixed size (60 byte) entry header, followed by the specified size number of bytes.
[8 byte .ar file magic]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
...
First up was to implement a basic ar parser in ar.rs. Before we get into using it to parse a deb, as a quick diversion, let s break apart a .deb file by hand something that is a bit of a rite of passage (or at least it used to be? I m getting old) during the Debian nm (new member) process, to take a look at where exactly the .debug file lives inside the .deb file.
$ ar x netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ ls
control.tar.xz debian-binary
data.tar.xz netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ tar --list -f data.tar.xz   grep '.debug$'
./usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
Since we know quite a bit about the structure of a .deb file, and I had to implement support from scratch anyway, I opted to implement a (very!) basic debfile parser using HTTP Range requests. HTTP Range requests, if supported by the server (denoted by a accept-ranges: bytes HTTP header in response to an HTTP HEAD request to that file) means that we can add a header such as range: bytes=8-68 to specifically request that the returned GET body be the byte range provided (in the above case, the bytes starting from byte offset 8 until byte offset 68). This means we can fetch just the ar file entry from the .deb file until we get to the file inside the .deb we are interested in (in our case, the data.tar.xz file) at which point we can request the body of that file with a final range request. I wound up writing a struct to handle a read_at-style API surface in hrange.rs, which we can pair with ar.rs above and start to find our data in the .deb remotely without downloading and unpacking the .deb at all. After we have the body of the data.tar.xz coming back through the HTTP response, we get to pipe it through an xz decompressor (this kinda sucked in Rust, since a tokio AsyncRead is not the same as an http Body response is not the same as std::io::Read, is not the same as an async (or sync) Iterator is not the same as what the xz2 crate expects; leading me to read blocks of data to a buffer and stuff them through the decoder by looping over the buffer for each lzma2 packet in a loop), and tarfile parser (similarly troublesome). From there we get to iterate over all entries in the tarfile, stopping when we reach our file of interest. Since we can t seek, but gdb needs to, we ll pull it out of the stream into a Cursor<Vec<u8>> in-memory and pass a handle to it back to the user. From here on out its a matter of gluing together a File traited struct in debugfs, and serving the filesystem over TCP using arigato. Done deal!

A quick diversion about compression I was originally hoping to avoid transferring the whole tar file over the network (and therefore also reading the whole debug file into ram, which objectively sucks), but quickly hit issues with figuring out a way around seeking around an xz file. What s interesting is xz has a great primitive to solve this specific problem (specifically, use a block size that allows you to seek to the block as close to your desired seek position just before it, only discarding at most block size - 1 bytes), but data.tar.xz files generated by dpkg appear to have a single mega-huge block for the whole file. I don t know why I would have expected any different, in retrospect. That means that this now devolves into the base case of How do I seek around an lzma2 compressed data stream ; which is a lot more complex of a question. Thankfully, notoriously brilliant tianon was nice enough to introduce me to Jon Johnson who did something super similar adapted a technique to seek inside a compressed gzip file, which lets his service oci.dag.dev seek through Docker container images super fast based on some prior work such as soci-snapshotter, gztool, and zran.c. He also pulled this party trick off for apk based distros over at apk.dag.dev, which seems apropos. Jon was nice enough to publish a lot of his work on this specifically in a central place under the name targz on his GitHub, which has been a ton of fun to read through. The gist is that, by dumping the decompressor s state (window of previous bytes, in-memory data derived from the last N-1 bytes) at specific checkpoints along with the compressed data stream offset in bytes and decompressed offset in bytes, one can seek to that checkpoint in the compressed stream and pick up where you left off creating a similar block mechanism against the wishes of gzip. It means you d need to do an O(n) run over the file, but every request after that will be sped up according to the number of checkpoints you ve taken. Given the complexity of xz and lzma2, I don t think this is possible for me at the moment especially given most of the files I ll be requesting will not be loaded from again especially when I can just cache the debug header by Build-Id. I want to implement this (because I m generally curious and Jon has a way of getting someone excited about compression schemes, which is not a sentence I thought I d ever say out loud), but for now I m going to move on without this optimization. Such a shame, since it kills a lot of the work that went into seeking around the .deb file in the first place, given the debian-binary and control.tar.gz members are so small.

The Good First, the good news right? It works! That s pretty cool. I m positive my younger self would be amused and happy to see this working; as is current day paultag. Let s take debugfs out for a spin! First, we need to mount the filesystem. It even works on an entirely unmodified, stock Debian box on my LAN, which is huge. Let s take it for a spin:
$ mount \
-t 9p \
-o trans=tcp,version=9p2000.u,aname=unstable-debug \
192.168.0.2 \
/usr/lib/debug/.build-id/
And, let s prove to ourselves that this actually mounted before we go trying to use it:
$ mount   grep build-id
192.168.0.2 on /usr/lib/debug/.build-id type 9p (rw,relatime,aname=unstable-debug,access=user,trans=tcp,version=9p2000.u,port=564)
Slick. We ve got an open connection to the server, where our host will keep a connection alive as root, attached to the filesystem provided in aname. Let s take a look at it.
$ ls /usr/lib/debug/.build-id/
00 0d 1a 27 34 41 4e 5b 68 75 82 8E 9b a8 b5 c2 CE db e7 f3
01 0e 1b 28 35 42 4f 5c 69 76 83 8f 9c a9 b6 c3 cf dc E7 f4
02 0f 1c 29 36 43 50 5d 6a 77 84 90 9d aa b7 c4 d0 dd e8 f5
03 10 1d 2a 37 44 51 5e 6b 78 85 91 9e ab b8 c5 d1 de e9 f6
04 11 1e 2b 38 45 52 5f 6c 79 86 92 9f ac b9 c6 d2 df ea f7
05 12 1f 2c 39 46 53 60 6d 7a 87 93 a0 ad ba c7 d3 e0 eb f8
06 13 20 2d 3a 47 54 61 6e 7b 88 94 a1 ae bb c8 d4 e1 ec f9
07 14 21 2e 3b 48 55 62 6f 7c 89 95 a2 af bc c9 d5 e2 ed fa
08 15 22 2f 3c 49 56 63 70 7d 8a 96 a3 b0 bd ca d6 e3 ee fb
09 16 23 30 3d 4a 57 64 71 7e 8b 97 a4 b1 be cb d7 e4 ef fc
0a 17 24 31 3e 4b 58 65 72 7f 8c 98 a5 b2 bf cc d8 E4 f0 fd
0b 18 25 32 3f 4c 59 66 73 80 8d 99 a6 b3 c0 cd d9 e5 f1 fe
0c 19 26 33 40 4d 5a 67 74 81 8e 9a a7 b4 c1 ce da e6 f2 ff
Outstanding. Let s try using gdb to debug a binary that was provided by the Debian archive, and see if it ll load the ELF by build-id from the right .deb in the unstable-debug suite:
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
Yes! Yes it will!
$ file /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
/usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, BuildID[sha1]=e59f81f6573dadd5d95a6e4474d9388ab2777e2a, for GNU/Linux 3.2.0, with debug_info, not stripped

The Bad Linux s support for 9p is mainline, which is great, but it s not robust. Network issues or server restarts will wedge the mountpoint (Linux can t reconnect when the tcp connection breaks), and things that work fine on local filesystems get translated in a way that causes a lot of network chatter for instance, just due to the way the syscalls are translated, doing an ls, will result in a stat call for each file in the directory, even though linux had just got a stat entry for every file while it was resolving directory names. On top of that, Linux will serialize all I/O with the server, so there s no concurrent requests for file information, writes, or reads pending at the same time to the server; and read and write throughput will degrade as latency increases due to increasing round-trip time, even though there are offsets included in the read and write calls. It works well enough, but is frustrating to run up against, since there s not a lot you can do server-side to help with this beyond implementing the 9P2000.L variant (which, maybe is worth it).

The Ugly Unfortunately, we don t know the file size(s) until we ve actually opened the underlying tar file and found the correct member, so for most files, we don t know the real size to report when getting a stat. We can t parse the tarfiles for every stat call, since that d make ls even slower (bummer). Only hiccup is that when I report a filesize of zero, gdb throws a bit of a fit; let s try with a size of 0 to start:
$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 0 Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
warning: Discarding section .note.gnu.build-id which has a section size (24) larger than the file size [in module /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug]
[...]
This obviously won t work since gdb will throw away all our hard work because of stat s output, and neither will loading the real size of the underlying file. That only leaves us with hardcoding a file size and hope nothing else breaks significantly as a result. Let s try it again:
$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 954M Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
Much better. I mean, terrible but better. Better for now, anyway.

Kilroy was here Do I think this is a particularly good idea? I mean; kinda. I m probably going to make some fun 9p arigato-based filesystems for use around my LAN, but I don t think I ll be moving to use debugfs until I can figure out how to ensure the connection is more resilient to changing networks, server restarts and fixes on i/o performance. I think it was a useful exercise and is a pretty great hack, but I don t think this ll be shipping anywhere anytime soon. Along with me publishing this post, I ve pushed up all my repos; so you should be able to play along at home! There s a lot more work to be done on arigato; but it does handshake and successfully export a working 9P2000.u filesystem. Check it out on on my github at arigato, debugfs and also on crates.io and docs.rs. At least I can say I was here and I got it working after all these years.

Russell Coker: Software Needed for Work

When I first started studying computer science setting up a programming project was easy, write source code files and a Makefile and that was it. IRC was the only IM system and email was the only other communications system that was used much. Writing Makefiles is difficult but products like the Borland Turbo series of IDEs did all that for you so you could just start typing code and press a function key to compile and run (F5 from memory). Over the years the requirements and expectations of computer use have grown significantly. The typical office worker is now doing many more things with computers than serious programmers used to do. Running an IM system, an online document editing system, and a series of web apps is standard for companies nowadays. Developers have to do all that in addition to tools for version control, continuous integration, bug reporting, and feature tracking. The development process is also more complex with extra steps for reproducible builds, automated tests, and code coverage metrics for the tests. I wonder how many programmers who started in the 90s would have done something else if faced with Github as their introduction. How much of this is good? Having the ability to send instant messages all around the world is great. Having dozens of different ways of doing so is awful. When a company uses multiple IM systems such as MS-Teams and Slack and forces some of it s employees to use them both it s getting ridiculous. Having different friend groups on different IM systems is anti-social networking. In the EU the Digital Markets Act [1] forces some degree of interoperability between different IM systems and as it s impossible to know who s actually in the EU that will end up being world-wide. In corporations document management often involves multiple ways of storing things, you have Google Docs, MS Office online, hosted Wikis like Confluence, and more. Large companies tend to use several such systems which means that people need to learn multiple systems to be able to work and they also need to know which systems are used by the various groups that they communicate with. Microsoft deserves some sort of award for the range of ways they have for managing documents, Sharepoint, OneDrive, Office Online, attachments to Teams rooms, and probably lots more. During WW2 the predecessor to the CIA produced an excellent manual for simple sabotage [2]. If something like that was written today the section General Interference with Organisations and Production would surely have something about using as many incompatible programs and web sites as possible in the work flow. The proliferation of software required for work is a form of denial of service attack against corporations. The efficiency of companies doesn t really bother me. It sucks that companies are creating a demoralising workplace that is unpleasant for workers. But the upside is that the biggest companies are the ones doing the worst things and are also the most afflicted by these problems. It s almost like the Bureau of Sabotage in some of Frank Herbert s fiction [3]. The thing that concerns me is the effect of multiple standards on free software development. We have IRC the most traditional IM support system which is getting replaced by Matrix but we also have some projects using Telegram, and Jabber hasn t gone away. I m sure there are others too. There are also multiple options for version control (although github seems to dominate the market), forums, bug trackers, etc. Reporting bugs or getting support in free software often requires interacting with several of them. Developing free software usually involves dealing with the bug tracking and documentation systems of the distribution you use as well as the upstream developers of the software. If the problem you have is related to compatibility between two different pieces of free software then you can end up dealing with even more bug tracking systems. There are real benefits to some of the newer programs to track bugs, write documentation, etc. There is also going to be a cost in changing which gives an incentive for the older projects to keep using what has worked well enough for them in the past, How can we improve things? Use only the latest tools? Prioritise ease of use? Aim more for the entry level contributors?

12 April 2024

Freexian Collaborators: Debian Contributions: SSO Authentication for jitsi.debian.social, /usr-move updates, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services. P.S. We ve completed over a year of writing these blogs. If you have any suggestions on how to make them better or what you d like us to cover, or any other opinions/reviews you might have, et al, please let us know by dropping an email to us. We d be happy to hear your thoughts. :)

SSO Authentication for jitsi.debian.social, by Stefano Rivera Debian.social s jitsi instance has been getting some abuse by (non-Debian) people sharing sexually explicit content on the service. After playing whack-a-mole with this for a month, and shutting the instance off for another month, we opened it up again and the abuse immediately re-started. Stefano sat down and wrote an SSO Implementation that hooks into Jitsi s existing JWT SSO support. This requires everyone using jitsi.debian.social to have a Salsa account. With only a little bit of effort, we could change this in future, to only require an account to open a room, and allow guests to join the call.

/usr-move, by Helmut Grohne The biggest task this month was sending mitigation patches for all of the /usr-move issues arising from package renames due to the 2038 transition. As a result, we can now say that every affected package in unstable can either be converted with dh-sequence-movetousr or has an open bug report. The package set relevant to debootstrap except for the set that has to be uploaded concurrently has been moved to /usr and is awaiting migration. The move of coreutils happened to affect piuparts which hard codes the location of /bin/sync and received multiple updates as a result.

Miscellaneous contributions
  • Stefano Rivera uploaded a stable release update to python3.11 for bookworm, fixing a use-after-free crash.
  • Stefano uploaded a new version of python-html2text, and updated python3-defaults to build with it.
  • In support of Python 3.12, Stefano dropped distutils as a Build-Dependency from a few packages, and uploaded a complex set of patches to python-mitogen.
  • Stefano landed some merge requests to clean up dead code in dh-python, removed the flit plugin, and uploaded it.
  • Stefano uploaded new upstream versions of twisted, hatchling, python-flexmock, python-authlib, python mitogen, python-pipx, and xonsh.
  • Stefano requested removal of a few packages supporting the Opsis HDMI2USB hardware that DebConf Video team used to use for HDMI capture, as they are not being maintained upstream. They started to FTBFS, with recent sdcc changes.
  • DebConf 24 is getting ready to open registration, Stefano spent some time fixing bugs in the website, caused by infrastructure updates.
  • Stefano reviewed all the DebConf 23 travel reimbursements, filing requests for more information from SPI where our records mismatched.
  • Stefano spun up a Wafer website for the Berlin 2024 mini DebConf.
  • Roberto C. S nchez worked on facilitating the transfer of upstream maintenance responsibility for the dormant Shorewall project to a new team led by the current maintainer of the Shorewall packages in Debian.
  • Colin Watson fixed build failures in celery-haystack-ng, db1-compat, jsonpickle, libsdl-perl, kali, knews, openssh-ssh1, python-json-log-formatter, python-typing-extensions, trn4, vigor, and wcwidth. Some of these were related to the 64-bit time_t transition, since that involved enabling -Werror=implicit-function-declaration.
  • Colin fixed an off-by-one error in neovim, which was already causing a build failure in Ubuntu and would eventually have caused a build failure in Debian with stricter toolchain settings.
  • Colin added an sshd@.service template to openssh to help newer systemd versions make containers and VMs SSH-accessible over AF_VSOCK sockets.
  • Following the xz-utils backdoor, Colin spent some time testing and discussing OpenSSH upstream s proposed inline systemd notification patch, since the current implementation via libsystemd was part of the attack vector used by that backdoor.
  • Utkarsh reviewed and sponsored some Go packages for Lena Voytek and Rajudev.
  • Utkarsh also helped Mitchell Dzurick with the adoption of pyparted package.
  • Helmut sent 10 patches for cross build failures.
  • Helmut partially fixed architecture cross bootstrap tooling to deal with changes in linux-libc-dev and the recent gcc-for-host changes and also fixed a 64bit-time_t FTBFS in libtextwrap.
  • Thorsten Alteholz uploaded several packages from debian-printing: cjet, lprng, rlpr and epson-inkjet-printer-escpr were affected by the newly enabled compiler switch -Werror=implicit-function-declaration. Besides fixing these serious bugs, Thorsten also worked on other bugs and could fix one or the other.
  • Carles updated simplemonitor and python-ring-doorbell packages with new upstream versions.
  • Santiago is still working on the Salsa CI MRs to adapt the build jobs so they can rely on sbuild. Current work includes adapting the images used by the build job, implementing the basic sbuild support the related jobs, and adjusting the support for experimental and *-backports releases..
    Additionally, Santiago reviewed some MR such as Make timeout action explicit in the logs and the subsequent Implement conditional timeout verbosity, and the batch of MRs included in https://salsa.debian.org/salsa-ci-team/pipeline/-/merge_requests/482.
  • Santiago also reviewed applications for the improving Salsa CI in Debian GSoC 2024 project. We received applications from four very talented candidates. The selection process is currently ongoing. A huge thanks to all of them!
  • As part of the DebConf 24 organization, Santiago has taken part in the Content team discussions.

11 April 2024

Reproducible Builds: Reproducible Builds in March 2024

Welcome to the March 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Arch Linux minimal container userland now 100% reproducible
  2. Validating Debian s build infrastructure after the XZ backdoor
  3. Making Fedora Linux (more) reproducible
  4. Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management
  5. Software and source code identification with GNU Guix and reproducible builds
  6. Two new Rust-based tools for post-processing determinism
  7. Distribution work
  8. Mailing list highlights
  9. Website updates
  10. Delta chat clients now reproducible
  11. diffoscope updates
  12. Upstream patches
  13. Reproducibility testing framework

Arch Linux minimal container userland now 100% reproducible In remarkable news, Reproducible builds developer kpcyrd reported that that the Arch Linux minimal container userland is now 100% reproducible after work by developers dvzv and Foxboron on the one remaining package. This represents a real world , widely-used Linux distribution being reproducible. Their post, which kpcyrd suffixed with the question now what? , continues on to outline some potential next steps, including validating whether the container image itself could be reproduced bit-for-bit. The post, which was itself a followup for an Arch Linux update earlier in the month, generated a significant number of replies.

Validating Debian s build infrastructure after the XZ backdoor From our mailing list this month, Vagrant Cascadian wrote about being asked about trying to perform concrete reproducibility checks for recent Debian security updates, in an attempt to gain some confidence about Debian s build infrastructure given that they performed builds in environments running the high-profile XZ vulnerability. Vagrant reports (with some caveats):
So far, I have not found any reproducibility issues; everything I tested I was able to get to build bit-for-bit identical with what is in the Debian archive.
That is to say, reproducibility testing permitted Vagrant and Debian to claim with some confidence that builds performed when this vulnerable version of XZ was installed were not interfered with.

Making Fedora Linux (more) reproducible In March, Davide Cavalca gave a talk at the 2024 Southern California Linux Expo (aka SCALE 21x) about the ongoing effort to make the Fedora Linux distribution reproducible. Documented in more detail on Fedora s website, the talk touched on topics such as the specifics of implementing reproducible builds in Fedora, the challenges encountered, the current status and what s coming next. (YouTube video)

Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management Julien Malka published a brief but interesting paper in the HAL open archive on Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management:
Functional package managers (FPMs) and reproducible builds (R-B) are technologies and methodologies that are conceptually very different from the traditional software deployment model, and that have promising properties for software supply chain security. This thesis aims to evaluate the impact of FPMs and R-B on the security of the software supply chain and propose improvements to the FPM model to further improve trust in the open source supply chain. PDF
Julien s paper poses a number of research questions on how the model of distributions such as GNU Guix and NixOS can be leveraged to further improve the safety of the software supply chain , etc.

Software and source code identification with GNU Guix and reproducible builds In a long line of commendably detailed blog posts, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier have together published two interesting posts on the GNU Guix blog this month. In early March, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier wrote about software and source code identification and how that might be performed using Guix, rhetorically posing the questions: What does it take to identify software ? How can we tell what software is running on a machine to determine, for example, what security vulnerabilities might affect it? Later in the month, Ludovic Court s wrote a solo post describing adventures on the quest for long-term reproducible deployment. Ludovic s post touches on GNU Guix s aim to support time travel , the ability to reliably (and reproducibly) revert to an earlier point in time, employing the iconic image of Harold Lloyd hanging off the clock in Safety Last! (1925) to poetically illustrate both the slapstick nature of current modern technology and the gymnastics required to navigate hazards of our own making.

Two new Rust-based tools for post-processing determinism Zbigniew J drzejewski-Szmek announced add-determinism, a work-in-progress reimplementation of the Reproducible Builds project s own strip-nondeterminism tool in the Rust programming language, intended to be used as a post-processor in RPM-based distributions such as Fedora In addition, Yossi Kreinin published a blog post titled refix: fast, debuggable, reproducible builds that describes a tool that post-processes binaries in such a way that they are still debuggable with gdb, etc.. Yossi post details the motivation and techniques behind the (fast) performance of the tool.

Distribution work In Debian this month, since the testing framework no longer varies the build path, James Addison performed a bulk downgrade of the bug severity for issues filed with a level of normal to a new level of wishlist. In addition, 28 reviews of Debian packages were added, 38 were updated and 23 were removed this month adding to ever-growing knowledge about identified issues. As part of this effort, a number of issue types were updated, including Chris Lamb adding a new ocaml_include_directories toolchain issue [ ] and James Addison adding a new filesystem_order_in_java_jar_manifest_mf_include_resource issue [ ] and updating the random_uuid_in_notebooks_generated_by_nbsphinx to reference a relevant discussion thread [ ]. In addition, Roland Clobus posted his 24th status update of reproducible Debian ISO images. Roland highlights that the images for Debian unstable often cannot be generated due to changes in that distribution related to the 64-bit time_t transition. Lastly, Bernhard M. Wiedemann posted another monthly update for his reproducibility work in openSUSE.

Mailing list highlights Elsewhere on our mailing list this month:

Website updates There were made a number of improvements to our website this month, including:
  • Pol Dellaiera noticed the frequent need to correctly cite the website itself in academic work. To facilitate easier citation across multiple formats, Pol contributed a Citation File Format (CIF) file. As a result, an export in BibTeX format is now available in the Academic Publications section. Pol encourages community contributions to further refine the CITATION.cff file. Pol also added an substantial new section to the buy in page documenting the role of Software Bill of Materials (SBOMs) and ephemeral development environments. [ ][ ]
  • Bernhard M. Wiedemann added a new commandments page to the documentation [ ][ ] and fixed some incorrect YAML elsewhere on the site [ ].
  • Chris Lamb add three recent academic papers to the publications page of the website. [ ]
  • Mattia Rizzolo and Holger Levsen collaborated to add Infomaniak as a sponsor of amd64 virtual machines. [ ][ ][ ]
  • Roland Clobus updated the stable outputs page, dropping version numbers from Python documentation pages [ ] and noting that Python s set data structure is also affected by the PYTHONHASHSEED functionality. [ ]

Delta chat clients now reproducible Delta Chat, an open source messaging application that can work over email, announced this month that the Rust-based core library underlying Delta chat application is now reproducible.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 259, 260 and 261 to Debian and made the following additional changes:
  • New features:
    • Add support for the zipdetails tool from the Perl distribution. Thanks to Fay Stegerman and Larry Doolittle et al. for the pointer and thread about this tool. [ ]
  • Bug fixes:
    • Don t identify Redis database dumps as GNU R database files based simply on their filename. [ ]
    • Add a missing call to File.recognizes so we actually perform the filename check for GNU R data files. [ ]
    • Don t crash if we encounter an .rdb file without an equivalent .rdx file. (#1066991)
    • Correctly check for 7z being available and not lz4 when testing 7z. [ ]
    • Prevent a traceback when comparing a contentful .pyc file with an empty one. [ ]
  • Testsuite improvements:
    • Fix .epub tests after supporting the new zipdetails tool. [ ]
    • Don t use parenthesis within test skipping messages, as PyTest adds its own parenthesis. [ ]
    • Factor out Python version checking in test_zip.py. [ ]
    • Skip some Zip-related tests under Python 3.10.14, as a potential regression may have been backported to the 3.10.x series. [ ]
    • Actually test 7z support in the test_7z set of tests, not the lz4 functionality. (Closes: reproducible-builds/diffoscope#359). [ ]
In addition, Fay Stegerman updated diffoscope s monkey patch for supporting the unusual Mozilla ZIP file format after Python s zipfile module changed to detect potentially insecure overlapping entries within .zip files. (#362) Chris Lamb also updated the trydiffoscope command line client, dropping a build-dependency on the deprecated python3-distutils package to fix Debian bug #1065988 [ ], taking a moment to also refresh the packaging to the latest Debian standards [ ]. Finally, Vagrant Cascadian submitted an update for diffoscope version 260 in GNU Guix. [ ]

Upstream patches This month, we wrote a large number of patches, including: Bernhard M. Wiedemann used reproducibility-tooling to detect and fix packages that added changes in their %check section, thus failing when built with the --no-checks option. Only half of all openSUSE packages were tested so far, but a large number of bugs were filed, including ones against caddy, exiv2, gnome-disk-utility, grisbi, gsl, itinerary, kosmindoormap, libQuotient, med-tools, plasma6-disks, pspp, python-pypuppetdb, python-urlextract, rsync, vagrant-libvirt and xsimd. Similarly, Jean-Pierre De Jesus DIAZ employed reproducible builds techniques in order to test a proposed refactor of the ath9k-htc-firmware package. As the change produced bit-for-bit identical binaries to the previously shipped pre-built binaries:
I don t have the hardware to test this firmware, but the build produces the same hashes for the firmware so it s safe to say that the firmware should keep working.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Sleep less after a so-called 404 package state has occurred. [ ]
    • Schedule package builds more often. [ ][ ]
    • Regenerate all our HTML indexes every hour, but only every 12h for the released suites. [ ]
    • Create and update unstable and experimental base systems on armhf again. [ ][ ]
    • Don t reschedule so many depwait packages due to the current size of the i386 architecture queue. [ ]
    • Redefine our scheduling thresholds and amounts. [ ]
    • Schedule untested packages with a higher priority, otherwise slow architectures cannot keep up with the experimental distribution growing. [ ]
    • Only create the stats_buildinfo.png graph once per day. [ ][ ]
    • Reproducible Debian dashboard: refactoring, update several more static stats only every 12h. [ ]
    • Document how to use systemctl with new systemd-based services. [ ]
    • Temporarily disable armhf and i386 continuous integration tests in order to get some stability back. [ ]
    • Use the deb.debian.org CDN everywhere. [ ]
    • Remove the rsyslog logging facility on bookworm systems. [ ]
    • Add zst to the list of packages which are false-positive diskspace issues. [ ]
    • Detect failures to bootstrap Debian base systems. [ ]
  • Arch Linux-related changes:
    • Temporarily disable builds because the pacman package manager is broken. [ ][ ]
    • Split reproducible_html_live_status and split the scheduling timing . [ ][ ][ ]
    • Improve handling when database is locked. [ ][ ]
  • Misc changes:
    • Show failed services that require manual cleanup. [ ][ ]
    • Integrate two new Infomaniak nodes. [ ][ ][ ][ ]
    • Improve IRC notifications for artifacts. [ ]
    • Run diffoscope in different systemd slices. [ ]
    • Run the node health check more often, as it can now repair some issues. [ ][ ]
    • Also include the string Bot in the userAgent for Git. (Re: #929013). [ ]
    • Document increased tmpfs size on our OUSL nodes. [ ]
    • Disable memory account for the reproducible_build service. [ ][ ]
    • Allow 10 times as many open files for the Jenkins service. [ ]
    • Set OOMPolicy=continue and OOMScoreAdjust=-1000 for both the Jenkins and the reproducible_build service. [ ]
Mattia Rizzolo also made the following changes:
  • Debian-related changes:
    • Define a systemd slice to group all relevant services. [ ][ ]
    • Add a bunch of quotes in scripts to assuage the shellcheck tool. [ ]
    • Add stats on how many packages have been built today so far. [ ]
    • Instruct systemd-run to handle diffoscope s exit codes specially. [ ]
    • Prefer the pgrep tool over grepping the output of ps. [ ]
    • Re-enable a couple of i386 and armhf architecture builders. [ ][ ]
    • Fix some stylistic issues flagged by the Python flake8 tool. [ ]
    • Cease scheduling Debian unstable and experimental on the armhf architecture due to the time_t transition. [ ]
    • Start a few more i386 & armhf workers. [ ][ ][ ]
    • Temporarly skip pbuilder updates in the unstable distribution, but only on the armhf architecture. [ ]
  • Other changes:
    • Perform some large-scale refactoring on how the systemd service operates. [ ][ ]
    • Move the list of workers into a separate file so it s accessible to a number of scripts. [ ]
    • Refactor the powercycle_x86_nodes.py script to use the new IONOS API and its new Python bindings. [ ]
    • Also fix nph-logwatch after the worker changes. [ ]
    • Do not install the stunnel tool anymore, it shouldn t be needed by anything anymore. [ ]
    • Move temporary directories related to Arch Linux into a single directory for clarity. [ ]
    • Update the arm64 architecture host keys. [ ]
    • Use a common Postfix configuration. [ ]
The following changes were also made by:
  • Jan-Benedict Glaw:
    • Initial work to clean up a messy NetBSD-related script. [ ][ ]
  • Roland Clobus:
    • Show the installer log if the installer fails to build. [ ]
    • Avoid the minus character (i.e. -) in a variable in order to allow for tags in openQA. [ ]
    • Update the schedule of Debian live image builds. [ ]
  • Vagrant Cascadian:
    • Maintenance on the virt* nodes is completed so bring them back online. [ ]
    • Use the fully qualified domain name in configuration. [ ]
Node maintenance was also performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

2 April 2024

Dirk Eddelbuettel: ulid 0.3.1 on CRAN: New Maintainer, Some Polish

Happy to share that ulid is now (back) on CRAN. It provides universally unique identifiers that are lexicographically sortable, which improves over the more well-known uuid generators. ulid is a neat little package put together by Bob Rudis a few years ago. It had recently drifted off CRAN so I offered to brush it up and re-submit it. And as tooted earlier today, it took just over an hour to finish that (after the lead up work I had done, including prior email with CRAN in the loop, the repo transfer from Bob s to my ulid repo plus of course a wee bit of actual maintenance; see below for more). The NEWS entry follows.

Changes in version 0.3.1 (2024-04-02)
  • New Maintainer
  • Deleted several repository files no longer used or needed
  • Added .editorconfig, ChangeLog and cleanup
  • Converted NEWS.md to NEWS.Rd
  • Simplified R/ directory to one source file
  • Simplified src/ removing redundant Makevars
  • Added ulid() alias
  • Updated / edited roxygen and README.md documention
  • Removed vignette which was identical to README.md
  • Switched continuous integration to GitHub Actions
  • Placed upstream (header-only) library into src/ulid/
  • Renamed single interface file to src/wrapper

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

1 April 2024

Simon Josefsson: Towards reproducible minimal source code tarballs? On *-src.tar.gz

While the work to analyze the xz backdoor is in progress, several ideas have been suggested to improve the software supply chain ecosystem. Some of those ideas are good, some of the ideas are at best irrelevant and harmless, and some suggestions are plain bad. I d like to attempt to formalize two ideas, which have been discussed before, but the context in which they can be appreciated have not been as clear as it is today.
  1. Reproducible tarballs. The idea is that published source tarballs should be possible to reproduce independently somehow, and that this should be continuously tested and verified preferrably as part of the upstream project continuous integration system (e.g., GitHub action or GitLab pipeline). While nominally this looks easy to achieve, there are some complex matters in this, for example: what timestamps to use for files in the tarball? I ve brought up this aspect before.
  2. Minimal source tarballs without generated vendor files. Most GNU Autoconf/Automake-based tarballs pre-generated files which are important for bootstrapping on exotic systems that does not have the required dependencies. For the bootstrapping story to succeed, this approach is important to support. However it has become clear that this practice raise significant costs and risks. Most modern GNU/Linux distributions have all the required dependencies and actually prefers to re-build everything from source code. These pre-generated extra files introduce uncertainty to that process.
My strawman proposal to improve things is to define new tarball format *-src.tar.gz with at least the following properties:
  1. The tarball should allow users to build the project, which is the entire purpose of all this. This means that at least all source code for the project has to be included.
  2. The tarballs should be signed, for example with PGP or minisign.
  3. The tarball should be possible to reproduce bit-by-bit by a third party using upstream s version controlled sources and a pointer to which revision was used (e.g., git tag or git commit).
  4. The tarball should not require an Internet connection to download things.
    • Corollary: every external dependency either has to be explicitly documented as such (e.g., gcc and GnuTLS), or included in the tarball.
    • Observation: This means including all *.po gettext translations which are normally downloaded when building from version controlled sources.
  5. The tarball should contain everything required to build the project from source using as much externally released versioned tooling as possible. This is the minimal property lacking today.
    • Corollary: This means including a vendored copy of OpenSSL or libz is not acceptable: link to them as external projects.
    • Open question: How about non-released external tooling such as gnulib or autoconf archive macros? This is a bit more delicate: most distributions either just package one current version of gnulib or autoconf archive, not previous versions. While this could change, and distributions could package the gnulib git repository (up to some current version) and the autoconf archive git repository and packages were set up to extract the version they need (gnulib s ./bootstrap already supports this via the gnulib-refdir parameter), this is not normally in place.
    • Suggested Corollary: The tarball should contain content from git submodule s such as gnulib and the necessary Autoconf archive M4 macros required by the project.
  6. Similar to how the GNU project specify the ./configure interface we need a documented interface for how to bootstrap the project. I suggest to use the already well established idiom of running ./bootstrap to set up the package to later be able to be built via ./configure. Of course, some projects are not using the autotool ./configure interface and will not follow this aspect either, but like most build systems that compete with autotools have instructions on how to build the project, they should document similar interfaces for bootstrapping the source tarball to allow building.
If tarballs that achieve the above goals were available from popular upstream projects, distributions could more easily use them instead of current tarballs that include pre-generated content. The advantage would be that the build process is not tainted by unnecessary files. We need to develop tools for maintainers to create these tarballs, similar to make dist that generate today s foo-1.2.3.tar.gz files. I think one common argument against this approach will be: Why bother with all that, and just use git-archive outputs? Or avoid the entire tarball approach and move directly towards version controlled check outs and referring to upstream releases as git URL and commit tag or id. One problem with this is that SHA-1 is broken, so placing trust in a SHA-1 identifier is simply not secure. Another counter-argument is that this optimize for packagers benefits at the cost of upstream maintainers: most upstream maintainers do not want to store gettext *.po translations in their source code repository. A compromise between the needs of maintainers and packagers is useful, so this *-src.tar.gz tarball approach is the indirection we need to solve that. Update: In my experiment with source-only tarballs for Libntlm I actually did use git-archive output. What do you think?

29 March 2024

Ravi Dwivedi: A visit to the Taj Mahal

Note: The currency used in this post is Indian Rupees, which was around 83 INR for 1 US Dollar as that time. I and my friend Badri visited the Taj Mahal this month. Taj Mahal is one of the main tourist destinations in India and does not need an introduction, I guess. It is in Agra, in the state of Uttar Pradesh, 188 km from Delhi by train. So, I am writing a post documenting useful information for people who are planning to visit Taj Mahal. Feel free to ask me questions about visiting the Taj Mahal.
Our retiring room at the Old Delhi Railway Station.
We had booked a train from Delhi to Agra. The name of the train was Taj Express, and its scheduled departure time from Hazrat Nizamuddin station in Delhi is 07:08 hours in the morning, and its arrival time at Agra Cantt station is 09:45. So, we booked a retiring room at the Old Delhi railway station for the previous night. This retiring room was hard to find. We woke up at 05:00 in the morning and took the metro to Hazrat Nizamuddin station. We barely reached the station in time, but anyway, the train was not yet at the station; it was late. We reached Agra at 10:30 and checked into our retiring room, took rest and went out for Taj Mahal at 13:00 in the afternoon. Taj Mahal s outer gate is 5 km away from the Agra Cantt station. As we were going out of the railway station, we were chased by an autorickshaw driver who offered to go to Taj Mahal for 150 INR for both of us. I asked him to bring it down to 60 INR, and after some back and forth, he agreed to drop us off at Taj Mahal for 80 INR. But I said we won t pay anything above 60 INR. He agreed with that amount but said that he would need to fill up with more passengers. When we saw that he wasn t making any effort in bringing more passengers, we walked away. As soon as we got out of the railway station complex, an autorickshaw driver came to us and offered to drop us off at Taj Mahal for 20 INR if we are sharing with other passengers and 100 INR if we reserve the auto for us. We agreed to go with 20 INR per person, but he started the autorickshaw as soon as we hopped in. I thought that the third person in the auto was another passenger sharing a ride with us, but later we got to know he was with the driver. Upon reaching the outer gate of Taj Mahal, I gave him 40 INR (for both of us), and he asked to instead give 100 INR as he said we reserved the auto, even though I clearly stated before taking the auto that we wanted to share the auto, not reserve it. I think this was a scam. We walked away, and he didn t insist further. Taj Mahal entrance was like 500 m from the outer gate. We went there and bought offline tickets just outside the West gate. For Indians, the ticket for going inside the Taj Mahal complex is 50 INR, and a visit to the mausoleum costs 200 INR extra.
Security outside the Taj Mahal complex.
This red colored building is entrance to where you can see the Taj Mahal.
Taj Mahal.
Shoe covers for going inside the mausoleum.
Taj Mahal from side angle.
We came out of the Taj Mahal complex at 18:00 and stopped for some tea and snacks. I also bought a fridge magnet for 30 INR. Then we walked back towards Agra Cantt station, as we had a train for Jaipur at midnight. We were hoping to find a restaurant along the way, but we didn t find any that we found interesting, so we just ate at the railway station. During the return trip, we noticed there was a bus stand near the station, which we didn t know about. It turns out you can catch a bus to Taj Mahal from there. You can click here to check out the location of that bus stand on OpenStreetMap.

Expenses These were our expenses per person Retiring room at Delhi Railway Station for 12 hours 131 Train ticket from Delhi to Agra (Taj Express) 110 Retiring room at Agra Cantt station for 12 hours 450 Auto-rickshaw to Taj Mahal 20 Taj Mahal ticket (including going inside the mausoleum): 250 Food 350

Important information for visitors
  • Taj Mahal is closed on Friday.
  • There are plenty of free-of-cost drinking water taps inside the Taj Mahal complex.
  • Ticket price for Indians is 50, for foreigners and NRIs it is 1100, and for people from SAARC/BIMSTEC is 540. 200 extra for the mausoleum for everyone.
  • A visit inside the mausoleum requires covering your shoes or removing them. Shoe covers costs 10 per person inside the complex, but are probably involved free of charge in foreigner tickets. We could not find a place to keep our shoes, but some people managed to enter barefoot, indicating there must be some place to keep your shoes.
  • Mobile phones and cameras are allowed inside the Taj Mahal, but not eatables.
  • We went there on March 10th, and the weather was pleasant. So, we recommend going around that time.
  • Regarding the timings, I found this written near the ticket counter: Taj Mahal opens 30 minutes before sunrise and closes 30 minutes before sunset during normal operating days, so the timings are vague. But we came out of the complex at 18:00 hours. I would interpret that to mean the Taj Mahal is open from 07:00 to 18:00, and the ticket counter closes at around 17:00. During the winter, the timings might differ.
  • The cheapest way to reach Taj Mahal is by bus, and the bus stop is here
Bye for now. See you in the next post :)

26 March 2024

Emmanuel Kasper: Adding a private / custom Certificate Authority to the firefox trust store

Today at $WORK I needed to add the private company Certificate Authority (CA) to Firefox, and I found the steps were unnecessarily complex. Time to blog about that, and I also made a Debian wiki article of that post, so that future generations can update the information, when Firefox 742 is released on Debian 17. The cacert certificate authority is not included in Debian and Firefox, and is thus a good example of adding a private CA. Note that this does not mean I specifically endorse that CA.
  • Test that SSL connections to a site signed by the private CA is failing
$ gnutls-cli wiki.cacert.org:443
...
- Status: The certificate is NOT trusted. The certificate issuer is unknown. 
*** PKI verification of server certificate failed...
*** Fatal error: Error in the certificate.
  • Download the private CA
$ wget http://www.cacert.org/certs/root_X0F.crt
  • test that a connection works with the private CA
$ gnutls-cli --x509cafile root_X0F.crt wiki.cacert.org:443
...
- Status: The certificate is trusted. 
- Description: (TLS1.2-X.509)-(ECDHE-SECP256R1)-(RSA-SHA256)-(AES-256-GCM)
- Session ID: 37:56:7A:89:EA:5F:13:E8:67:E4:07:94:4B:52:23:63:1E:54:31:69:5D:70:17:3C:D0:A4:80:B0:3A:E5:22:B3
- Options: safe renegotiation,
- Handshake was completed
...
  • add the private CA to the Debian trust store located in /etc/ssl/certs/ca-certificates.crt
$ sudo cp root_X0F.crt /usr/local/share/ca-certificates/cacert-org-root-ca.crt
$ sudo update-ca-certificates --verbose
...
Adding debian:cacert-org-root-ca.pem
...
  • verify that we can connect without passing the private CA on the command line
$ gnutls-cli wiki.cacert.org:443
... 
 - Status: The certificate is trusted.
  • At that point most applications are able to connect to systems with a certificate signed by the private CA (curl, Gnome builtin Browser ). However Firefox is using its own trust store and will still display a security error if connecting to https://wiki.cacert.org. To make Firefox trust the Debian trust store, we need to add a so called security device, in fact an extra library wrapping the Debian trust store. The library will wrap the Debian trust store in the PKCS#11 industry format that Firefox supports.
  • install the pkcs#11 wrapping library and command line tools
$ sudo apt install p11-kit p11-kit-modules
  • verify that the private CA is accessible via PKCS#11
$ trust list   grep --context 2 'CA Cert'
pkcs11:id=%16%B5%32%1B%D4%C7%F3%E0%E6%8E%F3%BD%D2%B0%3A%EE%B2%39%18%D1;type=cert
    type: certificate
    label: CA Cert Signing Authority
    trust: anchor
    category: authority
  • now we need to add a new security device in Firefox pointing to the pkcs11 trust store. The pkcs11 trust store is located in /usr/lib/x86_64-linux-gnu/pkcs11/p11-kit-trust.so
$ dpkg --listfiles p11-kit-modules   grep trust
/usr/lib/x86_64-linux-gnu/pkcs11/p11-kit-trust.so
  • in Firefox (tested in version 115 esr), go to Settings -> Privacy & Security -> Security -> Security Devices.
    Then click Load , in the popup window use My local trust as a module name, and /usr/lib/x86_64-linux-gnu/pkcs11/p11-kit-trust.so as a module filename. After adding the module, you should see it in the list of Security Devices, having /etc/ssl/certs/ca-certificates.crt as a description.
  • now restart Firefox and you should be able to browse https://wiki.cacert.org without security errors

25 March 2024

Jonathan Dowland: a bug a day

I recently became a maintainer of/committer to IkiWiki, the software that powers my site. I also took over maintenance of the Debian package. Last week I cut a new upstream point release, 3.20200202.4, and a corresponding Debian package upload, consisting only of a handful of low-hanging-fruit patches from other people, largely to exercise both processes. I've been discussing IkiWiki's maintenance situation with some other users for a couple of years now. I've also weighed up the pros and cons of moving to a different static-site-generator (a term that describes what IkiWiki is, but was actually coined more recently). It turns out IkiWiki is exceptionally flexible and powerful: I estimate the cost of moving to something modern(er) and fashionable such as Jekyll, Hugo or Hakyll as unreasonably high, in part because they are surprisingly rigid and inflexible in some key places. Like most mature software, IkiWiki has a bug backlog. Over the past couple of weeks, as a sort-of "palate cleanser" around work pieces, I've tried to triage one IkiWiki bug per day: either upstream or in the Debian Bug Tracker. This is a really lightweight task: it can be as simple as "find a bug reported in Debian, copy it upstream, tag it upstream, mark it forwarded; perhaps taking 5-10 minutes. Often I'll stumble across something that has already been fixed but not recorded as such as I go. Despite this minimal level of work, I'm quite satisfied with the cumulative progress. It's notable to me how much my perspective has shifted by becoming a maintainer: I'm considering everything through a different lens to that of being just one user. Eventually I will put some time aside to scratch some of my own itches (html5 by default; support dark mode; duckduckgo plugin; use the details tag...) but for now this minimal exercise is of broader use.

24 March 2024

Niels Thykier: debputy v0.1.21

Earlier today, I have just released debputy version 0.1.21 to Debian unstable. In the blog post, I will highlight some of the new features.
Package boilerplate reduction with automatic relationship substvar Last month, I started a discussion on rethinking how we do relationship substvars such as the $ misc:Depends . These generally ends up being boilerplate runes in the form of Depends: $ misc:Depends , $ shlibs:Depends where you as the packager has to remember exactly which runes apply to your package. My proposed solution was to automatically apply these substvars and this feature has now been implemented in debputy. It is also combined with the feature where essential packages should use Pre-Depends by default for dpkg-shlibdeps related dependencies. I am quite excited about this feature, because I noticed with libcleri that we are now down to 3-5 fields for defining a simple library package. Especially since most C library packages are trivial enough that debputy can auto-derive them to be Multi-Arch: same. As an example, the libcleric1 package is down to 3 fields (Package, Architecture, Description) with Section and Priority being inherited from the Source stanza. I have submitted a MR to show case the boilerplate reduction at https://salsa.debian.org/siridb-team/libcleri/-/merge_requests/3. The removal of libcleric1 (= $ binary:Version ) in that MR relies on another existing feature where debputy can auto-derive a dependency between an arch:any -dev package and the library package based on the .so symlink for the shared library. The arch:any restriction comes from the fact that arch:all and arch:any packages are not built together, so debputy cannot reliably see across the package boundaries during the build (and therefore refuses to do so at all). Packages that have already migrated to debputy can use debputy migrate-from-dh to detect any unnecessary relationship substitution variables in case you want to clean up. The removal of Multi-Arch: same and intra-source dependencies must be done manually and so only be done so when you have validated that it is safe and sane to do. I was willing to do it for the show-case MR, but I am less confident that would bother with these for existing packages in general. Note: I summarized the discussion of the automatic relationship substvar feature earlier this month in https://lists.debian.org/debian-devel/2024/03/msg00030.html for those who want more details. PS: The automatic relationship substvars feature will also appear in debhelper as a part of compat 14.
Language Server (LSP) and Linting I have long been frustrated by our poor editor support for Debian packaging files. To this end, I started working on a Language Server (LSP) feature in debputy that would cover some of our standard Debian packaging files. This release includes the first version of said language server, which covers the following files:
  • debian/control
  • debian/copyright (the machine readable variant)
  • debian/changelog (mostly just spelling)
  • debian/rules
  • debian/debputy.manifest (syntax checks only; use debputy check-manifest for the full validation for now)
Most of the effort has been spent on the Deb822 based files such as debian/control, which comes with diagnostics, quickfixes, spellchecking (but only for relevant fields!), and completion suggestions. Since not everyone has a LSP capable editor and because sometimes you just want diagnostics without having to open each file in an editor, there is also a batch version for the diagnostics via debputy lint. Please see debputy(1) for how debputy lint compares with lintian if you are curious about which tool to use at what time. To help you getting started, there is a now debputy lsp editor-config command that can provide you with the relevant editor config glue. At the moment, emacs (via eglot) and vim with vim-youcompleteme are supported. For those that followed the previous blog posts on writing the language server, I would like to point out that the command line for running the language server has changed to debputy lsp server and you no longer have to tell which format it is. I have decided to make the language server a "polyglot" server for now, which I will hopefully not regret... Time will tell. :) Anyhow, to get started, you will want:
$ apt satisfy 'dh-debputy (>= 0.1.21~), python3-pygls'
# Optionally, for spellchecking
$ apt install python3-hunspell hunspell-en-us
# For emacs integration
$ apt install elpa-dpkg-dev-el markdown-mode-el
# For vim integration via vim-youcompleteme
$ apt install vim-youcompleteme
Specifically for emacs, I also learned two things after the upload. First, you can auto-activate eglot via eglot-ensure. This badly feature interacts with imenu on debian/changelog for reasons I do not understand (causing a several second start up delay until something times out), but it works fine for the other formats. Oddly enough, opening a changelog file and then activating eglot does not trigger this issue at all. In the next version, editor config for emacs will auto-activate eglot on all files except debian/changelog. The second thing is that if you install elpa-markdown-mode, emacs will accept and process markdown in the hover documentation provided by the language server. Accordingly, the editor config for emacs will also mention this package from the next version on. Finally, on a related note, Jelmer and I have been looking at moving some of this logic into a new package called debpkg-metadata. The point being to support easier reuse of linting and LSP related metadata - like pulling a list of known fields for debian/control or sharing logic between lintian-brush and debputy.
Minimal integration mode for Rules-Requires-Root One of the original motivators for starting debputy was to be able to get rid of fakeroot in our build process. While this is possible, debputy currently does not support most of the complex packaging features such as maintscripts and debconf. Unfortunately, the kind of packages that need fakeroot for static ownership tend to also require very complex packaging features. To bridge this gap, the new version of debputy supports a very minimal integration with dh via the dh-sequence-zz-debputy-rrr. This integration mode keeps the vast majority of debhelper sequence in place meaning most dh add-ons will continue to work with dh-sequence-zz-debputy-rrr. The sequence only replaces the following commands:
  • dh_fixperms
  • dh_gencontrol
  • dh_md5sums
  • dh_builddeb
The installations feature of the manifest will be disabled in this integration mode to avoid feature interactions with debhelper tools that expect debian/<pkg> to contain the materialized package. On a related note, the debputy migrate-from-dh command now supports a --migration-target option, so you can choose the desired level of integration without doing code changes. The command will attempt to auto-detect the desired integration from existing package features such as a build-dependency on a relevant dh sequence, so you do not have to remember this new option every time once the migration has started. :)

Jacob Adams: Regular Reboots

Uptime is often considered a measure of system reliability, an indication that the running software is stable and can be counted on. However, this hides the insidious build-up of state throughout the system as it runs, the slow drift from the expected to the strange. As Nolan Lawson highlights in an excellent post entitled Programmers are bad at managing state, state is the most challenging part of programming. It s why did you try turning it off and on again is a classic tech support response to any problem. In addition to the problem of state, installing regular updates periodically requires a reboot, even if the rest of the process is automated through a tool like unattended-upgrades. For my personal homelab, I manage a handful of different machines running various services. I used to just schedule a day to update and reboot all of them, but that got very tedious very quickly. I then moved the reboot to a cronjob, and then recently to a systemd timer and service. I figure that laying out my path to better management of this might help others, and will almost certainly lead to someone telling me a better way to do this. UPDATE: Turns out there s another option for better systemd cron integration. See systemd-cron below.

Stage One: Reboot Cron The first, and easiest approach, is a simple cron job. Just adding the following line to /var/spool/cron/crontabs/root1 is enough to get your machine to reboot once a month2 on the 6th at 8:00 AM3:
0 8 6 * * reboot
I had this configured for many years and it works well. But you have no indication as to whether it succeeds except for checking your uptime regularly yourself.

Stage Two: Reboot systemd Timer The next evolution of this approach for me was to use a systemd timer. I created a regular-reboot.timer with the following contents:
[Unit]
Description=Reboot on a Regular Basis
[Timer]
Unit=regular-reboot.service
OnBootSec=1month
[Install]
WantedBy=timers.target
This timer will trigger the regular-reboot.service systemd unit when the system reaches one month of uptime. I ve seen some guides to creating timer units recommend adding a Wants=regular-reboot.service to the [Unit] section, but this has the consequence of running that service every time it starts the timer. In this case that will just reboot your system on startup which is not what you want. Care needs to be taken to use the OnBootSec directive instead of OnCalendar or any of the other time specifications, as your system could reboot, discover its still within the expected window and reboot again. With OnBootSec your system will not have that problem. Technically, this same problem could have occurred with the cronjob approach, but in practice it never did, as the systems took long enough to come back up that they were no longer within the expected window for the job. I then added the regular-reboot.service:
[Unit]
Description=Reboot on a Regular Basis
Wants=regular-reboot.timer
[Service]
Type=oneshot
ExecStart=shutdown -r 02:45
You ll note that this service is actually scheduling a specific reboot time via the shutdown command instead of just immediately rebooting. This is a bit of a hack needed because I can t control when the timer runs exactly when using OnBootSec. This way different systems have different reboot times so that everything doesn t just reboot and fail all at once. Were something to fail to come back up I would have some time to fix it, as each machine has a few hours between scheduled reboots. One you have both files in place, you ll simply need to reload configuration and then enable and start the timer unit:
systemctl daemon-reload
systemctl enable --now regular-reboot.timer
You can then check when it will fire next:
# systemctl status regular-reboot.timer
  regular-reboot.timer - Reboot on a Regular Basis
     Loaded: loaded (/etc/systemd/system/regular-reboot.timer; enabled; preset: enabled)
     Active: active (waiting) since Wed 2024-03-13 01:54:52 EDT; 1 week 4 days ago
    Trigger: Fri 2024-04-12 12:24:42 EDT; 2 weeks 4 days left
   Triggers:   regular-reboot.service
Mar 13 01:54:52 dorfl systemd[1]: Started regular-reboot.timer - Reboot on a Regular Basis.

Sidenote: Replacing all Cron Jobs with systemd Timers More generally, I ve now replaced all cronjobs on my personal systems with systemd timer units, mostly because I can now actually track failures via prometheus-node-exporter. There are plenty of ways to hack in cron support to the node exporter, but just moving to systemd units provides both support for tracking failure and logging, both of which make system administration much easier when things inevitably go wrong.

systemd-cron An alternative to converting everything by hand, if you happen to have a lot of cronjobs is systemd-cron. It will make each crontab and /etc/cron.* directory into automatic service and timer units. Thanks to Alexandre Detiste for letting me know about this project. I have few enough cron jobs that I ve already converted, but for anyone looking at a large number of jobs to convert you ll want to check it out!

Stage Three: Monitor that it s working The final step here is confirm that these units actually work, beyond just firing regularly. I now have the following rule in my prometheus-alertmanager rules:
  - alert: UptimeTooHigh
    expr: (time() - node_boot_time_seconds job="node" ) / 86400 > 35
    annotations:
      summary: "Instance  Has Been Up Too Long!"
      description: "Instance  Has Been Up Too Long!"
This will trigger an alert anytime that I have a machine up for more than 35 days. This actually helped me track down one machine that I had forgotten to set up this new unit on4.

Not everything needs to scale Is It Worth The Time One of the most common fallacies programmers fall into is that we will jump to automating a solution before we stop and figure out how much time it would even save. In taking a slow improvement route to solve this problem for myself, I ve managed not to invest too much time5 in worrying about this but also achieved a meaningful improvement beyond my first approach of doing it all by hand.
  1. You could also add a line to /etc/crontab or drop a script into /etc/cron.monthly depending on your system.
  2. Why once a month? Mostly to avoid regular disruptions, but still be reasonably timely on updates.
  3. If you re looking to understand the cron time format I recommend crontab guru.
  4. In the long term I really should set up something like ansible to automatically push fleetwide changes like this but with fewer machines than fingers this seems like overkill.
  5. Of course by now writing about it, I ve probably doubled the amount of time I ve spent thinking about this topic but oh well

18 March 2024

Simon Josefsson: Apt archive mirrors in Git-LFS

My effort to improve transparency and confidence of public apt archives continues. I started to work on this in Apt Archive Transparency in which I mention the debdistget project in passing. Debdistget is responsible for mirroring index files for some public apt archives. I ve realized that having a publicly auditable and preserved mirror of the apt repositories is central to being able to do apt transparency work, so the debdistget project has become more central to my project than I thought. Currently I track Trisquel, PureOS, Gnuinos and their upstreams Ubuntu, Debian and Devuan. Debdistget download Release/Package/Sources files and store them in a git repository published on GitLab. Due to size constraints, it uses two repositories: one for the Release/InRelease files (which are small) and one that also include the Package/Sources files (which are large). See for example the repository for Trisquel release files and the Trisquel package/sources files. Repositories for all distributions can be found in debdistutils archives GitLab sub-group. The reason for splitting into two repositories was that the git repository for the combined files become large, and that some of my use-cases only needed the release files. Currently the repositories with packages (which contain a couple of months worth of data now) are 9GB for Ubuntu, 2.5GB for Trisquel/Debian/PureOS, 970MB for Devuan and 450MB for Gnuinos. The repository size is correlated to the size of the archive (for the initial import) plus the frequency and size of updates. Ubuntu s use of Apt Phased Updates (which triggers a higher churn of Packages file modifications) appears to be the primary reason for its larger size. Working with large Git repositories is inefficient and the GitLab CI/CD jobs generate quite some network traffic downloading the git repository over and over again. The most heavy user is the debdistdiff project that download all distribution package repositories to do diff operations on the package lists between distributions. The daily job takes around 80 minutes to run, with the majority of time is spent on downloading the archives. Yes I know I could look into runner-side caching but I dislike complexity caused by caching. Fortunately not all use-cases requires the package files. The debdistcanary project only needs the Release/InRelease files, in order to commit signatures to the Sigstore and Sigsum transparency logs. These jobs still run fairly quickly, but watching the repository size growth worries me. Currently these repositories are at Debian 440MB, PureOS 130MB, Ubuntu/Devuan 90MB, Trisquel 12MB, Gnuinos 2MB. Here I believe the main size correlation is update frequency, and Debian is large because I track the volatile unstable. So I hit a scalability end with my first approach. A couple of months ago I solved this by discarding and resetting these archival repositories. The GitLab CI/CD jobs were fast again and all was well. However this meant discarding precious historic information. A couple of days ago I was reaching the limits of practicality again, and started to explore ways to fix this. I like having data stored in git (it allows easy integration with software integrity tools such as GnuPG and Sigstore, and the git log provides a kind of temporal ordering of data), so it felt like giving up on nice properties to use a traditional database with on-disk approach. So I started to learn about Git-LFS and understanding that it was able to handle multi-GB worth of data that looked promising. Fairly quickly I scripted up a GitLab CI/CD job that incrementally update the Release/Package/Sources files in a git repository that uses Git-LFS to store all the files. The repository size is now at Ubuntu 650kb, Debian 300kb, Trisquel 50kb, Devuan 250kb, PureOS 172kb and Gnuinos 17kb. As can be expected, jobs are quick to clone the git archives: debdistdiff pipelines went from a run-time of 80 minutes down to 10 minutes which more reasonable correlate with the archive size and CPU run-time. The LFS storage size for those repositories are at Ubuntu 15GB, Debian 8GB, Trisquel 1.7GB, Devuan 1.1GB, PureOS/Gnuinos 420MB. This is for a couple of days worth of data. It seems native Git is better at compressing/deduplicating data than Git-LFS is: the combined size for Ubuntu is already 15GB for a couple of days data compared to 8GB for a couple of months worth of data with pure Git. This may be a sub-optimal implementation of Git-LFS in GitLab but it does worry me that this new approach will be difficult to scale too. At some level the difference is understandable, Git-LFS probably store two different Packages files around 90MB each for Trisquel as two 90MB files, but native Git would store it as one compressed version of the 90MB file and one relatively small patch to turn the old files into the next file. So the Git-LFS approach surprisingly scale less well for overall storage-size. Still, the original repository is much smaller, and you usually don t have to pull all LFS files anyway. So it is net win. Throughout this work, I kept thinking about how my approach relates to Debian s snapshot service. Ultimately what I would want is a combination of these two services. To have a good foundation to do transparency work I would want to have a collection of all Release/Packages/Sources files ever published, and ultimately also the source code and binaries. While it makes sense to start on the latest stable releases of distributions, this effort should scale backwards in time as well. For reproducing binaries from source code, I need to be able to securely find earlier versions of binary packages used for rebuilds. So I need to import all the Release/Packages/Sources packages from snapshot into my repositories. The latency to retrieve files from that server is slow so I haven t been able to find an efficient/parallelized way to download the files. If I m able to finish this, I would have confidence that my new Git-LFS based approach to store these files will scale over many years to come. This remains to be seen. Perhaps the repository has to be split up per release or per architecture or similar. Another factor is storage costs. While the git repository size for a Git-LFS based repository with files from several years may be possible to sustain, the Git-LFS storage size surely won t be. It seems GitLab charges the same for files in repositories and in Git-LFS, and it is around $500 per 100GB per year. It may be possible to setup a separate Git-LFS backend not hosted at GitLab to serve the LFS files. Does anyone know of a suitable server implementation for this? I had a quick look at the Git-LFS implementation list and it seems the closest reasonable approach would be to setup the Gitea-clone Forgejo as a self-hosted server. Perhaps a cloud storage approach a la S3 is the way to go? The cost to host this on GitLab will be manageable for up to ~1TB ($5000/year) but scaling it to storing say 500TB of data would mean an yearly fee of $2.5M which seems like poor value for the money. I realized that ultimately I would want a git repository locally with the entire content of all apt archives, including their binary and source packages, ever published. The storage requirements for a service like snapshot (~300TB of data?) is today not prohibitly expensive: 20TB disks are $500 a piece, so a storage enclosure with 36 disks would be around $18.000 for 720TB and using RAID1 means 360TB which is a good start. While I have heard about ~TB-sized Git-LFS repositories, would Git-LFS scale to 1PB? Perhaps the size of a git repository with multi-millions number of Git-LFS pointer files will become unmanageable? To get started on this approach, I decided to import a mirror of Debian s bookworm for amd64 into a Git-LFS repository. That is around 175GB so reasonable cheap to host even on GitLab ($1000/year for 200GB). Having this repository publicly available will make it possible to write software that uses this approach (e.g., porting debdistreproduce), to find out if this is useful and if it could scale. Distributing the apt repository via Git-LFS would also enable other interesting ideas to protecting the data. Consider configuring apt to use a local file:// URL to this git repository, and verifying the git checkout using some method similar to Guix s approach to trusting git content or Sigstore s gitsign. A naive push of the 175GB archive in a single git commit ran into pack size limitations: remote: fatal: pack exceeds maximum allowed size (4.88 GiB) however breaking up the commit into smaller commits for parts of the archive made it possible to push the entire archive. Here are the commands to create this repository: git init
git lfs install
git lfs track 'dists/**' 'pool/**'
git add .gitattributes
git commit -m"Add Git-LFS track attributes." .gitattributes
time debmirror --method=rsync --host ftp.se.debian.org --root :debian --arch=amd64 --source --dist=bookworm,bookworm-updates --section=main --verbose --diff=none --keyring /usr/share/keyrings/debian-archive-keyring.gpg --ignore .git .
git add dists project
git commit -m"Add." -a
git remote add origin git@gitlab.com:debdistutils/archives/debian/mirror.git
git push --set-upstream origin --all
for d in pool//; do
echo $d;
time git add $d;
git commit -m"Add $d." -a
git push
done
The resulting repository size is around 27MB with Git LFS object storage around 174GB. I think this approach would scale to handle all architectures for one release, but working with a single git repository for all releases for all architectures may lead to a too large git repository (>1GB). So maybe one repository per release? These repositories could also be split up on a subset of pool/ files, or there could be one repository per release per architecture or sources. Finally, I have concerns about using SHA1 for identifying objects. It seems both Git and Debian s snapshot service is currently using SHA1. For Git there is SHA-256 transition and it seems GitLab is working on support for SHA256-based repositories. For serious long-term deployment of these concepts, it would be nice to go for SHA256 identifiers directly. Git-LFS already uses SHA256 but Git internally uses SHA1 as does the Debian snapshot service. What do you think? Happy Hacking!

11 March 2024

Joachim Breitner: Convenient sandboxed development environment

I like using one machine and setup for everything, from serious development work to hobby projects to managing my finances. This is very convenient, as often the lines between these are blurred. But it is also scary if I think of the large number of people who I have to trust to not want to extract all my personal data. Whenever I run a cabal install, or a fun VSCode extension gets updated, or anything like that, I am running code that could be malicious or buggy. In a way it is surprising and reassuring that, as far as I can tell, this commonly does not happen. Most open source developers out there seem to be nice and well-meaning, after all.

Convenient or it won t happen Nevertheless I thought I should do something about this. The safest option would probably to use dedicated virtual machines for the development work, with very little interaction with my main system. But knowing me, that did not seem likely to happen, as it sounded like a fair amount of hassle. So I aimed for a viable compromise between security and convenient, and one that does not get too much in the way of my current habits. For instance, it seems desirable to have the project files accessible from my unconstrained environment. This way, I could perform certain actions that need access to secret keys or tokens, but are (unlikely) to run code (e.g. git push, git pull from private repositories, gh pr create) from the outside , and the actual build environment can do without access to these secrets. The user experience I thus want is a quick way to enter a development environment where I can do most of the things I need to do while programming (network access, running command line and GUI programs), with access to the current project, but without access to my actual /home directory. I initially followed the blog post Application Isolation using NixOS Containers by Marcin Sucharski and got something working that mostly did what I wanted, but then a colleague pointed out that tools like firejail can achieve roughly the same with a less global setup. I tried to use firejail, but found it to be a bit too inflexible for my particular whims, so I ended up writing a small wrapper around the lower level sandboxing tool https://github.com/containers/bubblewrap.

Selective bubblewrapping This script, called dev and included below, builds a new filesystem namespace with minimal /proc and /dev directories, it s own /tmp directories. It then binds-mound some directories to make the host s NixOS system available inside the container (/bin, /usr, the nix store including domain socket, stuff for OpenGL applications). My user s home directory is taken from ~/.dev-home and some configuration files are bind-mounted for convenient sharing. I intentionally don t share most of the configuration for example, a direnv enable in the dev environment should not affect the main environment. The X11 socket for graphical applications and the corresponding .Xauthority file is made available. And finally, if I run dev in a project directory, this project directory is bind mounted writable, and the current working directory is preserved. The effect is that I can type dev on the command line to enter dev mode rather conveniently. I can run development tools, including graphical ones like VSCode, and especially the latter with its extensions is part of the sandbox. To do a git push I either exit the development environment (Ctrl-D) or open a separate terminal. Overall, the inconvenience of switching back and forth seems worth the extra protection. Clearly, isn t going to hold against a determined and maybe targeted attacker (e.g. access to the X11 and the nix daemon socket can probably be used to escape easily). But I hope it will help against a compromised dev dependency that just deletes or exfiltrates data, like keys or passwords, from the usual places in $HOME.

Rough corners There is more polishing that could be done.
  • In particular, clicking on a link inside VSCode in the container will currently open Firefox inside the container, without access to my settings and cookies etc. Ideally, links would be opened in the Firefox running outside. This is a problem that has a solution in the world of applications that are sandboxed with Flatpak, and involves a bunch of moving parts (a xdg-desktop-portal user service, a filtering dbus proxy, exposing access to that proxy in the container). I experimented with that for a bit longer than I should have, but could not get it to work to satisfaction (even without a container involved, I could not get xdg-desktop-portal to heed my default browser settings ). For now I will live with manually copying and pasting URLs, we ll see how long this lasts.
  • With this setup (and unlike the NixOS container setup I tried first), the same applications are installed inside and outside. It might be useful to separate the set of installed programs: There is simply no point in running evolution or firefox inside the container, and if I do not even have VSCode or cabal available outside, so that it s less likely that I forget to enter dev before using these tools. It shouldn t be too hard to cargo-cult some of the NixOS Containers infrastructure to be able to have a separate system configuration that I can manage as part of my normal system configuration and make available to bubblewrap here.
So likely I will refine this some more over time. Or get tired of typing dev and going back to what I did before

The script
The dev script (at the time of writing)

9 March 2024

Reproducible Builds: Reproducible Builds in February 2024

Welcome to the February 2024 report from the Reproducible Builds project! In our reports, we try to outline what we have been up to over the past month as well as mentioning some of the important things happening in software supply-chain security.

Reproducible Builds at FOSDEM 2024 Core Reproducible Builds developer Holger Levsen presented at the main track at FOSDEM on Saturday 3rd February this year in Brussels, Belgium. However, that wasn t the only talk related to Reproducible Builds. However, please see our comprehensive FOSDEM 2024 news post for the full details and links.

Maintainer Perspectives on Open Source Software Security Bernhard M. Wiedemann spotted that a recent report entitled Maintainer Perspectives on Open Source Software Security written by Stephen Hendrick and Ashwin Ramaswami of the Linux Foundation sports an infographic which mentions that 56% of [polled] projects support reproducible builds .

Mailing list highlights From our mailing list this month:

Distribution work In Debian this month, 5 reviews of Debian packages were added, 22 were updated and 8 were removed this month adding to Debian s knowledge about identified issues. A number of issue types were updated as well. [ ][ ][ ][ ] In addition, Roland Clobus posted his 23rd update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) and that there will likely be further work at a MiniDebCamp in Hamburg. Furthermore, Roland also responded in-depth to a query about a previous report
Fedora developer Zbigniew J drzejewski-Szmek announced a work-in-progress script called fedora-repro-build that attempts to reproduce an existing package within a koji build environment. Although the projects README file lists a number of fields will always or almost always vary and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.
Jelle van der Waa introduced a new linter rule for Arch Linux packages in order to detect cache files leftover by the Sphinx documentation generator which are unreproducible by nature and should not be packaged. At the time of writing, 7 packages in the Arch repository are affected by this.
Elsewhere, Bernhard M. Wiedemann posted another monthly update for his work elsewhere in openSUSE.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 256, 257 and 258 to Debian and made the following additional changes:
  • Use a deterministic name instead of trusting gpg s use-embedded-filenames. Many thanks to Daniel Kahn Gillmor dkg@debian.org for reporting this issue and providing feedback. [ ][ ]
  • Don t error-out with a traceback if we encounter struct.unpack-related errors when parsing Python .pyc files. (#1064973). [ ]
  • Don t try and compare rdb_expected_diff on non-GNU systems as %p formatting can vary, especially with respect to MacOS. [ ]
  • Fix compatibility with pytest 8.0. [ ]
  • Temporarily fix support for Python 3.11.8. [ ]
  • Use the 7zip package (over p7zip-full) after a Debian package transition. (#1063559). [ ]
  • Bump the minimum Black source code reformatter requirement to 24.1.1+. [ ]
  • Expand an older changelog entry with a CVE reference. [ ]
  • Make test_zip black clean. [ ]
In addition, James Addison contributed a patch to parse the headers from the diff(1) correctly [ ][ ] thanks! And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to version 255, 256, and 258, and updated trydiffoscope to 67.0.6.

reprotest reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes, including:
  • Create a (working) proof of concept for enabling a specific number of CPUs. [ ][ ]
  • Consistently use 398 days for time variation rather than choosing randomly and update README.rst to match. [ ][ ]
  • Support a new --vary=build_path.path option. [ ][ ][ ][ ]

Website updates There were made a number of improvements to our website this month, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In February, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Temporarily disable upgrading/bootstrapping Debian unstable and experimental as they are currently broken. [ ][ ]
    • Use the 64-bit amd64 kernel on all i386 nodes; no more 686 PAE kernels. [ ]
    • Add an Erlang package set. [ ]
  • Other changes:
    • Grant Jan-Benedict Glaw shell access to the Jenkins node. [ ]
    • Enable debugging for NetBSD reproducibility testing. [ ]
    • Use /usr/bin/du --apparent-size in the Jenkins shell monitor. [ ]
    • Revert reproducible nodes: mark osuosl2 as down . [ ]
    • Thanks again to Codethink, for they have doubled the RAM on our arm64 nodes. [ ]
    • Only set /proc/$pid/oom_score_adj to -1000 if it has not already been done. [ ]
    • Add the opemwrt-target-tegra and jtx task to the list of zombie jobs. [ ][ ]
Vagrant Cascadian also made the following changes:
  • Overhaul the handling of OpenSSH configuration files after updating from Debian bookworm. [ ][ ][ ]
  • Add two new armhf architecture build nodes, virt32z and virt64z, and insert them into the Munin monitoring. [ ][ ] [ ][ ]
In addition, Alexander Couzens updated the OpenWrt configuration in order to replace the tegra target with mpc85xx [ ], Jan-Benedict Glaw updated the NetBSD build script to use a separate $TMPDIR to mitigate out of space issues on a tmpfs-backed /tmp [ ] and Zheng Junjie added a link to the GNU Guix tests [ ]. Lastly, node maintenance was performed by Holger Levsen [ ][ ][ ][ ][ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

2 March 2024

Ravi Dwivedi: Malaysia Trip

Last month, I had a trip to Malaysia and Thailand. I stayed for six days in each of the countries. The selection of these countries was due to both of them granting visa-free entry to Indian tourists for some time window. This post covers the Malaysia part and Thailand part will be covered in the next post. If you want to travel to any of these countries in the visa-free time period, I have written all the questions asked during immigration and at airports during this trip here which might be of help. I mostly stayed in Kuala Lumpur and went to places around it. Although before the trip, I planned to visit Ipoh and Cameron Highlands too, but could not cover it during the trip. I found planning a trip to Malaysia a little difficult. The country is divided into two main islands - Peninsular Malaysia and Borneo. Then there are more islands - Langkawi, Penang island, Perhentian and Redang Islands. Reaching those islands seemed a little difficult to plan and I wish to visit more places in my next Malaysia trip. My first day hostel was booked in Chinatown part of Kuala Lumpur, near Pasar Seni LRT station. As soon as I checked-in and entered my room, I met another Indian named Fletcher, and after that we accompanied each other in the trip. That day, we went to Muzium Negara and Little India. I realized that if you know the right places to buy what you want, Malaysia could be quite cheap. Malaysian currency is Malaysian Ringgit (MYR). 1 MYR is equal to 18 INR. For 2 MYR, you can get a good masala tea in Little India and it costs like 4-5 MYR for a masala dosa. The vegetarian food has good availability in Kuala Lumpur, thanks to the Tamil community. I also tried Mee Goreng, which was vegetarian, and I found it fine in terms of taste. When I checked about Mee Goreng on Wikipedia, I found out that it is unique to Indian immigrants in Malaysia (and neighboring countries) but you don t get it in India!
Mee Goreng, a dish made of noodles in Malaysia.
For the next day, Fletcher had planned a trip to Genting Highlands and pre booked everything. I also planned to join him but when we went to KL Sentral to take the bus, his bus tickets were sold out. I could take a bus at a different time, but decided to visit some other place for the day and cover Genting Highlands later. At the ticket counter, I met a family from Delhi and they wanted to go to Genting Highlands but due to not getting bus tickets for that day, they decided to buy a ticket for the next day and instead planned for Batu Caves that day. I joined them and went to Batu Caves. After returning from Batu Caves, we went our separate ways. I went back and took rest at my hostel and later went to Petronas Towers at night. Petronas Towers is the icon of Kuala Lumpur. Having a photo there was a must. I was at Petronas Towers at around 9 PM. Around that time, Fletcher came back from Genting Highlands and we planned to meet at KL Sentral to head for dinner.
Me at Petronas Towers.
We went back to the same place as the day before where I had Mee Goreng. This time we had dosa and a masala tea. Their masala tea from the last day was tasty and that s why I was looking for them in the first place. We also met a Malaysian family having Indian ancestry dining there and had a nice conversation. Then we went to a place to eat roti canai in Pasar Seni market. Roti canai is a popular non-vegetarian dish in Malaysia but I took the vegetarian version.
Photo with Malaysians.
The next day, we went to Berjaya Time Square shopping place which sells pretty cheap items for daily use and souveniers too. However, I bought souveniers from Petaling Street, which is in Chinatown. At night, we explored Bukit Bintang, which is the heart of Kuala Lumpur and is famous for its nightlife. After that, Fletcher went to Bangkok and I was in Malaysia for two more days. Next day, I went to Genting Highlands and took the cable car, which had awesome views. I came back to Kuala Lumpur by the night. The remaining day I just roamed around in Bukit Bintang. Then I took a flight for Bangkok on 7th Feb, which I will cover in the next post. In Malaysia, I met so many people from different countries - apart from people from Indian subcontinent, I met Syrians, Indonesians (Malaysia seems to be a popular destination for Indonesian tourists) and Burmese people. Meeting people from other cultures is an integral part of travel for me. My expenses for Food + Accommodation + Travel added to 10,000 INR for a week in Malaysia, while flight costs were: 13,000 INR (Delhi to Kuala Lumpur) + 10,000 INR (Kuala Lumpur to Bangkok) + 12,000 INR (Bangkok to Delhi). For OpenStreetMap users, good news is Kuala Lumpur is fairly well-mapped on OpenStreetMap.

Tips
  • I bought local SIM from a shop at KL Sentral station complex which had news in their name (I forgot the exact name and there are two shops having news in their name) and it was the cheapest option I could find. The SIM was 10 MYR for 5 GB data for a week. If you want to make calls too, then you need to spend extra 5 MYR.
  • 7-Eleven and KK Mart convenience stores are everywhere in the city and they are open all the time (24 hours a day). If you are a vegetarian, you can at least get some bread and cheese from there to eat.
  • A lot of people know English (and many - Indians, Pakistanis, Nepalis - know Hindi) in Kuala Lumpur, so I had no language problems most of the time.
  • For shopping on budget, you can go to Petaling Street, Berjaya Time Square or Bukit Bintang. In particular, there is a shop named I Love KL Gifts in Bukit Bintang which had very good prices. just near the metro/monorail stattion. Check out location of the shop on OpenStreetMap.

28 February 2024

Daniel Lange: Opencollective shutting down

Update 28.02.2024 19:45 CET: There is now a blog entry at https://blog.opencollective.com/open-collective-official-statement-ocf-dissolution/ trying to discern the legal entities in the Open Collective ecosystem and recommending potential ways forward.
Gee, there is nothing on their blog yet, but I just [28.02.2023 00:07 CET] received this email from Mike Strode, Program Officer at the Open Collective Foundation: Dear Daniel Lange, It is with a heavy heart that I'm writing today to inform you that the Board of Directors of the Open Collective Foundation (OCF) has made the difficult decision to dissolve OCF, effective December 31, 2024. We are proud of the work we have been able to do together. We have been honored to build community with you and the hundreds of other collectives hosted at the Open Collective Foundation. What you need to know: We are beginning a staged dissolution process that will allow our over 600 collectives the time to close or transition their work. Dissolving OCF will take many months, and involves settling all liabilities while spending down all funds in a legally compliant manner. Our priority is to support our collectives in navigating this change. We want to provide collectives the longest possible runway to wind down or transition their operations while we focus on the many legal and financial tasks associated with dissolving a nonprofit. March 15 is the last day to accept donations. You will have until September 30 to work with us to develop and implement a plan to spend down the money in your fund. Key dates are included at the bottom of this email. We know this is going to be difficult, and we will do everything we can to ease the transition for you. How we will support collectives: It remains our fiduciary responsibility to safeguard each collective's charitable assets and ensure funds are used solely for specified charitable purposes. We will be providing assistance and support to you, whether you choose to spend out and close down your collective or continue your work through another 501(c)(3) organization or fiscal sponsor. Unfortunately, we had to say goodbye to several of our colleagues today as we pare down our core staff to reduce costs. I will be staying on staff to support collectives through this transition, along with Wayne Kleppe, our Finance Administrator. What led to this decision: From day one, OCF was committed to experimentation and innovation. We were dedicated to finding new ways to open up the nonprofit space, making it easier for people to raise and access funding so they can do good in their communities. OCF was created by Open Collective Inc. (OCI), a company formed in 2015 with the goal of "enabling groups to quickly set up a collective, raise funds and manage them transparently." Soon after being founded by OCI, OCF went through a period of rapid growth. We responded to increased demand arising from the COVID-19 pandemic without taking the time to establish the appropriate systems and infrastructure to sustain that growth. Unfortunately, over the past year, we have learned that Open Collective Foundation's business model is not sustainable with the number of complex services we have offered and the fees we pay to the Open Collective Inc. tech platform. In late 2023, we made the decision to pause accepting new collectives in order to create space for us to address the issues. Unfortunately, it became clear that it would not be financially feasible to make the necessary corrections, and we determined that OCF is not viable. What's next: We know this news will raise questions for many of our collectives. We will be making space for questions and reactions in the coming weeks. In the meantime, we have developed this FAQ which we will keep updated as more questions come in. What you need to do next: Dates to know: In Care & Accompaniment,
Mike Strode
Program Officer
Open Collective Foundation Our mailing address has changed! We are now located at 440 N. Barranca Avenue #3717, Covina, CA 91723, USA

11 February 2024

Freexian Collaborators: Debian Contributions: Upcoming Improvements to Salsa CI, /usr-move, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Upcoming Improvements to Salsa CI, by Santiago Ruano Rinc n Santiago started picking up the work made by Outreachy Intern, Enock Kashada (a big thanks to him!), to solve some long-standing issues in Salsa CI. Currently, the first job in a Salsa CI pipeline is the extract-source job, used to produce a debianize source tree of the project. This job was introduced to make it possible to build the projects on different architectures, on the subsequent build jobs. However, that extract-source approach is sub-optimal: not only it increases the execution time of the pipeline by some minutes, but also projects whose source tree is too large are not able to use the pipeline. The debianize source tree is passed as an artifact to the build jobs, and for those large projects, the size of their source tree exceeds the Salsa s limits. This is specific issue is documented as issue #195, and the proposed solution is to get rid of the extract-source job, relying on sbuild in the very build job (see issue #296). Switching to sbuild would also help to improve the build source job, solving issues such as #187 and #298. The current work-in-progress is very preliminary, but it has already been possible to run the build (amd64), build-i386 and build-source job using sbuild with the unshare mode. The image on the right shows a pipeline that builds grep. All the test jobs use the artifacts of the new build job. There is a lot of remaining work, mainly making the integration with ccache work. This change could break some things, it will also be important to test how the new pipeline works with complex projects. Also, thanks to Emmanuel Arias, we are proposing a Google Summer of Code 2024 project to improve Salsa CI. As part of the ongoing work in preparation for the GSoC 2024 project, Santiago has proposed a merge request to make more efficient how contributors can test their changes on the Salsa CI pipeline.

/usr-move, by Helmut Grohne In January, we sent most of the moving patches for the set of packages involved with debootstrap. Notably missing is glibc, which turns out harder than anticipated via dumat, because it has Conflicts between different architectures, which dumat does not analyze. Patches for diversion mitigations have been updated in a way to not exhibit any loss anymore. The main change here is that packages which are being diverted now support the diverting packages in transitioning their diversions. We also supported a few packages with non-trivial changes such as netplan.io. dumat has been enhanced to better support derivatives such as Ubuntu.

Miscellaneous contributions
  • Python 3.12 migration trundles on. Stefano Rivera helped port several new packages to support 3.12.
  • Stefano updated the Sphinx configuration of DebConf Video Team s documentation, which was broken by Sphinx 7.
  • Stefano published the videos from the Cambridge MiniDebConf to YouTube and PeerTube.
  • DebConf 24 planning has begun, and Stefano & Utkarsh have started work on this.
  • Utkarsh re-sponsored the upload of golang-github-prometheus-community-pgbouncer-exporter for Lena.
  • Colin Watson added Incus support to autopkgtest.
  • Colin discovered Perl::Critic and used it to tidy up some poor practices in several of his packages, including debconf.
  • Colin did some overdue debconf maintenance, mainly around tidying up error message handling in several places (1, 2, 3).
  • Colin figured out how to update the mirror size documentation in debmirror, last updated in 2010. It should now be much easier to keep it up to date regularly.
  • Colin issued a man-db buster update to clean up some irritations due to strict sandboxing.
  • Thorsten Alteholz adopted two more packages, magicfilter and ifhp, for the debian-printing team. Those packages are the last ones of the latest round of adoptions to preserve the old printing protocol within Debian. If you know of other packages that should be retained, please don t hesitate to contact Thorsten.
  • Enrico participated in /usr-merge discussions with Helmut.
  • Helmut sent patches for 16 cross build failures.
  • Helmut supported Matthias Klose (not affiliated with Freexian) with adding -for-host support to gcc-defaults.
  • Helmut uploaded dput-ng enabling dcut migrate and merging two MRs of Ben Hutchings.
  • Santiago took part in the discussions relating to the EU Cyber Resilience Act (CRA) and the Debian public statement that was published last year. He participated in a meeting with Members of the European Parliament (MEPs), Marcel Kolaja and Karen Melchior, and their teams to clarify some points about the impact of the CRA and Debian and downstream projects, and the improvements in the last version of the proposed regulation.

8 February 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.0.0 on CRAN: New Upstream, Interface Polish

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1119 other packages on CRAN, downloaded 32.5 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 575 times according to Google Scholar. This release brings a new (stable) upstream (minor) release Armadillo 12.8.0 prepared by Conrad two days ago. We, as usual, prepared a release candidate which we tested against the over 1100 CRAN packages using RcppArmadillo. This found no issues, which was confirmed by CRAN once we uploaded and so it arrived as a new release today in a fully automated fashion. We also made a small change that had been prepared by GitHub issue #400: a few internal header files that were cluttering the top-level of the include directory have been moved to internal directories. The standard header is of course unaffected, and the set of full / light / lighter / lightest headers (matching we did a while back in Rcpp) also continue to work as one expects. This change was also tested in a full reverse-dependency check in January but had not been released to CRAN yet. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.0.0 (2024-02-06)
  • Upgraded to Armadillo release 12.8.0 (Cortisol Injector)
    • Faster detection of symmetric expressions by pinv() and rank()
    • Expanded shift() to handle sparse matrices
    • Expanded conv_to for more flexible conversions between sparse and dense matrices
    • Added cbrt()
    • More compact representation of integers when saving matrices in CSV format
  • Five non-user facing top-level include files have been removed (#432 closing #400 and building on #395 and #396)

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

1 February 2024

Russ Allbery: Review: System Collapse

Review: System Collapse, by Martha Wells
Series: Murderbot Diaries #7
Publisher: Tordotcom
Copyright: 2023
ISBN: 1-250-82698-5
Format: Kindle
Pages: 245
System Collapse is the second Murderbot novel. Including the novellas, it's the 7th in the series. Unlike Fugitive Telemetry, the previous novella that was out of chronological order, this is the direct sequel to Network Effect. A very direct sequel; it picks up just a few days after the previous novel ended. Needless to say, you should not start here. I was warned by other people and therefore re-read Network Effect immediately before reading System Collapse. That was an excellent idea, since this novel opens with a large cast, no dramatis personae, not much in the way of a plot summary, and a lot of emotional continuity from the previous novel. I would grumble about this more, like I have in other reviews, but I thoroughly enjoyed re-reading Network Effect and appreciated the excuse.
ART-drone said, I wouldn t recommend it. I lack a sense of proportional response. I don t advise engaging with me on any level.
Saying much about the plot of this book without spoiling Network Effect and the rest of the series is challenging. Murderbot is suffering from the aftereffects of the events of the previous book more than it expected or would like to admit. It and its humans are in the middle of a complicated multi-way negotiation with some locals, who the corporates are trying to exploit. One of the difficulties in that negotiation is getting people to believe that the corporations are as evil as they actually are, a plot element that has a depressing amount in common with current politics. Meanwhile, Murderbot is trying to keep everyone alive. I loved Network Effect, but that was primarily for the social dynamics. The planet that was central to the novel was less interesting, so another (short) novel about the same planet was a bit of a disappointment. This does give Wells a chance to show in more detail what Murderbot's new allies have been up to, but there is a lot of speculative exploration and detailed descriptions of underground tunnels that I found less compelling than the relationship dynamics of the previous book. (Murderbot, on the other hand, would much prefer exploring creepy abandoned tunnels to talking about its feelings.) One of the things this series continues to do incredibly well, though, is take non-human intelligence seriously in a world where the humans mostly don't. It perfectly fills a gap between Star Wars, where neither the humans nor the story take non-human intelligences seriously (hence the creepy slavery vibes as soon as you start paying attention to droids), and the Culture, where both humans and the story do. The corporates (the bad guys in this series) treat non-human intelligences the way Star Wars treats droids. The good guys treat Murderbot mostly like a strange human, which is better but still wrong, and still don't notice the numerous other machine intelligences. But Wells, as the author, takes all of the non-human characters seriously, which means there are complex and fascinating relationships happening at a level of the story that the human characters are mostly unaware of. I love that Murderbot rarely bothers to explain; if the humans are too blinkered to notice, that's their problem. About halfway into the story, System Collapse hits its stride, not coincidentally at the point where Murderbot befriends some new computers. The rest of the book is great. This was not as good as Network Effect. There is a bit less competence porn at the start, and although that's for good in-story reasons I still missed it. Murderbot's redaction of things it doesn't want to talk about got a bit annoying before it finally resolved. And I was not sufficiently interested in this planet to want to spend two novels on it, at least without another major revelation that didn't come. But it's still a Murderbot novel, which means it has the best first-person narrative voice I've ever read, some great moments, and possibly the most compelling and varied presentation of computer intelligence in science fiction at the moment.
There was no feed ID, but AdaCol2 supplied the name Lucia and when I asked it for more info, the gender signifier bb (which didn t translate) and he/him pronouns. (I asked because the humans would bug me for the information; I was as indifferent to human gender as it was possible to be without being unconscious.)
This is not a series to read out of order, but if you have read this far, you will continue to be entertained. You don't need me to tell you this nearly everyone reviewing science fiction is saying it but this series is great and you should read it. Rating: 8 out of 10

Next.